A ROS 2-based TurtleBot 4 software stack for autonomous indoor data harvesting: mapping, robot telemetry, air-quality sensing, Wi-Fi scanning, and Robonomics/IPFS publication.
Data Harvester is a modular ROS 2 project that turns TurtleBot 4 into a mobile indoor sensing platform. It combines autonomous robot motion, sensor fusion, and data packaging so a single mission can produce an archive of machine-readable environmental data.
The goal of the project is to provide reproducible and secure room-state analytics based on synchronized robot, external sensor, and wireless-network measurements.
Available features include:
- Wall-follow mission with automatic undocking, SLAM map saving, odometry recording, and video capture
- Navigation-mode stack (localization + Nav2) for running missions on pre-built maps
- ESP32-based external air sensor integration (temperature, humidity, luminosity, CO2)
- Periodic Wi-Fi scanning with BSSID/SSID/channel/rate/signal export
- Data chronicling into timestamped `harvested-data-*.zip` archives
- Robonomics integration to publish resulting archives via IPFS and the Robonomics parachain
For convenience, the project is divided into several packages:
```
.
├── data_harvester_chronicler   # Mission data recorder and archive publisher integration
├── data_harvester_interfaces   # Custom ROS 2 message definitions for ESP and Wi-Fi data
├── data_harvester_navigation   # Navigation-mode package for localization and Nav2-based movement
├── data_harvester_perception   # Perception package for external sensors and wireless environment scan
├── data_harvester_wall_follow  # End-to-end wall-follow harvesting mission package
├── esp32-sensors               # ESP-IDF firmware workspace for external air-sensor controller
└── ...
```
Core layers:
- **Perception layer** (`data_harvester_perception`): reads external environment data and publishes it as ROS 2 streams. `esp32_sensors_node` parses JSON frames from the ESP32 serial connection and publishes sensor readings (`temperature`, `humidity`, `luminosity`, `co2`); `wifi_scanner_node` periodically scans nearby Wi-Fi networks and publishes scan results.
- **Navigation and mission-execution layer** (`data_harvester_navigation`, `data_harvester_wall_follow`): provides robot motion strategies. `data_harvester_navigation` runs map-based localization and Nav2 navigation; `data_harvester_wall_follow` runs a full autonomous mission flow (undocking, wall following, mission timing, and artifact generation).
- **Data chronicling and publication layer** (`data_harvester_chronicler`): collects synchronized mission data and prepares result archives. `data_harvester_chronicler` (a lifecycle node) records robot/perception streams into mission datasets; `data_harvester_robonomics` publishes produced archives through the Robonomics/IPFS stack.
- **Interface contract layer** (`data_harvester_interfaces`): defines the custom message types exchanged between packages (ESP and Wi-Fi data), so all producer/consumer nodes share a stable schema.
Data flow (high level):
- Perception nodes publish external measurements.
- Robot state/navigation topics are produced by TurtleBot/Nav2 stack.
- Chronicler subscribes to both robot and perception streams, aligns them by time, and writes mission artifacts.
- Produced archives are sent to Robonomics/IPFS by publication components.
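The time-alignment step can be sketched as a nearest-timestamp lookup. This is a simplified illustration, not the chronicler's actual implementation; the pose stream and pose labels below are placeholder data.

```python
import bisect

# Hypothetical pose stream (timestamps in seconds, sorted) with matching poses;
# the real chronicler subscribes to live ROS 2 topics instead.
pose_stamps = [0.0, 0.5, 1.0, 1.5, 2.0]
poses = ["pose0", "pose1", "pose2", "pose3", "pose4"]

def nearest_pose(t: float) -> str:
    """Return the pose whose timestamp is closest to a sensor reading at time t."""
    i = bisect.bisect_left(pose_stamps, t)
    # Compare the neighbors on either side of the insertion point
    candidates = [j for j in (i - 1, i) if 0 <= j < len(pose_stamps)]
    return poses[min(candidates, key=lambda j: abs(pose_stamps[j] - t))]

print(nearest_pose(1.2))  # pose2: 1.0 is closer to 1.2 than 1.5
```

A binary search keeps each lookup O(log n), which matters when attaching pose context to every incoming sensor reading during a long mission.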
Two operating scenarios are built on top of this architecture:
- **Scenario A: Wall-Follow mode (end-to-end).** Use `data_harvester_wall_follow` as the main orchestrator for a simpler autonomous run. This path is optimized for quick deployment when one launch sequence should execute the full harvesting mission.
- **Scenario B: Chronicler mode (map-based, staged).** Build and save a map with SLAM, run localization and navigation on that map, start chronicler lifecycle collection, and publish outputs through Robonomics/IPFS. This path is better when you need explicit control over the mapping, navigation, and publication stages.
Make sure the following base components are installed and available before running Data Harvester:
- Linux OS distribution (tested on Ubuntu 22.04.4)
- ROS 2 distribution (tested on Humble)
- Python 3 (tested on 3.10.12)
- TurtleBot 4 software stack (including `turtlebot4_navigation`)
- `nmcli` (NetworkManager CLI) for Wi-Fi scanning
- Robonomics ROS 2 Wrapper 3.1.0 packages (required for the chronicler/publication pipeline)
1. Complete the manufacturer baseline setup for TurtleBot 4. Official guide: https://turtlebot.github.io/turtlebot4-user-manual/setup/basic.html
2. Apply the baseline network and middleware settings used by this project:
   - Connect to the robot's Raspberry Pi access point and run `turtlebot4-setup`.
   - Configure Wi-Fi credentials for your local network.
   - Update packages on the robot (`apt upgrade`).
   - In the ROS 2 settings, use Fast DDS (`rmw_fastrtps_cpp`) and configure Discovery Server.
   - In the Create 3 web UI, verify that the Fast DDS, Domain ID, namespace, and Discovery Server settings match the Raspberry Pi.
   - From your workstation, verify topic visibility with `ros2 topic list`.
3. Create a ROS 2 workspace on the robot:
   ```console
   mkdir -p your_project_ws/src
   cd your_project_ws/src
   ```
4. Clone this repository:
   ```console
   git clone https://github.com/Fingerling42/data-harvester.git
   ```
5. Build from the workspace root:
   ```console
   cd ../
   colcon build
   ```
6. Source the workspace in every new terminal:
   ```console
   source install/setup.bash
   ```
ESP32 sensing context in this project:
- The ESP32 with its sensors connects to the TurtleBot over USB.
- The firmware publishes JSON frames over serial.
- A dedicated node reads these frames and publishes ROS 2 messages on the `data_harvester/esp_sensors` topic.
- Supported sensors:
  - temperature and humidity (SHT3x)
  - luminosity/light level (BH1750)
  - CO2 concentration (SCD4x)
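As an illustration of what the serial-frame handling involves, a parser might look like the sketch below. The JSON field names are assumptions chosen to match the sensor list above, not the firmware's exact schema.

```python
import json

def parse_esp_frame(line: str) -> dict:
    """Parse one serial line into a readings dict; skip malformed frames."""
    try:
        data = json.loads(line)
    except json.JSONDecodeError:
        return {}
    # Field names are illustrative, not the firmware's actual schema
    required = ("temperature", "humidity", "luminosity", "co2")
    if not all(key in data for key in required):
        return {}
    return {key: float(data[key]) for key in required}

# Hypothetical frame as the ESP32 might emit it over serial
frame = '{"temperature": 23.4, "humidity": 41.2, "luminosity": 310.0, "co2": 512}'
print(parse_esp_frame(frame)["co2"])  # 512.0
```

Returning an empty dict for malformed input lets the reading loop simply skip partial frames, which are common right after opening a serial port.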
To build and flash the ESP32 firmware (`esp32-sensors/`), use ESP-IDF.
Official ESP-IDF setup guide: https://docs.espressif.com/projects/esp-idf/en/stable/esp32/get-started/
Firmware variants:
- `esp32-sensors/data_harvester_esp_offline` — sensor streaming over serial without Wi-Fi.
- `esp32-sensors/data_harvester_esp_online` — sensor streaming with Wi-Fi enabled.

Typical flashing command inside the selected firmware directory:
```console
idf.py build flash monitor
```
1. To configure ESP32 serial settings for the perception node, open `data_harvester_perception/config/esp_config.yaml` and change:
   - `esp_port`: serial device path (for example, `/dev/serial/by-id/...`)
   - `esp_baudrate`: baud rate (default in repo: `115200`)
2. To configure wall-follow mode parameters, open `data_harvester_wall_follow/config/harvester_config.yaml` and change:
   - `runtime_min`: mission duration in minutes
   - `map_name`: output map basename for SLAM artifacts
3. If using chronicler mode, review the map/localization/Nav2 parameters:
   - `data_harvester_navigation/config/turtlebot4_localization.yaml`
   - `data_harvester_navigation/config/turtlebot4_nav2.yaml`
4. If using chronicler mode, provide a valid `pubsub_params_path` when launching the chronicler. The chronicler requests `ipfs_dir_path` from `robonomics_ros2_pubsub` and writes archives there.
Data Harvester is typically used in two runtime scenarios, both relying on the same perception stack.
Start perception once to publish external sensor and Wi-Fi data:
```console
ros2 launch data_harvester_perception perception_launch.py
```

This launches:
- the ESP32 sensor reader (`esp32_sensors_node`)
- the Wi-Fi scanner (`wifi_scanner_node`)
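The Wi-Fi scanner relies on `nmcli`. Parsing its machine-readable output could be sketched as below; the field set mirrors the BSSID/SSID/channel/rate/signal export described earlier, but the parsing code itself is illustrative, not the node's implementation.

```python
import re

def parse_nmcli_terse(line: str) -> dict:
    """Parse one line of `nmcli -t -f BSSID,SSID,CHAN,RATE,SIGNAL dev wifi list`.

    In terse (-t) mode nmcli escapes ':' inside field values (e.g. inside the
    BSSID) as '\\:', so we split only on unescaped colons.
    """
    fields = [f.replace("\\:", ":") for f in re.split(r"(?<!\\):", line)]
    bssid, ssid, chan, rate, signal = fields
    return {
        "bssid": bssid,
        "ssid": ssid,
        "channel": int(chan),
        "rate": rate,
        "signal": int(signal),
    }

# Example line in nmcli terse format (values are made up)
line = r"AA\:BB\:CC\:DD\:EE\:FF:HomeNet:6:270 Mbit/s:78"
print(parse_nmcli_terse(line)["bssid"])  # AA:BB:CC:DD:EE:FF
```

The negative lookbehind in the regex is what keeps the escaped colons of the BSSID from being treated as field separators.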
1. Start the perception stack (command above).
2. Start the wall-follow mission:
   ```console
   ros2 launch data_harvester_wall_follow data_harvester_wall_follow_launch.py
   ```

Artifacts produced by this scenario:
- `harvested-data-<timestamp>.zip` in the workspace root.
- Inside the archive:
  - `harvesting_process.mp4` (OAK-D preview video)
  - `odom.json` (time-stamped robot state stream: SLAM pose, mouse odometry, IMU, cliff IR, bumper IR)
  - `esp_sensors.json` (time-stamped ESP readings with SLAM pose context: temperature, humidity, luminosity, CO2)
  - `<map_name>.yaml` and `<map_name>.pgm` (occupancy map from `slam_toolbox/save_map`)
  - `<map_name>.posegraph` and `<map_name>.data` (serialized SLAM graph from `slam_toolbox/serialize_map`)

`<map_name>` is taken from `data_harvester_wall_follow/config/harvester_config.yaml`.
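The archiving step itself can be illustrated with the standard library. The timestamp format and example file list below are assumptions; only the `harvested-data-<timestamp>.zip` naming comes from this README.

```python
import json
import zipfile
from datetime import datetime

def make_archive(datasets: dict, out_dir: str = ".") -> str:
    """Bundle JSON mission datasets into a timestamped zip archive."""
    # Timestamp format is an assumption, not the project's exact convention
    stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    path = f"{out_dir}/harvested-data-{stamp}.zip"
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as zf:
        for filename, records in datasets.items():
            zf.writestr(filename, json.dumps(records, indent=2))
    return path

# Placeholder dataset standing in for the real mission streams
archive = make_archive({"odom.json": [{"t": 0.0, "pose": [0, 0, 0]}]})
print(archive)  # e.g. ./harvested-data-2024-04-01_12-00-00.zip
```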
1. Build a map with SLAM:
   ```console
   ros2 launch turtlebot4_navigation slam.launch.py sync:=false params:=./turtlebot4_slam.yaml
   ```
2. Save the occupancy map:
   ```console
   ros2 service call /slam_toolbox/save_map slam_toolbox/srv/SaveMap "name:
     data: 'room_map'"
   ```
3. Serialize the SLAM pose graph:
   ```console
   ros2 service call /slam_toolbox/serialize_map slam_toolbox/srv/SerializePoseGraph "{filename: 'room_map'}"
   ```
4. Start the perception stack (command from the perception section).
5. Start navigation on the saved map:
   ```console
   ros2 launch data_harvester_navigation data_harvester_navigation_launch.py map:=./room/room_map.yaml
   ```
6. Start the chronicler with Robonomics/IPFS parameters:
   ```console
   ros2 launch data_harvester_chronicler data_harvester_chronicler_launch.py pubsub_params_path:=dh_robonomics_params.yaml
   ```
7. Configure the chronicler lifecycle node:
   ```console
   ros2 lifecycle set /data_harvester_chronicler configure
   ```

Artifacts produced by this scenario:
- Mapping stage outputs (from steps 2-3): `room_map.yaml`, `room_map.pgm`, `room_map.posegraph`, `room_map.data`.
- Chronicler outputs in `ipfs_dir_path` (received from the `robonomics_ros2_pubsub` parameters):
  - `data.json` (synchronized robot + ESP dataset: pose, ESP air sensors, mouse odometry, IMU, cliff IR, bumper IR)
  - `wifi_list.json` (time-stamped Wi-Fi scan dataset with robot pose and per-BSSID signal/SSID entries)
  - `harvested-data-<timestamp>.zip` containing `data.json` and `wifi_list.json`
- Publication handoff:
  - The chronicler publishes the archive filename to `data_harvester/archive_name`.
  - `data_harvester_robonomics` sends the archive through the Robonomics/IPFS datalog flow.
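The `ros2 lifecycle set /data_harvester_chronicler configure` command above drives the standard ROS 2 managed-node state machine. The transitions relevant here can be sketched without `rclpy`; this is a simplified model (the real lifecycle also has error-processing and shutdown paths).

```python
# Simplified ROS 2 managed-node lifecycle: primary states and the
# transitions an operator typically triggers with `ros2 lifecycle set`.
TRANSITIONS = {
    ("unconfigured", "configure"): "inactive",
    ("inactive", "activate"): "active",
    ("active", "deactivate"): "inactive",
    ("inactive", "cleanup"): "unconfigured",
}

class LifecycleSketch:
    def __init__(self) -> None:
        self.state = "unconfigured"

    def trigger(self, transition: str) -> str:
        """Apply a transition if it is valid from the current state."""
        key = (self.state, transition)
        if key not in TRANSITIONS:
            raise ValueError(f"invalid transition {transition!r} from {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state

node = LifecycleSketch()
print(node.trigger("configure"))  # inactive
print(node.trigger("activate"))   # active
```

Configuring the chronicler moves it to `inactive`, where parameters and subscriptions are set up; recording only runs once the node reaches `active`.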
For persistent deployments, use the provided service files:
- `data_harvester_perception/systemd/data-harvester-perception.service` — auto-starts the perception stack.
- `data_harvester_chronicler/systemd/ipfs-daemon.service` — auto-starts the local IPFS daemon used by the Robonomics flow.
Typical setup:
1. Copy the required `.service` file(s) to `/etc/systemd/system/`.
2. Run `sudo systemctl daemon-reload`.
3. Enable at boot: `sudo systemctl enable <service-name>`.
4. Start now: `sudo systemctl start <service-name>`.
5. Inspect status/logs:
   ```console
   systemctl status <service-name>
   journalctl -u <service-name> -f
   ```
For interactive exploration of harvested mission data, the project uses the external frontend airalab/data-harvester-dapp.
It is a Vue 3 + Vite web application designed to present TurtleBot 4 datasets in a user-friendly format and support the Robonomics/IPFS data flow used by this project. In practice, it acts as the visualization layer on top of generated Data Harvester archives, so operators can review collected telemetry and environment measurements without working directly with raw JSON files.
- Open `turtlebot4-setup` on the Raspberry Pi.
- In the ROS settings, disable the diagnostics nodes.
- Use Fast DDS (`rmw_fastrtps_cpp`) and Discovery Server in `turtlebot4-setup`.
- Ensure the Create 3 (in its web UI) and the Raspberry Pi use matching DDS settings (middleware, Domain ID, namespace, discovery mode).
- Optional advanced tuning: adjust Fast DDS fragment cache/memory parameters for your network and traffic profile.
1. Check the `chrony` config on the PC and the Raspberry Pi:
   ```console
   cat /etc/chrony/chrony.conf
   ```
2. Open the Create 3 web UI and go to `Edit ntp.conf`.
3. Set the NTP server (example from setup notes):
   ```
   # Use RPi4 as preferred server
   # Minpoll 2^4 = 16s
   # Maxpoll 2^6 = 64s
   server 192.168.186.3 prefer iburst minpoll 4 maxpoll 6
   ```
4. Restart NTPD from the Create 3 web UI.
Symptom: /dev/RPLIDAR can bind to the wrong USB adapter.
Cause: The default udev rule is too generic for similar USB-UART devices.
Fix:
1. Edit the TurtleBot 4 udev rules:
   ```console
   sudo nano /etc/udev/rules.d/50-turtlebot4.rules
   ```
2. Use a rule that matches your actual lidar USB serial number (replace the value in `ATTRS{serial}` for your hardware).
3. Reload the udev rules:
   ```console
   sudo udevadm control --reload-rules
   sudo udevadm trigger
   ```

Symptom: The lidar node can fail intermittently or behave incorrectly.
Fix:
1. Edit the launch file:
   ```console
   sudo nano /opt/ros/humble/share/turtlebot4_bringup/launch/rplidar.launch.py
   ```
2. Ensure `rplidar_node` uses the expected parameters, including:
   - `serial_port: /dev/RPLIDAR`
   - `serial_baudrate: 115200`
   - `frame_id: rplidar_link`
   - `auto_standby: False`
3. Restart the TurtleBot 4 service:
   ```console
   sudo systemctl restart turtlebot4.service
   ```

Note: changes under `/opt/ros/...` can be overwritten by package updates.
Symptom: Navigation fails or local/global costmaps do not update correctly.
Fix:
1. In the localization config:
   ```console
   sudo nano /opt/ros/humble/share/turtlebot4_navigation/config/localization.yaml
   ```
   Set:
   ```yaml
   amcl:
     ros__parameters:
       ...
       scan_topic: /scan
       ...
   ```
2. In the Nav2 config:
   ```console
   sudo nano /opt/ros/humble/share/turtlebot4_navigation/config/nav2.yaml
   ```
   Set the scan topic to `/scan` for both the local and global costmaps.
Symptom: Camera occasionally fails with stream read errors.
Fix options:
- Use separate power delivery (instead of sharing power and data over one cable).
- Connect through stable USB-C side ports on the robot.
Symptom: Video topic uses low-bandwidth preview settings.
Fix:
1. Edit the OAK-D config:
   ```console
   sudo nano /opt/ros/humble/share/turtlebot4_bringup/config/oakd_pro.yaml
   ```
2. Review and adjust the key parameters in:
   ```yaml
   /oakd:
     ros__parameters:
       camera:
         ...
       rgb:
         i_max_q_size:      # video message queue size
         i_enable_preview:  # enables the preview stream
         i_preview_size:    # preview resolution
         i_low_bandwidth:   # enables a lower-quality mode (for poor networks)
         i_publish_topic:   # enables/disables the topic
         i_resolution:      # camera resolution (cannot be set lower than 1080P)
       ...
   ```

Use higher quality only if the DDS/network/CPU load is acceptable for your deployment.
Distributed under the Apache-2.0 License. See LICENSE for more information.
- Project report article: Data Harvester report (April 2024)
- Twitter/X thread: @berman_ivan Data Harvester thread
Ivan Berman - @berman_ivan - fingerling42@proton.me


