Thermal Dense Mapping with Odometry-Guided Foundation Depth Estimation
Fireground-oriented dense mapping from thermal imagery, built on a thermal-radar-inertial SLAM backbone for smoke, water spray, fire, darkness, and other perception-degraded environments. The current prototype uses odometry-guided Depth Anything 3 inference and reaches about 0.5 Hz online dense reconstruction on an RTX 5090 laptop.
Details
Fireground Robotics Context
This project comes from a fireground robotics setting, where the sensing problem is not simply low light or noisy RGB images. The operating scene may include dense smoke, water spray, mist, fire, heavy rain, and rapid visibility changes. In these conditions, visual cameras lose texture and contrast, while 3D LiDAR can be blocked or severely degraded by smoke and suspended droplets. A robot may still be mechanically capable of entering the scene, but without robust localization and mapping it remains largely tele-operated.
The project therefore uses non-traditional sensing as the core perception stack. Thermal cameras observe heat radiation rather than visible appearance, so people, hot surfaces, and structural boundaries can remain distinguishable when RGB assumptions collapse. 4D radar contributes long-range geometric and Doppler information that is less affected by smoke, dust, rain, and poor illumination. IMU measurements provide the high-rate motion continuity needed by the SLAM backbone.
The larger system targets a standalone radar-thermal-inertial payload for robotic platforms, with mapping and localization feedback for operators. The dense mapping project sits on top of this backbone: it turns the backbone's robust but sparse, modality-specific odometry output into a more readable 3D representation for navigation, inspection, and human situational awareness.
From Robust SLAM to Dense Thermal Mapping
The system separates the problem into odometry and mapping. The odometry side focuses on robust thermal-radar-inertial SLAM in adverse environments; the mapping side turns that trajectory and sensor stream into point-cloud maps, occupancy maps, and dense thermal reconstructions. This project focuses on the dense reconstruction branch.
The design choice is deliberate. Radar and thermal-inertial odometry provide metric motion that can survive smoke-heavy and fireground-style scenes, but sparse radar point maps are not always intuitive for firefighters or remote operators. Dense thermal mapping aims to produce a geometry-rich view that is easier to inspect, while still remaining anchored to robot motion instead of becoming a disconnected foundation-model reconstruction.
Odometry-Guided Foundation Depth
Early experiments with foundation geometry models showed that plausible scene structure can be recovered from thermal images, but scale and trajectory consistency are fragile if the reconstruction is detached from robot odometry. The current direction therefore guides Depth Anything 3 (DA3) with calibrated thermal observations and odometry-derived camera motion from the radar-thermal-inertial SLAM backbone.
This gives Depth Anything 3 (DA3) a well-defined role: it acts as a multi-view depth estimator conditioned on thermal image sequences and robot motion, rather than as a loose single-image prior. On an RTX 5090 laptop, the current prototype runs online at about 0.5 Hz for dense thermal mapping.
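As an illustration of what "odometry-derived camera motion" can mean in practice, the sketch below converts a window of body poses from odometry into camera extrinsics for the thermal camera. It is a minimal example under stated assumptions, not the project's implementation: the function names, the fixed camera-to-body calibration `T_body_cam`, and the choice to express poses relative to the first frame of the window are all hypothetical, and the way such poses are actually passed to DA3 is not shown here.

```python
import numpy as np


def invert_se3(T):
    """Invert a 4x4 rigid transform using its rotation/translation structure."""
    R, t = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv


def camera_extrinsics_from_odometry(T_world_body_seq, T_body_cam):
    """Return per-frame extrinsics that map first-camera coordinates to camera i.

    T_world_body_seq: sequence of 4x4 body-to-world poses from odometry.
    T_body_cam: 4x4 camera-to-body calibration (assumed fixed and known).
    """
    # Camera-to-world pose for every frame in the window.
    T_world_cam = [T_wb @ T_body_cam for T_wb in T_world_body_seq]
    # Use the first camera as the reference frame of the window.
    T_cam0_world = invert_se3(T_world_cam[0])
    # T_cam0_world @ T_wc is the pose of camera i in the first camera's frame;
    # inverting it gives the extrinsic (reference frame -> camera i).
    return np.stack([invert_se3(T_cam0_world @ T_wc) for T_wc in T_world_cam])
```

Because the poses come from metric odometry, the translations in these extrinsics carry metric scale, which is the property the reconstruction is meant to inherit.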
System View
Conceptually, the pipeline has three modular stages: odometry supplies camera motion, DA3 estimates dense depth from thermal views, and the mapping stage integrates confident geometry into a global representation. The key result is that dense thermal mapping can run online and stay robot-aligned, even though thermal imagery is fundamentally different from RGB.
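The mapping stage described above can be sketched as a confidence-gated back-projection: pixels with sufficiently confident depth are lifted to 3D with the thermal camera intrinsics and moved into the world frame using the odometry-derived camera pose. This is a minimal sketch, assuming a pinhole model, metric depth along the camera z-axis, and a per-pixel confidence map; the function name and the `conf_min` threshold are illustrative choices, not values from the project.

```python
import numpy as np


def backproject_confident_depth(depth, conf, K, T_world_cam, conf_min=0.5):
    """Lift confident depth pixels into world-frame 3D points.

    depth: (H, W) depth along the camera z-axis, in metres.
    conf: (H, W) per-pixel confidence (threshold value is an assumption).
    K: 3x3 pinhole intrinsics of the thermal camera.
    T_world_cam: 4x4 camera-to-world pose from odometry and calibration.
    """
    H, W = depth.shape
    v, u = np.mgrid[0:H, 0:W]  # pixel row (v) and column (u) indices
    # Keep only valid, confident depth values.
    mask = (conf >= conf_min) & np.isfinite(depth) & (depth > 0)
    z = depth[mask]
    x = (u[mask] - K[0, 2]) / K[0, 0] * z
    y = (v[mask] - K[1, 2]) / K[1, 1] * z
    pts_cam = np.stack([x, y, z], axis=1)
    # Rigid transform from the camera frame into the world frame.
    return pts_cam @ T_world_cam[:3, :3].T + T_world_cam[:3, 3]
```

Points returned for successive frames can be concatenated or voxel-filtered into the global map; gating on confidence first keeps low-quality depth out of the accumulated geometry.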
Mapping Results
Current Status
The prototype currently demonstrates online thermal dense reconstruction at about 0.5 Hz (roughly one map update every two seconds) on an RTX 5090 laptop, which meets the project-level requirement of at least 0.5 Hz for map display. The current focus is dense, robot-aligned thermal mapping for smoke, water-spray, and fireground-style environments.