BCoD-Diffusion Research

Abstract

Robots equipped with rich sensor suites can localize reliably in partially observable environments, but powering every sensor continuously is wasteful and often infeasible. Belief-space planners address this by propagating pose-belief covariance through analytic models and switching sensors heuristically, an approach that is brittle and expensive at runtime. Data-driven approaches, including diffusion models, learn multi-modal trajectories from demonstrations but presuppose an accurate, always-on state estimate. We address a largely open problem: for a given task in a mapped environment, which minimal sensor subset must be active at each location to keep state uncertainty just low enough to complete the task? Our key insight is that when a diffusion planner is explicitly conditioned on a pose-belief raster and a sensor mask, the spread of its denoising trajectories yields a calibrated, differentiable proxy for the expected localisation error. Building on this insight, we present Belief-Conditioned One-Step Diffusion (B-COD), the first planner that, in a 10 ms forward pass, returns a short-horizon trajectory, per-waypoint aleatoric variances, and a proxy for localisation error, eliminating external covariance rollouts. We show that this single proxy suffices for a soft actor-critic (SAC) agent to choose sensors online, optimising energy while bounding pose-covariance growth. We deploy B-COD in real-time marine trials on an unmanned surface vehicle and show that it reduces sensing energy consumption while matching the goal-reach performance of an always-on baseline.

Methodology

Real-World Experiments (Qualitative Results)

Our evaluation targets a real-world, real-time scenario in which an autonomous surface vehicle (ASV) must navigate a previously unseen open-air lake to reach waypoint goals with just-enough sensing while keeping the CVaR-95 localisation error below a user budget of 2 m. The lake presents both natural and human-driven disturbances: winds, waves, fountains, floating buoys and human-induced sensor-denied zones. The test platform is a SeaRobotics Surveyor ASV with a differential-thrust propulsion module and a heterogeneous sensor suite: a multi-beam LiDAR, day and night cameras, RTK-GPS, a MEMS IMU, and an EXO2 sonde. Control inputs are augmented by a discrete mode flag that selects the estimator configuration implied by the powered sensors. Sensor power draw differs by an order of magnitude across the suite, so efficient scheduling has a tangible impact on total mission energy.
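The CVaR-95 budget can be made concrete with a small sketch. It assumes that CVaR-95 denotes the mean of the worst 5% of localisation-error samples (the usual empirical definition; the page does not spell out its estimator). The error values and the names `cvar`, `errors_m` and `BUDGET_M` are illustrative and do not come from the trials.

```python
import math

def cvar(errors, alpha=0.95):
    """Empirical CVaR: mean of the worst (1 - alpha) fraction of samples."""
    if not errors:
        raise ValueError("need at least one sample")
    ordered = sorted(errors)
    # Number of tail samples; rounding guards against float noise in (1 - alpha) * n.
    k = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    tail = ordered[-k:]
    return sum(tail) / len(tail)

# Hypothetical localisation errors in metres (illustrative, not measured data).
errors_m = [0.2] * 90 + [0.8] * 5 + [1.5, 1.7, 1.9, 2.1, 2.3]
BUDGET_M = 2.0  # user budget quoted in the text

risk = cvar(errors_m, alpha=0.95)
within_budget = risk <= BUDGET_M
print(f"CVaR-95 = {risk:.2f} m, within budget: {within_budget}")
```

Note that two individual samples exceed 2 m while the tail average still meets the budget. This is the sense in which a CVaR constraint bounds the expected error in the tail and not the single worst case.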

Environment and Autonomous Surface Vehicle testbed

Experiment videos

The videos of the experiments listed below are sped up for brevity.

Experiment 1 (Day Lap): ASV navigating in an environment with a human-induced GPS-denied zone and an EXO2-rich zone.

Experiment 2 (Night Lap): ASV navigating in the environment during night time.

Experiment 3 (Night Lap): ASV navigating in the environment during night time.

Experiment 4 (Day Lap): ASV navigating in the environment with LiDAR forced off in an obstacle-rich zone (where LiDAR is usually heavily used).

Key Findings (Quantitative Results)


Key Finding #1: B-COD+SAC delivers near-perfect task completion at less than half the sensing cost of the Always-ON baseline.

Metric	Always-ON	B-COD+SAC
Goal-reach (%)	100.0	97.9
Collision (%)	0.5	0.9
CVaR violations (%)	0.1	0.5
Mean #sensors	5.0	2.08
Energy vs AON (%)	100.0	42.3
Runtime (ms)	14.9	14.3
Peak RAM (MB)	305	284

Key Finding #2: B-COD's variance is a calibrated, context-aware predictor of localisation error.

Key Finding #3: B-COD stays within a 10 ± 1 ms envelope and out-scales analytic belief planners.

r (m)	B-COD	IGG	DL
25	9.8 ms	7.5 ms	565 ms
40	9.7 ms	10.9 ms	1446 ms
55	9.6 ms	14.6 ms	2737 ms
70	10.7 ms	18.2 ms	4430 ms
85	10.4 ms	18.7 ms	6536 ms
100	10.9 ms	23.3 ms	9040 ms

Key Finding #4: B-COD adapts online, re-allocating modalities to recover from faults.

Key Finding #1: B-COD+SAC delivers near-perfect task completion at less than half the sensing cost of the Always-ON baseline.

Metric	AON	GOF	IGG	R1	R2	σM	SS	NB	PRL	DL	Ours
Goal-reach (%)	100.0	47.3	89.9	18.5	29.1	79.6	94.3	67.8	54.8	87.9	97.9
Collision (%)	0.5	22.3	6.1	34.5	30.1	12.4	4.7	17.4	22.1	4.2	0.9
CVaR violations (%)	0.1	15.8	4.3	28.6	22.8	9.1	5.2	13.2	18.3	1.9	0.5
Mean #sensors	5.0	3.19	2.65	1.0	2.0	2.99	2.56	4.05	3.48	5.0	2.08
Energy vs AON (%)	100.0	61.2	49.8	24.2	38.9	60.1	91.2	68.2	67.5	100	42.3
Runtime (ms)	14.9	14.7	26.8	13.6	13.7	14.4	84.1	14.1	12.1	565.3	14.3
Peak RAM (MB)	305	282	403	277	281	287	674	279	299	731	284

The table above summarizes performance over 50 laps. B-COD+SAC (Ours) reaches the goal on 97.9% of attempts while spending only 42% of the Always-ON (AON) sensing energy. Its collision rate of 0.9% stays close to the Always-ON baseline's 0.5%. Heuristic scheduling cannot match this trade-off: Greedy-OFF (GOF) conserves energy (61%) but sacrifices success (47%). InfoGain-Greedy (IGG) raises success to 90% yet violates the CVaR budget more than eight times as often as B-COD (4.3% versus 0.5%). Random masks (R1, R2) fare worse, indicating that local environment context, not just a lower duty cycle, is essential for task completion. Pure-RL (PRL) generates trajectories and schedules sensors from raw rasters; the high-dimensional action space makes exploration sparse, and the policy converges to risk-averse dithering, with only 55% of goals reached and a 22% collision rate. DESPOT-Lite (DL), by contrast, evaluates a principled belief tree with analytic models and can therefore plan accurately, but it expands hundreds of nodes; the resulting 565 ms runtime makes it unusable in real time on the vehicle.
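The trade-off described above can be checked directly against the table. The sketch below copies three rows from the table (goal-reach, CVaR violations, energy) and keeps only the methods that satisfy two thresholds. The thresholds `MIN_GOAL` and `MAX_CVAR` are assumptions chosen for illustration, not criteria stated by the authors.

```python
# Columns copied from the table: (goal-reach %, CVaR violations %, energy vs AON %).
results = {
    "AON": (100.0, 0.1, 100.0),
    "GOF": (47.3, 15.8, 61.2),
    "IGG": (89.9, 4.3, 49.8),
    "R1": (18.5, 28.6, 24.2),
    "R2": (29.1, 22.8, 38.9),
    "σM": (79.6, 9.1, 60.1),
    "SS": (94.3, 5.2, 91.2),
    "NB": (67.8, 13.2, 68.2),
    "PRL": (54.8, 18.3, 67.5),
    "DL": (87.9, 1.9, 100.0),
    "Ours": (97.9, 0.5, 42.3),
}

MIN_GOAL = 90.0  # assumed acceptance threshold on goal-reach (%)
MAX_CVAR = 1.0   # assumed acceptance threshold on CVaR violations (%)

# Keep methods that meet both thresholds, then pick the lowest-energy one.
feasible = {
    name: row
    for name, row in results.items()
    if row[0] >= MIN_GOAL and row[1] <= MAX_CVAR
}
cheapest = min(feasible, key=lambda name: feasible[name][2])

# Ratio behind the "more than eight times" comparison of CVaR violations.
violation_ratio = results["IGG"][1] / results["Ours"][1]

print("feasible:", sorted(feasible))
print("lowest energy among feasible:", cheapest)
print(f"IGG / Ours CVaR-violation ratio: {violation_ratio:.1f}")
```

Under these assumed thresholds only the Always-ON baseline and B-COD+SAC remain, and B-COD+SAC is the cheaper of the two. Loosening either threshold would admit more methods, so the outcome depends on where the thresholds are set.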

Key Finding #3: B-COD stays within a 10 ± 1 ms envelope and out-scales analytic belief planners.

The latency table for Key Finding #3 above sweeps the workspace radius r from 25 m to 100 m (full lake sector). B-COD's latency is flat, 10.3 ± 0.6 ms throughout, because the belief crop is always down-sampled and the UNet's receptive field is fixed; compute therefore scales with network width, not with world area. The InfoGain-Greedy (IGG) baseline must update an n-cell covariance grid; its cost grows as Θ(r²), reaching 23 ms at 100 m. The branching factor of DESPOT-Lite's (DL) belief tree increases with visible free space; its runtime balloons to about 9000 ms over the same sweep, far beyond what an embedded loop can absorb. The takeaway is practical as well as theoretical: constant-time scaling lets B-COD replan over lake-scale horizons without violating the real-time threshold, whereas analytic planners become the computational bottleneck well before the map reaches lake scale.
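One way to read the scaling claims is to fit an empirical exponent to the tabulated latencies. The sketch below runs a least-squares fit of log latency against log radius on the six rows of the Key Finding #3 table. The helper `loglog_slope` is a hypothetical name, and a fit over a 4x range of radii is only a rough indicator, not a statement about asymptotic complexity.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x): the exponent p in y ~ x**p."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    return sxy / sxx

# Values copied from the Key Finding #3 latency table (radius in m, latency in ms).
radius_m = [25, 40, 55, 70, 85, 100]
latency_ms = {
    "B-COD": [9.8, 9.7, 9.6, 10.7, 10.4, 10.9],
    "IGG": [7.5, 10.9, 14.6, 18.2, 18.7, 23.3],
    "DL": [565, 1446, 2737, 4430, 6536, 9040],
}

exponents = {name: loglog_slope(radius_m, t) for name, t in latency_ms.items()}
for name, p in exponents.items():
    print(f"{name}: latency ~ r^{p:.2f}")
```

On these six points the fitted exponent is close to 2 for DL and close to 0 for B-COD. The IGG fit comes out below 2 over this range; a fixed per-call overhead that dominates at small radii would be one possible explanation, although the page does not say.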

Dataset

Our dataset consists of field logs collected from freshwater lake operations, including twelve day-time and eight night-time sorties. The SeaRobotics Surveyor ASV collected:

32-beam spinning LiDAR point clouds (10 Hz, ROS/PCD)
RGB images (20 Hz, PNG)
Near-IR images under 850 nm active illumination (20 Hz, PNG)
RTK-GNSS fixes (5 Hz, NMEA)
Six-axis IMU messages (200 Hz, ROS/Imu)
Water-quality probe samples (2 Hz, CSV)

All topics share a chronologically consistent ROS /clock, with each log accompanied by recordings of wind and irradiance for domain-randomisation replay.
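Because the streams above run at different rates on one clock, a consumer typically has to pair samples by timestamp. The following is a minimal sketch of nearest-timestamp matching, assuming timestamps are available as sorted integers in milliseconds. The function names, the 50 ms tolerance and the synthetic timestamps are illustrative assumptions and are not part of the released tooling.

```python
from bisect import bisect_left

def nearest_index(sorted_stamps, t):
    """Index of the timestamp in sorted_stamps closest to t (ties go to the earlier one)."""
    i = bisect_left(sorted_stamps, t)
    if i == 0:
        return 0
    if i == len(sorted_stamps):
        return len(sorted_stamps) - 1
    before, after = sorted_stamps[i - 1], sorted_stamps[i]
    return i if (after - t) < (t - before) else i - 1

def align(reference, other, max_gap):
    """Pair each reference stamp with the nearest stamp in other, if within max_gap."""
    pairs = []
    for t in reference:
        j = nearest_index(other, t)
        if abs(other[j] - t) <= max_gap:
            pairs.append((t, other[j]))
    return pairs

# Synthetic one-second window in milliseconds; rates follow the list above.
lidar_ms = list(range(0, 1000, 100))  # 10 Hz LiDAR
gnss_ms = list(range(0, 1000, 200))   # 5 Hz RTK-GNSS

pairs = align(lidar_ms, gnss_ms, max_gap=50)
print(pairs)
```

With a 50 ms tolerance, every second LiDAR sweep is left unpaired because the closest GNSS fix is 100 ms away. Widening the tolerance or interpolating between fixes are the usual alternatives.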

An anonymized subset of the dataset (15 GB) is available for download at https://github.com/bcod-diffusion/dataset. Because anonymization takes time, we chose to release this subset first rather than the whole dataset. We plan to release the whole dataset (280 GB) under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

Real→Sim Transfer

Logs are imported into an in-house Unity 2022.3 + ROS 2 simulator that reconstructs the shoreline mesh, static obstacles, bathymetry and approximate above-surface lighting. Dynamic objects are re-instantiated with ground-truth trajectories.

Dataset Statistics

Modalities: 100% of clips contain LiDAR and IMU; the day camera appears in 72%, the night camera in 28%, GNSS in 64%, and the sonde in 18%
Belief spread: Median planar 1σ = 0.38 m; 95th percentile = 2.1 m
Lighting: Illumination spans 0.2–55 kLux; clips are evenly stratified into five bins for training/validation
Obstacles: Each snippet is annotated with the minimum range to shoreline and to floating hazards; mean 14.2 m, min 0.8 m
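The illumination stratification in the statistics above could be reproduced along the following lines. This sketch assumes that "evenly stratified" means five equal-count bins ordered by illumination, with the same validation fraction drawn from each bin. The 20% fraction, the clip names and the synthetic kLux values are hypothetical.

```python
import random

def stratified_split(lux_by_clip, n_bins=5, val_fraction=0.2, seed=0):
    """Split clips into train/val, drawing the same fraction from each illumination bin."""
    ordered = sorted(lux_by_clip, key=lux_by_clip.get)  # darkest to brightest
    rng = random.Random(seed)
    train, val = [], []
    for b in range(n_bins):
        # Equal-count bin boundaries over the sorted clip list.
        lo = b * len(ordered) // n_bins
        hi = (b + 1) * len(ordered) // n_bins
        members = ordered[lo:hi]
        rng.shuffle(members)
        n_val = round(val_fraction * len(members))
        val.extend(members[:n_val])
        train.extend(members[n_val:])
    return train, val

# 50 synthetic clips spanning 0.2 to 55 kLux on a log scale (illustrative values).
clips = {f"clip_{i:02d}": 0.2 * (55 / 0.2) ** (i / 49) for i in range(50)}

train_ids, val_ids = stratified_split(clips)
print(len(train_ids), "train clips,", len(val_ids), "validation clips")
```

Drawing the validation clips per bin keeps dim and bright conditions represented in both splits, which a plain random split over all clips would not guarantee.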


EXO2 and GPS Data Visualization

LiDAR Point Cloud Visualization

Camera Feed Visualization


For questions and collaborations, contact: gokulp2@illinois.edu

BibTeX

@inproceedings{Puthumanaillam2025BCoD,
  title     = {Belief-Conditioned One-Step Diffusion: Real-Time Trajectory Planning with Just-Enough Sensing},
  author    = {Gokul Puthumanaillam and Aditya Penumarti and Manav Vora and Paulo Padrao and Jose Fuentes and Leonardo Bobadilla and Jane Shin and Melkior Ornik},
  booktitle = {Conference on Robot Learning (CoRL)},
  year      = {2025},
  note      = {Oral (top 5%)},
  url       = {https://github.com/bcod-diffusion/bcod-diffusion.github.io/blob/main/paper.pdf}
}