Table 1. Metric comparison across ablations. ↑: higher is better; ↓: lower is better.
Ablation                        Collision ↑  Smoothness ↑  Best of 1 ↓  Best of 15 ↓
Ground Truth                    97.6         -             -            -
Ours (Full dataset)             90.6         4.76          0.81         0.41
w/o visual memory prediction    89.3         3.13          0.91         0.50
Ours (Pilot Dataset)            89.2         2.04          0.87         0.47
w/ Markovian past state         88.8         1.56          1.04         0.52
w/ Hybrid generation (I20 P10)  88.7         2.17          0.89         0.49
w/o attention                   86.6         2.78          1.00         0.49
w/ DDIM generation (n=30)       85.3         0.46          0.92         0.52
w/o semantic (RGBD only)        84.1         4.17          0.91         0.53
w/o visual input (Traj only)    82.5         2.04          1.19         0.48