Single-Scene Closed-Loop — 3 model variants (v2 fixed videos)

backbones 2026-05-17 20:13 dd855fd

Same eval scene for all runs (libero_spatial task 0, demo 0 init state). 1 episode, max_steps=600, --teleport --zero_rotation. All three models scored 0% success rate (SR): every rollout hit the 600-step timeout.

Pred jerk is reported in px/step, computed as the mean of |ΔΔpos| (the second difference of position) along the predicted target trajectory:

Model | val_px_err (train) | closed-loop SR | pred jerk (px) | Notes
----- | ------------------ | -------------- | -------------- | -----
(A) 2D AR, model_autoregressive_v2 | 8.0 (3 ep, stride=5) | 0% | 1.89 | Largest jerk → policy moves the most, but still in the wrong direction
(B) Voxel + abs xyz, pilot | 12.4 (2 ep, stride=5) | 0% | 0.25 | Frozen: voxel features overpowering EEF history
(C) Voxel + EEF-rel xyz, pilot | 13.3 (2 ep, stride=5) | 0% | 0.06 | Most frozen of all: predictions barely change step-to-step
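
For reference, the pred jerk metric defined above can be sketched as follows. This is a minimal illustration, not the evaluation script itself: the function name is hypothetical, and it assumes |ΔΔpos| means the Euclidean norm of the second difference p[t+1] - 2*p[t] + p[t-1] of the predicted 2D pixel targets. The report does not state which norm is used.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


def mean_abs_second_difference(traj: Sequence[Point]) -> float:
    """Mean magnitude of the second difference of a 2D pixel trajectory.

    Assumption: the magnitude is the Euclidean norm of
    p[t+1] - 2*p[t] + p[t-1]; the source does not specify the norm.
    """
    if len(traj) < 3:
        raise ValueError("need at least 3 points for a second difference")
    total = 0.0
    # Slide a window of three consecutive predicted targets.
    for (x0, y0), (x1, y1), (x2, y2) in zip(traj, traj[1:], traj[2:]):
        ddx = x2 - 2.0 * x1 + x0
        ddy = y2 - 2.0 * y1 + y0
        total += (ddx * ddx + ddy * ddy) ** 0.5
    return total / (len(traj) - 2)
```

Under this definition, a target that moves at constant velocity scores 0, and so does a target that never moves. A low value therefore indicates predictions that barely change from step to step, not necessarily a smooth and purposeful trajectory.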

Videos (same init state)

White dot = current EEF projected. Green crosshair = model predicted next-EEF cell. Gripper sign printed as g=+1 (close) / g=-1 (open).

(A) 2D AR

2D AR at val_px_err=8.0 — jerk 1.89, EEF wanders but never finds the bowl.

(B) Voxel + abs xyz

Voxel-abs pilot — jerk 0.25, essentially frozen.

(C) Voxel + EEF-relative xyz

Voxel-rel pilot — jerk 0.06, most frozen. May also be hit by the rel-PE unit-mismatch bug I flagged earlier.

Takeaways from the videos

All three videos show the same under-training pathology: the predicted target stays near the current EEF cell, so action deltas are only 1-2 mm and the arm makes no meaningful progress before max_steps. The voxel models are more frozen than the 2D AR model, which is consistent with their being trained for 2 epochs rather than 3 and, for (C), with a unit mismatch in the geometry PE input.
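
One way to quantify the "target stays near the current EEF" observation without watching the videos is to measure how often the predicted target falls within a small pixel radius of the current projected EEF. The sketch below is an assumption-laden illustration: the function name, the per-step input lists, and the tol_px threshold are all hypothetical and not part of the existing eval code.

```python
from math import hypot
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in pixels


def frozen_fraction(
    eef_px: Sequence[Point],
    target_px: Sequence[Point],
    tol_px: float = 1.0,  # hypothetical threshold, tune per image resolution
) -> float:
    """Fraction of steps whose predicted target lies within tol_px of the EEF."""
    if len(eef_px) != len(target_px) or len(eef_px) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    near = sum(
        1
        for (ex, ey), (tx, ty) in zip(eef_px, target_px)
        if hypot(tx - ex, ty - ey) <= tol_px
    )
    return near / len(eef_px)
```

A value close to 1.0 over a full rollout would match the frozen behaviour described for (B) and (C); a lower value combined with 0% SR would look more like (A), which moves but in the wrong direction.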