Pixel-Aligned Robot Actions (PARA): reformulating end-effector action prediction as a dense, pixel-aligned objective over image space.
Testing how well PARA's pixel-aligned formulation generalizes to out-of-distribution object positions and camera viewpoints. Comparing robustness and data efficiency against global-regression baselines (ACT, DINO-VLA, InternVL).
Comparing PARA's pixel-aligned regression head against a global-regression head, both on top of a video generation backbone (UVA). Testing the hypothesis that PARA is more data-efficient for learning joint video-action policies.
Pretraining PARA on the large-scale DROID dataset (100K+ trajectories) to test whether pixel-aligned prediction benefits from diverse cross-embodiment data.
Deploying PARA on a real Franka Panda arm to validate sim-to-real transfer and real-world pixel-aligned action prediction.
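
The contrast between a dense, pixel-aligned objective and global regression, which runs through the projects above, can be illustrated with a minimal NumPy sketch. It is an assumption-laden toy, not PARA's actual head or loss: it assumes the model emits one logit per pixel for the end-effector's image location, trains with cross-entropy over pixels, and reads out a location via soft-argmax. All function names here (`pixel_softmax`, `pixel_cross_entropy`, `soft_argmax`, `global_regression_loss`) are hypothetical.

```python
import numpy as np

def pixel_softmax(logits):
    """Normalize an (H, W) logit map into a probability map over pixels."""
    flat = logits.reshape(-1)
    flat = flat - flat.max()  # subtract the max for numerical stability
    probs = np.exp(flat)
    probs /= probs.sum()
    return probs.reshape(logits.shape)

def pixel_cross_entropy(logits, target_rc):
    """Dense objective (assumed): negative log-probability of the target pixel."""
    probs = pixel_softmax(logits)
    r, c = target_rc
    return float(-np.log(probs[r, c] + 1e-12))

def soft_argmax(logits):
    """Expected (row, col) location under the per-pixel distribution."""
    probs = pixel_softmax(logits)
    rows = np.arange(logits.shape[0])
    cols = np.arange(logits.shape[1])
    exp_row = float((probs.sum(axis=1) * rows).sum())
    exp_col = float((probs.sum(axis=0) * cols).sum())
    return exp_row, exp_col

def global_regression_loss(pred_rc, target_rc):
    """Baseline-style objective: squared error on one globally predicted coordinate."""
    diff = np.asarray(pred_rc, dtype=float) - np.asarray(target_rc, dtype=float)
    return float(np.sum(diff ** 2))

# Toy usage: a logit map sharply peaked at pixel (row=2, col=5).
logits = np.full((8, 8), -10.0)
logits[2, 5] = 10.0
target = (2, 5)

print("soft-argmax location:", soft_argmax(logits))
print("dense loss at target:", pixel_cross_entropy(logits, target))
print("global regression loss:", global_regression_loss((2.5, 4.0), target))
```

In this toy setup the dense loss supervises every pixel's logit through the softmax, whereas the global-regression loss supervises only a single coordinate vector. That difference is one plausible reading of why a pixel-aligned head might be more data-efficient, which is the hypothesis the projects above set out to test.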