Pixel Aligned Robot Actions (PARA) Are More Data-Efficient and Robust to New Viewpoints and Environments

Project Overview

[Figure: PARA project overview]

Method Overview

[Figure: PARA method overview]

TL;DR

PARA reformulates end-effector (EEF) coordinate regression as a per-pixel regression problem. Dense supervision and shift equivariance make it substantially more data-efficient and better at generalizing than global coordinate regression.
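
To make the per-pixel idea concrete, here is a minimal sketch in NumPy. It is an illustration under stated assumptions, not the PARA implementation: the Gaussian target, the argmax decoding, and the names `pixel_target_heatmap` and `decode_pixel` are all hypothetical choices made for this example. It shows how a single EEF pixel location can become a dense per-pixel target, and why a prediction that lives in the image plane moves with the image content when that content is translated.

```python
import numpy as np


def pixel_target_heatmap(u, v, height, width, sigma=2.0):
    """Dense per-pixel target: a Gaussian bump centred on the EEF pixel (u, v).

    Hypothetical target design; every pixel receives a supervision value,
    in contrast to a single (u, v) regression target.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    return np.exp(-((xs - u) ** 2 + (ys - v) ** 2) / (2.0 * sigma ** 2))


def decode_pixel(heatmap):
    """Read the EEF pixel back out as the location of the map's maximum."""
    v, u = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(u), int(v)


# A 48x64 map whose peak sits at column u=20, row v=12.
heatmap = pixel_target_heatmap(u=20, v=12, height=48, width=64)
print(decode_pixel(heatmap))  # (20, 12)

# Shift equivariance: translating the map by 5 rows and 7 columns
# translates the decoded pixel by the same offset.
shifted = np.roll(heatmap, shift=(5, 7), axis=(0, 1))
print(decode_pixel(shifted))  # (27, 17)
```

A global coordinate regressor has no such built-in structure: it must learn from data how image translations map to changes in its output vector.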

Methods compared: PARA (Ours), ACT, Motion Tracks

Results on Policy Trained on 40 Demonstrations (Same Viewpoint and Environment) (8x Speed)

PARA (Ours): 97% Task Completion Rate
ACT: 9% Task Completion Rate
Motion Tracks: 5% Task Completion Rate

New Viewpoint Zero-Shot

PARA (Ours): 52% Task Completion Rate
ACT: 0% Task Completion Rate
Motion Tracks: 0% Task Completion Rate

New Viewpoint with 5 Fine-tuning Episodes

PARA (Ours): 87% Task Completion Rate
ACT: 4% Task Completion Rate
Motion Tracks: 4% Task Completion Rate

New Environment

PARA (Ours): 94% Task Completion Rate
ACT: 0% Task Completion Rate
Motion Tracks: 6% Task Completion Rate

Other Tasks:

Wiping Table

PARA (Ours): 95% Task Completion Rate
ACT: 0% Task Completion Rate
Motion Tracks: 61% Task Completion Rate

Fold Towel

PARA (Ours): 97% Task Completion Rate
ACT: 11% Task Completion Rate
Motion Tracks: 42% Task Completion Rate