# PARA vs ACT — Key OOD Generalization Experiments

## Summary

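Six experiments comparing PARA and ACT on out-of-distribution (OOD) generalization. PARA has the higher success rate in all six. Delta is the absolute difference in success rate (PARA minus ACT, in percentage points).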
Report: `.agents/reports/backbones/2026-04-02_1621_ood_object_position_generalization_—_par.html`

| # | Experiment | PARA | ACT | Delta (pp) | OOD Axis |
|---|-----------|------|-----|-------|----------|
| 1 | Left → Right position extrapolation | **54%** | 1% | **+53** | Object position |
| 2 | Near → Far position extrapolation | **46%** | 7% | **+39** | Object position |
| 3 | Default → All viewpoints (zero-shot) | **61%** | 24% | **+37** | Camera viewpoint |
| 4 | Left → Right viewpoint hemisphere | **40%** | 10% | **+30** | Camera viewpoint |
| 5 | N=32 corner scaling | **54%** | 33% | **+21** | Data efficiency |
| 6 | Distractor robustness | **28%** | 10% | **+18** | Visual clutter |
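
The Delta column is plain subtraction of the two success rates. A minimal check of that arithmetic, with the rates copied from the table above (the dictionary keys are shortened labels chosen for this sketch, not identifiers from the codebase):

```python
# (PARA, ACT) success rates in percent, copied from the summary table.
results = {
    "left_right_position": (54, 1),
    "near_far_position": (46, 7),
    "default_to_all_viewpoints": (61, 24),
    "left_right_viewpoint_hemisphere": (40, 10),
    "n32_corner_scaling": (54, 33),
    "distractor_robustness": (28, 10),
}

# Delta is an absolute gap in percentage points, not a relative improvement.
deltas = {name: para - act for name, (para, act) in results.items()}

for name, delta in deltas.items():
    print(f"{name}: +{delta} pp")
```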

### Sanity Checks (in-distribution, both models work)

These confirm that both models succeed at positions inside the training distribution, so the OOD gap reflects generalization rather than a training failure.

| Experiment | Position | PARA | ACT |
|-----------|----------|------|-----|
| Left→Right sanity | far-left (in train) | SUCCESS | SUCCESS |
| Near→Far sanity | near (center of training region) | SUCCESS | SUCCESS |

**Sanity videos:**
- `media/para_lr_sanity_ep000_success.mp4` / `media/act_lr_sanity_ep000_success.mp4`
- `media/para_nf_sanity2_ep000_success.mp4` / `media/act_nf_sanity2_ep000_success.mp4`

### Multistage Breakdown (5×5 grids)

| Experiment | Model | MISS | GRASP | PLACE | Total |
|-----------|-------|------|-------|-------|-------|
| Left→Right | PARA | 6 | 4 | **15** | 25 |
| Left→Right | ACT | 20 | 2 | 3 | 25 |
| Near→Far | PARA | 10 | 3 | **12** | 25 |
| Near→Far | ACT | 17 | 7 | 1 | 25 |
| N=32 | PARA | 1 | 6 | **18** | 25 |
| N=32 | ACT | 10 | 7 | 8 | 25 |
| Distractor | PARA | 12 | 4 | **9** | 25 |
| Distractor | ACT | 21 | 2 | 2 | 25 |
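
The counts above can be turned into per-grid fractions with a short sketch. It assumes the three columns are mutually exclusive outcomes per grid cell (each row sums to 25) and that PLACE implies a successful grasp. Both are readings of the table, not something the notes state. The function and key names are hypothetical.

```python
# (MISS, GRASP, PLACE) counts per 5x5 grid, copied from the table above.
counts = {
    ("Left->Right", "PARA"): (6, 4, 15),
    ("Left->Right", "ACT"): (20, 2, 3),
    ("Near->Far", "PARA"): (10, 3, 12),
    ("Near->Far", "ACT"): (17, 7, 1),
    ("N=32", "PARA"): (1, 6, 18),
    ("N=32", "ACT"): (10, 7, 8),
    ("Distractor", "PARA"): (12, 4, 9),
    ("Distractor", "ACT"): (21, 2, 2),
}


def stage_fractions(miss, grasp, place):
    """Return the fraction of grid cells ending in each outcome."""
    total = miss + grasp + place
    return {
        "miss": miss / total,
        "grasp_only": grasp / total,
        "place": place / total,
        # Assumption: every PLACE cell also counts as a grasp.
        "grasp_or_better": (grasp + place) / total,
    }


for (experiment, model), row in counts.items():
    fractions = stage_fractions(*row)
    print(f"{experiment:12s} {model:4s} place={fractions['place']:.0%} "
          f"grasp_or_better={fractions['grasp_or_better']:.0%}")
```

These per-grid fractions are not the summary success rates: PARA on Left→Right gives 15/25 = 60% here versus 54% in the summary, so the two tables evidently count different things.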

---

## Datasets

### Object Position Dataset
```
Path:  /data/libero/ood_objpos_v3/libero_spatial/task_0/
Grid:  16×16, dx=[-0.40, -0.01], dy=[-0.30, +0.30] (39cm × 60cm)
Demos: 256, all at default viewpoint, clean scene
```
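
A minimal sketch of how the 16×16 grid of object shifts could be enumerated. It assumes evenly spaced points with both endpoints included, which the notes do not state. The `linspace` helper and variable names are illustrative only.

```python
def linspace(start, stop, n):
    """Return n evenly spaced values from start to stop, endpoints included."""
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


# Ranges copied from the dataset description (metres).
dx_values = linspace(-0.40, -0.01, 16)
dy_values = linspace(-0.30, 0.30, 16)

# One demo per (dx, dy) cell gives the 256 demos listed above.
grid = [(dx, dy) for dx in dx_values for dy in dy_values]

print(len(grid))                                # number of grid cells
print(round(dx_values[1] - dx_values[0], 4))    # dx spacing under this assumption
print(round(dy_values[1] - dy_values[0], 4))    # dy spacing under this assumption
```

Under this assumption the spacing is 2.6 cm in dx and 4 cm in dy.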

### Viewpoint Dataset
```
Path:  /data/libero/ood_viewpoint_v3/libero_spatial/task_0/
Grid:  8×8 viewpoints (θ: 0-25°, φ: 0-315°)
Demos: 640 (10 per viewpoint), random object positions from full range
```
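
The viewpoint counts can be checked the same way. This sketch assumes θ is evenly spaced over 0-25° and φ steps by 45°, which fits the stated 0-315° range with 8 values but is not confirmed by the notes.

```python
N_THETA = 8
DEMOS_PER_VIEWPOINT = 10

# Assumption: theta is evenly spaced with both endpoints included.
thetas = [25.0 * i / (N_THETA - 1) for i in range(N_THETA)]
# Assumption: phi advances in 45-degree steps: 0, 45, ..., 315.
phis = [float(p) for p in range(0, 360, 45)]

viewpoints = [(theta, phi) for theta in thetas for phi in phis]
total_demos = len(viewpoints) * DEMOS_PER_VIEWPOINT

print(len(viewpoints), total_demos)
```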

### Splits
```
Object position: /data/libero/ood_objpos_v3_splits/
Viewpoint:       /data/libero/ood_viewpoint_v3_splits/
```

---

## Checkpoints

```
Object position experiments:
  PARA: para_normalized_losses/libero/checkpoints/para_v2_exp3_near/best.pth   (near→far)
  ACT:  para_normalized_losses/libero/checkpoints/act_v2_exp3_near/best.pth
  PARA: para_normalized_losses/libero/checkpoints/para_v2_exp3_left/best.pth   (left→right)
  ACT:  para_normalized_losses/libero/checkpoints/act_v2_exp3_left/best.pth
  PARA: para_normalized_losses/libero/checkpoints/para_v2_exp4_n32/best.pth    (N=32)
  ACT:  para_normalized_losses/libero/checkpoints/act_v2_exp4_n32/best.pth
  PARA: para_normalized_losses/libero/checkpoints/para_v2_exp4_n64/best.pth    (N=64, also used for viewpoint & distractor)
  ACT:  para_normalized_losses/libero/checkpoints/act_v2_exp4_n64/best.pth

Viewpoint experiments:
  PARA: para_normalized_losses/libero/checkpoints/para_v3_vp_left_hemi/best.pth  (left hemisphere)
  ACT:  para_normalized_losses/libero/checkpoints/act_v3_vp_left_hemi/best.pth
```
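
The checkpoint paths follow a regular `{model}_{run}/best.pth` pattern, so a small lookup can avoid copy-paste mistakes when scripting evals. This is a sketch: `RUNS`, its keys, and `checkpoint_path` are hypothetical names, and only the path strings come from the list above.

```python
from pathlib import PurePosixPath

CKPT_ROOT = PurePosixPath("para_normalized_losses/libero/checkpoints")

# Hypothetical experiment keys mapped to the run suffixes listed above.
RUNS = {
    "near_far": "v2_exp3_near",
    "left_right": "v2_exp3_left",
    "n32": "v2_exp4_n32",
    "n64": "v2_exp4_n64",            # also used for viewpoint and distractor evals
    "vp_left_hemi": "v3_vp_left_hemi",
}


def checkpoint_path(model, experiment):
    """Build the best.pth path for a model ('para' or 'act') and experiment key."""
    if model not in ("para", "act"):
        raise ValueError(f"unknown model: {model!r}")
    return str(CKPT_ROOT / f"{model}_{RUNS[experiment]}" / "best.pth")


print(checkpoint_path("para", "near_far"))
print(checkpoint_path("act", "vp_left_hemi"))
```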

---

## Report Media Files

All files are in `.agents/reports/backbones/media/`:

### Experiment 1: Left → Right Position
```
Distribution:  exp3_leftright_distribution.png
Sanity PARA:   para_lr_sanity_ep000_success.mp4
Sanity ACT:    act_lr_sanity_ep000_success.mp4
Test PARA:     para_lr_test_ep000_success.mp4
Test ACT:      act_lr_test_ep000_fail.mp4
```

### Experiment 2: Near → Far Position
```
Distribution:  exp3_train_test_distribution.png
Sanity PARA:   para_nf_sanity2_ep000_success.mp4
Sanity ACT:    act_nf_sanity2_ep000_success.mp4
Test PARA:     para_nf_test_ep000_success.mp4
Test ACT:      act_nf_test_ep000_fail.mp4
```

### Experiment 3: Default → All Viewpoints
```
Distribution:         vp_default_to_all_polar_overview.png
PARA rollout grid:    vp_default_to_all_para_rollout_grid.mp4
ACT rollout grid:     vp_default_to_all_act_rollout_grid.mp4
```

### Experiment 4: Viewpoint Hemisphere
```
Polar overview:       vp_lr_polar_overview.png
Frame comparison:     vp_leftright_hemi_distribution.png
```

### Experiment 5: N=32 Corner Scaling
```
Distribution:  exp4_n32_distribution.png
```

### Experiment 6: Distractor Robustness
```
Distribution:  distractor_robustness_distribution.png
PARA video:    para_distractor_contrast_ep000_success.mp4
ACT video:     act_distractor_contrast_ep000_fail.mp4
```

---

## Eval Commands

All evals use the following environment:
```bash
export PYTHONPATH=/data/cameron/LIBERO:$PYTHONPATH
export DINO_REPO_DIR=/data/cameron/keygrip/dinov3
export DINO_WEIGHTS_PATH=/data/cameron/keygrip/dinov3/weights/dinov3_vits16plus_pretrain_lvd1689m-4057cbaa.pth
```
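
A missing variable is easy to overlook when launching many evals. The following is a minimal pre-flight check, assuming only that the variables exported above must be non-empty. `missing_env` is a hypothetical helper, not part of `eval.py`.

```python
import os

# Variable names taken from the exports above.
REQUIRED_VARS = ("PYTHONPATH", "DINO_REPO_DIR", "DINO_WEIGHTS_PATH")


def missing_env(env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
```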

### Object position eval
```bash
python eval.py --model_type {para|act} --checkpoint CKPT \
    --benchmark libero_spatial --task_id 0 --n_episodes 5 \
    --teleport --zero_rotation --clean_scene --max_steps 600 \
    --shift_dx DX --shift_dy DY --out_dir OUT --save_video
```
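
To sweep a grid of positions, the command above can be generated per `(DX, DY)` pair. The sketch below uses only flags that appear in this document. The function name, the shift values, the output-directory layout, and the `CKPT` placeholder are hypothetical and should be replaced with the real test grid and paths.

```python
import shlex


def objpos_eval_cmd(model_type, checkpoint, dx, dy, out_dir,
                    clean_scene=True, n_episodes=5):
    """Build the eval.py argument list for one object-position shift."""
    cmd = [
        "python", "eval.py",
        "--model_type", model_type,
        "--checkpoint", checkpoint,
        "--benchmark", "libero_spatial",
        "--task_id", "0",
        "--n_episodes", str(n_episodes),
        "--teleport", "--zero_rotation",
    ]
    if clean_scene:
        cmd.append("--clean_scene")
    cmd += [
        "--max_steps", "600",
        "--shift_dx", f"{dx:.4f}",
        "--shift_dy", f"{dy:.4f}",
        "--out_dir", out_dir,
        "--save_video",
    ]
    return cmd


# Hypothetical shifts for illustration; substitute the actual test positions.
example_shifts = [(-0.30, 0.10), (-0.30, 0.20)]
for dx, dy in example_shifts:
    out_dir = f"out/para_dx{dx:+.2f}_dy{dy:+.2f}"
    print(shlex.join(objpos_eval_cmd("para", "CKPT", dx, dy, out_dir)))
```

Passing `clean_scene=False` drops the `--clean_scene` flag, which is the only difference between this command and the distractor eval.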

### Viewpoint eval
```bash
python eval.py --model_type {para|act} --checkpoint CKPT \
    --benchmark libero_spatial --task_id 0 --n_episodes 3 \
    --teleport --zero_rotation --clean_scene --max_steps 600 \
    --shift_dx 0.0509 --shift_dy -0.2063 \
    --cam_theta THETA --cam_phi PHI --out_dir OUT --save_video
```

### Distractor eval (no `--clean_scene`)
```bash
python eval.py --model_type {para|act} --checkpoint CKPT \
    --benchmark libero_spatial --task_id 0 --n_episodes 5 \
    --teleport --zero_rotation --max_steps 600 \
    --shift_dx DX --shift_dy DY --out_dir OUT --save_video
```
