# libero/ — LIBERO Simulation Training

## Goal
Train PARA on LIBERO simulation benchmarks for rapid prototyping of the pixel-aligned action representation before moving to real robot data.

## Files
| File | Purpose |
|---|---|
| `train.py` | Main training script — PARA on LIBERO demos |
| `model.py` | `TrajectoryHeatmapPredictor` — DINOv3 + volume head + gripper/rotation MLP heads |
| `data.py` | `RealTrajectoryDataset` — LIBERO HDF5 loader, renders frames via `OffScreenRenderEnv` |
| `eval.py` | Closed-loop LIBERO rollout evaluation with open-loop window execution |
| `utils.py` | `recover_3d_from_direct_keypoint_and_height` — unprojection geometry |
| `debug_libero_projection.py` | Sanity check: project GT EEF into camera image, save overlay PNG/video |

## Server Setup (USC GVL Lab)

### 1. Clone repo
```bash
git clone https://github.com/cameronosmith/para.git /data/cameron/para
cd /data/cameron/para
```

### 2. Set up conda environment
```bash
# Use existing torch124 env (Python 3.10, PyTorch+CUDA already installed)
conda activate /data2/cameron/miniconda3/envs/torch124

# Or create a fresh one:
# conda create -n para python=3.10 -y && conda activate para
# conda install pytorch torchvision pytorch-cuda=12.4 -c pytorch -c nvidia -y
```

### 3. Install LIBERO
```bash
cd /data/cameron
git clone https://github.com/Lifelong-Robot-Learning/LIBERO.git
cd LIBERO
pip install -e .
cd /data/cameron/para
```

### 4. Install remaining dependencies
```bash
pip install wandb tqdm opencv-python h5py scipy
```

### 5. Download LIBERO datasets
```bash
mkdir -p /data/libero
cd /data/cameron/LIBERO
python benchmark_scripts/download_libero_datasets.py \
  --datasets libero_spatial \
  --download-dir /data/libero
```

### 6. Set up DINOv3 weights

The model uses a custom DINOv3 variant loaded from local weights. Two options:

**Option A: Copy weights from Mac (scp)**
```bash
# From Mac:
scp -P 24891 -r /Users/cameronsmith/Projects/robotics_testing/random/dinov3 \
  cameronsmith@frp-ask.com:/data/cameron/dinov3
```

**Option B: Set env vars to use a standard DINOv2 hub load**
*(requires modifying the backbone loading in `model.py`; not covered here)*

Once the weights are on the server, set these env vars before training:
```bash
export DINO_REPO_DIR=/data/cameron/dinov3
export DINO_WEIGHTS_PATH=/data/cameron/dinov3/weights/dinov3_vits16plus_pretrain_lvd1689m-4057cbaa.pth
```

### 7. Configure W&B
```bash
wandb login  # enter API key from wandb.ai/settings
```

### 8. Launch training
```bash
cd /data/cameron/para

# Check available GPUs first:
nvidia-smi

# Launch on e.g. GPU 4 (first free one):
CUDA_VISIBLE_DEVICES=4 \
DINO_REPO_DIR=/data/cameron/dinov3 \
DINO_WEIGHTS_PATH=/data/cameron/dinov3/weights/dinov3_vits16plus_pretrain_lvd1689m-4057cbaa.pth \
LIBERO_DATA_PATH=/data/libero \
tmux new-session -d -s para_train \
  "python libero/train.py \
    --benchmark libero_spatial \
    --task_id 0 \
    --camera agentview \
    --max_demos 50 \
    --batch_size 8 \
    --epochs 1000 \
    --run_name para_libero_spatial_t0_server \
    --wandb_mode online \
    2>&1 | tee /data/cameron/para/train_server.log"

# Attach to watch:
tmux attach -t para_train
```

## How Data Works

LIBERO demos are stored as HDF5 files. For each frame, we:
1. Call `env.set_init_state(states[t])` to set the simulator
2. Read `obs["robot0_eef_pos"]` (world-frame 3D EEF position)
3. Use demo `actions[t, 6]` for gripper value (NOT qpos — Panda symmetric fingers cancel to ~0)
4. Call `project_points_from_world_to_camera(...)` → row/col pixel
5. Use `get_camera_extrinsic_matrix` (camera→world) for 3D recovery
6. Use `get_camera_transform_matrix` (world→camera) for projection/visualization
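Steps 4–6 reduce to standard pinhole geometry. Below is a self-contained numpy re-derivation of the projection (this is not robosuite's implementation; the intrinsics `K` and identity extrinsic are illustrative values only):

```python
import numpy as np

def project_world_to_pixel(point_w, world_to_camera, K):
    """Project a world-frame 3D point to (row, col) pixel coordinates.

    world_to_camera: 4x4 extrinsic mapping world -> camera frame
    K:               3x3 pinhole intrinsics
    """
    p_cam = world_to_camera @ np.append(point_w, 1.0)  # to camera frame (homogeneous)
    u, v, w = K @ p_cam[:3]                            # perspective projection
    return int(round(v / w)), int(round(u / w))        # (row, col) = (v, u) after depth divide

# Illustrative 448x448 camera at the world origin looking along +Z (OpenCV convention)
K = np.array([[600.0,   0.0, 224.0],
              [  0.0, 600.0, 224.0],
              [  0.0,   0.0,   1.0]])
row, col = project_world_to_pixel(np.array([0.1, 0.0, 2.0]), np.eye(4), K)  # → (224, 254)
```

In the real loader, `world_to_camera` comes from `get_camera_transform_matrix` and the result still needs the `flipud` convention described below.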

**Key convention:** LIBERO obs images must be flipped with `np.flipud(rgb)` before use. The pixel v coordinate from projection is already correct in flipped-image space (no additional flip needed).

**Height:** stored as absolute world-Z (`eef_pos[2]`), discretized over `[MIN_HEIGHT, MAX_HEIGHT]` computed from dataset stats (cached in `checkpoints/<run_name>/dataset_stats.json`).

**Gripper:** demo `actions[:, 6]` is a clean -1.0 (open) / +1.0 (close) signal.
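The height discretization above can be sketched as follows (bin count matches `N_HEIGHT_BINS`; the `MIN_HEIGHT`/`MAX_HEIGHT` values here are illustrative, not real dataset stats — gripper and rotation use the same min/max binning scheme):

```python
import numpy as np

N_HEIGHT_BINS = 32
MIN_HEIGHT, MAX_HEIGHT = 0.8, 1.3  # illustrative; real values come from dataset_stats.json

def height_to_bin(z):
    """Discretize absolute world-Z into one of N_HEIGHT_BINS indices."""
    frac = (z - MIN_HEIGHT) / (MAX_HEIGHT - MIN_HEIGHT)
    return int(np.clip(round(frac * (N_HEIGHT_BINS - 1)), 0, N_HEIGHT_BINS - 1))

def bin_to_height(b):
    """Invert: bin index back to world-Z (exact only at bin centers)."""
    return MIN_HEIGHT + b / (N_HEIGHT_BINS - 1) * (MAX_HEIGHT - MIN_HEIGHT)
```

Round-tripping through the bins loses at most half a bin width of precision, which bounds the height error of the decoded predictions.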

## Architecture Details

```
Input: (B, 3, 448, 448)
  → DINOv3 ViT-S/16 patch features: (B, D=384, 28, 28)
  → bilinear upsample: (B, D, 64, 64)
  → 3× Conv2d(D, D, 3×3, padding=1) + GELU: (B, D, 64, 64)
  → volume_head (1×1 conv): (B, N_WINDOW × N_HEIGHT_BINS, 64, 64)

For gripper/rotation (indexed at query pixel):
  feats[b, :, py, px] → (B, N_WINDOW, D)
  → LayerNorm → Linear(D, D) → GELU → Linear(D, N_BINS)
  → gripper: (B, N_WINDOW, N_GRIPPER_BINS=32)
  → rotation: (B, N_WINDOW, 3, N_ROT_BINS=32)  ← 3 Euler axes
```

**Constants** (in `model.py`):
| Name | Value |
|---|---|
| N_WINDOW | 6 timesteps |
| N_HEIGHT_BINS | 32 |
| N_GRIPPER_BINS | 32 |
| N_ROT_BINS | 32 |
| PRED_SIZE | 64 |
| IMAGE_SIZE | 448 |
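The shapes in the diagram can be sanity-checked with a stripped-down head stack (a random tensor stands in for the DINOv3 patch features; the layer choices mirror the diagram but this is a sketch, not the exact `model.py`):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

D, N_WINDOW, N_HEIGHT_BINS = 384, 6, 32
N_GRIPPER_BINS, N_ROT_BINS, PRED_SIZE = 32, 32, 64

class HeadStack(nn.Module):
    """Volume + per-pixel MLP heads on top of backbone patch features."""
    def __init__(self):
        super().__init__()
        convs = []
        for _ in range(3):
            convs += [nn.Conv2d(D, D, 3, padding=1), nn.GELU()]
        self.convs = nn.Sequential(*convs)
        self.volume_head = nn.Conv2d(D, N_WINDOW * N_HEIGHT_BINS, 1)
        self.gripper_mlp = nn.Sequential(
            nn.LayerNorm(D), nn.Linear(D, D), nn.GELU(), nn.Linear(D, N_GRIPPER_BINS))
        self.rot_mlp = nn.Sequential(
            nn.LayerNorm(D), nn.Linear(D, D), nn.GELU(), nn.Linear(D, 3 * N_ROT_BINS))

    def forward(self, patch_feats, px, py):
        # patch_feats: (B, D, 28, 28); px, py: (B, N_WINDOW) query pixel per timestep
        feats = F.interpolate(patch_feats, size=(PRED_SIZE, PRED_SIZE),
                              mode="bilinear", align_corners=False)
        feats = self.convs(feats)                      # (B, D, 64, 64)
        B = feats.shape[0]
        volume = self.volume_head(feats)               # (B, N_WINDOW * N_HEIGHT_BINS, 64, 64)
        volume = volume.view(B, N_WINDOW, N_HEIGHT_BINS, PRED_SIZE, PRED_SIZE)
        b = torch.arange(B)[:, None]                   # broadcasts against (B, N_WINDOW)
        q = feats[b, :, py, px]                        # (B, N_WINDOW, D): one feature per query pixel
        gripper = self.gripper_mlp(q)                  # (B, N_WINDOW, N_GRIPPER_BINS)
        rotation = self.rot_mlp(q).view(B, N_WINDOW, 3, N_ROT_BINS)
        return volume, gripper, rotation
```

Note the advanced-indexing trick `feats[b, :, py, px]`: because the advanced indices are separated by a slice, the broadcast `(B, N_WINDOW)` dims come first and the channel dim last, matching the `(B, N_WINDOW, D)` shape in the diagram.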

## Loss

```
total = volume_loss + 0.5 * gripper_loss + 0.5 * rotation_loss
```
- `volume_loss`: CE over the flattened (N_HEIGHT_BINS × H × W) volume per timestep; the target index combines the GT pixel and height bin
- `gripper_loss`: CE over N_GRIPPER_BINS at the GT-pixel feature
- `rotation_loss`: mean CE over the 3 Euler axes at the GT-pixel feature
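One reading of the loss, using the shapes from the architecture section (a sketch under those assumptions; the exact flattening and reduction in `train.py` may differ):

```python
import torch
import torch.nn.functional as F

def para_loss(volume_logits, gripper_logits, rot_logits,
              volume_targets, gripper_targets, rot_targets):
    """volume_logits: (B, W, H_bins, 64, 64); all targets are class indices.

    volume_targets: (B, W) flattened (height_bin, row, col) index per timestep.
    gripper_targets: (B, W); rot_targets: (B, W, 3).
    """
    B, W = volume_logits.shape[:2]
    # CE over the flattened volume per timestep
    vol = F.cross_entropy(volume_logits.reshape(B * W, -1),
                          volume_targets.reshape(B * W))
    grip = F.cross_entropy(gripper_logits.reshape(-1, gripper_logits.shape[-1]),
                           gripper_targets.reshape(-1))
    # mean CE over the 3 Euler axes
    rot = F.cross_entropy(rot_logits.reshape(-1, rot_logits.shape[-1]),
                          rot_targets.reshape(-1))
    return vol + 0.5 * grip + 0.5 * rot
```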

## Checkpoints

Saved to `libero/checkpoints/<run_name>/latest.pth` and `best.pth`.
Each checkpoint stores `min_height`, `max_height`, `min_rot`, `max_rot`, `min_gripper`, `max_gripper` for reproducibility.

## Eval

```bash
python libero/eval.py \
  --checkpoint libero/checkpoints/<run_name>/best.pth \
  --benchmark libero_spatial --task_id 0 \
  --n_episodes 10 --save_video
```

Open-loop window execution: the model predicts N_WINDOW=6 future actions from the current frame, executes all 6, then re-predicts.
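The window execution can be sketched as a generic receding-horizon loop (`env` and `predict_window` are placeholder interfaces for illustration, not the actual `eval.py` API; the real loop also handles observation preprocessing and video saving):

```python
def open_loop_rollout(env, predict_window, max_steps=300, n_window=6):
    """Receding-window execution: predict n_window actions, run them all, repeat.

    env:            anything with reset() -> obs and step(a) -> (obs, reward, done, info)
    predict_window: callable obs -> list of >= n_window decoded actions
    Returns (done, steps_taken).
    """
    obs = env.reset()
    steps = 0
    while steps < max_steps:
        # Re-predict only at window boundaries; execute the whole window open-loop
        for action in predict_window(obs)[:n_window]:
            obs, reward, done, info = env.step(action)
            steps += 1
            if done or steps >= max_steps:
                return done, steps
    return False, steps
```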

## Debugging & Experiment Protocol

**When eval results are 0% or unexpectedly bad:**
- Treat as a BUG, not a finding. Do NOT report until diagnosed.
- Trace the full eval code path for this specific model type — verify each stage activates.
- Add debug prints to check predictions → decode → execution.
- Compare with a known-working model on the same eval to isolate the issue.
- If model loss converged but eval fails, the bug is in the eval decode, not the model.

**Before reporting eval results, verify:**
1. The eval code path actually runs for this model type (e.g., does `--teleport` activate?)
2. Predictions look reasonable (print a few predicted vs GT values)
3. If 0% on train data with converged loss, it's a bug — fix it before reporting

**General approach:**
- Investigate unexpected results to completion before reporting. Don't wait for the user as bottleneck.
- Keep a running checklist of goals, current results, and next steps.
- Be a scientist: form hypotheses, test them, iterate until findings are coherent.

## Current Status
- [x] Pipeline verified on Mac (MPS), model trains and sometimes succeeds at libero_spatial task 0
- [ ] Full server training run with all 50 demos
- [ ] Eval across all 10 libero_spatial tasks
