# para_panda_stuff — PARA on a real Franka Panda

Welcome, Runhao. This repo is your runway for getting PARA running on a
real Panda. It contains:

- A working **PARA model** (DINOv3 → pixel-aligned heatmap volume) and a
  training script tuned for real Panda data.
- The **streaming + dataset** pipeline (live joint states over rosbridge,
  RealSense capture, dataset parsing, MuJoCo rendering).
- The **hand-eye calibration** scripts you'll use to recover the camera's
  pose in the robot's world frame from ArUco detections.
- All the **MuJoCo robot models** and the ExoConfigs needed by the above.

If you read nothing else, read these three files first — in this order:

1. [`docs/system_overview.md`](docs/system_overview.md) — what PARA is, why
   hand-eye, how the pieces fit.
2. [`docs/always_visualize.md`](docs/always_visualize.md) — the single most
   important workflow rule. Don't skip it.
3. [`TASKS.md`](TASKS.md) — your four-stage onboarding roadmap, ending in
   "robot picks up the bowl."

## Repo layout

```
para_panda_stuff/
├── README.md                         ← you are here
├── CLAUDE.md                         ← context for you and your sub-agents
├── TASKS.md                          ← four-stage onboarding roadmap
├── requirements.txt
├── docs/
│   ├── system_overview.md            ← PARA + hand-eye, in plain English
│   ├── always_visualize.md           ← the workflow rule
│   └── server_setup.md               ← env, DINO weights, ROS bridge, cameras
├── para/
│   ├── CLAUDE.md
│   └── model.py                      ← TrajectoryHeatmapPredictor (DINOv3)
└── panda_streaming/
    ├── CLAUDE.md
    ├── data_panda_para.py            ← Dataset + camera intrinsics/extrinsics
    ├── train_panda_para.py           ← Training entry point
    ├── parse_video_into_episodes_panda.py
    ├── simple_dataset_record_panda.py
    ├── stream_panda_with_cam.py      ← live RealSense + MuJoCo overlay
    ├── stream_panda_with_vis.py      ← MuJoCo viewer with live joint states
    ├── deploy_ik_sequence.py         ← deploy IK joint trajectory on robot
    ├── test_ik_recovery.py           ← reproduce trajectory in sim via IK
    ├── exo_utils.py                  ← ArUco pose estimation helpers
    ├── vis_dataset_gt.py             ← inspect a recorded episode
    ├── ExoConfigs/                   ← Panda + ArUco board configs
    ├── robot_models/franka_emika_panda/
    ├── scripts/                      ← start_robot_server, run_teleop
    └── hand_eye_calib/               ← see hand_eye_calib/CLAUDE.md
        ├── CLAUDE.md
        ├── calibrate.py              ← joint hand-eye solver (no privileged info)
        ├── command_calib_poses.py    ← command Panda to each calibration pose
        ├── panda_exo_handeye.py
        ├── render_test.py
        └── viewer_test.py
```

## Goal in one paragraph

You will be running PARA experiments end-to-end on a Panda: recording
data, training, deploying. The piece that makes this possible is
**calibrating the camera's pose in the robot's world frame using hand-eye
calibration**, so we can lift the predictions from PARA's pixel-aligned
heads back into 3D world points the robot can act on. Without an accurate
camera pose, the 3D point we recover from `(u, v) + height bin` won't match
where the gripper actually needs to go. The hand-eye calibration is the
bridge between the image and the robot.
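
To make the lifting step concrete, here is a minimal sketch of how a pixel plus a height could become a world point. It is an illustration, not the code in this repo. It assumes an undistorted pinhole camera with intrinsics `K`, a camera-to-world transform `T_world_cam` (the quantity hand-eye calibration is meant to recover), and that a height bin has already been converted to a world-frame height `z_world` in meters. The function name `lift_pixel_to_world` and all numbers are hypothetical.

```python
import numpy as np


def lift_pixel_to_world(u, v, z_world, K, T_world_cam):
    """Intersect the camera ray through pixel (u, v) with the plane z = z_world.

    Assumptions (not taken from the repo): pinhole model without distortion,
    T_world_cam is a 4x4 camera-to-world transform, z_world is in meters.
    """
    # Ray direction in the camera frame for this pixel.
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])

    R = T_world_cam[:3, :3]   # camera-to-world rotation
    origin = T_world_cam[:3, 3]  # camera center in world coordinates
    d_world = R @ d_cam

    if abs(d_world[2]) < 1e-9:
        raise ValueError("ray is parallel to the height plane")

    # Scale along the ray at which it reaches the requested height.
    s = (z_world - origin[2]) / d_world[2]
    if s <= 0:
        raise ValueError("height plane is behind the camera")
    return origin + s * d_world


# Hypothetical setup: camera 1 m above the world origin, looking straight down.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
T_world_cam = np.eye(4)
T_world_cam[:3, :3] = np.diag([1.0, -1.0, -1.0])  # camera z points along world -z
T_world_cam[:3, 3] = [0.0, 0.0, 1.0]

p = lift_pixel_to_world(420, 240, 0.5, K, T_world_cam)
print(p)  # approximately (0.1, 0.0, 0.5)
```

Every term in the result depends on `T_world_cam`, so any error in the calibrated camera pose moves the recovered point directly. That is why the calibration has to be right before deployment.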

## What "done" looks like for onboarding

When you can run `deploy_ik_sequence.py` with a checkpoint trained on data
*you* recorded, and the Panda picks up a bowl, you're done with onboarding.
See [`TASKS.md`](TASKS.md) for the four-stage path that gets you there.

## Talking to Cameron / sub-agents

You'll be running with Claude sub-agents. Every important folder has a
`CLAUDE.md` describing what's in it, what's working, and what's not.
**Update those files as you learn things** — they're the context your
agents will read on every fresh session.