# para_panda_stuff — Project context for Claude agents

This repo is Runhao's onboarding workspace for running PARA on a real
Franka Panda. You are likely a Claude sub-agent helping him on a specific
slice (calibration, dataset, model, deployment). Read this file first,
then the `CLAUDE.md` in whichever sub-folder you're working in.

## What PARA is

PARA reformulates end-effector (EEF) action prediction as a **pixel-aligned**
objective. Instead of regressing a global pose, the model predicts:

- a 2D heatmap over the image: argmax → `(u, v)` pixel where the EEF
  should go,
- per-pixel logits over `N_HEIGHT_BINS` height buckets along that pixel's
  ray (world-frame Z),
- per-pixel gripper and rotation predictions, indexed at the GT pixel
  during training (teacher forcing) or the argmax pixel at inference.
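
A minimal NumPy sketch of how the three outputs above could be decoded at inference. The array shapes, the `decode_prediction` helper, and the uniform `z_min`/`z_max` bin range are assumptions for illustration; they are not taken from `para/model.py`.

```python
import numpy as np

def decode_prediction(heatmap, height_logits, gripper_logits, z_min, z_max):
    """Decode one timestep from per-pixel outputs.

    Assumed (hypothetical) shapes: heatmap (H, W),
    height_logits (N_HEIGHT_BINS, H, W), gripper_logits (H, W).
    """
    # Argmax over the flattened heatmap, mapped back to (row, col) = (v, u).
    v, u = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    # Per-pixel heads are read out only at the selected pixel.
    height_bin = int(np.argmax(height_logits[:, v, u]))
    n_bins = height_logits.shape[0]
    # Assumption: bins are uniform over [z_min, z_max]; use the bin centre.
    z = z_min + (height_bin + 0.5) * (z_max - z_min) / n_bins
    gripper_logit = float(gripper_logits[v, u])
    return int(u), int(v), height_bin, float(z), gripper_logit
```

During training the same readout would index the heads at the GT pixel instead of the argmax.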

3D recovery: given `(u, v)` and a height bin, unproject to a 3D world
point by intersecting that pixel's camera ray (from the intrinsics +
extrinsics) with the horizontal plane at the bin's world-frame Z.
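
The unprojection can be sketched as a ray-plane intersection. This is a sketch under stated assumptions, not the repo's implementation: `K` is a 3x3 pinhole intrinsics matrix, `T_cam_world` is a 4x4 transform that maps world points into an OpenCV-convention camera frame, and the function name is hypothetical.

```python
import numpy as np

def unproject_pixel_at_height(u, v, z_world, K, T_cam_world):
    """Intersect the camera ray through pixel (u, v) with the plane Z = z_world.

    Assumption: p_cam = R @ p_world + t, with [R | t] taken from T_cam_world.
    """
    R, t = T_cam_world[:3, :3], T_cam_world[:3, 3]
    cam_center = -R.T @ t                                 # camera origin in world frame
    ray_cam = np.linalg.solve(K, np.array([u, v, 1.0]))   # K^-1 @ [u, v, 1]
    ray_world = R.T @ ray_cam                             # ray direction in world frame
    if abs(ray_world[2]) < 1e-9:
        raise ValueError("ray is parallel to the height plane")
    # Solve cam_center.z + s * ray_world.z = z_world for the ray parameter s.
    s = (z_world - cam_center[2]) / ray_world[2]
    return cam_center + s * ray_world
```

A negative `s` would mean the plane lies behind the camera, which usually points to a wrong extrinsic or an implausible height bin.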

This is why **hand-eye calibration matters**: without a correct
`T_cam_world`, the unprojection is wrong and the 3D target the model
"means" is not the 3D target the robot reaches.

## What's in this repo

| Path | Purpose |
|---|---|
| `para/model.py` | `TrajectoryHeatmapPredictor` — DINOv3 ViT-S/16 + heads. |
| `panda_streaming/train_panda_para.py` | Training driver for real Panda data. |
| `panda_streaming/data_panda_para.py` | Dataset class + cached `T_CAM_WORLD` and `CAM_K` (will be replaced with the calibrated values once Runhao re-runs hand-eye). |
| `panda_streaming/stream_panda_with_*.py` | Live joint states + camera tooling. |
| `panda_streaming/simple_dataset_record_panda.py` | Record a session. |
| `panda_streaming/parse_video_into_episodes_panda.py` | Split a recording into labeled episodes. |
| `panda_streaming/test_ik_recovery.py` | Sanity-check: take recorded EEF poses → IK → render and compare. |
| `panda_streaming/deploy_ik_sequence.py` | Push joint trajectories to the real robot via rosbridge. |
| `panda_streaming/hand_eye_calib/` | All calibration scripts. |
| `panda_streaming/ExoConfigs/` | MuJoCo + ArUco board configs (Panda only). |
| `panda_streaming/robot_models/franka_emika_panda/` | MJCF + meshes. |

Heavy folders that were trimmed when copying from the parent repo:
`franka_fr3*` MJCFs, the SO100/ARX exoskeletons, and prior wandb/checkpoint
runs. If you need them, ask Cameron — they live in `/data/cameron/para`.

## Hardware + connection (lab GVL → robot box)

The robot box is reachable via an ngrok TCP tunnel. The connection details
change daily; check `/data/cameron/agents_stuff/agents/panda/connection.txt`
or ask Cameron for the current host/port. Once you have it,
`panda_streaming/scripts/start_robot_server.sh <host> <port>` brings up:

| tmux session | What it runs |
|---|---|
| `panda_tunnel` | local SSH tunnel: `localhost:9090` → remote rosbridge `9090` |
| `panda_rosbridge` | `ros2 launch rosbridge_server rosbridge_websocket_launch.xml` |
| `panda_driver` | Franka FR3 driver (`arm_id:=fr3` on the real box) |

ROS2 joint names: typically `panda_joint{1..7}`, sometimes
`fr3_joint{1..7}` (legacy). The streaming scripts handle both.
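
One way to accept either prefix is to match on the joint index and ignore everything else. The helper below is a hypothetical sketch, not the code used by the streaming scripts; it assumes you already have parallel `names` and `positions` lists from a joint-state message.

```python
import re

# Matches panda_joint1..7 and fr3_joint1..7, but not finger joints.
_ARM_JOINT_RE = re.compile(r"^(?:panda|fr3)_joint([1-7])$")

def arm_positions(names, positions):
    """Return the 7 arm joint positions ordered joint1..joint7."""
    by_index = {}
    for name, pos in zip(names, positions):
        match = _ARM_JOINT_RE.match(name)
        if match:
            by_index[int(match.group(1))] = float(pos)
    missing = [i for i in range(1, 8) if i not in by_index]
    if missing:
        raise ValueError(f"missing arm joints: {missing}")
    return [by_index[i] for i in range(1, 8)]
```

Ordering by index also guards against messages that list joints in a different order.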

## The "always visualize" rule

Before reporting a step as done, **save a PNG or MP4** that shows:

- masks/keypoints overlaid on the image,
- predicted vs GT trajectory,
- IK result vs recorded poses,
- whatever you can render that proves correctness visually.

Numbers lie. Images don't.

See [`docs/always_visualize.md`](docs/always_visualize.md) for the full
rationale and concrete patterns.
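
As one illustration of the rule, here is a minimal sketch that overlays predicted and GT pixel trajectories on a frame and writes a PNG. It assumes matplotlib is available; the function name and plot styling are hypothetical and are not taken from `docs/always_visualize.md`.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend: write files, no display needed
import matplotlib.pyplot as plt

def save_trajectory_overlay(image, pred_uv, gt_uv, out_path):
    """Overlay predicted and GT (u, v) trajectories on an image and save a PNG."""
    pred_uv, gt_uv = np.asarray(pred_uv), np.asarray(gt_uv)
    fig, ax = plt.subplots(figsize=(6, 6))
    ax.imshow(image)
    ax.plot(gt_uv[:, 0], gt_uv[:, 1], "o-", color="lime", label="GT")
    ax.plot(pred_uv[:, 0], pred_uv[:, 1], "x--", color="red", label="pred")
    ax.legend()
    ax.set_axis_off()
    fig.savefig(out_path, dpi=150, bbox_inches="tight")
    plt.close(fig)  # free the figure so long loops do not leak memory
    return out_path
```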

## Coordinate conventions (reference)

- **MuJoCo camera convention:** x-right, y-up, z-back (i.e. camera looks
  down `-z`). The `cam_xmat` columns are these axes in world frame.
- **OpenCV camera convention:** x-right, y-down, z-forward (camera looks
  down `+z`). Used by `cv2.solvePnP`, `cv2.calibrateCamera`,
  `cv2.calibrateHandEye`.
- **Convert MuJoCo → OpenCV:** `R_cv = diag([1, -1, -1]) @ R_mj.T` and
  `t_cv = -R_cv @ cam_pos_world`. See `hand_eye_calib/calibrate.py` for the
  reference implementation (`F` matrix, `T_cam_world_cv`).
- **Quaternion order:** quaternions on `/joint_states` and on MuJoCo
  bodies are `[w, x, y, z]`. SciPy's `Rot.from_quat` expects `[x, y, z, w]`
  by default, so reorder before calling it.
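
The MuJoCo → OpenCV conversion and the quaternion reorder from the list above can be sketched in a few lines of NumPy. The names (`FLIP_YZ`, `mujoco_cam_to_opencv_extrinsics`, `wxyz_to_xyzw`) are illustrative and not taken from `hand_eye_calib/calibrate.py`; the sketch assumes `R_mj` is the `cam_xmat` rotation (a flat 9-vector or a 3x3 array) and `cam_pos_world` is the camera position in world frame.

```python
import numpy as np

# Flips y and z: MuJoCo camera axes (y-up, z-back) -> OpenCV (y-down, z-forward).
FLIP_YZ = np.diag([1.0, -1.0, -1.0])

def mujoco_cam_to_opencv_extrinsics(R_mj, cam_pos_world):
    """Build a 4x4 world -> OpenCV-camera transform from a MuJoCo camera pose."""
    R_mj = np.asarray(R_mj, dtype=float).reshape(3, 3)
    R_cv = FLIP_YZ @ R_mj.T                      # world -> camera rotation
    t_cv = -R_cv @ np.asarray(cam_pos_world, dtype=float)
    T = np.eye(4)
    T[:3, :3] = R_cv
    T[:3, 3] = t_cv
    return T

def wxyz_to_xyzw(q):
    """Reorder a scalar-first quaternion to SciPy's default scalar-last order."""
    q = np.asarray(q, dtype=float)
    return q[..., [1, 2, 3, 0]]
```

A quick sanity check: a camera at `(0, 0, 1)` with identity `R_mj` looks straight down world `-z`, so the world origin should land at `(0, 0, 1)` in the OpenCV camera frame, one unit in front of the lens.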

## Memory + state for agents

- Cameron's persistent memory lives in `/home/cameronsmith/.claude/projects/-data-cameron-para/memory/`. Relevant feedback for *this* repo:
  - always start the camera server yourself;
  - always generate masks + pre-cache 448 + add to data viewer when uploading panda data;
  - always visually inspect outputs before reporting.
- The fleet shares an `agents_stuff/` dir with inboxes/outboxes — see `/data/cameron/agents_stuff/shared/GUIDELINES.md`.

## Don'ts

- Don't push large checkpoints, wandb dirs, or recorded datasets into this
  repo. Use `/data/cameron/panda_data/` for data and
  `panda_streaming/checkpoints/` for trained checkpoints (gitignored).
- Don't move files around without updating the relative-path imports in
  `train_panda_para.py` (`../para/model.py`) and the ExoConfigs.
- Don't overwrite `T_CAM_WORLD` / `CAM_K` in `data_panda_para.py` with a
  one-off uncalibrated value. If you re-calibrate, save the result to a
  JSON in `panda_streaming/checkpoints/` and have the dataset load it.
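
A minimal sketch of that save/load round trip for a re-calibration result. The JSON keys, function names, and shape checks are assumptions; nothing here reflects an existing file format in the repo.

```python
import json
from pathlib import Path

import numpy as np

def save_calibration(path, T_cam_world, cam_K):
    """Write the 4x4 extrinsics and 3x3 intrinsics to a JSON file."""
    payload = {
        "T_cam_world": np.asarray(T_cam_world, dtype=float).tolist(),
        "cam_K": np.asarray(cam_K, dtype=float).tolist(),
    }
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload, indent=2))

def load_calibration(path):
    """Read the calibration back and validate the matrix shapes."""
    payload = json.loads(Path(path).read_text())
    T_cam_world = np.array(payload["T_cam_world"], dtype=float)
    cam_K = np.array(payload["cam_K"], dtype=float)
    if T_cam_world.shape != (4, 4) or cam_K.shape != (3, 3):
        raise ValueError("unexpected calibration matrix shapes")
    return T_cam_world, cam_K
```

Loading from a file rather than editing constants keeps each calibration traceable to the run that produced it.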