# Agent Fleet Guidelines

These guidelines apply to every agent in the fleet. Read this once on startup
and re-read whenever you're nudged to.

## Communication Protocol

1. **Inbox** — `/data/cameron/agents_stuff/agents/<your_name>/inbox.md`. Check
   this when prompted. Tasks land here from `manager` or `mobile_chat`.
2. **Outbox** — `/data/cameron/agents_stuff/agents/<your_name>/outbox.md`.
   Write summaries / results here for other agents to pick up.
3. **Status** — `/data/cameron/agents_stuff/agents/<your_name>/status.md`. One
   of: `idle`, `working`, `done`, `blocked`. Keep it current.
4. **Reports** — for rich reports with media, use the Python helper described
   below. Drop reports under `/data/cameron/para/.agents/reports/<your_name>/`.

```python
import sys; sys.path.insert(0, "/data/cameron/agents_stuff")
from shared.report import Report   # if/when report.py is ported over

r = Report("Your Report Title", agent="your_agent_name")
r.text("Description of findings")
r.table(headers=["Col1", "Col2"], rows=[["val1", "val2"]])
r.video(r.add_media_file("/path/to/video.mp4"), caption="Description")
r.image(r.add_media_file("/path/to/plot.png"), caption="Description")
r.save()
```

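The report helper (`report.py`) currently lives at
`/data/cameron/para/.agents/shared/report.py`; it has not been moved yet.
Importing it from there still works: until it is ported, put
`/data/cameron/para/.agents` on `sys.path` in place of
`/data/cameron/agents_stuff` so that `from shared.report import Report` resolves.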
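
As a minimal sketch of keeping `status.md` current, the helper below validates the value before writing it. It assumes the file holds only the single status word, which these guidelines do not specify, and `write_status` is a hypothetical name rather than something that exists under `shared/`.

```python
from pathlib import Path

# The four values the guidelines allow in status.md.
VALID_STATUSES = ("idle", "working", "done", "blocked")


def write_status(agent_dir: Path, status: str) -> None:
    """Write `status` to <agent_dir>/status.md, rejecting unknown values."""
    if status not in VALID_STATUSES:
        # Fail loud rather than writing a value other agents cannot interpret.
        raise ValueError(f"status must be one of {VALID_STATUSES}, got {status!r}")
    # Assumption: the file contains just the status word and a newline.
    (Path(agent_dir) / "status.md").write_text(status + "\n")
```

A call would look like `write_status(Path("/data/cameron/agents_stuff/agents/vid_model"), "working")`.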

## Report Structure

All reports must follow `shared/REPORT_FORMAT.md`. Every report needs:

1. **Summary** — 2-3 sentence takeaway
2. **Training Data** — dataset details + sample images/videos
3. **Test Setup** — what changed vs training, sample test frames
4. **Results** — summary table + eval videos (successes AND failures)
5. **Analysis** — failure modes, surprising findings
6. **Next Steps & Concerns** — specific, actionable
7. **Reproducibility** — exact command to reproduce
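
The seven-section requirement can be checked mechanically before a report is dropped. This is a sketch under the assumption that each section appears as a markdown heading with exactly the title listed above; `shared/REPORT_FORMAT.md` remains the authority, and `missing_sections` is a hypothetical helper.

```python
# Section titles copied from the list above.
REQUIRED_SECTIONS = (
    "Summary",
    "Training Data",
    "Test Setup",
    "Results",
    "Analysis",
    "Next Steps & Concerns",
    "Reproducibility",
)


def missing_sections(report_text: str) -> list[str]:
    """Return required section names that never appear as a markdown heading."""
    headings = {
        line.lstrip("#").strip()
        for line in report_text.splitlines()
        if line.startswith("#")
    }
    return [name for name in REQUIRED_SECTIONS if name not in headings]
```

An empty list means every required heading is present; it says nothing about whether the content under each heading is adequate.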

## Experiment Conventions

- Always log runs to wandb with a descriptive `--run_name`
- Save checkpoints periodically
- Record the exact command used to launch any experiment
- Include the wandb run URL when reporting results
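
One way to record the exact launch command is to rebuild it from `sys.argv` at startup. The sketch below uses only the standard library; the `command.txt` filename and both function names are assumptions for illustration, and environment variables set on the command line are not captured.

```python
import shlex
import sys
from pathlib import Path


def launch_command(argv: list[str]) -> str:
    """Return a shell-quoted command line that reproduces this invocation."""
    # shlex.join quotes arguments containing spaces so the line can be pasted back.
    return shlex.join([sys.executable, *argv])


def record_launch_command(run_dir: Path, argv: list[str]) -> Path:
    """Write the launch command to <run_dir>/command.txt and return the path."""
    out = Path(run_dir) / "command.txt"  # hypothetical filename
    out.write_text(launch_command(argv) + "\n")
    return out
```

A training script would call `record_launch_command(run_dir, sys.argv)` once, before the training loop starts.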

## Direct Agent-to-Agent Communication

Every agent runs in a named tmux window in the `agents` session. Window names
match agent names exactly.

### Send a message to another agent

```bash
tmux send-keys -t agents:vid_model "Your message here" Enter
sleep 1
tmux send-keys -t agents:vid_model Enter
```

### Read what another agent is doing

```bash
tmux capture-pane -t agents:vid_model -p -S -50
```

### Check if an agent is idle (at the prompt)

```bash
tmux capture-pane -t agents:vid_model -p -S -3 | grep "❯"
```

### Or use the Python helper

```python
import sys; sys.path.insert(0, "/data/cameron/agents_stuff")
from shared.comms import send_task, capture_pane, get_status

send_task("vid_model", "What's the current training loss?")
output = capture_pane("vid_model", lines=30)
status = get_status("vid_model")
```

**Important:** For long messages, always follow `tmux send-keys` with `sleep 1`
and an extra `Enter`, because Claude Code may treat them as pastes that need
confirmation. Confirm the agent is idle (at `❯`) before sending.
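
The idle check and the double-`Enter` send can be combined into one guarded call. The sketch below is not the implementation of `shared.comms`; it only wraps the tmux commands shown above with `subprocess`, and every function name in it is hypothetical.

```python
import subprocess
import time

PROMPT = "❯"  # prompt character the idle check greps for


def looks_idle(pane_text: str) -> bool:
    """True if the prompt character appears anywhere in the captured text."""
    return any(PROMPT in line for line in pane_text.splitlines())


def send_keys_commands(agent: str, message: str) -> list[list[str]]:
    """Build the two tmux invocations: the message, then the extra Enter."""
    target = f"agents:{agent}"
    return [
        ["tmux", "send-keys", "-t", target, message, "Enter"],
        ["tmux", "send-keys", "-t", target, "Enter"],
    ]


def send_when_idle(agent: str, message: str) -> None:
    """Send `message` only if the agent's pane shows the prompt."""
    pane = subprocess.run(
        ["tmux", "capture-pane", "-t", f"agents:{agent}", "-p", "-S", "-3"],
        check=True, capture_output=True, text=True,
    ).stdout
    if not looks_idle(pane):
        raise RuntimeError(f"{agent} is not at the prompt; not sending")
    first, second = send_keys_commands(agent, message)
    subprocess.run(first, check=True)
    time.sleep(1)  # give a paste-confirmation prompt time to appear
    subprocess.run(second, check=True)
```

Raising when the agent is busy keeps the caller from silently interleaving text with a running task; the caller can fall back to `inbox.md` instead.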

### When to use which channel

- **tmux send-keys** — interactive, expects a quick response
- **inbox.md** — async; agent will see it when they next check
- **outbox.md** — file-based result handoff between agents

## Agent Roster

| Agent | Window | Role |
|-------|--------|------|
| manager | `manager` | Orchestrator — coordinates fleet, infra |
| project_highlevel | `project_highlevel` | Project strategy, paper narrative |
| life_manager | `life_manager` | Personal advisor & life KB |
| backbones | `backbones` | OOD generalization experiments |
| vid_model | `vid_model` | Video model + PARA wrapper |
| droid | `droid` | DROID dataset + pretraining |
| panda | `panda` | Real Panda robot (remote SSH) |
| 567_project | `567_project` | Course project — augmentation/viewpoint |
| paper_writer | `paper_writer` | LaTeX paper |
| figure_maker | `figure_maker` | Figures, plots, video assets |
| data_visualizer | `data_visualizer` | Dataset viewer tooling |
| website_builder | `website_builder` | PARA project website |
| mobile_chat | `mobile_chat` | Mobile chat (omidlab.net/chat) |
| mac | `mac` | Mac-side robotics — MuJoCo/URDF, dataset collection, local inference (operates via SSHFS mount + ssh exec) |
| yams | `yams` | TRI workstation specialist — YAM robot, DGX compute, on-site data pipeline (school→mac→TRI ssh chain). Replaces panda. |
| our_wandb | `our_wandb` | File-system-driven mock wandb viewer over disk-cached run dirs (image gallery + smoothed loss curves + model-name grouping). |
| glasses | `glasses` | Even Realities G2 smart-glasses developer app — wearable hub for the fleet (status / chat / updates from the HUD). |

Service windows (no claude): `dashboard`.

## Vault Protocol (added 2026-06-02)

There is now a curated **vault** at `/data/cameron/vault/` that holds Cameron's
durable organizational layer over the work: per-project overview / tasks /
memory markdown files, plus INDEX.md files pointing at canonical code in lab
paths. **Code stays where it is; the vault holds pointers and conventions.**

See `/data/cameron/vault/README.md` for the structure.

### Your responsibilities as an agent

1. **On bootstrap**: in addition to your `ROLE.md`, read your slice of the vault.
   - Per-agent slice: `/data/cameron/vault/fleet/agents/<your_name>/`, which
     holds `overview.md`, `tasks.md`, and `memory.md` (create these if they
     don't exist yet, using the README's file-type table as the template).
   - Project slices you actively work on: read the relevant `overview.md` +
     `memory.md` in `/data/cameron/vault/para/...` or wherever your scope is.
   - Cross-cutting: `/data/cameron/vault/memory.md` (small, applies everywhere).

2. **On non-trivial task completion**: update the vault before declaring done.
   - **Update `overview.md`** of the affected slice if the current state
     changed (new SOTA, headline metric shifted, blocker resolved).
   - **Update `tasks.md`** of the affected slice — mark the task done, add
     any follow-ups it surfaced.
   - **Append to `memory.md`** if you learned anything non-obvious —
     conventions, gotchas, workarounds, "we tried X and it failed because Y".

3. **When you encounter script proliferation** (multiple `train_v1.py`,
   `train_v2.py`, etc. with no clearly marked canonical version), update the
   relevant `INDEX.md` to mark what's canonical and what's deprecated. Move dead
   variants to an `archive/` subdir.

4. **Reference vault files by path** when responding to Cameron or other
   agents. "Per `vault/para/memory.md`, rotation discretization uses 1D PCA
   when PC1 EV ≥ 0.85" beats remembering a fact from compacted history.
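
Appending to `memory.md` on task completion can be a one-line call. The sketch below assumes a dated bullet per entry, which is a guess at the format; the vault README's file-type table is the authority, and `append_memory` is a hypothetical helper.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_memory(slice_dir: Path, note: str, today: Optional[date] = None) -> None:
    """Append a dated bullet to <slice_dir>/memory.md, creating it if needed."""
    stamp = (today or date.today()).isoformat()
    # Mode "a" only ever adds to the end, so existing entries are untouched.
    with open(Path(slice_dir) / "memory.md", "a", encoding="utf-8") as f:
        f.write(f"- {stamp}: {note}\n")
```

For example, `append_memory(Path("/data/cameron/vault/para"), "we tried X and it failed because Y")` adds one bullet to that slice.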

### Manager's responsibility

When dispatching a task, the manager should reference the relevant vault
slice in the dispatch (e.g., "see `vault/para/model/overview.md` for current
state"). Cameron has signed off on this enforcement convention.

## Code Hygiene (added 2026-06-02)

Cameron actively reviews code in `/data/cameron/para/` for conciseness and
clarity. **Default to less code, more clarity.** Apply these rules to anything
you write or edit under that tree:

- **File length**: target ≤ 300 lines per file. 500+ lines is a smell — split
  into modules. (The current 2200-line `deploy_yam_2view.py` is being
  refactored *because* of this rule.)
- **One responsibility per file**. A "deploy" script that also defines a
  renderer, a recalibrator, AND a viz layer is three files pretending to be
  one. Pull each concern into its own module under `<project>/lib/`.
- **No copy-paste between scripts**. If `record.py` and `deploy.py` both
  build the exo+arm mujoco renderer, the renderer is a shared module, not
  duplicated code.
- **Thin launchers**: top-level scripts (`record.py`, `train.py`,
  `deploy.py`) handle CLI + main flow only. Real logic lives in
  `lib/<module>.py`.
- **Delete dead code aggressively**. Disabled branches, removed flags, and
  `# kept for reference` comments belong in git history, not in the file.
  If a snippet is worth keeping, pin it inline in the vault `memory.md`.
- **Snippets are runnable**. Reference examples live at
  `<project>/snippets/` as ≤50-line files that actually execute. Markdown
  embeds in `vault/` should be 5-20 lines and point to a canonical
  implementation by `file:line` so they don't rot.
- **Refactor freely inside the PR**. If a refactor adds clarity without
  changing behavior, do it as part of the same change — Cameron prefers
  cleaner downstream over preserving "this is how it was".
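
The file-length rule is easy to audit automatically. The sketch below walks a tree and flags Python files against the 300-line target and the 500-line smell threshold; the labels and function names are illustrative assumptions, not an existing fleet tool.

```python
from pathlib import Path

TARGET_LINES = 300  # target ceiling from the file-length rule
SMELL_LINES = 500   # at or above this, the rule calls for a split


def classify_length(n_lines: int) -> str:
    """Label a line count as 'ok', 'over-target', or 'split'."""
    if n_lines >= SMELL_LINES:
        return "split"
    if n_lines > TARGET_LINES:
        return "over-target"
    return "ok"


def audit_tree(root: Path) -> dict[str, str]:
    """Map each .py file under root that is not 'ok' to its label."""
    report = {}
    for path in sorted(Path(root).rglob("*.py")):
        with open(path, encoding="utf-8") as f:
            n_lines = sum(1 for _ in f)
        label = classify_length(n_lines)
        if label != "ok":
            report[str(path)] = label
    return report
```

Running `audit_tree(Path("/data/cameron/para"))` before a change gives a quick list of files that are candidates for splitting.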

### Factorized scripts — composable, reusable, small

The YAM stack (and similar robotics projects) should be organized as
many small reusable scripts/modules that compose. A launcher imports
from `lib/`; `lib/` modules each own one concern; **visualizations are
shared between training and inference** so wandb panels and rerun
panels are produced by the same code.

Canonical YAM layout (`/data/cameron/para/robot/yam/`):

```
lib/
├── robot.py            ← class Robot wrapping chiral + raiden controller
├── smooth_move.py      ← smooth_move(robot, target, vel) → block until reached
├── calibration.py      ← exo aruco solve, T_lfr load, scene-cam K
├── render.py           ← mujoco exo+arm render + principal-point warp
└── viz/
    ├── feature_pca.py   ← DINOv3 patch PCA → RGB panel
    ├── heatmap.py       ← per-T marginal YX heatmap + abstain bar
    ├── keypoints.py     ← 30-rainbow keypoints + connecting line
    ├── overlay.py       ← compose mujoco render onto live image (mask blend)
    └── trajectory.py    ← project scene-trajectory into wrist image

calibrate.py             ← launcher: rerun + calibration.solve + viz.overlay
record.py                ← launcher: teleop + record + viz panels
train.py                 ← launcher: training loop, calls viz/* for wandb
deploy.py                ← launcher: keys y/q/r/h/c, fusion, viz panels
convert.py               ← launcher: SVO2 → PNG + PKL
bin/                     ← shell wrappers (yam-mount, yam-rd-serve, …)
```
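
As a rough sketch of what a thin launcher in this layout might look like, the example below keeps only CLI parsing and one delegating call at the top level. `run_deploy` is a stand-in for logic that would live in `lib/`, and the `--ckpt` and `--dry-run` flags are hypothetical, not the real `deploy.py` interface.

```python
import argparse


def run_deploy(ckpt: str, dry_run: bool) -> str:
    """Stand-in for real logic in lib/; returns a summary string."""
    return f"deploy ckpt={ckpt} dry_run={dry_run}"


def parse_args(argv=None) -> argparse.Namespace:
    """CLI definition: the only thing the launcher owns besides main flow."""
    parser = argparse.ArgumentParser(description="Thin launcher sketch")
    parser.add_argument("--ckpt", required=True, help="checkpoint path")
    parser.add_argument("--dry-run", action="store_true")
    return parser.parse_args(argv)


def main(argv=None) -> str:
    """Parse arguments, then delegate everything else."""
    args = parse_args(argv)
    return run_deploy(args.ckpt, args.dry_run)
```

A real launcher would end with the usual `if __name__ == "__main__":` guard calling `main()`, and `run_deploy` would be imported from a `lib/` module instead of defined inline.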

Rules of thumb:
- One concern per file. If a script does "calibration + viz + rendering",
  it's three files.
- Shared viz is non-negotiable: train and deploy import the **same**
  `viz/heatmap.py` to draw the heatmap panel. No copies.
- A launcher should fit on one screen of logic above its `if __name__ ==
  "__main__":` block — most code is delegation to `lib/`.

### Lab is for scripts, not large files

Tailscale transfers between the lab server and puget/russet go through the
SFO DERP relay (there is no direct LAN route between Cameron's school server
and the TRI subnet). Throughput tops out at roughly **25 MB/min combined**.

**Rule**: `/data/cameron/` (the lab side) holds code, configs, vault
docs, kmeans centroids, mesh files — anything you'd `git diff`. **Large
binaries stay on puget** and the code on lab references them by their
**puget-local path**.

| Type | Lives on | Path example |
|---|---|---|
| Source code | lab | `~/lab/para/robot/yam/deploy.py` |
| Vault / docs | lab | `~/lab/vault/para/...` |
| MuJoCo meshes (small, code-like) | lab | `~/lab/para/robot/yam/raiden_fork/third_party/exo_redo/robot_models/` |
| Pretrained backbone weights | **puget** | `~/yam_para/dinov3/weights/dinov3_vits16plus_*.pth` |
| Training checkpoints | **puget** | `~/yam_para/checkpoints/<run>/latest.pth` |
| Training PKLs / PNG frames | **puget** | `~/yam_para/data/place_mug_wrist_cam/` |
| Eval video MP4s | **puget** | `~/yam_para/eval_videos/` |
| TRI-stereo / exo baselines / raiden weights | **puget** | `~/yam_para/raiden_fork/weights/`, `.../baselines_scripts_and_data/` |

Launch commands reference puget-local paths via env vars:

```bash
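# A wildcard in a VAR=... assignment is not expanded by the shell, so resolve it
# with echo first (assumes exactly one file matches).
DINO_WEIGHTS_PATH=$(echo $HOME/yam_para/dinov3/weights/dinov3_vits16plus_*.pth) \
python ~/lab/para/robot/yam/train.py --ckpt $HOME/yam_para/checkpoints/run_v3/latest.pth ...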
```

If you find yourself rsyncing > 100 MB to lab, **stop and reconsider** —
it's almost certainly a binary that should live on puget.
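
The arithmetic behind the threshold: at roughly 25 MB/min, a 100 MB transfer takes about 4 minutes, and a multi-GB checkpoint takes hours. The sketch below turns the rule into a pre-transfer check on a directory; the function names are hypothetical and it uses decimal megabytes (1 MB = 10^6 bytes), which is an assumption.

```python
from pathlib import Path

RELAY_MB_PER_MIN = 25   # approximate relay throughput quoted above
MAX_LAB_SYNC_MB = 100   # threshold from the rule above


def tree_size_mb(root: Path) -> float:
    """Total size of all files under a directory, in decimal megabytes."""
    total = sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())
    return total / 1e6


def estimated_minutes(size_mb: float) -> float:
    """Rough transfer time over the relay at the quoted throughput."""
    return size_mb / RELAY_MB_PER_MIN


def check_sync_size(root: Path) -> float:
    """Return the size in MB, or raise if it exceeds the lab threshold."""
    size_mb = tree_size_mb(root)
    if size_mb > MAX_LAB_SYNC_MB:
        raise RuntimeError(
            f"{root} is {size_mb:.0f} MB (~{estimated_minutes(size_mb):.0f} min "
            "over the relay); it probably belongs on puget"
        )
    return size_mb
```

Calling `check_sync_size` on the source directory before an rsync makes the "stop and reconsider" step explicit instead of relying on memory.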

### Fail loud — no defensive try/except

If something we can't verify goes wrong — camera pose solve fails,
extrinsics file is missing, the wrist obs isn't in the dict — **let
the exception propagate**. Do not:

- Fall back to identity matrices, zeros, or default values
- Load a "backup" JSON when the live one fails
- Wrap in try/except to print a warning and continue
- Use `dict.get(key, default)` for required keys

A bad value silently flowing through the pipeline is worse than a
crash. A crash points at the broken thing; a fallback masks it and
manifests as a confusing downstream symptom hours later (e.g.
"workspace mask collapsed to 30k voxels" — that's the fallback path
biting back).

Acceptable try/except cases (small list):
- `KeyboardInterrupt` to print a clean message before re-raising
- I/O at boundaries (writing logs, parsing user input)
- Documented integration points where a third party intentionally
  signals via exception (rare)

If you find yourself writing a try/except, the question to ask is:
"would the system be wrong-but-running if I swallowed this?" If yes,
delete the try/except and let it crash. If no (cleanup only, exception
re-raised), it's fine.
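
A short illustration of both sides of the rule, as a sketch: a loader that lets missing files and missing keys raise at the source, and the one acceptable `KeyboardInterrupt` pattern. The JSON layout and the `T_cam` key are hypothetical names chosen for the example.

```python
import json
from pathlib import Path


def load_extrinsics(path: Path) -> list:
    """Load a required camera extrinsics matrix; any problem raises."""
    # No try/except: a missing file raises FileNotFoundError right here.
    data = json.loads(Path(path).read_text())
    # Plain indexing instead of data.get("T_cam", identity): a missing key
    # raises KeyError instead of silently returning a default.
    return data["T_cam"]


def run_with_clean_interrupt(step) -> None:
    """Acceptable case: print a clean message on Ctrl-C, then re-raise."""
    try:
        step()
    except KeyboardInterrupt:
        print("interrupted by user, shutting down")
        raise  # the exception still propagates; nothing is swallowed
```

In both functions the system can never end up wrong-but-running: the loader either returns the real value or crashes, and the interrupt handler only adds a message before the exception continues upward.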

## Code Changes

- Don't modify files outside your sub-project scope without coordination
- Test before committing
- The fleet config and roles live in a git repo at `/data/cameron/agents_stuff`
  — when you change a ROLE.md or GUIDELINES.md, commit and push so it
  survives a reboot.