# Cameron's agent fleet

Source-of-truth git repo for Cameron's ~13-agent Claude Code fleet running on
the GVL lab server. Clone this repo on any Linux box, run `./bootstrap.sh`,
and the entire fleet comes back up in tmux — each agent re-reads its role and
the latest scrollback log so it can resume mid-task.

## Layout

```
agents_stuff/
├── README.md            ← this file
├── RECOVERY.md          ← run-this-when-everything-explodes
├── config.yaml          ← roster: which agents exist, where they cd, what role file
├── bootstrap.sh         ← creates tmux session + launches every agent
├── shared/
│   ├── GUIDELINES.md    ← fleet-wide protocol (every agent reads this)
│   ├── REPORT_FORMAT.md ← experiment report template
│   ├── comms.py         ← Python helper for nudging agents from code
│   ├── machines.yaml    ← remote machine config (panda, mac, etc.)
│   └── save_scrollback.sh  ← daily cron job: capture panes → logs/ → push
├── agents/
│   └── <name>/ROLE.md   ← per-agent identity + responsibilities
├── services/
│   └── dashboard/README.md  ← how to run the Flask + cloudflared tunnel
├── logs/
│   └── <name>_latest.log    ← last scrollback snapshot per agent (committed daily)
├── .env.example         ← secrets template (copy to .env, gitignored)
└── .gitignore
```

## Quick start

### First time on this machine

```bash
git clone https://github.com/cameronosmith/agents_stuff.git /data/cameron/agents_stuff
cd /data/cameron/agents_stuff
cp .env.example .env && chmod 600 .env  # then fill in real values
./bootstrap.sh
```

### Coming back after tmux died

```bash
cd /data/cameron/agents_stuff
git pull           # in case role files changed
./bootstrap.sh     # recreates the session, agents re-read their state
```

### After a server reboot

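Same steps as after tmux died. The Claude session JSONL files in
`~/.claude/projects/` are local-only, so bootstrap does not depend on them:
it onboards the agents from `ROLE.md` plus the latest scrollback log committed
in this repo. Context therefore survives even if the server's disk is gone;
in that case, start from the first-time steps on the replacement machine.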

## Daily backup

Add this to crontab so each agent's scrollback is committed once a day:

```bash
crontab -e
# Then add (a cron entry must stay on one line; cron does not support "\" continuation):
0 0 * * * bash /data/cameron/agents_stuff/shared/save_scrollback.sh >> /data/cameron/agents_stuff/logs/cron.log 2>&1
```

The script:

- captures the last 5000 lines from each agent's tmux pane
- overwrites `logs/<agent>_latest.log` (single file per agent — no commit bloat)
- commits and pushes to origin if anything changed

To back up more often, change the cron schedule; for example, `0 */3 * * *`
runs every three hours. Hourly is fine: the script commits only when a log
actually changed, so most runs are no-ops.
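
For readers who want to see the shape of that capture-and-commit logic, here is a minimal Python sketch. It is not the contents of `save_scrollback.sh`; the session name `fleet`, the function names, and the commit message are all assumptions made for illustration.

```python
import subprocess
from pathlib import Path

SESSION = "fleet"      # hypothetical tmux session name
HISTORY_LINES = 5000   # how far back in the pane history to capture


def capture_command(agent: str, session: str = SESSION,
                    lines: int = HISTORY_LINES) -> list[str]:
    """Build the tmux argv that prints a pane's scrollback to stdout."""
    # -p prints to stdout; -S -N starts N lines back in the pane history.
    return ["tmux", "capture-pane", "-p", "-S", f"-{lines}",
            "-t", f"{session}:{agent}"]


def write_if_changed(path: Path, text: str) -> bool:
    """Overwrite `path` with `text`; return True only if the content changed."""
    if path.exists() and path.read_text() == text:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)
    return True


def snapshot(agents: list[str], repo: Path, session: str = SESSION) -> list[str]:
    """Capture every agent's pane, then commit and push only the changed logs."""
    changed = []
    for agent in agents:
        pane = subprocess.run(capture_command(agent, session),
                              capture_output=True, text=True, check=True).stdout
        rel = f"logs/{agent}_latest.log"   # one file per agent, overwritten
        if write_if_changed(repo / rel, pane):
            changed.append(rel)
    if changed:                            # unchanged runs are no-ops
        git = ["git", "-C", str(repo)]
        subprocess.run(git + ["add", "--"] + changed, check=True)
        subprocess.run(git + ["commit", "-m", "Daily scrollback snapshot"], check=True)
        subprocess.run(git + ["push", "origin", "HEAD"], check=True)
    return changed
```

Comparing the new capture against the file on disk before writing is what keeps frequent schedules cheap: when no pane has changed, the run never reaches `git`.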

## How bootstrap.sh onboards an agent

For each agent in `config.yaml`:

1. Creates a tmux window named after the agent
2. `cd`s to the agent's configured working directory
3. Launches `claude --dangerously-skip-permissions`
4. Sends an onboarding prompt that says: *"You are agent X. Read your ROLE,
   read GUIDELINES.md and REPORT_FORMAT.md, read your latest scrollback log
   (if it exists), give a one-line status, then wait for instructions."*

The onboarding prompt makes recovery automatic — the agent picks up from
where it left off, no human re-typing required.
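
The four steps above can be sketched as a sequence of tmux commands. This is an illustrative Python sketch, not the actual `bootstrap.sh`: the session name `fleet`, the function names, and the exact prompt wording are assumptions, and the prompt text only paraphrases the one described above.

```python
import subprocess

SESSION = "fleet"  # hypothetical tmux session name


def onboarding_prompt(name: str, repo: str) -> str:
    """Paraphrase of the onboarding prompt; the real wording may differ."""
    return (
        f"You are agent {name}. Read {repo}/agents/{name}/ROLE.md, "
        f"{repo}/shared/GUIDELINES.md and {repo}/shared/REPORT_FORMAT.md, "
        f"then {repo}/logs/{name}_latest.log if it exists. "
        "Give a one-line status, then wait for instructions."
    )


def onboard_commands(name: str, workdir: str, repo: str,
                     session: str = SESSION) -> list[list[str]]:
    """Return the tmux argv lists that onboard one agent, in order."""
    target = f"{session}:{name}"
    return [
        # Steps 1 and 2: new window named after the agent, started in its workdir.
        ["tmux", "new-window", "-t", f"{session}:", "-n", name, "-c", workdir],
        # Step 3: launch the Claude Code CLI in that window.
        ["tmux", "send-keys", "-t", target,
         "claude --dangerously-skip-permissions", "Enter"],
        # Step 4: type the prompt literally (-l), then press Enter.
        ["tmux", "send-keys", "-t", target, "-l", onboarding_prompt(name, repo)],
        ["tmux", "send-keys", "-t", target, "Enter"],
    ]


def onboard(name: str, workdir: str, repo: str, session: str = SESSION) -> None:
    """Run the commands. A real script would likely pause after step 3
    until the CLI is ready to accept input."""
    for cmd in onboard_commands(name, workdir, repo, session):
        subprocess.run(cmd, check=True)
```

Building the command list separately from running it makes the sequence easy to inspect or print as a dry run before touching a live session.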

## Service windows

Service windows (just `dashboard` for now) don't run claude; they host
long-running processes (Flask + cloudflared). `bootstrap.sh` creates the
window and prints the command to run as a banner; start it manually so
secrets aren't auto-injected. See `services/dashboard/README.md`.

## Adding a new agent

1. Add an entry to `config.yaml` under `agents:`
2. Create `agents/<name>/ROLE.md`
3. Commit + push
4. Run `./bootstrap.sh --reset` (or `--only=<name>` to only create the new window)
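
Before committing, it can help to confirm that step 2 was not skipped for any agent. The helper below is a hypothetical pre-flight check, not part of this repo; it takes the agent names as an argument because the `config.yaml` schema is not shown here.

```python
from pathlib import Path


def missing_role_files(repo: Path, agent_names: list[str]) -> list[str]:
    """Return the agents that have no agents/<name>/ROLE.md under `repo`."""
    return [name for name in agent_names
            if not (repo / "agents" / name / "ROLE.md").is_file()]
```

For example, `missing_role_files(Path("/data/cameron/agents_stuff"), ["planner"])` returns `["planner"]` when the hypothetical agent `planner` is on the roster but its role file has not been created yet.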

## See also

- [`RECOVERY.md`](./RECOVERY.md) — full disaster-recovery procedure
- [`shared/GUIDELINES.md`](./shared/GUIDELINES.md) — fleet protocol
- [`shared/REPORT_FORMAT.md`](./shared/REPORT_FORMAT.md) — experiment reports