Loading dataset...
CachedTrajectoryDataset: 64 demos, 2048 samples
Source: {'cache_root': '/data/libero/ood_objpos_v3_splits/exp4_n64_train', 'benchmark': 'libero_spatial', 'task_ids': [0], 'frame_stride': 3}
Total: 2048 samples
✓ Train: 1946 samples
✓ Val: 102 samples
Initializing model (type=act)...
Loading DINOv2 model...
✓ DINOv2 backbone is trainable
✓ CLIP projection: 512 → 384
✓ ACT model: 30,394,048 / 30,394,048 trainable params
Input: CLS(384) + start_kp(2) + eef_pos(3) + gripper(1) = 774
pos_mlp: → (B, 4, 3) [sigmoid, normalized]
rot_mlp: → (B, 4, 3) [sigmoid, normalized]
gripper_mlp: → (B, 4) [sigmoid, normalized]
Trainable parameters: 30,394,048 / 30,394,048 (100.00%)
Computing dataset stats from random subset: 500/2048 samples (seed=42)
/data2/cameron/miniconda3/envs/uva/lib/python3.10/site-packages/wandb/sdk/data_types/image.py:324: DeprecationWarning: 'mode' parameter is deprecated and will be removed in Pillow 13 (2026-10-15)
✓ Saved stats cache: /data/cameron/567_augmentation_viewpoint_project/checkpoints/act_defvp_all50_nokp/dataset_stats.json
✓ Height range from dataset: [0.917341, 1.177162] m
✓ Gripper range from dataset: [-1.000000, 1.000000]
✓ Rotation range (delta rotvec): ['-0.107', '-0.076', '-0.020'] .. ['0.040', '0.249', '0.020']
✓ Reference rotation: ['0.9994', '-0.0011', '-0.0338', '-0.0038']
✓ Position range from dataset: ['-0.408', '-0.366', '0.917'] .. ['0.132', '0.273', '1.177']
Starting training for 1000 epochs...
✓ Rotation loss SKIPPED
✓ EMA loss weights initialized (vol=11.8, rot=3.5, grip=0.69)
self._image = pil_image.fromarray(
Epochs: 0%| | 2/1000 [01:35<13:18:04, 47.98s/it]
============================================================
Epoch 0/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Train Loss: 0.2741 (Volume: 0.0274, Gripper: 0.2467, Rotation: 0.0000)
Val - Loss: 0.2602, Volume: 0.0059, Pixel Error: 13.70px, Height Error: 0.000mm, Gripper: 0.1422
✓ Saved best model (val_loss=0.2602)
============================================================
Epoch 1/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.h264.mp4
Train Loss: 0.0159 (Volume: 0.0079, Gripper: 0.0080, Rotation: 0.0000)
Val - Loss: 0.2245, Volume: 0.0043, Pixel Error: 12.01px, Height Error: 0.000mm, Gripper: 0.1324
✓ Saved best model (val_loss=0.2245)
============================================================
Epoch 2/1000
============================================================
[async eval] launched at step 500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.mp4
Train Loss: 0.0105 (Volume: 0.0052, Gripper: 0.0053, Rotation: 0.0000)
Val - Loss: 0.2209, Volume: 0.0029, Pixel Error: 9.09px, Height Error: 0.000mm, Gripper: 0.1127
✓ Saved best model (val_loss=0.2209)
============================================================
Epoch 3/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
Train Loss: 0.0089 (Volume: 0.0045, Gripper: 0.0045, Rotation: 0.0000)
Val - Loss: 0.1881, Volume: 0.0025, Pixel Error: 8.46px, Height Error: 0.000mm, Gripper: 0.0882
✓ Saved best model (val_loss=0.1881)
============================================================
Epoch 4/1000
============================================================
[async eval] launched at step 1000
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Saved step checkpoint: step_1000.pth
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.mp4
Train Loss: 0.0069 (Volume: 0.0035, Gripper: 0.0034, Rotation: 0.0000)
Val - Loss: 0.1505, Volume: 0.0015, Pixel Error: 6.56px, Height Error: 0.000mm, Gripper: 0.0343
✓ Saved best model (val_loss=0.1505)
============================================================
Epoch 5/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
Train Loss: 0.0067 (Volume: 0.0034, Gripper: 0.0033, Rotation: 0.0000)
Val - Loss: 0.1552, Volume: 0.0020, Pixel Error: 7.13px, Height Error: 0.000mm, Gripper: 0.0539
============================================================
Epoch 6/1000
============================================================
[async eval] launched at step 1500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.h264.mp4
Train Loss: 0.0064 (Volume: 0.0032, Gripper: 0.0032, Rotation: 0.0000)
Val - Loss: 0.1464, Volume: 0.0017, Pixel Error: 7.08px, Height Error: 0.000mm, Gripper: 0.0490
✓ Saved best model (val_loss=0.1464)
============================================================
Epoch 7/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0062 (Volume: 0.0031, Gripper: 0.0031, Rotation: 0.0000)
Val - Loss: 0.1705, Volume: 0.0017, Pixel Error: 6.56px, Height Error: 0.000mm, Gripper: 0.0588
============================================================
Epoch 8/1000
============================================================
[async eval] launched at step 2000
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
Saved step checkpoint: step_2000.pth
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Train Loss: 0.0066 (Volume: 0.0034, Gripper: 0.0033, Rotation: 0.0000)
Val - Loss: 0.1600, Volume: 0.0016, Pixel Error: 6.45px, Height Error: 0.000mm, Gripper: 0.0490
============================================================
Epoch 9/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
Train Loss: 0.0067 (Volume: 0.0034, Gripper: 0.0034, Rotation: 0.0000)
Val - Loss: 0.1802, Volume: 0.0017, Pixel Error: 6.82px, Height Error: 0.000mm, Gripper: 0.0588
============================================================
Epoch 10/1000
============================================================
[async eval] launched at step 2500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
Train Loss: 0.0061 (Volume: 0.0031, Gripper: 0.0030, Rotation: 0.0000)
Val - Loss: 0.1609, Volume: 0.0015, Pixel Error: 6.20px, Height Error: 0.000mm, Gripper: 0.0539
============================================================
Epoch 11/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
Train Loss: 0.0056 (Volume: 0.0028, Gripper: 0.0028, Rotation: 0.0000)
Val - Loss: 0.1754, Volume: 0.0015, Pixel Error: 6.39px, Height Error: 0.000mm, Gripper: 0.0637
============================================================
Epoch 12/1000
============================================================
[async eval] launched at step 3000
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Saved step checkpoint: step_3000.pth
[async eval] logged video to wandb: ep000_success.mp4
Train Loss: 0.0046 (Volume: 0.0023, Gripper: 0.0023, Rotation: 0.0000)
Val - Loss: 0.1690, Volume: 0.0017, Pixel Error: 6.49px, Height Error: 0.000mm, Gripper: 0.0588
============================================================
Epoch 13/1000
============================================================
[async eval] logged video to wandb: ep000_success.h264.mp4
[async eval] logged video to wandb: ep000_success.h264.h264.mp4
[async eval] logged video to wandb: ep000_success.h264.h264.h264.mp4
Train Loss: 0.0040 (Volume: 0.0021, Gripper: 0.0019, Rotation: 0.0000)
Val - Loss: 0.1614, Volume: 0.0015, Pixel Error: 6.27px, Height Error: 0.000mm, Gripper: 0.0539
============================================================
Epoch 14/1000
============================================================
[async eval] launched at step 3500
[async eval] logged video to wandb: ep000_success.h264.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_success.h264.h264.h264.h264.h264.mp4
Train Loss: 0.0051 (Volume: 0.0026, Gripper: 0.0026, Rotation: 0.0000)
Val - Loss: 0.1310, Volume: 0.0016, Pixel Error: 6.62px, Height Error: 0.000mm, Gripper: 0.0392
✓ Saved best model (val_loss=0.1310)
============================================================
Epoch 15/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
Train Loss: 0.0051 (Volume: 0.0026, Gripper: 0.0025, Rotation: 0.0000)
Val - Loss: 0.1189, Volume: 0.0013, Pixel Error: 5.71px, Height Error: 0.000mm, Gripper: 0.0245
✓ Saved best model (val_loss=0.1189)
============================================================
Epoch 16/1000
============================================================
[async eval] launched at step 4000
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Saved step checkpoint: step_4000.pth
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
Train Loss: 0.0048 (Volume: 0.0024, Gripper: 0.0024, Rotation: 0.0000)
Val - Loss: 0.1331, Volume: 0.0013, Pixel Error: 5.78px, Height Error: 0.000mm, Gripper: 0.0294
============================================================
Epoch 17/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0056 (Volume: 0.0029, Gripper: 0.0027, Rotation: 0.0000)
Val - Loss: 0.1404, Volume: 0.0017, Pixel Error: 6.42px, Height Error: 0.000mm, Gripper: 0.0294
============================================================
Epoch 18/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] launched at step 4500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
Train Loss: 0.0061 (Volume: 0.0030, Gripper: 0.0031, Rotation: 0.0000)
Val - Loss: 0.1543, Volume: 0.0017, Pixel Error: 7.60px, Height Error: 0.000mm, Gripper: 0.0539
============================================================
Epoch 19/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0056 (Volume: 0.0029, Gripper: 0.0027, Rotation: 0.0000)
Val - Loss: 0.1318, Volume: 0.0016, Pixel Error: 6.73px, Height Error: 0.000mm, Gripper: 0.0294
============================================================
Epoch 20/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] launched at step 5000
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Saved step checkpoint: step_5000.pth
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
Train Loss: 0.0043 (Volume: 0.0022, Gripper: 0.0021, Rotation: 0.0000)
Val - Loss: 0.1313, Volume: 0.0010, Pixel Error: 5.22px, Height Error: 0.000mm, Gripper: 0.0343
============================================================
Epoch 21/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0036 (Volume: 0.0018, Gripper: 0.0018, Rotation: 0.0000)
Val - Loss: 0.1703, Volume: 0.0013, Pixel Error: 5.20px, Height Error: 0.000mm, Gripper: 0.0490
============================================================
Epoch 22/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] launched at step 5500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
Train Loss: 0.0037 (Volume: 0.0019, Gripper: 0.0018, Rotation: 0.0000)
Val - Loss: 0.1355, Volume: 0.0015, Pixel Error: 6.05px, Height Error: 0.000mm, Gripper: 0.0343
============================================================
Epoch 23/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0037 (Volume: 0.0019, Gripper: 0.0018, Rotation: 0.0000)
Val - Loss: 0.1301, Volume: 0.0011, Pixel Error: 5.15px, Height Error: 0.000mm, Gripper: 0.0441
============================================================
Epoch 24/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] launched at step 6000
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Saved step checkpoint: step_6000.pth
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
Train Loss: 0.0039 (Volume: 0.0020, Gripper: 0.0019, Rotation: 0.0000)
Val - Loss: 0.1392, Volume: 0.0012, Pixel Error: 5.35px, Height Error: 0.000mm, Gripper: 0.0392
============================================================
Epoch 25/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0035 (Volume: 0.0018, Gripper: 0.0018, Rotation: 0.0000)
Val - Loss: 0.1388, Volume: 0.0011, Pixel Error: 5.33px, Height Error: 0.000mm, Gripper: 0.0294
============================================================
Epoch 26/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] launched at step 6500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Train Loss: 0.0036 (Volume: 0.0019, Gripper: 0.0017, Rotation: 0.0000)
Val - Loss: 0.1361, Volume: 0.0013, Pixel Error: 5.33px, Height Error: 0.000mm, Gripper: 0.0441
⏱ Time limit reached (20.1 / 20 min). Stopping.