Loading dataset...
CachedTrajectoryDataset: 64 demos, 2048 samples
Source: {'cache_root': '/data/libero/ood_objpos_v3_splits/exp4_n64_train', 'benchmark': 'libero_spatial', 'task_ids': [0], 'frame_stride': 3}
Total: 2048 samples
✓ Train: 1946 samples
✓ Val: 102 samples
Initializing model (type=act)...
Loading DINOv2 model...
✓ DINOv2 backbone is trainable
✓ CLIP projection: 512 → 384
✓ ACT model: 30,394,048 / 30,394,048 trainable params
Input: CLS(384) + start_kp(2) + eef_pos(3) + gripper(1) = 774
pos_mlp: → (B, 4, 3) [sigmoid, normalized]
rot_mlp: → (B, 4, 3) [sigmoid, normalized]
gripper_mlp: → (B, 4) [sigmoid, normalized]
Trainable parameters: 30,394,048 / 30,394,048 (100.00%)
Computing dataset stats from random subset: 500/2048 samples (seed=42)
/data2/cameron/miniconda3/envs/uva/lib/python3.10/site-packages/wandb/sdk/data_types/image.py:324: DeprecationWarning: 'mode' parameter is deprecated and will be removed in Pillow 13 (2026-10-15)
✓ Saved stats cache: /data/cameron/567_augmentation_viewpoint_project/checkpoints/act_defvp_crop50_nokp/dataset_stats.json
✓ Height range from dataset: [0.917341, 1.177162] m
✓ Gripper range from dataset: [-1.000000, 1.000000]
✓ Rotation range (delta rotvec): ['-0.107', '-0.076', '-0.020'] .. ['0.040', '0.249', '0.020']
✓ Reference rotation: ['0.9994', '-0.0011', '-0.0338', '-0.0038']
✓ Position range from dataset: ['-0.408', '-0.366', '0.917'] .. ['0.132', '0.273', '1.177']
Starting training for 1000 epochs...
✓ Rotation loss SKIPPED
✓ EMA loss weights initialized (vol=11.8, rot=3.5, grip=0.69)
  self._image = pil_image.fromarray(
Epochs: 0%| | 2/1000 [01:21<11:10:22, 40.30s/it]
============================================================
Epoch 0/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.2852 (Volume: 0.0255, Gripper: 0.2597, Rotation: 0.0000)
Val - Loss: 0.2687, Volume: 0.0047, Pixel Error: 12.06px, Height Error: 0.000mm, Gripper: 0.1765
✓ Saved best model (val_loss=0.2687)
============================================================
Epoch 1/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Train Loss: 0.0138 (Volume: 0.0068, Gripper: 0.0070, Rotation: 0.0000)
Val - Loss: 0.2331, Volume: 0.0029, Pixel Error: 9.65px, Height Error: 0.000mm, Gripper: 0.1127
✓ Saved best model (val_loss=0.2331)
============================================================
Epoch 2/1000
============================================================
[async eval] launched at step 500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.h264.h264.mp4
Train Loss: 0.0089 (Volume: 0.0045, Gripper: 0.0044, Rotation: 0.0000)
Val - Loss: 0.2241, Volume: 0.0027, Pixel Error: 9.49px, Height Error: 0.000mm, Gripper: 0.1078
✓ Saved best model (val_loss=0.2241)
============================================================
Epoch 3/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0082 (Volume: 0.0041, Gripper: 0.0041, Rotation: 0.0000)
Val - Loss: 0.1934, Volume: 0.0019, Pixel Error: 7.32px, Height Error: 0.000mm, Gripper: 0.0931
✓ Saved best model (val_loss=0.1934)
============================================================
Epoch 4/1000
============================================================
[async eval] launched at step 1000
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
Saved step checkpoint: step_1000.pth
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.mp4
Train Loss: 0.0070 (Volume: 0.0035, Gripper: 0.0035, Rotation: 0.0000)
Val - Loss: 0.1825, Volume: 0.0025, Pixel Error: 8.59px, Height Error: 0.000mm, Gripper: 0.0882
✓ Saved best model (val_loss=0.1825)
============================================================
Epoch 5/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
Train Loss: 0.0059 (Volume: 0.0030, Gripper: 0.0029, Rotation: 0.0000)
Val - Loss: 0.1762, Volume: 0.0016, Pixel Error: 6.71px, Height Error: 0.000mm, Gripper: 0.0686
✓ Saved best model (val_loss=0.1762)
============================================================
Epoch 6/1000
============================================================
[async eval] launched at step 1500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.h264.mp4
Train Loss: 0.0054 (Volume: 0.0027, Gripper: 0.0026, Rotation: 0.0000)
Val - Loss: 0.1563, Volume: 0.0013, Pixel Error: 6.00px, Height Error: 0.000mm, Gripper: 0.0539
✓ Saved best model (val_loss=0.1563)
============================================================
Epoch 7/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0053 (Volume: 0.0027, Gripper: 0.0027, Rotation: 0.0000)
Val - Loss: 0.1336, Volume: 0.0016, Pixel Error: 6.79px, Height Error: 0.000mm, Gripper: 0.0343
✓ Saved best model (val_loss=0.1336)
============================================================
Epoch 8/1000
============================================================
[async eval] launched at step 2000
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
Saved step checkpoint: step_2000.pth
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Train Loss: 0.0044 (Volume: 0.0022, Gripper: 0.0022, Rotation: 0.0000)
Val - Loss: 0.1526, Volume: 0.0011, Pixel Error: 5.40px, Height Error: 0.000mm, Gripper: 0.0490
============================================================
Epoch 9/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
Train Loss: 0.0045 (Volume: 0.0023, Gripper: 0.0022, Rotation: 0.0000)
Val - Loss: 0.1970, Volume: 0.0015, Pixel Error: 6.36px, Height Error: 0.000mm, Gripper: 0.0833
============================================================
Epoch 10/1000
============================================================
[async eval] launched at step 2500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
Train Loss: 0.0060 (Volume: 0.0031, Gripper: 0.0029, Rotation: 0.0000)
Val - Loss: 0.1502, Volume: 0.0012, Pixel Error: 6.09px, Height Error: 0.000mm, Gripper: 0.0539
============================================================
Epoch 11/1000
============================================================
[async eval] logged video to wandb: ep000_success.mp4
[async eval] logged video to wandb: ep000_success.h264.mp4
[async eval] logged video to wandb: ep000_success.h264.h264.mp4
Train Loss: 0.0042 (Volume: 0.0021, Gripper: 0.0021, Rotation: 0.0000)
Val - Loss: 0.1374, Volume: 0.0011, Pixel Error: 5.37px, Height Error: 0.000mm, Gripper: 0.0441
============================================================
Epoch 12/1000
============================================================
[async eval] launched at step 3000
[async eval] logged video to wandb: ep000_success.h264.h264.h264.mp4
Saved step checkpoint: step_3000.pth
[async eval] logged video to wandb: ep000_success.h264.h264.h264.h264.mp4
Train Loss: 0.0041 (Volume: 0.0021, Gripper: 0.0021, Rotation: 0.0000)
Val - Loss: 0.2068, Volume: 0.0019, Pixel Error: 7.22px, Height Error: 0.000mm, Gripper: 0.0735
============================================================
Epoch 13/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
Train Loss: 0.0044 (Volume: 0.0022, Gripper: 0.0021, Rotation: 0.0000)
Val - Loss: 0.1392, Volume: 0.0009, Pixel Error: 4.66px, Height Error: 0.000mm, Gripper: 0.0539
============================================================
Epoch 14/1000
============================================================
[async eval] launched at step 3500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
Train Loss: 0.0039 (Volume: 0.0020, Gripper: 0.0019, Rotation: 0.0000)
Val - Loss: 0.1436, Volume: 0.0010, Pixel Error: 4.76px, Height Error: 0.000mm, Gripper: 0.0539
============================================================
Epoch 15/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
Train Loss: 0.0039 (Volume: 0.0020, Gripper: 0.0019, Rotation: 0.0000)
Val - Loss: 0.1425, Volume: 0.0013, Pixel Error: 6.23px, Height Error: 0.000mm, Gripper: 0.0441
============================================================
Epoch 16/1000
============================================================
[async eval] launched at step 4000
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Saved step checkpoint: step_4000.pth
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
Train Loss: 0.0051 (Volume: 0.0026, Gripper: 0.0026, Rotation: 0.0000)
Val - Loss: 0.1985, Volume: 0.0027, Pixel Error: 7.88px, Height Error: 0.000mm, Gripper: 0.0637
============================================================
Epoch 17/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0053 (Volume: 0.0027, Gripper: 0.0026, Rotation: 0.0000)
Val - Loss: 0.1410, Volume: 0.0019, Pixel Error: 7.93px, Height Error: 0.000mm, Gripper: 0.0294
============================================================
Epoch 18/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] launched at step 4500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
Train Loss: 0.0033 (Volume: 0.0018, Gripper: 0.0016, Rotation: 0.0000)
Val - Loss: 0.1070, Volume: 0.0010, Pixel Error: 5.17px, Height Error: 0.000mm, Gripper: 0.0147
✓ Saved best model (val_loss=0.1070)
============================================================
Epoch 19/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0031 (Volume: 0.0016, Gripper: 0.0015, Rotation: 0.0000)
Val - Loss: 0.1298, Volume: 0.0009, Pixel Error: 4.75px, Height Error: 0.000mm, Gripper: 0.0392
============================================================
Epoch 20/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] launched at step 5000
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Saved step checkpoint: step_5000.pth
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
Train Loss: 0.0034 (Volume: 0.0018, Gripper: 0.0016, Rotation: 0.0000)
Val - Loss: 0.1182, Volume: 0.0012, Pixel Error: 6.12px, Height Error: 0.000mm, Gripper: 0.0294
============================================================
Epoch 21/1000
============================================================
[async eval] logged video to wandb: ep000_success.mp4
[async eval] logged video to wandb: ep000_success.h264.mp4
Train Loss: 0.0040 (Volume: 0.0021, Gripper: 0.0019, Rotation: 0.0000)
Val - Loss: 0.1235, Volume: 0.0009, Pixel Error: 4.93px, Height Error: 0.000mm, Gripper: 0.0245
============================================================
Epoch 22/1000
============================================================
[async eval] logged video to wandb: ep000_success.h264.h264.mp4
[async eval] launched at step 5500
[async eval] logged video to wandb: ep000_success.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_success.h264.h264.h264.h264.mp4
Train Loss: 0.0046 (Volume: 0.0023, Gripper: 0.0023, Rotation: 0.0000)
Val - Loss: 0.1370, Volume: 0.0010, Pixel Error: 5.51px, Height Error: 0.000mm, Gripper: 0.0294
============================================================
Epoch 23/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0037 (Volume: 0.0019, Gripper: 0.0018, Rotation: 0.0000)
Val - Loss: 0.1301, Volume: 0.0010, Pixel Error: 5.04px, Height Error: 0.000mm, Gripper: 0.0196
============================================================
Epoch 24/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] launched at step 6000
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Saved step checkpoint: step_6000.pth
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
Train Loss: 0.0034 (Volume: 0.0017, Gripper: 0.0017, Rotation: 0.0000)
Val - Loss: 0.1571, Volume: 0.0013, Pixel Error: 5.29px, Height Error: 0.000mm, Gripper: 0.0490
============================================================
Epoch 25/1000
============================================================
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0031 (Volume: 0.0016, Gripper: 0.0015, Rotation: 0.0000)
Val - Loss: 0.1143, Volume: 0.0008, Pixel Error: 4.50px, Height Error: 0.000mm, Gripper: 0.0147
============================================================
Epoch 26/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] launched at step 6500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Train Loss: 0.0028 (Volume: 0.0015, Gripper: 0.0014, Rotation: 0.0000)
Val - Loss: 0.1257, Volume: 0.0012, Pixel Error: 5.86px, Height Error: 0.000mm, Gripper: 0.0196
============================================================
Epoch 27/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0029 (Volume: 0.0015, Gripper: 0.0014, Rotation: 0.0000)
Val - Loss: 0.1158, Volume: 0.0007, Pixel Error: 4.42px, Height Error: 0.000mm, Gripper: 0.0245
============================================================
Epoch 28/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] launched at step 7000
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Saved step checkpoint: step_7000.pth
Train Loss: 0.0037 (Volume: 0.0019, Gripper: 0.0018, Rotation: 0.0000)
Val - Loss: 0.1129, Volume: 0.0011, Pixel Error: 5.74px, Height Error: 0.000mm, Gripper: 0.0294
============================================================
Epoch 29/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0029 (Volume: 0.0015, Gripper: 0.0014, Rotation: 0.0000)
Val - Loss: 0.1189, Volume: 0.0010, Pixel Error: 4.67px, Height Error: 0.000mm, Gripper: 0.0196
============================================================
Epoch 30/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.mp4
[async eval] launched at step 7500
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.mp4
Train Loss: 0.0026 (Volume: 0.0014, Gripper: 0.0013, Rotation: 0.0000)
Val - Loss: 0.1088, Volume: 0.0009, Pixel Error: 4.32px, Height Error: 0.000mm, Gripper: 0.0147
============================================================
Epoch 31/1000
============================================================
[async eval] logged video to wandb: ep000_fail.h264.h264.h264.h264.mp4
[async eval] logged video to wandb: ep000_fail.mp4
[async eval] logged video to wandb: ep000_fail.h264.mp4
Train Loss: 0.0028 (Volume: 0.0014, Gripper: 0.0014, Rotation: 0.0000)
Val - Loss: 0.1377, Volume: 0.0011, Pixel Error: 5.15px, Height Error: 0.000mm, Gripper: 0.0294
⏱ Time limit reached (20.3 / 20 min). Stopping.