[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-83384":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":12,"contributorsCount":13,"subscribersCount":13,"size":13,"stars1d":14,"stars7d":15,"stars30d":15,"stars90d":13,"forks30d":13,"starsTrendScore":16,"compositeScore":17,"rankGlobal":8,"rankLanguage":8,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":19,"hasPages":19,"topics":21,"createdAt":8,"pushedAt":8,"updatedAt":22,"readmeContent":23,"aiSummary":8,"trendingCount":13,"starSnapshotCount":13,"syncStatus":12,"lastSyncTime":24,"discoverSource":25},83384,"ArtiFixer","nv-tlabs\u002FArtiFixer","nv-tlabs",null,"Python",100,6,2,0,3,38,24,68.34,"Apache License 2.0",false,"main",[],"2026-06-12 04:01:41","\u003C!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->\n\u003C!-- SPDX-License-Identifier: Apache-2.0 -->\n\n# ArtiFixer: Enhancing and Extending 3D Reconstruction with Auto-Regressive Diffusion Models\n\n[Riccardo de Lutio](https:\u002F\u002Friccardodelutio.github.io\u002F),\n[Tobias Fischer](https:\u002F\u002Ftobiasfshr.github.io\u002F),\n[Yen-Yu Chang](https:\u002F\u002Fyuyuchang.github.io\u002F),\n[Yuxuan Zhang](https:\u002F\u002Fscholar.google.com\u002Fcitations?user=Jt5VvNgAAAAJ&hl=en),\n[Jay Zhangjie Wu](https:\u002F\u002Fzhangjiewu.github.io\u002F),\n[Xuanchi Ren](https:\u002F\u002Fxuanchiren.com\u002F),\n[Tianchang Shen](https:\u002F\u002Fwww.cs.toronto.edu\u002F~shenti11\u002F),\n[Katarina Tothova](https:\u002F\u002Fwww.linkedin.com\u002Fin\u002Fkatarina-tothova\u002F),\n[Zan Gojcic](https:\u002F\u002Fzgojcic.github.io\u002F),\n[Haithem Turki](https:\u002F\u002Fhaithemturki.com\u002F)\n\n[Project Page](https:\u002F\u002Fresearch.nvidia.com\u002Flabs\u002Fsil\u002Fprojects\u002Fartifixer\u002F) \u002F [Paper](https:\u002F\u002Fresearch.nvidia.com\u002Flabs\u002Fsil\u002Fprojects\u002Fartifixer\u002Fassets\u002Fpaper.pdf)\n\n![Base 3DGRUT vs ArtiFixer3D+ slider](assets\u002Fdemo\u002Froof_anchor_base_vs_af3dplus_slider.gif)\n\nThis repository provides the official implementation of ArtiFixer.\n\n## License and Contributions\n\nThis project is released under the Apache License, Version 2.0. See [LICENSE](LICENSE).\nThird-party notices and additional license texts are listed in [THIRD-PARTY-NOTICES.md](THIRD-PARTY-NOTICES.md).\n\nThis project will only accept contributions under Apache-2.0. See [CONTRIBUTING.md](CONTRIBUTING.md) for contribution terms.\n\n## Citation\n\n```bibtex\n@inproceedings{delutio2026artifixer,\n    title={ArtiFixer: Enhancing and Extending 3D Reconstruction with Auto-Regressive Diffusion Models},\n    author={de Lutio, Riccardo and Fischer, Tobias and Chang, Yen-Yu and Zhang, Yuxuan and\n            Wu, Jay Zhangjie and Ren, Xuanchi and Shen, Tianchang and Tothova, Katarina and\n            Gojcic, Zan and Turki, Haithem},\n    booktitle={SIGGRAPH},\n    year={2026}\n}\n```\n\n## Repository Layout\n\n- `model_training\u002F`: model definition, data loaders, training loop, and diffusion pipelines.\n- `model_eval\u002F`: inference entry point and metric computation for DL3DV and Nerfbusters evaluations.\n- `data_processing\u002F`: public data-preparation wrappers, split generation, captioning helpers, and sparse-reconstruction data conversion.\n- `thirdparty\u002F`: external reconstruction dependencies used by the data-preparation pipeline.\n\n## Setup\n\nClone the repository with its ArtiFixer-compatible 3DGRUT submodule:\n\n```bash\ngit clone --recurse-submodules https:\u002F\u002Fgithub.com\u002Fnv-tlabs\u002FArtiFixer.git\ncd ArtiFixer\n```\n\nIf you already cloned the repository without submodules, initialize the 3DGRUT\ndependency before building Docker images or running sparse reconstruction and\nArtiFixer3D:\n\n```bash\ngit submodule update --init --recursive\n```\n\nThe recommended environment is one of the provided CUDA Dockerfiles:\n\n```bash\ndocker build -f Dockerfile.cuda12 -t artifixer:cuda12 .\ndocker build -f Dockerfile.cuda13 -t artifixer:cuda13 .\ndocker build -f Dockerfile.cuda13-aarch64 -t artifixer:cuda13-aarch64 .\n```\n\nUse `Dockerfile.cuda13-aarch64` for ARM64 systems such as GB200 nodes. Use the CUDA 12 or CUDA 13 Dockerfiles for standard x86_64 CUDA environments.\n\nRun the image with the repository and datasets mounted:\n\n```bash\ndocker run --gpus all --ipc=host --rm -it \\\n    -v \"$PWD\":\u002Fworkspace\u002Fartifixer \\\n    -v \u002Fpath\u002Fto\u002Fdata:\u002Fdata \\\n    artifixer:cuda12\ncd \u002Fworkspace\u002Fartifixer\n```\n\nDownload the release checkpoint from the [ArtiFixer Hugging Face repo](https:\u002F\u002Fhuggingface.co\u002Fnvidia\u002FArtiFixer):\n\n```bash\nmkdir -p \u002Fdata\u002Fartifixer-checkpoints\nhuggingface-cli download nvidia\u002FArtiFixer \\\n    artifixer-14b.pt \\\n    --local-dir \u002Fdata\u002Fartifixer-checkpoints\n\nexport CHECKPOINT_PT=\u002Fdata\u002Fartifixer-checkpoints\u002Fartifixer-14b.pt\n```\n\n## Inference\n\nTo try out the workflow on one scene, download this DL3DV archive:\n\n```bash\nexport DL3DV_ROOT=\u002Fdata\u002FDL3DV-ALL-960P\n\npython scripts\u002Fdownload_dl3dv_scene.py \\\n    --local-dir \"$DL3DV_ROOT\" \\\n    --scene-id 15ff83e2531668d27c92091c97d31401ce323e24ee7c844cb32d5109ab9335f7 \\\n    --subdir 8K\n```\n\nFor an arbitrary image collection, first run COLMAP and organize the result as:\n\n```text\n\u003CCOLMAP_SCENE>\u002F\n  images\u002F\n  sparse\u002F0\u002F\n    cameras.bin\n    images.bin\n    points3D.bin\n```\n\nThen prepare the scene for ArtiFixer inference:\n\n```bash\npython -m data_processing.prepare_colmap_artifixer_inputs \\\n    --colmap_dir \u002Fpath\u002Fto\u002FCOLMAP_SCENE \\\n    --output_root \u002Fpath\u002Fto\u002Fartifixer-prep\u002Fmy_scene\n```\n\nBy default, every COLMAP image is used as a 3DGRUT training view. To select a subset of images, pass a\nnewline-delimited file of selected training image names:\n\n```bash\npython -m data_processing.prepare_colmap_artifixer_inputs \\\n    --colmap_dir \u002Fpath\u002Fto\u002FCOLMAP_SCENE \\\n    --output_root \u002Fpath\u002Fto\u002Fartifixer-prep\u002Fmy_scene \\\n    --selected_image_names_file \u002Fpath\u002Fto\u002Fselected_train_images.txt\n```\n\nEach prepared `split.json` describes one render path. To prepare a novel camera path, use a separate output\nroot and pass a transforms-style JSON file with camera intrinsics and 4x4 camera-to-world matrices. Frame entries\nmay override the top-level focal length, principal point, and distortion, but keep one fixed resolution across the\ntrajectory. The preparation command renders the 3DGRUT reconstruction along that path and writes a new\n`split.json` that points to those renders.\n\n```json\n{\n  \"camera_model\": \"OPENCV\",\n  \"w\": 1024,\n  \"h\": 576,\n  \"fl_x\": 640.0,\n  \"fl_y\": 640.0,\n  \"cx\": 512.0,\n  \"cy\": 288.0,\n  \"frames\": [\n    {\"transform_matrix\": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]}\n  ]\n}\n```\n\n```bash\npython -m data_processing.prepare_colmap_artifixer_inputs \\\n    --colmap_dir \u002Fpath\u002Fto\u002FCOLMAP_SCENE \\\n    --output_root \u002Fpath\u002Fto\u002Fartifixer-prep\u002Fmy_scene_orbit_360 \\\n    --selected_image_names_file \u002Fpath\u002Fto\u002Fselected_train_images.txt \\\n    --trajectory_path \u002Fpath\u002Fto\u002Forbit_360.json\n```\n\nThe command trains a 3DGRUT COLMAP MCMC reconstruction for 10,000 iterations by default, renders the source\ncameras or the requested trajectory, estimates metric scale with MoGe, and writes caption embeddings. It prepares\nthese inputs for `model_eval.run_inference`:\n\n```text\n\u002Fpath\u002Fto\u002Fartifixer-prep\u002Fmy_scene\u002F\n  split.json\n  selected_indices.json\n  selected_images.txt\n  3dgrut_input\u002F\n  recon_results\u002F\n  captions\u002F\n  metric_alignment\u002Fscale_info.txt\n```\n\nRun ArtiFixer on the prepared full clip with the generated paths. Release\ncheckpoints are single-file transformer state dicts; DCP\u002FFSDP checkpoint\ndirectories are also supported by replacing `--checkpoint_pt` with\n`--checkpoint_dir`.\n\n```bash\nexport SCENE_ROOT=\u002Fpath\u002Fto\u002Fartifixer-prep\u002Fmy_scene\nexport SAVE_DIR=\u002Fpath\u002Fto\u002Fartifixer-corrected\n\npython -m model_eval.run_inference \\\n    --evalset reconstructed_colmap \\\n    --checkpoint_pt \"$CHECKPOINT_PT\" \\\n    --save_dir \"$SAVE_DIR\" \\\n    --split_path \"$SCENE_ROOT\u002Fsplit.json\" \\\n    --render_trajectory all_frames\n```\n\nTo run on a prepared novel trajectory:\n\n```bash\nexport SCENE_ROOT=\u002Fpath\u002Fto\u002Fartifixer-prep\u002Fmy_scene_orbit_360\n\npython -m model_eval.run_inference \\\n    --evalset reconstructed_colmap \\\n    --checkpoint_pt \"$CHECKPOINT_PT\" \\\n    --save_dir \"$SAVE_DIR\" \\\n    --split_path \"$SCENE_ROOT\u002Fsplit.json\" \\\n    --render_trajectory trajectory\n```\n\n### ArtiFixer3D and ArtiFixer3D+\n\n`model_eval.run_inference` can correct held-out validation frames, the full source trajectory, or the prepared trajectory described by the split:\n\n```bash\n# Default: held-out validation frames.\n--render_trajectory val_frames\n\n# Full source clip.\n--render_trajectory all_frames\n\n# Prepared novel trajectory.\n--render_trajectory trajectory\n```\n\nArtiFixer3D trains a fresh 3DGRUT optimization by default on the union of real anchor views and\nArtiFixer-generated target views. The split defines those roles: selected source images are real anchors,\nand non-selected source or trajectory frames are targets whose RGB comes from the ArtiFixer prediction directory.\nUse `--selected_image_names_file` to create source-camera targets, or `--trajectory_path` to create novel-trajectory targets.\n\nAfter the ArtiFixer run completes, pass its predicted frames into the ArtiFixer3D stage. Use the output directory\nprinted by `model_eval.run_inference`:\n\n```bash\nexport ARTIFIXER_OUTPUT_DIR=\u002Fpath\u002Fto\u002Fartifixer-corrected\u002F\u003Ccheckpoint_name>\u002F\u003Crun_name>\nexport SCENE_ID=$(basename \"$SCENE_ROOT\")\nexport ARTIFIXER_FRAMES_DIR=\"$ARTIFIXER_OUTPUT_DIR\u002F$SCENE_ID\u002Fframes\u002Fbatch_0000\u002Fpred\"\n\npython -m data_processing.run_artifixer3d \\\n    --scene_root \"$SCENE_ROOT\" \\\n    --artifixer_frames_dir \"$ARTIFIXER_FRAMES_DIR\"\n```\n\nThe ArtiFixer3D stage renders the updated reconstruction and writes the metadata used by ArtiFixer3D+ inference:\n\n```text\n$SCENE_ROOT\u002Fartifixer3d\u002F\n  distillation_input\u002F\n  runs\u002F\n  recon_results\u002F\n$SCENE_ROOT\u002Fsplit_artifixer3d_plus.json\n```\n\nRun ArtiFixer3D+ by applying ArtiFixer again with that generated inference metadata.\n\n```bash\nexport ARTIFIXER3D_PLUS_SAVE_DIR=\u002Fpath\u002Fto\u002Fartifixer3d-plus\nexport RENDER_TRAJECTORY=all_frames  # use trajectory for a prepared novel-trajectory split\n\npython -m model_eval.run_inference \\\n    --evalset reconstructed_colmap \\\n    --checkpoint_pt \"$CHECKPOINT_PT\" \\\n    --save_dir \"$ARTIFIXER3D_PLUS_SAVE_DIR\" \\\n    --split_path \"$SCENE_ROOT\u002Fsplit_artifixer3d_plus.json\" \\\n    --render_trajectory \"$RENDER_TRAJECTORY\"\n```\n\n## Training Data Preparation\n\nTraining expects three prepared inputs:\n\n1. DL3DV scene archives from the [DL3DV-ALL-960P Hugging Face dataset](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FDL3DV\u002FDL3DV-ALL-960P), arranged under a root such as `\u003CDL3DV_ROOT>\u002F\u003Csplit_or_subdir>\u002F\u003Cscene_id>.zip`.\n2. Reconstruction HDF5 files referenced by the split JSON. Each reconstruction file must include selected indices, render\u002Fopacity payloads, and a valid scale.\n3. Prompt HDF5 files under `\u003CPROMPT_ROOT>\u002F\u003Csplit_or_subdir>\u002F\u003Cscene_id>\u002Fframes_\u003Cnum_frames>_stride_1*.h5`.\n\n\nThe workflow below runs the required data-preparation tasks directly. The captioning and reconstruction commands process every scene zip under `--dl3dv_dir` by default; use `--scene_id` or `--scene_list` only when intentionally restricting a run.\n\n### 1. Download DL3DV\n\nDownload the DL3DV scene zips from the [DL3DV-ALL-960P Hugging Face dataset](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FDL3DV\u002FDL3DV-ALL-960P):\n\n```bash\nhuggingface-cli download DL3DV\u002FDL3DV-ALL-960P \\\n    --repo-type dataset \\\n    --local-dir \u002Fpath\u002Fto\u002FDL3DV-ALL-960P\n```\n\n### 2. Generate Prompt HDF5 Files\n\nGenerate the text-conditioning HDF5 files used during training:\n\n```bash\npython -m data_processing.run_captioning \\\n    --dl3dv_dir \u002Fpath\u002Fto\u002FDL3DV-ALL-960P \\\n    --output_dir \u002Fpath\u002Fto\u002Fartifixer-data\u002FDL3DV-ALL-960P-captions\n```\n\n### 3. Generate Reconstruction HDF5 Files\n\nGenerate sparse 3D reconstructions and convert their renders, opacity, depth, selected indices, and metric scale into ArtiFixer HDF5 files:\n\n```bash\npython -m data_processing.run_sparse_reconstruction \\\n    --dl3dv_dir \u002Fpath\u002Fto\u002FDL3DV-ALL-960P \\\n    --output_root \u002Fpath\u002Fto\u002Fartifixer-data\u002Freconstructions \\\n    --work_root \u002Fpath\u002Fto\u002Fartifixer-work\u002Freconstructions \\\n    --num_selected_indices 2 3 6 12\n```\n\nThis wrapper runs the required per-scene operations in order:\n\n1. Half-covisibility camera split generation.\n2. 3DGRUT sparse reconstruction training for each requested scene half and view count.\n3. Metric-scale alignment.\n4. Conversion to HDF5.\n5. Copying final `data_*.h5`, `parsed_*.yaml`, and `ckpt_last_*.pt` files into the reconstruction root.\n\nThe split builder expects reconstruction subdirectories named `dl3dv_\u003Cdl3dv_subdir>`, which the wrapper creates by default. Final files are written as:\n\n```text\n\u003CRECON_ROOT>\u002Fdl3dv_\u003Cdl3dv_subdir>\u002F\u003Cscene_id>\u002F\n  data_\u003Cscene_id>_\u003Cscene_half>_\u003Cnum_views>.h5\n  parsed_\u003Cscene_id>_\u003Cscene_half>_\u003Cnum_views>.yaml\n  ckpt_last_\u003Cscene_id>_\u003Cscene_half>_\u003Cnum_views>.pt\n```\n\nMetric alignment uses MoGe for monocular depth. If the run environment cannot download MoGe weights, download the checkpoint ahead of time and set `MOGE_MODEL_PATH` to that local checkpoint directory before launching reconstruction.\n\n### 4. Build the Train\u002FTest Split\n\nGenerate the split JSON after prompt and reconstruction files are available:\n\n```bash\npython -m data_processing.trainval_test_split \\\n    --data_path \u002Fpath\u002Fto\u002Fartifixer-data\u002Freconstructions \\\n    --dl3dv_dir \u002Fpath\u002Fto\u002FDL3DV-ALL-960P \\\n    --output_root \u002Fpath\u002Fto\u002Fartifixer-data\n```\n\nThis writes `\u002Fpath\u002Fto\u002Fartifixer-data\u002Ftrainval_test_split.json`. The script validates source archives, required reconstruction splits, duplicate reconstruction files, and known bad scenes before writing the split.\n\n### 5. Use Prepared Paths\n\nUse the prepared paths consistently for training and evaluation:\n\n```bash\nexport SPLIT_PATH=\u002Fpath\u002Fto\u002Fartifixer-data\u002Ftrainval_test_split.json\nexport DL3DV_ROOT=\u002Fpath\u002Fto\u002FDL3DV-ALL-960P\nexport PROMPT_ROOT=\u002Fpath\u002Fto\u002Fartifixer-data\u002FDL3DV-ALL-960P-captions\n```\n\n## Training\n\nArtiFixer training has three stages:\n\n1. Stage 1 supervised finetuning on reconstruction-conditioned DL3DV clips.\n2. Stage 2 diffusion-forcing finetuning from the stage 1 checkpoint.\n3. Stage 3 DMD distillation, using the stage 2 checkpoint as the student\u002Fgenerator initialization and the stage 1\n   checkpoint as the critic initialization.\n\nThe default model is Wan2.1 14B. Set `num_processes * gradient_accumulation_steps` to 128 for the default recipe; for example, use `--gradient_accumulation_steps 16` with 8 processes.\n\nLaunch stage 1 with `accelerate`:\n\n```bash\nexport PROJECT_DIR=\u002Fpath\u002Fto\u002Fruns\u002Fartifixer-s1-14b\nexport SPLIT_PATH=\u002Fpath\u002Fto\u002Fartifixer-data\u002Ftrainval_test_split.json\nexport DL3DV_ROOT=\u002Fpath\u002Fto\u002FDL3DV-ALL-960P\nexport PROMPT_ROOT=\u002Fpath\u002Fto\u002Fartifixer-data\u002FDL3DV-ALL-960P-captions\nexport NUM_PROCESSES=8\nexport GRADIENT_ACCUMULATION_STEPS=16\n\naccelerate launch \\\n    --multi_gpu \\\n    --num_processes \"$NUM_PROCESSES\" \\\n    --module model_training.train \\\n    --project_dir \"$PROJECT_DIR\" \\\n    --split_path \"$SPLIT_PATH\" \\\n    --dl3dv_dir \"$DL3DV_ROOT\" \\\n    --prompt_dir \"$PROMPT_ROOT\" \\\n    --gradient_accumulation_steps \"$GRADIENT_ACCUMULATION_STEPS\" \\\n    --tracker_run_name artifixer-s1-14b \\\n    --resume_from_checkpoint auto\n```\n\nFor multi-node Slurm jobs, start from `model_training\u002Fslurm\u002Fsample-slurm-submit.sh`; it is a template that expects you to provide your cluster account, partition, paths, and optional container image through standard Slurm flags or environment variables.\n\nStage 2 finetunes a stage 1 checkpoint with block-causal diffusion-forcing training:\n\n```bash\nexport STAGE1_CHECKPOINT=\u002Fpath\u002Fto\u002Fruns\u002Fartifixer-s1-14b\u002Fcheckpoints\u002Fcheckpoint_25000\u002Fpytorch_model_fsdp_0\nexport STAGE2_PROJECT_DIR=\u002Fpath\u002Fto\u002Fruns\u002Fartifixer-s2-14b-from-s1-25000\n\naccelerate launch \\\n    --multi_gpu \\\n    --num_processes \"$NUM_PROCESSES\" \\\n    --module model_training.diffusion_forcing \\\n    --project_dir \"$STAGE2_PROJECT_DIR\" \\\n    --base_checkpoint_dir \"$STAGE1_CHECKPOINT\" \\\n    --split_path \"$SPLIT_PATH\" \\\n    --dl3dv_dir \"$DL3DV_ROOT\" \\\n    --prompt_dir \"$PROMPT_ROOT\" \\\n    --gradient_accumulation_steps \"$GRADIENT_ACCUMULATION_STEPS\" \\\n    --tracker_run_name artifixer-s2-14b-from-s1-25000 \\\n    --resume_from_checkpoint auto\n```\n\nStage 3 runs DMD distillation. `--base_checkpoint_dir` initializes the student\u002Fgenerator; `--base_checkpoint_dir_critic` initializes the fixed real-score critic and trainable fake-score critic. The critic model config defaults to `--model_id`; pass `--model_id_critic` only when the critic checkpoint uses a different base model.\n\n```bash\nexport STAGE2_CHECKPOINT=\u002Fpath\u002Fto\u002Fruns\u002Fartifixer-s2-14b-from-s1-25000\u002Fcheckpoints\u002Fcheckpoint_10000\u002Fpytorch_model_fsdp_0\nexport CRITIC_CHECKPOINT=\u002Fpath\u002Fto\u002Fruns\u002Fartifixer-s1-14b\u002Fcheckpoints\u002Fcheckpoint_25000\u002Fpytorch_model_fsdp_0\nexport STAGE3_PROJECT_DIR=\u002Fpath\u002Fto\u002Fruns\u002Fartifixer-s3-14b-s2-10000-s1-25000\n\naccelerate launch \\\n    --multi_gpu \\\n    --num_processes \"$NUM_PROCESSES\" \\\n    --module model_training.distillation \\\n    --project_dir \"$STAGE3_PROJECT_DIR\" \\\n    --base_checkpoint_dir \"$STAGE2_CHECKPOINT\" \\\n    --base_checkpoint_dir_critic \"$CRITIC_CHECKPOINT\" \\\n    --split_path \"$SPLIT_PATH\" \\\n    --dl3dv_dir \"$DL3DV_ROOT\" \\\n    --prompt_dir \"$PROMPT_ROOT\" \\\n    --gradient_accumulation_steps \"$GRADIENT_ACCUMULATION_STEPS\" \\\n    --tracker_run_name artifixer-s3-14b-s2-10000-s1-25000 \\\n    --resume_from_checkpoint auto\n```\n\n## Evaluation\n\nRelease evaluation reports four rows:\n\n1. `3DGUT`: the base sparse 3D reconstruction renders.\n2. `ArtiFixer`: direct frame output from `model_eval.run_inference`.\n3. `ArtiFixer3D`: a fresh 3DGRUT optimization distilled from the direct ArtiFixer frames.\n4. `ArtiFixer3D+`: ArtiFixer run again on the ArtiFixer3D renders and generated inference metadata.\n\nDL3DV evaluation uses the same prepared DL3DV dataset flow as training. Use the split JSON to select the evaluation scenes and keep `--dl3dv_dir` pointed at the DL3DV-ALL-960P root used to prepare captions and reconstructions.\n\nNerfBusters uses each scene's `transforms.json` plus the scene-specific image folder selected by the shared resolution helper. `aloe`, `car`, `garbage`, and `table` use `images_2`; the remaining scenes use `images`. NerfBusters visibility masks are eval-only and are not passed to 3DGRUT distillation training.\n\nRun direct ArtiFixer inference from a checkpoint. These commands write PNG frames needed by the metric scripts. The examples use one process and therefore one GPU; to distribute scenes across multiple GPUs, run the same module through `torchrun --nproc_per_node \u003Cnum-gpus>`.\n\nDL3DV (our split):\n\n```bash\nexport CHECKPOINT_PT=\u002Fdata\u002Fartifixer-checkpoints\u002Fartifixer-14b.pt\nexport SAVE_DIR=\u002Fpath\u002Fto\u002Fartifixer-eval\nexport SPLIT_PATH=\u002Fpath\u002Fto\u002Fartifixer-data\u002Ftrainval_test_split.json\nexport DL3DV_ROOT=\u002Fpath\u002Fto\u002FDL3DV-ALL-960P\nexport PROMPT_ROOT=\u002Fpath\u002Fto\u002Fartifixer-data\u002FDL3DV-ALL-960P-captions\n\npython -m model_eval.run_inference \\\n    --evalset 3dgrut_dl3dv_ours \\\n    --checkpoint_pt \"$CHECKPOINT_PT\" \\\n    --save_dir \"$SAVE_DIR\" \\\n    --split_path \"$SPLIT_PATH\" \\\n    --dl3dv_dir \"$DL3DV_ROOT\" \\\n    --prompt_dir \"$PROMPT_ROOT\" \\\n    --save_frame_outputs_only\n```\n\n```bash\nexport EVAL_OUTPUT_NAME=artifixer-14b\n\npython -m model_eval.compute_metrics_dl3dv \\\n    --evalset 3dgrut_dl3dv_ours \\\n    --eval_output_name \"$EVAL_OUTPUT_NAME\" \\\n    --sink_size 7 \\\n    --split_path \"$SPLIT_PATH\" \\\n    --dl3dv_dir \"$DL3DV_ROOT\" \\\n    --eval_base_path \"$SAVE_DIR\" \\\n    --no_masks\n```\n\nDL3DV (DiFix split):\n\n```bash\nexport DIFIX_RECON_RESULTS_DIR=\u002Fpath\u002Fto\u002Fdifix-reconstruction-results\nexport DIFIX_TRAIN_IDS_DIR=\u002Fpath\u002Fto\u002Fdifix-train-ids\nexport DIFIX_VISIBILITY_MASKS_DIR=\u002Fpath\u002Fto\u002Fdifix-visibility-masks\n\npython -m model_eval.run_inference \\\n    --evalset 3dgrut_dl3dv_difix \\\n    --checkpoint_pt \"$CHECKPOINT_PT\" \\\n    --save_dir \"$SAVE_DIR\" \\\n    --split_path \"$SPLIT_PATH\" \\\n    --dl3dv_dir \"$DL3DV_ROOT\" \\\n    --prompt_dir \"$PROMPT_ROOT\" \\\n    --recon_results_dir \"$DIFIX_RECON_RESULTS_DIR\" \\\n    --save_frame_outputs_only\n```\n\n```bash\npython -m model_eval.compute_metrics_dl3dv \\\n    --evalset 3dgrut_dl3dv_difix \\\n    --eval_output_name \"$EVAL_OUTPUT_NAME\" \\\n    --sink_size 7 \\\n    --split_path \"$SPLIT_PATH\" \\\n    --dl3dv_dir \"$DL3DV_ROOT\" \\\n    --eval_base_path \"$SAVE_DIR\" \\\n    --difix_train_ids_dir \"$DIFIX_TRAIN_IDS_DIR\" \\\n    --visibility_masks_dir \"$DIFIX_VISIBILITY_MASKS_DIR\"\n```\n\nThe DiFix comparison uses the masked metric YAMLs as the paper-style numbers. The mask convention is to black out pixels outside the visibility mask and then compute full-image metrics.\n\nNerfBusters:\n\n```bash\nexport NERFBUSTERS_DIR=\u002Fpath\u002Fto\u002Fnerfbusters\nexport NERFBUSTERS_RECON_RESULTS_DIR=\u002Fpath\u002Fto\u002Fnerfbusters-reconstruction-results\nexport NERFBUSTERS_CAPTIONS_DIR=\u002Fpath\u002Fto\u002Fnerfbusters-captions\nexport NERFBUSTERS_VISIBILITY_MASKS_DIR=\u002Fpath\u002Fto\u002Fnerfbusters-visibility-masks\n\npython -m model_eval.run_inference \\\n    --evalset nerfbusters \\\n    --checkpoint_pt \"$CHECKPOINT_PT\" \\\n    --save_dir \"$SAVE_DIR\" \\\n    --nerfbusters_dir \"$NERFBUSTERS_DIR\" \\\n    --nerfbusters_recon_results_dir \"$NERFBUSTERS_RECON_RESULTS_DIR\" \\\n    --nerfbusters_captions_dir \"$NERFBUSTERS_CAPTIONS_DIR\" \\\n    --save_frame_outputs_only\n```\n\n```bash\npython -m model_eval.compute_metrics_nerfbusters \\\n    --eval_output_name \"$EVAL_OUTPUT_NAME\" \\\n    --eval_base_path \"$SAVE_DIR\" \\\n    --nerfbusters_dir \"$NERFBUSTERS_DIR\" \\\n    --visibility_masks_dir \"$NERFBUSTERS_VISIBILITY_MASKS_DIR\"\n```\n\nFor ArtiFixer3D, direct ArtiFixer output frames are passed to 3DGRUT through\n`image_path_override`. Source\u002Fselected frames remain the original GT anchors.\nNerfBusters visibility masks are applied only by the metric script.\n\nUse `--replace_if_exists` to regenerate existing outputs. Use `--scene_id \u003Cid>`\nfor a single DL3DV or NerfBusters scene. By default, the DL3DV metric scripts\nevaluate every scene in the configured evaluation split.\n","2026-06-11 04:11:03","CREATED_QUERY"]