[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-38":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":23,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":34,"readmeContent":35,"aiSummary":36,"trendingCount":16,"starSnapshotCount":16,"syncStatus":15,"lastSyncTime":37,"discoverSource":38},38,"nano-world-model","simchowitzlabpublic\u002Fnano-world-model","simchowitzlabpublic","A Minimalist, Batteries-included Repository for Advancing World Model Science.","https:\u002F\u002Fsimchowitzlabpublic.github.io\u002Fnano-world-model\u002F",null,"Python",606,32,3,2,0,24,189,70.56,"MIT License",false,"main",true,[25,26,27,28,29,30,31,32,33],"diffusion-forcing","diffusion-models","model-predictive-control","nano","planning","robot-manipulation","streaming-video","video-generation","world-model","2026-06-11 04:00:17","\u003Cdiv align=\"center\">\n\u003Ch1>🌍 Nano World Model\u003C\u002Fh1>\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"center\">\n\u003Ca href='https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fknightnemo\u002Fnano-world-model'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Page-blue'>\u003C\u002Fa>\n\u003Ca href='https:\u002F\u002Fsimchowitzlabpublic.github.io\u002Fnano-world-model\u002F'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-Green'>\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg\" alt=\"License: 
MIT\"\u002F>\u003C\u002Fa>\n\u003C\u002Fdiv>\n\nA minimalist repository for training video world models based on diffusion forcing.\n\n\u003Cdiv align=\"center\">\n\n![3×4 rollout grid](assets\u002Fgrid_video.gif)\n\n\u003C\u002Fdiv>\n\n## Key Features\n\n- 🚀 **Instant Start** — Minimal dependencies, easy data loading. From clone to first rollout in minutes.\n- 🛠️ **Unified Pipeline** — Training, validation, and evaluation, all managed through a clean Hydra-based configuration system.\n- 🔬 **Scientific Transparency** — Clean codebase with head-to-head ablations across prediction target, action injection, and model scale. Fully open-source, including model checkpoints.\n- 🤖 **Diverse Applications** — Long-horizon rollouts, rollouts to 3D point clouds, and planning (MPC) out of the box.\n\n## 🚀 Quick Start\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fsimchowitzlabpublic\u002Fnano-world-model.git\ncd nano-world-model\nconda env create -f environment.yml && conda activate nanowm\n```\n\nSet the data and results paths (or use the gitignored `src\u002Fconfigs\u002Flocal\u002Fpaths.yaml` template — see [docs\u002Fconfig_system.md](docs\u002Fconfig_system.md#path-configuration)):\n\n```bash\nexport DATASET_DIR=\u002Fpath\u002Fto\u002Fdino_wm_data       # DINO-WM envs (point_maze, pusht, ...)\nexport CSGO_DATA_DIR=\u002Fpath\u002Fto\u002Fcsgo             # CSGO HDF5 files\nexport RT1_DATA_ROOT=\u002Fpath\u002Fto\u002Frt1_fractal      # RT-1 LeRobot mirror (optional)\nexport RESULTS_DIR=\u002Fpath\u002Fto\u002Fresults            # checkpoints + logs land here\n```\n\nDownload the I3D TorchScript model used for FID\u002FFVD evaluation:\n\n```bash\nmkdir -p pretrained_models\u002Fi3d && curl -L \\\n    \"https:\u002F\u002Fwww.dropbox.com\u002Fscl\u002Ffi\u002Fc5nfs6c422nlpj880jbmh\u002Fi3d_torchscript.pt?rlkey=x5xcjsrz0818i4qxyoglp5bb8&dl=1\" \\\n    -o pretrained_models\u002Fi3d\u002Fi3d_torchscript.pt\n```\n\nFor dataset downloads (DINO-WM, RT-1, CSGO), see 
[docs\u002Fdatasets\u002FREADME.md](docs\u002Fdatasets\u002FREADME.md).\n\n## 🥷 Train your first model\n\nDINO-WM PushT, NanoWM-B\u002F2, default settings (pred-v · additive injection · cosine + ZTSNR):\n\n```bash\npython src\u002Fmain.py experiment=dino_wm_pusht dataset=dino_wm\u002Fpusht model=nanowm_b2\n```\n\nCSGO with the L\u002F2 model:\n\n```bash\npython src\u002Fmain.py experiment=csgo dataset=game\u002Fcsgo model=nanowm_l2_csgo\n```\n\nRT-1 (fractal) main run:\n\n```bash\npython src\u002Fmain.py experiment=rt1 dataset=rt1\u002Frt1 model=nanowm_b2\n```\n\nFor reproducibility, we provide example scripts in `src\u002Fscripts\u002F`. See [docs\u002Ftraining.md](docs\u002Ftraining.md) for the full training guide, design choices, and ablation tables.\n\n## 📦 Pretrained Checkpoints\n\nBest-config runs (pred-v · additive · cosine + ZTSNR · NanoWM-B\u002F2 unless noted):\n\n\u003Cdiv align=\"center\">\n\n| Domain | Checkpoint | Steps |\n|:-------|:-----------|:------|\n| DINO-WM Point Maze | 🤗 [nanowm-b2-dino-wm-point-maze-30k](https:\u002F\u002Fhuggingface.co\u002Fknightnemo\u002Fnanowm-b2-dino-wm-point-maze-30k) | 30k |\n| DINO-WM Wall | 🤗 [nanowm-b2-dino-wm-wall-15k](https:\u002F\u002Fhuggingface.co\u002Fknightnemo\u002Fnanowm-b2-dino-wm-wall-15k) | 15k |\n| DINO-WM Rope | 🤗 [nanowm-b2-dino-wm-rope-15k](https:\u002F\u002Fhuggingface.co\u002Fknightnemo\u002Fnanowm-b2-dino-wm-rope-15k) | 15k |\n| DINO-WM Granular | 🤗 [nanowm-b2-dino-wm-granular-15k](https:\u002F\u002Fhuggingface.co\u002Fknightnemo\u002Fnanowm-b2-dino-wm-granular-15k) | 15k |\n| DINO-WM PushT | 🤗 [nanowm-b2-dino-wm-pusht-100k](https:\u002F\u002Fhuggingface.co\u002Fknightnemo\u002Fnanowm-b2-dino-wm-pusht-100k) | 100k |\n| RT-1 (fractal) | 🤗 [nanowm-b2-rt1-300k](https:\u002F\u002Fhuggingface.co\u002Fknightnemo\u002Fnanowm-b2-rt1-300k) | 300k |\n| CSGO | 🤗 [nanowm-l2-csgo-100k](https:\u002F\u002Fhuggingface.co\u002Fknightnemo\u002Fnanowm-l2-csgo-100k) (NanoWM-L\u002F2) | 100k 
|\n\n\u003C\u002Fdiv>\n\nWe also provide RT-1 ablation tables with HF checkpoint paths. See [docs\u002Ftraining.md#design-choices](docs\u002Ftraining.md#design-choices) for the full table and ablation numbers.\n\n## 🎬 Sample Predictions\n\nCSGO 50-frame autoregressive long rollouts (NanoWM-L\u002F2, 100k):\n\n\u003Cdiv align=\"center\">\n\n![CSGO 50-frame autoregressive long rollout](assets\u002Fcsgo_100k_long_rollout.gif)\n\n\u003C\u002Fdiv>\n\nQuantitative metrics, evaluated on 256 fixed samples (seed=42) with 250 DDIM steps and sequential scheduling (frame-by-frame autoregressive denoising):\n\n\u003Cdiv align=\"center\">\n\n| Dataset | Steps | PSNR ↑ | SSIM ↑ | LPIPS ↓ | FID ↓ |\n|:--------|:------|:-------|:-------|:--------|:------|\n| Point Maze | 30k | 36.74 | 0.984 | 0.019 | 9.66 |\n| Wall | 15k | 34.05 | 0.994 | 0.010 | 2.64 |\n| PushT | 100k | 33.19 | 0.982 | 0.016 | 13.63 |\n| Rope | 15k | 31.63 | 0.953 | 0.056 | 35.20 |\n| Granular | 15k | 26.08 | 0.917 | 0.073 | 40.05 |\n| RT-1 | 300k | 24.36 | 0.787 | 0.180 | 35.08 |\n\n\u003C\u002Fdiv>\n\nFull per-domain numbers and methodology are in [docs\u002Fevaluation.md](docs\u002Fevaluation.md).\n\n## 🧭 Applications\n\nNanoWM rollouts can be used directly for downstream applications, including long-horizon generation, video-to-3D reconstruction, and MPC-style planning.\n\n\u003Cdiv align=\"center\">\n\n![Video-to-3D point cloud demo](assets\u002Fvideo_to_3d.gif)\n\n\u003C\u002Fdiv>\n\n- **[Long-horizon rollout](docs\u002Fapplications\u002Flong_rollout.md)** — autoregressive rollout from trained checkpoints\n- **[Video → 3D map](docs\u002Fapplications\u002Fvideo_to_3d.md)** — Depth Anything 3 point cloud reconstruction from rollout videos\n- **[MPC-style planning](docs\u002Fapplications\u002Fplanning.md)** — CEM planning over world model rollouts\n\n## 📚 Documentation\n\n- **[docs\u002Fconfig_system.md](docs\u002Fconfig_system.md)** — Hydra config layout, overrides, environment variables\n- 
**[docs\u002Ftraining.md](docs\u002Ftraining.md)** — training workflow, design choices, ablation tables, all checkpoints\n- **[docs\u002Fevaluation.md](docs\u002Fevaluation.md)** — evaluation workflow, metric definitions, full result tables\n- **[docs\u002Fdatasets\u002FREADME.md](docs\u002Fdatasets\u002FREADME.md)** — DINO-WM \u002F RT-1 \u002F CSGO formats, downloads, splits\n- **[docs\u002Fapplications\u002Fplanning.md](docs\u002Fapplications\u002Fplanning.md)** — MPC + CEM model-predictive control\n- **[docs\u002Fapplications\u002Flong_rollout.md](docs\u002Fapplications\u002Flong_rollout.md)** — long-horizon autoregressive rollout\n- **[docs\u002Fapplications\u002Fvideo_to_3d.md](docs\u002Fapplications\u002Fvideo_to_3d.md)** — Depth Anything 3 point cloud pipeline\n\n## 🙏 Acknowledgements\n\nWe build upon a number of existing codebases: [Latte](https:\u002F\u002Fgithub.com\u002FVchitect\u002FLatte), [Vid2World](https:\u002F\u002Fgithub.com\u002Fthuml\u002FVid2World), [DFoT](https:\u002F\u002Fgithub.com\u002Fkwsong0113\u002Fdiffusion-forcing-transformer), and [DINO-WM](https:\u002F\u002Fgithub.com\u002Fgaoyuezhou\u002Fdino_wm). More broadly, this repository draws inspiration and design principles from [NanoGPT](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002FnanoGPT), [NanoChat](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fnanochat), and [Boyuan Chen's Research Template](https:\u002F\u002Fgithub.com\u002Fbuoyancy99\u002Fresearch-template). 
We sincerely thank the authors of the codebases above for open-sourcing their work.\n\n## 📝 Citation\n\nIf you find this repository useful in your research, please consider citing:\n\n```bibtex\n@misc{nanoworldmodels,\n  title={Nano World Model: A Minimalist, Batteries-Included Repository for Advancing World Model Science},\n  author={Siqiao Huang and Partha Kaushik and Michael Chen and Hengkai Pan and Kaiwen Geng and Omar Chehab and Fernando Moreno-Pino and Max Simchowitz},\n  year={2026},\n  publisher={GitHub},\n  journal={GitHub repository},\n  howpublished={\\url{https:\u002F\u002Fgithub.com\u002Fsimchowitzlabpublic\u002Fnano-world-model}},\n}\n```\n","Nano World Model is a minimalist repository focused on training video world models based on diffusion forcing. It provides an integrated pipeline that runs from data loading through model training, validation, and evaluation, supports a quick start, and keeps the codebase clean and transparent so that scientific experiments are easy to compare. It also covers diverse applications such as long-horizon prediction, 3D point cloud generation, and built-in model predictive control (MPC). It suits researchers and developers who need to develop and test video prediction models efficiently.","2026-06-11 02:30:33","CREATED_QUERY"]