[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72610":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":23,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":34,"readmeContent":35,"aiSummary":36,"trendingCount":16,"starSnapshotCount":16,"syncStatus":37,"lastSyncTime":38,"discoverSource":39},72610,"diamond","eloialonso\u002Fdiamond","eloialonso","DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.","https:\u002F\u002Fdiamond-wm.github.io",null,"Python",2057,157,21,6,0,1,7,19,3,66,"MIT License",false,"main",[26,27,28,29,30,31,32,33],"artificial-intelligence","atari","deep-learning","diffusion-models","machine-learning","reinforcement-learning","research","world-models","2026-06-12 04:01:06","# Diffusion for World Modeling: Visual Details Matter in Atari (NeurIPS 2024 Spotlight)\n\n[**TL;DR**] 💎 DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained entirely in a diffusion world model.\n\n🌍 [Project Page](https:\u002F\u002Fdiamond-wm.github.io) • 🤓 [Paper](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2405.12399) • 𝕏 [Atari thread](https:\u002F\u002Fx.com\u002FEloiAlonso1\u002Fstatus\u002F1793916382779982120) • 𝕏 [CSGO thread](https:\u002F\u002Fx.com\u002FEloiAlonso1\u002Fstatus\u002F1844803606064611771) • 💬 [Discord](https:\u002F\u002Fdiscord.gg\u002F74vha5RWPg)\n\n\u003Cdiv align='center'>\n  RL agent playing in autoregressive imagination of Atari world models\n  \u003Cbr>\n  \u003Cimg alt=\"DIAMOND 
agent in WM\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Feb6b72eb-73df-4178-8a3d-cdad80ff9152\">\n\n\u003C\u002Fdiv>\n\n\u003Cdiv align='center'>\n  Human player in CSGO world model (full quality video \u003Ca href=\"https:\u002F\u002Fdiamond-wm.github.io\u002Fstatic\u002Fvideos\u002Fgrid.mp4\">here\u003C\u002Fa>)\n  \u003Cbr>\n  \u003Cimg alt=\"DIAMOND agent in WM\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fdcbdd523-ca22-46a9-bb7d-bcc52080fe00\">\n\u003C\u002Fdiv>\n\nQuick install to try our [pretrained world models](#try) using [miniconda](https:\u002F\u002Fdocs.anaconda.com\u002Ffree\u002Fminiconda\u002Fminiconda-install\u002F):\n\n>```bash\n>git clone https:\u002F\u002Fgithub.com\u002Feloialonso\u002Fdiamond.git\n>cd diamond\n>conda create -n diamond python=3.10\n>conda activate diamond\n>pip install -r requirements.txt\n>```\n\nFor Atari (world model + RL agent)\n\n>```bash\n>python src\u002Fplay.py --pretrained\n>```\n\nFor CSGO (world model only)\n\n>```bash\n>git checkout csgo\n>python src\u002Fplay.py\n>```\n\nAnd press `m` to take control (the policy is playing by default)!\n\n**Warning**: Atari ROMs will be downloaded with the dependencies, which means that you acknowledge that you have the license to use them.\n\n## CSGO\n\n\n**Edit**: Check out the [csgo branch](https:\u002F\u002Fgithub.com\u002Feloialonso\u002Fdiamond\u002Ftree\u002Fcsgo) to try DIAMOND's world model trained on *Counter-Strike: Global Offensive*!\n\n```bash\ngit checkout csgo\npython src\u002Fplay.py\n```\n> Note: on Apple Silicon, you must enable CPU fallback for the MPS backend with\n> `PYTORCH_ENABLE_MPS_FALLBACK=1 python src\u002Fplay.py`\n\n\n\u003Ca name=\"quick_links\">\u003C\u002Fa>\n## Quick Links\n\n- [Try our playable diffusion world models](#try)\n- [Launch a training run](#launch)\n- [Configuration](#configuration)\n- [Visualization](#visualization)\n  - [Play mode (default)](#play_mode)\n  - [Dataset mode (add 
`-d`)](#dataset_mode)\n  - [Other options, common to play\u002Fdataset modes](#other_options)\n- [Run folder structure](#structure)\n- [Results](#results)\n- [Citation](#citation)\n- [Credits](#credits)\n\n\u003Ca name=\"try\">\u003C\u002Fa>\n## [⬆️](#quick_links) Try our playable diffusion world models\n\n```bash\npython src\u002Fplay.py --pretrained\n```\n\nThen select a game; the world model and policy pretrained on Atari 100k will be downloaded from our [repository on Hugging Face Hub 🤗](https:\u002F\u002Fhuggingface.co\u002Feloialonso\u002Fdiamond) and cached on your machine.\n\nSome things you might want to try:\n- Press `m` to switch the controller between the agent and a human (the policy is playing by default).\n- Press `↑\u002F↓` to change the imagination horizon (default is 50 for playing).\n\nTo adjust the sampling parameters (number of denoising steps, stochasticity, order, etc.) of the trained diffusion world model, for instance to trade off sampling speed and quality, edit the section `world_model_env.diffusion_sampler` in the file `config\u002Ftrainer.yaml`.\n\nSee [Visualization](#visualization) for more details about the available commands and options.\n\n\u003Ca name=\"launch\">\u003C\u002Fa>\n## [⬆️](#quick_links) Launch a training run\n\nTo train with the hyperparameters used in the paper on cuda:0, launch:\n```bash\npython src\u002Fmain.py env.train.id=BreakoutNoFrameskip-v4 common.devices=0\n```\n\nThis creates a new folder for your run, located in `outputs\u002FYYYY-MM-DD\u002Fhh-mm-ss\u002F`.\n\nTo resume a run that crashed, navigate to the run folder and launch:\n\n```bash\n.\u002Fscripts\u002Fresume.sh\n```\n\n\u003Ca name=\"configuration\">\u003C\u002Fa>\n## [⬆️](#quick_links) Configuration\n\nWe use [Hydra](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fhydra) for configuration management.\n\nAll configuration files are located in the `config` folder:\n\n- `config\u002Ftrainer.yaml`: main configuration file.\n- 
`config\u002Fagent\u002Fdefault.yaml`: architecture hyperparameters.\n- `config\u002Fenv\u002Fatari.yaml`: environment hyperparameters.\n\nYou can turn on logging to [weights & biases](https:\u002F\u002Fwandb.ai) in the `wandb` section of `config\u002Ftrainer.yaml`.\n\nSet `training.model_free=true` in the file `config\u002Ftrainer.yaml` to \"unplug\" the world model and perform standard model-free reinforcement learning.\n\n\u003Ca name=\"visualization\">\u003C\u002Fa>\n## [⬆️](#quick_links) Visualization\n\n\u003Ca name=\"play_mode\">\u003C\u002Fa>\n### [⬆️](#quick_links) Play mode (default)\n\nTo visualize your last checkpoint, launch **from the run folder**:\n\n```bash\npython src\u002Fplay.py\n```\n\nBy default, you visualize the policy playing in the world model. To play yourself, or switch to the real environment, use the controls described below.\n\n```txt\nControls (play mode)\n\n(Game-specific commands will be printed on start up)\n\n⏎   : reset environment\n\nm   : switch controller (policy\u002Fhuman)\n↑\u002F↓ : imagination horizon (+1\u002F-1)\n←\u002F→ : next environment [world model ←→ real env (test) ←→ real env (train)]\n\n.   : pause\u002Funpause\ne   : step-by-step (when paused)\n```\n\nAdd `-r` to toggle \"recording mode\" (works only in play mode). Every completed episode will be saved in `dataset\u002Frec_\u003Cenv_name>_\u003Ccontroller>`. 
For instance:\n\n- `dataset\u002Frec_wm_π`: Policy playing in world model.\n- `dataset\u002Frec_wm_H`: Human playing in world model.\n- `dataset\u002Frec_test_H`: Human playing in test real environment.\n\nYou can then use the \"dataset mode\" described in the next section to replay the stored episodes.\n\n\u003Ca name=\"dataset_mode\">\u003C\u002Fa>\n### [⬆️](#quick_links) Dataset mode (add `-d`)\n\n**In the run folder**, to visualize the datasets contained in the `dataset` subfolder, add `-d` to switch to \"dataset mode\":\n\n```bash\npython src\u002Fplay.py -d\n```\n\nYou can use the controls described below to navigate the datasets and episodes.\n\n```txt\nControls (dataset mode)\n\nm   : next dataset (if multiple datasets, like recordings, etc)\n↑\u002F↓ : next\u002Fprevious episode\n←\u002F→ : next\u002Fprevious timestep in episodes\nPgUp: +10 timesteps\nPgDn: -10 timesteps\n⏎   : back to first timestep\n```\n\n\u003Ca name=\"other_options\">\u003C\u002Fa>\n### [⬆️](#quick_links) Other options, common to play\u002Fdataset modes\n\n```txt\n--fps FPS             Target frame rate (default 15).\n--size SIZE           Window size (default 800).\n--no-header           Remove header.\n```\n\n\u003Ca name=\"structure\">\u003C\u002Fa>\n## [⬆️](#quick_links) Run folder structure\n\nEach new run is located at `outputs\u002FYYYY-MM-DD\u002Fhh-mm-ss\u002F`. 
This folder is structured as follows:\n\n```txt\noutputs\u002FYYYY-MM-DD\u002Fhh-mm-ss\u002F\n│\n└─── checkpoints\n│   │   state.pt  # full training state\n│   │\n│   └─── agent_versions\n│       │   ...\n│       │   agent_epoch_00999.pt\n│       │   agent_epoch_01000.pt  # agent weights only\n│\n└─── config\n│   │   trainer.yaml\n│\n└─── dataset\n│   │\n│   └─── train\n│   │   │   info.pt\n│   │   │   ...\n│   │\n│   └─── test\n│       │   info.pt\n│       │   ...\n│\n└─── scripts\n│   │   resume.sh\n│   │   ...\n│\n└─── src\n│   │   main.py\n│   │   ...\n│\n└─── wandb\n    │   ...\n```\n\n\u003Ca name=\"results\">\u003C\u002Fa>\n## [⬆️](#quick_links) Results\n\nThe file [results\u002Fdata\u002FDIAMOND.json](results\u002Fdata\u002FDIAMOND.json) contains the results for each game and seed used in the paper.\n\nThe DDPM code used for Section 5.1 of the paper can be found on the [ddpm](https:\u002F\u002Fgithub.com\u002Feloialonso\u002Fdiamond\u002Ftree\u002Fddpm) branch.\n\n\u003Ca name=\"citation\">\u003C\u002Fa>\n## [⬆️](#quick_links) Citation\n\n```text\n@inproceedings{alonso2024diffusionworldmodelingvisual,\n      title={Diffusion for World Modeling: Visual Details Matter in Atari},\n      author={Eloi Alonso and Adam Jelley and Vincent Micheli and Anssi Kanervisto and Amos Storkey and Tim Pearce and François Fleuret},\n      booktitle={Thirty-eighth Conference on Neural Information Processing Systems},\n      year={2024},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.12399},\n}\n```\n\n\u003Ca name=\"credits\">\u003C\u002Fa>\n## [⬆️](#quick_links) Credits\n\n- [https:\u002F\u002Fgithub.com\u002Fcrowsonkb\u002Fk-diffusion\u002F](https:\u002F\u002Fgithub.com\u002Fcrowsonkb\u002Fk-diffusion\u002F)\n- [https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Fhuggingface_hub](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Fhuggingface_hub)\n- 
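[https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Frliable](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Frliable)\n- [https:\u002F\u002Fgithub.com\u002Fpytorch\u002Fpytorch](https:\u002F\u002Fgithub.com\u002Fpytorch\u002Fpytorch)\n","DIAMOND is a reinforcement learning agent trained with a diffusion model. Its core capability is that, by training inside a diffusion world model, the agent can handle complex environment tasks, such as the visual details in Atari games and CS:GO. The project is written in Python and draws on deep learning, reinforcement learning, and diffusion models. It suits research settings that require high-fidelity simulation of complex dynamic environments, such as game AI development or any AI research involving sequential decision-making and environment interaction. DIAMOND also provides pretrained models for a quick start and supports custom training configurations to suit different needs.",2,"2026-06-11 03:42:46","high_star"]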