[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-9695":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":23,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":30,"readmeContent":31,"aiSummary":32,"trendingCount":16,"starSnapshotCount":16,"syncStatus":33,"lastSyncTime":34,"discoverSource":35},9695,"HRM","sapientinc\u002FHRM","sapientinc","Hierarchical Reasoning Model Official Release","https:\u002F\u002Fsapient.inc",null,"Python",12526,1826,268,61,0,8,19,109,31,44.79,"Apache License 2.0",false,"main",[26,27,28,29],"brain-inspired-ai","deep-learning","large-language-models","reasoning","2026-06-12 02:02:11","# Hierarchical Reasoning Model\n\n![](.\u002Fassets\u002Fhrm.png)\n\nReasoning, the process of devising and executing complex goal-oriented action sequences, remains a critical challenge in AI.\nCurrent large language models (LLMs) primarily employ Chain-of-Thought (CoT) techniques, which suffer from brittle task decomposition, extensive data requirements, and high latency. Inspired by the hierarchical and multi-timescale processing in the human brain, we propose the Hierarchical Reasoning Model (HRM), a novel recurrent architecture that attains significant computational depth while maintaining both training stability and efficiency.\nHRM executes sequential reasoning tasks in a single forward pass without explicit supervision of the intermediate process, through two interdependent recurrent modules: a high-level module responsible for slow, abstract planning, and a low-level module handling rapid, detailed computations. 
With only 27 million parameters, HRM achieves exceptional performance on complex reasoning tasks using only 1000 training samples. The model operates without pre-training or CoT data, yet achieves nearly perfect performance on challenging tasks including complex Sudoku puzzles and optimal path finding in large mazes.\nFurthermore, HRM outperforms much larger models with significantly longer context windows on the Abstraction and Reasoning Corpus (ARC), a key benchmark for measuring artificial general intelligence capabilities.\nThese results underscore HRM’s potential as a transformative advancement toward universal computation and general-purpose reasoning systems.\n\nRead Our Paper: [https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.21734](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.21734)\n\n**Join Our Discord Community: [https:\u002F\u002Fdiscord.gg\u002Fsapient](https:\u002F\u002Fdiscord.gg\u002Fsapient)**\n\n\n## Quick Start Guide 🚀\n\n### Prerequisites ⚙️\n\nEnsure PyTorch and CUDA are installed. The repo needs CUDA extensions to be built. If not present, run the following commands:\n\n```bash\n# Install CUDA 12.6\nCUDA_URL=https:\u002F\u002Fdeveloper.download.nvidia.com\u002Fcompute\u002Fcuda\u002F12.6.3\u002Flocal_installers\u002Fcuda_12.6.3_560.35.05_linux.run\n\nwget -q --show-progress --progress=bar:force:noscroll -O cuda_installer.run $CUDA_URL\nsudo sh cuda_installer.run --silent --toolkit --override\n\nexport CUDA_HOME=\u002Fusr\u002Flocal\u002Fcuda-12.6\n\n# Install PyTorch with CUDA 12.6\nPYTORCH_INDEX_URL=https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu126\n\npip3 install torch torchvision torchaudio --index-url $PYTORCH_INDEX_URL\n\n# Additional packages for building extensions\npip3 install packaging ninja wheel setuptools setuptools-scm\n```\n\nThen install FlashAttention. 
For Hopper GPUs, install FlashAttention 3\n\n```bash\ngit clone git@github.com:Dao-AILab\u002Fflash-attention.git\ncd flash-attention\u002Fhopper\npython setup.py install\n```\n\nFor Ampere or earlier GPUs, install FlashAttention 2\n\n```bash\npip3 install flash-attn\n```\n\n## Install Python Dependencies 🐍\n\n```bash\npip install -r requirements.txt\n```\n\n## W&B Integration 📈\n\nThis project uses [Weights & Biases](https:\u002F\u002Fwandb.ai\u002F) for experiment tracking and metric visualization. Ensure you're logged in:\n\n```bash\nwandb login\n```\n\n## Run Experiments\n\n### Quick Demo: Sudoku Solver 💻🗲\n\nTrain a master-level Sudoku AI capable of solving extremely difficult puzzles on a modern laptop GPU. 🧩\n\n```bash\n# Download and build Sudoku dataset\npython dataset\u002Fbuild_sudoku_dataset.py --output-dir data\u002Fsudoku-extreme-1k-aug-1000  --subsample-size 1000 --num-aug 1000\n\n# Start training (single GPU, smaller batch size)\nOMP_NUM_THREADS=8 python pretrain.py data_path=data\u002Fsudoku-extreme-1k-aug-1000 epochs=20000 eval_interval=2000 global_batch_size=384 lr=7e-5 puzzle_emb_lr=7e-5 weight_decay=1.0 puzzle_emb_weight_decay=1.0\n```\n\nRuntime: ~10 hours on an RTX 4070 laptop GPU\n\n## Trained Checkpoints 🚧\n\n - [ARC-AGI-2](https:\u002F\u002Fhuggingface.co\u002Fsapientinc\u002FHRM-checkpoint-ARC-2)\n - [Sudoku 9x9 Extreme (1000 examples)](https:\u002F\u002Fhuggingface.co\u002Fsapientinc\u002FHRM-checkpoint-sudoku-extreme)\n - [Maze 30x30 Hard (1000 examples)](https:\u002F\u002Fhuggingface.co\u002Fsapientinc\u002FHRM-checkpoint-maze-30x30-hard)\n\nTo use the checkpoints, see the Evaluation section below.\n\n## Full-scale Experiments 🔵\n\nExperiments below assume an 8-GPU setup.\n\n### Dataset Preparation\n\n```bash\n# Initialize submodules\ngit submodule update --init --recursive\n\n# ARC-1\npython dataset\u002Fbuild_arc_dataset.py  # ARC official + ConceptARC, 960 examples\n# ARC-2\npython dataset\u002Fbuild_arc_dataset.py --dataset-dirs 
dataset\u002Fraw-data\u002FARC-AGI-2\u002Fdata --output-dir data\u002Farc-2-aug-1000  # ARC-2 official, 1120 examples\n\n# Sudoku-Extreme\npython dataset\u002Fbuild_sudoku_dataset.py  # Full version\npython dataset\u002Fbuild_sudoku_dataset.py --output-dir data\u002Fsudoku-extreme-1k-aug-1000  --subsample-size 1000 --num-aug 1000  # 1000 examples\n\n# Maze\npython dataset\u002Fbuild_maze_dataset.py  # 1000 examples\n```\n\n### Dataset Visualization\n\nExplore the puzzles visually:\n\n* Open `puzzle_visualizer.html` in your browser.\n* Upload the generated dataset folder located in `data\u002F...`.\n\n## Launch experiments\n\n### Small-sample (1K)\n\nARC-1:\n\n```bash\nOMP_NUM_THREADS=8 torchrun --nproc-per-node 8 pretrain.py \n```\n\n*Runtime:* ~24 hours\n\nARC-2:\n\n```bash\nOMP_NUM_THREADS=8 torchrun --nproc-per-node 8 pretrain.py data_path=data\u002Farc-2-aug-1000\n```\n\n*Runtime:* ~24 hours (checkpoint after 8 hours is often sufficient)\n\nSudoku Extreme (1k):\n\n```bash\nOMP_NUM_THREADS=8 torchrun --nproc-per-node 8 pretrain.py data_path=data\u002Fsudoku-extreme-1k-aug-1000 epochs=20000 eval_interval=2000 lr=1e-4 puzzle_emb_lr=1e-4 weight_decay=1.0 puzzle_emb_weight_decay=1.0\n```\n\n*Runtime:* ~10 minutes\n\nMaze 30x30 Hard (1k):\n\n```bash\nOMP_NUM_THREADS=8 torchrun --nproc-per-node 8 pretrain.py data_path=data\u002Fmaze-30x30-hard-1k epochs=20000 eval_interval=2000 lr=1e-4 puzzle_emb_lr=1e-4 weight_decay=1.0 puzzle_emb_weight_decay=1.0\n```\n\n*Runtime:* ~1 hour\n\n### Full Sudoku-Hard\n\n```bash\nOMP_NUM_THREADS=8 torchrun --nproc-per-node 8 pretrain.py data_path=data\u002Fsudoku-hard-full epochs=100 eval_interval=10 lr_min_ratio=0.1 global_batch_size=2304 lr=3e-4 puzzle_emb_lr=3e-4 weight_decay=0.1 puzzle_emb_weight_decay=0.1 arch.loss.loss_type=softmax_cross_entropy arch.L_cycles=8 arch.halt_max_steps=8 arch.pos_encodings=learned\n```\n\n*Runtime:* ~2 hours\n\n## Evaluation\n\nEvaluate your trained models:\n\n* Check `eval\u002Fexact_accuracy` in 
W&B.\n* For ARC-AGI, follow these additional steps:\n\n```bash\nOMP_NUM_THREADS=8 torchrun --nproc-per-node 8 evaluate.py checkpoint=\u003CCHECKPOINT_PATH>\n```\n\n* Then use the provided `arc_eval.ipynb` notebook to finalize and inspect your results.\n\n## Notes\n\n - Small-sample learning typically exhibits accuracy variance of around ±2 points.\n - For Sudoku-Extreme (1,000-example dataset), late-stage overfitting may cause numerical instability during training and Q-learning. It is advisable to use early stopping once the training accuracy approaches 100%.\n\n## Citation 📜\n\n```bibtex\n@misc{wang2025hierarchicalreasoningmodel,\n      title={Hierarchical Reasoning Model}, \n      author={Guan Wang and Jin Li and Yuhao Sun and Xing Chen and Changling Liu and Yue Wu and Meng Lu and Sen Song and Yasin Abbasi Yadkori},\n      year={2025},\n      eprint={2506.21734},\n      archivePrefix={arXiv},\n      primaryClass={cs.AI},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.21734}, \n}\n```\n","Hierarchical Reasoning Model (HRM) is a novel recurrent architecture designed to tackle reasoning over complex, goal-oriented action sequences. It achieves efficient hierarchical reasoning through two interdependent recurrent modules: a high-level module responsible for slow, abstract planning and a low-level module that handles rapid, detailed computations. With only 27 million parameters and 1,000 training samples, HRM performs strongly on complex reasoning tasks such as solving difficult Sudoku puzzles and finding optimal paths in large mazes, without pre-training or Chain-of-Thought data. In addition, HRM outperforms many larger models on the Abstraction and Reasoning Corpus (ARC), a key benchmark for measuring general AI capabilities. This efficiency makes it particularly well suited to applications that require complex reasoning under limited resources.",2,"2026-06-11 03:24:14","top_topic"]