[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-74228":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},74228,"autoresearch-mlx","trevin-creator\u002Fautoresearch-mlx","trevin-creator","Apple Silicon (MLX) port of Karpathy's autoresearch — autonomous AI research loops on Mac, no PyTorch required.",null,"Python",1653,338,11,7,0,8,18,90,24,89.59,"MIT License",false,"main",true,[],"2026-06-12 04:01:13","# autoresearch-mlx\n\nApple Silicon (MLX) port of [Karpathy's autoresearch](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fautoresearch).\n\nFull credit to [@karpathy](https:\u002F\u002Fgithub.com\u002Fkarpathy) for the core idea: fixed-time autonomous research loops controlled through `program.md`. This port keeps the same basic rules: one mutable `train.py`, one metric (`val_bpb`), a fixed 5-minute training budget, and keep-or-revert via git. 
It runs natively on Apple Silicon through [MLX](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx), so there is no PyTorch or CUDA dependency.\n\n## Quick start\n\nRequirements: Apple Silicon Mac, Python 3.10+, [uv](https:\u002F\u002Fdocs.astral.sh\u002Fuv\u002F).\n\n```bash\n# install uv if needed\ncurl -LsSf https:\u002F\u002Fastral.sh\u002Fuv\u002Finstall.sh | sh\n\n# install dependencies\nuv sync\n\n# one-time data + tokenizer prep\nuv run prepare.py\n\n# run one 5-minute training experiment\nuv run train.py\n```\n\nThen point Claude Code or another coding agent at `program.md` and let it run the loop.\n\n## What matters\n\n- `prepare.py` - data prep, tokenizer, dataloader, and evaluation. Treat as fixed.\n- `train.py` - model, optimizer, and training loop. This is the file the agent edits.\n- `program.md` - the autonomous experiment protocol.\n- `results.tsv` - logged experiment history.\n\nThe loop is the same as upstream: edit `train.py`, run a fixed-budget experiment, read `val_bpb`, keep the change if it wins, revert if it loses, and repeat.\n\n## Public baseline results\n\nThe public `results.tsv` captures the initial hardware-local walk from the default baseline down to `1.807902`:\n\n| Commit | val_bpb | Status | Description |\n|---|---:|---|---|\n| `383abb4` | 2.667000 | keep | baseline (AdamW, default config) |\n| `909dd59` | 2.588904 | keep | halve total batch size to `2^16` |\n| `4161af3` | 2.533728 | keep | increase matrix LR to `0.04` |\n| `5efc7aa` | 1.807902 | keep | reduce depth from `8` to `4` |\n\nThat result already shows the core Apple Silicon pattern: with a fixed 5-minute wall clock, smaller faster-training models can beat larger ones simply by fitting more optimizer steps into the budget.\n\n## Longer Apple Silicon runs\n\nLonger overnight runs on the working MLX port pushed much further. 
The long Mac Mini test is included here because it found a meaningfully different winner stack from the Max-class machines.\n\n| Machine | Current best | Starting point | Repeated wins |\n|---|---:|---:|---|\n| M4 Max #1 | 1.294526 | 1.596971 | AdamW-only, low matrix LR, 3x MLP, no logit cap, moderate weight decay |\n| M4 Max #2 | 1.330509 | 1.807902 | leaner batch, long anneal, SiLU, lower regularization, no logit cap |\n| Mac Mini (long run) | 1.353329 | 1.922472 | Muon, sharper attention, smaller MLP, lower scalar LR |\n\nThe Mac Mini result matters because it did not just rediscover the same exact recipe. On smaller Apple Silicon hardware, the strongest changes leaned toward more aggressive step-efficiency wins. Later transfer tests showed some of those Mac Mini findings did not carry cleanly onto the Max baseline, which is exactly the kind of hardware-specific behavior this loop is useful for uncovering.\n\n## Differences from upstream\n\n- **MLX instead of PyTorch\u002FCUDA.** Native Apple Silicon training with unified memory.\n- **AdamW-only public path.** This public `train.py` keeps the default path simple. 
The long Mac Mini run above explored a Muon variant in the working port, but that branch is not exposed as a public default here.\n- **Smaller eval token budget.** Reduced for faster iteration on Apple Silicon while keeping the same `evaluate_bpb` interface in `prepare.py`.\n- **Roughly 6-7 minutes per experiment.** Expect 5 minutes of training plus compile and eval overhead.\n- **MFU reporting is placeholder.** There is no Apple Silicon equivalent to the H100 FLOPs reference used upstream.\n\n## Acknowledgments\n\n- [Andrej Karpathy](https:\u002F\u002Fgithub.com\u002Fkarpathy) - autoresearch and nanochat\n- [scasella\u002Fnanochat-mlx](https:\u002F\u002Fgithub.com\u002Fscasella\u002Fnanochat-mlx) - MLX GPT and optimizer reference\n- [awni\u002Fpicochat](https:\u002F\u002Fgithub.com\u002Fawni\u002Fpicochat) - MLX training patterns\n- [Apple MLX team](https:\u002F\u002Fgithub.com\u002Fml-explore\u002Fmlx)\n\n## License\n\nMIT. See [LICENSE](LICENSE).\n","autoresearch-mlx is a port of Karpathy's autoresearch to Apple Silicon (MLX) that runs autonomous AI research loops without PyTorch. Its core features are fixed-time research loops controlled through `program.md`, an editable `train.py` file, and completing one experiment within 5 minutes, then deciding whether to keep the change based on the validation metric. It uses MLX for native Apple Silicon support, which removes the dependency on PyTorch or CUDA. It suits researchers and developers on Apple Silicon devices who want to iterate quickly on model training and optimization.",2,"2026-06-11 03:49:34","high_star"]