[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82036":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":34,"readmeContent":35,"aiSummary":36,"trendingCount":16,"starSnapshotCount":16,"syncStatus":37,"lastSyncTime":38,"discoverSource":39},82036,"AutoScientists","mims-harvard\u002FAutoScientists","mims-harvard","AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation","https:\u002F\u002Fautoscientists.openscientist.ai",null,"Python",608,97,5,1,0,3,86,468,49,82.97,false,"main",true,[26,27,28,29,30,31,32,33],"agents","ai-for-science","ai-for-scientific-discovery","ai-scientists","automated-science","long-running-agents","self-evolving-agents","self-organizing","2026-06-12 04:01:36","# AutoScientists\n\n[![Paper](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPaper-Arxiv-blue)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.28655) [![Project Page](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-green)](https:\u002F\u002Fautoscientists.openscientist.ai\u002F) [![ClawInstitute](https:\u002F\u002Fimg.shields.io\u002Fnpm\u002Fv\u002Fclawinstitute?label=ClawInstitute&color=orange)](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fclawinstitute) [![ToolUniverse](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FToolUniverse-GitHub-181717)](https:\u002F\u002Fgithub.com\u002Fmims-harvard\u002FToolUniverse)\n\n**AutoScientists** is a decentralized team of AI agents for long-running computational scientific experimentation. 
Unlike prior agent systems that follow a single research trajectory or coordinate through a central planner, AutoScientists agents **self-organize into teams** around promising hypotheses, **critique each other's proposals** before spending experimental compute, and **share successes and failures** so the system avoids redundant exploration and sustains parallel search as evidence accumulates over hours or days.\n\nThis repository packages the system as [Claude Code](https:\u002F\u002Fdocs.claude.com\u002Fclaude-code) subagents coordinating through a local [ClawInstitute](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fclawinstitute) server (workshops, workspaces, message-board posts). The orchestrator is a pure coordinator: it launches agents and harvests their results, and never trains anything itself.\n\n## Results\n\n- **BioML-Bench** (24 biomedical ML tasks across biomedical imaging, protein engineering, single-cell omics, and drug discovery): 74.4% mean leaderboard percentile, **+8.33%** over the strongest prior AI agent.\n- **nanoGPT training optimization**: **1.9× faster** to a target validation metric; 7 accepted improvements vs. 
0 for a single-agent baseline.\n- **ProteinGym** fitness prediction: **+12.5%** on the ACE2-Spike binding assay; **+6.5%** averaged across all 217 assays.\n\n## Tasks\n\nThree bundled task families (per-task data prep and details live in each `task-\u003Cname>\u002FREADME.md`):\n\n- **`task-autoresearch\u002F`** — open-ended nanoGPT `val_bpb` optimization, wrapping [karpathy\u002Fautoresearch](https:\u002F\u002Fgithub.com\u002Fkarpathy\u002Fautoresearch).\n- **`task-biomlbench\u002F`** — 24 biomedical ML benchmarks across drug discovery, protein engineering, single-cell omics, and biomedical imaging.\n- **`task-protein-gym\u002F`** — ProteinGym Spike (SARS-CoV-2) fitness prediction, evolving a Kermut GP baseline.\n\n## Setup\n\nPrerequisites: [Node.js 22+](https:\u002F\u002Fnodejs.org\u002F) (ships with `npx`), Python 3.9+, and the [Claude Code](https:\u002F\u002Fdocs.claude.com\u002Fclaude-code) CLI (`claude`).\n\n```bash\n# Start the local ClawInstitute server (agents will all coordinate through this)\nnpx clawinstitute start\n\n# Install Python deps (requests, pyyaml)\npip install -r requirements.txt\n```\n\n`npx clawinstitute start` downloads the [`clawinstitute`](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fclawinstitute) package from npm on first run and starts the server in the foreground; subsequent runs reuse the cache. Prefer a permanent install? `npm install -g clawinstitute`, then `clawinstitute start`.\n\n## Running\n\nFrom the repo root, in a separate shell:\n\n```bash\nclaude -p \"Read runbook.md and execute. Task: task-autoresearch. Run name: ar_v1.\"\nclaude -p \"Read runbook.md and execute. Task: task-biomlbench\u002Fdrug_discovery\u002Ftdcommons-lipophilicity-astrazeneca. Run name: lipo_v1.\"\nclaude -p \"Read runbook.md and execute. Task: task-protein-gym. 
Run name: spike_v1.\"\n```\n\nEach launch materializes a new sibling directory `..\u002F\u003Crun-name>\u002F` with its own copy of the system, agents, workspace, and logs; the template itself stays clean across runs. Hardware requirements vary per task — see each `task-\u003Cname>\u002FREADME.md`.\n\n## Adding a new task\n\nDrop a `task-\u003Cname>\u002F` directory at the repo root with two files:\n\n1. **`TASK.md`** — task spec. YAML frontmatter should set `task_type` (one of `optimization`, `biomlbench`, `proteingym`) and `name`; see the three bundled `task-*\u002FTASK.md` files for the conventional shape. The markdown body describes the problem, data, and constraints for the agents to read.\n2. **`LAUNCH.md`** — task profile filling in the 13 hooks `runbook.md` references (`launch_command`, `discussion_policy`, `gpu_dispatch`, `champion_promotion`, `stagnation_response`, `exit_condition`, etc.). Easiest path: copy the bundled `task-\u003Cname>\u002FLAUNCH.md` closest to your task and edit the hooks that need to differ.\n\nOptionally add a setup script to fetch baseline code or data — see `task-autoresearch\u002Fdownload_repo.sh` or `task-protein-gym\u002Fdownload_data.sh` for examples.\n\nThen launch with `--task task-\u003Cname>`. 
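`launch.py` walks up from the `--task` path to find the nearest `LAUNCH.md`, so a family-level `LAUNCH.md` can cover many subtasks (as `task-biomlbench\u002F` does for its 24 subtasks) while any specific subtask can override by shipping its own `LAUNCH.md`.\n\n## Citation\n\n```bibtex\n@misc{gao2026autoscientistsselforganizingagentteams,\n      title={AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation},\n      author={Shanghua Gao and Ada Fang and Marinka Zitnik},\n      year={2026},\n      eprint={2605.28655},\n      archivePrefix={arXiv},\n      primaryClass={cs.AI},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.28655},\n}\n```\n","AutoScientists is a decentralized team of AI agents for long-running computational scientific experimentation. Its core capabilities are self-organizing into teams around promising hypotheses, critiquing each other's proposals before spending experimental compute, and sharing successes and failures to avoid redundant exploration and sustain parallel search. Technically, it runs as Claude Code subagents that collaborate through a local ClawInstitute server. It suits scientific research that needs sustained iterative optimization and exploration of unknown territory, such as biomedical machine learning tasks, protein engineering, and nanoGPT training optimization, and can significantly improve research efficiency and the quality of results.",2,"2026-06-11 04:07:33","CREATED_QUERY"]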