[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80018":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":15,"starSnapshotCount":15,"syncStatus":14,"lastSyncTime":26,"discoverSource":27},80018,"autoswarm","arteemg\u002Fautoswarm","arteemg","Open-source framework for superagents.","",null,"Python",83,4,2,0,1,12,40.8,false,"main",true,[],"2026-06-12 04:01:26","# AutoSwarm\n\n[![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT)\n[![Stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Farteemg\u002Fautoswarm?style=social)](https:\u002F\u002Fgithub.com\u002Farteemg\u002Fautoswarm)\n[![Forks](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fforks\u002Farteemg\u002Fautoswarm?style=social)](https:\u002F\u002Fgithub.com\u002Farteemg\u002Fautoswarm)\n[![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.12+-blue.svg)](https:\u002F\u002Fwww.python.org\u002F)\n[![Docker](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDocker-required-2496ED?logo=docker&logoColor=white)](https:\u002F\u002Fdocker.com)\n\n\u003Cp align=\"center\">\n  \u003Cbr>\n  \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002F9ggSRAFGKQ\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDiscord-Join%20Community-5865F2?logo=discord&logoColor=white\" alt=\"Discord\" \u002F>\n  \u003C\u002Fa>\n   \u003Cbr>\n  \u003Cimg 
src=\"superagent_solo.gif\" alt=\"AutoSwarm\" width=\"400\">\n\u003C\u002Fp>\n\n> A self-improving OpenAI-compatible proxy for local LLMs, plus a multi-agent pipeline harness that self-optimizes its own topology.\n\nAutoSwarm runs in two modes:\n\n- **Online mode** — drop-in proxy in front of LM Studio \u002F Ollama \u002F vLLM. It logs every chat, then a `reflect` pass distills lessons into a skillbook that gets injected into future system prompts. Skills that turn out to be wrong get pruned automatically.\n- **Benchmark mode** — a multi-agent pipeline harness over [Harbor](https:\u002F\u002Fgithub.com\u002Flaude-institute\u002Fharbor) tasks. A meta-agent edits stage prompts, tools, turn budgets, and pipeline structure to hill-climb on `passed` tasks.\n\n## Online mode: self-improving local LLM proxy\n\n### Install\n\n```bash\npip install -e .            # editable install from the repo\n```\n\nRequires Python 3.12+.\n\n### Start the proxy\n\n```bash\nautoswarm doctor            # diagnose local LLM availability\nautoswarm start             # auto-detects upstream + model\n```\n\n`autoswarm start` probes `:1234` (LM Studio), `:11434` (Ollama), and `:8000` (vLLM) and picks the first one that has a model loaded. Override either with `--upstream` \u002F `--model`. The proxy listens on `http:\u002F\u002F127.0.0.1:8080`.\n\nPoint any OpenAI-compatible client (Chatbox, Open WebUI, your own scripts) at `http:\u002F\u002F127.0.0.1:8080\u002Fv1`. Every chat is logged to `conversations\u002F` and runs through the proxy's skill-injection layer.\n\n### Reflect and prune\n\n```bash\nautoswarm reflect           # review unreviewed conversations\n```\n\nFor each unreviewed conversation, the same upstream LLM is asked whether there's a concrete lesson worth keeping. Novel lessons land in `skills.yaml`. After the add pass, a second judge call reviews the full skillbook against recent conversations and silently removes anything that's wrong, contradictory, or too vague. 
Output looks like:\n\n```\nreviewed=12 added=3 skipped=9 pruned=1\n```\n\nFor hosted upstreams (OpenAI etc.) pass `--api-key` or set `OPENAI_API_KEY`. Local LLMs need nothing.\n\n### Inspect skills\n\n```bash\nautoswarm skills list       # show learned strategies\nautoswarm skills clear      # wipe the skillbook\n```\n\n### CLI reference\n\n| Command                  | Purpose                                                             |\n| ------------------------ | ------------------------------------------------------------------- |\n| `autoswarm doctor`       | Probe local LLM servers, print copy-paste fixes                     |\n| `autoswarm start`        | Run the OpenAI-compatible proxy on `:8080` with skill injection     |\n| `autoswarm reflect`      | Distill lessons from new conversations + prune bad skills (one LLM call per convo, one per run for pruning) |\n| `autoswarm skills list`  | Show current skills                                                 |\n| `autoswarm skills clear` | Delete `skills.yaml`                                                |\n\n## Benchmark mode\n\n### How it works\n\n- **`pipeline_spec.yaml`** — the topology the meta-agent edits. Defines stages (system prompt, tools, turn budget, output format) and handoffs (token budget, context format) between them. This is the primary edit surface.\n- **`pipeline.py`** — the runner. Reads `pipeline_spec.yaml` and executes the pipeline. Contains a small editable section (tool definitions, compression logic) and a fixed Harbor adapter boundary.\n- **`evaluator.py`** — per-stage LLM judge. After each run, scores every stage on how well its output equipped the next stage. Produces `stage_scores` for `results.tsv` so the meta-agent can identify exactly which stage is failing.\n- **`program_pipeline.md`** — meta-agent instructions. 
Defines the experiment loop, triage, credit assignment, structural edit rules, and keep\u002Fdiscard criteria.\n- **`agent.py`** — single-agent baseline harness for comparison runs.\n- **`tasks\u002F`** — evaluation tasks in [Harbor](https:\u002F\u002Fgithub.com\u002Flaude-institute\u002Fharbor) format.\n\nThe metric is total **passed** tasks. The meta-agent hill-climbs on this score by editing the pipeline topology.\n\n\u003Cimg src=\"sample\u002Fsample_results.png\" alt=\"Sample results\" width=\"600\">\n\n### Running the meta-agent\n\nPoint your coding agent at the repo and prompt:\n\n```\nRead benchmark\u002Fprogram_pipeline.md and let's kick off a new experiment!\n```\n\nThe meta-agent will read the directive, inspect `benchmark\u002Fpipeline_spec.yaml`, run the benchmark, score each stage with `benchmark\u002Fevaluator.py`, edit the topology, and iterate.\n\n### Project structure\n\n```text\npipeline_spec.yaml             -- pipeline topology (primary edit surface)\npipeline.py                    -- pipeline runner + Harbor adapter\n  editable section             -- load_spec, tools, compress_handoff, run_task\n  fixed adapter section        -- PipelineResult, to_atif, AutoAgent\nevaluator.py                   -- per-stage LLM judge\nprogram_pipeline.md            -- meta-agent instructions for pipeline optimization\nagent.py                       -- single-agent baseline harness\nDockerfile.base                -- optional base image for custom task Dockerfiles (`FROM autoswarm-base`)\ntasks\u002F                         -- benchmark tasks\njobs\u002F                          -- Harbor job outputs (gitignored)\nresults.tsv                    -- experiment log (gitignored)\nrun.log                        -- latest run output (gitignored)\n```\n\n### pipeline_spec.yaml\n\nThis is what the meta-agent reads and edits. 
Stage-level fields:\n\n| Field           | Description                                                                                                           |\n| --------------- | --------------------------------------------------------------------------------------------------------------------- |\n| `system_prompt` | Instructions for this stage's agent                                                                                   |\n| `tools`         | Tool list — any subset of `run_shell`, `read_file`, `write_file` (register more in `pipeline.py`'s `_TOOL_FACTORIES`) |\n| `max_turns`     | Turn budget for this stage                                                                                            |\n| `output_format` | Hint to the agent: `bullet_list` \\| `json` \\| `prose` \\| `structured_json`                                            |\n| `model`         | Model override (inherits `pipeline.model` if omitted)                                                                 |\n\nHandoff fields between stages:\n\n| Field                | Description                                    |\n| -------------------- | ---------------------------------------------- |\n| `token_budget`       | Max tokens of context passed to the next stage |\n| `format`             | Format hint for context compression            |\n| `include_raw_output` | If true, passes full output uncompressed       |\n\n### results.tsv schema\n\n```text\ncommit  avg_score  passed  task_scores  stage_scores  pipeline_topology  cost_usd  status  description\n```\n\n`pipeline_topology` records the stage sequence at time of run — could be `vanilla-agent`, `recon→solve→check`, `plan→execute→verify→execute→verify` (verify-driven retry), or any shape the meta-agent has built — so structural changes are traceable across the experiment log.\n\n### Task format\n\nTasks follow [Harbor's format](https:\u002F\u002Fharborframework.com\u002Fdocs\u002Ftasks):\n\n```text\ntasks\u002Fmy-task\u002F\n  
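task.toml           -- config (timeouts, metadata)\n  instruction.md      -- prompt sent to the agent\n  tests\u002F\n    test.sh           -- entry point, writes \u002Flogs\u002Freward.txt\n    test_outputs.py   -- verification (deterministic or LLM-as-judge)\n  environment\u002F\n    Dockerfile        -- task container image for Harbor\n```\n\n## License\n\nMIT\n","AutoSwarm is an open-source framework for building self-optimizing local LLM agents. Its core features include distilling skills from logged conversations through a reflection pass, injecting them into system prompts, and automatically pruning skills that are invalid or wrong. It also supports a multi-agent pipeline mode that optimizes its own structure to raise the task completion rate. The project is written in Python and requires Docker to run. AutoSwarm is especially suited to researchers and developers who want to improve AI conversational agents in a local environment, and to scenarios that call for higher efficiency on specific tasks.","2026-06-11 03:58:55","CREATED_QUERY"]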