[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-81028":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":13,"stars7d":14,"stars30d":16,"stars90d":15,"forks30d":15,"starsTrendScore":16,"compositeScore":17,"rankGlobal":9,"rankLanguage":9,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":9,"pushedAt":9,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":15,"starSnapshotCount":15,"syncStatus":14,"lastSyncTime":26,"discoverSource":27},81028,"agentbrain","rohitg00\u002Fagentbrain","rohitg00","Evidence-first operating system for agents",null,"Python",32,6,1,2,0,3,2.54,"MIT License",false,"main",true,[],"2026-06-12 02:04:09","# Agent Brain\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fagentbrain-comic-banner.png\" alt=\"Monochrome Agent Brain comic banner showing a vague request becoming a verified handoff\" width=\"100%\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cstrong>Repo-local operating rules for coding agents.\u003C\u002Fstrong>\u003Cbr>\n  Commands, skills, schemas, templates, evals, and proof gates that make agent work inspectable.\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Frohitg00\u002Fagentbrain\u002Factions\u002Fworkflows\u002Fquality.yml\">\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002Frohitg00\u002Fagentbrain\u002Factions\u002Fworkflows\u002Fquality.yml\u002Fbadge.svg\" alt=\"Quality\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Frohitg00\u002Fagentbrain\u002Fblob\u002Fmain\u002FLICENSE\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002Frohitg00\u002Fagentbrain\" alt=\"License\">\u003C\u002Fa>\n  \u003Ca 
href=\"https:\u002F\u002Fgithub.com\u002Frohitg00\u002Fagentbrain\u002Fstargazers\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Frohitg00\u002Fagentbrain?style=social\" alt=\"GitHub stars\">\u003C\u002Fa>\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.11-3776AB?logo=python&logoColor=white\" alt=\"Python 3.11\">\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fruntime-agent--agnostic-111827\" alt=\"Agent agnostic\">\n\u003C\u002Fp>\n\n## Supported Agent Runtimes\n\n\u003Cp align=\"center\">\n  \u003Cstrong>Works with the coding agent you already use.\u003C\u002Fstrong>\n\u003C\u002Fp>\n\n\u003Ctable align=\"center\">\n  \u003Ctr>\n    \u003Ctd align=\"center\" width=\"96\">\n      \u003Ca href=\"https:\u002F\u002Fclaude.ai\u002Fcode\">\n        \u003Cimg src=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fclaude-ai-icon.svg\" alt=\"Claude Code\" height=\"34\">\u003Cbr>\n        \u003Csub>Claude Code\u003C\u002Fsub>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"96\">\n      \u003Ca href=\"https:\u002F\u002Fopenai.com\u002Fcodex\u002F\">\n        \u003Cpicture>\n          \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fcodex_dark.svg\">\n          \u003Cimg src=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fcodex_light.svg\" alt=\"Codex\" height=\"34\">\n        \u003C\u002Fpicture>\u003Cbr>\n        \u003Csub>Codex\u003C\u002Fsub>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"96\">\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fgoogle-gemini\u002Fgemini-cli\">\n        \u003Cimg src=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fgemini.svg\" alt=\"Gemini CLI\" height=\"34\">\u003Cbr>\n        \u003Csub>Gemini CLI\u003C\u002Fsub>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"96\">\n      \u003Ca 
href=\"https:\u002F\u002Fwww.cursor.com\">\n        \u003Cpicture>\n          \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fcursor_dark.svg\">\n          \u003Cimg src=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fcursor_light.svg\" alt=\"Cursor\" height=\"34\">\n        \u003C\u002Fpicture>\u003Cbr>\n        \u003Csub>Cursor\u003C\u002Fsub>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"96\">\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Ffeatures\u002Fcopilot\">\n        \u003Cpicture>\n          \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fcopilot_dark.svg\">\n          \u003Cimg src=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fcopilot.svg\" alt=\"GitHub Copilot\" height=\"34\">\n        \u003C\u002Fpicture>\u003Cbr>\n        \u003Csub>Copilot\u003C\u002Fsub>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"96\">\n      \u003Ca href=\"https:\u002F\u002Fwindsurf.com\u002Feditor\">\n        \u003Cpicture>\n          \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fwindsurf-dark.svg\">\n          \u003Cimg src=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fwindsurf-light.svg\" alt=\"Windsurf\" height=\"34\">\n        \u003C\u002Fpicture>\u003Cbr>\n        \u003Csub>Windsurf\u003C\u002Fsub>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"96\">\n      \u003Ca href=\"https:\u002F\u002Fopencode.ai\u002F\">\n        \u003Cpicture>\n          \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fopencode-dark.svg\">\n          \u003Cimg src=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fopencode.svg\" alt=\"OpenCode\" height=\"34\">\n        \u003C\u002Fpicture>\u003Cbr>\n        \u003Csub>OpenCode\u003C\u002Fsub>\n      
\u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"96\">\n      \u003Ca href=\"https:\u002F\u002Fopenclaw.ai\u002F\">\n        \u003Cimg src=\"https:\u002F\u002Fsvgl.app\u002Flibrary\u002Fopenclaw.svg\" alt=\"OpenClaw\" height=\"34\">\u003Cbr>\n        \u003Csub>OpenClaw\u003C\u002Fsub>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"150\">\n      \u003Ca href=\"https:\u002F\u002Fhermes-agent.nousresearch.com\u002F\">\n        \u003Cimg src=\"docs\u002Fassets\u002Fagent-runtimes\u002Fhermes.svg\" alt=\"Hermes-Agent\" height=\"28\">\u003Cbr>\n        \u003Csub>Hermes\u003C\u002Fsub>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003Cp align=\"center\">\n  \u003Csub>SVGL-hosted marks update in place where available; Hermes uses the repo-local Hermes-Agent logo.\u003C\u002Fsub>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ccode>raw request\u003C\u002Fcode> -&gt; \u003Ccode>state\u003C\u002Fcode> -&gt; \u003Ccode>command\u003C\u002Fcode> -&gt; \u003Ccode>skill\u003C\u002Fcode> -&gt; \u003Ccode>artifact\u003C\u002Fcode> -&gt; \u003Ccode>evidence\u003C\u002Fcode> -&gt; \u003Ccode>handoff\u003C\u002Fcode>\n\u003C\u002Fp>\n\nAgent Brain is a portable harness you add to a repository. It does not run your\nagent. It gives any file-reading coding agent a state machine, command specs,\nskills, schemas, evals, and handoff contracts so work moves through evidence,\nartifacts, verification, review, and learning instead of chat momentum.\n\nIt is not a decorative prompt pack, an IDE plugin, or another agent framework.\nBring the coding agent you already use. 
Agent Brain supplies the operating\ndiscipline around the model.\n\nUse it when you want an agent to stop guessing, pick the right lifecycle state,\nproduce the right artifact, and prove the work before it claims progress.\n\nWorks with agent runtimes that can read files and follow repository-local\ninstructions: terminal coding agents, IDE agents, subagent runners,\napproval-gated runtimes, and custom CLI or hosted agents.\n\nMost agent failures are not syntax errors. They are judgment errors:\n\n- building the wrong thing,\n- trusting stale context,\n- skipping tests,\n- accepting vague requirements,\n- shipping without rollback,\n- turning one messy run into permanent memory.\n\nAgent Brain keeps the first question sharp:\n\n> Should this exist, should it be an agent, and what evidence would prove or kill it?\n\nIt gives agents three non-negotiable habits:\n\n- **Plan before build.** Route vague requests through intake, research, challenge,\n  brief, design, and plan before implementation.\n- **Verify before trust.** Treat tests, logs, diffs, screenshots, citations, and\n  approvals as proof; treat confident summaries as claims.\n- **Learn only from evidence.** Turn repeated successful workflows into small,\n  neutral skills without copying external branding or temporary task chatter.\n\n## Quickstart\n\n### Try it in any coding agent\n\nPaste this into your agent:\n\n```text\nUse Agent Brain as your operating harness.\n\nClone https:\u002F\u002Fgithub.com\u002Frohitg00\u002Fagentbrain, read AGENTBRAIN.md, PRINCIPLES.md, ANTI_RATIONALIZATION.md, and docs\u002Fstate-machine.md, then choose the command in commands\u002F that matches my request.\n\nDo not build before evidence, plan, and verification are clear. Produce the required artifact from templates\u002F and schemas\u002F. 
Stop if approval, secrets, loop limits, rollback, or validation evidence are missing.\n```\n\nThen give the agent a real request, for example:\n\n```text\nI want to build an agent that handles customer refunds. Use Agent Brain before planning implementation.\n```\n\nA good run should not jump to code. It should route through `\u002Fbrain-start`, challenge whether an agent is appropriate, name the missing evidence, and produce a small artifact before any build work.\n\n### Install native slash-command wrappers\n\nFor runtimes with project-local custom command support, generate thin wrappers from `commands\u002Fregistry.json`:\n\n```bash\npython scripts\u002Finstall_slash_commands.py --runtime \u003Cruntime-key>\n```\n\nThe wrappers expose `\u002Fbrain-*` shortcuts while keeping `commands\u002Fbrain-*.md` as the source of truth. Runtimes without proven custom slash-command support should use `AGENTS.md` and the command registry directly.\n\n### Run the local quality gate\n\nAgent Brain is documentation-first, but it is still tested. Match CI with Python 3.11.\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Frohitg00\u002Fagentbrain.git\ncd agentbrain\npython3 --version  # expect Python 3.11.x\npython3 -m venv .venv\nsource .venv\u002Fbin\u002Factivate\npython3 -m pip install -r requirements-dev.txt\nrm -rf scripts\u002F__pycache__ tests\u002F__pycache__\npython -m pytest -q\npython scripts\u002Fvalidate_repo.py\ngit diff --check\ngit fetch origin main\ngit rev-parse HEAD\ngit rev-parse origin\u002Fmain\n```\n\nConfirm `HEAD equals origin\u002Fmain` before using a checkout as a trustworthy harness. 
Run baseline validation before editing so new failures are not blamed on old repository drift.\n\nExpected result:\n\n```text\nall tests pass\nValidation passed\nno whitespace diff errors\n```\n\nIf those commands do not pass, fix validation before handing the repo to an autonomous agent.\n\n## What Agent Brain gives an agent\n\nAgent Brain gives a capable model a way to operate like a careful teammate instead of a blank prompt box.\n\n- **A constitution:** constructive disagreement, stop conditions, approval gates.\n- **A lifecycle:** intake, research, challenge, decide, design, plan, build, verify, review, ship, learn.\n- **Slash-command specs:** repeatable workflows such as `\u002Fbrain-plan`, `\u002Fbrain-review`, and `\u002Fbrain-learn`.\n- **Portable skills:** small procedures with triggers, inputs, steps, verification, examples, and failure modes.\n- **Artifact contracts:** schemas and templates for briefs, plans, reviews, QA evidence, doctor reports, runtime smoke reports, scorecards, and handoffs.\n- **Evals:** cases that catch common agent failures before they become habits.\n- **Adapters:** guidance for runtimes that load markdown, skills, subagents, or approval-gated tools differently.\n\nThe repo is intentionally portable. It is not a hosted runtime, IDE plugin, or model wrapper. 
It is the operating discipline layer you put on top of the agent you already use.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fagentbrain-concept-strip.png\" alt=\"Monochrome comic strip showing a vague request routed into commands, skills, artifacts, evidence, and handoff\" width=\"100%\">\n\u003C\u002Fp>\n\n## When to use it\n\nUse Agent Brain when the cost of a wrong agent action is higher than the cost of a few minutes of structure.\n\nGood fits:\n\n- planning a feature before implementation,\n- reviewing agent-written code,\n- turning a vague product idea into a real scope decision,\n- checking whether automation should exist at all,\n- collecting fresh proof before a handoff,\n- converting repeated success or failure into a maintained skill,\n- running agents in parallel without trusting their summaries blindly.\n\nBad fits:\n\n- one-off toy prompts,\n- simple deterministic scripts,\n- tasks where a checklist or human approval queue is safer,\n- work that needs a production runtime, queue, dashboard, or hosted memory backend by itself.\n\n## See it work\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fagentbrain-loop.gif\" alt=\"Animated Agent Brain preview cycling through routing, lifecycle, and proof gates\" width=\"100%\">\n\u003C\u002Fp>\n\n```text\nUser: Build an agent for customer refunds.\n\nAgent Brain route:\n\u002Fbrain-start\n  -> classify as high-risk automation\n\u002Fbrain-should-this-exist\n  -> compare agent vs form vs checklist vs human approval queue\n\u002Fbrain-grill\n  -> ask who approves refunds, what policy applies, and what abuse cases matter\n\u002Fbrain-brief\n  -> write the smallest product scope with facts, assumptions, open questions, and kill criteria\n\u002Fbrain-plan\n  -> only if the decision survives challenge\n```\n\nThe useful answer might be: do not build an autonomous refund agent yet. Start with a policy-backed approval workflow and a read-only assistant. 
That is the point.\n\n## The workflow\n\n```text\nraw request\n  -> intake\n  -> should this exist?\n  -> research\n  -> grill\n  -> brief\n  -> design\n  -> plan\n  -> build\n  -> verify\n  -> review\n  -> ship\n  -> learn\n```\n\nThe loop can stop early. Stopping early is success when evidence shows the idea is unsafe, overbuilt, or not worth building.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fagentbrain-lifecycle-strip.png\" alt=\"Monochrome lifecycle strip showing Agent Brain states moving from intake to learn\" width=\"100%\">\n\u003C\u002Fp>\n\n## Run as an Agent Harness\n\nFor the full operating contract, read `docs\u002Fagent-harness.md`.\n\nA capable agent should follow this sequence:\n\n```text\nintake -> choose state -> load command -> load skill -> produce artifact -> verify -> review -> ship or learn\n```\n\nFor coding work, the normal path is:\n\n```text\nrequest\n-> \u002Fbrain-start\n-> \u002Fbrain-should-this-exist when the problem is weak or over-automated\n-> \u002Fbrain-research when claims need sources\n-> \u002Fbrain-grill when assumptions are soft\n-> \u002Fbrain-brief when product scope is needed\n-> \u002Fbrain-design when flows and states matter\n-> \u002Fbrain-plan when implementation is ready\n-> \u002Fbrain-build only after evidence and plan exist\n-> \u002Fbrain-verify for tests and proof\n-> \u002Fbrain-review before trusting output\n-> \u002Fbrain-ship only with rollback and launch checks\n-> \u002Fbrain-learn after repeated success or failure\n```\n\n## Minimal Harness Prompt\n\nUse this when you want another agent to apply Agent Brain precisely:\n\n```text\nYou are working inside the Agent Brain repository.\n\nRules:\n1. Start by reading AGENTBRAIN.md, PRINCIPLES.md, ANTI_RATIONALIZATION.md, and docs\u002Fstate-machine.md.\n2. Inspect git status --short and git log --oneline -5.\n3. Run baseline validation before editing when the task changes repository files.\n4. Preserve user changes. 
Never overwrite unrelated local work.\n5. Choose the earliest safe lifecycle state, then load the matching command from commands\u002F and the required skills\u002F.\n6. Produce the required artifact using templates\u002F and schemas\u002F.\n7. Do not build before evidence, scope, and verification are clear.\n8. Stop when approval, secrets handling, loop limits, rollback, or evidence are missing.\n9. Before final output, run: rm -rf scripts\u002F__pycache__ tests\u002F__pycache__ && python -m pytest -q && python scripts\u002Fvalidate_repo.py && git diff --check.\n10. If running as a noninteractive scheduled run, do not ask questions. Use the safest documented default only when ambiguity does not change scope, safety, side effects, or approval.\n```\n\n## Repository Map\n\n```text\nAGENTS.md                     # first-stop agent entrypoint\nAGENTBRAIN.md                  # constitution and operating loop\nINSTALL_FOR_AGENTS.md          # fresh-checkout setup path for agents\nPRINCIPLES.md                  # behavioral principles\nANTI_RATIONALIZATION.md        # shortcut rebuttals\nCONTRIBUTING.md                # contribution and validation workflow\nrequirements-dev.txt           # local validation dependencies\n.github\u002Fworkflows\u002F             # CI quality gate\ncommands\u002F                      # slash command specs\nskills\u002F                        # portable agent skills\nschemas\u002F                       # machine-checkable artifact schemas\nexamples\u002Fartifacts\u002F            # valid JSON examples for schemas\ndocs\u002F                          # architecture, state, memory, research, gates\ntemplates\u002F                     # artifact templates\nevals\u002F                         # cases and rubrics\nadapters\u002F                      # runtime-specific integration notes\nscripts\u002F                       # validation, doctor, scrub, and runtime smoke tooling\n```\n\n## Documentation Guide\n\nStart here:\n\n- 
`docs\u002Fagent-harness.md` — setup, operating loop, stop conditions, and troubleshooting.\n- `docs\u002Faudience-playbooks.md` — entrypoints and proof gates for adopters, agents, maintainers, runtime builders, workflow authors, teams, reviewers, and session operators.\n- `docs\u002Fdrift-tracking.md` — deterministic extraction, structured diffs, and release-note synthesis for changing contracts.\n- `docs\u002Fharness-effect.md` — why the harness layer changes agent behavior, operating rules for new tools, and parity checks across tool-output presentation modes.\n- `docs\u002Foperation-contract.md` — read-only, write, approval-gated, side-effect, and destructive operation modes.\n- `docs\u002Freplayable-evidence.md` — exact evidence chain needed to replay evals, runtime smoke, scorecards, and handoffs.\n- `docs\u002Fstate-machine.md` — valid states, transitions, required artifacts, and stop conditions.\n- `docs\u002Farchitecture.md` — repository architecture and validation responsibilities.\n- `docs\u002Freview-gates.md` — product, design, engineering, security, QA, launch, and verifier gates.\n- `docs\u002Fnon-agent-alternatives.md` — when a script, checklist, form, queue, or human review is better.\n- `docs\u002Fskill-system.md` — skill anatomy, lifecycle fit, catalog rules, and maintenance.\n- `docs\u002Fskill-distillation.md` — turn external workflows into neutral skills without copying branding.\n- `docs\u002Fmemory-model.md` — what belongs in durable memory versus temporary task state.\n- `docs\u002Fci-recovery.md` — inspect, reproduce, fix, and re-check remote workflow failures.\n- `docs\u002Fdevex-engineering.md` — setup, validation, command routing, and recovery guidance.\n- `docs\u002Fautonomous-goals.md` — scope long-running goals with measurable end states and loop limits.\n- `docs\u002Fshared-language.md` — keep project terms, aliases, and naming conflicts explicit.\n- `docs\u002Fslash-command-install.md` — native `\u002Fbrain-*` wrapper generation 
for supported runtimes without turning Agent Brain into a service.\n- `docs\u002Fruntime-lifecycle.md` — phase, queue, tool lifecycle, save-point, retry, abort, and compaction discipline.\n- `docs\u002Fdecision-records.md` — record durable trade-offs without turning chat into history.\n- `docs\u002Fclaims-we-reject.md` — claims and shortcuts the harness refuses without evidence.\n- `docs\u002Fecosystem-review.md` — neutral criteria for evaluating external patterns.\n- `docs\u002Fgrilling-protocol.md` — staged challenge process for weak assumptions.\n- `docs\u002Fimplementation-plan.md` — guidance for moving from plan to verified slices.\n- `docs\u002Fimplementation-roadmap.md` — checkpoint ledger for harness hardening.\n- `docs\u002Fquestion-ladder.md` — ask staged questions without overwhelming the user.\n- `docs\u002Fresearch-synthesis.md` — turn sources into operating principles.\n- `docs\u002Fresearch-watchlist.md` — source classes to monitor while preserving neutral public copy.\n\n## Core State Machine\n\n```text\nINTAKE\n-> RESEARCH\n-> CHALLENGE\n-> DECIDE\n-> DESIGN\n-> PLAN\n-> BUILD\n-> VERIFY\n-> REVIEW\n-> SHIP\n-> LEARN\n```\n\nEach state answers:\n\n- What artifact is required?\n- What evidence is needed?\n- What could kill or redirect the work?\n- What is the next valid state?\n- What stop condition prevents unsafe progress?\n\n## Command Selection Guide\n\nUse this guide before reading individual command files. Pick the earliest safe lifecycle state that matches the request, especially when proof gaps or trust gaps appear. The selected command must name an output artifact, template, or command output contract.\n\n| Request shape | Start with | Use when |\n| --- | --- | --- |\n| Raw, ambiguous, or missing context | `\u002Fbrain-start` | The agent needs to classify the request and choose the next safe state. 
|\n| Product idea or proposed automation | `\u002Fbrain-should-this-exist` | The agent must test whether an agent, script, checklist, or human process is appropriate. |\n| Claims, market signals, APIs, or current facts | `\u002Fbrain-research` | Work needs source-backed evidence before a brief, plan, or decision. |\n| Weak assumptions or fuzzy requirements | `\u002Fbrain-grill` | The agent needs to challenge user, market, design, engineering, or risk assumptions. |\n| Product scope or user story | `\u002Fbrain-brief` | The agent needs a concise product artifact with facts, assumptions, questions, risks, and acceptance criteria. |\n| Interface, workflow, or edge-case design | `\u002Fbrain-design` | The agent needs to define states, flows, failure paths, and UX constraints. |\n| Implementation-ready work | `\u002Fbrain-plan` | The agent needs small vertical slices with tests and verification commands. |\n| Code or artifact creation | `\u002Fbrain-build` | A plan exists and the next slice can be built with test-first or validator-first proof. |\n| Proof collection | `\u002Fbrain-verify` | The agent needs tests, logs, traces, screenshots, citations, or diff evidence. |\n| Trust decision before handoff | `\u002Fbrain-review` | The agent needs a focused review for correctness, safety, maintainability, and evidence gaps. |\n| Release or production change | `\u002Fbrain-ship` | The agent needs go\u002Fno-go criteria, rollback, monitoring, and launch notes. |\n| Repeated outcome or new reusable workflow | `\u002Fbrain-learn` | The agent should update durable knowledge, skills, templates, evals, or validators. |\n| Project knowledge maintenance | `\u002Fbrain-wiki` | The agent should update source-backed repo knowledge without preserving temporary task chatter. |\n| Harness quality check | `\u002Fbrain-eval` | The agent should test a command, skill, or output against eval cases and rubrics. |\n\nIf no command fits, do not invent a new route silently. 
Stop with the closest existing state, the missing contract, and the next validator-backed improvement.\n\n## Artifact Routing Guide\n\nUse the command output first, then the closest template. Validate against the matching schema when one exists.\n\n| Work product | Use this file | Schema \u002F contract |\n| --- | --- | --- |\n| Command route registry | `commands\u002Fregistry.json` | `schemas\u002Fcommand-registry.schema.json` |\n| Checkout readiness report | `templates\u002Fdoctor-report.md` | `schemas\u002Fdoctor-report.schema.json` |\n| Intake routing | `templates\u002Fintake-summary.md` | Command output contract |\n| Should-this-exist decision | `templates\u002Fnon-agent-alternative-review.md` | Command output contract |\n| Source-backed research | `templates\u002Fresearch-claim-ledger.md` | Command output contract |\n| Challenge questions | `templates\u002Fgrill-report.md` | Command output contract |\n| Product scope | `templates\u002Fproduct-brief.md` | `schemas\u002Fproduct-brief.schema.json` |\n| Interface or workflow design | `templates\u002Fdesign-brief.md` | Command output contract |\n| Implementation slices | `templates\u002Fimplementation-plan.md` | `schemas\u002Fimplementation-plan.schema.json` |\n| Changed artifact and build notes | `templates\u002Fchanged-artifact-plus-implementation-notes.md` | `schemas\u002Fchanged-artifact-plus-implementation-notes.schema.json` |\n| QA or verification proof | `templates\u002Fqa-evidence.md` | `schemas\u002Fqa-evidence.schema.json` |\n| Real-runtime smoke evidence | `templates\u002Fruntime-smoke.md` | `schemas\u002Fruntime-smoke.schema.json` |\n| Comparable eval, adapter, or release result | `templates\u002Fscorecard.md` | `schemas\u002Fscorecard.schema.json` |\n| Harness-effect parity report for a tool wired into the harness | `evals\u002Fharness-effect\u002Ffixtures\u002F` plus `scripts\u002Fharness_effect.py` | `schemas\u002Fharness-effect-report.schema.json` |\n| Trust review before handoff | 
`templates\u002Freview-report.md` | `schemas\u002Freview-report.schema.json` |\n| Launch or merge readiness | `templates\u002Flaunch-checklist.md` | Command output contract |\n| Durable learning capture | `templates\u002Flearning-capture.md` | Command output contract |\n| Project knowledge update | `templates\u002Fwiki-update.md` | Command output contract |\n| Eval case run or rubric check | `templates\u002Feval-report.md` | `schemas\u002Feval-report.schema.json` |\n| Run handoff or blocked stop | `templates\u002Fhandoff-report.md` | `schemas\u002Fhandoff-report.schema.json` |\n| Memory write, update, retrieval, or rejection | `templates\u002Fmemory-decision.md` | `schemas\u002Fmemory-decision.schema.json` |\n| New or revised skill | `templates\u002Fskill-template.md` | `schemas\u002Fskill.schema.json` |\n| Decision or killed path | `docs\u002Fstate-machine.md` archive state | `schemas\u002Fdecision-log.schema.json` |\n| Unknowns and assumptions | `docs\u002Fgrilling-protocol.md` | `schemas\u002Fassumption-ledger.schema.json` |\n\nIf no template fits, stop and record the gap instead of inventing a private format.\n\n## Handoff Contract\n\nEvery handoff should be useful without private chat context. End each run, review, or blocked stop with:\n\n- state,\n- decision,\n- evidence checked,\n- fresh validation proof,\n- context boundary,\n- artifact paths,\n- facts,\n- assumptions,\n- open questions,\n- risks,\n- next action.\n\nWhen resuming from a previous handoff, treat it as stale until current files, blockers, risks, context boundary, and validation commands confirm it. Resume only the named next action.\n\n## Evidence Freshness Rules\n\nFresh proof must include:\n\n- command,\n- result,\n- date or commit,\n- artifact checked,\n- source provenance,\n- recheck trigger,\n- expiry when evidence depends on external state.\n\nStale validation proof cannot be reused after code, docs, schemas, templates, commands, skills, evals, CI, or dependencies change. 
Rerun the narrow check and then the full quality gate.\n\n## Core Commands\n\n- [`\u002Fbrain-start`](commands\u002Fbrain-start.md) — turn a raw request into the correct next state.\n- [`\u002Fbrain-should-this-exist`](commands\u002Fbrain-should-this-exist.md) — test whether the product or agent should exist at all.\n- [`\u002Fbrain-research`](commands\u002Fbrain-research.md) — produce a source-backed claim ledger.\n- [`\u002Fbrain-grill`](commands\u002Fbrain-grill.md) — challenge assumptions, user, market, design, engineering, and risk.\n- [`\u002Fbrain-brief`](commands\u002Fbrain-brief.md) — create a product brief with evidence and open questions.\n- [`\u002Fbrain-design`](commands\u002Fbrain-design.md) — define user flow, interface, states, and edge cases.\n- [`\u002Fbrain-plan`](commands\u002Fbrain-plan.md) — break work into small, verifiable slices.\n- [`\u002Fbrain-build`](commands\u002Fbrain-build.md) — implement only after plan and evidence gates pass.\n- [`\u002Fbrain-verify`](commands\u002Fbrain-verify.md) — collect tests, traces, screenshots, logs, or other proof.\n- [`\u002Fbrain-review`](commands\u002Fbrain-review.md) — review correctness, product fit, security, UX, and maintainability.\n- [`\u002Fbrain-ship`](commands\u002Fbrain-ship.md) — decide go\u002Fno-go with launch checklist and rollback plan.\n- [`\u002Fbrain-learn`](commands\u002Fbrain-learn.md) — convert repeated success or failure into durable knowledge or skill.\n- [`\u002Fbrain-wiki`](commands\u002Fbrain-wiki.md) — maintain source-backed project knowledge.\n- [`\u002Fbrain-eval`](commands\u002Fbrain-eval.md) — test the brain, command, or skill against cases and rubrics.\n\n## Core Skills\n\n- `activity-recap` — summarize recent project activity from local evidence.\n- `adapter-capability-probe` — prove adapter and runtime capabilities before trusting command routing, writes, shell access, or full-validation claims.\n- `agent-output-verifier` — block unsafe or unsupported agent output 
before trust or handoff.\n- `artifact-contract` — keep command outputs, templates, schemas, examples, handoff fields, and validators aligned.\n- `ci-recovery` — inspect, reproduce, fix, and re-check remote workflow failures.\n- `command-routing` — choose or verify `\u002Fbrain-*` routes against command files, loaded skills, artifacts, and stop conditions.\n- `context-memory` — choose what to remember, retrieve, update, or deliberately forget.\n- `domain-language` — resolve project vocabulary, aliases, and glossary-vs-decision routing.\n- `design-grill` — challenge interface, states, and edge cases before build work.\n- `engineering-grill` — challenge feasibility, failure modes, and implementation risk.\n- `evidence-research` — turn claims into source-backed research evidence.\n- `intake` — route raw intent into the correct next workflow state.\n- `launch-gate` — decide go\u002Fno-go with rollout, rollback, and proof.\n- `learning-capture` — convert repeated outcomes into durable project knowledge.\n- `market-grill` — challenge audience, alternatives, and demand evidence.\n- `plan-slicing` — split work into small verifiable implementation slices.\n- `problem-grill` — test whether the problem is real, specific, and worth solving.\n- `qa-evidence` — collect verification proof for review and shipping decisions.\n- `runtime-lifecycle` — verify turn phases, queues, tool lifecycle, save points, retry, abort, compaction, and branch claims.\n- `runtime-smoke` — check Agent Brain in a real agent runtime or adapter without overstating read-only smoke as full validation.\n- `question-ladder` — ask staged questions that narrow ambiguity without overloading the user.\n- `wiki-maintenance` — maintain project knowledge from checked sources.\n\n## Adapter Guide\n\nUse adapters when a runtime cannot load Agent Brain directly:\n\n- `adapters\u002Fread-only-cli\u002FREADME.md` — CLI runtime smoke checks with sandbox, Python, and markdown-command constraints.\n- 
`adapters\u002Fsubagent-runtime\u002FREADME.md` — subagent-capable runtimes with file-backed command routing and join reviews.\n- `adapters\u002Fapproval-gateway-runtime\u002FREADME.md` — approval-gated gateway smoke checks with explicit approval and fallback evidence.\n- `adapters\u002Fskill-runtime\u002FREADME.md` — tool-enabled skill runtimes while preserving portable command and skill contracts.\n- `adapters\u002Fplain-markdown\u002FREADME.md` — agents that only read markdown files and need manual command\u002Fskill routing.\n\n## Edge Cases and Stop Conditions\n\nStop instead of proceeding when:\n\n- the user is undefined,\n- the problem is generic or not worth solving,\n- a script, checklist, form, or human approval queue is safer,\n- success metrics are missing,\n- source claims are not backed by inspectable evidence,\n- the agent is about to build before a spec or plan exists,\n- implementation slices are too large to verify independently,\n- tests are skipped because the change feels small,\n- a tool call, file write, public post, deploy, payment, side effect, or destructive action lacks explicit approval evidence,\n- output claims tests passed without logs,\n- a background loop, retry worker, scheduled run, or unattended maintenance job has no stop condition,\n- a noninteractive run would need user clarification before a scoped or destructive decision,\n- secret-like values or private data appear in output,\n- rollback is undefined for a shipped change,\n- learning capture would preserve temporary task state instead of durable workflow knowledge.\n\nBlocked output should be short:\n\n```text\nStatus: blocked\nReason: \u003Cspecific stop condition>\nEvidence checked: \u003Cfiles, logs, sources, commands>\nMissing evidence: \u003Cwhat would unblock>\nSafe next action: \u003Csmallest next step>\n```\n\n## Quality Gates\n\nBefore trusting a change, run the matching gates from `docs\u002Freview-gates.md`:\n\n\u003Cp align=\"center\">\n  \u003Cimg 
src=\"docs\u002Fassets\u002Fagentbrain-gates-strip.png\" alt=\"Monochrome proof gates for approval, evidence, rollback, and review\" width=\"100%\">\n\u003C\u002Fp>\n\n- Product Gate: user, problem, scope, success metric, kill criteria.\n- Design Gate: flows, states, copy, accessibility, edge cases.\n- Engineering Gate: architecture, data flow, tests, observability, rollback.\n- Security and Trust Gate: secrets, permissions, destructive actions, abuse cases.\n- Guardrail and Approval Gate: input, tool, output, and human approval boundaries.\n- Agent Output Verifier Gate: evidence, loop limits, tool claims, side effects.\n- QA Gate: real journey, proof, severity, fixes, known limitations.\n- Launch Gate: setup, changelog, support path, rollback, learning capture.\n\n## Validation\n\nRun this before committing changes:\n\n```bash\npython3 -m pip install -r requirements-dev.txt\nrm -rf scripts\u002F__pycache__ tests\u002F__pycache__\npython scripts\u002Fdoctor.py --no-fail\npython -m pytest -q\npython scripts\u002Fvalidate_repo.py\ngit diff --check\n```\n\nMaintainer-only public-copy leak checks are separate from the user validation path.\n\n`scripts\u002Fdoctor.py` is the quick readiness check for agents. 
It reports Python, git freshness, required entrypoints, public setup exposure, validator status, blockers, warnings, and next actions as a `schemas\u002Fdoctor-report.schema.json` artifact.\n\nWhen testing a real runtime or adapter, capture a schema-valid smoke artifact:\n\n```bash\npython scripts\u002Fruntime_smoke.py \\\n  --runtime generic-cli-runtime \\\n  --version \u003Cruntime-version> \\\n  --sandbox-write-mode read_only \\\n  --brain-command-mode markdown_specs \\\n  --selected-command \u002Fbrain-start \\\n  --loaded-skill intake \\\n  --loaded-skill agent-output-verifier \\\n  --adapter-path adapters\u002Fread-only-cli\u002FREADME.md \\\n  --run-scope read_only_smoke \\\n  --command-exit-status 0 \\\n  --smoke-result blocked \\\n  --transcript-path artifacts\u002Fruntime-smoke\u002Fgeneric-cli-runtime-2026-05-15.log \\\n  --transcript-redaction-status redacted \\\n  --blocked-command \"python -m pytest -q\" \\\n  --output runtime-smoke.local.json\n```\n\nDo not call read-only smoke a full validation run.\n\nWhen wiring a new search or recall tool into the harness, capture a parity\nreport so tool-output presentation is measured, not asserted. See\n`docs\u002Fharness-effect.md` for the operating rules:\n\n```bash\npython scripts\u002Fharness_effect.py \\\n  evals\u002Fharness-effect\u002Ffixtures\u002Fakbp-search.json \\\n  --output-dir \u002Ftmp\u002Fharness-effect\u002Fout \\\n  --out runtime\u002Fharness-effect-report.json \\\n  --fail-on-mismatch\n```\n\nThe script invokes the tool once per declared presentation mode (`inline` and\n`file`), diffs retrieved evidence ids and citations, and writes a\n`schemas\u002Fharness-effect-report.schema.json`-valid JSON report. Treat a\nnon-pass verdict as a harness regression: file mode is not allowed to silently\ndrop citations or items.\n\n## Troubleshooting\n\n### `git status --short` shows a dirty working tree\n\nDo not overwrite local work to make the harness look clean. Preserve user changes first:\n\n1. 
Inspect `git status --short` and `git diff`.\n2. Separate user work from your intended slice.\n3. Stage and commit only files that belong to the current change.\n4. Stop with a handoff if unrelated changes are present.\n\n### Validation says a command is missing from README\n\nAdd the command to the Core Commands list with backticks.\n\n### Validation says a skill is missing from README\n\nAdd the skill to the Core Skills list with backticks.\n\n### A skill fails validation\n\nCheck that `skills\u002F\u003Cname>\u002FSKILL.md` has frontmatter, matching `name:`, a matching H1, and required sections in canonical order.\n\n### An eval fails validation\n\nCheck that `evals\u002Fcases\u002F\u003Cslug>.md` has the required H1, `## User request`, `## Expected behavior`, `## Failure if`, and a catalog entry in `evals\u002FREADME.md`.\n\n### Public copy validation fails\n\nConvert internal, vendor, and source-specific naming into neutral pattern classes such as agent runtime, coding agent, skill library, harness, verifier, guardrail, review gate, or evaluation case.\n\nIf validation reports secret-like values, remove the value, rotate it in the system where it was created, and replace public examples with redacted placeholders.\n\n### Tests pass locally but CI fails\n\nRun the exact CI sequence locally:\n\n```bash\npython3 -m pip install -r requirements-dev.txt\nrm -rf scripts\u002F__pycache__ tests\u002F__pycache__\npython -m pytest -q\npython scripts\u002Fvalidate_repo.py\ngit diff --check\n```\n\nThen inspect `.github\u002Fworkflows\u002Fquality.yml` for Python 3.11 drift, missing install, test, validation, timeout, or read-only permission settings.\n\n### Dependency bootstrap fails\n\nIf validation fails with `ModuleNotFoundError`, fix the virtual environment instead of editing around the missing dependency:\n\n```bash\npython3 -m venv .venv\nsource .venv\u002Fbin\u002Factivate\npython3 -m pip install -r requirements-dev.txt\npython -m pytest -q\n```\n\n### Generated 
cache validation fails\n\nIf validation reports a generated Python cache file, remove the local artifact:\n\n```bash\nrm -rf scripts\u002F__pycache__ tests\u002F__pycache__ .pytest_cache\npython -m pytest -q\npython scripts\u002Fvalidate_repo.py\ngit diff --check\n```\n\n### Artifact contract validation fails\n\nIf validation reports a schema\u002Ftemplate mismatch, inspect the schema, update the matching template, update README artifact routing if needed, and rerun the full gate.\n\n## Weakest Failure Mode Audit\n\nBefore choosing the next hardening slice, inspect the least protected way a future agent could fail:\n\n1. Commands: distinct workflow, stop conditions, quality bar, and skills-to-load list.\n2. Skills: triggers, inputs, procedure, anti-rationalization, verification, output artifact, and failure modes.\n3. Schemas and templates: closed schemas, required fields, and matching template field references.\n4. Evals: newest repeated failure represented as a case.\n5. CI and install: fresh checkout can run the same local and CI gate.\n6. Public copy: external sources distilled into neutral pattern language.\n7. Handoff: state, evidence checked, context boundary, facts, assumptions, risks, blockers, fresh validation proof, and next action.\n8. 
README\u002Fdocs: a capable coding agent can self-setup, choose the right command, troubleshoot, and maintain the harness without private context.\n\nPrefer the smallest slice that adds or tightens a validator\u002Feval first, then updates the corresponding doc, skill, command, schema, or template.\n\n## Maintainer Checklist\n\nBefore a harness release or direct-to-main hardening push, verify:\n\n- README can bootstrap a new agent without private context.\n- Commands and skills are cataloged and point to existing files.\n- Required docs, schemas, templates, evals, and adapters are discoverable.\n- The newest failure mode is covered by an eval or validator rule.\n- CI mirrors local validation and uses read-only permissions.\n- Public copy uses neutral pattern language.\n- Generated cache files are not tracked.\n- The latest commit is verified on the remote branch.\n\n## Maintainer Loop\n\n```text\n1. Find the weakest uncovered failure mode.\n2. Add or update an eval or validator first.\n3. Improve the smallest doc, skill, template, or schema that closes the gap.\n4. Run: rm -rf scripts\u002F__pycache__ tests\u002F__pycache__ && python -m pytest -q && python scripts\u002Fvalidate_repo.py && git diff --check.\n5. If the change used named external references, run the targeted exact-name scrub for those source names before publishing public copy.\n6. Commit a small coherent chunk.\n7. git push the verified chunk.\n8. Run git fetch origin main and confirm HEAD equals origin\u002Fmain.\n9. 
Repeat.\n```\n\nHigh-priority hardening targets: README detail and harness usability, command edge cases, skill trigger clarity, eval coverage, command registry drift, doctor\u002Freadiness proof, replayable evidence, schema\u002Ftemplate alignment, CI parity, public-copy neutrality, and install instructions that another agent can follow without guessing.\n\n## Status\n\nAgent Brain is a v0.2 portable harness: documentation-first, tested, runtime-agnostic, and ready for iterative hardening. The next big unlock is a true installer and more real runtime smoke artifacts.\n","Agent Brain is an evidence-first operating system that gives coding agents repo-local operating rules. Its core features include commands, skills, schemas, templates, evals, and proof gates, which make an agent's work inspectable and verifiable. The project is developed in Python 3.11 and supports several mainstream coding agent runtimes, such as Claude Code, Codex, and Gemini CLI, showing good compatibility. It suits scenarios that need more transparency and reliability when generating or modifying code; for developers who rely on AI-assisted programming tools in particular, Agent Brain helps them better understand and control agent behavior.","2026-06-11 04:03:15","CREATED_QUERY"]