[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-83940":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":22,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":25,"readmeContent":26,"aiSummary":10,"trendingCount":16,"starSnapshotCount":16,"syncStatus":14,"lastSyncTime":27,"discoverSource":28},83940,"canary","wizenheimer\u002Fcanary","wizenheimer","QA harness built for Claude Code | E2E testing with screen recordings, console logs, network HARs, and Playwright traces","",null,"TypeScript",357,19,2,1,0,26,228,280,3.9,"Other",false,"main",[],"2026-06-12 02:04:36","\u003Cdiv align=\"center\">\n  \u003Ch1>Canary\u003C\u002Fh1>\n  \u003Cp>\u003Cstrong>QA harness built for Claude Code.\u003C\u002Fstrong>\u003C\u002Fp>\n\u003C\u002Fdiv>\n\u003Cimg width=\"1920\" height=\"1080\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F8ad76566-542e-43b0-a9f2-0220f819710b\" \u002F>\n\nCanary is a QA harness purpose built for coding agents like Claude Code. It reads your code diffs, identifies the affected UI flows, and tests them in real browser instances using Claude Code.\n\nUnder the hood, it ships with a QuickJS WASM sandbox exposing the full Playwright API, letting Claude automate any long-running UI task — from handling logins to navigating complicated UIs. \n\nInstead of clicking through flows by hand to reproduce and verify issues, Canary provides full session recordings. 
You get screen recordings with console logs, network requests, HARs, and Playwright traces so you can inspect exactly what the agent did.\n\nEvery Canary run captures a reusable Playwright script, letting you re-run it in CI with zero inference cost on replay.\n\nMost testing tools force you to choose between two extremes:\n\n- An opaque agent run you can't reproduce.\n- Raw Playwright scripts you have to write and maintain by hand.\n\nCanary doesn't make you choose: the agent does the QA and hands you a reproducible script.\n\n## Features\n\n\u003Cimg width=\"1920\" height=\"1080\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F145916b9-80ed-4154-858f-256d84783d19\" \u002F>\n\n- **See exactly what happened.** Trace, video, network, console, and a screenshot of every step — captured automatically.\n- **Reproducible by default.** Canary turns each run into a real Playwright script. Let your agent discover a flow once; re-run it forever.\n- **One file, zero setup.** Every session is a self-contained `report.html` — open it, commit it, send it. No server, no build.\n- **Built for agents.** Drop-in plugins for Claude Code, Cursor, and Codex.\n- **Sandboxed.** Scripts run in a QuickJS WASM sandbox with the full Playwright `Page` API — no Node, no host access.\n\n\n## Who it's for\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F8459994a-b43c-4483-bb4a-00522d1d03fe\n\nYou describe the flow in plain language; your agent drives a real browser and hands back **both** a\nreport you can just read **and** the exact Playwright script — plus the full trace — behind it. Most\ntools make you pick one: an opaque agent run you can't reproduce, or raw Playwright you write and\nmaintain by hand. 
Canary gives you both.\n\n| You are a…        | Instead of…                                            | Canary gives you…                                                                       |\n| ----------------- | ------------------------------------------------------ | --------------------------------------------------------------------------------------- |\n| **Developer**     | Writing and maintaining Playwright\u002FE2E scripts by hand | A reusable script captured from every run — re-run it in CI, no agent cost on replay    |\n| **QA engineer**   | Clicking through flows manually to repro and verify    | Evidence by default — trace, video, network, console, and a screenshot of every step    |\n| **PM \u002F reviewer** | Waiting on a build or trusting \"works on my machine\"   | A self-contained `report.html` you open and read — every step, replayable and shareable |\n\n## Get started\n\n```bash\nnpm i -g @usecanary\u002Fcli @usecanary\u002Fui   # puts `canary` + `canary-viewer` on your PATH\ncanary install                          # one-time: Chromium + the runtime into ~\u002F.canary (~150 MB)\n```\n\n…or run the guided wizard, which offers to install all of the above for you:\n\n```bash\nnpm create canary@latest                # guided setup (Ink wizard)\n```\n\nRecord a session and open the report:\n\n```bash\nid=$(canary session start --name \"checkout\")\ncanary run .\u002Fopen.js   --session \"$id\" --step open\ncanary run .\u002Fsubmit.js --session \"$id\" --step submit\ncanary session end \"$id\"                # -> ~\u002F.canary\u002Fsessions\u002F\u003Cid>\u002Freport.html\n\ncanary-viewer                           # browse every recorded session\ncanary stop                             # shut the background daemon down when you're done\n```\n\nJust need a quick one-off with no recording? 
Drive the browser engine directly:\n\n```bash\necho 'const p = await browser.getPage(\"main\");\nawait p.goto(\"https:\u002F\u002Fexample.com\");\nconsole.log(await p.title());' | canary-browser\n```\n\nOr attach to a Chrome you already have open — launch it with `--remote-debugging-port=9222`, then\n`canary-browser --connect` (it auto-discovers the port, or pass the URL explicitly). Handy for driving\na browser that's already logged in:\n\n```bash\ncanary-browser --connect http:\u002F\u002Flocalhost:9222 \u003C\u003C'EOF'\nconst page = await browser.getPage(\"main\");\nconsole.log(await page.title());\nEOF\n```\n\n> Prefer not to install? Every command also runs one-off via npx, e.g.\n> `npx @usecanary\u002Fcli session start …` and `npx @usecanary\u002Fui`.\n\n## Everything your agent does, on the record\n\nOpen any session and Canary replays the whole thing — the page, the script, every Playwright call, the\nconsole, the network, the full trace. Nothing summarized, nothing reconstructed: it's the actual run.\n(Every screenshot below is real output.) Capture is on by default; switch any stream off with\n`--no-trace` \u002F `--no-video` \u002F `--no-har` \u002F `--no-console`.\n\n### The session at a glance\n\nStatus, a per-step timeline, the exact environment, and a full **video replay** of the run with a\nfilmstrip of per-step screenshots — scrub straight to the moment something happened.\n\n\u003Cimg width=\"1920\" height=\"1080\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fc538d7a3-5e03-4aa1-9412-3ae43cac4f34\" \u002F>\n\n\n### Step by step\n\nEach step, pass or fail, with its exit code, duration, and how many Playwright actions it ran.\n\n\u003Cimg width=\"1920\" height=\"1080\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fade23da1-451b-41b3-b8b8-720f27352734\" \u002F>\n\n\n### Reproducible Playwright scripts\n\nThis is the one that matters. 
Let your agent figure a flow out **once** — Canary keeps the script\nbehind every step **and** decodes the full Playwright trace into the exact calls it made (`goto`,\n`waitForSelector`, `evaluate`, `screenshot`), with params and timing. What you get back is a real,\nreusable script. Next time you don't pay an agent to rediscover the page — you just re-run it.\n\n\u003Cimg width=\"1920\" height=\"1080\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F8ef300ef-ee0b-4382-8ff0-a78bce88d09f\" \u002F>\n\n\n\u003Cimg width=\"1920\" height=\"1080\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F249051ec-5543-4ab0-9f2f-48533817bbca\" \u002F>\n\n\n### Console and page errors\n\nEvery console message and uncaught page error, filterable by level — errors, warnings, info, logs —\nwith the source URL. Errors flagged in red.\n\n\u003Cimg width=\"1920\" height=\"1080\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fad25e190-4123-43ab-9b74-15644a6fd3c3\" \u002F>\n\n\n### Network, request by request\n\nEvery request with status, type, size, and timing. Filter by kind, then click any row to inspect its\nheaders, payload, and response — like a devtools network panel, frozen at the moment it ran.\n\n\u003Cimg width=\"1920\" height=\"1080\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fa9ea5f94-ba31-480b-abee-bf0350bb4735\" \u002F>\n\n\n### The full trace, and every artifact\n\nThe raw Playwright `trace.zip`, the network HAR, the console log, the machine-readable `results.json`,\nand the self-contained `report.html` — all under `~\u002F.canary\u002Fsessions\u002F\u003Cid>\u002F`, all one click away. 
Open\nthe trace in Playwright's own viewer with `npx playwright show-trace`.\n\n\u003Cimg width=\"1920\" height=\"1080\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Ffdb5efbc-a92d-4eb6-b64c-0f7efb8977a7\" \u002F>\n\n\n## Claude Code, natively\n\nIn [Claude Code](https:\u002F\u002Fclaude.com\u002Fclaude-code), Canary is a first-class plugin — skills, subagents,\nand `\u002Fcanary:*` slash commands. Tell Claude what you changed or what to check; it plans the QA, drives\na real browser, and hands back the report.\n\n```\n\u002Fcanary:verify    # what changed? → a prioritized QA plan, then record it\n\u002Fcanary:session   # record a flow end to end and render report.html\n\u002Fcanary:run       # drive the browser once, nothing recorded\n\u002Fcanary:review    # open the viewer and triage a recorded session\n```\n\nOr skip the slash and just say *\"QA the checkout flow and give me a report\"* — Canary's subagents pick\nit up. Install the plugin below.\n\n## Use it with your coding agent\n\nCanary is built for agents — and it explains itself to them. Install it, then **tell your agent to run\n`canary --help`** (or `canary-browser --help` for one-offs): each output is a complete, self-contained\nusage guide — sandbox API, worked examples, a Playwright cheat sheet — written for an LLM to read.\nNo plugin required.\n\nFor deeper integration (slash commands, subagents, and skills), install the plugin pack. Canary ships\nas a Claude Code plugin, a Cursor plugin, and a Codex plugin — all pointing at the same `skills\u002F` +\n`agents\u002F` + `commands\u002F`. 
There's no bespoke installer; each agent's own mechanism does the work.\n\n```bash\n# Claude Code\n\u002Fplugin marketplace add wizenheimer\u002Fcanary\n\u002Fplugin install canary@canary-marketplace\n\n# Cursor — install \"canary\" from the Marketplace, or symlink for local dev:\nln -sfn \"$(pwd)\" ~\u002F.cursor\u002Fplugins\u002Flocal\u002Fcanary\n\n# Codex\ncodex marketplace add wizenheimer\u002Fcanary        # then \u002Fplugins → install \"canary\"\n```\n\nYou get **`canary-scripting`** (the sandbox API, with `references\u002FREFERENCE.md`) plus the workflow\nskills **`canary-verify`**, **`canary-automate`**, **`canary-session`**, and **`canary-review`** —\neach paired with a subagent and a slash command: `\u002Fcanary:verify`, `\u002Fcanary:run`, `\u002Fcanary:session`,\n`\u002Fcanary:review`.\n\n## Three tools, one runtime\n\n| Tool                            | Command                            | Use it to                                                                          |\n| ------------------------------- | ---------------------------------- | ---------------------------------------------------------------------------------- |\n| **CLI** `@usecanary\u002Fcli`        | `canary`                           | Record capture-enabled QA sessions and render reports. The main, user-facing tool. |\n| **Engine** `@usecanary\u002Fbrowser` | `canary-browser`                   | Drive a browser for quick, one-off automation — no recording, no report.           |\n| **Viewer** `@usecanary\u002Fui`      | `canary-viewer` · `npx @usecanary\u002Fui` | Browse, search, organize, and replay every recorded session locally.            |\n\nBoth CLIs share one background daemon (Playwright + a QuickJS sandbox) that starts automatically when\nneeded. Stop it anytime with **`canary stop`** (alias: `canary daemon stop`, or `canary-browser stop`) —\nit shuts down every browser and session it's running. 
You can also pass `--stop-daemon` to\n`canary session end` to tear it down as soon as nothing else needs it.\n\n## Scripting\n\nScripts are plain async JavaScript with top-level `await`.\n\n\u003C!-- canary:snippet api-sandbox-env -->\nScripts execute inside a QuickJS WASM sandbox with no arbitrary access to the host system.\nThis is NOT Node.js — there is no module system and no Node API:\n\n- `require()` \u002F `import()` — no module loading; inline any helpers in the script\n- `process`, `fs` \u002F `path` \u002F `os` — no process or direct filesystem access (use the file helpers)\n- `fetch` \u002F `WebSocket` — no direct network access (the page does the networking)\n- `__dirname` \u002F `__filename` — no path globals\n\nMemory and CPU limits are enforced, and both CPU time and wall-clock time are bounded — infinite\nloops or never-settling promises abort the script. Values crossing `evaluate` \u002F `$eval` must be\nJSON-serializable.\n\u003C!-- canary:end api-sandbox-env -->\n\n\u003C!-- canary:snippet ex-quickstart fenced=js -->\n```js\nconst page = await browser.getPage(\"main\");          \u002F\u002F named, persistent page\nawait page.goto(\"https:\u002F\u002Fexample.com\", { waitUntil: \"domcontentloaded\" });\nconsole.log(await page.title());\n\nconst headings = await page.evaluate(() =>\n  [...document.querySelectorAll(\"h1, h2\")].map((h) => h.textContent.trim())\n);\nconsole.log(JSON.stringify(headings));\n\nawait page.locator(\"a.more\").click();\nconst buf = await page.screenshot({ fullPage: false });\nawait saveScreenshot(buf, \"page.png\");               \u002F\u002F saveScreenshot(buffer, name)\n```\n\u003C!-- canary:end ex-quickstart -->\n\n**Browser**\n\n\u003C!-- canary:snippet api-browser -->\n- `browser.getPage(nameOrId)` — get-or-create a named page, or attach to an existing tab by the\n  `id` from `listPages()`. 
Named pages persist across steps in a session — call with the same\n  name to reuse the tab.\n- `browser.newPage()` — an anonymous page, auto-closed when the script ends; does not persist.\n- `browser.listPages()` — list every open tab: `[{ id, url, title, name }]` (`name` is `null`\n  for tabs you never named).\n- `browser.closePage(name)` — close and forget a named page.\n\u003C!-- canary:end api-browser -->\n\n**Files**\n\n\u003C!-- canary:snippet api-file-helpers -->\nAll file I\u002FO is async (await it), sandboxed to `~\u002F.canary\u002Ftmp\u002F` (no filesystem escape), and\nreturns the full path to the file:\n\n- `saveScreenshot(buffer, name)` — persist a screenshot buffer; buffer first:\n  `const path = await saveScreenshot(await page.screenshot(), \"home.png\");`\n- `writeFile(name, data)` — write a small file (e.g. JSON state):\n  `await writeFile(\"results.json\", JSON.stringify(data));`\n- `readFile(name)` — read it back (returns the contents as a string):\n  `const data = JSON.parse(await readFile(\"results.json\"));`\n\u003C!-- canary:end api-file-helpers -->\n\n**Output**\n\n\u003C!-- canary:snippet api-console -->\n- `console.log` \u002F `console.info` write to stdout; `console.warn` \u002F `console.error` write to\n  stderr. 
Top-level `console.log` is your script's output channel.\n- `console.log` inside `page.evaluate(() => …)` runs in the page and is captured into the\n  session's console artifact instead.\n\u003C!-- canary:end api-console -->\n\n\u003C!-- canary:snippet api-playwright-note -->\nPages returned by `browser.getPage()` and `browser.newPage()` are full Playwright Page objects —\nthe same API (`goto`, `click`, `fill`, `locator`, `evaluate`, `getByRole`, `waitForSelector`, …):\nhttps:\u002F\u002Fplaywright.dev\u002Fdocs\u002Fapi\u002Fclass-page\n\u003C!-- canary:end api-playwright-note -->\n\nFor element discovery, `await page.snapshotForAI()` returns an LLM-friendly outline of the page —\nthe `canary-scripting` skill and its `references\u002FREFERENCE.md` carry the full API.\n\n## Updating\n\nAlready installed? Grab the latest CLIs from npm, then refresh the runtime:\n\n```bash\nnpm i -g @usecanary\u002Fcli@latest @usecanary\u002Fui@latest   # update canary + canary-viewer\ncanary install                                        # refresh the runtime (Chromium + Playwright)\n```\n\n`canary install` is safe to re-run — it pulls the browser\u002Fruntime versions the new CLI pins. Running\nvia npx instead of a global install? 
`npx @usecanary\u002Fcli@latest …` always fetches the newest release.\n\n**Agent integrations** update through each agent's own mechanism:\n\n```bash\n# Claude Code — refresh the marketplace catalog, then update from \u002Fplugin:\n\u002Fplugin marketplace update canary-marketplace\n# or turn on auto-update: \u002Fplugin → Marketplaces → canary-marketplace → Enable auto-update\n# (third-party marketplaces ship with auto-update OFF)\n\n# Cursor \u002F Codex — update \"canary\" from each marketplace UI.\n```\n\nClaude Code detects plugin updates by comparing manifest **versions** (bumped every release); Cursor\nand Codex do the same against their plugin manifests, so every release makes the latest `skills\u002F`\nupdate-visible.\n\n## Contributing & development\n\nCanary is a pnpm + Turborepo monorepo: five apps and five packages cooperate to make agent-driven\nbrowser automation reproducible.\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Repo layout\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n```\ncanary\u002F\n├── apps\u002F\n│   ├── canary\u002F             # @usecanary\u002Fcli      bin: canary          — session orchestrator (record QA sessions, render reports)\n│   ├── canary-browser\u002F     # @usecanary\u002Fbrowser  bin: canary-browser  — browser-automation engine (one-off runs)\n│   ├── canary-daemon\u002F      # @usecanary\u002Fdaemon   no bin               — Playwright + QuickJS runtime (embedded into the CLIs)\n│   ├── canary-ui\u002F          # @usecanary\u002Fui       bin: canary-viewer   — local session viewer (Astro); `canary-viewer`\n│   └── create-canary\u002F      # create-canary    bin: create-canary   — `npm create canary` setup wizard (Ink)\n├── packages\u002F\n│   ├── protocol\u002F           # @usecanary\u002Fprotocol         IPC schemas (Zod), single source of truth\n│   ├── config\u002F             # @usecanary\u002Fconfig           shared tsconfig bases\n│   ├── logger\u002F             # @usecanary\u002Flogger           pino-backed 
structured logger\n│   ├── cli-kit\u002F            # @usecanary\u002Fcli-kit          shared CLI helpers\n│   └── daemon-client\u002F      # @usecanary\u002Fdaemon-client    daemon transport + lifecycle; embeds the daemon bundle\n├── skills\u002F                 # agent skills: canary-scripting (+references), -verify, -automate, -session, -review\n├── agents\u002F                 # JTBD subagents: verify-agent, automate-agent, session-agent, review-agent\n├── commands\u002F               # slash commands: \u002Fcanary:verify, :run, :session, :review\n├── .claude-plugin\u002F         # Claude Code plugin + marketplace manifests\n├── .cursor-plugin\u002F         # Cursor plugin manifest (pairs with rules\u002F)\n├── plugins\u002Fcanary\u002F         # Codex plugin wrapper (.codex-plugin → canonical skills\u002F)\n├── .agents\u002F                # Codex \u002F agents marketplace manifest\n├── rules\u002F                  # Cursor rules (canary-workflows.mdc)\n├── examples\u002F               # dev-only demo scripts (Hacker News, Product Hunt, GitHub Trending, Wikipedia)\n└── .github\u002F                # CI\n```\n\n`canary` (the orchestrator) and `canary-browser` (the engine) both embed and supervise\n`canary-daemon` (the long-running Playwright host). 
The viewer ships standalone — `canary-viewer`\n(or one-off via `npx @usecanary\u002Fui`).\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Build, test &amp; conventions\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n```bash\nmake install   # pnpm install across the workspace\nmake build     # build everything in topo order\nmake test      # run all tests\nmake check     # compile + lint + test (what CI runs)\n```\n\nRun `make` with no args to see all targets.\n\n- **Conventional Commits** enforced via `commitlint` + a husky `commit-msg` hook.\n- **Linting & formatting** via [Ultracite](https:\u002F\u002Fdocs.ultracite.ai\u002F) (Biome) — `pnpm lint` checks, `pnpm format` autofixes; pre-commit runs `lint-staged` → `ultracite fix` on staged files.\n- **Logging** via `@usecanary\u002Flogger` (pino, structured). Set `CANARY_LOG_LEVEL` (trace|debug|info|warn|error|silent); the CLI also accepts `--verbose`\u002F`-v`.\n- **Node 20+** and **pnpm 9.15.0** (see `.nvmrc` and `packageManager`).\n- **Turbo** orchestrates builds (`turbo run build`, `dev`, `test`, `compile`); lint\u002Fformat run via Ultracite at the root.\n\n\u003C\u002Fdetails>\n\nSee [`AGENTS.md`](AGENTS.md) for architecture and orientation, [`CONTRIBUTING.md`](CONTRIBUTING.md)\nfor the contribution flow, and [`RELEASING.md`](RELEASING.md) for the publish pipeline.\n\n## License\n\nMIT. Canary's daemon and CLIs are derived in part from MIT-licensed work by\n[Sawyer Hood](https:\u002F\u002Fgithub.com\u002FSawyerHood) — see [`LICENSE`](LICENSE).\n","2026-06-11 04:11:53","CREATED_QUERY"]