[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-11239":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":15,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":14,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":23,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":43,"readmeContent":44,"aiSummary":45,"trendingCount":16,"starSnapshotCount":16,"syncStatus":46,"lastSyncTime":47,"discoverSource":48},11239,"harness","awizemann\u002Fharness","awizemann","AI-driven user testing for iOS Simulator, macOS apps, and web apps. Write a goal in plain language; an LLM agent drives the UI and reports friction. macOS 14+, Swift 6.","https:\u002F\u002Fawizemann.github.io\u002Fharness\u002F",null,"Swift",280,12,3,1,0,8,107,59.34,"MIT License",false,"main",true,[25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42],"ai-agents","anthropic","claude","developer-tools","ios-testing","mac","macos","macos-testing","simulator","swift","swift-ui","swift6","swiftui","user-testing","ux-testing","web-testing","webkit","xcode","2026-06-12 04:00:54","# Harness\n\n[![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg)](LICENSE)\n![Platform: macOS 14+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fplatform-macOS%2014%2B-blue)\n![Targets: iOS · macOS · Web](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftargets-iOS%20%C2%B7%20macOS%20%C2%B7%20Web-3DDC97)\n![Version: 0.3.1](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fversion-0.3.1-blue)\n![Swift 6](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSwift-6-F05138?logo=swift&logoColor=white)\n\n\u003Cp align=\"center\">\n  \u003Cpicture>\n    \u003Csource 
media=\"(prefers-color-scheme: dark)\" srcset=\"site\u002Flanding\u002Fassets\u002Fscreenshots\u002Frunsession-hero-dark.png\">\n    \u003Cimg alt=\"Harness Run Session — simulator mirror, step feed, and approval card visible mid-run\" src=\"site\u002Flanding\u002Fassets\u002Fscreenshots\u002Frunsession-hero.png\" width=\"900\">\n  \u003C\u002Fpicture>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fawizemann\u002Fharness\u002Freleases\u002Fdownload\u002Fv0.3.1\u002FHarness-v0.3.1-Universal.zip\">\n    \u003Cimg alt=\"Download Harness v0.3.1 — macOS Universal (Apple Silicon + Intel)\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDownload%20for%20Mac-v0.3.1%20Universal-1f6feb?style=for-the-badge&logo=apple&logoColor=white\">\n  \u003C\u002Fa>\n  \u003Cbr>\n  \u003Csub>macOS 14+ · Apple Silicon &amp; Intel · ~12 MB\u003C\u002Fsub>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fawizemann.github.io\u002Fharness\u002F\">\u003Cstrong>awizemann.github.io\u002Fharness\u003C\u002Fstrong>\u003C\u002Fa> &nbsp;·&nbsp;\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fawizemann\u002Fharness\u002Fwiki\">Wiki\u003C\u002Fa> &nbsp;·&nbsp;\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fawizemann\u002Fharness\u002Freleases\u002Flatest\">All releases\u003C\u002Fa>\n\u003C\u002Fp>\n\n> A native macOS developer tool that drives an **iOS Simulator, a macOS app, or a web app** with an AI agent so you can run **user tests** — not scripted UI tests, but real-user simulation.\n\nYou write a goal in plain language (\"I want to sign up and create my first list\", \"delete my account\", \"find a vegetarian restaurant near me and save it\") and a persona (\"first-time user, never seen this app\"). 
Harness builds (or just launches) your target, and an LLM agent reads screenshots, clicks\u002Ftypes\u002Fscrolls, and pursues the goal — narrating what it sees, flagging UX friction (dead ends, ambiguous labels, unresponsive controls), and stopping when it succeeds, fails, or would give up.\n\nThree artifacts come out of every run:\n\n1. **Did the goal complete?** — success \u002F failure \u002F blocked + summary\n2. **What was the path?** — replayable sequence of screens + actions\n3. **Where was the friction?** — timestamped events the agent flagged as confusing\n\n## Targets\n\n| Kind | How Harness drives it |\n|---|---|\n| **iOS Simulator** | `xcodebuild` your project + scheme; `simctl` boot\u002Finstall\u002Flaunch; WebDriverAgent for input. |\n| **macOS app** | NSWorkspace launch (pre-built `.app` *or* xcodebuild macOS scheme); `CGEvent` for input; `CGWindowListCreateImage` for capture. |\n| **Web app** | Embedded `WKWebView` at a chosen viewport (default **1280×1600** tall desktop, or 375×812 mobile); JS-synthesised events for input; `WKWebView.takeSnapshot` for capture. The mirror shows a flat browser chrome (no device bezel) so the screenshot fills the full pane and one snapshot covers more page — fewer scrolls per goal, lower API cost. |\n\nPer-app setting: each Application declares its kind once at create time. The agent's tool schema (clicks vs swipes vs key shortcuts vs navigate) and the system-prompt context block re-shape per platform. Run history, replay, and friction reporting are platform-neutral.\n\n> **Status:** v0.3.1 (alpha). 
All three platforms wired end-to-end; **per-Application credential storage + Set-of-Mark targeting on web** (numbered overlays on focusable elements; agent clicks by id, no pixel guessing — agent-only, never on disk); **multi-provider LLM support** (Anthropic Opus 4.7 \u002F Sonnet 4.6 \u002F Haiku 4.5 + OpenAI GPT-5 Mini \u002F GPT-4.1 Nano + Google Gemini 2.5 Flash \u002F Flash Lite); per-provider Keychain storage; configurable per-model token budgets; unlimited-step option. macOS needs Screen Recording permission. Web is WebKit-only; Chrome via CDP is on the roadmap. See [`docs\u002FROADMAP.md`](docs\u002FROADMAP.md).\n\n## What's new in 0.3.1\n\n- **Set-of-Mark badges no longer leak into human-visible surfaces.** The disk PNG is the **clean rendered page** — replay, friction reports, and exported screenshots show what a real user would see. The agent still receives the marked-up image (numbered green badges over focusable elements) via an in-memory `ScreenshotMetadata.markedImageData` channel; the on-disk artifact stays free of dev-tool clutter. Standard 14 §6 documents the new \"no agent scaffolding on disk\" invariant.\n- **Compose Run pairs Persona + Credential side-by-side.** Both sections answer \"who's running this?\", so they read as one row instead of two stacked panels. Saves vertical scroll; auto-falls-back to a single column on narrow windows via `ViewThatFits`. When no credentials are staged, Persona expands to fill the row naturally.\n\n## What's new in 0.3.0\n\n- **Per-Application credential storage.** Pre-stage username\u002Fpassword pairs against an Application; pick one per run in Compose Run. The agent gets a new `fill_credential(field: \"username\"|\"password\")` tool for iOS, macOS, and web. Password bytes never enter the model's context, the JSONL log, or any prompt template — `tool_call.input` for password fills records `{\"field\":\"password\"}` and nothing else. 
New friction kind `auth_required` for the \"agent hit a login wall and has nothing to fill\" case.\n- **Set-of-Mark targeting (web).** Every screenshot now overlays small numbered badges on focusable elements (form fields, buttons, dropdowns, checkboxes). The agent calls `tap_mark(id)` and the WebDriver resolves to the element's center — no more \"agent picked y=228, input was at y=242\" misses. Coordinate `tap(x, y)` stays available for unmarked content. Probe pierces open shadow roots so inputs in modern signin \u002F payment widgets get marks. iOS \u002F macOS get the same treatment in a follow-up via accessibility-tree probes (tracked on the [wiki Roadmap](https:\u002F\u002Fgithub.com\u002Fawizemann\u002Fharness\u002Fwiki\u002FRoadmap)).\n- **Web mirror reworked.** Replaced the iPad-shaped device bezel with a flat browser chrome (URL pill, lock glyph, back\u002Fforward\u002Frefresh affordances) so web runs use the full middle column. Default viewport bumped to 1280×1600 — taller snapshots mean fewer scroll turns, which translates directly to lower API spend per run.\n- **React-aware form fill.** `dispatchType` now uses the native value setter via `Object.getOwnPropertyDescriptor`, so React's value tracker actually sees the change and re-renders won't reset typed text. Same fix applies to `fill_credential`. Click-target focus routing now walks `\u003Clabel>`, wrappers, and shadow children to focus the actual input, not the styled `\u003Cdiv>` on top of it.\n- **Multi-tool emissions accepted.** The system prompt always allowed *\"exactly one tool call ... optionally accompanied by one or more `note_friction` calls\"*; the parsers were rejecting anything > 1 block. Each provider's parser now splits action vs `note_friction` and forwards inline frictions through `AgentDecision.inlineFriction` → JSONL friction rows.\n- **Run-log schema v3.** `run_started` payload gains optional `credentialLabel` + `credentialUsername` (decode-if-present so v2 logs round-trip). 
Standards doc §5 documents the v2→v3 migration and the three credential-redaction invariants.\n- **`RunHistoryStore` adopts `@ModelActor`.** Eliminates the *\"Unbinding from the main queue. ModelContexts are not Sendable\"* runtime warning that Swift's strict concurrency was right to flag.\n\n## What's new in 0.2.0\n\n- **Seven supported models across three providers.** Pick a provider in Settings, then a model. Compose Run can override per-run. Each provider has its own Keychain entry; swap mid-session without restart.\n- **Per-model token budgets.** The legacy \"Opus → 250k, else 1M\" ternary is gone — every model has a justified default and a hard ceiling, configurable globally in Settings and per-run in Compose Run.\n- **Unlimited steps.** Toggle in Settings, Compose Run, or Application defaults. The token budget + cycle detector remain the safety rails.\n- **Settings persist across launches.** Default provider, model, mode, step + token budgets, and simulator visibility all survive a restart now (they didn't in 0.1).\n- **Real screenshot thumbnails** in the step feed, sized to each platform's aspect ratio.\n- **Loop hardening for cheaper models.** Multi-tool \u002F zero-tool \u002F parse-failure responses now surface a corrective hint to the model on retry instead of failing the run silently.\n- 218 unit tests passing (was 175 in 0.1).\n\n## First clone\n\nHarness vendors `appium\u002FWebDriverAgent` as a git submodule under `vendor\u002FWebDriverAgent` (it's how we drive the iOS Simulator's responder chain). 
The Xcode project is generated from `project.yml` via [xcodegen](https:\u002F\u002Fgithub.com\u002Fyonaskolb\u002FXcodeGen).\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fawizemann\u002Fharness.git\ncd harness\ngit submodule update --init --recursive\nbrew install xcodegen\nxcodegen generate\nopen Harness.xcodeproj\n```\n\nYou'll also need `idb_companion` for simulator control:\n\n```bash\nbrew tap facebook\u002Ffb && brew install idb-companion\n```\n\nThe first run builds WDA against your simulator's iOS runtime (~1–2 min). Result is cached under `~\u002FLibrary\u002FApplication Support\u002FHarness\u002Fwda-build\u002F\u003CiOS-version>\u002F` and reused on subsequent runs.\n\nFull setup: see [Build-and-Run on the Wiki](https:\u002F\u002Fgithub.com\u002Fawizemann\u002Fharness\u002Fwiki\u002FBuild-and-Run).\n\n## How to read this repo\n\n- [`standards\u002FINDEX.md`](standards\u002FINDEX.md) — development, code, and architecture standards. Read these before adding code.\n- [GitHub Wiki](https:\u002F\u002Fgithub.com\u002Fawizemann\u002Fharness\u002Fwiki) — \"where things live, why, and how to extend them.\" Maintained per PR alongside code.\n- [`docs\u002FARCHITECTURE.md`](docs\u002FARCHITECTURE.md) — system architecture overview.\n- [`docs\u002FROADMAP.md`](docs\u002FROADMAP.md) — build order and milestones.\n- [`docs\u002FPROMPTS\u002F`](docs\u002FPROMPTS\u002F) — canonical agent prompts (loaded as a bundle resource at runtime).\n- [`HarnessDesign\u002F`](HarnessDesign\u002F) — design system tokens, primitives, and screen layouts.\n\n## Contributing\n\nPRs welcome. 
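Read [`CONTRIBUTING.md`](CONTRIBUTING.md) first — it covers setup, the architecture rules (MVVM-F, Swift 6 strict concurrency, single subprocess actor), and the **public-surfaces sync rule** (code changes that affect README \u002F wiki \u002F site update them in the same PR).\n\n## License\n\nMIT — see [`LICENSE`](LICENSE).\n","Harness is an AI-based user testing tool for the iOS Simulator, macOS apps, and web apps. Its core feature is that you describe a goal in natural language and an AI agent drives the UI and reports points of friction in the user experience. The tool is written in Swift 6 and supports macOS 14 and later. Harness suits developers who want to run realistic user-simulation tests during software development, helping them find and fix problems in interface design and interaction flows and improve the app's user experience. The test results it produces include whether the goal was completed, a record of the path taken, and timestamped friction points, providing data to support product improvement.",2,"2026-06-11 03:31:31","CREATED_QUERY"]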