[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82325":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":43,"readmeContent":44,"aiSummary":45,"trendingCount":16,"starSnapshotCount":16,"syncStatus":15,"lastSyncTime":46,"discoverSource":47},82325,"agent-workspace-linux","agent-sh\u002Fagent-workspace-linux","agent-sh","Isolated Linux desktop workspaces for AI agents — a hidden, agent-owned desktop and browser over MCP, so an agent can do GUI and web work without touching your real desktop.","",null,"Rust",53,6,21,2,0,1,16,26,4,55.14,"MIT License",false,"main",true,[27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42],"agent","agents","ai","ai-agents","browser-automation","claude-code","codex","desktop-automation","linux","llm","mcp","model-context-protocol","rust","sandbox","skill","x11","2026-06-12 04:01:37","# agent-workspace-linux\n\n[![CI](https:\u002F\u002Fgithub.com\u002Fagent-sh\u002Fagent-workspace-linux\u002Factions\u002Fworkflows\u002Fci.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fagent-sh\u002Fagent-workspace-linux\u002Factions\u002Fworkflows\u002Fci.yml)\n[![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-blue.svg)](LICENSE)\n![Platform: 
Linux](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fplatform-Linux-informational)\n[![Release](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fv\u002Frelease\u002Fagent-sh\u002Fagent-workspace-linux?label=release&color=blue)](https:\u002F\u002Fgithub.com\u002Fagent-sh\u002Fagent-workspace-linux\u002Freleases\u002Flatest)\n\n**An isolated, hidden Linux desktop that an AI agent fully controls — over MCP — without ever touching your real mouse, keyboard, focus, or browser.**\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fviewer-website-qa.png\" width=\"820\" alt=\"The agent-workspace-linux floating viewer (right) shows a live website being QA'd inside the hidden workspace while a Claude Code session (left) drives it — the user's real desktop is untouched.\">\n\u003C\u002Fp>\n\u003Cp align=\"center\">\u003Cem>The floating viewer (right) shows the agent doing live website QA inside the hidden workspace, while a Claude Code session (left) drives it. Your real desktop stays yours.\u003C\u002Fem>\u003C\u002Fp>\n\nAgents that \"use a computer\" normally take over *your* screen — they move your mouse, steal focus, and drive your logged-in browser. `agent-workspace-linux` gives the agent its **own** desktop instead: a headless X11 display with its own window manager, apps, clipboard, and browser. The agent launches apps, types, clicks, screenshots, and browses there; you can watch (and pause) through a small floating viewer. 
It speaks [MCP](https:\u002F\u002Fmodelcontextprotocol.io) over stdio, so it drops into Claude Code, Codex, and other MCP hosts.\n\n## Why this project\n\n- **Use it when** an agent needs to QA a GUI app or a website but must not hijack your live desktop or Chrome session.\n- **Use it when** you want browser\u002Fweb\u002Fshopping automation in a throwaway, isolated profile — observable and stoppable.\n- **Use it when** you need a clean Linux desktop to run, screenshot, and inspect an app, then tear it down.\n- **Use it when** a long-running or headless agent needs a desktop it can drive without a human babysitting the real one.\n\nIt is deliberately **not** a tool for driving your actual desktop — for that, use its sibling [computer-use-linux](https:\u002F\u002Fgithub.com\u002Fagent-sh\u002Fcomputer-use-linux). This one is the *separate, agent-owned* environment; the two are complements.\n\n## Install\n\nRequires Linux. Install the runtime dependencies, then build and install in one step:\n\n```bash\nsudo apt install xvfb openbox xdotool xauth x11-utils imagemagick xclip \\\n    bubblewrap pkg-config libxkbcommon-x11-dev\n.\u002Finstall.sh\n```\n\n`.\u002Finstall.sh` builds the release binary, installs it to `~\u002F.local\u002Fbin\u002F`, and installs the bundled skill to `~\u002F.codex\u002Fskills\u002F` by default. It is safe to rerun. Codex MCP registration is opt-in: use `--codex-configure` only for generic MCP-host workflows. In Codex for Linux, use the dedicated **Agent Workspaces** feature page to configure the backend and permission ceiling so the generic MCP settings\u002Fconfiguration pages stay clean. If an older install still appears in generic MCP\u002Fconfiguration pages, run `.\u002Finstall.sh --clean-codex-config` to remove the stale `agent-workspace-linux` server and tool tables. 
See [`install.sh --help`](install.sh) for flags (`--permissions`, `--clean-codex-config`, `--skills-dir`, `--no-skill`, `--dry-run`).\n\n### Install with cargo (from source)\n\nIt builds from source straight from git — no crates.io needed. Install the system dependencies above, then:\n\n```bash\n# latest from main\ncargo install --git https:\u002F\u002Fgithub.com\u002Fagent-sh\u002Fagent-workspace-linux\n# or pin a tagged release\ncargo install --git https:\u002F\u002Fgithub.com\u002Fagent-sh\u002Fagent-workspace-linux --tag v0.1.1\n```\n\nThat puts `agent-workspace-linux` on your `PATH`. Unlike `install.sh`, it installs only the binary — register it with your MCP host manually (below), and copy `skills\u002Fagent-workspace-linux\u002F` into your skills directory if you want the bundled skill.\n\nFor MCP hosts that read `.mcp.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"agent-workspace-linux\": {\n      \"command\": \"\u002Fhome\u002FYOU\u002F.local\u002Fbin\u002Fagent-workspace-linux\",\n      \"args\": [\"mcp\"]\n    }\n  }\n}\n```\n\nOr install the npm wrapper, which downloads the matching prebuilt Linux binary:\n\n```bash\nnpm install -g @agent-sh\u002Fagent-workspace-linux\n```\n\nPrebuilt `x86_64` and `aarch64` Linux binaries are also attached to each [GitHub Release](https:\u002F\u002Fgithub.com\u002Fagent-sh\u002Fagent-workspace-linux\u002Freleases\u002Flatest) — download the one for your architecture, `chmod +x`, and put it on your `PATH`.\n\n## Quick start\n\n```bash\n# 1. Ask the runtime what this machine can do (deps, display, sandbox backends)\nagent-workspace-linux doctor\n\n# 2. Preview a workspace without creating anything\nagent-workspace-linux workspace start --dry-run\n\n# 3. Create the hidden workspace (explicit acknowledgement required)\nagent-workspace-linux workspace start --ack-hidden-workspace --purpose \"QA run\"\n\n# 4. Watch it in the floating viewer\nagent-workspace-linux viewer\n\n# 5. 
Launch an app, see it, then stop the workspace\nagent-workspace-linux workspace launch --name editor -- xterm\nagent-workspace-linux workspace observe --screenshot --output \u002Ftmp\u002Fws.png\nagent-workspace-linux workspace stop\n```\n\nThrough an MCP host you don't run these by hand — the agent calls the matching tools. Start it via the [bundled skill](#the-skill-progressive-tool-loading) so the agent loads only the tools it needs.\n\n## Who controls the boundaries\n\nThe single most important thing to understand is **who sets the limits in each scenario** — and the project is explicit about it:\n\n| Scenario | Who sets the boundary | What is enforced | Can it be overridden at runtime? |\n|----------|-----------------------|------------------|----------------------------------|\n| **Default** (no `--permissions`) | Your **agent host** (Claude Code, Codex, …) | The MCP adds **no ceiling of its own** and defers to the host's approval flow. One explicit hidden-workspace acknowledgement scopes workspace-local actions to that environment. | Yes — the host\u002Fuser owns approvals. |\n| **Developer ceiling** (`--permissions file.json` or `AGENT_WORKSPACE_PERMISSIONS` env) | The **developer \u002F operator** who launched the MCP | Network mode, mount paths, and an app allowlist, **enforced at both the MCP front-end and the workspace daemon's IPC socket** — so even workspace-launched apps and other same-uid processes are capped. | **No** — only by restarting the MCP with new config. This is the authoritative boundary. |\n| **Live viewer control** (pause \u002F read-only) | The **human watching**, in real time | Best-effort: honors a runtime pause when the shared control state is readable, and fails open if it isn't. | It's a convenience layer, **not** the security boundary — the ceiling above is. |\n| **Workspace vs. 
host** | The **runtime** | Input, screenshots, windows, clipboard, and browser control target the hidden workspace **only** — never your real desktop or host Chrome. | Leakage to the host is a reportable bug. |\n\nIn short: **by default the agent host owns permission**, a developer can **lock a hard, daemon-enforced ceiling** via flag or env, and the **viewer gives a human a best-effort live stop** — layered, not conflicting. See [docs\u002Fpermission-model.md](docs\u002Fpermission-model.md) and [SECURITY.md](SECURITY.md) for the full model and trust assumptions.\n\n## Core concepts\n\n- **Hidden workspace** — a private `Xvfb` display + window manager + control socket. Apps launched into it attach to that display, not your session. Creating one requires `--ack-hidden-workspace` so it is never silent.\n- **Permission ceiling** — optional, declared in JSON (`network`, `mounts`, `apps`). When set, it is enforced for the life of the MCP process. Mount and network isolation are applied with [bubblewrap](https:\u002F\u002Fgithub.com\u002Fcontainers\u002Fbubblewrap) when available.\n- **Profiles** — reusable workspace definitions (mounts, network mode, setup commands, startup apps), e.g. `profile template project-dev` or `browser-session`.\n- **Viewer** — a small GPUI window that shows workspace state and a live screen view, with pause \u002F read-only \u002F stop controls. It is not always-on-top by default.\n- **Workspace browser** — workspace-owned Chrome\u002FChromium reached over a loopback DevTools endpoint, so browser automation never attaches to your host Chrome.\n\n## The skill (progressive tool loading)\n\nThe MCP exposes ~86 tools. To avoid dumping them all into the agent's context, it ships a skill at [`skills\u002Fagent-workspace-linux\u002FSKILL.md`](skills\u002Fagent-workspace-linux\u002FSKILL.md). 
Only the skill's short description stays loaded; when a task needs an isolated desktop or browser, the agent reads the skill and it routes to the right tools per phase (orient → start → observe → act → stop), loading tool schemas on demand. `.\u002Finstall.sh` installs it to `~\u002F.codex\u002Fskills\u002F` by default (override with `--skills-dir`; Claude users can pass `--skills-dir ~\u002F.claude\u002Fskills`).\n\n## Features\n\n- Hidden X11 workspace with window listing, screenshots, keyboard\u002Fmouse input, clipboard, and per-app logs — all scoped to the workspace display.\n- Optional, daemon-enforced permission ceiling (network \u002F mounts \u002F app allowlist) via flag or `AGENT_WORKSPACE_PERMISSIONS`.\n- bubblewrap-backed mount and network isolation (`disabled`, `local_only`, `inherit_host`) when available.\n- Workspace-owned browser control over loopback CDP — discover targets, read pages, navigate, extract results.\n- A native floating viewer with best-effort live pause \u002F read-only \u002F stop.\n- Saveable profiles with setup and startup commands.\n- A bundled skill for low-context, on-demand tool use across MCP hosts.\n\n## Limitations\n\n- **Linux only.** Targets an X11 (`Xvfb`) workspace; the viewer is validated on X11\u002FXwayland, with native Wayland still maturing.\n- **Pre-1.0.** Interfaces and tool schemas can change between versions.\n- **Single-user trust model.** The control socket is a same-uid Unix socket (mode 0600); there is no cross-user protection by design. 
Run as a dedicated user for multi-user isolation.\n- **Mount\u002Fnetwork enforcement needs bubblewrap.** Without it, those policies are declared but not enforced (the runtime tells you which).\n- **Live viewer control is best-effort**, not a hard guarantee — the permission ceiling is the authoritative boundary.\n\n## Docs\n\n- [Permission boundary](docs\u002Fpermission-model.md) — the authority model.\n- [GPUI viewer direction](docs\u002Fgpui-viewer-direction.md) — the visible control surface.\n- [SECURITY.md](SECURITY.md) — trust model and how to report a vulnerability.\n\n## Related\n\n- [computer-use-linux](https:\u002F\u002Fgithub.com\u002Fagent-sh\u002Fcomputer-use-linux) — the sibling MCP that drives the **user's real** Linux desktop. It is the inverse of this project: `computer-use-linux` automates the desktop you are already on, while `agent-workspace-linux` gives the agent a separate, isolated desktop of its own. Use them together — host control vs. sandboxed agent workspace.\n\n## Contributing\n\nContributions are welcome. Build with `cargo build --locked`; before pushing, run the gates: `cargo fmt --check`, `cargo clippy --locked -- -D warnings`, `cargo test --locked`, `git diff --check`, and (for runtime changes) `scripts\u002Fintegration_smoke.sh`. See [CONTRIBUTING.md](CONTRIBUTING.md) and [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md).\n\n## License\n\n[MIT](LICENSE) © Avi Fenesh\n","agent-workspace-linux is a project that provides isolated Linux desktop workspaces for AI agents, letting an agent perform GUI and web tasks without interfering with the user's real desktop. It is written in Rust and communicates with agents over MCP (Model Context Protocol). Its core functionality gives an AI agent its own X11 display, window manager, applications, clipboard, and browser, in which the agent can launch apps, type text, click, take screenshots, and browse the web. It suits cases where a GUI application or website needs QA but the agent must not control the real desktop or browser session, cases that call for browsing or shopping automation in a disposable, isolated environment, and cases where a long-running or headless agent needs a desktop it can operate on its own.","2026-06-11 04:08:23","CREATED_QUERY"]