[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80002":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":14,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":24,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":40,"readmeContent":41,"aiSummary":42,"trendingCount":16,"starSnapshotCount":16,"syncStatus":17,"lastSyncTime":43,"discoverSource":44},80002,"opendesk","vitalops\u002Fopendesk","vitalops","Control one or more machines using computer use tools that integrate with your agents","https:\u002F\u002Fvitalops.ai\u002Fopendesk\u002Fdocs\u002F",null,"Python",79,15,3,1,0,2,9,6,3.61,"MIT License",false,"main",true,[26,27,28,29,30,31,32,33,34,35,36,37,38,39],"agents","agi","automation","claude-code","computer-use","cursor","desktop-automation","desktop-control","hacktoberfest","linux","mac","mcp","rpa","windows","2026-06-12 02:03:56","\u003Cdiv align=\"center\">\n\n# opendesk\n\n**Give any AI agent eyes and hands on your desktop.**\n\nOpendesk is a computer use framework that lets AI agents navigate your computer just like a human would: screenshots, mouse, keyboard, UI interaction, OCR, workflow recording, scheduling, and remote machine control.\n\n**macOS · Linux · Windows**\n\n[![PyPI](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fopendesk?label=pypi%20opendesk)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fopendesk\u002F)\n[![npm](https:\u002F\u002Fimg.shields.io\u002Fnpm\u002Fv\u002F@vitalops\u002Fopendesk-sdk?label=npm%20opendesk-sdk)](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002F@vitalops\u002Fopendesk-sdk)\n[![License: 
MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-blue.svg)](LICENSE)\n[![Docs](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdocs-vitalops.github.io-blue)](https:\u002F\u002Fvitalops.github.io\u002Fopendesk\u002Fdocs\u002F)\n\n\u003C\u002Fdiv>\n\n---\n\n![opendesk demo](docs\u002Fopendesk_demo.gif)\n\n---\n\n## SDKs\n\n| Language | Location | Package | Install |\n|----------|----------|---------|---------|\n| Python | [`python\u002F`](python\u002F) | `opendesk` (PyPI) | `pip install 'opendesk[core,mcp]'` |\n| JavaScript \u002F TypeScript | [`js\u002F`](js\u002F) | `@vitalops\u002Fopendesk-sdk` (npm) | `npm install @vitalops\u002Fopendesk-sdk` |\n\nMore SDKs can be added to this repo following the same pattern.\n\n---\n\n## MCP install\n\nopendesk works as an MCP server with any MCP-compatible client — Claude Code, Claude Desktop, Cursor, Windsurf, Continue, or any custom tool.\n\n### Python\n\n```bash\npip install 'opendesk[core,mcp]'\nopendesk install        # shortcut for Claude Code\n```\n\n> Requires Python 3.10+\n\n### JavaScript \u002F TypeScript\n\n```bash\nnpm install @vitalops\u002Fopendesk-sdk\nnpx opendesk-js install        # shortcut for Claude Code\n```\n\n### Other MCP clients (Cursor, Windsurf, Continue, custom)\n\nPoint your client at the `opendesk-mcp` binary:\n\n```json\n{\n  \"mcpServers\": {\n    \"opendesk\": { \"command\": \"opendesk-mcp\" }\n  }\n}\n```\n\nFor JS:\n\n```json\n{\n  \"mcpServers\": {\n    \"opendesk\": {\n      \"command\": \"node\",\n      \"args\": [\"\u002Fpath\u002Fto\u002Fnode_modules\u002F@vitalops\u002Fopendesk-sdk\u002Fbin\u002Fopendesk-mcp.js\"]\n    }\n  }\n}\n```\n\nOnce connected, try:\n\n```\nTake a screenshot of my screen\nClick the Chrome icon\nOpen Spotify and play lo-fi beats\nShow me the audit log\nReplay everything from this session\n```\n\n---\n\n## SDK usage\n\nUse opendesk programmatically in your own agent or app.\n\n### Python\n\n```python\nfrom opendesk import create_registry, 
allow_all_context\n\nregistry = create_registry()\nctx = allow_all_context()\n\nresult = await registry.get(\"screenshot\").execute(ctx, ...)\n```\n\n### JavaScript \u002F TypeScript\n\n```typescript\nimport { OpenDeskClient } from \"@vitalops\u002Fopendesk-sdk\";\n\nconst client = new OpenDeskClient();\nawait client.screenshot({ marks: true });\nawait client.ui({ action: \"click\", app: \"Safari\", title: \"Go\" });\n```\n\n---\n\n## Architecture\n\nopendesk is built in independently-importable layers:\n\n```\n┌──────────────────────────────────────────────────────────────┐\n│  Integrations   MCP  ·  Claude Code  ·  OpenAI  ·  LangChain │\n├──────────────────────────────────────────────────────────────┤\n│  Tools          screenshot · mouse · keyboard · ui ·         │\n│                 clipboard · ocr · learn · schedule · audit   │\n├──────────────────────────────────────────────────────────────┤\n│  Computer       LocalComputer  ·  RemoteComputer  (ABC)      │\n├──────────────────────────────────────────────────────────────┤\n│  Remote         server · client · discovery (mDNS)           │\n├──────────────────────────────────────────────────────────────┤\n│  Protocol       frames · codec (msgpack) · peer · transports │\n│                 auth (X25519 + AEAD, pairing)                │\n└──────────────────────────────────────────────────────────────┘\n```\n\n| Layer | What it does |\n|-------|-------------|\n| **Computer** | The capability surface of a computer (observe \u002F act \u002F subscribe). `LocalComputer` drives the local machine; `RemoteComputer` forwards every call over the wire to a paired peer. Tools and integrations target this ABC — they never know whether the machine is local or remote. |\n| **Tools** | One class per capability, agent-friendly Pydantic schemas. Calls into the active `Computer` on the `ToolContext`. |\n| **Integrations** | Thin adapters for MCP, Anthropic, OpenAI, LangChain — add one tool, get all four. 
|\n| **Remote** | `opendesk serve` \u002F `opendesk pair`, mDNS discovery, client helper. |\n| **Protocol** | Five-frame wire protocol (msgpack binary, no base64 ever), WebSocket transport, mutual X25519 + ChaCha20-Poly1305 auth and encryption. |\n| **Automation** | `learn` + `schedule` backed by pynput recording, JSON storage, APScheduler daemon. |\n\nFull details → [docs\u002Farchitecture.md](docs\u002Farchitecture.md)\n\n---\n\n## Tools\n\n| Tool | What it does |\n|------|-------------|\n| `screenshot` | Capture the screen with numbered boxes on every clickable element (Set-of-Marks) |\n| `ui` | Click and type by element name — no coordinates needed |\n| `mouse` | Pixel-level mouse control for anything `ui` can't reach |\n| `keyboard` | Type text, press keys, send hotkeys |\n| `app` | Open, close, and focus applications |\n| `clipboard` | Read and write the system clipboard |\n| `ocr` | Extract text from any region of the screen |\n| `learn` | Record a workflow once, replay it anytime |\n| `schedule` | Run any task or learned procedure on a timer |\n\nFull reference → [docs\u002Ftools.md](docs\u002Ftools.md)\n\n---\n\n## Automation\n\nRecord a task once, replay it forever, or put it on a schedule.\n\n**Record**\n```\n\"Start recording task expense-form\"\n```\nPerform the workflow yourself. 
The agent captures every click, keystroke, and screenshot.\n\n**Replay**\n```\n\"Stop recording\"\n\"Replay expense-form\"\n```\nThe agent re-executes using the current screen state — no hardcoded coordinates.\n\n**Schedule**\n```\n\"Every morning at 9am, open my email in Chrome, take a screenshot, and summarize what's there\"\n\"Schedule expense-form every friday at 5pm\"\n```\n```bash\nopendesk scheduler start\n```\n\nSupported timing: `every 30m` · `every 2h` · `every day at 09:00` · `every friday at 17:00` · raw cron\n\nFull guide → [docs\u002Fautomation.md](docs\u002Fautomation.md)\n\n---\n\n## Remote computer use\n\nControl another machine from your agent — same tools, same MCP server, the\n`Computer` abstraction just lives on the other end of an encrypted WebSocket.\n\n**On the machine being controlled** (one time):\n\n```bash\npip install 'opendesk[core,remote]'\nopendesk pair        # prints a 6-digit code, listens\n```\n\n**On the controller** (one time):\n\n```bash\npip install 'opendesk[remote]'\nopendesk discover                          # list opendesk peers on the LAN\nopendesk pair-with \u003Chost> \u003Ccode> --name mini\n```\n\n**After pairing**, the controlled machine runs the long-lived server:\n\n```bash\nopendesk serve            # accepts paired peers only\n```\n\n…and the controller drives it through the existing MCP server (Claude Code,\nClaude Desktop, Cursor — anything that speaks MCP). The agent gets new admin\ntools — `opendesk_peers`, `opendesk_use`, `opendesk_status` — and every\nexisting tool accepts an optional `peer:` argument:\n\n```\nscreenshot                       → controls the local machine\nscreenshot peer=mini             → controls the paired remote\nopendesk_use mini                → make mini the default for this session\nscreenshot                       → [on mini] ...\n```\n\nWith exactly one paired peer the agent doesn't have to specify anything —\nit becomes the implicit default. 
With several paired peers, the agent must pick\none explicitly (no silent fallback).\n\n**One controller at a time.** Pair as many machines as you like, but only\none peer drives the desktop at a time: a second peer that tries to connect while\none is active gets a clean `BUSY` error. When the same peer reconnects, it replaces\nits previous session, so there is no waiting for a stale TCP connection to time out. There are two ways to free the\nslot from the controlled machine:\n\n- `opendesk disconnect`: **cooperative**. The server asks the controller to\n  leave via a `session.evicted` PUSH; a cooperative client (the in-tree\n  `RemoteComputer`) suppresses its auto-reconnect and raises\n  `SessionEvicted`. Trust is preserved.\n- `opendesk unpair \u003Cname>`: **enforced**. Revokes trust and closes the\n  session; the next reconnect fails authentication.\n\n**Security model:** pairing exchanges long-lived X25519 keypairs via a 6-digit\ncode-authenticated handshake (PBKDF2-stretched, roughly a CPU-month to brute-force).\nSubsequent connections use mutual static-key authentication. Every frame is\nChaCha20-Poly1305 AEAD-encrypted with per-direction counters. 
No CA-signed\ncertificates required — the keys ARE the trust.\n\nFull guide → [docs\u002Fremote.md](docs\u002Fremote.md)\n\n---\n\n## Installation options\n\n```bash\npip install opendesk                              # core framework only\npip install 'opendesk[core,mcp]'                  # + screen capture + MCP server (recommended)\npip install 'opendesk[core,mcp,remote]'           # + control another machine over LAN\npip install 'opendesk[core,mcp,learn]'            # + task recording and replay\npip install 'opendesk[core,mcp,learn,schedule]'   # + scheduled tasks\npip install 'opendesk[all]'                       # everything\n```\n\n---\n\n## Platform support\n\n| Feature | macOS | Linux | Windows |\n|---------|:-----:|:-----:|:-------:|\n| Screenshot | ✓ | ✓ | ✓ |\n| Mouse & keyboard | ✓ | ✓ | ✓ |\n| UI element access | AppleScript | AT-SPI2 | UI Automation |\n| Clipboard | pbcopy\u002Fpbpaste | xclip\u002Fxsel | pyperclip |\n| OCR | Vision \u002F tesseract | tesseract | WinRT \u002F tesseract |\n| App control | `open -a` | `xdg-open` | `start` |\n| Task recording | ✓ | ✓ | ✓ |\n| Scheduled tasks | ✓ | ✓ | ✓ |\n| Remote control (LAN) | ✓ | ✓ | ✓ |\n| LAN discovery (mDNS) | ✓ | ✓ | ✓ |\n\n---\n\n## System permissions\n\n### macOS\n- **System Settings → Privacy & Security → Screen Recording** — enable for your terminal\n- **System Settings → Privacy & Security → Accessibility** — enable for mouse\u002Fkeyboard control\n\n### Linux\n```bash\nsudo apt install xclip xdotool python3-atspi\n```\n\n### Windows\nNo extra permissions needed — opendesk uses Win32 APIs by default.\n\nSee [docs\u002Fpermissions.md](docs\u002Fpermissions.md) for full setup guide.\n\n---\n\n## Integrations\n\n### Claude Code\n```bash\nopendesk install        # registers opendesk-mcp globally\nopendesk uninstall      # removes the registration\n```\n\n### Claude Desktop\n\nAdd to your config file:\n- **macOS**: `~\u002FLibrary\u002FApplication 
Support\u002FClaude\u002Fclaude_desktop_config.json`\n- **Windows**: `%APPDATA%\\Claude\\claude_desktop_config.json`\n- **Linux**: `~\u002F.config\u002FClaude\u002Fclaude_desktop_config.json`\n\n```json\n{\n  \"mcpServers\": {\n    \"opendesk\": { \"command\": \"opendesk-mcp\" }\n  }\n}\n```\n\n### Python API\n\n```python\nimport asyncio\nfrom opendesk import create_registry, allow_all_context\n\nasync def main():\n    registry = create_registry()\n    ctx = allow_all_context()\n\n    result = await registry.get(\"screenshot\").execute(\n        ctx, registry.get(\"screenshot\").Params(marks=True)\n    )\n    print(result.output)\n\nasyncio.run(main())\n```\n\nWorks with the Anthropic SDK, OpenAI, and LangChain; see [docs\u002Fintegrations.md](docs\u002Fintegrations.md)\n\n### On-device models (Ollama, LM Studio, vLLM, llama.cpp)\n\nAny OpenAI-compatible local server works out of the box:\n\n```python\nimport asyncio\nfrom openai import OpenAI\nfrom opendesk.integrations.openai_compat import OpenAIAdapter\n\nasync def main():\n    client = OpenAI(base_url=\"http:\u002F\u002Flocalhost:11434\u002Fv1\", api_key=\"ollama\")\n    adapter = OpenAIAdapter()\n    messages = [{\"role\": \"user\", \"content\": \"Take a screenshot of my screen\"}]  # example prompt\n    result = await adapter.run_loop(client, model=\"qwen2.5:72b\", messages=messages)\n\nasyncio.run(main())\n```\n\n---\n\n## Citation\n\nIf you use opendesk in your research or project, please cite it:\n\n```bibtex\n@software{opendesk,\n  author  = {Abraham, Abhigith Neil and Rahman, Fariz and Rahman, Fadil},\n  title   = {opendesk: Open Desktop Automation Framework},\n  year    = {2026},\n  url     = {https:\u002F\u002Fgithub.com\u002Fvitalops\u002Fopendesk},\n  version = {0.2.0},\n  license = {MIT}\n}\n```\n\nA `CITATION.cff` is included, so GitHub's \"Cite this repository\" button will pick it up automatically.\n\n---\n\n## License\n\nMIT\n","Opendesk is a desktop automation framework that lets AI agents operate a computer the way a human would, with screenshots, mouse and keyboard control, UI interaction, OCR, workflow recording, scheduling, and remote machine control. The project is written in Python and supports macOS, Linux, and Windows; its Python and JavaScript\u002FTypeScript SDKs make it easy for developers to integrate it in different environments. Its core feature is that it can serve any MCP-compatible client (such as Claude Code or Cursor), giving flexible control over the desktop environment. It suits scenarios that need to automate everyday tasks, improve productivity, or build complex human-computer interaction systems.","2026-06-11 03:58:50","CREATED_QUERY"]