[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-732":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":20,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":24,"readmeContent":25,"aiSummary":26,"trendingCount":15,"starSnapshotCount":15,"syncStatus":17,"lastSyncTime":27,"discoverSource":28},732,"dothething","fluffypony\u002Fdothething","fluffypony","an autonomous AI agent: you describe the thing, it does the thing.","https:\u002F\u002Fdotheth.ing",null,"Shell",1373,359,3,0,1,2,56.37,"BSD 3-Clause \"New\" or \"Revised\" License",false,"main",true,[],"2026-06-12 04:00:05","# dothething\n\nDothething (DTT) is a local AI agent. You give it a task, walk away, and come back to results.\n\nIt handles research, data extraction, browser automation, file editing, and code execution. It works until the job is done, or tells you exactly why it couldn't.\n\n**Website:** [dotheth.ing](https:\u002F\u002Fdotheth.ing)\n\n## What it does\n\nYou describe a task in plain English. The agent breaks it down, picks the right tools, and delivers the output.\n\n- Plans its work and tracks progress\n- Searches the web using a local SearXNG instance (supports Google, Bing, DuckDuckGo, and more -- you can target specific engines or search images directly)\n- Browses pages with Notte and Camoufox (a Firefox fork built to avoid fingerprinting). 
Extracts page content, solves captchas, and handles multi-step web interactions\n- Reads and edits files, runs shell commands, makes HTTP requests\n- Connects to your existing MCP servers via `~\u002F.dtt\u002Fmcp.json`\n- Loads custom skills from `~\u002F.dtt\u002Fskills\u002F\u003Cskill-name>\u002FSKILL.md` (Claude Code convention) -- behavioral skills inject directly into the agent's context, while text-processing skills run as isolated sub-tasks\n- Manages its own configuration. Tell it to add an API key or install a skill, and it handles the file edits and reloads itself\n- Sends and receives email through its own inbox via AgentMail\n- Copies to and pastes from your system clipboard, including images\n- Accepts mid-task input. Press any key while it's working to type instructions. Ctrl-Q queues input for after the current step finishes\n- Farms out grunt work to a cheaper model. Asks GPT-5.4 for a second opinion when stuck\n- Saves full conversation threads so you can resume interrupted work\n- Tracks token usage and dollar cost via OpenRouter, with Anthropic prompt caching for cost reduction\n\n## Quick start\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Ffluffypony\u002Fdothething.git\ncd dothething\n.\u002Fdtt.sh --prompt \"Find the 10 largest public companies by revenue that went bankrupt in the last 20 years and write a markdown report with causes and timelines.\"\n```\n\nFirst run prompts for your OpenRouter API key (required) and a 2Captcha API key (optional), and saves them to `~\u002F.dtt\u002Fenv` (mode 0600). Subsequent runs read the keys from there. To skip the prompt, export `OPENROUTER_API_KEY` in your shell first; values in the shell environment take precedence over the saved file. To change or clear the saved keys, edit or delete `~\u002F.dtt\u002Fenv`.\n\nThe first run also takes a couple of minutes to set up a Python venv, install SearXNG, and set up the Notte browser framework. 
After that, startup is fast.\n\nOmit `--prompt` to open a multiline editor. Type your task, then hit Esc+Enter to submit.\n\n## Requirements\n\n- macOS or Linux\n- Python 3.11+\n- An OpenRouter API key. Get one at [openrouter.ai\u002Fkeys](https:\u002F\u002Fopenrouter.ai\u002Fkeys). First run prompts for it and saves it to `~\u002F.dtt\u002Fenv`, or export `OPENROUTER_API_KEY` in your shell to skip the prompt.\n- Optional: a 2Captcha API key for automated captcha solving during browser tasks. First-run setup prompts for this too, or export `TWOCAPTCHA_API_KEY`.\n- Optional: an AgentMail API key for email tools. The agent can set this up for you on first use, or get one at [agentmail.to](https:\u002F\u002Fagentmail.to).\n- Linux clipboard\u002Fimage support needs `wl-clipboard` (Wayland) or `xclip` (X11).\n\nEverything else is installed automatically into `\u002Ftmp\u002Fdothething` on first run.\n\n## Usage\n\n```bash\n.\u002Fdtt.sh [flags]\n```\n\n| Flag | What it does |\n|---|---|\n| `--prompt \"...\"` | Provide the task inline instead of opening the editor |\n| `--fast` | Use claude-opus-4.6-fast (cheaper, slightly less capable) |\n| `--cwd DIR` | Set the working directory for file operations (default: `.`) |\n| `--max-loops N` | Cap the number of agent turns (default: 200) |\n| `--oraclepro` | Use GPT-5.4-pro instead of GPT-5.4 for oracle calls |\n| `--resume ID` | Pick up a previous session by thread ID |\n| `--headed` | Show the browser window for visual debugging |\n| `--orchestrator` | Launch orchestrator mode -- run and manage multiple agents from one terminal |\n| `--pipe` | Stdout-only output for Unix pipelines. Final report on stdout, everything else suppressed. 
Exit codes: 0=complete, 2=partial, 1=failed |\n| `--tui` | Full-screen terminal UI for single-agent mode (experimental) |\n| `--notify-desktop` | Send a desktop notification when the task finishes |\n| `--notify-email EMAIL` | Email a notification to this address when the task finishes (requires AgentMail) |\n| `--max-cost USD` | Stop and checkpoint when cumulative cost reaches this amount |\n| `--verbose` | Show full error tracebacks |\n| `--debug` | Log raw API payloads and cache metrics |\n\n## How it works\n\nThe agent routes Claude Opus through OpenRouter. Every turn, the model decides which tools to call, processes the results, and decides what to do next.\n\n**result_mode.** Every tool call has a `result_mode`. If you need exact output, use `\"raw\"`. If you tell it to \"extract all function signatures\", it pipes the output through Sonnet for a tight summary before the main agent sees it. This keeps the context window manageable on long tasks.\n\n**Browser automation.** We use Notte with Camoufox under the hood. For simple scraping, `fetch_page` grabs clean markdown with no LLM cost. If a captcha shows up, it gets solved automatically. For complex multi-step interactions (login flows, forms, SPAs), the agent can hand off the session to a dedicated Notte browser agent via `browser_agent`.\n\n**Prompt caching.** We use OpenRouter sticky routing and Anthropic's block-level cache controls. On long tasks, subsequent turns hit the cache, cutting input costs significantly.\n\n**Thread persistence.** Every session saves to `~\u002F.dtt\u002Fthreads\u002F` with a timestamped ID. If you interrupt a run or hit the loop limit, resume with `--resume \u003Cthread-id>`.\n\n**Skills.** Drop skill directories into `~\u002F.dtt\u002Fskills\u002F` to teach the agent new procedures. Each skill is a directory containing a `SKILL.md` file (Claude Code convention). 
Skills with `allowed-tools` in their frontmatter inject directly into the agent's context, so it follows those instructions while using its own tools. Text-processing skills run via Sonnet as isolated sub-tasks. Skills can also be installed mid-session via the `manage_skill` tool.\n\n**MCP servers.** Configure MCP servers in `~\u002F.dtt\u002Fmcp.json` (same format as Claude Code). The agent picks up all connected MCP tools at startup. Servers can also be added mid-session via the `manage_mcp` tool.\n\n## Orchestrator mode\n\n`--orchestrator` opens a terminal UI for running multiple agents in parallel. You get:\n\n- One line per session showing status, current phase, elapsed time, and cost\n- Expand any session to watch its log in real time\n- Send live input or queued input to a running agent\n- Terminate, copy logs, or copy final output to your clipboard\n- A \"smart launcher\" that sends your prompt to Opus, which figures out how to split the work and spins up agents for each piece\n\nThe smart launcher caps at 16 concurrent agents by default and shows a cost estimate before launching.\n\n## Live input\n\nWhile the agent is running, press any key to open an input bar at the bottom. Type and press Enter to inject your message immediately. Press Ctrl-Q to queue it until the current step finishes. Press Esc to cancel.\n\nThe agent can also ask you questions directly when it needs something it can't figure out on its own -- an OTP code, a preference, or confirmation before a destructive action.\n\n## Email\n\nDTT can send and receive email through AgentMail. First time: the agent signs itself up, you confirm an OTP from your personal email, and the API key is saved for all future sessions. After that, it handles email on its own.\n\nSet `AGENTMAIL_API_KEY` in your shell or let the agent create one via `email_auth`.\n\n## Models\n\nAll calls route through OpenRouter. 
You only need one API key.\n\n| Role | Default model | Flag to change |\n|---|---|---|\n| Main agent | Claude Opus 4.6 | `--fast` for Opus 4.6-fast |\n| Summarizer, Notte agent, delegate | Claude Sonnet 4.6 | -- |\n| Oracle | GPT-5.4 | `--oraclepro` for GPT-5.4-pro |\n\n## Tools\n\n**File operations:** `read_file`, `write_file`, `edit_file`, `batch_read`, `diff_files`\n\n**System:** `run_command`, `shell_session`, `run_code`, `glob`, `list_dir`, `search_file`, `clipboard_copy`, `clipboard_paste`, `request_user_input`\n\n**Web:** `search_web` (hybrid Serper + SearXNG for general discovery, plus engine\u002Fcategory targeting), `fetch_page` (Notte-powered scraping), `browser_agent` (full interactive control), `http_request`\n\n**Analysis:** `think`, `oracle`, `delegate`, `analyze_data`, `analyze_image`, `batch_process`\n\n**State:** `notes_add`, `notes_read`, `plan_create`, `plan_update`\n\n**Config:** `manage_config`, `manage_skill`, `manage_mcp`\n\n**Email:** `email_auth`, `email_list_inboxes`, `email_create_inbox`, `email_list`, `email_read`, `email_send`, `email_delete`, `email_wait_for_message`\n\n**Extensions:** `use_skill` (custom skills), MCP tools (from configured servers)\n\n## Skills\n\nEach skill lives in its own directory under `~\u002F.dtt\u002Fskills\u002F` as a `SKILL.md` file (matching Claude Code's convention). So a skill called `my-skill` would live at `~\u002F.dtt\u002Fskills\u002Fmy-skill\u002FSKILL.md`. Subdirectories are scanned recursively, so you can organize skills however you like. Each `SKILL.md` can have optional YAML frontmatter:\n\n```yaml\n---\nname: my-skill\ndescription: What this skill does\ninline: true          # inject into agent context (vs. delegate to Sonnet)\nallowed-tools: [Read, Write, Edit]  # implies inline\ndisable-model-invocation: true  # hide from agent's skill list\n---\n\nYour skill instructions here...\n```\n\nSkills with `allowed-tools` or `inline: true` get injected directly into the agent's system prompt. 
The agent applies them while working, using its full tool access. All other skills are available via the `use_skill` tool for isolated execution.\n\n## MCP servers\n\nConfigure MCP servers in `~\u002F.dtt\u002Fmcp.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"my-server\": {\n      \"command\": \"npx\",\n      \"args\": [\"-y\", \"my-mcp-server\"],\n      \"env\": { \"API_KEY\": \"${MY_API_KEY}\" }\n    }\n  }\n}\n```\n\nThe agent discovers and uses all tools exposed by connected MCP servers.\n\n## Environment variables\n\n| Variable | Required | Description |\n|---|---|---|\n| `OPENROUTER_API_KEY` | Yes | Your OpenRouter API key |\n| `SERPER_API_KEY` | No | Enables hybrid `search_web` plus Serper-backed `batch_process` search enrichment |\n| `TWOCAPTCHA_API_KEY` | No | Enables automated captcha solving |\n| `AGENTMAIL_API_KEY` | No | AgentMail key for email tools |\n| `AGENTMAIL_INBOX_ID` | No | Default AgentMail inbox ID |\n| `AGENTMAIL_HUMAN_EMAIL` | No | Human email for AgentMail OTP verification |\n\nAll variables can be saved to `~\u002F.dtt\u002Fenv` (shell-exported values take precedence). The agent can update this file via `manage_config`.\n\n## Where things live\n\n| Path | What's there |\n|---|---|\n| `~\u002F.dtt\u002Fenv` | Saved API keys for OpenRouter, Serper, 2Captcha, and AgentMail. Mode 0600. The agent can update this via manage_config. |\n| `~\u002F.dtt\u002Fthreads\u002F` | Saved conversation threads (resume with `--resume`) |\n| `~\u002F.dtt\u002Fthreads\u002F\u003Cid>\u002Fcache\u002F` | Per-thread scratch folder (intermediate files, downloads, batch artifacts) |\n| `~\u002F.dtt\u002Fskills\u002F\u003Cname>\u002FSKILL.md` | User-defined skills (Claude Code convention) |\n| `~\u002F.dtt\u002Fmcp.json` | MCP server configuration |\n| `\u002Ftmp\u002Fdothething\u002F` | Runtime: Python venv, SearXNG, Camoufox browser |\n\n## Pipe mode\n\n`--pipe` sends only the final report to stdout and mutes everything else. 
Use it when you need to chain dothething into other commands:\n\n```bash\n.\u002Fdtt.sh --pipe --prompt \"Summarize the README in this repo\" | pbcopy\n.\u002Fdtt.sh --pipe --prompt \"List all TODO comments\" > todos.txt\ncat spec.md | .\u002Fdtt.sh --pipe --prompt \"Review this spec\"\n```\n\nExit codes: 0 means complete, 2 means partial, 1 means failed.\n\n## Notifications\n\n`--notify-desktop` pops a system notification when the task finishes. On macOS this uses osascript, on Linux it uses notify-send.\n\n`--notify-email you@example.com` sends a short email summary when done. Requires AgentMail to be configured.\n\nBoth work in orchestrator mode -- you get per-agent notifications as they finish, plus one when all agents are done.\n\n## Persistent shell\n\nThe `shell_session` tool provides a stateful bash session that persists environment variables, working directory, and shell state across calls. Use it for multi-step build processes, interactive debugging, or anything where shell state matters between commands. For simple one-off commands, `run_command` is still there and simpler.\n\n## Cost limits\n\n`--max-cost 5.00` stops the agent when cumulative spending hits $5. The agent checkpoints its state so you can `--resume` later if you want to continue. Useful for fire-and-forget runs where you don't want to babysit the budget.\n\n## Email polling\n\n`email_wait_for_message` pauses the agent until a specific reply hits the inbox. Set filters on sender, subject, or thread. The agent polls every few seconds and returns the message when it arrives, or times out. Saves you from wasting tokens on manual poll loops.\n\n## Security\n\nPersisted thread logs (`~\u002F.dtt\u002Fthreads\u002F`) are redacted -- API keys, tokens, and secrets are masked before writing to disk. The same redaction applies to `--debug` output.\n\n## License\n\nBSD 3-Clause. 
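See [LICENSE](LICENSE) for the full text.\n","dothething is a local autonomous AI agent: the user describes a task in natural language, and it completes the task automatically and returns the results. Its core features include web automation, data extraction, file editing, and code execution. It supports web search through a local SearXNG instance and uses Notte with the Camoufox browser to avoid fingerprinting. It can also manage its own configuration, handle email, interact with the system clipboard, and accept new instructions mid-task. dothething suits scenarios that call for automating complex multi-step tasks, such as market research, data analysis, or document generation. The project is written as shell scripts, is easy to install and configure, and requires Python 3.11 or later.","2026-06-11 02:38:55","CREATED_QUERY"]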