[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-11684":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":9,"totalLinesOfCode":9,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":9,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":9,"rankLanguage":9,"license":9,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":22,"topics":24,"createdAt":9,"pushedAt":9,"updatedAt":40,"readmeContent":41,"aiSummary":42,"trendingCount":16,"starSnapshotCount":16,"syncStatus":43,"lastSyncTime":44,"discoverSource":45},11684,"GenericAgent","lsdefine\u002FGenericAgent","lsdefine","Self-evolving agent: grows skill tree from 3.3K-line seed, achieving full system control with 6x less token consumption",null,"https:\u002F\u002Fgithub.com\u002Flsdefine\u002FGenericAgent","Python",12759,1474,32,77,0,121,411,752,363,44.51,false,"main",[25,26,27,28,29,30,31,32,33,34,35,36,37,38,39],"ai-agent","automation","autonomous-agent","browser-automation","claude","computer-control","desktop-automation","gemini","lightweight","llm-agent","memory-system","python","self-evolving","skill-tree","task-automation","2026-06-12 02:02:33","\u003Cdiv align=\"center\">\n\n\u003Cimg src=\"assets\u002Fimages\u002Fbar.jpg\" width=\"880\" alt=\"GenericAgent Banner\"\u002F>\n\n# GenericAgent\n\n**A Minimal, Self-Evolving Autonomous Agent Framework**\n\n*~3K lines of seed code · 9 atomic tools · ~100-line Agent Loop*\n\n\u003Cp>\n\n  \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.17091\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTechnical_Report-PDF-EA4335?style=flat-square&logo=adobeacrobatreader&logoColor=white\" alt=\"Technical Report\"\u002F>\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FJinyiHan99\u002FGA-Technical-Report\">\u003Cimg 
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCode_%26_Data-Reproduction-181717?style=flat-square&logo=github\" alt=\"Reproduction Repo\"\u002F>\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fdatawhalechina.github.io\u002Fhello-generic-agent\u002F\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTutorial-Datawhale-blue?style=flat-square\" alt=\"Tutorial\"\u002F>\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Ffudankw.cn\u002Fsophub\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSkill_Hub-Sophub-purple?style=flat-square\" alt=\"Sophub\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp>\n  \u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F25944\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Ftrendshift.io\u002Fapi\u002Fbadge\u002Frepositories\u002F25944\" alt=\"Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n**[English](#-english) · [中文](#-中文)**\n\n\u003C\u002Fdiv>\n\n> 📌 **Official Channel** — This GitHub repository is the **only** official source of GenericAgent.\n> We have no affiliation with any third-party website using the GenericAgent name.\n\n---\n\n\u003Ca id=\"-english\">\u003C\u002Fa>\n\n## 🌟 Overview\n\n**GenericAgent** is a minimal, self-evolving autonomous agent framework. Its core is just **~3K lines of code**. Through **9 atomic tools + a ~100-line Agent Loop**, it grants any LLM system-level control over a local computer — covering browser, terminal, filesystem, keyboard\u002Fmouse input, screen vision, and mobile devices (ADB).\n\n> Design philosophy — **don't preload skills, evolve them.**\n\nEvery time GenericAgent solves a new task, it automatically crystallizes the execution path into a reusable **Skill**. 
The longer you use it, the more skills accumulate — forming a personal skill tree grown entirely from 3K lines of seed code.\n\n> 🤖 **Self-Bootstrap Proof** — Everything in this repository, from installing Git and running `git init` to every commit message, was completed autonomously by GenericAgent. The author never opened a terminal once.\n\n### 📑 Table of Contents\n\n- [Key Features](#-key-features)\n- [Demo Showcase](#-demo-showcase)\n- [Quick Start](#-quick-start)\n- [Usage](#-usage)\n- [Architecture](#-architecture)\n- [Self-Evolution Mechanism](#-self-evolution-mechanism)\n- [Comparison](#-comparison)\n- [Evaluation](#-evaluation)\n- [Roadmap & News](#-roadmap--news)\n- [Community & Support](#-community--support)\n- [License](#-license)\n\n---\n\n## 📋 Key Features\n\n| Feature | Description |\n| :--- | :--- |\n| 🧬 **Self-Evolving** | Automatically crystallizes each task into a Skill. Capabilities grow with every use, forming your personal skill tree. |\n| 🪶 **Minimal Architecture** | ~3K lines of core code. Agent Loop is ~100 lines. No complex dependencies, zero deployment overhead. |\n| ⚡ **Strong Execution** | Injects into a real browser (preserving login sessions). 9 atomic tools take direct control of the system. |\n| 🔌 **High Compatibility** | Supports Claude \u002F Gemini \u002F Kimi \u002F MiniMax and other major models. Cross-platform. |\n| 💰 **Token Efficient** | \u003C30K context window — a fraction of the 200K–1M other agents consume. Less noise, fewer hallucinations, higher success rate, lower cost. 
|\n\n---\n\n## 🎯 Demo Showcase\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd align=\"center\" width=\"50%\">\u003Cb>🧋 Food Delivery Order\u003C\u002Fb>\u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"50%\">\u003Cb>📈 Quantitative Stock Screening\u003C\u002Fb>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"assets\u002Fdemo\u002Forder_tea.gif\" width=\"100%\" alt=\"Order Tea\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"assets\u002Fdemo\u002Fselectstock.gif\" width=\"100%\" alt=\"Stock Selection\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Csub>\u003Ci>\"Order me a milk tea\"\u003C\u002Fi> — navigates the delivery app, selects items, completes checkout.\u003C\u002Fsub>\u003C\u002Ftd>\n    \u003Ctd>\u003Csub>\u003Ci>\"Find GEM stocks with EXPMA golden cross, turnover &gt; 5%\"\u003C\u002Fi> — quantitative screening.\u003C\u002Fsub>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd align=\"center\">\u003Cb>🌐 Autonomous Web Exploration\u003C\u002Fb>\u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Cb>💰 Expense Tracking\u003C\u002Fb>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"assets\u002Fdemo\u002Fautonomous_explore.png\" width=\"100%\" alt=\"Web Exploration\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"assets\u002Fdemo\u002Falipay_expense.png\" width=\"100%\" alt=\"Alipay Expense\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Csub>Autonomously browses and periodically summarizes web content.\u003C\u002Fsub>\u003C\u002Ftd>\n    \u003Ctd>\u003Csub>\u003Ci>\"Find expenses over ¥2K in the last 3 months\"\u003C\u002Fi> — drives Alipay via ADB.\u003C\u002Fsub>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd align=\"center\" colspan=\"2\">\u003Cb>💬 Batch Messaging\u003C\u002Fb>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd colspan=\"2\" align=\"center\">\u003Cimg src=\"assets\u002Fdemo\u002Fwechat_batch.png\" width=\"50%\" 
alt=\"WeChat Batch\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd colspan=\"2\">\u003Csub>Sends bulk WeChat messages, fully driving the WeChat client.\u003C\u002Fsub>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n---\n\n## 🚀 Quick Start\n\n> ⚠️ **Python version**: use **Python 3.11 or 3.12**. **Do not** use Python 3.14 — it is incompatible with `pywebview` and a few other GA dependencies.\n>\n> 📖 Detailed installation guide: **[installation.md](docs\u002Finstallation.md)** · **[installation_zh.md（中文）](docs\u002Finstallation_zh.md)**\n\n### For LLM Agents\n\nFetch the installation guide and follow it:\n\n```bash\ncurl -fsSL https:\u002F\u002Fraw.githubusercontent.com\u002Flsdefine\u002FGenericAgent\u002Frefs\u002Fheads\u002Fmain\u002Fdocs\u002Finstallation.md\n```\n\n### For Humans\n\n#### Method 1 — One-line install *(recommended)*\n\nThis installs GenericAgent with an isolated Python environment and Git, then downloads a ready-to-run package.\n\n**Windows PowerShell**\n\n```powershell\npowershell -ExecutionPolicy Bypass -c \"$env:GLOBAL=1; irm http:\u002F\u002Ffudankw.cn:9000\u002Ffiles\u002Fga_install.ps1 | iex\"\n```\n\n**Linux \u002F macOS**\n\n```bash\nGLOBAL=1 bash -c \"$(curl -fsSL http:\u002F\u002Ffudankw.cn:9000\u002Ffiles\u002Fga_install.sh)\"\n```\n\nAfter installation, launch the desktop app from:\n\n```text\nfrontends\u002FGenericAgent.exe\n```\n\n#### Method 2 — Python install *(for developers)*\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Flsdefine\u002FGenericAgent.git\ncd GenericAgent\nuv venv\nuv pip install -e \".[ui]\"          # Core + UI dependencies\ncp mykey_template.py mykey.py      # Fill in your LLM API key\npython launch.pyw\n```\n\n> 💡 GenericAgent is meant to grow its environment **through the Agent itself**, not by pre-installing every possible package.\n\n📖 Full guide: [`docs\u002FGETTING_STARTED.md`](docs\u002FGETTING_STARTED.md)\n\n---\n\n## 💻 Usage\n\n### Frontends\n\n#### Desktop App\n\nFor 
one-line installs on Windows, double-click:\n\n```text\nfrontends\u002FGenericAgent.exe\n```\n\n#### Terminal UI\n\nA lightweight, keyboard-driven interface built on [Textual](https:\u002F\u002Fgithub.com\u002FTextualize\u002Ftextual). Supports multiple concurrent sessions and real-time streaming.\n\n```bash\npython frontends\u002Ftuiapp_v2.py\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>⚠️ Windows TUI Troubleshooting\u003C\u002Fb>\u003C\u002Fsummary>\n\nTUI rendering on Windows can be flaky depending on terminal + font. Common causes:\n\n1. `textual` is not on the latest version — `pip install -U textual` first.\n2. PowerShell \u002F cmd ship with terminals that have rough Unicode + key-binding support. **Prefer Git Bash on Windows**, which is much better behaved.\n3. If it still looks broken, ask GA itself to fix it:\n   > *\"My experience using `frontends\u002Ftuiapp_v2.py` in PowerShell \u002F cmd \u002F Git Bash on Windows is very poor — lots of incompatibility. Please refer to Claude Code's best practices for the Windows terminal and fix all font and rendering incompatibilities.\"*\n\n\u003C\u002Fdetails>\n\n#### Streamlit UI\n\n```bash\npython launch.pyw\n```\n\n### Bot Interface (IM)\n\nGenericAgent also supports IM frontends such as Telegram, WeChat, QQ, Feishu \u002F Lark, WeCom, and DingTalk.\n\n| Platform | Command |\n| :--- | :--- |\n| Telegram | `python frontends\u002Ftgapp.py` |\n| WeChat | `python frontends\u002Fwechatapp.py` |\n| QQ | `python frontends\u002Fqqapp.py` |\n| Feishu \u002F Lark | `python frontends\u002Ffsapp.py` |\n| WeCom | `python frontends\u002Fwecomapp.py` |\n| DingTalk | `python frontends\u002Fdingtalkapp.py` |\n\n> For detailed setup, ask GenericAgent itself.\n\n### Common Chat Commands\n\n| Command | Description |\n| :--- | :--- |\n| `\u002Fnew` | Start a fresh conversation and clear the current context |\n| `\u002Fcontinue` | List recoverable conversation snapshots |\n| `\u002Fcontinue N` | Restore the `N`-th recoverable 
conversation |\n\n---\n\n## 🧠 Architecture\n\nGenericAgent accomplishes complex tasks through **Layered Memory × Minimal Toolset × Autonomous Execution Loop**, continuously accumulating experience during execution.\n\n### 1️⃣ Layered Memory System\n\n> *Memory crystallizes throughout task execution, letting the agent build stable, efficient working patterns over time.*\n\n| Layer | Name | Description |\n| :---: | :--- | :--- |\n| **L0** | Meta Rules | Core behavioral rules and system constraints |\n| **L1** | Insight Index | Minimal memory index for fast routing and recall |\n| **L2** | Global Facts | Stable knowledge accumulated over long-term operation |\n| **L3** | Task Skills \u002F SOPs | Reusable workflows for completing specific task types |\n| **L4** | Session Archive | Archived task records distilled from finished sessions for long-horizon recall |\n\n### 2️⃣ Autonomous Execution Loop\n\n> *Perceive environment state → Task reasoning → Execute tools → Write experience to memory → Loop*\n\nThe entire core loop is just **~100 lines of code** ([`agent_loop.py`](agent_loop.py)).\n\n### 3️⃣ Minimal Toolset\n\n> *GenericAgent provides only **9 atomic tools**, forming the foundational capabilities for interacting with the outside world.*\n\n| Tool | Function |\n| :--- | :--- |\n| `code_run` | Execute arbitrary code (Python \u002F PowerShell) |\n| `file_read` | Read files |\n| `file_write` | Write \u002F create \u002F overwrite files |\n| `file_patch` | Patch \u002F modify files |\n| `web_scan` | Perceive web content |\n| `web_execute_js` | Control browser behavior |\n| `ask_user` | Human-in-the-loop confirmation |\n| `update_working_checkpoint` | *(memory)* Short-term working notepad |\n| `start_long_term_update` | *(memory)* Distill long-term memory |\n\n### 4️⃣ Capability Extension\n\n> *Capable of dynamically creating new tools.*\n\nVia `code_run`, GenericAgent can dynamically install Python packages, write new scripts, call external APIs, or control hardware 
at runtime — crystallizing temporary abilities into permanent tools.\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"assets\u002Fimages\u002Fworkflow.jpg\" alt=\"GenericAgent Workflow\" width=\"420\"\u002F>\n  \u003Cbr\u002F>\u003Cem>GenericAgent Workflow Diagram\u003C\u002Fem>\n\u003C\u002Fdiv>\n\n---\n\n## 🧬 Self-Evolution Mechanism\n\nThis is what fundamentally distinguishes GenericAgent from every other agent framework.\n\n```text\n[New Task]\n   │\n   ▼\n[Autonomous Exploration]   ─►  install deps · write scripts · debug · verify\n   │\n   ▼\n[Crystallize into Skill]   ─►  write to memory layer\n   │\n   ▼\n[Direct Recall on Next Similar Task]\n```\n\n| What you say | First time | Every time after |\n| :--- | :--- | :--- |\n| *\"Read my WeChat messages\"* | Install deps → reverse DB → write read script → save Skill | **one-line invoke** |\n| *\"Monitor stocks and alert me\"* | Install `mootdx` → build selection flow → configure cron → save Skill | **one-line start** |\n| *\"Send this file via Gmail\"* | Configure OAuth → write send script → save Skill | **ready to use** |\n\nAfter a few weeks, your agent instance will have a skill tree no one else in the world has — all grown from 3K lines of seed code.\n\n---\n\n## 📊 Comparison\n\n| Feature | **GenericAgent** | OpenClaw | Claude Code |\n| :--- | :---: | :---: | :---: |\n| **Codebase** | ~3K lines | ~530,000 lines | Open-sourced (large) |\n| **Deployment** | `pip install` + API Key | Multi-service orchestration | CLI + subscription |\n| **Browser Control** | Real browser (session preserved) | Sandbox \u002F headless browser | Via MCP plugin |\n| **OS Control** | Mouse\u002Fkbd, vision, ADB | Multi-agent delegation | File + terminal |\n| **Self-Evolution** | Autonomous skill growth | Plugin ecosystem | Stateless between sessions |\n| **Out of the Box** | Few core files + starter skills | Hundreds of modules | Rich CLI toolset |\n\n---\n\n## 📈 Evaluation\n\n> 📂 Full evaluation datasets and results: 
[**JinyiHan99\u002FGA-Technical-Report**](https:\u002F\u002Fgithub.com\u002FJinyiHan99\u002FGA-Technical-Report\u002Ftree\u002Fmain)\n\nWe evaluate GenericAgent across **five dimensions**:\n\n| # | Dimension | Question | Benchmarks |\n| :---: | :--- | :--- | :--- |\n| 1 | **Task Completion & Token Efficiency** | Can GA complete hard tasks more cheaply than leading agents? | SOP-Bench, Lifelong AgentBench, RealFin-Benchmark |\n| 2 | **Tool-Use Efficiency** | Can a minimal atomic toolset solve what specialized toolsets solve, with less overhead? | Tool Efficiency Benchmark (11 simple + 5 long-horizon) |\n| 3 | **Memory System Effectiveness** | Does condensed hierarchical memory beat full\u002Fredundant memory and embedding-based retrievers? | SOP-Bench (dangerous goods), LoCoMo, 20-skill stress test |\n| 4 | **Self-Evolution Capability** | Can the agent distill experience into reusable SOPs and code, without intervention? | 9-round LangChain longitudinal study, 8-task cross-task web benchmark |\n| 5 | **Web Browsing Capability** | Does density-driven design survive the open web? 
| WebCanvas, BrowseComp-ZH, Custom Tasks (22) |\n\nBaselines across these dimensions include **Claude Code**, **OpenAI Codex**, and **OpenClaw**, evaluated under *Claude Sonnet 4.6*, *Claude Opus 4.6*, *GPT-5.4*, and *MiniMax M2.7* backbones.\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd align=\"center\" width=\"50%\">\n      \u003Cimg src=\"assets\u002Fimages\u002Fresult_radar.png\" width=\"100%\" alt=\"Tool-use efficiency radar\"\u002F>\u003Cbr\u002F>\n      \u003Csub>\u003Cb>Tool-use efficiency radar.\u003C\u002Fb> GA dominates token, request, and tool-call axes while preserving quality across four task dimensions.\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"50%\">\n      \u003Cimg src=\"assets\u002Fimages\u002Fresult_convergence.png\" width=\"100%\" alt=\"Cross-task self-evolution convergence\"\u002F>\u003Cbr\u002F>\n      \u003Csub>\u003Cb>Cross-task self-evolution.\u003C\u002Fb> Second- and third-run GA executions converge to a stable low-cost regime across eight web tasks, while OpenClaw shows no such convergence.\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n---\n\n## 📅 Roadmap & News\n\n- **2026-05-15** — 🖥️ **Desktop GUI released**. One-line installs ship a ready-to-run desktop app (`frontends\u002FGenericAgent.exe`). Developers launch via `python launch.pyw`.\n- **2026-05-14** — 🆕 **Conductor sub-agent orchestration**. Spawn, supervise, and auto-clean parallel sub-agents; first-class delegation primitives complementing `\u002Fbtw` side-questions.\n- **2026-05-12** — 🆕 **TUI v2 released** (`frontends\u002Ftuiapp_v2.py`). 
Refined Textual frontend with image-paste folding, file paste, block-delete, Ctrl+C copy, history navigation, and `\u002Fllm` \u002F `\u002Fexport` \u002F `\u002Fcontinue` pickers.\n- **2026-04-21** — 📄 [**Technical Report on arXiv**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.17091) — *GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization*.\n- **2026-04-11** — Introduced **L4 session archive memory** and scheduler cron integration.\n- **2026-03-23** — Personal WeChat supported as a bot frontend.\n- **2026-03-10** — [Released million-scale Skill Library](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002Fq2gQ7YvWoiAcwxzaiwpuiQ?scene=1&click_id=7).\n- **2026-03-08** — [Released \"Dintal Claw\" — a GenericAgent-powered government-affairs bot](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FeiEhwo-j6S-WpLxgBnNxBg).\n- **2026-03-01** — [Featured by Jiqizhixin (机器之心)](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FuVWpTTF5I1yzAENV_qm7yg).\n- **2026-01-16** — GenericAgent **V1.0** public release.\n\n---\n\n## ⭐ Community & Support\n\nIf this project helped you, please consider leaving a **Star!** 🙏\n\nYou're also welcome to join the **GenericAgent Community Group** for discussion, feedback, and co-building 👏\n\n\u003Cdiv align=\"center\">\n  \u003Ctable>\n    \u003Ctr>\n      \u003Ctd align=\"center\">\u003Cstrong>WeChat Group 18\u003C\u002Fstrong>\u003Cbr\u002F>\u003Cimg src=\"assets\u002Fimages\u002Fwechat_group18.jpg\" alt=\"WeChat Group 18 QR\" width=\"240\"\u002F>\u003C\u002Ftd>\n    \u003C\u002Ftr>\n  \u003C\u002Ftable>\n\u003C\u002Fdiv>\n\n### 🚩 Friendly Links\n\nThanks to the **LinuxDo** community for the support!\n\n[![LinuxDo](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCommunity-LinuxDo-blue?style=for-the-badge)](https:\u002F\u002Flinux.do\u002F)\n\n**Community GUIs** *(independent open-source projects)*:\n\n- [chilishark27\u002Fga-manager](https:\u002F\u002Fgithub.com\u002Fchilishark27\u002Fga-manager)\n- 
[wangjc683\u002Fgalley](https:\u002F\u002Fgithub.com\u002Fwangjc683\u002Fgalley)\n\n---\n\n## 📄 License\n\nDistributed under the **MIT License**. See [`LICENSE`](LICENSE) for full text.\n\n> *Disclaimer: This project does not build or operate any commercial website. Apart from DintalClaw, no institution, organization, or individual is currently officially authorized to conduct commercial activities under the GenericAgent name.*\n\n---\n\n\u003Ca id=\"-中文\">\u003C\u002Fa>\n\n## 🌟 项目简介\n\n**GenericAgent** 是一个极简、可自我进化的自主 Agent 框架。核心仅 **~3K 行代码**，通过 **9 个原子工具 + ~100 行 Agent Loop**，赋予任意 LLM 对本地计算机的系统级控制能力，覆盖浏览器、终端、文件系统、键鼠输入、屏幕视觉及移动设备（ADB）。\n\n> 设计哲学 —— **不预设技能，靠进化获得能力。**\n\n每解决一个新任务，GenericAgent 就将执行路径自动固化为 Skill，供后续直接调用。使用时间越长，沉淀的技能越多，形成一棵完全属于你、从 3K 行种子代码生长出来的专属技能树。\n\n> 🤖 **自举实证** — 本仓库的一切，从安装 Git、`git init` 到每一条 commit message，均由 GenericAgent 自主完成。作者全程未打开过一次终端。\n\n### 📑 目录\n\n- [核心特性](#-核心特性)\n- [实例展示](#-实例展示)\n- [快速开始](#-快速开始)\n- [使用方式](#-使用方式)\n- [架构设计](#-架构设计)\n- [自我进化机制](#-自我进化机制)\n- [与同类产品对比](#-与同类产品对比)\n- [评测](#-评测)\n- [路线图与最新动态](#-路线图与最新动态)\n- [社区与支持](#-社区与支持)\n- [许可](#-许可)\n\n---\n\n## 📋 核心特性\n\n| 特性 | 说明 |\n| :--- | :--- |\n| 🧬 **自我进化** | 每次任务自动沉淀 Skill，能力随使用持续增长，形成专属技能树 |\n| 🪶 **极简架构** | ~3K 行核心代码，Agent Loop 约百行，无复杂依赖，部署零负担 |\n| ⚡ **强执行力** | 注入真实浏览器（保留登录态），9 个原子工具直接接管系统 |\n| 🔌 **高兼容性** | 支持 Claude \u002F Gemini \u002F Kimi \u002F MiniMax 等主流模型，跨平台运行 |\n| 💰 **极致省 Token** | 上下文窗口不到 30K，是其他 Agent（200K–1M）的零头；噪声更少、幻觉更低、成功率更高，成本低一个数量级 |\n\n---\n\n## 🎯 实例展示\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd align=\"center\" width=\"50%\">\u003Cb>🧋 外卖下单\u003C\u002Fb>\u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"50%\">\u003Cb>📈 量化选股\u003C\u002Fb>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"assets\u002Fdemo\u002Forder_tea.gif\" width=\"100%\" alt=\"外卖下单\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"assets\u002Fdemo\u002Fselectstock.gif\" width=\"100%\" alt=\"量化选股\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    
\u003Ctd>\u003Csub>\u003Ci>\"Order me a milk tea\"\u003C\u002Fi> — 自动导航外卖 App，选品并完成结账\u003C\u002Fsub>\u003C\u002Ftd>\n    \u003Ctd>\u003Csub>\u003Ci>\"Find GEM stocks with EXPMA golden cross, turnover &gt; 5%\"\u003C\u002Fi> — 量化条件筛股\u003C\u002Fsub>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd align=\"center\">\u003Cb>🌐 自主网页探索\u003C\u002Fb>\u003C\u002Ftd>\n    \u003Ctd align=\"center\">\u003Cb>💰 支出追踪\u003C\u002Fb>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Cimg src=\"assets\u002Fdemo\u002Fautonomous_explore.png\" width=\"100%\" alt=\"网页探索\">\u003C\u002Ftd>\n    \u003Ctd>\u003Cimg src=\"assets\u002Fdemo\u002Falipay_expense.png\" width=\"100%\" alt=\"支付宝支出\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd>\u003Csub>自主浏览并定时汇总网页信息\u003C\u002Fsub>\u003C\u002Ftd>\n    \u003Ctd>\u003Csub>\u003Ci>\"查找近 3 个月超 ¥2K 的支出\"\u003C\u002Fi> — 通过 ADB 驱动支付宝\u003C\u002Fsub>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd align=\"center\" colspan=\"2\">\u003Cb>💬 批量消息\u003C\u002Fb>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd colspan=\"2\" align=\"center\">\u003Cimg src=\"assets\u002Fdemo\u002Fwechat_batch.png\" width=\"50%\" alt=\"微信批量\">\u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd colspan=\"2\">\u003Csub>批量发送微信消息，完整驱动微信客户端\u003C\u002Fsub>\u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n---\n\n## 🚀 快速开始\n\n> ⚠️ **Python 版本：** 推荐使用 **Python 3.11 或 3.12**。**请不要使用 Python 3.14**，与 `pywebview` 及部分依赖不兼容。\n>\n> 📖 详细安装指南：**[installation_zh.md（中文）](docs\u002Finstallation_zh.md)** · **[installation.md (English)](docs\u002Finstallation.md)**\n\n### 给 LLM Agent 看的\n\n获取安装指南并照做：\n\n```bash\ncurl -fsSL https:\u002F\u002Fraw.githubusercontent.com\u002Flsdefine\u002FGenericAgent\u002Frefs\u002Fheads\u002Fmain\u002Fdocs\u002Finstallation_zh.md\n```\n\n### 给人类用户看的\n\n#### 方法一 — 一键安装 *（推荐）*\n\n一键安装会自动准备独立 Python 环境、Git、项目文件和桌面端，不污染系统环境。\n\n**Windows PowerShell**\n\n```powershell\npowershell 
-ExecutionPolicy Bypass -c \"irm http:\u002F\u002Ffudankw.cn:9000\u002Ffiles\u002Fga_install.ps1 | iex\"\n```\n\n**Linux \u002F macOS**\n\n```bash\ncurl -fsSL http:\u002F\u002Ffudankw.cn:9000\u002Ffiles\u002Fga_install.sh | bash\n```\n\n安装完成后，双击启动：\n\n```text\nfrontends\u002FGenericAgent.exe\n```\n\n#### 方法二 — Python 安装 *（开发者）*\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Flsdefine\u002FGenericAgent.git\ncd GenericAgent\nuv venv\nuv pip install -e \".[ui]\"          # 核心 + UI 依赖\ncp mykey_template.py mykey.py      # 填入你的 LLM API Key\npython launch.pyw\n```\n\n> 💡 GenericAgent 更推荐由 **Agent 在使用中自举环境**，而不是预先手动装完整依赖。\n\n📖 完整引导流程见 [`docs\u002FGETTING_STARTED.md`](docs\u002FGETTING_STARTED.md)\n📖 新手图文版：[飞书文档](https:\u002F\u002Fmy.feishu.cn\u002Fwiki\u002FCGrDw0T76iNFuskmwxdcWrpinPb)\n📘 完整入门教程（Datawhale 出品）：[Hello GenericAgent](https:\u002F\u002Fdatawhalechina.github.io\u002Fhello-generic-agent\u002F) · [GitHub](https:\u002F\u002Fgithub.com\u002Fdatawhalechina\u002Fhello-generic-agent)\n\n---\n\n## 💻 使用方式\n\n### 前端启动\n\n#### 桌面端\n\n一键安装自带桌面端，双击：\n\n```text\nfrontends\u002FGenericAgent.exe\n```\n\n#### 终端 UI\n\n基于 [Textual](https:\u002F\u002Fgithub.com\u002FTextualize\u002Ftextual) 的轻量键盘驱动界面。支持多会话并发、实时流式输出，有终端就能跑。\n\n```bash\npython frontends\u002Ftuiapp_v2.py\n```\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>⚠️ Windows 上 TUI 显示异常的排查思路\u003C\u002Fb>\u003C\u002Fsummary>\n\n1. `textual` 版本太旧，先 `pip install -U textual`；\n2. PowerShell \u002F cmd 自带终端对 Unicode 和键位的支持比较糟糕，**Windows 上推荐用 Git Bash**，体验明显更稳；\n3. 
仍然显示异常时，可以让 GA 自己修一遍，参考 Prompt：\n   > *\"我在 Windows 的 PowerShell \u002F cmd \u002F Git Bash 中使用 `frontends\u002Ftuiapp_v2.py` 体验非常差，出现了一堆不兼容问题。请参考 Claude Code 在 Windows 终端的最佳配置，把所有字体和显示不兼容的问题修一遍。\"*\n\n\u003C\u002Fdetails>\n\n#### Streamlit UI\n\n```bash\npython launch.pyw\n```\n\n### Bot 接口（IM）\n\nGenericAgent 支持 Telegram、微信、QQ、飞书 \u002F Lark、企业微信、钉钉等 IM 前端。\n\n| 平台 | 启动命令 |\n| :--- | :--- |\n| Telegram | `python frontends\u002Ftgapp.py` |\n| 微信 | `python frontends\u002Fwechatapp.py` |\n| QQ | `python frontends\u002Fqqapp.py` |\n| 飞书 \u002F Lark | `python frontends\u002Ffsapp.py` |\n| 企业微信 | `python frontends\u002Fwecomapp.py` |\n| 钉钉 | `python frontends\u002Fdingtalkapp.py` |\n\n> 详细配置直接问 GenericAgent。\n\n### 通用聊天命令\n\n| 命令 | 说明 |\n| :--- | :--- |\n| `\u002Fnew` | 开启新对话并清空当前上下文 |\n| `\u002Fcontinue` | 列出可恢复会话快照 |\n| `\u002Fcontinue N` | 恢复第 `N` 个可恢复会话 |\n\n---\n\n## 🧠 架构设计\n\nGenericAgent 通过 **分层记忆 × 最小工具集 × 自主执行循环** 完成复杂任务，并在执行过程中持续积累经验。\n\n### 1️⃣ 分层记忆系统\n\n> *记忆在任务执行过程中持续沉淀，使 Agent 逐步形成稳定且高效的工作方式。*\n\n| 层级 | 名称 | 说明 |\n| :---: | :--- | :--- |\n| **L0** | 元规则（Meta Rules） | Agent 的基础行为规则和系统约束 |\n| **L1** | 记忆索引（Insight Index） | 极简索引层，用于快速路由与召回 |\n| **L2** | 全局事实（Global Facts） | 在长期运行过程中积累的稳定知识 |\n| **L3** | 任务 Skills \u002F SOPs | 完成特定任务类型的可复用流程 |\n| **L4** | 会话归档（Session Archive） | 从已完成任务中提炼出的归档记录，用于长程召回 |\n\n### 2️⃣ 自主执行循环\n\n> *感知环境状态 → 任务推理 → 调用工具执行 → 经验写入记忆 → 循环*\n\n整个核心循环仅 **约百行代码**（[`agent_loop.py`](agent_loop.py)）。\n\n### 3️⃣ 最小工具集\n\n> *GenericAgent 仅提供 **9 个原子工具**，构成与外部世界交互的基础能力。*\n\n| 工具 | 功能 |\n| :--- | :--- |\n| `code_run` | 执行任意代码（Python \u002F PowerShell） |\n| `file_read` | 读取文件 |\n| `file_write` | 写入 \u002F 创建 \u002F 覆盖文件 |\n| `file_patch` | 修改文件 |\n| `web_scan` | 感知网页内容 |\n| `web_execute_js` | 控制浏览器行为 |\n| `ask_user` | 人机协作确认 |\n| `update_working_checkpoint` | *（记忆）* 短期工作记事板 |\n| `start_long_term_update` | *（记忆）* 提炼长期记忆 |\n\n### 4️⃣ 能力扩展机制\n\n> *具备动态创建新工具的能力。*\n\n通过 `code_run`，GenericAgent 可在运行时动态安装 Python 包、编写新脚本、调用外部 API 
或控制硬件，将临时能力固化为永久工具。\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"assets\u002Fimages\u002Fworkflow.jpg\" alt=\"GenericAgent 工作流程\" width=\"420\"\u002F>\n  \u003Cbr\u002F>\u003Cem>GenericAgent 工作流程图\u003C\u002Fem>\n\u003C\u002Fdiv>\n\n---\n\n## 🧬 自我进化机制\n\n这是 GenericAgent 区别于其他 Agent 框架的根本所在。\n\n```text\n[遇到新任务]\n    │\n    ▼\n[自主摸索]   ─►  安装依赖 · 编写脚本 · 调试验证\n    │\n    ▼\n[执行路径固化为 Skill]   ─►  写入记忆层\n    │\n    ▼\n[下次同类任务直接调用]\n```\n\n| 你说的一句话 | 第一次做了什么 | 之后每次 |\n| :--- | :--- | :--- |\n| *\"监控股票并提醒我\"* | 安装 `mootdx` → 构建选股流程 → 配置定时任务 → 保存 Skill | **一句话启动** |\n| *\"用 Gmail 发这个文件\"* | 配置 OAuth → 编写发送脚本 → 保存 Skill | **直接可用** |\n\n用几周后，你的 Agent 实例将拥有一套任何人都没有的专属技能树，全部从 3K 行种子代码中生长而来。\n\n---\n\n## 📊 与同类产品对比\n\n| 特性 | **GenericAgent** | OpenClaw | Claude Code |\n| :--- | :---: | :---: | :---: |\n| **代码量** | ~3K 行 | ~530,000 行 | 已开源（体量大） |\n| **部署方式** | `pip install` + API Key | 多服务编排 | CLI + 订阅 |\n| **浏览器控制** | 注入真实浏览器（保留登录态） | 沙箱 \u002F 无头浏览器 | 通过 MCP 插件 |\n| **OS 控制** | 键鼠、视觉、ADB | 多 Agent 委派 | 文件 + 终端 |\n| **自我进化** | 自主生长 Skill 和工具 | 插件生态 | 会话间无状态 |\n| **出厂配置** | 几个核心文件 + 少量初始 Skills | 数百模块 | 丰富 CLI 工具集 |\n\n---\n\n## 📈 评测\n\n> 📂 完整的评测数据集以及评测结果见：[**JinyiHan99\u002FGA-Technical-Report**](https:\u002F\u002Fgithub.com\u002FJinyiHan99\u002FGA-Technical-Report\u002Ftree\u002Fmain)\n\n我们从 **五大维度** 评测 GenericAgent：\n\n| # | 维度 | 核心问题 | 使用的基准 |\n| :---: | :--- | :--- | :--- |\n| 1 | **任务完成度与 Token 效率** | GA 能否以更低成本完成高难度任务？ | SOP-Bench、Lifelong AgentBench、RealFin-Benchmark |\n| 2 | **工具使用效率** | 最小原子工具集能否以更低开销替代专用工具集？ | Tool Efficiency Benchmark |\n| 3 | **记忆系统有效性** | 精简分层记忆能否超越冗余记忆和基于 Embedding 的检索器？ | SOP-Bench、LoCoMo、20-skill 压力测试 |\n| 4 | **自我进化能力** | Agent 能否在无人干预下将经验提炼为可复用的 SOP 与代码？ | 9 轮 LangChain 纵向研究、8 任务跨任务 Web 基准 |\n| 5 | **网页浏览能力** | 信息密度驱动设计能否适应开放网页？ | WebCanvas、BrowseComp-ZH、自定义任务 |\n\n以上维度的基线包括 **Claude Code**、**OpenAI CodeX** 和 **OpenClaw**，分别在 *Claude Sonnet 4.6*、*Claude Opus 4.6*、*GPT-5.4* 和 *MiniMax M2.7* 底座上进行评测。\n\n\u003Ctable>\n  \u003Ctr>\n    
\u003Ctd align=\"center\" width=\"50%\">\n      \u003Cimg src=\"assets\u002Fimages\u002Fresult_radar.png\" width=\"100%\" alt=\"工具使用效率雷达图\"\u002F>\u003Cbr\u002F>\n      \u003Csub>\u003Cb>工具使用效率雷达图。\u003C\u002Fb>GA 在 Token、请求数和工具调用轴上全面领先，同时在四个任务维度上保持质量。\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"50%\">\n      \u003Cimg src=\"assets\u002Fimages\u002Fresult_convergence.png\" width=\"100%\" alt=\"跨任务自我进化收敛曲线\"\u002F>\u003Cbr\u002F>\n      \u003Csub>\u003Cb>跨任务自我进化。\u003C\u002Fb>GA 的第二轮和第三轮执行在 8 个 Web 任务上收敛至稳定的低成本区间。\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n---\n\n## 📅 路线图与最新动态\n\n- **2026-05-15** — 🖥️ **桌面 GUI 发布**。一键安装会自带可直接运行的桌面端（`frontends\u002FGenericAgent.exe`），开发者也可用 `python launch.pyw` 启动。\n- **2026-05-14** — 🆕 **Conductor 子 Agent 编排**。派发、监督、自动清理并行子 Agent；与 `\u002Fbtw` 旁路子 Agent 互补，提供一等公民级的任务委派原语。\n- **2026-05-12** — 🆕 **TUI v2 正式发布**（`frontends\u002Ftuiapp_v2.py`）。重做视觉风格的 Textual 前端，支持图片粘贴折叠、文件粘贴、块删除、Ctrl+C 复制、历史导航，以及 `\u002Fllm` \u002F `\u002Fexport` \u002F `\u002Fcontinue` 选择器。\n- **2026-04-21** — 📄 [**技术报告已发布至 arXiv**](https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.17091) — *GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization*。\n- **2026-04-11** — 引入 **L4 会话归档记忆**，并接入 scheduler cron 调度。\n- **2026-03-23** — 支持个人微信接入作为 Bot 前端。\n- **2026-03-10** — [发布百万级 Skill 库](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002Fq2gQ7YvWoiAcwxzaiwpuiQ?scene=1&click_id=7)。\n- **2026-03-08** — [发布以 GenericAgent 为核心的\"政务龙虾\" Dintal Claw](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FeiEhwo-j6S-WpLxgBnNxBg)。\n- **2026-03-01** — [被机器之心报道](https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002FuVWpTTF5I1yzAENV_qm7yg)。\n- **2026-01-16** — GenericAgent **V1.0** 公开版本发布。\n\n---\n\n## ⭐ 社区与支持\n\n如果这个项目对你有帮助，欢迎点一个 **Star!** 🙏\n\n也欢迎加入 **GenericAgent 体验交流群**，一起交流、反馈、共建 👏\n\n\u003Cdiv align=\"center\">\n  \u003Ctable>\n    \u003Ctr>\n      \u003Ctd 
align=\"center\">\u003Cstrong>微信群 18\u003C\u002Fstrong>\u003Cbr\u002F>\u003Cimg src=\"assets\u002Fimages\u002Fwechat_group18.jpg\" alt=\"微信群 18 二维码\" width=\"240\"\u002F>\u003C\u002Ftd>\n    \u003C\u002Ftr>\n  \u003C\u002Ftable>\n\u003C\u002Fdiv>\n\n### 🚩 友情链接\n\n感谢 **LinuxDo** 社区的支持！\n\n[![LinuxDo](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F社区-LinuxDo-blue?style=for-the-badge)](https:\u002F\u002Flinux.do\u002F)\n\n**社区 GUI 客户端** *（独立开源项目）*：\n\n- [chilishark27\u002Fga-manager](https:\u002F\u002Fgithub.com\u002Fchilishark27\u002Fga-manager)\n- [wangjc683\u002Fgalley](https:\u002F\u002Fgithub.com\u002Fwangjc683\u002Fgalley)\n\n---\n\n## 📄 许可\n\n基于 **MIT License** 发布，详见 [`LICENSE`](LICENSE)。\n\n> *声明：本项目未构建任何商业站点；除 DintalClaw 外，目前未官方授权任何机构、组织或个人以 GenericAgent 名义从事商业活动。*\n\n---\n\n## 📈 Star History\n\n\u003Cdiv align=\"center\">\n\n\u003Ca href=\"https:\u002F\u002Fstar-history.com\u002F#lsdefine\u002FGenericAgent&Date\">\n  \u003Cpicture>\n    \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=lsdefine\u002FGenericAgent&type=Date&theme=dark\" \u002F>\n    \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=lsdefine\u002FGenericAgent&type=Date\" \u002F>\n    \u003Cimg alt=\"Star History Chart\" src=\"https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=lsdefine\u002FGenericAgent&type=Date\" \u002F>\n  \u003C\u002Fpicture>\n\u003C\u002Fa>\n\n\u003Cbr\u002F>\u003Cbr\u002F>\n\u003C\u002Fdiv>\n","GenericAgent 是一个极简的自进化自主代理框架，通过约3000行的核心代码和9个基础工具实现对本地计算机的系统级控制。其核心功能包括浏览器、终端、文件系统、键盘\u002F鼠标输入、屏幕视觉及移动设备（ADB）的操作，并且能够自动将每次任务解决过程转化为可复用技能，逐渐形成用户专属的技能树，从而以更少的token消耗达成全面系统控制。该项目适合需要自动化日常任务处理、提升工作效率或进行复杂操作自动化的场景使用。",2,"2026-06-11 03:32:16","trending"]