[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-11153":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},11153,"mobileClaw","eggbrid2\u002FmobileClaw","eggbrid2","Open Android AI agent runtime for phone control, app automation, VLM screen reading, skill routing, mini apps, and Mihomo VPN workflows.",null,"Kotlin",265,16,10,5,0,58,134,159,174,93.69,"Other",false,"main",true,[],"2026-06-12 04:00:53","\u003Cdiv align=\"center\">\n\n\u003Cimg src=\"docs\u002Flogo.png\" alt=\"MobileClaw\" width=\"150\" \u002F>\n\n# MobileClaw\n\n### An open Android AI agent runtime that can see the screen, control apps, build tools, remember context, and route its own skills.\n\nMobileClaw is an experimental Android app for running LLM agents on a real phone. It sits at the intersection of Android automation, mobile AI agents, accessibility-based phone control, on-device Python tools, multi-agent workflows, and VPN\u002Fproxy operations.\n\nThe idea is simple: a mobile agent should not just chat about your device. 
It should be able to observe the screen, choose the right tools, act through Android capabilities, create new workflows, and keep enough memory to improve across tasks.\n\n[![Android](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAndroid-11%2B-3DDC84?logo=android&logoColor=white)](https:\u002F\u002Fdeveloper.android.com)\n[![Kotlin](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FKotlin-2.2-7F52FF?logo=kotlin&logoColor=white)](https:\u002F\u002Fkotlinlang.org)\n[![Compose](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FJetpack%20Compose-4285F4?logo=jetpackcompose&logoColor=white)](https:\u002F\u002Fdeveloper.android.com\u002Fjetpack\u002Fcompose)\n[![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FChaquopy-Python%203.11-3776AB?logo=python&logoColor=white)](https:\u002F\u002Fchaquo.com\u002Fchaquopy\u002F)\n[![LLM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FOpenAI--compatible-111827?logo=openai&logoColor=white)](https:\u002F\u002Fplatform.openai.com)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-22c55e)](LICENSE)\n\n**[中文 README](README_zh.md)**\n\n\u003C\u002Fdiv>\n\n---\n\n## Why This Exists\n\nMost mobile AI apps are chat surfaces. MobileClaw is closer to a small operating layer for agents.\n\nA user request is turned into a scoped task. The task gets a role, a short plan, a filtered tool set, and an execution loop. That shape is the core of the project:\n\n```text\nuser goal -> task type -> role scheduler -> planner -> allowed skills -> observe -> act -> verify\n```\n\nThis matters because phone automation fails quickly when every tool is always available. MobileClaw keeps phone control, web research, file work, app building, image generation, VPN control, skill management, and code execution in different task modes.\n\nThe project is still moving fast. Some pieces are stable enough to use daily; some are research-grade and need device-specific fixes. 
The code is open because this kind of Android agent needs real devices, real ROM quirks, and real users to become good.\n\n## What Works Today\n\n### Common Use Cases\n\n- Android AI agent for phone control and app automation.\n- VLM-style screen reading with coordinate-based tapping and scrolling.\n- AI assistant that can operate real Android apps through AccessibilityService.\n- Mobile agent runtime with task planning, role routing, and scoped tool injection.\n- Multi-agent group chat with long-running tasks and interruptible work.\n- AI-generated mini apps and native Android pages.\n- Optional on-device local model runtime with downloadable Gemma LiteRT models.\n- Clash\u002FMihomo subscription import and Android VPN control.\n- Embedded Python execution and dynamic skill creation on Android.\n\n### Phone Control\n\n- Accessibility-based screen reading through XML when Android exposes a useful tree.\n- Vision-first screen reading with `see_screen`, which captures a screenshot, marks interactive targets, and returns coordinates for direct action.\n- Raw `screenshot` fallback when XML is empty or misleading, especially for Flutter, React Native, WebView, and game-like UIs.\n- Tap, long press, scroll, text input, back\u002Fhome navigation, app launch, installed app listing.\n- A lightweight IME exists for more reliable text insertion paths.\n\n### Background Phone Work\n\n- Hidden virtual display support for launching apps away from the user's main screen.\n- Background screen XML and screenshot tools: `bg_launch`, `bg_read_screen`, `bg_screenshot`, `bg_stop`.\n- ROM-aware setup guidance, plus optional root or one-time ADB activated privileged service for devices that block launching apps onto virtual displays.\n\n### Task Runtime\n\n- `TaskClassifier` maps requests into task types such as `PHONE_CONTROL`, `WEB_RESEARCH`, `APP_BUILD`, `VPN_CONTROL`, `SKILL_MANAGEMENT`, and `CODE_EXECUTION`.\n- `TaskPlanner` makes a planning call before tool execution.\n- 
`TaskToolPolicy` controls which tools are visible for each task.\n- `RoleScheduler` chooses from built-in and user-created roles.\n- `AgentRuntime` runs a ReAct-style loop with repeated-perception guards, screenshot context trimming, structured observations, and task events.\n\n### Roles And Scheduling\n\nBuilt-in roles include:\n\n- General assistant\n- Coder\n- Web agent\n- Phone operator\n- Creator\n- Skill admin\n- VPN operator\n\nRoles are not just personas. They can declare preferred task types, keywords, scheduler priority, forced skills, and model overrides. User-created roles participate in the same scheduler.\n\n### Skills\n\nMobileClaw has a native skill registry with injection levels:\n\n- Level 0: always available for core runtime needs.\n- Level 1: task-aware skills.\n- Level 2: on-demand skills, usually created or promoted by the user.\n\nBuilt-in skill groups include:\n\n- Phone and perception: `see_screen`, `screenshot`, `read_screen`, `tap`, `scroll`, `input_text`, `navigate`, `list_apps`.\n- Web: `web_search`, `fetch_url`, hidden WebView browsing, page content extraction, JavaScript execution.\n- Files and attachments: create\u002Fread\u002Flist files, create HTML pages, user storage access, file cards, image\u002Ffile\u002Fhtml\u002Fwebpage\u002Fsearch-result attachments.\n- Creation: image generation, video generation, document generation, icon generation.\n- Apps: HTML mini-app creation and native Compose AI page creation.\n- Code: embedded Python execution, runtime pure-Python package install, shell execution, console editing.\n- Memory and user data: semantic memory, user profile facts, user config, skill notes.\n- Meta tools: create skills, generate skills from a description, browse\u002Finstall marketplace skills, manage roles, switch model, switch role, manage chat sessions.\n- VPN: start\u002Fstop\u002Fstatus through `vpn_control`.\n\nDynamic skills can be Python or HTTP definitions saved under app storage. 
Native and shell skills are intentionally not generated by the agent through the normal meta-skill path.\n\n### Mini Apps And AI Pages\n\nMobileClaw has two app-building paths:\n\n- HTML mini-apps run inside WebView and get a `Claw` JavaScript bridge for HTTP, SQLite, Python, shell, memory, config, files, clipboard, device info, app launch, URL opening, sharing, and asking the agent.\n- AI Pages are native Compose pages stored as JSON. They render a component DSL and execute action steps such as HTTP, shell, notification, vibration, app launch, open URL, clipboard, intents, phone dialer, SMS composer, alarms, and navigation between pages.\n\nBoth are created from chat through skills. Mini apps are good for fast web-like tools. AI Pages are better when a workflow should feel native.\n\nFollow-up edits keep artifact context. If the user asks to change \"that page\" after creating an AI Page, MobileClaw carries the recent page ID into the next task and routes the update back through `ui_builder` instead of falling back to one-off HTML.\n\n### VPN And Proxy Runtime\n\nMobileClaw includes a VPN stack designed for Android agent use:\n\n- Clash\u002FMihomo subscription import.\n- Raw YAML is stored so runtime configs can be rebuilt without resubscribing every time.\n- Supported parsed proxy types include HTTP, SOCKS5, Shadowsocks, SSR from YAML, VMess, Trojan, and VLESS.\n- Node latency is tested through short-lived mihomo processes.\n- Runtime config is built around a selected node and `MATCH,GLOBAL`.\n- Android `VpnService` creates the TUN interface.\n- mihomo provides the local mixed proxy.\n- `hev-socks5-tunnel` bridges Android TUN traffic to mihomo.\n- App HTTP and WebView traffic can use the active proxy path.\n\nThis stack does not use Xray. 
mihomo handles the proxy protocols; hev is kept because Android still needs a TUN-to-SOCKS bridge.\n\n### Chat, Group Chat, And Attachments\n\n- Normal chat supports text, image attachment, file attachment, streaming output, task logs, details sheets, collapsed long content, and separate attachment messages.\n- Group chat supports user and AI attachments.\n- Group chat has a small task pool. A long task occupies its agent and one pool slot, not the whole group.\n- Agents can be interrupted by newer user turns when capacity is available.\n- Group chat roles can own their own bubble style. Agents are encouraged to choose native Markdown bubbles by default, then tune presets, text color, font family, font weight, font size, gradients, background images, patterns, emotion fields, per-corner radius, padding, shadows, small decorations, and lightweight text\u002Fborder animations.\n- Roles may opt into HTML bubble rendering only when native Markdown styling is not expressive enough. HTML bubbles support custom templates, height, transparent backgrounds, optional JavaScript, and optional network images, but native rendering remains the preferred path for performance and app-like consistency.\n- Built-in ChineseBQB stickers can be searched or favorited from a thumbnail grid. 
Selecting a sticker sends it directly as a sticker\u002Fimage message instead of staging it as a generic attachment.\n\n### Memory\n\n- Semantic memory stores durable key-value facts.\n- Conversation memory stores recent user and assistant messages.\n- Episodic memory records task outcomes, skills used, and reflections, then retrieves similar past tasks through a local character n-gram embedder.\n- User profile extraction writes structured profile facts into semantic memory.\n- Working memory trims task steps to keep the active prompt bounded.\n\n### Local Models\n\nMobileClaw can run selected on-device models through LiteRT-LM:\n\n- Local model management lives in Settings, with download, import, delete, enable, and model selection controls.\n- Built-in download choices include Gemma 4 E2B and Gemma 4 E4B LiteRT-LM packages.\n- Multimodal `.task` resource packages can be downloaded or imported separately while the current Android LiteRT-LM chat path uses `.litertlm` text runtime files.\n- Model downloads support multiple sources: Hugging Face, ModelScope, and a user-provided custom direct URL.\n- Hugging Face tokens are supported for official Hugging Face downloads, but are not sent to domestic mirrors or custom URLs.\n- Local chat is used for text-only requests when enabled. 
Tool calls, image input, web access, or unavailable local models automatically fall back to the configured cloud endpoint when possible.\n\n### Local And LAN APIs\n\n- A loopback API server exposes skills, dynamic skill install\u002Fdelete, memory, and config to local HTTP skills.\n- A LAN console server exposes a browser UI, SSE task events, session\u002Fmessage APIs, skill export\u002Fimport APIs, memory\u002Fconfig APIs, and a downloadable OpenClaw CLI script.\n- The console page can be edited by the agent through `console_editor`.\n\n## Architecture\n\n```text\napp\u002Fsrc\u002Fmain\u002Fjava\u002Fcom\u002Fmobileclaw\n├─ agent\n│  ├─ TaskSession.kt       task types, task plans, tool policy\n│  ├─ AgentRuntime.kt      ReAct loop and task events\n│  ├─ AgentContext.kt      prompt construction\n│  ├─ Role.kt              built-in roles and role metadata\n│  └─ RoleScheduler.kt     automatic role routing\n├─ skill\n│  ├─ SkillRegistry.kt     registration, injection levels, overrides\n│  ├─ SkillLoader.kt       dynamic Python\u002FHTTP skill persistence\n│  ├─ builtin\u002F             native skills\n│  └─ executor\u002F            Python, HTTP, shell executors\n├─ perception\n│  ├─ ClawAccessibilityService.kt\n│  ├─ ScreenshotController.kt\n│  ├─ ActionController.kt\n│  ├─ VirtualDisplayManager.kt\n│  └─ ClawIME.kt\n├─ ui\n│  ├─ ChatScreen.kt        main chat\n│  ├─ GroupChatScreen.kt   multi-agent group chat\n│  ├─ DynamicUiRenderer.kt inline generated UI blocks\n│  ├─ MiniAppActivity.kt   WebView mini apps\n│  └─ aipage\u002F              native AI page runtime\n├─ vpn\n│  ├─ VpnManager.kt\n│  ├─ ClashParser.kt\n│  ├─ MihomoConfigBuilder.kt\n│  ├─ MihomoProcess.kt\n│  └─ ClawVpnService.kt\n├─ llm\n│  ├─ OpenAiGateway.kt     OpenAI-compatible cloud gateway\n│  ├─ LocalGemmaGateway.kt LiteRT-LM local gateway\n│  └─ LocalModelManager.kt local model download\u002Fimport\u002Fdelete\n├─ memory\n│  ├─ SemanticMemory.kt\n│  ├─ EpisodicMemory.kt\n│  ├─ 
ConversationMemory.kt\n│  └─ UserProfileExtractor.kt\n└─ server\n   ├─ ConsoleServer.kt\n   ├─ LocalApiServer.kt\n   ├─ PrivilegedServer.kt\n   └─ PrivilegedClient.kt\n```\n\n## Build\n\nRequirements:\n\n- Android Studio Ladybug or newer\n- JDK 21\n- Android 11+ device or emulator\n- An OpenAI-compatible chat endpoint and API key\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Feggbrid2\u002FmobileClaw.git\ncd mobileClaw\n.\u002Fgradlew :app:assembleDebug\n```\n\nDebug APK:\n\n```text\napp\u002Fbuild\u002Foutputs\u002Fapk\u002Fdebug\u002Fapp-debug.apk\n```\n\nThe app uses Kotlin 2.2, Jetpack Compose, Room, DataStore, WebView, OkHttp, Gson, Jsoup, SnakeYAML, Chaquopy Python 3.11, LiteRT-LM, mihomo, and hev-socks5-tunnel.\n\n## Permissions And Device Notes\n\nMobileClaw works by turning user-authorized Android capabilities into explicit agent tools. Depending on the feature, it may ask for:\n\n- Accessibility service access for screen reading, screenshots, gestures, and input.\n- VPN permission for Android `VpnService`.\n- Notification permission for foreground VPN state and AI Page notifications.\n- File and media access for user-selected attachments and user storage tools.\n- Overlay\u002Fbackground-related permissions for long-running and visual assistant features.\n- Optional ADB activation for the privileged virtual-display helper on ROMs that block standard APIs.\n\nRoot is not a baseline requirement. 
Some background-display features may still need ROM-specific setup, root, or the bundled shell-uid helper.\n\n## Good First Areas To Improve\n\n- More robust UI automation on non-standard Android views.\n- Better VLM grounding and action verification.\n- Safer dynamic skill review and promotion.\n- Better task policies and role scheduling heuristics.\n- More reproducible VPN subscription and mihomo edge cases.\n- ROM compatibility reports for virtual display launch behavior.\n- Better docs, demos, and small role\u002Fskill presets.\n\n## Star History\n\n[![Star History Chart](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=eggbrid2\u002FmobileClaw&type=Date)](https:\u002F\u002Fstar-history.com\u002F#eggbrid2\u002FmobileClaw&Date)\n\n## Status\n\nMobileClaw is not a polished assistant product. It is an open-source Android agent lab with a working app around it. Expect sharp edges, especially around device permissions, ROM policies, VPN configs, and long-running automation.\n\nIf you contribute, keep behavior inspectable. Small, understandable tools are better than magic.\n\n## License\n\nMIT. See [LICENSE](LICENSE).\n","MobileClaw is an open Android AI agent runtime that can control the phone, automate apps, read the screen, route skills, create mini apps, and manage Mihomo VPN workflows. Its core features include operating real Android apps through AccessibilityService, screen reading and interaction based on vision-language models (VLM), and task planning with role routing. Technically, MobileClaw is written in Kotlin and integrates Jetpack Compose and Chaquopy to support Python script execution, and it is compatible with the OpenAI API. The project suits scenarios that need complex automation on mobile devices, such as everyday task automation, accessibility assistance, or building intelligent assistants. Because the project is still evolving quickly, some features are stable enough for daily use, while others are research-grade and need device-specific debugging.",2,"2026-06-11 03:31:14","CREATED_QUERY"]