[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-75485":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},75485,"Tactile","yliust\u002FTactile","yliust","Tactile: an accessibility-first operating layer for agents.","",null,"Python",595,12,8,0,15,73,544,53,7.34,"Other",false,"main",true,[],"2026-06-12 02:03:34","\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Flogo.png\" alt=\"Tactile logo\" width=\"160\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cstrong>English\u003C\u002Fstrong> · \u003Ca href=\"README_zh.md\">简体中文\u003C\u002Fa>\n\u003C\u002Fp>\n\n# Tactile\n\n**An accessibility-first operating layer for agents.**\n\n> Stop guessing pixels. Start touching semantics.\n\nTactile is not another computer-use agent. It is a skill, protocol, and tool layer that helps agents operate software through accessibility semantics first.\n\nWhen an agent needs to use an application, Tactile asks it not to begin with screenshots, guessed coordinates, and pixel-level clicks. 
Instead, it should first inspect the semantic information already exposed by the operating system and the application:\n\n- What role does this element have?\n- Does it have an accessible name?\n- Is it clickable, selected, focused, enabled, or disabled?\n- Where does it sit in the UI hierarchy?\n- Does it expose an action that can be invoked directly?\n\nIn that sense, Tactile gives agents a way to feel the structure of software before reaching for vision.\n\nThis information already exists for screen readers and assistive technologies. Tactile makes it the first entry point for agents as well. The easier software is for Tactile to operate, the more likely it is to support genuinely accessible interaction for humans as well.\n\n**Agent-ready software should also be human-accessible software.**\n\n\n## Demo\n\n**Tactile gives agents a sense of touch.**\n\n### Lark and WeChat workflow\n\nThis demo video was also edited by an agent using the Tactile skill to operate CapCut.\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F49dc6bfe-0661-4ab0-9099-be3849b4137a\n\n### CapCut video-editing workflow\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F7bc0f05e-9228-4cf1-abe3-ffb7e4722be2\n\n\n## How to Use Tactile\n\n### Prefer the macOS MCP\n\nFor macOS, you can use the dedicated Tactile MCP in\n`mcps\u002Ftactile-macos-mcp`. 
It is the recommended entry point when available:\nit is faster, easier to use, and exposes Tactile's accessibility-first workflow\ndirectly through MCP tools.\n\nUse the MCP server with:\n\n```bash\nmcps\u002Ftactile-macos-mcp\u002Fbin\u002Ftactile-macos-mcp\n```\n\nYou can also directly ask your agent (Codex \u002F Claude Code) to install this MCP:\nhttps:\u002F\u002Fgithub.com\u002Fyliust\u002FTactile\u002Ftree\u002Fmain\u002Fmcps\u002Ftactile-macos-mcp\n\n### Use Tactile as a skill\n\nAsk your agent to configure this skill from the repository:\n\n```txt\nConfigure this skill for me (make sure to choose the version for the corresponding operating system): https:\u002F\u002Fgithub.com\u002Fyliust\u002FTactile\n```\n\nIf using an API:\n\n```bash\nexport TACTILE_OPENAI_BASE_URL=xxxxxxx\nexport TACTILE_OPENAI_API_KEY=xxxxxxx\nexport TACTILE_MODEL='gpt-5.5'\n```\n\n\n## Why Tactile?\n\nMany computer-use agents start from screenshots:\n\n```txt\nlook at screenshot -> infer element -> predict coordinates -> click -> inspect screenshot again\n```\n\nThis approach is general, but fragile. Tactile changes the order of operations:\n\n```txt\nread accessibility semantics -> use OCR-grounded coordinates when needed -> fall back to visual computer use\n```\n\nAgents should not only see software on a screen. When better information is available, they should first touch the structure of the interface.\n\n\n## Tactile v0\n\nTactile v0 will begin as a skill.\n\nIts goal is to package an accessibility-first operating method for agents:\n\n1. **Use accessibility semantics first**\n\n   If the system or application exposes useful accessibility information, the agent should use element roles, names, hierarchy, state, and actions to understand and operate the interface.\n\n2. **Use OCR + coordinates when semantics are incomplete**\n\n   If an element is not fully represented in the accessibility layer but the visible text is readable, the agent can use system OCR. 
System OCR usually returns both text and coordinates, which makes it a text-grounded fallback rather than pure visual guessing. For clear text buttons and labels, this can reduce token usage, retries, and time.\n\n3. **Fall back to the agent's native visual operating logic**\n\n   If the accessibility layer is unavailable, OCR cannot locate the target, or the current interface is canvas-based, game-like, remote, image-heavy, or otherwise semantically opaque, the agent can fall back to its own runtime or tool-specific visual operating logic.\n\nTactile provides operating strategy and method tools. It does not take over all agent decisions. When to downgrade, retry, or hand control back to the agent remains context-dependent.\n\n\n## Workflow\n\nTactile recommends the following operating ladder:\n\n```txt\nLevel 1: Accessibility semantics\n  Read the accessibility tree\n  Operate through element names, roles, states, hierarchy, and actions\n  Best for standard UI such as buttons, text fields, menus, tables, dialogs, and lists\n\nLevel 2: OCR-grounded coordinates\n  Use system OCR to read visible text and its coordinates\n  Use text locations to click, type, and verify\n  Best for interfaces with incomplete accessibility metadata but readable text\n\nLevel 3: Native visual computer use\n  Use the agent's existing screenshot understanding, visual reasoning, and coordinate actions\n  Best for image-based interfaces or environments with little usable semantic structure\n```\n\nHumans and agents can move faster when they can share the same semantic path through software.\n\n\n## Verification\n\nTactile is concerned not only with where an agent clicked, but also with whether the task actually succeeded.\n\nAfter each operation, the agent should verify the result whenever possible:\n\n1. 
**Prefer accessibility-state verification**\n\n   For example: whether a button became disabled, a checkbox became selected, a text field value changed, a dialog closed, or a new list item appeared.\n\n2. **Use OCR verification when accessibility state is insufficient**\n\n   If visible text changes, the agent can use OCR to check whether the expected text, error message, success state, or page title appeared.\n\n3. **Use screenshot-based visual verification as the final fallback**\n\n   When semantics and OCR are not enough, the agent can use screenshot understanding and visual reasoning to confirm the result.\n\nVerification failure does not always mean the action failed. It means the interface did not provide enough reliable feedback, and the agent may need to retry, choose another path, or fall back to a more general visual operating method.\n\n\n## Why Build This Ecosystem?\n\nMany attempts to make agents better require new agent-friendly interfaces. Tactile asks a different question: is there an interface that can serve both humans and agents?\n\nWe have found that when agents use accessibility entry points, they can operate more reliably. 
At the same time, if agents begin to depend on accessibility, long-standing accessibility gaps become easier to notice:\n\n- Buttons without readable names\n- Incorrect control roles\n- Dialogs that are invisible to the accessibility tree\n- State changes that are not exposed to assistive technologies\n- Custom components that are visible only to sighted users\n- Incomplete keyboard and screen reader paths\n\nThese problems affect agents, but they also affect real users, especially people who depend on screen readers, keyboard navigation, and assistive technologies.\n\nTactile's long-term goal is not only to help agents operate computers better.\n\nIt also aims to encourage software ecosystems to expose better semantic structure, so that agent-ready software can also become accessible software for all humans.\n\n\n## Current Status\n\nTactile is still early.\n\nThe first version connects to Codex as a skill. In early tests on macOS applications with reasonable accessibility support, the accessibility-first workflow can significantly reduce screenshot reasoning and coordinate retries, though of course it is not universal. As execution experience is distilled into reusable strategies, examples, and tool constraints, the skill can continue to improve task outcomes; this kind of experience reuse has proven valuable across many forms of automation work.\n\nWe are also seeing that even many widely used applications still lack strong accessibility support. At the same time, developers are already being asked to adapt to a growing number of agent-specific interfaces. 
Tactile explores whether these paths can converge.\n\nA longer-term goal is to provide interface layers for software that has not yet implemented sufficient accessibility support for humans and AI.\n\n\n## Acknowledgements\n\nTactile is built on decades of work from accessibility communities, screen readers, assistive technologies, operating-system accessibility APIs, OCR systems, UI automation projects, agent runtimes, and open-source developers.\n\nWe are grateful to everyone who has helped make software more readable, operable, and adaptable. Tactile hopes to connect that work with the agent era, and to make the same semantic infrastructure useful to both humans and AI.\n\n\n## Join Us\n\nIf you care about Agentic AI, desktop automation, operating systems, accessibility technology, or simply believe software should be easier for both agents and humans to use, you are welcome to join Tactile.\n\n**Accessible to humans. Operable by agents.**\n","Tactile is an accessibility-first operating layer for agents. It helps agents operate software by prioritizing the semantic information already exposed by the operating system and applications, rather than relying on screenshots, guessed coordinates, or pixel-level clicks. Its core capabilities include identifying element roles and accessible names, determining whether an element is clickable, and understanding the UI hierarchy. Tactile suits scenarios that need more stable and accurate automation scripts, particularly cross-platform application testing and assistive technology development. In this way, Tactile not only lets machines understand and operate software interfaces more intelligently, but also encourages interaction design that is friendlier to human users.",2,"2026-06-11 03:52:56","CREATED_QUERY"]