[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-81105":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":16,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":17,"rankGlobal":10,"rankLanguage":10,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":35,"readmeContent":36,"aiSummary":37,"trendingCount":15,"starSnapshotCount":15,"syncStatus":13,"lastSyncTime":38,"discoverSource":39},81105,"Tokenless","MaxForAI\u002FTokenless","MaxForAI","One command to cut token usage by up to 50%+","",null,"JavaScript",42,2,38,0,4,43.83,"MIT License",false,"main",true,[23,24,25,26,27,28,29,30,31,32,33,34],"agent","agent-skills","agentic-workflow","agents","anthropic","claude-code","claude-code-plugin","claude-code-skills","claude-skill","claude-skills","javascript","llm","2026-06-12 04:01:32","\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Ftokenless-logo.png\" alt=\"Tokenless faucet logo\" width=\"360\" \u002F>\n\u003C\u002Fp>\n\n\u003Ch1 align=\"center\">Tokenless\u003C\u002Fh1>\n\n\u003Cp align=\"center\">\n  \u003Cstrong>One command to cut token usage by up to 50%+.\u003C\u002Fstrong>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"#benchmarks--evidence\">\u003Cimg alt=\"vibe coding request reduction\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fvibe%20coding-47.3%25%20less%20request%20tokens-2dd4bf?style=for-the-badge\">\u003C\u002Fa>\n  \u003Ca href=\"#output-profiles\">\u003Cimg alt=\"chat response reduction\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fchat-80.0%25%20less%20response%20tokens-4ade80?style=for-the-badge\">\u003C\u002Fa>\n  \u003Ca href=\"LICENSE\">\u003Cimg alt=\"license 
MIT\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-f59e0b?style=for-the-badge\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"README.md\">English\u003C\u002Fa> ·\n  \u003Ca href=\"README.zh-CN.md\">简体中文\u003C\u002Fa> ·\n  \u003Ca href=\"README.ja.md\">日本語\u003C\u002Fa> ·\n  \u003Ca href=\"README.fr.md\">Français\u003C\u002Fa> ·\n  \u003Ca href=\"README.es.md\">Español\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"#before--after\">Before \u002F After\u003C\u002Fa> ·\n  \u003Ca href=\"#quick-start\">Quick start\u003C\u002Fa> ·\n  \u003Ca href=\"#installation\">Install\u003C\u002Fa> ·\n  \u003Ca href=\"#output-profiles\">Profiles\u003C\u002Fa> ·\n  \u003Ca href=\"#benchmarks--evidence\">Benchmarks & Evidence\u003C\u002Fa> ·\n  \u003Ca href=\"#roadmap\">Roadmap\u003C\u002Fa> ·\n  \u003Ca href=\"docs\u002Fbenchmarking.md\">Full benchmark guide\u003C\u002Fa>\n\u003C\u002Fp>\n\n---\n\n## Quick start\n\n```bash\nnpm install -g github:MaxForAI\u002FTokenless\ntokenless repair-hooks --user\ntokenless install-commands --user\ntokenless launch\n```\n\nThen use Claude Code normally. Switch profiles anytime:\n\n```bash\ntokenless style chat     # readable, shorter replies\ntokenless style coding   # dense coding output\ntokenless style off      # full hard-off\n```\n\nClaude Code gets expensive when every log, file read, diff, and long reply keeps getting carried into the next request.\n\nTokenless fixes that.\n\nIt keeps the raw evidence on your machine, sends Claude a compact version, and lets you expand the original output only when you need it.\n\n## Before \u002F After\n\n| Normal Claude Code | Claude Code with Tokenless |\n| --- | --- |\n| Reads a large file or log into future context repeatedly. | Stores the raw output locally and sends a compact packet. |\n| Verbose final replies become part of the next request history. | `chat` and `coding` profiles keep replies short. 
|\n| Agent trajectory can grow through repeated exploration and task-plan history. | Launcher trims Task\u002FPlan tools by default; packets reduce large read context. |\n\nExample large-read replacement:\n\n| Raw context | Tokenless context |\n| --- | --- |\n| Full file\u002Flog output is carried through API requests. | `TOKENLESS-READ-PACKET\u002F0.1` with artifact id, imports, symbols, snippets, nearby files, and exact expansion commands. |\n\n## Why Tokenless\n\nClaude Code sessions can become expensive because tool outputs, file reads, task-plan history, and verbose assistant replies are repeatedly carried through future API requests. Tokenless targets three sources of growth:\n\n- Large tool output: test logs, build logs, search results, tree output, diffs, large reads, and large successful edit\u002Fwrite results.\n- Agent trajectory overhead: repeated request context, high-overhead Task\u002FPlan tools, and large raw file payloads.\n- Response verbosity: optional `chat` and `coding` profiles reduce assistant output tokens.\n\n## Benchmarks & Evidence\n\nTokenless has two evidence layers: real Claude Code API-body measurements, and external research showing why shorter, denser context can reduce cost without automatically reducing quality.\n\n### Real Claude Code benchmark runs\n\nThese are API-body measurements from actual Claude Code sessions. 
The main metric is estimated request-body or response-body tokens from raw API logs, not local hook-side savings estimates.\n\n| Scenario | Baseline | Tokenless | Reduction |\n| --- | ---: | ---: | ---: |\n| 5-turn CRM vibe coding, `off` vs `coding` | 4,697,867 request tokens | 2,476,391 | 47.3% |\n| 6-turn natural conversation, `off` vs `chat` | 7,223 response tokens | 1,442 | 80.0% |\n| Large CSS visual edit | 1,017,642 request tokens | 403,995-473,354 | ~54-60% |\n| 10k-line React\u002FTSX edit | 917,137 request tokens | 545,456 | 40.5% |\n| Multifile React dashboard | 628,261 request tokens | 512,521 | 18.4% |\n| Task\u002FPlan tools enabled vs default launcher | 1,524,894 request tokens | 1,087,753 | 28.7% |\n\nThe strongest current product benchmark is the 5-turn CRM vibe-coding run: a non-specialist user gave vague iterative product-polish prompts. The public `coding` profile reduced request tokens by 47.3%, response tokens by 44.4%, and request count by 39.3% versus clean `off`.\n\nThe clean natural-conversation run isolates `chat`: no file tools or packet reducers were involved, and response tokens dropped by 80.0%.\n\nDetailed methodology and raw run notes are in [docs\u002Fbenchmarking.md](docs\u002Fbenchmarking.md) and [docs\u002Fstyle-benchmark.md](docs\u002Fstyle-benchmark.md).\n\n### Research backing\n\nThe research does not prove Tokenless automatically helps every session. It supports the benchmark premise: context and response length are controllable engineering variables, and less text can sometimes be cheaper, faster, and more accurate.\n\n| Paper | Why it matters for Tokenless |\n| --- | --- |\n| [Brevity Constraints Reverse Performance Hierarchies in Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.00025) | Brevity constraints improved large-model accuracy by 26.3 percentage points on inverse-scaling problems. Verbose is not always better. 
|\n| [Prompt Compression in the Wild](https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.02985) | Prompt compression can deliver real end-to-end speedups when workload, compression ratio, and hardware match; quality can remain statistically unchanged. |\n| [LLMLingua](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.05736) | Prompt compression can reduce inference cost while preserving semantic integrity under high compression ratios. |\n| [LongLLMLingua](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.06839) | Long-context compression can improve key-information perception while reducing cost and latency. |\n| [Selective Context](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.06201) | Pruning redundant context reported 50% context-cost reduction, 36% memory reduction, and 32% inference-time reduction with minor quality loss. |\n| [Gist Tokens](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.08467) | Learned prompt compression reached up to 26x prompt compression and up to 40% FLOPs reduction. |\n\n## Roadmap\n\nTokenless currently focuses on Claude Code context growth from tool output, file reads, and response verbosity. Next areas:\n\n- User prompt compression: identify repeated prompt patterns, compress user intent without losing constraints, and keep the original prompt recoverable.\n- Router-side optimization: reduce duplicated context and style overhead before requests hit the model backend.\n- Broader workflow support: keep Claude Code as the primary target, then evaluate adapters for other agentic coding tools where the same context-growth problem appears.\n\n## Installation\n\nInstall from GitHub:\n\n```bash\nnpm install -g github:MaxForAI\u002FTokenless\ntokenless repair-hooks --user\ntokenless launch\n```\n\nTokenless is currently distributed through GitHub. 
It has not been published to the public npm registry yet.\n\nFor local development from a checkout:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FMaxForAI\u002FTokenless.git\ncd Tokenless\nnpm install\nnpm link\ntokenless repair-hooks --user\ntokenless launch\n```\n\nIf Claude Code is not available as `claude` on your `PATH`, set `CLAUDE_BIN`:\n\n```bash\nCLAUDE_BIN=\u002Fpath\u002Fto\u002Fclaude tokenless launch\n```\n\nCheck installation status:\n\n```bash\ntokenless status --user\n```\n\n## Output profiles\n\nTokenless has three public profiles:\n\n| Profile | Behavior |\n| --- | --- |\n| `chat` | Default. Short, readable natural-language responses. Only changes output style. |\n| `coding` | Dense structured responses for coding workflows. Only changes output style. |\n| `off` | Full Tokenless hard-off. Disables style injection and compression hooks. |\n\nSet a profile:\n\n```bash\ntokenless style chat\ntokenless style coding\ntokenless style off\n```\n\nClaude Code slash command shortcuts:\n\n```text\n\u002Ftokenless-style-chat\n\u002Ftokenless-style-coding\n\u002Ftokenless-style-off\n```\n\nThe selected profile is stored at `~\u002F.tokenless\u002Fstyle.json` by default and persists across Claude Code restarts.\n\n`TOKENLESS_MODE=off` remains available as an environment-level hard-off switch for benchmark runs.\n\n## How it works\n\nFor noisy commands and large outputs:\n\n```text\nClaude requests a noisy tool call\n  -> Tokenless intercepts it with Claude Code hooks\n  -> The original command or output is processed locally\n  -> Raw stdout\u002Fstderr or file content is saved as an artifact\n  -> Claude receives a compact TOKENLESS-* packet\n  -> Claude expands only the relevant artifact slice if needed\n```\n\nExample read packet:\n\n```text\nTOKENLESS-READ-PACKET\u002F0.1\nfile: \u002Fpath\u002Fto\u002Fsrc\u002FApp.tsx\nartifact_id: ctx_20260518_abc123\nsummary: large TSX source packet with imports, declarations, snippets, and nearby 
files\n```\n\nExpand raw evidence when needed:\n\n```bash\ntokenless latest --data-dir ~\u002F.tokenless\ntokenless expand latest --around \"DashboardShell\" --data-dir ~\u002F.tokenless\ntokenless expand latest --lines 120:170 --data-dir ~\u002F.tokenless\n```\n\nThe shorter `acc` command remains available as a compatibility alias. `tokenless` is the preferred public command.\n\n## What Tokenless handles\n\nTokenless currently handles:\n\n- Test logs: `npm test`, `pytest`, `go test`, `cargo test`.\n- Build and CI logs: `npm run build`, `docker build`, `kubectl logs`.\n- Diffs and history: `git diff`, `git log`.\n- Search and tree output: `rg`, `grep -R`, `find`, `tree`, `ls -R`.\n- Large low-risk reads: CSS, HTML, JSON, logs, docs, generated files, large JS\u002FTS\u002FPython source files, and large Vue\u002FSvelte components.\n- Large successful edit\u002Fwrite tool results: conservative `TOKENLESS-EDIT-PACKET` and `TOKENLESS-WRITE-PACKET`.\n- Unexpectedly large Bash output through fallback compression.\n\nSmall bounded commands, such as `rg -m 20`, `find ... | head`, `cat file | grep`, and `tree | head`, pass through directly. Small reads are not compressed by default.\n\n## Read packets\n\nRead packets cap large low-risk file reads as `TOKENLESS-READ-PACKET\u002F0.1`.\n\nDefault policy:\n\n- Compress low-risk reads over roughly 4,000 estimated tokens: CSS, HTML, JSON, logs, docs, lockfiles, generated files.\n- Compress large JS\u002FTS\u002FReact\u002FPython source reads over roughly 30,000 estimated tokens with source-oriented packets.\n- Compress large Vue\u002FSvelte single-file components over roughly 12,000 estimated tokens with component-oriented packets.\n- Do not compress small files.\n- Do not compress source families such as `.go`, `.rs`, `.java`, `.swift`, or `.cpp` by default.\n\nRead packets are indexes, not edit proof. 
If Claude needs exact code or style, it should expand the relevant lines first:\n\n```bash\ntokenless expand ctx_abc --around \".target-card\" --data-dir ~\u002F.tokenless\ntokenless expand ctx_abc --lines 520:535 --data-dir ~\u002F.tokenless\n```\n\nFor CSS, SCSS, Less, HTML, HTM, and SVG files, read packets include deterministic editable summaries: variables, color palettes, likely editable selectors, media and animation rules, sections, ids\u002Fclasses, interactive elements, assets, and headings.\n\nLarge-file gate behavior:\n\n- Tokenless blocks raw access to large low-risk files until a read packet exists.\n- The hook prints the required next command, usually `tokenless read --agent --data-dir ~\u002F.tokenless \u003Cfile>`.\n- Read packets record file size and modified time.\n- If the file changes after packet creation, the packet is stale and Tokenless asks for a fresh packet.\n- For several related changes, expand the relevant lines and use one bounded edit while the packet is current.\n\n## Edit and write packets\n\nTokenless can cap large successful `Edit`, `MultiEdit`, and low-risk `Write` tool results.\n\nThe policy is intentionally conservative:\n\n- Successful `Edit` \u002F `MultiEdit` output over roughly 3,000 estimated tokens can become `TOKENLESS-EDIT-PACKET\u002F0.1`.\n- Successful low-risk `Write` output over roughly 5,000 estimated tokens can become `TOKENLESS-WRITE-PACKET\u002F0.1`.\n- Failed or risky outputs are never compressed.\n- Source-code `Write` outputs are not compressed by default.\n- Raw tool output is stored locally before replacement.\n\nRisk signals that pass through unchanged include:\n\n```text\nError\nFailed\nold_string\nnot found\nmultiple matches\nambiguous\npermission denied\nconflict\nNo changes\npartial\n```\n\nEdit\u002Fwrite packets do not claim semantic correctness. 
They only confirm the tool completed successfully, preserve the raw artifact, and mark previous read packets for that file as stale.\n\n## Launcher behavior\n\n`tokenless launch` starts Claude Code with normal read, edit, write, and bash tools available, but disables high-overhead Task\u002FPlan tools by default:\n\n```text\nTaskCreate, TaskUpdate, TaskList, TaskGet, EnterPlanMode, ExitPlanMode\n```\n\nThis reduces fixed tool-schema overhead and prevents task-list history from being repeatedly carried through API request context.\n\nOpt back into Task\u002FPlan tools for a session:\n\n```bash\nTOKENLESS_ALLOW_TASK_TOOLS=1 tokenless launch\n```\n\n## Slash commands\n\nInstall user-level Claude Code slash commands:\n\n```bash\ntokenless install-commands --user\n```\n\nInstalled commands:\n\n```text\n\u002Ftokenless\n\u002Ftokenless-style-chat\n\u002Ftokenless-style-coding\n\u002Ftokenless-style-off\n```\n\n`\u002Ftokenless` shows hook status, active mode, profile, savings, packet counts, pending gates, and latest artifact.\n\nRestart Claude Code after installing slash commands. 
If you previously installed older Tokenless commands, clean them up with:\n\n```bash\ntokenless uninstall-commands --user\ntokenless install-commands --user\n```\n\n## Common CLI commands\n\n```bash\ntokenless --help\ntokenless status --user\ntokenless latest --data-dir ~\u002F.tokenless\ntokenless list --data-dir ~\u002F.tokenless\ntokenless stats --data-dir ~\u002F.tokenless\ntokenless show latest --data-dir ~\u002F.tokenless\ntokenless expand latest --around \"Cannot find module\" --data-dir ~\u002F.tokenless\ntokenless clean --data-dir ~\u002F.tokenless --keep 100 --dry-run\ntokenless style status\ntokenless style coding\ntokenless api-usage --since 24h\n```\n\n`tokenless stats` separates local savings by source:\n\n- `hook`: real Claude Code hook-path compression.\n- `eval`: local evaluation fixtures.\n- `smoke`: manual probes, doctor checks, and ad hoc compression runs.\n- `legacy`: records created before source tagging.\n\n## Benchmarking\n\nFor API-body verification, enable raw Claude Code API body capture:\n\n```bash\nexport CLAUDE_CODE_ENABLE_TELEMETRY=1\nexport OTEL_LOG_RAW_API_BODIES=\"file:\u003Capi-body-dir>\"\n```\n\nInspect captured request\u002Fresponse bodies:\n\n```bash\nnode plugins\u002Fclaude-code\u002Fbin\u002Ftokenless api-probe stats \\\n  --dir \"\u003Capi-body-dir>\" \\\n  --data-dir ~\u002F.tokenless\n```\n\nA valid clean `off` run must show:\n\n```text\nTOKENLESS-READ-PACKET: request=0\nTOKENLESS-EDIT-PACKET: request=0\nTOKENLESS-WRITE-PACKET: request=0\nrequest_saved_estimate: 0\n```\n\nRaw API bodies can contain full prompts, tool outputs, and sensitive local context. 
Keep capture disabled unless you are verifying what enters model context.\n\n## Development and validation\n\nRun the main smoke checks:\n\n```bash\nnpm run eval:complex\nnpm run eval:read\nnpm run eval:edit\nnpm run eval:cli-smoke\nnpm run doctor\n```\n\nRun all synthetic and captured cases:\n\n```bash\nnpm run eval:all\n```\n\nCheck hook install state:\n\n```bash\nnpm run tokenless:status\nnpm run tokenless:install:dry-run\nnpm run tokenless:repair-hooks:dry-run\nnpm run tokenless:uninstall:dry-run\n```\n\nTest the actual Claude Code hook path inside Claude Code:\n\n```bash\nnpm run test:complex\n```\n\nExpected behavior:\n\n```text\nPreToolUse caps the noisy command\nClaude reruns node ...\u002Fbin\u002Facc run --agent ...\nClaude receives TOKENLESS-PACKET\u002F0.1\nRaw artifact can be expanded with tokenless expand latest\n```\n\n## Cleanup\n\nArtifacts are stored under the selected `--data-dir`, usually `~\u002F.tokenless\u002Fartifacts`.\n\nPreview cleanup:\n\n```bash\nnpm run tokenless:clean:dry-run\n```\n\nDelete old artifacts:\n\n```bash\ntokenless clean --data-dir ~\u002F.tokenless --older-than 7d\n```\n\nKeep only the newest 100 artifacts:\n\n```bash\ntokenless clean --data-dir ~\u002F.tokenless --keep 100\n```\n\n## Privacy and safety model\n\n- Tokenless runs locally.\n- Raw artifacts stay on local disk under the configured data directory.\n- Tokenless does not call a separate LLM or cloud summarization service.\n- Reducers are deterministic and intentionally conservative.\n- Risky failed outputs pass through unchanged.\n- Exact legal, financial, medical, security, and code-review work may require explicit artifact expansion.\n\n## Limitations\n\n- Claude Code hooks are the primary integration target.\n- Reducers are policy-based and may miss some noisy outputs.\n- Small outputs can expand slightly if forced through Tokenless; classifiers avoid common bounded commands, but the policy is not perfect.\n- Read packets are useful evidence, not a substitute 
for exact line expansion before high-risk edits.\n- API-body token counts are estimates, not exact billed-token accounting.\n\n## Star this repo\n\nTokenless saves tokens and keeps the raw evidence. Star costs zero. Fair trade.\n\n[![Star History Chart](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=MaxForAI\u002FTokenless&type=Date)](https:\u002F\u002Fstar-history.com\u002F#MaxForAI\u002FTokenless&Date)\n\n## Acknowledgements\n\nTokenless is an independent implementation and does not use code from [caveman](https:\u002F\u002Fgithub.com\u002FJuliusBrussee\u002Fcaveman). We still want to acknowledge caveman as part of the broader open-source conversation around shorter, lower-token agent output.\n\n## License\n\nMIT. Free to use, modify, and ship.\n\n## Contributors\n\n- Max Liu\n- Codex, AI coding assistant\n","Tokenless is a tool that aims to cut token usage by up to 50% or more with a single command. Written in JavaScript, it stores raw output such as large files and logs locally and sends Claude Code a compact packet, expanding the details only when needed, which reduces redundant data in API requests. It supports several profiles (such as chat and coding), so users can adjust the response style to get more concise or more detailed replies. It suits workflows that use Claude Code heavily for development, testing, and code review, and it can noticeably cut costs and improve efficiency, especially when handling large volumes of text.","2026-06-11 04:03:33","CREATED_QUERY"]