[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80647":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":15,"stars30d":13,"stars90d":14,"forks30d":14,"starsTrendScore":16,"compositeScore":17,"rankGlobal":9,"rankLanguage":9,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":9,"pushedAt":9,"updatedAt":35,"readmeContent":36,"aiSummary":37,"trendingCount":14,"starSnapshotCount":14,"syncStatus":38,"lastSyncTime":39,"discoverSource":40},80647,"prompt-cache-skills","OnlyTerp\u002Fprompt-cache-skills","OnlyTerp","Drop-in prompt-caching fixes for the LLM agent harness you use. Point your AI coding agent at this repo and it ships the patches.",null,"Python",99,6,50,0,1,3,2.54,"Other",false,"main",true,[23,24,25,26,27,28,29,30,31,32,33,34],"agents-md","ai-skills","aider","anthropic","claude-code","cline","llm-agents","openai","opencode","prompt-caching","prompt-engineering","roo-code","2026-06-12 02:04:04","\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fbanner.svg\" alt=\"prompt-cache-skills — Drop-in prompt-caching fixes for every LLM agent harness\" width=\"900\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"#how-to-use-it\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGet_Started-6366f1?style=for-the-badge&logoColor=white\" alt=\"Get Started\">\u003C\u002Fa>\n  \u003Ca href=\"skills\u002FREADME.md\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F13_Skills-8b5cf6?style=for-the-badge&logoColor=white\" alt=\"13 Skills\">\u003C\u002Fa>\n  \u003Ca href=\"docs\u002Fscorecard.md\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F13_Completed_Audits-3b82f6?style=for-the-badge&logoColor=white\" alt=\"13 Completed 
Audits\">\u003C\u002Fa>\n  \u003Ca href=\"#why-this-exists\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FUp_to_10x_Savings-10b981?style=for-the-badge&logoColor=white\" alt=\"10x Savings\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"LICENSE\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-CC--BY--4.0-blue.svg?style=flat-square\" alt=\"License\">\u003C\u002Fa>\n  \u003Ca href=\"CHANGELOG.md\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Faudit_date-2026--05--27-purple.svg?style=flat-square\" alt=\"Audit Date\">\u003C\u002Fa>\n  \u003Ca href=\"CONTRIBUTING.md\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPRs-welcome-brightgreen.svg?style=flat-square\" alt=\"PRs Welcome\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n---\n\nMost popular OSS agent harnesses (Cline, Roo Code, Continue, OpenCode,\nAider) leave **30-90% off your API bill** on the table because their\nprompt-caching code is subtly wrong, off-by-default, or just missing\nfor some providers.\n\nThis repo is a set of **drop-in skills** that any AI coding agent\n(Claude Code, Codex, Cline, Cursor, Devin, Gemini CLI, OpenCode…) can\nread and apply on its own.\n\nYou don't read the diffs. You point your agent at this repo and say:\n\n> *\"Apply every skill in this repo that matches the harnesses I use.\"*\n\nThe agent reads each `SKILL.md`, checks if it applies to your setup,\nlands the diff, and verifies the fix on the wire. 
You go from broken\nor partial caching to **80-99% cache hit rates** without doing the\nresearch yourself.\n\n---\n\n## What you actually save\n\nOne row per completed audit, so the coverage matches the scorecard:\n\n| Harness | Finding | Cost impact today | Fix \u002F status |\n|---------|---------|-------------------|--------------|\n| Claude Desktop Code | Default Desktop Code launches embedded Claude Code; clean Mac logs show non-zero cache read\u002Fcreate counters by default | Already gets Anthropic cache benefits; no prompt-caching fix needed | No skill; working baseline |\n| Codex CLI | Correct OpenAI cache design: stable `thread_id` cache key | Already gets OpenAI cache benefits | No skill; reference implementation |\n| Aider | `--cache-prompts` off by default; 5min TTL\u002Fkeepalive overhead | Many users get 0% cache reads unless they opt in; shorter cache window | Skills: default-on caching + 1h TTL |\n| OpenCode | Strong Anthropic path, but proxy\u002FBedrock edge cases exist | Some OpenAI-compatible→Anthropic\u002FBedrock routes miss cache | Skills: proxy detection + Bedrock doc-block fix |\n| Roo Code | Anthropic volatile-message bug; Bedrock custom ARN gap | Wastes breakpoints; custom ARNs can drop to 0% cache reads | Skills: volatile-msg fix + Bedrock custom ARN fix |\n| Cline | Anthropic volatile-message bug; OpenAI lacks `prompt_cache_key` | Wastes Anthropic breakpoint; OpenAI native can get 0% cache reads | Skills: volatile-msg fix + OpenAI cache key + timestamp pin |\n| Continue | Cache opt-in default; Gemini explicit caching missing; volatile-message bug | Many users get 0% cache reads; Gemini relies on implicit luck | Skills: default-on + volatile-msg + Gemini explicit cache |\n| Hermes \u002F Nous | Multi-provider cache plumbing works; xAI wire showed cached tokens | No verified savings bug in this audit | No skill; working audit |\n| Codex Desktop | ChatGPT Codex backend cache-scope headers observed\u002Finferred | No verified savings bug 
in this audit | No skill; inferred working |\n| Devin CLI | Raw CLI model path is opaque Codeium\u002FDevin protobuf | Cache behavior not inspectable from public CLI capture | No skill; unverified managed backend |\n| Windsurf \u002F Cascade | Closed desktop; model turn not captured from CLI | Cache behavior unverified | No skill; needs desktop capture |\n| Antigravity | Closed desktop; no model turn captured | Cache behavior unverified | No skill; needs desktop capture |\n| Grok CLI | Documented CLI chat proxy returns non-zero `prompt_tokens_details.cached_tokens` with real CLI headers | Already gets xAI cache benefits through managed proxy | No skill; working managed proxy |\n\n13 skills total cover the verified patchable OSS bugs. See\n[`skills\u002FREADME.md`](skills\u002FREADME.md) for the full index.\n\n---\n\n## How to use it\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fhow-it-works.svg\" alt=\"Three-step process: Point → Detect and Patch → Verify and Save\" width=\"900\">\n\u003C\u002Fp>\n\n### Option A — point any AI coding agent at this repo\n\nIn your agent of choice (Claude Code, Codex, Cline, Cursor, Devin, etc.):\n\n```\nRead https:\u002F\u002Fgithub.com\u002FOnlyTerp\u002Fprompt-cache-skills\n\nApply every skill in skills\u002F that matches the harnesses I currently\nuse. For each one:\n1. Confirm the target file exists in my project at the cited path.\n2. Apply the diff.\n3. Run the SKILL's Verify steps and confirm the assertion passes.\n4. If verify fails, revert and tell me why.\n```\n\nThat's it. 
The agent picks up the rest from each `SKILL.md`'s\nmachine-readable frontmatter and instructions.\n\n### Option B — install as a skill bundle in Claude Code \u002F Devin \u002F etc.\n\nIf you use one of the agents that supports a skills directory:\n\n```bash\n# Claude Code\ngit clone https:\u002F\u002Fgithub.com\u002FOnlyTerp\u002Fprompt-cache-skills ~\u002F.claude\u002Fskills\u002Fprompt-cache-skills\n\n# Devin\ngit clone https:\u002F\u002Fgithub.com\u002FOnlyTerp\u002Fprompt-cache-skills ~\u002F.config\u002Fdevin\u002Fskills\u002Fprompt-cache-skills\n\n# OpenCode\ngit clone https:\u002F\u002Fgithub.com\u002FOnlyTerp\u002Fprompt-cache-skills ~\u002F.config\u002Fopencode\u002Fskills\u002Fprompt-cache-skills\n```\n\nThen ask your agent:\n\n```\nRun the prompt-cache-skills bundle on this codebase.\n```\n\n### Option C — read and apply by hand\n\nEach [`skills\u002F\u003Cname>\u002FSKILL.md`](skills\u002F) is a complete fix: target,\nsymptom, diff, verification. Apply the relevant ones manually if you\ndon't trust your agent to do it.\n\n---\n\n## What's in here\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Farchitecture.svg\" alt=\"Repository structure — skills, audits, docs, and tools\" width=\"900\">\n\u003C\u002Fp>\n\n```\nprompt-cache-skills\u002F\n├── skills\u002F                       ← the fixes (this is what your agent reads)\n│   ├── cline-fix-volatile-msg\u002F\n│   ├── cline-openai-cache-key\u002F\n│   ├── cline-pin-timestamp\u002F\n│   ├── roo-fix-volatile-msg\u002F\n│   ├── roo-bedrock-custom-arn\u002F\n│   ├── continue-fix-volatile-msg\u002F\n│   ├── continue-enable-defaults\u002F\n│   ├── continue-gemini-explicit\u002F\n│   ├── opencode-detect-openai-compat\u002F\n│   ├── opencode-bedrock-doc-blocks\u002F\n│   ├── opencode-mistral-cache-key\u002F\n│   ├── aider-1h-ttl\u002F\n│   └── aider-cache-default-on\u002F\n├── audits\u002F                       ← evidence: completed audits + queued stubs\n│   ├── cline.md\n│   ├── roo-code.md\n│   
├── aider.md\n│   ├── opencode.md\n│   ├── continue.md\n│   ├── codex-cli.md              ← (reference, already correct)\n│   ├── claude-code.md\n│   ├── hermes-nous.md\n│   ├── codex-desktop.md\n│   ├── devin-cli.md\n│   ├── windsurf-cascade.md\n│   ├── antigravity.md\n│   ├── grok-cli.md\n│   └── queued stubs: crush, goose, aichat, gptme, avante-nvim, kilo-code\n├── docs\u002F                         ← the underlying API mechanics\n│   ├── concepts\u002F                 ← per-provider caching reference\n│   ├── gotchas.md                ← 16 numbered footguns\n│   ├── verification.md           ← how to confirm caching on wire\n│   └── scorecard.md              ← completed audits graded at a glance\n├── tools\u002F                        ← scripts to verify caching + doc consistency\n│   ├── check_cache.py            ← fire request twice, dump cache_* fields\n│   ├── check_docs_consistency.py ← assert counts\u002Ftables\u002Flinks don't drift\n│   ├── audit_harness.sh\n│   └── replay_harness.md\n└── AGENTS.md                     ← entry point for AI agents reading this repo\n```\n\n---\n\n## Why this exists\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fcost-savings.svg\" alt=\"Cost comparison: $7.50 without caching vs $0.75 with caching — 10x savings\" width=\"900\">\n\u003C\u002Fp>\n\nIf your agent harness sends 30,000 tokens of system prompt + tools per\nturn, on Claude 4.7 Opus that's $0.15 per turn uncached vs $0.015\ncached — a 10x difference. A 50-turn coding session costs $7.50 vs\n$0.75. **You're paying 10x what you should be** because the harness\nyou use either:\n\n- doesn't set `cache_control` at all,\n- sets it on volatile content that thrashes the cache,\n- doesn't set `prompt_cache_key` for OpenAI,\n- has caching gated behind a config flag you never set, or\n- just doesn't implement it for one of your providers.\n\nNone of these are hard to fix. They're all 5-15 line diffs. 
The\nhard part is knowing which one applies to your harness and getting it\nright. This repo does that work for you.\n\n---\n\n## The grade card\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fscorecard.svg\" alt=\"Audit scorecard — 13 harnesses grouped by cache status: working, needs fixes, and unverified\" width=\"900\">\n\u003C\u002Fp>\n\n13 completed harness audits, dated 2026-05-27. The original 7 include the default Claude Desktop Code baseline, source-recon audits for Codex CLI, Aider, OpenCode, Roo Code, Cline, and Continue, plus extended source\u002Fwire\u002Flocal-install audits for Hermes\u002FNous, Codex Desktop, Devin CLI, Windsurf\u002FCascade, Antigravity, and Grok CLI. Six more files in `audits\u002F` are queued stubs, not completed audits.\n\n| Harness | Anthropic | OpenAI | Bedrock | Gemini | Managed\u002Fother |\n|---------|-----------|--------|---------|--------|---------------|\n| Claude Desktop Code | **working (default Desktop Code verified)** | n\u002Fa | n\u002Fa | n\u002Fa | n\u002Fa |\n| Codex CLI | n\u002Fa | **working** | n\u002Fa | n\u002Fa | n\u002Fa |\n| Aider | working | automatic | n\u002Fa | n\u002Fa | n\u002Fa |\n| OpenCode | working | working | partial | n\u002Fa | n\u002Fa |\n| Roo Code | partial | working | partial | n\u002Fa | n\u002Fa |\n| Cline | partial | **broken** | unverified | n\u002Fa | n\u002Fa |\n| Continue | partial | partial | partial | broken | n\u002Fa |\n| Hermes \u002F Nous | **working** | working (Responses) | n\u002Fa | unverified | xAI working |\n| Codex Desktop | n\u002Fa | working* | n\u002Fa | n\u002Fa | ChatGPT Codex backend inferred |\n| Devin CLI | n\u002Fa | n\u002Fa | n\u002Fa | n\u002Fa | unverified (opaque protobuf) |\n| Windsurf \u002F Cascade | n\u002Fa | n\u002Fa | n\u002Fa | n\u002Fa | unverified (desktop not captured) |\n| Antigravity | n\u002Fa | n\u002Fa | n\u002Fa | unverified | unverified (desktop not captured) |\n| Grok CLI | n\u002Fa | n\u002Fa | n\u002Fa | n\u002Fa | 
**working (xAI CLI proxy cached tokens)** |\n\n\\* RE-backed or inferred from captured\u002Fcompanion wire shape where public source is unavailable; see the linked audit for caveats.\n\nFull per-provider breakdown with file:line citations in\n[`docs\u002Fscorecard.md`](docs\u002Fscorecard.md).\n\n---\n\n## Headline findings\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Ffindings.svg\" alt=\"Seven headline findings from the audit\" width=\"900\">\n\u003C\u002Fp>\n\n1. **The \"last 2 user messages\" pattern is a copy-paste bug** that\n   propagated Cline → Roo → Continue. All three burn a breakpoint on\n   the volatile current turn. Same one-line fix in each.\n2. **Cline OpenAI native is silently broken** — no `prompt_cache_key`,\n   no prefix-stability work. Users on Cline+OpenAI pay full price.\n3. **Gemini explicit caching is universally unimplemented.** Only\n   implicit (best-effort, free) caching engages, even on long\n   sessions with massive stable system prompts where explicit gives\n   a guaranteed 75% discount.\n4. **Codex CLI is the reference for OpenAI-side caching** — thread_id\n   as cache key, preserved across compaction and into sub-agents.\n5. **OpenCode's system-prompt split is the best Anthropic pattern.**\n6. **Hermes \u002F Nous has real multi-provider cache plumbing** — source\n   covers Anthropic\u002FOpenRouter\u002FNous\u002FQwen and xAI wire capture showed\n   `prompt_cache_key`, `x-grok-conv-id`, and non-zero cached tokens.\n7. **Closed managed surfaces need transport-aware capture.** Grok CLI\n   now verifies through its versioned chat proxy with non-zero cached\n   tokens; Devin remains protobuf, while Windsurf and Antigravity still\n   need desktop-driven captures.\n\n---\n\n## Trust but verify\n\nEvery skill ships with a Verify section that captures the wire and\nconfirms the fix landed. 
Don't take our word for it: the\n[`tools\u002Fcheck_cache.py`](tools\u002Fcheck_cache.py) script fires any\nrequest body twice (cold + warm) and prints the diff of `cache_*`\ntoken fields.\n\nRun it before and after applying a skill. You should see\n`cache_read_input_tokens` (Anthropic) or `cached_tokens` (OpenAI) or\n`cachedContentTokenCount` (Gemini) go from 0 to most of your input.\n\n---\n\n## Contributing\n\nWe accept new skills, new harness audits, and corrections. See\n[`CONTRIBUTING.md`](CONTRIBUTING.md). The bar is: a captured request\nbody + a verified hit-rate change. We don't take vibe submissions.\n\nBy participating you agree to our [Code of Conduct](CODE_OF_CONDUCT.md).\n\n## Security\n\nFound something that looks like a credential leak path, a request\nconstruction bug that leaks user secrets, or any other security\nissue? See [`SECURITY.md`](SECURITY.md) for the disclosure process.\nDon't open a public issue.\n\n## Changelog\n\nReleases are tracked in [`CHANGELOG.md`](CHANGELOG.md).\n\n## License\n\nSkills and audit prose: CC-BY-4.0. Code (`tools\u002F`): MIT.\n\n---\n\n\u003Cp align=\"center\">\n  \u003Cb>If this saved you money, star the repo and share it.\u003C\u002Fb>\u003Cbr>\n  \u003Csub>The whole point is that everyone gets caching working at once.\u003C\u002Fsub>\n\u003C\u002Fp>\n","OnlyTerp\u002Fprompt-cache-skills is a project for optimizing prompt caching in large language model (LLM) agents. It provides a set of drop-in fixes for several popular AI coding agent tools such as Claude Code, Codex, and Cline, and can raise cache hit rates to 80-99%, helping users save as much as 30-90% on API costs. The project is written in Python and integrates by automatically identifying and applying the skill files that match the user's setup. It is particularly suited to developers who want to improve the efficiency of AI-assisted development and reduce costs without digging into the technical details.",2,"2026-06-11 04:01:30","CREATED_QUERY"]