[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-2455":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":13,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":24,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":36,"readmeContent":37,"aiSummary":38,"trendingCount":16,"starSnapshotCount":16,"syncStatus":14,"lastSyncTime":39,"discoverSource":40},2455,"code-context-engine","elara-labs\u002Fcode-context-engine","elara-labs","Save 94% on AI coding tokens. Index your codebase, agents search instead of reading files. Works with Claude Code, Codex, Copilot, Cursor, Gemini CLI. Local MCP server, free, open source.","https:\u002F\u002Felara-labs.github.io\u002Fcode-context-engine\u002F",null,"Python",159,18,2,15,0,10,62,30,79.04,"MIT License",false,"main",true,[26,27,28,29,30,31,32,33,34,35],"ai-coding","claude","claude-code","code-indexing","cursor","llm-tools","mcp-server","open-source","save-tokens","token-savings","2026-06-12 04:00:14","\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Felara-labs\u002Fcode-context-engine\u002Fmain\u002Fdocs\u002Flogo.svg\" alt=\"Code Context Engine\" width=\"140\">\n\u003C\u002Fp>\n\n\u003Ch1 align=\"center\">Code Context Engine\u003C\u002Fh1>\n\n\u003Cp align=\"center\">\n  \u003Cstrong>Index your codebase. 
AI searches instead of re-reading files.\u003Cbr>94% token savings, reproducibly benchmarked.\u003C\u002Fstrong>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Felara-labs.github.io\u002Fcode-context-engine\u002F\">Website\u003C\u002Fa> · \u003Ca href=\"https:\u002F\u002Felara-labs.github.io\u002Fcode-context-engine\u002Fguide\u002F\">Docs\u003C\u002Fa> · \u003Ca href=\"https:\u002F\u002Felara-labs.github.io\u002Fcode-context-engine\u002Fguide\u002Fwhy-cce\u002F\">Why CCE?\u003C\u002Fa> · \u003Ca href=\"https:\u002F\u002Felara-labs.github.io\u002Fcode-context-engine\u002Fblog\u002Fbenchmark-fastapi.html\">Benchmark\u003C\u002Fa> · \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Felara-labs\u002Fcode-context-engine\">GitHub\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cbr>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fpypi.org\u002Fproject\u002Fcode-context-engine\u002F\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fcode-context-engine?style=flat-square&color=blue&label=PyPI\" alt=\"PyPI\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fpepy.tech\u002Fproject\u002Fcode-context-engine\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fpepy\u002Fdt\u002Fcode-context-engine?style=flat-square&label=downloads&color=blue\" alt=\"Downloads\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Felara-labs\u002Fcode-context-engine\u002Factions\u002Fworkflows\u002Fci.yml\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Factions\u002Fworkflow\u002Fstatus\u002Felara-labs\u002Fcode-context-engine\u002Fci.yml?style=flat-square&label=CI\" alt=\"CI\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fregistry.modelcontextprotocol.io\u002F?q=code-context-engine\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FMCP_Registry-listed-brightgreen?style=flat-square\" alt=\"MCP Registry\">\u003C\u002Fa>\n  \u003Ca 
href=\"https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-yellow?style=flat-square\" alt=\"MIT License\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Felara-labs\u002Fcode-context-engine\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Felara-labs\u002Fcode-context-engine?style=flat-square&label=stars\" alt=\"Stars\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Csub>Python 3.11+ · macOS · Linux · Windows\u003C\u002Fsub>\n\u003C\u002Fp>\n\n\u003Cbr>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"#install-and-see-savings-in-60-seconds\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FClaude_Code-352318?style=for-the-badge&logo=anthropic&logoColor=D4A27F\" alt=\"Claude Code\">\u003C\u002Fa>&nbsp;\n  \u003Ca href=\"#install-and-see-savings-in-60-seconds\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FVS_Code-007ACC?style=for-the-badge&logo=visualstudiocode&logoColor=white\" alt=\"VS Code\">\u003C\u002Fa>&nbsp;\n  \u003Ca href=\"#install-and-see-savings-in-60-seconds\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCursor-000?style=for-the-badge\" alt=\"Cursor\">\u003C\u002Fa>&nbsp;\n  \u003Ca href=\"#install-and-see-savings-in-60-seconds\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGemini_CLI-4285F4?style=for-the-badge&logo=google&logoColor=white\" alt=\"Gemini CLI\">\u003C\u002Fa>&nbsp;\n  \u003Ca href=\"#install-and-see-savings-in-60-seconds\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCodex_CLI-412991?style=for-the-badge\" alt=\"Codex CLI\">\u003C\u002Fa>&nbsp;\n  \u003Ca href=\"#install-and-see-savings-in-60-seconds\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FOpenCode-22C55E?style=for-the-badge&logo=gnometerminal&logoColor=white\" alt=\"OpenCode\">\u003C\u002Fa>&nbsp;\n  \u003Ca 
href=\"#install-and-see-savings-in-60-seconds\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTabnine-4B32C3?style=for-the-badge&logo=tabnine&logoColor=white\" alt=\"Tabnine\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Csub>One command. Auto-detects your editor. Zero cloud, zero config.\u003C\u002Fsub>\n\u003C\u002Fp>\n\n\u003Cbr>\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Felara-labs\u002Fcode-context-engine\u002Fmain\u002Fdocs\u002Fdemo.gif\" alt=\"CCE Demo\" width=\"720\">\n\u003C\u002Fp>\n\n---\n\n## Use cases\n\n| | Use case | How CCE helps |\n|---|---|---|\n| **💰** | **Reduce Claude Code costs** | 94% fewer input tokens per session |\n| **🔒** | **Keep code private** | Everything local, no cloud indexing |\n| **🔄** | **Multi-editor teams** | One index across Claude Code, Cursor, VS Code, Gemini CLI |\n| **🧠** | **Cross-session memory** | Decisions and context survive restarts |\n| **⚡** | **Faster responses** | Less context = faster Claude replies |\n| **📊** | **Track actual savings** | Dollar amounts, not estimates |\n\n---\n\n## Quick start\n\n```bash\nuv tool install \"code-context-engine[local]\"    # or: pipx install \"code-context-engine[local]\"\ncd \u002Fpath\u002Fto\u002Fyour\u002Fproject\ncce init                                        # or: cce init --agent all\n```\n\nThat's it. Your AI coding agent now searches your index instead of reading entire files.\n\n> **Already have Ollama?** You can skip `[local]` and use `uv tool install code-context-engine` instead. 
CCE auto-detects Ollama at localhost:11434 and uses `nomic-embed-text`.\n\n---\n\n## System requirements\n\n- Python 3.11+ (tested on 3.11, 3.12, 3.13)\n- A C compiler and `cmake` (needed to build tree-sitter grammars)\n\n| Platform | Setup |\n|----------|-------|\n| **macOS** | `xcode-select --install` (provides compiler and cmake) |\n| **Ubuntu\u002FDebian** | `sudo apt install build-essential cmake` |\n| **Fedora\u002FRHEL** | `sudo dnf install gcc gcc-c++ cmake` |\n| **Windows** | Install [Visual Studio Build Tools](https:\u002F\u002Fvisualstudio.microsoft.com\u002Fvisual-cpp-build-tools\u002F) (C++ workload) and [CMake](https:\u002F\u002Fcmake.org\u002Fdownload\u002F) |\n\nTested on all three platforms in CI (macOS, Linux, Windows × Python 3.11\u002F3.12\u002F3.13).\n\n## Install and see savings in 60 seconds\n\nYou need an embedding backend to index code. Pick one:\n\n| Option | Install command | Size | Requires |\n|--------|----------------|------|----------|\n| **Local (recommended)** | `uv tool install \"code-context-engine[local]\"` | +60 MB | Nothing else |\n| **Ollama** | `uv tool install code-context-engine` | Core only | Ollama running + `nomic-embed-text` pulled |\n\nThen:\n\n```bash\ncd \u002Fpath\u002Fto\u002Fyour\u002Fproject\ncce init                              # index, install hooks, register MCP server\n```\n\nRestart your editor. Done. Every question now hits the index instead of re-reading files.\n\n`cce init` auto-detects your editor and writes the right config. 
To target a\nspecific agent, use `--agent claude`, `--agent codex`, `--agent copilot`, or\n`--agent all`.\n\n| Editor | Config written | Instructions |\n|--------|---------------|--------------|\n| Claude Code | `.mcp.json` | `CLAUDE.md` |\n| VS Code \u002F Copilot | `.vscode\u002Fmcp.json` | `.github\u002Fcopilot-instructions.md` |\n| Cursor | `.cursor\u002Fmcp.json` | `.cursorrules` |\n| Gemini CLI | `.gemini\u002Fsettings.json` | `GEMINI.md` |\n| OpenAI Codex | `~\u002F.codex\u002Fconfig.toml` (user-global, per-project section) | `AGENTS.md` |\n| OpenCode | `opencode.json` | |\n| Tabnine | `.tabnine\u002Fagent\u002Fsettings.json` | `TABNINE.md` |\n\nMultiple editors in the same project? All get configured in one command.\n\n**Codex note:** Codex CLI reads MCP servers from `~\u002F.codex\u002Fconfig.toml` only —\nit has no per-project config. `cce init` adds one `[mcp_servers.cce-\u003Cproject>-\u003Chash>]`\nsection per project so multiple projects coexist; `cce uninstall` removes only\nthe section for the current project.\n\n```\n  my-project · 38 queries\n\n  ⛁ ⛁ ⛁ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶ ⛶  94% tokens saved\n\n  Without CCE   48.0k  tokens   $0.14\n  With CCE       3.4k  tokens   $0.01\n  ──────────────────────────────────────────\n  Saved         44.6k  tokens   $0.13\n\n  Cost estimate based on Sonnet input pricing ($3\u002F1M tokens)\n```\n\n---\n\n## Why this matters\n\nInput tokens are 85-95% of your Claude Code bill. 
CCE cuts them by 94% ([benchmarked on FastAPI](#benchmark-fastapi-reproducible)).\n\n```\nWithout CCE:    Claude reads payments.py + shipping.py   = 45,000 tokens\nWith CCE:       context_search \"payment flow\"            =    800 tokens\n```\n\n| | Without CCE | With CCE |\n|---|---|---|\n| Session startup | Re-reads files every time | Queries the index |\n| Finding a function | Read entire 800-line file | Get the 40-line function |\n| Cross-session memory | None | Decisions + code areas persisted |\n| Token cost (Sonnet, medium project) | ~$0.14\u002Fsession | ~$0.04\u002Fsession |\n\n---\n\n## Benchmark: FastAPI (reproducible)\n\nWe benchmarked CCE against [FastAPI](https:\u002F\u002Fgithub.com\u002Ffastapi\u002Ffastapi) (53 source files, 180K tokens) with 20 real coding questions. No cherry-picking, no synthetic queries.\n\n**Methodology:** For each query, \"without CCE\" means reading the full content of every file the query touches. \"With CCE\" means the relevant chunks after compression.\n\n**Important baseline note:** The 94% number is measured against full-file reads, not against what Claude Code actually does. In practice, Claude Code already uses grep, partial file reads, and targeted tools, so the real-world savings compared to normal Claude Code behavior will be lower than 94%. We use full-file as the baseline because it's reproducible and deterministic (no agent behavior variability). 
The benchmark measures CCE's retrieval efficiency, not a head-to-head comparison with Claude Code's built-in exploration.\n\n| Metric | Result |\n|--------|--------|\n| **Retrieval savings** | **94%** (83,681 → 4,927 tokens\u002Fquery) |\n| Compression (additional, on retrieved chunks) | 89% (4,927 → 523 tokens\u002Fquery) |\n| Recall@10 (found the right files) | 0.90 |\n| Latency p50 | 0.4ms |\n| Queries tested | 20 |\n\n### Per-Layer Savings (each measured independently)\n\n| Layer | What it does | Savings | Method |\n|-------|-------------|---------|--------|\n| **Retrieval** | Full files → relevant code chunks | 94% | measured |\n| **Chunk Compression** | Raw chunks → signatures + docstrings | 89% | measured |\n| **Grammar** | Drops articles\u002Ffillers from memory text | 13% | measured |\n\nOutput compression (reducing Claude's reply length) provides additional savings (~65% estimated) but is not included in the headline number above.\n\n### Multi-language benchmarks\n\n| Repo | Language | Files | Retrieval savings | Recall@10 |\n|------|----------|-------|-------------------|-----------|\n| [FastAPI](benchmarks\u002Fresults\u002Ffastapi.md) | Python | 53 | **94%** | 0.90 |\n| [chi](benchmarks\u002Fresults\u002Fchi.md) | Go | 94 | **76%** | 0.67 |\n| [fiber](benchmarks\u002Fresults\u002Ffiber.md) | Go (monorepo) | 396 | **93%** | 0.07 |\n\nGo's shorter files reduce the retrieval headroom (smaller baseline). Monorepos dilute recall at top-10 (fiber). Middleware queries with one-feature-per-file hit R=1.00 consistently.\n\n**Reproduce it yourself:**\n\n```bash\npip install code-context-engine\npython benchmarks\u002Frun_benchmark.py --repo https:\u002F\u002Fgithub.com\u002Ffastapi\u002Ffastapi.git --source-dir fastapi\npython benchmarks\u002Frun_benchmark.py --repo https:\u002F\u002Fgithub.com\u002Fgo-chi\u002Fchi.git --source-dir .\n```\n\nFull results in [`benchmarks\u002Fresults\u002F`](benchmarks\u002Fresults\u002F). 
Queries and methodology in [`benchmarks\u002F`](benchmarks\u002F).\n\n---\n\n## What you get\n\n**9 MCP tools** that Claude uses automatically:\n\n| Tool | What it does |\n|------|-------------|\n| `context_search` | Hybrid vector + BM25 search with graph expansion |\n| `expand_chunk` | Full source for a compressed result |\n| `related_context` | Find code via graph edges (calls, imports) |\n| `session_recall` | Recall decisions from past sessions |\n| `record_decision` | Save a decision for future sessions |\n| `record_code_area` | Record which files were worked in |\n| `index_status` | Check index freshness |\n| `reindex` | Re-index a file or the full project |\n| `set_output_compression` | Adjust response verbosity (`off` \u002F `lite` \u002F `standard` \u002F `max`) |\n\n**Live dashboard** with donut charts, file health, and session history:\n\n```bash\ncce dashboard\n```\n\n![CCE Dashboard](https:\u002F\u002Fraw.githubusercontent.com\u002Felara-labs\u002Fcode-context-engine\u002Fmain\u002Fdocs\u002Fdashboard.png)\n\n**Dollar estimates** fetched from live Anthropic pricing:\n\n```bash\ncce savings --all    # see savings across all projects\n```\n\n---\n\n## How is CCE different?\n\nCCE is editor-agnostic, local-first, and gives you measurable token savings. Your code never leaves your machine. Unlike built-in indexing (Cursor, Continue), CCE works across Claude Code, VS Code, Cursor, Gemini CLI, and Codex with a single index. Unlike cloud tools (Greptile), it's free and private.\n\nSee the [full comparison with alternatives](docs\u002Fcomparison.md) for an honest look at trade-offs.\n\n---\n\n## How it works (the short version)\n\n1. **Index:** Tree-sitter parses your code into semantic chunks (functions, classes, modules). Stored as vector embeddings locally.\n2. **Search:** Claude calls `context_search`. Hybrid vector + BM25 retrieval finds the right chunks. Code graph adds related files automatically.\n3. 
**Compress:** Chunks are truncated to signatures + docstrings (or LLM-summarized if Ollama is running).\n4. **Remember:** Decisions and code areas persist across sessions via `session_recall`.\n5. **Track:** Every query is logged. `cce savings` shows exactly how much you saved.\n\nRe-indexing after edits takes under 1 second (96% embedding cache hit rate). Git hooks keep the index current automatically.\n\n---\n\n## What makes CCE different\n\n### It saves where the money is\n\nOutput compression tools (like Caveman) save 20-75% on output tokens. Output is 5-15% of your bill. Net savings: ~11%.\n\nCCE saves on **input** tokens (94% retrieval savings on FastAPI, [reproducibly benchmarked](#benchmark-fastapi-reproducible)). Input is 85-95% of your bill.\n\n### It actually understands your code\n\nNot a text search. Tree-sitter AST parsing creates semantic chunks. Hybrid retrieval merges vector similarity with BM25 keyword matching via Reciprocal Rank Fusion. A confidence scorer blends similarity (50%), keyword match (30%), and recency (20%). Graph expansion walks CALLS\u002FIMPORTS edges to pull in related code.\n\n### It remembers\n\n`record_decision(\"use JWT for auth\", reason=\"session tokens flagged by legal\")` is stored in SQLite and surfaces via `session_recall` in the next session. No re-explaining your architecture.\n\n### It tracks real savings\n\nNot estimates. Actual tokens served vs full-file baseline, broken down by buckets (retrieval, compression, output, memory, grammar). Dollar costs fetched from Anthropic's pricing page. Savings summary shown at every session start.\n\n### It is secure by default\n\nSecret files (.env, *.pem, credentials.json) are never indexed. Content is scanned for AWS keys, GitHub tokens, Slack tokens, Stripe keys, JWTs, and generic credentials. PII (emails, IPs, SSNs, credit cards) is scrubbed from memory writes. 
All MCP file paths are validated against path traversal.\n\n---\n\n## Under the hood\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Content-Hash Embedding Cache\u003C\u002Fstrong>\u003C\u002Fsummary>\n\nSHA-256 fingerprint per chunk, salted with model name. Re-index skips unchanged code. Binary float32 storage (10x smaller than JSON). Typical re-index: 96% cache hit, under 1 second.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>sqlite-vec: 2 MB instead of 217 MB\u003C\u002Fstrong>\u003C\u002Fsummary>\n\nReplaced LanceDB with sqlite-vec. Same cosine-distance quality, 99% smaller install. WAL mode + `PRAGMA synchronous=NORMAL` for 80% write speedup. Vectors, FTS5, code graph, and compression cache all live in three SQLite files.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Deterministic Grammar Compression\u003C\u002Fstrong>\u003C\u002Fsummary>\n\nMemory entries compressed without LLM calls. Drops articles, fillers, pronouns. Three levels (lite\u002Ffull\u002Fultra, 20-60% savings). Code, paths, URLs preserved byte-for-byte. Same input always yields same output.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Fail-Open Hook Design\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n5 Claude Code lifecycle hooks capture session context. Every hook runs `curl ... || true`, so a crashed server never blocks the user. SessionStart injects bootstrap context; others capture silently.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Dynamic Pricing\u003C\u002Fstrong>\u003C\u002Fsummary>\n\nDollar estimates in `cce savings` come from live Anthropic pricing (HTML table parsed, cached 7 days, offline fallback). 
No manual updates when rates change.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Append-Only Savings Ledger\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n7 buckets track every token saved: retrieval, chunk compression, output compression, memory recall, grammar, turn summarization, progressive disclosure. Survives restarts. Powers CLI and dashboard analytics.\n\u003C\u002Fdetails>\n\n---\n\n## CLI at a glance\n\n```bash\ncce init                    # Index + install hooks + register MCP\ncce                         # Status banner\ncce savings                 # Token savings with dollar estimates\ncce savings --all           # All projects\ncce dashboard               # Web dashboard with live charts\ncce search \"auth flow\"      # Test a query\ncce status                  # Index health + config\ncce services                # Ollama + dashboard + MCP status\ncce commands add-rule '...' # Project rules for Claude\ncce uninstall               # Clean removal of all CCE artifacts\n```\n\nRun `cce list` for the full command reference.\n\n---\n\n## Configuration\n\nZero-config by default. Override what you need in `~\u002F.cce\u002Fconfig.yaml` or `.context-engine.yaml`:\n\n```yaml\ncompression:\n  level: standard          # minimal | standard | full\n  output: standard         # off | lite | standard | max\n  ollama_url: http:\u002F\u002Flocalhost:11434   # point at a remote Ollama if desired\n\nretrieval:\n  top_k: 20\n  confidence_threshold: 0.5\n\npricing:\n  model: sonnet            # sonnet | opus | haiku\n```\n\n**Remote Ollama:** If you run Ollama on another machine in your network, set `compression.ollama_url` (e.g. `http:\u002F\u002Fnas.local:11434`) or export `CCE_OLLAMA_URL` — the env var wins. 
CCE probes the endpoint and falls back to truncation-only compression when it's unreachable, so a flaky link won't break indexing.\n\n---\n\n## Output Compression\n\nCCE also compresses Claude's responses (same concept as Caveman):\n\n| Level | Style | Savings |\n|-------|-------|---------|\n| `off` | Full output | 0% |\n| `lite` | No filler or hedging | ~30% |\n| `standard` | Fragments, drop articles | ~65% |\n| `max` | Telegraphic | ~75% |\n\nTell Claude: \"switch to max compression\" or \"turn off compression\". Code blocks and commands are never compressed.\n\n---\n\n## Disk Footprint\n\n| Component | Size |\n|-----------|------|\n| Core install (Ollama backend) | ~17 MB |\n| With `[local]` extra (fastembed + ONNX) | ~189 MB |\n| Embedding model (one-time download) | ~60 MB (fastembed) or managed by Ollama |\n| Index per project (small\u002Fmedium\u002Flarge) | 5-60 MB |\n\nNo GPU required. With Ollama, embeddings are handled by the Ollama server. With the `[local]` extra, the embedding model runs on CPU via ONNX Runtime.\n\n---\n\n## Supported Languages\n\n**AST-aware chunking (tree-sitter parsed, 10 extensions):**\n\n| Language | Extensions |\n|----------|-----------|\n| Python | `.py` |\n| JavaScript | `.js`, `.jsx` |\n| TypeScript | `.ts`, `.tsx` |\n| PHP | `.php` |\n| Go | `.go` |\n| Rust | `.rs` |\n| Java | `.java` |\n\n**Language-aware fallback chunking (40+ extensions):**\n\n| Category | Languages |\n|----------|-----------|\n| Web | HTML, CSS, SCSS, LESS, Vue, Svelte |\n| Systems | C, C++, C#, Zig, Nim |\n| Mobile | Swift, Kotlin, Dart |\n| Functional | Haskell, Scala, Clojure, Elixir, Erlang, F# |\n| Scripting | Ruby, Perl, Lua, R, Bash\u002FZsh |\n| Data\u002FConfig | JSON, YAML, TOML, XML, SQL, GraphQL, Protobuf |\n| DevOps | Terraform, HCL, Dockerfile |\n| Docs | Markdown |\n\nAll other text files are chunked by line range. Binary files are skipped.\n\n---\n\n## Documentation\n\n| Page | Content |\n|------|---------|\n| [What is CCE? 
(Complete Guide)](https:\u002F\u002Felara-labs.github.io\u002Fcode-context-engine\u002Fblog\u002Fwhat-is-code-context-engine.html) | Setup, tools, how it works, FAQ |\n| [How to Save Claude Code Tokens](https:\u002F\u002Felara-labs.github.io\u002Fcode-context-engine\u002Fblog\u002Fsave-claude-code-tokens.html) | Cost breakdown and savings guide |\n| [Benchmark Deep Dive](https:\u002F\u002Felara-labs.github.io\u002Fcode-context-engine\u002Fblog\u002Fbenchmark-fastapi.html) | Full FastAPI benchmark methodology |\n| [Comparison with Alternatives](https:\u002F\u002Felara-labs.github.io\u002Fcode-context-engine\u002Fcomparison.html) | CCE vs Cursor, Aider, Continue, Greptile |\n| [Examples](https:\u002F\u002Fgithub.com\u002Felara-labs\u002Fcode-context-engine\u002Fblob\u002Fmain\u002Fdocs\u002Fwiki\u002FExamples.md) | Real conversations with Claude |\n| [How It Works](https:\u002F\u002Fgithub.com\u002Felara-labs\u002Fcode-context-engine\u002Fblob\u002Fmain\u002Fdocs\u002Fwiki\u002FHow-It-Works.md) | Full 9-stage pipeline |\n| [CLI Reference](https:\u002F\u002Fgithub.com\u002Felara-labs\u002Fcode-context-engine\u002Fblob\u002Fmain\u002Fdocs\u002Fwiki\u002FCLI-Reference.md) | Every command with output |\n| [Configuration](https:\u002F\u002Fgithub.com\u002Felara-labs\u002Fcode-context-engine\u002Fblob\u002Fmain\u002Fdocs\u002Fwiki\u002FConfiguration.md) | All config options |\n\n---\n\n## FAQ\n\n### Does CCE affect response quality?\n\nNo. Quality stays the same or slightly improves.\n\nCCE replaces \"dump the entire file\" with \"search for the relevant function.\" The model still gets the code it needs (0.90 Recall@10 in benchmarks). Less irrelevant context means less noise competing for attention, which can improve the model's focus on your actual question.\n\n### How does output token savings work?\n\nCCE writes output compression rules directly into your agent's instruction files (`CLAUDE.md`, `AGENTS.md`, `.cursorrules`, etc.) during `cce init`. 
These rules apply to the **entire session**, not just CCE tool responses, so every reply from the agent follows them.\n\nSet the level in `cce.yaml`:\n\n```yaml\ncompression:\n  output: max       # off | lite | standard | max\n```\n\nThen re-run `cce init` to update instruction files. Or change at runtime:\n\n```\nset_output_level output_level=max\n```\n\n| Level | Savings | What it does |\n|-------|---------|--------------|\n| `off` | 0% | No compression |\n| `lite` | ~25% | Removes filler\u002Fhedging\u002Fpleasantries + diff-only for code changes |\n| `standard` | ~70% | Drops articles, fragments, short synonyms + diff-only for code |\n| `max` | ~80% | Telegraphic style + diff-only for code |\n\nDefault is `standard`. All levels include **code output rules** that tell the model to show only changed lines (not full file rewrites), which is where most output tokens go in coding sessions. The `max` level produces very terse prose (similar to \"caveman mode\"). Code blocks, paths, and commands are never compressed regardless of level.\n\n### Where do the savings come from?\n\nMost savings are **input tokens** (what goes into the model):\n\n| Layer | Type | Typical savings |\n|-------|------|-----------------|\n| Retrieval | Input | 94% (full files → relevant chunks) |\n| Chunk compression | Input | 89% (chunks → signatures) |\n| Grammar compression | Input | 13% (article\u002Ffiller removal) |\n| Turn summarization | Input | varies (session history) |\n| Progressive disclosure | Input | varies (tool payloads) |\n| Output compression | Output | 25-80% (depends on level) |\n\nOutput tokens cost 5x more per token (e.g. 
Opus: $15\u002F1M input vs $75\u002F1M output), so even a small output reduction has outsized cost impact.\n\n---\n\n## Roadmap\n\n- [x] Multi-repo benchmarks (FastAPI, chi, fiber)\n- [ ] More benchmarks (Django, Express)\n- [ ] Tree-sitter support for C, C++, Ruby, Swift, Kotlin\n- [ ] Docker support for remote mode\n\nSee [CHANGELOG.md](CHANGELOG.md) for shipped features.\n\n---\n\n## Contributing\n\nContributions welcome. See [CONTRIBUTING.md](https:\u002F\u002Fgithub.com\u002Felara-labs\u002Fcode-context-engine\u002Fblob\u002Fmain\u002FCONTRIBUTING.md) for setup.\n\n---\n\n## License\n\nMIT. See [LICENSE](LICENSE).\n\n## Authors\n\n- [Fazle Elahee](https:\u002F\u002Fgithub.com\u002Ffazleelahhee)\n- [Raj](https:\u002F\u002Fgithub.com\u002Frajkumarsakthivel)\n\n## Acknowledgments\n\n[Claude Code](https:\u002F\u002Fdocs.anthropic.com\u002Fen\u002Fdocs\u002Fclaude-code) · [MCP](https:\u002F\u002Fmodelcontextprotocol.io) · [sqlite-vec](https:\u002F\u002Fgithub.com\u002Fasg017\u002Fsqlite-vec) · [Tree-sitter](https:\u002F\u002Ftree-sitter.github.io\u002F) · [fastembed](https:\u002F\u002Fgithub.com\u002Fqdrant\u002Ffastembed) · [Ollama](https:\u002F\u002Follama.com\u002F)\n\n---\n\n\u003Cp align=\"center\">\n  \u003Cstrong>If CCE saves you tokens, give it a star.\u003C\u002Fstrong>\n\u003C\u002Fp>\n\n\u003C!-- mcp-name: io.github.ai-elara\u002Fcode-context-engine -->\n","Code Context Engine is a code indexing and search tool designed to sharply reduce token consumption in AI-assisted coding. It builds an index of the codebase so that AI agents can search efficiently without re-reading files, achieving token savings of up to 94%. The project is written in Python, supports AI coding tools including Claude Code, Codex, Copilot, Cursor, and Gemini CLI, and provides a local MCP server; it is completely free and open source. It suits developers and teams who rely on AI-assisted programming but want to control costs, and it can noticeably improve efficiency and lower costs on large projects.","2026-06-11 02:49:59","CREATED_QUERY"]