[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-81551":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":15,"stars30d":15,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":16,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":17,"fork":17,"defaultBranch":18,"hasWiki":17,"hasPages":17,"topics":19,"createdAt":10,"pushedAt":10,"updatedAt":20,"readmeContent":21,"aiSummary":22,"trendingCount":15,"starSnapshotCount":15,"syncStatus":13,"lastSyncTime":23,"discoverSource":24},81551,"Pi-forge","Kodrack\u002FPi-forge","Kodrack","Use LLM as a logic gate\u002Fprocessor for local coding agent, context management customisation so the llm calls remains fast and small","",null,"TypeScript",25,2,1,0,38.43,false,"main",[],"2026-06-12 04:01:34","# PiForge\n\n**Hard enforcement for local LLMs running on [Pi coding agent](https:\u002F\u002Fgithub.com\u002Fmariozechner\u002Fpi-coding-agent).**\n\nLocal models (35B and under) spiral, forget, and write 800-line files in one shot. PiForge physically prevents that — at the API boundary, not the prompt level — and gives the model an external brain via `.think\u002F` files that survive context compression.\n\nTested with `qwen3.6-35b-a3b` at **Q2_K_XL quantization** via LM Studio on macOS. Yes — a 2-bit quantized model doing structured multi-file coding, codebase distillation, and tool-call workflows. 
The guard stack makes that possible.\n\n---\n\n## What's in the box\n\n### 10 hard-enforcement extensions (guards)\n\n| Extension | What it enforces | Default |\n|---|---|---|\n| `incremental-guard.ts` | Rejects writes > 100 lines\u002F6000 chars, edits > 60 lines\u002F3000 chars — forces skeleton → small edit workflow | on |\n| `thinking-guard.ts` | Injects correction when thinking block > 2000 chars — stops reasoning spirals | on |\n| `context-monitor.ts` | Steers model to write state files at 65% context, urgent at 80% | on |\n| `analysis-guard.ts` | Forces findings to `.think\u002Fstep-NNN.md` when response > 1000 chars with no file write | on |\n| `state-guard.ts` | Blocks source reads until `_state.md` is read; forces updates every 5 turns; enforces `.think\u002F` at root only (not in subfolders) | on |\n| `loop-guard.ts` | Detects repetition loops via Jaccard similarity (warns at 4, blocks at 6) AND malformed tool calls (warns at 4, compacts at 8). Auto-compacts to escape both. Safety net for missing inference settings | **off** |\n| `first-prompt.ts` | Appends \"plan in steps, implement one at a time\" to first prompt — preventive, zero context overhead | on |\n| `plan-clarify.ts` | Intercepts `_plan.md` writes — forces model to ask ≤3 clarifying questions before any code | **off** |\n| `knowledge-injector.ts` | Pi subprocess (`--thinking off`) selects relevant `.\u002Fknowledge\u002F` files per-file. Large files distilled to `.distilled\u002F` with hash-based cache. Manifest survives compaction. `\u002Fforget` to remove. | on |\n| `web-search.ts` | Web search with sub-pi synthesis. Searches DuckDuckGo, fetches pages, synthesizes via isolated sub-pi — main context only sees final summary. `web_search()` tool + `\u002Fweb-search` command | on |\n\nThese are **hard** — the model cannot bypass them. `incremental-guard`, `knowledge-injector`, and `loop-guard` physically reject tool calls. 
The others inject steering messages before the next LLM call.\n\n`plan-clarify` and `loop-guard` are **disabled by default** — enable per session with `\u002Fpiforge enable \u003Cname>`. Use `\u002Fpiforge` to see status.\n\n### Codebase distillation — zoom levels for local models\n\nA local model with 50k context can't hold a real codebase. Reading files one by one is slow, burns context, and the model forgets file #1 by the time it reads file #10. Distill solves this by building compressed versions of the entire codebase at multiple zoom levels — like Google Maps for your code.\n\n**The idea:** You distill your codebase once. This creates three levels of compressed summaries, all mirroring the original folder structure:\n\n```\nSource (100%)  →  L1 (~50%)  →  L2 (~25%)  →  L3 (~12%)\nfull code         key logic     signatures     one-liners\n```\n\nWhen Pi needs to understand the codebase, it doesn't read source files. It queries the right zoom level:\n\n- **L3** — \"What modules exist? What's the architecture?\" — fits in a few hundred tokens\n- **L2** — \"How does the auth system work?\" — function signatures, key relationships\n- **L1** — \"Show me the output pipeline logic\" — detailed summaries with key code preserved\n\nPi zooms in only when needed. Most questions are answered at L2\u002FL3 without ever reading source. When Pi does need the actual code, it knows exactly which file to open because L2 already told it where things live.\n\n**How it works:** Crawls the directory, builds an import graph, topologically sorts files, and processes each file via isolated sub-Pi calls — the main session LLM stays idle and clean. 
The distilled knowledge persists across sessions.\n\n| Extension | What it does | Default |\n|---|---|---|\n| `distill.ts` | `\u002Fdistill` command + `distill_codebase` LLM-callable tool | on |\n| `distill-query.ts` | `\u002Fl1`, `\u002Fl2`, `\u002Fl3` query commands + `\u002Fdistill-status` | on |\n| `explore.ts` | `\u002Fexplore` + `explore_codebase` tool (superseded by distill-query) | **off** |\n| `distill-awareness.ts` | Session-start context injection (superseded by distill-query) | **off** |\n\n**Additional features:**\n- **Purpose-driven notes**: `--purpose \"how does auth work?\"` takes notes on each file during distillation, then synthesizes a comprehensive answer\n- **LLM-callable tool**: Pi can call `distill_codebase` autonomously — no slash command needed\n- **Single file support**: Distill one large file with automatic chunking\n- **Auto-detect level**: Point at `.think\u002Fdistill\u002FL1\u002F` and it auto-outputs L2\n- **Resume support**: `--resume` continues interrupted distillation\n\n```\n\u002Fdistill [path]                        # distill directory (default: .)\n\u002Fdistill [path] --purpose \"question\"   # distill + take notes on question\n\u002Fdistill --resume                      # resume interrupted run\n\u002Fdistill --level 2                     # compress L1 → L2\n\u002Fdistill [path] --ratio 30            # aggressive compression (30%)\n\u002Fl1 \"how does auth work?\"             # query L1 summaries directly\n\u002Fl2 \"what modules exist?\"             # query L2 summaries directly\n\u002Fl3 \"high-level architecture?\"        # query L3 summaries directly\n\u002Fdistill-status                        # show coverage per level\n```\n\nOutput structure:\n```\n.think\u002Fdistill\u002F\n├── manifest.json      ← state: files, progress per level, config\n├── distill.log        ← append-only log\n├── L1\u002F                ← mirrors source folder structure, ~50% of source\n│   └── src\u002F\n│       └── auth.ts.md\n├── L2\u002F     
           ← same structure, ~25% of source\n│   └── src\u002F\n│       └── auth.ts.md\n├── notes\u002F             ← purpose-driven findings (optional)\n│   ├── auth-notes.md\n│   └── auth-notes-answer.md\n└── tmp\u002F               ← prompt files (auto-cleaned)\n```\n\n### Web search (context-isolated)\n\n| Extension | What it does | Default |\n|---|---|---|\n| `web-search.ts` | Web search with sub-pi synthesis — main context only sees final summary | on |\n\nLocal models don't have current knowledge. `web-search` lets Pi search the web without polluting the main context with raw HTML:\n\n1. Searches DuckDuckGo for the query\n2. Fetches top 5 result pages in parallel\n3. Extracts readable content (strips nav, ads, scripts)\n4. Spawns isolated sub-pi (`--no-session --no-extensions --no-tools --thinking off --offline`) to synthesize\n5. Returns only the synthesis to main context\n\nRaw pages are saved to `.think\u002Fweb-search\u002F\u003Chash>\u002F` for reference. The main Pi never sees the HTML — only the ~400 word synthesis.\n\n```\n\u002Fweb-search \"svelte 5 runes tutorial\"    # manual search\n# OR Pi can call web_search() tool autonomously\n```\n\nUse it BEFORE implementing when:\n- Working with a library\u002FAPI you're unsure about\n- User mentions versions, \"latest\", or recent dates\n- Debugging error messages you don't recognize\n- Anything that might have changed since training\n\n### Session isolation (per-tab `.think\u002F`)\n\n| Extension | What it does | Default |\n|---|---|---|\n| `session-manager.ts` | Auto-creates isolated `.think\u002F` per Pi terminal instance via symlinks | on |\n\nEvery time you open a new Pi terminal, `session-manager` creates a fresh directory under `.think-sessions\u002F` and points the `.think\u002F` symlink to it. 
The model always writes to `.think\u002F` — same hardcoded path, zero tokens wasted on session management.\n\n```\n.think-sessions\u002F\n  session-001\u002F          ← first Pi tab's state\n  session-002\u002F          ← second Pi tab's state\n  session-003\u002F          ← third Pi tab's state\n.think\u002F → .think-sessions\u002Fsession-003\u002F   ← symlink to active session\n```\n\nIf `.think\u002F` already exists as a real directory (from before the extension), it gets migrated automatically into `session-001`.\n\nCommands: `\u002Fsessions` (list all), `\u002Fswitch-session [session-id]` (switch to a previous session)\n\n### Purpose anchor (anti-drift after compaction)\n\n| Extension | What it does | Default |\n|---|---|---|\n| `purpose-anchor.ts` | Captures session purpose from first prompt, re-injects purpose + state after compaction | on |\n\nWhen context gets compacted, Pi can lose track of the original goal. `purpose-anchor` solves this:\n1. Saves first user prompt to `.think\u002F_purpose.md`\n2. Hooks into Pi's `session_compact` event\n3. After compaction, steers Pi to re-read `.think\u002F_state.md` and `_summary.md`\n4. Pi re-orients and continues without drift\n\n`\u002Fimportant \"note\"` adds persistent mid-session directives (\"always use async\", \"don't touch auth module\"). Saved to `_purpose.md` under `## Important`, steered immediately, survives compaction. 
Use `\u002Fimportant -compact \"note\"` to also force compaction after — cleans the context while the note is safe on disk.\n\nCommands: `\u002Fpurpose` (view\u002Fset), `\u002Fpurpose-clear` (reset), `\u002Fimportant \"note\"` (add persistent note), `\u002Fimportant -compact \"note\"` (add + compact), `\u002Fimportant clear` (remove notes)\n\n### Loop detection (Jaccard similarity + malformed call detection)\n\n| Extension | What it does | Default |\n|---|---|---|\n| `loop-guard.ts` | Detects repetition loops AND malformed tool calls, auto-recovers via compaction | **off** |\n\nWithout proper inference settings (repeat penalty, temperature), Q2 models fall into loops — writing `_state.md` with identical content 20+ times, burning context doing nothing. They also sometimes emit malformed tool calls (empty `{}` arguments) repeatedly, poisoning context with failures. `loop-guard` detects both patterns and auto-recovers.\n\n**Jaccard similarity** measures the overlap between two sets of words. Given two text blocks, tokenize each into a set of lowercase words, then: `J = |intersection| \u002F |union|`. A score of 1.0 = identical word sets, 0.0 = no words in common. This runs in microseconds with zero inference cost — pure `Set` math in JS.\n\nThe guard tracks writes per file path in a sliding window of 10. 
Only repeated writes to the **same file** are flagged — writing similar but different files (e.g., `LeftArm.cs` \u002F `RightArm.cs`) is normal progress.\n\n**Write loop escalation:**\n\n| Trigger | Action |\n|---|---|\n| 4 similar writes (>85% Jaccard) | Warning steer |\n| 6 similar writes | Hard block + escape hint |\n| 3 blocked attempts | Abort → compact (ignore loop turns) → restart from `_state.md` |\n| Loops again | Abort → double compact (crush context to one sentence) → restart |\n| Still loops | Notify user to `\u002Fclear` |\n\n**Malformed tool call escalation:**\n\n| Trigger | Action |\n|---|---|\n| 4 consecutive malformed calls | Warning steer — suggests simpler alternatives (write\u002Fedit instead of bash, avoid paths with spaces) |\n| 8 consecutive malformed calls | Abort → compact (clear poisoned context of failed attempts) → restart |\n| Still failing | Same escalation as write loops (double compact → tell user to `\u002Fclear`) |\n\nAny valid tool call resets the malformed counter. The key insight: each failed attempt stays in context and the model fixates on retrying the same broken call. Compaction removes that poisoned history.\n\n> **This is a safety net, not the primary defense.** The real fix is LM Studio inference settings: `repeat_penalty: 1.1`, `temperature: 0.58`. Enable with `\u002Fpiforge enable loop-guard`.\n\n### Task queue (post-completion delivery)\n\n| Extension | What it does | Default |\n|---|---|---|\n| `queue.ts` | `\u002Fq \"message\"` queues work for after Pi finishes — delivered as a fresh turn, not a steer | on |\n\nQueue messages while Pi is working. Each item is delivered one at a time after Pi completes a turn — Pi fully finishes one queued task before starting the next. 
No context pollution: queued messages don't exist in context until Pi is idle.\n\n```\n\u002Fq \"now run the tests\"          # queue a task\n\u002Fq \"then update the README\"     # queue another\n\u002Fq                              # show the queue\n\u002Fq clear                        # clear all queued items\n```\n\n### 1 soft-enforcement skill\n\n`incremental-codegen` — SKILL.md that teaches the model the skeleton → edit workflow. Works alongside the hard guards.\n\n### Knowledge folder\n\n`knowledge\u002F` — inference-time context injection with zero context pollution.\n\nOn turn 1, `knowledge-injector` uses **pi subprocess calls** (`pi --thinking off`) to evaluate each knowledge file:\n\n1. **Distillation** (large files >2000 chars): Summarizes to ~100 words, cached in `.distilled\u002F` with hash-based invalidation. Only re-distills when source file changes.\n2. **Selection** (per file): Asks \"Is this file relevant to the purpose?\" with YES bias. Each file evaluated independently — better accuracy than batch selection.\n\n```\nuser prompt → distill large files (cached) → select per-file → inject content → Pi's main LLM call\n```\n\nUsing `--thinking off` ensures clean output from thinking models (Qwen3, etc.) — no reasoning trace pollution.\n\nSelected filenames are saved to `.think\u002F_knowledge-manifest.md`. After compaction or session restart, the extension reads the manifest, rebuilds the content from source files, and re-injects automatically — zero LLM cost, no re-selection needed.\n\nCode writes are blocked until `.think\u002F_knowledge.md` is written — proof the model absorbed the knowledge.\n\nCommands: `\u002Fforget \u003Cname>` (remove knowledge mid-session), `\u002Fguide` (load PiForge self-documentation into context on demand)\n\nKnowledge is **project-local** — each project has its own `knowledge\u002F` folder. 
Copy the files you need:\n\n```\n\u003Cyour-project>\u002Fknowledge\u002F\n├── astro-gotchas.md\n├── svelte5-gotchas.md\n├── drag-and-drop-gotchas.md\n├── canvas-node-editor-gotchas.md\n├── playwright-testing.md\n└── .distilled\u002F                    ← auto-generated summaries for large files\n    └── ...\n```\n\n`piforge-self.md` lives at `~\u002F.pi\u002Fpiforge-self.md` (global, loaded via `\u002Fguide`).\n\nIncluded knowledge files:\n- `svelte5-gotchas.md` — Svelte 5 runes failure patterns\n- `astro-gotchas.md` — Astro islands, client directives, frontmatter pitfalls\n- `drag-and-drop-gotchas.md` — HTML5 drag API, mouse drag, coordinate transforms\n- `canvas-node-editor-gotchas.md` — render order, SVG wires, pan\u002Fzoom, ports\n- `playwright-testing.md` — Playwright waiting, locators, assertions gotchas\n\nAdd your own — name by tech, failures only. Small files (\u003C500 tokens) get full content sent to selection LLM. Large files get auto-distilled to `.distilled\u002F` subfolder.\n\n### Project template\n\n`project-template\u002FAGENTS.md` — drop into any project. Tells the model to use the `.think\u002F` external brain workflow: scan knowledge folder at session start, read `_state.md` first, write one step file per turn, update state after every action.\n\n---\n\n## Install\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fyourusername\u002Fpiforge\ncd piforge\nbash install.sh\n```\n\nThen:\n1. Start LM Studio, load your model, start the server on `:1234`\n2. Edit `~\u002F.pi\u002Fagent\u002Fmodels.json` — set the model `id` to match your LM Studio model\n3. Copy `project-template\u002FAGENTS.md` into any project you work on\n4. 
Run `pi` from your project directory\n\nOn startup you should see:\n```\nincremental-guard active (max 100 lines \u002F 6000 chars per write\u002Fedit)\nthinking-guard active (max 2000 chars \u002F 60 lines of thinking per turn)\ncontext-monitor active — warn at 65%, urgent at 80% (window: XXXXX tokens)\nanalysis-guard active (triggers on responses >1000 chars with no file write)\nsession-manager: session-001 — .think\u002F ready\n```\n\n---\n\n## Requirements\n\n- [Pi coding agent](https:\u002F\u002Fgithub.com\u002Fmariozechner\u002Fpi-coding-agent) — `npm install -g @mariozechner\u002Fpi-coding-agent`\n- [LM Studio](https:\u002F\u002Flmstudio.ai) with a model loaded and server running on `:1234`\n- Node.js ≥ 20\n\n**Recommended model:** `qwen3.6-35b-a3b` at Q2_K_XL quantization (Unsloth). Runs on consumer hardware via LM Studio.\n\n> We develop and test PiForge at **Q2_K_XL** — the most aggressive quantization level. The results at 2-bit are already surprisingly good. At higher quantizations, they only get better.\n\nAlso tested with `qwen3-coder-30b-a3b-instruct`. Should work with any OpenAI-compatible local server.\n\n---\n\n## LM Studio settings\n\n### System prompt\n\nAdd this in LM Studio → Model → System Prompt:\n\n```\nCRITICAL OUTPUT RULE: You MUST NEVER write more than 2000 tokens in a single tool call.\n\nWhen generating a new file:\n- First call: write ONLY the \u003Chead> and \u003Cstyle> section\n- Second call: use bash to append the \u003Cbody> HTML: cat >> file.html \u003C\u003C 'CHUNK'\n- Third call: use bash to append the \u003Cscript> section\n- NEVER put an entire HTML file in one write call\n\nWhen the file would be large, ALWAYS use multiple bash append calls.\n\nDO NOT OVERTHINK. 
Short thinking is better than long thinking.\n```\n\n> Note: the Pi `incremental-guard` extension enforces this at the API layer regardless — the system prompt is a soft nudge on top.\n\n### Inference parameters\n\n| Parameter | Value | Notes |\n|---|---|---|\n| Temperature | `0.58` | Focused but not robotic |\n| Response length limit | `2000` tokens | Backstop — guards are the real enforcement |\n| Top-K sampling | `30` | Narrows token selection |\n| Repeat penalty | `1.1` | Mild reduction of token-level loops |\n| Top-P sampling | `0.95` | Standard nucleus sampling |\n| Min-P sampling | `0.08` | Cuts low-probability tail tokens |\n\n> The response length limit is not always respected by local models — treat it as a last-resort backstop, not primary enforcement. The guard stack handles the real enforcement.\n\n---\n\n## Why this exists\n\nCloud models (GPT-4, Claude, Gemini) self-regulate well enough that you don't need enforcement. Local 35B models don't — they ignore prompt rules, spiral in reasoning loops, and produce truncated garbage when they try to write large files.\n\nThe existing local LLM tooling (Cline, Roo, etc.) is designed for cloud models and just pointed at local endpoints. PiForge is built specifically for the constraints of local inference:\n\n- **Hard limits** at the API layer, not suggestions in a prompt\n- **External memory** via `.think\u002F` files — the model writes everything to disk instead of holding it in context\n- **Distillation** — build a knowledge base from a codebase once, reference it across sessions without re-reading source files\n\n> A scalpel isn't better than a chainsaw because it's sharper — it's better because you're doing surgery, not cutting trees.\n>\n> PiForge doesn't make a Q2 quantized model smart. It removes every decision the model is bad at, until what remains is a narrow set of small, recoverable tasks it can do reliably. 
The right tool constrained to the right task performs well regardless of raw capability.\n\n---\n\n## Full setup guide\n\nSee [PI-SETUP.md](.\u002FPI-SETUP.md) for the complete reference — every config option, tuning guide, benchmark results, and troubleshooting section.\n\n---\n\n## File structure\n\n```\npiforge\u002F\n├── README.md\n├── install.sh                          ← run this first\n├── PI-SETUP.md                         ← full reference guide\n├── distill-v2-plan.md                  ← distill design document\n├── distill-v2-implementation.md        ← distill implementation spec\n├── extensions\u002F\n│   ├── incremental-guard.ts            ← blocks oversized write\u002Fedit calls\n│   ├── thinking-guard.ts               ← stops reasoning spirals\n│   ├── context-monitor.ts              ← warns before context degrades\n│   ├── analysis-guard.ts               ← forces analysis to disk\n│   ├── token-counter.ts                ← tracks tokens + Gemini cost comparison\n│   ├── first-prompt.ts                 ← injects planning instruction into first prompt only\n│   ├── plan-clarify.ts                 ← clarifying questions after _plan.md (off by default)\n│   ├── knowledge-injector.ts           ← isolated LLM selects project-local knowledge files, hash-based distill cache\n│   ├── state-guard.ts                  ← blocks reads until _state.md read, forces updates\n│   ├── loop-guard.ts                   ← detects repetition loops + malformed tool calls\n│   ├── piforge-manager.ts              ← \u002Fpiforge command to toggle extensions\n│   ├── distill.ts                      ← \u002Fdistill + distill_codebase tool\n│   ├── distill-query.ts                ← \u002Fl1 \u002Fl2 \u002Fl3 direct level queries + \u002Fdistill-status\n│   ├── explore.ts                      ← \u002Fexplore + explore_codebase tool (off by default)\n│   ├── distill-awareness.ts            ← session-start awareness (off by default)\n│   ├── purpose-anchor.ts              ← 
anti-drift: re-injects purpose after compaction\n│   ├── session-manager.ts             ← per-tab .think\u002F isolation via symlinks\n│   └── queue.ts                       ← \u002Fq \"message\": post-completion task queue\n├── knowledge\u002F                          ← copy to your project's knowledge\u002F folder\n│   ├── svelte5-gotchas.md              ← Svelte 5 runes failure patterns\n│   ├── astro-gotchas.md                ← Astro islands + client directives failure patterns\n│   ├── drag-and-drop-gotchas.md        ← HTML5 drag API, mouse drag, coordinate transforms\n│   ├── canvas-node-editor-gotchas.md   ← render order, SVG wires, pan\u002Fzoom, ports\n│   └── playwright-testing.md           ← Playwright waiting, locators, assertions gotchas\n├── skills\u002F\n│   └── incremental-codegen\u002F\n│       └── SKILL.md                    ← soft-enforcement skill\n├── config\u002F\n│   ├── piforge-self.md                 ← PiForge guide (installed to ~\u002F.pi\u002F, loaded via \u002Fguide)\n│   ├── models.json                     ← LM Studio provider config template\n│   ├── settings.json                   ← Pi global settings\n│   └── piforge.json                    ← extension toggles (plan-clarify + loop-guard off by default)\n└── project-template\u002F\n    └── AGENTS.md                       ← drop in any project\n```\n","PiForge is a project for local coding agents that uses a large language model as a logic gate\u002Fprocessor and customizes context management so that LLM calls stay fast and small. Its core features are a set of hard-enforcement extensions (such as the incremental guard and the thinking guard) that enforce rules at the API boundary rather than at the prompt level, which keeps the model from generating overly long code or falling into reasoning loops, and that give the model external memory through `.think\u002F` files. The project also supports codebase distillation, which lets local models work with large codebases. It suits scenarios that need to use local large language models efficiently for complex programming tasks and code management.","2026-06-11 04:05:28","CREATED_QUERY"]