[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-2100":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":13,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":14,"forks30d":14,"starsTrendScore":18,"compositeScore":19,"rankGlobal":9,"rankLanguage":9,"license":9,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":20,"topics":23,"createdAt":9,"pushedAt":9,"updatedAt":30,"readmeContent":31,"aiSummary":32,"trendingCount":14,"starSnapshotCount":14,"syncStatus":33,"lastSyncTime":34,"discoverSource":35},2100,"llm-atomic-wiki","cablate\u002Fllm-atomic-wiki","cablate","An extension of Karpathy's LLM Wiki pattern: atom layer, topic-branches, two-layer Lint. Distilled from running the pattern end-to-end.",null,"Shell",136,25,1,0,3,5,8,9,50.54,false,"main",true,[24,25,26,27,28,29],"karpathy","knowledge-base","knowledge-management","llm","markdown","wiki","2026-06-12 04:00:13","# llm-atomic-wiki\n\n> Built on top of **Andrej Karpathy's [LLM Wiki](https:\u002F\u002Fgist.github.com\u002Fkarpathy\u002F442a6bf555914893e9891c11519de94f)**.\n> All credit to him for the pattern — this repo is what I learned by running it end-to-end, plus four small additions that helped at scale.\n\n**584 posts · 8,668 replies · 630 atoms · 83 wiki pages · 11 branches**\n\nThe repo gives you the framework — methodology, schema, scripts, folder structure. Fork it and run it on your own materials. My actual content stays private; the kit is what you get.\n\n🇹🇼 [中文版 README](README.zh-TW.md) · 📖 [the story behind this repo](STORY.md)\n\n---\n\n## What this adds on top of Karpathy's pattern\n\nKarpathy's gist captures the core pattern in beautifully minimal form. 
The four additions below came from problems I hit while running it at scale — they extend his pattern, they don't replace it.\n\n```\nKarpathy:   raw ─→ wiki\nThis repo:  raw ─→ atoms (organized into topic-branches) ─→ wiki\n```\n\nFour additions:\n\n**1. Atom layer.** Karpathy goes raw → wiki in one compile step. I added atoms in between — one atom equals one claim, with frontmatter (source, type, depth, tags, date). Atoms are the source of truth; wiki is a derived cache. When a wiki page gets a fact wrong, you go back to the atom, not the raw source. This solves the \"loss of information\" and \"false sense of source of truth\" problems that commenter `frosk1` raised on the original gist.\n\n**2. Topic-branches at the atom layer.** Karpathy's wiki is flat. I organize atoms by topic into branch folders at the repo root (one folder per branch), then compile to flat wiki pages with topic prefixes (`wiki\u002F\u003Cbranch>-\u003Csubtopic>.md`). The atom layer becomes browsable; the wiki layer stays index-friendly.\n\n**3. Two-layer Lint.** Karpathy lumps \"find contradictions, ghost links, orphan pages, outdated claims\" into a single Lint operation. I split it. A programmatic layer (`scripts\u002Flint.sh`) handles deterministic checks (ghost links, orphan pages, format violations, outdated markers) in seconds. An LLM layer handles semantic checks (contradictions, expired claims). The programmatic layer runs first so the LLM doesn't waste attention on format issues.\n\n**4. Parallel-compile naming lock.** Karpathy compiles one page at a time. When N agents compile in parallel, they invent different filenames for the same content (`mcp-plus-skills.md` vs `mcp-plus-skills-architecture.md`). The fix is to pre-lock the slug namespace before fanning out. 
Agents fill content into pre-named slots; they do not name files.\n\n---\n\n## Proof\n\n| Stage | Numbers |\n|-------|---------|\n| Raw input | 584 posts + 8,668 replies + lecture\u002Fcourse materials |\n| Filter pass-through | Posts 70–90% kept, replies ~13% kept (87% noise) |\n| Atoms extracted | 630 (immutable, source of truth) |\n| Branches | 11 (one folder at repo root per topic) |\n| Wiki pages compiled | 83 (3–8 atoms per page) |\n| Lint warnings (tightened) | 16 (down from 47 before regex was tightened) |\n| Largest branch | 101 atoms |\n| Smallest branch | 23 atoms |\n\n---\n\n## How it works\n\n```\n┌─────────┐  Ingest    ┌────────────┐  Compile   ┌─────────┐\n│  raw\u002F   │ ─────────▶ │ \u003Cbranch>\u002F  │ ─────────▶ │  wiki\u002F  │\n│         │  (LLM      │ atom.md    │  (LLM      │ flat    │\n│ sources │  extract)  │ atom.md    │  group)    │ pages   │\n└─────────┘            │ ...        │            └────┬────┘\n                       └────────────┘                 │\n                                                      │\n                                ┌─────────────────────┼─────────────────────┐\n                                ▼                     ▼                     ▼\n                          gen-index.sh           lint.sh              log-append.sh\n                              │                     │                     │\n                              ▼                     ▼                     ▼\n                          index.md           lint-report.md            log.md\n```\n\nCompare to Karpathy's loop:\n\n```\nKarpathy:   raw → wiki → {Ingest, Query, Lint}\nThis repo:  raw → atoms → wiki → {Ingest, Query, programmatic Lint, LLM Lint}\n```\n\nAtoms are where the real work happens. 
Wiki is rebuildable from atoms; atoms are not rebuildable from wiki.\n\n---\n\n## What's in this repo\n\n```\nllm-atomic-wiki\u002F\n├── README.md              ← you are here\n├── README.zh-TW.md        ← Chinese version\n├── STORY.md               ← the personal story of running it end-to-end\n├── METHODOLOGY.md         ← 6-phase pipeline\n├── CLAUDE.md              ← schema for the LLM operating this repo\n│\n├── raw\u002F                   ← drop your source materials here (gitignored)\n│\n├── atoms\u002F                 ← knowledge atoms, organized by topic-branch (gitignored)\n│   ├── README.md\n│   ├── _template.md       ← copy when creating a new atom\n│   ├── \u003Cbranch-1>\u002F        ← one folder per topic-branch\n│   ├── \u003Cbranch-2>\u002F        ← e.g. ai-agent\u002F, ai-skills\u002F, mcp\u002F, ...\n│   └── ...\n│\n├── wiki\u002F                  ← compiled pages, flat (gitignored)\n│   └── _template.md       ← copy when creating a new wiki page\n│\n├── index.md               ← auto-generated navigation (gitignored)\n├── log.md                 ← change log, append-only (gitignored)\n│\n└── scripts\u002F\n    ├── lint.sh            ← programmatic Lint\n    ├── gen-index.sh       ← rebuild index.md from wiki\u002F\n    ├── log-append.sh      ← append a change entry to log.md\n    └── README.md\n```\n\nThe framework files (READMEs, METHODOLOGY, CLAUDE, scripts, templates) are versioned. Your actual content (raw, branch folders, wiki, generated index\u002Flog) is gitignored — this is intentional and load-bearing. The repo is the kit, not the data.\n\n---\n\n## Quickstart\n\n1. **Fork this repo.**\n2. **Read `METHODOLOGY.md`** — six phases from raw to wiki, plus the maintenance loop.\n3. **Read `CLAUDE.md`** — the formal spec (atom format, wiki format, branch rules, operations, what not to do).\n4. **Edit `.gitignore`** — replace the listed branch names with your own.\n5. **Drop materials into `raw\u002F`** — any text format. 
PDFs, transcripts, post dumps, articles.\n6. **Drive the pipeline with an LLM** — point Claude Code (or your agent) at `CLAUDE.md` and ask it to ingest a batch.\n7. **Run the scripts** after each compile:\n   ```bash\n   .\u002Fscripts\u002Fgen-index.sh        # rebuild wiki index\n   .\u002Fscripts\u002Flint.sh             # programmatic health check\n   .\u002Fscripts\u002Flog-append.sh \"...\" # record what changed\n   ```\n8. **Run an LLM Lint pass** weekly or after major ingests — see `METHODOLOGY.md`.\n\nThe whole loop is `Ingest → Compile → Index\u002FLog → Lint → Query`. Re-run as you accumulate materials.\n\n---\n\n## Deep dives\n\n- **[STORY.md](STORY.md)** — the personal story: why I ran it, what worked, what surprised me.\n- **[METHODOLOGY.md](METHODOLOGY.md)** — the six-phase pipeline (skeleton → segment-classify → extract → quality pass → external check → wiki compile) and the three maintenance operations.\n- **[CLAUDE.md](CLAUDE.md)** — the formal spec for any LLM operating this repo.\n\n---\n\n## Why this matters (and when it doesn't)\n\nKarpathy's thesis is that knowledge should be a persistent, compounded artifact — not regenerated from raw sources on every query. Compile beats RAG, in his framing. I agree, but with conditions:\n\n- **Knowledge volume under ~200 wiki pages.** Past that, index.md scans degrade and you need vector search alongside.\n- **Knowledge is relatively stable.** This is a cognitive map, not breaking news. Update cadence in days\u002Fweeks, not minutes.\n- **There's a single owner with a point of view.** Personal knowledge, not a hundred-author aggregation.\n- **Quality matters more than coverage.** 50 pages written tight beat 500 pages written shallow.\n\nOutside these conditions, RAG is often the better fit. The two are not exclusive — compile your stable core, RAG your long tail.\n\nA frame that I think gets undersold: Karpathy's real contribution isn't wiki quality. It's that **LLMs don't get bored maintaining the wiki**. 
The bookkeeping tax that kills most personal knowledge systems is the maintenance, not the structure. LLMs change the cost structure of maintenance, and that's the unlock the gist points at, more than any specific format choice.\n\n---\n\n## Credit\n\nThe pattern, the schema, the operations (Ingest \u002F Query \u002F Lint), the philosophy of compile-over-retrieve: all that is **[Andrej Karpathy's](https:\u002F\u002Fgist.github.com\u002Fkarpathy\u002F442a6bf555914893e9891c11519de94f)**. If you find this repo useful, his gist is the thing to read first.\n\nWhat this repo adds on top:\n- Four small additions to Karpathy's pattern (atom layer, topic-branches, two-layer Lint, parallel-compile lock)\n- A reference implementation methodology\n- A bilingual README and a story doc\n\nIf you fork it and find it useful, a star on Karpathy's original gist is more deserved than one on this repo.\n\n---\n\n## Star History\n\n[![Star History Chart](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=cablate\u002Fllm-atomic-wiki&type=Date)](https:\u002F\u002Fstar-history.com\u002F#cablate\u002Fllm-atomic-wiki&Date)\n","llm-atomic-wiki is a knowledge-management project that extends Andrej Karpathy's LLM Wiki pattern. It adds an atom layer, topic-branches, and a two-layer Lint mechanism on top of the original pattern. Its core features are breaking knowledge into atomic units (each unit represents one claim and carries metadata), organizing those units by topic, and using the two-layer Lint to keep content accurate and consistent. It suits individuals or teams who need to manage and maintain a large body of knowledge material efficiently, particularly when building a knowledge base or doing long-term research.",2,"2026-06-11 02:48:05","CREATED_QUERY"]