[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82288":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":20,"hasPages":20,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":43,"readmeContent":44,"aiSummary":45,"trendingCount":15,"starSnapshotCount":15,"syncStatus":14,"lastSyncTime":46,"discoverSource":47},82288,"pmb","oleksiijko\u002Fpmb","oleksiijko","Local-first persistent memory for AI coding agents (Claude Code, Cursor, Codex) via MCP. 94.5% LoCoMo recall@10, 70ms p50, multilingual, zero API keys.","https:\u002F\u002Fpypi.org\u002Fproject\u002Fpmb-ai\u002F",null,"Python",78,6,2,0,4,17,46.24,"Apache License 2.0",false,"main",[23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42],"ai-agents","ai-memory","bm25","claude-code","codex","cursor","knowledge-graph","lancedb","llm","local-first","mcp","memory","model-context-protocol","privacy","python","rag","semantic-search","sentence-transformers","sqlite","vector-search","2026-06-12 04:01:37","\u003Cdiv align=\"center\">\n\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Foleksiijko\u002Fpmb\u002Fmain\u002Fdocs\u002Fassets\u002Flogo.png\" width=\"180\" alt=\"PMB logo\">\n\n# PMB · Personal Memory Brain\n\n### Local-first persistent memory for AI agents - Claude Code, Cursor, Codex.\n### 94.5% LoCoMo recall@10 · 70ms p50 · multilingual · Apache 2.0 · zero API keys.\n\n[![PyPI version](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fpmb-ai.svg?logo=pypi&logoColor=white&label=pypi&color=blue)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpmb-ai\u002F)\n[![PyPI 
downloads](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fdm\u002Fpmb-ai.svg?logo=pypi&logoColor=white&label=downloads)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpmb-ai\u002F)\n[![Python versions](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fpyversions\u002Fpmb-ai.svg?logo=python&logoColor=white)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpmb-ai\u002F)\n[![CI](https:\u002F\u002Fgithub.com\u002Foleksiijko\u002Fpmb\u002Factions\u002Fworkflows\u002Fci.yml\u002Fbadge.svg?branch=main)](https:\u002F\u002Fgithub.com\u002Foleksiijko\u002Fpmb\u002Factions\u002Fworkflows\u002Fci.yml)\n[![License: Apache 2.0](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache_2.0-blue.svg)](LICENSE)\n[![MCP](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fprotocol-MCP-purple.svg)](https:\u002F\u002Fmodelcontextprotocol.io)\n[![LoCoMo Recall](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLoCoMo%20recall%4010-94.5%25-success.svg)](#-benchmarks)\n[![Latency](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fp50%20warm%20recall-70ms-success.svg)](#-benchmarks)\n[![Top-10 stress](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftop--10%20stress%20(900%20q)-99.2%25-success.svg)](#-benchmarks)\n[![Multilingual](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fmultilingual-50%2B%20langs-blueviolet.svg)](#-multilingual)\n[![Local first](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flocal--first-✓-success.svg)](#-privacy--security)\n\n[Quickstart](#-quickstart) · [Screenshots](#-screenshots--every-claim-above-captured-from-a-real-run) · [Benchmarks](#-benchmarks) · [Multilingual](#-multilingual) · [Architecture](#-architecture) · [FAQ](#-faq)\n\n\u003C\u002Fdiv>\n\n---\n\n## 📸 Screenshots - every claim above, captured from a real run\n\n\u003Cdiv align=\"center\">\n\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Foleksiijko\u002Fpmb\u002Fmain\u002Fdocs\u002Fassets\u002F01_connect.png\" width=\"780\" alt=\"pmb connect - wire memory into Claude Code & 
Codex\">\u003Cbr>\n\u003Csub>One command. Both Claude Code and Codex now share the same workspace.\u003C\u002Fsub>\n\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Foleksiijko\u002Fpmb\u002Fmain\u002Fdocs\u002Fassets\u002F05_locomo.png\" width=\"780\" alt=\"LoCoMo benchmark: 94.5% recall@10\">\u003Cbr>\n\u003Csub>Reproducible LoCoMo: \u003Ccode>python scripts\u002Fbenchmarks\u002Fbenchmark_locomo.py --n-conversations 10\u003C\u002Fcode> → 94.5%.\u003C\u002Fsub>\n\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Foleksiijko\u002Fpmb\u002Fmain\u002Fdocs\u002Fassets\u002F06_multilingual.png\" width=\"780\" alt=\"Multilingual atomic extraction across English, Spanish, German\">\u003Cbr>\n\u003Csub>25+ regex patterns + multilingual embedder cover 50+ languages out of the box.\u003C\u002Fsub>\n\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Foleksiijko\u002Fpmb\u002Fmain\u002Fdocs\u002Fassets\u002F07_mega.png\" width=\"780\" alt=\"Mega stress test - 900 queries, multi-language, top-10 = 99.2%\">\u003Cbr>\n\u003Csub>900-query multi-language stress test including cross-lingual pairs. \u003Ccode>top-10 = 99.2%, p50 = 70ms\u003C\u002Fcode>.\u003C\u002Fsub>\n\n[More screenshots: `pmb stats`, `pmb recall`, `pmb doctor` ↓](#-screenshots--cli-reference)\n\n\u003C\u002Fdiv>\n\n---\n\n## 📖 The problem\n\nYour AI agent forgets everything between sessions. You paste the same\ncontext every morning. You keep a separate notes file the agent can't see.\nYou repeat decisions you made last week.\n\n**PMB fixes this in 3 commands.** Memory survives across sessions, across\ntools (Claude Code + Cursor + Codex share one workspace), and across\nmachine restarts. 
Nothing leaves your disk.\n\n---\n\n## ⚡ What makes PMB different\n\n|                              | **PMB**               | mem0       | Letta      | Zep        |\n| :--------------------------- | :-------------------: | :--------: | :--------: | :--------: |\n| **LoCoMo recall@10**         | **94.5 %** *(reproducible, [see below](#-benchmarks))* | ~67-70 %   | ~76-80 %   | ~80 %      |\n| **p50 warm recall**          | **70 ms**             | 1-3 s      | 1-3 s      | 1-3 s      |\n| **MCP cold start (boot)**    | **~3.7 s**            | n\u002Fa        | n\u002Fa        | n\u002Fa        |\n| **First recall on empty ws** | **~0 ms** *(skips LanceDB import)* | n\u002Fa | n\u002Fa | n\u002Fa |\n| **Multilingual** (EN + RU + UK + 50+) | **✅ 81-83% top-1 on RU\u002FUK** | EN-mostly | EN-mostly | EN-mostly |\n| **Cross-lingual recall** (RU query → UK fact) | **✅ 100% on bench** | ⚠ | ⚠ | ⚠ |\n| **Per-call cost**            | **$0**                | metered    | metered    | metered    |\n| **Runs offline**             | **✅ no network**     | ❌ cloud   | partial    | partial    |\n| **API key required**         | **❌**                | ✅         | ✅         | ✅         |\n| **MCP-native**               | **✅ Claude Code \u002F Cursor \u002F Codex** | ❌ | ⚠️ | ⚠️ |\n| **Storage**                  | SQLite + LanceDB on disk | proprietary | proprietary | proprietary |\n| **Portable** (USB \u002F Dropbox) | **✅ just copy `~\u002F.pmb\u002F`** | ❌ | partial | partial |\n| **License**                  | Apache 2.0            | Apache 2.0 | Apache 2.0 | Apache 2.0 |\n\n> Numbers for mem0\u002FLetta\u002FZep are from their own published LoCoMo benchmarks;\n> we have not reproduced them locally. 
PMB numbers reproduce in one\n> command: `python scripts\u002Fbenchmarks\u002Fbenchmark_locomo.py --n-conversations 10`\n> (~6 min, no graders, no LLM, just retrieval scoring).\n\n---\n\n## 🚀 Quickstart\n\n> **TL;DR**\n> ```bash\n> pip install pmb-ai                    # CLI command remains `pmb`\n> pmb connect codex                     # or claude \u002F cursor\n> # restart your agent and say \"remember - I prefer Postgres\"\n> ```\n>\n> Or install from source for the latest unreleased changes:\n> ```bash\n> git clone https:\u002F\u002Fgithub.com\u002Foleksiijko\u002Fpmb.git && cd pmb\n> python -m venv .venv && source .venv\u002Fbin\u002Factivate\n> pip install -e .\n> ```\n\n### Detailed\n\n**1. Install (Python 3.11+ required).**\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Foleksiijko\u002Fpmb.git pmb\ncd pmb\npython -m venv .venv\n\n# Activate (run only the line for your platform)\nsource .venv\u002Fbin\u002Factivate                  # Linux \u002F macOS\n# .venv\\Scripts\\activate                   # Windows PowerShell\n\npip install -e .\n```\n\nYou now have a `pmb` command on your `$PATH`. Sanity-check:\n\n```bash\npmb doctor\npmb stats\n```\n\n**2. Hook up your AI agent.** One command per agent - 9 supported:\n\n```bash\npmb connect claude        # Anthropic Claude Code\npmb connect codex         # OpenAI Codex CLI\npmb connect cursor        # Cursor\npmb connect windsurf      # Codeium Windsurf\npmb connect gemini        # Google Gemini CLI\npmb connect vscode        # VS Code \u002F GitHub Copilot MCP\npmb connect zed           # Zed editor\npmb connect opencode      # OpenCode\npmb connect continue      # Continue.dev\n\npmb connect --list        # show every agent + its config path\n```\n\nThis writes an MCP server entry into the agent's config (e.g. `~\u002F.codex\u002Fconfig.toml`)\nand appends a tiny rule block to `AGENTS.md` \u002F `CLAUDE.md`. Point several agents at\none shared workspace with `--workspace personal` so they all see the same memory.\n\n**3. 
Use your agent normally.** PMB activates only on explicit memory triggers:\n\n| What you say                                | What PMB does                                    |\n| :------------------------------------------ | :----------------------------------------------- |\n| `\"remember - my cat is allergic to chicken\"` | record a pinned fact (importance 0.95)           |\n| `\"I work on the pmb-dashboard project\"`     | record a fact about you\u002Fyour project             |\n| `\"what did I research about Next.js?\"`      | pulls last research summaries                    |\n| `\"why did we pick Postgres?\"`               | recalls the project decision                     |\n| `\"what is JWT?\"`                            | **does nothing** - general questions bypass PMB  |\n\n**4. Inspect what's stored.**\n\n```bash\npmb tui            # terminal UI: Memory · Recall · Stats · Dedup · Tune\npmb dashboard      # web UI on http:\u002F\u002F127.0.0.1:8765\n```\n\n---\n\n## 🔄 Sync, backup, and team memory - without a server\n\nA PMB workspace is just files on disk, so git can version and sync it. 
No\ncloud service, no account, no vendor in the middle:\n\n```bash\n# Back up \u002F sync your memory to any git remote (private or public)\npmb workspace init --remote git@github.com:you\u002Fmy-memory.git\npmb workspace push                       # commit + push (after every session, or via cron)\npmb workspace pull                        # on a second machine - remote wins on conflict\npmb workspace clone \u003Curl> work-laptop     # bring memory to a fresh device\n\n# Encrypt the whole workspace into one portable, authenticated bundle\npmb workspace export memory.enc           # AES + HMAC, scrypt-derived key\npmb workspace import memory.enc personal  # restore anywhere\n```\n\nThe two compose: **`export` to an encrypted bundle, then push that to a\n*public* repo** - the remote only ever sees ciphertext, so your memory is\nbacked up everywhere and readable nowhere but your machine. Cloud memory\nservices (mem0, Letta, Zep) can't offer this; the data has to live on their\nservers to work. (Encryption needs `pip install 'pmb-ai[crypto]'`.)\n\n---\n\n## 📥 Import your existing memory - no empty cold start\n\nA fresh memory knows nothing about you, which makes day one feel useless. But\nyou already have years of context in ChatGPT, Claude, mem0, or a notes folder.\nBring it in with one command (the entity graph rebuilds automatically after):\n\n```bash\npmb import chatgpt ~\u002FDownloads\u002Fconversations.json   # OpenAI data export\npmb import claude  ~\u002FDownloads\u002Fclaude-export\u002F        # Anthropic data export\npmb import mem0    mem0_dump.json                     # migrate off a competitor\npmb import markdown ~\u002Fnotes\u002F                          # Obsidian vault \u002F plain notes\n```\n\n`--roles user,assistant` controls which chat turns to keep (default: your own\nwords); `--dry-run` previews without writing.\n\n---\n\n## 🔍 `pmb why` - see exactly why recall ranked a result\n\nMost memory tools are a black box. 
PMB shows the full reranker trace: which of\nthe 14 predicate-aware rules fired on each result and the multiplier each one\ncontributed.\n\n```bash\n$ pmb why \"where do I live now\"\n#1  I moved to Lisbon in April 2026 from Kyiv\n    ▲ verb-match                              ×1.25\n    ▲ now\u002Fcurrent vs past tense               ×1.30\n    ▲ self-intent (first-person rescue)       ×1.30\n    net PAMVR multiplier: ×2.11\n```\n\nGreat for debugging a miss, tuning, or just trusting the engine.\n\n---\n\n## 🔌 Pluggable embedders\n\nThe default multilingual model needs no setup. But you can point PMB at your\nown embedder - fully local via Ollama, or a hosted API:\n\n```bash\npmb config set embedding.backend ollama     # local, offline (nomic-embed-text)\npmb config set embedding.backend openai     # text-embedding-3-small (needs OPENAI_API_KEY)\npmb config set embedding.backend fastembed  # ONNX, fast on CPU\n```\n\nA dimension guard refuses to mix embedders of different vector sizes in one\nworkspace (which would corrupt recall) - switch backends in a fresh workspace\nor re-embed. Use `pmb why` and the LoCoMo bench to verify recall holds on yours.\n\n---\n\n## 📊 Benchmarks\n\n### 1. 
LoCoMo (the standard) - 94.5% recall@10\n\nLoCoMo is the multi-session benchmark from Snap Research: 10 conversations × ~199 QA pairs each, cited by mem0, Letta, and Zep in their papers.\n\n```\nmean evidence_recall@10 = 94.5% (full 10-conv run, default settings)\n\n   conv-26  █████████████████████████  96.0%   conv-44  █████████████████████████  96.2%\n   conv-30  █████████████████████████  95.2%   conv-47  ████████████████████████   93.2%\n   conv-41  ███████████████████████    90.7%   conv-48  █████████████████████████  96.7%\n   conv-42  █████████████████████████  94.6%   conv-49  ████████████████████████   92.9%\n   conv-43  █████████████████████████  95.0%   conv-50  █████████████████████████  94.6%\n                                                                  all 10 ≥ 90.7%\n```\n\nReproduce in one command:\n```bash\npython scripts\u002Fbenchmarks\u002Fbenchmark_locomo.py --n-conversations 10\n```\n\nLatency: p50 ranges 65-95 ms across conversations, p95 96-142 ms.\n\n### 2. Mega stress test - 900 queries, multi-language, all features on\n\nA harder bench than LoCoMo: 30 base queries × 30 paraphrases each, mixing\n**English coding**, **Russian personal**, **Ukrainian personal**, and\ncross-lingual pairs. Runs with the full PAMVR + auto-vocab + atomic-fact\npipeline.\n\n```\nHEADLINE: top-1 = 73.3% · top-3 = 87.3% · top-10 = 99.2%\nLatency: p50 70ms · p95 183ms · p99 292ms\n\nper-language top-1 (n=queries):\n   en           300   79.3%   ████████████████\n   ru           300   81.0%   ████████████████\n   uk           180   82.8%   ████████████████\n   ru→uk        30   100.0%   ████████████████████ ←  cross-lingual works\n```\n\nReproduce:\n```bash\npython scripts\u002Fbenchmarks\u002Fmega_stress_test.py --n-paraphrases 30\n```\n\n### 3. What actually carries the LoCoMo number\n\nHonest take from a full ablation (`scripts\u002Fbenchmarks\u002Fablation_full.py`):\n\n- **BM25 lexical retrieval is the dominant signal.** Disabling it costs 18 points. 
Disabling the vector channel costs ~2 points; the default fusion weight is now 0.7 BM25 \u002F 0.3 vector.\n- **The cross-encoder reranker regresses 17 points on LoCoMo.** It is available as an opt-in flag (`recall.rerank = True`) but is **not recommended** for this workload.\n- **Twelve of nineteen ablated layers show 0.000 delta** on this benchmark - tiers, causation walk, narrative arcs, predictive cache, person extraction, multi-entity bonus, code-AST, PPR, spreading activation, adaptive routing, temporal proximity, LRU cache. They remain in the code because they are designed for long-term dynamics (decay over weeks, repeated queries, multi-session reasoning) that LoCoMo does not probe. **We do not claim they are responsible for the LoCoMo score.**\n\n### 4. 
Latency\n\n```\noperation                                  p50       p95       notes\n──────────────────────────────────────────────────────────────────────────\nrecall (warm engine, mega-stress avg)       70 ms   183 ms    hybrid BM25 + vector + PAMVR\nrecall (LoCoMo per-conv avg)                65-95 ms 96-142 ms one workspace, ~25 events\nrecall (cache hit)                          \u003C1 ms     5 ms    LRU cache\nrecord_batch via MCP (fire-and-forget)       2 ms    11 ms    returns instantly; embed async\nrecord_batch via direct API (sync, n=1)    ~40 ms   113 ms    one fact, one embedding call\nrecent_activity \u002F list_goals                 3 ms    10 ms    pure SQL\npin \u002F unpin                                  5 ms    15 ms    single SQLite UPDATE\npmb stats \u002F pmb list \u002F pmb config           ~900 ms ~1100 ms  full CLI invocation incl. Python boot\n──────────────────────────────────────────────────────────────────────────\nMCP server boot (Codex \u002F Claude Code)       3.7 s              async prewarm runs in background\nMCP first recall on EMPTY workspace         \u003C50 ms             SQL count short-circuits LanceDB import\nMCP first recall AFTER `pmb warmup`         \u003C100 ms            model + LanceDB + BM25 all preloaded\n```\n\n`import lancedb` (~22 s on Windows) is now **fully deferred** - read-only\nCLI commands never pay it, and the MCP server uses an async prewarm that\nreturns boot in ~4 s instead of blocking 45 s.\n\n### 5. 
Reproduce locally\n\n```bash\npython scripts\u002Fbenchmarks\u002Fbenchmark_locomo.py --n-conversations 10  # 94.5%\npython scripts\u002Fbenchmarks\u002Fmega_stress_test.py --n-paraphrases 30    # 900 queries\npython scripts\u002Fbenchmarks\u002Fablation_full.py --n-conversations 3      # what carries it\npython scripts\u002Fbenchmarks\u002Fperf_bench.py                             # latency \u002F throughput\n```\n\n---\n\n## 🌍 Multilingual\n\nPMB ships the multilingual `paraphrase-multilingual-MiniLM-L12-v2`\nembedder by default - covering **50+ languages**. The recall pipeline\n(PAMVR, atomic fact extraction, auto-vocab bridges) adds explicit regex\npatterns for the three primary languages (English, Russian, and Ukrainian),\nand falls back to embedder-only matching\nfor everything else.\n\n### Real numbers from `mega_stress_test.py` (n=900 queries)\n\n```\nLanguage                  n       top-1     top-3\n────────────────────────────────────────────────────\nEnglish (coding)         300     79.3%     90.0%\nRussian                  300     81.0%     99.7%   ← multilingual embedder shines\nUkrainian                180     82.8%     87.2%\nCross-lingual RU → UK     30    100.0%    100.0%   ← embedder bridges related languages\n```\n\n### Atomic fact extraction without LLM\n\n```text\nInput (EN):  \"Today I met Alice, the tech lead. She lives in Berlin. We use Cloud Run for deployment.\"\nPMB extracts:\n  • Alice is the tech lead\n  • She lives in Berlin\n  • We use Cloud Run for deployment\n\nInput (ES):  \"Mi nombre es Carlos. Vivo en Madrid. Trabajo como ingeniero.\"\nPMB extracts (via multilingual embedder + structural patterns):\n  • Name: Carlos\n  • Lives in Madrid\n  • Works as engineer\n\nInput (DE):  \"Ich heiße Anna. Ich wohne in München. Mein Geburtstag ist 7. Juni.\"\nPMB extracts:\n  • Name: Anna\n  • Lives in München\n  • Birthday: 7. 
Juni\n```\n\n25+ regex patterns cover name, location, work, birthday, preference,\nfamily, ownership across the three primary languages. The embedder\nhandles the rest. Enable atomic extraction per-workspace:\n\n```bash\npmb config set write.atomic_fact_extract true\n```\n\n### Fact replacement (when life changes)\n\n```python\neng.record_keyed_fact(\"user\", \"residence\", \"Kyiv\")\neng.record_keyed_fact(\"user\", \"residence\", \"Warsaw\")   # archives Kyiv\n\n# Recall now returns ONLY Warsaw; Kyiv stays in history:\neng.get_keyed_fact_history(\"user\", \"residence\")\n# → [{\"value\": \"Warsaw\", \"is_current\": True},\n#    {\"value\": \"Kyiv\",   \"is_current\": False}]\n```\n\n### Multilingual safety: `pmb doctor` flags mismatched embedder\n\nIf your workspace has ≥5% non-Latin characters AND you've configured an\nEnglish-only embedder (e.g. `all-MiniLM-L6-v2`), `pmb doctor` shows:\n\n```\nMultilingual fit  │ warn  │ Workspace has 81% non-Latin chars but uses\n                  │       │ all-MiniLM-L6-v2 (English-only). 
Switch to a\n                  │       │ multilingual model: pmb config set embedding.model\n                  │       │ paraphrase-multilingual-MiniLM-L12-v2\n```\n\n---\n\n## 📸 Screenshots - CLI reference\n\n\u003Cdiv align=\"center\">\n\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Foleksiijko\u002Fpmb\u002Fmain\u002Fdocs\u002Fassets\u002F02_stats.png\" width=\"700\" alt=\"pmb stats - workspace overview\">\u003Cbr>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Foleksiijko\u002Fpmb\u002Fmain\u002Fdocs\u002Fassets\u002F03_recall.png\" width=\"700\" alt=\"pmb recall - hybrid search from the shell\">\u003Cbr>\u003Cbr>\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Foleksiijko\u002Fpmb\u002Fmain\u002Fdocs\u002Fassets\u002F04_doctor.png\" width=\"700\" alt=\"pmb doctor - health check with multilingual warning\">\n\n\u003C\u002Fdiv>\n\n---\n\n## 📊 Web Dashboard - `pmb dashboard`\n\nLaunch the local web UI at `http:\u002F\u002F127.0.0.1:8765` - no auth, no cloud,\njust a window into your memory:\n\n```bash\npmb dashboard\n```\n\n\u003Cdiv align=\"center\">\n\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Foleksiijko\u002Fpmb\u002Fmain\u002Fdocs\u002Fassets\u002F08_dashboard.png\" width=\"900\" alt=\"PMB Dashboard - overview\">\u003Cbr>\n\u003Csub>Overview tab: total events, active \u002F pinned \u002F archived counts, entity graph stats.\u003C\u002Fsub>\u003Cbr>\u003Cbr>\n\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Foleksiijko\u002Fpmb\u002Fmain\u002Fdocs\u002Fassets\u002F09_dashboard_events.png\" width=\"900\" alt=\"PMB Dashboard - Events tab\">\u003Cbr>\n\u003Csub>Events tab: timeline of recorded facts, activities, decisions. 
Each row is sortable.\u003C\u002Fsub>\u003Cbr>\u003Cbr>\n\n\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Foleksiijko\u002Fpmb\u002Fmain\u002Fdocs\u002Fassets\u002F10_dashboard_recall.png\" width=\"900\" alt=\"PMB Dashboard - Recall Debug tab\">\u003Cbr>\n\u003Csub>Recall Debug tab: test any query against the workspace, see the ranked\nresults with PAMVR score breakdown - useful for tuning \u003Ccode>recall.*\u003C\u002Fcode> knobs.\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\nOther tabs include **Entities**, **Graph** (interactive entity-edges visualisation),\n**Arcs** (narrative clusters), **Duplicates**, **Performance** (per-MCP-call timings).\n\n---\n\n## 🏛 Architecture\n\n```mermaid\nflowchart TB\n    A[\"AI agent\u003Cbr\u002F>Claude Code · Cursor · Codex\"] -->|MCP protocol| B[\"PMB MCP server\u003Cbr\u002F>12 tools by default\"]\n    B --> C[Engine]\n    C -->|read pipeline| R[\"Hybrid recall\u003Cbr\u002F>BM25 + vector + graph\u003Cbr\u002F>+ PAMVR boosts\"]\n    C -->|write path 2ms| W[\"Persist + async embed\u003Cbr\u002F>SQLite first, vector later\"]\n    R --> D[(SQLite events)]\n    R --> E[(LanceDB vectors)]\n    R --> F[(BM25 pickle)]\n    W --> D\n    W --> E\n    W --> F\n    style A fill:#e0f2fe,color:#0c4a6e\n    style B fill:#ddd6fe,color:#4c1d95\n    style C fill:#fef3c7,color:#78350f\n    style R fill:#d1fae5,color:#064e3b\n    style W fill:#fed7aa,color:#7c2d12\n    style D fill:#f3f4f6,color:#111\n    style E fill:#f3f4f6,color:#111\n    style F fill:#f3f4f6,color:#111\n```\n\n\u003Cdetails>\n\u003Csummary>Text-only architecture (collapse this to see the diagram above)\u003C\u002Fsummary>\n\n```\n                      ┌─────────────────────────────────────────────┐\n                      │              AI agent                       │\n                      │   (Codex CLI · Claude Code · Cursor · …)    │\n                      └──────────────────┬──────────────────────────┘\n                                         │  MCP (Model 
Context Protocol)\n                                         ▼\n                      ┌─────────────────────────────────────────────┐\n                      │  PMB MCP server  -  12 tools by default     │\n                      │  record_batch · recall · pin · list_goals · │\n                      │  recent_activity · what_just_happened · …   │\n                      └──────────────────┬──────────────────────────┘\n                                         │\n                                         ▼\n                ┌────────────────────────────────────────────────┐\n                │  Engine                                        │\n                │  ─────────────────────────────────────────     │\n                │  READ pipeline (12 stages, all gated):         │\n                │   embed → BM25 → vector → graph traversal      │\n                │   → causation walk → arc expansion → PPR       │\n                │   → reranker → adaptive decompose → fusion     │\n                │                                                │\n                │  WRITE path (≤ 2 ms MCP return):               │\n                │   sync: SQLite insert                          │\n                │   async: embed → LanceDB → entity graph        │\n                │   dedup: L1 exact + L2 cosine + L2.5 LLM-verify│\n                └────────────────────────────────────────────────┘\n                                         │\n                       ┌─────────────────┴──────────────────┐\n                       ▼                                    ▼\n              ┌─────────────────┐                  ┌────────────────┐\n              │     SQLite       │                 │    LanceDB     │\n              │  events          │                 │  vectors       │\n              │  graph_entities  │                 │  CLIP (images) │\n              │  graph_edges     │                 └────────────────┘\n              │  mcp_calls       │\n              │  dedup_pending   │\n             
 │  predictive_cache│\n              └─────────────────┘\n```\n\n\u003C\u002Fdetails>\n\n### Thirteen storage layers\n\n> **Honest note:** these are the **types of data** PMB can store and reason over, not thirteen ranking signals each pulling its weight. Ablation on LoCoMo (see [Benchmarks](#-benchmarks)) shows that BM25 over raw text + the entity co-occurrence graph (layers 1, 2, 5) carry essentially all of the single-session retrieval quality. Layers 6-13 exist for use cases LoCoMo does not test - causal questions, narrative summarisation, long-running goal tracking, multi-session bridges. Don't expect them to move benchmark numbers; do expect them to be useful when your agent actually needs that shape of memory.\n\n| Layer                       | What                                                    | Where                                  |\n| :-------------------------- | :------------------------------------------------------ | :------------------------------------- |\n| 1. Raw events               | every fact\u002Fqa\u002Fdecision the user records                 | `events` table                         |\n| 2. Entities                 | tech names, files, concepts (regex-extracted)           | `graph_entities`                       |\n| 3. Persons                  | people mentioned in chat (5-stage regex pipeline)       | `graph_entities` kind=person           |\n| 4. Code AST                 | Python `def`\u002F`class`\u002F`import` from code blocks          | `graph_entities` kind=function\u002Fclass   |\n| 5. Co-occurrence graph      | \"A & B were in the same event\" edges                    | `graph_edges`                          |\n| 6. Typed causation edges    | `references`, `supersedes`, `caused_by`                 | `event_edges`                          |\n| 7. Atomic facts             | mem0-style decomposition of long messages               | facts attached via metadata            |\n| 8. 
Fact trees               | one main event + N linked subfacts                      | metadata.parent_ulid                   |\n| 9. Reflections              | LLM-generated \"why does this matter\" bridges            | sleep-mode, optional                   |\n| 10. Narrative arcs          | clusters of related events into stories                 | sleep-mode, optional                   |\n| 11. Bi-temporal index       | `event_time` vs `system_time` (when vs recorded)        | metadata.event_time                    |\n| 12. Activity log            | working-memory tier (3-day decay)                       | event_type=activity                    |\n| 13. Goals + milestone chains| explicit goals with status + tracked metric evolution   | event_type=goal\u002Fmilestone              |\n\n### Five access paths at recall time\n\n```\n                                                     ┌→ BM25 (lexical)\n                                                     │\n                                                     ├→ vector (cosine, multilingual)\n   query  →  classify  →  pick weights  →  fuse  →   ┼→ graph traversal\n                              ↑                      │\n                              │                      ├→ Personalized PageRank\n                       (adaptive routing)            │\n                                                     └→ predictive cache (sleep-baked)\n```\n\nAll five paths fire in parallel where they are independent; their results are then merged using importance × recency × graph weights.\n\n### Three memory tiers\n\n```\n                  tier        decay rate          use\n                  ──────────  ──────────────────  ───────────────────────\n                  working     ~2-day half-life    recent edits, AI logs\n                  episodic    ~46-day half-life   facts, events\n                  semantic    ~346-day half-life  pinned, goals, identity\n```\n\nThe tiers govern **long-term importance decay**: events that aren't re-accessed lose importance gradually, 
faster in `working` than in `semantic`. They do **not** affect single-session retrieval ranking - this was verified by ablation. The tier abstraction is in PMB for forgetting \u002F consolidation behaviour over days and weeks, which LoCoMo does not measure.\n\n---\n\n## 🛠 What gets stored, when (and what doesn't)\n\nPMB is **lazy by default**. The AI only touches it on explicit triggers:\n\n```\n┌──────────────────────────────────────┬─────────────────────────────────────────┐\n│ Trigger phrase                       │ PMB action                              │\n├──────────────────────────────────────┼─────────────────────────────────────────┤\n│ \"remember \u002F save \u002F pin\"              │ record + pin (importance 0.95)          │\n│ \"I work on X\"  •  \"we use Y\"         │ record fact (importance 0.7)            │\n│ \"my cat is X\"  •  personal facts     │ record fact tree if there are subfacts  │\n│ \"I want to ship X by Y\"              │ record goal with due_at                 │\n│ \"we switched from X to Y\"            │ record decision + maybe milestone       │\n├──────────────────────────────────────┼─────────────────────────────────────────┤\n│ Agent autonomously decided\u002Fedited\u002Ffixed │ activity(kind=decision\u002Fedit\u002Fcompleted)│\n│ Tracked metric changed               │ milestone in named chain                │\n│ User asked an info question          │ optional 1-line research summary        │\n├──────────────────────────────────────┼─────────────────────────────────────────┤\n│ \"what is Next.js?\" (general Q)       │ ❌ no save, no recall - answers directly│\n│ \"how do I write a for loop?\"         │ ❌ no save, no recall                   │\n│ Debugging \u002F coding help              │ ❌ no save, no recall                   │\n└──────────────────────────────────────┴─────────────────────────────────────────┘\n```\n\nThis is the design - PMB is a memory for **you**, not a log of every Q&A.\n\n> ⚠️ **Important: the \"lazy by 
default\" gate lives in the agent, not in PMB.**\n> PMB is a retrieval engine - it will always return top-K for any query you hand it. The decision to *not* call `recall()` on general questions like \"what is JWT?\" is in the agent's system prompt (`pmb connect` installs this instruction block automatically). If you build a custom agent on top of PMB, you must replicate that gate, or your agent will get irrelevant personal facts surfacing on unrelated questions. See `src\u002Fpmb\u002Fcli\u002Fconnect.py` for the canonical instruction block PMB injects.\n\n### Which features help which use case\n\nDifferent memory workloads benefit from different parts of the system. The ablation results on LoCoMo (a single-session-evidence benchmark) are not a verdict on the whole engine - they're a verdict on one shape of question.\n\n| Use case | What the agent asks | What helps | Settings to enable |\n|---|---|---|---|\n| **Single-session evidence recall**\u003Cbr\u002F>(\"who said what about X in this thread?\") | \"what did we decide about Postgres?\" | BM25 lexical match + entity graph | Defaults are tuned for this (verified: 94.1% on LoCoMo). Leave `recall.bm25_weight = 0.7`, `recall.typo_correction = False`. |\n| **Multi-hop reasoning across events**\u003Cbr\u002F>(\"who introduced X, and why did Y reject it?\") | \"why did we move away from microservices?\" | Causation walk + adaptive query decomposition | `recall.causation_walk = True` (default), `recall.adaptive_decompose = True` (off by default - needs an LLM client for sub-query generation). |\n| **Long-running goal tracking**\u003Cbr\u002F>(\"am I closer to my Q2 target?\") | \"what's the status of the launch?\" | Goals + milestone chains | Use `record_batch [{\"type\":\"goal\", ...}, {\"type\":\"milestone\", ...}]`. `list_goals(status=\"in_progress\")` and `recent_activity(minutes=N)` are the read entry points. 
|\n| **Narrative \u002F \"history of X\" queries**\u003Cbr\u002F>(\"walk me through how we got here\") | \"tell me the story of the auth rewrite\" | Narrative arcs + reflections | `pmb arcs cluster` to seed; `recall.arc_expansion = True` (default). Requires `pmb reflect` runs to produce bridges. |\n| **Cross-session bridges**\u003Cbr\u002F>(\"you said something like this a month ago...\") | open-ended \"this reminds me of...\" | Reflections-as-edges + spreading activation | `recall.reflection_to_edges = True` (default), `recall.spreading_activation = True` (default). Run `pmb reflect` periodically. |\n| **Date-anchored questions**\u003Cbr\u002F>(\"what was I doing in March?\") | \"what did I work on last week?\" | Temporal proximity + bi-temporal index | `recall.temporal_enabled = True` (default). Auto-extracts dates from text. |\n| **Code memory**\u003Cbr\u002F>(\"which file imports module X?\") | function\u002Fclass\u002Fimport retrieval | AST entity extraction | `recall.code_ast_extraction = True` (default). Python only today. |\n| **Multilingual \u002F cross-lingual**\u003Cbr\u002F>(query in one language, fact in another) | \"wann haben wir Postgres gewählt?\" | Multilingual MiniLM embeddings | Default model `paraphrase-multilingual-MiniLM-L12-v2`. 50+ languages. |\n| **Decay \u002F \"forget what's stale\"**\u003Cbr\u002F>(low-importance items aging out) | n\u002Fa (background) | Three tiers + per-tier decay rates | Run `pmb decay` (manual) or enable `consolidate.auto_trigger = True`. |\n| **Sleep-mode generalisation**\u003Cbr\u002F>(extract patterns from many small facts) | n\u002Fa (background) | LLM consolidation | `pmb consolidate` with Anthropic or Ollama backend. |\n| **General Q&A**\u003Cbr\u002F>(\"what is JWT?\") | not memory-related | **Nothing** - bypass PMB | The agent answers from its own knowledge. PMB stays out of the loop. |\n\nIf your workload doesn't appear here, that doesn't mean PMB can't help - it means we haven't benchmarked it. 
Open an issue with a description and we'll tell you what to enable (or admit we don't know).\n\n---\n\n## 💻 CLI reference\n\n```\npmb stats                  workspace summary (event count, by type, graph stats)\npmb list                   last N events\npmb recall \"\u003Cquery>\"       search memory from the shell\npmb why \"\u003Cquery>\"          explain the ranking - full PAMVR rule trace\npmb fact \"\u003Ccontent>\"       record a standalone fact\npmb import \u003Csrc> \u003Cpath>    import chatgpt | claude | mem0 | markdown history\npmb pin \u003Culid>             pin a memory (max importance, no decay)\npmb forget \u003Culid>          archive (reversible)\npmb feedback \u003Culid> useful|wrong   tune importance based on real outcomes\n\npmb tui                    full TUI: Memory · Recall · Stats · Dedup · Tune\npmb dashboard              web UI on :8765\npmb tune                   settings-only TUI (67 knobs)\n\npmb connect \u003Cagent>               auto-wire MCP (9 agents; --list to see all)\npmb ollama status|use|test         local LLM integration\n\npmb workspace push|pull            sync memory to\u002Ffrom any git remote\npmb workspace clone \u003Curl> \u003Cname>   clone a remote workspace\npmb workspace export \u003Cfile.enc>    encrypt workspace to a portable bundle\npmb workspace import \u003Cfile.enc>    restore an encrypted bundle\n\npmb dedupe                 one-shot duplicate sweep\npmb regraph                rebuild the entity graph from events\npmb prune-graph            drop weak co-occurrence edges\npmb reindex                re-embed all events (after model change)\npmb reflect                LLM-generated bridges (sleep-mode)\npmb arcs cluster|list|show narrative arcs\n\npmb config get|set|list    flat-key tuning from the shell\npmb doctor                 health check (model, DB, MCP, …)\n```\n\n---\n\n## ⚙️ Configuration\n\n**67 settings**, organised by category. 
Browse \u002F edit them three ways:\n\n```bash\npmb tui              # interactive TUI, tab [5] Tune\npmb tune             # settings-only TUI\npmb config set recall.top_k 10           # one-liner from shell\n```\n\n### What you'll most likely want to tune\n\n| Setting                   | Default | What it does                                       |\n| :------------------------ | :-----: | :------------------------------------------------- |\n| `recall.top_k`            | 5       | results returned per query                          |\n| `recall.bm25_weight`      | 0.5     | BM25 vs vector mix (0 = pure vector, 1 = pure BM25)|\n| `recall.rerank`           | false   | add cross-encoder reranker (+50 ms, +precision)    |\n| `recall.recency_half_life_days` | 30 | how fast recent events outweigh old ones           |\n| `dedup.cosine_high`       | 0.92    | merge threshold (higher = more conservative)       |\n| `dedup.enable_semantic`   | true    | turn off to rely on exact-text dedup only          |\n| `embedding.backend`       | sentence-transformers | switch to `fastembed` for 3-5× faster embed |\n| `mcp.record_batch_async`  | true    | fire-and-forget MCP writes                         |\n| `decay.factor_per_day`    | 0.985   | set to 1.0 to disable forgetting                   |\n| `consolidate.auto_trigger`| false   | turn on for nightly LLM consolidation              |\n\nFull list: `pmb config list` or open the TUI Tune tab.\n\n---\n\n## 🦙 Fully local with Ollama\n\nPMB doesn't *need* a cloud LLM, ever. The vector embedder is local (sentence-transformers). The optional LLM-powered ops (consolidation, dedup verification, the `pmb-chat` standalone loop) can all run through Ollama:\n\n```bash\n# 1. Install Ollama → https:\u002F\u002Follama.com\u002Fdownload\nollama serve &\nollama pull llama3.1:8b              # ~5 GB, balanced default\n\n# 2. 
Point PMB at it\npmb ollama use balanced              # configures all LLM-using ops\npmb ollama status                    # health check\npmb ollama test                      # 1-shot PONG smoke test\n```\n\nNow PMB is **100% offline**:\n\n```\n┌────────────────────────────┬─────────────────────────────┐\n│  Operation                 │  Runs where?                 │\n├────────────────────────────┼─────────────────────────────┤\n│  Embedding                 │  your machine (CPU\u002FGPU)      │\n│  Vector + BM25 + graph     │  your machine                │\n│  record_batch \u002F recall     │  your machine                │\n│  Dedup L1+L2               │  your machine                │\n│  Dedup L2.5 (LLM verify)   │  your machine via Ollama     │\n│  Consolidation             │  your machine via Ollama     │\n│  pmb-chat                  │  your machine via Ollama     │\n└────────────────────────────┴─────────────────────────────┘\n```\n\nFull guide: [`docs\u002FSETUP_OLLAMA.md`](docs\u002FSETUP_OLLAMA.md).\n\n---\n\n## 🔒 Privacy & security\n\n- **Local only.** PMB itself doesn't open any network connections. All data sits in `~\u002F.pmb\u002F`.\n- **No telemetry.** PMB doesn't phone home, has no analytics, no usage reporting.\n- **The agent has its own networking.** Claude Code talks to api.anthropic.com, Codex to OpenAI, etc. PMB has no control over that - but PMB doesn't add a second channel.\n- **Secret redaction.** `record_fact` runs a regex scrubber over content (API keys, tokens, AWS\u002FGCP creds patterns). 
It's not bulletproof; don't deliberately feed PMB secrets.\n- **Single-user model.** Anyone with read access to `~\u002F.pmb\u002Fworkspaces\u002F\u003Cid>\u002Fevents.sqlite` can read all your memory.\n\nSee [`SECURITY.md`](SECURITY.md) for the full threat model and vulnerability reporting.\n\n---\n\n## 🗺 Roadmap\n\n### Shipped in v0.1\n- [x] 13 storage layers, 5 retrieval signals, 3 decay tiers\n- [x] MCP server with 50+ tools (12 exposed by default)\n- [x] Web dashboard + 5-tab TUI\n- [x] Async fire-and-forget writes (~2 ms MCP response)\n- [x] BM25 fallback for cold reads (no blocking model load)\n- [x] Multi-layer dedup (exact + cosine + LLM-verify)\n- [x] Cross-lingual recall (multilingual MiniLM by default)\n- [x] Per-MCP-call performance tracking\n- [x] Ollama backend for fully-local LLM ops\n- [x] LoCoMo evidence-recall@10: **94.5%** on the full 10-conversation run with default settings (up from 91.6% under previous defaults)\n- [x] Lazy package imports - `import pmb` takes 48 ms (was ~14 s)\n- [x] **Lazy LanceDB import** - `Engine()` no longer pays the 22 s `import lancedb` cost up front; CLI commands `pmb stats \u002F list \u002F config \u002F pin \u002F forget` now run in ~1 s end-to-end (was ~14 s)\n\n### Known issues \u002F on the roadmap for v0.2\n- [ ] **Sync `record_batch(100)` still takes ~11 s** even with batched embedding. The per-item cost is graph indexing + temporal\u002Fcausation edge inserts + L1 dedup, not embedding (already batched). Fix: a `record_batch_bulk` mode that defers graph work. Affects bulk imports, not agent traffic (MCP returns in 2 ms).\n- [ ] **Long-term ablation untested.** The tier \u002F decay \u002F arc \u002F causation features are designed for multi-session dynamics but PMB has no benchmark for that scenario yet. 
Either build one or be more conservative about claims.\n- [ ] **Reranker regression on LoCoMo.** Cross-encoder is off by default after ablation; investigate which workloads (if any) it actually helps.\n- [ ] Persistent daemon mode - `pmb daemon start`, every Codex session connects to a hot process (no cold start)\n- [ ] PyPI publication - `pip install pmb`\n- [ ] Web dashboard: workspace switcher, settings tab\n- [ ] LLM-judge benchmark wired into CI for regression catching\n- [ ] Auto-backup \u002F export-import commands\n- [ ] First-class macOS \u002F Linux testing (Windows is the primary CI target today)\n\n### Not planned\n- Multi-user, multi-device, cloud sync. PMB is single-machine on purpose.\n- A new GUI framework. The dashboard stays vanilla HTML+JS; the TUI stays Textual.\n- Plugin marketplaces, model hubs, third-party tool stores.\n\n---\n\n## ❓ FAQ\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>How is this different from just pasting context every time?\u003C\u002Fb>\u003C\u002Fsummary>\n\nPasting works for one or two facts. PMB survives across **every** session of **every** agent that supports MCP, indefinitely. And it surfaces context you forgot you ever mentioned.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Why not just use mem0 \u002F Letta \u002F Zep?\u003C\u002Fb>\u003C\u002Fsummary>\n\n- They're cloud services with per-call costs and rate limits.\n- They send your conversations to their servers.\n- On the public LoCoMo benchmark, PMB recalls competitively with their published numbers - and the **methodology** (run the same `benchmark_locomo.py` locally) is auditable, not a marketing slide.\n- Hot-path latency is ~10-30× lower, although PMB has a ~14 s cold-start cost the cloud services don't.\n\nIf their trade-offs are fine for your use case, use them. 
PMB exists for people who want local + auditable + cheap, knowing that the \"single process owns the memory\" model is a real constraint.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Will PMB slow down my AI agent?\u003C\u002Fb>\u003C\u002Fsummary>\n\n**Hot path (MCP server keeps the engine warm):**\n- Writes: `record_batch` returns in ~2 ms (fire-and-forget; embedding happens in the background).\n- Reads: ~70 ms p50 warm, ~100 ms cold-query (BM25 fallback while the model finishes loading).\n- The agent's own LLM thinking is the dominant latency in any chat turn, by 10-100×.\n\n**Cold path (every short-lived CLI invocation):**\n- `Engine()` construction takes ~14 s the first time per process. The MCP server pays this once at boot, then keeps it. The CLI (`pmb stats`, `pmb recall ...`) pays it every invocation - this is on the v0.2 roadmap to fix.\n\nIf you suspect PMB specifically is slow, open `pmb tui` → tab [3] Stats. It shows the actual per-call timings from the `mcp_calls` table.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Should I enable the cross-encoder reranker?\u003C\u002Fb>\u003C\u002Fsummary>\n\n**Probably not on LoCoMo-like workloads.** Our ablation showed the reranker regresses evidence-recall@10 by 17 points and adds ~840 ms p50 latency - it ranks fluent paraphrases above the source events that carry the dia-id evidence. The flag (`recall.rerank = True`) stays in the code because reranking can help when the candidate set is wide and lexical\u002Fsemantic match alone gives ties; if your workload looks like that, measure first.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>What if I use multiple projects?\u003C\u002Fb>\u003C\u002Fsummary>\n\nPMB defaults to one global workspace (your personal memory follows you across projects). 
If you want isolation per project, drop a `.pmb\u002Fworkspace.yaml` in each project root with a unique `id` - PMB picks it up automatically.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Does it work with [my agent]?\u003C\u002Fb>\u003C\u002Fsummary>\n\nAnything that speaks MCP: Claude Code, Codex CLI, Cursor, and any future tool that adopts the protocol. For custom agents (Ollama wrappers, your own loop) see `docs\u002FSETUP_OLLAMA.md` for the call patterns.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Can I see what was stored?\u003C\u002Fb>\u003C\u002Fsummary>\n\nThree ways: `pmb tui` (Memory tab), `pmb dashboard` (Events), or open `sqlite3 ~\u002F.pmb\u002Fworkspaces\u002F\u003Cid>\u002Fevents.sqlite` and run SQL. The store is plain SQLite - nothing proprietary.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>How do I delete a memory?\u003C\u002Fb>\u003C\u002Fsummary>\n\n`pmb forget \u003Culid>` archives it (reversible). To purge it entirely, open the SQLite file and `DELETE` the row. To restore something you didn't mean to merge, use `pmb dedupe --undo`.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>What if my workspace gets corrupted?\u003C\u002Fb>\u003C\u002Fsummary>\n\nSQLite is robust; the `mcp_calls` and `events` tables are append-mostly. Worst case, copy `~\u002F.pmb\u002Fworkspaces\u002F\u003Cid>\u002F` somewhere safe and start fresh - nothing else depends on this state.\n\nAuto-backup is on the v0.2 roadmap.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Why \"Personal Memory Brain\"?\u003C\u002Fb>\u003C\u002Fsummary>\n\nBecause it's personal (not a team product), it stores memory (not just chat history), and \"brain\" because the architecture is loosely inspired by working memory → episodic → semantic transitions in actual neuroscience. The marketing department was overruled.\n\u003C\u002Fdetails>\n\n---\n\n## 🤝 Contributing\n\nPRs welcome. 
Please read [`CONTRIBUTING.md`](CONTRIBUTING.md) first - it explains where things go, what's in scope, and what's not.\n\nIn short:\n- One concern per PR.\n- New write-path code must stay sub-100 ms on warm cache.\n- If recall accuracy could change, include a LoCoMo number with the PR.\n\n---\n\n## 📄 License\n\n**Apache License 2.0** - see [`LICENSE`](LICENSE) and [`NOTICE`](NOTICE).\n\nSame license as **mem0**, **Letta**, and **Zep** community editions. Apache 2.0 includes an explicit patent grant from every contributor - important for AI\u002FML projects where patent ambiguity can otherwise scare off enterprise users.\n\nIf you use PMB in a paper or product, citation is appreciated but not required - see [`CITATION.cff`](CITATION.cff).\n\n---\n\n\u003Cdiv align=\"center\">\n\n**Built to forget less.**\n\n[⬆ back to top](#-pmb--personal-memory-brain)\n\n\u003C\u002Fdiv>\n","PMB is a local-first persistent memory solution for AI coding agents such as Claude Code, Cursor, and Codex. It is built on the Model Context Protocol (MCP), reports 94.5% LoCoMo recall and 70 ms p50 latency, supports multiple languages, and needs no API keys. The project is written in Python and uses BM25, SQLite, and LanceDB for efficient memory storage and retrieval. PMB suits scenarios that need stronger context understanding and information retrieval for AI agents in a local environment, and it is particularly strong in privacy-sensitive applications.","2026-06-11 04:08:16","CREATED_QUERY"]