[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-83122":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":18,"compositeScore":19,"rankGlobal":9,"rankLanguage":9,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":9,"pushedAt":9,"updatedAt":25,"readmeContent":26,"aiSummary":27,"trendingCount":15,"starSnapshotCount":15,"syncStatus":28,"lastSyncTime":29,"discoverSource":30},83122,"mnemo","zaydmulani09\u002Fmnemo","zaydmulani09","Local-first AI memory layer for any LLM. Persistent knowledge graph, entity extraction, semantic retrieval. Works with Ollama, OpenAI, Anthropic, or any OpenAI-compatible backend.",null,"Rust",209,8,4,1,0,5,65,22,2.86,"MIT License",false,"main",true,[],"2026-06-12 02:04:31","# mnemo\n\n> Local-first AI memory layer for any LLM. Persistent knowledge graph,\n> entity extraction, semantic retrieval — no cloud required.\n\n![Build Status](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Factions\u002Fworkflow\u002Fstatus\u002Fzaydmulani09\u002Fmnemo\u002Fci.yml?branch=main)\n![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-blue)\n![Crates.io](https:\u002F\u002Fimg.shields.io\u002Fcrates\u002Fv\u002Fmnemo-core)\n![PyPI](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fmnemo-sdk)\n![Docker](https:\u002F\u002Fimg.shields.io\u002Fdocker\u002Fpulls\u002Fzaydmulani09\u002Fmnemo)\n\n---\n\n## What is mnemo?\n\nLLM apps built on custom pipelines have no persistent memory between\nsessions. 
mnemo is a local sidecar that extracts entities, builds a\nknowledge graph, and injects scored context back into your prompts —\nno cloud, no Python runtime, no vendor lock-in.\n\nmnemo is a sidecar service that watches every conversation you feed it, extracts named entities and relationships using an LLM, builds a persistent knowledge graph in SQLite, and injects relevant context back into future prompts — automatically, in under 50ms. It works with **Ollama** (fully local, free), OpenAI, Anthropic, or any OpenAI-compatible API. It ships as a single static binary with zero cloud dependency.\n\n---\n\n## How it works\n\n```\n  your app\n     │\n     ▼\n  POST \u002Fingest ──► entity extraction (LLM) ──► knowledge graph (SQLite + petgraph)\n                                                        │\n  POST \u002Fretrieve ◄── scoring + ranking ◄── graph traversal + full-text search\n     │\n     ▼\n  context_prompt  ──► inject into your LLM prompt\n```\n\n1. You POST raw text to `\u002Fingest` (a conversation turn, a document, a note).\n2. mnemo sends it to your configured LLM and extracts entities (people, tools, places, concepts) and the relationships between them.\n3. Entities are deduplicated by name+type, aliases are merged, and everything is written to SQLite. The in-memory petgraph is updated atomically.\n4. On POST `\u002Fretrieve`, mnemo runs a 6-stage pipeline: full-text chunk search → entity name search → graph expansion (BFS over the knowledge graph) → relation filter → score+rank → assemble a `context_prompt` string.\n5. You inject `context_prompt` into your LLM's system prompt. Done.\n\n---\n\n## Why mnemo\n\nThere are a lot of AI memory tools. 
Here's what makes mnemo different:\n\n| | mnemo | Most alternatives |\n|---|---|---|\n| **Runtime** | Single Rust binary | Python daemon |\n| **Storage** | SQLite, survives restarts | In-memory or cloud |\n| **Graph layer** | petgraph, multi-hop traversal | None |\n| **Cloud dependency** | Zero | Required or optional |\n| **LLM backend** | Any OpenAI-compatible | Often locked to one |\n| **Retrieval** | Scored + ranked, graph-expanded | Naive context dump |\n\n**mnemo is not for everyone.** If you're using a managed agent\nharness that handles memory for you, you don't need it. mnemo\nis for developers building custom LLM pipelines who need\npersistent, structured, local memory they fully control.\n\nThe graph layer is the real differentiator — entities are\ndeduplicated across sessions, relationships are weighted and\ntraversed at query time, and graph-expanded results score at\n0.5x so direct matches always rank higher than inferred ones.\n\n---\n\n## Quickstart\n\n### Path A — Docker + Ollama (fully free, recommended)\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fzaydmulani09\u002Fmnemo\ncd mnemo\ndocker compose up -d\n\n# Pull the llama3 model the first time (~4 GB)\ndocker exec mnemo-ollama ollama pull llama3\n\n# Verify everything is healthy\ncurl http:\u002F\u002Flocalhost:8080\u002Fhealth\n```\n\n### Path B — Binary (Ollama or OpenAI running separately)\n\n```bash\ncargo install --path crates\u002Fmnemo-api\n\n# With Ollama\nexport MNEMO_LLM_BASE_URL=http:\u002F\u002Flocalhost:11434\u002Fv1\nmnemo-api\n\n# With OpenAI\nexport MNEMO_LLM_BASE_URL=https:\u002F\u002Fapi.openai.com\u002Fv1\nexport MNEMO_LLM_API_KEY=sk-...\nexport MNEMO_LLM_MODEL=gpt-4o-mini\nexport MNEMO_LLM_PROVIDER=openai\nmnemo-api\n```\n\n### Path C — Python SDK\n\n```bash\npip install mnemo-sdk\n```\n\n```python\nfrom mnemo import MnemoClient\n\nclient = MnemoClient()  # server at http:\u002F\u002Flocalhost:8080\n\n# Store a memory\nclient.ingest(\"I'm building a Rust vector 
database called vecdb\")\n\n# Get context for injection into your next LLM prompt\nprint(client.get_context(\"what am I working on?\"))\n```\n\n---\n\n## API Reference\n\nAll endpoints accept and return `application\u002Fjson`. Base URL: `http:\u002F\u002Flocalhost:8080`.\n\n| Method | Path | Description | Request body | Response |\n|--------|------|-------------|--------------|----------|\n| `GET` | `\u002Fhealth` | Server + DB + LLM status | — | `HealthResponse` |\n| `POST` | `\u002Fingest` | Store text, extract entities | `IngestRequest` | `IngestResponse` |\n| `POST` | `\u002Fretrieve` | Retrieve ranked memory context | `RetrievalQuery` | `RetrievalResult` |\n| `GET` | `\u002Fentities` | List entities (paginated) | `?limit&offset` | `Entity[]` |\n| `GET` | `\u002Fentities\u002F:id` | Get entity by UUID | — | `Entity` |\n| `DELETE` | `\u002Fentities\u002F:id` | Delete entity (cascades) | — | `{\"deleted\":true}` |\n| `GET` | `\u002Fentities\u002F:id\u002Fneighbors` | Knowledge graph neighbors | `?depth` (max 5) | `GraphNode[]` |\n| `GET` | `\u002Fchunks` | List memory chunks (paginated) | `?limit&offset&session_id` | `MemoryChunk[]` |\n| `GET` | `\u002Fchunks\u002F:id` | Get chunk by UUID | — | `MemoryChunk` |\n| `DELETE` | `\u002Fchunks\u002F:id` | Delete chunk | — | `{\"deleted\":true}` |\n| `POST` | `\u002Fsearch` | Full-text search entities + chunks | `{\"query\",\"limit\"}` | `{\"entities\",\"chunks\"}` |\n| `DELETE` | `\u002Fwipe` | Delete all memory (irreversible) | header: `X-Confirm-Wipe: true` | `{\"wiped\":true}` |\n| `GET` | `\u002Fstats` | Entity\u002Fchunk\u002Fgraph counts + uptime | — | `StatsResponse` |\n\n**Key request\u002Fresponse types:**\n\n```jsonc\n\u002F\u002F IngestRequest\n{\n  \"content\": \"string\",         \u002F\u002F required — text to store\n  \"source\":  \"string\",         \u002F\u002F required — e.g. 
\"chat\", \"email\", \"cli\"\n  \"session_id\": \"string|null\", \u002F\u002F optional — group related chunks\n  \"metadata\": {}               \u002F\u002F optional — arbitrary JSON\n}\n\n\u002F\u002F RetrievalQuery\n{\n  \"text\": \"string\",            \u002F\u002F required — query text\n  \"session_id\": \"string|null\", \u002F\u002F optional — filter by session\n  \"max_chunks\": 10,            \u002F\u002F default 10\n  \"max_entities\": 20,          \u002F\u002F default 20\n  \"min_confidence\": 0.5,       \u002F\u002F default 0.5\n  \"include_graph\": true,       \u002F\u002F default true — expand via knowledge graph\n  \"graph_depth\": 2             \u002F\u002F default 2 — BFS depth for graph expansion\n}\n```\n\nFull endpoint documentation with curl examples: [`docs\u002Fapi.md`](docs\u002Fapi.md)\n\n---\n\n## Configuration\n\n### Environment variables\n\n| Variable | Default | Description |\n|----------|---------|-------------|\n| `MNEMO_DB_PATH` | `mnemo.db` | SQLite database file path |\n| `MNEMO_PORT` | `8080` | API server port |\n| `MNEMO_LLM_BASE_URL` | `http:\u002F\u002Flocalhost:11434\u002Fv1` | OpenAI-compatible LLM base URL |\n| `MNEMO_LLM_MODEL` | `llama3` | Model name for entity extraction |\n| `MNEMO_LLM_API_KEY` | `ollama` | API key (any value works for Ollama) |\n| `MNEMO_LLM_PROVIDER` | `ollama` | Provider type: `ollama`, `openai`, `anthropic`, `custom` |\n\n### TOML config file\n\nPass `--config path\u002Fto\u002Fconfig.toml` to `mnemo-api`. See `mnemo.example.toml`:\n\n```toml\ndb_path = \"mnemo.db\"\nport = 8080\n\n[llm]\nprovider = \"ollama\"\nbase_url = \"http:\u002F\u002Flocalhost:11434\u002Fv1\"\nmodel = \"llama3\"\napi_key = \"ollama\"\ntimeout_secs = 30\nmax_retries = 3\nmax_tokens = 2048\ntemperature = 0.1\n```\n\nEnvironment variables take precedence over TOML values. 
The active config source is reported in `GET \u002Fhealth` → `config_source`.\n\n---\n\n## CLI\n\nInstall:\n\n```bash\ncargo install --path crates\u002Fmnemo-cli\n```\n\nUsage:\n\n```bash\n# Store a memory\nmnemo ingest \"I use Neovim and prefer dark mode\"\n\n# Retrieve relevant context\nmnemo search \"what editor do I use?\"\n\n# List all extracted entities\nmnemo entities\n\n# Show entity detail + graph neighbors\nmnemo entity \u003Cuuid> --neighbors\n\n# List memory chunks\nmnemo chunks\n\n# Server health\nmnemo health\n\n# Memory statistics\nmnemo stats\n\n# Delete everything (prompts for confirmation)\nmnemo wipe\n\n# Skip confirmation prompt\nmnemo wipe --yes\n\n# Point at a non-default server\nmnemo --server http:\u002F\u002F192.168.1.10:8080 stats\n```\n\n---\n\n## Python SDK\n\nInstall:\n\n```bash\npip install mnemo-sdk\n```\n\nSee [`sdk\u002Fpython\u002FREADME.md`](sdk\u002Fpython\u002FREADME.md) for the full API reference.\n\n**Async example:**\n\n```python\nimport asyncio\nfrom mnemo import AsyncMnemoClient\n\nasync def main():\n    async with AsyncMnemoClient() as client:\n        await client.ingest(\n            \"Alice is a principal engineer at Stripe working on payment infrastructure.\",\n            session_id=\"session-001\",\n        )\n        context = await client.get_context(\n            \"what does Alice work on?\",\n            session_id=\"session-001\",\n        )\n        print(context)\n\nasyncio.run(main())\n```\n\nA working standalone example: [`examples\u002Fbasic_usage.py`](examples\u002Fbasic_usage.py)\n\n---\n\n## Architecture\n\nFour Rust crates wired together:\n\n| Crate | Type | Role |\n|-------|------|------|\n| `mnemo-core` | lib | Entity extraction, graph ops, retrieval engine, DB layer |\n| `mnemo-api` | bin | Axum REST API — thin handler layer over mnemo-core |\n| `mnemo-cli` | bin | CLI tool using blocking reqwest against the API |\n| `mnemo-bench` | bin | Performance benchmarks (12 suites) |\n\nFull architecture 
documentation: [`docs\u002Farchitecture.md`](docs\u002Farchitecture.md)\n\n---\n\n## Performance\n\nBenchmarked on Apple M2, SQLite WAL mode, in-memory petgraph. Debug build numbers — release build (`--release`) is 3–5× faster.\n\n| Operation | Avg latency | Throughput |\n|-----------|-------------|------------|\n| Entity insert (SQLite) | ~0.12 ms | ~8,300 ops\u002Fs |\n| Entity lookup by ID | ~0.08 ms | ~12,500 ops\u002Fs |\n| Chunk insert | ~0.14 ms | ~7,100 ops\u002Fs |\n| Full-text chunk search | ~0.28 ms | ~3,500 ops\u002Fs |\n| Graph neighbor (depth=1) | ~0.21 ms | ~4,700 ops\u002Fs |\n| Graph neighbor (depth=2) | ~0.89 ms | ~1,100 ops\u002Fs |\n| Full retrieval pipeline | ~4.2 ms | ~238 ops\u002Fs |\n\nRun `cargo run -p mnemo-bench` to benchmark on your hardware.\n\n---\n\n## Testing\n\n### Rust\n```bash\ncargo test --workspace          # run all 122 tests\nmake coverage                  # HTML coverage report (requires cargo-llvm-cov)\nmake coverage-summary          # summary to stdout\n```\n\n### Python SDK\n```bash\ncd sdk\u002Fpython && pytest tests\u002F -v\n```\n\n### Benchmarks\n```bash\ncargo run -p mnemo-bench                    # all 12 benchmarks\ncargo run -p mnemo-bench -- --filter graph  # graph benchmarks only\ncargo run -p mnemo-bench -- --json out.json # save results to JSON\n```\n\nCurrent test counts: **122 Rust tests** · **21 Python tests** · **12 benchmarks**\n\n---\n\n## Contributing\n\nPRs welcome. 
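Please run `make fmt && make lint` before submitting.\nOpen an issue first for large changes.\n\nSee [`CONTRIBUTING.md`](CONTRIBUTING.md) for full setup instructions, code style guide, and how to add a new LLM provider.\n\n---\n\n## License\n\nMIT — see [LICENSE](LICENSE)\n","mnemo is a local-first AI memory layer designed for any large language model (LLM), providing a persistent knowledge graph, entity extraction, and semantic retrieval with no cloud dependency. The project is written in Rust for high performance and low latency, and can automatically complete entity extraction, relationship building, and context injection within 50 ms. By integrating with Ollama, OpenAI, Anthropic, or any OpenAI-compatible backend, mnemo stores data in SQLite and uses petgraph for multi-hop traversal to improve retrieval. It is especially suited to use cases that need full control over local storage and want to avoid vendor lock-in, such as custom LLM pipelines that require persistent, structured memory.",2,"2026-06-11 04:10:12","CREATED_QUERY"]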