[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-81855":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":14,"stars30d":14,"stars90d":15,"forks30d":15,"starsTrendScore":17,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":20,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":27,"readmeContent":28,"aiSummary":29,"trendingCount":15,"starSnapshotCount":15,"syncStatus":30,"lastSyncTime":31,"discoverSource":32},81855,"MikeRust","SemplificaAI\u002FMikeRust","SemplificaAI","From the original MikeOSS, I ported and expanded it as desktop application","https:\u002F\u002Fsemplifica.ai\u002Fmikerust\u002F",null,"Rust",28,8,4,0,3,9,2.86,"GNU Affero General Public License v3.0",false,"main",true,[24,25,26],"genai","insurtech","legal-tech","2026-06-12 02:04:20","\u003Cp align=\"left\">\n  \u003Cimg src=\"src\u002Fassets\u002Fmikerust_logo_3x3.svg\" alt=\"MikeRust logo\" width=\"120\" height=\"120\">\n\u003C\u002Fp>\n\n# MikeRust\n\nSovereign local AI document assistant — Rust+axum backend, SQLite, local filesystem storage, ONNX-based embeddings, Tauri shell, clean-room Svelte 5 frontend (forked from [`willchen96\u002Fmike`][upstream] upstream).\n\nDesigned to run entirely on the user's machine: no cloud database, no external auth provider, no S3 bucket. Optional LLM API keys are stored locally and never leave the box except to call the model provider the user explicitly configured.\n\nMaintained by **Semplifica s.r.l.** — [semplifica.ai](https:\u002F\u002Fsemplifica.ai). 
The code is AGPL-3.0; the Semplifica wordmark and logo are trademarks, see [NOTICE.md](NOTICE.md) for the brand-vs-code separation.\n\n### Built in the open — please contribute\n\nMikeRust is meant to be a **collaborative** project. Fixes, new corpus\nplugins, translations, jurisdiction-specific feedback, design ideas,\nhalf-formed proposals — all welcome, and a small idea filed as an issue\ntends to land faster than a full feature held back until it's \"ready\".\nYou don't have to write the patch yourself: open an\n[issue](https:\u002F\u002Fgithub.com\u002FSemplificaAI\u002FMikeRust\u002Fissues) to discuss a\ndirection, send a pull request when you have something concrete, or\nreach out at [git@semplifica.ai](mailto:git@semplifica.ai)\nif a public thread is the wrong place.\n\n## Lineage\n\nMikeRust derives from the open-source **Mike** project by Will Chen\n([`willchen96\u002Fmike`][upstream]) — an AGPL-3.0 AI legal assistant with\na TypeScript \u002F Express \u002F Supabase \u002F S3 \u002F LibreOffice stack. MikeRust\nkeeps the *product* (chat with citations, document viewer, workflows,\ntabular reviews, corpora) but rebuilds both halves:\n\n  * the **backend** becomes a Rust+axum implementation that uses SQLite\n    (via [`sqlite-vec`](https:\u002F\u002Fgithub.com\u002Fasg017\u002Fsqlite-vec)) instead of\n    Supabase + pgvector, embeds locally with ONNX (INT8-quantized\n    multilingual-e5-base via fastembed + optional DirectML \u002F QNN execution\n    providers), extracts PDF \u002F DOCX \u002F RTF \u002F XLSX in pure Rust — no\n    LibreOffice process spawn — and ships as a Tauri desktop app with no\n    server-side dependency;\n  * the **frontend** is a clean-room rewrite in **Svelte 5 + Vite +\n    Tailwind CSS v4**, replacing the original Next.js \u002F React frontend —\n    see *Frontend* below.\n\nBoth halves have now been rebuilt and the repository contains **no\nsource code from the original Mike project**. 
The Rust backend was\noriginal from the start; the legacy React frontend — the only\nMike-derived code — has been **removed from the repository** and\nreplaced by the clean-room Svelte rewrite in `frontend\u002F`. MikeRust\nremains a fork of an AGPL-3.0 project and ships under AGPL-3.0 (see\n*License*), but no Mike source survives in the tree.\n\nFor the original cloud-native upstream, see\n[github.com\u002Fwillchen96\u002Fmike][upstream]. For a different sister fork\nspecialised on Danish law, see\n[github.com\u002Fmarklok\u002Fdanishmike](https:\u002F\u002Fgithub.com\u002Fmarklok\u002Fdanishmike).\n\n[upstream]: https:\u002F\u002Fgithub.com\u002Fwillchen96\u002Fmike\n\n## Interface\n\nA desktop window (Tauri) wrapping the Svelte frontend; the embedded axum\nbackend runs in the same process. A few views to set expectations\n(screenshots are of the current Svelte UI):\n\n### The workspace\n\n![MikeRust Assistant home — left sidebar with Assistant, Projects, Tabular reviews, Workflows, DOCX templates and a recent-chats list; a 'Hello, Dario' greeting; a composer with attachment buttons, a per-conversation model picker and the AI disclaimer](docs\u002Fimages\u002Fui_main.png)\n\nThe sidebar holds the Assistant, Projects, Tabular reviews, Workflows and\nDOCX templates, plus the recent-chats list and Settings. Light \u002F system \u002F\ndark theme toggle sits in the top bar; the AI-disclaimer is always shown\nunder the composer.\n\n### Chat with citations and inline document viewer\n\n![Chat answer with numbered citation pills next to the document viewer, which is open on the right showing the cited PDF page](docs\u002Fimages\u002Fui_pdf.png)\n\nNumeric citation pills (`[1]`, `[2]`, …) and `[gN]`\u002F`[pN]` KB tags open the\nsource document in a resizable side panel. The viewer renders PDF \u002F DOCX \u002F\nspreadsheets \u002F plain text in-browser; PDF.js text-search highlights the exact\nquote the model cited. 
Re-opening a chat re-renders all pills from persisted\nannotations.\n\n### DOCX generation from templates\n\n![A generated 'Relazione di stima d'azienda' Word document rendered in the document viewer beside the chat](docs\u002Fimages\u002Fui_docx.png)\n\nAttach a DOCX template, ask for the document, and the assistant drafts the\nbody and renders a print-ready `.docx` through the pure-Rust docx engine —\nreturned as a download card and previewable in the viewer.\n\n### DOCX template editor\n\n![The full-page DOCX template editor showing the IMS-procedure template — identity fields, per-locale display names, primary-domain and automation-level selectors, also-applicable-to checkboxes, and the start of the layout\u002Fmargins section](docs\u002Fimages\u002Fui_docx_templates.png)\n\nEvery field of a template is editable on a single page — identity, layout,\ntypography, styles and the authoring contract. System templates open\nread-only and can be **duplicated** into editable user templates, saved as\nJSON under `config\u002Fdocx-templates\u002Fuser\u002F`.\n\n### Model providers\n\n![Settings → LLM models — active-provider toggle across Anthropic, Google, OpenAI, Mistral and Local; per-provider API-key fields; a 'key set' badge on Gemini; Gemini model and serving-region selectors](docs\u002Fimages\u002Fui_models.png)\n\nSettings → LLM models picks the active provider — Anthropic, Google, OpenAI,\nMistral, or a local OpenAI-compatible endpoint — and, for Gemini, the serving\nregion. Only providers with a saved API key are selectable.\n\n### Internationalisation\n\nEvery user-facing string goes through a small runes-based i18n store\n([`frontend\u002Fsrc\u002Flib\u002Fstores\u002Fi18n.svelte.ts`](frontend\u002Fsrc\u002Flib\u002Fstores\u002Fi18n.svelte.ts));\nthe catalogues live as plain JSON in\n[`frontend\u002Flocales\u002F`](frontend\u002Flocales\u002F). 
Currently shipped: **Italian\n(`it.json`), English (`en.json`), French (`fr.json`), German (`de.json`),\nSpanish (`es.json`), Portuguese (`pt.json`)** — six locales, identical\nkey tree on each (no missing strings). All six dictionaries are bundled\nstatically; English is the canonical locale and the fallback for any key\nabsent from another catalogue, so a new key is safe to ship before its\ntranslations land.\n\n> **Key parity tool.** [`frontend\u002Fscripts\u002Ffill-i18n.mjs`](frontend\u002Fscripts\u002Ffill-i18n.mjs)\n> carries a `T` table of translations and *adds* any key a locale is\n> missing, then asserts all six catalogues hold the identical key set.\n> Adding UI strings: put the new keys in the `T` table with all six\n> languages and run `node scripts\u002Ffill-i18n.mjs`.\n\n> ⚠️ Development and screenshots use the UI in **Italian** — that's the\n> source-of-truth surface for visual review and copy iteration. **Never\n> hardcode user-facing strings**; always resolve them through an\n> `i18n.t('Namespace.key')` call.\n\n## Frontend\n\nThe frontend was rewritten from scratch in **Svelte 5 + Vite + Tailwind\nCSS v4**, replacing the original Next.js \u002F React frontend.\n\nThe rewrite was done **blind**: from a written specification of UI\n*behaviour* — what each screen does, which backend endpoints it calls,\nwhat the user sees and on which interaction — never by porting,\ntranslating or reading the React source line by line. The point\nwas to carry **no code dependency** from the original Mike project into\nthis frontend. `frontend\u002F` is therefore original, independent work, not\na derivative of the upstream React code.\n\nSvelte was chosen because it is a **compiler, not a runtime framework**:\nit emits markedly more **compact** code and the running UI is **leaner\nand more reactive**. 
MikeRust's interface is form- and panel-heavy —\neditors, modals, side panels, settings, tables — the kind of UI that\ngains little from React's virtual-DOM diffing; Svelte's compiled,\nfine-grained reactivity updates exactly the nodes that changed, with no\nvDOM layer and far less shipped JavaScript.\n\n### Code independence\n\nThe Rust backend was original from the start — Mike's backend is an\nExpress \u002F TypeScript stack, so nothing was carried over. The React\nfrontend was the only Mike-derived code in the project, and it has now\nbeen **removed from the repository**. The Svelte `frontend\u002F` that\nreplaced it was verified independent before that removal:\n\n- it shared **no byte-identical file** with the old React tree — a\n  SHA-256 comparison of every source file in both trees found zero\n  matches;\n- the formats were disjoint — 106 `.tsx` React components versus 68\n  `.svelte` components — so nothing could have been copied across;\n- every commit that touches `frontend\u002F` is authored by **Dario Finardi**.\n\nThe repository therefore now holds **no source code from the original\nMike project** — only original Rust (`src\u002F`, `src-tauri\u002F`) and the\noriginal, blind-rewritten Svelte frontend (`frontend\u002F`).\n\n## Quick start\n\n### Supported platforms\n\nMikeRust currently ships **Windows-only**: x86_64 + ARM64 MSI\ninstallers (the latter native on Snapdragon X Elite). **macOS** is on\nthe roadmap — work hasn't started yet, but the codebase already\ncompiles to `aarch64-apple-darwin` and the Tauri \u002F Webview backends\nare macOS-ready, so the gating items are signing \u002F notarisation and\nthe equivalent of `Windows Hello` via Touch ID. 
**Linux is not\nsupported and there are no plans to add it.** Any Linux-specific\nadvisories that show up in the dependency graph through Tauri's GTK\nchain (`gtk`, `glib`, `atk`, `webkit2gtk`, …) are therefore not\nreachable in any shipped artefact — they compile on Linux but are\ninert on Windows and macOS, which use `webview2-com` and `WKWebView`\nrespectively.\n\n### Install a pre-built Windows release\n\nEach tagged release ships pre-built MSI installers for Windows x86_64\nand ARM64. They bundle the binary plus the matching `onnxruntime.dll`\n(1.20.0) and `pdfium.dll` under `\u003Cinstall>\u002Flibs\u002F\u003Clib>\u002Fwin-\u003Carch>\u002F`, so\nthe only post-install requirement is double-clicking the installer.\n\n```\ndist\u002FMikeRust_\u003Cversion>_x64.msi    # Windows x86_64\ndist\u002FMikeRust_\u003Cversion>_arm64.msi  # Windows ARM64 (Snapdragon X Elite)\n```\n\nBuilds are produced by `scripts\u002Fbuild-release.ps1` and attached to\nthe matching tag on\n[GitHub Releases](https:\u002F\u002Fgithub.com\u002FSemplificaAI\u002FMikeRust\u002Freleases).\nRuntime logs land in `\u003Chome>\u002Fmikerust-data\u002Fmike-tauri.log` (see\nv0.2.2 entry in [HISTORY.md](HISTORY.md) for the rationale).\n\n### Build from source\n\n```bash\n# 1. pdfium (PDF extraction)\n# Download from https:\u002F\u002Fgithub.com\u002Fbblanchon\u002Fpdfium-binaries\u002Freleases\n# Place pdfium.dll \u002F libpdfium.so \u002F libpdfium.dylib in libs\u002Fpdfium\u002F\n\n# 2. onnxruntime (embeddings via the RAG feature)\n# Download the variant matching your hardware (CPU \u002F DirectML \u002F CUDA \u002F CoreML \u002F …)\n# from https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Fonnxruntime\u002Freleases and place the\n# onnxruntime.dll \u002F libonnxruntime.so \u002F libonnxruntime.dylib under the\n# matching libs\u002Fonnxruntime\u002F\u003Cplatform>\u002F subfolder. The full recipe per\n# variant is in libs\u002Fonnxruntime\u002FREADME.md. 
ort is built with\n# `load-dynamic` — no statically-linked runtime, no system DLL fallback.\n\n# 3. Backend env\ncp .env.example .env\n# Edit .env: set JWT_SECRET. STORAGE_PATH, DATABASE_URL etc. have sensible defaults.\n\n# 4. Install frontend deps + Tauri CLI (one-shot)\n# The Svelte frontend uses pnpm.\ncd frontend && pnpm install && cd ..\n\n# 5. Run dev (Tauri shell + axum backend + Svelte\u002FVite frontend)\n# Preferred on Windows: use the repo-local Tauri CLI binary installed\n# by pnpm under frontend\u002Fnode_modules\u002F (version pinned by package.json,\n# no global cargo-tauri required).\n.\\frontend\\node_modules\\.bin\\tauri.cmd dev --config src-tauri\u002Ftauri.svelte.conf.json\n\n# Optional alternative (only if cargo-tauri is installed globally):\n# cargo tauri dev --config src-tauri\u002Ftauri.svelte.conf.json\n\n# Or backend only (axum on 127.0.0.1:$PORT, no Tauri shell):\ncargo run --features rag\n```\n\nThe first run will:\n- create `data\u002Fdb\u002Fmike.db` (SQLite) and apply all migrations\n- create `data\u002Fstorage\u002F` (uploads, chat cache)\n- download `multilingual-e5-base` ONNX weights (~280 MB) into `%USERPROFILE%\u002Fmikerust-data\u002Ffastembed\u002F` on first scan \u002F first chat with attachments\n\n## Architecture\n\n```\nBrowser \u002F Tauri webview (Svelte + Vite :5173)\n       │  HTTP + SSE\n       ▼\naxum backend (127.0.0.1:\u003Crandom>)   ← OS-assigned high port; the Tauri\n                                      shell publishes it to the frontend\n                                      via the `api_base_url` invoke\n                                      command at boot.\n                                      Override with PORT=3001 for the\n                                      standalone-backend dev story.\n   ├── SQLite          mike.db          (schema, vector store, settings)\n   ├── sqlite-vec      doc_chunks       (768-dim embeddings, partition-keyed)\n   ├── fastembed\u002Fort   multilingual-e5-base ONNX  (CPU 
\u002F DirectML \u002F QNN)\n   ├── pdfium-render   PDF text extraction + page rendering\n   ├── quick-xml+zip   DOCX extraction (incl. redline detection — see below)\n   ├── rtf-parser      RTF text extraction\n   ├── calamine        XLSX\u002FXLS\u002FXLSB\u002FODS extraction\n   ├── Local storage   .\u002Fdata\u002Fstorage\u002F{documents,cache}\n   ├── LLM             Anthropic \u002F Gemini \u002F OpenAI \u002F vLLM \u002F Ollama\n   └── MCP             any HTTP\u002FSSE MCP server, including localhost\n```\n\n## Key features\n\n### RAG: local folder sync\nConfigure folders under **Impostazioni → Documenti locali**. The scanner walks the tree (honouring `.gitignore`-style patterns), extracts text per format, chunks at ~800 tokens with 200-token overlap, and embeds with INT8-quantized `multilingual-e5-base` (Xenova mirror, 768 dims, ~265 MB on disk). Embeddings live in `sqlite-vec` virtual tables in the same `mike.db`. Search queries use cosine over the partitioned vector index; partitions are keyed by `(user_id, project_id_or_global)` so cross-tenant retrieval is impossible.\n\nSupported formats:\n- **PDF** — pdfium-render native text. Per-page extraction; pages stamped with `[Page N]` markers so chunks can carry locality metadata. Scanned PDFs (no embedded text) are skipped unless a vision LLM is configured.\n- **DOCX** — pure Rust ZIP+XML. Detects tracked deletions (`\u003Cw:del>`) and strike-through formatting (`\u003Cw:strike\u002F>`\u002F`\u003Cw:dstrike\u002F>`); both are surfaced inline as `[removed by author: …]` markers. 
See [docs\u002FDOCX.md](docs\u002FDOCX.md).\n- **RTF** — `rtf-parser` body text (control words, font tables, pictures, fields stripped).\n- **XLSX \u002F XLS \u002F XLSB \u002F ODS** — `calamine` per-sheet flattening.\n- **TXT \u002F MD \u002F CSV** — UTF-8 lossy decode.\n- **Images \u002F scanned PDFs** — surfaced via the chat composer's vision path when the selected model is vision-capable; not indexed for RAG.\n\n### RAG: hardware acceleration\nONNX Runtime execution providers compiled in via opt-in features:\n\n```bash\ncargo build --features rag-directml   # Windows GPU (DX12 device, no extra SDK)\ncargo build --features rag-qnn        # Qualcomm Snapdragon NPU (X Elite \u002F 8 Gen 3)\n```\n\nThe service tries the configured EP → CPU, silently skipping providers whose DLLs aren't loadable.\n\n**Critical fix — ort\u002Fonnxruntime ABI version match (Windows 11 ARM64 \u002F Qualcomm Snapdragon X Elite).**\nThe vendored `onnxruntime.dll` in `libs\u002Fonnxruntime\u002F\u003Cplatform>\u002F` **must match the exact version** the current `ort` crate was compiled against. ABI drift even across a single minor — e.g. 1.20.0 ↔ 1.24.x — silently deadlocks `TextEmbedding::try_new_from_user_defined` because the function-pointer table ort builds at startup references symbols (IO bindings, plugin-EP APIs) that don't exist on the other side. No error, no log, the spawn_blocking just never returns. The current pin is `ort = \"=2.0.0-rc.9\"` \u002F `fastembed = \"=4.9.1\"`, which targets **onnxruntime 1.20.0** (the rc.12 \u002F 1.24.2 line was reverted as a preparatory step for the upcoming data-security \u002F privacy feature — see [`HISTORY.md`](HISTORY.md) entry for 2026-05-20).\n\nWe chased this bug across three failed attempts before isolating it to ABI mismatch. The fix:\n\n1. **Vendor onnxruntime 1.20.0** (the version `ort 2.0.0-rc.9` actually links against — verifiable with `Select-String -Path target\\debug\\mike-tauri.exe -Pattern 'branch=rel-\\d+\\.\\d+\\.\\d+'`). 
Drop the matching `onnxruntime.dll` for each platform under `libs\u002Fonnxruntime\u002Fwin-{arm64,x64}\u002F`; see [`libs\u002Fonnxruntime\u002FREADME.md`](libs\u002Fonnxruntime\u002FREADME.md) for the fetch recipe. With the right DLL, `try_new_from_user_defined` returns in ≈2.3 s on ARM64 native and ≈3.2 s on x64 under Windows' Prism emulation — equivalent to the static-link timing.\n2. **`ort` runs in `load-dynamic` mode** (cf. `Cargo.toml`: `ort\u002Fload-dynamic` + `fastembed\u002Fort-load-dynamic`), so the runtime DLL is a distributable artifact users can swap independently of the Rust toolchain. `ensure_onnxruntime_dylib_path()` resolves the path at startup and exports `ORT_DYLIB_PATH` before any embedding code touches the runtime.\n3. **Default model: `Xenova\u002Fmultilingual-e5-base` INT8-dynamic** (~265 MB) instead of intfloat FP32 (~1.1 GB). Cross-model FP32-vs-INT8 cosine similarity stays ≥ 0.97 on a curated Italian legal\u002Finsurance corpus and top-1 retrieval is preserved (see `tests\u002Fembedding_perf.rs::quality_fp32_vs_int8`). The INT8 path is also ~1.8× faster on batch indexing and ~3.7× lighter in RAM, so it is now the default on every platform.\n\n**When upgrading `ort`:** rebuild, run the `Select-String` probe above on the new `mike-tauri.exe`, then bump every vendored `onnxruntime.dll` to match the new version. 
Cross-minor drift is a silent deadlock, not a build error or a runtime panic — there is no fail-fast.\n\n### Chat with attachments — hash-keyed cache\nDocuments attached via the chat composer (the **+** button) land in `data\u002Fstorage\u002Fcache\u002F` keyed by SHA-256 of the binary, and are pre-extracted to plain text at upload time:\n\n```\ndata\u002Fstorage\u002Fcache\u002F\u003Chash>.\u003Cext>     # original file\ndata\u002Fstorage\u002Fcache\u002F\u003Chash>.txt       # extracted plain text\n```\n\nEffects:\n- **Dedup across chats** — the same file uploaded in different chats reuses the same on-disk pair (multiple `documents` rows reference one hash).\n- **No filename collisions** — two different chats can both have a file named `contratto.pdf`; the hash determines the path, not the user-facing filename.\n- **Auto re-extract on edit** — modifying a docx changes its hash, so the next upload generates a fresh `.txt` instead of stale text.\n- **Chat-delete cleanup** — when a chat is deleted, the backend ref-counts each `content_hash` of its linked docs and removes the on-disk binary + text only when no other doc still references that hash.\n\nSee [docs\u002FCACHE.md](docs\u002FCACHE.md) for the storage contract and migration history.\n\n### Citations\nAssistant responses inline numeric markers (`[1]`, `[2]`) plus a trailing `\u003CCITATIONS>` JSON block. The frontend parses both:\n- **Per-marker pills** — `renderMessageHtml` in [`frontend\u002Fsrc\u002Flib\u002Futils\u002Fcitations.ts`](frontend\u002Fsrc\u002Flib\u002Futils\u002Fcitations.ts) maps each `[n]` to the matching annotation by `ref` (numeric) or `doc_id` (alphanumeric `[g1]`\u002F`[p1]` from KB hits).\n- **DocPanel jump** — clicking a pill opens the cited doc in the in-app viewer, scrolling to the cited page (PDFs) or section (DOCX).\n\nAnnotations are persisted on `messages.annotations` (migration 0012) so re-opening a chat re-renders all pills correctly. 
Page-marker contamination (`[Page N]` leaking into citation quotes) is sanitised on both write and read paths so PDF.js text-layer highlights work.\n\nWhen the model skips the `\u003CCITATIONS>` block (some providers do), `synthesise_kb_citations_from_markers` rebuilds citations from RAG hits keyed by `(g|p)\u003Cn>` tags in the response.\n\n### Local-folder sync\n**Impostazioni → Documenti locali** (formerly \"Sincronizzazione\"). Add a folder, optionally scope to a project, hit *Scansiona ora*. Scanner emits per-file progress over the `\u002Fsync\u002Ffolders\u002F:id\u002Fstatus` endpoint with coarse pipeline stages (`extracting`, `embedding`) so the user sees motion during the slow first PDF.\n\nThe embedding model state — including the one-shot ~280 MB download — is reported via `\u002Fsync\u002Fmodel-status`; the UI renders an amber progress bar above the folder list while in `downloading` or `loading` state.\n\n### Authoritative legal corpora — plugin system\n\n**The intent.** The medium-term goal for this project is a **plugin\nsystem for downloading legal documents locally**: a contributor (or the\nuser) describes a new public source — Légifrance, BOE, Bundesgesetzblatt,\na regional bulletin — in a small JSON manifest, and MikeRust handles the\nrest (sidebar entry, importer, bulk-snapshot ingestion, search, fetch,\nembed). No Rust patch, no rebuild, no per-corpus bespoke UI. The user\nkeeps a fully offline mirror of the parts of public law they care about,\nunder their own AGPL-licensed copy of MikeRust.\n\nThe first implementation lives in [`config\u002Fcorpora-plugins\u002F`](config\u002Fcorpora-plugins\u002F)\nand is documented in [docs\u002FCORPUS_PLUGINS.md](docs\u002FCORPUS_PLUGINS.md).\nToday three strategies are supported:\n\n- `builtin` — corpus served by a hand-written Rust adapter (EUR-Lex,\n  Italian Legal). 
The manifest only carries metadata (display name,\n  language list, license attribution).\n- `dila-bulk-xml` — fully declarative: point at a DILA OPENDATA archive\n  index URL, pick a *fonds* (CNIL \u002F LEGI \u002F JORF \u002F CASS \u002F KALI), and the\n  generic importer takes care of download → tar walk → XML parse →\n  `corpus_documents` insert → FTS5 index. **Proof of concept: CNIL**,\n  ~26 000 délibérations indexed locally from an ~18 MB tar.gz, Etalab 2.0\n  license, zero anti-bot exposure.\n- `http-fetch-per-id` — fully declarative keyword search + single-\n  document fetch driven by URL templates and CSS\u002FJSONPath extraction.\n  Carries per-source discovery metadata (jurisdiction, doc type, auth\n  mode, search mode, fetch format) and an optional year-filtered search\n  endpoint. This is the live engine behind the international connectors\n  (e-Gov, eCFR, CourtListener, OpenLegalData, BOE, …). Sources behind a\n  JS anti-bot challenge (Cloudflare\u002FWAF) are detected and fail loudly —\n  for those the connector must target the source's JSON API, never its\n  website.\n\n**Alternative under evaluation — MCP-driven backend.** A second design\nis on the table: instead of (or alongside) the JSON-plugin path, expose\nlegal-source ingestion as an **MCP backend**. Each public source becomes\nan MCP server with tools like `search`, `fetch`, `bulk_import`,\n`list_indexed`; MikeRust calls those tools the same way it already calls\nany other MCP server. 
The trade-off is roughly:\n\n| | JSON plugins (current) | MCP backend (evaluating) |\n|---|---|---|\n| Author-time cost | one JSON file | one MCP server (Rust\u002FPython\u002FTS) |\n| Run-time cost | in-process, zero extra deps | extra long-lived process |\n| Reach | bounded by the manifest schema | unbounded — arbitrary code |\n| Sovereignty | data stays in `data\u002Fdb\u002Fmike.db` | depends on the server's policy |\n| Reuse outside MikeRust | none | usable from Claude \u002F any MCP host |\n\nNeither path forecloses the other. The JSON plugin is shipping now\nbecause it's the smallest possible footprint; the MCP backend is being\nweighed for the connectors that can't be expressed declaratively\n(arbitrary auth flows, multi-step session protocols, jurisdiction-\nspecific quirks like Légifrance's PISTE OAuth or the Bundesanzeiger's\nTOC-then-ZIP pattern). Feedback welcome — pick your favourite source and\ntell us which path would be less painful for it.\n\n| Corpus | Strategy | Languages | Status |\n|---|---|---|---|\n| **EUR-Lex** (EU) | builtin (REST\u002FSOAP + SPARQL + Cellar) | 24 EU languages | ✅ V1 — CELEX fetch via public HTML, EN fallback |\n| **Italia legale** (HF dataset) | builtin (HF datasets-server + Parquet bulk) | Italian | ✅ V1 — Normattiva (~69K) + Corte Costituzionale (~22K) |\n| **CNIL** (France, via DILA OPENDATA) | `dila-bulk-xml` plugin | French | ✅ V1 — délibérations + recommandations + avis (~26K docs, Etalab 2.0) |\n| Italian: OpenGA (TAR + Consiglio di Stato) | Same HF dataset, source filter | Italian | 🔲 in dataset, opt-in missing |\n| Italian: Cassazione (civile\u002Fpenale\u002Fsez. 
unite) | to be identified | Italian | 🔲 V2 — source outside the HF dataset |\n| Italian: Normattiva post-snapshot live | URN single-fetch | Italian | 🔲 V2 — acts after 2026-03-01 |\n| Italian: regional laws (20 BUR) | per region | Italian | 🔲 V3 |\n| Italian: Gazzetta Ufficiale (daily summary) | XML feed | Italian | 🔲 V3 |\n| Italian: ministerial decrees \u002F circulars | per ministry | Italian | 🔲 V3 (import from URL) |\n| French: LEGI \u002F JORF \u002F CASS \u002F KALI (DILA bulk) | `dila-bulk-xml` plugin | French | 🔲 — same strategy as CNIL; one manifest each |\n| **Légifrance** (France, via PISTE) | candidate for MCP backend | French | 🔲 OAuth2 REST — JSON plugin or MCP TBD |\n| **Retsinformation** (Denmark) | JSON `\u002Fapi\u002Fdocument\u002F{eli}` + `\u002Fapi\u002Fsearch` | Danish | planned |\n| **BOE** (Spain) | Open Data API + daily XML sumarios | Spanish | planned |\n| **Gesetze im Internet** (Germany) | TOC XML → per-law ZIP | German | scraping-only |\n| **Normattiva** (Italy, direct) | none — HTML \u002F Akoma Ntoso URN deep links | Italian | superseded by the HF connector; still useful as a V2 live fetch |\n\n### Sovereign data\nEverything that contains user data lives under the workspace:\n- `data\u002Fdb\u002Fmike.db` — schema, embeddings, settings, all chats, all document metadata\n- `data\u002Fstorage\u002F` — uploads (`documents\u002F`), chat cache (`cache\u002F`)\n- `%USERPROFILE%\u002Fmikerust-data\u002Ffastembed\u002F` — ONNX weights (out-of-tree to avoid the Tauri watcher)\n- `data\u002F.mikeprj` envelopes — AES-256-GCM-encrypted project bundles, key derived via Argon2id from a recipient email\n\nNo telemetry, no remote logging, no anonymous metrics. 
Outbound traffic occurs only when the user explicitly invokes a remote LLM (Anthropic \u002F Gemini \u002F OpenAI) or a remote MCP server they configured themselves.\n\n## Environment variables\n\nSee `.env.example` for the full reference.\n\n| Variable | Required | Default |\n|---|---|---|\n| `JWT_SECRET` | **yes** | — |\n| `DATABASE_URL` | no | `sqlite:\u003CUSERPROFILE>\u002Fmikerust-data\u002Fmike.db` |\n| `STORAGE_PATH` | no | `%USERPROFILE%\u002Fmikerust-data\u002Fstorage` |\n| `FASTEMBED_CACHE_DIR` | no | `%USERPROFILE%\u002Fmikerust-data\u002Ffastembed` |\n| `PDFIUM_DYNAMIC_LIB_PATH` | no | walks ancestors of cwd \u002F exe for `libs\u002Fpdfium\u002F` |\n| `ORT_DYLIB_PATH` | no | walks ancestors for `libs\u002Fonnxruntime\u002F\u003Cplatform>\u002F` (see [`libs\u002Fonnxruntime\u002FREADME.md`](libs\u002Fonnxruntime\u002FREADME.md)) |\n| `PORT` | no | `0` (OS picks a free high port — see Architecture) |\n| `VLLM_BASE_URL` | for local LLM | — |\n| `VLLM_API_KEY` | no | `local` |\n| `ANTHROPIC_API_KEY` | for Claude | — |\n| `GEMINI_API_KEY` | for Gemini | — |\n| `MCP_SERVERS` | no | `[]` |\n\n## Implementation status\n\n| Area | Status |\n|---|---|\n| Auth (PIN\u002FArgon2id + Windows Hello biometric + opaque sessions) | ✅ |\n| SQLite + migrations (0001 → 0030) | ✅ |\n| Local storage (filesystem) | ✅ — the historical S3\u002FR2 fallback (`s3-storage` feature) was removed in v0.5.2. The AWS SDK chain it pulled in pinned a vulnerable `rustls 0.21.12` \u002F `rustls-webpki 0.101.7`, and the feature was never wired into `make_storage` to begin with. 
|\n| PDF extraction (pdfium) + scanned-PDF detection | ✅ |\n| DOCX extraction with redline detection | ✅ |\n| RTF \u002F XLSX \u002F TXT \u002F MD \u002F CSV extraction | ✅ |\n| RAG: scanner, chunker, sqlite-vec, fastembed CPU | ✅ |\n| RAG: DirectML \u002F QNN execution providers | ✅ opt-in |\n| LLM: Anthropic \u002F Gemini \u002F OpenAI \u002F vLLM \u002F Ollama | ✅ |\n| MCP client (HTTP\u002FSSE) — synchronous tools | ✅ |\n| MCP client — multi-step async flows (request → poll → fetch) | ⚠️ partial — see note below |\n| Routes: auth, user, chat, documents, projects, workflows, docx-templates, sync, tabular-review, corpora | ✅ |\n| Project: documents (PDF\u002FDOCX\u002FRTF\u002FXLSX) + folders + versions + rename | ✅ |\n| Project: chats list, tabular-reviews list, owner\u002Fshared visibility | ✅ |\n| Project: URL `?tab=` deep-linking (documents \u002F assistant \u002F reviews) | ✅ |\n| Project: `.mikeprj` AES-256-GCM export + import | ✅ |\n| Chat citations with persistence | ✅ |\n| Chat-attachment hash cache + ref-counted cleanup | ✅ |\n| **Accept \u002F Reject decision on generated docx** (migration 0029) — per-chat decision (`accepted` \u002F `rejected`) on every docx the model emits; rejection requires a user motive and triggers a one-shot LLM summary; subsequent chat turns inject the reason + summary in place of the rejected body so the model can correct itself without re-seeing the vetoed bytes. A read-only **\"Vedi riassunto\"** modal re-opens the archived reason + summary after the reject modal closes; flipping back to Accept restores the original document while keeping the audit trail. | ✅ |\n| **Chat-files popover** in the composer footer — surfaces all five categories the chat ever touched: uploaded attachments, tool-generated docs, rejected docs (strikethrough + red `Rifiutato` badge), project-inherited docs (`chats.project_id` → `documents.project_id`) and KB \u002F corpora docs cited via `messages.annotations`. 
Per-format icon colours (Excel green \u002F Word blue \u002F PDF red \u002F PowerPoint orange \u002F Markdown text-primary); origin tag chips colour-coded by category. Reads exclusively from `GET \u002Fchat\u002F:id\u002Fdocuments`. | ✅ |\n| **App version badge** next to \"MikeRust\" in the sidebar + **License panel** in Settings → Licenza (SPDX `AGPL-3.0-only`, plain-language summary, full bundled LICENSE text) | ✅ |\n| **HyDE — Hypothetical Document Embeddings** (migration 0030, v0.5.0) — opt-in toggle in Settings → Recupero documenti. When ON, `retrieve_kb_chunks` drafts a domain-aware pseudo-answer (anchored on the legal\u002Fmedical\u002Ffinance\u002F… prologue), embeds it, runs a second KNN, and merges the two rankings via Reciprocal Rank Fusion (k=60) before the usual top-K + 0.75 distance threshold + PII filter. Default OFF (adds one LLM call per turn). | ✅ |\n| **DirectML execution provider** compiled into the MSI (Windows DX12 GPU). `ort` tries DirectML at runtime; falls back to CPU automatically on machines without a DX12 adapter. No knob — transparent acceleration. | ✅ |\n| **Citation pipeline normalisers (v0.5.1)** — model-independent post-processors that survive the variability of mid-tier LLMs: hybrid bracket splitter (`[c1, c2, FILE.pdf, p.4, doc-7]` → clean per-citation pills, stops at `\u003CCITATIONS>`), cross-message `[cN]` lookup (re-resolves a marker against earlier assistant turns when the current turn forgot to re-emit it), plus five new explicit CITATION QUALITY RULES in `MRUST_SYSTEM_PROMPT` (omit empty quotes, ranges only with `[[PAGE_BREAK]]`, prefer per-passage + attached-doc citations, re-emit cross-turn `[cN]`). | ✅ |\n| **Cross-provider determinism (v0.5.1)** — `temperature = 0.5` on Anthropic \u002F OpenAI \u002F local-OpenAI-compatible; `max_tokens` 4096 → 8192 on Claude + local for trailing `\u003CCITATIONS>` JSON headroom. 
**Gemini sampler rolled back to its API default in v0.5.1b** for versatile heterogeneous-document analysis — tightening `gemini-2.5-flash` to 0.5 triggered a long-context white-out (model loops on whitespace until the stream limit). | ✅ |\n| **Orphan KB cleanup (v0.5.1)** — `retrieve_kb_chunks` probes each chunk's `source_path` and drops missing files at chat-time; new `POST \u002Fsync\u002Fcleanup-orphans` cascades through `documents` + `doc_chunks` + `synced_files`; viewer surfaces a \"Pulisci sorgenti rimosse\" modal when a citation source 404s. | ✅ |\n| **DocxView A4 fit + reflow toggle (v0.5.1)** — preserves A4 page geometry (width + height + margins, `breakPages: true`) and auto-scales via CSS `zoom = containerWidth \u002F pageWidth` (clamped `[0.4, 1.5]`) through a `ResizeObserver` when the user drags the side-panel divider; top-right toggle flips to a reflow mode for narrow side-panel reading. | ✅ |\n| Authoritative-corpus framework (`LegalCorpusAdapter` trait) | ✅ |\n| EUR-Lex V1 (CELEX-based fetch + 24-language picker + EN fallback) | ✅ |\n| EUR-Lex V2 (full-text search via SOAP CWS) | 🔲 [registration required](docs\u002FEURLEX_REGISTRATION.md) |\n| Italia legale V1 (Normattiva + Corte Cost via HF dataset) | ✅ |\n| Italia legale V2 (OpenGA opt-in, Cassazione, live Normattiva) | 🔲 see [CORPORA.md](docs\u002FCORPORA.md) |\n| Italia legale V3 (regional laws, GU, ministerial decrees) | 🔲 |\n| **JSON-manifest plugin system** (`config\u002Fcorpora-plugins\u002F*.json`) | ✅ schema + loader + adapter registry + generic \u002Fcorpora routes + discovery metadata, per-corpus enable\u002Fdisable, unified year-filter search, dev hot-reload |\n| **`dila-bulk-xml` strategy** (download tar.gz → walk XML → FTS5) | ✅ end-to-end test + live import |\n| CNIL via DILA OPENDATA (declarative plugin) | ✅ — first proof-of-concept consumer of the plugin system |\n| Other DILA fondi (LEGI, JORF, CASS, KALI) as plugins | 🔲 — same strategy, one manifest each |\n| MCP-backend 
alternative for ingestion (Légifrance \u002F Bundesanzeiger \u002F …) | 🔲 design phase — see \"Authoritative legal corpora\" |\n| Other corpus ingestors (Retsinformation, BOE, …) | 🔲 planned |\n| **Professional-domain column** across workflows \u002F tabular_reviews \u002F projects \u002F documents (migration 0018) | ✅ 11 canonical domains (`legal`, `medical`, `finance`, `real_estate`, `hr`, `insurance`, `ip`, `compliance`, `gdpr`, `pa`, `others`), validated at API boundary, filter chips in list views |\n| **Per-user `default_domain`** preference (migration 0019, Account → Generali UI) | ✅ pre-selects in every create \u002F picker modal |\n| **Per-user `enabled_domains`** toggle (migration 0027, Impostazioni → Domini) | ✅ persists subset of visible verticals server-side; NULL = all enabled; downstream filtering of pickers is a follow-up |\n| **Per-file PII redaction** — GLiNER2 zero-shot multilingual NER behind the `ner-pii` feature; per-file checkbox in the chat composer (`( PII [☐] filename ✕ )`) + per-chat disclaimer modal (Enter \u002F Esc keybinds, link to Omissis via system browser); 2000-char chunked extraction with `[LABEL]` redaction over the *extracted text* before it reaches the cloud LLM; manual HF bootstrap with byte-level download progress in the same banner as the embedding model. The whole `stream_chat` pipeline runs inside a single `tokio::spawn` so `Sse::new` returns immediately — `doc_extract_*` and `pii_redact_*` SSE events render a \"Estrazione testo → Anonimizzazione PII `n \u002F N`\" progress strip inside the assistant turn the moment each stage fires (no more silent wait on long PDFs) | ⚠️ **EXPERIMENTAL — recall depends on the zero-shot model**; with the multilingual PII variant and threshold 0.2 the redactor reliably catches dates \u002F addresses \u002F contacts \u002F IBANs but may miss unusual first names (zero-shot training gaps). 
Pure-GLiNER pipeline by design (no regex fallback); pair with [Omissis](https:\u002F\u002Fedito-pdf.com) for production-grade audited redaction. Performance on Snapdragon X Elite CPU: 22-35 s per 2000-char chunk |\n| **JSON-driven workflow & column-preset registries** (`config\u002Fworkflow-presets\u002F` + `config\u002Fcolumn-presets\u002F`) | ✅ in-memory loaders, no DB seed, merged into `\u002Fworkflow` list; replaces the legacy TS constants |\n| Built-in workflows shipped (legal vertical) | ✅ 14 — Generate CP Checklist, NDA Review, SPA Review, Credit Agreement Review\u002FSummary, Commercial Agreement\u002FLease\u002FSupply Review, LPA, Shareholder Agreement Review\u002FSummary, Change of Control Review, E-Discovery Review, Employment Agreement Review |\n| Built-in workflows shipped (insurance vertical) | ✅ 6 — 3 tabular comparison (RC Professionale, RC Prodotti, D&O, 24 cols each) + 3 assistant (Riassunto copertura, Due Diligence assicurativa, Inventario beni assicurati) |\n| Insurance phase-2 workflows (Cyber \u002F RC Generale \u002F Property review \u002F RC Medica \u002F Key Man) | 🔲 see `docs\u002Finsurance-workflows-plan.md` |\n| **Built-in workflows shipped (medical-legal vertical)** | ✅ 11 — 7 tabular (inventario documenti, timeline cronologica, diagnosi strumentali con flag DIRETTA\u002FINDIRETTA\u002FESCLUSIVA, ITT, IP-RC SIMLA, MIP INAIL, invalidità civile DM 1992) + 4 assistant (diagnosi ingresso, diagnosi dimissione, nesso causale 6.1→6.6, quality check 10 punti). Maps the 7 modules of `docs\u002Fpiano_toolkit_medico_legale.md` + DOCX template `it\u002Frelazione-medico-legale` for the final write-up. |\n| **Built-in workflows shipped (finance \u002F commercialista vertical)** | ✅ 22 — 17 tabular (inventario, scadenzario fiscale annuale, portafoglio clienti, quality-check pre-invio, riclassificazione bilanci pluriennale, indicatori economico-finanziari, metodi valutativi, contestazioni accertamento, analisi bancaria Art. 
32, rideterminazione reddito, indicatori di crisi CCII, stato passivo per rango, confronto piano vs liquidatoria, cash-flow previsionale, checklist DD documenti, rischi DD con semaforo, esposizione fiscale anno×tributo, rilievi-controdeduzioni, scadenze processuali, checklist redditi PF) + 5 assistant (relazione stima d'azienda, relazione CTU tributaria, attestazione Art. 33 CCII, report due diligence, ricorso tributario D.Lgs. 546\u002F92). Maps the 6 areas of `docs\u002Fpiano_toolkit_commercialista.md` + 3 DOCX templates (`it\u002Frelazione-stima-valore`, `it\u002Fattestazione-art-33-ccii`, `it\u002Fricorso-tributario`). |\n| Column-preset shortcuts (auto-suggest column prompt+format from name match) | ✅ 13 legal + 17 insurance, domain-scoped auto-match |\n| Picker modals with on-the-fly domain switch | ✅ workflow \u002F template \u002F tabular-template \u002F column-preset pickers all expose a `DomainSelect` combo, pre-seeded with the user's default at every open |\n| **LLM model catalogue** (`config\u002Fmodel.json` + `GET \u002Fmodels`) | ✅ 4 providers (Anthropic, Google Gemini, OpenAI, Mistral), Gemini 30-region matrix, `preview`\u002F`legacy` flags drive auto-snap to global + UI dimming |\n| Settings → Modelli LLM: catalogue-driven combos | ✅ model and region dropdowns populated from `\u002Fmodels`; \"Provider attivo\" buttons gated to providers with a saved API key (lock icon + tooltip for the rest) |\n| Chat model picker filters out unconfigured providers | ✅ ModelToggle hides Anthropic \u002F OpenAI \u002F Gemini until their API key is saved, matching the Settings page gating |\n| **Six UI locales** (`it` \u002F `en` \u002F `fr` \u002F `de` \u002F `es` \u002F `pt`) | ✅ full catalogues, identical key tree, English fallback for any missing key via the i18n store |\n| Language picker | ✅ in Settings → Profile |\n| **DOCX templates screen** — list, detail, generate, apply-to-chat | ✅ filter by domain \u002F locale \u002F free-text; detail modal renders the 
auto-generated authoring contract; \"Apply to chat\" opens a fresh conversation with the template attached |\n| **DOCX template editor** (full-page) | ✅ edits every `DocxTemplate` field — identity, layout, typography, styles, authoring contract; system templates open read-only and **duplicate** into editable user templates; user templates persist as JSON under `config\u002Fdocx-templates\u002Fuser\u002F` via `POST \u002Fdocx-templates\u002Fsave` |\n| Hide \u002F unhide DOCX templates | ✅ per-user, with Tutti \u002F Predefiniti \u002F Personalizzati \u002F Nascosti tabs (migration 0024) |\n| **Prompt translation** | ✅ the workflow and DOCX-template editors translate their free-text fields into a language chosen in a modal; runs through a bounded-concurrency pool with a per-request timeout and live progress |\n| **Prompt caching + window-aware summarization** | ✅ the stable system prefix is sent with Anthropic `cache_control` (Gemini gets implicit caching); conversation history is compressed once the whole prompt — system prefix + attached docs + history — passes 80% of the model's context window |\n| Frontend unit tests (vitest) | ✅ citation \u002F highlight \u002F markdown utilities + a DOCX-template-editor mount test |\n\n### Note: MCP client — async multi-step flows\n\nThe MCP client successfully discovers servers and dispatches **synchronous\ntool calls** (a tool that returns its real result on the same call). 
What\nis **not yet reliably handled** is the multi-step async pattern that\nhuman-in-the-loop MCP servers use — Edge \u002F Semplifica.Edge being the\ncanonical case:\n\n```\nMike → request_pseudonymized_documents       → {session_id, status:\"pending\"}\n        (Edge waits for the human to approve in its GUI)\nMike →   get_pseudonymized_documents          → list of pseudonymised files\nMike →     count entities \u002F download text     → next tools in the chain\n```\n\nToday Mike's auto-chain wrapper covers the immediate `request_*` →\n`get_*` hop (with a 300 s timeout, configurable via\n`MCP_CALL_TIMEOUT_SECS`), but a third step or beyond — \"list the\npseudonymised files, count their entities, download the pseudonymised\ntext\" as separate tool invocations within the same conversational turn —\nis not yet driven correctly by the dispatcher. The user has to nudge the\nchat to perform each subsequent step manually.\n\nTracked as work-in-progress; do **not** modify `MRUST_SYSTEM_PROMPT` or\n`build_mcp_system_prompt` when fixing this — the prompt structure is\npreserved, only the dispatcher mechanics are in scope. 
See\n[`src\u002Froutes\u002Fchat.rs`](src\u002Froutes\u002Fchat.rs) `dispatch_mcp_tool_with_async_chain`.\n\n## Documentation\n\n- [docs\u002FMANUAL.md](docs\u002FMANUAL.md) — operator manual (running, troubleshooting, recovery)\n- [docs\u002FDOCX.md](docs\u002FDOCX.md) — DOCX extraction details (tracked changes, strikes, namespaces)\n- [docs\u002FCACHE.md](docs\u002FCACHE.md) — chat-attachment cache layout + ref-counting\n- [docs\u002FCORPORA.md](docs\u002FCORPORA.md) — EUR-Lex + national legal-corpora plan + API survey\n- [docs\u002FEURLEX_REGISTRATION.md](docs\u002FEURLEX_REGISTRATION.md) — EUR-Lex V1 (no auth) + V2 SOAP registration steps\n- [docs\u002FSESSION_RECAP.md](docs\u002FSESSION_RECAP.md) — historical session notes\n- [docs\u002FUPSTREAM_SYNC.md](docs\u002FUPSTREAM_SYNC.md) — policy + audit log for syncing fixes from upstream `willchen96\u002Fmike`\n- [docs\u002FCORPUS_PLUGINS.md](docs\u002FCORPUS_PLUGINS.md) — JSON-manifest plugin system for legal corpora (schema, strategies, add-a-corpus guide)\n- [docs\u002FWORKFLOWS.md](docs\u002FWORKFLOWS.md) — user manual for Workflows, Tabular Reviews, and Assistant injection. Includes 7 non-legal examples (medical, finance, real estate, HR, insurance, IP, compliance) for designing your own templates.\n- [HISTORY.md](HISTORY.md) — release notes \u002F changelog, date-grouped\n- For the pristine upstream README, see [`willchen96\u002Fmike`][upstream] directly\n\n## License\n\nMikeRust is a fork of the AGPL-3.0 [`willchen96\u002Fmike`][upstream] project and ships under **AGPL-3.0**. The backend (`src\u002F`, `src-tauri\u002F`) is original Rust and the `frontend\u002F` is an original clean-room Svelte rewrite; both ship under the same license for consistency. 
See `LICENSE`.\n\n**Brand assets are not AGPL.** The wordmark **Semplifica**, the corporate\nname **Semplifica s.r.l.**, and the Semplifica logo shipped under\n[`frontend\u002Fpublic\u002Fsemplifica\u002F`](frontend\u002Fpublic\u002Fsemplifica\u002F) are\ntrademarks of [Semplifica s.r.l.](https:\u002F\u002Fsemplifica.ai) and are reserved\nseparately from the code license. The **MikeRust** name is also reserved\nas the identifier of this upstream (MikeRust ships without its own logo\nfor now — only the Semplifica mark is present in the UI). Forks with\nsubstantive changes are asked to drop the Semplifica wordmark\u002Flogo and\nrename the binary; see [NOTICE.md](NOTICE.md) for the full policy and\nthe precedent (GitLab CE, Mastodon, Nextcloud, Element, Plausible, …).\n","MikeRust is a locally run AI document assistant designed for processing legal and technical documents. Its core features include a backend service built with Rust+axum, support for a SQLite database and local filesystem storage, text processing with ONNX-format embeddings, and a desktop application experience provided through the Tauri framework; the frontend is rewritten in Svelte 5, with a clean and responsive interface. The software runs entirely on the user's device with no dependence on cloud services or external authentication, ensuring data security and privacy protection. It is suited to professionals in legal tech, insurtech, and similar fields who need to process sensitive information locally and want high performance and privacy protection.",2,"2026-06-11 04:06:59","CREATED_QUERY"]