[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80810":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":13,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":14,"stars7d":15,"stars30d":16,"stars90d":14,"forks30d":14,"starsTrendScore":14,"compositeScore":17,"rankGlobal":8,"rankLanguage":8,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":19,"hasPages":19,"topics":21,"createdAt":8,"pushedAt":8,"updatedAt":22,"readmeContent":23,"aiSummary":24,"trendingCount":14,"starSnapshotCount":14,"syncStatus":12,"lastSyncTime":25,"discoverSource":26},80810,"hey-jude","sure-scale\u002Fhey-jude","sure-scale",null,"Python",41,9,2,4,0,1,3,40.8,"GNU Affero General Public License v3.0",false,"main",[],"2026-06-12 04:01:30","\u003Cp align=\"left\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fjude-logo.png\" alt=\"Hey Jude logo\" width=\"120\">\n\u003C\u002Fp>\n\n# Hey Jude\n\n**Privacy gateway for legal LLM workflows.**\n\n\u003Cp align=\"left\">\n  \u003Cimg alt=\"Python 3.11+\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.11%2B-3776AB?style=flat-square&logo=python&logoColor=white\">\n  \u003Cimg alt=\"FastAPI\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FFastAPI-0.115%2B-009688?style=flat-square&logo=fastapi&logoColor=white\">\n  \u003Cimg alt=\"Docker Compose\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDocker-Compose-2496ED?style=flat-square&logo=docker&logoColor=white\">\n  \u003Cimg alt=\"OpenAI-compatible\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FOpenAI--compatible-API-111827?style=flat-square\">\n  \u003Cimg alt=\"License AGPL-3.0\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-AGPL--3.0-111827?style=flat-square\">\n\u003C\u002Fp>\n\nHey Jude sits between your app and your LLM provider. 
It strips PII from prompts before they leave your environment, then restores the original details in the response. Your users see real names; the cloud LLM never does.\n\nIt uses a local LLM to understand context — so legal defined terms like \"the Purchaser\" stay intact while real names, emails, and addresses get replaced with semantic placeholders like `INVESTMENT_BANK_01` or `PERSON_02`. A Presidio-based safety net catches anything the LLM misses.\n\nThis is a helper layer for data minimization, not a guarantee. Use it as part of a broader confidentiality strategy.\n\n---\n\n## How It Works\n\n1. Your app sends a chat completion request to Hey Jude (OpenAI-compatible API).\n2. A local LLM analyzes the text and identifies real PII vs. legal\u002Fstructural terms.\n3. PII gets replaced with semantic placeholders. Legal defined terms are kept.\n4. A Presidio safety net scans the result for anything the LLM missed.\n5. The sanitized prompt is forwarded to your chosen LLM provider.\n6. The response comes back, placeholders are swapped for originals, and your app gets a normal-looking reply.\n\n---\n\n## Why It Exists\n\n*   **Context-aware anonymization:** A local LLM understands that \"Goldman Sachs\" is PII but \"the Purchaser\" is a legal term — something regex and NER can't do reliably.\n*   **Semantic placeholders:** `INVESTMENT_BANK_01`, not `ORGANIZATION_01`. The downstream LLM keeps enough context to reason well.\n*   **Safety net:** Presidio runs after the LLM as a second pass. Configurable as `warn` (auto-fix), `strict` (reject), or `off`.\n*   **Drop-in API:** Exposes OpenAI, Anthropic, and Gemini-compatible endpoints. Point existing SDKs at the gateway.\n*   **Fully local by default:** Runs without cloud keys using Ollama for both anonymization and demo responses.\n*   **Cloud routing when ready:** Route anonymized prompts to OpenAI, Anthropic, Gemini, Azure, or any LiteLLM-compatible provider.\n\n---\n\n## Quick Start\n\n### 1. 
Install Ollama and Pull the Default Model\n\n```bash\nollama pull qwen3.5:4b\n```\n\n### 2. Run the Gateway\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fnickwatson\u002Fhey-jude.git\ncd hey-jude\ncp .env.example .env\ndocker compose up --build\n```\n\nThe gateway will run at `http:\u002F\u002Flocalhost:4005`.\n\n### 3. Test It\n\nIn another terminal:\n\n```bash\npython3 tests\u002Fe2e\u002Ftest_gateway.py\n```\n\nThe default setup is fully local: Redis runs in Docker, and Ollama runs on your host machine.\n\n---\n\n## Default Configuration\n\nThe defaults work out of the box for most users.\n\n| Variable | Default | Purpose |\n|----------|---------|---------|\n| `REDIS_URL` | `redis:\u002F\u002Flocalhost:6379\u002F0` | Temporary mapping storage |\n| `API_KEY` | `sk-heyjude-dev` | Gateway authentication |\n| `LOCAL_LLM_URL` | `http:\u002F\u002Flocalhost:11434\u002Fv1` | Local anonymization endpoint |\n| `LOCAL_LLM_MODEL` | `qwen3.5:4b` | Local anonymization model |\n| `LOCAL_LLM_API_KEY` | *(empty)* | API key for cloud-hosted anonymization models |\n| `EXTERNAL_LLM_MODEL` | `ollama_chat\u002Fqwen3.5:4b` | Destination model via LiteLLM |\n| `EXTERNAL_LLM_API_BASE` | `http:\u002F\u002Flocalhost:11434` | LiteLLM API base for Ollama |\n| `ANONYMIZATION_MODE` | `llm` | `llm` (context-aware) or `mechanical` (NER-only) |\n| `SAFETY_NET_STRICTNESS` | `warn` | `warn` (auto-fix), `strict` (reject), or `off` |\n| `DOCUMENT_UNREADABLE_ACTION` | `reject` | What to do when an uploaded file has no readable text layer: `reject`, `warn`, or `skip` |\n| `CUSTOM_RECOGNIZERS_PATH` | *(unset)* | Path to a YAML\u002FJSON file of custom Presidio regex recognizers |\n| `KNOWN_ENTITIES_PATH` | *(unset)* | Path to a YAML\u002FJSON known-entity dictionary |\n| `AUDIT_ENABLED` | `false` | Enable request-level audit logging |\n| `AUDIT_DESTINATION` | `stdout` | `stdout` or a file path |\n| `AUDIT_CONTENT_LEVEL` | `metadata` | `metadata` (digests only), `anonymized` (PII-free 
payload), `full` (raw content) |\n| `AUDIT_ROTATION` | `monthly` | Segment files by period: `none`, `daily`, `monthly` |\n| `AUDIT_FAILURE_MODE` | `ignore` | `ignore` (logging never blocks a request) or `fail` (fail-closed) |\n\nWhen running through Docker Compose, the service automatically uses `host.docker.internal` so the container can reach Ollama on your Mac.\n\n### Choosing Models\n\nHey Jude uses **two separate models** for two different jobs, and you size them independently:\n\n| Role | Setting | Job | Pick for |\n|------|---------|-----|----------|\n| Anonymizer | `LOCAL_LLM_MODEL` (+ `LOCAL_LLM_URL`) | Classify entities and emit placeholder JSON | Speed and cost — a small, fast model is enough |\n| Destination | `EXTERNAL_LLM_MODEL` | Do the actual legal work on the already-anonymized prompt | Capability — your strongest model |\n\nThe anonymizer's task is narrow and structured (find PII, output a fixed JSON schema), so it does **not** need a frontier model. Putting the cheapest model that holds classification quality here cuts cost and latency on every request, because the anonymizer runs once per message before anything reaches the destination. Reserve the expensive, capable model for `EXTERNAL_LLM_MODEL`, which never sees raw PII anyway.\n\nThe two run on independent endpoints and providers — e.g. a small Azure or Ollama model for anonymization and Gemini, Anthropic, or OpenAI for the destination — so you tune the cost\u002Fquality trade-off on each without touching the other.\n\n### Domain-Specific Detection\n\nDefault NER misses the abbreviated, inconsistent names common in legal text (\"Call w\u002F J. Smith re: Acme merger\"). Two opt-in mechanisms close the gap. Templates live in [`examples\u002F`](examples\u002F).\n\n**Custom recognizers** (`CUSTOM_RECOGNIZERS_PATH`) add regex-based entity types — matter numbers, client codes, opposing-counsel formats. 
They run in the Presidio safety net and as a mechanical-mode detection strategy.\n\n**Known-entity dictionary** (`KNOWN_ENTITIES_PATH`) is a firm-maintained list of the names you must never leak — clients, personnel, matter names. Listed entities are matched case-insensitively and **guaranteed replaced before the prompt reaches the LLM**, so a critical name never depends on the model noticing it. All spelling variants (`term` + `aliases`) collapse to one placeholder.\n\nBy default an auto-numbered placeholder (e.g. `CLIENT_NAME_01`) is assigned per request. Set `replace_with` on an entry to fix its placeholder so it stays identical across every request.\n\nHey Jude extracts text from common legal document formats before anonymization, including text PDFs, DOCX, HTML, EML, TXT, Markdown, and RTF. Scanned PDFs, flattened PDFs, and images are not OCRed yet; by default they are rejected so unreadable content is not forwarded without anonymization.\n\n### Audit Logging\n\nSet `AUDIT_ENABLED=true` to record one envelope per request: timestamps, latency, which external model it was routed to, entity count, sensitivity, safety-net result, the per-entity anonymization decisions, and SHA-256 digests of the input and the anonymized output. This is the artifact that proves anonymization happened and that only PII-free content left the network.\n\n**Per-entity decisions.** In LLM mode each record carries a `decisions` list — what the anonymizer found and what it did to it (`action` is `replace`, `keep`, or `generalize`) with the reason. At the default `metadata` level this stores `entity_type`, `action`, and `reason` only, never the raw entity text; `full` additionally records the original text and its replacement. 
This is the per-matter \"what did we send, what did we withhold, why\" trail for discovery and malpractice defense.\n\n**Tamper-evident.** The log is hash-chained JSONL: each record carries the hash of the previous one, so editing or deleting any historical record breaks the chain from that point on — detectable even by someone with write access. Verify a segment at any time:\n\n```bash\nhey-jude audit verify audit\u002Faudit-2026-05.jsonl\n```\n\nSet `AUDIT_HMAC_KEY` to bind the chain to a secret so an attacker who cannot read the key cannot recompute valid hashes. Walk a segment for conflict checks, client audits, or discovery production:\n\n```bash\nhey-jude audit query audit\u002Faudit-2026-05.jsonl --matter M-123456 --since 2026-05-01\n```\n\n**Content level.** The default `metadata` stores **no raw client PII** — only digests — so the audit trail itself does not become a confidential-data store. Choose `anonymized` to retain the PII-free payload (useful as a malpractice-defensible record of what the AI was actually asked and answered, without storing client identities). `full` additionally persists the raw pre-anonymization content and is a deliberate PII honeypot; it logs a startup warning and should be reserved for environments where that risk is understood. Tag requests with the `X-Heyjude-Matter-Id` header so records are queryable by matter; enable `AUDIT_ACTOR_HEADER` only if your firm policy permits attorney attribution.\n\n**Immutability vs. retention.** A hash chain makes history immutable, but legal duties (matter-close destruction, data-subject erasure, retention schedules) require eventual deletion. Hey Jude resolves this with period segments (`AUDIT_ROTATION`): each month\u002Fday is an independent chain in its own file, so an expired segment can be destroyed wholesale without invalidating the active chain. 
**Suspend rotation and deletion while a matter is under legal hold.** For cryptographic-grade WORM, point `AUDIT_DESTINATION` at a write-once volume (`chattr +a` on Linux) or ship sealed segments to object storage with an immutability lock (e.g. S3 Object Lock). Keep the log on encrypted disk; Hey Jude does not encrypt records itself.\n\n---\n\n## Native Python\n\nIf you prefer not to use Docker, run Redis yourself and start the app directly:\n\n```bash\npip install -e \".[dev]\"\npython3 -m spacy download en_core_web_lg\nuvicorn hey_jude.main:app --host 0.0.0.0 --port 4005\n```\n\n---\n\n## Advanced Models\n\nThe default `qwen3.5:4b` target is chosen for low-friction local setup. If you want a different local model, pull it with Ollama and update `LOCAL_LLM_MODEL` plus `EXTERNAL_LLM_MODEL`.\n\n| Use case | Model |\n|----------|-------|\n| Fastest tiny local test | `qwen3.5:0.8b` |\n| Default local setup | `qwen3.5:4b` |\n| Stronger local setup | `qwen3.5:9b` |\n| High-end local setup | `qwen3.6:35b-a3b` |\n\nExample:\n\n```bash\nollama pull qwen3.5:9b\nLOCAL_LLM_MODEL=qwen3.5:9b\nEXTERNAL_LLM_MODEL=ollama_chat\u002Fqwen3.5:9b\n```\n\nTo use Apple MLX instead of Ollama, serve an OpenAI-compatible endpoint and point `LOCAL_LLM_URL` or `EXTERNAL_LLM_API_BASE` at that server.\n\nTo route the final prompt to a cloud provider, set `EXTERNAL_LLM_MODEL` to any LiteLLM model identifier and provide that provider's API key, for example `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY`.\n\n### Azure AI as Anonymization Backend\n\nInstead of running Ollama locally, you can use an Azure AI-hosted model for the anonymization layer. This is useful on machines where local inference is slow (e.g., laptops without dedicated GPU).\n\nAzure AI Foundry exposes OpenAI-compatible endpoints for models deployed from the model catalog. 
Set these in your `.env`:\n\n```bash\nLOCAL_LLM_URL=https:\u002F\u002F\u003Cyour-resource>.openai.azure.com\u002Fopenai\u002Fv1\nLOCAL_LLM_MODEL=DeepSeek-V4-Pro\nLOCAL_LLM_API_KEY=\u003Cyour-azure-api-key>\n```\n\nAvailable models (same endpoint, swap `LOCAL_LLM_MODEL`):\n\n| Model | `LOCAL_LLM_MODEL` value | Notes |\n|-------|------------------------|-------|\n| DeepSeek V4 Pro | `DeepSeek-V4-Pro` | Recommended — fast, strong at structured JSON output |\n| Kimi K2.6 | `Kimi-K2.6` | Reasoning model, needs high `max_tokens` (uses thinking tokens) |\n\nAny model deployed to your Azure AI project that serves an OpenAI-compatible chat completions endpoint will work. The gateway sends requests to `{LOCAL_LLM_URL}\u002Fchat\u002Fcompletions` with both `api-key` and `Authorization: Bearer` headers.\n\n**Prompt caching.** The anonymization prompt (`prompts\u002Fanonymize.txt`) keeps its large static block — task, classification instructions, and output schema — first, and the per-request variables (the existing placeholder mapping and the message text) last. That fixed prefix is identical on every request, so providers with automatic prompt caching (Azure OpenAI, Anthropic, Gemini) reuse it instead of re-billing the instructions each call, cutting input-token cost and latency on the hot path. If you edit the template, keep the variables at the end or the cached prefix is lost.\n\n### E2E Testing\n\nThe end-to-end test uses three models:\n\n| Role | Model | Purpose |\n|------|-------|---------|\n| Anonymizer | Configured via `LOCAL_LLM_*` | PII detection and replacement |\n| Destination | Gemini Flash | Receives anonymized prompts |\n| Evaluator | Gemini Pro | Judges anonymization quality (PII leaks, coherence, completeness) |\n\n```bash\nGEMINI_API_KEY=your-key python3 tests\u002Fe2e\u002Ftest_gemini_anonymization.py\n```\n\nThe test auto-downloads public-domain legal documents from SEC EDGAR on first run (NDAs, employment agreements, settlement agreements, etc.) 
and uses them alongside inline test cases. Downloaded documents are cached locally and gitignored.\n\n---\n\n## SDK Integration\n\nSince Hey Jude behaves like standard LLM endpoints, existing clients can point at the gateway.\n\n### Mike OSS\n\nMike can route OpenAI, Claude, and Gemini calls through Hey Jude. After starting this gateway, set in Mike's `backend\u002F.env`:\n\n```bash\nHEY_JUDE_ENABLED=true\nHEY_JUDE_BASE_URL=http:\u002F\u002Flocalhost:4005\nHEY_JUDE_API_KEY=sk-heyjude-dev\n```\n\n### Python OpenAI SDK\n\n```python\nfrom openai import OpenAI\n\nclient = OpenAI(\n    base_url=\"http:\u002F\u002Flocalhost:4005\u002Fv1\",\n    api_key=\"sk-heyjude-dev\",\n)\n\nresponse = client.chat.completions.create(\n    model=\"gpt-4o\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"I am John Doe and I work at Google.\"}\n    ],\n)\nprint(response.choices[0].message.content)\n\nresponse = client.responses.create(\n    model=\"gpt-4o\",\n    input=\"I am John Doe and I work at Google.\",\n)\nprint(response.output_text)\n```\n\n### Python Anthropic SDK\n\n```python\nfrom anthropic import Anthropic\n\nclient = Anthropic(\n    base_url=\"http:\u002F\u002Flocalhost:4005\",\n    api_key=\"sk-heyjude-dev\",\n)\n\nresponse = client.messages.create(\n    model=\"claude-3-5-sonnet-20241022\",\n    max_tokens=1024,\n    messages=[\n        {\"role\": \"user\", \"content\": \"I am John Doe and I work at Google.\"}\n    ],\n)\nprint(response.content[0].text)\n```\n\n### Node.js OpenAI SDK\n\n```javascript\nimport OpenAI from 'openai';\n\nconst client = new OpenAI({\n  baseURL: 'http:\u002F\u002Flocalhost:4005\u002Fv1',\n  apiKey: 'sk-heyjude-dev',\n});\n\nconst response = await client.chat.completions.create({\n  model: 'gpt-4o',\n  messages: [{ role: 'user', content: 'I am John Doe and I work at Google.' 
}],\n});\nconsole.log(response.choices[0].message.content);\n\nconst responsesResponse = await client.responses.create({\n  model: 'gpt-4o',\n  input: 'I am John Doe and I work at Google.',\n});\nconsole.log(responsesResponse.output_text);\n```\n\n### Node.js Anthropic SDK\n\n```javascript\nimport Anthropic from '@anthropic-ai\u002Fsdk';\n\nconst client = new Anthropic({\n  baseURL: 'http:\u002F\u002Flocalhost:4005',\n  apiKey: 'sk-heyjude-dev',\n});\n\nconst response = await client.messages.create({\n  model: 'claude-3-5-sonnet-20241022',\n  max_tokens: 1024,\n  messages: [{ role: 'user', content: 'I am John Doe and I work at Google.' }],\n});\nconsole.log(response.content[0].text);\n```\n\n---\n\n## License\n\nGNU Affero General Public License v3.0. See [LICENSE](LICENSE).\n","Hey Jude is a privacy gateway for legal LLM workflows, designed to protect personally identifiable information (PII). It uses a local LLM to identify real names, email addresses, physical addresses, and other sensitive details in text and replace them with semantic placeholders while leaving legal defined terms intact, and it uses Presidio as a safety net to catch anything that might be missed. The project is written in Python, built on the FastAPI framework, and supports Docker Compose deployment and an OpenAI-compatible API. It suits applications that must handle sensitive data yet rely on cloud-hosted LLM services, serving as one part of a data-minimization strategy that helps developers use capable language models without sacrificing user privacy.","2026-06-11 04:02:25","CREATED_QUERY"]