[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-81850":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":13,"contributorsCount":13,"subscribersCount":13,"size":13,"stars1d":15,"stars7d":16,"stars30d":16,"stars90d":13,"forks30d":13,"starsTrendScore":17,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":20,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":34,"readmeContent":35,"aiSummary":36,"trendingCount":13,"starSnapshotCount":13,"syncStatus":37,"lastSyncTime":38,"discoverSource":39},81850,"gate","GaaraZhu\u002Fgate","GaaraZhu","A deterministic privacy boundary between your data and AI.","",null,"Rust",108,0,25,45,83,135,88.3,"MIT License",false,"main",true,[24,25,26,27,28,29,30,31,32,33],"agentic-ai","ai-governance","ai-privacy","ai-security","data-governance","llm-security","mcp","pii-protection","privacy-engineering","rust","2026-06-12 04:01:35","\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fbanner.png\" alt=\"gate\" width=\"600\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cstrong>A deterministic privacy boundary between your data and AI.\u003Cbr>Intercepts query results before the model sees them — rule-driven, reproducible, and audit-ready.\u003C\u002Fstrong>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FGaaraZhu\u002Fgate\u002Factions\">\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FGaaraZhu\u002Fgate\u002Fworkflows\u002FCI\u002Fbadge.svg\" alt=\"CI\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FGaaraZhu\u002Fgate\u002Freleases\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fv\u002Frelease\u002FGaaraZhu\u002Fgate\" alt=\"Release\">\u003C\u002Fa>\n  \u003Ca 
href=\"https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg\" alt=\"License: MIT\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FGaaraZhu\u002Fhomebrew-gate\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fhomebrew-tap-orange?logo=homebrew\" alt=\"Homebrew\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cstrong>English\u003C\u002Fstrong> | \u003Ca href=\"README.zh-CN.md\">简体中文\u003C\u002Fa>\n\u003C\u002Fp>\n\n---\n\nAI agents increasingly access internal databases and APIs through CLI tools, scripts, and MCP servers. Without safeguards, sensitive data such as emails, phone numbers, tax identifiers, and payment details can be unintentionally exposed to LLM context windows.\n\n`gate` intercepts query results before they reach the model and automatically redacts detected PII fields without requiring changes to existing agent workflows or prompts. It covers both access paths agents use: **Bash commands** (via a harness hook) and **MCP server calls** (via a wrap-style stdio proxy), adding \u003C 10 ms of overhead per query.\n\n## Why rules, not a model?\n\nMost PII guardrails for AI agents are themselves LLMs — they send your data to a model to decide whether it's sensitive. 
Gate takes the opposite approach.\n\n| | gate | LLM-based redaction |\n|---|---|---|\n| Decision method | Regex + column heuristics + Luhn | Model inference |\n| Deterministic | ✅ Same input always produces the same output | ❌ Varies by run and model version |\n| Data stays local | ✅ Never leaves your machine | ❌ Sent to a model API for classification |\n| Latency | ✅ \u003C 10ms overhead | ❌ Adds an API round-trip |\n| Auditable | ✅ Every decision traceable to an explicit rule | ❌ Model reasoning is opaque |\n| Known gaps | ✅ Documented — free-text prose | ❌ False-negative rate unknown |\n\nThe trade-off gate makes: rules can't catch PII in unstructured free-text prose. The [threat model](THREAT-MODEL.md) documents what gate doesn't cover.\n\n## Demo\n\nThe demo walks through three steps:\n\n1. `gate scan` detecting PII columns across the schema before any query runs\n2. An agent querying the transactions table with gate disabled — `card_number` fully visible\n3. The same queries with gate enabled — `card_number` redacted across both MCP and Bash paths\n\n![gate intercepting PII before it reaches the model](assets\u002Fdemo.gif)\n\nAlso works with OpenCode, Cursor, GitHub Copilot CLI, Codex CLI, and Gemini CLI — see [Supported AI Tools](#supported-ai-tools) for the full compatibility matrix.\n\n> For the design rationale, threat-model walkthrough, and detection-pipeline deep dive, read [**Introducing gate**](https:\u002F\u002Fgaarazhu.github.io\u002Fintroducing-gate\u002F).\n\n## Scan your schema\n\nBefore installing the hook, use `gate scan` to assess how much PII your schema exposes. Pipe a `TABLE_NAME, COLUMN_NAME` query into it and gate prints a risk report across every table. 
No config is required for `gate scan` itself — if you haven't created one yet, run `gate config --init-only` first.\n\n```bash\npsql -U \u003Cuser> -h \u003Chost> -d \u003Cdbname> -c \"SELECT TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = 'public' ORDER BY TABLE_NAME, ORDINAL_POSITION\" | gate scan\n```\n\nSee [docs\u002Fscan.md](docs\u002Fscan.md) for queries against MySQL, MS SQL Server (including native `sqlcmd`), Databricks, and toolkit-managed clients.\n\nRisk level is weighted by category sensitivity — one SSN column matters more than twenty address columns. Exits with code 1 if any PII columns are found (scriptable in CI). Pass `--verbose` to show all detected columns, or `--json` for machine-readable output.\n\n| Sensitivity | Categories | Risk floor |\n|-------------|-----------|------------|\n| **Critical** | Government IDs, Health & medical, Financial, Biometric | **HIGH** always; **CRITICAL** if ≥3 columns or >10% of schema |\n| **Elevated** | Contact, Names, Date of birth, Location of birth, Family & relationships, Employment | **HIGH** if >5% of schema; **CRITICAL** if >25% |\n| **Standard** | Address & location, Online & technical, Demographics | **HIGH** if >25% of schema |\n\n> **Note:** `gate scan` detects PII by column name only. A LOW result means your column names look clean — it does not mean the data is safe. Gate 2 additionally inspects values at query time, catching PII in free-text, JSON, and ambiguously-named columns that scan cannot see.\n\nFor false positives (e.g. `city` in a `products` table), run `gate scan --review` to triage interactively and add columns to the allowlist. Allowlisted columns skip **name-based** redaction only — Gate 2 still checks their values against regex patterns and the Luhn algorithm. Manage the list directly with `gate allowlist add\u002Fremove\u002Flist`.\n\n## Quickstart\n\n1. 
**Install gate**\n\n   ```bash\n   # Homebrew — macOS and Linux (recommended)\n   brew tap GaaraZhu\u002Fgate && brew install gate\n\n   # cargo binstall — downloads a prebuilt binary\n   cargo binstall gate\n\n   # Or grab a binary from the releases page\n   # https:\u002F\u002Fgithub.com\u002FGaaraZhu\u002Fgate\u002Freleases\n   ```\n\n2. **Create your config** (opens `~\u002F.config\u002Fgate\u002Fconfig.yaml` in your editor):\n\n   ```bash\n   gate config\n   ```\n\n3. **Register the hook** with your agent harness:\n\n   ```bash\n   # Claude Code (default)\n   gate init\n\n   # OpenCode\n   gate init --harness opencode\n\n   # Cursor\n   gate init --harness cursor\n\n   # GitHub Copilot CLI (project-scoped, run from repo root)\n   gate init --harness copilot-cli\n\n   # Codex CLI\n   gate init --harness codex\n\n   # Gemini CLI\n   gate init --harness gemini\n   ```\n\n   Add `--scope project` for project-only setup. Restart your OpenCode, Cursor, or Gemini CLI session after `gate init` to load the hook. For Codex CLI, restart the session, then review the hook in the Trust & Permissions UI, mark it as trusted, and enable it. For Copilot CLI, the generated `.github\u002Fhooks\u002FPreToolUse.json` is gitignored by default — each developer runs `gate init --harness copilot-cli` once in their local clone.\n\n4. *(Optional)* **Register MCP server proxies** so `tools\u002Fcall` responses also pass through gate:\n\n   ```bash\n   # Claude Code (default) — dry-run, shows what would change\n   gate init --wrap-mcp\n\n   # OpenCode\n   gate init --harness opencode --wrap-mcp --yes\n\n   # Cursor\n   gate init --harness cursor --wrap-mcp --yes\n\n   # Copilot CLI\n   gate init --harness copilot-cli --wrap-mcp --yes\n\n   # Codex CLI\n   gate init --harness codex --wrap-mcp --yes\n\n   # Gemini CLI\n   gate init --harness gemini --wrap-mcp --yes\n   ```\n\n   Add `--scope project` for project-level MCP config. 
For Cursor project-scoped MCP, re-enable the servers in **Settings → Tools & MCPs** after registration. See [docs\u002Fmcp.md](docs\u002Fmcp.md) for `--servers`, per-harness paths, and manual single-server registration.\n\n5. **Start your AI session** — `gate` intercepts query commands automatically. No changes to your prompts or tools required.\n\nRun `gate validate` to confirm your config is valid before the first session.\n\n## How it works\n\n`gate` covers two access paths agents use to reach data. The [blog post](https:\u002F\u002Fgaarazhu.github.io\u002Fintroducing-gate\u002F) has the full walkthrough; the short version:\n\n### Bash tooling path\n\nEvery Bash command passes through `gate hook` first. Commands that match a configured tool are silently rewritten to `gate run -- \u003Coriginal command>`, which spawns the subprocess and pipes stdout through the two-gate detection pipeline. The rewrite happens in the harness's pre-tool-execution hook — it is **enforcing** in Claude Code, OpenCode, Cursor, GitHub Copilot CLI, Codex CLI, and Gemini CLI; the agent cannot bypass it. Humans and CI scripts running outside the harness are untouched.\n\n```\nAI asks to run: tkpsql query --sql \"SELECT * FROM users\"\n                        │\n         harness hook fires (PreToolUse \u002F tool.execute.before)\n                        │\n              gate hook rewrites to: gate run -- tkpsql query --sql \"...\"\n                        │\n         ┌──────────────┴──────────────┐\n         │ Gate 1: SQL inspection      │  SELECT * → no column hints, defer to Gate 2\n         │ Gate 2: Value scanning      │  regex + column-name heuristics + Luhn check\n         └──────────────┬──────────────┘\n                        │\n         {\"id\": 1, \"full_name\": \"[PII:name]\", \"email\": \"[PII:email]\", ..., \"_gate_summary\": {...}}\n```\n\n### MCP path\n\n`gate mcp` is a transparent stdio proxy registered in the harness as the MCP server. 
It forwards all JSON-RPC traffic verbatim except `tools\u002Fcall` responses, which pass through Gate 2 before reaching the model. No changes to the upstream server are required.\n\n> **Note:** only `tools\u002Fcall` responses are redacted — `resources\u002Fread`, `prompts\u002Fget`, and other MCP message types are forwarded without inspection.\n\n```\nAI ──tools\u002Fcall──> gate mcp ──forward──> upstream MCP server\n                       │\n                       │ \u003C── tools\u002Fcall response with PII\n                       │\n                       │ Gate 2 scan + redact\n                       │\nAI \u003C───redacted result─┘\n```\n\n## Output format\n\nRedacted output preserves the original JSON structure. PII values are replaced with `[PII:\u003Ctype>]` placeholders. A `_gate_summary` field is appended reporting what was redacted.\n\n```json\n{\n  \"rows\": [{\"id\": 1, \"email\": \"[PII:email]\", \"ssn\": \"[PII:ssn]\"}],\n  \"count\": 1,\n  \"_gate_summary\": {\"redacted\": 2, \"types\": [\"email\", \"ssn\"], \"warnings\": []}\n}\n```\n\nWith `hash_values: true` in config, each placeholder gains an 8-char hex suffix derived from the original value (`[PII:email:7f83b165]`). The same raw value always produces the same suffix, so the AI can join or deduplicate across rows without ever seeing the underlying data. Error responses from the underlying tool pass through unchanged.\n\n## Protection retrospective\n\n`_gate_summary` reports a single response. `gate retro` aggregates across all of them — total queries seen, PII fields redacted, hit rate, plus a breakdown by tool and PII category. Useful for periodic audits and for confirming the boundary is doing real work.\n\n![gate retro output](assets\u002Fretro.jpg)\n\nStats are collected by default and written to a local JSONL log on disk — they never leave your machine. 
Disable with `stats.enabled: false` in config.\n\n## What gate does NOT protect against\n\n`gate` is a deterministic redaction layer, not a sandbox. It assumes the agent is non-adversarial and only inspects output from commands listed under `tools:` in config. The following are deliberately out of scope:\n\n- **Adversarial agents \u002F prompt injection.** Gate's threat model is an agent that *inadvertently* exfiltrates PII. `gate protect` (Unix) blocks the most direct bypass — a hijacked agent disabling gate via config edits — by transferring config ownership to root. But a determined attacker can still route around gate by invoking commands not in `tools:`, requesting non-JSON output formats, piping through encoders, or removing the hook entry from the harness settings file for the next session. Pair gate with a harness-level Bash allowlist to close the residual gap.\n- **Commands not in `tools:`.** The AI can invoke them freely; their output is never inspected.\n- **Non-JSON tool output.** Plain text, CSV, and other formats pass through unchanged. Configure tools to emit JSON.\n- **Encoded or obfuscated PII.** Base64-encoded emails, URL-encoded values, or deliberately spaced strings (`a l i c e @ e x a m p l e . c o m`) are not detected.\n- **Non-US PII by value alone.** The built-in SSN regex requires dashes and the phone pattern is US-centric. Non-US formats rely on column-name matching — extend `pii.column_names` or `pii.patterns` for your region.\n- **PII already in the model's context** from prior turns, system prompts, file reads, or earlier summarisation. 
Gate filters what goes *into* the model from configured tools; what's already there stays there.\n- **Tool-side network exfiltration.** If a configured tool sends data to an external service directly (rather than returning it via stdout), gate never sees it.\n- **Write operations.** `INSERT`, `UPDATE`, `DELETE` are not inspected or blocked.\n- **Credential exposure.** Gate holds no credentials; that is the responsibility of the underlying tool. Prefer toolkit commands or MCP servers over raw clients that take credentials on the CLI.\n\nFor a stronger boundary, combine gate with harness-level tool restrictions and database-level read-only roles. See [THREAT-MODEL.md](THREAT-MODEL.md) for the full attacker model and known bypasses.\n\n## Supported query tools\n\nAny command that returns JSON can be configured as a `gate` target — database clients, internal API calls via `curl`, or any other tool your AI agent uses to fetch data. The AI sees the same structured response it always did, with PII values replaced in-place.\n\n| Command | Type | Notes |\n|---|---|---|\n| `tkpsql` | PostgreSQL (toolkit-managed) | `sql_arg: \"--sql\"` |\n| `tkmsql` | MS SQL Server (toolkit-managed) | `sql_arg: \"--sql\"` |\n| `tkdbr` | Databricks (toolkit-managed) | `sql_arg: \"--sql\"` |\n| `databricks` | Databricks CLI (native) | `sql_arg: \"--json\"`, `json_sql_path: \"statement\"` |\n| `curl` | HTTP data sources | `pipe: \"jq -c .\"` |\n| `psql`, `mysql`, `mariadb` | Raw DB clients | **Not enabled by default** — see [Raw database clients](docs\u002Fconfiguration.md#raw-database-clients-opt-in) |\n\nPrefer toolkit commands or MCP servers over raw clients: raw clients typically require credentials on the command line, which lands in the agent's transcript, shell history, and process listing. Toolkit commands ([`tk*`](https:\u002F\u002Fgithub.com\u002Fscott-abernethy\u002Ftoolkit)) inject credentials from a secrets store; MCP servers hide the connection string entirely. 
`gate` works with any JSON-returning command — toolkit is not required.\n\n## Commands\n\n```bash\ngate --help                    # full subcommand list\ngate \u003Csubcommand> --help       # details for any subcommand\n```\n\nThe ones you'll use most:\n\n| Command | Purpose |\n|---|---|\n| `gate init` | Register the hook with your harness (see Quickstart) |\n| `gate config` | Create and edit the YAML config |\n| `gate scan` | PII risk report across your schema |\n| `gate allowlist add\u002Fremove\u002Flist` | Manage column-name false positives |\n| `gate retro` | Protection retrospective — total queries & PII fields redacted, breakdown by tool and PII type\u002Fcategory, hit rate with visual progress bar |\n| `gate enable` \u002F `gate disable` | Toggle redaction without uninstalling |\n| `gate validate` | Check config for errors before the first session |\n| `gate protect` \u002F `gate unprotect` *(Unix only)* | Transfer config ownership to root |\n| `gate uninstall` | Remove everything gate added to your system |\n\nSee [docs\u002Fcommands.md](docs\u002Fcommands.md) for the full reference, including `gate run`, `gate mcp`, and the `--wrap-mcp` \u002F `--scope` \u002F `--harness` flags.\n\n### Config file protection (Unix only)\n\nFor a stronger guarantee, transfer ownership of the config to root so the agent cannot modify it:\n\n```bash\nsudo gate protect      # any future enable\u002Fdisable\u002Fconfig\u002Fallowlist now needs sudo\nsudo gate unprotect    # restore direct write access\n```\n\nEnforced at the OS level across all harnesses (Claude Code, OpenCode, Cursor, GitHub Copilot CLI, Codex CLI, Gemini CLI). 
Not supported on Windows.\n\n## Supported AI Tools\n\n| AI Tool | Bash Hook | MCP Wrap | Notes |\n|---|:---:|:---:|---|\n| [Claude Code](https:\u002F\u002Fclaude.ai\u002Fcode) | ✅ | ✅ | |\n| [Cursor](https:\u002F\u002Fcursor.sh) | ✅ | ✅ | Restart session after `gate init` to load the hook |\n| [OpenCode](https:\u002F\u002Fopencode.ai) | ✅ | ✅ | Restart session after `gate init` to load the hook |\n| [GitHub Copilot CLI](https:\u002F\u002Fgithub.com\u002Ffeatures\u002Fcopilot) | ✅ | ✅ | Hook is project-scoped; each developer runs `gate init` once |\n| [Codex CLI](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fcodex) | ✅ | ✅ | After `gate init`, restart session and trust + enable the hook in the Permissions UI |\n| [Gemini CLI](https:\u002F\u002Fgithub.com\u002Fgoogle-gemini\u002Fgemini-cli) | ✅ | ✅ | Restart session after `gate init` to load the hook |\n\n## Documentation\n\n- [Configuration](docs\u002Fconfiguration.md) — full YAML schema and built-in PII detection rules\n- [Commands](docs\u002Fcommands.md) — full subcommand reference\n- [MCP setup](docs\u002Fmcp.md) — wrapping existing MCP servers and registering new ones\n- [Scan queries](docs\u002Fscan.md) — schema-query examples for each database\n- [Config file locations](docs\u002Fconfig-locations.md) — where each harness stores hooks and MCP settings\n- [Troubleshooting](docs\u002Ftroubleshooting.md) — common issues and fixes\n\n## Uninstallation\n\n```bash\ngate uninstall\nbrew uninstall gate\n```\n\n`gate uninstall` removes gate hooks from all harnesses, the config directory at `~\u002F.config\u002Fgate\u002F`, and any gate-generated plugin files. It shows what will be deleted and asks for confirmation.\n\n## Contributing\n\nBug reports and pull requests are welcome. For significant changes, open an issue first to discuss the proposal. 
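See [CONTRIBUTING.md](CONTRIBUTING.md) for the dev setup, pre-commit checklist, and safety rules for redaction changes.\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n\n## Disclaimer\n\nSee [DISCLAIMER.md](DISCLAIMER.md).\n","Gate is a technical project that establishes a deterministic privacy boundary between data and AI. Using a rule-driven approach, it intercepts query results before they reach the model and automatically redacts detected personally identifiable information (PII) fields, without requiring changes to existing workflows or prompts. The project is written in Rust and offers low latency (less than 10 ms of added overhead per query), auditability, and data locality. Gate suits scenarios where sensitive information must be protected from being inadvertently exposed to large language models (LLMs) by AI agents, such as access control for internal databases and APIs. Compared with model-based approaches, Gate uses regular expressions, column heuristics, and the Luhn algorithm to decide whether to redact data, which keeps the decision process consistent and transparent.",2,"2026-06-11 04:06:56","CREATED_QUERY"]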