[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-84032":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":18,"compositeScore":19,"rankGlobal":9,"rankLanguage":9,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":9,"pushedAt":9,"updatedAt":25,"readmeContent":26,"aiSummary":9,"trendingCount":15,"starSnapshotCount":15,"syncStatus":27,"lastSyncTime":28,"discoverSource":29},84032,"privacy-filter","packyme\u002Fprivacy-filter","packyme","LLM privacy gateway in Go — millisecond-latency PII and secret redaction. Used in production by PackyCode.",null,"Go",212,25,118,1,0,8,94,110,4.24,"MIT License",false,"main",true,[],"2026-06-12 02:04:37","# Privacy Filter (Go)\n\n**English** | [简体中文](README.zh-CN.md)\n\nStrip sensitive user data (PII \u002F secrets) from text before it reaches an LLM.\nPure Go, no model, no GPU, no CGO — a single static binary, millisecond latency on text of any length.\n\n🌐 Running in production at [PackyCode](https:\u002F\u002Fwww.packyapi.com) — the privacy-compliance component of an API relay service.\n\n---\n\n## Three ways to use it\n\n1. **Core package**: `import \"privacyfilter\u002Ffilter\"` straight into your gateway — redaction is one function call, no HTTP hop.\n2. **HTTP service**: `cmd\u002Fhttp`, REST API.\n3. 
**gRPC service**: `cmd\u002Fgrpc`, interface in `proto\u002Ffilter.proto`.\n\nThe latter two are thin wrappers around the `filter` core package.\n\n---\n\n## Two detection layers\n\n| Layer | Covers | Technique |\n|---|---|---|\n| Structured PII | Email, phone, national ID, bank card (Luhn-checked), IP | Regex |\n| Secrets \u002F credentials | API keys, tokens, private keys, passwords written in prose, unknown high-entropy strings | gitleaks ruleset (keyword pre-filter) + contextual regex + Shannon-entropy fallback |\n\nEach layer emits `(start, end, placeholder)` spans → spans are merged and de-overlapped → the text is rebuilt in a single pass.\nPlaceholders are typed and carry the entity kind — `[邮箱]` (email), `[电话]` (phone), `[身份证]` (national ID), `[银行卡]` (bank card), `[IP]`, `[密钥]` (secret) — and are irreversible (no un-redaction).\n\n> No person \u002F place \u002F organization name recognition — that needs an NER model, which costs seconds of CPU time on long text and was removed per requirements.\n> High-risk identity data (national ID, bank card, secrets, etc.) is fully covered by regex.\n\n---\n\n## Layout\n\n```\nprivacy-filter\u002F\n├── go.mod \u002F go.sum\n├── filter\u002F                  core package (import directly from a gateway)\n│   ├── filter.go            Filter \u002F New \u002F Redact\n│   ├── pii.go               structured PII\n│   ├── secrets.go           gitleaks + context + entropy\n│   └── filter_test.go\n├── cmd\u002F\n│   ├── http\u002Fmain.go         HTTP service\n│   └── grpc\u002Fmain.go         gRPC service\n├── proto\u002Ffilter.proto       gRPC interface definition\n├── gen\u002Ffilterpb\u002F            protoc-generated code\n├── rules\u002Fgitleaks.toml      gitleaks ruleset\n├── scripts\u002Ffetch_rules.sh   ruleset update script\n└── Dockerfile\n```\n\n---\n\n## Build\n\n```bash\ngo build -o bin\u002Fserver-http .\u002Fcmd\u002Fhttp\ngo build -o bin\u002Fserver-grpc .\u002Fcmd\u002Fgrpc\ngo test .\u002F...                  
        # run all tests\n```\n\n---\n\n## Usage\n\n### 1. Core package (recommended for gateways)\n\n```go\nimport \"privacyfilter\u002Ffilter\"\n\n\u002F\u002F Create once at startup; concurrency-safe, reuse globally.\nf, err := filter.New(\"rules\u002Fgitleaks.toml\")   \u002F\u002F pass \"\" to use the built-in fallback rules\n\n\u002F\u002F Per request\nres := f.Redact(userPrompt)\nforwardToLLM(res.Redacted)                    \u002F\u002F forward the redacted text to the LLM\n```\n\n`filter.Result`: `Redacted` (redacted text), `Hit`, `Count`, `Entities` (hit details,\nincluding type and byte offsets).\n\n> To consume this package from your own gateway module: put it in the same monorepo, or add\n> `replace privacyfilter => ..\u002Fprivacy-filter` to the gateway's go.mod. The `filter` package\n> depends only on `BurntSushi\u002Ftoml`.\n\n### 2. HTTP service\n\n```bash\n.\u002Fbin\u002Fserver-http                    # default :8088\n```\n\n```bash\ncurl http:\u002F\u002F127.0.0.1:8088\u002Fhealth\ncurl -X POST http:\u002F\u002F127.0.0.1:8088\u002Fredact -H 'Content-Type: application\u002Fjson' \\\n  -d '{\"text\":\"我的邮箱是 a@b.com，密码是 Hunter2xy\"}'\n# {\"redacted\":\"我的邮箱是 [邮箱]，密码是 [密钥]\",\"hit\":true,\"count\":2,\"entities\":[...],\"elapsed_ms\":0.08}\n```\n\nEndpoints: `GET \u002Fhealth`, `POST \u002Fredact`, `POST \u002Fredact\u002Fbatch` (`{\"texts\":[...]}`).\n\n### 3. gRPC service\n\n```bash\n.\u002Fbin\u002Fserver-grpc                    # default :8089\n```\n\nService `filter.v1.PrivacyFilter`, methods `Redact` \u002F `RedactBatch`, defined in `proto\u002Ffilter.proto`.\nGenerate a client from that proto on the gateway side. To regenerate the code in this repo:\n\n```bash\nprotoc -I. --go_out=. --go_opt=module=privacyfilter \\\n       --go-grpc_out=. 
--go-grpc_opt=module=privacyfilter proto\u002Ffilter.proto\n```\n\n---\n\n## Configuration (environment variables)\n\n| Variable | Default | Description |\n|---|---|---|\n| `PF_PORT` | `8088` | HTTP listen port |\n| `PF_GRPC_PORT` | `8089` | gRPC listen port |\n| `PF_GITLEAKS_TOML` | `rules\u002Fgitleaks.toml` | path to the gitleaks rules file |\n\n---\n\n## Performance (local benchmark, synthetic high-density PII text, worst case)\n\n| Text length | Latency |\n|---|---|\n| ~50 B | ~0.01ms |\n| ~2 KB | ~0.46ms |\n| ~32 KB | ~9ms |\n\nBoth layers are O(n) in the input length. Real prompts (PII is never this dense) are faster.\n\n---\n\n## Integration notes (gateway side)\n\n- With the core-package import there is no HTTP\u002FgRPC hop, hence no timeout and no fail-open\u002Fclosed concerns.\n- If you use the HTTP\u002FgRPC service: set a 150–300ms timeout; on failure, prefer fail-closed (reject the request rather than forwarding the raw text).\n\n---\n\n## Notes\n\n- All **222 gitleaks rules compile natively** in Go (Go's `regexp` is RE2, the same engine gitleaks uses;\n  an earlier Python port lost 26 rules to RE2-incompatible syntax).\n- Go `regexp` runs in linear time, so there is no catastrophic backtracking (ReDoS) risk.\n- Go's `regexp` (RE2 syntax) does not support look-around assertions, so digit boundaries for phone \u002F national ID etc. are\n  enforced by manual post-match validation.\n- No person \u002F place \u002F organization recognition. If added later, prefer rules (a Chinese-address regex is\n  feasible; names are better anchored by context).\n- The entropy fallback can mis-flag git SHAs, long base64 strings, etc.; tune the threshold or add an allowlist.\n",2,"2026-06-11 04:12:08","CREATED_QUERY"]