[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-81782":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":16,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":17,"rankGlobal":10,"rankLanguage":10,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":15,"starSnapshotCount":15,"syncStatus":13,"lastSyncTime":26,"discoverSource":27},81782,"cloakrs","kadir\u002Fcloakrs","kadir","A blazingly fast PII detection, masking, and anonymization engine written in Rust. Library + CLI.","",null,"Rust",50,2,49,0,1,42.03,"MIT License",false,"master",true,[],"2026-06-12 04:01:35","# cloakrs\n\n[![CI](https:\u002F\u002Fgithub.com\u002Fkadir\u002Fcloakrs\u002Factions\u002Fworkflows\u002Fci.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fkadir\u002Fcloakrs\u002Factions\u002Fworkflows\u002Fci.yml)\n\n`cloakrs` is a Rust library and CLI for detecting and masking personally identifiable information in text, logs, JSON, CSV, and database dumps.\n\nIt ships universal recognizers for emails, phone numbers, credit cards, IBANs, IP addresses, URLs, API keys, JWTs, AWS access keys, MAC addresses, hostnames, user home paths, person names, physical addresses, crypto wallet addresses, and context-dependent dates of birth. 
Locale bundles add identifiers such as US SSNs, Dutch BSNs, UK NINO\u002FNHS numbers, German Steuer-IDs, Indian Aadhaar\u002FPAN values, Brazilian CPF\u002FCNPJ values, and French INSEE\u002FNIR numbers.\n\nSee [supported entities](docs\u002Fsupported-entities.md) for the full detection matrix, including validation algorithms, confidence ranges, and examples.\n\n## Install\n\n```bash\ncargo install cloakrs-cli\n```\n\nFor local development:\n\n```bash\ncargo build --workspace\ncargo test --workspace\ncargo run -p cloakrs-cli -- scan tests\u002Ffixtures\u002Fsample_text.txt\n```\n\n## Quick Start\n\n```rust\nuse cloakrs_core::Locale;\n\nlet scanner = cloakrs_locales::default_registry()\n    .into_scanner_builder()\n    .locale(Locale::US)\n    .build()?;\n\nlet result = scanner.scan(\"Contact jane@example.com or ssn 123-45-6789\")?;\nassert_eq!(result.masked_text.as_deref(), Some(\"Contact [EMAIL] or ssn [SSN]\"));\n# Ok::\u003C(), cloakrs_core::CloakError>(())\n```\n\n## CLI Examples\n\n```bash\n# Scan a file and print a human-readable report.\ncloakrs scan tests\u002Ffixtures\u002Fsample_text.txt --locale us --output-format text\n\n# Produce SARIF for code scanning systems.\ncloakrs audit . 
--output-format sarif --output cloakrs.sarif\n\n# Run from the pre-commit framework against staged file paths.\ncloakrs pre-commit src\u002Flib.rs README.md --min-confidence 0.8\n\n# Mask a CSV file, scanning selected columns only.\ncloakrs scan users.csv --format csv --columns email,phone --output users.masked.csv\n```\n\n## LLM Prompt Sanitization\n\n```rust\nuse cloakrs_core::{Locale, PromptSanitizer, Result};\n\nfn main() -> Result\u003C()> {\n    let scanner = cloakrs_locales::default_registry()\n        .into_scanner_builder()\n        .locale(Locale::US)\n        .build()?;\n    let sanitizer = PromptSanitizer::new(scanner);\n    let (clean_prompt, mapping) =\n        sanitizer.sanitize(\"Email jane@example.com about the invoice\")?;\n    assert_eq!(clean_prompt, \"Email [EMAIL_1] about the invoice\");\n    let restored = sanitizer.restore(\"I emailed [EMAIL_1]\", &mapping);\n    assert_eq!(restored, \"I emailed jane@example.com\");\n    Ok(())\n}\n```\n\n## Architecture\n\nThe workspace is split into six crates with one-way dependencies:\n\n```text\ncloakrs-core -> cloakrs-patterns -> cloakrs-locales -> cloakrs-adapters -> cloakrs-cli\ncloakrs-core -> cloakrs-tracing\n```\n\n- `cloakrs-core`: scanner, recognizer trait, shared types, masking strategies\n- `cloakrs-patterns`: universal recognizers such as email, phone, card, IBAN\n- `cloakrs-locales`: country-specific recognizers such as US SSN and Dutch BSN\n- `cloakrs-adapters`: streaming handlers for text, JSON, CSV, logs, and SQL dumps\n- `cloakrs-tracing`: a `tracing_subscriber` layer for redacted event output\n- `cloakrs-cli`: the `cloakrs` command-line interface\n\n## Comparison\n\n| Tool | Language | Runtime requirements | Primary fit | Benchmark status |\n| --- | --- | --- | --- | --- |\n| cloakrs | Rust | Single native binary | Fast local scanning and masking | Criterion suite included |\n| Microsoft Presidio | Python | Python plus NLP dependencies | NLP-rich enterprise workflows | Run locally for 
same-hardware numbers |\n| DataFog | Python | Python runtime | App-level PII detection | Run locally for same-hardware numbers |\n| scrubadub | Python | Python runtime | Text scrubbing | Not benchmarked in-tree |\n| piidetect | Go | Native binary | Lightweight PII detection | Not benchmarked in-tree |\n\nRun the local benchmark suite with:\n\n```bash\ncargo bench -p cloakrs-cli --bench scan_benchmark\n```\n\nThe benchmark harness covers 1KB through 10MB inputs for plain text, JSON, and CSV, each recognizer individually, and all masking strategies. See [docs\u002Fbenchmarking.md](docs\u002Fbenchmarking.md).\n\n## Guides\n\n- [Adding recognizers](docs\u002Fadding-recognizers.md)\n- [Adding locale recognizers](docs\u002Flocale-guide.md)\n- [Supported entities](docs\u002Fsupported-entities.md)\n- [CI\u002FCD integration](docs\u002Fci-cd-integration.md)\n- [Benchmarking](docs\u002Fbenchmarking.md)\n- [Release checklist](docs\u002Frelease-checklist.md)\n\n## Status\n\nThe first Rust release is published on crates.io. See [implementation status](docs\u002Fimplementation-status.md) for completed work and known gaps.\n\n## License\n\nMIT. See [LICENSE.md](LICENSE.md).\n","`cloakrs` is a fast engine written in Rust for detecting, masking, and anonymizing personally identifiable information (PII), shipped as both a library and a command-line tool. It recognizes and processes many sensitive data types, including email addresses, phone numbers, credit card numbers, and IP addresses, and its locale bundles add support for region-specific identifiers such as US Social Security numbers and UK National Insurance numbers. The project relies on Rust's performance to keep processing fast and suits privacy-sensitive data processing such as log redaction, cleaning data before database exports, and sanitizing prompts sent to large language models. It is open source under the MIT license and easy to integrate into existing development workflows.","2026-06-11 04:06:40","CREATED_QUERY"]