[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-677":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},677,"privacy-filter","openai\u002Fprivacy-filter","openai","OpenAI Privacy Filter",null,"Python",2422,210,12,11,0,18,66,338,54,28.97,"Apache License 2.0",false,"main",true,[],"2026-06-12 02:00:17","# OpenAI Privacy Filter\n\nOpenAI Privacy Filter is a bidirectional token-classification model for personally identifiable information (PII) detection and masking in text. It is intended for high-throughput data sanitization workflows where teams need a model that they can run on-premises that is fast, context-aware, and tunable.\n\nOpenAI Privacy Filter is pretrained autoregressively to arrive at a checkpoint with similar architecture to gpt-oss, albeit of a smaller size.  We  then converted that checkpoint into a bidirectional token classifier over a privacy label taxonomy, and post-trained with a supervised classification loss. (For architecture details about gpt-oss, please see the gpt-oss model card.) Instead of generating text token-by-token, this model labels an input sequence in a single forward pass, then decodes coherent spans with a constrained Viterbi procedure. 
For each input token, the model predicts a probability distribution over the label taxonomy, which consists of the 8 output categories described below.\n\nHighlights:\n\n- Permissive Apache 2.0 license: ideal for experimentation, customization, and commercial deployment.\n- Small size: Runs in a web browser or on a laptop – 1.5B parameters total and 50M active parameters.\n- Fine-tunable: Adapt the model to specific data distributions through easy and data-efficient finetuning.\n- Long-context: 128,000-token context window enables processing long text with high throughput and no chunking.\n- Runtime control: configure precision\u002Frecall tradeoffs and detected span lengths through preset operating points.\n\n## This Repo\n\nThis repository contains the local code, CLI, and example assets used to run, evaluate, and finetune Privacy Filter checkpoints. It is meant for teams that want to inspect the implementation directly and operate the model in their own environment.\n\nRepository resources: [License](LICENSE) and [Security Policy](SECURITY.md).\n\n### How To Use\n\n1. Install the package locally:\n\n```bash\npip install -e .\n```\n\nAfter this, you will have a Python script `opf` that can be run directly or via `python -m opf`. The script can be used in three separate ways, as described below.\n\n2. Run one-shot redaction:\n\nBy default, `opf` looks for a model at the directory pointed to by the `OPF_CHECKPOINT` variable, or `~\u002F.opf\u002Fprivacy_filter`. If a model is not found in the `~\u002F.opf\u002Fprivacy_filter` location, it will be downloaded.\n\n```bash\nopf \"Alice was born on 1990-01-02.\"\n```\n\nThe code supports running both on GPU (by default) and CPU. 
To run on CPU, use the `--device cpu` flag:\n\n```bash\nopf --device cpu \"Alice was born on 1990-01-02.\"\n```\n\nTo override the default checkpoint, pass `--checkpoint`:\n\n```bash\nopf --checkpoint \u002Fpath\u002Fto\u002Fcheckpoint_dir \"Alice was born on 1990-01-02.\"\n```\n\nThe redaction mode supports redacting an entire file at once:\n\n```bash\nopf -f \u002Fpath\u002Fto\u002Ffile\n```\n\nRedaction can also be performed via pipes, to support complex one-liners:\n\n```bash\ncat \u002Fpath\u002Fto\u002Ffile | grep -e 'some_pattern' | opf\n```\n\nIf no input is provided, `opf` will start in interactive mode. In this mode, for each input example, the CLI prints structured JSON output, using ANSI color-coded previews if the terminal supports them. These options can be controlled by flags.\n\nConsult `opf redact --help` for more flags and information about the redaction mode.\n\n3. Run eval on a labeled dataset:\n\n```bash\nopf eval examples\u002Fdata\u002Fsample_eval_five_examples.jsonl\n```\n\nThe sample eval fixtures under `examples\u002Fdata\u002Fsample_eval_five_examples*.jsonl` are synthetic example data only and do not describe real people or real sensitive records. See `examples\u002Fdata\u002FREADME.md`.\n\nConsult `opf eval --help` for more flags and information about the evaluation mode.\n\n4. 
Finetune on your own labeled dataset:\n\n```bash\nopf train \u002Fpath\u002Fto\u002Ftrain.jsonl --output-dir \u002Fpath\u002Fto\u002Ffinetuned_checkpoint\n```\n\nConsult `opf train --help` for more flags and information about the finetuning mode.\n\n### Structure\n\n- `opf\u002F__main__.py`: unified CLI entrypoint for redact, eval, and train modes.\n- `opf\u002F_api.py`: Python-facing API over the runtime and decoding stack.\n- `opf\u002F_cli\u002F`: command-line argument parsing and terminal rendering helpers.\n- `opf\u002F_core\u002F`: runtime loading, span conversion, and shared decoding logic.\n- `opf\u002F_eval\u002F`: dataset loading, preprocessing, metrics, and evaluation runners.\n- `opf\u002F_train\u002F`: local finetuning argument parsing and training runners.\n- `opf\u002F_model\u002F`: transformer implementation, checkpoint config, and weight loading.\n- `examples\u002Fdata\u002F`: sample eval files plus reproducible finetuning demo datasets.\n- `examples\u002Fscripts\u002Ffinetuning\u002F`: runnable finetuning demo harnesses.\n- `FINETUNING.md`: focused finetuning workflow and demo-script guide.\n- `OUTPUT_SCHEMAS.md`: JSON response and export payload formats.\n- `EVAL_AND_OUTPUT_MODES.md`: description of the output modes for redaction and evaluation.\n\n## Model Details\n\n### Model Description\n\nPrivacy Filter is a bidirectional token classification model with span decoding. It is trained in phases, beginning with autoregressive pretraining. The pretrained language model is then modified and post-trained as a bidirectional banded attention token classifier with band size 128 (effective attention window: 257 tokens including self). 
This means:\n\n* The base model is an autoregressive pretrained checkpoint.\n* The language-model output head is replaced with a token-classification head over privacy labels.\n* Post-training is supervised token-level classification rather than next-token prediction.\n* Inference applies constrained sequence decoding to produce coherent BIOES (Begin, Inside, Outside, End, Single) span labels.\n\nArchitecturally, the implementation in this repo is a pre-norm transformer encoder-style stack with:\n\n* token embeddings\n* 8 repeated transformer blocks\n* grouped-query attention with rotary positional embeddings, with 14 query heads and 2 KV heads (group size = 7 queries per KV head)\n* sparse mixture-of-experts feed-forward blocks with 128 experts total (top-4 routing per token)\n* a final token-classification head over privacy labels (rather than natural language vocabulary tokens), with residual stream width `d_model = 640`.\n\nRelative to iterative autoregressive approaches, this design allows all tokens to be labeled in one pass, which improves throughput. Relative to classical masked-language-model pretraining approaches, this is a post-training conversion of an autoregressive model rather than a native masked-LM setup.\n\n### Output Shape\n\nPrivacy Filter can detect 8 privacy span categories:\n\n1. `account_number`\n2. `private_address`\n3. `private_email`\n4. `private_person`\n5. `private_phone`\n6. `private_url`\n7. `private_date`\n8. `secret`\n\nTo perform token-classification, each non-background span category is expanded into boundary-tagged token classes: `B-\u003Clabel>`, `I-\u003Clabel>`, `E-\u003Clabel>`, `S-\u003Clabel>`, plus the background class, `O`. So the total number of token-level output classes is 33: 1 background class \\+ 8 span labels \\* 4 boundary tags \\= 33 classes. This means the output head emits 33 logits for each token. 
For a sequence of length T, the output has shape `[T, 33]`; for a batch of size B, it has shape `[B, T, 33]`.\n\nThe token-label vocabulary consists of the background label `O` plus BIOES-tagged variants of each privacy category: `account_number`, `private_address`, `private_email`, `private_person`, `private_phone`, `private_url`, `private_date`, and `secret`. In other words, for each category, the model predicts `B-`, `I-`, `E-`, and `S-` forms corresponding to begin, inside, end, and single-token spans. At inference time, these per-token logits are decoded into coherent BIOES span labels using constrained sequence decoding.\n\n### Sequence Decoding Rationale and Calibration\n\n#### Rationale\n\nAfter the token classifier produces per-token logits, we decode labels with a constrained Viterbi decoder using linear-chain transition scoring, rather than taking an independent argmax for each token. The decoder enforces allowed BIOES boundary transitions and scores complete label paths with start, transition, and end terms, plus six transition-bias parameters that control background persistence, span entry, span continuation, span closure, and boundary-to-boundary handoff. This global path optimization is intended to improve span coherence and boundary stability by making each token decision depend on sequence-level structure, not just local logits, especially in noisy or mixed-format text where local token decisions alone can produce fragmented or inconsistent boundaries.\n\n#### Operating-Point Calibration\n\nSequence Decoding parameters can discourage staying in background while encouraging span entry and continuation, yielding broader and more contiguous masking for improved recall, or vice versa for improved precision. 
At runtime, users can tune parameters that control this tradeoff.\n\n### Model Metadata\n\n- Developed by: OpenAI\n- Funded by: OpenAI\n- Shared by: OpenAI\n- Model type: Bidirectional token classification model for privacy span detection\n- Language(s): Primarily English; selected multilingual robustness evaluation reported\n- License: [Apache 2.0](LICENSE)\n\n- Model weights: https:\u002F\u002Fhuggingface.co\u002Fopenai\u002Fprivacy-filter\n- Demo: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fopenai\u002Fprivacy-filter\n- Model card: [OpenAI Privacy Filter Model Card](https:\u002F\u002Fcdn.openai.com\u002Fpdf\u002Fc66281ed-b638-456a-8ce1-97e9f5264a90\u002FOpenAI-Privacy-Filter-Model-Card.pdf)\n\n## Bias, Risks, and Limitations\n\n### Risk: Over-reliance\n\nPrivacy Filter is a redaction and data minimization aid, not an anonymization, compliance, or safety guarantee. Treating the tool's output as blanket anonymization risks missing the intended privacy objectives. Privacy Filter is best used as one of multiple layers in a holistic end-to-end privacy-by-design approach.\n\n### Limitation: Static Label Policy\n\nThe model will only identify personal data spans that match the trained label taxonomy and definitions. Real-life privacy use cases are varied and complex, and definitions of appropriate label policies and decision boundaries can differ. Thus, model defaults may not satisfy organization-specific governance requirements without calibration\u002Ffine-tuning.\n\nPrivacy Filter does not support configuring label policies dynamically at runtime; instead, changing policies requires further finetuning of the model. The native label set and associated decision boundaries may not be appropriate for every use case. 
For example, the model's training policy aims to prioritize personal identifiers, often preserving context that is not strongly person-linked by design; some users may want to adjust this choice.\n\nPerformance may drop on non-English text, non-Latin scripts, protected-group naming patterns, or domains that are out of distribution compared to model training.\n\n### Failure Modes\n\nLike all models, Privacy Filter can make mistakes, such as: under-detection of uncommon personal names, regional naming conventions, initials, honorific-heavy references, or domain-specific identifiers; over-redaction of public entities, organizations, locations, or common nouns when local context is ambiguous; fragmented or shifted span boundaries in mixed-format text, long documents, or text with heavy punctuation and layout artifacts; missed secrets for novel credential formats, project-specific token patterns, or secrets split across surrounding syntax; and over-redaction of benign high-entropy strings, placeholders, hashes, sample credentials, or synthetic examples that resemble secrets.\n\nThese limitations can interact with demographic, regional, and domain variation. For example, names and identifiers that are underrepresented in training data, or that follow conventions different from the dominant training distribution, may be more likely to be missed or inconsistently bounded.\n\n### High-Risk Deployment Caution\n\nAdditional caution is warranted in high-sensitivity settings such as medical, legal, financial, human resources, education, and government workflows. 
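In these settings, both false negatives and false positives can be costly: missed spans may expose sensitive information, while excess masking can remove material context needed for review, auditing, or downstream decision-making.\n\n### Recommendations\n\n- Use Privacy Filter as part of a holistic privacy-by-design approach, not as a blanket anonymization claim.\n- Evaluate in-domain with local policy references before production.\n- Use task-specific fine-tuning when policy differs from base boundaries.\n- Keep human review paths for high-sensitivity workflows.\n","OpenAI Privacy Filter is a bidirectional token-classification model for detecting and masking personally identifiable information (PII) in text. Its core technical features include: an autoregressively pretrained model that is then converted into a bidirectional token classifier over a privacy label taxonomy and post-trained with a supervised classification loss; labeling of an input sequence in a single forward pass, with coherent spans decoded by a constrained Viterbi procedure; and support for long-context processing (128,000 tokens) with runtime control over precision\u002Frecall tradeoffs and detected span lengths. The project suits teams that need high-throughput data sanitization workflows and want a fast, context-aware, tunable model they can run locally. It also offers an Apache 2.0 license, a small size (1.5B total parameters, 50M active parameters), and easy fine-tuning, making it suitable for experimentation, customization, and commercial deployment.",2,"2026-06-11 02:38:34","CREATED_QUERY"]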