[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-76181":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":20,"hasPages":20,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":42,"readmeContent":43,"aiSummary":44,"trendingCount":16,"starSnapshotCount":16,"syncStatus":15,"lastSyncTime":45,"discoverSource":46},76181,"MemPrivacy","MemTensor\u002FMemPrivacy","MemTensor","MemPrivacy is a privacy-preserving personalized memory management framework for edge-cloud agents. ","",null,"Python",104,7,78,2,0,3,19,43.11,false,"main",[23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41],"agent","agent-memory","ai","edge-cloud","information-extraction","information-security","langmem","llm","long-term-memory","mem0","memory","memory-management","memory-system","openai-privacy-filter","personalized-memory","privacy","privacy-detection","privacy-preserving","qwen3","2026-06-12 04:01:20","\u003Ch1 align=\"center\">\n    MemPrivacy: Privacy-Preserving Personalized Memory Management for Edge-Cloud Agents\n\u003C\u002Fh1>\n\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Fspdx.org\u002Flicenses\u002FCC-BY-NC-ND-4.0.html\">\n    \u003Cimg alt=\"License: CC-BY-NC-ND-4.0\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-CC_BY_NC_ND_4.0-brightgreen.svg\">\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002FMemTensor\u002FMemPrivacy\u002Fissues\">\n    \u003Cimg alt=\"GitHub Issues\" src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues\u002FMemTensor\u002FMemPrivacy?color=blueviolet\">\n\u003C\u002Fa>\n\u003Ca 
href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.09530\">\n     \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-Paper-B31B1B?style=flat-square&logo=arxiv&logoColor=white\">\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FIAAR-Shanghai\u002Fmemprivacy\">\n    \u003Cimg alt=\"Huggingface\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗_Huggingface-Model-ff9800.svg\">\n\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fmodelscope.cn\u002Fcollections\u002FMemTensor\u002FMemPrivacy\">\n    \u003Cimg alt=\"ModelScope\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤖_ModelScope-Model-7B42BC?style=flat-square\">\n\u003C\u002Fa>\n\u003C\u002Fp>\n\n\nMemPrivacy is a **privacy-preserving personalized memory management framework** for **edge-cloud agents**.\nIt lets cloud-based LLM agents retain long-term personalization signals by replacing sensitive spans with **semantically meaningful typed placeholders** before data leaves the device, and **restoring** the original values locally after the cloud response returns—so **raw privacy values are never stored or exposed in the cloud**.\n\n\n---\n\n## Why MemPrivacy?\n\nCloud agents typically send user messages to remote LLMs and store conversation traces in memory systems (e.g., **Mem0**, **LangMem**, **Memobase**) for long-term personalization. 
This creates a large privacy attack surface:\n\n- plaintext prompts and logs may contain **PII**, medical\u002Ffinancial data, credentials\n- cloud memory stores can leak via retrieval, prompt injection, inversion, or misconfiguration\n- naïve mitigation (e.g., `***` masking) **destroys task semantics**, harming retrieval and personalization\n\n**Goal:** reduce privacy leakage **without sacrificing utility**.\n\n---\n\n## Core Idea\n\n\u003Cdiv align=\"center\">\n    \u003Ctable border=\"0\">\n        \u003Ctr>\n            \u003Ctd width=\"45%\" align=\"center\">\n                \u003Cimg src=\"assets\\framework.jpg\" width=\"100%\">\n                \u003Cbr>\n                \u003Cem>\u003Cstrong>Fig 1.\u003C\u002Fstrong> Overview of MemPrivacy. \u003C\u002Fem>\n            \u003C\u002Ftd>\n        \u003C\u002Ftr>\n    \u003C\u002Ftable>\n\u003C\u002Fdiv>\n\nMemPrivacy implements **local reversible pseudonymization**:\n\n1. **On-device privacy detection (local)**  \n   Detect privacy spans in user input and classify them by:\n   - **privacy level** (PL1–PL4)\n   - **privacy type** (e.g., Email, Real Name, Medical Health, Recovery Code)\n\n2. **Typed placeholder replacement (local → cloud)**  \n   Replace protected spans with **semantically meaningful typed placeholders**, e.g.:\n   - `160\u002F110` (blood pressure) → `\u003CHealth_Info_1>`\n   - `recovery code RC-7291` → `\u003CRecovery_Code_1>`\n\n3. **Local secure mapping (persistent across sessions)**  \n   Store the mapping `placeholder ↔ original value` in a **local SQLite DB**.\n\n4. **Cloud reasoning and memory operations (cloud)**  \n   The cloud agent\u002Fmemory only sees placeholders—preserving semantic roles while hiding raw values.\n\n5. 
**Downlink restoration (local)**  \n   Restore placeholders in the cloud response back to the original values for a fluent user experience.\n\nThis yields **architecture-level isolation**: cloud components never see\u002Fstore raw sensitive values.\n\n---\n\n## Key Contributions & Advantages\n\n\u003Cdiv align=\"center\">\n    \u003Ctable border=\"0\">\n        \u003Ctr>\n            \u003Ctd width=\"45%\" align=\"center\">\n                \u003Cimg src=\"assets\\intro.jpg\" width=\"100%\">\n                \u003Cbr>\n                \u003Cem>\u003Cstrong>Fig 2.\u003C\u002Fstrong> Comparison of privacy protection strategies for local-to-cloud agent interaction. \u003C\u002Fem>\n            \u003C\u002Ftd>\n        \u003C\u002Ftr>\n    \u003C\u002Ftable>\n\u003C\u002Fdiv>\n\n### 1) Privacy–Utility Balance (vs. masking)\n- **Irreversible masking** (`***`) protects privacy but loses meaning and breaks memory retrieval.\n- **Untyped placeholders** (`\u003CMask_1>`) keep structure but lose semantic roles.\n- **MemPrivacy (typed placeholders)** preserves the semantic role *and* hides raw values, minimizing utility loss.\n\n### 2) Configurable Protection via a 4-Level Privacy Taxonomy\nMemPrivacy introduces **PL1–PL4** to support user-configurable policies:\n\n| Level | Meaning | Examples | Typical Default Policy |\n|---|---|---|---|\n| PL1 | low sensitivity \u002F preferences | “I like sci-fi”, tone, generic habits | can be kept for personalization |\n| PL2 | identifiable PII | real name, phone, email, detailed address, account IDs | disallowed by default in long-term memory |\n| PL3 | highly sensitive PII | health records, financial records, precise location, religion\u002Fethnicity | not permitted in general memory |\n| PL4 | critical secrets (immediately exploitable) | passwords, OTPs, recovery codes, API keys | **zero retention**; must be blocked\u002Fredacted |\n\n### 3) Benchmark & Evaluation for Memory Systems\nThis repo builds **MemPrivacy-Bench** and evaluates 
privacy protection strategies across real memory systems:\n- **MemPrivacy-Bench**: 200 synthetic users, bilingual (Chinese\u002FEnglish), multi-turn dialogues with dense privacy exposure, plus memory QA tasks.\n- Evaluations on **MemPrivacy-Bench** (in-distribution) and **PersonaMem-v2** (out-of-distribution, annotated here).\n\n### 4) Lightweight & Practical\nThe framework is designed for **edge deployment**:\n- local detection + placeholder substitution + SQLite lookup are low-latency operations\n- works as a drop-in privacy layer for existing cloud agents \u002F memory systems\n\n### 5) Open-Source MemPrivacy Models\nWe release a family of MemPrivacy models trained via Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) across different parameter sizes. You can access the full model collections on [Hugging Face](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FIAAR-Shanghai\u002Fmemprivacy) and [ModelScope](https:\u002F\u002Fmodelscope.cn\u002Fcollections\u002FMemTensor\u002FMemPrivacy).\n\n| Model Name | Parameters | Method | HuggingFace Link | ModelScope Link |\n| :--- | :---: | :---: | :--- | :--- |\n| **MemPrivacy-4B-RL** | 4B | RL | [🤗 MemPrivacy-4B-RL](https:\u002F\u002Fhuggingface.co\u002FIAAR-Shanghai\u002FMemPrivacy-4B-RL) | [🤖 MemPrivacy-4B-RL](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FMemTensor\u002FMemPrivacy-4B-RL) |\n| **MemPrivacy-4B-SFT** | 4B | SFT | [🤗 MemPrivacy-4B-SFT](https:\u002F\u002Fhuggingface.co\u002FIAAR-Shanghai\u002FMemPrivacy-4B-SFT) | [🤖 MemPrivacy-4B-SFT](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FMemTensor\u002FMemPrivacy-4B-SFT) |\n| **MemPrivacy-1.7B-RL** | 1.7B | RL | [🤗 MemPrivacy-1.7B-RL](https:\u002F\u002Fhuggingface.co\u002FIAAR-Shanghai\u002FMemPrivacy-1.7B-RL) | [🤖 MemPrivacy-1.7B-RL](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FMemTensor\u002FMemPrivacy-1.7B-RL) |\n| **MemPrivacy-1.7B-SFT** | 1.7B | SFT | [🤗 
MemPrivacy-1.7B-SFT](https:\u002F\u002Fhuggingface.co\u002FIAAR-Shanghai\u002FMemPrivacy-1.7B-SFT) | [🤖 MemPrivacy-1.7B-SFT](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FMemTensor\u002FMemPrivacy-1.7B-SFT) |\n\n---\n\n## Evaluation Results\n\n### 1. Privacy Extraction Performance\n\n\u003Cdiv align=\"center\">\n    \u003Cem>\u003Cstrong>Table 1.\u003C\u002Fstrong> Performance comparison of different LLMs and MemPrivacy models on MemPrivacy-Bench and PersonaMem-v2.\u003C\u002Fem>\n    \u003Cimg src=\"assets\\table1.png\" width=\"100%\" alt=\"Performance comparison of different LLMs and MemPrivacy models on MemPrivacy-Bench and PersonaMem-v2.\">\n    \u003Cbr>\n\u003C\u002Fdiv>\n\n\n**Key Takeaways:**\n\n* **Superior Accuracy:** MemPrivacy consistently outperforms 11 general LLMs and **OpenAI-Privacy-Filter**. The best model (MemPrivacy-4B-RL) achieves F1 scores of **85.97%** and **94.48%**, significantly surpassing the top general models (78.41% and 92.18%). Even our smallest 0.6B model beats most general models.\n* **Robustness on Complex Data:** While lightweight filters like OpenAI-Privacy-Filter are fast, they struggle with implicit and linguistically diverse privacy expressions (only 35.50% F1 on MemPrivacy-Bench). MemPrivacy accurately handles fine-grained, heterogeneous conversational scenarios.\n* **High Efficiency:** Despite its accuracy, MemPrivacy remains highly efficient. Processing latency per message is consistently **below one second** on PersonaMem-v2, making it well-suited for seamless on-device deployment without noticeable delays.\n\n### 2. 
Memory System Performance under Different Protection Methods\n\n\u003Cdiv align=\"center\">\n    \u003Cem>\u003Cstrong>Table 2.\u003C\u002Fstrong> Performance comparison under different privacy protection methods on three memory systems.\u003C\u002Fem>\n    \u003Cimg src=\"assets\\table2.png\" width=\"100%\" alt=\"Performance comparison under different privacy protection methods on three memory systems.\">\n    \u003Cbr>\n\u003C\u002Fdiv>\n\n\n**Key Takeaways:**\n\n* **Optimal Privacy-Utility Trade-off:** Compared to traditional masking (`***`) or untyped placeholders (`\u003CMask_1>`), MemPrivacy preserves the utility of downstream systems (LangMem, Mem0, Memobase) significantly better by retaining critical semantic roles.\n* **Minimal Degradation:** When applying stringent protection (PL2–PL4), system accuracy drops by merely **0.71%–1.60%**. If protecting only critical secrets (PL4), the drop is **below 0.89%**. \n* **Extractor Dependency:** The effectiveness of the entire framework heavily depends on accurate privacy extraction. 
Replacing the MemPrivacy model with general LLMs (e.g., DeepSeek-V3.2-Think, GPT-5.2) causes substantial accuracy degradation, validating the necessity of our specialized fine-tuning.\n\n---\n\n\n## What’s in This Repository?\n\nHigh-level structure:\n\n```text\nMemPrivacy\u002F\n├── evaluation\u002F              # evaluation on memory systems + metrics\n└── src\u002F                     # privacy masking\u002Fpseudonymization core\n```\n\n### Core Components\n\n- **Reversible pseudonymization module** (`src\u002Fprivacy_masking.py`)\n  - `PrivacyStore` (SQLite mapping store)\n  - `mask_dialogue()`, `unmask_dialogue()`, `detect_and_mask_dialogue()`\n  - masking modes: `type_specific`, `generic`, `complete`\n- **Evaluation suite** (`evaluation\u002F`)\n  - memory systems: `eval_mem0.py`, `eval_langmem.py`, `eval_memobase.py`\n  - metrics: `metric.py` (privacy extraction P\u002FR\u002FF1, level\u002Ftype matching, etc.)\n  - results saved to `evaluation\u002Fresults\u002F`\n\n---\n\n## How It Works (End-to-End)\n\n### Stage A — Uplink Desensitization (Local)\n- detect privacy spans locally (original text, privacy level, privacy type)\n- apply a user policy: e.g., mask only **PL3+**, or **PL2–PL4**\n- replace spans with typed placeholders\n- store mapping locally (persistent across sessions)\n\n### Stage B — Cloud Processing\n- send only placeholderized text to the cloud LLM \u002F memory system\n- the cloud performs normal agent workflows (reasoning, tool use, memory write\u002Fretrieval) **and generates a response**\n- cloud memory stores placeholders, not raw secrets\n\n### Stage C — Downlink Restoration (Local)\n- restore placeholders in the response using the local mapping DB\n- user sees original values; cloud never receives them\n\n---\n\n## Quickstart\n\n### 1) Installation\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FMemTensor\u002FMemPrivacy.git\ncd MemPrivacy\npython -m venv .venv\nsource .venv\u002Fbin\u002Factivate\npip install -r 
requirements.txt\n```\n\n### 2) Configuration\n\nTo both use the core MemPrivacy framework and run the evaluation benchmarks, you need to configure two YAML files:\n\n**1. `src\u002Fprivacy_config.yaml` (For using the framework)**  \nThis file controls the core reversible pseudonymization module. Key configurations include:\n- **`llm`**: API credentials (`base_url`, `api_key`) and model parameters used for on-device privacy detection.\n- **`privacy`**: The local SQLite database path (`db_path`) for storing mapping rules, and the `mask_levels` (e.g., `PL3`, `PL4`) to define your privacy protection policy.\n\n**2. `evaluation\u002Feval_config.yaml` (For evaluating memory systems)**  \nThis file configures the benchmarking suite across different memory systems (Mem0, Memobase, etc.). Key configurations include:\n- **Global API Keys**: `openai_base_url` and `openai_api_key`.\n- **Role-specific LLMs**: Distinct model settings for memory operations (`memory_llm`), generating answers (`answer_llm`), and automated evaluation (`judgment_llm`, `privacy_llm`).\n- **System Configs**: Database paths and connection URLs for specific memory systems (e.g., `mem0_config`, `memobase`).\n\n---\n\n## Evaluate Memory Systems (Mem0 \u002F LangMem \u002F Memobase)\n\nExample commands:\n\n```bash\npython evaluation\u002Feval_mem0.py\npython evaluation\u002Feval_langmem.py\npython evaluation\u002Feval_memobase.py\n```\n\nEvaluation logic:\n1. feed dialogues turn-by-turn into the memory system (optionally with MemPrivacy masking)\n2. query the system using generated questions\n3. judge answer correctness (short-answer uses an LLM judge; PersonaMem-v2 uses exact match)\n4. 
compute privacy leakage \u002F extraction metrics\n\n---\n\n## Use MemPrivacy in Your Own Agent (Minimal Example)\n\nThe reversible pseudonymization APIs live in:\n\n- `src\u002Fprivacy_masking.py` (core)\n- (a similar copy exists under `evaluation\u002Fprivacy_masking.py` for evaluation-time use)\n\nConceptual usage:\n\n```python\nfrom src.privacy_masking import PrivacyStore, mask_dialogue, unmask_dialogue\n\nstore = PrivacyStore(db_path=\"local_privacy_store.sqlite\")\n\nmasked_text, meta = mask_dialogue(\n    text=user_text,\n    privacy_items=detected_privacy_items,  # produced locally by MemPrivacy model\n    store=store,\n    mode=\"type_specific\",                  # or \"generic\", \"complete\"\n)\n\n# send masked_text to cloud...\n\nrestored = unmask_dialogue(cloud_response_text, store=store)\n```\n\n### Masking Modes\n- `type_specific`: `\u003CEmail_1>`, `\u003CReal_Name_2>` (best utility)\n- `generic`: `\u003CPrivacy_1>` (less semantic signal)\n- `complete`: remove sensitive spans entirely (max privacy, lowest utility)\n\n### Policy Control (Privacy Levels)\nYou can enforce a masking threshold such as:\n- protect `PL4` only (credentials)\n- protect `PL3+` (highly sensitive + secrets)\n- protect `PL2–PL4` (most conservative)\n\n---\n\n## Citation\n\nIf you use MemPrivacy-Bench, the taxonomy, or the framework, please cite:\n\n```bibtex\n@misc{chen2026memprivacyprivacypreservingpersonalizedmemory,\n      title={MemPrivacy: Privacy-Preserving Personalized Memory Management for Edge-Cloud Agents}, \n      author={Yining Chen and Jihao Zhao and Bo Tang and Haofen Wang and Yue Zhang and Fei Huang and Feiyu Xiong and Zhiyu Li},\n      year={2026},\n      eprint={2605.09530},\n      archivePrefix={arXiv},\n      primaryClass={cs.CR},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.09530}, \n}\n```\n\n---\n\n## Disclaimer\n\nThis project is intended for **privacy research and evaluation**.  
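\nDo **not** use it to process real user secrets without proper security controls, threat modeling, and compliance review. Always follow local laws and organizational policies.\n","MemPrivacy is a privacy-preserving personalized memory management framework for edge-cloud agents. Before data leaves the device, it replaces sensitive information with semantically meaningful typed placeholders, and it restores the original values locally after the cloud response returns, so raw private data is never stored or exposed in the cloud. This protects user privacy while preserving task semantics. The project is written in Python and suits edge-computing scenarios that need long-term personalization but are highly concerned with user privacy, such as intelligent assistants and health-monitoring applications; it can significantly reduce the privacy risk from data leaks without sacrificing system utility.","2026-06-11 03:54:44","CREATED_QUERY"]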