[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80703":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":12,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":14,"stars7d":15,"stars30d":16,"stars90d":14,"forks30d":14,"starsTrendScore":13,"compositeScore":17,"rankGlobal":9,"rankLanguage":9,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":19,"hasPages":19,"topics":21,"createdAt":9,"pushedAt":9,"updatedAt":22,"readmeContent":23,"aiSummary":24,"trendingCount":14,"starSnapshotCount":14,"syncStatus":25,"lastSyncTime":26,"discoverSource":27},80703,"ghosttype","xFreed0m\u002Fghosttype","xFreed0m","Local forensic scanner that extracts credentials from AI tool conversation history. For authorized red team and DLP use only.",null,"Python",52,4,1,0,6,7,45.8,"Other",false,"main",[],"2026-06-12 04:01:29","# ghosttype\n\n[![CI](https:\u002F\u002Fgithub.com\u002FxFreed0m\u002Fghosttype\u002Factions\u002Fworkflows\u002Fci.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002FxFreed0m\u002Fghosttype\u002Factions\u002Fworkflows\u002Fci.yml)\n[![coverage ≥95%](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcoverage-%E2%89%A595%25-brightgreen)](https:\u002F\u002Fgithub.com\u002FxFreed0m\u002Fghosttype\u002Factions\u002Fworkflows\u002Fci.yml)\n[![Scorecard supply-chain](https:\u002F\u002Fgithub.com\u002FxFreed0m\u002Fghosttype\u002Factions\u002Fworkflows\u002Fscorecard.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002FxFreed0m\u002Fghosttype\u002Factions\u002Fworkflows\u002Fscorecard.yml)\n[![OpenSSF Best 
Practices](https:\u002F\u002Fwww.bestpractices.dev\u002Fprojects\u002FBESTPRACTICES_ID\u002Fbadge)](https:\u002F\u002Fwww.bestpractices.dev\u002Fprojects\u002FBESTPRACTICES_ID)\n[![pre-commit](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https:\u002F\u002Fgithub.com\u002Fpre-commit\u002Fpre-commit)\n\n\u003C!--\nSSCS decorators — REAL, KNOWN badges only. None are fabricated.\n  CI \u002F Scorecard supply-chain : native GitHub Actions workflow-status badges; live once the workflow runs on the default branch.\n  coverage ≥95%               : NOT a fabricated number. It states the ENFORCED invariant — CI runs `pytest --cov-fail-under=95`, so a green build provably means coverage ≥95% (actual is higher, ~98%, but the guaranteed claim is the floor). A static \"97%\" would rot\u002Fbecome false; a live exact-% badge needs Codecov (excluded) or a CI self-publish job (offered separately). This gate badge is truthful, FOSS (shields.io static), zero-infra.\n  OpenSSF Scorecard (score)   : REMOVED. The api.securityscorecards.dev score badge is structurally impossible on a fork — the OpenSSF webapp rejects fork publishing (HTTP 400, \"Fork repository: true\") and `publish_results: true` then FAILS the scorecard workflow. It is NOT \"pending activation\"; it can only ever work if this stops being a fork. The Scorecard *workflow* still runs (SARIF → code-scanning); the \"Scorecard supply-chain\" workflow-status badge above reflects that truthfully. Removed rather than show a permanently-broken \"invalid repo path\" badge.\n  OpenSSF Best Practices      : official bestpractices.dev badge; the URL needs a numeric project ID from registering the project. BESTPRACTICES_ID is a placeholder, NOT a fabricated value — replace after registering (SSCS-USER-ACTIONS.md). 
Until then this badge intentionally shows as broken rather than show a fake score.\n  pre-commit                  : the project's own recognized badge; live immediately.\nThe gitleaks badge was REMOVED because gitleaks is no longer used anywhere — the pre-commit cred scanner is now TruffleHog (broader detector coverage + active verification). Keeping a \"protected by gitleaks\" badge would be a false claim.\nDeliberately NOT added — no canonical project decorator exists, so adding one would be inventing it (the user explicitly forbade fabricated badges): Opengrep, OSV-Scanner, pip-audit, Dependabot, CodeQL, TruffleHog (TruffleHog has no canonical README badge — not added rather than fabricated).\nSLSA: slsa.dev publishes a Build-Level badge, but ghosttype has no provenanced release yet (release.yml is dormant). Claiming \"SLSA 3\" now would be false — add only after the first provenanced release (SSCS-USER-ACTIONS.md).\n-->\n\nLocal forensic scanner that extracts **and verifies** credentials from AI tool conversation history. Detection + verification powered by [TruffleHog](https:\u002F\u002Fgithub.com\u002Ftrufflesecurity\u002Ftrufflehog).\n\n> Read the original blog post: [**ghosttype — finding secrets in AI conversation history**](https:\u002F\u002Fbetheadversary.com\u002Fposts\u002Fghosttype)\n\n> **Authorized use only.** For licensed penetration testers, red teams, and DLP\u002Fblue teams operating under explicit written authorization. See [THREAT-MODEL.md](THREAT-MODEL.md).\n\n---\n\n## What it does\n\nghosttype scans AI tool conversation files for exposed credentials, then asks TruffleHog whether each one is **actually live** by hitting the issuing provider's verification endpoint. Findings are emitted as JSON + CSV, each linked back to the source conversation.\n\n**Two complementary detection engines** (since v0.4.0):\n\n- **TruffleHog** — 800+ structural detectors with live API verification, entropy filtering, known-example exclusion. 
The only engine that can prove a credential is *live*.\n- **In-tree pattern engine** — 30 regex + 10 heuristic patterns. Offline, never verified, but catches loose variable-name context signals (`api_key=`, `password=`, `JWT_SECRET=`) that TruffleHog's structural detectors don't match.\n\nBy default both run and results are merged (`--engine both`); on a `(secret_value, file)` overlap the TruffleHog finding wins because it carries verification. Choose one with `--engine {both,trufflehog,patterns}`. ghosttype always owns the discovery layer — where each AI tool stores conversations and how to decode them. Every finding carries a `source` field so you know which engine produced it.\n\n**Supported AI tools:**\n\n| Tool | Data source |\n|------|------------|\n| Claude Code CLI | `~\u002F.claude\u002Fprojects\u002F**\u002F*.jsonl` + history |\n| Cursor IDE | `state.vscdb` (SQLite, global + workspace) |\n| Codex CLI | `~\u002F.codex\u002Fstate_5.sqlite` + logs |\n| ChatGPT Desktop | Keychain-backed `.data` files (AES-128-CBC) |\n| Claude Desktop | Stub (path detected; extraction in progress) |\n\n**Detected credential types:** the full TruffleHog detector catalog (800+), including AWS, GitHub PATs, OpenAI \u002F Anthropic, Stripe, Slack, HashiCorp Vault, Snowflake, Databricks, Linear, GCP service accounts, Azure, Twilio, Cloudflare, npm, Telegram, Hugging Face, DigitalOcean, Docker Hub, Pulumi, Doppler, PyPI, SendGrid, JWT, PEM private keys, and database connection strings. 
Each finding is marked `verified: true` if TruffleHog confirmed it live against the provider's API, or `verified: false` if the structure matched but verification was skipped, declined, or failed.\n\n---\n\n## Requirements\n\n- Python 3.11+\n- TruffleHog 3.x installed and on `PATH` (or set `GHOSTTYPE_TRUFFLEHOG_BIN`)\n  - macOS: `brew install trufflehog`\n  - Linux: see [installation docs](https:\u002F\u002Fgithub.com\u002Ftrufflesecurity\u002Ftrufflehog#installation)\n- macOS for full tool coverage (Linux\u002FWindows paths: roadmap)\n\nCheck your install:\n\n```bash\nghosttype doctor\n```\n\n---\n\n## Quick start\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fp4gs\u002Fghosttype\ncd ghosttype\npython3 -m venv .venv && source .venv\u002Fbin\u002Factivate\npip install -e \".[dev]\"\n\n# Scan all detected AI tools (verifies every credential against its provider)\nghosttype scan\n\n# Triage rotation work: only TruffleHog-verified live credentials\nghosttype scan --only-verified\n\n# Fast offline pass: detect without hitting any provider APIs\nghosttype scan --no-verification\n\n# Pipe to jq, filter by detector\nghosttype scan --no-verification --format json --output - --quiet \\\n  | jq '.[] | select(.detector_name == \"Github\")'\n\n# Show which AI tools and TruffleHog are present\nghosttype doctor\n```\n\n---\n\n## Output\n\nDefault: `.\u002Fghosttype_report\u002Ffindings.json` + `findings.csv`\n\nEach finding includes:\n\n| Field | Description |\n|-------|-------------|\n| `tool` | Source AI tool (e.g. `claude_code`) |\n| `detector_name` | TruffleHog detector name (e.g. 
`Github`, `AWS`) |\n| `secret_type` | Detector name, lowercased |\n| `severity` | `critical` \u002F `high` \u002F `medium` — derived from detector + verification state |\n| `verified` | `true` if TruffleHog confirmed live against the provider, else `false` |\n| `verification_error` | Verifier error message if verification was attempted and errored |\n| `secret_value` | Plaintext value (use `--redact` to mask) |\n| `file_path` | Source conversation file |\n| `position` | Chunk position + line within the chunk |\n| `confidence` | `verified` or `unverified` |\n| `context` | Window of surrounding text |\n| `extra_data` | TruffleHog detector extras (e.g. `rotation_guide` URLs) |\n\n---\n\n## All scan options\n\n```\nghosttype scan [OPTIONS]\n\n  --tool TEXT                  Scan one tool: cursor, chatgpt, codex, claude, claude_code\n  --format [json|csv|both]     Output format (default: both)\n  --output TEXT                Output dir, or - for stdout JSON (default: .\u002Fghosttype_report)\n  --engine [both|trufflehog|patterns]\n                               Detection engine (default: both). 
'patterns'\n                               needs no TruffleHog binary.\n  --redact                     Mask secret values in output\n  --min-confidence             verified | unverified (default: unverified)\n                               'verified' = TruffleHog-verified only;\n                               'high' (legacy) also keeps regex pattern hits\n  --only-verified              Pass --results=verified to TruffleHog\n  --no-verification            Skip live verifier calls (fast, offline)\n  --trufflehog-binary PATH     Override the TruffleHog binary\n  --trufflehog-timeout SECONDS Outer timeout for the TruffleHog subprocess (default: 300)\n  --max-age-days N             Only scan files modified within last N days\n  --copy-sources               Copy source conversation files to output\u002Fsources\u002F\n  --allow-list PATH            Suppress known-safe values (one value per line)\n  --stats-only                 Print summary statistics only\n  --quiet \u002F -q                 Suppress banner for scripting\n  --context-window N           Context chars around match (default: 200)\n\nghosttype list-tools           Show detected AI tools on this machine\nghosttype doctor               Show TruffleHog binary, version, and detected tools\nghosttype version              Print version\n```\n\n### Environment variables\n\n- `GHOSTTYPE_TRUFFLEHOG_BIN` — explicit path to TruffleHog binary (overridden by `--trufflehog-binary`)\n\n### Exit codes\n\n- `0` — no findings\n- `1` — at least one finding (enables CI\u002FCD gating)\n- `2` — environment problem (TruffleHog missing, subprocess failed, etc.)\n\n---\n\n## Detection design\n\nghosttype is two layers stitched together:\n\n```\n[AI tool storage] --(scanner module)--> TextChunks --(trufflehog filesystem)--> Findings\n   .jsonl\u002FSQLite\u002Fencrypted         (extracted text)        (verified or unverified)\n```\n\nThe **discovery layer** is the per-tool code under `ghosttype\u002Fscanners\u002F` — one module each 
for Claude Code, Cursor, Codex, ChatGPT, Claude Desktop. They know SQLite schemas, Electron `safeStorage` decryption, and JSONL message shapes.\n\nThe **detection + verification layer** is a TruffleHog subprocess. ghosttype writes each extracted text chunk to a temp file with a deterministic name, runs `trufflehog filesystem --json --no-update [...] \u003Ctmpdir>`, parses NDJSON results, and maps each one back to the originating conversation record via the temp filename. The temp dir is deleted in a `finally` block; nothing persists.\n\nSee [ARCHITECTURE.md](ARCHITECTURE.md) for the full pipeline diagram.\n\n---\n\n## Security & threat model\n\nghosttype is forensic — it reads files you already have access to and runs detectors locally. The only network traffic is TruffleHog's own verification calls to credential issuers (AWS, GitHub, Stripe, etc.) and only when verification is enabled.\n\nUse `--no-verification` if any of the following apply:\n- You're operating in an air-gapped environment\n- You don't want to risk lighting up provider audit logs on red-team engagements\n- You just want fast triage\n\nSee [THREAT-MODEL.md](THREAT-MODEL.md) for intended-use and abuse considerations.\n\n---\n\n## License\n\nSee [LICENSE](LICENSE).\n","ghosttype is a local forensic scanner that extracts credentials from AI tool conversation history. It is written mainly in Python, has high code coverage (≥95%), and has passed multiple security and best-practice checks, ensuring its reliability and security. The tool is especially suited to authorized red team members or data loss prevention (DLP) scenarios, helping users uncover potential security risks while staying within legal and compliance bounds. By integrating a range of automated tests and continuous integration workflows, ghosttype safeguards software quality while also simplifying deployment and maintenance.",2,"2026-06-11 04:01:41","CREATED_QUERY"]