[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-83337":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":12,"subscribersCount":12,"size":12,"stars1d":13,"stars7d":15,"stars30d":15,"stars90d":12,"forks30d":12,"starsTrendScore":16,"compositeScore":17,"rankGlobal":9,"rankLanguage":9,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":19,"hasPages":19,"topics":21,"createdAt":9,"pushedAt":9,"updatedAt":42,"readmeContent":43,"aiSummary":9,"trendingCount":12,"starSnapshotCount":12,"syncStatus":44,"lastSyncTime":45,"discoverSource":46},83337,"subarr","coaxk\u002Fsubarr","coaxk","The coordination, measurement, and quality layer that subgen never had. A peer service for the *arr family that adds calibrated audio-language detection, provider success leaderboards, and (v1.1) an in-app Whisper tuning lab.",null,"Python",83,0,1,18,29,7,59.4,"MIT License",false,"main",[22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41],"arr-stack","bazarr","docker","fastapi","homelab","jellyfin","media-automation","plex","python","radarr","self-hosted","selfhosted","servarr","sonarr","subgen","subtitle-generation","subtitles","tautulli","transcription","whisper","2026-06-12 04:01:40","# subarr\n\nThe coordination layer for the *arr subtitle stack. Stands beside Bazarr.\n\nSubarr decides what subtitles are actually missing across your library, which providers are worth your time, and when it is worth running Whisper. Bazarr finds and downloads. Subgen transcribes. Subarr coordinates.\n\n[![status](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fstatus-v1.2-violet)](https:\u002F\u002Fgithub.com\u002Fcoaxk\u002Fsubarr)\n[![tests](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftests-603_passing-22d3ee)](https:\u002F\u002Fgithub.com\u002Fcoaxk\u002Fsubarr\u002Factions\u002Fworkflows\u002Fci.yml)\n[![security](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FBandit_%2B_Semgrep_%2B_Trivy_%2B_pip--audit-22c55e)](#security)\n[![license](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-c8c8cc)](LICENSE)\n\n> Built with AI assistance from Claude. Code is open, every PR is human-reviewed. Telemetry, security scans, and a published test count are how we stay honest about that.\n\n![Subarr in action](docs\u002Fhero.gif)\n\n---\n\n## New in 1.2 — the Tuning Lab and verified audio\n\n- **Tuning Lab: find the Whisper settings that actually win on your hardware.** Pick a file, choose recipes to compare, and subarr runs each one against your live subgen and lets a validated tournament judge rank them objectively. It auto-samples up to three short clips per file (dialogue, a speech-to-silence edge, a quiet stretch) and a recipe has to win across clips, not on a lucky one. A per-language \"herd\" view aggregates results so a dependable default emerges for each language, and nothing is ever written to your library.\n- **Audio-language verification: subarr listens and tells you the truth about the track.** The *arr metadata chain can only parrot whatever Sonarr or Radarr tagged. Subarr verifies the spoken language by ear with robust multi-chunk Whisper detection and tells three real situations apart: a mislabeled track (tagged Danish, audio unanimously Dutch) with a one-click correction that flows back into coverage, a bilingual file flagged as mixed instead of mis-collapsed, and \"Whisper unsure\" that falls back to the known tag rather than guessing. Multi-track files (an original plus a dub) are swept per track. An Audio language issues panel collects every flagged file in one place.\n- **Library-wide audio scan.** One click runs that same listening pass over your whole library, not just files you happened to sweep. It is opt-in, throttled to a background trickle, GPU-polite (it pauses while live Tuning Lab sweeps run), and resumable across restarts. Findings land in the same Audio language issues panel and drop out once you confirm them.\n- **Global recipe leaderboard.** Per-language herds roll up into one overall ranking, scored by the mean of per-language means so each language counts equally. Medals for the top three, a confidence signal, and an expandable per-language breakdown.\n- **Edit integrations in-app.** Add or change Bazarr, Sonarr, Radarr, Tautulli URLs and API keys plus the Plex token from Settings, with test-connection and live apply. No env edit, no restart. Env-set fields stay authoritative and read-only.\n- **Push-based completion.** Subarr consumes subgen's completion webhook instead of polling the queue (polling stays as the fallback).\n\n*Speech-aware audio (silero VAD) and config persistence landed in 1.1; see the changelog for history.*\n\n---\n\n## In one breath\n\n- **See your whole library's subtitle coverage at a glance.** Per-language gap view across Sonarr + Radarr + Bazarr, with audio language we trust.\n- **We verify before we call it a gap.** A row only becomes an actionable gap once subarr has actually probed the file — so it never queues something that already has an embedded sub subgen would skip. Un-probed files wait in a visible \"Analyzing\" bucket, not silently dropped or falsely surfaced.\n- **Calibrated audio language detection.** Three Whisper chunks across the file, conservative voting, confidence-gated. Cheap to skip files Whisper would hallucinate on.\n- **We don't parrot the metadata, we verify it.** Subarr listens to the actual audio and tells a mislabeled track from a bilingual one from \"genuinely unsure\", then offers a one-click fix that flows back into coverage. Beside Bazarr, never instead of it.\n- **Tune Whisper to your hardware.** The Tuning Lab sweeps recipe variants against your live subgen, a validated judge ranks them, and a per-language leaderboard surfaces the dependable default for each language.\n- **Don't burn GPU on content nobody watches.** Scheduled walks with backpressure. Tautulli playback signal influences priority.\n- **Provenance ledger.** Which provider gave you which sub, when, why. Survives re-search runs.\n- **Embedded subs are first-class.** SDH, forced, PGS, full, all distinguished, not collapsed.\n\n## Five-minute install\n\n```yaml\n# compose.yaml\nservices:\n  subarr:\n    image: ghcr.io\u002Fcoaxk\u002Fsubarr:latest\n    container_name: subarr\n    restart: unless-stopped\n    ports:\n      - \"9922:9922\"\n    environment:\n      - PUID=1000\n      - PGID=1000\n      - TZ=Etc\u002FUTC\n      - UMASK=022\n      - SUBARR_DB_PATH=\u002Fdata\u002Fsubarr.db     # SQLite + persisted settings live here\n    volumes:\n      - .\u002Fsubarr\u002Fdata:\u002Fdata                # subarr.db (override the path with SUBARR_DB_PATH)\n      - \u002Fpath\u002Fto\u002Fmedia:\u002Fmedia\u002Flibrary:rw   # same path Bazarr and subgen see\n```\n\n```bash\ndocker compose up -d\n# Open http:\u002F\u002Flocalhost:9922, onboarding wizard auto-detects your stack.\n```\n\nThe wizard tries to auto-detect Sonarr\u002FRadarr\u002FBazarr\u002FTautulli\u002Fsubgen on your existing Docker network and prefills URLs. Manual entry is available at every step as a safety net. Auto-detect plus manual fallback at every step is the design rule.\n\nAfter onboarding you can edit any integration's URL and API key (and the Plex token) directly in Settings, with test-connection and live apply. Values you set via env vars stay authoritative and show as read-only.\n\n**Why `:rw` on the media mount.** Subarr's sidecar mismatch detector renames orphaned `.srt` files whose basename drifted from the video. Read-only blocks this. If you don't want it, set `SUBARR_SIDECAR_RENAME=0` and mount `:ro`, the rest of the product works.\n\n**Plex (optional).** Set `PLEX_URL` + `PLEX_TOKEN` (and optionally `PLEX_SECTION`) to enable two things: an instant Plex library refresh the moment subarr writes a sub (instead of waiting for Plex's own periodic scan), and the opt-in per-show audio-language read (`PLEX_AUDIO_HINTS=1`). Plex shows in the dashboard + Settings integration health either way, so you can see its status at a glance. Activity\u002Fnow-playing still comes through Tautulli.\n\n## Two ways to use subarr\n\nPick whichever fits how you work. You can do both.\n\n**Simple, \"I just want a real frontend for subgen\".** Install subarr, open the Library tab, tick a file or a folder or a whole series, hit \"Queue for transcription\". Watch it run. Re-queue, cancel, see what failed and why. Same way you'd use Sonarr's queue for downloads. No coverage walks, no rules, no scheduler. Just a working UI on top of subgen.\n\n**Advanced, \"tell me what I should fix first\".** Open the Coverage tab. Subarr has already walked your library and sorted gaps by score with reason chips per row (no track, embedded-only, bazarr-wanted, audio-mislabel, low-score, unmonitored). Apply auto-queue rules, run scheduled walks, integrate Tautulli playback signal into priority. Set it up once, walk away. Subarr decides what's worth running.\n\nMost installs start simple and grow into advanced as the coverage walk surfaces things worth doing. Nothing forces the move; both are valid forever.\n\n## I already have subgen. What do I do?\n\nThe most-asked question. Quick answer.\n\n| You have | What to do |\n|---|---|\n| Vanilla `mccloud\u002Fsubgen` | Keep it. Add subarr next to it. Subarr detects vanilla and runs in compat mode. Coverage, provenance, scheduling, audio-language review all work. You miss calibrated multi-chunk detection and queue cancel, both require our subgen patches. |\n| `mccloud\u002Fsubgen` and you want everything | Swap to `ghcr.io\u002Fcoaxk\u002Fsubarr-subgen`. Same upstream image plus 20 small auditable patches. Pull, change one line in your compose, restart. No data loss, no config rewrite. |\n| No subgen yet | Start with `ghcr.io\u002Fcoaxk\u002Fsubarr-subgen`. Everything works on day one. |\n| You run Bazarr only | Subarr adds a coordination layer beside Bazarr. Bazarr keeps doing what it does. Subarr surfaces what is actually missing, schedules the work, and writes results back. |\n\nYou do not need to decide at install. Subarr re-probes subgen every 30 seconds and adopts new capabilities the moment you upgrade.\n\n## Do you need subarr?\n\nSkip subarr if any of these are true:\n\n- Your library is single-language and you have never had a wrong-language subtitle land.\n- You use one or two providers and never wonder which one delivered what.\n- You don't run Whisper or any local transcription, and don't plan to.\n\nSubarr's value compounds with: multi-language libraries, three or more Bazarr providers, Whisper-in-the-loop, and a habit of asking \"why did Bazarr re-search this?\"\n\n## What's in subarr\n\n| Surface | Function |\n|---|---|\n| Dashboard | Live column-as-stage pipeline (discovered → probing → bazarr-wanted → transcribing → written-back), GPU widget, integration health, next scheduled run, recent activity |\n| Coverage | Scored gap list (tree-by-show or flat), score-gradient sort, reason chips (no-track, embedded-only, bazarr-wanted, audio-mislabel, low-score, unmonitored). **Probe-gate:** only files subarr has verified appear as gaps; un-probed files sit in a sticky \"Analyzing\" bucket (with a Probe-now action) and \"Couldn't analyze\" surfaces failures — nothing silently dropped. Bulk select + apply rule + queue |\n| Library | Tree across all series and movies. Audio \u002F sub \u002F runtime columns with probe-state indicators |\n| Queue | Featured Queue: Processing, Queued, Lost-on-restart, Issues, Recently done. Per-row and **bulk** requeue \u002F remove \u002F cancel (multi-select across every section). Promote \u002F demote \u002F reorder \u002F pause are roadmap (need a subgen queue-mutation patch first) |\n| Review | Manual audio-language verification queue with audio player, multi-track support, batch cycle, Layer 3 Whisper detection inline. **Speech-aware clip selection (1.1):** the player lands on actual dialogue via silero VAD, with a \"🎙 speech-detected\" badge |\n| Rules | Auto-queue rules with score thresholds, language filters, custom-format pre-classification |\n| Tuning Lab | Config arena: sweep Whisper recipes against your live subgen, judged by a validated tournament judge across multiple strata clips. Per-language herd view, global recipe leaderboard, and an Audio language issues panel surfacing mislabeled \u002F bilingual \u002F multi-track files from on-demand sweeps and the opt-in library-wide scan |\n| Settings | Per-language Whisper kwargs, **in-app integration editing** (URLs + API keys + Plex token, test-connection + live apply, env-set fields stay read-only), integrations health, system actions, telemetry transparency panel showing the exact JSON last sent. **Speech-aware audio:** enable\u002Fdisable + download the silero model |\n\n### About ollama (optional, recommended)\n\nSubarr does not require ollama. With it, you get two extras:\n\n- **Structured enrichment.** Vague Bazarr wanted entries get classified by language, genre hints, dialog density. Improves prioritisation. Works with any text model.\n- **Vision pre-filter.** A vision-capable model classifies Tautulli thumbnails as dialog-heavy \u002F music-heavy \u002F visual-only. Suppresses transcribe submissions where Whisper would hallucinate.\n\nVision and text models are separate (`OLLAMA_MODEL` and `OLLAMA_VISION_MODEL`). Default vision model is `qwen2.5vl:7b`. Subarr auto-detects any installed model from `qwen2.5vl`, `qwen2-vl`, `llama3.2-vision`, `llava`, `bakllava`, `minicpm-v`, `moondream`. Without a vision-capable model the pre-filter is cleanly disabled, not silently broken. Settings shows the active state.\n\n## Screens\n\nReal library, real foreign-language content — nothing staged.\n\n**Dashboard** — live pipeline (discovered → probing → bazarr-wanted → transcribing → written-back), GPU, integration health, next scheduled run, recent activity.\n\n![Dashboard](docs\u002Fscreenshots\u002F01-dashboard.png)\n\n**Coverage** — the scored gap list with the probe-gate: verified gaps in the table, un-probed files held in \"Analyzing\", every explainer panel inline.\n\n![Coverage](docs\u002Fscreenshots\u002F02-coverage.png)\n\n**Queue** — a real frontend for subgen: Processing \u002F Queued \u002F Lost-on-restart \u002F Issues, with per-row and bulk requeue · remove · cancel.\n\n![Queue](docs\u002Fscreenshots\u002F03-queue.png)\n\n**Library** — every series and movie with audio \u002F sub \u002F runtime + probe-state.\n\n![Library](docs\u002Fscreenshots\u002F04-library.png)\n\n**Review** — manual audio-language verification with an audio player, multi-track support, and inline Whisper detection. In 1.1 the clip lands on actual dialogue (silero VAD), not dead air.\n\n![Review](docs\u002Fscreenshots\u002F05-review.png)\n\n**Tuning Lab** — sweep Whisper recipes against your live subgen; a validated judge ranks them across multiple clips, with plain-language guidance and per-clip winners. Nothing is written to your library.\n\n![Tuning Lab](docs\u002Fscreenshots\u002F10-tuning-lab.png)\n\n**Recipe leaderboard** — every recipe's per-language results rolled into one overall ranking (mean of per-language means, so each language counts equally). Medals for the top three, a confidence signal, and an expandable per-language breakdown.\n\n![Recipe leaderboard](docs\u002Fscreenshots\u002F11-leaderboard.png)\n\n**Audio language issues** — subarr listened and disagreed with the tag: mislabeled, bilingual, and multi-track files flagged in one place, from on-demand sweeps and the opt-in library-wide scan. One click to review and confirm.\n\n![Audio language issues](docs\u002Fscreenshots\u002F12-audio-issues.png)\n\n**Rules** — auto-queue policy with score thresholds and language filters, plus a live \"what would queue right now?\" preview.\n\n![Rules](docs\u002Fscreenshots\u002F06-rules.png)\n\n**Settings — Integrations** — live online \u002F version \u002F badges per service.\n\n![Settings — integrations](docs\u002Fscreenshots\u002F07-settings-integrations.png)\n\n**Settings — Telemetry** — full transparency: install ID, opt-out, and the exact JSON last sent.\n\n![Settings — telemetry](docs\u002Fscreenshots\u002F08-settings-telemetry.png)\n\n**Logs** — structured, filterable runtime logs.\n\n![Logs](docs\u002Fscreenshots\u002F09-logs.png)\n\n## How calibrated audio detection works\n\nVanilla subgen samples one 30-second window at the start of a file and trusts whatever Whisper says. That window is silent, intro music, or a foreign-language opening narration as often as not. Anime is the canonical failure case: an English-dub episode whose first 30 seconds are the Japanese OP gets transcribed in Japanese, the user gets garbage, nobody knows why.\n\nSubarr's audio-language pipeline:\n\n```\n  L1  file metadata          ffprobe audio_language tag.\n                             Cheap, often wrong on retags.\n\n  L2  Tautulli signal        Which audio track is your household\n                             actually picking when they watch?\n\n  L3  Whisper robust detect  Sample 3 chunks across 10 \u002F 50 \u002F 90 percent\n                             of the file. Vote by majority. Confidence\n                             is the MINIMUM probability across the\n                             agreeing chunks, one high-confidence\n                             chunk cannot mask a disagreeing one.\n\n  L4  user verification      Review queue surfaces every suspect row.\n                             One click confirms, propagates to Sonarr\n                             so Bazarr stops getting blinded.\n```\n\nOnce a verification exists, every downstream submission carries it through an evidence gate. Confidence below 0.5, or missing source field, refuses to forward the override. Whisper transcribes from the audio, the way it was meant to.\n\n## Common questions\n\n**Is this just for anime?** No. The audio-language detection problem hits anything where the first 30 seconds of a file aren't representative: foreign-language openings on dubbed releases, silent cold opens, music-only intros, opening narrations in a different language than the dialog. Anime gets cited a lot because the OP pattern is universal across the genre, but the technical problem is general across multi-language libraries. Coverage, scheduling, provenance, and the queue UI are all language-and-genre-agnostic.\n\n**Do I need ollama?** No. It enables two optional extras (structured enrichment and the vision pre-filter). Everything else works without it.\n\n**Do I need Tautulli?** No, but you get NOW PLAYING boost, just-imported boost, and per-user language profiles if you have it. Without Tautulli the scheduler still works, it just has one fewer priority signal.\n\n**Will this work with Jellyfin \u002F Emby?** Not yet — a candidate if there's demand. Open a feature request.\n\n## Known limitations (v1.2)\n\nTransparent before you install.\n\n- Requires `ghcr.io\u002Fcoaxk\u002Fsubarr-subgen` for calibrated Layer 3 detection, queue cancel, curated per-language `initial_prompt`s, and the safe-decode preset. Vanilla subgen works in compat mode but you miss these.\n- No built-in multi-user auth. Basic-auth env vars exist as a single-admin fallback. Run behind a reverse proxy (Authelia \u002F Caddy \u002F Traefik) for anything serious.\n- Queue reorder \u002F promote \u002F demote \u002F pause aren't shipped yet — they need a subgen queue-mutation patch first. Requeue \u002F remove \u002F cancel work today.\n- Auto-update is intentionally absent. Update notifications appear in the UI; you run the upgrade.\n- Plex activity signal goes through Tautulli (the bridge). Reading a show's *selected* audio language straight from Plex metadata is an opt-in extra (`PLEX_AUDIO_HINTS=1`), off by default.\n- Multi-episode disc images (a single `.iso` holding a whole season) can't be probed per-episode, so they're surfaced in a distinct \"Couldn't analyze\" (unsupported) bucket rather than becoming verified gaps or sitting in \"Analyzing\" forever. Standard per-episode files are unaffected.\n- SQLite only. No Postgres backend.\n- Single-host. Workers \u002F multi-host are an explicit non-goal until users ask.\n- Jellyfin \u002F Emby are not yet supported.\n- arm64 builds are not yet published. Pi 4 \u002F 5 users need to build locally for now.\n- Compose example uses bind mounts. Named volumes work but you lose the \"same path Bazarr and subgen see\" sanity.\n\n## Security\n\n- Bandit, Semgrep, pip-audit, Trivy run on every push to `coaxk\u002Fsubarr` and `coaxk\u002Fsubarr-subgen`. SARIF uploads to the GitHub Security tab.\n- Constant-time auth comparison (`secrets.compare_digest`). Regression tested.\n- API keys never appear in any HTTP response, masked surface, raw key only in dataclass internals. Regression tested.\n- Every filesystem operation routes through `canonical_to_fs()` which rejects path-traversal outside the configured media root. Regression tested.\n- Parameterised SQL throughout. Zero string-concat. Grepped in CI.\n- `shell=False` everywhere. No user input flows into `subprocess.run`. Grepped in CI.\n- Telemetry payload contents enumerated in `src\u002Fsubarr\u002Ftelemetry.py` with a regression test (`test_payload_never_includes_forbidden_fields`) guarding against accidental fingerprintable fields.\n- Reporting a vulnerability: `security@subarr.com`. We acknowledge within 72 hours. Full policy in [`SECURITY.md`](SECURITY.md).\n\n## Telemetry\n\nSubarr ships with anonymous telemetry **on by default**. We are explicit about what it buys you, and the opt-out is one click in Settings and one click in the onboarding wizard.\n\n**What gets sent:** install ID (random UUID generated locally, not a user identity), subarr version, Python version, OS \u002F arch, subgen kind (subarr-subgen \u002F vanilla \u002F unreachable), subgen version, integration booleans (configured yes \u002F no, never URLs or keys), library-size bucket (under 100 \u002F 100-1k \u002F 1k-10k \u002F over 10k), scheduler mode, walks-per-day rolling average, error counts by exception class, docker tier.\n\n**Never sent:** file paths, titles, IPs, hostnames, API keys, languages, anything user-fingerprintable. Enforced by a regression test on the client AND by an allow-list \u002F forbidden-pattern check on the receiving Cloudflare Worker. Both pin against the same forbidden-fields list.\n\n**What it buys you:** the Tuning Lab and recipe leaderboard shipped in 1.2 are the *local* half of a feedback loop. Telemetry is what lets the *global* half follow:\n- A global Whisper-kwargs leaderboard built from aggregated telemetry. The more installs send their per-language kwargs plus verification outcomes, the more accurate the \"best French settings\" recommendation gets.\n- A global provider success leaderboard, the same loop for Bazarr providers.\n- Tuning Lab variant suggestions pre-filled from cohort data.\n\nThese cross-install loops are the post-1.2 roadmap (gated on a trustworthy reference-free quality judge before any crowd-aggregation); 1.2 ships the local lab + leaderboard they build on.\n\n**Where to verify:** Settings → Telemetry shows the exact JSON of the last ping. Receiving worker source at [`coaxk\u002Fsubarr-telemetry`](https:\u002F\u002Fgithub.com\u002Fcoaxk\u002Fsubarr-telemetry). Public stats dashboard at [`stats.subarr.com`](https:\u002F\u002Fstats.subarr.com).\n\n**Note for Pi-hole users:** there are two subarr subdomains and they do different things.\n\n- `telemetry.subarr.com`, the receiver your install posts heartbeats to. Privacy-conscious regex blocklists deny anything matching `*telemetry*` by default, which catches this one. That is working as intended: blocking it switches telemetry off without any further action.\n- `stats.subarr.com`, the public read-only dashboard. No PII, no auth, no requests from your install, just the aggregated numbers anyone can view. Most blocklists do not catch it because the name is honest about what it is.\n\nWe picked these names deliberately. Hiding the sender behind something like `analytics.subarr.com` or putting it on the apex would be the opposite of honest. If you want telemetry off, do not allow `telemetry.subarr.com`. If you want it on, allow that one specifically rather than wildcarding the whole zone.\n\n## Authentication\n\nNo built-in auth by default. Designed for a reverse proxy (Authelia, Caddy basicauth, Traefik forward-auth). In-product fallback is HTTP Basic via env vars:\n\n```yaml\nenvironment:\n  SUBARR_USER: youradmin\n  SUBARR_PASS: a-very-long-random-password\n```\n\nWhen both are set, every non-monitoring request requires Basic credentials. `\u002Fapi\u002Fhealth` always bypasses for monitoring tools.\n\nHonest limitations of basic auth: one global user, no per-user audit, credentials transmitted on every request. Reverse-proxy auth is the right answer for anything that matters.\n\n## Updates\n\nSubarr polls GitHub releases once per 24 hours for both `coaxk\u002Fsubarr` and `coaxk\u002Fsubarr-subgen`. The subarr-subgen comparison uses patch-stack revision so patch-level updates are detected even when upstream subgen version stays the same.\n\n```bash\n# In the directory with your compose.yaml\ndocker compose pull\ndocker compose up -d\n```\n\nThe Settings panel shows the current vs latest version per product with release notes inline. No auto-update by design, you run upgrades when you know it is happening.\n\n## Architecture\n\n\u003Cpicture>\n  \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"docs\u002Farchitecture-dark.png\">\n  \u003Cimg alt=\"subarr sits between your stack's inputs and subgen: Bazarr's wanted-list, Sonarr\u002FRadarr file paths, library files on disk, and Tautulli\u002FPlex hints feed into subarr — scheduler, probe-gate (ffprobe), coverage, queue — which coordinates transcription out to subgen (Whisper), the written .srt, and a Plex library refresh.\" src=\"docs\u002Farchitecture.png\" width=\"850\">\n\u003C\u002Fpicture>\n\n**How it runs.** subarr is a long-running service with its own scheduler — it reads Bazarr's wanted list and walks your library on a cadence you set (and on demand from the UI). You don't wire it into Sonarr\u002FRadarr as a custom script or trigger it manually; it just runs beside them.\n\n| Layer | Detail |\n|---|---|\n| Backend | Python 3.12 + FastAPI + httpx. Async throughout. |\n| Storage | Single SQLite file, default `\u002Fdata\u002Fsubarr.db` (override with `SUBARR_DB_PATH`). Hand-rolled migrations runner. |\n| Frontend | React 18 + esbuild. CDN React. Bundles committed so `pip install` ships a working SPA. |\n| Subgen drive | HTTP. 20 small patches over upstream McCloudS\u002Fsubgen. Living patch stack at [`coaxk\u002Fsubarr-subgen`](https:\u002F\u002Fgithub.com\u002Fcoaxk\u002Fsubarr-subgen). |\n| Discovery | Read-only Docker API via [tecnativa\u002Fdocker-socket-proxy](https:\u002F\u002Fgithub.com\u002FTecnativa\u002Fdocker-socket-proxy). |\n| Telemetry receiver | Cloudflare Worker + D1. Open source at [`coaxk\u002Fsubarr-telemetry`](https:\u002F\u002Fgithub.com\u002Fcoaxk\u002Fsubarr-telemetry). |\n\nThree deployment tiers (full templates in [`deploy\u002Ftemplates\u002F`](deploy\u002Ftemplates\u002FREADME.md)):\n\n| Tier | What you get | What you give up | Who it is for |\n|---|---|---|---|\n| 1, Standalone | Manual integration URLs, no Docker access | Auto-detect, container-name hostnames | Non-Docker hosts |\n| 2, Socket proxy (recommended) | Auto-detect on your existing Docker network | Slightly more setup | Most homelabs |\n| 3, Full integration | Tier 2 + API-key auto-extract from config volumes | Subarr can read every mounted config dir | Trust your single-tenant box |\n\n## Roadmap\n\n**v1.2 (this release)** — the Tuning Lab and verified audio:\n\n- **Tuning Lab** (shipped): sweep Whisper recipes against your live subgen, judged by the validated tournament judge, with per-language herd adoption.\n- **Global recipe leaderboard** (shipped): per-language herds rolled into one overall ranking, each language weighted equally.\n- **Audio-language verification + library-wide scan** (shipped): subarr listens and flags mislabeled, bilingual, and multi-track files, with one-click fixes that flow into coverage.\n- **In-app integration editing** (shipped): credentials editable from Settings, test-connection + live apply, no restart.\n- **Series-level audio-language intent** (shipped): declare a series' language once; new episodes inherit it as verified on the next coverage build.\n\n**Later** — still on the list:\n\n- **Provider success leaderboard**: aggregate Bazarr per-provider history across opt-in installs into a global ranking. Closes \"which subtitle providers actually deliver?\", a long-standing Bazarr feature request.\n- **Cross-install kwargs aggregation** ranked by verification outcomes, and **\"use community-best for &lt;language&gt;\"** one-click adoption (the global half of the Tuning Lab loop; gated on a trustworthy reference-free quality judge).\n- **Queue mutation**: promote, demote, reorder, pause. Requires a subgen queue-mutation patch first.\n- Jellyfin \u002F Emby support if there's demand.\n\n## The subgen patch story\n\nSubarr drives subgen through 20 small patches over upstream McCloudS\u002Fsubgen. Each is independent, idempotent on reapply, required for one specific subarr orchestration behaviour. Living patch stack at [`coaxk\u002Fsubarr-subgen`](https:\u002F\u002Fgithub.com\u002Fcoaxk\u002Fsubarr-subgen).\n\nThe maintained image is `ghcr.io\u002Fcoaxk\u002Fsubarr-subgen:\u003Ctag>`. Tagged releases: `v2026.05.3-r5` current, with `latest`, `stable` (7-day soak), and per-version tags.\n\nYou do not need our patched image. See the \"I already have subgen\" table at the top.\n\n## Development\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fcoaxk\u002Fsubarr\ncd subarr\npython -m venv .venv && source .venv\u002Fbin\u002Factivate\npip install -e .[dev]\nPYTHONPATH=src uvicorn subarr.app:app --reload --port 9922\nPYTHONPATH=src pytest -q                    # 603 passing\nnpm install && npm run build:frontend       # SPA bundles\n```\n\n## Related\n\n- [Bazarr](https:\u002F\u002Fgithub.com\u002Fmorpheus65535\u002Fbazarr), the librarian. Subarr reads its wanted list and writes back its scan-disk trigger.\n- [McCloudS\u002Fsubgen](https:\u002F\u002Fgithub.com\u002FMcCloudS\u002Fsubgen), the worker. Subarr drives it via the patches in [`coaxk\u002Fsubarr-subgen`](https:\u002F\u002Fgithub.com\u002Fcoaxk\u002Fsubarr-subgen).\n- [subsyncarr](https:\u002F\u002Fgithub.com\u002Fjohnpc\u002Fsubsyncarr), the synchroniser. Recommended companion for sync issues subarr does not tackle.\n\n## License\n\nMIT. See [LICENSE](LICENSE). The patched subgen image (`ghcr.io\u002Fcoaxk\u002Fsubarr-subgen`) is a derived work of upstream McCloudS\u002Fsubgen. See that repo's [`NOTICE`](https:\u002F\u002Fgithub.com\u002Fcoaxk\u002Fsubarr-subgen\u002Fblob\u002Fmain\u002FNOTICE) for attribution.\n",2,"2026-06-11 04:10:59","CREATED_QUERY"]