[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72348":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":24,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},72348,"speakr","murtaza-nasir\u002Fspeakr","murtaza-nasir","Speakr is a personal, self-hosted web application designed for transcribing audio recordings",null,"Python",3178,253,21,1,0,20,32,72,60,29.21,"GNU Affero General Public License v3.0",false,"master",true,[],"2026-06-12 02:03:02","\u003Cdiv align=\"center\">\n    \u003Cimg src=\"static\u002Fimg\u002Ficon-32x32.png\" alt=\"Speakr Logo\" width=\"32\"\u002F>\n\u003C\u002Fdiv>\n\n\u003Ch1 align=\"center\">Speakr\u003C\u002Fh1>\n\u003Cp align=\"center\">Self-hosted AI transcription and intelligent note-taking platform\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fwww.gnu.org\u002Flicenses\u002Fagpl-3.0\">\u003Cimg alt=\"AGPL v3\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-AGPL_v3-blue.svg\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmurtaza-nasir\u002Fspeakr\u002Factions\u002Fworkflows\u002Fdocker-publish.yml\">\u003Cimg alt=\"Docker Build\" src=\"https:\u002F\u002Fgithub.com\u002Fmurtaza-nasir\u002Fspeakr\u002Factions\u002Fworkflows\u002Fdocker-publish.yml\u002Fbadge.svg\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fhub.docker.com\u002Fr\u002Flearnedmachine\u002Fspeakr\">\u003Cimg alt=\"Docker Pulls\" 
src=\"https:\u002F\u002Fimg.shields.io\u002Fdocker\u002Fpulls\u002Flearnedmachine\u002Fspeakr\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmurtaza-nasir\u002Fspeakr\u002Freleases\u002Flatest\">\u003Cimg alt=\"Latest Version\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fversion-0.8.20--alpha-brightgreen.svg\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr\">Documentation\u003C\u002Fa> •\n  \u003Ca href=\"https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr\u002Fgetting-started\">Quick Start\u003C\u002Fa> •\n  \u003Ca href=\"https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr\u002Fscreenshots\">Screenshots\u003C\u002Fa> •\n  \u003Ca href=\"https:\u002F\u002Fhub.docker.com\u002Fr\u002Flearnedmachine\u002Fspeakr\">Docker Hub\u003C\u002Fa> •\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fmurtaza-nasir\u002Fspeakr\u002Freleases\">Releases\u003C\u002Fa>\n\u003C\u002Fp>\n\n---\n\n## Overview\n\nSpeakr transforms your audio recordings into organized, searchable, and intelligent notes. 
Built for privacy-conscious groups and individuals, it runs entirely on your own infrastructure, ensuring your sensitive conversations remain completely private.\n\n\u003Cdiv align=\"center\">\n    \u003Cimg src=\"docs\u002Fassets\u002Fimages\u002Fscreenshots\u002FMain view.png\" alt=\"Speakr Main Interface\" width=\"750\"\u002F>\n\u003C\u002Fdiv>\n\n## Key Features\n\n### Core Functionality\n- **Smart Recording & Upload** - Record directly in browser or upload existing audio files\n- **AI Transcription** - High-accuracy transcription with speaker identification\n- **Voice Profiles** - AI-powered speaker recognition with voice embeddings (requires WhisperX ASR service)\n- **REST API v1** - Complete API with Swagger UI for automation tools (n8n, Zapier, Make) and dashboard widgets\n- **Single Sign-On** - Authenticate with any OIDC provider (Keycloak, Azure AD, Google, Auth0, Pocket ID)\n- **Audio-Transcript Sync** - Click transcript to jump to audio, auto-highlight current text, follow mode for hands-free playback\n- **Interactive Chat** - Ask questions about your recordings and get AI-powered answers\n- **Inquire Mode** - Semantic search across all recordings using natural language\n- **Internationalization** - Full support for English, Spanish, French, German, Chinese, and Russian\n- **Beautiful Themes** - Light and dark modes with customizable color schemes\n\n### Collaboration & Sharing\n- **Internal Sharing** - Share recordings with specific users with granular permissions (view\u002Fedit\u002Freshare)\n- **Group Management** - Create groups with automatic sharing via group-scoped tags\n- **Public Sharing** - Generate secure links to share recordings externally (admin-controlled)\n- **Group Tags** - Tags that automatically share recordings with all group members\n\n### Organization & Management\n- **Smart Tagging** - Organize with tags that include custom AI prompts and ASR settings\n- **Tag Prompt Stacking** - Combine multiple tags to layer AI instructions for 
powerful transformations\n- **Tag Protection** - Prevent specific recordings from being auto-deleted\n- **Group Retention Policies** - Set custom retention periods per group tag\n- **Auto-Deletion** - Automatic cleanup of old recordings with flexible retention policies\n\n## Real-World Use Cases\n\nDifferent people use Speakr's collaboration and retention features in different ways:\n\n| Use Case | Setup | What It Does |\n|----------|-------|-------------|\n| **Family memories** | Create \"Family\" group with protected tag | Everyone gets access to trips and events automatically, recordings preserved forever |\n| **Book club discussions** | \"Book Club\" group, tag monthly meetings | All members auto-share discussions, can add personal notes about what resonated |\n| **Work project group** | Share individually with 3 teammates | Temporary collaboration, easy to revoke when project ends |\n| **Daily group standups** | Group tag with 14-day retention | Auto-share with group, auto-cleanup of routine meetings |\n| **Architecture decisions** | Engineering group tag, protected from deletion | Technical discussions automatically shared, preserved permanently as reference |\n| **Client consultations** | Individual share with view-only permission | Controlled external access, clients can't accidentally edit |\n| **Research interviews** | Protected tag + Obsidian export | Preserve recordings indefinitely, transcripts auto-import to note-taking system |\n| **Legal consultations** | Group tag with 7-year retention | Automatic sharing with legal group, compliance-based retention |\n| **Sales calls** | Group tag with 1-year retention | Whole sales group learns from each call, cleanup after sales cycle |\n\n### Creative Tag Prompt Examples\n\nTags with custom prompts transform raw recordings into exactly what you need:\n\n- **Recipe recordings**: Record yourself cooking while narrating - tag with \"Recipe\" to convert messy speech into formatted recipes with ingredient lists and 
numbered steps\n- **Lecture notes**: Students tag lectures with \"Study Notes\" to get organized outlines with concepts, examples, and definitions instead of raw transcripts\n- **Code reviews**: \"Code Review\" tag extracts issues, suggested changes, and action items in technical language developers can use directly\n- **Meeting summaries**: \"Action Items\" tag ignores discussion and returns just decisions, tasks, and deadlines\n\n### Tag Stacking for Combined Effects\n\nStack multiple tags to layer instructions:\n- \"Recipe\" + \"Gluten Free\" = Formatted recipe with gluten substitution suggestions\n- \"Lecture\" + \"Biology 301\" = Study notes format focused on biological terminology\n- \"Client Meeting\" + \"Legal Review\" = Client requirements plus legal implications highlighted\n\nThe order can matter - start with format tags, then add focus tags for best results.\n\n### Integration Examples\n\n- **Obsidian\u002FLogseq**: Enable auto-export to write completed transcripts directly to your vault using your custom template - no manual export needed\n- **Documentation wikis**: Map auto-export to your wiki's import folder for seamless transcript publishing\n- **Content creation**: Create SRT subtitle templates from your audio recordings for podcasts or video content\n- **Project management**: Extract action items with custom tag prompts, then auto-export for automated task creation\n\n## Quick Start\n\n### Using Docker (Recommended)\n\n```bash\n# Create project directory\nmkdir speakr && cd speakr\n\n# Download docker-compose configuration:\nwget https:\u002F\u002Fraw.githubusercontent.com\u002Fmurtaza-nasir\u002Fspeakr\u002Fmaster\u002Fconfig\u002Fdocker-compose.example.yml -O docker-compose.yml\n\n# Download the environment template:\nwget https:\u002F\u002Fraw.githubusercontent.com\u002Fmurtaza-nasir\u002Fspeakr\u002Fmaster\u002Fconfig\u002Fenv.transcription.example -O .env\n\n# Configure your API keys and launch\nnano .env\ndocker compose up -d\n\n# Access at 
http:\u002F\u002Flocalhost:8899\n```\n\n> **Lightweight image:** Use `learnedmachine\u002Fspeakr:lite` for a smaller image (~725MB vs ~4.4GB) that skips PyTorch. All features work normally — only Inquire Mode's semantic search falls back to basic text search.\n\n**Required API Keys:**\n- `TRANSCRIPTION_API_KEY` - For speech-to-text (OpenAI) or `ASR_BASE_URL` for self-hosted\n- `TEXT_MODEL_API_KEY` - For summaries, titles, and chat (OpenRouter or OpenAI)\n\n### Transcription Options\n\nSpeakr uses a **connector-based architecture** that auto-detects your transcription provider:\n\n| Option | Setup | Speaker Diarization | Voice Profiles |\n|--------|-------|---------------------|----------------|\n| **OpenAI Transcribe** | Just API key | ✅ `gpt-4o-transcribe-diarize` | ❌ |\n| **WhisperX ASR** | GPU container | ✅ Best quality | ✅ |\n| **Mistral Voxtral** | Just API key | ✅ Built-in | ❌ |\n| **VibeVoice ASR** | Self-hosted (vLLM) | ✅ Built-in | ❌ |\n| **Legacy Whisper** | Just API key | ❌ | ❌ |\n\n**Simplest setup (OpenAI with diarization):**\n```bash\nTRANSCRIPTION_API_KEY=sk-your-openai-key\nTRANSCRIPTION_MODEL=gpt-4o-transcribe-diarize\n```\n\n**Best quality (Self-hosted WhisperX):**\n```bash\nASR_BASE_URL=http:\u002F\u002Fwhisperx-asr:9000\nASR_RETURN_SPEAKER_EMBEDDINGS=true  # Enable voice profiles\n```\nRequires [WhisperX ASR Service](https:\u002F\u002Fgithub.com\u002Fmurtaza-nasir\u002Fwhisperx-asr-service) container with GPU.\n\n**Mistral Voxtral (cloud diarization):**\n```bash\nTRANSCRIPTION_CONNECTOR=mistral\nTRANSCRIPTION_API_KEY=your-mistral-key\nTRANSCRIPTION_MODEL=voxtral-mini-latest\n```\n\n**VibeVoice ASR (self-hosted, no cloud dependency):**\n```bash\nTRANSCRIPTION_CONNECTOR=vibevoice\nTRANSCRIPTION_BASE_URL=http:\u002F\u002Fyour-vllm-server:8000\nTRANSCRIPTION_MODEL=vibevoice\n```\nRequires [VibeVoice](https:\u002F\u002Fhuggingface.co\u002Fmicrosoft\u002FVibeVoice-ASR) served via vLLM with GPU.\n\n> **⚠️ PyTorch 2.6 Users:** If you encounter a 
\"Weights only load failed\" error with WhisperX, add `TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD=true` to your ASR container. See [troubleshooting](https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr\u002Ftroubleshooting#pytorch-26-weights-loading-error-whisperx-asr-service) for details.\n\n**[View Full Installation Guide →](https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr\u002Fgetting-started\u002Finstallation)**\n\n## Documentation\n\nComplete documentation is available at **[murtaza-nasir.github.io\u002Fspeakr](https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr)**\n\n- [Getting Started](https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr\u002Fgetting-started) - Quick setup guide\n- [User Guide](https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr\u002Fuser-guide\u002F) - Learn all features\n- [Admin Guide](https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr\u002Fadmin-guide\u002F) - Administration and configuration\n- [Troubleshooting](https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr\u002Ftroubleshooting) - Common issues and solutions\n- [FAQ](https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr\u002Ffaq) - Frequently asked questions\n\n## Latest Release (v0.8.20-alpha)\n\n**Security: open-redirect fix in `is_safe_url` (CWE-601).** Patch release on top of v0.8.19-alpha.\n\n- The `is_safe_url()` helper validated `urljoin(request.host_url, target)` while `redirect()` was called with the raw `target`. A scheme-relative input such as `\u002F\u002F\u002F\u002Fevil.com` resolved to a same-host URL during validation but was emitted verbatim in the `Location` header, where browsers interpret it as a network-path-relative redirect to an attacker-controlled host.\n- `is_safe_url()` now validates the raw target against a local-path allowlist: leading `\u002F` required, scheme-relative URLs (`\u002F\u002F`, `\u002F\\`), backslashes, control characters, and any value with a scheme or netloc are rejected. 
The duplicate copy in `src\u002Fapi\u002Fauth.py` was removed; password login and the SSO `next` \u002F callback flow share one validator.\n- Reported by **RacerZ and Fushuling**. Tracked as a GitHub Security Advisory; CVE pending. Users on v0.8.19-alpha or earlier should upgrade promptly.\n\nNo new features, no breaking changes.\n\n### Previous Release (v0.8.19-alpha)\n\n**Inquire-mode performance and re-embed reliability.** Patch release on top of v0.8.18-alpha. Vectorised chunk similarity search (60s → 2-3s on large libraries), embedding API retries with backoff, transactional rollback when a partial embedding response would otherwise drop chunks, and Re-embed all retry passes that include stale-chunk recordings regardless of status.\n\n### Previous Release (v0.8.18-alpha)\n\n**API v1 folder operations.** Patch release on top of v0.8.17-alpha (#274 follow-up).\n\n- `GET \u002Fapi\u002Fv1\u002Frecordings?folder_id=\u003Cid>` (or `?folder_id=none`) filters list responses by folder\n- `PATCH \u002Fapi\u002Fv1\u002Frecordings\u002F{id}` accepts `folder_id` to move a recording (or `null` to remove it from any folder)\n- `PATCH \u002Fapi\u002Fv1\u002Frecordings\u002Fbatch` accepts `folder_id` inside `updates` for bulk moves\n- OpenAPI schema documents all of the above plus the previously-undocumented batch fields (`is_inbox`, `is_highlighted`, `add_tag_ids`, `remove_tag_ids`)\n\nNo breaking changes. 
The folder *resource* endpoints (CRUD on `\u002Fapi\u002Fv1\u002Ffolders`) shipped in v0.8.16-alpha; this release lets recordings actually be moved into and out of those folders.\n\n### Previous Release (v0.8.17-alpha)\n\n**Bug fixes and CI maintenance.** Patch release on top of v0.8.16-alpha.\n\n- Reprocess summary modal: prompt-variables panel and Append\u002FReplace toggle now reflect the prompt source the user actually picked (was showing the recording's original tag variables and offering Append\u002FReplace for tag-source prompts where it does not apply)\n- Docs: corrected reverse-proxy nginx example so the WebSocket `Connection: upgrade` header is forwarded conditionally rather than set unconditionally (caused 500s on file uploads through the proxy with Gunicorn). Added an Nginx Proxy Manager section noting that NPM's default `client_max_body_size` is `2000m` and that the `Advanced` tab is the right place for per-host overrides.\n- CI: bumped all GitHub Actions to Node 24 versions to clear deprecation warnings.\n\nNo new features, no breaking changes.\n\n### Previous Release (v0.8.16-alpha)\n\n**Prompt Templating, Transcription UX Polish, Per-Recording Model Selection, and Observability**\n\n**Prompt templating and summary control**\n\n- **Prompt Template Variables** - Tag, folder, user-default, and admin-default summary prompts can contain `{{name}}` placeholders. Selecting a tag with `{{agenda}}` exposes an agenda input on the upload form; the value is stored on the recording, substituted into the prompt at summarisation time, and remains editable from the reprocess summary modal. Caps: 8,000 chars per value, 32,000 total. 
Single-pass `re.sub` substitution so values cannot introduce new placeholders or reach Python attributes.\n- **Append vs Replace Mode** - The reprocess summary modal and the new \"Customise summary prompt\" modal each let you Append text to the resolved prompt (combine your saved prompt with extra context) or Replace it entirely (use only the text you paste). Append mode runs variable substitution after the append step so appended text can use the same `{{var}}` placeholders.\n- **Customise Summary Prompt Split-Button** - A new control next to **Generate Summary** opens the Append\u002FReplace modal for recordings that don't have a summary yet, so one-off context (an agenda, custom focus instructions) can be passed in without rewriting your saved prompt.\n- **Full LLM Prompt Structure Preview** - Both the admin Default Prompts page and the user Customise-prompts tab now show the complete two-message payload (system prompt with context block, user message with transcription wrapper and language directive). Placeholder chips colour-code system tokens (blue, replaced by the framework) versus user-supplied variables (amber). The user-side preview re-renders live as you type into your custom prompt.\n\n**Per-recording transcription control**\n\n- **Per-Upload \u002F Per-Tag \u002F Per-Folder Transcription Model** - Set `TRANSCRIPTION_MODELS_AVAILABLE` and the upload form, reprocess modal, and tag\u002Ffolder edit forms all gain a model dropdown. Tag and folder edit forms warn if a previously-selected default is no longer in the configured list. The dropdown is hidden when only one option would be visible.\n- **Admin-Managed Transcription Model List** - When the connector exposes `\u002Fv1\u002Fmodels` discovery, admins can curate the list from the dashboard rather than via env var. 
Stored in the database; overrides `TRANSCRIPTION_MODELS_AVAILABLE` when set.\n- **Per-Connector Capability Gating** - The hotwords, initial-prompt, and speaker-count UI elements are now hidden for connectors that don't support them, instead of accepting input that is silently ignored.\n- **Mistral Voxtral Chunking** - `MISTRAL_ENABLE_CHUNKING=true` (with `MISTRAL_MAX_DURATION_SECONDS`) opts the Mistral connector into app-side chunking for recordings approaching Voxtral's 3-hour timeout.\n\n**ASR transcript editor**\n\n- **Autosave** - Saves edits 2 seconds after the last keystroke when the user opts in (`Account → Preferences → Autosave editor`).\n- **Save Without Closing + Ctrl+S** - New button keeps the editor open after saving; Ctrl+S triggers a save from anywhere in the editor.\n- **Scroll Memory** - Reopening the editor restores the previous scroll position instead of jumping to the top.\n- **Double-Click to Edit** - Double-clicking a transcript row in the simple view jumps into the editor with that segment highlighted.\n- **Row Highlight After Jump** - Briefly tints the row when navigating into it from the simple view so the target is obvious.\n\n**Account preferences**\n\n- **Preferences Tab** - Account settings has a new **Preferences** tab (split from the language settings) using a two-column layout for transcript display, editor behaviour, and language preferences.\n- **Compact Timestamps** - Optional `mm:ss` (or `h:mm:ss`) timestamps in the simple transcript view, rendered as a two-part pill alongside the speaker label. 
The leading segment shows \"Start\" instead of `00:00`.\n- **Persist Recording-List Sort** - The Created date \u002F Meeting date toggle now sticks across reloads and sessions on the same browser (#263).\n\n**Embeddings and inquire mode**\n\n- **Configurable Embedding Model** - `EMBEDDING_MODEL` swaps `all-MiniLM-L6-v2` for any sentence-transformers model.\n- **API-Mode Embeddings** - `EMBEDDING_BASE_URL`, `EMBEDDING_API_KEY`, and `EMBEDDING_DIMENSIONS` route embeddings through any OpenAI-compatible provider (vLLM, OpenRouter, OpenAI, Together, etc.). Inquire startup banner reflects the active provider.\n- **Embedding Token Tracking + Re-Embed-All** - The Vector Store admin tab now tracks embedding API token usage and cost separately from LLM usage, and exposes a \"Re-embed all\" action for after a model or dimensionality change. Speakr warns at startup if the embedding identifier changed since data was stored.\n\n**Observability and admin**\n\n- **Per-Operation Token Stats** - Admin token statistics now break out title, summary, chat, event extraction, and embeddings as separate categories with their own cards and charts. Embedding usage is shown as a distinct cost line.\n- **Granular Token Budgets** - `TITLE_MAX_TOKENS` and `EVENT_MAX_TOKENS` join the existing `SUMMARY_MAX_TOKENS` \u002F `CHAT_MAX_TOKENS` so reasoning models that consume budget on hidden thinking tokens can be tuned per operation. 
The resolved `max_tokens` is logged with each LLM call.\n- **LLM Timeout Visibility** - The configured `LLM_REQUEST_TIMEOUT` is logged at startup, and `APITimeoutError` log entries now include elapsed time so it is clear whether the timeout was the actual bound that fired.\n\n**API v1**\n\n- **Folder CRUD** - New `\u002Fapi\u002Fv1\u002Ffolders` endpoints for list, create, update, delete.\n- **Connector Discovery** - New endpoint exposing the active transcription connector and its capabilities for companion-app integrations.\n- **Recording Field Parity** - `\u002Fapi\u002Fv1\u002Frecordings` and `\u002Fapi\u002Fv1\u002Frecordings\u002F{id}` now expose `audio_duration`, transcription\u002Fsummarization durations, folder, events (detail only), `deletion_exempt`, `prompt_variables`, and the per-recording transcription model.\n- **Forwarded Per-Request Overrides** - The `\u002Fapi\u002Fv1\u002Ftranscribe` endpoint now forwards `transcription_model`, `hotwords`, and `initial_prompt`. The custom-ASR-endpoint connector forwards a `?model=` query param so WhisperX runtime model switching works through the API.\n\n**Bug fixes**\n\n- Reprocessing now applies tag\u002Ffolder\u002Fuser default hotwords + initial_prompt (#265, previously only at upload time)\n- Legacy user records with `transcription_language=\"français\"` are normalised to ISO 639-1 codes on upgrade so WhisperX no longer 500s on display names (#256)\n- Title generation no longer leaks `\\\\uXXXX` escape sequences into the LLM prompt for non-ASCII transcripts; truncation now happens after `format_transcription_for_llm` (#260)\n- The Vector Store \"recordings to process\" message now uses the i18n params API instead of inline brace replace\n- CSRF token added to the Preferences form so submissions are accepted\n\n**Infrastructure**\n\n- **Vitest Frontend Tests** - Pure-helper modules in `static\u002Fjs\u002Fmodules\u002Futils\u002F` are now covered by Vitest. Run `npm test`. 
Currently exercises the prompt-variable extraction and priority-chain logic.\n\n**Docs**\n\n- nginx reverse-proxy `proxy_request_buffering off` and `client_max_body_size` notes for large uploads\n- Google Gemini OpenAI-compatible endpoint setup example\n- Prompt template variables guide\n- Per-upload \u002F per-tag \u002F per-folder model selection documentation\n- `EMBEDDING_BASE_URL` API mode documentation across inquire-mode, vector-store, and troubleshooting\n\n---\n\n**Older releases:** see the [GitHub Releases page](https:\u002F\u002Fgithub.com\u002Fmurtaza-nasir\u002Fspeakr\u002Freleases) for tagged versions, or the [release history on the docs site](https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr\u002F#latest-updates) for narrative changelog entries going back to earlier v0.x lines.\n\n## Screenshots\n\n\u003Ctable align=\"center\" border=\"0\">\n  \u003Ctr>\n    \u003Ctd align=\"center\">\n      \u003Cimg src=\"docs\u002Fassets\u002Fimages\u002Fscreenshots\u002FMain view.png\" alt=\"Main Screen\" width=\"400\"\u002F>\n      \u003Cbr>\u003Cem>Main Screen with Chat\u003C\u002Fem>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      \u003Cimg src=\"docs\u002Fassets\u002Fimages\u002Fscreenshots\u002Fvideo-playback.png\" alt=\"Video Playback\" width=\"400\"\u002F>\n      \u003Cbr>\u003Cem>Video Playback with Transcript\u003C\u002Fem>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n  \u003Ctr>\n    \u003Ctd align=\"center\">\n      \u003Cimg src=\"docs\u002Fassets\u002Fimages\u002Fscreenshots\u002FInquire mode.png\" alt=\"Inquire Mode\" width=\"400\"\u002F>\n      \u003Cbr>\u003Cem>AI-Powered Semantic Search\u003C\u002Fem>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      \u003Cimg src=\"docs\u002Fassets\u002Fimages\u002Fscreenshots\u002Fchat-interface.png\" alt=\"Transcription with Chat\" width=\"400\"\u002F>\n      \u003Cbr>\u003Cem>Interactive Transcription & Chat\u003C\u002Fem>\n    \u003C\u002Ftd>\n  
\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n**[View Full Screenshot Gallery →](https:\u002F\u002Fmurtaza-nasir.github.io\u002Fspeakr\u002Fscreenshots)**\n\n## Technology Stack\n\n- **Backend**: Python\u002FFlask with SQLAlchemy\n- **Frontend**: Vue.js 3 with Tailwind CSS\n- **AI\u002FML**: OpenAI Whisper, OpenRouter, Ollama support\n- **Database**: SQLite (default) or PostgreSQL\n- **Deployment**: Docker, Docker Compose\n\n## Roadmap\n\n### Completed\n- ✅ Speaker voice profiles with AI-powered identification (v0.5.9)\n- ✅ Group workspaces with shared recordings (v0.5.9)\n- ✅ PWA enhancements with offline support and background sync (v0.5.10)\n- ✅ Multi-user job queue with fair scheduling (v0.6.0)\n- ✅ SSO integration with OIDC providers (v0.7.0)\n- ✅ Token usage tracking and per-user budgets (v0.7.2)\n- ✅ Connector-based transcription architecture with auto-detection (v0.8.0)\n- ✅ Comprehensive REST API with Swagger UI documentation (v0.8.0)\n- ✅ Video retention with in-browser video playback (v0.8.11)\n- ✅ Parallel uploads with duplicate detection (v0.8.11)\n- ✅ Fullscreen video mode with live subtitles (v0.8.14)\n- ✅ Custom vocabulary and transcription hints (v0.8.14)\n\n### Near-term\n- Quick language switching for transcription\n- Automated workflow triggers\n\n### Long-term\n- Plugin system for custom integrations\n- End-to-end encryption option\n\n### Reporting Issues\n\n- [Report bugs](https:\u002F\u002Fgithub.com\u002Fmurtaza-nasir\u002Fspeakr\u002Fissues)\n- [Request features](https:\u002F\u002Fgithub.com\u002Fmurtaza-nasir\u002Fspeakr\u002Fdiscussions)\n\n## License\n\nThis project is **dual-licensed**:\n\n1.  **GNU Affero General Public License v3.0 (AGPLv3)**\n    [![License: AGPL v3](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-AGPL_v3-blue.svg)](https:\u002F\u002Fwww.gnu.org\u002Flicenses\u002Fagpl-3.0)\n\n    Speakr is offered under the AGPLv3 as its open-source license. 
You are free to use, modify, and distribute this software under the terms of the AGPLv3. A key condition of the AGPLv3 is that if you run a modified version on a network server and provide access to it for others, you must also make the source code of your modified version available to those users under the AGPLv3.\n\n    * You **must** create a file named `LICENSE` (or `COPYING`) in the root of your repository and paste the full text of the [GNU AGPLv3 license](https:\u002F\u002Fwww.gnu.org\u002Flicenses\u002Fagpl-3.0.txt) into it.\n    * Read the full license text carefully to understand your rights and obligations.\n\n2.  **Commercial License**\n\n    For users or organizations who cannot or do not wish to comply with the terms of the AGPLv3 (for example, if you want to integrate Speakr into a proprietary commercial product or service without being obligated to share your modifications under AGPLv3), a separate commercial license is available.\n\n    Please contact **speakr maintainers** for details on obtaining a commercial license.\n\n**You must choose one of these licenses** under which to use, modify, or distribute this software. If you are using or distributing the software without a commercial license agreement, you must adhere to the terms of the AGPLv3.\n\n## Contributing\n\nWe welcome contributions to Speakr! There are many ways to help:\n\n- **Bug Reports & Feature Requests**: [Open an issue](https:\u002F\u002Fgithub.com\u002Fmurtaza-nasir\u002Fspeakr\u002Fissues)\n- **Discussions**: [Share ideas and ask questions](https:\u002F\u002Fgithub.com\u002Fmurtaza-nasir\u002Fspeakr\u002Fdiscussions)\n- **Documentation**: Help improve our docs\n- **Translations**: Contribute translations for internationalization\n\n### Code Contributions\n\nBy submitting a pull request, you agree to our [Contributor License Agreement (CLA)](CLA.md). This ensures we can maintain our dual-license model (AGPLv3 and Commercial). 
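You retain copyright ownership of your contribution; the CLA simply grants us permission to include it in both the open-source and commercial versions of Speakr. Our bot will post a reminder when you open a PR.\n\n**See our [Contributing Guide](CONTRIBUTING.md) for complete details on:**\n- How the CLA works and why we need it\n- Step-by-step contribution process\n- Development setup instructions\n- Coding standards and best practices\n","Speakr is a personal, self-hosted web application designed to transcribe audio recordings into text. It uses AI for high-accuracy speech transcription and offers intelligent note-taking features such as speech recognition, semantic search, and interactive chat. Users can record directly in the browser or upload existing audio files, which are then transcribed accurately with automatic speaker identification. Speakr also provides a rich API for integrating automation tools, along with single sign-on, ensuring data security and privacy. The platform suits business teams and individuals who need to protect sensitive conversations, and is particularly well suited to meeting notes and interview write-ups.",2,"2026-06-11 03:41:27","high_star"]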