[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-73552":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":25,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":30,"readmeContent":31,"aiSummary":32,"trendingCount":16,"starSnapshotCount":16,"syncStatus":33,"lastSyncTime":34,"discoverSource":35},73552,"summarize","steipete\u002Fsummarize","steipete","Point at any URL\u002FYouTube\u002FPodcast or file. Get the gist. CLI and Chrome Extension.","https:\u002F\u002Fsummarize.sh",null,"TypeScript",6138,402,18,4,0,26,61,242,78,38.82,"MIT License",false,"main",true,[27,28,5,29],"ai","cli","typescript","2026-06-12 02:03:14","# Summarize 📝 — Chrome Side Panel + CLI\n\nFast summaries from URLs, files, and media. 
Works in the terminal, a Chrome Side Panel and Firefox Sidebar.\n\n## Highlights\n\n- Chrome Side Panel **chat** (streaming agent + history) inside the sidebar.\n- **Video slides**: screenshots + OCR + transcript cards for YouTube, direct video URLs, and local video files.\n- Media-aware summaries: auto‑detect video\u002Faudio vs page content.\n- Coding CLI backends: Codex, Claude, Gemini, Cursor Agent, OpenClaw, OpenCode.\n- Streaming Markdown + metrics + cache‑aware status.\n- CLI supports URLs, files, podcasts, YouTube, audio\u002Fvideo, PDFs.\n\n## Feature overview\n\n- URLs, files, and media: web pages, PDFs, images, audio\u002Fvideo, YouTube, podcasts, RSS.\n- Slide extraction for video sources (YouTube, direct video URLs, local video files) with OCR + timestamped cards.\n- Transcript-first media flow: published transcripts when available, then Groq\u002FONNX\u002Fwhisper.cpp\u002FAssemblyAI\u002FGemini\u002FOpenAI\u002FFAL transcription fallback when not.\n- Coding CLI providers: Claude, Codex, Gemini, Cursor Agent, OpenClaw, OpenCode.\n- Streaming output with Markdown rendering, metrics, and cache-aware status.\n- Local, paid, and free models: OpenAI‑compatible local endpoints, paid providers, plus an OpenRouter free preset.\n- Output modes: Markdown\u002Ftext, JSON diagnostics, extract-only, metrics, timing, and cost estimates.\n- Smart default: if content is shorter than the requested length, we return it as-is (use `--force-summary` to override).\n\n## Get the extension (recommended)\n\n![Summarize extension screenshot](docs\u002Fassets\u002Fsummarize-extension.png)\n\nOne‑click summarizer for the current tab. 
Chrome Side Panel + Firefox Sidebar + local daemon for streaming Markdown.\n\n**Chrome Web Store:** [Summarize Side Panel](https:\u002F\u002Fchromewebstore.google.com\u002Fdetail\u002Fsummarize\u002Fcejgnmmhbbpdmjnfppjdfkocebngehfg)\n\nYouTube slide screenshots (from the browser):\n\n![Summarize YouTube slide screenshots](docs\u002Fassets\u002Fyoutube-slides.png)\n\n### Beginner quickstart (extension)\n\n1. Install the CLI (choose one):\n   - **npm** (cross‑platform): `npm i -g @steipete\u002Fsummarize`\n   - **Homebrew** (Homebrew\u002Fcore): `brew install summarize`\n2. Install the extension (Chrome Web Store link above) and open the Side Panel.\n3. The panel shows a token + install command. Run it in Terminal:\n   - `summarize daemon install --token \u003CTOKEN>`\n\nWhy a daemon\u002Fservice?\n\n- The extension can’t run heavy extraction inside the browser. It talks to a local background service on `127.0.0.1` for fast streaming and media tools (yt‑dlp, ffmpeg, OCR, transcription).\n- The service autostarts (launchd\u002Fsystemd\u002FScheduled Task) so the Side Panel is always ready.\n\nIf you only want the **CLI**, you can skip the daemon install entirely.\n\nNotes:\n\n- Summarization only runs when the Side Panel is open.\n- Auto mode summarizes on navigation (incl. SPAs); otherwise use the button.\n- Daemon is localhost-only and requires a shared token; rerunning `summarize daemon install --token \u003CTOKEN>` adds another paired browser token instead of invalidating the old one.\n- Autostart: macOS (launchd), Linux (systemd user), Windows (Scheduled Task).\n- Windows containers: `summarize daemon install` starts the daemon for the current container session but does not register a Scheduled Task. Run it each time the container starts or add that command to your container startup, and publish port `8787` so the host browser can reach the daemon.\n- Tip: configure `free` via `summarize refresh-free` (needs `OPENROUTER_API_KEY`). 
Add `--set-default` to set model=`free`.\n\nMore:\n\n- Step-by-step install: [apps\u002Fchrome-extension\u002FREADME.md](apps\u002Fchrome-extension\u002FREADME.md)\n- Architecture + troubleshooting: [docs\u002Fchrome-extension.md](docs\u002Fchrome-extension.md)\n- Firefox compatibility notes: [apps\u002Fchrome-extension\u002Fdocs\u002Ffirefox.md](apps\u002Fchrome-extension\u002Fdocs\u002Ffirefox.md)\n\n### Slides (extension)\n\n- Select **Video + Slides** in the Summarize picker.\n- Slides render at the top; expand to full‑width cards with timestamps.\n- Click a slide to seek the video; toggle **Transcript\u002FOCR** when OCR is significant.\n- Requirements: `yt-dlp` + `ffmpeg` for extraction; `tesseract` for OCR. Missing tools show an in‑panel notice.\n\n### Advanced (unpacked \u002F dev)\n\n1. Build + load the extension (unpacked):\n   - Chrome: `pnpm -C apps\u002Fchrome-extension build`\n     - `chrome:\u002F\u002Fextensions` → Developer mode → Load unpacked\n     - Pick: `apps\u002Fchrome-extension\u002F.output\u002Fchrome-mv3`\n   - Firefox: `pnpm -C apps\u002Fchrome-extension build:firefox`\n     - `about:debugging#\u002Fruntime\u002Fthis-firefox` → Load Temporary Add-on\n     - Pick: `apps\u002Fchrome-extension\u002F.output\u002Ffirefox-mv3\u002Fmanifest.json`\n2. Open Side Panel\u002FSidebar → copy token.\n3. 
Install daemon in dev mode:\n   - `pnpm summarize daemon install --token \u003CTOKEN> --dev`\n\n## CLI\n\n![Summarize CLI screenshot](docs\u002Fassets\u002Fsummarize-cli.png)\n\n### Install\n\nRequires Node 24+.\n\n- npx (no install):\n\n```bash\nnpx -y @steipete\u002Fsummarize \"https:\u002F\u002Fexample.com\"\n```\n\n- npm (global):\n\n```bash\nnpm i -g @steipete\u002Fsummarize\n```\n\n- npm (library \u002F minimal deps):\n\n```bash\nnpm i @steipete\u002Fsummarize-core\n```\n\n```ts\nimport { createLinkPreviewClient } from \"@steipete\u002Fsummarize-core\u002Fcontent\";\n```\n\n- Homebrew:\n\n```bash\nbrew install summarize\n```\n\nHomebrew ships from `homebrew\u002Fcore` via `brew install summarize`.\nIf Homebrew is unavailable in your environment, use the npm global install above.\n\n### Optional local dependencies\n\nInstall these if you want media-heavy features:\n\n- `ffmpeg`: required for `--slides` and many local media\u002Ftranscription flows\n- `yt-dlp`: required for YouTube slide extraction and some remote media flows\n- `tesseract`: optional OCR for `--slides-ocr`\n- Optional cloud transcription providers:\n  - `GROQ_API_KEY`\n  - `ASSEMBLYAI_API_KEY`\n  - `GEMINI_API_KEY` \u002F `GOOGLE_GENERATIVE_AI_API_KEY` \u002F `GOOGLE_API_KEY`\n  - `OPENAI_API_KEY`\n  - `FAL_KEY`\n\nmacOS (Homebrew):\n\n```bash\nbrew install ffmpeg yt-dlp\nbrew install tesseract # optional, for --slides-ocr\n```\n\nIf `--slides` is enabled and these tools are missing, Summarize warns and continues without slides.\n\n### CLI vs extension\n\n- **CLI only:** just install via npm\u002FHomebrew and run `summarize ...` (no daemon needed).\n- **Chrome\u002FFirefox extension:** install the CLI **and** run `summarize daemon install --token \u003CTOKEN>` so the Side Panel can stream results and use local tools.\n\n### Quickstart\n\n```bash\nsummarize \"https:\u002F\u002Fexample.com\"\n```\n\n### Inputs\n\nURLs or local paths:\n\n```bash\nsummarize \"\u002Fpath\u002Fto\u002Ffile.pdf\" 
--model google\u002Fgemini-3-flash\nsummarize \"https:\u002F\u002Fexample.com\u002Freport.pdf\" --model google\u002Fgemini-3-flash\nsummarize \"\u002Fpath\u002Fto\u002Faudio.mp3\"\nsummarize \"\u002Fpath\u002Fto\u002Fvideo.mp4\"\n```\n\nStdin (pipe content using `-`):\n\n```bash\necho \"content\" | summarize -\npbpaste | summarize -\n# binary stdin also works (PDF\u002Fimage\u002Faudio\u002Fvideo bytes)\ncat \u002Fpath\u002Fto\u002Ffile.pdf | summarize -\n```\n\n**Notes:**\n\n- Stdin has a 50MB size limit\n- The `-` argument tells summarize to read from standard input\n- Text stdin is treated as UTF-8 text (whitespace-only input is rejected as empty)\n- Binary stdin is preserved as raw bytes and file type is auto-detected when possible\n- Useful for piping clipboard content or command output\n\nYouTube (supports `youtube.com` and `youtu.be`):\n\n```bash\nsummarize \"https:\u002F\u002Fyoutu.be\u002FdQw4w9WgXcQ\" --youtube auto\n```\n\nPodcast RSS (transcribes latest enclosure):\n\n```bash\nsummarize \"https:\u002F\u002Ffeeds.npr.org\u002F500005\u002Fpodcast.xml\"\n```\n\nApple Podcasts episode page:\n\n```bash\nsummarize \"https:\u002F\u002Fpodcasts.apple.com\u002Fus\u002Fpodcast\u002F2424-jelly-roll\u002Fid360084272?i=1000740717432\"\n```\n\nSpotify episode page (best-effort; may fail for exclusives):\n\n```bash\nsummarize \"https:\u002F\u002Fopen.spotify.com\u002Fepisode\u002F5auotqWAXhhKyb9ymCuBJY\"\n```\n\nHLS playlist:\n\n```bash\nsummarize \"https:\u002F\u002Fexample.com\u002Fmaster.m3u8\"\n```\n\n### Output length\n\n`--length` controls how much output we ask for (guideline), not a hard cap.\n\nSet a default in `~\u002F.summarize\u002Fconfig.json` with `output.length`.\n\n```bash\nsummarize \"https:\u002F\u002Fexample.com\" --length long\nsummarize \"https:\u002F\u002Fexample.com\" --length 20k\n```\n\n- Presets: `short|medium|long|xl|xxl`\n- Character targets: `1500`, `20k`, `20000`\n- Optional hard cap: `--max-output-tokens \u003Ccount>` (e.g. 
`2000`, `2k`)\n  - Provider\u002Fmodel APIs still enforce their own maximum output limits.\n  - If omitted, no max token parameter is sent (provider default).\n  - Prefer `--length` unless you need a hard cap.\n- Short content: when extracted content is shorter than the requested length, the CLI returns the content as-is.\n  - Override with `--force-summary` to always run the LLM.\n- Minimums: `--length` numeric values must be >= 50 chars; `--max-output-tokens` must be >= 16.\n- Preset targets (source of truth: `packages\u002Fcore\u002Fsrc\u002Fprompts\u002Fsummary-lengths.ts`):\n  - short: target ~900 chars (range 600-1,200)\n  - medium: target ~1,800 chars (range 1,200-2,500)\n  - long: target ~4,200 chars (range 2,500-6,000)\n  - xl: target ~9,000 chars (range 6,000-14,000)\n  - xxl: target ~17,000 chars (range 14,000-22,000)\n\n### What file types work?\n\nBest effort and provider-dependent. These usually work well:\n\n- `text\u002F*` and common structured text (`.txt`, `.md`, `.json`, `.yaml`, `.xml`, ...)\n  - Text-like files are inlined into the prompt for better provider compatibility.\n- PDFs: `application\u002Fpdf` (provider support varies; Google is the most reliable here)\n- Images: `image\u002Fjpeg`, `image\u002Fpng`, `image\u002Fwebp`, `image\u002Fgif`\n- Audio\u002FVideo: `audio\u002F*`, `video\u002F*` (local audio\u002Fvideo files MP3\u002FWAV\u002FM4A\u002FOGG\u002FFLAC\u002FMP4\u002FMOV\u002FWEBM automatically transcribed, when supported by the model)\n\nNotes:\n\n- If a provider rejects a media type, the CLI fails fast with a friendly message.\n- xAI models do not support attaching generic files (like PDFs) via the AI SDK; use Google\u002FOpenAI\u002FAnthropic for those.\n\n### Model ids\n\nUse gateway-style ids: `\u003Cprovider>\u002F\u003Cmodel>`.\n\nExamples:\n\n- `openai\u002Fgpt-5.4`\n- `openai\u002Fgpt-5.4-mini`\n- `openai\u002Fgpt-5.4-nano`\n- `openai\u002Fgpt-5-mini`\n- `openai\u002Fgpt-5-nano`\n- `github-copilot\u002Fgpt-5.4`\n- 
`anthropic\u002Fclaude-sonnet-4-5`\n- `xai\u002Fgrok-4-fast-non-reasoning`\n- `google\u002Fgemini-3-flash`\n- `zai\u002Fglm-4.7`\n- `openrouter\u002Fopenai\u002Fgpt-5-mini` (force OpenRouter)\n\nNote: some models\u002Fproviders do not support streaming or certain file media types. When that happens, the CLI prints a friendly error (or auto-disables streaming for that model when supported by the provider).\n`gpt-5.4-mini` and `gpt-5.4-nano` are treated as real model ids; the same shorthand also works under `github-copilot\u002F...`.\n\n### OpenAI fast mode and thinking\n\nFast mode is a request option, not a model id:\n\n```bash\nsummarize \"https:\u002F\u002Fexample.com\" --model openai\u002Fgpt-5.5 --fast --thinking medium\nsummarize \"https:\u002F\u002Fexample.com\" --model openai\u002Fgpt-5.4 --service-tier fast --thinking low\n```\n\n- `--fast` is shorthand for `--service-tier fast`.\n- `--service-tier default|fast|priority|flex` controls OpenAI service tier. `fast` is the summarize\u002FCodex-facing spelling and is sent to OpenAI as `service_tier=\"priority\"`.\n- `--thinking none|low|medium|high|xhigh` controls OpenAI reasoning effort. 
Aliases: `off` → `none`, `min` → `low`, `mid` \u002F `med` → `medium`, `x-high` \u002F `extra-high` → `xhigh`.\n- `--service-tier default` clears a configured tier for one run.\n\nConfig equivalent:\n\n```json\n{\n  \"model\": \"openai\u002Fgpt-5.5\",\n  \"openai\": {\n    \"serviceTier\": \"fast\",\n    \"thinking\": \"medium\"\n  }\n}\n```\n\nCompatibility aliases still work, but prefer the explicit flags above:\n\n- `--model gpt-fast` \u002F `--model fast` → `openai\u002Fgpt-5.5` + fast tier + medium thinking\n- `--model openai\u002Fgpt-5.5-fast` → `openai\u002Fgpt-5.5` + fast tier\n\n### Limits\n\n- Text inputs over 10 MB are rejected before tokenization.\n- Text prompts are preflighted against the model input limit (LiteLLM catalog), using a GPT tokenizer.\n\n### Common flags\n\n```bash\nsummarize \u003Cinput> [flags]\n```\n\nUse `summarize --help` or `summarize help` for the full help text.\n\n- `--model \u003Cprovider\u002Fmodel>`: which model to use (defaults to `auto`)\n- `--model auto`: automatic model selection + fallback (default)\n- `--model \u003Cname>`: use a built-in or config-defined preset (see Configuration)\n- `--timeout \u003Cduration>`: `30s`, `2m`, `5000ms` (default `2m`)\n- `--retries \u003Ccount>`: LLM retry attempts on timeout (default `1`)\n- `--length short|medium|long|xl|xxl|s|m|l|\u003Cchars>`\n- `--language, --lang \u003Clanguage>`: output language (`auto` = match source)\n- `--max-output-tokens \u003Ccount>`: hard cap for LLM output tokens\n- `--cli [provider]`: use a CLI provider (`--model cli\u002F\u003Cprovider>`). Supports `claude`, `gemini`, `codex`, `agent`, `openclaw`, `opencode`. 
If omitted, uses auto selection with CLI enabled.\n- `--stream auto|on|off`: stream LLM output (`auto` = TTY only; disabled in `--json` mode)\n- `--plain`: keep raw output (no ANSI\u002FOSC Markdown rendering)\n- `--no-color`: disable ANSI colors\n- `--theme \u003Cname>`: CLI theme (`aurora`, `ember`, `moss`, `mono`)\n- `--format md|text`: website\u002Ffile content format (default `text`)\n- `--markdown-mode off|auto|llm|readability`: HTML -> Markdown mode (default `readability`)\n- `--preprocess off|auto|always`: controls `uvx markitdown` usage (default `auto`)\n  - Install `uvx`: `brew install uv` (or https:\u002F\u002Fastral.sh\u002Fuv\u002F)\n  - Image-only PDFs can fall back to OpenAI vision OCR when `OPENAI_API_KEY` is set; override the OCR model with `MARKITDOWN_OCR_MODEL` or page render DPI with `MARKITDOWN_OCR_DPI`.\n- `--extract`: print extracted content and exit (URLs only; stdin `-` is not supported)\n  - Deprecated alias: `--extract-only`\n- `--slides`: extract slides for YouTube, direct video URLs, or local video files and render them inline in the summary narrative (auto-renders inline in supported terminals)\n- `--slides-ocr`: run OCR on extracted slides (requires `tesseract`)\n- `--slides-dir \u003Cdir>`: base output dir for slide images (default `.\u002Fslides`)\n- `--slides-scene-threshold \u003Cvalue>`: scene detection threshold (0.1-1.0)\n- `--slides-max \u003Ccount>`: maximum slides to extract (default `6`)\n- `--slides-min-duration \u003Cseconds>`: minimum seconds between slides\n- `--json`: machine-readable output with diagnostics, prompt, `metrics`, and optional summary\n- `--verbose`: debug\u002Fdiagnostics on stderr\n- `--metrics off|on|detailed`: metrics output (default `on`)\n\n### Coding CLIs (Codex, Claude, Gemini, Agent, OpenClaw, OpenCode)\n\nSummarize can use common coding CLIs as local model backends:\n\n- `codex` -> `--cli codex` \u002F `--model cli\u002Fcodex\u002F\u003Cmodel>`\n- `claude` -> `--cli claude` \u002F `--model 
cli\u002Fclaude\u002F\u003Cmodel>`\n- `gemini` -> `--cli gemini` \u002F `--model cli\u002Fgemini\u002F\u003Cmodel>`\n- `agent` (Cursor Agent CLI) -> `--cli agent` \u002F `--model cli\u002Fagent\u002F\u003Cmodel>`\n- `openclaw` -> `--cli openclaw` \u002F `--model cli\u002Fopenclaw\u002F\u003Cmodel>` or `--model openclaw\u002F\u003Cmodel>`\n- `opencode` -> `--cli opencode` \u002F `--model cli\u002Fopencode\u002F\u003Cmodel>` (`--model cli\u002Fopencode` uses the OpenCode runtime default)\n\nBuilt-in preset:\n\n- `--model codex-fast` runs Codex with GPT-5.5 Fast mode and requires `codex login`.\n\nRequirements:\n\n- Binary installed and on `PATH` (or set `CODEX_PATH`, `CLAUDE_PATH`, `GEMINI_PATH`, `AGENT_PATH`, `OPENCLAW_PATH`, `OPENCODE_PATH`)\n- Provider authenticated (`codex login`, `claude auth`, `gemini` login flow, `agent login` or `CURSOR_API_KEY`, `opencode auth login`)\n\nQuick smoke test:\n\n```bash\nprintf \"Summarize CLI smoke input.\\nOne short paragraph. Reply can be brief.\\n\" >\u002Ftmp\u002Fsummarize-cli-smoke.txt\n\nsummarize --cli codex --plain --timeout 2m \u002Ftmp\u002Fsummarize-cli-smoke.txt\nsummarize --cli claude --plain --timeout 2m \u002Ftmp\u002Fsummarize-cli-smoke.txt\nsummarize --cli gemini --plain --timeout 2m \u002Ftmp\u002Fsummarize-cli-smoke.txt\nsummarize --cli agent --plain --timeout 2m \u002Ftmp\u002Fsummarize-cli-smoke.txt\nsummarize --cli openclaw --plain --timeout 2m \u002Ftmp\u002Fsummarize-cli-smoke.txt\nsummarize --cli opencode --plain --timeout 2m \u002Ftmp\u002Fsummarize-cli-smoke.txt\n```\n\nSet explicit CLI allowlist\u002Forder:\n\n```json\n{\n  \"cli\": { \"enabled\": [\"codex\", \"claude\", \"gemini\", \"agent\", \"openclaw\", \"opencode\"] }\n}\n```\n\nConfigure implicit auto CLI fallback:\n\n```json\n{\n  \"cli\": {\n    \"autoFallback\": {\n      \"enabled\": true,\n      \"onlyWhenNoApiKeys\": true,\n      \"order\": [\"claude\", \"gemini\", \"codex\", \"agent\", \"openclaw\", \"opencode\"]\n    }\n  
}\n}\n```\n\nMore details: [`docs\u002Fcli.md`](docs\u002Fcli.md)\n\n### Auto model ordering\n\n`--model auto` builds candidate attempts from built-in rules (or your `model.rules` overrides).\nCLI attempts are prepended when:\n\n- `cli.enabled` is set (explicit allowlist\u002Forder), or\n- implicit auto selection is active and `cli.autoFallback` is enabled.\n\nDefault fallback behavior: only when no API keys are configured, order `claude, gemini, codex, agent, openclaw, opencode`, and remember\u002Fprioritize last successful provider (`~\u002F.summarize\u002Fcli-state.json`).\n\nSet explicit CLI attempts:\n\n```json\n{\n  \"cli\": { \"enabled\": [\"gemini\"] }\n}\n```\n\nDisable implicit auto CLI fallback:\n\n```json\n{\n  \"cli\": { \"autoFallback\": { \"enabled\": false } }\n}\n```\n\nNote: explicit `--model auto` does not trigger implicit auto CLI fallback unless `cli.enabled` is set.\n\n### Website extraction (Firecrawl + Markdown)\n\nNon-YouTube URLs go through a fetch -> extract pipeline. When direct fetch\u002Fextraction is blocked or too thin,\n`--firecrawl auto` can fall back to Firecrawl (if configured).\n\n- `--firecrawl off|auto|always` (default `auto`)\n- `--extract --format md|text` (default `text`; if `--format` is omitted, `--extract` defaults to `md` for non-YouTube URLs)\n- `--markdown-mode off|auto|llm|readability` (default `readability`)\n  - `auto`: use an LLM converter when configured; may fall back to `uvx markitdown`\n  - `llm`: force LLM conversion (requires a configured model key)\n  - `off`: disable LLM conversion (still may return Firecrawl Markdown when configured)\n- Plain-text mode: use `--format text`.\n\n### YouTube transcripts\n\n`--youtube auto` tries best-effort web transcript endpoints first. When captions are not available, it falls back to:\n\n1. Apify (if `APIFY_API_TOKEN` is set): uses a scraping actor (`faVsWy9VTSNVIhWpR`)\n2. 
yt-dlp + Whisper (if `yt-dlp` is available): downloads audio, then transcribes with local `whisper.cpp` when installed\n   (preferred), otherwise falls back to Groq (`GROQ_API_KEY`), AssemblyAI (`ASSEMBLYAI_API_KEY`), Gemini\n   (`GEMINI_API_KEY` \u002F Google aliases), OpenAI (`OPENAI_API_KEY`), then FAL (`FAL_KEY`)\n\nEnvironment variables for yt-dlp mode:\n\n- `YT_DLP_PATH` - optional path to yt-dlp binary (otherwise `yt-dlp` is resolved via `PATH`)\n- `SUMMARIZE_WHISPER_CPP_MODEL_PATH` - optional override for the local `whisper.cpp` model file\n- `SUMMARIZE_WHISPER_CPP_BINARY` - optional override for the local binary (default: `whisper-cli`)\n- `SUMMARIZE_DISABLE_LOCAL_WHISPER_CPP=1` - disable local whisper.cpp (force remote)\n- `GROQ_API_KEY` - Groq Whisper transcription\n- `ASSEMBLYAI_API_KEY` - AssemblyAI transcription\n- `GEMINI_API_KEY` - Gemini transcription (`GOOGLE_GENERATIVE_AI_API_KEY` \u002F `GOOGLE_API_KEY` also work)\n- `OPENAI_API_KEY` - OpenAI Whisper transcription\n- `OPENAI_WHISPER_BASE_URL` - optional OpenAI-compatible Whisper endpoint override\n- `FAL_KEY` - FAL AI Whisper fallback\n\nApify costs money but tends to be more reliable when captions exist.\n\n### Slide extraction (YouTube + direct video URLs + local video files)\n\nExtract slide screenshots (scene detection via `ffmpeg`) and optional OCR:\n\nRequirements:\n\n- `ffmpeg` for scene detection and frame extraction\n- `yt-dlp` for YouTube video download\u002Fstream resolution\n- `tesseract` only when using `--slides-ocr`\n\n```bash\nsummarize \"https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=...\" --slides\nsummarize \"https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=...\" --slides --slides-ocr\nsummarize \"\u002Fpath\u002Fto\u002Fvideo.webm\" --slides\n```\n\nOutputs are written under `.\u002Fslides\u002F\u003CsourceId>\u002F` (or `--slides-dir`). OCR results are included in JSON output\n(`--json`) and stored in `slides.json` inside the slide directory. 
When scene detection is too sparse, the\nextractor also samples at a fixed interval to improve coverage.\nWhen using `--slides`, supported terminals (kitty\u002FiTerm\u002FKonsole) render inline thumbnails automatically inside the\nsummary narrative (the model inserts `[slide:N]` markers). Timestamp links are clickable when the terminal supports\nOSC-8 (YouTube\u002FVimeo\u002FLoom\u002FDropbox). If inline images are unsupported, Summarize prints a note with the on-disk\nslide directory. Local video files stay on the slide-aware path, transcribe in place, and avoid fake download labels.\n\nUse `--slides --extract` to print the full timed transcript and insert slide images inline at matching timestamps.\n\nFormat the extracted transcript as Markdown (headings + paragraphs) via an LLM:\n\n```bash\nsummarize \"https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=...\" --extract --format md --markdown-mode llm\n```\n\n### Media transcription (Whisper)\n\nLocal audio\u002Fvideo files are transcribed first, then summarized. `--video-mode transcript` forces\ndirect media URLs (and embedded media) through Whisper first. Prefers local `whisper.cpp` when available; otherwise requires\none of `GROQ_API_KEY`, `ASSEMBLYAI_API_KEY`, `GEMINI_API_KEY` (or Google aliases), `OPENAI_API_KEY`, or `FAL_KEY`.\n\n### Local ONNX transcription (Parakeet\u002FCanary)\n\nSummarize can use NVIDIA Parakeet\u002FCanary ONNX models via a local CLI you provide. 
Auto selection (default) prefers ONNX when configured.\n\n- Setup helper: `summarize transcriber setup`\n- Install `sherpa-onnx` from upstream binaries\u002Fbuild (Homebrew may not have a formula)\n- Auto selection: set `SUMMARIZE_ONNX_PARAKEET_CMD` or `SUMMARIZE_ONNX_CANARY_CMD` (no flag needed)\n- Force a model: `--transcriber parakeet|canary|whisper|auto`\n- Docs: `docs\u002Fnvidia-onnx-transcription.md`\n\n### Verified podcast services (2025-12-25)\n\nRun: `summarize \u003Curl>`\n\n- Apple Podcasts\n- Spotify\n- Amazon Music \u002F Audible podcast pages\n- Podbean\n- Podchaser\n- RSS feeds (Podcasting 2.0 transcripts when available)\n- Embedded YouTube podcast pages (e.g. JREPodcast)\n\nTranscription: prefers local `whisper.cpp` when installed; otherwise uses Groq, AssemblyAI, Gemini, OpenAI, or FAL when keys are set.\n\n### Translation paths\n\n`--language\u002F--lang` controls the output language of the summary (and other LLM-generated text). Default is `auto`.\n\nWhen the input is audio\u002Fvideo, the CLI needs a transcript first. The transcript comes from one of these paths:\n\n1. Existing transcript (preferred)\n   - YouTube: uses `youtubei` \u002F `captionTracks` when available.\n   - Podcasts: uses Podcasting 2.0 RSS `\u003Cpodcast:transcript>` (JSON\u002FVTT) when the feed publishes it.\n2. 
Whisper transcription (fallback)\n   - YouTube: falls back to yt-dlp (audio download) + Whisper transcription when configured; Apify is a last resort.\n   - Prefers local `whisper.cpp` when installed + model available.\n   - Otherwise uses cloud transcription in this order: Groq (`GROQ_API_KEY`) → AssemblyAI (`ASSEMBLYAI_API_KEY`) → Gemini (`GEMINI_API_KEY` \u002F Google aliases) → OpenAI (`OPENAI_API_KEY`) → FAL (`FAL_KEY`).\n\nFor direct media URLs, use `--video-mode transcript` to force transcribe -> summarize:\n\n```bash\nsummarize https:\u002F\u002Fexample.com\u002Ffile.mp4 --video-mode transcript --lang en\n```\n\n### Configuration\n\nSingle config location:\n\n- `~\u002F.summarize\u002Fconfig.json`\n\nSupported keys today:\n\n```json\n{\n  \"model\": { \"id\": \"openai\u002Fgpt-5-mini\" },\n  \"env\": { \"OPENAI_API_KEY\": \"sk-...\" },\n  \"output\": { \"length\": \"long\" },\n  \"ui\": { \"theme\": \"ember\" }\n}\n```\n\nShorthand (equivalent):\n\n```json\n{\n  \"model\": \"openai\u002Fgpt-5-mini\"\n}\n```\n\nAlso supported:\n\n- `model: { \"mode\": \"auto\" }` (automatic model selection + fallback; see [docs\u002Fmodel-auto.md](docs\u002Fmodel-auto.md))\n- `model.rules` (customize candidates \u002F ordering)\n- `models` (define presets selectable via `--model \u003Cpreset>`; overrides built-ins like `free`)\n- `env` (generic env var defaults; process env still wins)\n- `apiKeys` (legacy shortcut, mapped to env names; prefer `env` for new configs)\n- `output.length` (default `--length`: `short|medium|long|xl|xxl|20k`)\n- `cache.media` (media download cache: TTL 7 days, 2048 MB cap by default; `--no-media-cache` disables)\n- `media.videoMode: \"auto\"|\"transcript\"|\"understand\"`\n- `slides.enabled` \u002F `slides.max` \u002F `slides.ocr` \u002F `slides.dir` (defaults for `--slides`)\n- `ui.theme: \"aurora\"|\"ember\"|\"moss\"|\"mono\"`\n- `openai.useChatCompletions: true` (force OpenAI-compatible chat completions)\n- `openai.serviceTier: 
\"fast\"|\"priority\"|\"flex\"` (use `\"fast\"` for the friendly alias)\n- `openai.thinking` \u002F `openai.reasoningEffort: \"none\"|\"low\"|\"medium\"|\"high\"|\"xhigh\"`\n- `openai.textVerbosity: \"low\"|\"medium\"|\"high\"`\n\nNote: the config is parsed leniently (JSON5), but comments are not allowed. Unknown keys are ignored.\n\nMedia cache defaults:\n\n```json\n{\n  \"cache\": {\n    \"media\": { \"enabled\": true, \"ttlDays\": 7, \"maxMb\": 2048, \"verify\": \"size\" }\n  }\n}\n```\n\nNote: `--no-cache` bypasses summary caching only (LLM output). Extract\u002Ftranscript caches still apply. Use `--no-media-cache` to skip media files.\n\nPrecedence:\n\n1. `--model`\n2. `SUMMARIZE_MODEL`\n3. `~\u002F.summarize\u002Fconfig.json`\n4. default (`auto`)\n\nTheme precedence:\n\n1. `--theme`\n2. `SUMMARIZE_THEME`\n3. `~\u002F.summarize\u002Fconfig.json` (`ui.theme`)\n4. default (`aurora`)\n\nEnvironment variable precedence:\n\n1. process env\n2. `~\u002F.summarize\u002Fconfig.json` (`env`)\n3. 
`~\u002F.summarize\u002Fconfig.json` (`apiKeys`, legacy)\n\n### Environment variables\n\nSet the key matching your chosen `--model`:\n\n- Optional fallback defaults can be stored in config:\n  - `~\u002F.summarize\u002Fconfig.json` -> `\"env\": { \"OPENAI_API_KEY\": \"sk-...\" }`\n  - process env always takes precedence\n  - legacy `\"apiKeys\"` still works (mapped to env names)\n\n- `OPENAI_API_KEY` (for `openai\u002F...`)\n- `NVIDIA_API_KEY` (for `nvidia\u002F...`)\n- `ANTHROPIC_API_KEY` (for `anthropic\u002F...`)\n- `XAI_API_KEY` (for `xai\u002F...`)\n- `Z_AI_API_KEY` (for `zai\u002F...`; supports `ZAI_API_KEY` alias)\n- `GEMINI_API_KEY` (for `google\u002F...`)\n  - also accepts `GOOGLE_GENERATIVE_AI_API_KEY` and `GOOGLE_API_KEY` as aliases\n\nOpenAI-compatible chat completions toggle:\n\n- `OPENAI_USE_CHAT_COMPLETIONS=1` (or set `openai.useChatCompletions` in config)\n\nUI theme:\n\n- `SUMMARIZE_THEME=aurora|ember|moss|mono`\n- `SUMMARIZE_TRUECOLOR=1` (force 24-bit ANSI)\n- `SUMMARIZE_NO_TRUECOLOR=1` (disable 24-bit ANSI)\n\nOpenRouter (OpenAI-compatible):\n\n- Set `OPENROUTER_API_KEY=...`\n- Prefer forcing OpenRouter per model id: `--model openrouter\u002F\u003Cauthor>\u002F\u003Cslug>`\n- Built-in preset: `--model free` (uses a default set of OpenRouter `:free` models)\n\n### `summarize refresh-free`\n\nQuick start: make free the default (keep `auto` available)\n\n```bash\nsummarize refresh-free --set-default\nsummarize \"https:\u002F\u002Fexample.com\"\nsummarize \"https:\u002F\u002Fexample.com\" --model auto\n```\n\nRegenerates the `free` preset (`models.free` in `~\u002F.summarize\u002Fconfig.json`) by:\n\n- Fetching OpenRouter `\u002Fmodels`, filtering `:free`\n- Skipping models that look very small (\u003C27B by default) based on the model id\u002Fname\n- Testing which ones return non-empty text (concurrency 4, timeout 10s)\n- Picking a mix of smart-ish (bigger `context_length` \u002F output cap) and fast models\n- Refining timings and writing the sorted 
list back\n\nIf `--model free` stops working, run:\n\n```bash\nsummarize refresh-free\n```\n\nFlags:\n\n- `--runs 2` (default): extra timing runs per selected model (total runs = 1 + runs)\n- `--smart 3` (default): how many smart-first picks (rest filled by fastest)\n- `--min-params 27b` (default): ignore models with inferred size smaller than N billion parameters\n- `--max-age-days 180` (default): ignore models older than N days (set 0 to disable)\n- `--set-default`: also sets `\"model\": \"free\"` in `~\u002F.summarize\u002Fconfig.json`\n\nExample:\n\n```bash\nOPENROUTER_API_KEY=sk-or-... summarize \"https:\u002F\u002Fexample.com\" --model openrouter\u002Fmeta-llama\u002Fllama-3.1-8b-instruct:free\nOPENROUTER_API_KEY=sk-or-... summarize \"https:\u002F\u002Fexample.com\" --model openrouter\u002Fminimax\u002Fminimax-m2.5\n```\n\nIf your OpenRouter account enforces an allowed-provider list, make sure at least one provider\nis allowed for the selected model. When routing fails, `summarize` prints the exact providers to allow.\n\nLegacy: `OPENAI_BASE_URL=https:\u002F\u002Fopenrouter.ai\u002Fapi\u002Fv1` (and either `OPENAI_API_KEY` or `OPENROUTER_API_KEY`) also works.\n\nNVIDIA API Catalog (OpenAI-compatible; free credits):\n\n- Set `NVIDIA_API_KEY=...`\n- Optional: `NVIDIA_BASE_URL=https:\u002F\u002Fintegrate.api.nvidia.com\u002Fv1`\n- Credits: API Catalog trial starts with 1000 free API credits on signup (up to 5000 total via “Request More” in the API Catalog profile)\n- Pick a model id from `\u002Fv1\u002Fmodels` (examples: fast `stepfun-ai\u002Fstep-3.5-flash`, strong but slower `z-ai\u002Fglm5`)\n\n```bash\nexport NVIDIA_API_KEY=\"nvapi-...\"\nsummarize \"https:\u002F\u002Fexample.com\" --model nvidia\u002Fstepfun-ai\u002Fstep-3.5-flash\n```\n\nZ.AI (OpenAI-compatible):\n\n- `Z_AI_API_KEY=...` (or `ZAI_API_KEY=...`)\n- Optional base URL override: `Z_AI_BASE_URL=...`\n\nOptional services:\n\n- `FIRECRAWL_API_KEY` (website extraction fallback)\n- `YT_DLP_PATH` 
(path to yt-dlp binary for audio extraction)\n- `GROQ_API_KEY` (Groq Whisper transcription)\n- `ASSEMBLYAI_API_KEY` (AssemblyAI transcription)\n- `GEMINI_API_KEY` \u002F `GOOGLE_GENERATIVE_AI_API_KEY` \u002F `GOOGLE_API_KEY` (Gemini transcription)\n- `OPENAI_API_KEY` \u002F `OPENAI_WHISPER_BASE_URL` (OpenAI Whisper transcription)\n- `FAL_KEY` (FAL AI API key for audio transcription via Whisper)\n- `APIFY_API_TOKEN` (YouTube transcript fallback)\n\n### Model limits\n\nThe CLI uses the LiteLLM model catalog for model limits (like max output tokens):\n\n- Downloaded from: `https:\u002F\u002Fraw.githubusercontent.com\u002FBerriAI\u002Flitellm\u002Fmain\u002Fmodel_prices_and_context_window.json`\n- Cached at: `~\u002F.summarize\u002Fcache\u002F`\n\n### Library usage (optional)\n\nRecommended (minimal deps):\n\n- `@steipete\u002Fsummarize-core\u002Fcontent`\n- `@steipete\u002Fsummarize-core\u002Fprompts`\n\nCompatibility (pulls in CLI deps):\n\n- `@steipete\u002Fsummarize\u002Fcontent`\n- `@steipete\u002Fsummarize\u002Fprompts`\n\n### Development\n\n```bash\npnpm install\npnpm check\n```\n\n## More\n\n- Docs index: [docs\u002FREADME.md](docs\u002FREADME.md)\n- CLI providers and config: [docs\u002Fcli.md](docs\u002Fcli.md)\n- Auto model rules: [docs\u002Fmodel-auto.md](docs\u002Fmodel-auto.md)\n- Website extraction: [docs\u002Fwebsite.md](docs\u002Fwebsite.md)\n- YouTube handling: [docs\u002Fyoutube.md](docs\u002Fyoutube.md)\n- Media pipeline: [docs\u002Fmedia.md](docs\u002Fmedia.md)\n- Config schema and precedence: [docs\u002Fconfig.md](docs\u002Fconfig.md)\n\n## Troubleshooting\n\n- \"Receiving end does not exist\": Chrome did not inject the content script yet.\n  - Extension details -> Site access -> On all sites (or allow this domain)\n  - Reload the tab once.\n- \"Failed to fetch\" \u002F daemon unreachable:\n  - `summarize daemon status`\n  - Logs: `~\u002F.summarize\u002Flogs\u002Fdaemon.err.log`\n\nLicense: MIT\n","Summarize 
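is a tool that extracts the key points from URLs, YouTube videos, podcasts, or files, available as a command-line interface (CLI) and a Chrome extension. Its core features include a streaming chat experience in the side panel, timestamped screenshots and transcript cards for videos, and media-aware summaries that detect the content type automatically. The project is written in TypeScript, handles both text and multimedia, and integrates several coding CLI backends such as Codex and Claude. It suits researchers, students, and general users who need to quickly grasp the gist of long documents, online articles, or audio and video content, and it can greatly improve efficiency in everyday study, work, or information retrieval.",2,"2026-06-11 03:46:07","high_star"]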