[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-79972":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":22,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":29,"readmeContent":30,"aiSummary":31,"trendingCount":15,"starSnapshotCount":15,"syncStatus":32,"lastSyncTime":33,"discoverSource":34},79972,"video-recap","worldwonderer\u002Fvideo-recap","worldwonderer","AI narration skill: input a video, output a voiceover recap video for Claude Code｜AI解说skill，输入视频，输出带中文旁白的解说视频，适配Claude Code","",null,"Python",92,10,1,0,5,8,16,15,58.72,"MIT License",false,"main",[25,26,27,28,5],"claude-code","skill","tts","video-narration","2026-06-12 04:01:26","# video-recap\n\n[中文说明](README.zh-CN.md) · English\n\n> A Claude Code skill for turning videos into recap videos with story research, ASR+VLM scene understanding, TTS voiceover, subtitles, and dynamic audio mixing.\n\n[![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg)](LICENSE)\n![Claude Code Skill](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FClaude%20Code-Skill-purple)\n![TTS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTTS-edge--tts-green)\n![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.10%2B-blue)\n\n## Demo\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F92698ec6-0d23-4f9f-8825-c3684ef57aff\n\n## What is it?\n\n`video-recap` is a Claude Code skill that helps an agent create short-form recap videos from existing video files.\n\n```mermaid\nflowchart TB\n    input([Input video]) --> prep[Prepare artifacts]\n 
   context[[Story research \u002F context]] -.-> brief\n\n    subgraph understand[1. Understand the video]\n        direction LR\n        scene[Scene cuts]\n        asr[ASR dialogue]\n        vlm[VLM frame facts]\n    end\n\n    subgraph write[2. Plan the script]\n        direction LR\n        brief[Timing brief]\n        script[narration.json]\n    end\n\n    subgraph produce[3. Produce the recap]\n        direction LR\n        tts[edge-tts voiceover]\n        mix[Subtitles + audio ducking]\n        output([Recap video])\n    end\n\n    prep --> scene\n    prep --> asr\n    prep --> vlm\n    scene --> brief\n    asr --> brief\n    vlm --> brief\n    brief --> script\n    script --> tts\n    script --> mix\n    tts --> mix\n    mix --> output\n\n    classDef source fill:#eef6ff,stroke:#4f86c6,stroke-width:1px,color:#1f2937;\n    classDef analysis fill:#fff7e6,stroke:#d99100,stroke-width:1px,color:#1f2937;\n    classDef scriptStyle fill:#f3ecff,stroke:#7c3aed,stroke-width:1px,color:#1f2937;\n    classDef output fill:#ecfdf3,stroke:#16a34a,stroke-width:1px,color:#1f2937;\n    class input,context,prep source;\n    class scene,asr,vlm analysis;\n    class brief,script scriptStyle;\n    class tts,mix,output output;\n```\n\n## Why use it?\n\n- **Story research before writing** — pull plot, characters, relationships, and world context into the brief so the recap is not just visual guesswork.\n- **ASR + VLM understanding** — combine dialogue transcripts with scene cuts, VLM descriptions, and frame-level facts.\n- **Timing-aware writing brief** — `agent_narration_brief.md` includes quiet windows, dialogue overlap, scene timing, and word budgets.\n- **Original audio stays alive** — voiceover is mixed with ducking instead of replacing dialogue, ambience, and rhythm.\n- **Script-first reruns** — edit `narration.json`, then rerun TTS\u002Fassembly without redoing video analysis.\n- **Cut-style recaps** — in `--edit-mode cut`, select source ranges in `clip_plan.json` to turn 
long videos into shorter narrated edits.\n- **No-key TTS path** — defaults to `edge-tts` with `zh-CN-YunxiNeural` when available.\n\n## Installation\n\n### 1. Install the Claude Code skill\n\nAsk Claude Code:\n\n```text\nInstall this skill: https:\u002F\u002Fgithub.com\u002Fworldwonderer\u002Fvideo-recap\n```\n\n### 2. Install runtime dependencies\n\n```bash\nbrew install ffmpeg\npip3 install edge-tts\n```\n\n### 3. Configure an OpenAI-compatible API\n\n```bash\nexport OPENAI_API_KEY=your-key\nexport OPENAI_API_URL=https:\u002F\u002Fyour-api-url\u002Fv1\nexport OPENAI_MODEL=doubao-seed-2-0-lite-260428\n\n# Recommended when your proxy\u002Fprovider is sensitive to concurrent VLM requests:\nexport VLM_WORKERS=1\n```\n\n## Quick start\n\nAfter installing the skill, tell Claude Code:\n\n```text\nCreate a recap video for \u002Fpath\u002Fto\u002Fvideo.mp4 using video-recap.\nUse edge-tts with the Yunxi voice. Context: \u003Cshow \u002F movie \u002F character background>.\n```\n\nThe pipeline prepares scene, ASR, and visual-analysis artifacts, then pauses with an `agent_narration_brief.md`. The agent writes `narration.json`, and the CLI resumes to synthesize voiceover and assemble the video.\n\nIf you want to start the first analysis pass manually:\n\n```bash\npython3 skills\u002Fvideo-recap\u002Fscripts\u002Fvideo_recap.py \u002Fpath\u002Fto\u002Fvideo.mp4 \\\n  --tts edge-tts \\\n  --voice zh-CN-YunxiNeural \\\n  --context \"show name, characters, or story background\"\n```\n\nThe command pauses before TTS and prints a `work_dir`. Read `work_dir\u002Fagent_narration_brief.md`, write `work_dir\u002Fnarration.json`, then run the printed resume command.\n\nTo validate the agent-written script before TTS, run `--step script` after writing `narration.json`. 
This writes `work_dir\u002Fnarration_lint.json` with timing errors and warnings.\n\nFor an edited recap that keeps only selected source moments (target duration is a planning goal):\n\n```bash\npython3 skills\u002Fvideo-recap\u002Fscripts\u002Fvideo_recap.py \u002Fpath\u002Fto\u002Fvideo.mp4 \\\n  --edit-mode cut \\\n  --target-duration 10m \\\n  --tts edge-tts\n```\n\nIn cut mode, write both `work_dir\u002Fclip_plan.json` and `work_dir\u002Fnarration.json` using original source timestamps. The CLI builds `edited_source.mp4`, maps narration into `narration_mapped.json`, then resumes TTS\u002Fassembly.\n\nTo hardcode the narration subtitles into the final video, add `--burn-subtitles` on the resume\u002Fassembly run:\n\n```bash\npython3 skills\u002Fvideo-recap\u002Fscripts\u002Fvideo_recap.py \u002Fpath\u002Fto\u002Fvideo.mp4 \\\n  --resume work_dir \\\n  --burn-subtitles\n```\n\nThe CLI exports `subtitles.srt` from the final `narration.json` and TTS placement. Burn-in uses an internal `subtitles.ass` renderer with readable bottom subtitles and re-encodes the video, so your `ffmpeg` build must include the `subtitles`\u002Flibass filter.\n\n### Doctor check\n\n```bash\npython3 skills\u002Fvideo-recap\u002Fscripts\u002Fvideo_recap.py --doctor\n```\n\nUse `--doctor-tts-smoke` when you also want a short `edge-tts` synthesis check. 
The doctor also reports ffmpeg subtitle-filter support, ASR path\u002Fmodel readiness, normalized API configuration, and the default TTS setup.\n\n## Output\n\nTypical outputs:\n\n- `recap_\u003Cvideo>.mp4` — final recap video\n- `work_dir\u002Fsubtitles.srt` — voiceover\u002Fnarration subtitles generated from final TTS placement\n- `work_dir\u002Fsubtitles.ass` — internal narration subtitle file used for burn-in when `--burn-subtitles` is enabled\n- `work_dir\u002Fagent_narration_brief.md` — timing and scene brief for the agent\n- `work_dir\u002Fnarration.json` — recap narration script\n- `work_dir\u002Fnarration_lint.json` — script timing\u002Fpreflight diagnostics from `--step script` or resume validation\n- `work_dir\u002Fclip_plan.json` — source ranges to keep when `--edit-mode cut` is used\n- `work_dir\u002Fedited_source.mp4` — concatenated short source video in cut mode\n- `work_dir\u002Fnarration_mapped.json` — narration mapped from source time to edited-output time\n- `work_dir\u002Fvlm_analysis.json` — scene-level visual analysis\n- `work_dir\u002Fasr_result.json` — ASR result when available; used as recap context\n- `work_dir\u002Ftts_segments\u002F` — generated TTS audio segments\n\n## Useful references\n\n- [Skill contract](skills\u002Fvideo-recap\u002FSKILL.md)\n- [Agent workflow](skills\u002Fvideo-recap\u002Freferences\u002Fagent-mode-workflow.md)\n- [Parameters](skills\u002Fvideo-recap\u002Freferences\u002Fparameters.md)\n- [Prompt templates](skills\u002Fvideo-recap\u002Freferences\u002Fprompt-templates.md)\n- [Resume and partial reruns](skills\u002Fvideo-recap\u002Freferences\u002Fpipeline-resume.md)\n- [Data schema](skills\u002Fvideo-recap\u002Freferences\u002Fdata-schema.md)\n\n## Acknowledgements\n\n- [linux.do](https:\u002F\u002Flinux.do)\n- [qwen3-asr-rs](https:\u002F\u002Fgithub.com\u002Falan890104\u002Fqwen3-asr-rs)\n\n## License\n\nMIT — see [LICENSE](LICENSE).\n","video-recap is a Claude 
Code skill that turns existing videos into recap videos with Chinese voiceover narration. The project uses story research, ASR (automatic speech recognition), and a VLM (vision-language model) to understand the video content, and produces a recap video with voiceover, subtitles, and dynamic audio mixing. Its core technical features include script writing grounded in research on plot and character relationships, combining dialogue transcripts with scene cuts for more accurate content analysis, and audio mixing that adds voiceover while preserving the original audio track. It suits scenarios that need quick production of high-quality video summaries or educational recap videos.",2,"2026-06-11 03:58:43","CREATED_QUERY"]