[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-71151":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":23,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":34,"readmeContent":35,"aiSummary":36,"trendingCount":16,"starSnapshotCount":16,"syncStatus":37,"lastSyncTime":38,"discoverSource":39},71151,"FunClip","modelscope\u002FFunClip","modelscope","Open-source, accurate and easy-to-use video speech recognition & clipping tool. LLM-based AI clipping integrated.","",null,"Python",5792,696,41,1,0,19,65,201,57,39.53,"MIT License",false,"main",[26,27,28,29,30,31,32,33],"gradio","gradio-python-llm","llm","speech-recognition","speech-to-text","subtitles-generator","video-clip","video-subtitles","2026-06-12 02:02:48","[![SVG Banners](https:\u002F\u002Fsvg-banners.vercel.app\u002Fapi?type=rainbow&text1=FunClip%20%20🥒&width=800&height=210)](https:\u002F\u002Fgithub.com\u002FAkshay090\u002Fsvg-banners)\n\n### \u003Cp align=\"center\">「[简体中文](.\u002FREADME_zh.md) | English」\u003C\u002Fp>\n\n**\u003Cp align=\"center\"> ⚡ Open-source, accurate and easy-to-use video clipping tool \u003C\u002Fp>**\n**\u003Cp align=\"center\"> 🧠 Explore LLM based video clipping with FunClip \u003C\u002Fp>**\n\n\u003Cp align=\"center\"> \u003Cimg src=\"docs\u002Fimages\u002Finterface.jpg\" width=444\u002F>\u003C\u002Fp>\n\n\u003Cp align=\"center\" class=\"trendshift\">\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F10126\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Ftrendshift.io\u002Fapi\u002Fbadge\u002Frepositories\u002F10126\" 
alt=\"alibaba-damo-academy%2FFunClip | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"300\" height=\"55\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cdiv align=\"center\">  \n\u003Ch4>\n\u003Ca href=\"#What's New\"> What's New \u003C\u002Fa>\n｜\u003Ca href=\"#On Going\"> On Going \u003C\u002Fa>\n｜\u003Ca href=\"#Install\"> Install \u003C\u002Fa>\n｜\u003Ca href=\"#Usage\"> Usage \u003C\u002Fa>\n｜\u003Ca href=\"#Community\"> Community \u003C\u002Fa>\n\u003C\u002Fh4>\n\u003C\u002Fdiv>\n\n**FunClip** is a fully open-source, locally deployed automated video clipping tool. It leverages Alibaba TONGYI speech lab's open-source [FunASR](https:\u002F\u002Fgithub.com\u002Falibaba-damo-academy\u002FFunASR) Paraformer series models to perform speech recognition on videos. Then, users can freely choose text segments or speakers from the recognition results and click the clip button to obtain the video clip corresponding to the selected segments (Quick Experience [Modelscope⭐](https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fiic\u002Ffunasr_app_clipvideo\u002Fsummary) [HuggingFace🤗](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FR1ckShi\u002FFunClip)).\n\n## Highlights🎨\n\n- 🔥Try AI clipping using LLM in FunClip now.\n- FunClip integrates Alibaba's open-source industrial-grade model [Paraformer-Large](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fiic\u002Fspeech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\u002Fsummary), which is one of the best-performing open-source Chinese ASR models available, with over 13 million downloads on Modelscope. 
It also predicts timestamps accurately as part of recognition.\n- FunClip incorporates the hotword customization feature of [SeACo-Paraformer](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fiic\u002Fspeech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\u002Fsummary), allowing users to specify entity words, names, etc. as hotwords during the ASR process to improve recognition results.\n- FunClip integrates the [CAM++](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fiic\u002Fspeech_campplus_sv_zh-cn_16k-common\u002Fsummary) speaker recognition model, so users can select an auto-recognized speaker ID as the clipping target and extract the segments spoken by a specific speaker.\n- All functions are available through a Gradio interface, which is simple to install and easy to use. It can also be deployed on a server and accessed via a browser.\n- FunClip supports free clipping of multiple segments and automatically returns SRT subtitles for both the full video and the target segments.\n\n\u003Ca name=\"What's New\">\u003C\u002Fa>\n## What's New🚀\n- 2024\u002F06\u002F12 FunClip now supports recognizing and clipping English audio files. Run `python funclip\u002Flaunch.py -l en` to try it.\n- 🔥2024\u002F05\u002F13 FunClip v2.0.0 now supports smart clipping with large language models, integrating models from the Qwen series, GPT series, etc., and providing default prompts. You can also explore and share tips for setting prompts. The usage is as follows:\n  1. After recognition, select the name of the large model and configure your own API key;\n  2. Click the 'LLM Inference' button, and FunClip will automatically combine two prompts with the video's SRT subtitles;\n  3. Click the 'AI Clip' button, and FunClip will extract the timestamps for clipping from the large language model's output in the previous step;\n  4. 
You can try changing the prompt to leverage the capabilities of the large language models and get the results you want;\n- 2024\u002F05\u002F09 FunClip updated to v1.1.0, including the following updates and fixes:\n  - Support configuring the output file directory, and saving ASR intermediate results and video clipping intermediate files;\n  - UI upgrade (see guide picture below): the video and audio cropping functions are now on the same page, and button positions are adjusted;\n  - Fixed a bug introduced by the FunASR interface upgrade, which caused some serious clipping errors;\n  - Support configuring different start and end time offsets for each paragraph;\n  - Code updates, etc.;\n- 2024\u002F03\u002F06 Fixed bugs in using FunClip from the command line.\n- 2024\u002F02\u002F28 [FunASR](https:\u002F\u002Fgithub.com\u002Falibaba-damo-academy\u002FFunASR) is updated to version 1.0; FunClip now uses FunASR1.0 and SeACo-Paraformer to conduct ASR with hotword customization.\n- 2023\u002F10\u002F17 Fixed a bug where choosing multiple periods returned a video with the wrong length.\n- 2023\u002F10\u002F10 FunClipper now supports recognition with speaker diarization: choose the 'yes' button in 'Recognize Speakers' and you will get recognition results with a speaker ID for each sentence. You can then clip out the periods of one or more speakers (e.g. 
'spk0' or 'spk0#spk3') using FunClipper.\n\n\u003Ca name=\"On Going\">\u003C\u002Fa>\n## On Going🌵\n\n- [x] FunClip will support the Whisper model for English users, coming soon (ASR using Whisper with timestamps requires massive GPU memory, so we support timestamp prediction for vanilla Paraformer in FunASR to achieve this).\n- [x] FunClip will further explore the abilities of large language model based AI clipping; discussion of prompt settings, clipping, etc. is welcome.\n- [ ] Reverse period selection while clipping.\n- [ ] Remove silent periods.\n\n\u003Ca name=\"Install\">\u003C\u002Fa>\n## Install🔨\n\n### Python env install\n\nFunClip's basic functions require only a Python environment.\n```shell\n# clone funclip repo\ngit clone https:\u002F\u002Fgithub.com\u002Falibaba-damo-academy\u002FFunClip.git\ncd FunClip\n# install Python requirements\npip install -r .\u002Frequirements.txt\n```\n\n### imagemagick install (Optional)\n\nIf you want to clip video files with embedded subtitles:\n\n1. ffmpeg and imagemagick are required\n\n- On Ubuntu\n```shell\napt-get -y update && apt-get -y install ffmpeg imagemagick\nsed -i 's\u002Fnone\u002Fread,write\u002Fg' \u002Fetc\u002FImageMagick-6\u002Fpolicy.xml\n```\n- On macOS\n```shell\nbrew install imagemagick\n# the stock macOS (BSD) sed needs an explicit, possibly empty, backup suffix after -i\nsed -i '' 's\u002Fnone\u002Fread,write\u002Fg' \u002Fusr\u002Flocal\u002FCellar\u002Fimagemagick\u002F7.1.1-8_1\u002Fetc\u002FImageMagick-7\u002Fpolicy.xml \n```\n- On Windows\n\nDownload and install imagemagick: https:\u002F\u002Fimagemagick.org\u002Fscript\u002Fdownload.php#windows\n\nFind your Python install path and change `IMAGEMAGICK_BINARY` to your imagemagick install path in the file `site-packages\\moviepy\\config_defaults.py`\n\n2. Download the font file to funclip\u002Ffont\n\n```shell\nwget https:\u002F\u002Fisv-data.oss-cn-hangzhou.aliyuncs.com\u002Fics\u002FMaaS\u002FClipVideo\u002FSTHeitiMedium.ttc -O font\u002FSTHeitiMedium.ttc\n```\n\u003Ca name=\"Usage\">\u003C\u002Fa>\n## Use FunClip\n\n### A. 
Use FunClip as a local Gradio Service\nYou can set up your own FunClip service, identical to the [Modelscope Space](https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fiic\u002Ffunasr_app_clipvideo\u002Fsummary), as follows:\n```shell\npython funclip\u002Flaunch.py\n# '-l en' for English audio recognition\n# '-p xxx' for setting the port number\n# '-s True' for making the service publicly accessible\n```\nThen visit ```localhost:7860``` to get a Gradio service like the one below, and use FunClip with the following steps:\n\n- Step1: Upload your video file (or try the example videos below)\n- Step2: Copy the text segments you need to 'Text to Clip'\n- Step3: Adjust subtitle settings (if needed)\n- Step4: Click 'Clip' or 'Clip and Generate Subtitles'\n\n\u003Cimg src=\"docs\u002Fimages\u002Fguide.jpg\"\u002F>\n\nFollow the guide below to explore LLM based clipping:\n\n\u003Cimg src=\"docs\u002Fimages\u002FLLM_guide.png\" width=360\u002F>\n\n### B. Experience FunClip in Modelscope\n\n[FunClip@Modelscope Space⭐](https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fiic\u002Ffunasr_app_clipvideo\u002Fsummary)\n\n[FunClip@HuggingFace Space🤗](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FR1ckShi\u002FFunClip)\n\n### C. 
Use FunClip from the command line\n\nFunClip also supports recognition and clipping from the command line:\n```shell\n# step1: Recognize\npython funclip\u002Fvideoclipper.py --stage 1 \\\n                       --file examples\u002F2022云栖大会_片段.mp4 \\\n                       --output_dir .\u002Foutput\n# the recognition results and the full SRT file are now in .\u002Foutput\u002F\n# step2: Clip\npython funclip\u002Fvideoclipper.py --stage 2 \\\n                       --file examples\u002F2022云栖大会_片段.mp4 \\\n                       --output_dir .\u002Foutput \\\n                       --dest_text '我们把它跟乡村振兴去结合起来，利用我们的设计的能力' \\\n                       --start_ost 0 \\\n                       --end_ost 100 \\\n                       --output_file '.\u002Foutput\u002Fres.mp4'\n```\n\n\u003Ca name=\"Community\">\u003C\u002Fa>\n## Community Communication🍟\n\nFunClip was first open-sourced by the FunASR team; useful PRs are welcome.\n\nYou can also scan the DingTalk or WeChat group QR code below to join the community discussion.\n\n|                           DingTalk group                            |                     WeChat group                      |\n|:-------------------------------------------------------------------:|:-----------------------------------------------------:|\n| \u003Cdiv align=\"left\">\u003Cimg src=\"docs\u002Fimages\u002Fdingding.png\" width=\"250\"\u002F> | \u003Cimg src=\"docs\u002Fimages\u002Fwechat.png\" width=\"215\"\u002F>\u003C\u002Fdiv> |\n\n## Find Speech Models in FunASR\n\n[FunASR](https:\u002F\u002Fgithub.com\u002Falibaba-damo-academy\u002FFunASR) aims to bridge academic research and industrial applications of speech recognition. By supporting training and fine-tuning of the industrial-grade speech recognition models released on ModelScope, it lets researchers and developers conduct research on and production of speech recognition models more conveniently, and promotes the development of the speech recognition ecosystem. 
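ASR for Fun！\n\n📚FunASR Paper: \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.11013\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FArxiv-2305.11013-orange\">\u003C\u002Fa> \n\n📚SeACo-Paraformer Paper: \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.03266\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FArxiv-2308.03266-orange\">\u003C\u002Fa>\n\n🌟Support FunASR: \u003Ca href='https:\u002F\u002Fgithub.com\u002Falibaba-damo-academy\u002FFunASR\u002Fstargazers'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Falibaba-damo-academy\u002FFunASR.svg?style=social'>\u003C\u002Fa>\n","FunClip is an open-source video speech recognition and clipping tool with integrated LLM-based AI clipping. Its core functionality uses the FunASR Paraformer series models from Alibaba DAMO Academy for high-accuracy speech recognition, and lets users freely select text segments or speakers from the recognition results to generate the corresponding video clips. It also supports hotword customization and speaker recognition, which further improve recognition accuracy and flexibility. It suits scenarios that require quickly extracting specific information or clips from large amounts of video content, such as education, media production, and data analysis. The interactive interface is built with Gradio, which makes installation and deployment simple; it can run locally or be accessed on a server.",2,"2026-06-11 03:36:09","high_star"]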