[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-11472":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":9,"totalLinesOfCode":9,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":9,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":9,"rankLanguage":9,"license":9,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":22,"topics":24,"createdAt":9,"pushedAt":9,"updatedAt":45,"readmeContent":46,"aiSummary":47,"trendingCount":16,"starSnapshotCount":16,"syncStatus":48,"lastSyncTime":49,"discoverSource":50},11472,"sherpa-onnx","k2-fsa\u002Fsherpa-onnx","k2-fsa","Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server\u002Fclient, support 12 programming languages",null,"https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx","C++",12895,1473,114,539,0,139,290,738,417,44.51,false,"main",[25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44],"asr","onnx","windows","linux","macos","cpp","android","ios","raspberry-pi","aarch64","arm32","csharp","dotnet","mfc","speech-to-text","text-to-speech","vits","risc-v","lazarus","object-pascal","2026-06-12 02:02:32","\u003Cdiv align=\"center\">\n\n[![Ask DeepWiki](https:\u002F\u002Fdeepwiki.com\u002Fbadge.svg)](https:\u002F\u002Fdeepwiki.com\u002Fk2-fsa\u002Fsherpa-onnx)\n\n\u003C\u002Fdiv>\n\n ### Supported functions\n\n|Speech recognition| [Speech synthesis][tts-url] | [Source separation][ss-url] |\n|------------------|------------------|-------------------|\n|   ✔️              |         ✔️        |       ✔️           |\n\n|Speaker identification| [Speaker 
diarization][sd-url] | Speaker verification |\n|----------------------|-------------------- |------------------------|\n|   ✔️                  |         ✔️           |            ✔️           |\n\n| [Spoken Language identification][slid-url] | [Audio tagging][at-url] | [Voice activity detection][vad-url] |\n|--------------------------------|---------------|--------------------------|\n|                 ✔️              |          ✔️    |                ✔️         |\n\n| [Keyword spotting][kws-url] | [Add punctuation][punct-url] | [Speech enhancement][se-url] |\n|------------------|-----------------|--------------------|\n|     ✔️            |       ✔️         |      ✔️             |\n\n\n### Supported platforms\n\n|Architecture| Android | iOS     | Windows    | macOS | linux | HarmonyOS |\n|------------|---------|---------|------------|-------|-------|-----------|\n|   x64      |  ✔️      |         |   ✔️      | ✔️    |  ✔️    |   ✔️   |\n|   x86      |  ✔️      |         |   ✔️      |       |        |        |\n|   arm64    |  ✔️      | ✔️      |   ✔️      | ✔️    |  ✔️    |   ✔️   |\n|   arm32    |  ✔️      |         |           |       |  ✔️    |   ✔️   |\n|   riscv64  |          |         |           |       |  ✔️    |        |\n\n### Supported programming languages\n\n| 1. C++ | 2. C  | 3. Python | 4. JavaScript |\n|--------|-------|-----------|---------------|\n|   ✔️    | ✔️     | ✔️         |    ✔️          |\n\n|5. Java | 6. C# | 7. Kotlin | 8. Swift |\n|--------|-------|-----------|----------|\n| ✔️      |  ✔️    | ✔️         |  ✔️       |\n\n| 9. Go | 10. Dart | 11. Rust | 12. Pascal |\n|-------|----------|----------|------------|\n| ✔️     |  ✔️       |   ✔️      |    ✔️       |\n\n\nIt also supports WebAssembly.\n\n### Supported NPUs\n\n| [1. Rockchip NPU (RKNN)][rknpu-doc] | [2. Qualcomm NPU (QNN)][qnn-doc]  | [3. 
Ascend NPU][ascend-doc] |\n|-------------------------------------|-----------------------------------|-----------------------------|\n|     ✔️                              |                  ✔️               |     ✔️                      |\n\n| [4. Axera NPU][axera-npu] |\n|---------------------------|\n|     ✔️                    |\n\n[Join our discord](https:\u002F\u002Fdiscord.gg\u002FfJdxzg2VbG)\n\n\n## Introduction\n\nThis repository supports running the following functions **locally**\n\n  - Speech-to-text (i.e., ASR); both streaming and non-streaming are supported\n  - Text-to-speech (i.e., TTS)\n  - Speaker diarization\n  - Speaker identification\n  - Speaker verification\n  - Spoken language identification\n  - Audio tagging\n  - VAD (e.g., [silero-vad][silero-vad])\n  - Speech enhancement (e.g., [gtcrn][gtcrn], [DPDFNet](https:\u002F\u002Fgithub.com\u002Fceva-ip\u002FDPDFNet))\n  - Keyword spotting\n  - Source separation (e.g., [spleeter][spleeter], [UVR][UVR])\n\non the following platforms and operating systems:\n\n  - x86, ``x86_64``, 32-bit ARM, 64-bit ARM (arm64, aarch64), RISC-V (riscv64), **RK NPU**, **Ascend NPU**\n  - Linux, macOS, Windows, openKylin\n  - Android, WearOS\n  - iOS\n  - HarmonyOS\n  - NodeJS\n  - WebAssembly\n  - [NVIDIA Jetson Orin NX][NVIDIA Jetson Orin NX] (Support running on both CPU and GPU)\n  - [NVIDIA Jetson Nano B01][NVIDIA Jetson Nano B01] (Support running on both CPU and GPU)\n  - [Raspberry Pi][Raspberry Pi]\n  - [RV1126][RV1126]\n  - [LicheePi4A][LicheePi4A]\n  - [VisionFive 2][VisionFive 2]\n  - [旭日X3派][旭日X3派]\n  - [爱芯派][爱芯派]\n  - [RK3588][RK3588]\n  - [SpacemiT-K1][SpacemiT-K1]\n  - [SpacemiT-K3][SpacemiT-K3]\n  - etc\n\nwith the following APIs\n\n  - C++, C, Python, Go, ``C#``\n  - Java, Kotlin, JavaScript\n  - Swift, Rust\n  - Dart, Object Pascal\n\n### Links for Huggingface Spaces\n\n\u003Cdetails>\n\u003Csummary>You can visit the following Huggingface spaces to try sherpa-onnx without\ninstalling anything. 
All you need is a browser.\u003C\u002Fsummary>\n\n| Description                                           | URL                                     | 中国镜像                               |\n|-------------------------------------------------------|-----------------------------------------|----------------------------------------|\n| Speaker diarization                                   | [Click me][hf-space-speaker-diarization]| [镜像][hf-space-speaker-diarization-cn]|\n| Speech recognition                                    | [Click me][hf-space-asr]                | [镜像][hf-space-asr-cn]                |\n| Speech recognition with [Whisper][Whisper]            | [Click me][hf-space-asr-whisper]        | [镜像][hf-space-asr-whisper-cn]        |\n| Speech synthesis                                      | [Click me][hf-space-tts]                | [镜像][hf-space-tts-cn]                |\n| Generate subtitles                                    | [Click me][hf-space-subtitle]           | [镜像][hf-space-subtitle-cn]           |\n| Audio tagging                                         | [Click me][hf-space-audio-tagging]      | [镜像][hf-space-audio-tagging-cn]      |\n| Source separation                                     | [Click me][hf-space-source-separation]  | [镜像][hf-space-source-separation-cn]  |\n| Spoken language identification with [Whisper][Whisper]| [Click me][hf-space-slid-whisper]       | [镜像][hf-space-slid-whisper-cn]       |\n\nWe also have spaces built using WebAssembly. 
They are listed below:\n\n| Description                                                                              | Huggingface space| ModelScope space|\n|------------------------------------------------------------------------------------------|------------------|-----------------|\n|Voice activity detection with [silero-vad][silero-vad]                                    | [Click me][wasm-hf-vad]|[地址][wasm-ms-vad]|\n|Real-time speech recognition (Chinese + English) with Zipformer                           | [Click me][wasm-hf-streaming-asr-zh-en-zipformer]|[地址][wasm-hf-streaming-asr-zh-en-zipformer]|\n|Real-time speech recognition (Chinese + English) with Paraformer                          |[Click me][wasm-hf-streaming-asr-zh-en-paraformer]| [地址][wasm-ms-streaming-asr-zh-en-paraformer]|\n|Real-time speech recognition (Chinese + English + Cantonese) with [Paraformer-large][Paraformer-large]|[Click me][wasm-hf-streaming-asr-zh-en-yue-paraformer]| [地址][wasm-ms-streaming-asr-zh-en-yue-paraformer]|\n|Real-time speech recognition (English) |[Click me][wasm-hf-streaming-asr-en-zipformer]    |[地址][wasm-ms-streaming-asr-en-zipformer]|\n|VAD + speech recognition (Chinese) with [Zipformer CTC](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-ctc\u002Ficefall\u002Fzipformer.html#sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03-chinese)|[Click me][wasm-hf-vad-asr-zh-zipformer-ctc-07-03]| [地址][wasm-ms-vad-asr-zh-zipformer-ctc-07-03]|\n|VAD + speech recognition (Chinese + English + Korean + Japanese + Cantonese) with [SenseVoice][SenseVoice]|[Click me][wasm-hf-vad-asr-zh-en-ko-ja-yue-sense-voice]| [地址][wasm-ms-vad-asr-zh-en-ko-ja-yue-sense-voice]|\n|VAD + speech recognition (English) with [Whisper][Whisper] tiny.en|[Click me][wasm-hf-vad-asr-en-whisper-tiny-en]| [地址][wasm-ms-vad-asr-en-whisper-tiny-en]|\n|VAD + speech recognition (English) with [Moonshine tiny][Moonshine tiny]|[Click me][wasm-hf-vad-asr-en-moonshine-tiny-en]| 
[地址][wasm-ms-vad-asr-en-moonshine-tiny-en]|\n|VAD + speech recognition (English) with Zipformer trained with [GigaSpeech][GigaSpeech]    |[Click me][wasm-hf-vad-asr-en-zipformer-gigaspeech]| [地址][wasm-ms-vad-asr-en-zipformer-gigaspeech]|\n|VAD + speech recognition (Chinese) with Zipformer trained with [WenetSpeech][WenetSpeech]  |[Click me][wasm-hf-vad-asr-zh-zipformer-wenetspeech]| [地址][wasm-ms-vad-asr-zh-zipformer-wenetspeech]|\n|VAD + speech recognition (Japanese) with Zipformer trained with [ReazonSpeech][ReazonSpeech]|[Click me][wasm-hf-vad-asr-ja-zipformer-reazonspeech]| [地址][wasm-ms-vad-asr-ja-zipformer-reazonspeech]|\n|VAD + speech recognition (Thai) with Zipformer trained with [GigaSpeech2][GigaSpeech2]      |[Click me][wasm-hf-vad-asr-th-zipformer-gigaspeech2]| [地址][wasm-ms-vad-asr-th-zipformer-gigaspeech2]|\n|VAD + speech recognition (Chinese 多种方言) with a [TeleSpeech-ASR][TeleSpeech-ASR] CTC model|[Click me][wasm-hf-vad-asr-zh-telespeech]| [地址][wasm-ms-vad-asr-zh-telespeech]|\n|VAD + speech recognition (English + Chinese, 及多种中文方言) with Paraformer-large          |[Click me][wasm-hf-vad-asr-zh-en-paraformer-large]| [地址][wasm-ms-vad-asr-zh-en-paraformer-large]|\n|VAD + speech recognition (English + Chinese, 及多种中文方言) with Paraformer-small          |[Click me][wasm-hf-vad-asr-zh-en-paraformer-small]| [地址][wasm-ms-vad-asr-zh-en-paraformer-small]|\n|VAD + speech recognition (多语种及多种中文方言) with [Dolphin][Dolphin]-base          |[Click me][wasm-hf-vad-asr-multi-lang-dolphin-base]| [地址][wasm-ms-vad-asr-multi-lang-dolphin-base]|\n|Speech synthesis (Piper, English)                                                                  |[Click me][wasm-hf-tts-piper-en]| [地址][wasm-ms-tts-piper-en]|\n|Speech synthesis (Piper, German)                                                                   |[Click me][wasm-hf-tts-piper-de]| [地址][wasm-ms-tts-piper-de]|\n|Speech synthesis (Matcha, Chinese)                                                                  |[Click 
me][wasm-hf-tts-matcha-zh]| [地址][wasm-ms-tts-matcha-zh]|\n|Speech synthesis (Matcha, English)                                                                  |[Click me][wasm-hf-tts-matcha-en]| [地址][wasm-ms-tts-matcha-en]|\n|Speech synthesis (Matcha, Chinese+English)                                                          |[Click me][wasm-hf-tts-matcha-zh-en]| [地址][wasm-ms-tts-matcha-zh-en]|\n|Speaker diarization                                                                         |[Click me][wasm-hf-speaker-diarization]|[地址][wasm-ms-speaker-diarization]|\n|Voice cloning with ZipVoice (Chinese+English)                                               |[Click me][wasm-hf-voice-cloning-zipvoice]|[地址][wasm-ms-voice-cloning-zipvoice]|\n|Voice cloning with Pocket TTS (English)                                               |[Click me][wasm-hf-voice-cloning-pocket]|[地址][wasm-ms-voice-cloning-pocket]|\n\n\u003C\u002Fdetails>\n\n### Links for pre-built Android APKs\n\n\u003Cdetails>\n\n\u003Csummary>You can find pre-built Android APKs for this repository in the following table\u003C\u002Fsummary>\n\n| Description                            | URL                                | 中国用户                          |\n|----------------------------------------|------------------------------------|-----------------------------------|\n| Speaker diarization                    | [Address][apk-speaker-diarization] | [点此][apk-speaker-diarization-cn]|\n| Streaming speech recognition           | [Address][apk-streaming-asr]       | [点此][apk-streaming-asr-cn]      |\n| Simulated-streaming speech recognition | [Address][apk-simula-streaming-asr]| [点此][apk-simula-streaming-asr-cn]|\n| Text-to-speech                         | [Address][apk-tts]                 | [点此][apk-tts-cn]                |\n| Voice activity detection (VAD)         | [Address][apk-vad]                 | [点此][apk-vad-cn]                |\n| VAD + non-streaming speech recognition | [Address][apk-vad-asr]             | 
[点此][apk-vad-asr-cn]            |\n| Two-pass speech recognition            | [Address][apk-2pass]               | [点此][apk-2pass-cn]              |\n| Audio tagging                          | [Address][apk-at]                  | [点此][apk-at-cn]                 |\n| Audio tagging (WearOS)                 | [Address][apk-at-wearos]           | [点此][apk-at-wearos-cn]          |\n| Speaker identification                 | [Address][apk-sid]                 | [点此][apk-sid-cn]                |\n| Spoken language identification         | [Address][apk-slid]                | [点此][apk-slid-cn]               |\n| Keyword spotting                       | [Address][apk-kws]                 | [点此][apk-kws-cn]                |\n\n\u003C\u002Fdetails>\n\n### Links for pre-built Flutter APPs\n\n\u003Cdetails>\n\n#### Real-time speech recognition\n\n| Description                    | URL                                 | 中国用户                            |\n|--------------------------------|-------------------------------------|-------------------------------------|\n| Streaming speech recognition   | [Address][apk-flutter-streaming-asr]| [点此][apk-flutter-streaming-asr-cn]|\n\n#### Text-to-speech\n\n| Description                              | URL                                | 中国用户                           |\n|------------------------------------------|------------------------------------|------------------------------------|\n| Android (arm64-v8a, armeabi-v7a, x86_64) | [Address][flutter-tts-android]     | [点此][flutter-tts-android-cn]     |\n| Linux (x64)                              | [Address][flutter-tts-linux]       | [点此][flutter-tts-linux-cn]       |\n| macOS (x64)                              | [Address][flutter-tts-macos-x64]   | [点此][flutter-tts-macos-x64-cn] |\n| macOS (arm64)                            | [Address][flutter-tts-macos-arm64] | [点此][flutter-tts-macos-arm64-cn]   |\n| Windows (x64)                            | [Address][flutter-tts-win-x64]     | 
[点此][flutter-tts-win-x64-cn]     |\n\n> Note: You need to build from source for iOS.\n\n\u003C\u002Fdetails>\n\n### Links for pre-built Lazarus APPs\n\n\u003Cdetails>\n\n#### Generating subtitles\n\n| Description                    | URL                        | 中国用户                   |\n|--------------------------------|----------------------------|----------------------------|\n| Generate subtitles (生成字幕)  | [Address][lazarus-subtitle]| [点此][lazarus-subtitle-cn]|\n\n\u003C\u002Fdetails>\n\n### Links for pre-trained models\n\n\u003Cdetails>\n\n| Description                                 | URL                                                                                   |\n|---------------------------------------------|---------------------------------------------------------------------------------------|\n| Speech recognition (speech to text, ASR)    | [Address][asr-models]                                                                 |\n| Text-to-speech (TTS)                        | [Address][tts-models]                                                                 |\n| VAD                                         | [Address][vad-models]                                                                 |\n| Keyword spotting                            | [Address][kws-models]                                                                 |\n| Audio tagging                               | [Address][at-models]                                                                  |\n| Speaker identification (Speaker ID)         | [Address][sid-models]                                                                 |\n| Spoken language identification (Language ID)| See multi-lingual [Whisper][Whisper] ASR models from  [Speech recognition][asr-models]|\n| Punctuation                                 | [Address][punct-models]                                                               |\n| Speaker segmentation                        | 
[Address][speaker-segmentation-models]                                                |\n| Speech enhancement                          | [Address][speech-enhancement-models]                                                  |\n| Source separation                           | [Address][source-separation-models]                                                  |\n\n\u003C\u002Fdetails>\n\n#### Some pre-trained ASR models (Streaming)\n\n\u003Cdetails>\n\nPlease see\n\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Findex.html>\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-paraformer\u002Findex.html>\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-ctc\u002Findex.html>\n\nfor more models. The following table lists only **SOME** of them.\n\n\n|Name | Supported Languages| Description|\n|-----|-----|----|\n|[sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20][sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20]| Chinese, English| See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Fzipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english)|\n|[sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16][sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16]| Chinese, English| See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16-bilingual-chinese-english)|\n|[sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23][sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23]|Chinese| Suitable for Cortex A7 CPU. 
See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-zh-14m-2023-02-23)|\n|[sherpa-onnx-streaming-zipformer-en-20M-2023-02-17][sherpa-onnx-streaming-zipformer-en-20M-2023-02-17]|English|Suitable for Cortex A7 CPU. See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-en-20m-2023-02-17)|\n|[sherpa-onnx-streaming-zipformer-korean-2024-06-16][sherpa-onnx-streaming-zipformer-korean-2024-06-16]|Korean| See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-korean-2024-06-16-korean)|\n|[sherpa-onnx-streaming-zipformer-fr-2023-04-14][sherpa-onnx-streaming-zipformer-fr-2023-04-14]|French| See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fonline-transducer\u002Fzipformer-transducer-models.html#shaojieli-sherpa-onnx-streaming-zipformer-fr-2023-04-14-french)|\n\n\u003C\u002Fdetails>\n\n\n#### Some pre-trained ASR models (Non-Streaming)\n\n\u003Cdetails>\n\nPlease see\n\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Findex.html>\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-paraformer\u002Findex.html>\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-ctc\u002Findex.html>\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Ftelespeech\u002Findex.html>\n  - \u003Chttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fwhisper\u002Findex.html>\n\nfor more models. 
The following table lists only **SOME** of them.\n\n|Name | Supported Languages| Description|\n|-----|-----|----|\n|[sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Fnemo-transducer-models.html#sherpa-onnx-nemo-parakeet-tdt-0-6b-v2-int8-english)| English | It is converted from \u003Chttps:\u002F\u002Fhuggingface.co\u002Fnvidia\u002Fparakeet-tdt-0.6b-v2>|\n|[Whisper tiny.en](https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-whisper-tiny.en.tar.bz2)|English| See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Fwhisper\u002Ftiny.en.html)|\n|[Moonshine tiny][Moonshine tiny]|English|See [also](https:\u002F\u002Fgithub.com\u002Fusefulsensors\u002Fmoonshine)|\n|[sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-ctc\u002Ficefall\u002Fzipformer.html#sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03-chinese)|Chinese| A Zipformer CTC model|\n|[sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17][sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17]|Chinese, Cantonese, English, Korean, Japanese| 支持多种中文方言. See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fsense-voice\u002Findex.html)|\n|[sherpa-onnx-paraformer-zh-2024-03-09][sherpa-onnx-paraformer-zh-2024-03-09]|Chinese, English| 也支持多种中文方言. 
See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-paraformer\u002Fparaformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2024-03-09-chinese-english)|\n|[sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01][sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01]|Japanese|See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01-japanese)|\n|[sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24][sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24]|Russian|See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Fnemo-transducer-models.html#sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24-russian)|\n|[sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24][sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24]|Russian| See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-ctc\u002Fnemo\u002Frussian.html#sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24)|\n|[sherpa-onnx-zipformer-ru-2024-09-18][sherpa-onnx-zipformer-ru-2024-09-18]|Russian|See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-zipformer-ru-2024-09-18-russian)|\n|[sherpa-onnx-zipformer-korean-2024-06-24][sherpa-onnx-zipformer-korean-2024-06-24]|Korean|See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-zipformer-korean-2024-06-24-korean)|\n|[sherpa-onnx-zipformer-thai-2024-06-20][sherpa-onnx-zipformer-thai-2024-06-20]|Thai| See 
[also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Foffline-transducer\u002Fzipformer-transducer-models.html#sherpa-onnx-zipformer-thai-2024-06-20-thai)|\n|[sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04][sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04]|Chinese| 支持多种方言. See [also](https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpretrained_models\u002Ftelespeech\u002Fmodels.html#sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04)|\n\n\u003C\u002Fdetails>\n\n### Useful links\n\n- Documentation: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002F\n- Bilibili 演示视频: https:\u002F\u002Fsearch.bilibili.com\u002Fall?keyword=%E6%96%B0%E4%B8%80%E4%BB%A3Kaldi\n\n### How to reach us\n\nPlease see\nhttps:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fsocial-groups.html\nfor 新一代 Kaldi **微信交流群** and **QQ 交流群**.\n\n## Projects using sherpa-onnx\n\n### [Sherpa Voice \u002F @siteed\u002Fsherpa-onnx.rn](https:\u002F\u002Fgithub.com\u002Fdeeeed\u002Faudiolab)\n\n> React Native wrapper and demo app for validating sherpa-onnx on iOS,\n> Android, and Web, including ASR, TTS, VAD, KWS, speaker ID, diarization,\n> language ID, punctuation, audio tagging, and speech enhancement.\n\n- [NPM package](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002F@siteed\u002Fsherpa-onnx.rn)\n- [Live demo](https:\u002F\u002Fdeeeed.github.io\u002Faudiolab\u002Fsherpa-voice\u002F)\n\n### [Speed of Sound](https:\u002F\u002Fgithub.com\u002Fzugaldia\u002Fspeedofsound)\n\n> A voice-typing application for the Linux desktop (GTK4\u002FAdwaita).\n> It captures microphone audio, transcribes it offline using Sherpa ONNX ASR models,\n> optionally polishes the text with an LLM, and types the result into the active window\n> via XDG Remote Desktop Portal keyboard simulation.\n\n### [VoxSherpa TTS](https:\u002F\u002Fgithub.com\u002FCodeBySonu95\u002FVoxSherpa-TTS)\n\n> VoxSherpa TTS is a 100% offline Android Text-to-Speech app powered by Sherpa-ONNX.\n> It supports 
Kokoro-82M, Piper, and VITS engines with multilingual support including\n> Hindi, English, British English, Japanese, Chinese and 50+ more languages.\n\n- [Download APK v1.0-beta](https:\u002F\u002Fhuggingface.co\u002FCodeBySonu95\u002FSherpa-onnx-models\u002Fresolve\u002Fmain\u002FVoxSherpa-TTS_test.apk)\n- Android 11+ · 100% offline · No telemetry\n\n\u003Cdiv align=\"center\">\n\n| Generate | Models | Library | Settings |\n|:---:|:---:|:---:|:---:|\n| \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002FCodeBySonu95\u002FVoxSherpa-TTS\u002Fmain\u002Ffastlane\u002Fmetadata\u002Fandroid\u002Fen-US\u002Fimages\u002FphoneScreenshots\u002F1.jpg\" width=\"180\"\u002F> | \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002FCodeBySonu95\u002FVoxSherpa-TTS\u002Fmain\u002Ffastlane\u002Fmetadata\u002Fandroid\u002Fen-US\u002Fimages\u002FphoneScreenshots\u002F2.jpg\" width=\"180\"\u002F> | \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002FCodeBySonu95\u002FVoxSherpa-TTS\u002Fmain\u002Ffastlane\u002Fmetadata\u002Fandroid\u002Fen-US\u002Fimages\u002FphoneScreenshots\u002F3.jpg\" width=\"180\"\u002F> | \u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002FCodeBySonu95\u002FVoxSherpa-TTS\u002Fmain\u002Ffastlane\u002Fmetadata\u002Fandroid\u002Fen-US\u002Fimages\u002FphoneScreenshots\u002F4.jpg\" width=\"180\"\u002F> |\n\n\u003C\u002Fdiv>\n\n---\n### [BreezeApp](https:\u002F\u002Fgithub.com\u002Fmtkresearch\u002FBreezeApp) from [MediaTek Research](https:\u002F\u002Fgithub.com\u002Fmtkresearch)\n\n> BreezeAPP is a mobile AI application developed for both Android and iOS platforms.\n> Users can download it directly from the App Store and enjoy a variety of features\n> offline, including speech-to-text, text-to-speech, text-based chatbot interactions,\n> and image question-answering\n\n  - [Download APK for BreezeAPP](https:\u002F\u002Fhuggingface.co\u002FMediaTek-Research\u002FBreezeApp\u002Fresolve\u002Fmain\u002FBreezeApp.apk)\n  - 
[APK 中国镜像](https:\u002F\u002Fhf-mirror.com\u002FMediaTek-Research\u002FBreezeApp\u002Fblob\u002Fmain\u002FBreezeApp.apk)\n\n| 1 | 2 | 3 |\n|---|---|---|\n|![](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F1cdbc057-b893-4de6-9e9c-f1d7dfd1d992)|![](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fd77cd98e-b057-442f-860d-d5befd5c769b)|![](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F57e546bf-3d39-45b9-b392-b48ca4fb3c58)|\n\n### [Open-LLM-VTuber](https:\u002F\u002Fgithub.com\u002Ft41372\u002FOpen-LLM-VTuber)\n\nTalk to any LLM with hands-free voice interaction, voice interruption, and Live2D talking\nface running locally across platforms\n\nSee also \u003Chttps:\u002F\u002Fgithub.com\u002Ft41372\u002FOpen-LLM-VTuber\u002Fpull\u002F50>\n\n### [voiceapi](https:\u002F\u002Fgithub.com\u002Fruzhila\u002Fvoiceapi)\n\n\u003Cdetails>\n  \u003Csummary>Streaming ASR and TTS based on FastAPI\u003C\u002Fsummary>\n\n\nIt shows how to use the ASR and TTS Python APIs with FastAPI.\n\u003C\u002Fdetails>\n\n### [腾讯会议摸鱼工具 TMSpeech](https:\u002F\u002Fgithub.com\u002Fjxlpzqc\u002FTMSpeech)\n\nUses streaming ASR in C# with a graphical user interface.\n\nVideo demo in Chinese: [【开源】Windows实时字幕软件（网课\u002F开会必备）](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1rX4y1p7Nx)\n\n### [lol互动助手](https:\u002F\u002Fgithub.com\u002Fl1veIn\u002Flol-wom-electron)\n\nIt uses the JavaScript API of sherpa-onnx along with [Electron](https:\u002F\u002Felectronjs.org\u002F).\n\nVideo demo in Chinese: [爆了！炫神教你开打字挂！真正影响胜率的英雄联盟工具！英雄联盟的最后一块拼图！和游戏中的每个人无障碍沟通！](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV142tje9E74)\n\n### [Sherpa-ONNX 语音识别服务器](https:\u002F\u002Fgithub.com\u002Fhfyydd\u002Fsherpa-onnx-server)\n\nA Node.js-based server providing a RESTful API for speech recognition.\n\n### [QSmartAssistant](https:\u002F\u002Fgithub.com\u002Fxinhecuican\u002FQSmartAssistant)\n\nA modular, fully offline-capable, low-resource conversational robot\u002Fsmart speaker.\n\nIt uses Qt. 
Both [ASR](https:\u002F\u002Fgithub.com\u002Fxinhecuican\u002FQSmartAssistant\u002Fblob\u002Fmaster\u002Fdoc\u002F%E5%AE%89%E8%A3%85.md#asr)\nand [TTS](https:\u002F\u002Fgithub.com\u002Fxinhecuican\u002FQSmartAssistant\u002Fblob\u002Fmaster\u002Fdoc\u002F%E5%AE%89%E8%A3%85.md#tts)\nare used.\n\n### [Flutter-EasySpeechRecognition](https:\u002F\u002Fgithub.com\u002FJason-chen-coder\u002FFlutter-EasySpeechRecognition)\n\nIt extends [.\u002Fflutter-examples\u002Fstreaming_asr](.\u002Fflutter-examples\u002Fstreaming_asr) by\ndownloading models inside the app to reduce the size of the app.\n\nNote: [[Team B] Sherpa AI backend](https:\u002F\u002Fgithub.com\u002Fumgc\u002Fspring2025\u002Fpull\u002F82) also uses\nsherpa-onnx in a Flutter APP.\n\n### [sherpa-onnx-unity](https:\u002F\u002Fgithub.com\u002Fxue-fei\u002Fsherpa-onnx-unity)\n\nsherpa-onnx in Unity. See also [#1695](https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fissues\u002F1695),\n[#1892](https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fissues\u002F1892), and [#1859](https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Fissues\u002F1859)\n\n### [xiaozhi-esp32-server](https:\u002F\u002Fgithub.com\u002Fxinnan-tech\u002Fxiaozhi-esp32-server)\n\n本项目为xiaozhi-esp32提供后端服务，帮助您快速搭建ESP32设备控制服务器\nBackend service for xiaozhi-esp32, helps you quickly build an ESP32 device control server.\n\nSee also\n\n  - [ASR新增轻量级sherpa-onnx-asr](https:\u002F\u002Fgithub.com\u002Fxinnan-tech\u002Fxiaozhi-esp32-server\u002Fissues\u002F315)\n  - [feat: ASR增加sherpa-onnx模型](https:\u002F\u002Fgithub.com\u002Fxinnan-tech\u002Fxiaozhi-esp32-server\u002Fpull\u002F379)\n\n### [KaithemAutomation](https:\u002F\u002Fgithub.com\u002FEternityForest\u002FKaithemAutomation)\n\nPure Python, GUI-focused home automation\u002Fconsumer grade SCADA.\n\nIt uses TTS from sherpa-onnx. 
See also [✨ Speak command that uses the new globally configured TTS model.](https:\u002F\u002Fgithub.com\u002FEternityForest\u002FKaithemAutomation\u002Fcommit\u002F8e64d2b138725e426532f7d66bb69dd0b4f53693)\n\n### [Open-XiaoAI KWS](https:\u002F\u002Fgithub.com\u002Fidootop\u002Fopen-xiaoai-kws)\n\nEnables custom wake words for XiaoAi speakers.\n\nVideo demo in Chinese: [小爱同学启动～˶╹ꇴ╹˶！](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1YfVUz5EMj)\n\n### [C++ WebSocket ASR Server](https:\u002F\u002Fgithub.com\u002Fmawwalker\u002Fstt-server)\n\nIt provides a C++ WebSocket server for ASR using sherpa-onnx.\n\n### [Go WebSocket Server](https:\u002F\u002Fgithub.com\u002Fbbeyondllove\u002Fasr_server)\n\nIt provides a WebSocket server written in Go for sherpa-onnx.\n\n### [Making robot Paimon, Ep10 \"The AI Part 1\"](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=KxPKkwxGWZs)\n\nIt is a [YouTube video](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=KxPKkwxGWZs)\nshowing how the author tried to use AI to hold a conversation with Paimon.\n\nIt uses sherpa-onnx for speech-to-text and text-to-speech.\n\n|1|\n|---|\n|![](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Ff6eea2d5-1807-42cb-9160-be8da2971e1f)|\n\n### [TtsReader - Desktop application](https:\u002F\u002Fgithub.com\u002Fys-pro-duction\u002FTtsReader)\n\nA desktop text-to-speech application built using Kotlin Multiplatform.\n\n### [MentraOS](https:\u002F\u002Fgithub.com\u002FMentra-Community\u002FMentraOS)\n\n> Smart glasses OS, with dozens of built-in apps. Users get AI assistant, notifications,\n> translation, screen mirror, captions, and more. 
Devs get to write 1 app that runs on\n> any pair of smart glasses.\n\nIt uses sherpa-onnx for real-time speech recognition on iOS and Android devices.\nSee also \u003Chttps:\u002F\u002Fgithub.com\u002FMentra-Community\u002FMentraOS\u002Fpull\u002F861>\n\nIt uses Swift for iOS and Java for Android.\n\n### [flet_sherpa_onnx](https:\u002F\u002Fgithub.com\u002FSamYuan1990\u002Fflet_sherpa_onnx)\n\nA Flet ASR\u002FSTT component based on sherpa-onnx.\nExample: [a chat box agent](https:\u002F\u002Fgithub.com\u002FSamYuan1990\u002Fi18n-agent-action)\n\n### [achatbot-go](https:\u002F\u002Fgithub.com\u002Fai-bot-pro\u002Fachatbot-go)\n\nA multimodal chatbot written in Go that uses the sherpa-onnx speech library API.\n\n### [fcitx5-vinput](https:\u002F\u002Fgithub.com\u002Fxifan2333\u002Ffcitx5-vinput)\n\nA local, offline voice input plugin for [Fcitx5](https:\u002F\u002Fgithub.com\u002Ffcitx\u002Ffcitx5) (a Linux input method framework).\nIt uses C++ with offline ASR for speech recognition, supporting push-to-talk,\ncommand mode, and optional LLM post-processing.\n\nVideo demo in Chinese: [fcitx5-vinput](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1a6cUzVE6F)\n\n### [Wake Word](https:\u002F\u002Fgithub.com\u002Fanalyticsinmotion\u002Fwake-word)\n\nA VS Code extension for hands-free voice-activated coding. 
It uses sherpa-onnx for real-time\nkeyword spotting (KWS) to detect custom wake phrases and trigger VS Code commands by voice.\nAudio capture is handled by [decibri](https:\u002F\u002Fgithub.com\u002Fanalyticsinmotion\u002Fdecibri), a\ncross-platform Node.js microphone streaming library with prebuilt native binaries.\n\n- [VS Code Marketplace](https:\u002F\u002Fmarketplace.visualstudio.com\u002Fitems?itemName=analytics-in-motion.wake-word)\n- [Open VSX](https:\u002F\u002Fopen-vsx.org\u002Fextension\u002Fanalytics-in-motion\u002Fwake-word)\n- [decibri integration guides for sherpa-onnx](https:\u002F\u002Fdecibri.dev\u002Fdocs\u002Fnode\u002Fintegrations\u002Fsherpa-onnx-stt.html)\n\n[silero-vad]: https:\u002F\u002Fgithub.com\u002Fsnakers4\u002Fsilero-vad\n[Raspberry Pi]: https:\u002F\u002Fwww.raspberrypi.com\u002F\n[RV1126]: https:\u002F\u002Fwww.rock-chips.com\u002Fuploads\u002Fpdf\u002F2022.8.26\u002F191\u002FRV1126%20Brief%20Datasheet.pdf\n[LicheePi4A]: https:\u002F\u002Fsipeed.com\u002Flicheepi4a\n[VisionFive 2]: https:\u002F\u002Fwww.starfivetech.com\u002Fen\u002Fsite\u002Fboards\n[旭日X3派]: https:\u002F\u002Fdeveloper.horizon.ai\u002Fapi\u002Fv1\u002FfileData\u002Fdocuments_pi\u002Findex.html\n[爱芯派]: https:\u002F\u002Fwiki.sipeed.com\u002Fhardware\u002Fzh\u002FmaixIII\u002Fax-pi\u002Faxpi.html\n[hf-space-speaker-diarization]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fspeaker-diarization\n[hf-space-speaker-diarization-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fspeaker-diarization\n[hf-space-asr]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fautomatic-speech-recognition\n[hf-space-asr-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fautomatic-speech-recognition\n[Whisper]: https:\u002F\u002Fgithub.com\u002Fopenai\u002Fwhisper\n[hf-space-asr-whisper]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fautomatic-speech-recognition-with-whisper\n[hf-space-asr-whisper-cn]: 
https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fautomatic-speech-recognition-with-whisper\n[hf-space-tts]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Ftext-to-speech\n[hf-space-tts-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Ftext-to-speech\n[hf-space-subtitle]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fgenerate-subtitles-for-videos\n[hf-space-subtitle-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fgenerate-subtitles-for-videos\n[hf-space-audio-tagging]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Faudio-tagging\n[hf-space-audio-tagging-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Faudio-tagging\n[hf-space-source-separation]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fsource-separation\n[hf-space-source-separation-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fsource-separation\n[hf-space-slid-whisper]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fspoken-language-identification\n[hf-space-slid-whisper-cn]: https:\u002F\u002Fhf.qhduan.com\u002Fspaces\u002Fk2-fsa\u002Fspoken-language-identification\n[wasm-hf-vad]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-sherpa-onnx\n[wasm-ms-vad]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-sherpa-onnx\n[wasm-hf-streaming-asr-zh-en-zipformer]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-en\n[wasm-ms-streaming-asr-zh-en-zipformer]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-en\n[wasm-hf-streaming-asr-zh-en-paraformer]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-en-paraformer\n[wasm-ms-streaming-asr-zh-en-paraformer]: 
https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-en-paraformer\n[Paraformer-large]: https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002Fdamo\u002Fspeech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\u002Fsummary\n[wasm-hf-streaming-asr-zh-en-yue-paraformer]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer\n[wasm-ms-streaming-asr-zh-en-yue-paraformer]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer\n[wasm-hf-streaming-asr-en-zipformer]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-en\n[wasm-ms-streaming-asr-en-zipformer]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-asr-sherpa-onnx-en\n[SenseVoice]: https:\u002F\u002Fgithub.com\u002FFunAudioLLM\u002FSenseVoice\n[wasm-hf-vad-asr-zh-zipformer-ctc-07-03]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-zipformer-ctc\n[wasm-ms-vad-asr-zh-zipformer-ctc-07-03]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-zipformer-ctc\u002Fsummary\n[wasm-hf-vad-asr-zh-en-ko-ja-yue-sense-voice]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-ja-ko-cantonese-sense-voice\n[wasm-ms-vad-asr-zh-en-ko-ja-yue-sense-voice]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-jp-ko-cantonese-sense-voice\n[wasm-hf-vad-asr-en-whisper-tiny-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-en-whisper-tiny\n[wasm-ms-vad-asr-en-whisper-tiny-en]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-en-whisper-tiny\n[wasm-hf-vad-asr-en-moonshine-tiny-en]: 
https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-en-moonshine-tiny\n[wasm-ms-vad-asr-en-moonshine-tiny-en]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-en-moonshine-tiny\n[wasm-hf-vad-asr-en-zipformer-gigaspeech]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-en-zipformer-gigaspeech\n[wasm-ms-vad-asr-en-zipformer-gigaspeech]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-en-zipformer-gigaspeech\n[wasm-hf-vad-asr-zh-zipformer-wenetspeech]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-zipformer-wenetspeech\n[wasm-ms-vad-asr-zh-zipformer-wenetspeech]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-zipformer-wenetspeech\n[reazonspeech]: https:\u002F\u002Fresearch.reazon.jp\u002F_static\u002Freazonspeech_nlp2023.pdf\n[wasm-hf-vad-asr-ja-zipformer-reazonspeech]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-ja-zipformer\n[wasm-ms-vad-asr-ja-zipformer-reazonspeech]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-ja-zipformer\n[gigaspeech2]: https:\u002F\u002Fgithub.com\u002Fspeechcolab\u002Fgigaspeech2\n[wasm-hf-vad-asr-th-zipformer-gigaspeech2]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-th-zipformer\n[wasm-ms-vad-asr-th-zipformer-gigaspeech2]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-th-zipformer\n[telespeech-asr]: https:\u002F\u002Fgithub.com\u002Ftele-ai\u002Ftelespeech-asr\n[wasm-hf-vad-asr-zh-telespeech]: 
https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-telespeech\n[wasm-ms-vad-asr-zh-telespeech]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-telespeech\n[wasm-hf-vad-asr-zh-en-paraformer-large]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-paraformer\n[wasm-ms-vad-asr-zh-en-paraformer-large]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-paraformer\n[wasm-hf-vad-asr-zh-en-paraformer-small]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-paraformer-small\n[wasm-ms-vad-asr-zh-en-paraformer-small]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-zh-en-paraformer-small\n[dolphin]: https:\u002F\u002Fgithub.com\u002Fdataoceanai\u002Fdolphin\n[wasm-ms-vad-asr-multi-lang-dolphin-base]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-vad-asr-sherpa-onnx-multi-lang-dophin-ctc\n[wasm-hf-vad-asr-multi-lang-dolphin-base]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-vad-asr-sherpa-onnx-multi-lang-dophin-ctc\n\n[wasm-hf-tts-matcha-zh-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-zh-en-tts-matcha\n[wasm-hf-tts-matcha-zh]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-zh-tts-matcha\n[wasm-ms-tts-matcha-zh-en]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-zh-en-tts-matcha\n[wasm-ms-tts-matcha-zh]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-zh-tts-matcha\n[wasm-hf-tts-matcha-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-en-tts-matcha\n[wasm-ms-tts-matcha-en]: 
https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-en-tts-matcha\n[wasm-hf-tts-piper-en]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-tts-sherpa-onnx-en\n[wasm-ms-tts-piper-en]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-tts-sherpa-onnx-en\n[wasm-hf-tts-piper-de]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-tts-sherpa-onnx-de\n[wasm-ms-tts-piper-de]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fk2-fsa\u002Fweb-assembly-tts-sherpa-onnx-de\n[wasm-hf-speaker-diarization]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-speaker-diarization-sherpa-onnx\n[wasm-ms-speaker-diarization]: https:\u002F\u002Fwww.modelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-speaker-diarization-sherpa-onnx\n[wasm-hf-voice-cloning-zipvoice]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-zh-en-tts-zipvoice\n[wasm-ms-voice-cloning-zipvoice]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-zh-en-tts-zipvoice\n[wasm-hf-voice-cloning-pocket]: https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fk2-fsa\u002Fweb-assembly-en-tts-pocket\n[wasm-ms-voice-cloning-pocket]: https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fcsukuangfj\u002Fweb-assembly-en-tts-pocket\n[apk-speaker-diarization]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-diarization\u002Fapk.html\n[apk-speaker-diarization-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-diarization\u002Fapk-cn.html\n[apk-streaming-asr]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk.html\n[apk-streaming-asr-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-cn.html\n[apk-simula-streaming-asr]: 
https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-simulate-streaming-asr.html\n[apk-simula-streaming-asr-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-simulate-streaming-asr-cn.html\n[apk-tts]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Ftts\u002Fapk-engine.html\n[apk-tts-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Ftts\u002Fapk-engine-cn.html\n[apk-vad]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Fapk.html\n[apk-vad-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Fapk-cn.html\n[apk-vad-asr]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Fapk-asr.html\n[apk-vad-asr-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Fapk-asr-cn.html\n[apk-2pass]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-2pass.html\n[apk-2pass-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fandroid\u002Fapk-2pass-cn.html\n[apk-at]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Fapk.html\n[apk-at-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Fapk-cn.html\n[apk-at-wearos]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Fapk-wearos.html\n[apk-at-wearos-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Fapk-wearos-cn.html\n[apk-sid]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-identification\u002Fapk.html\n[apk-sid-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-identification\u002Fapk-cn.html\n[apk-slid]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspoken-language-identification\u002Fapk.html\n[apk-slid-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspoken-language-identification\u002Fapk-cn.html\n[apk-kws]: 
https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fkws\u002Fapk.html\n[apk-kws-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fkws\u002Fapk-cn.html\n[apk-flutter-streaming-asr]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Fpre-built-app.html#streaming-speech-recognition-stt-asr\n[apk-flutter-streaming-asr-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Fpre-built-app.html#streaming-speech-recognition-stt-asr\n[flutter-tts-android]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-android.html\n[flutter-tts-android-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-android-cn.html\n[flutter-tts-linux]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-linux.html\n[flutter-tts-linux-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-linux-cn.html\n[flutter-tts-macos-x64]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-macos-x64.html\n[flutter-tts-macos-arm64-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-macos-arm64-cn.html\n[flutter-tts-macos-arm64]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-macos-arm64.html\n[flutter-tts-macos-x64-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-macos-x64-cn.html\n[flutter-tts-win-x64]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-win.html\n[flutter-tts-win-x64-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fflutter\u002Ftts-win-cn.html\n[lazarus-subtitle]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Flazarus\u002Fdownload-generated-subtitles.html\n[lazarus-subtitle-cn]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Flazarus\u002Fdownload-generated-subtitles-cn.html\n[asr-models]: 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fasr-models\n[tts-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Ftts-models\n[vad-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsilero_vad.onnx\n[kws-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fkws-models\n[at-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Faudio-tagging-models\n[sid-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fspeaker-recongition-models\n[slid-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fspeaker-recongition-models\n[punct-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fpunctuation-models\n[speaker-segmentation-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fspeaker-segmentation-models\n[GigaSpeech]: https:\u002F\u002Fgithub.com\u002FSpeechColab\u002FGigaSpeech\n[WenetSpeech]: https:\u002F\u002Fgithub.com\u002Fwenet-e2e\u002FWenetSpeech\n[sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n[sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16.tar.bz2\n[sherpa-onnx-streaming-zipformer-korean-2024-06-16]: 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-korean-2024-06-16.tar.bz2\n[sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-zh-14M-2023-02-23.tar.bz2\n[sherpa-onnx-streaming-zipformer-en-20M-2023-02-17]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n[sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-zipformer-ja-reazonspeech-2024-08-01.tar.bz2\n[sherpa-onnx-zipformer-ru-2024-09-18]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-zipformer-ru-2024-09-18.tar.bz2\n[sherpa-onnx-zipformer-korean-2024-06-24]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-zipformer-korean-2024-06-24.tar.bz2\n[sherpa-onnx-zipformer-thai-2024-06-20]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-zipformer-thai-2024-06-20.tar.bz2\n[sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24.tar.bz2\n[sherpa-onnx-paraformer-zh-2024-03-09]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-paraformer-zh-2024-03-09.tar.bz2\n[sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24]: 
https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24.tar.bz2\n[sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n[sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n[sherpa-onnx-streaming-zipformer-fr-2023-04-14]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-streaming-zipformer-fr-2023-04-14.tar.bz2\n[Moonshine tiny]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Fdownload\u002Fasr-models\u002Fsherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n[NVIDIA Jetson Orin NX]: https:\u002F\u002Fdeveloper.download.nvidia.com\u002Fassets\u002Fembedded\u002Fsecure\u002Fjetson\u002Forin_nx\u002Fdocs\u002FJetson_Orin_NX_DS-10712-001_v0.5.pdf?RCPGu9Q6OVAOv7a7vgtwc9-BLScXRIWq6cSLuditMALECJ_dOj27DgnqAPGVnT2VpiNpQan9SyFy-9zRykR58CokzbXwjSA7Gj819e91AXPrWkGZR3oS1VLxiDEpJa_Y0lr7UT-N4GnXtb8NlUkP4GkCkkF_FQivGPrAucCUywL481GH_WpP_p7ziHU1Wg==&t=eyJscyI6ImdzZW8iLCJsc2QiOiJodHRwczovL3d3dy5nb29nbGUuY29tLmhrLyJ9\n[NVIDIA Jetson Nano B01]: https:\u002F\u002Fwww.seeedstudio.com\u002Fblog\u002F2020\u002F01\u002F16\u002Fnew-revision-of-jetson-nano-dev-kit-now-supports-new-jetson-nano-module\u002F\n[speech-enhancement-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fspeech-enhancement-models\n[source-separation-models]: https:\u002F\u002Fgithub.com\u002Fk2-fsa\u002Fsherpa-onnx\u002Freleases\u002Ftag\u002Fsource-separation-models\n[RK3588]: 
https:\u002F\u002Fwww.rock-chips.com\u002Fuploads\u002Fpdf\u002F2022.8.26\u002F192\u002FRK3588%20Brief%20Datasheet.pdf\n[spleeter]: https:\u002F\u002Fgithub.com\u002Fdeezer\u002Fspleeter\n[UVR]: https:\u002F\u002Fgithub.com\u002FAnjok07\u002Fultimatevocalremovergui\n[gtcrn]: https:\u002F\u002Fgithub.com\u002FXiaobin-Rong\u002Fgtcrn\n[tts-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Ftts\u002Fall-in-one.html\n[ss-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fsource-separation\u002Findex.html\n[sd-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeaker-diarization\u002Findex.html\n[slid-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspoken-language-identification\u002Findex.html\n[at-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Faudio-tagging\u002Findex.html\n[vad-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fvad\u002Findex.html\n[kws-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fkws\u002Findex.html\n[punct-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fpunctuation\u002Findex.html\n[se-url]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fspeech-enhancement\u002Findex.html\n[rknpu-doc]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Frknn\u002Findex.html\n[qnn-doc]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fqnn\u002Findex.html\n[ascend-doc]: https:\u002F\u002Fk2-fsa.github.io\u002Fsherpa\u002Fonnx\u002Fascend\u002Findex.html\n[axera-npu]: https:\u002F\u002Faxera-tech.com\u002FSkill\u002F166.html\n[SpacemiT-K1]: https:\u002F\u002Fcdn-resource.spacemit.com\u002Ffile\u002Fchip\u002FK1\u002FK1_brief_zh.pdf\n[SpacemiT-K3]: https:\u002F\u002Fcdn-resource.spacemit.com\u002Ffile\u002Fchip\u002FK3\u002FK3_brief_zh.pdf\n","sherpa-onnx is a local speech-processing toolkit built on next-gen Kaldi and onnxruntime that provides speech-to-text, text-to-speech, speaker diarization, speech enhancement, and more without an Internet connection. It supports a range of speech tasks such as ASR (automatic speech recognition), TTS (text-to-speech), and VAD (voice activity detection), and runs on a wide range of hardware, from embedded systems (such as Raspberry Pi and RISC-V) to server-class architectures (x86_64). It is particularly well suited to applications that need efficient speech processing without network connectivity, such as offline voice assistants on mobile devices or voice interaction in privacy-sensitive environments. Support for 12 programming languages and multiple neural processing units (NPUs) greatly strengthens its cross-platform deployability.",2,"2026-06-11 03:31:57","trending"]