[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-11312":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":41,"readmeContent":42,"aiSummary":43,"trendingCount":16,"starSnapshotCount":16,"syncStatus":44,"lastSyncTime":45,"discoverSource":46},11312,"supertonic","supertone-inc\u002Fsupertonic","supertone-inc","Lightning-Fast, On-Device, Multilingual TTS — running natively via ONNX.","https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FSupertone\u002Fsupertonic-2",null,"Swift",11471,1195,87,79,0,38,220,8548,176,119.23,"MIT License",false,"main",true,[27,28,29,30,31,32,33,34,35,36,37,38,39,40],"cpp","csharp","go","ios","java","lightweight","nodejs","on-device","python","rust","swift","text-to-speech","tts","web","2026-06-12 04:00:54","# Supertonic — Lightning Fast, On-Device, Accurate TTS\n\n[![v3 Demo](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗%20v3-Demo-yellow)](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FSupertone\u002Fsupertonic-3)\n[![v3 Models](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗%20v3-Models-blue)](https:\u002F\u002Fhuggingface.co\u002FSupertone\u002Fsupertonic-3)\n[![v2 Branch](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fv2-release%2Fsupertonic--2-lightgrey)](https:\u002F\u002Fgithub.com\u002Fsupertone-inc\u002Fsupertonic\u002Ftree\u002Frelease\u002Fsupertonic-2)\n[![v1 
Demo](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗%20v1%20(old)-Demo-lightgrey)](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FSupertone\u002Fsupertonic#interactive-demo)\n[![v1 Models](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗%20v1%20(old)-Models-lightgrey)](https:\u002F\u002Fhuggingface.co\u002FSupertone\u002Fsupertonic)\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"img\u002FSupertonic3_HeroImage.png\" alt=\"Supertonic 3 Banner\">\n\u003C\u002Fp>\n\n**Supertonic** is a lightning-fast, on-device text-to-speech system designed for local inference with minimal overhead. Powered by ONNX Runtime, it runs entirely on your device—no cloud, no API calls, no privacy concerns.\n\n### 📰 Update News\n\n- **2026.04.29** - 🎉 **Supertonic 3** released with **31-language support**, improved reading accuracy, fewer repeat\u002Fskip failures, and v2-compatible public ONNX assets. [Demo](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FSupertone\u002Fsupertonic-3) | [Models](https:\u002F\u002Fhuggingface.co\u002FSupertone\u002Fsupertonic-3)\n- **2026.01.22** - **[Voice Builder](https:\u002F\u002Fsupertonic.supertone.ai\u002Fvoice_builder)** is now live! Turn your voice into a deployable, edge-native TTS with permanent ownership.\n- **2026.01.06** - 🎉 **Supertonic 2** released with 5-language support. The v2 code path is preserved on the [`release\u002Fsupertonic-2`](https:\u002F\u002Fgithub.com\u002Fsupertone-inc\u002Fsupertonic\u002Ftree\u002Frelease\u002Fsupertonic-2) branch.\n- **2025.12.10** - Added `supertonic` PyPI package! Install via `pip install supertonic`. For details, visit [supertonic-py documentation](https:\u002F\u002Fsupertone-inc.github.io\u002Fsupertonic-py)\n- **2025.12.10** - Added [6 new voice styles](https:\u002F\u002Fhuggingface.co\u002FSupertone\u002Fsupertonic\u002Ftree\u002Fb10dbaf18b316159be75b34d24f740008fddd381) (M3, M4, M5, F3, F4, F5). 
See [Voices](https:\u002F\u002Fsupertone-inc.github.io\u002Fsupertonic-py\u002Fvoices\u002F) for details\n- **2025.12.08** - Optimized ONNX models via [OnnxSlim](https:\u002F\u002Fgithub.com\u002Finisis\u002FOnnxSlim) now available on [Hugging Face Models](https:\u002F\u002Fhuggingface.co\u002FSupertone\u002Fsupertonic)\n- **2025.11.24** - Added Flutter SDK support with macOS compatibility\n\n## Quick Start\n\nInstall the Python SDK and generate speech immediately. On the first run, Supertonic downloads the model assets from Hugging Face automatically.\n\n```bash\npip install supertonic\n```\n\n### Python\n\n```python\nfrom supertonic import TTS\n\n# First run downloads the model from Hugging Face automatically.\ntts = TTS(auto_download=True)\n\nstyle = tts.get_voice_style(voice_name=\"M1\")\n\ntext = \"A gentle breeze moved through the open window while everyone listened to the story.\"\nwav, duration = tts.synthesize(text, voice_style=style, lang=\"en\")\n\ntts.save_audio(wav, \"output.wav\")\nprint(f\"Generated {duration:.2f}s of audio\")\n```\n\n## Getting Started\n\nFirst, clone the repository:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fsupertone-inc\u002Fsupertonic.git\ncd supertonic\n```\n\n### Prerequisites\n\nBefore running the examples, download the ONNX models and preset voices, and place them in the `assets` directory:\n\n> **Note:** The Hugging Face repository uses Git LFS. Please ensure Git LFS is installed and initialized before cloning or pulling large model files.\n> - macOS: `brew install git-lfs && git lfs install`\n> - Generic: see `https:\u002F\u002Fgit-lfs.com` for installers\n\n```bash\ngit lfs install\ngit clone https:\u002F\u002Fhuggingface.co\u002FSupertone\u002Fsupertonic-3 assets\n```\n\nSome language examples need native runtimes:\n- **Go**: install the ONNX Runtime C library. On macOS, `brew install onnxruntime` is enough; the Go example auto-detects Homebrew paths.\n- **Java**: use a JDK, not just a JRE. 
On macOS, `brew install openjdk@17` works.\n- **C#**: targets .NET 9 and allows major-version roll-forward, so .NET 9 or newer runtimes can run it.\n\nThen run the Python example:\n\n```bash\ncd py\nuv sync\nuv run example_onnx.py\n```\n\nThis generates `outputs\u002Foutput.wav` using the default preset voice.\n\n### Other Runtime Examples\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Run Supertonic in other languages and platforms\u003C\u002Fb>\u003C\u002Fsummary>\n\n**Node.js Example** ([Details](nodejs\u002F))\n```bash\ncd nodejs\nnpm install\nnpm start\n```\n\n**Browser Example** ([Details](web\u002F))\n```bash\ncd web\nnpm install\nnpm run dev\n```\n\n**Java Example** ([Details](java\u002F))\n```bash\ncd java\nmvn clean install\nmvn exec:java\n```\n\n**C++ Example** ([Details](cpp\u002F))\n```bash\ncd cpp\nmkdir build && cd build\ncmake .. && cmake --build . --config Release\n.\u002Fexample_onnx\n```\n\n**C# Example** ([Details](csharp\u002F))\n```bash\ncd csharp\ndotnet restore\ndotnet run\n```\n\n**Go Example** ([Details](go\u002F))\n```bash\ncd go\ngo mod download\ngo run example_onnx.go helper.go\n```\n\n**Swift Example** ([Details](swift\u002F))\n```bash\ncd swift\nswift build -c release\n.build\u002Frelease\u002Fexample_onnx\n```\n\n**Rust Example** ([Details](rust\u002F))\n```bash\ncd rust\ncargo build --release\n.\u002Ftarget\u002Frelease\u002Fexample_onnx\n```\n\n**iOS Example** ([Details](ios\u002F))\n```bash\ncd ios\u002FExampleiOSApp\nxcodegen generate\nopen ExampleiOSApp.xcodeproj\n```\n\nIn Xcode: Targets → ExampleiOSApp → Signing: select your Team, then choose your iPhone as run destination and build.\n\n\u003C\u002Fdetails>\n\n\n### Technical Details\n\n- **Runtime**: ONNX Runtime for cross-platform inference\n- **Browser Support**: onnxruntime-web for client-side inference\n- **Batch Processing**: Supports batch inference for improved throughput\n- **Audio Output**: Outputs 16-bit WAV files\n\n## Performance Highlights\n\nSupertonic 3 is designed 
for practical on-device inference: compact enough to run locally, while staying competitive with much larger open TTS systems.\n\n### Reading Accuracy\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"img\u002Fmetrics\u002Fs3_vs_measured_wer_range_voxcpm2.png\" alt=\"Supertonic 3 reading accuracy compared with measured model ranges and VoxCPM2\">\n\u003C\u002Fp>\n\nAcross measured languages, Supertonic 3 stays within a competitive WER\u002FCER range against much larger open TTS models such as VoxCPM2, while preserving a lightweight on-device deployment path. Asterisked languages use CER; the others use WER.\n\n### Supertonic 2 to Supertonic 3\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"img\u002Fmetrics\u002Fsupertonic2_vs_3_comparison.png\" alt=\"Supertonic 2 and Supertonic 3 comparison\">\n\u003C\u002Fp>\n\nCompared with Supertonic 2, Supertonic 3 reduces repeat and skip failures, improves speaker similarity across the shared-language set, and expands language coverage from 5 to 31 languages. It keeps the v2-compatible public ONNX interface, so existing integrations can move to v3 with the same inference contract.\n\n### Runtime Footprint\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"img\u002Fmetrics\u002Fruntime_cpu_gpu_latency_memory.png\" alt=\"Supertonic CPU runtime compared with GPU baselines\">\n\u003C\u002Fp>\n\nSupertonic 3 runs fast on CPU, even compared with larger baselines measured on A100 GPU, and uses substantially less memory. The open-weight fixed-voice setting does not require a GPU, which makes local, browser, and edge deployment much easier.\n\n### Model Size\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"img\u002Fmetrics\u002Fmodel_size_comparison.png\" alt=\"Model size comparison\">\n\u003C\u002Fp>\n\nAt about 99M parameters across the public ONNX assets, Supertonic 3 is much smaller than 0.7B to 2B class open TTS systems. 
The smaller model size is a practical advantage for download size, startup time, and on-device inference.\n\n## Demo\n\n> **Try it now**: Experience Supertonic in your browser with our [**Interactive Demo**](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FSupertone\u002Fsupertonic-3), or get started with pre-trained models from [**Hugging Face Hub**](https:\u002F\u002Fhuggingface.co\u002FSupertone\u002Fsupertonic-3)\n\n### Raspberry Pi\n\nWatch Supertonic running on a **Raspberry Pi**, demonstrating on-device, real-time text-to-speech synthesis:\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fea66f6d6-7bc5-4308-8a88-1ce3e07400d2\n\n### E-Reader\n\nExperience Supertonic on an **Onyx Boox Go 6** e-reader in airplane mode, achieving an average RTF of 0.3× with zero network dependency:\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F64980e58-ad91-423a-9623-78c2ffc13680\n\n### Chrome Extension\n\nTurns any webpage into audio in under one second, delivering lightning-fast, on-device text-to-speech with zero network dependency—free, private, and effortless:\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fcc8a45fc-5c3e-4b2c-8439-a14c3d00d91c\n\n## Why Supertonic?\n\n- **Blazingly Fast**: Optimized for low-latency, on-device speech generation across desktop, browser, and edge deployments\n- **Lightweight**: Compact ONNX assets designed for efficient local execution\n- **On-Device Capable**: Complete privacy and zero network dependency\n- **Accurate Reading**: Improved reading stability with fewer repeat and skip failures\n- **Expressive Tags**: Supports simple expression tags such as `\u003Claugh>`, `\u003Cbreath>`, and `\u003Csigh>`\n- **Flexible Deployment**: Ready-to-use examples across Python, JavaScript, browser, mobile, and native runtimes\n\n## Language Support\n\nSupertonic 3 supports 31 languages:\n\n| Code | Language | Code | Language | Code | Language | Code | Language 
|\n|------|----------|------|----------|------|----------|------|----------|\n| `en` | English | `ko` | Korean | `ja` | Japanese | `ar` | Arabic |\n| `bg` | Bulgarian | `cs` | Czech | `da` | Danish | `de` | German |\n| `el` | Greek | `es` | Spanish | `et` | Estonian | `fi` | Finnish |\n| `fr` | French | `hi` | Hindi | `hr` | Croatian | `hu` | Hungarian |\n| `id` | Indonesian | `it` | Italian | `lt` | Lithuanian | `lv` | Latvian |\n| `nl` | Dutch | `pl` | Polish | `pt` | Portuguese | `ro` | Romanian |\n| `ru` | Russian | `sk` | Slovak | `sl` | Slovenian | `sv` | Swedish |\n| `tr` | Turkish | `uk` | Ukrainian | `vi` | Vietnamese | | |\n\nWe provide ready-to-use TTS inference examples across multiple ecosystems:\n\n| Language\u002FPlatform | Path | Description |\n|-------------------|------|-------------|\n| [**Python**](py\u002F) | `py\u002F` | ONNX Runtime inference |\n| [**Node.js**](nodejs\u002F) | `nodejs\u002F` | Server-side JavaScript |\n| [**Browser**](web\u002F) | `web\u002F` | WebGPU\u002FWASM inference |\n| [**Java**](java\u002F) | `java\u002F` | Cross-platform JVM |\n| [**C++**](cpp\u002F) | `cpp\u002F` | High-performance C++ |\n| [**C#**](csharp\u002F) | `csharp\u002F` | .NET ecosystem |\n| [**Go**](go\u002F) | `go\u002F` | Go implementation |\n| [**Swift**](swift\u002F) | `swift\u002F` | macOS applications |\n| [**iOS**](ios\u002F) | `ios\u002F` | Native iOS apps |\n| [**Rust**](rust\u002F) | `rust\u002F` | Memory-safe systems |\n| [**Flutter**](flutter\u002F) | `flutter\u002F` | Cross-platform apps |\n\n> For detailed usage instructions, please refer to the README.md in each language directory.\n\n## Natural Text Handling\n\nSupertonic is designed to handle complex, real-world text inputs that contain natural prose, punctuation, abbreviations, and proper nouns.\n\n> 🎧 **View audio samples more easily**: Check out our [**Interactive Demo**](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FSupertone\u002Fsupertonic-3) for a better viewing experience of 
all audio examples\n\n**Overview of Test Cases:**\n\n| Category | Key Challenges | Supertonic | ElevenLabs | OpenAI | Gemini | Microsoft |\n|:--------:|:--------------:|:----------:|:----------:|:------:|:------:|:---------:|\n| Financial Expression | Decimal currency, abbreviated magnitudes (M, K), currency symbols, currency codes | ✅ | ❌ | ❌ | ❌ | ❌ |\n| Phone Number | Area codes, hyphens, extensions (ext.) | ✅ | ❌ | ❌ | ❌ | ❌ |\n| Technical Unit | Decimal numbers with units, abbreviated technical notations | ✅ | ❌ | ❌ | ❌ | ❌ |\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Example 1: Financial Expression\u003C\u002Fb>\u003C\u002Fsummary>\n\n\u003Cbr>\n\n**Text:**\n> \"The startup secured **$5.2M** in venture capital, a huge leap from their initial **$450K** seed round.\"\n\n**Challenges:**\n- Decimal point in currency ($5.2M should be read as \"five point two million\")\n- Abbreviated magnitude units (M for million, K for thousand)\n- Currency symbol ($) that needs to be properly pronounced as \"dollars\"\n\n**Audio Samples:**\n\n| System | Result | Audio Sample |\n|--------|--------|--------------|\n| **Supertonic** | ✅ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1eancUOhiSXCVoTu9ddh4S-OcVQaWrPV-\u002Fview?usp=sharing) |\n| ElevenLabs Flash v2.5 | ❌ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1-r2scv7XQ1crIDu6QOh3eqVl445W6ap_\u002Fview?usp=sharing) |\n| OpenAI TTS-1 | ❌ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1MFDXMjfmsAVOqwPx7iveS0KUJtZvcwxB\u002Fview?usp=sharing) |\n| Gemini 2.5 Flash TTS | ❌ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1dEHpNzfMUucFTJPQK0k4RcFZvPwQTt09\u002Fview?usp=sharing) |\n| VibeVoice Realtime 0.5B | ❌ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1b69XWBQnSZZ0WZeR3avv7E8mSdoN6p6P\u002Fview?usp=sharing) |\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Example 2: Phone 
Number\u003C\u002Fb>\u003C\u002Fsummary>\n\n\u003Cbr>\n\n**Text:**\n> \"You can reach the hotel front desk at **(212) 555-0142 ext. 402** anytime.\"\n\n**Challenges:**\n- Area code in parentheses that should be read as separate digits\n- Phone number with hyphen separator (555-0142)\n- Abbreviated extension notation (ext.)\n- Extension number (402)\n\n**Audio Samples:**\n\n| System | Result | Audio Sample |\n|--------|--------|--------------|\n| **Supertonic** | ✅ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1z-e5iTsihryMR8ll1-N1YXkB2CIJYJ6F\u002Fview?usp=sharing) |\n| ElevenLabs Flash v2.5 | ❌ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1HAzVXFTZfZm0VEK2laSpsMTxzufcuaxA\u002Fview?usp=sharing) |\n| OpenAI TTS-1 | ❌ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F15tjfAmb3GbjP_kmvD7zSdIWkhtAaCPOg\u002Fview?usp=sharing) |\n| Gemini 2.5 Flash TTS | ❌ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1BCL8n7yligUZyso970ud7Gf5NWb1OhKD\u002Fview?usp=sharing) |\n| VibeVoice Realtime 0.5B | ❌ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1c0c0YM_Qm7XxSk2uSVYLbITgEDTqaVzL\u002Fview?usp=sharing) |\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cb>Example 3: Technical Unit\u003C\u002Fb>\u003C\u002Fsummary>\n\n\u003Cbr>\n\n**Text:**\n> \"Our drone battery lasts **2.3h** when flying at **30kph** with full camera payload.\"\n\n**Challenges:**\n- Decimal time duration with abbreviation (2.3h = two point three hours)\n- Speed unit with abbreviation (30kph = thirty kilometers per hour)\n- Technical abbreviations (h for hours, kph for kilometers per hour)\n- Technical\u002Fengineering context requiring proper pronunciation\n\n**Audio Samples:**\n\n| System | Result | Audio Sample |\n|--------|--------|--------------|\n| **Supertonic** | ✅ | [🎧 Play 
Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1kvOBvswFkLfmr8hGplH0V2XiMxy1shYf\u002Fview?usp=sharing) |\n| ElevenLabs Flash v2.5 | ❌ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1_SzfjWJe5YEd0t3R7DztkYhHcI_av48p\u002Fview?usp=sharing) |\n| OpenAI TTS-1 | ❌ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1P5BSilj5xFPTV2Xz6yW5jitKZohO9o-6\u002Fview?usp=sharing) |\n| Gemini 2.5 Flash TTS | ❌ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1GU82SnWC50OvC8CZNjhxvNZFKQb7I9_Y\u002Fview?usp=sharing) |\n| VibeVoice Realtime 0.5B | ❌ | [🎧 Play Audio](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1lUTrxrAQy_viEK2Hlu3KLLtTCe8jvbdV\u002Fview?usp=sharing) |\n\n\u003C\u002Fdetails>\n\n> **Note:** These samples demonstrate how each system handles text normalization and pronunciation of complex expressions **without requiring pre-processing or phonetic annotations**.\n\n## Built with Supertonic\n\n| Project | Description | Links |\n|---------|-------------|-------|\n| **TLDRL** | Free, on-device TTS extension for reading any webpage | [Chrome](https:\u002F\u002Fchromewebstore.google.com\u002Fdetail\u002Ftldrl-lightning-tts-power\u002Fmdbiaajonlkomihpcaffhkagodbcgbme) |\n| **Read Aloud** | Open-source TTS browser extension | [Chrome](https:\u002F\u002Fchromewebstore.google.com\u002Fdetail\u002Fread-aloud-a-text-to-spee\u002Fhdhinadidafjejdhmfkjgnolgimiaplp) · [Edge](https:\u002F\u002Fmicrosoftedge.microsoft.com\u002Faddons\u002Fdetail\u002Fread-aloud-a-text-to-spe\u002Fpnfonnnmfjnpfgagnklfaccicnnjcdkm) · [GitHub](https:\u002F\u002Fgithub.com\u002Fken107\u002Fread-aloud) |\n| **PageEcho** | E-Book reader app for iOS | [App Store](https:\u002F\u002Fapps.apple.com\u002Fus\u002Fapp\u002Fpageecho\u002Fid6755965837) |\n| **VoiceChat** | On-device voice-to-voice LLM chatbot in the browser | [Demo](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FRickRossTN\u002Fai-voice-chat) · 
[GitHub](https:\u002F\u002Fgithub.com\u002Firelate-ai\u002Fvoice-chat) |\n| **OmniAvatar** | Talking avatar video generator from photo + speech | [Demo](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Falexnasa\u002FOmniAvatar) |\n| **CopiloTTS** | Kotlin Multiplatform TTS SDK via ONNX Runtime | [GitHub](https:\u002F\u002Fgithub.com\u002Fsigmadeltasoftware\u002FCopiloTTS) |\n| **Voice Mixer** | PyQt5 tool for mixing and modifying voice styles | [GitHub](https:\u002F\u002Fgithub.com\u002FTopping1\u002FSupertonic-Voice-Mixer) |\n| **Supertonic MNN** | Lightweight library based on MNN (fp32\u002Ffp16\u002Fint8) | [GitHub](https:\u002F\u002Fgithub.com\u002Fvra\u002Fsupertonic-mnn) · [PyPI](https:\u002F\u002Fpypi.org\u002Fproject\u002Fsupertonic-mnn\u002F) |\n| **Transformers.js** | Hugging Face's JS library with Supertonic support | [GitHub PR](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers.js\u002Fpull\u002F1459) · [Demo](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fwebml-community\u002FSupertonic-TTS-WebGPU) |\n| **Pinokio** | 1-click localhost cloud for Mac, Windows, and Linux | [Pinokio](https:\u002F\u002Fpinokio.co\u002F) · [GitHub](https:\u002F\u002Fgithub.com\u002FSUP3RMASS1VE\u002FSuperTonic-TTS) |\n\n## Citation\n\nThe following papers describe the core technologies used in Supertonic. 
If you use this system in your research or find these techniques useful, please consider citing the relevant papers:\n\n### SupertonicTTS: Main Architecture\n\nThis paper introduces the overall architecture of SupertonicTTS, including the speech autoencoder, flow-matching based text-to-latent module, and efficient design choices.\n\n```bibtex\n@article{kim2025supertonic,\n  title={SupertonicTTS: Towards Highly Efficient and Streamlined Text-to-Speech System},\n  author={Kim, Hyeongju and Yang, Jinhyeok and Yu, Yechan and Ji, Seunghun and Morton, Jacob and Bous, Frederik and Byun, Joon and Lee, Juheon},\n  journal={arXiv preprint arXiv:2503.23108},\n  year={2025},\n  url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.23108}\n}\n```\n\n### Length-Aware RoPE: Text-Speech Alignment\n\nThis paper presents Length-Aware Rotary Position Embedding (LARoPE), which improves text-speech alignment in cross-attention mechanisms.\n\n```bibtex\n@article{kim2025larope,\n  title={Length-Aware Rotary Position Embedding for Text-Speech Alignment},\n  author={Kim, Hyeongju and Lee, Juheon and Yang, Jinhyeok and Morton, Jacob},\n  journal={arXiv preprint arXiv:2509.11084},\n  year={2025},\n  url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.11084}\n}\n```\n\n### Self-Purifying Flow Matching: Training with Noisy Labels\n\nThis paper describes the self-purification technique for training flow matching models robustly with noisy or unreliable labels.\n\n```bibtex\n@article{kim2025spfm,\n  title={Training Flow Matching Models with Reliable Labels via Self-Purification},\n  author={Kim, Hyeongju and Yu, Yechan and Yi, June Young and Lee, Juheon},\n  journal={arXiv preprint arXiv:2509.19091},\n  year={2025},\n  url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.19091}\n}\n```\n\n## License\n\nThis project's sample code is released under the MIT License. 
See the [LICENSE](https:\u002F\u002Fgithub.com\u002Fsupertone-inc\u002Fsupertonic?tab=MIT-1-ov-file) for details.\n\nThe accompanying model is released under the OpenRAIL-M License. See the [LICENSE](https:\u002F\u002Fhuggingface.co\u002FSupertone\u002Fsupertonic-3\u002Fblob\u002Fmain\u002FLICENSE) file for details.\n\nThis model was trained using PyTorch, which is licensed under the BSD 3-Clause License but is not redistributed with this project. See the [LICENSE](https:\u002F\u002Fdocs.pytorch.org\u002FFBGEMM\u002Fgeneral\u002FLicense.html) for details.\n\nCopyright (c) 2026 Supertone Inc.\n","Supertonic is a fast, on-device, multilingual text-to-speech system designed for low-latency local inference. It runs on ONNX Runtime and executes entirely on the device, with no cloud services or API calls, which protects user privacy. The project supports 31 languages, reads text with high accuracy, and reduces repeat and skip errors. It suits applications that need efficient, private speech synthesis, such as mobile apps, desktop software, or any setting with strict real-time and data-security requirements. Supertonic also provides Python and Flutter SDK support so developers can integrate it into their own projects quickly.",2,"2026-06-11 03:31:37","top_language"]