[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72072":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},72072,"neutts","neuphonic\u002Fneutts","neuphonic","On-device TTS model by Neuphonic",null,"Python",5989,645,58,28,0,14,35,157,42,39.43,"Other",false,"main",true,[],"2026-06-12 02:02:58","# NeuTTS\n\nHuggingFace 🤗:\n\n- NeuTTS-Air (English): [Model](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-air), [Q8 GGUF](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-air-q8-gguf), [Q4 GGUF](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-air-q4-gguf), [Space](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fneuphonic\u002Fneutts-air)\n\n- NeuTTS-Nano Multilingual Collection:\n   - NeuTTS-Nano (English): [Model](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano), [Q8 GGUF](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano-q8-gguf), [Q4 GGUF](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano-q4-gguf)\n   - NeuTTS-Nano-French: [Model](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano-french), [Q8 GGUF](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano-french-q8-gguf), [Q4 GGUF](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano-french-q4-gguf)\n   - NeuTTS-Nano-German: 
[Model](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano-german), [Q8 GGUF](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano-german-q8-gguf), [Q4 GGUF](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano-german-q4-gguf)\n   - NeuTTS-Nano-Spanish: [Model](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano-spanish), [Q8 GGUF](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano-spanish-q8-gguf), [Q4 GGUF](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano-spanish-q4-gguf)\n   - [Multilingual Space](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fneuphonic\u002Fneutts-nano-multilingual-collection)\n\n[NeuTTS-Nano Demo Video](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F629ec5b2-4818-4fa6-987a-99fcbadc56bc)\n\n_Created by [Neuphonic](http:\u002F\u002Fneuphonic.com\u002F) - building faster, smaller, on-device voice AI_\n\nState-of-the-art Voice AI has been locked behind web APIs for too long. NeuTTS is a collection of open source, on-device, TTS speech language models with instant voice cloning. 
Built on LLM backbones, NeuTTS brings natural-sounding speech, real-time performance, built-in security and speaker cloning to your local device - unlocking a new category of embedded voice agents, assistants, toys, and compliance-safe apps.\n\n## Key Features\n\n- 🗣Best-in-class realism for their size - produce natural, ultra-realistic voices that sound human, at the sweet spot between speed, size, and quality for real-world applications\n- 📱Optimised for on-device deployment - quantisations provided in GGUF format, ready to run on phones, laptops, or even Raspberry Pis\n- 👫Instant voice cloning - create your own speaker with as little as 3 seconds of audio\n- 🚄Simple LM + codec architecture - making development and deployment simple\n\n> [!CAUTION]\n> Websites like neutts.com are popping up, and they are not affiliated with Neuphonic, our GitHub or this repo.\n>\n> We are on neuphonic.com only. Please be careful out there! 🙏\n\n## Model Details\n\nNeuTTS models are built from small LLM backbones - lightweight yet capable language models optimised for text understanding and generation - as well as a powerful combination of technologies designed for efficiency and quality:\n\n- **Supported Languages**: English, Spanish, German, French (model-dependent)\n- **Audio Codec**: [NeuCodec](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneucodec) - our 50 Hz neural audio codec that achieves exceptional audio quality at low bitrates using a single codebook\n- **Context Window**: 2048 tokens, enough for processing ~30 seconds of audio (including prompt duration)\n- **Format**: Quantisations available in GGUF format for efficient on-device inference\n- **Responsibility**: Watermarked outputs\n- **Inference Speed**: Real-time generation on mid-range devices\n- **Power Consumption**: Optimised for mobile and embedded devices\n\n\n|  | NeuTTS-Air | NeuTTS-Nano Models |\n|---|---:|---:|\n| **# Params (Active)** | ~360m | ~120m |\n| **# Params (Emb + Active)** | ~552m | 
~229m |\n| **Cloning** | Yes | Yes |\n| **License** | Apache 2.0 | NeuTTS Open License 1.0 |\n\n## Throughput Benchmarking\n\nThese benchmarks are for the Q4_0 quantisations [neutts-air-Q4_0](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-air-q4-gguf) and [neutts-nano-Q4_0](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneutts-nano-q4-gguf). Note that all models in the NeuTTS-Nano Multilingual Collection have an identical architecture, so these results should apply to any Q4_0 model in the collection.\n\nCPU benchmarking used [llama-bench](https:\u002F\u002Fgithub.com\u002Fggml-org\u002Fllama.cpp\u002Ftree\u002Fmaster\u002Ftools\u002Fllama-bench) (from llama.cpp) to measure prefill and decode throughput at multiple context sizes. For the GPU benchmark (RTX 4090), we use vLLM to maximise throughput, measured with the [vLLM benchmark](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Fstable\u002Fcli\u002Fbench\u002Fthroughput\u002F).\n\nWe include benchmarks on four devices: Galaxy A25 5G, AMD Ryzen 9 HX 370, iMac M4 16 GB, NVIDIA GeForce RTX 4090.\n\n\n|  | NeuTTS-Air | NeuTTS-Nano |\n|---|---:|---:|\n| **Galaxy A25 5G (CPU only)** | 20 tokens\u002Fs | 45 tokens\u002Fs |\n| **AMD Ryzen 9 HX 370 (CPU only)** | 119 tokens\u002Fs | 221 tokens\u002Fs |\n| **iMac M4 16 GB (CPU only)** | 111 tokens\u002Fs | 195 tokens\u002Fs |\n| **RTX 4090** | 16194 tokens\u002Fs | 19268 tokens\u002Fs |\n\n\n> [!NOTE]\n> llama-bench used 14 threads for prefill and 16 threads for decode (as configured in the benchmark run) on AMD Ryzen 9 HX 370 and iMac M4 16 GB, and 6 threads for each on the Galaxy A25 5G. 
The reported tokens\u002Fs figures are for 500 prefill tokens and 250 generated output tokens.\n\n> [!NOTE]\n> These benchmarks cover only the Speech Language Model; they do not include the Codec, which is needed for a full audio generation pipeline.\n\n## Get Started with NeuTTS\n\n> [!NOTE]\n> We have added a [streaming example](examples\u002Fbasic_streaming_example.py) using the `llama-cpp-python` library as well as a [finetuning script](examples\u002Ffinetune.py). For finetuning details, please refer to the [finetune guide](TRAINING.md).\n\n1. **Install NeuTTS**\n   ```bash\n   pip install neutts\n   ```\n\n   Or, for a local editable install, clone this repository and run the following in the base folder:\n   ```bash\n   pip install -e .\n   ```\n\n   Alternatively, to install all dependencies, including `onnxruntime` and `llama-cpp-python` (equivalent to steps 2 and 3 below):\n\n   ```bash\n   pip install \"neutts[all]\"\n   ```\n\n   or for an editable install:\n\n   ```bash\n   pip install -e \".[all]\"\n   ```\n\n2. **(Optional) Install `llama-cpp-python` to use `.gguf` models.**\n\n   To use any of the GGUF backbones (e.g., in basic_streaming_example.py) you need to install the llama-cpp-python package.\n\n   For the best performance, you must compile this package from source with hardware acceleration enabled for your specific operating system and target device (CPU or GPU).\n\n   #### macOS (Apple Silicon)\n\n   For M-series Macs, it is highly recommended to use Apple's native Accelerate framework for optimized CPU performance:\n\n   ```bash\n   CMAKE_ARGS=\"-DGGML_METAL=OFF -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=Apple\" pip install \"neutts[llama]\" --force-reinstall --no-cache-dir\n   ```\n\n   #### Linux (OpenBLAS)\n   For Linux, you can accelerate CPU performance using OpenBLAS.\n\n   *Prerequisite: Ensure you have OpenBLAS installed on your system (e.g., `sudo apt-get install libopenblas-dev` on Ubuntu). 
For other distros, refer to the [OpenBLAS Installation Guide](https:\u002F\u002Fgithub.com\u002FOpenMathLib\u002FOpenBLAS\u002Fblob\u002Fdevelop\u002Fdocs\u002Finstall.md).*\n\n   ```bash\n   CMAKE_ARGS=\"-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS\" pip install \"neutts[llama]\" --force-reinstall --no-cache-dir\n   ```\n\n   #### Windows (OpenBLAS)\n\n   *Prerequisite: Ensure you have OpenBLAS installed on your system. Please refer to the [OpenBLAS Installation Guide](https:\u002F\u002Fgithub.com\u002FOpenMathLib\u002FOpenBLAS\u002Fblob\u002Fdevelop\u002Fdocs\u002Finstall.md).*\n\n   In PowerShell, set the environment variable and run the install command like this:\n   ```pwsh\n   $env:CMAKE_ARGS=\"-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS\"; pip install \"neutts[llama]\" --force-reinstall --no-cache-dir\n   ```\n\n   #### Looking for GPU Support?\n   If you have a dedicated GPU (NVIDIA\u002FCUDA, AMD\u002FROCm, M-Series Mac\u002FMetal) and want to use it instead of the CPU, the CMake flags will be different. Please refer to the official [llama-cpp-python documentation](https:\u002F\u002Fgithub.com\u002Fabetlen\u002Fllama-cpp-python\u002Fblob\u002Fmain\u002FREADME.md) for the exact flags required for your specific hardware.\n\n3. **(Optional) Install `onnxruntime` to use the `.onnx` decoder.**\n   ```bash\n   pip install \"neutts[onnx]\"\n   ```\n\n## Examples\n\nTo get started with the example scripts, clone this repository and navigate into the project directory:\n\n   ```bash\n   git clone https:\u002F\u002Fgithub.com\u002Fneuphonic\u002Fneutts.git\n   cd neutts\n   ```\n\nSeveral examples are available, including a Jupyter notebook in the `examples` folder.\n\n### Basic Example\nRun the basic example script to synthesize speech:\n\n```bash\npython -m examples.basic_example \\\n  --input_text \"My name is Andy. I'm 25 and I just moved to London. 
The underground is pretty confusing, but it gets me around in no time at all.\" \\\n  --ref_audio samples\u002Fjo.wav \\\n  --ref_text samples\u002Fjo.txt\n```\n\nTo specify a particular model repo for the backbone or codec, add the `--backbone` argument. Available backbones are listed in [NeuTTS-Air](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fneuphonic\u002Fneutts-air) and [NeuTTS-Nano Multilingual Collection](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fneuphonic\u002Fneutts-nano-multilingual-collection) huggingface collections.\n\n> [!CAUTION]\n> If you are using a non-English backbone, it is highly recommended to use a same-language reference for best performance. See the 'example reference files' section below to select an appropriate example reference.\n\n### One-Code Block Usage\n\n```python\nfrom neutts import NeuTTS\nimport soundfile as sf\n\ntts = NeuTTS(\n   backbone_repo=\"neuphonic\u002Fneutts-nano\", # or 'neuphonic\u002Fneutts-nano-q4-gguf' with llama-cpp-python installed\n   backbone_device=\"cpu\",\n   codec_repo=\"neuphonic\u002Fneucodec\",\n   codec_device=\"cpu\"\n)\ninput_text = \"My name is Andy. I'm 25 and I just moved to London. The underground is pretty confusing, but it gets me around in no time at all.\"\n\nref_text = \"samples\u002Fjo.txt\"\nref_audio_path = \"samples\u002Fjo.wav\"\n\nref_text = open(ref_text, \"r\").read().strip()\nref_codes = tts.encode_reference(ref_audio_path)\n\nwav = tts.infer(input_text, ref_codes, ref_text)\nsf.write(\"test.wav\", wav, 24000)\n```\n\n### Streaming\n\nSpeech can also be synthesised in _streaming mode_, where audio is generated in chunks and plays as generated. Note that this requires pyaudio to be installed. To do this, run:\n\n```bash\npython -m examples.basic_streaming_example \\\n  --input_text \"My name is Andy. I'm 25 and I just moved to London. 
The underground is pretty confusing, but it gets me around in no time at all.\" \\\n  --ref_codes samples\u002Fjo.pt \\\n  --ref_text samples\u002Fjo.txt\n```\n\nAgain, a particular model repo can be specified with the `--backbone` argument - note that for streaming the model must be in GGUF format.\n\n## Preparing References for Cloning\n\nNeuTTS requires two inputs:\n\n1. A reference audio sample (`.wav` file)\n2. A text string\n\nThe model then synthesises the text as speech in the style of the reference audio. This is what enables NeuTTS models' instant voice cloning capability.\n\n### Example Reference Files\n\nYou can find some ready-to-use references in the `samples` folder:\n\n- English:\n   - `dave.wav`\n   - `jo.wav`\n- Spanish:\n   - `mateo.wav`\n- German:\n   - `greta.wav`\n- French:\n   - `juliette.wav`\n\n### Guidelines for Best Results\n\nFor optimal performance, reference audio samples should be:\n\n1. **Mono channel**\n2. **16-44 kHz sample rate**\n3. **3–15 seconds in length**\n4. **Saved as a `.wav` file**\n5. **Clean** — minimal to no background noise\n6. **Natural, continuous speech** — like a monologue or conversation, with few pauses, so the model can capture tone effectively\n\n## Guidelines for minimizing Latency\n\nFor optimal performance on-device:\n\n1. Use the GGUF model backbones\n2. Pre-encode references (see `examples\u002Fencode_reference.py` or `examples\u002Fbasic_example.py`)\n3. 
Use the [ONNX codec decoder](https:\u002F\u002Fhuggingface.co\u002Fneuphonic\u002Fneucodec-onnx-decoder)\n\nTake a look at this example in the [examples README](examples\u002FREADME.md###minimal-latency-example) to get started.\n\n## Responsibility\n\nBy default, every audio file generated by NeuTTS includes a [Perth (Perceptual Threshold) Watermark](https:\u002F\u002Fgithub.com\u002Fresemble-ai\u002Fperth).\n\nNote: If you install neutts using `uv sync` within the repo, the program will still run, but watermarking will be disabled (you will see a warning that perth is missing). This is because `uv sync` currently fails to pull the required Perth dependencies; please see [this issue](https:\u002F\u002Fgithub.com\u002Fresemble-ai\u002FPerth\u002F). To ensure watermarking is active, please install the package via PyPI instead (`pip install neutts`).\n\n## Disclaimer\n\nDon't use this model to do bad things… please.\n\n## Developer Requirements\n\nTo contribute to this project, first install pre-commit:\n\n```bash\npip install pre-commit\n```\n\nThen install the hooks:\n\n```bash\npre-commit install\n```\n\n## Running Tests\n\nFirst, install the dev requirements:\n\n```\npip install -r requirements-dev.txt\n```\n\nTo run the tests:\n\n```\npytest tests\u002F\n```\n\nTo test loading of all the official backbones and codecs, use:\n\n```\nRUN_SLOW=true pytest tests\u002F\n```\n","NeuTTS is an on-device text-to-speech (TTS) model developed by Neuphonic. Built on lightweight language models, it runs on local devices and supports instant voice cloning, creating a personalised voice from only a few seconds of sample audio. It is optimised for low-resource environments such as mobile phones, laptops, and even Raspberry Pis, and uses GGUF-format quantisation for efficient inference. NeuTTS is well suited to applications that need natural, fluent speech while keeping data secure, such as embedded voice assistants, toys, or applications with strict compliance requirements. Its simple architecture lets developers integrate and deploy these speech synthesis capabilities easily.",2,"2026-06-11 03:40:13","high_star"]