[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72528":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":18,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":22,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":25,"readmeContent":26,"aiSummary":27,"trendingCount":16,"starSnapshotCount":16,"syncStatus":28,"lastSyncTime":29,"discoverSource":30},72528,"tunix","google\u002Ftunix","google","A Lightweight LLM Post-Training Library","",null,"Python",2327,306,22,44,0,3,9,58,75.76,"Apache License 2.0",false,"main",[],"2026-06-12 04:01:06","# Tunix: A Lightweight LLM Post-Training Library\n\n\u003Cdiv align=\"left\">\n\n\u003Ca href=\"https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Findex.html\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdocumentation-blue\">\u003C\u002Fa>\n\n\u003C\u002Fdiv>\n\n**Tunix (Tune-in-JAX)** is a JAX-based library designed to streamline the\npost-training of Large Language Models. It provides efficient and scalable\nsupport for:\n\n- **SOTA Training performance on TPUs**\n- **Supervised Fine-Tuning**\n- **Reinforcement Learning (RL)**\n- **Agentic RL**\n\nTunix leverages the power of JAX for accelerated computation and seamless\nintegration with JAX-based modeling frameworks like\n[Flax NNX](https:\u002F\u002Fflax.readthedocs.io\u002Fen\u002Flatest\u002Fnnx_basics.html), and\nintegrates with high-performance inference engines like vLLM and SGLang-JAX for\nrollout. 
**For our detailed documentation, please refer to the [Tunix Website](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Findex.html)**.\n\n\n**Current Status: V2 Release**\n\nTunix is under active development. Our team is actively working on expanding its\ncapabilities, usability, and performance. Stay tuned for upcoming updates and new\nfeatures! See [Talks and Announcements](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Ftalks.html) for the latest updates, talks, and blog posts.\n\n\n## High Level Architecture\nTunix serves as a state-of-the-art post-training library within the JAX training\nstack, positioned to leverage foundational tools like Flax, Optax, Orbax, etc.\nfor efficient model refinement. It sits as an intermediate layer between these\ncore utilities and optimized models like MaxText and MaxDiffusion, streamlining\ntuning workflows on top of the XLA and JAX infrastructure. See [Design Overview](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Fdesign.html) for more details on the architecture.\n\n![Tunix in JAX ecosystem](docs\u002Fimages\u002Ftunix_in_jax_ecosystem.png)\n\n## Key Features\n-   **[Supervised Fine-Tuning (SFT)](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Falgorithms.html)**:\n    -   Full Weights Fine-Tuning\n    -   [PEFT](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Fperformance.html#peft-with-lora) (Parameter-Efficient\n        Fine-Tuning)\n    -   [DPO](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.18290) (Direct Preference Optimization)\n      -   [ORPO](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.07691) (Odds Ratio Preference Optimization)\n-   **[Reinforcement Learning (RL)](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Falgorithms.html)**:\n    -   [PPO](https:\u002F\u002Farxiv.org\u002Fabs\u002F1707.06347) (Proximal Policy Optimization)\n    -   [GRPO](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.03300) (Group Relative Policy\n 
       Optimization)\n      -   [GSPO-Token](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.18071) (Token-level Group\n          Sequence Policy Optimization)\n      -   [DAPO](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.14476) (Decoupled Clip and Dynamic sAmpling\n          Policy Optimization)\n      -   [Dr.GRPO](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.20783) (GRPO Done Right)\n-   **[Agentic RL](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Fagentic_rl.html)**:\n    -   Multi-turn tool use\n    -   Asynchronous rollout for high-throughput trajectory collection\n    -   Trajectory batching and grouping\n\n## News\n\n-   [2026\u002F04] Gemma4 models are supported in Tunix! Stay tuned for upcoming training recipes.\n-   [2026\u002F01] Tunix models now support efficient kernel execution ([splash attn](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Ftunix\u002Fblob\u002Fmain\u002Ftunix\u002Fmodels\u002Fqwen3\u002Fmodel.py#L150-L151), [GMM MoE](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Ftunix\u002Fblob\u002Fmain\u002Ftunix\u002Fmodels\u002Fqwen3\u002Fmodel.py#L638)).\n-   [2025\u002F12] [Agentic RL Training](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Ftunix\u002Ftree\u002Fmain\u002Ftunix\u002Frl\u002Fagentic) has been released, with efficient support for multi-turn agent-env interaction, tool usage, async rollout, etc.\n\n## Framework & Infra Highlights\n-   **Modularity**:\n    -   Components are designed to be reusable and composable\n    -   Easy to customize and extend\n-   **Performance & Efficiency**:\n    -   Native [vLLM](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Frollout.html#vllm) and\n        [SGLang-JAX](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Frollout.html#sglang) integration on TPU for performant\n        rollout\n    -   Native [MaxText](https:\u002F\u002Fgithub.com\u002FAI-Hypercomputer\u002Fmaxtext) model\n        integration for high-performance kernels and model 
execution\n    -   [Micro-batching](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Fperformance.html#batching-config) support for efficient\n        component-level execution\n-   **Stability**:\n    -   Seamless multi-host distributed training with Pathways, which can scale\n        to thousands of devices\n    -   [Checkpointing and Fault Tolerance](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Freliability.html)\n\n## Getting Started\n**Installation:** Jump to [Installation](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Fquickstart.html#installation) to install Tunix and run your first training\njob.\n\nFor TPU users integrating `vllm` and `tpu-inference`, there are two supported\nsetup paths:\n\n- Docker image builds use [Dockerfile](Dockerfile) and install\n    the pinned dependencies directly from `requirements\u002Frequirements.txt` and\n    `requirements\u002Fspecial_requirements.txt`.\n- Local TPU VM or developer-machine installs can use\n    [scripts\u002Finstall_tunix_vllm_requirement.sh](scripts\u002Finstall_tunix_vllm_requirement.sh),\n    which installs the same requirement files outside Docker.\n\nThese are separate entry points. If you are building the Docker image, you do\nnot need to run the install script inside the container build.\n\n**Examples:** To get started, we have a number of detailed examples and tutorials. 
You can see [Quick Start](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Fquickstart.html) for a great set of starting examples and [Examples and Guides](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Fexamples.html) for a comprehensive list of all the notebooks and examples we have.\n\n\n## Supported Models\nTunix supports a growing list of models including Gemma, Llama, and Qwen families.\nSee [Models](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Fmodels.html) for a full list and details on how to add new ones.\n\n\n## Contributing and Feedback\nWe welcome contributions! As Tunix is in early development, the contribution\nprocess is still being formalized. The detailed contribution process is outlined\n[here](https:\u002F\u002Ftunix.readthedocs.io\u002Fen\u002Flatest\u002Fcontributing.html). In\nthe meantime, you can make feature requests, report issues and ask questions in\nour\n[Tunix GitHub discussion forum](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Ftunix\u002Fdiscussions).\n\n## Collaborations and Partnership\n[GRL](https:\u002F\u002Fgithub.com\u002Flmgame-org\u002FGRL\u002Fblob\u002Ftunix_integration_dev\u002FREADME.md)\n(Game Reinforcement Learning), developed by\n[Hao AI Lab](https:\u002F\u002Fhao-ai-lab.github.io\u002F) from UCSD, is an open-source\nframework for post-training large language models through multi-turn RL on\nchallenging games. 
In collaboration with Tunix, GRL integrates seamless TPU\nsupport, letting users quickly run scalable, reproducible RL experiments (like\nPPO rollouts on Qwen2.5-0.5B-Instruct) on TPU v4 meshes with\n[minimal setup](https:\u002F\u002Fgithub.com\u002Flmgame-org\u002FGRL\u002Fblob\u002Ftunix_integration_dev\u002FREADME.md#5-launch-the-quick-test-defaults-to-qwen2505b-supports-4-tpu-v4-with-mesh-22).\nThis partnership empowers the community to push LLM capabilities further,\ncombining Tunix’s optimized TPU runtime with GRL’s flexible game RL pipeline for\ncutting-edge research and easy reproducibility.\n\n## Citing Tunix\n```bibtex\n@misc{tunix2025,\n  title={Tunix (Tune-in-JAX)},\n  author={Bao, Tianshu and Carpenter, Jeff and Chai, Lin and Gao, Haoyu and Jiang, Yangmu and Noghabi, Shadi and Sharma, Abheesht and Tan, Sizhi and Wang, Lance and Yan, Ann and Yu, Weiren and others},\n  year={2025},\n  howpublished={\\url{https:\u002F\u002Fgithub.com\u002Fgoogle\u002Ftunix}},\n}\n```\n\n## Acknowledgements\n\nThank you to all our wonderful contributors!\n\n[![Contributors](https:\u002F\u002Fcontrib.rocks\u002Fimage?repo=google\u002Ftunix)](https:\u002F\u002Fgithub.com\u002Fgoogle\u002Ftunix\u002Fgraphs\u002Fcontributors)\n","Tunix is a lightweight, JAX-based library for post-training large language models. It supports state-of-the-art training performance on TPUs, supervised fine-tuning, reinforcement learning, and agentic reinforcement learning, and it provides efficient, scalable support through seamless integration with modeling frameworks such as Flax NNX and with high-performance inference engines such as vLLM and SGLang-JAX. Tunix suits a wide range of scenarios that call for post-training optimization of large language models, particularly research and development settings that prioritize computational efficiency and improved model performance.",2,"2026-06-11 03:42:26","high_star"]