[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-70965":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":25,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":29,"readmeContent":30,"aiSummary":31,"trendingCount":16,"starSnapshotCount":16,"syncStatus":32,"lastSyncTime":33,"discoverSource":34},70965,"axolotl","axolotl-ai-cloud\u002Faxolotl","axolotl-ai-cloud","Go ahead and axolotl questions","https:\u002F\u002Fdocs.axolotl.ai",null,"Python",12034,1367,58,117,0,24,43,148,72,44.41,"Apache License 2.0",false,"main",true,[27,28],"fine-tuning","llm","2026-06-12 02:02:46","\u003Cp align=\"center\">\n    \u003Cpicture>\n        \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fraw.githubusercontent.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002F887513285d98132142bf5db2a74eb5e0928787f1\u002Fimage\u002Faxolotl_logo_digital_white.svg\">\n        \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"https:\u002F\u002Fraw.githubusercontent.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002F887513285d98132142bf5db2a74eb5e0928787f1\u002Fimage\u002Faxolotl_logo_digital_black.svg\">\n        \u003Cimg alt=\"Axolotl\" src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002F887513285d98132142bf5db2a74eb5e0928787f1\u002Fimage\u002Faxolotl_logo_digital_black.svg\" width=\"400\" height=\"104\" style=\"max-width: 100%;\">\n    \u003C\u002Fpicture>\n\u003C\u002Fp>\n  \u003Cp align=\"center\">\n      \u003Cstrong>A Free and Open Source LLM Fine-tuning Framework\u003C\u002Fstrong>\u003Cbr>\n  
\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002Faxolotl-ai-cloud\u002Faxolotl.svg?color=blue\" alt=\"GitHub License\">\n    \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Factions\u002Fworkflows\u002Ftests.yml\u002Fbadge.svg\" alt=\"tests\">\n    \u003Ca href=\"https:\u002F\u002Fcodecov.io\u002Fgh\u002Faxolotl-ai-cloud\u002Faxolotl\">\u003Cimg src=\"https:\u002F\u002Fcodecov.io\u002Fgh\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fbranch\u002Fmain\u002Fgraph\u002Fbadge.svg\" alt=\"codecov\">\u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Freleases\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Frelease\u002Faxolotl-ai-cloud\u002Faxolotl.svg\" alt=\"Releases\">\u003C\u002Fa>\n    \u003Cbr\u002F>\n    \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fgraphs\u002Fcontributors\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fcontributors-anon\u002Faxolotl-ai-cloud\u002Faxolotl?color=yellow&style=flat-square\" alt=\"contributors\" style=\"height: 20px;\">\u003C\u002Fa>\n    \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Faxolotl-ai-cloud\u002Faxolotl\" alt=\"GitHub Repo stars\">\n    \u003Cbr\u002F>\n    \u003Ca href=\"https:\u002F\u002Fdiscord.com\u002Finvite\u002FHhrNrHJPRb\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdiscord-7289da.svg?style=flat-square&logo=discord\" alt=\"discord\" style=\"height: 20px;\">\u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Ftwitter.com\u002Faxolotl_ai\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Ftwitter\u002Ffollow\u002Faxolotl_ai?style=social\" alt=\"twitter\" style=\"height: 20px;\">\u003C\u002Fa>\n    \u003Ca 
href=\"https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fblob\u002Fmain\u002Fexamples\u002Fcolab-notebooks\u002Fcolab-axolotl-example.ipynb\">\u003Cimg src=\"https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg\" alt=\"google-colab\" style=\"height: 20px;\">\u003C\u002Fa>\n    \u003Cbr\u002F>\n    \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Factions\u002Fworkflows\u002Ftests-nightly.yml\u002Fbadge.svg\" alt=\"tests-nightly\">\n    \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Factions\u002Fworkflows\u002Fmulti-gpu-e2e.yml\u002Fbadge.svg\" alt=\"multigpu-semi-weekly tests\">\n\u003C\u002Fp>\n\n\n## 🎉 Latest Updates\n\n- 2026\u002F04:\n  - New model support has been added in Axolotl for [Mistral Medium 3.5](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Ftree\u002Fmain\u002Fexamples\u002Fmistral-medium-3_5) and [Gemma 4](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Ftree\u002Fmain\u002Fexamples\u002Fgemma4).\n  - Axolotl is now [uv-first](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fpull\u002F3545) and has [SonicMoE fused LoRA](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fpull\u002F3519) support.\n- 2026\u002F03:\n  - New model support has been added in Axolotl for [Mistral Small 4](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Ftree\u002Fmain\u002Fexamples\u002Fmistral4), [Qwen3.5, Qwen3.5 MoE](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Ftree\u002Fmain\u002Fexamples\u002Fqwen3.5), [GLM-4.7-Flash](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Ftree\u002Fmain\u002Fexamples\u002Fglm47-flash), [GLM-4.6V](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Ftree\u002Fmain\u002Fexamples\u002Fglm46v), and 
[GLM-4.5-Air](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Ftree\u002Fmain\u002Fexamples\u002Fglm45).\n  - [MoE expert quantization](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fexpert_quantization.html) support (via `quantize_moe_experts: true`) greatly reduces VRAM when training MoE models (FSDP2 compat).\n- 2026\u002F02:\n  - [ScatterMoE LoRA](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fpull\u002F3410) support. LoRA fine-tuning directly on MoE expert weights using custom Triton kernels.\n  - Axolotl now has support for [SageAttention](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fpull\u002F2823) and [GDPO](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fpull\u002F3353) (Generalized DPO).\n- 2026\u002F01:\n  - New integration for [EAFT](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fpull\u002F3366) (Entropy-Aware Focal Training), weights loss by entropy of the top-k logit distribution, and [Scalable Softmax](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fpull\u002F3338), improves long context in attention.\n- 2025\u002F12:\n  - Axolotl now includes support for [Kimi-Linear](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fkimi-linear.html), [Plano-Orchestrator](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fplano.html), [MiMo](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fmimo.html), [InternVL 3.5](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Finternvl3_5.html), [Olmo3](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Folmo3.html), [Trinity](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Ftrinity.html), and [Ministral3](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fministral3.html).\n  - [Distributed Muon Optimizer](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fpull\u002F3264) support has been added for FSDP2 
pretraining.\n- 2025\u002F10: New model support has been added in Axolotl for: [Qwen3 Next](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fqwen3-next.html), [Qwen2.5-vl, Qwen3-vl](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Ftree\u002Fmain\u002Fexamples\u002Fqwen2_5-vl), [Qwen3, Qwen3MoE](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fqwen3.html), [Granite 4](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fgranite4.html), [HunYuan](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fhunyuan.html), [Magistral 2509](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fmagistral\u002Fvision.html), [Apertus](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fapertus.html), and [Seed-OSS](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fseed-oss.html).\n\n\u003Cdetails>\n\n\u003Csummary>Expand older updates\u003C\u002Fsummary>\n\n- 2025\u002F09: Axolotl now has text diffusion training. Read more [here](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Ftree\u002Fmain\u002Fsrc\u002Faxolotl\u002Fintegrations\u002Fdiffusion).\n- 2025\u002F08: QAT has been updated to include NVFP4 support. See [PR](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fpull\u002F3107).\n- 2025\u002F07:\n  - ND Parallelism support has been added into Axolotl. Compose Context Parallelism (CP), Tensor Parallelism (TP), and Fully Sharded Data Parallelism (FSDP) within a single node and across multiple nodes. 
Check out the [blog post](https:\u002F\u002Fhuggingface.co\u002Fblog\u002Faccelerate-nd-parallel) for more info.\n  - Axolotl adds more models: [GPT-OSS](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fgpt-oss.html), [Gemma 3n](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fgemma3n.html), [Liquid Foundation Model 2 (LFM2)](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002FLiquidAI.html), and [Arcee Foundation Models (AFM)](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Farcee.html).\n  - FP8 finetuning with fp8 gather op is now possible in Axolotl via `torchao`. Get started [here](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmixed_precision.html#sec-fp8)!\n  - [Voxtral](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fvoxtral.html), [Magistral 1.1](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fmagistral.html), and [Devstral](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fdevstral.html) with mistral-common tokenizer support have been integrated into Axolotl!\n  - TiledMLP support for single-GPU to multi-GPU training with DDP, DeepSpeed and FSDP has been added to enable Arctic Long Sequence Training (ALST). See [examples](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Ftree\u002Fmain\u002Fexamples\u002Falst) for using ALST with Axolotl!\n- 2025\u002F06: Magistral with mistral-common tokenizer support has been added to Axolotl. See [docs](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fmagistral.html) to start training your own Magistral models with Axolotl!\n- 2025\u002F05: Quantization Aware Training (QAT) support has been added to Axolotl. Explore the [docs](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fqat.html) to learn more!\n- 2025\u002F04: Llama 4 support has been added in Axolotl. 
See [docs](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmodels\u002Fllama-4.html) to start training your own Llama 4 models with Axolotl's linearized version!\n- 2025\u002F03: Axolotl has implemented Sequence Parallelism (SP) support. Read the [blog](https:\u002F\u002Fhuggingface.co\u002Fblog\u002Faxolotl-ai-co\u002Flong-context-with-sequence-parallelism-in-axolotl) and [docs](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fsequence_parallelism.html) to learn how to scale your context length when fine-tuning.\n- 2025\u002F03: (Beta) Fine-tuning Multimodal models is now supported in Axolotl. Check out the [docs](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmultimodal.html) to fine-tune your own!\n- 2025\u002F02: Axolotl has added LoRA optimizations to reduce memory usage and improve training speed for LoRA and QLoRA in single GPU and multi-GPU training (DDP and DeepSpeed). Jump into the [docs](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Flora_optims.html) to give it a try.\n- 2025\u002F02: Axolotl has added GRPO support. Dive into our [blog](https:\u002F\u002Fhuggingface.co\u002Fblog\u002Faxolotl-ai-co\u002Ftraining-llms-w-interpreter-feedback-wasm) and [GRPO example](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Fgrpo_code) and have some fun!\n- 2025\u002F01: Axolotl has added Reward Modelling \u002F Process Reward Modelling fine-tuning support. 
See [docs](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Freward_modelling.html).\n\n\u003C\u002Fdetails>\n\n## ✨ Overview\n\nAxolotl is a free and open-source tool designed to streamline post-training and fine-tuning for the latest large language models (LLMs).\n\nFeatures:\n\n- **Multiple Model Support**: Train various models like GPT-OSS, LLaMA, Mistral, Mixtral, Pythia, and many more models available on the Hugging Face Hub.\n- **Multimodal Training**: Fine-tune vision-language models (VLMs) including LLaMA-Vision, Qwen2-VL, Pixtral, LLaVA, SmolVLM2, GLM-4.6V, InternVL 3.5, Gemma 3n, and audio models like Voxtral with image, video, and audio support.\n- **Training Methods**: Full fine-tuning, LoRA, QLoRA, GPTQ, QAT, Preference Tuning (DPO, IPO, KTO, ORPO), RL (GRPO, GDPO), and Reward Modelling (RM) \u002F Process Reward Modelling (PRM).\n- **Easy Configuration**: Re-use a single YAML configuration file across the full fine-tuning pipeline: dataset preprocessing, training, evaluation, quantization, and inference.\n- **Performance Optimizations**: [Multipacking](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmultipack.html), [Flash Attention 2\u002F3\u002F4](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fattention.html#flash-attention), [Xformers](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fattention.html#xformers), [Flex Attention](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fattention.html#flex-attention), [SageAttention](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fattention.html#sageattention), [Liger Kernel](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fcustom_integrations.html#liger-kernels), [Cut Cross Entropy](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fcustom_integrations.html#cut-cross-entropy), [ScatterMoE](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fcustom_integrations.html#kernels-integration), [Sequence Parallelism (SP)](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fsequence_parallelism.html), [LoRA 
optimizations](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Flora_optims.html), [Multi-GPU training (FSDP1, FSDP2, DeepSpeed)](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmulti-gpu.html), [Multi-node training (Torchrun, Ray)](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmulti-node.html), and many more!\n- **Flexible Dataset Handling**: Load from local, HuggingFace, and cloud (S3, Azure, GCP, OCI) datasets.\n- **Cloud Ready**: We ship [Docker images](https:\u002F\u002Fhub.docker.com\u002Fu\u002Faxolotlai) and also [PyPI packages](https:\u002F\u002Fpypi.org\u002Fproject\u002Faxolotl\u002F) for use on cloud platforms and local hardware.\n\n\n\n## 🚀 Quick Start - LLM Fine-tuning in Minutes\n\n**Requirements**:\n\n- NVIDIA GPU (Ampere or newer for `bf16` and Flash Attention) or AMD GPU\n- Python >=3.11 (3.12 recommended)\n- PyTorch ≥2.9.1\n\n### Google Colab\n\n[![Open In Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fblob\u002Fmain\u002Fexamples\u002Fcolab-notebooks\u002Fcolab-axolotl-example.ipynb#scrollTo=msOCO4NRmRLa)\n\n### Installation\n\n```bash\n# install uv if you don't already have it installed (restart shell after)\ncurl -LsSf https:\u002F\u002Fastral.sh\u002Fuv\u002Finstall.sh | sh\n\n# change depending on system\nexport UV_TORCH_BACKEND=cu128\n\n# create a new virtual environment\nuv venv --python 3.12\nsource .venv\u002Fbin\u002Factivate\n\nuv pip install torch==2.10.0 torchvision\nuv pip install --no-build-isolation axolotl[deepspeed]\n\n# Download example axolotl configs, deepspeed configs\naxolotl fetch examples\naxolotl fetch deepspeed_configs  # OPTIONAL\n```\n\n#### Using Docker\n\nInstalling with Docker can be less error prone than installing in your own environment.\n```bash\ndocker run --gpus '\"all\"' --ipc=host --rm -it axolotlai\u002Faxolotl:main-latest\n```\n\nOther installation approaches are 
described [here](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Finstallation.html).\n\n#### Cloud Providers\n\n\u003Cdetails>\n\n- [RunPod](https:\u002F\u002Frunpod.io\u002Fgsc?template=v2ickqhz9s&ref=6i7fkpdz)\n- [Vast.ai](https:\u002F\u002Fcloud.vast.ai?ref_id=62897&template_id=bdd4a49fa8bce926defc99471864cace&utm_source=github&utm_medium=developer_community&utm_campaign=template_launch_axolotl&utm_content=readme)\n- [PRIME Intellect](https:\u002F\u002Fapp.primeintellect.ai\u002Fdashboard\u002Fcreate-cluster?image=axolotl&location=Cheapest&security=Cheapest&show_spot=true)\n- [Modal](https:\u002F\u002Fwww.modal.com?utm_source=github&utm_medium=github&utm_campaign=axolotl)\n- [Novita](https:\u002F\u002Fnovita.ai\u002Fgpus-console?templateId=311)\n- [JarvisLabs.ai](https:\u002F\u002Fjarvislabs.ai\u002Ftemplates\u002Faxolotl)\n- [Latitude.sh](https:\u002F\u002Flatitude.sh\u002Fblueprint\u002F989e0e79-3bf6-41ea-a46b-1f246e309d5c)\n\n\u003C\u002Fdetails>\n\n### Your First Fine-tune\n\n```bash\n# Fetch axolotl examples\naxolotl fetch examples\n\n# Or, specify a custom path\naxolotl fetch examples --dest path\u002Fto\u002Ffolder\n\n# Train a model using LoRA\naxolotl train examples\u002Fllama-3\u002Flora-1b.yml\n```\n\nThat's it! 
Check out our [Getting Started Guide](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fgetting-started.html) for a more detailed walkthrough.\n\n\n## 📚 Documentation\n\n- [Installation Options](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Finstallation.html) - Detailed setup instructions for different environments\n- [Configuration Guide](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fconfig-reference.html) - Full configuration options and examples\n- [Dataset Loading](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fdataset_loading.html) - Loading datasets from various sources\n- [Dataset Guide](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fdataset-formats\u002F) - Supported formats and how to use them\n- [Multi-GPU Training](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmulti-gpu.html)\n- [Multi-Node Training](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmulti-node.html)\n- [Multipacking](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fmultipack.html)\n- [API Reference](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fapi\u002F) - Auto-generated code documentation\n- [FAQ](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Ffaq.html) - Frequently asked questions\n\n## AI Agent Support\n\nAxolotl ships with built-in documentation optimized for AI coding agents (Claude Code, Cursor, Copilot, etc.). 
These docs are bundled with the pip package — no repo clone needed.\n\n```bash\n# Show overview and available training methods\naxolotl agent-docs\n\n# Topic-specific references\naxolotl agent-docs sft                 # supervised fine-tuning\naxolotl agent-docs grpo                # GRPO online RL\naxolotl agent-docs preference_tuning   # DPO, KTO, ORPO, SimPO\naxolotl agent-docs reward_modelling    # outcome and process reward models\naxolotl agent-docs pretraining         # continual pretraining\naxolotl agent-docs --list              # list all topics\n\n# Dump config schema for programmatic use\naxolotl config-schema\naxolotl config-schema --field adapter\n```\n\nIf you're working with the source repo, agent docs are also available at `docs\u002Fagents\u002F` and the project overview is in `AGENTS.md`.\n\n## 🤝 Getting Help\n\n- Join our [Discord community](https:\u002F\u002Fdiscord.gg\u002FHhrNrHJPRb) for support\n- Check out our [Examples](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Ftree\u002Fmain\u002Fexamples\u002F) directory\n- Read our [Debugging Guide](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Fdebugging.html)\n- Need dedicated support? Please contact [✉️wing@axolotl.ai](mailto:wing@axolotl.ai) for options\n\n## 🌟 Contributing\n\nContributions are welcome! Please see our [Contributing Guide](https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl\u002Fblob\u002Fmain\u002F.github\u002FCONTRIBUTING.md) for details.\n\n## 📈 Telemetry\n\nAxolotl has opt-out telemetry that helps us understand how the project is being used\nand prioritize improvements. We collect basic system information, model types, and\nerror rates—never personal data or file paths. Telemetry is enabled by default. To\ndisable it, set AXOLOTL_DO_NOT_TRACK=1. For more details, see our [telemetry documentation](https:\u002F\u002Fdocs.axolotl.ai\u002Fdocs\u002Ftelemetry.html).\n\n## ❤️ Sponsors\n\nInterested in sponsoring? 
Contact us at [wing@axolotl.ai](mailto:wing@axolotl.ai)\n\n## 📝 Citing Axolotl\n\nIf you use Axolotl in your research or projects, please cite it as follows:\n\n```bibtex\n@software{axolotl,\n  title = {Axolotl: Open Source LLM Post-Training},\n  author = {{Axolotl maintainers and contributors}},\n  url = {https:\u002F\u002Fgithub.com\u002Faxolotl-ai-cloud\u002Faxolotl},\n  license = {Apache-2.0},\n  year = {2023}\n}\n```\n\n## 📜 License\n\nThis project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.\n","Axolotl is a free and open-source fine-tuning framework for large language models (LLMs). It supports fine-tuning a wide range of models, including the recent Mistral Medium 3.5 and Gemma 4, and adds SonicMoE fused LoRA support to improve performance. The project is written in Python, is released under the Apache License 2.0, and has good community support and ongoing updates. Axolotl is suited to researchers and developers who need to customize existing LLMs for specific applications, such as building more accurate chatbots or domain-specific text generation tools.",2,"2026-06-11 03:35:12","high_star"]