[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-445":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":25,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":47,"readmeContent":48,"aiSummary":49,"trendingCount":16,"starSnapshotCount":16,"syncStatus":50,"lastSyncTime":51,"discoverSource":52},445,"LlamaFactory","hiyouga\u002FLlamaFactory","hiyouga","Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)","https:\u002F\u002Fllamafactory.readthedocs.io",null,"Python",72087,8820,337,964,0,33,208,935,158,45,"Apache License 2.0",false,"main",true,[27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46],"agent","ai","deepseek","fine-tuning","gemma","gpt","instruction-tuning","large-language-models","llama","llama3","llm","lora","moe","nlp","peft","qlora","quantization","qwen","rlhf","transformers","2026-06-12 02:00:13","![# LLaMA Factory](assets\u002Flogo.png)\n\n[![GitHub Repo stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fhiyouga\u002FLLaMA-Factory?style=social)](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Fstargazers)\n[![GitHub last commit](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flast-commit\u002Fhiyouga\u002FLLaMA-Factory)](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Fcommits\u002Fmain)\n[![GitHub contributors](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fcontributors\u002Fhiyouga\u002FLLaMA-Factory?color=orange)](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Fgraphs\u002Fcontributors)\n[![GitHub 
workflow](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Factions\u002Fworkflows\u002Ftests.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Factions\u002Fworkflows\u002Ftests.yml)\n[![PyPI](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fllamafactory)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fllamafactory\u002F)\n[![Citation](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcitation-1000+-green)](https:\u002F\u002Fscholar.google.com\u002Fscholar?cites=12620864006390196564)\n[![Docker Pulls](https:\u002F\u002Fimg.shields.io\u002Fdocker\u002Fpulls\u002Fhiyouga\u002Fllamafactory)](https:\u002F\u002Fhub.docker.com\u002Fr\u002Fhiyouga\u002Fllamafactory\u002Ftags)\n\n[![Twitter](https:\u002F\u002Fimg.shields.io\u002Ftwitter\u002Ffollow\u002Fllamafactory_ai)](https:\u002F\u002Ftwitter.com\u002Fllamafactory_ai)\n[![Discord](assets\u002Fthirdparty\u002Fdiscord.svg)](https:\u002F\u002Fdiscord.gg\u002FrKfvV9r9FK)\n[![WeChat](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWeChat-User%20Group-blue?logo=wechat)](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002Fllamafactory-community)\n[![Blog](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FHugo-Official%20Blog-blue?logo=hugo)](https:\u002F\u002Fblog.llamafactory.net\u002Fen\u002F)\n\n[![Open in Colab](assets\u002Fthirdparty\u002Fcolab.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing)\n[![Open in DSW](assets\u002Fthirdparty\u002Fdsw.svg)](https:\u002F\u002Fgallery.pai-ml.com\u002F#\u002Fpreview\u002FdeepLearning\u002Fnlp\u002Fllama_factory)\n[![Open in Lab4ai](assets\u002Fthirdparty\u002Flab4ai.svg)](https:\u002F\u002Fwww.lab4ai.cn\u002Fcourse\u002Fdetail?id=7c13e60f6137474eb40f6fd3983c0f46&utm_source=LLaMA-Factory)\n[![Open in Online](assets\u002Fthirdparty\u002Fonline.svg)](https:\u002F\u002Fwww.llamafactory.com.cn\u002F?utm_source=LLaMA-Factory)\n[![Open in 
Spaces](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗-Open%20in%20Spaces-blue)](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fhiyouga\u002FLLaMA-Board)\n[![Open in Studios](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FModelScope-Open%20in%20Studios-blue)](https:\u002F\u002Fmodelscope.cn\u002Fstudios\u002Fhiyouga\u002FLLaMA-Board)\n[![Open in Novita](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FNovita-Deploy%20Template-blue)](https:\u002F\u002Fnovita.ai\u002Ftemplates-library\u002F105981?sharer=88115474-394e-4bda-968e-b88e123d0c47)\n\n### Used by [Amazon](https:\u002F\u002Faws.amazon.com\u002Fcn\u002Fblogs\u002Fmachine-learning\u002Fhow-apoidea-group-enhances-visual-information-extraction-from-banking-documents-with-multimodal-models-using-llama-factory-on-amazon-sagemaker-hyperpod\u002F), [NVIDIA](https:\u002F\u002Fdeveloper.nvidia.com\u002Frtx\u002Fai-toolkit), [Aliyun](https:\u002F\u002Fhelp.aliyun.com\u002Fzh\u002Fpai\u002Fuse-cases\u002Ffine-tune-a-llama-3-model-with-llama-factory), etc.\n\n\u003Cdiv align=\"center\" markdown=\"1\">\n\n### Supporters ❤️\n\n| \u003Cdiv style=\"text-align: center;\">\u003Ca href=\"https:\u002F\u002Fwarp.dev\u002Fllama-factory\">\u003Cimg alt=\"Warp sponsorship\" width=\"400\" src=\"assets\u002Fsponsors\u002Fwarp.jpg\">\u003C\u002Fa>\u003Cbr>\u003Ca href=\"https:\u002F\u002Fwarp.dev\u002Fllama-factory\" style=\"font-size:larger;\">Warp, the agentic terminal for developers\u003C\u002Fa>\u003Cbr>\u003Ca href=\"https:\u002F\u002Fwarp.dev\u002Fllama-factory\">Available for MacOS, Linux, & Windows\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Fserpapi.com\">\u003Cimg alt=\"SerpAPI sponsorship\" width=\"250\" src=\"assets\u002Fsponsors\u002Fserpapi.svg\"> \u003C\u002Fa> |\n| ---- | ---- |\n\n----\n\n### Easily fine-tune 100+ large language models with zero-code [CLI](#quickstart) and [Web UI](#fine-tuning-with-llama-board-gui-powered-by-gradio)\n\n![GitHub 
Trend](https:\u002F\u002Ftrendshift.io\u002Fapi\u002Fbadge\u002Frepositories\u002F4535)\n\n\u003C\u002Fdiv>\n\n👋 Join our [WeChat](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002Fllamafactory-community\u002Fblob\u002Fmain\u002Fwechat\u002Fmain.jpg), [NPU](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002Fllamafactory-community\u002Fblob\u002Fmain\u002Fwechat\u002Fnpu.jpg), [Lab4AI](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002Fllamafactory-community\u002Fblob\u002Fmain\u002Fwechat\u002Flab4ai.jpg), [LLaMA Factory Online](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002Fllamafactory-community\u002Fblob\u002Fmain\u002Fwechat\u002Fonline.jpg) user groups.\n\n\\[ English | [中文](README_zh.md) \\]\n\n**Fine-tuning a large language model can be as easy as...**\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F3991a3a8-4276-4d30-9cab-4cb0c4b9b99e\n\nStart local training:\n- Please refer to [usage](#getting-started)\n\nStart cloud training:\n- **Colab (free)**: https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing\n- **PAI-DSW (free trial)**: https:\u002F\u002Fgallery.pai-ml.com\u002F#\u002Fpreview\u002FdeepLearning\u002Fnlp\u002Fllama_factory\n- **LLaMA Factory Online**: https:\u002F\u002Fwww.llamafactory.com.cn\u002F?utm_source=LLaMA-Factory\n- **Alaya NeW (cloud GPU deal)**: https:\u002F\u002Fdocs.alayanew.com\u002Fdocs\u002Fdocuments\u002FuseGuide\u002FLLaMAFactory\u002Fmutiple\u002F?utm_source=LLaMA-Factory\n\nRead technical notes:\n- **Documentation (WIP)**: https:\u002F\u002Fllamafactory.readthedocs.io\u002Fen\u002Flatest\u002F\n- **Documentation (AMD GPU)**: https:\u002F\u002Frocm.docs.amd.com\u002Fprojects\u002Fai-developer-hub\u002Fen\u002Flatest\u002Fnotebooks\u002Ffine_tune\u002Fllama_factory_llama3.html\n- **Official Blog**: https:\u002F\u002Fblog.llamafactory.net\u002Fen\u002F\n- **Official Course**: 
https:\u002F\u002Fwww.lab4ai.cn\u002Fcourse\u002Fdetail?id=7c13e60f6137474eb40f6fd3983c0f46&utm_source=LLaMA-Factory\n\n> [!NOTE]\n> Except for the above links, all other websites are unauthorized third-party websites. Please carefully use them.\n\n## Table of Contents\n\n- [Features](#features)\n- [Blogs](#blogs)\n- [Changelog](#changelog)\n- [Supported Models](#supported-models)\n- [Supported Training Approaches](#supported-training-approaches)\n- [Provided Datasets](#provided-datasets)\n- [Requirement](#requirement)\n- [Getting Started](#getting-started)\n  - [Installation](#installation)\n  - [Data Preparation](#data-preparation)\n  - [Quickstart](#quickstart)\n  - [Fine-Tuning with LLaMA Board GUI](#fine-tuning-with-llama-board-gui-powered-by-gradio)\n  - [LLaMA Factory Online](#llama-factory-online)\n  - [Build Docker](#build-docker)\n  - [Deploy with OpenAI-style API and vLLM](#deploy-with-openai-style-api-and-vllm)\n  - [Download from ModelScope Hub](#download-from-modelscope-hub)\n  - [Download from Modelers Hub](#download-from-modelers-hub)\n  - [Use W&B Logger](#use-wb-logger)\n  - [Use SwanLab Logger](#use-swanlab-logger)\n- [Projects using LLaMA Factory](#projects-using-llama-factory)\n- [License](#license)\n- [Citation](#citation)\n- [Acknowledgement](#acknowledgement)\n\n## Features\n\n- **Various models**: LLaMA, LLaVA, Mistral, Mixtral-MoE, Qwen3, Qwen3-VL, DeepSeek, Gemma, GLM, Phi, etc.\n- **Integrated methods**: (Continuous) pre-training, (multimodal) supervised fine-tuning, reward modeling, PPO, DPO, KTO, ORPO, etc.\n- **Scalable resources**: 16-bit full-tuning, freeze-tuning, LoRA and 2\u002F3\u002F4\u002F5\u002F6\u002F8-bit QLoRA via AQLM\u002FAWQ\u002FGPTQ\u002FLLM.int8\u002FHQQ\u002FEETQ.\n- **Advanced algorithms**: [GaLore](https:\u002F\u002Fgithub.com\u002Fjiaweizzhao\u002FGaLore), [BAdam](https:\u002F\u002Fgithub.com\u002FLedzy\u002FBAdam), [APOLLO](https:\u002F\u002Fgithub.com\u002Fzhuhanqing\u002FAPOLLO), 
[Adam-mini](https:\u002F\u002Fgithub.com\u002Fzyushun\u002FAdam-mini), [Muon](https:\u002F\u002Fgithub.com\u002FKellerJordan\u002FMuon), [OFT](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Fpeft\u002Ftree\u002Fmain\u002Fsrc\u002Fpeft\u002Ftuners\u002Foft), DoRA, LongLoRA, LLaMA Pro, Mixture-of-Depths, LoRA+, LoftQ and PiSSA.\n- **Practical tricks**: [FlashAttention-2](https:\u002F\u002Fgithub.com\u002FDao-AILab\u002Fflash-attention), [Unsloth](https:\u002F\u002Fgithub.com\u002Funslothai\u002Funsloth), [Liger Kernel](https:\u002F\u002Fgithub.com\u002Flinkedin\u002FLiger-Kernel), [KTransformers](https:\u002F\u002Fgithub.com\u002Fkvcache-ai\u002Fktransformers\u002F), RoPE scaling, NEFTune and rsLoRA.\n- **Wide tasks**: Multi-turn dialogue, tool using, image understanding, visual grounding, video recognition, audio understanding, etc.\n- **Experiment monitors**: LlamaBoard, TensorBoard, Wandb, MLflow, [SwanLab](https:\u002F\u002Fgithub.com\u002FSwanHubX\u002FSwanLab), etc.\n- **Faster inference**: OpenAI-style API, Gradio UI and CLI with [vLLM worker](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm) or [SGLang worker](https:\u002F\u002Fgithub.com\u002Fsgl-project\u002Fsglang).\n\n### Day-N Support for Fine-Tuning Cutting-Edge Models\n\n| Support Date | Model Name                                                           |\n| ------------ | -------------------------------------------------------------------- |\n| Day 0        | Qwen3 \u002F Qwen2.5-VL \u002F Gemma 3 \u002F GLM-4.1V \u002F InternLM 3 \u002F MiniCPM-o-2.6 |\n| Day 1        | Llama 3 \u002F GLM-4 \u002F Mistral Small \u002F PaliGemma2 \u002F Llama 4               |\n\n## Blogs\n\n> [!TIP]\n> Now we have a dedicated blog for LLaMA Factory!\n>\n> Website: https:\u002F\u002Fblog.llamafactory.net\u002Fen\u002F\n\n- 💡 [KTransformers Fine-Tuning × LLaMA Factory: Fine-tuning 1000 Billion models with 2 4090-GPU + CPU](https:\u002F\u002Fblog.llamafactory.net\u002Fen\u002Fposts\u002Fktransformers\u002F) 
(English)\n- 💡 [Easy Dataset × LLaMA Factory: Enabling LLMs to Efficiently Learn Domain Knowledge](https:\u002F\u002Fbuaa-act.feishu.cn\u002Fwiki\u002FGVzlwYcRFiR8OLkHbL6cQpYin7g) (English)\n- [Fine-tune a mental health LLM using LLaMA-Factory](https:\u002F\u002Fwww.lab4ai.cn\u002Fproject\u002Fdetail?id=25cce32ec131497b9e06a93336a0817f&type=project&utm_source=LLaMA-Factory) (Chinese)\n- [Fine-tune GPT-OSS for Role-Playing using LLaMA-Factory](https:\u002F\u002Fdocs.llamafactory.com.cn\u002Fdocs\u002Fdocuments\u002Fbest-practice\u002Fgptroleplay\u002F?utm_source=LLaMA-Factory) (Chinese)\n- [A One-Stop Code-Free Model Reinforcement Learning and Deployment Platform based on LLaMA-Factory and EasyR1](https:\u002F\u002Faws.amazon.com\u002Fcn\u002Fblogs\u002Fchina\u002Fbuilding-llm-model-hub-based-on-llamafactory-and-easyr1\u002F) (Chinese)\n- [How Apoidea Group enhances visual information extraction from banking documents with multimodal models using LLaMA-Factory on Amazon SageMaker HyperPod](https:\u002F\u002Faws.amazon.com\u002Fcn\u002Fblogs\u002Fmachine-learning\u002Fhow-apoidea-group-enhances-visual-information-extraction-from-banking-documents-with-multimodal-models-using-llama-factory-on-amazon-sagemaker-hyperpod\u002F) (English)\n\n\u003Cdetails>\u003Csummary>All Blogs\u003C\u002Fsummary>\n\n- [Fine-tune Llama3.1-70B for Medical Diagnosis using LLaMA-Factory](https:\u002F\u002Fdocs.alayanew.com\u002Fdocs\u002Fdocuments\u002FbestPractice\u002FbigModel\u002Fllama70B\u002F?utm_source=LLaMA-Factory) (Chinese)\n- [Fine-tune Qwen2.5-VL for Autonomous Driving using LLaMA-Factory](https:\u002F\u002Fdocs.alayanew.com\u002Fdocs\u002Fdocuments\u002FuseGuide\u002FLLaMAFactory\u002Fmutiple\u002F?utm_source=LLaMA-Factory) (Chinese)\n- [LLaMA Factory: Fine-tuning the DeepSeek-R1-Distill-Qwen-7B Model for News Classifier](https:\u002F\u002Fgallery.pai-ml.com\u002F#\u002Fpreview\u002FdeepLearning\u002Fnlp\u002Fllama_factory_deepseek_r1_distill_7b) (Chinese)\n- [A One-Stop 
Code-Free Model Fine-Tuning \\& Deployment Platform based on SageMaker and LLaMA-Factory](https:\u002F\u002Faws.amazon.com\u002Fcn\u002Fblogs\u002Fchina\u002Fa-one-stop-code-free-model-fine-tuning-deployment-platform-based-on-sagemaker-and-llama-factory\u002F) (Chinese)\n- [LLaMA Factory Multi-Modal Fine-Tuning Practice: Fine-Tuning Qwen2-VL for Personal Tourist Guide](https:\u002F\u002Fgallery.pai-ml.com\u002F#\u002Fpreview\u002FdeepLearning\u002Fnlp\u002Fllama_factory_qwen2vl) (Chinese)\n- [LLaMA Factory: Fine-tuning Llama3 for Role-Playing](https:\u002F\u002Fgallery.pai-ml.com\u002F#\u002Fpreview\u002FdeepLearning\u002Fnlp\u002Fllama_factory) (Chinese)\n\n\u003C\u002Fdetails>\n\n## Changelog\n\n[25\u002F10\u002F26] We support Megatron-core training backend with [**mcore_adapter**](https:\u002F\u002Fgithub.com\u002Falibaba\u002FROLL\u002Ftree\u002Fmain\u002Fmcore_adapter). See [PR #9237](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Fpull\u002F9237) to get started.\n\n[25\u002F08\u002F22] We supported **[OFT](https:\u002F\u002Farxiv.org\u002Fabs\u002F2306.07280)** and **[OFTv2](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.19847)**. See [examples](examples\u002FREADME.md) for usage.\n\n[25\u002F08\u002F20] We supported fine-tuning the **[Intern-S1-mini](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002FIntern-S1-mini)** models. See [PR #8976](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Fpull\u002F8976) to get started.\n\n[25\u002F08\u002F06] We supported fine-tuning the **[GPT-OSS](https:\u002F\u002Fgithub.com\u002Fopenai\u002Fgpt-oss)** models. 
See [PR #8826](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Fpull\u002F8826) to get started.\n\n\u003Cdetails>\u003Csummary>Full Changelog\u003C\u002Fsummary>\n\n[25\u002F07\u002F02] We supported fine-tuning the **[GLM-4.1V-9B-Thinking](https:\u002F\u002Fgithub.com\u002FTHUDM\u002FGLM-4.1V-Thinking)** model.\n\n[25\u002F04\u002F28] We supported fine-tuning the **[Qwen3](https:\u002F\u002Fqwenlm.github.io\u002Fblog\u002Fqwen3\u002F)** model family.\n\n[25\u002F04\u002F21] We supported the **[Muon](https:\u002F\u002Fgithub.com\u002FKellerJordan\u002FMuon)** optimizer. See [examples](examples\u002FREADME.md) for usage. Thank [@tianshijing](https:\u002F\u002Fgithub.com\u002Ftianshijing)'s PR.\n\n[25\u002F04\u002F16] We supported fine-tuning the **[InternVL3](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab\u002FInternVL3-8B)** model. See [PR #7258](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Fpull\u002F7258) to get started.\n\n[25\u002F04\u002F14] We supported fine-tuning the **[GLM-Z1](https:\u002F\u002Fhuggingface.co\u002FTHUDM\u002FGLM-Z1-9B-0414)** and **[Kimi-VL](https:\u002F\u002Fhuggingface.co\u002Fmoonshotai\u002FKimi-VL-A3B-Instruct)** models.\n\n[25\u002F04\u002F06] We supported fine-tuning the **[Llama 4](https:\u002F\u002Fai.meta.com\u002Fblog\u002Fllama-4-multimodal-intelligence\u002F)** model. See [PR #7611](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Fpull\u002F7611) to get started.\n\n[25\u002F03\u002F31] We supported fine-tuning the **[Qwen2.5 Omni](https:\u002F\u002Fqwenlm.github.io\u002Fblog\u002Fqwen2.5-omni\u002F)** model. See [PR #7537](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Fpull\u002F7537) to get started.\n\n[25\u002F03\u002F15] We supported **[SGLang](https:\u002F\u002Fgithub.com\u002Fsgl-project\u002Fsglang)** as inference backend. 
Try `infer_backend: sglang` to accelerate inference.\n\n[25\u002F03\u002F12] We supported fine-tuning the **[Gemma 3](https:\u002F\u002Fhuggingface.co\u002Fblog\u002Fgemma3)** model.\n\n[25\u002F02\u002F24] Announcing **[EasyR1](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FEasyR1)**, an efficient, scalable and multi-modality RL training framework for efficient GRPO training.\n\n[25\u002F02\u002F11] We supported saving the **[Ollama](https:\u002F\u002Fgithub.com\u002Follama\u002Follama)** modelfile when exporting the model checkpoints. See [examples](examples\u002FREADME.md) for usage.\n\n[25\u002F02\u002F05] We supported fine-tuning the **[Qwen2-Audio](Qwen\u002FQwen2-Audio-7B-Instruct)** and **[MiniCPM-o-2.6](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-o-2_6)** on audio understanding tasks.\n\n[25\u002F01\u002F31] We supported fine-tuning the **[DeepSeek-R1](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai\u002FDeepSeek-R1)** and **[Qwen2.5-VL](https:\u002F\u002Fhuggingface.co\u002FQwen\u002FQwen2.5-VL-7B-Instruct)** models.\n\n[25\u002F01\u002F15] We supported **[APOLLO](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.05270)** optimizer. See [examples](examples\u002FREADME.md) for usage.\n\n[25\u002F01\u002F14] We supported fine-tuning the **[MiniCPM-o-2.6](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-o-2_6)** and **[MiniCPM-V-2.6](https:\u002F\u002Fhuggingface.co\u002Fopenbmb\u002FMiniCPM-V-2_6)** models. Thank [@BUAADreamer](https:\u002F\u002Fgithub.com\u002FBUAADreamer)'s PR.\n\n[25\u002F01\u002F14] We supported fine-tuning the **[InternLM 3](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Finternlm\u002F)** models. 
Thank [@hhaAndroid](https:\u002F\u002Fgithub.com\u002FhhaAndroid)'s PR.\n\n[25\u002F01\u002F10] We supported fine-tuning the **[Phi-4](https:\u002F\u002Fhuggingface.co\u002Fmicrosoft\u002Fphi-4)** model.\n\n[24\u002F12\u002F21] We supported using **[SwanLab](https:\u002F\u002Fgithub.com\u002FSwanHubX\u002FSwanLab)** for experiment tracking and visualization. See [this section](#use-swanlab-logger) for details.\n\n[24\u002F11\u002F27] We supported fine-tuning the **[Skywork-o1](https:\u002F\u002Fhuggingface.co\u002FSkywork\u002FSkywork-o1-Open-Llama-3.1-8B)** model and the **[OpenO1](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FO1-OPEN\u002FOpenO1-SFT)** dataset.\n\n[24\u002F10\u002F09] We supported downloading pre-trained models and datasets from the **[Modelers Hub](https:\u002F\u002Fmodelers.cn\u002Fmodels)**. See [this tutorial](#download-from-modelers-hub) for usage.\n\n[24\u002F09\u002F19] We supported fine-tuning the **[Qwen2.5](https:\u002F\u002Fqwenlm.github.io\u002Fblog\u002Fqwen2.5\u002F)** models.\n\n[24\u002F08\u002F30] We supported fine-tuning the **[Qwen2-VL](https:\u002F\u002Fqwenlm.github.io\u002Fblog\u002Fqwen2-vl\u002F)** models. Thank [@simonJJJ](https:\u002F\u002Fgithub.com\u002FsimonJJJ)'s PR.\n\n[24\u002F08\u002F27] We supported **[Liger Kernel](https:\u002F\u002Fgithub.com\u002Flinkedin\u002FLiger-Kernel)**. Try `enable_liger_kernel: true` for efficient training.\n\n[24\u002F08\u002F09] We supported **[Adam-mini](https:\u002F\u002Fgithub.com\u002Fzyushun\u002FAdam-mini)** optimizer. See [examples](examples\u002FREADME.md) for usage. Thank [@relic-yuexi](https:\u002F\u002Fgithub.com\u002Frelic-yuexi)'s PR.\n\n[24\u002F07\u002F04] We supported [contamination-free packed training](https:\u002F\u002Fgithub.com\u002FMeetKai\u002Ffunctionary\u002Ftree\u002Fmain\u002Ffunctionary\u002Ftrain\u002Fpacking). Use `neat_packing: true` to activate it. 
Thank [@chuan298](https:\u002F\u002Fgithub.com\u002Fchuan298)'s PR.\n\n[24\u002F06\u002F16] We supported **[PiSSA](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.02948)** algorithm. See [examples](examples\u002FREADME.md) for usage.\n\n[24\u002F06\u002F07] We supported fine-tuning the **[Qwen2](https:\u002F\u002Fqwenlm.github.io\u002Fblog\u002Fqwen2\u002F)** and **[GLM-4](https:\u002F\u002Fgithub.com\u002FTHUDM\u002FGLM-4)** models.\n\n[24\u002F05\u002F26] We supported **[SimPO](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.14734)** algorithm for preference learning. See [examples](examples\u002FREADME.md) for usage.\n\n[24\u002F05\u002F20] We supported fine-tuning the **PaliGemma** series models. Note that the PaliGemma models are pre-trained models, you need to fine-tune them with `paligemma` template for chat completion.\n\n[24\u002F05\u002F18] We supported **[KTO](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.01306)** algorithm for preference learning. See [examples](examples\u002FREADME.md) for usage.\n\n[24\u002F05\u002F14] We supported training and inference on the Ascend NPU devices. Check [installation](#installation) section for details.\n\n[24\u002F04\u002F26] We supported fine-tuning the **LLaVA-1.5** multimodal LLMs. See [examples](examples\u002FREADME.md) for usage.\n\n[24\u002F04\u002F22] We provided a **[Colab notebook](https:\u002F\u002Fcolab.research.google.com\u002Fdrive\u002F1eRTPn37ltBbYsISy9Aw2NuI2Aq5CQrD9?usp=sharing)** for fine-tuning the Llama-3 model on a free T4 GPU. 
Two Llama-3-derived models fine-tuned using LLaMA Factory are available at Hugging Face, check [Llama3-8B-Chinese-Chat](https:\u002F\u002Fhuggingface.co\u002Fshenzhi-wang\u002FLlama3-8B-Chinese-Chat) and [Llama3-Chinese](https:\u002F\u002Fhuggingface.co\u002Fzhichen\u002FLlama3-Chinese) for details.\n\n[24\u002F04\u002F21] We supported **[Mixture-of-Depths](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.02258)** according to [AstraMindAI's implementation](https:\u002F\u002Fgithub.com\u002Fastramind-ai\u002FMixture-of-depths). See [examples](examples\u002FREADME.md) for usage.\n\n[24\u002F04\u002F16] We supported **[BAdam](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.02827)** optimizer. See [examples](examples\u002FREADME.md) for usage.\n\n[24\u002F04\u002F16] We supported **[unsloth](https:\u002F\u002Fgithub.com\u002Funslothai\u002Funsloth)**'s long-sequence training (Llama-2-7B-56k within 24GB). It achieves **117%** speed and **50%** memory compared with FlashAttention-2, more benchmarks can be found in [this page](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Fwiki\u002FPerformance-comparison).\n\n[24\u002F03\u002F31] We supported **[ORPO](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.07691)**. See [examples](examples\u002FREADME.md) for usage.\n\n[24\u002F03\u002F21] Our paper \"[LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.13372)\" is available at arXiv!\n\n[24\u002F03\u002F20] We supported **FSDP+QLoRA** that fine-tunes a 70B model on 2x24GB GPUs. See [examples](examples\u002FREADME.md) for usage.\n\n[24\u002F03\u002F13] We supported **[LoRA+](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.12354)**. See [examples](examples\u002FREADME.md) for usage.\n\n[24\u002F03\u002F07] We supported **[GaLore](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.03507)** optimizer. 
See [examples](examples\u002FREADME.md) for usage.\n\n[24\u002F03\u002F07] We integrated **[vLLM](https:\u002F\u002Fgithub.com\u002Fvllm-project\u002Fvllm)** for faster and concurrent inference. Try `infer_backend: vllm` to enjoy **270%** inference speed.\n\n[24\u002F02\u002F28] We supported weight-decomposed LoRA (**[DoRA](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.09353)**). Try `use_dora: true` to activate DoRA training.\n\n[24\u002F02\u002F15] We supported **block expansion** proposed by [LLaMA Pro](https:\u002F\u002Fgithub.com\u002FTencentARC\u002FLLaMA-Pro). See [examples](examples\u002FREADME.md) for usage.\n\n[24\u002F02\u002F05] Qwen1.5 (Qwen2 beta version) series models are supported in LLaMA-Factory. Check this [blog post](https:\u002F\u002Fqwenlm.github.io\u002Fblog\u002Fqwen1.5\u002F) for details.\n\n[24\u002F01\u002F18] We supported **agent tuning** for most models, equipping model with tool using abilities by fine-tuning with `dataset: glaive_toolcall_en`.\n\n[23\u002F12\u002F23] We supported **[unsloth](https:\u002F\u002Fgithub.com\u002Funslothai\u002Funsloth)**'s implementation to boost LoRA tuning for the LLaMA, Mistral and Yi models. Try `use_unsloth: true` argument to activate unsloth patch. It achieves **170%** speed in our benchmark, check [this page](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Fwiki\u002FPerformance-comparison) for details.\n\n[23\u002F12\u002F12] We supported fine-tuning the latest MoE model **[Mixtral 8x7B](https:\u002F\u002Fhuggingface.co\u002Fmistralai\u002FMixtral-8x7B-v0.1)** in our framework. See hardware requirement [here](#hardware-requirement).\n\n[23\u002F12\u002F01] We supported downloading pre-trained models and datasets from the **[ModelScope Hub](https:\u002F\u002Fmodelscope.cn\u002Fmodels)**. See [this tutorial](#download-from-modelscope-hub) for usage.\n\n[23\u002F10\u002F21] We supported **[NEFTune](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.05914)** trick for fine-tuning. 
Try `neftune_noise_alpha: 5` argument to activate NEFTune.\n\n[23\u002F09\u002F27] We supported **$S^2$-Attn** proposed by [LongLoRA](https:\u002F\u002Fgithub.com\u002Fdvlab-research\u002FLongLoRA) for the LLaMA models. Try `shift_attn: true` argument to enable shift short attention.\n\n[23\u002F09\u002F23] We integrated MMLU, C-Eval and CMMLU benchmarks in this repo. See [examples](examples\u002FREADME.md) for usage.\n\n[23\u002F09\u002F10] We supported **[FlashAttention-2](https:\u002F\u002Fgithub.com\u002FDao-AILab\u002Fflash-attention)**. Try `flash_attn: fa2` argument to enable FlashAttention-2 if you are using RTX4090, A100 or H100 GPUs.\n\n[23\u002F08\u002F12] We supported **RoPE scaling** to extend the context length of the LLaMA models. Try `rope_scaling: linear` argument in training and `rope_scaling: dynamic` argument at inference to extrapolate the position embeddings.\n\n[23\u002F08\u002F11] We supported **[DPO training](https:\u002F\u002Farxiv.org\u002Fabs\u002F2305.18290)** for instruction-tuned models. See [examples](examples\u002FREADME.md) for usage.\n\n[23\u002F07\u002F31] We supported **dataset streaming**. Try `streaming: true` and `max_steps: 10000` arguments to load your dataset in streaming mode.\n\n[23\u002F07\u002F29] We released two instruction-tuned 13B models at Hugging Face. See these Hugging Face Repos ([LLaMA-2](https:\u002F\u002Fhuggingface.co\u002Fhiyouga\u002FLlama-2-Chinese-13b-chat) \u002F [Baichuan](https:\u002F\u002Fhuggingface.co\u002Fhiyouga\u002FBaichuan-13B-sft)) for details.\n\n[23\u002F07\u002F18] We developed an **all-in-one Web UI** for training, evaluation and inference. Try `train_web.py` to fine-tune models in your Web browser. 
Thanks to [@KanadeSiina](https:\u002F\u002Fgithub.com\u002FKanadeSiina) and [@codemayq](https:\u002F\u002Fgithub.com\u002Fcodemayq) for their development efforts.\n\n[23\u002F07\u002F09] We released **[FastEdit](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FFastEdit)** ⚡🩹, an easy-to-use package for efficiently editing the factual knowledge of large language models. Please follow [FastEdit](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FFastEdit) if you are interested.\n\n[23\u002F06\u002F29] We provided a **reproducible example** of training a chat model using instruction-following datasets; see [Baichuan-7B-sft](https:\u002F\u002Fhuggingface.co\u002Fhiyouga\u002FBaichuan-7B-sft) for details.\n\n[23\u002F06\u002F22] We aligned the [demo API](src\u002Fapi_demo.py) with [OpenAI's](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fapi-reference\u002Fchat) format, so you can plug the fine-tuned model into **arbitrary ChatGPT-based applications**.\n\n[23\u002F06\u002F03] We supported quantized training and inference (aka **[QLoRA](https:\u002F\u002Fgithub.com\u002Fartidoro\u002Fqlora)**). 
See [examples](examples\u002FREADME.md) for usage.\n\n\u003C\u002Fdetails>\n\n> [!TIP]\n> If you cannot use the latest feature, please pull the latest code and install LLaMA-Factory again.\n\n## Supported Models\n\n| Model                                                             | Model size                       | Template             |\n| ----------------------------------------------------------------- | -------------------------------- | -------------------- |\n| [BLOOM\u002FBLOOMZ](https:\u002F\u002Fhuggingface.co\u002Fbigscience)                 | 560M\u002F1.1B\u002F1.7B\u002F3B\u002F7.1B\u002F176B      | -                    |\n| [DeepSeek (LLM\u002FCode\u002FMoE)](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai)     | 7B\u002F16B\u002F67B\u002F236B                  | deepseek             |\n| [DeepSeek 3-3.2](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai)              | 236B\u002F671B                        | deepseek3            |\n| [DeepSeek R1 (Distill)](https:\u002F\u002Fhuggingface.co\u002Fdeepseek-ai)       | 1.5B\u002F7B\u002F8B\u002F14B\u002F32B\u002F70B\u002F671B      | deepseekr1           |\n| [ERNIE-4.5](https:\u002F\u002Fhuggingface.co\u002Fbaidu)                         | 0.3B\u002F21B\u002F300B                    | ernie_nothink        |\n| [Falcon\u002FFalcon H1](https:\u002F\u002Fhuggingface.co\u002Ftiiuae)                 | 0.5B\u002F1.5B\u002F3B\u002F7B\u002F11B\u002F34B\u002F40B\u002F180B | falcon\u002Ffalcon_h1     |\n| [Gemma\u002FGemma 2\u002FCodeGemma](https:\u002F\u002Fhuggingface.co\u002Fgoogle)          | 2B\u002F7B\u002F9B\u002F27B                     | gemma\u002Fgemma2         |\n| [Gemma 3\u002FGemma 3n](https:\u002F\u002Fhuggingface.co\u002Fgoogle)                 | 270M\u002F1B\u002F4B\u002F6B\u002F8B\u002F12B\u002F27B         | gemma3\u002Fgemma3n       |\n| [GLM-4\u002FGLM-4-0414\u002FGLM-Z1](https:\u002F\u002Fhuggingface.co\u002Fzai-org)         | 9B\u002F32B                           | glm4\u002Fglmz1          
 |\n| [GLM-4.5\u002FGLM-4.5(6)V](https:\u002F\u002Fhuggingface.co\u002Fzai-org)             | 9B\u002F106B\u002F355B                     | glm4_moe\u002Fglm4_5v     |\n| [GPT-2](https:\u002F\u002Fhuggingface.co\u002Fopenai-community)                  | 0.1B\u002F0.4B\u002F0.8B\u002F1.5B              | -                    |\n| [GPT-OSS](https:\u002F\u002Fhuggingface.co\u002Fopenai)                          | 20B\u002F120B                         | gpt_oss              |\n| [Granite 3-4](https:\u002F\u002Fhuggingface.co\u002Fibm-granite)                 | 1B\u002F2B\u002F3B\u002F7B\u002F8B                   | granite3\u002Fgranite4    |\n| [Hunyuan\u002FHunyuan1.5 (MT)](https:\u002F\u002Fhuggingface.co\u002Ftencent\u002F)        | 0.5B\u002F1.8B\u002F4B\u002F7B\u002F13B              | hunyuan\u002Fhunyuan_small|\n| [InternLM 2-3](https:\u002F\u002Fhuggingface.co\u002Finternlm)                   | 7B\u002F8B\u002F20B                        | intern2              |\n| [InternVL 2.5-3.5](https:\u002F\u002Fhuggingface.co\u002FOpenGVLab)              | 1B\u002F2B\u002F4B\u002F8B\u002F14B\u002F30B\u002F38B\u002F78B\u002F241B | intern_vl            |\n| [Intern-S1-mini](https:\u002F\u002Fhuggingface.co\u002Finternlm\u002F)                | 8B                               | intern_s1            |\n| [Kimi-VL](https:\u002F\u002Fhuggingface.co\u002Fmoonshotai)                      | 16B                              | kimi_vl              |\n| [Ling 2.0 (mini\u002Fflash)](https:\u002F\u002Fhuggingface.co\u002FinclusionAI)       | 16B\u002F100B                         | bailing_v2           |\n| [LFM 2.5 (VL)](https:\u002F\u002Fhuggingface.co\u002FLiquidAI)                   | 1.2B\u002F1.6B                        | lfm2\u002Flfm2_vl         |\n| [Llama](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fllama)                | 7B\u002F13B\u002F33B\u002F65B                   | -                    |\n| [Llama 2](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama)            
          | 7B\u002F13B\u002F70B                       | llama2               |\n| [Llama 3-3.3](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama)                  | 1B\u002F3B\u002F8B\u002F70B                     | llama3               |\n| [Llama 4](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama)                      | 109B\u002F402B                        | llama4               |\n| [Llama 3.2 Vision](https:\u002F\u002Fhuggingface.co\u002Fmeta-llama)             | 11B\u002F90B                          | mllama               |\n| [LLaVA-1.5](https:\u002F\u002Fhuggingface.co\u002Fllava-hf)                      | 7B\u002F13B                           | llava                |\n| [LLaVA-NeXT](https:\u002F\u002Fhuggingface.co\u002Fllava-hf)                     | 7B\u002F8B\u002F13B\u002F34B\u002F72B\u002F110B           | llava_next           |\n| [LLaVA-NeXT-Video](https:\u002F\u002Fhuggingface.co\u002Fllava-hf)               | 7B\u002F34B                           | llava_next_video     |\n| [MiMo](https:\u002F\u002Fhuggingface.co\u002FXiaomiMiMo)                         | 7B\u002F309B                          | mimo\u002Fmimo_v2         |\n| [MiniCPM 4](https:\u002F\u002Fhuggingface.co\u002Fopenbmb)                       | 0.5B\u002F8B                          | cpm4                 |\n| [MiniCPM-o\u002FMiniCPM-V 4.5](https:\u002F\u002Fhuggingface.co\u002Fopenbmb)         | 8B\u002F9B                            | minicpm_o\u002Fminicpm_v  |\n| [MiniMax-M1\u002FMiniMax-M2](https:\u002F\u002Fhuggingface.co\u002FMiniMaxAI\u002Fmodels)  | 229B\u002F456B                        | minimax1\u002Fminimax2    |\n| [Ministral 3](https:\u002F\u002Fhuggingface.co\u002Fmistralai)                   | 3B\u002F8B\u002F14B                        | ministral3           |\n| [Mistral\u002FMixtral](https:\u002F\u002Fhuggingface.co\u002Fmistralai)               | 7B\u002F8x7B\u002F8x22B                    | mistral              |\n| 
[PaliGemma\u002FPaliGemma2](https:\u002F\u002Fhuggingface.co\u002Fgoogle)             | 3B\u002F10B\u002F28B                       | paligemma            |\n| [Phi-3\u002FPhi-3.5](https:\u002F\u002Fhuggingface.co\u002Fmicrosoft)                 | 4B\u002F14B                           | phi                  |\n| [Phi-3-small](https:\u002F\u002Fhuggingface.co\u002Fmicrosoft)                   | 7B                               | phi_small            |\n| [Phi-4-mini\u002FPhi-4](https:\u002F\u002Fhuggingface.co\u002Fmicrosoft)              | 3.8B\u002F14B                         | phi4_mini\u002Fphi4       |\n| [Pixtral](https:\u002F\u002Fhuggingface.co\u002Fmistralai)                       | 12B                              | pixtral              |\n| [Qwen2 (Code\u002FMath\u002FMoE\u002FQwQ)](https:\u002F\u002Fhuggingface.co\u002FQwen)          | 0.5B\u002F1.5B\u002F3B\u002F7B\u002F14B\u002F32B\u002F72B\u002F110B | qwen                 |\n| [Qwen3 (MoE\u002FInstruct\u002FThinking\u002FNext)](https:\u002F\u002Fhuggingface.co\u002FQwen) | 0.6B\u002F1.7B\u002F4B\u002F8B\u002F14B\u002F32B\u002F80B\u002F235B | qwen3\u002Fqwen3_nothink  |\n| [Qwen3.5](https:\u002F\u002Fhuggingface.co\u002FQwen)                            | 0.8B\u002F2B\u002F4B\u002F9B\u002F27B\u002F35B\u002F122B\u002F397B  | qwen3_5\u002Fqwen3_5_nothink              |\n| [Qwen3.6](https:\u002F\u002Fhuggingface.co\u002FQwen)                            | 35B                              | qwen3_6\u002Fqwen3_6_nothink              |\n| [Qwen2-Audio](https:\u002F\u002Fhuggingface.co\u002FQwen)                        | 7B                               | qwen2_audio          |\n| [Qwen2.5-Omni](https:\u002F\u002Fhuggingface.co\u002FQwen)                       | 3B\u002F7B                            | qwen2_omni           |\n| [Qwen3-Omni](https:\u002F\u002Fhuggingface.co\u002FQwen)                         | 30B                              | qwen3_omni           |\n| 
[Qwen2-VL\u002FQwen2.5-VL\u002FQVQ](https:\u002F\u002Fhuggingface.co\u002FQwen)            | 2B\u002F3B\u002F7B\u002F32B\u002F72B                 | qwen2_vl             |\n| [Qwen3-VL](https:\u002F\u002Fhuggingface.co\u002FQwen)                           | 2B\u002F4B\u002F8B\u002F30B\u002F32B\u002F235B            | qwen3_vl             |\n| [Seed (OSS\u002FCoder)](https:\u002F\u002Fhuggingface.co\u002FByteDance-Seed)         | 8B\u002F36B                           | seed_oss\u002Fseed_coder  |\n| [StarCoder 2](https:\u002F\u002Fhuggingface.co\u002Fbigcode)                     | 3B\u002F7B\u002F15B                        | -                    |\n| [TeleChat 2-2.5](https:\u002F\u002Fhuggingface.co\u002FTele-AI)                  | 3B\u002F7B\u002F35B\u002F115B                   | telechat2            |\n| [Yuan 2](https:\u002F\u002Fhuggingface.co\u002FIEITYuan)                         | 2B\u002F51B\u002F102B                      | yuan                 |\n\n> [!NOTE]\n> For the \"base\" models, the `template` argument can be chosen from `default`, `alpaca`, `vicuna` etc. But make sure to use the **corresponding template** for the \"instruct\u002Fchat\" models.\n>\n> If the model has both reasoning and non-reasoning versions, please use the `_nothink` suffix to distinguish between them. 
For example, `qwen3` and `qwen3_nothink`.\n>\n> Remember to use the **SAME** template in training and inference.\n>\n> \\*: You should install `transformers` from the main branch and use `DISABLE_VERSION_CHECK=1` to skip the version check.\n>\n> \\*\\*: You need to install a specific version of `transformers` to use the corresponding model.\n\nPlease refer to [constants.py](src\u002Fllamafactory\u002Fextras\u002Fconstants.py) for a full list of the models we support.\n\nYou can also add a custom chat template to [template.py](src\u002Fllamafactory\u002Fdata\u002Ftemplate.py).\n\n## Supported Training Approaches\n\n| Approach               |     Full-tuning    |    Freeze-tuning   |       LoRA         |       QLoRA        |        OFT         |        QOFT        |\n| ---------------------- | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ | ------------------ |\n| Pre-Training           | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |\n| Supervised Fine-Tuning | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |\n| Reward Modeling        | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |\n| PPO Training           | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |\n| DPO Training           | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |\n| KTO Training           | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |\n| ORPO Training          | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: 
|\n| SimPO Training         | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: |\n\n> [!TIP]\n> The implementation details of PPO can be found in [this blog](https:\u002F\u002Fnewfacade.github.io\u002Fnotes-on-reinforcement-learning\u002F17-ppo-trl.html).\n\n## Provided Datasets\n\n\u003Cdetails>\u003Csummary>Pre-training datasets\u003C\u002Fsummary>\n\n- [Wiki Demo (en)](data\u002Fwiki_demo.txt)\n- [RefinedWeb (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Ftiiuae\u002Ffalcon-refinedweb)\n- [RedPajama V2 (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Ftogethercomputer\u002FRedPajama-Data-V2)\n- [Wikipedia (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Folm\u002Folm-wikipedia-20221220)\n- [Wikipedia (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fpleisto\u002Fwikipedia-cn-20230720-filtered)\n- [Pile (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FEleutherAI\u002Fpile)\n- [SkyPile (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FSkywork\u002FSkyPile-150B)\n- [FineWeb (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FHuggingFaceFW\u002Ffineweb)\n- [FineWeb-Edu (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FHuggingFaceFW\u002Ffineweb-edu)\n- [CCI3-HQ (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBAAI\u002FCCI3-HQ)\n- [CCI3-Data (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBAAI\u002FCCI3-Data)\n- [CCI4.0-M2-Base-v1 (en&zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBAAI\u002FCCI4.0-M2-Base-v1)\n- [CCI4.0-M2-CoT-v1 (en&zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBAAI\u002FCCI4.0-M2-CoT-v1)\n- [CCI4.0-M2-Extra-v1 (en&zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBAAI\u002FCCI4.0-M2-Extra-v1)\n- [The Stack (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fbigcode\u002Fthe-stack)\n- [StarCoder 
(en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fbigcode\u002Fstarcoderdata)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>Supervised fine-tuning datasets\u003C\u002Fsummary>\n\n- [Identity (en&zh)](data\u002Fidentity.json)\n- [Stanford Alpaca (en)](https:\u002F\u002Fgithub.com\u002Ftatsu-lab\u002Fstanford_alpaca)\n- [Stanford Alpaca (zh)](https:\u002F\u002Fgithub.com\u002Fymcui\u002FChinese-LLaMA-Alpaca-3)\n- [Alpaca GPT4 (en&zh)](https:\u002F\u002Fgithub.com\u002FInstruction-Tuning-with-GPT-4\u002FGPT-4-LLM)\n- [Glaive Function Calling V2 (en&zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fglaiveai\u002Fglaive-function-calling-v2)\n- [LIMA (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FGAIR\u002Flima)\n- [Guanaco Dataset (multilingual)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJosephusCheung\u002FGuanacoDataset)\n- [BELLE 2M (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBelleGroup\u002Ftrain_2M_CN)\n- [BELLE 1M (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBelleGroup\u002Ftrain_1M_CN)\n- [BELLE 0.5M (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBelleGroup\u002Ftrain_0.5M_CN)\n- [BELLE Dialogue 0.4M (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBelleGroup\u002Fgenerated_chat_0.4M)\n- [BELLE School Math 0.25M (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBelleGroup\u002Fschool_math_0.25M)\n- [BELLE Multiturn Chat 0.8M (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBelleGroup\u002Fmultiturn_chat_0.8M)\n- [UltraChat (en)](https:\u002F\u002Fgithub.com\u002Fthunlp\u002FUltraChat)\n- [OpenPlatypus (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fgarage-bAInd\u002FOpen-Platypus)\n- [CodeAlpaca 20k (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fsahil2801\u002FCodeAlpaca-20k)\n- [Alpaca CoT (multilingual)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FQingyiSi\u002FAlpaca-CoT)\n- [OpenOrca 
(en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FOpen-Orca\u002FOpenOrca)\n- [SlimOrca (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FOpen-Orca\u002FSlimOrca)\n- [MathInstruct (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FTIGER-Lab\u002FMathInstruct)\n- [Firefly 1.1M (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FYeungNLP\u002Ffirefly-train-1.1M)\n- [Wiki QA (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fwiki_qa)\n- [Web QA (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fsuolyer\u002Fwebqa)\n- [WebNovel (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fzxbsmk\u002Fwebnovel_cn)\n- [Nectar (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fberkeley-nest\u002FNectar)\n- [deepctrl (en&zh)](https:\u002F\u002Fwww.modelscope.cn\u002Fdatasets\u002Fdeepctrl\u002Fdeepctrl-sft-data)\n- [Advertise Generating (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FHasturOfficial\u002Fadgen)\n- [ShareGPT Hyperfiltered (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Ftotally-not-an-llm\u002Fsharegpt-hyperfiltered-3k)\n- [ShareGPT4 (en&zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fshibing624\u002Fsharegpt_gpt4)\n- [UltraChat 200k (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FHuggingFaceH4\u002Fultrachat_200k)\n- [Infinity Instruct (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBAAI\u002FInfinity-Instruct)\n- [AgentInstruct (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FTHUDM\u002FAgentInstruct)\n- [LMSYS Chat 1M (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Flmsys\u002Flmsys-chat-1m)\n- [Evol Instruct V2 (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FWizardLM\u002FWizardLM_evol_instruct_V2_196k)\n- [Cosmopedia (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FHuggingFaceTB\u002Fcosmopedia)\n- [STEM (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fhfl\u002Fstem_zh_instruction)\n- [Ruozhiba 
(zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fhfl\u002Fruozhiba_gpt4_turbo)\n- [Neo-sft (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fm-a-p\u002Fneo_sft_phase2)\n- [Magpie-Pro-300K-Filtered (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FMagpie-Align\u002FMagpie-Pro-300K-Filtered)\n- [Magpie-ultra-v0.1 (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fargilla\u002Fmagpie-ultra-v0.1)\n- [WebInstructSub (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FTIGER-Lab\u002FWebInstructSub)\n- [OpenO1-SFT (en&zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FO1-OPEN\u002FOpenO1-SFT)\n- [Open-Thoughts (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fopen-thoughts\u002FOpenThoughts-114k)\n- [Open-R1-Math (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fopen-r1\u002FOpenR1-Math-220k)\n- [Chinese-DeepSeek-R1-Distill (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FCongliu\u002FChinese-DeepSeek-R1-Distill-data-110k-SFT)\n- [LLaVA mixed (en&zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBUAADreamer\u002Fllava-en-zh-300k)\n- [Pokemon-gpt4o-captions (en&zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fjugg1024\u002Fpokemon-gpt4o-captions)\n- [DLR-Web (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FAttention1115\u002FDLR-Web)\n- [Open Assistant (de)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fmayflowergmbh\u002Foasst_de)\n- [Dolly 15k (de)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fmayflowergmbh\u002Fdolly-15k_de)\n- [Alpaca GPT4 (de)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fmayflowergmbh\u002Falpaca-gpt4_de)\n- [OpenSchnabeltier (de)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fmayflowergmbh\u002Fopenschnabeltier_de)\n- [Evol Instruct (de)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fmayflowergmbh\u002Fevol-instruct_de)\n- [Dolphin (de)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fmayflowergmbh\u002Fdolphin_de)\n- 
[Booksum (de)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fmayflowergmbh\u002Fbooksum_de)\n- [Airoboros (de)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fmayflowergmbh\u002Fairoboros-3.0_de)\n- [Ultrachat (de)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fmayflowergmbh\u002Fultra-chat_de)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>Preference datasets\u003C\u002Fsummary>\n\n- [DPO mixed (en&zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fhiyouga\u002FDPO-En-Zh-20k)\n- [UltraFeedback (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FHuggingFaceH4\u002Fultrafeedback_binarized)\n- [COIG-P (zh)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fm-a-p\u002FCOIG-P)\n- [RLHF-V (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fopenbmb\u002FRLHF-V-Dataset)\n- [VLFeedback (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FZhihui\u002FVLFeedback)\n- [RLAIF-V (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fopenbmb\u002FRLAIF-V-Dataset)\n- [Orca DPO Pairs (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FIntel\u002Forca_dpo_pairs)\n- [HH-RLHF (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FAnthropic\u002Fhh-rlhf)\n- [Nectar (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fberkeley-nest\u002FNectar)\n- [Orca DPO (de)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fmayflowergmbh\u002Fintel_orca_dpo_pairs_de)\n- [KTO mixed (en)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fargilla\u002Fkto-mix-15k)\n\n\u003C\u002Fdetails>\n\nSome datasets require confirmation before using them, so we recommend logging in with your Hugging Face account using these commands.\n\n```bash\npip install \"huggingface_hub\u003C1.0.0\"\nhuggingface-cli login\n```\n\n## Requirement\n\n| Mandatory    | Minimum | Recommend |\n| ------------ | ------- | --------- |\n| python       | 3.11    | >=3.11    |\n| torch        | 2.0.0   | 2.6.0     |\n| torchvision  | 0.15.0  | 0.21.0    |\n| 
transformers | 4.49.0  | 4.50.0    |\n| datasets     | 2.16.0  | 3.2.0     |\n| accelerate   | 0.34.0  | 1.2.1     |\n| peft         | 0.14.0  | 0.15.1    |\n| trl          | 0.8.6   | 0.9.6     |\n\n| Optional     | Minimum | Recommend |\n| ------------ | ------- | --------- |\n| CUDA         | 11.6    | 12.2      |\n| deepspeed    | 0.10.0  | 0.16.4    |\n| bitsandbytes | 0.39.0  | 0.43.1    |\n| vllm         | 0.4.3   | 0.8.2     |\n| flash-attn   | 2.5.6   | 2.7.2     |\n\n### Hardware Requirement\n\n\\* *estimated*\n\n| Method                              | Bits |   7B  |  14B  |  30B  |   70B  |   `x`B  |\n| ----------------------------------- | ---- | ----- | ----- | ----- | ------ | ------- |\n| Full (`bf16` or `fp16`)             |  32  | 120GB | 240GB | 600GB | 1200GB | `18x`GB |\n| Full (`pure_bf16`)                  |  16  |  60GB | 120GB | 300GB |  600GB |  `8x`GB |\n| Freeze\u002FLoRA\u002FGaLore\u002FAPOLLO\u002FBAdam\u002FOFT |  16  |  16GB |  32GB |  64GB |  160GB |  `2x`GB |\n| QLoRA \u002F QOFT                        |   8  |  10GB |  20GB |  40GB |   80GB |   `x`GB |\n| QLoRA \u002F QOFT                        |   4  |   6GB |  12GB |  24GB |   48GB | `x\u002F2`GB |\n| QLoRA \u002F QOFT                        |   2  |   4GB |   8GB |  16GB |   24GB | `x\u002F4`GB |\n\n## Getting Started\n\n### Installation\n\n> [!IMPORTANT]\n> Installation is mandatory.\n\n#### Install from Source\n\n```bash\ngit clone --depth 1 https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLlamaFactory.git\ncd LlamaFactory\npip install -e .\npip install -r requirements\u002Fmetrics.txt\n```\n\nOptional dependencies available: `metrics`, `deepspeed`. Install with: `pip install -e . 
&& pip install -r requirements\u002Fmetrics.txt -r requirements\u002Fdeepspeed.txt`\n\nAdditional dependencies for specific features are available in `examples\u002Frequirements\u002F`.\n\n#### Install from Docker Image\n\n```bash\ndocker run -it --rm --gpus=all --ipc=host hiyouga\u002Fllamafactory:latest\n```\n\nThis image is built on Ubuntu 22.04 (x86\\_64), CUDA 12.4, Python 3.11, PyTorch 2.6.0, and Flash-attn 2.7.4.\n\nFind the pre-built images: https:\u002F\u002Fhub.docker.com\u002Fr\u002Fhiyouga\u002Fllamafactory\u002Ftags\n\nPlease refer to [build docker](#build-docker) to build the image yourself.\n\n\u003Cdetails>\u003Csummary>Setting up a virtual environment with \u003Cb>uv\u003C\u002Fb>\u003C\u002Fsummary>\n\nCreate an isolated Python environment with [uv](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fuv):\n\n```bash\nuv run llamafactory-cli webui\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>For Windows users\u003C\u002Fsummary>\n\n#### Install PyTorch\n\nYou need to manually install the GPU version of PyTorch on the Windows platform. 
Please refer to the [official website](https:\u002F\u002Fpytorch.org\u002Fget-started\u002Flocally\u002F) and use the following commands to install PyTorch with CUDA support:\n\n```bash\npip uninstall torch torchvision torchaudio\npip install torch torchvision torchaudio --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu126\npython -c \"import torch; print(torch.cuda.is_available())\"\n```\n\nIf you see `True`, you have successfully installed PyTorch with CUDA support.\n\nTry `dataloader_num_workers: 0` if you encounter a `Can't pickle local object` error.\n\n#### Install BitsAndBytes\n\nIf you want to enable quantized LoRA (QLoRA) on the Windows platform, you need to install a pre-built version of the `bitsandbytes` library, which supports CUDA 11.1 to 12.2. Please select the appropriate [release version](https:\u002F\u002Fgithub.com\u002Fjllllll\u002Fbitsandbytes-windows-webui\u002Freleases\u002Ftag\u002Fwheels) based on your CUDA version.\n\n```bash\npip install https:\u002F\u002Fgithub.com\u002Fjllllll\u002Fbitsandbytes-windows-webui\u002Freleases\u002Fdownload\u002Fwheels\u002Fbitsandbytes-0.41.2.post2-py3-none-win_amd64.whl\n```\n\n#### Install Flash Attention-2\n\nTo enable FlashAttention-2 on the Windows platform, please use the script from [flash-attention-windows-wheel](https:\u002F\u002Fhuggingface.co\u002Flldacing\u002Fflash-attention-windows-wheel) to compile and install it yourself.\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>For Ascend NPU users\u003C\u002Fsummary>\n\nTo install LLaMA Factory on Ascend NPU devices, please upgrade Python to version 3.10 or higher and install the NPU requirements: `pip install -r requirements\u002Fnpu.txt`. Additionally, you need to install the **Ascend CANN Toolkit and Kernels**. 
Please follow the [installation tutorial](https:\u002F\u002Fllamafactory.readthedocs.io\u002Fen\u002Flatest\u002Fadvanced\u002Fnpu_installation.html).\n\n\nYou can also download the pre-built Docker images:\n\n```bash\n# Docker Hub\ndocker pull hiyouga\u002Fllamafactory:latest-npu-a2\ndocker pull hiyouga\u002Fllamafactory:latest-npu-a3\n\n# quay.io\ndocker pull quay.io\u002Fascend\u002Fllamafactory:latest-npu-a2\ndocker pull quay.io\u002Fascend\u002Fllamafactory:latest-npu-a3\n```\n\n#### Install BitsAndBytes\n\nTo use QLoRA based on bitsandbytes on Ascend NPU, please follow these 3 steps:\n\n1. Manually compile bitsandbytes: Refer to [the installation documentation](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Fbitsandbytes\u002Finstallation?backend=Ascend+NPU&platform=Ascend+NPU) for the NPU version of bitsandbytes to complete the compilation and installation. The compilation requires a cmake version of at least 3.22.1 and a g++ version of at least 12.x.\n\n```bash\n# Install bitsandbytes from source\n# Clone bitsandbytes repo, Ascend NPU backend is currently enabled on multi-backend-refactor branch\ngit clone -b multi-backend-refactor https:\u002F\u002Fgithub.com\u002Fbitsandbytes-foundation\u002Fbitsandbytes.git\ncd bitsandbytes\u002F\n\n# Install dependencies\npip install -r requirements-dev.txt\n\n# Install the dependencies for the compilation tools. Note that the commands for this step may vary depending on the operating system. The following are provided for reference\napt-get install -y build-essential cmake\n\n# Compile & install  \ncmake -DCOMPUTE_BACKEND=npu -S .\nmake\npip install .\n```\n\n2. Install transformers from the main branch.\n\n```bash\ngit clone -b main https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers.git\ncd transformers\npip install .\n```\n\n3. Set `double_quantization: false` in the configuration. 
You can refer to the [example](examples\u002Ftrain_qlora\u002Fqwen3_lora_sft_bnb_npu.yaml).\n\n\u003C\u002Fdetails>\n\n### Data Preparation\n\nPlease refer to [data\u002FREADME.md](data\u002FREADME.md) for details about the format of dataset files. You can use datasets from the Hugging Face \u002F ModelScope \u002F Modelers hubs, load a dataset from local disk, or specify a path to S3\u002FGCS cloud storage.\n\n> [!NOTE]\n> Please update `data\u002Fdataset_info.json` to use your custom dataset.\n\nYou can also use **[Easy Dataset](https:\u002F\u002Fgithub.com\u002FConardLi\u002Feasy-dataset)**, **[DataFlow](https:\u002F\u002Fgithub.com\u002FOpenDCAI\u002FDataFlow)** and **[GraphGen](https:\u002F\u002Fgithub.com\u002Fopen-sciencelab\u002FGraphGen)** to create synthetic data for fine-tuning.\n\n### Quickstart\n\nUse the following 3 commands to run LoRA **fine-tuning**, **inference** and **merging** of the Qwen3-4B-Instruct model, respectively.\n\n```bash\nllamafactory-cli train examples\u002Ftrain_lora\u002Fqwen3_lora_sft.yaml\nllamafactory-cli chat examples\u002Finference\u002Fqwen3_lora_sft.yaml\nllamafactory-cli export examples\u002Fmerge_lora\u002Fqwen3_lora_sft.yaml\n```\n\nSee [examples\u002FREADME.md](examples\u002FREADME.md) for advanced usage (including distributed training).\n\n> [!TIP]\n> Use `llamafactory-cli help` to show help information.\n>\n> Read the [FAQs](https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\u002Fissues\u002F4614) first if you encounter any problems.\n\n### Fine-Tuning with LLaMA Board GUI (powered by [Gradio](https:\u002F\u002Fgithub.com\u002Fgradio-app\u002Fgradio))\n\n```bash\nllamafactory-cli webui\n```\n\n### LLaMA Factory Online\n\nRead our [documentation](https:\u002F\u002Fdocs.llamafactory.com.cn\u002Fdocs\u002Fdocuments\u002Fquickstart\u002Fgetstarted\u002F?utm_source=LLaMA-Factory).\n\n### Build Docker\n\nFor CUDA users:\n\n```bash\ncd docker\u002Fdocker-cuda\u002F\ndocker compose up -d\ndocker compose exec 
llamafactory bash\n```\n\nFor Ascend NPU users:\n\n```bash\ncd docker\u002Fdocker-npu\u002F\ndocker compose up -d\ndocker compose exec llamafactory bash\n```\n\nFor AMD ROCm users:\n\n```bash\ncd docker\u002Fdocker-rocm\u002F\ndocker compose up -d\ndocker compose exec llamafactory bash\n```\n\n\u003Cdetails>\u003Csummary>Build without Docker Compose\u003C\u002Fsummary>\n\nFor CUDA users:\n\n```bash\ndocker build -f .\u002Fdocker\u002Fdocker-cuda\u002FDockerfile \\\n    --build-arg PIP_INDEX=https:\u002F\u002Fpypi.org\u002Fsimple \\\n    -t llamafactory:latest .\n\ndocker run -dit --ipc=host --gpus=all \\\n    -p 7860:7860 \\\n    -p 8000:8000 \\\n    --name llamafactory \\\n    llamafactory:latest\n\ndocker exec -it llamafactory bash\n```\n\nFor Ascend NPU users:\n\n```bash\ndocker build -f .\u002Fdocker\u002Fdocker-npu\u002FDockerfile \\\n    --build-arg PIP_INDEX=https:\u002F\u002Fpypi.org\u002Fsimple \\\n    -t llamafactory:latest .\n\ndocker run -dit --ipc=host \\\n    -v \u002Fusr\u002Flocal\u002Fdcmi:\u002Fusr\u002Flocal\u002Fdcmi \\\n    -v \u002Fusr\u002Flocal\u002Fbin\u002Fnpu-smi:\u002Fusr\u002Flocal\u002Fbin\u002Fnpu-smi \\\n    -v \u002Fusr\u002Flocal\u002FAscend\u002Fdriver:\u002Fusr\u002Flocal\u002FAscend\u002Fdriver \\\n    -v \u002Fetc\u002Fascend_install.info:\u002Fetc\u002Fascend_install.info \\\n    -p 7860:7860 \\\n    -p 8000:8000 \\\n    --device \u002Fdev\u002Fdavinci0 \\\n    --device \u002Fdev\u002Fdavinci_manager \\\n    --device \u002Fdev\u002Fdevmm_svm \\\n    --device \u002Fdev\u002Fhisi_hdc \\\n    --name llamafactory \\\n    llamafactory:latest\n\ndocker exec -it llamafactory bash\n```\n\nFor AMD ROCm users:\n\n```bash\ndocker build -f .\u002Fdocker\u002Fdocker-rocm\u002FDockerfile \\\n    --build-arg PIP_INDEX=https:\u002F\u002Fpypi.org\u002Fsimple \\\n    -t llamafactory:latest .\n\ndocker run -dit --ipc=host \\\n    -p 7860:7860 \\\n    -p 8000:8000 \\\n    --device \u002Fdev\u002Fkfd \\\n    --device \u002Fdev\u002Fdri \\\n    
--name llamafactory \\\n    llamafactory:latest\n\ndocker exec -it llamafactory bash\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\u003Csummary>Use Docker volumes\u003C\u002Fsummary>\n\nYou can uncomment `VOLUME [ \"\u002Froot\u002F.cache\u002Fhuggingface\", \"\u002Fapp\u002Fshared_data\", \"\u002Fapp\u002Foutput\" ]` in the Dockerfile to use data volumes.\n\nWhen running the Docker container, use the `-v .\u002Fhf_cache:\u002Froot\u002F.cache\u002Fhuggingface` argument to mount the local directory into the container. The following data volumes are available.\n\n- `hf_cache`: Utilize Hugging Face cache on the host machine.\n- `shared_data`: The directory to store datasets on the host machine.\n- `output`: Set export dir to this location so that the merged result can be accessed directly on the host machine.\n\n\u003C\u002Fdetails>\n\n### Deploy with OpenAI-style API and vLLM\n\n```bash\nAPI_PORT=8000 llamafactory-cli api examples\u002Finference\u002Fqwen3.yaml infer_backend=vllm vllm_enforce_eager=true\n```\n\n> [!TIP]\n> Visit [this page](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fapi-reference\u002Fchat\u002Fcreate) for the API documentation.\n>\n> Examples: [Image understanding](scripts\u002Fapi_example\u002Ftest_image.py) | [Function calling](scripts\u002Fapi_example\u002Ftest_toolcall.py)\n\n### Download from ModelScope Hub\n\nIf you have trouble downloading models and datasets from Hugging Face, you can use ModelScope.\n\n```bash\nexport USE_MODELSCOPE_HUB=1 # `set USE_MODELSCOPE_HUB=1` for Windows\n```\n\nTrain the model by specifying a model ID of the ModelScope Hub as the `model_name_or_path`. 
You can find a full list of model IDs at [ModelScope Hub](https:\u002F\u002Fmodelscope.cn\u002Fmodels), e.g., `LLM-Research\u002FMeta-Llama-3-8B-Instruct`.\n\n### Download from Modelers Hub\n\nYou can also use Modelers Hub to download models and datasets.\n\n```bash\nexport USE_OPENMIND_HUB=1 # `set USE_OPENMIND_HUB=1` for Windows\n```\n\nTrain the model by specifying a model ID of the Modelers Hub as the `model_name_or_path`. You can find a full list of model IDs at [Modelers Hub](https:\u002F\u002Fmodelers.cn\u002Fmodels), e.g., `TeleAI\u002FTeleChat-7B-pt`.\n\n### Use W&B Logger\n\nTo use [Weights & Biases](https:\u002F\u002Fwandb.ai) for logging experimental results, add the following arguments to your YAML file.\n\n```yaml\nreport_to: wandb\nrun_name: test_run # optional\n```\n\nSet `WANDB_API_KEY` to [your key](https:\u002F\u002Fwandb.ai\u002Fauthorize) when launching training tasks to log in with your W&B account.\n\n### Use SwanLab Logger\n\nTo use [SwanLab](https:\u002F\u002Fgithub.com\u002FSwanHubX\u002FSwanLab) for logging experimental results, add the following arguments to your YAML file.\n\n```yaml\nuse_swanlab: true\nswanlab_run_name: test_run # optional\n```\n\nWhen launching training tasks, you can log in to SwanLab in three ways:\n\n1. Add `swanlab_api_key=\u003Cyour_api_key>` to the YAML file, and set it to your [API key](https:\u002F\u002Fswanlab.cn\u002Fsettings).\n2. Set the environment variable `SWANLAB_API_KEY` to your [API key](https:\u002F\u002Fswanlab.cn\u002Fsettings).\n3. Use the `swanlab login` command to complete the login.\n\n## Projects using LLaMA Factory\n\nIf you have a project that should be incorporated, please contact us via email or create a pull request.\n\n\u003Cdetails>\u003Csummary>Click to show\u003C\u002Fsummary>\n\n1. Wang et al. ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation. 2023. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.02223)\n1. Yu et al. 
Open, Closed, or Small Language Models for Text Classification? 2023. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.10092)\n1. Wang et al. UbiPhysio: Support Daily Functioning, Fitness, and Rehabilitation with Action Understanding and Feedback in Natural Language. 2023. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.10526)\n1. Luceri et al. Leveraging Large Language Models to Detect Influence Campaigns in Social Media. 2023. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.07816)\n1. Zhang et al. Alleviating Hallucinations of Large Language Models through Induced Hallucinations. 2023. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.15710)\n1. Wang et al. Know Your Needs Better: Towards Structured Understanding of Marketer Demands with Analogical Reasoning Augmented LLMs. KDD 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.04319)\n1. Wang et al. CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning. ACL 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.07286)\n1. Choi et al. FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.05904)\n1. Zhang et al. AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.07625)\n1. Lyu et al. KnowTuning: Knowledge-aware Fine-tuning for Large Language Models. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.11176)\n1. Yang et al. LaCo: Large Language Model Pruning via Layer Collapse. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.11187)\n1. Bhardwaj et al. Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.11746)\n1. Yang et al. 
Enhancing Empathetic Response Generation by Augmenting LLMs with Small-scale Empathetic Models. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.11801)\n1. Yi et al. Generation Meets Verification: Accelerating Large Language Model Inference with Smart Parallel Auto-Correct Decoding. ACL 2024 Findings. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.11809)\n1. Cao et al. Head-wise Shareable Attention for Large Language Models. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.11819)\n1. Zhang et al. Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.12204)\n1. Kim et al. Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.14714)\n1. Yu et al. KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models. ACL 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2402.15043)\n1. Huang et al. Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.02333)\n1. Duan et al. Negating Negatives: Alignment without Human Positive Samples via Distributional Dispreference Optimization. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.03419)\n1. Xie and Schwertfeger. Empowering Robotics with Large Language Models: osmAG Map Comprehension with LLMs. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.08228)\n1. Wu et al. Large Language Models are Parallel Multilingual Learners. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.09073)\n1. Zhang et al. EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.14541)\n1. Weller et al. 
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.15246)\n1. Hongbin Na. CBT-LLM: A Chinese Large Language Model for Cognitive Behavioral Therapy-based Mental Health Question Answering. COLING 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.16008)\n1. Zan et al. CodeS: Natural Language to Code Repository via Multi-Layer Sketch. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.16443)\n1. Liu et al. Extensive Self-Contrast Enables Feedback-Free Language Model Alignment. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.00604)\n1. Luo et al. BAdam: A Memory Efficient Full Parameter Training Method for Large Language Models. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.02827)\n1. Du et al. Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.04167)\n1. Ma et al. Parameter Efficient Quasi-Orthogonal Fine-Tuning via Givens Rotation. ICML 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.04316)\n1. Liu et al. Dynamic Generation of Personalities with Large Language Models. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.07084)\n1. Shang et al. How Far Have We Gone in Stripped Binary Code Understanding Using Large Language Models. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.09836)\n1. Huang et al. LLMTune: Accelerate Database Knob Tuning with Large Language Models. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.11581)\n1. Deng et al. Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.14215)\n1. Acikgoz et al. Hippocrates: An Open-Source Framework for Advancing Large Language Models in Healthcare. 2024. 
[[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.16621)\n1. Zhang et al. Small Language Models Need Strong Verifiers to Self-Correct Reasoning. ACL 2024 Findings. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.17140)\n1. Zhou et al. FREB-TQA: A Fine-Grained Robustness Evaluation Benchmark for Table Question Answering. NAACL 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.18585)\n1. Xu et al. Large Language Models for Cyber Security: A Systematic Literature Review. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.04760)\n1. Dammu et al. \"They are uncultured\": Unveiling Covert Harms and Social Threats in LLM Generated Conversations. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.05378)\n1. Yi et al. A safety realignment framework via subspace-oriented model fusion for large language models. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.09055)\n1. Lou et al. SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.12739)\n1. Zhang et al. Getting More from Less: Large Language Models are Good Spontaneous Multilingual Learners. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.13816)\n1. Zhang et al. TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.20215)\n1. Zihong Chen. Sentence Segmentation and Sentence Punctuation Based on XunziALLM. 2024. [[paper]](https:\u002F\u002Faclanthology.org\u002F2024.lt4hala-1.30)\n1. Gao et al. The Best of Both Worlds: Toward an Honest and Helpful Large Language Model. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.00380)\n1. Wang and Song. MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset. 2024. 
[[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.02106)\n1. Hu et al. Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.03136)\n1. Ge et al. Time Sensitive Knowledge Editing through Efficient Finetuning. ACL 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.04496)\n1. Tan et al. Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.05688)\n1. Song et al. Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.05955)\n1. Gu et al. RWKV-CLIP: A Robust Vision-Language Representation Learner. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.06973)\n1. Chen et al. Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.07115)\n1. Zhu et al. Are Large Language Models Good Statisticians?. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.07815)\n1. Li et al. Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.10099)\n1. Ding et al. IntentionQA: A Benchmark for Evaluating Purchase Intention Comprehension Abilities of Language Models in E-commerce. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.10173)\n1. He et al. COMMUNITY-CROSS-INSTRUCT: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.12074)\n1. Lin et al. FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.14408)\n1. Treutlein et al. 
Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.14546)\n1. Feng et al. SS-Bench: A Benchmark for Social Story Generation and Evaluation. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.15695)\n1. Feng et al. Self-Constructed Context Decompilation with Fined-grained Alignment Enhancement. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.17233)\n1. Liu et al. Large Language Models for Cuffless Blood Pressure Measurement From Wearable Biosignals. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.18069)\n1. Iyer et al. Exploring Very Low-Resource Translation with LLMs: The University of Edinburgh's Submission to AmericasNLP 2024 Translation Task. AmericasNLP 2024. [[paper]](https:\u002F\u002Faclanthology.org\u002F2024.americasnlp-1.25)\n1. Li et al. Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.19949)\n1. Yang et al. Financial Knowledge Large Language Model. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.00365)\n1. Lin et al. DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.01470)\n1. Bako et al. Evaluating the Semantic Profiling Abilities of LLMs for Natural Language Utterances in Data Visualization. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.06129)\n1. Huang et al. RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.08044)\n1. Jiang et al. LLM-Collaboration on Automatic Science Journalism for the General Audience. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.09756)\n1. Inouye et al. Applied Auto-tuning on LoRA Hyperparameters. 2024. 
[[paper]](https:\u002F\u002Fscholarcommons.scu.edu\u002Fcseng_senior\u002F272\u002F)\n1. Qi et al. Research on Tibetan Tourism Viewpoints information generation system based on LLM. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.13561)\n1. Xu et al. Course-Correction: Safety Alignment Using Synthetic Preferences. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.16637)\n1. Sun et al. LAMBDA: A Large Model Based Data Agent. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.17535)\n1. Zhu et al. CollectiveSFT: Scaling Large Language Models for Chinese Medical Benchmark with Collective Instructions in Healthcare. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.19705)\n1. Yu et al. Correcting Negative Bias in Large Language Models through Negative Attention Score Alignment. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.00137)\n1. Xie et al. The Power of Personalized Datasets: Advancing Chinese Composition Writing for Elementary School through Targeted Model Fine-Tuning. IALP 2024. [[paper]](https:\u002F\u002Fwww.asianlp.sg\u002Fconferences\u002Fialp2024\u002Fproceedings\u002Fpapers\u002FIALP2024_P055.pdf)\n1. Liu et al. Instruct-Code-Llama: Improving Capabilities of Language Model in Competition Level Code Generation by Online Judge Feedback. ICIC 2024. [[paper]](https:\u002F\u002Flink.springer.com\u002Fchapter\u002F10.1007\u002F978-981-97-5669-8_11)\n1. Wang et al. Cybernetic Sentinels: Unveiling the Impact of Safety Data Selection on Model Security in Supervised Fine-Tuning. ICIC 2024. [[paper]](https:\u002F\u002Flink.springer.com\u002Fchapter\u002F10.1007\u002F978-981-97-5669-8_23)\n1. Xia et al. Understanding the Performance and Estimating the Cost of LLM Fine-Tuning. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.04693)\n1. Zeng et al. Perceive, Reflect, and Plan: Designing LLM Agent for Goal-Directed City Navigation without Instructions. 2024. 
[[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.04168)\n1. Xia et al. Using Pre-trained Language Model for Accurate ESG Prediction. FinNLP 2024. [[paper]](https:\u002F\u002Faclanthology.org\u002F2024.finnlp-2.1\u002F)\n1. Liang et al. I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm. 2024. [[arxiv]](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.08072)\n1. Bai et al. Aligning Large Language Model with Direct Multi-Preference Optimization for Recommendation. CIKM 2024. [[paper]](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3627673.3679611)\n1. Zhang et al. CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling. ACL 2024. [[paper]](https:\u002F\u002Faclanthology.org\u002F2024.findings-acl.830.pdf)\n1. **[StarWhisper](https:\u002F\u002Fgithub.com\u002FYu-Yang-Li\u002FStarWhisper)**: A large language model for Astronomy, based on ChatGLM2-6B and Qwen-14B.\n1. **[DISC-LawLLM](https:\u002F\u002Fgithub.com\u002FFudanDISC\u002FDISC-LawLLM)**: A large language model specialized in the Chinese legal domain, based on Baichuan-13B and capable of retrieving and reasoning on legal knowledge.\n1. **[Sunsimiao](https:\u002F\u002Fgithub.com\u002FX-D-Lab\u002FSunsimiao)**: A large language model specialized in the Chinese medical domain, based on Baichuan-7B and ChatGLM-6B.\n1. **[CareGPT](https:\u002F\u002Fgithub.com\u002FWangRongsheng\u002FCareGPT)**: A series of large language models for the Chinese medical domain, based on LLaMA2-7B and Baichuan-13B.\n1. **[MachineMindset](https:\u002F\u002Fgithub.com\u002FPKU-YuanGroup\u002FMachine-Mindset\u002F)**: A series of MBTI Personality large language models, capable of giving any LLM 16 different personality types based on different datasets and training methods.\n1. 
**[Luminia-13B-v3](https:\u002F\u002Fhuggingface.co\u002FNekochu\u002FLuminia-13B-v3)**: A large language model specialized in generate met","LLaMA Factory 是一个用于高效微调超过100种大型语言模型（LLMs）和视觉语言模型（VLMs）的统一框架。该项目支持多种微调技术，如LoRA、QLoRA以及参数高效微调（PEFT），并提供量化工具以减少模型大小和推理时间，适用于需要快速迭代模型的应用场景。此外，它还兼容主流的深度学习库如Hugging Face Transformers，并且可以在多种平台上运行，包括Colab、Docker及各大云服务商环境。无论是学术研究还是工业应用，特别是当面对资源受限或对模型性能有高要求的情况时，LLaMA Factory 都是一个理想的选择。",2,"2026-06-11 02:35:57","top_all"]