[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-75197":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":8,"language":10,"languages":8,"totalLinesOfCode":8,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":8,"rankLanguage":8,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":8,"pushedAt":8,"updatedAt":38,"readmeContent":39,"aiSummary":40,"trendingCount":15,"starSnapshotCount":15,"syncStatus":41,"lastSyncTime":42,"discoverSource":43},75197,"Jackrong-llm-finetuning-guide","R6410418\u002FJackrong-llm-finetuning-guide","R6410418",null,"https:\u002F\u002Fhuggingface.co\u002FJackrong","Jupyter Notebook",1388,238,21,10,0,38,65,236,114,102.14,"Apache License 2.0",false,"main",true,[26,27,28,29,30,31,32,33,34,35,36,37],"dataset","deepseek","fine-tuning","guide","llama3","llm","machine-learning","nlp","openai","pytorch","qwen","unsloth","2026-06-12 04:01:18","\u003Cdiv align=\"center\">\n\n# Jackrong-llm-finetuning-guide 🌌\n**An Educational, End-to-End LLM Fine-Tuning Pipeline for Beginners and Developers**\n\n🌐 **Select Language:**  🇬🇧 **English** ｜ [🇨🇳 中文](.\u002Fdocs\u002FREADME_zh.md) ｜ [🇰🇷 한국어](.\u002Fdocs\u002FREADME_ko.md) ｜ [🇯🇵 日本語](.\u002Fdocs\u002FREADME_ja.md)\n\n🤗 **HuggingFace:** [Jackrong](https:\u002F\u002Fhuggingface.co\u002FJackrong)\n\n\u003Cbr>\n\n[![Unsloth](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPowered%20by-Unsloth-8A2BE2?style=flat-square)](https:\u002F\u002Fgithub.com\u002Funslothai\u002Funsloth)\n[![Google 
Colab](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FEnvironment-Google%20Colab-F9AB00?style=flat-square&logo=googlecolab&logoColor=white)](https:\u002F\u002Fcolab.research.google.com\u002F)\n[![PyTorch](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FFramework-PyTorch-EE4C2C?style=flat-square&logo=pytorch&logoColor=white)](https:\u002F\u002Fpytorch.org\u002F)\n[![Hugging Face](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FModel%20Hub-Hugging%20Face-FFD21E?style=flat-square&logo=huggingface&logoColor=black)](https:\u002F\u002Fhuggingface.co\u002F)\n[![LoRA PEFT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTechnique-LoRA%20%2F%20PEFT-007EC6?style=flat-square)](#)\n[![Beginner Friendly](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLevel-Beginner%20Friendly-brightgreen?style=flat-square)](#)\n\n\u003C\u002Fdiv>\n\n---\n\n## 📑 Abstract\n\nAn educational Large Language Model (LLM) fine-tuning repository designed for beginners and developers. This project provides detailed theoretical explanations, robust data processing workflows, reproducible training pipelines (including Supervised Fine-Tuning and future Reinforcement Learning implementations), and practical deployment strategies. The full training code for the author's open-source projects is fully accessible within this repository.\n\n\u003Cdiv align=\"center\">\n  \u003Cimg width=\"100%\" alt=\"Project Overview\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F67dff316-e77e-4909-a338-56caeb6583b4\" style=\"border-radius: 8px; box-shadow: 0 4px 8px rgba(0,0,0,0.1); margin: 20px 0;\"\u002F>\n\u003C\u002Fdiv>\n\n---\n\n## 🏛️ About This Project\n\nThis repository is designed as a **\"Zero to One\"** learning platform. Whether you have zero technical background or are an experienced developer, you will find reproducible, end-to-end guides that walk you through the entire lifecycle of large language models. 
Starting from simply registering a Google account and opening Colab, you will learn how to efficiently adapt models to your specific domain needs.\n\n### ✨ Key Features & Offerings\n\n| Aspect | Description |\n| :--- | :--- |\n| 🛤️ **0-to-1 Learning Path** | Step-by-step guides starting from the absolute basics, requiring nothing more than a browser and a free cloud environment. |\n| 🔄 **Diverse Training Workflows** | Codebases covering Supervised Fine-Tuning (SFT) and foundational setups for Reinforcement Learning (RL) and other advanced paradigms. |\n| ⚡ **Resource-Efficient Engineering** | Leveraging tools like Unsloth and 4-bit quantization to run large-scale training within single-GPU constraints (e.g., standard Google Colab). |\n| 📦 **End-to-End Delivery** | From multi-source data normalization to LoRA adaptation, merged 16-bit exports, and GGUF quantization for local deployment. |\n\n---\n\n## 💡 A Message to Builders\n\n> [!NOTE]\n> *\"For beginners, hobbyists, and anyone curious about AI: this path is learnable.\"*\n\nThe purpose of this document is not only to describe one training run, but also to communicate a broader message: fine-tuning, post-training, and even moderate-scale pretraining are **not inaccessible technical rituals**. They are **engineering practices** that can be learned, reproduced, and gradually mastered. With open-source models, public datasets, cloud compute platforms, and an increasingly mature training toolchain, what you often need is simply a Google account, a regular laptop, and sustained curiosity.\n\nAs a learner who also started from zero, I understand the uncertainty many newcomers face: environment setup complexity, opaque hyperparameters, and anxiety about compute resources often become the first barrier to entry. 
This is exactly why optimization toolchains such as Unsloth matter: by improving training efficiency and resource utilization, they substantially lower the practical threshold for large-model fine-tuning, turning what once required expensive hardware and specialized experience into something ordinary developers can attempt and master.\n\n**In that sense, we all have the opportunity to stand on the shoulders of giants, understand models, adapt models, and give them new capabilities.**\n\n*No one starts as an expert. But every expert was once brave enough to begin.*\n\n## 🚀 Upcoming Model Support & Roadmap\n\nIn the near future, this repository will continuously expand its support for the latest state-of-the-art open-source model families. The upcoming tutorials and codebases will comprehensively cover both **Supervised Fine-Tuning (SFT)** and **Reinforcement Learning (RL - specifically GRPO)** pipelines.\n\nBelow is the planned support matrix for upcoming model families:\n\n| Model Family | SFT Support | RL (GRPO) Support |\n| :--- | :---: | :---: |\n| **Qwen 3.5** | ✅ Released | Scheduled |\n| **Qwen 3** | Scheduled | Scheduled |\n| **Llama3.2-R1 (3B)** | ✅ Included | ✅ Released |\n| **Llama** (3.1 \u002F 3.3) | Scheduled | Scheduled |\n| **Phi-4** | Scheduled | Scheduled |\n| **Gemma 4** | Scheduled | Scheduled |\n| **DeepSeek** | Scheduled | Scheduled |\n\n---\n\n## 📓 Interactive Training Notebooks\n\nBelow are the interactive Kaggle and Colab notebooks, organized by model architecture. You can run the entire pipeline—from data preparation to training and inference—directly in your browser. 
All notebooks are available in the [`train_code`](.\u002Ftrain_code\u002F) repository folder.\n\n### 🌟 Main Notebooks\n\n#### 🗄️ Dataset Preparation\n| 🤖 Model Architecture | 🛠️ Pipeline | 🚀 Quick Setup |\n| :--- | :--- | :---: |\n| **Teacher model distill pipeline** | Data Processing | [![Python Code](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCode-Python-3776AB?style=flat-square&logo=python&logoColor=white)](https:\u002F\u002Fgithub.com\u002FR6410418\u002FJackrong-llm-finetuning-guide\u002Fblob\u002Fmain\u002Fdata_processing_code\u002FDeepSeek-v4-API-distill.ipynb) |\n\n#### 🏋️ Model Training\n| 🤖 Model Architecture | 🛠️ Pipeline | 🚀 Quick Setup (1-Click Run) |\n| :--- | :--- | :---: |\n| **Qwopus3.5 (27B)** | SFT | [![Open In Colab](https:\u002F\u002Fcolab.research.google.com\u002Fassets\u002Fcolab-badge.svg)](https:\u002F\u002Fcolab.research.google.com\u002Fgithub\u002FR6410418\u002FJackrong-llm-finetuning-guide\u002Fblob\u002Fmain\u002Ftrain_code\u002FQwopus3-5-27b-Colab.ipynb) |\n| **Qwen3.5 (9B)** | SFT | [![Open In Kaggle](https:\u002F\u002Fkaggle.com\u002Fstatic\u002Fimages\u002Fopen-in-kaggle.svg)](https:\u002F\u002Fkaggle.com\u002Fkernels\u002Fwelcome?src=https:\u002F\u002Fgithub.com\u002FR6410418\u002FJackrong-llm-finetuning-guide\u002Fblob\u002Fmain\u002Ftrain_code\u002FQwen3.5-9B-Neo-Kaggle.ipynb) |\n| **Qwopus3.5 (35B)** | SFT | [![Open In Kaggle](https:\u002F\u002Fkaggle.com\u002Fstatic\u002Fimages\u002Fopen-in-kaggle.svg)](https:\u002F\u002Fkaggle.com\u002Fkernels\u002Fwelcome?src=https:\u002F\u002Fgithub.com\u002FR6410418\u002FJackrong-llm-finetuning-guide\u002Fblob\u002Fmain\u002Ftrain_code\u002FQwopus-3.5-35B-A3B-Kaggle.ipynb) |\n| **Llama3.2-R1 (3B)** | RL (GRPO) | [![Open In
Kaggle](https:\u002F\u002Fkaggle.com\u002Fstatic\u002Fimages\u002Fopen-in-kaggle.svg)](https:\u002F\u002Fkaggle.com\u002Fkernels\u002Fwelcome?src=https:\u002F\u002Fgithub.com\u002FR6410418\u002FJackrong-llm-finetuning-guide\u002Fblob\u002Fmain\u002Ftrain_code\u002FLlama-3.2-3B-R1-Zero-GRPO.ipynb) |\n\n---\n\n## 📖 Comprehensive Model Training Guide\n\nFor a detailed, step-by-step PDF walkthrough of the entire Qwopus 3.5 fine-tuning process—including environment setup, data preparation, and optimization tips—please refer to our latest guide:\n\n> [!TIP]\n> **🔗 [Download Complete Guide: Qwopus3-5-27b-Colab_complete_guide_to_llm_finetuning.pdf](https:\u002F\u002Fgithub.com\u002FR6410418\u002FJackrong-llm-finetuning-guide\u002Fblob\u002Fmain\u002FguidePDF\u002FQwopus3-5-27b-Colab_complete_guide_to_llm_finetuning.pdf)**\n\n\n---\n\n## 📚 High-Fidelity Distillation Datasets\n\nHigh-quality data is the engine of effective model adaptation. In parallel with our training code, this repository provides access to **24 curated, high-fidelity datasets** specifically collected and distilled to enhance model reasoning, coding, and conversational capabilities.\n\nThese datasets are primarily distilled from state-of-the-art flagship models (such as *DeepSeek-V3.2*, *Qwen3-235B*, *GLM-4.7*, and *GPT-OSS-120B*) and follow advanced Chain-of-Thought (CoT) formatting. 
\n\n**Key Dataset Categories Included:**\n*   🧠 **Reasoning & CoT (Chain-of-Thought):** Datasets like `Jackrong\u002FQwen3.5-reasoning-700`, `Jackrong\u002FNatural-Reasoning-gpt-oss-120B-S1`, and `Jackrong\u002Fglm-4.7-multiturn-CoT` designed to improve step-by-step logic and deduction.\n*   📐 **Mathematics & STEM:** Specialized data such as `DeepSeek-v3.1-reasoner-Distilled-math-samples` and focused domain knowledge like `Jackrong\u002FQwen3-235B-A22B-Instruct-2507-Distilled-chat`.\n*   💻 **Code & Algorithms:** Collections like `Competitive-Programming-python-blend` and `qwen3-coder-480b-distill-mini` to strengthen competitive programming and algorithmic generation.\n*   💬 **Instruction & Multi-turn Chat:** Resources like `Jackrong\u002FLogicMind-Chat-Reasoning-SFT-300K`, `Chinese-Qwen3-235B-Thinking-2507-Distill-100k`, and `ShareGPT-gpt-oss-120B-reasoning` focused on human alignment, IELTS writing feedback, and robust conversational flow.\n\n*All datasets are open-sourced on the [HuggingFace Hub](https:\u002F\u002Fhuggingface.co\u002FJackrong). You can also use the included `download_datasets.py` script to batch download the entire suite for local training.*\n\n---\n\n## 🌍 Open Source Commitment & Community Impact\n\nMoving forward, the complete training source code for every fine-tuned model I release on Hugging Face will be fully open-sourced in this repository. My goal is to ensure that anyone—regardless of their background or resources—can freely download, execute, and learn from these scripts to build their own AI capabilities.\n\nI am deeply grateful for the community's support. The Qwen3.5 fine-tunes I shared on Hugging Face have recently reached over a **million downloads**—a quiet reminder of the power of open knowledge.
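It is my sincere hope that making these full training pipelines publicly available will encourage more developers to start their own fine-tuning journeys.\n\n---\n\n## 📝 Citation\n\nIf you find this repository helpful in your learning or research, please consider citing it:\n\n```bibtex\n@misc{jackrong-llm-finetuning,\n  author = {Jackrong},\n  title = {Jackrong-llm-finetuning-guide: An Educational LLM Fine-Tuning Pipeline},\n  year = {2026},\n  publisher = {GitHub},\n  journal = {GitHub repository},\n  howpublished = {\\url{https:\u002F\u002Fgithub.com\u002FJackrong\u002FJackrong-llm-finetuning-guide}}\n}\n```\n","Jackrong-llm-finetuning-guide is an educational large language model (LLM) fine-tuning project for beginners and developers. It provides detailed theoretical explanations, robust data processing workflows, reproducible training pipelines (including supervised fine-tuning and future reinforcement learning implementations), and practical deployment strategies. Its core technical features include diverse training workflows, resource-efficient engineering (such as using Unsloth and 4-bit quantization for large-scale training on a single GPU), and end-to-end delivery from multi-source data normalization to LoRA adapters. The project is especially suited to practitioners who want to get started quickly and learn how to adapt LLMs to specific domain needs; both newcomers with no technical background and experienced developers can benefit.",2,"2026-06-11 03:52:37","high_star"]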