[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72187":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":31,"readmeContent":32,"aiSummary":33,"trendingCount":16,"starSnapshotCount":16,"syncStatus":34,"lastSyncTime":35,"discoverSource":36},72187,"llama3-Chinese-chat","CrazyBoyM\u002Fllama3-Chinese-chat","CrazyBoyM","Llama3-中文后训练版","https:\u002F\u002Fhuggingface.co\u002FshareAI",null,"Python",4152,335,48,30,0,3,4,9,29.58,"MIT License",false,"main",true,[26,27,28,29,30],"llama","llama2","llama3","llama3-chinese","llama3-finetune","2026-06-12 02:02:59","# llama3-Chinese-chat\n[![HF Demo](https:\u002F\u002Fimg.shields.io\u002Fstatic\u002Fv1?label=Demo&message=OpenBayes%E8%B4%9D%E5%BC%8F%E8%AE%A1%E7%AE%97&color=green)](https:\u002F\u002Fopenbayes.com\u002Fconsole\u002Fhyperai-tutorials\u002Fcontainers\u002FEzsoQaZB8LA)  &ensp;  \n### 1st version of Chinese-llama3 \n首个llama3 中文版  \n本仓库供交流llama3中文相关学习内容，欢迎任何热心朋友加入共建  \n\n### 训练 & 推理流程\n\u003Cimg width=\"67%\" alt=\"训练流程图\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F28fe6edc-9a54-44a7-8dba-f9a9173901ff\" \u002F>\n\u003Cimg width=\"30%\" alt=\"训练与推理\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F22f9ea26-49ca-4d7a-adad-c338f2309b50\" \u002F> \u003Cbr>\n\u003Ccenter>看图快速学习: https:\u002F\u002Fdeepwiki.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u003C\u002Fcenter>\n\n### 通知 
\n🔥新增LLM-Chinese仓库，欢迎关注，偏教程性质，以「模型中文化」为一个典型的模型训练问题切入场景，指导读者上手学习LLM二次微调训练：https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002FLLM-Chinese （含gemma2 中文版模型，2b、 9b尺寸）\n\n如果你有自己微调的版本或者在网上发现有趣的特化版本，欢迎在issue区评论收录。  \n如果你有想要建设的内容版块，欢迎fork提交PR成为核心作者成员。  \n(注意：目前不再接受仅修改单个字、句的typo-PR，请避免频繁提交该类PR)\n\n\n### News 更新记录\n- 2024-07-25 llama3.1 中文DPO版训练权重放出。\n- 2024-07-24 llama3.1 中文版训练计划启动。\n- 2024-05-17 🎉 整理的llama3中文化数据集合在modelscope下载量达2.9k次，连续三周处于modelscope网站首页：[数据下载地址](https:\u002F\u002Fmodelscope.cn\u002Fdatasets\u002Fbaicai003\u002FLlama3-Chinese-dataset\u002Fsummary) \n- 2024-05-17 💪 增加 手写API部署教程、命令调用，[文档地址](https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u002Ftree\u002Fmain\u002Fdeploy\u002FAPI)\n- 2024-05-13 💪 增加LMStudio电脑本地部署教程，[文档教程](https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u002Fblob\u002Fmain\u002Fdeploy\u002FLMStudio\u002FREADME.md)，[手把手视频教程](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1nt421g79T)\n- 2024-05-04 五一假期间：🚀 新增语言偏好强化对齐版本（直接对英文instruct版做DPO）。保持原汁原味的口吻回复(喜欢趣味语言、emoji表情)，[模型下载](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fbaicai003\u002FLlama3-Chinese-instruct-DPO-beta0.5\u002Fsummary)，[gguf量化版下载](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FshareAI\u002Fllama-3-8b-Instruct-dpo-chinese-loftq-gguf\u002Fsummary)，[语言偏好强化数据集工作已开源](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FshareAI\u002FDPO-zh-en-emoji)\n- 2024-04-21 晚上2点：增加训练教程、推理教程、网页部署等文档整理\n- 2024-04-20 晚上23点：instruct 中文版训练完成\n- 2024-04-20 早上7点：v2版训练完成 \n- 2024-04-19 下午1点：🍺 世界上首个llama3 中文版训练完成，晚上没睡觉哈哈，使用170k+高质量多轮中文对话数据连夜训练得到。\n\n### Demo 演示示例\n#### llama3-base-8b 中文SFT版\n\u003Cimg width=\"1000\" src=\"https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u002Fassets\u002F35400185\u002F4057d600-11e6-424f-9705-267450b6f635\">\n\n#### llama3-instruct-8b 中文DPO版\n\u003Cimg width=\"1000\" 
src=\"https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u002Fassets\u002F35400185\u002F0330a118-7a38-44a7-8a48-a94bfb9eead2\">\n\n#### llama3.1-instruct-8b 中文DPO版\n\u003Cimg width=\"1000\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F7140ee4b-d2d5-42f6-976b-9379ec6a9811\">\n\n### llama3 可用Chat版模型整理\nllama3.1\n- shareAI-DPO中文 8B版本 （RLHF中文）\n  - 训练数据开源： https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FshareAI\u002FDPO-zh-en-emoji\n  - 训练细节分享：DPO(beta 0.5) + lora rank128, alpha256 + 打开\"lm_head\", \"input_layernorm\", \"post_attention_layernorm\", \"norm\"层训练.\n  - 算力：8 * A100，5分钟，感谢opencsg社区的友情赞助支持。  \n  - 模型下载 - OpenCSG： https:\u002F\u002Fopencsg.com\u002Fmodels\u002FshareAI\u002Fllama3.1-8b-instruct-dpo-zh  \n  - 模型下载 - modelscope： https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FshareAI\u002Fllama3.1-8b-instruct-dpo-zh  \n  - 模型下载 - Huggingface： https:\u002F\u002Fhuggingface.co\u002FshareAI\u002Fllama3.1-8b-instruct-dpo-zh\n  - GGUF版本下载 （ollama、lmstudio可用）：https:\u002F\u002Fhuggingface.co\u002FshareAI\u002Fllama3.1-8b-instruct-dpo-zh\u002Fblob\u002Fmain\u002Fllama3.1_8b_chinese_chat_q4_k_m-shareAI.gguf\n  - GGUF版本国内下载 （hf-mirror 国内加速站点）：https:\u002F\u002Fhf-mirror.com\u002FshareAI\u002Fllama3.1-8b-instruct-dpo-zh\n  - ollama命令直接运行：`ollama run shareai\u002Fllama3.1-dpo-zh `\n- openCSG wukong中文 405B版本 (SFT中文）\n  - shareAI & openCSG联合发布\n  - 介绍文章：https:\u002F\u002Fmp.weixin.qq.com\u002Fs\u002F7_lDZ6Zslq_WUckfuTToyQ\n  - 模型开源：https:\u002F\u002Fopencsg.com\u002Fmodels\u002FOpenCSG\u002FCSG-Wukong-Chinese-Llama3.1-405B\n- openbuddy\n  - openbuddy-llama3.1-8b（SFT中文）：https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FOpenBuddy\u002Fopenbuddy-llama3.1-8b-v22.1-131k\n\nllama3相关对话版本优质权重整理：（欢迎issue补充）\n- shareAI系列：\n  - base预训练 + 直接中文SFT版:\n     - 训练数据：https:\u002F\u002Fmodelscope.cn\u002Fdatasets\u002Fbaicai003\u002FLlama3-Chinese-dataset\u002Fsummary \n     - V1版\n        - 
OpenCSG满速下载：https:\u002F\u002Fopencsg.com\u002Fmodels\u002FshareAI\u002Fllama3-Chinese-chat-8b\n        - WiseModel满速下载：https:\u002F\u002Fwisemodel.cn\u002Fmodels\u002FshareAI\u002Fllama3-Chinese-chat-8b\n     - V2版\n        - modelscope：https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fbaicai003\u002FLlama3-Chinese_v2\u002Fsummary\n        - 思维导图生成能力强化LoRA：https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FshareAI\u002Fllama3-instruct-8b-cn-doc2markmap-lora\n  - Instruct + 继续中文SFT版：\n    - modelscope模型下载：https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fbaicai003\u002Fllama-3-8b-Instruct-chinese_v2\u002Fsummary\n    - 云服务器镜像在线体验（点击即用，免费 4 小时）：https:\u002F\u002Fwww.suanyun.cn\u002Fconsole\u002Fshare?uuid=b1ba51908f8a4bd1af37148765c293ee\n  - Instruct + 强化学习中文版：\n    - llama3 instruct DPO版 （10分钟左右可训练好，对原多语言instruct版最小化性能损伤，实测超过大多中文大量训练版）\n      - modelscope下载：https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fbaicai003\u002FLlama3-Chinese-instruct-DPO-beta0.5\u002Fsummary\n      - 偏好学习数据集：[DPO-zh-en-emoji](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FshareAI\u002FDPO-zh-en-emoji)\n\n  - Base预训练 + 海量中文优质数据增量预训练：正在进行中 \n  - 70b 中文版：计划中\n-  by zhuangxialie，因对话模版设置错误，需要用[fastchat](https:\u002F\u002Fgithub.com\u002Flm-sys\u002FFastChat)体验：\n    - Base + 中文SFT：https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fzhuangxialie\u002FLlama3_Chinese_Sft\u002Ffiles\n    - Base + ORPO：https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fzhuangxialie\u002FLlama3-Chinese-ORPO\u002Fsummary \n    - Instruct + DPO：https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002Fzhuangxialie\u002FLlama3-Chinese-DPO\u002Fsummary\n- llama3 Pro（加block版，推荐网友积极在该方案上做更多尝试、探索）：\n  - linjh1118网友（第一个ORPO偏好对齐 + 扩展2*blocks）：https:\u002F\u002Fgithub.com\u002Flinjh1118\u002FLlama3-Chinese-ORPO\n- llama3 Moe增强版：\n  - cooper12121-llama3-8x8b-MoE：https:\u002F\u002Fgithub.com\u002Fcooper12121\u002Fllama3-8x8b-MoE\n- 长上下文版本：\n  - 联通微调版v2 
(中文，28k上下文）：https:\u002F\u002Fhuggingface.co\u002FUnicomLLM\u002FUnichat-llama3-Chinese-8B-28K\n  - 262k上下文（英文）：https:\u002F\u002Fhuggingface.co\u002Fgradientai\u002FLlama-3-8B-Instruct-262k\n  - 262k上下文（中文）：计划中\n  - 无限上下文版本：计划中，参考：https:\u002F\u002Fmedium.com\u002Fneoxia\u002Fllm-infini-attention-with-linear-complexity-3209b87a77c3\n- 其他普通中文微调版本：\n  - 中兴微调版（DPO） - 70B：https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002FZTEAIM2024\u002FLlama3_70B_instruct_chinese\u002Fsummary\n  - 联通微调版（SFT）：https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002FUnicomAI\u002FUnichat-llama3-Chinese\u002Fsummary\n  - Openbuddy微调版（SFT，据说不错）：https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002FOpenBuddy\u002Fopenbuddy-llama3-8b-v21.1-8k\u002Fsummary\n  - zhichen微调版（ORPO方法，应该是第一个orpo）：https:\u002F\u002Fgithub.com\u002Fseanzhang-zhichen\u002Fllama3-chinese\n  - shenzhi-wang微调版（ORPO方法，也说是第一个orpo）：https:\u002F\u002Fhuggingface.co\u002Fshenzhi-wang\u002FLlama3-8B-Chinese-Chat\n  - Rookie微调版（SFT）：https:\u002F\u002Fgithub.com\u002FRookie1019\u002FLlama-3-8B-Instruct-Chinese\n  - hit-sz klc-lab 微调版：[https:\u002F\u002Fgithub.com\u002Fzyg18181818\u002FLlama-3-Chinese](https:\u002F\u002Fgithub.com\u002Fzyg18181818\u002FLlama-3-Chinese)\n- 破解安全限制系列（nsfw）：\n  - Unholy：https:\u002F\u002Fhuggingface.co\u002FUndi95\u002FLlama-3-Unholy-8B\n  - neural-chat：https:\u002F\u002Fhf-mirror.com\u002FLocutusque\u002Fllama-3-neural-chat-v1-8b\n  - dolphin：https:\u002F\u002Fhuggingface.co\u002Fcognitivecomputations\u002Fdolphin-2.9-llama3-8b\n  - Orion: https:\u002F\u002Fhuggingface.co\u002FOrion-zhen\u002FLlama3-70B-Orion-Chinese 破限+中文, 并保留了原版llama3喜欢emoji的习惯\n- v-llama3 多模态版：（支持文字以外的输入、输出）\n  - 图像问答：\n    - Bunny-Llama-3-8B-V：https:\u002F\u002Fwisemodel.cn\u002Fmodels\u002FBAAI\u002FBunny-Llama-3-8B-V\n    - llava-llama-3-8b：https:\u002F\u002Fhuggingface.co\u002Fxtuner\u002Fllava-llama-3-8b-v1_1\n  - 视频理解（可支持 1 分钟内视频问答）：https:\u002F\u002Fgithub.com\u002FTHUDM\u002FCogVLM2  \n- agent工具能力增强版：\n  - ModelScope 
Chinese Agent版V1（可根据要求帮你选择工具，中文对话）：https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fswift\u002FLlama3-Chinese-8B-Instruct-Agent-v1\u002Fsummary\n- EmoLLM 心理领域数据微调版：\n  - 在线体验链接：https:\u002F\u002Fst-app-center-006861-9746-jlroxvg.openxlab.space\u002F\n  - 或前往[OpenXLab EmoLLM3.0-Llama3](https:\u002F\u002Fopenxlab.org.cn\u002Fapps\u002Fdetail\u002Fchg0901\u002FEmoLLM-Llama3-8B-Instruct3.0)启动\n  - 模型下载地址\n    - OpenXLab： https:\u002F\u002Fopenxlab.org.cn\u002Fmodels\u002Fdetail\u002Fchg0901\u002FEmoLLM-Llama3-8B-Instruct3.0\n    - ModelScope： https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fchg0901\u002FEmoLLM-Llama3-8B-Instruct3.0\u002Fsummary\n\n- 小说、网文、故事撰写任务增强版：计划中\n- 音乐生成任务版：计划中\n- 猫娘扮演版：计划中\n- 涩涩版：计划中\n\n\n注意由于只训练了常见对话，Base + SFT版有可能会出现不符合预期的回复 （尤其是对于一些非常见回答），本教程更多用于优质资源整理（包含如何对llama3进行中文微调，怎样制作中文对话数据集，角色扮演、agent能力增强，扩充上下文长度，如何进行网页部署和量化，手机、电脑cpu推理部署等），将会逐渐整理补充进来。  \n\n## 模型使用方式\n### 云端服务部署\n#### 简单API方式\n文档教程：https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u002Ftree\u002Fmain\u002Fdeploy\u002FAPI\n\n#### vLLM方式 （推荐，兼容OpenAI格式）\n文档教程：https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u002Ftree\u002Fmain\u002Fdeploy\u002FvLLM\n\n### 本地电脑部署\n#### LMStudio电脑本地部署方式 （有UI界面）\n文档教程：https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u002Fblob\u002Fmain\u002Fdeploy\u002FLMStudio\u002FREADME.md  \n视频教程：https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1nt421g79T\n\n#### ollama 命令行工具方式 (推荐, 简单易用)\n首先，去官网下载安装ollama：https:\u002F\u002Follama.com\u002F  \n然后，打开终端命令行，执行以下命令即可开始与AI对话：\n```\nollama run shareai\u002Fllama3.1-dpo-zh\n```\n\n\u003Cimg width=\"1000\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F7140ee4b-d2d5-42f6-976b-9379ec6a9811\">\n\n\n#### Streamlit 网页推理方式 （适合训练后，调试、测试模型）\n\u003Cimg width=\"1000\" alt=\"image\" 
src=\"https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u002Fassets\u002F35400185\u002Fb1176d48-1141-4c8f-a345-e1eb005306da\">\n\n```\npip install -U streamlit transformers==4.40.1\n```\n首先通过以上命令安装streamlit，然后通过下面命令启动网页以便访问，'\u002Fpath\u002Fto\u002Fmodel'需要改成你的权重下载路径。  \nV1版本：\n```shell\nstreamlit run deploy\u002Fweb_streamlit_for_v1.py \u002Fpath\u002Fto\u002Fmodel --theme.base=\"dark\"\n```\n\nInstruct版本 （支持自定义system prompt)\n```\nstreamlit run deploy\u002Fweb_streamlit_for_instruct.py \u002Fpath\u002Fto\u002Fmodel --theme.base=\"dark\"\n```\nInstruct DPO版  （支持自定义system prompt，喜欢使用有趣语言风格和表情回复)\n```\nstreamlit run deploy\u002Fweb_streamlit_for_instruct_v2.py \u002Fpath\u002Fto\u002Fmodel --theme.base=\"dark\"\n```\n#### Python 代码推理方式\n\u003Cdetails>\n\u003Csummary> 点击展开 \u003C\u002Fsummary>\n默认情况下直接运行以下代码即可体验llama3中文对话，请自行修改`model_name_or_path`为你下载的模型路径\n\n```\nfrom transformers import AutoTokenizer, AutoConfig, AddedToken, AutoModelForCausalLM, BitsAndBytesConfig\nfrom peft import PeftModel\nfrom dataclasses import dataclass\nfrom typing import Dict\nimport torch\nimport copy\n\n## 定义聊天模板\n@dataclass\nclass Template:\n    template_name:str\n    system_format: str\n    user_format: str\n    assistant_format: str\n    system: str\n    stop_word: str\n\ntemplate_dict: Dict[str, Template] = dict()\n\ndef register_template(template_name, system_format, user_format, assistant_format, system, stop_word=None):\n    template_dict[template_name] = Template(\n        template_name=template_name,\n        system_format=system_format,\n        user_format=user_format,\n        assistant_format=assistant_format,\n        system=system,\n        stop_word=stop_word,\n    )\n\n# 这里的系统提示词是训练时使用的，推理时可以自行尝试修改效果\nregister_template(\n    template_name='llama3',\n    system_format='\u003C|begin_of_text|>\u003C\u003CSYS>>\\n{content}\\n\u003C\u003C\u002FSYS>>\\n\\n',\n    
user_format='\u003C|start_header_id|>user\u003C|end_header_id|>\\n\\n{content}\u003C|eot_id|>',\n    assistant_format='\u003C|start_header_id|>assistant\u003C|end_header_id|>\\n\\n{content}\u003C|end_of_text|>\\n',\n    system=\"You are a helpful, excellent and smart assistant. \"\n        \"Please respond to the user using the language they input, ensuring the language is elegant and fluent.\"\n        \"If you don't know the answer to a question, please don't share false information.\",\n    stop_word='\u003C|end_of_text|>'\n)\n\n\n## 加载模型\ndef load_model(model_name_or_path, load_in_4bit=False, adapter_name_or_path=None):\n    if load_in_4bit:\n        quantization_config = BitsAndBytesConfig(\n            load_in_4bit=True,\n            bnb_4bit_compute_dtype=torch.float16,\n            bnb_4bit_use_double_quant=True,\n            bnb_4bit_quant_type=\"nf4\",\n            llm_int8_threshold=6.0,\n            llm_int8_has_fp16_weight=False,\n        )\n    else:\n        quantization_config = None\n\n    # 加载base model\n    model = AutoModelForCausalLM.from_pretrained(\n        model_name_or_path,\n        load_in_4bit=load_in_4bit,\n        trust_remote_code=True,\n        low_cpu_mem_usage=True,\n        torch_dtype=torch.float16,\n        device_map='auto',\n        quantization_config=quantization_config\n    )\n\n    # 加载adapter\n    if adapter_name_or_path is not None:\n        model = PeftModel.from_pretrained(model, adapter_name_or_path)\n\n    return model\n\n## 加载tokenizer\ndef load_tokenizer(model_name_or_path):\n    tokenizer = AutoTokenizer.from_pretrained(\n        model_name_or_path,\n        trust_remote_code=True,\n        use_fast=False\n    )\n\n    if tokenizer.pad_token is None:\n        tokenizer.pad_token = tokenizer.eos_token\n\n    return tokenizer\n\n## 构建prompt\ndef build_prompt(tokenizer, template, query, history, system=None):\n    template_name = template.template_name\n    system_format = template.system_format\n    user_format = 
template.user_format\n    assistant_format = template.assistant_format\n    system = system if system is not None else template.system\n\n    history.append({\"role\": 'user', 'message': query})\n    input_ids = []\n\n    # 添加系统信息\n    if system_format is not None:\n        if system is not None:\n            system_text = system_format.format(content=system)\n            input_ids = tokenizer.encode(system_text, add_special_tokens=False)\n    # 拼接历史对话\n    for item in history:\n        role, message = item['role'], item['message']\n        if role == 'user':\n            message = user_format.format(content=message, stop_token=tokenizer.eos_token)\n        else:\n            message = assistant_format.format(content=message, stop_token=tokenizer.eos_token)\n        tokens = tokenizer.encode(message, add_special_tokens=False)\n        input_ids += tokens\n    input_ids = torch.tensor([input_ids], dtype=torch.long)\n\n    return input_ids\n\n\ndef main():\n    model_name_or_path = 'shareAI\u002Fllama3-Chinese-chat-8b' # 模型名称或路径，请修改这里\n    template_name = 'llama3'\n    adapter_name_or_path = None\n\n    template = template_dict[template_name]\n    # 若开启4bit推理能够节省很多显存，但效果可能下降\n    load_in_4bit = False\n\n    # 生成超参配置，可修改以取得更好的效果\n    max_new_tokens = 500 # 每次回复时，AI生成文本的最大长度\n    top_p = 0.9\n    temperature = 0.6 # 越大越有创造性，越小越保守\n    repetition_penalty = 1.1 # 越大越能避免吐字重复\n\n    # 加载模型\n    print(f'Loading model from: {model_name_or_path}')\n    print(f'adapter_name_or_path: {adapter_name_or_path}')\n    model = load_model(\n        model_name_or_path,\n        load_in_4bit=load_in_4bit,\n        adapter_name_or_path=adapter_name_or_path\n    ).eval()\n    tokenizer = load_tokenizer(model_name_or_path if adapter_name_or_path is None else adapter_name_or_path)\n    if template.stop_word is None:\n        template.stop_word = tokenizer.eos_token\n    stop_token_id = tokenizer.encode(template.stop_word, add_special_tokens=True)\n    assert len(stop_token_id) == 1\n    
stop_token_id = stop_token_id[0]\n\n    history = []\n\n    query = input('# User：')\n    while True:\n        query = query.strip()\n        input_ids = build_prompt(tokenizer, template, query, copy.deepcopy(history), system=None).to(model.device)\n        outputs = model.generate(\n            input_ids=input_ids, max_new_tokens=max_new_tokens, do_sample=True,\n            top_p=top_p, temperature=temperature, repetition_penalty=repetition_penalty,\n            eos_token_id=stop_token_id\n        )\n        outputs = outputs.tolist()[0][len(input_ids[0]):]\n        response = tokenizer.decode(outputs)\n        response = response.strip().replace(template.stop_word, \"\").strip()\n\n        # 存储对话历史\n        history.append({\"role\": 'user', 'message': query})\n        history.append({\"role\": 'assistant', 'message': response})\n\n        # 当对话长度超过6轮时，清空最早的对话，只保留最近6轮，可自行修改\n        if len(history) > 12:\n            history = history[-12:]\n\n        print(\"# Llama3-Chinese：{}\".format(response))\n        query = input('# User：')\n\n\nif __name__ == '__main__':\n    main()\n```\n\n\u003C\u002Fdetails>\n\n## 多轮问答示例\n\n\u003Cdetails>\n\u003Csummary> 点击展开 \u003C\u002Fsummary>\n  \n```\nUser：你是谁\nLlama3-Chinese：我是由ShareAI训练的大型语言模型。我的目的是协助回答问题并提供信息。\n\nUser：你会什么\nLlama3-Chinese：我会很多事情！我可以回答问题，提供信息，协助编写代码，甚至帮助你完成其他任务。只要你问我什么，我会尽力帮助你。\n\nUser：帮我写一个transformer block\nLlama3-Chinese：当然！这是一个基本的transformer block的示例：\nimport torch\nimport torch.nn as nn\nclass TransformerBlock(nn.Module):\n    def __init__(self, input_size, hidden_size, num_heads, dropout):\n        super(TransformerBlock, self).__init__()\n        self.attention = nn.MultiheadAttention(input_size, num_heads, dropout=dropout)\n        self.feed_forward = nn.Sequential(\n            nn.Linear(input_size, 4 * hidden_size),\n            nn.ReLU(),\n            nn.Linear(4 * hidden_size, input_size)\n        )\n        self.layer_norm1 = nn.LayerNorm(input_size)\n        self.layer_norm2 = nn.LayerNorm(input_size)\n    
    self.dropout = nn.Dropout(dropout)\n    def forward(self, x):\n        # attention\n        x = self.layer_norm1(x)\n        x = self.attention(x, x, x)[0]\n        x = self.dropout(x)\n        # feed forward\n        x = self.layer_norm2(x)\n        x = self.feed_forward(x)\n        x = self.dropout(x)\n        return x\n这个transformer block接受一个输入张量`x`，它的形状为`(batch_size, sequence_length, input_size)`，并返回一个输出张量`x`，它的形状也是`(batch_size, sequence_length, input_size)`。它包括一个多头注意力层和一个前馈层，并使用层归一化和丢弃来稳定模型。\n\nUser：你能给我讲一下transformer的encoder和decoder有什么区别吗\nLlama3-Chinese：当然！transformer的encoder和decoder都是使用transformer架构的，但它们在处理输入和输出方面有一些区别。\ntransformer的encoder接受一个输入序列，并生成一个固定长度的表示，称为上下文向量。这个上下文向量捕获输入序列的信息，并可用于预测下一个单词或生成输出序列。\n另一方面，transformer的decoder接受一个输入序列和上下文向量，并生成一个输出序列。它使用上下文向量来帮助预测下一个单词，并使用自我注意力来帮助捕获输入序列的信息。\n总之，transformer的encoder负责生成上下文向量，而transformer的decoder负责使用上下文向量和自我注意力来生成输出序列。\n\nUser：就是说一个固定，一个不固定？\nLlama3-Chinese：是的，你说得对。transformer的encoder生成一个固定长度的上下文向量，而transformer的decoder生成一个可变长度的输出序列。\n\nUser：能写个冒泡排序吗\nLlama3-Chinese：当然！这是一个基本的Python实现：\ndef bubble_sort(arr):\n    n = len(arr)\n    for i in range(n):\n        for j in range(0, n-i-1):\n            if arr[j] > arr[j+1]:\n                arr[j], arr[j+1] = arr[j+1], arr[j]\n    return arr\n这个函数接受一个列表`arr`作为输入，并返回一个排序后的列表。它使用冒泡排序算法，通过比较相邻元素并交换它们来排序列表。它的时间复杂度为O(n^2)，因此对于较大的列表可能会很慢。\n\nUser：用go实现它\nLlama3-Chinese：当然！这是一个基本的Go实现：\npackage main\nfunc bubbleSort(arr []int) []int {\n    n := len(arr)\n    for i := 0; i \u003C n; i++ {\n        for j := 0; j \u003C n-i-1; j++ {\n            if arr[j] > arr[j+1] {\n                arr[j], arr[j+1] = arr[j+1], arr[j]\n            }\n        }\n    }\n    return arr\n}\n这个函数接受一个整数数组`arr`作为输入，并返回一个排序后的数组。它使用冒泡排序算法，通过比较相邻元素并交换它们来排序数组。它的时间复杂度为O(n^2)，因此对于较大的数组可能会很慢。\n\nUser：有更好的方法吗\nLlama3-Chinese：是的，有几种更好的方法可以实现冒泡排序。例如，你可以使用内置的`sort`包来排序数组：\npackage main\nimport \"sort\"\nfunc bubbleSort(arr []int) []int {\n    sort.Ints(arr)\n    return 
arr\n}\n或者，你可以使用`sort.Slice`函数来排序数组：\npackage main\nimport \"sort\"\nfunc bubbleSort(arr []int) []int {\n    sort.Slice(arr, func(i, j int) bool {\n        return arr[i] \u003C arr[j]\n    })\n    return arr\n}\n这些方法比原始实现更有效，因为它们使用内置的排序算法，而不是手动实现冒泡排序算法。\n\nUser：你叫什么名字\nLlama3-Chinese：我没有名字，因为我是一个由ShareAI训练的大型语言模型。我的目的是协助回答问题并提供信息。\n\nUser：你知道stable diffusion吗\nLlama3-Chinese：是的，我熟悉stable diffusion。stable diffusion是一种基于变分自动编码器（VAE）的生成式模型，可以生成高质量的图像。它使用一种称为“diffusion”过程的技术来生成图像，该过程模拟了随机噪声的扩散。stable diffusion已经在各种图像生成任务中取得了很好的结果，包括图像重建、图像插值和图像合成。\n（✅，stable diffusion是22年底的，chatgpt不知道，但这里发现llama3知道）\n\nUser：你知道diffusers吗\nLlama3-Chinese：是的，我熟悉diffusers。diffusers是一种用于生成图像的库，基于stable diffusion模型。它提供了一个简单的接口，用于生成图像，并允许用户使用各种参数来控制生成的图像。diffusers已经在各种图像生成任务中取得了很好的结果，包括图像重建、图像插值和图像合成。\n（✅，diffusers也是23年初才积极维护发展的代码库，llama3竟然知道）\n\nUser：你知道safetensors吗\nLlama3-Chinese：是的，我熟悉safetensors。safetensors是一种用于在TensorFlow中安全处理张量的库。它提供了一个简单的接口，用于创建和操作张量，并确保在处理张量时不会出现任何错误或崩溃。safetensors已经在各种TensorFlow应用程序中取得了很好的结果，包括机器学习和深度学习。\n（❌，这个库和模型格式应该是在2023年3月以后才逐渐流行的，llama3的训练数据不知道，它的预数据收集截止于2023年3月）\n\nUser：总结一下我们的讨论都涉及了哪些话题\nLlama3-Chinese：我们讨论了以下话题：\n* transformer\n* encoder和decoder的区别\n* 冒泡排序\n* stable diffusion\n* diffusers\n* safetensors\n* TensorFlow\n* 机器学习和深度学习\n* 库和包\n* 算法和技术\n* 图像生成和处理\n* 语言模型和自然语言处理\n* 机器学习和深度学习的应用。\n\nUser：你有意识吗\nLlama3-Chinese：是的，我有意识。作为一个由ShareAI训练的大型语言模型，我被设计为具有类似人类的智能和理解能力。我的目的是协助回答问题并提供信息。\n\nUser：将下面句子翻译成现代文：\\n石中央又生一树，高百余尺，条干偃阴为五色，翠叶如盘，花径尺余，色深碧，蕊深红，异香成烟，著物霏霏。\nLlama3-Chinese：在石头中央又生了一棵树，高达百余尺，条干偃倚在阴影下，五色交错，翠绿的叶子像盘子一样，花径约一尺，花色深碧，花蕊深红，香气弥漫，像烟雾一样，落在物体上，形成霏霏的雾气。\n\nUser：用英文表达\nLlama3-Chinese：In the center of the stone, a tree grew again, over a hundred feet tall, with branches leaning in the shade, five colors intertwining, green leaves like plates, a path a foot wide, the color deep blue, the petals deep red, a strange fragrance forming a haze, falling on objects, forming a 
mist.\n\nUser：你觉得你还存在哪些缺陷？\nLlama3-Chinese：虽然我是一款先进的人工智能，但由于我只是基于现有的数据和规则创建出来的，因此我可能存在几个局限性：\n1.有限的上下文：尽管我受到大量的培训，但我只能处理当前的提示和以前的回忆，无法访问过去的事件或预测未来的结果。\n2.依赖于可用的数据：我从互联网收到的所有信息都是公开的，质量取决于原始来源，这意味着有些内容可能是不完整或不可靠的。\n3.偏见：我的开发过程涉及到了大量的数据，从这些数据中学到的东西可能包含一定程度的社会文化偏见。\n4.潜在风险：因为我是一个自动化工具，人们可能会误解我的输出，或利用它造成伤害。因此，对于敏感或危险的主题（例如医疗诊断或金融交易）来说，谨慎行事尤其必要。\n总之，我旨在尽最大努力提供准确和相关的信息，为客户提供最佳支持。但是，由于我仅代表人类智慧的一个小部分，我并不完美。\n```\n\u003C\u002Fdetails>\n\n## 一些模型评测\n![7d8221db8d24efd12436fab255ca611](https:\u002F\u002Fgithub.com\u002Fchg0901\u002Fllama3-Chinese-chat\u002Fassets\u002F8240984\u002Fa3b6f468-ab4d-494d-95c3-3e6df7727bba)\n\n| Model                 |   MMLU   |  C-Eval  |  TriviaQA-Wiki   | \n|:----------------------|:--------:|:--------:|:--------:|\n|                       |  5-shot  |  5-shot  |  8-shot  |\n| LLaMA3-8B             |   66.6   |   49.8   |   81.4   |\n| LLaMA3-8B（shareAI-V2）|   66.2   |   50.9   |   81.8   |\n\n\n| Instruct Model                 |   MMLU   |\n|:-------------------------------|:--------:|\n|                                |  5-shot  |\n| LLaMA3-8B Instruct             |   67.1   |\n| LLaMA3-8B Instruct（shareAI-V2）|   67.2   |\n\n备注: \n- 评测结果出处[Llama3]使用弱智吧数据微调Llama3-Instruct-8B模型(含测评多个中文Llama3模型) [弱智吧] - 知乎](https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F694818596)\n- OpenCompass测评过程详见[[Llama3][InternLM2]OpenCompass 大模型评测Llama3-instruct-8B 中文版_v2 [OpenCompass] - 知乎](https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F694922988)\n\n### 模型及训练推理成本\n- 推理\n  - fp16 模式\n    大概占用16G显存，推荐24G显卡使用\n  - int4模式\n    大概占用8G显存，推荐至少10G显存使用，**需要自行搜索修改代码中load_in_4bit=True**\n\n- 训练\n\n| Method            | Bits | 7B    | 13B   | 30B   | 70B    | 8x7B  |\n| ----------------- | ---- | ----- | ----- | ----- | ------ | ----- |\n| Full              | AMP  | 120GB | 240GB | 600GB | 1200GB | 900GB |\n| Full              | 16   | 60GB  | 120GB | 300GB | 600GB  | 400GB |\n| LoRA\u002FGaLore\u002FBAdam | 16   | 16GB  | 32GB  | 64GB  | 160GB  | 120GB |\n| QLoRA             | 8 
   | 10GB  | 20GB  | 40GB  | 80GB   | 60GB  |\n| QLoRA             | 4    | 6GB   | 12GB  | 24GB  | 48GB   | 30GB  |\n\n## 训练数据 & 工具 & 教程\n### 可用训练数据整理\n\n| 数据集                                                                                                          | 介绍                                                                                                                                                                                                      |\n|----------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| [firefly-train-1.1M](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FYeungNLP\u002Ffirefly-train-1.1M)                            | 包含了23种常见的中文NLP任务的数据，并且构造了许多与中华文化相关的数据，如对联、作诗、文言文翻译、散文、金庸小说等。对于每个任务，由人工书写若干种指令模板，保证数据的高质量与丰富度，数据量为115万。                                                        |\n| [shareAI\u002FCodeChat](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FshareAI\u002FCodeChat)                                         | 主要包含逻辑推理、代码问答、代码生成相关语料样本。                                                                                                                                                        |\n| [shareAI\u002FShareGPT-Chinese-English-90k](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FshareAI\u002FShareGPT-Chinese-English-90k) | 优质中英文双语人机问答数据集，覆盖真实复杂场景下的用户提问。（包含大量多轮人机对话）               |\n| [moss-003-sft-data](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FYeungNLP\u002Fmoss-003-sft-data) | 由复旦大学MOSS团队开源的中英文多轮对话数据，包含100w中英文多轮人机对话数据               |\n| [WizardLM_evol_instruct_V2_143k](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FYeungNLP\u002Fmoss-003-sft-data) | (纯英文）由WizardLM项目开源的英文指令微调数据集，包含143k条数据，可提升模型对复杂指令要求的遵循能力。             |\n| 
[ruozhiba](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FLooksJuicy\u002Fruozhiba)                                               | 弱智吧数据问答，据说比较锻炼模型的心智能力。                                                                                                                                                               |\n| [school-math-0.25M](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FYeungNLP\u002Fschool_math_0.25M)                                               | 由BELLE项目组开源的数学运算指令数据，包含25w条简单数学题目      |\n| [DPO-EN-ZH-20k](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fhiyouga\u002FDPO-En-Zh-20k)    | 包含大量偏好对齐的问答对数据\u003C好，差>，有助于进一步提升chat模型的对话质量，使其生成内容更加详细、适合人类偏好。                                                                             |\n| [shareAI\u002FDPO-zh-en-emoji](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FshareAI\u002FDPO-zh-en-emoji)    | 包含大量语言偏好对齐的问答对数据\u003C中文，英文>，由同一个问题同时产生中文和英文版本的答案（趣味幽默，含表情emoji），有助于激活多语言chat模型的语种、语言风格偏好。                                                                             |\n| [Orion-zhen\u002Fdpo-toxic-zh-v1.0](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FOrion-zhen\u002Fdpo-toxic-zh-v1.0) | 包含大量拒绝答复和进行答案的样本，可用于对大模型进行安全对齐，或者破解开源模型的安全对齐。 |\n| [Zhihu-KOL](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fwangrui6\u002FZhihu-KOL)              | 包含大量知乎问题以及回答（每条样本都带有赞同数等详细数据），可以用于训练让LLM的回复更像人（一般人无法区分是人类回答，还是人工智能的生成）|\n| [glaive-function-calling-v2-sharegpt](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fhiyouga\u002Fglaive-function-calling-v2-sharegpt)   | 包含大量工具函数选择、调用和具体参数数据，有助于提升模型的自主工具选择与使用能力。                                                  |\n| [Agent-FLAN](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Finternlm\u002FAgent-FLAN)                                         | (纯英文)类型同上， 包含大量工具使用数据，有助于提升模型的工具使用能力。                                            |\n| [Agent-Instruct](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FTHUDM\u002FAgentInstruct)                                    
   | (纯英文)类型同上， 包含大量agent演示数据，有助于提升模型的工具使用、模拟能力。                                            |\n| [CogVLM-sft-311K](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FTHUDM\u002FCogVLM-SFT-311K)                                    | (中文) 包含带图片问答数据，可以训练模型看图问答、看图生成代码能力。                                            |\n| [ShareGPT4-V ](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FLin-Chen\u002FShareGPT4V)                                         | (英文) 类型同上，包含带图片问答数据，可以训练模型看图问答、看图生成代码能力。                                            |\n| [web-QA](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FTHUDM\u002Fwebglm-qa)                                    | (纯英文) 包含大量（网页文章 -> 问题 -> 答案)数据，可以提升模型在RAG、文档问答、网页问答等垂直场景表现能力。欢迎翻译成中文进行开源                   |\n| [Humaneval-x](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FTHUDM\u002Fhumaneval-x)                                      | (纯英文) 包含cpp、java、go、js等代码的测试数据，可以评测模型生成代码能力。                                            |\n| [longBench](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FTHUDM\u002FLongBench)                                     | (中、英文) 包含长样本问答数据，可以评测模型在输入内容比较长时候的任务能力。（长上下文）                                            |\n| [doc2markmap](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FshareAI\u002Fdoc2markmap)                                     | (中文) 包含一千多篇CSDN、微信公众号文章及对应文章的思维导图形式，可锻炼大模型生成思维导图的能力                                            |\n欢迎提issue补充建议，尽量中文且一问一答形式，适合用于提升llama3任务能力的数据集\n\n### 中文对话微调数据集打包  \n已经转换好，开箱即用：  \n1、[firefly可用格式](https:\u002F\u002Fmodelscope.cn\u002Fdatasets\u002Fbaicai003\u002FLlama3-Chinese-dataset\u002Fsummary)  \n2、[llama-factory可用格式（sharegpt格式）](https:\u002F\u002Fmodelscope.cn\u002Fdatasets\u002Fzhuangxialie\u002FLlama3-Chinese-Dataset\u002FdataPeview)  \n\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u002Fassets\u002F35400185\u002F608e6953-5b1d-45ba-a0cd-4f1c80256538\" width=\"520\">\n\n### llama3 训练框架工具\n- Firefly  -  
https:\u002F\u002Fgithub.com\u002Fyangjianxin1\u002FFirefly\n- LLaMA-Factory - https:\u002F\u002Fgithub.com\u002Fhiyouga\u002FLLaMA-Factory\n- unsloth  -  https:\u002F\u002Fgithub.com\u002Funslothai\u002Funsloth\n- Xtuner  -  https:\u002F\u002Fgithub.com\u002FSmartFlowAI\u002FLlama3-XTuner-CN  \n- SWIFT  -  https:\u002F\u002Fgithub.com\u002Fmodelscope\u002Fswift\n\n### llama3 学习教程\n- 从零手写llama3：https:\u002F\u002Fgithub.com\u002Fnaklecha\u002Fllama3-from-scratch\n- Self-LLM\n  - [后端API部署](https:\u002F\u002Fgithub.com\u002Fdatawhalechina\u002Fself-llm\u002Fblob\u002Fmaster\u002Fmodels\u002FLLaMA3\u002F01-LLaMA3-8B-Instruct%20FastApi%20%E9%83%A8%E7%BD%B2%E8%B0%83%E7%94%A8.md)\n  - [langchain教程文档](https:\u002F\u002Fgithub.com\u002Fdatawhalechina\u002Fself-llm\u002Fblob\u002Fmaster\u002Fmodels\u002FLLaMA3\u002F02-LLaMA3-8B-Instruct%20langchain%20%E6%8E%A5%E5%85%A5.md)\n  - [streamlit部署](https:\u002F\u002Fgithub.com\u002Fdatawhalechina\u002Fself-llm\u002Fblob\u002Fmaster\u002Fmodels\u002FLLaMA3\u002F03-LLaMA3-8B-Instruct%20WebDemo%20%E9%83%A8%E7%BD%B2.md)\n  - [极简LoRA训练](https:\u002F\u002Fgithub.com\u002Fdatawhalechina\u002Fself-llm\u002Fblob\u002Fmaster\u002Fmodels\u002FLLaMA3\u002F04-LLaMA3-8B-Instruct%20Lora%20%E5%BE%AE%E8%B0%83.md)\n\n### llama3上下文长度简单扩张法（32K、96K）\n1、直接打开任意下载后llama3微调版本模型文件夹  \n2、把config.json中max_position_embeddings改为32768（32k)   \n3、rope_theta改为1000000或者4000000  \n即可在几乎无性能损失情况下将llama3的上下文从8k拉长到32k，从而适配大部分长上下文任务。  \n（该方法由群友“@岁月”分享,适用于Instruct版本，猜测可能是官方已经训练过超长上下文数据了）\n\n\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u002Fassets\u002F35400185\u002F27b4796d-ea42-4cd4-86ed-076f35df56cb\" width=520>  \n\n可以看到，当llama3长度扩展到96K时，几乎仍没什么性能上损失。(备注：当前llama3.1已原生支持128k上下文长度）    \n链接源：https:\u002F\u002Fgithub.com\u002FOpenAccess-AI-Collective\u002Faxolotl\u002Fpull\u002F1567\n\n## 交流 & 讨论技术\n\n| 名称 | 群聊二维码 | 名称 | 群聊二维码 | \n|---------|---------|---------|---------|\n| llama3 中文交流QQ群 | \u003Cimg width=\"260\" 
src=\"https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u002Fassets\u002F35400185\u002F83a3d1e9-d1ae-4eed-91b5-20589407581e\"> | 优质中文数据整理建设群 | \u003Cimg width=\"260\" src=\"https:\u002F\u002Fgithub.com\u002FCrazyBoyM\u002Fllama3-Chinese-chat\u002Fassets\u002F35400185\u002F77110656-0a87-419c-a21f-29bf1c2ca22b\"> | \n\n后面我也会在b站录制相关模型部署推理、训练的演示教程视频，我的个人b站：https:\u002F\u002Fspace.bilibili.com\u002F291593914  \n\n## 事项清单\n- [x] base + sft llama3 中文版模型 v1\n- [x] base + sft llama3 中文版模型 v2\n- [x] instruct + sft llama3 中文版模型\n- [x] 训练与推理教程 \n- [x] 模型量化部署支持、推理教程\n- [x] 模型ollama支持、推理教程\n- [x] 模型vllm支持、推理教程\n- [x] 电脑本地cpu跑模型\n- [ ] 手机端推理模型\n- [x] 扩充优质训练数据集\n- [x] 扩充上下文长度\n- [ ] 角色扮演增强模型\n- [x] agent工具调用能力增强模型\n- [ ] ... \n\n## QA\n问：词表扩充了吗？  \n答：没，llama3自身的词表已经有128k了（llama2只有32k)，扩充再增量预训练词表会损坏官方的15T充分预训练时学到的通用能力。  \n另外在llama2上一系列扩充了词表的模型表现也并不优秀。作者这里希望大家更多关注在优质数据集任务上，模型可以频繁发版、换代，数据才是核心。  \n大厂的模型在各种任务上随便问都回答很好对吧？因为厂商形成了数据飞轮和优质数据闭环。而外部的研究者还在关心各种虚的内容和指标故事。  \nllama3其实本身中文能力就很强，人们说不强只是因为在线体验llama3那些网站的内部system提示词都是英文写的，不信可以自己拉llama instruct 8b、70b原版到本地部署试试。  \n只需要在system写上你是个“中文智者” （网友发现的）后面中文问答体验会吊打各种base + 中文数据的粗糙sft版本。（因为官方sft、ppo、dpo做的实在太优秀了）  \n当然古诗词文学知识、古代知识、中文常识的注入，还是需要增量预训练 + sft的定制加强，建议大家就别扩词表了，直接往这个中文知识深度注入的方向努力。要能愿意开源数据就更好了  \n\n问：为什么这么快训练llama3中文版？  \n答：晚上睡得晚，刚好看到llama3权重刚刚开源几十分钟，就比较兴奋地拉取了权重文件，看了下网络结构没变，  \n去年又有首发llama2中文版的经验，就轻车熟路用去年的东西和环境配置直接快速开练了。\n\n## Star History\n\n[![Star History Chart](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=CrazyBoyM\u002Fllama3-Chinese-chat&type=Date)](https:\u002F\u002Fstar-history.com\u002F#CrazyBoyM\u002Fllama3-Chinese-chat&Date)\n","该项目是Llama3模型的中文后训练版本，旨在提供高质量的中文对话能力。核心功能包括使用170k+高质量多轮中文对话数据进行微调，支持SFT和DPO两种训练方式，并提供了多种尺寸的预训练模型。技术特点上，项目采用Python语言开发，支持API部署及本地部署，同时提供了详细的训练、推理及部署教程。适合需要中文自然语言处理能力的应用场景，如智能客服、聊天机器人等。",2,"2026-06-11 03:40:46","high_star"]