[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72228":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":27,"readmeContent":28,"aiSummary":29,"trendingCount":16,"starSnapshotCount":16,"syncStatus":30,"lastSyncTime":31,"discoverSource":32},72228,"smollm","huggingface\u002Fsmollm","huggingface","Everything about the SmolLM and SmolVLM family of models ","https:\u002F\u002Fhuggingface.co\u002FHuggingFaceTB",null,"Python",3809,299,28,52,0,6,13,36,18,81.53,"Apache License 2.0",false,"main",true,[],"2026-06-12 04:01:04","# Smol Models 🤏\n\nWelcome to Smol Models, a family of efficient and lightweight AI models from Hugging Face. Our mission is to create fully open powerful yet compact models, for text and vision, that can run effectively on-device while maintaining strong performance.\n\n## [NEW] SmolLM3 (Language Model)\n![image](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F2bf61ea2-8d2e-426b-ba40-0242d34325d2)\n\nOur 3B model outperforms Llama 3.2 3B and Qwen2.5 3B while staying competitive with larger 4B alternatives (Qwen3 & Gemma3). 
Beyond the performance numbers, we're sharing exactly how we built it using public datasets and training frameworks.\n\nResources:\n- [SmolLM3-Base](https:\u002F\u002Fhf.co\u002FHuggingFaceTB\u002FSmolLM3-3B-Base)\n- [SmolLM3](https:\u002F\u002Fhf.co\u002FHuggingFaceTB\u002FSmolLM3-3B)\n- [blog](https:\u002F\u002Fhf.co\u002Fblog\u002Fsmollm3)\n\nSummary:\n- **3B model** trained on 11T tokens, SoTA at the 3B scale and competitive with 4B models\n- **Fully open model**, open weights + full training details including public data mixture and training configs\n- **Instruct model** with **dual-mode reasoning**, supporting think\u002Fno_think modes\n- **Multilingual support** for 6 languages: English, French, Spanish, German, Italian, and Portuguese\n- **Long context** up to 128k using NoPE and YaRN\n\n![image](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Ff1b76d3b-af2b-4218-91b3-4ce815bdf0a8)\n\n## 👁️ SmolVLM (Vision Language Model)\n[SmolVLM](https:\u002F\u002Fhuggingface.co\u002FHuggingFaceTB\u002FSmolVLM-Instruct) is our compact multimodal model that can:\n- Process both images and text and perform tasks like visual QA, image description, and visual storytelling\n- Handle multiple images in a single conversation\n- Run efficiently on-device\n\n## Repository Structure\n```\nsmollm\u002F\n├── text\u002F               # SmolLM3\u002F2\u002F1 related code and resources\n├── vision\u002F            # SmolVLM related code and resources\n└── tools\u002F             # Shared utilities and inference tools\n    ├── smol_tools\u002F    # Lightweight AI-powered tools\n    ├── smollm_local_inference\u002F\n    └── smolvlm_local_inference\u002F\n```\n\n## Getting Started\n\n### SmolLM3\n```python\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\n\nmodel_name = \"HuggingFaceTB\u002FSmolLM3-3B\"\ndevice = \"cuda\"  # for GPU usage or \"cpu\" for CPU usage\n\n# load the tokenizer and the model\ntokenizer = 
AutoTokenizer.from_pretrained(model_name)\nmodel = AutoModelForCausalLM.from_pretrained(\n    model_name,\n).to(device)\n\n# prepare the model input\nprompt = \"Give me a brief explanation of gravity in simple terms.\"\nmessages_think = [\n    {\"role\": \"user\", \"content\": prompt}\n]\n\ntext = tokenizer.apply_chat_template(\n    messages_think,\n    tokenize=False,\n    add_generation_prompt=True,\n)\nmodel_inputs = tokenizer([text], return_tensors=\"pt\").to(model.device)\n\n# Generate the output\ngenerated_ids = model.generate(**model_inputs, max_new_tokens=32768)\n\n# Get and decode the output\noutput_ids = generated_ids[0][len(model_inputs.input_ids[0]) :]\nprint(tokenizer.decode(output_ids, skip_special_tokens=True))\n```\n\n### SmolVLM\n```python\nfrom PIL import Image\nfrom transformers import AutoProcessor, AutoModelForVision2Seq\n\nprocessor = AutoProcessor.from_pretrained(\"HuggingFaceTB\u002FSmolVLM-Instruct\")\nmodel = AutoModelForVision2Seq.from_pretrained(\"HuggingFaceTB\u002FSmolVLM-Instruct\")\n\n# hypothetical local path; replace with your own image\nimage = Image.open(\"image.jpg\")\n\nmessages = [\n    {\n        \"role\": \"user\",\n        \"content\": [\n            {\"type\": \"image\"},\n            {\"type\": \"text\", \"text\": \"What's in this image?\"}\n        ]\n    }\n]\n\n# build the prompt, run generation, and decode the answer\nprompt = processor.apply_chat_template(messages, add_generation_prompt=True)\ninputs = processor(text=prompt, images=[image], return_tensors=\"pt\")\ngenerated_ids = model.generate(**inputs, max_new_tokens=500)\nprint(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])\n```\n\n## Ecosystem\n\u003Cdiv align=\"center\">\n\u003Cimg src=\"https:\u002F\u002Fcdn-uploads.huggingface.co\u002Fproduction\u002Fuploads\u002F61c141342aac764ce1654e43\u002FRvHjdlRT5gGQt5mJuhXH9.png\" width=\"700\"\u002F>\n\u003C\u002Fdiv>\n\n## Resources\n\n### Documentation\n- [SmolLM3 Documentation](text\u002FREADME.md)\n- [SmolLM2 paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.02737v1)\n- [SmolVLM Documentation](vision\u002FREADME.md)\n- [Local Inference Guide](tools\u002FREADME.md)\n\n### Pretrained Models\n- [SmolLM3 Models Collection](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FHuggingFaceTB\u002Fsmollm3-686d33c1fdffe8e635317e23)\n- [SmolLM2 Models 
Collection](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FHuggingFaceTB\u002Fsmollm2-6723884218bcda64b34d7db9)\n- [SmolVLM Model](https:\u002F\u002Fhuggingface.co\u002FHuggingFaceTB\u002FSmolVLM-Instruct)\n\n### Datasets\n- [SmolLM3 Pretraining dataset](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FHuggingFaceTB\u002Fsmollm3-pretraining-datasets-685a7353fdc01aecde51b1d9)\n- [SmolTalk](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FHuggingFaceTB\u002Fsmoltalk) - Our instruction-tuning dataset\n- [FineMath](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FHuggingFaceTB\u002Ffinemath) - Mathematics pretraining dataset\n- [FineWeb-Edu](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FHuggingFaceFW\u002Ffineweb-edu) - Educational content pretraining dataset\n","Smol Models is a family of efficient, lightweight AI models developed by Hugging Face, designed to deliver strong text and vision capabilities while running efficiently on-device. The project has two core components: the SmolLM3 language model and the SmolVLM vision-language model. SmolLM3 outperforms Llama 3.2 and Qwen2.5 at the 3B-parameter scale and is competitive with larger 4B models, while SmolVLM handles tasks that combine images and text, such as visual question answering and image description. The models support up to six languages, offer long-context understanding and dual-mode reasoning (think\u002Fno_think), and all training details are public. They suit scenarios that need high performance under resource constraints, such as natural language processing and multimodal applications on mobile devices or in edge computing environments.",2,"2026-06-11 03:40:57","high_star"]