[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-71099":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":16,"stars30d":16,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":17,"rankGlobal":10,"rankLanguage":10,"license":18,"archived":19,"fork":20,"defaultBranch":21,"hasWiki":19,"hasPages":20,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":16,"starSnapshotCount":16,"syncStatus":26,"lastSyncTime":27,"discoverSource":28},71099,"metaseq","facebookresearch\u002Fmetaseq","facebookresearch","Repo for external large-scale work","",null,"Python",6546,718,8,105,0,39.57,"MIT License",true,false,"main",[],"2026-06-12 02:02:47","\n\n# Metaseq\nA codebase for working with [Open Pre-trained Transformers](projects\u002FOPT), originally forked from [fairseq](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffairseq).\n\n\n## Community Integrations\n\n### Using OPT with 🤗 Transformers\n\nThe OPT 125M--66B models are now available in [Hugging Face Transformers](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Ftransformers\u002Freleases\u002Ftag\u002Fv4.19.0). 
You can access them under the `facebook` organization on the [Hugging Face Hub](https:\u002F\u002Fhuggingface.co\u002Ffacebook).\n\n### Using OPT-175B with Alpa\n\nThe OPT 125M--175B models are now supported in the [Alpa project](https:\u002F\u002Falpa-projects.github.io\u002Ftutorials\u002Fopt_serving.html), which\nenables serving OPT-175B with more flexible parallelism on older generations of GPUs, such as 40GB A100, V100, T4, and M60.\n\n### Using OPT with Colossal-AI\n\nThe OPT models are now supported in [Colossal-AI](https:\u002F\u002Fgithub.com\u002Fhpcaitech\u002FColossalAI#OPT), which helps users deploy OPT training and inference quickly and efficiently, reducing large AI model budgets and the labor cost of learning and deployment.\n\n### Using OPT with CTranslate2\n\nThe OPT 125M--66B models can be executed with [CTranslate2](https:\u002F\u002Fgithub.com\u002FOpenNMT\u002FCTranslate2\u002F), which is a fast inference engine for Transformer models. The project integrates the [SmoothQuant](https:\u002F\u002Fgithub.com\u002Fmit-han-lab\u002Fsmoothquant) technique to allow 8-bit quantization of OPT models. See the [usage example](https:\u002F\u002Fopennmt.net\u002FCTranslate2\u002Fguides\u002Ftransformers.html#opt) to get started.\n\n### Using OPT with FasterTransformer\n\nThe OPT models can be served with [FasterTransformer](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FFasterTransformer), a highly optimized inference framework written and maintained by NVIDIA. We provide instructions to convert OPT checkpoints into FasterTransformer format and [a usage example](docs\u002Ffaster-transformer.md) with some benchmark results.\n\n### Using OPT with DeepSpeed\n\nThe OPT models can be finetuned using [DeepSpeed](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDeepSpeed). 
See the [DeepSpeed-Chat example](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FDeepSpeedExamples\u002Ftree\u002Fmaster\u002Fapplications\u002FDeepSpeed-Chat) to get started.\n\n## Getting Started in Metaseq\nFollow [setup instructions here](docs\u002Fsetup.md) to get started.\n\n### Documentation on workflows\n* [Training](docs\u002Ftraining.md)\n* [API](docs\u002Fapi.md)\n\n### Background Info\n* [Background & relationship to fairseq](docs\u002Fhistory.md)\n* [Chronicles of training OPT-175B](projects\u002FOPT\u002Fchronicles\u002FREADME.md)\n\n## Support\nIf you have any questions, bug reports, or feature requests regarding either the codebase or the models released in the projects section, please don't hesitate to post on our [Github Issues page](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fmetaseq\u002Fissues).\n\nPlease remember to follow our [Code of Conduct](CODE_OF_CONDUCT.md).\n\n## Contributing\nWe welcome PRs from the community!\n\nYou can find information about contributing to metaseq in our [Contributing](docs\u002FCONTRIBUTING.md) document.\n\n## The Team\nMetaseq is currently maintained by the CODEOWNERS: [Susan Zhang](https:\u002F\u002Fgithub.com\u002Fsuchenzang), [Naman Goyal](https:\u002F\u002Fgithub.com\u002Fngoyal2707), [Punit Singh Koura](https:\u002F\u002Fgithub.com\u002Fpunitkoura), [Moya Chen](https:\u002F\u002Fgithub.com\u002Fmoyapchen), [Kurt Shuster](https:\u002F\u002Fgithub.com\u002Fklshuster), [David Esiobu](https:\u002F\u002Fgithub.com\u002Fdavides), [Igor Molybog](https:\u002F\u002Fgithub.com\u002FigormolybogFB), [Peter Albert](https:\u002F\u002Fgithub.com\u002FXirider), [Andrew Poulton](https:\u002F\u002Fgithub.com\u002FandrewPoulton), [Nikolay Bashlykov](https:\u002F\u002Fgithub.com\u002Fbashnick), [Binh Tang](https:\u002F\u002Fgithub.com\u002Ftangbinh), [Uriel Singer](https:\u002F\u002Fgithub.com\u002Furielsinger), [Yuchen Zhang](https:\u002F\u002Fgithub.com\u002Fzycalice), [Armen 
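Aghajanyan](https:\u002F\u002Fgithub.com\u002FArmenAg), [Lili Yu](https:\u002F\u002Fgithub.com\u002Flilisierrayu), and [Adam Polyak](https:\u002F\u002Fgithub.com\u002Fadampolyak).\n\n## License\n\nThe majority of metaseq is licensed under the MIT license; however, portions of the project are available under separate license terms:\n* Megatron-LM is licensed under the [Megatron-LM license](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FMegatron-LM\u002Fblob\u002Fmain\u002FLICENSE)\n\n","Metaseq is a codebase for working with large-scale pre-trained Transformer models such as OPT. It supports models ranging from 125M to 175B parameters and, through integrations with tools and frameworks such as Hugging Face Transformers, Alpa, Colossal-AI, CTranslate2, FasterTransformer, and DeepSpeed, provides efficient training, inference, and deployment. Technical features include support for a range of hardware platforms (such as different generations of GPUs) and optimizations such as 8-bit quantization to reduce resource consumption. It is suited to research or application development with large language models, especially where budgets are limited and high compute efficiency is a priority.",2,"2026-06-11 03:35:54","high_star"]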