[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72076":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":21,"hasPages":21,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":16,"starSnapshotCount":16,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},72076,"gemma_pytorch","google\u002Fgemma_pytorch","google","The official PyTorch implementation of Google's Gemma models","https:\u002F\u002Fai.google.dev\u002Fgemma",null,"Python",5688,599,41,21,0,9,17,39.33,"Apache License 2.0",false,"main",[24,7,25],"gemma","pytorch","2026-06-12 02:02:58","# Gemma in PyTorch\n\n**Gemma** is a family of lightweight, state-of-the-art open models built from research and technology used to create Google Gemini models. They include both text-only and multimodal decoder-only large language models, with open weights, pre-trained variants, and instruction-tuned variants. For more details, please check out the following links:\n\n * [Gemma on Google AI](https:\u002F\u002Fai.google.dev\u002Fgemma)\n * [Gemma on Kaggle](https:\u002F\u002Fwww.kaggle.com\u002Fmodels\u002Fgoogle\u002Fgemma-3)\n * [Gemma on Vertex AI Model Garden](https:\u002F\u002Fpantheon.corp.google.com\u002Fvertex-ai\u002Fpublishers\u002Fgoogle\u002Fmodel-garden\u002Fgemma3)\n\nThis is the official PyTorch implementation of Gemma models. We provide model and inference implementations using both PyTorch and PyTorch\u002FXLA, and support running inference on CPU, GPU and TPU.\n\n## Updates\n\n * [March 12th, 2025 🔥] Support Gemma v3. 
You can find the checkpoints [on Kaggle](https:\u002F\u002Fwww.kaggle.com\u002Fmodels\u002Fgoogle\u002Fgemma-3\u002Fpytorch) and [Hugging Face](https:\u002F\u002Fhuggingface.co\u002Fmodels?other=gemma_torch)\n\n * [June 26th, 2024] Support Gemma v2. You can find the checkpoints [on Kaggle](https:\u002F\u002Fwww.kaggle.com\u002Fmodels\u002Fgoogle\u002Fgemma-2\u002Fpytorch) and Hugging Face\n\n * [April 9th, 2024] Support CodeGemma. You can find the checkpoints [on Kaggle](https:\u002F\u002Fwww.kaggle.com\u002Fmodels\u002Fgoogle\u002Fcodegemma\u002Fpytorch) and [Hugging Face](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fgoogle\u002Fcodegemma-release-66152ac7b683e2667abdee11)\n\n * [April 5th, 2024] Support Gemma v1.1. You can find the v1.1 checkpoints [on Kaggle](https:\u002F\u002Fwww.kaggle.com\u002Fmodels\u002Fgoogle\u002Fgemma\u002Fframeworks\u002FpyTorch) and [Hugging Face](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fgoogle\u002Fgemma-release-65d5efbccdbb8c4202ec078b).\n\n## Download Gemma model checkpoint\n\nYou can find the model checkpoints on Kaggle:\n\n- [Gemma 3](https:\u002F\u002Fwww.kaggle.com\u002Fmodels\u002Fgoogle\u002Fgemma-3\u002FpyTorch)\n- [Gemma 2](https:\u002F\u002Fwww.kaggle.com\u002Fmodels\u002Fgoogle\u002Fgemma-2\u002FpyTorch)\n- [Gemma](https:\u002F\u002Fwww.kaggle.com\u002Fmodels\u002Fgoogle\u002Fgemma\u002FpyTorch)\n\nAlternatively, you can find the model checkpoints on the Hugging Face Hub [here](https:\u002F\u002Fhuggingface.co\u002Fmodels?other=gemma_torch). To download the models, go to the repository of the model of interest, click the `Files and versions` tab, and download the model and tokenizer files. 
To download programmatically, if you have `huggingface_hub` installed, you can also run:\n\n```\nhuggingface-cli download google\u002Fgemma-3-4b-it-pytorch\n```\n\nThe following model sizes are available:\n\n- **Gemma 3**: \n  - **Text only**: 1b\n  - **Multimodal**: 4b, 12b, 27b_v3\n- **Gemma 2**: \n  - **Text only**: 2b-v2, 9b, 27b\n- **Gemma**: \n  - **Text only**: 2b, 7b\n\n\nFor Gemma 3, you can choose between the 1B, 4B, 12B, and 27B variants.\n\n```\nVARIANT=\u003C1b, 2b, 2b-v2, 4b, 7b, 9b, 12b, 27b, 27b_v3>\nCKPT_PATH=\u003CInsert ckpt path here>\n```\n\n## Try it free on Colab\n\nFollow the steps at\n[https:\u002F\u002Fai.google.dev\u002Fgemma\u002Fdocs\u002Fpytorch_gemma](https:\u002F\u002Fai.google.dev\u002Fgemma\u002Fdocs\u002Fpytorch_gemma).\n\n## Try it out with PyTorch\n\nPrerequisite: make sure you have set up Docker permissions properly as a non-root user.\n\n```bash\nsudo usermod -aG docker $USER\nnewgrp docker\n```\n\n### Build the docker image.\n\n```bash\nDOCKER_URI=gemma:${USER}\n\ndocker build -f docker\u002FDockerfile .\u002F -t ${DOCKER_URI}\n```\n\n### Run Gemma inference on CPU.\n\n> NOTE: This is a multimodal example. Use a multimodal variant.\n\n```bash\ndocker run -t --rm \\\n    -v ${CKPT_PATH}:\u002Ftmp\u002Fckpt \\\n    ${DOCKER_URI} \\\n    python scripts\u002Frun_multimodal.py \\\n    --ckpt=\u002Ftmp\u002Fckpt \\\n    --variant=\"${VARIANT}\" \\\n    # add `--quant` for the int8 quantized model.\n```\n\n### Run Gemma inference on GPU.\n\n> NOTE: This is a multimodal example. 
Use a multimodal variant.\n\n```bash\ndocker run -t --rm \\\n    --gpus all \\\n    -v ${CKPT_PATH}:\u002Ftmp\u002Fckpt \\\n    ${DOCKER_URI} \\\n    python scripts\u002Frun_multimodal.py \\\n    --device=cuda \\\n    --ckpt=\u002Ftmp\u002Fckpt \\\n    --variant=\"${VARIANT}\"\n    # add `--quant` for the int8 quantized model.\n```\n\n## Try it out with PyTorch\u002FXLA\n\n### Build the docker image (CPU, TPU).\n\n```bash\nDOCKER_URI=gemma_xla:${USER}\n\ndocker build -f docker\u002Fxla.Dockerfile .\u002F -t ${DOCKER_URI}\n```\n\n### Build the docker image (GPU).\n\n```bash\nDOCKER_URI=gemma_xla_gpu:${USER}\n\ndocker build -f docker\u002Fxla_gpu.Dockerfile .\u002F -t ${DOCKER_URI}\n```\n\n### Run Gemma inference on CPU.\n\n> NOTE: This is a multimodal example. Use a multimodal variant.\n\n```bash\ndocker run -t --rm \\\n    --shm-size 4gb \\\n    -e PJRT_DEVICE=CPU \\\n    -v ${CKPT_PATH}:\u002Ftmp\u002Fckpt \\\n    ${DOCKER_URI} \\\n    python scripts\u002Frun_xla.py \\\n    --ckpt=\u002Ftmp\u002Fckpt \\\n    --variant=\"${VARIANT}\" \\\n    # add `--quant` for the int8 quantized model.\n```\n\n### Run Gemma inference on TPU.\n\nNote: be sure to use the docker container built from `xla.Dockerfile`.\n\n```bash\ndocker run -t --rm \\\n    --shm-size 4gb \\\n    -e PJRT_DEVICE=TPU \\\n    -v ${CKPT_PATH}:\u002Ftmp\u002Fckpt \\\n    ${DOCKER_URI} \\\n    python scripts\u002Frun_xla.py \\\n    --ckpt=\u002Ftmp\u002Fckpt \\\n    --variant=\"${VARIANT}\" \\\n    # add `--quant` for the int8 quantized model.\n```\n\n### Run Gemma inference on GPU.\n\nNote: be sure to use the docker container built from `xla_gpu.Dockerfile`.\n\n```bash\ndocker run -t --rm --privileged \\\n    --shm-size=16g --net=host --gpus all \\\n    -e USE_CUDA=1 \\\n    -e PJRT_DEVICE=CUDA \\\n    -v ${CKPT_PATH}:\u002Ftmp\u002Fckpt \\\n    ${DOCKER_URI} \\\n    python scripts\u002Frun_xla.py \\\n    --ckpt=\u002Ftmp\u002Fckpt \\\n    --variant=\"${VARIANT}\" \\\n    # add `--quant` for the int8 
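quantized model.\n```\n\n### Tokenizer Notes\n\n99 unused tokens are reserved in the pretrained tokenizer model to assist with more efficient training\u002Ffine-tuning. Unused tokens are in the string format of `\u003Cunused[0-97]>` with token id range of `[7-104]`. \n\n```\n\"\u003Cunused0>\": 7,\n\"\u003Cunused1>\": 8,\n\"\u003Cunused2>\": 9,\n...\n\"\u003Cunused98>\": 104,\n```\n\n## Disclaimer\n\nThis is not an officially supported Google product.\n","Gemma is a family of lightweight, state-of-the-art open models developed by Google, including text-only and multimodal decoder models. This project provides the official PyTorch implementation of the Gemma models, supports inference with PyTorch and PyTorch\u002FXLA, and runs on CPU, GPU, and TPU. Gemma models come in a range of sizes and configurations, from 1B to 27B parameters, and are suited to natural language processing tasks and multimodal applications. Pre-trained weights and instruction-tuned variants are available on Kaggle and Hugging Face, making it easy to get started and deploy. The models suit scenarios that need high performance under resource constraints, such as mobile devices or edge computing environments.",2,"2026-06-11 03:40:16","high_star"]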