[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72256":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},72256,"ComfyUI-GGUF","city96\u002FComfyUI-GGUF","city96","GGUF Quantization support for native ComfyUI models",null,"Python",3723,308,29,194,0,23,43,110,69,29.47,"Apache License 2.0",false,"main",true,[],"2026-06-12 02:03:00","# ComfyUI-GGUF\nGGUF Quantization support for native ComfyUI models\n\nThis is currently very much WIP. These custom nodes provide support for model files stored in the GGUF format popularized by [llama.cpp](https:\u002F\u002Fgithub.com\u002Fggerganov\u002Fllama.cpp).\n\nWhile quantization wasn't feasible for regular UNET models (conv2d), transformer\u002FDiT models such as flux seem less affected by quantization. This allows running them at much lower bits per weight, using variable bitrate quants, on low-end GPUs. For further VRAM savings, a node to load a quantized version of the T5 text encoder is also included.\n\n![Comfy_Flux1_dev_Q4_0_GGUF_1024](https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F70d16d97-c522-4ef4-9435-633f128644c8)\n\nNote: The \"Force\u002FSet CLIP Device\" is **NOT** part of this node pack. Do not install it if you only have one GPU. Do not set it to cuda:0 then complain about OOM errors if you do not understand what it is for. 
There is no need to copy the workflow above, just use your own workflow and replace the stock \"Load Diffusion Model\" with the \"Unet Loader (GGUF)\" node.\n\n## Installation\n\n> [!IMPORTANT]  \n> Make sure your ComfyUI is on a recent-enough version to support custom ops when loading the UNET-only.\n\nTo install the custom node normally, git clone this repository into your custom nodes folder (`ComfyUI\u002Fcustom_nodes`) and install the only dependency for inference (`pip install --upgrade gguf`).\n\n```\ngit clone https:\u002F\u002Fgithub.com\u002Fcity96\u002FComfyUI-GGUF\n```\n\nTo install the custom node on a standalone ComfyUI release, open a CMD inside the \"ComfyUI_windows_portable\" folder (where your `run_nvidia_gpu.bat` file is) and use the following commands:\n\n```\ngit clone https:\u002F\u002Fgithub.com\u002Fcity96\u002FComfyUI-GGUF ComfyUI\u002Fcustom_nodes\u002FComfyUI-GGUF\n.\\python_embeded\\python.exe -s -m pip install -r .\\ComfyUI\\custom_nodes\\ComfyUI-GGUF\\requirements.txt\n```\n\nOn macOS Sequoia, torch 2.4.1 seems to be required, as 2.6.X nightly versions cause a \"M1 buffer is not large enough\" error. See [this issue](https:\u002F\u002Fgithub.com\u002Fcity96\u002FComfyUI-GGUF\u002Fissues\u002F107) for more information\u002Fworkarounds.\n\n## Usage\n\nSimply use the GGUF Unet loader found under the `bootleg` category. 
Place the .gguf model files in your `ComfyUI\u002Fmodels\u002Funet` folder.\n\nLoRA loading is experimental but it should work with just the built-in LoRA loader node(s).\n\nPre-quantized models:\n\n- [flux1-dev GGUF](https:\u002F\u002Fhuggingface.co\u002Fcity96\u002FFLUX.1-dev-gguf)\n- [flux1-schnell GGUF](https:\u002F\u002Fhuggingface.co\u002Fcity96\u002FFLUX.1-schnell-gguf)\n- [stable-diffusion-3.5-large GGUF](https:\u002F\u002Fhuggingface.co\u002Fcity96\u002Fstable-diffusion-3.5-large-gguf)\n- [stable-diffusion-3.5-large-turbo GGUF](https:\u002F\u002Fhuggingface.co\u002Fcity96\u002Fstable-diffusion-3.5-large-turbo-gguf)\n\nInitial support for quantizing T5 has also been added recently. Quantized T5 models can be loaded with the various `*CLIPLoader (gguf)` nodes, which can be used in place of the regular ones. For the CLIP model, use whatever model you were using before for CLIP. The loader can handle both types of files - `gguf` and regular `safetensors`\u002F`bin`.\n\n- [t5_v1.1-xxl GGUF](https:\u002F\u002Fhuggingface.co\u002Fcity96\u002Ft5-v1_1-xxl-encoder-gguf)\n\nSee the instructions in the [tools](https:\u002F\u002Fgithub.com\u002Fcity96\u002FComfyUI-GGUF\u002Ftree\u002Fmain\u002Ftools) folder for how to create your own quants.\n","The ComfyUI-GGUF project provides GGUF-format quantization support for ComfyUI models. Its core features include support for model files stored in the GGUF format and the use of quantization to reduce model size, allowing these models to run on low-end GPUs. The project is particularly suited to scenarios where machine learning models must be deployed and run in resource-constrained environments (such as devices with low memory or low compute capability). It also provides a node for loading a quantized version of the T5 text encoder, further reducing VRAM usage. Developed in Python and released under the open-source Apache License 2.0, it is suited to developers who want to optimize model performance while preserving accuracy.",2,"2026-06-11 03:41:05","high_star"]