[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72480":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":35,"readmeContent":36,"aiSummary":37,"trendingCount":16,"starSnapshotCount":16,"syncStatus":17,"lastSyncTime":38,"discoverSource":39},72480,"PartCrafter","wgsxm\u002FPartCrafter","wgsxm","[NeurIPS 2025] PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers","https:\u002F\u002Fwgsxm.github.io\u002Fprojects\u002Fpartcrafter\u002F",null,"Python",2436,161,171,3,0,2,8,21,6,28.63,"MIT License",false,"main",true,[27,28,29,30,31,32,33,34],"3d","3d-generation","3d-object-generation","3d-object-reconstruction","3d-reconstruction","3d-scene-generation","3d-scene-reconstruction","image-to-3d","2026-06-12 02:03:03","# [NeurIPS 2025] PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers\n\n\u003Ch4 align=\"center\">\n\n[Yuchen Lin\u003Csup>*\u003C\u002Fsup>](https:\u002F\u002Fwgsxm.github.io), [Chenguo Lin\u003Csup>*\u003C\u002Fsup>](https:\u002F\u002Fchenguolin.github.io), [Panwang Pan\u003Csup>†\u003C\u002Fsup>](https:\u002F\u002Fpaulpanwang.github.io), [Honglei Yan](https:\u002F\u002Fopenreview.net\u002Fprofile?id=~Honglei_Yan1), [Yiqiang Feng](https:\u002F\u002Fopenreview.net\u002Fprofile?id=~Feng_Yiqiang1), [Yadong Mu](http:\u002F\u002Fwww.muyadong.com), [Katerina 
Fragkiadaki](https:\u002F\u002Fwww.cs.cmu.edu\u002F~katef\u002F)\n\n[![arXiv](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2506.05573-b31b1b.svg?logo=arXiv)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.05573)\n[![Project Page](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🏠-Project%20Page-blue.svg)](https:\u002F\u002Fwgsxm.github.io\u002Fprojects\u002Fpartcrafter)\n[\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FYouTube-Video-red\" alt=\"YouTube\">](https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=ZaZHbkkPtXY)\n[![Model](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗%20Model-PartCrafter-yellow.svg)](https:\u002F\u002Fhuggingface.co\u002Fwgsxm\u002FPartCrafter)\n[![Model-Scene](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗%20Model-PartCrafter--Scene-yellow.svg)](https:\u002F\u002Fhuggingface.co\u002Fwgsxm\u002FPartCrafter-Scene)\n[![Demo](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F🤗%20Demo-PartCrafter-green.svg)](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Falexnasa\u002FPartCrafter)\n\n\u003Cp align=\"center\">\n    \u003Cimg width=\"90%\" alt=\"pipeline\", src=\".\u002Fassets\u002Fteaser.png\">\n\u003C\u002Fp>\n\n\u003C\u002Fh4>\n\nThis repository contains the official implementation of the paper: [PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers](https:\u002F\u002Fwgsxm.github.io\u002Fprojects\u002Fpartcrafter\u002F). \nPartCrafter is a structured 3D generative model that jointly generates multiple parts and objects from a single RGB image in one shot. \nHere is our [Project Page](https:\u002F\u002Fwgsxm.github.io\u002Fprojects\u002Fpartcrafter).\n\nFeel free to contact me (linyuchen@stu.pku.edu.cn) or open an issue if you have any questions or suggestions.\n\n## 📢 News\n- **2025-09-18**: PartCrafter is accepted to NeurIPS 2025. 
\n- **2025-08-15**: PartCrafter HuggingFace🤗 demo is available [here](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Falexnasa\u002FPartCrafter). Thanks to [alexnasa](https:\u002F\u002Fhuggingface.co\u002Falexnasa). \n- **2025-07-23**: The 3D scene version of PartCrafter is released, which is trained on [3D-Front](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fhuanngzh\u002F3D-Front). \n- **2025-07-20**: A guide for installing PartCrafter on Windows is available in [this fork](https:\u002F\u002Fgithub.com\u002FJackDainzh\u002FPartCrafter-Windows\u002Ftree\u002Fwindows-main). Thanks to [JackDainzh](https:\u002F\u002Fgithub.com\u002FJackDainzh)!\n- **2025-07-13**: PartCrafter is fully open-sourced 🚀.\n- **2025-06-09**: PartCrafter is on arXiv. \n\n## 📋 TODO\n- [x] Release inference scripts. \n- [x] Release training code and data preprocessing scripts. \n- [x] Release pretrained checkpoints on both object and scene level. \n- [x] Provide a HuggingFace🤗 demo.\n- [ ] Release preprocessed dataset. \n\n## 🔧 Installation\nWe use `torch-2.5.1+cu124` and `python-3.11`. But it should also work with other versions. Create a conda environment with the following command (optional):\n```\nconda create -n partcrafter python=3.11.13\nconda activate partcrafter\npip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu124\n```\nThen, install other dependencies with the following command:\n```\ngit clone https:\u002F\u002Fgithub.com\u002Fwgsxm\u002FPartCrafter.git\ncd PartCrafter\nbash settings\u002Fsetup.sh\n```\nIf you do not have root access and use conda environment, you can install required graphics libraries with the following command:\n```\nconda install -c conda-forge libegl libglu pyopengl\n```\nWe test the above installation on Debian 12 with NVIDIA H20 GPUs. 
Windows users can try setting up the environment by following [this pull request](https:\u002F\u002Fgithub.com\u002Fwgsxm\u002FPartCrafter\u002Fpull\u002F24) and [this fork](https:\u002F\u002Fgithub.com\u002FJackDainzh\u002FPartCrafter-Windows\u002Ftree\u002Fwindows-main). We sincerely thank [JackDainzh](https:\u002F\u002Fgithub.com\u002FJackDainzh) for contributing the Windows support! \n\n## 💡 Quick Start\n### 3D Part-Level Object Generation\n\u003Cp align=\"center\">\n    \u003Cimg width=\"90%\" alt=\"pipeline\" src=\".\u002Fassets\u002Frobot.gif\">\n\u003C\u002Fp>\n\nGenerate a 3D part-level object from an image:\n```\npython scripts\u002Finference_partcrafter.py \\\n  --image_path assets\u002Fimages\u002Fnp3_2f6ab901c5a84ed6bbdf85a67b22a2ee.png \\\n  --num_parts 3 --tag robot --render\n```\nThe required model weights will be automatically downloaded:\n- PartCrafter model from [wgsxm\u002FPartCrafter](https:\u002F\u002Fhuggingface.co\u002Fwgsxm\u002FPartCrafter) → pretrained_weights\u002FPartCrafter\n- RMBG model from [briaai\u002FRMBG-1.4](http:\u002F\u002Fhuggingface.co\u002Fbriaai\u002FRMBG-1.4) → pretrained_weights\u002FRMBG-1.4\n\nThe generated results will be saved to `.\u002Fresults\u002Frobot`. We provide several example images from Objaverse and ABO in `.\u002Fassets\u002Fimages`. Each filename starts with the recommended number of parts, e.g., `np3` means 3 parts. You can also try other part counts for the same input image. \n\nSpecify `--rmbg` if you use custom images. 
**This will remove the background of the input image and resize it appropriately.**\n\n### VLM-Based Part Suggestion\nInstead of manually specifying `--num_parts`, you can use a VLM to automatically suggest the number of parts:\n```\nGEMINI_API_KEY=your_key python scripts\u002Finference_partcrafter.py \\\n  --image_path assets\u002Fimages\u002Fnp3_2f6ab901c5a84ed6bbdf85a67b22a2ee.png \\\n  --part_suggest --tag robot --rmbg --render\n```\nThis sends the image to a VLM (default: `gemini-3-flash-preview`) which analyzes the object and suggests an appropriate part count. You can override the provider or model:\n```\n--part_provider gemini --part_model gemini-3-flash-preview\n```\n\n### Style Transfer for Real-World Images\nPartCrafter was trained on rendered images from Objaverse. When using real-world photos, you can apply style transfer to bridge the domain gap:\n```\nGEMINI_API_KEY=your_key python scripts\u002Finference_partcrafter.py \\\n  --image_path real_photo.jpg \\\n  --num_parts 4 --style_transfer --rmbg --render\n```\nThis converts the input photo to an Objaverse-style 3D rendering (default model: `gemini-3.1-flash-image-preview`) before feeding it to the pipeline. The stylized image is saved as `styled_input.png` in the output directory. 
You can override the provider or model:\n```\n--style_provider gemini --style_model gemini-3.1-flash-image-preview\n```\n\nBoth features can be combined:\n```\nGEMINI_API_KEY=your_key python scripts\u002Finference_partcrafter.py \\\n  --image_path real_photo.jpg \\\n  --part_suggest --style_transfer --rmbg --render\n```\n\nThe provider architecture is extensible: adding a new provider (e.g., OpenAI) requires only a new file in `src\u002Futils\u002Fproviders\u002F` implementing `suggest_num_parts()` and\u002For `stylize_for_objaverse()`.\n\n### 3D Scene Generation\n\u003Cp align=\"center\">\n    \u003Cimg width=\"90%\" alt=\"pipeline\" src=\".\u002Fassets\u002Fdining_room.gif\">\n\u003C\u002Fp>\n\nGenerate a 3D scene from an image:\n```\npython scripts\u002Finference_partcrafter_scene.py \\\n  --image_path assets\u002Fimages_scene\u002Fnp6_0192a842-531c-419a-923e-28db4add8656_DiningRoom-31158.png \\\n  --num_parts 6 --tag dining_room --render\n```\nThe required model weights will be automatically downloaded:\n- PartCrafter-Scene model from [wgsxm\u002FPartCrafter-Scene](https:\u002F\u002Fhuggingface.co\u002Fwgsxm\u002FPartCrafter-Scene) → pretrained_weights\u002FPartCrafter-Scene\n\nThe generated results will be saved to `.\u002Fresults\u002Fdining_room`. We provide several example images from 3D-Front in `.\u002Fassets\u002Fimages_scene`. Each filename starts with the recommended number of parts, e.g., `np3` means 3 parts. You can also try other part counts for the same input image. \n\nThe `--part_suggest` and `--style_transfer` flags are also available for scene-level generation.\n\n## 💻 System Requirements\nA CUDA-enabled GPU with at least 8 GB of VRAM is required. You can reduce the number of parts or the number of tokens to save GPU memory. By default, we set the number of tokens per part to `1024` at the object level and `2048` at the scene level for better quality. 
\n\n## 📊 Dataset\nPlease refer to the [Dataset README](.\u002Fdatasets\u002FREADME.md) to download and preprocess the dataset. To generate a minimal dataset, you can run:\n```\npython datasets\u002Fpreprocess\u002Fpreprocess.py --input assets\u002Fobjects --output preprocessed_data\n```\nThis script preprocesses GLB files in `.\u002Fassets\u002Fobjects` and saves the preprocessed data to `.\u002Fpreprocessed_data`. We provide a pseudo data configuration [here](.\u002Fdatasets\u002Fobject_part_configs.json), which makes use of the minimal preprocessed data and is compatible with the training settings.\n\n## 🦾 Training\nTo train PartCrafter from scratch, you first need to download TripoSG from [VAST-AI\u002FTripoSG](https:\u002F\u002Fhuggingface.co\u002FVAST-AI\u002FTripoSG) and store the weights in `.\u002Fpretrained_weights\u002FTripoSG`. \n```\nhuggingface-cli download VAST-AI\u002FTripoSG --local-dir pretrained_weights\u002FTripoSG\n```\n\nOur training scripts are designed for training with 8 H20 GPUs (96 GB VRAM each). Currently, we only finetune the DiT of TripoSG and keep the VAE fixed, but you can also finetune the VAE of TripoSG, which should improve the quality of the generated 3D parts. PartCrafter is compatible with all 3D object generative models based on vector sets such as [Hunyuan3D-2.1](https:\u002F\u002Fgithub.com\u002FTencent-Hunyuan\u002FHunyuan3D-2.1). We warmly welcome pull requests from the community. \n\nWe provide several training configurations [here](.\u002Fconfigs). You should update the dataset config path in the training config files, which is currently set to `.\u002Fdatasets\u002Fobject_part_configs.json`. \n\nIf you use `wandb`, you should also set the `WANDB_API_KEY` in the training script. If you have trouble connecting to `wandb`, try `export WANDB_BASE_URL=https:\u002F\u002Fapi.bandw.top`. 
\n\nTrain PartCrafter from TripoSG:\n```\nbash scripts\u002Ftrain_partcrafter.sh --config configs\u002Fmp8_nt512.yaml --use_ema \\\n  --gradient_accumulation_steps 4 \\\n  --output_dir output_partcrafter \\\n  --tag scaleup_mp8_nt512\n```\n\nFinetune PartCrafter with a larger number of parts:\n```\nbash scripts\u002Ftrain_partcrafter.sh --config configs\u002Fmp16_nt512.yaml --use_ema \\\n  --gradient_accumulation_steps 4 \\\n  --output_dir output_partcrafter \\\n  --load_pretrained_model scaleup_mp8_nt512 \\\n  --load_pretrained_model_ckpt 10 \\\n  --tag scaleup_mp16_nt512\n```\n\nFinetune PartCrafter with more tokens:\n```\nbash scripts\u002Ftrain_partcrafter.sh --config configs\u002Fmp16_nt1024.yaml --use_ema \\\n  --gradient_accumulation_steps 4 \\\n  --output_dir output_partcrafter \\\n  --load_pretrained_model scaleup_mp16_nt512 \\\n  --load_pretrained_model_ckpt 10 \\\n  --tag scaleup_mp16_nt1024\n```\n\n## 😊 Acknowledgement\nWe would like to thank the authors of [DiffSplat](https:\u002F\u002Fchenguolin.github.io\u002Fprojects\u002FDiffSplat\u002F), [TripoSG](https:\u002F\u002Fyg256li.github.io\u002FTripoSG-Page\u002F), [HoloPart](https:\u002F\u002Fvast-ai-research.github.io\u002FHoloPart\u002F), and [MIDI-3D](https:\u002F\u002Fhuanngzh.github.io\u002FMIDI-Page\u002F) \nfor their great work and for generously providing their source code, which inspired our work and helped us a lot in the implementation. 
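\n\n\n## 📚 Citation\nIf you find our work helpful, please consider citing:\n```bibtex\n@misc{lin2025partcrafter,\n  title={PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers}, \n  author={Yuchen Lin and Chenguo Lin and Panwang Pan and Honglei Yan and Yiqiang Feng and Yadong Mu and Katerina Fragkiadaki},\n  year={2025},\n  eprint={2506.05573},\n  url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.05573}\n}\n```\n\n## 🌟 Star History\n[![Star History Chart](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=wgsxm\u002FPartCrafter&type=Date)](https:\u002F\u002Fwww.star-history.com\u002F#wgsxm\u002FPartCrafter&Date)\n","PartCrafter is a structured 3D mesh generation model based on compositional latent diffusion transformers that generates multiple parts and objects from a single RGB image in one shot. The project uses latent diffusion models and transformer architectures to reconstruct and generate high-quality 3D objects and scenes. Its core capabilities include converting 2D images to 3D models, joint multi-part generation, and support for complex scenes. It is intended for professionals working in 3D content creation, virtual reality environment construction, or computer vision research. The project provides detailed documentation, pretrained models, and an online demo on HuggingFace so that users can get started quickly.","2026-06-11 03:42:13","high_star"]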