[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80594":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":12,"subscribersCount":12,"size":12,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":12,"forks30d":12,"starsTrendScore":18,"compositeScore":12,"rankGlobal":9,"rankLanguage":9,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":20,"topics":23,"createdAt":9,"pushedAt":9,"updatedAt":24,"readmeContent":25,"aiSummary":26,"trendingCount":12,"starSnapshotCount":12,"syncStatus":14,"lastSyncTime":27,"discoverSource":28},80594,"PartFlow","Dennis-JwWeng\u002FPartFlow","Dennis-JwWeng","PartFlow: two-stage image-conditioned 3D editing (inference code)",null,"Python",64,0,51,2,1,3,9,4,"MIT License",false,"main",true,[],"2026-06-12 02:04:04","\u003Cdiv align=\"center\">\n\n# Feedforward 3D Editing Learns from Semantic-Part Transformation\n\n[Jiawei Weng](mailto:jweng007@e.ntu.edu.sg)\u003Csup>1,&ast;\u003C\u002Fsup>,\n[Saining Zhang](https:\u002F\u002Fsainingzhang.github.io\u002F)\u003Csup>1,&ast;,†\u003C\u002Fsup>,\n[Zhenxin Diao](mailto:diaozhenxin2005@outlook.com)\u003Csup>2,&ast;\u003C\u002Fsup>,\n[Peishuo Li](mailto:peishuo001@e.ntu.edu.sg)\u003Csup>1\u003C\u002Fsup>,\n[Henghaofan Zhang](mailto:hhfzhang@outlook.com)\u003Csup>2\u003C\u002Fsup>,\n[Junhao Chen](https:\u002F\u002Fyisuanwang.github.io\u002F)\u003Csup>2\u003C\u002Fsup>,\n[Hao Zhao](https:\u002F\u002Fsites.google.com\u002Fview\u002Ffromandto)\u003Csup>2,†\u003C\u002Fsup>\n\n\u003Csup>1\u003C\u002Fsup>Nanyang Technological University, Singapore &nbsp;&nbsp;\n\u003Csup>2\u003C\u002Fsup>Tsinghua University, China\n\n\u003Csub>&ast;Equal contribution. 
†Corresponding author.\u003C\u002Fsub>\n\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fdennis-jwweng.github.io\u002Fpxform\u002F\">\u003Cimg src=https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject%20Page-333399.svg?logo=googlehome height=22px>\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.27351\">\u003Cimg src=https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPaper-arxiv-b5212f.svg?logo=readthedocs height=22px>\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.27351\">\u003Cimg src=https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FArxiv-2605.27351-b5212f.svg?logo=arxiv height=22px>\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FART-3D\u002FPxform_v1\">\u003Cimg src=https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Dataset-Pxform__v1-d96902.svg height=22px>\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002FART-3D\u002FPartFlow_models\">\u003Cimg src=https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Weights-PartFlow__models-276cb4.svg height=22px>\u003C\u002Fa>\n  \u003Ca href=\"LICENSE\">\u003Cimg src=https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg height=22px>\u003C\u002Fa>\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"assets\u002Fgallery.png\" alt=\"PartFlow — edited asset gallery\" width=\"95%\">\n\u003C\u002Fdiv>\n\n> **PartFlow** is a feedforward 3D editing network that edits an existing 3D\n> asset to match a target edit image — no per-asset optimisation, no 3D mask\n> at inference. 
We train it on **Pxform**, a high-quality 3D editing dataset\n> with 100K+ consistent before\u002Fafter pairs across seven edit types, grounding\n> edits in semantic 3D parts.\n\n## Highlights\n\n- **Feedforward** — one forward pass per edit\n- **Semantic-part grounded** — trained on Pxform's part-level pairs\n- **Mask-free at inference** — only needs the source asset + a target image\n- **Two-stage flow** — sparse-structure edit ➜ structured-latent edit\n\n\n## Method\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"assets\u002Fmethod.png\" alt=\"PartFlow architecture — two-stage controlled flow\" width=\"95%\">\n\u003C\u002Fdiv>\n\nPartFlow edits in two stages, conditioning a pretrained 3D generative prior\n([TRELLIS](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTRELLIS)) on the **source asset's\nlatent** and a **target edit image**. Each stage is a controlled flow model\nwith a zero-linear gated reference branch and a mask-aware training loss:\n\n- **Stage 1 — Sparse-structure flow.** Inputs the source SS latent + edit\n  condition, predicts the edited 16³ voxel structure.\n- **Stage 2 — Structured-latent (SLAT) flow.** Inputs the source SLAT mapped\n  to the edited coords + edit condition, predicts the edited SLAT, which the\n  TRELLIS decoders turn into a textured `edit.glb`.\n\n## Installation\n\nPartFlow reuses the TRELLIS runtime (same CUDA extensions, same frozen\nDINOv2 \u002F SS \u002F SLAT decoders). Set up TRELLIS first, then add PartFlow on top.\nTested with **Python 3.10**, **PyTorch 2.5.0**, **CUDA 12.4**.\n\n**1. Set up the TRELLIS environment.** Follow the official\n[TRELLIS installation guide](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTRELLIS#-installation)\nto create the conda env and build the CUDA extensions (`spconv`,\n`flash-attn`, `kaolin`, `diff_gaussian_rasterization`, `nvdiffrast`,\n`diffoctreerast`). For convenience, an equivalent one-liner is bundled here:\n\n```bash\n. 
.\u002Fsetup.sh --new-env --basic --flash-attn --diffoctreerast --spconv \\\n             --mipgaussian --kaolin --nvdiffrast\n```\n\n**2. Install PartFlow's extra Python dependencies** into the same env:\n\n```bash\npip install -r requirements.txt\n```\n\n## Weights\n\n```bash\npython download_weights.py          # -> .\u002Fweights\u002F{stage1_ss,stage2_slat}\u002F\n```\n\nPulls the two trained stage models from\n[`ART-3D\u002FPartFlow_models`](https:\u002F\u002Fhuggingface.co\u002FART-3D\u002FPartFlow_models).\n\n## Data layout\n\nInference reads pre-encoded inputs. Each *case* is a directory:\n\n```text\n\u003Ccase_dir>\u002F\n    ori_ss_latents.npz   # key `mean`: float32 [8, 16, 16, 16]   — source sparse-structure latent\n    ori_latents.npz      # `coords` [N,3] int, `feats` [N,8] f32 — source structured latent (SLAT)\n    edit_img.png         # the target edit image (RGB or RGBA)\n    case_meta.json       # optional metadata (prompt, edit type, ...)\n```\n\n`ori_ss_latents.npz` \u002F `ori_latents.npz` are the TRELLIS latents of the\n**source** asset; produce them with the standard TRELLIS image-to-3D encoder.\nGround-truth `edit_*` files, if present, are ignored by inference.\n\n## Run inference\n\n```bash\n# single case\npython inference.py --input examples\u002Fmod_glass_disc_table --output_dir outputs\n\n# a whole directory of cases\npython inference.py --input \u002Fpath\u002Fto\u002Fpxform\u002Fcases --output_dir outputs\n\n# useful flags\n#   --steps 50           flow-sampling steps\n#   --cfg_strength 0.0   classifier-free guidance (0 = condition only)\n#   --manifest ids.json  restrict to a list of case ids\n#   --skip_existing      resume a partial run\n```\n\nEach case writes `outputs\u002F\u003Cedit_id>\u002Fedit.glb` and `pred_slat.npz`.\n\n## Repository layout\n\n```text\nPartFlow\u002F\n├── inference.py        two-stage inference pipeline + CLI\n├── dataset.py          PxformDataset (per-case loader)\n├── download_weights.py fetch weights 
from Hugging Face\n├── configs\u002F            Stage 1 \u002F Stage 2 model configs\n├── examples\u002F           one ready-to-run example case\n├── trellis\u002F            TRELLIS backbone + PartFlow stage models\n├── assets\u002F             README figures\n├── setup.sh            CUDA-extension installer\n└── requirements.txt    pure-pip dependencies\n```\n\n## Results Comparison\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"assets\u002Fteaser_geometry.jpg\" alt=\"PartFlow vs. baselines — geometry edits\" width=\"95%\">\n  \u003Cbr\u002F>\u003Cbr\u002F>\n  \u003Cimg src=\"assets\u002Fteaser_colormat.jpg\" alt=\"PartFlow vs. baselines — appearance edits\" width=\"95%\">\n\u003C\u002Fdiv>\n\n## Citation\n\n```bibtex\n@article{weng2026partflow,\n  title   = {Feedforward 3D Editing Learns from Semantic-Part Transformation},\n  author  = {Weng, Jiawei and Zhang, Saining and Diao, Zhenxin and Li, Peishuo and Zhang, Henghaofan and Chen, Junhao and Zhao, Hao},\n  journal = {arXiv preprint arXiv:2605.27351},\n  year    = {2026}\n}\n```\n\n## Acknowledgements\n\nBuilt on [TRELLIS](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FTRELLIS).\n","PartFlow is an image-conditioned 3D editing network that modifies an existing 3D asset to match a target edit image. Its core feature is that it needs no per-asset optimization and no 3D mask at inference: the source asset and a target image are enough to perform an edit. It uses a two-stage pipeline, first a sparse-structure edit and then a structured-latent edit, and the whole process runs in a single forward pass. The model is trained on Pxform, a dataset of more than 100,000 high-quality before\u002Fafter pairs covering seven edit types. PartFlow is particularly suited to applications that need fast, efficient visual adjustments to 3D models, such as game development and virtual reality content creation.","2026-06-11 04:01:19","CREATED_QUERY"]