[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-75443":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":13,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":14,"forks30d":14,"starsTrendScore":16,"compositeScore":18,"rankGlobal":8,"rankLanguage":8,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":20,"topics":23,"createdAt":8,"pushedAt":8,"updatedAt":24,"readmeContent":25,"aiSummary":26,"trendingCount":14,"starSnapshotCount":14,"syncStatus":12,"lastSyncTime":27,"discoverSource":28},75443,"recgen","TRI-ML\u002Frecgen","TRI-ML",null,"Jupyter Notebook",189,15,2,1,0,3,9,67,3.61,"Other",false,"main",true,[],"2026-06-12 02:03:34","# RecGen Inference\n\nThis repository contains inference code for **RecGen**, a model for single-view and multi-view 3D reconstruction from RGB-D.\nGiven an RGB image, a depth map, an object mask, and camera intrinsics, RecGen produces a textured mesh, a Gaussian splat, and the object's 6-DoF pose in the camera frame.\n\nTraining code will be released in this same repository in a future update.\n\n## Setup\n\nRecGen depends on `torch`, a few 3D libraries, and two CUDA extensions (`spconv`, `diff-gaussian-rasterization`). 
We recommend [pixi](https:\u002F\u002Fpixi.sh), which pins Python, CUDA, and PyTorch in a single lockfile:\n\n```bash\ncurl -fsSL https:\u002F\u002Fpixi.sh\u002Finstall.sh | bash   # one-time\npixi install                                    # CUDA 12.1\npixi install -e cu118                           # CUDA 11.8\n```\n\nOptional for speedups and additional visualizations (after `pixi install`):\n\n```bash\npixi run post-install                 # flash-attn (falls back to xformers if the build fails)\npixi run build-nvdiffrast             # nvdiffrast, needed for mp4 vis of the rendering as well as glb (--save-glb)\npixi run build-gaussian-rasterizer    # diff_gaussian_rasterization, needed for turntable.mp4 and --save-glb\n```\n\nWithout these, `mesh.obj`, `overlay.png`, and `metadata.json` are still produced; the turntable video is skipped.\n\n\u003Cdetails>\n\u003Csummary>Manual install (no pixi)\u003C\u002Fsummary>\n\n```bash\npip install -e .\nbash scripts\u002Fsetup_cuda.sh            # spconv + flash-attn + diff-gaussian-rasterization\nbash scripts\u002Fsetup_cuda.sh --all      # also builds nvdiffrast\n```\n\nYou are responsible for providing a working PyTorch + CUDA toolchain.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>Docker (no local install)\u003C\u002Fsummary>\n\nIf you'd rather not install anything on the host, a CUDA 12.1 image is provided. 
Requires the [NVIDIA Container Toolkit](https:\u002F\u002Fdocs.nvidia.com\u002Fdatacenter\u002Fcloud-native\u002Fcontainer-toolkit\u002Flatest\u002Finstall-guide.html).\n\n```bash\ndocker build -t recgen-inference .\ndocker run --rm --gpus all -v $PWD\u002Fdata:\u002Fdata recgen-inference\n```\n\nOr via Compose (persists the HuggingFace cache in a named volume):\n\n```bash\ndocker compose run --rm recgen bash\n```\n\n\u003C\u002Fdetails>\n\n## Quick Start\n\n```python\nfrom recgen_inference import build_recgen, generate\n\npipeline = build_recgen.build(\"recgen_base.multiview_stereo\")\n\nresult = generate(\n    pipeline,\n    image=rgb,           # HxWx3 uint8\n    depth=depth,         # HxW uint16 (mm) or float32 (m)\n    mask=mask,           # HxW uint8, non-zero = object\n    intrinsics=K,        # 3x3\n)\n\nprint(result.pose_matrix)           # (4, 4) float64 pose in the camera frame\nprint(result.pose_quat)              # (7,) [tx, ty, tz, qx, qy, qz, qw]\nprint(result.mesh.vertices.shape)    # final textured mesh\n\nresult.save(\".\u002Fout\", save_splat=True, save_glb=False)\n```\n\n`result` is a [`RecGenResult`](recgen_inference\u002F_result.py) dataclass that carries the final mesh, the raw object-frame mesh, and the 6-DoF pose (both as a 4×4 matrix and as a 7-vector).\n\nSee [notebooks\u002Fexample.ipynb](notebooks\u002Fexample.ipynb) for a runnable walkthrough, or run the CLI script:\n\n```bash\npixi run python scripts\u002Frun_inference.py \\\n    --rgb examples\u002Fex0_rgb.png \\\n    --depth examples\u002Fex0_depth.png \\\n    --mask examples\u002Fex0_mask.png \\\n    --intrinsics examples\u002Fintrinsics.yaml \\\n    --out .\u002Fout --save-splat\n```\n\n`intrinsics.yaml` contains `fu`, `fv`, `pu`, `pv` as top-level keys.\n\n## Multi-view Inference\n\n```python\nfrom recgen_inference import generate_multiview\n\nresult = generate_multiview(\n    pipeline,\n    anchor_view={\"rgb\": rgb0, \"depth\": d0, \"mask\": m0, \"camera_intrinsics\": K},\n    
second_views=[\n        {\"rgb\": rgb1, \"depth\": d1, \"mask\": m1, \"camera_intrinsics\": K},\n    ],\n)\n```\n\nThe result is expressed in the anchor view's camera frame.\n\n## Input Format\n\n- **RGB** — any common format readable by OpenCV (PNG\u002FJPG).\n- **Depth** — 16-bit PNG in millimeters *or* float32 in meters. The unit is auto-detected.\n- **Mask** — single-channel PNG; non-zero pixels mark the object.\n- **Intrinsics** — `(3, 3)` matrix. Helper: `recgen_inference.utils.intrinsics_from_params(fx, fy, cx, cy)`.\n\n## Output Files (`result.save(...)`)\n\n| File | Description |\n| --- | --- |\n| `mesh.obj` | mesh in object frame (with vertex colors baked into MTL) |\n| `posed_mesh.obj` | mesh in camera frame (with vertex colors baked into MTL) |\n| `overlay.png` | posed mesh rendered over the input RGB (purple, software rasterizer) |\n| `turntable.mp4` | 4 s side-by-side turntable: gaussian color \\| mesh normals (skipped with a warning if nvdiffrast + diff_gaussian_rasterization are not installed) |\n| `gaussian.ply` | Gaussian splat in object frame (if `save_splat=True`) |\n| `posed_gaussian.ply` | Gaussian splat in camera frame (if `save_splat=True`) |\n| `textured_mesh.glb` | mesh with baked texture in object frame (if `save_glb=True` and nvdiffrast + diff_gaussian_rasterization are installed) |\n| `metadata.json` | predicted pose + camera info |\n\n## Viewing Gaussian Splats\n\nThe `gaussian.ply` produced with `--save-splat` is compatible with [SuperSplat](https:\u002F\u002Fsuperspl.at\u002Feditor), a browser-based viewer and editor. Drag the file into the editor window — no upload or install required, everything runs locally in the browser. SuperSplat is also useful for cropping, cleaning, and re-exporting splats (`.ply`, `.splat`, or compressed `.ply`).\n\n## Troubleshooting\n\n- **`spconv` import error** — wrong CUDA variant. 
Reinstall with `pip install spconv-cu118` or `spconv-cu120` to match your CUDA.\n- **`flash-attn` build fails** — safe to ignore; RecGen falls back to PyTorch SDPA.\n- **`diff_gaussian_rasterization` error on CUDA 12+** — rebuild from source via `bash scripts\u002Fsetup_cuda.sh`; the prebuilt wheel targets CUDA 11.\n- **GLB export fails** — run `pixi run build-nvdiffrast` and `pixi run build-gaussian-rasterizer`, or set `save_glb=False`.\n- **OpenGL \u002F EGL errors on headless servers** — `PYOPENGL_PLATFORM=egl` is set automatically; ensure your driver has EGL support.\n\n## License\n\nThis project is released for non-commercial use only. See [LICENSE](LICENSE) for full terms.\n","RecGen is a model for single-view and multi-view 3D reconstruction from RGB-D data. Given an RGB image, a depth map, an object mask, and camera intrinsics, it produces a textured mesh, a Gaussian splat, and the object's 6-DoF pose in the camera frame. Technically, RecGen builds on PyTorch and integrates several 3D processing libraries and CUDA extensions to accelerate computation. It suits applications that need high-quality 3D reconstruction, such as virtual reality, augmented reality, and robot navigation. The project also provides a Docker-based deployment option so that users can run it without installing a local environment.","2026-06-11 03:52:46","CREATED_QUERY"]