[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-79182":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":18,"compositeScore":19,"rankGlobal":9,"rankLanguage":9,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":9,"pushedAt":9,"updatedAt":25,"readmeContent":26,"aiSummary":27,"trendingCount":15,"starSnapshotCount":15,"syncStatus":28,"lastSyncTime":29,"discoverSource":30},79182,"PanoWorld","jjrCN\u002FPanoWorld","jjrCN","Official repo for the paper \"PanoWorld: A Generative Spatial World Model for Consistent Whole-House Panorama Synthesis\"",null,"Python",123,12,13,3,0,4,42,1,49.54,"MIT License",false,"main",true,[],"2026-06-12 04:01:24","# PanoWorld: A Generative Spatial World Model for Consistent Whole-House Panorama Synthesis\n\n\u003Cp align=\"center\">\n  \u003Cstrong>Jinrang Jia, Zhenjia Li, Yijiang Hu, Yifeng Shi\u003C\u002Fstrong>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cstrong>Ke Holdings Inc.\u003C\u002Fstrong>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fpdf\u002F2605.17916\">\u003Cimg alt=\"arXiv\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.17916-b31b1b.svg\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fjjrcn.github.io\u002FPanoWorld-project-home\u002F\">\u003Cimg alt=\"Project Page\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-2f80ed.svg\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FJiaJinrang\u002FPanoWorld-VR-Tour\">\u003Cimg alt=\"HF Space\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FHF-Space-f59e0b.svg\">\u003C\u002Fa>\n  \u003Ca 
href=\"https:\u002F\u002Fhuggingface.co\u002FJiaJinrang\u002FPanoWorld\u002Ftree\u002Fmain\">\u003Cimg alt=\"Model\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FModel-HuggingFace-f97316.svg\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiaJinrang\u002FPanoWorld\">\u003Cimg alt=\"Dataset\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDataset-HuggingFace-10b981.svg\">\u003C\u002Fa>\n  \u003Ca href=\"LICENSE\">\u003Cimg alt=\"License\" src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-green.svg\">\u003C\u002Fa>\n\u003C\u002Fp>\n\nPanoWorld is a generative spatial world model for consistent whole-house panorama synthesis. Given a floorplan and a style reference, it autoregressively generates node-based 360-degree panoramas that align with practical VR-tour navigation while preserving cross-view geometry and material consistency across an entire house.\n\nThis repository currently releases the **PanoWorld-LRM inference code**, together with model checkpoints and evaluation data links. 
More components of the full PanoWorld pipeline will be released progressively.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fpanoworld.png\" alt=\"PanoWorld main figure\" width=\"95%\">\n\u003C\u002Fp>\n\n## Overview\n\n- Whole-house synthesis is formulated as autoregressive generation over discrete panorama viewpoints, matching real VR-tour navigation.\n- A floorplan-derived 3D shell provides global structural guidance for multi-room layout consistency.\n- A dynamic 3DGS cache serves as renderable spatial memory, preserving cross-node geometry and material identity.\n- PanoWorld-LRM reconstructs metric-scale multi-room geometry from panoramic observations for high-quality whole-house rendering and evaluation.\n\n## News\n\n- `2026-05-19`: Paper released and project page launched.\n- `2026-05-25`: Open-sourced the PanoWorld-LRM inference code, checkpoints (including `1024x512` and `2048x1024` model weights), and evaluation data (`50` RealSee3D scenes).\n- `Coming Soon`: PanoWorld 2D generator inference code and checkpoints.\n- `Coming Soon`: Full PanoWorld pipeline, visualization, and evaluation code.\n- `Coming Soon`: Private scene data for evaluating PanoWorld panorama synthesis.\n- `Coming Soon`: PanoWorld-LRM training code.\n- `Coming Soon`: PanoWorld 2D generator training code.\n\n## Inference\n\n### Quick Start\n\n#### PanoWorld-LRM\n\n1. Install dependencies:\n\n```bash\npip install -r requirements.txt\n```\n\nThe released inference package is tested with\n`Python 3.10.18`, `PyTorch 2.3.1`, `TorchVision 0.18.1`, and `CUDA 12.1`.\n\n2. Download the prepared RealSee3D inference and evaluation data ([Download](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiaJinrang\u002FPanoWorld\u002Ftree\u002Fmain)):\n\n3. Check the selected config and update `data.root_data_dir`, `data.data_path`, `inference.ckpt_path`, and `inference.out_dir` if needed.\n\n4. 
Launch inference with one of the provided scripts:\n\n```bash\nbash infer_1024_512.sh\n```\n\nor\n\n```bash\nbash infer_2048_1024.sh\n```\n\nYou can also run inference directly with:\n\n```bash\npython inference.py --config configs\u002Finference_1024_512.yaml\n```\n\n5. If you would like to run inference on your own data, please refer to the dataset format description ([Here](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiaJinrang\u002FPanoWorld)):\n\nYou may reorganize your own data into the same format. Inference only depends on the panoramic image `panoImage_1600.jpg`, the camera extrinsics `extrinsics.txt`, and the viewpoint-to-room grouping defined in `map.json`. Organize your data as follows:\n\n```text\n\u003Cyour_data_root>\n  \u003Cscene_name1>\n    map.json\n    viewpoints\n      \u003Cview_name1>\n        panoImage_1600.jpg   # panorama image, w:h = 2:1; the resolution is not strictly limited to 1600x800\n        extrinsics.txt       # 4x4 camera extrinsic matrix (c2w) for this viewpoint\n      \u003Cview_name2>\n        panoImage_1600.jpg   # panorama image, w:h = 2:1\n        extrinsics.txt       # 4x4 camera extrinsic matrix (c2w) for this viewpoint\n      \u003Cview_name3>\n        panoImage_1600.jpg   # panorama image, w:h = 2:1\n        extrinsics.txt       # 4x4 camera extrinsic matrix (c2w) for this viewpoint\n      \u003Cview_name4>\n        panoImage_1600.jpg   # panorama image, w:h = 2:1\n        extrinsics.txt       # 4x4 camera extrinsic matrix (c2w) for this viewpoint\n      ...\n  \u003Cscene_name2>\n  \u003Cscene_name3>\n  ...\n```\n\nCreate a TXT file listing the scenes to be processed in the same format as `realsee3D_eval_8views.txt`, then set `config.data.root_data_dir` and `config.data.data_path` accordingly.\n\n6. The inference results will be saved in `inference.out_dir`. 
The `output_ply` directory can be directly visualized with `SIBR_Viewer`:\n\n```bash\n.\u002FSIBR_gaussianViewer_app -m \u002FPath\u002Fto\u002Foutput_ply\n```\n\nYou may also use other viewers such as `SuperSplat`.\n\n\u003Cp align=\"center\">\u003Cstrong>Inference GPU Memory Usage\u003C\u002Fstrong>\u003C\u002Fp>\n\n\u003Ctable align=\"center\">\n  \u003Cthead>\n    \u003Ctr>\n      \u003Cth align=\"center\">\u003C\u002Fth>\n      \u003Cth align=\"center\">1024x512\u003C\u002Fth>\n      \u003Cth align=\"center\">2048x1024\u003C\u002Fth>\n    \u003C\u002Ftr>\n  \u003C\u002Fthead>\n  \u003Ctbody>\n    \u003Ctr>\n      \u003Ctd align=\"center\">8-views\u003C\u002Ftd>\n      \u003Ctd align=\"center\">27507MiB\u003C\u002Ftd>\n      \u003Ctd align=\"center\">108369MiB\u003C\u002Ftd>\n    \u003C\u002Ftr>\n    \u003Ctr>\n      \u003Ctd align=\"center\">12-views\u003C\u002Ftd>\n      \u003Ctd align=\"center\">40285MiB\u003C\u002Ftd>\n      \u003Ctd align=\"center\">OOM\u003C\u002Ftd>\n    \u003C\u002Ftr>\n  \u003C\u002Ftbody>\n\u003C\u002Ftable>\n\n\u003Cp align=\"center\">\u003Csub>\u003Cem>Tested on NVIDIA H200. 
The paper uses \u003Ccode>1024x512\u003C\u002Fcode> for experiments and metric computation.\u003C\u002Fem>\u003C\u002Fsub>\u003C\u002Fp>\n\n#### PanoWorld 2D Generator\n\nComing Soon\n\n#### PanoWorld\n\nComing Soon\n\n### Released Files\n\n- `inference.py`: main inference entrypoint\n- `model.py`, `transformer.py`, `dpt_head.py`, `prope_custom.py`: model definition\n- `dataset.py`, `utils.py`, `metric_utils.py`: dataset loading and evaluation helpers\n- `configs\u002F`: released inference configs for `1024x512` and `2048x1024`\n- `data_realsee3D\u002F`: released RealSee3D evaluation file lists\n\n## Model Checkpoints\n\n| Component | Resolution | Link | Notes |\n| --- | --- | --- | --- |\n| PanoWorld-LRM | `1024x512` | [Checkpoint](https:\u002F\u002Fhuggingface.co\u002FJiaJinrang\u002FPanoWorld\u002Fblob\u002Fmain\u002Fmodel_ckpt\u002Fckpt_panoworld_lrm_1024_512.pt) | Released |\n| PanoWorld-LRM | `2048x1024` | [Checkpoint](https:\u002F\u002Fhuggingface.co\u002FJiaJinrang\u002FPanoWorld\u002Fblob\u002Fmain\u002Fmodel_ckpt\u002Fckpt_panoworld_lrm_2048_1024.ckpt) | Released |\n| PanoWorld 2D Generator | Coming Soon | Coming Soon | Coming Soon |\n\n## Data\n\n| Split | Dataset | Usage | Link | Notes |\n| --- | --- | --- | --- | --- |\n| Training | 3D Front | Train LRM and 2D generator | [Download](https:\u002F\u002Ftianchi.aliyun.com\u002Fdataset\u002F65347) | Data processing scripts: Coming Soon |\n| Training | RealSee3D | Train LRM and 2D generator | [Download](https:\u002F\u002Fgithub.com\u002Frealsee-developer\u002FRealSee3D) | Data processing scripts: Coming Soon |\n| Training | Private 2D panoramas | 2D generator only | - | Private |\n| Evaluation | RealSee3D | Evaluate LRM | [Download](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FJiaJinrang\u002FPanoWorld\u002Ftree\u002Fmain) | Released, including `50` RealSee3D scenes |\n| Evaluation | Private scene data | Evaluate PanoWorld panorama synthesis | Coming Soon | Coming Soon |\n\n## Citation\n\nIf you find 
this project useful, please cite:\n\n```bibtex\n@misc{jia2026panoworldgenerativespatialworld,\n      title={PanoWorld: A Generative Spatial World Model for Consistent Whole-House Panorama Synthesis},\n      author={Jinrang Jia and Zhenjia Li and Yijiang Hu and Yifeng Shi},\n      year={2026},\n      eprint={2605.17916},\n      archivePrefix={arXiv},\n      primaryClass={cs.CV},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.17916},\n}\n```\n\n## License\n\nThis project is released under the MIT License. See [LICENSE](LICENSE) for details.\n\n## Acknowledgements\n\nWe would like to thank [Gynjn\u002FMVP](https:\u002F\u002Fgithub.com\u002FGynjn\u002FMVP), [QwenLM\u002FQwen-Image](https:\u002F\u002Fgithub.com\u002FQwenLM\u002FQwen-Image), [realsee-developer\u002FRealSee3D](https:\u002F\u002Fgithub.com\u002Frealsee-developer\u002FRealSee3D), and [3D Front](https:\u002F\u002Ftianchi.aliyun.com\u002Fdataset\u002F65347) for their inspiring open-source contributions.\n","PanoWorld is a spatial world model for generating consistent whole-house panoramas. Given a floorplan and a style reference, it autoregressively generates node-based 360-degree panoramas that fit practical VR-tour navigation while preserving geometry and material consistency across viewpoints throughout the house. The project is written in Python; a floorplan-derived 3D shell provides global structural guidance, and a dynamic 3DGS cache serves as renderable spatial memory to maintain cross-node consistency. PanoWorld is particularly suited to design and presentation scenarios that call for high-quality, immersive virtual reality experiences, such as real-estate showcases and interior design previews.",2,"2026-06-11 03:57:35","CREATED_QUERY"]