[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-78047":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":14,"stars7d":16,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":16,"compositeScore":18,"rankGlobal":9,"rankLanguage":9,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":20,"topics":23,"createdAt":9,"pushedAt":9,"updatedAt":24,"readmeContent":25,"aiSummary":26,"trendingCount":15,"starSnapshotCount":15,"syncStatus":13,"lastSyncTime":27,"discoverSource":28},78047,"Code-as-Room","YxuanAr\u002FCode-as-Room","YxuanAr","An MLLM-based agentic system that converts a single room image into executable Blender code for 3D room reconstruction.",null,"Python",162,18,2,1,0,38,105,3.84,"Apache License 2.0",false,"main",true,[],"2026-06-12 02:03:45","# Code-as-Room: Generating 3D Rooms from Top-Down View Images via Agentic Code Synthesis\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fteaser_cropped.png\" alt=\"Code-as-Room teaser\" width=\"100%\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fyxuanar.github.io\u002F\">Yixuan Yang\u003C\u002Fa>\u003Csup>1*\u003C\u002Fsup>,\n  \u003Ca href=\"https:\u002F\u002Fscholar.google.com\u002Fcitations?user=hYXbJgcAAAAJ&hl=zh-CN\">Zhen Luo\u003C\u002Fa>\u003Csup>2,3*\u003C\u002Fsup>,\n  \u003Ca href=\"https:\u002F\u002Fganwanshui.github.io\u002F\">Wanshui Gan\u003C\u002Fa>\u003Csup>1*\u003C\u002Fsup>,\n  \u003Ca href=\"https:\u002F\u002Fjinkun-hao.github.io\u002F\">Jinkun Hao\u003C\u002Fa>\u003Csup>1\u003C\u002Fsup>,\n  \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002FJunrulu\">Junru Lu\u003C\u002Fa>\u003Csup>4\u003C\u002Fsup>,\n  Jinghao Yan\u003Csup>1\u003C\u002Fsup>,\n  \u003Ca 
href=\"https:\u002F\u002Fzhaoyanglyu.github.io\u002F\">Zhaoyang Lyu\u003C\u002Fa>\u003Csup>1\u003C\u002Fsup>,\n  \u003Ca href=\"https:\u002F\u002Fsheldontsui.github.io\u002F\">Xudong Xu\u003C\u002Fa>\u003Csup>1†\u003C\u002Fsup>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Csup>1\u003C\u002Fsup>Shanghai Artificial Intelligence Laboratory&nbsp;&nbsp;\n  \u003Csup>2\u003C\u002Fsup>Shanghai Innovation Institute&nbsp;&nbsp;\n  \u003Csup>3\u003C\u002Fsup>Southern University of Science and Technology&nbsp;&nbsp;\n  \u003Csup>4\u003C\u002Fsup>University of Warwick\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Csup>*\u003C\u002Fsup>Equal Contribution&nbsp;&nbsp;\n  \u003Csup>†\u003C\u002Fsup>Corresponding Author&nbsp;&nbsp;\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fcode-as-room.github.io\u002F\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-green.svg\" alt=\"Project Page\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.18451\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPaper-arXiv-b31b1b.svg\" alt=\"Paper\">\u003C\u002Fa>\n  \u003Ca href=\"LICENSE\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache%202.0-blue.svg\" alt=\"License: Apache 2.0\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003C!-- \u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Ficon.png\" alt=\"Code-as-Room icon\" width=\"150\">\n\u003C\u002Fp> -->\n\n## Overview\n\n**Code-as-Room** is an MLLM-based agentic framework equipped with a structured execution harness that represents 3D rooms with Blender code. 
Given a single top-down view image, the framework parses scene elements and their spatial relationships, and synthesizes executable Blender code for geometry, materials, and lighting through a principled, multi-stage pipeline.\n\nThe pipeline is agent-driven: LLM\u002FVLM stages produce scene semantics, relation graphs, Blender layout code, object descriptions, detailed geometry, materials, texture prompts, and render settings. Deterministic code handles orchestration, validation, repair, memory, code integration, and several geometry\u002Flayout constraints.\n\n## What This Repository Does\n\nThis repository releases the Blender-code generation pipeline for Code-as-Room. The goal is to turn a top-down room image into an executable Blender scene by progressively converting visual evidence into structured scene understanding, object layout, geometry code, materials, textures, and render settings.\n\nThe current implementation focuses on code-synthesized room reconstruction. It includes isolated run directories, resumable stages, scene-type routing, major-object geometry refinement, material generation, optional texture generation, and final render-script generation.\n\nThe asset-retrieval data\u002Fcheckpoints and 3D generation combination components described in the broader project plan are not included in this release yet. 
They will be released separately.\n\n## Pipeline Stages\n\n```text\nStage 0   Scene classification\nStage 1   Spatial semantic analysis\nStage 2   Scene graph construction\nStage 3   Base Blender code generation\nStage 4   Wall objects and selected minor placeholders\nStage 5   Major object descriptions\nStage 6   Detailed geometry for major objects\nStage 7   Surface-based small-object placement\nStage 8   Detailed small-object descriptions (optional extension)\nStage 9   Detailed small-object geometry (optional extension)\nStage 10  Per-part PBR material generation\nStage 11  Real texture generation and injection\nStage 12  Render-ready lighting and render settings\n```\n\nStages 8 and 9 are optional extensions in this codebase for giving generated small objects their own descriptions and composite geometry. They are not part of the main paper pipeline. They are off by default and only run when `--detail-small-objects` is enabled. If you do not need detailed small-object geometry, leave this option disabled.\n\n```bash\npython run_pipeline.py \\\n  --image example\u002Fexample1.png \\\n  --detail-small-objects\n```\n\n## Release Plan\n\n- [x] Blender code generation release.\n  - Initial release of the core 3D room generation pipeline.\n- [ ] Web-based editing and viewing interface.\n  - We plan to release a Web UI for editing generated scenes in the browser, with synchronization between the scene, the underlying code, and Blender.\n  - This interface is intended to reduce both time and token cost compared with post-hoc correction inside the agent loop :)\n- [ ] 3D assets retrieval checkpoint release.\n  - Code-only geometry can be insufficient for representing fine-grained small objects in downstream applications such as robotics. We plan to release retrieval data and checkpoints to improve object-level realism and usability.\n- [ ] Support for more diverse room shapes.\n  - The current pipeline works best on rectangular or near-rectangular rooms. 
We plan to improve support for irregular room layouts.\n- [ ] Whole-floor-plan to 3D scene generation.\n  - The current release focuses on single-room reconstruction. We plan to extend the pipeline to handle multi-room floor plans.\n- [ ] Benchmark release.\n  - Building and scaling the benchmark requires substantial time and token cost. We plan to expand it beyond the current internal version, but it is resource-intensive. If you are interested in collaborating on the benchmark, please feel free to contact us.\n\n## Requirements\n\nSystem:\n\n- Python 3.10+\n- Blender 3.6+ or Blender 4.x\n- An OpenAI-compatible chat\u002FVLM API endpoint for the text and vision stages\n- Optional image-generation endpoint for Stage 11 texture generation\n\nPython packages:\n\n```bash\npip install langchain-openai langchain-core openai pillow requests\n```\n\nBlender supplies `bpy`, `bmesh`, and `mathutils`; install those by installing Blender, not with pip.\n\n## Setup\n\nClone the repository and install dependencies:\n\n```bash\ngit clone \u003Cyour-repo-url>\ncd Code-as-Room_github\npip install langchain-openai langchain-core openai pillow requests\n```\n\nConfigure your API credentials with environment variables:\n\n```bash\nexport SCENEGEN_MODEL=\"gemini-3.1-pro-preview-thinking\"\nexport SCENEGEN_BASE_URL=\"https:\u002F\u002Fyour-openai-compatible-endpoint\u002Fv1\"\nexport SCENEGEN_API_KEY=\"your-api-key\"\n\n# Optional: only needed for Stage 11 real texture generation.\nexport SCENEGEN_TEXTURE_MODEL=\"gemini-3-pro-image-preview\"\nexport SCENEGEN_TEXTURE_BASE_URL=\"https:\u002F\u002Fyour-image-generation-endpoint\"\nexport SCENEGEN_TEXTURE_API_KEY=\"your-texture-api-key\"\n```\n\nYou can also put these values in a JSON config file instead of environment variables. 
Copy `example\u002Fpipeline_config.example.json` to a local file, then edit `model`, `base_url`, `api_key`, `stage11_texture_model`, `stage11_texture_base_url`, and `stage11_texture_api_key` there.\n\n## Example Inputs\n\nThe `example\u002F` directory contains small inputs and a runnable config template:\n\n- `example\u002Fexample1.png`, `example\u002Fexample2.jpeg`, `example\u002Fexample3.png`: sample top-down room references.\n- `example\u002Fpipeline_config.example.json`: config-file template for `run_pipeline.py`.\n- `example\u002Frun_*\u002F`: generated example outputs. These show the expected stage folders and final scripts.\n\nThe `image_prompt_gen\u002F` directory contains the image-prompt workflow:\n\n- `image_prompt_gen\u002Ftopdown_room_image_generator.py`: generates top-down room image prompts and optionally calls an image-generation endpoint.\n- `image_prompt_gen\u002Fgenerated_prompts_example.json`: example prompt JSON that can be used as input to image generation.\n\n## Quick Start\n\nRun the full pipeline:\n\n```bash\npython run_pipeline.py --image example\u002Fexample1.png\n```\n\nBy default, output is written next to the input image:\n\n```text\nexample\u002Frun_YYYYMMDD_HHMMSS_example1\u002F\n```\n\nTo choose a different output parent directory:\n\n```bash\npython run_pipeline.py \\\n  --image example\u002Fexample1.png \\\n  --output-dir \u002Fpath\u002Fto\u002Foutput_root\n```\n\nThe final Blender script is:\n\n```text\n\u003Crun_dir>\u002Fstage12_render\u002Frender_output.py\n```\n\nRender or inspect it with Blender:\n\n```bash\n\u002FApplications\u002FBlender.app\u002FContents\u002FMacOS\u002FBlender \\\n  --python \u003Crun_dir>\u002Fstage12_render\u002Frender_output.py\n```\n\nIf Blender is not at the macOS path above, pass it to the pipeline:\n\n```bash\npython run_pipeline.py --image example\u002Fexample1.png --blender \u002Fpath\u002Fto\u002Fblender\n```\n\n## Config File\n\nInstead of passing many CLI flags, copy the example 
config:\n\n```bash\ncp example\u002Fpipeline_config.example.json my_config.json\n```\n\nEdit `my_config.json`, then run:\n\n```bash\npython run_pipeline.py --config my_config.json\n```\n\nThe API-related fields to edit are:\n\n- `model`, `base_url`, `api_key`: main OpenAI-compatible text\u002FVLM endpoint used by most stages.\n- `stage11_texture_model`, `stage11_texture_base_url`, `stage11_texture_api_key`: optional image-generation endpoint used by Stage 11 texture generation.\n- `image`: input image path.\n- `blender`: Blender executable path if your Blender is not at the default macOS location.\n\nCLI arguments override config-file values:\n\n```bash\npython run_pipeline.py --config my_config.json --start 5 --end 12\n```\n\nConfig keys use the CLI option names with underscores instead of hyphens, for example:\n\n```json\n{\n  \"image\": \"example\u002Fexample1.png\",\n  \"output_dir\": \"outputs\",\n  \"start\": 1,\n  \"end\": 12,\n  \"model\": \"gemini-3.1-pro-preview-thinking\",\n  \"base_url\": \"https:\u002F\u002Fyour-openai-compatible-endpoint\u002Fv1\",\n  \"api_key\": \"your-api-key\",\n  \"stage11_texture_model\": \"gemini-3-pro-image-preview\",\n  \"stage11_texture_base_url\": \"https:\u002F\u002Fyour-image-generation-endpoint\",\n  \"stage11_texture_api_key\": \"your-texture-api-key\",\n  \"wall_intensity\": \"subtle\"\n}\n```\n\nDo not commit real API keys.\n\n## Generate Top-Down Input Images\n\nThe repo includes a helper for creating synthetic top-down room images before running the 3D pipeline. 
It has three modes:\n\n- `prompt`: generate prompt JSON from a text\u002FVLM model.\n- `image`: generate images from an existing prompt JSON.\n- `all`: generate prompt JSON, then generate images.\n\nGenerate prompt JSON only:\n\n```bash\npython image_prompt_gen\u002Ftopdown_room_image_generator.py prompt \\\n  --count 20 \\\n  --model gpt-4o \\\n  --api-key \"$SCENEGEN_API_KEY\" \\\n  --base-url \"$SCENEGEN_BASE_URL\" \\\n  --scene-scope non_residential \\\n  --output image_prompt_gen\u002Fgenerated_prompts.json\n```\n\nGenerate images from an existing prompt file:\n\n```bash\npython image_prompt_gen\u002Ftopdown_room_image_generator.py image \\\n  --prompts image_prompt_gen\u002Fgenerated_prompts_example.json \\\n  --image-model gemini-3-pro-image-preview \\\n  --api-key \"$SCENEGEN_TEXTURE_API_KEY\" \\\n  --base-url \"$SCENEGEN_TEXTURE_BASE_URL\" \\\n  --output-dir generated_images\u002Fexample \\\n  --aspect-ratio 16:9 \\\n  --image-size 1K\n```\n\nRun both steps in one command:\n\n```bash\npython image_prompt_gen\u002Ftopdown_room_image_generator.py all \\\n  --count 20 \\\n  --prompt-model gpt-4o \\\n  --image-model gemini-3-pro-image-preview \\\n  --api-key \"$SCENEGEN_TEXTURE_API_KEY\" \\\n  --base-url \"$SCENEGEN_TEXTURE_BASE_URL\" \\\n  --scene-scope non_residential \\\n  --output image_prompt_gen\u002Fgenerated_prompts.json \\\n  --output-dir generated_images\u002Fnon_residential \\\n  --aspect-ratio 16:9 \\\n  --image-size 1K\n```\n\nThe prompt generator writes a JSON file with `metadata` and `prompts`. Each prompt item contains the original structured parameters plus the final image prompt string. The image generator writes PNGs named from the prompt id, room type, and style.\n\nImportant: `--base-url` for `prompt` mode is OpenAI-compatible chat style and may include `\u002Fv1`. 
`--base-url` for `image` mode is the image-generation proxy root; the script appends the Gemini `v1beta\u002Fmodels\u002F...:generateContent` path internally.\n\n## Small-Object Generation\n\nThe pipeline includes additional code generation for small objects beyond the original base-room layout:\n\n- Stage 4 uses Stage 1\u002F2 semantics to add wall-mounted or minor objects that may be absent from the base layout.\n- Stage 7 parses Stage 6 detailed geometry, finds usable support surfaces, and adds small objects grounded in the reference image.\n- Stage 7 writes `stage7_small_objects\u002Fsmall_objects.json` and an updated Blender script.\n- With `--detail-small-objects`, Stage 8 describes each added small object and Stage 9 replaces simple small-object primitives with compact composite geometry.\n- Stage 10\u002F11 then assign materials and texture integrations over the combined major-object and small-object scene.\n\nThis is useful for lab benches, desks, shelves, kitchen counters, office tables, and other clutter-heavy scenes.\n\n## Common Commands\n\nRun only the base scene:\n\n```bash\npython run_pipeline.py --image example\u002Fexample1.png --end 4\n```\n\nResume a previous run from Stage 5:\n\n```bash\npython run_pipeline.py \\\n  --image example\u002Fexample1.png \\\n  --run-dir \u003Crun_dir> \\\n  --start 5 \\\n  --end 12\n```\n\nRun only material, texture, and render stages after geometry is ready:\n\n```bash\npython run_pipeline.py \\\n  --image example\u002Fexample1.png \\\n  --run-dir \u003Crun_dir> \\\n  --start 10 \\\n  --end 12\n```\n\nDisable image compression:\n\n```bash\npython run_pipeline.py --image example\u002Fexample1.png --no-compress\n```\n\nSet the wall texture style:\n\n```bash\npython run_pipeline.py --image example\u002Fexample1.png --wall-intensity subtle\npython run_pipeline.py --image example\u002Fexample1.png --wall-intensity bold\npython run_pipeline.py --image example\u002Fexample1.png --wall-intensity mural_like\n```\n\nForce a 
scene type:\n\n```bash\npython run_pipeline.py --image example\u002Fexample1.png --scene-type lab\npython run_pipeline.py --image example\u002Fexample1.png --scene-type residential\n```\n\n## Batch Runs\n\nUse `batch_run_pipeline.py` to run `run_pipeline.py` over every image in a folder. This is the normal entry point for generated image batches.\n\nPreview the batch without running stages:\n\n```bash\npython batch_run_pipeline.py \\\n  --images-dir example \\\n  --label example \\\n  --model-tag local-test \\\n  --dry-run\n```\n\nRun a batch sequentially:\n\n```bash\npython batch_run_pipeline.py \\\n  --images-dir generated_images\u002Fnon_residential \\\n  --label non_residential \\\n  --model-tag gemini31 \\\n  --parallel 16 \\\n  --max-concurrent 1\n```\n\nRun multiple images at the same time:\n\n```bash\npython batch_run_pipeline.py \\\n  --images-dir generated_images\u002Fnon_residential \\\n  --label non_residential \\\n  --model-tag gemini31 \\\n  --parallel 16 \\\n  --max-concurrent 2\n```\n\nOutput is organized as:\n\n```text\n\u003Coutput-root>\u002F\u003Cmodel-tag>\u002F\u003Clabel>\u002Frun_YYYYMMDD_HHMMSS_\u003Cimage_stem>\u002F\n```\n\nBy default, `output-root` is:\n\n```text\nagent_utils\u002Fpipeline_output\u002F\n```\n\nUse a custom output root when running large batches:\n\n```bash\npython batch_run_pipeline.py \\\n  --images-dir generated_images\u002Fnon_residential \\\n  --output-root \u002Fpath\u002Fto\u002FCAR3D_output \\\n  --label non_residential \\\n  --model-tag gemini31 \\\n  --max-concurrent 2\n```\n\nBatch options to keep straight:\n\n- `--parallel`: internal Stage 6 geometry worker count for one pipeline run.\n- `--max-concurrent`: number of images\u002Fpipelines running at the same time.\n- `--label`: dataset or image class folder under the output bucket.\n- `--model-tag`: filesystem-safe model folder name. 
Use this even when the actual model name is long.\n- `--stop-on-error`: stop the batch after the first failed image.\n- `--quiet`: reduce per-stage logs. In parallel mode, each image writes its full log to `\u003Crun_dir>\u002Frun.log`.\n\nStart conservatively. On a laptop, `--max-concurrent 4` or `6` is usually safer because each pipeline may call LLM APIs and spawn Blender.\n\n## Run Management\n\nList historical runs under the default workspace output folder:\n\n```bash\npython run_pipeline.py --list-runs\n```\n\nShow memory status for a run:\n\n```bash\npython run_pipeline.py --status --run-dir \u003Crun_dir>\n```\n\nClear all memory for a run:\n\n```bash\npython run_pipeline.py --clear-memory --run-dir \u003Crun_dir>\n```\n\nClear one stage and rerun from there:\n\n```bash\npython run_pipeline.py --clear-stage stage7_small_objects --run-dir \u003Crun_dir>\npython run_pipeline.py --image example\u002Fexample1.png --run-dir \u003Crun_dir> --start 7\n```\n\n## Output Structure\n\nA typical run directory contains:\n\n```text\nrun_YYYYMMDD_HHMMSS_image\u002F\n├── agent_memory.jsonl\n├── run_config.json\n├── compressed_images\u002F\n├── stage1\u002F\n├── stage2\u002F\n├── stage3\u002F\n├── stage4\u002F\n├── stage5_describe\u002F\n├── stage6_geometry\u002F\n├── stage7_small_objects\u002F\n├── stage8_small_describe\u002F       # only when --detail-small-objects is used\n├── stage9_small_geometry\u002F       # only when --detail-small-objects is used\n├── stage10_material\u002F\n├── stage11_texture\u002F\n└── stage12_render\u002F\n```\n\nImportant final artifacts:\n\n- `stage12_render\u002Frender_output.py`: final render-ready Blender script.\n- `stage11_texture\u002Fimages\u002F`: generated texture maps.\n- `stage11_texture\u002Ftexture_manifest.json`: texture-generation manifest.\n- `stage10_material\u002Fmaterial_config.json`: generated material configuration.\n- `stage7_small_objects\u002Fsmall_objects.json`: added small objects and placement data.\n\n## 
License\n\nThis repository is licensed under the [Apache License 2.0](LICENSE).\n\n## Citation\n\nIf you find our work useful, please consider citing:\n\n```bibtex\n@article{yang2026codeasroom,\n  title={Code-as-Room: Generating 3D Rooms from Top-Down View Images via Agentic Code Synthesis},\n  author={Yang, Yixuan and Luo, Zhen and Gan, Wanshui and Hao, Jinkun and Lu, Junru and Yan, Jinghao and Lyu, Zhaoyang and Xu, Xudong},\n  journal={arXiv preprint arXiv:2605.18451},\n  year={2026}\n}\n```\n\n\u003C!-- ## Notes for Open Source Use\n\n- Keep API keys in environment variables or local config files that are not committed.\n- Generated outputs can be large. Consider ignoring run directories and generated images in downstream forks.\n- The pipeline depends on external LLM\u002FVLM and image-generation APIs; exact visual quality varies by model and endpoint.\n- Stage 11 texture generation can be skipped by ending at Stage 10 if you only need procedural\u002FPBR materials:\n\n```bash\npython run_pipeline.py --image example\u002Fexample1.png --end 10\n``` -->\n","Code-as-Room is a framework based on multimodal large language models that generates 3D rooms from top-down view images. Through a structured execution pipeline, it parses scene elements and their spatial relationships into executable Blender code covering geometry, materials, and lighting. Its core functionality uses LLM\u002FVLM stages to generate scene semantics, relation graphs, layout code, and object descriptions, while deterministic code handles orchestration, validation, repair, memory management, and code integration. It suits design or research scenarios that need to build 3D indoor environments automatically from floor plans, such as virtual reality and interior design.","2026-06-11 03:56:25","CREATED_QUERY"]