[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82734":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":10,"languages":10,"totalLinesOfCode":10,"stars":11,"forks":12,"watchers":13,"openIssues":12,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":14,"stars7d":15,"stars30d":16,"stars90d":14,"forks30d":14,"starsTrendScore":17,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":19,"hasPages":19,"topics":21,"createdAt":10,"pushedAt":10,"updatedAt":22,"readmeContent":23,"aiSummary":24,"trendingCount":14,"starSnapshotCount":14,"syncStatus":12,"lastSyncTime":25,"discoverSource":26},82734,"GenClaw","yejy53\u002FGenClaw","yejy53","GenClaw: Code-Driven Agentic Image Generation","",null,218,2,9,0,64,108,14,68.43,false,"main",[],"2026-06-12 04:01:38","# GenClaw: Code-Driven Agentic Image Generation\n\n[![Paper](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPaper-Hugging%20Face-yellow)](https:\u002F\u002Fhuggingface.co\u002Fpapers\u002F2605.30248)\n[![arXiv](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.30248-b31b1b)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.30248)\n[![GitHub stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fyejy53\u002FGenClaw?style=social)](https:\u002F\u002Fgithub.com\u002Fyejy53\u002FGenClaw)\n\nGenClaw explores **code-driven agentic image generation**: instead of only rewriting prompts, an image generation agent uses code as a controllable visual canvas before calling image generation models for final rendering.\n\nThe core idea is simple: **think, sketch with code, then render**.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fteaser.jpg\" width=\"95%\">\n\u003C\u002Fp>\n\n## Highlights\n\n🎨 **Code as a Visual Brush.** The agent creates by writing executable visual sketches—SVG, HTML\u002FCSS, Python, lightweight 3D code—turning object count, spatial 
layout, and text rendering into executable, verifiable, debuggable programs. Image synthesis shifts from implicit diffusion sampling to an explicit, reasoning-friendly process.\n\n✋ **Draw as a Human Artist.** We mirror the human creative loop—conceptualize → sketch → coloring → refine—and make every stage transparent: ideation, reference retrieval, drafting, and incremental rendering are all surfaced as inspectable, editable, revertible artifacts. Generation becomes an iterative collaboration rather than one-shot black-box inference.\n\n🔌 **Agent Harness for Image Generation.** We plug an LLM agent's proven planning, tool-use, and reflection abilities directly into image synthesis, exploring an agent harness for image generation—so that creating images becomes a first-class capability inside the agent's toolbox, not an isolated standalone model.\n\n## Showcase\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fshowcase.jpg\" width=\"95%\">\n\u003C\u002Fp>\n\n## Visual Examples\n\n### Complex Scene Composition\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fcomplex_scene.jpg\" width=\"95%\">\n\u003C\u002Fp>\n\n### Text Rendering and Poster Design\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Ftext_rendering.jpg\" width=\"95%\">\n\u003C\u002Fp>\n\n### Physical Reasoning\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fphysical_reasoning.jpg\" width=\"95%\">\n\u003C\u002Fp>\n\n### Knowledge-Grounded Generation\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fmind_bench.jpg\" width=\"95%\">\n\u003C\u002Fp>\n\n## Status\n\nThe technical report is available now. 
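Code and demos are being prepared and will be released later.\n\n## Links\n\n- Paper: https:\u002F\u002Fhuggingface.co\u002Fpapers\u002F2605.30248\n- arXiv: https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.30248\n- Repository: https:\u002F\u002Fgithub.com\u002Fyejy53\u002FGenClaw\n\nIf you find this project interesting, please consider giving it a star and voting for the paper on Hugging Face.\n\n## Citation\n\nIf you find GenClaw useful, please consider citing our technical report:\n\n```bibtex\n@article{ye2026genclaw,\n  title={GenClaw: Code-Driven Agentic Image Generation},\n  author={Ye, Junyan and others},\n  journal={arXiv preprint arXiv:2605.30248},\n  year={2026}\n}\n```\n","GenClaw is a project exploring code-driven image generation. Its core capability is controlling the image generation process by writing executable visual sketches (such as SVG, HTML\u002FCSS, and Python), turning object count, spatial layout, and text rendering into executable, verifiable, and debuggable programs. This approach shifts image synthesis from implicit diffusion sampling to an explicit, reasoning-friendly process. GenClaw suits image generation scenarios that require a high degree of controllability and transparency, such as complex scene composition, poster design, and knowledge-grounded image generation. By integrating the capabilities of large language models directly into image generation, the project aims to make image creation a first-class capability in the agent's toolbox rather than an isolated standalone model.","2026-06-11 04:09:05","CREATED_QUERY"]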