[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72421":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":22,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":25,"readmeContent":26,"aiSummary":27,"trendingCount":15,"starSnapshotCount":15,"syncStatus":28,"lastSyncTime":29,"discoverSource":30},72421,"ScreenCoder","leigest519\u002FScreenCoder","leigest519","ScreenCoder — Turn any UI screenshot into clean, editable HTML\u002FCSS with full control. Fast, accurate, and easy to customize.","",null,"Python",2687,265,12,0,6,9,22,18,29.27,"Apache License 2.0",false,"main",[],"2026-06-12 02:03:03","# ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents\n\n\u003Cdiv align=\"center\">\n\nYilei Jiang\u003Csup>1*\u003C\u002Fsup>, Yaozhi Zheng\u003Csup>1*\u003C\u002Fsup>, Yuxuan Wan\u003Csup>2*\u003C\u002Fsup>, Jiaming Han\u003Csup>1\u003C\u002Fsup>, Qunzhong Wang\u003Csup>1\u003C\u002Fsup>,  \nMichael R. 
Lyu\u003Csup>2\u003C\u002Fsup>, Xiangyu Yue\u003Csup>1✉\u003C\u002Fsup>  \n\u003Cbr>\n\u003Csup>1\u003C\u002Fsup>CUHK MMLab, \u003Csup>2\u003C\u002Fsup>CUHK ARISE Lab  \n\u003Cbr>\n\u003Csup>*\u003C\u002Fsup>Equal contribution  \u003Csup>✉\u003C\u002Fsup>Corresponding author\n\n\u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.22827\">\n    \u003Cimg\n      src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-Paper-red?logo=arxiv&logoColor=red\"\n      alt=\"Paper on arXiv\"\n    \u002F>\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FJimmyzheng-10\u002FScreenCoder\">\n    \u003Cimg \n        src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FHF-Demo-yellow?logo=huggingface&logoColor=yellow\" \n        alt=\"Huggingface Demo\"\n    \u002F>\n  \u003C\u002Fa>\n\u003C\u002Fdiv>\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"teaser.jpg\" width=\"100%\"\u002F>\n  \n\u003C\u002Fdiv>\n\n## Introduction\n\n**ScreenCoder** is an intelligent UI-to-code generation system that transforms any screenshot or design mockup into clean, production-ready HTML\u002FCSS code. Built with a modular multi-agent architecture, it combines visual understanding, layout planning, and adaptive code synthesis to produce accurate and editable front-end code.\n\nIt also supports customized modifications, allowing developers and designers to tweak layout and styling with ease. 
Whether you're prototyping quickly or building pixel-perfect interfaces, ScreenCoder bridges the gap between design and development: just copy, customize, and deploy.\n\n## News\n- We have released the post-training code (SFT + RL) used to align ScreenCoder.\n- We have also released ScreenBench (https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FLeigest\u002FScreenCoder), a novel benchmark for visual-to-code\u002Fweb UI generation. It comprises 1,000 up-to-date web screenshots sampled from real-world sites, with the corresponding HTML source code, covering diverse topics.\n\n## Hugging Face Demo\n- Try our Hugging Face demo at [Demo](https:\u002F\u002Fhuggingface.co\u002Fspaces\u002FJimmyzheng-10\u002FScreenCoder)\n\n- Run the demo locally (download it from the Hugging Face Space):\n\n  ```bash\n  python app.py\n  ```\n  \n## Demo Videos\n\nA showcase of how **ScreenCoder** transforms UI screenshots into structured, editable HTML\u002FCSS code using a modular multi-agent framework.\n\n### YouTube Page\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F5d4c0808-76b8-4eb3-b333-79d0ac690189\n\n### Instagram Page\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F9819d559-863e-4126-8506-1eccaa806df0\n\n### Design Draft (allows customized modifications!)\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fd2f26583-4649-4b6d-8072-b11cd1025f4b\n\n## Qualitative Comparisons\n\nWe present qualitative examples to illustrate the improvements achieved by our method over existing approaches. 
The examples below compare the output of a baseline method with ours on the same input.\n\n### Baseline or Other Method\n\n![Other Method Output](example_others.jpeg)\n\n### Our Method\n\n![Our Method Output](example_ours.jpeg)\n\nAs shown above, our method produces results that are more accurate, visually aligned, and semantically faithful to the original design.\n\n## Project Structure\n- `main.py`: The main script to generate final HTML code for a single screenshot.\n- `UIED\u002F`: Contains the UIED (UI Element Detection) engine for analyzing screenshots and detecting components.\n  - `run_single.py`: Python script to run UI component detection on a single image.\n- `html_generator.py`: Takes the detected component data and generates a complete HTML layout with generated code for each module.\n- `image_replacer.py`: A script to replace placeholder divs in the final HTML with actual cropped images.\n- `mapping.py`: Maps the detected UIED components to logical page regions.\n- `requirements.txt`: Lists all the necessary Python dependencies for the project.\n- `doubao_api.txt`: API key file for the Doubao model (should be kept private and is included in `.gitignore`).\n\n## Setup and Installation\n\n1.  **Clone the repository:**\n    ```bash\n    git clone https:\u002F\u002Fgithub.com\u002Fleigest519\u002FScreenCoder.git\n    cd ScreenCoder\n    ```\n\n2.  **Create a virtual environment:**\n    ```bash\n    python3 -m venv .venv\n    source .venv\u002Fbin\u002Factivate\n    ```\n\n3.  **Install dependencies:**\n    ```bash\n    pip install -r requirements.txt\n    ```\n\n4.  **Configure the model and API key:**\n    - ***Choose a generation model***: Set the desired model in `block_parsor.py` and `html_generator.py`. 
Supported options: Doubao (default), Qwen, GPT, Gemini.\n    - ***Add the API key***: Create a plain-text file in the project root directory that corresponds to your selected model (`doubao_api.txt`, `qwen_api.txt`, `gpt_api.txt`, or `gemini_api.txt`), and paste your API key inside.\n\n## Usage\n\nThe typical workflow is a multi-step process:\n\n1.  **Initial Generation with Placeholders:**\n    Run the Python scripts to generate the initial HTML code for a given screenshot.\n    - Block Detection:\n      ```bash\n      python block_parsor.py\n      ```\n    - Generation with Placeholders (Gray Image Blocks):\n      ```bash\n      python html_generator.py\n      ```\n\n2.  **Final HTML Code:**\n    Run the Python scripts to generate the final HTML code with cropped images from the original screenshot.\n    - Placeholder Detection:\n      ```bash\n      python image_box_detection.py\n      ```\n    - UI Element Detection:\n      ```bash\n      python UIED\u002Frun_single.py\n      ```\n    - Mapping Alignment Between Placeholders and UI Elements:\n      ```bash\n      python mapping.py\n      ```\n    - Placeholder Replacement:\n      ```bash\n      python image_replacer.py\n      ```\n\n3.  **Simple Run:**\n    Run the Python script to generate the final HTML code in one step:\n    ```bash\n    python main.py\n    ```\n\n## More Projects on MLLM for Web\u002FCode Generation\n- [WebPAI (Web Development Powered by AI)](https:\u002F\u002Fgithub.com\u002FWebPAI) has released a set of research resources and datasets for webpage generation studies, aiming to build an AI platform for more reliable and practical automated webpage generation.\n\n- [Awesome-Multimodal-LLM-for-Code](https:\u002F\u002Fgithub.com\u002Fxjywhu\u002FAwesome-Multimodal-LLM-for-Code) maintains a comprehensive list of papers on methods, benchmarks, and evaluation for code generation under multimodal scenarios.\n\n\n## Acknowledgements\n\nThis project builds upon several outstanding open-source efforts. 
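We would like to thank the authors and contributors of the following projects: [UIED](https:\u002F\u002Fgithub.com\u002FMulongXie\u002FUIED), [DCGen](https:\u002F\u002Fgithub.com\u002FWebPAI\u002FDCGen), and [Design2Code](https:\u002F\u002Fgithub.com\u002FNoviScl\u002FDesign2Code).\n\n\n","ScreenCoder is an intelligent system that converts any UI screenshot or design mockup into clean, editable HTML\u002FCSS code. Built on a modular multi-agent architecture, it integrates visual understanding, layout planning, and adaptive code synthesis to generate accurate, easily customizable front-end code. The tool lets users adjust layout and styling to their own needs, which makes it well suited to rapid prototyping and to development scenarios that require precise control over interface details. Whether the goal is to speed up the design-to-code process or to improve front-end development efficiency, ScreenCoder provides strong support.",2,"2026-06-11 03:41:58","high_star"]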