[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82272":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":12,"contributorsCount":12,"subscribersCount":12,"size":12,"stars1d":12,"stars7d":12,"stars30d":14,"stars90d":12,"forks30d":12,"starsTrendScore":12,"compositeScore":15,"rankGlobal":9,"rankLanguage":9,"license":16,"archived":17,"fork":17,"defaultBranch":18,"hasWiki":17,"hasPages":17,"topics":19,"createdAt":9,"pushedAt":9,"updatedAt":20,"readmeContent":21,"aiSummary":22,"trendingCount":12,"starSnapshotCount":12,"syncStatus":23,"lastSyncTime":24,"discoverSource":25},82272,"ECHO","midea-ai\u002FECHO","midea-ai","Official Repository for \"ECHO:Efficient Chest X-ray Report Generation with One-step Block Diffusion\"",null,"Python",53,0,1,19,41.9,"Other",false,"main",[],"2026-06-12 04:01:37","\u003Cdiv align=\"center\">\n  \u003Cimg src=\".\u002Fassets\u002FECHO_repo_head.png\" alt=\"ECHO\" width=\"75%\">\n  \u003Cbr>\u003Cbr>\n  \u003Ca href=\".\u002FLICENSE\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Midea%20NC-blue.svg\" alt=\"License: Midea Non-Commercial\">\u003C\u002Fa>\n  &nbsp;\n  \u003Ca href=\"https:\u002F\u002Fecho-midea-airc.github.io\u002F\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-ECHO-blue.svg\" alt=\"Website: ECHO\">\u003C\u002Fa>\n  &nbsp;\n  \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FMidea-AIRC\u002Fecho\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FHuggingFace-ECHO-yellow.svg\" alt=\"Hugging Face: ECHO\">\u003C\u002Fa>\n  &nbsp;\n  \u003Ca href=\"https:\u002F\u002Fecho-midea-airc.github.io\u002F\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTechnical%20Report-arXiv-red.svg\" alt=\"Technical Report: arXiv\">\u003C\u002Fa>\n  \u003Cbr>\u003Cbr>\n\u003C\u002Fdiv>\n\nECHO 
(Efficient Chest X-ray Report Generation with One-step Block Diffusion) is a discrete diffusion vision–language model for automated chest X-ray report generation. It converts a pretrained autoregressive model into a one-step-per-block decoder via Response-Asymmetric Diffusion (RAD) adaptation and Direct Conditional Distillation (DCD). DCD constructs non-factorized supervision from on-policy teacher trajectories, enabling coherent single-step decoding that was previously unachievable in discrete diffusion models. ECHO surpasses state-of-the-art autoregressive and diffusion-based methods while achieving up to an 8× inference speedup.\n\nHighlights:\n\n- 🏥 **State-of-the-Art Chest X-ray Report Generation** — surpasses both AR and diffusion-based SOTA, with large margins on clinical fidelity metrics (RaTEScore, SemScore)\n- ⚡ **8× Inference Speedup** — one-step-per-block decoding via DCD distillation, with minimal quality degradation\n- 🌐 **Bilingual** — supports both English and Chinese prompts and outputs for CXR report generation  \n\n\n\n\u003C!-- Below: add Overview, installation, model table, training, evaluation, and citation (see README.md). Replace arXiv href with paper abs when available. -->\n\n## 🔥 Motivations\n\n\u003Cp align=\"justify\">Discrete diffusion language models approximate the joint token distribution through token factorization, treating each position as conditionally independent. This approximation ignores inter-token dependencies, requiring multi-step remasking to progressively recover output coherence. Each additional step, however, incurs an extra model forward pass, increasing inference latency and creating a fundamental quality–speed dilemma. 
ECHO resolves this through Direct Conditional Distillation (DCD), which constructs non-factorized supervision from the teacher's on-policy multi-step trajectories, enabling the student to capture joint token dependencies in a single forward pass per block — achieving multi-step quality at single-step speed.\u003C\u002Fp>\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\".\u002Fassets\u002Frepo_motivation.png\" alt=\"ECHO Motivation\" width=\"95%\">\n\u003C\u002Fdiv>\n\n\u003Cp align=\"justify\">(a) Decoding all tokens simultaneously in one step produces incoherent outputs, as standard diffusion models predict each position independently. Our Direct Conditional Distillation (DCD) distills from a non-factorized target, yielding coherent one-step-per-block outputs. (b) Compared to both autoregressive and diffusion-based baselines, ECHO achieves a favorable trade-off between generation quality (SemScore) and decoding throughput (tokens per forward pass).\u003C\u002Fp>\n\n## 🗺️ Roadmap\n\n- [x] Inference code\n- [x] Evaluation code\n- [x] Model weights (HuggingFace)\n- [ ] Training scripts — coming soon\n\n## ⚙️ Usage\n\n### Inference\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fmidea-ai\u002FECHO.git\ncd .\u002FECHO\u002Finference\npip install transformers==4.55.4\npython generate_echo.py\n```\n\nTwo inference scripts are provided under `inference\u002F`:\n\n- **`generate_echo.py`** — single-image inference for distilled ECHO models (`ECHO_block4\u002F8`), with fused single-step decoding support.\n- **`generate_vl_block.py`** — single-image inference for multi-step base models (`ECHO_Base_block4\u002F8`), supporting configurable denoising steps and remasking strategies.\n\n```bash\n# ECHO (single-step, distilled)\npython inference\u002Fgenerate_echo.py \\\n  --model_dir Midea-AIRC\u002FECHO_block4 \\\n  --image_path \u002Fpath\u002Fto\u002Fimage.jpg \\\n  --prompt_text \"Review this chest X-ray and write a report.Use this format: Findings: {}, Impression: 
{}.\" \\\n  --block_length 4 \\\n  --denoising_steps 1\n\n# ECHO_Base (multi-step)\npython inference\u002Fgenerate_vl_block.py \\\n  --model_dir Midea-AIRC\u002FECHO_Base_block4 \\\n  --image_path \u002Fpath\u002Fto\u002Fimage.jpg \\\n  --prompt_text \"这是一组胸部X光图像，请生成一份医学报告，包括所见和结论。以以下格式返回报告：所见：{} 结论：{}。\" \\\n  --remasking_strategy \"low_confidence_dynamic\" \\\n  --block_length 4 \\\n  --denoising_steps 4\n```\n\n### Evaluation\n\nSee [`eval\u002FREADME.md`](eval\u002FREADME.md) for environment setup, batch inference, metric evaluation, and speed profiling.\n\n## 🗂️ Model Zoo\n\nAll checkpoints live in the **[ECHO collection](https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FMidea-AIRC\u002Fecho)** on Hugging Face.\n\n| Model | Stage | Description | Link |\n|------|--------|-------------|------|\n| `ECHO_Base_block4` | RAD | Multi-step block diffusion (block length 4), teacher for distillation | [ECHO_Base_block4](https:\u002F\u002Fhuggingface.co\u002FMidea-AIRC\u002FECHO_Base_block4) |\n| `ECHO_Base_block8` | RAD | Multi-step block diffusion (block length 8), teacher for distillation | [ECHO_Base_block8](https:\u002F\u002Fhuggingface.co\u002FMidea-AIRC\u002FECHO_Base_block8) |\n| `ECHO_block4` | DCD | Single-step distilled student (block length 4) | [ECHO_block4](https:\u002F\u002Fhuggingface.co\u002FMidea-AIRC\u002FECHO_block4) |\n| `ECHO_block8` | DCD | Single-step distilled student (block length 8) | [ECHO_block8](https:\u002F\u002Fhuggingface.co\u002FMidea-AIRC\u002FECHO_block8) |\n\n> Both the **code** and **model weights** in this repository are released under the [Midea Model License Agreement - Non-Commercial Use Version](LICENSE). Use for research, study, and personal non-commercial purposes only. 
Commercial use is strictly prohibited.\n\n\n## 👏 Acknowledgements\n\nWe thank the following works ([SDAR](https:\u002F\u002Fjetastra.github.io\u002FSDAR\u002F), [Lingshu](https:\u002F\u002Falibaba-damo-academy.github.io\u002Flingshu\u002F), [BD3LM](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.09573)) for providing important model foundations for ECHO.\n\nWe also thank the following datasets ([MIMIC-CXR](https:\u002F\u002Fphysionet.org\u002Fcontent\u002Fmimic-cxr\u002F2.1.0\u002F), [ReXGradient](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Frajpurkarlab\u002FReXGradient-160K), [CheXpert Plus](https:\u002F\u002Faimi.stanford.edu\u002Fdatasets\u002Fchexpert-plus), and [IU X-ray](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fdz-osamu\u002FIU-Xray)) for providing important dataset foundations for ECHO.\n\n## 📬 Contact\n\nFor issues or inquiries:\n\n- **Lifeng Chen**, Beijing Jiaotong University (lfchen@bjtu.edu.cn)\n- **Hao Liu** (Corresponding Author), AI Research Center, Midea Group (liuhao249@midea.com)\n\n## 🔬 Citation\n\n```\n@misc{chen2026echoefficientchestxray,\n      title={ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion}, \n      author={Lifeng Chen and Tianqi You and Hao Liu and Zhimin Bao and Jile Jiao and Xiao Han and Zhicai Ou and Tao Sun and Xiaofeng Mou and Xiaojie Jin and Yi Xu},\n      year={2026},\n      eprint={2604.09450},\n      archivePrefix={arXiv},\n      primaryClass={cs.LG},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.09450}, \n}\n```\n","ECHO is a vision-language model for automated chest X-ray report generation. It converts a pretrained autoregressive model into a one-step decoder through Response-Asymmetric Diffusion (RAD) adaptation and Direct Conditional Distillation (DCD), substantially increasing inference speed while maintaining high-quality report generation. The model supports bilingual input and output in Chinese and English, surpasses existing autoregressive and diffusion-based methods on clinical accuracy metrics, and achieves up to an 8× inference speedup. ECHO is suited to medical settings that require fast, accurate chest X-ray reports, such as routine radiology diagnosis or telemedicine services.",2,"2026-06-11 04:08:12","CREATED_QUERY"]