[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80135":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":13,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":15,"stars30d":15,"stars90d":14,"forks30d":14,"starsTrendScore":16,"compositeScore":17,"rankGlobal":8,"rankLanguage":8,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":8,"pushedAt":8,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":14,"starSnapshotCount":14,"syncStatus":15,"lastSyncTime":26,"discoverSource":27},80135,"ram","AndreasBergmeister\u002Fram","AndreasBergmeister",null,"Python",57,3,53,4,0,2,6,1.81,"MIT License",false,"main",true,[],"2026-06-12 02:03:58","# Reinforce Adjoint Matching (RAM)\n\n**Scaling RL post-training of diffusion and flow-matching models.**\n\n[![arXiv](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.10759-b31b1b.svg)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.10759)\n[![Blog](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FBlog-bergmeister.ai-1f6feb.svg)](https:\u002F\u002Fbergmeister.ai\u002Fram\u002F)\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fzebra_pair.png\" width=\"720\" alt=\"SD3.5M (left) vs. RAM-post-trained SD3.5M (right) on the GenEval prompt 'a red zebra'\"\u002F>\u003Cbr\u002F>\n  \u003Csub>\u003Cem>Prompt:\u003C\u002Fem> \"a red zebra\". \u003Cstrong>Left:\u003C\u002Fstrong> pretrained SD3.5M. 
\u003Cstrong>Right:\u003C\u002Fstrong> SD3.5M post-trained with RAM on GenEval.\u003C\u002Fsub>\n\u003C\u002Fp>\n\nCode for the three text-to-image experiments in the paper: post-training **Stable Diffusion 3.5 Medium** with RAM on **GenEval** (compositional generation), **OCR** (visual text rendering), and **PickScore** (human preference).\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Ftraining_speed.png\" width=\"780\" alt=\"Training-reward curves: RAM vs. Flow-GRPO on GenEval, OCR, and PickScore\"\u002F>\n\u003C\u002Fp>\n\n---\n\n## Installation\n\nWe ran our experiments on a cluster with 4× H100 96GB.\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FAndreasBergmeister\u002Fram.git\ncd ram\npython -m venv .venv && source .venv\u002Fbin\u002Factivate\npip install -e .\n```\n\nAll dependency versions are pinned in `pyproject.toml`. Note that `paddlepaddle` and `onedl-mmcv` are CUDA-specific; the pinned versions target a CUDA 12.x environment.\n\n> Run all commands below from the repository root — the scripts and the GenEval reward use repo-relative paths (`configs\u002F`, `prompts\u002F`, `models\u002F`).\n\n---\n\n## Training\n\nTraining is launched with [`accelerate`](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Faccelerate). Each config in `configs\u002F` reproduces one of the three paper experiments:\n\n```bash\n# Compositional generation (GenEval)\naccelerate launch scripts\u002Ftraining_sd3.py geneval_sd3\n\n# Visual text rendering (OCR)\naccelerate launch scripts\u002Ftraining_sd3.py ocr_sd3\n\n# Human-preference alignment (PickScore)\naccelerate launch scripts\u002Ftraining_sd3.py pickscore_sd3\n```\n\nOutputs (a copy of the resolved config + checkpoints) land in `outputs\u002F\u003Cconfig_name>\u002F`. 
Override with `--output_dir \u003Cpath>`.\n\nPer-key config overrides on the command line:\n\n```bash\naccelerate launch scripts\u002Ftraining_sd3.py geneval_sd3 --lr 1e-4 --reward_multiplier 50\n```\n\nThe configs specify epoch sizes (`num_prompts_per_epoch`, etc.) as **totals across all processes**, so the same config reproduces the paper on any number of GPUs. The per-process slice is derived at launch time.\n\n### Resuming\n\nRe-run the same command. If `\u003Coutput_dir>\u002Flatest` exists, training continues from there; otherwise it starts fresh.\n\n### W&B logging (optional)\n\nSet `wandb_project` in the config (or pass `--wandb_project \u003Cname>`) to log metrics and validation samples to Weights & Biases.\n\n---\n\n## Evaluation\n\nEvaluate a trained checkpoint on the held-out task prompts plus DrawBench image-quality metrics:\n\n```bash\naccelerate launch scripts\u002Fevaluate.py \\\n    --checkpoint outputs\u002Fgeneval_sd3\u002Flatest \\\n    --rewards '[Geneval]' \\\n    --drawbench_metrics ImageReward AestheticScore HPSv2 PickScore DeQAScore\n```\n\nThe training config that produced the checkpoint is loaded automatically from `\u003Ccheckpoint>\u002F..\u002Fconfig.yaml`.\n\nTo evaluate the pretrained baseline (no LoRA), use `--model_name` instead of `--checkpoint`:\n\n```bash\naccelerate launch scripts\u002Fevaluate.py \\\n    --model_name stabilityai\u002Fstable-diffusion-3.5-medium \\\n    --prompts geneval \\\n    --rewards '[Geneval]'\n```\n\nResults print to stdout as `RESULT \u003Ckey>=\u003Cvalue>` lines.\n\n### Reward-scorer weights\n\nMost scorers fetch their weights automatically on first use. 
Four need to be downloaded by hand and placed under `models\u002F`:\n\n| File | Used by | Source |\n| --- | --- | --- |\n| `models\u002Fsac+logos+ava1-l14-linearMSE.pth` | `AestheticScore` | [LAION aesthetic predictor](https:\u002F\u002Fgithub.com\u002Fchristophschuhmann\u002Fimproved-aesthetic-predictor) |\n| `models\u002FHPS_v2.1_compressed.pt` | `HPSv2` | [HPSv2 release](https:\u002F\u002Fgithub.com\u002Ftgxs002\u002FHPSv2) |\n| `models\u002Fopen_clip_pytorch_model.bin` | `HPSv2` (OpenCLIP backbone) | [OpenCLIP ViT-H-14](https:\u002F\u002Fhuggingface.co\u002Flaion\u002FCLIP-ViT-H-14-laion2B-s32B-b79K) |\n| `models\u002Fmask2former_swin-s-p4-w7-224_8xb2-lsj-50e_coco_20220504_001756-c9d0c4f2.pth` | `Geneval` | [MMDetection](https:\u002F\u002Fgithub.com\u002Fopen-mmlab\u002Fmmdetection) |\n\n---\n\n## Repository layout\n\n```\nram\u002F\n├── reward_models\u002F           # training rewards (Geneval, OCR, PickScore) + eval metrics (Aesthetic, HPSv2, ImageReward, DeQA)\n├── scripts\u002F\n│   ├── training_sd3.py      # RAM training entry point\n│   └── evaluate.py          # task + DrawBench evaluation\n├── configs\u002F                 # one YAML per paper experiment\n└── prompts\u002F                 # benchmark prompt sets (geneval \u002F ocr \u002F pickscore \u002F drawbench)\n```\n\nThe RAM training algorithm itself lives in `scripts\u002Ftraining_sd3.py` — it is self-contained and the natural starting point for reading the code.\n\n---\n\n## Citation\n\n```bibtex\n@misc{bergmeister2026ram,\n  title         = {Reinforce Adjoint Matching: Scaling RL Post-Training of Diffusion and Flow-Matching Models},\n  author        = {Bergmeister, Andreas and Jegelka, Stefanie and N{\\\"u}sken, Nikolas and Domingo-Enrich, Carles and Pidstrigach, Jakiw},\n  year          = {2026},\n  eprint        = {2605.10759},\n  archivePrefix = {arXiv},\n  primaryClass  = {cs.LG}\n}\n```\n\n---\n\n## Acknowledgements\n\nThe GenEval and OCR reward implementations are adapted from 
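[Flow-GRPO](https:\u002F\u002Fgithub.com\u002Fyifan123\u002Fflow_grpo) and [DiffusionNFT](https:\u002F\u002Fgithub.com\u002Fzhengkw18\u002FDiffusionNFT). The HPSv2, ImageReward, and PickScore wrappers re-use the official scorer packages. The aesthetic predictor comes from the [LAION improved-aesthetic-predictor](https:\u002F\u002Fgithub.com\u002Fchristophschuhmann\u002Fimproved-aesthetic-predictor).\n","This project improves the performance of diffusion and flow-matching models through reinforcement-learning post-training. Its core functionality is post-training Stable Diffusion 3.5 Medium with Reinforce Adjoint Matching (RAM) to improve results on GenEval (compositional generation), visual text rendering, and human-preference alignment. Training is launched with the `accelerate` library and supports Weights & Biases logging for monitoring and analysis. It suits applications that need higher image-generation quality or better model performance in a specific domain, such as creative design, OCR system optimization, and personalized recommendation services.","2026-06-11 03:59:24","CREATED_QUERY"]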