[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82165":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":12,"contributorsCount":12,"subscribersCount":12,"size":12,"stars1d":14,"stars7d":15,"stars30d":16,"stars90d":12,"forks30d":12,"starsTrendScore":17,"compositeScore":18,"rankGlobal":9,"rankLanguage":9,"license":9,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":9,"pushedAt":9,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":12,"starSnapshotCount":12,"syncStatus":26,"lastSyncTime":27,"discoverSource":28},82165,"MIGA","XiaokunFeng\u002FMIGA","XiaokunFeng","Accepted by ICML 2026～",null,"Python",127,0,29,4,35,98,18,72.3,false,"main",true,[],"2026-06-12 04:01:37","# MIGA: Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos\n\n\u003Cdiv align=\"center\">\n\n\n[![arXiv](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2507.11245-b31b1b?logo=arxiv&logoColor=red)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.18233)\n[![Project Page](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-Green)](https:\u002F\u002Fxiaokunfeng.github.io\u002Fmiga_homepage\u002F)\n\n\u003C\u002Fdiv>\n\n## Overview\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fassets\u002Foverview.jpg\" width=\"100%\"\u002F>\n\u003C\u002Fp>\n\n**MIGA** is a novel train-free method for infinite-frame video generation that addresses two key limitations of existing frame-level autoregressive frameworks (e.g., [FIFO-Diffusion](https:\u002F\u002Fgithub.com\u002Fjjihwan\u002FFIFO-Diffusion_public)):\n\n1. 
**Two-Stage Training-Inference Alignment (TTA):** Reduces the noise span of latents fed to the model during inference through zigzag iterative denoising (Stage 1) and unified noise-level denoising (Stage 2), effectively bridging the training-inference gap.\n\n2. **Dual Consistency Enhancement (DCE):** Promotes long-term temporal consistency through:\n   - *Self-Reflection*: Evaluates and corrects early high-noise frames via latent-space similarity analysis\n   - *Long-Range Frame Guidance*: Leverages distant low-noise frames to steer generation\n\nMIGA achieves state-of-the-art performance on VBench and NarrLV benchmarks while maintaining constant memory consumption.\n\n\n## Updates\n\n* **[2025\u002F05]** The MIGA paper has been accepted to ICML 2026.\n\n\n\n## Clone Repository\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FAMAP-ML\u002FMIGA.git\ncd MIGA\n```\n\nMIGA is implemented on two foundation models: **Wan2.1** and **VideoCrafter2**. Instructions for each are provided below.\n\n---\n\n## &#x2600;&#xfe0f; Start with [Wan2.1](https:\u002F\u002Fgithub.com\u002FWan-Video\u002FWan2.1)\n\n### 1. Environment Setup\n\n```bash\nconda create -n miga_wan python=3.10 -y\nconda activate miga_wan\n\npip install torch==2.4.0 torchvision==0.19.0 --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu121\npip install diffusers==0.31.0 transformers accelerate\npip install easydict einops imageio imageio-ffmpeg\npip install opencv-python pillow tqdm thop pyyaml\n```\n\n### 2. Download Models from Hugging Face\n\n| Model | Resolution | Checkpoint |\n|:------|:-----------|:-----------|\n| Wan2.1-1.3B | 480x832 | [Hugging Face](https:\u002F\u002Fhuggingface.co\u002FWan-AI\u002FWan2.1-T2V-1.3B) |\n\nStore them in the following structure:\n```\nWan2.1-T2V-1.3B\u002F\n  ├── Wan2.1_VAE.pth\n  ├── models_t5_umt5-xxl-enc-bf16.pth\n  └── ...\n```\n\n### 3. 
Run with Wan2.1\n\n```bash\ncd wan_based\n\npython generate.py \\\n    --ckpt_dir \u002Fpath\u002Fto\u002FWan2.1-T2V-1.3B \\\n    --miga_config ..\u002Fconfigs\u002Fwan2.1_1.3B.yaml \\\n    --prompt \"A fluffy Corgi dog trots happily across a lush green lawn. Its short legs and wagging tail convey pure delight as it explores the grassy expanse, a charming and energetic display of canine joy.\" \\\n    --save_dir .\u002Foutputs \\\n    --exp_name corgi_demo\n```\n\n---\n\n## &#x2600;&#xfe0f; Start with [VideoCrafter2](https:\u002F\u002Fgithub.com\u002FAILab-CVC\u002FVideoCrafter)\n\n### 1. Environment Setup\n\n```bash\nconda create -n miga_vc2 python=3.10 -y\nconda activate miga_vc2\n\npip install torch==2.1.0 torchvision==0.16.0 --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu121\npip install pytorch-lightning==1.9.0\npip install omegaconf einops decord imageio imageio-ffmpeg\npip install opencv-python pillow tqdm open-clip-torch\npip install transformers kornia pyyaml\n```\n\n### 2. Download Models from Hugging Face\n\n| Model | Resolution | Checkpoint |\n|:------|:-----------|:-----------|\n| VideoCrafter2 (Text2Video) | 320x512 | [Hugging Face](https:\u002F\u002Fhuggingface.co\u002FVideoCrafter\u002FVideoCrafter2\u002Fblob\u002Fmain\u002Fmodel.ckpt) |\n\nStore them in the following structure:\n```\nvideocrafter_models\u002F\n  └── base_512_v2\u002F\n      └── model.ckpt\n```\n\n### 3. Run with VideoCrafter2\n\n```bash\ncd videocraft_based\n\npython generate.py \\\n    --ckpt_path \u002Fpath\u002Fto\u002Fvideocrafter_models\u002Fbase_512_v2\u002Fmodel.ckpt \\\n    --miga_config ..\u002Fconfigs\u002Fvideocrafter2.yaml \\\n    --prompt \"An astronaut floating in space, high quality, 4K resolution.\" \\\n    --save_dir .\u002Foutputs \\\n    --exp_name astronaut_demo\n```\n\n---\n\n## Configuration\n\nMIGA hyperparameters are managed via YAML config files in `configs\u002F`. 
You can modify these files or override individual parameters via CLI arguments.\n\n| Parameter | Wan2.1 | VC2 | Description |\n|:----------|:------:|:---:|:------------|\n| `sampling_steps` | 54 | 64 | Total denoising steps *T* |\n| `saw_width` | 7 | 4 | Zigzag width *L_zig* in Stage 1 |\n| `long_iter_nums` | 20 | 30 | Number of generated latent chunks |\n| `temporal_memory_len` | 4 | 4 | Long-range guidance frames *m_guid* |\n| `involve_resample` | true | false | Enable self-reflection (DCE) |\n| `resample_threshold` | 0.001 | 0.05 | Consistency drop threshold *delta_adju* |\n\n**Generated video length:** `N = long_iter_nums x saw_width` latent frames. With the defaults above, this gives 20 x 7 = 140 latent frames for Wan2.1 and 30 x 4 = 120 for VC2.\n\n**Override example:**\n```bash\npython generate.py \\\n    --ckpt_dir \u002Fpath\u002Fto\u002Fmodel \\\n    --miga_config ..\u002Fconfigs\u002Fwan2.1_1.3B.yaml \\\n    --prompt \"Your prompt here\" \\\n    --long_iter_nums 40 \\\n    --involve_resample true\n```\n\n---\n\n## Citation\n\nIf you find this work useful, please cite our paper:\n\n```bibtex\n@inproceedings{feng2025miga,\n    title={Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos},\n    author={Xiaokun Feng and Mingze Wu and Hao Yu and Jitan Tan and Jie Xiao and Fangyuan Zhao and Jiaxu Miao and Kaiwen Hu and Jie Wu and Xiangyang Chu and Ke Huang},\n    booktitle={ICML},\n    year={2025}\n}\n```\n\n\n## Acknowledgements\n\nOur codebase builds on [VideoCrafter2](https:\u002F\u002Fgithub.com\u002FAILab-CVC\u002FVideoCrafter), [Wan2.1](https:\u002F\u002Fgithub.com\u002FWan-Video\u002FWan2.1), and [FIFO-Diffusion](https:\u002F\u002Fgithub.com\u002Fjjihwan\u002FFIFO-Diffusion_public).\nWe appreciate their excellent work!\n","MIGA is a novel train-free method for infinite-frame video generation that addresses two key limitations of existing frame-level autoregressive frameworks. It uses Two-Stage Training-Inference Alignment (TTA) to reduce the noise span of latents fed to the model during inference, and Dual Consistency Enhancement (DCE), which comprises self-reflection and long-range frame guidance, to promote long-term temporal consistency. MIGA performs strongly on the VBench and NarrLV benchmarks while maintaining constant memory consumption, which suits video generation scenarios that require high quality and long-term consistency. The project is written in Python and supports two foundation models, Wan2.1 and VideoCrafter2.",2,"2026-06-11 04:07:55","CREATED_QUERY"]