[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72475":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":13,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":14,"forks30d":14,"starsTrendScore":16,"compositeScore":18,"rankGlobal":8,"rankLanguage":8,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":22,"topics":23,"createdAt":8,"pushedAt":8,"updatedAt":24,"readmeContent":25,"aiSummary":26,"trendingCount":14,"starSnapshotCount":14,"syncStatus":15,"lastSyncTime":27,"discoverSource":28},72475,"Index-anisora","bilibili\u002FIndex-anisora","bilibili",null,"Python",2457,144,27,42,0,2,6,31,65.58,"Apache License 2.0",false,"main",true,[],"2026-06-12 04:01:06","\u003Cdiv align=\"center\">\n\u003Cimg src=\"assets\u002Findex_icon.png\" width=\"250\"\u002F>\n\n\u003C\u002Fdiv>\n\u003Cp align=\"center\">\n       🖥️  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fbilibili\u002FIndex-anisora\u002Ftree\u002Fmain\">GitHub\u003C\u002Fa> &nbsp&nbsp  |  &nbsp&nbsp🤗 \u003Ca href=https:\u002F\u002Fhuggingface.co\u002FIndexTeam\u002FIndex-anisora>Hugging Face\u003C\u002Fa>&nbsp&nbsp |  &nbsp&nbsp🤖 \u003Ca href=https:\u002F\u002Fwww.modelscope.cn\u002Forganization\u002Fbilibili-index>Model Scope\u003C\u002Fa>&nbsp&nbsp | 📑 \u003Ca href='http:\u002F\u002Farxiv.org\u002Fabs\u002F2412.10255'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FArXiv-2412.10255-red'>\u003C\u002Fa> &nbsp&nbsp ｜  📑 \u003Ca href='http:\u002F\u002Farxiv.org\u002Fabs\u002F2504.10044'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FArXiv-2504.10044-red'>\u003C\u002Fa> &nbsp&nbsp\n\n**English** | [**中文简体**](.\u002FREADME_CN.md)\n\n\u003Cbr>\n\n----\n\n[**Index-AniSora:The Ultimate Open-Source Anime Video Generation 
Model**](http:\u002F\u002Farxiv.org\u002Fabs\u002F2412.10255) \u003Cbr> \n\nThis project presents Bilibili's gift to the anime world: Index-AniSora, the most powerful open-source animated video generation model.\nIt enables one-click creation of video shots across diverse anime styles, including series episodes, Chinese original animations, manga adaptations, VTuber content, anime PVs, MAD-style parodies (鬼畜动画), and more!\nPowered by our IJCAI'25-accepted work \u003Ca href='http:\u002F\u002Farxiv.org\u002Fabs\u002F2412.10255'>AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era \u003C\u002Fa>\n\n\n## 📣 Updates\n- `2025\u002F10\u002F31` 🔥 We have released the anisora-anymask model weights, supporting image-to-video generation with temporal and spatial masks. [Huggingface](https:\u002F\u002Fhuggingface.co\u002FIndexTeam\u002FIndex-anisora\u002Ftree\u002Fmain\u002Fanymask) [Model Scope](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fbilibili-index\u002FIndex-anisora\u002Ffiles)\n- `2025\u002F09\u002F25` 🔥A version of V3.1 that runs on 12GB of VRAM has been uploaded to ModelScope. [Model Scope](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Fbilibili-index\u002FIndex-anisora\u002Ffile\u002Fview\u002Fmaster\u002Fwan.7z) [Download](https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002Fbilibili-index\u002FIndex-anisora\u002Fresolve\u002Fmaster\u002Fwan.7z)\n- `2025\u002F09\u002F23` 🔥We have released V3.2 weights, which are trained on the stronger Wan2.2 model and can reduce the number of inference steps to 8. Like V3.1, V3.2 supports arbitrary-frame inference and character 3D video generation. [Huggingface](https:\u002F\u002Fhuggingface.co\u002FIndexTeam\u002FIndex-anisora\u002Ftree\u002Fmain\u002FV3.2) [Model Scope](https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002Fbilibili-index\u002FIndex-anisora\u002Ffiles)\n- `2025\u002F09\u002F04` 🔥We have released V3.1 weights, which provide enhanced motion range capabilities. 
For optimal results, we strongly recommend using the V3.1 weights with a motion score setting of 2.0–4.0. [Huggingface](https:\u002F\u002Fhuggingface.co\u002FIndexTeam\u002FIndex-anisora\u002Ftree\u002Fmain\u002FV3.1) [Model Scope](https:\u002F\u002Fwww.modelscope.cn\u002Fmodels\u002Fbilibili-index\u002FIndex-anisora\u002Ffiles)\n- `2025\u002F08\u002F27` 🔥🔥AniSora V3 weights are now licensed under Apache 2.0 and publicly available for download on ModelScope and Hugging Face. The new version supports \u003Cfont color=\"orange\"> arbitrary-frame inference, character 3D video generation, Video style transfer, Multimodal Guidance, Ultra-Low-Resolution Video Super-Resolution\u003C\u002Ffont>, delivering greater overall dynamics and more natural motion. The V3 model can generate a 5-second 360p video shot within 8 seconds.\n- `2025\u002F08\u002F27` 🔥🔥We have submitted our work on agent-related research \u003Cfont color=\"orange\">AniMe\u003C\u002Ffont> to arXiv. Stay tuned for further updates! \u003Ca href='http:\u002F\u002Farxiv.org\u002Fabs\u002F2508.18781'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FArXiv-2508.18781-red'>\u003C\u002Fa>\n- `2025\u002F07\u002F11` 🔥AniSora V2 weights are now licensed under Apache 2.0 and publicly available for download on ModelScope and Hugging Face.\n- `2025\u002F07\u002F10` Fixed an AniSora V1 inference bug that may cause video artifacts. \n- `2025\u002F07\u002F02` 🔥🔥AniSora V3 Preview is updated. We will share our new progress at Siggraph Day 07\u002F11. Join Us!\n- `2025\u002F05\u002F12` 🔥🔥Everything we build is open-source. Check Out Now!!!\n- `2025\u002F05\u002F10` 🔥Our paper has been accepted by IJCAI'25. The camera-ready version is updated. \n- `2024\u002F12\u002F19` We submitted our paper to arXiv and released our project with an evaluation benchmark.\n\n## 📣 Join Us \n**This GitHub project is the only official homepage of the AniSora project. 
Any websites not listed on the official homepage are not affiliated with the project team.**\n\nTo access the benchmark or all videos generated by AniSoraV1 and AniSoraV2 on the benchmark, fill in the \u003Ca href=\"assets\u002Fanisora_benchmark_agreement_form.doc\">form\u003C\u002Fa> and send it in PDF format to jiangyudong@bilibili.com or yangsiqian@bilibili.com or xubaohan@bilibili.com (links are provided after agreement with Bilibili).\nThe signature must be handwritten and include the name of the affiliated company or academic institution.\nJoin us! Send your CV to jiangyudong@bilibili.com\nIf you want to learn more about AniSora, join the group chat.\n\n\u003Cpicture>\n  \u003Cimg src=\"assets\u002Fwechat.png\"  width=\"200\"\u002F>\n\u003C\u002Fpicture>\n\n\n\n## Project Guide\n\n### AniMe\n\u003Cimg src=\"assets\u002Fposter.png\" width=\"600\"\u002F>\n\n### Long Animation Demos Powered by AniMe and AniSora\n| Fiction to Video | 2D Cartoon Adaptation|  3D Cartoon Adaptation| Comic to Video | \n| --- | ---  |--- | ---  | \n| [![Fiction to Video](assets\u002FAniMe_fiction_demo.jpg)](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1KGe1zrE3G)| [![2D Cartoon Adaptation](assets\u002FAniMe_2d_demo.jpg)](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1uAe1zmEmu) | [![3D Cartoon Adaptation](assets\u002FAniMe_3d_demo.jpg)](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV18Ne1zjE7X) | [![Comic to Video](assets\u002FAniMe_comic_demo.jpg)](https:\u002F\u002Fwww.bilibili.com\u002Fvideo\u002FBV1NGe1zrEHp)| \n\n### AniSoraV3\nFind in 📁 `anisoraV3`\n\n\u003Cdiv align=\"center\">\n    \u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fa207d43e-26f6-445b-883e-9b28c129607f\" controls width=\"30%\" poster=\"\">\u003C\u002Fvideo>\n\u003C\u002Fdiv>\n\n- Character 3D video generation\n\nGiven a front-facing character illustration, the model generates a 360-degree rotation video to achieve multi-angle consistent visual modeling of the character. 
This assists in maintaining consistency across different shot angles and supports 3D modeling, among other applications.\n\n| Demo1 | Demo2|  Demo3| Demo4 | Demo5|  \n| --- | ---  |--- | ---  | ---  |\n|\u003Cimg src=\"assets\u002Fzhuanquan_demo1.gif\" width=\"150\"\u002F>|\u003Cimg src=\"assets\u002Fzhuanquan_demo3.gif\" width=\"100\"\u002F>|\u003Cimg src=\"assets\u002Fzhuanquan_demo5.gif\" width=\"200\"\u002F>|\u003Cimg src=\"assets\u002Fzhuanquan_demo6.gif\" width=\"280\"\u002F>|\u003Cimg src=\"assets\u002Fzhuanquan_demo7.gif\" width=\"100\"\u002F>|\n\n- Arbitrary-frame inference\n\n| first frame | mid frame| last frame  | Video  |\n| --- | --- | --- | ---  |\n|\u003Cimg src=\"assets\u002Fv2_1_first_frame.png\" width=\"800\"\u002F> | | |![Demo](assets\u002Fv2_1_first_frame.gif)|\n|| \u003Cimg src=\"assets\u002Fv2_1_mid_frame.png\" width=\"800\"\u002F>  | |![Demo](assets\u002Fv2_1_mid_frame.gif)|\n\n- Video style transfer\n\nBy leveraging image generation from the first and last frames combined with line art-based video generation, original videos can be transformed into videos of any desired style.\n\n| origin frame | transferred frame| video input | video transfer  |\n| --- | --- | --- | --- |\n|\u003Cimg src=\"assets\u002Fvideo_transfer_origin_image.png\" width=\"800\"\u002F> |\u003Cimg src=\"assets\u002Fvideo_transfer_transfer_image.png\" width=\"800\"\u002F> |\u003Cimg src=\"assets\u002Fvideo_transfer_origin_video.gif\" width=\"200\"\u002F> \u003Cimg src=\"assets\u002Fvideo_transfer_cond_video.gif\" width=\"200\"\u002F>|\u003Cimg src=\"assets\u002Fvideo_transfer_transfer_video.gif\" width=\"200\"\u002F>| \n\n\n- Multimodal Guidance\n\nPose, depth, line art, and audio guidance: Enable precise control over generated video motions through multiple guidance mechanisms.\n\n| input | prompt |  Video  |\n| --- | --- | --- |\n|\u003Cimg src=\"assets\u002Fmulti_guide_image_1.png\" width=\"200\"\u002F> \u003Cimg src=\"assets\u002Fmulti_guide_pose.gif\" 
width=\"200\"\u002F>  |The boy in red and the girl in red are fencing in the scene.|\u003Cimg src=\"assets\u002Fmulti_guide_pose_video.gif\" width=\"200\"\u002F>| \n|\u003Cimg src=\"assets\u002Fmulti_guide_image_2.png\" width=\"200\"\u002F> \u003Cimg src=\"assets\u002Fmulti_guide_scr.gif\" width=\"200\"\u002F>  |A worn-out red robot flings its arm away, only to have it fly back and reassemble with a new arm holding a sword.|\u003Cimg src=\"assets\u002Fmulti_guide_scr_video.gif\" width=\"200\"\u002F>| \n|Audio + img  \u003Cimg src=\"assets\u002Faudio_image_2.png\" width=\"200\"\u002F>  | - |  \u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fb3d6038e-35d4-4712-b748-bbaf42e0260e\" controls width=\"100\">\u003C\u002Fvideo> | \n\n- Ultra-Low-Resolution Video Super-Resolution\n\nSupports upscaling from 90p to 720p\u002F1080p, enabling the generation of videos with richer details using fewer sampling steps.\n\n| generated low-resolution video | high-resolution video (1080p) | GT |\n| --- | --- | --- |\n|  \u003Cimg src=\"assets\u002Fvideo_sr_input_1.gif\" width=\"200\"\u002F> | \u003Cimg src=\"assets\u002Fvideo_sr_video_1.gif\" width=\"200\"\u002F> | \u003Cimg src=\"assets\u002Fvideo_sr_gt_1.gif\" width=\"200\"\u002F> |\n|  \u003Cimg src=\"assets\u002Fvideo_sr_input_3.gif\" width=\"200\"\u002F> | \u003Cimg src=\"assets\u002Fvideo_sr_video_3.gif\" width=\"200\"\u002F> | \u003Cimg src=\"assets\u002Fvideo_sr_gt_3.gif\" width=\"200\"\u002F>  |\n\n\n### AniSoraV2.0\nFind in 📁 `anisoraV2_gpu`, `anisoraV2_npu` \n\n\nPowered by the enhanced Wan2.1-14B foundation model for superior stability.\n- Distillation-accelerated inference without quality compromise, faster and cheaper\n- Full training\u002Finference code release\n- Native support for Huawei Ascend 910B NPUs (entirely trained on domestic chips) 📁 `anisoraV2_npu`.\n- High-quality video shot generation, covering 90% of application scenarios\n\n### AniSoraV1.0\nFind in 📁 
`anisoraV1_infer`\n\n\u003Cdiv align=\"center\">\n    \u003Cvideo src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F4351fc5e-f7fd-456b-807e-82fdcb321de2\" controls width=\"30%\" poster=\"\">\u003C\u002Fvideo>\n\u003C\u002Fdiv>\n\nTrained on the CogVideoX-5B foundation model, with full training and inference code released. \n- Localized region guidance for video control\n- Temporal guidance (first\u002Flast frame guidance, keyframe interpolation, multi-frame guidance)\n- Full training and inference code release. Find in 📁 `anisoraV1_train_npu`\n- Cost-effective deployment on RTX 4090\n- Covers 80% of application scenarios\n\n### Ecosystem Tools\nFind in 📁 `data_pipeline`\n\nEnd-to-end dataset pipeline for rapid training data expansion.\n- Animation data cleaning pipeline.\n\n### Anime-optimized Benchmark System\nFind in 📁 `reward`\n\nSpecialized evaluation models and scoring algorithms for anime video generation, including reward models suitable for reinforcement learning and benchmarking. \n- Tailored evaluation framework for animation generation\n- Standard test dataset aligned with ACG aesthetics\n- Human Preference Alignment\n\nThe benchmark dataset contains 948 animation video clips, collected and labeled with different actions. Each label covers 10-30 video clips. The corresponding text prompts are first generated by Qwen-VL2 and then corrected manually to guarantee text-video alignment.\n\n\n### AniSoraV1.0_RL\nFind in 📁 `anisora_rl`\n\nThe first RLHF framework for anime video generation. \n- RL-optimized AniSoraV1.0 for enhanced anime-style output\n- Methodology detailed in our preprint: \u003Ca href='http:\u002F\u002Farxiv.org\u002Fabs\u002F2504.10044'> Aligning Anime Video Generation with Human Feedback \u003C\u002Fa>\n\n---\n## 💡 Abstract\nAnimation has gained significant interest in the recent film and TV industry. 
Despite the success of advanced video generation models like Sora, Kling, and CogVideoX in generating natural videos, they lack the same effectiveness in handling animation videos. Evaluating animation video generation is also a great challenge due to its unique artist styles, violating the laws of physics and exaggerated motions. In this paper, we present a comprehensive system, **AniSora**, designed for animation video generation, which includes a data processing pipeline, a controllable generation model, and an evaluation dataset. Supported by the data processing pipeline with over 10M high-quality data, the generation model incorporates a spatiotemporal mask module to facilitate key animation production functions such as image-to-video generation, frame interpolation, and localized image-guided animation. We also collect an evaluation benchmark of 948 various animation videos, the evaluation on VBench and human double-blind test demonstrates consistency in character and motion, achieving state-of-the-art results in animation video generation.\n\n## 🖥️ Method\n\nThe overview of Index-anisora is shown as follows.\n\n\u003Cpicture>\n  \u003Cimg src=\"assets\u002Fframework.png\"  width=\"800\"\u002F>\n\u003C\u002Fpicture>\n\nFeatures:\n\n1. We develop a comprehensive video processing system that significantly enhances preprocessing for video generation.\n\n2. We propose a unified framework designed for animation video generation with a spatiotemporal mask module, enabling tasks such as image-to-video generation, frame interpolation, and localized image-guided animation.\n\n3. 
We release a benchmark dataset specifically for evaluating animation video generation.\n\n## 📑 Evaluation\n\nEvaluation results on Vbench:\n\n| Method                   | Motion Smoothness | Motion Score | Aesthetic Quality | Imaging Quality | I2V Subject | I2V Background | Overall Consistency | Subject Consistency |\n|--------------------------|-------------------|--------------|-------------------|-----------------|-------------|----------------|---------------------|---------------------|\n| Opensora-Plan(V1.3)  | 99.13            | 76.45        | 53.21            | 65.11           | 93.53       | 94.71          | 21.67              | 88.86              |\n| Opensora(V1.2)       | 98.78            | 73.62        | 54.30            | 68.44           | 93.15       | 91.09          | 22.68              | 87.71              |\n| Vidu                 | 97.71            | **77.51**        | 53.68            | 69.23           | 92.25       | 93.06          | 20.87              | 88.27              |\n| Covideo(5B-V1)       | 97.67            | 71.47        | **54.87**            | 68.16           | 90.68       | 91.79          | 21.87              | 90.29              |\n| MiniMax              | 99.20            | 66.53        | 54.56            | **71.67**           | 95.95       | **95.42**          | 21.82              | 93.62              |\n| **AniSora**              | **99.34**        | 45.59        | 54.31            | 70.58           | **97.52**       | 95.04          | 21.15              | **96.99**              |\n| AniSora-K            | 99.12            | 59.49        | 53.76            | 68.68           | 95.13       | 93.36          | 21.13              | 94.61              |\n| AniSora-I            | 99.31            | 54.96        | 54.67            | 68.98           | 94.16       | 92.38          | 20.47              | 95.75              |\n| GT                   | 98.72            | 56.05        | 52.70            | 70.50           | 96.02       | 
95.03          | 21.29              | 94.37              |\n\n\nEvaluation results on AniSora-Benchmark:\n\n| Method                   | Human Evaluation | Visual Smooth | Visual Motion | Visual Appeal | Text-Video Consistency | Image-Video Consistency | Character Consistency |\n|--------------------------|------------------|---------------|---------------|---------------|------------------------|-------------------------|-----------------------|\n| Vidu-1.5                 | 60.98            | 55.37         | **78.95**     | 50.68         | 60.71                  | 66.85                   | 82.57                 |\n| Opensora-V1.2            | 41.10            | 22.28         | 74.90         | 22.62         | 52.19                  | 55.67                   | 74.76                 |\n| Opensora-Plan-V1.3       | 46.14            | 35.08         | 77.47         | 36.14         | 56.19                  | 59.42                   | 81.19                 |\n| CogVideoX-5B-V1          | 53.29            | 39.91         | 73.07         | 39.59         | 67.98                  | 65.49                   | 83.07                 |\n| MiniMax-I2V01            | 69.63            | 69.38         | 68.05         | 70.34     | 76.14              | 78.74                   | 89.47                 |\n| Wan-2.1            | -            | 81.70        | 61.88         | 82.05    | 87.80              | 88.50                   | 90.65                 |\n| **AniSora-V1 (Ours)**       | **70.13**        | 71.88     | 48.45         | 65.38         | 74.26                  | 82.66               | **94.88**             |\n| **AniSora-V2 (Ours)**       | -        | **86.98**     | 50.34         | **85.91**         | **90.98**                  | **91.96**               | 92.75             |\n| AniSora-V1 (Interpolated Avg) | -             | 70.78         | 53.02         | 64.41         | 73.56                  | 80.62                   | 91.59                 |\n| AniSora-V1 (KeyFrame Interp) 
| -             | 70.03         | 58.10         | 64.57         | 74.57                  | 80.78                   | 91.98                 |\n| GT                       | -                | 92.20         | 58.27         | 89.72         | 92.51                  | 94.69                   | 95.08                 |\n\n\n\nAniSora denotes our I2V results.\n\nAniSora-K denotes the key frame interpolation results.\n\nAniSora-I denotes the average results over frame interpolation conditions, including key frame, last frame, and mid frame results.\n\n\n## 🤗 Acknowledgments\nWe would like to express our sincere thanks to [CogVideoX](https:\u002F\u002Fgithub.com\u002FTHUDM\u002FCogVideo), [Wan2.1](https:\u002F\u002Fgithub.com\u002FWan-Video\u002FWan2.1), [FasterCache](https:\u002F\u002Fgithub.com\u002FVchitect\u002FFasterCache), and [OSS](https:\u002F\u002Fgithub.com\u002Fbebebe666\u002FOptimalSteps) for their valuable work.\nSpecial thanks to the creators in the AniSora community, such as \"花染色\", for their valuable suggestions on the project.\n\n\n## 📚 Citation\n\n🌟 If you find our work helpful, please leave us a star and cite our paper.\n\n```\n@article{jiang2024anisora,\n  title={AniSora: Exploring the Frontiers of Animation Video Generation in the Sora Era},\n  author={Yudong Jiang and Baohan Xu and Siqian Yang and Mingyu Yin and Jing Liu and Chao Xu and Siqi Wang and Yidi Wu and Bingwen Zhu and Xinwen Zhang and Xingyu Zheng and Jixuan Xu and Yue Zhang and Jinlong Hou and Huyang Sun},\n  journal={arXiv preprint arXiv:2412.10255},\n  year={2024}\n}\n```\n\n","Index-AniSora is a powerful open-source anime video generation model developed by Bilibili. The project supports one-click generation of anime videos in many styles, including series episodes, Chinese original animations, manga adaptations, VTuber content, anime PVs, and MAD-style parodies. Its core technology builds on research accepted at IJCAI'25 and uses image-to-video generation with temporal and spatial masks to enable highly flexible, high-quality video creation. The latest versions also add arbitrary-frame inference, character 3D video generation, video style transfer, and multimodal guidance, significantly improving the quality and diversity of generated videos. The project is well suited to anime fans, creators, and professionals who need to produce anime content efficiently.","2026-06-11 03:42:13","high_star"]