[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80896":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":13,"contributorsCount":13,"subscribersCount":13,"size":13,"stars1d":13,"stars7d":12,"stars30d":12,"stars90d":13,"forks30d":13,"starsTrendScore":13,"compositeScore":14,"rankGlobal":8,"rankLanguage":8,"license":8,"archived":15,"fork":15,"defaultBranch":16,"hasWiki":17,"hasPages":15,"topics":18,"createdAt":8,"pushedAt":8,"updatedAt":19,"readmeContent":20,"aiSummary":21,"trendingCount":13,"starSnapshotCount":13,"syncStatus":22,"lastSyncTime":23,"discoverSource":24},80896,"Stream-T1","FrameX-AI\u002FStream-T1","FrameX-AI",null,"Python",35,4,1,0,2.1,false,"main",true,[],"2026-06-12 02:04:08","\u003Cdiv align=\"center\">\n\n\u003Ch1>Stream-T1: \u003Cbr> Test-Time Scaling for Streaming Video Generation\u003C\u002Fh1>\n\n\u003Cdiv>\n  \u003Ca href=\"#\" target=\"_blank\">\u003Cstrong>Yijing Tu\u003C\u002Fstrong>\u003C\u002Fa>\u003Csup>1\u003C\u002Fsup>,\n  \u003Ca href=\"#\" target=\"_blank\">\u003Cstrong>Shaojin Wu\u003C\u002Fstrong>\u003C\u002Fa>\u003Csup>3,&dagger;\u003C\u002Fsup>,\n  \u003Ca href=\"https:\u002F\u002Fcorleone-huang.github.io\u002F\" target=\"_blank\">\u003Cstrong>Mengqi Huang\u003C\u002Fstrong>\u003C\u002Fa>\u003Csup>1,&dagger;\u003C\u002Fsup>,\n  \u003Ca href=\"#\" target=\"_blank\">Wenchuan Wang\u003C\u002Fa>\u003Csup>1\u003C\u002Fsup>,\n  \u003Ca href=\"#\" target=\"_blank\">Yuxin Wang\u003C\u002Fa>\u003Csup>2\u003C\u002Fsup>,\n  \u003Ca href=\"#\" target=\"_blank\"> Chunxiao Liu\u003C\u002Fa>\u003Csup>3\u003C\u002Fsup>,\n  \u003Ca href=\"#\" target=\"_blank\">Zhendong Mao\u003C\u002Fa>\u003Csup>1,*\u003C\u002Fsup>\n\u003C\u002Fdiv>\n\n\u003Cbr>\n\n\u003Cdiv>\n  \u003Csup>1\u003C\u002Fsup> University of Science and Technology of China,\n  \u003Csup>2\u003C\u002Fsup> FrameX.AI,\n  
\u003Csup>3\u003C\u002Fsup> Independent Researcher\n\u003C\u002Fdiv>\n\n\u003Cbr>\n\n\u003Csub>\u003Csup>*\u003C\u002Fsup> Corresponding author &nbsp;&middot;&nbsp; \u003Csup>&dagger;\u003C\u002Fsup> Project lead\u003C\u002Fsub>\n\n[![Project Page](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-blue)](https:\u002F\u002Fstream-t1.github.io\u002F)\n[![Paper](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPaper-arXiv-red)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2605.04461)\n\n\u003C\u002Fdiv>\n\n## Overview\nWhile Test-Time Scaling (TTS) offers a promising way to improve video generation without the surging cost of training, current test-time methods built on diffusion models suffer from exorbitant candidate exploration costs and lack temporal guidance. To address these structural bottlenecks, we shift the focus to streaming video generation. Its chunk-level synthesis and small number of denoising steps are intrinsically suited to TTS, significantly lowering computational overhead while enabling fine-grained temporal control. Driven by this insight, we introduce Stream-T1, a comprehensive TTS framework tailored to streaming video generation. Evaluated on both 5s and 30s video benchmarks, Stream-T1 significantly improves temporal consistency, motion smoothness, and frame-level visual quality.\n\n### Method\n1. **Stream‑Scaled Noise Propagation**: refines the initial latent noise of the chunk being generated using high-quality noise from previously generated chunks, establishing temporal dependency and using the historical Gaussian prior to guide the current generation;\n2. 
**Stream‑Scaled Reward Pruning**: evaluates generated candidates to balance local spatial aesthetics against global temporal coherence by combining immediate short-term assessments with sliding-window-based long-term evaluations;\n3. **Stream‑Scaled Memory Sinking**: dynamically routes the context evicted from the KV-cache into distinct update pathways guided by reward feedback, so that previously generated visual information anchors and guides the subsequent video stream.\n\n\u003Cp align=\"center\">\n    \u003Cimg src=\"assets\u002Fpipeline.png\" width=\"800\"\u002F>\n\u003C\u002Fp>\n\n## TODO List\n\n- [x] Release the paper and project page.\n- [x] Release the inference code.\n- [x] Release test cases with our pretrained model, prompts, and reference image.\n\n## Requirements\nInference was run on a single A800 GPU (80GB VRAM).\n## Setup\n```\ngit clone https:\u002F\u002Fgithub.com\u002FFrameX-AI\u002FStream-T1.git\ncd Stream-T1\n\ncd metrics\ngit clone https:\u002F\u002Fgithub.com\u002FKlingAIResearch\u002FVideoAlign.git\n```\n\n## Environment\nAll tests were conducted on Linux. 
To set up the environment on Linux, run:\n```\nconda create -n StreamT1 python=3.10 -y\nconda activate StreamT1\n\npip install -r requirements.txt\n```\n\n## Checkpoints\n1. Base model checkpoints\n```\nhuggingface-cli download Efficient-Large-Model\u002FLongLive --local-dir longlive_models\nhuggingface-cli download Wan-AI\u002FWan2.1-T2V-1.3B --local-dir wan_models\u002FWan2.1-T2V-1.3B\n```\n2. Reward model checkpoints\n```\nhuggingface-cli download MizzenAI\u002FHPSv3 --local-dir metrics\u002Fmodels\u002Fhpsv3_model\nhuggingface-cli download KlingTeam\u002FVideoReward --local-dir metrics\u002Fmodels\u002Fvideoalign\n```\n## Inference\n```\nbash stream_scaling.sh\n```\n## Citation\nIf you find this work useful in your research, please cite:\n```\n@misc{tu2026streamt1testtimescalingstreaming,\n      title={Stream-T1: Test-Time Scaling for Streaming Video Generation}, \n      author={Yijing Tu and Shaojin Wu and Mengqi Huang and Wenchuan Wang and Yuxin Wang and Chunxiao Liu and Zhendong Mao},\n      year={2026},\n      eprint={2605.04461},\n      archivePrefix={arXiv},\n      primaryClass={cs.CV},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.04461}, \n}\n```\n## Acknowledgement\n- LongLive: the codebase and algorithm we build upon. Thanks for their wonderful work.\n- HPSv3 and VideoAlign: the reward models we use. Thanks for their wonderful work.\n## License\nSee [LICENSE](LICENSE).","Stream-T1 is a test-time scaling (TTS) framework focused on streaming video generation. Its core techniques (historical noise propagation, reward pruning, and memory sinking) improve the temporal consistency, motion smoothness, and frame-level visual quality of generated video, making it well suited to applications that need high-quality video at lower computational cost. Written in Python, the framework exploits chunk-level synthesis and a small number of denoising steps to provide fine-grained temporal control over streaming video generation while significantly reducing computational overhead.",2,"2026-06-11 04:02:45","CREATED_QUERY"]