[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82157":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":13,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":25,"readmeContent":26,"aiSummary":27,"trendingCount":16,"starSnapshotCount":16,"syncStatus":28,"lastSyncTime":29,"discoverSource":30},82157,"ClipGStream","liangjie1999\u002FClipGStream","liangjie1999","[CVPR 2026] ClipGStream: Clip-Stream Gaussian Splatting for Any Length and Any Motion Multi-View Dynamic Scene Reconstruction","",null,"HTML",112,4,1,3,0,37,80,20,73.6,false,"main",true,[],"2026-06-12 04:01:37","# ClipGStream (CVPR 2026)\n### [Project page](https:\u002F\u002Fliangjie1999.github.io\u002FClipGStreamWeb\u002F) | [Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.13746) | [Long 360](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBestWJH\u002FVRU_Basketball\u002Ftree\u002Fmain) | [VRU Dataset](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FBestWJH\u002FVRU_Basketball\u002Ftree\u002Fmain)\n> **ClipGStream: Clip-Stream Gaussian Splatting for Any Length and Any Motion\nMulti-View Dynamic Scene Reconstruction**,            \n> Jie Liang, Jiahao Wu, Chao Wang, Jiayu Yang, Xiaoyun Zheng, Kaiqiang Xiong, Zhanke Wang, Jinbo Yan, Feng Gao, Ronggang Wang  \n> **Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology,\nShenzhen Graduate School, Peking University, Pengcheng Lab, Peking University**  \n> **CVPR 2026**\n>  \nThis repository is the official implementation of **\"ClipGStream: Clip-Stream Gaussian Splatting for Any Length 
and Any Motion\nMulti-View Dynamic Scene Reconstruction\"**. \n\n\nIn this paper, we propose a hybrid reconstruction framework, Clip-Stream, which performs stream-level optimization at the clip granularity rather than at the frame level. This design enables scalable and temporally coherent reconstruction of long dynamic sequences, effectively eliminating flickering artifacts.\n\n\n![](.\u002Fassets\u002Fteaser.png)\n\n![](.\u002Fassets\u002Fpipeline.png)\n\n## 1. Environmental Setups\n\nWe tested on a server configured with Ubuntu 20.04, cuda 11.8 and gcc 11.4.0. Other similar configurations should also work, but we have not verified each one individually.  `In fact, this environment configuration is not strict — any environment that can run 3DGS properly should also be able to run our program`. In addition, some extra packages are required, such as Tinycudann.\n\n1. Clone this repo:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fliangjie1999\u002FClipGStream --recursive\ncd ClipGStream\n```\n\n2. Install dependencies\n\n```bash\nconda env create --file environment.yml\nconda activate ClipGStream\n```\n\nAfter that, you need to install [tiny-cuda-nn (1.7)](https:\u002F\u002Fgithub.com\u002FNVlabs\u002Ftiny-cuda-nn\u002F?tab=readme-ov-file#pytorch-extension). \n\n\n## 2. Quick Start Guide\n\n**Quickly launch using only a single command.**\n\nWe provide a tiny dataset (20 frames) for quick demonstration. This dataset includes multi-view images and has been fully preprocessed. 
We will use frames 0-10 as the Reference Clip and frames 10-20 as the Source Clip.\n\n### Download the Dataset\nDownload the dataset from our [GitHub Releases](https:\u002F\u002Fgithub.com\u002Fliangjie1999\u002FClipGStream\u002Freleases\u002Ftag\u002Fv1.0.0)\n### Dataset Structure\n```\nlong_360_tiny_dataset\n|---frame000000\n|---frame000001\n|   |---images\n|       |---\u003Cimage 1>\n|       |---...\n|---frame000002\n|---...\n|---sparse                  # camera information\n|---plys\n    |--- 0.ply              # point cloud of reference clip\n    |--- 1.ply              # point cloud of source clip (residual point cloud)\n```\n### Running the Demo\nFirst set `source_path` (the dataset location) in the runTinyDataset.sh file, then run the following script.\n```\n.\u002Fscripts\u002Ftiny_long_360\u002Frun.sh\n```\n\n### Expected Results\nThe results are written to .\u002Foutput\u002Ftiny_long_360, with the following structure:\n```\n|--- history\n|    |--- decoder\n|         |--- mlp_color.pth\n|         |--- mlp_cov.pth\n|         |--- mlp_offset.pth\n|         |--- mlp_opacity.pth\n|    |--- point\n|         |--- 0.ply\n|         |--- 1.ply\n|    |--- FDHash_0.pth\n|    |--- FDHash_1.pth\n|--- test\n|    |--- renders         # rendered images\n|    |--- videos          # rendered videos\n|    |--- 0.csv           # metrics of test view 0\n|    |--- 1.csv\n|    |--- 2.csv\n|    |--- 3.csv\n```\nThe quantitative results on the tiny dataset (stored in 0.csv, 1.csv, ...) are shown below.\n\n| Test View | PSNR  | DSSIM1 | DSSIM2 | LPIPS |\n|-----------|-------|--------|--------|-------|\n| 0         | 22.90 | 0.101  | 0.043  | 0.197 |\n| 1         | 23.68 | 0.100  | 0.043  | 0.210 |\n| 2         | 24.80 | 0.091  | 0.038  | 0.194 |\n| 3         | 22.50 | 0.129  | 0.060  | 0.237 |\n| AVG       | 23.47 | 0.105  | 0.046  | 0.209 |\n\n\n\n## Long 360 Dataset\n### Download the Dataset\nDownload the dataset from our [GitHub 
Releases](https:\u002F\u002Fgithub.com\u002Fliangjie1999\u002FClipGStream\u002Freleases\u002Ftag\u002Fv1.0.0)\n\n### Preprocess the Dataset\nFor the Long 360 dataset, we provide a portion of the processed data here, which includes the camera pose of the first frame (frame 0) and the point clouds of all clips: Basketball_gz_cameras_pointcloud.zip. The only additional steps are to run Step 1 of [this script](https:\u002F\u002Fgithub.com\u002Fliangjie1999\u002FClipGStream\u002Ftree\u002Fmain\u002Fdata_process\u002Flong360\u002Frun.sh) to convert the videos to images, and then run Step 3 and Step 4 to undistort the remaining frames and configure the data paths correctly.\n\n### Dataset Structure\n```\nlong_360_tiny_dataset\n|---frame000000\n    |--- sparse             # camera information\n    |--- images             # undistorted images\n|---view000.mp4\n|---view001.mp4\n|   ...\n|---view035.mp4\n|---...\n|---sparse\n|---plys\n    |--- 0.ply              # point cloud of reference clip\n    |--- 1.ply              # point cloud of source clip 1 (residual point cloud)\n    |    ...\n    |--- n.ply              # point cloud of source clip n\n```\n### Training & Render & Metric\n\n```\n.\u002Fscripts\u002Flong_360\u002FtrainReferenceClip.sh\n.\u002Fscripts\u002Flong_360\u002FtrainSourceClip.sh       # if you have multiple GPUs, you can run .\u002Fscripts\u002Flong_360\u002Fparallel_source_clip\u002FgenerateTrainingCmd.py\n.\u002Fscripts\u002Frender.sh                         # render & metric\n```\n\n### Expected Results\n| Test View | PSNR  | DSSIM1 | DSSIM2 | LPIPS |\n|-----------|-------|--------|--------|-------|\n| 0         | 24.03 | 0.069  | 0.032  | 0.122 |\n| 1         | 24.95 | 0.068  | 0.031  | 0.135 |\n| 2         | 26.57 | 0.059  | 0.025  | 0.117 |\n| 3         | 22.86 | 0.111  | 0.053  | 0.183 |\n| AVG       | 24.60 | 0.077  | 0.035  | 0.139 |\n\n\n\n## Custom Dataset\nWe provide a 
method to process custom data (multi-view video streams) into our dataset format. For details, please refer to [`data_process\u002Fcustom_dataset\u002F`](.\u002Fdata_process\u002Fcustom_dataset\u002FREADME.md)\n\n## Training\nTaking training on the tiny_long_360 dataset as an example (`.\u002Fscripts\u002Ftiny_long_360\u002F`), we split the 20 frames of multi-view images, in order, into 2 clips of 10 frames each. The first clip is the Reference Clip, and subsequent clips are Source Clips. Training consists of two steps:\n1. Train the Reference Clip: The Reference Clip serves as the foundational representation of the scene, which is inherited by subsequent clips to prevent flickering between clips. The following parameters are used: `clip_size` sets the length of a single clip, `project_total_frames` defines the total sequence length, and `frames_start_end` specifies the start and end frames for the Reference Clip.\n\n```\nCUDA_VISIBLE_DEVICES=1 python trainReferenceClip.py --project_total_frames 20 --clip_size 10 --iterations 5000 -s \"\u002Fdata8\u002Fdataset\u002Flongvideos\u002Fjpg\u002Flong_360_tiny_dataset\u002F\" -m .\u002Foutput\u002Ftiny_long_360 --frames_start_end 0 10 --configs arguments\u002Ftiny\u002Fbasketball.py \n\n# --project_total_frames  N: input video frame count\n# --clip_size             M: frame count of a single clip\n# --frames_start_end      start and end frames of the Reference Clip\npython trainReferenceClip.py --project_total_frames 20 \\\n                             --clip_size 10 \\\n                             --frames_start_end 0 10 \\\n                             --iterations 5000 \\\n                             -s \"\u002Famax\u002Flong_360_tiny_dataset\u002F\" \\\n                             -m .\u002Foutput\u002Ftiny_long_360 \\\n                             --configs arguments\u002Ftiny\u002Fbasketball.py \n\n```\n2. 
Train the Source Clips: Subsequent Source Clips are trained by inheriting the static information (including anchors, static features, and the decoder) from the Reference Clip. Set the `-m` parameter to the same value used when training the Reference Clip so that this static information can be inherited, and adjust `frames_start_end` to the frame range of the clip being trained. Note that each Source Clip is trained independently, so training can be parallelized to improve speed.\n```\nCUDA_VISIBLE_DEVICES=1 python trainSourceClip.py --project_total_frames 20 --clip_size 10 --iterations 5000 -s \"\u002Fdata8\u002Fdataset\u002Flongvideos\u002Fjpg\u002Flong_360_tiny_dataset\u002F\" -m .\u002Foutput\u002Ftiny_long_360 --frames_start_end 10 20 --configs arguments\u002Ftiny\u002Fbasketball.py \n\n# --project_total_frames  N: input video frame count\n# --clip_size             M: frame count of a single clip\n# --frames_start_end      start and end frames of the Source Clip\npython trainSourceClip.py  --project_total_frames 20 \\\n                           --clip_size 10 \\\n                           --frames_start_end 10 20 \\\n                           --iterations 5000 \\\n                           -s \"\u002Famax\u002Flong_360_tiny_dataset\u002F\" \\\n                           -m .\u002Foutput\u002Ftiny_long_360 \\\n                           --configs arguments\u002Ftiny\u002Fbasketball.py \n\n```\n## Render\n```\nCUDA_VISIBLE_DEVICES=1 python render.py --project_total_frames 20 --clip_size 10 -s $source_path --iteration $iteration -m .\u002Foutput\u002Ftiny_long_360 --frames_start_end 0 20 --configs arguments\u002Ftiny\u002Fbasketball.py --skip_video --skip_train \n\n# --project_total_frames  N: input video frame count\n# --clip_size             M: frame count of a single clip\n# --frames_start_end      start and end frames across all clips\npython render.py --project_total_frames 20 \\\n                 --clip_size 10 \\\n                 --frames_start_end 0 20 \\\n                 --iteration 5000 \\\n                 -s \"\u002Famax\u002Flong_360_tiny_dataset\u002F\" \\\n                 -m .\u002Foutput\u002Ftiny_long_360 \\\n                 --configs arguments\u002Ftiny\u002Fbasketball.py \\\n                 --skip_train \\\n                 --skip_video\n\n```\n## Metrics\n```\npython metrics.py -m .\u002Foutput\u002Ftiny_long_360 --iteration 5000\n```\n\n## Image2Video\n```\npython images2video.py -m .\u002Foutput\u002Ftiny_long_360\u002F --iteration 5000\n```\n\n\n## Citation\n\n```\n@misc{liang2026clipgstreamclipstreamgaussiansplatting,\n      title={ClipGStream: Clip-Stream Gaussian Splatting for Any Length and Any Motion Multi-View Dynamic Scene Reconstruction}, \n      author={Jie Liang and Jiahao Wu and Chao Wang and Jiayu Yang and Xiaoyun Zheng and Kaiqiang Xiong and Zhanke Wang and Jinbo Yan and Feng Gao and Ronggang Wang},\n      year={2026},\n      eprint={2604.13746},\n      archivePrefix={arXiv},\n      primaryClass={cs.CV},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.13746}, \n}\n@inproceedings{wu2025localdygs,\n  title={LocalDyGS: Multi-view Global Dynamic Scene Modeling via Adaptive Local Implicit Feature Decoupling},\n  author={Wu, Jiahao and Peng, Rui and Jiao, Jianbo and Yang, Jiayu and Tang, Luyang and Xiong, Kaiqiang and Liang, Jie and Yan, Jinbo and Liu, Runling and Wang, Ronggang},\n  booktitle={Proceedings of the IEEE\u002FCVF International Conference on Computer Vision},\n  pages={9519--9529},\n  year={2025}\n}\n```\n","ClipGStream is a framework for multi-view dynamic scene reconstruction of any length and any motion. Its core feature is stream-level optimization at the clip level rather than the frame level, which enables scalable and temporally consistent reconstruction of long dynamic sequences and effectively eliminates flickering artifacts. The project is developed in Python, uses Gaussian Splatting, and requires CUDA plus some additional libraries such as Tinycudann. It suits applications that need high-quality dynamic scene reconstruction, such as virtual reality, augmented reality, and film visual effects production.",2,"2026-06-11 04:07:53","CREATED_QUERY"]