[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-1455":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":10,"languages":10,"totalLinesOfCode":10,"stars":11,"forks":12,"watchers":12,"openIssues":13,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":14,"forks30d":14,"starsTrendScore":18,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":21,"hasPages":21,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":24,"readmeContent":25,"aiSummary":26,"trendingCount":14,"starSnapshotCount":14,"syncStatus":27,"lastSyncTime":28,"discoverSource":29},1455,"Awesome-Feed-Forward-3D","ziplab\u002FAwesome-Feed-Forward-3D","ziplab","A curated list for feed-forward 3D scene modeling, including research directions, datasets, and applications.","https:\u002F\u002Fff3d-survey.github.io",null,252,6,1,0,4,11,32,12,2.54,"MIT License",false,"main",[],"2026-06-12 02:00:28","# Awesome Feed-Forward 3D\n\n[![Paper](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPaper-arXiv%3A2604.14025-B31B1B?style=flat-square&logo=arxiv&logoColor=white)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.14025)\n[![Website](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-ff3d--survey.github.io-blue?style=flat-square)](https:\u002F\u002Fff3d-survey.github.io\u002F)\n[![GitHub](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGitHub-ziplab%2FAwesome--Feed--Forward--3D-black?style=flat-square&logo=github)](https:\u002F\u002Fgithub.com\u002Fziplab\u002FAwesome-Feed-Forward-3D)\n\nA curated list for feed-forward 3D scene modeling, including research directions, datasets, and applications.\n\n| \u003Cimg width=\"100%\" src=\"https:\u002F\u002Fff3d-survey.github.io\u002Fassets\u002Foverview.png\"> |\n|:-:|\n\n## Table of Contents\n\n- [Research Directions](#directions)\n  - [Feature Enhancement](#feature-enhancement)\n    
- [Advanced Encoding Architectures](#advanced-encoding-architectures)\n    - [Cross-View Fusion](#cross-view-fusion)\n    - [Integration of Visual Foundation Models](#integration-of-visual-foundation-models)\n  - [Geometry-aware Improvement](#geometry-aware-improvement)\n    - [Explicit Geometric Aggregation](#explicit-geometric-aggregation)\n    - [Refining Predicted 3D Scenes](#refining-predicted-3d-scenes)\n    - [Pose-Free Reconstruction](#pose-free-reconstruction)\n    - [Pre-trained Geometric Guidance](#pre-trained-geometric-guidance)\n  - [Model Efficiency](#model-efficiency)\n    - [Feature Efficiency](#feature-efficiency)\n    - [Representation Compaction](#representation-compaction)\n  - [Data & Visual Augmentation](#data--visual-augmentation)\n    - [Data Augmentation](#data-augmentation)\n    - [Visual Augmentation](#visual-augmentation)\n  - [Temporal-aware Models](#temporal-aware-models)\n    - [Online Streaming](#online-streaming)\n    - [Offline Processing](#offline-processing)\n    - [Interactive Modeling](#interactive-modeling)\n    - [Specialized Tasks](#specialized-tasks)\n- [Datasets and Benchmarks](#datasets)\n  - [Geometry Oriented](#geometry-oriented)\n  - [Visual Oriented](#visual-oriented)\n  - [Mixed](#mixed)\n- [Applications](#application)\n  - [Autonomous Driving](#autonomous-driving)\n  - [Robotics](#robotics)\n  - [SfM & SLAM](#sfm--slam)\n  - [Scene Understanding](#scene-understanding)\n  - [Video Generation](#video-generation)\n  - [Others](#others)\n\n## Taxonomy\n\n| \u003Cimg width=\"100%\" src=\"https:\u002F\u002Fff3d-survey.github.io\u002Fassets\u002Ftaxonomy.png\"> |\n|:-:|\n\n## Research Directions\n\n## Feature Enhancement\n\n### Advanced Encoding Architectures\n\n- pixelNeRF: Neural Radiance Fields from One or Few Images. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2012.02190) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fsxyu\u002Fpixel-nerf)]\n- IBRNet: Learning Multi-View Image-Based Rendering. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2102.13090) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fgoogleinterns\u002FIBRNet)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fibrnet.github.io\u002F)]\n- Splatter Image: Ultra-Fast Single-View 3D Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Frobots.ox.ac.uk\u002F~vgg\u002Fpublications\u002F2024\u002FSzymanowicz24\u002Fszymanowicz24.pdf) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fszymanowiczs\u002Fsplatter-image)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fszymanowiczs.github.io\u002Fsplatter-image)]\n- Convolutional Occupancy Networks. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.04618) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fautonomousvision\u002Fconvolutional_occupancy_networks)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fpengsongyou.github.io\u002Fconv_onet)]\n- Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2106.02634) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fvsitzmann\u002Flight-field-networks)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fvsitzmann.github.io\u002Flfns\u002F)]\n- Learned Initializations for Optimizing Coordinate-Based Neural Representations. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2012.02189) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ftancik\u002Flearnit)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwww.matthewtancik.com\u002Flearnit)]\n- Neural Rays for Occlusion-aware Image-based Rendering. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2107.13421) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fliuyuan-pal\u002FNeuRay)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fliuyuan-pal.github.io\u002FNeuRay\u002F)]\n- ${C}^{3}$-GS: Learning Context-aware, Cross-dimension, Cross-scale Feature for Generalizable Gaussian Splatting. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.20754) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FYuhsiHu\u002FC3-GS)]\n- Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2111.13152) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fstelzner\u002Fsrt)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fsrt-paper.github.io\u002F)]\n- VisionNeRF: Vision Transformer for NeRF-Based View Synthesis from a Single Input Image. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.05736) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fken2576\u002Fvision-nerf)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fcseweb.ucsd.edu\u002F~viscomp\u002Fprojects\u002FVisionNeRF\u002F)]\n- RePAST: Relative Pose Attention Scene Representation Transformer. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.00947)]\n- Is Attention All That NeRF Needs? [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.13298) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FVITA-Group\u002FGNT)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fvita-group.github.io\u002FGNT\u002F)]\n- Large Reconstruction Model (LRM).\n- Instant3D: Fast Text-to-3D with Sparse-view Generation and Large Reconstruction Model.\n- TripoSR: Fast 3D Object Reconstruction from a Single Image. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.02151)]\n- GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation.\n- GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.14621) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fimlixinyang\u002FGS-LRM)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fsai-bi.github.io\u002Fproject\u002Fgs-lrm\u002F)]\n- MeshLRM: Large Reconstruction Model for High-Quality Meshes.\n- MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model.\n- Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation.\n- LVSM: A Fully Data-Driven Approach to Novel View Synthesis.\n- Depth Anything 3: Recovering the Visual Space from Any Views.\n- Gamba: Marry Gaussian Splatting With Mamba for Single-View 3D Reconstruction.\n- MVGamba: Unify 3D Content Generation as State Space Sequence Modeling.\n- Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.12781) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Farthurhero\u002FLong-LRM)｜[:globe_with_meridians: Project Page](https:\u002F\u002Farthurhero.github.io\u002Fprojects\u002Fllrm\u002F)]\n\n### Cross-View Fusion\n\n- AttnRend: Learning to Render Novel Views from Wide-Baseline Stereo Pairs. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.08463) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fyilundu\u002Fcross_attention_renderer)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fyilundu.github.io\u002Fwide_baseline\u002F)]\n- Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2404.06109) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ftatakai1\u002FeFreeSplat)｜[:globe_with_meridians: Project Page](https:\u002F\u002Ftatakai1.github.io\u002Fefreesplat\u002F)]\n- LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.\n- DUSt3R: Geometric 3D Vision Made Easy. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.14132) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fnaver\u002Fdust3r)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fdust3r.europe.naverlabs.com\u002F)]\n- Grounding Image Matching in 3D with MASt3R. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.09756) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fnaver\u002Fmast3r)｜[:globe_with_meridians: Project Page](https:\u002F\u002Feurope.naverlabs.com\u002Fblog\u002Fmast3r-matching-and-stereo-3d-reconstruction\u002F)]\n- MV-DUSt3R: Multi-View Dense Stereo 3D Reconstruction.\n- MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.06974) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fmvdust3r)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fmv-dust3rp.github.io\u002F)]\n- PreF3R: Pose-Free Feed-Forward 3D Gaussian Splatting from Variable-length Image Sequence. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.16877) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FComputationalRobotics\u002FPreF3R)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fcomputationalrobotics.seas.harvard.edu\u002FPreF3R\u002F)]\n- 3D Reconstruction with Spatial Memory. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.16061) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FHengyiWang\u002Fspann3r)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fhengyiwang.github.io\u002Fprojects\u002Fspanner)]\n- Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.13928) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffast3r)｜[:globe_with_meridians: Project Page](https:\u002F\u002Ffast3r-3d.github.io\u002F)]\n- MUSt3R: Multi-view Network for Stereo 3D Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.01661) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fnaver\u002Fmust3r)｜[:globe_with_meridians: Project Page](https:\u002F\u002Feurope.naverlabs.com\u002Fresearch\u002Fpublications\u002Fmust3r-multi-view-network-for-stereo-3d-reconstruction\u002F)]\n- WinT3R: Window-Based Streaming Reconstruction with Camera Token Pool. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.05296)]\n- Continuous 3D Perception Model with Persistent State. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.12387) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FCUT3R\u002FCUT3R)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fcut3r.github.io\u002F)]\n- VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.19297)]\n- G-CUT3R: Guided 3D Reconstruction with Camera and Depth Prior Integration. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.11379)]\n- TTT3R: 3D Reconstruction as Test-Time Training. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.26645)]\n- Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.02863)]\n- VGGT: Visual Geometry Grounded Transformer. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.11651) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fvggt)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fvgg-t.github.io\u002F)]\n- iLRM: An Iterative Large 3D Reconstruction Model. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.23277) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FGynjn\u002FiLRM)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fgynjn.github.io\u002FiLRM\u002F)]\n- Dens3r: A Foundation Model for 3D Geometry Prediction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.16290)]\n- MoRE: 3D Visual Geometry Reconstruction Meets Mixture-of-Experts. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.27234)]\n- Uni3R: Unified 3D Reconstruction and Semantic Understanding via Generalizable Gaussian Splatting from Unposed Multi-view Images. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.03643)]\n- Flow3r: Factored Flow Prediction for Scalable Visual Geometry Learning. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.20157)]\n- Gen3R: 3D Scene Generation Meets Feed-Forward Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2601.04090)]\n- NOVA3R: Non-pixel-aligned Visual Transformer for Amodal 3D Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.04179) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fwrchen530\u002Fnova3r)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwrchen530.github.io\u002Fnova3r\u002F)]\n- IncVGGT: Incremental VGGT for Memory-Bounded Long-Range 3D Reconstruction.\n- ZipMap: Linear-Time Stateful 3D Reconstruction with Test-Time Training. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.04385)]\n- LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.03269)]\n- tttLRM: Test-Time Training for Long Context and Autoregressive 3D Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.20160)]\n- VGG-T3: Offline Feed-Forward 3D Reconstruction at Scale. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.23361)]\n\n### Integration of Visual Foundation Models\n\n- DUSt3R: Geometric 3D Vision Made Easy. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.14132) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fnaver\u002Fdust3r)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fdust3r.europe.naverlabs.com\u002F)]\n- Mono3R: Exploiting Monocular Cues for Geometric 3D Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.13419)]\n- Feat2GS: Probing Visual Foundation Models with Gaussian Splatting. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.09606) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffanegg\u002FFeat2GS)｜[:globe_with_meridians: Project Page](https:\u002F\u002Ffanegg.github.io\u002FFeat2GS\u002F)]\n- CATSplat: Context-Aware Transformer with Spatial Guidance for Generalizable 3D Gaussian Splatting from A Single-View Image. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.12906)]\n\n## Geometry-aware Improvement\n\n### Explicit Geometric Aggregation\n\n- MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.15595) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fapchenstu\u002Fmvsnerf)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fapchenstu.github.io\u002Fmvsnerf\u002F)]\n- Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2104.06935) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fjchibane\u002Fsrf)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fvirtualhumans.mpi-inf.mpg.de\u002Fsrf\u002F)]\n- GeoNeRF: Generalizing NeRF with Geometry Priors. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2111.13539) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fidiap\u002FGeoNeRF)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwww.idiap.ch\u002Fpaper\u002Fgeonerf\u002F)]\n- BoostMVSNeRFs: Boosting MVS-based NeRFs to Generalizable View Synthesis in Large-Scale Scenes.\n- Generalizable Patch-Based Neural Rendering. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2207.10662) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Fgoogle-research\u002Ftree\u002Fmaster\u002Fgen_patch_neural_rendering)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fmohammedsuhail.net\u002Fgen_patch_neural_rendering\u002F)]\n- MatchNeRF: Explicit Correspondence Matching for Generalizable Neural Radiance Fields. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.12294) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fdonydchen\u002Fmatchnerf)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fdonydchen.github.io\u002Fmatchnerf)]\n- GTA: A Geometry-Aware Attention Mechanism for Multi-View Transformers. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.10375) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fautonomousvision\u002Fgta)｜[:globe_with_meridians: Project Page](https:\u002F\u002Ftakerum.github.io\u002Fgta\u002F)]\n- MuRF: Multi-Baseline Radiance Fields. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.04565) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fautonomousvision\u002Fmurf)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fhaofeixu.github.io\u002Fmurf\u002F)]\n- SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views.\n- VolRecon: Volume Rendering of Signed Ray Distance Functions for Generalizable Multi-View Reconstruction.\n- ReTR: Modeling Depth Distribution for Generalizable Surface Reconstruction.\n- UFORecon: Generalizable Sparse-View Surface Reconstruction from Arbitrary and Unfavorable Sets.\n- SurfaceSplat: Connecting Surface Reconstruction and Gaussian Splatting. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.15602)]\n- RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.21925) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002Frenderformer)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fmicrosoft.github.io\u002Frenderformer\u002F)]\n- AGG: Amortized Generative 3D Gaussians for Single Image to 3D. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.04099)]\n- TGS: Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers.\n- LaRa: Efficient Large-Baseline Radiance Fields. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.04699) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fautonomousvision\u002FLaRa)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fapchenstu.github.io\u002FLaRa\u002F)]\n- MeshSplat: Generalizable Sparse-View Surface Reconstruction via Gaussian Splatting. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.17811)]\n- TranSplat: Geometry-Aware Feed-Forward Gaussian Splatting with Transformation Consistency.\n- H3R: Hybrid Multi-view Correspondence for Generalizable 3D Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.03118)]\n- MuGS: Multi-Baseline Generalizable Gaussian Splatting Reconstruction.\n- pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.12337) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fdcharatan\u002Fpixelsplat)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fdcharatan.github.io\u002Fpixelsplat)]\n- MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.14627) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fdonydchen\u002Fmvsplat)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fdonydchen.github.io\u002Fmvsplat)]\n- MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.15364) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FTQTQliu\u002FMVSGaussian)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fmvsgaussian.github.io\u002F)]\n\n### Refining Predicted 3D Scenes\n\n- FreeSplat: Generalizable 3D Gaussian Splatting Towards Free-View Synthesis of Indoor Scenes. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2405.17958) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fwangys16\u002FFreeSplat)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwangys16.github.io\u002FFreeSplat-project\u002F)]\n- HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.06245) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FOpen3DVLab\u002FHiSplat)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fopen3dvlab.github.io\u002FHiSplat\u002F)]\n- PixelGaussian: Generalizable 3D Gaussian Reconstruction from Arbitrary Views. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.18979) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FBarrybarry-Smith\u002FPixelGaussian)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwzzheng.net\u002FPixelGaussian)]\n- Gaussian Graph Network: Learning Efficient and Generalizable Gaussian Representations from Multi-view Images. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.16338) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fshengjun-zhang\u002FGGN)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fshengjun-zhang.github.io\u002FGGN\u002F)]\n- Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction.\n- G3R: Gradient Guided Generalizable Reconstruction.\n\n### Pose-Free Reconstruction\n\n- LEAP: Liberate Sparse-view 3D Modeling from Camera Poses. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2310.01410) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fhwjiang1510\u002FLEAP)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fhwjiang1510.github.io\u002FLEAP\u002F)]\n- Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.13912) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fbtsmart\u002Fsplatt3r)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fsplatt3r.active.vision\u002F)]\n- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.24207) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fcvg\u002FNoPoSplat)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fnoposplat.github.io\u002F)]\n- PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.22128) | [:computer: Code](https:\u002F\u002Fcvlab-kaist.github.io\u002FPF3plat\u002F)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fcvlab-kaist.github.io\u002FPF3plat\u002F)]\n- FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.09573) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FTencentARC\u002FFreeSplatter)]\n- FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.12138) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fant-research\u002FFLARE)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fzhanghe3z.github.io\u002FFLARE\u002F)]\n- Pow3R: Empowering Unconstrained 3D Reconstruction with Camera and Scene Priors. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.17316) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fnaver\u002Fpow3r)｜[:globe_with_meridians: Project Page](https:\u002F\u002Feurope.naverlabs.com\u002Fresearch\u002Fpublications\u002Fpow3r-empowering-unconstrained-3d-reconstruction-with-camera-and-scene-priors\u002F)]\n- RegGS: Unposed Sparse Views Gaussian Splatting with 3DGS Registration. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.08136) | [:computer: Code](https:\u002F\u002Fgithub.com\u002F3DAgentWorld\u002FRegGS)｜[:globe_with_meridians: Project Page](https:\u002F\u002F3dagentworld.github.io\u002Freggs\u002F)]\n- UFV-Splatter: Pose-Free Feed-Forward 3D Gaussian Splatting Adapted to Unfavorable Views. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.22342) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fyfujimura\u002FUFV-Splatter\u002F)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fyfujimura.github.io\u002FUFV-Splatter_page)]\n- π^3: Scalable Permutation-Equivariant Visual Geometry Learning. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.13347) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fyyfz\u002FPi3)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fyyfz.github.io\u002Fpi3\u002F)]\n- AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.23716)]\n- No Pose at All: Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views.\n- SPFSplatV2: Efficient Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.17246)]\n- PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-forward Planar Splatting.\n- YoNoSplat: You Only Need One Model for Feedforward 3D Gaussian Splatting. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2511.07321)]\n\n### Pre-trained Geometric Guidance\n\n- DepthSplat: Connecting Gaussian Splatting and Depth. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.13862) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fcvg\u002Fdepthsplat)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fhaofeixu.github.io\u002Fdepthsplat\u002F)]\n- Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.04343) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Feldar\u002Fflash3d)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwww.robots.ox.ac.uk\u002F~vgg\u002Fresearch\u002Fflash3d\u002F)]\n- Niagara: Normal-Integrated Geometric Affine Field for Scene Reconstruction from a Single View. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.12553) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fxianzuwu\u002FNiagara)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fai-kunkun.github.io\u002FNiagara_page\u002F)]\n- Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.05327) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Faim-uofa\u002FPM-Loss)｜[:globe_with_meridians: Project Page](https:\u002F\u002Faim-uofa.github.io\u002FPMLoss\u002F)]\n- Fin3R: Fine-tuning Feed-forward 3D Reconstruction Models via Monocular Knowledge Distillation. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2511.22429)]\n- JointSplat: Joint Depth and Flow Priors for Feed-Forward 3D Gaussian Splatting. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.03872)]\n\n## Model Efficiency\n\n### Feature Efficiency\n\n- Efficient Neural Radiance Fields for Interactive Free-viewpoint Video. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2304.04452) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Faoliao12138\u002FReRF)｜[:globe_with_meridians: Project Page](https:\u002F\u002Faoliao12138.github.io\u002FReRF\u002F)]\n- ProNeRF: Learning Efficient Projection-Aware Ray Sampling for Fine-Grained Implicit Neural Radiance Fields. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.08136) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FKAIST-VICLab\u002Fpronerf)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fkaist-viclab.github.io\u002Fpronerf-site\u002F)]\n- TinySplat: Feedforward Approach for Generating Compact 3D Scene Representation. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.09479)]\n- ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.23734) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fziplab\u002FZPressor)｜[:globe_with_meridians: Project Page](https:\u002F\u002Flhmd.top\u002Fzpressor\u002F)]\n- FastVGGT: Training-Free Acceleration of Visual Geometry Transformer. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.02560) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fmystorm16\u002FFastVGGT)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fmystorm16.github.io\u002Ffastvggt\u002F)]\n- Quantized Visual Geometry Grounded Transformer. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.21302)]\n- Faster VGGT with Block-Sparse Global Attention. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.07120)]\n- Evict3R: Training-Free Token Eviction for Memory-Bounded Streaming Visual Geometry Transformers. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.17650)]\n- LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2512.04939)]\n- Speed3R: Sparse Feed-forward 3D Reconstruction Models. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.08055)]\n- SR3R: Rethinking Super-Resolution 3D Reconstruction With Feed-Forward Gaussian Splatting. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.24020)]\n\n### Representation Compaction\n\n- Gaussian Graph Network: Learning Efficient and Generalizable Gaussian Representations from Multi-view Images. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.16338) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fshengjun-zhang\u002FGGN)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fshengjun-zhang.github.io\u002FGGN\u002F)]\n- PixelGaussian: Generalizable 3D Gaussian Reconstruction from Arbitrary Views. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.18979) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FBarrybarry-Smith\u002FPixelGaussian)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwzzheng.net\u002FPixelGaussian)]\n- FreeSplat++: Generalizable 3D Gaussian Splatting for Efficient Indoor Scene Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.22986) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fwangys16\u002FFreeSplatPP)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwangys16.github.io\u002FFreeSplatPP-Page\u002F)]\n- LongSplat: Online Generalizable 3D Gaussian Splatting from Long Sequence Images. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.16144) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FNVlabs\u002FLongSplat)｜[:globe_with_meridians: Project Page](https:\u002F\u002Flinjohnss.github.io\u002Flongsplat)]\n\n## Data & Visual Augmentation\n\n### Data Augmentation\n\n- MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.14166) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fhwjiang1510\u002FMegaSynth)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fhwjiang1510.github.io\u002FMegaSynth\u002F)]\n- Puzzles: Unbounded Video-Depth Augmentation for Scalable End-to-End 3D Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.23863) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FJiahao-Ma\u002Fpuzzles-code)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fjiahao-ma.github.io\u002Fpuzzles\u002F)]\n- Aug3D: Augmenting Large Scale Outdoor Datasets for Generalizable Novel View Synthesis. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.06431)]\n- MVBoost: Boost 3D Reconstruction with Multi-View Refinement.\n\n### Visual Augmentation\n\n- MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.04924) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fdonydchen\u002Fmvsplat360)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fdonydchen.github.io\u002Fmvsplat360)]\n- latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.16292) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FChrixtar\u002Flatentsplat)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fgeometric-rl.mpi-inf.mpg.de\u002Flatentsplat\u002F)]\n- ProSplat: Improved Feed-Forward 3D Gaussian Splatting for Wide-Baseline Sparse Views. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.07670)]\n- Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models.\n- Reconstruct, Inpaint, Finetune: Dynamic Novel-view Synthesis from Monocular Videos. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.12646)]\n\n## Temporal-aware Models\n\n### Online Streaming\n\n- StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.08862) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FDSL-Lab\u002FStreamSpat)]\n- Continuous 3D Perception Model with Persistent State. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.12387) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FCUT3R\u002FCUT3R)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fcut3r.github.io\u002F)]\n- DGS-LRM: Real-Time Deformable 3D Gaussian Reconstruction From Monocular Videos. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.09997) ｜[:globe_with_meridians: Project Page](https:\u002F\u002Fhubert0527.github.io\u002Fdgslrm\u002F)]\n- Stream3R: Scalable Sequential 3D Reconstruction with Causal Transformer. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.10893)]\n- LongStream: Long-Sequence Streaming Autoregressive Visual Geometry. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2602.13172)]\n\n### Offline Processing\n\n- L4GM: Large 4D Gaussian Reconstruction Model.\n- 4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.18890)]\n- MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion. [[:page_facing_up: Paper](https:\u002F\u002Fmonst3r-project.github.io\u002Ffiles\u002Fmonst3r_paper.pdf) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FJunyi42\u002Fmonst3r)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fmonst3r-project.github.io\u002F)]\n- Easi3R: Estimating Disentangled Motion from DUSt3R Without Training. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.24391) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FInception3D\u002FEasi3R)｜[:globe_with_meridians: Project Page](https:\u002F\u002Feasi3r.github.io\u002F)]\n- 4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.08015) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002F4dgt)｜[:globe_with_meridians: Project Page](https:\u002F\u002F4dgt.github.io\u002F)]\n- MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.10065) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fchenguolin\u002FMoVieS)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fchenguolin.github.io\u002Fprojects\u002FMoVieS\u002F)]\n- 4Real-Video-V2: Fused View-Time Attention and Feedforward Reconstruction for 4D Scene Generation. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.18839) ｜[:globe_with_meridians: Project Page](https:\u002F\u002Fsnap-research.github.io\u002F4Real-Video-V2\u002F)]\n- MonoFusion: Sparse-View 4D Reconstruction via Monocular Fusion. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.23782) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FImNotPrepared\u002FMonoFusion)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fimnotprepared.github.io\u002Fresearch\u002F25_DSR\u002F)]\n- Self-Supervised Monocular 4D Scene Reconstruction for Egocentric Videos. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.09145)]\n- Feed-forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.03526)]\n\n### Interactive Modeling\n\n- PIXIE: Physics from Pixels for Interactive Feed-Forward Scene Modeling. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.05361)]\n- PhysGM: Physical Gaussian Modeling for Interactive 3D Scene Editing. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.10442)]\n\n### Specialized Tasks\n\n- DAS3R: Dynamics-Aware Gaussian Splatting for Static Scene Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.19584) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fkai422\u002FDAS3R)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fkai422.github.io\u002FDAS3R\u002F)]\n- St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World.\n\n## Datasets and Benchmarks\n\n| \u003Cimg width=\"100%\" src=\"https:\u002F\u002Fff3d-survey.github.io\u002Fassets\u002Fdataset.png\"> |\n|:-:|\n\n## Geometry Oriented\n\n- DTU: Large Scale Multi-view Stereopsis Evaluation. [[:page_facing_up: Paper](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F6909453) | [:globe_with_meridians: Project Page](http:\u002F\u002Froboimagedata.compute.dtu.dk\u002F?page_id=36)]\n- 7Scenes: Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images. [[:page_facing_up: Paper](https:\u002F\u002Fieeexplore.ieee.org\u002Fdocument\u002F6619221) | [:globe_with_meridians: Project Page](https:\u002F\u002Fwww.microsoft.com\u002Fen-us\u002Fresearch\u002Fproject\u002Frgb-d-dataset-7-scenes\u002F)]\n\n## Visual Oriented\n\n- NeRF-Synthetic: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.08934) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fbmild\u002Fnerf)｜[:globe_with_meridians: Project Page](http:\u002F\u002Fmatthewtancik.com\u002Fnerf) | [🔗Data Link](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1cK3UDIJqKAAm7zyrxRYVFJ0BRMgrwhh4)]\n- Neural 3D Mesh Renderer (NMR): Neural 3D Mesh Renderer. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1711.07566) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fdaniilidis-group\u002Fneural_renderer)]\n- CelebA: Deep Learning Face Attributes in the Wild. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1411.7766) | [:globe_with_meridians: Project Page](https:\u002F\u002Fliuziwei7.github.io\u002Fprojects\u002FFaceAttributes.html) | [🔗Data Link](https:\u002F\u002Fmmlab.ie.cuhk.edu.hk\u002Fprojects\u002FCelebA.html)]\n- Consistent4D: Consistent 360° Dynamic Object Generation from Monocular Video. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.02848) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FyanqinJiang\u002FConsistent4D?tab=readme-ov-file)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fconsistent4d.github.io\u002F) | [🔗Data Link](https:\u002F\u002Fdrive.google.com\u002Ffile\u002Fd\u002F1mJNhFKvzZ-8icAw6KC-W-sf7JmmmMUkx\u002Fview?usp=sharing)]\n- NYUv2: Indoor Segmentation and Support Inference from RGBD Images. [[:page_facing_up: Paper](https:\u002F\u002Flink.springer.com\u002Fchapter\u002F10.1007\u002F978-3-642-33715-4_54) | [:globe_with_meridians: Project Page](https:\u002F\u002Fcs.nyu.edu\u002F~fergus\u002Fdatasets\u002Fnyu_depth_v2.html) | [🔗Data Link](https:\u002F\u002Fcs.nyu.edu\u002F~fergus\u002Fdatasets\u002Fnyu_depth_v2.html#raw_parts)]\n- Habitat: A Platform for Embodied AI Research. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1904.01201) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fhabitat-lab) | [🔗Data Link](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fhabitat-lab\u002Fblob\u002Fmain\u002FDATASETS.md)]\n- Hot3D: Hand and Object Tracking in 3D from Egocentric Multi-view Videos. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.09598) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fhot3d)｜[:globe_with_meridians: Project Page](https:\u002F\u002Ffacebookresearch.github.io\u002Fhot3d\u002F) | [🔗Data Link](https:\u002F\u002Fwww.projectaria.com\u002Fdatasets\u002Fhot3D\u002F)]\n- ACID: Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2012.09855) ｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwww.acidb.net\u002Fdataset) | [🔗Data Link](https:\u002F\u002Fwww.acidb.net\u002Fdownload)]\n- ENeRF-Outdoor: Efficient Neural Radiance Fields for Interactive Free-viewpoint Video. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2112.01517) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fzju3dv\u002FENeRF)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fzju3dv.github.io\u002Fenerf\u002F) | [🔗Data Link](https:\u002F\u002Fgithub.com\u002Fzju3dv\u002FENeRF\u002Fblob\u002Fmaster\u002Fdocs\u002Fenerf_outdoor.md)]\n- LLFF: Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1905.00889) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FFyusion\u002FLLFF)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fbmild.github.io\u002Fllff\u002F) | [🔗Data Link](https:\u002F\u002Fbmild.github.io\u002Fllff\u002F)]\n- Neural3DV: Neural 3D Video Synthesis from Multi-view Video. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2103.02597) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002FNeural_3D_Video)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fneural-3d-video.github.io\u002F) | [🔗Data Link](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002FNeural_3D_Video)]\n- EgoExo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2311.18259) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FEGO4D)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fego-exo4d-data.org\u002F) | [🔗Data Link](https:\u002F\u002Fego4ddataset.com\u002Fegoexo-license\u002F)]\n- DAVIS: A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1704.00675) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffperazzi\u002Fdavis-2017)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fdavischallenge.org\u002Fdavis2017) | [🔗Data Link](https:\u002F\u002Fdavischallenge.org\u002Fdavis2017\u002Fcode.html)]\n- Youtube-VOS: Sequence-to-sequence Video Object Segmentation. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1809.03327) ｜[:globe_with_meridians: Project Page](https:\u002F\u002Fyoutube-vos.org\u002F) | [🔗Data Link](https:\u002F\u002Fyoutube-vos.org\u002Fdataset\u002F)]\n- DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.16256) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FDL3DV-10K\u002FDataset)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fdl3dv-10k.github.io\u002FDL3DV-10K\u002F) | [🔗Data Link](https:\u002F\u002Fhuggingface.co\u002FDL3DV\u002Fdatasets)]\n- RealEstate10K: Stereo Magnification: Learning View Synthesis Using Multiplane Images. 
[[:page_facing_up: Paper](https:\u002F\u002Fresearch.google\u002Fpubs\u002Fpub46965\u002F) ｜[:globe_with_meridians: Project Page](https:\u002F\u002Fgoogle.github.io\u002Frealestate10k\u002F) | [🔗Data Link](https:\u002F\u002Fgoogle.github.io\u002Frealestate10k\u002Fdownload.html)]\n\n## Mixed\n\n- Google Scanned Objects (GSO): A High-Quality Dataset of 3D Scanned Household Items. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2204.11918) | [:globe_with_meridians: Project Page](https:\u002F\u002Fresearch.google\u002Fblog\u002Fscanned-objects-by-google-research-a-dataset-of-3d-scanned-common-household-items\u002F) | [🔗Data Link](https:\u002F\u002Fapp.gazebosim.org\u002FGoogleResearch\u002Ffuel\u002Fcollections\u002FScanned%20Objects%20by%20Google%20Research)]\n- OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2301.07525) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fomniobject3d\u002FOmniObject3D\u002Ftree\u002Fmain)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fomniobject3d.github.io\u002F) | [🔗Data Link](https:\u002F\u002Fopenxlab.org.cn\u002Fdatasets\u002FOpenXDLab\u002FOmniObject3D-New\u002Fexplore\u002Fmain)]\n- CO3D: Common Objects in 3D: Large-Scale Learning and Evaluation of Real-Life 3D Category Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2109.00512) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fco3d)｜[:globe_with_meridians: Project Page](https:\u002F\u002Feval.ai\u002Fweb\u002Fchallenges\u002Fchallenge-page\u002F1819\u002Foverview) | [🔗Data Link](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fco3d\u002Fblob\u002Fmain\u002Fco3d\u002Fdataset\u002Fdownload_dataset_impl.py)]\n- WildRGBD: RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2401.12592) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fwildrgbd\u002Fwildrgbd)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwildrgbd.github.io\u002F) | [🔗Data Link](https:\u002F\u002Fgithub.com\u002Fwildrgbd\u002Fwildrgbd\u002Fblob\u002Fmain\u002Fdownload.py)]\n- ShapeNet: An Information-Rich 3D Model Repository. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1512.03012) ｜[:globe_with_meridians: Project Page](https:\u002F\u002Fshapenet.org\u002F) | [🔗Data Link](https:\u002F\u002Fshapenet.org\u002Flogin\u002F)]\n- MVImgNet: A Large-Scale Dataset of Multi-View Images. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.06042) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FGAP-LAB-CUHK-SZ\u002FMVImgNet) | [🔗Data Link](https:\u002F\u002Fgithub.com\u002FGAP-LAB-CUHK-SZ\u002FMVImgNet\u002Fblob\u002Fmain\u002Fdownload_tool.zip)]\n- Objaverse: A Universe of Annotated 3D Objects. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.05663) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fallenai\u002Fobjaverse-xl)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fobjaverse.allenai.org\u002F) | [🔗Data Link](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fallenai\u002Fobjaverse-xl)]\n- Objaverse-XL: A Universe of 10M+ 3D Objects. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.05663) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fallenai\u002Fobjaverse-xl)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fobjaverse.allenai.org\u002F) | [🔗Data Link](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fallenai\u002Fobjaverse-xl)]\n- DeepVoxels: Learning Persistent 3D Feature Embeddings. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1812.01024) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fvsitzmann\u002Fdeepvoxels)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwww.vincentsitzmann.com\u002Fdeepvoxels\u002F) | [🔗Data Link](https:\u002F\u002Fdrive.google.com\u002Fopen?id=1ScsRlnzy9Bd_n-xw83SP-0t548v63mPH)]\n- MultiShapeNet: Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2111.13152) | [:computer: Code](https:\u002F\u002Fsrt-paper.github.io\u002F#code)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fsrt-paper.github.io\u002F) | [🔗Data Link](https:\u002F\u002Fsrt-paper.github.io\u002F#dataset)]\n- Amazon Berkeley Objects (ABO): Dataset and Benchmarks for Real-World 3D Object Understanding. [[:page_facing_up: Paper](https:\u002F\u002Famazon-berkeley-objects.s3.amazonaws.com\u002Fstatic_html\u002FABO_CVPR2022.pdf) ｜[:globe_with_meridians: Project Page](https:\u002F\u002Famazon-berkeley-objects.s3.amazonaws.com\u002Findex.html) | [🔗Data Link](https:\u002F\u002Famazon-berkeley-objects.s3.amazonaws.com\u002Findex.html#download)]\n- Replica: A Digital Replica of Indoor Spaces. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1906.05797) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002FReplica-Dataset) | [🔗Data Link](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002FReplica-Dataset\u002Fblob\u002Fmain\u002Fdownload.sh)]\n- TUM RGBD: Evaluating Egomotion and Structure-from-Motion Approaches Using the TUM RGB-D Benchmark. 
[[:page_facing_up: Paper](https:\u002F\u002Fwww.semanticscholar.org\u002Fpaper\u002FEvaluating-Egomotion-and-Structure-from-Motion-the-Sturm-Burgard\u002F3ad29f46efa040eb14d15be48a563af5b75463a8) ｜[:globe_with_meridians: Project Page](https:\u002F\u002Fcvg.cit.tum.de\u002Fdata\u002Fdatasets\u002Frgbd-dataset) | [🔗Data Link](http:\u002F\u002Fvision.in.tum.de\u002Fdata\u002Fdatasets\u002Frgbd-dataset\u002Fdownload)]\n- Matterport3D: Learning from RGB-D Data in Indoor Environments. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fpdf\u002F1709.06158.pdf) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fniessner\u002FMatterport)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fniessner.github.io\u002FMatterport\u002F) | [🔗Data Link](https:\u002F\u002Fniessner.github.io\u002FMatterport\u002F#download)]\n- Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2011.02523) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fapple\u002Fml-hypersim) | [🔗Data Link](https:\u002F\u002Fgithub.com\u002Fapple\u002Fml-hypersim\u002Fblob\u002Fmain\u002Fcode\u002Fpython\u002Ftools\u002Fdataset_download_images.py)]\n- ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1702.04405) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FScanNet\u002FScanNet)｜[:globe_with_meridians: Project Page](http:\u002F\u002Fwww.scan-net.org\u002F) | [🔗Data Link](https:\u002F\u002Fgithub.com\u002FScanNet\u002FScanNet?tab=readme-ov-file#scannet-data)]\n- ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2308.11417) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fscannetpp\u002Fscannetpp)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fscannetpp.mlsg.cit.tum.de\u002Fscannetpp\u002F) | [🔗Data Link](https:\u002F\u002Fscannetpp.mlsg.cit.tum.de\u002Fscannetpp\u002Fregister)]\n- ARKitScenes: A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D Data. [[:page_facing_up: Paper](https:\u002F\u002Fopenreview.net\u002Fforum?id=tjZjv_qh_CE) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fapple\u002FARKitScenes) | [🔗Data Link](https:\u002F\u002Fgithub.com\u002Fapple\u002FARKitScenes\u002Fblob\u002Fmain\u002FDATA.md)]\n- Virtual KITTI 2. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2001.10773) ｜[:globe_with_meridians: Project Page](https:\u002F\u002Feurope.naverlabs.com\u002Fresearch\u002Fproxy-virtual-worlds\u002F) | [🔗Data Link](https:\u002F\u002Fdownload.europe.naverlabs.com\u002F\u002Fvirtual_kitti_2.0.3\u002Fvkitti_2.0.3_md5_checksums.txt)]\n- Spring: A High-Resolution High-Detail Dataset and Benchmark for Scene Flow, Optical Flow and Stereo. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.01943) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fcv-stuttgart\u002Fsceneflow_from_blender)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fspring-benchmark.org\u002F) | [🔗Data Link](https:\u002F\u002Fspring-benchmark.org\u002Fdownload)]\n- MegaDepth: Learning Single-View Depth Prediction from Internet Photos. 
[[:page_facing_up: Paper](https:\u002F\u002Fwww.cs.cornell.edu\u002Fprojects\u002Fmegadepth\u002Fpaper.pdf) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Flixx2938\u002FMegaDepth)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwww.cs.cornell.edu\u002Fprojects\u002Fmegadepth\u002F) | [🔗Data Link](https:\u002F\u002Fwww.cs.cornell.edu\u002Fprojects\u002Fmegadepth\u002Fdataset\u002FMegadepth_v1\u002FMegaDepth_v1.tar.gz)]\n- PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2307.15055) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Faharley\u002Fpips2)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fpointodyssey.com\u002F) | [🔗Data Link](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1W6wxsbKbTdtV8-2TwToqa_QgLqRY3ft0?usp=drive_link)]\n- TartanAir: A Dataset to Push the Limits of Visual SLAM. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2003.14338) ｜[:globe_with_meridians: Project Page](https:\u002F\u002Ftheairlab.org\u002Ftartanair-dataset\u002F) | [🔗Data Link](https:\u002F\u002Fgithub.com\u002Fcastacks\u002Ftartanair_tools)]\n- Waymo: Scalability in Perception for Autonomous Driving. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1912.04838) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fwaymo-research\u002Fwaymo-open-dataset)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwaymo.com\u002Fopen\u002F) | [🔗Data Link](https:\u002F\u002Fwaymo.com\u002Fopen\u002Flicensing\u002F?continue=%2Fopen%2Fdownload%2F)]\n- nuScenes: A Multimodal Dataset for Autonomous Driving. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1903.11027) | [:globe_with_meridians: Project Page](https:\u002F\u002Fwww.nuscenes.org\u002F) | [🔗Data Link](https:\u002F\u002Fwww.nuscenes.org\u002Fnuscenes#download)]\n- Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2111.12077) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fgoogle-research\u002Fmultinerf)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fjonbarron.info\u002Fmipnerf360\u002F) | [🔗Data Link](http:\u002F\u002Fstorage.googleapis.com\u002Fgresearch\u002Frefraw360\u002F360_v2.zip)]\n- Tanks and Temples: Benchmarking Large-Scale Scene Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3072959.3073599) | [:globe_with_meridians: Project Page](https:\u002F\u002Fwww.tanksandtemples.org\u002F)]\n- ETH3D: A Multi-View Stereo Benchmark with High-Resolution Images and Multi-Camera Videos. [[:page_facing_up: Paper](https:\u002F\u002Fwww.eth3d.net\u002Fdata\u002Fschoeps2017cvpr.pdf) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FETH3D)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwww.eth3d.net\u002F) | [🔗Data Link](https:\u002F\u002Fwww.eth3d.net\u002Fslam_datasets)]\n- BlendedMVS: A Large-Scale Dataset for Generalized Multi-View Stereo Networks. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F1911.10127) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FYoYo000\u002FBlendedMVS) | [🔗Data Link](https:\u002F\u002Fgithub.com\u002FYoYo000\u002FBlendedMVS\u002Freleases\u002Fdownload\u002Fv1.0.0\u002FBlendedMVS.zip)]\n- DyCheck: Dynamic Gaussian Marbles for Novel View Synthesis of Casual Monocular Videos. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.18717) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fcoltonstearns\u002Fdynamic-gaussian-marbles)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fgeometry.stanford.edu\u002Fprojects\u002Fdynamic-gaussian-marbles.github.io\u002F) | [🔗Data Link](https:\u002F\u002Fdrive.google.com\u002Fdrive\u002Ffolders\u002F1hKlpqofQt4PhKLWw7kb4tI5CFgJE4Iu-?usp=drive_link)]\n\n## Applications\n\n| \u003Cimg width=\"100%\" src=\"https:\u002F\u002Fff3d-survey.github.io\u002Fassets\u002Fapplication.png\"> |\n|:-:|\n\n## Autonomous Driving\n\n- Driv3R: Learning Dense 4D Reconstruction for Autonomous Driving. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.06777) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FBarrybarry-Smith\u002FDriv3R)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwzzheng.net\u002FDriv3R\u002F)]\n- InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.03934) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fnv-tlabs\u002FInfiniCube)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fresearch.nvidia.com\u002Flabs\u002Ftoronto-ai\u002Finfinicube\u002F)]\n- DrivingRecon: Large 4D Gaussian Reconstruction Model for Autonomous Driving. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.09043) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FEnVision-Research\u002FDriveRecon)]\n- GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.03751) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fnv-tlabs\u002FGEN3C)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fresearch.nvidia.com\u002Flabs\u002Ftoronto-ai\u002FGEN3C\u002F)]\n- DrivingForward: Feed-forward 3D Gaussian Splatting for Driving Scene Reconstruction from Flexible Surround-view Input. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.12753) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffangzhou2000\u002FDrivingForward)]\n- STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.00602) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FNVlabs\u002FGaussianSTORM)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fjiawei-yang.github.io\u002FSTORM\u002F)]\n- SCube: Instant Large-Scale Scene Reconstruction using VoxSplats. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.20030) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fnv-tlabs\u002FSCube)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fresearch.nvidia.com\u002Flabs\u002Ftoronto-ai\u002Fscube\u002F)]\n- EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.20168) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FMiaosheng1\u002FEVolSplat)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fxdimlab.github.io\u002FEVolSplat\u002F)]\n- Omni-Scene: Omni-Gaussian Representation for Ego-Centric Sparse-View Scene Reconstruction. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.06273) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FWU-CVGL\u002FOmniScene)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwswdx.github.io\u002Fomniscene\u002F)]\n- BEV-GS: Feed-forward Gaussian Splatting in Bird's-Eye-View for Road Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.13207) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fcat-wwh\u002FBEV-GS)]\n- Efficient Depth-guided Urban View Synthesis. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2407.12395) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FMiaosheng1\u002FEDUS)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fxdimlab.github.io\u002FEDUS\u002F)]\n- DriveGen3D: Boosting Feed-Forward Driving Scene Generation with Efficient Video Diffusion. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.15264)｜[:globe_with_meridians: Project Page](https:\u002F\u002Flhmd.top\u002Fdrivegen3d)]\n- WorldSplat: Gaussian-Centric Feed-Forward 4D Scene Generation for Autonomous Driving. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.23402) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fwm-research\u002Fworldsplat)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fwm-research.github.io\u002Fworldsplat\u002F)]\n\n## Robotics\n\n### Manipulation\n\n- GraspNeRF: Multiview-based 6-DoF Grasp Detection for Transparent and Specular Objects Using Generalizable NeRF. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2210.06575) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FPKU-EPIC\u002FGraspNeRF)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fpku-epic.github.io\u002FGraspNeRF\u002F)]\n- ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.08321) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FGuanxingLu\u002FManiGaussian)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fguanxinglu.github.io\u002FManiGaussian\u002F)]\n- ManiGaussian++: General Robotic Bimanual Manipulation with Hierarchical Gaussian World Model. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.19842) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FApril-Yz\u002FManiGaussian_Bimanual)]\n- Query-based Semantic Gaussian Field for Scene Representation in Reinforcement Learning. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2406.02370)]\n- GAF: Gaussian Action Field as a Dynamic World Model for Robotic Manipulation. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.14135)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fchaiying1.github.io\u002FGAF.github.io\u002Fproject_page\u002F)]\n- GaussianGrasper: 3D Language Gaussian Splatting for Open-Vocabulary Robotic Grasping. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.09637) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FMrSecant\u002FGaussianGrasper)]\n- EmbodiedSplat: Personalized Real-to-Sim-to-Real Navigation with Gaussian Splats from a Mobile Device. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.17430)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fgchhablani.github.io\u002Fembodied-splat\u002F)]\n\n### Navigation\n\n- IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.00823) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FGWxuan\u002FIGL-Nav)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fgwxuan.github.io\u002FIGL-Nav\u002F)]\n- VR-Robo: A Real-to-Sim-to-Real Framework for Visual Robot Navigation and Locomotion. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.01536) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fzst1406217\u002FVR-Robo)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fvr-robo.github.io\u002F)]\n- GS-LTS: 3D Gaussian Splatting-Based Adaptive Modeling for Long-Term Service Robots. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.17733)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fvipl-vsu.github.io\u002F3DGS-LTS)]\n- UnitedVLN: Generalizable Gaussian Splatting for Continuous Vision-Language Navigation. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.16053)]\n\n## SfM & SLAM\n\n### SfM\n\n- Visual Geometry Grounded Deep Structure From Motion. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2312.04563) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fvggsfm)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fvggsfm.github.io\u002F)]\n- Light3R-SfM: Towards Feed-forward Structure-from-Motion. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.14914)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fselflein.github.io\u002FLight3R\u002F)]\n- MASt3R-SfM: A Fully-Integrated Solution for Unconstrained Structure-from-Motion. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2409.19152) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fnaver\u002Fmast3r)]\n- Regist3R: Incremental Registration with Stereo Foundation Model. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.12356) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FLiu-SD\u002FRegist3R)]\n- VGGT-Long: Chunk it, Loop it, Align it -- Pushing VGGT's Limits on Kilometer-scale Long RGB Sequences. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.16443) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FDengKaiCQ\u002FVGGT-Long)]\n- Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.13928) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Ffast3r)｜[:globe_with_meridians: Project Page](https:\u002F\u002Ffast3r-3d.github.io\u002F)]\n- FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2502.12138) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fant-research\u002FFLARE)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fzhanghe3z.github.io\u002FFLARE\u002F)]\n\n### SLAM\n\n- MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.12392) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Frmurai0610\u002FMASt3R-SLAM)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fedexheim.github.io\u002Fmast3r-slam\u002F)]\n- SLAM3R. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.09401) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FPKU-VCL-3DV\u002FSLAM3R)]\n- VGGT-SLAM. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.12549) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FMIT-SPARK\u002FVGGT-SLAM)]\n- ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.08551)]\n- EC3R-SLAM: Efficient and Consistent Monocular Dense SLAM with Feed-Forward 3D Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.02080)]\n- MASt3R-Fusion: Integrating Feed-Forward Visual Model with IMU, GNSS for High-Functionality SLAM. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.20757)]\n- ViSTA-SLAM: Visual SLAM with Symmetric Two-view Association. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.01584) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fzhangganlin\u002Fvista-slam)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fganlinzhang.xyz\u002Fvista-slam\u002F)]\n\n## Scene Understanding\n\n### Semantic\n\n- SLGaussian: Fast Language Gaussian Splatting in Sparse Views. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.08331)]\n- GSemSplat: Generalizable Semantic 3D Gaussian Splatting from Uncalibrated Image Pairs. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.16932)]\n- SegMASt3R: Geometry Grounded Segment Matching. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.05051)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fsegmast3r.github.io\u002F)]\n- PartField: Learning 3D Feature Fields for Part Segmentation and Beyond. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.11451) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fnv-tlabs\u002FPartField)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fresearch.nvidia.com\u002Flabs\u002Ftoronto-ai\u002Fpartfield-release\u002F)]\n- Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2403.15624) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fsharinka0715\u002Fsemantic-gaussians)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fsharinka0715.github.io\u002Fsemantic-gaussians\u002F)]\n- SemanticSplat: Feed-Forward 3D Scene Understanding with Language-Aware Gaussian Fields. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.09565) | [:computer: Code](https:\u002F\u002Fsemanticsplat.github.io\u002F)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fsemanticsplat.github.io\u002F)]\n- UniForward: Unified 3D Scene and Semantic Field Reconstruction via Feed-Forward Gaussian Splatting from Only Sparse-View Images. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.09378)]\n- Large Spatial Model: End-to-end Unposed Images to Semantic 3D. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2410.18956) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FNVlabs\u002FLSM)｜[:globe_with_meridians: Project Page](https:\u002F\u002Flargespatialmodel.github.io\u002F)]\n- AlignGS: Aligning Geometry and Semantics for Robust Indoor Reconstruction from Sparse Views. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.07839)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fmediax-sjtu.github.io\u002FAlignGS\u002F)]\n\n### 3D Scene Understanding\n\n- MLLMs Need 3D-Aware Representation Supervision for Scene Understanding. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2506.01946) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FVisual-AI\u002F3DRS)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fvisual-ai.github.io\u002F3drs)]\n- Spatio-Temporal LLM: Reasoning about Environments and Actions. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.05258) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fzoezheng126\u002FSpatio-Temporal-LLM)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fzoezheng126.github.io\u002FSTLLM-website\u002F)]\n- Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.23747) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fdiankun-wu\u002FSpatial-MLLM)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fdiankun-wu.github.io\u002FSpatial-MLLM\u002F)]\n- Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.24625) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FLaVi-Lab\u002FVG-LLM)｜[:globe_with_meridians: Project Page](https:\u002F\u002Flavi-lab.github.io\u002FVG-LLM\u002F)]\n- VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2505.20279) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FVITA-Group\u002FVLM-3R)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fvlm-3r.github.io\u002F)]\n\n## Video Generation\n\n### Reconstruction-enhanced Video Generation\n\n- MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2411.04924) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fdonydchen\u002Fmvsplat360)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fdonydchen.github.io\u002Fmvsplat360\u002F)]\n- JOG3R: Towards 3D-Consistent Video Generators. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2501.01409)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fpaulchhuang.github.io\u002Fjog3rwebsite\u002F)]\n- GenFusion: Closing the Loop between Reconstruction and Generation via Videos. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.21219) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FInception3D\u002FGenFusion)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fgenfusion.sibowu.com\u002F)]\n- ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2504.21855)｜[:globe_with_meridians: Project Page](https:\u002F\u002Frevision-video.github.io\u002F)]\n\n### Video Generation-based Scene Reconstruction\n\n- Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2507.07982) | [:computer: Code](https:\u002F\u002Fgithub.com\u002FCIntellifusion\u002FGeometryForcing)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fgeometryforcing.github.io\u002F)]\n- Video World Models with Long-term Spatial Memory. [[:page_facing_up: Paper](http:\u002F\u002Farxiv.org\u002Fabs\u002F2506.05284)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fspmem.github.io\u002F)]\n- SteerX: Creating Any Camera-Free 3D and 4D Scenes with Geometric Steering. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.12024) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fbyeongjun-park\u002FSteerX)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fbyeongjun-park.github.io\u002FSteerX\u002F)]\n- 4DNeX: Feed-Forward 4D Generative Modeling Made Easy. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.13154) | [:computer: Code](https:\u002F\u002Fgithub.com\u002F3DTopia\u002F4DNeX)｜[:globe_with_meridians: Project Page](https:\u002F\u002F4dnex.github.io\u002F)]\n- Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.19296)]\n- ShapeGen4D: Towards High Quality 4D Shape Generation from Videos. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.06208)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fshapegen4d.github.io\u002F)]\n- EvoWorld: Evolving Panoramic World Generation with Explicit 3D Memory. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2510.01183)]\n- WorldForge: Unlocking Emergent 3D\u002F4D Generation in Video Diffusion Model via Training-Free Guidance. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.15130)]\n- FantasyWorld: Geometry-Consistent World Modeling via Unified Video and 3D Prediction. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.21657)]\n\n## Others\n\n### Panorama\n\n- Splatter-360: Generalizable 360° Gaussian Splatting for Wide-baseline Panoramic Images. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.06250) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fthucz\u002Fsplatter360)｜[:globe_with_meridians: Project Page](https:\u002F\u002F3d-aigc.github.io\u002FSplatter-360\u002F)]\n- PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.12096) | [:computer: Code](https:\u002F\u002Fgithub.com\u002Fchengzhag\u002FPanSplat)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fchengzhag.github.io\u002Fpublication\u002Fpansplat\u002F)]\n- PanoVGGT: Feed-Forward 3D Reconstruction from Panoramic Imagery. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.17571)]\n\n### Localization\n\n- Reloc3r: Large-Scale Training of Relative Camera Pose Regression for Generalizable, Fast, and Accurate Visual Localization. 
[[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.23962)]\n- A Scene is Worth a Thousand Features: Feed-Forward Camera Localization from a Collection of Image Features. [[:page_facing_up: Paper](https:\u002F\u002Fopenreview.net\u002Fforum?id=rmDA02o8MV)]\n- Multi-View 3D Point Tracking. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.21060)｜[:globe_with_meridians: Project Page](https:\u002F\u002Fethz-vlg.github.io\u002Fmvtracker\u002F)]\n- SAIL-Recon: Large SfM by Augmenting Scene Regression with Localization. [[:page_facing_up: Paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2508.","Awesome Feed-Forward 3D is a curated list of resources covering research directions, datasets, and applications in feed-forward 3D scene modeling. The project details multiple research areas, including feature enhancement, geometry-aware improvement, model efficiency, and data and visual augmentation, and provides a rich set of reference links. It is suited to professionals doing research and development in 3D computer vision, autonomous driving, robotics, and related fields, helping them quickly catch up on the latest research results and technical progress. It is open-sourced under the MIT License, so its content can be widely shared and used.",2,"2026-06-11 02:43:53","CREATED_QUERY"]