[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72098":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":23,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":16,"starSnapshotCount":16,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},72098,"Depth-Anything-3","ByteDance-Seed\u002FDepth-Anything-3","ByteDance-Seed","Depth Anything 3","https:\u002F\u002Fdepth-anything-3.github.io\u002F",null,"Python",5528,607,47,163,0,54,99,289,162,114.35,"Apache License 2.0",false,"main",[],"2026-06-12 04:01:03","\u003Cdiv align=\"center\">\n\u003Ch1 style=\"border-bottom: none; margin-bottom: 0px \">Depth Anything 3: Recovering the Visual Space from Any Views\u003C\u002Fh1>\n\u003C!-- \u003Ch2 style=\"border-top: none; margin-top: 3px;\">Recovering the Visual Space from Any Views\u003C\u002Fh2> -->\n\n\n[**Haotong Lin**](https:\u002F\u002Fhaotongl.github.io\u002F)\u003Csup>&ast;\u003C\u002Fsup> · [**Sili Chen**](https:\u002F\u002Fgithub.com\u002FSiliChen321)\u003Csup>&ast;\u003C\u002Fsup> · [**Jun Hao Liew**](https:\u002F\u002Fliewjunhao.github.io\u002F)\u003Csup>&ast;\u003C\u002Fsup> · [**Donny Y. 
Chen**](https:\u002F\u002Fdonydchen.github.io)\u003Csup>&ast;\u003C\u002Fsup> · [**Zhenyu Li**](https:\u002F\u002Fzhyever.github.io\u002F) · [**Guang Shi**](https:\u002F\u002Fscholar.google.com\u002Fcitations?user=MjXxWbUAAAAJ&hl=en) · [**Jiashi Feng**](https:\u002F\u002Fscholar.google.com.sg\u002Fcitations?user=Q8iay0gAAAAJ&hl=en)\n\u003Cbr>\n[**Bingyi Kang**](https:\u002F\u002Fbingyikang.com\u002F)\u003Csup>&ast;&dagger;\u003C\u002Fsup>\n\n&dagger;project lead&emsp;&ast;Equal Contribution\n\n\u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2511.10647\">\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-Depth Anything 3-red' alt='Paper PDF'>\u003C\u002Fa>\n\u003Ca href='https:\u002F\u002Fdepth-anything-3.github.io'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject_Page-Depth Anything 3-green' alt='Project Page'>\u003C\u002Fa>\n\u003Ca href='https:\u002F\u002Fhuggingface.co\u002Fspaces\u002Fdepth-anything\u002FDepth-Anything-3'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Demo-blue'>\u003C\u002Fa>\n\u003C!-- \u003Ca href='https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fdepth-anything\u002FVGB'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FBenchmark-VisGeo-yellow' alt='Benchmark'>\u003C\u002Fa> -->\n\u003C!-- \u003Ca href='https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fdepth-anything\u002Fdata'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FBenchmark-xxx-yellow' alt='Data'>\u003C\u002Fa> -->\n\n\u003C\u002Fdiv>\n\nThis work presents **Depth Anything 3 (DA3)**, a model that predicts spatially consistent geometry from\narbitrary visual inputs, with or without known camera poses.\nIn pursuit of minimal modeling, DA3 yields two key insights:\n- 💎 A **single plain transformer** (e.g., vanilla DINO encoder) is sufficient as a backbone without architectural specialization,\n- ✨ A singular **depth-ray representation** obviates the 
need for complex multi-task learning.\n\n🏆 DA3 significantly outperforms\n[DA2](https:\u002F\u002Fgithub.com\u002FDepthAnything\u002FDepth-Anything-V2) for monocular depth estimation,\nand [VGGT](https:\u002F\u002Fgithub.com\u002Ffacebookresearch\u002Fvggt) for multi-view depth estimation and pose estimation.\nAll models are trained exclusively on **public academic datasets**.\n\n\u003C!-- \u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fimages\u002Fda3_teaser.png\" alt=\"Depth Anything 3\" width=\"100%\">\n\u003C\u002Fp> -->\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fimages\u002Fdemo320-2.gif\" alt=\"Depth Anything 3 - Left\" width=\"70%\">\n\u003C\u002Fp>\n\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Fimages\u002Fda3_radar.png\" alt=\"Depth Anything 3\" width=\"100%\">\n\u003C\u002Fp>\n\n\n## 📰 News\n- **11-12-2025:** 🚀 New models and [**DA3-Streaming**](da3_streaming\u002FREADME.md) released! Handle ultra-long video sequence inference with less than 12GB GPU memory via sliding-window streaming inference. Special thanks to [Kai Deng](https:\u002F\u002Fgithub.com\u002FDengKaiCQ) for his contribution to DA3-Streaming!\n- **08-12-2025:** 📊 [Benchmark evaluation pipeline](docs\u002FBENCHMARK.md) released! Evaluate pose estimation & 3D reconstruction on 5 datasets.\n- **30-11-2025:** Add [`use_ray_pose`](#use-ray-pose) and [`ref_view_strategy`](docs\u002Ffuncs\u002Fref_view_strategy.md) (reference view selection for multi-view inputs).   
\n- **25-11-2025:** Add [Awesome DA3 Projects](#-awesome-da3-projects), a community-driven section featuring DA3-based applications.\n- **14-11-2025:** Paper, project page, code and models are all released.\n\n## ✨ Highlights\n\n### 🏆 Model Zoo\nWe release three series of models, each tailored for specific use cases in visual geometry.\n\n- 🌟 **DA3 Main Series** (`DA3-Giant`, `DA3-Large`, `DA3-Base`, `DA3-Small`) These are our flagship foundation models, trained with a unified depth-ray representation. By varying the input configuration, a single model can perform a wide range of tasks:\n  + 🌊 **Monocular Depth Estimation**: Predicts a depth map from a single RGB image.\n  + 🌊 **Multi-View Depth Estimation**: Generates consistent depth maps from multiple images for high-quality fusion.\n  + 🎯 **Pose-Conditioned Depth Estimation**: Achieves superior depth consistency when camera poses are provided as input.\n  + 📷 **Camera Pose Estimation**:  Estimates camera extrinsics and intrinsics from one or more images.\n  + 🟡 **3D Gaussian Estimation**: Directly predicts 3D Gaussians, enabling high-fidelity novel view synthesis.\n\n- 📐 **DA3 Metric Series** (`DA3Metric-Large`) A specialized model fine-tuned for metric depth estimation in monocular settings, ideal for applications requiring real-world scale.\n\n- 🔍 **DA3 Monocular Series** (`DA3Mono-Large`). A dedicated model for high-quality relative monocular depth estimation. Unlike disparity-based models (e.g.,  [Depth Anything 2](https:\u002F\u002Fgithub.com\u002FDepthAnything\u002FDepth-Anything-V2)), it directly predicts depth, resulting in superior geometric accuracy.\n\n🔗 Leveraging these available models, we developed a **nested series** (`DA3Nested-Giant-Large`). 
This series combines an any-view giant model with a metric model to reconstruct visual geometry at a real-world metric scale.\n\n### 🛠️ Codebase Features\nOur repository is designed to be a powerful and user-friendly toolkit for both practical application and future research.\n- 🎨 **Interactive Web UI & Gallery**: Visualize model outputs and compare results with an easy-to-use Gradio-based web interface.\n- ⚡ **Flexible Command-Line Interface (CLI)**: Powerful and scriptable CLI for batch processing and integration into custom workflows.\n- 💾 **Multiple Export Formats**: Save your results in various formats, including `glb`, `npz`, depth images, `ply`, 3DGS videos, etc., to seamlessly connect with other tools.\n- 🔧 **Extensible and Modular Design**: The codebase is structured to facilitate future research and the integration of new models or functionalities.\n\n\n\u003C!-- ### 🎯 Visual Geometry Benchmark\nWe introduce a new benchmark to rigorously evaluate geometry prediction models on three key tasks: pose estimation, 3D reconstruction, and visual rendering (novel view synthesis) quality.\n\n- 🔄 **Broad Model Compatibility**: Our benchmark is designed to be versatile, supporting the evaluation of various models, including both monocular and multi-view depth estimation approaches.\n- 🔬 **Robust Evaluation Pipeline**: We provide a standardized pipeline featuring RANSAC-based pose alignment, TSDF fusion for dense reconstruction, and a principled view selection strategy for novel view synthesis.\n- 📊 **Standardized Metrics**: Performance is measured using established metrics: AUC for pose accuracy, F1-score and Chamfer Distance for reconstruction, and PSNR\u002FSSIM\u002FLPIPS for rendering quality.\n- 🌍 **Diverse and Challenging Datasets**: The benchmark spans a wide range of scenes from datasets like HiRoom, ETH3D, DTU, 7Scenes, ScanNet++, DL3DV, Tanks and Temples, and MegaDepth. 
-->\n\n\n## 🚀 Quick Start\n\n### 📦 Installation\n\n```bash\npip install xformers torch\\>=2 torchvision\npip install -e . # Basic\npip install --no-build-isolation git+https:\u002F\u002Fgithub.com\u002Fnerfstudio-project\u002Fgsplat.git@0b4dddf04cb687367602c01196913cde6a743d70 # for gaussian head\npip install -e \".[app]\" # Gradio, python>=3.10\npip install -e \".[all]\" # ALL\n```\n\nFor detailed model information, please refer to the [Model Cards](#-model-cards) section below.\n\n### 💻 Basic Usage\n\n```python\nimport glob, os, torch\nfrom depth_anything_3.api import DepthAnything3\ndevice = torch.device(\"cuda\")\nmodel = DepthAnything3.from_pretrained(\"depth-anything\u002FDA3NESTED-GIANT-LARGE\")\nmodel = model.to(device=device)\nexample_path = \"assets\u002Fexamples\u002FSOH\"\nimages = sorted(glob.glob(os.path.join(example_path, \"*.png\")))\nprediction = model.inference(\n    images,\n)\n# prediction.processed_images : [N, H, W, 3] uint8   array\nprint(prediction.processed_images.shape)\n# prediction.depth            : [N, H, W]    float32 array\nprint(prediction.depth.shape)  \n# prediction.conf             : [N, H, W]    float32 array\nprint(prediction.conf.shape)  \n# prediction.extrinsics       : [N, 3, 4]    float32 array # opencv w2c or colmap format\nprint(prediction.extrinsics.shape)\n# prediction.intrinsics       : [N, 3, 3]    float32 array\nprint(prediction.intrinsics.shape)\n```\n\n```bash\n\nexport MODEL_DIR=depth-anything\u002FDA3NESTED-GIANT-LARGE\n# This can be a Hugging Face repository or a local directory\n# If you encounter network issues, consider using the following mirror: export HF_ENDPOINT=https:\u002F\u002Fhf-mirror.com\n# Alternatively, you can download the model directly from Hugging Face\nexport GALLERY_DIR=workspace\u002Fgallery\nmkdir -p $GALLERY_DIR\n\n# CLI auto mode with backend reuse\nda3 backend --model-dir ${MODEL_DIR} --gallery-dir ${GALLERY_DIR} # Cache model to gpu\nda3 auto assets\u002Fexamples\u002FSOH \\\n    
--export-format glb \\\n    --export-dir ${GALLERY_DIR}\u002FTEST_BACKEND\u002FSOH \\\n    --use-backend\n\n# CLI video processing with feature visualization\nda3 video assets\u002Fexamples\u002Frobot_unitree.mp4 \\\n    --fps 15 \\\n    --use-backend \\\n    --export-dir ${GALLERY_DIR}\u002FTEST_BACKEND\u002Frobo \\\n    --export-format glb-feat_vis \\\n    --feat-vis-fps 15 \\\n    --process-res-method lower_bound_resize \\\n    --export-feat \"11,21,31\"\n\n# CLI auto mode without backend reuse\nda3 auto assets\u002Fexamples\u002FSOH \\\n    --export-format glb \\\n    --export-dir ${GALLERY_DIR}\u002FTEST_CLI\u002FSOH \\\n    --model-dir ${MODEL_DIR}\n\n```\n\nThe model architecture is defined in [`DepthAnything3Net`](src\u002Fdepth_anything_3\u002Fmodel\u002Fda3.py), and specified with a Yaml config file located at [`src\u002Fdepth_anything_3\u002Fconfigs`](src\u002Fdepth_anything_3\u002Fconfigs). The input and output processing are handled by [`DepthAnything3`](src\u002Fdepth_anything_3\u002Fapi.py). 
To customize the model architecture, simply create a new config file (*e.g.*, `path\u002Fto\u002Fnew\u002Fconfig`) as:\n\n```yaml\n__object__:\n  path: depth_anything_3.model.da3\n  name: DepthAnything3Net\n  args: as_params\n\nnet:\n  __object__:\n    path: depth_anything_3.model.dinov2.dinov2\n    name: DinoV2\n    args: as_params\n\n  name: vitb\n  out_layers: [5, 7, 9, 11]\n  alt_start: 4\n  qknorm_start: 4\n  rope_start: 4\n  cat_token: True\n\nhead:\n  __object__:\n    path: depth_anything_3.model.dualdpt\n    name: DualDPT\n    args: as_params\n\n  dim_in: &head_dim_in 1536\n  output_dim: 2\n  features: &head_features 128\n  out_channels: &head_out_channels [96, 192, 384, 768]\n```\n\nThen, the model can be created with the following code snippet.\n```python\nfrom depth_anything_3.cfg import create_object, load_config\n\nModel = create_object(load_config(\"path\u002Fto\u002Fnew\u002Fconfig\"))\n```\n\n\n\n## 📚 Useful Documentation\n\n- 🖥️ [Command Line Interface](docs\u002FCLI.md)\n- 📑 [Python API](docs\u002FAPI.md)\n- 📊 [Benchmark Evaluation](docs\u002FBENCHMARK.md)\n\n## 🗂️ Model Cards\n\nGenerally, you should observe that DA3-LARGE achieves comparable results to VGGT.\n\nThe Nested series uses an Any-view model to estimate pose and depth, and a monocular metric depth estimator for scaling. \n\n⚠️ Models with the `-1.1` suffix are retrained after fixing a training bug; prefer these refreshed checkpoints. The original `DA3NESTED-GIANT-LARGE`, `DA3-GIANT`, and `DA3-LARGE` remain available but are deprecated. You could expect much better performance for street scenes with the `-1.1` models.\n\n| 🗃️ Model Name                  | 📏 Params | 📊 Rel. Depth | 📷 Pose Est. | 🧭 Pose Cond. | 🎨 GS | 📐 Met. 
Depth | ☁️ Sky Seg | 📄 License     |\n|-------------------------------|-----------|---------------|--------------|---------------|-------|---------------|-----------|----------------|\n| **Nested** | | | | | | | | |\n| [DA3NESTED-GIANT-LARGE-1.1](https:\u002F\u002Fhuggingface.co\u002Fdepth-anything\u002FDA3NESTED-GIANT-LARGE-1.1)  | 1.40B     | ✅             | ✅            | ✅             | ✅     | ✅             | ✅         | CC BY-NC 4.0   |\n| [DA3NESTED-GIANT-LARGE](https:\u002F\u002Fhuggingface.co\u002Fdepth-anything\u002FDA3NESTED-GIANT-LARGE)  | 1.40B     | ✅             | ✅            | ✅             | ✅     | ✅             | ✅         | CC BY-NC 4.0   |\n| **Any-view Model** | | | | | | | | |\n| [DA3-GIANT-1.1](https:\u002F\u002Fhuggingface.co\u002Fdepth-anything\u002FDA3-GIANT-1.1)                     | 1.15B     | ✅             | ✅            | ✅             | ✅     |               |           | CC BY-NC 4.0   |\n| [DA3-GIANT](https:\u002F\u002Fhuggingface.co\u002Fdepth-anything\u002FDA3-GIANT)                     | 1.15B     | ✅             | ✅            | ✅             | ✅     |               |           | CC BY-NC 4.0   |\n| [DA3-LARGE-1.1](https:\u002F\u002Fhuggingface.co\u002Fdepth-anything\u002FDA3-LARGE-1.1)                     | 0.35B     | ✅             | ✅            | ✅             |       |               |           | CC BY-NC 4.0     |\n| [DA3-LARGE](https:\u002F\u002Fhuggingface.co\u002Fdepth-anything\u002FDA3-LARGE)                     | 0.35B     | ✅             | ✅            | ✅             |       |               |           | CC BY-NC 4.0     |\n| [DA3-BASE](https:\u002F\u002Fhuggingface.co\u002Fdepth-anything\u002FDA3-BASE)                     | 0.12B     | ✅             | ✅            | ✅             |       |               |           | Apache 2.0     |\n| [DA3-SMALL](https:\u002F\u002Fhuggingface.co\u002Fdepth-anything\u002FDA3-SMALL)                     | 0.08B     | ✅             | ✅            | ✅             |       |          
     |           | Apache 2.0     |\n|                               |           |               |              |               |               |       |           |                |\n| **Monocular Metric Depth** | | | | | | | | |\n| [DA3METRIC-LARGE](https:\u002F\u002Fhuggingface.co\u002Fdepth-anything\u002FDA3METRIC-LARGE)              | 0.35B     | ✅             |              |               |       | ✅             | ✅         | Apache 2.0     |\n|                               |           |               |              |               |               |       |           |                |\n| **Monocular Depth** | | | | | | | | |\n| [DA3MONO-LARGE](https:\u002F\u002Fhuggingface.co\u002Fdepth-anything\u002FDA3MONO-LARGE)                | 0.35B     | ✅             |              |               |               |       | ✅         | Apache 2.0     |\n\n\n## ❓ FAQ\n\n- **Monocular Metric Depth**: To obtain metric depth in meters from `DA3METRIC-LARGE`, use `metric_depth = focal * net_output \u002F 300.`, where `focal` is the focal length in pixels (typically the average of fx and fy from the camera intrinsic matrix K). Note that the output from `DA3NESTED-GIANT-LARGE` is already in meters.\n\n- \u003Ca id=\"use-ray-pose\">\u003C\u002Fa>**Ray Head (`use_ray_pose`)**: Our API and CLI support a `use_ray_pose` argument. When it is enabled, the model derives the camera pose from the ray head, which is generally slightly slower but more accurate. The default is `False`, which favors inference speed. 
\n  \u003Cdetails>\n  \u003Csummary>AUC3 Results for DA3NESTED-GIANT-LARGE\u003C\u002Fsummary>\n  \n  | Model | HiRoom | ETH3D | DTU | 7Scenes | ScanNet++ | \n  |-------|------|-------|-----|---------|-----------|\n  | `ray_head` | 84.4 | 52.6 | 93.9 | 29.5 | 89.4 |\n  | `cam_head` | 80.3 | 48.4 | 94.1 | 28.5 | 85.0 |\n\n  \u003C\u002Fdetails>\n\n\n\n\n- **Older GPUs without XFormers support**: See [Issue #11](https:\u002F\u002Fgithub.com\u002FByteDance-Seed\u002FDepth-Anything-3\u002Fissues\u002F11). Thanks to [@S-Mahoney](https:\u002F\u002Fgithub.com\u002FS-Mahoney) for the solution!\n\n\n## 🏢 Awesome DA3 Projects\n\nA community-curated list of Depth Anything 3 integrations across 3D tools, creative pipelines, robotics, and web\u002FVR viewers, including but not limited to these. You are welcome to submit your DA3-based project via PR, and we will review and feature it if applicable.\n\n- [DA3-blender](https:\u002F\u002Fgithub.com\u002Fxy-gao\u002FDA3-blender): Blender addon for DA3-based 3D reconstruction from a set of images. \n\n- [ComfyUI-DepthAnythingV3](https:\u002F\u002Fgithub.com\u002FPozzettiAndrea\u002FComfyUI-DepthAnythingV3): ComfyUI nodes for Depth Anything 3, supporting single\u002Fmulti-view and video-consistent depth with optional point‑cloud export.\n\n- [DA3-ROS2-Wrapper](https:\u002F\u002Fgithub.com\u002FGerdsenAI\u002FGerdsenAI-Depth-Anything-3-ROS2-Wrapper): Real-time DA3 depth in ROS2 with multi-camera support. 
\n\n- [DA3-ROS2-CPP-TensorRT](https:\u002F\u002Fgithub.com\u002Fika-rwth-aachen\u002Fros2-depth-anything-v3-trt): DA3 ROS2 C++ TensorRT Inference Node: a ROS2 node for DA3 depth estimation using TensorRT for real-time inference.\n\n- [VideoDepthViewer3D](https:\u002F\u002Fgithub.com\u002Famariichi\u002FVideoDepthViewer3D): Streaming videos with DA3 metric depth to a Three.js\u002FWebXR 3D viewer for VR\u002Fstereo playback.\n\n\n## 🧑‍💻 Official Codebase Core Contributors and Maintainers\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd align=\"center\">\n      \u003Ca href=\"https:\u002F\u002Fbingykang.github.io\u002F\">\n        \u003Cimg src=\"https:\u002F\u002Fimages.weserv.nl\u002F?url=https:\u002F\u002Fbingykang.github.io\u002Fimages\u002Fbykang_homepage.jpeg?h=100&w=100&fit=cover&mask=circle&maxage=7d\" width=\"100px;\" alt=\"\"\u002F>\n      \u003C\u002Fa>\n        \u003Cbr \u002F>\n        \u003Csub>\u003Cb>Bingyi Kang\u003C\u002Fb>\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      \u003Ca href=\"https:\u002F\u002Fhaotongl.github.io\u002F\">\n        \u003Cimg src=\"https:\u002F\u002Fimages.weserv.nl\u002F?url=https:\u002F\u002Fhaotongl.github.io\u002Fassets\u002Fimg\u002Fprof_pic.jpg?h=100&w=100&fit=cover&mask=circle&maxage=7d\" width=\"100px;\" alt=\"\"\u002F>\n      \u003C\u002Fa>\n        \u003Cbr \u002F>\n        \u003Csub>Haotong Lin\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FSiliChen321\">\n        \u003Cimg src=\"https:\u002F\u002Fimages.weserv.nl\u002F?url=https:\u002F\u002Favatars.githubusercontent.com\u002Fu\u002F195901058?v=4&h=100&w=100&fit=cover&mask=circle&maxage=7d\" width=\"100px;\" alt=\"\"\u002F>\n      \u003C\u002Fa>\n        \u003Cbr \u002F>\n        \u003Csub>Sili Chen\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      \u003Ca href=\"https:\u002F\u002Fliewjunhao.github.io\u002F\">\n        \u003Cimg 
src=\"https:\u002F\u002Fimages.weserv.nl\u002F?url=https:\u002F\u002Fliewjunhao.github.io\u002Fimages\u002Fliewjunhao.png?h=100&w=100&fit=cover&mask=circle&maxage=7d\" width=\"100px;\" alt=\"\"\u002F>\n       \u003C\u002Fa>\n        \u003Cbr \u002F>\n        \u003Csub>Jun Hao Liew\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      \u003Ca href=\"https:\u002F\u002Fdonydchen.github.io\u002F\">\n        \u003Cimg src=\"https:\u002F\u002Fimages.weserv.nl\u002F?url=https:\u002F\u002Fdonydchen.github.io\u002Fassets\u002Fimg\u002Fprofile.jpg?h=100&w=100&fit=cover&mask=circle&maxage=7d\" width=\"100px;\" alt=\"\"\u002F>\n      \u003C\u002Fa>\n        \u003Cbr \u002F>\n        \u003Csub>Donny Y. Chen\u003C\u002Fsub>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FDengKaiCQ\">\n        \u003Cimg src=\"https:\u002F\u002Fimages.weserv.nl\u002F?url=https:\u002F\u002Favatars.githubusercontent.com\u002Fu\u002F59907452?v=4&h=100&w=100&fit=cover&mask=circle&maxage=7d\" width=\"100px;\" alt=\"\"\u002F>\n      \u003C\u002Fa>\n        \u003Cbr \u002F>\n        \u003Csub>Kai Deng\u003C\u002Fsub>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n## 📝 Citations\nIf you find Depth Anything 3 useful in your research or projects, please cite our work:\n\n```\n@article{depthanything3,\n  title={Depth Anything 3: Recovering the visual space from any views},\n  author={Haotong Lin and Sili Chen and Jun Hao Liew and Donny Y. Chen and Zhenyu Li and Guang Shi and Jiashi Feng and Bingyi Kang},\n  journal={arXiv preprint arXiv:2511.10647},\n  year={2025}\n}\n```\n","Depth Anything 3 (DA3) is a model that recovers spatial geometry from visual inputs taken from arbitrary viewpoints, with or without known camera poses. Its core design uses a single plain transformer (such as a vanilla DINO encoder) as the backbone together with a unified depth-ray representation, which avoids the need for complex multi-task learning. DA3 performs strongly on monocular depth estimation and on multi-view depth and pose estimation, surpassing the earlier DA2 and VGGT models. It suits applications that need depth information extracted from images or video sequences, such as autonomous driving, augmented reality, and 3D reconstruction. The project is trained on public academic datasets, which supports its broad applicability and accuracy.",2,"2026-06-11 03:40:21","high_star"]