[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80045":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":16,"stars30d":13,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":17,"rankGlobal":10,"rankLanguage":10,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":19,"hasPages":19,"topics":21,"createdAt":10,"pushedAt":10,"updatedAt":22,"readmeContent":23,"aiSummary":24,"trendingCount":16,"starSnapshotCount":16,"syncStatus":25,"lastSyncTime":26,"discoverSource":27},80045,"OmniNavBench","AutoLab-SAI-SJTU\u002FOmniNavBench","AutoLab-SAI-SJTU","[RSS 2026] Official code & data for \"OmniNavBench: Beyond Isolation — A Unified Benchmark for General-Purpose Navigation\"","",null,"Python",66,3,63,1,0,1.81,"MIT License",false,"main",[],"2026-06-12 02:03:57","\u003Cdiv align=\"center\">\n\n# OmniNavBench\n\n**Beyond Isolation: A Unified Benchmark for General-Purpose Navigation**\n\n[![RSS 2026](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FRSS-2026-blue)](https:\u002F\u002Froboticsconference.org\u002F)\n[![arXiv](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.09441-red)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.09441)\n[![Leaderboard](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLeaderboard-Live-orange)](http:\u002F\u002Fomninavbench.cloud-ip.cc\u002F)\n[![Dataset](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Dataset-yellow)](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FAutoLab-SJTU\u002FOmniNavBench)\n[![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-green)](LICENSE)\n\n\u003C\u002Fdiv>\n\n## 🔥 News\n\n- **[2026.05]** 🎉 OmniNavBench is accepted to **RSS 2026**.\n- **[2026.05]** Code release.\n- **[2026.05]** Leaderboard 
live at \u003Chttp:\u002F\u002Fomninavbench.cloud-ip.cc\u002F>.\n\n## 📝 TODO\n\n- [x] Code release\n- [x] Leaderboard submission portal\n- [x] Paper release\n- [x] Dataset release\n- [ ] Data-generation pipeline release\n- [ ] Replay pipeline release\n- [ ] Docker version release\n\n## 🔎 Overview\n\nMost embodied-navigation benchmarks isolate a single skill (PointNav, VLN, ObjectNav, SocialNav, Human Following, or EQA) on a single robot morphology, against shortest-path reference data. **OmniNavBench** breaks all three constraints at once: composite instructions that interleave six sub-task families, three robot embodiments, and reference trajectories collected from human teleoperation rather than A\\* shortest-path planners.\n\n**Three paradigm shifts:**\n\n- 🧩 **Compositional complexity** — every instruction weaves together **at least two of six sub-task primitives** (PointNav, VLN, ObjectNav, SocialNav, Human Following, EQA), forcing agents to switch strategies mid-episode while satisfying overarching SocialNav \u002F EQA constraints.\n- 🤖 **Morphological universality & sensor flexibility** — the same instruction set runs on **H1 humanoid, Aliengo quadruped, and Carter wheeled** robots through a modular sensor interface (RGB-D, LiDAR, panoramic), across 170 environments blending 85 GRScenes synthetic assets and 85 real-world Matterport3D scans.\n- 🧑‍✈️ **Naturalistic human demonstrations** — **1,779 expert trajectories collected via human teleoperation**, 16.7 m average length, 29.5 km cumulative, 24 hours of egocentric RGB-D and 2.6 M frames. 
The data captures exploratory glance, anticipatory avoidance, and other behaviours shortest-path planners cannot reproduce.\n\n**At a glance:**\n\n| | |\n|---|---|\n| Sub-task families | PointNav · VLN · ObjectNav · SocialNav · Human Following · EQA |\n| Robot embodiments | H1 humanoid · Aliengo quadruped · Carter wheeled |\n| Environments | 170 (85 GRScenes synthetic + 85 Matterport3D real) |\n| Composite instructions | 1,779 base · 7,116 with 4 linguistic styles |\n| Reference video | 1,700+ teleoperated demonstrations · 2.6 M frames |\n| Trajectory-only runtime | scoring is offline; local eval and leaderboard submission go through the same `bench\u002Fevaluator\u002Foffline_test.py` code path |\n| Bring your own policy | one HTTP endpoint to implement; reference adapters bundled as templates |\n\n## 📋 Requirements\n\n| Component | Version | Notes |\n|---|---|---|\n| OS | Linux | Vulkan-required; Windows not supported |\n| Python | 3.11 | conda recommended |\n| Isaac Sim | 5.0.0 | install via the [Isaac Lab pip-installation guide](https:\u002F\u002Fisaac-sim.github.io\u002FIsaacLab\u002Fmain\u002Fsource\u002Fsetup\u002Finstallation\u002Fpip_installation.html) |\n| Isaac Lab | 2.3.0 | same guide; the [omni.isaac.matterport](https:\u002F\u002Fgithub.com\u002FSun-Season\u002Fomni.isaac.matterport.git) extension under `IsaacLab\u002Fsource\u002F` is required |\n| GPU | NVIDIA, CUDA 12.8 | ≥ 24 GB VRAM recommended for policy servers |\n| RAM | ≥ 32 GB | Isaac Sim baseline |\n\nPython packages: see [`pyproject.toml`](pyproject.toml). They install via `pip install -e .` after the Isaac Lab guide (which handles `isaacsim` \u002F `isaaclab` themselves).\n\n## 🛠️ Installation\n\n### 1. Install Isaac Sim and Isaac Lab\n\nFollow NVIDIA's official guide: [Isaac Lab — pip installation](https:\u002F\u002Fisaac-sim.github.io\u002FIsaacLab\u002Fmain\u002Fsource\u002Fsetup\u002Finstallation\u002Fpip_installation.html). 
It walks you through creating a Python 3.11 conda env, pip-installing Isaac Sim, and installing Isaac Lab in one place.\n\n> Tested with **Isaac Sim 5.0.0** + **Isaac Lab 2.3.0** + **Python 3.11**. Older or newer versions may also work but require significant adaptation, as Isaac version updates often rename or restructure dependencies.\n\n### 2. Clone this repo and install the Python deps\n\n```bash\ngit clone \u003Cthis-repo> ~\u002FOmniNavBench\ncd ~\u002FOmniNavBench\npip install -e .\n```\n\n`pip install -e .` resolves the runtime dependencies declared in `pyproject.toml` (without touching the Isaac Sim \u002F Isaac Lab install from Step 1). The shell helper `load_local_paths.sh` (see next section) additionally puts the repo root on `PYTHONPATH` so `import bench`, `import OmniNav`, `import OmniNavExt` resolve from any working directory.\n\n## 🔧 Configuring for Your Machine\n\n**Data lives anywhere on disk.** This repo and the dataset are independent — the dataset can sit on a separate drive (e.g. `\u002Fmedia\u002F\u003Cuser>\u002F\u003Csome-disk>\u002FOmniNavBench`), inside your home directory, or anywhere else. You just tell the runner where to look via two environment variables.\n\nAfter cloning, copy the template and edit it once:\n\n```bash\ncp local_paths.env.example local_paths.env\n$EDITOR local_paths.env\n```\n\nThe two paths you must set:\n\n```bash\n# local_paths.env\nOMNINAV_BENCH_DATASET_ROOT=\"\u002Fabsolute\u002Fpath\u002Fto\u002FOmniNavBench\"   # OmniNavBench dataset root\nOMNINAV_SCENE_ROOT=\"\u002Fabsolute\u002Fpath\u002Fto\u002FAssets\"                 # GRScenes + Matterport3D scene assets\n#OMNINAV_ISAACLAB_SOURCE=\"\u002Fabsolute\u002Fpath\u002Fto\u002FIsaacLab\u002Fsource\"  # optional; auto-detected if Isaac Lab is on a standard path\n```\n\nSource it once per shell:\n\n```bash\nsource load_local_paths.sh\n```\n\nThis sets `OMNINAV_REPO_ROOT`, prepends the repo to `PYTHONPATH`, and exports the variables from `local_paths.env`. 
After this, `runBench.py` picks up the data and scene paths automatically — no CLI flags required.\n\n> **Precedence:** explicit CLI flags (`--omninavbench-root`, `--scene-root`) override the env vars. The env vars override nothing else — if neither is set when you use `--omninavbench`, `runBench.py` exits with a clear error pointing you back to this section.\n\n## 📦 Data\n\n### OmniNavBench (primary)\n\nDownload the dataset from [AutoLab-SJTU\u002FOmniNavBench](https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002FAutoLab-SJTU\u002FOmniNavBench) (HuggingFace) and unpack it anywhere on disk. Point at the unpack location via `OMNINAV_BENCH_DATASET_ROOT` in `local_paths.env` (or pass `--omninavbench-root \u002Fpath` on the CLI). The expected layout under that root:\n\n```\nOmniNavBench\u002F\n├── annotations\u002F                              # scenario JSONs consumed by runBench.py\n│   ├── train\u002F                                # with GT — local offline scoring is supported\n│   │   └── {original,concise,verbose,first_person}\u002F\n│   │       └── {human,dog,car}\u002F              # robot dirs: human=H1, dog=Aliengo, car=Carter\n│   │           └── \u003Cscene_id>\u002F\n│   │               └── final_episode_N.json\n│   └── test\u002F                                 # sanitized (no GT) — submit results to the leaderboard\n│       └── \u003Cstyle>\u002F\u003Crobot>\u002F\u003Cscene>\u002F...\n│\n└── videos\u002F                                   # GT replay videos (optional, train split only)\n    └── train\u002F...\n```\n\nThe `human` \u002F `dog` \u002F `car` directories are **robot embodiments**, not object types. 
Mapping:\n\n| `--robot` flag | Dataset directory | Robot model |\n|---|---|---|\n| `h1` | `human\u002F` | Unitree H1 humanoid |\n| `aliengo` | `dog\u002F` | Unitree Aliengo quadruped |\n| `carter` | `car\u002F` | NVIDIA Carter wheeled |\n\n### Scene assets\n\nOmniNavBench uses a hybrid suite of **170 environments**: 85 high-fidelity synthetic assets from [GRScenes](https:\u002F\u002Fgithub.com\u002FOpenRobotLab\u002FGRUtopia) and 85 photorealistic real-world scans from [Matterport3D](https:\u002F\u002Fgithub.com\u002Fniessner\u002FMatterport). Both download separately and live under the `OMNINAV_SCENE_ROOT` directory you set in [Configuring for Your Machine](#-configuring-for-your-machine). Matterport3D usage is governed by the [Matterport3D Terms of Use](https:\u002F\u002Fgithub.com\u002Fniessner\u002FMatterport).\n\n## 🚀 Quick Start\n\nA no-server smoke run end-to-end — verifies the simulator + I\u002FO wiring without any policy server. The `forward` policy just drives the robot straight forward.\n\n```bash\nsource load_local_paths.sh   # once per shell — exports the data\u002Fscene roots\npython runBench.py \\\n    --omninavbench --mode test --robot h1 --style original \\\n    --config configs\u002Faliengoh1_test.yaml \\\n    --output results\u002Fsmoke\u002F \\\n    --policy forward \\\n    --headless\n```\n\nThe dataset path comes from `OMNINAV_BENCH_DATASET_ROOT` (set in `local_paths.env`); pass `--omninavbench-root \u002Fpath` only if you want to override it for this run.\n\n## 🔬 Running Evaluations\n\n### Full policy evaluation (test mode → submit to leaderboard)\n\nYour policy runs as an HTTP server in its own process. `runBench.py` queries it once per simulation step. 
Start the server first, then point `runBench.py` at its URL.\n\n```bash\n# 1) Start your policy server (example below uses a bundled reference adapter)\npython -m bench.policy.\u003Cyour_adapter>.\u003Cyour_server> --port \u003Cport> [your-args]\n\n# 2) Run the benchmark (dataset path picked up from local_paths.env)\npython runBench.py \\\n    --omninavbench --mode test --robot h1 --style original \\\n    --config configs\u002Faliengoh1_test.yaml \\\n    --output results\u002Fmy_run\u002F \\\n    --policy \u003Cyour_policy> \\\n    --\u003Cyour_policy>-server-url http:\u002F\u002Flocalhost:\u003Cport> \\\n    --headless\n```\n\nThe `--output` directory after a run contains per-episode trajectories and a minimal `summary.json` (steps \u002F time only). **No scoring fields are written** by the runtime — scoring is exclusively offline.\n\n### Bringing your own policy\n\nThe benchmark talks to any policy via a small HTTP protocol. To benchmark a new policy: (1) write a server that exposes the same step\u002Faction endpoint that `bench\u002Fpolicy\u002F\u003Creference_adapter>\u002F` uses, (2) add a `--policy \u003Cname>` choice in `runBench.py` that wires its URL flag, (3) run as above. Use any of the bundled reference adapters as a copy-paste template.\n\n### Reference policy adapters (already wired)\n\nThese are external policies we tested as part of building this benchmark. 
They are **examples**, not the benchmark itself — bring your own policy to actually evaluate something new.\n\n| `--policy` | Reference for | Notes |\n|---|---|---|\n| `forward` | smoke-test sanity check | built-in, no server, drives the robot straight forward |\n| `uninavid` | Uni-NaVid (3rd-party) | external repo + checkpoint required |\n| `mtu3d` | MTU3D (3rd-party) | external repo + checkpoint required |\n| `poliformer` | PoliFormer (3rd-party) | external repo + checkpoint required |\n| `omninav` | OmniNav (3rd-party) | external repo + checkpoint required |\n\nServer ports are user-chosen — start the server with `--port \u003Cport>` and pass the matching URL to `runBench.py` via `--\u003Cpolicy>-server-url`. Per-policy launch commands and required checkpoints are in `HowtoTestModel.md`.\n\n### Robot ↔ config compatibility\n\n| `--robot` | Recommended `--config` |\n|---|---|\n| `h1` | `configs\u002Faliengoh1_test.yaml` |\n| `aliengo` | `configs\u002Faliengoh1_test.yaml` |\n| `carter` | `configs\u002Fcarter_v1_test.yaml` |\n\n### Batch helpers (multiple scenes or styles)\n\nFor larger sweeps, two thin wrappers around `runBench.py` ship in the repo root.\n\n`benchtestbatch.sh` runs one `runBench.py` per scene in parallel across the GPUs `nvidia-smi -L` reports. 
Reads `OMNINAV_BENCH_DATASET_ROOT` from `local_paths.env` and walks `annotations\u002F\u003Cmode>\u002F\u003Cstyle>\u002F\u003Crobot>\u002F\u003Cscene>\u002F` itself.\n\n```bash\n# forward smoke-test across every scene\u002Frobot in the test split\n.\u002Fbenchtestbatch.sh --mode test --style original\n\n# evaluate your own server-backed policy\n.\u002Fbenchtestbatch.sh --mode test --style concise \\\n    --policy omninav --server-url http:\u002F\u002Flocalhost:\u003Cport>\n```\n\nSelected flags: `--policy NAME` (default `forward`), `--robot h1,aliengo,carter` (comma-separated, defaults to all three), `--mode train|test` (default `test`), `--style original|concise|verbose|first_person` (default `original`), `--workers-per-gpu N` (default 4), `--num-gpus N` (default = autodetected), `--server-url URL`, `--skip-completed`.\n\n`benchteststyle.sh` sweeps one robot across all four instruction styles via the `--omninavbench` shortcut:\n\n```bash\n.\u002Fbenchteststyle.sh --robot aliengo --policy forward\n.\u002Fbenchteststyle.sh --robot h1 --policy omninav --server-url http:\u002F\u002Flocalhost:\u003Cport>\n```\n\nBoth scripts pass exactly one `--\u003Cpolicy>-server-url` flag (the one matching `--policy`); for `--policy forward`, `--server-url` is ignored.\n\n### Local scoring (train split only — has GT)\n\nRun the benchmark in train mode, then score offline:\n\n```bash\n# 1) Run on train (GT present in private envset)\npython runBench.py \\\n    --omninavbench --mode train --robot aliengo --style concise \\\n    --config configs\u002Faliengoh1_test.yaml \\\n    --output results\u002Fmy_train_run\u002F \\\n    --policy \u003Cyour_policy> --\u003Cyour_policy>-server-url http:\u002F\u002Flocalhost:\u003Cport> \\\n    --headless\n\n# 2) Score against the (with-GT) train annotations\npython -m bench.evaluator.offline_test \\\n    --private \"$OMNINAV_BENCH_DATASET_ROOT\u002Fannotations\u002Ftrain\u002Fconcise\u002Fdog\" \\\n    --results results\u002Fmy_train_run\u002F 
\\\n    --output results\u002Fmy_train_run\u002Fscoring.json\n```\n\nThe scorer outputs `sr`, `csr`, `softsr`, `spl`, `ne`, `osr`, `social_violation_ratio`, `eqa_accuracy`, plus per-episode breakdowns. **For the test split, do not run the offline scorer locally** — submit your `--output` directory to the leaderboard at \u003Chttp:\u002F\u002Fomninavbench.cloud-ip.cc\u002F>; the same `offline_test.py` runs server-side against the private GT.\n\n## 📤 Per-Episode Output Schema\n\nEach `\u003Cscenario_id>.json` in `--output` contains only embodiment-independent runtime metadata:\n\n```json\n{\n  \"scenario_id\": \"matterport_11\",\n  \"source_envset\": \"\u002Fpath\u002Fto\u002Fepisode.json\",\n  \"instruction\": \"Follow the man ahead of you ...\",\n  \"robot_type\": \"h1\",\n  \"initial_pose\": {\"position\": [...], \"orientation_deg\": 0.0},\n  \"termination_reason\": \"stop_action | timeout | max_steps\",\n  \"steps\": 123,\n  \"time_s\": 45.6,\n  \"path_length\": 12.3,\n  \"stop_step\": 98,\n  \"trajectory\": [\n    {\"step\": 0, \"time_s\": 0.0, \"position\": [x,y,z], \"orientation\": [w,x,y,z]},\n    ...\n  ]\n}\n```\n\n`success` \u002F `distance_to_goal` and any aggregate score fields are **deliberately not written** so that local development on the train split and remote evaluation on the test split compute metrics through the exact same `bench\u002Fevaluator\u002Foffline_test.py` code path.\n\n## 📂 Repository Layout\n\n```\nOmniNavBench\u002F\n├── runBench.py                   # main benchmark runner (this is what you run)\n├── load_local_paths.sh           # env-var loader (source it before running)\n├── local_paths.env.example       # template — copy to local_paths.env and edit\n│\n├── configs\u002F                      # robot\u002Fphysics configs (aliengoh1_test.yaml, carter_v1_test.yaml)\n├── HowtoTestModel.md             # per-policy launch commands\n│\n├── bench\u002F\n│   ├── evaluator\u002F                # benchmark runner + offline scorer\n│   ├── 
metrics\u002F                  # SR\u002FSPL\u002FCSR\u002FSoftSR etc.\n│   ├── policy\u002F                   # one HTTP-server module per supported policy\n│   ├── datasets\u002Fadapters\u002F        # dataset → envset adapters\n│   └── replay\u002F                   # video rendering pipeline\n│\n├── OmniNav\u002F                       # Isaac Sim integration core\n└── OmniNavExt\u002F                    # Isaac Sim extensions, robot configs, scene loaders\n```\n\n## ❤️ Acknowledgements\n\n### Foundations\n\n- [InternUtopia](https:\u002F\u002Fgithub.com\u002FInternRobotics\u002FInternUtopia) — the Isaac Sim integration scaffolding under `OmniNav\u002F` and `OmniNavExt\u002F` (config schema, simulator runner, sensor \u002F robot abstractions, extension lifecycle) is built on it. We extended it heavily for OmniNavBench, adding the NavMesh baking pipeline, scenario \u002F scene loader, and virtual-human spawning + control stack.\n- [NVIDIA Isaac Lab](https:\u002F\u002Fgithub.com\u002Fisaac-sim\u002FIsaacLab) — simulation platform.\n- [Matterport3D](https:\u002F\u002Fniessner.github.io\u002FMatterport\u002F) — source of the 85 real-world scene scans among the 170 environments.\n- [GRScenes](https:\u002F\u002Fgithub.com\u002FOpenRobotLab\u002FGRUtopia) — source of the 85 synthetic scene assets among the 170 environments.\n\n### Reference policy implementations\n\n- [Uni-NaVid](https:\u002F\u002Fgithub.com\u002Fjzhzhang\u002FNaVid-VLN-CE)\n- [MTU3D](https:\u002F\u002Fgithub.com\u002Fbigai-research\u002FMTU3D)\n- [PoliFormer](https:\u002F\u002Fgithub.com\u002Fallenai\u002FPoliFormer)\n- [OmniNav](https:\u002F\u002Fgithub.com\u002Famap-cvlab\u002FOmniNav)\n\nThanks to all authors for releasing high-quality code.\n\n## 📄 License\n\nOmniNavBench code is released under the MIT License — see [LICENSE](LICENSE). 
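Note that the Matterport3D scene data is governed by the [Matterport3D Terms of Use](https:\u002F\u002Fgithub.com\u002Fniessner\u002FMatterport) and is not redistributed by this repository.\n","OmniNavBench is a unified benchmark platform for general-purpose navigation that aims to move beyond the traditional limits of a single skill, a single robot morphology, and shortest-path reference data. Its core features include composite instructions spanning six sub-tasks (such as point navigation and vision-language navigation), support for three robot types (the H1 humanoid, the Aliengo quadruped, and the Carter wheeled robot), and testing across 170 synthetic and real-world environments. Technically, the platform is developed in Python and uses naturalistic human-teleoperated trajectories as references, with 1,779 expert trajectories in total and an average length of 16.7 m. It is suited to evaluating and comparing robot systems with general-purpose navigation ability in complex environments.",2,"2026-06-11 03:59:02","CREATED_QUERY"]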