[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72022":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":27,"readmeContent":28,"aiSummary":29,"trendingCount":16,"starSnapshotCount":16,"syncStatus":30,"lastSyncTime":31,"discoverSource":32},72022,"Isaac-GR00T","NVIDIA\u002FIsaac-GR00T","NVIDIA","NVIDIA Isaac GR00T N1.7 -  A Foundation Model for Generalist Robots.","https:\u002F\u002Fdeveloper.nvidia.com\u002Fisaac\u002Fgr00t",null,"Python",7305,1247,79,202,0,38,137,305,114,40.29,"Apache License 2.0",false,"main",true,[],"2026-06-12 02:02:57","\u003Cdiv align=\"center\">\n\n  \u003Cimg src=\"media\u002Fheader_compress.png\" width=\"800\" alt=\"NVIDIA Isaac GR00T N1.7 Header\">\n\n  \u003C!-- --- -->\n\n  \u003Cp style=\"font-size: 1.2em;\">\n    \u003Ca href=\"https:\u002F\u002Fdeveloper.nvidia.com\u002Fisaac\u002Fgr00t\">\u003Cstrong>Website\u003C\u002Fstrong>\u003C\u002Fa> |\n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fnvidia\u002Fgr00t-n17\">\u003Cstrong>Model\u003C\u002Fstrong>\u003C\u002Fa> |\n    \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fcollections\u002Fnvidia\u002Fphysical-ai\">\u003Cstrong>Dataset\u003C\u002Fstrong>\u003C\u002Fa> |\n    \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.14734\">\u003Cstrong>Paper\u003C\u002Fstrong>\u003C\u002Fa> |\n    \u003Ca href=\"https:\u002F\u002Fdeveloper.nvidia.com\u002Fisaac\">\u003Cstrong>NVIDIA Isaac\u003C\u002Fstrong>\u003C\u002Fa> |\n    \u003Ca 
href=\"FAQ.md\">\u003Cstrong>FAQ\u003C\u002Fstrong>\u003C\u002Fa>\n  \u003C\u002Fp>\n\u003C\u002Fdiv>\n\n## Table of Contents\n\n- [NVIDIA Isaac GR00T](#nvidia-isaac-gr00t)\n- [What's New in GR00T N1.7](#whats-new-in-gr00t-n17)\n- [Installation](#installation)\n- [Model Checkpoints & Embodiment Tags](#model-checkpoints--embodiment-tags)\n- [Data Format](#data-format)\n- [Inference](#inference)\n- [Fine-tuning](#fine-tuning)\n- [Evaluation](#evaluation)\n- [Contributions](#contributions)\n- [License](#license)\n- [Citation](#citation)\n\n---\n\n## NVIDIA Isaac GR00T\n\n\u003Ctable style=\"width:100%; table-layout:fixed;\">\n  \u003Ctr>\n    \u003Ctd style=\"width:33.33%; text-align:center;\">\n      \u003Cimg src=\"media\u002Funitree_g1.gif\" style=\"max-width:100%; height:auto;\">\n    \u003C\u002Ftd>\n    \u003Ctd style=\"width:33.33%; text-align:center;\">\n      \u003Cimg src=\"media\u002Fagibot_g1.gif\" style=\"max-width:100%; height:auto;\">\n    \u003C\u002Ftd>\n    \u003Ctd style=\"width:33.33%; text-align:center;\">\n      \u003Cimg src=\"media\u002Fyam.gif\" style=\"max-width:100%; height:auto;\">\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n> We just released GR00T N1.7 Early Access, the latest version of GR00T N1 with a new VLM backbone (Cosmos-Reason2-2B \u002F Qwen3-VL) and improved performance.\n\n> **This is an Early Access (EA) release.** You are welcome to download the model, explore the codebase, and begin building on the stack, with the understanding that support and stability guarantees are limited until the GA release.\n>\n> **What's available:**\n> - Pre-trained GR00T N1.7 model weights and reference code\n> - Fine-tuning and inference with custom robot data or demonstrations\n> - Experimentation, prototyping, and research use cases\n>\n> **Available at GA:**\n> - Production deployment with commercial support\n> - Complete benchmarks and a fully validated, stable feature set\n> - Pull request contributions\n>\n> We welcome 
feedback - please feel free to raise issues in this repository.\n\n> To use older versions: [N1.6](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FIsaac-GR00T\u002Freleases\u002Ftag\u002Fn1.6-release) | [N1.5](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FIsaac-GR00T\u002Ftree\u002Fn1.5-release)\n\nNVIDIA Isaac GR00T N1.7 is an open vision-language-action (VLA) model for generalized humanoid robot skills. This cross-embodiment model takes multimodal input, including language and images, to perform manipulation tasks in diverse environments.\n\nGR00T N1.7 is trained on a diverse mixture of robot data including bimanual, semi-humanoid and an expansive humanoid dataset. It is adaptable through post-training for specific embodiments, tasks and environments.\n\nGR00T N1.7 is fully commercially licensable under Apache 2.0. It delivers comparable performance to N1.6, with improved generalization and language-following capabilities driven by the inclusion of 20K hours of EgoScale human video data in pretraining.\n\nThe neural network architecture of GR00T N1.7 is a combination of vision-language foundation model and diffusion transformer head that denoises continuous actions. Here is a schematic diagram of the architecture:\n\n\u003Cdiv align=\"center\">\n\u003Cimg src=\"media\u002Fmodel-architecture.png\" width=\"800\" alt=\"model-architecture\">\n\u003C\u002Fdiv>\n\n### Workflow Overview\n\n1. **Prepare data** — Collect robot demonstrations (video, state, action) and convert them to the [GR00T LeRobot format](#data-format). Demo datasets are included for quick testing.\n2. **Run inference** — Try zero-shot inference with the base model on [pretrain embodiments](#embodiment-tags), or use a [finetuned checkpoint](#checkpoints) for benchmark tasks.\n3. **Fine-tune** — Adapt the model to your robot using [`launch_finetune.py`](#fine-tuning) with your own data and modality config.\n4. 
**Evaluate** — Validate with [open-loop evaluation](#open-loop-evaluation), then test in [simulation benchmarks](#benchmark-examples) or on real hardware via the [Policy API](getting_started\u002Fpolicy.md).\n5. **Deploy** — Connect `Gr00tPolicy` to your robot controller, optionally accelerated with [TensorRT](scripts\u002Fdeployment\u002FREADME.md).\n\n## What's New in GR00T N1.7\n\nGR00T N1.7 builds on N1.6 with a new VLM backbone and code-level improvements.\n\n1. **Relative EEF Action Space** — N1.7 adopts a relative end-effector action space shared across robot and human embodiments. Representing actions as deltas from the current pose (rather than absolute targets) improves generalization and is a key factor in the model's cross-embodiment performance. See [`getting_started\u002Ffinetune_new_embodiment.md`](getting_started\u002Ffinetune_new_embodiment.md) for guidance on configuring relative EEF for your own robot.\n\n2. **Human Video Pretraining** — N1.7 is pretrained on 20K hours of EgoScale human video data alongside diverse robot demonstrations. Because the relative EEF action representation is consistent across both human and robot data, the model can transfer manipulation priors learned from human video directly to robot control.\n\n### Key Changes from N1.6\n\n- **New VLM backbone:** Cosmos-Reason2-2B (Qwen3-VL architecture), replacing the Eagle backbone used in N1.6. Supports flexible resolution and encodes images in their native aspect ratio without padding.\n- Simplified data processing pipeline (`processing_gr00t_n1d7.py`).\n- Added full pipeline export to ONNX and TensorRT with improved frequency.\n\n---\n\n## Installation\n\n### Hardware Requirements\n\n**Inference:** 1 GPU with 16 GB+ VRAM (e.g., RTX 4090, L40, H100, Jetson AGX Thor\u002FOrin, DGX Spark).\n\n**Fine-tuning:** 1 or more GPUs with 40 GB+ VRAM recommended. We recommend H100 or L40 nodes for optimal performance. Other hardware (e.g., A6000) works but may require longer training time. 
See the [Hardware Recommendation Guide](getting_started\u002Fhardware_recommendation.md) for detailed specs.\n\n**CUDA \u002F Python per platform:** dGPU on CUDA 12.8 with Python 3.10; Jetson Orin on CUDA 12.6 with Python 3.10; Jetson Thor and DGX Spark on CUDA 13.0 with Python 3.12. The per-platform install scripts and Dockerfiles live under `scripts\u002Fdeployment\u002F`; see the [Deployment & Inference Guide](scripts\u002Fdeployment\u002FREADME.md) for the full matrix.\n\n### Clone the Repository\n\nGR00T relies on submodules for certain dependencies. Include them when cloning:\n\n**Note:** `git-lfs` is **required** to download parquet data files in `\u002Fdemo_data`. Install it before cloning: `sudo apt install git-lfs && git lfs install`.\n```sh\ngit clone --recurse-submodules https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FIsaac-GR00T\ncd Isaac-GR00T\n```\n\nIf you've already cloned without submodules, initialize them separately:\n\n```sh\ngit submodule update --init --recursive\n```\n\n### Set Up the Environment\n\nGR00T uses [uv](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fuv) for fast, reproducible dependency management. Install uv first:\n\n```sh\ncurl -LsSf https:\u002F\u002Fastral.sh\u002Fuv\u002Finstall.sh | sh\n```\n\n#### dGPU (x86_64) — Default\n\nInstall FFmpeg (required by `torchcodec`, the default video backend):\n```sh\nsudo apt-get update && sudo apt-get install -y ffmpeg\n```\n\nCreate the environment and install GR00T:\n```sh\nuv sync --python 3.10\n```\nGPU dependencies (flash-attn, TensorRT, etc.) are included in the default install.\n\nVerify the installation:\n```sh\nuv run python -c \"import gr00t; print('GR00T installed successfully')\"\n```\n\n> **`flash-attn` message on every `uv run`:** You may see `Installing flash-attn...` each time you run `uv run`. This is a known `uv` behavior with URL-pinned wheel sources — `uv` re-validates the cached wheel against the source URL on each invocation. 
It is **not** rebuilding from source; the wheel is already cached locally and the operation takes 2-3 seconds. This only affects x86_64 platforms.\n> To suppress it, remove the `flash-attn` entries under `[tool.uv.sources]` in your local `pyproject.toml` after the initial install. Be aware that doing so breaks `uv lock` and causes flash-attn to build from source the next time the lock file is regenerated.\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Alternative: pip install (without uv)\u003C\u002Fstrong>\u003C\u002Fsummary>\n\nIf you prefer pip\u002Fconda over uv, create a Python 3.10 virtualenv and install:\n```sh\npython3.10 -m venv .venv && source .venv\u002Fbin\u002Factivate\npip install -e .\n```\nNote: GPU dependencies (flash-attn, TensorRT) may require manual installation with pip. The `uv` workflow handles these automatically.\n\u003C\u002Fdetails>\n\n> **If fine-tuning fails with `CUDA_HOME is unset`:** Run `bash scripts\u002Fdeployment\u002Fdgpu\u002Finstall_deps.sh` once to configure CUDA paths, or manually run `export CUDA_HOME=\u002Fusr\u002Flocal\u002Fcuda`.\n\n> **CUDA 13.x Users (Thor, Spark, and other CUDA 13+ platforms):** PyTorch 2.7 pins Triton to 3.3.1, which does not recognize CUDA major version 13+. This causes a `RuntimeError` in Triton's `ptx_get_version()`. Run the patch script to fix this:\n> ```sh\n> uv run bash scripts\u002Fpatch_triton_cuda13.sh\n> ```\n\n> **GB300 (sm_103) Users:** Triton 3.3.1 (pinned by PyTorch 2.7) does not support the GB300 GPU architecture (sm_103). `torch.compile` will fail on GB300. Use PyTorch eager mode or TensorRT inference instead. Triton 3.5.1+ adds sm_103 support but is not yet compatible with the pinned PyTorch version.\n\n> **aarch64 Video Backend:** On aarch64 platforms (Thor, Orin, Spark), `torchcodec` is the required video backend. 
`install_deps.sh` prefers the prebuilt aarch64 wheel under `scripts\u002Fdeployment\u002Fdgpu\u002Fwheels\u002F` (shared by Thor\u002FSpark against FFmpeg 6; Orin uses a matching build against FFmpeg 4) and falls back to a source build only if the wheel is missing. If you encounter `NotImplementedError` from the video backend, ensure `torchcodec` was installed successfully during setup. Other backends (decord, pyav) are not supported on aarch64.\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>DGX Spark\u003C\u002Fstrong> (tested with DGX Spark GB10)\u003C\u002Fsummary>\n\n```bash\nbash scripts\u002Fdeployment\u002Fspark\u002Finstall_deps.sh\nsource .venv\u002Fbin\u002Factivate\nsource scripts\u002Factivate_spark.sh\n```\n\nSee the [Spark setup guide](scripts\u002Fdeployment\u002FREADME.md#dgx-spark-setup) for Docker and bare metal details.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Jetson AGX Thor\u003C\u002Fstrong> (tested with JetPack 7.1)\u003C\u002Fsummary>\n\n> **flash-attn on older systems (e.g., Ubuntu 20.04 with glibc \u003C 2.35):** The pre-built `flash-attn` wheel may fail with `ImportError: glibc_compat.so: cannot open shared object file`. 
To fix this, build from source:\n> ```sh\n> uv pip install flash-attn==2.7.4.post1 --no-binary flash-attn --no-cache\n> ```\n> This compiles locally (~10-30 minutes) and avoids the glibc compatibility issue.\n\n```bash\nbash scripts\u002Fdeployment\u002Fthor\u002Finstall_deps.sh\nsource .venv\u002Fbin\u002Factivate\nsource scripts\u002Factivate_thor.sh\n```\n\nSee the [Thor setup guide](scripts\u002Fdeployment\u002FREADME.md#jetson-thor-setup) for Docker and bare metal details.\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Jetson Orin\u003C\u002Fstrong> (tested with JetPack 6.2)\u003C\u002Fsummary>\n\n```bash\nbash scripts\u002Fdeployment\u002Forin\u002Finstall_deps.sh\nsource .venv\u002Fbin\u002Factivate\nsource scripts\u002Factivate_orin.sh\n```\n\nSee the [Orin setup guide](scripts\u002Fdeployment\u002FREADME.md#jetson-orin-setup) for Docker and bare metal details.\n\u003C\u002Fdetails>\n\nFor a containerized setup that avoids system-level dependency conflicts, see our [Docker Setup Guide](docker\u002FREADME.md).\n\n---\n\n## Model Checkpoints & Embodiment Tags\n\n### Checkpoints\n\n| Checkpoint | Type | Embodiment Tag | Description |\n|------------|------|---------------|-------------|\n| [`nvidia\u002FGR00T-N1.7-3B`](https:\u002F\u002Fhuggingface.co\u002Fnvidia\u002FGR00T-N1.7-3B) | Base | See [pretrain tags](getting_started\u002Fpolicy.md#--embodiment-tag) | Base model (3B params) — zero-shot inference on pretrain embodiments, or finetune for new tasks |\n| [`nvidia\u002FGR00T-N1.7-LIBERO`](https:\u002F\u002Fhuggingface.co\u002Fnvidia\u002FGR00T-N1.7-LIBERO) | Finetuned | `LIBERO_PANDA` | Finetuned on [LIBERO](https:\u002F\u002Flibero-project.github.io\u002F) benchmark (Franka Panda) |\n| [`nvidia\u002FGR00T-N1.7-DROID`](https:\u002F\u002Fhuggingface.co\u002Fnvidia\u002FGR00T-N1.7-DROID) | Finetuned | `OXE_DROID_RELATIVE_EEF_RELATIVE_JOINT` | Finetuned on [DROID](https:\u002F\u002Fdroid-dataset.github.io\u002F) dataset |\n| 
[`nvidia\u002FGR00T-N1.7-SimplerEnv-Bridge`](https:\u002F\u002Fhuggingface.co\u002Fnvidia\u002FGR00T-N1.7-SimplerEnv-Bridge) | Finetuned | `SIMPLER_ENV_WIDOWX` | Finetuned on SimplerEnv Bridge (WidowX) |\n| [`nvidia\u002FGR00T-N1.7-SimplerEnv-Fractal`](https:\u002F\u002Fhuggingface.co\u002Fnvidia\u002FGR00T-N1.7-SimplerEnv-Fractal) | Finetuned | `SIMPLER_ENV_GOOGLE` | Finetuned on SimplerEnv Fractal (Google Robot) |\n\n> Older versions: [N1.6 checkpoints](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FIsaac-GR00T\u002Ftree\u002Fn1.6-release) | [N1.5 checkpoints](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FIsaac-GR00T\u002Ftree\u002Fn1.5-release)\n\n### Embodiment Tags\n\nEvery inference or finetuning command requires an `--embodiment-tag`. The tag determines which modality config (state\u002Faction keys, normalization) the model uses. Tags are **case-insensitive**.\n\nFor the full list of pretrain and posttrain tags, see the [Policy API Guide — Embodiment Tags](getting_started\u002Fpolicy.md#--embodiment-tag).\n\n---\n\n## Data Format\n\nGR00T uses a flavor of the [LeRobot v2 dataset format](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Flerobot) with an additional `meta\u002Fmodality.json` file that describes state\u002Faction\u002Fvideo structure. A dataset looks like:\n\n```\nmy_dataset\u002F\n  meta\u002F\n    info.json            # dataset metadata\n    episodes.jsonl       # episode index and lengths\n    tasks.jsonl          # language task descriptions\n    modality.json        # state\u002Faction\u002Fvideo key mapping (GR00T-specific)\n  data\u002Fchunk-000\u002F        # parquet files (state, action per timestep)\n  videos\u002Fchunk-000\u002F      # mp4 video files per episode\n```\n\nThe `modality.json` maps how the concatenated state\u002Faction arrays split into named fields (e.g., `x`, `y`, `z`, `gripper`) and which video keys are available. 
This is what the embodiment tag uses to interpret the data.\n\n**Included demo datasets** (ready to use, no download needed):\n\n| Dataset | Robot | Embodiment Tag | Use Case |\n|---------|-------|---------------|----------|\n| `demo_data\u002Fdroid_sample` | DROID (3 episodes) | `OXE_DROID_RELATIVE_EEF_RELATIVE_JOINT` | Zero-shot or finetuned inference (DROID) |\n| `demo_data\u002Flibero_demo` | LIBERO Panda (5 episodes) | `LIBERO_PANDA` | Inference with finetuned checkpoint |\n| `demo_data\u002Fsimplerenv_bridge_sample` | WidowX (SimplerEnv Bridge) | `SIMPLER_ENV_WIDOWX` | Inference with finetuned SimplerEnv Bridge checkpoint |\n| `demo_data\u002Fsimplerenv_fractal_sample` | Google Robot (SimplerEnv Fractal) | `SIMPLER_ENV_GOOGLE` | Inference with finetuned SimplerEnv Fractal checkpoint |\n| `demo_data\u002Fcube_to_bowl_5` | SO100 arm (5 episodes) | `NEW_EMBODIMENT` | Fine-tuning custom embodiment example |\n| `demo_data\u002Fcube_to_bowl_5_with_mask` | SO100 arm + per-frame masks | `NEW_EMBODIMENT` | [Mask-guided background suppression](examples\u002Fmask-guided-background-suppression\u002FREADME.md) example |\n\n> To generate more DROID episodes: `python scripts\u002Fdownload_droid_sample.py --num-episodes 10`\n\n**Using your own data:** Convert your demonstrations to the format above. If coming from LeRobot v3, use the conversion script: `python scripts\u002Flerobot_conversion\u002Fconvert_v3_to_v2.py`. 
See the full [Data Preparation Guide](getting_started\u002Fdata_preparation.md) for schema details and examples.\n\n---\n\n## Inference\n\n### Zero-Shot Inference (Base Model)\n\nThe included `demo_data\u002Fdroid_sample` dataset works with the base model out of the box — no finetuning or checkpoint download needed:\n\n```bash\nuv run python scripts\u002Fdeployment\u002Fstandalone_inference_script.py \\\n    --model-path nvidia\u002FGR00T-N1.7-3B \\\n    --dataset-path demo_data\u002Fdroid_sample \\\n    --embodiment-tag OXE_DROID_RELATIVE_EEF_RELATIVE_JOINT \\\n    --traj-ids 1 2 \\\n    --inference-mode pytorch \\\n    --action-horizon 8\n```\n\nThis runs open-loop inference on 2 DROID episodes, comparing predicted actions against ground truth. The base model downloads automatically from HuggingFace on first run (~6 GB).\n\n### Finetuned Inference\n\nFor posttrain embodiments, use a finetuned checkpoint. Most finetuned checkpoints (e.g., DROID, SimplerEnv) have a flat file structure and can be passed directly as a HuggingFace model ID — no manual download needed:\n\n```bash\nuv run python scripts\u002Fdeployment\u002Fstandalone_inference_script.py \\\n    --model-path nvidia\u002FGR00T-N1.7-DROID \\\n    --dataset-path demo_data\u002Fdroid_sample \\\n    --embodiment-tag OXE_DROID_RELATIVE_EEF_RELATIVE_JOINT \\\n    --traj-ids 1 2 \\\n    --inference-mode pytorch \\\n    --action-horizon 8\n```\n\nSome checkpoints (e.g., LIBERO) use a nested folder structure with model files under a subfolder. 
HuggingFace does not support nested repo paths in `--model-path`, so you must download first:\n\n```bash\nuv run hf download nvidia\u002FGR00T-N1.7-LIBERO \\\n    --include \"libero_10\u002Fconfig.json\" \"libero_10\u002Fembodiment_id.json\" \\\n    \"libero_10\u002Fmodel-*.safetensors\" \"libero_10\u002Fmodel.safetensors.index.json\" \\\n    \"libero_10\u002Fprocessor_config.json\" \"libero_10\u002Fstatistics.json\" \\\n    --local-dir checkpoints\u002FGR00T-N1.7-LIBERO\n```\n\n```bash\nuv run python scripts\u002Fdeployment\u002Fstandalone_inference_script.py \\\n    --model-path checkpoints\u002FGR00T-N1.7-LIBERO\u002Flibero_10 \\\n    --dataset-path demo_data\u002Flibero_demo \\\n    --embodiment-tag LIBERO_PANDA \\\n    --traj-ids 0 1 2 \\\n    --inference-mode pytorch \\\n    --action-horizon 8\n```\n\n### Server-Client Inference (for Deployment)\n\nFor real-world deployment or simulation evaluation, use the server-client architecture. The policy runs on a GPU server; a lightweight client sends observations and receives actions over ZMQ.\n\n**Terminal 1 — Start the policy server:**\n```bash\nuv run python gr00t\u002Feval\u002Frun_gr00t_server.py \\\n    --model-path nvidia\u002FGR00T-N1.7-3B \\\n    --embodiment-tag OXE_DROID_RELATIVE_EEF_RELATIVE_JOINT \\\n    --device cuda:0\n```\n\n**Terminal 2 — Run open-loop evaluation as a client:**\n```bash\nuv run python gr00t\u002Feval\u002Fopen_loop_eval.py \\\n    --dataset-path demo_data\u002Fdroid_sample \\\n    --embodiment-tag OXE_DROID_RELATIVE_EEF_RELATIVE_JOINT \\\n    --host 127.0.0.1 \\\n    --port 5555 \\\n    --traj-ids 1 2 \\\n    --action-horizon 8\n```\n\n> **Tip:** If you get `ZMQError: Address already in use`, the default port 5555 is occupied. Use `--port \u003Cother_port>`.\n\nFor connecting to a real robot (e.g., DROID hardware), see [examples\u002FDROID\u002FREADME.md](examples\u002FDROID\u002FREADME.md). 
For faster inference with TensorRT, see the [Deployment & Inference Guide](scripts\u002Fdeployment\u002FREADME.md).\n\nSee the complete [Policy API Guide](getting_started\u002Fpolicy.md) for documentation on observation\u002Faction formats, batched inference, and troubleshooting.\n\n---\n\n## Fine-tuning\n\n### Reproducing Benchmark Results\n\nEach benchmark has a self-contained README with dataset download, finetune, and evaluation commands:\n\n| Benchmark | Embodiment | Guide |\n|-----------|-----------|-------|\n| LIBERO | `LIBERO_PANDA` | [examples\u002FLIBERO\u002FREADME.md](examples\u002FLIBERO\u002FREADME.md) |\n| SimplerEnv (Fractal) | `SIMPLER_ENV_GOOGLE` | [examples\u002FSimplerEnv\u002FREADME.md](examples\u002FSimplerEnv\u002FREADME.md) |\n| SimplerEnv (Bridge) | `SIMPLER_ENV_WIDOWX` | [examples\u002FSimplerEnv\u002FREADME.md](examples\u002FSimplerEnv\u002FREADME.md) |\n| SO100 | `NEW_EMBODIMENT` | [examples\u002FSO100\u002FREADME.md](examples\u002FSO100\u002FREADME.md) |\n\n### Humanoid Whole-Body Control (SONIC)\n\nGR00T N1.7 supports whole-body humanoid control via the `UNITREE_G1_SONIC` embodiment tag and the [GEAR-SONIC](https:\u002F\u002Fgithub.com\u002FNVlabs\u002FGR00T-WholeBodyControl) controller. In this workflow, the VLA predicts compact latent action tokens that a learned whole-body controller decodes into full-body joint commands — including legs, arms, and hands. A single policy produces language-conditioned, coordinated manipulation and locomotion end-to-end. 
SONIC supports whole-body coordination with precise hand and foot placements.\n\nThe complete collect → finetune → deploy workflow is documented in the [GR00T-WholeBodyControl repository](https:\u002F\u002Fgithub.com\u002FNVlabs\u002FGR00T-WholeBodyControl):\n\n- [Data collection](https:\u002F\u002Fnvlabs.github.io\u002FGR00T-WholeBodyControl\u002Ftutorials\u002Fdata_collection.html) — VR teleoperation with SONIC for demonstration recording\n- [VLA Workflow](https:\u002F\u002Fnvlabs.github.io\u002FGR00T-WholeBodyControl\u002Ftutorials\u002Fvla_workflow.html) — finetuning Isaac-GR00T N1.7 on collected data and deploying the policy\n- [VLA Inference](https:\u002F\u002Fnvlabs.github.io\u002FGR00T-WholeBodyControl\u002Ftutorials\u002Fvla_inference.html) — running the PolicyServer + SONIC decoder for real-time control\n\n> **Note:** The `UNITREE_G1` embodiment tag is compatible with the [decoupled WBC](https:\u002F\u002Fgithub.com\u002FNVlabs\u002FGR00T-WholeBodyControl\u002Ftree\u002Fmain\u002Fdecoupled_wbc) controller, but the end-to-end collect-finetune-deploy workflow is only supported for GEAR-SONIC (`UNITREE_G1_SONIC`).\n\n### Fine-tune on Your Own Robot (\"NEW_EMBODIMENT\")\n\nTo finetune GR00T on your own robot data and configuration, follow the detailed tutorial at [`getting_started\u002Ffinetune_new_embodiment.md`](getting_started\u002Ffinetune_new_embodiment.md).\n\nEnsure your input data follows the [GR00T LeRobot format](#data-format), and specify your modality configuration via `--modality-config-path`.\n\n**Single GPU:**\n```bash\nCUDA_VISIBLE_DEVICES=0 uv run python \\\n    gr00t\u002Fexperiment\u002Flaunch_finetune.py \\\n    --base-model-path nvidia\u002FGR00T-N1.7-3B \\\n    --dataset-path demo_data\u002Fcube_to_bowl_5 \\\n    --embodiment-tag NEW_EMBODIMENT \\\n    --modality-config-path examples\u002FSO100\u002Fso100_config.py \\\n    --num-gpus 1 \\\n    --output-dir \u002Ftmp\u002Ftest_finetune \\\n    --max-steps 2000 \\\n    --global-batch-size 
32 \\\n    --dataloader-num-workers 4\n```\n\n**Multi-GPU (e.g., 8xH100):**\n```bash\nuv run torchrun --nproc_per_node=8 --master_port=29500 \\\n    gr00t\u002Fexperiment\u002Flaunch_finetune.py \\\n    --base-model-path nvidia\u002FGR00T-N1.7-3B \\\n    --dataset-path demo_data\u002Fcube_to_bowl_5 \\\n    --embodiment-tag NEW_EMBODIMENT \\\n    --modality-config-path examples\u002FSO100\u002Fso100_config.py \\\n    --num-gpus 8 \\\n    --output-dir \u002Ftmp\u002Ftest_finetune_8gpu \\\n    --max-steps 2000 \\\n    --global-batch-size 32 \\\n    --dataloader-num-workers 4\n```\n\nReplace `demo_data\u002Fcube_to_bowl_5` and `examples\u002FSO100\u002Fso100_config.py` with your own dataset and modality config. See [`examples\u002FSO100`](examples\u002FSO100\u002FREADME.md) for a complete walkthrough.\n\n> **Note:** Use `uv run torchrun` (not bare `torchrun`) to ensure the correct virtual environment is used. Add `--use-wandb` to enable Weights & Biases logging. For more extensive configuration, use `gr00t\u002Fexperiment\u002Flaunch_train.py`.\n\n### Training Tips\n\n- Maximize batch size for your hardware and train for a few thousand steps.\n- Users may observe 5-6% variance between runs due to non-deterministic image augmentations. Keep this in mind when comparing to reported benchmarks.\n- **`--state_dropout_prob`** (model config default: 0.8; finetune CLI default: 0.2; see `gr00t\u002Fconfigs\u002Ffinetune_config.py`): Randomly drops state inputs during training to improve generalization and reduce state-dependency. The shipped benchmark scripts override the CLI default per suite: LIBERO 10-Long uses 0.2 (the CLI default), SimplerEnv Bridge uses 0.8, SimplerEnv Fractal uses 0.5. 
If your task relies heavily on proprioceptive state, lower this value.\n\n---\n\n## Evaluation\n\n### Open-Loop Evaluation\n\nCompare predicted actions against ground truth from your dataset:\n\n```bash\nuv run python gr00t\u002Feval\u002Fopen_loop_eval.py \\\n    --dataset-path \u003CDATASET_PATH> \\\n    --embodiment-tag NEW_EMBODIMENT \\\n    --model-path \u003CCHECKPOINT_PATH> \\\n    --traj-ids 0 \\\n    --action-horizon 16\n```\n\nThis generates a visualization at `\u002Ftmp\u002Fopen_loop_eval\u002Ftraj_{traj_id}.jpeg` with ground truth vs. predicted actions and MSE metrics. Use `--save-plot-path \u003Cdir>` to save plots to a custom location.\n\n### Closed-Loop Evaluation\n\nTest your model in simulation or on real hardware using the server-client architecture:\n\n```bash\n# Start the policy server\nuv run python gr00t\u002Feval\u002Frun_gr00t_server.py \\\n    --embodiment-tag NEW_EMBODIMENT \\\n    --model-path \u003CCHECKPOINT_PATH> \\\n    --device cuda:0 \\\n    --host 0.0.0.0 --port 5555\n```\n\n```python\nfrom gr00t.policy.server_client import PolicyClient\n\npolicy = PolicyClient(host=\"localhost\", port=5555)\nenv = YourEnvironment()\nobs, info = env.reset()\naction, info = policy.get_action(obs)\nobs, reward, done, truncated, info = env.step(action)\n```\n\n**Debugging with ReplayPolicy:** To verify your environment setup without a trained model, start the server with `--dataset-path \u003CDATASET_PATH>` (omit `--model-path`) to replay recorded actions from the dataset.\n\nSee the complete [Policy API Guide](getting_started\u002Fpolicy.md) for observation\u002Faction formats, batched inference, and troubleshooting.\n\n### Benchmark Examples\n\nWe support evaluation on public benchmarks using a server-client architecture. 
The policy server reuses the project root's uv environment; simulation clients have individual setup scripts.\n\nYou can use [the verification script](scripts\u002Feval\u002Fcheck_sim_eval_ready.py) to verify that all dependencies are properly configured.\n\n**Zero-shot** (evaluate with the base model, no finetuning):\n- [DROID](examples\u002FDROID\u002FREADME.md) — real-world DROID robot (also available as the finetuned `nvidia\u002FGR00T-N1.7-DROID` checkpoint; `examples\u002FDROID\u002FREADME.md` covers both paths)\n\n**Finetuned** (evaluate with finetuned checkpoints):\n- [DROID](examples\u002FDROID\u002FREADME.md) — real-world DROID robot via `nvidia\u002FGR00T-N1.7-DROID`\n- [LIBERO](examples\u002FLIBERO\u002FREADME.md) — LIBERO benchmark (Franka Panda)\n- [SimplerEnv](examples\u002FSimplerEnv\u002FREADME.md) — Google Robot (Fractal) and WidowX (Bridge)\n- [SO100](examples\u002FSO100\u002FREADME.md) — SO100 custom embodiment workflow\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Adding a New Sim Benchmark\u003C\u002Fstrong>\u003C\u002Fsummary>\n\nEach sim benchmark registers its environments under a gym env_name with the format `{prefix}\u002F{task_name}` (e.g., `libero_sim\u002FLIVING_ROOM_SCENE2_put_soup_in_basket`). The evaluation framework uses the prefix to look up the corresponding `EmbodimentTag` via a mapping in [`gr00t\u002Feval\u002Fsim\u002Fenv_utils.py`](gr00t\u002Feval\u002Fsim\u002Fenv_utils.py).\n\n> **Important:** The env_name prefix and the `EmbodimentTag` value are often different. For example, `libero_sim` maps to `EmbodimentTag.LIBERO_PANDA` (`\"libero_sim\"`). Do not assume they match.\n\nTo add a new benchmark:\n\n1. Add an entry to `ENV_PREFIX_TO_EMBODIMENT_TAG` in `gr00t\u002Feval\u002Fsim\u002Fenv_utils.py`:\n   ```python\n   ENV_PREFIX_TO_EMBODIMENT_TAG = {\n       ...\n       \"my_new_benchmark\": EmbodimentTag.MY_ROBOT,\n   }\n   ```\n2. 
If the benchmark has multiple env_name prefixes (e.g., `my_benchmark_v1`, `my_benchmark_v2`), all related prefixes **must** map to the same `EmbodimentTag`.\n3. Add corresponding test cases in `tests\u002Fgr00t\u002Feval\u002Fsim\u002Ftest_env_utils.py` and update the `test_all_known_prefixes_present` test.\n\u003C\u002Fdetails>\n\n\n\n---\n\n## Contributions\n\nDuring Early Access, we are not accepting pull requests while the codebase stabilizes. If you encounter issues or have suggestions, please open an [Issue](https:\u002F\u002Fgithub.com\u002FNVIDIA\u002FIsaac-GR00T\u002Fissues) in this repository.\n\n## Support\n\nSupport during Early Access is best-effort. We will continue iterating toward a more stable General Availability (GA) release.\n\n\n## License\n\n- **Code:** Apache 2.0 — see [LICENSE](LICENSE)\n- **Model weights:** [NVIDIA Open Model License](https:\u002F\u002Fwww.nvidia.com\u002Fen-us\u002Fagreements\u002Fenterprise-software\u002Fnvidia-open-model-license\u002F)\n\n```\n# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. 
All rights reserved.\n# SPDX-License-Identifier: Apache-2.0\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http:\u002F\u002Fwww.apache.org\u002Flicenses\u002FLICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n```\n\n\n## Citation\n\n[Paper Site](https:\u002F\u002Fresearch.nvidia.com\u002Flabs\u002Flpr\u002Fpublication\u002Fgr00tn1_2025\u002F)\n```bibtex\n@inproceedings{gr00tn1_2025,\n  archivePrefix = {arxiv},\n  eprint     = {2503.14734},\n  title      = {{GR00T} {N1}: An Open Foundation Model for Generalist Humanoid Robots},\n  author     = {NVIDIA and Johan Bjorck and Fernando Castañeda and Nikita Cherniadev and Xingye Da and Runyu Ding and Linxi \"Jim\" Fan and Yu Fang and Dieter Fox and Fengyuan Hu and Spencer Huang and Joel Jang and Zhenyu Jiang and Jan Kautz and Kaushil Kundalia and Lawrence Lao and Zhiqi Li and Zongyu Lin and Kevin Lin and Guilin Liu and Edith Llontop and Loic Magne and Ajay Mandlekar and Avnish Narayan and Soroush Nasiriany and Scott Reed and You Liang Tan and Guanzhi Wang and Zu Wang and Jing Wang and Qi Wang and Jiannan Xiang and Yuqi Xie and Yinzhen Xu and Zhenjia Xu and Seonghyeon Ye and Zhiding Yu and Ao Zhang and Hao Zhang and Yizhou Zhao and Ruijie Zheng and Yuke Zhu},\n  month      = {March},\n  year       = {2025},\n  booktitle  = {ArXiv Preprint},\n}\n```\n\n","NVIDIA Isaac GR00T N1.7 is an open vision-language-action (VLA) model for generalist robot skills. Its core capability is taking multimodal input, such as language and images, and performing manipulation tasks in diverse environments. GR00T N1.7 is trained on a mixture of bimanual, semi-humanoid, and extensive humanoid datasets, giving it strong adaptability and generalization. The model also supports fine-tuning and inference on custom robot data or demonstrations, making it suitable for experimentation, prototyping, and research. The current version is an Early Access release that provides pre-trained model weights and reference code; formal commercial support will follow in a later release.",2,"2026-06-11 
03:39:59","high_star"]