[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-1792":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":22,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":34,"readmeContent":35,"aiSummary":36,"trendingCount":15,"starSnapshotCount":15,"syncStatus":37,"lastSyncTime":38,"discoverSource":39},1792,"UMI-3D","hku-mars\u002FUMI-3D","hku-mars","UMI-3D SLAM and Data Processing Pipeline: https:\u002F\u002Fumi-3d.github.io\u002F","",null,"Python",236,23,11,0,1,12,40,3,4.14,"GNU General Public License v2.0",false,"master",[25,26,27,28,29,30,31,32,33],"embodied-ai","embodied-intelligence","lidar-slam","manipulation","manipulation-data","sensor-data","sensor-fusion","slam","umi","2026-06-12 02:00:32","# UMI-3D SLAM and Data Processing\n\n\u003Cdiv align=\"center\">\n\n\u003Ch3>\n  🌐 \u003Ca href=\"https:\u002F\u002Fumi-3d.github.io\u002F\">UMI-3D Project Homepage\u003C\u002Fa>\n\u003C\u002Fh3>\n\n\u003Ctable>\n  \u003Ctr>\n    \u003Ctd align=\"center\" width=\"33%\">\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FPhysical-Intelligence-Laboratory\u002FUMI-3D-Hardware\">\n        \u003Cb>🔧 UMI-3D Hardware\u003C\u002Fb>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"33%\">\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fhku-mars\u002FUMI-3D\">\n        \u003Cb>🛰️ UMI-3D SLAM Pipeline\u003C\u002Fb>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\" width=\"33%\">\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FPhysical-Intelligence-Laboratory\u002FUMI-3D-Policy\">\n 
       \u003Cb>🤖 UMI-3D Policy\u003C\u002Fb>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\n  \u003Ctr>\n    \u003Ctd align=\"center\">\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FPhysical-Intelligence-Laboratory\u002FUMI-3D-Hardware\">\n        \u003Cimg src=\"docs\u002Fassets\u002Fnav_hardware.jpg\" width=\"100%\" alt=\"UMI-3D Hardware\"\u002F>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fhku-mars\u002FUMI-3D\">\n        \u003Cimg src=\"docs\u002Fassets\u002Fnav_processing.jpg\" width=\"100%\" alt=\"UMI-3D SLAM Pipeline\"\u002F>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FPhysical-Intelligence-Laboratory\u002FUMI-3D-Policy\">\n        \u003Cimg src=\"docs\u002Fassets\u002Fnav_policy.jpg\" width=\"100%\" alt=\"UMI-3D Policy\"\u002F>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\n  \u003Ctr>\n    \u003Ctd align=\"center\">\n      Hardware design, BOM, CAD, 3D-print parts\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      SLAM, synchronization, calibration, and data processing\n    \u003C\u002Ftd>\n    \u003Ctd align=\"center\">\n      Policy training, deployment, inference\u003Cbr>\u003Cbr>\n      \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FPhysical-Intelligence-Laboratory\u002FUMI-3D-Dataset\">\n        \u003Cb>📦 Dataset & Models\u003C\u002Fb>\n      \u003C\u002Fa>\n    \u003C\u002Ftd>\n  \u003C\u002Ftr>\n\u003C\u002Ftable>\n\n\u003C\u002Fdiv>\n\n\n## Overview\nUMI-3D provides a complete end-to-end pipeline, transforming **raw rosbag recordings** into **training-ready datasets** for embodied manipulation learning:\n```\nCollected rosbag Files\n  ↓\nCalibration  \n  ↓\nSLAM\n  ↓\nAligned Demos\n  ↓\nDataset Pipeline\n  ↓\nZarr Dataset (for policy training, e.g. Diffusion Policy)\n```\n\n## 0. 
Complete Data Collection\n\nTo build the UMI-3D data collection system, please follow the hardware assembly and sensor setup instructions in:\n\n👉 https:\u002F\u002Fgithub.com\u002FPhysical-Intelligence-Laboratory\u002FUMI-3D-Hardware\n\nUMI-3D collects two types of data:\n\n1. **Demonstration data**  \n   Human-guided manipulation trajectories captured during task execution.\n\n2. **Gripper calibration data**  \n   Slowly open and close the gripper for approximately 5 cycles to estimate gripper motion range.\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fcup_demo_gif.gif\" width=\"64%\" \u002F>\n  \u003Cimg src=\"docs\u002Fassets\u002Fgripper.gif\" width=\"34%\" \u002F>\n\u003C\u002Fdiv>\n\nAll data are recorded as **rosbag files**, including:\n- LiDAR point clouds  \n- IMU measurements  \n- Camera images  \n\nThese recordings serve as the raw input for the full UMI-3D data processing pipeline.\n\n## 1. Sensors Calibration\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"sensors_calibration\u002Ffisheye_intrinsics\u002Fpics\u002Ffisheye_calib_guide.jpg\" width=\"76%\"\u002F>\n  \u003Cimg src=\"sensors_calibration\u002Ffisheye_intrinsics\u002Fpics\u002Fcheckboard_6x9.jpeg\" width=\"15%\"\u002F>\n\u003C\u002Fdiv>\n\n### 1.1 Fisheye Camera Intrinsic Calibration\n\n**Step 1 — Collect calibration images**\n\n- Use a checkerboard (**6 × 9 inner corners**, square size = **0.1 m**, configurable in script)  \n- Capture ≥ 100 images with different positions (center \u002F edges \u002F corners), orientations, and distances  \n- Save all images to `fisheye_intrinsics\u002Fimages\u002F`\n\n**Step 2 — Run calibration**\n\n```bash\ncd fisheye_intrinsics\n\npython3 calibrate_fisheye_intrinsics.py \\\n    --image_glob \"images\u002F*.png\" \\\n    --checkerboard_cols 6 \\\n    --checkerboard_rows 9 \\\n    --square_size 0.10 \\\n    --output_dir calib_output\n```\n\nCalibration results will be saved to: `fisheye_intrinsics\u002Fcalib_output\u002F`\n\n\n### 
1.2 LiDAR–Camera Extrinsic Calibration\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"sensors_calibration\u002Flivox2cam_calibration\u002Fsrc\u002Fpics\u002Flivox2cam_guide.jpg\" width=\"90%\"\u002F>\n\u003C\u002Fdiv>\n\nThis step estimates the rigid transformation between the Livox MID-360 LiDAR and the fisheye camera.\n\n---\n\n**Step 1 — Prepare calibration data**\n\n- Record a **static rosbag** containing:\n  - Livox point cloud (`\u002Flivox\u002Flidar`)\n  - Camera images  \n- Place the rosbag into: `livox2cam_calibration\u002Fsrc\u002Fcalib_data\u002F`\n\n- Fill in the previously calibrated camera intrinsics into: `livox2cam_calibration\u002Fsrc\u002Fconfig\u002Fqr_params.yaml`\n\n- Calibration board files are provided here:  [Calibration Board Files](sensors_calibration\u002Flivox2cam_calibration\u002Fsrc\u002Fcalib_borad_files)\n\n\n\n---\n\n**Step 2 — Build and Run**\n\n**Prerequisites:**\n- Ubuntu 20.04, ROS Noetic\n- PCL ≥ 1.8  \n- OpenCV ≥ 4.0  \n```bash\nconda deactivate\n\n# Build\ncd livox2cam_calibration\ncatkin_make\n\n# Run Calibration\nsource devel\u002Fsetup.bash\nroslaunch livox2cam_calibration calib.launch\n```\n- Output: Extrinsic transformation between LiDAR and camera (rotation + translation)\n\n\n## 2. 
UMI-3D SLAM\n\nThis module performs LiDAR–inertial SLAM to estimate the camera trajectory and reconstruct the environment.\n\n---\n\n**Step 1 — Configure extrinsics** \n\nFill the calibrated LiDAR–camera extrinsic parameters into: `umi_3d_slam_ws\u002Fsrc\u002Fumi_3d_slam\u002Fconfig\u002Fmid360_180.yaml`\n\n\n---\n\n**Step 2 — Install dependencies**\n\n- **Environment:** Ubuntu 20.04, ROS Noetic   \n\n- **Libraries:** PCL ≥ 1.8, Eigen ≥ 3.3.4, OpenCV ≥ 4.2  \n\n- **Install Sophus:**\n\n  ```bash\n  git clone https:\u002F\u002Fgithub.com\u002Fbitcat-tech\u002FSophus\n  cd Sophus\n  mkdir build && cd build\n  cmake ..\n  make\n  sudo make install\n  ```\n---\n**Step 3 — Build the SLAM system** \n```\nconda deactivate\ncd umi_3d_slam_ws\ncatkin_make\n```\n\n---\n**Step 4 — Run SLAM Demo**\n```\nsource devel\u002Fsetup.bash\n\n# Start SLAM\nroslaunch umi_3d_slam mapping_mid360_180.launch rviz:=true\n\n# Play rosbag\nrosbag play YOUR_DEMO.bag\n```\n- Output: Estimated camera trajectory saved in `umi_3d_slam_ws\u002Fsrc\u002Fumi_3d_slam\u002Foutput\u002Fcamera_trajectory.csv`\n\n> **Note**: Ensure proper time synchronization between LiDAR and camera.\n\n## 3. 
Data Processing for Training\n### 3.1 Rosbag Preprocessing\n\nThis stage converts raw rosbag recordings into **time-aligned multi-modal data**, and prepares them for SLAM and dataset generation.\n\nThe pipeline consists of two main steps:\n\n```\nRaw rosbags\n   ↓\nauto_bag_to_mp4_aligned.py   (alignment + video export)\n   ↓\naligned_bags\u002F\n   ├── demos\u002F\n   ├── 000000.bag ...\n   ↓\nauto_umi_3d_slam.sh         (trajectory estimation)\n   ↓\nFinal demos with trajectory\n```\n\n---\n\n#### Step 1 — Prepare Raw Rosbags\n\nPlace all raw rosbags into a single directory:\n\n```\n\u002Fpath\u002Fto\u002Fyour\u002Frosbags\u002F\n    ├── 2026-03-30-13-33-14.bag\n    ├── 2026-03-30-13-33-37.bag\n    ├── ...\n    ├── 20xx-xx-xx-xx-xx-xx.bag\n    ├── gripper_calibration*.bag\n```\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fraw_rosbag_fils.png\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n---\n\n#### Step 2 — Multi-modal Alignment and Video Export\n\nRun the preprocessing script:\n\n```bash\nconda deactivate\n\npython3 scripts_slam_pipeline\u002Fauto_bag_to_mp4_aligned.py \\\n  --dir \u002Fpath\u002Fto\u002Fyour\u002Frosbags \\\n  --align \\\n  --organize_each \\\n  --start_idx 0 \\\n  --id_width 6 \\\n  --use_header_stamp \\\n  --gate 0.02 \\\n  --no_symlink\n```\n\n---\n\n##### 🔍 What this script does\n\n- Synchronizes:\n  - LiDAR (Livox)\n  - Camera images\n  - IMU\n- Uses **timestamp gating (`--gate 0.02`)** for alignment\n- Re-indexes all demos into consistent IDs\n- Converts image streams into MP4 videos\n- Outputs per-frame timestamps\n\n---\n\n##### 📂 Output structure\n\n```\naligned_bags\u002F\n├── demos\u002F\n│   ├── demo_000000_000000\u002F\n│   │   ├── raw_video.mp4\n│   │   ├── raw_video_timestamps.csv\n│   │   └── source.txt\n│   ├── demo_000001_000001\u002F\n│   │   ├── ...\n│\n├── 000000.bag\n├── 000001.bag\n├── ...\n```\n\nEach demo folder corresponds to one aligned sequence.\n\n---\n\n#### Step 3 — Run SLAM for 
Trajectory Estimation\n\nRun batch SLAM processing:\n\n```bash\nconda deactivate\n\nbash scripts_slam_pipeline\u002Fauto_umi_3d_slam.sh \\\n  --bag_dir \u002Fpath\u002Fto\u002Fyour\u002Frosbags\u002Faligned_bags \\\n  --start 0 \\\n  --end YOUR_BAG_NUMBER\n```\n\n---\n\n##### 🔍 What this script does\n\nBased on the implementation, the script:\n\n- Iterates over each indexed bag (`000000.bag`, `000001.bag`, ...)\n- For each bag:\n  1. Launches the **UMI-3D SLAM system**\n  2. Plays the rosbag\n  3. Waits for the trajectory output\n  4. Moves the result to the corresponding demo folder:\n     ```\n     demos\u002Fdemo_xxxxxx_xxxxxx\u002Fcamera_trajectory.csv\n     ```\n  5. Optionally deletes the processed bag to save disk space\n\n---\n\n##### 📂 Final Output\n\n```\naligned_bags\u002F\n├── demos\u002F\n│   ├── demo_000000_000000\u002F\n│   │   ├── raw_video.mp4\n│   │   ├── raw_video_timestamps.csv\n│   │   ├── camera_trajectory.csv   ← SLAM output\n│   │   └── source.txt\n│\n├── ...\n```\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Frosbag_preprocessing.png\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n\u003Cdiv align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fdemo_results.png\" width=\"100%\"\u002F>\n\u003C\u002Fdiv>\n\n\n### 3.2 UMI-format Training Data Packaging\n\nThis stage converts the preprocessed aligned demos into a **UMI-format replay buffer** for policy training.\n\nThe full pipeline is wrapped by:\n\n```bash\npython run_dataset_pipeline.py \\\n  --session_dir \u002Fpath\u002Fto\u002Faligned_bags \\\n  --output \u002Fpath\u002Fto\u002Faligned_bags\u002FDATASET_NAME.zarr.zip\n```\n\nThe pipeline runs four stages in order:\n\n```text\naligned_bags\u002F\n   └── demos\u002F\n        ├── demo_xxxxxx_xxxxxx\u002F\n        │    ├── raw_video.mp4\n        │    ├── raw_video_timestamps.csv\n        │    ├── camera_trajectory.csv\n        │    └── source.txt\n        ├── gripper_calibration*\u002F\n        │    ├── 
raw_video.mp4\n        │    ├── raw_video_timestamps.csv\n   ↓\n00_detect_aruco.py\n   ↓\n01_run_calibrations.py\n   ↓\n02_generate_dataset_plan.py\n   ↓\n03_generate_replay_buffer.py\n   ↓\nDATASET_NAME.zarr.zip\n```\n\n---\n\n#### Step 1 — Install environment\n\n**System dependencies**\n```bash\nsudo apt install -y libosmesa6-dev libgl1-mesa-glx libglfw3 patchelf\n```\n\n**Conda environment**\n\nWe recommend using [Miniforge](https:\u002F\u002Fgithub.com\u002Fconda-forge\u002Fminiforge) instead of the standard Anaconda distribution.\n\n```bash\nmamba env create -f conda_environment.yaml\nconda activate umi\n```\n\n---\n\n#### Step 2 — Prepare inputs\n\nBefore running the dataset pipeline, make sure your session directory already contains:\n\n- `demos\u002Fdemo_*\u002Fraw_video.mp4`\n- `demos\u002Fdemo_*\u002Fraw_video_timestamps.csv`\n- `demos\u002Fdemo_*\u002Fcamera_trajectory.csv`\n- `demos\u002Fgripper_calibration*\u002Fraw_video.mp4`\n\nYou also need:\n\n- camera intrinsics: `example\u002Fcalibration\u002Ffisheye.json`\n- ArUco configuration: `example\u002Fcalibration\u002Faruco_config.yaml`\n\nIf needed, you can override them with:\n\n```bash\n--camera_intrinsics \u002Fpath\u002Fto\u002Fcustom_fisheye.json\n--aruco_config \u002Fpath\u002Fto\u002Fcustom_aruco_config.yaml\n```\n\n---\n\n#### Step 3 — Run the full dataset pipeline\n\n```bash\nconda activate umi\n\npython run_dataset_pipeline.py \\\n  --session_dir \u002Fpath\u002Fto\u002Faligned_bags \\\n  --output \u002Fpath\u002Fto\u002Faligned_bags\u002FDATASET_NAME.zarr.zip \n```\n\n\n---\n\n#### Output Summary\n\nAfter the full pipeline finishes, the main outputs are:\n\n```text\naligned_bags\u002F\n├── demos\u002F\n│   ├── demo_000000_000000\u002F\n│   │   ├── raw_video.mp4\n│   │   ├── raw_video_timestamps.csv\n│   │   ├── camera_trajectory.csv\n│   │   ├── tag_detection.pkl\n│   │   └── source.txt\n│   ├── ...\n│   ├── gripper_calibration*\u002F\n│   │   ├── raw_video.mp4\n│   │   ├── 
raw_video_timestamps.csv\n│   │   ├── tag_detection.pkl\n│   │   └── gripper_range.json\n│\n├── dataset_plan.pkl\n└── DATASET_NAME.zarr.zip\n```\n\n> **Note:** This version is currently designed for the **single-gripper UMI-3D setup**, where `camera_idx` is fixed to 0.\n\n\n## 4. Next Step: Policy Training and Deployment\n\nAfter obtaining the final dataset:\n\n```bash\nDATASET_NAME.zarr.zip\n```\n\nyou can proceed to policy training and real-world deployment using the UMI-3D Policy framework:\n\n👉 https:\u002F\u002Fgithub.com\u002FPhysical-Intelligence-Laboratory\u002FUMI-3D-Policy\n\nThis repository provides:\n\n- Diffusion policy training\n- Real-world deployment on robotic platforms\n\u003Cdiv align=\"left\"> \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FPhysical-Intelligence-Laboratory\u002FUMI-3D-Policy\"> \u003Cimg src=\"docs\u002Fassets\u002Fdemo1.gif\" width=\"60%\" \u002F> \u003C\u002Fa> \u003C\u002Fdiv> \n\n## Citation\n\nIf you find this work useful for your research, please consider citing:\n\n```bibtex\n@misc{wang2026umi3dextendinguniversalmanipulation,\n  title={UMI-3D: Extending Universal Manipulation Interface from Vision-Limited to 3D Spatial Perception},\n  author={Ziming Wang},\n  year={2026},\n  eprint={2604.14089},\n  archivePrefix={arXiv},\n  primaryClass={cs.RO},\n  url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.14089}\n}\n```\n\n## Acknowledgements\n\nThis project builds upon a number of outstanding open-source works in LiDAR SLAM, calibration, and embodied perception, including: [UMI](https:\u002F\u002Fgithub.com\u002Freal-stanford\u002Funiversal_manipulation_interface), [VoxelMap](https:\u002F\u002Fgithub.com\u002Fhku-mars\u002Fvoxelmap),  [FAST-LIVO2](https:\u002F\u002Fgithub.com\u002Fhku-mars\u002Ffast-livo2), [FAST-LIO](https:\u002F\u002Fgithub.com\u002Fhku-mars\u002FFAST_LIO), [IKFoM](https:\u002F\u002Fgithub.com\u002Fhku-mars\u002FIKFoM), 
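[velo2cam_calibration](https:\u002F\u002Fgithub.com\u002Fbeltransen\u002Fvelo2cam_calibration), [FAST-Calib](https:\u002F\u002Fgithub.com\u002Fhku-mars\u002FFAST-Calib). We sincerely thank the authors and contributors of these projects for their pioneering work and valuable contributions to the community, which have greatly inspired and enabled the development of UMI-3D.\n","UMI-3D is an end-to-end pipeline for SLAM and data processing that converts raw rosbag recordings into training-ready datasets for embodied manipulation learning. Its core functions include sensor calibration, synchronization, SLAM, and data alignment and processing, and it supports multiple sensor modalities such as LiDAR point clouds, IMU measurements, and camera images. The project is developed in Python and offers good extensibility and compatibility. UMI-3D is particularly suited to research scenarios that require high-precision environment perception and object manipulation, such as robot navigation and automated assembly lines. In addition, by providing detailed data-collection hardware design guides and software processing workflows, UMI-3D makes it easier for researchers to build a complete experimental platform.",2,"2026-06-11 02:46:02","CREATED_QUERY"]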