[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82739":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},82739,"wuji-mjlab","wuji-technology\u002Fwuji-mjlab","wuji-technology","Wuji Hand in-hand reorientation RL with sim-to-real deployment, built on mjlab","",null,"Python",156,5,57,0,3,26,93,15,70.63,"Apache License 2.0",false,"main",true,[],"2026-06-12 04:01:38","# wuji-mjlab\n\n[中文版](README_zh.md)\n\n[![License: Apache 
2.0](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache_2.0-blue.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FApache-2.0)\n[![Release](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fv\u002Frelease\u002Fwuji-technology\u002Fwuji-mjlab)](https:\u002F\u002Fgithub.com\u002Fwuji-technology\u002Fwuji-mjlab\u002Freleases)\n[![CI](https:\u002F\u002Fgithub.com\u002Fwuji-technology\u002Fwuji-mjlab\u002Factions\u002Fworkflows\u002Fci.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fwuji-technology\u002Fwuji-mjlab\u002Factions\u002Fworkflows\u002Fci.yml)\n[![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.11+-blue.svg)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n[![PyTorch](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPyTorch-2.7%2B-EE4C2C?logo=pytorch&logoColor=white)](https:\u002F\u002Fpytorch.org\u002F)\n[![CUDA](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCUDA-12.8-76B900?logo=nvidia&logoColor=white)](https:\u002F\u002Fdeveloper.nvidia.com\u002Fcuda-toolkit)\n[![Ruff](https:\u002F\u002Fimg.shields.io\u002Fendpoint?url=https:\u002F\u002Fraw.githubusercontent.com\u002Fastral-sh\u002Fruff\u002Fmain\u002Fassets\u002Fbadge\u002Fv2.json)](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fruff)\n[![pre-commit](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpre--commit-enabled-brightgreen?logo=pre-commit)](https:\u002F\u002Fgithub.com\u002Fpre-commit\u002Fpre-commit)\n[![Stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fwuji-technology\u002Fwuji-mjlab?style=social)](https:\u002F\u002Fgithub.com\u002Fwuji-technology\u002Fwuji-mjlab\u002Fstargazers)\n\n> In-hand cube reorientation on the Wuji Hand: PPO policies trained in mjlab (GPU-batched physics via mujoco-warp), covering the full SO(3) goal space, with a sim2real bridge for closed-loop deployment on the physical hand.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fsim.gif\" width=\"45%\" alt=\"sim reorient demo\" 
\u002F>\n  \u003Cimg src=\"docs\u002Fassets\u002Freal.gif\" width=\"45%\" alt=\"real-hand reorient demo\" \u002F>\n\u003C\u002Fp>\n\n## Tasks\n\n| Robot | Task ID | Pretrained checkpoint | Demo |\n|---|---|---|---|\n| Wuji Hand | `WujiHand_Reorient` | [Latest release assets](https:\u002F\u002Fgithub.com\u002Fwuji-technology\u002Fwuji-mjlab\u002Freleases\u002Flatest) | sim + real GIFs above |\n\nPull the checkpoint and CAD bundle from the latest release:\n\n```bash\n# Requires gh CLI (https:\u002F\u002Fcli.github.com); the glob keeps this command\n# working across future release tags. See docs\u002Fsim2real\u002Fsetup.md §3 for\n# the manual fallback if you don't have gh installed.\ngh release download --repo wuji-technology\u002Fwuji-mjlab --pattern '*-assets.zip'\nunzip wuji-mjlab-*-assets.zip\nmv wuji-mjlab-*-assets release-assets\n```\n\n## Repository layout\n\n```text\nwuji-mjlab\u002F\n├── src\u002F\n│   ├── wuji_mjlab\u002F        # task package (tasks\u002Freorient\u002F, assets\u002F, utils\u002F, rl\u002F)\n│   └── wuji_rl_libs\u002F      # vendored rsl-rl (5.0.1+wuji1, min_std clamp)\n├── deploy\u002Freorient\u002F       # sim2real bridge (vision, ZMQ, hand driver)\n├── scripts\u002F               # train \u002F play \u002F tools entry points\n├── docs\u002F                  # architecture + sim2real setup\n├── pixi.toml              # canonical install + task runner\n└── pyproject.toml         # package metadata\n```\n\n## Requirements\n\n- Linux x86_64\n- NVIDIA GPU, CUDA 12.8 (Blackwell sm_120 \u002F RTX 50-series supported)\n- [pixi](https:\u002F\u002Fpixi.sh) ≥ 0.66 (the version CI uses) — **the only supported installer**\n- For sim2real: Wuji Hand hardware + Hikrobot USB-3 camera + Hikvision MVS SDK + 3D-printed ArUco-tagged cube + wrist AprilTag — see [`docs\u002Fsim2real\u002Fsetup.md`](docs\u002Fsim2real\u002Fsetup.md)\n\n> ⚠️ **CAUTION**: this repo is **pixi-only**. 
`conda + pip install -e .` is not tested and not supported.\n\n## Installation\n\n```bash\n# 1. install pixi (one-time)\ncurl -fsSL https:\u002F\u002Fpixi.sh\u002Finstall.sh | bash\n\n# 2. clone + resolve environment\ngit clone https:\u002F\u002Fgithub.com\u002Fwuji-technology\u002Fwuji-mjlab\ncd wuji-mjlab\npixi install\n```\n\nThis produces a `default` environment for training\u002Feval and an optional `deploy` environment (`pixi install -e deploy`) for the sim2real bridge.\n\nVerify the environment: `pixi run list-envs` (lists registered tasks and confirms the mjlab + tyro stack imports cleanly).\n\n## Train\n\n```bash\npixi run train --task WujiHand_Reorient --agent.upload-model False\n```\n\n`--agent.upload-model False` keeps checkpoints local-only. Drop it (and set `WANDB_API_KEY`) to also push the final-iteration checkpoint to W&B as a model artifact — local `.pt` files are still written on every `save_interval` boundary either way.\n\n> **If `pixi run train` OOMs**, swap to the lower-VRAM variant:\n>\n> ```bash\n> pixi run train --task WujiHand_Reorient_Light\n> ```\n>\n> `WujiHand_Reorient` approximately reproduces the released checkpoint at `num_envs=8192, max_iterations=5000` and needs ~20 GB of GPU memory. `WujiHand_Reorient_Light` uses `num_envs=4096, max_iterations=7500` — fits comfortably under ~12 GB but converges to a visibly weaker policy (occasional cube drops, finger-jam behavior on harder reorientations).\n\nCheckpoints and W&B logs land under `logs\u002Frsl_rl\u002F\u003Crun_name>\u002F`. 
Task MDP, reward shaping, and the contact-parameter domain randomisation split into two anatomical groups (palm + thumb compliance zone vs fingers 2-5) are documented in the Architecture section below.\n\n## Play and evaluate\n\n```bash\n# Interactive viewer with a trained checkpoint\npixi run play --task WujiHand_Reorient --checkpoint-file \u003Cpath-to-ckpt.pt>\n\n# Success-rate eval over N trials (consumes ONNX)\npixi run python -m wuji_mjlab.tasks.reorient.scripts.eval_success_rate \u003Cpath-to-policy.onnx>\n\n# Export PPO checkpoint → ONNX (sidecar JSON with action_scale \u002F ema_alpha \u002F ctrl_dt)\npixi run python -m wuji_mjlab.tasks.reorient.scripts.export_onnx \u003Cpath-to-ckpt.pt>\n```\n\nAdditional dev utilities:\n\n```bash\npixi run list-envs                                                              # list registered tasks\npixi run python -m wuji_mjlab.tasks.reorient.scripts.view_task WujiHand_Reorient  # view task with a dummy policy\n```\n\n## Sim-to-real\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fdeploy.gif\" width=\"80%\" alt=\"sim2real deploy rig: camera over the Wuji Hand + jig, MuJoCo mirror viewer on the right\" \u002F>\n\u003C\u002Fp>\n\nThe deploy bridge runs the exported ONNX policy on the real Wuji Hand. A vision module tracks an ArUco-tagged cube (anchored to a wrist AprilTag world frame) via a USB camera and publishes the pose over ZMQ; `play_real` subscribes to that pose, runs ONNX inference, and closes the loop by sending commands to the hand driver.\n\n> **No training needed to deploy.** Download the pre-trained `policy.onnx` + `policy_config.json` from [Releases](https:\u002F\u002Fgithub.com\u002Fwuji-technology\u002Fwuji-mjlab\u002Freleases) and pass the `policy.onnx` path as `--ckpt` below. 
The released policy is what produces the demo GIF above.\n\n```bash\n# After the hardware setup in docs\u002Fsim2real\u002Fsetup.md is complete:\npixi run -e deploy home                              # reset hand to home pose\npixi run -e deploy vision                            # launch cube observer (OpenCV preview)\npixi run -e deploy play-real --ckpt \u003Cpath-to.onnx>   # closed-loop control + mirror viewer\n```\n\n- **Software pipeline & configuration**: [`deploy\u002Freorient\u002FREADME.md`](deploy\u002Freorient\u002FREADME.md)\n- **Hardware setup, 3D-printed cube, camera mounting, calibration**: [`docs\u002Fsim2real\u002Fsetup.md`](docs\u002Fsim2real\u002Fsetup.md)\n\n## Architecture\n\nThree-layer stack: this repo (tasks + deploy) → [mjlab](https:\u002F\u002Fgithub.com\u002Fmujocolab\u002Fmjlab) → [MuJoCo](https:\u002F\u002Fmujoco.org) + [mujoco-warp](https:\u002F\u002Fgithub.com\u002Fgoogle-deepmind\u002Fmujoco_warp). PPO via the vendored [rsl-rl](https:\u002F\u002Fgithub.com\u002Fleggedrobotics\u002Frsl_rl) backend under `src\u002Fwuji_rl_libs\u002Frsl_rl\u002F`.\n\n\u003Cdetails>\n\u003Csummary>Deep dive — three-layer diagram, MDP spec, domain randomisation, adding a new task\u003C\u002Fsummary>\n\n```\n  +--------------------------------------------------------+\n  | wuji-mjlab (this repo)                                 |\n  |  +----------------------+  +-------------------------+ |\n  |  | tasks\u002Freorient\u002F      |  | deploy\u002Freorient\u002F        | |\n  |  |   - env cfg + MDP    |  |   - real-hand env       | |\n  |  |   - 2-group DR       |  |   - vision pipeline     | |\n  |  |   - eval + export    |  |   - closed-loop control | |\n  |  +----------------------+  +-------------------------+ |\n  |  +----------------------+                              |\n  |  | utils\u002F               |  \u003C- shared building blocks   |\n  |  +----------------------+                              |\n  |  +----------------------+                       
       |\n  |  | rl\u002F                  |  \u003C- thin RL backend adapter  |\n  |  +----------------------+                              |\n  |                                                        |\n  |  src\u002Fwuji_rl_libs\u002Frsl_rl\u002F \u003C- vendored PPO backend      |\n  +--------------------------------------------------------+\n              |                            |\n              v                            v\n  +-----------------------+  +---------------------------+\n  | mjlab (pip \u002F pixi)    |  | torch + onnxruntime       |\n  | + mujoco-warp         |  | (training + inference)    |\n  | + mujoco              |  |                           |\n  +-----------------------+  +---------------------------+\n```\n\n### Reorient task (`src\u002Fwuji_mjlab\u002Ftasks\u002Freorient\u002F`)\n\nFull SO(3) in-hand reorientation with the Wuji Hand. Files:\n\n| File | Role |\n|---|---|\n| `reorient_env_cfg.py` | Top-level `ManagerBasedRlEnvCfg` factory |\n| `reorient_terms.py` | All event \u002F termination \u002F reward \u002F DR terms (the **task design** lives here, not in robot bindings) |\n| `reorient_constants.py` | Initial pose constants (palm-up R_y(-90°), cube above palm) |\n| `config\u002Fwuji_hand\u002F` | Robot-binding layer: thin wiring of the task design onto Wuji Hand (20-DoF dexterous hand) |\n| `mdp\u002F` | Observations, commands, actions specific to reorientation |\n| `tooling\u002F` | Eval entrypoints + ONNX export |\n\nTask design (MDP terms, reward shaping, anatomically-split contact-parameter DR groups) lives in `reorient_terms.py`. See [`src\u002Fwuji_mjlab\u002Ftasks\u002Freorient\u002FREADME.md`](src\u002Fwuji_mjlab\u002Ftasks\u002Freorient\u002FREADME.md) for the architecture invariants and [`deploy\u002Freorient\u002FREADME.md`](deploy\u002Freorient\u002FREADME.md) for the sim2real bridge — `RealHandEnv` reuses the sim observation + action managers verbatim, no parallel pipelines.\n\n### Adding a new task\n\n1. 
Create `src\u002Fwuji_mjlab\u002Ftasks\u002F\u003Cyour_task>\u002F` with an env cfg factory.\n2. Put all MDP design (events, rewards, terminations) in `\u003Cyour_task>_terms.py`. The robot-specific config layer in `config\u002F\u003Crobot>\u002F` should be a thin binding only.\n3. Register via `register_mjlab_task()` in `config\u002F\u003Crobot>\u002F__init__.py`.\n4. Add a quick `pixi run train --task \u003Cyour_task_id>` smoke run before committing (canonical training entrypoint is `scripts\u002Ftrain\u002Ftrain_rsl_rl.py`, exposed via the `train` pixi task).\n\n\u003C\u002Fdetails>\n\n## Development\n\nAfter cloning, install the pre-commit hooks:\n\n```bash\npixi run pre-commit install\n```\n\nEvery `git commit` then runs ruff, codespell, and the YAML\u002FTOML\u002Flarge-file checks defined in [`.pre-commit-config.yaml`](.pre-commit-config.yaml) — locally, before CI sees the change. Manual full-tree run: `pixi run pre-commit run --all-files`.\n\n> ⚠️ Don't `pip install` packages into the pixi env — pip deps aren't tracked by `pixi.toml` \u002F `pixi.lock` and disappear on the next resolve. 
Edit `pixi.toml` and run `pixi install`.\n\n## Related Projects\n\n- [wujihandpy](https:\u002F\u002Fgithub.com\u002Fwuji-technology\u002Fwujihandpy) — Wuji Hand SDK (C++ core with Python bindings)\n- [wuji-retargeting](https:\u002F\u002Fgithub.com\u002Fwuji-technology\u002Fwuji-retargeting) — Hand pose retargeting (Vision Pro \u002F glove \u002F video → robot joints)\n- [wujihandros2](https:\u002F\u002Fgithub.com\u002Fwuji-technology\u002Fwujihandros2) — ROS 2 driver for Wuji Hand\n- [docs.wuji.tech](https:\u002F\u002Fdocs.wuji.tech) — Official Wuji documentation portal\n\n## Acknowledgements\n\nThis project builds on the following open-source projects:\n\n- [mjlab](https:\u002F\u002Fgithub.com\u002Fmujocolab\u002Fmjlab) — manager-based RL framework\n- [mujoco-warp](https:\u002F\u002Fgithub.com\u002Fgoogle-deepmind\u002Fmujoco_warp) — GPU-batched MuJoCo physics\n- [MuJoCo](https:\u002F\u002Fmujoco.org\u002F) — the underlying physics engine\n- [rsl_rl](https:\u002F\u002Fgithub.com\u002Fleggedrobotics\u002Frsl_rl) — PPO implementation (vendored under `src\u002Fwuji_rl_libs\u002F`)\n- [pupil-apriltags](https:\u002F\u002Fgithub.com\u002Fpupil-labs\u002Fapriltags) — AprilTag detector for the deploy vision module\n\n## Contributors\n\n- [Jielin Wu](https:\u002F\u002Fgithub.com\u002FAIRJASON50)\n- [Shenzhe Yao](https:\u002F\u002Fgithub.com\u002FLeopoldYao)\n- [Han Yang](https:\u002F\u002Fgithub.com\u002Fyanghan-a)\n- [Li Chengmeng](https:\u002F\u002Fgithub.com\u002FAsahelLee)\n\n## Citation\n\nIf you find this project useful, please consider citing:\n\n```bibtex\n@software{wuji2026mjlab,\n  title={Wuji-MJLab: RL Training for Wuji Hand Dexterous Manipulation},\n  author={{Wuji Technology}},\n  year={2026},\n  url={https:\u002F\u002Fgithub.com\u002Fwuji-technology\u002Fwuji-mjlab}\n}\n```\n\n## License\n\nApache 2.0. 
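See [LICENSE](LICENSE) and [NOTICE](NOTICE) for third-party attribution.\n","wuji-mjlab is a reinforcement learning project built on mjlab that focuses on in-hand object reorientation with the Wuji Hand. It trains PPO policies in simulation and can deploy the trained policy to the physical robotic hand, covering the full SO(3) goal space. Technically, it uses Python, PyTorch 2.7+, and CUDA 12.8 for efficient GPU-batched physics simulation. It suits research or development scenarios that require precise control of robotic hand motion, such as precision manipulation tasks on automated assembly lines.",2,"2026-06-11 04:09:05","CREATED_QUERY"]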