[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80011":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":15,"stars7d":17,"stars30d":13,"stars90d":16,"forks30d":16,"starsTrendScore":13,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":20,"hasPages":20,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":16,"starSnapshotCount":16,"syncStatus":15,"lastSyncTime":26,"discoverSource":27},80011,"openpi-RLT","Yyshadow\u002Fopenpi-RLT","Yyshadow","openpi-RLT is an openpi-based real-robot RL system with RL-token-guided action refinement.","",null,"Python",75,6,69,2,0,4,2.54,"Apache License 2.0",false,"main",[],"2026-06-12 02:03:56","# openpi-RLT\n\n`openpi-RLT` is an openpi-based reproduction of **RL Token (RLT)** for\nreal-robot online reinforcement learning. 
It keeps the upstream\n[openpi](https:\u002F\u002Fgithub.com\u002FPhysical-Intelligence\u002Fopenpi) VLA training and\ninference stack, then adds the RLT token module, policy-serving path, replay\nruntime, actor-critic learner, and real-robot rollout tools needed to run RLT\nend to end.\n\nTo the best of our knowledge, openpi-RLT is the first open-source,\nopenpi\u002Fpi0.5-based real-robot reproduction of the RLT-style pipeline,\ndemonstrated on Ethernet insertion with RL-token adaptation, frozen VLA\nreference serving, online actor-critic learning, replay, rollout, and\nevaluation.\n\n## Real-Robot Results\n\nThe first clip is the frozen VLA baseline; the second clip is the RLT policy\nafter online training.\n\n\u003Cp align=\"center\">\u003Cstrong>Frozen VLA Baseline\u003C\u002Fstrong>\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cimg alt=\"Frozen VLA baseline on Ethernet insertion\" src=\"media\u002Fethernet_vla_baseline.gif\" width=\"48%\"\u002F>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\u003Cstrong>RLT Policy\u003C\u002Fstrong>\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cimg alt=\"RLT policy on Ethernet insertion\" src=\"media\u002Fethernet_rlt_policy.gif\" width=\"48%\"\u002F>\n\u003C\u002Fp>\n\nHigh-resolution MP4s are also available:\n[VLA baseline](media\u002Fethernet_vla_baseline.mp4) and\n[RLT policy](media\u002Fethernet_rlt_policy.mp4).\n\n## Core Components\n\nThe upstream openpi layout is preserved. This section highlights the main\nRLT-specific additions and modified entry points that make up openpi-RLT.\n\n| Component | Path | Role |\n| --- | --- | --- |\n| RL-token model integration | `src\u002Fopenpi\u002Fmodels\u002Frl_token.py`, `src\u002Fopenpi\u002Fmodels\u002Fpi0.py` | Adds the RL-token encoder\u002Fdecoder and pi0\u002Fpi0.5 hooks used by the RLT policy. |\n| Training entry points | `src\u002Fopenpi\u002Ftraining\u002Fconfig.py`, `scripts\u002Ftrain_rlt.py` | Registers RLT configs and launches stage-1 RL-token training. 
|\n| Remote policy serving | `scripts\u002Fserve_rlt_policy.py`, `packages\u002Fopenpi-client\u002F` | Serves frozen VLA references and compact RLT features to the online RL runtime. |\n| Deployment policy adapter | `src\u002Fopenpi\u002Fpolicies\u002Fagilexbag_image_policy.py` | Adapts image observations and action chunks for deployment. |\n| Online RL runtime | `rlt_online_rl\u002Fsrc\u002Frlt_online_rl\u002F` | Contains actor, critic, learner, replay, inference, and rollout-side runtime modules. |\n| Experiment launch and configs | `rlt_online_rl\u002Flaunch\u002F`, `rlt_online_rl\u002Fconfigs\u002F` | Provides launch scripts and runtime configs for experiments such as Ethernet insertion. |\n| Robot interface bridge | `rlt_online_rl\u002Ftrain_deploy_alignment\u002F` | Connects the online RL runtime with real-robot control and signal interfaces. |\n| Replay and analysis tools | `rlt_online_rl\u002Fscripts\u002F` | Includes offline replay inspection, replay export, and experiment utilities. |\n| Demo assets and notes | `media\u002F`, `docs\u002F` | Stores README media assets and focused setup notes. |\n\nDetailed runtime instructions live in\n[rlt_online_rl\u002FREADME.md](rlt_online_rl\u002FREADME.md), and package internals are\nsummarized in\n[rlt_online_rl\u002Fsrc\u002Frlt_online_rl\u002FREADME.md](rlt_online_rl\u002Fsrc\u002Frlt_online_rl\u002FREADME.md).\n\n## Getting Started\n\nClone the repository and install the openpi\u002FRLT stack from the repository root:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FYyshadow\u002Fopenpi-RLT.git\ncd openpi-RLT\nuv sync\nuv pip install -e .\n```\n\nRLT training configs are registered in\n[`src\u002Fopenpi\u002Ftraining\u002Fconfig.py`](src\u002Fopenpi\u002Ftraining\u002Fconfig.py). 
A typical\nstage-1 command looks like:\n\n```bash\nuv run scripts\u002Ftrain_rlt.py rlt_pi05_agilexbag_image_delta_joint \\\n  --exp-name \u003Crun-name> \\\n  --overwrite\n```\n\nAfter a trained checkpoint is available, serve the frozen VLA\u002FRLT policy with:\n\n```bash\npython scripts\u002Fserve_rlt_policy.py \\\n  --config rlt_pi05_agilexbag_image_delta_joint \\\n  --checkpoint-dir \u003Ccheckpoint-dir> \\\n  --port 8000\n```\n\nFor the real-robot online RL launch order, keyboard controls, replay semantics,\nand eval-only rollout flow, follow\n[rlt_online_rl\u002FREADME.md](rlt_online_rl\u002FREADME.md).\n\n## Relationship to Upstream Work\n\nThis project follows the RLT recipe from **RL Token: Bootstrapping Online RL\nwith Vision-Language-Action Models** and uses openpi's pi0.5 as the base VLA\nstack. The RLT-specific additions in this fork are layered on top of upstream\nopenpi rather than replacing it with another backbone.\n\nKey implementation pieces include:\n\n- RL-token encoder\u002Fdecoder modules and pi0.5 integration.\n- RL-token training with an optional supervised VLA fine-tuning term.\n- Frozen VLA policy serving that returns reference action chunks and compact\n  RL-token features.\n- A lightweight online actor-critic runtime for chunk-level action refinement.\n- Real-robot replay collection, episode finalization, human-intervention\n  handling, and eval-only rollout support.\n\n## Contributors\n\nCore contributors are listed in contribution order:\n\nYi Yang\u003Csup>&#42;\u003C\u002Fsup>, Huaihang Zheng\u003Csup>&#42;\u003C\u002Fsup>, Kai Ma\u003Csup>&#42;&#8224;\u003C\u002Fsup>, Tian Xie, Guozheng Li, Shenglin Xu,\nXiangyu Wang, Yiren Ma, Baoxu Liu\u003Csup>&#8224;\u003C\u002Fsup>\n\n\u003Csub>&#42; Equal contribution. 
&#8224; Project lead.\u003C\u002Fsub>\n\n## Citation\n\nPlease cite this repository, the original RLT paper, and openpi when using this\ncode in your work.\n\n```bibtex\n@misc{openpi_rlt_2026,\n  title = {openpi-RLT: Real-Robot RLT Reproduction on openpi},\n  author = {Yi Yang and Huaihang Zheng and Kai Ma and Tian Xie and Guozheng Li and Shenglin Xu and Xiangyu Wang and Yiren Ma and Baoxu Liu},\n  year = {2026},\n  note = {Open-source real-robot reproduction of RL Token (RLT) built on openpi.}\n}\n```\n\n## Acknowledgements\n\nThis repository builds on\n[openpi](https:\u002F\u002Fgithub.com\u002FPhysical-Intelligence\u002Fopenpi) from Physical\nIntelligence and follows the RLT paper\n[RL Token: Bootstrapping Online RL with Vision-Language-Action Models](https:\u002F\u002Fpi.website\u002Fresearch\u002Frlt).\n\n## License\n\nSee [LICENSE](LICENSE) and [LICENSE_GEMMA.txt](LICENSE_GEMMA.txt).\n","openpi-RLT is an openpi-based real-robot reinforcement learning system that learns online through RL-token-guided action refinement. It keeps the upstream openpi VLA training and inference stack and adds the RL-token module, policy-serving path, replay runtime, actor-critic learner, and real-robot rollout tools needed to support the end-to-end RLT pipeline. It is particularly suited to real physical tasks that require adjustment based on real-time feedback, such as the Ethernet plug insertion experiment in the demo, which shows how a frozen VLA reference is served and how online learning improves policy performance. The project is open source under the Apache License 2.0.","2026-06-11 03:58:53","CREATED_QUERY"]