[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80751":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":14,"stars7d":13,"stars30d":13,"stars90d":15,"forks30d":15,"starsTrendScore":16,"compositeScore":17,"rankGlobal":10,"rankLanguage":10,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":15,"starSnapshotCount":15,"syncStatus":13,"lastSyncTime":26,"discoverSource":27},80751,"RC-aux","Guang000\u002FRC-aux","Guang000","Official Code of \"Predictive but Not Plannable: RC-aux for Latent World Models\"","https:\u002F\u002Fguang000.github.io\u002FRC-aux-Webpage\u002F",null,"Python",43,2,1,0,3,1.43,"MIT License",false,"main",true,[],"2026-06-12 02:04:06","# Predictive but Not Plannable: RC-aux for Latent World Models\n\nThis repository contains the code for **Reachability-Correction auxiliary objective (RC-aux)**.  
RC-aux is a lightweight training and planning correction for reconstruction-free latent world models: it keeps the LeWM backbone unchanged, trains open-loop multi-horizon prediction, adds a budget-conditioned reachability head, and optionally uses that reachability signal during planning.\n\nPaper: [arXiv:2605.07278](https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.07278)\n\n## What Is Included\n\n```text\n.\n├── train.py \u002F eval.py        # five-task pixel-control training and MPC evaluation\n├── jepa.py                   # LeWM\u002FRC-aux latent world model\n├── module.py                 # predictor, regularizer, reachability head\n├── rcaux.py                  # multi-horizon and reachability objectives\n├── config\u002F\n│   ├── train\u002F                # LeWM and RC-aux configs\n│   └── eval\u002F                 # TwoRoom, Reacher, Push-T, Cube configs\n├── scripts\u002F\n│   └── eval_dino_family_official.py\n├── libero\u002F                   # LIBERO-Goal OFT\u002FBCRNN action-head scripts\n├── tools\u002F                    # fixed-group generation and success summaries\n└── results\u002F                  # result CSV summaries\n```\n\nCheckpoints and datasets are not stored in the GitHub repository.  
We release the main public checkpoint separately on [Hugging Face](https:\u002F\u002Fhuggingface.co\u002Fbiubiu116\u002FRC-aux).\n\n## Installation\n\nPython 3.10 is recommended.\n\n```bash\npython -m venv .venv\nsource .venv\u002Fbin\u002Factivate\npython -m pip install --upgrade pip\npython -m pip install -r requirements.txt\n```\n\nFor headless MuJoCo evaluation:\n\n```bash\nexport MUJOCO_GL=egl\n```\n\n## Data and Checkpoints\n\nThe five pixel-control tasks use `stable-worldmodel`'s cache root:\n\n```bash\nexport STABLEWM_HOME=\u002Fpath\u002Fto\u002Fstable-wm-cache\n```\n\nExpected dataset locations:\n\n| Task | Dataset path under `$STABLEWM_HOME` |\n| --- | --- |\n| TwoRoom | `tworoom.h5` |\n| Reacher | `dmc\u002Freacher_random.h5` |\n| Push-T | `pusht_expert_train.h5` |\n| Cube | `ogbench\u002Fcube_single_expert.h5` |\n| Wall | `dino_wall.h5` |\n\nCheckpoint paths are resolved relative to `$STABLEWM_HOME`.  Pass policy stems without the `_object.ckpt` suffix.\n\n## Five-Task Evaluation\n\nExample with the released TwoRoom RC-aux checkpoint:\n\n```bash\nmkdir -p \"$STABLEWM_HOME\u002Ftworoom_rcaux\"\ncp \u002Fpath\u002Fto\u002Fhf_release\u002Frcaux\u002Fcheckpoints\u002Fpixel_control\u002Ftworoom_rcaux\u002F* \"$STABLEWM_HOME\u002Ftworoom_rcaux\u002F\"\n\npython eval.py --config-name=tworoom.yaml \\\n  cache_dir=\"$STABLEWM_HOME\" \\\n  policy=tworoom_rcaux\u002Frcaux_tworoom \\\n  +planner_override.use_reachability_cost=true \\\n  +planner_override.reachability_cost_weight=0.85 \\\n  output.filename=tworoom_rcaux_eval.txt\n```\n\nUse fixed evaluation groups by passing:\n\n```bash\neval.row_indices_file=\u002Fpath\u002Fto\u002Fgroup_00.json\n```\n\nSummarize logs:\n\n```bash\npython tools\u002Fsummarize_group_success.py \u002Fpath\u002Fto\u002Fresults\u002F*.txt\n```\n\nWall uses the DINO-WM environment-native benchmark path.  
Provide the DINO-WM source checkout and Wall dataset explicitly:\n\n```bash\npython scripts\u002Feval_dino_family_official.py \\\n  --source-root \u002Fpath\u002Fto\u002Fdino_wm \\\n  --benchmark wall \\\n  --policy-kind object \\\n  --policy \u002Fpath\u002Fto\u002Flewm_epoch_8_object.ckpt \\\n  --dataset-path \"$STABLEWM_HOME\u002Fdino_wall.h5\" \\\n  --val-dataset-path \"$STABLEWM_HOME\u002Fdino_wall.h5\" \\\n  --num-samples 600 \\\n  --n-steps 20 \\\n  --topk 60 \\\n  --horizon 8 \\\n  --receding-horizon 4 \\\n  --action-block 5 \\\n  --goal-cost-reduce softmin \\\n  --goal-cost-softmin-temperature 1.0 \\\n  --use-reachability-cost on \\\n  --reachability-cost-weight 0.85 \\\n  --output wall_rcaux_eval.txt\n```\n\n## Training\n\nLeWM baseline:\n\n```bash\npython train.py --config-name=lewm.yaml \\\n  data=tworoom \\\n  wandb.enabled=false \\\n  output_model_name=lewm_tworoom \\\n  subdir=tworoom\u002Flewm_tworoom\n```\n\nRC-aux:\n\n```bash\npython train.py --config-name=rcaux_default.yaml \\\n  data=tworoom \\\n  wandb.enabled=false \\\n  output_model_name=rcaux_tworoom \\\n  subdir=tworoom\u002Frcaux_tworoom\n```\n\nContinuation from an existing LeWM checkpoint:\n\n```bash\ninit.weights_path=\u002Fpath\u002Fto\u002Fsource_object.ckpt init.strict=false\n```\n\n## LIBERO-Goal\n\nThe LIBERO-Goal extension code is under `libero\u002F`.  
It trains an OFT-style action chunk head on top of the LeWM-family representation and evaluates with the official LIBERO success checker.\nLIBERO-Goal does not use the MPC planner or `planner_override.*` settings; there is no `reachability_cost_weight` argument in these scripts.\n\nExpected local layout:\n\n```text\nassets\u002Fbenchmarks\u002FLIBERO\u002F\nassets\u002Fdatasets\u002Flibero_goal_agentview.h5\ncheckpoints\u002Flibero\u002Flewm_epoch_40_object.ckpt\n```\n\nTrain the OFT-style action head:\n\n```bash\npython libero\u002Ftrain_libero_goal_lewm_oft_head.py \\\n  --libero-root assets\u002Fbenchmarks\u002FLIBERO \\\n  --init-policy checkpoints\u002Flibero\u002Flewm_epoch_40_object.ckpt \\\n  --tasks all \\\n  --image-keys agentview_rgb,eye_in_hand_rgb \\\n  --chunk-len 8 \\\n  --action-horizon 8 \\\n  --hidden-dim 1024 \\\n  --batch-size 32 \\\n  --max-epochs 30 \\\n  --train-encoder \\\n  --run-dir runs\u002Flibero_goal_rcaux_oft\n```\n\nEvaluate:\n\n```bash\nTORCH_FORCE_NO_WEIGHTS_ONLY_LOAD=1 \\\npython libero\u002Feval_libero_goal_lewm_oft_head.py \\\n  --checkpoint runs\u002Flibero_goal_rcaux_oft\u002Flewm_libero_oft_head_epoch_30.ckpt \\\n  --tasks all \\\n  --n-eval 50 \\\n  --max-steps 600 \\\n  --output results\u002Flibero_goal_rcaux_oft_n50.json\n```\n\nOn systems where robosuite cannot create an EGL offscreen context, run the same command with `MUJOCO_GL=glx`.\n\n## Main Reported Results\n\nThe CSV summaries in `results\u002F` mirror the reported tables.  
Local LeWM-family rows are mean±std over five fixed evaluation groups of 50 episodes.\n\n| Task | LeWM | LeWM-cont | RC-aux | Matched delta |\n| --- | ---: | ---: | ---: | ---: |\n| TwoRoom | 88.8±3.0 | 88.8±3.0 | 98.0±1.4 | +9.2 |\n| Reacher | 81.2±7.9 | 82.8±7.2 | 87.2±6.4 | +4.4 |\n| Push-T | 90.4±3.0 | 91.2±3.9 | 90.8±3.3 | -0.4 |\n| Wall | 50.4±6.5 | -- | 83.6±3.6 | +33.2 |\n| Cube | 72.4±5.9 | 72.8±5.2 | 76.0±7.5 | +3.2 |\n\nFor TwoRoom, Reacher, Push-T, and Cube, matched deltas compare against LeWM-cont.  For Wall, no continuation control is available, so the matched delta compares against local LeWM.\n\n## Citation\n\n```bibtex\n@article{li2026predictive,\n  title={Predictive but Not Plannable: RC-aux for Latent World Models},\n  author={Li, Wenyuan and Li, Guang and Maeda, Keisuke and Ogawa, Takahiro and Haseyama, Miki},\n  journal={arXiv preprint arXiv:2605.07278},\n  year={2026}\n}\n```\n\nThis project builds on LeWorldModel, stable-worldmodel, stable-pretraining, DINO-WM, and LIBERO.  Please cite the corresponding original work when using those components.\n","This project provides the official code for the Reachability-Correction auxiliary objective (RC-aux), a lightweight training and planning correction for reconstruction-free latent world models. Its core features are training open-loop multi-horizon prediction while keeping the LeWM backbone unchanged, and adding a budget-conditioned reachability head whose signal can optionally be used during planning. It is written in Python with a modular design and supports five-task pixel-control training and evaluation across multiple environments. It suits reinforcement learning settings that need better long-horizon prediction accuracy and planning ability, particularly applications that rely on an efficient, accurate world model to guide decisions.","2026-06-11 04:01:53","CREATED_QUERY"]