[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82737":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":12,"contributorsCount":13,"subscribersCount":13,"size":13,"stars1d":12,"stars7d":14,"stars30d":15,"stars90d":13,"forks30d":13,"starsTrendScore":16,"compositeScore":17,"rankGlobal":8,"rankLanguage":8,"license":8,"archived":18,"fork":18,"defaultBranch":19,"hasWiki":18,"hasPages":18,"topics":20,"createdAt":8,"pushedAt":8,"updatedAt":21,"readmeContent":22,"aiSummary":23,"trendingCount":13,"starSnapshotCount":13,"syncStatus":24,"lastSyncTime":25,"discoverSource":26},82737,"tau-0-wm","sii-research\u002Ftau-0-wm","sii-research",null,"Python",208,12,3,0,30,95,16,67.84,false,"main",[],"2026-06-12 04:01:38","\n# $\\tau_0$-World Model\n\n\u003Cdiv id=\"top\" align=\"center\">\n\n![Overview](figures\u002FVAM-teaser-img.jpg)\n\n\u003C\u002Fa> &nbsp; \u003Ca href='https:\u002F\u002Ffinch.agibot.com\u002Fresearch\u002Ftau0-wm'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject_Website-tau0_WM-blue' height='25'>\u003C\u002Fa> &nbsp; \u003Ca href='https:\u002F\u002Ffinch-static.agibot.com\u002FVAM\u002Fblog\u002Ftau_0_wm.pdf'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPaper-tau_0_WM-red' height='25'>\u003C\u002Fa> &nbsp; \u003Ca href='https:\u002F\u002Fhuggingface.co\u002Fsii-research\u002Ftau-0-wm'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWeight-huggingface-orange' height='25'>\u003C\u002Fa> &nbsp;\n\n\u003C\u002Fdiv>\n\nThis repo is the official implementation of **$\\tau_0$-World Model: A Unified Video-Action World Model for Robotic Manipulation**.\n\n\n## News\n- [2026.05.31] 🚀 We release $\\tau_0$-World Model [Paper](https:\u002F\u002Ffinch-static.agibot.com\u002FVAM\u002Fblog\u002Ftau_0_wm.pdf), [Project 
Website](https:\u002F\u002Ffinch.agibot.com\u002Fresearch\u002Ftau0-wm), [Huggingface](https:\u002F\u002Fhuggingface.co\u002Fsii-research\u002Ftau-0-wm).\n\n\n## Pretrained Model\n\n* The pretrained weights of VAM are available on [Hugging Face](https:\u002F\u002Fhuggingface.co\u002Fsii-research\u002Ftau-0-wm).\n\n* The pretrained weights of the Simulator will be released soon.\n\n* The code for Test-Time Computation will be released together with the pretrained weights of the Simulator.\n\n\n## Real-World Deployment\n\n### Setup\n\n```\npip install -r requirements.txt\n```\n\n### Preparation\n\n1. Download the pretrained weights of $\\tau_0$-World Model.\n\n2. Download the weights of [Wan2.2-TI2V-5B](https:\u002F\u002Fhuggingface.co\u002FWan-AI\u002FWan2.2-TI2V-5B).\n\n3. Set `diffusion_model.model_path` in `configs\u002Fdeployment\u002Fwan_pretrain_rela_eef6d.yaml` to the local path of the $\\tau_0$-WM weights.\n\n4. Set `vae_path` in the same config to the local path of the VAE weights.\n\n5. Set `text_encoder.checkpoint_path` and `text_encoder.tokenizer_path` in the same config to the local paths of the text encoder and tokenizer.\n\n\n### Action Space\n\nThe *state* sent to the server should contain the **absolute** poses of the two end-effectors: 14 channels in total (xyz position and a quaternion in *xyzw* order for each arm). The coordinate origin of each end-effector pose is its corresponding **Arm Base link**.\n\nThe *gripper state* should contain 2 channels, each ranging from 0 to 120 (0 for open and 120 for closed).\n\nThe *action* returned by the server contains the **absolute** poses of the end-effectors, with shape {T, 16}. 
\n\nThe order of the output action channels is:\n- left end-effector (xyz + quaternion in *xyzw* order)\n- left gripper openness (ranging from 0 to 1, 0 for open and 1 for closed)\n- right end-effector\n- right gripper openness.\n\n\nDuring pretraining, $\\tau_0$-WM is optimized to predict the relative poses of the end-effectors, with 20 channels (xyz and 6D rotation for each arm). **Conversion between quaternions and 6D rotations is performed automatically.**\n\n\n\n### Running\nWe provide a simple script for deploying the $\\tau_0$-WM server:\n\n```\n# Policy Server\nbash run_infer_server.sh $HOST $PORT\n\n# A simple client that sends random observations\npython web_infer_utils\u002Fsimple_client.py\n```\n\n## Acknowledgment\n- The video model of $\\tau_0$-WM is built on [Wan-2.2](https:\u002F\u002Fgithub.com\u002FWan-Video\u002FWan2.2).\n- Some code in this repo is adapted from [GE-Act](https:\u002F\u002Fgithub.com\u002FAgibotTech\u002FGenie-Envisioner.git).\n- The WebSocket-based policy server is built on [openpi](https:\u002F\u002Fgithub.com\u002FPhysical-Intelligence\u002Fopenpi).\n\n\n### License\nData and code in this repo are released under the [Apache License 2.0](https:\u002F\u002Fgithub.com\u002Fhuggingface\u002Fdiffusers\u002Fblob\u002Fmain\u002FLICENSE).\n\n","$\\tau_0$-World Model is a unified video-action world model designed for robotic manipulation. Its core capability is predicting robot action sequences from video input, with support for computing and converting absolute poses and gripper states, which makes it suitable for scenarios that require high-precision control. The model is implemented in Python, and pretrained weights are available on Hugging Face for quick deployment. It is particularly suited to research and industrial applications involving autonomous robot manipulation in complex environments.",2,"2026-06-11 04:09:05","CREATED_QUERY"]