[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80792":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":8,"language":10,"languages":8,"totalLinesOfCode":8,"stars":11,"forks":12,"watchers":11,"openIssues":12,"contributorsCount":12,"subscribersCount":12,"size":12,"stars1d":12,"stars7d":12,"stars30d":12,"stars90d":12,"forks30d":12,"starsTrendScore":12,"compositeScore":12,"rankGlobal":8,"rankLanguage":8,"license":8,"archived":13,"fork":13,"defaultBranch":14,"hasWiki":15,"hasPages":13,"topics":16,"createdAt":8,"pushedAt":8,"updatedAt":17,"readmeContent":18,"aiSummary":19,"trendingCount":12,"starSnapshotCount":12,"syncStatus":20,"lastSyncTime":21,"discoverSource":22},80792,"ECHO_CODE","Hxxxz0\u002FECHO_CODE","Hxxxz0",null,"https:\u002F\u002Fecho-phi-eight.vercel.app","Python",38,0,false,"main",true,[],"2026-06-12 02:04:06","# ECHO: Edge-Cloud Humanoid Orchestration for Language-to-Motion Control\n\n[![Project Page](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-blue)](https:\u002F\u002Fecho-phi-eight.vercel.app)\n[![ModelScope](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FModel-Weights-purple)](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FHzzzz001\u002FECHO\u002Fsummary)\n[![Paper](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2603.16188-b31b1b)](https:\u002F\u002Farxiv.org\u002Fpdf\u002F2603.16188)\n\n**ECHO** is an edge–cloud framework for language-driven whole-body control of humanoid robots. 
A cloud-hosted diffusion-based text-to-motion generator synthesizes motion references from natural language, while an edge-deployed RL tracker executes them in closed loop on the **Unitree G1** humanoid.\n\n## Overview\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Fecho-phi-eight.vercel.app\u002Fstatic\u002Fimages\u002Fcarousel1.jpg\" alt=\"ECHO Overview\" width=\"100%\">\n\u003C\u002Fp>\n\n**ECHO** processes natural language instructions through a CLIP-conditioned diffusion model on a cloud GPU, producing 38D robot-native motion sequences in ~1 second. The motion is streamed via WebSocket to an edge-deployed ONNX tracking policy that runs at 50 Hz on the G1 with PD control and autonomous fall recovery.\n\n---\n\n## Demo Videos\n\n### Real Robot (Unitree G1)\n\n\u003Cp align=\"center\">\n  \u003Cvideo src=\"https:\u002F\u002Fecho-phi-eight.vercel.app\u002Freal-preview\u002Fwalk%205%20step.mp4\" muted autoplay loop playsinline width=\"22%\">\u003C\u002Fvideo>\n  \u003Cvideo src=\"https:\u002F\u002Fecho-phi-eight.vercel.app\u002Freal-preview\u002Fdo%20jumping%20jacks.mp4\" muted autoplay loop playsinline width=\"22%\">\u003C\u002Fvideo>\n  \u003Cvideo src=\"https:\u002F\u002Fecho-phi-eight.vercel.app\u002Freal-preview\u002Fwave%20right%20hand.mp4\" muted autoplay loop playsinline width=\"22%\">\u003C\u002Fvideo>\n  \u003Cvideo src=\"https:\u002F\u002Fecho-phi-eight.vercel.app\u002Freal-preview\u002Fwalk%20in%20a%20circle.mp4\" muted autoplay loop playsinline width=\"22%\">\u003C\u002Fvideo>\n\u003C\u002Fp>\n\n### Simulation (MuJoCo)\n\n\u003Cp align=\"center\">\n  \u003Cvideo src=\"https:\u002F\u002Fecho-phi-eight.vercel.app\u002Fsim-preview\u002Fwalk%205%20step.mp4\" muted autoplay loop playsinline width=\"22%\">\u003C\u002Fvideo>\n  \u003Cvideo src=\"https:\u002F\u002Fecho-phi-eight.vercel.app\u002Fsim-preview\u002Ffly%20kick.mp4\" muted autoplay loop playsinline width=\"22%\">\u003C\u002Fvideo>\n  \u003Cvideo 
src=\"https:\u002F\u002Fecho-phi-eight.vercel.app\u002Fsim-preview\u002Fa%20person%20is%20drinking%20water.mp4\" muted autoplay loop playsinline width=\"22%\">\u003C\u002Fvideo>\n  \u003Cvideo src=\"https:\u002F\u002Fecho-phi-eight.vercel.app\u002Fsim-preview\u002Fhe%20is%20running%20straight%20and%20stopped.mp4\" muted autoplay loop playsinline width=\"22%\">\u003C\u002Fvideo>\n\u003C\u002Fp>\n\n> More videos on the [project page](https:\u002F\u002Fecho-phi-eight.vercel.app).\n\n---\n\n## Key Features\n\n- **Robot-native**: generates directly in G1 29-DOF joint space — no human body model, no retargeting\n- **38D velocity-based representation**: joint angles + root velocity + root height + continuous 6D rotation\n- **Classifier-free guidance**: DDIM sampling with 10 denoising steps produces motions in ~1 second on cloud GPU\n- **Edge deployment**: ONNX tracking policy runs on CPU at 50 Hz with PD control and autonomous fall recovery\n\n---\n\n## Installation\n\n```bash\nconda create -n echo python=3.10 -y\nconda activate echo\nconda install pytorch pytorch-cuda=12.8 -c pytorch -c nvidia -y\npip install -r generator\u002Frequirements.txt\n\n# For WebSocket server\npip install -r generator\u002Frequirements_server.txt\n```\n\n## Model Weights\n\nDownload from ModelScope ([Hzzzz001\u002FECHO](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FHzzzz001\u002FECHO\u002Fsummary)):\n\n```bash\ngit clone https:\u002F\u002Fwww.modelscope.cn\u002FHzzzz001\u002FECHO.git checkpoints\u002F\n```\n\n| Checkpoint | Backbone | Dim | Inference |\n|-----------|----------|-----|-----------|\n| `checkpoints\u002Frobotv2\u002Frobotv2_38d_lite` | UNet (small) | 128 | ~1.0s |\n| `checkpoints\u002Frobotv2\u002Frobotv2_38d` | UNet (full) | 512 | ~1.5s |\n| `checkpoints\u002Frobotv2\u002Frobotv2_38d_transformer` | Transformer | 768 | ~3.0s |\n\nEach checkpoint: `opt.txt` (config), `model\u002Flatest.tar` (weights), `meta\u002F{mean,std}.npy` (normalization).\n\nNormalization stats for the 
dataset: `generator\u002Fdata\u002FMean_38d.npy`, `generator\u002Fdata\u002FStd_38d.npy`.\n\n---\n\n## Usage\n\n### Generate motion from text\n\n```bash\ncd generator\npython scripts\u002Fgenerate_robot.py \\\n    --opt_path ..\u002Fcheckpoints\u002Fcheckpoints\u002Frobotv2\u002Frobotv2_38d_lite\u002Fopt.txt \\\n    --text_prompt \"a person walks forward\" \\\n    --motion_length 4.0 \\\n    --output_dir .\u002Foutput\n```\n\nOutput: `output\u002Fnpz\u002F000000.npz` with `joint_pos (T,29)`, `root_pos (T,3)`, `root_rot (T,4)`.\n\n### Start WebSocket server (cloud)\n\n```bash\ncd generator\npython scripts\u002Fserver_robot_ws.py \\\n    --opt_path ..\u002Fcheckpoints\u002Fcheckpoints\u002Frobotv2\u002Frobotv2_38d_lite\u002Fopt.txt \\\n    --port 8000 --host 127.0.0.1\n```\n\nHealth check: `curl http:\u002F\u002F127.0.0.1:8000\u002F` → `{\"status\":\"running\",\"service\":\"ECHO Motion Generation Server\"}`\n\nWebSocket API: connect to `ws:\u002F\u002F127.0.0.1:8000\u002Fws`, send JSON request, receive binary NPZ.\n\n```json\n{\"text\": \"walk forward slowly\", \"motion_length\": 4.0, \"num_inference_steps\": 10, \"seed\": 42}\n```\n\nRemote access via SSH tunnel: `ssh -L 8000:127.0.0.1:8000 user@cloud-server`\n\nSee [generator\u002Fdocs\u002FWEBSOCKET_QUICKSTART.md](generator\u002Fdocs\u002FWEBSOCKET_QUICKSTART.md) and [generator\u002Fdocs\u002FCLIENT_API.md](generator\u002Fdocs\u002FCLIENT_API.md) for details.\n\n### Evaluate model\n\n```bash\ncd generator\npython scripts\u002Fevaluation.py \\\n    --opt_path ..\u002Fcheckpoints\u002Fcheckpoints\u002Frobotv2\u002Frobotv2_38d\u002Fopt.txt \\\n    --evaluator_dir ..\u002Fcheckpoints\u002Fcheckpoints\u002Frobot_evaluator\n```\n\nMetrics: FID, R-Precision Top-1\u002F2\u002F3, Matching Score, Diversity, Multimodality, Motion Safety Score (MSS), Root Trajectory Consistency (RTC).\n\n### Train generator\n\nRequires preprocessed 38D robot motion data.\n\n```bash\ncd generator\naccelerate launch scripts\u002Ftrain.py \\\n   
 --dataset_name robotv2 \\\n    --name robotv2_experiment \\\n    --batch_size 64 \\\n    --num_train_steps 500000 \\\n    --model_ema \\\n    --model_type unet \\\n    --base_dim 512 \\\n    --lr 2e-4\n```\n\n### Deploy to G1 robot\n\nSee [deploy\u002FREADME.md](deploy\u002FREADME.md) — sim2sim test, real robot setup, text-to-motion client, and ONNX policy inference.\n\n---\n\n## 38D Motion Representation\n\n| Index | Field | Dims | Description |\n|-------|-------|------|-------------|\n| 0–28 | `joint_pos` | 29 | Joint angles (rad) in Isaac Gym order |\n| 29–30 | `root_vel_xy` | 2 | Root planar velocity in body frame |\n| 31 | `root_z` | 1 | Root height above ground (m) |\n| 32–37 | `root_rot_6d` | 6 | Continuous 6D root rotation |\n\n50 FPS, max 490 frames (~9.8s). Velocity-based root motion avoids global drift. 6D rotation prevents gimbal lock.\n\n## Project Structure\n\n```\nECHO_CODE\u002F\n├── generator\u002F                  # Cloud diffusion generator\n│   ├── models\u002F                 # EchoUnet (1D Conv), Transformer\n│   │   ├── unet.py             # CondUNet1D + AdaGN + cross-attention\n│   │   ├── transformer.py      # Decoder-only diffusion transformer\n│   │   └── gaussian_diffusion.py  # DDIM\u002FDPMSolver inference pipeline\n│   ├── datasets\u002F               # 38D robot motion dataset loader\n│   ├── trainers\u002F               # DDPM training loop + EMA\n│   ├── eval\u002F                   # MoCLIP, MSS, RTC evaluation\n│   ├── utils\u002F                  # Motion processing, rotation, quaternion\n│   ├── options\u002F                # CLI argument parsers\n│   ├── scripts\u002F\n│   │   ├── train.py            # Training entry point\n│   │   ├── generate_robot.py   # Text-to-motion generation\n│   │   ├── server_robot_ws.py  # WebSocket inference server\n│   │   ├── evaluation.py       # Evaluation pipeline\n│   │   └── compute_38d_stats.py\n│   ├── tools\u002FMoCLIP\u002F           # MoCLIP evaluator training\n│   ├── config\u002F     
            # Diffusion scheduler & evaluator YAML\n│   ├── docs\u002F                   # WebSocket API docs\n│   ├── data\u002F                   # Mean_38d.npy, Std_38d.npy\n│   └── checkpoints\u002F            # Pretrained weights (downloaded)\n├── deploy\u002F                     # Edge deployment (Sim2Real)\n│   ├── src\u002F\n│   │   ├── deploy.py           # Main controller (real + sim)\n│   │   ├── sim2sim.py          # MuJoCo simulator bridge\n│   │   ├── text_to_motion.py   # Cloud generator WS client\n│   │   ├── policy.py           # ONNX runtime inference\n│   │   ├── observation.py      # Observation construction\n│   │   └── common\u002F             # Joint mapper, PD helpers, math\n│   ├── config\u002F                 # tracking.yaml, controller.yaml\n│   └── assets\u002Fckpts\u002F           # ONNX policy checkpoint\n└── scripts\u002F                    # download_weights.sh, serve.sh\n```\n\n## Citation\n\n```bibtex\n@misc{jia2026echoedgecloudhumanoidorchestration,\n      title={ECHO: Edge-Cloud Humanoid Orchestration for Language-to-Motion Control},\n      author={Haozhe Jia and Jianfei Song and Yuan Zhang and Honglei Jin and Youcheng Fan and Wenshuo Chen and Wei Zhang and Yutao Yue},\n      year={2026},\n      eprint={2603.16188},\n      archivePrefix={arXiv},\n      primaryClass={cs.CV},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.16188},\n}\n```\n\n## License\n\nMIT — see [LICENSE](generator\u002FLICENSE).\n","ECHO is an edge-cloud framework for language-driven whole-body control of humanoid robots. A diffusion-based text-to-motion generator on a cloud GPU converts natural language instructions into motion sequences and streams them over WebSocket to an ONNX tracking policy deployed on the edge device, which runs at 50 Hz on the Unitree G1 humanoid with PD control and autonomous fall recovery. Its core features include generating motion directly in the G1's 29-DOF joint space, a 38D velocity-based representation, and no need for a human body model or retargeting. ECHO is particularly suited to humanoid robot applications that need to generate and execute complex motions from natural language commands efficiently and accurately, such as service robots and entertainment performances.",2,"2026-06-11 04:02:20","CREATED_QUERY"]