[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-1004":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":35,"readmeContent":36,"aiSummary":37,"trendingCount":16,"starSnapshotCount":16,"syncStatus":38,"lastSyncTime":39,"discoverSource":40},1004,"CyberVerse","dsd2077\u002FCyberVerse","dsd2077","Self hosted, real-time digital human agent platform. Build voice-first AI agents with WebRTC, persona memory, tools, RAG, and optional digital-human video.","",null,"Python",1115,153,7,3,0,79,622,84.56,"GNU General Public License v3.0",false,"main",true,[25,26,27,28,29,30,31,32,33,34],"ai-agents","ai-companion","digital-human","digital-life","grok-companion","jarvis-assistant","sillytavern","streaming","vtuber","webrtc","2026-06-12 04:00:07","\u003Ch1 align=\"center\">CyberVerse\u003C\u002Fh1>\n\u003Cp align=\"center\">\u003Cem>CyberVerse is an open-source \u003Cstrong>digital human agent platform\u003C\u002Fstrong> with real-time video calling. 
Create an AI agent you can see and talk to, face to face, just like a video call.\u003C\u002Fem>\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"README.md\">\u003Cstrong>English\u003C\u002Fstrong>\u003C\u002Fa> · \u003Ca href=\"README.zh-CN.md\">简体中文\u003C\u002Fa> · \u003Ca href=\"README.ja.md\">日本語\u003C\u002Fa> · \u003Ca href=\"README.ko.md\">한국어\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"LICENSE\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-GPL%20v3-blue.svg\" alt=\"License: GPL v3\"\u002F>\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fdsd2077\u002FCyberVerse\u002Fpulls\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPRs-welcome-brightgreen.svg\" alt=\"PRs Welcome\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"docs\u002Fassets\u002Flogo.png\">\u003Cimg src=\"docs\u002Fassets\u002Flogo.png\" alt=\"CyberVerse logo\" width=\"100%\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n---\n\n### One Photo. A Living Digital Human.\n\n> Ever dreamed of having your own J.A.R.V.I.S. — an AI that truly sees you, hears you, and talks back in real time?\n>\n> Want to see someone you've lost again, hear their voice, watch them smile at you?\n>\n> Or maybe there's a character you've always wished you could bring to life?\n>\n> **Just one photo. CyberVerse makes them alive.**\n\n## Features\n\n### Real-Time Video Call\n\nNot pre-recorded. Not turn-based. **Unlimited-duration**, live, low-latency video calls with a digital human — first frame in **~1.5s**. Built on WebRTC with P2P streaming and embedded TURN\u002FNAT traversal.\n\n### Agent, Not Just an Avatar\n\nEvery digital human is more than an avatar you can talk to. It is the AI that actually does things.\n\n### One Photo to Life\n\nUpload a single photo to create your digital human. 
State-of-the-art avatar models deliver real-time facial animation, natural lip-sync, and subtle idle breathing — no 3D modeling or motion capture.\n\n### Assemble Your Agent\n\nBrain, face, voice, ears — every component is a swappable plugin. Mix and match LLMs, TTS engines, ASR models, and avatar backends via YAML config.\n\n## Demo\n\u003Cp align=\"center\">\u003Cem>Characters shown here are demo examples only. They are not bundled with CyberVerse and are not provided for commercial use.\u003C\u002Fem>\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"docs\u002Fassets\u002Fcharacter1.png\">\u003Cimg src=\"docs\u002Fassets\u002Fcharacter1.png\" alt=\"CyberVerse character selection gallery\" width=\"100%\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"docs\u002Fassets\u002Fcharacter2.png\">\u003Cimg src=\"docs\u002Fassets\u002Fcharacter2.png\" alt=\"CyberVerse character gallery examples\" width=\"100%\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cdiv align=\"center\">\n\n| [![](docs\u002Fassets\u002F爱丽丝.mov.png)](https:\u002F\u002Fyoutu.be\u002FLk88sew2x4o) | [![](docs\u002Fassets\u002F丽娜.mov.png)](https:\u002F\u002Fyoutu.be\u002F8jdQ3ThcwgA) |\n|:---:|:---:|\n| [**Alice — watch on YouTube**](https:\u002F\u002Fyoutu.be\u002FLk88sew2x4o) | [**Lina — watch on YouTube**](https:\u002F\u002Fyoutu.be\u002F8jdQ3ThcwgA) |\n\n| [![](docs\u002Fassets\u002F小龙女.mov.png)](https:\u002F\u002Fyoutu.be\u002FWjEHUYZx5Gs) |\n|:---:|\n| [**Xiaolongnü — watch on YouTube**](https:\u002F\u002Fyoutu.be\u002FWjEHUYZx5Gs) |\n\n\u003C\u002Fdiv>\n\n## Hardware Requirements\n\nReal-time video conversation requires GPU acceleration. Below are benchmarks for FlashHead and LiveAct avatar models:\n\n| Model | Quality | GPU | Count | Resolution | FPS | Real-time? 
|\n|-------|---------|-----|-------|------------|-----|------------|\n| FlashHead 1.3B | Pro | RTX 5090 | 2 | 512×512 | 25+ | ✅ Yes |\n| FlashHead 1.3B | Pro | RTX 5090 | 1 | 464×464 | 20 | ✅ Yes |\n| FlashHead 1.3B | Pro | RTX PRO 6000 | 1 | 512×512 | 20 | ✅ Yes |\n| FlashHead 1.3B | Pro | RTX 4090 | 1 | 512×512 | ~10.8 | ❌ No |\n| FlashHead 1.3B | Lite | RTX 4090 | 1 | 512×512 | 25+ | ✅ Yes |\n| LiveAct 18B | — | RTX PRO 6000 | 2 | 320×480 | 20 | ✅ Yes |\n| LiveAct 18B | — | RTX PRO 6000 | 1 | 256×417 | 20 | ✅ Yes |\n\n> **Pro** favors visual quality; **Lite** favors speed. The table reflects typical **quality–compute** balances: more GPU headroom lets you push higher quality, while tighter hardware calls for lower settings (resolution, **Pro** vs **Lite**, etc.) to stay real-time.\n\n## Quick Start\n\n### Prerequisites\n\n- Node 18+\n- Go 1.25 (with the `protoc-gen-go` and `protoc-gen-go-grpc` plugins)\n- GPU with CUDA 12.8+\n- FFmpeg (must include `libvpx` for video encoding)\n- Conda\n- Python 3.10+\n- PyTorch 2.8 (CUDA 12.8)\n\nTo verify the installed tools, run:\n\n```bash\nnode --version\ngo version\nprotoc --version\nffmpeg -version | grep libvpx\nconda --version\n```\n\n### Step 1: Clone\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fdsd2077\u002FCyberVerse.git\ncd CyberVerse\n```\n\n### Step 2: Create Python environment\n\n```bash\nconda create -n cyberverse python=3.10\nconda activate cyberverse\n```\n\nInstall PyTorch (CUDA 12.8) in this environment:\n\n```bash\npip3 install torch==2.8.0 torchvision==0.23.0 --index-url https:\u002F\u002Fdownload.pytorch.org\u002Fwhl\u002Fcu128\n```\n\n### Step 3: Configure environment variables\n\n```bash\ncp infra\u002F.env.example .env\n```\n\nEdit `.env` and fill in your API keys:\n\n```\nDOUBAO_ACCESS_TOKEN=your_doubao_access_token   # ByteDance Doubao voice LLM\nDOUBAO_APP_ID=your_doubao_app_id\n```\n\nDoubao Voice: get **App ID** \u002F **API Key** per [Volcengine quick 
start](https:\u002F\u002Fwww.volcengine.com\u002Fdocs\u002F6561\u002F2119699?lang=zh) → `DOUBAO_APP_ID` \u002F `DOUBAO_ACCESS_TOKEN`.\n\nAfter the stack is running, you can change these values (and other API keys \u002F service endpoints) from the web UI at **`\u002Fsettings`** instead of editing `.env` only.\n\n### Step 4: Download model weights\n\nCyberVerse currently supports **FlashHead** and **LiveAct**; download only what you need. More backends are planned.\n\n```bash\npip install \"huggingface_hub[cli]\"\n```\n\n#### FlashHead (SoulX-FlashHead)\n\n| Model Component | Description | Link |\n| :--- | :--- | :--- |\n| `SoulX-FlashHead-1_3B` | 1.3B FlashHead weights | [Hugging Face](https:\u002F\u002Fhuggingface.co\u002FSoul-AILab\u002FSoulX-FlashHead-1_3B), [ModelScope](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FSoul-AILab\u002FSoulX-FlashHead-1_3B) |\n| `wav2vec2-base-960h` | Audio feature extractor | [Hugging Face](https:\u002F\u002Fhuggingface.co\u002Ffacebook\u002Fwav2vec2-base-960h), [ModelScope](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002Ffacebook\u002Fwav2vec2-base-960h) |\n\n```bash\n# If you are in mainland China, you can use a mirror first:\n# export HF_ENDPOINT=https:\u002F\u002Fhf-mirror.com\n\nhuggingface-cli download Soul-AILab\u002FSoulX-FlashHead-1_3B \\\n  --local-dir .\u002Fcheckpoints\u002FSoulX-FlashHead-1_3B\n\nhuggingface-cli download facebook\u002Fwav2vec2-base-960h \\\n  --local-dir .\u002Fcheckpoints\u002Fwav2vec2-base-960h\n```\n\n#### LiveAct (SoulX-LiveAct)\n\n| ModelName | Download |\n|-----------|----------|\n| SoulX-LiveAct | [Hugging Face](https:\u002F\u002Fhuggingface.co\u002FSoul-AILab\u002FLiveAct), [ModelScope](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FSoul-AILab\u002FLiveAct) |\n| chinese-wav2vec2-base | [Hugging Face](https:\u002F\u002Fhuggingface.co\u002FTencentGameMate\u002Fchinese-wav2vec2-base), [ModelScope](https:\u002F\u002Fmodelscope.cn\u002Fmodels\u002FTencentGameMate\u002Fchinese-wav2vec2-base) 
|\n\n```bash\nhuggingface-cli download Soul-AILab\u002FLiveAct \\\n  --local-dir .\u002Fcheckpoints\u002FLiveAct\n\nhuggingface-cli download TencentGameMate\u002Fchinese-wav2vec2-base \\\n  --local-dir .\u002Fcheckpoints\u002Fchinese-wav2vec2-base\n```\n\n\n### Step 5: Create and update local config\n\n```bash\ncp infra\u002Fcyberverse_config.example.yaml cyberverse_config.yaml\n```\n\nEdit the local `cyberverse_config.yaml`, update the model paths to match your local checkpoints. This file is ignored by git so local paths and deployment settings do not conflict with upstream changes.\n\n```yaml\ninference:\n  avatar:\n    default: \"flash_head\"               # selects which avatar model to start; if set to live_act, fill the live_act section below\n    runtime:\n      cuda_visible_devices: 0      # shared GPU ID(s), e.g. 0,1 for multi-GPU\n      world_size: 1                # shared GPU count, set to 2 for dual-GPU\n    flash_head:\n      checkpoint_dir: \".\u002Fcheckpoints\u002FSoulX-FlashHead-1_3B\"  # ← your path\n      wav2vec_dir: \".\u002Fcheckpoints\u002Fwav2vec2-base-960h\"        # ← your path\n      model_type: \"lite\"           # \"pro\" for higher quality (needs more GPU)\n      compile_model: true\n      compile_vae: true\n      dist_worker_main_thread: true\n      infer_params:\n        frame_num: 33\n        motion_frames_latent_num: 2\n        tgt_fps: 20\n        sample_rate: 16000\n        sample_shift: 5\n        color_correction_strength: 1.0\n        cached_audio_duration: 8\n        num_heads: 12\n        height: 512\n        width: 512\n    live_act:\n      ckpt_dir: \".\u002Fcheckpoints\u002FLiveAct\"                     # ← your path\n      wav2vec_dir: \".\u002Fcheckpoints\u002Fchinese-wav2vec2-base\"   # ← your path\n      seed: 42\n      compile_wan_model: false\n      compile_vae_decode: false\n      dist_worker_main_thread: true\n      default_prompt: \"一个人在说话\"\n      infer_params:\n        size: \"320*480\"\n        fps: 20\n     
   audio_cfg: 1.0\n```\n\nYou can skip editing paths here for now and adjust these options later in the web UI.\n\n### Step 6: Install SageAttention & FlashAttention (optional)\n```bash\n# SageAttention (source build)\ngit clone https:\u002F\u002Fgithub.com\u002Fthu-ml\u002FSageAttention.git\ncd SageAttention\nexport EXT_PARALLEL=4 NVCC_APPEND_FLAGS=\"--threads 8\" MAX_JOBS=32 # Optional\npython setup.py install\n```\n\n```bash\n# FlashAttention (optional)\npip install ninja\npip install flash_attn==2.8.0.post2 --no-build-isolation\n```\n\n> If compilation is slow, download a prebuilt wheel from [flash-attention releases](https:\u002F\u002Fgithub.com\u002FDao-AILab\u002Fflash-attention\u002Freleases\u002Ftag\u002Fv2.8.0.post2) and `pip install \u003Cwheel>.whl`.\n\n\n\n### Step 7: Install project dependencies\n\n```bash\nmake setup\n```\n\nThis installs the base editable package (`[dev,inference]`), generates gRPC stubs, and installs frontend dependencies. For extra Python packages, either install **everything** (large) or **cherry-pick** extras listed under `[project.optional-dependencies]` in [`pyproject.toml`](pyproject.toml):\n\n```bash\n# all optional groups at once\npip install -e \".[all]\"\n\n# or pick what you need, e.g.:\npip install -e \".[voice_llm,flash_head]\"\npip install -e \".[live_act]\"\n```\n\n### Step 8: Start services (3 terminals)\n\n**Terminal 1** — Python inference server:\n\n```bash\nconda activate cyberverse\nmake inference\n```\n\n`make inference` will read `inference.avatar.default` from `cyberverse_config.yaml`, then initialize exactly that one avatar model in the current inference process. 
Startup logs will print the active avatar model.\n\nWait until you see:\n\n- `Active avatar model initialized: \u003Cmodel_name>`\n- `CyberVerse Inference Server started on port 50051`\n\n**Terminal 2** — Go API server:\n\n```bash\nmake server\n```\n\n**Terminal 3** — Frontend:\n\n```bash\nmake frontend\n```\n\n### Step 9: Verify\n\n```bash\n# Check API health\ncurl -s http:\u002F\u002Flocalhost:8080\u002Fapi\u002Fv1\u002Fhealth\n```\n\n### Check 8443\u002FTCP Connectivity for Remote Access\n\nWhen `streaming_mode: direct` uses the embedded TURN server, the browser must be able to reach the server's `8443\u002FTCP`. If the page loads but audio\u002Fvideo never connects, or the server logs show `ICE connection state: failed` or `publish timeout waiting for connection`, first check whether your machine can reach port `8443` on the server:\n\n```bash\nnc -vz \u003Cserver-ip> 8443\n```\n\nIf `8443` is not reachable, the usual cause is a cloud security group, firewall, or NAT restriction. In that case, you can forward your local `8443` to the server through an SSH tunnel:\n\n```bash\nssh -L 8443:127.0.0.1:8443 user@host -p port\n```\n\nAfter the tunnel is established, the browser will access the remote TURN service through local `127.0.0.1:8443`.\n\nIf you want the browser to connect to the remote server directly instead of through an SSH tunnel, set `pipeline.ice_public_ip` in `cyberverse_config.yaml` to the server's public IP or domain. If you are using an SSH tunnel, you can keep the default value (`127.0.0.1`).\n\nOpen http:\u002F\u002Flocalhost:5173 in your browser — you're ready to go.\n\n## Roadmap\n\n### 1. 
**Digital Human Creation Platform**  \nConfigure characters, inference, and launch real-time digital-human sessions.\n\n- [x] Character CRUD with multiple reference images, active image, fixed\u002Frandom display mode, optional face crop, tags, voice fields, personality, welcome message, and system prompt\n- [x] Real-time avatar video driven from reference images via configurable avatar plugins (e.g. FlashHead, LiveAct)\n- [x] Real-time voice and video over WebRTC — direct P2P (embedded TURN) or LiveKit SFU\n- [x] Pluggable modules (avatar, voice LLM, LLM, TTS, ASR); configure different vendors’ API keys via YAML (a single Doubao Voice API key is enough to run today)\n- [x] Session management: per-character chat history persisted to disk and loaded when a conversation starts\n- [x] Voice cloning: supports Doubao voice cloning\n- [x] Hybrid input: supports both voice and text in the same conversation\n- [x] Voice interruption while the model is speaking, plus session pause and resume\n- [x] User camera input and screen-sharing visual frames in standard mode\n- [x] Face-to-face: user-side camera\u002Fscreen input\n- [ ] Import knowledge, documents, and biographical material for character-grounded RAG Q&A\n- [ ] Embeddable for developers (Web component or SDK) to integrate self-hosted instances into their own sites\n- [ ] Live streaming: audio\u002Fvideo output for broadcast-style use cases\n\n### 2. **Digital Humans as Agents**  \nTurn digital humans into agents with memory, tools, and task execution.\n\n- [ ] **Memory system**: long-term memory across sessions, integrated with character knowledge bases and RAG for richer backstory and dialogue continuity\n- [ ] Tool use and function calling\n- [ ] Workflow execution and task completion\n\n### 3. 
**Agent Network**  \nConnect multiple agents so they can communicate, collaborate, and form networks.\n\n- [ ] Enable agent-to-agent communication\n- [ ] Enable multi-agent collaboration and delegation\n- [ ] Enable shared memory and shared knowledge between agents\n- [ ] Build an open network of connected agents\n\n## License\n\nGNU General Public License v3.0. See [LICENSE](LICENSE).\n\n## Acknowledgements\n\n- [SoulX-FlashHead](https:\u002F\u002Fgithub.com\u002FSoul-AILab\u002FSoulX-FlashHead) - Avatar model by Soul AI Lab\n- [SoulX-LiveAct](https:\u002F\u002Fgithub.com\u002FSoul-AILab\u002FSoulX-LiveAct) - Avatar model by Soul AI Lab\n- [Pion](https:\u002F\u002Fgithub.com\u002Fpion\u002Fwebrtc) - Go WebRTC implementation\n","CyberVerse is an open-source digital human agent platform that supports real-time video calls. Users can create an AI agent they can talk to face to face, just like a video call. Its core features include low-latency, unlimited-duration real-time video calls built on WebRTC; generating a digital human with facial animation, natural lip-sync, and subtle breathing motion from a single uploaded photo; and freely combining components such as brain, face, and voice through YAML configuration files. The platform suits scenarios that need a virtual assistant or want to digitize a specific character, such as personal assistants and virtual streamers.",2,"2026-06-06 02:41:41","CREATED_QUERY"]