[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-1728":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":32,"readmeContent":33,"aiSummary":34,"trendingCount":15,"starSnapshotCount":15,"syncStatus":35,"lastSyncTime":36,"discoverSource":37},1728,"gameworld","gameworld-project\u002Fgameworld","gameworld-project","GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents","https:\u002F\u002Fgameworld-project.github.io\u002F",null,"Python",191,7,8,0,1,5,18,3,46.01,false,"main",true,[25,26,27,28,29,30,31],"agent","benchmark","computer-use","game","gui","llm","vlm","2026-06-12 04:00:11","\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fgameworld-banner.jpeg\" alt=\"GameWorld Banner\">\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.07429\">[Technical Report]\u003C\u002Fa> •\n  \u003Ca href=\"https:\u002F\u002Fgameworld-project.github.io\u002F\">[Project Page]\u003C\u002Fa> •\n  \u003Ca href=\"docs\u002Finstall\u002FQUICK_START.md\">[Quick Start]\u003C\u002Fa> •\n  \u003Ca href=\"https:\u002F\u002Fdiscord.com\u002Finvite\u002FQp8X6kVZSn\">[Discord]\u003C\u002Fa>\n\u003C\u002Fp>\n\n**Game Agent Benchmark: Can Multimodal Agents Play Computer Games as Humans Do?**\n\nGameWorld benchmarks multimodal game agents across 34 browser games and 170 tasks, evaluating game agents with computer-use control and semantic control in a browser-based environment with outcome-based, state-verifiable 
evaluation.\n\n\n\u003Ctable>\n\u003Ctr>\n  \u003Cth>\u003Cstrong>Puzzle\u003C\u002Fstrong>\u003C\u002Fth>\n  \u003Cth>\u003Cstrong>Platformer\u003C\u002Fstrong>\u003C\u002Fth>\n  \u003Cth>\u003Cstrong>Simulation\u003C\u002Fstrong>\u003C\u002Fth>\n  \u003Cth>\u003Cstrong>Arcade\u003C\u002Fstrong>\u003C\u002Fth>\n  \u003Cth>\u003Cstrong>Runner\u003C\u002Fstrong>\u003C\u002Fth>\n\u003C\u002Ftr>\n\u003Ctr>\n  \u003Ctd>\u003Cimg src=\"docs\u002Fassets\u002Fgifs\u002Fastray.gif\" alt=\"Astray preview\" width=\"140\"\u002F>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=\"docs\u002Fassets\u002Fgifs\u002Fcaptain-callisto.gif\" alt=\"Captain Callisto preview\" width=\"140\"\u002F>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=\"docs\u002Fassets\u002Fgifs\u002Fmonkey-mart.gif\" alt=\"Monkey Mart preview\" width=\"140\"\u002F>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=\"docs\u002Fassets\u002Fgifs\u002Fpacman.gif\" alt=\"Pac-Man preview\" width=\"140\"\u002F>\u003C\u002Ftd>\n  \u003Ctd>\u003Cimg src=\"docs\u002Fassets\u002Fgifs\u002Ftemple-run-2.gif\" alt=\"Temple Run 2 preview\" width=\"140\"\u002F>\u003C\u002Ftd>\n\u003C\u002Ftr>\n\u003C\u002Ftable>\n\n## 📢 Updates\n- 2026.04.19: The full game library for the benchmark evaluation is available at [gameworld-dev\u002Fgameworld-games](https:\u002F\u002Fgithub.com\u002Fgameworld-dev\u002Fgameworld-games).\n- 2026.04.15: GameWorld launched with its [Technical Report](https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.07429) and [Project Page](https:\u002F\u002Fgameworld-project.github.io\u002F).\n\n## 📦 Installation\n\nPython and browser environment:\n```bash\nconda create -n gameworld python=3.12\nconda activate gameworld\npip install -r requirements.txt\nplaywright install chromium\n```\n\nSet the provider keys you need for the models:\n```bash\nexport GOOGLE_API_KEY=...\nexport OPENAI_API_KEY=...\nexport ANTHROPIC_API_KEY=...\n```\n\nOr host your own models locally with `vLLM`.\n```bash\nvllm serve Qwen\u002FQwen3.5-122B-A10B --port 
8088\n```\n\nGet the full game library under `games\u002Fbenchmark`:\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fgameworld-dev\u002Fgameworld-games.git games\u002Fbenchmark\n```\n\nMore setup notes: [docs\u002Finstall\u002FINSTALLATION.md](docs\u002Finstall\u002FINSTALLATION.md).\n\n## 🚀 Quick Start\n\nValidate the browser\u002Fruntime first:\n\n```bash\npython play.py --game 10_doodle-jump\n```\n\nRun a single preset:\n\n```bash\npython main.py --config 10_doodle-jump+10_01+gpt-5.2 --headed\n```\n\nRun a suite:\n\n```bash\npython run_suite.py --suite benchmark\u002Fsuites\u002Fquick_start_test.yaml --max-parallel 5\n```\n\n## 🖥️ Results and Monitoring\n\nStandalone runs write to: `results\u002Frun_\u003Csession>_\u003Cgame>_\u003Ctask>_\u003Cmodel>\u002F`. Each run can include:\n\n- `replay.html` for static HTML replay\n- `replay.mp4` for video replay\n\nWe recommend using the dashboard to monitor the parallel runs. To launch the dashboard, run:\n\n```bash\npython -m tools.monitor.server --results-dir results --host 127.0.0.1 --port 8787 --open-browser\n```\n\n## 📚 Documentation\n\nSee [docs\u002F](docs) for full documentation.\n\n## 💬 Game Agent Community\n\n🎙️ Join our [Discord](https:\u002F\u002Fdiscord.com\u002Finvite\u002FQp8X6kVZSn) to discuss GameWorld, ask questions, and share your thoughts on multimodal game agents. 
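GLHF!\n\n## 📆 TODO\n\n- [ ] Release GameWorld leaderboard.\n\n## 📖 BibTeX\nIf you find GameWorld useful for your research, please kindly cite:\n```bibtex\n@article{ouyang2026gameworld,\n  title={GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents},\n  author={Ouyang, Mingyu and Hu, Siyuan and Lin, Kevin Qinghong and Ng, Hwee Tou and Shou, Mike Zheng},\n  journal={arXiv preprint arXiv:2604.07429},\n  year={2026},\n}\n```\n","GameWorld is a standardized, verifiable platform for evaluating how multimodal game agents perform in browser games. It covers 170 tasks across 34 browser games and evaluates game agents under computer-use control and semantic control using outcome-based, state-verifiable evaluation. The project is written in Python and spans game genres from puzzle and platformer to simulation. It is especially suited to researchers and developers who want to test and improve how their multimodal AI models carry out complex tasks in games. GameWorld also provides a detailed installation guide and quick-start examples so users can get started and begin experimenting quickly.",2,"2026-06-11 02:45:41","CREATED_QUERY"]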