[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82094":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":12,"stars30d":16,"stars90d":14,"forks30d":14,"starsTrendScore":17,"compositeScore":18,"rankGlobal":9,"rankLanguage":9,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":20,"hasPages":20,"topics":22,"createdAt":9,"pushedAt":9,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":14,"starSnapshotCount":14,"syncStatus":26,"lastSyncTime":27,"discoverSource":28},82094,"BES","Embodied-Minds-Lab\u002FBES","Embodied-Minds-Lab","We propose Bidirectional Evolutionary Search (BES), a search framework that couples forward candidate evolution with backward goal decomposition. ",null,"Python",153,15,33,0,1,101,7,63.11,"MIT License",false,"main",[],"2026-06-12 04:01:37","# BES: Self-Improving Language Models with Bidirectional Evolutionary Search\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.28814\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpaper-arxiv.2605.28814-B31B1B.svg\" \u002F>\u003C\u002Fa>\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-%3E%3D3.10-blue\" \u002F>\n  \u003Ca href='https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FXkev\u002Fbes'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20Hugging%20Face-Page-blue'>\u003C\u002Fa>\n  \u003Ca href='https:\u002F\u002Fguoweixu.com\u002Fbes\u002F'>\u003Cimg src='https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Page-Green'>\u003C\u002Fa>\n  \u003Ca href=\"LICENSE\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-blue.svg\" \u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n## Overview\n\nSearch has been 
proposed as an effective method for self-improving language models and agentic systems, both for post-training sample generation and for inference. However, widely used methods such as best-of-N sampling and tree search face two fundamental limitations: they are guided by **sparse verification signals**, and they construct candidates primarily through **autoregressive expansion**, restricting exploration to regions with substantial model probability mass.\n\nWe propose **Bidirectional Evolutionary Search (BES)**, a search framework that couples *forward candidate evolution* with *backward goal decomposition*. The forward search augments standard expansion with evolution operators (combination, translocation, deletion, crossover) that recombine parts of existing trajectories into candidates that are difficult to reach from a single rollout. The backward search recursively decomposes the task objective into a tree of checkable sub-goals, producing dense intermediate feedback that prioritizes which forward candidates to grow.\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fe2a9ac6d-a12d-4d77-adbc-98b1f84fb954\n\n## Experiments\n\nWe evaluate BES on both post-training and inference across LLM and agent settings. For post-training, we consider Logical Reasoning (LLM) and Multi-Hop Reasoning (Agent). 
For inference, we consider three representative open-problem solving benchmarks: Circle Packing (Square), Circle Packing (Rectangle), and the Heilbronn Convex problem.\n\nEach setting is self-contained under its own directory, with its own README, data, and launchers:\n\n| Directory | Setting |\n|---|---|\n| [`logical\u002FREADME.md`](logical\u002FREADME.md) | RL post-training on Knights-and-Knaves with Gemma-3-1B-it (GRPO \u002F MaxRL \u002F BES) |\n| [`multihop\u002FREADME.md`](multihop\u002FREADME.md) | RL post-training on MuSiQue with Llama-3.2-3B \u002F Llama-3.1-8B (GRPO \u002F Tree-GRPO \u002F BES) |\n| [`inference\u002FREADME.md`](inference\u002FREADME.md) | Inference-time open-problem solving on Circle Packing (Square \u002F Rect) and Heilbronn (Convex), built on top of ShinkaEvolve |\n\n## Citation\n\nIf you find this work useful, please cite:\n\n```bibtex\n@misc{xu2026selfimprovinglanguagemodelsbidirectional,\n      title={Self-Improving Language Models with Bidirectional Evolutionary Search}, \n      author={Guowei Xu and Zhenting Qi and Huangyuan Su and Weirui Ye and Himabindu Lakkaraju and Sham M. Kakade and Yilun Du},\n      year={2026},\n      eprint={2605.28814},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.28814}, \n}\n```\n","The BES project proposes a bidirectional evolutionary search framework for improving the self-improvement capability of language models and agentic systems. Its core idea is to couple forward candidate evolution with backward goal decomposition: evolution operators (such as combination, translocation, deletion, and crossover) generate new candidates that are difficult to reach from a single path, and dense intermediate feedback guides the search. BES is implemented in Python and supports experimental validation across multiple settings, including logical reasoning, multi-hop reasoning, and open-problem solving. The project is suited to applications that need stronger language-model reasoning and the ability to solve complex tasks.",2,"2026-06-11 04:07:44","CREATED_QUERY"]