[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72197":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":18,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":22,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":25,"readmeContent":26,"aiSummary":27,"trendingCount":16,"starSnapshotCount":16,"syncStatus":28,"lastSyncTime":29,"discoverSource":30},72197,"verifiers","PrimeIntellect-ai\u002Fverifiers","PrimeIntellect-ai","Our library for RL environments + evals","",null,"Python",4178,556,28,38,0,10,30,78,103.04,"MIT License",false,"main",[],"2026-06-12 04:01:04","\u003Cp align=\"center\">\n  \u003Cpicture>\n    \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F40c36e38-c5bd-4c5a-9cb3-f7b902cd155d\">\n    \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F6414bc9b-126b-41ca-9307-9e982430cde8\">\n    \u003Cimg alt=\"Prime Intellect\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F6414bc9b-126b-41ca-9307-9e982430cde8\" width=\"312\" style=\"max-width: 100%;\">\n  \u003C\u002Fpicture>\n\u003C\u002Fp>\n\n---\n\n\u003Ch3 align=\"center\">\nVerifiers: Environments for LLM Reinforcement Learning\n\u003C\u002Fh3>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fdocs.primeintellect.ai\u002Fverifiers\">Documentation\u003C\u002Fa> •\n  \u003Ca href=\"https:\u002F\u002Fapp.primeintellect.ai\u002Fdashboard\u002Fenvironments?ex_sort=most_stars\">Environments Hub\u003C\u002Fa> •\n  \u003Ca 
href=\"https:\u002F\u002Fgithub.com\u002FPrimeIntellect-ai\u002Fprime-rl\">PRIME-RL\u003C\u002Fa>\n\u003C\u002Fp>\n\n---\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FPrimeIntellect-ai\u002Fverifiers\u002Factions\u002Fworkflows\u002Fstyle.yml\">\n    \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FPrimeIntellect-ai\u002Fverifiers\u002Factions\u002Fworkflows\u002Fstyle.yml\u002Fbadge.svg\" alt=\"Style\" \u002F>\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FPrimeIntellect-ai\u002Fverifiers\u002Factions\u002Fworkflows\u002Ftest.yml\">\n    \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FPrimeIntellect-ai\u002Fverifiers\u002Factions\u002Fworkflows\u002Ftest.yml\u002Fbadge.svg\" alt=\"Test\" \u002F>\n  \u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FPrimeIntellect-ai\u002Fverifiers\u002Factions\u002Fworkflows\u002Fpublish-envs.yml\">\n    \u003Cimg src=\"https:\u002F\u002Fgithub.com\u002FPrimeIntellect-ai\u002Fverifiers\u002Factions\u002Fworkflows\u002Fpublish-envs.yml\u002Fbadge.svg\" alt=\"Envs\" \u002F>\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n## News & Updates\n\n- [05\u002F07\u002F26] v0.1.14 is released, featuring the v1 Taskset\u002FHarness API, shared eval and training config shape, model-family starter configs, OpenAI Responses and renderer-backed clients, per-turn timing, GEPA prompt artifacts, Lean guard markers, and release\u002Finfrastructure hardening.\n- [04\u002F28\u002F26] v0.1.13.dev8 is released, featuring per-rollout wall-clock timeouts for `MultiTurnEnv`, CLI timeout config, sandbox timeout propagation, and smaller `CliAgentEnv` and RLM fixes.\n- [04\u002F17\u002F26] v0.1.12 is released, featuring upstreamed opencode and RLM harnesses\u002Ftasksets, major `RLMEnv` improvements (context dropping, prompt builder, hardened transport), multi-worker env server support, expanded `vf-tui` capabilities, and richer eval configuration.\n- [03\u002F12\u002F26] v0.1.11 is released, 
featuring a unified client stack, major `RLMEnv` and env server reliability improvements, a substantially refined eval TUI, new pass@k and ablation sweep support, and bundled opencode environments.\n- [02\u002F10\u002F26] v0.1.10 is released, featuring OpenEnv and BrowserEnv integrations, resumed evals, improved rollout and token tracking, safer sandbox lifecycle behavior, refreshed workspace setup, and opencode harbor improvements.\n- [01\u002F08\u002F26] v0.1.9 is released, featuring a number of new experimental environment class types, monitor rubrics for automatic metric collection, improved workspace setup flow, improved error handling, bug fixes, and a documentation overhaul.\n- [11\u002F19\u002F25] v0.1.8 is released, featuring a major refactor of the rollout system to use trajectory-based tracking for token-in token-out training across turns, as well as support for truncated or branching rollouts.\n- [11\u002F07\u002F25] Verifiers v0.1.7 is released! This includes an improved quickstart configuration for training with [prime-rl](https:\u002F\u002Fgithub.com\u002FPrimeIntellect-ai\u002Fprime-rl), a new included \"nano\" trainer (`vf.RLTrainer`, replacing `vf.GRPOTrainer`), and a number of bug fixes and improvements to the documentation.\n- [10\u002F27\u002F25] A new iteration of the Prime Intellect [Environments Program](https:\u002F\u002Fdocs.google.com\u002Fspreadsheets\u002Fd\u002F13UDfRDjgIZXsMI2s9-Lmn8KSMMsgk2_zsfju6cx_pNU\u002Fedit?gid=0#gid=0) is live!  
\n\n\n# Overview\n\nVerifiers is our library for creating environments to train and evaluate LLMs.\n\nEnvironments contain everything required to run and evaluate a model on a particular task:\n- A *dataset* of task inputs\n- A *harness* for the model (tools, sandboxes, context management, etc.)\n- A reward function or *rubric* to score the model's performance\n\nEnvironments can be used for training models with reinforcement learning (RL), evaluating capabilities, generating synthetic data, experimenting with agent harnesses, and more. \n\nVerifiers is tightly integrated with the [Environments Hub](https:\u002F\u002Fapp.primeintellect.ai\u002Fdashboard\u002Fenvironments?ex_sort=most_stars), as well as our training framework [prime-rl](https:\u002F\u002Fgithub.com\u002FPrimeIntellect-ai\u002Fprime-rl) and our [Hosted Training](https:\u002F\u002Fapp.primeintellect.ai\u002Fdashboard\u002Ftraining) platform.\n\n## Getting Started\n\nEnsure you have `uv` installed, as well as the `prime` [CLI](https:\u002F\u002Fdocs.primeintellect.ai\u002Fcli-reference\u002Fintroduction) tool:\n```bash\n# install uv\ncurl -LsSf https:\u002F\u002Fastral.sh\u002Fuv\u002Finstall.sh | sh\n# install the prime CLI\nuv tool install prime\n# log in to the Prime Intellect platform\nprime login\n```\nTo set up a new workspace for developing environments, do:\n```bash\n# ~\u002Fdev\u002Fmy-lab\nprime lab setup \n```\n\nThis sets up a Python project if needed (with `uv init`), installs `verifiers` (with `uv add verifiers`), creates the recommended workspace structure, and downloads useful starter files:\n```\nconfigs\u002F\n├── endpoints.toml      # OpenAI-compatible API endpoint configuration\n├── rl\u002F                 # Example configs for Hosted Training\n├── eval\u002F               # Example multi-environment eval configs\n└── gepa\u002F               # Example configs for prompt optimization\n.prime\u002F\n└── skills\u002F             # Bundled workflow skills for 
create\u002Fbrowse\u002Freview\u002Feval\u002FGEPA\u002Ftrain\u002Fbrainstorm\nenvironments\u002F\n└── AGENTS.md           # Documentation for AI coding agents\nAGENTS.md               # Top-level documentation for AI coding agents\nCLAUDE.md               # Claude-specific pointer to AGENTS.md\n```\n\nAlternatively, add `verifiers` to an existing project:\n```bash\nuv add verifiers && prime lab setup --skip-install\n```\n\nEnvironments built with Verifiers are self-contained Python modules. To initialize a fresh environment template, do:\n```bash\nprime env init my-env # creates a new template in .\u002Fenvironments\u002Fmy_env\n```\nAdd an explicit harness loader when the environment owns harness behavior:\n```bash\nprime env init my-env --with-harness\n```\nFor OpenEnv integration, use:\n```bash\nprime env init my-openenv --openenv\n```\nThen copy your OpenEnv project into `environments\u002Fmy_openenv\u002Fproj\u002F` and build the image with:\n```bash\nuv run vf-build my-openenv\n```\n\nThis will create a new module called `my_env` with a basic environment template.\n```\nenvironments\u002Fmy_env\u002F\n├── my_env.py           # Main implementation\n├── pyproject.toml      # Dependencies and metadata\n└── README.md           # Documentation\n```\n\nEnvironment modules should expose a `load_environment` function which returns an\nenvironment object. 
For simple legacy environments, this can still be a direct\nconstructor:\n```python\n# my_env.py\nimport verifiers as vf\n\ndef load_environment(dataset_name: str = 'gsm8k') -> vf.Environment:\n    dataset = vf.load_example_dataset(dataset_name) # 'question'\n    async def correct_answer(completion, answer) -> float:\n        completion_ans = completion[-1]['content']\n        return 1.0 if completion_ans == answer else 0.0\n    rubric = vf.Rubric(funcs=[correct_answer])\n    env = vf.SingleTurnEnv(dataset=dataset, rubric=rubric)\n    return env\n```\n\nFor new environments with reusable tasksets, toolsets, custom programs, or\ncustom harnesses, use the v1 Taskset\u002FHarness path:\n```python\n# my_env.py\nimport verifiers as vf\n\ndef source():\n    yield {\n        \"prompt\": [{\"role\": \"user\", \"content\": \"Reverse abc.\"}],\n        \"answer\": \"cba\",\n        \"max_turns\": 1,\n    }\n\n@vf.reward(weight=1.0)\nasync def contains_answer(task, state) -> float:\n    return float(task[\"answer\"] in str(state.get(\"completion\") or \"\"))\n\ndef load_taskset(config: vf.TasksetConfig):\n    return vf.Taskset(source=source, rewards=[contains_answer], config=config)\n\ndef load_environment(config: vf.EnvConfig) -> vf.Env:\n    return vf.Env(taskset=load_taskset(config=config.taskset))\n```\nIf no harness is passed, `vf.Env` uses the base endpoint-backed harness. See\n**[BYO Harness](docs\u002Fbyo-harness.md)** for the advanced v1 taskset\u002Fharness API.\nReusable taskset and harness packages live under `verifiers.v1.packages` while\nthe v1 API stabilizes, and are re-exported from `verifiers.v1` for normal use.\nFor example, Harbor task directories can run through the bundled OpenCode CLI\nharness with:\n\n```python\nenv = vf.Env(\n    taskset=vf.HarborTaskset(),\n    harness=vf.OpenCode(),\n)\n```\n\nThe same environment package is the unit used by evals and `prime-rl`. 
The\ntrainer owns model, endpoint, sampling, and rollout count; v1-specific options\nstay on the taskset or harness config that owns them:\n\n```toml\n# configs\u002Frl\u002Fmy-v1-env.toml\nmodel = \"Qwen\u002FQwen3-30B-A3B-Instruct-2507\"\nmax_steps = 100\nbatch_size = 256\nrollouts_per_example = 8\n\n[sampling]\nmax_tokens = 4096\n\n[[env]]\nid = \"my-env\"\n\n[env.harness]\nmax_turns = 1\n\n[env.taskset]\nsplit = \"train\"\n\n[env.taskset.scoring.contains_answer]\nweight = 1.0\n```\n\n```bash\nprime env install my-env\n```\n\nFor self-managed training launch commands, use the `prime-rl` documentation.\n\nTo install the environment module into your project, do:\n```bash\nprime env install my-env # installs from .\u002Fenvironments\u002Fmy_env\n```\n\nTo install an environment from the Environments Hub into your project, do:\n```bash\nprime env install primeintellect\u002Fmath-python\n```\n\nTo run a local evaluation with any OpenAI-compatible model, do:\n```bash\nprime eval run my-env -m openai\u002Fgpt-5-nano # run and save eval results locally\n```\nEvaluations use [Prime Inference](https:\u002F\u002Fdocs.primeintellect.ai\u002Finference\u002Foverview) by default; configure your own API endpoints in `.\u002Fconfigs\u002Fendpoints.toml`.\n\nView local evaluation results in the terminal UI:\n```bash\nprime eval view\n```\n\nTo publish the environment to the [Environments Hub](https:\u002F\u002Fapp.primeintellect.ai\u002Fdashboard\u002Fenvironments?ex_sort=most_stars), do:\n```bash\nprime env push --path .\u002Fenvironments\u002Fmy_env\n```\n\nTo run an evaluation directly from the Environments Hub, do:\n```bash\nprime eval run primeintellect\u002Fmath-python\n```\n\n## Documentation\n\n**[Environments](docs\u002Fenvironments.md)** — Create datasets, rubrics, and custom multi-turn interaction protocols.\n\n**[BYO Harness](docs\u002Fbyo-harness.md)** — Build v1 Taskset\u002FHarness environments with custom tools, sandboxes, users, and custom 
programs.\n\n**[Evaluation](docs\u002Fevaluation.md)** - Evaluate models using your environments.\n\n**[Training](docs\u002Ftraining.md)** — Train models in your environments with reinforcement learning.\n\n**[Development](docs\u002Fdevelopment.md)** — Contributing to verifiers\n\n**[API Reference](docs\u002Freference.md)** — Understanding the API and data structures\n\n**[FAQs](docs\u002Ffaqs.md)** - Other frequently asked questions.\n\n\n## Citation\n\nOriginally created by Will Brown ([@willccbb](https:\u002F\u002Fgithub.com\u002Fwillccbb)).\n\nIf you use this code in your research, please cite:\n\n```bibtex\n@misc{brown_verifiers_2025,\n  author       = {William Brown},\n  title        = {{Verifiers}: Environments for LLM Reinforcement Learning},\n  howpublished = {\\url{https:\u002F\u002Fgithub.com\u002FPrimeIntellect-ai\u002Fverifiers}},\n  note         = {Commit abcdefg • accessed DD Mon YYYY},\n  year         = {2025}\n}\n```\n","Verifiers is an environment library for reinforcement learning with large language models (LLMs). It provides a rich set of evaluation tools and environments, supports multiple tasksets and harnesses, and improves env server reliability through a unified client stack. The project is written in Python and features a versioned Taskset\u002FHarness API, a shared eval and training config shape, and model-family starter configs, and it also supports OpenAI Responses and renderer-backed clients. It is suited to researchers and developers who need to benchmark or tune LLMs or build new applications on them.",2,"2026-06-11 03:40:49","high_star"]