[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80962":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":13,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":21,"hasPages":21,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":24,"readmeContent":25,"aiSummary":26,"trendingCount":16,"starSnapshotCount":16,"syncStatus":27,"lastSyncTime":28,"discoverSource":29},80962,"FrontierSmith","FrontierCS\u002FFrontierSmith","FrontierCS","FrontierSmith, a new system that uses AI to synthesize open-ended coding problems at scale","https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.14445",null,"Python",36,4,31,1,0,3,5,9,2.1,false,"main",[],"2026-06-12 02:04:09","\u003Cp align=\"center\">\n  \u003Cimg src=\"assets\u002Flogo.png\" alt=\"FrontierSmith Logo\" width=\"200\"\u002F>\n\u003C\u002Fp>\n\n\u003Ch1 align=\"center\">FrontierSmith\u003C\u002Fh1>\n\n\u003Ch3 align=\"center\">\nSynthetic Open-ended Problem Generation\n\u003C\u002Fh3>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.14445\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2605.14445-b31b1b?logo=arxiv&logoColor=white\" alt=\"arXiv\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Ffrontier-cs.org\u002Fblog\u002Ffrontiersmith\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FBlog-frontier--cs.org-1f6feb\" alt=\"Blog\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Frunyuanhe\u002Fqwen35-9b-frontiersmith\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002F%F0%9F%A4%97%20HuggingFace-Model-yellow\" alt=\"HuggingFace 
Model\">\u003C\u002Fa>\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002FFrontierCS\u002FFrontier-CS\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FFrontier--CS-Official_Repo-blue?logo=github\" alt=\"Frontier-CS\">\u003C\u002Fa>\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSynthetic_Problems-10-green\" alt=\"Synthetic Problems\">\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.11+-yellow?logo=python&logoColor=white\" alt=\"Python\">\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDocker-24+-2496ED?logo=docker&logoColor=white\" alt=\"Docker\">\n\u003C\u002Fp>\n\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fb91ee8fb-d5e0-40b3-8a6c-cd1b3b915ac2\n\n## Overview\n\n**FrontierSmith** is the synthetic open-ended problem-generation pipeline. This repository contains training code, evaluation code, and **10 synthetic algorithmic problems** used in the paper's parity experiment.\n\n> The orchestrator and LLM-driven test\u002Fchecker generators are intentionally withheld.\n\n---\n\n## Repository Structure\n\n```\nFrontierSmith\u002F\n├── README.md\n├── requirements.txt\n├── setup-env.sh                          # one-shot environment bootstrap\n├── verl\u002F                                 # vendored VERL framework (editable install)\n├── ALE-Bench\u002F                            # ALE-Bench validator (third-party)\n├── Frontier-CS\u002F\n│   ├── algorithmic\u002F\n│   │   ├── problems\u002F                     # 10 synthetic problems\n│   │   │   └── frontiersmith_{1..10}\u002F\n│   │   ├── Dockerfile \u002F server.js \u002F judge\u002F \u002F scripts\u002F\n│   │   └── ...\n│   ├── src\u002F pyproject.toml\n│   └── README.md\n├── harbor\u002F\n│   └── adapters\u002Ffrontier-cs-algorithm\u002F   # Harbor adapter\n├── scripts\u002F                              # training \u002F evaluation \u002F data-prep\n└── data\u002F\n    └── sample_lists\u002F           
          # reproducibility manifests\n```\n\n---\n\n## Synthetic Problems\n\n10 problems in `Frontier-CS\u002Falgorithmic\u002Fproblems\u002F`. These correspond to problems **306–315** in the [Frontier-CS main repository](https:\u002F\u002Fgithub.com\u002FFrontierCS\u002FFrontier-CS):\n\n| ID | Frontier-CS ID | Name |\n|:---|:---------------|:-----|\n| `frontiersmith_1` | 306 | Scorched Bridges Campaign |\n| `frontiersmith_2` | 307 | Farmwide Teleport Pad Deployment |\n| `frontiersmith_3` | 308 | Metallic Pink Resonator Layout |\n| `frontiersmith_4` | 309 | Park Ranger Shift Balancing |\n| `frontiersmith_5` | 310 | Prime Resonance Retuning |\n| `frontiersmith_6` | 311 | Mobile Relay Layout |\n| `frontiersmith_7` | 312 | Archipelago Relay Network Design |\n| `frontiersmith_8` | 313 | Resonant Bay Layout |\n| `frontiersmith_9` | 314 | Duff's Defensive Lineup |\n| `frontiersmith_10` | 315 | Quadratic Witness Packing |\n\nEach directory contains:\n\n```\nchk.cc           # custom checker\nconfig.yaml      # judge configuration\ngen.cpp          # testlib-style test-case generator\nstatement.txt    # problem statement\ntestdata\u002F        # *.in \u002F *.ans pairs\n```\n\n---\n\n## Environment Setup\n\n```bash\nsource setup-env.sh             # creates .venv, installs all deps\nsource setup-env.sh --skip      # activate existing env quickly\n```\n\nExternal services:\n\n```bash\nhf auth login          # to download Qwen3.5-9B \u002F 27B weights\nwandb login            # optional, for training logs\n```\n\n### Tested Versions\n\n| Package      | Version           | Notes                                   |\n|:-------------|:------------------|:----------------------------------------|\n| Python       | 3.11              | `apt install python3.11 python3.11-dev` |\n| torch        | 2.11.0+cu130      | pulled by vllm                          |\n| vllm         | 0.20.0            |                                         |\n| transformers | 5.7.0             | Qwen3.5 
needs >= 5.2.0                  |\n| verl         | 0.8.0.dev (local) | editable install from `verl\u002F`           |\n| ray          | 2.55.1            |                                         |\n\n---\n\n## Datasets\n\n### Frontier-CS Algorithmic Track (172 problems, public)\n\nNot redistributed. Use the [official release](https:\u002F\u002Fgithub.com\u002FFrontierCS\u002FFrontier-CS) to populate `Frontier-CS\u002Falgorithmic\u002Fproblems\u002F\u003Cnumeric_id>\u002F`.\n\n### HardTest (sampled, public)\n\n```bash\npython scripts\u002Fdownload_hardtest.py\npython scripts\u002Finstall_hardtest_frontier_packages.py\npython scripts\u002Fsplit_hardtest_by_difficulty.py\npython scripts\u002Fsample_hardtest_problems.py --n 200 --seed 42 \\\n       -o results\u002Fhardtest_hard_sampled_200.json\n```\n\nThe exact 200-problem manifest is at `data\u002Fsample_lists\u002Fhardtest_hard_sampled_200.json`.\n\n### Synthetic Problems (10, this repo)\n\nThe 30-problem mixed sample list (10 from each of HardTest, Frontier-CS, synthetic) is at `data\u002Fsample_lists\u002Fharbor_sample_30.jsonl`.\n\n### Harbor + Claude Code Reproduction\n\nThe 10 FrontierSmith problems can be loaded into a local Harbor dataset with\nthe bundled Frontier-CS algorithm adapter, then run with Harbor's standard\n`claude-code` agent.\n\nPrerequisites:\n\n- Docker Desktop or another Docker daemon reachable by the current user.\n- `uv` and the Harbor CLI (`uv tool install harbor`, or `pip install harbor`).\n- A Claude Code compatible Anthropic key exported as `ANTHROPIC_API_KEY`.\n- If your machine needs a proxy, configure both the shell and Docker Desktop\n  for it, for example `http:\u002F\u002F127.0.0.1:7897`.\n\nLoad the FrontierSmith tasks:\n\n```bash\nexport FRONTIERSMITH_ROOT=\"$(pwd)\"\nexport FRONTIER_CS_ALGORITHMIC_PATH=\"$FRONTIERSMITH_ROOT\u002FFrontier-CS\u002Falgorithmic\"\n\ncd \"$FRONTIERSMITH_ROOT\u002Fharbor\u002Fadapters\u002Ffrontier-cs-algorithm\"\nuv sync\nuv run 
frontier-cs-algorithm \\\n  --source \"$FRONTIERSMITH_ROOT\u002FFrontier-CS\" \\\n  --output-dir \"$FRONTIERSMITH_ROOT\u002Fdatasets\u002Ffrontiersmith-sample\" \\\n  --include-non-numeric \\\n  --task-ids \\\n    frontiersmith_1 frontiersmith_2 frontiersmith_3 frontiersmith_4 frontiersmith_5 \\\n    frontiersmith_6 frontiersmith_7 frontiersmith_8 frontiersmith_9 frontiersmith_10 \\\n  --overwrite\ncd \"$FRONTIERSMITH_ROOT\"\n```\n\nThis creates 10 Harbor task directories under\n`datasets\u002Ffrontiersmith-sample\u002Ffrontier-cs-algorithm-frontiersmith_*`.\n\nRun a smoke test with Claude Code on one task:\n\n```bash\nharbor run \\\n  -p datasets\u002Ffrontiersmith-sample \\\n  -a claude-code \\\n  -m anthropic\u002Fclaude-opus-4-6 \\\n  -l 1 \\\n  -n 1 \\\n  --jobs-dir jobs\u002Ffrontiersmith-claude-smoke \\\n  --yes\n```\n\nRun all 10 tasks:\n\n```bash\nharbor run \\\n  -p datasets\u002Ffrontiersmith-sample \\\n  -a claude-code \\\n  -m anthropic\u002Fclaude-opus-4-6 \\\n  -n 1 \\\n  --jobs-dir jobs\u002Ffrontiersmith-claude \\\n  --yes\n```\n\nResults are written under the selected `jobs\u002F` directory. 
Each task expects\nthe agent to write `\u002Fapp\u002Fsolution.cpp`; the verifier posts that solution to\nthe Frontier-CS judge sidecar and records the reward in\n`reward.txt` \u002F `reward.json`.\n\n### ALE-Bench (validation only)\n\n```bash\npython scripts\u002Fprepare_alebench_parquet.py\n```\n\n---\n\n## Services\n\n### Frontier-CS Judge\n\n```bash\ncd Frontier-CS\u002Falgorithmic\ndocker build -t frontiercs-judge .\n.\u002Frun_judge.sh                  # listens on http:\u002F\u002Flocalhost:8082\n```\n\n### ALE-Bench Judge (optional)\n\n```bash\ncd ALE-Bench\nbash scripts\u002Fdocker_build_202301.sh $(id -u) $(id -g)\n```\n\n---\n\n## Data Preparation\n\n```bash\npython scripts\u002Fprepare_frontiercs_parquet.py             # Frontier-CS 172 → parquet\npython scripts\u002Fprepare_hardtest_hard_sample_parquet.py    # HardTest 200 → parquet\npython scripts\u002Fprepare_synthetic_parquet.py               # 10 synthetic → parquet\npython scripts\u002Fprepare_alebench_parquet.py                # ALE-Bench validation\npython scripts\u002Fprepare_random_reward_train_parquet.py     # Random-reward\n```\n\n---\n\n## Training\n\nAll scripts use `python -m verl.trainer.main_ppo` with Hydra overrides.\n\n```bash\n# Main 9B GRPO run\nbash scripts\u002Frun_verl_grpo_frontiercs_qwen35_9b.sh\n\n# Multi-GPU\nNGPU=8 TP=2 bash scripts\u002Frun_verl_grpo_frontiercs_qwen35_9b.sh\n```\n\n| Variable      | Default                         | Description                          |\n|:--------------|:--------------------------------|:-------------------------------------|\n| `MODEL_PATH`  | `Qwen\u002FQwen3.5-9B`              | HF model id or local path           |\n| `TRAIN_DATA`  | `data\u002Ffrontiercs\u002Ftrain.parquet` | training parquet                     |\n| `VAL_DATA`    | `data\u002Ffrontiercs\u002Ffull.parquet`  | validation parquet                   |\n| `NGPU`        | `4`                             | GPUs per node                        |\n| `TP`          | `1`        
                     | tensor parallel size for vLLM        |\n| `FRESH_START` | `0`                             | set `1` to start from scratch        |\n\n### Experiment Scripts\n\n| Script | Purpose |\n|:-------|:--------|\n| `run_verl_grpo_frontiercs_qwen35_9b.sh` | 9B on Frontier-CS (172 problems) |\n| `run_verl_grpo_frontiercs_qwen35_9b_no_validation.sh` | above, validation disabled |\n| `run_verl_grpo_frontiercs_qwen35_9b_alebench.sh` | 9B with ALE-Bench validation |\n| `run_verl_grpo_frontiercs_qwen35_9b_hardtest.sh` | 9B on HardTest 200 |\n| `run_verl_grpo_frontiercs_qwen35_9b_synthetic.sh` | 9B + synthetic mix |\n| `run_verl_grpo_frontiercs_qwen35_9b_nofilter.sh` | ablation (no filtering) |\n| `run_verl_grpo_frontiercs_qwen35_9b_randomreward.sh` | random-reward control |\n\n---\n\n## Evaluation\n\n```bash\n# Start vLLM server\nbash scripts\u002Fstart_vllm_server.sh\n\n# Base model \u002F checkpoint sweeps\nbash scripts\u002Feval_base_model_frontiercs.sh\nbash scripts\u002Feval_frontiercs_checkpoints.sh\n\n# Single-shot evaluation\npython scripts\u002Feval_frontiercs_via_vllm.py\npython scripts\u002Frun_qwen_frontiercs.py\npython scripts\u002Frun_merged_model.py\n\n# VERL inference\nbash scripts\u002Frun_verl_inference_server.sh\nbash scripts\u002Frun_verl_inference_from_ckpt.sh\nbash scripts\u002Frun_verl_inference_from_model.sh\n\n# Convert FSDP shards → HF model\npython scripts\u002Fmerge_fsdp_to_hf.py --ckpt-dir \u003C...> --output-dir \u003C...>\n```\n\n### Citing Us\n```\n@article{he2026frontiersmith,\n  title={FrontierSmith: Synthesizing Open-Ended Coding Problems at Scale},\n  author={He, Runyuan and Mang, Qiuyang and Zhou, Shang and Liu, Kaiyuan and Li, Hanchen and Mao, Huanzhi and Zhang, Qizheng and Li, Zerui and Peng, Bo and Cheng, Lufeng and others},\n  journal={arXiv preprint arXiv:2605.14445},\n  year={2026}\n}\n```\n","FrontierSmith is a system that uses AI to generate open-ended programming problems at scale. Its core function is to automatically generate challenging algorithmic problems with AI; it supports Python 
3.11+ environments and offers Docker-based containerized deployment to simplify running it. It includes a complete training and evaluation codebase, along with 10 synthetic algorithmic problems used for the paper's experimental validation. These features make FrontierSmith particularly well suited to programming instruction and assessment in education and to designing programming-contest problems, while also giving researchers a platform for exploring AI's capabilities in generating complex tasks.",2,"2026-06-11 04:03:00","CREATED_QUERY"]