[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-1053":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":8,"language":10,"languages":8,"totalLinesOfCode":8,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":15,"stars30d":14,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":16,"rankGlobal":8,"rankLanguage":8,"license":17,"archived":18,"fork":18,"defaultBranch":19,"hasWiki":18,"hasPages":18,"topics":20,"createdAt":8,"pushedAt":8,"updatedAt":21,"readmeContent":22,"aiSummary":23,"trendingCount":15,"starSnapshotCount":15,"syncStatus":24,"lastSyncTime":25,"discoverSource":26},1053,"FM-Agent","haoran-ding\u002FFM-Agent","haoran-ding",null,"https:\u002F\u002Ffm-agent.ai\u002F","Python",396,25,3,4,0,4.24,"Apache License 2.0",false,"main",[],"2026-06-12 02:00:22","# FM-Agent: Scaling Formal Methods to Large Systems via LLM-Based Hoare-Style Reasoning\n\nFM-Agent is the first framework that realizes automated compositional reasoning for large-scale systems (e.g., [Claude's C Compiler](https:\u002F\u002Fgithub.com\u002Fanthropics\u002Fclaudes-c-compiler) with 143K LoC).\nIt is presented in the paper \"[FM-Agent: Scaling Formal Methods to Large Systems via LLM-Based Hoare-Style Reasoning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.11556)\".\n\nThe [website](http:\u002F\u002Ffm-agent.ai\u002F) of FM-Agent provides an online service for reasoning about codebases. You can try it easily!\n\n> **⚠️ Warning**: The effectiveness of this framework is heavily influenced by the capability of the underlying model. Weaker models may produce hallucinations, leading to incorrect reasoning conclusions. 
We recommend using models with strong reasoning abilities (Claude Opus 4.6\u002F4.7, Claude Sonnet 4.6) for more reliable results.\n\n## Table of Contents\n\n- [File Structure](#file-structure)\n- [Environment Setup](#environment-setup)\n  - [Requirements](#requirements)\n  - [Install Dependencies](#install-dependencies)\n- [Configuration](#configuration)\n- [Quick Start](#quick-start)\n- [Important Notes](#important-notes)\n- [Citation](#citation)\n- [Contact](#contact)\n\n\n## File Structure\n\n```\n|-- main.py                # Entry point and pipeline orchestrator\n|-- config.py              # Configuration constants (model, granularity, concurrency)\n|-- install.sh             # Dependency installation script\n|-- src\u002F                   # Core source modules (extraction, reasoning, LLM interaction, etc.)\n|-- md\u002F                    # Workflow of FM-Agent to guide LLMs\n```\n\n## Environment Setup\n\n### Requirements\n\n- Ubuntu (24.04 LTS is tested)\n- Python 3.12\n- pip >= 23\n- [openai](https:\u002F\u002Fpypi.org\u002Fproject\u002Fopenai\u002F) 2.15.0\n- [OpenCode](https:\u002F\u002Fgithub.com\u002Fopencode-ai\u002Fopencode) 1.4.6\n- [Bun](https:\u002F\u002Fbun.sh\u002F)\n- [oh-my-opencode](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Foh-my-opencode) plugin (installed via `bunx`)\n- [OpenRouter](https:\u002F\u002Fopenrouter.ai\u002F) API key\n\n### Install Dependencies\n\nSet your [OpenRouter](https:\u002F\u002Fopenrouter.ai\u002F) API key as an environment variable. Note that FM-Agent only supports the OpenRouter API key for now, because it will concurrently invoke LLMs. 
OpenRouter is flexible in its RPM (requests per minute) and TPM (tokens per minute) limits.\n\n\n```bash\nexport OPENROUTER_API_KEY=\"your-api-key-here\"\n```\n\nThen, all of the above dependencies (except Ubuntu and Python) can be installed via the provided script:\n\n```bash\n.\u002Finstall.sh\n```\n\n(Optional) If needed, you can manually set the default LLM model and API key of OpenCode in its configuration file.\n\n**Important:** FM-Agent automatically derives test cases from the reasoning process to trigger potential bugs, which helps developers locate and fix them. Before running FM-Agent, please ensure the execution environment for test cases is ready, and if necessary, specify how to run test cases in `md\u002Fbug_validator.md`. If you do not specify a method, the agent decides how to run the test cases on its own.\n\n## Configuration\n\nKey parameters can be adjusted in [config.py](config.py).\n\n| Parameter | Default | Description |\n|---|---|---|\n| `LLM_MODEL` | `anthropic\u002Fclaude-sonnet-4.6` | LLM model used via OpenRouter |\n| `LLM_OPENROUTER_API_KEY` | (env) | OpenRouter API key (read via `os.environ.get(\"OPENROUTER_API_KEY\")`) |\n| `LLM_OPENROUTER_API_BASE_URL` | `https:\u002F\u002Fopenrouter.ai\u002Fapi\u002Fv1` | OpenRouter API base URL |\n\n**Important Note:** We strongly recommend using Claude Opus 4.6\u002F4.7 or Claude Sonnet 4.6, as other models may lack the reasoning capabilities required by FM-Agent and may not be able to effectively uncover bugs. In addition, please use an API key with access to Claude models, since FM-Agent invokes OpenCode, which may access Claude models.\n\n(Optional) FM-Agent uses the oh-my-opencode plugin to enhance OpenCode. The comment-checker hook built into this plugin should be disabled; otherwise it may intercept every comment block that FM-Agent writes, and these blocks are the function specifications. 
It may force the agent to waste tokens justifying or removing them.\nYou can open your oh-my-opencode config file (typically ~\u002F.config\u002Fopencode\u002Foh-my-opencode.json) and add `disabled_hooks`:\n\n```json\n{\n  \"disabled_hooks\": [\"comment-checker\"]\n}\n```\n\n\n## Quick Start\n\n```bash\npython3 main.py \u003Cproj_dir>\n```\n\n| Argument | Description |\n|---|---|\n| `proj_dir` | Directory of the codebase whose correctness you want to check |\n\n### Output\n\nFM-Agent creates an `fm_agent\u002F` directory under your codebase directory. The key outputs are:\n\n#### Bug Reports (`fm_agent\u002Fbug_validation\u002F\u003Cbug_id>.md`)\n\nEach confirmed or investigated bug produces a Markdown report containing:\n\n| Section | Content |\n|---|---|\n| Specification Claim | The post-condition that the function specification requires |\n| Actual Behavior | The post-condition that the code actually implements |\n| Code Evidence | The specific code statements (with line numbers) that cause the violation |\n| Trigger Condition | A description of the condition that triggers the bug |\n| How to Trigger | Concrete input parameters, expected vs. actual output, and reproduction steps |\n| Probe Script | The full test script used to confirm the bug |\n| Probe Output | Raw stdout from executing the probe script |\n\nA companion `\u003Cbug_id>.result.json` is generated alongside each report, containing machine-readable fields such as `confirmation_status` (`confirmed`, `not_confirmed`, or `error`), `probe_script` path, and `trigger_summary`.\n\nA `summary.json` file in `fm_agent\u002Fbug_validation\u002F` aggregates all bug results with counts of total reported, confirmed, not confirmed, and errored bugs.\n\n#### Log File (`fm_agent\u002Ffm_agent.log`)\n\nA single log file records the entire pipeline execution, including file extraction progress, reasoning submissions and completions, network errors and retries, and the final reasoning summary statistics. 
The log level is `INFO` and the format is `%(asctime)s [%(levelname)s] %(message)s`.\n\n## Important Notes\n\n1. FM-Agent will create an `fm_agent\u002F` directory under your codebase directory. Make sure no existing file or directory already uses that name.\n2. The Markdown files under `md\u002F` provide general instructions that guide the agent's reasoning process. Customizing them for your specific project can improve accuracy and help uncover more bugs. For example, you can include project documentation to give the agent a deeper understanding of your codebase, or if you are reasoning about a compiler, modify `md\u002Fbug_validator.md` to instruct the agent to compare outputs against a reference implementation (e.g., GCC).\n3. **Supported languages**: Rust, C, C++, Python, Java, Go, CUDA, JavaScript, TypeScript, ArkTS.\n\n## Citation\n\nIf you use FM-Agent in your projects or research, please cite our [paper](https:\u002F\u002Farxiv.org\u002Fabs\u002F2604.11556):\n\n```bibtex\n@misc{ding2026fmagent,\nAuthor = {Haoran Ding and Zhaoguo Wang and Haibo Chen},\nTitle = {FM-Agent: Scaling Formal Methods to Large Systems via LLM-Based Hoare-Style Reasoning},\nYear = {2026},\nEprint = {arXiv:2604.11556},\n}\n```\n\n## Contact\n\nIf you have any questions, please submit an issue or send an [email](mailto:nhaorand@gmail.com).\n","FM-Agent is a framework for automated compositional reasoning that applies formal methods to large-scale systems. Using LLM-based Hoare-style reasoning, it automates the verification of large codebases (such as Claude's C Compiler, with 143K lines of code). The project is written in Python and supports strong reasoning models (such as Claude Opus 4.6\u002F4.7 and Claude Sonnet 4.6) to improve accuracy. Users can try the tool through the online service on its website. Note that the capability of the underlying model directly affects the framework's effectiveness, and weaker models may lead to incorrect conclusions. FM-Agent suits scenarios that require formal verification and testing of complex software systems, in particular helping to identify potential bugs during development and suggesting fixes.",2,"2026-06-01 02:39:05","CREATED_QUERY"]