[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80736":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":14,"stars7d":15,"stars30d":16,"stars90d":14,"forks30d":14,"starsTrendScore":14,"compositeScore":17,"rankGlobal":9,"rankLanguage":9,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":19,"hasPages":19,"topics":21,"createdAt":9,"pushedAt":9,"updatedAt":22,"readmeContent":23,"aiSummary":24,"trendingCount":14,"starSnapshotCount":14,"syncStatus":16,"lastSyncTime":25,"discoverSource":26},80736,"eager-tools","cloudthinker-ai\u002Feager-tools","cloudthinker-ai","Eager tool dispatch for streaming LLM agents — dispatch tools the moment their JSON block seals, not after the message ends.",null,"Python",44,4,42,0,1,2,42.8,"MIT License",false,"main",[],"2026-06-12 04:01:29","# eager-tools\n\n> **Cut agent wall-clock latency by overlapping tool execution with LLM streaming.**\n>\n> A production-grade reference implementation of **eager tool calling** — the pattern that dispatches each tool the moment its block finishes streaming, not after `message_stop`.\n\n[![PyPI](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Feager-tools-core.svg)](https:\u002F\u002Fpypi.org\u002Fproject\u002Feager-tools-core\u002F)\n[![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-green.svg)](.\u002FLICENSE)\n[![CI](https:\u002F\u002Fgithub.com\u002Fcloudthinker-ai\u002Feager-tools\u002Factions\u002Fworkflows\u002Fci.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fcloudthinker-ai\u002Feager-tools\u002Factions)\n[![Docs](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdocs-METHOD.md-blue.svg)](.\u002FMETHOD.md)\n\n---\n\n## The problem in one graph\n\n\u003Cp align=\"center\">\n  \u003Cimg 
src=\".\u002Fdocs\u002Fdiagrams\u002Feager-vs-classic-timeline.svg?v=3\" alt=\"Timeline: parallel waits for stream to finish; eager fires each tool the moment its block seals — tools and stream overlap.\"\u002F>\n\u003C\u002Fp>\n\n\u003Cdetails>\n\u003Csummary>ASCII fallback\u003C\u002Fsummary>\n\n```\nClassic parallel tool calling:\nstream : [==================================]\ntools  :                                     [===========]  ← idle during stream\ntotal  :                                                    ← stream + max(tool)\n\nEager tool calling:\nstream : [==================================]\ntool A :   [=========]        ← fires mid-stream\ntool B :       [=========]    ← fires mid-stream, overlaps A\ntool C :           [=========]← fires at message_stop\ntotal  : [==================================]               ← max(stream, max(tool))\n```\n\n\u003C\u002Fdetails>\n\nParallel tool calling overlaps tools with tools. **Eager tool calling overlaps tools with generation itself.**\n\n## Benchmark headline\n\nSynthetic harness — `make bench` reproduces locally, deterministic.\nAcross 16 workloads (3 → 15 tools), eager beats parallel by **1.20× – 1.50×** (median ~1.28×).\nParallel is the right baseline: modern frameworks (`langchain.agents.create_agent`,\nOpenAI Agents SDK, Vercel AI SDK) already execute tool calls from one\nassistant message concurrently. Eager's win comes from overlapping tools\nwith the *stream itself* — something parallel dispatch can't do.\nFull table + repro details: [`bench\u002Fresults.md`](.\u002Fbench\u002Fresults.md).\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fdocs\u002Fdiagrams\u002Fbenchmark-results-chart.svg?v=3\" alt=\"Bar chart: sequential vs parallel vs eager across 3-tool, 9-tool, and 15-tool workloads. 
Eager wins by 1.21×–1.46× vs parallel.\"\u002F>\n\u003C\u002Fp>\n\n| Workload | Sequential | Parallel | **Eager** | Speedup vs parallel |\n|----------|------------|----------|-----------|---------------------|\n| 3-tool analytics | 4.90s | 3.50s | **2.90s** | 1.21× |\n| 9-tool incident triage | 17.61s | 9.50s | **6.50s** | 1.46× |\n| 15-tool ad campaign | 30.42s | 11.50s | **8.80s** | 1.31× |\n\n> These are **lower bounds**. The synthetic stream removes network jitter,\n> tail latency, and provider-side variance — the things that make eager\n> dispatch shine in production. Run `make bench-live-anthropic` (or\n> `-openai`) to spot-check against a real provider.\n\n---\n\n## 60-second quickstart\n\n```bash\npip install eager-tools-core eager-tools-langgraph   # once published\n# or, from source:\ngit clone https:\u002F\u002Fgithub.com\u002Fcloudthinker-ai\u002Feager-tools && cd eager-tools && make sync\n```\n\n```python\nimport asyncio, os\nfrom langchain.agents import create_agent\nfrom langchain_anthropic import ChatAnthropic\nfrom langchain_core.messages import HumanMessage\nfrom langchain_core.tools import tool\nfrom eager_tools_langgraph import eager_middleware\n\nclass SlowTool:\n    def __init__(self, name: str, delay: float = 2.0):\n        self.name = name\n        self.idempotent = True\n        self._delay = delay\n    async def __call__(self, arguments):\n        await asyncio.sleep(self._delay)\n        return {\"name\": self.name, \"args\": arguments, \"ok\": True}\n\n@tool\ndef get_weather(city: str) -> str:\n    \"\"\"Get current weather for a city.\"\"\"\n    return \"\"\n\n@tool\ndef get_stock_price(ticker: str) -> str:\n    \"\"\"Get the current stock price for a ticker symbol.\"\"\"\n    return \"\"\n\n@tool\ndef get_news(topic: str) -> str:\n    \"\"\"Get recent news on a topic.\"\"\"\n    return \"\"\n\neager_tools = {\n    \"get_weather\":    SlowTool(\"get_weather\"),\n    \"get_stock_price\": SlowTool(\"get_stock_price\"),\n    \"get_news\":  
     SlowTool(\"get_news\"),\n}\n\nasync def main():\n    agent = create_agent(\n        model=ChatAnthropic(model_name=\"claude-sonnet-4-5\", timeout=60.0, stop=None),\n        tools=[get_weather, get_stock_price, get_news],\n        middleware=[eager_middleware(eager_tools)],\n    )\n    result = await agent.ainvoke({\n        \"messages\": [HumanMessage(\n            \"Get the weather in NYC, the AAPL stock price, and recent AI news.\"\n        )]\n    })\n    print(result[\"messages\"][-1].content)\n\nasyncio.run(main())\n```\n\nOne middleware line wires eager dispatch into any `create_agent` call — no\nchanges to your tools or prompt. Works with OpenAI too: swap `ChatAnthropic`\nfor `ChatOpenAI`. Runnable variants in [`examples\u002F`](.\u002Fexamples).\n\n---\n\n## Why this exists\n\nModern agent APIs — Anthropic, OpenAI, Bedrock — let the model emit multiple `tool_use` blocks in one assistant message and run them in parallel. That moves the tool phase from *sum* of durations to *max*. Good, but insufficient.\n\nThe **stream phase still happens first**. Tools still wait for `message_stop`. A four-second model stream followed by 2.5s of parallel tool execution is 6.5 seconds of wall clock. 
Eager tool calling makes it 4 seconds — the tools run *during* the stream, not after it.\n\nSee [`METHOD.md`](.\u002FMETHOD.md) for the full mechanism: the seal event, the `tool_call_id` invariant, the runtime contract, and the edge cases.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fdocs\u002Fdiagrams\u002Fseal-mechanism-flow.svg?v=3\" alt=\"Seal mechanism: a new tool_call_id in the stream triggers a SealEvent, which dispatches the completed tool to the ExecutorPool while the stream buffers the next block.\"\u002F>\n\u003C\u002Fp>\n\n---\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Fdocs\u002Fdiagrams\u002Fstream-handler-architecture.svg?v=3\" alt=\"Stream handler architecture: provider stream → adapter → SealDetector → ExecutorPool → user events and results.\"\u002F>\n\u003C\u002Fp>\n\nFor the per-block mechanism (chunks → buffer → seal → dispatch), see\n[`docs\u002Fdiagrams\u002Fseal-mechanism-flow.svg`](.\u002Fdocs\u002Fdiagrams\u002Fseal-mechanism-flow.svg).\n\n## When NOT to use it\n\n- **Fast tools (sub-50ms).** Seal\u002Fdispatch overhead exceeds the latency saved.\n- **Sequentially dependent tools.** If tool B needs tool A's result, the model won't emit B until A returns — no pipeline opportunity.\n- **Non-idempotent tools.** Payments, destructive commands, outbound messages. Route these to the classic path via `Tool.idempotent = False` for blanket denial, or via a per-call `gate` callable for case-by-case decisions with parsed args visible (e.g. allow `read_file` but not under `\u002Fetc\u002F`). See [`docs\u002Fhitl.md`](.\u002Fdocs\u002Fhitl.md). 
The gate still gates the *eager* path; the underlying tool still runs at the framework's tool step for non-denied calls.\n- **Non-streaming backends.** If your gateway buffers the full response, eager dispatch is impossible.\n\nLong version with edge cases: [`docs\u002Fwhen-not-to-use.md`](.\u002Fdocs\u002Fwhen-not-to-use.md).\n\n## Contributing\n\nAdapter PRs welcome — LlamaIndex, AutoGen, Vercel AI SDK, any provider that exposes a streaming response with per-block identifiers. Start from `packages\u002Feager-tools-core\u002F` as the contract reference. See [`NEXT.md`](.\u002FNEXT.md) §3 for the extraction pattern.\n\nBug reports + design discussions happen in **GitHub Discussions** — issues are intentionally disabled to keep the signal-to-noise ratio high.\n\n## Acknowledgements\n\nThis pattern was extracted from production at [CloudThinker](https:\u002F\u002Fcloudthinker.io), where it cuts median agent task latency by 50%. Internal codename: *tool-call pipelining*. External name: *eager tool calling*.\n\nRead the full production story: [*Eager Tool Calling at CloudThinker*](https:\u002F\u002Fcloudthinker.io\u002Fblogs\u002Feager-tool-calling-50-percent-faster-agents).\n\n## License\n\nMIT — see [`LICENSE`](.\u002FLICENSE).\n","eager-tools is a Python library for tool calling in streaming LLM agents that dispatches a tool as soon as its JSON block seals, rather than waiting for the message to end. Its core function is to reduce agent wall-clock latency by executing tool calls concurrently while generation is still in progress, improving overall response speed. The library supports production-grade deployment and is compatible with modern frameworks such as langchain.agents.create_agent and the OpenAI Agents SDK. It suits applications that need fast responses and efficient handling of multiple tool calls, particularly conversational systems and automated assistants, where it can significantly improve the user experience.","2026-06-11 04:01:51","CREATED_QUERY"]