[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80910":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":14,"stars7d":13,"stars30d":15,"stars90d":14,"forks30d":14,"starsTrendScore":14,"compositeScore":16,"rankGlobal":9,"rankLanguage":9,"license":17,"archived":18,"fork":18,"defaultBranch":19,"hasWiki":18,"hasPages":18,"topics":20,"createdAt":9,"pushedAt":9,"updatedAt":21,"readmeContent":22,"aiSummary":23,"trendingCount":14,"starSnapshotCount":14,"syncStatus":15,"lastSyncTime":24,"discoverSource":25},80910,"event-data-chatbot","parmacalcio1913\u002Fevent-data-chatbot","parmacalcio1913","A CLI chatbot powered by StatsBomb open data that can be used to query event data using natural language.",null,"Python",35,3,1,0,2,1.81,"MIT License",false,"main",[],"2026-06-12 02:04:08","# Event Data Chatbot\n\nA CLI chatbot that lets you ask Claude analytical questions about football event data — *\"Who scored the most goals in the 2015\u002F2016 La Liga?\"*, *\"Compare Barcelona's home and away xG\"*, and so on. Under the hood, an MCP (Model Context Protocol) server exposes a local StatsBomb open-data snapshot as a single read-only SQL tool; Claude writes the queries against the underlying DuckDB, you just chat.\n\n## Data source\n\nThis project queries [StatsBomb open data](https:\u002F\u002Fgithub.com\u002Fstatsbomb\u002Fopen-data). If you publish analysis based on this data you must credit **StatsBomb** and display their logo — see [ATTRIBUTION.md](ATTRIBUTION.md) for full requirements. 
Before using the data, register at the [StatsBomb resource centre](https:\u002F\u002Fstatsbomb.com\u002Fresource-centre\u002F).\n\nThe initial scaffolding for the MCP client, agentic loop, and CLI in this project was adapted from the \"Model Context Protocol\" section of Stephen Grider's [Building with the Claude API](https:\u002F\u002Fanthropic.skilljar.com\u002Fclaude-with-the-anthropic-api\u002F287780) course at Anthropic.\n\n## Prerequisites\n\n- Python 3.10+\n- Anthropic API Key\n\n## Setup\n\n### Step 1: Configure the environment variables\n\n1. Create or edit the `.env` file in the project root and verify that the following variables are set correctly:\n\n```\nCLAUDE_MODEL=\"\" # Enter the Claude model you want to use.\nANTHROPIC_API_KEY=\"\"  # Enter your Anthropic API secret key\n```\n\n### Step 2: Install dependencies\n\n[uv](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fuv) is a fast Python package installer and resolver.\n\n1. Install uv, if not already installed:\n\n```bash\npip install uv\n```\n\n2. Clone the repository:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fparmacalcio1913\u002Fevent-data-chatbot.git\ncd event-data-chatbot\n```\n\n3. Create and activate a virtual environment:\n\n```bash\nuv venv\nsource .venv\u002Fbin\u002Factivate  # On Windows: .venv\\Scripts\\activate\n```\n\n4. Install dependencies:\n\n```bash\nuv pip install -e .\n```\n\n5. Download the StatsBomb snapshot into `data\u002Fstatsbomb.duckdb` (~0.5 GB, one-shot):\n\n```bash\nuv run scripts\u002Fdownload_data.py\n```\n\n6. Run the project\n\n```bash\nuv run main.py [--query] [--usage]\n```\n\nBy default the CLI prints only Claude's responses. Two opt-in flags expose what's happening under the hood:\n\n- `--usage` — print Anthropic token usage per turn (`[tokens] in=… out=…`) and a running total per user message.\n- `--query` — print each tool call with its input, e.g. `[tool] query({\"sql\": \"...\"})`. 
Handy for seeing the SQL Claude wrote.\n\n## Usage\n\n### Basic Interaction\n\nType your question and press Enter. For example, ask:\n\n```\n> Who scored the most goals in the 2015\u002F2016 Premier League?\n```\n\nUnder the hood:\n\n1. The orchestration loop (`Chat.run()` in [core\u002Fchat.py](core\u002Fchat.py)) receives your message. Before each call to Claude it collects every MCP tool exposed by every connected server via `ToolManager.get_all_tools()` ([core\u002Ftools.py](core\u002Ftools.py)).\n2. `ToolManager.get_all_tools()` calls `MCPClient.list_tools()` on each client ([mcp_client.py](mcp_client.py)), which sends a `ListToolsRequest` to the corresponding MCP server. The server returns the registered tools — here, just the `query` SQL tool defined in [mcp_server.py](mcp_server.py). Its description includes the full StatsBomb events schema, so Claude already knows what columns exist without inspecting the catalog at request time.\n3. `Chat.run()` calls `Claude.chat(messages=..., tools=...)` ([core\u002Fclaude.py](core\u002Fclaude.py)), forwarding your question along with the available tools. Claude decides the question needs data, formulates a SQL statement, and asks to call the `query` tool.\n4. Claude returns `stop_reason == \"tool_use\"`. `Chat.run()` detects this and calls `ToolManager.execute_tool_requests()`, which finds the client that owns `query` and calls `MCPClient.call_tool(\"query\", {\"sql\": \"...\"})`.\n5. The MCP server runs the SQL against the local DuckDB via `StatsBomb.query()` ([core\u002Fstatsbomb.py](core\u002Fstatsbomb.py)), JSON-serializes the result (dates → ISO strings, `Decimal` → `float`), and returns up to 1000 rows with a `truncated` flag. Errors come back as MCP error responses with the SQL exception message attached.\n6. `Chat.run()` appends the tool result as a user message and loops back to call Claude again (step 3) — this time with the result in the conversation history. 
Claude either produces a final answer (`stop_reason == \"end_turn\"`) or asks to call the tool again with a refined query.\n\nThe same loop handles multi-step questions naturally: Claude may issue a small exploratory query first (\"what competition names exist in the database?\"), look at the answer, then issue a follow-up aggregation query — all within one user turn.\n\n### Commands\n\nUse the `\u002F` prefix to invoke an MCP prompt defined on the server. This project exposes one — `summary` — which produces a structured match report. Pass the match ID as a positional argument:\n\n```\n> \u002Fsummary 3877313\n```\n\nUnder the hood:\n\n1. `CliApp.run()` ([core\u002Fcli.py](core\u002Fcli.py)) reads the input and forwards it to `CliChat.run()`. `CliChat._process_query()` delegates first to `_process_command()` ([core\u002Fcli_chat.py](core\u002Fcli_chat.py)).\n2. `_process_command()` notices the leading `\u002F`, splits the input into a command name (`summary`) and its positional arguments (`[\"3877313\"]`), and looks up the corresponding `Prompt` definition via `MCPClient.list_prompts()`. It zips the positional args onto the prompt's declared argument names — the server's `summary` prompt declares one argument called `match_id`, so this becomes `{\"match_id\": \"3877313\"}`.\n3. `MCPClient.get_prompt(\"summary\", {\"match_id\": \"3877313\"})` ([mcp_client.py](mcp_client.py)) sends a `GetPromptRequest` to the MCP server. The server's `summary` handler ([mcp_server.py](mcp_server.py)) renders the pre-built report template — instructions plus the exact `matches` \u002F `lineups` \u002F `events` SQL queries Claude is allowed to run — and returns it as a sequence of `PromptMessage` objects.\n4. Those messages are converted into Anthropic `MessageParam` objects by `convert_prompt_messages_to_message_params()` and appended to the conversation history.\n5. `_process_command()` returns `True`, signalling that the user input has already been turned into messages. 
`Chat.run()` then calls Claude with the pre-built conversation — Claude executes the prescribed queries via the `query` tool and writes the report.\n\nTab completion is provided by `UnifiedCompleter` ([core\u002Fcli.py](core\u002Fcli.py)): typing `\u002F` opens a menu of the prompts exposed by the connected servers. Once a command name is in place, `CommandAutoSuggest` shows the first declared argument name (e.g. `match_id`) inline as a hint.\n\n## Development\n\nSee [CONTRIBUTING.md](CONTRIBUTING.md) for the full dev guide. In short:\n\n```bash\nuv sync --group dev          # installs ruff, mypy, pre-commit, pytest\nuv run pre-commit install    # wires git hooks\n```\n\nThe four quality gates can be run manually:\n\n```bash\nuv run ruff check .          # lint\nuv run ruff format .         # format\nuv run mypy .                # type check\nuv run pytest                # tests\n```\n\nCI runs the same four checks on push and on every PR against `main`, matrixed over Python 3.10, 3.11, and 3.12.\n\n## Security\n\nIf you find a vulnerability, please **do not** open a public issue. See [SECURITY.md](SECURITY.md) for the private reporting channel and the project's threat model.\n\n## License\n\nMIT — see [LICENSE](LICENSE). The MIT license covers the code in this repository only; StatsBomb data remains subject to the [StatsBomb user agreement](https:\u002F\u002Fgithub.com\u002Fstatsbomb\u002Fopen-data\u002Fblob\u002Fmaster\u002FLICENSE.pdf).\n","This project is a command-line chatbot built on StatsBomb open data that lets users query football event data in natural language. Its core function is to have a Claude model generate SQL queries that analyze the event data stored in DuckDB, supporting complex questions such as \"Who scored the most goals in the 2015\u002F2016 La Liga season?\" or \"Compare Barcelona's home and away expected goals\". Technically, the project is written in Python and relies on the Anthropic API and a local MCP server that exposes the data source. It is suited to researchers, analysts, and enthusiasts who need to retrieve and analyze football match statistics quickly.","2026-06-11 04:02:47","CREATED_QUERY"]