[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-83215":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":16,"stars90d":15,"forks30d":15,"starsTrendScore":17,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":31,"readmeContent":32,"aiSummary":10,"trendingCount":15,"starSnapshotCount":15,"syncStatus":33,"lastSyncTime":34,"discoverSource":35},83215,"WorldCupROI","2417467487-hub\u002FWorldCupROI","2417467487-hub","Sports sponsorship intelligence platform for World Cup match data, real-source text signals, ROI prediction, uncertainty analysis, and scenario recommendations.","https:\u002F\u002Fgithub.com\u002F2417467487-hub\u002FWorldCupROI",null,"Python",206,14,5,0,125,44,70.53,false,"main",true,[23,24,25,26,27,28,29,30],"business-intelligence","machine-learning","plotly","roi-prediction","sponsorship","sports-analytics","streamlit","world-cup","2026-06-12 04:01:40","# WorldCupROI\n\n**AI Sports Sponsorship Intelligence Platform**\n\nWorldCupROI turns the World Cup attention market into a sponsorship decision engine. 
It is not a simple match-result predictor: it blends match performance, media narratives, fan influence, sponsor investment, and uncertainty risk into one ROI decision platform.\n\n[![CI](https:\u002F\u002Fgithub.com\u002F2417467487-hub\u002FWorldCupROI\u002Factions\u002Fworkflows\u002Fci.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002F2417467487-hub\u002FWorldCupROI\u002Factions\u002Fworkflows\u002Fci.yml)\n![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.11-2457c5)\n![ML](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FML-ROI%20Prediction-0f8b6f)\n![Explainability](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FExplainability-SHAP%20Style-f28c28)\n![Risk](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FRisk-Conformal%20%2B%20Monte%20Carlo-6d5bd0)\n![Dashboard](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FDashboard-Streamlit%20%2B%20Plotly-1167b1)\n\n| Quick link | Open |\n|---|---|\n| Demo dashboard | [dashboard\u002Fpanel_dashboard.html](dashboard\u002Fpanel_dashboard.html) |\n| Streamlit app | `make dashboard` |\n| Executive summary | [reports\u002Fexecutive_summary.pdf](reports\u002Fexecutive_summary.pdf) |\n| Business insights | [reports\u002Fbusiness_insights.md](reports\u002Fbusiness_insights.md) |\n| Platform GIF | [assets\u002Fgifs\u002Fstatic_platform_dashboard.gif](assets\u002Fgifs\u002Fstatic_platform_dashboard.gif) |\n| Data card | [docs\u002Fdata_card.md](docs\u002Fdata_card.md) |\n| Model card | [reports\u002Fmodel_card.md](reports\u002Fmodel_card.md) |\n| Deployment guide | [docs\u002Fdeployment.md](docs\u002Fdeployment.md) |\n\n```bash\nmake demo       # fully offline reproducible run\nmake dashboard  # Streamlit decision dashboard\nmake assets     # README images, model visuals, demo media\n```\n\nIf `make` is not available on Windows, run `python scripts\u002Frun_pipeline.py --demo` and then `python -m streamlit run dashboard\u002Fapp.py`.\n\n![WorldCupROI method 
overview](assets\u002Fimages\u002Freadme_hero.png)\n\nThe opening figure is generated with Python from `scripts\u002Fgenerate_readme_assets.py`. It summarizes the project as a machine-learning method overview: multi-source evidence, feature construction, multi-task prediction, explainability, graph intelligence, and ROI decision support.\n\n## Interactive Platform Preview\n\n![WorldCupROI static decision dashboard preview](assets\u002Fgifs\u002Fstatic_platform_dashboard.gif)\n\nThe platform is not only a modeling pipeline. It includes an interactive sponsorship intelligence dashboard for KPI discovery, sponsor ROI ranking, FanScore analysis, scenario simulation, uncertainty review, and graph-based sponsor influence exploration.\n\n| Experience | Open |\n|---|---|\n| Live Streamlit dashboard | `make dashboard` |\n| Static dashboard preview | [dashboard\u002Fpanel_dashboard.html](dashboard\u002Fpanel_dashboard.html) |\n| Animated platform GIF | [assets\u002Fgifs\u002Fstatic_platform_dashboard.gif](assets\u002Fgifs\u002Fstatic_platform_dashboard.gif) |\n| Visual preview page | [preview_visuals.html](preview_visuals.html) |\n\n| Dashboard area | What the interface shows |\n|---|---|\n| Discover | KPI cards, team\u002Fsponsor filters, ROI ranking, FanScore summary. |\n| Explain | SHAP-style ROI drivers, text signals, sponsor-team fit, media exposure. |\n| Predict | Match probability, predicted ROI, interval coverage, risk score. |\n| Simulate | Sponsor spend, player status, media exposure, weather and stage changes. |\n| Recommend | Scenario ROI lift, negative ROI probability, sponsor strategy ranking. 
|\n\n| Link | Target |\n|---|---|\n| Live Demo | `make dashboard` |\n| Static Demo | [dashboard\u002Fpanel_dashboard.html](dashboard\u002Fpanel_dashboard.html) |\n| Platform GIF | [assets\u002Fgifs\u002Fstatic_platform_dashboard.gif](assets\u002Fgifs\u002Fstatic_platform_dashboard.gif) |\n| Report | [sample_report.pdf](sample_report.pdf) |\n| Research Brief | [reports\u002Fsponsorship_intelligence_brief.md](reports\u002Fsponsorship_intelligence_brief.md) |\n\n| Key result | Current value |\n|---|---:|\n| Match prediction accuracy | 0.5566 |\n| Match prediction log loss | 0.9780 |\n| Sponsor ROI model MAE | 0.1213 |\n| Sponsor ROI model R2 | 0.8590 |\n| Match conformal coverage | 0.9021 |\n| ROI interval coverage | 0.8454 |\n| Average negative ROI probability | 0.0000 |\n\n**Summary (translated from Chinese):** WorldCupROI does not simply predict World Cup wins and losses; it integrates match performance, real-source text signals, sponsorship exposure, fan influence, and ROI risk into a sports sponsorship business intelligence platform.\n\n## 10-Second Overview\n\n| Capability | Output | Business value |\n|---|---|---|\n| Sponsor ROI prediction | Expected ROI, ROI lift, ranking | Moves beyond match prediction into commercial decision support. |\n| Real-source text signals | Media heat, narrative momentum, text embeddings | Captures attention shifts that tabular sports data misses. |\n| Uncertainty quantification | Prediction intervals, coverage, negative ROI probability | Makes sponsorship decisions risk-aware instead of point-estimate driven. |\n| Scenario simulation | Spend, exposure, player, weather, stage changes | Tests strategy before campaign money is committed. |\n| Interactive dashboard | Discover -> Explain -> Predict -> Simulate -> Recommend | Turns model outputs into a repeatable business workflow. |\n\n## Results Showcase\n\nResults come first because sponsorship teams need to see the business signal before reading the engineering stack. 
The tables are intentionally kept compact and consistent so they render cleanly on GitHub.\n\n### Results Overview\n\n| Area | Metric | Current value | Decision meaning |\n|---|---|---:|---|\n| Match prediction | Accuracy | 0.5566 | Baseline signal for team outcome probability. |\n| Match prediction | Log loss | 0.9780 | Measures probability calibration quality. |\n| Sponsor ROI | MAE | 0.1213 | Average ROI prediction error. |\n| Sponsor ROI | R2 | 0.8590 | Share of ROI variance explained by model signals. |\n| Conformal prediction | Match coverage | 0.9021 | Reliability of match prediction sets. |\n| Conformal prediction | ROI coverage | 0.8454 | Reliability of ROI interval estimates. |\n| Uncertainty | Negative ROI probability | 0.0000 | Current average downside probability in generated panel. |\n\n### Model Performance Comparison\n\n| Task | Model | Metrics | Status |\n|---|---|---|---|\n| Match outcome | Centroid classifier | Accuracy 0.5566, Log loss 0.9780 | Reproducible baseline |\n| Sponsor ROI | Ridge regression | R2 0.8590, MAE 0.1213 | Reproducible baseline |\n| Tabular modeling | XGBoost | Accuracy, Log loss, feature gain | Optional package |\n| Tabular modeling | LightGBM | Accuracy, Log loss, feature gain | Optional package |\n| Categorical modeling | CatBoost | Accuracy, Log loss, categorical splits | Optional package |\n\n### ROI Feature Importance \u002F SHAP\n\n![ROI feature importance](docs\u002Fassets\u002Froi_feature_importance.svg)\n\n**What it shows:** Figure 1 ranks the strongest drivers of predicted sponsor ROI, including brand heat, team strength, sponsor spend, ad exposure, sponsor-team fit, and commercial momentum.\n\n**Why it matters:** Sponsor value is driven by both football performance and attention dynamics, so ROI cannot be explained by match results alone.\n\n**Business takeaway:** Brands should evaluate team strength together with media exposure, fan attention, and sponsor-team fit before increasing campaign spend.\n\n### Sponsor 
ROI Ranking\n\n| Rank | Sponsor | Influence score | Connected nodes | Average edge weight |\n|---:|---|---:|---:|---:|\n| 1 | Hyundai | 1261.417 | 262 | 2.3534 |\n| 2 | Adidas | 1079.883 | 233 | 2.3074 |\n| 3 | Coca-Cola | 1046.330 | 235 | 2.2262 |\n| 4 | Visa | 1030.583 | 236 | 2.2021 |\n| 5 | Hisense | 787.907 | 185 | 2.1411 |\n\n**What it shows:** The sponsor ranking summarizes commercial network influence across team, player, sponsor, and match relationships.\n\n**Why it matters:** Sponsors with broader and stronger network positions are more likely to convert event attention into measurable commercial value.\n\n**Business takeaway:** Sponsorship planning should prioritize both spend level and network fit, not only brand size.\n\n### Scenario ROI Lift\n\n![Scenario ROI lift](docs\u002Fassets\u002Fscenario_ranking.svg)\n\n| Scenario | Average predicted ROI | Average ROI delta | Average ROI lift |\n|---|---:|---:|---:|\n| A_baseline | 3.818 | 0.000 | 0.000% |\n| B_core_player_absent | 3.738 | -0.080 | -2.074% |\n| C_sponsor_upgrade | 3.606 | -0.213 | -5.580% |\n| D_media_cooling | 3.608 | -0.210 | -5.521% |\n\n**What it shows:** Figure 2 compares baseline ROI with counterfactual scenarios such as player absence, sponsor activation change, and media cooling.\n\n**Why it matters:** Sponsorship ROI is sensitive to player availability and attention shocks.\n\n**Business takeaway:** Scenario planning should be part of sponsor budget allocation before tournament exposure peaks.\n\n### Prediction Interval \u002F Conformal Prediction\n\n![Prediction interval](docs\u002Fassets\u002Froi_uncertainty_intervals.svg)\n\n| Prediction target | Coverage rate | Average interval or set size | qhat |\n|---|---:|---:|---:|\n| Match prediction sets | 0.9021 | 2.3814 | 0.8110 |\n| ROI prediction intervals | 0.8454 | 0.4578 | 0.2289 |\n\n**What it shows:** Figure 3 shows prediction intervals and conformal coverage for match outcomes and ROI estimates.\n\n**Why it matters:** Decision 
makers need ranges and reliability estimates, not only point predictions.\n\n**Business takeaway:** Sponsors can use interval width and coverage as risk controls before approving higher spend.\n\n### Monte Carlo Risk Distribution\n\n| Risk signal | Current value | Decision use |\n|---|---:|---|\n| Average negative ROI probability | 0.0000 | Downside screen for sponsor scenarios. |\n| Average interval width | 0.4340 | Confidence band for ROI planning. |\n| Average Monte Carlo standard deviation | 0.1320 | Volatility signal under scenario perturbation. |\n| Medium-risk cases | 119 | Cases needing additional review. |\n| High-risk cases | 0 | Current generated panel has no high-risk cases. |\n\n**What it shows:** The risk summary combines bootstrap intervals, Monte Carlo perturbation, and variance-based risk scoring.\n\n**Why it matters:** ROI forecasts are more useful when the downside distribution is visible.\n\n**Business takeaway:** Sponsors should compare expected ROI with risk score and interval width before selecting a campaign scenario.\n\n### Text Signal Projection\n\n![Text signal projection](docs\u002Fassets\u002Ftext_embedding_map.svg)\n\n**What it shows:** Figure 4 projects real-source text signals from GDELT and Wikimedia into reduced dimensions for modeling.\n\n**Why it matters:** Media narratives and sponsor news can change commercial momentum before the match result is known.\n\n**Business takeaway:** Text evidence should be treated as an early signal for sponsor attention and campaign timing.\n\n### Sponsor-Team-Player Network\n\n![GNN relationship explanation](docs\u002Fassets\u002Fgnn_relationship_explainer.svg)\n\n| Network signal | Current value | Decision use |\n|---|---:|---|\n| Graph edges | 6112 | Relationship density across sports and sponsor entities. |\n| Graph nodes | 1394 | Scale of the commercial network. |\n| Top sponsor by influence | Hyundai | Current strongest sponsor-network position. 
|\n| Top sponsor influence score | 1261.417 | Comparable influence score for ranking. |\n\n**What it shows:** The graph layer connects sponsors, teams, players, and matches into a weighted heterogeneous network.\n\n**Why it matters:** Sponsorship effectiveness depends on how brand exposure, team context, player influence, and match stage pass information through the relationship network.\n\n**Business takeaway:** Network centrality and edge strength can help identify sponsors with stronger activation leverage and more resilient commercial pathways.\n\n## Problem\n\nSports sponsorship is a race against a moving attention market. A brand often invests before the tournament story is fully written, while the return depends on conditions that can change within hours:\n\n- Match importance and tournament stage.\n- Team strength and player availability.\n- Fan attention and media reposts.\n- Sponsor spend, ad exposure, brand heat, and brand fit.\n- Weather, venue, and home\u002Faway context.\n- News narratives and public sentiment.\n\nMost sports analytics projects stop at predicting who wins. WorldCupROI treats match probability as only one signal inside a broader sponsor ROI, risk, and recommendation system.\n\n## Why It Matters\n\nMajor tournaments compress global attention into a short decision window. Sponsors need to act before all information is known, and poor timing can turn a high-profile campaign into weak commercial return.\n\n| Audience | Value |\n|---|---|\n| Sports business analysts | Compare sponsors, teams, stages, and ROI risk. |\n| ML and data science reviewers | Inspect reproducible modeling, feature engineering, and uncertainty outputs. |\n| Researchers | Study how sports performance, media attention, sentiment, and sponsorship signals interact. 
|\n\nThe goal is to connect predictions to business decisions: what to sponsor, when to activate, where the upside is, and how much risk sits behind the headline ROI.\n\n## Key Innovations\n\n![Data flow](docs\u002Fassets\u002Fdata_flow.svg)\n\n| Innovation | Implementation |\n|---|---|\n| Multi-source data system | World Cup match records, GDELT article metadata, Wikimedia text, sponsor tables, and weather context. |\n| Multimodal text layer | 5,450 real-source text units -> hashed TF-IDF -> 24-dimensional reduced text features. |\n| Sponsorship feature store | FanScore, Sponsor Power Index, Media Exposure Index, and Commercial Momentum Score. |\n| Model stack | Match outcome classification, sponsor ROI regression, scenario simulation, and model registry. |\n| Explainability | SHAP-style contribution tables and ROI driver reports. |\n| Uncertainty quantification | Conformal prediction, bootstrap intervals, Monte Carlo risk, negative ROI probability, and risk score. |\n| Graph intelligence | Team-player-sponsor-match graph with sponsor and player commercial influence scores. |\n| Product workflow | Discover -> Explain -> Predict -> Simulate -> Recommend. |\n\n## Research Questions\n\n1. How much do match probability, team strength, and player availability affect sponsor ROI?\n2. Do sponsor spend and ad exposure matter more than fan attention and media narratives?\n3. Can real-source text signals improve commercial momentum analysis?\n4. Which scenarios create the strongest ROI lift under risk constraints?\n5. How can uncertainty intervals make sponsor decisions more defensible?\n6. What role can graph models play in team-player-sponsor-match relationships?\n\n## Dataset & Data Sources\n\n| Dataset | Role | Boundary |\n|---|---|---|\n| `data\u002Fraw\u002Finternational_results.csv` | Public international match records used to derive World Cup match history. | Historical public data. 
|\n| `data\u002Fraw\u002Fgdelt_worldcup_articles_deduped.json` | GDELT article metadata related to World Cup sponsorship and media. | Real-source text metadata. |\n| `data\u002Fraw\u002Fwikipedia_pages.json` | Wikimedia page text for tournament, marketing, and sponsor context. | Real-source reference text. |\n| `data\u002Freal_text_articles.csv` | 5,450 real-source text units and evidence windows. | Real-source text layer. |\n| `data\u002Ftext_embeddings_reduced.csv` | 24-dimensional reduced text features. | Reproducible derived features. |\n| `data\u002Fmodeling_dataset.csv` | Joined modeling table. | Feature-engineered analysis data. |\n| `data\u002Fpanel_dataset.csv` | Dashboard-ready panel data. | Dashboard and reporting layer. |\n| Sponsor spend and ROI fields | Commercial sponsor inputs and ROI targets. | Proxy\u002Fmock values where contract-level data is unavailable. |\n\nCommercial metrics such as exact sponsor spend are proxy-derived where public contract-level data is unavailable. 
These columns are documented so they can be replaced by licensed sponsor datasets or future API connectors.\n\n## Architecture\n\n```mermaid\nflowchart LR\n    A[\"Historical matches\u003Cbr\u002F>1930-2022\"] --> F[\"Unified Feature Store\"]\n    B[\"2026 schedule\u003Cbr\u002F>stage + venue\"] --> F\n    C[\"Sponsors\u003Cbr\u002F>spend + exposure\"] --> F\n    D[\"Players + coaches\u003Cbr\u002F>ability + experience\"] --> F\n    E[\"Real-source text\u003Cbr\u002F>GDELT + Wikimedia\"] --> F\n    W[\"Weather + home\u002Faway\u003Cbr\u002F>context\"] --> F\n\n    F --> M[\"Match Outcome Model\u003Cbr\u002F>win\u002Fdraw\u002Floss probability\"]\n    F --> R[\"Sponsor ROI Model\u003Cbr\u002F>commercial return regression\"]\n    F --> G[\"Graph Intelligence\u003Cbr\u002F>team-player-sponsor-match network\"]\n\n    M --> C1[\"Conformal Prediction\u003Cbr\u002F>coverage + prediction sets\"]\n    R --> U[\"Uncertainty Engine\u003Cbr\u002F>bootstrap + Monte Carlo\"]\n    G --> I[\"Influence Scores\u003Cbr\u002F>sponsor + player centrality\"]\n\n    C1 --> S[\"Insight Generator\"]\n    U --> S\n    I --> S\n    S --> D1[\"Dashboard\u003Cbr\u002F>Discover -> Explain -> Predict -> Simulate -> Recommend\"]\n    S --> D2[\"Reports\u003Cbr\u002F>Markdown + PDF + CSV\"]\n```\n\nThis flow is the spine of the platform: data enters once, features are reused across models, and every prediction is routed through explanation, uncertainty, and business reporting before it reaches the dashboard.\n\n### Algorithm Upgrade Structure\n\nWorldCupROI now documents the algorithm system as four connected layers instead of isolated scripts. 
The current implementation keeps lightweight fallback models runnable, while making the production upgrade path explicit.\n\n| Layer | Current method | Output | Upgrade path |\n|---|---|---|---|\n| Match Outcome Layer | CentroidOutcomeModel with deterministic split | win\u002Fdraw\u002Floss probability, feature importance, conformal set | calibrated logistic regression, LightGBM multiclass, XGBoost multi-class |\n| Sponsor ROI Layer | Standardized RidgeROIModel | predicted ROI, ROI lift, ROI driver ranking, interval estimate | ElasticNet, LightGBMRegressor, XGBoostRegressor, stacked tabular ensemble |\n| Risk & Recommendation Layer | bootstrap, Monte Carlo, conformal intervals | negative ROI probability, scenario ranking, lift-risk recommendation | ensemble variance, Bayesian optimization, portfolio allocation |\n| Relationship Intelligence Layer | weighted heterogeneous graph centrality | sponsor influence, player\u002Fteam\u002Fsponsor graph metrics | GraphSAGE, heterogeneous GNN, temporal graph model |\n\nThe structured algorithm manifest is generated at:\n\n```text\nreports\u002Falgorithm_manifest.json\nreports\u002Falgorithm_strategy.md\n```\n\nEach trained fallback model now also writes a model card:\n\n```text\nreports\u002Fmatch_outcome_model_card.md\nreports\u002Fsponsor_roi_model_card.md\n```\n\nThis makes the repository easier to review as a research project: model target, feature count, metrics, artifact path, random seed, and upgrade notes are tracked as explicit artifacts.\n\n![Architecture diagram](docs\u002Fassets\u002Farchitecture.svg)\n\n**What it shows:** Figure 5 summarizes the platform architecture from data sources to features, models, uncertainty, report generation, and dashboard delivery.\n\n**Why it matters:** The system is designed as a reproducible analytics platform rather than a one-off notebook.\n\n**Business takeaway:** Sponsors can trace a recommendation back to data, features, models, and risk logic.\n\n![Model 
architecture](docs\u002Fassets\u002Fmodel_pipeline.svg)\n\n**What it shows:** Figure 6 shows the modeling pipeline for match prediction, ROI prediction, uncertainty, and scenario analysis.\n\n**Why it matters:** Separating match outcome modeling from sponsor ROI modeling keeps the business target clear.\n\n**Business takeaway:** Match probability becomes one commercial input rather than the final product.\n\n![Decision flow](docs\u002Fassets\u002Fdecision_workflow.svg)\n\n```mermaid\nflowchart LR\n    D[\"Discover\u003Cbr\u002F>select team, sponsor, stage\"] --> E[\"Explain\u003Cbr\u002F>inspect ROI drivers\"]\n    E --> P[\"Predict\u003Cbr\u002F>match + ROI forecast\"]\n    P --> S[\"Simulate\u003Cbr\u002F>spend, exposure, player status\"]\n    S --> R[\"Recommend\u003Cbr\u002F>lift-risk tradeoff\"]\n```\n\n**What it shows:** Figure 7 maps dashboard use to the business workflow Discover -> Explain -> Predict -> Simulate -> Recommend.\n\n**Why it matters:** Each module answers a decision question instead of presenting disconnected charts.\n\n**Business takeaway:** The dashboard supports repeated sponsor planning, not only static reporting.\n\n## Dashboard Gallery\n\nThe dashboard is structured around a business decision sequence rather than a loose chart collection. Each screen is designed to answer one sponsor question, then hand the user to the next decision.\n\n| Dashboard module | Main interaction | Decision value |\n|---|---|---|\n| Overview | KPI cards, ROI ranking, FanScore summary | Identify the strongest commercial opportunities quickly. |\n| Scenario simulation | Sponsor spend, media exposure, player status controls | See ROI move as strategy assumptions change. |\n| Risk analysis | Intervals, Monte Carlo distribution, negative ROI probability | Separate attractive upside from fragile forecasts. |\n| Network analysis | Sponsor-team-player graph and centrality ranking | Find brands and players with stronger activation leverage. 
|\n\n| Preview | GIF |\n|---|---|\n| Scenario simulation | ![Scenario simulation](assets\u002Fgifs\u002Fscenario_simulation.gif) |\n| Risk analysis | ![Risk uncertainty](assets\u002Fgifs\u002Frisk_uncertainty.gif) |\n| Network analysis | ![Sponsor network graph](assets\u002Fgifs\u002Fnetwork_graph.gif) |\n\n### Static Platform Workflow\n\nThe main GIF above is captured directly from `dashboard\u002Fpanel_dashboard.html`, so it matches the actual static platform rather than a separate mock animation. It highlights the decision flow: Discover -> Explain -> Predict -> Simulate -> Recommend.\n\nGenerated showcase files are indexed in [docs\u002Fproject_artifacts.md](docs\u002Fproject_artifacts.md), including GIF previews, background images, and regeneration commands.\n\n| Workflow step | Question answered | Output |\n|---|---|---|\n| Discover | Which teams, sponsors, stages, and years are being compared? | Filtered sponsor and match context. |\n| Explain | Which features drive ROI and attention? | ROI drivers, FanScore, SHAP-style ranking. |\n| Predict | What are the expected match and sponsorship outcomes? | Win\u002Fdraw\u002Floss probability and ROI estimate. |\n| Simulate | How does ROI shift under sponsor, player, weather, and stage changes? | Counterfactual ROI lift and risk movement. |\n| Recommend | Which scenario has the best lift-risk tradeoff? | Strategy ranking and business recommendation. 
|\n\nStatic dashboard:\n\n```text\ndashboard\u002Fpanel_dashboard.html\n```\n\nStreamlit dashboard:\n\n```bash\nmake dashboard\n```\n\n## Installation\n\nClone the repository and create a virtual environment:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002F2417467487-hub\u002FWorldCupROI.git\ncd WorldCupROI\npython -m venv .venv\n```\n\nWindows PowerShell:\n\n```powershell\n.venv\\Scripts\\activate\npip install -r requirements.txt\npython scripts\u002Frun_pipeline.py\n```\n\nmacOS:\n\n```bash\nsource .venv\u002Fbin\u002Factivate\npip install -r requirements.txt\npython scripts\u002Frun_pipeline.py\n```\n\nLinux:\n\n```bash\nsource .venv\u002Fbin\u002Factivate\npip install -r requirements.txt\npython scripts\u002Frun_pipeline.py\n```\n\nDirect pipeline entrypoint:\n\n```bash\npython src\u002Fpipeline.py\n```\n\nMakefile shortcuts:\n\n```bash\nmake pipeline\nmake dashboard\nmake assets\n```\n\nDocker:\n\n```bash\ndocker build -t worldcuproi .\ndocker run --rm -p 8501:8501 worldcuproi\n```\n\nEngineering reproducibility:\n\n| Component | Role |\n|---|---|\n| `src\u002Fpipeline.py` | End-to-end reproducible analytics pipeline. |\n| `src\u002Falgorithm_strategy.py` | Algorithm layers, feature groups, model cards, and upgrade manifest. |\n| `src\u002Fplatform_health.py` | Checks required data, model, report, dashboard, and media artifacts. |\n| `.github\u002Fworkflows\u002Fci.yml` | GitHub Actions validation. |\n| `Dockerfile` | Containerized execution. |\n| `config\u002Fpipeline.yaml` | Pipeline configuration and output tracking. 
|\n\nPlatform health check:\n\n```bash\npython src\u002Fplatform_health.py\n```\n\nor:\n\n```bash\nmake health\n```\n\nLatest generated health artifacts:\n\n```text\nreports\u002Fplatform_health.json\nreports\u002Fplatform_health.md\nreports\u002Fplatform_health.csv\n```\n\n## Contributions\n\n### Academic Contribution\n\n- Frames sponsorship ROI as a multi-signal modeling problem rather than a post-event descriptive metric.\n- Combines sports analytics, media text signals, business features, uncertainty analysis, and graph intelligence.\n- Provides a reproducible research scaffold for studying fan attention, sponsor exposure, and commercial return.\n- Documents future extensions for GNN sponsor networks, conformal prediction, SHAP explanations, and generated business reports.\n\n### Engineering Contribution\n\n- Provides a one-command pipeline and modular source structure.\n- Adds model registry, explainability, uncertainty, conformal prediction, graph analysis, and generated reporting modules.\n- Includes Docker and GitHub Actions for reproducible execution.\n- Produces dashboard-ready data, reports, visual assets, and PDF output.\n\n### Business Contribution\n\n- Helps compare sponsorship strategies before or during tournament windows.\n- Gives executives risk-aware ROI estimates rather than only point predictions.\n- Supports scenario planning for media exposure, player availability, weather, and stage premium.\n- Turns sports performance and media attention into sponsor ROI decision support.\n\n## Roadmap\n\n| Version | Product direction | Planned capability |\n|---|---|---|\n| v1 | Match Prediction | Improve calibrated win\u002Fdraw\u002Floss forecasting and historical validation. |\n| v2 | Sponsor ROI Modeling | Expand sponsor spend, exposure, and conversion features. |\n| v3 | Graph Intelligence | Add Team-Player-Sponsor-Match graph modeling with GNN baselines. 
|\n| v4 | Uncertainty-Aware Forecasting | Strengthen conformal coverage, bootstrap intervals, and risk dashboards. |\n| v5 | LLM Sponsorship Analyst | Generate sponsor briefs, scenario explanations, and executive reports. |\n| v6 | Real-Time Sports Intelligence Platform | Connect live APIs for weather, media, social attention, injuries, and campaign monitoring. |\n",2,"2026-06-11 04:10:26","CREATED_QUERY"]