[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80140":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":11,"contributorsCount":11,"subscribersCount":11,"size":11,"stars1d":11,"stars7d":11,"stars30d":13,"stars90d":11,"forks30d":11,"starsTrendScore":11,"compositeScore":14,"rankGlobal":8,"rankLanguage":8,"license":8,"archived":15,"fork":15,"defaultBranch":16,"hasWiki":17,"hasPages":15,"topics":18,"createdAt":8,"pushedAt":8,"updatedAt":19,"readmeContent":20,"aiSummary":21,"trendingCount":11,"starSnapshotCount":11,"syncStatus":22,"lastSyncTime":23,"discoverSource":24},80140,"churn_retention_system","jaideep005\u002Fchurn_retention_system","jaideep005",null,"Python",56,0,53,3,34.3,false,"main",true,[],"2026-06-12 04:01:26","# 🔮 ChurnGuard AI — Customer Churn Prediction & Retention Strategy System\n\n![Python](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.9%2B-blue?logo=python)\n![Streamlit](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FStreamlit-1.32-red?logo=streamlit)\n![Scikit-Learn](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FScikit--Learn-1.4.1-orange?logo=scikit-learn)\n![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-green)\n\nAn end-to-end **Customer Churn Prediction and Retention Intelligence** web application built with Streamlit, Scikit-Learn, and LLM integration (OpenAI \u002F Gemini).\n\n---\n\n## 📌 Features\n\n| Module | Description |\n|--------|-------------|\n| 📥 **Data Ingestion** | Fetches real-time customer data from the IBM Telco public dataset or any custom API endpoint |\n| 🔧 **ETL Pipeline** | Automated data cleaning, encoding, feature engineering (LTV, engagement score, NPS segments, etc.) 
and Min-Max scaling |\n| 🤖 **ML Models** | Trains and evaluates **Logistic Regression** and **Random Forest** classifiers with 5-fold cross-validation |\n| 📊 **Dashboard** | Interactive KPI cards, churn probability distribution, risk segmentation pie chart, and revenue analysis |\n| 💡 **Retention Strategies** | AI-generated personalised retention playbooks per customer using **OpenAI GPT** or **Google Gemini** (with a rich mock fallback) |\n| 📤 **Power BI Export** | One-click CSV export with predictions and risk segments ready for Power BI dashboards |\n\n---\n\n## 🏗️ Project Structure\n\n```\nchurn_retention_system\u002F\n│\n├── app.py                        # Main Streamlit UI (5-tab interface)\n├── requirements.txt              # Python dependencies\n├── .gitignore\n├── data\u002F\n│   └── power_bi_export.csv       # Output file for Power BI\n│\n└── modules\u002F\n    ├── __init__.py               # Package exports\n    ├── scraper.py                # Real-time data fetcher (IBM Telco + custom API)\n    ├── etl_pipeline.py           # Data cleaning & feature engineering\n    ├── ml_models.py              # Random Forest & Logistic Regression\n    └── llm_strategy.py           # OpenAI \u002F Gemini \u002F Mock strategy generator\n```\n\n---\n\n## 🚀 Getting Started\n\n### 1. Clone the Repository\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fjaideep005\u002Fchurn_retention_system.git\ncd churn_retention_system\n```\n\n### 2. Install Dependencies\n\n```bash\npip install -r requirements.txt\n```\n\n### 3. Set Up Environment Variables (Optional)\n\nCreate a `.env` file in the project root for LLM API keys:\n\n```env\nGEMINI_API_KEY=your_gemini_api_key_here\nOPENAI_API_KEY=your_openai_api_key_here\n```\n\n> **Note:** If no API key is provided, the app automatically uses the built-in mock strategy generator.\n\n### 4. 
Run the App\n\n```bash\nstreamlit run app.py\n```\n\nThe app will open at `http:\u002F\u002Flocalhost:8501` in your browser.\n\n---\n\n## 🔄 App Workflow\n\n```\n1. Data Ingestion  →  Fetch real IBM Telco data (or upload CSV)\n2. ETL Pipeline    →  Clean, encode, and engineer features\n3. ML Models       →  Train Logistic Regression + Random Forest\n4. Dashboard       →  View KPIs, risk segments, revenue analysis\n5. Strategies      →  Generate AI-powered retention plans per customer\n```\n\n---\n\n## 🧠 Machine Learning\n\nTwo classifiers are trained and compared:\n\n| Model | Key Hyperparameters |\n|-------|---------------------|\n| **Logistic Regression** | `C=1.0`, `class_weight=balanced`, `max_iter=1000` |\n| **Random Forest** | `n_estimators=300`, `max_depth=12`, `class_weight=balanced` |\n\nBoth models are evaluated on:\n- Accuracy, Precision, Recall, F1 Score, ROC-AUC\n- 5-Fold Stratified Cross-Validation\n\nThe **best model** (by ROC-AUC) is automatically selected to generate full-dataset predictions.\n\n---\n\n## 🌐 Real-Time Data Source\n\nBy default, the scraper pulls from the **IBM Telco Customer Churn** public dataset:\n\n```\nhttps:\u002F\u002Fraw.githubusercontent.com\u002FIBM\u002Ftelco-customer-churn-on-icp4d\u002Fmaster\u002Fdata\u002FTelco-Customer-Churn.csv\n```\n\nTo use a **custom API endpoint**, pass your own URL and column mapping:\n\n```python\nfrom modules.scraper import CompanyDataScraper\n\nscraper = CompanyDataScraper(\n    source_url=\"https:\u002F\u002Fyour-api.com\u002Fcustomers\",\n    column_map={\n        \"TimeWithCompany\": \"tenure_months\",\n        \"MonthlySpend\":    \"monthly_revenue\",\n        \"DidTheyLeave\":    \"churn_raw\",\n    },\n    request_headers={\"Authorization\": \"Bearer YOUR_TOKEN\"},\n)\ndf = scraper.fetch()\n```\n\n---\n\n## 💡 LLM Retention Strategies\n\nSupports three modes:\n\n| Mode | Description |\n|------|-------------|\n| `mock` | Rich rule-based template (default, no API key needed) |\n| `gemini` | 
Google Gemini 1.5 Flash |\n| `openai` | OpenAI GPT-3.5-Turbo |\n\n---\n\n## 📤 Power BI Integration\n\nAfter training, click **\"Export CSV for Power BI\"** in the Dashboard tab. The exported file includes:\n\n- All original customer columns\n- `churn_probability` (float 0–1)\n- `predicted_churn` (0 or 1)\n- `risk_segment` (Low \u002F Medium \u002F High Risk)\n- `ltv_24mo` (estimated 24-month lifetime value)\n\n---\n\n## 📦 Dependencies\n\n```\nstreamlit==1.32.0\npandas==2.2.1\nnumpy==1.26.4\nscikit-learn==1.4.1\nplotly==5.20.0\nopenai==1.14.3\ngoogle-generativeai==0.4.1\npython-dotenv==1.0.1\nfaker==24.2.0\nrequests==2.31.0\njoblib==1.3.2\n```\n\n---\n\n## 👨‍💻 Author\n\n**Jaideep** — [@jaideep005](https:\u002F\u002Fgithub.com\u002Fjaideep005)\n\n---\n\n## 📄 License\n\nThis project is licensed under the **MIT License**.\n","ChurnGuard AI is a customer churn prediction and retention strategy system. It is written in Python and combines Streamlit for the front end, Scikit-Learn for training machine learning models, and an LLM (such as OpenAI GPT or Google Gemini) for generating personalized retention recommendations. The project can fetch real-time customer data from the IBM Telco public dataset or a custom API endpoint, and it cleans the data and engineers features through an automated ETL pipeline. The system then trains and evaluates logistic regression and random forest classifiers to predict churn probability. The app also provides an interactive dashboard showing key performance indicators and the churn risk distribution, and it can export a CSV file in one click for further analysis in Power BI. It suits businesses that need to improve customer retention, particularly in telecom and finance.",2,"2026-06-11 03:59:24","CREATED_QUERY"]