[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-2377":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},2377,"localGPT","PromtEngineer\u002FlocalGPT","PromtEngineer","Chat with your documents on your local device using GPT models. No data leaves your device and 100% private. ",null,"Python",22216,2483,179,20,0,1,7,23,4,45,"MIT License",false,"main",true,[],"2026-06-12 02:00:40","# LocalGPT - Private Document Intelligence Platform\n\n\u003Cdiv align=\"center\">\n\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F2947\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Ftrendshift.io\u002Fapi\u002Fbadge\u002Frepositories\u002F2947\" alt=\"PromtEngineer%2FlocalGPT | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n[![GitHub Stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002FPromtEngineer\u002FlocalGPT?style=flat-square)](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fstargazers)\n[![GitHub Forks](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fforks\u002FPromtEngineer\u002FlocalGPT?style=flat-square)](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fnetwork\u002Fmembers)\n[![GitHub 
Issues](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues\u002FPromtEngineer\u002FlocalGPT?style=flat-square)](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fissues)\n[![GitHub Pull Requests](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fissues-pr\u002FPromtEngineer\u002FlocalGPT?style=flat-square)](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fpulls)\n[![Python 3.8+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.8+-blue.svg?style=flat-square)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-green.svg?style=flat-square)](LICENSE)\n[![Docker](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdocker-supported-blue.svg?style=flat-square)](https:\u002F\u002Fwww.docker.com\u002F)\n\n\u003Cp align=\"center\">\n    \u003Ca href=\"https:\u002F\u002Fx.com\u002Fengineerrprompt\">\n      \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FFollow%20on%20X-000000?style=for-the-badge&logo=x&logoColor=white\" alt=\"Follow on X\" \u002F>\n    \u003C\u002Fa>\n    \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002FtUDWAFGc\">\n      \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FJoin%20our%20Discord-5865F2?style=for-the-badge&logo=discord&logoColor=white\" alt=\"Join our Discord\" \u002F>\n    \u003C\u002Fa>\n  \u003C\u002Fp>\n\u003C\u002Fdiv>\n\n## 🚀 What is LocalGPT?\n\nLocalGPT is a **fully private, on-premise Document Intelligence platform**. Ask questions, summarise, and uncover insights from your files with state-of-the-art AI—no data ever leaves your machine.\n\nMore than a traditional RAG (Retrieval-Augmented Generation) tool, LocalGPT features a **hybrid search engine** that blends semantic similarity, keyword matching, and [Late Chunking](https:\u002F\u002Fjina.ai\u002Fnews\u002Flate-chunking-in-long-context-embedding-models\u002F) for long-context precision. 
A **smart router** automatically selects between RAG and direct LLM answering for every query, while **contextual enrichment** and sentence-level [Context Pruning](https:\u002F\u002Fhuggingface.co\u002Fnaver\u002Fprovence-reranker-debertav3-v1) surface only the most relevant content. An independent **verification** pass adds an extra layer of accuracy.\n\nThe architecture is **modular and lightweight**: enable only the components you need. With a pure-Python RAG core and minimal dependencies on frameworks and libraries, LocalGPT is simple to deploy, run, and maintain on any infrastructure.\n\n## ▶️ Video\nWatch this [video](https:\u002F\u002Fyoutu.be\u002FJTbtGH3secI) to get started with LocalGPT. \n\n| Home | Create Index | Chat |\n|------|--------------|------|\n| ![](Documentation\u002Fimages\u002FHome.png) | ![](Documentation\u002Fimages\u002FIndex%20Creation.png) | ![](Documentation\u002Fimages\u002FRetrieval%20Process.png) |\n\n## ✨ Features\n\n- **Utmost Privacy**: Your data remains on your computer, ensuring 100% security.\n- **Versatile Model Support**: Seamlessly integrate a variety of open-source models via Ollama.\n- **Diverse Embeddings**: Choose from a range of open-source embeddings.\n- **Reuse Your LLM**: Once downloaded, reuse your LLM without the need for repeated downloads.\n- **Chat History**: Remembers your previous conversations (in a session).\n- **API**: LocalGPT has an API that you can use for building RAG Applications.\n- **GPU, CPU, HPU & MPS Support**: Supports multiple platforms out of the box. Chat with your data using `CUDA`, `CPU`, `HPU (Intel® Gaudi®)` or `MPS` and more!\n\n### 📖 Document Processing\n- **Multi-format Support**: PDF, DOCX, TXT, Markdown, and more (currently only PDF is supported)\n- **Contextual Enrichment**: Enhanced document understanding with AI-generated context, 
inspired by [Contextual Retrieval](https:\u002F\u002Fwww.anthropic.com\u002Fnews\u002Fcontextual-retrieval)\n- **Batch Processing**: Handle multiple documents simultaneously\n\n### 🤖 AI-Powered Chat\n- **Natural Language Queries**: Ask questions in plain English\n- **Source Attribution**: Every answer includes document references\n- **Smart Routing**: Automatically chooses between RAG and direct LLM responses\n- **Query Decomposition**: Breaks complex queries into sub-questions for better answers\n- **Semantic Caching**: TTL-based caching with similarity matching for faster responses\n- **Session-Aware History**: Maintains conversation context across interactions\n- **Answer Verification**: Independent verification pass for accuracy\n- **Multiple AI Models**: Ollama for inference, HuggingFace for embeddings and reranking\n\n\n### 🛠️ Developer-Friendly\n- **RESTful APIs**: Complete API access for integration\n- **Real-time Progress**: Live updates during document processing\n- **Flexible Configuration**: Customize models, chunk sizes, and search parameters\n- **Extensible Architecture**: Plugin system for custom components\n\n### 🎨 Modern Interface\n- **Intuitive Web UI**: Clean, responsive design\n- **Session Management**: Organize conversations by topic\n- **Index Management**: Easy document collection management\n- **Real-time Chat**: Streaming responses for immediate feedback\n\n---\n\n## 🚀 Quick Start\n\nNote: The installation is currently only tested on macOS. 
\n\n### Prerequisites\n- Python 3.8 or higher (tested with Python 3.11.5)\n- Node.js 16+ and npm (tested with Node.js 23.10.0, npm 10.9.2)\n- Docker (optional, for containerized deployment)\n- 8GB+ RAM (16GB+ recommended)\n- Ollama (required for both deployment approaches)\n\n### ***NOTE***\nBefore this branch is merged into the main branch, please clone this branch for installation:\n\n```bash\ngit clone -b localgpt-v2 https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n```\n\n### Option 1: Docker Deployment \n\n```bash\n# Clone the repository\ngit clone https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n\n# Install Ollama locally (required even for Docker)\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\nollama pull qwen3:0.6b\nollama pull qwen3:8b\n\n# Start Ollama\nollama serve\n\n# Start with Docker (in a new terminal)\n.\u002Fstart-docker.sh\n\n# Access the application\nopen http:\u002F\u002Flocalhost:3000\n```\n\n**Docker Management Commands:**\n```bash\n# Check container status\ndocker compose ps\n\n# View logs\ndocker compose logs -f\n\n# Stop containers\n.\u002Fstart-docker.sh stop\n```\n\n### Option 2: Direct Development (Recommended for Development)\n\n```bash\n# Clone the repository\ngit clone https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n\n# Install Python dependencies\npip install -r requirements.txt\n\n# Key dependencies installed:\n# - torch==2.4.1, transformers==4.51.0 (AI models)\n# - lancedb (vector database)\n# - rank_bm25, fuzzywuzzy (search algorithms)\n# - sentence_transformers, rerankers (embedding\u002Freranking)\n# - docling (document processing)\n# - colpali-engine (multimodal processing - support coming soon)\n\n# Install Node.js dependencies\nnpm install\n\n# Install and start Ollama\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\nollama pull qwen3:0.6b\nollama pull qwen3:8b\nollama serve\n\n# Start the system (in a new 
terminal)\npython run_system.py\n\n# Access the application\nopen http:\u002F\u002Flocalhost:3000\n```\n\n**System Management:**\n```bash\n# Check system health (comprehensive diagnostics)\npython system_health_check.py\n\n# Check service status and health\npython run_system.py --health\n\n# Start in production mode\npython run_system.py --mode prod\n\n# Skip frontend (backend + RAG API only)\npython run_system.py --no-frontend\n\n# View aggregated logs\npython run_system.py --logs-only\n\n# Stop all services\npython run_system.py --stop\n# Or press Ctrl+C in the terminal running python run_system.py\n```\n\n**Service Architecture:**\nThe `run_system.py` launcher manages four key services:\n- **Ollama Server** (port 11434): AI model serving\n- **RAG API Server** (port 8001): Document processing and retrieval\n- **Backend Server** (port 8000): Session management and API endpoints\n- **Frontend Server** (port 3000): React\u002FNext.js web interface\n\n### Option 3: Manual Component Startup\n\n```bash\n# Terminal 1: Start Ollama\nollama serve\n\n# Terminal 2: Start RAG API\npython -m rag_system.api_server\n\n# Terminal 3: Start Backend\ncd backend && python server.py\n\n# Terminal 4: Start Frontend\nnpm run dev\n\n# Access at http:\u002F\u002Flocalhost:3000\n```\n\n---\n\n### Detailed Installation\n\n#### 1. Install System Dependencies\n\n**Ubuntu\u002FDebian:**\n```bash\nsudo apt update\nsudo apt install python3.8 python3-pip nodejs npm docker.io docker-compose\n```\n\n**macOS:**\n```bash\nbrew install python@3.8 node npm docker docker-compose\n```\n\n**Windows:**\n```bash\n# Install Python 3.8+, Node.js, and Docker Desktop\n# Then use PowerShell or WSL2\n```\n\n#### 2. 
Install AI Models\n\n**Install Ollama (Recommended):**\n```bash\n# Install Ollama\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\n\n# Pull recommended models\nollama pull qwen3:0.6b          # Fast generation model\nollama pull qwen3:8b            # High-quality generation model\n```\n\n#### 3. Configure Environment\n\n```bash\n# Copy environment template\ncp .env.example .env\n\n# Edit configuration\nnano .env\n```\n\n**Key Configuration Options:**\n```env\n# AI Models (referenced in rag_system\u002Fmain.py)\nOLLAMA_HOST=http:\u002F\u002Flocalhost:11434\n\n# Database Paths (used by backend and RAG system)\nDATABASE_PATH=.\u002Fbackend\u002Fchat_data.db\nVECTOR_DB_PATH=.\u002Flancedb\n\n# Server Settings (used by run_system.py)\nBACKEND_PORT=8000\nFRONTEND_PORT=3000\nRAG_API_PORT=8001\n\n# Optional: Override default models\nGENERATION_MODEL=qwen3:8b\nENRICHMENT_MODEL=qwen3:0.6b\nEMBEDDING_MODEL=Qwen\u002FQwen3-Embedding-0.6B\nRERANKER_MODEL=answerdotai\u002Fanswerai-colbert-small-v1\n```\n\n#### 4. Initialize the System\n\n```bash\n# Run system health check\npython system_health_check.py\n\n# Initialize databases\npython -c \"from backend.database import ChatDatabase; ChatDatabase().init_database()\"\n\n# Test installation\npython -c \"from rag_system.main import get_agent; print('✅ Installation successful!')\"\n\n# Validate complete setup\npython run_system.py --health\n```\n\n---\n\n## 🎯 Getting Started\n\n### 1. Create Your First Index\n\nAn **index** is a collection of processed documents that you can chat with.\n\n#### Using the Web Interface:\n1. Open http:\u002F\u002Flocalhost:3000\n2. Click \"Create New Index\"\n3. Upload your documents (PDF, DOCX, TXT)\n4. Configure processing options\n5. 
Click \"Build Index\"\n\n#### Using Scripts:\n```bash\n# Simple script approach\n.\u002Fsimple_create_index.sh \"My Documents\" \"path\u002Fto\u002Fdocument.pdf\"\n\n# Interactive script\npython create_index_script.py\n```\n\n#### Using API:\n```bash\n# Create index\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Findexes \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\"name\": \"My Index\", \"description\": \"My documents\"}'\n\n# Upload documents\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Findexes\u002FINDEX_ID\u002Fupload \\\n  -F \"files=@document.pdf\"\n\n# Build index\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Findexes\u002FINDEX_ID\u002Fbuild\n```\n\n### 2. Start Chatting\n\nOnce your index is built:\n\n1. **Create a Chat Session**: Click \"New Chat\" or use an existing session\n2. **Select Your Index**: Choose which document collection to query\n3. **Ask Questions**: Type natural language questions about your documents\n4. **Get Answers**: Receive AI-generated responses with source citations\n\n### 3. 
Advanced Features\n\n#### Custom Model Configuration\n```bash\n# Use different models for different tasks\ncurl -X POST http:\u002F\u002Flocalhost:8000\u002Fsessions \\\n  -H \"Content-Type: application\u002Fjson\" \\\n  -d '{\n    \"title\": \"High Quality Session\",\n    \"model\": \"qwen3:8b\",\n    \"embedding_model\": \"Qwen\u002FQwen3-Embedding-4B\"\n  }'\n```\n\n#### Batch Document Processing\n```bash\n# Process multiple documents at once\npython demo_batch_indexing.py --config batch_indexing_config.json\n```\n\n#### API Integration\n```python\nimport requests\n\n# Chat with your documents via API\nresponse = requests.post('http:\u002F\u002Flocalhost:8000\u002Fchat', json={\n    'query': 'What are the key findings in the research papers?',\n    'session_id': 'your-session-id',\n    'search_type': 'hybrid',\n    'retrieval_k': 20\n})\n\nprint(response.json()['response'])\n```\n\n---\n\n## 🔧 Configuration\n\n### Model Configuration\n\nLocalGPT supports multiple AI model providers with centralized configuration:\n\n#### Ollama Models (Local Inference)\n```python\nOLLAMA_CONFIG = {\n    \"host\": \"http:\u002F\u002Flocalhost:11434\",\n    \"generation_model\": \"qwen3:8b\",        # Main text generation\n    \"enrichment_model\": \"qwen3:0.6b\"       # Lightweight routing\u002Fenrichment\n}\n```\n\n#### External Models (HuggingFace Direct)\n```python\nEXTERNAL_MODELS = {\n    \"embedding_model\": \"Qwen\u002FQwen3-Embedding-0.6B\",           # 1024 dimensions\n    \"reranker_model\": \"answerdotai\u002Fanswerai-colbert-small-v1\", # ColBERT reranker\n    \"fallback_reranker\": \"BAAI\u002Fbge-reranker-base\"             # Backup reranker\n}\n```\n\n### Pipeline Configuration\n\nLocalGPT offers two main pipeline configurations:\n\n#### Default Pipeline (Production-Ready)\n```python\n\"default\": {\n    \"description\": \"Production-ready pipeline with hybrid search, AI reranking, and verification\",\n    \"storage\": {\n        \"lancedb_uri\": 
\".\u002Flancedb\",\n        \"text_table_name\": \"text_pages_v3\",\n        \"bm25_path\": \".\u002Findex_store\u002Fbm25\"\n    },\n    \"retrieval\": {\n        \"retriever\": \"multivector\",\n        \"search_type\": \"hybrid\",\n        \"late_chunking\": {\"enabled\": True},\n        \"dense\": {\"enabled\": True, \"weight\": 0.7},\n        \"bm25\": {\"enabled\": True}\n    },\n    \"reranker\": {\n        \"enabled\": True,\n        \"type\": \"ai\",\n        \"strategy\": \"rerankers-lib\",\n        \"model_name\": \"answerdotai\u002Fanswerai-colbert-small-v1\",\n        \"top_k\": 10\n    },\n    \"query_decomposition\": {\"enabled\": True, \"max_sub_queries\": 3},\n    \"verification\": {\"enabled\": True},\n    \"retrieval_k\": 20,\n    \"contextual_enricher\": {\"enabled\": True, \"window_size\": 1}\n}\n```\n\n#### Fast Pipeline (Speed-Optimized)\n```python\n\"fast\": {\n    \"description\": \"Speed-optimized pipeline with minimal overhead\",\n    \"retrieval\": {\n        \"search_type\": \"vector_only\",\n        \"late_chunking\": {\"enabled\": False}\n    },\n    \"reranker\": {\"enabled\": False},\n    \"query_decomposition\": {\"enabled\": False},\n    \"verification\": {\"enabled\": False},\n    \"retrieval_k\": 10,\n    \"contextual_enricher\": {\"enabled\": False}\n}\n```\n\n### Search Configuration\n\n```python\nSEARCH_CONFIG = {\n    'hybrid': {\n        'dense_weight': 0.7,\n        'sparse_weight': 0.3,\n        'retrieval_k': 20,\n        'reranker_top_k': 10\n    }\n}\n```\n---\n\n## 🛠️ Troubleshooting\n\n### Common Issues\n\n#### Installation Problems\n```bash\n# Check Python version\npython --version  # Should be 3.8+\n\n# Check dependencies\npip list | grep -E \"(torch|transformers|lancedb)\"\n\n# Reinstall dependencies\npip install -r requirements.txt --force-reinstall\n```\n\n#### Model Loading Issues\n```bash\n# Check Ollama status\nollama list\ncurl http:\u002F\u002Flocalhost:11434\u002Fapi\u002Ftags\n\n# Pull missing 
models\nollama pull qwen3:0.6b\n```\n\n#### Database Issues\n```bash\n# Check database connectivity\npython -c \"from backend.database import ChatDatabase; db = ChatDatabase(); print('✅ Database OK')\"\n\n# Reset database (WARNING: This deletes all data)\nrm backend\u002Fchat_data.db\npython -c \"from backend.database import ChatDatabase; ChatDatabase().init_database()\"\n```\n\n#### Performance Issues\n```bash\n# Check system resources\npython system_health_check.py\n\n# Monitor memory usage\nhtop  # or Task Manager on Windows\n\n# Optimize for low-memory systems\nexport PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512\n```\n\n### Getting Help\n\n1. **Check Logs**: The system creates structured logs in the `logs\u002F` directory:\n   - `logs\u002Fsystem.log`: Main system events and errors\n   - `logs\u002Follama.log`: Ollama server logs\n   - `logs\u002Frag-api.log`: RAG API processing logs\n   - `logs\u002Fbackend.log`: Backend server logs\n   - `logs\u002Ffrontend.log`: Frontend build and runtime logs\n\n2. **System Health**: Run comprehensive diagnostics:\n   ```bash\n   python system_health_check.py  # Full system diagnostics\n   python run_system.py --health  # Service status check\n   ```\n\n3. **Health Endpoints**: Check individual service health:\n   - Backend: `http:\u002F\u002Flocalhost:8000\u002Fhealth`\n   - RAG API: `http:\u002F\u002Flocalhost:8001\u002Fhealth`\n   - Ollama: `http:\u002F\u002Flocalhost:11434\u002Fapi\u002Ftags`\n\n4. **Documentation**: Check the [Technical Documentation](TECHNICAL_DOCS.md)\n5. **GitHub Issues**: Report bugs and request features\n6. 
**Community**: Join our Discord\u002FSlack community\n\n---\n\n## 🔗 API Reference\n\n### Core Endpoints\n\n#### Chat API\n```http\n# Session-based chat (recommended)\nPOST \u002Fsessions\u002F{session_id}\u002Fchat\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"What are the main topics discussed?\",\n  \"search_type\": \"hybrid\",\n  \"retrieval_k\": 20,\n  \"ai_rerank\": true,\n  \"context_window_size\": 5\n}\n\n# Legacy chat endpoint\nPOST \u002Fchat\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"What are the main topics discussed?\",\n  \"session_id\": \"uuid\",\n  \"search_type\": \"hybrid\",\n  \"retrieval_k\": 20\n}\n```\n\n#### Index Management\n```http\n# Create index\nPOST \u002Findexes\nContent-Type: application\u002Fjson\n{\n  \"name\": \"My Index\",\n  \"description\": \"Description\",\n  \"config\": \"default\"\n}\n\n# Get all indexes\nGET \u002Findexes\n\n# Get specific index\nGET \u002Findexes\u002F{id}\n\n# Upload documents to index\nPOST \u002Findexes\u002F{id}\u002Fupload\nContent-Type: multipart\u002Fform-data\nfiles: [file1.pdf, file2.pdf, ...]\n\n# Build index (process uploaded documents)\nPOST \u002Findexes\u002F{id}\u002Fbuild\nContent-Type: application\u002Fjson\n{\n  \"config_mode\": \"default\",\n  \"enable_enrich\": true,\n  \"chunk_size\": 512\n}\n\n# Delete index\nDELETE \u002Findexes\u002F{id}\n```\n\n#### Session Management\n```http\n# Create session\nPOST \u002Fsessions\nContent-Type: application\u002Fjson\n{\n  \"title\": \"My Session\",\n  \"model\": \"qwen3:0.6b\"\n}\n\n# Get all sessions\nGET \u002Fsessions\n\n# Get specific session\nGET \u002Fsessions\u002F{session_id}\n\n# Get session documents\nGET \u002Fsessions\u002F{session_id}\u002Fdocuments\n\n# Get session indexes\nGET \u002Fsessions\u002F{session_id}\u002Findexes\n\n# Link index to session\nPOST \u002Fsessions\u002F{session_id}\u002Findexes\u002F{index_id}\n\n# Delete session\nDELETE \u002Fsessions\u002F{session_id}\n\n# Rename session\nPOST 
\u002Fsessions\u002F{session_id}\u002Frename\nContent-Type: application\u002Fjson\n{\n  \"new_title\": \"Updated Session Name\"\n}\n```\n\n### Advanced Features\n\n#### Query Decomposition\nThe system can break complex queries into sub-questions for better answers:\n```http\nPOST \u002Fsessions\u002F{session_id}\u002Fchat\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"Compare the methodologies and analyze their effectiveness\",\n  \"query_decompose\": true,\n  \"compose_sub_answers\": true\n}\n```\n\n#### Answer Verification\nIndependent verification pass for accuracy using a separate verification model:\n```http\nPOST \u002Fsessions\u002F{session_id}\u002Fchat\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"What are the key findings?\",\n  \"verify\": true\n}\n```\n\n#### Contextual Enrichment\nDocument context enrichment during indexing for better understanding:\n```bash\n# Enable during index building\nPOST \u002Findexes\u002F{id}\u002Fbuild\n{\n  \"enable_enrich\": true,\n  \"window_size\": 2\n}\n```\n\n#### Late Chunking\nBetter context preservation by chunking after embedding:\n```bash\n# Configure in pipeline\n\"late_chunking\": {\"enabled\": true}\n```\n\n#### Streaming Chat\n```http\nPOST \u002Fchat\u002Fstream\nContent-Type: application\u002Fjson\n\n{\n  \"query\": \"Explain the methodology\",\n  \"session_id\": \"uuid\",\n  \"stream\": true\n}\n```\n\n#### Batch Processing\n```bash\n# Using the batch indexing script\npython demo_batch_indexing.py --config batch_indexing_config.json\n\n# Example batch configuration (batch_indexing_config.json):\n{\n  \"index_name\": \"Sample Batch Index\",\n  \"index_description\": \"Example batch index configuration\",\n  \"documents\": [\n    \".\u002Frag_system\u002Fdocuments\u002Finvoice_1039.pdf\",\n    \".\u002Frag_system\u002Fdocuments\u002Finvoice_1041.pdf\"\n  ],\n  \"processing\": {\n    \"chunk_size\": 512,\n    \"chunk_overlap\": 64,\n    \"enable_enrich\": true,\n    \"enable_latechunk\": 
true,\n    \"enable_docling\": true,\n    \"embedding_model\": \"Qwen\u002FQwen3-Embedding-0.6B\",\n    \"generation_model\": \"qwen3:0.6b\",\n    \"retrieval_mode\": \"hybrid\",\n    \"window_size\": 2\n  }\n}\n```\n\n```http\n# API endpoint for batch processing\nPOST \u002Fbatch\u002Findex\nContent-Type: application\u002Fjson\n\n{\n  \"file_paths\": [\"doc1.pdf\", \"doc2.pdf\"],\n  \"config\": {\n    \"chunk_size\": 512,\n    \"enable_enrich\": true,\n    \"enable_latechunk\": true,\n    \"enable_docling\": true\n  }\n}\n```\n\nFor complete API documentation, see [API_REFERENCE.md](API_REFERENCE.md).\n\n---\n\n## 🏗️ Architecture\n\nLocalGPT is built with a modular, scalable architecture:\n\n```mermaid\ngraph TB\n    UI[Web Interface] --> API[Backend API]\n    API --> Agent[RAG Agent]\n    Agent --> Retrieval[Retrieval Pipeline]\n    Agent --> Generation[Generation Pipeline]\n\n    Retrieval --> Vector[Vector Search]\n    Retrieval --> BM25[BM25 Search]\n    Retrieval --> Rerank[Reranking]\n\n    Vector --> LanceDB[(LanceDB)]\n    BM25 --> BM25DB[(BM25 Index)]\n\n    Generation --> Ollama[Ollama Models]\n    Generation --> HF[Hugging Face Models]\n\n    API --> SQLite[(SQLite DB)]\n```\n\nOverview of the Retrieval Agent\n\n```mermaid\ngraph TD\n    classDef llmcall fill:#e6f3ff,stroke:#007bff;\n    classDef pipeline fill:#e6ffe6,stroke:#28a745;\n    classDef cache fill:#fff3e0,stroke:#fd7e14;\n    classDef logic fill:#f8f9fa,stroke:#6c757d;\n    classDef thread stroke-dasharray: 5 5;\n\n    A(Start: Agent.run) --> B_asyncio.run(_run_async);\n    B --> C{_run_async};\n\n    C --> C1[Get Chat History];\n    C1 --> T1[Build Triage Prompt \u003Cbr\u002F> Query + Doc Overviews ];\n    T1 --> T2[\"(asyncio.to_thread)\u003Cbr\u002F>LLM Triage: RAG or LLM_DIRECT?\"]; class T2 llmcall,thread;\n    T2 --> T3{Decision?};\n\n    T3 -- RAG --> RAG_Path;\n    T3 -- LLM_DIRECT --> LLM_Path;\n\n    subgraph RAG Path\n        RAG_Path --> R1[Format Query + History];\n        R1 
--> R2[\"(asyncio.to_thread)\u003Cbr\u002F>Generate Query Embedding\"]; class R2 pipeline,thread;\n        R2 --> R3{{Check Semantic Cache}}; class R3 cache;\n        R3 -- Hit --> R_Cache_Hit(Return Cached Result);\n        R_Cache_Hit --> R_Hist_Update;\n        R3 -- Miss --> R4{Decomposition \u003Cbr\u002F> Enabled?};\n\n        R4 -- Yes --> R5[\"(asyncio.to_thread)\u003Cbr\u002F>Decompose Raw Query\"]; class R5 llmcall,thread;\n        R5 --> R6{{Run Sub-Queries \u003Cbr\u002F> Parallel RAG Pipeline}}; class R6 pipeline,thread;\n        R6 --> R7[Collect Results & Docs];\n        R7 --> R8[\"(asyncio.to_thread)\u003Cbr\u002F>Compose Final Answer\"]; class R8 llmcall,thread;\n        R8 --> V1(RAG Answer);\n\n        R4 -- No --> R9[\"(asyncio.to_thread)\u003Cbr\u002F>Run Single Query \u003Cbr\u002F>(RAG Pipeline)\"]; class R9 pipeline,thread;\n        R9 --> V1;\n\n        V1 --> V2{{Verification \u003Cbr\u002F> await verify_async}}; class V2 llmcall;\n        V2 --> V3(Final RAG Result);\n        V3 --> R_Cache_Store{{Store in Semantic Cache}}; class R_Cache_Store cache;\n        R_Cache_Store --> FinalResult;\n    end\n\n    subgraph Direct LLM Path\n        LLM_Path --> L1[Format Query + History];\n        L1 --> L2[\"(asyncio.to_thread)\u003Cbr\u002F>Generate Direct LLM Answer \u003Cbr\u002F> (No RAG)\"]; class L2 llmcall,thread;\n        L2 --> FinalResult(Final Direct Result);\n    end\n\n    FinalResult --> R_Hist_Update(Update Chat History);\n    R_Hist_Update --> ZZZ(End: Return Result);\n```\n\n---\n\n## 🤝 Contributing\n\nWe welcome contributions from developers of all skill levels! 
LocalGPT is an open-source project that benefits from community involvement.\n\n### 🚀 Quick Start for Contributors\n\n```bash\n# Fork and clone the repository\ngit clone https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT.git\ncd localGPT\n\n# Set up development environment\npip install -r requirements.txt\nnpm install\n\n# Install Ollama and models (one model per pull command)\ncurl -fsSL https:\u002F\u002Follama.ai\u002Finstall.sh | sh\nollama pull qwen3:0.6b\nollama pull qwen3:8b\n\n# Verify setup\npython system_health_check.py\npython run_system.py --mode dev\n```\n\n### 📋 How to Contribute\n\n1. **🐛 Report Bugs**: Use our [bug report template](.github\u002FISSUE_TEMPLATE\u002Fbug_report.md)\n2. **💡 Request Features**: Use our [feature request template](.github\u002FISSUE_TEMPLATE\u002Ffeature_request.md)\n3. **🔧 Submit Code**: Follow our [development workflow](CONTRIBUTING.md#development-workflow)\n4. **📚 Improve Docs**: Help make our documentation better\n\n### 📖 Detailed Guidelines\n\nFor comprehensive contributing guidelines, including:\n- Development setup and workflow\n- Coding standards and best practices\n- Testing requirements\n- Documentation standards\n- Release process\n\n**👉 See our [CONTRIBUTING.md](CONTRIBUTING.md) guide**\n\n---\n\n## 📄 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details. 
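For models, please check their respective licenses.\n\n---\n\n## 📞 Support\n\n- **Documentation**: [Technical Docs](TECHNICAL_DOCS.md)\n- **Issues**: [GitHub Issues](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fissues)\n- **Discussions**: [GitHub Discussions](https:\u002F\u002Fgithub.com\u002FPromtEngineer\u002FlocalGPT\u002Fdiscussions)\n- **Business Deployment and Customization**: [Contact Us](https:\u002F\u002Ftally.so\u002Fr\u002Fwv6R2d)\n\n---\n\n\u003Cdiv align=\"center\">\n\n## Star History\n\n[![Star History Chart](https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=PromtEngineer\u002FlocalGPT&type=Date)](https:\u002F\u002Fstar-history.com\u002F#PromtEngineer\u002FlocalGPT&Date)\n","LocalGPT is a fully private, on-device document intelligence platform that lets users interact with their documents through GPT models on their own machine, so that data stays 100% private and never leaves the device. Its core features include a hybrid search engine, a smart routing mechanism, contextual enrichment and pruning, and an independent verification step, which together provide accurate and relevant answers to document queries. It uses a modular, lightweight architecture, is developed purely in Python, supports Docker deployment, and is easy to install and maintain on any infrastructure. LocalGPT is well suited to individuals or businesses that handle sensitive or confidential files, letting them extract useful information from large document collections efficiently while keeping data secure.",2,"2026-06-11 02:49:40","top_language"]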