[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-77840":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":12,"compositeScore":19,"rankGlobal":9,"rankLanguage":9,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":9,"pushedAt":9,"updatedAt":35,"readmeContent":36,"aiSummary":37,"trendingCount":15,"starSnapshotCount":15,"syncStatus":14,"lastSyncTime":38,"discoverSource":39},77840,"FinSight-AI","juanjuandog\u002FFinSight-AI","juanjuandog","AI equity research agent with resilient workflows, Redis Lua single-flight, pgvector RAG, versioned reports, evidence tracing, and RAG evaluation.",null,"Java",1053,63,5,2,0,21,90,988,103.42,"MIT License",false,"master",true,[25,26,27,28,29,30,31,32,33,34],"ai-agent","financial-research","llm-evaluation","pgvector","postgresql","rabbitmq","rag","redis","spring-boot","workflow-orchestration","2026-06-12 04:01:22","# FinSight AI\n\n[English](README.md) | [简体中文](README.zh-CN.md)\n\n![CI](https:\u002F\u002Fgithub.com\u002Fjuanjuandog\u002FFinSight-AI\u002Factions\u002Fworkflows\u002Fci.yml\u002Fbadge.svg)\n![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-green)\n![Java](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FJava-17-blue)\n![Spring Boot](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSpring%20Boot-3.3-green)\n![PostgreSQL](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPostgreSQL-pgvector-blue)\n![Redis](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FRedis-single--flight-red)\n![RabbitMQ](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FRabbitMQ-workflows-orange)\n\nOpen-source AI equity research agent with evidence-grounded reports, resilient workflow 
orchestration, and RAG evaluation.\n\nFinSight turns filings, financial reports, research notes, market data, and company events into source-grounded answers and versioned AI research reports. The project is intentionally backend-heavy: it shows how to build the infrastructure around an AI agent, not just how to call a model.\n\n![FinSight dashboard preview](docs\u002Fdashboard-preview.png)\n\n## Why It Exists\n\nMost RAG demos stop at \"retrieve chunks and ask an LLM.\" FinSight focuses on the parts that make an AI research system dependable:\n\n- long-running agent workflows with explicit state transitions;\n- idempotent task submission and duplicate execution control;\n- Redis Lua single-flight leases with fencing tokens;\n- report caching tied to data snapshots instead of loose prompt strings;\n- PostgreSQL\u002Fpgvector hybrid retrieval with evidence traceability;\n- RAG and agent quality evaluation for regression checks.\n\n## Highlights\n\n| Area | What FinSight Implements |\n| --- | --- |\n| Agent workflow | Data ingestion, metric recalculation, document indexing, intelligence build, and AI report generation as recoverable stages |\n| Concurrency control | Idempotency keys, repository-level `createIfAbsent`, Redis Lua single-flight lease, fencing token, local fallback lock |\n| Failure recovery | Task status machine, stage tracking, retry, dead letter state, timeout takeover scheduler |\n| Trustworthy AI cache | `contextHash`, `dataSnapshotHash`, `reportVersion`, Redis\u002FPostgreSQL-backed report reuse |\n| Retrieval | PostgreSQL JSONB, full-text search, pgvector embeddings, hybrid recall, deduped evidence chunks |\n| Evaluation | RAG hit rate, evidence coverage, answer coverage, hallucination risk, conclusion consistency, confidence calibration, latency |\n| Demo surface | Spring Boot API, static dashboard, sample data flow, Actuator and Prometheus metrics |\n\n## Architecture\n\n```mermaid\nflowchart LR\n    UI[\"Dashboard \u002F REST API\"] --> 
Backend[\"Spring Boot Backend\"]\n    Backend --> Workflow[\"Agent Workflow Orchestrator\"]\n    Workflow --> MQ[\"RabbitMQ Async Queue\"]\n    Workflow --> Redis[\"Redis Lua Lease + Cache\"]\n    Workflow --> PG[\"PostgreSQL + pgvector\"]\n    Backend --> AI[\"FastAPI AI Service \u002F Ollama fallback\"]\n    PG --> Retrieval[\"Hybrid Retrieval + Evidence\"]\n    Retrieval --> Backend\n    AI --> Report[\"Versioned AI Report\"]\n    Report --> PG\n    Backend --> Eval[\"RAG \u002F Agent Evaluation\"]\n```\n\nMore detail: [Architecture Notes](docs\u002Farchitecture.md)\n\n## Documentation\n\n- [Architecture Notes](docs\u002Farchitecture.md)\n- [Research API](docs\u002Fapi.md)\n- [Agent Workflow Design](docs\u002Fdesign-agent-workflow.md)\n- [Benchmark And Evaluation Notes](docs\u002Fbenchmark.md)\n- [Resume And Interview Notes](docs\u002Fresume-and-interview.md)\n- [GitHub Presentation Snippets](docs\u002Fgithub-profile.md)\n- [Troubleshooting](docs\u002Ftroubleshooting.md)\n- [Roadmap](ROADMAP.md)\n- [Contributing](CONTRIBUTING.md)\n\n## Quick Start\n\n### 1. Run the full stack\n\n```bash\n.\u002Fscripts\u002Frun-full-stack.sh\n```\n\nThen open:\n\n```bash\nopen http:\u002F\u002Flocalhost:8080\n```\n\nThis starts the backend, dashboard, PostgreSQL\u002Fpgvector, RabbitMQ, Redis, the FastAPI AI sidecar, and supporting infrastructure. If Ollama is not running, the AI service returns deterministic fallback analysis so the demo still works.\n\n### 2. 
Seed and exercise the demo\n\nIn another terminal:\n\n```bash\n.\u002Fscripts\u002Fquick-demo.sh\n```\n\nOr run the smaller flows separately:\n\n```bash\n.\u002Fscripts\u002Fdemo-flow.sh\n.\u002Fscripts\u002Fdemo-workflow.sh\n```\n\nUseful endpoints:\n\n```bash\nGET  \u002Fapi\u002Fworkflows\u002Fsummary\nPOST \u002Fapi\u002Fevaluations\u002Frag\u002Frun\nGET  \u002Fapi\u002Fcompanies\u002F600519\u002Fai-analysis\u002Flatest\nGET  \u002Fapi\u002Fdocument-index\u002F600519\u002Fsearch?q=现金流风险\n```\n\nExample demo output after `.\u002Fscripts\u002Fquick-demo.sh`:\n\n| Signal | Example Result |\n| --- | --- |\n| Agent workflow | `1\u002F1 tasks`, `0 failed\u002Fdead-letter` |\n| RAG evaluation | `85 \u002F 100`, `2\u002F3 cases passed` |\n| Evidence index | `6 documents`, `6 chunks` for `600519` |\n| Intelligence graph | `20 events`, `36 entities`, `47 relations` |\n| Report cache | `dataSnapshotHash + contextHash + reportVersion` |\n\n### 3. Run without Docker\n\nFor a lightweight local backend using in-memory repositories:\n\n```bash\ncd backend\nmvn spring-boot:run\nopen http:\u002F\u002Flocalhost:8080\n```\n\n## Modules\n\n- `backend`: Spring Boot service for APIs, domain workflow, metrics, and RAG orchestration.\n- `ai-service`: FastAPI service for document parsing, entity extraction, embedding, rerank, and answer generation stubs.\n- `docker`: local infrastructure placeholders.\n\n## Alternative Run Modes\n\nBackend:\n\n```bash\ncd backend\nmvn spring-boot:run\n```\n\nDashboard:\n\n```bash\nopen http:\u002F\u002Flocalhost:8080\n```\n\nBackend with PostgreSQL profile:\n\n```bash\ndocker compose up -d postgres\ncd backend\nmvn spring-boot:run -Dspring-boot.run.profiles=postgres,prod\n```\n\nBackend with PostgreSQL + RabbitMQ workflow:\n\n```bash\n.\u002Fscripts\u002Frun-backend-workflow.sh\n```\n\nProduction-like stack with PostgreSQL, pgvector, RabbitMQ, FastAPI AI service, Actuator, and the dashboard:\n\n```bash\n.\u002Fscripts\u002Frun-full-stack.sh\nopen 
http:\u002F\u002Flocalhost:8080\n```\n\nAI service:\n\n```bash\ncd ai-service\npython -m venv .venv\nsource .venv\u002Fbin\u002Factivate\npip install -r requirements.txt\nuvicorn app.main:app --reload --port 8001\n```\n\nOptional local Ollama analysis:\n\n```bash\nollama serve\nollama pull qwen2.5:7b\n```\n\nThe FastAPI sidecar calls `OLLAMA_BASE_URL` (`http:\u002F\u002Flocalhost:11434` by default) and `OLLAMA_MODEL`\n(`qwen2.5:7b` by default) from `\u002Fanalyze-stock`. If Ollama is not installed, not running, or the model is\nmissing, the endpoint returns a deterministic rule-based fallback with `aiGenerated=false`, so the dashboard\nkeeps working.\n\n## Sample API Flow\n\n1. `POST \u002Fapi\u002Fingestion\u002Fdemo` seeds a sample company document and financial statements.\n2. `POST \u002Fapi\u002Fmetrics\u002Frecalculate\u002F600519` calculates financial indicators and risk signals.\n3. `POST \u002Fapi\u002Fanalysis\u002Fask` asks a source-grounded question.\n4. `POST \u002Fapi\u002Fdocument-index\u002F{symbol}\u002Frebuild` rebuilds document chunks for retrieval.\n5. `POST \u002Fapi\u002Fintelligence\u002F{symbol}\u002Frebuild` builds timeline events and a lightweight knowledge graph.\n\nAsync workflow:\n\n```bash\nPOST \u002Fapi\u002Fingestion\u002Fdemo\u002Fasync\nGET \u002Fapi\u002Fworkflows\nGET \u002Fapi\u002Fdocument-index\u002F600519\u002Fsearch?q=现金流风险\nGET \u002Fapi\u002Fmetrics\u002F600519\u002Fruns\nGET \u002Fapi\u002Fintelligence\u002F600519\u002Ftimeline\nGET \u002Fapi\u002Fintelligence\u002F600519\u002Fgraph\nPOST \u002Fapi\u002Fevaluations\u002Frag\u002Frun\n```\n\n## Database Stage\n\nThe PostgreSQL implementation is enabled by `postgres,prod` profiles. 
Flyway creates the core schema:\n\n- `companies`\n- `financial_documents`\n- `financial_statements`\n- `financial_metrics`\n- `risk_signals`\n- `workflow_tasks`\n- `company_events`\n- `rag_traces`\n- `stock_analysis_reports`\n- `user_watchlists`\n\nDefault profile still uses in-memory repositories so the backend remains easy to run without Docker.\n\n## Workflow Stage\n\nThe workflow stage splits long financial data processing into task lifecycle and execution:\n\n- `WorkflowTask` stores idempotency key, status, agent stage, attempt count, payload, error message, lease owner, fencing token, and update time.\n- `WorkflowTaskPublisher` has two implementations:\n  - default direct publisher for local development;\n  - RabbitMQ publisher enabled by `rabbitmq` profile.\n- `WorkflowOrchestrator` uses Redis Lua single-flight leases, idempotency keys, and local fallback locking to prevent duplicate cross-node execution.\n- `WorkflowRecoveryScheduler` scans timed-out `RUNNING` tasks, marks them recoverable\u002Fdead-lettered, and republishes retryable work.\n- Agent stages model the long-running research flow from ingestion to metrics, indexing, intelligence build, AI analysis, success, failure, and recovery.\n- `DOCUMENT_INDEX_BUILD` chunks ingested documents and writes retrieval-ready evidence chunks.\n- `COMPANY_INTELLIGENCE_BUILD` turns documents, metrics, and risk signals into timeline events and graph relations.\n- `STOCK_AI_ANALYSIS` creates source-grounded AI stock reports and persists them for history and caching.\n- `RabbitWorkflowListener` consumes messages and moves failed messages to a dead-letter queue when RabbitMQ rejects them.\n\nRun:\n\n```bash\n.\u002Fscripts\u002Frun-backend-workflow.sh\n.\u002Fscripts\u002Fdemo-workflow.sh\n```\n\n## Retrieval Stage\n\nThe retrieval stage indexes financial documents at evidence-chunk granularity:\n\n- `DocumentChunker` splits long documents with overlap and section metadata.\n- `EmbeddingService` creates deterministic 
384-dimensional embeddings for local demos, and can call the FastAPI AI sidecar `\u002Fembed` endpoint when `finsight.ai-service.enabled=true`.\n- `DocumentChunkRepository` supports keyword search, vector search, and chunk replacement.\n- PostgreSQL profile stores chunks in `document_chunks` with JSONB metadata, full-text GIN index, and pgvector cosine index.\n- `HybridRetrievalGateway` merges keyword and vector channels, deduplicates chunks, and passes source-bound evidence to RAG.\n\nUseful endpoints:\n\n```bash\nPOST \u002Fapi\u002Fdocument-index\u002F600519\u002Frebuild\nGET \u002Fapi\u002Fdocument-index\u002F600519\u002Fcount\nGET \u002Fapi\u002Fdocument-index\u002F600519\u002Fsearch?q=现金流风险\n```\n\n## Metric Engine Stage\n\nThe metric engine stage turns hard-coded ratios into a governed calculation pipeline:\n\n- `MetricDefinitionCatalog` defines source metrics, ratio metrics, year-over-year metrics, and derived spreads.\n- `CoreFinancialMetricCalculator` evaluates metrics in fiscal-year order and stores results with a plan version.\n- `MetricCalculationRun` records each calculation run with statement count, metric count, risk count, timestamps, and metadata.\n- `RiskRule` components evaluate financial risk signals from the metric map:\n  - cash earnings quality;\n  - receivable pressure;\n  - profitability trend weakening;\n  - leverage risk.\n\nUseful endpoints:\n\n```bash\nGET \u002Fapi\u002Fmetrics\u002Fdefinitions\nPOST \u002Fapi\u002Fmetrics\u002Frecalculate\u002F600519\nGET \u002Fapi\u002Fmetrics\u002F600519\nGET \u002Fapi\u002Fmetrics\u002F600519\u002Frisks\nGET \u002Fapi\u002Fmetrics\u002F600519\u002Fruns\n```\n\n## Intelligence Stage\n\nThe intelligence stage upgrades isolated documents and metrics into company state modeling:\n\n- `CompanyIntelligenceService` extracts standard events from filings, research notes, metrics, and risk signals.\n- `CompanyEventRepository` stores a company timeline ordered by event date.\n- `KnowledgeGraphRepository` 
stores lightweight graph nodes and relations in PostgreSQL.\n- Graph entities include company, industry, document, product\u002Fkeyword, financial metric, and risk event.\n- Graph relations include industry membership, published documents, mentioned keywords, financial metrics, risks, and timeline events.\n\nUseful endpoints:\n\n```bash\nPOST \u002Fapi\u002Fintelligence\u002F600519\u002Frebuild\nGET \u002Fapi\u002Fintelligence\u002F600519\u002Ftimeline\nGET \u002Fapi\u002Fintelligence\u002F600519\u002Fgraph\n```\n\n## Dashboard And Evaluation Stage\n\nThe final stage adds a demo console and regression-style RAG evaluation:\n\n- Static dashboard is served by Spring Boot from `\u002F`.\n- The dashboard shows workflow tasks, metric output, retrieval evidence, timeline events, graph counts, and evaluation results.\n- `EvaluationCaseCatalog` defines fixed financial QA test cases.\n- `RagEvaluationService` checks RAG hit rate, evidence coverage, answer coverage, citation presence, hallucination risk, conclusion consistency, confidence calibration, and latency.\n\nUseful endpoints:\n\n```bash\nGET \u002F\nGET \u002Fapi\u002Fevaluations\u002Frag\u002Fcases\nPOST \u002Fapi\u002Fevaluations\u002Frag\u002Frun\n```\n\n## Stock AI Stage\n\nThe stock AI stage turns the dashboard into a practical A-share research workflow:\n\n- `StockUniverseService` syncs 5500+ A-share symbols from free public providers and falls back to Eastmoney search.\n- `StockAnalysisApplicationService` submits single-stock and batch analysis as workflow tasks.\n- `StockAiAnalysisService` builds a prompt context from quote data, financial metrics, risk signals, and RAG evidence chunks.\n- AI analysis calls the FastAPI sidecar and local Ollama when available, then falls back to deterministic rules when the model is unavailable.\n- `stock_analysis_reports` stores every generated report with model\u002Fsource metadata, citations, context hash, `data_snapshot_hash`, report version, and generated time.\n- 
`StockAnalysisCache` has an in-memory local implementation and a Redis implementation enabled by the `redis` profile; cache keys are tied to the data snapshot hash so stale AI conclusions are not reused after evidence changes.\n- `StockMarketScheduler` can sync the stock universe and submit a morning batch scan on a configurable cron schedule.\n- `user_watchlists` provides a simple user-scoped stock watchlist foundation using the `X-Finsight-User` request header.\n\nUseful endpoints:\n\n```bash\nPOST \u002Fapi\u002Fcompanies\u002Fsync-a-shares\nPOST \u002Fapi\u002Fcompanies\u002Fbatch-analysis\nGET \u002Fapi\u002Fcompanies\u002F600519\u002Fai-analysis\nGET \u002Fapi\u002Fcompanies\u002F600519\u002Fai-analysis\u002Flatest\nGET \u002Fapi\u002Fcompanies\u002F600519\u002Fai-analysis\u002Fhistory\nGET \u002Fapi\u002Fwatchlist\nPOST \u002Fapi\u002Fwatchlist\u002F600519\nDELETE \u002Fapi\u002Fwatchlist\u002F600519\n```\n\n## Production Engineering Stage\n\nThe production-like stage makes the prototype easier to present as a backend\u002FAI system:\n\n- Docker Compose builds and runs `backend`, `ai-service`, PostgreSQL\u002Fpgvector, RabbitMQ, Redis, Elasticsearch, and MinIO.\n- `postgres,rabbitmq,redis,prod` profiles enable persistent repositories, Flyway migrations, pgvector search, Redis analysis cache, and RabbitMQ task dispatch.\n- `RestAiServiceClient` calls FastAPI `\u002Frerank` and `\u002Fgenerate-answer`, while keeping deterministic local fallback for demos and tests.\n- Workflow APIs expose task listing, task detail, status summary, and manual retry for failed\u002Fdead-letter tasks.\n- Spring Boot Actuator exposes health, metrics, and Prometheus scrape output at `\u002Factuator\u002Fhealth`, `\u002Factuator\u002Fmetrics`, and `\u002Factuator\u002Fprometheus`.\n- Test coverage includes deterministic embedding tests and a Testcontainers smoke test for PostgreSQL\u002Fpgvector + RabbitMQ profiles.\n\nUseful endpoints:\n\n```bash\nGET 
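\u002Factuator\u002Fhealth\nGET \u002Factuator\u002Fprometheus\nGET \u002Fapi\u002Fworkflows\u002Fsummary\nGET \u002Fapi\u002Fworkflows\u002F{taskId}\nPOST \u002Fapi\u002Fworkflows\u002F{taskId}\u002Fretry\n```\n","FinSight AI is an AI agent that generates evidence-based equity research reports. It combines robust workflow orchestration, a Redis Lua single-flight mechanism, pgvector vector retrieval, and versioned reports to automate financial research with high reliability. The project is written in Java, and its backend is built on Spring Boot, PostgreSQL (with the pgvector extension), RabbitMQ, and related technologies. It suits financial institutions or individual investors who need to analyze large volumes of financial statements, market data, and company events, and helps them automatically generate high-quality, traceable research reports.","2026-06-11 03:56:10","CREATED_QUERY"]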