[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-80069":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":12,"openIssues":13,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":14,"stars7d":14,"stars30d":15,"stars90d":14,"forks30d":14,"starsTrendScore":14,"compositeScore":16,"rankGlobal":8,"rankLanguage":8,"license":8,"archived":17,"fork":17,"defaultBranch":18,"hasWiki":19,"hasPages":17,"topics":20,"createdAt":8,"pushedAt":8,"updatedAt":21,"readmeContent":22,"aiSummary":23,"trendingCount":14,"starSnapshotCount":14,"syncStatus":24,"lastSyncTime":25,"discoverSource":26},80069,"paipai","sunshijun-ctr\u002Fpaipai","sunshijun-ctr",null,"Python",101,5,1,6,0,40,40.33,false,"master",true,[],"2026-06-12 04:01:26","# 🔬 Research Agent — An Autonomous Multi-Agent Research Copilot\n\n> A full-stack, multi-agent LLM system that searches literature, reads PDFs and web pages, reasons over a hybrid-retrieval knowledge base, writes academic prose, and plans multi-step research on its own — with human-in-the-loop control over what it does.\n\n\u003Cp align=\"center\">\n  \u003Cimg alt=\"Python\"     src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPython-3.10+-3776AB?logo=python&logoColor=white\">\n  \u003Cimg alt=\"LangGraph\"  src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLangGraph-orchestration-1C3C3C\">\n  \u003Cimg alt=\"FastAPI\"    src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FFastAPI-async%20%2B%20WebSocket-009688?logo=fastapi&logoColor=white\">\n  \u003Cimg alt=\"LLM\"        src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLLM-OpenAI%20%2B%20Anthropic-412991?logo=openai&logoColor=white\">\n  \u003Cimg alt=\"Vector\"     src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FRAG-Chroma%20%2B%20BM25%20%2B%20Rerank-FF6F00\">\n  \u003Cimg alt=\"Postgres\"   
src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FPostgreSQL-checkpointer%20%2B%20notes-4169E1?logo=postgresql&logoColor=white\">\n\u003C\u002Fp>\n\n> 🌏 中文版见 [README.zh-CN.md](README.zh-CN.md)\n\n---\n\n## ✨ Highlights\n\n- 🧭 **Intent-driven multi-agent orchestration** — a dedicated *Intent Agent* routes every request, then an orchestrator dispatches one of **9 specialized agents** across **8 LangGraph workflows** (or a deterministic action handler).\n- 🧠 **A planning agent that composes its own tools** — `plan → approve → execute → synthesize`, choosing and chaining tools (paper search, web fetch, notes…) at runtime instead of following a hard-coded script.\n- 🙋 **Human-in-the-loop with durable checkpointing** — the planning graph can *pause* at a plan-approval checkpoint and *resume* later, with state persisted at every node boundary via a **Postgres-backed LangGraph checkpointer**.\n- 🔎 **Hybrid-retrieval RAG** — dense vectors (Chroma) **+** BM25 keyword search **+** neural reranking (FlashRank), over both a long-term knowledge base and a per-session temporary store.\n- 🗂️ **Layered memory** — short-term (with automatic conversation compression), working memory, and long-term user memory, threaded through every turn.\n- 🧰 **A pluggable tool & LLM layer** — a central tool registry (paper search, PDF parsing, web scraping, image OCR+VLM, note CRUD) and **per-agent LLM providers** swappable between OpenAI and Anthropic.\n- 🌐 **Full-stack & multi-channel** — async FastAPI backend with **streaming over WebSocket**, a Vite web UI, a `pywebview` desktop app, and a QQ bot — all behind one channel abstraction, with JWT auth.\n\n---\n\n## 🎬 Demo\n\n> Real screenshots of the core features below. 
For extra impact, drop a short GIF at the top of this section (add it to `docs\u002Fscreenshots\u002F` and reference it the same way).\n\n|  |  |\n|:---:|:---:|\n| ![Paper search](docs\u002Fscreenshots\u002Fliterature-search.png) | ![Citation graph](docs\u002Fscreenshots\u002Fcitation-graph.png) |\n| **Paper search** · one-line queries over arXiv \u002F Semantic Scholar with a live results panel | **Citation graph** · interactive visualization of a paper's citations \u002F references |\n| ![PDF reading + RAG QA](docs\u002Fscreenshots\u002Fpdf-rag-qa.png) | ![Library \u002F knowledge base](docs\u002Fscreenshots\u002Flibrary.png) |\n| **PDF reading + RAG QA** · source on the left, grounded structured answer on the right | **Library \u002F knowledge base** · ingest, embed, and retrieve per library |\n| ![Web research](docs\u002Fscreenshots\u002Fweb-research.png) | ![Figure generation](docs\u002Fscreenshots\u002Ffigure-generation.png) |\n| **Web research** · search → scrape → synthesized answer | **Figure generation** · turn paper content into method diagrams \u002F schematics |\n\n\u003Cdetails>\n\u003Csummary>Under the hood: how one research request flows\u003C\u002Fsummary>\n\n```\n┌──────────────────────────────────────────────────────────────┐\n│  user ▸ 调研一下 RAG 在医疗领域的最新进展                        │\n│                                                                │\n│  ▸ intent ........ research_task                                │\n│  ▸ plan .......... [paper_search] + [web_search] (2 steps)      │\n│  ▸ approve ....... ✓ (auto \u002F user-confirmed)                   │\n│  ▸ execute ....... 12 papers · 6 pages fetched                 │\n│  ▸ synthesize .... 
structured landscape + citations            │\n└──────────────────────────────────────────────────────────────┘\n```\n\n\u003C\u002Fdetails>\n\n---\n\n## 🏗️ Architecture\n\n```mermaid\nflowchart TD\n    subgraph Channels\n        W[Web UI · Vite]\n        D[Desktop · pywebview]\n        Q[QQ Bot]\n    end\n\n    W & D & Q --> API[FastAPI + WebSocket\u003Cbr\u002F>streaming gateway]\n    API --> ORC[Orchestrator]\n\n    ORC --> IA[Intent Agent\u003Cbr\u002F>LLM routing + deterministic fallback]\n    IA --> R{Router}\n\n    R -->|graph workflows| WF[LangGraph Workflow Engine]\n    R -->|deterministic verbs| ACT[Action Handlers\u003Cbr\u002F>note · library ingest]\n    R -->|multi-step research| PA[Planning Agent\u003Cbr\u002F>plan ▸ approve ▸ execute ▸ synthesize]\n\n    WF --> AG[9 Specialized Agents]\n    PA --> AG\n\n    AG -.-> TR[Tool Registry]\n    AG -.-> RAG[(Hybrid RAG\u003Cbr\u002F>Chroma · BM25 · Rerank)]\n    AG -.-> MEM[(Layered Memory\u003Cbr\u002F>short · working · long-term)]\n    AG -.-> LLM[Per-agent LLM Providers\u003Cbr\u002F>OpenAI · Anthropic]\n\n    PA \u003C-->|pause \u002F resume| CKPT[(Postgres Checkpointer)]\n```\n\n---\n\n## 🧠 Request Lifecycle\n\nEvery message flows through the same disciplined pipeline:\n\n1. **Intent recognition** — the Intent Agent classifies the request into a *route* using session context, with a pure-keyword fallback if the LLM call fails.\n2. **Routing** — the orchestrator resolves the route to one of three execution modes:\n   - a **workflow** (a compiled LangGraph state machine),\n   - a **deterministic action** (plain verbs like note CRUD \u002F library ingest — no graph overhead),\n   - or the **planning agent** for open-ended, multi-step research.\n3. **Execution** — agents call tools, retrieve context, and stream progress events back over WebSocket.\n4. 
**Memory & continuity** — outputs update short\u002Fworking\u002Flong-term memory and session context, so follow-ups (\"save that as a note\", \"expand this\") resolve against the right task.\n\n---\n\n## 🔬 Engineering Deep-Dives\n\nThe parts that were genuinely hard — and the most interesting to talk through.\n\n### 1. Two-tier routing: an LLM brain with a deterministic spine\nNatural-language routing is delegated to the Intent Agent (it has session context, active entities, and recent output in its prompt), but **safety-critical and trivially-classifiable cases are handled deterministically** — explicit UI markers, task continuation, and a full keyword fallback. This avoids the classic failure mode where an LLM router silently mis-routes a user mid-task. Deterministic *verbs* (note CRUD, library ingest) bypass the graph engine entirely as **action handlers**, keeping hot paths fast and predictable.\n\n### 2. A planning agent with real human-in-the-loop control\nThe research path is a 4-node LangGraph (`plan → approve → execute → synthesize`). The **approve** node is split from **plan** on purpose: LangGraph re-runs a node from its start on resume, and re-running the (expensive) planning LLM call would be wasteful — so the cheap approval gate is isolated. When enabled, it `interrupt()`s the graph, surfaces a plan card to the UI, and waits for the user to **approve \u002F modify \u002F cancel**, with an unattended-timeout default. State persists through a **Postgres checkpointer**, so a paused plan survives across requests.\n\n### 3. Hybrid retrieval that doesn't rely on embeddings alone\nRetrieval fuses **dense** (Chroma vector search), **sparse** (BM25 keyword), and **neural reranking** (FlashRank, with a CrossEncoder option). 
The system keeps a **long-term knowledge base** and a **per-session temporary store**, and decides between cached library context and fresh retrieval based on the query — so \"this paper\" follow-ups stay grounded in the right document.\n\n### 4. Layered memory with automatic compression\nShort-term memory holds recent turns and **compresses** itself once it grows past a threshold (older turns fold into a running summary); working memory carries per-task state; long-term memory captures durable user preferences. The Intent Agent and downstream agents all read from this so the system behaves coherently across a long session.\n\n### 5. Pluggable tools & per-agent models\nTools are registered in a central **Tool Registry** with alias support, so a tool-calling agent can address `paper_search` \u002F `web_fetch` \u002F `note_create` by canonical name. Each agent can be wired to a **different LLM provider\u002Fmodel** (OpenAI or Anthropic), letting you put a cheap model on routing and a strong model on synthesis.\n\n---\n\n## 🤖 Agents\n\n| Agent | Responsibility |\n|-------|----------------|\n| `intent_agent` | Classifies each request into a workflow \u002F action \u002F planning route |\n| `research_agent` | Multi-step planning agent; composes tools autonomously (plan→execute→synthesize) |\n| `literature_agent` | Searches, filters, and downloads papers (arXiv + Semantic Scholar) |\n| `rag_agent` | Hybrid retrieval + grounded reading\u002FQA over the knowledge base or uploads |\n| `web_agent` | Web search → page fetch → synthesized answer |\n| `writing_agent` | Academic writing from user input \u002F uploads \u002F library \u002F any mix |\n| `note_agent` | Create \u002F update \u002F delete \u002F search \u002F embed research notes |\n| `summary_agent` | Conversation & session summarization |\n| `general_agent` | Open-ended reasoning, planning, and chat fallback |\n\n## 🔀 Workflows & Actions\n\n**LangGraph workflows** (compiled state machines): `paper_search`, 
`question_answer`, `web_search`, `academic_writing`, `image_understanding`, `conversation_summary`, `research_agent`, `general_agent`.\n\n**Deterministic actions** (direct handlers, no graph): `note_action`, `library_ingest_action`.\n\n## 🧰 Tools\n\n| Domain | Tools |\n|--------|-------|\n| Literature | paper search (arXiv, Semantic Scholar), semantic filter, PDF download |\n| Documents | PDF\u002FPPTX parsing (PyMuPDF + LlamaParse), chunking & indexing |\n| Web | web search, page scrape, lightweight URL fetch |\n| Vision | image understanding (OCR + VLM) |\n| Knowledge | library add\u002Fsearch, RAG index & retrieval |\n| Notes | full note CRUD + embedding |\n\n---\n\n## 🛠️ Tech Stack\n\n| Layer | Technologies |\n|-------|--------------|\n| **Agents \u002F Orchestration** | LangGraph, custom orchestrator & router, Pydantic schemas |\n| **LLMs** | OpenAI + Anthropic (pluggable per agent) |\n| **Retrieval** | Chroma (dense), `rank_bm25` (sparse), FlashRank (rerank), LangChain text splitters |\n| **Documents** | PyMuPDF, python-pptx, LlamaIndex \u002F LlamaParse |\n| **Backend** | FastAPI, Uvicorn, async Python, WebSocket streaming |\n| **Storage** | PostgreSQL (notes + LangGraph checkpointer), Chroma |\n| **Frontend** | Vite SPA (ESM), `pywebview` desktop shell |\n| **Channels** | Web, QQ bot (unified channel abstraction) |\n| **Auth** | JWT, bcrypt, email verification (aiosmtplib) |\n\n---\n\n## 🚀 Quick Start\n\n```bash\n# 1. Install backend deps\npip install -r requirements.txt\n\n# 2. Configure (copy and fill in API keys \u002F DB url)\ncp .env.example .env\n\n# 3. Build the web frontend\ncd web && npm install && npm run build && cd ..\n\n# 4. Run\npython web_server.py        # web app  → http:\u002F\u002Flocalhost:8000\n# or\npython desktop_app.py       # desktop app (pywebview)\n```\n\n> Requires Python 3.10+, Node 18+, and a PostgreSQL instance. 
See `.env.example` for the full configuration surface (LLM keys, per-agent models, DB, email, channels).\n\n---\n\n## 📂 Project Structure\n\n```\napp\u002F\n├── agents\u002F        # 9 specialized agents (intent, research, rag, writing, …)\n├── orchestrator\u002F  # routing, action handlers, HITL checkpoint logic\n├── workflows\u002F     # LangGraph graph builders + registry\n├── rag\u002F           # long-term & temporary retrieval, reranker\n├── memory\u002F        # short-term \u002F working \u002F long-term memory\n├── tools\u002F         # tool registry: search, pdf, web, image, notes, library\n├── channels\u002F      # web + QQ channel adapters\n├── services\u002F      # LLM providers, note service, …\n└── api\u002F           # FastAPI server + WebSocket gateway\n```\n\n---\n\n## 🗺️ Roadmap\n\n- [ ] **Todo list & task board**: Add a persistent frontend workspace for tasks, with filtering, priority, due dates, status transitions, and links to sessions, notes, and papers.\n- [ ] **MCP service**: Expose paper search, knowledge base, notes, files, calendar, and other capabilities as an MCP server so external clients and in-app agents can share one tool protocol.\n- [ ] **Autonomous frontend workflow orchestration**: Add a visual workflow canvas\u002Fnode editor where users can compose agents, tools, inputs, outputs, and approval checkpoints into reusable workflows.\n- [ ] **Docker deployment**: Provide `Dockerfile`, `docker-compose.yml`, and dev\u002Fprod environment templates for FastAPI, Postgres, Chroma\u002Fvector storage, and frontend static assets.\n- [ ] **Online web trial**: Deploy a public demo\u002Ftrial site with guest mode, sample data, quota limits, auth, and data isolation.\n- [ ] **Architecture rebuild**: Rework the boundaries between agents, workflows, tools, memory, channels, and storage; separate core packages from app wiring and define cleaner plugin extension points.\n- [ ] Streaming token-level output from the planning agent\n- [ ] Pluggable 
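retrieval backends (Qdrant \u002F pgvector)\n- [ ] Evaluation harness for RAG faithfulness & answer relevancy\n- [ ] React frontend migration\n\n---\n\n\u003Cp align=\"center\">\u003Csub>Built as a deep exploration of agentic LLM system design — orchestration, planning, retrieval, and memory.\u003C\u002Fsub>\u003C\u002Fp>\n","Research Agent is a full-stack, multi-agent LLM system designed to assist academic research. It can search literature, read PDFs and web pages, reason over a hybrid-retrieval knowledge base, write academic prose, and autonomously plan multi-step research, while keeping humans in control of what it does. The project is developed in Python 3.10+ and integrates LangGraph workflow orchestration, an async FastAPI backend with WebSocket, LLMs from OpenAI and Anthropic, the Chroma vector database, and other technologies. The system is particularly suited to research settings that need to process large volumes of academic material efficiently and analyze it in depth, such as research institutes and university labs.",2,"2026-06-11 03:59:06","CREATED_QUERY"]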