[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-79956":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":13,"subscribersCount":13,"size":13,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":13,"forks30d":13,"starsTrendScore":18,"compositeScore":13,"rankGlobal":10,"rankLanguage":10,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":22,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":30,"readmeContent":31,"aiSummary":32,"trendingCount":13,"starSnapshotCount":13,"syncStatus":33,"lastSyncTime":34,"discoverSource":35},79956,"PaperPilot","CHB-learner\u002FPaperPilot","CHB-learner","AI literature search and review agent: multi-source retrieval, code repository discovery, open PDF download, evidence chains, and bilingual Chinese-English reports.","https:\u002F\u002Fchb-learner.github.io\u002FPaperPilot\u002F",null,"Python",118,0,78,1,11,38,4,"MIT License",false,"main",true,[24,25,26,27,28,29],"academic-research","ai-literature-review","cli","literature-review","llm","paperpilot","2026-06-12 02:03:56","# 
PaperPilot\n\n[![PyPI](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fpaperpilot?color=2563eb&label=PyPI)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpaperpilot\u002F)\n[![Python](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fpyversions\u002Fpaperpilot?color=0f766e&label=Python)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpaperpilot\u002F)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002FCHB-learner\u002FPaperPilot?color=f59e0b)](LICENSE)\n[![Release](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fv\u002Frelease\u002FCHB-learner\u002FPaperPilot?color=7c3aed&label=Release)](https:\u002F\u002Fgithub.com\u002FCHB-learner\u002FPaperPilot\u002Freleases)\n[![CLI](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCLI-PaperPilot-334155)](https:\u002F\u002Fgithub.com\u002FCHB-learner\u002FPaperPilot)\n[![Reports](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FReports-ZH%2FEN%20MD%20HTML%20PDF-ef4444)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpaperpilot\u002F)\n[![Workflow](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWorkflow-evidence--grounded-0891b2)](https:\u002F\u002Fgithub.com\u002FCHB-learner\u002FPaperPilot)\n[![Online Demo](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FOnline%20Demo-Cloudflare%20Workers-f38020)](https:\u002F\u002Fpaperpilot.aleck-757.workers.dev\u002F)\n[![Netlify Demo](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FOnline%20Demo-Netlify-00ad9f)](https:\u002F\u002Fvoluble-marshmallow-e2bba5.netlify.app\u002F)\n\n[English](README.md) | [中文](README.zh-CN.md) | [Website](https:\u002F\u002Fchb-learner.github.io\u002FPaperPilot\u002F) | [GitHub](https:\u002F\u002Fgithub.com\u002FCHB-learner\u002FPaperPilot) | [PyPI](https:\u002F\u002Fpypi.org\u002Fproject\u002Fpaperpilot\u002F)\n[Online demo: Cloudflare Workers](https:\u002F\u002Fpaperpilot.aleck-757.workers.dev\u002F) | [Online demo: Netlify](https:\u002F\u002Fvoluble-marshmallow-e2bba5.netlify.app\u002F)\n\n\u003Cp align=\"center\">\n  \u003Ca 
href=\"https:\u002F\u002Fwww.star-history.com\u002F#chb-learner\u002Fpaperpilot&Date\">\n    \u003Cimg src=\"https:\u002F\u002Fapi.star-history.com\u002Fsvg?repos=chb-learner\u002Fpaperpilot&type=Date\" alt=\"PaperPilot GitHub star history\" width=\"100%\">\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Fpaperpilot-hero.png\" alt=\"PaperPilot - scholarly literature review agent\" width=\"100%\">\n\u003C\u002Fp>\n\nPaperPilot is a **CLI research agent for scholarly literature review** across AI, biomedicine, and AI for Science.  \nIt turns one user request into a traceable, evidence-based research workflow and generates bilingual reports (`zh\u002Fen`) in Markdown, HTML, and PDF.\n\nThe Cloudflare Workers online demo provides a lightweight browser experience: it uses an OpenAI-compatible LLM to generate search plans, queries public paper metadata sources, and lets users download a lightweight Markdown or HTML report. The full PaperPilot CLI remains the complete workflow for screened corpora, PDF\u002Ffull-text handling, evidence ledgers, bilingual PDF output, and Obsidian Wiki export.\n\n## ✨ What PaperPilot does\n\nPaperPilot is not a chatbot. 
It is an **interactive scientific workflow**:\n\n- Parse natural-language research requests\n- Build an explicit search protocol with inclusion\u002Fexclusion rules\n- Query multi-source literature APIs\n- Normalize, deduplicate, and screen papers\n- Verify URLs\u002FPDF\u002Fcode availability\n- Synthesize evidence and generate review reports\n- Output structured artifacts for reproducibility\n\nEach run creates a dedicated folder under `runs\u002F` with full state, logs, and intermediate files.\n\n## 🚀 Highlights\n\n### Core experience\n- Natural-language intake with LLM-assisted interpretation\n- Cloudflare Workers online demo for lightweight search plans, public-source candidates, and downloadable Markdown\u002FHTML reports\n- Interactive shell with:\n  - `\u002Fmodel` to manage LLM profiles\n  - `\u002Fsources` to inspect search source\u002FAPI status\n  - `\u002Fdoctor` for quick self-checks\n- Multi-source retrieval with source registry and diagnostics\n- Resume\u002Finspect modes for reproducible research sessions\n\n### Retrieval and screening\n- Protocol-aware search using plan + diversified keywords\n- Canonicalized `Paper` schema and robust deduplication\n- Core\u002Fadjacent\u002Fexcluded paper classification\n- PDF + code-link verification (no paywall bypass)\n- Optional full-text extraction from downloadable PDFs\n\n### Reporting\n- Canonical bilingual report model\n- Consistent `[1][2][3]` citation mapping\n- Method taxonomy and evidence matrix\n- Markdown + HTML + PDF outputs with aligned content\n- Browser demo can download a lightweight Markdown\u002FHTML briefing based on public metadata and abstracts\n- Final report view keeps up to 100 papers by default, without a hard minimum\n- Obsidian Wiki export with paper, method, topic, and claim notes\n\n### Quality controls\n- Quality gates and reflection workflow\n- Evidence ledger linking claims to corpus evidence\n- Review checks for citation compliance and source reliability\n- Event stream logs 
for auditability\n\n## 🗂 Source stack\n\nDefault free sources:\n\n- arXiv\n- Semantic Scholar\n- OpenAlex\n- Crossref\n- OpenReview\n- PubMed \u002F NCBI E-utilities\n- Europe PMC\n- bioRxiv \u002F medRxiv\n- DBLP\n- ACL Anthology\n- Papers.cool\n\nOptional API-key sources:\n\n- DeepXiv \u002F Agentic Data\n- CORE\n- Lens.org Scholarly API\n- IEEE Xplore\n- Springer Nature\n- Elsevier \u002F Scopus\n- Dimensions\n\n## 🛠 Installation\n\n```bash\npython -m pip install paperpilot -i https:\u002F\u002Fpypi.org\u002Fsimple\n```\n\nLocal development:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FCHB-learner\u002FPaperPilot.git\ncd PaperPilot\npython -m pip install -e .\n```\n\n## ⚙️ LLM + Source Configuration\n\nPaperPilot requires OpenAI-compatible LLM settings for query understanding, planning, synthesis, and report generation.\n\nOn first run, it creates an editable configuration template at:\n\n```text\n~\u002F.paperpilot\u002Fconfig.json\n```\n\nMinimal default template:\n\n```json\n{\n  \"active\": \"default\",\n  \"profiles\": {\n    \"default\": {\n      \"api_key\": \"\",\n      \"base_url\": \"\",\n      \"model\": \"gpt-5.2\"\n    }\n  },\n  \"sources\": {\n    \"core\": {\"enabled\": null, \"api_key\": \"\", \"base_url\": \"\"},\n    \"lens\": {\"enabled\": null, \"api_key\": \"\", \"base_url\": \"\"},\n    \"ieee\": {\"enabled\": null, \"api_key\": \"\", \"base_url\": \"\"},\n    \"springer\": {\"enabled\": null, \"api_key\": \"\", \"base_url\": \"\"},\n    \"elsevier\": {\"enabled\": null, \"api_key\": \"\", \"base_url\": \"\"},\n    \"dimensions\": {\"enabled\": null, \"api_key\": \"\", \"base_url\": \"\"},\n    \"deepxiv\": {\"enabled\": null, \"api_key\": \"\", \"base_url\": \"\"}\n  }\n}\n```\n\nNotes:\n\n- Leave optional source API keys empty if unavailable.\n- `enabled: null` means auto-enable once a valid key is provided.\n- `~\u002F.paperpilot\u002Fconfig.json` is not committed; edit it directly or use CLI commands.\n\n### CLI config 
commands\n\n```bash\nPaperPilot config set --base-url https:\u002F\u002Fapi.deepseek.com --model deepseek-chat\nPaperPilot config import .\u002Fapi.json\nPaperPilot config list\nPaperPilot config use deepseek\nPaperPilot config show\nPaperPilot --doctor\n```\n\n```bash\nPaperPilot sources list\nPaperPilot sources config core\nPaperPilot sources config deepxiv\nPaperPilot sources enable core\nPaperPilot sources test core\n```\n\nInside interactive mode, use `\u002Fsources` and `\u002Fdoctor`.\n\n### Cloudflare Workers online demo configuration\n\nThe hosted demo runs on Cloudflare Workers at `https:\u002F\u002Fpaperpilot.aleck-757.workers.dev\u002F` and serves `\u002Fapi\u002Fliterature-search` from the Worker. `wrangler.jsonc` includes safe defaults for the online experience:\n\n```text\nLLM_BASE_URL=https:\u002F\u002Fapi.deepseek.com\nLLM_MODEL=deepseek-v4-flash\nLLM_API_KEY=123456\n```\n\nReplace the placeholder `LLM_API_KEY` in Cloudflare `Variables and Secrets` with a real server-side key. The frontend calls the Worker API and never embeds the key in browser code. 
The online demo uses OpenAlex and Crossref as public metadata sources; Semantic Scholar is skipped unless `SEMANTIC_SCHOLAR_API_KEY` is configured to avoid public API rate limits.\n\n## 🔑 API source keys references\n\n| Source | Access page |\n|---|---|\n| CORE | https:\u002F\u002Fcore.ac.uk\u002Fservices\u002Fapi |\n| Lens.org | https:\u002F\u002Fdocs.api.lens.org\u002F |\n| IEEE Xplore | https:\u002F\u002Fdeveloper.ieee.org\u002Fgetting_started |\n| Springer Nature | https:\u002F\u002Fdev.springernature.com\u002F |\n| Elsevier \u002F Scopus | https:\u002F\u002Fdev.elsevier.com\u002F |\n| Dimensions | https:\u002F\u002Fdocs.dimensions.ai\u002Fdsl\u002Fapi.html |\n| DeepXiv \u002F Agentic Data | https:\u002F\u002Fdata.rag.ac.cn\u002Fapi\u002Fdocs |\n| Papers.cool | https:\u002F\u002Fpapers.cool |\n\n## 🧪 Quick Start\n\nInteractive usage:\n\n```bash\nPaperPilot\n```\n\nCommand mode example:\n\n```bash\nPaperPilot \"RNA inverse folding sequence design\" \\\n  --auto-confirm \\\n  --max-papers 50 \\\n  --since-year 2021 \\\n  --github-filter required \\\n  --sources auto \\\n  --mode apa \\\n  --quality balanced\n```\n\nImport local corpus and skip download:\n\n```bash\nPaperPilot \"RNA inverse folding sequence design\" \\\n  --auto-confirm \\\n  --user-corpus .\u002Fpapers \\\n  --user-corpus references.bib \\\n  --no-download\n```\n\nInspect\u002Fresume workflow:\n\n```bash\nPaperPilot inspect runs\u002F\u003Ctask-id>\nPaperPilot resume runs\u002F\u003Ctask-id>\n```\n\n## 🧭 Workflow\n\nPaperPilot follows this state-machine pipeline:\n\n```text\nIntake -> Protocol -> Search -> Corpus -> Screening -> Verification -> Synthesis -> Review -> Report\n```\n\n```mermaid\nflowchart LR\n  U[\"User request\"] --> C[\"Run context\"]\n  C --> QA[\"Query understanding\"]\n  QA --> PL[\"Planning + Protocol\"]\n  PL --> ST[\"Source Registry search\"]\n  ST --> NB[\"Corpus normalization\"]\n  NB --> SC[\"Core \u002F adjacent screening\"]\n  SC --> VF[\"Verification + PDF + code 
checks\"]\n  VF --> SY[\"Literature matrix\"]\n  SY --> QG[\"Quality gate + reflection\"]\n  QG --> EL[\"Evidence ledger\"]\n  EL --> RP[\"Report render: ZH \u002F EN\"]\n```\n\n## 📁 Run artifacts\n\n`runs\u002F\u003Ctask-id>\u002F` will contain:\n\n- `task.json` \u002F `state.json` \u002F `events.jsonl` \u002F `manifest.json`\n- `planning\u002F`: query understanding, search plan, protocol, prompt and registry manifests\n- `search\u002F`: raw normalized metadata and source diagnostics\n- `corpus\u002F`: screened corpus, core\u002Fadjacent\u002Fexcluded sets, ranked report papers\n- `verification\u002F`: verification records, quality gate, reflection, download log, evidence ledger, review findings\n- `synthesis\u002F`: literature matrix and field-level synthesis\n- `reports\u002F`: `report.canonical.json`, bilingual Markdown, HTML, and PDF reports\n- `assets\u002Fpdfs\u002F` and `assets\u002Ffulltext\u002F`: downloaded open PDFs and extracted full text\n- `wiki\u002Fobsidian\u002F`: Obsidian knowledge graph with notes, wikilinks, and lint metadata\n\n## 🧠 Obsidian Wiki\n\nEach successful run generates `runs\u002F\u003Ctask-id>\u002Fwiki\u002Fobsidian\u002F` by default. 
Open that folder as an Obsidian vault to browse:\n\n- `index.md`: research entry point and reported-paper overview\n- `papers\u002F`: one note per reported paper with citation label, PDF\u002Fcode links, method family, and evidence basis\n- `methods\u002F`: method-family notes linked to representative papers\n- `topics\u002F`: query\u002Fsubtopic notes\n- `claims\u002F`: evidence-map claim notes\n- `_meta\u002Fmanifest.json` and `_meta\u002Fwiki_lint.json`: provenance, hashes, broken-link checks\n\nUse `--no-obsidian-wiki` to skip Wiki generation.\n\nFor a public-safe ScholarFlow-style vault layout and config template, see:\n\n- [`docs\u002Fscholarflow-vault-example.md`](docs\u002Fscholarflow-vault-example.md)\n- [`examples\u002Fscholarflow.example.json`](examples\u002Fscholarflow.example.json)\n\nExample `summary.md` auto-index table:\n\n| Date | Paper | Notes | Code | Source | Remarks |\n|---|---|---|---|---|---|\n| 2026.05.20 | [CitationGraph-RAG](https:\u002F\u002Fexample.org\u002Fpapers\u002Fcitationgraph-rag) | To read | [GitHub](https:\u002F\u002Fgithub.com\u002Fexample\u002Fcitationgraph-rag) | [arXiv](https:\u002F\u002Farxiv.org\u002F) | Public demo row |\n| 2026.05.18 | [BenchAgent-Eval](https:\u002F\u002Fexample.org\u002Fpapers\u002Fbenchagent-eval) | Draft note |  | [OpenReview](https:\u002F\u002Fopenreview.net\u002F) | Sanitized example |\n\nThis table is written as normal Markdown, not inside a fenced code block, so GitHub can render it.\n\n## 🧩 Code filter modes\n\n- `any`: keep all papers and annotate code availability\n- `required`: keep only papers with detected code repositories in final view\n- `none`: keep only papers without detected public code links\n\n## 🧪 CLI options (important ones)\n\n```text\n--max-papers INT                 maximum papers in final report view; default: 100\n--min-report-papers INT          optional minimum report size; default: 0\n--since-year INT                 preferred lower year bound\n--github-filter 
any|required|none\n--github-search-limit INT\n--no-download                    skip PDF downloads\n--pdf-limit INT                  maximum PDFs to download\n--user-corpus PATH               repeatable local corpus path\n--mode quick|apa|systematic\n--interaction auto|gated\n--quality fast|balanced|strict\n--include-adjacent               include adjacent papers in appendices\n--sources auto|all|core|biomed|cs|configured\n--enable-source SOURCE           enable one source (repeatable)\n--disable-source SOURCE          disable one source (repeatable)\n--no-obsidian-wiki               skip Obsidian Wiki export\n```\n\nSee `paperpilot --help` for full options and Chinese\u002FEnglish output.\n\n## 🧱 Development notes\n\n- Keep run outputs and generated artifacts out of source control.\n- Keep API keys out of git history.\n- Prefer `.gitignore` over manual cleanup.\n- Use semantic tags for releases and keep `README` + docs aligned.\n- Keep `.github\u002Fworkflows\u002F*`, `RELEASING.md`, `CHANGELOG.md` in sync when publishing.\n\n## 🧭 Open source checklist\n\n- Ensure `~\u002F.paperpilot\u002Fconfig.json`, `api.json`, and `.env` with credentials are never committed.\n- Add\u002Fkeep `LICENSE` and `.gitignore`.\n- Add source code and tags before publishing release assets.\n- Publish GitHub Pages from `docs\u002F`.\n- Keep versions in `pyproject.toml`, `literature_agent\u002F__init__.py`, and generated manifests aligned.\n\n### One-command release\n\n```bash\n# dry-run checks only\n.\u002Fscripts\u002Frelease_everywhere.sh --dry-run\n\n# normal release (pushed commit + tag + GH release + PyPI)\nexport PYPI_TOKEN='pypi-...'\n.\u002Fscripts\u002Frelease_everywhere.sh\n\n# release without publishing to PyPI\n.\u002Fscripts\u002Frelease_everywhere.sh --no-pypi\n```\n\nSuggested publish flow (full):\n\n```bash\npython -m unittest discover -s tests\npython -m compileall literature_agent\n.\u002Fpublish_pypi.sh --dry-run --version \u003CVERSION>\ngit add -A\ngit commit -m 
\"chore: release v\u003CVERSION>\"\ngit tag -a v\u003CVERSION> -m \"v\u003CVERSION>\"\ngit push origin main --tags\n.\u002Fpublish_pypi.sh --version \u003CVERSION>\n```\n\nFor GitHub Pages: enable Pages to deploy from `main` + `\u002Fdocs`, or rely on `.github\u002Fworkflows\u002Fgh-pages.yml`.\n\n## 🙏 Acknowledgements\n\nPaperPilot is shaped by ideas from open academic-research and agent projects. Thanks to these projects and their authors for making their work public:\n\n- [LLMForEverybody](https:\u002F\u002Fgithub.com\u002Fluhengshiwo\u002FLLMForEverybody) for Agent design-pattern learning material.\n- [academic-research-skills](https:\u002F\u002Fgithub.com\u002FImbad0202\u002Facademic-research-skills) for research integrity, source verification, and structured synthesis inspiration.\n- [DeepTutor](https:\u002F\u002Fgithub.com\u002FHKUDS\u002FDeepTutor) for Tool\u002FCapability-style agent architecture ideas.\n- [obsidian-wiki](https:\u002F\u002Fgithub.com\u002Far9av\u002Fobsidian-wiki) for the Obsidian Wiki export direction.\n- [Research-Paper-Writing-Skills](https:\u002F\u002Fgithub.com\u002FMaster-cai\u002FResearch-Paper-Writing-Skills), [research-writing-skill](https:\u002F\u002Fgithub.com\u002FNorman-bury\u002Fresearch-writing-skill), and [SLR-FC](https:\u002F\u002Fgithub.com\u002Fdrshahizan\u002FSLR-FC) for literature review, research writing, and systematic-review workflow references.\n\n## 📚 Citation note\n\nIf you use PaperPilot in your work, include the repository URL and version used so results are reproducible.\n","PaperPilot is a command-line tool for scholarly literature search and review, particularly suited to research in artificial intelligence, biomedicine, and the sciences. Its core features include multi-source literature retrieval, code repository discovery, open PDF download, and bilingual Chinese-English reports grounded in an evidence chain. Written in Python, it parses user requests expressed in natural language, builds an explicit search protocol, queries multiple literature sources, deduplicates, screens, and verifies the results, and finally generates bilingual reports in Markdown, HTML, or PDF. An online demo is also available for a quick trial. The tool is well suited to researchers who need to systematically organize and analyze large bodies of literature.",2,"2026-06-11 03:58:41","CREATED_QUERY"]