[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-2210":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":27,"readmeContent":28,"aiSummary":29,"trendingCount":16,"starSnapshotCount":16,"syncStatus":30,"lastSyncTime":31,"discoverSource":32},2210,"paperclip","GXL-ai\u002Fpaperclip","GXL-ai","Paperclip — search, read, and analyze 8M+ biomedical papers from the command line","https:\u002F\u002Fpaperclip.gxl.ai",null,"Python",176,16,1,8,0,4,10,41,12,3.69,"Apache License 2.0",false,"main",true,[],"2026-06-12 02:00:38","# Paperclip\n\n**Search, read, and analyze biomedical papers, regulatory documents, and clinical trials from the command line.**\n\nPaperclip is a CLI and MCP server for AI agents where every document is a directory containing full text, sections, figures, and supplements on a virtual filesystem.\n\n- Search with natural language or regex across biomedical papers from bioRxiv, medRxiv, arXiv, and PubMed Central, plus FDA regulatory documents, ClinicalTrials.gov, and international regulatory and trial registries\n- Run parallel AI readers across papers with `map` and synthesize with `reduce`\n- Pipe results through standard Unix tools (`grep`, `awk`, `sed`, `jq`, etc.)\n- Ask questions about figures with vision AI\n- Query the database directly with SQL\n\nFull documentation: **[paperclip.gxl.ai](https:\u002F\u002Fpaperclip.gxl.ai)**\n\n## Community\n\nThis repository hosts the source code for the Paperclip CLI client. 
Use it to:\n\n- [Report bugs](https:\u002F\u002Fgithub.com\u002FGXL-ai\u002Fpaperclip\u002Fissues)\n- [Request features](https:\u002F\u002Fgithub.com\u002FGXL-ai\u002Fpaperclip\u002Fissues)\n- [Start discussions](https:\u002F\u002Fgithub.com\u002FGXL-ai\u002Fpaperclip\u002Fdiscussions)\n\n## Install\n\nPython 3.8+ required.\n\n```bash\ncurl -fsSL https:\u002F\u002Fpaperclip.gxl.ai\u002Finstall.sh | bash\n```\n\nInstalls to `~\u002F.paperclip\u002F` with a wrapper at `~\u002F.local\u002Fbin\u002Fpaperclip`.\n\nOr install via pip:\n\n```bash\npip install https:\u002F\u002Fpaperclip.gxl.ai\u002Fpaperclip.whl\npaperclip setup\n```\n\n### Sign in\n\nSign-in happens automatically on first use, or run manually:\n\n```bash\npaperclip login\n```\n\n### Verify\n\n```bash\npaperclip config\n# Server:  https:\u002F\u002Fpaperclip.gxl.ai\n# Auth:    ✓ you@example.com\n# Config:  ~\u002F.paperclip\n```\n\n## MCP Server (alternative)\n\nUse Paperclip as an MCP server directly — no local install needed.\n\n### Claude Code\n\n```bash\nclaude mcp add --transport http paperclip https:\u002F\u002Fpaperclip.gxl.ai\u002Fmcp\n```\n\nThen start `claude`, enter `\u002Fmcp`, and select Authenticate under the paperclip server.\n\n### Cursor\n\nAdd to `~\u002F.cursor\u002Fmcp.json` (or `.cursor\u002Fmcp.json` in your project):\n\n```json\n{\n  \"mcpServers\": {\n    \"paperclip\": {\n      \"url\": \"https:\u002F\u002Fpaperclip.gxl.ai\u002Fmcp\",\n      \"type\": \"http\"\n    }\n  }\n}\n```\n\nThen `Cmd\u002FCtrl + Shift + P` → Tools & MCPs, enable the `paperclip` server, and authenticate.\n\n## Quick Start\n\n```bash\n# Search for papers\npaperclip search \"CRISPR base editing efficiency\"\n\n# Read a paper's metadata\npaperclip cat \u002Fpapers\u002Fbio_4f78753a6feb\u002Fmeta.json\n\n# Preview the first 50 lines\npaperclip head -50 \u002Fpapers\u002Fbio_4f78753a6feb\u002Fcontent.lines\n\n# Grep within a single paper\npaperclip grep -i \"binding affinity\" 
\u002Fpapers\u002Fbio_4f78753a6feb\u002Fcontent.lines\n\n# Regex search across the entire corpus (sub-second)\npaperclip grep \"alphamissense\" \u002Fpapers\u002F\n\n# Map over search results with an AI reader\npaperclip map --from s_abc123 \"What methods were used?\"\n\n# Run SQL queries\npaperclip sql \"SELECT title, doi FROM documents WHERE authors ILIKE '%Doudna%' LIMIT 5\"\n\n# Save results to a local file\npaperclip search \"CRISPR\" -n 5 > results.txt\n```\n\nUse `paperclip bash '...'` for pipes and chains:\n\n```bash\npaperclip bash 'search \"protein folding\" | grep \"deep learning\"'\n```\n\n## Commands\n\n| Command | Description |\n|---------|-------------|\n| `search` | Hybrid search (BM25 + vector) across papers, regulatory documents, and trials |\n| `searches` | Run multiple queries in parallel and merge results |\n| `grep` | Regex search within a paper or across the entire corpus |\n| `scan` | Multi-pattern grep in a single pass |\n| `lookup` | Find papers by DOI, PMC ID, PMID, author, title, journal |\n| `sql` | Read-only SQL queries against the papers database |\n| `map` | Parallel AI reader across multiple papers |\n| `reduce` | Synthesize map results into summaries, tables, or themes |\n| `filter` | Filter search results for relevance |\n| `ask-image` | Analyze figures with vision AI |\n| `cat` | Read files from the paper filesystem |\n| `head` \u002F `tail` | Preview first or last lines |\n| `ls` \u002F `tree` | List directory contents |\n| `grep` \u002F `scan` | Search within papers |\n| `sed` \u002F `awk` \u002F `jq` | Text processing |\n| `results` | View, browse, and export saved results |\n| `config` | Show or set configuration, connection diagnostics |\n| `install` | Install agent skill for Claude Code, Cursor, or Codex |\n| `update` | Update to the latest version |\n| **Paper Repos** | |\n| `init` | Create a new paper repo |\n| `checkout` | List repos, switch repos or branches |\n| `add` \u002F `remove` | Add or remove papers |\n| 
`import` | Seed repo from a paper's bibliography |\n| `commit` | Snapshot with reasoning message |\n| `annotate` | Pin notes to specific papers |\n| `status` | Repo state: papers, branches, annotations |\n| `log` | Commit history |\n| `diff` | Compare commits or branches |\n| `export` | Export to BibTeX, RIS, Markdown, or CSV |\n| `branch` \u002F `merge` | Branching and merging |\n| `cite` | Citation counts and relationships |\n\n## Agent Integration\n\nInstall a skill so your coding agent can use Paperclip automatically:\n\n```bash\npaperclip install\n```\n\nSupports Claude Code, Cursor, and Codex. The skill teaches the agent the full command set. Then just mention `\u002Fpaperclip` in your prompt:\n\n> Using \u002Fpaperclip, find recent papers on GLP-1 receptor agonists and summarize the primary endpoints.\n\n## Paper Filesystem\n\nEach paper lives at `\u002Fpapers\u002F\u003Cid>\u002F`:\n\n```\nmeta.json        — title, authors, doi, date, abstract, journal\ncontent.lines    — full text, line-numbered (L\u003Cn>: \u003Ctext>)\nsections\u002F        — named section files (Introduction.lines, Methods.lines, ...)\nfigures\u002F         — figure files (PMC papers)\nsupplements\u002F     — supplementary files (PMC papers)\n```\n\nPaper IDs use prefixes by source: `bio_` (bioRxiv), `med_` (medRxiv), `PMC` (PubMed Central), `arx_` (arXiv). 
Regulatory documents and clinical trials are accessed via `\u002Ffda\u002F` and `\u002Fclinicaltrials\u002F` virtual directories.\n\n## Paper Repos\n\nBuild versioned, annotated collections of papers with git-like workflows:\n\n```bash\n# Create a repo and seed from a key paper's references\npaperclip init my-review \"Systematic review of XYZ\"\npaperclip import PMC11271413 --min-cites 50\npaperclip import refs.bib                    # import .bib\u002F.ris → library + repo\n\n# View your personal library (persists across repos)\npaperclip library\n\n# Curate: annotate, commit\npaperclip annotate PMC123 \"Key finding on mechanism X\"\npaperclip commit -m \"Initial seed from review + manual curation\"\n\n# Review your work\npaperclip repo                       # list all repos\npaperclip repo \u003Cname>                # repo overview: papers, branches, annotations\npaperclip log                        # commit history\npaperclip diff 9a6d..559a            # compare commits\n\n# Export to reference managers\npaperclip export bib -o refs.bib     # BibTeX (annotations in note field)\npaperclip export ris -o refs.ris     # RIS (Zotero, Paperpile, Mendeley, EndNote)\npaperclip export md -o review.md     # structured markdown report\npaperclip export csv -o papers.csv   # tabular data\n```\n\n## Saving files locally\n\nRedirect `cat` to write any paper file to disk. 
Text files come back as text; figures and other binaries stream as raw bytes when stdout is redirected (no base64 wrapping):\n\n```bash\npaperclip cat \u002Fpapers\u002FPMC10791696\u002Fmeta.json > meta.json\npaperclip cat \u002Fpapers\u002FPMC10791696\u002Ffigures\u002Ffig1.tif > fig1.tif\n```\n\nFor bulk, loop over `ls`:\n\n```bash\nmkdir -p figures\nfor f in $(paperclip ls \u002Fpapers\u002FPMC10791696\u002Ffigures\u002F); do\n  paperclip cat \u002Fpapers\u002FPMC10791696\u002Ffigures\u002F$f > figures\u002F$f\ndone\n```\n\n## Python SDK\n\nThe `gxl-paperclip` package ships a Python SDK alongside the CLI, so you can call Paperclip directly from scripts, notebooks, and other tools. Installing the package (via `pip install` or the installer script above) gives you both the `paperclip` command and the `gxl_paperclip` module.\n\n### Authentication\n\nThe SDK uses API keys (OAuth is reserved for interactive CLI sign-in). Create a key from the dashboard and make it available to your code:\n\n```bash\nexport PAPERCLIP_API_KEY=\"pk_...\"\n```\n\n```python\nfrom gxl_paperclip import PaperclipClient\n\nclient = PaperclipClient.from_env()           # picks up PAPERCLIP_API_KEY\n# — or pass an explicit strategy —\nfrom gxl_paperclip import APIKeyAuth\nclient = PaperclipClient(auth=APIKeyAuth(\"pk_...\"))\n```\n\n`from_env()` falls back to the credentials saved by `paperclip login` (`~\u002F.paperclip\u002Fcredentials.json`) via `FileCredentialsAuth` when no API key is set — handy on a workstation where you've already signed in.\n\n### Quick start\n\n```python\nfrom gxl_paperclip import PaperclipClient\n\nclient = PaperclipClient.from_env()\n\nresult = client.search(\"CRISPR lipid nanoparticle\", limit=5, source=\"pmc\")\nprint(result.output)           # same formatted text the CLI prints\nprint(result.result_id)        # e.g. 
\"s_14bebc10\" — pass to map_()\n\nfor event in client.map_(\"What delivery methods were used?\", from_results=result.result_id):\n    if event.type == \"progress\":\n        print(f\"{event.completed}\u002F{event.total} papers done\")\n    else:\n        print(event.output)\n```\n\n### Method reference\n\nEvery optional kwarg defaults to `None` (or `False` for flags) on the client, which means the flag is **omitted** from the underlying command — the server then applies its own default.\n\n#### `client.search(query, *, limit=None, source=None, exact=False, since=None, sort=None, author=None, journal=None, year=None, type=None, category=None, mode=None, all=False, timeout=None) -> ExecuteResult`\n\nHybrid search across bioRxiv, medRxiv, arXiv, PubMed Central, FDA, ClinicalTrials.gov, and international registries.\n\n| Argument | Default when omitted | Notes |\n|---|---|---|\n| `query` | required | Natural-language query string. |\n| `limit` | `100` | Server caps at 1000. |\n| `source` | PMC, bioRxiv, medRxiv, arXiv | Pass `\"pmc\"`, `\"biorxiv\"`, `\"medrxiv\"`, `\"arxiv\"`, `\"abstracts\"`, `\"fda\"`, `\"trials\"`, or a comma-separated list. |\n| `exact` | `False` | `True` switches search mode to phrase matching. |\n| `since` | no recency filter | e.g. `\"7d\"`, `\"30d\"`, `\"6m\"`, `\"1y\"`. |\n| `sort` | `\"relevance\"` | Pass `\"date\"` for newest-first. |\n| `author` | no filter | Substring match on authors. |\n| `journal` | no filter | PMC only. |\n| `year` | no filter | e.g. `2024`. |\n| `type` | no filter | e.g. `\"review-article\"` (PMC). |\n| `category` | no filter | e.g. `\"Neuroscience\"` (bioRxiv). |\n| `mode` | `\"any\"` | Also supports `\"all\"`, `\"50%\"`, `\"75%\"`. |\n| `all` | `False` | When `True`, searches the full corpus instead of the default recency-weighted slice. |\n| `timeout` | `120` s | Seconds before the request aborts. 
|\n\n#### `client.lookup(field, value, *, limit=None, timeout=None) -> ExecuteResult`\n\nLook up papers by a metadata field.\n\n| Argument | Default when omitted | Notes |\n|---|---|---|\n| `field` | required | `\"doi\"`, `\"pmc\"`, `\"pmid\"`, `\"author\"`, `\"title\"`, `\"journal\"`, `\"year\"`, `\"keywords\"`, etc. |\n| `value` | required | The value to match (partial, case-insensitive). |\n| `limit` | `25` | |\n| `timeout` | `120` s | |\n\n#### `client.sql(query, *, source=None, timeout=None) -> ExecuteResult`\n\nRead-only SQL over the `documents` table. 15s server-side timeout, 200-row cap.\n\n| Argument | Default when omitted | Notes |\n|---|---|---|\n| `query` | required | Must be a `SELECT` against `documents`. |\n| `source` | `\"all\"` | Pass `\"pmc\"` or `\"biorxiv\"` to restrict. |\n| `timeout` | `120` s | |\n\n#### `client.map_(question, *, from_results, timeout=None) -> Iterator[MapEvent]`\n\nRun an AI reader against every paper in a prior search\u002Flookup result set. Yields `MapProgressEvent` objects (OAuth streaming path) followed by a single `MapResultEvent`.\n\n| Argument | Default when omitted | Notes |\n|---|---|---|\n| `question` | required | Question asked against each paper. |\n| `from_results` | required | Pass the `result_id` returned by `search` or `lookup`. |\n| `timeout` | `300` s | Map defaults to the slow-command timeout. |\n\n#### `client.pull(target, dest=None, *, timeout=None) -> ExecuteResult`\n\nDownload a paper or single file from the virtual filesystem.\n\n| Argument | Default when omitted | Notes |\n|---|---|---|\n| `target` | required | e.g. `\"PMC10791696\"` or `\"PMC10791696\u002Ffigures\u002Ffig1.jpg\"`. |\n| `dest` | current directory | Output directory on the server's side of the command. 
|\n| `timeout` | `120` s | |\n\n#### `client.ask_image(path, question=None, *, fn=None, timeout=None) -> ExecuteResult`\n\nAnalyse a paper figure with vision AI.\n\n| Argument | Default when omitted | Notes |\n|---|---|---|\n| `path` | required | Figure path, e.g. `\"PMC11576387\u002Ffigures\u002Ffx1.jpg\"`. |\n| `question` | `\"Describe this figure in detail.\"` | Custom prompt. |\n| `fn` | free-form prompt | Pass `\"describe\"` or `\"extract-data\"` for canned flows. |\n| `timeout` | `300` s | Uses the slow-command default. |\n\n#### `client.bash(script, *, timeout=None) -> ExecuteResult`\n\nRun an arbitrary server-side pipeline, exactly like `paperclip bash '...'`.\n\n```python\nresult = client.bash('search \"protein folding\" | grep -i \"deep learning\"')\n```\n\n| Argument | Default when omitted | Notes |\n|---|---|---|\n| `script` | required | A single shell-style command string. |\n| `timeout` | `120` s | |\n\n#### `client.health(*, timeout=None) -> HealthStatus`\n\nPing the server and confirm auth works. Returns `HealthStatus(reachable: bool, output: str, exit_code: int)`.\n\n#### `client.results`\n\n- `client.results.list(*, limit=None) -> list[ResultRow]` — recent saved results for the authenticated user. Server default `limit` is `20`.\n- `client.results.get(result_id) -> ResultData` — raw saved output for a specific result ID (e.g. `\"s_14bebc10\"`, `\"m_ec2c9cc9\"`).\n\n#### `client.papers.*`\n\nTyped wrappers over the virtual filesystem commands. Each returns an `ExecuteResult`.\n\n| Method | Defaults |\n|---|---|\n| `papers.cat(path)` | no options |\n| `papers.head(path, *, lines=None)` | `lines` defaults to the CLI's `head` default (`10`). |\n| `papers.tail(path, *, lines=None)` | `lines` defaults to the CLI's `tail` default (`10`). |\n| `papers.ls(path)` | no options |\n| `papers.grep(pattern, path, *, ignore_case=False, extended=False)` | no flags passed when both are `False`. 
|\n| `papers.scan(path, patterns)` | multiple patterns OR'd in a single pass. |\n\n#### `client.execute(command, args=None, *, timeout=None) -> ExecuteResult`\n\nEscape hatch for any command without a typed wrapper (`sed`, `awk`, `sort`, `cut`, `tr`, `jq`, new server commands, ...). `args` is a list of argv tokens — the SDK quotes them for you.\n\n```python\nresult = client.execute(\"awk\", [\"-F\", \"\\t\", \"{print $1}\", \"\u002Fpapers\u002FPMC1\u002Fcontent.lines\"])\n```\n\n#### `client.stream(command, args=None, *, timeout=None) -> Iterator[MapEvent]`\n\nStreaming escape hatch. Currently only `\"map\"` streams; other commands raise `ValueError`.\n\n### Error handling\n\nAll HTTP and network failures raise a subclass of `PaperclipError`:\n\n```python\nfrom gxl_paperclip import (\n    AuthError, RateLimitError, NotFoundError, ServerError,\n    RequestTimeoutError, NetworkError,\n)\n\ntry:\n    client.search(\"AlphaFold\")\nexcept AuthError:\n    ...  # invalid API key or expired credentials\nexcept RateLimitError:\n    ...  # HTTP 429\nexcept RequestTimeoutError:\n    ...  # client-side timeout\n```\n\n### Result types\n\n- `ExecuteResult(output, exit_code, elapsed_ms, result_id, download_url, download_filename, cwd, raw)`\n- `MapProgressEvent(total, completed, failed, elapsed_s)`\n- `MapResultEvent(output, result_id, elapsed_ms, exit_code)`\n- `ResultRow(result_id, command, raw_input, latency_ms, created_at, raw)`\n- `ResultData(result_id, output, command, raw_input, latency_ms, created_at, raw)`\n- `HealthStatus(reachable, output, exit_code, elapsed_ms)`\n\n## License\n\nApache-2.0 — see [LICENSE](LICENSE).\n","Paperclip is a command-line tool for searching, reading, and analyzing more than 8 million biomedical papers. It supports natural-language or regex search across papers from bioRxiv, medRxiv, arXiv, and PubMed Central, as well as FDA regulatory documents, ClinicalTrials.gov, and other resources. Users can process multiple papers with parallel AI readers and post-process the results with standard Unix tools such as `grep`, `awk`, and `sed`. Paperclip also supports asking questions about figures with vision AI and querying the database directly with SQL. It is suited to researchers, data analysts, and anyone who needs to quickly access and process large volumes of biomedical literature.",2,"2026-06-11 02:48:52","CREATED_QUERY"]