[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82053":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":17,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":22,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":27,"readmeContent":28,"aiSummary":29,"trendingCount":15,"starSnapshotCount":15,"syncStatus":16,"lastSyncTime":30,"discoverSource":31},82053,"clawpdf","openclaw\u002Fclawpdf","openclaw","Zero-dependency PDFium WebAssembly bindings for Node and browsers.","https:\u002F\u002Fclawpdf.dev",null,"TypeScript",90,5,23,0,2,65,2.33,"MIT License",false,"main",true,[24,25,26],"node","pdf","wasm","2026-06-12 02:04:22","# clawpdf\n\n![clawpdf banner](docs\u002Fassets\u002Freadme-banner.jpg)\n\n[![CI](https:\u002F\u002Fgithub.com\u002Fopenclaw\u002Fclawpdf\u002Factions\u002Fworkflows\u002Fci.yml\u002Fbadge.svg)](https:\u002F\u002Fgithub.com\u002Fopenclaw\u002Fclawpdf\u002Factions\u002Fworkflows\u002Fci.yml)\n\nZero-dependency PDFium WebAssembly bindings for Node and browsers.\n\nDocs: \u003Chttps:\u002F\u002Fclawpdf.dev\u002F>\n\n`clawpdf` loads PDFs, extracts text, renders pages, and encodes PNG fallback\nimages without runtime dependencies, native addons, postinstall scripts, or a\ncanvas package.\n\n## Why\n\nOpenClaw needs a predictable local PDF path:\n\n- text extraction before model fallback\n- page rendering when a PDF has little extractable text\n- PNG output for multimodal model input\n- one dependency with no transitive package tree\n- current vendored PDFium provenance\n\nThis package currently vendors `pdfium-lib` release `7623`.\n\n## Install\n\n```bash\nnpm install 
clawpdf\n```\n\nESM-only. Node 20+ is supported.\n\n## Quick Start\n\n```ts\nimport { writeFile } from \"node:fs\u002Fpromises\";\nimport { openPdf } from \"clawpdf\";\n\nawait using pdf = await openPdf(\"report.pdf\");\n\nconsole.log(pdf.pageCount);\nconsole.log(pdf.text({ maxPages: 5 }));\n\nconst png = await pdf.page(1).png({ dpi: 144, forms: true });\nawait writeFile(\"page-1.png\", png);\n```\n\nAll user-facing page numbers are one-based.\n\n## CLI\n\nThe package also installs a `clawpdf` command:\n\n```bash\nclawpdf report.pdf\ncat report.pdf | clawpdf -\nclawpdf report.pdf --json\nclawpdf render report.pdf --page 1 > page.png\nclawpdf render report.pdf --page 1 --inline auto\n```\n\nUse `--password` or `--password-file` for encrypted PDFs. See the\n[CLI docs](https:\u002F\u002Fclawpdf.dev\u002Fcli.html) for flags, JSON output, and exit codes.\n\n## Reuse an Engine\n\nServer code should create one PDFium engine and reuse it:\n\n```ts\nimport { createEngine } from \"clawpdf\";\n\nawait using engine = await createEngine();\n\nawait using pdf = await engine.open(pdfBytes);\n\nconsole.log(pdf.metadata.title);\nconsole.log(pdf.page(1).text());\n```\n\nUse `engine.extract(...)` when you want the same text-first fallback behavior\nwithout manually opening and closing a document:\n\n```ts\nconst result = await engine.extract(pdfBytes, { mode: \"auto\", maxPages: 20 });\n```\n\n## Text-First Extraction\n\n```ts\nimport { extractPdf } from \"clawpdf\";\nimport { toMessageContent } from \"clawpdf\u002Fadapters\";\n\nconst result = await extractPdf(\"report.pdf\", {\n  mode: \"auto\",\n  minTextChars: 200,\n  maxPages: 20,\n  image: {\n    dpi: 96,\n    maxPixels: 4_000_000,\n    maxDimension: 10_000,\n    forms: true,\n  },\n});\n\nconsole.log(result.text);\nconsole.log(result.images); \u002F\u002F raw PNG bytes\nconsole.log(toMessageContent(result)); \u002F\u002F transport-shaped blocks\n```\n\n`auto` always extracts text and renders PNG images only when extracted text 
is\nshorter than `minTextChars`.\n\n## Browser Usage\n\nUse `clawpdf\u002Fbrowser` in bundled browser code. It exports the same API and\npre-wires the packaged WASM URL.\n\n```ts\nimport { openPdf } from \"clawpdf\u002Fbrowser\";\n\nawait using pdf = await openPdf(file);\nconsole.log(pdf.text({ maxPages: 3 }));\n```\n\nCustom WASM hosting is still available:\n\n```ts\nimport { createEngine } from \"clawpdf\u002Fbrowser\";\n\nawait using engine = await createEngine({\n  wasmUrl: \"\u002Fassets\u002Fpdfium.esm.wasm\",\n});\n```\n\n## Passwords\n\n```ts\nimport { openPdf } from \"clawpdf\";\n\nawait using pdf = await openPdf(\"secret.pdf\", { password: \"secret\" });\nconsole.log(pdf.text());\n```\n\nWrong or missing passwords throw `PdfPasswordError`.\n\n## API\n\nFeature docs:\n\n- [Loading PDFs](https:\u002F\u002Fclawpdf.dev\u002Floading.html)\n- [CLI](https:\u002F\u002Fclawpdf.dev\u002Fcli.html)\n- [Text extraction](https:\u002F\u002Fclawpdf.dev\u002Ftext-extraction.html)\n- [Page rendering](https:\u002F\u002Fclawpdf.dev\u002Fpage-rendering.html)\n- [PNG output](https:\u002F\u002Fclawpdf.dev\u002Fpng-output.html)\n- [Extraction fallback](https:\u002F\u002Fclawpdf.dev\u002Fextraction-fallback.html)\n- [Password-protected PDFs](https:\u002F\u002Fclawpdf.dev\u002Fpasswords.html)\n- [Browser and bundlers](https:\u002F\u002Fclawpdf.dev\u002Fbrowser-bundlers.html)\n- [PDFium provenance](https:\u002F\u002Fclawpdf.dev\u002Fpdfium-provenance.html)\n- [Package shape](https:\u002F\u002Fclawpdf.dev\u002Fpackage-shape.html)\n- [Performance](https:\u002F\u002Fclawpdf.dev\u002Fperformance.html)\n- [API reference](https:\u002F\u002Fclawpdf.dev\u002Fapi-reference.html)\n\nCore exports:\n\n- `extractPdf(input, options?)`: one-shot extraction with a shared engine.\n- `openPdf(input, options?)`: open one document with private lifetime.\n- `createEngine(options?)`: create a reusable PDFium engine.\n- `releaseExtractEngine()`: dispose the shared extraction engine after in-flight calls 
finish.\n- `encodePng(rgba, { width, height, compress })`: standalone RGBA to PNG.\n- `PdfError` subclasses for typed failures.\n- `PDFIUM_RELEASE` and `PDFIUM_WASM_SHA256`.\n\n## Performance Snapshot\n\nLocal Node benchmark on five sample PDFs, first page rendered at scale `2` with\ntext extraction and PNG encoding included.\n\n| Sample | previous stack total \u002F RSS \u002F PNG | clawpdf total \u002F RSS \u002F PNG |\n| --- | --- | --- |\n| Form | 95.4 ms \u002F 174.9 MB \u002F 114,930 B | 38.7 ms \u002F 129.4 MB \u002F 100,629 B |\n| Hello | 65.2 ms \u002F 159.7 MB \u002F 41,408 B | 27.2 ms \u002F 124.1 MB \u002F 47,106 B |\n| Scientific | 176.9 ms \u002F 202.0 MB \u002F 608,807 B | 66.0 ms \u002F 137.8 MB \u002F 321,122 B |\n| Magazine | 519.4 ms \u002F 312.0 MB \u002F 1,616,318 B | 255.9 ms \u002F 179.5 MB \u002F 1,930,947 B |\n| Checkmark | 2.6 ms \u002F 128.1 MB \u002F 589 B | 1.1 ms \u002F 83.2 MB \u002F 498 B |\n\n## Package Shape\n\nRuntime dependencies: none.\nRelease history: see `CHANGELOG.md`.\n\nPublished files:\n\n- `dist\u002Findex.js`\n- `dist\u002Fcli.d.ts`\n- `dist\u002Fcli.js`\n- `dist\u002Fbrowser.js`\n- `dist\u002Fadapters\u002Findex.js`\n- `dist\u002Fvendor\u002Fpdfium.esm.js`\n- `dist\u002Fvendor\u002Fpdfium.esm.wasm`\n- `CHANGELOG.md`\n- license\u002Freadme\u002Fnotices\n\nCurrent vendored binary:\n\n- `pdfium-lib`: `7623`\n- WASM SHA-256: `14ca2adbe23b45dea57da28ae2746e376f1cddfb8e2d0b01b71dcc5cf227734e`\n\n## Refresh PDFium\n\n```bash\npnpm download:pdfium\npnpm test\n```\n\nTo move to a newer `pdfium-lib` release, update the release tag and hashes in:\n\n- `scripts\u002Fdownload-pdfium.mjs`\n- `src\u002Fconstants.ts`\n- this README\n- `docs\u002Fpdfium-provenance.md`\n\n## License\n\nMIT for this wrapper. 
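PDFium has upstream BSD-style and Apache-2.0 notices; see\n`THIRD_PARTY_NOTICES.md`.\n","clawpdf is a zero-dependency PDFium WebAssembly binding library for Node.js and browser environments. It loads PDF files, extracts text, renders pages, and generates PNG fallback images, without runtime dependencies, native addons, or postinstall scripts. The project is written in TypeScript and uses WebAssembly to perform PDF operations directly on the client, providing an efficient, cross-platform solution. clawpdf is particularly suited to applications that need to process PDF documents in both front-end and back-end environments, such as online document viewers and automated report generation tools, and its concise API makes integration quick and simple.","2026-06-11 04:07:35","CREATED_QUERY"]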