[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-1567":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":15,"stars30d":17,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":18,"rankGlobal":10,"rankLanguage":10,"license":19,"archived":20,"fork":20,"defaultBranch":21,"hasWiki":22,"hasPages":20,"topics":23,"createdAt":10,"pushedAt":10,"updatedAt":24,"readmeContent":25,"aiSummary":26,"trendingCount":16,"starSnapshotCount":16,"syncStatus":27,"lastSyncTime":28,"discoverSource":29},1567,"webpull","Dhravya\u002Fwebpull","Dhravya","instantly pull a website down as a clean directory locally","",null,"TypeScript",218,19,196,1,0,13,3.9,"MIT License",false,"main",true,[],"2026-06-12 02:00:29","# webpull\n\nPull any public docs site into local markdown files.\n\n```\n$ webpull https:\u002F\u002Fdocs.example.com\n\n  ⚡ webpull · 16 workers\n  docs.example.com → .\u002Fdocs.example.com\n\n  ●●●·●●●●·●●●●●●●·\n  ├─ ✓ getting-started\u002Finstallation.md\n  ├─ ✓ api\u002Fauthentication.md\n  ├─ ✓ guides\u002Fdeployment.md\n  █████████████░░░░░░░ 68% 102\u002F150 · 6p\u002Fs · 17.2s\n```\n\n## Install\n\n```bash\nbun install -g webpull\n```\n\n## Usage\n\n```\nwebpull \u003Curl> [options]\n\nOptions:\n  -o, --out \u003Cdir>   Output directory (default: .\u002F\u003Chostname>)\n  -m, --max \u003Cn>     Max pages to pull (default: 500)\n```\n\n## Examples\n\n```bash\n# Pull React docs\nwebpull https:\u002F\u002Freact.dev\u002Freference\n\n# Custom output dir, limit to 100 pages\nwebpull https:\u002F\u002Fdocs.python.org -o .\u002Fpython-docs -m 100\n```\n\n## How it works\n\n1. **Discovers pages** via sitemap.xml, nav link extraction, JS bundle route parsing, or link crawling\n2. 
**Fetches in parallel** using a worker pool sized to your CPU cores\n3. **Renders SPAs** with headless Chromium when JavaScript-rendered content is detected\n4. **Converts to markdown** using [Defuddle](https:\u002F\u002Fgithub.com\u002Fnichochar\u002Fdefuddle) for intelligent content extraction\n5. **Writes to disk** preserving the URL path structure with YAML frontmatter\n\nEach markdown file includes metadata:\n\n```yaml\n---\ntitle: \"Getting Started\"\nurl: \"https:\u002F\u002Fdocs.example.com\u002Fgetting-started\"\n---\n```\n\n## Requirements\n\n- [Bun](https:\u002F\u002Fbun.sh) runtime\n- [Playwright](https:\u002F\u002Fplaywright.dev) Chromium (auto-used for SPAs; install with `npx playwright install chromium`)\n\n## License\n\nMIT\n","webpull is a tool for quickly downloading public documentation sites and converting them into local Markdown files. Its core features include discovering pages through several methods, fetching pages in parallel, rendering single-page applications with headless Chromium, intelligently converting HTML into Markdown files with metadata, and preserving the original URL structure. The tool is written in TypeScript and supports options such as a custom output directory and a maximum page count, making it well suited to accessing online documentation offline or processing it further (for example, generating a static site).",2,"2026-06-11 02:44:43","CREATED_QUERY"]