[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-81878":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":15,"stars7d":15,"stars30d":15,"stars90d":14,"forks30d":14,"starsTrendScore":16,"compositeScore":17,"rankGlobal":9,"rankLanguage":9,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":9,"pushedAt":9,"updatedAt":23,"readmeContent":24,"aiSummary":25,"trendingCount":14,"starSnapshotCount":14,"syncStatus":15,"lastSyncTime":26,"discoverSource":27},81878,"pdf2md","deepdiy\u002Fpdf2md","deepdiy","A blazing fast, layout-aware PDF to Markdown converter built with Rust. Uses DocLayoutNet YOLO-based detection to preserve document structure — images, tables, formulas, captions, headers and more. Pre-built binaries available for macOS, Linux and Windows. Also offers a free online tool and API at pdf2md.deepdiy.net.",null,"Rust",23,1,21,0,2,6,46.1,"MIT License",false,"main",true,[],"2026-06-12 04:01:35","# Blazing Fast PDF to Markdown Converter\n\nConvert PDF to Markdown with layout detection — preserving images, tables, formulas, captions, headers, and footnotes. Built with Rust, NCNN, and MuPDF for maximum performance.\n\n**Try the free online converter:** [pdf2md.deepdiy.net](https:\u002F\u002Fpdf2md.deepdiy.net\u002F)\n\n## Features\n\n- **Layout-aware Markdown** — Uses DocLayoutNet YOLO-based detection to understand document structure. Output preserves headings, paragraphs, tables, lists, formulas, captions, and more in proper reading order.\n- **Images & Assets** — Automatically extracts embedded images and saves them alongside the Markdown output.\n- **Clean Output** — No unnecessary line breaks within paragraphs. 
Produces readable, well-formatted Markdown.\n- **Self-hostable** — Pre-built binaries for macOS, Linux, and Windows. No Docker or external services required.\n- **Free Web API** — No API key needed. Send a PDF and get back Markdown, image links, and a ZIP download.\n\n## Performance Comparison\n\n![PDF2MD performance comparison vs competitors — 10x faster on a 1c1g VPS](.\u002Fassets\u002Fcompetitor-comparison.png)\n*Faster than other PDF to Markdown tools on equivalent hardware.*\n\n![Self-host on a 1c1g VPS with no Docker required](.\u002Fassets\u002Fself-host-1c1g-vps.png)\n*Runs efficiently on a 1-core 1GB RAM VPS.*\n\n![Layout-aware Markdown preserves document structure including tables, lists, and headings](.\u002Fassets\u002Flayout-aware-markdown.png)\n*DocLayoutNet detection keeps the original layout intact.*\n\n![Clean Markdown without unnecessary line breaks inside paragraphs](.\u002Fassets\u002Fclean-markdown-no-line-breaks.png)\n*No broken inline text — every paragraph stays together.*\n\n![Free web service API for PDF to Markdown conversion](.\u002Fassets\u002Ffree-web-service-api.png)\n*No sign-up required. 
Upload and convert instantly.*\n\n## Pre-built Binaries\n\nDownload pre-compiled binaries for 4 platforms from the `dist\u002F` directory:\n\n| Platform | Binary |\n|----------|--------|\n| macOS (Apple Silicon) | `dist\u002Fpdf2md-macos-arm64` |\n| Linux (x86_64) | `dist\u002Fpdf2md-x86_64-unknown-linux-gnu` |\n| Linux (ARM64) | `dist\u002Fpdf2md-aarch64-unknown-linux-gnu` |\n| Windows (x86_64) | `dist\u002Fpdf2md-win10-x64.exe` |\n\n### Step 1 — Move files to your working directory\n\n```bash\nmv dist\u002Fpdf2md-\u003Cplatform> \u003Cworkdir>\u002F\nmv yolo26n-doclaynet_ncnn_model\u002F \u003Cworkdir>\u002F\n```\n\n### Step 2 — Run conversion\n\n```bash\ncd \u003Cworkdir>\n.\u002Fpdf2md-\u003Cplatform> \u003Cinput.pdf>\n```\n\nExport full-page layout input images:\n\n```bash\n.\u002Fpdf2md-\u003Cplatform> \u003Cinput.pdf> [output.md] --export-page-image\n```\n\n### Arguments\n\n| Argument | Description |\n|----------|-------------|\n| `input.pdf` | Input PDF file |\n| `output.md` | Output Markdown file (optional, defaults to stdout) |\n\n### Extra options\n\n| Option | Description |\n|--------|-------------|\n| `--asset-dir DIR` | Directory to export page assets |\n| `--detect-dpi N` | DPI for layout detection (default: `72`) |\n| `--asset-dpi N` | DPI for asset export (default: `150`) |\n| `--page N` | Process only the specified page |\n| `--model-dir PATH` | Path to the model directory (default: `.\u002Fyolo26n-doclaynet_ncnn_model\u002F`) |\n| `--export-page-image` | Export the full page image used as layout detection input; increase `--detect-dpi` if you need higher-resolution page images |\n\n## Build from Source\n\n```bash\ncargo build --release --bin pdf2md\n```\n\nThe compiled binary will be at `target\u002Frelease\u002Fpdf2md`.\n\n## Run from Source\n\n```bash\ncargo run --release --bin pdf2md -- .\u002Finput.pdf .\u002Foutput.md\n```\n\n## Self Hosting Streamlit App\n\nA browser-based UI for uploading PDFs and previewing Markdown output with 
images.\n\nThe app automatically detects your OS and architecture to find the right binary in `dist\u002F`. Install Streamlit and launch the app:\n\n```bash\npip install streamlit\nstreamlit run streamlit_app.py\n```\n\nTo use a custom binary or model directory, pass the paths after `--`:\n\n```bash\nstreamlit run streamlit_app.py -- \\\n  --pdf2md-bin .\u002Fdist\u002Fpdf2md-\u003Cplatform> \\\n  --model-dir \u002Fpath\u002Fto\u002Fyolo26n-doclaynet_ncnn_model\n```\n\n## Free PDF to Markdown API\n\nNo API key required. Submit a PDF and receive Markdown, extracted images, and a downloadable ZIP.\n\n**Endpoint**\n\n```http\nPOST https:\u002F\u002Fpdf2md.deepdiy.net\u002Fv1\u002Fconvert\nContent-Type: application\u002Fpdf\n```\n\n**curl example**\n\n```bash\ncurl -X POST \"https:\u002F\u002Fpdf2md.deepdiy.net\u002Fv1\u002Fconvert\" \\\n  -H \"Content-Type: application\u002Fpdf\" \\\n  --data-binary @paper.pdf\n```\n\n**Success response**\n\n```json\n{\n  \"status\": \"succeeded\",\n  \"markdown\": \"# Paper title\\n\\nConverted Markdown...\",\n  \"images\": [\n    {\n      \"path\": \"assets\u002Fpage_0001_order_0001_class_6.png\",\n      \"url\": \"https:\u002F\u002F...\"\n    }\n  ],\n  \"zip_url\": \"https:\u002F\u002F...\",\n  \"download_url\": \"https:\u002F\u002F...\",\n  \"expires_in\": 300\n}\n```\n\n**Error response** (HTTP 429)\n\n```json\n{\n  \"error\": \"busy\"\n}\n```\n\n> The system processes one request at a time across all users. If the server is busy, it returns HTTP 429. Wait 1 second and retry. 
Each conversion runs for up to 120 seconds, so a slot should open within that window.\n\n### API Limits\n\n| Item | Value |\n|------|-------|\n| Price | Free |\n| Max PDF size | 20 MB |\n| Concurrency | One request at a time (returns 429 if busy) |\n| Max task duration | 120 seconds |\n| Conversion timeout | 150 seconds |\n| Request timeout | 180 seconds |\n| ZIP download expiry | 5 minutes |\n\n## Detection Classes\n\nYou can use these class IDs to filter or block specific elements (e.g., Page-header, Footnote) from the output:\n\n`0`: Caption, `1`: Footnote, `2`: Formula, `3`: List-item, `4`: Page-footer, `5`: Page-header, `6`: Picture, `7`: Section-header, `8`: Table, `9`: Text, `10`: Title\n\n## License\n\nMIT License. See [`LICENSE`](.\u002FLICENSE).\n","deepdiy\u002Fpdf2md is a fast, layout-aware PDF to Markdown tool built with Rust. It uses DocLayoutNet YOLO-based detection to preserve document structure, including images, tables, formulas, and headings, and produces clean, readable output. It supports macOS, Linux, and Windows with pre-built binaries, and also offers a free online conversion service and an API. It suits converting complex PDFs such as academic papers and technical documentation into editable Markdown, especially for users who care about structural fidelity and reading experience.","2026-06-11 04:07:03","CREATED_QUERY"]