[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-546":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":25,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":42,"readmeContent":43,"aiSummary":44,"trendingCount":16,"starSnapshotCount":16,"syncStatus":45,"lastSyncTime":46,"discoverSource":47},546,"docling","docling-project\u002Fdocling","docling-project","Get your documents ready for gen AI","https:\u002F\u002Fdocling-project.github.io\u002Fdocling",null,"Python",61395,4287,219,863,0,71,427,1835,345,120,"MIT License",false,"main",true,[27,28,29,30,31,32,33,34,35,36,37,38,39,40,41],"ai","convert","document-parser","document-parsing","documents","docx","html","markdown","pdf","pdf-converter","pdf-to-json","pdf-to-text","pptx","tables","xlsx","2026-06-12 04:00:04","\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fdocling-project\u002Fdocling\">\n    \u003Cimg loading=\"lazy\" alt=\"Docling\" src=\"https:\u002F\u002Fgithub.com\u002Fdocling-project\u002Fdocling\u002Fraw\u002Fmain\u002Fdocs\u002Fassets\u002Fdocling_processing.png\" width=\"100%\"\u002F>\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n# Docling\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F12132\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Ftrendshift.io\u002Fapi\u002Fbadge\u002Frepositories\u002F12132\" alt=\"DS4SD%2Fdocling | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" 
height=\"55\"\u002F>\u003C\u002Fa>\n\u003C\u002Fp>\n\n[![arXiv](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FarXiv-2408.09869-b31b1b.svg)](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.09869)\n[![Docs](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fdocs-live-brightgreen)](https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002F)\n[![PyPI version](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fdocling)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fdocling\u002F)\n[![PyPI - Python Version](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fpyversions\u002Fdocling)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fdocling\u002F)\n[![uv](https:\u002F\u002Fimg.shields.io\u002Fendpoint?url=https:\u002F\u002Fraw.githubusercontent.com\u002Fastral-sh\u002Fuv\u002Fmain\u002Fassets\u002Fbadge\u002Fv0.json)](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fuv)\n[![Ruff](https:\u002F\u002Fimg.shields.io\u002Fendpoint?url=https:\u002F\u002Fraw.githubusercontent.com\u002Fastral-sh\u002Fruff\u002Fmain\u002Fassets\u002Fbadge\u002Fv2.json)](https:\u002F\u002Fgithub.com\u002Fastral-sh\u002Fruff)\n[![Pydantic v2](https:\u002F\u002Fimg.shields.io\u002Fendpoint?url=https:\u002F\u002Fraw.githubusercontent.com\u002Fpydantic\u002Fpydantic\u002Fmain\u002Fdocs\u002Fbadge\u002Fv2.json)](https:\u002F\u002Fpydantic.dev)\n[![pre-commit](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https:\u002F\u002Fgithub.com\u002Fpre-commit\u002Fpre-commit)\n[![License MIT](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002Fdocling-project\u002Fdocling)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT)\n[![PyPI Downloads](https:\u002F\u002Fstatic.pepy.tech\u002Fbadge\u002Fdocling\u002Fmonth)](https:\u002F\u002Fpepy.tech\u002Fprojects\u002Fdocling)\n[![Docling 
Actor](https:\u002F\u002Fapify.com\u002Factor-badge?actor=vancura\u002Fdocling&fpr=docling)](https:\u002F\u002Fapify.com\u002Fvancura\u002Fdocling)\n[![Chat with Dosu](https:\u002F\u002Fdosu.dev\u002Fdosu-chat-badge.svg)](https:\u002F\u002Fapp.dosu.dev\u002F097760a8-135e-4789-8234-90c8837d7f1c\u002Fask?utm_source=github)\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F1399788921306746971?color=6A7EC2&logo=discord&logoColor=ffffff)](https:\u002F\u002Fdocling.ai\u002Fdiscord)\n[![OpenSSF Best Practices](https:\u002F\u002Fwww.bestpractices.dev\u002Fprojects\u002F10101\u002Fbadge)](https:\u002F\u002Fwww.bestpractices.dev\u002Fprojects\u002F10101)\n[![LF AI & Data](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLF%20AI%20%26%20Data-003778?logo=linuxfoundation&logoColor=fff&color=0094ff&labelColor=003778)](https:\u002F\u002Flfaidata.foundation\u002Fprojects\u002F)\n\n## What is Docling?\n\nDocling simplifies document processing: it parses diverse formats, including advanced PDF understanding, and provides seamless integrations with the gen AI ecosystem.\n\n## Features\n\n- 🗂️ Parsing of [multiple document formats][supported_formats] incl. PDF, DOCX, PPTX, XLSX, HTML, WAV, MP3, WebVTT, images (PNG, TIFF, JPEG, ...), LaTeX, plain text, and more\n- 📑 Advanced PDF understanding incl. page layout, reading order, table structure, code, formulas, image classification, and more\n- 🧬 Unified, expressive [DoclingDocument][docling_document] representation format\n- ↪️ Various [export formats][supported_formats] and options, including Markdown, HTML, WebVTT, [DocTags](https:\u002F\u002Farxiv.org\u002Fabs\u002F2503.11576) and lossless JSON\n- 📜 Support of several application-specific XML schemas incl. 
[USPTO](https:\u002F\u002Fwww.uspto.gov\u002Fpatents) patents, [JATS](https:\u002F\u002Fjats.nlm.nih.gov\u002F) articles, and [XBRL](https:\u002F\u002Fwww.xbrl.org\u002F) financial reports.\n- 🔒 Local execution capabilities for sensitive data and air-gapped environments\n- 🤖 Plug-and-play [integrations][integrations] incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI\n- 🔍 Extensive OCR support for scanned PDFs and images\n- 👓 Support of several Visual Language Models ([GraniteDocling](https:\u002F\u002Fhuggingface.co\u002Fibm-granite\u002Fgranite-docling-258M))\n- 🎙️ Audio support with Automatic Speech Recognition (ASR) models\n- 🔌 Connect to any agent using the [MCP server](https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002Fusage\u002Fmcp\u002F)\n- 💻 Simple and convenient CLI\n\n### What's new\n\n- 📤 Structured [information extraction][extraction] \\[🧪 beta\\]\n- 📑 New layout model (**Heron**) by default, for faster PDF parsing\n- 🔌 [MCP server](https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002Fusage\u002Fmcp\u002F) for agentic applications\n- 💼 Parsing of XBRL (eXtensible Business Reporting Language) documents for financial reports\n- 💬 Parsing of WebVTT (Web Video Text Tracks) files and export to WebVTT format\n- 💬 Parsing of LaTeX files\n- 📝 Parsing of plain-text files (`.txt`, `.text`) and Markdown supersets (`.qmd`, `.Rmd`)\n- 📝 Chart understanding (Barchart, Piechart, LinePlot): converting them into tables, code or adding detailed descriptions\n\n### Coming soon\n\n- 📝 Metadata extraction, including title, authors, references & language\n- 📝 Complex chemistry understanding (Molecular structures)\n\n## Quickstart\n\n### 1. Install\n\n```bash\npip install docling\n```\n\n> **Note:** Python 3.9 support was dropped in docling version 2.70.0. Please use Python 3.10 or higher.\n\nWorks on macOS, Linux and Windows environments. 
Both x86_64 and arm64 architectures are supported.\n\nMore [detailed installation instructions](https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002Finstallation\u002F) are available in the docs.\n\n### 2. Convert a document (CLI)\n\n```bash\ndocling https:\u002F\u002Farxiv.org\u002Fpdf\u002F2206.01062\n```\n\nThis generates a Markdown (`.md`) file in the current directory containing the structured document content.\n\nYou can also use 🥚[GraniteDocling](https:\u002F\u002Fhuggingface.co\u002Fibm-granite\u002Fgranite-docling-258M) and other VLMs via the Docling CLI:\n\n```bash\ndocling --pipeline vlm --vlm-model granite_docling https:\u002F\u002Farxiv.org\u002Fpdf\u002F2206.01062\n```\n\n### 3. Python usage (recommended)\n\n```python\nfrom docling.document_converter import DocumentConverter\n\nsource = \"https:\u002F\u002Farxiv.org\u002Fpdf\u002F2408.09869\"  # local path or URL of the document\nconverter = DocumentConverter()\nresult = converter.convert(source)\nprint(result.document.export_to_markdown())  # output: \"## Docling Technical Report[...]\"\n```\n\nSee the docs for more advanced [usage](https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002Fusage\u002F) and [configuration](https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002Finstallation\u002F) options.\n\n## Documentation\n\nCheck out Docling's [documentation](https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002F) for details on\ninstallation, usage, concepts, recipes, extensions, and more.\n\n## Examples\n\nGo hands-on with our [examples](https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002Fexamples\u002F),\ndemonstrating how to address different application use cases with Docling.\n\n## Integrations\n\nTo further accelerate your AI application development, check out Docling's native\n[integrations](https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002Fintegrations\u002F) with popular frameworks\nand tools.\n\n## Get help and support\n\nPlease feel free to connect with us using the [discussion 
section](https:\u002F\u002Fgithub.com\u002Fdocling-project\u002Fdocling\u002Fdiscussions).\n\n## Technical report\n\nFor more details on Docling's inner workings, check out the [Docling Technical Report](https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.09869).\n\n## Contributing\n\nPlease read [Contributing to Docling](https:\u002F\u002Fgithub.com\u002Fdocling-project\u002Fdocling\u002Fblob\u002Fmain\u002FCONTRIBUTING.md) for details.\n\n## References\n\nIf you use Docling in your projects, please consider citing the following:\n\n```bib\n@techreport{Docling,\n  author = {Deep Search Team},\n  month = {8},\n  title = {Docling Technical Report},\n  url = {https:\u002F\u002Farxiv.org\u002Fabs\u002F2408.09869},\n  eprint = {2408.09869},\n  doi = {10.48550\u002FarXiv.2408.09869},\n  version = {1.0.0},\n  year = {2024}\n}\n```\n\n## License\n\nThe Docling codebase is under the MIT license.\nFor individual model usage, please refer to the model licenses found in the original packages.\n\n## LF AI & Data\n\nDocling is hosted as a project in the [LF AI & Data Foundation](https:\u002F\u002Flfaidata.foundation\u002Fprojects\u002F).\n\n### IBM ❤️ Open Source AI\n\nThe project was started by the AI for knowledge team at IBM Research Zurich.\n\n[supported_formats]: https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002Fusage\u002Fsupported_formats\u002F\n[docling_document]: https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002Fconcepts\u002Fdocling_document\u002F\n[integrations]: https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002Fintegrations\u002F\n[extraction]: https:\u002F\u002Fdocling-project.github.io\u002Fdocling\u002Fexamples\u002Fextraction\u002F\n","Docling is a tool for document processing and parsing that supports converting files in multiple formats and integrates seamlessly with the generative AI ecosystem. Its core functionality includes parsing common document types such as PDF, DOCX, PPTX, XLSX, and HTML, with particularly deep understanding of PDF documents, such as page layout analysis and table structure recognition. In addition, Docling provides a rich set of API interfaces, making it easy for developers to integrate it into their own applications. The project is well suited to scenarios that require converting documents of different formats into a unified data structure for further processing or analysis, such as automated document processing and knowledge base construction.",2,"2026-06-11 02:37:22","top_all"]