[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-1379":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":22,"hasPages":24,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":31,"readmeContent":32,"aiSummary":33,"trendingCount":16,"starSnapshotCount":16,"syncStatus":34,"lastSyncTime":35,"discoverSource":36},1379,"tesseract.js","naptha\u002Ftesseract.js","naptha","Pure Javascript OCR for more than 100 Languages 📖🎉🖥","http:\u002F\u002Ftesseract.projectnaptha.com\u002F",null,"JavaScript",38133,2363,478,31,0,15,76,5,45,"Apache License 2.0",false,"master",true,[26,27,28,29,30],"deep-learning","javascript","ocr","tesseract","webassembly","2026-06-12 02:00:27","\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Ftesseract.projectnaptha.com\u002F\">\n    \u003Cpicture>\n      \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\".\u002Fdocs\u002Fimages\u002Ftesseract_dark.png\">\n      \u003Cimg width=\"256px\" height=\"256px\" alt=\"Tesseract.js\" src=\".\u002Fdocs\u002Fimages\u002Ftesseract.png\">\n    \u003C\u002Fpicture>\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n![Lint & Test](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js\u002Fworkflows\u002FNode.js%20CI\u002Fbadge.svg)\n![CodeQL](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js\u002Fworkflows\u002FCodeQL\u002Fbadge.svg)\n[![Gitpod Ready-to-Code](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FGitpod-ready--to--code-blue?logo=gitpod)](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js) \n[![Financial Contributors on Open 
Collective](https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Fall\u002Fbadge.svg?label=financial+contributors)](https:\u002F\u002Fopencollective.com\u002Ftesseractjs) [![npm version](https:\u002F\u002Fbadge.fury.io\u002Fjs\u002Ftesseract.js.svg)](https:\u002F\u002Fbadge.fury.io\u002Fjs\u002Ftesseract.js)\n[![Maintenance](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FMaintained%3F-yes-green.svg)](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js\u002Fgraphs\u002Fcommit-activity)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-Apache%202.0-blue.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FApache-2.0)\n[![Code Style](https:\u002F\u002Fbadgen.net\u002Fbadge\u002Fcode%20style\u002Fairbnb\u002Fff5a5f?icon=airbnb)](https:\u002F\u002Fgithub.com\u002Fairbnb\u002Fjavascript)\n![npm](https:\u002F\u002Fimg.shields.io\u002Fnpm\u002Fdm\u002Ftesseract.js?label=npm%20downloads)\n![jsDelivr hits (npm)](https:\u002F\u002Fimg.shields.io\u002Fjsdelivr\u002Fnpm\u002Fhm\u002Ftesseract.js?label=jsdelivr%20hits)\n\nTesseract.js is a JavaScript library that extracts text in [almost any language](.\u002Fdocs\u002Ftesseract_lang_list.md) from images. 
([Demo](http:\u002F\u002Ftesseract.projectnaptha.com\u002F))\n\nImage Recognition\n\n[![fancy demo gif](.\u002Fdocs\u002Fimages\u002Fdemo.gif)](http:\u002F\u002Ftesseract.projectnaptha.com)\n\nVideo Real-time Recognition\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fjeromewu\u002Ftesseract.js-video\">\u003Cimg alt=\"Tesseract.js Video\" src=\".\u002Fdocs\u002Fimages\u002Fvideo-demo.gif\">\u003C\u002Fa>\n\u003C\u002Fp>\n\nTesseract.js works in the browser using [webpack](https:\u002F\u002Fwebpack.js.org\u002F), esm, or plain script tags with a [CDN](#CDN) and on the server with [Node.js](https:\u002F\u002Fnodejs.org\u002Fen\u002F).\nAfter you [install it](#installation), using it is as simple as:\n\n```javascript\nimport { createWorker } from 'tesseract.js';\n\n(async () => {\n  const worker = await createWorker('eng');\n  const ret = await worker.recognize('https:\u002F\u002Ftesseract.projectnaptha.com\u002Fimg\u002Feng_bw.png');\n  console.log(ret.data.text);\n  await worker.terminate();\n})();\n```\nWhen recognizing multiple images, users should create a worker once, run `worker.recognize` for each image, and then run `worker.terminate()` once at the end (rather than running the above snippet for every image). \n\n## Installation\nTesseract.js works with a `\u003Cscript>` tag via local copy or CDN, with webpack via `npm` and on Node.js with `npm\u002Fyarn`.\n\n### CDN\n```html\n\u003C!-- v5 -->\n\u003Cscript src='https:\u002F\u002Fcdn.jsdelivr.net\u002Fnpm\u002Ftesseract.js@5\u002Fdist\u002Ftesseract.min.js'>\u003C\u002Fscript>\n```\nAfter including the script the `Tesseract` variable will be globally available and a worker can be created using `Tesseract.createWorker`.\n\nAlternatively, an ESM build (used with `import` syntax) can be found at `https:\u002F\u002Fcdn.jsdelivr.net\u002Fnpm\u002Ftesseract.js@5\u002Fdist\u002Ftesseract.esm.min.js`. 
\n\n### Node.js\n\n**Tesseract.js v7 requires Node.js v16 or newer.** (Tesseract.js v6 requires Node.js v14 or newer.)\n\n```shell\n# For latest version\nnpm install tesseract.js\nyarn add tesseract.js\n\n# For old versions\nnpm install tesseract.js@3.0.3\nyarn add tesseract.js@3.0.3\n```\n\n## Project Scope\nTesseract.js aims to bring the [Tesseract](https:\u002F\u002Fgithub.com\u002Ftesseract-ocr\u002Ftesseract) OCR engine (a separate project) to the browser and Node.js, and works by wrapping a [WebAssembly port](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js-core) of Tesseract.  This project does not modify core Tesseract features.  Most notably, **Tesseract.js does not support PDF files and does not modify the Tesseract recognition model to improve accuracy.**\n\nIf your project requires features outside of this scope, consider the [Scribe.js library](https:\u002F\u002Fgithub.com\u002Fscribeocr\u002Fscribe.js).  Scribe.js is an alternative library created to accommodate common feature requests that are outside of the scope of this repo.  Scribe.js includes improvements to the Tesseract recognition model and supports extracting text from PDF documents, among other features.  For more information see [Scribe.js vs. Tesseract.js](https:\u002F\u002Fgithub.com\u002Fscribeocr\u002Fscribe.js\u002Fblob\u002Fmaster\u002Fdocs\u002Fscribe_vs_tesseract.md).\n\n## Documentation\n\n* [Workers vs. Schedulers](.\u002Fdocs\u002Fworkers_vs_schedulers.md)\n* [Examples](.\u002Fdocs\u002Fexamples.md)\n* [Supported Image Formats](.\u002Fdocs\u002Fimage-format.md)\n* [API](.\u002Fdocs\u002Fapi.md)\n* [Local Installation](.\u002Fdocs\u002Flocal-installation.md)\n* [FAQ](.\u002Fdocs\u002Ffaq.md)\n\n## Community Projects and Examples\nThe following are examples and projects built by the community using Tesseract.js. 
Officially supported examples are found in the [examples](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js\u002Ftree\u002Fmaster\u002Fexamples) directory. \n\n- Projects\n   - Scribe OCR: web application for scanning documents (images and PDFs)\n      - Site at [scribeocr.com](https:\u002F\u002Fscribeocr.com\u002F), repo at [github.com\u002Fscribeocr\u002Fscribeocr](https:\u002F\u002Fgithub.com\u002Fscribeocr\u002Fscribeocr)\n   - Chrome Extension (with Manifest V3): https:\u002F\u002Fgithub.com\u002FTshetrim\u002FImage-To-Text-OCR-extension-for-ChatGPT\n- Examples\n   - Converting PDF to text: https:\u002F\u002Fgithub.com\u002Fracosa\u002Fpdf2text-ocr\n   - Use `blocks` output to generate granular data [word\u002Fsymbol level]: https:\u002F\u002Fgithub.com\u002FKishlay-notabot\u002Ftesseract-bbox-examples\n   - Electron: https:\u002F\u002Fgithub.com\u002FBalearica\u002Ftesseract.js-electron\n   - Typescript: https:\u002F\u002Fgithub.com\u002FBalearica\u002Ftesseract.js-typescript\n \nIf you have a project or example repo that uses Tesseract.js, feel free to add it to this list using a pull request. Examples submitted should be well documented such that new users can run them; projects should be functional and actively maintained.\n\n## Major changes in v6\nVersion 6 changes are documented in [this issue](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js\u002Fissues\u002F993).  
Highlights are below.\n - Fixed memory leak in previous versions\n - Overall reductions in runtime and memory usage\n - Breaking changes:\n    - All output formats other than `text` are disabled by default.\n      - To re-enable the `hocr` output (for example), set the following: `worker.recognize(image, {}, { hocr: true })`\n    - Minor changes to the structure of the JavaScript object (`blocks`) output\n    - See [this issue](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js\u002Fissues\u002F993) for the full list\n\n## Major changes in v5\nVersion 5 changes are documented in [this issue](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js\u002Fissues\u002F820).  Highlights are below.\n\n - Significantly smaller files by default (54% smaller for English, 73% smaller for Chinese)\n    - This results in a ~50% reduction in runtime for first-time users (who do not have the files cached yet)\n - Significantly lower memory usage\n - Breaking changes:\n    - `createWorker` arguments changed\n       - Setting non-default language and OEM now happens in `createWorker`\n          - E.g. `createWorker(\"chi_sim\", 1)`\n    - `worker.initialize` and `worker.loadLanguage` functions should be deleted from code\n    - See [this issue](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js\u002Fissues\u002F820) for the full list\n\nUpgrading from v2 to v5?  See [this guide](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js\u002Fissues\u002F771).\n\n## Major changes in v4\nVersion 4 includes many new features and bug fixes--see [this issue](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js\u002Fissues\u002F662) for a full list.  Several highlights are below. 
\n\n- Added rotation preprocessing options (including auto-rotate) for significantly better accuracy\n- Processed images (rotated, grayscale, binary) can now be retrieved\n- Improved support for parallel processing (schedulers)\n- Breaking changes:\n  - `createWorker` is now async\n  - `getPDF` function replaced by `pdf` recognize option\n\n## Contributing\n\n### Development\nTo run a development copy of Tesseract.js do the following:\n```shell\n# First we clone the repository\ngit clone https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js.git\ncd tesseract.js\n\n# Then we install the dependencies\nnpm install\n\n# And finally we start the development server\nnpm start\n```\n\nThe development server will be available at http:\u002F\u002Flocalhost:3000\u002Fexamples\u002Fbrowser\u002Fbasic-efficient.html in your favorite browser.\nIt will automatically rebuild `tesseract.min.js` and `worker.min.js` when you change files in the **src** folder.\n\n### Building Static Files\nTo build the compiled static files just execute the following:\n```shell\nnpm run build\n```\nThis will output the files into the `dist` directory.\n\n### Run Tests\n**Always confirm the automated tests pass before submitting a pull request.**  To run the automated tests locally, run the following commands.\n```shell\nnpm run lint\nnpm run test\n```\n\n## Contributors\n\n### Code Contributors\n\nThis project exists thanks to all the people who contribute. [[Contribute](https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js?tab=readme-ov-file#contributing)].\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fnaptha\u002Ftesseract.js\u002Fgraphs\u002Fcontributors\">\u003Cimg src=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Fcontributors.svg?width=890&button=false\" \u002F>\u003C\u002Fa>\n\n### Financial Contributors\n\nBecome a financial contributor and help us sustain our community. 
[[Contribute](https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Fcontribute)]\n\n#### Individuals\n\n\u003Ca href=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\">\u003Cimg src=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Findividuals.svg?width=890\">\u003C\u002Fa>\n\n#### Organizations\n\nSupport this project with your organization. Your logo will show up here with a link to your website. [[Contribute](https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Fcontribute)]\n\n\u003Ca href=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F0\u002Fwebsite\">\u003Cimg src=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F0\u002Favatar.svg\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F1\u002Fwebsite\">\u003Cimg src=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F1\u002Favatar.svg\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F2\u002Fwebsite\">\u003Cimg src=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F2\u002Favatar.svg\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F3\u002Fwebsite\">\u003Cimg src=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F3\u002Favatar.svg\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F4\u002Fwebsite\">\u003Cimg src=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F4\u002Favatar.svg\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F5\u002Fwebsite\">\u003Cimg src=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F5\u002Favatar.svg\">\u003C\u002Fa>\n\u003Ca 
href=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F6\u002Fwebsite\">\u003Cimg src=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F6\u002Favatar.svg\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F7\u002Fwebsite\">\u003Cimg src=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F7\u002Favatar.svg\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F8\u002Fwebsite\">\u003Cimg src=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F8\u002Favatar.svg\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F9\u002Fwebsite\">\u003Cimg src=\"https:\u002F\u002Fopencollective.com\u002Ftesseractjs\u002Forganization\u002F9\u002Favatar.svg\">\u003C\u002Fa>\n","Tesseract.js is a pure JavaScript OCR library that supports more than 100 languages. Using deep learning, it extracts text from images in the browser or in Node.js. Its core features include image recognition and real-time text recognition in video, with WebAssembly providing high-performance processing. Tesseract.js suits applications that need text recognition on the front end or back end, such as document scanning and license plate recognition. Its concise API lets developers integrate OCR into their own projects with little effort.",2,"2026-06-11 02:43:22","top_all"]