[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-75000":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},75000,"markit","Michaelliv\u002Fmarkit","Michaelliv","🖍️ Convert anything to markdown. Mark it.",null,"TypeScript",1266,52,6,1,0,3,18,45,9,72.67,"MIT License",false,"main",true,[],"2026-06-12 04:01:16","# markit\n\nConvert anything to markdown. PDF, DOCX, PPTX, XLSX, HTML, EPUB, Jupyter, RSS, images, audio, URLs, and more. Pluggable converters, built-in LLM providers for image description and audio transcription. Works as a CLI and as a library.\n\n```bash\nnpm install -g markit-ai\n```\n\n---\n\n## Quick Start\n\n```bash\n# Documents\nmarkit report.pdf\nmarkit document.docx\nmarkit slides.pptx\n\n# Data\nmarkit data.csv\nmarkit config.json\nmarkit schema.yaml\n\n# Web\nmarkit https:\u002F\u002Fexample.com\u002Farticle\nmarkit https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FMarkdown\n\n# Media (via LLMs. 
set OPENAI_API_KEY or ANTHROPIC_API_KEY)\nmarkit photo.jpg                          # EXIF metadata + AI description\nmarkit recording.mp3                      # Audio metadata + transcription\nmarkit photo.jpg -p \"Extract all text\"    # Custom instructions\n\n# Write to file\nmarkit report.pdf -o report.md\n\n# Pipe it\nmarkit report.pdf | pbcopy\nmarkit data.xlsx -q | napkin create \"Imported Data\"\n```\n\n---\n\n## Supported Formats\n\n| Format | Extensions | How |\n|--------|-----------|-----|\n| PDF | `.pdf` | Text extraction via unpdf |\n| Word | `.docx` | mammoth → turndown, preserves headings\u002Ftables |\n| PowerPoint | `.pptx` | XML parsing, slides + notes + tables |\n| Excel | `.xlsx` | Each sheet → markdown table |\n| HTML | `.html` `.htm` | turndown, scripts\u002Fstyles stripped |\n| EPUB | `.epub` | Spine-ordered chapters, metadata header |\n| Jupyter | `.ipynb` | Markdown cells + code + outputs |\n| RSS\u002FAtom | `.rss` `.atom` `.xml` | Feed items with dates and content |\n| CSV\u002FTSV | `.csv` `.tsv` | Markdown tables |\n| JSON | `.json` | Pretty-printed code block |\n| YAML | `.yaml` `.yml` | Code block |\n| XML\u002FSVG | `.xml` `.svg` | Code block |\n| Images | `.jpg` `.png` `.gif` `.webp` | EXIF metadata + optional AI description |\n| Audio | `.mp3` `.wav` `.m4a` `.flac` | Metadata + optional AI transcription |\n| ZIP | `.zip` | Recursive. converts each file inside |\n| URLs | `http:\u002F\u002F` `https:\u002F\u002F` | Fetches with `Accept: text\u002Fmarkdown` |\n| Wikipedia | `*.wikipedia.org` | Main content extraction |\n| Code | `.py` `.ts` `.go` `.rs` ... | Fenced code block |\n| Plain text | `.txt` `.md` `.rst` `.log` | Pass-through |\n\nNeed more? [Write a plugin.](#plugins)\n\n---\n\n## AI Features\n\nImages and audio get metadata extraction for free. 
For AI-powered descriptions and transcription, set an API key:\n\n```bash\n# OpenAI (default provider)\nexport OPENAI_API_KEY=sk-...\nmarkit photo.jpg\n\n# Anthropic\nmarkit config set llm.provider anthropic\nexport ANTHROPIC_API_KEY=sk-ant-...\nmarkit photo.jpg\n\n# Any OpenAI-compatible API (Ollama, Groq, Together, etc.)\nmarkit config set llm.apiBase http:\u002F\u002Flocalhost:11434\u002Fv1\n```\n\nFocus the AI on what matters:\n\n```bash\nmarkit receipt.jpg -p \"List all line items with prices as a table\"\nmarkit diagram.png -p \"Describe the architecture and data flow\"\nmarkit whiteboard.jpg -p \"Extract all text verbatim\"\n```\n\n---\n\n## Plugins\n\nExtend markit with new formats, override builtins, or add LLM providers.\n\n### Install\n\n```bash\nmarkit plugin install npm:markit-plugin-dwg\nmarkit plugin install git:github.com\u002Fuser\u002Fmarkit-plugin-ocr\nmarkit plugin install .\u002Fmy-plugin.ts\nmarkit plugin list\nmarkit plugin remove dwg\n```\n\n### Write a Plugin\n\nA plugin is a function that receives an API and registers converters and\u002For providers:\n\n```typescript\nimport type { MarkitPluginAPI } from \"markit-ai\";\n\nexport default function(api: MarkitPluginAPI) {\n  api.setName(\"cad\");\n  api.setVersion(\"1.0.0\");\n\n  \u002F\u002F Register a converter for a new format\n  api.registerConverter(\n    {\n      name: \"dwg\",\n      accepts: (info) => [\".dwg\", \".dxf\"].includes(info.extension || \"\"),\n      convert: async (input, info) => {\n        \u002F\u002F Your conversion logic\n        return { markdown: \"...\" };\n      },\n    },\n    \u002F\u002F Optional: declare the format so it shows in `markit formats`\n    { name: \"AutoCAD\", extensions: [\".dwg\", \".dxf\"] },\n  );\n}\n```\n\nPlugin converters run **before** builtins. 
so you can override any format:\n\n```typescript\nimport type { MarkitPluginAPI } from \"markit-ai\";\n\nexport default function(api: MarkitPluginAPI) {\n  api.setName(\"better-pdf\");\n\n  \u002F\u002F This replaces the built-in PDF converter\n  api.registerConverter({\n    name: \"pdf\",\n    accepts: (info) => info.extension === \".pdf\",\n    convert: async (input, info) => {\n      \u002F\u002F Your superior PDF extraction\n      return { markdown: \"...\" };\n    },\n  });\n}\n```\n\nPlugins can also register LLM providers:\n\n```typescript\napi.registerProvider({\n  name: \"gemini\",\n  envKeys: [\"GOOGLE_API_KEY\"],\n  defaultBase: \"https:\u002F\u002Fgenerativelanguage.googleapis.com\u002Fv1beta\",\n  defaultModel: \"gemini-2.0-flash\",\n  create: (config, prompt) => ({\n    describe: async (image, mime) => { \u002F* ... *\u002F },\n  }),\n});\n```\n\n---\n\n## For Agents\n\nEvery command supports `--json`. Use `-q` for raw markdown.\n\n```bash\nmarkit report.pdf --json       # Structured output for parsing\nmarkit report.pdf -q           # Raw markdown, nothing else\nmarkit onboard                 # Add instructions to CLAUDE.md\n```\n\n---\n\n## SDK\n\nmarkit is also a library:\n\n```typescript\nimport { Markit } from \"markit-ai\";\n\nconst markit = new Markit();\n\n\u002F\u002F Each result is renamed on destructuring so the three calls can share one scope\nconst { markdown: fromFile } = await markit.convertFile(\"report.pdf\");\nconst { markdown: fromUrl } = await markit.convertUrl(\"https:\u002F\u002Fexample.com\");\nconst { markdown: fromBuffer } = await markit.convert(buffer, { extension: \".docx\" });\n```\n\nWith AI features. 
pass plain functions, use any provider:\n\n```typescript\nimport OpenAI from \"openai\";\nimport { Markit } from \"markit-ai\";\n\nconst openai = new OpenAI();\n\nconst markit = new Markit({\n  describe: async (image, mime) => {\n    const res = await openai.chat.completions.create({\n      model: \"gpt-4.1-nano\",\n      messages: [{ role: \"user\", content: [\n        { type: \"text\", text: \"Describe this image.\" },\n        { type: \"image_url\", image_url: { url: `data:${mime};base64,${image.toString(\"base64\")}` } },\n      ]}],\n    });\n    return res.choices[0].message.content ?? \"\";\n  },\n  transcribe: async (audio, mime) => {\n    const res = await openai.audio.transcriptions.create({\n      model: \"gpt-4o-mini-transcribe\",\n      file: new File([audio], \"audio.mp3\", { type: mime }),\n    });\n    return res.text;\n  },\n});\n```\n\nMix providers. Claude for vision, OpenAI for audio, whatever:\n\n```typescript\nconst markit = new Markit({\n  describe: async (image, mime) => {\n    const res = await anthropic.messages.create({\n      model: \"claude-haiku-4-5\",\n      messages: [{ role: \"user\", content: [\n        { type: \"image\", source: { type: \"base64\", media_type: mime, data: image.toString(\"base64\") } },\n        { type: \"text\", text: \"Describe this image.\" },\n      ]}],\n    });\n    return res.content[0].text;\n  },\n  transcribe: async (audio, mime) => { \u002F* Whisper, Deepgram, AssemblyAI, ... *\u002F },\n});\n```\n\nOr use the built-in providers. 
no SDK needed:\n\n```typescript\nimport { Markit, createLlmFunctions, loadConfig } from \"markit-ai\";\n\nconst config = loadConfig(); \u002F\u002F reads .markit\u002Fconfig.json + env vars\nconst markit = new Markit(createLlmFunctions(config));\n```\n\nWith plugins:\n\n```typescript\nimport { Markit, createLlmFunctions, loadConfig, loadAllPlugins } from \"markit-ai\";\n\nconst config = loadConfig();\nconst plugins = await loadAllPlugins();\nconst markit = new Markit(createLlmFunctions(config), plugins);\n```\n\n---\n\n## Configuration\n\n```bash\nmarkit init                              # Create .markit\u002Fconfig.json\nmarkit config show                       # Show resolved settings\nmarkit config get llm.model              # Get a value\nmarkit config set llm.provider anthropic # Switch provider\nmarkit config set llm.apiKey sk-...      # Set a value\n```\n\n`.markit\u002Fconfig.json`:\n\n```json\n{\n  \"llm\": {\n    \"provider\": \"openai\",\n    \"apiBase\": \"https:\u002F\u002Fapi.openai.com\u002Fv1\",\n    \"apiKey\": \"sk-...\",\n    \"model\": \"gpt-4.1-nano\",\n    \"transcriptionModel\": \"gpt-4o-mini-transcribe\"\n  }\n}\n```\n\nEnv vars override config. 
Each provider checks its own env vars first:\n\n| Provider | Env vars | Default model |\n|----------|---------|---------------|\n| `openai` | `OPENAI_API_KEY`, `MARKIT_API_KEY` | `gpt-4.1-nano` |\n| `anthropic` | `ANTHROPIC_API_KEY`, `MARKIT_API_KEY` | `claude-haiku-4-5` |\n\n---\n\n## CLI Reference\n\n```bash\nmarkit \u003Csource>                          # Convert file or URL\nmarkit \u003Csource> -o output.md             # Write to file\nmarkit \u003Csource> -p \"instructions\"        # Custom AI prompt\nmarkit \u003Csource> --json                   # JSON output\nmarkit \u003Csource> -q                       # Raw markdown only\ncat file.pdf | markit -                  # Read from stdin\nmarkit formats                           # List supported formats\nmarkit init                              # Create .markit\u002F config\nmarkit config show                       # Show settings\nmarkit config get \u003Ckey>                  # Get config value\nmarkit config set \u003Ckey> \u003Cvalue>          # Set config value\nmarkit plugin install \u003Csource>           # Install plugin\nmarkit plugin list                       # List plugins\nmarkit plugin remove \u003Cname>              # Remove plugin\nmarkit onboard                           # Add to CLAUDE.md\n```\n\n---\n\n## Development\n\n```bash\nbun install\nbun run dev -- report.pdf\nbun test\nbun run check\n```\n\n## License\n\nMIT\n","markit is a tool that converts a wide range of file formats to Markdown. It supports PDF, DOCX, PPTX, XLSX, HTML, EPUB, Jupyter notebooks, RSS, images, audio, URLs, and more, and its plugin mechanism lets it be extended to further formats. Built-in LLM (large language model) providers can generate descriptions for images and transcribe audio to text, which strengthens its handling of multimedia content. It suits scenarios that need to quickly convert data or documents from different sources into a uniform Markdown format that is easy to read and edit, such as writing technical documentation, organizing notes, or archiving material for academic research.",2,"2026-06-11 03:51:54","high_star"]