[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-71925":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":45,"readmeContent":46,"aiSummary":47,"trendingCount":16,"starSnapshotCount":16,"syncStatus":48,"lastSyncTime":49,"discoverSource":50},71925,"Skill_Seekers","yusufkaraaslan\u002FSkill_Seekers","yusufkaraaslan","Convert documentation websites, GitHub repositories, and PDFs into Claude AI skills with automatic conflict detection","https:\u002F\u002Fskillseekersweb.com\u002F",null,"Python",14036,1450,67,97,0,68,178,593,204,44.49,"MIT License",false,"development",true,[27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44],"ai-tools","ast-parser","automation","claude-ai","claude-skills","code-analysis","conflict-detection","documentation","documentation-generator","github","github-scraper","mcp","mcp-server","multi-source","ocr","pdf","python","web-scraping","2026-06-12 02:02:56","\u003Cp align=\"center\">\n  \u003Cimg src=\"docs\u002Fassets\u002Flogo.png\" alt=\"Skill Seekers\" width=\"200\"\u002F>\n\u003C\u002Fp>\n\n# Skill Seekers\n\nEnglish | [简体中文](README.zh-CN.md) | [日本語](README.ja.md) | [한국어](README.ko.md) | [Español](README.es.md) | [Français](README.fr.md) | [Deutsch](README.de.md) | [Português](README.pt-BR.md) | [Türkçe](README.tr.md) | [العربية](README.ar.md) | [हिन्दी](README.hi.md) | 
[Русский](README.ru.md)\n\n[![Version](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fversion-3.5.0-blue.svg)](https:\u002F\u002Fgithub.com\u002Fyusufkaraaslan\u002FSkill_Seekers\u002Freleases)\n[![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT)\n[![Python 3.10+](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.10+-blue.svg)](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F)\n[![MCP Integration](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FMCP-Integrated-blue.svg)](https:\u002F\u002Fmodelcontextprotocol.io)\n[![Tested](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTests-3194%2B%20Passing-brightgreen.svg)](tests\u002F)\n[![Project Board](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FProject-Board-purple.svg)](https:\u002F\u002Fgithub.com\u002Fusers\u002Fyusufkaraaslan\u002Fprojects\u002F2)\n[![PyPI version](https:\u002F\u002Fbadge.fury.io\u002Fpy\u002Fskill-seekers.svg)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fskill-seekers\u002F)\n[![PyPI - Downloads](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fdm\u002Fskill-seekers.svg)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fskill-seekers\u002F)\n[![PyPI - Python Version](https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fpyversions\u002Fskill-seekers.svg)](https:\u002F\u002Fpypi.org\u002Fproject\u002Fskill-seekers\u002F)\n[![Website](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FWebsite-skillseekersweb.com-blue.svg)](https:\u002F\u002Fskillseekersweb.com\u002F)\n[![Twitter Follow](https:\u002F\u002Fimg.shields.io\u002Ftwitter\u002Ffollow\u002F_yUSyUS_?style=social)](https:\u002F\u002Fx.com\u002F_yUSyUS_)\n[![GitHub Repo stars](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Fstars\u002Fyusufkaraaslan\u002FSkill_Seekers?style=social)](https:\u002F\u002Fgithub.com\u002Fyusufkaraaslan\u002FSkill_Seekers)\n[![PyPI 
Downloads](https:\u002F\u002Fstatic.pepy.tech\u002Fpersonalized-badge\u002Fskill-seekers?period=total&units=INTERNATIONAL_SYSTEM&left_color=BLACK&right_color=GREEN&left_text=downloads)](https:\u002F\u002Fpepy.tech\u002Fprojects\u002Fskill-seekers)\n\n\u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F18329\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Ftrendshift.io\u002Fapi\u002Fbadge\u002Frepositories\u002F18329\" alt=\"yusufkaraaslan%2FSkill_Seekers | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\n**🧠 The data layer for AI systems.** Skill Seekers turns documentation sites, GitHub repos, PDFs, videos, notebooks, wikis, and 10+ more source types into structured knowledge assets—ready to power AI Skills (Claude, Gemini, OpenAI), RAG pipelines (LangChain, LlamaIndex, Pinecone), and AI coding assistants (Cursor, Windsurf, Cline) in minutes, not hours.\n\n> 🌐 **[Visit SkillSeekersWeb.com](https:\u002F\u002Fskillseekersweb.com\u002F)** - Browse 24+ preset configs, share your configs, and access complete documentation!\n\n> 📋 **[View Development Roadmap & Tasks](https:\u002F\u002Fgithub.com\u002Fusers\u002Fyusufkaraaslan\u002Fprojects\u002F2)** - 134 tasks across 10 categories, pick any to contribute!\n\n## 🌐 Ecosystem\n\nSkill Seekers is a multi-repo project. 
Here's where everything lives:\n\n| Repository | Description | Links |\n|-----------|-------------|-------|\n| **[Skill_Seekers](https:\u002F\u002Fgithub.com\u002Fyusufkaraaslan\u002FSkill_Seekers)** | Core CLI & MCP server (this repo) | [PyPI](https:\u002F\u002Fpypi.org\u002Fproject\u002Fskill-seekers\u002F) |\n| **[skillseekersweb](https:\u002F\u002Fgithub.com\u002Fyusufkaraaslan\u002Fskillseekersweb)** | Website & documentation | [Live](https:\u002F\u002Fskillseekersweb.com\u002F) |\n| **[skill-seekers-configs](https:\u002F\u002Fgithub.com\u002Fyusufkaraaslan\u002Fskill-seekers-configs)** | Community config repository | |\n| **[skill-seekers-action](https:\u002F\u002Fgithub.com\u002Fyusufkaraaslan\u002Fskill-seekers-action)** | GitHub Action for CI\u002FCD | |\n| **[skill-seekers-plugin](https:\u002F\u002Fgithub.com\u002Fyusufkaraaslan\u002Fskill-seekers-plugin)** | Claude Code plugin | |\n| **[homebrew-skill-seekers](https:\u002F\u002Fgithub.com\u002Fyusufkaraaslan\u002Fhomebrew-skill-seekers)** | Homebrew tap for macOS | |\n\n> **Want to contribute?** The website and configs repos are great starting points for new contributors!\n\n## 🧠 The Data Layer for AI Systems\n\n**Skill Seekers is the universal preprocessing layer** that sits between raw documentation and every AI system that consumes it. Whether you are building Claude skills, a LangChain RAG pipeline, or a Cursor `.cursorrules` file — the data preparation is identical. 
You do it once, and export to all targets.\n\n```bash\n# One command → structured knowledge asset\nskill-seekers create https:\u002F\u002Fdocs.react.dev\u002F\n# or: skill-seekers create facebook\u002Freact\n# or: skill-seekers create .\u002Fmy-project\n\n# Export to any AI system\nskill-seekers package output\u002Freact --target claude      # → Claude AI Skill (ZIP)\nskill-seekers package output\u002Freact --target langchain   # → LangChain Documents\nskill-seekers package output\u002Freact --target llama-index # → LlamaIndex TextNodes\nskill-seekers package output\u002Freact --target cursor      # → .cursorrules\nskill-seekers package output\u002Freact --target ibm-bob     # → IBM Bob skill directory\n```\n\n### What gets built\n\n| Output | Target | What it powers |\n|--------|--------|---------------|\n| **Claude Skill** (ZIP + YAML) | `--target claude` | Claude Code, Claude API |\n| **Gemini Skill** (tar.gz) | `--target gemini` | Google Gemini |\n| **OpenAI \u002F Custom GPT** (ZIP) | `--target openai` | GPT-4o, custom assistants |\n| **LangChain Documents** | `--target langchain` | QA chains, agents, retrievers |\n| **LlamaIndex TextNodes** | `--target llama-index` | Query engines, chat engines |\n| **Haystack Documents** | `--target haystack` | Enterprise RAG pipelines |\n| **Pinecone-ready** (Markdown) | `--target markdown` | Vector upsert |\n| **ChromaDB \u002F FAISS \u002F Qdrant** | `--format chroma\u002Ffaiss\u002Fqdrant` | Local vector DBs |\n| **IBM Bob Skill** (directory) | `--target ibm-bob` | IBM Bob project\u002Fglobal skills |\n| **Cursor** `.cursorrules` | `--target claude` → copy | Cursor IDE AI context |\n| **Windsurf \u002F Cline \u002F Continue** | `--target claude` → copy | VS Code, IntelliJ, Vim |\n\n### Why it matters\n\n- ⚡ **99% faster** — Days of manual data prep → 15–45 minutes\n- 🎯 **AI Skill quality** — 500+ line SKILL.md files with examples, patterns, and guides\n- 📊 **RAG-ready chunks** — Smart chunking preserves code blocks and 
maintains context\n- 🎬 **Videos** — Extract code, transcripts, and structured knowledge from YouTube and local videos\n- 🔄 **Multi-source** — Combine 18 source types (docs, GitHub, PDFs, videos, notebooks, wikis, and more) into one knowledge asset\n- 🌐 **One prep, every target** — Export the same asset to 20 platforms (12 LLM + 8 RAG\u002Fvector) without re-scraping\n- ✅ **Battle-tested** — 3,194+ tests, 24+ framework presets, production-ready\n\n## 🚀 Quick Start (3 Commands)\n\n```bash\n# 1. Install\npip install skill-seekers\n\n# 2. Create skill from any source\nskill-seekers create https:\u002F\u002Fdocs.django.com\u002F\n\n# 3. Package for your AI platform\nskill-seekers package output\u002Fdjango --target claude\n```\n\n**That's it!** You now have `output\u002Fdjango-claude.zip` ready to use.\n\n```bash\n# Use a different AI agent for enhancement (default: claude)\nskill-seekers create https:\u002F\u002Fdocs.django.com\u002F --agent kimi\nskill-seekers create https:\u002F\u002Fdocs.django.com\u002F --agent codex\nskill-seekers create https:\u002F\u002Fdocs.django.com\u002F --agent-cmd \"my-custom-agent run\"\n```\n\n### Other Sources (18 Supported)\n\n```bash\n# GitHub repository\nskill-seekers create facebook\u002Freact\n\n# Local project\nskill-seekers create .\u002Fmy-project\n\n# PDF document\nskill-seekers create manual.pdf\n\n# Word document\nskill-seekers create report.docx\n\n# EPUB e-book\nskill-seekers create book.epub\n\n# Jupyter Notebook\nskill-seekers create notebook.ipynb\n\n# OpenAPI spec\nskill-seekers create openapi.yaml\n\n# PowerPoint presentation\nskill-seekers create presentation.pptx\n\n# AsciiDoc document\nskill-seekers create guide.adoc\n\n# Local HTML file (auto-detected by extension)\nskill-seekers create page.html\n\n# Whole directory of HTML files (auto-detected for HTML-dominant dirs)\nskill-seekers create .\u002Fmirror_output\u002Fsite\u002F\n\n# Force HTML mode on a mixed\u002Fcode-heavy directory\nskill-seekers create 
.\u002Frepo\u002F --html-path .\u002Frepo\u002Fdocs\u002Fbuild\u002Fhtml\u002F\n\n# RSS\u002FAtom feed\nskill-seekers create feed.rss\n\n# Man page\nskill-seekers create curl.1\n\n# Video (YouTube, Vimeo, or local file — requires skill-seekers[video])\nskill-seekers video --url https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=... --name mytutorial\n# First time? Auto-install GPU-aware visual deps:\nskill-seekers video --setup\n\n# Confluence wiki\nskill-seekers confluence --space TEAM --name wiki\n\n# Notion pages\nskill-seekers notion --database-id ... --name docs\n\n# Slack\u002FDiscord chat export\nskill-seekers chat --export-dir .\u002Fslack-export --name team-chat\n```\n\n### Export Everywhere\n\n```bash\n# Package for multiple platforms\nfor platform in claude gemini openai langchain; do\n  skill-seekers package output\u002Fdjango --target $platform\ndone\n```\n\n## What is Skill Seekers?\n\nSkill Seekers is the **data layer for AI systems**. It transforms 18 source types—documentation websites, GitHub repositories, PDFs, videos, Jupyter Notebooks, Word\u002FEPUB\u002FAsciiDoc documents, OpenAPI specs, PowerPoint presentations, RSS feeds, man pages, Confluence wikis, Notion pages, Slack\u002FDiscord exports, and more—into structured knowledge assets for every AI target:\n\n| Use Case | What you get | Examples |\n|----------|-------------|---------|\n| **AI Skills** | Comprehensive SKILL.md + references | Claude Code, Gemini, GPT |\n| **RAG Pipelines** | Chunked documents with rich metadata | LangChain, LlamaIndex, Haystack |\n| **Vector Databases** | Pre-formatted data ready for upsert | Pinecone, Chroma, Weaviate, FAISS |\n| **AI Coding Assistants** | Context files your IDE AI reads automatically | Cursor, Windsurf, Cline, Continue.dev |\n\n## 📚 Documentation\n\n| I want to... 
| Read this |\n|--------------|-----------|\n| **Get started quickly** | [Quick Start](docs\u002Fgetting-started\u002F02-quick-start.md) - 3 commands to first skill |\n| **Understand concepts** | [Core Concepts](docs\u002Fuser-guide\u002F01-core-concepts.md) - How it works |\n| **Scrape sources** | [Scraping Guide](docs\u002Fuser-guide\u002F02-scraping.md) - All source types |\n| **Enhance skills** | [Enhancement Guide](docs\u002Fuser-guide\u002F03-enhancement.md) - AI enhancement |\n| **Export skills** | [Packaging Guide](docs\u002Fuser-guide\u002F04-packaging.md) - Platform export |\n| **Look up commands** | [CLI Reference](docs\u002Freference\u002FCLI_REFERENCE.md) - All 20 commands |\n| **Configure** | [Config Format](docs\u002Freference\u002FCONFIG_FORMAT.md) - JSON specification |\n| **Fix issues** | [Troubleshooting](docs\u002Fuser-guide\u002F06-troubleshooting.md) - Common problems |\n\n**Complete documentation:** [docs\u002FREADME.md](docs\u002FREADME.md)\n\nInstead of spending days on manual preprocessing, Skill Seekers:\n\n1. **Ingests** — docs, GitHub repos, local codebases, PDFs, videos, notebooks, wikis, and 10+ more source types\n2. **Analyzes** — deep AST parsing, pattern detection, API extraction\n3. **Structures** — categorized reference files with metadata\n4. **Enhances** — AI-powered SKILL.md generation (Claude, Gemini, or local)\n5. 
**Exports** — 16 platform-specific formats from one asset\n\n## Why Use This?\n\n### For AI Skill Builders (Claude, Gemini, OpenAI)\n\n- 🎯 **Production-grade Skills** — 500+ line SKILL.md files with code examples, patterns, and guides\n- 🔄 **Enhancement Workflows** — Apply `security-focus`, `architecture-comprehensive`, or custom YAML presets\n- 🎮 **Any Domain** — Game engines (Godot, Unity), frameworks (React, Django), internal tools\n- 🔧 **Teams** — Combine internal docs + code into a single source of truth\n- 📚 **Quality** — AI-enhanced with examples, quick reference, and navigation guidance\n\n### For RAG Builders & AI Engineers\n\n- 🤖 **RAG-ready data** — Pre-chunked LangChain `Documents`, LlamaIndex `TextNodes`, Haystack `Documents`\n- 🚀 **99% faster** — Days of preprocessing → 15–45 minutes\n- 📊 **Smart metadata** — Categories, sources, types → better retrieval accuracy\n- 🔄 **Multi-source** — Combine docs + GitHub + PDFs + videos in one pipeline\n- 🌐 **Platform-agnostic** — Export to any vector DB or framework without re-scraping\n\n### For AI Coding Assistant Users\n\n- 💻 **Cursor \u002F Windsurf \u002F Cline** — Generate `.cursorrules` \u002F `.windsurfrules` \u002F `.clinerules` automatically\n- 🎯 **Persistent context** — AI \"knows\" your frameworks without repeated prompting\n- 📚 **Always current** — Update context in minutes when docs change\n\n## Key Features\n\n### 🌐 Documentation Scraping\n- ✅ **Smart SPA Discovery** - Three-layer discovery for JavaScript SPA sites (sitemap.xml → llms.txt → headless browser rendering)\n- ✅ **llms.txt Support** - Automatically detects and uses LLM-ready documentation files (10x faster)\n- ✅ **Universal Scraper** - Works with ANY documentation website\n- ✅ **Smart Categorization** - Automatically organizes content by topic\n- ✅ **Code Language Detection** - Recognizes Python, JavaScript, C++, GDScript, etc.\n- ✅ **24+ Ready-to-Use Presets** - Godot, React, Vue, Django, FastAPI, and more\n\n### 📄 PDF Support\n- ✅ 
**Basic PDF Extraction** - Extract text, code, and images from PDF files\n- ✅ **OCR for Scanned PDFs** - Extract text from scanned documents\n- ✅ **Password-Protected PDFs** - Handle encrypted PDFs\n- ✅ **Table Extraction** - Extract complex tables from PDFs\n- ✅ **Parallel Processing** - 3x faster for large PDFs\n- ✅ **Intelligent Caching** - 50% faster on re-runs\n\n### 🎬 Video Extraction\n- ✅ **YouTube & Local Videos** - Extract transcripts, on-screen code, and structured knowledge from videos\n- ✅ **Visual Frame Analysis** - OCR extraction from code editors, terminals, slides, and diagrams\n- ✅ **GPU Auto-Detection** - Automatically installs correct PyTorch build (CUDA\u002FROCm\u002FMPS\u002FCPU)\n- ✅ **AI Enhancement** - Two-pass: clean OCR artifacts + generate polished SKILL.md\n- ✅ **Time Clipping** - Extract specific sections with `--start-time` and `--end-time`\n- ✅ **Playlist Support** - Batch process all videos in a YouTube playlist\n- ✅ **Vision API Fallback** - Use Claude Vision for low-confidence OCR frames\n\n### 🐙 GitHub Repository Analysis\n- ✅ **Deep Code Analysis** - AST parsing for Python, JavaScript, TypeScript, Java, C++, Go\n- ✅ **API Extraction** - Functions, classes, methods with parameters and types\n- ✅ **Repository Metadata** - README, file tree, language breakdown, stars\u002Fforks\n- ✅ **GitHub Issues & PRs** - Fetch open\u002Fclosed issues with labels and milestones\n- ✅ **CHANGELOG & Releases** - Automatically extract version history\n- ✅ **Conflict Detection** - Compare documented APIs vs actual code implementation\n- ✅ **MCP Integration** - Natural language: \"Scrape GitHub repo facebook\u002Freact\"\n\n### 🔄 Unified Multi-Source Scraping\n- ✅ **Combine Multiple Sources** - Mix documentation + GitHub + PDF in one skill\n- ✅ **Conflict Detection** - Automatically finds discrepancies between docs and code\n- ✅ **Intelligent Merging** - Rule-based or AI-powered conflict resolution\n- ✅ **Transparent Reporting** - Side-by-side 
comparison with ⚠️ warnings\n- ✅ **Documentation Gap Analysis** - Identifies outdated docs and undocumented features\n- ✅ **Single Source of Truth** - One skill showing both intent (docs) and reality (code)\n- ✅ **Backward Compatible** - Legacy single-source configs still work\n\n### 🤖 Multi-LLM Platform Support\n- ✅ **12 LLM Platforms** - Claude AI, Google Gemini, OpenAI ChatGPT, MiniMax AI, Generic Markdown, OpenCode, Kimi (Moonshot AI), DeepSeek AI, Qwen (Alibaba), OpenRouter, Together AI, Fireworks AI\n- ✅ **Universal Scraping** - Same documentation works for all platforms\n- ✅ **Platform-Specific Packaging** - Optimized formats for each LLM\n- ✅ **One-Command Export** - `--target` flag selects platform\n- ✅ **Optional Dependencies** - Install only what you need\n- ✅ **100% Backward Compatible** - Existing Claude workflows unchanged\n\n| Platform | Format | Upload | Enhancement | API Key | Custom Endpoint |\n|----------|--------|--------|-------------|---------|-----------------|\n| **Claude AI** | ZIP + YAML | ✅ Auto | ✅ Yes | ANTHROPIC_API_KEY | ANTHROPIC_BASE_URL |\n| **Google Gemini** | tar.gz | ✅ Auto | ✅ Yes | GOOGLE_API_KEY | - |\n| **OpenAI ChatGPT** | ZIP + Vector Store | ✅ Auto | ✅ Yes | OPENAI_API_KEY | - |\n| **MiniMax AI** | ZIP + Knowledge Files | ✅ Auto | ✅ Yes | MINIMAX_API_KEY | - |\n| **Generic Markdown** | ZIP | ❌ Manual | ❌ No | - | - |\n\n```bash\n# Claude (default - no changes needed!)\nskill-seekers package output\u002Freact\u002F\nskill-seekers upload react.zip\n\n# Google Gemini\npip install skill-seekers[gemini]\nskill-seekers package output\u002Freact\u002F --target gemini\nskill-seekers upload react-gemini.tar.gz --target gemini\n\n# OpenAI ChatGPT\npip install skill-seekers[openai]\nskill-seekers package output\u002Freact\u002F --target openai\nskill-seekers upload react-openai.zip --target openai\n\n# MiniMax AI\npip install skill-seekers[minimax]\nskill-seekers package output\u002Freact\u002F --target minimax\nskill-seekers upload 
react-minimax.zip --target minimax\n\n# Generic Markdown (universal export)\nskill-seekers package output\u002Freact\u002F --target markdown\n# Use the markdown files directly in any LLM\n```\n\n\u003Cdetails>\n\u003Csummary>🔧 \u003Cstrong>Environment Variables for Claude-Compatible APIs (e.g., GLM-4.7)\u003C\u002Fstrong>\u003C\u002Fsummary>\n\nSkill Seekers supports any Claude-compatible API endpoint:\n\n```bash\n# Option 1: Official Anthropic API (default)\nexport ANTHROPIC_API_KEY=sk-ant-...\n\n# Option 2: GLM-4.7 Claude-compatible API\nexport ANTHROPIC_API_KEY=your-glm-47-api-key\nexport ANTHROPIC_BASE_URL=https:\u002F\u002Fglm-4-7-endpoint.com\u002Fv1\n\n# All AI enhancement features will use the configured endpoint\nskill-seekers enhance output\u002Freact\u002F\nskill-seekers analyze --directory . --enhance\n```\n\n**Note**: Setting `ANTHROPIC_BASE_URL` allows you to use any Claude-compatible API endpoint, such as GLM-4.7 (智谱 AI) or other compatible services.\n\n\u003C\u002Fdetails>\n\n**Installation:**\n```bash\n# Install with Gemini support\npip install skill-seekers[gemini]\n\n# Install with OpenAI support\npip install skill-seekers[openai]\n\n# Install with MiniMax support\npip install skill-seekers[minimax]\n\n# Install with all LLM platforms\npip install skill-seekers[all-llms]\n```\n\n### 🔗 RAG Framework Integrations\n\n- ✅ **LangChain Documents** - Direct export to `Document` format with `page_content` + metadata\n  - Perfect for: QA chains, retrievers, vector stores, agents\n  - Example: [LangChain RAG Pipeline](examples\u002Flangchain-rag-pipeline\u002F)\n  - Guide: [LangChain Integration](docs\u002Fintegrations\u002FLANGCHAIN.md)\n\n- ✅ **LlamaIndex TextNodes** - Export to `TextNode` format with unique IDs + embeddings\n  - Perfect for: Query engines, chat engines, storage context\n  - Example: [LlamaIndex Query Engine](examples\u002Fllama-index-query-engine\u002F)\n  - Guide: [LlamaIndex 
Integration](docs\u002Fintegrations\u002FLLAMA_INDEX.md)\n\n- ✅ **Pinecone-Ready Format** - Optimized for vector database upsert\n  - Perfect for: Production vector search, semantic search, hybrid search\n  - Example: [Pinecone Upsert](examples\u002Fpinecone-upsert\u002F)\n  - Guide: [Pinecone Integration](docs\u002Fintegrations\u002FPINECONE.md)\n\n**Quick Export:**\n```bash\n# LangChain Documents (JSON)\nskill-seekers package output\u002Fdjango --target langchain\n# → output\u002Fdjango-langchain.json\n\n# LlamaIndex TextNodes (JSON)\nskill-seekers package output\u002Fdjango --target llama-index\n# → output\u002Fdjango-llama-index.json\n\n# Markdown (Universal)\nskill-seekers package output\u002Fdjango --target markdown\n# → output\u002Fdjango-markdown\u002FSKILL.md + references\u002F\n```\n\n**Complete RAG Pipeline Guide:** [RAG Pipelines Documentation](docs\u002Fintegrations\u002FRAG_PIPELINES.md)\n\n---\n\n### 🧠 AI Coding Assistant Integrations\n\nTransform any framework documentation into expert coding context for 4+ AI assistants:\n\n- ✅ **Cursor IDE** - Generate `.cursorrules` for AI-powered code suggestions\n  - Perfect for: Framework-specific code generation, consistent patterns\n  - Works with: Cursor IDE (VS Code fork)\n  - Guide: [Cursor Integration](docs\u002Fintegrations\u002FCURSOR.md)\n  - Example: [Cursor React Skill](examples\u002Fcursor-react-skill\u002F)\n\n- ✅ **Windsurf** - Customize Windsurf's AI assistant context with `.windsurfrules`\n  - Perfect for: IDE-native AI assistance, flow-based coding\n  - Works with: Windsurf IDE by Codeium\n  - Guide: [Windsurf Integration](docs\u002Fintegrations\u002FWINDSURF.md)\n  - Example: [Windsurf FastAPI Context](examples\u002Fwindsurf-fastapi-context\u002F)\n\n- ✅ **Cline (VS Code)** - System prompts + MCP for VS Code agent\n  - Perfect for: Agentic code generation in VS Code\n  - Works with: Cline extension for VS Code\n  - Guide: [Cline Integration](docs\u002Fintegrations\u002FCLINE.md)\n  - Example: 
[Cline Django Assistant](examples\u002Fcline-django-assistant\u002F)\n\n- ✅ **Continue.dev** - Context servers for IDE-agnostic AI\n  - Perfect for: Multi-IDE environments (VS Code, JetBrains, Vim), custom LLM providers\n  - Works with: Any IDE with Continue.dev plugin\n  - Guide: [Continue Integration](docs\u002Fintegrations\u002FCONTINUE_DEV.md)\n  - Example: [Continue Universal Context](examples\u002Fcontinue-dev-universal\u002F)\n\n**Quick Export for AI Coding Tools:**\n```bash\n# For any AI coding assistant (Cursor, Windsurf, Cline, Continue.dev)\nskill-seekers scrape --config configs\u002Fdjango.json\nskill-seekers package output\u002Fdjango --target claude  # or --target markdown\n\n# Copy to your project (example for Cursor)\ncp output\u002Fdjango-claude\u002FSKILL.md my-project\u002F.cursorrules\n\n# Or for Windsurf\ncp output\u002Fdjango-claude\u002FSKILL.md my-project\u002F.windsurf\u002Frules\u002Fdjango.md\n\n# Or for Cline\ncp output\u002Fdjango-claude\u002FSKILL.md my-project\u002F.clinerules\n\n# Or for Continue.dev (HTTP server)\npython examples\u002Fcontinue-dev-universal\u002Fcontext_server.py\n# Configure in ~\u002F.continue\u002Fconfig.json\n```\n\n**Integration Hub:** [All AI System Integrations](docs\u002Fintegrations\u002FINTEGRATIONS.md)\n\n---\n\n### 🌊 Three-Stream GitHub Architecture\n- ✅ **Triple-Stream Analysis** - Split GitHub repos into Code, Docs, and Insights streams\n- ✅ **Unified Codebase Analyzer** - Works with GitHub URLs AND local paths\n- ✅ **C3.x as Analysis Depth** - Choose 'basic' (1-2 min) or 'c3x' (20-60 min) analysis\n- ✅ **Enhanced Router Generation** - GitHub metadata, README quick start, common issues\n- ✅ **Issue Integration** - Top problems and solutions from GitHub issues\n- ✅ **Smart Routing Keywords** - GitHub labels weighted 2x for better topic detection\n\n**Three Streams Explained:**\n- **Stream 1: Code** - Deep C3.x analysis (patterns, examples, guides, configs, architecture)\n- **Stream 2: Docs** - 
Repository documentation (README, CONTRIBUTING, docs\u002F*.md)\n- **Stream 3: Insights** - Community knowledge (issues, labels, stars, forks)\n\n```python\nfrom skill_seekers.cli.unified_codebase_analyzer import UnifiedCodebaseAnalyzer\n\n# Analyze GitHub repo with all three streams\nanalyzer = UnifiedCodebaseAnalyzer()\nresult = analyzer.analyze(\n    source=\"https:\u002F\u002Fgithub.com\u002Ffacebook\u002Freact\",\n    depth=\"c3x\",  # or \"basic\" for fast analysis\n    fetch_github_metadata=True\n)\n\n# Access code stream (C3.x analysis)\nprint(f\"Design patterns: {len(result.code_analysis['c3_1_patterns'])}\")\nprint(f\"Test examples: {result.code_analysis['c3_2_examples_count']}\")\n\n# Access docs stream (repository docs)\nprint(f\"README: {result.github_docs['readme'][:100]}\")\n\n# Access insights stream (GitHub metadata)\nprint(f\"Stars: {result.github_insights['metadata']['stars']}\")\nprint(f\"Common issues: {len(result.github_insights['common_problems'])}\")\n```\n\n**See complete documentation**: [Three-Stream Implementation Summary](docs\u002FIMPLEMENTATION_SUMMARY_THREE_STREAM.md)\n\n### 🔐 Smart Rate Limit Management & Configuration\n- ✅ **Multi-Token Configuration System** - Manage multiple GitHub accounts (personal, work, OSS)\n  - Secure config storage at `~\u002F.config\u002Fskill-seekers\u002Fconfig.json` (600 permissions)\n  - Per-profile rate limit strategies: `prompt`, `wait`, `switch`, `fail`\n  - Configurable timeout per profile (default: 30 min, prevents indefinite waits)\n  - Smart fallback chain: CLI arg → Env var → Config file → Prompt\n  - API key management for Claude, Gemini, OpenAI\n- ✅ **Interactive Configuration Wizard** - Beautiful terminal UI for easy setup\n  - Browser integration for token creation (auto-opens GitHub, etc.)\n  - Token validation and connection testing\n  - Visual status display with color coding\n- ✅ **Intelligent Rate Limit Handler** - No more indefinite waits!\n  - Upfront warning about rate limits 
(60\u002Fhour vs 5000\u002Fhour)\n  - Real-time detection from GitHub API responses\n  - Live countdown timers with progress\n  - Automatic profile switching when rate limited\n  - Four strategies: prompt (ask), wait (countdown), switch (try another), fail (abort)\n- ✅ **Resume Capability** - Continue interrupted jobs\n  - Auto-save progress at configurable intervals (default: 60 sec)\n  - List all resumable jobs with progress details\n  - Auto-cleanup of old jobs (default: 7 days)\n- ✅ **CI\u002FCD Support** - Non-interactive mode for automation\n  - `--non-interactive` flag fails fast without prompts\n  - `--profile` flag to select specific GitHub account\n  - Clear error messages for pipeline logs\n\n**Quick Setup:**\n```bash\n# One-time configuration (5 minutes)\nskill-seekers config --github\n\n# Use specific profile for private repos\nskill-seekers github --repo mycompany\u002Fprivate-repo --profile work\n\n# CI\u002FCD mode (fail fast, no prompts)\nskill-seekers github --repo owner\u002Frepo --non-interactive\n\n# Resume interrupted job\nskill-seekers resume --list\nskill-seekers resume github_react_20260117_143022\n```\n\n**Rate Limit Strategies Explained:**\n- **prompt** (default) - Ask what to do when rate limited (wait, switch, setup token, cancel)\n- **wait** - Automatically wait with countdown timer (respects timeout)\n- **switch** - Automatically try next available profile (for multi-account setups)\n- **fail** - Fail immediately with clear error (perfect for CI\u002FCD)\n\n### 🎯 Bootstrap Skill - Self-Hosting\n\nGenerate skill-seekers as a skill to use within your AI agent (Claude Code, Kimi, Codex, etc.):\n\n```bash\n# Generate the skill\n.\u002Fscripts\u002Fbootstrap_skill.sh\n\n# Install to Claude Code\ncp -r output\u002Fskill-seekers ~\u002F.claude\u002Fskills\u002F\n```\n\n**What you get:**\n- ✅ **Complete skill documentation** - All CLI commands and usage patterns\n- ✅ **CLI command reference** - Every tool and its options documented\n- ✅ 
**Quick start examples** - Common workflows and best practices\n- ✅ **Auto-generated API docs** - Code analysis, patterns, and examples\n\n### 🔐 Private Config Repositories\n- ✅ **Git-Based Config Sources** - Fetch configs from private\u002Fteam git repositories\n- ✅ **Multi-Source Management** - Register unlimited GitHub, GitLab, Bitbucket repos\n- ✅ **Team Collaboration** - Share custom configs across 3-5 person teams\n- ✅ **Enterprise Support** - Scale to 500+ developers with priority-based resolution\n- ✅ **Secure Authentication** - Environment variable tokens (GITHUB_TOKEN, GITLAB_TOKEN)\n- ✅ **Intelligent Caching** - Clone once, pull updates automatically\n- ✅ **Offline Mode** - Work with cached configs when offline\n\n### 🤖 Codebase Analysis (C3.x)\n\n**C3.4: Configuration Pattern Extraction with AI Enhancement**\n- ✅ **9 Config Formats** - JSON, YAML, TOML, ENV, INI, Python, JavaScript, Dockerfile, Docker Compose\n- ✅ **7 Pattern Types** - Database, API, logging, cache, email, auth, server configurations\n- ✅ **AI Enhancement** - Optional dual-mode AI analysis (API + LOCAL)\n  - Explains what each config does\n  - Suggests best practices and improvements\n  - **Security analysis** - Finds hardcoded secrets, exposed credentials\n- ✅ **Auto-Documentation** - Generates JSON + Markdown documentation of all configs\n- ✅ **MCP Integration** - `extract_config_patterns` tool with enhancement support\n\n**C3.3: AI-Enhanced How-To Guides**\n- ✅ **Comprehensive AI Enhancement** - Transforms basic guides into professional tutorials\n- ✅ **5 Automatic Improvements** - Step descriptions, troubleshooting, prerequisites, next steps, use cases\n- ✅ **Dual-Mode Support** - API mode (Claude API) or LOCAL mode (Claude Code CLI)\n- ✅ **No API Costs with LOCAL Mode** - FREE enhancement using your Claude Code Max plan\n- ✅ **Quality Transformation** - 75-line templates → 500+ line comprehensive guides\n\n**Usage:**\n```bash\n# Quick analysis (1-2 min, basic features 
only)\nskill-seekers analyze --directory tests\u002F --quick\n\n# Comprehensive analysis with AI (20-60 min, all features)\nskill-seekers analyze --directory tests\u002F --comprehensive\n\n# With AI enhancement\nskill-seekers analyze --directory tests\u002F --enhance\n```\n\n**Full Documentation:** [docs\u002FHOW_TO_GUIDES.md](docs\u002FHOW_TO_GUIDES.md#ai-enhancement-new)\n\n### 🔄 Enhancement Workflow Presets\n\nReusable YAML-defined enhancement pipelines that control how AI transforms your raw documentation into a polished skill.\n\n- ✅ **5 Bundled Presets** — `default`, `minimal`, `security-focus`, `architecture-comprehensive`, `api-documentation`\n- ✅ **User-Defined Presets** — add custom workflows to `~\u002F.config\u002Fskill-seekers\u002Fworkflows\u002F`\n- ✅ **Multiple Workflows** — chain two or more workflows in one command\n- ✅ **Fully Managed CLI** — list, inspect, copy, add, remove, and validate workflows\n\n```bash\n# Apply a single workflow\nskill-seekers create .\u002Fmy-project --enhance-workflow security-focus\n\n# Chain multiple workflows (applied in order)\nskill-seekers create .\u002Fmy-project \\\n  --enhance-workflow security-focus \\\n  --enhance-workflow minimal\n\n# Manage presets\nskill-seekers workflows list                          # List all (bundled + user)\nskill-seekers workflows show security-focus           # Print YAML content\nskill-seekers workflows copy security-focus           # Copy to user dir for editing\nskill-seekers workflows add .\u002Fmy-workflow.yaml        # Install a custom preset\nskill-seekers workflows remove my-workflow            # Remove a user preset\nskill-seekers workflows validate security-focus       # Validate preset structure\n\n# Copy multiple at once\nskill-seekers workflows copy security-focus minimal api-documentation\n\n# Add multiple files at once\nskill-seekers workflows add .\u002Fwf-a.yaml .\u002Fwf-b.yaml\n\n# Remove multiple at once\nskill-seekers workflows remove my-wf-a 
my-wf-b\n```\n\n**YAML preset format:**\n```yaml\nname: security-focus\ndescription: \"Security-focused review: vulnerabilities, auth, data handling\"\nversion: \"1.0\"\nstages:\n  - name: vulnerabilities\n    type: custom\n    prompt: \"Review for OWASP top 10 and common security vulnerabilities...\"\n  - name: auth-review\n    type: custom\n    prompt: \"Examine authentication and authorisation patterns...\"\n    uses_history: true\n```\n\n### ⚡ Performance & Scale\n- ✅ **Async Mode** - 2-3x faster scraping with async\u002Fawait (use `--async` flag)\n- ✅ **Large Documentation Support** - Handle 10K-40K+ page docs with intelligent splitting\n- ✅ **Router\u002FHub Skills** - Intelligent routing to specialized sub-skills\n- ✅ **Parallel Scraping** - Process multiple skills simultaneously\n- ✅ **Checkpoint\u002FResume** - Never lose progress on long scrapes\n- ✅ **Caching System** - Scrape once, rebuild instantly\n\n### 🤖 Agent-Agnostic Skill Generation\n- ✅ **Multi-Agent Support** - Generate skills for Claude, Kimi, Codex, Copilot, OpenCode, or any custom agent via `--agent` flag\n- ✅ **Custom Agent Commands** - Use `--agent-cmd` to specify a custom agent CLI command for enhancement\n- ✅ **Universal Flags** - `--agent` and `--agent-cmd` available on all commands (create, scrape, github, pdf, etc.)\n\n### 📦 Marketplace Pipeline\n- ✅ **Publish to Marketplace** - Publish skills to Claude Code plugin marketplace repos\n- ✅ **End-to-End Pipeline** - From documentation source to published marketplace entry\n\n### ✅ Quality Assurance\n- ✅ **Fully Tested** - 2,540+ tests with comprehensive coverage\n\n---\n\n## 📦 Installation\n\n```bash\n# Basic install (documentation scraping, GitHub analysis, PDF, packaging)\npip install skill-seekers\n\n# With all LLM platform support\npip install skill-seekers[all-llms]\n\n# With MCP server\npip install skill-seekers[mcp]\n\n# Everything\npip install skill-seekers[all]\n```\n\n**Need help choosing?** Run the setup 
wizard:\n```bash\nskill-seekers-setup\n```\n\n### Installation Options\n\n| Install | Features |\n|---------|----------|\n| `pip install skill-seekers` | Scraping, GitHub analysis, PDF, all platforms |\n| `pip install skill-seekers[gemini]` | + Google Gemini support |\n| `pip install skill-seekers[openai]` | + OpenAI ChatGPT support |\n| `pip install skill-seekers[all-llms]` | + All LLM platforms |\n| `pip install skill-seekers[mcp]` | + MCP server for Claude Code, Cursor, etc. |\n| `pip install skill-seekers[video]` | + YouTube\u002FVimeo transcript & metadata extraction |\n| `pip install skill-seekers[video-full]` | + Whisper transcription & visual frame extraction |\n| `pip install skill-seekers[jupyter]` | + Jupyter Notebook support |\n| `pip install skill-seekers[pptx]` | + PowerPoint support |\n| `pip install skill-seekers[confluence]` | + Confluence wiki support |\n| `pip install skill-seekers[notion]` | + Notion pages support |\n| `pip install skill-seekers[rss]` | + RSS\u002FAtom feed support |\n| `pip install skill-seekers[chat]` | + Slack\u002FDiscord chat export support |\n| `pip install skill-seekers[asciidoc]` | + AsciiDoc document support |\n| `pip install skill-seekers[all]` | Everything enabled |\n\n> **Video visual deps (GPU-aware):** After installing `skill-seekers[video-full]`, run\n> `skill-seekers video --setup` to auto-detect your GPU and install the correct PyTorch\n> variant + easyocr. 
This is the recommended way to install visual extraction dependencies.\n\n---\n\n## 🚀 One-Command Install Workflow\n\n**The fastest way to go from config to uploaded skill - complete automation:**\n\n```bash\n# Install React skill from official configs (auto-uploads to Claude)\nskill-seekers install --config react\n\n# Install from local config file\nskill-seekers install --config configs\u002Fcustom.json\n\n# Install without uploading (package only)\nskill-seekers install --config django --no-upload\n\n# Preview workflow without executing\nskill-seekers install --config react --dry-run\n```\n\n**Time:** 20-45 minutes total | **Quality:** Production-ready (9\u002F10) | **Cost:** Free\n\n**Phases executed:**\n```\n📥 PHASE 1: Fetch Config (if config name provided)\n📖 PHASE 2: Scrape Documentation\n✨ PHASE 3: AI Enhancement (MANDATORY - no skip option)\n📦 PHASE 4: Package Skill\n☁️  PHASE 5: Upload to Claude (optional, requires API key)\n```\n\n**Requirements:**\n- ANTHROPIC_API_KEY environment variable (for auto-upload)\n- Claude Code Max plan (for local AI enhancement), or use `--agent` to select a different AI agent\n\n---\n\n## 📊 Feature Matrix\n\nSkill Seekers supports **12 LLM platforms**, **8 RAG\u002Fvector targets**, **18 source types**, and full feature parity across all targets.\n\n**Platforms:** Claude AI, Google Gemini, OpenAI ChatGPT, MiniMax AI, Generic Markdown, OpenCode, Kimi (Moonshot AI), DeepSeek AI, Qwen (Alibaba), OpenRouter, Together AI, Fireworks AI\n**Source Types:** Documentation websites, GitHub repos, PDFs, Word (.docx), EPUB, Video, Local codebases, Jupyter Notebooks, Local HTML, OpenAPI\u002FSwagger, AsciiDoc, PowerPoint (.pptx), RSS\u002FAtom feeds, Man pages, Confluence wikis, Notion pages, Slack\u002FDiscord chat exports\n\nSee [Complete Feature Matrix](docs\u002FFEATURE_MATRIX.md) for detailed platform and feature support.\n\n### Quick Platform Comparison\n\n| Feature | Claude | Gemini | OpenAI | MiniMax | Markdown 
|\n|---------|--------|--------|--------|--------|----------|\n| Format | ZIP + YAML | tar.gz | ZIP + Vector | ZIP + Knowledge | ZIP |\n| Upload | ✅ API | ✅ API | ✅ API | ✅ API | ❌ Manual |\n| Enhancement | ✅ Sonnet 4 | ✅ 2.0 Flash | ✅ GPT-4o | ✅ M2.7 | ❌ None |\n| All Skill Modes | ✅ | ✅ | ✅ | ✅ | ✅ |\n\n---\n\n## Usage Examples\n\n### Documentation Scraping\n\n```bash\n# Scrape documentation website\nskill-seekers scrape --config configs\u002Freact.json\n\n# Quick scrape without config\nskill-seekers scrape --url https:\u002F\u002Freact.dev --name react\n\n# With async mode (2-3x faster)\nskill-seekers scrape --config configs\u002Fgodot.json --async --workers 8\n\n# Use a specific AI agent for enhancement\nskill-seekers scrape --config configs\u002Freact.json --agent kimi\n```\n\n### PDF Extraction\n\n```bash\n# Basic PDF extraction\nskill-seekers pdf --pdf docs\u002Fmanual.pdf --name myskill\n\n# Advanced features: extract tables, fast parallel processing, 8 CPU cores\n# (comments cannot follow a trailing backslash, so they are kept up here)\nskill-seekers pdf --pdf docs\u002Fmanual.pdf --name myskill \\\n    --extract-tables \\\n    --parallel \\\n    --workers 8\n\n# Scanned PDFs (requires: pip install pytesseract Pillow)\nskill-seekers pdf --pdf docs\u002Fscanned.pdf --name myskill --ocr\n```\n\n### Video Extraction\n\n```bash\n# Install video support\npip install skill-seekers[video]        # Transcripts + metadata\npip install skill-seekers[video-full]   # + Whisper + visual frame extraction\n\n# Auto-detect GPU and install visual deps (PyTorch + easyocr)\nskill-seekers video --setup\n\n# Extract from YouTube video\nskill-seekers video --url https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=dQw4w9WgXcQ --name mytutorial\n\n# Extract from a YouTube playlist\nskill-seekers video --playlist https:\u002F\u002Fwww.youtube.com\u002Fplaylist?list=... 
--name myplaylist\n\n# Extract from a local video file\nskill-seekers video --video-file recording.mp4 --name myrecording\n\n# Extract with visual frame analysis (requires video-full deps)\nskill-seekers video --url https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=... --name mytutorial --visual\n\n# With AI enhancement (cleans OCR + generates polished SKILL.md)\nskill-seekers video --url https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=... --visual --enhance-level 2\n\n# Clip a specific section of a video (supports seconds, MM:SS, HH:MM:SS)\nskill-seekers video --url https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=... --start-time 1:30 --end-time 5:00\n\n# Use Vision API for low-confidence OCR frames (requires ANTHROPIC_API_KEY)\nskill-seekers video --url https:\u002F\u002Fwww.youtube.com\u002Fwatch?v=... --visual --vision-ocr\n\n# Re-build skill from previously extracted data (skip download)\nskill-seekers video --from-json output\u002Fmytutorial\u002Fvideo_data\u002Fextracted_data.json --name mytutorial\n```\n\n> **Full guide:** See [docs\u002FVIDEO_GUIDE.md](docs\u002FVIDEO_GUIDE.md) for complete CLI reference,\n> visual pipeline details, AI enhancement options, and troubleshooting.\n\n### GitHub Repository Analysis\n\n```bash\n# Basic repository scraping\nskill-seekers github --repo facebook\u002Freact\n\n# With authentication (higher rate limits)\nexport GITHUB_TOKEN=ghp_your_token_here\nskill-seekers github --repo facebook\u002Freact\n\n# Customize what to include: GitHub Issues (limited to 100) and CHANGELOG.md\n# (comments cannot follow a trailing backslash, so they are kept up here)\nskill-seekers github --repo django\u002Fdjango \\\n    --include-issues \\\n    --max-issues 100 \\\n    --include-changelog\n```\n\n### Unified Multi-Source Scraping\n\n**Combine documentation + GitHub + PDF into one unified skill with conflict detection:**\n\n```bash\n# Use existing unified configs\nskill-seekers unified --config configs\u002Freact_unified.json\nskill-seekers unified --config 
configs\u002Fdjango_unified.json\n\n# Or create unified config\ncat > configs\u002Fmyframework_unified.json \u003C\u003C 'EOF'\n{\n  \"name\": \"myframework\",\n  \"merge_mode\": \"rule-based\",\n  \"sources\": [\n    {\n      \"type\": \"documentation\",\n      \"base_url\": \"https:\u002F\u002Fdocs.myframework.com\u002F\",\n      \"max_pages\": 200\n    },\n    {\n      \"type\": \"github\",\n      \"repo\": \"owner\u002Fmyframework\",\n      \"code_analysis_depth\": \"surface\"\n    }\n  ]\n}\nEOF\n\nskill-seekers unified --config configs\u002Fmyframework_unified.json\n```\n\n**Conflict Detection automatically finds:**\n- 🔴 **Missing in code** (high): Documented but not implemented\n- 🟡 **Missing in docs** (medium): Implemented but not documented\n- ⚠️ **Signature mismatch**: Different parameters\u002Ftypes\n- ℹ️ **Description mismatch**: Different explanations\n\n**Full Guide:** See [docs\u002FUNIFIED_SCRAPING.md](docs\u002FUNIFIED_SCRAPING.md) for complete documentation.\n\n### Private Config Repositories\n\n**Share custom configs across teams using private git repositories:**\n\n```bash\n# Option 1: Using MCP tools (recommended)\n# Register your team's private repo\nadd_config_source(\n    name=\"team\",\n    git_url=\"https:\u002F\u002Fgithub.com\u002Fmycompany\u002Fskill-configs.git\",\n    token_env=\"GITHUB_TOKEN\"\n)\n\n# Fetch config from team repo\nfetch_config(source=\"team\", config_name=\"internal-api\")\n```\n\n**Supported Platforms:**\n- GitHub (`GITHUB_TOKEN`), GitLab (`GITLAB_TOKEN`), Gitea (`GITEA_TOKEN`), Bitbucket (`BITBUCKET_TOKEN`)\n\n**Full Guide:** See [docs\u002FGIT_CONFIG_SOURCES.md](docs\u002FGIT_CONFIG_SOURCES.md) for complete documentation.\n\n## How It Works\n\n```mermaid\ngraph LR\n    A[Documentation Website] --> B[Skill Seekers]\n    B --> C[Scraper]\n    B --> D[AI Enhancement]\n    B --> E[Packager]\n    C --> F[Organized References]\n    D --> F\n    F --> E\n    E --> G[AI Skill .zip]\n    G --> H[Upload to AI 
Platform]\n```\n\n0. **Detect llms.txt** - Checks for llms-full.txt, llms.txt, llms-small.txt first (part of Smart SPA Discovery)\n1. **Scrape**: Extracts all pages from documentation\n2. **Categorize**: Organizes content into topics (API, guides, tutorials, etc.)\n3. **Enhance**: AI analyzes docs and creates comprehensive SKILL.md with examples (supports multiple agents via `--agent`)\n4. **Package**: Bundles everything into a platform-ready `.zip` file\n\n## Architecture\n\nThe system is organized into **8 core modules** and **5 utility modules** (~200 classes total):\n\n![Package Overview](docs\u002FUML\u002Fexports\u002F00_package_overview.png)\n\n| Module | Purpose | Key Classes |\n|--------|---------|-------------|\n| **CLICore** | Git-style command dispatcher | `CLIDispatcher`, `SourceDetector`, `CreateCommand` |\n| **Scrapers** | 18 source-type extractors | `DocToSkillConverter`, `GitHubScraper`, `UnifiedScraper` |\n| **Adaptors** | 20+ output platform formats | `SkillAdaptor` (ABC), `ClaudeAdaptor`, `LangChainAdaptor` |\n| **Analysis** | C3.x codebase analysis pipeline | `UnifiedCodebaseAnalyzer`, `PatternRecognizer`, 10 GoF detectors |\n| **Enhancement** | AI-powered skill improvement via `AgentClient` | `AgentClient`, `AIEnhancer`, `UnifiedEnhancer`, `WorkflowEngine` |\n| **Packaging** | Package, upload, install skills | `PackageSkill`, `InstallAgent` |\n| **MCP** | FastMCP server (40 tools) | `SkillSeekerMCPServer`, 10 tool modules |\n| **Sync** | Doc change detection | `ChangeDetector`, `SyncMonitor`, `Notifier` |\n\nUtility modules: **Parsers** (28 CLI parsers), **Storage** (S3\u002FGCS\u002FAzure), **Embedding** (multi-provider vectors), **Benchmark** (performance), **Utilities** (16 shared helpers).\n\nFull UML diagrams: **[docs\u002FUML_ARCHITECTURE.md](docs\u002FUML_ARCHITECTURE.md)** | StarUML project: `docs\u002FUML\u002Fskill_seekers.mdj` | HTML API reference: `docs\u002FUML\u002Fhtml\u002F`\n\n## 📋 Prerequisites\n\n**Before you start, make 
sure you have:**\n\n1. **Python 3.10 or higher** - [Download](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F) | Check: `python3 --version`\n2. **Git** - [Download](https:\u002F\u002Fgit-scm.com\u002F) | Check: `git --version`\n3. **15-30 minutes** for first-time setup\n\n**First time user?** → **[Start Here: Bulletproof Quick Start Guide](BULLETPROOF_QUICKSTART.md)** 🎯\n\n---\n\n## 📤 Uploading Skills to Claude\n\nOnce your skill is packaged, you need to upload it to Claude:\n\n### Option 1: Automatic Upload (API-based)\n\n```bash\n# Set your API key (one-time)\nexport ANTHROPIC_API_KEY=sk-ant-...\n\n# Package and upload automatically\nskill-seekers package output\u002Freact\u002F --upload\n\n# OR upload existing .zip\nskill-seekers upload output\u002Freact.zip\n```\n\n### Option 2: Manual Upload (No API Key)\n\n```bash\n# Package skill\nskill-seekers package output\u002Freact\u002F\n# → Creates output\u002Freact.zip\n\n# Then manually upload:\n# - Go to https:\u002F\u002Fclaude.ai\u002Fskills\n# - Click \"Upload Skill\"\n# - Select output\u002Freact.zip\n```\n\n### Option 3: MCP (Claude Code)\n\n```\nIn Claude Code, just ask:\n\"Package and upload the React skill\"\n```\n\n---\n\n## 🤖 Installing to AI Agents\n\nSkill Seekers can automatically install skills to 19 AI coding agents.\n\n```bash\n# Install to specific agent\nskill-seekers install-agent output\u002Freact\u002F --agent cursor\n\n# Install to IBM Bob (project-local .bob\u002Fskills\u002F)\nskill-seekers install-agent output\u002Freact\u002F --agent bob\n\n# Install to all agents at once\nskill-seekers install-agent output\u002Freact\u002F --agent all\n\n# Preview without installing\nskill-seekers install-agent output\u002Freact\u002F --agent cursor --dry-run\n```\n\n### Supported Agents\n\n| Agent | Path | Type |\n|-------|------|------|\n| **Claude Code** | `~\u002F.claude\u002Fskills\u002F` | Global |\n| **Cursor** | `.cursor\u002Fskills\u002F` | Project |\n| **VS Code \u002F Copilot** | 
`.github\u002Fskills\u002F` | Project |\n| **Amp** | `~\u002F.amp\u002Fskills\u002F` | Global |\n| **Goose** | `~\u002F.config\u002Fgoose\u002Fskills\u002F` | Global |\n| **OpenCode** | `~\u002F.opencode\u002Fskills\u002F` | Global |\n| **Windsurf** | `~\u002F.windsurf\u002Fskills\u002F` | Global |\n| **Roo Code** | `.roo\u002Fskills\u002F` | Project |\n| **Cline** | `.cline\u002Fskills\u002F` | Project |\n| **Aider** | `~\u002F.aider\u002Fskills\u002F` | Global |\n| **Bolt** | `.bolt\u002Fskills\u002F` | Project |\n| **Kilo Code** | `.kilo\u002Fskills\u002F` | Project |\n| **Continue** | `~\u002F.continue\u002Fskills\u002F` | Global |\n| **Kimi Code** | `~\u002F.kimi\u002Fskills\u002F` | Global |\n| **IBM Bob** | `.bob\u002Fskills\u002F` | Project |\n\n---\n\n## 🔌 MCP Integration (26 Tools)\n\nSkill Seekers ships an MCP server for use from Claude Code, Cursor, Windsurf, VS Code + Cline, or IntelliJ IDEA.\n\n```bash\n# stdio mode (Claude Code, VS Code + Cline)\npython -m skill_seekers.mcp.server_fastmcp\n\n# HTTP mode (Cursor, Windsurf, IntelliJ)\npython -m skill_seekers.mcp.server_fastmcp --transport http --port 8765\n\n# Auto-configure all agents at once\n.\u002Fsetup_mcp.sh\n```\n\n**All 26 tools available:**\n- **Core (9):** `list_configs`, `generate_config`, `validate_config`, `estimate_pages`, `scrape_docs`, `package_skill`, `upload_skill`, `enhance_skill`, `install_skill`\n- **Extended (10):** `scrape_github`, `scrape_pdf`, `unified_scrape`, `merge_sources`, `detect_conflicts`, `add_config_source`, `fetch_config`, `list_config_sources`, `remove_config_source`, `split_config`\n- **Vector DB (4):** `export_to_chroma`, `export_to_weaviate`, `export_to_faiss`, `export_to_qdrant`\n- **Cloud (3):** `cloud_upload`, `cloud_download`, `cloud_list`\n\n**Full Guide:** [docs\u002FMCP_SETUP.md](docs\u002FMCP_SETUP.md)\n\n---\n\n## ⚙️ Configuration\n\n### Available Presets (24+)\n\n```bash\n# List all presets\nskill-seekers list-configs\n```\n\n| Category | Presets 
|\n|----------|---------|\n| **Web Frameworks** | `react`, `vue`, `angular`, `svelte`, `nextjs` |\n| **Python** | `django`, `flask`, `fastapi`, `sqlalchemy`, `pytest` |\n| **Game Development** | `godot`, `pygame`, `unity` |\n| **Tools & DevOps** | `docker`, `kubernetes`, `terraform`, `ansible` |\n| **Unified (Docs + GitHub)** | `react-unified`, `vue-unified`, `nextjs-unified`, and more |\n\n### Creating Your Own Config\n\n```bash\n# Option 1: Interactive\nskill-seekers scrape --interactive\n\n# Option 2: Copy and edit a preset\ncp configs\u002Freact.json configs\u002Fmyframework.json\nnano configs\u002Fmyframework.json\nskill-seekers scrape --config configs\u002Fmyframework.json\n```\n\n### Config File Structure\n\n```json\n{\n  \"name\": \"myframework\",\n  \"description\": \"When to use this skill\",\n  \"base_url\": \"https:\u002F\u002Fdocs.myframework.com\u002F\",\n  \"selectors\": {\n    \"main_content\": \"article\",\n    \"title\": \"h1\",\n    \"code_blocks\": \"pre code\"\n  },\n  \"url_patterns\": {\n    \"include\": [\"\u002Fdocs\", \"\u002Fguide\"],\n    \"exclude\": [\"\u002Fblog\", \"\u002Fabout\"]\n  },\n  \"categories\": {\n    \"getting_started\": [\"intro\", \"quickstart\"],\n    \"api\": [\"api\", \"reference\"]\n  },\n  \"rate_limit\": 0.5,\n  \"max_pages\": 500\n}\n```\n\n### Where to Store Configs\n\nThe tool searches in this order:\n1. Exact path as provided\n2. `.\u002Fconfigs\u002F` (current directory)\n3. `~\u002F.config\u002Fskill-seekers\u002Fconfigs\u002F` (user config directory)\n4. 
SkillSeekersWeb.com API (preset configs)\n\n---\n\n## 📊 What Gets Created\n\n```\noutput\u002F\n├── godot_data\u002F              # Scraped raw data\n│   ├── pages\u002F              # JSON files (one per page)\n│   └── summary.json        # Overview\n│\n└── godot\u002F                   # The skill\n    ├── SKILL.md            # Enhanced with real examples\n    ├── references\u002F         # Categorized docs\n    │   ├── index.md\n    │   ├── getting_started.md\n    │   ├── scripting.md\n    │   └── ...\n    ├── scripts\u002F            # Empty (add your own)\n    └── assets\u002F             # Empty (add your own)\n```\n\n---\n\n## 🐛 Troubleshooting\n\n### No Content Extracted?\n- Check your `main_content` selector\n- Try: `article`, `main`, `div[role=\"main\"]`\n\n### Data Exists But Won't Use It?\n```bash\n# Force re-scrape\nrm -rf output\u002Fmyframework_data\u002F\nskill-seekers scrape --config configs\u002Fmyframework.json\n```\n\n### Categories Not Good?\nEdit the config `categories` section with better keywords.\n\n### Want to Update Docs?\n```bash\n# Delete old data and re-scrape\nrm -rf output\u002Fgodot_data\u002F\nskill-seekers scrape --config configs\u002Fgodot.json\n```\n\n### Enhancement Not Working?\n```bash\n# Check if API key is set\necho $ANTHROPIC_API_KEY\n\n# Try LOCAL mode instead (uses Claude Code Max, no API key needed)\nskill-seekers enhance output\u002Freact\u002F --mode LOCAL\n\n# Monitor background enhancement status\nskill-seekers enhance-status output\u002Freact\u002F --watch\n```\n\n### GitHub Rate Limit Issues?\n```bash\n# Set a GitHub token (5000 req\u002Fhour vs 60\u002Fhour anonymous)\nexport GITHUB_TOKEN=ghp_your_token_here\n\n# Or configure multiple profiles\nskill-seekers config --github\n```\n\n---\n\n## 📈 Performance\n\n| Task | Time | Notes |\n|------|------|-------|\n| Scraping (sync) | 15-45 min | First time only, thread-based |\n| Scraping (async) | 5-15 min | 2-3x faster with `--async` flag |\n| Building | 1-3 min | 
Fast rebuild from cache |\n| Re-building | \u003C1 min | With `--skip-scrape` |\n| Enhancement (LOCAL) | 30-60 sec | Uses Claude Code Max |\n| Enhancement (API) | 20-40 sec | Requires API key |\n| Video (transcript) | 1-3 min | YouTube\u002Flocal, transcript only |\n| Video (visual) | 5-15 min | + OCR frame extraction |\n| Packaging | 5-10 sec | Final .zip creation |\n\n---\n\n## 📚 Documentation\n\n### Getting Started\n- **[BULLETPROOF_QUICKSTART.md](BULLETPROOF_QUICKSTART.md)** - 🎯 **START HERE** if you're new!\n- **[QUICKSTART.md](QUICKSTART.md)** - Quick start for experienced users\n- **[TROUBLESHOOTING.md](TROUBLESHOOTING.md)** - Common issues and solutions\n- **[docs\u002FQUICK_REFERENCE.md](docs\u002FQUICK_REFERENCE.md)** - One-page cheat sheet\n\n### Architecture\n- **[docs\u002FUML_ARCHITECTURE.md](docs\u002FUML_ARCHITECTURE.md)** - UML architecture overview with 14 diagrams\n- **[docs\u002FUML\u002Fexports\u002F](docs\u002FUML\u002Fexports\u002F)** - PNG diagram exports (package overview + 13 class diagrams)\n- **[docs\u002FUML\u002Fhtml\u002F](docs\u002FUML\u002Fhtml\u002Findex.html)** - Full HTML API reference (all classes, operations, attributes)\n- **[docs\u002FUML\u002Fskill_seekers.mdj](docs\u002FUML\u002Fskill_seekers.mdj)** - StarUML project file (open with [StarUML](https:\u002F\u002Fstaruml.io\u002F))\n\n### Guides\n- **[docs\u002FLARGE_DOCUMENTATION.md](docs\u002FLARGE_DOCUMENTATION.md)** - Handle 10K-40K+ page docs\n- **[ASYNC_SUPPORT.md](ASYNC_SUPPORT.md)** - Async mode guide (2-3x faster scraping)\n- **[docs\u002FENHANCEMENT_MODES.md](docs\u002FENHANCEMENT_MODES.md)** - AI enhancement modes guide\n- **[docs\u002FMCP_SETUP.md](docs\u002FMCP_SETUP.md)** - MCP integration setup\n- **[docs\u002FUNIFIED_SCRAPING.md](docs\u002FUNIFIED_SCRAPING.md)** - Multi-source scraping\n- **[docs\u002FVIDEO_GUIDE.md](docs\u002FVIDEO_GUIDE.md)** - Video extraction guide\n\n### Integration Guides\n- 
**[docs\u002Fintegrations\u002FLANGCHAIN.md](docs\u002Fintegrations\u002FLANGCHAIN.md)** - LangChain RAG\n- **[docs\u002Fintegrations\u002FCURSOR.md](docs\u002Fintegrations\u002FCURSOR.md)** - Cursor IDE\n- **[docs\u002Fintegrations\u002FWINDSURF.md](docs\u002Fintegrations\u002FWINDSURF.md)** - Windsurf IDE\n- **[docs\u002Fintegrations\u002FCLINE.md](docs\u002Fintegrations\u002FCLINE.md)** - Cline (VS Code)\n- **[docs\u002Fintegrations\u002FRAG_PIPELINES.md](docs\u002Fintegrations\u002FRAG_PIPELINES.md)** - All RAG pipelines\n\n---\n\n## 📝 License\n\nMIT License - see [LICENSE](LICENSE) file for details\n\n---\n\nHappy skill building! 🚀\n\n---\n\n## 🔒 Security\n\n[![MseeP.ai Security Assessment Badge](https:\u002F\u002Fmseep.net\u002Fpr\u002Fyusufkaraaslan-skill-seekers-badge.png)](https:\u002F\u002Fmseep.ai\u002Fapp\u002Fyusufkaraaslan-skill-seekers)\n","Skill Seekers is a tool that converts documentation websites, GitHub repositories, PDFs, and other sources into Claude AI skills, with automatic conflict detection. The project is written in Python and extracts and structures information from multiple data sources so that it can be integrated into AI systems, for example to generate Claude AI skills, feed retrieval-augmented generation (RAG) pipelines, or assist with code development. Its core features include a powerful AST parser, automated processing workflows, multi-source support, and OCR, which make creating knowledge assets more efficient and accurate. It suits scenarios that need to quickly build a documentation-based knowledge base to drive AI applications, such as internal enterprise knowledge management, software development tooling, or organizing educational content.",2,"2026-06-11 03:39:31","high_star"]