[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82000":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":13,"contributorsCount":13,"subscribersCount":13,"size":13,"stars1d":13,"stars7d":15,"stars30d":16,"stars90d":13,"forks30d":13,"starsTrendScore":17,"compositeScore":13,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":18,"fork":18,"defaultBranch":19,"hasWiki":18,"hasPages":18,"topics":20,"createdAt":10,"pushedAt":10,"updatedAt":21,"readmeContent":22,"aiSummary":23,"trendingCount":13,"starSnapshotCount":13,"syncStatus":24,"lastSyncTime":25,"discoverSource":26},82000,"SEOCORE","codepurse\u002FSEOCORE","codepurse","Enterprise-grade, multi-threaded SEO Crawler, Rule Engine, and Link Graph Analyzer. Built in TypeScript for speed, compliance, and deep site health audits.","",null,"TypeScript",99,0,1,48,73,6,false,"main",[],"2026-06-12 02:04:22","   # 🕸️ SEOCore\n\n   > Enterprise-grade, multi-threaded SEO Crawler, Rule Engine, and Link Graph Analyzer. Built in TypeScript for speed, compliance, and deep site health audits.\n\n   [![License: MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-MIT-yellow.svg)](https:\u002F\u002Fopensource.org\u002Flicenses\u002FMIT)\n   [![Build Status](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fbuild-passing-brightgreen)](https:\u002F\u002Fgithub.com\u002Fcodepurse\u002FSEOCORE)\n   [![Node Version](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnode-%3E%3D20.0.0-blue.svg)](https:\u002F\u002Fnodejs.org)\n\n   SEOCore is an enterprise-grade, high-performance SEO auditing and site crawling platform. 
It combines a concurrent crawler, Cheerio-based scrapers, a declarative Rules Engine, and Graph Theory to analyze link structures, calculate authority scores, track redirects, and score site health across multiple dimensions.\n\n   ---\n\n   ## 🎯 Target Users\n\n   - **Developers & Web Engineers**: Run local audits, profile rendering pipelines, track performance budgets, and integrate SEO linting directly into CI\u002FCD pipelines.\n   - **SEO Specialists**: Analyze canonicalization, crawl depth, HTTP redirect chains, structured data, canonical compliance, robots.txt directives, and sitemap coverage.\n   - **Site Administrators**: Find broken links, orphan pages, crawl budget waste, and redirect loops.\n\n   ---\n\n   ## 💻 Tech Stack\n\n   - **Runtime**: [Node.js (v20+)](https:\u002F\u002Fnodejs.org\u002F) & [TypeScript](https:\u002F\u002Fwww.typescriptlang.org\u002F)\n   - **Monorepo Manager**: [Nx Monorepo](https:\u002F\u002Fnx.dev\u002F)\n   - **Crawler**: Custom HTTP engine powered by [Bottleneck](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fbottleneck) (rate-limiting) & [p-queue](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fp-queue) (concurrency)\n   - **Headless Browser**: [Playwright](https:\u002F\u002Fplaywright.dev\u002F) (optional, for client-side JavaScript rendering)\n   - **HTML Parser**: [Cheerio](https:\u002F\u002Fcheerio.js.org\u002F) (fast server-side DOM selection)\n   - **Validation & CLI**: [Zod](https:\u002F\u002Fzod.dev\u002F) (configuration schema enforcement) & [Commander.js](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fcommander)\n   - **Test Runner**: [Vitest](https:\u002F\u002Fvitest.dev\u002F)\n\n   ---\n\n   ## ✨ Key Features\n\n   1. 
**Execution Tier System**:\n      - Tiers drive everything from crawl limits to rule selection and scoring behavior\n      - **Fast**: Core rules only, 1 page, static HTML\n      - **Standard**: + Performance, 100 pages, simulated CWV\n      - **Deep**: + All modules, 500 pages, Playwright rendering\n      - **Enterprise**: + Plugins, 5000 pages, Lighthouse sampling\n\n   2. **High-Performance Concurrent Crawler**:\n      - Built-in rate-limiting, custom backoff delays, retry policies, and timeout handlers.\n      - Respects robots.txt directives and extracts URLs from `sitemap.xml` automatically.\n   3. **Path Filtering (Inclusions\u002FExclusions)**:\n      - Restrict audits using wildcards (e.g. `\u002Fblog\u002F*` or `*.html`).\n      - Block admin sections or static resource patterns.\n   4. **Deep Redirect Hop & Loop Tracking**:\n      - Manual redirection handling intercepts 3xx responses.\n      - Traces complete redirect chains (statusCode and hops) and catches circular redirect loops.\n5. **Unified Structured Data & Entity Graph Auditor**:\n   - Compiles Schema.org JSON-LD, Microdata, RDFa elements from raw source HTML and Playwright rendering.\n   - Stitches nodes into an Entity Graph, resolves referencing pointers deeply, and maps DAG layouts safely.\n   - Evaluates E-E-A-T markers (sameAs links pointing to Wikipedia\u002FWikidata\u002FLinkedIn).\n   - Cross-checks schema values (price, title, canonical URL) against HTML headers, canonicals, and OpenGraph\u002FTwitter card tags.\n6. **Crawl Graph & Link Authority Analysis**:\n   - Computes in-degree, out-degree, and custom authority scores (PageRank style).\n   - Flags orphan pages and structural dead ends.\n7. 
**AI Visibility & LLM Crawler Directives Auditor**:\n   - Evaluates brand visibility and structured indexing across search engines, chatbots, and AI crawlers.\n   - Strictly validates crawlability configurations (robots.txt, sitemaps).\n   - Audits `llms.txt` and `\u002F.well-known\u002Fllms.txt` rules for GPTBot, ClaudeBot, PerplexityBot, and Google-Extended.\n8. **Mobile SEO Scorer & Evaluator**:\n   - Evaluates mobile usability (viewport meta, responsive layouts, navigation toggle detection, tap targets).\n   - Scores mobile performance (simulated Core Web Vitals including throttled mobile LCP, mobile CLS, JS payload and requests).\n   - Verifies responsive design quality (CSS media queries, fluid layouts, standard mobile breakpoints).\n   - Audits mobile indexing readiness (content parity, structured data validity, mobile-first canonical configuration).\n   - Enforces `isVerifiable()` guards and strict empty states (non-pass by default) under static crawls, capping unverified performance scores at 50 to ensure high scores require real runtime validation.\n9. **E-E-A-T & Content Quality Analyzer**:\n   - Evaluates Experience, Expertise, Authoritativeness, and Trustworthiness (E-E-A-T) pillars.\n   - Scores content readability (Flesch Reading Ease, Flesch-Kincaid Grade Level).\n   - Analyzes content structure, word count, and internal link density.\n   - Extracts top keywords and checks for keyword stuffing.\n   - Verifies AI citation readiness (structured data completeness, llms.txt presence, semantic HTML usage).\n   - Provides actionable findings with severity levels.\n   - Supports JSON and HTML report exports for documentation and CI\u002FCD integration.\n10. **Outbound Authority Links & Google Rank Checker**:\n   - Analyzes backlink domain metrics (authority counts, referring domains, spam scores).\n   - Verifies keyword visibility in Google's top 10 search results via SerpAPI or headless browser automation.\n11. 
**Competitive Site Comparer**:\n   - Compares health metrics, performance budgets, metadata, and link structures across two different URLs or exported JSON audits.\n12. **Hreflang Validator**:\n    - Validates bidirectional hreflang links across pages.\n    - Checks for consistent x-default configurations.\n    - Validates language code formats.\n    - Deep-crawls all hreflang-referenced pages with `--deep` option.\n    - Exports validation reports in terminal and JSON formats.\n13. **Optional Headless Rendering**:\n    - Boot Playwright to parse single-page apps (SPAs) that require client-side execution.\n14. **Visual Screenshot Capture**:\n    - Capture screenshots of your pages at different breakpoints (mobile, tablet, desktop).\n    - Use Playwright device descriptors (e.g., \"iPhone 15 Pro\") for accurate mobile screenshots.\n    - Capture full-page screenshots of your website.\n    - Deep crawl to capture screenshots for all pages listed in your sitemap.\n15. **Dedicated Image Audit (`images` command)**:\n    - Audits page or site-wide images for SEO, performance, accessibility, and caching.\n    - Discovers assets from `\u003Cimg>`, `\u003Cpicture>`, inline `background-image`, and `\u003Clink rel=\"preload\" as=\"image\">`.\n    - Fetches metadata in parallel (size, format, cache headers, CDN signals) and decodes dimensions with `sharp`.\n    - Optional Playwright mode for rendered vs natural size, viewport placement, and LCP image detection.\n    - Rules cover payload weight, legacy formats, lazy-loading strategy, CLS risk, responsive `srcset`, alt text, and broken\u002Fmixed-content URLs.\n    - Byte-weighted scoring, mobile payload budgets (1.5MB), and LCP image weight targets (100KB).\n    - Exports terminal summary plus JSON or HTML reports with thumbnails and worst-offender tables.\n16. 
**Evidence-based Technology Stack Detection (`technology` command)**:\n    - Analyzes frontend frameworks, rendering strategies, CDN\u002Fedge delivery networks, backend servers, CMS packages, analytics trackers, UI systems, asset fonts, and third-party tools.\n    - Suppresses low-confidence noise. Requires deterministic evidence weights before reporting.\n    - Classifies page rendering strategies directly (Hybrid, SSR, CSR, or static HTML).\n    - Exports findings in terminal tables, structured raw JSON, or clean HTML charts.\n17. **JavaScript SEO Impact Report (`js-impact` command)**:\n    - Compares raw HTML against rendered DOM to detect SEO-relevant changes caused by client-side JavaScript.\n    - Flags metadata, heading, content, links, image, and structured-data parity issues between pre-render and post-render states.\n    - Helps diagnose CSR \u002F hydration problems that can hide content or links from crawlers.\n    - Exports terminal, JSON, HTML, and Markdown reports for debugging and CI workflows.\n18. **Business Directory Presence & NAP Consistency Audit (`directories` command)**:\n    - Detects business listings across major directories and local citation sources.\n    - Extracts source-site NAP data (name, phone, address, website) and compares it against candidate listings.\n    - Classifies listings as `Issues not found`, `Wrong Phone Number`, `Wrong Business Name`, `No Phone Number`, `Not Present`, or `Search failed`.\n    - Uses a resilient HTTP search cascade (`Bing -> Brave -> Mojeek -> DuckDuckGo`) with optional SerpAPI or Playwright fallback.\n    - Outputs terminal tables or raw JSON for citation cleanup, local SEO audits, and missed-opportunity reporting.\n19. 
**Audit Snapshots & Diff**:\n    - Save audit snapshots automatically with `--save` flag\n    - Compare current audit against previous snapshot with `--diff`\n    - CI mode with regression detection (`--diff --ci`) fails only on regressions\n    - Stores snapshots in `.\u002F.seocore\u002Fhistory\u002F\u003Chost>\u002F` directory\n\n20. **Explain & Dry-Run UX**:\n    - Preview audit configuration without crawling with `--dry-run`\n    - Explains active tier, enabled modules, page budget, and active rules\n    - `seocore rules explain \u003Crule-id>` shows detailed rule information\n    - `seocore tier explain \u003Ctier>` shows tier capabilities and configuration\n\n21. **Schema Graph Explorer**:\n    - Analyze structured data entities and their relationships\n    - Detects broken references, duplicate entities, and schema coverage gaps\n    - Exports in terminal, JSON, HTML, and Mermaid diagram formats\n\n22. **Internal Link Planner**:\n    - Generates actionable internal linking recommendations\n    - Identifies orphan pages and low-authority priority pages\n    - Suggests source\u002Ftarget page pairs with anchor text themes\n    - Highlights high-leverage hub pages\n\n23. **Search Opportunities Analyzer**:\n    - Combines crawl findings with optional GSC\u002FCrUX data\n    - Prioritizes opportunities by estimated business impact and ease-of-fix\n    - Works without external providers using heuristic-based ranking\n    - Identifies metadata, performance, indexing, internal links, schema, and content opportunities\n\n24. 
**Production-Ready Reporting**:\n\n    - Real-time colored terminal logging via custom EventBus.\n    - Exports rich, detailed audit logs in terminal, JSON, HTML, and SARIF formats.\n\n   ---\n\n   ## 🏗️ Monorepo Folder Structure\n\n   The project is structured as a modular TypeScript monorepo managed with Nx:\n\n   ```text\n   packages\u002F\n   ├── cli\u002F         # Command-line interface containing CLI commands\n   ├── engine\u002F      # Main orchestrator linking crawling, parsing, and scoring\n   ├── crawler\u002F     # HttpCrawler, PlaywrightCrawler, robots.txt & sitemap parser\n   ├── analyzers\u002F   # Fast cheerio scrapers and page normalizers\n   ├── rules\u002F       # Declarative SEO auditing rules and rule compiler\n   ├── scoring\u002F     # Crawl graph authority & category scoring engines\n   ├── config\u002F      # Config loading, default presets, and Zod schema validation\n   ├── sdk\u002F         # Shared interfaces, events, schemas, and common utilities\n   └── reporter\u002F    # Exporters (TerminalReporter and JsonReporter)\n   ```\n\n   ---\n\n   ## 🚀 Installation\n\n   ### Prerequisites\n   - **Node.js** v20.0.0 or higher\n   - **npm**\n\n   ### Use as Published CLI\n\n   **Global install** (lets you run `seocore` directly):\n\n   ```bash\n   npm install -g seocore\n   seocore config init\n   ```\n\n   **Local project install** (run with `npx`):\n\n   ```bash\n   npm install seocore\n   npx seocore config init\n   ```\n\n   **One-off run** (no install):\n\n   ```bash\n   npx seocore@latest config init\n   ```\n\n   ### Develop from Repository\n\n   1. **Clone repository**:\n      ```bash\n      git clone https:\u002F\u002Fgithub.com\u002Fcodepurse\u002FSEOCORE.git\n      cd SEOCORE\n      ```\n\n   2. **Install dependencies**:\n      ```bash\n      npm install\n      ```\n\n   3. 
**Build monorepo**:\n      ```bash\n      npm run build\n      ```\n\n   ---\n\n   ## 📖 Usage Flow\n\n   SEOCore can be executed via the CLI or imported directly as an SDK.\n\n  ### CLI Usage\n\n  The core CLI executable is `seocore`.\n  If you installed the package locally with `npm install seocore`, run commands as `npx seocore ...`.\n  The examples below use the direct `seocore ...` form, which assumes a global install via `npm install -g seocore`.\n\n  #### Main Commands:\n   - `audit`: Audit a website for SEO, speed, indexing, accessibility, and metadata\n   - `crawl`: Crawl a website and list discovered pages without scoring\n   - `compare`: Compare two websites or SEO audit reports\n   - `images`: Analyze images on a webpage or crawl an entire site for image issues\n   - `technology`: Detect website technology stack with evidence-based confidence scores\n   - `js-impact`: Compare raw HTML vs rendered DOM for JavaScript SEO impact\n   - `directories`: Check business directory presence and NAP consistency across citation sources\n   - `inspect`: Single-aspect probes (robots, sitemap, schema, hreflang, backlinks, keywords, rank, screenshot, llms-txt)\n   - `analyze`: Analyzer-driven deep dives (content, ai-visibility, schema-graph, link-plan, opportunities)\n   - `config`: Manage and validate SEO config\n   - `rules`: Manage and inspect SEO validation rules\n   - `tier`: Manage execution tiers\n\n   ---\n\n   #### 1. Initialize Configuration\n   Generate a default `seocore.config.json` configuration file at your project root:\n   ```bash\n  seocore config init\n   ```\n\n   Show current config:\n   ```bash\n  seocore config show\n   ```\n\n   Validate config:\n   ```bash\n  seocore config validate\n   ```\n\n   #### 2. 
Run a Site Audit\n   Audit a website's landing page (default standard tier):\n   ```bash\n  seocore audit https:\u002F\u002Fexample.com\n   ```\n\n   Audit using specific tiers:\n   ```bash\n   # Fast tier (core rules, 1 page, static HTML)\n  seocore audit https:\u002F\u002Fexample.com --tier fast\n\n   # Standard tier (core + performance, 100 pages, simulated CWV)\n  seocore audit https:\u002F\u002Fexample.com --tier standard\n\n   # Deep tier (all modules, 500 pages, Playwright rendering)\n  seocore audit https:\u002F\u002Fexample.com --tier deep\n\n   # Enterprise tier (all modules + plugins, 5000 pages, Lighthouse sampling)\n  seocore audit https:\u002F\u002Fexample.com --tier enterprise\n   ```\n\n   Export audit as HTML report:\n   ```bash\n  seocore audit https:\u002F\u002Fexample.com --format html --output .\u002Fseocore-report.html\n   ```\n\n   Save audit snapshot for later comparison:\n   ```bash\n  seocore audit https:\u002F\u002Fexample.com --save\n   ```\n\n   Compare current audit against previous snapshot:\n   ```bash\n  seocore audit https:\u002F\u002Fexample.com --diff\n   ```\n\n   Save new snapshot and compare with previous:\n   ```bash\n  seocore audit https:\u002F\u002Fexample.com --save --diff\n   ```\n\n   CI mode - fail on regressions only:\n   ```bash\n  seocore audit https:\u002F\u002Fexample.com --diff --ci\n   ```\n\n   Dry-run - preview what will be audited without crawling:\n   ```bash\n  seocore audit https:\u002F\u002Fexample.com --dry-run\n   ```\n\n   **Audit Flags:** `--save`, `--diff`, `--ci`, `--dry-run`, `--history-dir \u003Cpath>` (custom snapshot directory)\n\n   #### 3. Run Crawler Only\n   Map site structure and list HTTP responses without executing SEO rules or scoring:\n   ```bash\n  seocore crawl https:\u002F\u002Fexample.com --depth 2 --max-pages 100\n   ```\n\n   #### 4. 
Manage SEO Validation Rules\n   List all registered rules, severity levels, and category assignments:\n   ```bash\n  seocore rules list\n   ```\n\n   Describe a specific rule:\n   ```bash\n  seocore rules describe \u003Crule-id>\n   ```\n\n   Explain a specific rule in detail:\n   ```bash\n  seocore rules explain \u003Crule-id>\n   ```\n\n   #### 5. Manage Execution Tiers\n   List all available tiers, their capabilities, and configurations:\n   ```bash\n  seocore tier list\n   ```\n\n   Describe a specific tier:\n   ```bash\n  seocore tier describe \u003Ctier-name>\n   ```\n\n   Explain a specific tier in detail:\n   ```bash\n  seocore tier explain \u003Ctier-name>\n   ```\n\n   #### 6. Analyze AI Visibility & Structure\n   Evaluate search engine\u002Fchatbot discovery, metadata structure, citation readiness, and entity mapping:\n   ```bash\n  seocore analyze ai-visibility https:\u002F\u002Fexample.com\n   ```\n\n   Output results in raw JSON:\n   ```bash\n  seocore analyze ai-visibility https:\u002F\u002Fexample.com --json\n   ```\n\n   #### 7. Analyze E-E-A-T & Content Quality\n   Evaluate Experience, Expertise, Authoritativeness, and Trustworthiness (E-E-A-T), content readability, structure, and AI citation readiness:\n   ```bash\n  seocore analyze content https:\u002F\u002Fexample.com\u002Fblog\u002Fpost\n   ```\n\n   Export as JSON:\n   ```bash\n  seocore analyze content https:\u002F\u002Fexample.com --json --output content-report.json\n   ```\n\n   Export as HTML:\n   ```bash\n  seocore analyze content https:\u002F\u002Fexample.com --format html --output content-report.html\n   ```\n\n   CI mode with budgets:\n   ```bash\n  seocore analyze content https:\u002F\u002Fexample.com --ci --budget-eeat 70 --budget-content 75\n   ```\n\n   #### 7. 
Analyze Schema Graph\n   Explore structured data entities, relationships, and schema completeness:\n   ```bash\n  seocore analyze schema-graph https:\u002F\u002Fexample.com\n   ```\n\n   Export as Mermaid diagram:\n   ```bash\n  seocore analyze schema-graph https:\u002F\u002Fexample.com --format mermaid\n   ```\n\n   Export as JSON or HTML:\n   ```bash\n  seocore analyze schema-graph https:\u002F\u002Fexample.com --format json\n  seocore analyze schema-graph https:\u002F\u002Fexample.com --format html --output schema-graph.html\n   ```\n\n   **Schema Graph Flags:** `--format terminal|json|html|mermaid`, `-o \u003Cpath>`\n\n   #### 8. Analyze Internal Link Plan\n   Generate actionable internal linking recommendations with ranked source → target suggestions, orphan page detection, and hub identification:\n   ```bash\n  seocore analyze link-plan https:\u002F\u002Fexample.com\n   ```\n\n   Show top N recommendations:\n   ```bash\n  seocore analyze link-plan https:\u002F\u002Fexample.com --top 20\n   ```\n\n   Export as JSON:\n   ```bash\n  seocore analyze link-plan https:\u002F\u002Fexample.com --format json --output link-plan.json\n   ```\n\n   Export as HTML report:\n   ```bash\n  seocore analyze link-plan https:\u002F\u002Fexample.com --format html --output link-plan.html\n   ```\n\n   Full site crawl with high-confidence filter:\n   ```bash\n  seocore analyze link-plan https:\u002F\u002Fexample.com --full --min-confidence 60\n   ```\n\n   **Link Plan Flags:**\n   - `--top \u003Cnumber>` — Limit suggestions displayed\n   - `--format terminal|json|html` — Output format (default: terminal)\n   - `-o, --output \u003Cpath>` — Export file path\n   - `--full` — Crawl entire site (100 pages, depth 5)\n   - `-d, --depth \u003Cnumber>` — Crawl depth limit (default: 3)\n   - `-m, --max-pages \u003Cnumber>` — Maximum pages to crawl (default: 50)\n   - `--min-confidence \u003Cnumber>` — Minimum confidence threshold 0-100 (default: 0)\n   - `--max-suggestions-per-target 
\u003Cnumber>` — Max suggestions per target page (default: 5)\n   - `--verbose` — Show additional diagnostic details (scores, signals)\n\n   #### 9. Analyze Search Opportunities\n   Identify high-impact, page-level organic search opportunities ranked by deterministic business impact and ease of fix:\n   ```bash\n  seocore analyze opportunities https:\u002F\u002Fexample.com\n   ```\n\n   Show only top opportunities:\n   ```bash\n  seocore analyze opportunities https:\u002F\u002Fexample.com --top 25\n   ```\n\n   Show only medium\u002Fhigh opportunities:\n   ```bash\n  seocore analyze opportunities https:\u002F\u002Fexample.com --min-priority medium\n   ```\n\n   Export as JSON:\n   ```bash\n  seocore analyze opportunities https:\u002F\u002Fexample.com --format json --output opportunities.json\n   ```\n\n   Export as HTML (with rich summary cards and action plan metrics):\n   ```bash\n  seocore analyze opportunities https:\u002F\u002Fexample.com --format html --output opportunities.html\n   ```\n\n   Enrich with Google Search Console or CrUX field performance data:\n   ```bash\n  seocore analyze opportunities https:\u002F\u002Fexample.com --with-gsc --gsc-file .\u002Fgsc-pages.json --with-crux --crux-file .\u002Fcrux-pages.json\n   ```\n\n   Run deeper crawl with explicit limits:\n   ```bash\n  seocore analyze opportunities https:\u002F\u002Fexample.com --full --depth 5 --max-pages 100\n   ```\n\n   Show verbose ranking inputs and loader warnings:\n   ```bash\n  seocore analyze opportunities https:\u002F\u002Fexample.com --verbose\n   ```\n\n   **Opportunities Flags:**\n   - `-f, --format \u003Cterminal|json|html>`: Output format (default: terminal)\n   - `-o, --output \u003Cpath>`: Export file path\n   - `--with-gsc`: Include GSC metrics\n   - `--gsc-file \u003Cpath>`: GSC JSON export file path\n   - `--with-crux`: Include CrUX performance metrics\n   - `--crux-file \u003Cpath>`: CrUX JSON export file path\n   - `--full`: Crawl the entire site using the command's 
larger default budget\n   - `-d, --depth \u003Cnumber>`: Override crawl depth limit\n   - `-m, --max-pages \u003Cnumber>`: Override maximum crawled pages\n   - `--top \u003Cn>`: Limit shown\u002Fexported top items\n   - `--min-priority \u003Clow|medium|high>`: Filter minimum priority to display\n   - `--verbose`: Show full scoring inputs and warnings\n\n   **Notes:**\n   - Works without external providers using crawl heuristics only.\n   - `--with-gsc` and `--with-crux` improve ranking quality but are optional.\n   - If `--gsc-file` or `--crux-file` is omitted, the command falls back to `.\u002Fgsc-pages.json` and `.\u002Fcrux-pages.json`.\n   - Output is site-level analysis with page-level prioritized actions, not a full enterprise audit replacement.\n\n   #### 10. Inspect Single Aspects\n   The `inspect` command has subcommands for individual checks:\n\n   - **robots**: Verify robots.txt access rules, exclusions, and sitemap references\n     ```bash\n    seocore inspect robots https:\u002F\u002Fexample.com\n     ```\n\n   - **sitemap**: Analyze sitemap.xml and verify all linked URLs are reachable\n     ```bash\n    seocore inspect sitemap https:\u002F\u002Fexample.com --check-links\n     ```\n\n   - **llms-txt**: Verify `llms.txt` and `\u002F.well-known\u002Fllms.txt` rules for AI crawlers like GPTBot, ClaudeBot, and PerplexityBot\n     ```bash\n    seocore inspect llms-txt https:\u002F\u002Fexample.com\n     ```\n\n   - **schema**: Validate Schema.org JSON-LD, Microdata, and RDFa structures\n     ```bash\n    seocore inspect schema https:\u002F\u002Fexample.com\n     ```\n\n   - **hreflang**: Validate a website's hreflang tags for bidirectional links, x-default consistency, and language code validity\n     ```bash\n    seocore inspect hreflang https:\u002F\u002Fexample.com\n     ```\n\n   - **backlinks**: Extract backlink profiles and analyze referring domain authority and spam scores\n     ```bash\n    seocore inspect backlinks https:\u002F\u002Fexample.com\n   
  ```\n\n   - **keywords**: Perform advanced SEO keyword intelligence, noise filtering, and topic clustering\n     ```bash\n    seocore inspect keywords \"behavioral health\"\n     ```\n     With deep expansions:\n     ```bash\n    seocore inspect keywords \"behavioral health\" --expand\n     ```\n     With noise filtering options:\n     ```bash\n    seocore inspect keywords \"behavioral health\" --strict-noise-filter\n     ```\n\n   - **rank**: Check if a target website ranks in Google's top 10 organic results for a given keyword\n     ```bash\n    seocore inspect rank \"seo crawler\" https:\u002F\u002Fexample.com\n     ```\n\n   - **screenshot**: Capture screenshots of a target page or entire website\n     ```bash\n    seocore inspect screenshot https:\u002F\u002Fexample.com --breakpoints mobile,tablet,desktop\n     ```\n\n   #### 11. Compare Site Audits\n   Compare SEO health scores, metadata differences, and performance metrics across two websites or audit files:\n   ```bash\n  seocore compare https:\u002F\u002Fsite-a.com https:\u002F\u002Fsite-b.com --focus technical\n   ```\n\n   #### 12. Audit Images (SEO + Performance)\n   Audit images on a single page or across the site for weight, format, delivery, CLS, LCP, alt text, caching, and broken URLs. 
See [docs\u002Fcommands\u002Fimages.md](docs\u002Fcommands\u002Fimages.md) for the full rule catalog.\n\n   Single page (default):\n   ```bash\n  seocore images https:\u002F\u002Fexample.com\n   ```\n\n   Full site crawl (same origin, respects `robots.txt`; capped at ~100 pages and 500 unique images by default):\n   ```bash\n  seocore images https:\u002F\u002Fexample.com --crawl\n   ```\n\n   Playwright mode (rendered size, viewport, LCP element on the start URL):\n   ```bash\n  seocore images https:\u002F\u002Fexample.com --playwright\n   ```\n\n   Site crawl + Playwright + HTML report:\n   ```bash\n  seocore images https:\u002F\u002Fexample.com --crawl --playwright -f html -o .\u002Fseocore-images-report.html\n   ```\n\n   JSON export with custom thresholds:\n   ```bash\n  seocore images https:\u002F\u002Fexample.com --crawl --max-images 200 --threshold-kb 150 -f json -o .\u002Fimages-audit.json\n   ```\n\n   **Flags:** `--crawl`, `--playwright`, `--threshold-kb` (default 100), `--concurrency` (default 10), `--max-images` (default 500), `--user-agent`, `--timeout` (default 30000ms), `-f json|html`, `-o \u003Cpath>`.\n\n   #### 13. Audit Web Technology Stack\n   Identify framework, CDN, hosting, CMS, libraries, analytics, fonts, and external APIs with confidence ratings:\n   ```bash\n  seocore technology https:\u002F\u002Fexample.com\n   ```\n\n   Show underlying signature evidence lines and raw scores:\n   ```bash\n  seocore technology https:\u002F\u002Fexample.com --verbose\n   ```\n\n   Export stack detection to structured JSON or standalone HTML:\n   ```bash\n  seocore technology https:\u002F\u002Fexample.com --format html --output .\u002Ftechnology-report.html\n   ```\n\n   #### 14. Audit JavaScript SEO Impact\n   Compare raw source HTML against rendered DOM to see what JavaScript changes for crawlers. 
See [docs\u002Fjs-impact.md](docs\u002Fjs-impact.md) for command details and output reference.\n   ```bash\n  seocore js-impact https:\u002F\u002Fexample.com\n   ```\n\n   Use safer wait modes for JS-heavy marketing sites that never go idle:\n   ```bash\n  seocore js-impact https:\u002F\u002Fexample.com --wait-event load --timeout-ms 45000\n   ```\n\n   Export machine-readable JSON or shareable HTML:\n   ```bash\n  seocore js-impact https:\u002F\u002Fexample.com --output json --output-file .\u002Fjs-impact-report.json\n  seocore js-impact https:\u002F\u002Fexample.com --output html --output-file .\u002Fjs-impact-report.html\n   ```\n\n   **Flags:** `--wait-event load|domcontentloaded|networkidle`, `--timeout-ms`, `--wait-extra-ms`, `-o terminal|json|html|markdown`, `--output-file \u003Cpath>`.\n\n   #### 15. Audit Business Directory Presence\n   Check whether a business appears on key local\u002Fbusiness directories and whether the listing NAP matches the source website:\n   ```bash\n seocore directories https:\u002F\u002Fexample.com\n   ```\n\n   Force the multi-engine HTTP cascade search mode:\n   ```bash\n seocore directories https:\u002F\u002Fexample.com --provider cascade\n   ```\n\n   Use live browser search when HTML search engines are blocked:\n   ```bash\n seocore directories https:\u002F\u002Fexample.com --provider playwright --show\n   ```\n\n   Export citation results as JSON:\n   ```bash\n seocore directories https:\u002F\u002Fexample.com --json --output .\u002Fdirectories-report.json\n   ```\n\n   **Search providers:**\n   - `auto`: Use `SERPAPI_KEY` if present, otherwise use the HTTP cascade and fall back to Playwright when needed.\n   - `serpapi`: Most reliable live-search mode when `SERPAPI_KEY` is configured.\n   - `cascade`: HTTP-first search chain using `Bing -> Brave -> Mojeek -> DuckDuckGo`.\n   - `duckduckgo`: Force DuckDuckGo HTML search only.\n   - `playwright`: Browser-driven live search for sites that block HTML endpoints.\n\n   
**Flags:** `--provider auto|serpapi|cascade|duckduckgo|playwright`, `--show`, `--concurrency` (default 4), `--max-candidates` (default 3), `--json`, `-f terminal|json`, `-o \u003Cpath>`.\n\n   **Typical statuses:** `Issues not found`, `Wrong Phone Number`, `Wrong Business Name`, `No Phone Number`, `Not Present`, `Search failed`.\n\n   ### SDK Integration\n\n   Import SEOCore directly into your Node\u002FTypeScript backend:\n\n   ```typescript\n   import { SeoEngine } from '@seocore\u002Fengine';\n   import { EventBus, ExecutionTier } from '@seocore\u002Fsdk';\n\n   \u002F\u002F Initialize the real-time event bus\n   const eventBus = new EventBus();\n\n   \u002F\u002F Log each page as the crawler loads it\n   eventBus.on('page:loaded', (data) => {\n     console.log(`Crawled: ${data.url} | Status: ${data.statusCode}`);\n   });\n\n   \u002F\u002F Run audit using a tier\n   const engine = new SeoEngine(eventBus);\n   const result = await engine.run(\n     'https:\u002F\u002Fexample.com',\n     { \u002F* optional overrides here *\u002F },\n     ExecutionTier.STANDARD\n   );\n\n   console.log(`Overall Health Score: ${result.score}%`);\n   ```\n\n   ---\n\n   ## ⚙️ Configuration\n\n   Custom audits are defined via `seocore.config.json` or inline overrides.\n\n   ### Configuration Schema\n\n   | Option | Type | Default | Description |\n   | :--- | :--- | :--- | :--- |\n   | `tier` | `\"fast\" \\| \"standard\" \\| \"deep\" \\| \"enterprise\"` | `\"standard\"` | Execution tier driving crawl limits, rules, and scoring. Overrides `preset`. |\n   | `preset` | `\"quick\" \\| \"standard\" \\| \"deep\" \\| \"enterprise\"` | `\"standard\"` | Scrape profile adjusting page and depth limits (legacy; use `tier` instead). |\n   | `concurrency` | `number` | `5` | Maximum simultaneous page crawl requests. |\n   | `maxDepth` | `number` | `3` | Maximum link depth allowed from the seed URL. |\n   | `maxPages` | `number` | `100` | Hard cap on total crawled pages per audit. 
|\n   | `rateLimitMs` | `number` | `100` | Delay, in milliseconds, between consecutive requests. |\n   | `retryCount` | `number` | `2` | Number of retries after 5xx failures. |\n   | `playwrightEnabled`| `boolean` | `false` | Enable Playwright headless rendering for SPAs. |\n   | `excludePatterns` | `string[]` | `[]` | Glob\u002Fwildcard path patterns to skip during the crawl. |\n   | `includePatterns` | `string[]` | `[]` | Glob\u002Fwildcard path patterns that the crawl is restricted to. |\n   | `ruleOverrides` | `object` | `{}` | Disable rules or override their weight and severity. Supports `findingSeverityOverrides`. |\n\n   ### Example `seocore.config.json`\n\n   ```json\n   {\n      \"preset\": \"standard\",\n      \"concurrency\": 10,\n      \"maxPages\": 500,\n      \"rateLimitMs\": 50,\n      \"excludePatterns\": [\n         \"\u002Fadmin\u002F*\",\n         \"*\u002Fcheckout\u002F*\",\n         \"*.pdf\"\n      ],\n      \"includePatterns\": [\n         \"\u002Fblog\u002F*\",\n         \"\u002Fproducts\u002F*\"\n      ],\n      \"ruleOverrides\": {\n         \"missing-meta-description\": {\n            \"severity\": \"error\",\n            \"weight\": 8\n         },\n         \"duplicate-h1\": {\n            \"enabled\": false\n         },\n         \"security-headers\": {\n            \"severity\": \"warning\",\n            \"findingSeverityOverrides\": {\n               \"security-headers:missing-csp\": \"error\"\n            }\n         }\n      }\n   }\n   ```\n\n   ---\n\n   ## 👥 Contributing\n\n   We welcome community contributions! Please read our guidelines to get started:\n\n   1. **Fork the repo** and create your branch from `main`.\n   2. Ensure you have Node 20+ installed.\n   3. Write clean, modular TypeScript following existing package patterns.\n   4. Run tests before submitting a pull request:\n      ```bash\n      npm test\n      ```\n   5. Submit detailed PR descriptions mapping features to technical specifications.\n\n   ---\n\n   ## 📄 License\n\n   This project is licensed under the **MIT License**. 
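See [LICENSE](LICENSE) for more details.\n","SEOCORE is an enterprise-grade, multi-threaded SEO crawler, rule engine, and link graph analysis tool. It is built in TypeScript and aims for speed, compliance, and deep site health audits. Core features include an efficient concurrent crawler, Cheerio-based scrapers, a declarative rules engine, and graph-theory analysis of link structures. It also supports a customizable execution tier system that adjusts crawl depth and rule scope as needed, and it offers advanced redirect tracking. It suits developers, SEO specialists, and site administrators who run comprehensive SEO checks and optimization locally or in continuous integration, helping them find broken links, orphan pages, and similar issues, and it can also be used to monitor performance budgets and rendering pipelines.",2,"2026-06-11 04:07:26","CREATED_QUERY"]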