[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72792":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":27,"readmeContent":28,"aiSummary":29,"trendingCount":16,"starSnapshotCount":16,"syncStatus":30,"lastSyncTime":31,"discoverSource":32},72792,"llm-checker","Pavelevich\u002Fllm-checker","Pavelevich","Advanced CLI tool that scans your hardware and tells you exactly which LLM or sLLM models you can run locally, with full Ollama integration.","https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fllm-checker",null,"JavaScript",2553,168,20,3,0,73,227,420,219,108.68,"Other",false,"main",true,[],"2026-06-12 04:01:07","# LLM Checker\n\n![LLM Checker Animated Logo](https:\u002F\u002Fraw.githubusercontent.com\u002FPavelevich\u002Fllm-checker\u002Fmain\u002Fassets\u002Fllm-checker-logo.gif)\n\n**Intelligent Ollama Model Selector**\n\nAI-powered CLI that analyzes your hardware and recommends optimal LLM models.  
\nDeterministic scoring across **200+ Ollama models** and **7k+ variants** with a packaged SQLite catalog, live sync, and hardware-calibrated memory estimation.\n\n[![npm version](https:\u002F\u002Fimg.shields.io\u002Fnpm\u002Fv\u002Fllm-checker?style=flat-square&color=0066FF)](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fllm-checker)\n[![npm downloads](https:\u002F\u002Fimg.shields.io\u002Fnpm\u002Fdm\u002Fllm-checker?style=flat-square&color=0066FF)](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fllm-checker)\n[![License](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLicense-NPDL--1.0-CC3300?style=flat-square)](LICENSE)\n[![Discord](https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F1457032977849520374?style=flat-square&color=0066FF&label=Discord)](https:\u002F\u002Fdiscord.gg\u002FmnmYrA7T)\n[![Node.js](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fnode-%3E%3D16-0066FF?style=flat-square)](https:\u002F\u002Fnodejs.org\u002F)\n\n[Start Here](#start-here-2-minutes) •\n[Installation](#installation) •\n[Quick Start](#quick-start) •\n[Calibration Quick Start](#calibration-quick-start-10-minutes) •\n[Docs](https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker\u002Ftree\u002Fmain\u002Fdocs) •\n[Claude MCP](#claude-code-mcp) •\n[Commands](#commands) •\n[Scoring](#scoring-system) •\n[Hardware](#supported-hardware) •\n[Discord](https:\u002F\u002Fdiscord.gg\u002FmnmYrA7T)\n\n---\n\n## Why LLM Checker?\n\nChoosing the right LLM for your hardware is complex. 
With thousands of model variants, quantization levels, and hardware configurations, finding the optimal model requires understanding memory bandwidth, VRAM limits, and performance characteristics.\n\n**LLM Checker solves this.** It analyzes your system, scores every compatible model across four dimensions (Quality, Speed, Fit, Context), and delivers actionable recommendations in seconds.\n\n---\n\n## Features\n\n| | Feature | Description |\n|:---:|---|---|\n| **200+** | Packaged Model Catalog | Ships with a synced Ollama SQLite catalog and can refresh from Ollama on demand |\n| **4D** | Scoring Engine | Quality, Speed, Fit, Context &mdash; weighted by use case |\n| **Multi-GPU** | Hardware Detection | Apple Silicon, NVIDIA CUDA, AMD ROCm, Intel Arc, CPU, integrated\u002Fdedicated inventory visibility |\n| **Calibrated** | Memory Estimation | Bytes-per-parameter formula validated against real Ollama sizes |\n| **Zero** | Native Dependencies | Pure JavaScript &mdash; works on any Node.js 16+ system |\n| **Live** | AI Run Metrics | `ai-run` shows response speed in tokens\u002Fsec next to model output |\n\n---\n\n## Documentation\n\n- [Docs Hub](https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker\u002Ftree\u002Fmain\u002Fdocs)\n- [Usage Guide](https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker\u002Fblob\u002Fmain\u002Fdocs\u002Fguides\u002Fusage-guide.md)\n- [Advanced Usage](https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker\u002Fblob\u002Fmain\u002Fdocs\u002Fguides\u002Fadvanced-usage.md)\n- [Technical Reference](https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker\u002Fblob\u002Fmain\u002Fdocs\u002Freference\u002Ftechnical-docs.md)\n- [Changelog](https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker\u002Fblob\u002Fmain\u002Fdocs\u002Freference\u002Fchangelog.md)\n- [Calibration 
Fixtures](https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker\u002Ftree\u002Fmain\u002Fdocs\u002Ffixtures\u002Fcalibration)\n\n---\n\n## Comparison with Other Tooling (e.g. `llmfit`)\n\nLLM Checker and `llmfit` solve related but different problems:\n\n| Tool | Primary Focus | Typical Output |\n|------|---------------|----------------|\n| **LLM Checker** | Hardware-aware **model selection** for local inference | Ranked recommendations, compatibility scores, pull\u002Frun commands |\n| **llmfit** | LLM workflow support and model-fit evaluation from another angle | Different optimization workflow and selection heuristics |\n\nIf your goal is: *\"What should I run on this exact machine right now?\"*, use **LLM Checker** first.  \nIf your goal is broader experimentation across custom pipelines, using both tools can be complementary.\n\n---\n\n## Installation\n\n```bash\n# Install globally\nnpm install -g llm-checker\n\n# Or run directly with npx\nnpx llm-checker hw-detect\n```\n\n**Termux (Android):**\n```bash\npkg update\npkg install ollama\nnpm install -g llm-checker\n```\n\n**Requirements:**\n- Node.js 16+ (any version: 16, 18, 20, 22, 24)\n- [Ollama](https:\u002F\u002Follama.ai) installed for running models\n\nThe package includes a prebuilt model catalog and declares `sql.js` as an optional dependency for SQLite-powered commands. 
If your package manager skips optional dependencies and database commands report `sql.js` missing, reinstall with optional dependencies enabled:\n\n```bash\nnpm install -g llm-checker --include=optional\n```\n\n---\n\n## Start Here (2 Minutes)\n\nIf you are new, use this exact flow:\n\n```bash\n# 1) Install\nnpm install -g llm-checker\n\n# 2) Detect your hardware\nllm-checker hw-detect\n\n# 3) Get recommendations by category\nllm-checker recommend --category coding\n\n# 4) Refresh the catalog when you want current Ollama references\nllm-checker sync\n\n# 5) Run with auto-selection and tokens\u002Fsec metrics\nllm-checker ai-run --category coding --prompt \"Write a hello world in Python\"\n```\n\nIf you already calibrated routing:\n\n```bash\nllm-checker ai-run --calibrated --category coding --prompt \"Refactor this function\"\n```\n\n---\n\n## Distribution\n\nLLM Checker is published in all primary channels:\n\n- npm (latest, recommended): [`llm-checker@latest`](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fllm-checker)\n- GitHub Releases: [Release history](https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker\u002Freleases)\n- GitHub Packages (legacy mirror, may lag): [`@pavelevich\u002Fllm-checker`](https:\u002F\u002Fgithub.com\u002Fusers\u002FPavelevich\u002Fpackages\u002Fnpm\u002Fpackage\u002Fllm-checker)\n\n### Important: Use npm for Latest Builds\n\nIf you need the newest release, install from npm (`llm-checker`), not the scoped GitHub Packages mirror.\n\nIf you installed `@pavelevich\u002Fllm-checker` and version looks old:\n\n```bash\nnpm uninstall -g @pavelevich\u002Fllm-checker\nnpm install -g llm-checker@latest\nhash -r\nllm-checker --version\n```\n\n### v3.5.13 Highlights\n\n- Ships npm packages with a ready-to-use SQLite model catalog:\n  - 229 Ollama models\n  - 7176 variants\n  - real pull counts and `last_updated` metadata\n- `sync` refreshes the local SQLite catalog from Ollama; `recommend`, `list-models`, `ai-run`, and `ai-check` now 
prefer that synced catalog instead of stale scraper cache data.\n- Recommendation normalization was hardened:\n  - no more `pulls: 0` for the full catalog after sync\n  - `335m` style tags are treated as millions, not billions\n  - ambiguous aliases like `latest`, `small`, `medium`, and `large` are not guessed into fake parameter counts\n  - cloud variants are filtered out of local recommendations\n- `ai-run` streams model responses through Ollama and appends measured tokens\u002Fsec so users can compare installed models by real local speed.\n- The interactive panel no longer asks for optional parameters before every command.\n\n### v3.3.0 Highlights\n\n- Calibrated routing is now first-class in `recommend` and `ai-run`:\n  - `--calibrated [file]` support with default discovery path.\n  - clear precedence: `--policy` > `--calibrated` > deterministic fallback.\n  - routing provenance output (source, route, selected model).\n- New calibration fixtures and end-to-end tests for:\n  - `calibrate --policy-out ...` → `recommend --calibrated ...`\n- Hardened Jetson CUDA detection to avoid false CPU-only fallback.\n- Documentation reorganized under `docs\u002F` with clearer onboarding paths.\n\n### Optional (Legacy): Install from GitHub Packages\n\nUse this only if you explicitly need GitHub Packages. It may not match npm latest.\n\n```bash\n# 1) Configure registry + token (PAT with read:packages)\necho \"@pavelevich:registry=https:\u002F\u002Fnpm.pkg.github.com\" >> ~\u002F.npmrc\necho \"\u002F\u002Fnpm.pkg.github.com\u002F:_authToken=${GITHUB_TOKEN}\" >> ~\u002F.npmrc\n\n# 2) Install\nnpm install -g @pavelevich\u002Fllm-checker@latest\n```\n\n---\n\n## Quick Start\n\n```bash\n# 1. Detect your hardware capabilities\nllm-checker hw-detect\n\n# 2. Get full analysis with compatible models\nllm-checker check\n\n# 3. Get intelligent recommendations by category\nllm-checker recommend\n\n# 4. 
Refresh the catalog when you want current Ollama metadata\nllm-checker sync\nllm-checker search qwen --use-case coding\n```\n\n---\n\n## Calibration Quick Start (10 Minutes)\n\nThis path produces both calibration artifacts and verifies calibrated routing in one pass.\n\n### 1) Use the sample prompt suite\n\n```bash\ncp .\u002Fdocs\u002Ffixtures\u002Fcalibration\u002Fsample-suite.jsonl .\u002Fsample-suite.jsonl\n```\n\n### 2) Generate calibration artifacts (dry-run)\n\n```bash\nmkdir -p .\u002Fartifacts\nllm-checker calibrate \\\n  --suite .\u002Fsample-suite.jsonl \\\n  --models qwen2.5-coder:7b llama3.2:3b \\\n  --runtime ollama \\\n  --objective balanced \\\n  --dry-run \\\n  --output .\u002Fartifacts\u002Fcalibration-result.json \\\n  --policy-out .\u002Fartifacts\u002Fcalibration-policy.yaml\n```\n\nArtifacts created:\n\n- `.\u002Fartifacts\u002Fcalibration-result.json` (calibration contract)\n- `.\u002Fartifacts\u002Fcalibration-policy.yaml` (routing policy for runtime commands)\n\n### 3) Apply calibrated routing\n\n```bash\nllm-checker recommend --calibrated .\u002Fartifacts\u002Fcalibration-policy.yaml --category coding\nllm-checker ai-run --calibrated .\u002Fartifacts\u002Fcalibration-policy.yaml --category coding --prompt \"Refactor this function\"\n```\n\nNotes:\n\n- `--policy \u003Cfile>` has precedence over `--calibrated [file]`.\n- If `--calibrated` has no path, discovery uses `~\u002F.llm-checker\u002Fcalibration-policy.{yaml,yml,json}`.\n- `--mode full` currently requires `--runtime ollama`.\n- `.\u002Fdocs\u002Ffixtures\u002Fcalibration\u002Fsample-generated-policy.yaml` shows the expected policy structure.\n\n---\n\n## Claude Code MCP\n\nLLM Checker includes a built-in [Model Context Protocol](https:\u002F\u002Fmodelcontextprotocol.io\u002F) (MCP) server, allowing **Claude Code** and other MCP-compatible AI assistants to analyze your hardware and manage local models directly.\n\n### Setup (One Command)\n\n```bash\n# Install globally first\nnpm 
install -g llm-checker\n\n# Add to Claude Code\nclaude mcp add llm-checker -- llm-checker-mcp\n```\n\nOr generate the exact command directly from the CLI:\n\n```bash\nllm-checker mcp-setup\n```\n\nOr with npx (no global install needed):\n\n```bash\nclaude mcp add llm-checker -- npx llm-checker-mcp\n```\n\nRestart Claude Code and you're done.\n\n### Available MCP Tools\n\nOnce connected, Claude can use these tools:\n\n**Core Analysis:**\n\n| Tool | Description |\n|------|-------------|\n| `hw_detect` | Detect your hardware (CPU, GPU, RAM, acceleration backend) |\n| `check` | Full compatibility analysis with all models ranked by score |\n| `recommend` | Top model picks by category (coding, reasoning, multimodal, etc.) |\n| `installed` | Rank your already-downloaded Ollama models |\n| `search` | Search the Ollama model catalog with filters |\n| `smart_recommend` | Advanced recommendations using the full scoring engine |\n| `ollama_plan` | Build a capacity plan for local models with recommended context\u002Fparallel\u002Fmemory settings |\n| `ollama_plan_env` | Return ready-to-paste `export ...` env vars from the recommended or fallback plan profile |\n| `policy_validate` | Validate a policy file against the v1 schema and return structured validation output |\n| `audit_export` | Run policy compliance export (`json`\u002F`csv`\u002F`sarif`\u002F`all`) for `check` or `recommend` flows |\n| `calibrate` | Generate calibration artifacts from a prompt suite with typed MCP inputs |\n\n**Ollama Management:**\n\n| Tool | Description |\n|------|-------------|\n| `ollama_list` | List all downloaded models with params, quant, family, and size |\n| `ollama_pull` | Download a model from the Ollama registry |\n| `ollama_run` | Run a prompt against a local model (with tok\u002Fs metrics) |\n| `ollama_remove` | Delete a model to free disk space |\n\n**Advanced (MCP-exclusive):**\n\n| Tool | Description |\n|------|-------------|\n| `ollama_optimize` | Generate optimal Ollama env vars 
for your hardware (NUM_GPU, PARALLEL, FLASH_ATTENTION, etc.) |\n| `benchmark` | Benchmark a model with 3 standardized prompts — measures tok\u002Fs, load time, prompt eval |\n| `compare_models` | Head-to-head comparison of two models on the same prompt with speed + response side-by-side |\n| `cleanup_models` | Analyze installed models — find redundancies, cloud-only models, oversized models, and upgrade candidates |\n| `project_recommend` | Scan a project directory (languages, frameworks, size) and recommend the best model for that codebase |\n| `ollama_monitor` | Real-time system status: RAM usage, loaded models, memory headroom analysis |\n| `cli_help` | List all allowlisted CLI commands exposed through MCP |\n| `cli_exec` | Execute any allowlisted `llm-checker` CLI command with custom args (policy\u002Faudit\u002Fcalibrate\u002Fsync\u002Fai-run\u002Fetc.) |\n\n### Example Prompts\n\nAfter setup, you can ask Claude things like:\n\n- *\"What's the best coding model for my hardware?\"*\n- *\"Benchmark qwen2.5-coder and show me the tok\u002Fs\"*\n- *\"Compare llama3.2 vs codellama for coding tasks\"*\n- *\"Clean up my Ollama — what should I remove?\"*\n- *\"What model should I use for this Rust project?\"*\n- *\"Optimize my Ollama config for maximum performance\"*\n- *\"How much RAM is Ollama using right now?\"*\n\nClaude will automatically call the right tools and give you actionable results.\n\n---\n\n## Interactive CLI Panel\n\nRunning `llm-checker` with no arguments now opens an interactive panel (TTY terminals):\n\n- animated startup banner\n- main command list with descriptions\n- type `\u002F` to open all commands\n- use up\u002Fdown arrows to select a command\n- press `Enter` to execute\n- add optional extra flags before run (example: `--json --limit 5`)\n\nFor scripting and automation, direct command invocation remains unchanged:\n\n```bash\nllm-checker check --use-case coding --limit 3\nllm-checker search \"qwen coder\" --json\n```\n\n---\n\n## 
Commands\n\n### Core Commands\n\n| Command | Description |\n|---------|-------------|\n| `hw-detect` | Detect GPU\u002FCPU capabilities, memory, backends |\n| `check` | Full system analysis with compatible models and recommendations |\n| `recommend` | Intelligent recommendations by category (coding, reasoning, multimodal, etc.) |\n| `calibrate` | Generate calibration result + routing policy artifacts from a JSONL prompt suite |\n| `installed` | Rank your installed Ollama models by compatibility |\n| `list-models` | List the synced Ollama catalog by popularity, category, size, or JSON output |\n| `ollama-plan` | Compute safe Ollama runtime env vars (`NUM_CTX`, `NUM_PARALLEL`, `MAX_LOADED_MODELS`) for selected local models |\n| `mcp-setup` | Print\u002Fapply Claude MCP setup command and config snippet (`--apply`, `--json`, `--npx`) |\n| `gpu-plan` | Multi-GPU placement advisor with single\u002Fpooled model-size envelopes |\n| `verify-context` | Verify practical context-window limits for a local model |\n| `amd-guard` | AMD\u002FWindows reliability guard with mitigation hints |\n| `toolcheck` | Test tool-calling compatibility for local models |\n\n### Database Commands\n\n| Command | Description |\n|---------|-------------|\n| `sync` | Refresh the local SQLite model catalog from Ollama |\n| `search \u003Cquery>` | Search the synced catalog with filters and intelligent scoring |\n| `smart-recommend` | Advanced recommendations using the full scoring engine |\n\n### Enterprise Policy Commands\n\n| Command | Description |\n|---------|-------------|\n| `policy init` | Generate a `policy.yaml` template for enterprise governance |\n| `policy validate` | Validate a policy file and return non-zero on schema errors |\n| `audit export` | Evaluate policy outcomes and export compliance reports (`json`, `csv`, `sarif`) |\n\n### Policy Enforcement in `check` and `recommend`\n\nBoth `check` and `recommend` support `--policy \u003Cfile>`.\n\n- In `audit` mode, policy violations are 
reported but the command exits with `0`.\n- In `enforce` mode, blocking violations return non-zero (default `1`).\n- You can override the non-zero code with `enforcement.exit_code` in `policy.yaml`.\n\nExamples:\n\n```bash\nllm-checker check --policy .\u002Fpolicy.yaml\nllm-checker check --policy .\u002Fpolicy.yaml --use-case coding --runtime vllm\nllm-checker recommend --policy .\u002Fpolicy.yaml --category coding\n```\n\n### Calibrated Routing in `recommend` and `ai-run`\n\n`recommend` and `ai-run` now support calibration routing policies generated by `calibrate --policy-out`.\n\n- `--calibrated [file]`:\n  - If `file` is omitted, discovery defaults to `~\u002F.llm-checker\u002Fcalibration-policy.{yaml,yml,json}`.\n- `--policy \u003Cfile>` takes precedence over `--calibrated` for routing resolution.\n- Resolution precedence:\n  - `--policy` (explicit)\n  - `--calibrated` (explicit file or default discovery)\n  - deterministic selector fallback\n- CLI output includes routing provenance (`--policy`, `--calibrated`, or default discovery) and the selected route\u002Fmodel.\n\nExamples:\n\n```bash\nllm-checker recommend --calibrated --category coding\nllm-checker recommend --calibrated .\u002Fcalibration-policy.yaml --category reasoning\nllm-checker ai-run --calibrated --category coding --prompt \"Refactor this function\"\nllm-checker ai-run --policy .\u002Fcalibration-policy.yaml --prompt \"Summarize this report\"\n```\n\n### Policy Audit Export\n\nUse `audit export` when you need machine-readable compliance evidence for CI\u002FCD gates, governance reviews, or security tooling.\n\n```bash\n# Single report format\nllm-checker audit export --policy .\u002Fpolicy.yaml --command check --format json --out .\u002Freports\u002Fcheck-policy.json\n\n# Export all configured formats (json, csv, sarif)\nllm-checker audit export --policy .\u002Fpolicy.yaml --command check --format all --out-dir .\u002Freports\n```\n\n- `--command check|recommend` chooses the candidate source.\n- 
`--format all` honors `reporting.formats` in your policy (falls back to `json,csv,sarif`).\n- In `enforce` mode with blocking violations, reports are still written before non-zero exit.\n\n### Integration Examples (SIEM \u002F CI Artifacts)\n\n```bash\n# CI artifact (JSON) for post-processing in pipeline jobs\nllm-checker audit export --policy .\u002Fpolicy.yaml --command check --format json --out .\u002Freports\u002Fpolicy-report.json\n\n# Flat CSV for SIEM ingestion (Splunk\u002FELK\u002FDataDog pipelines)\nllm-checker audit export --policy .\u002Fpolicy.yaml --command check --format csv --out .\u002Freports\u002Fpolicy-report.csv\n\n# SARIF for security\u002Fcode-scanning tooling integrations\nllm-checker audit export --policy .\u002Fpolicy.yaml --command check --format sarif --out .\u002Freports\u002Fpolicy-report.sarif\n```\n\n### GitHub Actions Policy Gate (Copy-Paste)\n\n```yaml\nname: Policy Gate\non: [pull_request]\n\njobs:\n  policy-gate:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions\u002Fcheckout@v4\n      - uses: actions\u002Fsetup-node@v4\n        with:\n          node-version: 20\n      - run: npm ci\n      - run: node bin\u002Fenhanced_cli.js check --policy .\u002Fpolicy.yaml --runtime ollama --no-verbose\n      - if: always()\n        run: node bin\u002Fenhanced_cli.js audit export --policy .\u002Fpolicy.yaml --command check --format all --runtime ollama --no-verbose --out-dir .\u002Fpolicy-reports\n      - if: always()\n        uses: actions\u002Fupload-artifact@v4\n        with:\n          name: policy-audit-reports\n          path: .\u002Fpolicy-reports\n```\n\n### Provenance Fields in Reports\n\n`check`, `recommend`, and `audit export` outputs include normalized model provenance fields:\n\n- `source`\n- `registry`\n- `version`\n- `license`\n- `digest`\n\nIf a field is unavailable from model metadata, outputs use `\"unknown\"` instead of omitting the field. 
This keeps downstream parsers deterministic.\nLicense values are canonicalized for policy checks (for example `MIT License` -> `mit`, `Apache 2.0` -> `apache-2.0`).\n\n### AI Commands\n\n| Command | Description |\n|---------|-------------|\n| `ai-check` | AI-powered model evaluation with meta-analysis |\n| `ai-run` | AI-powered model selection and execution with live tokens\u002Fsec output |\n\n---\n\n### `ai-run` &mdash; Auto-Select and Run\n\n```bash\nllm-checker ai-run --category coding --prompt \"Write a file parser in Node.js\"\nllm-checker ai-run --benchmark --category general\nllm-checker ai-run --reference-only --category reasoning\n```\n\n`ai-run` chooses the best installed model for the requested category, falls back to the best local alternative when the top catalog pick is not installed, and streams through Ollama directly.\n\nWhen a response completes, the CLI appends measured local speed:\n\n```text\n>>> hi\nHello! How can I help you today?\n[42.8 tokens\u002Fsec]\n```\n\nUse `--reference-only` when you only want the recommendation card and pull command without starting a chat. 
Use `--benchmark` for a quick measured speed check on the selected local model.\n\n---\n\n### `hw-detect` &mdash; Hardware Analysis\n\n```bash\nllm-checker hw-detect\n```\n\n```\nSummary:\n  Apple M4 Pro (24GB Unified Memory)\n  Tier: MEDIUM HIGH\n  Max model size: 15GB\n  Best backend: metal\n\nCPU:\n  Apple M4 Pro\n  Cores: 12 (12 physical)\n  SIMD: NEON\n\nMetal:\n  GPU Cores: 16\n  Unified Memory: 24GB\n  Memory Bandwidth: 273GB\u002Fs\n```\n\nOn hybrid or integrated-only systems, `hw-detect` now also surfaces GPU topology explicitly:\n\n```\nDedicated GPUs: NVIDIA GeForce RTX 4060\nIntegrated GPUs: Intel Iris Xe Graphics\nAssist path: Integrated\u002Fshared-memory GPU detected, runtime remains CPU\n```\n\nThis makes integrated GPUs visible even when the selected runtime backend is still CPU.\n\n### `recommend` &mdash; Category Recommendations\n\n```bash\nllm-checker recommend\n```\n\nUse optimization profiles to steer ranking by intent:\n\n```bash\nllm-checker recommend --optimize balanced\nllm-checker recommend --optimize speed\nllm-checker recommend --optimize quality\nllm-checker recommend --optimize context\nllm-checker recommend --optimize coding\n```\n\n```\nINTELLIGENT RECOMMENDATIONS BY CATEGORY\nHardware Tier: HIGH | Models Analyzed: 205\n\nCoding:\n   qwen2.5-coder:14b (14B)\n   Score: 78\u002F100\n   Fine-tuning: LoRA+QLoRA\n   Command: ollama pull qwen2.5-coder:14b\n\nReasoning:\n   deepseek-r1:14b (14B)\n   Score: 86\u002F100\n   Fine-tuning: QLoRA\n   Command: ollama pull deepseek-r1:14b\n\nMultimodal:\n   llama3.2-vision:11b (11B)\n   Score: 83\u002F100\n   Fine-tuning: LoRA+QLoRA\n   Command: ollama pull llama3.2-vision:11b\n```\n\n`check`, `recommend`, and `ai-check` include a fine-tuning suitability label in output to help choose between Full FT, LoRA, and QLoRA paths.\n\n### `search` &mdash; Model Search\n\n```bash\nllm-checker search llama -l 5\nllm-checker search coding --use-case coding\nllm-checker search qwen --quant Q4_K_M --max-size 
8\n```\n\n| Option | Description |\n|--------|-------------|\n| `-l, --limit \u003Cn>` | Number of results (default: 10) |\n| `-u, --use-case \u003Ctype>` | Optimize for: `general`, `coding`, `chat`, `reasoning`, `creative`, `fast` |\n| `--max-size \u003Cgb>` | Maximum model size in GB |\n| `--quant \u003Ctype>` | Filter by quantization: `Q4_K_M`, `Q8_0`, `FP16`, etc. |\n| `--family \u003Cname>` | Filter by model family |\n\n---\n\n## Model Catalog\n\nLLM Checker ships with a pre-synced SQLite snapshot of the Ollama catalog. On first run, that snapshot is copied to `~\u002F.llm-checker\u002Fmodels.db`, so recommendations and catalog search work immediately after npm install.\n\nThe packaged snapshot currently includes:\n\n- 229 Ollama models\n- 7176 variants\n- pull counts\n- tag counts\n- last-updated metadata\n- variant params, quantization, size, context, and input type fields when available\n\nRefresh it any time:\n\n```bash\nllm-checker sync\n```\n\nFor release maintainers, the packaged seed can be regenerated from the synced local DB:\n\n```bash\nnpm run sync:seed\n```\n\n`recommend`, `list-models`, `ai-run`, and `ai-check` prefer the synced SQLite catalog. 
If the SQLite catalog is unavailable, LLM Checker falls back to the scraped cache and then to the curated catalog.\n\nThe curated fallback catalog includes 35+ models from the most popular Ollama families:\n\n| Family | Models | Best For |\n|--------|--------|----------|\n| **Qwen 2.5\u002F3** | 7B, 14B, Coder 7B\u002F14B\u002F32B, VL 3B\u002F7B | Coding, general, vision |\n| **Llama 3.x** | 1B, 3B, 8B, Vision 11B | General, chat, multimodal |\n| **DeepSeek** | R1 8B\u002F14B\u002F32B, Coder V2 16B | Reasoning, coding |\n| **Phi-4** | 14B | Reasoning, math |\n| **Gemma 2** | 2B, 9B | General, efficient |\n| **Mistral** | 7B, Nemo 12B | Creative, chat |\n| **CodeLlama** | 7B, 13B | Coding |\n| **LLaVA** | 7B, 13B | Vision |\n| **Embeddings** | nomic-embed-text, mxbai-embed-large, bge-m3, all-minilm | RAG, search |\n\nAll available models are automatically combined with locally installed Ollama models for scoring. Ambiguous tags such as `latest`, cloud-only variants, and aliases without reliable size metadata are kept out of local recommendations unless they can be resolved to concrete parameters or artifact sizes.\n\n---\n\n## Scoring System\n\nModels are evaluated across four dimensions, weighted by use case:\n\n| Dimension | Description |\n|-----------|-------------|\n| **Q** Quality | Model family reputation + parameter count + quantization penalty |\n| **S** Speed | Estimated tokens\u002Fsec based on hardware backend and model size |\n| **F** Fit | Memory utilization efficiency (how well it fits in available RAM) |\n| **C** Context | Context window capability vs. 
target context length |\n\n### Scoring Weights by Use Case\n\nThree scoring systems are available, each optimized for different workflows:\n\n**Deterministic Selector** (primary &mdash; used by `check` and `recommend`):\n\n| Category | Quality | Speed | Fit | Context |\n|----------|:-------:|:-----:|:---:|:-------:|\n| `general` | 45% | 35% | 15% | 5% |\n| `coding` | 55% | 20% | 15% | 10% |\n| `reasoning` | 60% | 10% | 20% | 10% |\n| `multimodal` | 50% | 15% | 20% | 15% |\n\n**Scoring Engine** (used by `smart-recommend` and `search`):\n\n| Use Case | Quality | Speed | Fit | Context |\n|----------|:-------:|:-----:|:---:|:-------:|\n| `general` | 40% | 35% | 15% | 10% |\n| `coding` | 55% | 20% | 15% | 10% |\n| `reasoning` | 60% | 15% | 10% | 15% |\n| `chat` | 40% | 40% | 15% | 5% |\n| `fast` | 25% | 55% | 15% | 5% |\n| `quality` | 65% | 10% | 15% | 10% |\n\nAll weights are centralized in `src\u002Fmodels\u002Fscoring-config.js`.\n\n### Memory Estimation\n\nMemory requirements are calculated using calibrated bytes-per-parameter values:\n\n| Quantization | Bytes\u002FParam | 7B Model | 14B Model | 32B Model |\n|:------------:|:-----------:|:--------:|:---------:|:---------:|\n| Q8_0 | 1.05 | ~8 GB | ~16 GB | ~35 GB |\n| Q4_K_M | 0.58 | ~5 GB | ~9 GB | ~20 GB |\n| Q3_K | 0.48 | ~4 GB | ~8 GB | ~17 GB |\n\nThe selector automatically picks the best quantization that fits your available memory.\n\nFor MoE models, deterministic memory estimation supports explicit sparse metadata when present:\n\n- `total_params_b`\n- `active_params_b`\n- `expert_count`\n- `experts_active_per_token`\n\nNormalized recommendation variants expose both snake_case and camelCase metadata aliases\n(for example: `total_params_b` + `totalParamsB`) when available.\n\nMoE parameter path selection is deterministic and uses this fallback order:\n\n1. `active_params_b` (assumption source: `moe_active_metadata`)\n2. 
`total_params_b * (experts_active_per_token \u002F expert_count)` (assumption source: `moe_derived_expert_ratio`)\n3. `total_params_b` (assumption source: `moe_fallback_total_params`)\n4. Model `paramsB` fallback (assumption source: `moe_fallback_model_params`)\n\nDense models continue to use the dense parameter path (`dense_params`) unchanged.\n\nWhen `active_params_b` (or a derived active-ratio path) is available, inference memory\nuses the sparse-active parameter estimate even if artifact size metadata is present.\n\n### Runtime-Aware MoE Speed Estimation\n\nMoE speed estimates now include runtime-specific overhead assumptions (routing, communication, offload), instead of using a single fixed MoE boost.\n\n- Canonical helper: `src\u002Fmodels\u002Fmoe-assumptions.js`\n- Applied in both:\n  - `src\u002Fmodels\u002Fdeterministic-selector.js`\n  - `src\u002Fmodels\u002Fscoring-engine.js`\n\nCurrent runtime profiles:\n\n| Runtime | Routing | Communication | Offload | Max Effective Gain |\n|:--------|:-------:|:-------------:|:-------:|:------------------:|\n| `ollama` | 18% | 13% | 8% | 2.35x |\n| `vllm` | 12% | 8% | 4% | 2.65x |\n| `mlx` | 16% | 10% | 5% | 2.45x |\n| `llama.cpp` | 20% | 14% | 9% | 2.30x |\n\nRecommendation outputs now expose these assumptions through runtime metadata and MoE speed diagnostics.\n\n---\n\n## Supported Hardware\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Apple Silicon\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n- M1, M1 Pro, M1 Max, M1 Ultra\n- M2, M2 Pro, M2 Max, M2 Ultra\n- M3, M3 Pro, M3 Max\n- M4, M4 Pro, M4 Max\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>NVIDIA (CUDA)\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n- RTX 50 Series (5090, 5080, 5070 Ti, 5070)\n- RTX 40 Series (4090, 4080, 4070 Ti, 4070, 4060 Ti, 4060)\n- RTX 30 Series (3090 Ti, 3090, 3080 Ti, 3080, 3070 Ti, 3070, 3060 Ti, 3060)\n- Data Center (H100, A100, A10, L40, T4)\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>AMD 
(ROCm)\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n- RX 7900 XTX, 7900 XT, 7800 XT, 7700 XT\n- RX 6900 XT, 6800 XT, 6800\n- Instinct MI300X, MI300A, MI250X, MI210\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Intel\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n- Arc A770, A750, A580, A380\n- Integrated Iris Xe, UHD Graphics\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>CPU Backends\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n- AVX-512 + AMX (Intel Sapphire Rapids, Emerald Rapids)\n- AVX-512 (Intel Ice Lake+, AMD Zen 4)\n- AVX2 (Most modern x86 CPUs)\n- ARM NEON (Apple Silicon, AWS Graviton, Ampere Altra)\n\n\u003C\u002Fdetails>\n\n---\n\n## Architecture\n\nLLM Checker uses a deterministic pipeline so the same inputs produce the same ranked output, with explicit policy outcomes for governance workflows.\n\n```mermaid\nflowchart LR\n  subgraph Inputs\n    HW[\"Hardware detector\u003Cbr\u002F>CPU\u002FGPU\u002FRAM\u002Fbackend\"]\n    REG[\"Synced SQLite Ollama catalog\u003Cbr\u002F>(packaged seed + live sync)\"]\n    LOCAL[\"Installed local models\"]\n    FLAGS[\"CLI options\u003Cbr\u002F>use-case\u002Fruntime\u002Flimits\u002Fpolicy\"]\n  end\n\n  subgraph Pipeline[\"Selection Pipeline\"]\n    NORMALIZE[\"Normalize and deduplicate model pool\"]\n    PROFILE[\"Hardware profile and memory budget\"]\n    FILTER[\"Use-case\u002Fcategory filtering\"]\n    QUANT[\"Quantization fit selection\"]\n    SCORE[\"Deterministic 4D scoring\u003Cbr\u002F>Q\u002FS\u002FF\u002FC\"]\n    POLICY[\"Policy evaluation (optional)\u003Cbr\u002F>audit or enforce\"]\n    RANK[\"Rank and explain candidates\"]\n  end\n\n  subgraph Outputs\n    REC[\"check \u002F recommend output\"]\n    AUDIT[\"audit export\u003Cbr\u002F>JSON \u002F CSV \u002F SARIF\"]\n    RUN[\"pull\u002Frun-ready commands\"]\n  end\n\n  REG --> NORMALIZE\n  LOCAL --> NORMALIZE\n  HW --> PROFILE\n  FLAGS --> FILTER\n  FLAGS --> POLICY\n  NORMALIZE --> FILTER\n  PROFILE --> QUANT\n  
FILTER --> QUANT\n  QUANT --> SCORE\n  SCORE --> POLICY\n  SCORE --> RANK\n  POLICY --> RANK\n  RANK --> REC\n  POLICY --> AUDIT\n  RANK --> RUN\n```\n\n### Component Responsibilities\n\n- **Input layer**: Collects runtime constraints from hardware detection, local inventory, dynamic registry data, and CLI flags.\n- **Normalization layer**: Deduplicates identifiers\u002Ftags and builds a canonical candidate set.\n- **Selection layer**: Filters by use case, selects the best-fitting quantization, and computes deterministic Q\u002FS\u002FF\u002FC scores.\n- **Governance layer**: Applies policy rules in `audit` or `enforce` mode and records explicit violation metadata.\n- **Output layer**: Returns ranked recommendations plus machine-readable compliance artifacts when requested.\n\n### Execution Stages\n\n1. **Hardware profiling**: Detect CPU\u002FGPU\u002FRAM and effective backend capabilities.\n2. **Model pool assembly**: Merge the synced SQLite catalog (or fallback cache\u002Fcatalog) with locally installed models.\n3. **Candidate filtering**: Keep only relevant models for the requested use case.\n4. **Fit selection**: Choose the best quantization for the available memory budget.\n5. **Deterministic scoring**: Score each candidate across quality, speed, fit, and context.\n6. 
**Policy + ranking**: Apply optional policy checks, then rank and return actionable commands.\n\n---\n\n## Examples\n\n**Detect your hardware:**\n```bash\nllm-checker hw-detect\n```\n\n**Get recommendations for all categories:**\n```bash\nllm-checker recommend\n```\n\n**Full system analysis with compatible models:**\n```bash\nllm-checker check\n```\n\n**Find the best coding model:**\n```bash\nllm-checker recommend --category coding\n```\n\n**Search for small, fast models under 5GB:**\n```bash\nllm-checker search \"7b\" --max-size 5 --use-case fast\n```\n\n**Get high-quality reasoning models:**\n```bash\nllm-checker smart-recommend --use-case reasoning\n```\n\n---\n\n## Development\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker.git\ncd llm-checker\nnpm install\nnode bin\u002Fenhanced_cli.js hw-detect\n```\n\n### Project Structure\n\n```\nsrc\u002F\n  models\u002F\n    deterministic-selector.js  # Primary selection algorithm\n    scoring-config.js          # Centralized scoring weights\n    scoring-engine.js          # Advanced scoring (smart-recommend)\n    catalog.json               # Curated fallback catalog (35+ models, only if dynamic pool unavailable)\n  ai\u002F\n    multi-objective-selector.js  # Multi-objective optimization\n    ai-check-selector.js        # LLM-based evaluation\n  hardware\u002F\n    detector.js                # Hardware detection\n    unified-detector.js        # Cross-platform detection\n  data\u002F\n    model-database.js          # SQLite storage and packaged seed loading\n    seed\u002Fmodels.db             # npm-packaged Ollama catalog snapshot\n    sync-manager.js            # Database sync from Ollama registry\nbin\u002F\n  enhanced_cli.js              # CLI entry point\n```\n\n---\n\n## License\n\nLLM Checker is licensed under **NPDL-1.0** (No Paid Distribution License).\n\n- Free use, modification, and redistribution are allowed.\n- Selling the software or offering it as a paid hosted\u002FAPI 
service is not allowed without a separate commercial license.\n\nSee [LICENSE](https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker\u002Fblob\u002Fmain\u002FLICENSE) for full terms.\n\n---\n\n[GitHub](https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker) •\n[Releases](https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker\u002Freleases) •\n[npm](https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002Fllm-checker) •\n[GitHub Packages](https:\u002F\u002Fgithub.com\u002Fusers\u002FPavelevich\u002Fpackages\u002Fnpm\u002Fpackage\u002Fllm-checker) •\n[Issues](https:\u002F\u002Fgithub.com\u002FPavelevich\u002Fllm-checker\u002Fissues) •\n[Discord](https:\u002F\u002Fdiscord.gg\u002FmnmYrA7T)\n","LLM Checker is an advanced command-line tool that scans your hardware and recommends large language models (LLMs) or small language models (sLLMs) suited to running locally, with full Ollama integration. Its core features include analyzing the system configuration, selecting the best option from more than 200 Ollama models and their 7,000+ variants, and evaluating each model with a four-dimensional scoring system (quality, speed, fit, context). It also provides hardware detection across multiple GPU architectures (such as Apple Silicon and NVIDIA CUDA) and accurate memory-usage estimation. It is suited to developers and enthusiasts who need to deploy AI models locally but are unsure which one best fits their current hardware.",2,"2026-06-11 03:43:38","high_star"]