[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-2278":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":13,"contributorsCount":13,"subscribersCount":13,"size":13,"stars1d":13,"stars7d":13,"stars30d":13,"stars90d":13,"forks30d":13,"starsTrendScore":13,"compositeScore":13,"rankGlobal":10,"rankLanguage":10,"license":15,"archived":16,"fork":16,"defaultBranch":17,"hasWiki":18,"hasPages":16,"topics":19,"createdAt":10,"pushedAt":10,"updatedAt":30,"readmeContent":31,"aiSummary":32,"trendingCount":13,"starSnapshotCount":13,"syncStatus":33,"lastSyncTime":34,"discoverSource":35},2278,"litmux","litmux4ai\u002Flitmux","litmux4ai","⚡ Unit tests for AI. Test prompts, compare models, save money.","https:\u002F\u002Flitmux.dev",null,"Python",104,0,1,"MIT License",false,"main",true,[20,21,22,23,24,25,26,27,28,29],"ai","cli","cost-optimization","developer-tools","evaluation","llm","open-source","prompt-engineering","python","testing","2026-06-12 02:00:39","# Litmux\n\nUnit tests for AI. Test prompts, compare models, catch regressions.\n\n\u003Cp align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fpython-3.11+-blue?logo=python&logoColor=white\" \u002F>\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-green\" \u002F>\n  \u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Ftests-107%20passing-brightgreen\" \u002F>\n\u003C\u002Fp>\n\n```bash\npip install litmux && litmux init && litmux run\n```\n\n---\n\n## Why\n\nEvery team shipping AI features hits the same three problems:\n\n1. **No testing standard.** REST has Postman, frontends have Cypress. LLM calls have manual spot-checking.\n2. **Prompt regression is invisible.** A one-word change can silently break 15% of edge cases.\n3. 
**Model selection is vibes.** \"We use GPT-4o because it's good\" — but is it $15k\u002Fmonth better than Gemini Flash?\n\nLitmux gives you a YAML config, pass\u002Ffail assertions, and a cost report. That's it.\n\n---\n\n## Quick Start\n\n```bash\npip install litmux\n\ncp .env.example .env\n# Add at least one: OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY, HF_TOKEN\n\nlitmux init    # scaffold a project\nlitmux run     # run tests against all configured models\n```\n\nNo database, no cloud account, no Docker.\n\n---\n\n## Core Commands\n\n### `litmux run` — unit tests for prompts\n\n```yaml\n# litmux.yaml\nmodels:\n  - model: gpt-4o-mini\n  - model: claude-haiku-4-5-20251001\n\ntests:\n  - name: summarize_earnings\n    prompt: prompts\u002Fsummarize.txt\n    inputs:\n      text: \"Revenue grew 15% to $4.2 billion...\"\n    assert:\n      - type: contains\n        value: \"revenue\"\n      - type: cost-less-than\n        value: 0.01\n```\n\n### `litmux eval` — bulk evaluation against datasets\n\n```yaml\nevals:\n  - name: ticket_classifier\n    prompt: prompts\u002Fclassify.txt\n    dataset: datasets\u002Fsupport_tickets.csv\n    input_mapping:\n      ticket: text\n    expected: expected_category\n    assert:\n      - type: json-valid\n    judge:\n      criteria: \"Did the model correctly classify the ticket?\"\n      threshold: 7.0\n```\n\n### `litmux generate` — AI-generated test datasets\n\n```bash\nlitmux generate \\\n  --prompt prompts\u002Fclassify.txt \\\n  --seed datasets\u002Fsample_tickets.csv \\\n  --n 50 \\\n  --output datasets\u002Fsupport_tickets.csv\n```\n\n### `litmux cost` — cost projection across models\n\n```bash\nlitmux cost --volume 50000\n```\n\nFinds the cheapest model that passes your tests.\n\n### `litmux compare` — side-by-side model outputs\n\n```bash\nlitmux compare\n```\n\n---\n\n## Cloud (Optional, Free)\n\nSync results to a hosted dashboard for history, trends, and team visibility.\n\n```bash\nlitmux login       # one-time browser 
auth\nlitmux run         # results auto-sync\nlitmux dashboard   # open app.litmux.dev\n```\n\nThe CLI works fully offline. Cloud is opt-in.\n\n---\n\n## Assertion Types\n\n| Type | Description |\n|------|-------------|\n| `contains` | Output contains substring |\n| `not-contains` | Output does not contain substring |\n| `regex` | Output matches regex pattern |\n| `json-valid` | Output is valid JSON |\n| `json-schema` | Output has required JSON keys |\n| `cost-less-than` | Cost below threshold (USD) |\n| `latency-less-than` | Latency below threshold (ms) |\n| `llm-judge` | LLM scores output 1–10 against criteria |\n\n---\n\n## CI\u002FCD\n\n```yaml\n# .github\u002Fworkflows\u002Flitmux.yml\n- run: litmux run --ci\n  env:\n    OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}\n```\n\n---\n\n## Configuration\n\n```yaml\nmodels:\n  - provider: openai | anthropic | google | huggingface\n    model: string\n    temperature: 0.0\n    max_tokens: 1024\n\ndefaultTest:\n  assert:\n    - type: cost-less-than\n      value: 0.01\n\ntests:\n  - name: string\n    prompt: path\u002Fto\u002Fprompt.txt\n    inputs: { variable: \"value\" }\n    assert:\n      - type: contains\n        value: \"expected\"\n\nevals:\n  - name: string\n    prompt: path\u002Fto\u002Fprompt.txt\n    dataset: path\u002Fto\u002Fdata.csv\n    input_mapping: { prompt_var: csv_column }\n    expected: csv_column\n    assert: [...]\n    judge:\n      criteria: \"...\"\n      threshold: 7.0\n```\n\n### Environment Variables\n\n| Variable | Purpose |\n|----------|---------|\n| `OPENAI_API_KEY` | OpenAI models, LLM judge, dataset generation |\n| `ANTHROPIC_API_KEY` | Anthropic models |\n| `GOOGLE_API_KEY` | Google models |\n| `HF_TOKEN` | HuggingFace models |\n| `LITMUX_NO_CACHE` | Set to `1` to skip the response cache |\n| `LITMUX_API_URL` | Override cloud API endpoint (default: `https:\u002F\u002Fapi.litmux.dev`) |\n| `LITMUX_API_URL_ALLOW_INSECURE` | Set to `1` to allow non-HTTPS `LITMUX_API_URL` (local dev only) 
|\n| `LITMUX_DASHBOARD_URL` | Override dashboard URL (default: `https:\u002F\u002Fapp.litmux.dev`) |\n| `LITMUX_JUDGE_MODEL` | LLM model used for `llm-judge` assertions (default: `gpt-4o-mini`) |\n| `LITMUX_CLOUD_ENABLED` | Set to `1` to opt in to Litmux Cloud (private beta) |\n\n---\n\n## All Commands\n\n```\nlitmux run                    Run all tests\nlitmux run -t \u003Cname>          Run a specific test\nlitmux run --ci               CI output (markdown)\nlitmux eval                   Run all evals\nlitmux eval --limit 10        Evaluate first N rows\nlitmux generate ...           Generate a test dataset\nlitmux compare                Side-by-side model outputs\nlitmux cost -v 50000          Project monthly cost\nlitmux cache                  View \u002F clear response cache\nlitmux init                   Scaffold a new project\nlitmux version                Show version\n\n# Cloud (private beta — join the waitlist at https:\u002F\u002Flitmux.dev)\nlitmux login                  Authenticate with Litmux Cloud\nlitmux logout                 Remove saved credentials\nlitmux history                Recent runs from cloud\nlitmux dashboard              Open the dashboard\n```\n\n---\n\n## Examples\n\nSee [`examples\u002F`](examples\u002F) for three ready-to-run projects:\n\n- `01-quickstart` — minimal single-model test\n- `02-multi-model` — compare across providers\n- `03-generate-and-eval` — AI-generated dataset + LLM judge\n\n---\n\n## License\n\nMIT\n","Litmux is a unit-testing tool for AI that helps users test prompts, compare models, and save money. Its core features include defining test cases and assertions in a YAML configuration file, support for multiple assertion types such as substring containment and regular-expression matching, and cost reports that help choose more cost-effective models. It also supports bulk evaluation against datasets, test dataset generation, and cross-model cost projection. It suits development teams that frequently adjust AI model parameters or tune prompts, especially when cost control matters. The project is written in Python, is easy to install and deploy, and requires no database or cloud account.",2,"2026-06-11 02:49:14","CREATED_QUERY"]