[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-81350":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":14,"subscribersCount":14,"size":14,"stars1d":14,"stars7d":15,"stars30d":12,"stars90d":14,"forks30d":14,"starsTrendScore":14,"compositeScore":16,"rankGlobal":9,"rankLanguage":9,"license":17,"archived":18,"fork":18,"defaultBranch":19,"hasWiki":18,"hasPages":18,"topics":20,"createdAt":9,"pushedAt":9,"updatedAt":21,"readmeContent":22,"aiSummary":23,"trendingCount":14,"starSnapshotCount":14,"syncStatus":12,"lastSyncTime":24,"discoverSource":25},81350,"oh-my-free-models","hakilee\u002Foh-my-free-models","hakilee","Route your coding agent to the fastest free LLM in real time.",null,"TypeScript",43,2,41,0,1,1.43,"MIT License",false,"main",[],"2026-06-12 02:04:14","\u003Cp align=\"center\">\n  \u003Cimg src=\".\u002Foh-my-free-models-character.png\" height=\"96\" alt=\"oh-my-free-models character\" \u002F>\n\u003C\u002Fp>\n\n# oh-my-free-models\n\nEnglish | [한국어](.\u002Fdocs\u002FREADME.ko.md) | [简体中文](.\u002Fdocs\u002FREADME.zh-CN.md) | [繁體中文](.\u002Fdocs\u002FREADME.zh-TW.md) | [日本語](.\u002Fdocs\u002FREADME.ja.md)\n\n`oh-my-free-models` (`omfm`) is a local proxy that routes your coding agent to the fastest free model across providers. Point your OpenAI- or Anthropic-compatible agent at `localhost`, pick a few free models, and `omfm` keeps requests flowing as latency, rate limits, and quotas shift underneath.\n\nhttps:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F44c07928-1544-4b33-a472-41e82f7aa7d7\n\n> `omfm` driving OpenCode against routed free models.\n\n## Why this exists\n\nFree-tier coding agents look great on paper and break in practice. Four things go wrong:\n\n**Rate limits stop your work mid-task.** Free models on OpenRouter or NVIDIA hit 429 unpredictably. 
A clean run becomes a stalled tool call, and you have to retry by hand.\n\n**Latency drifts hour to hour.** The same free model is fast in the morning and unusable by afternoon. No model is \"the fast one\"; only \"the fast one *right now*\" matters.\n\n**Quotas force manual provider swapping.** When one provider's free quota runs out, you're manually swapping keys and base URLs. Your agent doesn't adapt.\n\n**The free catalog churns.** Models appear, disappear, get deprecated, or quietly start returning errors. You find out by hitting the wall, not from a dashboard.\n\n## What omfm does about it\n\nYou give `omfm` an allowlist of free models you actually want to use. It runs as a local proxy on `http:\u002F\u002Flocalhost:4567` and handles these jobs internally.\n\n| Job | What happens |\n| --- | --- |\n| Latency tracking | Measures and caches per-model latency from your machine. |\n| Request routing | Routes generic requests to the lowest-latency live candidate. |\n| Cooldown | Keeps models that just hit 429 or 402 out of rotation for about 10 minutes. |\n| Client compatibility | Exposes OpenAI-compatible `\u002Fv1` and Anthropic-compatible `\u002Fanthropic` surfaces, including Anthropic tool-use fallback and local token counting. |\n\nYour agent points at `localhost`. Provider switching, rate-limit retries, and picking the currently-fast model all happen below it.\n\n## Get API keys\n\n`omfm` only forwards traffic. You bring keys from one or both providers.\n\n**OpenRouter** — sign up at [openrouter.ai](https:\u002F\u002Fopenrouter.ai), then issue a key under Keys (prefix `sk-or-`). Free `:free` models cap at 50 requests\u002Fday; topping up at least $10 in credits raises the cap to 1,000\u002Fday. No credit card needed for the free cap.\n\n**NVIDIA** — sign up at [build.nvidia.com](https:\u002F\u002Fbuild.nvidia.com) (NVIDIA Developer Program), then click \"Get API Key\" on any model card (prefix `nvapi-`). 
No credit card needed; rate limits apply per model.\n\nAdd whichever you have to `~\u002F.oh-my-free-models\u002F.env` — `omfm` only uses providers whose key is set.\n\n## 30-second try-it\n\n```bash\nnpm install -g oh-my-free-models\nmkdir -p ~\u002F.oh-my-free-models && echo 'OPENROUTER_API_KEY=sk-or-...' > ~\u002F.oh-my-free-models\u002F.env\nomfm model        # pick a few free models in the picker\nomfm start        # serves http:\u002F\u002Flocalhost:4567\n```\n\n## Common commands\n\n| Command | Use |\n| --- | --- |\n| `omfm model` | Open the picker and save selected free models. |\n| `omfm model --all` | Print all eligible models without opening the picker. |\n| `omfm model --no-tui` | Skip the TUI and pick rows from a numbered static table via a single-line prompt. |\n| `omfm model --group fast --best` | Probe the fast group and print the best current candidate. |\n| `omfm start` | Run the local proxy in the foreground with request\u002Fresponse routing logs. |\n| `omfm start --daemon` | Run the local proxy in the background. |\n| `omfm status` | Show daemon, config, and best-route status. |\n| `omfm stop` | Stop the background daemon. |\n| `omfm doctor` | Inspect config paths, keys, model cache, and daemon state. |\n| `omfm usage` | Show per-model request and token observations. 
|\n\n## Use it from your agent\n\nOpenAI-compatible clients (OpenCode, Hermes Agent, OpenClaw, etc.):\n\n```text\nurl=http:\u002F\u002Flocalhost:4567\u002Fv1\nmodel=omfm             # whole pool; or omfm\u002Ffast, omfm\u002Fbalanced, omfm\u002Fcapable\n```\n\nAnthropic-compatible clients (Claude Code, etc.):\n\n```bash\nexport ANTHROPIC_BASE_URL=http:\u002F\u002Flocalhost:4567\u002Fanthropic\nexport ANTHROPIC_AUTH_TOKEN=omfm-local\nexport ANTHROPIC_API_KEY=\n```\n\nFor Claude Code, you can create a shell alias that routes Opus, Sonnet, and Haiku requests to `omfm` groups:\n\n```bash\nalias freeclaude='ANTHROPIC_BASE_URL=http:\u002F\u002Flocalhost:4567\u002Fanthropic ANTHROPIC_AUTH_TOKEN=omfm-local ANTHROPIC_API_KEY= CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1 ANTHROPIC_DEFAULT_OPUS_MODEL=omfm\u002Fcapable ANTHROPIC_DEFAULT_SONNET_MODEL=omfm\u002Fbalanced ANTHROPIC_DEFAULT_HAIKU_MODEL=omfm\u002Ffast claude'\n```\n\nThe bare `omfm` model routes across the entire selected pool, while `omfm\u002Fcapable`, `omfm\u002Fbalanced`, and `omfm\u002Ffast` filter to the matching model groups. The Claude-style aliases `opus`, `sonnet`, and `haiku` are equivalent to those same groups. You can also pass any specific model ID from `omfm model` to pin a request to it.\n\nThe Anthropic surface also supports local `count_tokens` estimates and translates common tool-use\u002Ftool-result flows when a request falls back to an OpenAI-compatible provider route.\n\n## Keep context sizes consistent\n\n`omfm` forwards each request to the routed model. It does not compact, summarize, or truncate the agent's accumulated conversation, so context-window errors are still possible. If a long session starts on a 1M-token model and later routes or fails over to a 128k or 200k model, the smaller model can reject the request once the prompt exceeds its context window. 
Client-side compaction can help, but do not rely on it happening automatically.\n\nWhen selecting models, keep each model group in the same context tier. For example, use only ~1M-token models in `capable` if you run long sessions there, or keep all `fast`, `balanced`, and `capable` groups within the 128k-200k tier. The `omfm model` picker shows each model's context size; unknown context is shown as an unknown marker, so treat it as risky for long sessions.\n\n## More\n\n- Setup, all CLI flags, daemon control, diagnostics: [INSTALLATION.md](.\u002Fdocs\u002FINSTALLATION.md)\n- Routing internals: [docs\u002Flatency-routing.md](.\u002Fdocs\u002Flatency-routing.md)\n- Provider catalog: [docs\u002Fprovider-guide.md](.\u002Fdocs\u002Fprovider-guide.md)\n- License: [MIT](.\u002FLICENSE.md)\n","oh-my-free-models is a local proxy tool that connects your coding agent to the fastest free language model in real time. It monitors latency, handles rate limits and quota issues, automatically selects the best free model available at the moment, and supports OpenAI- and Anthropic-compatible interfaces, keeping the development workflow running smoothly. The project is written in TypeScript and features intelligent request routing, dynamic adjustment, and compatibility with multiple API services. It suits scenarios that rely heavily on free-tier language models for development testing or lightweight project deployment, and can significantly reduce work interruptions caused by fluctuating model performance.","2026-06-11 04:04:43","CREATED_QUERY"]