[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-81218":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":15,"stars7d":16,"stars30d":16,"stars90d":15,"forks30d":15,"starsTrendScore":15,"compositeScore":17,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":18,"fork":19,"defaultBranch":20,"hasWiki":18,"hasPages":19,"topics":21,"createdAt":10,"pushedAt":10,"updatedAt":22,"readmeContent":23,"aiSummary":24,"trendingCount":15,"starSnapshotCount":15,"syncStatus":14,"lastSyncTime":25,"discoverSource":26},81218,"gc2oc","notBlubbll\u002Fgc2oc","notBlubbll","GithubCopilot2OpenCode- Ollama-emulating API proxy that routes Visual Studio 2026 Copilot Chat requests to opencode-cloudmodels","",null,"JavaScript",25,10,2,0,1,40.72,true,false,"main",[],"2026-06-12 04:01:32","Superseded with deeper integration by https:\u002F\u002Fgithub.com\u002FnotBlubbll\u002FGC2XY\r\n\r\n# gc2oc — GitHub Copilot to [OpenCode]\r\n\r\n\u003Cimg width=\"213\" height=\"46\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F6a6a2626-1779-4873-a30e-7fc2d8621967\" \u002F>\r\n\r\n\r\nOllama-emulating proxy that connects **GitHub Copilot Chat&Agent** to the [OpenCode](https:\u002F\u002Fopencode.ai) Zen + Go APIs + models.\r\n\r\n**No key needed** — free models work immediately. Add an API key in `.env` to unlock paid models *and* enable **premium** mode (models from your go sub). 
Full tool calling and streaming.\r\n\r\n---\r\n\r\n## Screenshots\r\n\r\n**Console** — model table, context lengths, key status, commands:\r\n\r\n\u003Ctable>\u003Ctr>\r\n\u003Ctd>\u003Cimg width=\"731\" height=\"643\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F0185f535-1000-47ae-af70-ad3abdc9a43c\" \u002F>\u003C\u002Ftd>\r\n\u003Ctd>\u003Cimg alt=\"Console free mode\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F1b19f2d9-5ba8-4d2e-ada6-fd096120d67a\">\u003C\u002Ftd>\r\n\r\n\u003C\u002Ftr>\u003C\u002Ftable>\r\n\r\n**Agent mode** — tool calling with free and paid models:\r\n\r\n\u003Cp align=\"center\">\u003Cimg alt=\"Free model agent mode\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F04c37339-5b65-456d-a57f-7d7a442ea915\">\u003C\u002Fp>\r\n\r\n\u003Cp align=\"center\">\u003Cimg alt=\"Paid model agent mode\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F8e5311f4-6736-422e-8816-fbdc4d5413b0\">\u003C\u002Fp>\r\n\r\nChit-chat:\r\n\u003Cp align=\"center\">\u003Cimg src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F39e7d15b-b56b-4ca6-b31f-e8326606af34\">\u003C\u002Fp>\r\n\r\n---\r\n\r\n## Requirements\r\n\r\n| What | Why |\r\n|------|-----|\r\n| [Bun](https:\u002F\u002Fbun.sh) or [Node.js](https:\u002F\u002Fnodejs.org) | Runtime (Bun preferred, Node as fallback) |\r\n| Visual Studio 2026 **(18.6.0+ incl. Insiders)** | Ollama provider (18.6.0+) |\r\n\r\n---\r\n\r\n## Supported Platforms\r\n\r\n| Client | Status |\r\n|--------|--------|\r\n| VS 2026 (18.6.0+) | ✓ Supported |\r\n| VS 2026 Insiders | ✓ Supported (features may be in flux) |\r\n| VS 2026 (LocalPilot) | ⚠ Unsupported but working |\r\n| VS Code | ⚠ Supported, not fully tested |\r\n| SQL Studio (22.6.0+) | ✓ Supported |\r\n\r\n> **VS 2026**: Ollama provider available in Visual Studio 2026 18.6.0 and later (regular + Insiders). 
Insiders features may be in flux due to its preview nature — expect occasional regressions.\r\n>\r\n> **VS Code**: Works via the GitHub Copilot extension's Ollama provider, but has not been thoroughly tested. Tool calling and streaming may have edge cases.\r\n>\r\n> **SQL Studio**: Ollama provider available in SQL Server Management Studio 22.6.0 and later.\r\n\r\n---\r\n\r\n## Quick Start\r\n\r\n### 1. Run\r\n\r\n```bash\r\nstart.cmd          # Windows\r\nbun run start      # Bun (preferred)\r\nnpm run node       # Node.js fallback\r\n```\r\n\r\n### 2. Add to Visual Studio\r\n\r\n**Requires Visual Studio 2026 18.6.0+ (regular or Insiders).**\r\n\r\n1. Open the Copilot Chat panel\r\n2. Click the model dropdown (next to the agent selector) → **Manage Models**\r\n3. Click **Select Provider** → **Ollama**\r\n4. Leave the endpoint at `http:\u002F\u002Flocalhost:11434` (unless your port differs)\r\n5. Click **Add** — VS fetches models and validates them automatically:\r\n\r\n\u003Ctable>\u003Ctr>\r\n\u003Ctd>\u003Cimg width=\"634\" height=\"752\" alt=\"paid\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fd5810075-3f1f-4326-a6c8-89b2b0fed482\">\u003C\u002Ftd>\r\n\u003Ctd>\u003Cimg width=\"631\" height=\"757\" alt=\"free\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F32b0be79-4664-441f-8bc4-3b154218964f\">\u003C\u002Ftd>\r\n\u003C\u002Ftr>\u003C\u002Ftable>\r\n\r\nYou can now select any model from the dropdown. No model IDs to configure — the proxy resolves display names to the correct API IDs.\r\n\r\n### 3. Add to VS Code\r\n\r\n1. Install GitHub Copilot extension\r\n2. Open Copilot Chat → model dropdown → **Manage Models**\r\n3. Click **Select Provider** → **Ollama**\r\n4. Enter `http:\u002F\u002Flocalhost:11434` as the endpoint\r\n5. 
Click **Add** — models appear with `[FREE]` \u002F `[GO]` prefixes:\r\n\r\n| Prefix | Meaning |\r\n|--------|---------|\r\n| `[FREE]` | Free tier — always available, no key needed |\r\n| `[FREE*]` | Freemium — free model, requires API key (orange in console) |\r\n| `[GO]` | Premium — requires `OPENCODE_API_KEY` in `.env` |\r\n| `[CROF]` | CrofAI — requires `CROF_API_KEY` in `.env` |\r\n| `[M365]` | Microsoft 365 Copilot — requires `M365C_TOKEN_PATH` in `.env` |\r\n\r\n\r\nIn VS Code:\r\n\u003Cimg width=\"1392\" height=\"525\" alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002Fbc6a2a58-776b-4d2e-8fd8-bcf5f15b7bfa\" \u002F>\r\n\r\n\r\n### 4. (Optional) Unlock paid & freemium models\r\n\r\n```env\r\n# .env\r\nOPENCODE_API_KEY=your-go-key\r\nOPENCODE_API_KEYS=[\"key1\",\"key2\"]  # multi-key rotation\r\n```\r\n\r\nAdding an API key enables **freemium** mode — the Zen free models are also available with API key authentication (shown in orange in the console). Paid models are fetched dynamically and support full tool calling.\r\n\r\n**Startup key validation**: On launch, the proxy pings `deepseek-v4-flash` with `max_tokens: 1` to verify the key can run inference. If it returns 429, the key is marked as rate-limited (with timing extracted from the error message, e.g. `Resets in 1 day`) and paid models are hidden from the list — avoiding wasted API calls on an unusable key. If `deepseek-v4-flash` returns 404, it falls back to the first premium model in the API response. Cooldown state is persisted to `.cache\u002Fkey-state.json` and respected on restart.\r\n\r\n### Key Rotation & Rate Limit Protection\r\n\r\nWhen multiple API keys are configured via `OPENCODE_API_KEYS`, the proxy uses an **ApiBalancer** that:\r\n\r\n1. **Shuffles keys** — keys are shuffled and distributed randomly each time the pool is exhausted, preventing predictable rotation patterns\r\n2. 
**Tracks consecutive 429s** — each key's rate limit responses are counted independently\r\n3. **Auto-cooldowns keys**:\r\n   - **10 consecutive 429s** → key removed from rotation\r\n   - **Cooldown duration**: if upstream usage data available, matches the **actual API quota reset** (`rollingUsage` ~5h, `weeklyUsage` ~until Monday UTC, `monthlyUsage` ~1st of month). Otherwise falls back to **5h** (first strike) \u002F **1 week** (second strike)\r\n   - A single **successful request** immediately clears all cooldowns and resets the 429 counter\r\n\r\n#### Key state file\r\n\r\nCooldown state is persisted to `.cache\u002Fkey-state.json`. You can manually edit this file to clear cooldowns or adjust counters:\r\n\r\n```json\r\n{\r\n  \"keys\": {\r\n    \"sk-abc1...xyz9\": {\r\n      \"consecutive429\": 3,\r\n      \"cooldownUntil\": \"2026-05-09T18:00:00.000Z\"\r\n    }\r\n  },\r\n  \"updatedAt\": \"2026-05-09T13:00:00.000Z\"\r\n}\r\n```\r\n\r\n| Field | Description |\r\n|-------|-------------|\r\n| `consecutive429` | Current consecutive 429 count (resets on success) |\r\n| `cooldownUntil` | ISO timestamp when the key returns to rotation (5h or 1 week cooldown) |\r\n\r\nTo manually clear a key's cooldown, delete its entry or remove `cooldownUntil`, then restart the proxy.\r\n\r\n### Key Cooldown Checker (startup + refresh)\r\n\r\nOn startup and on each model refresh, the proxy loads `.cache\u002Fkey-state.json` and restores any active cooldowns into the `ApiBalancer`:\r\n\r\n1. **`_restoreState()`** — reads the JSON file, maps `short` key fragments (`sk-abc1...xyz9`) back to full key strings from the env, sets `cooldownUntil` entries for non-expired cooldowns. Logs `[keys] restored N cooldown(s) from cache` on success.\r\n\r\n2. **Direct disk safety net** (`fetchGoModelsRaw`) — as a second check, reads `key-state.json` directly and builds a `cooldownFromDisk` Map. The `keyInCooldown()` helper checks both the in-memory `_balancer.cooldownUntil` AND the direct disk Map. 
This ensures a key in cooldown is never pinged even if the `_restoreState` mapping fails silently (e.g. key format mismatch between sessions).\r\n\r\n3. **Individual key cooldown** — before each ping in `fetchGoModelsRaw`, `keyInCooldown(k)` is called. Keys in cooldown log `[keys] key[N] in cooldown (~Xs) — skipping` and are never contacted.\r\n\r\n4. **All-key cooldown** — if every key is in cooldown, the entire paid model fetch is skipped with `[keys] all keys in cooldown — skipping paid models`.\r\n\r\nThis means a key rate-limited from a previous session will never be pinged on restart — it's skipped entirely, saving a wasted 429 roundtrip.\r\n\r\n---\r\n\r\n## Configuration\r\n\r\n| Variable | Default | Description |\r\n|----------|---------|-------------|\r\n| `SERVER_PORT` | `11434` | Listen port |\r\n| `SERVER_HOST` | `127.0.0.1` | Listen host |\r\n| `DEFAULT_MODEL` | `big-pickle` | Fallback model |\r\n| `DEFAULT_TEMPERATURE` | — | Global temperature (e.g. `0.1`) |\r\n| `M365CO_PORT` | — | M365 WebSocket relay port (e.g. 
`8765`) |\r\n| `CROF_API_KEY` | — | CrofAI API key for Crof models (set to enable the Crof model section) |\r\n| `CACHE_ENABLED` | `true` | Prompt cache |\r\n| `CACHE_MAX_SIZE` | `64` | Max cached entries |\r\n| `CACHE_TTL_SEC` | `300` | Cache TTL |\r\n| `REQUEST_LOG` | `true` | Log incoming requests to console |\r\n| `HIDE_FREE` | `false` | Hide free models and `[FREE]`\u002F`[GO]` tags & dividers |\r\n| `SESSION_KEEPALIVE_ENABLED` | `true` | Keep KV cache warm between turns |\r\n| `SESSION_KEEPALIVE_INTERVAL_MS` | `120000` | Keepalive ping interval (ms) |\r\n| `SESSION_KEEPALIVE_IDLE_TIMEOUT_MS` | `600000` | Stop keepalive after idle (ms) |\r\n| `SESSION_KEEPALIVE_MAX_LIFETIME_MS` | `86400000` | Cycle upstream cache after (ms, 24h) |\r\n| `DISPLAY_REASONING` | `false` | Mirror DeepSeek thinking tokens into Cursor\u002FVS-visible markdown blocks |\r\n| `COLLAPSIBLE_REASONING` | `true` | Use collapsible `\u003Cdetails>` blocks when `DISPLAY_REASONING=true` |\r\n\r\n---\r\n\r\n## Models\r\n\r\nModels appear in VS Code's Copilot list as `[FREE] Model Name`, `[FREE*] Model Name`, `[GO] Model Name`, `[CROF] Model Name`, and `[M365] M365 Copilot` — the prefix indicates free vs freemium vs paid vs Crof vs M365 tier at a glance.\r\n\r\n**Free** (always available, auto-validated): Big Pickle, MiniMax M2.5 Free, Nemotron 3 Super Free, Ring 2.6 1T Free\r\n\r\n**Freemium** (requires API key, same Zen free models — auto-detected via ping): Big Pickle, MiniMax M2.5 Free, Nemotron 3 Super Free, Ring 2.6 1T Free — these route to the same free Zen endpoint but authenticate with your API key. On startup, each free model is pinged without a key; if it returns 401, it's retried with your API key. Models that succeed with the key are marked **freemium** and shown in orange in the console.\r\n\r\n**Paid** (requires Go API key): fetched dynamically from OpenCode — all support tool calling\r\n\r\n### Thinking Modes\r\n\r\nSome models support adjustable thinking effort. 
These appear in the model list both as a **default entry** (no thinking mode applied) and as separate tagged entries per thinking level:\r\n\r\n| Tag | Meaning |\r\n|-----|---------|\r\n| ` [LOW]` | Low reasoning effort |\r\n| ` [MED]` | Medium reasoning effort |\r\n| ` [HIGH]` | High reasoning effort |\r\n| ` [MAX]` | Maximum reasoning effort |\r\n\r\nModels that think on their own (GLM, Kimi, MiniMax, Qwen) appear without tags — their thinking is handled internally and can't be controlled.\r\n\r\n> This thinking-mode tagging on the model name will be replaced with native thinking controls once Visual Studio adds proper support for it.\r\n\r\n**M365 Copilot** (optional, requires `M365CO_PORT`): your company's Microsoft 365 Copilot chat — two models (Quick + Think), chat-only, no tools\r\n\r\n---\r\n\r\n## VS 2026 File Creation\r\n\r\nWhen using VS 2026 agent mode, the proxy handles file creation and project integration:\r\n\r\n- **New files** (`.css`, `.js`, `.py`, etc.) are created via tool calls — written to disk with absolute workspace paths\r\n- **Project files** (`.csproj`, `.vbproj`, `.fsproj`, etc.) are handled natively — markdown code blocks pass through for VS to edit in-place\r\n- **Auto-injection**: new files are automatically added to the project's `.csproj` with the correct `\u003CContent Include=\"...\" \u002F>` entry\r\n- **Workspace root** is extracted from VS 2026's IDE state context — relative file paths are resolved automatically\r\n\r\nTo create a new file, just ask Copilot (e.g. \"create me a css file called test.css\"). The AI will:\r\n1. Create the file\r\n2. Read the project file\r\n3. 
Add the file reference to the project\r\n\r\n---\r\n\r\n## API Endpoints\r\n\r\n| Endpoint | Method | Description |\r\n|----------|--------|-------------|\r\n| `\u002Fapi\u002Ftags` | GET | Ollama model list with capabilities, context length, pricing |\r\n| `\u002Fv1\u002Fchat\u002Fcompletions` | POST | Chat with tool calling, streaming, cache |\r\n| `\u002Fv1\u002Fengines\u002Fcopilot-codex\u002Fcompletions` | POST | Inline code completions |\r\n| `\u002Fapi\u002Fshow` | POST | Model detail with full capabilities, context, pricing |\r\n| `\u002Fapi\u002Fstats` | GET | Proxy metrics (uptime, model counts, concurrency, reasoning cache, key status) |\r\n| `\u002Fapi\u002Frefresh` | POST | Force refresh model list from upstream APIs |\r\n| `\u002Fapi\u002Fdiagnostics` | POST | Self-test with tool-calling roundtrip (connectivity, streaming, tool verification) |\r\n| `\u002Fhealth` | GET | Health check with model counts |\r\n| `\u002Fapi\u002Fversion` | GET | Returns `420.96.00` |\r\n| `\u002Fstop` | GET | Shutdown |\r\n\r\n---\r\n\r\n## Commands\r\n\r\n| Command | Action |\r\n|---------|--------|\r\n| `r` \u002F `restart` | Restart proxy |\r\n| `s` \u002F `stop` | Shut down |\r\n| `e` \u002F `exit` | Shut down |\r\n\r\nOr `curl http:\u002F\u002Flocalhost:11434\u002Fstop`\r\n\r\n---\r\n\r\n## Version Check\r\n\r\nOn startup, the proxy fetches the latest ticks from [`notBlubbll\u002Fgc2oc\u002F.version`](https:\u002F\u002Fraw.githubusercontent.com\u002FnotBlubbll\u002Fgc2oc\u002Fmain\u002F.version) (raw) and compares them with the local `.version` file. 
If they differ, the repo has been updated — the **console title** changes to:\r\n\r\n```\r\ngc2oc (outdated, check github for new version)\r\n```\r\n\r\nThe status line shows green when up to date (match) and red when outdated (mismatch).\r\n\r\n> A GitHub Actions workflow writes the current UNIX timestamp in ms to the `.version` file on each push to `main`.\r\n\r\n---\r\n\r\n## TPS Tracker\r\n\r\nThe proxy tracks tokens-per-second throughput and displays a rolling average in the console title.\r\n\r\n| State | Title |\r\n|-------|-------|\r\n| No activity | `gc2oc` |\r\n| After requests | `gc2oc [42.5 t\u002Fs]` |\r\n| Outdated + TPS | `gc2oc (outdated, check github for new version) [42.5 t\u002Fs]` |\r\n\r\nSet `SHOW_TPS=false` in `.env` to disable.\r\n\r\n---\r\n\r\n## Self-Updater\r\n\r\nNo git required. `update.cmd` downloads the latest `main.zip` from GitHub and applies only changed files — your config and caches are preserved.\r\n\r\n```\r\nupdate.cmd\r\n```\r\n\r\n| Step | What happens |\r\n|------|-------------|\r\n| Download | Fetches `main.zip` from the repo |\r\n| Extract | Unzips to a temp folder |\r\n| Compare | MD5-hashes every file — copies only new or changed files |\r\n| Preserve | `.env`, `.cache\u002F`, `.dist\u002F`, `node_modules\u002F`, `.git\u002F` are never touched |\r\n| Cleanup | Temp folder removed automatically |\r\n\r\nEach file is labeled `NEW` (first time), `UPD` (changed), or `SKIP` (preserved) so you can see exactly what was updated. Restart the proxy after updating to use the new code.\r\n\r\n### Update + Build\r\n\r\n`update-and-build.cmd` fetches the latest source then runs `build.cmd` in one step — pull the newest code and produce a fresh `.dist` standalone.\r\n\r\n```\r\nupdate-and-build.cmd\r\n```\r\n\r\n---\r\n\r\n## Caching & Validation\r\n\r\n### Disk cache (`models.json`)\r\n\r\nThe full model list (free + paid + Pollinations + M365) is cached to `.cache\u002Fmodels.json`. 
On restart, if the key hash matches and no relevant config changed (free tier models, M365 token path), the cache is loaded instantly — no upstream API calls needed.\r\n\r\nInvalidation triggers:\r\n- **Key hash changes** — keys added, removed, or rotated → full refresh\r\n- **Free tier models** changed in code — SHA256 hash of all free model IDs compared to cached value\r\n- **M365 token** set or removed — cached M365 presence vs current env mismatch\r\n\r\n### Key hash cache (`keyhash.json`)\r\n\r\nSHA256 hash of all API keys (sorted, deduped) persisted to `.cache\u002Fkeyhash.json`. Used at startup to detect key changes without re-parsing `.env` — if the hash matches, paid models load from disk cache instantly.\r\n\r\n### Key state cache (`key-state.json`)\r\n\r\nPersists per-key cooldown state between restarts. See [Key Cooldown Checker](#key-cooldown-checker-startup--refresh) above for the full load\u002Frestore flow. The file is written on every cooldown state change and loaded on startup and on each `refreshModels()` call.\r\n\r\n### Prompt cache (in-memory LRU)\r\n\r\nLRU with TTL. Responses are keyed by a hash of model + temperature + tool count + **session discriminator** + normalized messages. Cache hits replay instantly and consume no upstream tokens. Controlled by `CACHE_ENABLED`\u002F`CACHE_MAX_SIZE`\u002F`CACHE_TTL_SEC`.\r\n\r\n### Reasoning cache (in-memory, session-scoped)\r\n\r\nPer-session reasoning text from `\u003Cthink>` tags is cached and re-attached when a cached prompt-response pair is replayed. This ensures DeepSeek-style reasoning isn't lost on cache hits. 
**Session-scoped** — different conversations never share reasoning data, even if they produce the same text.\r\n\r\n#### Multi-tier reasoning key system (inspired by [yxlao\u002Fdeepseek-cursor-proxy](https:\u002F\u002Fgithub.com\u002Fyxlao\u002Fdeepseek-cursor-proxy))\r\n\r\nReasoning is stored under multiple lookup keys to maximize cache hit rates across tool-call conversations:\r\n\r\n| Key type | Description | Survives |\r\n|----------|-------------|----------|\r\n| Message signature | SHA256 of content + canonicalized tool calls | Exact match |\r\n| Tool call ID | Per-call `id` field | Argument re-ordering |\r\n| Tool call signature | SHA256 of tool name + normalized args | ID reassignment |\r\n| Tool name | Plain function name | Interrupted streams, missing IDs |\r\n| Legacy content hash | Simplified content-based hash | Backward compatibility |\r\n\r\n**Lookup priority**: Message signature → Tool call IDs → Tool call signatures → Tool names → Legacy content hash → FIFO → Global last-reasoning fallback.\r\n\r\n**Smart memory management**: LRU eviction at 5000 entries, preserving permanent fallback keys (`g:*:last`, `*:mdl:*`). No disk persistence — everything stays in memory for speed.\r\n\r\n#### Thinking display\r\n\r\nWhen `DISPLAY_REASONING=true`, DeepSeek reasoning tokens are mirrored into Cursor\u002FVS Code-visible content as Markdown blocks:\r\n\r\n- **Collapsible** (`COLLAPSIBLE_REASONING=true`, default): `\u003Cdetails>\u003Csummary>Thinking\u003C\u002Fsummary>...\u003C\u002Fdetails>`\r\n- **Plain** (`COLLAPSIBLE_REASONING=false`): `\u003Cthink>...\u003C\u002Fthink>`\r\n\r\nEchoed thinking blocks in incoming assistant content are automatically stripped before forwarding to upstream APIs. This prevents \"reasoning doubling\" when VS\u002FCursor echoes back the proxy-injected display blocks.\r\n\r\n#### Reasoning recovery (400 error handling)\r\n\r\nWhen the upstream DeepSeek API returns `reasoning_content must be passed back`, the proxy:\r\n\r\n1. 
**Tier 1** — Retries with thinking disabled via `thinking: false` (preserves full history)\r\n2. **Tier 2** — Strips all assistant\u002Ftool messages, keeping only system + user\r\n\r\nThis matches the recovery strategy in [yxlao\u002Fdeepseek-cursor-proxy](https:\u002F\u002Fgithub.com\u002Fyxlao\u002Fdeepseek-cursor-proxy), which pioneered this approach for Cursor compatibility.\r\n\r\n### Free model validation\r\n\r\nOn startup and refresh, each free model is pinged with a lightweight request (`max_tokens: 1`). Only responding models appear in the list. Results are cached to disk in `.cache\u002Fmodels.json`.\r\n\r\n### Connectivity ping\r\n\r\nA quick `big-pickle` ping runs at startup to verify Zen API reachability: `200 ok`, `401 key denied`, or `unreachable`.\r\n\r\n---\r\n\r\n## Session Tracking\r\n\r\nThe proxy detects and numbers distinct conversation sessions. Each new chat tab or task context gets a monotonically increasing session ID, visible in the console:\r\n\r\n```\r\nnew session 3 (vscode, go\u002Fdeepseek-v4-flash, c:\\workspace\\project)\r\n[vscode][3]>[go\u002Fdeepseek-v4-flash]\r\n[vscode][3] stream done (42 chunks)\r\n```\r\n\r\n### How sessions are detected\r\n\r\nA session is identified by hashing:\r\n\r\n1. **All user messages before the first assistant\u002Ftool message** — VS sends the context block + the user's query as separate user messages, and the combined hash uniquely identifies each chat tab\r\n2. **Workspace root** — same query in a different project is treated as a different session\r\n\r\nSwitching models mid-conversation keeps the same session — the message history and cached context carry over.\r\n\r\nThis means different chat tabs and different workspaces get separate session IDs with isolated caches.\r\n\r\n### Cache isolation\r\n\r\nAll in-memory caches (prompt-response LRU, reasoning\u002F`\u003Cthink>` text) are **session-scoped** — data from one conversation never leaks into another. 
Two different users asking the same question will never get each other's cached responses or reasoning text.\r\n\r\n### Session Keepalive & Continuity\r\n\r\nWhen you do iterative development on the same code area, most context (system prompt, loaded files, tool results) is identical across turns. The proxy keeps the upstream LLM provider's **KV cache warm** between turns, so subsequent prompts pay **~10x cheaper cache-read pricing** instead of full input token pricing.\r\n\r\n**Keepalive** — after `SESSION_KEEPALIVE_INTERVAL_MS` (default 2min) of inactivity, a background ping is sent to the upstream API with the same conversation prefix (`max_tokens:1`, no tools, no stream). This prevents KV cache eviction. After `SESSION_KEEPALIVE_IDLE_TIMEOUT_MS` (default 10min) of total inactivity, pinging stops. After `SESSION_KEEPALIVE_MAX_LIFETIME_MS` (default 24h), the keepalive cycles to re-establish a fresh upstream cache.\r\n\r\n**Conversation continuity** — when VS\u002FVS Code opens a new chat in the same workspace, the proxy detects it and:\r\n- Enriches the system prompt with `\"You previously worked on this project...\"` so the model knows prior knowledge applies\r\n- Shares the reasoning cache across sessions in the same workspace (workspace-scoped fallback when conversation-scoped lookup misses)\r\n\r\n| Variable | Default | Description |\r\n|----------|---------|-------------|\r\n| `SESSION_KEEPALIVE_ENABLED` | `true` | Enable\u002Fdisable session keepalive |\r\n| `SESSION_KEEPALIVE_INTERVAL_MS` | `120000` | Milliseconds between pings (min 30000) |\r\n| `SESSION_KEEPALIVE_IDLE_TIMEOUT_MS` | `600000` | Milliseconds of inactivity before stopping |\r\n| `SESSION_KEEPALIVE_MAX_LIFETIME_MS` | `86400000` | Max session life before cycling upstream cache (24h) |\r\n\r\n> Inspired by [TaskSync #98](https:\u002F\u002Fgithub.com\u002F4regab\u002FTaskSync\u002Fissues\u002F98). 
A 40-minute agentic session with warming can cost **8x less** than a 5-minute session without.\r\n\r\n---\r\n\r\n## Pollinations Free Models\r\n\r\n7 free model entries via [Pollinations](https:\u002F\u002Ftext.pollinations.ai) (GPT-OSS 20B backend, reasoning + tools), 1 base model plus 6 aliases:\r\n\r\n| Model ID | Display Name | Context |\r\n|----------|-------------|---------|\r\n| `pol\u002Fopenai-fast` | Pollinations GPT-OSS 20B | 131K |\r\n| `pol\u002FGPT-5` | Pollinations GPT-5 | 131K |\r\n| `pol\u002FClaude` | Pollinations Claude | 200K |\r\n| `pol\u002FGemini` | Pollinations Gemini | 1M |\r\n| `pol\u002FDeepSeek` | Pollinations DeepSeek | 131K |\r\n| `pol\u002FLlama-4` | Pollinations Llama 4 | 131K |\r\n| `pol\u002FMistral` | Pollinations Mistral | 131K |\r\n\r\nAll route through the same Pollinations `openai` backend — no API key required. By default, only the clean `pol\u002Fopenai-fast` model is shown. The 6 cosplay aliases are hidden unless `HIDE_POLL_COSPLAY=false` is set.\r\n\r\n### Pollinations env vars\r\n\r\n| Variable | Default | Description |\r\n|----------|---------|-------------|\r\n| `SHOW_POLL_MODELS` | `true` | Show Pollinations models |\r\n| `HIDE_POLL_COSPLAY` | `true` | Hide cosplay aliases (GPT-5, Claude, Gemini, DeepSeek, Llama-4, Mistral) — only show GPT-OSS 20B |\r\n\r\n---\r\n\r\n## CrofAI (optional)\r\n\r\nYou can add [CrofAI](https:\u002F\u002Fcrof.ai) models as an additional model provider alongside OpenCode Go. Set `CROF_API_KEY` in `.env` to enable. Crof models appear with a `[CROF]` prefix in VS Code and are listed under a **Crof** section in the console banner.\r\n\r\n```env\r\nCROF_API_KEY=your-crof-key\r\n```\r\n\r\nCrof models use the `crof\u002F` prefix in model IDs (e.g. `crof\u002Fdeepseek-v4-flash`) to avoid conflicts with OpenCode Go models. 
All Crof models support tool calling, streaming, vision, and thinking modes.\r\n\r\n**Auto-refresh**: Crof models detect key state changes at runtime — add or remove `CROF_API_KEY` and the model list refreshes without restarting the proxy.\r\n\r\n## Microsoft 365 Copilot (optional)\r\n\r\nYou can route chat requests through your company's **Microsoft 365 Copilot** (the web chat at [m365.cloud.microsoft](https:\u002F\u002Fm365.cloud.microsoft\u002Fchat)) as an additional model. Two models appear: `[M365] M365 Copilot Quick` and `[M365] M365 Copilot Think`.\r\n\r\n\u003Cimg alt=\"image\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F65773e57-dbfb-4818-99e7-21c58c31683a\" \u002F>\r\n\r\n\r\n### How it works\r\n\r\nThe proxy connects to a **WebSocket relay server** that runs a browser-automated M365 Copilot session. The relay intercepts the M365 substrate WebSocket (`substrate.office.com`) and forwards chat requests\u002Fresponses. This is the same approach used by [m365-copilot-openai-proxy](https:\u002F\u002Fgithub.com\u002Fkuchris\u002Fm365-copilot-openai-proxy).\r\n\r\nAn external relay is required. 
The proxy uses a WebSocket-based protocol; the relay handles browser automation and M365 auth.\r\n\r\n| Relay | Setup | Description |\r\n|-------|-------|-------------|\r\n| **[g365-headless-relay](https:\u002F\u002Fgithub.com\u002FnotBlubbll\u002Fg365-headless-relay)** | `npm install` | Playwright Chromium off-screen relay — open-source, cross-platform, persistent profile |\r\n\r\nThe proxy's M365 WebSocket protocol is inspired by the same substrate-interception concept used in [m365-copilot-openai-proxy](https:\u002F\u002Fgithub.com\u002Fkuchris\u002Fm365-copilot-openai-proxy), but the wire format is different — they are not interchangeable.\r\n\r\n**Constraints:**\r\n- Token expires in ~1 hour (browser session handles auth — no manual token extraction).\r\n- System prompts and conversation history are folded into the message as plain text (labeled sections with `---` separator).\r\n- **No tool calls or agent mode** — M365 Copilot is chat-only.\r\n\r\n### Setup (g365-headless-relay)\r\n\r\n1. Clone and install:\r\n   ```bash\r\n   git clone https:\u002F\u002Fgithub.com\u002FnotBlubbll\u002Fg365-headless-relay\r\n   cd g365-headless-relay\r\n   npm install\r\n   ```\r\n2. First run — sign in (visible browser):\r\n   ```bash\r\n   debug.cmd\r\n   ```\r\n3. Subsequent runs — off-screen relay:\r\n   ```bash\r\n   start.cmd\r\n   ```\r\n4. Set the relay port in `.env`:\r\n   ```env\r\n   M365CO_PORT=8765\r\n   ```\r\n5. 
Restart the proxy.\r\n\r\n### Relaying prompt to M365\r\n\r\nSystem prompts and conversation history are folded into the message as plain text before sending to M365:\r\n\r\n```\r\nSystem instructions:\r\nBe concise and helpful.\r\n\r\nPrior conversation transcript:\r\nUser: What is TypeScript?\r\nAssistant: TypeScript is a typed superset of JavaScript.\r\n\r\n---\r\n\r\nTell me more about interfaces.\r\n```\r\n\r\n### Token refresh\r\n\r\nWhen the browser session expires, restart the relay:\r\n- **g365-headless-relay**: run `debug.cmd` to re-sign in, then `start.cmd`\r\n\r\nNo manual token copying required — the browser session handles all auth.\r\n\r\n---\r\n\r\n## Prompt Compression\r\n\r\nEnriched from [OmniRoute](https:\u002F\u002Fgithub.com\u002Fdiegosouzapw\u002FOmniRoute) (RTK+Caveman stacked compression) and [caveman](https:\u002F\u002Fgithub.com\u002FJuliusBrussee\u002Fcaveman). 7 compression levels available:\r\n\r\n| Level | Savings | Description |\r\n|-------|---------|-------------|\r\n| `off` | 0% | No compression |\r\n| `lite` | ~15% | Whitespace collapse, dedup system prompts |\r\n| `caveman` \u002F `standard` | ~30% | 30+ regex rules: filler removal, context condensation, structural compression, multi-turn dedup |\r\n| `aggressive` | ~50% | All Caveman + progressive message aging + tool result summarization |\r\n| `ultra` | ~75% | All Aggressive + heuristic token pruning + stopword removal |\r\n| `rtk` | 60-90% | Command-aware filters for shell\u002Ftest\u002Fbuild\u002Fgit output |\r\n| `stacked` | 78-95% | RTK first, then Caveman — best for mixed prompts with tool logs + prose |\r\n\r\nFunctions available in `token-optimizer.js`: `compressContent()`, `compressMessages()`, `compressBest()`, `estimatedSavings()`.\r\n\r\n---\r\n\r\n## Build Standalone\r\n\r\n`build.cmd` auto-detects the best available runtime and builds accordingly:\r\n\r\n| Script | Behavior | Requires |\r\n|--------|----------|----------|\r\n| `build.cmd` | **Auto-detect** — 
tries Bun first, falls back to Node.js | Bun or Node.js |\r\n| `build-bun.cmd` | **Explicit Bun** — single `.exe` | [Bun](https:\u002F\u002Fbun.sh) |\r\n| `build-node.cmd` | **Explicit Node.js** — portable folder | [Node.js](https:\u002F\u002Fnodejs.org) |\r\n\r\nAll scripts clean `.dist\u002F` before building but **preserve dotfiles** (`.env`, `.version`, `.cache\u002F`, etc.) so your config survives rebuilds. `.env` is seeded only on the first build (never overwritten), while `.version` is always updated to match the current source.\r\n\r\n### Bun path (`build-bun.cmd` or auto-detected)\r\n\r\nCompiles to `gc2oc` (Bun standalone) + `service.exe` (C# launcher) using `bun build --compile`. The Bun runtime is embedded.\r\n\r\n- **No runtime required** — `gc2oc` is fully self-contained (~112 MB)\r\n- **No `node_modules`** — all JS modules bundled\r\n- `service.exe` handles restart\u002Fupdate loop, `.env` loading, port cleanup, and Windows service mode\r\n- `start.cmd` is a one-shot launcher: calls `service.exe` and exits\r\n- `gc2oc` has no `.exe` extension — prevents accidental double-click; `service.exe` is the entry point\r\n- Requires **Windows 10 1809+ \u002F Windows Server 2019+** (same OS floor as Bun)\r\n\r\n**Windows service:**\r\n```\r\nsc create gc2oc binPath= \"C:\\path\\.dist\\service.exe\" start= auto\r\nsc start gc2oc\r\n```\r\n\r\n### Node.js path (`build-node.cmd` or auto-detected fallback)\r\n\r\nCreates a portable folder with `node` (no extension) + source + production dependencies. 
Run `start.cmd` or `service.exe` inside the folder.\r\n\r\n- **No install needed** on the target machine — the Node.js binary is bundled\r\n- Works on **Windows Server 2016+** and any Windows that runs Node.js v18+\r\n- `service.exe` is a C# launcher with the same restart\u002Fupdate loop and Windows service support\r\n\r\n### Running without building\r\n\r\nFor older Windows where Bun won't run (Server 2016), use Node.js directly:\r\n\r\n```bash\r\nnpm run node           # Node.js fallback\r\nstart.cmd              # auto-detects Bun vs Node\r\n```\r\n\r\n\r\n---\r\n\r\n## Tech Stack\r\n\r\n**[Bun](https:\u002F\u002Fbun.sh)** (preferred) → **[Node.js](https:\u002F\u002Fnodejs.org)** (fallback for older Windows) · [Hono](https:\u002F\u002Fhono.dev) · direct fetch\r\n\r\n## Credits\r\n\r\nSee **[credits.md](credits.md)** for the full list of open-source projects that inspired patterns and features in gc2oc.\r\n\r\nKey inspirations include [copilot-proxy](https:\u002F\u002Fgithub.com\u002Fchew-z\u002Fcopilot-proxy), [Qwen-Copilot-Proxy](https:\u002F\u002Fgithub.com\u002Fedwardgj\u002FQwen-Copilot-Proxy), [Proxllama](https:\u002F\u002Fgithub.com\u002FMichediana\u002FProxllama), [vLLM-proxy-for-VS-Code](https:\u002F\u002Fgithub.com\u002Fnbuckley\u002FvLLM-proxy-for-VS-Code), [antigravity-copilot](https:\u002F\u002Fgithub.com\u002Fpunal100\u002Fantigravity-copilot), [OmniRoute](https:\u002F\u002Fgithub.com\u002Fdiegosouzapw\u002FOmniRoute), [OpenCode Zen Provider](https:\u002F\u002Fgithub.com\u002Fwienans\u002Fvsc-opencode-zen-chat-provider), [yxlao\u002Fdeepseek-cursor-proxy](https:\u002F\u002Fgithub.com\u002Fyxlao\u002Fdeepseek-cursor-proxy), and many more.\r\n","gc2oc is an Ollama-emulating proxy that connects GitHub Copilot Chat to OpenCode's Zen and Go APIs and models. The project is implemented in JavaScript; free models work without a key, and adding an API key to the `.env` file unlocks paid models and premium mode. Its core features include full tool calling, streaming, and solid support for Visual Studio 2026 (18.6.0+) and SQL Server Management Studio (22.6.0+). It suits scenarios that call for stronger code generation or natural-language-assisted development, and is particularly useful for developers who already use Visual Studio or SQL Studio and want to integrate more capable AI assistance.","2026-06-11 04:03:57","CREATED_QUERY"]