[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-85124":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":16,"stars30d":16,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":17,"rankGlobal":10,"rankLanguage":10,"license":18,"archived":19,"fork":19,"defaultBranch":20,"hasWiki":21,"hasPages":19,"topics":22,"createdAt":10,"pushedAt":10,"updatedAt":35,"readmeContent":36,"aiSummary":10,"trendingCount":16,"starSnapshotCount":16,"syncStatus":37,"lastSyncTime":38,"discoverSource":39},85124,"coreai-model-zoo","john-rocky\u002Fcoreai-model-zoo","john-rocky","Community model zoo + knowledge base for Apple Core AI (iOS\u002FmacOS 27): Qwen3.5 & Gemma 4 converted end-to-end, verified on-device (iPhone 17 Pro GPU\u002FANE), conversion gotchas, custom Metal kernels, Swift runner","",null,"Swift",124,6,4,1,0,38.54,"Other",false,"main",true,[23,24,25,26,27,28,29,30,31,32,33,34],"ai","apple-silicon","coreai","gemma4","granite","ios","iphone","lfm","llm","local-llm","macos","qwen","2026-06-15 10:04:36","# CoreAI-Model-Zoo\n\nLLMs converted to Apple **Core AI** (`.aimodel`, iOS 27 \u002F macOS 27) — downloadable, verified\non-device, with the conversion code and a knowledge base. Successor to\n[`CoreML-Models`](https:\u002F\u002Fgithub.com\u002Fjohn-rocky\u002FCoreML-Models).\n\n## Models\n\n| Model | Download (`.aimodel`) | License |\n|---|---|---|\n| **Qwen3.5-0.8B** | [🤗 qwen3.5-0.8B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002Fqwen3.5-0.8B-CoreAI) | Apache-2.0 |\n| **Qwen3.5-2B** | [🤗 qwen3.5-2B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002Fqwen3.5-2B-CoreAI) | Apache-2.0 |\n| **Qwen3.6-35B-A3B** (MoE, Mac-only) | [🤗 Qwen3.6-35B-A3B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FQwen3.6-35B-A3B-CoreAI) | Apache-2.0 |\n| **Qwen3.6-27B** (dense, Mac-only) | [🤗 Qwen3.6-27B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FQwen3.6-27B-CoreAI) | Apache-2.0 |\n| **GLM-4.7-Flash** (MoE + MLA, Mac-only — zoo's first MLA) | [🤗 GLM-4.7-Flash-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FGLM-4.7-Flash-CoreAI) | MIT |\n| **Gemma 4 E2B** (text, incl. official-QAT int4) | [🤗 gemma-4-E2B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002Fgemma-4-E2B-CoreAI) | Gemma |\n| **Gemma 4 E4B** (text, official-QAT int4) | [🤗 gemma-4-E4B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002Fgemma-4-E4B-CoreAI) | Gemma |\n| **Gemma 4 12B** (dense, Mac-only — custom flash-decode kernel ‡) | [🤗 Gemma-4-12B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FGemma-4-12B-CoreAI) | Gemma |\n| **Gemma 4 31B** (dense, Mac-only — custom flash-decode kernel ‡) | [🤗 Gemma-4-31B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FGemma-4-31B-CoreAI) | Gemma |\n| **LFM2.5-1.2B-Instruct** | [🤗 LFM2.5-1.2B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FLFM2.5-1.2B-CoreAI) | LFM Open License v1.0 |\n| **LFM2.5-8B-A1B** (MoE, custom `gather_qmm` kernel — first iPhone MoE) | [🤗 LFM2.5-8B-A1B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FLFM2.5-8B-A1B-CoreAI) | LFM Open License v1.0 |\n| **Granite 4.0-H 1B \u002F 350M** | [🤗 granite-4.0-h-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002Fgranite-4.0-h-CoreAI) | Apache-2.0 |\n| **Qwen3-VL** (vision-language) | [🤗 2B](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FQwen3-VL-2B-CoreAI) · [4B](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FQwen3-VL-4B-CoreAI) · [8B](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FQwen3-VL-8B-CoreAI) | Apache-2.0 |\n| **MiniCPM-V 4.6** (vision-language, sub-2B — strongest tiny VLM) | [🤗 MiniCPM-V-4.6-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FMiniCPM-V-4.6-CoreAI) | Apache-2.0 |\n| **Gemma 4 E2B vision (VL)** (image+text) | `vl\u002F` in [🤗 gemma-4-E2B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002Fgemma-4-E2B-CoreAI) | Gemma |\n| **EmbeddingGemma 300M** (text embeddings — on-device RAG \u002F semantic search) | [🤗 embeddinggemma-300m-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002Fembeddinggemma-300m-CoreAI) | Gemma |\n| **Qwen3-Embedding 0.6B** (multilingual text embeddings, last-token pooling + MRL) | [🤗 Qwen3-Embedding-0.6B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FQwen3-Embedding-0.6B-CoreAI) | Apache-2.0 |\n| **Qwen3-Reranker 0.6B** (cross-encoder reranker — yes\u002Fno relevance score) | [🤗 Qwen3-Reranker-0.6B-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FQwen3-Reranker-0.6B-CoreAI) | Apache-2.0 |\n| **RF-DETR nano\u002Fsmall\u002Fmedium\u002Flarge** (object detection, no NMS) | [🤗 RF-DETR-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FRF-DETR-CoreAI) | Apache-2.0 |\n| **RF-DETR-Seg nano→2xlarge** (instance segmentation, 6 sizes) | [🤗 RF-DETR-CoreAI](https:\u002F\u002Fhuggingface.co\u002Fmlboydaisuke\u002FRF-DETR-CoreAI) | Apache-2.0 |\n\n### Decode throughput (tok\u002Fs, greedy; output top-1 exact vs the Hugging Face reference)\n\n| | iPhone 17 Pro · GPU | iPhone 17 Pro · ANE | M4 Max · GPU |\n|---|---|---|---|\n| **Qwen3.5-0.8B** | **71.9** | 14.7 | **210** |\n| **Qwen3.5-2B** | **29** | — | **161** |\n| **LFM2.5-1.2B** | **45.4** | — | **276.5** |\n| **Granite 4.0-H 1B** | **36.3** | — | **136.5** |\n| **Gemma 4 E2B** | **30.3** (QAT 30.7) | 6 | **77.0** (QAT 78.9) |\n| **Gemma 4 E4B** (official QAT) | **15.1** | — | **55.8** |\n| **Gemma 4 E2B VL** (image+text, official QAT) | **25.5** | — | **82.4** |\n| **MiniCPM-V 4.6** (vision-language, sub-2B) | **53.4** | — | **224.3** |\n| **Qwen3.6-35B-A3B** (MoE, 35B\u002F~3B active, Mac-only) | — | — | **64.9** † |\n| **Qwen3.6-27B** (dense, Mac-only) | — | — | **15.9** |\n| **GLM-4.7-Flash** (MoE + MLA, 30B\u002F~3B active, Mac-only) | — | — | **52.4** † |\n| **Gemma 4 12B** (dense, Mac-only) | — | — | **23** int8 \u002F **33** int4 ‡ |\n| **Gemma 4 31B** (dense, Mac-only) | — | — | **17.2** int4 ‡ |\n\nMeasured on the iOS 27 \u002F macOS 27 beta, Apple's `coreai-pipelined` GPU engine, zero custom\nkernels (ANE column + **†**\u002F**‡** excepted). **†** = MoE bundle using the custom\n[`gather_qmm`](knowledge\u002Fcompute-units-and-authoring.md) Metal kernel (reads only the routed\nexperts). **‡** = dense bundle whose full\u002Fglobal-attention SDPA is a custom flash-decode Metal\nkernel — the stock MPSGraph SDPA crashes on the ≥16-head × 512 Q (a GPU scratch-heap overflow,\n[apple\u002Fcoreai-models#27](https:\u002F\u002Fgithub.com\u002Fapple\u002Fcoreai-models\u002Fissues\u002F27)), so these models are\n**unrunnable without it**. Prefill, sizes, per-model caveats, and the Mac-only big models: [`zoo\u002F`](zoo\u002F).\n\n\u003Cp align=\"center\">\n  \u003Cimg width=\"380\" alt=\"CoreAIChat screen recording\" src=\"https:\u002F\u002Fgithub.com\u002Fuser-attachments\u002Fassets\u002F999dbd95-45b5-468f-b1a8-34112ee3b74d\" \u002F>\n\u003C\u002Fp>\n\u003Cp align=\"center\">\u003Ci>CoreAIChat (\u003Ca href=\"apps\u002F\">apps\u002F\u003C\u002Fa>) — the zoo's models running on-device on iPhone.\u003C\u002Fi>\u003C\u002Fp>\n\n## Start here\n\n- **Try the app** (iOS 27 \u002F macOS 27 beta; the model downloads in-app):\n  - **Demo app, no build** → Mac: [**.dmg**](https:\u002F\u002Fgithub.com\u002Fjohn-rocky\u002Fcoreai-model-zoo\u002Freleases\u002Fdownload\u002Fmac-v1.0\u002FCoreAI-Zoo-for-Mac.dmg) (notarized, runs the Mac-only bundles) · iPhone: CoreAIChat on TestFlight (coming soon)\n  - **Build it** → [`apps\u002F`](apps\u002F) — Xcode 27 beta + xcodegen, the `coreai-models` patch stack + `tokenizer.json`\n- **Run a model in your own app** → [`knowledge\u002Fswift-runtime.md`](knowledge\u002Fswift-runtime.md) + the model card\n- **Convert a model** → [`knowledge\u002Fconversion-guide.md`](knowledge\u002Fconversion-guide.md)\n- **Compress** → [`knowledge\u002Fcompression.md`](knowledge\u002Fcompression.md)\n- **Make it fast** → [`knowledge\u002Fcustom-metal-kernels.md`](knowledge\u002Fcustom-metal-kernels.md) · [`knowledge\u002Fperformance-ceiling.md`](knowledge\u002Fperformance-ceiling.md)\n- **Known beta issue** (in-graph KV-write crash; workarounds + the input-mask escape) → [`knowledge\u002Fcoreai-beta-mpsgraph-kvwrite-bug.md`](knowledge\u002Fcoreai-beta-mpsgraph-kvwrite-bug.md) — FB23024751 \u002F [apple\u002Fcoreai-models#5](https:\u002F\u002Fgithub.com\u002Fapple\u002Fcoreai-models\u002Fissues\u002F5)\n\n## Repository layout\n\n| Dir | What |\n|---|---|\n| [`zoo\u002F`](zoo\u002F) | Model cards — configurations, sizes, parity, measured throughput. |\n| [`knowledge\u002F`](knowledge\u002F) | Verified notes on the framework: conversion, compression, stateful KV, custom Metal kernels, AOT, compute-unit rules, the Swift runtime. |\n| [`conversion\u002F`](conversion\u002F) | Re-authored models + convert \u002F verify \u002F compress scripts (PyTorch → `.aimodel`). |\n| [`swift\u002F`](swift\u002F) | `CoreAIRunner` — a Swift package that drives `.aimodel` LLM bundles, including architectures beyond the standard runtime. |\n| [`apps\u002F`](apps\u002F) | SwiftUI on-device chat apps (iOS 27): CoreAIChat (Gemma 4 E2B GPU\u002FANE\u002F⚡ + Qwen3.5 \u002F Qwen3.5-2B \u002F LFM2.5 \u002F Granite ⚡pipelined, one picker) + QwenChatFast (Qwen3.5 static kernels) with in-app model download. |\n\n## License\n\nBSD-3-Clause ([`LICENSE`](LICENSE)). Re-authored model code derives from Apple's BSD-3-Clause\n`coreai_models` and retains its notices. Model weights follow their own licenses (see each\nHugging Face repo).\n",2,"2026-06-15 02:30:05","CREATED_QUERY"]