[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-73889":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":8,"language":10,"languages":8,"totalLinesOfCode":8,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":8,"rankLanguage":8,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":8,"pushedAt":8,"updatedAt":35,"readmeContent":36,"aiSummary":37,"trendingCount":15,"starSnapshotCount":15,"syncStatus":38,"lastSyncTime":39,"discoverSource":40},73889,"Foundry-Local","microsoft\u002FFoundry-Local","microsoft",null,"https:\u002F\u002Ffoundrylocal.ai","C++",2355,323,79,19,0,20,40,77,60,29.53,"Other",false,"main",true,[26,27,28,29,30,7,31,32,33,34],"ai-sdk","chat-completions","foundry-local","gpu-acceleration","local-ai","on-device-inference","onnx-runtime","speech-to-text","whisper","2026-06-12 02:03:19","\u003Cdiv align=\"center\">\n  \u003Cpicture align=\"center\">\n    \u003Csource media=\"(prefers-color-scheme: dark)\" srcset=\"media\u002Ficons\u002Ffoundry_local_white.svg\">\n    \u003Csource media=\"(prefers-color-scheme: light)\" srcset=\"media\u002Ficons\u002Ffoundry_local_black.svg\">\n    \u003Cimg alt=\"Foundry Local icon.\" src=\"media\u002Ficons\u002Ffoundry_local_black.svg\" height=\"100\" style=\"max-width: 100%;\">\n  \u003C\u002Fpicture>\n\u003Cdiv id=\"user-content-toc\">\n  \u003Cul align=\"center\" style=\"list-style: none;\">\n    \u003Csummary>\n      \u003Ch1>Foundry Local\u003C\u002Fh1>\u003Cbr>\n     \u003Ch3>\u003Ca href=\"https:\u002F\u002Faka.ms\u002Ffoundry-local-installer\">Download\u003C\u002Fa> | \u003Ca href=\"https:\u002F\u002Faka.ms\u002Ffoundry-local-docs\">Documentation\u003C\u002Fa> | \u003Ca 
href=\"https:\u002F\u002Faka.ms\u002Ffoundry-local-discord\">Discord\u003C\u002Fa>\u003C\u002Fh3>\n    \u003C\u002Fsummary>\n  \u003C\u002Ful>\n\u003C\u002Fdiv>\n\n## Ship on-device AI inside your app\n\n\u003C\u002Fdiv>\n\nFoundry Local is an **end-to-end local AI solution** for building applications that run entirely on the user's device. It provides native SDKs (C#, JavaScript, Python, and Rust), a curated catalog of optimized models, and automatic hardware acceleration — all in a lightweight package (~20 MB). The compact size makes it easy to integrate into your application and distribute to end users.\n\nUser data never leaves the device, responses start immediately with zero network latency, and your app works offline. No per-token costs, no API keys, no backend infrastructure to maintain, and no Azure subscription required.\n\n### Key Features\n\n- **Lightweight runtime** — The runtime handles model acquisition, hardware acceleration, model management, and inference (via [ONNX Runtime](https:\u002F\u002Fonnxruntime.ai\u002F)). \n\n- **Curated model catalog** — A catalog of high-quality models optimized for on-device use across a wide range of consumer hardware. The catalog covers chat completions (for example, GPT OSS, Qwen, DeepSeek, Mistral and Phi) and audio transcription (for example, Whisper). Every model goes through extensive quantization and compression to deliver the best balance of quality and performance. Models are versioned, so your application can pin to a specific version or automatically receive updates.\n\n- **Automatic hardware acceleration** — Foundry Local detects the available hardware on the user's device and selects the best execution provider and device (NPU, GPU or CPU).\n\n- **Smart model management** — Foundry Local handles the full lifecycle of models on end-user devices. 
Models download automatically on first use, are cached locally for instant subsequent launches, and the best-performing variant is selected for the user's specific hardware.\n\n- **OpenAI-compatible API** — Supports OpenAI request and response formats including the [OpenAI Responses API format](https:\u002F\u002Fdevelopers.openai.com\u002Fapi\u002Freference\u002Fresources\u002Fresponses). If your application already uses the OpenAI SDK, point it to a Foundry Local endpoint with minimal code changes.\n\n- **Optional local server** — An OpenAI-compatible web server for serving models to multiple processes, integrating with tools like LangChain, or experimenting through REST calls. For most embedded application scenarios, use the SDK directly — it runs inference in-process without the overhead of a separate server.\n\n\n## 🚀 Quickstart\n\n> [!TIP]\n> The following shows a quickstart for Python and JavaScript. C# and Rust language bindings are also available. Take a look at the [samples](\u002Fsamples\u002F) for more details.\n\n\n\u003Cdetails open>\n\u003Csummary>\u003Cstrong>JavaScript\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n1. Install the SDK:\n\n    ```bash\n    # Windows (recommended for hardware acceleration)\n    npm install foundry-local-sdk-winml\n    \n    # macOS\u002Flinux\n    npm install foundry-local-sdk\n    ```\n\n2. Run your first chat completion:\n\n    ```javascript\n    import { FoundryLocalManager } from 'foundry-local-sdk';\n\n    const manager = FoundryLocalManager.create({ appName: 'my-app' });\n\n    \u002F\u002F Download and load a model (auto-selects best variant for user's hardware)\n    const model = await manager.catalog.getModel('qwen2.5-0.5b');\n    await model.download((progress) => {\n        process.stdout.write(`\\rDownloading... 
${progress.toFixed(2)}%`);\n    });\n    await model.load();\n\n    \u002F\u002F Create a chat client and get a completion\n    const chatClient = model.createChatClient();\n    const response = await chatClient.completeChat([\n        { role: 'user', content: 'What is the golden ratio?' }\n    ]);\n\n    console.log(response.choices[0]?.message?.content);\n\n    \u002F\u002F Unload the model when done\n    await model.unload();\n    ```\n\n\u003C\u002Fdetails>\n\n\n\u003Cdetails open>\n\u003Csummary>\u003Cstrong>Python\u003C\u002Fstrong>\u003C\u002Fsummary>\n\n1. Install the SDK:\n\n    ```bash\n    # Windows (recommended for hardware acceleration)\n    pip install foundry-local-sdk-winml\n\n    # macOS\u002FLinux\n    pip install foundry-local-sdk\n    ```\n\n2. Run your first chat completion:\n\n    ```python\n    from foundry_local_sdk import Configuration, FoundryLocalManager\n\n    config = Configuration(app_name=\"foundry_local_samples\")\n    FoundryLocalManager.initialize(config)\n    manager = FoundryLocalManager.instance\n\n    # Select and load a model from the catalog\n    model = manager.catalog.get_model(\"qwen2.5-0.5b\")\n    model.download()\n    model.load()\n\n    # Get a chat client\n    client = model.get_chat_client()\n\n    # Create and send message\n    messages = [\n        {\"role\": \"user\", \"content\": \"What is the golden ratio?\"}\n    ]\n    response = client.complete_chat(messages)\n    print(f\"Response: {response.choices[0].message.content}\")\n   \n    model.unload()\n    ```\n\n\u003C\u002Fdetails>\n\n\n### 💬 Audio Transcription (Speech-to-Text)\n\nThe SDK also supports audio transcription via Whisper models (available in JavaScript, C#, Python and Rust):\n\n```javascript\nimport { FoundryLocalManager } from 'foundry-local-sdk';\n\nconst manager = FoundryLocalManager.create({ appName: 'my-app' });\n\nconst whisperModel = await manager.catalog.getModel('whisper-tiny');\nawait whisperModel.download();\nawait 
whisperModel.load();\n\nconst audioClient = whisperModel.createAudioClient();\naudioClient.settings.language = 'en';\n\n\u002F\u002F Transcribe an audio file\nconst result = await audioClient.transcribe('recording.wav');\nconsole.log('Transcription:', result.text);\n\n\u002F\u002F Or stream in real-time\nfor await (const chunk of audioClient.transcribeStreaming('recording.wav')) {\n    process.stdout.write(chunk.text);\n}\n\nawait whisperModel.unload();\n```\n\n> [!TIP]\n> A single `FoundryLocalManager` can manage both chat and audio models simultaneously. See the [chat-and-audio sample](samples\u002Fjs\u002Fchat-and-audio-foundry-local\u002F) for a complete example.\n\n## 📦 Samples\n\nExplore complete working examples in the [`samples\u002F`](samples\u002F) folder:\n\n| Language | Samples | Highlights |\n|----------|---------|------------|\n| [**C#**](samples\u002Fcs\u002F) | 12 | Native chat, audio transcription, tool calling, model management, web server, tutorials |\n| [**JavaScript**](samples\u002Fjs\u002F) | 12 | Native chat, audio, Electron app, Copilot SDK, LangChain, tool calling, tutorials |\n| [**Python**](samples\u002Fpython\u002F) | 9 | Chat completions, audio transcription, LangChain, tool calling, tutorials |\n| [**Rust**](samples\u002Frust\u002F) | 8 | Native chat, audio transcription, tool calling, web server, tutorials |\n\n## 🖥️ CLI\n\nThe Foundry Local CLI lets you explore models and experiment interactively.\n\n**Install:**\n\n```bash\n# Windows\nwinget install Microsoft.FoundryLocal\n\n# macOS\nbrew install microsoft\u002Ffoundrylocal\u002Ffoundrylocal\n```\n\n**Run a model:**\n\n```bash\nfoundry model run qwen2.5-0.5b\n```\n\n**List available models:**\n\n```bash\nfoundry model ls\n```\n\n> For the full CLI reference and advanced usage, see the [CLI documentation on Microsoft Learn](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Ffoundry-local\u002Freference\u002Freference-cli).\n\n\n## Reporting Issues\n\nPlease report 
issues or suggest improvements in the [GitHub Issues](https:\u002F\u002Fgithub.com\u002Fmicrosoft\u002FFoundry-Local\u002Fissues) section.\n\n## 🎓 Learn More\n\n- [Foundry Local Documentation](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Ffoundry-local\u002F) on Microsoft Learn\n- [Foundry Local Lab](https:\u002F\u002Fgithub.com\u002FMicrosoft-foundry\u002Ffoundry-local-lab) — Hands-on exercises and step-by-step instructions\n\n## ❔ Frequently asked questions\n\n### Is Foundry Local a web server and CLI tool?\n\nNo. Foundry Local is an **end-to-end local AI solution** that your application ships with. It handles model acquisition, hardware acceleration, and inference inside your app process through the SDK. The optional web server and CLI are available for development workflows, but the core product is the local AI runtime and SDK that you integrate directly into your application.\n\n### Why doesn't Foundry Local support every available model?\n\nFoundry Local is designed for shipping production applications, not for general-purpose model experimentation. The model catalog is intentionally curated to include models that are optimized for specific application scenarios, tested across a range of consumer hardware, and small enough to distribute to end users. This approach ensures that every model in the catalog delivers reliable performance when embedded in your application — rather than offering a broad selection of models with unpredictable on-device behavior.\n\n### Can Foundry Local run on a server?\n\nFoundry Local is optimized for hardware-constrained devices where a single user accesses the model at a time. 
While you can technically install and run it on server hardware, it isn't designed as a server inference stack.\n\nServer-oriented runtimes like [vLLM](https:\u002F\u002Fdocs.vllm.ai\u002Fen\u002Flatest\u002F) or [Triton Inference Server](https:\u002F\u002Fgithub.com\u002Ftriton-inference-server\u002Fserver) are built for multi-user scenarios: they handle concurrent request queuing, continuous batching, and efficient GPU sharing across many simultaneous clients. Foundry Local doesn't provide these capabilities. Instead, it focuses on lightweight, single-user inference with automatic hardware detection, KV-cache management, and model lifecycle handling that make sense for client applications.\n\nIf you need to serve models to multiple concurrent users, use a dedicated server inference framework. Use Foundry Local when the model runs on the end user's own device.\n\n\n### What platforms are supported?\n\nFoundry Local supports Windows, macOS (Apple silicon), and Linux.\n\n\n## ⚖️ License\n\nFoundry Local SDK is licensed under the MIT license. For more details, see the [LICENSE](LICENSE) file.\nFoundry Local CLI is licensed under the Microsoft Software License Terms. For more details, read the [LICENSE](LICENSE) file.\n\nIndividual models made available for use with Foundry Local are subject to each model's license terms, notices, and use restrictions. Refer to the model's documentation or download\u002Flisting page for the applicable terms before using or redistributing a model.\n","Foundry Local is an end-to-end local AI solution for building applications that run entirely on the user's device. Its core features include a lightweight runtime (about 20 MB) with automatic hardware acceleration, smart model management, and an OpenAI-compatible API. The project provides native SDKs for C#, JavaScript, Python, and Rust, along with a catalog of optimized, high-quality models covering chat completions and audio transcription. These models are quantized and compressed to achieve the best balance of quality and performance. Foundry Local suits applications that need immediate responses without a network connection, such as offline chatbots or speech recognition apps, keeping user data on the device with no network latency or cloud service costs.",2,"2026-06-11 03:47:48","high_star"]