[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-76118":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":9,"language":10,"languages":9,"totalLinesOfCode":9,"stars":11,"forks":12,"watchers":13,"openIssues":14,"contributorsCount":15,"subscribersCount":15,"size":15,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":15,"forks30d":15,"starsTrendScore":19,"compositeScore":20,"rankGlobal":9,"rankLanguage":9,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":22,"topics":25,"createdAt":9,"pushedAt":9,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":15,"starSnapshotCount":15,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},76118,"cross-platform-llm-client","orailnoor\u002Fcross-platform-llm-client","orailnoor","A unified cross-platform AI client supporting seamless transitions between standard cloud APIs and on-device, offline execution of custom and uncensored language models.",null,"C++",522,112,9,17,0,31,140,342,93,100.16,"MIT License",false,"main",true,[],"2026-06-12 04:01:20","# PrivateLM\n\n[![Live Web App](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FLive_Demo-Try_Web_App-02569B?style=for-the-badge&logo=flutter&logoColor=white)](https:\u002F\u002Fai-chat-orailnoor.web.app\u002F)\n\nA production-ready, cross-platform AI chat client built with Flutter. 
It unifies local on-device LLM inference (Android) with cloud API access, giving users full control over how their models run.\n\n![Image generation tested on Moto G71 (Snapdragon), OnePlus 10R (MediaTek), Pixel 6a (Tensor), Poco F1 (Snapdragon), Samsung S23 (Snapdragon), 4 steps, fast](PrivateLM.png)\n_Image generation tested on Moto G71 (Snapdragon), OnePlus 10R (MediaTek), Pixel 6a (Tensor), Poco F1 (Snapdragon), Samsung S23 (Snapdragon), 4 steps, fast_\n\n![Generated on Pixel 6 with 20 steps](IMG_2390.png)\n_Generated on Pixel 6 with 20 steps_\n\n---\n\n## What It Does\n\n- **Local Inference on Android** — Download and run GGUF models directly on your phone using GPU-accelerated inference (Vulkan). No internet required after download.\n- **Cloud API Fallback** — Seamlessly switch to OpenAI, Anthropic, Google Gemini, or Kimi (Moonshot AI) when you need more power or are on unsupported platforms.\n- **Multimodal Chat** — Send text and images in conversations. Vision support works with both local models (Qwen2-VL) and cloud providers.\n- **Persistent Sessions** — All chats, tasks, and settings are stored locally via Hive. 
Nothing leaves your device unless you explicitly choose cloud mode.\n- **Background Services** — Firebase Cloud Messaging integration for push updates and background task handling.\n- **Smart Auto-Configuration** — On first launch, the app detects your device's RAM and recommends optimal context size and token limits automatically.\n- **Task Management** — A dedicated task view for structured AI-assisted workflows alongside free-form chat.\n\n---\n\n## Technical Architecture\n\n### Stack\n\n- **Framework:** Flutter 3.x (Dart >=3.3.0)\n- **State Management:** GetX\n- **Local Storage:** Hive\n- **Networking:** Dio + `package:http`\n- **Background Execution:** `flutter_background_service` + `flutter_local_notifications`\n- **Push Notifications:** Firebase Core + Firebase Messaging\n\n### Inference Pipeline\n\n```\n┌─────────────────────────────────────────────────────────────┐\n│                        UI Layer                              │\n│   ChatView \u002F TaskView \u002F ModelView \u002F SettingsView            │\n└──────────────────────────┬──────────────────────────────────┘\n                           │\n┌──────────────────────────▼──────────────────────────────────┐\n│                    Controllers (GetX)                        │\n│   ChatController · TaskController · ModelController         │\n│   SettingsController · HomeController                       │\n└──────────────────────────┬──────────────────────────────────┘\n                           │\n┌──────────────────────────▼──────────────────────────────────┐\n│                      Services                                │\n│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────┐ │\n│  │ InferenceService│  │  CloudService   │  │DownloadSvc  │ │\n│  │  (local GGUF)   │  │ (OpenAI\u002FClaude\u002F │  │ (model dl)  │ │\n│  │                 │  │  Gemini\u002FKimi)   │  │             │ │\n│  └─────────────────┘  └─────────────────┘  └─────────────┘ │\n│  ┌─────────────────┐  ┌─────────────────┐  
┌─────────────┐ │\n│  │  HiveService    │  │ DeviceInfoSvc   │  │ExecutionSvc │ │\n│  │  (persistence)  │  │  (RAM\u002FGPU tier) │  │ (bg tasks)  │ │\n│  └─────────────────┘  └─────────────────┘  └─────────────┘ │\n└─────────────────────────────────────────────────────────────┘\n```\n\n### Local Inference (Android)\n\nThe app uses `llama_flutter_android`, a custom Flutter plugin wrapping `llama.cpp` for ARM64 devices. At runtime it:\n\n1. **Detects GPU capabilities** via Vulkan to determine offload layers.\n2. **Selects thread count** based on device tier (ultra \u002F high \u002F mid \u002F low).\n3. **Loads the GGUF model** with progress streaming.\n4. **Generates tokens** via `generateChat()` with native chat-template support (ChatML, Llama-3, Gemma, Phi).\n5. **Falls back** to manual prompt construction if native templates fail.\n\nIdle detection (5s) and hard timeouts (180s) keep the UX responsive even on underpowered hardware.\n\n### Cloud Inference\n\n`CloudService` normalizes four different API shapes into a single interface:\n\n- **OpenAI** — standard `\u002Fv1\u002Fchat\u002Fcompletions`\n- **Anthropic** — Messages API with separate system param\n- **Google Gemini** — `generateContent` with inline image base64\n- **Kimi** — OpenAI-compatible endpoint from Moonshot AI\n\nAPI keys are stored in Hive and never transmitted anywhere except to the provider's endpoint.\n\n### Cross-Platform Abstraction\n\nLocal inference is conditionally compiled:\n\n- **Android** → `inference_android.dart` (full llama.cpp engine)\n- **Web** → `inference_stub.dart` (cloud-only, local coming soon)\n- **iOS** → `inference_android.dart` (full llama.cpp engine via Metal GPU)\n\nThe `InferenceService` exposes `supportsLocalInference` so the UI can hide local-model UI on unsupported platforms.\n\n---\n\n## Supported Platforms\n\n| Platform | Local Inference | Cloud APIs | Notes                           |\n| -------- | --------------- | ---------- | ------------------------------- 
|\n| Android  | ✅ Yes          | ✅ Yes     | CPU offload via NEON; minSdk 28 |\n| iOS      | ✅ Yes          | ✅ Yes     | Metal GPU acceleration          |\n| Web      | ❌ No           | ✅ Yes     | Cloud-only (local coming soon)  |\n\n### iOS \u002F iPad\n\nThe iPad release is distributed as a standalone ZIP package for sideloading. Download the latest `PrivateLM-iOS.zip` from the [Releases](https:\u002F\u002Fgithub.com\u002Forailnoor\u002Fcross-platform-llm-client\u002Freleases) page, extract it, and install the `.ipa` via AltStore, Sideloadly, or Xcode. iPhone support is experimental — iPad is the recommended iOS target due to RAM requirements for local models.\n\n---\n\n## Build Configuration\n\n### Prerequisites\n\n- Flutter SDK >=3.3.0\n- Android SDK (API 26+)\n- JDK 17\n- NDK (bundled with Android SDK)\n\n### Android\n\n```bash\nflutter pub get\ncd android\n.\u002Fgradlew assembleDebug   # or assembleRelease\n```\n\nFor release builds you should configure your own signing in `android\u002Fapp\u002Fbuild.gradle.kts`:\n\n```kotlin\nbuildTypes {\n    release {\n        signingConfig = signingConfigs.getByName(\"release\")\n        isMinifyEnabled = true\n        isShrinkResources = true\n        proguardFiles(\n            getDefaultProguardFile(\"proguard-android-optimize.txt\"),\n            \"proguard-rules.pro\"\n        )\n    }\n}\n```\n\n### iOS\n\n```bash\nflutter pub get\ncd ios\npod install\ncd ..   # return to the project root, where pubspec.yaml lives\nflutter build ios\n```\n\n### Web\n\n```bash\nflutter pub get\nflutter build web --release\n```\n\n---\n\n## License\n\nMIT — see [LICENSE](.\u002FLICENSE) for details.\n","PrivateLM is a cross-platform AI chat client that switches seamlessly between local and cloud execution, aiming to give users unified control over how their models run. Built with Flutter, it can download and run GGUF models directly on Android devices using GPU-accelerated inference, with no network connection required; it also supports cloud APIs such as OpenAI and Anthropic for more powerful processing. It offers multimodal chat, letting users send text and images in conversations, and all data is stored locally by default to protect privacy. It suits individuals or organizations that need to use an AI assistant flexibly across different environments, especially where data privacy requirements are high.",2,"2026-06-11 03:54:32","CREATED_QUERY"]