[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-82925":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":10,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":23,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":36,"readmeContent":37,"aiSummary":38,"trendingCount":16,"starSnapshotCount":16,"syncStatus":18,"lastSyncTime":39,"discoverSource":40},82925,"Android-MVVM-Architecture-Android-Voice-AI-SDK","ahmedeltaher\u002FAndroid-MVVM-Architecture-Android-Voice-AI-SDK","ahmedeltaher","Voice AI SDK is a reusable Android library that gives any app a full voice-driven AI conversation pipeline in minutes. Voice Assistant + Android Voice AI + SDK + MVVM + Kotlin","https:\u002F\u002Fahmedeltaher.github.io\u002FAndroid-MVVM-Architecture-Voice-AI\u002F",null,"Kotlin",2557,614,58,14,0,1,2,3,60.67,false,"main",true,[25,26,27,28,29,30,31,32,33,34,35],"ai","ai-agent","android","android-mvvm-architechture","mvvm","mvvm-android","mvvm-architechure","voice","voice-ai","voice-control","voice-recognition-ai","2026-06-12 04:01:39","# [Android Voice AI SDK in Model-View-ViewModel (i.e., MVVM)](https:\u002F\u002Fgithub.com\u002Fahmedeltaher\u002F\u002FAndroid-MVVM-Architecture-Android-Voice-AI-SDK\u002F)\n\n![Android Voice AI SDK](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FVoice%20AI%20SDK-1.0-blue)\n[![Kotlin](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FKotlin-2.3.21-brightgreen.svg)](https:\u002F\u002Fkotlinlang.org\u002F)\n[![Coroutines](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FCoroutines-1.11.0-red.svg)](https:\u002F\u002Fkotlinlang.org\u002Fdocs\u002Fcoroutines-overview.html)\n[![Jetpack 
Compose](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FJetpack%20Compose-BOM%202026.05.01-blue.svg)](https:\u002F\u002Fdeveloper.android.com\u002Fjetpack\u002Fcompose)\n[![Hilt](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FHilt-2.57.1-orange.svg)](https:\u002F\u002Fdagger.dev\u002Fhilt\u002F)\n[![MockK](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FMockK-1.14.11-yellow.svg)](https:\u002F\u002Fmockk.io\u002F)\n[![JUnit5](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FJUnit5-5.14.4-yellowgreen.svg)](https:\u002F\u002Fjunit.org\u002Fjunit5\u002F)\n[![Espresso](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FEspresso-3.7.0-lightgrey.svg)](https:\u002F\u002Fdeveloper.android.com\u002Ftraining\u002Ftesting\u002Fespresso\u002F)\n[![MVVM](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FClean--Code-MVVM-brightgreen.svg)](https:\u002F\u002Fgithub.com\u002Fgooglesamples\u002Fandroid-architecture)\n[![STT - Android SpeechRecognizer](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSTT-Android%20SpeechRecognizer-purple.svg)](https:\u002F\u002Fdeveloper.android.com\u002Freference\u002Fandroid\u002Fspeech\u002FSpeechRecognizer)\n[![STT - OpenAI Whisper](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FSTT-OpenAI%20Whisper-blueviolet.svg)](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fguides\u002Fspeech-to-text)\n[![TTS - Android TTS](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTTS-Android%20TextToSpeech-ff69b4.svg)](https:\u002F\u002Fdeveloper.android.com\u002Freference\u002Fandroid\u002Fspeech\u002Ftts\u002FTextToSpeech)\n[![TTS - ElevenLabs](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FTTS-ElevenLabs-ff1493.svg)](https:\u002F\u002Felevenlabs.io\u002Fdocs)\n[![Anthropic Claude](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FAI-Anthropic%20Claude-d4a574.svg)](https:\u002F\u002Fdocs.anthropic.com\u002F)\n![build: passing](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fbuild-passing-brightgreen) ![license: 
MIT](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-MIT-blue) ![minSdk: 24](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002FminSdk-24-orange)\n\n![MVVM3](https:\u002F\u002Fuser-images.githubusercontent.com\u002F1812129\u002F68319232-446cf900-00be-11ea-92cf-cad817b2af2c.png)\n\n\n```mermaid\nflowchart LR\n    Microphone --> AudioRecord --> VAD --> STT --> ClaudeAI[\"Claude AI\"] --> TTS --> Speaker\n```\n\n---\nThe Android Voice AI SDK is a reusable Android library that gives any app a full voice-driven AI conversation pipeline in minutes. It captures audio from the device microphone, transcribes speech to text, sends the transcript to Anthropic Claude for an intelligent response, and speaks the reply back to the user through text-to-speech — all wired together with a single `VoiceAISDK.Builder` call. The SDK ships ready-to-drop-in Jetpack Compose UI components, swappable STT\u002FTTS engine adapters, on-device emotion detection, and security utilities including PII redaction and encrypted key storage.\n\n\n## Features\n\n| Layer | Capability |\n|-------|-----------|\n| **Audio Input** | Voice Activity Detection (VAD), noise handling, streaming PCM capture |\n| **Recognition** | Speech-to-Text (STT), language detection, speaker diarization |\n| **Understanding** | Intent extraction, entity recognition, conversation context |\n| **Action** | API orchestration, workflow execution, task automation |\n| **Response** | LLM answer generation (Anthropic Claude) |\n| **Voice Output** | Text-to-Speech (TTS), voice style selection, audio streaming |\n| **Safety** | User consent, authentication, abuse prevention |\n| **Analytics** | Conversation logs, session summaries, quality metrics |\n\n\n## Requirements\n\n| Requirement | Version |\n|---|---|\n| Android Studio | Meerkat or newer |\n| Minimum SDK | 24 (Android 7.0) |\n| Kotlin | 2.0+ (project uses 2.3.21) |\n| Anthropic API key | Required — obtain at 
[console.anthropic.com](https:\u002F\u002Fconsole.anthropic.com) |\n\n## Quick Start\n\n### Step 1 — Add the dependency and manifest permissions\n\nIn your app `build.gradle.kts`:\n\n```kotlin\ndependencies {\n    implementation(\"com.sdk:voice-ai-sdk:1.0.0\")\n}\n```\n\nIn `app\u002Fsrc\u002Fmain\u002FAndroidManifest.xml`:\n\n```xml\n\u003Cuses-permission android:name=\"android.permission.RECORD_AUDIO\" \u002F>\n\u003Cuses-permission android:name=\"android.permission.INTERNET\" \u002F>\n\u003Cuses-permission android:name=\"android.permission.ACCESS_NETWORK_STATE\" \u002F>\n```\n\n### Step 2 — Hilt setup\n\nAnnotate your `Application` class with `@HiltAndroidApp` and your `Activity` with `@AndroidEntryPoint`:\n\n```kotlin\n@HiltAndroidApp\nclass MyApp : Application()\n\n@AndroidEntryPoint\nclass MainActivity : ComponentActivity() { ... }\n```\n\n### Step 3 — Add your API key to local.properties\n\n`local.properties` is git-ignored, so your key never ends up in source control:\n\n```properties\nANTHROPIC_API_KEY=sk-ant-...\n```\n\nThen expose it via `BuildConfig` in `app\u002Fbuild.gradle.kts`:\n\n```kotlin\ndefaultConfig {\n    buildConfigField(\n        \"String\",\n        \"ANTHROPIC_API_KEY\",\n        \"\\\"${project.findProperty(\"ANTHROPIC_API_KEY\") ?: \"\"}\\\"\",\n    )\n}\n\nbuildFeatures {\n    buildConfig = true\n}\n```\n\n### Step 4 — Build the SDK\n\nProvide the SDK through Hilt by creating an `AppModule`:\n\n```kotlin\n@Module\n@InstallIn(SingletonComponent::class)\nobject AppModule {\n\n    @Provides\n    @Singleton\n    fun provideVoiceAIConfig(): VoiceAIConfig =\n        VoiceAIConfig(anthropicApiKey = BuildConfig.ANTHROPIC_API_KEY)\n\n    @Provides\n    @Singleton\n    fun provideVoiceAISDK(\n        @ApplicationContext context: Context,\n        config: VoiceAIConfig,\n    ): VoiceAISDK = VoiceAISDK.Builder(context)\n        .anthropicApiKey(config.anthropicApiKey)\n        .debugLogging(BuildConfig.DEBUG)\n        .build()\n}\n```\n\nOr 
construct the SDK directly without Hilt:\n\n```kotlin\nval sdk = VoiceAISDK.Builder(context)\n    .anthropicApiKey(BuildConfig.ANTHROPIC_API_KEY)\n    .debugLogging(true)\n    .config { copy(systemPrompt = \"You are a concise voice assistant.\") }\n    .build()\n\nval session: VoiceAISession = sdk.createSession()\nsession.start()\n```\n\n### Step 5 — Add the VoiceScreen composable\n\nUse `VoiceSessionPermissionGate` to handle the `RECORD_AUDIO` runtime permission automatically, then place `VoiceButton` and `ConversationView` inside:\n\n```kotlin\n@Composable\nfun VoiceScreen(viewModel: VoiceViewModel = hiltViewModel()) {\n    VoiceSessionPermissionGate(\n        rationale = \"Microphone access is required for voice conversations.\",\n    ) {\n        Column(\n            modifier = Modifier\n                .fillMaxSize()\n                .padding(16.dp),\n            verticalArrangement = Arrangement.SpaceBetween,\n        ) {\n            ConversationView(\n                messages = viewModel.messages.collectAsStateWithLifecycle().value,\n                modifier = Modifier.weight(1f),\n            )\n            VoiceButton(\n                session = viewModel.session,\n                modifier = Modifier.align(Alignment.CenterHorizontally),\n            )\n        }\n    }\n}\n```\n\n## Architecture\n\nThe SDK is organised into six layers, each with a single responsibility:\n\n| Layer | Package | Responsibility |\n|---|---|---|\n| **Audio** | `audio\u002F` | Raw PCM capture via `AudioRecord`, voice activity detection (VAD), audio level metering, and PCM-to-WAV conversion |\n| **STT** | `stt\u002F` | `SpeechToTextEngine` interface with a drop-in Android built-in implementation; plug in Whisper or any other engine |\n| **AI** | `ai\u002F` | `AIEngine` interface backed by `ClaudeAIEngine`, which wraps the official Anthropic Java SDK and maintains conversation history |\n| **TTS** | `tts\u002F` | `TextToSpeechEngine` interface with a drop-in Android built-in 
implementation; plug in ElevenLabs for premium voices |\n| **Session** | `VoiceAISession` | Orchestrates the full pipeline — audio in, transcript out, AI reply, speech out — as a single coroutine-based lifecycle |\n| **UI** | `ui\u002F` | Ready-to-use Jetpack Compose components: `VoiceButton`, `ConversationView`, `VoiceSessionPermissionGate`, `WaveformVisualizer`, `LiveCaptionBanner`, `VoiceStatusIndicator` |\n\n## Available Engines\n\n| Category | Engine | Class | Notes |\n|---|---|---|---|\n| STT | Android built-in | `AndroidSttEngine` | Default; free; uses `android.speech.SpeechRecognizer`; requires network |\n| STT | OpenAI Whisper | `WhisperSttEngine` | Higher accuracy; POSTs PCM\u002FWAV to OpenAI REST API; requires OpenAI key |\n| AI | Anthropic Claude | `ClaudeAIEngine` | Default and only AI engine; uses `com.anthropic:anthropic-java`; model is configurable |\n| TTS | Android built-in | `AndroidTtsEngine` | Default; free; uses `android.speech.tts.TextToSpeech` |\n| TTS | ElevenLabs | `ElevenLabsTtsEngine` | High-quality natural voices; POSTs to ElevenLabs REST API; requires ElevenLabs key |\n| Emotion | On-device | built-in | Lightweight on-device audio feature analysis; no external key required |\n| Emotion | Hume AI | `HumeEmotionDetector` | Cloud-based; high accuracy across 7 emotions; requires Hume API key |\n\n## Configuration Reference\n\nAll options are fields on `VoiceAIConfig`. Pass a `config { }` block to `VoiceAISDK.Builder` to override defaults.\n\n| Field | Type | Default | Description |\n|---|---|---|---|\n| `anthropicApiKey` | `String` | — | Required. Your Anthropic API key. Never hardcode; read from `BuildConfig` or encrypted storage. |\n| `aiModel` | `String` | `\"claude-3-5-sonnet-20241022\"` | Claude model ID used for all AI turns. |\n| `systemPrompt` | `String?` | `\"You are a helpful voice assistant…\"` | System instruction prepended to every conversation. 
|\n| `inputMode` | `InputMode` | `HANDS_FREE` | `HANDS_FREE` activates VAD; `PUSH_TO_TALK` records only while button is held. |\n| `locale` | `Locale` | `Locale.getDefault()` | Locale passed to the STT engine for language hints. |\n| `silenceTimeoutMs` | `Long` | `1200` | Milliseconds of silence after speech before the STT turn is finalised. |\n| `maxHistoryTurns` | `Int` | `20` | Maximum number of conversation turns kept in the Claude context window. |\n| `piiRedaction` | `Boolean` | `false` | When `true`, strips phone numbers, emails, and credit-card numbers from transcripts before sending to the AI. |\n| `emotionDetectionEnabled` | `Boolean` | `false` | Enables voice emotion detection. Requires user consent; set `emotionConsentRationale`. |\n| `certificatePins` | `List\u003CString>` | `emptyList()` | SHA-256 certificate pins applied to the OkHttp client for network requests. |\n\n## Voice Emotion Detection\n\nWhen enabled, the SDK analyses the acoustic features of each recorded utterance and annotates AI turns with the detected emotion (`NEUTRAL`, `HAPPY`, `SAD`, `ANGRY`, `FEARFUL`, `SURPRISED`, or `DISGUSTED`). Set `emotionAwareAI = true` to have the detected emotion automatically injected into the Claude system context so the AI can adapt its tone. 
Always present a clear consent rationale before enabling this feature.\n\n```kotlin\nval sdk = VoiceAISDK.Builder(context)\n    .anthropicApiKey(BuildConfig.ANTHROPIC_API_KEY)\n    .config {\n        copy(\n            emotionDetectionEnabled = true,\n            emotionConsentRationale = \"Emotion analysis helps the assistant respond more empathetically.\",\n            emotionAwareAI = true,\n        )\n    }\n    .build()\n```\n\n## Security\n\n- **API keys are never hardcoded.** Keys are read from `BuildConfig` fields populated via `local.properties` (git-ignored) or CI environment variables, never embedded in source files or `strings.xml`.\n- **Encrypted local storage.** `VoiceAIKeyStorage` wraps `EncryptedSharedPreferences` with AES-256-SIV key encryption and AES-256-GCM value encryption backed by the Android Keystore.\n- **PII redaction.** When `piiRedaction = true`, `PiiRedactor` strips phone numbers, email addresses, and credit-card numbers from transcripts before they leave the device.\n- **Certificate pinning.** Populate `VoiceAIConfig.certificatePins` with SHA-256 digests to enable OkHttp certificate pinning on all outbound API calls.\n- **R8\u002FProGuard minification.** Release builds should enable `isMinifyEnabled = true`; the Anthropic Java SDK ships consumer ProGuard rules that are merged automatically.\n\n## License\n\n```\nMIT License\n\nCopyright (c) 2026 Android Voice AI SDK Contributors\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and\u002For sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of 
the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n```","Android Voice AI SDK is a reusable Android library that gives any app a complete voice-driven AI conversation pipeline in minutes. It captures audio from the device microphone, transcribes speech to text, obtains an intelligent response from Anthropic Claude, and speaks the reply back to the user through text-to-speech. The library follows an MVVM architecture, is written in Kotlin, and bundles Jetpack Compose UI components, swappable STT\u002FTTS engine adapters, on-device emotion detection, and security utilities such as PII redaction and encrypted key storage. It suits Android apps that need to integrate high-quality voice interaction quickly.","2026-06-11 04:09:38","top_language"]