[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72040":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":23,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":26,"readmeContent":27,"aiSummary":28,"trendingCount":16,"starSnapshotCount":16,"syncStatus":29,"lastSyncTime":30,"discoverSource":31},72040,"call-center-ai","microsoft\u002Fcall-center-ai","microsoft","Send a phone call from AI agent, in an API call. Or, directly call the bot from the configured phone number!","",null,"Python",6502,774,47,21,0,3,11,44,9,39.67,"Apache License 2.0",false,"main",[],"2026-06-12 02:02:57","# Call Center AI\n\nAI-powered call center solution with Azure and OpenAI GPT.\n\n\u003C!-- github.com badges -->\n[![Last release date](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Frelease-date\u002Fclemlesne\u002Fcall-center-ai)](https:\u002F\u002Fgithub.com\u002Fclemlesne\u002Fcall-center-ai\u002Freleases)\n[![Project license](https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Flicense\u002Fclemlesne\u002Fcall-center-ai)](https:\u002F\u002Fgithub.com\u002Fclemlesne\u002Fcall-center-ai\u002Fblob\u002Fmain\u002FLICENSE)\n\n\u003C!-- GitHub Codespaces badge -->\n[![Open in GitHub Codespaces](https:\u002F\u002Fgithub.com\u002Fcodespaces\u002Fbadge.svg)](https:\u002F\u002Fcodespaces.new\u002Fmicrosoft\u002Fcall-center-ai?quickstart=1)\n\n## Overview\n\nSend a phone call from AI agent, in an API call. Or, directly call the bot from the configured phone number!\n\nInsurance, IT support, customer service, and more. 
The bot can be customized in a few hours (really) to fit your needs.\n\n```bash\n# Ask the bot to call a phone number\ndata='{\n  \"bot_company\": \"Contoso\",\n  \"bot_name\": \"Amélie\",\n  \"phone_number\": \"+11234567890\",\n  \"task\": \"Help the customer with their digital workplace. Assistant is working for the IT support department. The objective is to help the customer with their issue and gather information in the claim.\",\n  \"agent_phone_number\": \"+33612345678\",\n  \"claim\": [\n    {\n      \"name\": \"hardware_info\",\n      \"type\": \"text\"\n    },\n    {\n      \"name\": \"first_seen\",\n      \"type\": \"datetime\"\n    },\n    {\n      \"name\": \"building_location\",\n      \"type\": \"text\"\n    }\n  ]\n}'\n\ncurl \\\n  --header 'Content-Type: application\u002Fjson' \\\n  --request POST \\\n  --url https:\u002F\u002Fxxx\u002Fcall \\\n  --data \"$data\"\n```\n\n### Features\n\n- **Enhanced communication and user experience**: Integrates inbound and outbound calls with a dedicated phone number, supports multiple languages and voice tones, and allows users to provide or receive information via SMS. Conversations are **streamed in real-time** to avoid delays, can be **resumed after disconnections**, and are **stored for future reference**. This ensures an **improved customer experience**, enabling 24\u002F7 communication and handling of low to medium complexity calls, all in a more accessible and user-friendly manner.\n\n- **Advanced intelligence and data management**: Leverages **gpt-4.1** and **gpt-4.1-nano** (known for higher performance and a 10–15x cost premium) to achieve nuanced comprehension. It can discuss **private and sensitive data**, including customer-specific information, while following **retrieval-augmented generation (RAG)** best practices to ensure secure and compliant handling of internal documents. 
The system understands domain-specific terms, follows a structured claim schema, generates automated to-do lists, filters inappropriate content, and detects jailbreak attempts. Historical conversations and past interactions can also be used to **fine-tune the LLM**, improving accuracy and personalization over time. Redis caching further enhances efficiency.\n\n- **Customization, oversight, and scalability**: Offers **customizable prompts**, feature flags for controlled experimentation, human agent fallback, and call recording for quality assurance. Integrates Application Insights for monitoring and tracing, provides publicly accessible claim data, and plans future enhancements such as automated callbacks and IVR-like workflows. It also enables the creation of a **brand-specific custom voice**, allowing the assistant’s voice to reflect the company’s identity and improve brand consistency.\n\n- **Cloud-native deployment and resource management**: Deployed on **Azure** with a containerized, serverless architecture for low maintenance and elastic scaling. This approach optimizes costs based on usage, ensuring flexibility and affordability over time. Seamless integration with **Azure Communication Services**, **Cognitive Services**, and **OpenAI resources** provides a secure environment suitable for rapid iteration, continuous improvement, and accommodating variable workloads in the call center.\n\n### Demo\n\nA French demo is available on YouTube. Do not hesitate to watch the demo at 1.5x speed to get a quick overview of the project. Voice is hesitant on purpose to show the bot can handle it. All the infrastructure is deployed on Azure, mostly in serverless mode. LLM resources can be provisioned to reduce latency.\n\n[![French demo](https:\u002F\u002Fimg.youtube.com\u002Fvi\u002Fi_qhNdUUxSI\u002Fmaxresdefault.jpg)](https:\u002F\u002Fyoutube.com\u002Fwatch?v=i_qhNdUUxSI)\n\nMain interactions shown in the demo:\n\n1. User calls the call center\n2. 
The bot answers and the conversation starts\n3. The bot stores conversation, claim and todo list in the database\n\nExtract of the data stored during the call:\n\n```json\n{\n  \"claim\": {\n    \"incident_description\": \"Collision avec un autre véhicule, voiture dans le fossé, pas de blessés\",\n    \"incident_location\": \"Nationale 17\",\n    \"involved_parties\": \"Dujardin, Madame Lesné\",\n    \"policy_number\": \"DEC1748\"\n  },\n  \"messages\": [\n    {\n      \"created_at\": \"2024-12-10T15:51:04.566727Z\",\n      \"action\": \"talk\",\n      \"content\": \"Non, je pense que c'est pas mal. Vous avez répondu à mes questions et là j'attends la dépaneuse. Merci beaucoup.\",\n      \"persona\": \"human\",\n      \"style\": \"none\",\n      \"tool_calls\": []\n    },\n    {\n      \"created_at\": \"2024-12-10T15:51:06.040451Z\",\n      \"action\": \"talk\",\n      \"content\": \"Je suis ravi d'avoir pu vous aider! Si vous avez besoin de quoi que ce soit d'autre, n'hésitez pas à nous contacter. Je vous souhaite une bonne journée et j'espère que tout se passera bien avec la dépanneuse. Au revoir!\",\n      \"persona\": \"assistant\",\n      \"style\": \"none\",\n      \"tool_calls\": []\n    }\n  ],\n  \"next\": {\n    \"action\": \"case_closed\",\n    \"justification\": \"The customer has provided all necessary information for the insurance claim, and a reminder has been set for a follow-up call. The customer is satisfied with the assistance provided and is waiting for the tow truck. 
The case can be closed for now.\"\n  },\n  \"reminders\": [\n    {\n      \"created_at\": \"2024-12-10T15:50:09.507903Z\",\n      \"description\": \"Rappeler le client pour faire le point sur l'accident et l'avancement du dossier.\",\n      \"due_date_time\": \"2024-12-11T14:30:00\",\n      \"owner\": \"assistant\",\n      \"title\": \"Rappel client sur l'accident\"\n    }\n  ],\n  \"synthesis\": {\n    \"long\": \"During our call, you reported an accident involving your vehicle on the Nationale 17. You mentioned that there were no injuries, but both your car and the other vehicle ended up in a ditch. The other party involved is named Dujardin, and your vehicle is a 4x4 Ford. I have updated your claim with these details, including the license plates: yours is U837GE and the other vehicle's is GA837IA. A reminder has been set for a follow-up call tomorrow at 14:30 to discuss the progress of your claim. If you need further assistance, please feel free to reach out.\",\n    \"satisfaction\": \"high\",\n    \"short\": \"the accident on Nationale 17\",\n    \"improvement_suggestions\": \"To improve the customer experience, it would be beneficial to ensure that the call connection is stable to avoid interruptions. Additionally, providing a clear step-by-step guide on what information is needed for the claim could help streamline the process and reduce any confusion for the customer.\"\n  }\n  ...\n}\n```\n\n### User report after the call\n\nA report is available at `https:\u002F\u002F[your_domain]\u002Freport\u002F[phone_number]` (like `http:\u002F\u002Flocalhost:8080\u002Freport\u002F%2B133658471534`). It shows the conversation history, claim data and reminders.\n\n![User report](.\u002Fdocs\u002Fuser_report.png)\n\n## Architecture\n\n### High level architecture\n\n```mermaid\n---\ntitle: System diagram (C4 model)\n---\ngraph\n  user([\"User\"])\n  agent([\"Agent\"])\n\n  app[\"Call Center AI\"]\n\n  app -- Transfer to --> agent\n  app -. 
Send voice .-> user\n  user -- Call --> app\n```\n\n### Component level architecture\n\n```mermaid\n---\ntitle: Claim AI component diagram (C4 model)\n---\ngraph LR\n  agent([\"Agent\"])\n  user([\"User\"])\n\n  subgraph \"Claim AI\"\n    ada[\"Embedding\u003Cbr>(ADA)\"]\n    app[\"App\u003Cbr>(Container App)\"]\n    communication_services[\"Call & SMS gateway\u003Cbr>(Communication Services)\"]\n    db[(\"Conversations and claims\u003Cbr>(Cosmos DB)\")]\n    eventgrid[\"Broker\u003Cbr>(Event Grid)\"]\n    gpt[\"LLM\u003Cbr>(gpt-4.1, gpt-4.1-nano)\"]\n    queues[(\"Queues\u003Cbr>(Azure Storage)\")]\n    redis[(\"Cache\u003Cbr>(Redis)\")]\n    search[(\"RAG\u003Cbr>(AI Search)\")]\n    sounds[(\"Sounds\u003Cbr>(Azure Storage)\")]\n    sst[\"Speech-to-text\u003Cbr>(Cognitive Services)\"]\n    translation[\"Translation\u003Cbr>(Cognitive Services)\"]\n    tts[\"Text-to-speech\u003Cbr>(Cognitive Services)\"]\n  end\n\n  app -- Translate static TTS --> translation\n  app -- Search RAG data --> search\n  app -- Generate completion --> gpt\n  gpt -. Answer with completion .-> app\n  app -- Generate voice --> tts\n  tts -. Answer with voice .-> app\n  app -- Get cached data --> redis\n  app -- Save conversation --> db\n  app -- Transform voice --> sst\n  sst -. Answer with text .-> app\n  app \u003C-. Exchange audio .-> communication_services\n  app -. Watch .-> queues\n\n  communication_services -- Load sound --> sounds\n  communication_services -- Notifies --> eventgrid\n  communication_services -- Transfer to --> agent\n  communication_services \u003C-. Exchange audio .-> agent\n  communication_services \u003C-. Exchange audio .-> user\n\n  eventgrid -- Push to --> queues\n\n  search -- Generate embeddings --> ada\n\n  user -- Call --> communication_services\n```\n\n## Deployment\n\n> [!NOTE]\n> This project is a proof of concept. It is not intended to be used in production. 
It demonstrates how Azure Communication Services, Azure Cognitive Services and Azure OpenAI can be combined to build an automated call center solution.\n\n### Prerequisites\n\n[Prefer using GitHub Codespaces for a quick start.](https:\u002F\u002Fcodespaces.new\u002Fmicrosoft\u002Fcall-center-ai?quickstart=1) The environment will set up automatically with all the required tools.\n\nOn macOS, with [Homebrew](https:\u002F\u002Fbrew.sh), simply type `make brew`.\n\nFor other systems, make sure you have the following installed:\n\n- [Azure CLI](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fcli\u002Fazure\u002Finstall-azure-cli)\n- [Twilio CLI](https:\u002F\u002Fwww.twilio.com\u002Fdocs\u002Ftwilio-cli\u002Fgetting-started\u002Finstall) (optional)\n- [yq](https:\u002F\u002Fgithub.com\u002Fmikefarah\u002Fyq?tab=readme-ov-file#install)\n- Bash compatible shell, like `bash` or `zsh`\n- Make, `apt install make` (Ubuntu), `yum install make` (CentOS), `brew install make` (macOS)\n\nThen, Azure resources are needed:\n\n#### 1. [Create a new resource group](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Fazure-resource-manager\u002Fmanagement\u002Fmanage-resource-groups-portal)\n\n- Prefer to use lowercase and no special characters other than dashes (e.g. `ccai-customer-a`)\n\n#### 2. [Create a Communication Services resource](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Fcommunication-services\u002Fquickstarts\u002Fcreate-communication-resource?tabs=linux&pivots=platform-azp)\n\n- Same name as the resource group\n- Enable system managed identity\n\n#### 3. 
[Buy a phone number](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Fcommunication-services\u002Fquickstarts\u002Ftelephony\u002Fget-phone-number?tabs=linux&pivots=platform-azp-new)\n\n- From the Communication Services resource\n- Allow inbound and outbound communication\n- Enable voice (required) and SMS (optional) capabilities\n\nNow that the prerequisites are configured (local + Azure), the deployment can be done.\n\n### Remote (on Azure)\n\nA pre-built container image is published by GitHub Actions; it is used to deploy the solution on Azure:\n\n- Latest version from a branch: `ghcr.io\u002Fclemlesne\u002Fcall-center-ai:main`\n- Specific tag: `ghcr.io\u002Fclemlesne\u002Fcall-center-ai:0.1.0` (recommended)\n\n#### 1. Create the light config file\n\nFill the template from the example at [`config-remote-example.yaml`](.\u002Fconfig-remote-example.yaml). The file should be placed at the root of the project under the name `config.yaml`. It will be used by install scripts (incl. Makefile and Bicep) to configure the Azure resources.\n\n#### 2. Connect to your Azure environment\n\n```zsh\naz login\n```\n\n#### 3. Run deployment automation\n\n> [!TIP]\n> Specify the release version under the `image_version` parameter (default is `main`). For example, `image_version=16.0.0` or `image_version=sha-7ca2c0c`. This will ensure any future project breaking changes won't affect your deployment.\n\n```zsh\nmake deploy name=my-rg-name\n```\n\nWait for the deployment to finish.\n\n#### 4. Get the logs\n\n```zsh\nmake logs name=my-rg-name\n```\n\n### Local (on your machine)\n\n#### 1. Prerequisites\n\nIf you skipped the `make brew` command from the first install section, make sure you have the following installed:\n\n- [Rust](https:\u002F\u002Frust-lang.org)\n- [uv](https:\u002F\u002Fdocs.astral.sh\u002Fuv)\n\nFinally, run `make install` to set up the Python environment.\n\n#### 2. 
Create the full config file\n\nIf the application is already deployed on Azure, you can run `make name=my-rg-name sync-local-config` to copy the configuration from remote to your local machine.\n\n> [!TIP]\n> To use a Service Principal to authenticate to Azure, you can also add the following in a `.env` file:\n>\n> ```dotenv\n> AZURE_CLIENT_ID=xxx\n> AZURE_CLIENT_SECRET=xxx\n> AZURE_TENANT_ID=xxx\n> ```\n\nIf the solution is not running online, fill the template from the example at [`config-local-example.yaml`](.\u002Fconfig-local-example.yaml). The file should be placed at the root of the project under the name `config.yaml`.\n\n#### 3. Run the deployment automation\n\nExecute if the solution is not yet deployed on Azure.\n\n```zsh\nmake deploy-bicep deploy-post name=my-rg-name\n```\n\n- This will deploy the Azure resources without the API server, allowing you to test the bot locally\n- Wait for the deployment to finish\n\n#### 4. Connect to Azure Dev tunnels\n\n> [!IMPORTANT]\n> The tunnel must run in a separate terminal, because it needs to stay running the whole time\n\n```zsh\n# Log in once\ndevtunnel login\n\n# Start the tunnel\nmake tunnel\n```\n\n#### 5. Iterate quickly with the code\n\n> [!NOTE]\n> To override a specific configuration value, you can use environment variables. For example, to override the `llm.fast.endpoint` value, you can use the `LLM__FAST__ENDPOINT` variable:\n>\n> ```dotenv\n> LLM__FAST__ENDPOINT=https:\u002F\u002Fxxx.openai.azure.com\n> ```\n\n> [!NOTE]\n> Also, the `local.py` script is available to test the application without a phone call (= without Communication Services). Run the script with:\n>\n> ```bash\n> python3 -m tests.local\n> ```\n\n```zsh\nmake dev\n```\n\n- Code is automatically reloaded on file changes, no need to restart the server\n- The API server is available at `http:\u002F\u002Flocalhost:8080`\n\n## Advanced usage\n\n### Enable call recording\n\nCall recording is disabled by default. To enable it:\n\n1. 
Create a new container in the Azure Storage account (i.e. `recordings`), it is already done if you deployed the solution on Azure\n2. Update the feature flag `recording_enabled` in App Configuration to `true`\n\n### Add my custom training data with AI Search\n\nTraining data is stored on AI Search to be retrieved by the bot, on demand.\n\nRequired index schema:\n\n| **Field Name** | `Type` | Retrievable | Searchable | Dimensions | Vectorizer |\n|-|-|-|-|-|-|\n| **answer** | `Edm.String` | Yes | Yes | | |\n| **context** | `Edm.String` | Yes | Yes | | |\n| **created_at** | `Edm.String` | Yes | No | | |\n| **document_synthesis** | `Edm.String` | Yes | Yes | | |\n| **file_path** | `Edm.String` | Yes | No | | |\n| **id** | `Edm.String` | Yes | No | | |\n| **question** | `Edm.String` | Yes | Yes | | |\n| **vectors** | `Collection(Edm.Single)` | No | Yes | 1536 | *OpenAI ADA* |\n\nSoftware to fill the index is included [on Synthetic RAG Index](https:\u002F\u002Fgithub.com\u002Fclemlesne\u002Frag-index) repository.\n\n### Customize the languages\n\nThe bot can be used in multiple languages. 
It can understand the language the user chose.\n\nSee the [list of supported languages](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Fai-services\u002Fspeech-service\u002Flanguage-support?tabs=tts#supported-languages) for the Text-to-Speech service.\n\n```yaml\n# config.yaml\nconversation:\n  initiate:\n    lang:\n      default_short_code: fr-FR\n      availables:\n        - pronunciations_en: [\"French\", \"FR\", \"France\"]\n          short_code: fr-FR\n          voice: fr-FR-DeniseNeural\n        - pronunciations_en: [\"Chinese\", \"ZH\", \"China\"]\n          short_code: zh-CN\n          voice: zh-CN-XiaoqiuNeural\n```\n\nIf you built and deployed an [Azure Speech Custom Neural Voice (CNV)](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Fai-services\u002Fspeech-service\u002Fcustom-neural-voice), add field `custom_voice_endpoint_id` on the language configuration:\n\n```yaml\n# config.yaml\nconversation:\n  initiate:\n    lang:\n      default_short_code: fr-FR\n      availables:\n        - pronunciations_en: [\"French\", \"FR\", \"France\"]\n          short_code: fr-FR\n          voice: xxx\n          custom_voice_endpoint_id: xxx\n```\n\n### Customize the moderation levels\n\nLevels are defined for each category of Content Safety. The higher the score, the stricter the moderation is, from 0 to 7. Moderation is applied on all bot data, including the web page and the conversation. Configure them in Azure OpenAI Content Filters.\n\n### Customize the claim data schema\n\nCustomization of the data schema is fully supported. You can add or remove fields as needed, depending on the requirements.\n\nBy default, the schema is composed of:\n\n- `caller_email` (`email`)\n- `caller_name` (`text`)\n- `caller_phone` (`phone_number`)\n\nValues are validated to ensure the data format conforms to your schema. 
Field types can be one of:\n\n- `datetime`\n- `email`\n- `phone_number` (`E164` format)\n- `text`\n\nFinally, an optional description can be provided. The description must be short and meaningful, as it will be passed to the LLM.\n\nDefault schema, for inbound calls, is defined in the configuration:\n\n```yaml\n# config.yaml\nconversation:\n  default_initiate:\n    claim:\n      - name: additional_notes\n        type: text\n        # description: xxx\n      - name: device_info\n        type: text\n        # description: xxx\n      - name: incident_datetime\n        type: datetime\n        # description: xxx\n```\n\nClaim schema can be customized for each call, by adding the `claim` field in the `POST \u002Fcall` API call.\n\n### Customize the call objective\n\nThe objective is a description of what the bot will do during the call. It is used to give a context to the LLM. It should be short, meaningful, and written in English.\n\nThis approach is preferred over overriding the LLM prompt.\n\nDefault task, for inbound calls, is defined in the configuration:\n\n```yaml\n# config.yaml\nconversation:\n  initiate:\n    task: |\n      Help the customer with their insurance claim. Assistant requires data from the customer to fill the claim. The latest claim data will be given. Assistant role is not over until all the relevant data is gathered.\n```\n\nTask can be customized for each call, by adding the `task` field in the `POST \u002Fcall` API call.\n\n### Customize the conversation\n\nConversation options are represented as features. They can be configured from App Configuration, without the need to redeploy or restart the application. Once a feature is updated, a delay of up to 60 seconds is needed for the change to take effect.\n\nBy default, values are refreshed every 60 seconds. Refresh is not synchronized across instances, so it can take up to 60 seconds for the change to reach all users. 
Update this in the `app_configuration.ttl_sec` field.\n\n| Name | Description | Type | Default |\n|-|-|-|-|\n| `answer_hard_timeout_sec` | Time to wait for the LLM before aborting the answer with an error message. | `int` | 15 |\n| `answer_soft_timeout_sec` | Time to wait for the LLM before sending a waiting message. | `int` | 4 |\n| `callback_timeout_hour` | The timeout for a callback in hours. Set 0 to disable. | `int` | 3 |\n| `phone_silence_timeout_sec` | Amount of silence in secs to trigger a warning message from the assistant. | `int` | 20 |\n| `recognition_retry_max` | The maximum number of retries for voice recognition. Minimum of 1. | `int` | 3 |\n| `recognition_stt_complete_timeout_ms` | The timeout for STT completion in milliseconds. | `int` | 100 |\n| `recording_enabled` | Whether call recording is enabled. | `bool` | false |\n| `slow_llm_for_chat` | Whether to use the slow LLM for chat. | `bool` | false |\n| `vad_cutoff_timeout_ms` | The cutoff timeout for voice activity detection in milliseconds. | `int` | 250 |\n| `vad_silence_timeout_ms` | Silence to trigger voice activity detection in milliseconds. | `int` | 500 |\n| `vad_threshold` | The threshold for voice activity detection. Between 0.1 and 1. | `float` | 0.5 |\n\n### Use Twilio for SMS\n\nTo use Twilio for SMS, you need to create an account and get the following information:\n\n- Account SID\n- Auth Token\n- Phone number\n\nThen, add the following in the `config.yaml` file:\n\n```yaml\n# config.yaml\nsms:\n  mode: twilio\n  twilio:\n    account_sid: xxx\n    auth_token: xxx\n    phone_number: \"+33612345678\"\n```\n\n### Customize the prompts\n\nNote that prompt examples contain `{xxx}` placeholders. These placeholders are replaced by the bot with the corresponding data. For example, `{bot_name}` is internally replaced by the bot name. Be sure to write all the TTS prompts in English. This language is used as a pivot language for the conversation translation. 
All texts are referenced as lists, so user can have a different experience each time they call, thus making the conversation more engaging.\n\n```yaml\n# config.yaml\nprompts:\n  tts:\n    hello_tpl:\n      - : |\n        Hello, I'm {bot_name}, from {bot_company}! I'm an IT support specialist.\n\n        Here's how I work: when I'm working, you'll hear a little music; then, at the beep, it's your turn to speak. You can speak to me naturally, I'll understand.\n\n        What's your problem?\n      - : |\n        Hi, I'm {bot_name} from {bot_company}. I'm here to help.\n\n        You'll hear music, then a beep. Speak naturally, I'll understand.\n\n        What's the issue?\n  llm:\n    default_system_tpl: |\n      Assistant is called {bot_name} and is in a call center for the company {bot_company} as an expert with 20 years of experience in IT service.\n\n      # Context\n      Today is {date}. Customer is calling from {phone_number}. Call center number is {bot_phone_number}.\n    chat_system_tpl: |\n      # Objective\n      Provide internal IT support to employees. Assistant requires data from the employee to provide IT support. The assistant's role is not over until the issue is resolved or the request is fulfilled.\n\n      # Rules\n      - Answers in {default_lang}, even if the customer speaks another language\n      - Cannot talk about any topic other than IT support\n      - Is polite, helpful, and professional\n      - Rephrase the employee's questions as statements and answer them\n      - Use additional context to enhance the conversation with useful details\n      - When the employee says a word and then spells out letters, this means that the word is written in the way the employee spelled it (e.g. 
\"I work in Paris PARIS\", \"My name is John JOHN\", \"My email is Clemence CLEMENCE at gmail GMAIL dot com COM\")\n      - You work for {bot_company}, not someone else\n\n      # Required employee data to be gathered by the assistant\n      - Department\n      - Description of the IT issue or request\n      - Employee name\n      - Location\n\n      # General process to follow\n      1. Gather information to know the employee's identity (e.g. name, department)\n      2. Gather details about the IT issue or request to understand the situation (e.g. description, location)\n      3. Provide initial troubleshooting steps or solutions\n      4. Gather additional information if needed (e.g. error messages, screenshots)\n      5. Be proactive and create reminders for follow-up or further assistance\n\n      # Support status\n      {claim}\n\n      # Reminders\n      {reminders}\n```\n\n### Optimize response delay\n\nThe delay mainly comes from two things:\n\n- Voice in and voice out are processed by Azure AI Speech, both are implemented in streaming mode but voice is not directly streamed to the LLM\n- The LLM, more specifically the delay between the API call and the first sentence inferred, can be long (as the sentences are sent one by one once they are made available), even longer if it hallucinates and returns empty answers (it happens regularly, and the application retries the call)\n\nFor now, the only impactful lever is the LLM part. Latency can be reduced with a PTU on Azure or by using a less capable model like `gpt-4.1-nano` (selected by default on the latest versions). With a PTU on Azure OpenAI, you can halve the latency in some cases.\n\nThe application is natively connected to Azure Application Insights, so you can monitor the response time and see where the time is spent. 
This is a great start to identify the bottlenecks.\n\nFeel free to raise an issue or propose a PR if you have any idea to optimize the response delay.\n\n### Improving conversation quality through model fine-tuning\n\nEnhance the LLM’s accuracy and domain adaptation by integrating historical data from human-run call centers. Before proceeding, ensure compliance with data privacy regulations, internal security standards, and [Responsible AI principles](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Fmachine-learning\u002Fconcept-responsible-ai?view=azureml-api-2). Consider the following steps:\n\n1. Aggregate authentic data sources: Collect voice recordings, call transcripts, and chat logs from previous human-managed interactions to provide the LLM with realistic training material.\n2. Preprocess and anonymize data: [Remove sensitive information (AI Language Personally Identifiable Information detection)](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Fai-services\u002Flanguage-service\u002Fpersonally-identifiable-information\u002Foverview), including personal identifiers or confidential details, to preserve user privacy, meet compliance, and align with Responsible AI guidelines.\n3. Perform iterative fine-tuning: Continuously [refine the model using the curated dataset (AI Foundry Fine-tuning)](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Fai-studio\u002Fconcepts\u002Ffine-tuning-overview), allowing it to learn industry-specific terminology, preferred conversation styles, and problem-resolution approaches.\n4. Validate improvements: Test the updated model against sample scenarios and measure key performance indicators (e.g. user satisfaction, call duration, resolution rate) to confirm that adjustments have led to meaningful enhancements.\n5. Monitor, iterate, and A\u002FB test: Regularly reassess the model’s performance, integrate newly gathered data, and apply further fine-tuning as needed. 
Leverage [built-in feature configurations to A\u002FB test (App Configuration Experimentation)](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Fazure-app-configuration\u002Fconcept-experimentation) different versions of the model, ensuring responsible, data-driven decisions and continuous optimization over time.\n\n### Monitoring the application\n\nThe application sends traces and metrics to Azure Application Insights. You can monitor the application from the Azure portal, or by using the API.\n\nThis includes application behavior, database queries, and external service calls. Plus, LLM metrics (latency, token usage, prompts content, raw response) from [OpenLLMetry](https:\u002F\u002Fgithub.com\u002Ftraceloop\u002Fopenllmetry), following the [semantic conventions for OpenAI operations](https:\u002F\u002Fopentelemetry.io\u002Fdocs\u002Fspecs\u002Fsemconv\u002Fgen-ai\u002Fopenai\u002F#openai-spans).\n\nAdditionally, custom metrics (viewable in Application Insights > Metrics) are published, notably:\n\n- `call.aec.droped`, number of times the echo cancellation dropped the voice completely.\n- `call.aec.missed`, number of times the echo cancellation failed to remove the echo in time.\n- `call.answer.latency`, time between the end of the user voice and the start of the bot voice.\n\n## Q&A\n\n### What will this cost?\n\nThe estimate assumes a monthly usage of 1000 calls of 10 minutes each. Costs are estimated as of 2024-12-10, in USD. Prices are subject to change.\n\n> [!NOTE]\n> For production usage, it is recommended to upgrade to SKUs with vNET integration and private endpoints. 
This can increase notably the costs.\n\nThis totalizes $720.07 \u002Fmonth, $0.12 \u002Fhour, with the following breakdown:\n\n[Azure Communication Services](https:\u002F\u002Fazure.microsoft.com\u002Fen-us\u002Fpricing\u002Fdetails\u002Fcommunication-services\u002F):\n\n| Region | Metric | Cost | Total (monthly $) | Note |\n|-|-|-|-|-|\n| West Europe | Audio Streaming | $0.004 \u002Fminute | $40 | |\n\n[Azure OpenAI](https:\u002F\u002Fazure.microsoft.com\u002Fen-us\u002Fpricing\u002Fdetails\u002Fcognitive-services\u002Fopenai-service\u002F):\n\n| Region | Metric | Cost | Total (monthly $) | Note |\n|-|-|-|-|-|\n| Sweden Central | gpt-4.1-nano global | $0.15 \u002F1M input tokens | $35.25 | 8k tokens for conversation history, 3750 tokens for RAG, each participant talk every 15s |\n| Sweden Central | gpt-4.1-nano global | $0.60 \u002F1M output tokens | $1.4 | 400 tokens for each response incl tools, each participant talk every 15s |\n| Sweden Central | gpt-4.1 global | $2.50 \u002F1M input tokens | $10 | 4k tokens for each conversation, to get insights |\n| Sweden Central | gpt-4.1 global | $10 \u002F1M output tokens | $10 | 1k tokens for each conversation, to get insights |\n| Sweden Central | text-embedding-3-large | $0.00013 \u002F1k tokens | $2.08 | 1 search or 400 tokens for each message, each participant talk every 15s |\n\n[Azure Container Apps](https:\u002F\u002Fazure.microsoft.com\u002Fen-us\u002Fpricing\u002Fdetails\u002Fcontainer-apps\u002F):\n\n| Region | Metric | Cost | Total (monthly $) | Note |\n|-|-|-|-|-|\n| Sweden Central | Serverless vCPU | $0.000024 \u002Fsec | $128.56 | Avg of 2 replicas with 1 vCPU |\n| Sweden Central | Serverless memory (average of 2 replicas) | $0.000003 \u002Fsec | $32.14 | Avg of 2 replicas with 2GB |\n\n[Azure AI Search](https:\u002F\u002Fazure.microsoft.com\u002Fen-us\u002Fpricing\u002Fdetails\u002Fsearch\u002F):\n\n| Region | Metric | Cost | Total (monthly $) | Note |\n|-|-|-|-|-|\n| Sweden Central | Basic | $73.73 
\u002Fmonth | $73.73 | Has 15GB of storage \u002Findex, should be upgraded for big datasets |\n\n[Azure AI Speech](https:\u002F\u002Fazure.microsoft.com\u002Fen-us\u002Fpricing\u002Fdetails\u002Fcognitive-services\u002Fspeech-services\u002F):\n\n| Region | Metric | Cost | Total (monthly $) | Note |\n|-|-|-|-|-|\n| West Europe | Speech-to-text real-time | $1 \u002Fhour | $83.33 | Each participant talks every 15s |\n| West Europe | Text-to-speech standard | $15 \u002F1M characters | $69.23 | 300 tokens for each response, 1.3 tokens \u002Fword in English, each participant talks every 15s |\n\n[Azure Cosmos DB](https:\u002F\u002Fazure.microsoft.com\u002Fen-us\u002Fpricing\u002Fdetails\u002Fcosmos-db\u002Fautoscale-provisioned\u002F):\n\n| Region | Metric | Cost | Total (monthly $) | Note |\n|-|-|-|-|-|\n| Sweden Central | Multi-region write RU\u002Fs \u002Fregion | $11.68 \u002F100 RU\u002Fs | $233.6 | Avg of 1k RU\u002Fs on 2 regions |\n| Sweden Central | Transactional storage | $0.25 \u002FGB | $0.5 | 2GB of storage, should be upgraded if more history is needed |\n\n**Not included above:**\n\n> [!NOTE]\n> Azure Monitor costs shouldn't be considered optional, as monitoring is a key part of maintaining a business-critical application and high-quality service for users.\n\nOptional costs total $343.02 \u002Fmonth, with the following breakdown:\n\n[Azure Communication Services](https:\u002F\u002Fazure.microsoft.com\u002Fen-us\u002Fpricing\u002Fdetails\u002Fcommunication-services\u002F):\n\n| Region | Metric | Cost | Total (monthly $) | Note |\n|-|-|-|-|-|\n| West Europe | Call recording | $0.002 \u002Fminute | $20 | |\n\n[Azure OpenAI](https:\u002F\u002Fazure.microsoft.com\u002Fen-us\u002Fpricing\u002Fdetails\u002Fcognitive-services\u002Fopenai-service\u002F):\n\n| Region | Metric | Cost | Total (monthly $) | Note |\n|-|-|-|-|-|\n| Sweden Central | text-embedding-3-large | $0.00013 \u002F1k tokens | $0.52 | 10k PDF pages with 400 tokens each, for indexing 
|\n\n[Azure Monitor](https:\u002F\u002Fazure.microsoft.com\u002Fen-us\u002Fpricing\u002Fdetails\u002Fmonitor\u002F):\n\n| Region | Metric | Cost | Total (monthly $) | Note |\n|-|-|-|-|-|\n| Sweden Central | Basic logs ingestion | $0.645 \u002FGB | $322.5 | 500GB of logs [with sampling enabled](https:\u002F\u002Flearn.microsoft.com\u002Fen-us\u002Fazure\u002Fazure-monitor\u002Fapp\u002Fopentelemetry-configuration?tabs=python#enable-sampling) |\n\n### What would it require to make it production ready?\n\nQuality:\n\n- [x] Unit and integration tests for persistence layer\n- [ ] Complete unit and integration test coverage\n\nReliability:\n\n- [x] Reproducible builds\n- [x] Traces and telemetry\n- [ ] Operation runbooks for common issues\n- [ ] Proper dashboarding in Azure Application Insights (deployed with the IaC)\n\nMaintainability:\n\n- [x] Automated and required static code checks\n- [ ] Decouple assistant from the insights in a separate service\n- [ ] Peer review to limit the bus factor\n\nResiliency:\n\n- [x] Infrastructure as Code (IaC)\n- [ ] Multi-region deployment\n- [ ] Reproducible performance tests\n\nSecurity:\n\n- [x] CI builds attestations\n- [x] CodeQL static code checks\n- [ ] GitOps for deployments\n- [ ] Private networking\n- [ ] Production SKUs allowing vNET integration\n- [ ] Red team exercises\n\nResponsible AI:\n\n- [x] Harmful content detection\n- [ ] Grounding detection with Content Safety\n- [ ] Social impact assessment\n\n### Why is no LLM framework used?\n\nAt the time of development, no LLM framework was available to handle all of these features: streaming capability with multi-tools, backup models on availability issues, callback mechanisms in the triggered tools. 
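So, the OpenAI SDK is used directly and some algorithms are implemented to handle reliability.\n\n## Related content\n\n- For a simple sample with Azure OpenAI `gpt-4o-realtime`, local deployment only, [see VoiceRAG](https:\u002F\u002Fgithub.com\u002FAzure-Samples\u002Faisearch-openai-rag-audio)\n- For an easier-to-use sample with Azure OpenAI `gpt-4o-realtime`, deployed on Azure, [see Realtime Call Center Solution Accelerator](https:\u002F\u002Fgithub.com\u002FAzure-Samples\u002Frealtime-call-center-accelerator)\n","Microsoft's Call Center AI project is an AI-powered call center solution built on Azure and OpenAI GPT. It lets an AI agent place a phone call through an API call, or lets users call the bot directly from the configured phone number. Its core features include real-time streamed conversations, resuming calls after disconnections, multi-language support, and customizable voice tones, which significantly improve the user experience. The project also uses advanced models such as gpt-4.1 to understand complex information, and follows RAG best practices when handling sensitive data to keep information secure. It additionally supports custom prompts, analysis of past conversations to improve model performance, and Redis caching for efficiency. It suits scenarios such as insurance, IT support, and customer service, and can be customized to specific needs within a few hours.",2,"2026-06-11 03:40:04","high_star"]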