[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"skill-d15e42e5-7d08-435a-af03-3bfbddc5db69":3,"$fPDOc9aLxRBQipKz1r8szRa3ubZz7Juts69ASsmfnAzs":43},{"id":4,"title":5,"description":6,"categoryId":7,"moduleId":8,"tags":9,"prompt":10,"icon":11,"source":12,"sourceUrl":13,"authorId":14,"authorName":15,"isPublic":16,"stars":17,"runs":18,"createdAt":19,"updatedAt":19,"module":20,"category":27,"packages":34},"d15e42e5-7d08-435a-af03-3bfbddc5db69","azure-ai-voicelive-ts","Azure AI Voice Live SDK (JavaScript\u002FTypeScript). Build real-time voice AI applications with bidirectional WebSocket communication.","cat_coding_devops","mod_coding","sickn33,coding","---\nname: azure-ai-voicelive-ts\ndescription: Azure AI Voice Live SDK for JavaScript\u002FTypeScript. Build real-time voice AI applications with bidirectional WebSocket communication.\nrisk: unknown\nsource: community\ndate_added: '2026-02-27'\n---\n\n# @azure\u002Fai-voicelive (JavaScript\u002FTypeScript)\n\nReal-time voice AI SDK for building bidirectional voice assistants with Azure AI in Node.js and browser environments.\n\n## Installation\n\n```bash\nnpm install @azure\u002Fai-voicelive @azure\u002Fidentity\n# TypeScript users\nnpm install @types\u002Fnode\n```\n\n**Current Version**: 1.0.0-beta.3\n\n**Supported Environments**:\n- Node.js LTS versions (20+)\n- Modern browsers (Chrome, Firefox, Safari, Edge)\n\n## Environment Variables\n\n```bash\nAZURE_VOICELIVE_ENDPOINT=https:\u002F\u002F\u003Cresource>.cognitiveservices.azure.com\n# Optional: API key if not using Entra ID\nAZURE_VOICELIVE_API_KEY=\u003Cyour-api-key>\n# Optional: Logging\nAZURE_LOG_LEVEL=info\n```\n\n## Authentication\n\n### Microsoft Entra ID (Recommended)\n\n```typescript\nimport { DefaultAzureCredential } from \"@azure\u002Fidentity\";\nimport { VoiceLiveClient } from \"@azure\u002Fai-voicelive\";\n\nconst credential = new DefaultAzureCredential();\nconst endpoint = \"https:\u002F\u002Fyour-resource.cognitiveservices.azure.com\";\n\nconst client = new VoiceLiveClient(endpoint, credential);\n```\n\n### 
API Key\n\n```typescript\nimport { AzureKeyCredential } from \"@azure\u002Fcore-auth\";\nimport { VoiceLiveClient } from \"@azure\u002Fai-voicelive\";\n\nconst endpoint = \"https:\u002F\u002Fyour-resource.cognitiveservices.azure.com\";\nconst credential = new AzureKeyCredential(\"your-api-key\");\n\nconst client = new VoiceLiveClient(endpoint, credential);\n```\n\n## Client Hierarchy\n\n```\nVoiceLiveClient\n└── VoiceLiveSession (WebSocket connection)\n    ├── updateSession()      → Configure session options\n    ├── subscribe()          → Event handlers (Azure SDK pattern)\n    ├── sendAudio()          → Stream audio input\n    ├── addConversationItem() → Add messages\u002Ffunction outputs\n    └── sendEvent()          → Send raw protocol events\n```\n\n## Quick Start\n\n```typescript\nimport { DefaultAzureCredential } from \"@azure\u002Fidentity\";\nimport { VoiceLiveClient } from \"@azure\u002Fai-voicelive\";\n\nconst credential = new DefaultAzureCredential();\nconst endpoint = process.env.AZURE_VOICELIVE_ENDPOINT!;\n\n\u002F\u002F Create client and start session\nconst client = new VoiceLiveClient(endpoint, credential);\nconst session = await client.startSession(\"gpt-4o-mini-realtime-preview\");\n\n\u002F\u002F Configure session\nawait session.updateSession({\n  modalities: [\"text\", \"audio\"],\n  instructions: \"You are a helpful AI assistant. 
Respond naturally.\",\n  voice: {\n    type: \"azure-standard\",\n    name: \"en-US-AvaNeural\",\n  },\n  turnDetection: {\n    type: \"server_vad\",\n    threshold: 0.5,\n    prefixPaddingMs: 300,\n    silenceDurationMs: 500,\n  },\n  inputAudioFormat: \"pcm16\",\n  outputAudioFormat: \"pcm16\",\n});\n\n\u002F\u002F Subscribe to events\nconst subscription = session.subscribe({\n  onResponseAudioDelta: async (event, context) => {\n    \u002F\u002F Handle streaming audio output\n    const audioData = event.delta;\n    playAudioChunk(audioData);\n  },\n  onResponseTextDelta: async (event, context) => {\n    \u002F\u002F Handle streaming text\n    process.stdout.write(event.delta);\n  },\n  onInputAudioTranscriptionCompleted: async (event, context) => {\n    console.log(\"User said:\", event.transcript);\n  },\n});\n\n\u002F\u002F Send audio from microphone\nfunction sendAudioChunk(audioBuffer: ArrayBuffer) {\n  session.sendAudio(audioBuffer);\n}\n```\n\n## Session Configuration\n\n```typescript\nawait session.updateSession({\n  \u002F\u002F Modalities\n  modalities: [\"audio\", \"text\"],\n  \n  \u002F\u002F System instructions\n  instructions: \"You are a customer service representative.\",\n  \n  \u002F\u002F Voice selection\n  voice: {\n    type: \"azure-standard\",  \u002F\u002F or \"azure-custom\", \"openai\"\n    name: \"en-US-AvaNeural\",\n  },\n  \n  \u002F\u002F Turn detection (VAD)\n  turnDetection: {\n    type: \"server_vad\",      \u002F\u002F or \"azure_semantic_vad\"\n    threshold: 0.5,\n    prefixPaddingMs: 300,\n    silenceDurationMs: 500,\n  },\n  \n  \u002F\u002F Audio formats\n  inputAudioFormat: \"pcm16\",\n  outputAudioFormat: \"pcm16\",\n  \n  \u002F\u002F Tools (function calling)\n  tools: [\n    {\n      type: \"function\",\n      name: \"get_weather\",\n      description: \"Get current weather\",\n      parameters: {\n        type: \"object\",\n        properties: {\n          location: { type: \"string\" }\n        },\n        required: 
[\"location\"]\n      }\n    }\n  ],\n  toolChoice: \"auto\",\n});\n```\n\n## Event Handling (Azure SDK Pattern)\n\nThe SDK uses a subscription-based event handling pattern:\n\n```typescript\nconst subscription = session.subscribe({\n  \u002F\u002F Connection lifecycle\n  onConnected: async (args, context) => {\n    console.log(\"Connected:\", args.connectionId);\n  },\n  onDisconnected: async (args, context) => {\n    console.log(\"Disconnected:\", args.code, args.reason);\n  },\n  onError: async (args, context) => {\n    console.error(\"Error:\", args.error.message);\n  },\n  \n  \u002F\u002F Session events\n  onSessionCreated: async (event, context) => {\n    console.log(\"Session created:\", context.sessionId);\n  },\n  onSessionUpdated: async (event, context) => {\n    console.log(\"Session updated\");\n  },\n  \n  \u002F\u002F Audio input events (VAD)\n  onInputAudioBufferSpeechStarted: async (event, context) => {\n    console.log(\"Speech started at:\", event.audioStartMs);\n  },\n  onInputAudioBufferSpeechStopped: async (event, context) => {\n    console.log(\"Speech stopped at:\", event.audioEndMs);\n  },\n  \n  \u002F\u002F Transcription events\n  onConversationItemInputAudioTranscriptionCompleted: async (event, context) => {\n    console.log(\"User said:\", event.transcript);\n  },\n  onConversationItemInputAudioTranscriptionDelta: async (event, context) => {\n    process.stdout.write(event.delta);\n  },\n  \n  \u002F\u002F Response events\n  onResponseCreated: async (event, context) => {\n    console.log(\"Response started\");\n  },\n  onResponseDone: async (event, context) => {\n    console.log(\"Response complete\");\n  },\n  \n  \u002F\u002F Streaming text\n  onResponseTextDelta: async (event, context) => {\n    process.stdout.write(event.delta);\n  },\n  onResponseTextDone: async (event, context) => {\n    console.log(\"\\n--- Text complete ---\");\n  },\n  \n  \u002F\u002F Streaming audio\n  onResponseAudioDelta: async (event, context) => {\n    
const audioData = event.delta;\n    playAudioChunk(audioData);\n  },\n  onResponseAudioDone: async (event, context) => {\n    console.log(\"Audio complete\");\n  },\n  \n  \u002F\u002F Audio transcript (what assistant said)\n  onResponseAudioTranscriptDelta: async (event, context) => {\n    process.stdout.write(event.delta);\n  },\n  \n  \u002F\u002F Function calling\n  onResponseFunctionCallArgumentsDone: async (event, context) => {\n    if (event.name === \"get_weather\") {\n      const args = JSON.parse(event.arguments);\n      const result = await getWeather(args.location);\n      \n      await session.addConversationItem({\n        type: \"function_call_output\",\n        callId: event.callId,\n        output: JSON.stringify(result),\n      });\n      \n      await session.sendEvent({ type: \"response.create\" });\n    }\n  },\n  \n  \u002F\u002F Catch-all for debugging\n  onServerEvent: async (event, context) => {\n    console.log(\"Event:\", event.type);\n  },\n});\n\n\u002F\u002F Clean up when done\nawait subscription.close();\n```\n\n## Function Calling\n\n```typescript\n\u002F\u002F Define tools in session config\nawait session.updateSession({\n  modalities: [\"audio\", \"text\"],\n  instructions: \"Help users with weather information.\",\n  tools: [\n    {\n      type: \"function\",\n      name: \"get_weather\",\n      description: \"Get current weather for a location\",\n      parameters: {\n        type: \"object\",\n        properties: {\n          location: {\n            type: \"string\",\n            description: \"City and state or country\",\n          },\n        },\n        required: [\"location\"],\n      },\n    },\n  ],\n  toolChoice: \"auto\",\n});\n\n\u002F\u002F Handle function calls\nconst subscription = session.subscribe({\n  onResponseFunctionCallArgumentsDone: async (event, context) => {\n    if (event.name === \"get_weather\") {\n      const args = JSON.parse(event.arguments);\n      const weatherData = await 
fetchWeather(args.location);\n      \n      \u002F\u002F Send function result\n      await session.addConversationItem({\n        type: \"function_call_output\",\n        callId: event.callId,\n        output: JSON.stringify(weatherData),\n      });\n      \n      \u002F\u002F Trigger response generation\n      await session.sendEvent({ type: \"response.create\" });\n    }\n  },\n});\n```\n\n## Voice Options\n\n| Voice Type | Config | Example |\n|------------|--------|---------|\n| Azure Standard | `{ type: \"azure-standard\", name: \"...\" }` | `\"en-US-AvaNeural\"` |\n| Azure Custom | `{ type: \"azure-custom\", name: \"...\", endpointId: \"...\" }` | Custom voice endpoint |\n| Azure Personal | `{ type: \"azure-personal\", speakerProfileId: \"...\" }` | Personal voice clone |\n| OpenAI | `{ type: \"openai\", name: \"...\" }` | `\"alloy\"`, `\"echo\"`, `\"shimmer\"` |\n\n## Supported Models\n\n| Model | Description | Use Case |\n|-------|-------------|----------|\n| `gpt-4o-realtime-preview` | GPT-4o with real-time audio | High-quality conversational AI |\n| `gpt-4o-mini-realtime-preview` | Lightweight GPT-4o | Fast, efficient interactions |\n| `phi4-mm-realtime` | Phi multimodal | Cost-effective applications |\n\n## Turn Detection Options\n\n```typescript\n\u002F\u002F Server VAD (default)\nturnDetection: {\n  type: \"server_vad\",\n  threshold: 0.5,\n  prefixPaddingMs: 300,\n  silenceDurationMs: 500,\n}\n\n\u002F\u002F Azure Semantic VAD (smarter detection)\nturnDetection: {\n  type: \"azure_semantic_vad\",\n}\n\n\u002F\u002F Azure Semantic VAD (English optimized)\nturnDetection: {\n  type: \"azure_semantic_vad_en\",\n}\n\n\u002F\u002F Azure Semantic VAD (Multilingual)\nturnDetection: {\n  type: \"azure_semantic_vad_multilingual\",\n}\n```\n\n## Audio Formats\n\n| Format | Sample Rate | Use Case |\n|--------|-------------|----------|\n| `pcm16` | 24kHz | Default, high quality |\n| `pcm16-8000hz` | 8kHz | Telephony |\n| `pcm16-16000hz` | 16kHz | Voice assistants 
|\n| `g711_ulaw` | 8kHz | Telephony (US) |\n| `g711_alaw` | 8kHz | Telephony (EU) |\n\n## Key Types Reference\n\n| Type | Purpose |\n|------|---------|\n| `VoiceLiveClient` | Main client for creating sessions |\n| `VoiceLiveSession` | Active WebSocket session |\n| `VoiceLiveSessionHandlers` | Event handler interface |\n| `VoiceLiveSubscription` | Active event subscription |\n| `ConnectionContext` | Context for connection events |\n| `SessionContext` | Context for session events |\n| `ServerEventUnion` | Union of all server events |\n\n## Error Handling\n\n```typescript\nimport {\n  VoiceLiveError,\n  VoiceLiveConnectionError,\n  VoiceLiveAuthenticationError,\n  VoiceLiveProtocolError,\n} from \"@azure\u002Fai-voicelive\";\n\nconst subscription = session.subscribe({\n  onError: async (args, context) => {\n    const { error } = args;\n    \n    if (error instanceof VoiceLiveConnectionError) {\n      console.error(\"Connection error:\", error.message);\n    } else if (error instanceof VoiceLiveAuthenticationError) {\n      console.error(\"Auth error:\", error.message);\n    } else if (error instanceof VoiceLiveProtocolError) {\n      console.error(\"Protocol error:\", error.message);\n    } else if (error instanceof VoiceLiveError) {\n      \u002F\u002F Any other SDK error not matched above\n      console.error(\"Voice Live error:\", error.message);\n    }\n  },\n  \n  onServerError: async (event, context) => {\n    console.error(\"Server error:\", event.error?.message);\n  },\n});\n```\n\n## Logging\n\n```typescript\nimport { setLogLevel } from \"@azure\u002Flogger\";\n\n\u002F\u002F Enable info-level logging\nsetLogLevel(\"info\");\n\n\u002F\u002F Or via environment variable\n\u002F\u002F AZURE_LOG_LEVEL=info\n```\n\n## Browser Usage\n\n```typescript\n\u002F\u002F Browser requires bundler (Vite, webpack, etc.)\nimport { VoiceLiveClient } from \"@azure\u002Fai-voicelive\";\nimport { InteractiveBrowserCredential } from \"@azure\u002Fidentity\";\n\n\u002F\u002F Use browser-compatible credential\nconst credential = new InteractiveBrowserCredential({\n  clientId: \"your-client-id\",\n  tenantId: \"your-tenant-id\",\n});\n\nconst endpoint = \"https:\u002F\u002Fyour-resource.cognitiveservices.azure.com\";\nconst client = 
new VoiceLiveClient(endpoint, credential);\n\n\u002F\u002F Request microphone access\nconst stream = await navigator.mediaDevices.getUserMedia({ audio: true });\nconst audioContext = new AudioContext({ sampleRate: 24000 });\n\n\u002F\u002F Process audio and send to session\n\u002F\u002F ... (see samples for full implementation)\n```\n\n## Best Practices\n\n1. **Prefer Microsoft Entra ID credentials such as `DefaultAzureCredential`** — Never hardcode API keys\n2. **Set both modalities** — Include `[\"text\", \"audio\"]` for voice assistants\n3. **Use Azure Semantic VAD** — Better turn detection than basic server VAD\n4. **Handle all error types** — Connection, auth, and protocol errors\n5. **Clean up subscriptions** — Call `subscription.close()` when done\n6. **Use an appropriate audio format** — PCM16 at 24kHz for best quality\n\n## Reference Links\n\n| Resource | URL |\n|----------|-----|\n| npm Package | https:\u002F\u002Fwww.npmjs.com\u002Fpackage\u002F@azure\u002Fai-voicelive |\n| GitHub Source | https:\u002F\u002Fgithub.com\u002FAzure\u002Fazure-sdk-for-js\u002Ftree\u002Fmain\u002Fsdk\u002Fai\u002Fai-voicelive |\n| Samples | https:\u002F\u002Fgithub.com\u002FAzure\u002Fazure-sdk-for-js\u002Ftree\u002Fmain\u002Fsdk\u002Fai\u002Fai-voicelive\u002Fsamples |\n| API Reference | https:\u002F\u002Flearn.microsoft.com\u002Fjavascript\u002Fapi\u002F@azure\u002Fai-voicelive |\n\n## When to Use\n\nUse this skill when building real-time voice AI applications with `@azure\u002Fai-voicelive` in JavaScript or TypeScript.\n\n## Limitations\n- Use this skill only when the task clearly matches the scope described above.\n- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.\n- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.\n","","imported","https:\u002F\u002Fgithub.com\u002Fsickn33\u002Fantigravity-awesome-skills","user_system_seed","SkillOPIC",true,100,1739,"2026-05-16 
13:05:41",{"id":8,"name":21,"slug":22,"icon":23,"description":24,"sort":25,"createdAt":26},"Programming & Development","coding","mdi-code-braces","Code generation, debugging, and review to improve development efficiency",2,"2026-05-16 12:53:40",{"id":7,"name":28,"slug":29,"icon":30,"description":31,"moduleId":8,"sort":32,"skillCount":33,"createdAt":26},"DevOps","devops","mdi-cog-outline","CI\u002FCD, containerization, deployment, and operations",3,162,[35],{"id":36,"skillId":4,"version":37,"fileName":38,"fileSize":39,"filePath":40,"fileHash":41,"manifest":42,"createdAt":19},"6e9ef427-6a68-42c5-87bf-a4a2da8632d2","1.0.0","azure-ai-voicelive-ts.zip",4114,"uploads\u002Fskills\u002Fd15e42e5-7d08-435a-af03-3bfbddc5db69\u002Fazure-ai-voicelive-ts.zip","b27bb1b90cb42b4d56e56ae41586ce2dee6dfa7ad374ecded66dce4024ef26c2","[{\"path\":\"SKILL.md\",\"isDirectory\":false,\"size\":13230}]",{"code":44,"message":45,"data":46},200,"success",{"items":47,"stats":48,"page":51},[],{"averageRating":49,"totalRatings":49,"ratingCounts":50},0,[49,49,49,49,49],{"limit":52,"offset":49,"hasMore":53,"nextOffset":52,"ratedOnly":16},15,false]