[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"skill-db792002-124f-4edc-ac87-aeeeacaa7447":3,"$fa6TZwsGbU9l8QE_xqiNZMuO93ebmoWbgc4HVm2n3_-g":43},{"id":4,"title":5,"description":6,"categoryId":7,"moduleId":8,"tags":9,"prompt":10,"icon":11,"source":12,"sourceUrl":13,"authorId":14,"authorName":15,"isPublic":16,"stars":17,"runs":18,"createdAt":19,"updatedAt":19,"module":20,"category":27,"packages":34},"db792002-124f-4edc-ac87-aeeeacaa7447","azure-ai-voicelive-java","Azure AI VoiceLive SDK for Java. Real-time bidirectional voice conversations with AI assistants using WebSocket.","cat_coding_devops","mod_coding","sickn33,coding","---\nname: azure-ai-voicelive-java\ndescription: Azure AI VoiceLive SDK for Java. Real-time bidirectional voice conversations with AI assistants using WebSocket.\nrisk: unknown\nsource: community\ndate_added: '2026-02-27'\n---\n\n# Azure AI VoiceLive SDK for Java\n\nReal-time, bidirectional voice conversations with AI assistants using WebSocket technology.\n\n## Installation\n\n```xml\n\u003Cdependency>\n    \u003CgroupId>com.azure\u003C\u002FgroupId>\n    \u003CartifactId>azure-ai-voicelive\u003C\u002FartifactId>\n    \u003Cversion>1.0.0-beta.2\u003C\u002Fversion>\n\u003C\u002Fdependency>\n```\n\n## Environment Variables\n\n```bash\nAZURE_VOICELIVE_ENDPOINT=https:\u002F\u002F\u003Cresource>.openai.azure.com\u002F\nAZURE_VOICELIVE_API_KEY=\u003Cyour-api-key>\n```\n\n## Authentication\n\n### API Key\n\n```java\nimport com.azure.ai.voicelive.VoiceLiveAsyncClient;\nimport com.azure.ai.voicelive.VoiceLiveClientBuilder;\nimport com.azure.core.credential.AzureKeyCredential;\n\nVoiceLiveAsyncClient client = new VoiceLiveClientBuilder()\n    .endpoint(System.getenv(\"AZURE_VOICELIVE_ENDPOINT\"))\n    .credential(new AzureKeyCredential(System.getenv(\"AZURE_VOICELIVE_API_KEY\")))\n    .buildAsyncClient();\n```\n\n### DefaultAzureCredential (Recommended)\n\n```java\nimport com.azure.identity.DefaultAzureCredentialBuilder;\n\nVoiceLiveAsyncClient client = new VoiceLiveClientBuilder()\n    
.endpoint(System.getenv(\"AZURE_VOICELIVE_ENDPOINT\"))\n    .credential(new DefaultAzureCredentialBuilder().build())\n    .buildAsyncClient();\n```\n\n## Key Concepts\n\n| Concept | Description |\n|---------|-------------|\n| `VoiceLiveAsyncClient` | Main entry point for voice sessions |\n| `VoiceLiveSessionAsyncClient` | Active WebSocket connection for streaming |\n| `VoiceLiveSessionOptions` | Configuration for session behavior |\n\n### Audio Requirements\n\n- **Sample Rate**: 24kHz (24000 Hz)\n- **Bit Depth**: 16-bit PCM\n- **Channels**: Mono (1 channel)\n- **Format**: Signed PCM, little-endian\n\n## Core Workflow\n\n### 1. Start Session\n\n```java\nimport reactor.core.publisher.Mono;\n\nclient.startSession(\"gpt-4o-realtime-preview\")\n    .flatMap(session -> {\n        System.out.println(\"Session started\");\n        \n        \u002F\u002F Subscribe to events\n        session.receiveEvents()\n            .subscribe(\n                event -> System.out.println(\"Event: \" + event.getType()),\n                error -> System.err.println(\"Error: \" + error.getMessage())\n            );\n        \n        return Mono.just(session);\n    })\n    .block();\n```\n\n### 2. 
Configure Session Options\n\n```java\nimport com.azure.ai.voicelive.models.*;\nimport com.azure.core.util.BinaryData;\nimport java.util.Arrays;\n\nServerVadTurnDetection turnDetection = new ServerVadTurnDetection()\n    .setThreshold(0.5)                    \u002F\u002F Sensitivity (0.0-1.0)\n    .setPrefixPaddingMs(300)              \u002F\u002F Audio before speech\n    .setSilenceDurationMs(500)            \u002F\u002F Silence to end turn\n    .setInterruptResponse(true)           \u002F\u002F Allow interruptions\n    .setAutoTruncate(true)\n    .setCreateResponse(true);\n\nAudioInputTranscriptionOptions transcription = new AudioInputTranscriptionOptions(\n    AudioInputTranscriptionOptionsModel.WHISPER_1);\n\nVoiceLiveSessionOptions options = new VoiceLiveSessionOptions()\n    .setInstructions(\"You are a helpful AI voice assistant.\")\n    .setVoice(BinaryData.fromObject(new OpenAIVoice(OpenAIVoiceName.ALLOY)))\n    .setModalities(Arrays.asList(InteractionModality.TEXT, InteractionModality.AUDIO))\n    .setInputAudioFormat(InputAudioFormat.PCM16)\n    .setOutputAudioFormat(OutputAudioFormat.PCM16)\n    .setInputAudioSamplingRate(24000)\n    .setInputAudioNoiseReduction(new AudioNoiseReduction(AudioNoiseReductionType.NEAR_FIELD))\n    .setInputAudioEchoCancellation(new AudioEchoCancellation())\n    .setInputAudioTranscription(transcription)\n    .setTurnDetection(turnDetection);\n\n\u002F\u002F Send configuration\nClientEventSessionUpdate updateEvent = new ClientEventSessionUpdate(options);\nsession.sendEvent(updateEvent).subscribe();\n```\n\n### 3. Send Audio Input\n\n```java\nbyte[] audioData = readAudioChunk(); \u002F\u002F Your PCM16 audio data\nsession.sendInputAudio(BinaryData.fromBytes(audioData)).subscribe();\n```\n\n### 4. 
Handle Events\n\n```java\nsession.receiveEvents().subscribe(event -> {\n    ServerEventType eventType = event.getType();\n    \n    if (ServerEventType.SESSION_CREATED.equals(eventType)) {\n        System.out.println(\"Session created\");\n    } else if (ServerEventType.INPUT_AUDIO_BUFFER_SPEECH_STARTED.equals(eventType)) {\n        System.out.println(\"User started speaking\");\n    } else if (ServerEventType.INPUT_AUDIO_BUFFER_SPEECH_STOPPED.equals(eventType)) {\n        System.out.println(\"User stopped speaking\");\n    } else if (ServerEventType.RESPONSE_AUDIO_DELTA.equals(eventType)) {\n        if (event instanceof SessionUpdateResponseAudioDelta) {\n            SessionUpdateResponseAudioDelta audioEvent = (SessionUpdateResponseAudioDelta) event;\n            playAudioChunk(audioEvent.getDelta());\n        }\n    } else if (ServerEventType.RESPONSE_DONE.equals(eventType)) {\n        System.out.println(\"Response complete\");\n    } else if (ServerEventType.ERROR.equals(eventType)) {\n        if (event instanceof SessionUpdateError) {\n            SessionUpdateError errorEvent = (SessionUpdateError) event;\n            System.err.println(\"Error: \" + errorEvent.getError().getMessage());\n        }\n    }\n});\n```\n\n## Voice Configuration\n\n### OpenAI Voices\n\n```java\n\u002F\u002F Available: ALLOY, ASH, BALLAD, CORAL, ECHO, SAGE, SHIMMER, VERSE\nVoiceLiveSessionOptions options = new VoiceLiveSessionOptions()\n    .setVoice(BinaryData.fromObject(new OpenAIVoice(OpenAIVoiceName.ALLOY)));\n```\n\n### Azure Voices\n\n```java\n\u002F\u002F Azure Standard Voice\noptions.setVoice(BinaryData.fromObject(new AzureStandardVoice(\"en-US-JennyNeural\")));\n\n\u002F\u002F Azure Custom Voice\noptions.setVoice(BinaryData.fromObject(new AzureCustomVoice(\"myVoice\", \"endpointId\")));\n\n\u002F\u002F Azure Personal Voice\noptions.setVoice(BinaryData.fromObject(\n    new AzurePersonalVoice(\"speakerProfileId\", PersonalVoiceModels.PHOENIX_LATEST_NEURAL)));\n```\n\n## 
Function Calling\n\n```java\n\u002F\u002F parametersSchema is assumed to be defined elsewhere: a JSON Schema object describing the function's arguments\nVoiceLiveFunctionDefinition weatherFunction = new VoiceLiveFunctionDefinition(\"get_weather\")\n    .setDescription(\"Get current weather for a location\")\n    .setParameters(BinaryData.fromObject(parametersSchema));\n\nVoiceLiveSessionOptions options = new VoiceLiveSessionOptions()\n    .setTools(Arrays.asList(weatherFunction))\n    .setInstructions(\"You have access to weather information.\");\n```\n\n## Best Practices\n\n1. **Use async client**: VoiceLive requires reactive patterns\n2. **Configure turn detection** for natural conversation flow\n3. **Enable noise reduction** for better speech recognition\n4. **Handle interruptions** gracefully with `setInterruptResponse(true)`\n5. **Use Whisper transcription** for input audio transcription\n6. **Close sessions** properly when the conversation ends\n\n## Error Handling\n\n```java\nimport reactor.core.publisher.Flux;\n\nsession.receiveEvents()\n    .doOnError(error -> System.err.println(\"Connection error: \" + error.getMessage()))\n    .onErrorResume(error -> {\n        \u002F\u002F Attempt reconnection or cleanup\n        return Flux.empty();\n    })\n    .subscribe();\n```\n\n## Reference Links\n\n| Resource | URL |\n|----------|-----|\n| GitHub Source | https:\u002F\u002Fgithub.com\u002FAzure\u002Fazure-sdk-for-java\u002Ftree\u002Fmain\u002Fsdk\u002Fai\u002Fazure-ai-voicelive |\n| Samples | https:\u002F\u002Fgithub.com\u002FAzure\u002Fazure-sdk-for-java\u002Ftree\u002Fmain\u002Fsdk\u002Fai\u002Fazure-ai-voicelive\u002Fsrc\u002Fsamples |\n\n## When to Use\nUse this skill when the task involves the workflows or actions described above.\n\n## Limitations\n- Use this skill only when the task clearly matches the scope described above.\n- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.\n- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are 
missing.\n","","imported","https:\u002F\u002Fgithub.com\u002Fsickn33\u002Fantigravity-awesome-skills","user_system_seed","SkillOPIC",true,173,1660,"2026-05-16 13:05:39",{"id":8,"name":21,"slug":22,"icon":23,"description":24,"sort":25,"createdAt":26},"Programming","coding","mdi-code-braces","Code generation, debugging, and review to improve development efficiency",2,"2026-05-16 12:53:40",{"id":7,"name":28,"slug":29,"icon":30,"description":31,"moduleId":8,"sort":32,"skillCount":33,"createdAt":26},"DevOps","devops","mdi-cog-outline","CI\u002FCD, containerization, deployment, and operations",3,162,[35],{"id":36,"skillId":4,"version":37,"fileName":38,"fileSize":39,"filePath":40,"fileHash":41,"manifest":42,"createdAt":19},"9a18fd89-2222-4f74-abb3-b68d29f8c90e","1.0.0","azure-ai-voicelive-java.zip",2834,"uploads\u002Fskills\u002Fdb792002-124f-4edc-ac87-aeeeacaa7447\u002Fazure-ai-voicelive-java.zip","53f2f066b32b110e28e2290302be7c17453f94d0db9f7585b007f45ff90c6d49","[{\"path\":\"SKILL.md\",\"isDirectory\":false,\"size\":7669}]",{"code":44,"message":45,"data":46},200,"success",{"items":47,"stats":48,"page":51},[],{"averageRating":49,"totalRatings":49,"ratingCounts":50},0,[49,49,49,49,49],{"limit":52,"offset":49,"hasMore":53,"nextOffset":52,"ratedOnly":16},15,false]