@mastra/voice-cloudflare 0.12.1 → 0.12.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (121)

package/dist/docs/references/docs-agents-adding-voice.md CHANGED

@@ -20,7 +20,7 @@ export const agent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: `You are a helpful assistant with both STT and TTS capabilities.`,
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice,
 })
@@ -109,7 +109,7 @@ export const agent = new Agent({
   id: 'speech-to-speech-agent',
   name: 'Speech-to-Speech Agent',
   instructions: `You are a helpful assistant with speech-to-speech capabilities.`,
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   tools: {
     // Tools configured on Agent are passed to voice provider
     search,
@@ -132,6 +132,37 @@ agent.voice.send(microphoneStream)
 agent.voice.close()
 ```
+### Per-session voice for concurrent sessions
+A static `voice` instance is shared across every request. For one-shot text-to-speech this is fine, but realtime and speech-to-speech providers store one WebSocket, one set of tools, and one request context per instance. If you deploy a single agent that handles several live sessions at once, a shared instance lets one session overwrite another session's tools, instructions, and request context.
+To give each session its own voice, provide `voice` as a resolver. Mastra runs the resolver on every `getVoice()` call and returns a fresh, session-owned instance:
+```typescript
+import { Agent } from '@mastra/core/agent'
+import { OpenAIRealtimeVoice } from '@mastra/voice-openai-realtime'
+export const agent = new Agent({
+  id: 'support-line',
+  name: 'Support Line',
+  instructions: ({ requestContext }) => `Help user ${requestContext.get('user')}.`,
+  model: 'openai/gpt-5.5',
+  voice: ({ requestContext }) => new OpenAIRealtimeVoice({ apiKey: requestContext.get('apiKey') }),
+})
+// Each concurrent session resolves its own voice instance
+const voice = await agent.getVoice({ requestContext })
+await voice.connect()
+```
+When you use a resolver:
+- Each call to `getVoice()` returns a new instance, so concurrent sessions never share state.
+- Mastra does not add tools or instructions to a resolver instance. Configure those inside the resolver or on the provider.
+- You own the lifecycle of the returned instance, so call `disconnect()` or `close()` when the session ends.
+The `agent.voice` getter has no request context, so it throws when `voice` is a resolver. Use `agent.getVoice({ requestContext })` instead.
 ### Event System
 The realtime voice provider emits several events you can listen for:
@@ -209,7 +240,7 @@ export const convertToText = async (input: string | NodeJS.ReadableStream): Prom
 export const hybridVoiceAgent = new Agent({
   id: 'hybrid-voice-agent',
   name: 'Hybrid Voice Agent',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   instructions: 'You can speak and listen using different providers.',
   voice: new CompositeVoice({
     input: new OpenAIVoice(),
@@ -221,7 +252,7 @@ export const unifiedVoiceAgent = new Agent({
   id: 'unified-voice-agent',
   name: 'Unified Voice Agent',
   instructions: 'You are an agent with both STT and TTS capabilities.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new OpenAIVoice(),
 })
@@ -263,7 +294,7 @@ export const agent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: `You are a helpful assistant with both STT and TTS capabilities.`,
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   // Create a composite voice using OpenAI for listening and PlayAI for speaking
   voice: new CompositeVoice({
@@ -288,7 +319,7 @@ export const agent = new Agent({
   id: 'aisdk-voice-agent',
   name: 'AI SDK Voice Agent',
   instructions: `You are a helpful assistant with voice capabilities.`,
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   // Pass AI SDK models directly to CompositeVoice
   voice: new CompositeVoice({
@@ -327,23 +358,24 @@ For the complete list of supported AI SDK providers and their capabilities:
 Mastra supports multiple voice providers for text-to-speech (TTS) and speech-to-text (STT) capabilities:
-| Provider        | Package                         | Features                  | Reference                                                          |
-| --------------- | ------------------------------- | ------------------------- | ------------------------------------------------------------------ |
-| OpenAI          | `@mastra/voice-openai`          | TTS, STT                  | [Documentation](https://mastra.ai/reference/voice/openai)          |
-| OpenAI Realtime | `@mastra/voice-openai-realtime` | Realtime speech-to-speech | [Documentation](https://mastra.ai/reference/voice/openai-realtime) |
-| ElevenLabs      | `@mastra/voice-elevenlabs`      | High-quality TTS          | [Documentation](https://mastra.ai/reference/voice/elevenlabs)      |
-| PlayAI          | `@mastra/voice-playai`          | TTS                       | [Documentation](https://mastra.ai/reference/voice/playai)          |
-| Google          | `@mastra/voice-google`          | TTS, STT                  | [Documentation](https://mastra.ai/reference/voice/google)          |
-| Deepgram        | `@mastra/voice-deepgram`        | STT                       | [Documentation](https://mastra.ai/reference/voice/deepgram)        |
-| Murf            | `@mastra/voice-murf`            | TTS                       | [Documentation](https://mastra.ai/reference/voice/murf)            |
-| Speechify       | `@mastra/voice-speechify`       | TTS                       | [Documentation](https://mastra.ai/reference/voice/speechify)       |
-| Sarvam          | `@mastra/voice-sarvam`          | TTS, STT                  | [Documentation](https://mastra.ai/reference/voice/sarvam)          |
-| Azure           | `@mastra/voice-azure`           | TTS, STT                  | [Documentation](https://mastra.ai/reference/voice/mastra-voice)    |
-| Cloudflare      | `@mastra/voice-cloudflare`      | TTS                       | [Documentation](https://mastra.ai/reference/voice/mastra-voice)    |
+| Provider        | Package                         | Features                                  | Reference                                                          |
+| --------------- | ------------------------------- | ----------------------------------------- | ------------------------------------------------------------------ |
+| OpenAI          | `@mastra/voice-openai`          | TTS, STT                                  | [Documentation](https://mastra.ai/reference/voice/openai)          |
+| OpenAI Realtime | `@mastra/voice-openai-realtime` | Realtime speech-to-speech                 | [Documentation](https://mastra.ai/reference/voice/openai-realtime) |
+| AWS Nova Sonic  | `@mastra/voice-aws-nova-sonic`  | Realtime speech-to-speech via AWS Bedrock | [Documentation](https://mastra.ai/reference/voice/aws-nova-sonic)  |
+| ElevenLabs      | `@mastra/voice-elevenlabs`      | High-quality TTS                          | [Documentation](https://mastra.ai/reference/voice/elevenlabs)      |
+| PlayAI          | `@mastra/voice-playai`          | TTS                                       | [Documentation](https://mastra.ai/reference/voice/playai)          |
+| Google          | `@mastra/voice-google`          | TTS, STT                                  | [Documentation](https://mastra.ai/reference/voice/google)          |
+| Deepgram        | `@mastra/voice-deepgram`        | STT                                       | [Documentation](https://mastra.ai/reference/voice/deepgram)        |
+| Murf            | `@mastra/voice-murf`            | TTS                                       | [Documentation](https://mastra.ai/reference/voice/murf)            |
+| Speechify       | `@mastra/voice-speechify`       | TTS                                       | [Documentation](https://mastra.ai/reference/voice/speechify)       |
+| Sarvam          | `@mastra/voice-sarvam`          | TTS, STT                                  | [Documentation](https://mastra.ai/reference/voice/sarvam)          |
+| Azure           | `@mastra/voice-azure`           | TTS, STT                                  | [Documentation](https://mastra.ai/reference/voice/mastra-voice)    |
+| Cloudflare      | `@mastra/voice-cloudflare`      | TTS                                       | [Documentation](https://mastra.ai/reference/voice/mastra-voice)    |
 ## Next steps
-- [Voice API Reference](https://mastra.ai/reference/voice/mastra-voice) - Detailed API documentation for voice capabilities
-- [Text to Speech Examples](https://github.com/mastra-ai/voice-examples/tree/main/text-to-speech) - Interactive story generator and other TTS implementations
-- [Speech to Text Examples](https://github.com/mastra-ai/voice-examples/tree/main/speech-to-text) - Voice memo app and other STT implementations
-- [Speech to Speech Examples](https://github.com/mastra-ai/voice-examples/tree/main/speech-to-speech) - Real-time voice conversation with call analysis
+- [Voice API Reference](https://mastra.ai/reference/voice/mastra-voice): Detailed API documentation for voice capabilities
+- [Text to Speech Examples](https://github.com/mastra-ai/voice-examples/tree/main/text-to-speech): Interactive story generator and other TTS implementations
+- [Speech to Text Examples](https://github.com/mastra-ai/voice-examples/tree/main/speech-to-text): Voice memo app and other STT implementations
+- [Speech to Speech Examples](https://github.com/mastra-ai/voice-examples/tree/main/speech-to-speech): Real-time voice conversation with call analysis
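
The per-session voice section added in this file describes how a shared `voice` instance lets one session overwrite another, and how a resolver avoids that. The sketch below is a toy model of that behaviour, not Mastra's implementation: `ToyRealtimeVoice`, `ToyAgent`, and `SessionContext` are hypothetical stand-ins, `requestContext` is modelled as a plain `Map`, and `getVoice()` is synchronous here for brevity even though the documented call is awaited.

```typescript
// Toy model only: these types are hypothetical and do not come from @mastra/core.
type SessionContext = { requestContext: Map<string, string> }
type VoiceResolver = (ctx: SessionContext) => ToyRealtimeVoice

// Stands in for a realtime provider that keeps mutable state per instance.
class ToyRealtimeVoice {
  instructions = ''
  closed = false

  addInstructions(text: string): void {
    this.instructions = text
  }

  close(): void {
    this.closed = true
  }
}

class ToyAgent {
  private readonly voiceConfig: ToyRealtimeVoice | VoiceResolver

  constructor(voiceConfig: ToyRealtimeVoice | VoiceResolver) {
    this.voiceConfig = voiceConfig
  }

  // The getter has no request context, so it cannot run a resolver.
  get voice(): ToyRealtimeVoice {
    if (typeof this.voiceConfig === 'function') {
      throw new Error('voice is a resolver; use getVoice({ requestContext })')
    }
    return this.voiceConfig
  }

  // Static config returns the shared instance; a resolver runs on every call.
  getVoice(ctx: SessionContext): ToyRealtimeVoice {
    return typeof this.voiceConfig === 'function' ? this.voiceConfig(ctx) : this.voiceConfig
  }
}

const sessionA: SessionContext = { requestContext: new Map([['user', 'alice']]) }
const sessionB: SessionContext = { requestContext: new Map([['user', 'bob']]) }

// Shared instance: session B overwrites the instructions session A set.
const sharedAgent = new ToyAgent(new ToyRealtimeVoice())
const sharedA = sharedAgent.getVoice(sessionA)
sharedA.addInstructions('Help user alice.')
const sharedB = sharedAgent.getVoice(sessionB)
sharedB.addInstructions('Help user bob.')

// Resolver: each call builds a fresh instance configured from its own context.
const resolverAgent = new ToyAgent(({ requestContext }) => {
  const voice = new ToyRealtimeVoice()
  voice.addInstructions(`Help user ${requestContext.get('user')}.`)
  return voice
})
const voiceA = resolverAgent.getVoice(sessionA)
const voiceB = resolverAgent.getVoice(sessionB)

try {
  console.log(sharedA.instructions) // state set by session B
  console.log(voiceA.instructions, voiceB.instructions) // isolated per session
} finally {
  // The caller owns resolver-created instances, so it closes them.
  voiceA.close()
  voiceB.close()
}
```

In the shared case both sessions hold the same object, so the last writer wins. In the resolver case the instances are distinct, which is why the caller has to close each one when its session ends.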

package/dist/docs/references/docs-voice-overview.md CHANGED

@@ -16,7 +16,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new OpenAIVoice(),
 })
 ```
@@ -40,7 +40,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new OpenAIVoice(),
 })
@@ -68,7 +68,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new AzureVoice(),
 })
@@ -95,7 +95,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new ElevenLabsVoice(),
 })
@@ -122,7 +122,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new PlayAIVoice(),
 })
@@ -149,7 +149,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new GoogleVoice(),
 })
@@ -176,7 +176,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new CloudflareVoice(),
 })
@@ -203,7 +203,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new DeepgramVoice(),
 })
@@ -219,6 +219,33 @@ playAudio(audioStream)
 Visit the [Deepgram Voice Reference](https://mastra.ai/reference/voice/deepgram) for more information on the Deepgram voice provider.
+**Inworld**:
+```typescript
+import { Agent } from '@mastra/core/agent'
+import { InworldVoice } from '@mastra/voice-inworld'
+import { playAudio } from '@mastra/node-audio'
+const voiceAgent = new Agent({
+  id: 'voice-agent',
+  name: 'Voice Agent',
+  instructions: 'You are a voice assistant that can help users with their tasks.',
+  model: 'openai/gpt-5.5',
+  voice: new InworldVoice(),
+})
+const { text } = await voiceAgent.generate('What color is the sky?')
+// Convert text to speech to an Audio Stream
+const audioStream = await voiceAgent.voice.speak(text, {
+  speaker: 'Dennis', // Optional: specify a speaker
+})
+playAudio(audioStream)
+```
+Visit the [Inworld Voice Reference](https://mastra.ai/reference/voice/inworld) for more information on the Inworld voice provider.
 **Speechify**:
 ```typescript
@@ -230,7 +257,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new SpeechifyVoice(),
 })
@@ -257,7 +284,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new SarvamVoice(),
 })
@@ -265,7 +292,7 @@ const { text } = await voiceAgent.generate('What color is the sky?')
 // Convert text to speech to an Audio Stream
 const audioStream = await voiceAgent.voice.speak(text, {
-  speaker: 'default', // Optional: specify a speaker
+  speaker: 'shubh', // Optional: specify a bulbul:v3 speaker
 })
 playAudio(audioStream)
@@ -284,7 +311,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new MurfVoice(),
 })
@@ -319,7 +346,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new OpenAIVoice(),
 })
@@ -348,7 +375,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new AzureVoice(),
 })
@@ -376,7 +403,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new ElevenLabsVoice(),
 })
@@ -404,7 +431,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new GoogleVoice(),
 })
@@ -432,7 +459,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new CloudflareVoice(),
 })
@@ -460,7 +487,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new DeepgramVoice(),
 })
@@ -477,6 +504,34 @@ const { text } = await voiceAgent.generate(transcript)
 Visit the [Deepgram Voice Reference](https://mastra.ai/reference/voice/deepgram) for more information on the Deepgram voice provider.
+**Inworld**:
+```typescript
+import { Agent } from '@mastra/core/agent'
+import { InworldVoice } from '@mastra/voice-inworld'
+import { createReadStream } from 'fs'
+const voiceAgent = new Agent({
+  id: 'voice-agent',
+  name: 'Voice Agent',
+  instructions: 'You are a voice assistant that can help users with their tasks.',
+  model: 'openai/gpt-5.5',
+  voice: new InworldVoice(),
+})
+// Read a local audio file as a stream
+const audioStream = createReadStream('./how_can_i_help_you.mp3')
+// Convert audio to text
+const transcript = await voiceAgent.voice.listen(audioStream)
+console.log(`User said: ${transcript}`)
+// Generate a response based on the transcript
+const { text } = await voiceAgent.generate(transcript)
+```
+Visit the [Inworld Voice Reference](https://mastra.ai/reference/voice/inworld) for more information on the Inworld voice provider.
 **Sarvam**:
 ```typescript
@@ -488,7 +543,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new SarvamVoice(),
 })
@@ -520,7 +575,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new OpenAIRealtimeVoice(),
 })
@@ -550,7 +605,7 @@ const voiceAgent = new Agent({
   id: 'voice-agent',
   name: 'Voice Agent',
   instructions: 'You are a voice assistant that can help users with their tasks.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice: new GeminiLiveVoice({
     // Live API mode
     apiKey: process.env.GOOGLE_API_KEY,
@@ -588,6 +643,134 @@ await voiceAgent.voice.send(micStream)
 Visit the [Google Gemini Live Reference](https://mastra.ai/reference/voice/google-gemini-live) for more information on the Google Gemini Live voice provider.
+**AWS Nova Sonic**:
+```typescript
+import { Agent } from '@mastra/core/agent'
+import { playAudio, getMicrophoneStream } from '@mastra/node-audio'
+import { NovaSonicVoice } from '@mastra/voice-aws-nova-sonic'
+const voiceAgent = new Agent({
+  id: 'voice-agent',
+  name: 'Voice Agent',
+  instructions: 'You are a voice assistant that can help users with their tasks.',
+  model: 'openai/gpt-5.5',
+  voice: new NovaSonicVoice({
+    region: 'us-east-1',
+    speaker: 'matthew',
+    // Static credentials are optional. The default AWS credential
+    // provider chain is used when none are passed.
+  }),
+})
+// Connect before using speak/send
+await voiceAgent.voice.connect()
+// Listen for assistant audio (Int16Array PCM)
+voiceAgent.voice.on('speaking', ({ audioData }) => {
+  if (audioData) playAudio(audioData)
+})
+// Listen for transcribed text
+voiceAgent.voice.on('writing', ({ text, role }) => {
+  console.log(`${role}: ${text}`)
+})
+// Initiate the conversation
+await voiceAgent.voice.speak('How can I help you today?')
+// Send continuous audio from the microphone
+const micStream = getMicrophoneStream()
+await voiceAgent.voice.send(micStream)
+```
+Visit the [AWS Nova Sonic Reference](https://mastra.ai/reference/voice/aws-nova-sonic) for more information on the AWS Nova Sonic voice provider.
+**Inworld Realtime**:
+```typescript
+import { Agent } from '@mastra/core/agent'
+import { playAudio, getMicrophoneStream } from '@mastra/node-audio'
+import { InworldRealtimeVoice } from '@mastra/voice-inworld'
+const voiceAgent = new Agent({
+  id: 'voice-agent',
+  name: 'Voice Agent',
+  instructions: 'You are a voice assistant that can help users with their tasks.',
+  model: 'openai/gpt-5.5',
+  voice: new InworldRealtimeVoice({
+    apiKey: process.env.INWORLD_API_KEY,
+    model: 'inworld/models/gemma-4-26b-a4b-it',
+    speaker: 'Sarah',
+  }),
+})
+// Connect before using speak/send
+await voiceAgent.voice.connect()
+// Listen for agent audio (PCM stream)
+voiceAgent.voice.on('speaker', stream => {
+  playAudio(stream)
+})
+// Listen for text responses and transcriptions
+voiceAgent.voice.on('writing', ({ text, role }) => {
+  console.log(`${role}: ${text}`)
+})
+// Initiate the conversation
+await voiceAgent.voice.speak('How can I help you today?')
+// Send continuous audio from the microphone
+const micStream = getMicrophoneStream()
+await voiceAgent.voice.send(micStream)
+```
+Visit the [Inworld Realtime Reference](https://mastra.ai/reference/voice/inworld-realtime) for more information on the Inworld Realtime voice provider.
+**xAI**:
+```typescript
+import { Agent } from '@mastra/core/agent'
+import { playAudio, getMicrophoneStream } from '@mastra/node-audio'
+import { XAIRealtimeVoice } from '@mastra/voice-xai-realtime'
+const voiceAgent = new Agent({
+  id: 'voice-agent',
+  name: 'Voice Agent',
+  instructions: 'You are a voice assistant that can help users with their tasks.',
+  model: 'xai/grok-4.3',
+  voice: new XAIRealtimeVoice({
+    apiKey: process.env.XAI_API_KEY,
+    model: 'grok-voice-think-fast-1.0',
+    speaker: 'eve',
+    turnDetection: { type: 'server_vad' },
+  }),
+})
+// Connect before using speak/send
+await voiceAgent.voice.connect()
+// Listen for agent audio responses
+voiceAgent.voice.on('speaker', audioStream => {
+  playAudio(audioStream)
+})
+// Listen for text responses and transcriptions
+voiceAgent.voice.on('writing', ({ text, role }) => {
+  console.log(`${role}: ${text}`)
+})
+// Initiate the conversation
+await voiceAgent.voice.speak('How can I help you today?')
+// Send continuous audio from the microphone
+const micStream = getMicrophoneStream()
+await voiceAgent.voice.send(micStream)
+```
+Visit the [xAI Realtime Voice Reference](https://mastra.ai/reference/voice/xai-realtime) for more information on the xAI voice provider.
 ## Voice configuration
 Each voice provider can be configured with different models and options. Below are the detailed configuration options for all supported providers:
@@ -736,6 +919,34 @@ const voice = new DeepgramVoice({
 Visit the [Deepgram Voice Reference](https://mastra.ai/reference/voice/deepgram) for more information on the Deepgram voice provider.
+**Inworld**:
+```typescript
+// Inworld Voice Configuration
+const voice = new InworldVoice({
+  speechModel: {
+    name: 'inworld-tts-2',
+    apiKey: process.env.INWORLD_API_KEY,
+  },
+  listeningModel: {
+    name: 'groq/whisper-large-v3',
+    apiKey: process.env.INWORLD_API_KEY,
+  },
+  speaker: 'Dennis',
+  audioEncoding: 'MP3',
+  sampleRateHertz: 48000,
+  language: 'en-US',
+})
+// Per-call options: `deliveryMode` is honored only by `inworld-tts-2`.
+const audioStream = await voice.speak('Hello!', {
+  deliveryMode: 'BALANCED', // 'STABLE' | 'BALANCED' | 'CREATIVE'
+  language: 'en-US', // BCP-47 per-call override
+})
+```
+Visit the [Inworld Voice Reference](https://mastra.ai/reference/voice/inworld) for more information on the Inworld voice provider.
 **Speechify**:
 ```typescript
@@ -760,12 +971,15 @@ Visit the [Speechify Voice Reference](https://mastra.ai/reference/voice/speechif
 // Sarvam Voice Configuration
 const voice = new SarvamVoice({
   speechModel: {
-    name: 'sarvam-voice', // Example model name
+    model: 'bulbul:v3', // TTS model (bulbul:v2 or bulbul:v3)
     apiKey: process.env.SARVAM_API_KEY,
-    language: 'en-IN', // Language code
-    style: 'conversational', // Style setting
+    language: 'en-IN', // BCP-47 language code
   },
-  // Sarvam may not have a separate listening model
+  listeningModel: {
+    model: 'saarika:v2.5', // STT model (saarika:v2.5 or saaras:v3)
+    apiKey: process.env.SARVAM_API_KEY,
+  },
+  speaker: 'shubh', // Default bulbul:v3 speaker
 })
 ```
@@ -809,6 +1023,38 @@ const voice = new OpenAIRealtimeVoice({
 For more information on the OpenAI Realtime voice provider, refer to the [OpenAI Realtime Voice Reference](https://mastra.ai/reference/voice/openai-realtime).
+**xAI Realtime**:
+```typescript
+// xAI Realtime Voice Configuration
+const voice = new XAIRealtimeVoice({
+  apiKey: process.env.XAI_API_KEY,
+  model: 'grok-voice-think-fast-1.0',
+  speaker: 'eve',
+  instructions: 'You are a concise voice assistant.',
+  turnDetection: {
+    type: 'server_vad',
+    threshold: 0.85,
+    silence_duration_ms: 1000,
+    prefix_padding_ms: 333,
+  },
+  audio: {
+    input: { format: { type: 'audio/pcm', rate: 24000 } },
+    output: { format: { type: 'audio/pcm', rate: 24000 } },
+  },
+  serverTools: [
+    { type: 'web_search' },
+    {
+      type: 'mcp',
+      server_url: 'https://mcp.example.com/mcp',
+      server_label: 'business-tools',
+    },
+  ],
+})
+```
+Visit the [xAI Realtime Voice Reference](https://mastra.ai/reference/voice/xai-realtime) for more information on the xAI realtime voice provider.
 **Google Gemini Live**:
 ```typescript
@@ -825,6 +1071,48 @@ const voice = new GeminiLiveVoice({
 Visit the [Google Gemini Live Reference](https://mastra.ai/reference/voice/google-gemini-live) for more information on the Google Gemini Live voice provider.
+**AWS Nova Sonic**:
+```typescript
+// AWS Nova Sonic Voice Configuration
+const voice = new NovaSonicVoice({
+  region: 'us-east-1',
+  speaker: 'matthew',
+  sessionConfig: {
+    inferenceConfiguration: {
+      temperature: 0.7,
+      maxTokens: 1024,
+    },
+    turnDetectionConfiguration: {
+      endpointingSensitivity: 'MEDIUM',
+    },
+  },
+  // AWS Nova Sonic is a realtime bidirectional API without separate speech and listening models
+})
+```
+Visit the [AWS Nova Sonic Reference](https://mastra.ai/reference/voice/aws-nova-sonic) for more information on the AWS Nova Sonic voice provider.
+**Inworld Realtime**:
+```typescript
+// Inworld Realtime Voice Configuration
+const voice = new InworldRealtimeVoice({
+  apiKey: process.env.INWORLD_API_KEY,
+  model: 'inworld/models/gemma-4-26b-a4b-it',
+  speaker: 'Sarah',
+  // Typed Inworld realtime knobs (semantic VAD, playback speed, MCP tool routing, ...)
+  session: {
+    audio: {
+      output: { speed: 1.1 },
+      input: { turn_detection: { type: 'semantic_vad', eagerness: 'high' } },
+    },
+  },
+})
+```
+Visit the [Inworld Realtime Reference](https://mastra.ai/reference/voice/inworld-realtime) for more information on the Inworld Realtime voice provider.
 **AI SDK**:
 ```typescript
@@ -844,7 +1132,7 @@ const voiceAgent = new Agent({
   id: 'aisdk-voice-agent',
   name: 'AI SDK Voice Agent',
   instructions: 'You are a helpful assistant with voice capabilities.',
-  model: 'openai/gpt-5.4',
+  model: 'openai/gpt-5.5',
   voice,
 })
 ```
@@ -951,9 +1239,12 @@ For more information on the CompositeVoice, refer to the [CompositeVoice Referen
 - [MastraVoice](https://mastra.ai/reference/voice/mastra-voice)
 - [OpenAI Voice](https://mastra.ai/reference/voice/openai)
 - [OpenAI Realtime Voice](https://mastra.ai/reference/voice/openai-realtime)
+- [xAI Realtime Voice](https://mastra.ai/reference/voice/xai-realtime)
 - [Azure Voice](https://mastra.ai/reference/voice/azure)
 - [Google Voice](https://mastra.ai/reference/voice/google)
 - [Google Gemini Live Voice](https://mastra.ai/reference/voice/google-gemini-live)
+- [AWS Nova Sonic Voice](https://mastra.ai/reference/voice/aws-nova-sonic)
 - [Deepgram Voice](https://mastra.ai/reference/voice/deepgram)
+- [Inworld Voice](https://mastra.ai/reference/voice/inworld)
 - [PlayAI Voice](https://mastra.ai/reference/voice/playai)
 - [Voice Examples](https://github.com/mastra-ai/voice-examples)