@mastra/voice-aws-nova-sonic 0.0.0-studio-cli-20260504022012

package/CHANGELOG.md ADDED
@@ -0,0 +1,51 @@
1
+ # @mastra/voice-aws-nova-sonic
2
+
3
+ ## 0.0.0-studio-cli-20260504022012
4
+
5
+ ### Patch Changes
6
+
7
+ - Updated dependencies [[`6dcd65f`](https://github.com/mastra-ai/mastra/commit/6dcd65f2a34069e6dc43ba35f1d11119b9b40bef), [`c05c9a1`](https://github.com/mastra-ai/mastra/commit/c05c9a13230988cef6d438a62f37760f31927bc7), [`e24aacb`](https://github.com/mastra-ai/mastra/commit/e24aacba07bd66f5d95b636dc24016fca26b52cf), [`1c2dda8`](https://github.com/mastra-ai/mastra/commit/1c2dda805fbfccc0abf55d4cb20cc34402dc3f0c), [`c721164`](https://github.com/mastra-ai/mastra/commit/c7211643f7ac861f83b19a3757cc921487fc9d75), [`1b55954`](https://github.com/mastra-ai/mastra/commit/1b559541c1e08a10e49d01ffc51a634dfc37a286), [`5adc55e`](https://github.com/mastra-ai/mastra/commit/5adc55e63407be8ee977914957d68bcc2a075ceb), [`70017d7`](https://github.com/mastra-ai/mastra/commit/70017d72ab741b5d7040e2a15c251a317782e39e), [`e4942bc`](https://github.com/mastra-ai/mastra/commit/e4942bc7fdc903572f7d84f26d5e15f9d39c763d)]:
8
+ - @mastra/core@0.0.0-studio-cli-20260504022012
9
+
10
+ ## 0.1.0
11
+
12
+ ### Minor Changes
13
+
14
+ - Add new `@mastra/voice-aws-nova-sonic` voice provider for AWS Bedrock Nova 2 Sonic. ([#13232](https://github.com/mastra-ai/mastra/pull/13232))
15
+
16
+ The provider exposes a real-time bidirectional voice interface backed by the
17
+ `InvokeModelWithBidirectionalStreamCommand` API on AWS Bedrock, including:
18
+ - Live microphone streaming (`send` / `listen`) and assistant audio playback
19
+ via `speaking` events
20
+ - Live transcription via `writing` events with `SPECULATIVE` / `FINAL`
21
+ generation stages
22
+ - Barge-in / interrupt detection
23
+ - Speaker selection across all 18 Nova Sonic voices and configurable
24
+ endpointing sensitivity
25
+ - Tool calling with per-session `RequestContext`
26
+ - Configurable AWS region, model id, credentials (or default credential
27
+ provider chain), and inference / turn-detection parameters
28
+
29
+ ### Patch Changes
30
+
31
+ - Updated dependencies [[`1723e09`](https://github.com/mastra-ai/mastra/commit/1723e099829892419ddbfe49287acfeac2522724), [`629f9e9`](https://github.com/mastra-ai/mastra/commit/629f9e9a7e56aa8f129515a3923c5813298790c7), [`25168fb`](https://github.com/mastra-ai/mastra/commit/25168fb9c1de9db7f8171df4f58ceb842c53aa29), [`ab34b5a`](https://github.com/mastra-ai/mastra/commit/ab34b5a2191b8e4353df1dbf7b9155e7d6628d79), [`5fb6c2a`](https://github.com/mastra-ai/mastra/commit/5fb6c2a95c1843cc231704b91354311fc1f34a71), [`2b0f355`](https://github.com/mastra-ai/mastra/commit/2b0f3553be3e9e5524da539a66e5cf82668440a4), [`394f0cf`](https://github.com/mastra-ai/mastra/commit/394f0cfc31e6b4d801219fdef2e9cc69e5bc8682), [`b2deb29`](https://github.com/mastra-ai/mastra/commit/b2deb29412b300c868655b5840463614fbb7962d), [`66644be`](https://github.com/mastra-ai/mastra/commit/66644beac1aa560f0e417956ff007c89341dc382), [`e109607`](https://github.com/mastra-ai/mastra/commit/e10960749251e34d46b480a20648c490fd30381b), [`310b953`](https://github.com/mastra-ai/mastra/commit/310b95345f302dcd5ba3ed862bdc96f059d44122), [`3d7f709`](https://github.com/mastra-ai/mastra/commit/3d7f709b615e588050bb6283c4ee5cfe2978cbde), [`48a42f1`](https://github.com/mastra-ai/mastra/commit/48a42f114a4006a95e0b7a1b5ad1a24815a175c2), [`8091c7c`](https://github.com/mastra-ai/mastra/commit/8091c7c944d15e13fef6d61b6cfd903f158d4006), [`2c83efc`](https://github.com/mastra-ai/mastra/commit/2c83efc4482b3efe50830e3b8b4ba9a8d219edff), [`43f0e1d`](https://github.com/mastra-ai/mastra/commit/43f0e1d5d5a74ba6fc746f2ad89ebe0c64777a7d), [`da0b9e2`](https://github.com/mastra-ai/mastra/commit/da0b9e2ba7ecc560213b426d6c097fe63946086e), [`282a10c`](https://github.com/mastra-ai/mastra/commit/282a10c9446e9922afe80e10e3770481c8ac8a28), [`04151c7`](https://github.com/mastra-ai/mastra/commit/04151c7dcea934b4fe9076708a23fac161195414), [`8091c7c`](https://github.com/mastra-ai/mastra/commit/8091c7c944d15e13fef6d61b6cfd903f158d4006)]:
32
+ - @mastra/core@1.31.0
33
+
34
+ ## 0.1.0-alpha.0
35
+
36
+ ### Minor Changes
37
+
38
+ - Add new `@mastra/voice-aws-nova-sonic` voice provider for AWS Bedrock Nova 2 Sonic. ([#13232](https://github.com/mastra-ai/mastra/pull/13232))
39
+
40
+ The provider exposes a real-time bidirectional voice interface backed by the
41
+ `InvokeModelWithBidirectionalStreamCommand` API on AWS Bedrock, including:
42
+ - Live microphone streaming (`send` / `listen`) and assistant audio playback
43
+ via `speaking` events
44
+ - Live transcription via `writing` events with `SPECULATIVE` / `FINAL`
45
+ generation stages
46
+ - Barge-in / interrupt detection
47
+ - Speaker selection across all 18 Nova Sonic voices and configurable
48
+ endpointing sensitivity
49
+ - Tool calling with per-session `RequestContext`
50
+ - Configurable AWS region, model id, credentials (or default credential
51
+ provider chain), and inference / turn-detection parameters
package/LICENSE.md ADDED
@@ -0,0 +1,30 @@
1
+ Portions of this software are licensed as follows:
2
+
3
+ - All content that resides under any directory named "ee/" within this
4
+ repository, including but not limited to:
5
+ - `packages/core/src/auth/ee/`
6
+ - `packages/server/src/server/auth/ee/`
7
+ is licensed under the license defined in `ee/LICENSE`.
8
+
9
+ - All third-party components incorporated into the Mastra Software are
10
+ licensed under the original license provided by the owner of the
11
+ applicable component.
12
+
13
+ - Content outside of the above-mentioned directories or restrictions is
14
+ available under the "Apache License 2.0" as defined below.
15
+
16
+ # Apache License 2.0
17
+
18
+ Copyright (c) 2025 Kepler Software, Inc.
19
+
20
+ Licensed under the Apache License, Version 2.0 (the "License");
21
+ you may not use this file except in compliance with the License.
22
+ You may obtain a copy of the License at
23
+
24
+ http://www.apache.org/licenses/LICENSE-2.0
25
+
26
+ Unless required by applicable law or agreed to in writing, software
27
+ distributed under the License is distributed on an "AS IS" BASIS,
28
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
29
+ See the License for the specific language governing permissions and
30
+ limitations under the License.
package/README.md ADDED
@@ -0,0 +1,384 @@
1
+ # @mastra/voice-aws-nova-sonic
2
+
3
+ Mastra integration for AWS Nova 2 Sonic, providing real-time bidirectional speech-to-speech capabilities using Amazon Bedrock's bidirectional streaming API.
4
+
5
+ ## Features
6
+
7
+ - **Real-time bidirectional streaming**: Continuous audio streaming in both directions
8
+ - **Multilingual support**: Supports English, French, Italian, German, Spanish, Portuguese, and Hindi
9
+ - **Polyglot voices**: Voices that can speak multiple languages within the same session
10
+ - **Barge-in support**: Users can interrupt the assistant mid-speech; handled server-side by Nova Sonic
11
+ - **Tool/function calling**: Support for agentic workflows and async tool execution
12
+ - **Cross-modal input**: Support for both audio and text inputs in the same conversation
13
+ - **Natural turn-taking**: Intelligent voice activity detection and turn management
14
+ - **Error handling**: Failures are reported through `error` events that carry a `NovaSonicErrorCode` (for example `CONNECTION_FAILED` or `CREDENTIALS_MISSING`)
15
+
16
+ ## Installation
17
+
18
+ ```bash
19
+ npm install @mastra/voice-aws-nova-sonic
20
+ # or
21
+ pnpm add @mastra/voice-aws-nova-sonic
22
+ # or
23
+ yarn add @mastra/voice-aws-nova-sonic
24
+ ```
25
+
26
+ ## Prerequisites
27
+
28
+ - Node.js >= 22.13.0
29
+ - AWS account with access to Amazon Bedrock
30
+ - AWS credentials configured (see [AWS Setup](#aws-setup))
31
+ - Access to the Nova 2 Sonic model in your AWS region
32
+
33
+ ## AWS Setup
34
+
35
+ ### 1. Enable Nova 2 Sonic in Amazon Bedrock
36
+
37
+ 1. Go to the [Amazon Bedrock Console](https://console.aws.amazon.com/bedrock/)
38
+ 2. Navigate to "Model access" in the left sidebar
39
+ 3. Request access to the "Amazon Nova 2 Sonic" model
40
+ 4. Wait for approval (usually instant)
41
+
42
+ ### 2. Configure AWS Credentials
43
+
44
+ You can configure AWS credentials in several ways:
45
+
46
+ **Option 1: Environment Variables**
47
+
48
+ ```bash
49
+ export AWS_ACCESS_KEY_ID=your-access-key-id
50
+ export AWS_SECRET_ACCESS_KEY=your-secret-access-key
51
+ export AWS_REGION=us-east-1
52
+ ```
53
+
54
+ **Option 2: AWS Credentials File**
55
+
56
+ ```ini
57
+ # ~/.aws/credentials
58
+ [default]
59
+ aws_access_key_id = your-access-key-id
60
+ aws_secret_access_key = your-secret-access-key
61
+ ```
62
+
63
+ **Option 3: IAM Role** (for EC2/Lambda)
64
+
65
+ - Attach an IAM role with Bedrock permissions to your EC2 instance or Lambda function
66
+
67
+ **Option 4: Explicit Credentials in Code**
68
+
69
+ ```typescript
70
+ import { NovaSonicVoice } from '@mastra/voice-aws-nova-sonic';
71
+
72
+ const voice = new NovaSonicVoice({
73
+ region: 'us-east-1',
74
+ credentials: {
75
+ accessKeyId: 'your-access-key-id',
76
+ secretAccessKey: 'your-secret-access-key',
77
+ },
78
+ });
79
+ ```
80
+
81
+ ### 3. IAM Permissions
82
+
83
+ Your AWS credentials need the following IAM permissions:
84
+
85
+ ```json
86
+ {
87
+ "Version": "2012-10-17",
88
+ "Statement": [
89
+ {
90
+ "Effect": "Allow",
91
+ "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithBidirectionalStream"],
92
+ "Resource": "arn:aws:bedrock:*::foundation-model/amazon.nova-2-sonic-v1:0"
93
+ }
94
+ ]
95
+ }
96
+ ```
97
+
98
+ ## Usage
99
+
100
+ ### Basic Example
101
+
102
+ ```typescript
103
+ import { Agent } from '@mastra/core/agent';
104
+ import { NovaSonicVoice } from '@mastra/voice-aws-nova-sonic';
105
+
106
+ const agent = new Agent({
107
+ name: 'Nova Sonic Agent',
108
+ instructions: 'You are a helpful assistant with real-time voice capabilities.',
109
+ model: 'openai/gpt-4o',
110
+ voice: new NovaSonicVoice({
111
+ region: 'us-east-1',
112
+ speaker: 'tiffany',
113
+ }),
114
+ });
115
+
116
+ // Connect to the voice service
117
+ await agent.voice.connect();
118
+
119
+ // Listen for agent audio responses (stream of audio data)
120
+ agent.voice.on('speaker', audioStream => {
121
+ // Pipe to your audio output (e.g., speaker, WebSocket, file)
122
+ audioStream.pipe(yourAudioOutput);
123
+ });
124
+
125
+ // Listen for text transcriptions
126
+ agent.voice.on('writing', ({ text, role, generationStage }) => {
127
+ // generationStage is 'SPECULATIVE' (preview) or 'FINAL' (actual transcript)
128
+ console.log(`[${role}] ${text}`);
129
+ });
130
+
131
+ // Send continuous audio from the microphone (NodeJS.ReadableStream of PCM16 audio)
132
+ await agent.voice.send(microphoneStream);
133
+ ```
134
+
135
+ ### Advanced Configuration
136
+
137
+ ```typescript
138
+ import { NovaSonicVoice } from '@mastra/voice-aws-nova-sonic';
139
+
140
+ const voice = new NovaSonicVoice({
141
+ region: 'us-east-1', // or 'us-west-2', 'ap-northeast-1'
142
+ model: 'amazon.nova-2-sonic-v1:0',
143
+ speaker: 'matthew', // or 'tiffany', 'amy', etc.
144
+ languageCode: 'en-US',
145
+ instructions: 'You are a helpful assistant.',
146
+ sessionConfig: {
147
+ tools: [
148
+ {
149
+ name: 'search',
150
+ description: 'Search the web',
151
+ inputSchema: {
152
+ type: 'object',
153
+ properties: {
154
+ query: { type: 'string' },
155
+ },
156
+ required: ['query'],
157
+ },
158
+ },
159
+ ],
160
+ turnDetectionConfiguration: {
161
+ // HIGH = fastest (1.5s pause), MEDIUM = balanced (1.75s), LOW = slowest (2s)
162
+ endpointingSensitivity: 'MEDIUM',
163
+ },
164
+ },
165
+ debug: true,
166
+ });
167
+
168
+ await voice.connect();
169
+ ```
170
+
171
+ ### With Tools
172
+
173
+ ```typescript
174
+ import { Agent } from '@mastra/core/agent';
175
+ import { NovaSonicVoice } from '@mastra/voice-aws-nova-sonic';
176
+ import { createTool } from '@mastra/core/tools';
177
+ import { z } from 'zod';
178
+
179
+ const weatherTool = createTool({
180
+ id: 'weather',
181
+ description: 'Get weather information',
182
+ inputSchema: z.object({
183
+ location: z.string(),
184
+ }),
185
+ execute: async ({ context }) => {
186
+ // Fetch weather data
187
+ return { temperature: 72, condition: 'sunny' };
188
+ },
189
+ });
190
+
191
+ const agent = new Agent({
192
+ name: 'Weather Agent',
193
+ instructions: 'You help users get weather information.',
194
+ model: 'openai/gpt-4o',
195
+ tools: {
196
+ weather: weatherTool,
197
+ },
198
+ voice: new NovaSonicVoice({
199
+ region: 'us-east-1',
200
+ }),
201
+ });
202
+
203
+ await agent.voice.connect();
204
+ // Tools are automatically available to the voice model
205
+ ```
206
+
207
+ ### Cross-Modal Text Input
208
+
209
+ Send text messages during an active voice session:
210
+
211
+ ```typescript
212
+ // After connecting and starting audio streaming
213
+ await agent.voice.speak('What is the weather in New York?');
214
+ ```
215
+
216
+ ## API Reference
217
+
218
+ ### Constructor
219
+
220
+ ```typescript
221
+ new NovaSonicVoice(config?: NovaSonicVoiceConfig)
222
+ ```
223
+
224
+ **Configuration Options:**
225
+
226
+ - `region` (string, optional): AWS region. Default: `'us-east-1'`. Supported: `'us-east-1'`, `'us-west-2'`, `'ap-northeast-1'`
227
+ - `model` (string, optional): Model ID. Default: `'amazon.nova-2-sonic-v1:0'`
228
+ - `credentials` (Credentials, optional): AWS credentials. If not provided, uses default credential chain
229
+ - `speaker` (string, optional): Voice name/identifier (e.g., `'matthew'`, `'tiffany'`, `'amy'`)
230
+ - `languageCode` (string, optional): Language code (e.g., `'en-US'`, `'fr-FR'`)
231
+ - `instructions` (string, optional): System instructions for the model
232
+ - `tools` (array, optional): Tool definitions
233
+ - `sessionConfig` (object, optional): Session configuration including `turnDetectionConfiguration`, `tools`, `inferenceConfiguration`
234
+ - `debug` (boolean, optional): Enable debug logging. Default: `false`
235
+
236
+ ### Methods
237
+
238
+ #### `connect(options?)`
239
+
240
+ Establishes a connection to AWS Bedrock. Must be called before using other methods.
241
+
242
+ ```typescript
243
+ await voice.connect();
244
+ ```
245
+
246
+ #### `speak(input, options?)`
247
+
248
+ Send cross-modal text input during an active voice session. Nova Sonic processes it and responds with audio.
249
+
250
+ ```typescript
251
+ await voice.speak('Hello, world!');
252
+ ```
253
+
254
+ #### `listen(audioStream, options?)`
255
+
256
+ Stream audio input for transcription. For Nova Sonic, this is equivalent to `send()`.
257
+
258
+ ```typescript
259
+ await voice.listen(audioStream);
260
+ ```
261
+
262
+ #### `send(audioData)`
263
+
264
+ Stream audio data in real-time. Accepts a `NodeJS.ReadableStream` (PCM16 audio) or an `Int16Array`.
265
+
266
+ ```typescript
267
+ // Stream from a ReadableStream
268
+ await voice.send(audioStream);
269
+
270
+ // Or with Int16Array
271
+ const audioArray = new Int16Array(1600); // placeholder buffer of 1600 PCM16 samples
272
+ await voice.send(audioArray);
273
+ ```
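Reviewer note: `send()` is documented above as accepting PCM16 audio, while many capture APIs produce 32-bit float samples in the range [-1, 1]. The sketch below shows one common way to convert such samples into an `Int16Array`. `floatToPcm16` is a hypothetical helper name and is not part of this package; whether your capture source yields float samples is an assumption to check.

```typescript
// Hypothetical helper, not part of @mastra/voice-aws-nova-sonic.
// Converts float samples in [-1, 1] to signed 16-bit PCM.
function floatToPcm16(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot wrap around.
    const s = Math.max(-1, Math.min(1, samples[i] ?? 0));
    // The Int16 range is asymmetric: -32768..32767.
    out[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return out;
}
```

The result could then be passed to `voice.send(...)` as in the `Int16Array` example above.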
274
+
275
+ #### `close()`
276
+
277
+ Disconnect and clean up resources.
278
+
279
+ ```typescript
280
+ voice.close();
281
+ ```
282
+
283
+ #### `on(event, callback)`
284
+
285
+ Register an event listener.
286
+
287
+ ```typescript
288
+ voice.on('speaking', ({ audio }) => {
289
+ // audio is a base64-encoded string of PCM audio
290
+ });
291
+
292
+ voice.on('writing', ({ text, role, generationStage }) => {
293
+ // generationStage: 'SPECULATIVE' (preview) or 'FINAL' (actual transcript)
294
+ console.log(`${role}: ${text}`);
295
+ });
296
+
297
+ voice.on('error', ({ message, code }) => {
298
+ console.error(`Error: ${message} (${code})`);
299
+ });
300
+ ```
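Reviewer note: because `writing` events arrive in both `SPECULATIVE` and `FINAL` stages, a consumer that wants a clean transcript probably needs to drop the previews. The following is a minimal sketch of that filtering. The `WritingEvent` type mirrors the payload shape listed in this README, `TranscriptLog` is a hypothetical class, and treating events without a `generationStage` as final is an assumption.

```typescript
// Payload shape as documented for the 'writing' event in this README.
type WritingEvent = {
  text: string;
  role: 'assistant' | 'user';
  generationStage?: 'SPECULATIVE' | 'FINAL';
};

// Hypothetical helper, not part of the package: keeps non-speculative lines only.
class TranscriptLog {
  private readonly lines: string[] = [];

  // Arrow property so it can be passed directly as a listener.
  readonly handle = (event: WritingEvent): void => {
    // Assumption: an event with no generationStage is treated as final.
    if (event.generationStage === 'SPECULATIVE') return;
    this.lines.push(`${event.role}: ${event.text}`);
  };

  toString(): string {
    return this.lines.join('\n');
  }
}
```

It could be wired up with `voice.on('writing', log.handle)` after creating `const log = new TranscriptLog()`.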
301
+
302
+ #### `off(event, callback)`
303
+
304
+ Remove an event listener.
305
+
306
+ ```typescript
307
+ voice.off('speaking', callback);
308
+ ```
309
+
310
+ ### Events
311
+
312
+ - **`speaker`**: Audio stream (`NodeJS.ReadableStream`) for the full response
313
+ - **`speaking`**: Audio chunk `{ audio: string, audioData: Buffer, response_id?: string }`
314
+ - **`writing`**: Text transcription `{ text: string, role: 'assistant' | 'user', generationStage?: 'SPECULATIVE' | 'FINAL' }`
315
+ - **`error`**: Error event `{ message: string, code?: string, details?: unknown }`
316
+ - **`toolCall`**: Tool invocation `{ name: string, args: Record<string, any>, id: string }`
317
+ - **`turnComplete`**: Turn completion `{ timestamp: number }`
318
+ - **`interrupt`**: Barge-in detected `{ type: string, timestamp: number }`
319
+ - **`contentStart`**: Content block started (raw Nova Sonic event)
320
+ - **`contentEnd`**: Content block ended (raw Nova Sonic event)
321
+ - **`usage`**: Token usage `{ inputTokens: number, outputTokens: number, totalTokens: number }`
322
+
323
+ ## Supported Regions
324
+
325
+ - `us-east-1` (US East - N. Virginia)
326
+ - `us-west-2` (US West - Oregon)
327
+ - `ap-northeast-1` (Asia Pacific - Tokyo)
328
+
329
+ ## Supported Languages
330
+
331
+ - English (US, UK, India, Australia)
332
+ - French
333
+ - Italian
334
+ - German
335
+ - Spanish
336
+ - Portuguese
337
+ - Hindi
338
+
339
+ ## Error Handling
340
+
341
+ Errors are emitted as `error` events; compare the event's `code` against `NovaSonicErrorCode` to branch on the failure type:
342
+
343
+ ```typescript
344
+ import { NovaSonicError, NovaSonicErrorCode } from '@mastra/voice-aws-nova-sonic';
345
+
346
+ voice.on('error', ({ message, code, details }) => {
347
+ if (code === NovaSonicErrorCode.CONNECTION_FAILED) {
348
+ // Handle connection error
349
+ } else if (code === NovaSonicErrorCode.CREDENTIALS_MISSING) {
350
+ // Handle credentials error
351
+ }
352
+ });
353
+ ```
354
+
355
+ ## Troubleshooting
356
+
357
+ ### Connection Issues
358
+
359
+ - Verify AWS credentials are configured correctly
360
+ - Check that Nova 2 Sonic is enabled in your AWS Bedrock console
361
+ - Ensure your IAM role/user has the required permissions
362
+ - Verify the region supports Nova 2 Sonic
363
+
364
+ ### Audio Issues
365
+
366
+ - Ensure audio format is compatible (PCM, 16-bit, 16kHz)
367
+ - Check that the input sample rate matches the expected 16kHz
368
+ - Verify audio stream is not empty
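Reviewer note: a quick sanity check for the audio checklist above is to compute how long a buffer should play given its byte length. This sketch assumes 16-bit samples at 16kHz as stated in the list; the mono default and the helper name `pcm16DurationMs` are assumptions, not something the package documents.

```typescript
// Hypothetical helper, not part of the package.
// Returns the playback duration in milliseconds of a raw PCM16 buffer.
function pcm16DurationMs(byteLength: number, sampleRateHz = 16000, channels = 1): number {
  const bytesPerSample = 2; // 16-bit PCM
  const bytesPerFrame = bytesPerSample * channels;
  if (byteLength % bytesPerFrame !== 0) {
    // A trailing partial sample usually means the data is not PCM16 or was cut mid-sample.
    throw new RangeError('byteLength is not a whole number of 16-bit frames');
  }
  const frames = byteLength / bytesPerFrame;
  return (frames / sampleRateHz) * 1000;
}
```

If a recording known to last one second reports a very different duration, the sample rate or bit depth likely differs from what the list above expects; a result of `0` points at an empty stream.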
369
+
370
+ ### Authentication Issues
371
+
372
+ - Check AWS credentials are valid
373
+ - Verify IAM permissions include Bedrock access
374
+ - Ensure region is correct
375
+
376
+ ## License
377
+
378
+ Apache-2.0
379
+
380
+ ## Links
381
+
382
+ - [Mastra Documentation](https://mastra.ai)
383
+ - [AWS Nova 2 Sonic Documentation](https://docs.aws.amazon.com/nova/latest/nova2-userguide/using-conversational-speech.html)
384
+ - [Amazon Bedrock Documentation](https://docs.aws.amazon.com/bedrock/)
@@ -0,0 +1,27 @@
1
+ ---
2
+ name: mastra-voice-aws-nova-sonic
3
+ description: Documentation for @mastra/voice-aws-nova-sonic. Use when working with @mastra/voice-aws-nova-sonic APIs, configuration, or implementation.
4
+ metadata:
5
+ package: "@mastra/voice-aws-nova-sonic"
6
+ version: "0.0.0-studio-cli-20260504022012"
7
+ ---
8
+
9
+ ## When to use
10
+
11
+ Use this skill whenever you are working with @mastra/voice-aws-nova-sonic and need domain-specific knowledge about it.
12
+
13
+ ## How to use
14
+
15
+ Read the individual reference documents for detailed explanations and code examples.
16
+
17
+ ### Docs
18
+
19
+ - [Voice in Mastra](references/docs-voice-overview.md) - Overview of voice capabilities in Mastra, including text-to-speech, speech-to-text, and real-time speech-to-speech interactions.
20
+ - [Speech-to-Speech capabilities in Mastra](references/docs-voice-speech-to-speech.md) - Overview of speech-to-speech capabilities in Mastra, including real-time interactions and event-driven architecture.
21
+
22
+ ### Reference
23
+
24
+ - [Reference: AWS Nova Sonic voice](references/reference-voice-aws-nova-sonic.md) - Documentation for the NovaSonicVoice class, providing real-time speech-to-speech capabilities via AWS Bedrock Nova 2 Sonic.
25
+
26
+
27
+ Read [assets/SOURCE_MAP.json](assets/SOURCE_MAP.json) for source code references.
@@ -0,0 +1,6 @@
1
+ {
2
+ "version": "0.0.0-studio-cli-20260504022012",
3
+ "package": "@mastra/voice-aws-nova-sonic",
4
+ "exports": {},
5
+ "modules": {}
6
+ }