@mastra/voice-xai-realtime 0.0.0
- package/CHANGELOG.md +39 -0
- package/README.md +106 -0
- package/dist/docs/SKILL.md +26 -0
- package/dist/docs/assets/SOURCE_MAP.json +6 -0
- package/dist/docs/references/docs-voice-overview.md +1188 -0
- package/dist/docs/references/reference-voice-xai-realtime.md +267 -0
- package/dist/index.cjs +851 -0
- package/dist/index.cjs.map +1 -0
- package/dist/index.d.ts +91 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +849 -0
- package/dist/index.js.map +1 -0
- package/dist/types.d.ts +181 -0
- package/dist/types.d.ts.map +1 -0
- package/dist/utils.d.ts +25 -0
- package/dist/utils.d.ts.map +1 -0
- package/package.json +67 -0
package/CHANGELOG.md
ADDED
@@ -0,0 +1,39 @@
# @mastra/voice-xai-realtime

## 0.1.0-alpha.0

### Minor Changes

- Added `@mastra/voice-xai-realtime`, a realtime voice provider for the xAI Grok Voice Agent API. ([#16507](https://github.com/mastra-ai/mastra/pull/16507))

Use `XAIRealtimeVoice` with Mastra's `Agent` voice primitive to connect, stream audio, and run xAI voice turns:

```ts
import { Agent } from '@mastra/core/agent';
import { XAIRealtimeVoice } from '@mastra/voice-xai-realtime';

const voice = new XAIRealtimeVoice({
  apiKey: process.env.XAI_API_KEY,
  model: 'grok-voice-think-fast-1.0',
  speaker: 'eve',
  turnDetection: { type: 'server_vad' },
});

const agent = new Agent({
  id: 'voice-agent',
  name: 'Voice Agent',
  instructions: 'You are a helpful voice assistant.',
  model: 'xai/grok-4.3',
  voice,
});

await agent.voice.connect();
agent.voice.on('speaker', audioStream => playAudio(audioStream));
await agent.voice.speak('How can I help you today?');
await agent.voice.send(microphoneStream);
```

### Patch Changes

- Updated dependencies [[`fceae1f`](https://github.com/mastra-ai/mastra/commit/fceae1f5f5db4722cb078a663c6eb4bd22944123), [`bf02acb`](https://github.com/mastra-ai/mastra/commit/bf02acbb8a6110f638ac844e89f1ebf04cb7fe74), [`0fd3fbe`](https://github.com/mastra-ai/mastra/commit/0fd3fbe40fb63657aedd72f6e7b38c8e8ee6940d), [`fed0475`](https://github.com/mastra-ai/mastra/commit/fed0475ccfea31e4fc251469ac05640d0742c1f0), [`522f44d`](https://github.com/mastra-ai/mastra/commit/522f44d947214bfc06cff50599bae1ef3494880d)]:
  - @mastra/core@1.34.0-alpha.1
package/README.md
ADDED
@@ -0,0 +1,106 @@
# @mastra/voice-xai-realtime

xAI Grok Voice Agent API integration for Mastra. This package provides a realtime `MastraVoice` provider that connects to xAI's WebSocket API for bidirectional text and audio conversations.

## Installation

```bash
npm install @mastra/voice-xai-realtime
```

## Configuration

Set an xAI API key for server-side use:

```bash
XAI_API_KEY=your_xai_api_key
```

This provider is built for Node.js server-side runtimes. If you already mint xAI ephemeral tokens on your server, you can pass one with `ephemeralToken`; the provider sends it through the WebSocket protocol as `xai-client-secret.<token>` instead of sending an authorization header. If both `apiKey` and `ephemeralToken` are configured, the provider uses the ephemeral token.
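
The precedence rule above can be sketched as a small standalone function. This is an illustration only, not the provider's code: `resolveAuth`, `AuthOptions`, and `ConnectionAuth` are hypothetical names, and the `Bearer` scheme for the API key header is an assumption. Only the `xai-client-secret.<token>` subprotocol and the "ephemeral token wins" rule come from the text.

```typescript
// Hypothetical helper, not part of @mastra/voice-xai-realtime.
interface AuthOptions {
  apiKey?: string;
  ephemeralToken?: string;
}

interface ConnectionAuth {
  /** WebSocket subprotocols to request during the handshake. */
  protocols: string[];
  /** HTTP headers to send on the upgrade request. */
  headers: Record<string, string>;
}

function resolveAuth({ apiKey, ephemeralToken }: AuthOptions): ConnectionAuth {
  if (ephemeralToken) {
    // The ephemeral token takes precedence and travels as a subprotocol,
    // so no authorization header is sent.
    return { protocols: [`xai-client-secret.${ephemeralToken}`], headers: {} };
  }
  if (apiKey) {
    // Assumption: the API key is sent as a standard Bearer token.
    return { protocols: [], headers: { Authorization: `Bearer ${apiKey}` } };
  }
  throw new Error('Provide either apiKey or ephemeralToken');
}
```

With both fields set, `resolveAuth` returns only the subprotocol entry and an empty header map, which mirrors the behavior described in the paragraph above.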

## Usage

```typescript
import { Agent } from '@mastra/core/agent';
import { getMicrophoneStream, playAudio } from '@mastra/node-audio';
import { XAIRealtimeVoice } from '@mastra/voice-xai-realtime';

const voice = new XAIRealtimeVoice({
  apiKey: process.env.XAI_API_KEY,
  model: 'grok-voice-think-fast-1.0',
  speaker: 'eve',
  instructions: 'You are a concise voice assistant.',
  turnDetection: { type: 'server_vad' },
});

const agent = new Agent({
  id: 'voice-agent',
  name: 'Voice Agent',
  instructions: 'You are a helpful voice assistant.',
  model: 'xai/grok-4.3',
  voice,
});

await agent.voice.connect();

agent.voice.on('speaker', audioStream => {
  playAudio(audioStream);
});

agent.voice.on('writing', ({ text, role }) => {
  console.log(`${role}: ${text}`);
});

await agent.voice.speak('How can I help you today?');

const microphoneStream = getMicrophoneStream();
await agent.voice.send(microphoneStream);
```

## Server-side tools

xAI executes `web_search`, `x_search`, `file_search`, and `mcp` tools server-side. Pass them through `serverTools` or `session.tools`; the provider merges both arrays into the initial `session.update`:

```typescript
const voice = new XAIRealtimeVoice({
  apiKey: process.env.XAI_API_KEY,
  serverTools: [
    { type: 'web_search' },
    {
      type: 'mcp',
      server_url: 'https://mcp.example.com/mcp',
      server_label: 'business-tools',
      allowed_tools: ['lookup_order'],
    },
  ],
});
```
```
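
To make the merge step concrete, here is a minimal standalone sketch of combining the two arrays into one `session.update` payload. `buildSessionUpdate`, `ServerTool`, and `ToolOptions` are hypothetical names, and both the ordering (`session.tools` first) and the exact event shape are assumptions rather than the provider's documented behavior.

```typescript
// Hypothetical sketch, not the provider's implementation.
type ServerTool = { type: string; [key: string]: unknown };

interface ToolOptions {
  serverTools?: ServerTool[];
  session?: { tools?: ServerTool[] };
}

function buildSessionUpdate(options: ToolOptions): {
  type: string;
  session: { tools: ServerTool[] };
} {
  // Assumption: session.tools come first, then serverTools.
  // Missing arrays are treated as empty.
  const tools = [...(options.session?.tools ?? []), ...(options.serverTools ?? [])];
  return { type: 'session.update', session: { tools } };
}

// Both sources end up in a single tools array.
const update = buildSessionUpdate({
  serverTools: [{ type: 'web_search' }],
  session: { tools: [{ type: 'x_search' }] },
});
console.log(update.session.tools.map(tool => tool.type));
```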
Mastra function tools added with `addTools()` are converted into xAI function tools. When xAI emits function call events, this provider executes the Mastra tools, sends `function_call_output` items, and waits for all parallel tool calls to finish before sending the continuation `response.create`.
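
The "wait for all parallel tool calls" behavior can be sketched as follows. This is a simplified illustration under stated assumptions: `runToolCalls`, `FunctionCall`, `ToolFn`, and `ClientEvent` are hypothetical names, and wrapping each output in a `conversation.item.create` event is an assumption. The source only states that `function_call_output` items are sent before a single continuation `response.create`.

```typescript
// Hypothetical sketch, not the provider's implementation.
interface FunctionCall {
  call_id: string;
  name: string;
  arguments: string; // JSON-encoded arguments
}

type ToolFn = (args: unknown) => Promise<unknown> | unknown;

type ClientEvent =
  | {
      type: 'conversation.item.create';
      item: { type: 'function_call_output'; call_id: string; output: string };
    }
  | { type: 'response.create' };

async function runToolCalls(
  calls: FunctionCall[],
  tools: Record<string, ToolFn>,
  send: (event: ClientEvent) => void,
): Promise<void> {
  // Run every call concurrently; each one reports its own output when done.
  await Promise.all(
    calls.map(async call => {
      let output: unknown;
      try {
        const tool = tools[call.name];
        if (!tool) throw new Error(`Unknown tool: ${call.name}`);
        output = await tool(JSON.parse(call.arguments || '{}'));
      } catch (error) {
        // Report failures as output so the model can react to them.
        output = { error: error instanceof Error ? error.message : String(error) };
      }
      send({
        type: 'conversation.item.create',
        item: {
          type: 'function_call_output',
          call_id: call.call_id,
          output: JSON.stringify(output ?? null),
        },
      });
    }),
  );
  // Only after every output has been sent is the continuation requested.
  send({ type: 'response.create' });
}
```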
`send()` requires an open WebSocket connection. Call `connect()` first for live microphone streaming. Readable stream chunks must be binary audio chunks (`Buffer`, `ArrayBuffer`, or a typed array).
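
If a stream produces mixed chunk types, a small guard can normalize them to bytes before they reach `send()`. The helper below is a hypothetical sketch (`toAudioBytes` is not exported by the package); it relies only on the standard `ArrayBuffer` and `ArrayBuffer.isView` APIs, and note that `ArrayBuffer.isView` also accepts `DataView`, which the text above does not list.

```typescript
// Hypothetical helper, not part of @mastra/voice-xai-realtime.
function toAudioBytes(chunk: unknown): Uint8Array {
  if (chunk instanceof ArrayBuffer) {
    return new Uint8Array(chunk);
  }
  if (ArrayBuffer.isView(chunk)) {
    // Covers Node.js Buffer and typed arrays such as Int16Array.
    // The view shares memory with the original chunk; nothing is copied.
    return new Uint8Array(chunk.buffer, chunk.byteOffset, chunk.byteLength);
  }
  throw new TypeError('Audio chunks must be a Buffer, ArrayBuffer, or typed array');
}

// Two 16-bit samples become four bytes.
const samples = new Int16Array([1, -1]);
console.log(toAudioBytes(samples).byteLength);
```

Rejecting strings and plain objects early gives a clearer error than letting a non-binary chunk fail somewhere inside the socket write.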

## Supported voices

- `eve`
- `ara`
- `rex`
- `sal`
- `leo`

Custom xAI voice IDs can also be used as the `speaker` value.

## Audio

The default input and output format is 24 kHz PCM16:

```typescript
const voice = new XAIRealtimeVoice({
  audio: {
    input: { format: { type: 'audio/pcm', rate: 24000 } },
    output: { format: { type: 'audio/pcm', rate: 24000 } },
  },
});
```

The provider also supports `audio/pcmu` and `audio/pcma` for telephony use cases. Those G.711 codecs use 8 kHz audio.
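
The relationship between format and sample rate can be written down as a small sketch. `resolveFormat` and `bytesPerSecond` are hypothetical helpers, mono audio is assumed, and whether the provider accepts a `rate` field for the G.711 formats is also an assumption. The 24 kHz default and the 8 kHz G.711 rate come from the text above; the sample widths (2 bytes for PCM16, 1 byte for G.711) are standard for those encodings.

```typescript
// Hypothetical helpers, not part of @mastra/voice-xai-realtime.
type AudioFormatType = 'audio/pcm' | 'audio/pcmu' | 'audio/pcma';

interface AudioFormat {
  type: AudioFormatType;
  rate: number; // samples per second
}

function resolveFormat(type: AudioFormatType = 'audio/pcm', rate?: number): AudioFormat {
  if (type === 'audio/pcm') {
    // PCM16 defaults to 24 kHz.
    return { type, rate: rate ?? 24000 };
  }
  // G.711 (mu-law and A-law) is defined at 8 kHz.
  if (rate !== undefined && rate !== 8000) {
    throw new RangeError(`${type} uses 8000 Hz audio, got ${rate}`);
  }
  return { type, rate: 8000 };
}

// Assumes mono audio: 2 bytes per PCM16 sample, 1 byte per G.711 sample.
function bytesPerSecond(format: AudioFormat): number {
  const bytesPerSample = format.type === 'audio/pcm' ? 2 : 1;
  return format.rate * bytesPerSample;
}

console.log(bytesPerSecond(resolveFormat())); // 24000 * 2
console.log(bytesPerSecond(resolveFormat('audio/pcmu'))); // 8000 * 1
```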
package/dist/docs/SKILL.md
ADDED
@@ -0,0 +1,26 @@
---
name: mastra-voice-xai-realtime
description: Documentation for @mastra/voice-xai-realtime. Use when working with @mastra/voice-xai-realtime APIs, configuration, or implementation.
metadata:
  package: "@mastra/voice-xai-realtime"
  version: "0.0.0"
---

## When to use

Use this skill whenever you are working with @mastra/voice-xai-realtime to obtain the domain-specific knowledge.

## How to use

Read the individual reference documents for detailed explanations and code examples.

### Docs

- [Voice in Mastra](references/docs-voice-overview.md) - Overview of voice capabilities in Mastra, including text-to-speech, speech-to-text, and real-time speech-to-speech interactions.

### Reference

- [Reference: xAI Realtime voice](references/reference-voice-xai-realtime.md) - Documentation for the XAIRealtimeVoice class, providing realtime voice conversations with the xAI Grok Voice Agent API.


Read [assets/SOURCE_MAP.json](assets/SOURCE_MAP.json) for source code references.