@convai/web-sdk 1.0.0 → 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (37)
  1. package/README.md +784 -680
  2. package/dist/core/BlendshapeQueue.d.ts +43 -14
  3. package/dist/core/BlendshapeQueue.d.ts.map +1 -1
  4. package/dist/core/BlendshapeQueue.js +69 -26
  5. package/dist/core/BlendshapeQueue.js.map +1 -1
  6. package/dist/core/ConvaiClient.d.ts +6 -0
  7. package/dist/core/ConvaiClient.d.ts.map +1 -1
  8. package/dist/core/ConvaiClient.js +58 -13
  9. package/dist/core/ConvaiClient.js.map +1 -1
  10. package/dist/core/MessageHandler.d.ts +14 -0
  11. package/dist/core/MessageHandler.d.ts.map +1 -1
  12. package/dist/core/MessageHandler.js +58 -18
  13. package/dist/core/MessageHandler.js.map +1 -1
  14. package/dist/core/types.d.ts +34 -4
  15. package/dist/core/types.d.ts.map +1 -1
  16. package/dist/react/components/ConvaiWidget.d.ts.map +1 -1
  17. package/dist/react/components/ConvaiWidget.js +45 -3
  18. package/dist/react/components/ConvaiWidget.js.map +1 -1
  19. package/dist/react/components/rtc-widget/components/conviComponents/VoiceModeOverlay.d.ts +3 -1
  20. package/dist/react/components/rtc-widget/components/conviComponents/VoiceModeOverlay.d.ts.map +1 -1
  21. package/dist/react/components/rtc-widget/components/conviComponents/VoiceModeOverlay.js +3 -3
  22. package/dist/react/components/rtc-widget/components/conviComponents/VoiceModeOverlay.js.map +1 -1
  23. package/dist/react/hooks/useConvaiClient.d.ts.map +1 -1
  24. package/dist/react/hooks/useConvaiClient.js +4 -1
  25. package/dist/react/hooks/useConvaiClient.js.map +1 -1
  26. package/dist/types/index.d.ts +34 -4
  27. package/dist/types/index.d.ts.map +1 -1
  28. package/dist/utils/logger.d.ts +12 -12
  29. package/dist/utils/logger.d.ts.map +1 -1
  30. package/dist/utils/logger.js +31 -21
  31. package/dist/utils/logger.js.map +1 -1
  32. package/dist/vanilla/ConvaiWidget.d.ts.map +1 -1
  33. package/dist/vanilla/ConvaiWidget.js +92 -22
  34. package/dist/vanilla/ConvaiWidget.js.map +1 -1
  35. package/dist/vanilla/types.d.ts +2 -0
  36. package/dist/vanilla/types.d.ts.map +1 -1
  37. package/package.json +6 -4
package/README.md CHANGED
@@ -1,230 +1,308 @@
1
1
  # @convai/web-sdk
2
2
 
3
- JavaScript/TypeScript SDK for Convai AI voice assistants. Build voice-powered AI interactions for web applications with real-time audio/video streaming. Supports both React and Vanilla JavaScript/TypeScript.
4
-
5
- ## Installation
3
+ `@convai/web-sdk` is a TypeScript-first SDK for building real-time conversational AI experiences with Convai characters on the web. It supports:
4
+
5
+ - React applications with ready-to-use hooks and widget components
6
+ - Vanilla TypeScript/JavaScript applications with a framework-agnostic widget
7
+ - Direct core client usage for custom UIs and advanced integrations
8
+ - Optional lipsync data pipelines for ARKit and MetaHuman rigs
9
+
10
+ This document is written as a complete implementation reference, from first setup to production hardening.
11
+
12
+ ## Table of Contents
13
+
14
+ - [1. Package Entry Points](#1-package-entry-points)
15
+ - [2. Installation and Requirements](#2-installation-and-requirements)
16
+ - [3. Credentials and Environment Setup](#3-credentials-and-environment-setup)
17
+ - [4. Quick Start](#4-quick-start)
18
+ - [5. Build a Chatbot from Scratch](#5-build-a-chatbot-from-scratch)
19
+ - [6. Core Concepts and Lifecycle](#6-core-concepts-and-lifecycle)
20
+ - [7. Configuration Reference (`ConvaiConfig`)](#7-configuration-reference-convaiconfig)
21
+ - [8. Core API Reference (`ConvaiClient`)](#8-core-api-reference-convaiclient)
22
+ - [9. Message Semantics and Turn Completion](#9-message-semantics-and-turn-completion)
23
+ - [10. React API Reference](#10-react-api-reference)
24
+ - [11. Vanilla API Reference](#11-vanilla-api-reference)
25
+ - [12. Audio Integration Best Practices (Vanilla TypeScript)](#12-audio-integration-best-practices-vanilla-typescript)
26
+ - [13. Lipsync Helpers Reference](#13-lipsync-helpers-reference)
27
+ - [14. Error Handling and Reliability Patterns](#14-error-handling-and-reliability-patterns)
28
+ - [15. Troubleshooting](#15-troubleshooting)
29
+ - [16. Production Readiness Checklist](#16-production-readiness-checklist)
30
+ - [17. Examples](#17-examples)
31
+ - [18. License](#18-license)
32
+
33
+ ## 1. Package Entry Points
34
+
35
+ The SDK is published with multiple entry points for different integration styles.
36
+
37
+ ### `@convai/web-sdk` (default)
38
+
39
+ Primary exports:
40
+
41
+ - `useConvaiClient`
42
+ - `ConvaiWidget`
43
+ - `useCharacterInfo`
44
+ - `useLocalCameraTrack`
45
+ - `ConvaiClient`
46
+ - `AudioRenderer` (re-export of LiveKit `RoomAudioRenderer` for React usage)
47
+ - `AudioContext` (re-export of LiveKit `RoomContext`)
48
+ - Core types re-exported from `core/types`:
49
+ - `AudioSettings`
50
+ - `ConvaiConfig`
51
+ - `ChatMessage`
52
+ - `ConvaiClientState`
53
+ - `AudioControls`
54
+ - `VideoControls`
55
+ - `ScreenShareControls`
56
+ - `IConvaiClient`
57
+ - All exports from `@convai/web-sdk/lipsync-helpers`
58
+ - Type exports for latency models:
59
+ - `LatencyMonitor` (type)
60
+ - `LatencyMeasurement`
61
+ - `LatencyStats`
62
+
63
+ ### `@convai/web-sdk/react`
64
+
65
+ React-focused entry point; it exposes the same React API surface as the default entry point.
66
+
67
+ ### `@convai/web-sdk/vanilla`
68
+
69
+ Vanilla/browser-focused exports:
70
+
71
+ - `ConvaiClient`
72
+ - `AudioRenderer` (vanilla audio playback manager)
73
+ - `createConvaiWidget`
74
+ - `destroyConvaiWidget`
75
+ - Types:
76
+ - `VanillaWidget`
77
+ - `VanillaWidgetOptions`
78
+ - `IConvaiClient`
79
+ - `ConvaiConfig`
80
+ - `ConvaiClientState`
81
+ - `ChatMessage`
82
+
83
+ ### `@convai/web-sdk/core`
84
+
85
+ Framework-agnostic low-level API:
86
+
87
+ - `ConvaiClient`
88
+ - `AudioManager`
89
+ - `VideoManager`
90
+ - `ScreenShareManager`
91
+ - `MessageHandler`
92
+ - `BlendshapeQueue`
93
+ - `EventEmitter`
94
+ - Type alias: `ConvaiClientType`
95
+ - All core types from `core/types`
96
+ - `TurnStats` type
97
+
98
+ ### `@convai/web-sdk/lipsync-helpers`
99
+
100
+ Dedicated helpers for blendshape formats and queue creation. Full function list is in [Section 13](#13-lipsync-helpers-reference).
101
+
102
+ ## 2. Installation and Requirements
103
+
104
+ ### Install
6
105
 
7
106
  ```bash
8
107
  npm install @convai/web-sdk
9
108
  ```
10
109
 
11
- ## Basic Setup
12
-
13
- ### React
14
-
15
- ```tsx
16
- import { useConvaiClient, ConvaiWidget } from "@convai/web-sdk";
17
-
18
- function App() {
19
- const convaiClient = useConvaiClient({
20
- apiKey: "your-api-key",
21
- characterId: "your-character-id",
22
- });
110
+ or
23
111
 
24
- return <ConvaiWidget convaiClient={convaiClient} />;
25
- }
112
+ ```bash
113
+ pnpm add @convai/web-sdk
26
114
  ```
27
115
 
28
- ### Vanilla TypeScript
29
-
30
- ```typescript
31
- import { ConvaiClient, createConvaiWidget } from "@convai/web-sdk/vanilla";
116
+ or
32
117
 
33
- // Create client with configuration
34
- const client = new ConvaiClient({
35
- apiKey: "your-api-key",
36
- characterId: "your-character-id",
37
- });
38
-
39
- // Create widget - auto-connects on first user click
40
- const widget = createConvaiWidget(document.body, {
41
- convaiClient: client,
42
- });
43
-
44
- // Cleanup when done
45
- widget.destroy();
118
+ ```bash
119
+ yarn add @convai/web-sdk
46
120
  ```
47
121
 
48
- ## Exports
122
+ ### Runtime requirements
49
123
 
50
- ### React Exports (`@convai/web-sdk` or `@convai/web-sdk/react`)
124
+ - Modern browser with WebRTC support
125
+ - Secure context (`https://` or `http://localhost`) for microphone/camera/screen access
51
126
 
52
- **Components:**
127
+ ### Peer dependencies
53
128
 
54
- - `ConvaiWidget` - Main chat widget component
129
+ If you are using React APIs:
55
130
 
56
- **Hooks:**
131
+ - `react` `^18 || ^19`
132
+ - `react-dom` `^18 || ^19`
57
133
 
58
- - `useConvaiClient(config?)` - Main client hook
59
- - `useCharacterInfo(characterId, apiKey)` - Fetch character metadata
60
- - `useLocalCameraTrack()` - Get local camera track
134
+ ## 3. Credentials and Environment Setup
61
135
 
62
- **Core Client:**
136
+ ### Obtain credentials
63
137
 
64
- - `ConvaiClient` - Core client class
138
+ 1. Create or log in to your Convai account.
139
+ 2. Create or select a character.
140
+ 3. Copy:
141
+ - API key
142
+ - Character ID
65
143
 
66
- **Types:**
144
+ ### Store credentials in environment variables
67
145
 
68
- - `ConvaiConfig` - Configuration interface
69
- - `ConvaiClientState` - Client state interface
70
- - `ChatMessage` - Message interface
71
- - `IConvaiClient` - Client interface
72
- - `AudioControls` - Audio control interface
73
- - `VideoControls` - Video control interface
74
- - `ScreenShareControls` - Screen share control interface
146
+ Do not hardcode credentials in source files.
75
147
 
76
- **Components:**
148
+ ```bash
149
+ # .env.local (example)
150
+ VITE_CONVAI_API_KEY=<YOUR_CONVAI_API_KEY>
151
+ VITE_CONVAI_CHARACTER_ID=<YOUR_CONVAI_CHARACTER_ID>
152
+ VITE_CONVAI_API_URL=<OPTIONAL_CONVAI_BASE_URL>
153
+ ```
77
154
 
78
- - `AudioRenderer` - Audio playback component
79
- - `AudioContext` - Audio context provider
155
+ Use these values through your build system (`import.meta.env`, process env injection, or server-provided config).
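Because a missing or empty credential only surfaces later as a connection failure, it can help to validate the environment up front. The helper below is illustrative, not part of the SDK; it assumes the Vite-style variable names from the example `.env.local` above.

```typescript
// Illustrative helper (not part of @convai/web-sdk): resolve and validate
// credentials from an environment-like record before constructing the client.
interface ConvaiEnv {
  apiKey: string;
  characterId: string;
  url?: string;
}

function loadConvaiEnv(env: Record<string, string | undefined>): ConvaiEnv {
  const apiKey = env.VITE_CONVAI_API_KEY;
  const characterId = env.VITE_CONVAI_CHARACTER_ID;
  if (!apiKey || !characterId) {
    // Fail fast at startup instead of at connect() time.
    throw new Error("Missing VITE_CONVAI_API_KEY or VITE_CONVAI_CHARACTER_ID");
  }
  // url is optional; leave it undefined to use the SDK's default endpoint.
  return { apiKey, characterId, url: env.VITE_CONVAI_API_URL };
}
```

With Vite you would call this as `loadConvaiEnv(import.meta.env)` and spread the result into your `ConvaiConfig`.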
80
156
 
81
- ### Vanilla Exports (`@convai/web-sdk/vanilla`)
157
+ ## 4. Quick Start
82
158
 
83
- **Functions:**
159
+ ### React
84
160
 
85
- - `createConvaiWidget(container, options)` - Create widget instance
86
- - `destroyConvaiWidget(widget)` - Destroy widget instance
161
+ ```tsx
162
+ import { ConvaiWidget, useConvaiClient } from "@convai/web-sdk";
87
163
 
88
- **Classes:**
164
+ export function App() {
165
+ const convaiClient = useConvaiClient({
166
+ apiKey: import.meta.env.VITE_CONVAI_API_KEY,
167
+ characterId: import.meta.env.VITE_CONVAI_CHARACTER_ID,
168
+ enableVideo: false,
169
+ startWithAudioOn: false,
170
+ });
89
171
 
90
- - `ConvaiClient` - Core client class
91
- - `AudioRenderer` - Audio playback handler
172
+ return <ConvaiWidget convaiClient={convaiClient} />;
173
+ }
174
+ ```
92
175
 
93
- **Types:**
176
+ ### Vanilla TypeScript
94
177
 
95
- - `VanillaWidget` - Widget instance interface
96
- - `VanillaWidgetOptions` - Widget options interface
97
- - `IConvaiClient` - Client interface
98
- - `ConvaiConfig` - Configuration interface
99
- - `ConvaiClientState` - Client state interface
100
- - `ChatMessage` - Message interface
178
+ ```ts
179
+ import { ConvaiClient, createConvaiWidget } from "@convai/web-sdk/vanilla";
101
180
 
102
- ### Core Exports (`@convai/web-sdk/core`)
181
+ const client = new ConvaiClient({
182
+ apiKey: import.meta.env.VITE_CONVAI_API_KEY,
183
+ characterId: import.meta.env.VITE_CONVAI_CHARACTER_ID,
184
+ enableVideo: false,
185
+ });
103
186
 
104
- **Classes:**
187
+ const widget = createConvaiWidget(document.body, {
188
+ convaiClient: client,
189
+ defaultVoiceMode: true,
190
+ onConnect: () => console.log("Connected"),
191
+ onDisconnect: () => console.log("Disconnected"),
192
+ });
105
193
 
106
- - `ConvaiClient` - Main client class
107
- - `AudioManager` - Audio management
108
- - `VideoManager` - Video management
109
- - `ScreenShareManager` - Screen share management
110
- - `MessageHandler` - Message handling
111
- - `EventEmitter` - Event emitter base class
194
+ window.addEventListener("beforeunload", () => {
195
+ widget.destroy();
196
+ void client.disconnect().catch(() => undefined);
197
+ });
198
+ ```
112
199
 
113
- **Types:**
200
+ ## 5. Build a Chatbot from Scratch
114
201
 
115
- - All types from React/Vanilla exports
116
- - `ConvaiClientType` - Type alias for ConvaiClient
202
+ This section shows an end-to-end approach you can use in production.
117
203
 
118
- ## Props and Configuration
204
+ ### A) React from scratch (custom connection flow)
119
205
 
120
- ### ConvaiWidget Props (React)
206
+ #### Step 1: Create the client
121
207
 
122
208
  ```tsx
123
- interface ConvaiWidgetProps {
124
- /** Convai client instance (required) */
125
- convaiClient: IConvaiClient & {
126
- activity?: string;
127
- isAudioMuted: boolean;
128
- isVideoEnabled: boolean;
129
- isScreenShareActive: boolean;
130
- };
131
- /** Show video toggle button in settings (default: true) */
132
- showVideo?: boolean;
133
- /** Show screen share toggle button in settings (default: true) */
134
- showScreenShare?: boolean;
135
- }
209
+ import { useConvaiClient } from "@convai/web-sdk";
210
+
211
+ const convaiClient = useConvaiClient({
212
+ apiKey: import.meta.env.VITE_CONVAI_API_KEY,
213
+ characterId: import.meta.env.VITE_CONVAI_CHARACTER_ID,
214
+ endUserId: "<UNIQUE_END_USER_ID>",
215
+ enableVideo: true,
216
+ startWithVideoOn: false,
217
+ startWithAudioOn: false,
218
+ ttsEnabled: true,
219
+ enableLipsync: true,
220
+ blendshapeConfig: {
221
+ format: "arkit",
222
+ frames_buffer_duration: 0.5,
223
+ },
224
+ });
136
225
  ```
137
226
 
138
- ### createConvaiWidget Options (Vanilla)
227
+ #### Step 2: Connect from a user gesture with error handling
139
228
 
140
- ```typescript
141
- interface VanillaWidgetOptions {
142
- /** Convai client instance (required) */
143
- convaiClient: IConvaiClient & {
144
- activity?: string;
145
- chatMessages: ChatMessage[];
146
- };
147
- /** Show video toggle button in settings (default: true) */
148
- showVideo?: boolean;
149
- /** Show screen share toggle button in settings (default: true) */
150
- showScreenShare?: boolean;
229
+ ```tsx
230
+ async function handleConnect() {
231
+ try {
232
+ await convaiClient.connect();
233
+ } catch (error) {
234
+ console.error("Connection failed:", error);
235
+ }
151
236
  }
152
237
  ```
153
238
 
154
- ### ConvaiConfig
155
-
156
- ```typescript
157
- interface ConvaiConfig {
158
- /** Your Convai API key from convai.com dashboard (required) */
159
- apiKey: string;
160
- /** The Character ID to connect to (required) */
161
- characterId: string;
162
- /**
163
- * End user identifier for speaker management (optional).
164
- * When provided: enables long-term memory and analytics
165
- * When not provided: anonymous mode, no persistent memory
166
- */
167
- endUserId?: string;
168
- /** Custom Convai API URL (optional, defaults to production endpoint) */
169
- url?: string;
170
- /**
171
- * Enable video capability (default: false).
172
- * If true, connection_type will be "video" (supports audio, video, and screenshare).
173
- * If false, connection_type will be "audio" (audio only).
174
- */
175
- enableVideo?: boolean;
176
- /**
177
- * Start with video camera on when connecting (default: false).
178
- * Only works if enableVideo is true.
179
- */
180
- startWithVideoOn?: boolean;
181
- /**
182
- * Start with microphone on when connecting (default: false).
183
- * If false, microphone stays off until user enables it.
184
- */
185
- startWithAudioOn?: boolean;
186
- /** Enable text-to-speech audio generation (default: true) */
187
- ttsEnabled?: boolean;
239
+ #### Step 3: Wait for readiness before sending text
240
+
241
+ ```tsx
242
+ function sendMessage(text: string) {
243
+ if (!convaiClient.state.isConnected || !convaiClient.isBotReady) return;
244
+ convaiClient.sendUserTextMessage(text);
188
245
  }
189
246
  ```
190
247
 
191
- ## Features
248
+ #### Step 4: Render the widget or your own UI
192
249
 
193
- ### Video Enabled Chat
250
+ ```tsx
251
+ import { ConvaiWidget } from "@convai/web-sdk";
194
252
 
195
- To enable video capabilities, set `enableVideo: true` in your configuration. This enables audio, video, and screen sharing.
253
+ <ConvaiWidget
254
+ convaiClient={convaiClient}
255
+ showVideo={true}
256
+ showScreenShare={true}
257
+ defaultVoiceMode={true}
258
+ />;
259
+ ```
196
260
 
197
- **React:**
261
+ #### Step 5: Subscribe to lifecycle events
198
262
 
199
263
  ```tsx
200
- import { useConvaiClient, ConvaiWidget } from "@convai/web-sdk";
264
+ useEffect(() => {
265
+ const unsubError = convaiClient.on("error", (error) => {
266
+ console.error("Convai error:", error);
267
+ });
201
268
 
202
- function App() {
203
- const convaiClient = useConvaiClient({
204
- apiKey: "your-api-key",
205
- characterId: "your-character-id",
206
- enableVideo: true,
207
- startWithVideoOn: false, // Camera off by default
269
+ const unsubState = convaiClient.on("stateChange", (state) => {
270
+ console.log("State:", state.agentState);
208
271
  });
209
272
 
210
- return (
211
- <ConvaiWidget
212
- convaiClient={convaiClient}
213
- showVideo={true}
214
- showScreenShare={true}
215
- />
216
- );
217
- }
273
+ const unsubMessages = convaiClient.on("messagesChange", (messages) => {
274
+ console.log("Messages:", messages.length);
275
+ });
276
+
277
+ return () => {
278
+ unsubError();
279
+ unsubState();
280
+ unsubMessages();
281
+ };
282
+ }, [convaiClient]);
218
283
  ```
219
284
 
220
- **Vanilla:**
285
+ #### Step 6: Clean up on unmount
286
+
287
+ ```tsx
288
+ useEffect(() => {
289
+ return () => {
290
+ void convaiClient.disconnect().catch(() => undefined);
291
+ };
292
+ }, [convaiClient]);
293
+ ```
221
294
 
222
- ```typescript
295
+ ### B) Vanilla TypeScript from scratch (widget + custom hooks)
296
+
297
+ #### Step 1: Initialize client and widget
298
+
299
+ ```ts
223
300
  import { ConvaiClient, createConvaiWidget } from "@convai/web-sdk/vanilla";
224
301
 
225
302
  const client = new ConvaiClient({
226
- apiKey: "your-api-key",
227
- characterId: "your-character-id",
303
+ apiKey: "<YOUR_CONVAI_API_KEY>",
304
+ characterId: "<YOUR_CHARACTER_ID>",
305
+ endUserId: "<UNIQUE_END_USER_ID>",
228
306
  enableVideo: true,
229
307
  startWithVideoOn: false,
230
308
  });
@@ -233,667 +311,693 @@ const widget = createConvaiWidget(document.body, {
233
311
  convaiClient: client,
234
312
  showVideo: true,
235
313
  showScreenShare: true,
314
+ defaultVoiceMode: true,
315
+ onConnect: () => console.log("Connected"),
316
+ onDisconnect: () => console.log("Disconnected"),
317
+ onMessage: (message) => console.log("Message:", message),
236
318
  });
237
319
  ```
238
320
 
239
- **Manual Video Controls:**
240
-
241
- ```typescript
242
- // Enable video camera
243
- await convaiClient.videoControls.enableVideo();
244
-
245
- // Disable video camera
246
- await convaiClient.videoControls.disableVideo();
247
-
248
- // Toggle video
249
- await convaiClient.videoControls.toggleVideo();
250
-
251
- // Check video state
252
- const isVideoEnabled = convaiClient.videoControls.isVideoEnabled;
253
-
254
- // Set video quality
255
- await convaiClient.videoControls.setVideoQuality("high"); // 'low' | 'medium' | 'high'
321
+ #### Step 2: Add explicit error listeners
256
322
 
257
- // Get available video devices
258
- const devices = await convaiClient.videoControls.getVideoDevices();
259
-
260
- // Set specific video device
261
- await convaiClient.videoControls.setVideoDevice(deviceId);
323
+ ```ts
324
+ const unsubError = client.on("error", (error) => {
325
+ console.error("SDK error:", error);
326
+ });
262
327
  ```
263
328
 
264
- **Screen Sharing:**
265
-
266
- ```typescript
267
- // Enable screen share
268
- await convaiClient.screenShareControls.enableScreenShare();
329
+ #### Step 3: Add guarded send utility
269
330
 
270
- // Enable screen share with audio
271
- await convaiClient.screenShareControls.enableScreenShareWithAudio();
272
-
273
- // Disable screen share
274
- await convaiClient.screenShareControls.disableScreenShare();
331
+ ```ts
332
+ function safeSend(text: string) {
333
+ if (!text.trim()) return;
334
+ if (!client.state.isConnected) return;
335
+ if (!client.isBotReady) return;
336
+ client.sendUserTextMessage(text);
337
+ }
338
+ ```
275
339
 
276
- // Toggle screen share
277
- await convaiClient.screenShareControls.toggleScreenShare();
340
+ #### Step 4: Cleanup
278
341
 
279
- // Check screen share state
280
- const isActive = convaiClient.screenShareControls.isScreenShareActive;
342
+ ```ts
343
+ function destroy() {
344
+ unsubError();
345
+ widget.destroy();
346
+ void client.disconnect().catch(() => undefined);
347
+ }
281
348
  ```
282
349
 
283
- **Video State Monitoring:**
284
-
285
- ```typescript
286
- // React
287
- const { isVideoEnabled } = convaiClient;
350
+ ### C) Custom UI (framework-agnostic)
288
351
 
289
- // Core API (event-based)
290
- convaiClient.videoControls.on("videoStateChange", (state) => {
291
- console.log("Video enabled:", state.isVideoEnabled);
292
- console.log("Video hidden:", state.isVideoHidden);
293
- });
294
- ```
352
+ If you are not using the built-in widget:
295
353
 
296
- ### Lipsync (Facial Animation for 3D Characters)
354
+ - Use `ConvaiClient` from `@convai/web-sdk/core`
355
+ - Use `AudioRenderer` from `@convai/web-sdk/vanilla` for remote audio playback
356
+ - Render your own UI based on `stateChange`, `messagesChange`, and control manager events
297
357
 
298
- Enable lipsync to receive blendshape data for animating 3D character faces in sync with speech:
358
+ ```ts
359
+ import { ConvaiClient } from "@convai/web-sdk/core";
360
+ import { AudioRenderer } from "@convai/web-sdk/vanilla";
299
361
 
300
- ```typescript
301
362
  const client = new ConvaiClient({
302
- apiKey: "your-api-key",
303
- characterId: "your-character-id",
304
- enableLipsync: true,
305
- blendshapeConfig: {
306
- format: "arkit", // or "mha" for MetaHuman
307
- },
363
+ apiKey: "<YOUR_CONVAI_API_KEY>",
364
+ characterId: "<YOUR_CHARACTER_ID>",
308
365
  });
309
366
 
310
367
  await client.connect();
368
+ const audioRenderer = new AudioRenderer(client.room);
311
369
 
312
- // In your 3D render loop (60 FPS)
313
- let conversationStartTime = 0;
314
-
315
- client.on("speakingChange", (speaking) => {
316
- if (speaking) conversationStartTime = Date.now();
317
- });
318
-
319
- function render() {
320
- const elapsedSeconds = (Date.now() - conversationStartTime) / 1000;
321
- const result = client.blendshapeQueue.getFrameAtTime(elapsedSeconds);
322
-
323
- if (result) {
324
- // Apply blendshape values to your 3D character
325
- myCharacter.morphTargets["jawOpen"] = result.frame[0];
326
- myCharacter.morphTargets["mouthSmile"] = result.frame[1];
327
- // ... apply remaining blendshapes
328
- }
370
+ // ... your custom UI logic
329
371
 
330
- requestAnimationFrame(render);
331
- }
372
+ audioRenderer.destroy();
373
+ await client.disconnect();
332
374
  ```
333
375
 
334
- **Blendshape Formats:**
376
+ ## 6. Core Concepts and Lifecycle
335
377
 
336
- - `arkit` - 61 blendshapes (iOS ARKit standard)
337
- - `mha` - 251 blendshapes (MetaHuman)
378
+ ### Connection lifecycle
338
379
 
339
- ### Interruption
380
+ 1. `connect()` starts room and transport setup.
381
+ 2. `state.isConnected` becomes `true` once the room connection is established.
382
+ 3. `botReady` event indicates the character is ready for interaction.
383
+ 4. Messages stream through data events into `chatMessages`.
384
+ 5. Audio/video/screen-share are managed through dedicated control managers.
385
+ 6. `disconnect()` tears down the session.
340
386
 
341
- Interrupt the character's current response to allow the user to speak immediately.
387
+ ### Activity lifecycle
342
388
 
343
- **React:**
389
+ - `state.isThinking`: the model is generating a response
390
+ - `state.isSpeaking`: the character's audio is currently playing
391
+ - `state.agentState`: combined high-level state (`disconnected | connected | listening | thinking | speaking`)
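As a mental model, `agentState` can be read as a pure function of the lower-level flags. The derivation below is a sketch of the five documented states with an assumed precedence (speaking over thinking over listening); the SDK computes the real value internally, so treat this as documentation, not implementation.

```typescript
// Sketch only: derive a combined agent state from low-level flags.
// Precedence (speaking > thinking > listening) is an assumption for illustration.
type AgentState = "disconnected" | "connected" | "listening" | "thinking" | "speaking";

interface ActivityFlags {
  isConnected: boolean;
  isThinking: boolean;
  isSpeaking: boolean;
  isMicOpen: boolean; // hypothetical stand-in for "user audio is unmuted"
}

function deriveAgentState(f: ActivityFlags): AgentState {
  if (!f.isConnected) return "disconnected";
  if (f.isSpeaking) return "speaking";
  if (f.isThinking) return "thinking";
  if (f.isMicOpen) return "listening";
  return "connected";
}
```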
344
392
 
345
- ```tsx
346
- function ChatInterface() {
347
- const convaiClient = useConvaiClient({
348
- /* config */
349
- });
393
+ ### Widget lifecycle
350
394
 
351
- const handleInterrupt = () => {
352
- // Interrupt the bot's current response
353
- convaiClient.sendInterruptMessage();
354
- };
395
+ Both React and vanilla widgets:
355
396
 
356
- return <button onClick={handleInterrupt}>Interrupt</button>;
357
- }
358
- ```
397
+ - auto-connect on first user interaction
398
+ - expose optional callbacks/events
399
+ - need explicit cleanup on app teardown
359
400
 
360
- **Vanilla:**
401
+ ## 7. Configuration Reference (`ConvaiConfig`)
361
402
 
362
- ```typescript
363
- const interruptButton = document.getElementById("interrupt-btn");
403
+ | Field | Type | Required | Default | Description |
404
+ | ----------------------------------------- | ------------------ | -------- | -------------------- | ----------------------------------------------------------------------------------- |
405
+ | `apiKey` | `string` | Yes | - | Convai API key. |
406
+ | `characterId` | `string` | Yes | - | Target character identifier. |
407
+ | `endUserId` | `string` | No | `undefined` | Stable end-user identity for memory/analytics continuity. |
408
+ | `url` | `string` | No | SDK internal default | Convai base URL. Set explicitly if your deployment requires a specific environment. |
409
+ | `enableVideo` | `boolean` | No | `false` | Enables video-capable connection type. |
410
+ | `startWithVideoOn` | `boolean` | No | `false` | Auto-enable camera after connect. |
411
+ | `startWithAudioOn` | `boolean` | No | `false` | Auto-enable microphone after connect. |
412
+ | `ttsEnabled` | `boolean` | No | `true` | Enables model text-to-speech output. |
413
+ | `enableLipsync` | `boolean` | No | `false` | Requests blendshape payloads for facial animation. |
414
+ | `blendshapeConfig.format` | `"arkit" \| "mha"` | No | `"mha"` | Blendshape output format. |
415
+ | `blendshapeConfig.frames_buffer_duration` | `number` | No | server-defined | Buffering hint for audio/blendshape synchronization. |
416
+ | `actionConfig` | object | No | `undefined` | Action and scene-context metadata (actions, characters, objects, attention object). |
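Putting the table together: only `apiKey` and `characterId` are required, and the SDK applies the listed defaults internally. The defaults-merge helper below is a documentation aid restating the table, not SDK code.

```typescript
// Illustrative restatement of the defaults table above (not part of the SDK).
interface ConvaiConfigSketch {
  apiKey: string;
  characterId: string;
  endUserId?: string;
  url?: string;
  enableVideo?: boolean;
  startWithVideoOn?: boolean;
  startWithAudioOn?: boolean;
  ttsEnabled?: boolean;
  enableLipsync?: boolean;
  blendshapeConfig?: { format?: "arkit" | "mha"; frames_buffer_duration?: number };
}

function withDefaults(config: ConvaiConfigSketch) {
  return {
    // Defaults from the table; user-supplied values win via the spread below.
    enableVideo: false,
    startWithVideoOn: false,
    startWithAudioOn: false,
    ttsEnabled: true,
    enableLipsync: false,
    ...config,
  };
}
```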
364
417
 
365
- interruptButton.addEventListener("click", () => {
366
- client.sendInterruptMessage();
367
- });
368
- ```
369
-
370
- **Voice Mode Interruption Pattern:**
418
+ ## 8. Core API Reference (`ConvaiClient`)
371
419
 
372
- When implementing voice mode, interrupt the bot when the user starts speaking:
373
-
374
- ```typescript
375
- // When user enters voice mode
376
- const enterVoiceMode = async () => {
377
- // Interrupt any ongoing bot response
378
- convaiClient.sendInterruptMessage();
379
-
380
- // Unmute microphone
381
- await convaiClient.audioControls.unmuteAudio();
382
- };
420
+ Import:
383
421
 
384
- // When user exits voice mode
385
- const exitVoiceMode = async () => {
386
- // Interrupt any ongoing bot response
387
- convaiClient.sendInterruptMessage();
388
-
389
- // Mute microphone
390
- await convaiClient.audioControls.muteAudio();
391
- };
422
+ ```ts
423
+ import { ConvaiClient } from "@convai/web-sdk/core";
392
424
  ```
393
425
 
394
- ### User Microphone Mute/Unmute
395
-
396
- Control the user's microphone input.
426
+ ### Constructor
427
+
428
+ ```ts
429
+ new ConvaiClient(config?: ConvaiConfig)
430
+ ```
431
+
432
+ ### Properties
433
+
434
+ | Property | Type | Description |
435
+ | ----------------------- | ---------------------------- | -------------------------------------------------------- |
436
+ | `state` | `ConvaiClientState` | Real-time connection/activity state. |
437
+ | `connectionType` | `"audio" \| "video" \| null` | Active transport mode. |
438
+ | `apiKey` | `string \| null` | Active API key. |
439
+ | `characterId` | `string \| null` | Active character ID. |
440
+ | `speakerId` | `string \| null` | Resolved speaker identity. |
441
+ | `room` | `Room` | Internal LiveKit room instance. |
442
+ | `chatMessages` | `ChatMessage[]` | Conversation message store. |
443
+ | `userTranscription` | `string` | Current non-final voice transcription text. |
444
+ | `characterSessionId` | `string \| null` | Server conversation session identifier. |
445
+ | `isBotReady` | `boolean` | Character readiness flag. |
446
+ | `audioControls` | `AudioControls` | Microphone controls. |
447
+ | `videoControls` | `VideoControls` | Camera controls. |
448
+ | `screenShareControls` | `ScreenShareControls` | Screen sharing controls. |
449
+ | `latencyMonitor` | `LatencyMonitor` | Measurement manager used by the client for turn latency. |
450
+ | `blendshapeQueue` | `BlendshapeQueue` | Buffer queue for lipsync frames. |
451
+ | `conversationSessionId` | `number` | Incremental turn session ID used by conversation events. |
452
+
453
+ ### Methods
454
+
455
+ | Method | Signature | Description |
456
+ | ---------------------- | ------------------------------------------------------------------- | ---------------------------------------------------------- |
457
+ | `connect` | `(config?: ConvaiConfig) => Promise<void>` | Connect using passed config or stored config. |
458
+ | `disconnect` | `() => Promise<void>` | Disconnect and release session resources. |
459
+ | `reconnect` | `() => Promise<void>` | Disconnect then connect with stored config. |
460
+ | `resetSession` | `() => void` | Reset character session and clear conversation history. |
461
+ | `sendUserTextMessage` | `(text: string) => void` | Send text message to character. |
462
+ | `sendTriggerMessage` | `(triggerName?: string, triggerMessage?: string) => void` | Send trigger/action message. |
463
+ | `sendInterruptMessage` | `() => void` | Interrupt current bot response. |
464
+ | `updateTemplateKeys` | `(templateKeys: Record<string, string>) => void` | Update runtime template variables. |
465
+ | `updateDynamicInfo` | `(dynamicInfo: { text: string }) => void` | Update dynamic context text. |
466
+ | `toggleTts` | `(enabled: boolean) => void` | Enable/disable TTS for subsequent responses. |
467
+ | `on` | `(event: string, callback: (...args: any[]) => void) => () => void` | Subscribe to an event and receive an unsubscribe function. |
468
+ | `off` | `(event: string, callback: (...args: any[]) => void) => void` | Remove a specific listener. |
469
+
470
+ ### Common event names and payloads
471
+
472
+ | Event | Payload | Notes |
473
+ | ------------------------- | --------------------------------------- | --------------------------------------------------------------------------------------------------------------- |
474
+ | `stateChange` | `ConvaiClientState` | Any state transition. |
475
+ | `message` | `ChatMessage` | Last message whenever `messagesChange` updates. |
476
+ | `messagesChange` | `ChatMessage[]` | Full message array update. |
477
+ | `userTranscriptionChange` | `string` | Live user speech text updates. |
478
+ | `speakingChange` | `boolean` | Bot speaking started/stopped. |
479
+ | `botReady` | `void` | Bot can now receive interaction. |
480
+ | `connect` | `void` | Client connected. |
481
+ | `disconnect` | `void` | Client disconnected. |
482
+ | `error` | `unknown` | Error surfaced by client. |
483
+ | `conversationStart` | `{ sessionId, userMessage, timestamp }` | Conversation turn started. |
484
+ | `turnEnd` | `{ sessionId, duration, timestamp }` | Server signaled end of turn (bot stopped speaking). Same semantics as `BlendshapeQueue.hasReceivedEndSignal()`. |
485
+ | `blendshapes` | `unknown` | Incoming blendshape chunk payload. |
486
+ | `blendshapeStatsReceived` | `unknown` | End-of-turn blendshape stats marker. |
487
+ | `latencyMeasurement` | `LatencyMeasurement` | Latency sample from monitor. |
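The subscription contract documented above (`on` registers a listener and returns an unsubscribe function; `off` removes a specific listener) can be sketched with a minimal emitter. This is a behavioral illustration of that contract, not the SDK's actual `EventEmitter`.

```typescript
// Minimal emitter illustrating the documented on/off contract:
// on() registers a listener and returns a function that removes it again.
type Listener = (...args: unknown[]) => void;

class MiniEmitter {
  private listeners = new Map<string, Set<Listener>>();

  on(event: string, cb: Listener): () => void {
    if (!this.listeners.has(event)) this.listeners.set(event, new Set());
    this.listeners.get(event)!.add(cb);
    return () => this.off(event, cb);
  }

  off(event: string, cb: Listener): void {
    this.listeners.get(event)?.delete(cb);
  }

  emit(event: string, ...args: unknown[]): void {
    this.listeners.get(event)?.forEach((cb) => cb(...args));
  }
}
```

Storing the returned unsubscribe functions (as the React examples in Section 5 do) is the simplest way to guarantee cleanup on unmount.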
488
+
489
+ ### Control manager APIs
490
+
491
+ #### `audioControls`
492
+
493
+ Properties:
494
+
495
+ - `isAudioEnabled`
496
+ - `isAudioMuted`
497
+ - `audioLevel`
498
+
499
+ Methods:
500
+
501
+ - `enableAudio()`
502
+ - `disableAudio()`
503
+ - `muteAudio()`
504
+ - `unmuteAudio()`
505
+ - `toggleAudio()`
506
+ - `setAudioDevice(deviceId)`
507
+ - `getAudioDevices()`
508
+ - `startAudioLevelMonitoring()`
509
+ - `stopAudioLevelMonitoring()`
510
+ - `on("audioStateChange", callback)`
511
+ - `off("audioStateChange", callback)`
512
+
513
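As a sketch of how the mute methods relate, the state transitions can be modeled with plain fields. This models only the documented control surface, not the SDK implementation: the real methods are async and interact with device permissions, and the exact interplay of enable/mute state is an assumption here.

```typescript
// Plain model of the documented audioControls mute surface (illustration only).
class AudioMuteModel {
  isAudioEnabled = false;
  isAudioMuted = false;

  enableAudio(): void {
    this.isAudioEnabled = true;
  }

  disableAudio(): void {
    this.isAudioEnabled = false;
  }

  muteAudio(): void {
    this.isAudioMuted = true;
  }

  unmuteAudio(): void {
    this.isAudioMuted = false;
  }

  // toggleAudio flips the mute state, mirroring the real (async) method.
  toggleAudio(): void {
    this.isAudioMuted = !this.isAudioMuted;
  }
}

const audio = new AudioMuteModel();
audio.enableAudio();
audio.muteAudio();
audio.toggleAudio(); // back to unmuted
```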
+ #### `videoControls`
514
+
515
+ Properties:
516
+
517
+ - `isVideoEnabled`
518
+ - `isVideoHidden`
519
+
520
+ Methods:
521
+
522
+ - `enableVideo()`
523
+ - `disableVideo()`
524
+ - `hideVideo()`
525
+ - `showVideo()`
526
+ - `toggleVideo()`
527
+ - `setVideoDevice(deviceId)`
528
+ - `getVideoDevices()`
529
+ - `setVideoQuality("low" | "medium" | "high")`
530
+ - `on("videoStateChange", callback)`
531
+ - `off("videoStateChange", callback)`
532
+
533
+ #### `screenShareControls`
534
+
535
+ Properties:
536
+
537
+ - `isScreenShareEnabled`
538
+ - `isScreenShareActive`
539
+
540
+ Methods:
541
+
542
+ - `enableScreenShare()`
543
+ - `disableScreenShare()`
544
+ - `toggleScreenShare()`
545
+ - `enableScreenShareWithAudio()`
546
+ - `getScreenShareTracks()`
547
+ - `on("screenShareStateChange", callback)`
548
+ - `off("screenShareStateChange", callback)`
549
+
550
+ ### `latencyMonitor` API (via `client.latencyMonitor`)
551
+
552
+ `latencyMonitor` is available on every client instance for instrumentation and diagnostics.
553
+
554
+ Methods:
555
+
556
+ - `enable()`
557
+ - `disable()`
558
+ - `startMeasurement(type, userMessage?)`
559
+ - `endMeasurement()`
560
+ - `cancelMeasurement()`
561
+ - `getMeasurements()`
562
+ - `getLatestMeasurement()`
563
+ - `getStats()`
564
+ - `clear()`
565
+ - `getPendingMeasurement()`
566
+ - `on("measurement", callback)`
567
+ - `on("measurementsChange", callback)`
568
+ - `on("enabledChange", callback)`
569
+
570
+ Properties:
571
+
572
+ - `enabled`
573
+ - `hasPendingMeasurement`
574
+
575
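Samples from `getMeasurements()` can be summarized with a small pure helper. The `durationMs` field name on the measurement shape is an assumption for illustration; adapt it to the exported `LatencyMeasurement` type.

```typescript
// Hypothetical measurement shape; check the exported LatencyMeasurement type.
interface MeasurementLike {
  durationMs: number;
}

// Compute simple aggregate stats over a batch of latency samples.
function summarizeLatency(measurements: MeasurementLike[]) {
  if (measurements.length === 0) return null;
  const durations = measurements.map((m) => m.durationMs);
  const total = durations.reduce((sum, d) => sum + d, 0);
  return {
    count: durations.length,
    avgMs: total / durations.length,
    minMs: Math.min(...durations),
    maxMs: Math.max(...durations),
  };
}

// e.g. summarizeLatency(client.latencyMonitor.getMeasurements())
const stats = summarizeLatency([
  { durationMs: 120 },
  { durationMs: 180 },
  { durationMs: 300 },
]);
```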
+ ### Advanced core classes (`@convai/web-sdk/core`)
576
+
577
+ These are exported for advanced and custom pipeline use cases.

578
+
579
+ #### `BlendshapeQueue`
580
+
581
+ Buffer for lipsync frames. Use `isConversationEnded()` for definitive end-of-conversation detection: it returns true only when the server has sent `blendshape-turn-stats` and either all expected frames have been consumed or the queue is empty (which handles dropped frames). Use `hasReceivedEndSignal()` when you only need to know that the server signaled the end of the turn (e.g. to keep playing the remaining frames).
582
+
583
+ Methods:
397
584
 
398
- **React:**
399
-
400
- ```tsx
401
- function AudioControls() {
402
- const convaiClient = useConvaiClient({
403
- /* config */
404
- });
585
+ - `addChunk(blendshapes)`
586
+ - `getFrames()`
587
+ - `getFrame(index)`
588
+ - `getFrameWithAlpha(index)`
589
+ - `consumeFrames(count)`
590
+ - `hasFrames()`
591
+ - `isConversationActive()`
592
+ - `isConversationEnded()` — true when server signaled end and playback is complete (all frames consumed or queue empty)
593
+ - `hasReceivedEndSignal()` — true when server sent `blendshape-turn-stats` (does not check frame consumption)
594
+ - `startConversation()`
595
+ - `startBotSpeaking()`
596
+ - `stopBotSpeaking()`
597
+ - `isBotSpeaking()`
598
+ - `endConversation(stats?)`
599
+ - `interrupt()`
600
+ - `getTurnStats()`
601
+ - `getFramesConsumed()`
602
+ - `getTimeLeftMs()`
603
+ - `isAllFramesConsumed()`
604
+ - `reset()`
605
+ - `getFrameAtTime(elapsedTime)`
606
+ - `getDebugInfo()`
405
607
 
406
- const handleMute = async () => {
407
- await convaiClient.audioControls.muteAudio();
408
- };
608
+ Properties:
409
609
 
410
- const handleUnmute = async () => {
411
- await convaiClient.audioControls.unmuteAudio();
412
- };
610
+ - `length`
413
611
 
414
- const handleToggle = async () => {
415
- await convaiClient.audioControls.toggleAudio();
416
- };
612
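The distinction between the two end checks can be captured in a playback-loop guard. `QueueLike` below is a stub mirroring the documented semantics, not the real class:

```typescript
// Stub of the two documented end checks (illustration only).
interface QueueLike {
  hasFrames(): boolean;
  hasReceivedEndSignal(): boolean;
  isConversationEnded(): boolean;
}

// Keep animating while frames remain, even after the server end signal;
// only tear down once the queue reports the conversation fully ended.
function nextPlaybackAction(queue: QueueLike): "play" | "drain" | "teardown" {
  if (queue.isConversationEnded()) return "teardown";
  if (queue.hasReceivedEndSignal() && queue.hasFrames()) return "drain";
  return "play";
}

// Server signaled end but frames remain: keep draining.
const draining: QueueLike = {
  hasFrames: () => true,
  hasReceivedEndSignal: () => true,
  isConversationEnded: () => false,
};

// End signaled and playback complete: safe to reset.
const done: QueueLike = {
  hasFrames: () => false,
  hasReceivedEndSignal: () => true,
  isConversationEnded: () => true,
};
```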
+ #### `MessageHandler`
417
613
 
418
- return (
419
- <div>
420
- <button onClick={handleMute}>Mute</button>
421
- <button onClick={handleUnmute}>Unmute</button>
422
- <button onClick={handleToggle}>Toggle</button>
423
- <p>Muted: {convaiClient.audioControls.isAudioMuted ? "Yes" : "No"}</p>
424
- </div>
425
- );
426
- }
427
- ```
614
+ Methods:
428
615
 
429
- **Vanilla:**
616
+ - `getBlendshapeQueue()`
617
+ - `setLatencyMonitor(monitor)`
618
+ - `getChatMessages()`
619
+ - `getUserTranscription()`
620
+ - `getIsBotResponding()`
621
+ - `getIsSpeaking()`
622
+ - `setRoom(room)`
623
+ - `reset()`
624
+ - inherited event APIs from `EventEmitter`:
625
+ - `on(event, callback)`
626
+ - `off(event, callback)`
430
627
 
431
- ```typescript
432
- // Mute microphone
433
- await client.audioControls.muteAudio();
628
+ #### `EventEmitter`
434
629
 
435
- // Unmute microphone
436
- await client.audioControls.unmuteAudio();
630
+ Methods:
437
631
 
438
- // Toggle mute state
439
- await client.audioControls.toggleAudio();
632
+ - `on(event, callback)`
633
+ - `off(event, callback)`
634
+ - `emit(event, ...args)`
635
+ - `removeAllListeners()`
636
+ - `listenerCount(event)`
637
+
638
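A minimal implementation matching the listed surface looks roughly like this. It is a sketch of the contract, not the SDK source:

```typescript
type Callback = (...args: unknown[]) => void;

class MiniEmitter {
  private handlers = new Map<string, Callback[]>();

  on(event: string, callback: Callback): void {
    const list = this.handlers.get(event) ?? [];
    list.push(callback);
    this.handlers.set(event, list);
  }

  off(event: string, callback: Callback): void {
    const list = this.handlers.get(event) ?? [];
    this.handlers.set(event, list.filter((cb) => cb !== callback));
  }

  emit(event: string, ...args: unknown[]): void {
    for (const cb of this.handlers.get(event) ?? []) cb(...args);
  }

  removeAllListeners(): void {
    this.handlers.clear();
  }

  listenerCount(event: string): number {
    return this.handlers.get(event)?.length ?? 0;
  }
}

const emitter = new MiniEmitter();
const log: string[] = [];
const onMessage = (text: unknown) => log.push(String(text));
emitter.on("message", onMessage);
emitter.emit("message", "hello");
emitter.off("message", onMessage);
emitter.emit("message", "ignored"); // no listeners remain
```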
+ ## 9. Message Semantics and Turn Completion
440
639
 
441
- // Check mute state
442
- const isMuted = client.audioControls.isAudioMuted;
640
+ ### `ChatMessage` model
641
+
642
+ `ChatMessage` includes:
643
+
644
+ - `id`
645
+ - `type`
646
+ - `content`
647
+ - `timestamp`
648
+ - `isStreaming?` — `true` while the message is still streaming (mutable), `false` when finalized
443
649
 
444
- // Enable audio (request permissions if needed)
445
- await client.audioControls.enableAudio();
650
+ Supported message `type` values include:
446
651
 
447
- // Disable audio
448
- await client.audioControls.disableAudio();
449
- ```
652
+ - `user`
653
+ - `convai`
654
+ - `emotion`
655
+ - `behavior-tree`
656
+ - `action`
657
+ - `user-transcription`
658
+ - `bot-llm-text`
659
+ - `bot-emotion`
660
+ - `user-llm-text`
661
+ - `interrupt-bot`
662
+
663
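For transcript UIs it is common to keep only the conversational types. A small filter over the values above (field names per the `ChatMessage` model in this section; which types count as "conversational" is an app-level choice):

```typescript
interface ChatMessageLike {
  id: string;
  type: string;
  content: string;
  isStreaming?: boolean;
}

// App-level choice of which message types belong in a visible transcript.
const TRANSCRIPT_TYPES = new Set(["user", "convai", "user-transcription", "bot-llm-text"]);

// Keep only finalized conversational messages for display.
function transcriptMessages(messages: ChatMessageLike[]): ChatMessageLike[] {
  return messages.filter(
    (m) => TRANSCRIPT_TYPES.has(m.type) && m.isStreaming !== true,
  );
}

const visible = transcriptMessages([
  { id: "1", type: "user", content: "Hi" },
  { id: "2", type: "bot-emotion", content: "joy" },       // non-transcript type
  { id: "3", type: "convai", content: "Hello!", isStreaming: true }, // still streaming
]);
```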
+ ### Recommended way to detect response completion
450
664
 
451
- **Audio Device Management:**
665
+ Use events instead of checking `isStreaming`:
452
666
 
453
- ```typescript
454
- // Get available audio devices
455
- const devices = await convaiClient.audioControls.getAudioDevices();
667
+ - `turnEnd` for the server turn-end signal (bot stopped speaking; same as `hasReceivedEndSignal()`)
668
+ - `blendshapeStatsReceived` as an additional completion marker when lipsync/animation output is enabled
456
669
 
457
- // Set specific audio device
458
- await convaiClient.audioControls.setAudioDevice(deviceId);
670
+ When driving lipsync from `BlendshapeQueue`, use `blendshapeQueue.isConversationEnded()` for definitive end-of-conversation. It returns true only when the server has signaled end and playback is complete (all expected frames consumed or queue empty). Call `blendshapeQueue.reset()` and your `onConversationEnded` when it becomes true. Use `hasReceivedEndSignal()` only when you need the raw server signal (e.g. to decide whether to keep playing remaining frames).
459
671
 
460
- // Monitor audio level
461
- convaiClient.audioControls.startAudioLevelMonitoring();
672
+ Example:
462
673
 
463
- convaiClient.audioControls.on("audioLevelChange", (level) => {
464
- console.log("Audio level:", level);
465
- // level is a number between 0 and 1
466
- });
467
-
468
- convaiClient.audioControls.stopAudioLevelMonitoring();
469
- ```
470
-
471
- **Audio State Monitoring:**
472
-
473
- ```typescript
474
- // React
475
- const { isAudioMuted } = convaiClient;
476
-
477
- // Core API (event-based)
478
- convaiClient.audioControls.on("audioStateChange", (state) => {
479
- console.log("Audio enabled:", state.isAudioEnabled);
480
- console.log("Audio muted:", state.isAudioMuted);
481
- console.log("Audio level:", state.audioLevel);
482
- });
483
- ```
674
+ ```ts
675
+ type TurnCompletionOptions = {
676
+ expectBlendshapes: boolean;
677
+ onComplete: () => void;
678
+ };
484
679
 
485
- ### Character TTS Mute/Unmute
680
+ function subscribeTurnCompletion(client: any, options: TurnCompletionOptions) {
681
+ let spokenDone = false;
682
+ let animationDone = !options.expectBlendshapes;
486
683
 
487
- Control whether the character's responses are spoken aloud (text-to-speech).
684
+ const invokeOnCompleteIfReady = () => {
685
+ if (spokenDone && animationDone) {
686
+ options.onComplete();
687
+ }
688
+ };
488
689
 
489
- **React:**
690
+ const unsubTurnEnd = client.on("turnEnd", () => {
691
+ spokenDone = true;
692
+ invokeOnCompleteIfReady();
693
+ });
490
694
 
491
- ```tsx
492
- function TTSControls() {
493
- const convaiClient = useConvaiClient({
494
- /* config */
695
+ const unsubBlendshapeStats = client.on("blendshapeStatsReceived", () => {
696
+ animationDone = true;
697
+ invokeOnCompleteIfReady();
495
698
  });
496
699
 
497
- const handleToggleTTS = (enabled: boolean) => {
498
- convaiClient.toggleTts(enabled);
700
+ return () => {
701
+ unsubTurnEnd();
702
+ unsubBlendshapeStats();
499
703
  };
500
-
501
- return (
502
- <div>
503
- <button onClick={() => handleToggleTTS(true)}>Enable TTS</button>
504
- <button onClick={() => handleToggleTTS(false)}>Disable TTS</button>
505
- </div>
506
- );
507
704
  }
508
705
  ```
509
706
 
510
- **Vanilla:**
511
-
512
- ```typescript
513
- // Enable text-to-speech (character will speak responses)
514
- client.toggleTts(true);
515
-
516
- // Disable text-to-speech (character will only send text, no audio)
517
- client.toggleTts(false);
518
- ```
519
-
520
- **Initial TTS Configuration:**
521
-
522
- ```typescript
523
- // Set TTS state during connection
524
- const client = new ConvaiClient({
525
- apiKey: "your-api-key",
526
- characterId: "your-character-id",
527
- ttsEnabled: true, // Enable TTS by default
528
- });
529
-
530
- // Or disable initially
531
- const client = new ConvaiClient({
532
- apiKey: "your-api-key",
533
- characterId: "your-character-id",
534
- ttsEnabled: false, // Disable TTS
535
- });
536
- ```
707
+ When to use both signals: you only need to wait for both `turnEnd` and `blendshapeStatsReceived` when you use lipsync. Set `expectBlendshapes: false` when you do not use facial animation; `animationDone` is then always true, and completion runs as soon as `turnEnd` fires. Set `expectBlendshapes: true` when you drive lipsync from the queue; speech and blendshape data travel through separate pipelines and can finish in either order, so waiting for both guarantees that speech and animation have both finished before `onComplete` runs.
537
708
 
538
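The helper above can be exercised against a stub client to confirm that completion fires once regardless of signal order. The stub is an assumption for illustration; it only mimics `on` returning an unsubscribe function:

```typescript
type Handler = () => void;

// Stub client: `on` returns an unsubscribe function, as in the real SDK.
function makeStubClient() {
  const handlers = new Map<string, Handler[]>();
  return {
    on(event: string, cb: Handler): () => void {
      const list = handlers.get(event) ?? [];
      list.push(cb);
      handlers.set(event, list);
      return () =>
        handlers.set(event, (handlers.get(event) ?? []).filter((h) => h !== cb));
    },
    emit(event: string): void {
      for (const cb of handlers.get(event) ?? []) cb();
    },
  };
}

// Same shape as subscribeTurnCompletion in the example above (condensed).
function subscribeTurnCompletion(
  client: ReturnType<typeof makeStubClient>,
  options: { expectBlendshapes: boolean; onComplete: () => void },
) {
  let spokenDone = false;
  let animationDone = !options.expectBlendshapes;
  const check = () => {
    if (spokenDone && animationDone) options.onComplete();
  };
  const unsubTurnEnd = client.on("turnEnd", () => {
    spokenDone = true;
    check();
  });
  const unsubStats = client.on("blendshapeStatsReceived", () => {
    animationDone = true;
    check();
  });
  return () => {
    unsubTurnEnd();
    unsubStats();
  };
}

const stubClient = makeStubClient();
let completions = 0;
subscribeTurnCompletion(stubClient, {
  expectBlendshapes: true,
  onComplete: () => completions++,
});
stubClient.emit("blendshapeStatsReceived"); // animation finished first
stubClient.emit("turnEnd"); // speech finished second → completes now
```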
- ### Voice Mode Implementation
709
+ ## 10. React API Reference
539
710
 
540
- Voice mode allows users to speak instead of typing. The widget automatically handles voice mode, but you can implement it manually.
711
+ ### `useConvaiClient(config?)`
541
712
 
542
- **React - Manual Voice Mode:**
713
+ Import:
543
714
 
544
715
  ```tsx
545
716
  import { useConvaiClient } from "@convai/web-sdk";
546
- import { useState, useEffect } from "react";
547
-
548
- function CustomChatInterface() {
549
- const convaiClient = useConvaiClient({
550
- /* config */
551
- });
552
- const [isVoiceMode, setIsVoiceMode] = useState(false);
553
-
554
- const enterVoiceMode = async () => {
555
- // Interrupt any ongoing bot response
556
- convaiClient.sendInterruptMessage();
717
+ ```
557
718
 
558
- // Unmute microphone
559
- await convaiClient.audioControls.unmuteAudio();
719
+ Returns full `IConvaiClient` plus React-friendly reactive fields:
560
720
 
561
- setIsVoiceMode(true);
562
- };
721
+ - `activity`
722
+ - `chatMessages`
723
+ - `isAudioMuted`
724
+ - `isVideoEnabled`
725
+ - `isScreenShareActive`
563
726
 
564
- const exitVoiceMode = async () => {
565
- // Interrupt any ongoing bot response
566
- convaiClient.sendInterruptMessage();
727
+ ### `ConvaiWidget`
567
728
 
568
- // Mute microphone
569
- await convaiClient.audioControls.muteAudio();
729
+ Import:
570
730
 
571
- setIsVoiceMode(false);
572
- };
573
-
574
- // Monitor user transcription for voice input
575
- useEffect(() => {
576
- const transcription = convaiClient.userTranscription;
577
- if (transcription && isVoiceMode) {
578
- // Display real-time transcription
579
- console.log("User is saying:", transcription);
580
- }
581
- }, [convaiClient.userTranscription, isVoiceMode]);
582
-
583
- return (
584
- <div>
585
- {isVoiceMode ? (
586
- <div>
587
- <p>Listening: {convaiClient.userTranscription}</p>
588
- <button onClick={exitVoiceMode}>Stop Voice Mode</button>
589
- </div>
590
- ) : (
591
- <button onClick={enterVoiceMode}>Start Voice Mode</button>
592
- )}
593
- </div>
594
- );
595
- }
731
+ ```tsx
732
+ import { ConvaiWidget } from "@convai/web-sdk";
596
733
  ```
597
734
 
598
- **Vanilla - Manual Voice Mode:**
735
+ Props:
599
736
 
600
- ```typescript
601
- let isVoiceMode = false;
737
+ | Prop | Type | Default | Description |
738
+ | ------------------ | --------------------------------------------------------------------------------------------------------------------- | -------- | ------------------------------------------------------------------ |
739
+ | `convaiClient` | `IConvaiClient & { activity?: string; isAudioMuted: boolean; isVideoEnabled: boolean; isScreenShareActive: boolean }` | required | Client instance returned by `useConvaiClient`. |
740
+ | `showVideo` | `boolean` | `true` | Shows video toggle in settings if connection type is video. |
741
+ | `showScreenShare` | `boolean` | `true` | Shows screen-share toggle in settings if connection type is video. |
742
+ | `defaultVoiceMode` | `boolean` | `true` | Opens in voice mode on first widget session. |
602
743
 
603
- const enterVoiceMode = async () => {
604
- // Interrupt any ongoing bot response
605
- client.sendInterruptMessage();
744
+ ### `useCharacterInfo(characterId?, apiKey?)`
606
745
 
607
- // Unmute microphone
608
- await client.audioControls.unmuteAudio();
746
+ Returns:
609
747
 
610
- isVoiceMode = true;
611
- updateUI();
612
- };
748
+ - `name`
749
+ - `image`
750
+ - `isLoading`
751
+ - `error`
613
752
 
614
- const exitVoiceMode = async () => {
615
- // Interrupt any ongoing bot response
616
- client.sendInterruptMessage();
753
+ ### `useLocalCameraTrack()`
617
754
 
618
- // Mute microphone
619
- await client.audioControls.muteAudio();
755
+ Returns a LiveKit `TrackReferenceOrPlaceholder` for local camera rendering in custom React video UIs.
620
756
 
621
- isVoiceMode = false;
622
- updateUI();
623
- };
757
+ ### React audio utility exports
624
758
 
625
- // Monitor user transcription
626
- client.on("userTranscriptionChange", (transcription) => {
627
- if (isVoiceMode && transcription) {
628
- // Display real-time transcription
629
- document.getElementById("transcription").textContent = transcription;
630
- }
631
- });
759
+ - `AudioRenderer` from LiveKit React components
760
+ - `AudioContext` from LiveKit React components
632
761
 
633
- function updateUI() {
634
- const voiceButton = document.getElementById("voice-btn");
635
- const transcriptionDiv = document.getElementById("transcription");
762
+ ## 11. Vanilla API Reference
636
763
 
637
- if (isVoiceMode) {
638
- voiceButton.textContent = "Stop Voice Mode";
639
- transcriptionDiv.style.display = "block";
640
- } else {
641
- voiceButton.textContent = "Start Voice Mode";
642
- transcriptionDiv.style.display = "none";
643
- }
644
- }
645
- ```
764
+ ### `createConvaiWidget(container, options)`
646
765
 
647
- **Voice Mode with State Monitoring:**
648
-
649
- ```typescript
650
- // Monitor agent state to handle voice mode transitions
651
- convaiClient.on("stateChange", (state) => {
652
- if (isVoiceMode) {
653
- switch (state.agentState) {
654
- case "listening":
655
- // User can speak
656
- console.log("Bot is listening");
657
- break;
658
- case "thinking":
659
- // Bot is processing
660
- console.log("Bot is thinking");
661
- break;
662
- case "speaking":
663
- // Bot is responding
664
- console.log("Bot is speaking");
665
- // Optionally interrupt if user wants to speak
666
- break;
667
- }
668
- }
669
- });
766
+ ```ts
767
+ import { createConvaiWidget } from "@convai/web-sdk/vanilla";
670
768
  ```
671
769
 
672
- ### Connection Management
770
+ Creates and mounts a complete floating chat widget.
673
771
 
674
- **Connect:**
772
+ #### `VanillaWidgetOptions`
675
773
 
676
- ```typescript
677
- // React - config passed to hook
678
- const convaiClient = useConvaiClient({
679
- apiKey: "your-api-key",
680
- characterId: "your-character-id",
681
- });
774
+ | Field | Type | Required | Default | Description |
775
+ | ------------------ | -------------------------------- | -------- | ----------- | -------------------------------------------------- |
776
+ | `convaiClient` | `IConvaiClient` | No\* | - | Existing client instance. |
777
+ | `apiKey` | `string` | No\* | - | Used only when `convaiClient` is not provided. |
778
+ | `characterId` | `string` | No\* | - | Used only when `convaiClient` is not provided. |
779
+ | `enableVideo` | `boolean` | No | `false` | Used for auto-created client only. |
780
+ | `startWithVideoOn` | `boolean` | No | `false` | Used for auto-created client only. |
781
+ | `enableLipsync` | `boolean` | No | `false` | Used for auto-created client only. |
782
+ | `blendshapeConfig` | object | No | `undefined` | Used for auto-created client only. |
783
+ | `showVideo` | `boolean` | No | `true` | Show video toggle in settings. |
784
+ | `showScreenShare` | `boolean` | No | `true` | Show screen-share toggle in settings. |
785
+ | `defaultVoiceMode` | `boolean` | No | `true` | Start in voice mode when opened. |
786
+ | `onConnect` | `() => void` | No | `undefined` | Called when widget client connects. |
787
+ | `onDisconnect` | `() => void` | No | `undefined` | Called when widget client disconnects. |
788
+ | `onMessage` | `(message: ChatMessage) => void` | No | `undefined` | Called on each message change with latest message. |
682
789
 
683
- // Or connect manually
684
- await convaiClient.connect({
685
- apiKey: "your-api-key",
686
- characterId: "your-character-id",
687
- });
790
+ \* You must provide either `convaiClient` OR both `apiKey` and `characterId`.
688
791
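The either/or requirement can be enforced up front. A sketch — the helper is hypothetical and not exported by the SDK:

```typescript
interface WidgetOptionsLike {
  convaiClient?: unknown;
  apiKey?: string;
  characterId?: string;
}

// Mirrors the documented rule: an existing client OR apiKey + characterId.
function hasValidClientSource(options: WidgetOptionsLike): boolean {
  if (options.convaiClient) return true;
  return Boolean(options.apiKey && options.characterId);
}

const ok = hasValidClientSource({ apiKey: "key", characterId: "char" });
const bad = hasValidClientSource({ apiKey: "key" }); // missing characterId
```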
 
689
- // Vanilla
690
- const client = new ConvaiClient();
691
- await client.connect({
692
- apiKey: "your-api-key",
693
- characterId: "your-character-id",
694
- });
695
- ```
792
+ #### Return type: `VanillaWidget`
696
793
 
697
- **Disconnect:**
794
+ - `element`: root widget element
795
+ - `client`: resolved client instance
796
+ - `destroy()`: unmount and cleanup
797
+ - `update?`: optional future extension field
698
798
 
699
- ```typescript
700
- await convaiClient.disconnect();
701
- ```
799
+ ### `destroyConvaiWidget(widget)`
702
800
 
703
- **Reconnect:**
801
+ Convenience wrapper that calls `widget.destroy()`.
704
802
 
705
- ```typescript
706
- await convaiClient.reconnect();
707
- ```
803
+ ### `AudioRenderer` (vanilla)
708
804
 
709
- **Reset Session:**
805
+ `AudioRenderer` listens to LiveKit room track subscriptions and auto-attaches remote audio tracks to hidden `audio` elements for playback. Use one renderer instance per active room session and destroy it during cleanup.
710
806
 
711
- ```typescript
712
- // Clear conversation history and start new session
713
- convaiClient.resetSession();
714
- ```
807
+ ## 12. Audio Integration Best Practices (Vanilla TypeScript)
715
808
 
716
- **Connection State:**
809
+ This section provides the recommended integration for stable audio playback.
717
810
 
718
- ```typescript
719
- // React
720
- const { state } = convaiClient;
721
- console.log("Connected:", state.isConnected);
722
- console.log("Connecting:", state.isConnecting);
723
- console.log("Agent state:", state.agentState); // 'disconnected' | 'connected' | 'listening' | 'thinking' | 'speaking'
811
+ ### Recommended reference implementation
724
812
 
725
- // Core API (event-based)
726
- convaiClient.on("stateChange", (state) => {
727
- console.log("State changed:", state);
728
- });
813
+ ```ts
814
+ import { ConvaiClient } from "@convai/web-sdk/core";
815
+ import { AudioRenderer } from "@convai/web-sdk/vanilla";
816
+
817
+ class ConvaiAudioSession {
818
+ private client: ConvaiClient;
819
+ private audioRenderer: AudioRenderer | null = null;
820
+ private audioContext: AudioContext | null = null;
821
+
822
+ constructor() {
823
+ this.client = new ConvaiClient({
824
+ apiKey: "<YOUR_CONVAI_API_KEY>",
825
+ characterId: "<YOUR_CHARACTER_ID>",
826
+ ttsEnabled: true,
827
+ });
828
+ }
729
829
 
730
- convaiClient.on("connect", () => {
731
- console.log("Connected");
732
- });
830
+ async connectFromUserGesture(): Promise<void> {
831
+ await this.client.connect();
733
832
 
734
- convaiClient.on("disconnect", () => {
735
- console.log("Disconnected");
736
- });
737
- ```
833
+ // Required for remote audio playback wiring.
834
+ this.audioRenderer = new AudioRenderer(this.client.room);
835
+
836
+ // Optional: if your app performs WebAudio analysis/effects.
837
+ if (!this.audioContext) {
838
+ this.audioContext = new AudioContext();
839
+ }
840
+ if (this.audioContext.state === "suspended") {
841
+ await this.audioContext.resume();
842
+ }
843
+ }
738
844
 
739
- ### Messaging
845
+ async disconnect(): Promise<void> {
846
+ if (this.audioRenderer) {
847
+ this.audioRenderer.destroy();
848
+ this.audioRenderer = null;
849
+ }
740
850
 
741
- **Send Text Message:**
851
+ await this.client.disconnect();
742
852
 
743
- ```typescript
744
- convaiClient.sendUserTextMessage("Hello, how are you?");
853
+ if (this.audioContext && this.audioContext.state !== "closed") {
854
+ await this.audioContext.close();
855
+ this.audioContext = null;
856
+ }
857
+ }
858
+ }
745
859
  ```
746
860
 
747
- **Send Trigger Message:**
861
+ ### AudioContext guidance
748
862
 
749
- ```typescript
750
- // Trigger specific character action
751
- convaiClient.sendTriggerMessage("greet", "User entered the room");
863
+ - Create/resume `AudioContext` only after user interaction in browsers that enforce autoplay policy.
864
+ - If you are not processing audio with WebAudio, you do not need a custom `AudioContext`; `AudioRenderer` is enough for playback.
865
+ - Always close your custom `AudioContext` in teardown.
752
866
 
753
- // Trigger without message
754
- convaiClient.sendTriggerMessage("wave");
755
- ```
867
+ ### Lifecycle and cleanup order
756
868
 
757
- **Update Context:**
869
+ Recommended shutdown order:
758
870
 
759
- ```typescript
760
- // Update template keys (e.g., user name, location)
761
- convaiClient.updateTemplateKeys({
762
- user_name: "John",
763
- location: "New York",
764
- });
871
+ 1. Stop UI input loops/listeners
872
+ 2. Destroy `AudioRenderer`
873
+ 3. Disconnect `ConvaiClient`
874
+ 4. Close custom `AudioContext` (if created)
765
875
 
766
- // Update dynamic information
767
- convaiClient.updateDynamicInfo({
768
- text: "User is currently browsing the products page",
769
- });
770
- ```
876
+ ### Common failure modes and fixes
771
877
 
772
- **Message History:**
878
+ | Symptom | Likely cause | Recommended action |
879
+ | ----------------------- | ----------------------------------------- | ------------------------------------------------------------------------------------- |
880
+ | No AI audio output | `AudioRenderer` not created | Instantiate `new AudioRenderer(client.room)` immediately after successful connect. |
881
+ | No AI audio output | Browser autoplay restriction | Trigger connect/playback from a user click, and resume `AudioContext` if suspended. |
882
+ | No AI audio output | TTS disabled | Ensure `ttsEnabled` is true for sessions that need speech output. |
883
+ | Intermittent playback | Multiple renderers or stale room instance | Use one renderer per session and always destroy old renderer before reconnecting. |
884
+ | Works once, then silent | Incomplete cleanup on previous session | Destroy renderer and disconnect client on teardown; avoid reusing invalid room state. |
885
+ | Random muted behavior | App-side muting of remote tracks | Verify no custom code is muting remote publications or media elements. |
773
886
 
774
- ```typescript
775
- // React
776
- const { chatMessages } = convaiClient;
887
+ ## 13. Error Handling and Reliability Patterns
777
888
 
778
- // Core API (event-based)
779
- convaiClient.on("message", (message: ChatMessage) => {
780
- console.log("New message:", message.content);
781
- console.log("Message type:", message.type);
782
- });
889
+ ### Pattern 1: Centralized SDK error handling
783
890
 
784
- convaiClient.on("messagesChange", (messages: ChatMessage[]) => {
785
- console.log("All messages:", messages);
891
+ ```ts
892
+ const unsubError = client.on("error", (error) => {
893
+ console.error("Convai SDK error:", error);
894
+ // Optional: route to telemetry/monitoring
786
895
  });
787
896
  ```
788
897
 
789
- **Message Types:**
790
-
791
- ```typescript
792
- type ChatMessageType =
793
- | "user" // User's sent message
794
- | "convai" // Character's response
795
- | "user-transcription" // Real-time speech-to-text from user
796
- | "bot-llm-text" // Character's LLM-generated text
797
- | "emotion" // Character's emotional state
798
- | "behavior-tree" // Behavior tree response
799
- | "action" // Action execution
800
- | "bot-emotion" // Bot emotional response
801
- | "user-llm-text" // User text processed by LLM
802
- | "interrupt-bot"; // Interrupt message
898
+ ### Pattern 2: Retry connect with exponential backoff
899
+
900
+ ```ts
901
+ async function connectWithRetry(
902
+ client: any,
903
+ attempts = 3,
904
+ initialDelayMs = 500,
905
+ ): Promise<void> {
906
+ let delay = initialDelayMs;
907
+
908
+ for (let i = 1; i <= attempts; i++) {
909
+ try {
910
+ await client.connect();
911
+ return;
912
+ } catch (error) {
913
+ if (i === attempts) throw error;
914
+ await new Promise((resolve) => setTimeout(resolve, delay));
915
+ delay *= 2;
916
+ }
917
+ }
918
+ }
803
919
  ```
804
920
 
805
- ### State Monitoring
921
+ ### Pattern 3: Safe send guard
806
922
 
807
- **Agent State:**
923
+ ```ts
924
+ function safeSendText(client: any, text: string) {
925
+ if (!text.trim()) return;
926
+ if (!client.state.isConnected) return;
927
+ if (!client.isBotReady) return;
928
+ client.sendUserTextMessage(text);
929
+ }
930
+ ```
808
931
 
809
- ```typescript
810
- // React
811
- const { state } = convaiClient;
932
+ ### Pattern 4: Protect media control calls
812
933
 
813
- // Check specific states
814
- if (state.isListening) {
815
- console.log("Bot is listening");
934
+ ```ts
935
+ async function safeToggleMic(client: any) {
936
+ try {
937
+ await client.audioControls.toggleAudio();
938
+ } catch (error) {
939
+ console.error("Failed to toggle microphone:", error);
940
+ }
816
941
  }
942
+ ```
817
943
 
818
- if (state.isThinking) {
819
- console.log("Bot is thinking");
820
- }
944
+ ### Pattern 5: Always unsubscribe listeners
821
945
 
822
- if (state.isSpeaking) {
823
- console.log("Bot is speaking");
824
- }
946
+ ```ts
947
+ const unsubscribers = [
948
+ client.on("stateChange", () => {}),
949
+ client.on("messagesChange", () => {}),
950
+ ];
825
951
 
826
- // Combined state
827
- console.log(state.agentState); // 'disconnected' | 'connected' | 'listening' | 'thinking' | 'speaking'
952
+ function cleanupListeners() {
953
+ for (const unsub of unsubscribers) unsub();
954
+ }
828
955
  ```
829
956
 
830
- **User Transcription:**
957
+ ## 14. Troubleshooting
831
958
 
832
- ```typescript
833
- // React
834
- const { userTranscription } = convaiClient;
959
+ ### Connection issues
835
960
 
836
- // Core API (event-based)
837
- convaiClient.on("userTranscriptionChange", (transcription: string) => {
838
- console.log("User is saying:", transcription);
839
- });
840
- ```
961
+ - Verify API key and character ID are valid.
962
+ - Ensure requests are allowed from your browser origin.
963
+ - Set `url` explicitly if your environment does not use the SDK default endpoint.
964
+ - Listen to `error` and inspect failed network calls in browser devtools.
841
965
 
842
- **Bot Ready State:**
966
+ ### `connect()` succeeds but bot never responds
843
967
 
844
- ```typescript
845
- // React
846
- const { isBotReady } = convaiClient;
968
+ - Wait for `botReady` before sending messages.
969
+ - Confirm `ttsEnabled` and message flow are configured as expected.
970
+ - Verify `messagesChange` receives content.
847
971
 
848
- // Core API (event-based)
849
- convaiClient.on("botReady", () => {
850
- console.log("Bot is ready to receive messages");
851
- });
852
- ```
972
+ ### Audio does not play
853
973
 
854
- ## Getting Convai Credentials
974
+ - Ensure an `AudioRenderer` is active for the connected room (vanilla custom UI).
975
+ - Ensure playback starts from a user gesture path to satisfy autoplay policies.
976
+ - Confirm no custom muting code is muting remote tracks.
855
977
 
856
- 1. Visit [convai.com](https://convai.com) and create an account
857
- 2. Navigate to your dashboard
858
- 3. Create a new character or use an existing one
859
- 4. Copy your **API Key** from the dashboard
860
- 5. Copy your **Character ID** from the character details
978
+ ### Microphone does not capture user voice
861
979
 
862
- ## Import Paths
980
+ - Ensure the app is served over a secure context (HTTPS or `localhost`); browsers block microphone access otherwise.
981
+ - Verify browser microphone permission.
982
+ - Handle permission errors from `audioControls.enableAudio()/unmuteAudio()`.
863
983
 
864
- ```typescript
865
- // Default: React version (backward compatible)
866
- import { useConvaiClient, ConvaiWidget } from "@convai/web-sdk";
984
+ ### Video or screen share controls fail
867
985
 
868
- // Explicit React import
869
- import { useConvaiClient, ConvaiWidget } from "@convai/web-sdk/react";
986
+ - Use `enableVideo: true` in config when you need video capabilities.
987
+ - Screen share can be blocked by browser policy or user denial.
988
+ - Wrap calls in `try/catch` and provide fallback UX.
870
989
 
871
- // Vanilla JS/TS
872
- import { ConvaiClient, createConvaiWidget } from "@convai/web-sdk/vanilla";
990
+ ### Lipsync appears out of sync or distorted
873
991
 
874
- // Core only (no UI, framework agnostic)
875
- import { ConvaiClient } from "@convai/web-sdk/core";
876
- ```
992
+ - Validate blendshape format (`arkit` vs `mha`) matches your rig expectations.
993
+ - Tune `frames_buffer_duration` so at least a short run of blendshape frames is buffered before audio playback starts.
994
+ - Align lipsync start and stop with the queue: start playback when the bot starts speaking (`isBotSpeaking()` true) and treat the turn as finished when `blendshapeQueue.isConversationEnded()` is true before resetting.
995
+ - Drive blendshape application from a single loop (e.g. `requestAnimationFrame`) and advance the frame index at 60 fps so mouth movement stays in sync with the audio.
877
996
 
878
- ## TypeScript Support
879
-
880
- All exports are fully typed:
881
-
882
- ```typescript
883
- import type {
884
- ConvaiClient,
885
- ConvaiConfig,
886
- ConvaiClientState,
887
- ChatMessage,
888
- AudioControls,
889
- VideoControls,
890
- ScreenShareControls,
891
- IConvaiClient,
892
- } from "@convai/web-sdk";
893
- ```
997
+ ## 15. Examples
894
998
 
895
- ## Support
999
+ Repository examples:
896
1000
 
897
- - [Convai Forum](https://forum.convai.com)
898
- - [API Reference](./API_REFERENCE.md)
899
- - [Convai Website](https://convai.com)
1001
+ - `examples/react-three-fiber`
1002
+ - `examples/three-vanilla`
1003
+ - `examples/README.md` for example-level setup notes