@spatialwalk/avatarkit 1.0.0-beta.3 → 1.0.0-beta.31

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (96)
  1. package/CHANGELOG.md +441 -0
  2. package/README.md +328 -138
  3. package/dist/StreamingAudioPlayer-B07iPxK4.js +398 -0
  4. package/dist/animation/AnimationWebSocketClient.d.ts +6 -24
  5. package/dist/animation/utils/eventEmitter.d.ts +0 -4
  6. package/dist/animation/utils/flameConverter.d.ts +3 -11
  7. package/dist/audio/AnimationPlayer.d.ts +5 -29
  8. package/dist/audio/StreamingAudioPlayer.d.ts +8 -66
  9. package/dist/avatar_core_wasm-i0Ocpx6q.js +2693 -0
  10. package/dist/avatar_core_wasm.wasm +0 -0
  11. package/dist/config/app-config.d.ts +4 -13
  12. package/dist/config/constants.d.ts +18 -11
  13. package/dist/config/sdk-config-loader.d.ts +2 -9
  14. package/dist/core/Avatar.d.ts +0 -15
  15. package/dist/core/AvatarController.d.ts +49 -109
  16. package/dist/core/AvatarDownloader.d.ts +0 -95
  17. package/dist/core/AvatarManager.d.ts +7 -18
  18. package/dist/core/AvatarSDK.d.ts +21 -0
  19. package/dist/core/AvatarView.d.ts +25 -119
  20. package/dist/core/NetworkLayer.d.ts +1 -0
  21. package/dist/generated/driveningress/v1/driveningress.d.ts +1 -12
  22. package/dist/generated/driveningress/v2/driveningress.d.ts +0 -3
  23. package/dist/generated/google/protobuf/struct.d.ts +5 -39
  24. package/dist/generated/google/protobuf/timestamp.d.ts +1 -103
  25. package/dist/index-CCBBCJi2.js +7915 -0
  26. package/dist/index.d.ts +1 -6
  27. package/dist/index.js +17 -17
  28. package/dist/renderer/RenderSystem.d.ts +1 -77
  29. package/dist/renderer/covariance.d.ts +0 -12
  30. package/dist/renderer/renderer.d.ts +0 -1
  31. package/dist/renderer/sortSplats.d.ts +0 -11
  32. package/dist/renderer/webgl/reorderData.d.ts +0 -13
  33. package/dist/renderer/webgl/webglRenderer.d.ts +3 -40
  34. package/dist/renderer/webgpu/webgpuRenderer.d.ts +3 -28
  35. package/dist/types/character-settings.d.ts +0 -5
  36. package/dist/types/character.d.ts +3 -21
  37. package/dist/types/index.d.ts +38 -18
  38. package/dist/utils/animation-interpolation.d.ts +3 -13
  39. package/dist/utils/client-id.d.ts +1 -0
  40. package/dist/utils/cls-tracker.d.ts +11 -0
  41. package/dist/utils/conversationId.d.ts +1 -0
  42. package/dist/utils/error-utils.d.ts +1 -25
  43. package/dist/utils/heartbeat-manager.d.ts +18 -0
  44. package/dist/utils/id-manager.d.ts +37 -0
  45. package/dist/utils/logger.d.ts +5 -11
  46. package/dist/utils/usage-tracker.d.ts +5 -0
  47. package/dist/vanilla/vite.config.d.ts +2 -0
  48. package/dist/wasm/avatarCoreAdapter.d.ts +11 -97
  49. package/dist/wasm/avatarCoreMemory.d.ts +5 -54
  50. package/package.json +10 -4
  51. package/dist/StreamingAudioPlayer-BeLlDiwE.js +0 -288
  52. package/dist/StreamingAudioPlayer-BeLlDiwE.js.map +0 -1
  53. package/dist/animation/AnimationWebSocketClient.d.ts.map +0 -1
  54. package/dist/animation/utils/eventEmitter.d.ts.map +0 -1
  55. package/dist/animation/utils/flameConverter.d.ts.map +0 -1
  56. package/dist/audio/AnimationPlayer.d.ts.map +0 -1
  57. package/dist/audio/StreamingAudioPlayer.d.ts.map +0 -1
  58. package/dist/avatar_core_wasm-DmkU6dYn.js +0 -1666
  59. package/dist/avatar_core_wasm-DmkU6dYn.js.map +0 -1
  60. package/dist/config/app-config.d.ts.map +0 -1
  61. package/dist/config/constants.d.ts.map +0 -1
  62. package/dist/config/sdk-config-loader.d.ts.map +0 -1
  63. package/dist/core/Avatar.d.ts.map +0 -1
  64. package/dist/core/AvatarController.d.ts.map +0 -1
  65. package/dist/core/AvatarDownloader.d.ts.map +0 -1
  66. package/dist/core/AvatarKit.d.ts +0 -60
  67. package/dist/core/AvatarKit.d.ts.map +0 -1
  68. package/dist/core/AvatarManager.d.ts.map +0 -1
  69. package/dist/core/AvatarView.d.ts.map +0 -1
  70. package/dist/generated/driveningress/v1/driveningress.d.ts.map +0 -1
  71. package/dist/generated/driveningress/v2/driveningress.d.ts.map +0 -1
  72. package/dist/generated/google/protobuf/struct.d.ts.map +0 -1
  73. package/dist/generated/google/protobuf/timestamp.d.ts.map +0 -1
  74. package/dist/index-NmYXWJnL.js +0 -9712
  75. package/dist/index-NmYXWJnL.js.map +0 -1
  76. package/dist/index.d.ts.map +0 -1
  77. package/dist/index.js.map +0 -1
  78. package/dist/renderer/RenderSystem.d.ts.map +0 -1
  79. package/dist/renderer/covariance.d.ts.map +0 -1
  80. package/dist/renderer/renderer.d.ts.map +0 -1
  81. package/dist/renderer/sortSplats.d.ts.map +0 -1
  82. package/dist/renderer/webgl/reorderData.d.ts.map +0 -1
  83. package/dist/renderer/webgl/webglRenderer.d.ts.map +0 -1
  84. package/dist/renderer/webgpu/webgpuRenderer.d.ts.map +0 -1
  85. package/dist/types/character-settings.d.ts.map +0 -1
  86. package/dist/types/character.d.ts.map +0 -1
  87. package/dist/types/index.d.ts.map +0 -1
  88. package/dist/utils/animation-interpolation.d.ts.map +0 -1
  89. package/dist/utils/error-utils.d.ts.map +0 -1
  90. package/dist/utils/logger.d.ts.map +0 -1
  91. package/dist/utils/posthog-tracker.d.ts +0 -82
  92. package/dist/utils/posthog-tracker.d.ts.map +0 -1
  93. package/dist/utils/reqId.d.ts +0 -20
  94. package/dist/utils/reqId.d.ts.map +0 -1
  95. package/dist/wasm/avatarCoreAdapter.d.ts.map +0 -1
  96. package/dist/wasm/avatarCoreMemory.d.ts.map +0 -1
package/README.md CHANGED
@@ -1,4 +1,4 @@
1
- # SPAvatarKit SDK
1
+ # SPAvatarSDK SDK
2
2
 
3
3
  Real-time virtual avatar rendering SDK based on 3D Gaussian Splatting, supporting audio-driven animation rendering and high-quality 3D rendering.
4
4
 
@@ -6,6 +6,7 @@ Real-time virtual avatar rendering SDK based on 3D Gaussian Splatting, supportin
6
6
 
7
7
  - **3D Gaussian Splatting Rendering** - Based on the latest point cloud rendering technology, providing high-quality 3D virtual avatars
8
8
  - **Audio-Driven Real-Time Animation Rendering** - Users provide audio data, SDK handles receiving animation data and rendering
9
+ - **Multi-Character Support** - Support multiple avatar instances simultaneously, each with independent state and rendering
9
10
  - **WebGPU/WebGL Dual Rendering Backend** - Automatically selects the best rendering backend for compatibility
10
11
  - **WASM High-Performance Computing** - Uses C++ compiled WebAssembly modules for geometric calculations
11
12
  - **TypeScript Support** - Complete type definitions and IntelliSense
@@ -23,99 +24,234 @@ npm install @spatialwalk/avatarkit
23
24
 
24
25
  ```typescript
25
26
  import {
26
- AvatarKit,
27
+ AvatarSDK,
27
28
  AvatarManager,
28
29
  AvatarView,
29
30
  Configuration,
30
- Environment
31
+ Environment,
32
+ DrivingServiceMode,
33
+ LogLevel
31
34
  } from '@spatialwalk/avatarkit'
32
35
 
33
36
  // 1. Initialize SDK
37
+
34
38
  const configuration: Configuration = {
35
- environment: Environment.test,
39
+ environment: Environment.cn,
40
+ drivingServiceMode: DrivingServiceMode.sdk, // Optional, 'sdk' is default
41
+ // - DrivingServiceMode.sdk: SDK mode - SDK handles WebSocket communication
42
+ // - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
43
+ logLevel: LogLevel.off, // Optional, 'off' is default
44
+ // - LogLevel.off: Disable all logs
45
+ // - LogLevel.error: Only error logs
46
+ // - LogLevel.warning: Warning and error logs
47
+ // - LogLevel.all: All logs (info, warning, error)
36
48
  }
37
49
 
38
- await AvatarKit.initialize('your-app-id', configuration)
50
+ await AvatarSDK.initialize('your-app-id', configuration)
39
51
 
40
52
  // Set sessionToken (if needed, call separately)
41
- // AvatarKit.setSessionToken('your-session-token')
53
+ // AvatarSDK.setSessionToken('your-session-token')
42
54
 
43
55
  // 2. Load character
44
- const avatarManager = new AvatarManager()
56
+ const avatarManager = AvatarManager.shared
45
57
  const avatar = await avatarManager.load('character-id', (progress) => {
46
58
  console.log(`Loading progress: ${progress.progress}%`)
47
59
  })
48
60
 
49
61
  // 3. Create view (automatically creates Canvas and AvatarController)
62
+ // The playback mode is determined by drivingServiceMode in AvatarSDK configuration
63
+ // - DrivingServiceMode.sdk: SDK mode - SDK handles WebSocket communication
64
+ // - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
50
65
  const container = document.getElementById('avatar-container')
51
66
  const avatarView = new AvatarView(avatar, container)
52
67
 
53
- // 4. Start real-time communication
68
+ // 4. Start real-time communication (SDK mode only)
54
69
  await avatarView.avatarController.start()
55
70
 
56
- // 5. Send audio data
57
- // If audio is Uint8Array, you can use slice().buffer to convert to ArrayBuffer
58
- const audioUint8 = new Uint8Array(1024) // Example: audio data
59
- const audioData = audioUint8.slice().buffer // Simplified conversion, works for ArrayBuffer and SharedArrayBuffer
60
- avatarView.avatarController.send(audioData, false) // Send audio data, will automatically start playing after accumulating enough data
61
- avatarView.avatarController.send(audioData, true) // end=true means immediately return animation data, no longer accumulating
71
+ // 5. Send audio data (SDK mode, must be 16kHz mono PCM16 format)
72
+ const audioData = new ArrayBuffer(1024) // Example: 16kHz PCM16 audio data
73
+ avatarView.avatarController.send(audioData, false) // Send audio data
74
+ avatarView.avatarController.send(audioData, true) // end=true marks the end of current conversation round
62
75
  ```
63
76
 
64
- ### Complete Example
77
+ ### Host Mode Example
78
+
79
+ ```typescript
80
+
81
+ // 1-2. Same as SDK mode, except drivingServiceMode is DrivingServiceMode.host (initialize SDK, load character)
65
82
 
66
- Check the example code in the GitHub repository for the complete usage flow.
83
+ // 3. Create view with Host mode
84
+ const container = document.getElementById('avatar-container')
85
+ const avatarView = new AvatarView(avatar, container)
86
+
87
+ // 4. Host Mode Workflow:
88
+ // Send audio data first to get conversationId, then use it to send animation data
89
+ const conversationId = avatarView.avatarController.yieldAudioData(audioData, false)
90
+ avatarView.avatarController.yieldFramesData(animationData, conversationId)
91
+ ```
67
92
 
68
- **Example Project:** [Avatarkit-web-demo](https://github.com/spatialwalk/Avatarkit-web-demo)
93
+ ### Complete Examples
69
94
 
70
- This repository contains complete examples for Vanilla JS, Vue 3, and React, demonstrating how to integrate and use SPAvatarKit SDK in different frameworks.
95
+ Check the example code in the GitHub repository for complete usage flows for both modes.
96
+
97
+ **Example Project:** [AvatarSDK-Web-Demo](https://github.com/spatialwalk/AvatarSDK-Web-Demo)
98
+
99
+ This repository contains complete examples for Vanilla JS, Vue 3, and React, demonstrating:
100
+ - SDK mode: Real-time audio input with automatic animation data reception
101
+ - Host mode: Custom data sources with manual audio/animation data management
71
102
 
72
103
  ## 🏗️ Architecture Overview
73
104
 
105
+ ### Three-Layer Architecture
106
+
107
+ The SDK uses a three-layer architecture for clear separation of concerns:
108
+
109
+ 1. **Rendering Layer (AvatarView)** - Responsible for 3D rendering only
110
+ 2. **Playback Layer (AvatarController)** - Manages audio/animation synchronization and playback
111
+ 3. **Network Layer** - Handles WebSocket communication (only in SDK mode, internal implementation)
112
+
74
113
  ### Core Components
75
114
 
76
- - **AvatarKit** - SDK initialization and management
115
+ - **AvatarSDK** - SDK initialization and management
77
116
  - **AvatarManager** - Character resource loading and management
78
- - **AvatarView** - 3D rendering view (internally contains AvatarController)
79
- - **AvatarController** - Real-time communication and data processing
80
- - **AvatarCoreAdapter** - WASM module adapter
117
+ - **AvatarView** - 3D rendering view (rendering layer)
118
+ - **AvatarController** - Audio/animation playback controller (playback layer)
119
+
120
+ ### Playback Modes
121
+
122
+ The SDK supports two playback modes, configured in `AvatarSDK.initialize()`:
123
+
124
+ #### 1. SDK Mode (Default)
125
+ - Configured via `drivingServiceMode: DrivingServiceMode.sdk` in `AvatarSDK.initialize()`
126
+ - SDK handles WebSocket communication automatically
127
+ - Send audio data via `AvatarController.send()`
128
+ - SDK receives animation data from backend and synchronizes playback
129
+ - Best for: Real-time audio input scenarios
130
+
131
+ #### 2. Host Mode
132
+ - Configured via `drivingServiceMode: DrivingServiceMode.host` in `AvatarSDK.initialize()`
133
+ - Host application manages its own network/data fetching
134
+ - Host application provides both audio and animation data
135
+ - SDK only handles synchronized playback
136
+ - Best for: Custom data sources, pre-recorded content, or custom network implementations
137
+
138
+ **Note:** The playback mode is determined by `drivingServiceMode` in `AvatarSDK.initialize()` configuration.
139
+
140
+ ### Fallback Mechanism
141
+
142
+ The SDK includes a fallback mechanism to ensure audio playback continues even when animation data is unavailable:
143
+
144
+ - **SDK Mode Connection Failure**: If WebSocket connection fails to establish within 15 seconds, the SDK automatically enters fallback mode. In this mode, audio data can still be sent and will play normally, even though no animation data will be received from the server. This ensures that audio playback is not interrupted even when the service connection fails.
145
+ - **SDK Mode Server Error**: If the server returns an error after connection is established, the SDK automatically enters audio-only mode for that session and continues playing audio independently.
146
+ - **Host Mode**: If empty animation data is provided (empty array or undefined), the SDK automatically enters audio-only mode.
147
+ - Once in audio-only mode, any subsequent animation data for that session will be ignored, and only audio will continue playing.
148
+ - The fallback mode is interruptible, just like normal playback mode.
149
+ - Connection state callbacks (`onConnectionState`) will notify you when connection fails or times out, allowing you to handle the fallback state appropriately.
81
150
 
82
151
  ### Data Flow
83
152
 
153
+ #### SDK Mode Flow
154
+
84
155
  ```
85
- User audio input (16kHz mono PCM) → AvatarController → WebSocket → Backend processing
86
-
87
- Backend returns animation data (FLAME keyframes) → AvatarController → AnimationPlayer
88
-
156
+ User audio input (16kHz mono PCM16)
157
+
158
+ AvatarController.send()
159
+
160
+ WebSocket → Backend processing
161
+
162
+ Backend returns animation data (FLAME keyframes)
163
+
164
+ AvatarController → AnimationPlayer
165
+
89
166
  FLAME parameters → AvatarCore.computeFrameFlatFromParams() → Splat data
90
-
91
- Splat data RenderSystem WebGPU/WebGL → Canvas rendering
167
+
168
+ AvatarController (playback loop) → AvatarView.renderRealtimeFrame()
169
+
170
+ RenderSystem → WebGPU/WebGL → Canvas rendering
92
171
  ```
93
172
 
94
- **Note:** Users need to provide audio data themselves (16kHz mono PCM), SDK handles receiving animation data and rendering.
173
+ #### Host Mode Flow
174
+
175
+ ```
176
+ External data source (audio + animation)
177
+
178
+ AvatarController.yieldAudioData(audioChunk) // Returns conversationId
179
+
180
+ AvatarController.yieldFramesData(keyframes, conversationId)
181
+
182
+ AvatarController → AnimationPlayer (synchronized playback)
183
+
184
+ FLAME parameters → AvatarCore.computeFrameFlatFromParams() → Splat data
185
+
186
+ AvatarController (playback loop) → AvatarView.renderRealtimeFrame()
187
+
188
+ RenderSystem → WebGPU/WebGL → Canvas rendering
189
+ ```
190
+
191
+ ### Audio Format Requirements
192
+
193
+ **⚠️ Important:** The SDK requires audio data to be in **16kHz mono PCM16** format:
194
+
195
+ - **Sample Rate**: 16kHz (16000 Hz) - This is a backend requirement
196
+ - **Channels**: Mono (single channel)
197
+ - **Format**: PCM16 (16-bit signed integer, little-endian)
198
+ - **Byte Order**: Little-endian
199
+
200
+ **Audio Data Format:**
201
+ - Each sample is 2 bytes (16-bit)
202
+ - Audio data should be provided as `ArrayBuffer` or `Uint8Array`
203
+ - For example: 1 second of audio = 16000 samples × 2 bytes = 32000 bytes
204
+
205
+ **Resampling:**
206
+ - If your audio source is at a different sample rate (e.g., 24kHz, 48kHz), you must resample it to 16kHz before sending to the SDK
207
+ - For high-quality resampling, we recommend using Web Audio API's `OfflineAudioContext` with anti-aliasing filtering
208
+ - See example projects for resampling implementation
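
As a rough illustration of the format requirements above, the following sketch converts Float32 samples (the range -1 to 1 used by the Web Audio API) into little-endian PCM16 bytes and reproduces the 32000-bytes-per-second figure. `floatToPcm16` and `pcm16ByteLength` are hypothetical helper names, not part of the SDK, and the sketch assumes the input is already 16 kHz mono; it does no resampling.

```typescript
// Hypothetical helpers; not part of @spatialwalk/avatarkit.
const SAMPLE_RATE_HZ = 16000
const BYTES_PER_SAMPLE = 2 // PCM16 = 16-bit signed integer

// Number of bytes needed for a given duration of 16 kHz mono PCM16 audio.
function pcm16ByteLength(seconds: number): number {
  return Math.round(seconds * SAMPLE_RATE_HZ) * BYTES_PER_SAMPLE
}

// Convert Float32 samples in [-1, 1] to little-endian PCM16.
// Assumes the samples are already 16 kHz mono.
function floatToPcm16(samples: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(samples.length * BYTES_PER_SAMPLE)
  const view = new DataView(buffer)
  for (let i = 0; i < samples.length; i++) {
    // Clamp out-of-range input before scaling to the int16 range.
    const clamped = Math.max(-1, Math.min(1, samples[i]))
    const int16 = clamped < 0 ? Math.round(clamped * 0x8000) : Math.round(clamped * 0x7fff)
    view.setInt16(i * BYTES_PER_SAMPLE, int16, true) // true = little-endian
  }
  return buffer
}
```

The resulting `ArrayBuffer` has the shape that `send()` is documented to accept; for Host mode, wrapping it as `new Uint8Array(buffer)` should match the `yieldAudioData()` parameter type shown in this README.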
95
209
 
96
210
  ## 📚 API Reference
97
211
 
98
- ### AvatarKit
212
+ ### AvatarSDK
99
213
 
100
214
  The core management class of the SDK, responsible for initialization and global configuration.
101
215
 
102
216
  ```typescript
103
217
  // Initialize SDK
104
- await AvatarKit.initialize(appId: string, configuration: Configuration)
218
+ await AvatarSDK.initialize(appId: string, configuration: Configuration)
105
219
 
106
220
  // Check initialization status
107
- const isInitialized = AvatarKit.isInitialized
221
+ const isInitialized = AvatarSDK.isInitialized
222
+
223
+ // Get initialized app ID
224
+ const appId = AvatarSDK.appId
225
+
226
+ // Get configuration
227
+ const config = AvatarSDK.configuration
228
+
229
+ // Set sessionToken (if needed, call separately)
230
+ AvatarSDK.setSessionToken('your-session-token')
231
+
232
+ // Set userId (optional, for telemetry)
233
+ AvatarSDK.setUserId('user-id')
234
+
235
+ // Get sessionToken
236
+ const sessionToken = AvatarSDK.sessionToken
237
+
238
+ // Get userId
239
+ const userId = AvatarSDK.userId
240
+
241
+ // Get SDK version
242
+ const version = AvatarSDK.version
108
243
 
109
244
  // Cleanup resources (must be called when no longer in use)
110
- AvatarKit.cleanup()
245
+ AvatarSDK.cleanup()
111
246
  ```
112
247
 
113
248
  ### AvatarManager
114
249
 
115
- Character resource manager, responsible for downloading, caching, and loading character data.
250
+ Character resource manager, responsible for downloading, caching, and loading character data. Use the singleton instance via `AvatarManager.shared`.
116
251
 
117
252
  ```typescript
118
- const manager = new AvatarManager()
253
+ // Get singleton instance
254
+ const manager = AvatarManager.shared
119
255
 
120
256
  // Load character
121
257
  const avatar = await manager.load(
@@ -129,23 +265,32 @@ manager.clearCache()
129
265
 
130
266
  ### AvatarView
131
267
 
132
- 3D rendering view, internally automatically creates and manages AvatarController.
133
-
134
- **⚠️ Important Limitation:** Currently, the SDK only supports one AvatarView instance at a time. If you need to switch characters, you must first call the `dispose()` method to clean up the current AvatarView, then create a new instance.
268
+ 3D rendering view (rendering layer), responsible for 3D rendering only. Internally automatically creates and manages `AvatarController`.
135
269
 
136
270
  ```typescript
137
- // Create view (Canvas is automatically added to container)
138
- const avatarView = new AvatarView(avatar: Avatar, container?: HTMLElement)
271
+ constructor(avatar: Avatar, container: HTMLElement)
272
+ ```
273
+
274
+ **Parameters:**
275
+ - `avatar`: Avatar instance
276
+ - `container`: Canvas container element (required)
277
+ - Canvas automatically uses the full size of the container (width and height)
278
+ - Canvas aspect ratio adapts to container size - set container size to control aspect ratio
279
+ - Canvas will be automatically added to the container
280
+ - SDK automatically handles resize events via ResizeObserver
139
281
 
140
- // Get Canvas element
141
- const canvas = avatarView.getCanvas()
282
+ **Playback Mode:**
283
+ - The playback mode is determined by `drivingServiceMode` in `AvatarSDK.initialize()` configuration
284
+ - The playback mode is fixed when creating `AvatarView` and persists throughout its lifecycle
285
+ - Cannot be changed after creation
142
286
 
143
- // Set background
144
- avatarView.setBackgroundImage('path/to/image.jpg')
145
- avatarView.setBackgroundOpaque(true)
287
+ ```typescript
288
+ // Create view (Canvas is automatically added to container)
289
+ const container = document.getElementById('avatar-container')
290
+ const avatarView = new AvatarView(avatar, container)
146
291
 
147
- // Update camera configuration
148
- avatarView.updateCameraConfig(cameraConfig: CameraConfig)
292
+ // Wait for first frame to render
293
+ await avatarView.ready // Promise that resolves when the first frame is rendered
149
294
 
150
295
  // Cleanup resources (must be called before switching characters)
151
296
  avatarView.dispose()
@@ -154,10 +299,9 @@ avatarView.dispose()
154
299
  **Character Switching Example:**
155
300
 
156
301
  ```typescript
157
- // Before switching characters, must clean up old AvatarView first
302
+ // To switch characters, simply dispose the old view and create a new one
158
303
  if (currentAvatarView) {
159
304
  currentAvatarView.dispose()
160
- currentAvatarView = null
161
305
  }
162
306
 
163
307
  // Load new character
@@ -165,37 +309,92 @@ const newAvatar = await avatarManager.load('new-character-id')
165
309
 
166
310
  // Create new AvatarView
167
311
  currentAvatarView = new AvatarView(newAvatar, container)
168
- await currentAvatarView.avatarController.start()
312
+
313
+ // SDK mode: start connection (will throw error if not in SDK mode)
314
+ await currentAvatarView.controller.start()
169
315
  ```
170
316
 
171
317
  ### AvatarController
172
318
 
173
- Real-time communication controller, handles WebSocket connections and animation data.
319
+ Audio/animation playback controller (playback layer), manages synchronized playback of audio and animation. Automatically handles WebSocket communication in SDK mode.
320
+
321
+ **Two Usage Patterns:**
322
+
323
+ #### SDK Mode Methods
174
324
 
175
325
  ```typescript
176
- // Start connection
326
+ // Start WebSocket service
177
327
  await avatarView.avatarController.start()
178
328
 
179
- // Send audio data
180
- avatarView.avatarController.send(audioData: ArrayBuffer, end: boolean)
181
- // audioData: Audio data (ArrayBuffer format)
182
- // end: false (default) - Normal audio data sending, server will accumulate audio data, automatically returns animation data and starts synchronized playback of animation and audio after accumulating enough data
183
- // end: true - Immediately return animation data, no longer accumulating, used for ending current conversation or scenarios requiring immediate response
329
+ // Send audio data (must be 16kHz mono PCM16 format)
330
+ const conversationId = avatarView.avatarController.send(audioData: ArrayBuffer, end: boolean)
331
+ // Returns: conversationId - Conversation ID for this conversation session
332
+ // end: false (default) - Continue sending audio data for current conversation
333
+ // end: true - Mark the end of current conversation round. After end=true, sending new audio data will interrupt any ongoing playback from the previous conversation round
334
+
335
+ // Close WebSocket service
336
+ avatarView.avatarController.close()
337
+ ```
338
+
339
+ #### Host Mode Methods
340
+
341
+ ```typescript
342
+ // Stream audio chunks (must be 16kHz mono PCM16 format)
343
+ const conversationId = avatarView.avatarController.yieldAudioData(
344
+ data: Uint8Array, // Audio chunk data
345
+ isLast: boolean = false // Whether this is the last chunk
346
+ )
347
+ // Returns: conversationId - Conversation ID for this audio session
348
+
349
+ // Stream animation keyframes (requires conversationId from audio data)
350
+ avatarView.avatarController.yieldFramesData(
351
+ keyframes: any[], // Animation keyframes (obtained from your service)
352
+ conversationId: string // Conversation ID (required)
353
+ )
354
+ ```
355
+
356
+ **⚠️ Important: Conversation ID (conversationId) Management**
184
357
 
185
- // Interrupt conversation
358
+ **SDK Mode:**
359
+ - `send()` returns a conversationId to distinguish each conversation round
360
+ - `end=true` marks the end of a conversation round
361
+
362
+ **Host Mode:**
363
+ - `yieldAudioData()` returns a conversationId (generated automatically when a new session starts)
364
+ - `yieldFramesData()` requires a valid conversationId parameter
365
+ - Animation data with mismatched conversationId will be **discarded**
366
+ - Use `getCurrentConversationId()` to retrieve the current active conversationId
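
The pairing rule above can be sketched as a small host-side helper that always forwards the conversationId returned for an audio chunk to the matching frames call. `HostModeController`, `HostChunk`, and `streamConversation` are illustrative names defined here, not SDK exports; the interface only mirrors the two Host mode signatures listed in this README, and the `string` return type is an assumption.

```typescript
// Local stand-in for the two Host mode methods listed above; the real
// AvatarController type comes from the SDK and may differ in detail.
interface HostModeController {
  yieldAudioData(data: Uint8Array, isLast?: boolean): string
  yieldFramesData(keyframes: unknown[], conversationId: string): void
}

// One unit of host-provided data: an audio chunk plus its keyframes.
interface HostChunk {
  audio: Uint8Array
  keyframes: unknown[]
}

// Streams chunks in order, always passing the conversationId returned for
// the matching audio chunk so the frames are not discarded as mismatched.
function streamConversation(controller: HostModeController, chunks: HostChunk[]): string[] {
  const ids: string[] = []
  chunks.forEach((chunk, index) => {
    const isLast = index === chunks.length - 1
    const conversationId = controller.yieldAudioData(chunk.audio, isLast)
    controller.yieldFramesData(chunk.keyframes, conversationId)
    ids.push(conversationId)
  })
  return ids
}
```

Keeping the id local to each iteration, rather than caching it across conversations, avoids reusing a stale conversationId after an interrupt.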
367
+
368
+ #### Common Methods (Both Modes)
369
+
370
+ ```typescript
371
+
372
+ // Interrupt current playback (stops and clears data)
186
373
  avatarView.avatarController.interrupt()
187
374
 
188
- // Close connection
189
- avatarView.avatarController.close()
375
+ // Clear all data and resources
376
+ avatarView.avatarController.clear()
377
+
378
+ // Get current conversation ID (for Host mode)
379
+ const conversationId = avatarView.avatarController.getCurrentConversationId()
380
+ // Returns: Current conversationId for the active audio session, or null if no active session
381
+
382
+ // Volume control (affects only avatar audio player, not system volume)
383
+ avatarView.avatarController.setVolume(0.5) // Set volume to 50% (0.0 to 1.0)
384
+ const currentVolume = avatarView.avatarController.getVolume() // Get current volume (0.0 to 1.0)
190
385
 
191
386
  // Set event callbacks
192
- avatarView.avatarController.onConnectionState = (state: ConnectionState) => {}
193
- avatarView.avatarController.onAvatarState = (state: AvatarState) => {}
387
+ avatarView.avatarController.onConnectionState = (state: ConnectionState) => {} // SDK mode only
388
+ avatarView.avatarController.onConversationState = (state: ConversationState) => {}
194
389
  avatarView.avatarController.onError = (error: Error) => {}
195
-
196
- // Note: sendText() method is not supported, calling it will throw an error
197
390
  ```
198
391
 
392
+ **Important Notes:**
393
+ - `start()` and `close()` are only available in SDK mode
394
+ - `yieldAudioData()` and `yieldFramesData()` are only available in Host mode
395
+ - `pause()`, `resume()`, `interrupt()`, `clear()`, `getCurrentConversationId()`, `setVolume()`, and `getVolume()` are available in both modes
396
+ - The playback mode is determined when creating `AvatarView` and cannot be changed
397
+
199
398
  ## 🔧 Configuration
200
399
 
201
400
  ### Configuration
@@ -203,18 +402,42 @@ avatarView.avatarController.onError = (error: Error) => {}
203
402
  ```typescript
204
403
  interface Configuration {
205
404
  environment: Environment
405
+ drivingServiceMode?: DrivingServiceMode // Optional, default is 'sdk' (SDK mode)
406
+ logLevel?: LogLevel // Optional, default is 'off' (no logs)
206
407
  }
207
408
  ```
208
409
 
410
+ ### LogLevel
411
+
412
+ Control the verbosity of SDK logs:
413
+
414
+ ```typescript
415
+ enum LogLevel {
416
+ off = 'off', // Disable all logs
417
+ error = 'error', // Only error logs
418
+ warning = 'warning', // Warning and error logs
419
+ all = 'all' // All logs (info, warning, error)
420
+ }
421
+ ```
422
+
423
+ **Note:** `LogLevel.off` completely disables all logging, including error logs. Use with caution in production environments.
424
+
209
425
  **Description:**
210
- - `environment`: Specifies the environment (cn/us/test), SDK will automatically use the corresponding API address and WebSocket address based on the environment
211
- - `sessionToken`: Set separately via `AvatarKit.setSessionToken()`, not in Configuration
426
+ - `environment`: Specifies the environment (cn/intl), SDK will automatically use the corresponding API address and WebSocket address based on the environment
427
+ - `drivingServiceMode`: Specifies the driving service mode
428
+ - `DrivingServiceMode.sdk` (default): SDK mode - SDK handles WebSocket communication automatically
429
+ - `DrivingServiceMode.host`: Host mode - Host application provides audio and animation data
430
+ - `logLevel`: Controls the verbosity of SDK logs
431
+ - `LogLevel.off` (default): Disable all logs
432
+ - `LogLevel.error`: Only error logs
433
+ - `LogLevel.warning`: Warning and error logs
434
+ - `LogLevel.all`: All logs (info, warning, error)
435
+ - `sessionToken`: Set separately via `AvatarSDK.setSessionToken()`, not in Configuration
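
To make the documented defaults concrete, here is a minimal sketch that fills in the optional fields. The enums and the interface are local mirrors so the snippet runs on its own (in an application they would be imported from `@spatialwalk/avatarkit`), the string values of `DrivingServiceMode` are assumed from the comments in this README, and `withDocumentedDefaults` is a hypothetical helper, not an SDK function.

```typescript
// Local mirrors of the SDK types so the sketch runs on its own.
enum Environment { cn = 'cn', intl = 'intl' }
enum DrivingServiceMode { sdk = 'sdk', host = 'host' } // string values assumed
enum LogLevel { off = 'off', error = 'error', warning = 'warning', all = 'all' }

interface Configuration {
  environment: Environment
  drivingServiceMode?: DrivingServiceMode
  logLevel?: LogLevel
}

// Hypothetical helper: spell out the documented defaults explicitly.
function withDocumentedDefaults(config: Configuration): Required<Configuration> {
  return {
    environment: config.environment,
    drivingServiceMode: config.drivingServiceMode ?? DrivingServiceMode.sdk,
    logLevel: config.logLevel ?? LogLevel.off,
  }
}
```

Passing only `{ environment: Environment.intl }` yields SDK mode with logging off, which matches the defaults described in the list above.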
212
436
 
213
437
  ```typescript
214
438
  enum Environment {
215
439
  cn = 'cn', // China region
216
- us = 'us', // US region
217
- test = 'test' // Test environment
440
+ intl = 'intl', // International region
218
441
  }
219
442
  ```
220
443
 
@@ -245,16 +468,23 @@ enum ConnectionState {
245
468
  }
246
469
  ```
247
470
 
248
- ### AvatarState
471
+ ### ConversationState
249
472
 
250
473
  ```typescript
251
- enum AvatarState {
252
- idle = 'idle', // Idle state, showing breathing animation
253
- active = 'active', // Active, waiting for playable content
254
- playing = 'playing' // Playing
474
+ enum ConversationState {
475
+ idle = 'idle', // Idle state (breathing animation)
476
+ playing = 'playing' // Playing state (active conversation)
255
477
  }
256
478
  ```
257
479
 
480
+ **State Description:**
481
+ - `idle`: Avatar is in idle state (breathing animation), waiting for conversation to start
482
+ - `playing`: Avatar is playing conversation content (including during transition animations)
483
+
484
+ **Note:** During transition animations, the target state is notified immediately:
485
+ - When transitioning from `idle` to `playing`, the `playing` state is notified immediately
486
+ - When transitioning from `playing` to `idle`, the `idle` state is notified immediately
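
One way to consume these notifications is a small tracker that turns them into a "speaking" flag for the UI. The enum below is a local mirror of `ConversationState`, and `SpeakingIndicator` is a hypothetical class, not part of the SDK; it only assumes the callback receives the target state as described above.

```typescript
enum ConversationState { idle = 'idle', playing = 'playing' } // local mirror

// Hypothetical helper that turns state notifications into a speaking flag.
class SpeakingIndicator {
  private state: ConversationState = ConversationState.idle
  readonly history: ConversationState[] = []

  // Arrow function so it can be assigned directly as a callback.
  handle = (state: ConversationState): void => {
    if (state === this.state) return // ignore repeated notifications
    this.state = state
    this.history.push(state)
  }

  get isSpeaking(): boolean {
    return this.state === ConversationState.playing
  }
}
```

In an application this could be wired up as `avatarView.avatarController.onConversationState = indicator.handle`.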
487
+
258
488
  ## 🎨 Rendering System
259
489
 
260
490
  The SDK supports two rendering backends:
@@ -264,57 +494,6 @@ The SDK supports two rendering backends:
264
494
 
265
495
  The rendering system automatically selects the best backend, no manual configuration needed.
266
496
 
267
- ## 🔍 Debugging and Monitoring
268
-
269
- ### Logging System
270
-
271
- The SDK has a built-in complete logging system, supporting different levels of log output:
272
-
273
- ```typescript
274
- import { logger } from '@spatialwalk/avatarkit'
275
-
276
- // Set log level
277
- logger.setLevel('verbose') // 'basic' | 'verbose'
278
-
279
- // Manual log output
280
- logger.log('Info message')
281
- logger.warn('Warning message')
282
- logger.error('Error message')
283
- ```
284
-
285
- ### Performance Monitoring
286
-
287
- The SDK provides performance monitoring interfaces to monitor rendering performance:
288
-
289
- ```typescript
290
- // Get rendering performance statistics
291
- const stats = avatarView.getPerformanceStats()
292
-
293
- if (stats) {
294
- console.log(`Render time: ${stats.renderTime.toFixed(2)}ms`)
295
- console.log(`Sort time: ${stats.sortTime.toFixed(2)}ms`)
296
- console.log(`Rendering backend: ${stats.backend}`)
297
-
298
- // Calculate frame rate
299
- const fps = 1000 / stats.renderTime
300
- console.log(`Frame rate: ${fps.toFixed(2)} FPS`)
301
- }
302
-
303
- // Regular performance monitoring
304
- setInterval(() => {
305
- const stats = avatarView.getPerformanceStats()
306
- if (stats) {
307
- // Send to monitoring service or display on UI
308
- console.log('Performance:', stats)
309
- }
310
- }, 1000)
311
- ```
312
-
313
- **Performance Statistics Description:**
314
- - `renderTime`: Total rendering time (milliseconds), includes sorting and GPU rendering
315
- - `sortTime`: Sorting time (milliseconds), uses Radix Sort algorithm to depth-sort point cloud
316
- - `backend`: Currently used rendering backend (`'webgpu'` | `'webgl'` | `null`)
317
-
318
497
  ## 🚨 Error Handling
319
498
 
320
499
  ### SPAvatarError
@@ -348,22 +527,43 @@ avatarView.avatarController.onError = (error: Error) => {
348
527
 
349
528
  ### Lifecycle Management
350
529
 
530
+ #### SDK Mode Lifecycle
531
+
351
532
  ```typescript
352
533
  // Initialize
534
+ const container = document.getElementById('avatar-container')
353
535
  const avatarView = new AvatarView(avatar, container)
354
536
  await avatarView.avatarController.start()
355
537
 
356
538
  // Use
357
539
  avatarView.avatarController.send(audioData, false)
358
540
 
359
- // Cleanup (must be called before switching characters)
541
+ // Cleanup
542
+ avatarView.avatarController.close()
543
+ avatarView.dispose() // Automatically cleans up all resources
544
+ ```
545
+
546
+ #### Host Mode Lifecycle
547
+
548
+ ```typescript
549
+ // Initialize
550
+ const container = document.getElementById('avatar-container')
551
+ const avatarView = new AvatarView(avatar, container)
552
+
553
+ // Use
554
+ const conversationId = avatarView.avatarController.yieldAudioData(audioChunk, false)
555
+ avatarView.avatarController.yieldFramesData(keyframes, conversationId)
556
+
557
+ // Cleanup
558
+ avatarView.avatarController.clear() // Clear all data and resources
360
559
  avatarView.dispose() // Automatically cleans up all resources
361
560
  ```
362
561
 
363
562
  **⚠️ Important Notes:**
364
- - SDK currently only supports one AvatarView instance at a time
365
- - When switching characters, must first call `dispose()` to clean up old AvatarView, then create new instance
563
+ - Always call `dispose()` on an AvatarView instance you no longer need, so that its resources are cleaned up properly
366
564
  - Not properly cleaning up may cause resource leaks and rendering errors
565
+ - In SDK mode, call `close()` before `dispose()` to properly close WebSocket connections
566
+ - In Host mode, call `clear()` before `dispose()` to clear all playback data
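
The cleanup order in these notes can be sketched as a single teardown helper. `ControllerLike`, `ViewLike`, and `teardown` are illustrative names defined here, not SDK exports; the interfaces only mirror the `close()`, `clear()`, and `dispose()` methods named above, and the mode is passed in explicitly because this sketch does not assume any way to read it back from the view.

```typescript
// Local stand-ins for the methods named in the notes above.
interface ControllerLike {
  close(): void // SDK mode only
  clear(): void
}
interface ViewLike {
  avatarController: ControllerLike
  dispose(): void
}

type Mode = 'sdk' | 'host'

// Hypothetical helper: run the mode-specific cleanup, then always dispose.
function teardown(view: ViewLike, mode: Mode): void {
  try {
    if (mode === 'sdk') {
      view.avatarController.close()
    } else {
      view.avatarController.clear()
    }
  } finally {
    view.dispose() // runs even if close()/clear() throws
  }
}
```

Putting `dispose()` in `finally` is a design choice of this sketch: it keeps a failed `close()` or `clear()` from leaving the view undisposed.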
367
567
 
368
568
  ### Memory Optimization
369
569
 
@@ -371,16 +571,6 @@ avatarView.dispose() // Automatically cleans up all resources
371
571
  - Supports dynamic loading/unloading of character and animation resources
372
572
  - Provides memory usage monitoring interface
373
573
 
374
- ### Audio Data Sending
375
-
376
- The `send()` method receives audio data in `ArrayBuffer` format:
377
-
378
- **Usage:**
379
- - `audioData`: Audio data (ArrayBuffer format)
380
- - `end=false` (default) - Normal audio data sending, server will accumulate audio data, automatically returns animation data and starts synchronized playback of animation and audio after accumulating enough data
381
- - `end=true` - Immediately return animation data, no longer accumulating, used for ending current conversation or scenarios requiring immediate response
382
- - **Important**: No need to wait for `end=true` to start playing, it will automatically start playing after accumulating enough audio data
383
-
384
574
  ## 🌐 Browser Compatibility
385
575
 
386
576
  - **Chrome/Edge** 90+ (WebGPU recommended)
@@ -400,5 +590,5 @@ Issues and Pull Requests are welcome!
400
590
 
401
591
  For questions, please contact:
402
592
  - Email: support@spavatar.com
403
- - Documentation: https://docs.spavatar.com
593
+ - Documentation: https://docs.spatialreal.ai
404
594
  - GitHub: https://github.com/spavatar/sdk