@spatialwalk/avatarkit 1.0.0-beta.5 → 1.0.0-beta.51

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (99)
  1. package/CHANGELOG.md +447 -3
  2. package/README.md +266 -283
  3. package/dist/StreamingAudioPlayer-DXKLgGU3.js +445 -0
  4. package/dist/animation/AnimationWebSocketClient.d.ts +9 -24
  5. package/dist/animation/utils/eventEmitter.d.ts +0 -4
  6. package/dist/animation/utils/flameConverter.d.ts +3 -11
  7. package/dist/audio/AnimationPlayer.d.ts +4 -32
  8. package/dist/audio/StreamingAudioPlayer.d.ts +11 -75
  9. package/dist/avatar_core_wasm-i0Ocpx6q.js +2693 -0
  10. package/dist/avatar_core_wasm.wasm +0 -0
  11. package/dist/config/app-config.d.ts +1 -6
  12. package/dist/config/constants.d.ts +11 -25
  13. package/dist/config/sdk-config-loader.d.ts +2 -9
  14. package/dist/core/Avatar.d.ts +0 -14
  15. package/dist/core/AvatarController.d.ts +40 -116
  16. package/dist/core/AvatarDownloader.d.ts +0 -95
  17. package/dist/core/AvatarManager.d.ts +10 -18
  18. package/dist/core/AvatarSDK.d.ts +21 -0
  19. package/dist/core/AvatarView.d.ts +25 -110
  20. package/dist/core/NetworkLayer.d.ts +1 -59
  21. package/dist/generated/common/v1/models.d.ts +29 -0
  22. package/dist/generated/driveningress/v1/driveningress.d.ts +1 -12
  23. package/dist/generated/driveningress/v2/driveningress.d.ts +81 -3
  24. package/dist/generated/google/protobuf/struct.d.ts +5 -39
  25. package/dist/generated/google/protobuf/timestamp.d.ts +1 -103
  26. package/dist/index-D2_q6K22.js +14575 -0
  27. package/dist/index.d.ts +1 -6
  28. package/dist/index.js +17 -18
  29. package/dist/renderer/RenderSystem.d.ts +1 -79
  30. package/dist/renderer/covariance.d.ts +0 -12
  31. package/dist/renderer/renderer.d.ts +6 -2
  32. package/dist/renderer/sortSplats.d.ts +0 -11
  33. package/dist/renderer/webgl/reorderData.d.ts +0 -13
  34. package/dist/renderer/webgl/webglRenderer.d.ts +19 -42
  35. package/dist/renderer/webgpu/webgpuRenderer.d.ts +18 -31
  36. package/dist/types/character-settings.d.ts +0 -5
  37. package/dist/types/character.d.ts +3 -21
  38. package/dist/types/index.d.ts +85 -36
  39. package/dist/utils/animation-interpolation.d.ts +3 -13
  40. package/dist/utils/client-id.d.ts +1 -0
  41. package/dist/utils/conversationId.d.ts +1 -0
  42. package/dist/utils/error-utils.d.ts +1 -25
  43. package/dist/utils/heartbeat-manager.d.ts +18 -0
  44. package/dist/utils/id-manager.d.ts +38 -0
  45. package/dist/utils/logger.d.ts +5 -11
  46. package/dist/utils/posthog-tracker.d.ts +11 -0
  47. package/dist/utils/pwa-cache-manager.d.ts +16 -0
  48. package/dist/utils/usage-tracker.d.ts +5 -0
  49. package/dist/vanilla/vite.config.d.ts +2 -0
  50. package/dist/wasm/avatarCoreAdapter.d.ts +11 -97
  51. package/dist/wasm/avatarCoreMemory.d.ts +5 -54
  52. package/package.json +6 -3
  53. package/dist/StreamingAudioPlayer-8Dz_aHCW.js +0 -319
  54. package/dist/StreamingAudioPlayer-8Dz_aHCW.js.map +0 -1
  55. package/dist/animation/AnimationWebSocketClient.d.ts.map +0 -1
  56. package/dist/animation/utils/eventEmitter.d.ts.map +0 -1
  57. package/dist/animation/utils/flameConverter.d.ts.map +0 -1
  58. package/dist/audio/AnimationPlayer.d.ts.map +0 -1
  59. package/dist/audio/StreamingAudioPlayer.d.ts.map +0 -1
  60. package/dist/avatar_core_wasm-D4eEi7Eh.js +0 -1666
  61. package/dist/avatar_core_wasm-D4eEi7Eh.js.map +0 -1
  62. package/dist/config/app-config.d.ts.map +0 -1
  63. package/dist/config/constants.d.ts.map +0 -1
  64. package/dist/config/sdk-config-loader.d.ts.map +0 -1
  65. package/dist/core/Avatar.d.ts.map +0 -1
  66. package/dist/core/AvatarController.d.ts.map +0 -1
  67. package/dist/core/AvatarDownloader.d.ts.map +0 -1
  68. package/dist/core/AvatarKit.d.ts +0 -66
  69. package/dist/core/AvatarKit.d.ts.map +0 -1
  70. package/dist/core/AvatarManager.d.ts.map +0 -1
  71. package/dist/core/AvatarView.d.ts.map +0 -1
  72. package/dist/core/NetworkLayer.d.ts.map +0 -1
  73. package/dist/generated/driveningress/v1/driveningress.d.ts.map +0 -1
  74. package/dist/generated/driveningress/v2/driveningress.d.ts.map +0 -1
  75. package/dist/generated/google/protobuf/struct.d.ts.map +0 -1
  76. package/dist/generated/google/protobuf/timestamp.d.ts.map +0 -1
  77. package/dist/index-bbc7bE-q.js +0 -5942
  78. package/dist/index-bbc7bE-q.js.map +0 -1
  79. package/dist/index.d.ts.map +0 -1
  80. package/dist/index.js.map +0 -1
  81. package/dist/renderer/RenderSystem.d.ts.map +0 -1
  82. package/dist/renderer/covariance.d.ts.map +0 -1
  83. package/dist/renderer/renderer.d.ts.map +0 -1
  84. package/dist/renderer/sortSplats.d.ts.map +0 -1
  85. package/dist/renderer/webgl/reorderData.d.ts.map +0 -1
  86. package/dist/renderer/webgl/webglRenderer.d.ts.map +0 -1
  87. package/dist/renderer/webgpu/webgpuRenderer.d.ts.map +0 -1
  88. package/dist/types/character-settings.d.ts.map +0 -1
  89. package/dist/types/character.d.ts.map +0 -1
  90. package/dist/types/index.d.ts.map +0 -1
  91. package/dist/utils/animation-interpolation.d.ts.map +0 -1
  92. package/dist/utils/cls-tracker.d.ts +0 -17
  93. package/dist/utils/cls-tracker.d.ts.map +0 -1
  94. package/dist/utils/error-utils.d.ts.map +0 -1
  95. package/dist/utils/logger.d.ts.map +0 -1
  96. package/dist/utils/reqId.d.ts +0 -20
  97. package/dist/utils/reqId.d.ts.map +0 -1
  98. package/dist/wasm/avatarCoreAdapter.d.ts.map +0 -1
  99. package/dist/wasm/avatarCoreMemory.d.ts.map +0 -1
package/README.md CHANGED
@@ -1,4 +1,4 @@
1
- # SPAvatarKit SDK
1
+ # SPAvatarSDK SDK
2
2
 
3
3
  Real-time virtual avatar rendering SDK based on 3D Gaussian Splatting, supporting audio-driven animation rendering and high-quality 3D rendering.
4
4
 
@@ -6,6 +6,7 @@ Real-time virtual avatar rendering SDK based on 3D Gaussian Splatting, supportin
6
6
 
7
7
  - **3D Gaussian Splatting Rendering** - Based on the latest point cloud rendering technology, providing high-quality 3D virtual avatars
8
8
  - **Audio-Driven Real-Time Animation Rendering** - Users provide audio data, SDK handles receiving animation data and rendering
9
+ - **Multi-Character Support** - Support multiple avatar instances simultaneously, each with independent state and rendering
9
10
  - **WebGPU/WebGL Dual Rendering Backend** - Automatically selects the best rendering backend for compatibility
10
11
  - **WASM High-Performance Computing** - Uses C++ compiled WebAssembly modules for geometric calculations
11
12
  - **TypeScript Support** - Complete type definitions and IntelliSense
@@ -23,84 +24,86 @@ npm install @spatialwalk/avatarkit
23
24
 
24
25
  ```typescript
25
26
  import {
26
- AvatarKit,
27
+ AvatarSDK,
27
28
  AvatarManager,
28
29
  AvatarView,
29
30
  Configuration,
30
- Environment
31
+ Environment,
32
+ DrivingServiceMode,
33
+ LogLevel
31
34
  } from '@spatialwalk/avatarkit'
32
35
 
33
36
  // 1. Initialize SDK
37
+
34
38
  const configuration: Configuration = {
35
- environment: Environment.test,
39
+ environment: Environment.cn,
40
+ drivingServiceMode: DrivingServiceMode.sdk, // Optional, 'sdk' is default
41
+ // - DrivingServiceMode.sdk: SDK mode - SDK handles WebSocket communication
42
+ // - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
43
+ logLevel: LogLevel.off, // Optional, 'off' is default
44
+ // - LogLevel.off: Disable all logs
45
+ // - LogLevel.error: Only error logs
46
+ // - LogLevel.warning: Warning and error logs
47
+ // - LogLevel.all: All logs (info, warning, error)
48
+ audioFormat: { // Optional, default is { channelCount: 1, sampleRate: 16000 }
49
+ channelCount: 1, // Fixed to 1 (mono)
50
+ sampleRate: 16000 // Supported: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz
51
+ }
52
+ // characterApiBaseUrl: 'https://custom-api.example.com' // Optional, internal debug config, can be ignored
36
53
  }
37
54
 
38
- await AvatarKit.initialize('your-app-id', configuration)
55
+ await AvatarSDK.initialize('your-app-id', configuration)
39
56
 
40
57
  // Set sessionToken (if needed, call separately)
41
- // AvatarKit.setSessionToken('your-session-token')
58
+ // AvatarSDK.setSessionToken('your-session-token')
42
59
 
43
60
  // 2. Load character
44
- const avatarManager = new AvatarManager()
61
+ const avatarManager = AvatarManager.shared
45
62
  const avatar = await avatarManager.load('character-id', (progress) => {
46
63
  console.log(`Loading progress: ${progress.progress}%`)
47
64
  })
48
65
 
49
66
  // 3. Create view (automatically creates Canvas and AvatarController)
50
- // Network mode (default)
67
+ // The playback mode is determined by drivingServiceMode in AvatarSDK configuration
68
+ // - DrivingServiceMode.sdk: SDK mode - SDK handles WebSocket communication
69
+ // - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
51
70
  const container = document.getElementById('avatar-container')
52
- const avatarView = new AvatarView(avatar, {
53
- container: container,
54
- playbackMode: 'network' // Optional, 'network' is default
55
- })
71
+ const avatarView = new AvatarView(avatar, container)
56
72
 
57
- // 4. Start real-time communication (network mode only)
73
+ // 4. Start real-time communication (SDK mode only)
58
74
  await avatarView.avatarController.start()
59
75
 
60
- // 5. Send audio data (network mode)
61
- // ⚠️ Important: Audio must be 16kHz mono PCM16 format
62
- // If audio is Uint8Array, you can use slice().buffer to convert to ArrayBuffer
63
- const audioUint8 = new Uint8Array(1024) // Example: 16kHz PCM16 audio data (512 samples = 1024 bytes)
64
- const audioData = audioUint8.slice().buffer // Simplified conversion, works for ArrayBuffer and SharedArrayBuffer
65
- avatarView.avatarController.send(audioData, false) // Send audio data, will automatically start playing after accumulating enough data
66
- avatarView.avatarController.send(audioData, true) // end=true means immediately return animation data, no longer accumulating
76
+ // 5. Send audio data (SDK mode, must be mono PCM16 format matching configured sample rate)
77
+ const audioData = new ArrayBuffer(1024) // Example: PCM16 audio data at configured sample rate
78
+ avatarView.avatarController.send(audioData, false) // Send audio data
79
+ avatarView.avatarController.send(audioData, true) // end=true marks the end of current conversation round
67
80
  ```
68
81
 
69
- ### External Data Mode Example
82
+ ### Host Mode Example
70
83
 
71
84
  ```typescript
72
- import { AvatarPlaybackMode } from '@spatialwalk/avatarkit'
73
85
 
74
- // 1-3. Same as network mode (initialize SDK, load character)
86
+ // 1-3. Same as SDK mode (initialize SDK, load character)
75
87
 
76
- // 3. Create view with external data mode
88
+ // 3. Create view with Host mode
77
89
  const container = document.getElementById('avatar-container')
78
- const avatarView = new AvatarView(avatar, {
79
- container: container,
80
- playbackMode: AvatarPlaybackMode.external
81
- })
82
-
83
- // 4. Start playback with initial data (obtained from your service)
84
- // Note: Audio and animation data should be obtained from your backend service
85
- const initialAudioChunks = [{ data: audioData1, isLast: false }, { data: audioData2, isLast: false }]
86
- const initialKeyframes = animationData1 // Animation keyframes from your service
87
-
88
- await avatarView.avatarController.play(initialAudioChunks, initialKeyframes)
90
+ const avatarView = new AvatarView(avatar, container)
89
91
 
90
- // 5. Stream additional data as needed
91
- avatarView.avatarController.sendAudioChunk(audioData3, false)
92
- avatarView.avatarController.sendKeyframes(animationData2)
92
+ // 4. Host Mode Workflow:
93
+ // Send audio data first to get conversationId, then use it to send animation data
94
+ const conversationId = avatarView.avatarController.yieldAudioData(audioData, false)
95
+ avatarView.avatarController.yieldFramesData(animationDataArray, conversationId) // animationDataArray: (Uint8Array | ArrayBuffer)[]
93
96
  ```
94
97
 
95
98
  ### Complete Examples
96
99
 
97
100
  Check the example code in the GitHub repository for complete usage flows for both modes.
98
101
 
99
- **Example Project:** [Avatarkit-web-demo](https://github.com/spatialwalk/Avatarkit-web-demo)
102
+ **Example Project:** [AvatarSDK-Web-Demo](https://github.com/spatialwalk/AvatarSDK-Web-Demo)
100
103
 
101
104
  This repository contains complete examples for Vanilla JS, Vue 3, and React, demonstrating:
102
- - Network mode: Real-time audio input with automatic animation data reception
103
- - External data mode: Custom data sources with manual audio/animation data management
105
+ - SDK mode: Real-time audio input with automatic animation data reception
106
+ - Host mode: Custom data sources with manual audio/animation data management
104
107
 
105
108
  ## 🏗️ Architecture Overview
106
109
 
@@ -110,47 +113,60 @@ The SDK uses a three-layer architecture for clear separation of concerns:
110
113
 
111
114
  1. **Rendering Layer (AvatarView)** - Responsible for 3D rendering only
112
115
  2. **Playback Layer (AvatarController)** - Manages audio/animation synchronization and playback
113
- 3. **Network Layer (NetworkLayer)** - Handles WebSocket communication (only in network mode)
116
+ 3. **Network Layer** - Handles WebSocket communication (only in SDK mode, internal implementation)
114
117
 
115
118
  ### Core Components
116
119
 
117
- - **AvatarKit** - SDK initialization and management
120
+ - **AvatarSDK** - SDK initialization and management
118
121
  - **AvatarManager** - Character resource loading and management
119
122
  - **AvatarView** - 3D rendering view (rendering layer)
120
123
  - **AvatarController** - Audio/animation playback controller (playback layer)
121
- - **NetworkLayer** - WebSocket communication (network layer, automatically composed in network mode)
122
- - **AvatarCoreAdapter** - WASM module adapter
123
124
 
124
125
  ### Playback Modes
125
126
 
126
- The SDK supports two playback modes, configured when creating `AvatarView`:
127
+ The SDK supports two playback modes, configured in `AvatarSDK.initialize()`:
127
128
 
128
- #### 1. Network Mode (Default)
129
+ #### 1. SDK Mode (Default)
130
+ - Configured via `drivingServiceMode: DrivingServiceMode.sdk` in `AvatarSDK.initialize()`
129
131
  - SDK handles WebSocket communication automatically
130
132
  - Send audio data via `AvatarController.send()`
131
133
  - SDK receives animation data from backend and synchronizes playback
132
134
  - Best for: Real-time audio input scenarios
133
135
 
134
- #### 2. External Data Mode
135
- - External components manage their own network/data fetching
136
- - External components provide both audio and animation data
136
+ #### 2. Host Mode
137
+ - Configured via `drivingServiceMode: DrivingServiceMode.host` in `AvatarSDK.initialize()`
138
+ - Host application manages its own network/data fetching
139
+ - Host application provides both audio and animation data
137
140
  - SDK only handles synchronized playback
138
141
  - Best for: Custom data sources, pre-recorded content, or custom network implementations
139
142
 
143
+ **Note:** The playback mode is determined by `drivingServiceMode` in `AvatarSDK.initialize()` configuration.
144
+
145
+ ### Fallback Mechanism
146
+
147
+ The SDK includes a fallback mechanism to ensure audio playback continues even when animation data is unavailable:
148
+
149
+ - **SDK Mode Connection Failure**: If WebSocket connection fails to establish within 15 seconds, the SDK automatically enters fallback mode. In this mode, audio data can still be sent and will play normally, even though no animation data will be received from the server. This ensures that audio playback is not interrupted even when the service connection fails.
150
+ - **SDK Mode Server Error**: If the server returns an error after connection is established, the SDK automatically enters audio-only mode for that session and continues playing audio independently.
151
+ - **Host Mode**: If empty animation data is provided (empty array or undefined), the SDK automatically enters audio-only mode.
152
+ - Once in audio-only mode, any subsequent animation data for that session will be ignored, and only audio will continue playing.
153
+ - The fallback mode is interruptible, just like normal playback mode.
154
+ - Connection state callbacks (`onConnectionState`) will notify you when connection fails or times out, allowing you to handle the fallback state appropriately.
155
+
140
156
  ### Data Flow
141
157
 
142
- #### Network Mode Flow
158
+ #### SDK Mode Flow
143
159
 
144
160
  ```
145
161
  User audio input (16kHz mono PCM16)
146
162
 
147
163
  AvatarController.send()
148
164
 
149
- NetworkLayer → WebSocket → Backend processing
165
+ WebSocket → Backend processing
150
166
 
151
167
  Backend returns animation data (FLAME keyframes)
152
168
 
153
- NetworkLayer → AvatarController → AnimationPlayer
169
+ AvatarController → AnimationPlayer
154
170
 
155
171
  FLAME parameters → AvatarCore.computeFrameFlatFromParams() → Splat data
156
172
 
@@ -159,15 +175,14 @@ AvatarController (playback loop) → AvatarView.renderRealtimeFrame()
159
175
  RenderSystem → WebGPU/WebGL → Canvas rendering
160
176
  ```
161
177
 
162
- #### External Data Mode Flow
178
+ #### Host Mode Flow
163
179
 
164
180
  ```
165
181
  External data source (audio + animation)
166
182
 
167
- AvatarController.play(initialAudio, initialKeyframes) // Start playback
183
+ AvatarController.yieldAudioData(audioChunk) // Returns conversationId
168
184
 
169
- AvatarController.sendAudioChunk() // Stream additional audio
170
- AvatarController.sendKeyframes() // Stream additional animation
185
+ AvatarController.yieldFramesData(keyframesDataArray, conversationId) // keyframesDataArray: (Uint8Array | ArrayBuffer)[] - each element is a protobuf encoded Message
171
186
 
172
187
  AvatarController → AnimationPlayer (synchronized playback)
173
188
 
@@ -178,52 +193,84 @@ AvatarController (playback loop) → AvatarView.renderRealtimeFrame()
178
193
  RenderSystem → WebGPU/WebGL → Canvas rendering
179
194
  ```
180
195
 
181
- **Note:**
182
- - In network mode, users provide audio data, SDK handles network communication and animation data reception
183
- - In external data mode, users provide both audio and animation data, SDK handles synchronized playback only
184
-
185
196
  ### Audio Format Requirements
186
197
 
187
- **⚠️ Important:** The SDK requires audio data to be in **16kHz mono PCM16** format:
198
+ **⚠️ Important:** The SDK requires audio data to be in **mono PCM16** format:
188
199
 
189
- - **Sample Rate**: 16kHz (16000 Hz) - This is a backend requirement
190
- - **Channels**: Mono (single channel)
200
+ - **Sample Rate**: Configurable via `audioFormat.sampleRate` in SDK initialization (default: 16000 Hz)
201
+ - Supported sample rates: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz
202
+ - The configured sample rate will be used for both audio recording and playback
203
+ - **Channels**: Mono (single channel) - Fixed to 1 channel
191
204
  - **Format**: PCM16 (16-bit signed integer, little-endian)
192
205
  - **Byte Order**: Little-endian
193
206
 
194
207
  **Audio Data Format:**
195
208
  - Each sample is 2 bytes (16-bit)
196
209
  - Audio data should be provided as `ArrayBuffer` or `Uint8Array`
197
- - For example: 1 second of audio = 16000 samples × 2 bytes = 32000 bytes
210
+ - For example, with 16kHz sample rate: 1 second of audio = 16000 samples × 2 bytes = 32000 bytes
211
+ - For 48kHz sample rate: 1 second of audio = 48000 samples × 2 bytes = 96000 bytes
198
212
 
199
213
  **Resampling:**
200
- - If your audio source is at a different sample rate (e.g., 24kHz, 48kHz), you must resample it to 16kHz before sending to the SDK
214
+ - If your audio source is at a different sample rate, you must resample it to match the configured sample rate before sending to the SDK
201
215
  - For high-quality resampling, we recommend using Web Audio API's `OfflineAudioContext` with anti-aliasing filtering
202
216
  - See example projects for resampling implementation
203
217
 
218
+ **Configuration Example:**
219
+ ```typescript
220
+ const configuration: Configuration = {
221
+ environment: Environment.cn,
222
+ audioFormat: {
223
+ channelCount: 1, // Fixed to 1 (mono)
224
+ sampleRate: 48000 // Choose from: 8000, 16000, 22050, 24000, 32000, 44100, 48000
225
+ }
226
+ }
227
+ ```
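Reviewer note: since the README now lists a fixed set of supported sample rates, a host app may want to check its `audioFormat` before calling `AvatarSDK.initialize()`. The following is a minimal sketch under that assumption. `AudioFormatLike` and `checkAudioFormat` are hypothetical names, and the SDK's own validation (if any) may behave differently.

```typescript
// Sample rates copied from the README; the SDK's internal list is not shown in this diff.
const SUPPORTED_SAMPLE_RATES: readonly number[] = [8000, 16000, 22050, 24000, 32000, 44100, 48000]

// Hypothetical local type mirroring the documented audioFormat shape.
interface AudioFormatLike {
  channelCount: number
  sampleRate: number
}

/** Returns a list of human-readable problems; an empty list means the format looks valid. */
function checkAudioFormat(format: AudioFormatLike): string[] {
  const problems: string[] = []
  if (format.channelCount !== 1) {
    problems.push(`channelCount must be 1 (mono), got ${format.channelCount}`)
  }
  if (SUPPORTED_SAMPLE_RATES.indexOf(format.sampleRate) === -1) {
    problems.push(`unsupported sampleRate ${format.sampleRate}`)
  }
  return problems
}
```

Returning a list rather than throwing lets the caller decide whether to log, fall back to the 16000 Hz default, or abort.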
228
+
204
229
  ## 📚 API Reference
205
230
 
206
- ### AvatarKit
231
+ ### AvatarSDK
207
232
 
208
233
  The core management class of the SDK, responsible for initialization and global configuration.
209
234
 
210
235
  ```typescript
211
236
  // Initialize SDK
212
- await AvatarKit.initialize(appId: string, configuration: Configuration)
237
+ await AvatarSDK.initialize(appId: string, configuration: Configuration)
213
238
 
214
239
  // Check initialization status
215
- const isInitialized = AvatarKit.isInitialized
240
+ const isInitialized = AvatarSDK.isInitialized
241
+
242
+ // Get initialized app ID
243
+ const appId = AvatarSDK.appId
244
+
245
+ // Get configuration
246
+ const config = AvatarSDK.configuration
247
+
248
+ // Set sessionToken (if needed, call separately)
249
+ AvatarSDK.setSessionToken('your-session-token')
250
+
251
+ // Set userId (optional, for telemetry)
252
+ AvatarSDK.setUserId('user-id')
253
+
254
+ // Get sessionToken
255
+ const sessionToken = AvatarSDK.sessionToken
256
+
257
+ // Get userId
258
+ const userId = AvatarSDK.userId
259
+
260
+ // Get SDK version
261
+ const version = AvatarSDK.version
216
262
 
217
263
  // Cleanup resources (must be called when no longer in use)
218
- AvatarKit.cleanup()
264
+ AvatarSDK.cleanup()
219
265
  ```
220
266
 
221
267
  ### AvatarManager
222
268
 
223
- Character resource manager, responsible for downloading, caching, and loading character data.
269
+ Character resource manager, responsible for downloading, caching, and loading character data. Use the singleton instance via `AvatarManager.shared`.
224
270
 
225
271
  ```typescript
226
- const manager = new AvatarManager()
272
+ // Get singleton instance
273
+ const manager = AvatarManager.shared
227
274
 
228
275
  // Load character
229
276
  const avatar = await manager.load(
@@ -239,37 +286,42 @@ manager.clearCache()
239
286
 
240
287
  3D rendering view (rendering layer), responsible for 3D rendering only. Internally automatically creates and manages `AvatarController`.
241
288
 
242
- **⚠️ Important Limitation:** Currently, the SDK only supports one AvatarView instance at a time. If you need to switch characters, you must first call the `dispose()` method to clean up the current AvatarView, then create a new instance.
289
+ ```typescript
290
+ constructor(avatar: Avatar, container: HTMLElement)
291
+ ```
292
+
293
+ **Parameters:**
294
+ - `avatar`: Avatar instance
295
+ - `container`: Canvas container element (required)
296
+ - Canvas automatically uses the full size of the container (width and height)
297
+ - Canvas aspect ratio adapts to container size - set container size to control aspect ratio
298
+ - Canvas will be automatically added to the container
299
+ - SDK automatically handles resize events via ResizeObserver
243
300
 
244
- **Playback Mode Configuration:**
301
+ **Playback Mode:**
302
+ - The playback mode is determined by `drivingServiceMode` in `AvatarSDK.initialize()` configuration
245
303
  - The playback mode is fixed when creating `AvatarView` and persists throughout its lifecycle
246
304
  - Cannot be changed after creation
247
305
 
248
306
  ```typescript
249
- import { AvatarPlaybackMode } from '@spatialwalk/avatarkit'
250
-
251
307
  // Create view (Canvas is automatically added to container)
252
- // Network mode (default)
253
308
  const container = document.getElementById('avatar-container')
254
- const avatarView = new AvatarView(avatar: Avatar, {
255
- container: container,
256
- playbackMode: AvatarPlaybackMode.network // Optional, default is 'network'
257
- })
309
+ const avatarView = new AvatarView(avatar, container)
258
310
 
259
- // External data mode
260
- const avatarView = new AvatarView(avatar: Avatar, {
261
- container: container,
262
- playbackMode: AvatarPlaybackMode.external
263
- })
264
-
265
- // Get Canvas element
266
- const canvas = avatarView.getCanvas()
311
+ // Wait for first frame to render
312
+ avatarView.onFirstRendering = () => {
313
+ // First frame rendered
314
+ }
267
315
 
268
- // Get playback mode
269
- const mode = avatarView.playbackMode // 'network' | 'external'
316
+ // Get or set avatar transform (position and scale)
317
+ // Get current transform
318
+ const currentTransform = avatarView.transform // { x: number, y: number, scale: number }
270
319
 
271
- // Update camera configuration
272
- avatarView.updateCameraConfig(cameraConfig: CameraConfig)
320
+ // Set transform
321
+ avatarView.transform = { x, y, scale }
322
+ // - x: Horizontal offset in normalized coordinates (-1 to 1, where -1 = left edge, 0 = center, 1 = right edge)
323
+ // - y: Vertical offset in normalized coordinates (-1 to 1, where -1 = bottom edge, 0 = center, 1 = top edge)
324
+ // - scale: Scale factor (1.0 = original size, 2.0 = double size, 0.5 = half size)
273
325
 
274
326
  // Cleanup resources (must be called before switching characters)
275
327
  avatarView.dispose()
@@ -278,105 +330,117 @@ avatarView.dispose()
278
330
  **Character Switching Example:**
279
331
 
280
332
  ```typescript
281
- // Before switching characters, must clean up old AvatarView first
333
+ // To switch characters, simply dispose the old view and create a new one
282
334
  if (currentAvatarView) {
283
335
  currentAvatarView.dispose()
284
- currentAvatarView = null
285
336
  }
286
337
 
287
338
  // Load new character
288
339
  const newAvatar = await avatarManager.load('new-character-id')
289
340
 
290
- // Create new AvatarView (with same or different playback mode)
291
- currentAvatarView = new AvatarView(newAvatar, {
292
- container: container,
293
- playbackMode: AvatarPlaybackMode.network
294
- })
341
+ // Create new AvatarView
342
+ currentAvatarView = new AvatarView(newAvatar, container)
295
343
 
296
- // Network mode: start connection
297
- if (currentAvatarView.playbackMode === AvatarPlaybackMode.network) {
298
- await currentAvatarView.avatarController.start()
299
- }
344
+ // SDK mode: start connection (will throw error if not in SDK mode)
345
+ await currentAvatarView.controller.start()
300
346
  ```
301
347
 
302
348
  ### AvatarController
303
349
 
304
- Audio/animation playback controller (playback layer), manages synchronized playback of audio and animation. Automatically composes `NetworkLayer` in network mode.
350
+ Audio/animation playback controller (playback layer), manages synchronized playback of audio and animation. Automatically handles WebSocket communication in SDK mode.
305
351
 
306
352
  **Two Usage Patterns:**
307
353
 
308
- #### Network Mode Methods
354
+ #### SDK Mode Methods
309
355
 
310
356
  ```typescript
311
357
  // Start WebSocket service
312
358
  await avatarView.avatarController.start()
313
359
 
314
- // Send audio data (SDK handles receiving animation data automatically)
315
- avatarView.avatarController.send(audioData: ArrayBuffer, end: boolean)
316
- // audioData: Audio data (ArrayBuffer format, must be 16kHz mono PCM16)
317
- // - Sample rate: 16kHz (16000 Hz) - backend requirement
318
- // - Format: PCM16 (16-bit signed integer, little-endian)
319
- // - Channels: Mono (single channel)
320
- // - Example: 1 second = 16000 samples × 2 bytes = 32000 bytes
321
- // end: false (default) - Normal audio data sending, server will accumulate audio data, automatically returns animation data and starts synchronized playback of animation and audio after accumulating enough data
322
- // end: true - Immediately return animation data, no longer accumulating, used for ending current conversation or scenarios requiring immediate response
360
+ // Send audio data (must be 16kHz mono PCM16 format)
361
+ const conversationId = avatarView.avatarController.send(audioData: ArrayBuffer, end: boolean)
362
+ // Returns: conversationId - Conversation ID for this conversation session
363
+ // end: false (default) - Continue sending audio data for current conversation
364
+ // end: true - Mark the end of current conversation round. After end=true, sending new audio data will interrupt any ongoing playback from the previous conversation round
323
365
 
324
366
  // Close WebSocket service
325
367
  avatarView.avatarController.close()
326
368
  ```
327
369
 
328
- #### External Data Mode Methods
370
+ #### Host Mode Methods
329
371
 
330
372
  ```typescript
331
- // Start playback with initial audio and animation data
332
- await avatarView.avatarController.play(
333
- initialAudioChunks?: Array<{ data: Uint8Array, isLast: boolean }>, // Initial audio chunks (16kHz mono PCM16)
334
- initialKeyframes?: any[] // Initial animation keyframes (obtained from your service)
335
- )
336
-
337
- // Stream additional audio chunks (after play() is called)
338
- avatarView.avatarController.sendAudioChunk(
373
+ // Stream audio chunks (must be 16kHz mono PCM16 format)
374
+ const conversationId = avatarView.avatarController.yieldAudioData(
339
375
  data: Uint8Array, // Audio chunk data
340
376
  isLast: boolean = false // Whether this is the last chunk
341
377
  )
378
+ // Returns: conversationId - Conversation ID for this audio session
342
379
 
343
- // Stream additional animation keyframes (after play() is called)
344
- avatarView.avatarController.sendKeyframes(
345
- keyframes: any[] // Additional animation keyframes (obtained from your service)
380
+ // Stream animation keyframes (requires conversationId from audio data)
381
+ avatarView.avatarController.yieldFramesData(
382
+ keyframesDataArray: (Uint8Array | ArrayBuffer)[], // Animation keyframes binary data array (each element is a protobuf encoded Message)
383
+ conversationId: string // Conversation ID (required)
346
384
  )
347
385
  ```
348
386
 
387
+ **⚠️ Important: Conversation ID (conversationId) Management**
388
+
389
+ **SDK Mode:**
390
+ - `send()` returns a conversationId to distinguish each conversation round
391
+ - `end=true` marks the end of a conversation round
392
+
393
+ **Host Mode:**
394
+ - `yieldAudioData()` returns a conversationId (automatically generates if starting new session)
395
+ - `yieldFramesData()` requires a valid conversationId parameter
396
+ - Animation data with mismatched conversationId will be **discarded**
397
+ - Use `getCurrentConversationId()` to retrieve the current active conversationId
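Reviewer note: the Host mode rules above (a new id per session, frames with a mismatched id discarded) can be modeled on the host side with a small tracker. This is a sketch of the documented contract, not the SDK's implementation. `ConversationTracker`, its methods, and the `conv-N` id format are all hypothetical. It also assumes a new session begins with the first audio chunk after a chunk marked as last.

```typescript
// Hypothetical host-side model of the documented conversationId rules.
class ConversationTracker {
  private activeId: string | null = null
  private ended = true
  private counter = 0

  /** Call once per audio chunk; returns the id the chunk belongs to. */
  noteAudio(isLast: boolean): string {
    let id = this.activeId
    if (id === null || this.ended) {
      // Assumed rule: the first chunk after a finished round starts a new conversation.
      this.counter += 1
      id = `conv-${this.counter}`
      this.activeId = id
      this.ended = false
    }
    if (isLast) this.ended = true
    return id
  }

  /** Frames are only useful if they carry the id of the current conversation. */
  shouldAcceptFrames(conversationId: string): boolean {
    return this.activeId !== null && conversationId === this.activeId
  }
}
```

In this model, frames for a finished round are still accepted until new audio starts the next round, so animation data that arrives after the last audio chunk is not dropped.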
398
+
349
399
  #### Common Methods (Both Modes)
350
400
 
351
401
  ```typescript
402
+
352
403
  // Interrupt current playback (stops and clears data)
353
404
  avatarView.avatarController.interrupt()
354
405
 
355
406
  // Clear all data and resources
356
407
  avatarView.avatarController.clear()
357
408
 
358
- // Get connection state (network mode only)
359
- const isConnected = avatarView.avatarController.connected
409
+ // Get current conversation ID (for Host mode)
410
+ const conversationId = avatarView.avatarController.getCurrentConversationId()
411
+ // Returns: Current conversationId for the active audio session, or null if no active session
360
412
 
361
- // Start service (network mode only)
362
- await avatarView.avatarController.start()
363
-
364
- // Close service (network mode only)
365
- avatarView.avatarController.close()
366
-
367
- // Get current avatar state
368
- const state = avatarView.avatarController.state
413
+ // Volume control (affects only avatar audio player, not system volume)
414
+ avatarView.avatarController.setVolume(0.5) // Set volume to 50% (0.0 to 1.0)
415
+ const currentVolume = avatarView.avatarController.getVolume() // Get current volume (0.0 to 1.0)
369
416
 
370
417
  // Set event callbacks
371
- avatarView.avatarController.onConnectionState = (state: ConnectionState) => {} // Network mode only
372
- avatarView.avatarController.onAvatarState = (state: AvatarState) => {}
418
+ avatarView.avatarController.onConnectionState = (state: ConnectionState) => {} // SDK mode only
419
+ avatarView.avatarController.onConversationState = (state: ConversationState) => {}
373
420
  avatarView.avatarController.onError = (error: Error) => {}
374
421
  ```
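
Because `setVolume()` documents a 0.0 to 1.0 range, a caller may want to sanitize UI input before passing it on. The helper below is a sketch under that assumption; `clampVolume` is a hypothetical name, and how the SDK itself treats out-of-range values is not stated in the README.

```typescript
// Clamp an arbitrary number into the 0.0 to 1.0 range that setVolume() documents.
// Non-finite input (NaN, Infinity) falls back to the supplied default.
function clampVolume(value: number, fallback: number = 1.0): number {
  if (!Number.isFinite(value)) return fallback
  return Math.min(1, Math.max(0, value))
}

// Intended usage (avatarView comes from the README examples):
// avatarView.avatarController.setVolume(clampVolume(sliderPercent / 100))
```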
375
422
 
423
+ #### Avatar Transform Methods
424
+
425
+ ```typescript
426
+ // Get or set avatar transform (position and scale in canvas)
427
+ // Get current transform
428
+ const currentTransform = avatarView.transform // { x: number, y: number, scale: number }
429
+
430
+ // Set transform
431
+ avatarView.transform = { x, y, scale }
432
+ // - x: Horizontal offset in normalized coordinates (-1 to 1, where -1 = left edge, 0 = center, 1 = right edge)
433
+ // - y: Vertical offset in normalized coordinates (-1 to 1, where -1 = bottom edge, 0 = center, 1 = top edge)
434
+ // - scale: Scale factor (1.0 = original size, 2.0 = double size, 0.5 = half size)
435
+ // Example:
436
+ avatarView.transform = { x: 0, y: 0, scale: 1.0 } // Center, original size
437
+ avatarView.transform = { x: 0.5, y: 0, scale: 2.0 } // Offset halfway toward the right edge, double size
438
+ ```
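
To reason about where a given transform lands, the normalized offsets can be converted to canvas pixels. The sketch below assumes a linear mapping and a pixel origin at the top-left of the canvas with y increasing downward; the SDK's actual rendering math is not shown in the README, and `transformToCanvasPoint` is a hypothetical helper.

```typescript
// Local mirror of the transform shape shown above.
interface AvatarTransform { x: number; y: number; scale: number }

// Map normalized offsets (-1..1) to pixel coordinates on a width x height canvas.
// x: -1 = left edge, 1 = right edge. y: -1 = bottom edge, 1 = top edge,
// so y is flipped because canvas pixel rows grow downward.
function transformToCanvasPoint(
  t: AvatarTransform,
  width: number,
  height: number,
): { px: number; py: number } {
  return {
    px: ((t.x + 1) / 2) * width,
    py: ((1 - t.y) / 2) * height,
  }
}
```

Under these assumptions, `{ x: 0.5, y: 0 }` on an 800 by 600 canvas corresponds to the point (600, 300).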
439
+
376
440
  **Important Notes:**
377
- - `start()` and `close()` are only available in network mode
378
- - `play()`, `sendAudioChunk()`, and `sendKeyframes()` are only available in external data mode
379
- - `interrupt()` and `clear()` are available in both modes
441
+ - `start()` and `close()` are only available in SDK mode
442
+ - `yieldAudioData()` and `yieldFramesData()` are only available in Host mode
443
+ - `pause()`, `resume()`, `interrupt()`, `clear()`, `getCurrentConversationId()`, `setVolume()`, and `getVolume()` are available in both modes
380
444
  - The playback mode is determined when creating `AvatarView` and cannot be changed
381
445
 
382
446
  ## 🔧 Configuration
@@ -386,40 +450,55 @@ avatarView.avatarController.onError = (error: Error) => {}
386
450
  ```typescript
387
451
  interface Configuration {
388
452
  environment: Environment
453
+ drivingServiceMode?: DrivingServiceMode // Optional, default is 'sdk' (SDK mode)
454
+ logLevel?: LogLevel // Optional, default is 'off' (no logs)
455
+ audioFormat?: AudioFormat // Optional, default is { channelCount: 1, sampleRate: 16000 }
456
+ characterApiBaseUrl?: string // Optional, internal debug config, can be ignored
389
457
  }
390
- ```
391
458
 
392
- **Description:**
393
- - `environment`: Specifies the environment (cn/us/test), SDK will automatically use the corresponding API address and WebSocket address based on the environment
394
- - `sessionToken`: Set separately via `AvatarKit.setSessionToken()`, not in Configuration
395
-
396
- ```typescript
397
- enum Environment {
398
- cn = 'cn', // China region
399
- us = 'us', // US region
400
- test = 'test' // Test environment
459
+ interface AudioFormat {
460
+ readonly channelCount: 1 // Fixed to 1 (mono)
461
+ readonly sampleRate: number // Supported: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz, default: 16000
401
462
  }
402
463
  ```
403
464
 
404
- ### AvatarViewOptions
465
+ ### LogLevel
466
+
467
+ Control the verbosity of SDK logs:
405
468
 
406
469
  ```typescript
407
- interface AvatarViewOptions {
408
- playbackMode?: AvatarPlaybackMode // Playback mode, default is 'network'
409
- container?: HTMLElement // Canvas container element
470
+ enum LogLevel {
471
+ off = 'off', // Disable all logs
472
+ error = 'error', // Only error logs
473
+ warning = 'warning', // Warning and error logs
474
+ all = 'all' // All logs (info, warning, error)
410
475
  }
411
476
  ```
412
477
 
478
+ **Note:** `LogLevel.off` completely disables all logging, including error logs. Use with caution in production environments.
479
+
413
480
  **Description:**
414
- - `playbackMode`: Specifies the playback mode (`'network'` or `'external'`), default is `'network'`
415
- - `'network'`: SDK handles WebSocket communication, send audio via `send()`
416
- - `'external'`: External components provide audio and animation data, SDK handles synchronized playback
417
- - `container`: Optional container element for Canvas, if not provided, Canvas will be created but not added to DOM
481
+ - `environment`: Specifies the environment (cn/intl); the SDK automatically uses the corresponding API and WebSocket addresses
482
+ - `drivingServiceMode`: Specifies the driving service mode
483
+ - `DrivingServiceMode.sdk` (default): SDK mode - SDK handles WebSocket communication automatically
484
+ - `DrivingServiceMode.host`: Host mode - Host application provides audio and animation data
485
+ - `logLevel`: Controls the verbosity of SDK logs
486
+ - `LogLevel.off` (default): Disable all logs
487
+ - `LogLevel.error`: Only error logs
488
+ - `LogLevel.warning`: Warning and error logs
489
+ - `LogLevel.all`: All logs (info, warning, error)
490
+ - `audioFormat`: Configures audio sample rate and channel count
491
+ - `channelCount`: Fixed to 1 (mono channel)
492
+ - `sampleRate`: Audio sample rate in Hz (default: 16000)
493
+ - Supported values: 8000, 16000, 22050, 24000, 32000, 44100, 48000
494
+ - The configured sample rate will be used for both audio recording and playback
495
+ - `characterApiBaseUrl`: Internal debug config, can be ignored
496
+ - `sessionToken`: Set separately via `AvatarSDK.setSessionToken()`, not in Configuration
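
The `audioFormat` constraints listed above can be checked before building a configuration. This is a sketch, not part of the SDK: `isSupportedSampleRate`, `makeAudioFormat`, and `pcm16BytesPerSecond` are hypothetical helpers, and the byte calculation assumes mono PCM16 (2 bytes per sample), which is what the earlier README required and may not hold for every setup.

```typescript
// Sample rates the README lists as supported.
const SUPPORTED_SAMPLE_RATES = [8000, 16000, 22050, 24000, 32000, 44100, 48000] as const
type SupportedSampleRate = (typeof SUPPORTED_SAMPLE_RATES)[number]

function isSupportedSampleRate(rate: number): rate is SupportedSampleRate {
  return (SUPPORTED_SAMPLE_RATES as readonly number[]).some((r) => r === rate)
}

// Build an AudioFormat-shaped object, rejecting unsupported sample rates early.
function makeAudioFormat(
  sampleRate: number = 16000,
): { readonly channelCount: 1; readonly sampleRate: SupportedSampleRate } {
  if (!isSupportedSampleRate(sampleRate)) {
    throw new RangeError(`Unsupported sample rate: ${sampleRate}`)
  }
  return { channelCount: 1, sampleRate }
}

// Bytes per second of mono PCM16 audio (assumption: 2 bytes per sample).
function pcm16BytesPerSecond(sampleRate: number): number {
  return sampleRate * 2
}
```

With the default 16000 Hz, one second of mono PCM16 audio is 32000 bytes, which is useful when sizing chunks for `send()` or `yieldAudioData()`.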
418
497
 
419
498
  ```typescript
420
- enum AvatarPlaybackMode {
421
- network = 'network', // Network mode: SDK handles WebSocket communication
422
- external = 'external' // External data mode: External provides data, SDK handles playback
499
+ enum Environment {
500
+ cn = 'cn', // China region
501
+ intl = 'intl', // International region
423
502
  }
424
503
  ```
425
504
 
@@ -450,16 +529,25 @@ enum ConnectionState {
450
529
  }
451
530
  ```
452
531
 
453
- ### AvatarState
532
+ ### ConversationState
454
533
 
455
534
  ```typescript
456
- enum AvatarState {
457
- idle = 'idle', // Idle state, showing breathing animation
458
- active = 'active', // Active, waiting for playable content
459
- playing = 'playing' // Playing
535
+ enum ConversationState {
536
+ idle = 'idle', // Idle state (breathing animation)
537
+ playing = 'playing', // Playing state (active conversation)
538
+ pausing = 'pausing' // Pausing state (paused during playback)
460
539
  }
461
540
  ```
462
541
 
542
+ **State Description:**
543
+ - `idle`: Avatar is in idle state (breathing animation), waiting for conversation to start
544
+ - `playing`: Avatar is playing conversation content (including during transition animations)
545
+ - `pausing`: Avatar playback is paused (e.g., when `end=false` and waiting for more audio data)
546
+
547
+ **Note:** During transition animations, the target state is reported immediately:
548
+ - When transitioning from `idle` to `playing`, the `playing` state is reported immediately
549
+ - When transitioning from `playing` to `idle`, the `idle` state is reported immediately
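
As an illustration of consuming these states, the sketch below maps each value to a UI label. It uses a local string union that mirrors the enum's values rather than importing `ConversationState`, and the label texts and the `describeState` name are assumptions made for this example.

```typescript
// Local mirror of the ConversationState values listed above.
type ConversationStateValue = 'idle' | 'playing' | 'pausing'

// Translate a state into a short status label for the host UI.
function describeState(state: ConversationStateValue): string {
  switch (state) {
    case 'idle':
      return 'Waiting for conversation'
    case 'playing':
      return 'Speaking'
    case 'pausing':
      return 'Paused, waiting for more audio'
  }
}

// Intended usage (avatarView comes from the README examples):
// avatarView.avatarController.onConversationState = (s) => { label.textContent = describeState(s) }
```

Because the target state is reported at the start of a transition, a label driven this way switches to "Speaking" while the transition animation is still running.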
550
+
463
551
  ## 🎨 Rendering System
464
552
 
465
553
  The SDK supports two rendering backends:
@@ -469,57 +557,6 @@ The SDK supports two rendering backends:
469
557
 
470
558
  The rendering system automatically selects the best backend, no manual configuration needed.
471
559
 
472
- ## 🔍 Debugging and Monitoring
473
-
474
- ### Logging System
475
-
476
- The SDK has a built-in complete logging system, supporting different levels of log output:
477
-
478
- ```typescript
479
- import { logger } from '@spatialwalk/avatarkit'
480
-
481
- // Set log level
482
- logger.setLevel('verbose') // 'basic' | 'verbose'
483
-
484
- // Manual log output
485
- logger.log('Info message')
486
- logger.warn('Warning message')
487
- logger.error('Error message')
488
- ```
489
-
490
- ### Performance Monitoring
491
-
492
- The SDK provides performance monitoring interfaces to monitor rendering performance:
493
-
494
- ```typescript
495
- // Get rendering performance statistics
496
- const stats = avatarView.getPerformanceStats()
497
-
498
- if (stats) {
499
- console.log(`Render time: ${stats.renderTime.toFixed(2)}ms`)
500
- console.log(`Sort time: ${stats.sortTime.toFixed(2)}ms`)
501
- console.log(`Rendering backend: ${stats.backend}`)
502
-
503
- // Calculate frame rate
504
- const fps = 1000 / stats.renderTime
505
- console.log(`Frame rate: ${fps.toFixed(2)} FPS`)
506
- }
507
-
508
- // Regular performance monitoring
509
- setInterval(() => {
510
- const stats = avatarView.getPerformanceStats()
511
- if (stats) {
512
- // Send to monitoring service or display on UI
513
- console.log('Performance:', stats)
514
- }
515
- }, 1000)
516
- ```
517
-
518
- **Performance Statistics Description:**
519
- - `renderTime`: Total rendering time (milliseconds), includes sorting and GPU rendering
520
- - `sortTime`: Sorting time (milliseconds), uses Radix Sort algorithm to depth-sort point cloud
521
- - `backend`: Currently used rendering backend (`'webgpu'` | `'webgl'` | `null`)
522
-
523
560
  ## 🚨 Error Handling
524
561
 
525
562
  ### SPAvatarError
@@ -553,15 +590,12 @@ avatarView.avatarController.onError = (error: Error) => {
553
590
 
554
591
  ### Lifecycle Management
555
592
 
556
- #### Network Mode Lifecycle
593
+ #### SDK Mode Lifecycle
557
594
 
558
595
  ```typescript
559
596
  // Initialize
560
597
  const container = document.getElementById('avatar-container')
561
- const avatarView = new AvatarView(avatar, {
562
- container: container,
563
- playbackMode: AvatarPlaybackMode.network
564
- })
598
+ const avatarView = new AvatarView(avatar, container)
565
599
  await avatarView.avatarController.start()
566
600
 
567
601
  // Use
@@ -572,21 +606,16 @@ avatarView.avatarController.close()
572
606
  avatarView.dispose() // Automatically cleans up all resources
573
607
  ```
574
608
 
575
- #### External Data Mode Lifecycle
609
+ #### Host Mode Lifecycle
576
610
 
577
611
  ```typescript
578
612
  // Initialize
579
613
  const container = document.getElementById('avatar-container')
580
- const avatarView = new AvatarView(avatar, {
581
- container: container,
582
- playbackMode: AvatarPlaybackMode.external
583
- })
614
+ const avatarView = new AvatarView(avatar, container)
584
615
 
585
616
  // Use
586
- const initialAudioChunks = [{ data: audioData1, isLast: false }]
587
- await avatarView.avatarController.play(initialAudioChunks, initialKeyframes)
588
- avatarView.avatarController.sendAudioChunk(audioChunk, false)
589
- avatarView.avatarController.sendKeyframes(keyframes)
617
+ const conversationId = avatarView.avatarController.yieldAudioData(audioChunk, false)
618
+ avatarView.avatarController.yieldFramesData(keyframesDataArray, conversationId) // keyframesDataArray: (Uint8Array | ArrayBuffer)[]
590
619
 
591
620
  // Cleanup
592
621
  avatarView.avatarController.clear() // Clear all data and resources
@@ -594,11 +623,10 @@ avatarView.dispose() // Automatically cleans up all resources
594
623
  ```
595
624
 
596
625
  **⚠️ Important Notes:**
597
- - SDK currently only supports one AvatarView instance at a time
598
- - When switching characters, must first call `dispose()` to clean up old AvatarView, then create new instance
626
+ - When an AvatarView instance is no longer needed, call `dispose()` to properly clean up its resources
599
627
  - Not properly cleaning up may cause resource leaks and rendering errors
600
- - In network mode, call `close()` before `dispose()` to properly close WebSocket connections
601
- - In external data mode, call `clear()` before `dispose()` to clear all playback data
628
+ - In SDK mode, call `close()` before `dispose()` to properly close WebSocket connections
629
+ - In Host mode, call `clear()` before `dispose()` to clear all playback data
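
The ordering rules above can be wrapped in one teardown helper so that `dispose()` always runs last. This is a sketch under stated assumptions: `DisposableView` is a hypothetical structural type covering only the three methods named in the notes, and the `'sdk'` and `'host'` strings stand in for the `DrivingServiceMode` values.

```typescript
// Stand-ins for DrivingServiceMode.sdk / DrivingServiceMode.host (assumed string values).
type DrivingMode = 'sdk' | 'host'

// Hypothetical structural type: only the methods the cleanup notes mention.
interface DisposableView {
  avatarController: { close(): void; clear(): void }
  dispose(): void
}

// SDK mode: close() then dispose(). Host mode: clear() then dispose().
// dispose() runs even if the first step throws; the error still propagates.
function teardown(view: DisposableView, mode: DrivingMode): void {
  try {
    if (mode === 'sdk') {
      view.avatarController.close()
    } else {
      view.avatarController.clear()
    }
  } finally {
    view.dispose()
  }
}
```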
602
630
 
603
631
  ### Memory Optimization
604
632
 
@@ -606,51 +634,6 @@ avatarView.dispose() // Automatically cleans up all resources
606
634
  - Supports dynamic loading/unloading of character and animation resources
607
635
  - Provides memory usage monitoring interface
608
636
 
609
- ### Audio Data Sending
610
-
611
- #### Network Mode
612
-
613
- The `send()` method receives audio data in `ArrayBuffer` format:
614
-
615
- **Audio Format Requirements:**
616
- - **Sample Rate**: 16kHz (16000 Hz) - **Backend requirement, must be exactly 16kHz**
617
- - **Format**: PCM16 (16-bit signed integer, little-endian)
618
- - **Channels**: Mono (single channel)
619
- - **Data Size**: Each sample is 2 bytes, so 1 second of audio = 16000 samples × 2 bytes = 32000 bytes
620
-
621
- **Usage:**
622
- - `audioData`: Audio data (ArrayBuffer format, must be 16kHz mono PCM16)
623
- - `end=false` (default) - Normal audio data sending, server will accumulate audio data, automatically returns animation data and starts synchronized playback of animation and audio after accumulating enough data
624
- - `end=true` - Immediately return animation data, no longer accumulating, used for ending current conversation or scenarios requiring immediate response
625
- - **Important**: No need to wait for `end=true` to start playing, it will automatically start playing after accumulating enough audio data
626
-
627
- #### External Data Mode
628
-
629
- The `play()` method starts playback with initial data, then use `sendAudioChunk()` to stream additional audio:
630
-
631
- **Audio Format Requirements:**
632
- - Same as network mode: 16kHz mono PCM16 format
633
- - Audio data should be provided as `Uint8Array` in chunks with `isLast` flag
634
-
635
- **Usage:**
636
- ```typescript
637
- // Start playback with initial audio and animation data
638
- // Note: Audio and animation data should be obtained from your backend service
639
- const initialAudioChunks = [
640
- { data: audioData1, isLast: false },
641
- { data: audioData2, isLast: false }
642
- ]
643
- await avatarController.play(initialAudioChunks, initialKeyframes)
644
-
645
- // Stream additional audio chunks
646
- avatarController.sendAudioChunk(audioChunk, isLast)
647
- ```
648
-
649
- **Resampling (Both Modes):**
650
- - If your audio source is at a different sample rate (e.g., 24kHz, 48kHz), you **must** resample it to 16kHz before sending
651
- - For high-quality resampling, use Web Audio API's `OfflineAudioContext` with anti-aliasing filtering
652
- - See example projects (`vanilla`, `react`, `vue`) for complete resampling implementation
653
-
654
637
  ## 🌐 Browser Compatibility
655
638
 
656
639
  - **Chrome/Edge** 90+ (WebGPU recommended)
@@ -670,5 +653,5 @@ Issues and Pull Requests are welcome!
670
653
 
671
654
  For questions, please contact:
672
655
  - Email: support@spavatar.com
673
- - Documentation: https://docs.spavatar.com
656
+ - Documentation: https://docs.spatialreal.ai
674
657
  - GitHub: https://github.com/spavatar/sdk