@spatialwalk/avatarkit 1.0.0-beta.2 → 1.0.0-beta.21

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (67)
  1. package/CHANGELOG.md +340 -0
  2. package/README.md +501 -194
  3. package/dist/StreamingAudioPlayer-DEXcuhRW.js +334 -0
  4. package/dist/StreamingAudioPlayer-DEXcuhRW.js.map +1 -0
  5. package/dist/animation/AnimationWebSocketClient.d.ts +7 -4
  6. package/dist/animation/AnimationWebSocketClient.d.ts.map +1 -1
  7. package/dist/audio/AnimationPlayer.d.ts +12 -0
  8. package/dist/audio/AnimationPlayer.d.ts.map +1 -1
  9. package/dist/audio/StreamingAudioPlayer.d.ts +11 -0
  10. package/dist/audio/StreamingAudioPlayer.d.ts.map +1 -1
  11. package/dist/avatar_core_wasm-BPIbbUx_.js +1664 -0
  12. package/dist/avatar_core_wasm-BPIbbUx_.js.map +1 -0
  13. package/dist/avatar_core_wasm.wasm +0 -0
  14. package/dist/config/app-config.d.ts +3 -7
  15. package/dist/config/app-config.d.ts.map +1 -1
  16. package/dist/config/constants.d.ts +19 -3
  17. package/dist/config/constants.d.ts.map +1 -1
  18. package/dist/config/sdk-config-loader.d.ts.map +1 -1
  19. package/dist/core/Avatar.d.ts +0 -8
  20. package/dist/core/Avatar.d.ts.map +1 -1
  21. package/dist/core/AvatarController.d.ts +112 -65
  22. package/dist/core/AvatarController.d.ts.map +1 -1
  23. package/dist/core/AvatarDownloader.d.ts +1 -20
  24. package/dist/core/AvatarDownloader.d.ts.map +1 -1
  25. package/dist/core/AvatarKit.d.ts +8 -15
  26. package/dist/core/AvatarKit.d.ts.map +1 -1
  27. package/dist/core/AvatarManager.d.ts +1 -4
  28. package/dist/core/AvatarManager.d.ts.map +1 -1
  29. package/dist/core/AvatarView.d.ts +65 -53
  30. package/dist/core/AvatarView.d.ts.map +1 -1
  31. package/dist/core/NetworkLayer.d.ts +8 -0
  32. package/dist/core/NetworkLayer.d.ts.map +1 -0
  33. package/dist/index-ChKhyUK4.js +6437 -0
  34. package/dist/index-ChKhyUK4.js.map +1 -0
  35. package/dist/index.d.ts +0 -1
  36. package/dist/index.d.ts.map +1 -1
  37. package/dist/index.js +14 -15
  38. package/dist/renderer/RenderSystem.d.ts +9 -76
  39. package/dist/renderer/RenderSystem.d.ts.map +1 -1
  40. package/dist/renderer/webgl/reorderData.d.ts.map +1 -1
  41. package/dist/renderer/webgl/webglRenderer.d.ts.map +1 -1
  42. package/dist/types/character.d.ts +0 -11
  43. package/dist/types/character.d.ts.map +1 -1
  44. package/dist/types/index.d.ts +18 -6
  45. package/dist/types/index.d.ts.map +1 -1
  46. package/dist/utils/cls-tracker.d.ts +17 -0
  47. package/dist/utils/cls-tracker.d.ts.map +1 -0
  48. package/dist/utils/{reqId.d.ts → conversationId.d.ts} +6 -6
  49. package/dist/utils/conversationId.d.ts.map +1 -0
  50. package/dist/utils/logger.d.ts +2 -10
  51. package/dist/utils/logger.d.ts.map +1 -1
  52. package/dist/vanilla/vite.config.d.ts +3 -0
  53. package/dist/vanilla/vite.config.d.ts.map +1 -0
  54. package/dist/wasm/avatarCoreAdapter.d.ts +58 -9
  55. package/dist/wasm/avatarCoreAdapter.d.ts.map +1 -1
  56. package/dist/wasm/avatarCoreMemory.d.ts +5 -1
  57. package/dist/wasm/avatarCoreMemory.d.ts.map +1 -1
  58. package/package.json +10 -4
  59. package/dist/StreamingAudioPlayer-CMEiGwxE.js +0 -288
  60. package/dist/StreamingAudioPlayer-CMEiGwxE.js.map +0 -1
  61. package/dist/avatar_core_wasm-DmkU6dYn.js +0 -1666
  62. package/dist/avatar_core_wasm-DmkU6dYn.js.map +0 -1
  63. package/dist/index-CNhquYUE.js +0 -9712
  64. package/dist/index-CNhquYUE.js.map +0 -1
  65. package/dist/utils/posthog-tracker.d.ts +0 -82
  66. package/dist/utils/posthog-tracker.d.ts.map +0 -1
  67. package/dist/utils/reqId.d.ts.map +0 -1
package/README.md CHANGED
@@ -1,25 +1,26 @@
1
1
  # SPAvatarKit SDK
2
2
 
3
- 基于 3D Gaussian Splatting 的实时虚拟人物头像渲染 SDK,支持音频驱动的动画渲染和高质量 3D 渲染。
3
+ Real-time virtual avatar rendering SDK based on 3D Gaussian Splatting, supporting audio-driven animation rendering and high-quality 3D rendering.
4
4
 
5
- ## 🚀 特性
5
+ ## 🚀 Features
6
6
 
7
- - **3D Gaussian Splatting 渲染** - 基于最新的点云渲染技术,提供高质量的 3D 虚拟人物
8
- - **音频驱动的实时动画渲染** - 用户提供音频数据,SDK 负责接收动画数据并渲染
9
- - **WebGPU/WebGL 双渲染后端** - 自动选择最佳渲染后端,确保兼容性
10
- - **WASM 高性能计算** - 使用 C++ 编译的 WebAssembly 模块进行几何计算
11
- - **TypeScript 支持** - 完整的类型定义和智能提示
12
- - **模块化架构** - 清晰的组件分离,易于集成和扩展
7
+ - **3D Gaussian Splatting Rendering** - Based on the latest point cloud rendering technology, providing high-quality 3D virtual avatars
8
+ - **Audio-Driven Real-Time Animation Rendering** - Users provide audio data, SDK handles receiving animation data and rendering
9
+ - **Multi-Character Support** - Support multiple avatar instances simultaneously, each with independent state and rendering
10
+ - **WebGPU/WebGL Dual Rendering Backend** - Automatically selects the best rendering backend for compatibility
11
+ - **WASM High-Performance Computing** - Uses C++ compiled WebAssembly modules for geometric calculations
12
+ - **TypeScript Support** - Complete type definitions and IntelliSense
13
+ - **Modular Architecture** - Clear component separation, easy to integrate and extend
13
14
 
14
- ## 📦 安装
15
+ ## 📦 Installation
15
16
 
16
17
  ```bash
17
18
  npm install @spatialwalk/avatarkit
18
19
  ```
19
20
 
20
- ## 🎯 快速开始
21
+ ## 🎯 Quick Start
21
22
 
22
- ### 基础使用
23
+ ### Basic Usage
23
24
 
24
25
  ```typescript
25
26
  import {
@@ -30,190 +31,469 @@ import {
30
31
  Environment
31
32
  } from '@spatialwalk/avatarkit'
32
33
 
33
- // 1. 初始化 SDK
34
+ // 1. Initialize SDK
35
+ import { DrivingServiceMode } from '@spatialwalk/avatarkit'
36
+
34
37
  const configuration: Configuration = {
35
38
  environment: Environment.test,
39
+ drivingServiceMode: DrivingServiceMode.sdk, // Optional, 'sdk' is default
40
+ // - DrivingServiceMode.sdk: SDK mode - SDK handles WebSocket communication
41
+ // - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
36
42
  }
37
43
 
38
44
  await AvatarKit.initialize('your-app-id', configuration)
39
45
 
40
- // 设置 sessionToken(如果需要,单独调用)
46
+ // Set sessionToken (if needed, call separately)
41
47
  // AvatarKit.setSessionToken('your-session-token')
42
48
 
43
- // 2. 加载角色
44
- const avatarManager = new AvatarManager()
49
+ // 2. Load character
50
+ const avatarManager = AvatarManager.shared
45
51
  const avatar = await avatarManager.load('character-id', (progress) => {
46
52
  console.log(`Loading progress: ${progress.progress}%`)
47
53
  })
48
54
 
49
- // 3. 创建视图(自动创建 Canvas AvatarController
55
+ // 3. Create view (automatically creates Canvas and AvatarController)
56
+ // The playback mode is determined by drivingServiceMode in AvatarKit configuration
57
+ // - DrivingServiceMode.sdk: SDK mode - SDK handles WebSocket communication
58
+ // - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
50
59
  const container = document.getElementById('avatar-container')
51
60
  const avatarView = new AvatarView(avatar, container)
52
61
 
53
- // 4. 启动实时通信
62
+ // 4. Start real-time communication (SDK mode only)
54
63
  await avatarView.avatarController.start()
55
64
 
56
- // 5. 发送音频数据
57
- // 如果音频是 Uint8Array,可以使用 slice().buffer 转换为 ArrayBuffer
58
- const audioUint8 = new Uint8Array(1024) // 示例:音频数据
59
- const audioData = audioUint8.slice().buffer // 简化的转换方式,适用于 ArrayBuffer SharedArrayBuffer
60
- avatarView.avatarController.send(audioData, false) // 发送音频数据,积累到一定量后会自动开始播放
61
- avatarView.avatarController.send(audioData, true) // end=true 表示立即返回动画数据,不再积累
65
+ // 5. Send audio data (SDK mode)
66
+ // ⚠️ Important: Audio must be 16kHz mono PCM16 format
67
+ // If audio is Uint8Array, you can use slice().buffer to convert to ArrayBuffer
68
+ const audioUint8 = new Uint8Array(1024) // Example: 16kHz PCM16 audio data (512 samples = 1024 bytes)
69
+ const audioData = audioUint8.slice().buffer // Simplified conversion, works for ArrayBuffer and SharedArrayBuffer
70
+ avatarView.avatarController.send(audioData, false) // Send audio data, will automatically start playing after accumulating enough data
71
+ avatarView.avatarController.send(audioData, true) // end=true marks the end of current conversation round
62
72
  ```
63
73
 
64
- ### 完整示例
74
+ ### Host Mode Example
75
+
76
+ ```typescript
77
+ import { AvatarPlaybackMode } from '@spatialwalk/avatarkit'
78
+
79
+ // 1-2. Same as SDK mode (initialize SDK, load character), but with drivingServiceMode: DrivingServiceMode.host
80
+
81
+ // 3. Create view with Host mode
82
+ const container = document.getElementById('avatar-container')
83
+ const avatarView = new AvatarView(avatar, container)
84
+
85
+ // 4. Host Mode Workflow:
86
+ // ⚠️ IMPORTANT: In Host mode, you MUST send audio data FIRST to get a conversationId,
87
+ // then use that conversationId to send animation data.
88
+ // Animation data with mismatched conversationId will be discarded.
89
+
90
+ // Option A: Playback existing audio and animation data (replay mode)
91
+ const initialAudioChunks = [{ data: audioData1, isLast: false }, { data: audioData2, isLast: false }]
92
+ const initialKeyframes = animationData1 // Animation keyframes from your service
93
+ // Step 1: Send audio first to get conversationId
94
+ const conversationId = await avatarView.avatarController.playback(initialAudioChunks, initialKeyframes)
95
+
96
+ // Option B: Stream new audio and animation data (start a new session directly)
97
+ // Step 1: Send audio data first to get conversationId (automatically generates conversationId if starting new session)
98
+ const currentConversationId = avatarView.avatarController.yieldAudioData(audioData3, false)
99
+ // Step 2: Use the conversationId to send animation data (mismatched conversationId will be discarded)
100
+ avatarView.avatarController.yieldFramesData(animationData2, currentConversationId || conversationId)
101
+ // Note: To start playback, you need to call playback() with the accumulated data, or ensure enough audio data is sent
102
+ ```
103
+
104
+ ### Complete Examples
105
+
106
+ Check the example code in the GitHub repository for complete usage flows for both modes.
107
+
108
+ **Example Project:** [AvatarKit-Web-Demo](https://github.com/spatialwalk/AvatarKit-Web-Demo)
109
+
110
+ This repository contains complete examples for Vanilla JS, Vue 3, and React, demonstrating:
111
+ - SDK mode: Real-time audio input with automatic animation data reception
112
+ - Host mode: Custom data sources with manual audio/animation data management
113
+
114
+ ## 🏗️ Architecture Overview
115
+
116
+ ### Three-Layer Architecture
65
117
 
66
- 查看 GitHub 仓库中的示例代码了解完整的使用流程。
118
+ The SDK uses a three-layer architecture for clear separation of concerns:
67
119
 
68
- **示例项目:** [Avatarkit-web-demo](https://github.com/spatialwalk/Avatarkit-web-demo)
120
+ 1. **Rendering Layer (AvatarView)** - Responsible for 3D rendering only
121
+ 2. **Playback Layer (AvatarController)** - Manages audio/animation synchronization and playback
122
+ 3. **Network Layer** - Handles WebSocket communication (only in SDK mode, internal implementation)
69
123
 
70
- 该仓库包含 Vanilla JS、Vue 3 和 React 的完整示例,展示了如何在不同框架中集成和使用 SPAvatarKit SDK。
124
+ ### Core Components
71
125
 
72
- ## 🏗️ 架构概览
126
+ - **AvatarKit** - SDK initialization and management
127
+ - **AvatarManager** - Character resource loading and management
128
+ - **AvatarView** - 3D rendering view (rendering layer)
129
+ - **AvatarController** - Audio/animation playback controller (playback layer)
73
130
 
74
- ### 核心组件
131
+ ### Playback Modes
75
132
 
76
- - **AvatarKit** - SDK 初始化和管理
77
- - **AvatarManager** - 角色资源加载和管理
78
- - **AvatarView** - 3D 渲染视图(内部包含 AvatarController)
79
- - **AvatarController** - 实时通信和数据处理
80
- - **AvatarCoreAdapter** - WASM 模块适配器
133
+ The SDK supports two playback modes, configured in `AvatarKit.initialize()`:
81
134
 
82
- ### 数据流
135
+ #### 1. SDK Mode (Default)
136
+ - Configured via `drivingServiceMode: DrivingServiceMode.sdk` in `AvatarKit.initialize()`
137
+ - SDK handles WebSocket communication automatically
138
+ - Send audio data via `AvatarController.send()`
139
+ - SDK receives animation data from backend and synchronizes playback
140
+ - Best for: Real-time audio input scenarios
141
+
142
+ #### 2. Host Mode
143
+ - Configured via `drivingServiceMode: DrivingServiceMode.host` in `AvatarKit.initialize()`
144
+ - Host application manages its own network/data fetching
145
+ - Host application provides both audio and animation data
146
+ - SDK only handles synchronized playback
147
+ - Best for: Custom data sources, pre-recorded content, or custom network implementations
148
+
149
+ **Note:** The playback mode is determined by `drivingServiceMode` in `AvatarKit.initialize()` configuration.
150
+
151
+ ### Fallback Mechanism
152
+
153
+ The SDK includes a fallback mechanism to ensure audio playback continues even when animation data is unavailable:
154
+
155
+ - **SDK Mode Connection Failure**: If WebSocket connection fails to establish within 15 seconds, the SDK automatically enters fallback mode. In this mode, audio data can still be sent and will play normally, even though no animation data will be received from the server. This ensures that audio playback is not interrupted even when the service connection fails.
156
+ - **SDK Mode Server Error**: If the server returns an error after connection is established, the SDK automatically enters audio-only mode for that session and continues playing audio independently.
157
+ - **Host Mode**: If empty animation data is provided (empty array or undefined), the SDK automatically enters audio-only mode.
158
+ - Once in audio-only mode, any subsequent animation data for that session will be ignored, and only audio will continue playing.
159
+ - The fallback mode is interruptible, just like normal playback mode.
160
+ - Connection state callbacks (`onConnectionState`) will notify you when connection fails or times out, allowing you to handle the fallback state appropriately.
161
+
162
+ ### Data Flow
163
+
164
+ #### SDK Mode Flow
83
165
 
84
166
  ```
85
- 用户音频输入(16kHz mono PCM) AvatarController WebSocket → 后台处理
86
-
87
- 后台返回动画数据(FLAME 关键帧) → AvatarController → AnimationPlayer
88
-
89
- FLAME 参数 AvatarCore.computeFrameFlatFromParams() → Splat 数据
90
-
91
- Splat 数据 RenderSystem WebGPU/WebGL → Canvas 渲染
167
+ User audio input (16kHz mono PCM16)
168
+
169
+ AvatarController.send()
170
+
171
+ WebSocket → Backend processing
172
+
173
+ Backend returns animation data (FLAME keyframes)
174
+
175
+ AvatarController → AnimationPlayer
176
+
177
+ FLAME parameters → AvatarCore.computeFrameFlatFromParams() → Splat data
178
+
179
+ AvatarController (playback loop) → AvatarView.renderRealtimeFrame()
180
+
181
+ RenderSystem → WebGPU/WebGL → Canvas rendering
92
182
  ```
93
183
 
94
- **注意:** 用户需要自己提供音频数据(16kHz mono PCM),SDK 负责接收动画数据并渲染。
184
+ #### Host Mode Flow
185
+
186
+ ```
187
+ External data source (audio + animation)
188
+
189
+ Step 1: Send audio data FIRST to get conversationId
190
+
191
+ AvatarController.playback(initialAudio, initialKeyframes) // Returns conversationId
192
+ OR
193
+ AvatarController.yieldAudioData(audioChunk) // Returns conversationId
194
+
195
+ Step 2: Use conversationId to send animation data
196
+
197
+ AvatarController.yieldFramesData(keyframes, conversationId) // Requires conversationId
198
+
199
+ AvatarController → AnimationPlayer (synchronized playback)
200
+
201
+ FLAME parameters → AvatarCore.computeFrameFlatFromParams() → Splat data
202
+
203
+ AvatarController (playback loop) → AvatarView.renderRealtimeFrame()
204
+
205
+ RenderSystem → WebGPU/WebGL → Canvas rendering
206
+ ```
207
+
208
+ **Note:**
209
+ - In SDK mode, users provide audio data, SDK handles network communication and animation data reception
210
+ - In Host mode, users provide both audio and animation data, SDK handles synchronized playback only
211
+
212
+ ### Audio Format Requirements
213
+
214
+ **⚠️ Important:** The SDK requires audio data to be in **16kHz mono PCM16** format:
215
+
216
+ - **Sample Rate**: 16kHz (16000 Hz) - This is a backend requirement
217
+ - **Channels**: Mono (single channel)
218
+ - **Format**: PCM16 (16-bit signed integer, little-endian)
219
+ - **Byte Order**: Little-endian
220
+
221
+ **Audio Data Format:**
222
+ - Each sample is 2 bytes (16-bit)
223
+ - Audio data should be provided as `ArrayBuffer` or `Uint8Array`
224
+ - For example: 1 second of audio = 16000 samples × 2 bytes = 32000 bytes
95
225
 
96
- ## 📚 API 参考
226
+ **Resampling:**
227
+ - If your audio source is at a different sample rate (e.g., 24kHz, 48kHz), you must resample it to 16kHz before sending to the SDK
228
+ - For high-quality resampling, we recommend using Web Audio API's `OfflineAudioContext` with anti-aliasing filtering
229
+ - See example projects for resampling implementation
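The format rules added above (16-bit signed, little-endian, 2 bytes per sample) can be illustrated with a small conversion sketch. It assumes the source audio is already 16 kHz mono and held as `Float32Array` samples in the range [-1, 1], as the Web Audio API produces. The helper names `floatToPcm16` and `pcm16ByteLength` are hypothetical and not part of the SDK.

```typescript
// Hypothetical helper (not an SDK API): convert Float32 samples in [-1, 1]
// to PCM16 little-endian bytes, clamping out-of-range values.
function floatToPcm16(samples: Float32Array): Uint8Array {
  const bytes = new Uint8Array(samples.length * 2) // 2 bytes per sample
  const view = new DataView(bytes.buffer)
  for (let i = 0; i < samples.length; i++) {
    const sample = samples[i] ?? 0
    const clamped = Math.max(-1, Math.min(1, sample))
    // Negative values scale to -32768, positive values to 32767.
    const value = clamped < 0 ? Math.round(clamped * 0x8000) : Math.round(clamped * 0x7fff)
    view.setInt16(i * 2, value, true) // true = little-endian
  }
  return bytes
}

// Hypothetical helper: byte length of a PCM16 mono buffer of the given duration.
function pcm16ByteLength(seconds: number, sampleRate: number = 16000): number {
  return Math.round(seconds * sampleRate) * 2
}
```

This sketch does not resample. Audio at 24 kHz or 48 kHz still has to be brought down to 16 kHz first, as the README lines above state.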
230
+
231
+ ## 📚 API Reference
97
232
 
98
233
  ### AvatarKit
99
234
 
100
- SDK 的核心管理类,负责初始化和全局配置。
235
+ The core management class of the SDK, responsible for initialization and global configuration.
101
236
 
102
237
  ```typescript
103
- // 初始化 SDK
238
+ // Initialize SDK
104
239
  await AvatarKit.initialize(appId: string, configuration: Configuration)
105
240
 
106
- // 检查初始化状态
241
+ // Check initialization status
107
242
  const isInitialized = AvatarKit.isInitialized
108
243
 
109
- // 清理资源(不再使用时必须调用)
244
+ // Get initialized app ID
245
+ const appId = AvatarKit.appId
246
+
247
+ // Get configuration
248
+ const config = AvatarKit.configuration
249
+
250
+ // Set sessionToken (if needed, call separately)
251
+ AvatarKit.setSessionToken('your-session-token')
252
+
253
+ // Set userId (optional, for telemetry)
254
+ AvatarKit.setUserId('user-id')
255
+
256
+ // Get sessionToken
257
+ const sessionToken = AvatarKit.sessionToken
258
+
259
+ // Get userId
260
+ const userId = AvatarKit.userId
261
+
262
+ // Get SDK version
263
+ const version = AvatarKit.version
264
+
265
+ // Cleanup resources (must be called when no longer in use)
110
266
  AvatarKit.cleanup()
111
267
  ```
112
268
 
113
269
  ### AvatarManager
114
270
 
115
- 角色资源管理器,负责下载、缓存和加载角色数据。
271
+ Character resource manager, responsible for downloading, caching, and loading character data. Use the singleton instance via `AvatarManager.shared`.
116
272
 
117
273
  ```typescript
118
- const manager = new AvatarManager()
274
+ // Get singleton instance
275
+ const manager = AvatarManager.shared
119
276
 
120
- // 加载角色
277
+ // Load character
121
278
  const avatar = await manager.load(
122
279
  characterId: string,
123
280
  onProgress?: (progress: LoadProgressInfo) => void
124
281
  )
125
282
 
126
- // 清理缓存
283
+ // Clear cache
127
284
  manager.clearCache()
128
285
  ```
129
286
 
130
287
  ### AvatarView
131
288
 
132
- 3D 渲染视图,内部自动创建和管理 AvatarController
289
+ 3D rendering view (rendering layer), responsible for 3D rendering only. Internally automatically creates and manages `AvatarController`.
133
290
 
134
- **⚠️ 重要限制:** 目前 SDK 只支持同时存在一个 AvatarView 实例。如果需要切换角色,必须先调用 `dispose()` 方法清理当前的 AvatarView,然后再创建新的实例。
291
+ **Playback Mode Configuration:**
292
+ - The playback mode is fixed when creating `AvatarView` and persists throughout its lifecycle
293
+ - Cannot be changed after creation
135
294
 
136
295
  ```typescript
137
- // 创建视图(Canvas 会自动添加到容器中)
138
- const avatarView = new AvatarView(avatar: Avatar, container?: HTMLElement)
296
+ import { AvatarPlaybackMode } from '@spatialwalk/avatarkit'
139
297
 
140
- // 获取 Canvas 元素
141
- const canvas = avatarView.getCanvas()
298
+ // Create view (Canvas is automatically added to container)
299
+ // Create view (playback mode is determined by drivingServiceMode in AvatarKit configuration)
300
+ const container = document.getElementById('avatar-container')
301
+ const avatarView = new AvatarView(avatar, container)
142
302
 
143
- // 设置背景
144
- avatarView.setBackgroundImage('path/to/image.jpg')
145
- avatarView.setBackgroundOpaque(true)
303
+ // Get playback mode
304
+ const mode = avatarView.playbackMode // 'network' | 'external'
146
305
 
147
- // 更新相机配置
148
- avatarView.updateCameraConfig(cameraConfig: CameraConfig)
306
+ // Wait for first frame to render
307
+ await avatarView.ready // Promise that resolves when the first frame is rendered
149
308
 
150
- // 清理资源(切换角色前必须调用)
309
+ // Cleanup resources (must be called before switching characters)
151
310
  avatarView.dispose()
152
311
  ```
153
312
 
154
- **切换角色示例:**
313
+ **Character Switching Example:**
155
314
 
156
315
  ```typescript
157
- // 切换角色前,必须先清理旧的 AvatarView
316
+ // To switch characters, simply dispose the old view and create a new one
158
317
  if (currentAvatarView) {
159
318
  currentAvatarView.dispose()
160
- currentAvatarView = null
161
319
  }
162
320
 
163
- // 加载新角色
321
+ // Load new character
164
322
  const newAvatar = await avatarManager.load('new-character-id')
165
323
 
166
- // 创建新的 AvatarView
324
+ // Create new AvatarView
167
325
  currentAvatarView = new AvatarView(newAvatar, container)
168
- await currentAvatarView.avatarController.start()
326
+
327
+ // SDK mode: start connection
328
+ if (currentAvatarView.playbackMode === AvatarPlaybackMode.network) {
329
+ await currentAvatarView.controller.start()
330
+ }
169
331
  ```
170
332
 
171
333
  ### AvatarController
172
334
 
173
- 实时通信控制器,处理 WebSocket 连接和动画数据。
335
+ Audio/animation playback controller (playback layer), manages synchronized playback of audio and animation. Automatically handles WebSocket communication in SDK mode.
336
+
337
+ **Two Usage Patterns:**
338
+
339
+ #### SDK Mode Methods
174
340
 
175
341
  ```typescript
176
- // 启动连接
342
+ // Start WebSocket service
177
343
  await avatarView.avatarController.start()
178
344
 
179
- // 发送音频数据
180
- avatarView.avatarController.send(audioData: ArrayBuffer, end: boolean)
181
- // audioData: 音频数据(ArrayBuffer 格式)
182
- // end: false(默认)- 正常发送音频数据,服务端会积累音频数据,积累到一定量后会自动返回动画数据并开始同步播放动画和音频
183
- // end: true - 立即返回动画数据,不再积累,用于结束当前对话或需要立即响应的场景
345
+ // Send audio data
346
+ const conversationId = avatarView.avatarController.send(audioData: ArrayBuffer, end: boolean)
347
+ // Returns: conversationId - Conversation ID for this conversation session (used to distinguish each conversation round)
348
+ // audioData: Audio data (ArrayBuffer format, must be 16kHz mono PCM16)
349
+ // - Sample rate: 16kHz (16000 Hz) - backend requirement
350
+ // - Format: PCM16 (16-bit signed integer, little-endian)
351
+ // - Channels: Mono (single channel)
352
+ // - Example: 1 second = 16000 samples × 2 bytes = 32000 bytes
353
+ // end: false (default) - Continue sending audio data for current conversation
354
+ // end: true - Mark the end of current conversation round. After end=true, sending new audio data will interrupt any ongoing playback from the previous conversation round
355
+
356
+ // Close WebSocket service
357
+ avatarView.avatarController.close()
358
+ ```
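The `send(audioData, end)` contract added above suggests a simple streaming pattern: pass `end=false` for every chunk except the last, then `end=true` to close the conversation round. A minimal sketch follows, assuming a caller-supplied `send` callback that would wrap `avatarView.avatarController.send`. `SendFn` and `sendInChunks` are hypothetical names, not SDK APIs.

```typescript
// Hypothetical callback type mirroring the documented send(audioData, end) shape.
type SendFn = (audio: ArrayBuffer, end: boolean) => void

// Split a PCM16 buffer into fixed-size chunks and mark only the final chunk with end=true.
// Returns the number of chunks sent. An empty input sends nothing (and no end marker).
function sendInChunks(pcm: Uint8Array, chunkBytes: number, send: SendFn): number {
  if (chunkBytes <= 0 || chunkBytes % 2 !== 0) {
    // PCM16 samples are 2 bytes each, so chunks should not split a sample.
    throw new Error('chunkBytes must be a positive even number')
  }
  let sent = 0
  for (let offset = 0; offset < pcm.length; offset += chunkBytes) {
    const end = Math.min(offset + chunkBytes, pcm.length)
    // Copy into a fresh ArrayBuffer so each chunk owns exactly its own bytes.
    const buffer = new ArrayBuffer(end - offset)
    new Uint8Array(buffer).set(pcm.subarray(offset, end))
    send(buffer, end >= pcm.length)
    sent++
  }
  return sent
}
```

In an application the callback might look like `(audio, end) => avatarView.avatarController.send(audio, end)`. The appropriate chunk size is not specified in this diff and is left to the caller.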
359
+
360
+ #### Host Mode Methods
361
+
362
+ ```typescript
363
+ // Playback existing audio and animation data (starts a new conversation)
364
+ const conversationId = await avatarView.avatarController.playback(
365
+ initialAudioChunks?: Array<{ data: Uint8Array, isLast: boolean }>, // Existing audio chunks (16kHz mono PCM16)
366
+ initialKeyframes?: any[] // Existing animation keyframes (obtained from your service)
367
+ )
368
+ // Returns: conversationId - New conversation ID for this conversation session
369
+
370
+ // Stream audio chunks (can be called directly to start a new session, or after playback() to add more data)
371
+ const conversationId = avatarView.avatarController.yieldAudioData(
372
+ data: Uint8Array, // Audio chunk data
373
+ isLast: boolean = false // Whether this is the last chunk
374
+ )
375
+ // Returns: conversationId - Conversation ID for this audio session
376
+ // Note: If no conversationId exists, a new one will be automatically generated
377
+
378
+ // Stream animation keyframes (requires conversationId from audio data)
379
+ avatarView.avatarController.yieldFramesData(
380
+ keyframes: any[], // Animation keyframes (obtained from your service)
381
+ conversationId: string // Conversation ID (required). Use getCurrentConversationId() or yieldAudioData() to get conversationId.
382
+ )
383
+ ```
384
+
385
+ **⚠️ Important: Conversation ID (conversationId) Management**
386
+
387
+ **SDK Mode:**
388
+ - `send()` returns a conversationId to distinguish each conversation round
389
+ - `end=true` marks the end of a conversation round. After `end=true`, sending new audio data will interrupt any ongoing playback from the previous conversation round
390
+
391
+ **Host Mode:**
392
+ For each conversation session, you **must**:
393
+ 1. **First send audio data** to get a conversationId (used to distinguish each conversation round):
394
+ - `playback()` returns a conversationId when playing back existing audio and animation data (replay mode)
395
+ - `yieldAudioData()` returns a conversationId for streaming new audio data
396
+ 2. **Then use that conversationId** to send animation data:
397
+ - `yieldFramesData()` requires a valid conversationId parameter
398
+ - Animation data with mismatched conversationId will be **discarded**
399
+ - Use `getCurrentConversationId()` to retrieve the current active conversationId
400
+
401
+ **Example Flow (Host Mode):**
402
+ ```typescript
403
+ // Option A: Playback existing complete data (replay mode)
404
+ const conversationId = await avatarView.avatarController.playback(initialAudioChunks, initialKeyframes)
405
+
406
+ // Option B: Start streaming new data directly
407
+ // Step 1: Send audio data first to get conversationId (automatically generates if starting new session)
408
+ const streamConversationId = avatarView.avatarController.yieldAudioData(audioChunk, false)
409
+ // Step 2: Use the conversationId to send animation data
410
+ avatarView.avatarController.yieldFramesData(keyframes, streamConversationId)
411
+ // Note: To start playback with Option B, call playback() with accumulated data or ensure enough audio is sent
412
+ ```
413
+
414
+ **Why conversationId is required:**
415
+ - Ensures audio and animation data belong to the same conversation session
416
+ - Prevents data from different sessions from being mixed
417
+ - Automatically discards mismatched animation data for data integrity
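The discard rule described above can be pictured with a toy model. This is only an illustration of the documented behavior (audio first yields an id, frames with a different id are dropped). It is not the SDK's internal implementation, and `ConversationGate` with all its members is a hypothetical name.

```typescript
// Toy model of the documented rule: frames are accepted only when their
// conversationId matches the id handed out when audio was first sent.
class ConversationGate<Frame> {
  private currentId: string | null = null
  private frames: Frame[] = []
  private counter = 0

  // Called when audio arrives: reuses the active id or generates a new one.
  onAudio(): string {
    if (this.currentId === null) {
      this.counter += 1
      this.currentId = `conv-${this.counter}`
      this.frames = []
    }
    return this.currentId
  }

  // Returns true if the frames were buffered, false if they were discarded.
  acceptFrames(frames: Frame[], conversationId: string): boolean {
    if (this.currentId === null || conversationId !== this.currentId) {
      return false // mismatched or no active conversation: discard
    }
    this.frames = this.frames.concat(frames)
    return true
  }

  // Ends the active conversation, so the next audio starts a new one.
  end(): void {
    this.currentId = null
  }

  get bufferedCount(): number {
    return this.frames.length
  }
}
```

The model makes the ordering requirement visible: calling `acceptFrames` before any `onAudio` call always returns `false`, which matches the "send audio first" instruction in the added README text.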
418
+
419
+ #### Common Methods (Both Modes)
420
+
421
+ ```typescript
422
+ // Pause playback (can be resumed later)
423
+ avatarView.avatarController.pause()
424
+
425
+ // Resume playback (from paused state)
426
+ await avatarView.avatarController.resume()
184
427
 
185
- // 打断对话
428
+ // Interrupt current playback (stops and clears data)
186
429
  avatarView.avatarController.interrupt()
187
430
 
188
- // 关闭连接
189
- avatarView.avatarController.close()
431
+ // Clear all data and resources
432
+ avatarView.avatarController.clear()
433
+
434
+ // Get current conversation ID (for Host mode)
435
+ const conversationId = avatarView.avatarController.getCurrentConversationId()
436
+ // Returns: Current conversationId for the active audio session, or null if no active session
190
437
 
191
- // 设置事件回调
192
- avatarView.avatarController.onConnectionState = (state: ConnectionState) => {}
438
+ // Set event callbacks
439
+ avatarView.avatarController.onConnectionState = (state: ConnectionState) => {} // SDK mode only
193
440
  avatarView.avatarController.onAvatarState = (state: AvatarState) => {}
194
441
  avatarView.avatarController.onError = (error: Error) => {}
195
-
196
- // 注意:不支持 sendText() 方法,调用会抛出错误
197
442
  ```
198
443
 
199
- ## 🔧 配置
444
+ **Important Notes:**
445
+ - `start()` and `close()` are only available in SDK mode
446
+ - `playback()`, `yieldAudioData()`, and `yieldFramesData()` are only available in Host mode
447
+ - `pause()`, `resume()`, `interrupt()`, `clear()`, and `getCurrentConversationId()` are available in both modes
448
+ - The playback mode is determined when creating `AvatarView` and cannot be changed
449
+ - **Conversation ID**: In Host mode, always send audio data first to obtain a conversationId, then use that conversationId when sending animation data. Animation data with mismatched conversationId will be discarded. Use `getCurrentConversationId()` to retrieve the current active conversationId.
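The availability rules in these notes can be summarized as a lookup table. The sketch below restates them with plain string types rather than the SDK's enums. `DrivingMode`, `PlaybackMode`, and `isMethodAvailable` are hypothetical names, and `send` is grouped with the SDK-mode methods because the README lists it under "SDK Mode Methods".

```typescript
// String stand-ins for DrivingServiceMode and AvatarPlaybackMode as described in the README.
type DrivingMode = 'sdk' | 'host'
type PlaybackMode = 'network' | 'external'

// README: DrivingServiceMode.sdk corresponds to 'network', DrivingServiceMode.host to 'external'.
const PLAYBACK_MODE: Record<DrivingMode, PlaybackMode> = {
  sdk: 'network',
  host: 'external',
}

// Methods the README documents as mode-specific.
const MODE_ONLY_METHODS: Record<DrivingMode, string[]> = {
  sdk: ['start', 'close', 'send'],
  host: ['playback', 'yieldAudioData', 'yieldFramesData'],
}

// Methods the README documents as available in both modes.
const COMMON_METHODS: string[] = ['pause', 'resume', 'interrupt', 'clear', 'getCurrentConversationId']

// Hypothetical check: is a controller method documented as usable in the given mode?
function isMethodAvailable(mode: DrivingMode, method: string): boolean {
  return COMMON_METHODS.indexOf(method) !== -1 || MODE_ONLY_METHODS[mode].indexOf(method) !== -1
}
```

A host application could use a table like this to fail fast in its own wrapper code. How the SDK itself reacts to a call made in the wrong mode is not shown in this diff.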
450
+
451
+ ## 🔧 Configuration
200
452
 
201
453
  ### Configuration
202
454
 
203
455
  ```typescript
204
456
  interface Configuration {
205
457
  environment: Environment
458
+ drivingServiceMode?: DrivingServiceMode // Optional, default is 'sdk' (SDK mode)
206
459
  }
207
460
  ```
208
461
 
209
- **说明:**
210
- - `environment`: 指定环境(cn/us/test),SDK 会根据环境自动使用对应的 API 地址和 WebSocket 地址
211
- - `sessionToken`: 通过 `AvatarKit.setSessionToken()` 单独设置,而不是在 Configuration
462
+ **Description:**
463
+ - `environment`: Specifies the environment (cn/us/test), SDK will automatically use the corresponding API address and WebSocket address based on the environment
464
+ - `drivingServiceMode`: Specifies the driving service mode
465
+ - `DrivingServiceMode.sdk` (default): SDK mode - SDK handles WebSocket communication automatically
466
+ - `DrivingServiceMode.host`: Host mode - Host application provides audio and animation data
467
+ - `sessionToken`: Set separately via `AvatarKit.setSessionToken()`, not in Configuration
212
468
 
469
+ ```typescript
213
470
  enum Environment {
214
- cn = 'cn', // 中国区
215
- us = 'us', // 美国区
216
- test = 'test' // 测试环境
471
+ cn = 'cn', // China region
472
+ us = 'us', // US region
473
+ test = 'test' // Test environment
474
+ }
475
+ ```
476
+
477
+ ### AvatarView Constructor
478
+
479
+ ```typescript
480
+ constructor(avatar: Avatar, container: HTMLElement)
481
+ ```
482
+
483
+ **Parameters:**
484
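+ - `avatar`: Avatar instance
485
+ - `container`: Canvas container element (required)
486
+ - Canvas automatically uses the full size of the container (width and height)
487
+ - Canvas aspect ratio adapts to the container size - set the container size to control the aspect ratio
488
+ - Canvas is automatically added to the container
489
+
490
+ **Note:** The playback mode is determined by `drivingServiceMode` in the `AvatarKit.initialize()` configuration, not by a constructor parameter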
491
+ - SDK automatically handles resize events via ResizeObserver
492
+
493
+ ```typescript
494
+ enum AvatarPlaybackMode {
495
+ network = 'network', // SDK mode: SDK handles WebSocket communication
496
+ external = 'external' // Host mode: Host provides data, SDK handles playback
217
497
  }
218
498
  ```
219
499
 
@@ -221,17 +501,17 @@ enum Environment {
221
501
 
222
502
  ```typescript
223
503
  interface CameraConfig {
224
- position: [number, number, number] // 相机位置
225
- target: [number, number, number] // 相机目标
226
- fov: number // 视野角度
227
- near: number // 近裁剪面
228
- far: number // 远裁剪面
229
- up?: [number, number, number] // 上方向
230
- aspect?: number // 宽高比
504
+ position: [number, number, number] // Camera position
505
+ target: [number, number, number] // Camera target
506
+ fov: number // Field of view angle
507
+ near: number // Near clipping plane
508
+ far: number // Far clipping plane
509
+ up?: [number, number, number] // Up direction
510
+ aspect?: number // Aspect ratio
231
511
  }
232
512
  ```
233
513
 
234
- ## 📊 状态管理
514
+ ## 📊 State Management
235
515
 
236
516
  ### ConnectionState
237
517
 
@@ -248,77 +528,27 @@ enum ConnectionState {
248
528
 
249
529
  ```typescript
250
530
  enum AvatarState {
251
- idle = 'idle', // 空闲状态,呈现呼吸态
252
- active = 'active', // 活跃中,等待可播放内容
253
- playing = 'playing' // 播放中
531
+ idle = 'idle', // Idle state, showing breathing animation
532
+ active = 'active', // Active, waiting for playable content
533
+ playing = 'playing', // Playing
534
+ paused = 'paused' // Paused (can be resumed)
254
535
  }
255
536
  ```
256
537
 
257
- ## 🎨 渲染系统
258
-
259
- SDK 支持两种渲染后端:
260
-
261
- - **WebGPU** - 现代浏览器的高性能渲染
262
- - **WebGL** - 兼容性更好的传统渲染
263
-
264
- 渲染系统会自动选择最佳的后端,无需手动配置。
265
-
266
- ## 🔍 调试和监控
267
-
268
- ### 日志系统
269
-
270
- SDK 内置了完整的日志系统,支持不同级别的日志输出:
271
-
272
- ```typescript
273
- import { logger } from '@spatialwalk/avatarkit'
274
-
275
- // 设置日志级别
276
- logger.setLevel('verbose') // 'basic' | 'verbose'
277
-
278
- // 手动日志输出
279
- logger.log('Info message')
280
- logger.warn('Warning message')
281
- logger.error('Error message')
282
- ```
283
-
284
- ### 性能监控
285
-
286
- SDK 提供了性能监控接口,可以监控渲染性能:
538
+ ## 🎨 Rendering System
287
539
 
288
- ```typescript
289
- // 获取渲染性能统计
290
- const stats = avatarView.getPerformanceStats()
291
-
292
- if (stats) {
293
- console.log(`渲染耗时: ${stats.renderTime.toFixed(2)}ms`)
294
- console.log(`排序耗时: ${stats.sortTime.toFixed(2)}ms`)
295
- console.log(`渲染后端: ${stats.backend}`)
296
-
297
- // 计算帧率
298
- const fps = 1000 / stats.renderTime
299
- console.log(`帧率: ${fps.toFixed(2)} FPS`)
300
- }
540
+ The SDK supports two rendering backends:
301
541
 
302
- // 定期监控性能
303
- setInterval(() => {
304
- const stats = avatarView.getPerformanceStats()
305
- if (stats) {
306
- // 发送到监控服务或显示在 UI 上
307
- console.log('Performance:', stats)
308
- }
309
- }, 1000)
310
- ```
542
+ - **WebGPU** - High-performance rendering for modern browsers
543
+ - **WebGL** - Traditional rendering with broader compatibility
311
544
 
312
- **性能统计说明**:
313
- - `renderTime`: 总渲染耗时(毫秒),包含排序和 GPU 渲染
314
- - `sortTime`: 排序耗时(毫秒),使用 Radix Sort 算法对点云进行深度排序
315
- - `backend`: 当前使用的渲染后端(`'webgpu'` | `'webgl'` | `null`)
545
+ The rendering system automatically selects the best backend; no manual configuration is needed.
316
546
 
317
- ## 🚨 错误处理
547
+ ## 🚨 Error Handling
318
548
 
319
549
  ### SPAvatarError
320
550
 
321
- SDK 使用自定义错误类型,提供更详细的错误信息:
551
+ The SDK uses a custom error type that provides more detailed error information:
322
552
 
323
553
  ```typescript
324
554
  import { SPAvatarError } from '@spatialwalk/avatarkit'
@@ -334,70 +564,147 @@ try {
334
564
  }
335
565
  ```
336
566
 
337
- ### 错误回调
567
+ ### Error Callbacks
338
568
 
339
569
  ```typescript
340
570
  avatarView.avatarController.onError = (error: Error) => {
341
571
  console.error('AvatarController error:', error)
342
- // 处理错误,比如重连、用户提示等
572
+ // Handle error, such as reconnection, user notification, etc.
343
573
  }
344
574
  ```
345
575
 
346
- ## 🔄 资源管理
576
+ ## 🔄 Resource Management
347
577
 
348
- ### 生命周期管理
578
+ ### Lifecycle Management
579
+
580
+ #### SDK Mode Lifecycle
349
581
 
350
582
  ```typescript
351
- // 初始化
583
+ // Initialize
584
+ const container = document.getElementById('avatar-container')
352
585
  const avatarView = new AvatarView(avatar, container)
353
586
  await avatarView.avatarController.start()
354
587
 
355
- // 使用
588
+ // Use
356
589
  avatarView.avatarController.send(audioData, false)
357
590
 
358
- // 清理(切换角色前必须调用)
359
- avatarView.dispose() // 自动清理所有资源
591
+ // Cleanup
592
+ avatarView.avatarController.close()
593
+ avatarView.dispose() // Automatically cleans up all resources
594
+ ```
595
+
596
+ #### Host Mode Lifecycle
597
+
598
+ ```typescript
599
+ // Initialize
600
+ const container = document.getElementById('avatar-container')
601
+ const avatarView = new AvatarView(avatar, container)
602
+
603
+ // Use
604
+ const initialAudioChunks = [{ data: audioData1, isLast: false }]
605
+ // Step 1: Send audio first to get conversationId
606
+ const conversationId = await avatarView.avatarController.playback(initialAudioChunks, initialKeyframes)
607
+ // Step 2: Stream additional audio (returns conversationId)
608
+ const currentConversationId = avatarView.avatarController.yieldAudioData(audioChunk, false)
609
+ // Step 3: Use conversationId to send animation data (mismatched conversationId will be discarded)
610
+ avatarView.avatarController.yieldFramesData(keyframes, currentConversationId || conversationId)
611
+
612
+ // Cleanup
613
+ avatarView.avatarController.clear() // Clear all data and resources
614
+ avatarView.dispose() // Automatically cleans up all resources
360
615
  ```
361
616
 
362
- **⚠️ 重要提示:**
363
- - SDK 目前只支持同时存在一个 AvatarView 实例
364
- - 切换角色时,必须先调用 `dispose()` 清理旧的 AvatarView,然后再创建新的实例
365
- - 未正确清理可能导致资源泄漏和渲染错误
617
+ **⚠️ Important Notes:**
618
+ - When you are finished with an AvatarView instance, you must call `dispose()` to properly clean up resources
619
+ - Skipping cleanup may cause resource leaks and rendering errors
620
+ - In SDK mode, call `close()` before `dispose()` to properly close WebSocket connections
621
+ - In Host mode, call `clear()` before `dispose()` to clear all playback data
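
The ordering in the notes above can be wrapped in a small helper. The following is a minimal sketch, not part of the SDK: `ControllerLike`, `AvatarViewLike`, `DrivingMode`, and `teardownAvatarView` are hypothetical names introduced here, and the interfaces only model the `close()`, `clear()`, and `dispose()` calls mentioned in this section.

```typescript
// Hypothetical minimal shapes; the real SDK types expose more members.
interface ControllerLike {
  close(): void // SDK mode: closes the WebSocket connection
  clear(): void // Host mode: clears playback data
}

interface AvatarViewLike {
  avatarController: ControllerLike
  dispose(): void
}

// Illustrative labels for the two modes, not SDK constants.
type DrivingMode = 'sdk' | 'host'

// Run the mode-specific teardown first, then always dispose the view.
function teardownAvatarView(view: AvatarViewLike, mode: DrivingMode): void {
  try {
    if (mode === 'sdk') {
      view.avatarController.close()
    } else {
      view.avatarController.clear()
    }
  } finally {
    // dispose() still runs if close()/clear() throws, so the view is not leaked.
    view.dispose()
  }
}
```

The `try`/`finally` is a design choice for this sketch: it keeps `dispose()` from being skipped when the mode-specific step fails.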
622
+
623
+ ### Memory Optimization
624
+
625
+ - SDK automatically manages WASM memory allocation
626
+ - Supports dynamic loading/unloading of character and animation resources
627
+ - Provides memory usage monitoring interface
628
+
629
+ ### Audio Data Sending
630
+
631
+ #### SDK Mode
632
+
633
+ The `send()` method receives audio data in `ArrayBuffer` format:
366
634
 
367
- ### 内存优化
635
+ **Audio Format Requirements:**
636
+ - **Sample Rate**: 16kHz (16000 Hz) - **Backend requirement, must be exactly 16kHz**
637
+ - **Format**: PCM16 (16-bit signed integer, little-endian)
638
+ - **Channels**: Mono (single channel)
639
+ - **Data Size**: Each sample is 2 bytes, so 1 second of audio = 16000 samples × 2 bytes = 32000 bytes
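
As an illustration of the format above, the sketch below converts Float32 samples (the range Web Audio typically produces) into little-endian PCM16 and computes the expected byte length. The helper names `pcm16ByteLength` and `floatToPcm16` are assumptions made for this example and are not SDK functions; the input is assumed to already be 16kHz mono.

```typescript
const SAMPLE_RATE_HZ = 16000
const BYTES_PER_SAMPLE = 2

// Expected byte length of 16kHz mono PCM16 audio for a given duration.
function pcm16ByteLength(durationSeconds: number): number {
  return Math.round(durationSeconds * SAMPLE_RATE_HZ) * BYTES_PER_SAMPLE
}

// Convert Float32 samples in [-1, 1] to little-endian signed 16-bit PCM.
function floatToPcm16(samples: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(samples.length * BYTES_PER_SAMPLE)
  const view = new DataView(buffer)
  samples.forEach((sample, index) => {
    // Clamp first so out-of-range input cannot overflow the 16-bit range.
    const clamped = Math.max(-1, Math.min(1, sample))
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff
    view.setInt16(index * BYTES_PER_SAMPLE, Math.round(scaled), true) // true = little-endian
  })
  return buffer
}
```

The resulting `ArrayBuffer` has the shape `send()` expects; in Host mode it could be wrapped with `new Uint8Array(buffer)`.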
368
640
 
369
- - SDK 自动管理 WASM 内存分配
370
- - 支持角色和动画资源的动态加载/卸载
371
- - 提供内存使用监控接口
641
+ **Usage:**
642
+ - `audioData`: Audio data (ArrayBuffer format, must be 16kHz mono PCM16)
643
+ - `end=false` (default) - Continue sending audio data for current conversation
644
+ - `end=true` - Mark the end of current conversation round. After `end=true`, sending new audio data will interrupt any ongoing playback from the previous conversation round
645
+ - **Important**: There is no need to wait for `end=true` before playback starts; playback begins automatically once enough audio data has accumulated
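
The `end` flag semantics above can be sketched as a chunked sender that marks only the final chunk with `end=true`. This is an illustration under assumptions: `AudioSender` is a hypothetical interface modelling only the `send(audioData, end)` call described here, `sendInChunks` is not an SDK function, and the 3200-byte default (100 ms at 16kHz PCM16) is an arbitrary choice.

```typescript
// Hypothetical minimal shape of the controller used in SDK mode.
interface AudioSender {
  send(audioData: ArrayBuffer, end?: boolean): void
}

// Split a PCM16 buffer into chunks and mark only the final chunk with end=true.
// Returns the number of chunks sent.
function sendInChunks(sender: AudioSender, pcm: ArrayBuffer, chunkBytes = 3200): number {
  if (chunkBytes <= 0 || chunkBytes % 2 !== 0) {
    throw new Error('chunkBytes must be a positive even number (PCM16 samples are 2 bytes)')
  }
  let sent = 0
  for (let offset = 0; offset < pcm.byteLength; offset += chunkBytes) {
    const endOffset = Math.min(offset + chunkBytes, pcm.byteLength)
    const isFinal = endOffset === pcm.byteLength
    sender.send(pcm.slice(offset, endOffset), isFinal)
    sent++
  }
  return sent
}
```

With the real SDK this would be called as `sendInChunks(avatarView.avatarController, pcmBuffer)`, assuming the controller's `send()` matches the signature shown in this section.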
372
646
 
373
- ### 音频数据发送
647
+ #### Host Mode
648
+
649
+ The `playback()` method plays back existing audio and animation data (replay mode). It generates a new conversationId and interrupts any existing conversation.
650
+
651
+ **Two ways to start a session in Host mode:**
652
+ 1. **Use `playback()`** - For replaying existing complete audio and animation data
653
+ 2. **Use `yieldAudioData()` directly** - For streaming new audio data (automatically generates conversationId if needed)
654
+
655
+ Once a session has started, use `yieldAudioData()` to stream additional audio.
656
+
657
+ **Audio Format Requirements:**
658
+ - Same as SDK mode: 16kHz mono PCM16 format
659
+ - Audio data should be provided as `Uint8Array` in chunks with `isLast` flag
660
+
661
+ **Usage:**
662
+ ```typescript
663
+ // Playback existing audio and animation data (starts a new conversation)
664
+ // Note: Audio and animation data should be obtained from your backend service
665
+ const initialAudioChunks = [
666
+ { data: audioData1, isLast: false },
667
+ { data: audioData2, isLast: false }
668
+ ]
669
+ const conversationId = await avatarController.playback(initialAudioChunks, initialKeyframes)
670
+ // Returns: conversationId - New conversation ID for this conversation session
671
+
672
+ // Stream additional audio chunks
673
+ const currentConversationId = avatarController.yieldAudioData(audioChunk, isLast)
674
+ // Returns: conversationId - Conversation ID for this audio session
675
+ ```
374
676
 
375
- `send()` 方法接收 `ArrayBuffer` 格式的音频数据:
677
+ **⚠️ Conversation ID Workflow:**
678
+ 1. **Start a session** → Choose one of two ways:
679
+ - **Option A**: Use `playback(initialAudioChunks, initialKeyframes)` to replay existing complete data
680
+ - **Option B**: Use `yieldAudioData(audioChunk)` directly to start streaming (automatically generates conversationId)
681
+ 2. **Get conversationId** → Both methods return a conversationId
682
+ 3. **Send animation with conversationId** → Pass the conversationId obtained in step 2 to `yieldFramesData()`
683
+ 4. **Data matching** → Only animation data with matching conversationId will be accepted
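
The workflow above can be sketched as a small wrapper that remembers the latest conversationId and drops stale animation frames before they reach the controller. This is a sketch under assumptions: `HostController` and `ConversationTracker` are hypothetical names, and the `string | null` return type of `yieldAudioData()` is a guess based on the `currentConversationId || conversationId` fallback shown in the lifecycle example.

```typescript
// Hypothetical minimal shape of the Host-mode controller methods named above.
interface HostController {
  yieldAudioData(data: Uint8Array, isLast: boolean): string | null
  yieldFramesData(keyframes: unknown[], conversationId: string): void
}

class ConversationTracker {
  private controller: HostController
  private currentId: string | null = null

  constructor(controller: HostController) {
    this.controller = controller
  }

  // Forward audio and remember the conversationId it belongs to.
  pushAudio(data: Uint8Array, isLast: boolean): string | null {
    const id = this.controller.yieldAudioData(data, isLast)
    if (id) this.currentId = id
    return this.currentId
  }

  // Forward frames only when they belong to the current conversation.
  // Returns false when the frames were dropped locally as stale.
  pushFrames(keyframes: unknown[], conversationId: string): boolean {
    if (conversationId !== this.currentId) return false
    this.controller.yieldFramesData(keyframes, conversationId)
    return true
  }
}
```

Filtering locally is optional, since mismatched data is discarded anyway, but it makes stale frames visible to the host application.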
376
684
 
377
- **使用说明:**
378
- - `audioData`: 音频数据(ArrayBuffer 格式)
379
- - `end=false`(默认)- 正常发送音频数据,服务端会积累音频数据,积累到一定量后会自动返回动画数据并开始同步播放动画和音频
380
- - `end=true` - 立即返回动画数据,不再积累,用于结束当前对话或需要立即响应的场景
381
- - **重要**:不需要等待 `end=true` 才开始播放,积累到一定音频数据后就会自动开始播放
685
+ **Resampling (Both Modes):**
686
+ - If your audio source is at a different sample rate (e.g., 24kHz, 48kHz), you **must** resample it to 16kHz before sending
687
+ - For high-quality resampling, use Web Audio API's `OfflineAudioContext` with anti-aliasing filtering
688
+ - See example projects (`vanilla`, `react`, `vue`) for complete resampling implementation
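
For orientation only, the sketch below shows the shape of a resampling step using plain linear interpolation. It is deliberately naive: it applies no anti-aliasing filter, so it is not the high-quality `OfflineAudioContext` approach recommended above and is not taken from the example projects. `resampleLinear` is a hypothetical helper name.

```typescript
// Naive linear-interpolation resampler (no anti-aliasing filter).
// Suitable for experiments only; prefer OfflineAudioContext in the browser.
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  if (fromRate <= 0 || toRate <= 0) throw new Error('sample rates must be positive')
  if (fromRate === toRate || input.length === 0) return input.slice()

  const outputLength = Math.floor((input.length * toRate) / fromRate)
  const output = new Float32Array(outputLength)
  const step = fromRate / toRate // input samples advanced per output sample

  for (let i = 0; i < outputLength; i++) {
    const position = i * step
    const left = Math.floor(position)
    const right = Math.min(left + 1, input.length - 1)
    const fraction = position - left
    // Weighted average of the two neighbouring input samples.
    output[i] = input[left] * (1 - fraction) + input[right] * fraction
  }
  return output
}
```

A typical call would be `resampleLinear(samples48k, 48000, 16000)`, after which the result still has to be converted to PCM16 before sending.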
382
689
 
383
- ## 🌐 浏览器兼容性
690
+ ## 🌐 Browser Compatibility
384
691
 
385
- - **Chrome/Edge** 90+ (推荐 WebGPU)
692
+ - **Chrome/Edge** 90+ (WebGPU recommended)
386
693
  - **Firefox** 90+ (WebGL)
387
694
  - **Safari** 14+ (WebGL)
388
- - **移动端** iOS 14+, Android 8+
695
+ - **Mobile** iOS 14+, Android 8+
389
696
 
390
- ## 📝 许可证
697
+ ## 📝 License
391
698
 
392
699
  MIT License
393
700
 
394
- ## 🤝 贡献
701
+ ## 🤝 Contributing
395
702
 
396
- 欢迎提交 Issue Pull Request!
703
+ Issues and Pull Requests are welcome!
397
704
 
398
- ## 📞 支持
705
+ ## 📞 Support
399
706
 
400
- 如有问题,请联系:
401
- - 邮箱:support@spavatar.com
402
- - 文档:https://docs.spavatar.com
403
- - GitHubhttps://github.com/spavatar/sdk
707
+ For questions, please contact:
708
+ - Email: support@spavatar.com
709
+ - Documentation: https://docs.spatialreal.ai
710
+ - GitHub: https://github.com/spavatar/sdk