@spatialwalk/avatarkit 1.0.0-beta.1 → 1.0.0-beta.100

Files changed (112)
  1. package/CHANGELOG.md +938 -0
  2. package/README.md +821 -208
  3. package/dist/StreamingAudioPlayer-CY6WeP2p.js +643 -0
  4. package/dist/avatar_core_wasm-6656456a.wasm +0 -0
  5. package/dist/avatar_core_wasm-Dci9E9jF.js +2696 -0
  6. package/dist/core/Avatar.d.ts +4 -14
  7. package/dist/core/AvatarController.d.ts +108 -93
  8. package/dist/core/AvatarManager.d.ts +32 -12
  9. package/dist/core/AvatarSDK.d.ts +58 -0
  10. package/dist/core/AvatarView.d.ts +132 -123
  11. package/dist/index-DADGbRoo.js +18392 -0
  12. package/dist/index.d.ts +2 -5
  13. package/dist/index.js +17 -17
  14. package/dist/next.d.ts +2 -0
  15. package/dist/performance/FrameRateMonitor.d.ts +85 -0
  16. package/dist/types/character-settings.d.ts +7 -1
  17. package/dist/types/character.d.ts +42 -16
  18. package/dist/types/index.d.ts +170 -32
  19. package/dist/vite.d.ts +19 -0
  20. package/next.d.ts +3 -0
  21. package/next.js +187 -0
  22. package/package.json +42 -15
  23. package/vite.d.ts +20 -0
  24. package/vite.js +126 -0
  25. package/dist/StreamingAudioPlayer-C2TfYsO8.js +0 -293
  26. package/dist/StreamingAudioPlayer-C2TfYsO8.js.map +0 -1
  27. package/dist/animation/AnimationWebSocketClient.d.ts +0 -50
  28. package/dist/animation/AnimationWebSocketClient.d.ts.map +0 -1
  29. package/dist/animation/utils/eventEmitter.d.ts +0 -13
  30. package/dist/animation/utils/eventEmitter.d.ts.map +0 -1
  31. package/dist/animation/utils/flameConverter.d.ts +0 -26
  32. package/dist/animation/utils/flameConverter.d.ts.map +0 -1
  33. package/dist/audio/AnimationPlayer.d.ts +0 -53
  34. package/dist/audio/AnimationPlayer.d.ts.map +0 -1
  35. package/dist/audio/StreamingAudioPlayer.d.ts +0 -113
  36. package/dist/audio/StreamingAudioPlayer.d.ts.map +0 -1
  37. package/dist/avatar_core_wasm-DmkU6dYn.js +0 -1666
  38. package/dist/avatar_core_wasm-DmkU6dYn.js.map +0 -1
  39. package/dist/avatar_core_wasm.wasm +0 -0
  40. package/dist/config/app-config.d.ts +0 -48
  41. package/dist/config/app-config.d.ts.map +0 -1
  42. package/dist/config/constants.d.ts +0 -13
  43. package/dist/config/constants.d.ts.map +0 -1
  44. package/dist/config/region-config.d.ts +0 -17
  45. package/dist/config/region-config.d.ts.map +0 -1
  46. package/dist/config/sdk-config-loader.d.ts +0 -12
  47. package/dist/config/sdk-config-loader.d.ts.map +0 -1
  48. package/dist/core/Avatar.d.ts.map +0 -1
  49. package/dist/core/AvatarController.d.ts.map +0 -1
  50. package/dist/core/AvatarDownloader.d.ts +0 -100
  51. package/dist/core/AvatarDownloader.d.ts.map +0 -1
  52. package/dist/core/AvatarKit.d.ts +0 -60
  53. package/dist/core/AvatarKit.d.ts.map +0 -1
  54. package/dist/core/AvatarManager.d.ts.map +0 -1
  55. package/dist/core/AvatarView.d.ts.map +0 -1
  56. package/dist/generated/driveningress/v1/driveningress.d.ts +0 -80
  57. package/dist/generated/driveningress/v1/driveningress.d.ts.map +0 -1
  58. package/dist/generated/driveningress/v2/driveningress.d.ts +0 -81
  59. package/dist/generated/driveningress/v2/driveningress.d.ts.map +0 -1
  60. package/dist/generated/google/protobuf/any.d.ts +0 -145
  61. package/dist/generated/google/protobuf/any.d.ts.map +0 -1
  62. package/dist/generated/google/protobuf/struct.d.ts +0 -108
  63. package/dist/generated/google/protobuf/struct.d.ts.map +0 -1
  64. package/dist/generated/google/protobuf/timestamp.d.ts +0 -129
  65. package/dist/generated/google/protobuf/timestamp.d.ts.map +0 -1
  66. package/dist/generated/jsonapi/v1/base.d.ts +0 -140
  67. package/dist/generated/jsonapi/v1/base.d.ts.map +0 -1
  68. package/dist/generated/platform/v1/asset_groups.d.ts +0 -225
  69. package/dist/generated/platform/v1/asset_groups.d.ts.map +0 -1
  70. package/dist/generated/platform/v1/assets.d.ts +0 -149
  71. package/dist/generated/platform/v1/assets.d.ts.map +0 -1
  72. package/dist/generated/platform/v1/character.d.ts +0 -395
  73. package/dist/generated/platform/v1/character.d.ts.map +0 -1
  74. package/dist/generated/platform/v1/redeem.d.ts +0 -22
  75. package/dist/generated/platform/v1/redeem.d.ts.map +0 -1
  76. package/dist/index-DwhR9l52.js +0 -9712
  77. package/dist/index-DwhR9l52.js.map +0 -1
  78. package/dist/index.d.ts.map +0 -1
  79. package/dist/index.js.map +0 -1
  80. package/dist/renderer/RenderSystem.d.ts +0 -77
  81. package/dist/renderer/RenderSystem.d.ts.map +0 -1
  82. package/dist/renderer/covariance.d.ts +0 -13
  83. package/dist/renderer/covariance.d.ts.map +0 -1
  84. package/dist/renderer/renderer.d.ts +0 -8
  85. package/dist/renderer/renderer.d.ts.map +0 -1
  86. package/dist/renderer/sortSplats.d.ts +0 -12
  87. package/dist/renderer/sortSplats.d.ts.map +0 -1
  88. package/dist/renderer/webgl/reorderData.d.ts +0 -14
  89. package/dist/renderer/webgl/reorderData.d.ts.map +0 -1
  90. package/dist/renderer/webgl/webglRenderer.d.ts +0 -66
  91. package/dist/renderer/webgl/webglRenderer.d.ts.map +0 -1
  92. package/dist/renderer/webgpu/webgpuRenderer.d.ts +0 -54
  93. package/dist/renderer/webgpu/webgpuRenderer.d.ts.map +0 -1
  94. package/dist/types/character-settings.d.ts.map +0 -1
  95. package/dist/types/character.d.ts.map +0 -1
  96. package/dist/types/index.d.ts.map +0 -1
  97. package/dist/utils/animation-interpolation.d.ts +0 -17
  98. package/dist/utils/animation-interpolation.d.ts.map +0 -1
  99. package/dist/utils/error-utils.d.ts +0 -27
  100. package/dist/utils/error-utils.d.ts.map +0 -1
  101. package/dist/utils/logger.d.ts +0 -35
  102. package/dist/utils/logger.d.ts.map +0 -1
  103. package/dist/utils/posthog-tracker.d.ts +0 -82
  104. package/dist/utils/posthog-tracker.d.ts.map +0 -1
  105. package/dist/utils/reqId.d.ts +0 -20
  106. package/dist/utils/reqId.d.ts.map +0 -1
  107. package/dist/utils/toast.d.ts +0 -74
  108. package/dist/utils/toast.d.ts.map +0 -1
  109. package/dist/wasm/avatarCoreAdapter.d.ts +0 -188
  110. package/dist/wasm/avatarCoreAdapter.d.ts.map +0 -1
  111. package/dist/wasm/avatarCoreMemory.d.ts +0 -141
  112. package/dist/wasm/avatarCoreMemory.d.ts.map +0 -1
package/README.md CHANGED
@@ -1,196 +1,809 @@
1
- # SPAvatarKit SDK
1
+ # AvatarKit SDK
2
2
 
3
- Real-time virtual avatar rendering SDK based on 3D Gaussian Splatting, supporting audio-driven animation and high-quality 3D rendering.
3
+ Real-time virtual avatar rendering SDK for Web, supporting audio-driven animation and high-quality 3D rendering.
4
4
 
5
- ## 🚀 Features
5
+ ## 🚀 Features
6
6
 
7
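- - **3D Gaussian Splatting Rendering** - Built on the latest point-cloud rendering technology, providing high-quality 3D virtual avatars
8
- - **Audio-Driven Real-Time Animation Rendering** - You provide the audio data; the SDK receives the animation data and renders it
9
- - **Dual WebGPU/WebGL Rendering Backends** - Automatically selects the best rendering backend to ensure compatibility
10
- - **High-Performance WASM Computation** - Uses a WebAssembly module compiled from C++ for geometry computation
11
- - **TypeScript Support** - Complete type definitions and IntelliSense
12
- - **Modular Architecture** - Clear component separation, easy to integrate and extend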
7
+ - **High-Quality 3D Rendering** - GPU-accelerated avatar rendering with automatic backend selection
8
+ - **Audio-Driven Real-Time Animation** - Send audio data, SDK handles animation and rendering
9
+ - **Multi-Avatar Support** - Support multiple avatar instances simultaneously, each with independent state and rendering
10
+ - **TypeScript Support** - Complete type definitions and IntelliSense
11
+ - **Modular Architecture** - Clear component separation, easy to integrate and extend
13
12
 
14
- ## 📦 Installation
13
+ ## 📦 Installation
15
14
 
16
15
  ```bash
17
16
  npm install @spatialwalk/avatarkit
18
17
  ```
19
18
 
20
- ## 🎯 Quick Start
19
+ ## 🚧 Release Gate (Hard Rule)
21
20
 
22
- ### Basic Usage
21
+ A release must pass all gates before it is published. Do not publish with ad-hoc manual commands.
22
+
23
+ Required gate checks:
24
+
25
+ ```bash
26
+ pnpm typecheck
27
+ pnpm test
28
+ pnpm build
29
+ ./tools/check_perf_baseline_release_gate.sh
30
+ ```
31
+
32
+ If the iteration includes bug fixes, `docs/bugfix-history.md` must have completed rows (test mapping + red/green evidence).
33
+
34
+ A hotfix bypass is allowed only in emergencies and must be recorded:
35
+
36
+ ```bash
37
+ HOTFIX_BYPASS=1 ./tools/check_perf_baseline_release_gate.sh
38
+ ```
39
+
40
+ ## 🧪 Benchmark Demo (Web SDK)
41
+
42
+ Use the dedicated benchmark demo (independent from `vanilla/`) for perf/render baseline runs:
43
+
44
+ ```bash
45
+ pnpm demo:benchmark
46
+ ```
47
+
48
+ ## 🚀 Demo Repository
49
+
50
+ <div align="center">
51
+
52
+ ### 📌 **Quick Start: Check Out Our Demo Repository**
53
+
54
+ We provide complete example code and best practices to help you quickly integrate the SDK.
55
+
56
+ **The demo repository includes:**
57
+ - ✅ Complete integration examples
58
+ - ✅ Usage examples for both SDK mode and Host mode
59
+ - ✅ Audio processing examples (PCM16, WAV, MP3, etc.)
60
+ - ✅ Vite configuration examples
61
+ - ✅ Next.js configuration examples
62
+ - ✅ Best practices for common scenarios
63
+
64
+ **[👉 View Demo Repository](https://github.com/spatialwalk/avatarkit-demo)** | *If not yet created, please contact the team*
65
+
66
+ </div>
67
+
68
+ ---
69
+
70
+ ## 🔧 Vite Configuration (Recommended)
71
+
72
+ If you use Vite as your build tool, we strongly recommend our Vite plugin. It handles all WASM-related configuration automatically, so you don't need to set anything up manually.
73
+
74
+ ### Using the Plugin
75
+
76
+ Add the plugin to `vite.config.ts`:
77
+
78
+ ```typescript
79
+ import { defineConfig } from 'vite'
80
+ import { avatarkitVitePlugin } from '@spatialwalk/avatarkit/vite'
81
+
82
+ export default defineConfig({
83
+ plugins: [
84
+ avatarkitVitePlugin(), // Just add this line
85
+ ],
86
+ })
87
+ ```
88
+
89
+ ### Plugin Features
90
+
91
+ The plugin automatically handles:
92
+
93
+ - ✅ **Development Server**: Automatically sets the correct MIME type (`application/wasm`) for WASM files
94
+ - ✅ **Build Time**: Automatically copies WASM files to `dist/assets/` directory
95
+ - ✅ **Cloudflare Pages**: Automatically generates `_headers` file to ensure WASM files use the correct MIME type
96
+ - ✅ **Vite Configuration**: Automatically configures `optimizeDeps`, `assetsInclude`, `assetsInlineLimit`, and other options
97
+
98
+ ### Manual Configuration (Without Plugin)
99
+
100
+ If you don't use the Vite plugin, you need to manually configure the following:
101
+
102
+ ```typescript
103
+ // vite.config.ts
+ import { defineConfig } from 'vite'
104
+ export default defineConfig({
105
+ optimizeDeps: {
106
+ exclude: ['@spatialwalk/avatarkit'],
107
+ },
108
+ assetsInclude: ['**/*.wasm'],
109
+ build: {
110
+ assetsInlineLimit: 0,
111
+ rollupOptions: {
112
+ output: {
113
+ assetFileNames: (assetInfo) => {
114
+ if (assetInfo.name?.endsWith('.wasm')) {
115
+ return 'assets/[name][extname]'
116
+ }
117
+ return 'assets/[name]-[hash][extname]'
118
+ },
119
+ },
120
+ },
121
+ },
122
+ // Development server: configureServer is a plugin hook, so the WASM MIME-type middleware is registered via an inline plugin (the plugin name is arbitrary)
123
+ plugins: [{ name: 'wasm-mime-type', configureServer(server) {
124
+ server.middlewares.use((req, res, next) => {
125
+ if (req.url?.endsWith('.wasm')) {
126
+ res.setHeader('Content-Type', 'application/wasm')
127
+ }
128
+ next()
129
+ })
130
+ } }],
131
+ })
132
+ ```
133
+
134
+ ## 🔧 Next.js Configuration
135
+
136
+ For Next.js projects, use the `withAvatarkit` wrapper to automatically handle WASM file configuration with webpack.
137
+
138
+ ### Using the Plugin
139
+
140
+ Wrap your Next.js config in `next.config.mjs`:
141
+
142
+ ```javascript
143
+ import { withAvatarkit } from '@spatialwalk/avatarkit/next'
144
+
145
+ export default withAvatarkit({
146
+ // ...your existing Next.js config
147
+ })
148
+ ```
149
+
150
+ ### Plugin Features
151
+
152
+ The plugin automatically handles:
153
+
154
+ - ✅ **Path Fix**: Patches asset path resolution so WASM files are correctly loaded at `/_next/static/chunks/`
155
+ - ✅ **WASM Copying**: Copies `.wasm` files into `static/chunks/` via a custom webpack plugin (client build only)
156
+ - ✅ **Content-Type Headers**: Adds `application/wasm` response header for `/_next/static/chunks/*.wasm`
157
+ - ✅ **Config Chaining**: Preserves your existing `webpack` and `headers` configurations
158
+
159
+ ## 🔐 Authentication
160
+
161
+ All environments require an **App ID** and **Session Token** for authentication.
162
+
163
+ ### App ID
164
+
165
+ The App ID is used to identify your application. You can obtain your App ID by:
166
+
167
+ 1. **For Testing**: Use the default test App ID provided in demo repositories (paired with test Session Token, only works with publicly available test avatars like Rohan, Dr.Kellan, Priya, Josh, etc.)
168
+ 2. **For Production**: Visit the [Developer Platform](https://dash.spatialreal.ai) to create your own App and avatars. You will receive your own App ID after creating an App.
169
+
170
+ ### Session Token
171
+
172
+ The Session Token is required for authentication and must be obtained from your SDK provider.
173
+
174
+ **⚠️ Important Notes:**
175
+ - The Session Token must be valid and not expired
176
+ - In production applications, you **must** manually inject a valid Session Token obtained from your SDK provider
177
+ - The default Session Token provided in demo repositories is **only for demonstration purposes** and can only be used with test avatars
178
+ - If you want to create your own avatars and test them, please visit the [Developer Platform](https://dash.spatialreal.ai) to create your own App and generate Session Tokens
179
+
180
+ **How to Set Session Token:**
181
+
182
+ ```typescript
183
+ // Initialize SDK with App ID
184
+ await AvatarSDK.initialize('your-app-id', configuration)
185
+
186
+ // Set Session Token (can be called before or after initialization)
187
+ // If called before initialization, the token will be automatically set when you initialize the SDK
188
+ AvatarSDK.setSessionToken('your-session-token')
189
+
190
+ // Get current Session Token
191
+ const sessionToken = AvatarSDK.sessionToken
192
+ ```
193
+
194
+ **Token Management:**
195
+ - The Session Token can be set at any time using `AvatarSDK.setSessionToken(token)`
196
+ - If you set the token before initializing the SDK, it will be automatically applied during initialization
197
+ - If you set the token after initialization, it will be applied immediately
198
+ - Handle token refresh logic in your application as needed (e.g., when token expires)
199
+
200
+ **For Production Integration:**
201
+ - Obtain a valid Session Token from your SDK provider
202
+ - Store the token securely (never expose it in client-side code if possible)
203
+ - Implement token refresh logic to handle token expiration
204
+ - Use `AvatarSDK.setSessionToken(token)` to inject the token programmatically
205
+
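The refresh logic mentioned in the list above could be structured roughly as follows. This is a minimal sketch, not part of the SDK: `TokenResponse`, `fetchToken`, `applyToken`, `msUntilRefresh`, and `refreshOnce` are hypothetical names, and it assumes your own backend returns the token together with an absolute expiry time. In an application, `applyToken` would typically wrap `AvatarSDK.setSessionToken`.

```typescript
// Hypothetical shape of what your own backend returns; not an SDK type.
interface TokenResponse {
  token: string
  expiresAtMs: number // absolute expiry time in epoch milliseconds
}

// Delay before the next refresh: refresh `leadMs` ahead of expiry, never negative.
function msUntilRefresh(expiresAtMs: number, nowMs: number, leadMs: number = 60000): number {
  return Math.max(0, expiresAtMs - nowMs - leadMs)
}

// Fetches a token, applies it, and returns the delay until the next refresh is due.
async function refreshOnce(
  fetchToken: () => Promise<TokenResponse>,
  applyToken: (token: string) => void, // e.g. (t) => AvatarSDK.setSessionToken(t)
  nowMs: number = Date.now(),
): Promise<number> {
  const { token, expiresAtMs } = await fetchToken()
  applyToken(token)
  return msUntilRefresh(expiresAtMs, nowMs)
}
```

The returned delay can be passed to `setTimeout` to schedule the next call to `refreshOnce`. Keeping the token fetch on your own backend avoids embedding long-lived credentials in client-side code.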
206
+ ## 🎯 Quick Start
207
+
208
+ ### ⚠️ Important: Audio Context Initialization
209
+
210
+ **Before using any audio-related features, you MUST initialize the audio context in a user gesture context** (e.g., `click`, `touchstart` event handlers). This is required by browser security policies. Calling `initializeAudioContext()` outside a user gesture will fail.
211
+
212
+ ### Basic Usage
23
213
 
24
214
  ```typescript
25
215
  import {
26
- AvatarKit,
216
+ AvatarSDK,
27
217
  AvatarManager,
28
218
  AvatarView,
29
219
  Configuration,
30
- Environment
220
+ Environment,
221
+ DrivingServiceMode,
222
+ LogLevel,
+ ConnectionState // needed for the connection check in step 6
31
223
  } from '@spatialwalk/avatarkit'
32
224
 
33
- // 1. Initialize SDK
225
+ // 1. Initialize SDK
226
+
34
227
  const configuration: Configuration = {
35
- environment: Environment.test,
228
+ environment: Environment.cn,
229
+ drivingServiceMode: DrivingServiceMode.sdk, // Optional, 'sdk' is default
230
+ // - DrivingServiceMode.sdk: SDK mode - SDK handles network communication
231
+ // - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
232
+ logLevel: LogLevel.off, // Optional, 'off' is default
233
+ // - LogLevel.off: Disable all logs
234
+ // - LogLevel.error: Only error logs
235
+ // - LogLevel.warning: Warning and error logs
236
+ // - LogLevel.all: All logs (info, warning, error)
237
+ audioFormat: { // Default is { channelCount: 1, sampleRate: 16000 }
238
+ channelCount: 1, // Fixed to 1 (mono)
239
+ sampleRate: 16000 // Supported: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz
240
+ // ⚠️ Must match your actual audio sample rate. Mismatched sample rate will cause playback issues.
241
+ }
242
+ // characterApiBaseUrl: 'https://custom-api.example.com' // Optional, internal debug config, can be ignored
36
243
  }
37
244
 
38
- await AvatarKit.initialize('your-app-id', configuration)
245
+ await AvatarSDK.initialize('your-app-id', configuration)
39
246
 
40
- // Set sessionToken (call separately if needed)
41
- // AvatarKit.setSessionToken('your-session-token')
247
+ // Set Session Token (required for authentication)
248
+ // You must obtain a valid Session Token from your SDK provider
249
+ // See Authentication section above for more details
250
+ AvatarSDK.setSessionToken('your-session-token')
42
251
 
43
- // 2. Load character
44
- const avatarManager = new AvatarManager()
252
+ // 2. Load avatar
253
+ const avatarManager = AvatarManager.shared
45
254
  const avatar = await avatarManager.load('character-id', (progress) => {
46
255
  console.log(`Loading progress: ${progress.progress}%`)
47
256
  })
48
257
 
49
- // 3. Create view (automatically creates Canvas and AvatarController)
258
+ // 3. Create view (automatically creates Canvas and AvatarController)
259
+ // The playback mode is determined by drivingServiceMode in AvatarSDK configuration
260
+ // - DrivingServiceMode.sdk: SDK mode - SDK handles network communication
261
+ // - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
50
262
  const container = document.getElementById('avatar-container')
51
263
  const avatarView = new AvatarView(avatar, container)
52
264
 
53
- // 4. Start real-time communication
54
- await avatarView.avatarController.start()
265
+ // 4. ⚠️ CRITICAL: Initialize audio context (MUST be called in user gesture context)
266
+ // This method MUST be called within a user gesture event handler (click, touchstart, etc.)
267
+ // to satisfy browser security policies. Calling it outside a user gesture will fail.
268
+ button.addEventListener('click', async () => {
269
+ // Initialize audio context - MUST be in user gesture context
270
+ await avatarView.controller.initializeAudioContext()
271
+
272
+ // 5. Start real-time communication (SDK mode only)
273
+ // Note: start() initiates the WebSocket connection asynchronously.
274
+ // Wait for onConnectionState === 'connected' before calling send().
275
+ await avatarView.controller.start()
276
+
277
+ // 6. Wait for connection to be ready
278
+ await new Promise<void>((resolve) => {
279
+ avatarView.controller.onConnectionState = (state) => {
280
+ if (state === ConnectionState.connected) resolve()
281
+ }
282
+ })
283
+
284
+ // 7. Send audio data (SDK mode, must be mono PCM16 format matching configured sample rate)
285
+ // audioData: ArrayBuffer or Uint8Array containing PCM16 (S16LE) audio samples
286
+ // ⚠️ Byte length MUST be even (2 bytes per sample). Odd-length data will cause server-side
287
+ // validation error and WebSocket disconnect.
288
+ // - PCM files: Can be directly read as ArrayBuffer
289
+ // - WAV files: Extract PCM data from WAV format (may require resampling)
290
+ // - MP3 files: Decode first (e.g., using AudioContext.decodeAudioData()), then convert to PCM16
291
+ const audioData = new ArrayBuffer(1024) // Placeholder: Replace with actual PCM16 audio data
292
+ avatarView.controller.send(audioData, false) // Send audio data
293
+ avatarView.controller.send(audioData, true) // end=true marks the end of current conversation round
294
+ })
295
+ ```
296
+
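The connection wait in step 6 can also be written with a timeout and with the handler registered before `start()` is called, so that a fast connection is not missed. The sketch below is an assumption-laden illustration: `ConnectionAware` and `waitForState` are hypothetical names, and the structural type only assumes that the controller exposes an assignable `onConnectionState` callback, as the example above suggests.

```typescript
// Minimal structural type for this sketch; the real controller type comes from the SDK.
interface ConnectionAware<S> {
  onConnectionState: ((state: S) => void) | null
}

// Resolves when `target` is reported, rejects if it is not seen within `timeoutMs`.
function waitForState<S>(
  controller: ConnectionAware<S>,
  target: S,
  timeoutMs: number,
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const timer = setTimeout(() => {
      controller.onConnectionState = null
      reject(new Error(`Timed out after ${timeoutMs} ms waiting for connection`))
    }, timeoutMs)
    controller.onConnectionState = (state) => {
      if (state === target) {
        clearTimeout(timer)
        resolve()
      }
    }
  })
}

// Possible usage inside the click handler (hypothetical wiring):
// const connected = waitForState(avatarView.controller, ConnectionState.connected, 15000)
// await Promise.all([connected, avatarView.controller.start()])
```

The 15000 ms value mirrors the 15-second connection timeout described in the Fallback Mechanism section. Note that this helper overwrites any handler already assigned to `onConnectionState`.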
297
+ ### Host Mode Example
298
+
299
+ ```typescript
300
+
301
+ // 1-2. Same as SDK mode (initialize SDK, load avatar)
302
+
303
+ // 3. Create view with Host mode
304
+ const container = document.getElementById('avatar-container')
305
+ const avatarView = new AvatarView(avatar, container)
55
306
 
56
- // 5. Send audio data
57
- // If the audio is a Uint8Array, you can use slice().buffer to convert it to an ArrayBuffer
58
- const audioUint8 = new Uint8Array(1024) // Example: audio data
59
- const audioData = audioUint8.slice().buffer // Simplified conversion; works for both ArrayBuffer and SharedArrayBuffer
60
- avatarView.avatarController.send(audioData, false) // Send audio data; playback starts automatically once enough has accumulated
61
- avatarView.avatarController.send(audioData, true) // end=true returns animation data immediately without further accumulation
307
+ // 4. ⚠️ CRITICAL: Initialize audio context (MUST be called in user gesture context)
308
+ // This method MUST be called within a user gesture event handler (click, touchstart, etc.)
309
+ // to satisfy browser security policies. Calling it outside a user gesture will fail.
310
+ button.addEventListener('click', async () => {
311
+ // Initialize audio context - MUST be in user gesture context
312
+ await avatarView.controller.initializeAudioContext()
313
+
314
+ // 5. Host Mode Workflow:
315
+ // Send audio data first to get conversationId, then use it to send animation data
316
+ const conversationId = avatarView.controller.yieldAudioData(audioData, false)
317
+ avatarView.controller.yieldFramesData(animationDataArray, conversationId) // animationDataArray: (Uint8Array | ArrayBuffer)[]
+ }) // closes the click handler
62
318
  ```
63
319
 
64
- ### Complete Example
320
+ ### Complete Examples
321
+
322
+ This SDK supports two usage modes:
323
+ - SDK mode: Real-time audio input with automatic animation data reception
324
+ - Host mode: Custom data sources with manual audio/animation data management
325
+
326
+ ## 🏗️ Architecture Overview
327
+
328
+ ### Core Components
329
+
330
+ - **AvatarSDK** - SDK initialization and management
331
+ - **AvatarManager** - Avatar resource loading and management
332
+ - **AvatarView** - 3D rendering view
333
+ - **AvatarController** - Audio/animation playback controller
65
334
 
66
- See the example code in the GitHub repository for the complete usage flow.
335
+ ### Playback Modes
67
336
 
68
- ## 🏗️ Architecture Overview
337
+ The SDK supports two playback modes, configured in `AvatarSDK.initialize()`:
69
338
 
70
- ### Core Components
339
+ #### 1. SDK Mode (Default)
340
+ - Configured via `drivingServiceMode: DrivingServiceMode.sdk` in `AvatarSDK.initialize()`
341
+ - SDK handles network communication automatically
342
+ - Send audio data via `AvatarController.send()`
343
+ - SDK receives animation data from backend and synchronizes playback
344
+ - Best for: Real-time audio input scenarios
71
345
 
72
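- - **AvatarKit** - SDK initialization and management
73
- - **AvatarManager** - Character resource loading and management
74
- - **AvatarView** - 3D rendering view (contains AvatarController internally)
75
- - **AvatarController** - Real-time communication and data processing
76
- - **AvatarCoreAdapter** - WASM module adapter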
346
+ #### 2. Host Mode
347
+ - Configured via `drivingServiceMode: DrivingServiceMode.host` in `AvatarSDK.initialize()`
348
+ - Host application manages its own network/data fetching
349
+ - Host application provides both audio and animation data
350
+ - SDK only handles synchronized playback
351
+ - Best for: Custom data sources, pre-recorded content, or custom network implementations
77
352
 
78
- ### Data Flow
353
+ **Note:** The playback mode is determined by `drivingServiceMode` in `AvatarSDK.initialize()` configuration.
354
+
355
+ ### Fallback Mechanism
356
+
357
+ The SDK includes a fallback mechanism to ensure audio playback continues even when animation data is unavailable:
358
+
359
+ - **SDK Mode Connection Failure**: If connection fails to establish within 15 seconds, the SDK automatically enters fallback mode. Audio data can still be sent and will play normally, even though no animation data will be received. This ensures audio playback is not interrupted.
360
+ - **SDK Mode Server Error**: If the server returns an error after connection is established, the SDK automatically enters audio-only mode for that session.
361
+ - **Host Mode**: If empty animation data is provided (empty array or undefined), the SDK automatically enters audio-only mode.
362
+ - Once in audio-only mode, any subsequent animation data for that session will be ignored, and only audio will continue playing.
363
+ - The fallback mode is interruptible, just like normal playback mode.
364
+ - Connection state callbacks (`onConnectionState`) will notify you when connection fails or times out.
365
+
366
+ ### Data Flow
367
+
368
+ #### SDK Mode Flow
79
369
 
80
370
  ```
81
- User audio input (16kHz mono PCM) → AvatarController → WebSocket → Backend processing
82
- ↓
83
- Backend returns animation data (FLAME keyframes) → AvatarController → AnimationPlayer
84
- ↓
85
- FLAME parameters → AvatarCore.computeFrameFlatFromParams() → Splat data
86
- ↓
87
- Splat data → RenderSystem → WebGPU/WebGL → Canvas rendering
371
+ Audio input (PCM16 mono)
372
+ ↓
373
+ AvatarController.send()
374
+ ↓
375
+ Backend processing → Animation data
376
+ ↓
377
+ SDK synchronizes audio + animation playback
378
+ ↓
379
+ GPU rendering → Canvas
88
380
  ```
89
381
 
90
- **Note:** You must provide the audio data yourself (16kHz mono PCM); the SDK receives the animation data and renders it.
382
+ #### Host Mode Flow
91
383
 
92
- ## 📚 API Reference
384
+ ```
385
+ External data source (audio + animation)
386
+ ↓
387
+ AvatarController.yieldAudioData(audioChunk) → returns conversationId
388
+ AvatarController.yieldFramesData(dataArray, conversationId)
389
+ ↓
390
+ SDK synchronizes audio + animation playback
391
+ ↓
392
+ GPU rendering → Canvas
393
+ ```
93
394
 
94
- ### AvatarKit
395
+ ### Audio Format Requirements
95
396
 
96
- The core management class of the SDK, responsible for initialization and global configuration.
397
+ **⚠️ Important:** The SDK requires audio data to be in **mono PCM16** format:
97
398
 
399
+ - **Sample Rate**: Configurable via `audioFormat.sampleRate` in SDK initialization (default: 16000 Hz)
400
+ - Supported sample rates: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz
401
+ - The configured sample rate will be used for both audio recording and playback
402
+ - **Channels**: Mono (single channel) - Fixed to 1 channel
403
+ - **Format**: PCM16 (16-bit signed integer, little-endian)
404
+ - **Byte Order**: Little-endian
405
+
406
+ **Audio Data Format:**
407
+ - Each sample is 2 bytes (16-bit signed integer, little-endian)
408
+ - Audio data should be provided as `ArrayBuffer` or `Uint8Array`
409
+ - For example, with 16kHz sample rate: 1 second of audio = 16000 samples × 2 bytes = 32000 bytes
410
+ - For 48kHz sample rate: 1 second of audio = 48000 samples × 2 bytes = 96000 bytes
411
+
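The byte arithmetic above can be captured in two small helpers. This is a sketch under stated assumptions: `pcm16ByteLength` and `splitPcm16` are hypothetical helper names, not SDK functions, and the chunk duration is an arbitrary choice for illustration. Because each chunk is a whole number of samples, every chunk keeps the even byte length that PCM16 requires.

```typescript
const BYTES_PER_SAMPLE = 2 // PCM16 mono: one 16-bit sample occupies 2 bytes

// Number of bytes needed for `durationMs` of mono PCM16 audio at `sampleRate`.
function pcm16ByteLength(sampleRate: number, durationMs: number): number {
  const samples = Math.round((sampleRate * durationMs) / 1000)
  return samples * BYTES_PER_SAMPLE
}

// Splits PCM16 bytes into chunks of `chunkMs` each (the last chunk may be shorter).
// The chunks are views onto the original buffer; no audio data is copied.
function splitPcm16(data: Uint8Array, sampleRate: number, chunkMs: number): Uint8Array[] {
  if (data.byteLength % BYTES_PER_SAMPLE !== 0) {
    throw new Error('PCM16 data must have an even byte length')
  }
  const chunkBytes = pcm16ByteLength(sampleRate, chunkMs)
  if (chunkBytes <= 0) {
    throw new Error('chunkMs is too small for this sample rate')
  }
  const chunks: Uint8Array[] = []
  for (let offset = 0; offset < data.byteLength; offset += chunkBytes) {
    chunks.push(data.subarray(offset, Math.min(offset + chunkBytes, data.byteLength)))
  }
  return chunks
}
```

For example, `pcm16ByteLength(16000, 1000)` gives 32000 bytes and `pcm16ByteLength(48000, 1000)` gives 96000 bytes, matching the figures in the list above.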
412
+ **Audio Data Source:**
413
+ The `audioData` parameter represents raw PCM16 audio samples in the configured sample rate and mono format. Common audio sources include:
414
+ - **PCM files**: Raw PCM16 files can be directly read as `ArrayBuffer` or `Uint8Array` and sent to the SDK (ensure sample rate matches configuration)
415
+ - **WAV files**: WAV files contain PCM16 audio data in their data chunk. After extracting the PCM data from the WAV file format, it can be sent to the SDK (may require resampling if sample rate differs)
416
+ - **MP3 files**: MP3 files need to be decoded first (e.g., using `AudioContext.decodeAudioData()` or a decoder library), then converted from the decoded format to PCM16 before sending to the SDK
417
+ - **Microphone input**: Real-time microphone audio needs to be captured and converted to PCM16 format at the configured sample rate before sending
418
+ - **Other audio sources**: Any audio source must be converted to mono PCM16 format at the configured sample rate before sending
419
+
420
+ **Example: Processing WAV and MP3 Files:**
98
421
  ```typescript
99
- // Initialize SDK
100
- await AvatarKit.initialize(appId: string, configuration: Configuration)
422
+ // WAV file processing
423
+ async function processWAVFile(wavFile: File): Promise<ArrayBuffer> {
424
+ const arrayBuffer = await wavFile.arrayBuffer()
425
+ const view = new DataView(arrayBuffer)
426
+
427
+ // WAV format: Skip header (usually 44 bytes for standard WAV)
428
+ // Check RIFF header
429
+ if (view.getUint32(0, true) !== 0x46464952) { // "RIFF"
430
+ throw new Error('Invalid WAV file')
431
+ }
432
+
433
+ // Find "data" chunk (offset may vary)
434
+ let dataOffset = 44 // Standard WAV header size
435
+ // For non-standard WAV files, you may need to search for "data" chunk
436
+ // This is a simplified example - production code should parse chunks properly
437
+
438
+ const pcmData = arrayBuffer.slice(dataOffset)
439
+ return pcmData
440
+ }
101
441
 
102
- // Check initialization status
103
- const isInitialized = AvatarKit.isInitialized
442
+ // MP3 file processing
443
+ async function processMP3File(mp3File: File, targetSampleRate: number): Promise<ArrayBuffer> {
444
+ const arrayBuffer = await mp3File.arrayBuffer()
445
+ const audioContext = new AudioContext({ sampleRate: targetSampleRate })
446
+
447
+ // Decode MP3 to AudioBuffer
448
+ const audioBuffer = await audioContext.decodeAudioData(arrayBuffer.slice(0))
449
+
450
+ // Convert AudioBuffer to PCM16 ArrayBuffer
451
+ const length = audioBuffer.length
452
+ const channels = audioBuffer.numberOfChannels
453
+ const pcm16Buffer = new ArrayBuffer(length * 2)
454
+ const pcm16View = new DataView(pcm16Buffer)
455
+
456
+ // Mix down to mono if stereo
457
+ const sourceData = channels === 1
458
+ ? audioBuffer.getChannelData(0)
459
+ : new Float32Array(length)
460
+
461
+ if (channels > 1) {
462
+ const leftChannel = audioBuffer.getChannelData(0)
463
+ const rightChannel = audioBuffer.getChannelData(1)
464
+ for (let i = 0; i < length; i++) {
465
+ sourceData[i] = (leftChannel[i] + rightChannel[i]) / 2 // Mix to mono
466
+ }
467
+ }
468
+
469
+ // Convert float32 (-1.0 to 1.0) to int16 (-32768 to 32767)
470
+ for (let i = 0; i < length; i++) {
471
+ const sample = Math.max(-1, Math.min(1, sourceData[i])) // Clamp
472
+ const int16Sample = sample < 0 ? sample * 0x8000 : sample * 0x7FFF
473
+ pcm16View.setInt16(i * 2, int16Sample, true) // little-endian
474
+ }
475
+
476
+ audioContext.close()
477
+ return pcm16Buffer
478
+ }
479
+
480
+ // Usage example:
481
+ // const wavPcmData = await processWAVFile(wavFile)
482
+ // avatarView.controller.send(wavPcmData, false)
483
+ //
484
+ // const mp3PcmData = await processMP3File(mp3File, 16000) // 16kHz
485
+ // avatarView.controller.send(mp3PcmData, false)
486
+ ```
487
+
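The WAV example above assumes a fixed 44-byte header. A chunk-walking parser, as the comment in that example suggests for production code, might look like the sketch below. `WavPcm`, `readTag`, and `parseWav` are hypothetical names introduced for illustration. The sketch only relies on the standard RIFF layout (4-byte chunk ID, 4-byte little-endian size, body padded to an even length) and does not validate that the audio is actually PCM16 mono.

```typescript
interface WavPcm {
  sampleRate: number
  channels: number
  bitsPerSample: number
  pcm: ArrayBuffer // raw bytes of the "data" chunk
}

// Reads a 4-character ASCII chunk tag at the given offset.
function readTag(view: DataView, offset: number): string {
  return String.fromCharCode(
    view.getUint8(offset), view.getUint8(offset + 1),
    view.getUint8(offset + 2), view.getUint8(offset + 3),
  )
}

// Walks the RIFF chunks and returns the format fields plus the "data" chunk bytes.
function parseWav(buffer: ArrayBuffer): WavPcm {
  const view = new DataView(buffer)
  if (buffer.byteLength < 12 || readTag(view, 0) !== 'RIFF' || readTag(view, 8) !== 'WAVE') {
    throw new Error('Not a RIFF/WAVE file')
  }
  let sampleRate = 0
  let channels = 0
  let bitsPerSample = 0
  let offset = 12 // first chunk starts after "RIFF" + size + "WAVE"
  while (offset + 8 <= buffer.byteLength) {
    const id = readTag(view, offset)
    const size = view.getUint32(offset + 4, true)
    const body = offset + 8
    if (id === 'fmt ') {
      if (size < 16 || body + 16 > buffer.byteLength) throw new Error('Truncated "fmt " chunk')
      channels = view.getUint16(body + 2, true)
      sampleRate = view.getUint32(body + 4, true)
      bitsPerSample = view.getUint16(body + 14, true)
    } else if (id === 'data') {
      const end = Math.min(body + size, buffer.byteLength)
      return { sampleRate, channels, bitsPerSample, pcm: buffer.slice(body, end) }
    }
    offset = body + size + (size % 2) // chunk bodies are padded to an even length
  }
  throw new Error('No "data" chunk found')
}
```

Before sending `pcm` to the SDK, the caller would still need to check that `bitsPerSample` is 16, that `channels` is 1, and that `sampleRate` matches the configured `audioFormat.sampleRate`.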
488
+ **Resampling:**
489
+ - If your audio source is at a different sample rate, you must resample it to match the configured sample rate before sending to the SDK
490
+ - For high-quality resampling, we recommend using Web Audio API's `OfflineAudioContext` with anti-aliasing filtering
491
+ - See example projects for resampling implementation
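To make the resampling step concrete, here is a deliberately simple linear-interpolation resampler. It is a stand-in for illustration only, not the `OfflineAudioContext` approach recommended above: it applies no anti-aliasing filter, so downsampling with it can introduce aliasing artifacts. `resamplePcm16Linear` is a hypothetical helper name, not an SDK function.

```typescript
// Resamples mono PCM16 samples from `fromRate` to `toRate` by linear interpolation.
// No low-pass filtering is applied, so prefer a filtered resampler for downsampling.
function resamplePcm16Linear(input: Int16Array, fromRate: number, toRate: number): Int16Array {
  if (fromRate <= 0 || toRate <= 0) {
    throw new Error('Sample rates must be positive')
  }
  if (fromRate === toRate || input.length === 0) {
    return input.slice() // nothing to do; return a copy
  }
  const outLength = Math.max(1, Math.round((input.length * toRate) / fromRate))
  const output = new Int16Array(outLength)
  const step = fromRate / toRate // input samples advanced per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step
    const i0 = Math.min(Math.floor(pos), input.length - 1)
    const i1 = Math.min(i0 + 1, input.length - 1) // clamp at the last sample
    const frac = pos - i0
    output[i] = Math.round(input[i0] * (1 - frac) + input[i1] * frac)
  }
  return output
}
```

A possible usage is `new Uint8Array(resamplePcm16Linear(samples, 44100, 16000).buffer)`, which yields bytes in the platform's native byte order (little-endian on virtually all current browsers).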
104
492
 
105
- // Clean up resources (must be called when no longer in use)
106
- AvatarKit.cleanup()
493
+ **Configuration Example:**
494
+ ```typescript
495
+ const configuration: Configuration = {
496
+ environment: Environment.cn,
497
+ audioFormat: {
498
+ channelCount: 1, // Fixed to 1 (mono)
499
+ sampleRate: 48000 // Choose from: 8000, 16000, 22050, 24000, 32000, 44100, 48000
500
+ }
501
+ }
502
+ ```
503
+
504
+ ## 📚 API Reference
505
+
506
+ ### AvatarSDK
507
+
508
+ The core management class of the SDK, responsible for initialization and global configuration.
509
+
510
+ ```typescript
511
+ // Initialize SDK
512
+ await AvatarSDK.initialize(appId: string, configuration: Configuration)
513
+
514
+ // Check initialization status
515
+ const isInitialized = AvatarSDK.isInitialized
516
+
517
+ // Get initialized app ID
518
+ const appId = AvatarSDK.appId
519
+
520
+ // Get configuration
521
+ const config = AvatarSDK.configuration
522
+
523
+ // Set Session Token (required for authentication)
524
+ // You must obtain a valid Session Token from your SDK provider
525
+ // See Authentication section for more details
526
+ AvatarSDK.setSessionToken('your-session-token')
527
+
528
+ // Set userId (optional, for telemetry)
529
+ AvatarSDK.setUserId('user-id')
530
+
531
+ // Get sessionToken
532
+ const sessionToken = AvatarSDK.sessionToken
533
+
534
+ // Get userId
535
+ const userId = AvatarSDK.userId
536
+
537
+ // Get SDK version
538
+ const version = AvatarSDK.version
539
+
540
+ // Cleanup resources (must be called when no longer in use)
541
+ AvatarSDK.cleanup()
107
542
  ```
108
543
 
109
544
  ### AvatarManager
110
545
 
111
- Avatar resource manager, responsible for downloading, caching, and loading avatar data.
546
+ Avatar resource manager, responsible for downloading, caching, and loading avatar data. Use the singleton instance via `AvatarManager.shared`.
112
547
 
113
548
  ```typescript
114
- const manager = new AvatarManager()
549
+ // Get singleton instance
550
+ const manager = AvatarManager.shared
115
551
 
116
- // Load avatar
552
+ // Load avatar
117
553
  const avatar = await manager.load(
118
- characterId: string,
554
+ id: string,
119
555
  onProgress?: (progress: LoadProgressInfo) => void
120
556
  )
121
557
 
122
- // Clear cache
123
- manager.clearCache()
558
+ // Clear cache
559
+ manager.clearAll()
124
560
  ```
125
561
 
126
562
  ### AvatarView
127
563
 
128
- 3D rendering view; internally creates and manages AvatarController automatically
564
+ 3D rendering view, responsible only for rendering. It creates and manages an `AvatarController` internally.
565
+
566
+ ```typescript
567
+ constructor(avatar: Avatar, container: HTMLElement)
568
+ ```
569
+
570
+ **Parameters:**
571
+ - `avatar`: Avatar instance
572
+ - `container`: Canvas container element (required)
573
+ - Canvas automatically uses the full size of the container (width and height)
574
+ - Canvas aspect ratio adapts to container size - set container size to control aspect ratio
575
+ - Canvas will be automatically added to the container
576
+ - SDK automatically handles resize events via ResizeObserver
577
+
578
+ **Playback Mode:**
579
+ - The playback mode is determined by `drivingServiceMode` in `AvatarSDK.initialize()` configuration
580
+ - The playback mode is fixed when creating `AvatarView` and persists throughout its lifecycle
581
+ - Cannot be changed after creation
129
582
 
130
583
  ```typescript
131
- // Create view (Canvas is automatically added to the container)
132
- const avatarView = new AvatarView(avatar: Avatar, container?: HTMLElement)
584
+ // Create view (Canvas is automatically added to container)
585
+ const container = document.getElementById('avatar-container')
586
+ const avatarView = new AvatarView(avatar, container)
133
587
 
134
- // Get the Canvas element
135
- const canvas = avatarView.getCanvas()
588
+ // Wait for first frame to render
589
+ avatarView.onFirstRendering = () => {
590
+ // First frame rendered
591
+ }
136
592
 
137
- // Set background
138
- avatarView.setBackgroundImage('path/to/image.jpg')
139
- avatarView.setBackgroundOpaque(true)
593
+ // Get or set avatar transform (position and scale)
594
+ // Get current transform
595
+ const currentTransform = avatarView.avatarTransform // { x: number, y: number, scale: number }
140
596
 
141
- // Update camera configuration
142
- avatarView.updateCameraConfig(cameraConfig: CameraConfig)
597
+ // Set transform
598
+ avatarView.avatarTransform = { x, y, scale }
599
+ // - x: Horizontal offset in normalized coordinates (-1 to 1, where -1 = left edge, 0 = center, 1 = right edge)
600
+ // - y: Vertical offset in normalized coordinates (-1 to 1, where -1 = bottom edge, 0 = center, 1 = top edge)
601
+ // - scale: Scale factor (1.0 = original size, 2.0 = double size, 0.5 = half size)
143
602
 
144
- // Clean up resources
603
+ // Cleanup resources (must be called before switching avatars)
145
604
  avatarView.dispose()
146
605
  ```
147
606
 
607
+ **Switching Avatars:**
608
+
609
+ To switch avatars, dispose the old view and create a new one. Do NOT attempt to reuse or reset an existing AvatarView.
610
+ - You do not need to call `AvatarSDK.initialize()` or set the session token again.
611
+ - The old AvatarView's internal state is fully cleaned up by `dispose()`.
612
+
613
+ ```typescript
614
+ // 1. Dispose old avatar
615
+ if (currentAvatarView) {
616
+ currentAvatarView.dispose()
617
+ }
618
+
619
+ // 2. Load new avatar (SDK is already initialized, token is still valid)
620
+ const newAvatar = await AvatarManager.shared.load('new-character-id')
621
+
622
+ // 3. Create new AvatarView
623
+ currentAvatarView = new AvatarView(newAvatar, container)
624
+
625
+ // 4. Start connection if SDK mode
626
+ await currentAvatarView.controller.start()
627
+ ```
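The dispose-before-create rule can be enforced with a small holder object. The sketch below is not part of the SDK: `DisposableView` and `ViewSlot` are hypothetical names, and the only thing assumed about a view is that it exposes `dispose()`. Because `swap` is synchronous, the new avatar would be loaded before calling it, which differs slightly from the step order in the sample above but still disposes the old view before the new one is constructed.

```typescript
// Minimal shape assumed of a view: anything with a dispose() method.
interface DisposableView {
  dispose(): void
}

// Holds at most one live view and guarantees the old one is disposed first.
class ViewSlot<T extends DisposableView> {
  private current: T | null = null

  get value(): T | null {
    return this.current
  }

  // Dispose the old view (if any), then let the factory create the new one.
  swap(create: () => T): T {
    this.clear()
    const next = create()
    this.current = next
    return next
  }

  // Dispose and forget the current view, e.g. when leaving the page.
  clear(): void {
    if (this.current !== null) {
      this.current.dispose()
      this.current = null
    }
  }
}
```

With the SDK this might be used as `slot.swap(() => new AvatarView(newAvatar, container))` once `AvatarManager.shared.load(...)` has resolved.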
628
+
148
629
  ### AvatarController
149
630
 
150
- Real-time communication controller; handles the WebSocket connection and animation data.
631
+ Audio/animation playback controller that manages synchronized playback of audio and animation. In SDK mode it handles network communication automatically.
632
+
633
+ **Two Usage Patterns:**
634
+
635
+ #### SDK Mode Methods
151
636
 
152
637
  ```typescript
153
- // Start connection
154
- await avatarView.avatarController.start()
638
+ // ⚠️ CRITICAL: Initialize audio context first (MUST be called in user gesture context)
639
+ // This method MUST be called within a user gesture event handler (click, touchstart, etc.)
640
+ // to satisfy browser security policies. Calling it outside a user gesture will fail.
641
+ // All audio operations (start, send, etc.) require prior initialization.
642
+ button.addEventListener('click', async () => {
643
+ // Initialize audio context - MUST be in user gesture context
644
+ await avatarView.controller.initializeAudioContext()
645
+
646
+ // Start service
647
+ await avatarView.controller.start()
648
+
649
+ // Send audio data (must be mono PCM16 format matching configured sample rate)
650
+ const conversationId = avatarView.controller.send(audioData, false) // audioData: ArrayBuffer, end: boolean
651
+ // Returns: conversationId - Conversation ID for this conversation session
652
+ // end: false (default) - Continue sending audio data for current conversation
653
+ // end: true - Mark the end of audio input for current conversation round. The avatar will continue playing remaining animation until finished, then automatically return to idle (notified via onConversationState). After end=true, sending new audio data will interrupt any ongoing playback from the previous conversation round
654
+ })
155
655
 
156
- // Send audio data
157
- avatarView.avatarController.send(audioData: ArrayBuffer, end: boolean)
158
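- // audioData: audio data (ArrayBuffer format)
159
- // end: false (default) - send audio data normally; the server accumulates audio and, once enough has arrived, automatically returns animation data and starts synchronized playback of animation and audio
160
- // end: true - return animation data immediately without further accumulation; used to end the current conversation or when an immediate response is needed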
656
+ // Close service
657
+ avatarView.controller.close()
658
+ ```
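The `end` flag lends itself to a small chunking helper. The sketch below is an assumption-laden illustration rather than SDK code: `AudioChunk` and `splitPcm16` are hypothetical names, and the input is assumed to be mono PCM16 (2 bytes per sample) at the configured sample rate. It cuts a buffer into fixed-duration pieces on sample boundaries and marks only the final piece with `end: true`.

```typescript
// One piece of audio plus the flag that would be passed as the `end` argument.
interface AudioChunk {
  data: ArrayBuffer
  end: boolean
}

// Split mono PCM16 audio into chunks of roughly `chunkMs` milliseconds each.
function splitPcm16(buffer: ArrayBuffer, sampleRate: number, chunkMs: number): AudioChunk[] {
  const bytesPerSample = 2 // PCM16, one channel
  const samplesPerChunk = Math.max(1, Math.floor((sampleRate * chunkMs) / 1000))
  const chunkBytes = samplesPerChunk * bytesPerSample
  const chunks: AudioChunk[] = []
  for (let offset = 0; offset < buffer.byteLength; offset += chunkBytes) {
    const stop = Math.min(offset + chunkBytes, buffer.byteLength)
    // Only the chunk that reaches the end of the buffer closes the conversation round.
    chunks.push({ data: buffer.slice(offset, stop), end: stop === buffer.byteLength })
  }
  return chunks
}
```

Each resulting chunk could then be forwarded inside the click handler with `avatarView.controller.send(chunk.data, chunk.end)`.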
659
+
660
+ #### Host Mode Methods
661
+
662
+ ```typescript
663
+ // ⚠️ CRITICAL: Initialize audio context first (MUST be called in user gesture context)
664
+ // This method MUST be called within a user gesture event handler (click, touchstart, etc.)
665
+ // to satisfy browser security policies. Calling it outside a user gesture will fail.
666
+ // All audio operations (yieldAudioData, yieldFramesData, etc.) require prior initialization.
667
+ button.addEventListener('click', async () => {
668
+ // Initialize audio context - MUST be in user gesture context
669
+ await avatarView.controller.initializeAudioContext()
670
+
671
+ // Stream audio chunks (must be mono PCM16 format matching configured sample rate)
672
+ const conversationId = avatarView.controller.yieldAudioData(
673
+ audioChunk, // data: Uint8Array - audio chunk data (PCM16 format)
674
+ false // isLast: boolean (default false) - whether this is the last chunk
675
+ )
676
+ // Returns: conversationId - Conversation ID for this audio session
677
+
678
+ // Stream animation keyframes (requires conversationId from audio data)
679
+ avatarView.controller.yieldFramesData(
680
+ keyframesDataArray, // (Uint8Array | ArrayBuffer)[] - animation keyframes binary data
681
+ conversationId // string - conversation ID returned by yieldAudioData() (required)
682
+ )
683
+ })
684
+ ```
161
685
 
162
- // Interrupt conversation
163
- avatarView.avatarController.interrupt()
686
+ **⚠️ Important: Conversation ID (conversationId) Management**
164
687
 
165
- // Close connection
166
- avatarView.avatarController.close()
688
+ **SDK Mode:**
689
+ - `send()` returns a conversationId to distinguish each conversation round
690
+ - `end=true` marks the end of a conversation round
167
691
 
168
- // Set event callbacks
169
- avatarView.avatarController.onConnectionState = (state: ConnectionState) => {}
170
- avatarView.avatarController.onAvatarState = (state: AvatarState) => {}
171
- avatarView.avatarController.onError = (error: Error) => {}
692
+ **Host Mode:**
693
+ - `yieldAudioData()` returns a conversationId (generated automatically when a new session starts)
694
+ - `yieldFramesData()` requires a valid conversationId parameter
695
+ - Animation data with mismatched conversationId will be **discarded**
696
+ - Use `getCurrentConversationId()` to retrieve the current active conversationId
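The discard rule for mismatched conversation IDs can be pictured with a toy model. The class below is only an illustration of the rule as described in the Host Mode list, not the SDK's implementation; `FrameGate` and its methods are hypothetical names.

```typescript
// Toy model: frames are accepted only while their ID matches the active conversation.
class FrameGate {
  private currentId: string | null = null
  private accepted: Uint8Array[] = []

  // Called when a new audio session starts and a fresh conversationId is issued.
  beginConversation(conversationId: string): void {
    this.currentId = conversationId
    this.accepted = []
  }

  // Returns true if the frames were kept, false if they were discarded.
  offer(frames: Uint8Array[], conversationId: string): boolean {
    if (this.currentId === null || conversationId !== this.currentId) {
      return false
    }
    this.accepted.push(...frames)
    return true
  }

  get frameCount(): number {
    return this.accepted.length
  }
}
```

In host-mode code this suggests keeping the ID returned by `yieldAudioData()` next to the frames it belongs to, so that late frames from an interrupted round are never sent under a newer ID.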
172
697
 
173
- // Note: sendText() is not supported; calling it throws an error
698
+ #### Common Methods (Both Modes)
699
+
700
+ ```typescript
701
+
702
+ // Pause playback (from playing state)
703
+ avatarView.controller.pause()
704
+
705
+ // Resume playback (from paused state)
706
+ await avatarView.controller.resume()
707
+
708
+ // Interrupt current playback (stops and clears data)
709
+ avatarView.controller.interrupt()
710
+
711
+ // Clear all data and resources
712
+ avatarView.controller.clear()
713
+
714
+ // Get current conversation ID (for Host mode)
715
+ const conversationId = avatarView.controller.getCurrentConversationId()
716
+ // Returns: Current conversationId for the active audio session, or null if no active session
717
+
718
+ // Volume control (affects only avatar audio player, not system volume)
719
+ avatarView.controller.setVolume(0.5) // Set volume to 50% (0.0 to 1.0)
720
+ const currentVolume = avatarView.controller.getVolume() // Get current volume (0.0 to 1.0)
721
+
722
+ // Set event callbacks
723
+ avatarView.controller.onConnectionState = (state: ConnectionState) => {} // SDK mode only
724
+ avatarView.controller.onConversationState = (state: ConversationState) => {}
725
+ avatarView.controller.onError = (error: AvatarError) => {} // Includes error.code for specific error type
174
726
  ```
175
727
 
176
- ## 🔧 Configuration
728
+ #### Avatar Transform Methods
729
+
730
+ ```typescript
731
+ // Get or set avatar transform (position and scale in canvas)
732
+ // Get current transform
733
+ const currentTransform = avatarView.avatarTransform // { x: number, y: number, scale: number }
734
+
735
+ // Set transform
736
+ avatarView.avatarTransform = { x, y, scale }
737
+ // - x: Horizontal offset in normalized coordinates (-1 to 1, where -1 = left edge, 0 = center, 1 = right edge)
738
+ // - y: Vertical offset in normalized coordinates (-1 to 1, where -1 = bottom edge, 0 = center, 1 = top edge)
739
+ // - scale: Scale factor (1.0 = original size, 2.0 = double size, 0.5 = half size)
740
+ // Example:
741
+ avatarView.avatarTransform = { x: 0, y: 0, scale: 1.0 } // Center, original size
742
+ avatarView.avatarTransform = { x: 0.5, y: 0, scale: 2.0 } // Right half, double size
743
+ ```
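The normalized coordinate system can be made concrete with two small helpers. These are hypothetical utilities, not SDK functions: `Transform2D`, `pixelToNormalized`, and `checkedTransform` are names introduced here, and whether the SDK itself validates or clamps these values is not stated in this README.

```typescript
// Local stand-in for the { x, y, scale } object described above.
interface Transform2D {
  x: number
  y: number
  scale: number
}

// Map a pixel position inside the container (origin top-left, y downwards)
// to the normalized range where x = -1 is the left edge and y = 1 is the top edge.
function pixelToNormalized(px: number, py: number, width: number, height: number): { x: number; y: number } {
  if (width <= 0 || height <= 0) {
    throw new RangeError('Container size must be positive')
  }
  return { x: (px / width) * 2 - 1, y: 1 - (py / height) * 2 }
}

// Clamp offsets to [-1, 1] and reject non-finite values or non-positive scales.
function checkedTransform(t: Transform2D): Transform2D {
  if (!Number.isFinite(t.x) || !Number.isFinite(t.y) || !Number.isFinite(t.scale)) {
    throw new RangeError('Transform values must be finite numbers')
  }
  if (t.scale <= 0) {
    throw new RangeError('Scale must be greater than zero')
  }
  const clamp = (v: number): number => Math.max(-1, Math.min(1, v))
  return { x: clamp(t.x), y: clamp(t.y), scale: t.scale }
}
```

A caller might combine them as `avatarView.avatarTransform = checkedTransform({ ...pixelToNormalized(px, py, w, h), scale: 1.0 })`.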
744
+
745
+ **Important Notes:**
746
+ - `start()` and `close()` are only available in SDK mode
747
+ - `yieldAudioData()` and `yieldFramesData()` are only available in Host mode
748
+ - `pause()`, `resume()`, `interrupt()`, `clear()`, `getCurrentConversationId()`, `setVolume()`, and `getVolume()` are available in both modes
749
+ - The playback mode is determined when creating `AvatarView` and cannot be changed
750
+
751
+ ## 🔧 Configuration
177
752
 
178
753
  ### Configuration
179
754
 
180
755
  ```typescript
181
756
  interface Configuration {
182
757
  environment: Environment
758
+ drivingServiceMode?: DrivingServiceMode // Optional, default is 'sdk' (SDK mode)
759
+ logLevel?: LogLevel // Optional, default is 'off' (no logs)
760
+ audioFormat?: AudioFormat // Optional, default is { channelCount: 1, sampleRate: 16000 }
761
+ characterApiBaseUrl?: string // Optional, internal debug config, can be ignored
762
+ }
763
+
764
+ interface AudioFormat {
765
+ readonly channelCount: 1 // Fixed to 1 (mono)
766
+ readonly sampleRate: number // Supported: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz, default: 16000
183
767
  }
184
768
  ```
185
769
 
186
- **Description:**
187
- - `environment`: Specifies the environment (cn/us/test); the SDK automatically uses the corresponding API and WebSocket addresses
188
- - `sessionToken`: Set separately via `AvatarKit.setSessionToken()`, not in Configuration
770
+ ### LogLevel
771
+
772
+ Control the verbosity of SDK logs:
773
+
774
+ ```typescript
775
+ enum LogLevel {
776
+ off = 'off', // Disable all logs
777
+ error = 'error', // Only error logs
778
+ warning = 'warning', // Warning and error logs
779
+ all = 'all' // All logs (info, warning, error)
780
+ }
781
+ ```
189
782
 
783
+ **Note:** `LogLevel.off` completely disables all logging, including error logs. Use with caution in production environments.
784
+
785
+ **Configuration Fields:**
786
+ - `environment`: Specifies the environment (cn/intl). The SDK automatically uses the corresponding server addresses for that environment
787
+ - `drivingServiceMode`: Specifies the driving service mode
788
+ - `DrivingServiceMode.sdk` (default): SDK mode - SDK handles network communication automatically
789
+ - `DrivingServiceMode.host`: Host mode - Host application provides audio and animation data
790
+ - `logLevel`: Controls the verbosity of SDK logs
791
+ - `LogLevel.off` (default): Disable all logs
792
+ - `LogLevel.error`: Only error logs
793
+ - `LogLevel.warning`: Warning and error logs
794
+ - `LogLevel.all`: All logs (info, warning, error)
795
+ - `audioFormat`: Configures audio sample rate and channel count
796
+ - `channelCount`: Fixed to 1 (mono channel)
797
+ - `sampleRate`: Audio sample rate in Hz (default: 16000)
798
+ - Supported values: 8000, 16000, 22050, 24000, 32000, 44100, 48000
799
+ - The configured sample rate will be used for both audio recording and playback
800
+ - `characterApiBaseUrl`: Internal debug config, can be ignored
801
+ - `sessionToken`: **Required for authentication**. Set separately via `AvatarSDK.setSessionToken()`, not in Configuration. See [Authentication](#-authentication) section for details
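Since `sampleRate` accepts only a fixed set of values, it can help to validate it before building the configuration. The sketch below is a hypothetical helper, not part of the SDK; the list of rates and the 16000 Hz default are taken from the description above, while `makeAudioFormat` and `isSupportedSampleRate` are names introduced here.

```typescript
// Sample rates listed as supported in the configuration description.
const SUPPORTED_SAMPLE_RATES = [8000, 16000, 22050, 24000, 32000, 44100, 48000] as const
type SupportedSampleRate = (typeof SUPPORTED_SAMPLE_RATES)[number]

// Type guard that narrows an arbitrary number to one of the supported rates.
function isSupportedSampleRate(rate: number): rate is SupportedSampleRate {
  return (SUPPORTED_SAMPLE_RATES as readonly number[]).indexOf(rate) !== -1
}

// Build an audio format object, failing early on an unsupported rate.
function makeAudioFormat(sampleRate: number = 16000): { channelCount: 1; sampleRate: SupportedSampleRate } {
  if (!isSupportedSampleRate(sampleRate)) {
    throw new RangeError(`Unsupported sample rate: ${sampleRate}`)
  }
  return { channelCount: 1, sampleRate }
}
```

The result could be passed as the `audioFormat` field of the configuration object, so that an unsupported rate fails in application code rather than later during playback.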
802
+
803
+ ```typescript
190
804
  enum Environment {
191
- cn = 'cn', // China region
192
- us = 'us', // US region
193
- test = 'test' // Test environment
805
+ cn = 'cn', // China region
806
+ intl = 'intl', // International region
194
807
  }
195
808
  ```
196
809
 
@@ -198,17 +811,17 @@ enum Environment {
198
811
 
199
812
  ```typescript
200
813
  interface CameraConfig {
201
- position: [number, number, number] // Camera position
202
- target: [number, number, number] // Camera target
203
- fov: number // Field of view angle
204
- near: number // Near clipping plane
205
- far: number // Far clipping plane
206
- up?: [number, number, number] // Up direction
207
- aspect?: number // Aspect ratio
814
+ position: [number, number, number] // Camera position
815
+ target: [number, number, number] // Camera target
816
+ fov: number // Field of view angle
817
+ near: number // Near clipping plane
818
+ far: number // Far clipping plane
819
+ up?: [number, number, number] // Up direction
820
+ aspect?: number // Aspect ratio
208
821
  }
209
822
  ```
210
823
 
211
- ## 📊 State Management
824
+ ## 📊 State Management
212
825
 
213
826
  ### ConnectionState
214
827
 
@@ -221,89 +834,42 @@ enum ConnectionState {
221
834
  }
222
835
  ```
223
836
 
224
- ### AvatarState
837
+ ### ConversationState
225
838
 
226
839
  ```typescript
227
- enum AvatarState {
228
- idle = 'idle', // Idle state, shows breathing animation
229
- active = 'active', // Active, waiting for playable content
230
- playing = 'playing' // Playing
840
+ enum ConversationState {
841
+ idle = 'idle', // Idle state (breathing animation)
842
+ playing = 'playing', // Playing state (active conversation)
843
+ pausing = 'pausing' // Pausing state (paused during playback)
231
844
  }
232
845
  ```
233
846
 
234
- ## 🎨 Rendering System
847
+ **State Description:**
848
+ - `idle`: Avatar is in idle state (breathing animation), waiting for conversation to start
849
+ - `playing`: Avatar is playing conversation content (including during transition animations)
850
+ - `pausing`: Avatar playback is paused (e.g., when `end=false` and waiting for more audio data)
235
851
 
236
- The SDK supports two rendering backends:
852
+ **Note:** During transition animations, the target state is notified immediately:
853
+ - When transitioning from `idle` to `playing`, the `playing` state is notified immediately
854
+ - When transitioning from `playing` to `idle`, the `idle` state is notified immediately
237
855
 
238
- - **WebGPU** - High-performance rendering for modern browsers
239
- - **WebGL** - Traditional rendering with better compatibility
856
+ ## 🎨 Rendering System
240
857
 
241
- The rendering system automatically selects the best backend; no manual configuration is needed.
858
+ The SDK automatically selects the best rendering backend for your browser; no manual configuration is needed.
242
859
 
243
- ## 🔍 Debugging and Monitoring
860
+ ## 🚨 Error Handling
244
861
 
245
- ### Logging System
862
+ ### AvatarError
246
863
 
247
- The SDK has a complete built-in logging system that supports different log levels:
864
+ The SDK uses custom error types, providing more detailed error information:
248
865
 
249
866
  ```typescript
250
- import { logger } from '@spatialwalk/avatarkit'
251
-
252
- // Set log level
253
- logger.setLevel('verbose') // 'basic' | 'verbose'
254
-
255
- // Manual log output
256
- logger.log('Info message')
257
- logger.warn('Warning message')
258
- logger.error('Error message')
259
- ```
260
-
261
- ### Performance Monitoring
262
-
263
- The SDK provides a performance monitoring interface for tracking rendering performance:
264
-
265
- ```typescript
266
- // Get rendering performance statistics
267
- const stats = avatarView.getPerformanceStats()
268
-
269
- if (stats) {
270
- console.log(`Render time: ${stats.renderTime.toFixed(2)}ms`)
271
- console.log(`Sort time: ${stats.sortTime.toFixed(2)}ms`)
272
- console.log(`Render backend: ${stats.backend}`)
273
-
274
- // Compute frame rate
275
- const fps = 1000 / stats.renderTime
276
- console.log(`Frame rate: ${fps.toFixed(2)} FPS`)
277
- }
278
-
279
- // Monitor performance periodically
280
- setInterval(() => {
281
- const stats = avatarView.getPerformanceStats()
282
- if (stats) {
283
- // Send to a monitoring service or display in the UI
284
- console.log('Performance:', stats)
285
- }
286
- }, 1000)
287
- ```
288
-
289
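- **Performance statistics:**
290
- - `renderTime`: Total render time (ms), including sorting and GPU rendering
291
- - `sortTime`: Sort time (ms); the point cloud is depth-sorted with Radix Sort
292
- - `backend`: Rendering backend currently in use (`'webgpu'` | `'webgl'` | `null`)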
293
-
294
- ## 🚨 Error Handling
295
-
296
- ### SPAvatarError
297
-
298
- The SDK uses custom error types, providing more detailed error information:
299
-
300
- ```typescript
301
- import { SPAvatarError } from '@spatialwalk/avatarkit'
867
+ import { AvatarError } from '@spatialwalk/avatarkit'
302
868
 
303
869
  try {
304
- await avatarView.avatarController.start()
870
+ await avatarView.controller.start()
305
871
  } catch (error) {
306
- if (error instanceof SPAvatarError) {
872
+ if (error instanceof AvatarError) {
307
873
  console.error('SDK Error:', error.message, error.code)
308
874
  } else {
309
875
  console.error('Unknown error:', error)
@@ -311,65 +877,112 @@ try {
311
877
  }
312
878
  ```
313
879
 
314
- ### Error Callbacks
880
+ ### Error Callbacks
315
881
 
316
882
  ```typescript
317
- avatarView.avatarController.onError = (error: Error) => {
318
- console.error('AvatarController error:', error)
319
- // Handle the error, e.g. reconnect or notify the user
883
+ import { AvatarError } from '@spatialwalk/avatarkit'
884
+
885
+ avatarView.controller.onError = (error: AvatarError) => {
886
+ console.error('Error:', error.code, error.message)
320
887
  }
321
888
  ```
322
889
 
323
- ## 🔄 Resource Management
324
-
325
- ### Lifecycle Management
890
+ `error.code` values (from `ErrorCode` enum):
891
+
892
+ | Code | Description | Trigger |
893
+ |------|-------------|---------|
894
+ | **Authentication & Authorization** | | |
895
+ | `appIDUnrecognized` | App ID not recognized | Reserved |
896
+ | `sessionTokenInvalid` | Token invalid or appId mismatch | WebSocket close code 4010 |
897
+ | `sessionTokenExpired` | Token expired | WebSocket close code 4010 |
898
+ | `insufficientBalance` | Insufficient balance | WebSocket close code 4001 |
899
+ | `concurrentLimitExceeded` | Concurrent connection limit exceeded | WebSocket close code 4003 |
900
+ | **Resource Loading** | | |
901
+ | `avatarIDUnrecognized` | Avatar ID not found | Server error |
902
+ | `failedToFetchAvatarMetadata` | Metadata fetch failed | Network/server error |
903
+ | `failedToDownloadAvatarAssets` | Asset download failed | Network/server error |
904
+ | **Connection** | | |
905
+ | `websocketError` | WebSocket handshake or network error | Connection failure |
906
+ | `websocketClosedAbnormally` | Connection closed abnormally | Close code 1006 |
907
+ | `websocketClosedUnexpected` | Unexpected close code | Unknown close code |
908
+ | `sessionTimeout` | Session timeout | WebSocket close code 4002 |
909
+ | `connectionInProgress` | Connection already in progress | Duplicate `start()` call |
910
+ | **Playback** | | |
911
+ | `networkLayerNotAvailable` | Network layer not available | `send()` in host mode |
912
+ | `playbackStartFailed` | Failed to start playback | Internal error |
913
+ | `playbackInitFailed` | Playback initialization failed | Internal error |
914
+ | `audioOnlyInitFailed` | Audio-only playback init failed | Fallback mode error |
915
+ | `noAudio` | No audio data to play | Empty audio input |
916
+ | `audioContextNotInitialized` | Audio context not initialized | `send()` before `initializeAudioContext()` |
917
+ | `animationPlayerNotInitialized` | Animation player not initialized | Internal error |
918
+ | **Server** | | |
919
+ | `serverError` | Server-side error | Server MESSAGE_SERVER_ERROR |
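One way to act on these codes is to group them by the recovery step they call for. The mapping below is a hypothetical sketch, not SDK behaviour: it assumes `error.code` compares equal to the string names shown in the table (the actual `ErrorCode` enum values are not shown here), and the `RecoveryAction` categories are invented for illustration.

```typescript
// Illustrative recovery categories; these names are not defined by the SDK.
type RecoveryAction = 'refresh-token' | 'retry-later' | 'init-audio' | 'report'

// Map a code from the table above to a suggested recovery step.
function recoveryFor(code: string): RecoveryAction {
  switch (code) {
    case 'sessionTokenInvalid':
    case 'sessionTokenExpired':
      // Obtain a new session token, then reconnect.
      return 'refresh-token'
    case 'websocketError':
    case 'websocketClosedAbnormally':
    case 'websocketClosedUnexpected':
    case 'sessionTimeout':
      // Transient connection problems: reconnect after a delay.
      return 'retry-later'
    case 'audioContextNotInitialized':
      // Audio context must be initialized inside a user gesture first.
      return 'init-audio'
    default:
      // Everything else is surfaced to the user or to logging.
      return 'report'
  }
}
```

Inside an `onError` callback this could be used as `const action = recoveryFor(String(error.code))`, with the application deciding what each action means.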
920
+
921
+ ## 🔄 Resource Management
922
+
923
+ ### Lifecycle Management
924
+
925
+ #### SDK Mode Lifecycle
326
926
 
327
927
  ```typescript
328
- // Initialize
928
+ // Initialize
929
+ const container = document.getElementById('avatar-container')
329
930
  const avatarView = new AvatarView(avatar, container)
330
- await avatarView.avatarController.start()
931
+ await avatarView.controller.start()
331
932
 
332
- // Use
333
- avatarView.avatarController.send(audioData, false)
933
+ // Use
934
+ avatarView.controller.send(audioData, false)
334
935
 
335
- // Cleanup
336
- avatarView.dispose() // Automatically cleans up all resources
936
+ // Cleanup - dispose() automatically cleans up all resources including connections
937
+ avatarView.dispose()
337
938
  ```
338
939
 
339
- ### Memory Optimization
940
+ #### Host Mode Lifecycle
340
941
 
341
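- - SDK automatically manages WASM memory allocation
342
- - Supports dynamic loading/unloading of avatar and animation resources
343
- - Provides a memory usage monitoring interface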
942
+ ```typescript
943
+ // Initialize
944
+ const container = document.getElementById('avatar-container')
945
+ const avatarView = new AvatarView(avatar, container)
946
+
947
+ // Use
948
+ const conversationId = avatarView.controller.yieldAudioData(audioChunk, false)
949
+ avatarView.controller.yieldFramesData(keyframesDataArray, conversationId)
950
+
951
+ // Cleanup - dispose() automatically cleans up all resources including playback data
952
+ avatarView.dispose()
953
+ ```
344
954
 
345
- ### Sending Audio Data
955
+ **⚠️ Important Notes:**
956
+ - `dispose()` automatically cleans up all resources, including:
957
+ - Network connections (SDK mode)
958
+ - Playback data and animation resources (both modes)
959
+ - Render system and canvas elements
960
+ - All event listeners and callbacks
961
+ - Failing to call `dispose()` may cause resource leaks and rendering errors
962
+ - If you need to manually close connections or clear playback data before disposing, you can call `avatarView.controller.close()` (SDK mode) or `avatarView.controller.clear()` (both modes) first, but it's not required as `dispose()` handles this automatically
346
963
 
347
- The `send()` method accepts audio data in `ArrayBuffer` format:
964
+ ### Memory Optimization
348
965
 
349
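- **Usage:**
350
- - `audioData`: Audio data (ArrayBuffer format)
351
- - `end=false` (default) - Send audio data normally; the server accumulates audio and, once enough has arrived, automatically returns animation data and starts synchronized playback of animation and audio
352
- - `end=true` - Return animation data immediately without further accumulation; used to end the current conversation or when an immediate response is needed
353
- - **Important**: Playback does not wait for `end=true`; it starts automatically once enough audio data has accumulated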
966
+ - SDK automatically manages memory allocation
967
+ - Supports dynamic loading/unloading of avatar and animation resources
354
968
 
355
- ## 🌐 Browser Compatibility
969
+ ## 🌐 Browser Compatibility
356
970
 
357
- - **Chrome/Edge** 90+ (WebGPU recommended)
971
+ - **Chrome/Edge** 90+ (WebGPU recommended)
358
972
  - **Firefox** 90+ (WebGL)
359
973
  - **Safari** 14+ (WebGL)
360
- - **Mobile** iOS 14+, Android 8+
974
+ - **Mobile** iOS 14+, Android 8+
361
975
 
362
- ## 📝 License
976
+ ## 📝 License
363
977
 
364
978
  MIT License
365
979
 
366
- ## 🤝 Contributing
980
+ ## 🤝 Contributing
367
981
 
368
- Issues and Pull Requests are welcome!
982
+ Issues and Pull Requests are welcome!
369
983
 
370
- ## 📞 Support
984
+ ## 📞 Support
371
985
 
372
- For questions, please contact:
373
- - Email: support@spavatar.com
374
- - Documentation: https://docs.spavatar.com
375
- - GitHub: https://github.com/spavatar/sdk
986
+ For questions, please contact:
987
+ - Email: code@spatialwalk.net
988
+ - Documentation: https://docs.spatialreal.ai