@spatialwalk/avatarkit 1.0.0-beta.7 → 1.0.0-beta.71
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +602 -10
- package/README.md +475 -312
- package/dist/StreamingAudioPlayer-D8Q8WiEg.js +638 -0
- package/dist/animation/AnimationWebSocketClient.d.ts +6 -50
- package/dist/animation/utils/eventEmitter.d.ts +1 -9
- package/dist/animation/utils/flameConverter.d.ts +3 -24
- package/dist/audio/AnimationPlayer.d.ts +6 -57
- package/dist/audio/StreamingAudioPlayer.d.ts +2 -118
- package/dist/avatar_core_wasm-Dv943JJl.js +2696 -0
- package/dist/{avatar_core_wasm.wasm → avatar_core_wasm-e68766db.wasm} +0 -0
- package/dist/config/app-config.d.ts +3 -4
- package/dist/config/constants.d.ts +10 -18
- package/dist/config/sdk-config-loader.d.ts +4 -10
- package/dist/core/Avatar.d.ts +2 -14
- package/dist/core/AvatarController.d.ts +95 -85
- package/dist/core/AvatarDownloader.d.ts +7 -92
- package/dist/core/AvatarManager.d.ts +22 -12
- package/dist/core/AvatarSDK.d.ts +35 -0
- package/dist/core/AvatarView.d.ts +55 -140
- package/dist/core/NetworkLayer.d.ts +7 -59
- package/dist/generated/common/v1/models.d.ts +36 -0
- package/dist/generated/driveningress/v1/driveningress.d.ts +0 -1
- package/dist/generated/driveningress/v2/driveningress.d.ts +82 -1
- package/dist/generated/google/protobuf/struct.d.ts +0 -1
- package/dist/generated/google/protobuf/timestamp.d.ts +0 -1
- package/dist/index-U8QcNdma.js +16477 -0
- package/dist/index.d.ts +2 -4
- package/dist/index.js +17 -18
- package/dist/renderer/RenderSystem.d.ts +9 -79
- package/dist/renderer/covariance.d.ts +3 -11
- package/dist/renderer/renderer.d.ts +6 -2
- package/dist/renderer/sortSplats.d.ts +3 -10
- package/dist/renderer/webgl/reorderData.d.ts +4 -11
- package/dist/renderer/webgl/webglRenderer.d.ts +34 -4
- package/dist/renderer/webgpu/webgpuRenderer.d.ts +30 -5
- package/dist/types/character-settings.d.ts +1 -1
- package/dist/types/character.d.ts +3 -15
- package/dist/types/index.d.ts +123 -43
- package/dist/utils/animation-interpolation.d.ts +4 -15
- package/dist/utils/client-id.d.ts +6 -0
- package/dist/utils/conversationId.d.ts +10 -0
- package/dist/utils/error-utils.d.ts +0 -1
- package/dist/utils/id-manager.d.ts +34 -0
- package/dist/utils/logger.d.ts +2 -11
- package/dist/utils/posthog-tracker.d.ts +8 -0
- package/dist/utils/pwa-cache-manager.d.ts +17 -0
- package/dist/utils/usage-tracker.d.ts +6 -0
- package/dist/vanilla/vite.config.d.ts +2 -0
- package/dist/vite.d.ts +19 -0
- package/dist/wasm/avatarCoreAdapter.d.ts +15 -126
- package/dist/wasm/avatarCoreMemory.d.ts +5 -2
- package/package.json +19 -8
- package/vite.d.ts +20 -0
- package/vite.js +126 -0
- package/dist/StreamingAudioPlayer-D7s8q5h0.js +0 -319
- package/dist/StreamingAudioPlayer-D7s8q5h0.js.map +0 -1
- package/dist/animation/AnimationWebSocketClient.d.ts.map +0 -1
- package/dist/animation/utils/eventEmitter.d.ts.map +0 -1
- package/dist/animation/utils/flameConverter.d.ts.map +0 -1
- package/dist/audio/AnimationPlayer.d.ts.map +0 -1
- package/dist/audio/StreamingAudioPlayer.d.ts.map +0 -1
- package/dist/avatar_core_wasm-D4eEi7Eh.js +0 -1666
- package/dist/avatar_core_wasm-D4eEi7Eh.js.map +0 -1
- package/dist/config/app-config.d.ts.map +0 -1
- package/dist/config/constants.d.ts.map +0 -1
- package/dist/config/sdk-config-loader.d.ts.map +0 -1
- package/dist/core/Avatar.d.ts.map +0 -1
- package/dist/core/AvatarController.d.ts.map +0 -1
- package/dist/core/AvatarDownloader.d.ts.map +0 -1
- package/dist/core/AvatarKit.d.ts +0 -66
- package/dist/core/AvatarKit.d.ts.map +0 -1
- package/dist/core/AvatarManager.d.ts.map +0 -1
- package/dist/core/AvatarView.d.ts.map +0 -1
- package/dist/core/NetworkLayer.d.ts.map +0 -1
- package/dist/generated/driveningress/v1/driveningress.d.ts.map +0 -1
- package/dist/generated/driveningress/v2/driveningress.d.ts.map +0 -1
- package/dist/generated/google/protobuf/struct.d.ts.map +0 -1
- package/dist/generated/google/protobuf/timestamp.d.ts.map +0 -1
- package/dist/index-CpSvWi6A.js +0 -6026
- package/dist/index-CpSvWi6A.js.map +0 -1
- package/dist/index.d.ts.map +0 -1
- package/dist/index.js.map +0 -1
- package/dist/renderer/RenderSystem.d.ts.map +0 -1
- package/dist/renderer/covariance.d.ts.map +0 -1
- package/dist/renderer/renderer.d.ts.map +0 -1
- package/dist/renderer/sortSplats.d.ts.map +0 -1
- package/dist/renderer/webgl/reorderData.d.ts.map +0 -1
- package/dist/renderer/webgl/webglRenderer.d.ts.map +0 -1
- package/dist/renderer/webgpu/webgpuRenderer.d.ts.map +0 -1
- package/dist/types/character-settings.d.ts.map +0 -1
- package/dist/types/character.d.ts.map +0 -1
- package/dist/types/index.d.ts.map +0 -1
- package/dist/utils/animation-interpolation.d.ts.map +0 -1
- package/dist/utils/cls-tracker.d.ts +0 -17
- package/dist/utils/cls-tracker.d.ts.map +0 -1
- package/dist/utils/error-utils.d.ts.map +0 -1
- package/dist/utils/logger.d.ts.map +0 -1
- package/dist/utils/reqId.d.ts +0 -20
- package/dist/utils/reqId.d.ts.map +0 -1
- package/dist/wasm/avatarCoreAdapter.d.ts.map +0 -1
- package/dist/wasm/avatarCoreMemory.d.ts.map +0 -1
package/README.md
CHANGED
@@ -1,4 +1,4 @@
-#
+# AvatarKit SDK

Real-time virtual avatar rendering SDK based on 3D Gaussian Splatting, supporting audio-driven animation rendering and high-quality 3D rendering.

@@ -6,6 +6,7 @@ Real-time virtual avatar rendering SDK based on 3D Gaussian Splatting, supportin

- **3D Gaussian Splatting Rendering** - Based on the latest point cloud rendering technology, providing high-quality 3D virtual avatars
- **Audio-Driven Real-Time Animation Rendering** - Users provide audio data, SDK handles receiving animation data and rendering
+- **Multi-Avatar Support** - Support multiple avatar instances simultaneously, each with independent state and rendering
- **WebGPU/WebGL Dual Rendering Backend** - Automatically selects the best rendering backend for compatibility
- **WASM High-Performance Computing** - Uses C++ compiled WebAssembly modules for geometric calculations
- **TypeScript Support** - Complete type definitions and IntelliSense
@@ -17,90 +18,179 @@ Real-time virtual avatar rendering SDK based on 3D Gaussian Splatting, supportin
npm install @spatialwalk/avatarkit
```

+## 🔧 Vite Configuration (Recommended)
+
+If you use Vite as your build tool, we strongly recommend using our Vite plugin to handle the WASM file configuration automatically. The plugin applies all necessary settings so you don't need to configure anything by hand.
+
+### Using the Plugin
+
+Add the plugin in `vite.config.ts`:
+
+```typescript
+import { defineConfig } from 'vite'
+import { avatarkitVitePlugin } from '@spatialwalk/avatarkit/vite'
+
+export default defineConfig({
+  plugins: [
+    avatarkitVitePlugin(), // just add this line
+  ],
+})
+```
+
+### Plugin Features
+
+The plugin automatically handles:
+
+- ✅ **Dev server**: Sets the correct MIME type for WASM files (`application/wasm`)
+- ✅ **Build**: Copies WASM files to the `dist/assets/` directory
+  - Smart detection: extracts the referenced WASM filename (including hash) from the JS glue file
+  - Automatic matching: ensures the copied WASM file matches the reference in the JS glue file
+  - Hash support: correctly handles hashed WASM files (e.g. `avatar_core_wasm-{hash}.wasm`)
+- ✅ **WASM JS glue**: Copies the WASM JS glue file to the `dist/assets/` directory
+- ✅ **Cloudflare Pages**: Generates a `_headers` file so WASM files are served with the correct MIME type
+- ✅ **Vite config**: Configures `optimizeDeps`, `assetsInclude`, `assetsInlineLimit`, and related options
+
+### Manual Configuration (Without the Plugin)
+
+If you don't use the Vite plugin, you need to configure the following manually:
+
+```typescript
+// vite.config.ts
+export default defineConfig({
+  optimizeDeps: {
+    exclude: ['@spatialwalk/avatarkit'],
+  },
+  assetsInclude: ['**/*.wasm'],
+  build: {
+    assetsInlineLimit: 0,
+    rollupOptions: {
+      output: {
+        assetFileNames: (assetInfo) => {
+          if (assetInfo.name?.endsWith('.wasm')) {
+            return 'assets/[name][extname]'
+          }
+          return 'assets/[name]-[hash][extname]'
+        },
+      },
+    },
+  },
+  // The dev server needs a manually configured middleware to set the WASM MIME type
+  configureServer(server) {
+    server.middlewares.use((req, res, next) => {
+      if (req.url?.endsWith('.wasm')) {
+        res.setHeader('Content-Type', 'application/wasm')
+      }
+      next()
+    })
+  },
+})
+```
+
## 🎯 Quick Start

+### ⚠️ Important: Audio Context Initialization
+
+**Before using any audio-related features, you MUST initialize the audio context in a user gesture context** (e.g., `click`, `touchstart` event handlers). This is required by browser security policies. Calling `initializeAudioContext()` outside a user gesture will fail.
+
### Basic Usage

```typescript
import {
-
+  AvatarSDK,
  AvatarManager,
  AvatarView,
  Configuration,
-  Environment
+  Environment,
+  DrivingServiceMode,
+  LogLevel
} from '@spatialwalk/avatarkit'

// 1. Initialize SDK
+
const configuration: Configuration = {
-  environment: Environment.
+  environment: Environment.cn,
+  drivingServiceMode: DrivingServiceMode.sdk, // Optional, 'sdk' is default
+  // - DrivingServiceMode.sdk: SDK mode - SDK handles WebSocket communication
+  // - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
+  logLevel: LogLevel.off, // Optional, 'off' is default
+  // - LogLevel.off: Disable all logs
+  // - LogLevel.error: Only error logs
+  // - LogLevel.warning: Warning and error logs
+  // - LogLevel.all: All logs (info, warning, error)
+  audioFormat: { // Optional, default is { channelCount: 1, sampleRate: 16000 }
+    channelCount: 1, // Fixed to 1 (mono)
+    sampleRate: 16000 // Supported: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz
+  }
+  // characterApiBaseUrl: 'https://custom-api.example.com' // Optional, internal debug config, can be ignored
}

-await
+await AvatarSDK.initialize('your-app-id', configuration)

// Set sessionToken (if needed, call separately)
-//
+// AvatarSDK.setSessionToken('your-session-token')

-// 2. Load
-const avatarManager =
+// 2. Load avatar
+const avatarManager = AvatarManager.shared
const avatar = await avatarManager.load('character-id', (progress) => {
  console.log(`Loading progress: ${progress.progress}%`)
})

// 3. Create view (automatically creates Canvas and AvatarController)
-//
+// The playback mode is determined by drivingServiceMode in AvatarSDK configuration
+// - DrivingServiceMode.sdk: SDK mode - SDK handles WebSocket communication
+// - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
const container = document.getElementById('avatar-container')
-const avatarView = new AvatarView(avatar,
-
-
+const avatarView = new AvatarView(avatar, container)
+
+// 4. ⚠️ CRITICAL: Initialize audio context (MUST be called in user gesture context)
+// This method MUST be called within a user gesture event handler (click, touchstart, etc.)
+// to satisfy browser security policies. Calling it outside a user gesture will fail.
+button.addEventListener('click', async () => {
+  // Initialize audio context - MUST be in user gesture context
+  await avatarView.controller.initializeAudioContext()
+
+  // 5. Start real-time communication (SDK mode only)
+  await avatarView.controller.start()
+
+  // 6. Send audio data (SDK mode, must be mono PCM16 format matching configured sample rate)
+  // audioData: ArrayBuffer or Uint8Array containing PCM16 audio samples
+  // - PCM files: Can be directly read as ArrayBuffer
+  // - WAV files: Extract PCM data from WAV format (may require resampling)
+  // - MP3 files: Decode first (e.g., using AudioContext.decodeAudioData()), then convert to PCM16
+  const audioData = new ArrayBuffer(1024) // Placeholder: Replace with actual PCM16 audio data
+  avatarView.controller.send(audioData, false) // Send audio data
+  avatarView.controller.send(audioData, true) // end=true marks the end of current conversation round
})
-
-// 4. Start real-time communication (network mode only)
-await avatarView.avatarController.start()
-
-// 5. Send audio data (network mode)
-// ⚠️ Important: Audio must be 16kHz mono PCM16 format
-// If audio is Uint8Array, you can use slice().buffer to convert to ArrayBuffer
-const audioUint8 = new Uint8Array(1024) // Example: 16kHz PCM16 audio data (512 samples = 1024 bytes)
-const audioData = audioUint8.slice().buffer // Simplified conversion, works for ArrayBuffer and SharedArrayBuffer
-avatarView.avatarController.send(audioData, false) // Send audio data, will automatically start playing after accumulating enough data
-avatarView.avatarController.send(audioData, true) // end=true means immediately return animation data, no longer accumulating
```
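A note on the `audioData` argument: `send()` accepts `ArrayBuffer` or `Uint8Array`. If an audio source hands you a `Uint8Array` view that is offset into a larger (or shared) buffer and you need a standalone `ArrayBuffer`, copying via `slice()` is the simplest route — a sketch; the helper name is ours, not an SDK API:

```typescript
// Copy a Uint8Array view into a fresh, standalone ArrayBuffer.
// slice() copies only the viewed bytes, which also normalizes views that are
// offset into a larger buffer or backed by a SharedArrayBuffer.
function toStandaloneArrayBuffer(chunk: Uint8Array): ArrayBuffer {
  return chunk.slice().buffer as ArrayBuffer
}

// A view into the middle of a larger buffer:
const backing = new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8])
const view = backing.subarray(2, 6) // 4 bytes, starting at offset 2
const standalone = toStandaloneArrayBuffer(view)
console.log(standalone.byteLength) // 4
```

The copy costs one allocation per chunk; for small real-time audio chunks that overhead is negligible.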

-###
+### Host Mode Example

```typescript
-import { AvatarPlaybackMode } from '@spatialwalk/avatarkit'

-// 1-3. Same as
+// 1-3. Same as SDK mode (initialize SDK, load avatar)

-// 3. Create view with
+// 3. Create view with Host mode
const container = document.getElementById('avatar-container')
-const avatarView = new AvatarView(avatar,
-
-
-
-
-
-//
-
-
-
-
-
-avatarView.avatarController.sendAudioChunk(audioData3, false)
-avatarView.avatarController.sendKeyframes(animationData2)
+const avatarView = new AvatarView(avatar, container)
+
+// 4. ⚠️ CRITICAL: Initialize audio context (MUST be called in user gesture context)
+// This method MUST be called within a user gesture event handler (click, touchstart, etc.)
+// to satisfy browser security policies. Calling it outside a user gesture will fail.
+button.addEventListener('click', async () => {
+  // Initialize audio context - MUST be in user gesture context
+  await avatarView.controller.initializeAudioContext()
+
+  // 5. Host Mode Workflow:
+  // Send audio data first to get conversationId, then use it to send animation data
+  const conversationId = avatarView.controller.yieldAudioData(audioData, false)
+  avatarView.controller.yieldFramesData(animationDataArray, conversationId) // animationDataArray: (Uint8Array | ArrayBuffer)[]
```

### Complete Examples

-
-
-
-
-This repository contains complete examples for Vanilla JS, Vue 3, and React, demonstrating:
-- Network mode: Real-time audio input with automatic animation data reception
-- External data mode: Custom data sources with manual audio/animation data management
+This SDK supports two usage modes:
+- SDK mode: Real-time audio input with automatic animation data reception
+- Host mode: Custom data sources with manual audio/animation data management

## 🏗️ Architecture Overview

@@ -110,47 +200,60 @@ The SDK uses a three-layer architecture for clear separation of concerns:

1. **Rendering Layer (AvatarView)** - Responsible for 3D rendering only
2. **Playback Layer (AvatarController)** - Manages audio/animation synchronization and playback
-3. **Network Layer
+3. **Network Layer** - Handles WebSocket communication (only in SDK mode, internal implementation)

### Core Components

-- **
-- **AvatarManager** -
+- **AvatarSDK** - SDK initialization and management
+- **AvatarManager** - Avatar resource loading and management
- **AvatarView** - 3D rendering view (rendering layer)
- **AvatarController** - Audio/animation playback controller (playback layer)
-- **NetworkLayer** - WebSocket communication (network layer, automatically composed in network mode)
-- **AvatarCoreAdapter** - WASM module adapter

### Playback Modes

-The SDK supports two playback modes, configured
+The SDK supports two playback modes, configured in `AvatarSDK.initialize()`:

-#### 1.
+#### 1. SDK Mode (Default)
+- Configured via `drivingServiceMode: DrivingServiceMode.sdk` in `AvatarSDK.initialize()`
- SDK handles WebSocket communication automatically
- Send audio data via `AvatarController.send()`
- SDK receives animation data from backend and synchronizes playback
- Best for: Real-time audio input scenarios

-#### 2.
--
--
+#### 2. Host Mode
+- Configured via `drivingServiceMode: DrivingServiceMode.host` in `AvatarSDK.initialize()`
+- Host application manages its own network/data fetching
+- Host application provides both audio and animation data
- SDK only handles synchronized playback
- Best for: Custom data sources, pre-recorded content, or custom network implementations

+**Note:** The playback mode is determined by `drivingServiceMode` in `AvatarSDK.initialize()` configuration.
+
+### Fallback Mechanism
+
+The SDK includes a fallback mechanism to ensure audio playback continues even when animation data is unavailable:
+
+- **SDK Mode Connection Failure**: If the WebSocket connection fails to establish within 15 seconds, the SDK automatically enters fallback mode. In this mode, audio data can still be sent and will play normally, even though no animation data will be received from the server. This ensures that audio playback is not interrupted even when the service connection fails.
+- **SDK Mode Server Error**: If the server returns an error after the connection is established, the SDK automatically enters audio-only mode for that session and continues playing audio independently.
+- **Host Mode**: If empty animation data is provided (empty array or undefined), the SDK automatically enters audio-only mode.
+- Once in audio-only mode, any subsequent animation data for that session will be ignored, and only audio will continue playing.
+- The fallback mode is interruptible, just like normal playback mode.
+- Connection state callbacks (`onConnectionState`) will notify you when the connection fails or times out, allowing you to handle the fallback state appropriately.
+
### Data Flow

-####
+#### SDK Mode Flow

```
User audio input (16kHz mono PCM16)
  ↓
AvatarController.send()
  ↓
-
+WebSocket → Backend processing
  ↓
Backend returns animation data (FLAME keyframes)
  ↓
-
+AvatarController → AnimationPlayer
  ↓
FLAME parameters → AvatarCore.computeFrameFlatFromParams() → Splat data
  ↓
@@ -159,15 +262,14 @@ AvatarController (playback loop) → AvatarView.renderRealtimeFrame()
RenderSystem → WebGPU/WebGL → Canvas rendering
```

-####
+#### Host Mode Flow

```
External data source (audio + animation)
  ↓
-AvatarController.
+AvatarController.yieldAudioData(audioChunk) // Returns conversationId
  ↓
-AvatarController.
-AvatarController.sendKeyframes() // Stream additional animation
+AvatarController.yieldFramesData(keyframesDataArray, conversationId) // keyframesDataArray: (Uint8Array | ArrayBuffer)[] - each element is a protobuf encoded Message
  ↓
AvatarController → AnimationPlayer (synchronized playback)
  ↓
@@ -178,205 +280,348 @@ AvatarController (playback loop) → AvatarView.renderRealtimeFrame()
RenderSystem → WebGPU/WebGL → Canvas rendering
```

-**Note:**
-- In network mode, users provide audio data, SDK handles network communication and animation data reception
-- In external data mode, users provide both audio and animation data, SDK handles synchronized playback only
-
|

-**⚠️ Important:** The SDK requires audio data to be in **
+**⚠️ Important:** The SDK requires audio data to be in **mono PCM16** format:

-- **Sample Rate**:
--
+- **Sample Rate**: Configurable via `audioFormat.sampleRate` in SDK initialization (default: 16000 Hz)
+  - Supported sample rates: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz
+  - The configured sample rate will be used for both audio recording and playback
+- **Channels**: Mono (single channel) - Fixed to 1 channel
- **Format**: PCM16 (16-bit signed integer, little-endian)
- **Byte Order**: Little-endian

**Audio Data Format:**
-- Each sample is 2 bytes (16-bit)
+- Each sample is 2 bytes (16-bit signed integer, little-endian)
- Audio data should be provided as `ArrayBuffer` or `Uint8Array`
-- For example: 1 second of audio = 16000 samples × 2 bytes = 32000 bytes
+- For example, with 16kHz sample rate: 1 second of audio = 16000 samples × 2 bytes = 32000 bytes
+- For 48kHz sample rate: 1 second of audio = 48000 samples × 2 bytes = 96000 bytes
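The size arithmetic in those bullets can be captured in a one-line helper — a sketch of the math only; `pcm16ByteLength` is our name, not an SDK API:

```typescript
// Bytes needed for a given duration of mono PCM16 audio:
// each sample is 2 bytes, so bytes = samples * 2 = round(sampleRate * seconds) * 2.
function pcm16ByteLength(sampleRate: number, seconds: number): number {
  const bytesPerSample = 2 // 16-bit mono
  return Math.round(sampleRate * seconds) * bytesPerSample
}

console.log(pcm16ByteLength(16000, 1)) // 32000
console.log(pcm16ByteLength(48000, 1)) // 96000
```

This is handy for pre-allocating buffers or sanity-checking that a chunk you are about to send actually covers the duration you expect.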
+
+**Audio Data Source:**
+The `audioData` parameter represents raw PCM16 audio samples in the configured sample rate and mono format. Common audio sources include:
+- **PCM files**: Raw PCM16 files can be directly read as `ArrayBuffer` or `Uint8Array` and sent to the SDK (ensure sample rate matches configuration)
+- **WAV files**: WAV files contain PCM16 audio data in their data chunk. After extracting the PCM data from the WAV file format, it can be sent to the SDK (may require resampling if sample rate differs)
+- **MP3 files**: MP3 files need to be decoded first (e.g., using `AudioContext.decodeAudioData()` or a decoder library), then converted from the decoded format to PCM16 before sending to the SDK
+- **Microphone input**: Real-time microphone audio needs to be captured and converted to PCM16 format at the configured sample rate before sending
+- **Other audio sources**: Any audio source must be converted to mono PCM16 format at the configured sample rate before sending
+
+**Example: Processing WAV and MP3 Files:**
+```typescript
+// WAV file processing
+async function processWAVFile(wavFile: File): Promise<ArrayBuffer> {
+  const arrayBuffer = await wavFile.arrayBuffer()
+  const view = new DataView(arrayBuffer)
+
+  // WAV format: Skip header (usually 44 bytes for standard WAV)
+  // Check RIFF header
+  if (view.getUint32(0, true) !== 0x46464952) { // "RIFF"
+    throw new Error('Invalid WAV file')
+  }
+
+  // Find "data" chunk (offset may vary)
+  let dataOffset = 44 // Standard WAV header size
+  // For non-standard WAV files, you may need to search for "data" chunk
+  // This is a simplified example - production code should parse chunks properly
+
+  const pcmData = arrayBuffer.slice(dataOffset)
+  return pcmData
+}
+
+// MP3 file processing
+async function processMP3File(mp3File: File, targetSampleRate: number): Promise<ArrayBuffer> {
+  const arrayBuffer = await mp3File.arrayBuffer()
+  const audioContext = new AudioContext({ sampleRate: targetSampleRate })
+
+  // Decode MP3 to AudioBuffer
+  const audioBuffer = await audioContext.decodeAudioData(arrayBuffer.slice(0))
+
+  // Convert AudioBuffer to PCM16 ArrayBuffer
+  const length = audioBuffer.length
+  const channels = audioBuffer.numberOfChannels
+  const pcm16Buffer = new ArrayBuffer(length * 2)
+  const pcm16View = new DataView(pcm16Buffer)
+
+  // Mix down to mono if stereo
+  const sourceData = channels === 1
+    ? audioBuffer.getChannelData(0)
+    : new Float32Array(length)
+
+  if (channels > 1) {
+    const leftChannel = audioBuffer.getChannelData(0)
+    const rightChannel = audioBuffer.getChannelData(1)
+    for (let i = 0; i < length; i++) {
+      sourceData[i] = (leftChannel[i] + rightChannel[i]) / 2 // Mix to mono
+    }
+  }
+
+  // Convert float32 (-1.0 to 1.0) to int16 (-32768 to 32767)
+  for (let i = 0; i < length; i++) {
+    const sample = Math.max(-1, Math.min(1, sourceData[i])) // Clamp
+    const int16Sample = sample < 0 ? sample * 0x8000 : sample * 0x7FFF
+    pcm16View.setInt16(i * 2, int16Sample, true) // little-endian
+  }
+
+  audioContext.close()
+  return pcm16Buffer
+}
+
+// Usage example:
+// const wavPcmData = await processWAVFile(wavFile)
+// avatarView.controller.send(wavPcmData, false)
+//
+// const mp3PcmData = await processMP3File(mp3File, 16000) // 16kHz
+// avatarView.controller.send(mp3PcmData, false)
+```
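The README's WAV example assumes a fixed 44-byte header and itself notes that production code should parse chunks properly. A chunk-walking sketch of that proper parse (the helper name is ours, not part of the SDK):

```typescript
// Walk RIFF sub-chunks to locate the "data" chunk instead of assuming a
// fixed 44-byte header (non-standard WAVs may carry LIST/fact chunks first).
// Returns the byte offset and length of the raw PCM payload.
function findWavDataChunk(buffer: ArrayBuffer): { offset: number; length: number } {
  const view = new DataView(buffer)
  if (view.getUint32(0, true) !== 0x46464952) throw new Error('Not a RIFF file') // "RIFF"
  if (view.getUint32(8, true) !== 0x45564157) throw new Error('Not a WAVE file') // "WAVE"
  let pos = 12 // first sub-chunk starts right after the RIFF size + "WAVE" tag
  while (pos + 8 <= view.byteLength) {
    const id = view.getUint32(pos, true)
    const size = view.getUint32(pos + 4, true)
    if (id === 0x61746164) { // "data"
      return { offset: pos + 8, length: size }
    }
    pos += 8 + size + (size % 2) // chunk bodies are padded to even lengths
  }
  throw new Error('No data chunk found')
}
```

`buffer.slice(offset, offset + length)` then yields exactly the PCM payload, even when extra chunks sit between the header and the audio data.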

**Resampling:**
-- If your audio source is at a different sample rate
+- If your audio source is at a different sample rate, you must resample it to match the configured sample rate before sending to the SDK
- For high-quality resampling, we recommend using Web Audio API's `OfflineAudioContext` with anti-aliasing filtering
- See example projects for resampling implementation
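Where `OfflineAudioContext` is not available (workers, quick tests), the resampling step can be sketched with naive linear interpolation — note this lacks the anti-aliasing filtering the README recommends, so treat it as illustration only:

```typescript
// Naive linear-interpolation resampler for mono float32 audio.
// No low-pass filtering is applied, so downsampling can alias.
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  if (fromRate === toRate) return input.slice()
  const outLength = Math.round(input.length * toRate / fromRate)
  const out = new Float32Array(outLength)
  const step = fromRate / toRate // input samples consumed per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step
    const i0 = Math.floor(pos)
    const i1 = Math.min(i0 + 1, input.length - 1)
    const frac = pos - i0
    out[i] = input[i0] * (1 - frac) + input[i1] * frac // interpolate between neighbors
  }
  return out
}
```

Resample before the float32-to-PCM16 conversion, so the samples you finally send match the configured `audioFormat.sampleRate`.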

+**Configuration Example:**
+```typescript
+const configuration: Configuration = {
+  environment: Environment.cn,
+  audioFormat: {
+    channelCount: 1, // Fixed to 1 (mono)
+    sampleRate: 48000 // Choose from: 8000, 16000, 22050, 24000, 32000, 44100, 48000
+  }
+}
+```
+
## 📚 API Reference

-###
+### AvatarSDK

The core management class of the SDK, responsible for initialization and global configuration.

```typescript
// Initialize SDK
-await
+await AvatarSDK.initialize(appId: string, configuration: Configuration)

// Check initialization status
-const isInitialized =
+const isInitialized = AvatarSDK.isInitialized
+
+// Get initialized app ID
+const appId = AvatarSDK.appId
+
+// Get configuration
+const config = AvatarSDK.configuration
+
+// Set sessionToken (if needed, call separately)
+AvatarSDK.setSessionToken('your-session-token')
+
+// Set userId (optional, for telemetry)
+AvatarSDK.setUserId('user-id')
+
+// Get sessionToken
+const sessionToken = AvatarSDK.sessionToken
+
+// Get userId
+const userId = AvatarSDK.userId
+
+// Get SDK version
+const version = AvatarSDK.version

// Cleanup resources (must be called when no longer in use)
-
+AvatarSDK.cleanup()
```

### AvatarManager

-
+Avatar resource manager, responsible for downloading, caching, and loading avatar data. Use the singleton instance via `AvatarManager.shared`.

```typescript
-
+// Get singleton instance
+const manager = AvatarManager.shared

-// Load
+// Load avatar
const avatar = await manager.load(
  characterId: string,
  onProgress?: (progress: LoadProgressInfo) => void
)

// Clear cache
-manager.
+manager.clearAll()
```

### AvatarView

3D rendering view (rendering layer), responsible for 3D rendering only. Internally automatically creates and manages `AvatarController`.

-
+```typescript
+constructor(avatar: Avatar, container: HTMLElement)
+```

-**
+**Parameters:**
+- `avatar`: Avatar instance
+- `container`: Canvas container element (required)
+  - Canvas automatically uses the full size of the container (width and height)
+  - Canvas aspect ratio adapts to container size - set container size to control aspect ratio
+  - Canvas will be automatically added to the container
+  - SDK automatically handles resize events via ResizeObserver
+
+**Playback Mode:**
+- The playback mode is determined by `drivingServiceMode` in `AvatarSDK.initialize()` configuration
- The playback mode is fixed when creating `AvatarView` and persists throughout its lifecycle
- Cannot be changed after creation

```typescript
-import { AvatarPlaybackMode } from '@spatialwalk/avatarkit'
-
// Create view (Canvas is automatically added to container)
-// Network mode (default)
const container = document.getElementById('avatar-container')
-const avatarView = new AvatarView(avatar
-  container: container,
-  playbackMode: AvatarPlaybackMode.network // Optional, default is 'network'
-})
-
-// External data mode
-const avatarView = new AvatarView(avatar: Avatar, {
-  container: container,
-  playbackMode: AvatarPlaybackMode.external
-})
+const avatarView = new AvatarView(avatar, container)

-//
-
+// Wait for first frame to render
+avatarView.onFirstRendering = () => {
+  // First frame rendered
+}

-// Get
-
+// Get or set avatar transform (position and scale)
+// Get current transform
+const currentTransform = avatarView.transform // { x: number, y: number, scale: number }

-//
-avatarView.
+// Set transform
+avatarView.transform = { x, y, scale }
+// - x: Horizontal offset in normalized coordinates (-1 to 1, where -1 = left edge, 0 = center, 1 = right edge)
+// - y: Vertical offset in normalized coordinates (-1 to 1, where -1 = bottom edge, 0 = center, 1 = top edge)
+// - scale: Scale factor (1.0 = original size, 2.0 = double size, 0.5 = half size)

-// Cleanup resources (must be called before switching
+// Cleanup resources (must be called before switching avatars)
avatarView.dispose()
```

**Avatar Switching Example:**

```typescript
// To switch avatars, simply dispose the old view and create a new one
if (currentAvatarView) {
  currentAvatarView.dispose()
}

// Load new avatar
const newAvatar = await avatarManager.load('new-character-id')

// Create new AvatarView
currentAvatarView = new AvatarView(newAvatar, container)

// SDK mode: start connection (will throw an error if not in SDK mode)
await currentAvatarView.controller.start()
```

### AvatarController

Audio/animation playback controller (the playback layer); manages synchronized playback of audio and animation. Automatically handles WebSocket communication in SDK mode.

**Two Usage Patterns:**

#### SDK Mode Methods

```typescript
// ⚠️ CRITICAL: Initialize audio context first (MUST be called in user gesture context)
// This method MUST be called within a user gesture event handler (click, touchstart, etc.)
// to satisfy browser security policies. Calling it outside a user gesture will fail.
// All audio operations (start, send, etc.) require prior initialization.
button.addEventListener('click', async () => {
  // Initialize audio context - MUST be in user gesture context
  await avatarView.controller.initializeAudioContext()

  // Start WebSocket service
  await avatarView.controller.start()

  // Send audio data (must be mono PCM16 format matching the configured sample rate)
  const conversationId = avatarView.controller.send(audioData: ArrayBuffer, end: boolean)
  // Returns: conversationId - Conversation ID for this conversation session
  // end: false (default) - Continue sending audio data for the current conversation
  // end: true - Mark the end of the current conversation round. After end=true, sending new
  //             audio data will interrupt any ongoing playback from the previous round
})

// Close WebSocket service
avatarView.controller.close()
```
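When capturing microphone audio with the Web Audio API you typically get `Float32Array` samples, while `send()` expects mono PCM16. The conversion helper below is our own sketch, not an SDK API; it assumes the samples are already at the sample rate configured in `audioFormat`:

```typescript
// Helper sketch (not part of the SDK): convert Float32 samples from the Web Audio API
// ([-1, 1] range) into the mono PCM16 little-endian ArrayBuffer that send() expects.
function floatTo16BitPCM(samples: Float32Array): ArrayBuffer {
  const buffer = new ArrayBuffer(samples.length * 2) // 2 bytes per PCM16 sample
  const view = new DataView(buffer)
  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range
    const s = Math.max(-1, Math.min(1, samples[i]))
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true) // little-endian
  }
  return buffer
}
```

The resulting buffer can then be passed on, e.g. `avatarView.controller.send(buffer, false)`.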

#### Host Mode Methods

```typescript
// ⚠️ CRITICAL: Initialize audio context first (MUST be called in user gesture context)
// This method MUST be called within a user gesture event handler (click, touchstart, etc.)
// to satisfy browser security policies. Calling it outside a user gesture will fail.
// All audio operations (yieldAudioData, yieldFramesData, etc.) require prior initialization.
button.addEventListener('click', async () => {
  // Initialize audio context - MUST be in user gesture context
  await avatarView.controller.initializeAudioContext()

  // Stream audio chunks (must be mono PCM16 format matching the configured sample rate)
  const conversationId = avatarView.controller.yieldAudioData(
    data: Uint8Array,        // Audio chunk data (PCM16 format)
    isLast: boolean = false  // Whether this is the last chunk
  )
  // Returns: conversationId - Conversation ID for this audio session

  // Stream animation keyframes (requires the conversationId from the audio data)
  avatarView.controller.yieldFramesData(
    keyframesDataArray: (Uint8Array | ArrayBuffer)[], // Animation keyframes binary data array (each element is a protobuf-encoded Message)
    conversationId: string // Conversation ID (required)
  )
})
```

**⚠️ Important: Conversation ID (conversationId) Management**

**SDK Mode:**

- `send()` returns a conversationId to distinguish each conversation round
- `end=true` marks the end of a conversation round

**Host Mode:**

- `yieldAudioData()` returns a conversationId (automatically generated when starting a new session)
- `yieldFramesData()` requires a valid conversationId parameter
- Animation data with a mismatched conversationId will be **discarded**
- Use `getCurrentConversationId()` to retrieve the current active conversationId
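The matching rule can be made explicit with a small guard (our own helper, not an SDK API; the SDK performs the equivalent check internally and discards mismatches):

```typescript
// Helper sketch: decide whether locally queued animation frames still belong to
// the active conversation before calling yieldFramesData().
function framesBelongToActiveConversation(
  framesConversationId: string,
  activeConversationId: string | null // e.g. from getCurrentConversationId()
): boolean {
  return activeConversationId !== null && framesConversationId === activeConversationId
}
```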

#### Common Methods (Both Modes)

```typescript
// Interrupt current playback (stops and clears data)
avatarView.avatarController.interrupt()

// Clear all data and resources
avatarView.avatarController.clear()

// Get current conversation ID (for Host mode)
const conversationId = avatarView.avatarController.getCurrentConversationId()
// Returns: the conversationId for the active audio session, or null if there is no active session

// Volume control (affects only the avatar audio player, not system volume)
avatarView.avatarController.setVolume(0.5) // Set volume to 50% (0.0 to 1.0)
const currentVolume = avatarView.avatarController.getVolume() // Get current volume (0.0 to 1.0)

// Set event callbacks
avatarView.avatarController.onConnectionState = (state: ConnectionState) => {} // SDK mode only
avatarView.avatarController.onConversationState = (state: ConversationState) => {}
avatarView.avatarController.onError = (error: Error) => {}
```
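As a usage sketch for `setVolume()`, a fade-out can be driven by precomputed volume levels. The stepping helper below is our own, not part of the SDK:

```typescript
// Helper sketch (not part of the SDK): compute evenly spaced volume levels for a
// simple fade, to be applied one-by-one via avatarController.setVolume().
function fadeSteps(from: number, to: number, steps: number): number[] {
  const clamp = (v: number) => Math.max(0, Math.min(1, v)) // setVolume expects 0.0–1.0
  const result: number[] = []
  for (let i = 1; i <= steps; i++) {
    result.push(clamp(from + ((to - from) * i) / steps))
  }
  return result
}
```

Apply each step with `avatarView.avatarController.setVolume(v)` and a short delay between calls.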

#### Avatar Transform Methods

```typescript
// Get or set the avatar transform (position and scale in the canvas)
// Get current transform
const currentTransform = avatarView.transform // { x: number, y: number, scale: number }

// Set transform
avatarView.transform = { x, y, scale }
// - x: Horizontal offset in normalized coordinates (-1 to 1, where -1 = left edge, 0 = center, 1 = right edge)
// - y: Vertical offset in normalized coordinates (-1 to 1, where -1 = bottom edge, 0 = center, 1 = top edge)
// - scale: Scale factor (1.0 = original size, 2.0 = double size, 0.5 = half size)

// Example:
avatarView.transform = { x: 0, y: 0, scale: 1.0 }   // Center, original size
avatarView.transform = { x: 0.5, y: 0, scale: 2.0 } // Right half, double size
```
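Since `x` and `y` are documented as normalized to [-1, 1], it can be convenient to clamp values before assigning them. A sketch (our own helper; the lower bound on `scale` is an assumption, the SDK does not document one):

```typescript
// Helper sketch (not part of the SDK): clamp a transform to the documented ranges
// before assigning it to avatarView.transform.
interface AvatarTransform { x: number; y: number; scale: number }

function clampTransform(t: AvatarTransform): AvatarTransform {
  const clamp = (v: number, lo: number, hi: number) => Math.max(lo, Math.min(hi, v))
  return {
    x: clamp(t.x, -1, 1),
    y: clamp(t.y, -1, 1),
    scale: Math.max(t.scale, 0.01), // keep scale positive; this lower bound is our assumption
  }
}
```

Usage: `avatarView.transform = clampTransform({ x: 1.2, y: 0, scale: 2 })`.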

**Important Notes:**

- `start()` and `close()` are only available in SDK mode
- `yieldAudioData()` and `yieldFramesData()` are only available in Host mode
- `pause()`, `resume()`, `interrupt()`, `clear()`, `getCurrentConversationId()`, `setVolume()`, and `getVolume()` are available in both modes
- The playback mode is determined when creating `AvatarView` and cannot be changed

## 🔧 Configuration

```typescript
interface Configuration {
  environment: Environment
  drivingServiceMode?: DrivingServiceMode // Optional, default is 'sdk' (SDK mode)
  logLevel?: LogLevel // Optional, default is 'off' (no logs)
  audioFormat?: AudioFormat // Optional, default is { channelCount: 1, sampleRate: 16000 }
  characterApiBaseUrl?: string // Optional, internal debug config, can be ignored
}

interface AudioFormat {
  readonly channelCount: 1 // Fixed to 1 (mono)
  readonly sampleRate: number // Supported: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz; default: 16000
}
```
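Putting the configuration together, initialization might look like the following sketch. It assumes `AvatarSDK.initialize()` accepts a `Configuration` object and that the enums are exported from the package root; the token string is a placeholder:

```typescript
import { AvatarSDK, Environment, DrivingServiceMode, LogLevel } from '@spatialwalk/avatarkit'

await AvatarSDK.initialize({
  environment: Environment.cn,
  drivingServiceMode: DrivingServiceMode.sdk, // or DrivingServiceMode.host
  logLevel: LogLevel.error,
  audioFormat: { channelCount: 1, sampleRate: 16000 },
})

// The session token is set separately, not in Configuration
AvatarSDK.setSessionToken('your-session-token')
```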

### LogLevel

Control the verbosity of SDK logs:

```typescript
enum LogLevel {
  off = 'off',         // Disable all logs (default)
  error = 'error',     // Only error logs
  warning = 'warning', // Warning and error logs
  all = 'all'          // All logs (info, warning, error)
}
```

**Note:** `LogLevel.off` completely disables all logging, including error logs. Use with caution in production environments.

**Description:**

- `environment`: Specifies the environment (cn/intl); the SDK automatically uses the corresponding API address and WebSocket address based on the environment
- `drivingServiceMode`: Specifies the driving service mode
  - `DrivingServiceMode.sdk` (default): SDK mode - the SDK handles WebSocket communication automatically
  - `DrivingServiceMode.host`: Host mode - the host application provides audio and animation data
- `logLevel`: Controls the verbosity of SDK logs
  - `LogLevel.off` (default): Disable all logs
  - `LogLevel.error`: Only error logs
  - `LogLevel.warning`: Warning and error logs
  - `LogLevel.all`: All logs (info, warning, error)
- `audioFormat`: Configures the audio sample rate and channel count
  - `channelCount`: Fixed to 1 (mono channel)
  - `sampleRate`: Audio sample rate in Hz (default: 16000)
    - Supported values: 8000, 16000, 22050, 24000, 32000, 44100, 48000
    - The configured sample rate is used for both audio recording and playback
- `characterApiBaseUrl`: Internal debug config, can be ignored
- `sessionToken`: Set separately via `AvatarSDK.setSessionToken()`, not in `Configuration`

```typescript
enum Environment {
  cn = 'cn',     // China region
  intl = 'intl', // International region
}
```

### ConversationState

```typescript
enum ConversationState {
  idle = 'idle',       // Idle state (breathing animation)
  playing = 'playing', // Playing state (active conversation)
  pausing = 'pausing'  // Pausing state (paused during playback)
}
```

**State Description:**

- `idle`: Avatar is in idle state (breathing animation), waiting for a conversation to start
- `playing`: Avatar is playing conversation content (including during transition animations)
- `pausing`: Avatar playback is paused (e.g., when `end=false` and waiting for more audio data)

**Note:** During transition animations, the target state is notified immediately:

- When transitioning from `idle` to `playing`, the `playing` state is notified immediately
- When transitioning from `playing` to `idle`, the `idle` state is notified immediately

## 🎨 Rendering System

The SDK supports two rendering backends:

- **WebGPU** (preferred when available)
- **WebGL** (fallback)

The rendering system automatically selects the best backend, no manual configuration needed.

## 🚨 Error Handling

### AvatarError

The SDK uses custom error types, providing more detailed error information:

```typescript
import { AvatarError } from '@spatialwalk/avatarkit'

try {
  await avatarView.avatarController.start()
} catch (error) {
  if (error instanceof AvatarError) {
    console.error('SDK Error:', error.message, error.code)
  } else {
    console.error('Unknown error:', error)
  }
}
```

### Lifecycle Management

#### SDK Mode Lifecycle

```typescript
// Initialize
const container = document.getElementById('avatar-container')
const avatarView = new AvatarView(avatar, container)
await avatarView.avatarController.start()

// Use
// …

// Cleanup
avatarView.avatarController.close()
avatarView.dispose() // Automatically cleans up all resources
```

#### Host Mode Lifecycle

```typescript
// Initialize
const container = document.getElementById('avatar-container')
const avatarView = new AvatarView(avatar, container)

// Use
const conversationId = avatarView.avatarController.yieldAudioData(audioChunk, false)
avatarView.avatarController.yieldFramesData(keyframesDataArray, conversationId) // keyframesDataArray: (Uint8Array | ArrayBuffer)[]

// Cleanup
avatarView.avatarController.clear() // Clear all data and resources
avatarView.dispose() // Automatically cleans up all resources
```
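When feeding Host mode from a complete PCM16 buffer, the audio is typically split into chunks, with only the final one flagged as last. A helper sketch (our own, not an SDK API; the 100 ms chunk size is an arbitrary choice):

```typescript
// Helper sketch (not part of the SDK): split a PCM16 byte buffer into fixed-size
// chunks for yieldAudioData(), marking the final chunk with isLast = true.
function toAudioChunks(
  pcm: Uint8Array,
  chunkSize = 3200 // 100 ms of 16 kHz mono PCM16 (16000 samples/s × 2 bytes × 0.1 s)
): Array<{ data: Uint8Array; isLast: boolean }> {
  const chunks: Array<{ data: Uint8Array; isLast: boolean }> = []
  for (let offset = 0; offset < pcm.length; offset += chunkSize) {
    const data = pcm.subarray(offset, Math.min(offset + chunkSize, pcm.length))
    chunks.push({ data, isLast: offset + chunkSize >= pcm.length })
  }
  return chunks
}
```

Then feed each chunk: `const id = avatarView.avatarController.yieldAudioData(c.data, c.isLast)`.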

**⚠️ Important Notes:**

- When disposing of an `AvatarView` instance, you must call `dispose()` to properly clean up resources
- Not cleaning up properly may cause resource leaks and rendering errors
- In SDK mode, call `close()` before `dispose()` to properly close WebSocket connections
- In Host mode, call `clear()` before `dispose()` to clear all playback data

### Memory Optimization

- SDK automatically manages WASM memory allocation
- Supports dynamic loading/unloading of avatar and animation resources
- Provides a memory usage monitoring interface
## 🌐 Browser Compatibility
- **Chrome/Edge** 90+ (WebGPU recommended)

## 📞 Support

For questions, please contact:

- Email: code@spatialwalk.net
- Documentation: https://docs.spatialreal.ai
|