@spatialwalk/avatarkit 1.0.0-beta.16 → 1.0.0-beta.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -5,6 +5,54 @@ All notable changes to this project will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [1.0.0-beta.18] - 2025-11-25
9
+
10
+ ### 🔧 API Changes
11
+ - **Renamed `reqId` to `conversationId`** - Updated terminology for better clarity
12
+ - All methods and parameters that used `reqId` now use `conversationId`
13
+ - `getCurrentReqId()` → `getCurrentConversationId()`
14
+ - `generateReqId()` → `generateConversationId()`
15
+ - Updated all event logs and documentation to use `conversationId`
16
+ - Note: Protobuf protocol still uses `reqId` field name internally, but SDK API uses `conversationId`
17
+
18
+ ### 📚 Documentation
19
+ - Enhanced Host mode documentation to clearly emphasize the workflow: send audio data first to get conversationId, then use that conversationId to send animation data
20
+ - Updated Host Mode Example and Host Mode Flow sections with clearer step-by-step instructions
21
+
22
+ ## [1.0.0-beta.17] - 2025-11-24
23
+
24
+ ### ✨ New Features
25
+ - **Audio-Only Fallback Mechanism** - SDK now includes automatic fallback to audio-only playback when animation data is unavailable
26
+ - SDK mode: Automatically enters audio-only mode when server returns an error
27
+ - Host mode: Automatically enters audio-only mode when empty animation data is provided
28
+ - Once in audio-only mode, subsequent animation data for that session is ignored
29
+ - Fallback mode is interruptible, just like normal playback mode
30
+
31
+ ### 🔧 API Changes
32
+ - **Playback Mode Configuration** - Moved playback mode configuration from `AvatarView` constructor to `AvatarKit.initialize()`
33
+ - Playback mode is now determined by `drivingServiceMode` in `AvatarKit.initialize()` configuration
34
+ - `AvatarView` constructor now only requires `avatar` and `container` parameters
35
+ - Removed `AvatarViewOptions` interface
36
+ - `container` parameter is now required (no longer optional)
37
+ - **Method Renames** - Renamed methods in `AvatarController` for Host mode to better reflect their purpose
38
+ - `play()` → `playback()`: Renamed to better reflect that the method is used for playback of existing data (replay mode)
39
+ - Old API: `avatarController.play(initialAudioChunks, initialKeyframes)`
40
+ - New API: `avatarController.playback(initialAudioChunks, initialKeyframes)`
41
+ - `sendAudioChunk()` → `yieldAudioData()`: Renamed to better reflect that the method yields/streams audio data
42
+ - Old API: `avatarController.sendAudioChunk(data, isLast)`
43
+ - New API: `avatarController.yieldAudioData(data, isLast)`
44
+ - `sendKeyframes()` → `yieldFramesData()`: Renamed to better reflect that the method yields/streams animation keyframes
45
+ - Old API: `avatarController.sendKeyframes(keyframes, reqId)`
46
+ - New API: `avatarController.yieldFramesData(keyframes, conversationId)`
47
+
48
+ ### 🔧 Improvements
49
+ - Extended transition animation duration from 200ms to 400ms for smoother end-of-playback transitions
50
+
51
+ ### 📚 Documentation
52
+ - Updated README.md to use "SDK mode" and "Host mode" terminology instead of "Network mode" and "External data mode"
53
+ - Added fallback mechanism documentation
54
+ - Updated API reference to reflect new constructor signature
55
+
8
56
  ## [1.0.0-beta.16] - 2025-11-21
9
57
 
10
58
  ### ✨ New Features
@@ -174,7 +222,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
174
222
  - **External Data Mode**:
175
223
  - External components fully control audio and animation data acquisition
176
224
  - SDK only responsible for synchronized playback of externally provided data
177
- - Use `controller.play()`, `controller.sendAudioChunk()` and `controller.sendKeyframes()` methods
225
+ - Use `controller.playback()`, `controller.yieldAudioData()` and `controller.yieldFramesData()` methods
178
226
 
179
227
  ### ✨ New Features
180
228
 
package/README.md CHANGED
@@ -32,8 +32,13 @@ import {
32
32
  } from '@spatialwalk/avatarkit'
33
33
 
34
34
  // 1. Initialize SDK
35
+ import { DrivingServiceMode } from '@spatialwalk/avatarkit'
36
+
35
37
  const configuration: Configuration = {
36
38
  environment: Environment.test,
39
+ drivingServiceMode: DrivingServiceMode.sdk, // Optional, 'sdk' is default
40
+ // - DrivingServiceMode.sdk: SDK mode - SDK handles WebSocket communication
41
+ // - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
37
42
  }
38
43
 
39
44
  await AvatarKit.initialize('your-app-id', configuration)
@@ -42,55 +47,57 @@ await AvatarKit.initialize('your-app-id', configuration)
42
47
  // AvatarKit.setSessionToken('your-session-token')
43
48
 
44
49
  // 2. Load character
45
- const avatarManager = new AvatarManager()
50
+ const avatarManager = AvatarManager.shared
46
51
  const avatar = await avatarManager.load('character-id', (progress) => {
47
52
  console.log(`Loading progress: ${progress.progress}%`)
48
53
  })
49
54
 
50
55
  // 3. Create view (automatically creates Canvas and AvatarController)
51
- // Network mode (default)
56
+ // The playback mode is determined by drivingServiceMode in AvatarKit configuration
57
+ // - DrivingServiceMode.sdk: SDK mode - SDK handles WebSocket communication
58
+ // - DrivingServiceMode.host: Host mode - Host app provides audio and animation data
52
59
  const container = document.getElementById('avatar-container')
53
- const avatarView = new AvatarView(avatar, {
54
- container: container,
55
- playbackMode: 'network' // Optional, 'network' is default
56
- })
60
+ const avatarView = new AvatarView(avatar, container)
57
61
 
58
- // 4. Start real-time communication (network mode only)
62
+ // 4. Start real-time communication (SDK mode only)
59
63
  await avatarView.avatarController.start()
60
64
 
61
- // 5. Send audio data (network mode)
65
+ // 5. Send audio data (SDK mode)
62
66
  // ⚠️ Important: Audio must be 16kHz mono PCM16 format
63
67
  // If audio is Uint8Array, you can use slice().buffer to convert to ArrayBuffer
64
68
  const audioUint8 = new Uint8Array(1024) // Example: 16kHz PCM16 audio data (512 samples = 1024 bytes)
65
69
  const audioData = audioUint8.slice().buffer // Simplified conversion, works for ArrayBuffer and SharedArrayBuffer
66
70
  avatarView.avatarController.send(audioData, false) // Send audio data, will automatically start playing after accumulating enough data
67
- avatarView.avatarController.send(audioData, true) // end=true means immediately return animation data, no longer accumulating
71
+ avatarView.avatarController.send(audioData, true) // end=true marks the end of current conversation round
68
72
  ```
69
73
 
70
- ### External Data Mode Example
74
+ ### Host Mode Example
71
75
 
72
76
  ```typescript
73
77
  import { AvatarPlaybackMode } from '@spatialwalk/avatarkit'
74
78
 
75
- // 1-3. Same as network mode (initialize SDK, load character)
79
+ // 1-3. Same as SDK mode (initialize SDK, load character)
76
80
 
77
- // 3. Create view with external data mode
81
+ // 3. Create view with Host mode
78
82
  const container = document.getElementById('avatar-container')
79
- const avatarView = new AvatarView(avatar, {
80
- container: container,
81
- playbackMode: AvatarPlaybackMode.external
82
- })
83
+ const avatarView = new AvatarView(avatar, container)
83
84
 
84
- // 4. Start playback with initial data (obtained from your service)
85
- // Note: Audio and animation data should be obtained from your backend service
85
+ // 4. Host Mode Workflow:
86
+ // ⚠️ IMPORTANT: In Host mode, you MUST send audio data FIRST to get a conversationId,
87
+ // then use that conversationId to send animation data.
88
+ // Animation data with mismatched conversationId will be discarded.
89
+
90
+ // Option A: Play back existing audio and animation data (replay mode)
86
91
  const initialAudioChunks = [{ data: audioData1, isLast: false }, { data: audioData2, isLast: false }]
87
92
  const initialKeyframes = animationData1 // Animation keyframes from your service
88
-
89
- await avatarView.avatarController.play(initialAudioChunks, initialKeyframes)
90
-
91
- // 5. Stream additional data as needed
92
- avatarView.avatarController.sendAudioChunk(audioData3, false)
93
- avatarView.avatarController.sendKeyframes(animationData2)
93
+ // Step 1: Send audio first to get conversationId
94
+ const conversationId = await avatarView.avatarController.playback(initialAudioChunks, initialKeyframes)
95
+
96
+ // Option B: Stream new audio and animation data
97
+ // Step 1: Send audio data first to get conversationId
98
+ const currentConversationId = avatarView.avatarController.yieldAudioData(audioData3, false)
99
+ // Step 2: Use the conversationId to send animation data (mismatched conversationId will be discarded)
100
+ avatarView.avatarController.yieldFramesData(animationData2, currentConversationId || conversationId)
94
101
  ```
95
102
 
96
103
  ### Complete Examples
@@ -100,8 +107,8 @@ Check the example code in the GitHub repository for complete usage flows for bot
100
107
  **Example Project:** [AvatarKit-Web-Demo](https://github.com/spatialwalk/AvatarKit-Web-Demo)
101
108
 
102
109
  This repository contains complete examples for Vanilla JS, Vue 3, and React, demonstrating:
103
- - Network mode: Real-time audio input with automatic animation data reception
104
- - External data mode: Custom data sources with manual audio/animation data management
110
+ - SDK mode: Real-time audio input with automatic animation data reception
111
+ - Host mode: Custom data sources with manual audio/animation data management
105
112
 
106
113
  ## 🏗️ Architecture Overview
107
114
 
@@ -111,7 +118,7 @@ The SDK uses a three-layer architecture for clear separation of concerns:
111
118
 
112
119
  1. **Rendering Layer (AvatarView)** - Responsible for 3D rendering only
113
120
  2. **Playback Layer (AvatarController)** - Manages audio/animation synchronization and playback
114
- 3. **Network Layer** - Handles WebSocket communication (only in network mode, internal implementation)
121
+ 3. **Network Layer** - Handles WebSocket communication (only in SDK mode, internal implementation)
115
122
 
116
123
  ### Core Components
117
124
 
@@ -122,23 +129,36 @@ The SDK uses a three-layer architecture for clear separation of concerns:
122
129
 
123
130
  ### Playback Modes
124
131
 
125
- The SDK supports two playback modes, configured when creating `AvatarView`:
132
+ The SDK supports two playback modes, configured in `AvatarKit.initialize()`:
126
133
 
127
- #### 1. Network Mode (Default)
134
+ #### 1. SDK Mode (Default)
135
+ - Configured via `drivingServiceMode: DrivingServiceMode.sdk` in `AvatarKit.initialize()`
128
136
  - SDK handles WebSocket communication automatically
129
137
  - Send audio data via `AvatarController.send()`
130
138
  - SDK receives animation data from backend and synchronizes playback
131
139
  - Best for: Real-time audio input scenarios
132
140
 
133
- #### 2. External Data Mode
134
- - External components manage their own network/data fetching
135
- - External components provide both audio and animation data
141
+ #### 2. Host Mode
142
+ - Configured via `drivingServiceMode: DrivingServiceMode.host` in `AvatarKit.initialize()`
143
+ - Host application manages its own network/data fetching
144
+ - Host application provides both audio and animation data
136
145
  - SDK only handles synchronized playback
137
146
  - Best for: Custom data sources, pre-recorded content, or custom network implementations
138
147
 
148
+ **Note:** The playback mode is determined by `drivingServiceMode` in `AvatarKit.initialize()` configuration.
149
+
150
+ ### Fallback Mechanism
151
+
152
+ The SDK includes a fallback mechanism to ensure audio playback continues even when animation data is unavailable:
153
+
154
+ - **SDK Mode**: If the server returns an error or fails to provide animation data, the SDK automatically enters audio-only mode and continues playing audio independently
155
+ - **Host Mode**: If empty animation data is provided (empty array or undefined), the SDK automatically enters audio-only mode
156
+ - Once in audio-only mode, any subsequent animation data for that session will be ignored, and only audio will continue playing
157
+ - The fallback mode is interruptible, just like normal playback mode
158
+
139
159
  ### Data Flow
140
160
 
141
- #### Network Mode Flow
161
+ #### SDK Mode Flow
142
162
 
143
163
  ```
144
164
  User audio input (16kHz mono PCM16)
@@ -158,15 +178,20 @@ AvatarController (playback loop) → AvatarView.renderRealtimeFrame()
158
178
  RenderSystem → WebGPU/WebGL → Canvas rendering
159
179
  ```
160
180
 
161
- #### External Data Mode Flow
181
+ #### Host Mode Flow
162
182
 
163
183
  ```
164
184
  External data source (audio + animation)
165
185
 
166
- AvatarController.play(initialAudio, initialKeyframes) // Start playback
186
+ Step 1: Send audio data FIRST to get conversationId
187
+
188
+ AvatarController.playback(initialAudio, initialKeyframes) // Returns conversationId
189
+ OR
190
+ AvatarController.yieldAudioData(audioChunk) // Returns conversationId
191
+
192
+ Step 2: Use conversationId to send animation data
167
193
 
168
- AvatarController.sendAudioChunk() // Stream additional audio
169
- AvatarController.sendKeyframes() // Stream additional animation
194
+ AvatarController.yieldFramesData(keyframes, conversationId) // Requires conversationId
170
195
 
171
196
  AvatarController → AnimationPlayer (synchronized playback)
172
197
 
@@ -178,8 +203,8 @@ RenderSystem → WebGPU/WebGL → Canvas rendering
178
203
  ```
179
204
 
180
205
  **Note:**
181
- - In network mode, users provide audio data, SDK handles network communication and animation data reception
182
- - In external data mode, users provide both audio and animation data, SDK handles synchronized playback only
206
+ - In SDK mode, users provide audio data, SDK handles network communication and animation data reception
207
+ - In Host mode, users provide both audio and animation data, SDK handles synchronized playback only
183
208
 
184
209
  ### Audio Format Requirements
185
210
 
@@ -240,10 +265,11 @@ AvatarKit.cleanup()
240
265
 
241
266
  ### AvatarManager
242
267
 
243
- Character resource manager, responsible for downloading, caching, and loading character data.
268
+ Character resource manager, responsible for downloading, caching, and loading character data. Use the singleton instance via `AvatarManager.shared`.
244
269
 
245
270
  ```typescript
246
- const manager = new AvatarManager()
271
+ // Get singleton instance
272
+ const manager = AvatarManager.shared
247
273
 
248
274
  // Load character
249
275
  const avatar = await manager.load(
@@ -267,22 +293,16 @@ manager.clearCache()
267
293
  import { AvatarPlaybackMode } from '@spatialwalk/avatarkit'
268
294
 
269
295
  // Create view (Canvas is automatically added to container)
270
- // Network mode (default)
296
+ // Create view (playback mode is determined by drivingServiceMode in AvatarKit configuration)
271
297
  const container = document.getElementById('avatar-container')
272
- const avatarView = new AvatarView(avatar: Avatar, {
273
- container: container,
274
- playbackMode: AvatarPlaybackMode.network // Optional, default is 'network'
275
- })
276
-
277
- // External data mode
278
- const avatarView = new AvatarView(avatar: Avatar, {
279
- container: container,
280
- playbackMode: AvatarPlaybackMode.external
281
- })
298
+ const avatarView = new AvatarView(avatar, container)
282
299
 
283
300
  // Get playback mode
284
301
  const mode = avatarView.playbackMode // 'network' | 'external'
285
302
 
303
+ // Wait for first frame to render
304
+ await avatarView.ready // Promise that resolves when the first frame is rendered
305
+
286
306
  // Cleanup resources (must be called before switching characters)
287
307
  avatarView.dispose()
288
308
  ```
@@ -299,12 +319,9 @@ if (currentAvatarView) {
299
319
  const newAvatar = await avatarManager.load('new-character-id')
300
320
 
301
321
  // Create new AvatarView
302
- currentAvatarView = new AvatarView(newAvatar, {
303
- container: container,
304
- playbackMode: AvatarPlaybackMode.network
305
- })
322
+ currentAvatarView = new AvatarView(newAvatar, container)
306
323
 
307
- // Network mode: start connection
324
+ // SDK mode: start connection
308
325
  if (currentAvatarView.playbackMode === AvatarPlaybackMode.network) {
309
326
  await currentAvatarView.controller.start()
310
327
  }
@@ -312,51 +329,87 @@ if (currentAvatarView.playbackMode === AvatarPlaybackMode.network) {
312
329
 
313
330
  ### AvatarController
314
331
 
315
- Audio/animation playback controller (playback layer), manages synchronized playback of audio and animation. Automatically handles WebSocket communication in network mode.
332
+ Audio/animation playback controller (playback layer), manages synchronized playback of audio and animation. Automatically handles WebSocket communication in SDK mode.
316
333
 
317
334
  **Two Usage Patterns:**
318
335
 
319
- #### Network Mode Methods
336
+ #### SDK Mode Methods
320
337
 
321
338
  ```typescript
322
339
  // Start WebSocket service
323
340
  await avatarView.avatarController.start()
324
341
 
325
- // Send audio data (SDK handles receiving animation data automatically)
326
- avatarView.avatarController.send(audioData: ArrayBuffer, end: boolean)
342
+ // Send audio data
343
+ const conversationId = avatarView.avatarController.send(audioData: ArrayBuffer, end: boolean)
344
+ // Returns: conversationId - Conversation ID for this conversation session (used to distinguish each conversation round)
327
345
  // audioData: Audio data (ArrayBuffer format, must be 16kHz mono PCM16)
328
346
  // - Sample rate: 16kHz (16000 Hz) - backend requirement
329
347
  // - Format: PCM16 (16-bit signed integer, little-endian)
330
348
  // - Channels: Mono (single channel)
331
349
  // - Example: 1 second = 16000 samples × 2 bytes = 32000 bytes
332
- // end: false (default) - Normal audio data sending, server will accumulate audio data, automatically returns animation data and starts synchronized playback of animation and audio after accumulating enough data
333
- // end: true - Immediately return animation data, no longer accumulating, used for ending current conversation or scenarios requiring immediate response
350
+ // end: false (default) - Continue sending audio data for current conversation
351
+ // end: true - Mark the end of current conversation round. After end=true, sending new audio data will interrupt any ongoing playback from the previous conversation round
334
352
 
335
353
  // Close WebSocket service
336
354
  avatarView.avatarController.close()
337
355
  ```
338
356
 
339
- #### External Data Mode Methods
357
+ #### Host Mode Methods
340
358
 
341
359
  ```typescript
342
- // Start playback with initial audio and animation data
343
- await avatarView.avatarController.play(
344
- initialAudioChunks?: Array<{ data: Uint8Array, isLast: boolean }>, // Initial audio chunks (16kHz mono PCM16)
345
- initialKeyframes?: any[] // Initial animation keyframes (obtained from your service)
360
+ // Play back existing audio and animation data (starts a new conversation)
361
+ const conversationId = await avatarView.avatarController.playback(
362
+ initialAudioChunks?: Array<{ data: Uint8Array, isLast: boolean }>, // Existing audio chunks (16kHz mono PCM16)
363
+ initialKeyframes?: any[] // Existing animation keyframes (obtained from your service)
346
364
  )
365
+ // Returns: conversationId - New conversation ID for this conversation session
347
366
 
348
- // Stream additional audio chunks (after play() is called)
349
- avatarView.avatarController.sendAudioChunk(
367
+ // Stream additional audio chunks (after playback() is called)
368
+ const conversationId = avatarView.avatarController.yieldAudioData(
350
369
  data: Uint8Array, // Audio chunk data
351
370
  isLast: boolean = false // Whether this is the last chunk
352
371
  )
372
+ // Returns: conversationId - Conversation ID for this audio session
353
373
 
354
- // Stream additional animation keyframes (after play() is called)
355
- avatarView.avatarController.sendKeyframes(
356
- keyframes: any[] // Additional animation keyframes (obtained from your service)
374
+ // Stream additional animation keyframes (after playback() is called)
375
+ avatarView.avatarController.yieldFramesData(
376
+ keyframes: any[], // Additional animation keyframes (obtained from your service)
377
+ conversationId: string // Conversation ID (required). Use getCurrentConversationId() or yieldAudioData() to get conversationId.
357
378
  )
358
379
  ```
359
380
 
381
+ **⚠️ Important: Conversation ID (conversationId) Management**
382
+
383
+ **SDK Mode:**
384
+ - `send()` returns a conversationId to distinguish each conversation round
385
+ - `end=true` marks the end of a conversation round. After `end=true`, sending new audio data will interrupt any ongoing playback from the previous conversation round
386
+
387
+ **Host Mode:**
388
+ For each conversation session, you **must**:
389
+ 1. **First send audio data** to get a conversationId (used to distinguish each conversation round):
390
+ - `playback()` returns a conversationId when playing back existing audio and animation data (replay mode)
391
+ - `yieldAudioData()` returns a conversationId for streaming new audio data
392
+ 2. **Then use that conversationId** to send animation data:
393
+ - `yieldFramesData()` requires a valid conversationId parameter
394
+ - Animation data with mismatched conversationId will be **discarded**
395
+ - Use `getCurrentConversationId()` to retrieve the current active conversationId
396
+
397
+ **Example Flow (Host Mode):**
398
+ ```typescript
399
+ // Step 1: Play back existing data first to get conversationId (or stream new audio)
400
+ const conversationId = await avatarView.avatarController.playback(initialAudioChunks, initialKeyframes)
401
+ // or stream new audio data
402
+ const conversationId = avatarView.avatarController.yieldAudioData(audioChunk, false)
403
+
404
+ // Step 2: Use the conversationId to send animation data
405
+ avatarView.avatarController.yieldFramesData(keyframes, conversationId)
406
+ ```
407
+
408
+ **Why conversationId is required:**
409
+ - Ensures audio and animation data belong to the same conversation session
410
+ - Prevents data from different sessions from being mixed
411
+ - Automatically discards mismatched animation data for data integrity
412
+
360
413
  #### Common Methods (Both Modes)
361
414
 
362
415
  ```typescript
@@ -372,17 +425,22 @@ avatarView.avatarController.interrupt()
372
425
  // Clear all data and resources
373
426
  avatarView.avatarController.clear()
374
427
 
428
+ // Get current conversation ID (for Host mode)
429
+ const conversationId = avatarView.avatarController.getCurrentConversationId()
430
+ // Returns: Current conversationId for the active audio session, or null if no active session
431
+
375
432
  // Set event callbacks
376
- avatarView.avatarController.onConnectionState = (state: ConnectionState) => {} // Network mode only
433
+ avatarView.avatarController.onConnectionState = (state: ConnectionState) => {} // SDK mode only
377
434
  avatarView.avatarController.onAvatarState = (state: AvatarState) => {}
378
435
  avatarView.avatarController.onError = (error: Error) => {}
379
436
  ```
380
437
 
381
438
  **Important Notes:**
382
- - `start()` and `close()` are only available in network mode
383
- - `play()`, `sendAudioChunk()`, and `sendKeyframes()` are only available in external data mode
384
- - `pause()`, `resume()`, `interrupt()`, and `clear()` are available in both modes
439
+ - `start()` and `close()` are only available in SDK mode
440
+ - `playback()`, `yieldAudioData()`, and `yieldFramesData()` are only available in Host mode
441
+ - `pause()`, `resume()`, `interrupt()`, `clear()`, and `getCurrentConversationId()` are available in both modes
385
442
  - The playback mode is determined by `drivingServiceMode` in `AvatarKit.initialize()` and cannot be changed
443
+ - **Conversation ID**: In Host mode, always send audio data first to obtain a conversationId, then use that conversationId when sending animation data. Animation data with mismatched conversationId will be discarded. Use `getCurrentConversationId()` to retrieve the current active conversationId.
386
444
 
387
445
  ## 🔧 Configuration
388
446
 
@@ -391,11 +449,15 @@ avatarView.avatarController.onError = (error: Error) => {}
391
449
  ```typescript
392
450
  interface Configuration {
393
451
  environment: Environment
452
+ drivingServiceMode?: DrivingServiceMode // Optional, default is 'sdk' (SDK mode)
394
453
  }
395
454
  ```
396
455
 
397
456
  **Description:**
398
457
  - `environment`: Specifies the environment (cn/us/test), SDK will automatically use the corresponding API address and WebSocket address based on the environment
458
+ - `drivingServiceMode`: Specifies the driving service mode
459
+ - `DrivingServiceMode.sdk` (default): SDK mode - SDK handles WebSocket communication automatically
460
+ - `DrivingServiceMode.host`: Host mode - Host application provides audio and animation data
399
461
  - `sessionToken`: Set separately via `AvatarKit.setSessionToken()`, not in Configuration
400
462
 
401
463
  ```typescript
@@ -406,28 +468,26 @@ enum Environment {
406
468
  }
407
469
  ```
408
470
 
409
- ### AvatarViewOptions
471
+ ### AvatarView Constructor
410
472
 
411
473
  ```typescript
412
- interface AvatarViewOptions {
413
- playbackMode?: AvatarPlaybackMode // Playback mode, default is 'network'
414
- container?: HTMLElement // Canvas container element
415
- }
474
+ constructor(avatar: Avatar, container: HTMLElement)
416
475
  ```
417
476
 
418
- **Description:**
419
- - `playbackMode`: Specifies the playback mode (`'network'` or `'external'`), default is `'network'`
420
- - `'network'`: SDK handles WebSocket communication, send audio via `send()`
421
- - `'external'`: External components provide audio and animation data, SDK handles synchronized playback
422
- - `container`: Optional container element for Canvas, if not provided, Canvas will be created but not added to DOM
423
- - Canvas automatically uses the container's full dimensions (width and height)
424
- - Canvas aspect ratio adapts to container size - set container dimensions to control the aspect ratio
477
+ **Parameters:**
478
+ - `avatar`: Avatar instance
479
+ - `container`: Canvas container element (required)
480
+ - Canvas automatically uses the container's full dimensions (width and height)
481
+ - Canvas aspect ratio adapts to the container size - set container dimensions to control the aspect ratio
482
+ - Canvas is automatically added to the container
483
+
484
+ **Note:** The playback mode is determined by `drivingServiceMode` in the `AvatarKit.initialize()` configuration, not by a constructor parameter
425
485
  - SDK automatically handles resize events via ResizeObserver
426
486
 
427
487
  ```typescript
428
488
  enum AvatarPlaybackMode {
429
- network = 'network', // Network mode: SDK handles WebSocket communication
430
- external = 'external' // External data mode: External provides data, SDK handles playback
489
+ network = 'network', // SDK mode: SDK handles WebSocket communication
490
+ external = 'external' // Host mode: Host provides data, SDK handles playback
431
491
  }
432
492
  ```
433
493
 
@@ -511,15 +571,12 @@ avatarView.avatarController.onError = (error: Error) => {
511
571
 
512
572
  ### Lifecycle Management
513
573
 
514
- #### Network Mode Lifecycle
574
+ #### SDK Mode Lifecycle
515
575
 
516
576
  ```typescript
517
577
  // Initialize
518
578
  const container = document.getElementById('avatar-container')
519
- const avatarView = new AvatarView(avatar, {
520
- container: container,
521
- playbackMode: AvatarPlaybackMode.network
522
- })
579
+ const avatarView = new AvatarView(avatar, container)
523
580
  await avatarView.avatarController.start()
524
581
 
525
582
  // Use
@@ -530,21 +587,21 @@ avatarView.avatarController.close()
530
587
  avatarView.dispose() // Automatically cleans up all resources
531
588
  ```
532
589
 
533
- #### External Data Mode Lifecycle
590
+ #### Host Mode Lifecycle
534
591
 
535
592
  ```typescript
536
593
  // Initialize
537
594
  const container = document.getElementById('avatar-container')
538
- const avatarView = new AvatarView(avatar, {
539
- container: container,
540
- playbackMode: AvatarPlaybackMode.external
541
- })
595
+ const avatarView = new AvatarView(avatar, container)
542
596
 
543
597
  // Use
544
598
  const initialAudioChunks = [{ data: audioData1, isLast: false }]
545
- await avatarView.avatarController.play(initialAudioChunks, initialKeyframes)
546
- avatarView.avatarController.sendAudioChunk(audioChunk, false)
547
- avatarView.avatarController.sendKeyframes(keyframes)
599
+ // Step 1: Send audio first to get conversationId
600
+ const conversationId = await avatarView.avatarController.playback(initialAudioChunks, initialKeyframes)
601
+ // Step 2: Stream additional audio (returns conversationId)
602
+ const currentConversationId = avatarView.avatarController.yieldAudioData(audioChunk, false)
603
+ // Step 3: Use conversationId to send animation data (mismatched conversationId will be discarded)
604
+ avatarView.avatarController.yieldFramesData(keyframes, currentConversationId || conversationId)
548
605
 
549
606
  // Cleanup
550
607
  avatarView.avatarController.clear() // Clear all data and resources
@@ -554,8 +611,8 @@ avatarView.dispose() // Automatically cleans up all resources
554
611
  **⚠️ Important Notes:**
555
612
  - When disposing AvatarView instances, must call `dispose()` to properly clean up resources
556
613
  - Not properly cleaning up may cause resource leaks and rendering errors
557
- - In network mode, call `close()` before `dispose()` to properly close WebSocket connections
558
- - In external data mode, call `clear()` before `dispose()` to clear all playback data
614
+ - In SDK mode, call `close()` before `dispose()` to properly close WebSocket connections
615
+ - In Host mode, call `clear()` before `dispose()` to clear all playback data
559
616
 
560
617
  ### Memory Optimization
561
618
 
@@ -565,7 +622,7 @@ avatarView.dispose() // Automatically cleans up all resources
565
622
 
566
623
  ### Audio Data Sending
567
624
 
568
- #### Network Mode
625
+ #### SDK Mode
569
626
 
570
627
  The `send()` method receives audio data in `ArrayBuffer` format:
571
628
 
@@ -577,32 +634,39 @@ The `send()` method receives audio data in `ArrayBuffer` format:
577
634
 
578
635
  **Usage:**
579
636
  - `audioData`: Audio data (ArrayBuffer format, must be 16kHz mono PCM16)
580
- - `end=false` (default) - Normal audio data sending, server will accumulate audio data, automatically returns animation data and starts synchronized playback of animation and audio after accumulating enough data
581
- - `end=true` - Immediately return animation data, no longer accumulating, used for ending current conversation or scenarios requiring immediate response
637
+ - `end=false` (default) - Continue sending audio data for current conversation
638
+ - `end=true` - Mark the end of current conversation round. After `end=true`, sending new audio data will interrupt any ongoing playback from the previous conversation round
582
639
  - **Important**: No need to wait for `end=true` to start playing, it will automatically start playing after accumulating enough audio data
583
640
 
584
- #### External Data Mode
641
+ #### Host Mode
585
642
 
586
- The `play()` method starts playback with initial data, then use `sendAudioChunk()` to stream additional audio:
643
+ The `playback()` method is used to play back existing audio and animation data (replay mode), generating a new conversationId and interrupting any existing conversation. Then use `yieldAudioData()` to stream additional audio:
587
644
 
588
645
  **Audio Format Requirements:**
589
- - Same as network mode: 16kHz mono PCM16 format
646
+ - Same as SDK mode: 16kHz mono PCM16 format
590
647
  - Audio data should be provided as `Uint8Array` in chunks with `isLast` flag
591
648
 
592
649
  **Usage:**
593
650
  ```typescript
594
- // Start playback with initial audio and animation data
651
+ // Play back existing audio and animation data (starts a new conversation)
595
652
  // Note: Audio and animation data should be obtained from your backend service
596
653
  const initialAudioChunks = [
597
654
  { data: audioData1, isLast: false },
598
655
  { data: audioData2, isLast: false }
599
656
  ]
600
- await avatarController.play(initialAudioChunks, initialKeyframes)
657
+ const conversationId = await avatarController.playback(initialAudioChunks, initialKeyframes)
658
+ // Returns: conversationId - New conversation ID for this conversation session
601
659
 
602
660
  // Stream additional audio chunks
603
- avatarController.sendAudioChunk(audioChunk, isLast)
661
+ const conversationId = avatarController.yieldAudioData(audioChunk, isLast)
662
+ // Returns: conversationId - Conversation ID for this audio session
604
663
  ```
605
664
 
665
+ **⚠️ Conversation ID Workflow:**
666
+ 1. **Play back existing data or send audio first** → Get conversationId from `playback()` (for existing data) or `yieldAudioData()` (for streaming)
667
+ 2. **Send animation with conversationId** → Use the conversationId from step 1 in `yieldFramesData()`
668
+ 3. **Data matching** → Only animation data with matching conversationId will be accepted
669
+
606
670
  **Resampling (Both Modes):**
607
671
  - If your audio source is at a different sample rate (e.g., 24kHz, 48kHz), you **must** resample it to 16kHz before sending
608
672
  - For high-quality resampling, use Web Audio API's `OfflineAudioContext` with anti-aliasing filtering
@@ -1,7 +1,7 @@
1
1
  var C = Object.defineProperty;
2
2
  var g = (h, t, e) => t in h ? C(h, t, { enumerable: !0, configurable: !0, writable: !0, value: e }) : h[t] = e;
3
3
  var i = (h, t, e) => g(h, typeof t != "symbol" ? t + "" : t, e);
4
- import { A as m, e as f, a as c, l as u } from "./index-Dsokgngg.js";
4
+ import { A as m, e as f, a as c, l as u } from "./index-CuR_S9Ng.js";
5
5
  class y {
6
6
  constructor(t) {
7
7
  // AudioContext is managed internally
@@ -331,4 +331,4 @@ class y {
331
331
  export {
332
332
  y as StreamingAudioPlayer
333
333
  };
334
- //# sourceMappingURL=StreamingAudioPlayer-COgQTrz3.js.map
334
+ //# sourceMappingURL=StreamingAudioPlayer-D5P7mU8B.js.map