@drawdream/livespeech 0.1.10 → 0.1.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6)

package/README.md CHANGED

@@ -10,7 +10,7 @@ A TypeScript/JavaScript SDK for real-time speech-to-speech AI conversations.
 - 🎙️ **Real-time Voice Conversations** - Natural, low-latency voice interactions
 - 🌐 **Multi-language Support** - Korean, English, Japanese, Chinese, and more
 - 🔊 **Streaming Audio** - Send and receive audio in real-time
-- 📝 **Live Transcription** - Get transcriptions of both user and AI speech
+- ⏹️ **Barge-in Support** - Interrupt AI mid-speech by talking or programmatically
 - 🔄 **Auto-reconnection** - Automatic recovery from network issues
 - 🌐 **Browser & Node.js** - Works in both environments
@@ -18,13 +18,9 @@ A TypeScript/JavaScript SDK for real-time speech-to-speech AI conversations.
 ```bash
 npm install @drawdream/livespeech
-# or
-yarn add @drawdream/livespeech
-# or
-pnpm add @drawdream/livespeech
 ```
-## Quick Start
+## Quick Start (5 minutes)
 ```typescript
 import { LiveSpeechClient } from '@drawdream/livespeech';
@@ -34,31 +30,28 @@ const client = new LiveSpeechClient({
   apiKey: 'your-api-key',
 });
-// Set up event handlers
-client.setUserTranscriptHandler((text) => {
-  console.log('You:', text);
+// Handle only 4 essential events!
+client.setAudioHandler((audioData) => {
+  audioPlayer.queue(audioData);  // PCM16 @ 24kHz
 });
-client.setResponseHandler((text, isFinal) => {
-  console.log('AI:', text);
+client.on('interrupted', () => {
+  audioPlayer.clear();  // CRITICAL: Clear buffer on interrupt!
 });
-client.setAudioHandler((audioData) => {
-  playAudio(audioData);  // PCM16 @ 24kHz
+client.on('turnComplete', () => {
+  console.log('AI finished');
 });
 client.setErrorHandler((error) => {
   console.error('Error:', error.message);
 });
-// Connect and start conversation
+// Connect and start
 await client.connect();
-await client.startSession({
-  prePrompt: 'You are a helpful assistant.',
-  language: 'ko-KR',
-});
+await client.startSession({ prePrompt: 'You are a helpful assistant.' });
-// Stream audio
+// Send audio
 client.audioStart();
 client.sendAudioChunk(pcmData);  // PCM16 @ 16kHz
 client.audioEnd();
@@ -68,380 +61,224 @@ await client.endSession();
 client.disconnect();
 ```
-## Audio Flow
+---
-```
-connect() → startSession() → audioStart() → sendAudioChunk()* → audioEnd() → endSession()
-                                    ↓
-                          sendSystemMessage() (optional, during live session)
-                          sendToolResponse() (when toolCall received)
-```
+# Core API
-| Step | Description |
-|------|-------------|
-| `connect()` | Establish WebSocket connection |
-| `startSession(config)` | Start conversation with optional system prompt |
-| `audioStart()` | Begin audio streaming |
-| `sendAudioChunk(data)` | Send PCM16 audio (call multiple times) |
-| `sendSystemMessage(msg)` | Inject context or trigger AI response (optional) |
-| `sendToolResponse(id, result)` | Send function result back to AI (after toolCall) |
-| `updateUserId(userId)` | Migrate guest session to user account |
-| `audioEnd()` | End streaming, triggers AI response |
-| `endSession()` | End conversation |
+Everything you need for basic voice conversations.
+## Methods
+| Method | Description |
+|--------|-------------|
+| `connect()` | Establish connection |
 | `disconnect()` | Close connection |
+| `startSession(config)` | Start conversation with system prompt |
+| `endSession()` | End conversation |
+| `sendAudioChunk(data)` | Send PCM16 audio (16kHz) |
+## Events
+| Event | Description | Action Required |
+|-------|-------------|-----------------|
+| `audio` | AI's audio output | Play audio (PCM16 @ 24kHz) |
+| `turnComplete` | AI finished speaking | Ready for next input |
+| `interrupted` | User barged in | **Clear audio buffer!** |
+| `error` | Error occurred | Handle/log error |
+### ⚠️ Critical: Handle `interrupted`
+When the user speaks while AI is responding, **you must clear your audio buffer**:
+```typescript
+client.on('interrupted', () => {
+  audioPlayer.clear();  // Stop buffered audio immediately
+  audioPlayer.stop();
+});
+```
+Without this, 2-3 seconds of buffered audio continues playing after the user interrupts.
+## Audio Format
+| Direction | Format | Sample Rate |
+|-----------|--------|-------------|
+| Input (mic) | PCM16 | 16,000 Hz |
+| Output (AI) | PCM16 | 24,000 Hz |
 ## Configuration
 ```typescript
 const client = new LiveSpeechClient({
-  region: 'ap-northeast-2',       // Required: Seoul region
-  apiKey: 'your-api-key',         // Required: Your API key
-  userId: 'user-123',             // Optional: Enable conversation memory
-  autoReconnect: true,            // Auto-reconnect on disconnect
-  maxReconnectAttempts: 5,        // Maximum reconnection attempts
-  debug: false,                   // Enable debug logging
+  region: 'ap-northeast-2',       // Required
+  apiKey: 'your-api-key',         // Required
 });
 await client.startSession({
   prePrompt: 'You are a helpful assistant.',
-  language: 'ko-KR',              // Language: ko-KR, en-US, ja-JP, etc.
-  pipelineMode: 'live',           // 'live' (default) or 'composed'
-  aiSpeaksFirst: false,           // AI speaks first (live mode only)
-  allowHarmCategory: false,       // Disable safety filtering (use with caution)
-  tools: [{ name: 'func', description: 'desc', parameters: {...} }],  // Function calling
+  language: 'ko-KR',              // Optional: ko-KR, en-US, ja-JP, etc.
 });
 ```
-## Session Options
+---
-| Option | Type | Default | Description |
-|--------|------|---------|-------------|
-| `prePrompt` | `string` | - | System prompt for the AI assistant |
-| `language` | `string` | `'en-US'` | Language code (e.g., `ko-KR`, `ja-JP`) |
-| `pipelineMode` | `'live' \| 'composed'` | `'live'` | Audio processing mode |
-| `aiSpeaksFirst` | `boolean` | `false` | AI initiates conversation (live mode only) |
-| `allowHarmCategory` | `boolean` | `false` | Disable content safety filtering |
-| `tools` | `Tool[]` | `undefined` | Function definitions for AI to call |
+# Advanced API
-### Pipeline Modes
+Optional features for power users.
-| Mode | Latency | Description |
-|------|---------|-------------|
-| `live` | Lower (~300ms) | Direct audio-to-audio via Live API |
-| `composed` | Higher (~1-2s) | Separate STT → LLM → TTS pipeline |
+## Additional Methods
-### AI Speaks First
+| Method | Description |
+|--------|-------------|
+| `audioStart()` / `audioEnd()` | Manual audio stream control |
+| `interrupt()` | Explicitly stop AI response (for Stop button) |
+| `sendSystemMessage(msg)` | Inject context during conversation |
+| `sendToolResponse(id, result)` | Reply to function calls |
+| `updateUserId(userId)` | Migrate guest to authenticated user |
-When `aiSpeaksFirst: true`, the AI will immediately speak a greeting based on your `prePrompt`:
+## Additional Events
-```typescript
-await client.startSession({
-  prePrompt: 'You are a customer service agent. Greet the customer warmly and ask how you can help.',
-  aiSpeaksFirst: true,
-});
+| Event | Description |
+|-------|-------------|
+| `connected` / `disconnected` | Connection lifecycle |
+| `sessionStarted` / `sessionEnded` | Session lifecycle |
+| `ready` | Session ready for audio |
+| `userTranscript` | User's speech transcribed |
+| `response` | AI's response text |
+| `toolCall` | AI wants to call a function |
+| `userIdUpdated` | Guest-to-user migration complete |
+---
+## Explicit Interrupt (Stop Button)
+For UI "Stop" buttons or programmatic control:
-client.audioStart();  // AI greeting plays immediately
+```typescript
+// User clicks Stop button
+client.interrupt();
 ```
-> ⚠️ **Note**: Only works with `pipelineMode: 'live'`
+Note: Voice barge-in works automatically via Gemini's VAD. This method is for explicit control.
+---
-### Content Safety
+## System Messages
-By default, LLM applies content safety filtering. Set `allowHarmCategory: true` to disable:
+Inject text context during live sessions (game events, app state, etc.):
 ```typescript
-await client.startSession({
-  allowHarmCategory: true,  // ⚠️ Disables all safety filters
-});
+// AI responds immediately
+client.sendSystemMessage("User completed level 5. Congratulate them!");
+// Context only, no response
+client.sendSystemMessage({ text: "User is browsing", triggerResponse: false });
 ```
-> ⚠️ **Warning**: Only use in controlled environments where content moderation is handled by other means.
+> Requires active live session (`audioStart()` called). Max 500 characters.
+---
 ## Function Calling (Tool Use)
-Define functions that the AI can call during conversation. When the AI decides to call a function, you receive a `toolCall` event and must respond with `sendToolResponse()`.
+Let AI call functions in your app:
-### Define Tools
+### 1. Define Tools
 ```typescript
-const tools = [
-  {
-    name: 'open_login',
-    description: 'Opens Google Login popup when user wants to sign in',
-    parameters: { type: 'OBJECT', properties: {}, required: [] }
-  },
-  {
-    name: 'get_price',
-    description: 'Gets product price by ID',
-    parameters: {
-      type: 'OBJECT',
-      properties: {
-        productId: { type: 'string', description: 'Product ID' }
-      },
-      required: ['productId']
-    }
+const tools = [{
+  name: 'get_price',
+  description: 'Gets product price by ID',
+  parameters: {
+    type: 'OBJECT',
+    properties: { productId: { type: 'string' } },
+    required: ['productId']
   }
-];
+}];
 await client.startSession({
-  prePrompt: 'You are a helpful assistant. Use tools when appropriate.',
+  prePrompt: 'You are helpful.',
   tools,
 });
 ```
-### Handle Tool Calls
+### 2. Handle toolCall Events
 ```typescript
 client.on('toolCall', (event) => {
-  console.log('AI wants to call:', event.name);
-  console.log('With arguments:', event.args);
-  if (event.name === 'open_login') {
-    showLoginModal();
-    client.sendToolResponse(event.id, { success: true });
-  }
   if (event.name === 'get_price') {
-    const price = getProductPrice(event.args.productId);
-    client.sendToolResponse(event.id, { price, currency: 'USD' });
+    const price = lookupPrice(event.args.productId);
+    client.sendToolResponse(event.id, { price });
   }
 });
 ```
-### Tool Interface
-```typescript
-interface Tool {
-  name: string;                    // Function name
-  description: string;             // When AI should use this
-  parameters?: {
-    type: 'OBJECT';
-    properties: Record<string, unknown>;
-    required?: string[];
-  };
-}
-```
-> ⚠️ **Note**: Function calling only works with `pipelineMode: 'live'`
-## System Messages
-During an active live session, you can inject text messages to the AI using `sendSystemMessage()`. This is useful for:
-- Game events ("User completed level 5, congratulate them!")
-- App state changes ("User opened the cart with 3 items")
-- Timer/engagement triggers ("User has been quiet, engage them")
-- External data updates ("Weather changed to rainy")
-### Usage
-```typescript
-// Simple usage - AI responds immediately
-client.sendSystemMessage("User just completed level 5. Congratulate them!");
-// With options - context only, no immediate response
-client.sendSystemMessage({
-  text: "User is browsing the cart",
-  triggerResponse: false
-});
-```
-### Parameters
-| Parameter | Type | Required | Default | Description |
-|-----------|------|----------|---------|-------------|
-| `text` | `string` | Yes | - | Message text (max 500 chars) |
-| `triggerResponse` | `boolean` | No | `true` | AI responds immediately if `true` |
-> ⚠️ **Note**: Requires an active live session (`audioStart()` must have been called). Only works with `pipelineMode: 'live'`.
+---
 ## Conversation Memory
-When you provide a `userId`, the SDK enables persistent conversation memory:
-- **Entity Memory**: AI remembers facts shared in previous sessions (names, preferences, relationships)
-- **Session Summaries**: Recent conversation summaries are available to the AI
-- **Cross-Session**: Memory persists across sessions for the same `userId`
+Enable persistent memory across sessions:
 ```typescript
-// With memory (authenticated user)
 const client = new LiveSpeechClient({
   region: 'ap-northeast-2',
   apiKey: 'your-api-key',
-  userId: 'user-123',  // Enables conversation memory
-});
-// Without memory (guest)
-const client = new LiveSpeechClient({
-  region: 'ap-northeast-2',
-  apiKey: 'your-api-key',
-  // No userId = guest mode, no persistent memory
+  userId: 'user-123',  // Enables memory
 });
 ```
-| Mode | Memory Persistence | Use Case |
-|------|-------------------|----------|
-| With `userId` | Permanent | Authenticated users |
-| Without `userId` | Session only | Guests, anonymous users |
+| Mode | Memory |
+|------|--------|
+| With `userId` | Permanent (entities, summaries) |
+| Without `userId` | Session only (guest) |
 ### Guest-to-User Migration
-When a guest user logs in during a session, you can migrate their conversation history to their user account:
 ```typescript
-// User logs in after chatting as guest
-client.on('userIdUpdated', (event) => {
-  console.log(`Migrated ${event.migratedMessages} messages to user ${event.userId}`);
-});
-// After authentication
+// User logs in during session
 await client.updateUserId('authenticated-user-123');
-```
-This enables:
-- Entity extraction on guest conversation history
-- Conversation continuity across sessions
-- Personalization based on past interactions
-## Events
-| Event | Description | Key Properties |
-|-------|-------------|----------------|
-| `connected` | Connection established | `connectionId` |
-| `disconnected` | Connection closed | `reason`, `code` |
-| `sessionStarted` | Session created | `sessionId` |
-| `ready` | Ready for audio input | `timestamp` |
-| `userTranscript` | Your speech transcribed | `text` |
-| `response` | AI's response text | `text`, `isFinal` |
-| `audio` | AI's audio output | `data`, `sampleRate` |
-| `turnComplete` | AI finished speaking | `timestamp` |
-| `toolCall` | AI wants to call a function | `id`, `name`, `args` |
-| `userIdUpdated` | Guest migrated to user account | `userId`, `migratedMessages` |
-| `error` | Error occurred | `code`, `message` |
-### Simple Handlers
-```typescript
-// Your speech transcription
-client.setUserTranscriptHandler((text) => {
-  console.log('You said:', text);
-});
-// AI's text response
-client.setResponseHandler((text, isFinal) => {
-  console.log('AI:', text, isFinal ? '(done)' : '...');
-});
-// AI's audio output
-client.setAudioHandler((data: Uint8Array) => {
-  // data: PCM16 audio
-  // Sample rate: 24000 Hz
-  playAudio(data);
-});
-// Error handling
-client.setErrorHandler((error) => {
-  console.error(`Error [${error.code}]: ${error.message}`);
-});
-// Tool calls (function calling)
-client.on('toolCall', (event) => {
-  // Execute function and send result
-  const result = executeFunction(event.name, event.args);
-  client.sendToolResponse(event.id, result);
-});
-// Guest-to-user migration
+// Listen for confirmation
 client.on('userIdUpdated', (event) => {
-  console.log(`Logged in as ${event.userId}, migrated ${event.migratedMessages} messages`);
+  console.log(`Migrated ${event.migratedMessages} messages`);
 });
 ```
-### Full Event API
+---
-```typescript
-client.on('connected', (event) => {
-  console.log('Connected:', event.connectionId);
-});
+## AI Speaks First
-client.on('ready', () => {
-  console.log('Ready for audio');
-});
-client.on('userTranscript', (event) => {
-  console.log('You:', event.text);
-});
-client.on('response', (event) => {
-  console.log('AI:', event.text, event.isFinal);
-});
-client.on('audio', (event) => {
-  // event.data: Uint8Array (PCM16)
-  // event.sampleRate: 24000
-  playAudio(event.data);
-});
-client.on('turnComplete', () => {
-  console.log('AI finished speaking');
-});
-client.on('error', (event) => {
-  console.error('Error:', event.code, event.message);
-});
+AI initiates the conversation:
-client.on('toolCall', (event) => {
-  // event.id: string - use with sendToolResponse
-  // event.name: string - function name
-  // event.args: object - function arguments
-  const result = handleToolCall(event.name, event.args);
-  client.sendToolResponse(event.id, result);
+```typescript
+await client.startSession({
+  prePrompt: 'Greet the customer warmly.',
+  aiSpeaksFirst: true,
 });
-client.on('userIdUpdated', (event) => {
-  // event.userId: string - the new user ID
-  // event.migratedMessages: number - count of migrated messages
-  console.log(`Migrated ${event.migratedMessages} messages to ${event.userId}`);
-});
+client.audioStart();  // AI speaks immediately
 ```
-## Audio Format
-### Input (Your Microphone)
+---
-| Property | Value |
-|----------|-------|
-| Format | PCM16 (16-bit signed, little-endian) |
-| Sample Rate | 16,000 Hz |
-| Channels | 1 (Mono) |
-| Chunk Size | ~3200 bytes (100ms) |
+## Session Options
-### Output (AI Response)
+| Option | Default | Description |
+|--------|---------|-------------|
+| `prePrompt` | - | System prompt |
+| `language` | `'en-US'` | Language code |
+| `pipelineMode` | `'live'` | `'live'` (~300ms) or `'composed'` (~1-2s) |
+| `aiSpeaksFirst` | `false` | AI initiates (live mode only) |
+| `allowHarmCategory` | `false` | Disable safety filters |
+| `tools` | `[]` | Function definitions |
-| Property | Value |
-|----------|-------|
-| Format | PCM16 (16-bit signed, little-endian) |
-| Sample Rate | 24,000 Hz |
-| Channels | 1 (Mono) |
+---
 ## Browser Example
 ```typescript
 import { LiveSpeechClient, float32ToInt16, int16ToUint8 } from '@drawdream/livespeech';
-const client = new LiveSpeechClient({
-  region: 'ap-northeast-2',
-  apiKey: 'your-api-key',
-});
-// Handlers
-client.setUserTranscriptHandler((text) => console.log('You:', text));
-client.setResponseHandler((text) => console.log('AI:', text));
-client.setAudioHandler((data) => playAudioChunk(data));
-// Connect
-await client.connect();
-await client.startSession({ prePrompt: 'You are a helpful assistant.' });
 // Capture microphone
 const stream = await navigator.mediaDevices.getUserMedia({
   audio: { sampleRate: 16000, channelCount: 1 }
@@ -460,60 +297,30 @@ processor.onaudioprocess = (e) => {
 source.connect(processor);
 processor.connect(audioContext.destination);
-// Start streaming
-client.audioStart();
-// Stop later
-client.audioEnd();
-stream.getTracks().forEach(track => track.stop());
 ```
+---
 ## Audio Utilities
 ```typescript
-import {
-  float32ToInt16,    // Web Audio Float32 → PCM16
-  int16ToFloat32,    // PCM16 → Float32
-  int16ToUint8,      // Int16Array → Uint8Array
-  uint8ToInt16,      // Uint8Array → Int16Array
-  wrapPcmInWav,      // Create WAV file
-  AudioEncoder,      // Base64 encoding/decoding
-} from '@drawdream/livespeech';
-// Convert Web Audio to PCM16 for sending
-const float32 = audioBuffer.getChannelData(0);
-const int16 = float32ToInt16(float32);
-const pcmBytes = int16ToUint8(int16);
-client.sendAudioChunk(pcmBytes);
-// Convert received PCM16 to Web Audio
-const receivedInt16 = uint8ToInt16(audioEvent.data);
-const float32Data = int16ToFloat32(receivedInt16);
+import { float32ToInt16, int16ToUint8, wrapPcmInWav } from '@drawdream/livespeech';
+const int16 = float32ToInt16(float32Data);
+const bytes = int16ToUint8(int16);
+const wav = wrapPcmInWav(bytes, 16000, 1, 16);
 ```
+---
 ## Error Handling
 ```typescript
 client.on('error', (event) => {
   switch (event.code) {
-    case 'authentication_failed':
-      console.error('Invalid API key');
-      break;
-    case 'connection_timeout':
-      console.error('Connection timed out');
-      break;
-    case 'rate_limit':
-      console.error('Rate limit exceeded');
-      break;
-    default:
-      console.error(`Error: ${event.message}`);
-  }
-});
-client.on('disconnected', (event) => {
-  if (event.reason === 'error') {
-    console.log('Will auto-reconnect...');
+    case 'authentication_failed': console.error('Invalid API key'); break;
+    case 'connection_timeout': console.error('Timed out'); break;
+    default: console.error(`Error: ${event.message}`);
   }
 });
@@ -522,44 +329,13 @@ client.on('reconnecting', (event) => {
 });
 ```
-## Client Properties
-| Property | Type | Description |
-|----------|------|-------------|
-| `isConnected` | `boolean` | Connection status |
-| `hasActiveSession` | `boolean` | Session status |
-| `isAudioStreaming` | `boolean` | Streaming status |
-| `connectionId` | `string \| null` | Current connection ID |
-| `currentSessionId` | `string \| null` | Current session ID |
+---
 ## Regions
-| Region | Code | Location |
-|--------|------|----------|
-| Asia Pacific (Seoul) | `ap-northeast-2` | Korea |
-## TypeScript Types
-```typescript
-import type {
-  LiveSpeechConfig,
-  SessionConfig,
-  LiveSpeechEvent,
-  ConnectedEvent,
-  DisconnectedEvent,
-  SessionStartedEvent,
-  ReadyEvent,
-  UserTranscriptEvent,
-  ResponseEvent,
-  AudioEvent,
-  TurnCompleteEvent,
-  ToolCallEvent,
-  UserIdUpdatedEvent,
-  ErrorEvent,
-  ErrorCode,
-  Tool,
-} from '@drawdream/livespeech';
-```
+| Region | Code |
+|--------|------|
+| Seoul (Korea) | `ap-northeast-2` |
 ## License
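
The new README stresses clearing buffered audio when `interrupted` fires, but the `audioPlayer` it calls is not part of the package. Below is a minimal sketch of what such a buffer might look like. The class name `PcmPlaybackQueue` and all of its methods are hypothetical, and the 24 kHz PCM16 mono output format is taken from the README's audio format table.

```typescript
// Hypothetical playback queue; NOT part of @drawdream/livespeech.
// It only models buffering, so the effect of clear() can be seen in isolation.
class PcmPlaybackQueue {
  private chunks: Uint8Array[] = [];

  // Buffer one chunk of PCM16 audio received from the audio handler.
  queue(chunk: Uint8Array): void {
    this.chunks.push(chunk);
  }

  // Hand the next chunk to the audio output, or undefined when empty.
  next(): Uint8Array | undefined {
    return this.chunks.shift();
  }

  // Drop everything still waiting; returns the number of bytes discarded.
  clear(): number {
    const dropped = this.chunks.reduce((sum, c) => sum + c.byteLength, 0);
    this.chunks = [];
    return dropped;
  }

  // Milliseconds of audio still buffered.
  // PCM16 mono uses 2 bytes per sample, so bytes / (sampleRate * 2) is seconds.
  bufferedMs(sampleRate: number = 24000): number {
    const bytes = this.chunks.reduce((sum, c) => sum + c.byteLength, 0);
    return (bytes / (sampleRate * 2)) * 1000;
  }
}
```

Under these assumptions, a caller would register `client.on('interrupted', () => queue.clear())`. Two buffered chunks of 48,000 bytes each amount to 2,000 ms of audio, which is the order of backlog the README warns would keep playing if the buffer were not cleared.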

package/dist/index.d.mts CHANGED

@@ -201,7 +201,7 @@ interface ResolvedConfig {
 /**
  * Event types emitted by the LiveSpeech client
  */
-type LiveSpeechEventType = 'connected' | 'disconnected' | 'reconnecting' | 'sessionStarted' | 'sessionEnded' | 'ready' | 'userTranscript' | 'response' | 'audio' | 'turnComplete' | 'toolCall' | 'userIdUpdated' | 'error';
+type LiveSpeechEventType = 'connected' | 'disconnected' | 'reconnecting' | 'sessionStarted' | 'sessionEnded' | 'ready' | 'userTranscript' | 'response' | 'audio' | 'turnComplete' | 'toolCall' | 'userIdUpdated' | 'interrupted' | 'error';
 /**
  * Event payload for 'connected' event
  */
@@ -357,10 +357,30 @@ interface UserIdUpdatedEvent {
     migratedMessages: number;
     timestamp: string;
 }
+/**
+ * Event payload for 'interrupted' event (barge-in)
+ * Indicates the AI response was interrupted because the user started speaking.
+ *
+ * **Critical**: When you receive this event, immediately clear your audio playback
+ * buffer to stop the AI audio from continuing to play. This enables natural
+ * barge-in behavior like a real phone conversation.
+ *
+ * @example
+ * client.on('interrupted', (event) => {
+ *   // Stop playing AI audio immediately
+ *   audioPlayer.clearBuffer();
+ *   audioPlayer.stop();
+ *   console.log('AI interrupted - ready for user input');
+ * });
+ */
+interface InterruptedEvent {
+    type: 'interrupted';
+    timestamp: string;
+}
 /**
  * Union type of all event payloads
  */
-type LiveSpeechEvent = ConnectedEvent | DisconnectedEvent | ReconnectingEvent | SessionStartedEvent | SessionEndedEvent | ReadyEvent | UserTranscriptEvent | ResponseEvent | AudioEvent | TurnCompleteEvent | ToolCallEvent | UserIdUpdatedEvent | ErrorEvent;
+type LiveSpeechEvent = ConnectedEvent | DisconnectedEvent | ReconnectingEvent | SessionStartedEvent | SessionEndedEvent | ReadyEvent | UserTranscriptEvent | ResponseEvent | AudioEvent | TurnCompleteEvent | ToolCallEvent | UserIdUpdatedEvent | InterruptedEvent | ErrorEvent;
 /**
  * Simplified event handlers for common use cases
  */
@@ -372,11 +392,11 @@ type ErrorHandler = (error: ErrorEvent) => void;
 /**
  * WebSocket message types sent from client to server
  */
-type ClientMessageType = 'startSession' | 'endSession' | 'audioStart' | 'audioChunk' | 'audioEnd' | 'systemMessage' | 'toolResponse' | 'updateUserId' | 'ping';
+type ClientMessageType = 'startSession' | 'endSession' | 'audioStart' | 'audioChunk' | 'audioEnd' | 'systemMessage' | 'toolResponse' | 'updateUserId' | 'interrupt' | 'ping';
 /**
  * WebSocket message types received from server
  */
-type ServerMessageType = 'sessionStarted' | 'sessionEnded' | 'ready' | 'userTranscript' | 'response' | 'audio' | 'turnComplete' | 'toolCall' | 'userIdUpdated' | 'error' | 'pong';
+type ServerMessageType = 'sessionStarted' | 'sessionEnded' | 'ready' | 'userTranscript' | 'response' | 'audio' | 'turnComplete' | 'toolCall' | 'userIdUpdated' | 'interrupted' | 'error' | 'pong';
 /**
  * Base interface for client messages
  */
@@ -466,10 +486,16 @@ interface UpdateUserIdMessage extends BaseClientMessage {
     /** The authenticated user's unique identifier */
     userId: string;
 }
+/**
+ * Interrupt message - explicitly stop AI response (for Stop button)
+ */
+interface InterruptMessage extends BaseClientMessage {
+    action: 'interrupt';
+}
 /**
  * Union type of all client messages
  */
-type ClientMessage = StartSessionMessage | EndSessionMessage | AudioStartMessage | AudioChunkMessage | AudioEndMessage | SystemMessageMessage | ToolResponseMessage | UpdateUserIdMessage | PingMessage;
+type ClientMessage = StartSessionMessage | EndSessionMessage | AudioStartMessage | AudioChunkMessage | AudioEndMessage | SystemMessageMessage | ToolResponseMessage | UpdateUserIdMessage | InterruptMessage | PingMessage;
 /**
  * Base interface for server messages
  */
@@ -567,10 +593,18 @@ interface ServerUserIdUpdatedMessage extends BaseServerMessage {
     /** Number of messages migrated from guest to user partition */
     migratedMessages: number;
 }
+/**
+ * Interrupted message from server (barge-in)
+ * Indicates the AI response was interrupted because the user started speaking.
+ * Clients should immediately clear their audio playback buffer when receiving this.
+ */
+interface ServerInterruptedMessage extends BaseServerMessage {
+    type: 'interrupted';
+}
 /**
  * Union type of all server messages
  */
-type ServerMessage = ServerSessionStartedMessage | ServerSessionEndedMessage | ServerReadyMessage | ServerUserTranscriptMessage | ServerResponseMessage | ServerAudioMessage | ServerTurnCompleteMessage | ServerToolCallMessage | ServerUserIdUpdatedMessage | ServerErrorMessage | ServerPongMessage;
+type ServerMessage = ServerSessionStartedMessage | ServerSessionEndedMessage | ServerReadyMessage | ServerUserTranscriptMessage | ServerResponseMessage | ServerAudioMessage | ServerTurnCompleteMessage | ServerToolCallMessage | ServerUserIdUpdatedMessage | ServerInterruptedMessage | ServerErrorMessage | ServerPongMessage;
 /**
  * Connection state
@@ -593,6 +627,7 @@ type LiveSpeechEventMap = {
     turnComplete: TurnCompleteEvent;
     toolCall: ToolCallEvent;
     userIdUpdated: UserIdUpdatedEvent;
+    interrupted: InterruptedEvent;
     error: ErrorEvent;
 };
 /**
@@ -710,6 +745,26 @@ declare class LiveSpeechClient {
      * });
      */
     sendToolResponse(id: string, response?: unknown): void;
+    /**
+     * Explicitly interrupt the current AI response
+     *
+     * Use this method for:
+     * - UI "Stop" button functionality
+     * - Programmatic control to stop AI mid-response
+     *
+     * Note: In most cases, simply speaking will trigger automatic
+     * interruption via Gemini's voice activity detection (VAD).
+     * This method is for explicit programmatic control.
+     *
+     * @example
+     * // User clicks "Stop" button
+     * client.interrupt();
+     *
+     * @example
+     * // Stop AI after a certain time
+     * setTimeout(() => client.interrupt(), 10000);
+     */
+    interrupt(): void;
     /**
      * Update the user ID for the current connection (guest-to-user migration)
      *
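
Because `InterruptedEvent` joins the `LiveSpeechEvent` union, code that switches exhaustively on `type` will need a new branch. The sketch below shows the idea with local stand-in types. `InterruptedEvent` copies the shape from the hunk above, while the `TurnCompleteEvent` and `ErrorEvent` shapes, `SketchEvent`, `PlaybackAction`, and `playbackActionFor` are all assumptions made for illustration; real code would import the package's exported types instead.

```typescript
// Local stand-ins; only InterruptedEvent's shape is confirmed by the diff.
interface InterruptedEvent { type: 'interrupted'; timestamp: string; }
// Assumed shape: the README lists `timestamp` for turnComplete.
interface TurnCompleteEvent { type: 'turnComplete'; timestamp: string; }
// Assumed shape: the README lists `code` and `message` for error.
interface ErrorEvent { type: 'error'; code: string; message: string; }

// A reduced union for the sketch (the real LiveSpeechEvent has more members).
type SketchEvent = InterruptedEvent | TurnCompleteEvent | ErrorEvent;

// Hypothetical: what the playback layer should do in response to an event.
type PlaybackAction = 'flush' | 'idle' | 'none';

function playbackActionFor(event: SketchEvent): PlaybackAction {
  switch (event.type) {
    case 'interrupted':
      return 'flush'; // barge-in: discard buffered AI audio
    case 'turnComplete':
      return 'idle';  // AI finished: let the buffer drain normally
    case 'error':
      return 'none';
    default: {
      // Compile-time exhaustiveness check: adding a union member without a
      // matching case makes this assignment a type error.
      const unreachable: never = event;
      return unreachable;
    }
  }
}
```

The `never` assignment in the default branch is a common TypeScript pattern for catching union members that were added without a matching case.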

package/dist/index.d.ts CHANGED

@@ -201,7 +201,7 @@ interface ResolvedConfig {
 /**
  * Event types emitted by the LiveSpeech client
  */
-type LiveSpeechEventType = 'connected' | 'disconnected' | 'reconnecting' | 'sessionStarted' | 'sessionEnded' | 'ready' | 'userTranscript' | 'response' | 'audio' | 'turnComplete' | 'toolCall' | 'userIdUpdated' | 'error';
+type LiveSpeechEventType = 'connected' | 'disconnected' | 'reconnecting' | 'sessionStarted' | 'sessionEnded' | 'ready' | 'userTranscript' | 'response' | 'audio' | 'turnComplete' | 'toolCall' | 'userIdUpdated' | 'interrupted' | 'error';
 /**
  * Event payload for 'connected' event
  */
@@ -357,10 +357,30 @@ interface UserIdUpdatedEvent {
     migratedMessages: number;
     timestamp: string;
 }
+/**
+ * Event payload for 'interrupted' event (barge-in)
+ * Indicates the AI response was interrupted because the user started speaking.
+ *
+ * **Critical**: When you receive this event, immediately clear your audio playback
+ * buffer to stop the AI audio from continuing to play. This enables natural
+ * barge-in behavior like a real phone conversation.
+ *
+ * @example
+ * client.on('interrupted', (event) => {
+ *   // Stop playing AI audio immediately
+ *   audioPlayer.clearBuffer();
+ *   audioPlayer.stop();
+ *   console.log('AI interrupted - ready for user input');
+ * });
+ */
+interface InterruptedEvent {
+    type: 'interrupted';
+    timestamp: string;
+}
 /**
  * Union type of all event payloads
  */
-type LiveSpeechEvent = ConnectedEvent | DisconnectedEvent | ReconnectingEvent | SessionStartedEvent | SessionEndedEvent | ReadyEvent | UserTranscriptEvent | ResponseEvent | AudioEvent | TurnCompleteEvent | ToolCallEvent | UserIdUpdatedEvent | ErrorEvent;
+type LiveSpeechEvent = ConnectedEvent | DisconnectedEvent | ReconnectingEvent | SessionStartedEvent | SessionEndedEvent | ReadyEvent | UserTranscriptEvent | ResponseEvent | AudioEvent | TurnCompleteEvent | ToolCallEvent | UserIdUpdatedEvent | InterruptedEvent | ErrorEvent;
 /**
  * Simplified event handlers for common use cases
  */
@@ -372,11 +392,11 @@ type ErrorHandler = (error: ErrorEvent) => void;
 /**
  * WebSocket message types sent from client to server
  */
-type ClientMessageType = 'startSession' | 'endSession' | 'audioStart' | 'audioChunk' | 'audioEnd' | 'systemMessage' | 'toolResponse' | 'updateUserId' | 'ping';
+type ClientMessageType = 'startSession' | 'endSession' | 'audioStart' | 'audioChunk' | 'audioEnd' | 'systemMessage' | 'toolResponse' | 'updateUserId' | 'interrupt' | 'ping';
 /**
  * WebSocket message types received from server
  */
-type ServerMessageType = 'sessionStarted' | 'sessionEnded' | 'ready' | 'userTranscript' | 'response' | 'audio' | 'turnComplete' | 'toolCall' | 'userIdUpdated' | 'error' | 'pong';
+type ServerMessageType = 'sessionStarted' | 'sessionEnded' | 'ready' | 'userTranscript' | 'response' | 'audio' | 'turnComplete' | 'toolCall' | 'userIdUpdated' | 'interrupted' | 'error' | 'pong';
 /**
  * Base interface for client messages
  */
@@ -466,10 +486,16 @@ interface UpdateUserIdMessage extends BaseClientMessage {
     /** The authenticated user's unique identifier */
     userId: string;
 }
+/**
+ * Interrupt message - explicitly stop AI response (for Stop button)
+ */
+interface InterruptMessage extends BaseClientMessage {
+    action: 'interrupt';
+}
 /**
  * Union type of all client messages
  */
-type ClientMessage = StartSessionMessage | EndSessionMessage | AudioStartMessage | AudioChunkMessage | AudioEndMessage | SystemMessageMessage | ToolResponseMessage | UpdateUserIdMessage | PingMessage;
+type ClientMessage = StartSessionMessage | EndSessionMessage | AudioStartMessage | AudioChunkMessage | AudioEndMessage | SystemMessageMessage | ToolResponseMessage | UpdateUserIdMessage | InterruptMessage | PingMessage;
 /**
  * Base interface for server messages
  */
@@ -567,10 +593,18 @@ interface ServerUserIdUpdatedMessage extends BaseServerMessage {
     /** Number of messages migrated from guest to user partition */
     migratedMessages: number;
 }
+/**
+ * Interrupted message from server (barge-in)
+ * Indicates the AI response was interrupted because the user started speaking.
+ * Clients should immediately clear their audio playback buffer when receiving this.
+ */
+interface ServerInterruptedMessage extends BaseServerMessage {
+    type: 'interrupted';
+}
 /**
  * Union type of all server messages
  */
-type ServerMessage = ServerSessionStartedMessage | ServerSessionEndedMessage | ServerReadyMessage | ServerUserTranscriptMessage | ServerResponseMessage | ServerAudioMessage | ServerTurnCompleteMessage | ServerToolCallMessage | ServerUserIdUpdatedMessage | ServerErrorMessage | ServerPongMessage;
+type ServerMessage = ServerSessionStartedMessage | ServerSessionEndedMessage | ServerReadyMessage | ServerUserTranscriptMessage | ServerResponseMessage | ServerAudioMessage | ServerTurnCompleteMessage | ServerToolCallMessage | ServerUserIdUpdatedMessage | ServerInterruptedMessage | ServerErrorMessage | ServerPongMessage;
 /**
  * Connection state
@@ -593,6 +627,7 @@ type LiveSpeechEventMap = {
     turnComplete: TurnCompleteEvent;
     toolCall: ToolCallEvent;
     userIdUpdated: UserIdUpdatedEvent;
+    interrupted: InterruptedEvent;
     error: ErrorEvent;
 };
 /**
@@ -710,6 +745,26 @@ declare class LiveSpeechClient {
      * });
      */
     sendToolResponse(id: string, response?: unknown): void;
+    /**
+     * Explicitly interrupt the current AI response
+     *
+     * Use this method for:
+     * - UI "Stop" button functionality
+     * - Programmatic control to stop AI mid-response
+     *
+     * Note: In most cases, simply speaking will trigger automatic
+     * interruption via Gemini's voice activity detection (VAD).
+     * This method is for explicit programmatic control.
+     *
+     * @example
+     * // User clicks "Stop" button
+     * client.interrupt();
+     *
+     * @example
+     * // Stop AI after a certain time
+     * setTimeout(() => client.interrupt(), 10000);
+     */
+    interrupt(): void;
     /**
      * Update the user ID for the current connection (guest-to-user migration)
      *
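The `ServerInterruptedMessage` comment above asks clients to clear their playback buffer as soon as `interrupted` arrives. A minimal sketch of that wiring follows. `PlaybackQueue` and `FakeClient` are hypothetical stand-ins, not SDK classes, so the example runs without the package or an audio device. The `timestamp` type is assumed to be a string.

```typescript
// Hypothetical event shape; only `type` and `timestamp` appear in the diff.
type InterruptedEventSketch = { type: "interrupted"; timestamp: string };
type InterruptedHandler = (event: InterruptedEventSketch) => void;

// Hypothetical playback queue holding PCM chunks that have not been played yet.
class PlaybackQueue {
  private chunks: Uint8Array[] = [];

  queue(chunk: Uint8Array): void {
    this.chunks.push(chunk);
  }

  // Drop everything that is still waiting to be played.
  clear(): void {
    this.chunks.length = 0;
  }

  get pending(): number {
    return this.chunks.length;
  }
}

// Hypothetical emitter that mimics only client.on("interrupted", ...).
class FakeClient {
  private handlers: InterruptedHandler[] = [];

  on(eventName: "interrupted", handler: InterruptedHandler): void {
    if (eventName === "interrupted") {
      this.handlers.push(handler);
    }
  }

  // Helper for this sketch: simulates the server sending "interrupted".
  simulateInterrupted(): void {
    const event: InterruptedEventSketch = {
      type: "interrupted",
      timestamp: new Date().toISOString(),
    };
    this.handlers.forEach((handler) => handler(event));
  }
}

const playbackQueue = new PlaybackQueue();
const fakeClient = new FakeClient();

// Clear buffered audio the moment the AI response is interrupted.
fakeClient.on("interrupted", () => playbackQueue.clear());

playbackQueue.queue(new Uint8Array(480));
playbackQueue.queue(new Uint8Array(480));
fakeClient.simulateInterrupted();
```

With the real client, the same handler body would presumably be registered through `client.on('interrupted', ...)`, as the README hunk in this diff does.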

package/dist/index.js CHANGED

@@ -877,6 +877,35 @@ var LiveSpeechClient = class {
       payload: { id, response }
     });
   }
+  /**
+   * Explicitly interrupt the current AI response
+   *
+   * Use this method for:
+   * - UI "Stop" button functionality
+   * - Programmatic control to stop AI mid-response
+   *
+   * Note: In most cases, simply speaking will trigger automatic
+   * interruption via Gemini's voice activity detection (VAD).
+   * This method is for explicit programmatic control.
+   *
+   * @example
+   * // User clicks "Stop" button
+   * client.interrupt();
+   *
+   * @example
+   * // Stop AI after a certain time
+   * setTimeout(() => client.interrupt(), 10000);
+   */
+  interrupt() {
+    if (!this.isConnected) {
+      throw new Error("Not connected");
+    }
+    if (!this.isStreaming) {
+      throw new Error("No active Live session. Call audioStart() first.");
+    }
+    this.logger.info("Sending explicit interrupt");
+    this.connection.send({ action: "interrupt" });
+  }
   /**
    * Update the user ID for the current connection (guest-to-user migration)
    *
@@ -1119,6 +1148,15 @@ var LiveSpeechClient = class {
         this.emit("userIdUpdated", userIdUpdatedEvent);
         break;
       }
+      case "interrupted": {
+        const interruptedEvent = {
+          type: "interrupted",
+          timestamp: message.timestamp
+        };
+        this.logger.info("AI response interrupted (barge-in)");
+        this.emit("interrupted", interruptedEvent);
+        break;
+      }
       case "error":
         this.handleError(message.code, message.message);
         break;
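As the hunk above shows, `interrupt()` throws when the client is not connected or no audio stream is active, so a "Stop" button handler may want to guard the call. The sketch below is one way to do that. `tryInterrupt`, `Interruptible`, and `StubClient` are hypothetical names introduced for illustration; `StubClient` only reproduces the two guards and the `{ action: "interrupt" }` payload seen in the diff.

```typescript
// Hypothetical minimal surface of the client that this helper relies on.
interface Interruptible {
  interrupt(): void;
}

// Returns true if the interrupt was sent, false if the client rejected it.
// The rejection reason is passed to onSkip instead of propagating the error.
function tryInterrupt(
  target: Interruptible,
  onSkip: (reason: string) => void = () => {},
): boolean {
  try {
    target.interrupt();
    return true;
  } catch (err) {
    onSkip(err instanceof Error ? err.message : String(err));
    return false;
  }
}

// Stand-in that mirrors the guards in the diffed interrupt() implementation.
class StubClient implements Interruptible {
  isConnected: boolean;
  isStreaming: boolean;
  sent: Array<{ action: "interrupt" }> = [];

  constructor(isConnected: boolean, isStreaming: boolean) {
    this.isConnected = isConnected;
    this.isStreaming = isStreaming;
  }

  interrupt(): void {
    if (!this.isConnected) {
      throw new Error("Not connected");
    }
    if (!this.isStreaming) {
      throw new Error("No active Live session. Call audioStart() first.");
    }
    this.sent.push({ action: "interrupt" });
  }
}

const skipReasons: string[] = [];
const idleClient = new StubClient(true, false);
const liveClient = new StubClient(true, true);

const idleResult = tryInterrupt(idleClient, (reason) => skipReasons.push(reason));
const liveResult = tryInterrupt(liveClient);
```

Swallowing the error suits a UI button that can be pressed at any time; code that must know whether the AI was actually stopped should check the boolean result.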

package/dist/index.mjs CHANGED

@@ -838,6 +838,35 @@ var LiveSpeechClient = class {
       payload: { id, response }
     });
   }
+  /**
+   * Explicitly interrupt the current AI response
+   *
+   * Use this method for:
+   * - UI "Stop" button functionality
+   * - Programmatic control to stop AI mid-response
+   *
+   * Note: In most cases, simply speaking will trigger automatic
+   * interruption via Gemini's voice activity detection (VAD).
+   * This method is for explicit programmatic control.
+   *
+   * @example
+   * // User clicks "Stop" button
+   * client.interrupt();
+   *
+   * @example
+   * // Stop AI after a certain time
+   * setTimeout(() => client.interrupt(), 10000);
+   */
+  interrupt() {
+    if (!this.isConnected) {
+      throw new Error("Not connected");
+    }
+    if (!this.isStreaming) {
+      throw new Error("No active Live session. Call audioStart() first.");
+    }
+    this.logger.info("Sending explicit interrupt");
+    this.connection.send({ action: "interrupt" });
+  }
   /**
    * Update the user ID for the current connection (guest-to-user migration)
    *
@@ -1080,6 +1109,15 @@ var LiveSpeechClient = class {
         this.emit("userIdUpdated", userIdUpdatedEvent);
         break;
       }
+      case "interrupted": {
+        const interruptedEvent = {
+          type: "interrupted",
+          timestamp: message.timestamp
+        };
+        this.logger.info("AI response interrupted (barge-in)");
+        this.emit("interrupted", interruptedEvent);
+        break;
+      }
       case "error":
         this.handleError(message.code, message.message);
         break;
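The new `interrupted` case adds a third way a server message can affect audio playback. A rough sketch of how a consumer might map message types to playback actions is shown below, using an exhaustive `switch` so that a newly added message type fails at compile time. `ServerMessageSketch`, `PlaybackAction`, and `playbackActionFor` are hypothetical; only the `type` strings come from the diff, and treating `turnComplete` as "let the remaining audio play out" is an assumption.

```typescript
// Assumed subset of the server message union, reduced to the `type` field.
type ServerMessageSketch =
  | { type: "audio" }
  | { type: "turnComplete" }
  | { type: "interrupted" };

// Hypothetical playback actions for this sketch.
type PlaybackAction = "enqueue" | "drain" | "clear";

function playbackActionFor(message: ServerMessageSketch): PlaybackAction {
  switch (message.type) {
    case "audio":
      // New audio chunk: add it to the playback buffer.
      return "enqueue";
    case "turnComplete":
      // Assumption: the turn ended normally, so buffered audio finishes playing.
      return "drain";
    case "interrupted":
      // Barge-in: discard buffered audio immediately.
      return "clear";
    default: {
      // Compile-time exhaustiveness check; unreachable for the union above.
      const unreachable: never = message;
      throw new Error(`Unhandled message: ${JSON.stringify(unreachable)}`);
    }
  }
}

const interruptedAction = playbackActionFor({ type: "interrupted" });
const audioAction = playbackActionFor({ type: "audio" });
```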

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@drawdream/livespeech",
-  "version": "0.1.10",
+  "version": "0.1.12",
   "description": "Real-time speech-to-speech AI conversation SDK",
   "main": "dist/index.js",
   "module": "dist/index.mjs",