@voxdiscover/voiceserver 0.1.0

Files changed (9)

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Voxdiscover
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,465 @@
+# @voxdiscover/voiceserver
+Framework-agnostic TypeScript SDK for Voice_server voice agents. Provides session token authentication, Daily.js WebRTC integration, and typed events for building voice-enabled applications.
+## Installation
+```bash
+npm install @voxdiscover/voiceserver @daily-co/daily-js
+# or
+pnpm add @voxdiscover/voiceserver @daily-co/daily-js
+# or
+yarn add @voxdiscover/voiceserver @daily-co/daily-js
+```
+**Note:** `@daily-co/daily-js` is a peer dependency and must be installed separately.
+## Quick Start
+### 1. Obtain Session Token
+First, obtain a session token from your backend (which calls Voice_server's session API):
+```typescript
+// Your backend endpoint
+const response = await fetch('https://voiceserver.voxdiscover.com/api/voice-session', {
+  method: 'POST',
+  headers: { 'Content-Type': 'application/json' },
+  body: JSON.stringify({ userId: 'user_123' }),
+});
+const { token } = await response.json();
+```
+### 2. Initialize SDK
+```typescript
+import { VoiceAgent } from '@voxdiscover/voiceserver';
+const agent = new VoiceAgent({ token });
+```
+### 3. Subscribe to Events
+```typescript
+// Connection state changes
+agent.on('connection:state', (state) => {
+  console.log('Connection state:', state);
+  // states: 'connecting' | 'connected' | 'reconnecting' | 'disconnected' | 'failed'
+});
+// Transcripts (streaming)
+agent.on('transcript:interim', ({ text, speaker }) => {
+  console.log(`[interim] ${speaker}: ${text}`);
+});
+agent.on('transcript:final', ({ text, speaker }) => {
+  console.log(`[final] ${speaker}: ${text}`);
+});
+// Errors
+agent.on('connection:error', (error) => {
+  console.error('Connection error:', error.message);
+  if (error.context?.suggestion) {
+    console.log('Suggestion:', error.context.suggestion);
+  }
+});
+```
+### 4. Connect to Voice Agent
+```typescript
+try {
+  await agent.connect();
+  console.log('Connected! State:', agent.state);
+} catch (error) {
+  console.error('Failed to connect:', error);
+}
+```
+### 5. Control Audio
+```typescript
+// Mute microphone
+agent.mute();
+// Unmute microphone
+agent.unmute();
+```
+### 6. Disconnect
+```typescript
+await agent.disconnect();
+```
+## API Reference
+### `VoiceAgent`
+Main SDK class for managing voice conversations.
+#### Constructor
+```typescript
+new VoiceAgent(config: VoiceAgentConfig)
+```
+**Config options:**
+- `token` (required): Session token from backend
+- `baseUrl` (optional): Backend base URL used for token validation (default: `https://voiceserver.voxdiscover.com`)
+- `reconnection` (optional):
+  - `enabled` (default: `true`): Enable automatic reconnection
+  - `maxAttempts` (default: `5`): Max reconnection attempts
+#### Properties
+- `state`: Current connection state (read-only)
+  - `'connecting'` - Establishing connection
+  - `'connected'` - Successfully connected
+  - `'reconnecting'` - Attempting to reconnect
+  - `'disconnected'` - Not connected
+  - `'failed'` - Connection failed
+#### Methods
+- `connect(): Promise<void>` - Connect to voice session (validates token, joins Daily room, starts remote audio)
+- `disconnect(): Promise<void>` - Disconnect and cleanup resources (leaves room, releases audio elements)
+- `mute(): void` - Mute microphone
+- `unmute(): void` - Unmute microphone
+#### Events
+Subscribe to events using `agent.on(event, callback)`:
+**Connection events:**
+- `connection:state` - `(state: ConnectionState) => void`
+- `connection:error` - `(error: VoiceAgentError) => void`
+**Transcript events:**
+- `transcript:interim` - `(data: TranscriptData) => void` - Partial transcripts (not emitted by all agent types)
+- `transcript:final` - `(data: TranscriptData) => void` - One event per completed turn; emitted in real-time as each user or agent turn finishes
+**Audio events:**
+- `audio:muted` - `() => void`
+- `audio:unmuted` - `() => void`
+**Session events:**
+- `session:expiring` - `(expiresIn: number) => void` - Emitted 5 minutes before the session expires
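The event signatures listed above can be wired to application state in one place. The following is a minimal sketch, not SDK code: `AgentLike`, `EventsSketch`, `UiState`, and `bindUi` are hypothetical names, and the `TranscriptData` shape (`text` and `speaker` as strings) is assumed from the destructuring used in the Quick Start. In an application you would pass a real `VoiceAgent` and use the package's exported types instead of these local stand-ins.

```typescript
// Local stand-ins for the SDK types (assumptions; the package itself exports
// ConnectionState, TranscriptData, and VoiceAgentEvents).
type ConnectionState =
  | 'connecting' | 'connected' | 'reconnecting' | 'disconnected' | 'failed';

interface TranscriptData {
  text: string;
  speaker: string; // assumed to be a string label
}

// Event name -> listener signature, copied from the list above.
interface EventsSketch {
  'connection:state': (state: ConnectionState) => void;
  'transcript:final': (data: TranscriptData) => void;
  'audio:muted': () => void;
  'audio:unmuted': () => void;
  'session:expiring': (expiresIn: number) => void;
}

// Anything with a typed `on` method, such as a VoiceAgent instance.
interface AgentLike {
  on<K extends keyof EventsSketch>(event: K, callback: EventsSketch[K]): void;
}

// Hypothetical application state driven by the events.
interface UiState {
  connection: ConnectionState;
  muted: boolean;
  expiryWarning: boolean;
  transcript: string[];
}

function bindUi(agent: AgentLike, ui: UiState): void {
  agent.on('connection:state', (state) => { ui.connection = state; });
  agent.on('audio:muted', () => { ui.muted = true; });
  agent.on('audio:unmuted', () => { ui.muted = false; });
  // Fired ahead of expiry: a good moment to ask your backend for a new token.
  agent.on('session:expiring', () => { ui.expiryWarning = true; });
  agent.on('transcript:final', ({ text, speaker }) => {
    ui.transcript.push(`${speaker}: ${text}`);
  });
}
```

Keeping the listeners in one function makes it easy to rebind them when a new agent is created after a token refresh.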
+### Error Handling
+The SDK provides typed error classes for different scenarios:
+```typescript
+import {
+  VoiceAgentError,
+  TokenExpiredError,
+  TokenInvalidError,
+  ConnectionFailedError,
+  PermissionDeniedError,
+} from '@voxdiscover/voiceserver';
+try {
+  await agent.connect();
+} catch (error) {
+  // Pattern 1: instanceof checks
+  if (error instanceof TokenExpiredError) {
+    console.log('Token expired, requesting new session...');
+    // Request new token from backend
+  } else if (error instanceof PermissionDeniedError) {
+    console.log('Microphone permission denied');
+    // Show permission request UI
+  }
+  // Narrow first: catch variables are `unknown` under strict TypeScript
+  if (error instanceof VoiceAgentError) {
+    // Pattern 2: code property checks
+    if (error.code === 'CONNECTION_FAILED' && error.retryable) {
+      console.log('Retryable error, will auto-reconnect');
+    }
+    // Access error details
+    console.log('Message:', error.message);
+    console.log('Suggestion:', error.context?.suggestion);
+    console.log('Retryable:', error.retryable);
+  }
+}
+```
+**Error types:**
+- `TokenExpiredError` - Session token expired (non-retryable)
+- `TokenInvalidError` - Token malformed or invalid (non-retryable)
+- `ConnectionFailedError` - WebRTC connection failed (retryable)
+- `PermissionDeniedError` - Microphone permission denied (non-retryable)
+- `NetworkError` - Network error during API call (retryable)
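The `retryable` flag in the list above can drive an application-level retry decision for errors thrown by `connect()`. This is a rough sketch under stated assumptions: `SdkErrorLike`, `isSdkErrorLike`, `nextStep`, and the exponential backoff policy (500 ms base, doubling) are all hypothetical and not part of the SDK, which already reconnects on its own when `reconnection.enabled` is `true`. The sketch reads only the `code`, `message`, and `retryable` fields that the README shows on SDK errors.

```typescript
// Structural view of the fields the README reads off SDK errors.
interface SdkErrorLike {
  code: string;
  message: string;
  retryable: boolean;
}

function isSdkErrorLike(value: unknown): value is SdkErrorLike {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.code === 'string'
    && typeof v.message === 'string'
    && typeof v.retryable === 'boolean';
}

type NextStep =
  | { action: 'retry'; delayMs: number }
  | { action: 'give-up'; reason: string };

// `attempt` is 1-based: pass 1 after the first failed connect().
function nextStep(
  error: unknown,
  attempt: number,
  maxAttempts = 5,
  baseDelayMs = 500,
): NextStep {
  if (!isSdkErrorLike(error)) return { action: 'give-up', reason: 'unrecognized error' };
  if (!error.retryable) return { action: 'give-up', reason: error.code };
  if (attempt >= maxAttempts) return { action: 'give-up', reason: 'max attempts reached' };
  // Exponential backoff (assumed policy): 500 ms, 1 s, 2 s, ...
  return { action: 'retry', delayMs: baseDelayMs * 2 ** (attempt - 1) };
}
```

A caller would wait `delayMs` before calling `connect()` again on `retry`, and surface `reason` to the user on `give-up`.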
+## Complete Example
+```typescript
+import { VoiceAgent, TokenExpiredError } from '@voxdiscover/voiceserver';
+async function startVoiceCall() {
+  // 1. Get session token from your backend
+  const { token } = await fetch('/api/voice-session', {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ agentId: 'support-agent' }),
+  }).then(r => r.json());
+  // 2. Initialize agent
+  const agent = new VoiceAgent({
+    token,
+    baseUrl: 'https://voiceserver.voxdiscover.com',
+    reconnection: { enabled: true, maxAttempts: 5 },
+  });
+  // 3. Set up event listeners
+  agent.on('connection:state', (state) => {
+    updateUI({ connectionState: state });
+  });
+  agent.on('transcript:final', ({ text, speaker }) => {
+    addMessageToChat({ speaker, text });
+  });
+  agent.on('connection:error', async (error) => {
+    if (error instanceof TokenExpiredError) {
+      // Token expired: release this agent, then start over.
+      // startVoiceCall() requests a fresh token and creates a new agent.
+      await agent.disconnect();
+      await startVoiceCall();
+    } else {
+      showError(error.message, error.context?.suggestion);
+    }
+  });
+  // 4. Connect
+  try {
+    await agent.connect();
+    showUI('connected');
+  } catch (error) {
+    showUI('error', error instanceof Error ? error.message : String(error));
+  }
+  return agent;
+}
+// Usage in UI event handlers
+let activeAgent: VoiceAgent | undefined;
+document.getElementById('startCall')?.addEventListener('click', async () => {
+  activeAgent = await startVoiceCall();
+});
+// Registered once, so repeated calls do not stack duplicate listeners
+document.getElementById('muteBtn')?.addEventListener('click', () => {
+  activeAgent?.mute();
+});
+document.getElementById('endCall')?.addEventListener('click', async () => {
+  await activeAgent?.disconnect();
+  activeAgent = undefined;
+});
+```
+## Analytics Hooks
+The SDK provides standardized analytics hooks for integrating with observability platforms like Segment, DataDog, or PostHog. Analytics hooks emit lifecycle and error events only (not transcripts or audio events) to keep analytics data clean.
+### Registering an Analytics Callback
+```typescript
+import { VoiceAgent } from '@voxdiscover/voiceserver';
+const agent = new VoiceAgent({ token });
+agent.onAnalyticsEvent((event) => {
+  console.log('Analytics event:', event.eventType, {
+    sessionId: event.sessionId,
+    agentId: event.agentId,
+    userId: event.userId,
+    timestamp: event.timestamp,
+  });
+});
+```
+### Event Types
+| Event Type | When Emitted |
+|------------|-------------|
+| `session_started` | WebRTC connection established (joined Daily room) |
+| `session_ended` | Session disconnected (explicit disconnect) |
+| `connection_failed` | Connection error (token invalid, network failure, etc.) |
+| `agent_swap_completed` | Agent hot-swap completed successfully |
+| `agent_swap_failed` | Agent hot-swap failed |
+| `error` | Categorized SDK error requiring developer attention |
+### Event Payload Structure
+```typescript
+interface AnalyticsEvent {
+  timestamp: number;        // Unix timestamp in milliseconds
+  eventType: string;        // One of the event types above
+  sessionId: string;        // Session identifier from token
+  agentId?: string;         // Agent identifier from token
+  userId?: string;          // User identifier from session context
+  customContext?: Record<string, any>;  // Context from session creation
+  error?: {
+    code: string;           // Programmatic error code
+    message: string;        // Human-readable error description
+    retryable: boolean;     // Whether operation can be retried
+  };
+}
+```
+### Integration with Segment
+```typescript
+import { VoiceAgent } from '@voxdiscover/voiceserver';
+import Analytics from 'analytics';
+import segmentPlugin from '@analytics/segment';
+// Initialize Segment
+const analytics = Analytics({
+  app: 'my-voice-app',
+  plugins: [
+    segmentPlugin({
+      writeKey: 'YOUR_SEGMENT_WRITE_KEY',
+    }),
+  ],
+});
+// Register analytics callback
+const agent = new VoiceAgent({ token });
+agent.onAnalyticsEvent((event) => {
+  analytics.track(event.eventType, {
+    session_id: event.sessionId,
+    agent_id: event.agentId,
+    user_id: event.userId,
+    timestamp: event.timestamp,
+    // Error details (only present on failure events)
+    ...(event.error && {
+      error_code: event.error.code,
+      error_message: event.error.message,
+      error_retryable: event.error.retryable,
+    }),
+    // Custom context (from session creation)
+    ...event.customContext,
+  });
+});
+await agent.connect();
+```
+### Integration with DataDog / PostHog
+Any analytics platform that accepts key-value event properties works the same way:
+```typescript
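+import posthog from 'posthog-js';
+import { datadogRum } from '@datadog/browser-rum';
+// Both clients are assumed to be initialized elsewhere in your app
+agent.onAnalyticsEvent((event) => {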
+  // PostHog example
+  posthog.capture(event.eventType, {
+    distinct_id: event.userId,
+    session_id: event.sessionId,
+    agent_id: event.agentId,
+    $timestamp: new Date(event.timestamp).toISOString(),
+  });
+  // DataDog example
+  datadogRum.addAction(event.eventType, {
+    session_id: event.sessionId,
+    user_id: event.userId,
+  });
+});
+```
+### Multiple Callbacks
+Multiple callbacks can be registered; each one receives every event:
+```typescript
+// Log to console
+agent.onAnalyticsEvent((event) => {
+  console.log('[Voice Analytics]', event.eventType, event.sessionId);
+});
+// Send to Segment
+agent.onAnalyticsEvent((event) => {
+  analytics.track(event.eventType, { session_id: event.sessionId });
+});
+// Send to custom backend
+agent.onAnalyticsEvent((event) => {
+  fetch('/api/analytics', {
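+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify(event),
+  }).catch(() => {
+    // Ignore delivery failures so analytics never affects the call
+  });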
+});
+```
+### IMPORTANT: Read-Only Callbacks
+Analytics callbacks MUST be read-only. Do NOT call SDK methods (connect, disconnect, mute, etc.) inside a callback. Calling SDK methods from within an analytics callback creates a circular event chain that triggers the circuit breaker and disables analytics for the remainder of the session.
+**Do NOT do this:**
+```typescript
+// WRONG: Calling SDK methods inside analytics callback
+agent.onAnalyticsEvent((event) => {
+  if (event.eventType === 'session_ended') {
+    agent.connect(); // Emits another analytics event -> circular chain that trips the circuit breaker
+  }
+});
+```
+**Do this instead:**
+```typescript
+// CORRECT: React to events outside the callback
+agent.on('connection:state', (state) => {
+  if (state === 'disconnected') {
+    handleReconnect(); // Handle reconnection in event listener, not analytics callback
+  }
+});
+agent.onAnalyticsEvent((event) => {
+  // Read-only: only forward events to external services
+  myAnalytics.track(event.eventType, { session_id: event.sessionId });
+});
+```
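To illustrate why a circular event chain ends with analytics being disabled, here is a rough sketch of a re-entrancy guard. It is not the SDK's implementation, which this README does not show: `GuardedDispatcher`, its `maxDepth` threshold, and the decision to swallow callback errors are all assumptions made for illustration.

```typescript
type AnalyticsCallback<E> = (event: E) => void;

class GuardedDispatcher<E> {
  private readonly callbacks: AnalyticsCallback<E>[] = [];
  private readonly maxDepth: number;
  private depth = 0;       // how many emit() calls are currently on the stack
  private tripped = false; // once true, no further events are delivered

  constructor(maxDepth = 3) {
    this.maxDepth = maxDepth;
  }

  register(callback: AnalyticsCallback<E>): void {
    this.callbacks.push(callback);
  }

  get disabled(): boolean {
    return this.tripped;
  }

  emit(event: E): void {
    if (this.tripped) return;
    // emit() re-entered too deeply: a callback is triggering new events.
    if (this.depth >= this.maxDepth) {
      this.tripped = true;
      return;
    }
    this.depth += 1;
    try {
      for (const callback of this.callbacks) {
        if (this.tripped) break;
        try {
          callback(event);
        } catch {
          // A throwing callback must not break the emitter.
        }
      }
    } finally {
      this.depth -= 1;
    }
  }
}
```

A callback that only forwards the event never re-enters `emit()`, so the guard stays untouched. A callback that causes another event to be emitted nests one level deeper each time, until the guard trips and later events are dropped.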
+## TypeScript Support
+The SDK is written in TypeScript and includes full type definitions. All types are exported:
+```typescript
+import type {
+  VoiceAgentConfig,
+  ConnectionState,
+  TranscriptData,
+  VoiceAgentEvents,
+  VoiceAgentErrorCode,
+} from '@voxdiscover/voiceserver';
+```
+## Internal Architecture Notes
+### Headless Mode Audio
+`VoiceAgent` uses `DailyIframe.createCallObject()` (headless mode), which does **not** auto-play
+remote audio. The SDK manages this internally: a `track-started` handler creates an `<audio>`
+element per remote participant and pipes the incoming track through it. No additional setup is
+needed in your application.
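For readers curious what per-participant audio handling can look like, the sketch below models the bookkeeping only. It is an illustration under assumptions, not the SDK's source: `RemoteAudioSketch`, `AudioSink`, and the injected `createSink` factory are hypothetical, and the event shape mirrors the `participant` and `track` fields of Daily's `track-started` event. The browser-only part (creating the element and attaching a `MediaStream`) is left as a comment so the sketch runs without a DOM.

```typescript
// Minimal structural types. In the browser, TrackLike would be a
// MediaStreamTrack and the event would come from Daily's `track-started`.
interface TrackLike { kind: string }
interface TrackStartedLike {
  participant: { local: boolean; session_id: string } | null;
  track: TrackLike;
}
interface AudioSink { stop(): void }

class RemoteAudioSketch {
  private readonly sinks = new Map<string, AudioSink>();
  private readonly createSink: (track: TrackLike) => AudioSink;

  constructor(createSink: (track: TrackLike) => AudioSink) {
    this.createSink = createSink;
  }

  // Intended use: call.on('track-started', (ev) => manager.onTrackStarted(ev))
  onTrackStarted(ev: TrackStartedLike): void {
    const p = ev.participant;
    // Only remote audio needs a playback element.
    if (!p || p.local || ev.track.kind !== 'audio') return;
    // Replace a stale sink if the same participant restarts their track.
    this.sinks.get(p.session_id)?.stop();
    this.sinks.set(p.session_id, this.createSink(ev.track));
  }

  // Mirrors "releases audio elements" on disconnect.
  releaseAll(): void {
    this.sinks.forEach((sink) => sink.stop());
    this.sinks.clear();
  }

  get activeCount(): number {
    return this.sinks.size;
  }
}

// Browser-side factory (not executed here):
// const createSink = (track) => {
//   const el = new Audio();
//   el.srcObject = new MediaStream([track]);
//   void el.play();
//   return { stop: () => { el.pause(); el.srcObject = null; } };
// };
```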
+### Transcript Delivery
+Transcripts are streamed in real-time over Daily's app-message channel. Each completed turn
+(user or agent) triggers one `transcript:final` event. The server uses Pipecat's
+`OutputTransportMessageFrame` API to broadcast each turn to all room participants.
+## Browser Support
+- Chrome 90+
+- Firefox 88+
+- Safari 14+
+- Edge 90+
+**Requirements:**
+- WebRTC support
+- getUserMedia API support
+- ES2022 features
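The requirements above can be checked before constructing an agent. The helper below is a hedged sketch: `EnvLike` and `missingRequirements` are hypothetical names, and probing `Array.prototype.at` and `Object.hasOwn` is only a rough proxy for ES2022 support, not a complete test.

```typescript
// Shape of the globals we probe (assumes a browser-like global object).
interface EnvLike {
  RTCPeerConnection?: unknown;
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

function missingRequirements(env: EnvLike): string[] {
  const missing: string[] = [];
  if (typeof env.RTCPeerConnection !== 'function') {
    missing.push('WebRTC');
  }
  if (typeof env.navigator?.mediaDevices?.getUserMedia !== 'function') {
    missing.push('getUserMedia');
  }
  // Both members were added in ES2022; checked on the current runtime.
  if (!('at' in Array.prototype) || !('hasOwn' in Object)) {
    missing.push('ES2022');
  }
  return missing;
}
```

In a browser you might call `missingRequirements(globalThis as EnvLike)` and show a fallback message when the returned list is not empty.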
+## License
+MIT