@telnyx/voice-agent-tester (npm) 0.4.3 → 0.4.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/CHANGELOG.md +23 -0
package/README.md +185 -161
package/applications/elevenlabs.yaml +1 -1
package/javascript/audio_input_hooks.js +89 -19
package/javascript/audio_output_hooks.js +92 -2
package/package.json +1 -1
package/src/index.js +79 -28
package/src/report.js +169 -90
package/src/voice-agent-tester.js +43 -7
package/tests/integration.test.js +4 -3
package/tests/voice-agent-tester.test.js +133 -0

package/CHANGELOG.md CHANGED

@@ -1,5 +1,28 @@
 # Changelog
+## [0.4.5](https://github.com/team-telnyx/voice-agent-tester/compare/v0.4.4...v0.4.5) (2026-03-16)
+### Bug Fixes
+* add event-based fallback for audio monitoring (ElevenLabs support) ([#27](https://github.com/team-telnyx/voice-agent-tester/issues/27)) ([6051b5e](https://github.com/team-telnyx/voice-agent-tester/commit/6051b5e949376951f0fb046cffcc5a2a5c250e19))
+* align comparison metrics by scenario step index, not absolute step number ([#23](https://github.com/team-telnyx/voice-agent-tester/issues/23)) ([e4c485b](https://github.com/team-telnyx/voice-agent-tester/commit/e4c485b6eae5e9a6d60f11745b46997a183fc180)), closes [#1](https://github.com/team-telnyx/voice-agent-tester/issues/1) [#2](https://github.com/team-telnyx/voice-agent-tester/issues/2)
+* make ElevenLabs branch-id optional for comparison mode ([#24](https://github.com/team-telnyx/voice-agent-tester/issues/24)) ([3f1735a](https://github.com/team-telnyx/voice-agent-tester/commit/3f1735a6a02e6c1edc4b6e17a6be4087127bded8))
+* single headline number in comparison, per-response in --debug ([#26](https://github.com/team-telnyx/voice-agent-tester/issues/26)) ([a482129](https://github.com/team-telnyx/voice-agent-tester/commit/a482129c1bfe49d28aca7dec8230d30e5b6d8f8a)), closes [#1](https://github.com/team-telnyx/voice-agent-tester/issues/1) [#2](https://github.com/team-telnyx/voice-agent-tester/issues/2)
+### Documentation
+* restructure README with comparison mode front and center ([#25](https://github.com/team-telnyx/voice-agent-tester/issues/25)) ([f15cbcd](https://github.com/team-telnyx/voice-agent-tester/commit/f15cbcd8707cded8081d00b90accf09fd77be169))
+## [0.4.4](https://github.com/team-telnyx/voice-agent-tester/compare/v0.4.3...v0.4.4) (2026-03-11)
+### Features
+* fix speechend race condition, add --retries flag ([#21](https://github.com/team-telnyx/voice-agent-tester/issues/21)) ([09e3b65](https://github.com/team-telnyx/voice-agent-tester/commit/09e3b6578face6c407d058991ab5495d9463e544))
+### Chores
+* release v0.4.3 ([#20](https://github.com/team-telnyx/voice-agent-tester/issues/20)) ([bdeb87b](https://github.com/team-telnyx/voice-agent-tester/commit/bdeb87bed502919a9fed9950e69242b1c2aefcfc))
 ## [0.4.3](https://github.com/team-telnyx/voice-agent-tester/compare/v0.4.2...v0.4.3) (2026-03-11)
 ### Features
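
The 0.4.5 alignment fix (#23) pairs measured responses by their position in the scenario rather than by absolute step number. A minimal sketch of that idea follows, assuming each run yields a list of `{ step, ms }` records. The function name, the record shape, and the step numbers are illustrative and are not taken from the package; the latencies are the ones shown in the README sample report.

```javascript
// Hypothetical helper: pair measured responses by order, not by step number.
function alignByResponseIndex(providerMetrics, telnyxMetrics) {
  const byStep = (a, b) => a.step - b.step;
  const provider = [...providerMetrics].sort(byStep);
  const telnyx = [...telnyxMetrics].sort(byStep);
  // Only compare responses that exist on both sides.
  const count = Math.min(provider.length, telnyx.length);
  const pairs = [];
  for (let i = 0; i < count; i++) {
    pairs.push({
      response: i + 1,
      providerMs: provider[i].ms,
      telnyxMs: telnyx[i].ms,
      deltaMs: telnyx[i].ms - provider[i].ms,
    });
  }
  return pairs;
}

// The two runs use different absolute step numbers (assumed values),
// yet the first measured response on each side is still compared together.
const pairs = alignByResponseIndex(
  [{ step: 7, ms: 2849 }, { step: 10, ms: 3307 }],
  [{ step: 9, ms: 1552 }, { step: 12, ms: 704 }]
);
console.log(pairs.map((p) => `#${p.response}: ${p.deltaMs}ms`).join(", "));
// #1: -1297ms, #2: -2603ms
```

Matching on the absolute step number would find no common keys in this example, which is the kind of mismatch the changelog entry describes.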

package/README.md CHANGED

@@ -3,160 +3,119 @@
 [![CI](https://github.com/team-telnyx/voice-agent-tester/actions/workflows/ci.yml/badge.svg)](https://github.com/team-telnyx/voice-agent-tester/actions/workflows/ci.yml)
 [![npm version](https://img.shields.io/npm/v/@telnyx/voice-agent-tester.svg)](https://www.npmjs.com/package/@telnyx/voice-agent-tester)
-A CLI tool for automated benchmarking and testing of voice AI agents. Supports Telnyx, ElevenLabs, Vapi, and Retell.
+Automated benchmarking CLI for voice AI agents. Import your assistant from any provider, run identical test scenarios on both platforms, and get a side-by-side latency comparison.
-## Quick Start
+Supports **Telnyx**, **ElevenLabs**, **Vapi**, and **Retell**.
-Run directly with npx (no installation required):
+## Compare Your Voice Agent Against Telnyx
-```bash
-npx @telnyx/voice-agent-tester@latest -a applications/telnyx.yaml -s scenarios/appointment.yaml --assistant-id <YOUR_ASSISTANT_ID>
-```
+The tool imports your assistant from an external provider into Telnyx, then runs the **same scenario** on both platforms and produces a head-to-head latency report:
-Or install globally:
-```bash
-npm install -g @telnyx/voice-agent-tester
-voice-agent-tester -a applications/telnyx.yaml -s scenarios/appointment.yaml --assistant-id <YOUR_ASSISTANT_ID>
+```
+📈 Latency Comparison (elapsed_time):
+--------------------------------------------------------------------------------
+Metric                                  vapi        Telnyx      Delta            Winner
+--------------------------------------------------------------------------------
+Response #1 (wait_for_voice_elapsed_time) 2849ms    1552ms      -1297ms (-45.5%) 🏆 Telnyx
+Response #2 (wait_for_voice_elapsed_time) 3307ms    704ms       -2603ms (-78.7%) 🏆 Telnyx
+--------------------------------------------------------------------------------
+📊 Overall Summary:
+   Compared 2 matched response latencies
+   vapi total latency: 6156ms
+   Telnyx total latency: 2256ms
+   Difference: -3900ms (-63.3%)
+   🏆 Result: Telnyx is faster overall
 ```
-## CLI Options
-| Option | Default | Description |
-|--------|---------|-------------|
-| `-a, --applications` | required | Application config path(s) or folder |
-| `-s, --scenarios` | required | Scenario config path(s) or folder |
-| `--assistant-id` | | Telnyx or provider assistant ID |
-| `--api-key` | | Telnyx API key for authentication |
-| `--provider` | | Import from provider (`vapi`, `elevenlabs`, `retell`) |
-| `--provider-api-key` | | External provider API key (required with `--provider`) |
-| `--provider-import-id` | | Provider assistant ID to import (required with `--provider`) |
-| `--share-key` | | Vapi share key for comparison mode (prompted if missing) |
-| `--branch-id` | | ElevenLabs branch ID for comparison mode (prompted if missing) |
-| `--compare` | `true` | Run both provider direct and Telnyx import benchmarks |
-| `--no-compare` | | Disable comparison (run only Telnyx import) |
-| `-d, --debug` | `false` | Enable detailed timeout diagnostics |
-| `-v, --verbose` | `false` | Show browser console logs |
-| `--headless` | `true` | Run browser in headless mode |
-| `--repeat` | `1` | Number of repetitions per combination |
-| `-c, --concurrency` | `1` | Number of parallel tests |
-| `-r, --report` | | Generate CSV report to specified file |
-| `-p, --params` | | URL template params (e.g., `key=value,key2=value2`) |
-| `--application-tags` | | Filter applications by comma-separated tags |
-| `--scenario-tags` | | Filter scenarios by comma-separated tags |
-| `--assets-server` | `http://localhost:3333` | Assets server URL |
-| `--audio-url` | | URL to audio file to play as input during entire benchmark |
-| `--audio-volume` | `1.0` | Volume level for audio input (0.0 to 1.0) |
-## Bundled Configs
-| Application Config | Provider |
-|-------------------|----------|
-| `applications/telnyx.yaml` | Telnyx AI Widget |
-| `applications/elevenlabs.yaml` | ElevenLabs |
-| `applications/vapi.yaml` | Vapi |
-| `applications/retell.yaml` | Retell |
-| `applications/livetok.yaml` | Livetok |
-Scenarios:
-- `scenarios/appointment.yaml` - Basic appointment booking test
-- `scenarios/appointment_with_noise.yaml` - Appointment with background noise (pre-mixed audio)
-## Background Noise Testing
-Test voice agents' performance with ambient noise (e.g., crowd chatter, cafe environment). Background noise is pre-mixed into audio files to simulate real-world conditions where users speak to voice agents in noisy environments.
-### Running with Background Noise
+### Vapi vs Telnyx
 ```bash
-# Telnyx with background noise
-npx @telnyx/voice-agent-tester@latest \
-  -a applications/telnyx.yaml \
-  -s scenarios/appointment_with_noise.yaml \
-  --assistant-id <YOUR_ASSISTANT_ID>
-# Compare with no noise (same assistant)
 npx @telnyx/voice-agent-tester@latest \
   -a applications/telnyx.yaml \
   -s scenarios/appointment.yaml \
-  --assistant-id <YOUR_ASSISTANT_ID>
+  --provider vapi \
+  --share-key <VAPI_SHARE_KEY> \
+  --api-key <TELNYX_API_KEY> \
+  --provider-api-key <VAPI_API_KEY> \
+  --provider-import-id <VAPI_ASSISTANT_ID>
+```
+### ElevenLabs vs Telnyx
-# Generate CSV report with metrics
+```bash
 npx @telnyx/voice-agent-tester@latest \
   -a applications/telnyx.yaml \
-  -s scenarios/appointment_with_noise.yaml \
-  --assistant-id <YOUR_ASSISTANT_ID> \
-  -r output/noise_benchmark.csv
+  -s scenarios/appointment.yaml \
+  --provider elevenlabs \
+  --api-key <TELNYX_API_KEY> \
+  --provider-api-key <ELEVENLABS_API_KEY> \
+  --provider-import-id <ELEVENLABS_AGENT_ID>
 ```
-### Custom Audio Input from URL
-Play any audio file from a URL as input throughout the entire benchmark run. The audio is sent to the voice agent as microphone input.
+### Retell vs Telnyx
 ```bash
-# Use custom audio input from URL
 npx @telnyx/voice-agent-tester@latest \
   -a applications/telnyx.yaml \
   -s scenarios/appointment.yaml \
-  --assistant-id <YOUR_ASSISTANT_ID> \
-  --audio-url "https://example.com/test-audio.mp3" \
-  --audio-volume 0.8
+  --provider retell \
+  --api-key <TELNYX_API_KEY> \
+  --provider-api-key <RETELL_API_KEY> \
+  --provider-import-id <RETELL_AGENT_ID>
 ```
-This is useful for:
-- Testing with custom audio inputs
-- Using longer audio tracks that play throughout the benchmark
-- A/B testing different audio sources
+### How Comparison Works
-### Bundled Audio Files
+1. **Import** — The assistant is imported from the external provider into Telnyx
+2. **Phase 1: Provider Direct** — Runs the scenario on the provider's native widget
+3. **Phase 2: Telnyx Import** — Runs the same scenario on the Telnyx-imported assistant
+4. **Report** — Produces a side-by-side comparison with latency delta and winner per response
-| File | Description |
-|------|-------------|
-| `hello_make_an_appointment.mp3` | Clean appointment request |
-| `hello_make_an_appointment_with_noise.mp3` | Appointment request with crowd noise |
-| `appointment_data.mp3` | Clean appointment details |
-| `appointment_data_with_noise.mp3` | Appointment details with crowd noise |
+### Provider-Specific Keys
-### Scenario Configuration
+Some providers need an extra key to load their demo widget. If not passed via CLI, the tool prompts with instructions.
-The noise scenario uses pre-mixed audio files:
+| Provider | Flag | Required? | How to find it |
+|----------|------|-----------|----------------|
+| Vapi | `--share-key` | Yes | Dashboard → select assistant → click 🔗 link icon next to the assistant ID |
+| ElevenLabs | `--branch-id` | No | Dashboard → Agents → select agent → Publish dropdown → "Copy shareable link" |
-```yaml
-# scenarios/appointment_with_noise.yaml
-tags:
-  - default
-  - noise
-steps:
-  - action: wait_for_voice
-  - action: wait_for_silence
-  - action: sleep
-    time: 1000
-  - action: speak
-    file: hello_make_an_appointment_with_noise.mp3
-  - action: wait_for_voice
-    metrics: elapsed_time
-  - action: wait_for_silence
-  - action: speak
-    file: appointment_data_with_noise.mp3
-  - action: wait_for_voice
-    metrics: elapsed_time
+### Import Only (Skip Comparison)
+To import without running the provider benchmark:
+```bash
+npx @telnyx/voice-agent-tester@latest \
+  -a applications/telnyx.yaml \
+  -s scenarios/appointment.yaml \
+  --provider vapi \
+  --no-compare \
+  --api-key <TELNYX_API_KEY> \
+  --provider-api-key <VAPI_API_KEY> \
+  --provider-import-id <VAPI_ASSISTANT_ID>
 ```
-### Metrics and Reports
+## Quick Start
-The benchmark collects response latency metrics at each `wait_for_voice` step with `metrics: elapsed_time`. Generated CSV reports include:
+Run directly with npx (no installation required):
-```csv
-app, scenario, repetition, success, duration, step_9_wait_for_voice_elapsed_time, step_12_wait_for_voice_elapsed_time
-telnyx, appointment_with_noise, 0, 1, 29654, 1631, 1225
+```bash
+npx @telnyx/voice-agent-tester@latest \
+  -a applications/telnyx.yaml \
+  -s scenarios/appointment.yaml \
+  --assistant-id <YOUR_ASSISTANT_ID>
 ```
-Compare results with and without noise to measure how background noise affects your voice agent's:
-- Response latency
-- Speech recognition accuracy
-- Overall conversation flow
+Or install globally:
-## Examples
+```bash
+npm install -g @telnyx/voice-agent-tester
+voice-agent-tester -a applications/telnyx.yaml -s scenarios/appointment.yaml --assistant-id <YOUR_ASSISTANT_ID>
+```
+## Provider Examples
 ### Telnyx
@@ -185,78 +144,143 @@ npx @telnyx/voice-agent-tester@latest \
   --assistant-id <ASSISTANT_ID>
 ```
-## Comparison Mode
+## CLI Reference
-When importing from an external provider, the tool automatically runs both benchmarks in sequence and generates a comparison report:
+| Option | Default | Description |
+|--------|---------|-------------|
+| `-a, --applications` | required | Application config path(s) or folder |
+| `-s, --scenarios` | required | Scenario config path(s) or folder |
+| `--assistant-id` | | Telnyx or provider assistant ID |
+| `--api-key` | | Telnyx API key |
+| `--provider` | | Import from provider (`vapi`, `elevenlabs`, `retell`) |
+| `--provider-api-key` | | External provider API key |
+| `--provider-import-id` | | Provider assistant/agent ID to import |
+| `--share-key` | | Vapi share key for comparison mode |
+| `--branch-id` | | ElevenLabs branch ID (optional) |
+| `--compare` | `true` | Run provider direct + Telnyx import benchmarks |
+| `--no-compare` | | Skip provider direct benchmark |
+| `-d, --debug` | `false` | Detailed timeout diagnostics |
+| `-v, --verbose` | `false` | Show browser console logs |
+| `--headless` | `true` | Run browser in headless mode |
+| `--repeat` | `1` | Repetitions per app+scenario combination |
+| `-c, --concurrency` | `1` | Parallel test runs |
+| `-r, --report` | | CSV report output path |
+| `-p, --params` | | URL template params (`key=value,key2=value2`) |
+| `--retries` | `0` | Retry failed runs |
+| `--application-tags` | | Filter applications by tags |
+| `--scenario-tags` | | Filter scenarios by tags |
+| `--record` | `false` | Record video+audio (webm) |
+| `--audio-url` | | URL to audio file played as input during run |
+| `--audio-volume` | `1.0` | Audio input volume (0.0–1.0) |
+| `--assets-server` | `http://localhost:3333` | Assets server URL |
-1. **Provider Direct** - Benchmarks the assistant on the original provider's widget
-2. **Telnyx Import** - Benchmarks the same assistant after importing to Telnyx
+## Bundled Configs
-### Provider-Specific Keys
+**Applications:**
-Comparison mode requires a provider-specific key to load the provider's direct widget. If not passed via CLI, the tool will prompt you with instructions on how to find it.
+| Config | Provider |
+|--------|----------|
+| `applications/telnyx.yaml` | Telnyx AI Widget |
+| `applications/elevenlabs.yaml` | ElevenLabs |
+| `applications/vapi.yaml` | Vapi |
+| `applications/retell.yaml` | Retell |
+**Scenarios:**
-| Provider | Flag | How to find it |
-|----------|------|----------------|
-| Vapi | `--share-key` | In the Vapi Dashboard, select your assistant, then click the link icon (🔗) next to the assistant ID at the top. This copies the demo link containing your share key. |
-| ElevenLabs | `--branch-id` | In the ElevenLabs Dashboard, go to Agents, select your target agent, then click the dropdown next to Publish and select "Copy shareable link". This copies the demo link containing your branch ID. |
+| Config | Description |
+|--------|-------------|
+| `scenarios/appointment.yaml` | Appointment booking test |
+| `scenarios/appointment_with_noise.yaml` | Appointment with background crowd noise |
-### Import and Compare (Default)
+## Background Noise Testing
-**Vapi:**
+Test how voice agents perform with ambient noise by using pre-mixed audio files:
 ```bash
+# With background noise
+npx @telnyx/voice-agent-tester@latest \
+  -a applications/telnyx.yaml \
+  -s scenarios/appointment_with_noise.yaml \
+  --assistant-id <ASSISTANT_ID>
+# Without noise (same assistant, compare results)
 npx @telnyx/voice-agent-tester@latest \
   -a applications/telnyx.yaml \
   -s scenarios/appointment.yaml \
-  --provider vapi \
-  --share-key <VAPI_SHARE_KEY> \
-  --api-key <TELNYX_KEY> \
-  --provider-api-key <VAPI_KEY> \
-  --provider-import-id <VAPI_ASSISTANT_ID>
+  --assistant-id <ASSISTANT_ID>
 ```
-**ElevenLabs:**
+### Custom Audio Input
+Play any audio file from a URL as microphone input throughout the benchmark:
 ```bash
 npx @telnyx/voice-agent-tester@latest \
   -a applications/telnyx.yaml \
   -s scenarios/appointment.yaml \
-  --provider elevenlabs \
-  --branch-id <ELEVENLABS_BRANCH_ID> \
-  --api-key <TELNYX_KEY> \
-  --provider-api-key <ELEVENLABS_KEY> \
-  --provider-import-id <ELEVENLABS_AGENT_ID>
+  --assistant-id <ASSISTANT_ID> \
+  --audio-url "https://example.com/test-audio.mp3" \
+  --audio-volume 0.8
 ```
-This will:
-- Run Phase 1: Provider direct benchmark
-- Run Phase 2: Telnyx import benchmark
-- Generate a side-by-side latency comparison report
+### Audio Assets
-### Import Only (No Comparison)
+| File | Description |
+|------|-------------|
+| `hello_make_an_appointment.mp3` | Clean appointment request |
+| `hello_make_an_appointment_with_noise.mp3` | Appointment request + crowd noise |
+| `appointment_data.mp3` | Clean appointment details |
+| `appointment_data_with_noise.mp3` | Appointment details + crowd noise |
-To skip the provider direct benchmark and only run the Telnyx import:
+## Scenario Configuration
-```bash
-npx @telnyx/voice-agent-tester@latest \
-  -a applications/telnyx.yaml \
-  -s scenarios/appointment.yaml \
-  --provider vapi \
-  --no-compare \
-  --api-key <TELNYX_KEY> \
-  --provider-api-key <VAPI_KEY> \
-  --provider-import-id <VAPI_ASSISTANT_ID>
+Scenarios are YAML files with a sequence of steps. Steps with `metrics: elapsed_time` are included in the latency report.
+```yaml
+# scenarios/appointment.yaml
+steps:
+  - action: wait_for_voice        # Wait for agent greeting
+  - action: wait_for_silence      # Wait for greeting to finish
+  - action: speak
+    file: hello_make_an_appointment.mp3
+  - action: wait_for_voice        # ← Measured: time to first response
+    metrics: elapsed_time
+  - action: wait_for_silence
+  - action: speak
+    file: appointment_data.mp3
+  - action: wait_for_voice        # ← Measured: time to second response
+    metrics: elapsed_time
 ```
-### Debugging Failures
+### Available Actions
+| Action | Description |
+|--------|-------------|
+| `speak` | Play audio (`file`) or synthesize text (`text`) as microphone input |
+| `wait_for_voice` | Wait for the AI agent to start speaking |
+| `wait_for_silence` | Wait for the AI agent to stop speaking |
+| `sleep` | Pause for a fixed duration (`time` in ms) |
+| `click` | Click an element (`selector`) |
+| `click_with_retry` | Click with retries and connection verification |
+| `wait_for_element` | Wait for a DOM element to appear |
+| `type` | Type text into an input field |
+| `fill` | Set an input field value directly |
+| `select` | Select dropdown/checkbox/radio option |
+| `screenshot` | Capture a screenshot |
+| `listen` | Record agent audio, transcribe, and evaluate |
-If benchmarks fail, rerun with `--debug` for detailed diagnostics:
+## Debugging
+If benchmarks fail or time out, use `--debug` for detailed diagnostics including audio monitor state, WebRTC connection info, and RTP stats:
 ```bash
-voice-agent-tester --provider vapi --debug [other options...]
+npx @telnyx/voice-agent-tester@latest \
+  -a applications/telnyx.yaml \
+  -s scenarios/appointment.yaml \
+  --assistant-id <ASSISTANT_ID> \
+  --debug
 ```
 ## License
-MIT
+MIT
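
The Delta column in the new README's sample report can be reproduced from the two latency columns. The sketch below assumes the delta is Telnyx minus provider and that the percentage is relative to the provider value; `formatDelta` is a hypothetical name, not a function from `src/report.js`.

```javascript
// Illustrative only: recomputes the Delta column of the sample report.
function formatDelta(providerMs, telnyxMs) {
  const deltaMs = telnyxMs - providerMs;
  const pct = (deltaMs / providerMs) * 100;
  return `${deltaMs}ms (${pct.toFixed(1)}%)`;
}

console.log(formatDelta(2849, 1552)); // -1297ms (-45.5%)
console.log(formatDelta(3307, 704));  // -2603ms (-78.7%)
```

Applying the same arithmetic to the rounded totals (6156ms and 2256ms) gives -63.4%, while the sample shows -63.3%. That would be consistent with the report working from unrounded timings, but the diff shown here does not confirm it.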

package/applications/elevenlabs.yaml CHANGED

@@ -1,4 +1,4 @@
-url: "https://elevenlabs.io/app/talk-to?agent_id={{assistantId}}&branch_id={{branchId}}"
+url: "https://elevenlabs.io/app/talk-to?agent_id={{assistantId}}"
 tags:
   - provider
   - elevenlabs
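
Dropping `branch_id` from the URL template matches the changelog entry that makes `--branch-id` optional. How the package substitutes `{{...}}` placeholders is not shown in this diff, so the renderer below is a hypothetical sketch of the general technique, with an assumed function name and an assumed agent ID.

```javascript
// Hypothetical renderer for {{name}} placeholders in an application URL.
function renderUrlTemplate(template, params) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(params, name)
      ? encodeURIComponent(params[name])
      : match // leave unknown placeholders untouched
  );
}

const url = renderUrlTemplate(
  "https://elevenlabs.io/app/talk-to?agent_id={{assistantId}}",
  { assistantId: "agent_123" } // assumed example value
);
console.log(url); // https://elevenlabs.io/app/talk-to?agent_id=agent_123
```

With the 0.4.3 template, the same call would leave `{{branchId}}` unresolved in the query string whenever no branch ID was supplied.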

package/javascript/audio_input_hooks.js CHANGED

@@ -62,20 +62,24 @@ function createControlledMediaStream() {
 }
 // Replace getUserMedia to return our controlled stream
-const originalGetUserMedia = navigator.mediaDevices.getUserMedia.bind(navigator.mediaDevices);
-navigator.mediaDevices.getUserMedia = function (constraints) {
-  console.log("🎤 Intercepted getUserMedia call with constraints:", constraints);
-  // If audio is requested, return our controlled stream
-  if (constraints && constraints.audio) {
-    console.log("🎤 Returning controlled MediaStream instead of real microphone");
-    const controlledStream = createControlledMediaStream();
-    return Promise.resolve(controlledStream);
-  }
+if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
+  const originalGetUserMedia = navigator.mediaDevices.getUserMedia.bind(navigator.mediaDevices);
+  navigator.mediaDevices.getUserMedia = function (constraints) {
+    console.log("🎤 Intercepted getUserMedia call with constraints:", constraints);
+    // If audio is requested, return our controlled stream
+    if (constraints && constraints.audio) {
+      console.log("🎤 Returning controlled MediaStream instead of real microphone");
+      const controlledStream = createControlledMediaStream();
+      return Promise.resolve(controlledStream);
+    }
-  // For video-only or other requests, use original implementation
-  return originalGetUserMedia(constraints);
-};
+    // For video-only or other requests, use original implementation
+    return originalGetUserMedia(constraints);
+  };
+} else {
+  console.warn("🎤 navigator.mediaDevices.getUserMedia not available, skipping microphone intercept");
+}
 // Expose __speak method to be called from voice-agent-tester.js
 window.__speak = function (textOrUrl) {
@@ -152,6 +156,24 @@ function playAudioInMediaStream(url) {
   const audio = new Audio(url);
   audio.crossOrigin = 'anonymous'; // Enable CORS if needed
+  // Keep a strong reference so the element is not garbage collected
+  currentSpeakAudio = audio;
+  let speechEndFired = false;
+  let safetyTimeoutId = null;
+  function fireSpeechEnd(reason) {
+    if (speechEndFired) return;
+    speechEndFired = true;
+    if (safetyTimeoutId) clearTimeout(safetyTimeoutId);
+    console.log(`🎤 Audio playback ended (${reason})`);
+    if (typeof __publishEvent === 'function') {
+      __publishEvent('speechend', { url: url, reason: reason });
+    }
+    // Release reference
+    if (currentSpeakAudio === audio) currentSpeakAudio = null;
+  }
   // Set up audio routing through all MediaStreams
   audio.addEventListener('canplaythrough', function () {
     console.log(`🎤 Audio ready to play, routing to ${mediaStreams.length} MediaStreams`);
@@ -181,7 +203,33 @@ function playAudioInMediaStream(url) {
       }
       // Play the audio
-      audio.play();
+      audio.play().then(() => {
+        // Set up safety timeout based on audio duration
+        // audio.duration should be available after canplaythrough
+        const duration = audio.duration;
+        if (duration && isFinite(duration)) {
+          const safetyMs = Math.max((duration * 1000) + 5000, 15000);
+          console.log(`🎤 Audio duration: ${duration.toFixed(1)}s, safety timeout: ${(safetyMs / 1000).toFixed(1)}s`);
+          safetyTimeoutId = setTimeout(() => {
+            if (!speechEndFired) {
+              console.warn(`🎤 Safety timeout: speechend not fired after ${(safetyMs / 1000).toFixed(1)}s (audio paused=${audio.paused}, ended=${audio.ended}, currentTime=${audio.currentTime.toFixed(1)})`);
+              fireSpeechEnd('safety_timeout');
+            }
+          }, safetyMs);
+        } else {
+          // Unknown duration — use 20s fallback
+          console.warn('🎤 Audio duration unknown, using 20s safety timeout');
+          safetyTimeoutId = setTimeout(() => {
+            if (!speechEndFired) {
+              console.warn('🎤 Safety timeout: speechend not fired after 20s');
+              fireSpeechEnd('safety_timeout');
+            }
+          }, 20000);
+        }
+      }).catch(error => {
+        console.error('Error playing audio:', error);
+        fireSpeechEnd('play_error');
+      });
     } catch (error) {
       console.error('Error setting up audio source:', error);
       if (typeof __publishEvent === 'function') {
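
The safety-timeout rule in the hunk above can be read as a pure function: the clip length plus a 5 second margin, never less than 15 seconds, and a flat 20 seconds when the duration is unknown. The sketch below restates that rule for numeric inputs; `safetyTimeoutMs` is an illustrative name that does not appear in the package.

```javascript
// Restates the timeout rule from the hunk above as a standalone function.
function safetyTimeoutMs(durationSeconds) {
  // 0, NaN and Infinity all count as "unknown duration".
  if (durationSeconds && Number.isFinite(durationSeconds)) {
    return Math.max(durationSeconds * 1000 + 5000, 15000);
  }
  return 20000;
}

console.log(safetyTimeoutMs(2));        // 15000 (the floor applies)
console.log(safetyTimeoutMs(30));       // 35000
console.log(safetyTimeoutMs(NaN));      // 20000
console.log(safetyTimeoutMs(Infinity)); // 20000
```

One consequence of the floor is that a short clip which never fires `ended` still holds the step for 15 seconds before `speechend` is published.
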
@@ -190,11 +238,19 @@ function playAudioInMediaStream(url) {
     }
   });
-  // Handle audio end
+  // Handle audio end — primary path
   audio.addEventListener('ended', function () {
-    console.log('🎤 Audio playback ended');
-    if (typeof __publishEvent === 'function') {
-      __publishEvent('speechend', { url: url });
+    fireSpeechEnd('ended');
+  });
+  // Handle pause — if something pauses the audio externally
+  audio.addEventListener('pause', function () {
+    // Only treat as speechend if the audio is past 90% of its duration (near end)
+    // or if it was paused externally (not by us)
+    if (audio.ended || (audio.duration && audio.currentTime >= audio.duration * 0.9)) {
+      fireSpeechEnd('pause_near_end');
+    } else {
+      console.warn(`🎤 Audio paused at ${audio.currentTime.toFixed(1)}s / ${(audio.duration || 0).toFixed(1)}s`);
     }
   });
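
`fireSpeechEnd` is a fire-once guard: the `ended`, `pause`, `error`, and timeout paths may all race, but only the first one publishes `speechend`. A minimal standalone sketch of that pattern is shown below; `createOnceNotifier` and its method names are hypothetical, and the callback stands in for `__publishEvent`.

```javascript
// Hypothetical helper: several triggers may race, only the first publishes.
function createOnceNotifier(publish) {
  let fired = false;
  let timeoutId = null;

  function fire(reason) {
    if (fired) return false; // a later trigger is ignored
    fired = true;
    if (timeoutId !== null) clearTimeout(timeoutId);
    publish(reason);
    return true;
  }

  function armTimeout(ms) {
    // Fallback in case no other trigger ever arrives.
    timeoutId = setTimeout(() => fire("safety_timeout"), ms);
  }

  return { fire, armTimeout };
}

const reasons = [];
const notifier = createOnceNotifier((reason) => reasons.push(reason));
notifier.armTimeout(15000);
notifier.fire("ended");          // publishes and cancels the timeout
notifier.fire("pause_near_end"); // ignored
console.log(reasons); // [ 'ended' ]
```

Clearing the timer inside `fire` keeps a late safety timeout from reporting a second, spurious end of speech.
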
@@ -204,17 +260,31 @@ function playAudioInMediaStream(url) {
     if (typeof __publishEvent === 'function') {
       __publishEvent('speecherror', { error: 'Audio playback failed', url: url });
     }
+    fireSpeechEnd('error');
   });
   // Start loading the audio
   audio.load();
 }
+// Keep a reference to the current speak Audio element so it doesn't get GC'd
+let currentSpeakAudio = null;
 // Helper function to stop current audio and reset to silence
 function stopCurrentAudio() {
+  // Stop the speak audio element if playing
+  if (currentSpeakAudio) {
+    try {
+      currentSpeakAudio.pause();
+      currentSpeakAudio.currentTime = 0;
+    } catch (e) {
+      console.warn('Error stopping speak audio:', e);
+    }
+    currentSpeakAudio = null;
+  }
   currentPlaybackNodes.forEach((sourceNode, index) => {
     try {
-      sourceNode.stop();
       sourceNode.disconnect();
       console.log(`🎤 Stopped audio source ${index}`);
     } catch (e) {