@craftedxp/voice-js 0.3.1 → 0.4.0

package/CONSUMING.md CHANGED
@@ -102,7 +102,7 @@ Browsers require a user gesture to start `AudioContext`. The SDK calls `audioCon
 
  For consumers running on a strict CSP, allow:
 
- - `connect-src wss://your-voxline-server.com`
+ - `connect-src wss://your-voissia-server.com`
  - `worker-src 'self' blob:` (the audio worklet is registered from a Blob URL)
 
  Browsers also need `https` for `getUserMedia` (or `localhost` during dev).
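
As an illustration of the two CSP directives listed in this hunk, the following is a minimal sketch of assembling the header value on the serving side. The helper name `buildVoiceCsp` is hypothetical and not part of the SDK, and the added `'self'` source on `connect-src` is an assumption (so the page can still reach its own backend for `fetchToken`).

```typescript
// Hypothetical helper (not part of @craftedxp/voice-js): builds a
// Content-Security-Policy value containing the two directives from the docs.
function buildVoiceCsp(voiceServerOrigin: string): string {
  const directives: Record<string, string[]> = {
    // 'self' is an assumption so the app can still call its own backend.
    "connect-src": ["'self'", voiceServerOrigin],
    // blob: is needed because the audio worklet is registered from a Blob URL.
    "worker-src": ["'self'", "blob:"],
  };
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

// Placeholder origin taken from the documentation line above.
const csp = buildVoiceCsp("wss://your-voissia-server.com");
console.log(csp);
```

The resulting string can be sent as the `Content-Security-Policy` response header; a real policy would normally carry further directives that are out of scope here.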
package/README.md CHANGED
@@ -4,7 +4,7 @@ JS SDK for embedding a voice agent call in any JS environment — browser tabs,
 
  Companion to [`@craftedxp/voice-rn`](https://www.npmjs.com/package/@craftedxp/voice-rn) (React Native) and [`@craftedxp/sdk-node`](https://www.npmjs.com/package/@craftedxp/sdk-node) (server-side `sk_` SDK).
 
- > **Internal testing release.** API surface may evolve before a stable release. **0.3.1** adds Node-consumer ergonomics: `onInterrupt`/`onAgentTurnStart` callbacks on `startCall`, and the `NodeVoiceClientFactory` return type from the Node entry. **0.3.0** added [client tools](#client-tools) — handlers the agent's LLM can call on the consumer's machine. **0.2.0** was a breaking rename + redesign of the previous `@voxline/web@0.1.0` — the singleton-`VoiceClient`-with-`apiKey` pattern is gone in favour of a `configureVoiceClient({ fetchToken })` factory that mirrors `voice-rn` 0.3.x. See [Migrating from `@voxline/web`](#migrating-from-voxlineweb) below.
+ > **Internal testing release.** API surface may evolve before a stable release. **0.3.2** is a bug-fix release — `onStateChange` now fires correctly for state transitions driven by server frames; since 0.2.0 the callback had been silently swallowed for `connected → listening`, `agent_turn_start → agent_speaking`, etc. Consumers using only `onTranscript` were unaffected; anyone building UI from `onStateChange` should upgrade. **0.3.1** added Node-consumer ergonomics (`onInterrupt`/`onAgentTurnStart` callbacks, `NodeVoiceClientFactory` return type); those depend on the state-callback path, so 0.3.2 is the minimum recommended version. **0.3.0** added [client tools](#client-tools) — handlers the agent's LLM can call on the consumer's machine. **0.2.0** was a breaking rename + redesign of the previous `@voxline/web@0.1.0` — the singleton-`VoiceClient`-with-`apiKey` pattern is gone in favour of a `configureVoiceClient({ fetchToken })` factory that mirrors `voice-rn` 0.3.x. See [Migrating from `@voxline/web`](#migrating-from-voxlineweb) below.
 
  ## Install
 
@@ -22,11 +22,11 @@ The same three-party flow as `voice-rn`. Your backend mints `ct_` tokens with it
 
  ```
  ┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
- │ Your web app │ │ Your backend │ │ Voxline server │
+ │ Your web app │ │ Your backend │ │ Voissia server │
  │ │ │ │ │ │
- │ fetchToken ────┼───────►│ call Voxline ──┼───────►│ mint ct_ │
- │ │ │ │ with sk_ │ │ │ │
- │ │◄───────┼────────┼──── ct_ ────────┼────────┼─── ct_ │
+ │ fetchToken ────┼───────►│ call Voissia ──┼───────►│ mint ct_ │
+ │ │ │ │ with sk_ │ │ │ │
+ │ │◄───────┼────────┼──── ct_ ────────┼────────┼─── ct_ │
  │ startCall(...) ┼────────┼──── WSS /v1/agents/.../call?token=ct_ ─────►│
  └─────────────────┘ └──────────────────┘ └─────────────────┘
  ```
@@ -133,7 +133,7 @@ The Node bundle has the same `configureVoiceClient` / `startCall` shape, plus an
 
  | Field | Type | Notes |
  | ----------------- | --------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
- | `apiBase` | `string` | Full HTTPS URL of the Voxline server. WS scheme derived: `https`→`wss`. Trailing slash optional. |
+ | `apiBase` | `string` | Full HTTPS URL of the Voissia server. WS scheme derived: `https`→`wss`. Trailing slash optional. |
  | `fetchToken` | `(args) => Promise<string>` | Called by the SDK whenever it needs a fresh `ct_`. Mirrors `@craftedxp/voice-rn`'s shape exactly — `{ agentId, userId?, context?, metadata? }`. |
  | `defaultMetadata` | `Record<string, string>?` | Applied to every `startCall`. Per-call merges on top. |
  | `defaultContext` | `Record<string, unknown>?` | Applied to every `startCall`. Per-call merges on top. |
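
The `apiBase` row states that the WebSocket scheme is derived from the HTTP scheme and that a trailing slash is optional. A minimal sketch of that rule follows. The function name `deriveWsBase` is hypothetical; the package exports its own `buildWsUrl`, whose signature is not shown in this diff, so this should not be read as its implementation.

```typescript
// Hypothetical illustration of the documented rule, not the SDK's code:
// `https` -> `wss`, `http` -> `ws`, trailing slashes ignored.
function deriveWsBase(apiBase: string): string {
  if (!/^https?:\/\//.test(apiBase)) {
    throw new Error(`apiBase must start with http:// or https://, got: ${apiBase}`);
  }
  return apiBase
    .replace(/^http/, "ws") // "https://..." becomes "wss://...", "http://..." becomes "ws://..."
    .replace(/\/+$/, ""); // drop any trailing slash so path joins stay predictable
}

// Example hosts are placeholders.
console.log(deriveWsBase("https://voice.example.com/"));
console.log(deriveWsBase("http://localhost:8080"));
```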
@@ -344,7 +344,8 @@ Renders a floating call button with a Shadow-DOM transcript panel. Pre-mint the
 
  ## Status
 
- - **0.3.1** (current) — adds `onInterrupt` / `onAgentTurnStart` callbacks on `StartCallOptions` and `NodeVoiceClientFactory` proper return type for the Node entry. Backwards-compatible.
+ - **0.3.2** (current) — bug fix: `onStateChange` now fires for state transitions driven by server frames (`connected → listening`, `agent_turn_start → agent_speaking`, etc.). Latent regression since 0.2.0; `onTranscript`-only consumers were unaffected, but anyone deriving UI from `onStateChange` should upgrade. No API changes; drop-in.
+ - 0.3.1 — adds `onInterrupt` / `onAgentTurnStart` callbacks on `StartCallOptions` and `NodeVoiceClientFactory` proper return type for the Node entry. Backwards-compatible. **Use 0.3.2 instead** — both new callbacks depend on the state-callback path that 0.3.2 fixes.
  - 0.3.0 — adds client-tools support. New `clientTools` option on `startCall` accepts a `ClientToolMap` (description, parameters, handler, optional usage/timeoutMs/example). Browser and Node bundles both supported. Backwards-compatible — existing consumers see no change.
  - 0.2.0 — first `@craftedxp/voice-js` release. Browser + Node dual bundle, `fetchToken` factory, voice-rn 0.3.x parity. Migration path from `@voxline/web@0.1.0` documented above.
  - 0.1.0 — `@voxline/web`. Singleton `VoiceClient` class, `apiKey` accepted. Retired in 0.2.0; never published to npm so no deprecation window.
@@ -64,7 +64,8 @@ interface ProtocolCallbacks {
      onTranscript: (entries: TranscriptEntry[]) => void;
      onError: (err: CallError) => void;
      onInterrupt: () => void;
-     onAgentTurnStart: () => void;
+     onAgentTurnStart: (seq?: number) => void;
+     onAgentTurnEnd: (seq?: number) => void;
      onCallEnd: (reason: CallEndReason) => void;
      onConnected: () => void;
      onClientToolCall: (frame: ClientToolCallFrame) => void;
@@ -98,10 +99,25 @@ interface FetchTokenArgs {
       */
      metadata?: Record<string, string>;
  }
- type FetchToken = (args: FetchTokenArgs) => Promise<string>;
+ /**
+  * What `fetchToken` may return. The rich object form lets the server
+  * choose the transport per call. Returning a bare string is backwards-
+  * compatible — the SDK treats it as `{ token, transport: 'ws' }`.
+  */
+ interface FetchTokenResult {
+     /** Raw `ct_` to feed into the WS open / WebRTC offer. */
+     token: string;
+     /** Server-selected transport. Default `'ws'` if absent. */
+     transport?: 'ws' | 'webrtc';
+     /** Required when `transport === 'webrtc'` AND the server uses a
+      * separate signaling gateway. When omitted on a webrtc result, the
+      * SDK falls back to the API base's Phase-1 routes (local dev). */
+     webrtcGatewayBase?: string;
+ }
+ type FetchToken = (args: FetchTokenArgs) => Promise<string | FetchTokenResult>;
  interface VoiceClientConfig {
      /**
-      * Full HTTPS URL of the Voxline server. The WebSocket scheme is
+      * Full HTTPS URL of the Voissia server. The WebSocket scheme is
       * derived: `https` → `wss`, `http` → `ws`. No trailing slash needed.
       */
      apiBase: string;
@@ -318,4 +334,4 @@ type ReconnectingWebSocket = ReturnType<typeof createReconnectingWebSocket>;
   */
  declare function configureVoiceClient(config: VoiceClientConfig): VoiceClientFactory;
 
- export { type Call, type CallEndEvent, type CallEndReason, type CallError, type CallErrorCode, type CallState, type CaptureController, type CaptureOptions, type ClientTool, type ClientToolMap, type FetchToken, type FetchTokenArgs, type OnAgentSpeakingChange, type OnChunk, type OnError, type OnVolume$1 as OnVolume, type PlaybackController, type PlaybackOptions, type ProtocolCallbacks, type ProtocolState, type RWSEvent, type RWSOptions, type ReconnectingWebSocket, type ServerMessage, type StartCallOptions, type TranscriptEntry, type VoiceClientConfig, type VoiceClientFactory, type VolumeEvent, type WebSocketFactory, type WebSocketLike, buildWsUrl, configureVoiceClient, createAudioCapture, createAudioPlayback, createProtocolState, createReconnectingWebSocket, handleServerMessage };
+ export { type Call, type CallEndEvent, type CallEndReason, type CallError, type CallErrorCode, type CallState, type CaptureController, type CaptureOptions, type ClientTool, type ClientToolMap, type FetchToken, type FetchTokenArgs, type FetchTokenResult, type OnAgentSpeakingChange, type OnChunk, type OnError, type OnVolume$1 as OnVolume, type PlaybackController, type PlaybackOptions, type ProtocolCallbacks, type ProtocolState, type RWSEvent, type RWSOptions, type ReconnectingWebSocket, type ServerMessage, type StartCallOptions, type TranscriptEntry, type VoiceClientConfig, type VoiceClientFactory, type VolumeEvent, type WebSocketFactory, type WebSocketLike, buildWsUrl, configureVoiceClient, createAudioCapture, createAudioPlayback, createProtocolState, createReconnectingWebSocket, handleServerMessage };
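
The new `FetchTokenResult` type lets `fetchToken` return either a bare string or an object, with a bare string treated as `{ token, transport: 'ws' }` and `transport` defaulting to `'ws'` when absent. A minimal sketch of that normalization rule is shown below. The `FetchTokenResult` shape is copied from the declaration in this diff; the `NormalizedToken` type and the `normalizeFetchTokenResult` helper are hypothetical names for illustration, and the SDK's internal handling may differ.

```typescript
// Shape copied from the type declaration in this diff.
interface FetchTokenResult {
  token: string;
  transport?: "ws" | "webrtc";
  webrtcGatewayBase?: string;
}

// Hypothetical type: the result after defaults have been applied.
interface NormalizedToken {
  token: string;
  transport: "ws" | "webrtc";
  webrtcGatewayBase?: string;
}

// Hypothetical helper illustrating the documented backwards-compatibility rule.
function normalizeFetchTokenResult(raw: string | FetchTokenResult): NormalizedToken {
  if (typeof raw === "string") {
    // A bare string is treated as a WebSocket token.
    return { token: raw, transport: "ws" };
  }
  // The object form keeps its fields; transport defaults to 'ws' if absent.
  return { ...raw, transport: raw.transport ?? "ws" };
}

// Placeholder token values and gateway URL, for illustration only.
const legacy = normalizeFetchTokenResult("ct_example");
const rich = normalizeFetchTokenResult({
  token: "ct_example",
  transport: "webrtc",
  webrtcGatewayBase: "https://gateway.example.com",
});
console.log(legacy.transport, rich.transport);
```

Under this reading, an existing `fetchToken` that resolves to a string keeps working unchanged, while a backend that wants WebRTC for a given call returns the object form.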