@craftedxp/voice-js 0.2.0

package/CONSUMING.md ADDED
@@ -0,0 +1,120 @@
1
+ # Consuming `@craftedxp/voice-js` in your app
2
+
3
+ This guide walks through the three install paths (local tarball: fastest; `file:` dep: middle ground; npm registry: production) plus the platform-specific glue you need on each side.
4
+
5
+ ## TL;DR
6
+
7
+ ```bash
8
+ # in YOUR app
9
+ npm install @craftedxp/voice-js
10
+ # Node consumers also need:
11
+ npm install ws
12
+
13
+ # wire up (the lines below go in your app's JS/TS code, not the shell):
14
+ import { configureVoiceClient } from '@craftedxp/voice-js'
15
+ const voice = configureVoiceClient({ apiBase, fetchToken })
16
+ const call = await voice.startCall({ agentId, ...callbacks })
17
+ ```
18
+
19
+ That's the whole thing in the browser. No native build, no patches, no permissions hassle (the browser's standard mic prompt covers it).
20
+
21
+ ## Path 1 — Local tarball (fastest, no registry)
22
+
23
+ Best for quick verification while iterating on the SDK locally. Same workflow as the React Native SDK.
24
+
25
+ ```bash
26
+ # in the SDK repo
27
+ cd voice-assistant/sdk/voice-js
28
+ npm install
29
+ npm pack # → craftedxp-voice-js-0.2.0.tgz
30
+
31
+ # in YOUR app
32
+ cd /path/to/your-app
33
+ npm install /abs/path/to/voice-assistant/sdk/voice-js/craftedxp-voice-js-0.2.0.tgz
34
+ ```
35
+
36
+ `npm pack` runs `prepare` automatically, which rebuilds `dist/` (browser + node bundles + embed IIFE) into the tarball.
37
+
38
+ ## Path 2 — file: dep (monorepo / side-by-side)
39
+
40
+ For the landing dashboard in this repo:
41
+
42
+ ```jsonc
43
+ {
44
+ "dependencies": {
45
+ "@craftedxp/voice-js": "file:../sdk/voice-js"
46
+ }
47
+ }
48
+ ```
49
+
50
+ `dist/` is checked into git for this dep type because Vercel-style deploys don't reliably run the `prepare` hook for `file:` deps (the same reason `dist/` is committed under `sdk/voice-rn` and `sdk/sdk-node`).
51
+
52
+ After SDK changes, run `npm run build` in `sdk/voice-js/`; your consumer picks up the new `dist/` on the next refresh.
53
+
54
+ ## Path 3 — npm registry (production)
55
+
56
+ ```bash
57
+ npm install @craftedxp/voice-js
58
+ # Node-side consumers:
59
+ npm install ws
60
+ ```
61
+
62
+ `ws` is a peer dependency marked `optional` under `peerDependenciesMeta`, so npm doesn't force-install it for browser-only consumers. Add it explicitly when running under Node / Electron-main.
63
+
64
+ ## Backend setup — minting `ct_` tokens
65
+
66
+ Your `fetchToken` callback hits YOUR backend. Your backend uses the `sk_` API key (held server-side only) to mint a short-lived `ct_` for the SDK to use. Pattern:
67
+
68
+ ```ts
69
+ // YOUR backend (e.g. /api/voice/mint route)
70
+ import { CraftedXP } from '@craftedxp/sdk-node'
71
+ const platform = new CraftedXP({ apiKey: process.env.CRAFTEDXP_SK })
72
+
73
+ export async function POST(req) {
74
+ // Your auth — verify the user is signed in, extract their userId etc.
75
+ const session = await getSession(req)
76
+ if (!session) return new Response('unauthorized', { status: 401 })
77
+
78
+ const { agentId, context, metadata } = await req.json()
79
+
80
+ // Mint with the user's identity. Server attaches contactId, applies the
81
+ // org's tier gate, etc.
82
+ const token = await platform.callTokens.mint({
83
+ agentId,
84
+ contactId: session.userId,
85
+ context,
86
+ metadata,
87
+ ttlSeconds: 600,
88
+ })
89
+ return Response.json({ token: token.token })
90
+ }
91
+ ```
92
+
93
+ The SDK calls this on initial connect and (eventually) on `token_expired` mid-call. Don't share the same `ct_` across browser tabs: each token is scoped to one call session and is short-lived.
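A minimal sketch of the client half of this pattern, assuming the mint route from the example above responds with `{ token }`. `makeFetchToken`, `MintArgs`, and `FetchLike` are illustrative names, not SDK exports; the fetch function is injected so the sketch also runs outside a browser.

```ts
// Hypothetical helper: builds a fetchToken callback that fails loudly.
type MintArgs = {
  agentId: string
  context?: Record<string, unknown>
  metadata?: Record<string, string>
}

// Minimal stand-in for the parts of fetch() this sketch uses.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>

function makeFetchToken(mintUrl: string, fetchImpl: FetchLike) {
  return async (args: MintArgs): Promise<string> => {
    const res = await fetchImpl(mintUrl, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(args),
    })
    // Surface auth failures instead of handing the SDK an undefined token.
    if (!res.ok) throw new Error('mint failed with HTTP ' + res.status)
    const data = (await res.json()) as { token?: unknown }
    if (typeof data.token !== 'string') throw new Error('mint response has no token')
    return data.token
  }
}
```

In a browser this could be wired as `fetchToken: makeFetchToken('/api/voice/mint', (url, init) => fetch(url, init))`.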
94
+
95
+ ## iOS Safari / mobile browsers
96
+
97
+ Browsers require a user gesture to start `AudioContext`. The SDK calls `audioContext.resume()` automatically, but if you call `startCall` outside a click/tap handler, the AudioContext may stay suspended and you won't hear the agent.
98
+
99
+ **Always invoke `startCall` from inside a click handler** (or a `Promise.resolve().then()` chain that started in one). The `getUserMedia` mic prompt also won't fire on iOS without a gesture.
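The gesture rule above can be sketched as follows. `bindStartOnClick` and `ClickTarget` are hypothetical names; the only point being illustrated is that `start()` runs synchronously inside the click listener, before any `await`.

```ts
// Minimal structural type so the sketch does not depend on DOM typings.
type ClickTarget = {
  addEventListener(type: 'click', listener: () => void): void
}

function bindStartOnClick<T>(
  button: ClickTarget,
  start: () => Promise<T>,
  onCall: (call: T) => void,
  onFail: (err: unknown) => void,
): void {
  let starting = false
  button.addEventListener('click', () => {
    if (starting) return // ignore double taps while a call is being set up
    starting = true
    // start() is invoked synchronously here, so it is still part of the
    // user gesture when the SDK touches AudioContext / getUserMedia.
    start().then(
      (call) => { starting = false; onCall(call) },
      (err) => { starting = false; onFail(err) },
    )
  })
}
```

A typical `start` argument would be `() => voice.startCall({ agentId, ...callbacks })`, with `voice` being the factory from the TL;DR.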
100
+
101
+ ## CSP / mic permission
102
+
103
+ For consumers running on a strict CSP, allow:
104
+ - `connect-src wss://your-voxline-server.com`
105
+ - `worker-src 'self' blob:` (the audio worklet is registered from a Blob URL)
106
+
107
+ Browsers also need `https` for `getUserMedia` (or `localhost` during dev).
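As a rough illustration, the two directives listed above could be assembled into a header value like this. `buildVoiceCsp` is a hypothetical helper, and the `default-src 'self'` baseline plus the `'self'` entry in `connect-src` (for your own mint route) are assumptions about your policy, not SDK requirements.

```ts
// Hypothetical helper: builds a Content-Security-Policy value that includes
// the connect-src and worker-src entries the SDK needs.
function buildVoiceCsp(wsOrigin: string): string {
  if (wsOrigin.indexOf('wss://') !== 0) {
    throw new Error('expected a wss:// origin, got: ' + wsOrigin)
  }
  return [
    "default-src 'self'", // assumed baseline; replace with your own policy
    "connect-src 'self' " + wsOrigin,
    "worker-src 'self' blob:",
  ].join('; ')
}
```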
108
+
109
+ ## Debug logging
110
+
111
+ The SDK doesn't log to the console by default. To see protocol-level events, wire all the `onStateChange` / `onTranscript` / `onError` / `onEnd` callbacks and log them yourself — that's the same surface the SDK uses internally.
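One way to do that wiring without touching your real handlers is a small wrapper. This is a sketch: `withDebugLogging` is a hypothetical helper, and the payload types are loosened to `string` / `unknown` so the block stands alone.

```ts
// Payload shapes follow the README quick start; state and transcript
// entries are simplified here.
type DebuggableCallbacks = {
  onStateChange?: (state: string) => void
  onTranscript?: (entries: unknown[]) => void
  onError?: (err: { code: string; message: string }) => void
  onEnd?: (info: { reason: string; durationMs: number }) => void
}

type Logger = (event: string, payload: unknown) => void

// Logs each event, then forwards it to the original callback if one exists.
function withDebugLogging(
  cb: DebuggableCallbacks,
  log: Logger,
): Required<DebuggableCallbacks> {
  return {
    onStateChange: (state) => { log('state', state); cb.onStateChange?.(state) },
    onTranscript: (entries) => { log('transcript', entries); cb.onTranscript?.(entries) },
    onError: (err) => { log('error', err); cb.onError?.(err) },
    onEnd: (info) => { log('end', info); cb.onEnd?.(info) },
  }
}
```

Spread the result into `startCall`, for example `voice.startCall({ agentId, ...withDebugLogging(callbacks, console.debug) })`.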
112
+
113
+ ## Updating
114
+
115
+ When the SDK changes:
116
+ - **Tarball path:** re-`npm pack` then `npm install <newTgz>` in the consumer.
117
+ - **`file:` path:** `npm run build` in `sdk/voice-js/` (refreshes `dist/`); the consumer picks it up on the next bundler refresh.
118
+ - **Registry path:** bump the version in your `package.json` and `npm install`.
119
+
120
+ Major-version bumps document their breaking changes in the README's `Status` section, together with a migration block.
package/DEVELOPING.md ADDED
@@ -0,0 +1,78 @@
1
+ # Developing `@craftedxp/voice-js` locally
2
+
3
+ The JS SDK is pure JS, with no native toolchain and no platform quirks. Iteration is fast: edit TS → tsup rebuilds `dist/` → your consumer app picks it up.
4
+
5
+ ## TL;DR
6
+
7
+ ```bash
8
+ cd voice-assistant/sdk/voice-js
9
+ npm install
10
+ npm run dev # tsup --watch, rebuilds browser + node + embed on save
11
+ ```
12
+
13
+ Leave that running. In a consumer project, either:
14
+
15
+ 1. **Via Yalc (recommended):** `yalc publish` in the SDK, `yalc add @craftedxp/voice-js` in the consumer. `yalc push` after each SDK rebuild.
16
+ 2. **Via file-dep:** `npm install file:/abs/path/to/sdk/voice-js` in the consumer; re-run after each rebuild (or use `--install-links` to copy).
17
+
18
+ Either way, the consumer's bundler (Webpack / Vite / esbuild / Next) picks up the new `dist/` on next hot reload.
19
+
20
+ ## Bundle layout (0.2.0)
21
+
22
+ `tsup.config.ts` emits three artefacts:
23
+
24
+ - `dist/browser.{mjs,js}` + `.d.ts` — browser entry. Uses native `WebSocket` + `AudioContext` + `getUserMedia`. The default under the package's `browser` and `default` exports conditions.
25
+ - `dist/node.{mjs,js}` + `.d.ts` — Node entry. Uses `ws` (loaded via dynamic `import()` so it stays an OPTIONAL peer). No audio glue. Picked under the `node` condition or `@craftedxp/voice-js/node`.
26
+ - `dist/embed.iife.js` — minified IIFE for `<script>` embed; bundles the browser entry inline.
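The optional-peer pattern mentioned for the Node entry can be sketched like this. It is not the code in `src/node.ts`; `loadOptionalPeer` is a hypothetical helper showing how a lazy `import()` can fail with an actionable message when the peer is missing.

```ts
// Hypothetical loader: resolves an optional peer lazily and rewrites the
// failure into an install hint.
async function loadOptionalPeer<T>(
  name: string,
  importer: () => Promise<T>,
): Promise<T> {
  try {
    return await importer()
  } catch {
    throw new Error(
      'Optional peer "' + name + '" is required here; run: npm install ' + name,
    )
  }
}
```

A Node entry might call it as `await loadOptionalPeer('ws', () => import('ws'))`; passing the importer as a closure keeps the module specifier a literal for bundlers.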
27
+
28
+ Source files map to:
29
+ - `src/browser.ts` — entry, factory implementation, public re-exports.
30
+ - `src/node.ts` — entry, dynamic `ws` loader, factory implementation.
31
+ - `src/VoiceClient.ts` — browser `BrowserVoiceClient` implementing the `Call` interface.
32
+ - `src/NodeVoiceClient.ts` — node `NodeVoiceClient` implementing the `NodeCall` interface (extends `Call` with `sendAudioChunk`).
33
+ - `src/protocol.ts` — shared WS message handling + state machine + types. Both clients call into this.
34
+ - `src/config.ts` — `VoiceClientConfig` / `StartCallOptions` / `VoiceClientFactory` types + `normalizeConfig` + `mergeStartCallContext` helpers.
35
+ - `src/AudioCapture.ts` / `src/AudioPlayback.ts` — browser-only. Lifted unchanged from the `@voxline/web` 0.1.x code.
36
+ - `src/ReconnectingWebSocket.ts` — transport-agnostic; takes a `wsFactory` so browser + node both reuse it.
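To illustrate why a `wsFactory` parameter makes the reconnect logic transport-agnostic, here is a deliberately small sketch. It is not the SDK's `ReconnectingWebSocket`: `ReconnectSketch`, `SocketLike`, and the immediate-retry policy are assumptions made for brevity.

```ts
// Smallest socket surface the sketch needs.
interface SocketLike {
  onclose: (() => void) | null
  close(): void
}

// The host decides how sockets are made (native WebSocket, `ws`, a fake...).
type WsFactory = (url: string) => SocketLike

class ReconnectSketch {
  private socket: SocketLike | null = null
  private attempts = 0
  private stopped = false

  constructor(
    private url: string,
    private factory: WsFactory,
    private maxAttempts = 3,
  ) {}

  connect(): void {
    this.attempts++
    const socket = this.factory(this.url)
    this.socket = socket
    socket.onclose = () => {
      // Simplification: retry immediately; a real client would back off.
      if (!this.stopped && this.attempts < this.maxAttempts) this.connect()
    }
  }

  stop(): void {
    this.stopped = true
    this.socket?.close()
  }
}
```

Because the class only ever talks to the factory, a unit test can hand it a fake and drive `onclose` by hand.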
37
+
38
+ ## Testing the embed bundle
39
+
40
+ The embed widget lives at `src/widget/embed.ts` and compiles to `dist/embed.iife.js`, which is copied to `/web/embed.js` post-build. The main server (port 8080) serves `/embed.js` as a static file.
41
+
42
+ To iterate on the widget:
43
+
44
+ 1. `npm run dev` in `sdk/voice-js` — watches all three entries (browser, node, embed).
45
+ 2. Open `http://localhost:8080/embed.js` in a browser tab after each save — `curl` or View-Source to confirm your changes landed.
46
+ 3. Test end-to-end against `http://localhost:3000` (the landing page's EmbedDemo mounts the widget in place) or the `/test` page.
47
+
48
+ Note: the post-build copy (`embed:copy` script) only runs after a full `npm run build`, not after `tsup --watch`. For a fresh watch-mode embed, rerun the build manually or add an `onchange` watcher:
49
+
50
+ ```bash
51
+ # optional: auto-copy on every tsup output change
52
+ npx onchange 'dist/embed.iife.js' -- npm run embed:copy
53
+ ```
54
+
55
+ ## Debugging in the browser
56
+
57
+ - **Source maps** are enabled for the main SDK bundle. In DevTools you'll see `VoiceClient.ts`, `AudioCapture.ts` etc. in Sources. The embed bundle is minified (it ships to end users) and has no source map — file an issue if you need one for local debugging.
58
+ - **`console.log`** inside `VoiceClient` shows up in the consumer's browser console.
59
+ - **AudioWorklet code** (`mic-downsampler.worklet.js`) runs in the audio rendering thread. `console.log` inside the worklet works but has a 1–2 frame delay. For tight debugging, post messages back to the main thread with `port.postMessage({debug: ...})`.
60
+
61
+ ## Two quick sanity checks before cutting a release
62
+
63
+ 1. **Build succeeds + tarball is clean:**
64
+ ```bash
65
+ npm run build
66
+ npm pack --dry-run
67
+ # expected: only dist/, README.md, CONSUMING.md, DEVELOPING.md, package.json
68
+ ```
69
+ 2. **Embed IIFE runs standalone:**
70
+ ```bash
71
+ # open an empty HTML page; paste the <script> tag from the README
72
+ # click the button, verify getUserMedia prompt + agent greeting
73
+ ```
74
+
75
+ ## Related
76
+
77
+ - [CONSUMING.md](./CONSUMING.md) — external install paths
78
+ - `../react-native/DEVELOPING.md` — same story for mobile
package/README.md ADDED
@@ -0,0 +1,245 @@
1
+ # @craftedxp/voice-js
2
+
3
+ JS SDK for embedding a voice agent call in any JS environment — browser tabs, Node.js processes, Electron apps. Zero framework deps.
4
+
5
+ Companion to [`@craftedxp/voice-rn`](https://www.npmjs.com/package/@craftedxp/voice-rn) (React Native) and [`@craftedxp/sdk-node`](https://www.npmjs.com/package/@craftedxp/sdk-node) (server-side `sk_` SDK).
6
+
7
+ > **Internal testing release.** API surface may evolve before a stable release. **0.2.0** is a breaking rename + redesign of the previous `@voxline/web@0.1.0` — the singleton-`VoiceClient`-with-`apiKey` pattern is gone in favour of a `configureVoiceClient({ fetchToken })` factory that mirrors `voice-rn` 0.3.x. See [Migrating from `@voxline/web`](#migrating-from-voxlineweb) below.
8
+
9
+ ## Install
10
+
11
+ ```bash
12
+ npm install @craftedxp/voice-js
13
+ # Node consumers also need:
14
+ npm install ws
15
+ ```
16
+
17
+ `ws` is declared as an OPTIONAL peer — only needed in Node / Electron-main. Browsers use the native `WebSocket` and skip it.
18
+
19
+ ## How the integration fits together
20
+
21
+ The same three-party flow as `voice-rn`. Your backend mints `ct_` tokens with its `sk_` API key (via `@craftedxp/sdk-node` or a raw `POST /v1/call-tokens`), the SDK calls your `fetchToken` callback whenever it needs a fresh one, and your client never sees `sk_`.
22
+
23
+ ```
24
+ ┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
25
+ │ Your web app │ │ Your backend │ │ Voxline server │
26
+ │ │ │ │ │ │
27
+ │ fetchToken ────┼───────►│ call Voxline ──┼───────►│ mint ct_ │
28
+ │ │ │ │ with sk_ │ │ │ │
29
+ │ │◄───────┼────────┼──── ct_ ────────┼────────┼─── ct_ │
30
+ │ startCall(...) ┼────────┼──── WSS /v1/agents/.../call?token=ct_ ─────►│
31
+ └─────────────────┘ └──────────────────┘ └─────────────────┘
32
+ ```
33
+
34
+ The `sk_` API key never lives in browser code. **The SDK has no `apiKey` option.** Pre-0.2 had one, which invited anyone following the docs to bake their server credential into client code; that whole class of footgun is gone.
35
+
36
+ ## Quick start (browser)
37
+
38
+ ```ts
39
+ import { configureVoiceClient } from '@craftedxp/voice-js'
40
+
41
+ const voice = configureVoiceClient({
42
+ apiBase: 'https://api.your-server.com',
43
+ // SDK calls this whenever it needs a fresh ct_ — initial connect
44
+ // and any mid-call token refresh. Your backend handles the mint.
45
+ fetchToken: async ({ agentId }) => {
46
+ const r = await fetch('/api/voice/mint', {
47
+ method: 'POST',
48
+ body: JSON.stringify({ agentId }),
49
+ })
50
+ return (await r.json()).token
51
+ },
52
+ // Optional — applied to every call. Per-call options merge on top.
53
+ defaultMetadata: { surface: 'web', appVersion: '1.4.0' },
54
+ })
55
+
56
+ // Per call (typically inside a click handler so the AudioContext gets
57
+ // the user gesture it needs):
58
+ const call = await voice.startCall({
59
+ agentId: 'agt_xxx',
60
+ context: { userId: 'usr_123', topic: 'billing' },
61
+ metadata: { sessionId: 'sess_x' },
62
+ bargeIn: true,
63
+ onStateChange: (state) => console.log('state', state),
64
+ onTranscript: (entries) => render(entries),
65
+ onVolume: ({ input, output }) => drawMeters(input, output),
66
+ onError: ({ code, message }) => toast(`${code}: ${message}`),
67
+ onEnd: ({ reason, durationMs }) => log('ended', reason, durationMs),
68
+ })
69
+
70
+ call.mute() // gate mic frames (server still sees wire cadence)
71
+ call.unmute()
72
+ call.end() // close WS + stop mic + fire onEnd
73
+ ```
74
+
75
+ ## Quick start (Node / Electron-main)
76
+
77
+ ```ts
78
+ import { configureVoiceClient } from '@craftedxp/voice-js/node'
79
+ import { spawn } from 'child_process'
80
+
81
+ const voice = configureVoiceClient({
82
+ apiBase: 'https://api.your-server.com',
83
+ fetchToken: async () => mintFromMyBackend(),
84
+ })
85
+
86
+ // Bring your own audio. Example: sox subprocesses for mic + speakers.
87
+ const mic = spawn('sox', ['-d', '-r', '16000', '-c', '1', '-b', '16', '-e', 'signed', '-t', 'raw', '-'])
88
+ const spk = spawn('sox', ['-t', 'raw', '-r', '16000', '-c', '1', '-b', '16', '-e', 'signed', '-', '-d'])
89
+
90
+ const call = await voice.startCall({
91
+ agentId: 'agt_xxx',
92
+ onAudioChunk: (pcm) => spk.stdin.write(Buffer.from(pcm)),
93
+ onEnd: () => { mic.kill(); spk.stdin.end() },
94
+ })
95
+
96
+ mic.stdout.on('data', (chunk) => call.sendAudioChunk(chunk))
97
+ ```
98
+
99
+ The Node bundle has the same `configureVoiceClient` / `startCall` shape, plus an extra `sendAudioChunk(pcm)` method on the call handle and an `onAudioChunk(pcm)` per-call callback. **No built-in audio adapter** — feed PCM in/out yourself with whatever your host has handy (sox, PortAudio, RTP relay, Electron IPC bridge).
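One caveat when piping PCM in Node: `Buffer.from(typedArray)` copies element values as single bytes, so a 16-bit typed array would be truncated. The README does not say whether `onAudioChunk` delivers an `ArrayBuffer` or a typed array, so the following is a defensive sketch; `toBytes` is a hypothetical helper, not an SDK export.

```ts
// Hypothetical helper: view PCM as raw bytes without copying or truncating.
function toBytes(pcm: ArrayBuffer | ArrayBufferView): Uint8Array {
  if (ArrayBuffer.isView(pcm)) {
    // Respect the view's window: it may cover only part of its buffer.
    return new Uint8Array(pcm.buffer, pcm.byteOffset, pcm.byteLength)
  }
  return new Uint8Array(pcm)
}
```

Node streams accept a `Uint8Array`, so the playback side could become `spk.stdin.write(toBytes(pcm))`.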
100
+
101
+ ## API reference
102
+
103
+ ### `configureVoiceClient(config)`
104
+
105
+ | Field | Type | Notes |
106
+ |---|---|---|
107
+ | `apiBase` | `string` | Full HTTPS URL of the Voxline server. WS scheme derived: `https`→`wss`. Trailing slash optional. |
108
+ | `fetchToken` | `(args) => Promise<string>` | Called by the SDK whenever it needs a fresh `ct_`. Mirrors `@craftedxp/voice-rn`'s shape exactly — `{ agentId, userId?, context?, metadata? }`. |
109
+ | `defaultMetadata` | `Record<string, string>?` | Applied to every `startCall`. Per-call merges on top. |
110
+ | `defaultContext` | `Record<string, unknown>?` | Applied to every `startCall`. Per-call merges on top. |
111
+
112
+ Returns a `VoiceClientFactory` with one method:
113
+
114
+ ### `factory.startCall(options)`
115
+
116
+ | Field | Type | Notes |
117
+ |---|---|---|
118
+ | `agentId` | `string` | Required. |
119
+ | `userId` | `string?` | Passed through to `fetchToken` as `userId`; the server uses it for contact memory. |
120
+ | `context` | `Record<string, unknown>?` | Per-call structured context. Merged on top of `defaultContext`. Lowered into the agent's system prompt server-side. |
121
+ | `metadata` | `Record<string, string>?` | Per-call key/value. Merged on top of `defaultMetadata`. Round-tripped on `call.ended` webhook. NOT lowered into the prompt. |
122
+ | `bargeIn` | `boolean?` | Default `true`. Set `false` for alarm-style flows where the user shouldn't accidentally interrupt the script. |
123
+ | `token` | `string?` | **Test-only escape hatch** — pre-minted `ct_`, bypasses `fetchToken`. Don't use in production. |
124
+ | `onStateChange` | `(state) => void` | Fires on every state machine transition. |
125
+ | `onTranscript` | `(entries) => void` | Fires on every transcript update. |
126
+ | `onVolume` | `({ input, output }) => void` | 0-1 RMS. ~10 Hz cadence. Browser bundle only. |
127
+ | `onError` | `(err) => void` | Stable `code` from `CallErrorCode`; matches `voice-rn` codes where they overlap. |
128
+ | `onEnd` | `({ reason, errorCode?, durationMs }) => void` | Fires once when the call ends. |
129
+
130
+ Resolves to a `Call` handle:
131
+
132
+ ```ts
133
+ interface Call {
134
+ readonly state: CallState
135
+ readonly transcript: TranscriptEntry[]
136
+ readonly isMuted: boolean
137
+ end: () => void
138
+ mute: () => void
139
+ unmute: () => void
140
+ }
141
+ ```
142
+
143
+ Node consumers get a `NodeCall` extension with one extra method:
144
+
145
+ ```ts
146
+ interface NodeCall extends Call {
147
+ sendAudioChunk: (pcm: ArrayBuffer | ArrayBufferView) => boolean
148
+ }
149
+ ```
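Two notes from the tables above, the `https`→`wss` derivation and "per-call merges on top", can be sketched as plain functions. These are illustrative only: the SDK's own `normalizeConfig` / `mergeStartCallContext` may differ, and the `http`→`ws` branch and the shallow merge are assumptions.

```ts
// Hypothetical: derive the WebSocket base from apiBase, tolerating a
// trailing slash.
function deriveWsBase(apiBase: string): string {
  const trimmed = apiBase.replace(/\/+$/, '')
  if (/^https:\/\//i.test(trimmed)) return 'wss://' + trimmed.slice('https://'.length)
  // Assumption: plain http (e.g. localhost dev) maps to ws.
  if (/^http:\/\//i.test(trimmed)) return 'ws://' + trimmed.slice('http://'.length)
  throw new Error('apiBase must be an http(s) URL, got: ' + apiBase)
}

// Hypothetical: shallow merge where per-call keys win over defaults.
function mergeOnTop<T>(
  defaults: Record<string, T> | undefined,
  perCall: Record<string, T> | undefined,
): Record<string, T> {
  return { ...(defaults ?? {}), ...(perCall ?? {}) }
}
```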
150
+
151
+ ### Stable types
152
+
153
+ ```ts
154
+ type CallState =
155
+ | 'idle' | 'connecting' | 'listening'
156
+ | 'user_speaking' | 'agent_speaking'
157
+ | 'ended' | 'error'
158
+
159
+ type CallErrorCode =
160
+ | 'missing_credentials' | 'forbidden'
161
+ | 'mic_denied' | 'mic_start_failed' | 'audio_session_failed'
162
+ | 'token_expired' | 'token_invalid' | 'unauthorized'
163
+ | 'network_unreachable' | 'socket_error'
164
+ | 'payment_required' | 'not_found'
165
+ | 'silence_timeout' | 'server_error'
166
+
167
+ type CallEndReason = 'agent_ended' | 'user_hangup' | 'timeout' | 'error'
168
+ ```
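Because `CallState` is a closed union, a consumer can map it to UI copy with a compile-time exhaustiveness check. The sketch below redeclares the type so it stands alone; `stateLabel` and the label strings are hypothetical.

```ts
type CallState =
  | 'idle' | 'connecting' | 'listening'
  | 'user_speaking' | 'agent_speaking'
  | 'ended' | 'error'

// If a future SDK version adds a state, the `never` assignment below stops
// compiling, which flags every switch that needs updating.
function stateLabel(state: CallState): string {
  switch (state) {
    case 'idle': return 'Ready'
    case 'connecting': return 'Connecting'
    case 'listening': return 'Listening'
    case 'user_speaking': return 'You are speaking'
    case 'agent_speaking': return 'Agent speaking'
    case 'ended': return 'Call ended'
    case 'error': return 'Something went wrong'
    default: {
      const unhandled: never = state
      throw new Error('unhandled state: ' + String(unhandled))
    }
  }
}
```

This pairs naturally with `onStateChange: (state) => setLabel(stateLabel(state))`.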
169
+
170
+ ## Migrating from `@voxline/web`
171
+
172
+ ```diff
173
+ - import { VoiceClient } from '@voxline/web'
174
+ + import { configureVoiceClient } from '@craftedxp/voice-js'
175
+
176
+ - const client = new VoiceClient({
177
+ - apiBase: 'https://api.example.com',
178
+ - agentId: 'agt_xxx',
179
+ - apiKey: 'sk_REDACTED', // ⚠️ do NOT ship this in client code
180
+ - variables: { topic: 'billing' },
181
+ - })
182
+ - client.on('state', (s) => setState(s))
183
+ - client.on('transcript', (t) => setTranscript(t))
184
+ - client.on('volume', (v) => setVolume(v))
185
+ - client.on('error', (e) => setError(e))
186
+ - client.on('close', () => setState('ended'))
187
+ - await client.connect()
188
+ - // …
189
+ - client.mute(true)
190
+ - client.disconnect()
191
+
192
+ + const voice = configureVoiceClient({
193
+ + apiBase: 'https://api.example.com',
194
+ + fetchToken: async ({ agentId }) => {
195
+ + const r = await fetch('/api/voice/mint', {
196
+ + method: 'POST',
197
+ + body: JSON.stringify({ agentId }),
198
+ + })
199
+ + return (await r.json()).token // your backend uses sk_ to mint ct_
200
+ + },
201
+ + })
202
+ + const call = await voice.startCall({
203
+ + agentId: 'agt_xxx',
204
+ + context: { topic: 'billing' }, // was `variables`
205
+ + onStateChange: (s) => setState(s),
206
+ + onTranscript: (t) => setTranscript(t),
207
+ + onVolume: (v) => setVolume(v),
208
+ + onError: (e) => setError(e),
209
+ + onEnd: ({ reason }) => setState('ended'),
210
+ + })
211
+ + // …
212
+ + call.mute() // toggleable: mute/unmute, no boolean
213
+ + call.end()
214
+ ```
215
+
216
+ Three semantic shifts to be aware of:
217
+
218
+ 1. **`apiKey` is gone.** The SDK no longer accepts it. Move your token mint to your backend. If you have an `sk_` in JS code today, that's a credential leak — rotate it after the migration.
219
+ 2. **`variables` → `context`.** Same purpose; new name lines up with `voice-rn`.
220
+ 3. **`mute(true|false)` → `mute()` / `unmute()`.** Symmetric with `voice-rn`.
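If existing UI code still passes a boolean, a thin adapter can bridge the third shift during migration. `setMuted` and `MutableCall` are hypothetical names; the adapter relies only on the `isMuted`, `mute`, and `unmute` members of the `Call` interface.

```ts
// Subset of the Call handle this adapter needs.
type MutableCall = {
  readonly isMuted: boolean
  mute: () => void
  unmute: () => void
}

// Hypothetical shim for code written against the old mute(true|false) API.
function setMuted(call: MutableCall, muted: boolean): void {
  if (muted === call.isMuted) return // already in the requested state
  if (muted) call.mute()
  else call.unmute()
}
```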
221
+
222
+ The embed widget (`<script src="embed.js" data-token="ct_...">`) keeps the same HTML API, but the `data-api-key` attribute is no longer accepted — mint server-side and inject `data-token` instead.
223
+
224
+ ## Embed widget
225
+
226
+ For drop-in `<script>` consumers (landing pages, no-build embeds):
227
+
228
+ ```html
229
+ <script
230
+ src="https://your-cdn/embed.js"
231
+ data-token="ct_REDACTED"
232
+ data-agent-id="agt_xxx"
233
+ data-api-base="https://api.your-server.com"
234
+ defer
235
+ ></script>
236
+ ```
237
+
238
+ Renders a floating call button with a Shadow-DOM transcript panel. Pre-mint the `ct_` server-side and inject it into the `data-token` attribute when you render the page.
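Server-side injection of the token could look like the sketch below, assuming your server renders HTML as strings. `renderEmbedTag` and `escapeAttr` are hypothetical helpers; the attribute names are the ones shown in the snippet above.

```ts
// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}

// Hypothetical: build the embed tag with a freshly minted ct_ token.
function renderEmbedTag(opts: {
  src: string
  token: string
  agentId: string
  apiBase: string
}): string {
  return [
    '<script',
    '  src="' + escapeAttr(opts.src) + '"',
    '  data-token="' + escapeAttr(opts.token) + '"',
    '  data-agent-id="' + escapeAttr(opts.agentId) + '"',
    '  data-api-base="' + escapeAttr(opts.apiBase) + '"',
    '  defer',
    '></script>',
  ].join('\n')
}
```

Since `ct_` tokens are short-lived, the tag should be rendered per request rather than cached with the page.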
239
+
240
+ ## Status
241
+
242
+ - **0.2.0** (current) — first `@craftedxp/voice-js` release. Browser + Node dual bundle, `fetchToken` factory, voice-rn 0.3.x parity. Migration path from `@voxline/web@0.1.0` documented above.
243
+ - 0.1.0 — `@voxline/web`. Singleton `VoiceClient` class, `apiKey` accepted. Retired in 0.2.0; never published to npm so no deprecation window.
244
+
245
+ See [`CONSUMING.md`](CONSUMING.md) for the full setup walkthrough and [`DEVELOPING.md`](DEVELOPING.md) for SDK-author iteration.