@babelbeez/sdk 0.1.1 → 0.1.2

Files changed (2)
  1. package/README.md +288 -87
  2. package/package.json +26 -2
package/README.md CHANGED
@@ -1,17 +1,19 @@
1
1
  # Babelbeez Headless SDK (`@babelbeez/sdk`)
2
2
 
3
- The **Babelbeez Headless SDK** provides raw, programmatic access to the Babelbeez Voice Agent protocol.
3
+ > Build your own UI for the Babelbeez AI voice agent – custom buttons, call controls, and chat views while we handle realtime audio, OpenAI Realtime, and connection lifecycle.
4
4
 
5
- It’s ideal if you want to:
5
+ The **Babelbeez Headless SDK** gives you low‑level, event‑driven control of a Babelbeez Voice Agent from the browser.
6
6
 
7
- - Integrate Babelbeez voice agents into an existing app (web, dashboards, custom UIs)
8
- - Build completely custom button/widget designs instead of using the standard embed
9
- - Create hybrid **text + voice** interfaces while Babelbeez handles:
10
- - Low-latency audio processing
11
- - WebRTC connections
12
- - AI orchestration and session management
7
+ > **Note:** Using this SDK requires a **Babelbeez account** and a configured Voice Agent. Sign up at https://www.babelbeez.com.
13
8
 
14
- > If you just want a drop-in widget for your website, you probably want the standard Babelbeez embed instead of this SDK.
9
+ Use this SDK when you want to:
10
+
11
+ - Replace the default widget with your **own button or call UI**
12
+ - Show **live transcripts** in your app
13
+ - Combine **voice + text** input in a single experience
14
+ - Orchestrate **human handoffs** (email / WhatsApp) from your own components
15
+
16
+ If you just want a drop‑in chat button, use the standard Babelbeez embed instead. This SDK is for developers who want full control over the UX.
15
17
 
16
18
  ---
17
19
 
@@ -21,188 +23,387 @@ It’s ideal if you want to:
21
23
  npm install @babelbeez/sdk
22
24
  ```
23
25
 
24
- The SDK is designed for modern browsers that support WebRTC and access to the microphone.
26
+ **Requirements**
27
+
28
+ - A **Babelbeez account** and at least one configured Voice Agent
29
+ - Modern browser with **WebRTC** and microphone support
30
+ - Page served over **HTTPS** (or `localhost`) so the browser will allow mic access
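The last two requirements can be checked before the call button is shown. The sketch below is a minimal preflight under stated assumptions: `isSecureContext` and `mediaDevices.getUserMedia` are standard browser APIs, while `MicEnvironment`, `MicSupport`, and `checkMicSupport` are hypothetical names introduced here and are not part of `@babelbeez/sdk`.

```typescript
// Hypothetical helper, not part of @babelbeez/sdk.
// Describes the parts of the browser environment the check needs,
// so it can be exercised without a real browser.
interface MicEnvironment {
  isSecureContext: boolean; // true on HTTPS pages and on localhost
  mediaDevices?: { getUserMedia?: unknown };
}

type MicSupport = 'ok' | 'insecure-context' | 'no-media-devices';

function checkMicSupport(env: MicEnvironment): MicSupport {
  // Browsers only expose microphone capture in secure contexts.
  if (!env.isSecureContext) return 'insecure-context';
  // Some browsers or embedded webviews may lack getUserMedia entirely.
  if (typeof env.mediaDevices?.getUserMedia !== 'function') return 'no-media-devices';
  return 'ok';
}
```

In a page this could be called as `checkMicSupport({ isSecureContext: window.isSecureContext, mediaDevices: navigator.mediaDevices })`, and the call button hidden or replaced with a hint when the result is not `'ok'`.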
25
31
 
26
32
  ---
27
33
 
28
- ## Quick Start
34
+ ## Getting your `publicChatbotId`
29
35
 
30
- To use the SDK, you will need the **Public Chatbot ID** of your configured Voice Agent. You can find this in the Babelbeez Dashboard under **Voice Agent Settings**.
36
+ In the Babelbeez Dashboard, open your Voice Agent and go to **Settings → Embed**. Copy the **Public Chatbot ID**; you’ll pass it to the SDK.
37
+
38
+ ---
39
+
40
+ ## Quick Start: custom start/stop button
31
41
 
32
42
  ```ts
33
43
  import { BabelbeezClient } from '@babelbeez/sdk';
34
44
 
35
- // 1. Initialize the client
45
+ // 1. Create the client
36
46
  const client = new BabelbeezClient({
37
- publicChatbotId: 'YOUR_UUID_HERE',
47
+ publicChatbotId: 'YOUR_PUBLIC_CHATBOT_ID',
38
48
  });
39
49
 
40
- // 2. Listen for state changes (e.g., loading, listening, speaking)
50
+ let currentState: 'idle' | 'loading' | 'active' | 'speaking' | 'error' | 'rag-retrieval' = 'idle';
51
+
52
+ // 2. Listen for state changes to drive your UI
41
53
  client.on('buttonState', (state) => {
42
- console.log('Agent State:', state); // 'loading' | 'active' | 'speaking' | 'idle'
54
+ // state: 'idle' | 'loading' | 'active' | 'speaking' | 'error' | 'rag-retrieval'
55
+ console.log('Agent state:', state);
56
+ currentState = state;
57
+ updateMyButton(state); // implement this in your own UI
58
+ });
59
+
60
+ // 3. Listen for live transcripts (user + agent)
61
+ client.on('transcript', ({ role, text, isFinal }) => {
62
+ // role: 'user' | 'agent'
63
+ console.log(`${role}:`, text, isFinal ? '(final)' : '(partial)');
64
+ appendMessageToChat(role, text, isFinal); // your own renderer
65
+ });
66
+
67
+ // 4. Wire up your button to connect / disconnect
68
+ const startStopButton = document.getElementById('voice-button')!;
69
+
70
+ startStopButton.addEventListener('click', async () => {
71
+ if (currentState === 'active' || currentState === 'speaking' || currentState === 'rag-retrieval') {
72
+ // Graceful goodbye: this reason triggers the farewell configured in Babelbeez
73
+ await client.disconnect('user_button_click');
74
+ } else if (currentState === 'idle' || currentState === 'error') {
75
+ try {
76
+ await client.connect(); // Browser will request microphone access
77
+ } catch (err) {
78
+ console.error('Failed to connect:', err);
79
+ }
80
+ }
43
81
  });
82
+ ```
83
+
84
+ You’re responsible for implementing `updateMyButton` and `appendMessageToChat` in your own DOM or framework components.
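One way to approach `updateMyButton` is to first map each state to a small view model and then copy that onto your element. This is a sketch only: the state union is the one listed for `buttonState`, but `ButtonView`, `describeButton`, and every label below are illustrative assumptions, not SDK behavior.

```typescript
// Same union the README lists for the `buttonState` event.
type ButtonState = 'idle' | 'loading' | 'active' | 'speaking' | 'error' | 'rag-retrieval';

// Hypothetical view model; labels and flags are illustrative choices.
interface ButtonView {
  label: string;
  disabled: boolean; // block clicks while a connection attempt is in flight
  inCall: boolean;   // true when a click should end the call
}

function describeButton(state: ButtonState): ButtonView {
  switch (state) {
    case 'idle':
      return { label: 'Start call', disabled: false, inCall: false };
    case 'loading':
      return { label: 'Connecting...', disabled: true, inCall: false };
    case 'active':
      return { label: 'Listening (click to end)', disabled: false, inCall: true };
    case 'speaking':
      return { label: 'Agent speaking (click to end)', disabled: false, inCall: true };
    case 'rag-retrieval':
      return { label: 'Looking that up (click to end)', disabled: false, inCall: true };
    case 'error':
      return { label: 'Retry', disabled: false, inCall: false };
  }
}
```

`updateMyButton(state)` can then set `textContent` and `disabled` from `describeButton(state)`, which keeps the state-to-UI mapping testable without a DOM.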
85
+
86
+ ---
87
+
88
+ ## Example: building a transcript view
89
+
90
+ The SDK emits **streaming transcripts** for both the user and the agent, including partial and final messages. You can use this to build a chat‑like view.
91
+
92
+ ```ts
93
+ const transcriptContainer = document.getElementById('messages');
94
+ let currentLine: HTMLDivElement | null = null;
44
95
 
45
- // 3. Listen for live transcripts
46
96
  client.on('transcript', ({ role, text, isFinal }) => {
47
- console.log(`${role}: ${text}`);
97
+ // role is 'user' or 'agent'
98
+
99
+ const roleAttr = role;
100
+
101
+ // 1. Start a new line when role changes or previous line was final
102
+ if (!currentLine || currentLine.dataset.role !== roleAttr || currentLine.dataset.final === 'true') {
103
+ currentLine = document.createElement('div');
104
+ currentLine.dataset.role = roleAttr;
105
+ currentLine.className = role === 'user' ? 'message-user' : 'message-agent';
106
+ transcriptContainer!.appendChild(currentLine);
107
+ }
108
+
109
+ // 2. Update the text content with the latest transcript
110
+ currentLine.textContent = text;
111
+
112
+ // 3. Mark final utterances
113
+ currentLine.dataset.final = String(isFinal);
114
+
115
+ // 4. Auto-scroll
116
+ transcriptContainer!.scrollTop = transcriptContainer!.scrollHeight;
48
117
  });
118
+ ```
119
+
120
+ Style `.message-user` and `.message-agent` in CSS to match your design system.
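If you render with a framework rather than direct DOM calls, the same "new line or update in place" rule can be kept as a pure function over an array of lines. The sketch below assumes only the `{ role, text, isFinal }` payload shown above; `TranscriptLine` and `applyTranscript` are hypothetical names.

```typescript
// Mirrors the transcript payload shape: { role, text, isFinal }.
interface TranscriptLine {
  role: 'user' | 'agent';
  text: string;
  isFinal: boolean;
}

// Returns a new array; the input is never mutated, which suits
// state containers that rely on reference changes.
function applyTranscript(
  lines: readonly TranscriptLine[],
  update: TranscriptLine,
): TranscriptLine[] {
  const last = lines[lines.length - 1];
  // Start a new line when there is none yet, the speaker changed,
  // or the previous line was already final.
  if (!last || last.role !== update.role || last.isFinal) {
    return [...lines, { ...update }];
  }
  // Otherwise the update replaces the in-progress line.
  return [...lines.slice(0, -1), { ...update }];
}
```

Inside the `transcript` listener you would call something like `lines = applyTranscript(lines, event)` and re-render from `lines`.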
121
+
122
+ ---
123
+
124
+ ## Example: hybrid text + voice
125
+
126
+ You can send text input into the same live voice session. The agent will respond via audio, and you’ll still get `transcript` events.
49
127
 
50
- // 4. Connect (browser will request microphone permissions)
51
- document.getElementById('start-btn')!.addEventListener('click', async () => {
128
+ ```ts
129
+ // e.g. on form submit
130
+ async function handleTextSubmit(message: string) {
52
131
  try {
53
- await client.connect();
132
+ await client.sendUserText(message);
54
133
  } catch (err) {
55
- console.error('Failed to connect:', err);
134
+ console.error('Failed to send text message:', err);
56
135
  }
136
+ }
137
+ ```
138
+
139
+ > Note: The voice session must be active (after `connect()`) for `sendUserText` to take effect.
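A small guard can enforce that note before calling `sendUserText`. The sketch assumes that `'active'`, `'speaking'`, and `'rag-retrieval'` are the states in which a session is live (the same check the Quick Start uses for its button); `prepareUserText` is a hypothetical helper, not an SDK function.

```typescript
type SessionState = 'idle' | 'loading' | 'active' | 'speaking' | 'error' | 'rag-retrieval';

// Assumption: these three states mean a session is connected.
function isSessionLive(state: SessionState): boolean {
  return state === 'active' || state === 'speaking' || state === 'rag-retrieval';
}

// Returns the trimmed text to send, or null when nothing should be sent.
function prepareUserText(state: SessionState, raw: string): string | null {
  const text = raw.trim();
  if (text.length === 0) return null; // ignore empty submissions
  if (!isSessionLive(state)) return null; // no live session to receive it
  return text;
}
```

In a submit handler: `const text = prepareUserText(currentState, input.value); if (text !== null) await client.sendUserText(text);`.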
140
+
141
+ ---
142
+
143
+ ## Example: handling human handoff
144
+
145
+ If your agent is configured for **human handoff**, the SDK will emit events so you can present your own email/WhatsApp UI.
146
+
147
+ ```ts
148
+ client.on('handoff:show', ({ summaryText, waLink }) => {
149
+ // summaryText: short description of the conversation / request
150
+ // waLink: WhatsApp deeplink if configured, otherwise null
151
+ openHandoffModal({ summaryText, waLink });
57
152
  });
153
+
154
+ client.on('handoff:hide', ({ outcome }) => {
155
+ // outcome: 'email_submitted' | 'whatsapp_submitted' | 'cancelled'
156
+ closeHandoffModal(outcome);
157
+ });
158
+
159
+ // When the user submits your handoff form
160
+ async function submitHandoff(email: string, consent: boolean) {
161
+ await client.handleHandoffSubmit({ email, consent });
162
+ }
163
+
164
+ // When the user cancels or chooses WhatsApp instead
165
+ async function cancelHandoff(viaWhatsapp: boolean) {
166
+ await client.handleHandoffCancel({ viaWhatsapp });
167
+ }
58
168
  ```
59
169
 
170
+ The agent behavior, wording, and when handoff is triggered are all configured in the Babelbeez Dashboard.
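The `openHandoffModal` and `closeHandoffModal` callbacks above are left to you. One possible shape, sketched here with hypothetical names (`HandoffModal`, `HandoffAction`, `reduceHandoff`), is a reducer that turns the two handoff events into modal state. Only the payload fields and outcome values come from the examples above.

```typescript
type HandoffOutcome = 'email_submitted' | 'whatsapp_submitted' | 'cancelled';

// Hypothetical modal state for your own UI.
interface HandoffModal {
  open: boolean;
  summaryText: string;
  waLink: string | null; // null when no WhatsApp link is configured
  lastOutcome: HandoffOutcome | null;
}

type HandoffAction =
  | { type: 'show'; summaryText: string; waLink: string | null }
  | { type: 'hide'; outcome: HandoffOutcome };

const closedHandoffModal: HandoffModal = {
  open: false,
  summaryText: '',
  waLink: null,
  lastOutcome: null,
};

function reduceHandoff(modal: HandoffModal, action: HandoffAction): HandoffModal {
  if (action.type === 'show') {
    // A new handoff request replaces whatever was shown before.
    return { open: true, summaryText: action.summaryText, waLink: action.waLink, lastOutcome: null };
  }
  // Keep the summary so a confirmation view can still display it.
  return { ...modal, open: false, lastOutcome: action.outcome };
}
```

The `handoff:show` listener would dispatch `{ type: 'show', ...event }` and the `handoff:hide` listener `{ type: 'hide', outcome: event.outcome }`.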
171
+
60
172
  ---
61
173
 
62
174
  ## API Reference
63
175
 
64
176
  ### `new BabelbeezClient(config)`
65
177
 
66
- Creates a new instance of the Babelbeez client.
67
-
68
- **Config object**
69
-
70
- - `publicChatbotId` **(string, required)** – The unique UUID of your Voice Agent.
178
+ Create a new client instance.
71
179
 
72
180
  ```ts
181
+ import { BabelbeezClient } from '@babelbeez/sdk';
182
+
73
183
  const client = new BabelbeezClient({
74
- publicChatbotId: 'YOUR_UUID_HERE',
184
+ publicChatbotId: 'YOUR_PUBLIC_CHATBOT_ID',
75
185
  });
76
186
  ```
77
187
 
188
+ **Config**
189
+
190
+ ```ts
191
+ interface BabelbeezClientConfig {
192
+ publicChatbotId: string;
193
+ }
194
+ ```
195
+
196
+ - `publicChatbotId` **(string, required)** – The public ID of the Voice Agent from the Babelbeez Dashboard.
197
+
78
198
  ---
79
199
 
80
200
  ### Methods
81
201
 
82
- #### `connect()`
202
+ #### `connect(): Promise<void>`
83
203
 
84
- Initializes the session, requests microphone access from the user, and establishes a realtime WebRTC connection with the Babelbeez infrastructure.
204
+ Initializes the session via Babelbeez, requests microphone access from the user, and opens a realtime connection to the OpenAI Realtime API.
205
+
206
+ - Emits `buttonState: 'loading'` while connecting.
207
+ - On success, emits `buttonState: 'active'` and `session:start`.
208
+ - On failure (e.g. mic denied), emits an `error` event and `buttonState: 'error'`.
85
209
 
86
210
  ```ts
87
211
  await client.connect();
88
212
  ```
89
213
 
90
- If the user denies microphone permissions, or the connection fails, an `error` event will be emitted and the promise will reject.
91
-
92
214
  ---
93
215
 
94
- #### `disconnect(reason?)`
216
+ #### `disconnect(reason?: string): Promise<void>`
95
217
 
96
- Gracefully terminates the voice session.
218
+ Gracefully ends the current session and sends a final usage + transcript summary to Babelbeez.
97
219
 
98
220
  ```ts
99
- client.disconnect('user_button_click');
221
+ await client.disconnect();
222
+ // or
223
+ await client.disconnect('user_button_click');
100
224
  ```
101
225
 
102
- **Parameters**
226
+ - `reason` **(optional)** – String reason used for analytics and backend handling.
227
+ - Passing `'user_button_click'` triggers the configured **goodbye message** before disconnecting.
103
228
 
104
- - `reason` **(string, optional)** – A descriptive reason for the disconnection. Defaults to `'client_disconnect'`.
229
+ ---
105
230
 
106
- Special handling:
231
+ #### `initializeAudio(): void`
107
232
 
108
- - Passing `'user_button_click'` specifically triggers the agent to play its configured **“Goodbye”** message **before** disconnecting.
109
- - Other reasons generally terminate the session immediately.
233
+ Optional helper to unlock the browser `AudioContext` in response to a user gesture (click/tap), which can help avoid autoplay restrictions in some environments.
234
+
235
+ ```ts
236
+ // e.g. on a user click before connecting
237
+ client.initializeAudio();
238
+ ```
110
239
 
111
240
  ---
112
241
 
113
- #### `sendUserText(text)`
242
+ #### `sendUserText(text: string): Promise<void>`
114
243
 
115
- Sends a text message to the agent. This enables hybrid interfaces where a user can **speak or type** to the same agent within the same session.
244
+ Sends a user text message into the active voice session, which is useful for hybrid **chat + voice** interfaces.
116
245
 
117
246
  ```ts
118
- client.sendUserText('Hello, I would like to book an appointment.');
247
+ await client.sendUserText('Hello, do you have pricing for teams?');
119
248
  ```
120
249
 
121
- **Parameters**
122
-
123
- - `text` **(string)** – The message to send.
250
+ - If the agent is currently speaking, the SDK will attempt to **interrupt** the response before sending the new message.
124
251
 
125
252
  ---
126
253
 
127
- #### `handleHandoffSubmit({ email, consent })`
254
+ #### `handleHandoffSubmit(payload): Promise<void>`
128
255
 
129
- Submits user contact details in response to a **“Request Human Handoff”** event.
256
+ Notify Babelbeez when the user submits your human handoff form.
130
257
 
131
258
  ```ts
132
- client.handleHandoffSubmit({
259
+ await client.handleHandoffSubmit({
133
260
  email: 'user@example.com',
134
261
  consent: true,
135
262
  });
136
263
  ```
137
264
 
138
- **Parameters**
139
-
140
265
  - `email` **(string)** – The user’s email address.
141
266
  - `consent` **(boolean)** – Whether the user consented to be contacted.
142
267
 
143
268
  ---
144
269
 
145
- #### `handleHandoffCancel({ viaWhatsapp })`
270
+ #### `handleHandoffCancel(options?): Promise<void>`
146
271
 
147
- Cancels a pending handoff request.
272
+ Notify Babelbeez when the user cancels the handoff form or switches to WhatsApp.
148
273
 
149
274
  ```ts
150
- client.handleHandoffCancel({
151
- viaWhatsapp: true,
152
- });
275
+ await client.handleHandoffCancel({ viaWhatsapp: true });
153
276
  ```
154
277
 
155
- **Parameters**
156
-
157
- - `viaWhatsapp` **(boolean, optional)** – Set to `true` if the cancellation occurred because the user chose to continue the conversation on WhatsApp via a deep link.
278
+ - `viaWhatsapp` **(boolean, optional)** – Pass `true` if the user opted to continue via WhatsApp (using the provided `waLink`). In that case, the SDK will end the voice session after a goodbye.
158
279
 
159
280
  ---
160
281
 
161
- ## Events
282
+ ### Events
283
+
284
+ The client extends a simple `EventEmitter` interface. Subscribe with `client.on(event, listener)` and unsubscribe with `client.off(event, listener)`.
162
285
 
163
- The client extends an `EventEmitter`. You can subscribe to events using `client.on(event, callback)`.
286
+ #### Core events
164
287
 
165
288
  ```ts
166
- client.on('event-name', (payload) => {
167
- // handle event
168
- });
289
+ client.on('buttonState', (state) => { /* ... */ });
290
+ client.on('transcript', (event) => { /* ... */ });
291
+ client.on('error', (event) => { /* ... */ });
292
+ client.on('session:start', (event) => { /* ... */ });
293
+ client.on('session:end', (event) => { /* ... */ });
294
+ client.on('handoff:show', (event) => { /* ... */ });
295
+ client.on('handoff:hide', (event) => { /* ... */ });
169
296
  ```
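Because the client exposes `on` and `off`, a one-shot wait can be built on top of them. The sketch below assumes only that surface. `EmitterLike` and `waitForEvent` are hypothetical names, and whether the real `BabelbeezClient` type is assignable to `EmitterLike` depends on its published typings.

```typescript
// Minimal shape of the subscribe/unsubscribe surface described above.
interface EmitterLike {
  on(eventName: string, listener: (payload: unknown) => void): void;
  off(eventName: string, listener: (payload: unknown) => void): void;
}

// Resolves with the payload of the next `eventName` event and
// unsubscribes afterwards so listeners do not accumulate.
function waitForEvent(client: EmitterLike, eventName: string): Promise<unknown> {
  return new Promise((resolve) => {
    const listener = (payload: unknown): void => {
      client.off(eventName, listener);
      resolve(payload);
    };
    client.on(eventName, listener);
  });
}
```

For example, `await waitForEvent(client, 'session:end')` could be used to hold a "call ended" view until the session has actually terminated.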
170
297
 
171
- ### Core Events
298
+ #### `buttonState`
172
299
 
173
- | Event | Payload | Description |
174
- |-----------------|----------------------------|-------------|
175
- | `buttonState` | `string` | Current status of the agent. One of: `'idle'`, `'loading'`, `'active'` (listening), `'speaking'`, `'rag-retrieval'`, `'error'`. |
176
- | `transcript` | `{ role, text, isFinal }` | Realtime transcript updates. `role` is `'user'` or `'agent'`. `isFinal` indicates that the utterance/response is complete. |
177
- | `session:start` | `{ chatbotId, config }` | Fired when the WebRTC connection is fully established and the session is active. |
178
- | `session:end` | `{ reason }` | Fired when the session terminates (user disconnect, timeout, error, etc.). |
179
- | `handoff:show` | `{ summaryText, waLink }` | Fired when the AI determines a human is needed. Render a UI form to collect user details. `waLink` contains a WhatsApp deep link if configured. |
180
- | `handoff:hide` | `{ outcome }` | Fired when the handoff flow is completed or cancelled. |
181
- | `error` | `{ code, message, fatal }` | Emitted when an error occurs. If `fatal` is `true`, the session has been terminated. |
300
+ ```ts
301
+ export type BabelbeezButtonState =
302
+ | 'idle'
303
+ | 'loading'
304
+ | 'active'
305
+ | 'speaking'
306
+ | 'error'
307
+ | 'rag-retrieval';
308
+ ```
309
+
310
+ Use this to drive your call control UI (start/stop button, spinners, etc.).
182
311
 
183
- Example:
312
+ #### `transcript`
184
313
 
185
314
  ```ts
186
- client.on('error', ({ code, message, fatal }) => {
187
- console.error(`[Babelbeez] Error (${code}): ${message}`);
315
+ export interface BabelbeezTranscriptEvent {
316
+ role: 'user' | 'agent';
317
+ text: string;
318
+ isFinal: boolean;
319
+ }
320
+ ```
188
321
 
189
- if (fatal) {
190
- // e.g. update UI, disable button, prompt user to refresh
191
- }
192
- });
322
+ - Multiple events are emitted per utterance.
323
+ - `isFinal: true` marks the end of a user or agent turn.
324
+
325
+ #### `session:start`
326
+
327
+ ```ts
328
+ export interface BabelbeezSessionStartEvent {
329
+ chatbotId: string;
330
+ config: unknown; // snapshot of chatbot configuration
331
+ }
332
+ ```
333
+
334
+ Fired when the WebRTC session is fully established and active.
335
+
336
+ #### `session:end`
337
+
338
+ ```ts
339
+ export interface BabelbeezSessionEndEvent {
340
+ reason: string;
341
+ }
342
+ ```
343
+
344
+ Fired when the session terminates (user disconnect, timeout, error, agent‑initiated close, etc.).
345
+
346
+ #### `error`
347
+
348
+ ```ts
349
+ export type BabelbeezErrorSeverity = 'info' | 'warning' | 'error';
350
+
351
+ export interface BabelbeezErrorEvent {
352
+ code: string;
353
+ message: string;
354
+ severity: BabelbeezErrorSeverity;
355
+ fatal?: boolean;
356
+ }
357
+ ```
358
+
359
+ - When `fatal === true`, the session has been terminated.
360
+ - Use `severity` to decide how aggressively to update your UI or prompt the user.
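How `fatal` and `severity` translate into UI is up to you. The sketch below shows one possible policy: the event fields match the interface above, while `SdkErrorLike`, `UiReaction`, `reactToError`, and the reaction names are hypothetical.

```typescript
type ErrorSeverity = 'info' | 'warning' | 'error';

// Same fields as the error payload described above.
interface SdkErrorLike {
  code: string;
  message: string;
  severity: ErrorSeverity;
  fatal?: boolean;
}

// Illustrative reactions; map them to your own toast/banner components.
type UiReaction = 'log-only' | 'show-toast' | 'show-banner' | 'reset-call-ui';

function reactToError(err: SdkErrorLike): UiReaction {
  // A fatal error means the session is gone, whatever the severity says.
  if (err.fatal === true) return 'reset-call-ui';
  switch (err.severity) {
    case 'info':
      return 'log-only';
    case 'warning':
      return 'show-toast';
    case 'error':
      return 'show-banner';
  }
}
```

A listener such as `client.on('error', (e) => render(reactToError(e)))` keeps the policy in one place and easy to unit test.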
361
+
362
+ #### `handoff:show`
363
+
364
+ ```ts
365
+ export interface BabelbeezHandoffShowEvent {
366
+ summaryText: string;
367
+ waLink: string | null; // WhatsApp deeplink if configured
368
+ }
193
369
  ```
194
370
 
371
+ Fired when the AI decides a human is needed. Use this to show your own form/modal.
372
+
373
+ #### `handoff:hide`
374
+
375
+ ```ts
376
+ export type BabelbeezHandoffHideOutcome =
377
+ | 'email_submitted'
378
+ | 'whatsapp_submitted'
379
+ | 'cancelled';
380
+
381
+ export interface BabelbeezHandoffHideEvent {
382
+ outcome: BabelbeezHandoffHideOutcome;
383
+ }
384
+ ```
385
+
386
+ Fired when the handoff flow is completed or cancelled.
387
+
195
388
  ---
196
389
 
197
- ## Usage Notes
390
+ ## Usage notes
198
391
 
199
- - The SDK is **browser-first** and assumes access to `navigator.mediaDevices` for microphone input.
200
- - Make sure your site is served over **HTTPS**, or from `localhost`, otherwise the browser may block microphone access.
201
- - You should always provide clear UI affordances (e.g. a “Start call” / “End call” button) that map to `connect()` and `disconnect()` calls.
392
+ - The SDK is **browser-first** and assumes access to `navigator.mediaDevices` for microphone input.
393
+ - Always provide clear UX affordances (e.g. "Start call" / "End call") that map to `connect()` and `disconnect()`.
394
+ - For best results, prompt the user before accessing the microphone and explain what the agent will do.
202
395
 
203
396
  ---
204
397
 
205
- ## License
398
+ ## Further reading
399
+
400
+ For a full walkthrough of building a custom button and UI, see the guide:
401
+
402
+ **Headless embed: use your own chat button**
403
+ https://www.babelbeez.com/resources/help/for-developers/headless-embed-custom-button.html
206
404
 
207
- ISC
405
+ ---
406
+
407
+ ## License
208
408
 
409
+ MIT
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@babelbeez/sdk",
3
- "version": "0.1.1",
3
+ "version": "0.1.2",
4
4
  "type": "module",
5
5
  "publishConfig": {
6
6
  "access": "public"
@@ -18,5 +18,29 @@
18
18
  "files": [
19
19
  "dist"
20
20
  ],
21
- "license": "ISC"
21
+ "keywords": [
22
+ "babelbeez",
23
+ "headless voice ai",
24
+ "headless voice sdk",
25
+ "headless chatbot",
26
+ "speech to speech ai",
27
+ "speech to speech",
28
+ "website voice assistant",
29
+ "website voice agent",
30
+ "website voice chatbot",
31
+ "voice chat widget",
32
+ "web voice widget",
33
+ "voice ai for websites",
34
+ "embedded voice assistant",
35
+ "embedded voice ai",
36
+ "white label voice ai",
37
+ "web agency",
38
+ "client website voice",
39
+ "openai realtime voice",
40
+ "openai voice agent",
41
+ "webrtc voice",
42
+ "browser voice sdk",
43
+ "javascript voice sdk"
44
+ ],
45
+ "license": "MIT"
22
46
  }