@keyframelabs/elements 0.2.1 → 0.4.0

package/README.md CHANGED
@@ -18,9 +18,10 @@ pnpm add @keyframelabs/elements
 
  ## Usage
 
- This package provides two primary classes depending on your integration strategy:
+ This package provides three integration options depending on your strategy:
  1. **`PersonaEmbed`**: Fully managed. Uses a publishable key. Best for rapid frontend integration.
  2. **`PersonaView`**: Bring your own backend. Uses session tokens generated by your server.
+ 3. **`<kfl-embed>`**: Drop-in web component. Wraps `PersonaEmbed` with a complete widget UI (states, controls, positioning). Zero framework dependencies.
 
  ### Option A: PersonaEmbed (managed)
 
@@ -69,48 +70,108 @@ await view.connect();
  view.disconnect();
  ```
 
- ## Supported agents and real-time LLMs
+ ### Option C: `<kfl-embed>` Web Component
+
+ A self-registering custom element that wraps `PersonaEmbed` with a widget UI shell: minimized/active/hidden states, "Join call" button, minimize/expand toggle, corner positioning, and button color styling. Drop it into any page with zero framework dependencies.
+
+ ```html
+ <kfl-embed
+   publishable-key="kfl_pk_live_..."
+   initial-state="minimized"
+   corner="bottom-right"
+   minimized-width="144"
+   minimized-height="216"
+   active-width="252"
+   active-height="377"
+   button-color="#919191"
+   button-color-opacity="0.3"
+   video-fit="cover"
+   preview-image="https://example.com/avatar.png"></kfl-embed>
+ <script type="module" src="https://unpkg.com/@keyframelabs/elements/dist/kfl-embed.js"></script>
+ ```
 
- Supports Cartesia, ElevenLabs, Vapi, Gemini Live (closed alpha), OpenAI Realtime (closed alpha).
+ #### Attributes
+
+ | Attribute | Type | Default | Description |
+ | -------------------------------- | ---------------------------------------------------------- | ---------------- | ------------------------------------------------------------------ |
+ | `publishable-key` | `string` | Required | Your publishable embed key. |
+ | `api-base-url` | `string` | Keyframe default | Base URL for the Keyframe API. |
+ | `initial-state` | `'minimized' \| 'active' \| 'hidden'` | `'minimized'` | Widget state on first load. |
+ | `controlled-widget-state` | `'minimized' \| 'active' \| 'hidden'` | — | Externally control the widget state (overrides internal state). |
+ | `corner` | `'bottom-right' \| 'bottom-left' \| 'top-right' \| 'top-left'` | `'bottom-right'` | Which corner to anchor the widget (fixed mode only). |
+ | `inline` | boolean attribute | — | Use `position: relative` instead of `fixed`. |
+ | `minimized-width` | `number` | `144` | Width in px when minimized. |
+ | `minimized-height` | `number` | `216` | Height in px when minimized. |
+ | `active-width` | `number` | `252` | Width in px when active (in-call). |
+ | `active-height` | `number` | `377` | Height in px when active (in-call). |
+ | `hide-ui` | boolean attribute | — | Hides all overlay controls. Useful for building your own UI shell. |
+ | `show-minimize-button` | `'true' \| 'false'` | `'true'` | Show the X/minimize button on hover. |
+ | `controlled-show-minimize-button`| `'true' \| 'false'` | — | Externally control the minimize button visibility. |
+ | `button-color` | hex color string | `'#919191'` | Color of the "Join call" button. |
+ | `button-color-opacity` | `number` (0–1) | `0.3` | Opacity of the "Join call" button background. |
+ | `video-fit` | `'cover' \| 'contain'` | `'cover'` | Video scaling mode. |
+ | `preview-image` | URL string | — | Image shown in the widget before a call starts. |
+
+ #### Events
+
+ | Event | `detail` | Description |
+ | ------------------ | --------------------------------------- | -------------------------------------- |
+ | `statechange` | `{ status: EmbedStatus }` | Connection status changed. |
+ | `agentstatechange` | `{ state: AgentState }` | Avatar playback state changed. |
+ | `widgetstatechange`| `{ state: 'minimized' \| 'active' \| 'hidden' }` | Widget UI state changed. |
+ | `disconnected` | — | Session disconnected. |
+ | `error` | `{ error: Error }` | A fatal error occurred. |
+
+ #### JavaScript API
 
- For `PersonaEmbed`, this is determined by the values you set in the Keyframe platform dashboard.
+ ```ts
+ const el = document.querySelector('kfl-embed');
 
- For `PersonaView`, this is determined by `voiceAgentDetails`.
+ await el.micOn(); // Start session if needed, then unmute mic
+ await el.micOff(); // Mute mic
+ el.isMicOn(); // boolean
 
- ## Emotion Controls
+ await el.mute(); // Mute speaker audio
+ await el.unmute(); // Unmute speaker audio
+ el.isMuted(); // boolean
+ ```
 
- The avatar can display emotional expressions (`neutral`, `angry`, `sad`, `happy`) that affect its facial expression and demeanor.
+ #### Build
 
- ### ElevenLabs: `set_emotion` Tool Call
+ The web component is built as a self-contained ES module via `vite.config.embed.ts`:
 
- When using ElevenLabs as the voice agent, emotions are driven by a **client tool call** named `set_emotion`. The ElevenLabs agent parses incoming `client_tool_call` WebSocket messages and, when the tool name is `set_emotion`, updates the avatar's expression accordingly.
+ ```bash
+ pnpm build:embed # → dist/kfl-embed.js
+ ```
 
- > **Important:** Transcripts from the ElevenLabs agent are **not** automatically consumed. The `transcript` event is emitted, but it is up to you to subscribe to it if you need transcript data.
+ Import the auto-registering entry point (registers `<kfl-embed>` on `customElements`):
 
- #### Setup
+ ```ts
+ import '@keyframelabs/elements/kfl-embed';
+ ```
 
- You must create a `set_emotion` tool in the [ElevenLabs API](https://elevenlabs.io/docs) for your agent. The tool should accept a single parameter:
+ Or import the class directly for manual registration:
 
- | Parameter | Type | Description |
- | --------- | -------- | -------------------------------------------------------- |
- | `emotion` | `enum` | One of `neutral`, `angry`, `sad`, `happy`. |
+ ```ts
+ import { KflEmbedElement } from '@keyframelabs/elements';
+ customElements.define('my-embed', KflEmbedElement);
+ ```
 
- Then instruct your agent (via its system prompt) to call `set_emotion` on each turn with the appropriate emotion. The client library handles the rest — it validates the emotion, emits an `emotion` event, and sends a `client_tool_result` back to ElevenLabs.
+ ## Supported agents and real-time LLMs
 
- ### Manual Emotion Control
+ Supports ElevenLabs and OpenAI Realtime.
 
- For other agents or custom emotion logic, you can access the underlying session to set emotions manually:
+ For `PersonaEmbed`, this is determined by the values you set in the Keyframe platform dashboard.
 
- ```typescript
- import { createClient } from '@keyframelabs/sdk';
+ For `PersonaView`, this is determined by `voiceAgentDetails`.
 
- const session = createClient({ ... });
- await session.setEmotion('happy');
- ```
+ ## Emotion Controls
+
+ The avatar can display emotional expressions (`neutral`, `angry`, `sad`, `happy`) that affect its facial expression and demeanor. All supported voice agents can drive emotions automatically via tool/function calling.
 
  ### Agent Events
 
- The `emotion` event is emitted when the agent triggers a `set_emotion` tool call:
+ The `emotion` event is emitted when any agent triggers a `set_emotion` tool call:
 
  ```typescript
  agent.on('emotion', (emotion) => {
@@ -118,7 +179,30 @@ agent.on('emotion', (emotion) => {
  });
  ```
 
- Currently, only the ElevenLabs agent emits emotion events via tool calls.
+ When using `PersonaEmbed` or `PersonaView`, emotion events are automatically wired to the avatar session -- no extra code is needed.
+
+ ### ElevenLabs
+
+ Emotions are driven by a **client tool call** named `set_emotion`. The agent parses incoming `client_tool_call` WebSocket messages and sends a `client_tool_result` back.
+
+ **Setup:** Create a `set_emotion` [client tool](https://elevenlabs.io/docs/conversational-ai/customization/tools/client-tools) in the ElevenLabs dashboard for your agent with a single `emotion` parameter (enum: `neutral`, `angry`, `sad`, `happy`). Then instruct your agent (via its system prompt) to call `set_emotion` on each turn.
+
+ ### OpenAI Realtime
+
+ The `set_emotion` function is **automatically declared** in the OpenAI Realtime session setup. The model calls it via Realtime [function calling](https://developers.openai.com/api/docs/guides/realtime-conversations#function-calling) (`response.done` with `function_call` output items), and the client responds by creating a `function_call_output` conversation item before asking the model to continue.
+
+ **Setup:** No additional dashboard configuration is needed. Instruct the model via its system prompt to call `set_emotion` on each turn to reflect the tone of its response.
+
+ ### Manual Emotion Control
+
+ For custom emotion logic outside of tool calling, you can access the underlying session directly:
+
+ ```typescript
+ import { createClient } from '@keyframelabs/sdk';
+
+ const session = createClient({ ... });
+ await session.setEmotion('happy');
+ ```
 
  ## API
 
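
Reviewer note: the OpenAI Realtime paragraph in the hunk above describes a three-step exchange (read `function_call` items from `response.done`, reply with a `function_call_output` item, then ask the model to continue). The sketch below illustrates that exchange as a pure function. It is not the package's implementation: `handleResponseDone`, `parseEmotion`, and the trimmed-down event interfaces are hypothetical names chosen here, and the event field names follow OpenAI's public Realtime documentation as I understand it.

```ts
type Emotion = 'neutral' | 'angry' | 'sad' | 'happy';
const EMOTIONS: readonly string[] = ['neutral', 'angry', 'sad', 'happy'];

// Only the fields this sketch reads; real Realtime events carry more.
interface OutputItem {
  type: string;
  name?: string;
  call_id?: string;
  arguments?: string; // JSON-encoded, e.g. '{"emotion":"happy"}'
}
interface ResponseDoneEvent {
  type: 'response.done';
  response: { output?: OutputItem[] };
}

// Client events to send back over the socket (hypothetical local type).
type ClientEvent =
  | { type: 'conversation.item.create'; item: { type: 'function_call_output'; call_id: string; output: string } }
  | { type: 'response.create' };

// Returns a valid Emotion, or null when the arguments are malformed.
function parseEmotion(rawArgs: string): Emotion | null {
  try {
    const parsed = JSON.parse(rawArgs) as { emotion?: unknown } | null;
    const value = parsed?.emotion;
    return typeof value === 'string' && EMOTIONS.includes(value) ? (value as Emotion) : null;
  } catch {
    return null;
  }
}

// Builds the replies for every set_emotion call found in a response.done event.
function handleResponseDone(event: ResponseDoneEvent, onEmotion: (emotion: Emotion) => void): ClientEvent[] {
  const replies: ClientEvent[] = [];
  for (const item of event.response.output ?? []) {
    if (item.type !== 'function_call' || item.name !== 'set_emotion' || !item.call_id) continue;
    const emotion = parseEmotion(item.arguments ?? '');
    if (emotion) onEmotion(emotion);
    const output = JSON.stringify(emotion ? { ok: true } : { ok: false, error: 'invalid emotion' });
    replies.push({
      type: 'conversation.item.create',
      item: { type: 'function_call_output', call_id: item.call_id, output },
    });
  }
  // Ask the model to continue only if at least one call was answered.
  if (replies.length > 0) replies.push({ type: 'response.create' });
  return replies;
}
```

Returning the replies instead of sending them keeps the sketch independent of any socket object; a caller would serialize each entry with `JSON.stringify` and send it in order.
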
@@ -188,10 +272,12 @@ type SessionDetails = {
  };
 
  type VoiceAgentDetails = {
-   type: 'cartesia' | 'elevenlabs' | 'vapi' | 'gemini' | 'openai';
-   token?: string; // For gemini, cartesia
-   agent_id?: string; // For elevenlabs, cartesia
-   signed_url?: string; // For elevenlabs, vapi
+   type: 'elevenlabs' | 'openai';
+   token?: string; // For openai (ephemeral client secret)
+   agent_id?: string; // For elevenlabs
+   signed_url?: string; // For elevenlabs
+   system_prompt?: string; // For openai
+   voice?: string; // For openai
  };
 
  type Emotion = 'neutral' | 'angry' | 'sad' | 'happy';
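
Reviewer note: the inline comments on the new `VoiceAgentDetails` fields say which agent each field is for, but the type itself allows any combination. A minimal sketch of a check that flags fields set for the wrong agent type could look like this. `unexpectedFields` and `FIELDS_BY_TYPE` are hypothetical helpers written for illustration, with the field-to-agent mapping taken only from the comments in the hunk above.

```ts
type VoiceAgentDetails = {
  type: 'elevenlabs' | 'openai';
  token?: string;
  agent_id?: string;
  signed_url?: string;
  system_prompt?: string;
  voice?: string;
};

// Mapping derived from the "For openai" / "For elevenlabs" comments in the diff.
const FIELDS_BY_TYPE: Record<VoiceAgentDetails['type'], ReadonlyArray<keyof VoiceAgentDetails>> = {
  elevenlabs: ['agent_id', 'signed_url'],
  openai: ['token', 'system_prompt', 'voice'],
};

// Lists fields that are set but not documented for the chosen agent type.
function unexpectedFields(details: VoiceAgentDetails): string[] {
  const allowed = new Set<string>(['type', ...FIELDS_BY_TYPE[details.type]]);
  return Object.keys(details).filter(
    (key) => !allowed.has(key) && details[key as keyof VoiceAgentDetails] !== undefined,
  );
}
```

A server that mints session details could run this before responding, for example to catch an `agent_id` left over from an ElevenLabs configuration after switching to `openai`.
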
@@ -0,0 +1,55 @@
+ /**
+  * <kfl-embed> Web Component
+  *
+  * Composes PersonaEmbed internally -- does NOT reimplement session/video/audio logic.
+  * Adds the widget UI shell: minimized/active/hidden states, "Join call" button,
+  * minimize/expand toggle, corner positioning CSS, button color styling.
+  */
+ export declare class KflEmbedElement extends HTMLElement {
+     static get observedAttributes(): ("publishable-key" | "api-base-url" | "initial-state" | "controlled-widget-state" | "active-width" | "active-height" | "minimized-width" | "minimized-height" | "inline" | "corner" | "hide-ui" | "show-minimize-button" | "controlled-show-minimize-button" | "button-color" | "button-color-opacity" | "video-fit" | "preview-image")[];
+     private shadow;
+     private embed;
+     private _widgetState;
+     private _connected;
+     private _connecting;
+     private widgetEl;
+     private innerEl;
+     private containerEl;
+     private previewImg;
+     private spinnerEl;
+     private errorToast;
+     private errorTimer;
+     private joinBtn;
+     private endCallBtn;
+     private toggleBtn;
+     private revealBtn;
+     private toolbarEl;
+     private micBtn;
+     constructor();
+     connectedCallback(): void;
+     disconnectedCallback(): void;
+     attributeChangedCallback(name: string, _old: string | null, value: string | null): void;
+     mute(): Promise<void>;
+     unmute(): Promise<void>;
+     isMuted(): boolean;
+     canUnmute(): boolean;
+     micOn(): Promise<void>;
+     micOff(): Promise<void>;
+     isMicOn(): boolean;
+     canTurnOnMic(): boolean;
+     setEmotion(_emotion: 'neutral' | 'angry' | 'sad' | 'happy'): void;
+     private _getAttrNum;
+     private _getCorner;
+     private _applyLayout;
+     private _applyState;
+     private _shouldShowMinimizeButton;
+     private _handleJoinCall;
+     private _handleToggle;
+     private _handleEndCall;
+     private _handleReveal;
+     private _handleMicToggle;
+     private _updateMicIcon;
+     private _resetToMinimized;
+     private _showError;
+     private _connect;
+ }
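
Reviewer note: the declaration above exposes `canTurnOnMic()` and `canUnmute()`, which the README hunk does not mention. A minimal sketch of a mic toggle built on the declared methods follows. `toggleMic` and the `MicControls` interface are hypothetical, and the sketch assumes `canTurnOnMic()` reports whether `micOn()` is currently allowed, which the diff does not confirm.

```ts
// Structural subset of KflEmbedElement's public methods used by this sketch.
interface MicControls {
  isMicOn(): boolean;
  canTurnOnMic(): boolean;
  micOn(): Promise<void>;
  micOff(): Promise<void>;
}

// Flips the mic if allowed and resolves to the resulting on/off state.
async function toggleMic(el: MicControls): Promise<boolean> {
  if (el.isMicOn()) {
    await el.micOff();
  } else if (el.canTurnOnMic()) {
    await el.micOn();
  }
  return el.isMicOn();
}
```

Typing the parameter structurally means the same function accepts a `<kfl-embed>` element in the browser or a plain stub object in a unit test.
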
@@ -3,7 +3,7 @@
   *
   * These utilities help with PCM audio processing for voice AI integrations.
   */
- /** Sample rate for audio sent to Persona (matches Gemini output) */
+ /** Standard output sample rate for audio sent to Persona */
  export declare const SAMPLE_RATE = 24000;
  /**
   * Convert base64-encoded audio to Uint8Array.
@@ -1,7 +1,5 @@
- import { GeminiLiveAgent, GeminiLiveConfig } from './gemini-live';
  import { ElevenLabsAgent, ElevenLabsConfig } from './elevenlabs';
- import { CartesiaAgent, CartesiaConfig } from './cartesia';
- import { VapiAgent, VapiConfig } from './vapi';
+ import { OpenAIRealtimeAgent, OpenAIRealtimeConfig, TurnDetection } from './openai-realtime';
  /**
   * Agent implementations for voice AI platforms.
   *
@@ -10,13 +8,11 @@ import { VapiAgent, VapiConfig } from './vapi';
   */
  export { BaseAgent, DEFAULT_INPUT_SAMPLE_RATE } from './base';
  export type { Agent, AgentConfig, AgentEventMap, AgentState, Emotion } from './types';
- export { GeminiLiveAgent, type GeminiLiveConfig };
  export { ElevenLabsAgent, type ElevenLabsConfig };
- export { CartesiaAgent, type CartesiaConfig };
- export { VapiAgent, type VapiConfig };
+ export { OpenAIRealtimeAgent, type OpenAIRealtimeConfig, type TurnDetection };
  export { SAMPLE_RATE, base64ToBytes, bytesToBase64, resamplePcm, createEventEmitter, floatTo16BitPCM } from './audio-utils';
  /** Supported agent types */
- export type AgentType = 'gemini' | 'elevenlabs' | 'cartesia' | 'vapi';
+ export type AgentType = 'elevenlabs' | 'openai';
  /** Agent type metadata */
  export interface AgentTypeInfo {
      id: AgentType;
@@ -27,26 +23,22 @@ export interface AgentTypeInfo {
  export declare const AGENT_REGISTRY: AgentTypeInfo[];
  /** Configuration types by agent type */
  export interface AgentConfigMap {
-     gemini: GeminiLiveConfig;
      elevenlabs: ElevenLabsConfig;
-     cartesia: CartesiaConfig;
-     vapi: VapiConfig;
+     openai: OpenAIRealtimeConfig;
  }
  /** Union type of all agent instances */
- export type AnyAgent = GeminiLiveAgent | ElevenLabsAgent | CartesiaAgent | VapiAgent;
+ export type AnyAgent = ElevenLabsAgent | OpenAIRealtimeAgent;
  /**
   * Create an agent instance by type.
   *
   * @example
   * ```ts
-  * const agent = createAgent('gemini');
-  * await agent.connect({ apiKey: 'YOUR_KEY' });
+  * const agent = createAgent('elevenlabs');
+  * await agent.connect({ agentId: '...', signedUrl: '...' });
   * ```
   */
- export declare function createAgent(type: 'gemini'): GeminiLiveAgent;
  export declare function createAgent(type: 'elevenlabs'): ElevenLabsAgent;
- export declare function createAgent(type: 'cartesia'): CartesiaAgent;
- export declare function createAgent(type: 'vapi'): VapiAgent;
+ export declare function createAgent(type: 'openai'): OpenAIRealtimeAgent;
  export declare function createAgent(type: AgentType): AnyAgent;
  /**
   * Get agent type metadata by ID.
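
Reviewer note: these hunks narrow `AgentType` from four members to two, so callers that pass an agent type read from configuration may now hold a value that no longer type-checks. The sketch below shows one way to validate such a string before calling `createAgent`. `checkAgentType` is a hypothetical helper, not part of the package; the lists of supported and removed names are taken directly from the diff.

```ts
type AgentType = 'elevenlabs' | 'openai';

const SUPPORTED_AGENT_TYPES: readonly string[] = ['elevenlabs', 'openai'];
// Members dropped from the AgentType union between 0.2.1 and 0.4.0.
const REMOVED_AGENT_TYPES: readonly string[] = ['gemini', 'cartesia', 'vapi'];

type AgentTypeCheck = { ok: true; type: AgentType } | { ok: false; reason: string };

// Narrows an arbitrary string to AgentType, or explains why it was rejected.
function checkAgentType(value: string): AgentTypeCheck {
  if (SUPPORTED_AGENT_TYPES.includes(value)) {
    return { ok: true, type: value as AgentType };
  }
  if (REMOVED_AGENT_TYPES.includes(value)) {
    return { ok: false, reason: `agent type "${value}" was removed in 0.4.0` };
  }
  return { ok: false, reason: `unknown agent type "${value}"` };
}
```

Distinguishing removed names from unknown ones gives a clearer message to anyone upgrading with an old configuration value.
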
@@ -0,0 +1,60 @@
+ import { AgentConfig } from './types';
+ import { BaseAgent } from './base';
+ /**
+  * Turn detection configuration for OpenAI Realtime.
+  * @see https://developers.openai.com/api/docs/guides/realtime-vad
+  */
+ export type TurnDetection = {
+     type: 'server_vad';
+     /** Activation threshold 0-1. Higher = requires louder audio. */
+     threshold?: number;
+     /** Audio (ms) to include before detected speech. */
+     prefix_padding_ms?: number;
+     /** Silence duration (ms) before speech stop is detected. */
+     silence_duration_ms?: number;
+ } | {
+     type: 'semantic_vad';
+     /** How eager the model is to consider a turn finished. Default: 'auto'. */
+     eagerness?: 'low' | 'medium' | 'high' | 'auto';
+ };
+ /** OpenAI Realtime specific configuration */
+ export interface OpenAIRealtimeConfig extends AgentConfig {
+     /** Model to use (defaults to gpt-realtime) */
+     model?: string;
+     /** Turn detection / VAD settings. Defaults to semantic_vad with eagerness 'high'. */
+     turnDetection?: TurnDetection;
+ }
+ /**
+  * OpenAI Realtime agent implementation.
+  *
+  * Handles WebSocket connection to OpenAI Realtime and converts
+  * audio responses to events that Persona SDK can consume.
+  */
+ export declare class OpenAIRealtimeAgent extends BaseAgent {
+     protected readonly agentName = "OpenAIRealtime";
+     private connectResolve;
+     private connectReject;
+     private connectTimeout;
+     private initialSessionUpdate;
+     private currentResponseHasAudio;
+     private currentTranscript;
+     private readonly handledFunctionCallIds;
+     private sourceInputSampleRate;
+     private pendingFunctionCallStartedAtMs;
+     private pendingFunctionCallNames;
+     connect(config: OpenAIRealtimeConfig): Promise<void>;
+     protected handleParsedMessage(message: unknown): void;
+     sendAudio(pcmData: Uint8Array): void;
+     close(): void;
+     private buildSessionUpdate;
+     private sendInitialSessionUpdate;
+     private handleResponseDone;
+     private handleFunctionCalls;
+     private handleFunctionCall;
+     private finishAudioTurn;
+     private resetTurnState;
+     private sendEvent;
+     private resolvePendingConnect;
+     private rejectPendingConnect;
+     private clearConnectTimeout;
+ }
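
Reviewer note: the doc comments above state that `turnDetection` defaults to `semantic_vad` with eagerness `'high'` and that `threshold` ranges from 0 to 1. The sketch below illustrates how a caller might apply that default and range check before connecting. `resolveTurnDetection` is a hypothetical helper; how the package itself resolves defaults is private (`buildSessionUpdate`) and not visible in this diff.

```ts
// Copied from the declaration in the hunk above.
type TurnDetection =
  | { type: 'server_vad'; threshold?: number; prefix_padding_ms?: number; silence_duration_ms?: number }
  | { type: 'semantic_vad'; eagerness?: 'low' | 'medium' | 'high' | 'auto' };

// Applies the documented default and rejects out-of-range thresholds.
function resolveTurnDetection(turnDetection?: TurnDetection): TurnDetection {
  if (turnDetection === undefined) {
    // Documented default on OpenAIRealtimeConfig.turnDetection.
    return { type: 'semantic_vad', eagerness: 'high' };
  }
  if (turnDetection.type === 'server_vad') {
    const { threshold } = turnDetection;
    if (threshold !== undefined && (threshold < 0 || threshold > 1)) {
      throw new RangeError(`threshold must be between 0 and 1, got ${threshold}`);
    }
  }
  return turnDetection;
}
```

The discriminated union lets TypeScript narrow on `type`, so `threshold` is only reachable in the `server_vad` branch and `eagerness` only in the `semantic_vad` branch.
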
@@ -39,7 +39,7 @@ export interface AgentEventMap {
          text: string;
          isFinal: boolean;
      };
-     /** Emotion change (currently only supported by ElevenLabs agent) */
+     /** Emotion change (supported by all agents: ElevenLabs, OpenAI Realtime) */
      emotion: Emotion;
      /** Agent connection closed (unexpected disconnect) */
      closed: {
@@ -50,7 +50,7 @@ export interface AgentEventMap {
  /**
   * Abstract agent interface.
   *
-  * Implement this for each voice AI platform (Gemini, ElevenLabs, Cartesia, etc.)
+  * Implement this for each voice AI platform (ElevenLabs, OpenAI Realtime, etc.)
   */
  export interface Agent {
      /** Current agent state */
package/dist/index.d.ts CHANGED
@@ -3,9 +3,10 @@ export type { PersonaEmbedOptions } from './PersonaEmbed';
  export { PersonaView } from './PersonaView';
  export type { PersonaViewOptions } from './PersonaView';
  export type { EmbedStatus, VideoFit, VoiceAgentDetails, SessionDetails, BaseCallbacks, } from './types';
- export { createAgent, GeminiLiveAgent, ElevenLabsAgent, CartesiaAgent, BaseAgent, AGENT_REGISTRY, getAgentInfo, } from './agents';
- export type { AgentType, AgentConfig, AgentEventMap, Agent, AnyAgent, AgentTypeInfo, GeminiLiveConfig, ElevenLabsConfig, CartesiaConfig, } from './agents';
+ export { createAgent, ElevenLabsAgent, OpenAIRealtimeAgent, BaseAgent, AGENT_REGISTRY, getAgentInfo, } from './agents';
+ export type { AgentType, AgentConfig, AgentEventMap, Agent, AnyAgent, AgentTypeInfo, ElevenLabsConfig, OpenAIRealtimeConfig, TurnDetection, } from './agents';
  export type { AgentState } from '@keyframelabs/sdk';
  export { floatTo16BitPCM, resamplePcm, base64ToBytes, bytesToBase64, SAMPLE_RATE, createEventEmitter, } from './agents';
+ export { KflEmbedElement } from './KflEmbedElement';
  export { ApiError as KeyframeApiError } from './ApiError';
  export type { ApiErrorPayload as KeyframeApiErrorPayload } from './ApiError';
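
Reviewer note: with `KflEmbedElement` now exported from the package root, the events listed in the README portion of this diff (`widgetstatechange`, `error`, `disconnected`) become the main way for host pages to observe the widget. The sketch below shows one way to collect them into a snapshot object. `trackEmbed`, `Listenable`, and `EmbedSnapshot` are hypothetical names, and the `detail` shapes are assumed to match the README's Events table.

```ts
type WidgetState = 'minimized' | 'active' | 'hidden';

// Minimal structural stand-in for the element, so the sketch runs without a DOM.
interface Listenable {
  addEventListener(type: string, listener: (event: unknown) => void): void;
}

interface EmbedSnapshot {
  widgetState: WidgetState | null;
  lastError: string | null;
  disconnected: boolean;
}

// Subscribes to the documented events and keeps the latest values in one object.
function trackEmbed(el: Listenable): EmbedSnapshot {
  const snapshot: EmbedSnapshot = { widgetState: null, lastError: null, disconnected: false };

  el.addEventListener('widgetstatechange', (event) => {
    const detail = (event as { detail?: { state?: WidgetState } }).detail;
    if (detail?.state) snapshot.widgetState = detail.state;
  });
  el.addEventListener('error', (event) => {
    // Guard against error events that carry no detail payload.
    const detail = (event as { detail?: { error?: Error } }).detail;
    snapshot.lastError = detail?.error?.message ?? 'unknown error';
  });
  el.addEventListener('disconnected', () => {
    snapshot.disconnected = true;
  });

  return snapshot;
}
```

At runtime in a browser, the element returned by `document.querySelector('kfl-embed')` provides the same `addEventListener` method, so it can be passed in place of the stub used for testing.
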