@jambonz/schema 0.1.6 → 0.2.1

package/AGENTS.md CHANGED
@@ -35,7 +35,7 @@ The verb schemas and JSON structure are identical in both modes. The difference
  - **Webhook**: Simple IVR, call routing, voicemail, basic gather-and-respond patterns.
  - **WebSocket**: LLM-powered voice agents, real-time audio streaming, complex conversational flows, anything requiring bidirectional communication, or asynchronous logic, or streaming tts.
 
- **IMPORTANT**: Any application that uses a speech-to-speech verb (`openai_s2s`, `google_s2s`, `deepgram_s2s`, `ultravox_s2s`, `elevenlabs_s2s`, `s2s`, or `pipeline`) MUST use WebSocket transport, not webhooks. These verbs require persistent bidirectional communication for real-time audio and events.
+ **IMPORTANT**: Any application that uses a speech-to-speech verb (`openai_s2s`, `google_s2s`, `deepgram_s2s`, `ultravox_s2s`, `elevenlabs_s2s`, `s2s`, or `agent`) MUST use WebSocket transport, not webhooks. These verbs require persistent bidirectional communication for real-time audio and events.
 
  ## Schema
 
@@ -62,10 +62,10 @@ Two tools are available:
  - **gather** — Collect speech (STT) and/or DTMF input. The workhorse for interactive menus and voice input.
 
  ### AI & Real-time
- - **openai_s2s** / **google_s2s** / **deepgram_s2s** / **ultravox_s2s** — Connect the caller to a vendor-specific LLM for real-time voice conversation. These are the **preferred** verbs when the vendor is known. Each handles the full STT→LLM→TTS pipeline with the vendor pre-set.
+ - **openai_s2s** / **google_s2s** / **deepgram_s2s** / **ultravox_s2s** — Connect the caller to a vendor-specific LLM for real-time voice conversation. These are the **preferred** verbs when the vendor is known. Each handles the full STT→LLM→TTS flow with the vendor pre-set.
  - **elevenlabs_s2s** — Connect the caller to an ElevenLabs Conversational AI agent. **Unlike other s2s vendors**, ElevenLabs requires a pre-configured `agent_id` (created in the ElevenLabs dashboard) rather than a model and messages. See [ElevenLabs S2S specifics](#elevenlabs-s2s-specifics) below.
  - **s2s** — Generic LLM voice conversation verb. Use only when the vendor is determined at runtime (e.g. from an env var). Requires `vendor` to be specified.
- - **pipeline** — Higher-level voice AI pipeline with integrated turn detection.
+ - **agent** — Higher-level voice AI agent with integrated turn detection. Mix-and-match STT, LLM, and TTS vendors.
  - **dialogflow** — Connect the caller to a Google Dialogflow agent (ES, CX, or CES).
  - **stream** — Stream raw audio to a websocket endpoint for custom processing.
  - **transcribe** — Real-time call transcription sent to a webhook.
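The transport rule in the IMPORTANT note can be checked mechanically before an application is deployed. The following is a minimal sketch and not part of `@jambonz/schema`: the names `Verb`, `websocketOnlyVerbs`, and `requiresWebSocket` are hypothetical, and the verb list is copied from the 0.2.1 wording of AGENTS.md.

```typescript
// Verbs that AGENTS.md (0.2.1) says MUST use WebSocket transport.
const WEBSOCKET_ONLY_VERBS = new Set<string>([
  'openai_s2s', 'google_s2s', 'deepgram_s2s', 'ultravox_s2s',
  'elevenlabs_s2s', 's2s', 'agent',
]);

// Hypothetical minimal shape of a verb object: only the `verb` name matters here.
interface Verb {
  verb: string;
  [key: string]: unknown;
}

// Returns the names of verbs in the app that rule out webhook transport.
function websocketOnlyVerbs(app: Verb[]): string[] {
  return app.map((v) => v.verb).filter((name) => WEBSOCKET_ONLY_VERBS.has(name));
}

// True when at least one verb in the app requires WebSocket transport.
function requiresWebSocket(app: Verb[]): boolean {
  return websocketOnlyVerbs(app).length > 0;
}
```

A check like this could run in a test suite so that an `agent` verb is never returned from a webhook handler by accident.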
package/README.md CHANGED
@@ -4,7 +4,7 @@ JSON Schema definitions and validation for jambonz verb applications.
 
  ## What's Included
 
- - **33 verb schemas** (`verbs/`) -- every jambonz verb (say, gather, dial, openai_s2s, pipeline, etc.)
+ - **33 verb schemas** (`verbs/`) -- every jambonz verb (say, gather, dial, openai_s2s, agent, etc.)
  - **42 component schemas** (`components/`) -- shared types (synthesizer, recognizer, target, actionHook, etc.)
  - **32 callback schemas** (`callbacks/`) -- actionHook payload definitions for each verb
  - **AGENTS.md** -- language-agnostic developer guide covering the verb model, transport modes, and protocol
@@ -1,8 +1,8 @@
  {
  "$schema": "https://json-schema.org/draft/2020-12/schema",
- "$id": "https://jambonz.org/schema/callbacks/pipeline-turn",
- "title": "Pipeline EventHook Events",
- "description": "Events sent to the pipeline verb's eventHook during a conversation. These are sent as 'pipeline:event' messages over the WebSocket connection.",
+ "$id": "https://jambonz.org/schema/callbacks/agent-turn",
+ "title": "Agent EventHook Events",
+ "description": "Events sent to the agent verb's eventHook during a conversation. These are sent as 'agent:event' messages over the WebSocket connection.",
  "type": "object",
  "oneOf": [
  {
@@ -84,7 +84,7 @@
  {
  "properties": {
  "type": {
- "const": "agent_response",
+ "const": "llm_response",
  "description": "Sent when the LLM has finished generating its response for the current turn. Contains the complete response text."
  },
  "response": {
@@ -143,16 +143,16 @@ For any command not covered by a specific method:
  session.injectCommand('commandName', { ...data });
  ```
 
- ## Pipeline Update
+ ## Agent Update
 
- The `updatePipeline()` method sends mid-conversation updates to an active `pipeline` verb. Four operation types are supported:
+ The `updateAgent()` method sends mid-conversation updates to an active `agent` verb. Four operation types are supported:
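The four operation types can be summarized as a TypeScript discriminated union. This is a sketch inferred from the examples in this guide, not a type exported by any jambonz package: the names `AgentUpdate` and `describeUpdate` are hypothetical, the tool shape is left as `unknown`, and treating `user_input` as optional is an assumption.

```typescript
// Hypothetical payload types for the four update operations.
// Field names come from the guide's examples; everything else is assumed.
type AgentUpdate =
  | { type: 'update_instructions'; instructions: string }
  | { type: 'inject_context'; messages: { role: string; content: string }[] }
  | { type: 'update_tools'; tools: unknown[] }
  | { type: 'generate_reply'; user_input?: string; instructions?: string; interrupt?: boolean };

// Returns a short human-readable summary, e.g. for logging before sending an update.
function describeUpdate(update: AgentUpdate): string {
  switch (update.type) {
    case 'update_instructions':
      return `replace system prompt (${update.instructions.length} chars)`;
    case 'inject_context':
      return `append ${update.messages.length} message(s)`;
    case 'update_tools':
      return `replace tool set with ${update.tools.length} tool(s)`;
    case 'generate_reply':
      return update.interrupt ? 'generate reply now (interrupting)' : 'generate reply (queued if busy)';
  }
}
```

Typing the payload this way lets the compiler reject a misspelled `type` value before it reaches the call.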
 
  ### Update Instructions
 
  Replace the LLM system prompt while the conversation is in progress:
 
  ```typescript
- session.updatePipeline({
+ session.updateAgent({
  type: 'update_instructions',
  instructions: 'You are now a billing support agent. Help the caller with invoice questions.',
  });
@@ -163,7 +163,7 @@ session.updatePipeline({
  Append messages to the LLM conversation history (e.g. CRM data retrieved after the call started):
 
  ```typescript
- session.updatePipeline({
+ session.updateAgent({
  type: 'inject_context',
  messages: [
  { role: 'system', content: 'Customer account #12345: Gold tier, 3 open tickets.' },
@@ -176,7 +176,7 @@ session.updatePipeline({
  Replace the tool set available to the LLM:
 
  ```typescript
- session.updatePipeline({
+ session.updateAgent({
  type: 'update_tools',
  tools: [
  {
@@ -193,24 +193,24 @@ session.updatePipeline({
 
  ### Generate Reply
 
- Prompt the LLM to generate a new response. If the pipeline is not idle, the request is queued and executes when the current turn completes. Use `interrupt: true` to cancel the current response and generate immediately.
+ Prompt the LLM to generate a new response. If the agent is not idle, the request is queued and executes when the current turn completes. Use `interrupt: true` to cancel the current response and generate immediately.
 
  ```typescript
  // Simple prompt
- session.updatePipeline({
+ session.updateAgent({
  type: 'generate_reply',
  user_input: 'The customer just entered their account number: 12345',
  });
 
  // With one-shot instructions
- session.updatePipeline({
+ session.updateAgent({
  type: 'generate_reply',
  user_input: 'Customer is asking about refunds',
  instructions: 'Be empathetic and offer a 20% discount before processing a refund.',
  });
 
  // Interrupt current response and generate a new one
- session.updatePipeline({
+ session.updateAgent({
  type: 'generate_reply',
  user_input: 'Urgent: supervisor override',
  interrupt: true,
@@ -219,7 +219,7 @@ session.updatePipeline({
 
  ## LLM Tool Output
 
- When using the `pipeline` verb with a `toolHook`, tool call requests arrive as events. Return results with:
+ When using the `agent` verb with a `toolHook`, tool call requests arrive as events. Return results with:
 
  ```typescript
  session.on('/tool-hook', (evt: Record<string, any>) => {
@@ -237,7 +237,7 @@ The result is stringified and fed back to the LLM as the tool response.
 
  ## Building a Cascaded Voice AI Agent
 
- The **pipeline** verb is the simplest way to build a voice AI agent — jambonz manages everything. But when you need full control over the LLM interaction (custom tool handling, conversation history management, multiple LLM providers, etc.), build a **cascaded agent**: your app handles STT transcripts and LLM calls directly, piping responses back via TTS token streaming.
+ The **agent** verb is the simplest way to build a voice AI agent — jambonz manages everything. But when you need full control over the LLM interaction (custom tool handling, conversation history management, multiple LLM providers, etc.), build a **cascaded agent**: your app handles STT transcripts and LLM calls directly, piping responses back via TTS token streaming.
 
  ### Architecture
 
@@ -257,9 +257,9 @@ User speaks again → bargeIn fires → repeat
 
  The key mechanism is the `bargeIn` actionHook on the `config` verb. When enabled with `sticky: true`, it persists across all verbs. Whenever the caller speaks, the `/speech-detected` hook fires with the speech transcript — even while TTS is playing (which triggers an interruption). Your app then calls the LLM and streams the response back.
 
- ### When to Use Cascaded vs Pipeline
+ ### When to Use Cascaded vs Agent
 
- | | Pipeline verb | Cascaded agent |
+ | | Agent verb | Cascaded agent |
  |---|---|---|
  | **STT/LLM/TTS** | jambonz orchestrates all three | App owns the LLM; jambonz handles STT and TTS |
  | **Turn detection** | Built-in (Krisp or STT-native) | App manages via bargeIn actionHook |
@@ -1,8 +1,8 @@
  ## Overview
 
- The pipeline verb orchestrates a complete voice AI agent by wiring together three separate components — STT, LLM, and TTS — with integrated turn detection. Unlike the s2s verbs (where a single vendor handles everything), pipeline lets you mix and match: e.g. Deepgram for STT, Anthropic for the LLM, and Cartesia for TTS.
+ The agent verb orchestrates a complete voice AI agent by wiring together three separate components — STT, LLM, and TTS — with integrated turn detection. Unlike the s2s verbs (where a single vendor handles everything), the agent verb lets you mix and match: e.g. Deepgram for STT, Anthropic for the LLM, and Cartesia for TTS.
 
- Pipeline manages the full conversational turn cycle:
+ The agent manages the full conversational turn cycle:
  1. User speaks → STT produces a transcript
  2. Turn detection decides the user is done speaking
  3. Transcript is sent to the LLM
@@ -12,7 +12,7 @@ Pipeline manages the full conversational turn cycle:
 
  ## Turn detection
 
- The `turnDetection` property controls how the pipeline decides the user has finished speaking.
+ The `turnDetection` property controls how the agent decides the user has finished speaking.
 
  **`"stt"` (default)** — Uses the STT vendor's native end-of-utterance signal. For most vendors this is silence-based. Some vendors have smarter built-in turn detection:
  - **deepgramflux** — Acoustic + semantic turn detection (Deepgram's "Flux" model)
@@ -105,15 +105,15 @@ The `eventHook` receives real-time events during the conversation. In WebSocket
  | Event type | Description | Key fields |
  |---|---|---|
  | `user_transcript` | User speech recognized | `transcript` |
- | `agent_response` | Assistant reply text | `response` |
+ | `llm_response` | Assistant reply text | `response` |
  | `user_interruption` | User barged in | — |
  | `turn_end` | End-of-turn summary | `transcript`, `response`, `interrupted`, `latency` |
 
- The `turn_end` event is the most useful for observability. It includes per-component latency metrics (STT, LLM, TTS) in milliseconds. See the `callback:pipeline-turn` schema for the full payload structure.
+ The `turn_end` event is the most useful for observability. It includes per-component latency metrics (STT, LLM, TTS) in milliseconds. See the `callback:agent-turn` schema for the full payload structure.
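
Because the reply event's `type` changes from `agent_response` to `llm_response` in this release, an eventHook handler that matches on the old name would stop matching without raising an error. A minimal sketch of a handler-side helper follows. The `AgentEvent` interface and the `assistantReply` function are hypothetical names, and only the fields listed in the event table are modeled; accepting the old name as well is an application-level choice, not a jambonz feature.

```typescript
// Event shape based on the eventHook table; `latency` is left opaque because
// its structure is defined in the callback:agent-turn schema, not shown here.
interface AgentEvent {
  type: string;
  transcript?: string;
  response?: string;
  interrupted?: boolean;
  latency?: unknown;
}

// Returns the assistant reply text for reply events, or undefined for any other event.
// Accepts both the 0.2.1 name ('llm_response') and the 0.1.6 name ('agent_response').
function assistantReply(evt: AgentEvent): string | undefined {
  if (evt.type === 'llm_response' || evt.type === 'agent_response') {
    return evt.response;
  }
  return undefined;
}
```

Once every deployment runs the newer schema, the `agent_response` branch can be dropped.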
 
  ## toolHook (function calling)
 
- When the LLM requests a tool/function call, the pipeline sends a request to the `toolHook` with:
+ When the LLM requests a tool/function call, the agent sends a request to the `toolHook` with:
 
  ```json
  {
@@ -131,11 +131,11 @@ The `arguments` field is already parsed (an object, not a JSON string).
 
  ## MCP servers (external tools)
 
- Instead of (or in addition to) defining tools inline via `llmOptions.tools` and handling them with `toolHook`, you can connect to external MCP servers. The pipeline connects to each server at startup via SSE transport, discovers available tools, and makes them available to the LLM alongside any inline tools.
+ Instead of (or in addition to) defining tools inline via `llmOptions.tools` and handling them with `toolHook`, you can connect to external MCP servers. The agent connects to each server at startup via SSE transport, discovers available tools, and makes them available to the LLM alongside any inline tools.
 
  ```json
  {
- "verb": "pipeline",
+ "verb": "agent",
  "mcpServers": [
  {
  "url": "https://livescoremcp.com/sse"
@@ -155,7 +155,7 @@ Instead of (or in addition to) defining tools inline via `llmOptions.tools` and
  }
  ```
 
- The [LiveScore MCP server](https://livescoremcp.com/) is a free, public MCP server that exposes tools for live football scores, fixtures, team stats, and player data. The pipeline discovers these tools automatically at startup — no need to define tool schemas in `llmOptions.tools`. A caller can simply ask "what football matches are on right now?" and the LLM will use the `get_live_scores` tool to fetch real-time data.
+ The [LiveScore MCP server](https://livescoremcp.com/) is a free, public MCP server that exposes tools for live football scores, fixtures, team stats, and player data. The agent discovers these tools automatically at startup — no need to define tool schemas in `llmOptions.tools`. A caller can simply ask "what football matches are on right now?" and the LLM will use the `get_live_scores` tool to fetch real-time data.
 
  If an MCP server requires authentication, pass credentials in the `auth` property:
 
@@ -172,13 +172,13 @@ If an MCP server requires authentication, pass credentials in the `auth` propert
  }
  ```
 
- **How tool dispatch works**: When the LLM requests a tool call, the pipeline checks MCP servers first. If the tool name matches one discovered from an MCP server, the call is dispatched there directly and the result is fed back to the LLM. If no MCP server provides the tool, it falls through to the `toolHook` webhook. You can use both MCP servers and `toolHook` together — MCP handles the tools it knows about, and `toolHook` handles the rest.
+ **How tool dispatch works**: When the LLM requests a tool call, the agent checks MCP servers first. If the tool name matches one discovered from an MCP server, the call is dispatched there directly and the result is fed back to the LLM. If no MCP server provides the tool, it falls through to the `toolHook` webhook. You can use both MCP servers and `toolHook` together — MCP handles the tools it knows about, and `toolHook` handles the rest.
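
The dispatch order described above reduces to a single routing decision per tool call. The sketch below models only that decision and is not jambonz source code: `ToolRoute` and `routeToolCall` are hypothetical names, and `mcpToolNames` stands in for the set of tool names discovered from MCP servers at startup.

```typescript
// Where a tool call ends up: an MCP server or the application's toolHook.
type ToolRoute = 'mcp' | 'toolHook';

// MCP servers are checked first; anything they do not provide falls through to toolHook.
function routeToolCall(toolName: string, mcpToolNames: ReadonlySet<string>): ToolRoute {
  return mcpToolNames.has(toolName) ? 'mcp' : 'toolHook';
}

// Example: `get_live_scores` is named in the guide; `lookup_order` is a made-up inline tool.
const discovered = new Set<string>(['get_live_scores']);
const routes = ['get_live_scores', 'lookup_order'].map((name) => routeToolCall(name, discovered));
console.log(routes.join(', '));
```

One consequence of this ordering, under the stated behavior, is that an inline tool sharing a name with an MCP tool would never reach `toolHook`.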
 
- **TypeScript example** — a pipeline agent with the LiveScore MCP server:
+ **TypeScript example** — an agent with the LiveScore MCP server:
 
  ```typescript
  session
- .pipeline({
+ .agent({
  stt: { vendor: 'deepgram', language: 'en-US' },
  tts: { vendor: 'cartesia', voice: 'sonic-english' },
  llm: {
@@ -196,18 +196,18 @@ session
  // { url: 'https://mcp.example.com/sse', auth: { apiKey: 'your-key' } },
  ],
  turnDetection: 'krisp',
- actionHook: '/pipeline-complete',
+ actionHook: '/agent-complete',
  })
  .send();
  ```
 
  ## Mid-conversation updates
 
- The pipeline supports asynchronous updates while a conversation is in progress. These let you change the agent's behavior, inject new context, modify available tools, or trigger a new LLM response — without interrupting the current verb stack.
+ The agent supports asynchronous updates while a conversation is in progress. These let you change the agent's behavior, inject new context, modify available tools, or trigger a new LLM response — without interrupting the current verb stack.
 
  Updates can be sent via:
- - **WebSocket**: `session.updatePipeline(data)` (sends a `pipeline:update` command)
- - **REST API**: `client.calls.updatePipeline(callSid, data)` (sends `pipeline_update` in the PUT body)
+ - **WebSocket**: `session.updateAgent(data)` (sends an `agent:update` command)
+ - **REST API**: `client.calls.updateAgent(callSid, data)` (sends `agent_update` in the PUT body)
 
  ### update_instructions
 
@@ -215,13 +215,13 @@ Replace the LLM system prompt mid-conversation. Useful when the conversation tra
 
  ```typescript
  // WebSocket
- session.updatePipeline({
+ session.updateAgent({
  type: 'update_instructions',
  instructions: 'You are now a billing support agent. Help the caller with invoice questions.',
  });
 
  // REST
- await client.calls.updatePipeline(callSid, {
+ await client.calls.updateAgent(callSid, {
  type: 'update_instructions',
  instructions: 'You are now a billing support agent. Help the caller with invoice questions.',
  });
@@ -232,7 +232,7 @@ await client.calls.updatePipeline(callSid, {
  Append messages to the LLM conversation history. Useful for injecting CRM data, call notes, or other context retrieved after the call started.
 
  ```typescript
- session.updatePipeline({
+ session.updateAgent({
  type: 'inject_context',
  messages: [
  { role: 'system', content: 'Customer account #12345: Gold tier, 3 open tickets.' },
@@ -245,7 +245,7 @@ session.updatePipeline({
  Replace the tool set available to the LLM. The new tools take effect on the next LLM turn.
 
  ```typescript
- session.updatePipeline({
+ session.updateAgent({
  type: 'update_tools',
  tools: [
  {
@@ -262,26 +262,26 @@ session.updatePipeline({
 
  ### generate_reply
 
- Prompt the LLM to generate a new response. If the pipeline is currently idle, the prompt executes immediately. If the pipeline is busy (e.g. the assistant is speaking), the request is queued and executes when the current turn completes.
+ Prompt the LLM to generate a new response. If the agent is currently idle, the prompt executes immediately. If the agent is busy (e.g. the assistant is speaking), the request is queued and executes when the current turn completes.
 
  Use `interrupt: true` to cancel the current response and generate immediately — useful for supervisor overrides or urgent context changes.
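
The idle, busy, and interrupt cases described above can be read as a small decision table. The following sketch encodes that table for illustration only; it models the documented behavior and is not how jambonz implements it. The names `AgentState`, `ReplyRequest`, `Decision`, and `decideGenerateReply` are hypothetical.

```typescript
// Simplified view of the agent: either idle or in the middle of a turn.
type AgentState = 'idle' | 'busy';

// Only the fields that affect scheduling are modeled here.
interface ReplyRequest {
  user_input?: string;
  interrupt?: boolean;
}

type Decision = 'run_now' | 'queue' | 'interrupt_and_run';

// Idle: run immediately. Busy: queue, unless `interrupt: true` cancels the current response.
function decideGenerateReply(state: AgentState, req: ReplyRequest): Decision {
  if (state === 'idle') {
    return 'run_now';
  }
  return req.interrupt ? 'interrupt_and_run' : 'queue';
}
```

Note that `interrupt` has no effect in this model when the agent is idle, since there is no response to cancel.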
 
  ```typescript
  // Simple prompt
- session.updatePipeline({
+ session.updateAgent({
  type: 'generate_reply',
  user_input: 'The customer just entered their account number: 12345',
  });
 
  // With one-shot instructions
- session.updatePipeline({
+ session.updateAgent({
  type: 'generate_reply',
  user_input: 'Customer is asking about refunds',
  instructions: 'Be empathetic and offer a 20% discount before processing a refund.',
  });
 
  // Interrupt current response
- session.updatePipeline({
+ session.updateAgent({
  type: 'generate_reply',
  user_input: 'Urgent: supervisor override',
  interrupt: true,
@@ -326,11 +326,11 @@ For Anthropic models, use `"vendor": "anthropic"` and structure messages accordi
 
  ## Greeting
 
- By default (`greeting: true`), the pipeline prompts the LLM to generate an initial greeting before the user speaks. Set `greeting: false` if you want the agent to wait silently for the user to speak first.
+ By default (`greeting: true`), the agent prompts the LLM to generate an initial greeting before the user speaks. Set `greeting: false` if you want the agent to wait silently for the user to speak first.
 
  ## Complete example (TypeScript)
 
- A pipeline voice agent using Deepgram STT, OpenAI LLM, and Cartesia TTS with Krisp turn detection. Exposes multiple endpoints with different STT/TTS combinations:
+ A voice agent using Deepgram STT, OpenAI LLM, and Cartesia TTS with Krisp turn detection. Exposes multiple endpoints with different STT/TTS combinations:
 
  ```typescript
  import * as http from 'node:http';
@@ -354,19 +354,19 @@ function handleSession(session: Session) {
  const model = session.data.env_vars?.OPENAI_MODEL || 'gpt-4.1-mini';
  const systemPrompt = session.data.env_vars?.SYSTEM_PROMPT || envVars.SYSTEM_PROMPT.default;
 
- session.on('/pipeline-event', (evt: Record<string, unknown>) => {
+ session.on('/agent-event', (evt: Record<string, unknown>) => {
  if (evt.type === 'turn_end') {
  const { transcript, response, interrupted, latency } = evt as Record<string, unknown>;
  console.log('turn_end', JSON.stringify({ transcript, response, interrupted, latency }, null, 2));
  }
  });
 
- session.on('/pipeline-complete', () => {
+ session.on('/agent-complete', () => {
  session.hangup().reply();
  });
 
  session
- .pipeline({
+ .agent({
  stt: {
  vendor: 'deepgram',
  language: 'multi',
@@ -386,8 +386,8 @@ function handleSession(session: Session) {
  turnDetection: 'krisp',
  earlyGeneration: true,
  bargeIn: { enable: true },
- eventHook: '/pipeline-event',
- actionHook: '/pipeline-complete',
+ eventHook: '/agent-event',
+ actionHook: '/agent-complete',
  })
  .send();
  }
@@ -426,19 +426,19 @@ function handleSession(session) {
  const model = session.data.env_vars?.OPENAI_MODEL || 'gpt-4.1-mini';
  const systemPrompt = session.data.env_vars?.SYSTEM_PROMPT || envVars.SYSTEM_PROMPT.default;
 
- session.on('/pipeline-event', (evt) => {
+ session.on('/agent-event', (evt) => {
  if (evt.type === 'turn_end') {
  const { transcript, response, interrupted, latency } = evt;
  console.log('turn_end', JSON.stringify({ transcript, response, interrupted, latency }, null, 2));
  }
  });
 
- session.on('/pipeline-complete', () => {
+ session.on('/agent-complete', () => {
  session.hangup().reply();
  });
 
  session
- .pipeline({
+ .agent({
  stt: {
  vendor: 'deepgram',
  language: 'multi',
@@ -458,8 +458,8 @@ function handleSession(session) {
  turnDetection: 'krisp',
  earlyGeneration: true,
  bargeIn: { enable: true },
- eventHook: '/pipeline-event',
- actionHook: '/pipeline-complete',
+ eventHook: '/agent-event',
+ actionHook: '/agent-complete',
  })
  .send();
  }
@@ -28,7 +28,7 @@
  { "$ref": "verbs/deepgram_s2s" },
  { "$ref": "verbs/ultravox_s2s" },
  { "$ref": "verbs/dialogflow" },
- { "$ref": "verbs/pipeline" },
+ { "$ref": "verbs/agent" },
  { "$ref": "verbs/conference" },
  { "$ref": "verbs/transcribe" },
  { "$ref": "verbs/enqueue" },
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@jambonz/schema",
- "version": "0.1.6",
+ "version": "0.2.1",
  "description": "JSON Schema definitions and validation for jambonz verb applications",
  "main": "index.js",
  "scripts": {
@@ -1,13 +1,13 @@
  {
  "$schema": "https://json-schema.org/draft/2020-12/schema",
- "$id": "https://jambonz.org/schema/verbs/pipeline",
+ "$id": "https://jambonz.org/schema/verbs/agent",
  "minVersion": "10.1.0",
- "title": "Pipeline",
- "description": "Configures a complete STT → LLM → TTS voice AI pipeline with integrated turn detection. Provides a higher-level abstraction than manually orchestrating the individual components. Optimized for building voice AI agents with proper turn-taking behavior.",
+ "title": "Agent",
+ "description": "Configures a complete voice AI agent by wiring together STT → LLM → TTS with integrated turn detection. Provides a higher-level abstraction than manually orchestrating the individual components. Optimized for building voice AI agents with proper turn-taking behavior.",
  "type": "object",
  "properties": {
  "verb": {
- "const": "pipeline"
+ "const": "agent"
  },
  "id": {
  "type": "string",
@@ -15,11 +15,11 @@
  },
  "stt": {
  "$ref": "../components/recognizer",
- "description": "Speech-to-text configuration for the pipeline."
+ "description": "Speech-to-text configuration for the agent."
  },
  "tts": {
  "$ref": "../components/synthesizer",
- "description": "Text-to-speech configuration for the pipeline."
+ "description": "Text-to-speech configuration for the agent."
  },
  "turnDetection": {
  "oneOf": [
@@ -53,7 +53,7 @@
  }
  ],
  "default": "stt",
- "description": "Turn detection strategy. Controls when the pipeline decides the user has finished speaking. STT vendors with native turn-taking (deepgramflux, assemblyai, speechmatics) always use their built-in detection regardless of this setting."
+ "description": "Turn detection strategy. Controls when the agent decides the user has finished speaking. STT vendors with native turn-taking (deepgramflux, assemblyai, speechmatics) always use their built-in detection regardless of this setting."
  },
  "bargeIn": {
  "type": "object",
@@ -86,16 +86,16 @@
  },
  "llm": {
  "type": "object",
- "description": "LLM configuration for the pipeline. See the 'llm' verb schema for details.",
+ "description": "LLM configuration for the agent. See the 'llm' verb schema for details.",
  "additionalProperties": true
  },
  "actionHook": {
  "$ref": "../components/actionHook",
- "description": "A webhook invoked when the pipeline ends."
+ "description": "A webhook invoked when the agent ends."
  },
  "eventHook": {
  "$ref": "../components/actionHook",
- "description": "A webhook invoked for pipeline events. Receives event types: 'user_transcript' (user speech recognized), 'agent_response' (assistant reply), 'user_interruption' (barge-in detected), and 'turn_end' (end-of-turn summary with transcript, response, and latency metrics)."
+ "description": "A webhook invoked for agent events. Receives event types: 'user_transcript' (user speech recognized), 'llm_response' (assistant reply), 'user_interruption' (barge-in detected), and 'turn_end' (end-of-turn summary with transcript, response, and latency metrics)."
  },
  "toolHook": {
  "$ref": "../components/actionHook",
@@ -171,7 +171,7 @@
  },
  "required": ["url"]
  },
- "description": "External MCP servers that provide tools to the LLM. The pipeline connects at startup via SSE, discovers available tools, and makes them callable by the LLM."
+ "description": "External MCP servers that provide tools to the LLM. The agent connects at startup via SSE, discovers available tools, and makes them callable by the LLM."
  }
  },
  "required": [
@@ -179,7 +179,7 @@
  ],
  "examples": [
  {
- "verb": "pipeline",
+ "verb": "agent",
  "stt": {
  "vendor": "deepgram",
  "language": "en-US"
@@ -201,10 +201,10 @@
  }
  },
  "turnDetection": "stt",
- "actionHook": "/pipeline-complete"
+ "actionHook": "/agent-complete"
  },
  {
- "verb": "pipeline",
+ "verb": "agent",
  "stt": {
  "vendor": "deepgram",
  "language": "en-US"
@@ -234,7 +234,7 @@
  "minSpeechDuration": 0.3,
  "sticky": false
  },
- "actionHook": "/pipeline-complete"
+ "actionHook": "/agent-complete"
  }
  ]
  }
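
Since the verb schema's `const` changes from `pipeline` to `agent`, verb arrays written against 0.1.6 would no longer match the 0.2.1 verb schema. Assuming only the verb name needs to change, which is all the hunks above show, a one-off rewrite could look like the sketch below. `Verb` and `migrateVerbs` are hypothetical names and are not part of `@jambonz/schema`.

```typescript
// Hypothetical minimal shape of a verb object.
interface Verb {
  verb: string;
  [key: string]: unknown;
}

// Rewrites 0.1.6-style `pipeline` verbs to the 0.2.1 name `agent`.
// All other properties are copied unchanged. Hook paths such as '/pipeline-complete'
// are application-defined, so they are deliberately left alone.
function migrateVerbs(app: Verb[]): Verb[] {
  return app.map((v) => (v.verb === 'pipeline' ? { ...v, verb: 'agent' } : v));
}
```

Event handlers need a separate review, because the `agent_response` event type is renamed to `llm_response` in the same release.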