@pinecall/skills 0.1.0

Files changed (68)

package/skills/pinecall-reference/references/reference/events.md ADDED

@@ -0,0 +1,366 @@
+---
+title: "Events"
+description: "Every event the SDK emits, with payload shapes and timing."
+---
+# Events
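+This is the complete catalog of events. Subscribe via `agent.on(event, handler)`. Most call-scoped events pass the `Call` as the final argument; the session lifecycle events (`call.started`, `chat.started`, `whatsapp.started`, `call.preparing`, `call.ended`) pass it as the first.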
+## Real-time flow
+This is the order events fire during a typical exchange:
+```
+User speaks    →  speech.started
+               →  user.speaking  (interim, fires multiple times)
+               →  speech.ended
+               →  user.message   (final confirmed text)
+               →  eager.turn / turn.end
+Bot responds   →  bot.speaking   (message ID assigned)
+               →  bot.word       (word-by-word as TTS plays)
+               →  bot.finished   (done speaking)
+Interruption   →  bot.interrupted
+               →  turn.continued (active ReplyStreams auto-aborted)
+```
+## Lifecycle events
+### `call.started`
+```typescript
+agent.on("call.started", (call: Call) => { });
+```
+A new **voice** call connected (phone or WebRTC). The `Call` object is partially populated: `id`, `from`, `to`, `direction`, `transport`, and `metadata` are available, while `duration`, `endedAt`, and `reason` are not yet set.
+> **Note:** `call.started` fires only for voice transports (`phone`, `webrtc`). For chat and WhatsApp, use `chat.started` and `whatsapp.started` instead.
+### `chat.started`
+```typescript
+agent.on("chat.started", (call: Call) => { });
+```
+A new chat session started. Receives the same `Call` object, with `call.transport === "chat"`. Use `setPromptVars()`, `addContext()`, and all other Call methods as usual.
+### `whatsapp.started`
+```typescript
+agent.on("whatsapp.started", (call: Call, session: WhatsAppSession) => { });
+```
+A new WhatsApp session started (first message from a new contact). Receives both:
+- `call` — the universal `Call` object for `setPromptVars()`, `addContext()`, etc.
+- `session` — a `WhatsAppSession` with `contactPhone`, `contactName`, and history methods.
+### `call.preparing`
+```typescript
+agent.on("call.preparing", (call: Call) => { });
+```
+Fires before **every** LLM generation — voice, chat, and WhatsApp. Use it to refresh per-call variables that need to be current for every turn:
+```typescript
+agent.on("call.preparing", (call) => {
+  call.setPromptVars({
+    date_block: buildFreshDate(),
+    format_rules: call.transport === "phone" ? VOICE_FORMAT : CHAT_FORMAT,
+  });
+});
+```
+The server waits briefly (~150ms) for your handler to call `setPromptVars()` before proceeding with the LLM call. This runs just-in-time, so variables are always fresh — even in long-lived WhatsApp sessions.
+### `call.ended`
+```typescript
+agent.on("call.ended", (call: Call, reason: string) => { });
+```
+The call ended. The `Call` is now fully populated, including `duration`, `endedAt`, `messages`, and `transcript`.
+`reason` values: `hangup`, `timeout`, `idle_timeout`, `max_duration`, `no_answer`, `busy`, `failed`.
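Because `reason` arrives typed as a plain `string`, a handler may want to narrow it before branching. The following is a minimal sketch built only from the values listed above; `END_REASONS`, `isEndReason`, and `endedByLimit` are hypothetical helper names, not part of the SDK.

```typescript
// The documented reason values, kept as a literal tuple so a union type can be derived.
const END_REASONS = [
  "hangup",
  "timeout",
  "idle_timeout",
  "max_duration",
  "no_answer",
  "busy",
  "failed",
] as const;

type EndReason = (typeof END_REASONS)[number];

// Type guard: narrows an arbitrary string to one of the documented reasons.
function isEndReason(value: string): value is EndReason {
  return (END_REASONS as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: true for the two reasons that `session.timeout` also reports.
function endedByLimit(reason: string): boolean {
  return isEndReason(reason) && (reason === "idle_timeout" || reason === "max_duration");
}
```

An unknown string simply fails the guard, so a handler can log it instead of silently mis-bucketing a reason added in a later release.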
+## User speech events
+### `speech.started` / `speech.ended`
+```typescript
+agent.on("speech.started", (event, call: Call) => { });
+agent.on("speech.ended", (event, call: Call) => { });
+```
+VAD-level events that fire when the audio energy crosses the speech threshold.
+### `user.speaking`
+```typescript
+agent.on("user.speaking", (event: { text: string }, call: Call) => { });
+```
+Interim STT transcript. Fires multiple times as the STT engine refines its guess.
+### `user.message`
+```typescript
+agent.on("user.message", (event: { text: string; messageId: string }, call: Call) => { });
+```
+Final confirmed user text. After this fires, `eager.turn` or `turn.end` follows shortly.
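A live transcript UI usually treats interim and final text differently. The sketch below assumes that each `user.speaking` event supersedes the previous interim guess and that `user.message` commits the line; the `Transcript` shape and the function names are hypothetical.

```typescript
interface Transcript {
  committed: string[]; // final texts, one per user.message
  interim: string;     // latest guess from user.speaking, "" when none
}

function emptyTranscript(): Transcript {
  return { committed: [], interim: "" };
}

// user.speaking: replace the interim guess with the newer one.
function applyInterim(t: Transcript, text: string): Transcript {
  return { committed: t.committed, interim: text };
}

// user.message: commit the final text and clear the interim guess.
function applyFinal(t: Transcript, text: string): Transcript {
  return { committed: [...t.committed, text], interim: "" };
}

// Render committed lines, followed by the interim guess marked with "...".
function renderTranscript(t: Transcript): string {
  const lines = t.interim !== "" ? [...t.committed, `${t.interim}...`] : t.committed;
  return lines.join("\n");
}
```

Keeping the state immutable makes it easy to feed the same reducer from `agent.on("user.speaking", ...)` and `agent.on("user.message", ...)` handlers.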
+## Turn events
+### `eager.turn`
+```typescript
+agent.on("eager.turn", (turn: { text: string; probability: number }, call: Call) => { });
+```
+Early signal that the user *probably* finished a turn. Use for low-latency responses — start the LLM, but be ready to abort if `turn.continued` fires.
+### `turn.end`
+```typescript
+agent.on("turn.end", (turn: { text: string; probability: number }, call: Call) => { });
+```
+Final turn signal. Higher confidence than `eager.turn`. This is where most apps trigger the LLM.
+### `turn.continued`
+```typescript
+agent.on("turn.continued", (event, call: Call) => { });
+```
+The user kept talking after a turn signal. Any active `ReplyStream` auto-aborts. Your handler doesn't need to do anything — just don't be surprised when the stream stops.
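For apps that run a client-side LLM, the three turn events can be modeled as a small state machine. This is a sketch under the assumption that a reply started on `eager.turn` is speculative until `turn.end` confirms it; the state names and `nextReplyState` are hypothetical.

```typescript
type TurnEvent = "eager.turn" | "turn.end" | "turn.continued";
type ReplyState = "idle" | "speculative" | "committed";

// Pure transition function: given the current reply state and a turn event,
// return the next state.
function nextReplyState(state: ReplyState, event: TurnEvent): ReplyState {
  switch (event) {
    case "eager.turn":
      // Start a speculative reply only if nothing is in flight.
      return state === "idle" ? "speculative" : state;
    case "turn.end":
      // Final signal: the reply (speculative or new) is now committed.
      return "committed";
    case "turn.continued":
      // The user kept talking: any in-flight reply is abandoned.
      return "idle";
  }
}
```

The `turn.continued` branch mirrors the auto-abort described above: application code only needs to reset its own bookkeeping, not stop the stream itself.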
+## Bot speech events
+Bot speech follows this lifecycle:
+```
+bot.speaking  →  bot.word × N  →  bot.finished      (completed normally)
+                                   bot.interrupted    (user barged in)
+                                   message.confirmed  (full text saved to history)
+```
+`call.currentBotText` accumulates `bot.word` events into a live preview string.
+It resets on each new `bot.speaking` and clears after `bot.finished` / `bot.interrupted`.
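One way to picture the accumulation described above is a small buffer that mirrors the three lifecycle points. This is not the SDK's implementation: `BotTextPreview` and its method names are hypothetical, and joining words with a single space is an assumption.

```typescript
class BotTextPreview {
  private words: string[] = [];

  // Current preview string, analogous to call.currentBotText.
  get text(): string {
    return this.words.join(" ");
  }

  // bot.speaking: a new utterance begins, so start from an empty buffer.
  onSpeaking(): void {
    this.words = [];
  }

  // bot.word: append the word that TTS just played.
  onWord(word: string): void {
    this.words.push(word);
  }

  // Called after bot.finished / bot.interrupted handlers have run.
  onDone(): void {
    this.words = [];
  }
}
```

Clearing in a separate `onDone` step reflects the documented ordering: the text is still readable inside the `bot.finished` and `bot.interrupted` handlers and disappears only afterwards.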
+### `bot.speaking`
+```typescript
+agent.on("bot.speaking", (event: { messageId: string; text: string }, call: Call) => { });
+```
+The bot started speaking a message. `messageId` lets you track this specific utterance.
+`text` contains the full response text for non-streaming replies (`call.say()`, `call.reply()`). For streaming replies (`call.replyStream()`), `text` is empty because tokens arrive incrementally — use `bot.word` events or `call.currentBotText` to track what the bot is saying.
+### `bot.word`
+```typescript
+agent.on("bot.word", (event: { messageId: string; word: string }, call: Call) => { });
+```
+A word was just played by TTS — synchronized with the actual audio playback. Use for live captions, subtitles, or transcript UIs.
+Each `bot.word` is automatically accumulated into `call.currentBotText`:
+```typescript
+// Live preview — grows word-by-word as the bot speaks
+agent.on("bot.word", (event, call) => {
+  console.log(`🗣  "${call.currentBotText}"`);
+  // "¡Hola!"
+  // "¡Hola! Estoy"
+  // "¡Hola! Estoy bien,"
+  // "¡Hola! Estoy bien, gracias."
+});
+```
+> **Note:** `bot.word` timing is aligned with TTS audio. If the bot says a 5-second sentence, words arrive spread across those 5 seconds — not all at once.
+### `bot.finished`
+```typescript
+agent.on("bot.finished", (event: { messageId: string; durationMs: number }, call: Call) => { });
+```
+The bot finished speaking. TTS audio fully played. `call.currentBotText` still contains the accumulated words during this handler — it clears immediately after.
+```typescript
+agent.on("bot.finished", (event, call) => {
+  console.log(`Done (${event.durationMs}ms): "${call.currentBotText}"`);
+});
+```
+### `bot.interrupted`
+```typescript
+agent.on("bot.interrupted", (event: { messageId: string; playedMs: number; reason: string }, call: Call) => { });
+```
+The user cut off the bot mid-speech. `call.currentBotText` shows what the bot managed to say before being interrupted.
+```typescript
+agent.on("bot.interrupted", (event, call) => {
+  console.log(`Interrupted after ${event.playedMs}ms, said: "${call.currentBotText}"`);
+});
+```
+## Protocol events
+### `message.confirmed`
+```typescript
+agent.on("message.confirmed", (event: { messageId: string }, call: Call) => { });
+```
+The server acknowledged a bot message you sent (via `say`, `reply`, or `replyStream`).
+### `llm.toolCall`
+```typescript
+agent.on("llm.toolCall", (data: {
+  msgId: string;
+  toolCalls: Array<{ id: string; name: string; arguments: string }>;
+}, call: Call) => { });
+```
+The server-side LLM is requesting one or more tool calls. If you defined tools with `tool()`, the SDK auto-executes them and sends results via `call.toolResult()`. This event still fires — use it for logging, metrics, or UI updates.
+See [Tools and Functions](/guides/tools-and-functions).
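For logging, the `arguments` string usually needs decoding first. The sketch below assumes it is JSON-encoded, as in the OpenAI function-calling format; that encoding is an assumption here, and `summarizeToolCalls` and `ToolCallSummary` are hypothetical names.

```typescript
interface ToolCall {
  id: string;
  name: string;
  arguments: string; // assumed to be a JSON-encoded object
}

interface ToolCallSummary {
  id: string;
  name: string;
  args: unknown;       // parsed arguments, or the raw string if parsing failed
  parseError: boolean;
}

// Decode each tool call's arguments without letting one bad payload throw.
function summarizeToolCalls(toolCalls: ToolCall[]): ToolCallSummary[] {
  return toolCalls.map((tc): ToolCallSummary => {
    try {
      return { id: tc.id, name: tc.name, args: JSON.parse(tc.arguments) as unknown, parseError: false };
    } catch {
      return { id: tc.id, name: tc.name, args: tc.arguments, parseError: true };
    }
  });
}
```

Inside an `llm.toolCall` handler this could be called as `summarizeToolCalls(data.toolCalls)` before writing to a log or metrics sink.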
+### `session.idleWarning`
+```typescript
+agent.on("session.idleWarning", (event: {
+  remainingSeconds: number;
+  idleTimeoutSeconds: number;
+}, call: Call) => { });
+```
+Fires before idle timeout. The user hasn't spoken in a while. Use it to prompt them.
+```typescript
+agent.on("session.idleWarning", (event, call) => {
+  call.say("Are you still there?");
+});
+```
+### `session.timeout`
+```typescript
+agent.on("session.timeout", (event: {
+  reason: "max_duration" | "idle_timeout";
+}, call: Call) => { });
+```
+A session limit was hit. The call is about to end.
+## WhatsApp events
+### `whatsapp.message`
+```typescript
+agent.on("whatsapp.message", (event: {
+  sessionId: string;
+  from: string;
+  name: string;
+  type: "text" | "audio" | "image" | "video" | "document";
+  text: string;
+  messageId: string;
+  paused: boolean;  // true when agent is paused (human-in-the-loop)
+}) => { });
+```
+Incoming WhatsApp message. For voice notes (`type: "audio"`), `text` is the transcript.
+When `paused` is `true`, the AI did not respond — a human should handle this message via `agent.sendMessage()`.
+### `whatsapp.response`
+```typescript
+agent.on("whatsapp.response", (event: {
+  sessionId: string;
+  to: string;
+  text: string;
+  source?: "human";  // present when sent by human via agent.sendMessage()
+}) => { });
+```
+The agent sent a WhatsApp response. When `source` is `"human"`, the message was sent by a human operator (not the AI).
+### `whatsapp.status`
+```typescript
+agent.on("whatsapp.status", (event: {
+  status: "sent" | "delivered" | "read";
+  recipient: string;
+  messageId: string;
+}) => { });
+```
+Delivery status update from Meta.
+## Human-in-the-loop events
+### `session.paused`
+```typescript
+agent.on("session.paused", (event: {
+  sessionId?: string;   // set for session-level pause
+  contact?: string;     // set for contact-level pause
+  // both undefined = global pause
+}) => { });
+```
+Confirmation that the agent was paused. Fires after `agent.pause()`.
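The three pause levels implied by the payload comments can be made explicit with a small classifier. This is a sketch; `pauseScope` is a hypothetical helper, and giving `sessionId` precedence when both fields are set is an assumption.

```typescript
interface PauseEvent {
  sessionId?: string;
  contact?: string;
}

type PauseScope = "session" | "contact" | "global";

// Classify a session.paused / session.resumed payload by which field is set.
function pauseScope(event: PauseEvent): PauseScope {
  if (event.sessionId !== undefined) return "session";
  if (event.contact !== undefined) return "contact";
  return "global"; // both undefined
}
```

A dashboard could use the result to decide whether to badge a single conversation, every conversation with one contact, or the whole agent.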
+### `session.resumed`
+```typescript
+agent.on("session.resumed", (event: {
+  sessionId?: string;
+  contact?: string;
+}) => { });
+```
+Confirmation that the agent was resumed. Fires after `agent.resume()`.
+## Audio metrics
+When you enable `analysis.send_audio_metrics`:
+```typescript
+agent.on("audio.metrics", (event: {
+  source: "user" | "bot";
+  energyDb: number;     // -60 to 0
+  rms: number;          // 0–1
+  peak: number;         // 0–1
+  isSpeech: boolean;
+  vadProb: number;      // 0–1
+}, call: Call) => { });
+```
+Use for live waveform UIs, energy meters, or VAD visualization.
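As an illustration of the energy-meter use case, the documented `energyDb` range of -60 to 0 can be mapped onto a 0 to 1 level. The linear mapping is a design choice for this sketch, not something the SDK prescribes, and both function names are hypothetical.

```typescript
// Map energyDb (documented range: -60 to 0) to a 0..1 meter level, clamping outliers.
function energyToLevel(energyDb: number): number {
  const clamped = Math.min(0, Math.max(-60, energyDb));
  return (clamped + 60) / 60;
}

// Render a level as a fixed-width text bar, e.g. for a terminal UI.
function meterBar(level: number, width = 10): string {
  const safeLevel = Math.min(1, Math.max(0, level));
  const filled = Math.round(safeLevel * width);
  return "#".repeat(filled) + "-".repeat(width - filled);
}
```

In an `audio.metrics` handler, `meterBar(energyToLevel(event.energyDb))` would give one bar per event; a real UI would likely smooth the values over time.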
+## SSE events
+When streamed over SSE (via `pc.stream()` or `agent.stream()`), each event has an `event:` field and a JSON `data:` body that includes the `agent` ID:
+```
+event: user.message
+data: {"callId":"CA123","text":"Hello","messageId":"msg_abc","agent":"mara"}
+```
+A `:ping` comment is sent every 30s as keepalive.
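For consumers that read the raw stream rather than going through the SDK, the sketch below parses a single frame of the shape shown above and skips comment lines such as `:ping`. It is a simplified reader, not a full Server-Sent Events parser, and `parseSseFrame` is a hypothetical name.

```typescript
interface SseEvent {
  event: string;
  data: unknown;
}

// Parse one SSE frame (the lines between blank-line separators).
// Returns null for frames without data, such as keepalive comments.
function parseSseFrame(frame: string): SseEvent | null {
  let event = "message"; // SSE default when no event: field is present
  const dataLines: string[] = [];
  for (const line of frame.split(/\r?\n/)) {
    if (line === "" || line.startsWith(":")) continue; // blank line or comment
    if (line.startsWith("event:")) {
      event = line.slice("event:".length).trim();
    } else if (line.startsWith("data:")) {
      dataLines.push(line.slice("data:".length).trim());
    }
  }
  if (dataLines.length === 0) return null;
  return { event, data: JSON.parse(dataLines.join("\n")) as unknown };
}
```

Because every payload carries the `agent` ID, one stream can be fanned out to per-agent handlers after parsing.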
+## What's next
+- [`Call` API reference](/api/call) — methods to call in response to events
+- [Multi-tenant](/guides/multi-tenant) — scope SSE event streams

package/skills/pinecall-reference/references/reference/llm-providers.md ADDED

@@ -0,0 +1,263 @@
+---
+title: "LLM Providers"
+description: "Server-side LLM providers and configuration."
+---
+# LLM Providers
+When using server-side LLM (the recommended path for most agents), the server runs the LLM and streams responses directly through TTS. Configure it via the `llm` and `prompt` fields on the agent.
+For client-side LLMs, see [ReplyStream](/api/reply-stream).
+## Quick start
+```typescript
+const agent = pc.agent("my-bot", {
+  voice: "elevenlabs/sarah",
+  stt: "deepgram/flux",
+  llm: "openai/gpt-5-chat-latest",
+  prompt: "You are a friendly assistant. Keep responses short.",
+});
+```
+The `llm` shortcut takes the `provider/model` format. `prompt` is a top-level field — no need to nest it inside an object.
+## Shortcut format
+```typescript
+// Recommended: provider/model
+llm: "openai/gpt-5-chat-latest"
+// Bare model name (assumes OpenAI)
+llm: "gpt-5-chat-latest"
+// Both expand to:
+// { provider: "openai", model: "gpt-5-chat-latest", enabled: true }
+```
+> The legacy `provider:model` format (e.g. `"openai:gpt-5-chat-latest"`) still works but is not recommended.
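To make the expansion rules concrete, here is a sketch of how the three accepted shortcut forms could be normalized. It illustrates the documented behavior and is not the SDK's actual code; `expandLlmShortcut` and `PROVIDER_ALIASES` are hypothetical names, and the alias table lists only the two aliases this reference mentions.

```typescript
interface LlmConfig {
  provider: string;
  model: string;
  enabled: boolean;
}

// Provider aliases mentioned in this reference.
const PROVIDER_ALIASES: Record<string, string> = {
  gemini: "google",
  claude: "anthropic",
};

function expandLlmShortcut(shortcut: string): LlmConfig {
  // Prefer the recommended "/" separator, then fall back to the legacy ":".
  let idx = shortcut.indexOf("/");
  if (idx === -1) idx = shortcut.indexOf(":");
  // Bare model name: assume OpenAI.
  if (idx === -1) return { provider: "openai", model: shortcut, enabled: true };
  const rawProvider = shortcut.slice(0, idx);
  const provider = PROVIDER_ALIASES[rawProvider] ?? rawProvider;
  return { provider, model: shortcut.slice(idx + 1), enabled: true };
}
```

All three spellings of the same OpenAI model produce an identical config object, which is the property the shortcut format relies on.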
+## Tuning with a full config object
+For `temperature`, `max_tokens`, and other tuning parameters, use the full config object:
+```typescript
+const agent = pc.agent("my-bot", {
+  voice: "elevenlabs/sarah",
+  stt: "deepgram/flux",
+  llm: {
+    provider: "openai",
+    model: "gpt-5-chat-latest",
+    enabled: true,
+    temperature: 0.3,      // 0-2. Lower = more deterministic
+    max_tokens: 256,        // caps response length
+  },
+  prompt: "You are a customer support agent. Be concise.",
+});
+```
+> **Tip:** `prompt` stays top-level even when using the full `llm` object. The server merges them. You can also put `prompt` inside the `llm` object — both work.
+## OpenAI
+```typescript
+llm: "openai/gpt-5-chat-latest"
+```
+Or with tuning:
+```typescript
+llm: {
+  provider: "openai",
+  model: "gpt-5-chat-latest",
+  enabled: true,
+  temperature: 0.7,
+  max_tokens: 512,
+}
+```
+**Model picker:**
+| Model | Best for |
+|---|---|
+| `gpt-5-chat-latest` | Most agents — strong reasoning, good cost (recommended default) |
+| `gpt-5-chat-mini` | Highest-volume, simple flows; lowest cost |
+## Mistral
+```typescript
+llm: "mistral/mistral-medium"
+```
+Or with tuning:
+```typescript
+llm: {
+  provider: "mistral",
+  model: "mistral-medium",
+  enabled: true,
+  temperature: 0.7,
+  max_tokens: 512,
+}
+```
+## Google (Gemini)
+```typescript
+llm: "google/gemini-2.0-flash"
+```
+Or with tuning:
+```typescript
+llm: {
+  provider: "google",
+  model: "gemini-2.0-flash",
+  enabled: true,
+  temperature: 0.7,
+  max_tokens: 512,
+}
+```
+> `gemini` is accepted as an alias for `google` (e.g. `llm: "gemini/gemini-2.5-flash"`).
+**Model picker:**
+| Model | Best for |
+|---|---|
+| `gemini-2.0-flash` | Most voice agents — fast and low cost (recommended default) |
+| `gemini-2.5-flash` | Stronger reasoning at a modest cost bump |
+## Anthropic
+```typescript
+llm: "anthropic/claude-haiku-4-5"
+```
+Or with tuning:
+```typescript
+llm: {
+  provider: "anthropic",
+  model: "claude-haiku-4-5",
+  enabled: true,
+  temperature: 0.7,
+  max_tokens: 512,
+}
+```
+> `claude` is accepted as an alias for `anthropic` (e.g. `llm: "claude/claude-sonnet-4-6"`).
+**Model picker:**
+| Model | Best for |
+|---|---|
+| `claude-haiku-4-5` | Most voice agents — fast and low cost (recommended default) |
+| `claude-sonnet-4-6` | Higher reasoning quality when latency/cost matter less |
+> Opus is intentionally **not** offered for voice agents — it's the premium tier (too slow/costly for real-time). Sonnet 4.6 and Haiku 4.5 are the supported Anthropic models. Set your `ANTHROPIC_API_KEY` on the server (managed) or add an Anthropic credential to your org (BYOK).
+## The `enabled` field
+`enabled: false` disables server-side LLM for this agent. The server still does STT and TTS, but it won't generate responses — you handle every `turn.end` yourself with a client-side LLM.
+```typescript
+// Server-side off — bring your own LLM
+const agent = pc.agent("my-bot", {
+  voice: "elevenlabs/sarah",
+  language: "en",
+  // no llm field — or llm: { provider: "openai", enabled: false }
+});
+agent.on("turn.end", async (turn, call) => {
+  const stream = call.replyStream(turn);
+  // ... your LLM here
+});
+```
+## Prompt template variables
+Define a prompt with `{{placeholders}}`. The server resolves them before each LLM request. Built-in: `{{date}}`, `{{time}}`.
+```typescript
+const agent = pc.agent("support-bot", {
+  voice: "elevenlabs/sarah",
+  stt: "deepgram/flux",
+  llm: "openai/gpt-5-chat-latest",
+  prompt: `You are {{agent_name}}, support agent at {{company}}.
+Today is {{date}}. Customer: {{customer_name}}.`,
+});
+```
+Set values per-call:
+```typescript
+agent.on("call.started", async (call) => {
+  await call.setPromptVars({
+    agent_name: "Nova",
+    company: "Acme",
+    customer_name: "Maria",
+  });
+});
+```
+See [Hot-Reload](/concepts/hot-reload) for the full pattern.
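Substitution itself happens on the server, but a local approximation is handy for unit-testing prompts. The sketch below assumes that placeholders are word characters inside double braces and that unknown placeholders are left untouched; both are assumptions, and `renderPrompt` is a hypothetical helper.

```typescript
// Replace {{name}} placeholders with values from vars.
// Placeholders without a value (for example server built-ins) are left as-is.
function renderPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match: string, name: string): string => {
    if (!Object.prototype.hasOwnProperty.call(vars, name)) return match;
    const value = vars[name];
    return value !== undefined ? value : match;
  });
}
```

Leaving `{{date}}` and `{{time}}` intact in the local render makes it obvious in test output which values the server is still expected to fill in.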
+## Temperature & max_tokens
+Standard parameters supported by all providers:
+- `temperature` — 0–2. Lower = more deterministic. For voice agents, `0.3–0.7` is typical.
+- `max_tokens` — caps response length. For voice, keep it short — `256–512` is common to avoid long monologues.
+```typescript
+// Short, deterministic answers (IVR, routing)
+llm: { provider: "openai", model: "gpt-5-chat-mini", temperature: 0.2, max_tokens: 128 }
+// Natural conversation
+llm: { provider: "openai", model: "gpt-5-chat-latest", temperature: 0.7, max_tokens: 512 }
+// Creative, open-ended
+llm: { provider: "openai", model: "gpt-5-chat-latest", temperature: 1.0, max_tokens: 1024 }
+```
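A config loader might check these two parameters before sending them to the server. The following sketch enforces only the 0 to 2 temperature range stated above plus a positive-integer `max_tokens`; the second rule is an assumption, and `tuningProblems` is a hypothetical helper.

```typescript
interface Tuning {
  temperature?: number;
  max_tokens?: number;
}

// Return a list of human-readable problems; an empty list means the values look sane.
function tuningProblems(t: Tuning): string[] {
  const problems: string[] = [];
  if (t.temperature !== undefined && (t.temperature < 0 || t.temperature > 2)) {
    problems.push("temperature must be between 0 and 2");
  }
  if (t.max_tokens !== undefined && (!Number.isInteger(t.max_tokens) || t.max_tokens < 1)) {
    problems.push("max_tokens must be a positive integer");
  }
  return problems;
}
```

Returning messages instead of throwing lets the caller decide whether a bad value is fatal or merely worth a warning.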
+## Tools
+Define tools with `tool()` and Zod schemas. The SDK auto-converts them to the OpenAI function-calling wire format and auto-executes them:
+```typescript
+import { tool } from "@pinecall/sdk";
+import { z } from "zod";
+const lookupOrder = tool({
+  name: "lookupOrder",
+  description: "Look up an order by ID",
+  schema: z.object({ orderId: z.string() }),
+  execute: async ({ orderId }) => await db.orders.findOne(orderId),
+});
+// Pass to agent config
+tools: [lookupOrder],
+```
+See [Tools and Functions](/guides/tools-and-functions) for the full pattern.
+## Hot-reloading the LLM
+Swap models or providers at runtime:
+```typescript
+// Agent-wide (all future calls)
+agent.update({ llm: "openai/gpt-5-chat-latest" });
+// One call only
+call.update({ llm: "mistral/mistral-medium" });
+```
+This is useful for A/B testing different models, or upgrading the model for VIP callers without redeploying.
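For the A/B-testing case, assignment should be stable so that a given call always lands in the same bucket. The sketch below picks a model deterministically from the call ID using a simple string hash; `hashString` and `pickVariant` are hypothetical helpers, and the hash is illustrative rather than statistically rigorous.

```typescript
// Simple 32-bit string hash (multiply-by-31 rolling hash).
function hashString(s: string): number {
  let h = 0;
  for (let i = 0; i < s.length; i++) {
    h = (h * 31 + s.charCodeAt(i)) >>> 0; // keep within unsigned 32 bits
  }
  return h;
}

// Deterministically choose one variant for a given call ID.
function pickVariant(callId: string, variants: readonly string[]): string {
  if (variants.length === 0) throw new Error("variants must not be empty");
  return variants[hashString(callId) % variants.length] as string;
}
```

The chosen string could then be passed to `call.update({ llm: ... })` when the call starts, keeping the experiment logic out of the agent's static config.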
+## What's next
+- [Server-side vs client-side LLM](/concepts/server-vs-client-llm)
+- [Tools and Functions](/guides/tools-and-functions)
+- [Hot-reload](/concepts/hot-reload)