npm - @pinecall/skills - Versions diffs - 0.1.0 - Mend

@pinecall/skills 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (68)

package/skills/pinecall-concepts/references/concepts/hot-reload.md ADDED

@@ -0,0 +1,119 @@
+---
+title: "Hot-Reload"
+description: "Change voice, language, prompt, tools — even during an active call."
+---
+# Hot-Reload
+Everything in Pinecall is hot-reloadable. Voice, language, STT provider, prompt, tools — all can change **during an active call**. The server applies changes on the next LLM turn.
+This isn't a power-user feature you'll use rarely. It's the foundation of how Pinecall handles real-world conversation: switching languages when the user does, injecting CRM context when the call connects, swapping voices for different contexts.
+## The scopes
+| Scope | Method | Affects |
+|---|---|---|
+| Agent defaults | `pc.agent("id", config)` | All future calls |
+| Agent hot-reload | `agent.update(updates)` | Agent defaults and all future calls |
+| Session (mid-call) | `call.update(opts)` | This call only |
+| Prompt (mid-call) | `call.setPrompt(text)` | This call's system prompt |
+| Template vars | `call.setPromptVars(vars)` | This call's `{{var}}` values |
+| Context | `call.addContext(text)` | Appended after prompt |
+## Updating the agent's defaults
+`agent.update()` updates the agent's defaults at runtime. Changes take effect on **all future calls** — existing calls keep their current config.
+```typescript
+// Switch the default voice to French
+agent.update({ voice: "elevenlabs/claire", language: "fr" });
+// Upgrade to a bigger model
+agent.update({ llm: "openai/gpt-5-chat-latest", prompt: "..." });
+// Swap STT providers
+agent.update({ stt: "gladia" });
+```
+No REST call needed. `agent.update()` uses the existing WebSocket — changes propagate to the server instantly.
+## Changing a live call
+`call.update()` changes the active call only. Other calls on the same agent are unaffected.
+```typescript
+// User asks for Spanish mid-conversation
+call.update({ voice: "elevenlabs/valentina", language: "es" });
+call.reply("¡Claro! Ahora hablo en español.");
+```
+The next TTS the bot produces uses the new voice. The next STT transcription uses the new language.
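The difference between the agent scope and the session scope can be pictured with a toy model. This is a minimal sketch under stated assumptions: `ToyAgent` and `ToyCall` are hypothetical names invented for illustration, and the copy-on-start behaviour is inferred from the text above ("existing calls keep their current config"), not taken from the SDK source.

```typescript
// Toy model only: ToyAgent and ToyCall are hypothetical names, not SDK classes.
interface VoiceConfig {
  voice: string;
  language: string;
}

class ToyCall {
  config: VoiceConfig;

  constructor(defaults: VoiceConfig) {
    // A call copies the agent defaults at the moment it starts.
    this.config = { ...defaults };
  }

  // Session scope: changes this call only.
  update(opts: Partial<VoiceConfig>): void {
    this.config = { ...this.config, ...opts };
  }
}

class ToyAgent {
  private defaults: VoiceConfig;

  constructor(defaults: VoiceConfig) {
    this.defaults = { ...defaults };
  }

  // Agent scope: changes the defaults that future calls will copy.
  update(updates: Partial<VoiceConfig>): void {
    this.defaults = { ...this.defaults, ...updates };
  }

  startCall(): ToyCall {
    return new ToyCall(this.defaults);
  }
}

const toyAgent = new ToyAgent({ voice: "elevenlabs/sarah", language: "en" });
const callA = toyAgent.startCall();
const callB = toyAgent.startCall();

callA.update({ voice: "elevenlabs/valentina", language: "es" }); // callA only
toyAgent.update({ voice: "elevenlabs/claire", language: "fr" }); // future calls

const callC = toyAgent.startCall();
console.log(callA.config.language, callB.config.language, callC.config.language);
```

In this model `callA` ends up in Spanish, `callB` stays in English because nothing touched it, and `callC` starts in French because it was created after the agent defaults changed.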
+## Prompt template variables
+Define a prompt with `{{placeholders}}`. The server resolves them before each LLM request. Built-in variables: `{{date}}`, `{{time}}`.
+```typescript
+const agent = pc.agent("support", {
+  llm: "openai/gpt-5-chat-latest",
+  prompt: `You are {{agent_name}}, support agent at {{company}}.
+Today is {{date}}, {{time}}.
+Customer: {{customer_name}} ({{tier}} tier).`,
+});
+agent.on("call.started", async (call) => {
+  const customer = await lookupCaller(call.from);
+  await call.setPromptVars({
+    agent_name: "Nova",
+    company: "Acme Corp",
+    customer_name: customer.name,
+    tier: customer.tier,
+  });
+  call.say(`Hi ${customer.name}! How can I help?`);
+});
+```
+This pattern lets you keep a single canonical prompt but personalize it for every caller without rewriting the whole template.
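To make the substitution step concrete, here is a minimal sketch of `{{placeholder}}` resolution. It is an assumption about how this kind of templating generally works, not the server's implementation: `renderPrompt` is a hypothetical helper, it does not model the built-in `{{date}}` and `{{time}}` variables, and leaving unknown placeholders untouched is a design choice made for this example only.

```typescript
// Hypothetical helper: NOT the Pinecall implementation, only an illustration
// of {{placeholder}} substitution as described above.
type PromptVars = Record<string, string>;

function renderPrompt(template: string, vars: PromptVars): string {
  // Replace each {{name}} with its value. Unknown placeholders are left
  // intact so a missing variable is easy to spot in the rendered prompt.
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) => {
    const value = vars[key];
    return value !== undefined ? value : match;
  });
}

const template =
  "You are {{agent_name}}, support agent at {{company}}. " +
  "Customer: {{customer_name}} ({{tier}} tier).";

// "tier" is deliberately omitted to show the fallback behaviour.
const rendered = renderPrompt(template, {
  agent_name: "Nova",
  company: "Acme Corp",
  customer_name: "Dana",
});

console.log(rendered);
```

Because `tier` was not supplied, the rendered text still contains the literal `{{tier}}`, which makes an unset variable visible when you log the prompt during development.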
+## Adding context mid-call
+Append dynamic context without replacing the prompt:
+```typescript
+agent.on("call.started", async (call) => {
+  const orders = await getRecentOrders(call.from);
+  await call.addContext(
+    `Recent orders:\n${orders.map((o) => `- ${o.id}: ${o.status}`).join("\n")}`,
+  );
+});
+```
+You can call `addContext` multiple times during a call — each call appends. Use it to inject anything that changes during the conversation: lookups, calculations, tool results you want the LLM to remember.
+## Replacing the prompt mid-call
+For more aggressive changes — escalation, new persona, mode switch — replace the whole prompt:
+```typescript
+call.setPrompt(
+  "You are now in escalation mode. Be more formal. Offer to connect to a human.",
+);
+```
+The next LLM turn uses the new prompt. History is preserved.
+## Why this matters
+Most voice platforms treat the agent as a fixed config: you upload a JSON, the platform serves it, the end. Changes require redeploying or hitting a dashboard.
+Pinecall treats the agent as **live state inside your process**. That changes what you can build:
+- **Personalize every call** — load CRM data on `call.started`, set prompt vars, the LLM knows about the customer from word one
+- **Multi-language by default** — detect language from the first user message, switch voice + STT accordingly
+- **Phase transitions** — `setPrompt` when the conversation enters a new mode (qualification → demo → close)
+- **Live A/B testing** — `agent.update` to flip the model or voice based on traffic without redeploying
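The phase-transition idea from the list above can be sketched as a small state machine. Only the `setPrompt` method name comes from this document; the phases, the prompt texts, the `advancePhase` helper, and the `PromptTarget` interface are hypothetical and exist purely to illustrate the pattern.

```typescript
// Hypothetical phase machine. Only the setPrompt method name comes from the
// text above; the phases, prompts, and helper are illustrative.
type Phase = "qualification" | "demo" | "close";

const PHASE_PROMPTS: Record<Phase, string> = {
  qualification: "Ask about the caller's team size and current tooling.",
  demo: "Walk the caller through the product. Keep answers short.",
  close: "Summarize next steps and offer to book a follow-up.",
};

const NEXT_PHASE: Record<Phase, Phase | null> = {
  qualification: "demo",
  demo: "close",
  close: null,
};

// Structural type: anything with a setPrompt method, such as a live call.
interface PromptTarget {
  setPrompt(text: string): void;
}

function advancePhase(call: PromptTarget, current: Phase): Phase {
  const next = NEXT_PHASE[current];
  if (next === null) return current; // already in the final phase
  call.setPrompt(PHASE_PROMPTS[next]);
  return next;
}

// Stand-in for a real call so the sketch runs on its own.
const sentPrompts: string[] = [];
const fakeCall: PromptTarget = {
  setPrompt: (text) => {
    sentPrompts.push(text);
  },
};

let phase: Phase = "qualification";
phase = advancePhase(fakeCall, phase); // moves to "demo"
phase = advancePhase(fakeCall, phase); // moves to "close"
phase = advancePhase(fakeCall, phase); // stays in "close", no prompt sent
console.log(phase, sentPrompts.length);
```

Keeping the prompts in one table means each transition is a single `setPrompt` call, and the conversation history carries over as described in the section on replacing the prompt.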
+## What's next
+- [Tools and Functions](/guides/tools-and-functions) — combine hot-reload with tool calling
+- [`Call` API reference](/api/call) — every method you can use mid-call

package/skills/pinecall-concepts/references/concepts/philosophy.md ADDED

@@ -0,0 +1,100 @@
+---
+title: "Philosophy"
+description: "Why Pinecall is code-first and what that means for your architecture."
+---
+# Philosophy
+Pinecall SDK is designed around one idea: **any existing app can add a voice agent without changing its architecture.**
+## Code-first, not platform-first
+Traditional voice AI platforms ask you to adapt your app to them — configure agents in a dashboard, expose webhooks, maintain JSON tool schemas separately from your code, send data to their servers.
+Pinecall flips this. The agent runs **inside your process**:
+```typescript
+import { Pinecall, tool } from "@pinecall/sdk";
+import { z } from "zod";
+import { db } from "./db.js";
+const pc = new Pinecall();
+const lookupOrder = tool({
+  name: "lookupOrder",
+  description: "Look up a customer's order by their phone number.",
+  schema: z.object({ phone: z.string() }),
+  execute: async ({ phone }) => await db.orders.findOne({ phone }),
+});
+export const agent = pc.agent("support", {
+  prompt: "You are a support agent for Acme Corp.",
+  llm: "openai/gpt-5-chat-latest",
+  voice: "elevenlabs/sarah",
+  stt: "deepgram/flux",
+  phoneNumber: "+15551234567",
+  greeting: "Hi, how can I help?",
+  tools: [lookupOrder],
+});
+```
+No webhooks. No separate tool server. No "upload your tools as JSON". Your tools are just functions with Zod schemas, auto-executed by the SDK.
+## Voice as a library, not a platform
+The SDK is a **dependency** — `npm install @pinecall/sdk`. It lives in your `package.json` alongside Express, Prisma, and everything else you already use.
+You don't migrate to Pinecall. You add it to your existing app. Your existing Express routes, your existing database connection, your existing auth — they all stay exactly where they are.
+## The server does the hard parts
+Your code handles business logic. The Pinecall voice server handles the things that are genuinely hard:
+| Your code | Voice server |
+|---|---|
+| Prompts and personality | Audio transport (WebRTC, Twilio, SIP) |
+| Tool functions | Speech-to-text (Deepgram, Gladia, AWS) |
+| Business logic | Text-to-speech (ElevenLabs, Cartesia) |
+| Database queries | Voice Activity Detection (VAD) |
+| Conversation history | Turn detection |
+| When to start/stop calls | Audio mixing and streaming |
+The split is clean: you own the **what** (what the agent says, what tools it has, what data it accesses), the server owns the **how** (how audio is captured, transcribed, synthesized, and played back).
+## One connection, many agents
+A single WebSocket connection multiplexes everything:
+![Agent tree — one connection, many agents](/assets/diagrams/agent-tree.png)
+No separate infrastructure per agent. No load balancer per channel. One `new Pinecall()`, as many agents as you need.
+## Config is code
+There is no dashboard to configure. Agent config lives in your source code, version-controlled, reviewable:
+```typescript
+import fs from "node:fs";
+const agent = pc.agent("mara", {
+  prompt: fs.readFileSync("./prompts/mara.md", "utf-8"),
+  llm: "openai/gpt-5-chat-latest",
+  voice: "elevenlabs/sarah",
+  language: "es",
+  stt: { provider: "deepgram-flux", keyterms: ["Acme", "checkout"] },
+  phoneNumber: "+13186330963",
+  sessionLimits: { idle_timeout_seconds: 30, idle_warning_seconds: 10 },
+});
+```
+Change anything — prompt, voice, model, channels — and the server picks it up on the next connection. No redeployment of a separate config layer. See [Hot Reload](/concepts/hot-reload).
+## Your data never leaves your process
+When the LLM calls a tool, Pinecall routes that call to your SDK handler. Your handler runs in your process — it can query your database, call your internal APIs, read files from disk. The tool result goes back to the LLM through the same WebSocket.
+No data is stored on Pinecall servers. No conversation history is persisted unless you persist it. No tool results are logged unless you log them.
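The routing described in this section, where a tool call arrives by name and the matching handler runs locally, can be sketched as below. This is not the SDK's dispatch code, which this document does not show: `LocalToolRegistry`, the JSON error shape, and the in-memory `orders` map are all hypothetical stand-ins.

```typescript
// Hypothetical registry; the SDK's real dispatch mechanism is not shown in
// this document. It only illustrates "the handler runs in your process".
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

class LocalToolRegistry {
  private readonly handlers = new Map<string, ToolHandler>();

  register(toolName: string, handler: ToolHandler): void {
    this.handlers.set(toolName, handler);
  }

  // Run the named handler locally and serialize its result for the LLM.
  async dispatch(toolName: string, args: Record<string, unknown>): Promise<string> {
    const handler = this.handlers.get(toolName);
    if (handler === undefined) {
      return JSON.stringify({ error: `unknown tool: ${toolName}` });
    }
    try {
      return JSON.stringify(await handler(args));
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return JSON.stringify({ error: message });
    }
  }
}

// In-memory stand-in for your database.
const orders = new Map<string, { id: string; status: string }>([
  ["+15551234567", { id: "A1", status: "shipped" }],
]);

const registry = new LocalToolRegistry();
registry.register("lookupOrder", async (args) => {
  const phone = String(args["phone"]);
  return orders.get(phone) ?? null;
});

registry
  .dispatch("lookupOrder", { phone: "+15551234567" })
  .then((result) => console.log(result));
```

The only thing that leaves the process in this sketch is the serialized result string; the lookup itself runs against local state.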
+## What's next
+- [Quickstart](/quickstart) — see the philosophy in action
+- [Agents and Channels](/concepts/agents-and-channels) — the core abstraction
+- [Deployment Topologies](/concepts/deployment-topologies) — how to run in production

package/skills/pinecall-concepts/references/concepts/server-vs-client-llm.md ADDED

@@ -0,0 +1,119 @@
+---
+title: "Server-side vs Client-side LLM"
+description: "The single most important architectural decision when building a Pinecall agent."
+---
+# Server-side vs Client-side LLM
+When you build a Pinecall agent, you choose where the LLM runs. This is the single most important architectural decision in the SDK.
+## The two modes
+### Server-side LLM (recommended)
+The Pinecall server runs the LLM. You give it a prompt, a model, and (optionally) tool definitions. The server handles STT, runs the LLM, generates TTS — you only handle tool calls.
+```typescript
+import { Pinecall, tool } from "@pinecall/sdk";
+import { z } from "zod";
+import { db } from "./db.js"; // your own database module
+const pc = new Pinecall();
+const lookupCustomer = tool({
+  name: "lookupCustomer",
+  description: "Look up a customer by phone",
+  schema: z.object({ phone: z.string() }),
+  execute: async ({ phone }) => await db.customers.findOne({ phone }),
+});
+const agent = pc.agent("receptionist", {
+  prompt: "You are a helpful receptionist. Be concise.",
+  llm: "openai/gpt-5-chat-latest",
+  voice: "elevenlabs/sarah",
+  stt: "deepgram/flux",
+  language: "en",
+  tools: [lookupCustomer],
+  greeting: "Hello, how can I help?",
+});
+```
+### Client-side LLM (bring your own)
+You run the LLM yourself. The server handles STT → text and text → TTS. You receive the user's text on `turn.end`, generate a response with whatever LLM you want, and stream it back.
+```typescript
+import OpenAI from "openai";
+import { Pinecall } from "@pinecall/sdk";
+const openai = new OpenAI();
+const pc = new Pinecall();
+const agent = pc.agent("my-bot", { voice: "cartesia/yumiko", language: "en" });
+agent.on("turn.end", async (turn, call) => {
+  const stream = call.replyStream(turn);
+  const completion = await openai.chat.completions.create({
+    model: "gpt-5-chat-latest", // the OpenAI SDK expects `model` with a bare model name
+    messages: [
+      { role: "system", content: "You are helpful. Be concise." },
+      { role: "user", content: turn.text },
+    ],
+    stream: true,
+  });
+  for await (const chunk of completion) {
+    if (stream.aborted) break;
+    const token = chunk.choices[0]?.delta?.content;
+    if (token) stream.write(token);
+  }
+  stream.end();
+});
+```
+## Which one to choose
+| | Server-side | Client-side |
+|---|---|---|
+| LLM choice | OpenAI, Mistral, Google, Anthropic | Any provider, any model, local |
+| You handle conversation history | ❌ Server does it | ✅ You do it |
+| You see tool calls | ✅ Via `llm.toolCall` | ✅ You define them |
+| Easier to ship | ✅ Yes | Slightly more code |
+| Works with WhatsApp | ✅ Yes | ❌ No (WhatsApp requires server-side) |
+| Latency | Slightly lower (LLM runs near the audio pipeline) | Depends on your provider |
+| Cost | Pinecall passes through provider cost | You pay your provider directly |
+**Pick server-side if**: you're using OpenAI, Mistral, Google, or Anthropic, you want the simplest possible code, or you need WhatsApp.
+**Pick client-side if**: you need a specific LLM Pinecall doesn't host (local Ollama, a fine-tuned model), you have an existing LangChain/LlamaIndex pipeline, or you need full control over the prompt-building logic.
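In client-side mode the table above says you own the conversation history. A minimal sketch of what that bookkeeping might look like follows. `CallHistory` is a hypothetical helper, not part of the SDK, and keying by a call identifier assumes your call object exposes one (another example in this package reads `call.id`).

```typescript
// Hypothetical helper for client-side mode; not part of the SDK.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

class CallHistory {
  private readonly turns = new Map<string, ChatMessage[]>();
  private readonly systemPrompt: string;

  constructor(systemPrompt: string) {
    this.systemPrompt = systemPrompt;
  }

  // Record the user's turn and return the full message list for the LLM.
  addUserTurn(callId: string, text: string): ChatMessage[] {
    const past = this.turns.get(callId) ?? [];
    past.push({ role: "user", content: text });
    this.turns.set(callId, past);
    return [{ role: "system", content: this.systemPrompt }, ...past];
  }

  // Record what the assistant said so the next turn has context.
  addAssistantTurn(callId: string, text: string): void {
    const past = this.turns.get(callId) ?? [];
    past.push({ role: "assistant", content: text });
    this.turns.set(callId, past);
  }

  // Free memory when the call ends.
  end(callId: string): void {
    this.turns.delete(callId);
  }
}

const store = new CallHistory("You are helpful. Be concise.");
store.addUserTurn("call-1", "Where is my order?");
store.addAssistantTurn("call-1", "It shipped today.");
const messages = store.addUserTurn("call-1", "When will it arrive?");
console.log(messages.map((m) => m.role).join(","));
```

In a `turn.end` handler you would pass the returned array as `messages` to your LLM client instead of rebuilding a two-message array on every turn, then record the streamed reply once it finishes.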
+## You can mix them
+A single `Pinecall` instance can host multiple agents, each with a different LLM strategy:
+```typescript
+// Server-side agent for WhatsApp + phone
+const support = pc.agent("support", {
+  llm: "openai/gpt-5-chat-latest",
+  stt: "deepgram/flux",
+  prompt: "...",
+  phoneNumber: "+13186330963",
+  whatsapp: [{ phoneNumberId: "123", accessToken: "EAA..." }],
+});
+// Client-side agent using a local Ollama model for a specialized use case
+const research = pc.agent("research", { voice: "elevenlabs/george", language: "en" });
+research.on("turn.end", async (turn, call) => {
+  /* call your own LLM (Ollama, fine-tuned model, ...), stream back */
+});
+```
+## What about hybrid?
+What if you want to use the server-side LLM but inject context or modify history mid-call? You can:
+- **Inject context dynamically** — `call.addContext("Recent order: #12345 shipped today")`
+- **Replace the prompt mid-call** — `call.setPrompt("Now you're in escalation mode.")`
+- **Set template variables** — define `{{customer_name}}` in the prompt, fill it per-call
+- **Modify history** — `call.addHistory([...])`, `call.setHistory([...])`, `call.clearHistory()`
+See [Hot-Reload](/concepts/hot-reload) for the full set of mid-call controls.
+## What's next
+- [Hot-reload everything](/concepts/hot-reload)
+- [Tool calling guide](/guides/tools-and-functions)
+- [Events reference](/reference/events) — see all the events you can hook into

package/skills/pinecall-examples/SKILL.md ADDED

@@ -0,0 +1,59 @@
+---
+name: pinecall-examples
+description: >-
+  Copy-paste recipes — full working agents for common scenarios. Use when the user is building, configuring, or debugging with @pinecall/sdk. Keywords: example, recipe, sample, outbound dispatch, chat bot, browser widget, multi-channel, headless.
+license: MIT
+---
+# Examples
+Copy-paste recipes — full working agents for common scenarios.
+This skill bundles the official Pinecall documentation for **Examples**. The
+table below indexes every page; open the `references/…` file for the full text
+(loaded on demand). Source of truth: <https://docs.pinecall.io>.
+| Page | What it covers | Open |
+|------|----------------|------|
+| **Examples** | Runnable examples showing Pinecall SDK features in action. | [`references/examples/index.md`](references/examples/index.md) · [docs](https://docs.pinecall.io/examples/index) |
+| **Outbound Dispatch** | CSV-driven outbound campaign with rate limiting, dedup, and result writeback. | [`references/examples/outbound-dispatch.md`](references/examples/outbound-dispatch.md) · [docs](https://docs.pinecall.io/examples/outbound-dispatch) |
+| **Example: Turn Detection** | Debug turn events in real-time — per-turn containers showing the full state machine lifecycle. | [`references/examples/turn-detection.md`](references/examples/turn-detection.md) · [docs](https://docs.pinecall.io/examples/turn-detection) |
+| **Example: Headless Agent** | Complete runnable example — a phone support agent with zero web server. | [`references/examples/headless-agent.md`](references/examples/headless-agent.md) · [docs](https://docs.pinecall.io/examples/headless-agent) |
+| **Example: Multi-Channel Bot** | One agent serving phone, WhatsApp, and browser WebRTC simultaneously. | [`references/examples/multi-channel-bot.md`](references/examples/multi-channel-bot.md) · [docs](https://docs.pinecall.io/examples/multi-channel-bot) |
+| **Example: Chat Bot** | Text chat agent using @pinecall/web/chat — same agent, text instead of voice. | [`references/examples/chat-bot.md`](references/examples/chat-bot.md) · [docs](https://docs.pinecall.io/examples/chat-bot) |
+| **Example: Browser Widget** | Express backend + React frontend with VoiceWidget. Click the orb, talk. | [`references/examples/browser-widget.md`](references/examples/browser-widget.md) · [docs](https://docs.pinecall.io/examples/browser-widget) |
+## Canonical agent
+```typescript
+import { Pinecall } from "@pinecall/sdk";
+const pc = new Pinecall(); // reads PINECALL_API_KEY, auto-connects
+const agent = pc.agent("mara", {
+  prompt: "You are Mara, a friendly voice assistant. Be concise.",
+  llm: "openai/gpt-5-chat-latest",
+  voice: "elevenlabs/sarah",
+  stt: "deepgram/flux",
+  language: "en",
+  greeting: "Hello! How can I help?",
+});
+```
+## House rules — always apply
+- **Example defaults** (use these exact strings unless the user asks otherwise):
+  `stt: "deepgram/flux"`, `llm: "openai/gpt-5-chat-latest"`, `voice: "elevenlabs/sarah"`.
+  **NEVER use `deepgram/nova-2`** — it is not supported. Use `deepgram/nova-3`
+  only for languages Flux doesn't support (e.g. Arabic).
+- **Turn detection & VAD are auto-derived from the STT provider — never set
+  `turnDetection` or `vad` manually.** Flux → native turns + native VAD;
+  every other STT → `smart_turn` + `silero`.
+- **Greeting**: inbound → `greeting` field in `pc.agent()`; outbound → `greeting`
+  field in `agent.dial()`. It is sugar for `call.say()` in `call.started`.
+- **Auth**: `new Pinecall()` reads `PINECALL_API_KEY` from env and auto-connects.
+- Full documentation: <https://docs.pinecall.io>
+---
+*Generated from `sdk/docs/` by `@pinecall/skills` — do not edit by hand; edit the
+docs and re-run `node build.mjs`.*

package/skills/pinecall-examples/references/examples/browser-widget.md ADDED

@@ -0,0 +1,206 @@
+---
+title: "Example: Browser Widget"
+description: "Express backend + React frontend with VoiceWidget. Click the orb, talk."
+---
+# Example: Browser Widget
+An Express backend + React frontend. Users click the orb, talk to your agent — no phone number needed.
+## Install
+```bash
+npm install @pinecall/sdk @pinecall/web express
+```
+## Backend — `server.js`
+```typescript
+import express from "express";
+import { Pinecall } from "@pinecall/sdk";
+const app = express();
+const pc = new Pinecall();
+const mara = pc.agent("mara", {
+  prompt: `You are Mara, a friendly voice assistant.
+Be brief — 1-2 sentences per response.`,
+  llm: "openai/gpt-5-chat-latest",
+  voice: "elevenlabs/sarah",
+  stt: "deepgram/flux",
+  language: "en",
+  allowedOrigins: ["http://localhost:*"],
+  greeting: "Hi! I'm Mara. How can I help?",
+});
+mara.on("call.ended", (call, reason) => {
+  console.log(`Call ended: ${call.id} — ${reason} (${call.duration}s)`);
+});
+// Token endpoint — add your own auth in production
+app.get("/api/token", async (req, res) => {
+  const token = await mara.createToken("webrtc");
+  res.json(token);
+});
+// SSE event stream
+app.get("/events", (req, res) => mara.stream(res));
+app.listen(3000, () => console.log("http://localhost:3000"));
+```
+## Frontend — React
+```tsx
+import { VoiceWidget } from "@pinecall/web";
+function App() {
+  return (
+    <div>
+      <h1>Talk to Mara</h1>
+      <VoiceWidget
+        agent="mara"
+        tokenProvider={async () => {
+          const res = await fetch("/api/token");
+          return res.json();
+        }}
+      />
+    </div>
+  );
+}
+```
+That's it. The `VoiceWidget` renders the orb, handles mic permissions, WebRTC connection, and audio streaming.
+## With `allowedOrigins` (simpler)
+For demos, skip the token endpoint entirely. The `allowedOrigins` config lets the widget auto-fetch tokens:
+```tsx
+// No tokenProvider needed — widget auto-fetches via allowedOrigins
+<VoiceWidget agent="mara" />
+```
+This works because `allowedOrigins: ["http://localhost:*"]` in the backend allows token requests from matching browser origins. For production, use the `tokenProvider` pattern with real auth.
+## Rendering tools in the UI
+The `VoiceWidget` supports **interactive tool UI** — the agent calls tools on the backend, and the results appear as clickable components in the browser.
+### Backend — add a tool
+```typescript
+import { tool } from "@pinecall/sdk";
+import { z } from "zod";
+const getSlots = tool({
+  name: "getSlots",
+  description: "Get available time slots for a date.",
+  schema: z.object({ date: z.string() }),
+  execute: async ({ date }) => ({
+    slots: ["10:00", "11:30", "14:00", "16:00"],
+  }),
+});
+const mara = pc.agent("mara", {
+  // ...config from above...
+  tools: [getSlots],
+});
+```
+### Frontend — render the tool result
+Pass `trackedTools` to tell the widget which results to capture. Use `useVoice()` inside a child component to render them:
+```tsx
+import { VoiceWidget, useVoice } from "@pinecall/web";
+function SlotPicker() {
+  const { toolCalls, sendText, dismissTool } = useVoice();
+  const slots = toolCalls.find(tc => tc.name === "getSlots" && tc.result);
+  if (!slots) return null;
+  return (
+    <div className="slot-picker">
+      <h3>Pick a time</h3>
+      {slots.result.slots.map((slot) => (
+        <button
+          key={slot}
+          onClick={() => {
+            sendText(`I'll take the ${slot} slot`);
+            dismissTool(slots.toolCallId);
+          }}
+        >
+          {slot}
+        </button>
+      ))}
+    </div>
+  );
+}
+function App() {
+  return (
+    <VoiceWidget
+      agent="mara"
+      trackedTools={["getSlots"]}
+      tokenProvider={async () => {
+        const res = await fetch("/api/token");
+        return res.json();
+      }}
+    >
+      <SlotPicker />
+    </VoiceWidget>
+  );
+}
+```
+### API reference
+| API | What it does |
+|-----|-------------|
+| `trackedTools={["getSlots"]}` | Captures results for these tool names |
+| `useVoice()` | Hook — returns `toolCalls`, `sendText`, `dismissTool`, `setContext` |
+| `toolCalls` | Array of `{ name, toolCallId, result }` — live tool state |
+| `sendText(text)` | Injects text as if the user spoke it (click → voice) |
+| `dismissTool(id)` | Removes a tool from state after interaction |
+| `setContext(key, value)` | Injects context into the LLM prompt in real time |
+### Context injection
+Sync UI state back to the agent's prompt so it knows what the user sees:
+```tsx
+const { setContext } = useVoice();
+useEffect(() => {
+  setContext("form_state", `Name: ${name}, Email: ${email}`);
+  return () => setContext("form_state", null);
+}, [name, email]);
+```
+The server appends this as a `## UI Context` section in the system prompt.
+> For a full working example with slot picker, contact form with auto-fill, and confirmation card, see the [`booking-tools` example](https://github.com/pinecall/sdk/tree/master/examples/booking-tools).
+## Run it
+```bash
+PINECALL_API_KEY=pk_... node server.js
+```
+Open `http://localhost:3000`. Click the orb. Talk.
+## Production checklist
+- [ ] **Auth on `/api/token`** — add session/JWT check, never expose without auth
+- [ ] **Rate limit** — cap tokens per user per hour
+- [ ] **Remove `allowedOrigins`** — use `tokenProvider` with your auth instead
+- [ ] **Mic permission UX** — explain why you need mic access before the click
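The rate-limit item in the checklist above could be approached with a fixed-window counter like the one below. This is a sketch under stated assumptions: `TokenRateLimiter` is a hypothetical class, the state is in-memory (so it resets on restart and is not shared across processes), and the user identifier is assumed to come from your own auth layer.

```typescript
// Hypothetical in-memory limiter; not part of the SDK or of Express.
class TokenRateLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number = 60 * 60 * 1000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if this user may receive another token right now.
  // `now` is injectable so the behaviour can be checked without waiting.
  allow(userId: string, now: number = Date.now()): boolean {
    const current = this.windows.get(userId);
    if (current === undefined || now - current.start >= this.windowMs) {
      // First request, or the previous window has expired: start a new one.
      this.windows.set(userId, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}

const limiter = new TokenRateLimiter(2); // 2 tokens per user per hour
const first = limiter.allow("user-1", 0);
const second = limiter.allow("user-1", 1_000);
const third = limiter.allow("user-1", 2_000);
const afterWindow = limiter.allow("user-1", 60 * 60 * 1000);
console.log(first, second, third, afterWindow);
```

In the `/api/token` handler, the check would run after authentication and before `mara.createToken("webrtc")`, returning HTTP 429 when `allow` is false. A multi-instance deployment would need shared storage in place of the `Map`.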
+## What's next
+- [Security](/security) — production token auth
+- [Tools API](/web/widget/tools-api) — full interactive tool UI reference
+- [Headless agent example](/examples/headless-agent) — backend-only agents