npm - @pinecall/skills - Versions diffs - 0.1.0 - Mend

@pinecall/skills 0.1.0


Files changed (68)

package/skills/pinecall-guides/references/guides/human-takeover.md ADDED

@@ -0,0 +1,184 @@
+---
+title: "Human Takeover"
+description: "Pause the AI agent so a human can intervene in real-time conversations."
+---
+# Human Takeover
+The human-in-the-loop system lets a human operator take over a conversation from the AI agent. While paused, messages still flow to the SDK — the LLM just doesn't respond. The human sends messages through the SDK, and the AI resumes with full context when done.
+## How it works
+```
+AI_ACTIVE ──(pause)──▶ HUMAN_ACTIVE ──(resume)──▶ AI_ACTIVE
+```
+When paused:
+- Incoming messages are forwarded to the SDK (with `paused: true`)
+- The LLM **does not generate responses** — no auto-reply
+- Voice notes are still transcribed (so the human can read them)
+- Human messages are added to LLM history for seamless context on resume
+## Pause granularity
+Three levels, all through the same API:
+| Method | Scope | Use case |
+|--------|-------|----------|
+| `agent.pause(sessionId)` | One conversation | "I'll handle this customer" |
+| `agent.pause({ contact })` | All sessions with a contact | "This person needs human attention" |
+| `agent.pause()` | Entire agent | "Turn off the AI completely" |
+Resume follows the same pattern. Global `agent.resume()` clears all levels.
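As a rough illustration of how the three scopes interact, the sketch below models the pause state locally. `PauseRegistry` and its methods are hypothetical names, not part of `@pinecall/sdk`, and the rule that a message counts as paused when any matching level is paused is an assumption. Only the "global resume clears all levels" behavior comes from the guide.

```typescript
// Hypothetical local model of the three pause levels; not the SDK's implementation.
type PauseTarget = string | { contact: string };

class PauseRegistry {
  private globalPause = false;
  private pausedSessions = new Set<string>();
  private pausedContacts = new Set<string>();

  pause(target?: PauseTarget): void {
    if (target === undefined) this.globalPause = true;
    else if (typeof target === "string") this.pausedSessions.add(target);
    else this.pausedContacts.add(target.contact);
  }

  resume(target?: PauseTarget): void {
    if (target === undefined) {
      // Global resume clears every level, as stated in the guide.
      this.globalPause = false;
      this.pausedSessions.clear();
      this.pausedContacts.clear();
    } else if (typeof target === "string") {
      this.pausedSessions.delete(target);
    } else {
      this.pausedContacts.delete(target.contact);
    }
  }

  // Assumption: a message is held for a human if ANY level covering it is paused.
  isPaused(sessionId: string, contact: string): boolean {
    return (
      this.globalPause ||
      this.pausedSessions.has(sessionId) ||
      this.pausedContacts.has(contact)
    );
  }
}
```

In this model, pausing a contact also covers sessions that start later with the same contact, which is one plausible reading of "all sessions with a contact".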
+## Full example: WhatsApp customer support
+```typescript
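+import express from "express";
+import { Pinecall } from "@pinecall/sdk";
+// Express app that backs the dashboard routes further down; express.json() populates req.body
+const app = express();
+app.use(express.json());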
+const pc = new Pinecall({ apiKey: process.env.PINECALL_API_KEY! });
+const support = pc.agent("support", {
+  language: "en",
+  llm: "openai/gpt-5-chat-latest",
+  prompt: "You are a helpful support agent.",
+});
+support.addWhatsapp({
+  phoneNumberId: process.env.WA_PHONE_ID!,
+  accessToken: process.env.WA_TOKEN!,
+});
+// Track active sessions for the dashboard
+const sessions = new Map<string, { contact: string; name: string }>();
+support.on("whatsapp.sessionStarted", (event) => {
+  sessions.set(event.sessionId as string, {
+    contact: event.contactPhone as string,
+    name: event.contactName as string,
+  });
+});
+support.on("whatsapp.message", (event) => {
+  const sessionId = event.sessionId as string;
+  const paused = event.paused as boolean;
+  if (paused) {
+    // AI is paused — route to human dashboard
+    console.log(`[PAUSED] ${event.name}: ${event.text}`);
+    // notifyHumanDashboard is your own function (not part of the SDK)
+    notifyHumanDashboard(sessionId, event);
+    return;
+  }
+  // Normal: AI handles automatically
+  console.log(`[AI] ${event.name}: ${event.text}`);
+});
+// ── Dashboard API (e.g. Express routes) ──
+// Human takes over a session
+app.post("/api/takeover/:sessionId", (req, res) => {
+  support.pause(req.params.sessionId);
+  res.json({ ok: true });
+});
+// Human sends a message
+app.post("/api/send/:sessionId", (req, res) => {
+  support.sendMessage({
+    sessionId: req.params.sessionId,
+    text: req.body.text,
+  });
+  res.json({ ok: true });
+});
+// Human hands back to AI
+app.post("/api/handback/:sessionId", (req, res) => {
+  support.resume(req.params.sessionId);
+  res.json({ ok: true });
+});
+```
+## Events
+| Event | When | Data |
+|-------|------|------|
+| `session.paused` | After `agent.pause()` | `{ sessionId?, contact? }` |
+| `session.resumed` | After `agent.resume()` | `{ sessionId?, contact? }` |
+| `whatsapp.message` | Message received (always) | `{ paused: true }` when paused |
+| `whatsapp.response` | Response sent | `{ source: "human" }` when human |
+```typescript
+support.on("session.paused", (event) => {
+  console.log(`⏸ Paused: session=${event.sessionId}`);
+});
+support.on("session.resumed", (event) => {
+  console.log(`▶ Resumed: session=${event.sessionId}`);
+});
+```
+## Protocol messages
+These are the wire messages exchanged between SDK and server. You don't need to use these directly — the SDK methods handle them.
+### `session.pause` (SDK → Server)
+```json
+{
+  "event": "session.pause",
+  "agent_id": "support",
+  "session_id": "wa-abc123"
+}
+```
+Omit `session_id` and send `contact` for contact-level pause. Omit both for global.
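The scoping rule above can be sketched as a small builder. The field names `event`, `agent_id`, `session_id`, and `contact` come from the JSON in this guide. The helper `buildPauseMessage` and the `PauseScope` type are hypothetical and only illustrate which fields are present at each level.

```typescript
// Wire shape taken from the JSON examples in this guide; the helper is hypothetical.
interface SessionPauseMessage {
  event: "session.pause";
  agent_id: string;
  session_id?: string;
  contact?: string;
}

interface PauseScope {
  sessionId?: string;
  contact?: string;
}

function buildPauseMessage(agentId: string, scope: PauseScope = {}): SessionPauseMessage {
  const message: SessionPauseMessage = { event: "session.pause", agent_id: agentId };
  if (scope.sessionId !== undefined) {
    message.session_id = scope.sessionId; // one conversation
  } else if (scope.contact !== undefined) {
    message.contact = scope.contact; // all sessions with a contact
  }
  // Neither field set: global pause for the whole agent.
  return message;
}
```

For example, `buildPauseMessage("support")` yields a message with only `event` and `agent_id`, which corresponds to the global form.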
+### `session.resume` (SDK → Server)
+```json
+{
+  "event": "session.resume",
+  "agent_id": "support",
+  "session_id": "wa-abc123"
+}
+```
+### `session.send` (SDK → Server)
+```json
+{
+  "event": "session.send",
+  "agent_id": "support",
+  "session_id": "wa-abc123",
+  "text": "I'm a human agent. Let me help."
+}
+```
+### Confirmations (Server → SDK)
+```json
+{ "event": "session.paused", "agent_id": "support", "session_id": "wa-abc123" }
+{ "event": "session.resumed", "agent_id": "support", "session_id": "wa-abc123" }
+{ "event": "session.sent", "agent_id": "support", "session_id": "wa-abc123" }
+```
+## Context preservation
+Human messages are recorded in the LLM conversation history as `assistant` messages. When the AI resumes, it has full context of what the human said. The conversation flows naturally without the customer noticing the handover.
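A minimal sketch of what that history might look like after a handover. `HistoryTurn` and `recordHumanReply` are hypothetical names, and the `source` tag on history entries is an assumption borrowed from the `source: "human"` field on `whatsapp.response`. The guide only states that human messages are stored with the `assistant` role.

```typescript
// Hypothetical history model; the SDK's internal types are not shown in this guide.
interface HistoryTurn {
  role: "user" | "assistant";
  content: string;
  source?: "ai" | "human"; // assumed tag, mirroring `source: "human"` on whatsapp.response
}

// Human replies are stored with the same role the AI uses, so the model
// later reads them as its own earlier turns.
function recordHumanReply(turns: HistoryTurn[], text: string): HistoryTurn[] {
  return [...turns, { role: "assistant", content: text, source: "human" }];
}

const beforeHandover: HistoryTurn[] = [
  { role: "user", content: "My order never arrived." },
];
const afterHandover = recordHumanReply(beforeHandover, "I'm a human agent. Let me help.");
```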
+## Channel support
+| Channel | Pause/Resume | Send as Human | Status |
+|---------|-------------|---------------|--------|
+| WhatsApp | ✅ | ✅ | Available now |
+| Voice | ✅ (planned) | via `inject_text` | Roadmap |
+| Chat | ✅ (planned) | ✅ (planned) | Roadmap |
+The pause state data model already supports voice call IDs and chat session IDs — the routing just needs to be wired in `LLMHandler.on_user_message()`.
+## What's next
+- [WhatsApp Dashboard example](/examples/whatsapp-dashboard) — runnable example with React UI
+- [WhatsApp guide](/guides/whatsapp) — set up the WhatsApp channel
+- [Events reference](/reference/events) — all event data shapes
+- [Agent API](/api/agent) — `pause()`, `resume()`, `sendMessage()` reference

package/skills/pinecall-guides/references/guides/inbound-voice.md ADDED

@@ -0,0 +1,201 @@
+---
+title: "Inbound Voice"
+description: "Build a voice agent that answers phone calls."
+---
+# Inbound Voice
+This guide walks through building a phone agent end-to-end: registering a phone number, greeting callers, handling tool calls, and ending the conversation gracefully.
+## Prerequisites
+- A Pinecall API key
+- A phone number on your Pinecall account (purchase one or port one — see [REST API → fetchPhones](/reference/rest-api))
+- Node.js ≥ 18
+## The minimum viable phone agent
+```typescript
+import { Pinecall } from "@pinecall/sdk";
+const pc = new Pinecall({ apiKey: process.env.PINECALL_API_KEY! });
+const receptionist = pc.agent("receptionist", {
+  prompt: "You are the receptionist for Acme Corp. Be brief and warm.",
+  llm: "openai/gpt-5-chat-latest",
+  voice: "elevenlabs/sarah",
+  stt: "deepgram/flux",
+  language: "en",
+  phoneNumber: "+13186330963",
+});
+receptionist.on("call.started", (call) => {
+  if (call.direction === "inbound") {
+    call.say("Thanks for calling Acme. How can I help?");
+  }
+});
+receptionist.on("call.ended", (call, reason) => {
+  console.log(`[${call.id}] ${reason} (${call.duration}s)`);
+});
+```
+That's a working phone agent. The server handles audio transport, STT, the LLM, TTS, and turn detection.
+## Greeting
+There are two ways to greet inbound callers:
+### Option 1: `greeting` in `agent()` (declarative)
+If you use `pc.agent()`, the `greeting` field handles everything — no event handler needed:
+```typescript
+const agent = pc.agent("receptionist", {
+  voice: "elevenlabs/sarah",
+  llm: "openai/gpt-5-chat-latest",
+  stt: "deepgram/flux",
+  prompt: "You are a receptionist for Acme Corp.",
+  phoneNumber: "+13186330963",
+  // Static
+  greeting: "Thanks for calling Acme. How can I help?",
+});
+```
+The greeting is added to LLM history by default, so the model knows what was said. You can disable that:
+```typescript
+greeting: { text: "Welcome to Acme.", addToHistory: false }
+```
+Or make it dynamic per-call:
+```typescript
+greeting: async (call) => {
+  const customer = await db.findByPhone(call.from);
+  return customer ? `Hi ${customer.name}!` : "Hi! How can I help?";
+}
+```
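The three `greeting` forms shown above (string, object, function) could be normalized along these lines. This is a sketch under assumptions: `resolveGreeting`, `CallInfo`, and `ResolvedGreeting` are hypothetical names, and applying the `addToHistory: true` default to the function form is a guess based on the "by default" wording.

```typescript
// Hypothetical normalizer for the three greeting forms; not the SDK's code.
interface CallInfo {
  from: string;
}

type GreetingOption =
  | string
  | { text: string; addToHistory?: boolean }
  | ((call: CallInfo) => string | Promise<string>);

interface ResolvedGreeting {
  text: string;
  addToHistory: boolean;
}

async function resolveGreeting(
  option: GreetingOption,
  call: CallInfo,
): Promise<ResolvedGreeting> {
  if (typeof option === "function") {
    // Dynamic per-call greeting; assumed to be added to history like the static form.
    return { text: await option(call), addToHistory: true };
  }
  if (typeof option === "string") {
    return { text: option, addToHistory: true };
  }
  // Object form: addToHistory defaults to true unless explicitly disabled.
  return { text: option.text, addToHistory: option.addToHistory ?? true };
}
```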
+### Option 2: `call.say()` in `call.started` (programmatic)
+If you leave `greeting` unset, handle the greeting yourself in a `call.started` handler:
+```typescript
+agent.on("call.started", (call) => {
+  call.say("Hello! How can I help you today?");
+});
+```
+Use this when you need logic beyond what `greeting` supports — multiple says, conditional behavior, loading data before speaking, etc.
+> **Outbound calls** use a different mechanism: pass `greeting` in `agent.dial()`. The server speaks it as soon as the callee picks up. See [Outbound Calls](/guides/outbound-calls).
+## Adding tools
+Define tools with `tool()` and Zod schemas. The SDK auto-executes them when the LLM calls them:
+```typescript
+import { Pinecall, tool } from "@pinecall/sdk";
+import { z } from "zod";
+const lookupOrder = tool({
+  name: "lookupOrder",
+  description: "Look up an order by ID",
+  schema: z.object({ orderId: z.string() }),
+  execute: async ({ orderId }) => {
+    const order = await db.orders.findOne(orderId);
+    return order ?? { error: "not_found" };
+  },
+});
+const transferToHuman = tool({
+  name: "transferToHuman",
+  description: "Escalate to a human specialist.",
+  schema: z.object({}),
+  execute: async (_, call) => {
+    call.say("One moment, connecting you to a specialist.");
+    call.forward("+15558675309");
+    return { transferred: true };
+  },
+});
+const endCall = tool({
+  name: "endCall",
+  description: "End the call when the customer says goodbye.",
+  schema: z.object({}),
+  execute: async (_, call) => {
+    call.say("Have a great day. Goodbye!");
+    call.once("bot.finished", () => call.hangup());
+    return { ended: true };
+  },
+});
+const agent = pc.agent("receptionist", {
+  prompt: "You are a receptionist. Look up orders when asked.",
+  llm: "openai/gpt-5-chat-latest",
+  voice: "elevenlabs/sarah",
+  stt: "deepgram/flux",
+  language: "en",
+  phoneNumber: "+13186330963",
+  tools: [lookupOrder, transferToHuman, endCall],
+});
+agent.on("call.started", (call) => call.say("Thanks for calling. How can I help?"));
+```
+See [Tools and Functions](/guides/tools-and-functions) for the full pattern.
+## Automatic call endings
+A call ends automatically in three cases, each emitting `call.ended` with a distinct reason:
+- The user hangs up: reason `hangup`
+- `max_duration_seconds` elapses (default: 10 minutes): reason `max_duration`
+- `idle_timeout_seconds` of silence elapses (default: 60s): reason `idle_timeout`
+See [Session Limits](/reference/session-limits) for tuning these.
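One way to act on these reasons inside a `call.ended` handler is a small lookup like the one below. The three reason strings come from the list above. `describeEndReason` is a hypothetical helper, and the fallback branch assumes other reasons may exist that this guide does not list.

```typescript
// Maps the documented automatic end reasons to a log-friendly label.
// Hypothetical helper; the fallback covers reasons not listed in this guide.
function describeEndReason(reason: string): string {
  switch (reason) {
    case "hangup":
      return "caller hung up";
    case "max_duration":
      return "reached max_duration_seconds";
    case "idle_timeout":
      return "silent for idle_timeout_seconds";
    default:
      return `other (${reason})`;
  }
}
```

It could be called as `describeEndReason(reason)` from the `call.ended` handler shown earlier in this guide.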
+## Listening for live transcripts
+Use `bot.word` and `user.message` events to build a live transcript UI or log the conversation as it happens:
+```typescript
+agent.on("user.message", (event, call) => {
+  console.log(`[${call.id}] User: ${event.text}`);
+});
+let currentBotMessage = "";
+agent.on("bot.speaking", () => { currentBotMessage = ""; });
+agent.on("bot.word", (event, call) => {
+  currentBotMessage += event.word + " ";
+  process.stdout.write(`\r[${call.id}] Bot: ${currentBotMessage}`);
+});
+agent.on("bot.finished", () => console.log());
+```
+## After the call ends
+When `call.ended` fires, the `Call` object is fully populated:
+```typescript
+agent.on("call.ended", async (call, reason) => {
+  await db.calls.create({
+    id: call.id,
+    from: call.from,
+    to: call.to,
+    duration: call.duration,
+    reason,
+    transcript: call.transcript,
+    messages: call.messages, // full LLM history including tool calls
+    startedAt: call.startedAt,
+    endedAt: call.endedAt,
+  });
+});
+```
+## What's next
+- [Outbound calls](/guides/outbound-calls) — make programmatic outbound calls
+- [Tools and Functions](/guides/tools-and-functions) — let the agent take actions
+- [Dev mode](/guides/dev-mode) — share one number between prod and any number of devs
+- [`Call` API reference](/api/call) — every method

package/skills/pinecall-guides/references/guides/knowledge-bases.md ADDED

@@ -0,0 +1,166 @@
+---
+title: Knowledge bases (RAG)
+description: Tutorial — ground a voice or chat agent on your own documents with retrieval-augmented generation.
+---
+# Knowledge bases (RAG)
+A **knowledge base** is a set of documents your agent answers from. You upload the
+documents once, attach the knowledge base to an agent, and on every turn the
+server retrieves the most relevant chunks for what the user said and injects them
+into the prompt — no fine-tuning, no vector database to run yourself.
+It works the same for **voice** and **chat**.
+This tutorial builds a support agent grounded in your help docs, end to end.
+> **Paid feature.** Knowledge bases require a paid plan (**Starter** or higher). On
+> a free trial, creating or using a knowledge base is blocked — both the dashboard
+> and the CLI will prompt you to upgrade. Everything else below assumes a paid org.
+---
+## Step 1 — Create a knowledge base
+You can do this in the dashboard or from the CLI. Either way you get a **knowledge
+base id** (e.g. `kb_1a2b3c`) — you'll attach that to your agent.
+### Option A — Dashboard
+1. Open [platform.pinecall.io](https://platform.pinecall.io) → **Knowledge**.
+2. Click **New knowledge base**, give it a name (e.g. "Help docs"), and create it.
+3. The knowledge base page shows its **id** (copyable) — keep it for Step 3.
+### Option B — CLI
+```bash
+pinecall knowledge create "Help docs"
+#   ✓ Created knowledge base Help docs
+#     id: kb_1a2b3c
+```
+---
+## Step 2 — Add your documents
+Upload Markdown or text files (`.md`, `.markdown`, `.txt`). Each upload re-trains
+the index automatically.
+### Dashboard
+On the knowledge base page, drag files into the uploader (or paste text). You'll
+see each document listed; click one to read it.
+### CLI
+```bash
+# Upload local files — paths are kept, so re-pushing updates in place (idempotent)
+pinecall knowledge push kb_1a2b3c ./help/*.md
+# List what's in the knowledge base
+pinecall knowledge docs kb_1a2b3c
+# Check what the agent will retrieve for a question — retrieval only, no LLM
+pinecall knowledge query kb_1a2b3c "how do I reset my password"
+```
+See the [CLI reference](/reference/cli) for every `pinecall knowledge` command.
+---
+## Step 3 — Build the agent
+Pass the knowledge base id as `knowledgeBase`. Use the **`{{RAG_CONTEXT}}`** prompt
+variable to control exactly where the retrieved documents are placed:
+```ts
+import { Pinecall } from "@pinecall/sdk";
+const pc = new Pinecall();
+const agent = pc.agent("support", {
+  voice: "elevenlabs/sarah",
+  llm: "anthropic/claude-haiku-4-5",
+  language: "en",
+  // Attach the knowledge base from Step 1
+  knowledgeBase: "kb_1a2b3c",
+  prompt: `You are a friendly support agent for Acme.
+Answer the customer using ONLY the help documentation below. If the answer
+isn't there, say you're not sure and offer to create a ticket — never guess.
+{{RAG_CONTEXT}}
+Keep replies short and conversational.`,
+  greeting: "Hi! You've reached Acme support — how can I help?",
+  phoneNumber: "+14155551234", // omit for chat-only
+});
+```
+That's the whole integration. Before each LLM turn, the server:
+1. takes the user's latest message,
+2. retrieves the top matching chunks from `kb_1a2b3c`,
+3. substitutes them into `{{RAG_CONTEXT}}` (or appends them if you omit the
+   variable), and
+4. runs the LLM with that grounded prompt.
+### Where the context goes — `{{RAG_CONTEXT}}`
+- **Prompt contains `{{RAG_CONTEXT}}`** → retrieved docs are inserted exactly there.
+- **Prompt omits `{{RAG_CONTEXT}}`** → retrieved docs are appended automatically, so
+  a knowledge base works out of the box.
+- **Nothing relevant / no knowledge base** → `{{RAG_CONTEXT}}` resolves to empty and
+  the agent behaves like a normal agent.
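The three placement rules above can be sketched as a pure function. This runs on the server in practice, so the code is only an illustration. `applyRagContext` is a hypothetical name, and the blank-line separators between chunks and before appended context are assumptions.

```typescript
const RAG_PLACEHOLDER = "{{RAG_CONTEXT}}";

// Hypothetical illustration of the placement rules; separators are assumptions.
function applyRagContext(prompt: string, chunks: string[]): string {
  const context = chunks.join("\n\n");
  if (prompt.includes(RAG_PLACEHOLDER)) {
    // Variable present: insert exactly there (empty string when nothing was retrieved).
    // split/join replaces every occurrence without treating "$" in chunks specially.
    return prompt.split(RAG_PLACEHOLDER).join(context);
  }
  // Variable omitted: append the context, or leave the prompt untouched if there is none.
  return context ? `${prompt}\n\n${context}` : prompt;
}
```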
+---
+## Step 4 — Run and test it
+```bash
+# Start the agent
+pinecall run support.ts
+# In another terminal, chat with it (text)
+pinecall chat support
+```
+Ask it something covered by your docs — the answer should come straight from them.
+Call the number to test the same behaviour by voice. To sanity-check retrieval
+without spending an LLM call, use `pinecall knowledge query`.
+---
+## How it works
+- **Retrieval is hybrid** — semantic embeddings *fused with* a keyword (BM25)
+  lane. Phrase questions naturally, and exact terms or acronyms (e.g. `TTS` vs
+  `STT`, a product name, an error code) still match precisely instead of blurring
+  into similar wording.
+- **Documents are chunked by heading** (section-aligned, never mid-section), so
+  well-structured Markdown retrieves best.
+- **Sources event** — when retrieval runs, the server emits a `docs.sources` event
+  on the data channel with the documents it used (title, heading, score), so a
+  browser UI can show citations next to the answer.
+- The retrieved context counts toward the LLM's context window — keep documents
+  focused.
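To see why well-structured Markdown retrieves best, here is a minimal sketch of heading-aligned chunking. It is not the server's chunker. `chunkByHeading` and `DocChunk` are hypothetical, and treating every heading level as a chunk boundary is an assumption.

```typescript
interface DocChunk {
  heading: string; // empty for text that precedes the first heading
  body: string;
}

const FENCE = "`".repeat(3); // code-fence marker, built this way to keep the example readable

// Splits Markdown at headings so each chunk holds one whole section.
function chunkByHeading(markdown: string): DocChunk[] {
  const chunks: DocChunk[] = [];
  let current: DocChunk = { heading: "", body: "" };
  let inFence = false;

  for (const line of markdown.split("\n")) {
    // Track code fences so "# comment" lines inside code are not treated as headings.
    if (line.trimStart().startsWith(FENCE)) inFence = !inFence;
    const match = inFence ? null : /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      if (current.heading || current.body.trim()) chunks.push(current);
      current = { heading: match[2].trim(), body: "" };
    } else {
      current.body += line + "\n";
    }
  }
  if (current.heading || current.body.trim()) chunks.push(current);
  return chunks.map((c) => ({ heading: c.heading, body: c.body.trim() }));
}
```

A document with no headings comes back as a single chunk in this sketch, which hints at why adding section headings to long help articles tends to improve retrieval granularity.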
+## Keeping the knowledge base in sync
+Re-push whenever the source documents change — `push` upserts by path, so it's safe
+to run repeatedly:
+```bash
+pinecall knowledge push kb_1a2b3c ./help/*.md   # updates changed docs, adds new ones
+pinecall knowledge rm kb_1a2b3c <docId>         # remove one
+pinecall knowledge reindex kb_1a2b3c            # force a rebuild
+```
+## Limits
+- Paid plans only (Starter, Pro, Enterprise).
+- Document formats: `.md`, `.markdown`, `.txt`. Convert PDF or DOCX files to text first.
+- A knowledge base belongs to your organization; attach it by id to any of your
+  agents.