npm - @pinecall/skills - Versions diffs - 0.1.0 - Mend

@pinecall/skills 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (68)

package/skills/pinecall-reference/references/reference/rest-api.md ADDED

@@ -0,0 +1,122 @@
+---
+title: "REST API"
+description: "Static helpers for the Pinecall management API. No WebSocket needed."
+---
+# REST API
+The SDK ships with static REST helpers for management tasks: list voices, list phone numbers, mint tokens, check Twilio balance. These don't require an active WebSocket — you can call them from any process with an API key.
+```typescript
+import {
+  fetchVoices,
+  fetchPhones,
+  createToken,
+  fetchTwilioBalance,
+} from "@pinecall/sdk";
+```
+## `fetchVoices(opts?)`
+List available TTS voices. Filter by provider and language.
+```typescript
+import { fetchVoices } from "@pinecall/sdk";
+// All voices
+const voices = await fetchVoices();
+// Spanish Cartesia voices only
+const es = await fetchVoices({ provider: "cartesia", language: "es" });
+voices.forEach((v) => console.log(`${v.name} (${v.provider}/${v.alias ?? v.id})`));
+// → "Sarah (elevenlabs/sarah)"
+```
+**Returns:** `Voice[]` — each voice has `id`, `name`, `alias`, `provider`, `gender`, `style`, `languages[]`, `previewUrl`.
+| Option | Type | Description |
+|---|---|---|
+| `provider` | `string` | Filter by provider name |
+| `language` | `string` | Filter by language (BCP-47) |
+| `apiUrl` | `string` | Custom server URL |
+## `fetchPhones(opts)`
+List phone numbers on your Pinecall account.
+```typescript
+const phones = await fetchPhones({ apiKey: "pk_..." });
+phones.forEach((p) => console.log(`${p.name} → ${p.number}`));
+// → "(318) 633-0963 → +13186330963"
+```
+**Returns:** `Phone[]` — each phone has `number` (E.164), `name`, `sid`, `isSdk`.
+| Option | Type | Required | Description |
+|---|---|---|---|
+| `apiKey` | `string` | ✅ | Your Pinecall API key |
+| `apiUrl` | `string` | — | Custom server URL |
+## `createToken(opts)`
+Generate a short-lived, single-use token for browser WebRTC or chat connections. **Requires API key** — call from your backend, never the browser.
+```typescript
+import { createToken } from "@pinecall/sdk";
+const token = await createToken({
+  channel: "webrtc",
+  agentId: "florencia",
+  apiKey: process.env.PINECALL_API_KEY!,
+});
+```
+Or via instance methods (preferred when you have a `Pinecall` or `Agent` instance):
+```typescript
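+// From a Pinecall instance: pass the channel and the agent slug
+const token = await pc.createToken("webrtc", "florencia");
+// From an Agent instance: the agent is implied
+const agentToken = await agent.createToken("webrtc");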
+```
+**Returns:** `{ token: string, server: string, expiresIn: number }`.
+| Option | Type | Required | Description |
+|---|---|---|---|
+| `channel` | `"webrtc" \| "chat" \| "stream"` | ✅ | Token type |
+| `agentId` | `string` | ✅ | Agent slug |
+| `apiKey` | `string` | ✅ | API key for authentication |
+| `apiUrl` | `string` | — | Custom server URL |
+See [Security](/security) for the full token security model.
+## `fetchTwilioBalance(opts)`
+Check your Twilio account balance.
+```typescript
+const balance = await fetchTwilioBalance({ apiKey: "pk_..." });
+if (balance) console.log(`$${balance.balance} ${balance.currency}`);
+```
+**Returns:** `{ balance: string, currency: string } | null`.
+| Option | Type | Required | Description |
+|---|---|---|---|
+| `apiKey` | `string` | ✅ | API key |
+| `apiUrl` | `string` | — | Custom server URL |
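Because `balance` comes back as a string, comparing it against a threshold needs a parse step. The following is a minimal sketch, assuming the string is a plain decimal such as `"12.34"`. The `TwilioBalance` type name and the `isLowBalance` helper are hypothetical and not part of the SDK.

```typescript
// Shape documented above for the value returned by fetchTwilioBalance.
type TwilioBalance = { balance: string; currency: string } | null;

// Hypothetical helper: true when the balance is known and below the threshold.
// A null result (balance unavailable) or an unparseable string yields false.
function isLowBalance(result: TwilioBalance, threshold: number): boolean {
  if (result === null) return false;
  const amount = Number.parseFloat(result.balance);
  return Number.isFinite(amount) && amount < threshold;
}
```

A caller could pass the result of `fetchTwilioBalance` straight into this helper and alert when it returns `true`. The threshold is in whatever currency the account reports, so check `currency` before comparing across accounts.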
+## Custom server URL
+All helpers accept an `apiUrl` option for self-hosted or staging servers:
+```typescript
+fetchVoices({ apiUrl: "http://localhost:1337" });
+fetchPhones({ apiKey: "pk_...", apiUrl: "http://localhost:1337" });
+```
+## What's next
+- [`Pinecall.createToken`](/api/pinecall) — instance method form
+- [Security](/security) — token security model
+- [TTS Providers](/reference/tts-providers) — discovering voices

package/skills/pinecall-reference/references/reference/session-limits.md ADDED

@@ -0,0 +1,119 @@
+---
+title: "Session Limits"
+description: "Safety limits to prevent runaway sessions."
+---
+# Session Limits
+Calls have built-in safety limits to prevent runaway sessions: max call duration, idle timeout, warnings, and grace periods. Tune them per agent.
+## Defaults
+| Setting | Default | Description |
+|---|---|---|
+| `max_duration_seconds` | `600` (10 min) | Hard cap on total call length |
+| `idle_timeout_seconds` | `60` | Auto-hangup after this many seconds of no user speech |
+| `idle_warning_seconds` | `15` | Emit `session.idleWarning` this many seconds **before** idle timeout |
+| `idle_grace_seconds` | `10` | After idle timeout fires, agent gets this many seconds to prompt user before force-hangup |
+## Tuning per agent
+```typescript
+const agent = pc.agent("receptionist", {
+  voice: "elevenlabs/sarah",
+  stt: "deepgram/flux-en",
+  llm: "openai/gpt-5-chat-latest",
+  prompt: "...",
+  sessionLimits: {
+    max_duration_seconds: 1800,  // 30 minutes
+    idle_timeout_seconds: 120,   // 2 minutes of silence
+    idle_warning_seconds: 30,    // warn 30s before timeout
+    idle_grace_seconds: 15,
+  },
+});
+```
+## Disabling limits
+Set to `0` to disable. **Not recommended for production** — runaway sessions are a real cost risk.
+```typescript
+sessionLimits: {
+  max_duration_seconds: 0,  // 0 = unlimited
+  idle_timeout_seconds: 0,  // 0 = disabled
+}
+```
+## How it works
+1. The server starts two watchdog tasks when a call begins.
+2. The **max-duration watchdog** fires after `max_duration_seconds` — emits `session.timeout` then hangs up.
+3. The **idle watchdog** tracks user activity:
+   - When the user hasn't spoken for `idle_timeout_seconds - idle_warning_seconds`, emits `session.idleWarning`
+   - Then waits `idle_warning_seconds` for the user to speak
+   - If still silent at `idle_timeout_seconds`, fires `session.idleWarning` again and gives the agent `idle_grace_seconds` to prompt the user
+   - If still silent, emits `session.timeout` and hangs up
+4. Any user speech resets the idle timer.
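The idle watchdog steps above reduce to simple arithmetic on the three idle settings. The sketch below is one reading of that list, not the server's implementation: `idleTimeline` and `IdleLimits` are hypothetical names, and clamping the first warning at zero is an assumption for the case where the warning window exceeds the timeout.

```typescript
// Field names mirror the sessionLimits settings documented above.
interface IdleLimits {
  idle_timeout_seconds: number;
  idle_warning_seconds: number;
  idle_grace_seconds: number;
}

// Hypothetical helper: seconds of continuous user silence at which each step fires.
function idleTimeline(limits: IdleLimits) {
  const timeout = limits.idle_timeout_seconds;
  const warning = limits.idle_warning_seconds;
  const grace = limits.idle_grace_seconds;
  if (timeout === 0) return null; // 0 disables the idle watchdog
  return {
    firstWarningAt: Math.max(0, timeout - warning), // first session.idleWarning
    secondWarningAt: timeout, // second session.idleWarning, grace period starts
    hangupAt: timeout + grace, // session.timeout, then hang-up
  };
}

const defaults = idleTimeline({
  idle_timeout_seconds: 60,
  idle_warning_seconds: 15,
  idle_grace_seconds: 10,
});
console.log(defaults);
// → { firstWarningAt: 45, secondWarningAt: 60, hangupAt: 70 }
```

These offsets count user silence only. Any user speech restarts the count from zero, so wall-clock time to hang-up can be longer than `hangupAt`.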
+## Reacting to warnings
+The `session.idleWarning` event lets you prompt the user before the timeout:
+```typescript
+agent.on("session.idleWarning", (event, call) => {
+  // event.remainingSeconds: seconds until timeout
+  // event.idleTimeoutSeconds: the configured idle timeout
+  call.say("Are you still there?");
+});
+agent.on("session.timeout", (event, call) => {
+  // event.reason: "max_duration" | "idle_timeout"
+  const why = event.reason === "idle_timeout" ? "due to inactivity" : "because it reached the time limit";
+  call.say(`Goodbye! The call is ending ${why}.`);
+});
+```
+## Timeline
+![Idle timeout timeline](/assets/diagrams/idle-timeline.png)
+> **Important:** Bot speech (e.g. "Are you still there?") **pauses** the idle counter but does **not** reset it. Only real user speech resets the timer. This prevents infinite warning loops.
+## Widget integration
+The `@pinecall/web` widget automatically responds to `session.idleWarning` by switching the orb to a blinking amber state (`.idle-warning` CSS class, configurable via the `colorWarning` theme prop). On `session.timeout`, the widget auto-disconnects.
+## Common configs
+### Quick IVR-style flows
+```typescript
+sessionLimits: {
+  max_duration_seconds: 180,   // 3 min hard cap
+  idle_timeout_seconds: 20,    // hang up fast on silence
+  idle_warning_seconds: 5,
+}
+```
+### Long-form support calls
+```typescript
+sessionLimits: {
+  max_duration_seconds: 3600,  // 1 hour
+  idle_timeout_seconds: 180,   // 3 min of silence
+  idle_warning_seconds: 60,
+}
+```
+### Outbound campaigns
+```typescript
+sessionLimits: {
+  max_duration_seconds: 600,   // 10 min — most outbound calls end quickly
+  idle_timeout_seconds: 30,    // hang up if callee stops engaging
+}
+```
+## What's next
+- [Events reference → `session.*`](/reference/events)
+- [Outbound calls](/guides/outbound-calls)

package/skills/pinecall-reference/references/reference/stt-providers.md ADDED

@@ -0,0 +1,174 @@
+---
+title: "STT Providers"
+description: "Speech-to-text providers, models, and tuning parameters."
+---
+# STT Providers
+Pinecall supports multiple STT providers. Use the `provider/model` format or a full config object.
+## Quick reference
+```typescript
+// Deepgram Flux (recommended for real-time voice)
+{ stt: "deepgram/flux" }             // auto-selects en/multi based on language
+{ stt: "deepgram/flux-en" }          // force English-only model
+{ stt: "deepgram/flux-multi" }       // force multilingual model
+// Deepgram Nova
+{ stt: "deepgram/nova-3" }
+// Gladia
+{ stt: "gladia/solaria" }
+// AWS Transcribe
+{ stt: "transcribe" }
+```
+## Naming convention
+Configuration objects that pass through to providers keep **snake_case** to mirror what the receiving side expects (`endpointing_ms`, `interim_results`, etc.). This avoids an unnecessary translation layer and lets you copy-paste from provider docs directly.
+## Deepgram Flux (recommended)
+Best for real-time voice agents. Turn detection and VAD are **auto-derived** — no configuration needed.
+```typescript
+stt: "deepgram/flux"
+```
+Or with tuning:
+```typescript
+stt: {
+  provider: "deepgram-flux",
+  keyterms: ["pinecall"],      // boost recognition for specific terms
+  eot_threshold: 0.5,          // end-of-turn sensitivity (0-1)
+  eager_eot_threshold: 0.7,    // eager turn threshold
+  eot_timeout_ms: 2000,
+}
+```
+> **Auto-derived:** Flux → native turn detection + native VAD. No need to specify `turnDetection`.
+> **Language auto-select:** `"deepgram/flux"` picks `flux-general-en` when `language: "en"` and `flux-general-multi` otherwise. Use `"deepgram/flux-en"` or `"deepgram/flux-multi"` to force a specific model.
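The selection rule in the note above can be written out as a small function. This is only an illustration of the documented behavior, since the server does the resolution itself. `resolveFluxModel` is a hypothetical name, and two details are assumptions: the forced shortcuts map to the same two model names, and only an exact `"en"` counts as English.

```typescript
type FluxShortcut = "deepgram/flux" | "deepgram/flux-en" | "deepgram/flux-multi";

// Hypothetical helper mirroring the language auto-select rule.
function resolveFluxModel(stt: FluxShortcut, language?: string): string {
  // Forced shortcuts ignore the language setting (assumed model names).
  if (stt === "deepgram/flux-en") return "flux-general-en";
  if (stt === "deepgram/flux-multi") return "flux-general-multi";
  // "deepgram/flux": English-only model for "en", multilingual otherwise.
  return language === "en" ? "flux-general-en" : "flux-general-multi";
}
```

Under this reading, an agent with no `language` set falls through to the multilingual model, which is why forcing `"deepgram/flux-en"` can be useful for English-only deployments.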
+## Deepgram Nova
+Classic STT. Turn detection and VAD are auto-derived (`smart_turn` + `silero`).
+```typescript
+stt: "deepgram/nova-3"
+```
+Or with tuning:
+```typescript
+stt: {
+  provider: "deepgram",
+  model: "nova-3",
+  language: "en",
+  interim_results: true,
+  smart_format: true,
+  punctuate: true,
+  profanity_filter: false,
+  endpointing_ms: 300,
+  utterance_end_ms: 1000,
+  keywords: ["pinecall"],
+}
+```
+## Gladia
+```typescript
+stt: "gladia/solaria"
+```
+Or with tuning:
+```typescript
+stt: {
+  provider: "gladia",
+  model: "solaria-1",
+  language: "en",
+  endpointing: 300,
+  speech_threshold: 0.8,
+  code_switching: false,
+  audio_enhancer: true,
+}
+```
+## AWS Transcribe
+```typescript
+stt: {
+  provider: "transcribe",
+  language: "en-US",
+}
+```
+## Which to choose
+| Provider | Best for | Trade-off |
+|---|---|---|
+| `deepgram/flux` | Real-time voice agents | Lowest latency; English, Spanish, French, German, Portuguese, and ~15 more |
+| `deepgram/nova-3` | Arabic, Hindi, Thai, and other languages Flux doesn't cover (60+ languages in total) | Slightly higher latency; smart_turn + silero VAD |
+| `gladia/solaria` | Code-switching, multilingual | Higher latency than Deepgram |
+| `transcribe` | AWS-native deployments | AWS pricing model |
+For most agents, start with `deepgram/flux`. Use `deepgram/nova-3` for languages Flux doesn't cover (Arabic, Hindi, Thai, etc.).
+## Language coverage
+**Deepgram Flux** supports ~20 languages including: English, Spanish, French, German, Portuguese, Italian, Dutch, Russian, Ukrainian, Turkish, Polish, Swedish, Norwegian, Danish, Finnish, Indonesian, Malay, Korean, Japanese, Chinese (Mandarin).
+**Deepgram Nova-3** supports 60+ languages including everything Flux covers plus: Arabic, Hindi, Urdu, Bengali, Thai, Vietnamese, Hebrew, Farsi, Swahili, Tamil, Telugu, and many more.
+> **Rule of thumb:** If your language works with Flux, use Flux — it's faster and has native turn detection. If not, use Nova-3.
+### Multi-language agents with `phoneNumbers`
+When you have different phone numbers per language/region, set per-number STT overrides. The server auto-derives turn detection and VAD from the STT provider:
+| STT Provider | Turn Detection | VAD |
+|---|---|---|
+| `deepgram/flux` | Native (built-in) | Native (built-in) |
+| `deepgram/nova-3` | Smart turn | Silero |
+| `gladia/solaria` | Smart turn | Silero |
+```typescript
+const agent = pc.agent("global-support", {
+  prompt: "You are a multilingual support agent.",
+  llm: "openai/gpt-5-chat-latest",
+  phoneNumbers: [
+    // English — Flux (fastest, native turn detection)
+    { number: "+14155551234", language: "en", voice: "elevenlabs/sarah", stt: "deepgram/flux" },
+    // Spanish — Flux multilingual
+    { number: "+34612345678", language: "es", voice: "elevenlabs/valentina", stt: "deepgram/flux" },
+    // Arabic — Nova-3 (Flux doesn't support Arabic)
+    { number: "+972501234567", language: "ar", voice: "elevenlabs/ahmad", stt: "deepgram/nova-3" },
+  ],
+});
+```
+No need to configure turn detection or VAD manually — the server auto-derives them from the STT provider.
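The mapping in the table above is small enough to express as a lookup. The sketch below illustrates what the server derives; it is not something you need to call. `deriveTurnAndVad` is a hypothetical name, and treating every non-Flux provider as `smart_turn` + `silero` is an assumption extended from the three rows listed.

```typescript
type Derived = {
  turnDetection: "native" | "smart_turn";
  vad: "native" | "silero";
};

// Hypothetical helper: accepts a "provider/model" shortcut or the
// "deepgram-flux" provider name used in full config objects.
function deriveTurnAndVad(stt: string): Derived {
  const isFlux = stt === "deepgram-flux" || stt.startsWith("deepgram/flux");
  return isFlux
    ? { turnDetection: "native", vad: "native" }
    : { turnDetection: "smart_turn", vad: "silero" };
}
```

In the `global-support` example, the two Flux numbers would get native turn detection and the Nova-3 number would get `smart_turn` with Silero, with no extra configuration on any of them.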
+## Hot-reloading STT
+You can swap STT providers at runtime:
+```typescript
+// Agent-wide (all future calls)
+agent.update({ stt: "gladia/solaria" });
+// One call only
+call.update({ stt: "deepgram/nova-3" });
+```
+## What's next
+- [Turn Detection](/concepts/turn-detection) — how Flux native vs SmartTurn + Silero work
+- [TTS Providers](/reference/tts-providers)
+- [LLM Providers](/reference/llm-providers)
+- [`Agent.configure`](/api/agent)

package/skills/pinecall-reference/references/reference/tts-providers.md ADDED

@@ -0,0 +1,149 @@
+---
+title: "TTS Providers"
+description: "Text-to-speech providers, voices, and tuning parameters."
+---
+# TTS Providers
+Pinecall supports multiple TTS providers. Use the `provider/friendly-id` format (always lowercase) to specify a voice.
+## Voice format
+```typescript
+// Recommended: friendly alias (always lowercase)
+{ voice: "elevenlabs/sarah" }
+{ voice: "cartesia/yumiko" }
+{ voice: "polly/lucia" }
+// Full config object (for tuning parameters)
+{ voice: { provider: "elevenlabs", voice_id: "...", speed: 1.1 } }
+```
+> The legacy `provider:rawId` format (e.g. `"elevenlabs:EXAVITQu4vr4xnSDxMaL"`) still works but is not recommended.
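If you accept voice strings from user input or config files, it can help to check the format before sending it to the server. The following is a minimal sketch of a parser for the two string forms described above. `parseVoice` and `VoiceRef` are hypothetical names, and rejecting uppercase in the `provider/friendly-id` form is a choice of this sketch, not documented server behavior.

```typescript
interface VoiceRef {
  provider: string;
  id: string;
  legacy: boolean; // true for the provider:rawId form
}

// Hypothetical parser: returns null when the string matches neither form.
function parseVoice(voice: string): VoiceRef | null {
  const slash = voice.indexOf("/");
  if (slash > 0 && slash < voice.length - 1) {
    // Friendly aliases are always lowercase, so anything else is rejected.
    if (voice !== voice.toLowerCase()) return null;
    return { provider: voice.slice(0, slash), id: voice.slice(slash + 1), legacy: false };
  }
  const colon = voice.indexOf(":");
  if (colon > 0 && colon < voice.length - 1) {
    // Raw provider IDs are case-sensitive, so they are kept as written.
    return { provider: voice.slice(0, colon), id: voice.slice(colon + 1), legacy: true };
  }
  return null;
}

console.log(parseVoice("elevenlabs/sarah"));
// → { provider: 'elevenlabs', id: 'sarah', legacy: false }
```

The `legacy` flag makes it easy to log a migration hint when a config still uses the `provider:rawId` form.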
+## Discovering voices
+Use the CLI to browse voices. Without flags, you get a catalog overview:
+```bash
+# Overview — shows providers, voice counts, languages
+pinecall voices
+# List voices for a provider + language
+pinecall voices --provider=elevenlabs --language=es
+# Preview a voice (plays audio in your terminal)
+pinecall voices play elevenlabs/sarah
+```
+Every voice gets a friendly alias auto-generated from its name — use it directly in your config:
+```typescript
+{ voice: "elevenlabs/sarah" }    // → Sarah - Mature, Reassuring
+{ voice: "elevenlabs/agustin" }  // → Agustin - Conversational & Relaxed
+```
+Or use the [`fetchVoices`](/reference/rest-api) REST helper:
+```typescript
+import { fetchVoices } from "@pinecall/sdk";
+const voices = await fetchVoices({ provider: "elevenlabs", language: "es" });
+voices.forEach((v) => console.log(`${v.name} → ${v.provider}/${v.alias ?? v.id}`));
+```
+## ElevenLabs
+```typescript
+voice: {
+  provider: "elevenlabs",
+  voice_id: "JBFqnCBsd6RMkjVDRZzb",
+  speed: 1.0,
+  stability: 0.5,
+  similarity_boost: 0.75,
+  style: 0,
+  use_speaker_boost: true,
+}
+```
+Shortcut: `"elevenlabs/sarah"`
+The server defaults to `eleven_flash_v2_5` (the fastest model, optimized for real-time streaming). Override it with the optional `model` field (e.g. `model: "eleven_turbo_v2_5"`).
+**Tuning notes:**
+- `stability` higher = more consistent, less expressive
+- `similarity_boost` higher = closer to the cloned voice
+- `style` 0–1, adds expressiveness (slight latency cost)
+## Cartesia
+```typescript
+voice: {
+  provider: "cartesia",
+  voice_id: "a0e99841-438c-4a64-b679-ae501e7d6091",
+  model: "sonic-3",
+  speed: 1.0,
+  volume: 1.0,
+  emotion: null,
+  language: "en",
+}
+```
+Shortcut: `"cartesia/yumiko"`
+**Tuning notes:**
+- `model: "sonic-3"` — fastest Cartesia model, designed for streaming
+- `emotion` accepts named emotion presets (check Cartesia docs for the current list)
+## AWS Polly
+```typescript
+voice: {
+  provider: "polly",
+  voice_id: "Joanna",
+  engine: "neural",
+  language: "en-US",
+}
+```
+Shortcut: `"polly/joanna"`
+**Tuning notes:**
+- `engine: "neural"` is required for natural-sounding output. The older `standard` engine is robotic.
+- Polly is the cheapest option but the least natural — fine for IVR-style flows, not for engaging conversation.
+## Which to choose
+| Provider | Best for | Trade-off |
+|---|---|---|
+| **ElevenLabs** | Most natural-sounding output | Higher cost per character |
+| **Cartesia** | Real-time streaming, low latency | Smaller voice library |
+| **Polly** | Cheap IVR, simple flows | Less natural |
+For most agents, start with ElevenLabs (`eleven_flash_v2_5`) or Cartesia (`sonic-3`). Use Polly only for high-volume, low-engagement flows.
+## Hot-reloading voices
+You can change the voice at any time:
+```typescript
+// Agent-wide
+agent.update({ voice: "cartesia/blake" });
+// One call only
+call.update({ voice: "elevenlabs/daniel" });
+// Per-channel override
+agent.addPhoneNumber("+34911234567", {
+  voice: "elevenlabs/valentina",
+});
+```
+## What's next
+- [STT Providers](/reference/stt-providers)
+- [REST API → fetchVoices](/reference/rest-api)
+- [`Agent.configure`](/api/agent)

package/skills/pinecall-sdk-api/SKILL.md ADDED

@@ -0,0 +1,56 @@
+---
+name: pinecall-sdk-api
+description: >-
+  @pinecall/sdk API reference — Pinecall, Agent, Call, ReplyStream. Use when the user is building, configuring, or debugging with @pinecall/sdk. Keywords: api, pc.agent, agent.dial, call object, reply stream, replyStream, server sdk surface.
+license: MIT
+---
+# @pinecall/sdk (Node.js)
+@pinecall/sdk API reference — Pinecall, Agent, Call, ReplyStream.
+This skill bundles the official Pinecall documentation for **@pinecall/sdk (Node.js)**. The
+table below indexes every page; open the `references/…` file for the full text
+(loaded on demand). Source of truth: <https://docs.pinecall.io>.
+| Page | What it covers | Open |
+|------|----------------|------|
+| **Pinecall** | The WebSocket client. Manages auth, reconnection, and agent multiplexing. | [`references/api/pinecall.md`](references/api/pinecall.md) · [docs](https://docs.pinecall.io/api/pinecall) |
+| **Agent** | Owns channels, routes call events, stores defaults, dials outbound calls. | [`references/api/agent.md`](references/api/agent.md) · [docs](https://docs.pinecall.io/api/agent) |
+| **Call** | Per-session handle. Speak, control, update, read state. | [`references/api/call.md`](references/api/call.md) · [docs](https://docs.pinecall.io/api/call) |
+| **ReplyStream** | Token-by-token streaming for client-side LLM responses. | [`references/api/reply-stream.md`](references/api/reply-stream.md) · [docs](https://docs.pinecall.io/api/reply-stream) |
+## Canonical agent
+```typescript
+import { Pinecall } from "@pinecall/sdk";
+const pc = new Pinecall(); // reads PINECALL_API_KEY, auto-connects
+const agent = pc.agent("mara", {
+  prompt: "You are Mara, a friendly voice assistant. Be concise.",
+  llm: "openai/gpt-5-chat-latest",
+  voice: "elevenlabs/sarah",
+  stt: "deepgram/flux",
+  language: "en",
+  greeting: "Hello! How can I help?",
+});
+```
+## House rules — always apply
+- **Example defaults** (use these exact strings unless the user asks otherwise):
+  `stt: "deepgram/flux"`, `llm: "openai/gpt-5-chat-latest"`, `voice: "elevenlabs/sarah"`.
+  **NEVER use `deepgram/nova-2`** — it is not supported. Use `deepgram/nova-3`
+  only for languages Flux doesn't support (e.g. Arabic).
+- **Turn detection & VAD are auto-derived from the STT provider — never set
+  `turnDetection` or `vad` manually.** Flux → native turns + native VAD;
+  every other STT → `smart_turn` + `silero`.
+- **Greeting**: inbound → `greeting` field in `pc.agent()`; outbound → `greeting`
+  field in `agent.dial()`. It is sugar for `call.say()` in `call.started`.
+- **Auth**: `new Pinecall()` reads `PINECALL_API_KEY` from env and auto-connects.
+- Full documentation: <https://docs.pinecall.io>
+---
+*Generated from `sdk/docs/` by `@pinecall/skills` — do not edit by hand; edit the
+docs and re-run `node build.mjs`.*