@pinecall/skills 0.1.13 → 0.1.15

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@pinecall/skills",
-   "version": "0.1.13",
+   "version": "0.1.15",
    "description": "Agent Skills for the Pinecall SDK — installable into Claude Code, Antigravity, Cursor, Copilot and any agent that supports the open Skills format.",
    "type": "module",
    "license": "MIT",
@@ -34,12 +34,12 @@ Every STT, TTS and LLM model on Pinecall is one of two kinds:

  | Service | BYOK-only providers |
  |---|---|
- | **STT** | `assemblyai`, `soniox` |
+ | **STT** | `assemblyai`, `soniox`, `xai` (Grok) |
  | **TTS** | `rime`, `soniox`, `xai` (Grok voices) |
  | **LLM** | `xai` (grok), `groq`, `cerebras`, `deepseek`, `openrouter` |

  > One key covers multiple services: a **Soniox** key = STT + TTS; an **xAI** key =
- > Grok LLM + Grok TTS voices.
+ > Grok **LLM + TTS + STT**.

  ## Check it from the API (authoritative, live)

@@ -30,6 +30,7 @@ Pinecall supports multiple STT providers. Use the `provider/model` format or a f
  { stt: "elevenlabs/scribe" } // ElevenLabs Scribe v2 (realtime)
  { stt: "assemblyai/universal" } // AssemblyAI Universal-3
  { stt: "soniox/realtime" } // Soniox real-time (BYOK)
+ { stt: "xai/grok-stt" } // xAI Grok STT (BYOK)
  ```

  ## Managed vs bring-your-own-key (BYOK)
@@ -48,6 +49,7 @@ for the full list and the live `GET /api/rates/models` query.
  | `elevenlabs` (scribe) | ✅ Yes | Same key as ElevenLabs TTS — Pinecall hosts it |
  | `assemblyai` (universal) | ❌ BYOK only | Add an AssemblyAI key |
  | `soniox` (realtime) | ❌ BYOK only | One Soniox key = STT **and** TTS |
+ | `xai` (grok-stt) | ❌ BYOK only | Same xAI key as Grok LLM/TTS |

  > **BYOK enforcement:** if you configure a BYOK-only STT provider and your org has
  > not saved a key for it, **agent registration is rejected** with
@@ -96,7 +98,7 @@ Or with tuning:
  ```typescript
  stt: {
    provider: "deepgram",
-   model: "nova-3",
+   model: "nova-3", // "nova-3" | "nova-2"
    language: "en",
    interim_results: true,
    smart_format: true,
@@ -104,7 +106,9 @@ stt: {
    profanity_filter: false,
    endpointing_ms: 300,
    utterance_end_ms: 1000,
-   keywords: ["pinecall"],
+   keywords: ["pinecall"], // nova-2 keyword boosting
+   keyterms: ["pinecall"], // nova-3 keyterm prompting
+   min_confidence: 0.0, // drop transcripts below this confidence (0 = off)
  }
  ```
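
The comments added in this hunk tie `keywords` to nova-2 and `keyterms` to nova-3. A minimal sketch of how a caller might set only the field that matches the chosen model is shown below. `buildDeepgramStt` and the `DeepgramSttConfig` type are hypothetical names for illustration, not part of the Pinecall SDK, and sending only one of the two fields is an assumption; the documented example above sets both.

```typescript
// Hypothetical helper, not part of the Pinecall SDK.
// Field names (keywords, keyterms, min_confidence) are taken from the diff above.
type DeepgramModel = "nova-3" | "nova-2";

interface DeepgramSttConfig {
  provider: "deepgram";
  model: DeepgramModel;
  language: string;
  keywords?: string[]; // nova-2 keyword boosting
  keyterms?: string[]; // nova-3 keyterm prompting
  min_confidence: number; // 0 = off
}

function buildDeepgramStt(
  model: DeepgramModel,
  terms: string[],
  minConfidence: number = 0.0,
): DeepgramSttConfig {
  const base: DeepgramSttConfig = {
    provider: "deepgram",
    model,
    language: "en",
    min_confidence: minConfidence,
  };
  // Assumption: attach the vocabulary hint only under the field the model uses.
  return model === "nova-3"
    ? { ...base, keyterms: terms }
    : { ...base, keywords: terms };
}

const nova3Config = buildDeepgramStt("nova-3", ["pinecall"]);
const nova2Config = buildDeepgramStt("nova-2", ["pinecall"], 0.4);
```

Keeping the choice in one function means a later model switch changes a single argument rather than two config fields.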
 
@@ -192,6 +196,16 @@ stt: "soniox/realtime"
  stt: { provider: "soniox", model: "stt-rt-v5", language: "en" }
  ```

+ ## xAI Grok (BYOK)
+
+ Grok speech-to-text — same **xAI key** as Grok LLM and TTS. Requires your own key.
+
+ ```typescript
+ stt: "xai/grok-stt"
+ // or
+ stt: { provider: "xai", model: "grok-stt", language: "en" }
+ ```
+
  ## Which to choose

  | Provider | Best for | Trade-off |
@@ -204,6 +218,7 @@ stt: { provider: "soniox", model: "stt-rt-v5", language: "en" }
  | `elevenlabs/scribe` | Single-vendor with ElevenLabs TTS | Managed (shared key) |
  | `assemblyai/universal` | Accuracy + diarization | BYOK only |
  | `soniox/realtime` | Multilingual (60+), single-vendor with Soniox TTS | BYOK only |
+ | `xai/grok-stt` | Single-vendor with Grok LLM + TTS | BYOK only |

  For most agents, start with `deepgram/flux`. Use `deepgram/nova-3` for languages Flux doesn't cover (Arabic, Hindi, Thai, Chinese, Japanese, Korean, etc.).
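
With `xai` added, the STT providers marked "BYOK only" in this diff are `assemblyai`, `soniox` and `xai`. A minimal sketch of a local pre-check that flags those before agent registration follows. `BYOK_ONLY_STT`, `sttProviderOf` and `needsOwnKey` are hypothetical names, not SDK exports, and the hard-coded set only mirrors the tables in this diff; the live `GET /api/rates/models` query remains the authoritative source.

```typescript
// Hypothetical pre-check, not part of the Pinecall SDK.
// The set mirrors the "BYOK only" STT rows shown in this diff.
const BYOK_ONLY_STT: ReadonlySet<string> = new Set(["assemblyai", "soniox", "xai"]);

type SttSetting = string | { provider: string; model?: string; language?: string };

// Accepts either the "provider/model" shortcut or the object form.
function sttProviderOf(stt: SttSetting): string {
  return typeof stt === "string" ? stt.split("/")[0] : stt.provider;
}

// True when the provider is listed as BYOK-only in this diff.
function needsOwnKey(stt: SttSetting): boolean {
  return BYOK_ONLY_STT.has(sttProviderOf(stt));
}

const grokShortcut = needsOwnKey("xai/grok-stt");
const grokObject = needsOwnKey({ provider: "xai", model: "grok-stt", language: "en" });
const fluxShortcut = needsOwnKey("deepgram/flux");
```

A `false` result only means the provider is not listed as BYOK-only here; it does not confirm that the provider is managed.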
 
@@ -179,8 +179,11 @@ Shortcut: `"cartesia/yumiko"`
  voice: {
    provider: "polly",
    voice_id: "Joanna",
-   engine: "neural",
+   engine: "neural", // "neural" | "standard"
    language: "en-US",
+   rate: "medium", // "slow" | "medium" | "fast" | "+10%" / "-10%"
+   volume: "medium", // "soft" | "medium" | "loud" | "+6dB" / "-6dB"
+   pitch: "+5%", // standard engine only
  }
  ```

@@ -189,6 +192,7 @@ Shortcut: `"polly/joanna"`
  **Tuning notes:**

  - `engine: "neural"` is required for natural-sounding output. The older `standard` engine is robotic.
+ - `rate` / `volume` accept named levels or relative values; `pitch` only applies to the `standard` engine.
  - Polly is the cheapest option but the least natural — fine for IVR-style flows, not for engaging conversation.

  ## Rime (BYOK)