@pinecall/skills 0.1.7 → 0.1.9

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@pinecall/skills",
- "version": "0.1.7",
+ "version": "0.1.9",
  "description": "Agent Skills for the Pinecall SDK — installable into Claude Code, Antigravity, Cursor, Copilot and any agent that supports the open Skills format.",
  "type": "module",
  "license": "MIT",
@@ -21,21 +21,24 @@ Every STT, TTS and LLM model on Pinecall is one of two kinds:
 
  | Service | Managed providers |
  |---|---|
- | **STT** | `deepgram` (flux, nova-3), `gladia`, `transcribe` (AWS) |
+ | **STT** | `deepgram` (flux, nova-3), `gladia`, `transcribe` (AWS), `cartesia` (ink-whisper), `elevenlabs` (scribe) |
  | **TTS** | `elevenlabs`, `cartesia` (sonic), `polly` (AWS) |
  | **LLM** | `openai`, `anthropic`, `google` (gemini), `mistral` |
 
+ > **One key, both services:** an ElevenLabs (or Cartesia) key serves *both* that
+ > vendor's TTS and STT. Pinecall already holds those keys for the managed TTS, so
+ > their STT (ElevenLabs **Scribe**, Cartesia **Ink-Whisper**) is **also managed** —
+ > no key needed.
+
  ## What requires your own key (BYOK)
 
  | Service | BYOK-only providers |
  |---|---|
- | **STT** | `cartesia` (ink-whisper), `elevenlabs` (scribe), `assemblyai` |
- | **TTS** | `rime` |
+ | **STT** | `assemblyai`, `soniox` |
+ | **TTS** | `rime`, `soniox` |
  | **LLM** | `xai` (grok), `groq`, `cerebras`, `deepseek`, `openrouter` |
 
- > Note a provider can be **managed for one service and BYOK for another** — e.g.
- > Cartesia **TTS** (sonic) is managed, but Cartesia **STT** (ink-whisper) is BYOK.
- > ElevenLabs **TTS** is managed, ElevenLabs **STT** (scribe) is BYOK.
+ > `soniox` is one key for **both** STT and TTS (a Soniox key enables both).
 
  ## Check it from the API (authoritative, live)
 
@@ -55,7 +58,7 @@ curl https://playground.pinecall.io/api/rates/models
  // ...
  ],
  "managedProviders": {
- "stt": ["deepgram", "gladia", "transcribe"],
+ "stt": ["cartesia", "deepgram", "elevenlabs", "gladia", "transcribe"],
  "tts": ["cartesia", "elevenlabs", "polly"],
  "llm": ["anthropic", "google", "mistral", "openai"]
  }
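The `managedProviders` object in this hunk can drive a key check before an agent is registered. The following is a minimal sketch, assuming the response shape shown above. The `needsOwnKey` helper and the `ManagedProviders` type are hypothetical names, not part of the Pinecall SDK, and the literal is copied from the 0.1.9 side of the diff rather than fetched live.

```typescript
// Hypothetical helper and types; not part of the Pinecall SDK.
type Service = "stt" | "tts" | "llm";
type ManagedProviders = Record<Service, string[]>;

// Copied from the 0.1.9 side of the hunk. In practice this would be the
// `managedProviders` field of the GET /api/rates/models response.
const managedProviders: ManagedProviders = {
  stt: ["cartesia", "deepgram", "elevenlabs", "gladia", "transcribe"],
  tts: ["cartesia", "elevenlabs", "polly"],
  llm: ["anthropic", "google", "mistral", "openai"],
};

// True when the provider is absent from the managed list for that service,
// meaning the org would have to save its own key first.
function needsOwnKey(
  managed: ManagedProviders,
  service: Service,
  provider: string
): boolean {
  return managed[service].indexOf(provider) === -1;
}

const cartesiaStt = needsOwnKey(managedProviders, "stt", "cartesia"); // false in 0.1.9
const assemblyStt = needsOwnKey(managedProviders, "stt", "assemblyai"); // true
console.log({ cartesiaStt, assemblyStt });
```

Run against the 0.1.7 list (`["deepgram", "gladia", "transcribe"]`), the same call for `cartesia` would return `true`, which is the behavioral change this release documents.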
@@ -29,6 +29,7 @@ Pinecall supports multiple STT providers. Use the `provider/model` format or a f
  { stt: "cartesia/ink-whisper" } // Cartesia Ink-Whisper
  { stt: "elevenlabs/scribe" } // ElevenLabs Scribe v2 (realtime)
  { stt: "assemblyai/universal" } // AssemblyAI Universal-3
+ { stt: "soniox/realtime" } // Soniox real-time (BYOK)
  ```
 
  ## Managed vs bring-your-own-key (BYOK)
  ## Managed vs bring-your-own-key (BYOK)
@@ -43,9 +44,10 @@ for the full list and the live `GET /api/rates/models` query.
  | `deepgram` (flux/nova) | ✅ Yes | Default, recommended |
  | `gladia` | ✅ Yes | |
  | `transcribe` (AWS) | ✅ Yes | |
- | `cartesia` (ink-whisper) | BYOK only | Add a Cartesia key |
- | `elevenlabs` (scribe) | BYOK only | Add an ElevenLabs key |
+ | `cartesia` (ink-whisper) | Yes | Same key as Cartesia TTS — Pinecall hosts it |
+ | `elevenlabs` (scribe) | Yes | Same key as ElevenLabs TTS — Pinecall hosts it |
  | `assemblyai` (universal) | ❌ BYOK only | Add an AssemblyAI key |
+ | `soniox` (realtime) | ❌ BYOK only | One Soniox key = STT **and** TTS |
 
  > **BYOK enforcement:** if you configure a BYOK-only STT provider and your org has
  > not saved a key for it, **agent registration is rejected** with
@@ -135,10 +137,11 @@ stt: {
  }
  ```
 
- ## Cartesia Ink-Whisper (BYOK)
+ ## Cartesia Ink-Whisper
 
- Pairs naturally with Cartesia (Sonic) TTS for a single-vendor voice stack. Requires
- your own Cartesia key.
+ Pairs naturally with Cartesia (Sonic) TTS for a single-vendor voice stack.
+ **Managed**: the same Cartesia key serves TTS and STT, and Pinecall hosts it (or
+ bring your own Cartesia key to bill it directly).
 
  ```typescript
  stt: "cartesia/ink-whisper"
@@ -146,9 +149,10 @@ stt: "cartesia/ink-whisper"
  stt: { provider: "cartesia", model: "ink-whisper", language: "en" }
  ```
 
- ## ElevenLabs Scribe (BYOK)
+ ## ElevenLabs Scribe
 
- Realtime `scribe_v2_realtime`. Uses the same ElevenLabs key as ElevenLabs TTS.
+ Realtime `scribe_v2_realtime`. **Managed** — uses the same ElevenLabs key as
+ ElevenLabs TTS, which Pinecall hosts (or bring your own ElevenLabs key).
 
  ```typescript
  stt: "elevenlabs/scribe"
@@ -163,8 +167,8 @@ stt: {
 
  ## AssemblyAI (BYOK)
 
- Universal-3 streaming (`u3-rt-pro`) — strong accuracy + diarization. Requires your
- own AssemblyAI key.
+ Universal-3 streaming (`u3-rt-pro`) — strong accuracy + diarization. **BYOK only** —
+ Pinecall hosts no AssemblyAI key, so add your own under Provider Keys.
 
  ```typescript
  stt: "assemblyai/universal"
@@ -177,6 +181,17 @@ stt: {
  }
  ```
 
+ ## Soniox (BYOK)
+
+ Real-time multilingual STT (60+ languages). One Soniox key serves **both** Soniox
+ STT and TTS. Requires your own Soniox key.
+
+ ```typescript
+ stt: "soniox/realtime"
+ // or
+ stt: { provider: "soniox", model: "stt-rt-v5", language: "en" }
+ ```
+
  ## Which to choose
 
  | Provider | Best for | Trade-off |
@@ -185,9 +200,10 @@ stt: {
  | `deepgram/nova-3` | Arabic, Hindi, Thai, CJK, and 60+ languages | Slightly higher latency; smart_turn + silero VAD |
  | `gladia/solaria` | Code-switching, multilingual | Higher latency than Deepgram |
  | `transcribe` | AWS-native deployments | AWS pricing model |
- | `cartesia/ink-whisper` | Single-vendor with Cartesia TTS | BYOK only |
- | `elevenlabs/scribe` | Single-vendor with ElevenLabs TTS | BYOK only |
+ | `cartesia/ink-whisper` | Single-vendor with Cartesia TTS | Managed (shared key) |
+ | `elevenlabs/scribe` | Single-vendor with ElevenLabs TTS | Managed (shared key) |
  | `assemblyai/universal` | Accuracy + diarization | BYOK only |
+ | `soniox/realtime` | Multilingual (60+), single-vendor with Soniox TTS | BYOK only |
 
  For most agents, start with `deepgram/flux`. Use `deepgram/nova-3` for languages Flux doesn't cover (Arabic, Hindi, Thai, Chinese, Japanese, Korean, etc.).
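The `provider/model` shorthand and the object form used in these hunks carry the same two fields. The following is a minimal sketch of splitting the shorthand, assuming the part before the first slash is the provider. `splitShorthand` and `SttRef` are hypothetical names, not Pinecall SDK exports, and the sketch does not resolve aliases: the diff pairs `soniox/realtime` with the model `stt-rt-v5`, a mapping that is not spelled out here.

```typescript
// Hypothetical shape for illustration; the real SDK type is not shown in this diff.
interface SttRef {
  provider: string;
  model?: string;
}

// Split "provider/model" at the first slash. A bare provider such as
// "transcribe" yields no model. Aliases are returned as written, not resolved.
function splitShorthand(value: string): SttRef {
  const slash = value.indexOf("/");
  if (slash === -1) {
    return { provider: value };
  }
  return {
    provider: value.slice(0, slash),
    model: value.slice(slash + 1),
  };
}

const cartesia = splitShorthand("cartesia/ink-whisper");
// cartesia.provider === "cartesia", cartesia.model === "ink-whisper"
const aws = splitShorthand("transcribe");
// aws.provider === "transcribe", aws.model === undefined
console.log(cartesia, aws);
```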
 
@@ -29,9 +29,10 @@ for the full list and the live `GET /api/rates/models` query.
  | TTS provider | Managed (no key needed) | Notes |
  |---|---|---|
  | `elevenlabs` | ✅ Yes | Default, recommended |
- | `cartesia` (sonic) | ✅ Yes | |
+ | `cartesia` (sonic-3.5) | ✅ Yes | |
  | `polly` (AWS) | ✅ Yes | |
  | `rime` | ❌ BYOK only | Add a Rime key under Provider Keys |
+ | `soniox` | ❌ BYOK only | One Soniox key = TTS **and** STT |
 
  > **BYOK enforcement:** configuring `rime` without a saved Rime key rejects agent
  > registration with `PROVIDER_KEY_REQUIRED`. With your own key, that usage is billed
@@ -156,7 +157,7 @@ The model is part of the voice config, so it hot-reloads with it — `agent.upda
  voice: {
  provider: "cartesia",
  voice_id: "a0e99841-438c-4a64-b679-ae501e7d6091",
- model: "sonic-3",
+ model: "sonic-3.5", // latest; also "sonic-3" / "sonic-latest"
  speed: 1.0,
  volume: 1.0,
  emotion: null,
@@ -168,7 +169,7 @@ Shortcut: `"cartesia/yumiko"`
 
  **Tuning notes:**
 
- - `model: "sonic-3"` — fastest Cartesia model, designed for streaming
+ - `model: "sonic-3.5"` — latest/fastest Cartesia model (sub-90ms, 42 languages), designed for streaming. `sonic-3` and `sonic-latest` also available.
  - `emotion` accepts named emotion presets (check Cartesia docs for the current list)
 
  ## AWS Polly
@@ -204,6 +205,22 @@ voice: {
 
  Shortcut: `"rime/cove"`
 
+ ## Soniox (BYOK)
+
+ Real-time TTS in 60+ languages. One Soniox key serves **both** Soniox TTS and STT.
+ Requires your own Soniox key.
+
+ ```typescript
+ voice: {
+ provider: "soniox",
+ voice_id: "Adrian", // Soniox voice name
+ model: "tts-rt-v1",
+ language: "en",
+ }
+ ```
+
+ Shortcut: `"soniox/Adrian"`
+
  ## Which to choose
 
  | Provider | Best for | Trade-off |
@@ -212,8 +229,9 @@ Shortcut: `"rime/cove"`
  | **Cartesia** | Real-time streaming, low latency | Smaller voice library |
  | **Polly** | Cheap IVR, simple flows | Less natural |
  | **Rime** | Ultra-natural expressive English | BYOK only; English-focused |
+ | **Soniox** | Multilingual (60+), single-vendor with Soniox STT | BYOK only |
 
- For most agents, start with ElevenLabs (`eleven_flash_v2_5`) or Cartesia (`sonic-3`). Use Polly only for high-volume, low-engagement flows.
+ For most agents, start with ElevenLabs (`eleven_flash_v2_5`) or Cartesia (`sonic-3.5`). Use Polly only for high-volume, low-engagement flows.
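Because one vendor key covers both of that vendor's services, a single-vendor BYOK stack needs only one saved key. The following is a minimal sketch of listing the distinct keys a stack would need, assuming the BYOK-only lists from the 0.1.9 tables in this diff. `keysToSave` and `byokOnly` are hypothetical names, and the live `GET /api/rates/models` response remains the authoritative source.

```typescript
type Service = "stt" | "tts" | "llm";

// BYOK-only providers per service, as listed in the 0.1.9 tables of this diff.
const byokOnly: Record<Service, string[]> = {
  stt: ["assemblyai", "soniox"],
  tts: ["rime", "soniox"],
  llm: ["xai", "groq", "cerebras", "deepseek", "openrouter"],
};

// Return the distinct vendors whose keys must be saved for a given stack.
// A vendor used for two services (for example Soniox STT + TTS) appears once.
function keysToSave(stack: Record<Service, string>): string[] {
  const vendors: string[] = [];
  const services: Service[] = ["stt", "tts", "llm"];
  for (const service of services) {
    const provider = stack[service];
    const isByok = byokOnly[service].indexOf(provider) !== -1;
    if (isByok && vendors.indexOf(provider) === -1) {
      vendors.push(provider);
    }
  }
  return vendors;
}

// Soniox for both STT and TTS, managed OpenAI for the LLM: one key.
const sonioxStack = keysToSave({ stt: "soniox", tts: "soniox", llm: "openai" });
// Managed Deepgram STT, BYOK Rime TTS, BYOK Groq LLM: two keys.
const mixedStack = keysToSave({ stt: "deepgram", tts: "rime", llm: "groq" });
console.log(sonioxStack, mixedStack);
```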
 
  ## Hot-reloading voices