@trysaperly/voice-openclaw 0.1.0
- package/README.md +304 -0
- package/dist/index.d.ts +4 -0
- package/dist/index.js +11973 -0
- package/openclaw.plugin.json +67 -0
- package/package.json +66 -0
package/README.md
ADDED
@@ -0,0 +1,304 @@
# @trysaperly/voice-openclaw — give your openclaw agent a phone number

Give **your** openclaw agent a real phone number. People **call** it or **text**
it, and your agent answers — in its own context, with its tools and its memory.
The caller speaks; the transcribed turn is delivered to your agent as input; the
agent's reply is **spoken back over the phone**. A text gets a **text** back.

**Media never touches your machine or Saperly.** Telnyx holds the audio and does
all the speech-to-text and text-to-speech *in its own network*; this connector
only moves **text** in and **directives** out over one websocket. That is the
whole point — and it is why this is **not** the openclaw `voice-call` plugin (that
one streams raw call audio into its own gateway; this one never sees a single
audio frame).

```
☎ caller ──speech──▶ Telnyx (STT, in-network) ──▶ Saperly
                                                     │  one text websocket
                                                     ▼
                                @trysaperly/voice-openclaw (this connector)
                                                     │  runs YOUR agent per turn
                                                     ▼
                                   YOUR openclaw agent (tools + memory)
                                                     │  spoken reply / directive
                                                     ▼
☎ caller ◀──speech── Telnyx (TTS, in-network) ◀── Saperly ◀── directive
```

## Contents

- [Prerequisites](#prerequisites)
- [Install](#install)
- [Configure](#configure)
- [Usage — Voice](#usage--voice)
- [Usage — SMS](#usage--sms)
- [Session routing (`sessionKey`)](#session-routing-sessionkey)
- [Connect / auth / reconnect](#connect--auth--reconnect)
- [Troubleshooting](#troubleshooting)
- [For maintainers](#for-maintainers)

## Prerequisites

- **An openclaw gateway** you already run (Node ≥ 22, openclaw ≥ 2026.6.8). This
  connector installs into it as a plugin.
- **A Saperly line in *manual* mode**, plus its `connectionId` and `manualSecret`
  — **or** an `sk_` API key (which auto-discovers your manual lines for you).

Set the line to manual mode in the **Saperly dashboard** (Connections → set mode
to *manual*), or via the API:

```bash
curl -X PATCH https://api.saperly.com/connections/{id} \
  -H "authorization: Bearer sk_…" \
  -H "content-type: application/json" \
  -d '{ "mode": "manual" }'
```

Saperly wires up the Telnyx side automatically — you don't provision anything on
Telnyx. Once it's in manual mode, grab the connection's **`connectionId`** and
**`manualSecret`** from the dashboard (or skip both and use an `sk_` key).

## Install

Install the plugin into your gateway from npm, then enable it:

```bash
openclaw plugins install npm:@trysaperly/voice-openclaw
openclaw plugins enable saperly-voice
```

(`openclaw plugins install` also accepts `clawhub:@trysaperly/voice-openclaw` or
`git:github.com/<owner>/<repo>@<ref>` if you host it elsewhere. The plugin **id** is
`saperly-voice`; the npm **package** is `@trysaperly/voice-openclaw`.)

Then allow + configure it in your openclaw config (`openclaw.json5`):

```json5
{
  plugins: {
    enabled: true,
    allow: ["saperly-voice"],
    entries: {
      "saperly-voice": { enabled: true, config: { /* see Configure */ } },
    },
  },
}
```

On load it registers the `saperly_voice_reply` tool and opens one websocket per
bound line. The published package ships prebuilt JS (`dist/index.js`, declared as
`openclaw.runtimeExtensions`), so the gateway loads it without compiling TypeScript
or installing any plugin dependencies — `@saperly/voice-protocol` and `effect` are
bundled in.

> **After changing `openclaw.plugin.json`:** the plugin registry **caches** the
> config schema. If you add or change a config field (e.g. `sessionKey`), refresh
> it or the gateway rejects the "new" key:
> `openclaw plugins registry --refresh`.

**Monorepo / local development.** Working from a checkout of the Saperly repo
instead of an npm install? The package also declares
`"openclaw": { "extensions": ["./src/index.ts"] }`, so a workspace/git-checkout
gateway can load the channel straight from source (no build step). See
[For maintainers](#for-maintainers).

## Configure

You give the connector **either** an `sk_` key (it discovers your lines) **or**
one-or-more explicit `connectionId` + `manualSecret` pairs. There are three modes;
in all of them **environment variables win over the config file**, so you can keep
secrets out of `openclaw.json5`.

1. **`sk_`-key auto-discovery (recommended).** Hand it a Saperly `sk_` API key; it
   binds every **manual** line the key is scoped to (a single-line key → one line,
   a workspace key → all). Set `apiKey` / `SAPERLY_API_KEY`.
2. **Explicit many.** List the lines: `connections: [{ connectionId, manualSecret }, …]`.
3. **Explicit one.** A single `connectionId` + `manualSecret`.

Config-file form (`plugins.entries.saperly-voice.config`):

```json5
{
  "saperly-voice": {
    enabled: true,
    config: {
      baseUrl: "https://api.saperly.com", // no trailing /v2; ws/wss is derived
      // (1) auto-discovery — prefer the env var so the key stays out of the file:
      // apiKey: "sk_…",
      // (3) or bind one specific line:
      connectionId: "conn_123",
      // manualSecret: "mc_…", // prefer SAPERLY_MANUAL_SECRET
      client: "saperly-voice", // optional label for the wide-event trail
      // sessionKey: "saperly-voice:peer:{peer}", // optional — see Session routing
    },
  },
}
```

Equivalent environment variables (these override the file):

```
SAPERLY_BASE_URL=https://api.saperly.com
SAPERLY_API_KEY=sk_live_…                      # (1) auto-discovery — OR the pair below:
SAPERLY_CONNECTION_ID=conn_123                 # (3) explicit single line
SAPERLY_MANUAL_SECRET=mc_abc123
SAPERLY_CLIENT=saperly-voice                   # optional label
SAPERLY_SESSION_KEY=saperly-voice:peer:{peer}  # optional — see Session routing
```
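
The precedence rule above (environment variables win over the config file) and the three credential modes can be sketched as a small resolver. This is a hedged illustration, not the connector's actual source: `FileConfig`, `Binding` and `resolveBinding` are hypothetical names, and the order in which the three modes are tried is an assumption, since the README only states that the environment overrides the file.

```typescript
// Hypothetical shapes for illustration; only the field and env-var names
// (apiKey, connectionId, manualSecret, connections, SAPERLY_*) come from the README.
interface Connection {
  connectionId: string;
  manualSecret: string;
}

interface FileConfig {
  apiKey?: string;
  connectionId?: string;
  manualSecret?: string;
  connections?: Connection[];
}

type Binding =
  | { mode: "discovery"; apiKey: string }
  | { mode: "explicit"; connections: Connection[] };

// Prefer a non-empty environment value, otherwise fall back to the file value.
function pick(envValue: string | undefined, fileValue: string | undefined): string | undefined {
  return envValue !== undefined && envValue !== "" ? envValue : fileValue;
}

function resolveBinding(file: FileConfig, env: Record<string, string | undefined>): Binding {
  const apiKey = pick(env["SAPERLY_API_KEY"], file.apiKey);
  const connectionId = pick(env["SAPERLY_CONNECTION_ID"], file.connectionId);
  const manualSecret = pick(env["SAPERLY_MANUAL_SECRET"], file.manualSecret);

  // Assumed order: (1) sk_ key, then (2) explicit many, then (3) explicit one.
  if (apiKey) return { mode: "discovery", apiKey };
  if (file.connections && file.connections.length > 0) {
    return { mode: "explicit", connections: file.connections };
  }
  if (connectionId && manualSecret) {
    return { mode: "explicit", connections: [{ connectionId, manualSecret }] };
  }
  throw new Error("set an sk_ API key, or a connectionId + manualSecret pair");
}
```

With `connectionId: "conn_123"` in the file and `SAPERLY_MANUAL_SECRET` in the environment, this sketch yields a single explicit connection, which keeps the secret out of `openclaw.json5`.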

**`baseUrl` normalization.** Paste your dashboard URL verbatim. `https://…` →
`wss://…`, `http://localhost:8787` (or `http://host.docker.internal:8787`) →
`ws://…`, and a bare host defaults to `wss`. The websocket URL is derived as
`<base→ws(s)>/v2/manual/{connectionId}/ws`.
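
The normalization rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's own code: `deriveWsUrl` is a hypothetical name, and trimming trailing slashes and URL-encoding the connection id are choices made for this sketch.

```typescript
// Hypothetical helper illustrating the documented mapping:
// https -> wss, http -> ws, bare host -> wss, then /v2/manual/{connectionId}/ws.
function deriveWsUrl(baseUrl: string, connectionId: string): string {
  // Assumption: trailing slashes are dropped before the path is appended.
  const trimmed = baseUrl.trim().replace(/\/+$/, "");
  const match = /^(https?|wss?):\/\/(.+)$/i.exec(trimmed);
  // No recognised scheme means a bare host, which defaults to a secure socket.
  const scheme = match?.[1]?.toLowerCase() ?? "https";
  const rest = match?.[2] ?? trimmed;
  const wsScheme = scheme === "http" || scheme === "ws" ? "ws" : "wss";
  return `${wsScheme}://${rest}/v2/manual/${encodeURIComponent(connectionId)}/ws`;
}
```

For example, `deriveWsUrl("http://localhost:8787", "conn_123")` gives `ws://localhost:8787/v2/manual/conn_123/ws` in this sketch, while `https://api.saperly.com` maps to a `wss://` URL with the same path.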

**Which agent runs?** By default the channel's `main` agent answers, on its
configured default model. Override per channel with `agentId`, `provider`, and
`model` in the plugin config.

## Usage — Voice

Dial the number. Your agent gets the opening turn (it greets the caller), then each
thing the caller says arrives as a fresh turn on the **same session** for that call,
so the conversation has continuity — its tools and memory are all available.

The agent's reply is spoken back via Telnyx TTS. To do more than speak, the agent
calls the **`saperly_voice_reply`** tool (always echo the turn's `request_id`):

| `kind`                     | Effect                                           |
| -------------------------- | ------------------------------------------------ |
| `speak`                    | Say `text` via TTS, then keep listening.         |
| `speak` + `end_call: true` | Say a final line, then hang up.                  |
| `wait`                     | Listen without speaking (optional `timeout_ms`). |
| `hangup`                   | End the call (optional `reason`).                |
| `transfer`                 | Forward the call to `to` (E.164 or SIP URI).     |
| `send_dtmf`                | Play `digits`.                                   |
| `reject`                   | Decline an inbound call before it's answered.    |

Invalid args (e.g. `speak` with no `text`, an unknown `kind`) are rejected with a
correctable message and **never** reach the live call. Answer within **~18 s** or
the caller hears a short hold line (Saperly falls back at ~20 s; the connector
pre-empts it). A `call_ended` event closes the turn out and clears any pending work
for that call.
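
To make the directive table and the rejection rule concrete, here is a hedged sketch of validating `saperly_voice_reply` arguments in plain TypeScript. The field names (`kind`, `text`, `end_call`, `timeout_ms`, `reason`, `to`, `digits`, `request_id`) come from this README; the `Directive` union, the `validateDirective` function and the error wording are hypothetical, and the real package validates with the Effect Schema frames in `@saperly/voice-protocol` rather than with code like this.

```typescript
// Illustrative types only; the shipped schema is not shown in this README.
type Directive =
  | { kind: "speak"; request_id: string; text: string; end_call?: boolean }
  | { kind: "wait"; request_id: string; timeout_ms?: number }
  | { kind: "hangup"; request_id: string; reason?: string }
  | { kind: "transfer"; request_id: string; to: string }
  | { kind: "send_dtmf"; request_id: string; digits: string }
  | { kind: "reject"; request_id: string };

type Validation =
  | { ok: true; directive: Directive }
  | { ok: false; message: string };

const isNonEmptyString = (v: unknown): v is string =>
  typeof v === "string" && v.length > 0;

function validateDirective(args: Record<string, unknown>): Validation {
  const request_id = args["request_id"];
  if (!isNonEmptyString(request_id)) {
    return { ok: false, message: "request_id is required: echo the turn's request_id" };
  }
  const fail = (message: string): Validation => ({ ok: false, message });
  const pass = (directive: Directive): Validation => ({ ok: true, directive });

  switch (args["kind"]) {
    case "speak": {
      const text = args["text"];
      if (!isNonEmptyString(text)) return fail("speak requires a non-empty text");
      return pass(
        args["end_call"] === true
          ? { kind: "speak", request_id, text, end_call: true }
          : { kind: "speak", request_id, text },
      );
    }
    case "wait": {
      const timeout_ms = args["timeout_ms"];
      if (timeout_ms === undefined) return pass({ kind: "wait", request_id });
      if (typeof timeout_ms !== "number" || timeout_ms <= 0) {
        return fail("wait.timeout_ms must be a positive number");
      }
      return pass({ kind: "wait", request_id, timeout_ms });
    }
    case "hangup": {
      const reason = args["reason"];
      return pass(
        typeof reason === "string"
          ? { kind: "hangup", request_id, reason }
          : { kind: "hangup", request_id },
      );
    }
    case "transfer": {
      const to = args["to"];
      if (!isNonEmptyString(to)) return fail("transfer requires to (E.164 or SIP URI)");
      return pass({ kind: "transfer", request_id, to });
    }
    case "send_dtmf": {
      const digits = args["digits"];
      if (!isNonEmptyString(digits)) return fail("send_dtmf requires digits");
      return pass({ kind: "send_dtmf", request_id, digits });
    }
    case "reject":
      return pass({ kind: "reject", request_id });
    default:
      return fail(`unknown kind: ${String(args["kind"])}`);
  }
}
```

Returning a message instead of throwing mirrors the behaviour described above: the agent gets something it can correct on the next attempt, and a malformed directive is never forwarded to the call.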

The agent can also **place** an outbound call: address the target as
`saperly-voice:+15551234567` and Saperly originates the call, routing the answered
leg back to your agent over the same socket.

## Usage — SMS

The **same** number receives texts. An inbound SMS is delivered to your agent as a
text turn, and the agent's reply is sent back as an **outbound SMS** — there is no
call leg, so only the reply text matters (call-control directives are a no-op for
SMS).

> **Carrier reality:** US long-code numbers do **not** receive *international*
> inbound SMS. To test texting, send from a **US** number.

## Session routing (`sessionKey`)

By default, **each conversation gets its own openclaw session** — one call is one
session, one SMS thread is another. You control this with the optional `sessionKey`
**template** (`sessionKey` in plugin config, or `SAPERLY_SESSION_KEY`). Placeholders
the connector fills:

| Placeholder        | Filled with                                                           |
| ------------------ | --------------------------------------------------------------------- |
| `{conversationId}` | the call's id (voice) or the SMS thread key (`sms:{numberId}:{peer}`) |
| `{peer}`           | the other party's number (the caller / texter)                        |
| `{line}`           | your line's number                                                    |
| `{connectionId}`   | the bound connection id                                               |

| `sessionKey`                | Routing                                                        |
| --------------------------- | -------------------------------------------------------------- |
| *(unset — default)*         | `saperly-voice:{conversationId}` — every conversation isolated |
| `saperly-voice:main`        | one shared "main chat" — everything lands in one session       |
| `saperly-voice:peer:{peer}` | one session **per person** — their calls + texts share memory  |
| `saperly-voice:line:{line}` | one session **per line**                                       |

**Trade-off.** A fixed or shared key (`…:main`, or `…:line:{line}` on a busy line)
merges every caller into one context **and interleaves concurrent turns** — perfect
for a single personal agent, wrong for a multi-tenant deployment. There, prefer the
default (per-conversation) or `…:peer:{peer}`.

**Concurrent calls.** An inbound call is admitted by a brief, fail-closed accept gate
(~8 s): the agent must answer the `inbound_call` (a `speak` opener or `wait`) before the
line is picked up, or the call is declined. With the default (per-conversation) key,
each call runs in its **own** session, so simultaneous calls are admitted independently.
A shared/fixed key routes every call through **one** session — if your runtime serializes
same-session runs, a second call arriving while the first is mid-turn can miss the accept
window and be **declined**. For a line that takes overlapping calls, keep the default or
use `…:peer:{peer}`.

In **explicit-many** mode you can give one line its own routing: add a per-entry
`sessionKey` inside its `connections[]` object, and it overrides the top-level one.

> An absent `{peer}`/`{line}` falls back to the conversation id, so a missing value
> always **splits** sessions rather than silently merging two distinct conversations
> into one. (For voice, `{peer}`/`{line}` are remembered from the opening call frame
> and reused for every later turn of that call.)
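
The template expansion described in this section, including the fallback for an absent `{peer}` or `{line}`, can be sketched as follows. `TurnContext` and `renderSessionKey` are hypothetical names for illustration; only the placeholder names, the default template and the fallback rule are taken from the text above.

```typescript
// Hypothetical per-turn context; the real frame shape is not shown in this README.
interface TurnContext {
  conversationId: string;
  connectionId: string;
  peer?: string;
  line?: string;
}

// The documented default: every conversation is isolated.
const DEFAULT_SESSION_KEY = "saperly-voice:{conversationId}";

function renderSessionKey(template: string | undefined, ctx: TurnContext): string {
  const values: Record<string, string> = {
    conversationId: ctx.conversationId,
    connectionId: ctx.connectionId,
    // A missing peer or line falls back to the conversation id, so sessions
    // split rather than merge when a value is absent.
    peer: ctx.peer ?? ctx.conversationId,
    line: ctx.line ?? ctx.conversationId,
  };
  return (template ?? DEFAULT_SESSION_KEY).replace(
    /\{(conversationId|connectionId|peer|line)\}/g,
    (_whole, name: string) => values[name] ?? ctx.conversationId,
  );
}
```

With the template `saperly-voice:peer:{peer}`, a call from `+15551234567` maps to `saperly-voice:peer:+15551234567` in this sketch, and a turn with no known peer maps to a key built from its own conversation id.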

## Connect / auth / reconnect

- **Connect.** The connector dials **out** to
  `GET <baseUrl>/v2/manual/<connectionId>/ws` (it needs no public URL of its own).
  One long-lived socket per bound line multiplexes every live call and text; each
  turn carries a `request_id` your reply echoes so the right call gets the directive.
- **Auth.** The connection's `manualSecret`, presented as the websocket subprotocol
  `Sec-WebSocket-Protocol: bearer.<secret>` (the browser-compatible escape hatch).
  Under Node's `ws`, a plain `Authorization: Bearer <secret>` header also works.
- **Reconnect.** The socket is long-lived. On any drop — including `1012` (a newer
  socket superseded this one) — it reconnects with jittered backoff (1 s → 30 s) and
  re-sends `hello`. In-flight turns are released so nothing leaks.
- **Bad frames.** Every inbound frame is schema-decoded; an undecodable one is
  dropped (never crashes the connector), and an advisory `error` frame is logged so
  the next reply can be corrected.
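
The reconnect bullet above mentions jittered backoff between 1 s and 30 s. One common way to get that behaviour is sketched below; the doubling schedule and the jitter formula are assumptions made for illustration (the README states only the bounds), and `backoffDelayMs` is a hypothetical name.

```typescript
const BASE_MS = 1000; // documented lower bound: 1 s
const CAP_MS = 30000; // documented upper bound: 30 s

// attempt is 0 for the first retry. The random source is injectable so the
// schedule can be tested deterministically.
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  // Assumed schedule: the ceiling doubles per attempt until it reaches the cap.
  const ceiling = Math.min(CAP_MS, BASE_MS * 2 ** Math.max(0, attempt));
  // Assumed jitter: pick a delay uniformly between the base and the ceiling,
  // so every delay stays inside the documented 1 s to 30 s range.
  return Math.round(BASE_MS + random() * (ceiling - BASE_MS));
}
```

Spreading retries this way keeps several connectors that dropped at the same moment from reconnecting in lockstep.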

## Troubleshooting

- **Not connecting / agent never answers.** Check `baseUrl` (no trailing `/v2`; it
  derives `ws/wss`), the `connectionId`/`manualSecret` pair (or the `sk_` key), and
  that **the line is in manual mode** — a non-manual connection has no `manualSecret`
  and the upgrade fails closed.
- **`dropping undecodable frame` in the logs.** The connector bundle is older than
  Saperly's current frame protocol. **Rebuild** it (see [For maintainers](#for-maintainers))
  — a stale bundle silently drops frames it can't decode.
- **Socket closes with `1012`.** A newer connection superseded this one (e.g. you
  started a second instance on the same line). The connector re-binds automatically;
  if it flaps, make sure only one instance binds each line.
- **No inbound texts.** US long codes don't receive **international** SMS — text from
  a US number to test (see [Usage — SMS](#usage--sms)).

## For maintainers

```bash
cd connectors/openclaw
bun install
bun run typecheck   # tsc --noEmit, strict (exactOptionalPropertyTypes, no any)
bun test            # protocol / bridge / config / channel unit tests
bun run build       # tsdown → dist/index.js + dist/index.d.ts (prebuilt runtime entry)
```

- **`@saperly/voice-protocol`** is the shared websocket frame contract (the same
  Effect Schema frames + directive validation Saperly's server uses — one source of
  truth, no mirror to drift). This connector is **standalone** (its own `bun.lock`,
  not in the monorepo workspace), so it depends on the package by `file:` path
  (`"@saperly/voice-protocol": "file:../../packages/voice-protocol"`) for local dev
  and tests. **It is bundled into `dist/index.js` at build time** (along with
  `effect`), so it does *not* need to be published separately — both are `devDependencies`
  here, and the published tarball is self-contained.
- **Bundle model.** `bun run build` runs **tsdown** (`tsdown.config.ts`): it bundles
  the entry + `@saperly/voice-protocol` + `effect` into ESM `dist/index.js` and emits
  `dist/index.d.ts`, keeping only the host-supplied openclaw SDK (`openclaw/plugin-sdk/*`)
  external. Ambient type shims (`src/openclaw-sdk.d.ts`) let the connector type-check
  standalone; the rest of the host API is read structurally off `api`.
- **Dual entry (`extensions` vs `runtimeExtensions`).** `package.json`'s `openclaw`
  block declares both `extensions: ["./src/index.ts"]` (the source entry a
  workspace/git-checkout gateway loads directly) and `runtimeExtensions:
  ["./dist/index.js"]` (the prebuilt JS an **installed npm package** loads, so the
  gateway never compiles TypeScript at runtime). `files` ships `dist` +
  `openclaw.plugin.json` + `README.md`; `prepublishOnly` rebuilds `dist` before pack.
- **Publish.** From `connectors/openclaw`: `npm pack --dry-run` to inspect the tarball,
  then `npm publish` (the package is public; `prepublishOnly` runs the build). Bump
  `version` first. `@saperly/voice-protocol` does **not** need publishing — it's bundled.
- **Registry caches `configSchema`.** After editing `openclaw.plugin.json`, run
  `openclaw plugins registry --refresh` or the gateway boots on the stale schema.

package/dist/index.d.ts
ADDED