@trysaperly/voice-openclaw 0.1.0

package/README.md ADDED
@@ -0,0 +1,304 @@
1
+ # @trysaperly/voice-openclaw — give your openclaw agent a phone number
2
+
3
+ Give **your** openclaw agent a real phone number. People **call** it or **text**
4
+ it, and your agent answers — in its own context, with its tools and its memory.
5
+ The caller speaks; the transcribed turn is delivered to your agent as input; the
6
+ agent's reply is **spoken back over the phone**. A text gets a **text** back.
7
+
8
+ **Media never touches your machine or Saperly.** Telnyx holds the audio and does
9
+ all the speech-to-text and text-to-speech *in its own network*; this connector
10
+ only moves **text** in and **directives** out over one websocket. That is the
11
+ whole point — and it is why this is **not** the openclaw `voice-call` plugin (that
12
+ one streams raw call audio into its own gateway; this one never sees a single
13
+ audio frame).
14
+
15
+ ```
16
+ ☎ caller ──speech──▶ Telnyx (STT, in-network) ──▶ Saperly
17
+ │ one text websocket
18
+ ▼
19
+ @trysaperly/voice-openclaw (this connector)
20
+ │ runs YOUR agent per turn
21
+ ▼
22
+ YOUR openclaw agent (tools + memory)
23
+ │ spoken reply / directive
24
+ ▼
25
+ ☎ caller ◀──speech── Telnyx (TTS, in-network) ◀── Saperly ◀── directive
26
+ ```
27
+
28
+ ## Contents
29
+
30
+ - [Prerequisites](#prerequisites)
31
+ - [Install](#install)
32
+ - [Configure](#configure)
33
+ - [Usage — Voice](#usage--voice)
34
+ - [Usage — SMS](#usage--sms)
35
+ - [Session routing (`sessionKey`)](#session-routing-sessionkey)
36
+ - [Connect / auth / reconnect](#connect--auth--reconnect)
37
+ - [Troubleshooting](#troubleshooting)
38
+ - [For maintainers](#for-maintainers)
39
+
40
+ ## Prerequisites
41
+
42
+ - **An openclaw gateway** you already run (Node ≥ 22, openclaw ≥ 2026.6.8). This
43
+ connector installs into it as a plugin.
44
+ - **A Saperly line in *manual* mode**, plus its `connectionId` and `manualSecret`
45
+ — **or** an `sk_` API key (which auto-discovers your manual lines for you).
46
+
47
+ Set the line to manual mode in the **Saperly dashboard** (Connections → set mode
48
+ to *manual*), or via the API:
49
+
50
+ ```bash
51
+ curl -X PATCH https://api.saperly.com/connections/{id} \
52
+ -H "authorization: Bearer sk_…" \
53
+ -H "content-type: application/json" \
54
+ -d '{ "mode": "manual" }'
55
+ ```
56
+
57
+ Saperly wires up the Telnyx side automatically — you don't provision anything on
58
+ Telnyx. Once it's in manual mode, grab the connection's **`connectionId`** and
59
+ **`manualSecret`** from the dashboard (or skip both and use an `sk_` key).
60
+
61
+ ## Install
62
+
63
+ Install the plugin into your gateway from npm, then enable it:
64
+
65
+ ```bash
66
+ openclaw plugins install npm:@trysaperly/voice-openclaw
67
+ openclaw plugins enable saperly-voice
68
+ ```
69
+
70
+ (`openclaw plugins install` also accepts `clawhub:@trysaperly/voice-openclaw` or
71
+ `git:github.com/<owner>/<repo>@<ref>` if you host it elsewhere. The plugin **id** is
72
+ `saperly-voice`; the npm **package** is `@trysaperly/voice-openclaw`.)
73
+
74
+ Then allow + configure it in your openclaw config (`openclaw.json5`):
75
+
76
+ ```json5
77
+ {
78
+ plugins: {
79
+ enabled: true,
80
+ allow: ["saperly-voice"],
81
+ entries: {
82
+ "saperly-voice": { enabled: true, config: { /* see Configure */ } },
83
+ },
84
+ },
85
+ }
86
+ ```
87
+
88
+ On load it registers the `saperly_voice_reply` tool and opens one websocket per
89
+ bound line. The published package ships prebuilt JS (`dist/index.js`, declared as
90
+ `openclaw.runtimeExtensions`), so the gateway loads it without compiling TypeScript
91
+ or installing any plugin dependencies — `@saperly/voice-protocol` and `effect` are
92
+ bundled in.
93
+
94
+ > **After changing `openclaw.plugin.json`:** the plugin registry **caches** the
95
+ > config schema. If you add or change a config field (e.g. `sessionKey`), refresh
96
+ > it or the gateway rejects the "new" key:
97
+ > `openclaw plugins registry --refresh`.
98
+
99
+ **Monorepo / local development.** Working from a checkout of the Saperly repo
100
+ instead of an npm install? The package also declares
101
+ `"openclaw": { "extensions": ["./src/index.ts"] }`, so a workspace/git-checkout
102
+ gateway can load the channel straight from source (no build step). See
103
+ [For maintainers](#for-maintainers).
104
+
105
+ ## Configure
106
+
107
+ You give the connector **either** an `sk_` key (it discovers your lines) **or**
108
+ one-or-more explicit `connectionId` + `manualSecret` pairs. There are three modes;
109
+ in all of them **environment variables win over the config file**, so you can keep
110
+ secrets out of `openclaw.json5`.
111
+
112
+ 1. **`sk_`-key auto-discovery (recommended).** Hand it a Saperly `sk_` API key; it
113
+ binds every **manual** line the key is scoped to (a single-line key → one line,
114
+ a workspace key → all). Set `apiKey` / `SAPERLY_API_KEY`.
115
+ 2. **Explicit many.** List the lines: `connections: [{ connectionId, manualSecret }, …]`.
116
+ 3. **Explicit one.** A single `connectionId` + `manualSecret`.
117
+
118
+ Config-file form (`plugins.entries.saperly-voice.config`):
119
+
120
+ ```json5
121
+ {
122
+ "saperly-voice": {
123
+ enabled: true,
124
+ config: {
125
+ baseUrl: "https://api.saperly.com", // no trailing /v2; ws/wss is derived
126
+ // (1) auto-discovery — prefer the env var so the key stays out of the file:
127
+ // apiKey: "sk_…",
128
+ // (3) or bind one specific line:
129
+ connectionId: "conn_123",
130
+ // manualSecret: "mc_…", // prefer SAPERLY_MANUAL_SECRET
131
+ client: "saperly-voice", // optional label for the wide-event trail
132
+ // sessionKey: "saperly-voice:peer:{peer}", // optional — see Session routing
133
+ },
134
+ },
135
+ }
136
+ ```
137
+
138
+ Equivalent environment variables (these override the file):
139
+
140
+ ```
141
+ SAPERLY_BASE_URL=https://api.saperly.com
142
+ SAPERLY_API_KEY=sk_live_… # (1) auto-discovery — OR the pair below:
143
+ SAPERLY_CONNECTION_ID=conn_123 # (3) explicit single line
144
+ SAPERLY_MANUAL_SECRET=mc_abc123
145
+ SAPERLY_CLIENT=saperly-voice # optional label
146
+ SAPERLY_SESSION_KEY=saperly-voice:peer:{peer} # optional — see Session routing
147
+ ```
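The "environment variables win over the config file" rule can be sketched as a small merge step. This is an illustration only, not the connector's actual code: `resolveSaperlySettings`, `SaperlySettings`, and `EnvLike` are hypothetical names, and treating an empty environment variable as unset is an assumption.

```typescript
// Hypothetical sketch: merge plugin config with environment overrides.
// Names and the empty-string handling are assumptions, not the connector's API.
interface SaperlySettings {
  baseUrl: string | undefined;
  apiKey: string | undefined;
  connectionId: string | undefined;
  manualSecret: string | undefined;
  client: string | undefined;
  sessionKey: string | undefined;
}

type EnvLike = { [name: string]: string | undefined };

function resolveSaperlySettings(file: Partial<SaperlySettings>, env: EnvLike): SaperlySettings {
  // An environment variable, when set and non-empty, overrides the file value.
  const pick = (envName: string, fileValue: string | undefined): string | undefined => {
    const fromEnv = env[envName];
    return fromEnv !== undefined && fromEnv !== "" ? fromEnv : fileValue;
  };
  return {
    baseUrl: pick("SAPERLY_BASE_URL", file.baseUrl),
    apiKey: pick("SAPERLY_API_KEY", file.apiKey),
    connectionId: pick("SAPERLY_CONNECTION_ID", file.connectionId),
    manualSecret: pick("SAPERLY_MANUAL_SECRET", file.manualSecret),
    client: pick("SAPERLY_CLIENT", file.client),
    sessionKey: pick("SAPERLY_SESSION_KEY", file.sessionKey),
  };
}
```

In a gateway process you would pass `process.env` as the second argument. Keeping `manualSecret` and `apiKey` only in the environment means the file can be committed without secrets.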
148
+
149
+ **`baseUrl` normalization.** Paste your dashboard URL verbatim. `https://…` →
150
+ `wss://…`, `http://localhost:8787` (or `http://host.docker.internal:8787`) →
151
+ `ws://…`, and a bare host defaults to `wss`. The websocket URL is derived as
152
+ `<base→ws(s)>/v2/manual/{connectionId}/ws`.
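As a rough sketch of the normalization described above, the derivation might look like the following. `deriveWsUrl` is a hypothetical helper written for illustration; passing an existing `ws://`/`wss://` URL through unchanged and stripping trailing slashes are assumptions, not documented behavior.

```typescript
// Hypothetical helper mirroring the baseUrl rules in this README.
function deriveWsUrl(baseUrl: string, connectionId: string): string {
  // Drop trailing slashes so the path joins cleanly (assumption).
  const trimmed = baseUrl.trim().replace(/\/+$/, "");
  let wsBase: string;
  if (/^https:\/\//i.test(trimmed)) {
    wsBase = "wss://" + trimmed.slice("https://".length); // https -> wss
  } else if (/^http:\/\//i.test(trimmed)) {
    wsBase = "ws://" + trimmed.slice("http://".length); // http -> ws
  } else if (/^wss?:\/\//i.test(trimmed)) {
    wsBase = trimmed; // already a websocket URL (assumption: left as-is)
  } else {
    wsBase = "wss://" + trimmed; // bare host defaults to wss
  }
  return wsBase + "/v2/manual/" + encodeURIComponent(connectionId) + "/ws";
}
```

For example, `deriveWsUrl("https://api.saperly.com", "conn_123")` yields `wss://api.saperly.com/v2/manual/conn_123/ws`.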
153
+
154
+ **Which agent runs?** By default the channel's `main` agent answers, on its
155
+ configured default model. Override per channel with `agentId`, `provider`, and
156
+ `model` in the plugin config.
157
+
158
+ ## Usage — Voice
159
+
160
+ Dial the number. Your agent gets the opening turn (it greets the caller), then each
161
+ thing the caller says arrives as a fresh turn on the **same session** for that call,
162
+ so the conversation has continuity — its tools and memory are all available.
163
+
164
+ The agent's reply is spoken back via Telnyx TTS. To do more than speak, the agent
165
+ calls the **`saperly_voice_reply`** tool (always echo the turn's `request_id`):
166
+
167
+ | `kind` | Effect |
168
+ | ------------- | ---------------------------------------------------------------- |
169
+ | `speak` | Say `text` via TTS, then keep listening. |
170
+ | `speak` + `end_call: true` | Say a final line, then hang up. |
171
+ | `wait` | Listen without speaking (optional `timeout_ms`). |
172
+ | `hangup` | End the call (optional `reason`). |
173
+ | `transfer` | Forward the call to `to` (E.164 or SIP URI). |
174
+ | `send_dtmf` | Play `digits`. |
175
+ | `reject` | Decline an inbound call before it's answered. |
176
+
177
+ Invalid args (e.g. `speak` with no `text`, an unknown `kind`) are rejected with a
178
+ correctable message and **never** reach the live call. Answer within **~18 s** or
179
+ the caller hears a short hold line (Saperly falls back at ~20 s; the connector
180
+ pre-empts it). A `call_ended` event closes the turn out and clears any pending work
181
+ for that call.
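To make the "rejected with a correctable message" behavior concrete, here is a minimal validation sketch over the argument names in the table above. It is not the plugin's real validator (that lives in `@saperly/voice-protocol`); `checkVoiceReply`, `ReplyCheckResult`, and the message wording are hypothetical, and the required-field rules for `transfer` and `send_dtmf` are inferred from the table.

```typescript
// Hypothetical validator for `saperly_voice_reply` arguments; illustrative only.
interface ReplyCheckResult {
  ok: boolean;
  message: string; // empty when ok
}

const REPLY_KINDS: string[] = ["speak", "wait", "hangup", "transfer", "send_dtmf", "reject"];

function isNonEmptyString(value: unknown): boolean {
  return typeof value === "string" && value.trim().length > 0;
}

function checkVoiceReply(args: { [key: string]: unknown }): ReplyCheckResult {
  const fail = (message: string): ReplyCheckResult => ({ ok: false, message: message });
  // Every reply echoes the request_id of the turn it answers.
  if (!isNonEmptyString(args["request_id"])) {
    return fail("request_id is required: echo the request_id of the turn you are answering");
  }
  const kind = args["kind"];
  if (typeof kind !== "string" || REPLY_KINDS.indexOf(kind) === -1) {
    return fail("unknown kind; expected one of: " + REPLY_KINDS.join(", "));
  }
  if (kind === "speak" && !isNonEmptyString(args["text"])) {
    return fail("speak requires a non-empty text");
  }
  if (kind === "transfer" && !isNonEmptyString(args["to"])) {
    return fail("transfer requires to (E.164 number or SIP URI)");
  }
  if (kind === "send_dtmf" && !isNonEmptyString(args["digits"])) {
    return fail("send_dtmf requires digits");
  }
  return { ok: true, message: "" };
}
```

The point of returning a message rather than throwing is that the agent can read it and retry with corrected arguments while the live call is untouched.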
182
+
183
+ The agent can also **place** an outbound call: address the target as
184
+ `saperly-voice:+15551234567` and Saperly originates the call, routing the answered
185
+ leg back to your agent over the same socket.
186
+
187
+ ## Usage — SMS
188
+
189
+ The **same** number receives texts. An inbound SMS is delivered to your agent as a
190
+ text turn, and the agent's reply is sent back as an **outbound SMS** — there is no
191
+ call leg, so only the reply text matters (call-control directives are a no-op for
192
+ SMS).
193
+
194
+ > **Carrier reality:** US long-code numbers do **not** receive *international*
195
+ > inbound SMS. To test texting, send from a **US** number.
196
+
197
+ ## Session routing (`sessionKey`)
198
+
199
+ By default, **each conversation gets its own openclaw session** — one call is one
200
+ session, one SMS thread is another. You control this with the optional `sessionKey`
201
+ **template** (`sessionKey` in plugin config, or `SAPERLY_SESSION_KEY`). Placeholders
202
+ the connector fills:
203
+
204
+ | Placeholder | Filled with |
205
+ | ------------------ | ---------------------------------------------------------------- |
206
+ | `{conversationId}` | the call's id (voice) or the SMS thread key (`sms:{numberId}:{peer}`) |
207
+ | `{peer}` | the other party's number (the caller / texter) |
208
+ | `{line}` | your line's number |
209
+ | `{connectionId}` | the bound connection id |
210
+
211
+ | `sessionKey` | Routing |
212
+ | ----------------------------- | ----------------------------------------------------------- |
213
+ | *(unset — default)* | `saperly-voice:{conversationId}` — every conversation isolated |
214
+ | `saperly-voice:main` | one shared "main chat" — everything lands in one session |
215
+ | `saperly-voice:peer:{peer}` | one session **per person** — their calls + texts share memory |
216
+ | `saperly-voice:line:{line}` | one session **per line** |
217
+
218
+ **Trade-off.** A fixed or shared key (`…:main`, or `…:line:{line}` on a busy line)
219
+ merges every caller into one context **and interleaves concurrent turns** — perfect
220
+ for a single personal agent, wrong for a multi-tenant deployment. There, prefer the
221
+ default (per-conversation) or `…:peer:{peer}`.
222
+
223
+ **Concurrent calls.** An inbound call is admitted by a brief, fail-closed accept gate
224
+ (~8 s): the agent must answer the `inbound_call` (a `speak` opener or `wait`) before the
225
+ line is picked up, or the call is declined. With the default (per-conversation) key,
226
+ each call runs in its **own** session, so simultaneous calls are admitted independently.
227
+ A shared/fixed key routes every call through **one** session — if your runtime serializes
228
+ same-session runs, a second call arriving while the first is mid-turn can miss the accept
229
+ window and be **declined**. For a line that takes overlapping calls, keep the default or
230
+ use `…:peer:{peer}`.
231
+
232
+ In **explicit-many** mode you can give one line its own routing: add a per-entry
233
+ `sessionKey` inside its `connections[]` object, and it overrides the top-level one.
234
+
235
+ > An absent `{peer}`/`{line}` falls back to the conversation id, so a missing value
236
+ > always **splits** sessions rather than silently merging two distinct conversations
237
+ > into one. (For voice, `{peer}`/`{line}` are remembered from the opening call frame
238
+ > and reused for every later turn of that call.)
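The template rules in this section can be sketched in a few lines. This is an illustrative reconstruction, not the connector's source: `renderSessionKey` and `TurnContext` are hypothetical names, and leaving unknown placeholders untouched is an assumption.

```typescript
// Hypothetical sketch of sessionKey template expansion.
interface TurnContext {
  conversationId: string; // call id, or the SMS thread key
  connectionId: string;
  peer?: string; // the caller / texter, when known
  line?: string; // your line's number, when known
}

const DEFAULT_SESSION_KEY = "saperly-voice:{conversationId}";

function renderSessionKey(template: string | undefined, ctx: TurnContext): string {
  // A missing peer/line falls back to the conversation id, so sessions split
  // rather than merge when a value is absent.
  const values: { [name: string]: string } = {
    conversationId: ctx.conversationId,
    connectionId: ctx.connectionId,
    peer: ctx.peer || ctx.conversationId,
    line: ctx.line || ctx.conversationId,
  };
  return (template || DEFAULT_SESSION_KEY).replace(/\{(\w+)\}/g, (match: string, name: string): string => {
    const known = Object.prototype.hasOwnProperty.call(values, name);
    const value = values[name];
    // Unknown placeholders are left as-is (assumption).
    return known && typeof value === "string" ? value : match;
  });
}
```

With `saperly-voice:peer:{peer}`, a call and a text from `+15551234567` both map to `saperly-voice:peer:+15551234567`, which is what lets them share memory.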
239
+
240
+ ## Connect / auth / reconnect
241
+
242
+ - **Connect.** The connector dials **out** to
243
+ `GET <baseUrl>/v2/manual/<connectionId>/ws` (it needs no public URL of its own).
244
+ One long-lived socket per bound line multiplexes every live call and text; each
245
+ turn carries a `request_id` your reply echoes so the right call gets the directive.
246
+ - **Auth.** The connection's `manualSecret`, presented as the websocket subprotocol
247
+ `Sec-WebSocket-Protocol: bearer.<secret>` (the browser-compatible escape hatch).
248
+ Under Node's `ws`, a plain `Authorization: Bearer <secret>` header also works.
249
+ - **Reconnect.** The socket is long-lived. On any drop — including `1012` (a newer
250
+ socket superseded this one) — it reconnects with jittered backoff (1 s → 30 s) and
251
+ re-sends `hello`. In-flight turns are released so nothing leaks.
252
+ - **Bad frames.** Every inbound frame is schema-decoded; an undecodable one is
253
+ dropped (never crashes the connector), and an advisory `error` frame is logged so
254
+ the next reply can be corrected.
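For readers who want a feel for the reconnect timing, here is one way a jittered backoff between 1 s and 30 s could be computed. The README does not specify the jitter strategy, so the doubling schedule and the ±20% spread are assumptions, and `reconnectDelayMs` is a hypothetical name.

```typescript
// Hypothetical sketch: exponential backoff with jitter, clamped to 1 s .. 30 s.
const BACKOFF_BASE_MS = 1000;
const BACKOFF_CAP_MS = 30000;

function reconnectDelayMs(attempt: number, random: () => number = Math.random): number {
  // Double per attempt (assumption), capped at 30 s.
  const exponential = Math.min(BACKOFF_CAP_MS, BACKOFF_BASE_MS * Math.pow(2, Math.max(0, attempt)));
  // Spread by a factor in [0.8, 1.2) so reconnecting clients do not all retry at once.
  const jittered = exponential * (0.8 + 0.4 * random());
  // Keep the result inside the documented 1 s .. 30 s window.
  return Math.round(Math.min(BACKOFF_CAP_MS, Math.max(BACKOFF_BASE_MS, jittered)));
}
```

The `random` parameter exists only so the schedule can be tested deterministically; a caller would normally omit it and reset `attempt` to 0 once `hello` succeeds.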
255
+
256
+ ## Troubleshooting
257
+
258
+ - **Not connecting / agent never answers.** Check `baseUrl` (no trailing `/v2`; it
259
+ derives `ws/wss`), the `connectionId`/`manualSecret` pair (or the `sk_` key), and
260
+ that **the line is in manual mode** — a non-manual connection has no `manualSecret`
261
+ and the upgrade fails closed.
262
+ - **`dropping undecodable frame` in the logs.** The connector bundle is older than
263
+ Saperly's current frame protocol. **Rebuild** it (see [For maintainers](#for-maintainers))
264
+ — a stale bundle silently drops frames it can't decode.
265
+ - **Socket closes with `1012`.** A newer connection superseded this one (e.g. you
266
+ started a second instance on the same line). The connector re-binds automatically;
267
+ if it flaps, make sure only one instance binds each line.
268
+ - **No inbound texts.** US long codes don't receive **international** SMS — text from
269
+ a US number to test (see [Usage — SMS](#usage--sms)).
270
+
271
+ ## For maintainers
272
+
273
+ ```bash
274
+ cd connectors/openclaw
275
+ bun install
276
+ bun run typecheck # tsc --noEmit, strict (exactOptionalPropertyTypes, no any)
277
+ bun test # protocol / bridge / config / channel unit tests
278
+ bun run build # tsdown → dist/index.js + dist/index.d.ts (prebuilt runtime entry)
279
+ ```
280
+
281
+ - **`@saperly/voice-protocol`** is the shared websocket frame contract (the same
282
+ Effect Schema frames + directive validation Saperly's server uses — one source of
283
+ truth, no mirror to drift). This connector is **standalone** (its own `bun.lock`,
284
+ not in the monorepo workspace), so it depends on the package by `file:` path
285
+ (`"@saperly/voice-protocol": "file:../../packages/voice-protocol"`) for local dev
286
+ and tests. **It is bundled into `dist/index.js` at build time** (along with
287
+ `effect`), so it does *not* need to be published separately — both are `devDependencies`
288
+ here, and the published tarball is self-contained.
289
+ - **Bundle model.** `bun run build` runs **tsdown** (`tsdown.config.ts`): it bundles
290
+ the entry + `@saperly/voice-protocol` + `effect` into ESM `dist/index.js` and emits
291
+ `dist/index.d.ts`, keeping only the host-supplied openclaw SDK (`openclaw/plugin-sdk/*`)
292
+ external. Ambient type shims (`src/openclaw-sdk.d.ts`) let the connector type-check
293
+ standalone; the rest of the host API is read structurally off `api`.
294
+ - **Dual entry (`extensions` vs `runtimeExtensions`).** `package.json`'s `openclaw`
295
+ block declares both `extensions: ["./src/index.ts"]` (the source entry a
296
+ workspace/git-checkout gateway loads directly) and `runtimeExtensions:
297
+ ["./dist/index.js"]` (the prebuilt JS an **installed npm package** loads, so the
298
+ gateway never compiles TypeScript at runtime). `files` ships `dist` +
299
+ `openclaw.plugin.json` + `README.md`; `prepublishOnly` rebuilds `dist` before pack.
300
+ - **Publish.** From `connectors/openclaw`: `npm pack --dry-run` to inspect the tarball,
301
+ then `npm publish` (the package is public; `prepublishOnly` runs the build). Bump
302
+ `version` first. `@saperly/voice-protocol` does **not** need publishing — it's bundled.
303
+ - **Registry caches `configSchema`.** After editing `openclaw.plugin.json`, run
304
+ `openclaw plugins registry --refresh` or the gateway boots on the stale schema.
@@ -0,0 +1,4 @@
1
+ //#region src/index.d.ts
2
+ declare const _default: any;
3
+ //#endregion
4
+ export { _default as default };