@desplega.ai/agent-swarm 1.83.2 → 1.84.1
- package/README.md +48 -8
- package/openapi.json +24 -3
- package/package.json +1 -1
- package/src/be/migrations/076_kapso_sender_user_backfill.sql +43 -0
- package/src/commands/context-preamble.ts +178 -0
- package/src/commands/runner.ts +28 -1
- package/src/http/users.ts +11 -3
- package/src/http/webhooks.ts +101 -0
- package/src/integrations/kapso/client.ts +198 -0
- package/src/integrations/kapso/config.ts +104 -0
- package/src/integrations/kapso/inbound.ts +147 -0
- package/src/prompts/base-prompt.ts +15 -2
- package/src/prompts/session-templates.ts +26 -12
- package/src/server.ts +14 -0
- package/src/tests/agentmail-sending-skill.test.ts +75 -0
- package/src/tests/agents-list-model-display.test.ts +33 -0
- package/src/tests/base-prompt.test.ts +90 -1
- package/src/tests/http-users.test.ts +53 -0
- package/src/tests/kapso-client.test.ts +94 -0
- package/src/tests/kapso-inbound.test.ts +257 -0
- package/src/tests/kv-page-proxy.test.ts +1 -0
- package/src/tests/pagination-metrics.test.ts +4 -4
- package/src/tests/prompt-template-session.test.ts +13 -3
- package/src/tests/runner-context-preamble.test.ts +202 -0
- package/src/tests/tool-annotations.test.ts +3 -2
- package/src/tools/cancel-task.ts +13 -5
- package/src/tools/get-task-details.ts +18 -10
- package/src/tools/get-tasks.ts +9 -4
- package/src/tools/register-kapso-number.ts +210 -0
- package/src/tools/send-task.ts +9 -5
- package/src/tools/task-action.ts +20 -10
- package/src/tools/templates.ts +35 -0
- package/src/tools/tool-config.ts +6 -0
- package/src/tools/whatsapp-message.ts +135 -0
- package/templates/skills/agentmail-sending/SKILL.md +169 -0
- package/templates/skills/kapso-whatsapp/SKILL.md +383 -0
|
@@ -0,0 +1,383 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: kapso-whatsapp
|
|
3
|
+
description: How to interact with Kapso WhatsApp from the swarm — read inbound webhook payloads (text AND media), fetch message history, send free-form messages within the 24h session window (and template messages outside it), mark-as-read, show the typing indicator, send reactions, download media, verify webhook signatures, and resolve contacts to swarm users. Canonical reference for ANY Kapso interaction beyond the thin `send-whatsapp-message` / `reply-whatsapp-message` MCP tools — for templates, media, reactions, typing, mark-as-read, signature verify, contact resolution, conversation history, drop to the REST recipes here. Use whenever a task references a WhatsApp message routed through Kapso, or when a workflow needs to reply on WhatsApp.
|
|
4
|
+
user-invocable: false
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Kapso WhatsApp
|
|
8
|
+
|
|
9
|
+
Kapso (https://kapso.ai) is a WhatsApp platform vendor that fronts the Meta Cloud API. A swarm provisions one or more WhatsApp phone numbers and wires each one to either a native inbound handler (PR #560) or a workflow that dispatches a task per inbound message.
|
|
10
|
+
|
|
11
|
+
## When to use MCP tools vs this skill's REST recipes
|
|
12
|
+
|
|
13
|
+
PR #560 ships **thin MCP-tool wrappers for the common case only**:
|
|
14
|
+
|
|
15
|
+
| Tool | Use for |
|
|
16
|
+
|---|---|
|
|
17
|
+
| `send-whatsapp-message` | Free-form text within the 24h session window. |
|
|
18
|
+
| `reply-whatsapp-message` | Same as above but quote-threads to an inbound WAMID. |
|
|
19
|
+
| `register-kapso-number` / `unregister-kapso-number` | Provisioning a phone number's webhook + KV mapping. |
|
|
20
|
+
|
|
21
|
+
**For ANYTHING else, drop to the REST recipes in this skill** — these are the canonical reference, and the MCP tools deliberately do NOT duplicate them:
|
|
22
|
+
|
|
23
|
+
- **Template messages** (outside 24h window) → §"Send a template" below.
|
|
24
|
+
- **Media** (image / document / audio / video, including wide-image padding and PTT voice notes) → §"Sending media".
|
|
25
|
+
- **Reactions** (👀 / ✅ / clear) → §"Send a reaction".
|
|
26
|
+
- **Typing indicator + mark-as-read** → §"Mark as read + typing indicator".
|
|
27
|
+
- **Signature verify (manual)** → §"Webhook signature verification".
|
|
28
|
+
- **Contact resolution → swarm user** → §"Resolve a contact to a swarm user".
|
|
29
|
+
- **Conversation history / message detail / templates list** → §"Read conversation context".
|
|
30
|
+
|
|
31
|
+
If the MCP-tool send returns a 24h-window error (`sessionWindowExpired: true`), fall through to the template path in §"Send a template"; that is exactly where the tool's structured error points.
|
|
32
|
+
|
|
33
|
+
## Setup
|
|
34
|
+
|
|
35
|
+
Swarm config keys (resolve with `get-config key:<NAME> includeSecrets:true` — Lead-only for secrets; workers should ask Lead if they need a value injected):
|
|
36
|
+
|
|
37
|
+
| Key | Value |
|
|
38
|
+
|---|---|
|
|
39
|
+
| `KAPSO_API_BASE_URL` | `https://api.kapso.ai` (host only, no `/platform/v1`) |
|
|
40
|
+
| `KAPSO_API_KEY` | API key (`X-API-Key` header) |
|
|
41
|
+
| `KAPSO_PHONE_NUMBER_ID` | The swarm's provisioned number's Meta phone-number ID |
|
|
42
|
+
| `KAPSO_WEBHOOK_HMAC_SECRET` | Shared HMAC secret. Kapso signs every webhook request with `X-Webhook-Signature: <hex>` |
|
|
43
|
+
|
|
44
|
+
The curl recipes below assume the config values are resolved into the shell variables `$API_BASE`, `$API_KEY`, and `$PHONE_NUMBER_ID`, e.g.:
|
|
45
|
+
|
|
46
|
+
```bash
|
|
47
|
+
API_BASE=$(get-config KAPSO_API_BASE_URL) # https://api.kapso.ai
|
|
48
|
+
API_KEY=$(get-config KAPSO_API_KEY)
|
|
49
|
+
PHONE_NUMBER_ID=$(get-config KAPSO_PHONE_NUMBER_ID)
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
The Kapso CLI is NOT installed in worker containers. Use direct HTTP or clone the `gokapso/agent-skills` repo for fallback scripts.
|
|
53
|
+
|
|
54
|
+
```bash
|
|
55
|
+
git clone --depth=1 https://github.com/gokapso/agent-skills /tmp/kapso-skills
|
|
56
|
+
cd /tmp/kapso-skills/skills/integrate-whatsapp && npm i # or observe-whatsapp / automate-whatsapp
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
The Meta Cloud API is proxied at `$KAPSO_API_BASE_URL/meta/whatsapp/v24.0/...` (auth: `X-API-Key`). Kapso's own platform endpoints live at `$KAPSO_API_BASE_URL/platform/v1/...`.
|
|
60
|
+
|
|
61
|
+
## Inbound webhook payload (v2)
|
|
62
|
+
|
|
63
|
+
When inbound events are routed through a workflow, the workflow's webhook trigger receives `whatsapp.message.*` and `whatsapp.conversation.*` events at `POST https://<your-swarm-host>/api/webhooks/<workflow-id>`.
|
|
64
|
+
|
|
65
|
+
Shape (top-level keys):
|
|
66
|
+
|
|
67
|
+
```json
|
|
68
|
+
{
|
|
69
|
+
"message": {
|
|
70
|
+
"id": "wamid.HBgL...", // Meta message id (WAMID)
|
|
71
|
+
"from": "15551234567", // E.164 without + (sender)
|
|
72
|
+
"from_user_id": "XX.00000...", // Meta-internal user id
|
|
73
|
+
"timestamp": "1700000000", // unix seconds (string)
|
|
74
|
+
"type": "text", // text | image | audio | video | document | sticker | location | contacts | reaction | ...
|
|
75
|
+
"text": { "body": "hi" }, // only for type=text
|
|
76
|
+
"context": null, // present when the user quote-replied another message
|
|
77
|
+
"kapso": {
|
|
78
|
+
"direction": "inbound|outbound",
|
|
79
|
+
"status": "received|delivered|read|sent|failed",
|
|
80
|
+
"processing_status": "pending|completed",
|
|
81
|
+
"origin": "cloud_api",
|
|
82
|
+
"has_media": false,
|
|
83
|
+
"content": "hi" // text representation (caption / filename / body)
|
|
84
|
+
}
|
|
85
|
+
},
|
|
86
|
+
"conversation": {
|
|
87
|
+
"id": "<conversation-uuid>",
|
|
88
|
+
"phone_number": "15551234567",
|
|
89
|
+
"phone_number_id": "<PHONE_NUMBER_ID>",
|
|
90
|
+
"contact_name": "Jane Doe",
|
|
91
|
+
"status": "active",
|
|
92
|
+
"last_active_at": "...",
|
|
93
|
+
"created_at": "...",
|
|
94
|
+
"kapso": {
|
|
95
|
+
"messages_count": 10,
|
|
96
|
+
"last_message_id": "wamid...",
|
|
97
|
+
"last_message_text": "hi",
|
|
98
|
+
"last_inbound_at": "...",
|
|
99
|
+
"last_outbound_at": "..."
|
|
100
|
+
}
|
|
101
|
+
},
|
|
102
|
+
"is_new_conversation": false,
|
|
103
|
+
"phone_number_id": "<PHONE_NUMBER_ID>"
|
|
104
|
+
}
|
|
105
|
+
```
|
|
106
|
+
|
|
107
|
+
**ALWAYS filter on `message.kapso.direction == "inbound"`** — Kapso fires the webhook for the swarm's own outbound sends, deliveries, reads, and failures too. Only inbound events from real humans warrant a task.
|
|
108
|
+
|
|
109
|
+
Test payloads include `"test": true` and `wamid.TEST_...` ids. Handle them gracefully: process them like a real inbound, but note in the task output that the payload was a test, and do not send a real WhatsApp reply.
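The filtering rules above can be sketched as a small predicate. This is a minimal sketch, assuming the payload is already parsed into a dict with the shape shown earlier and that the `"test"` flag sits at the top level of the payload (the source does not say where it lives). `classify_event` and its return labels are hypothetical names, not part of the swarm or Kapso API.

```python
def classify_event(payload: dict) -> str:
    """Return "skip", "test", or "dispatch" for a parsed webhook payload."""
    message = payload.get("message") or {}
    kapso = message.get("kapso") or {}

    # Only `direction` separates inbound from outbound; never filter on status.
    if kapso.get("direction") != "inbound":
        return "skip"

    # Test payloads carry "test": true and wamid.TEST_... ids (flag location assumed).
    message_id = str(message.get("id", ""))
    if payload.get("test") is True or message_id.startswith("wamid.TEST_"):
        return "test"

    return "dispatch"
```

A caller would create a task only for `"dispatch"`, complete with a note for `"test"`, and drop `"skip"` events.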
|
|
110
|
+
|
|
111
|
+
## Non-text message types
|
|
112
|
+
|
|
113
|
+
`message.type` can be `text`, `image`, `audio`, `video`, `document`, `sticker`, `location`, `contacts`, `reaction`, `button`, or `interactive`. Non-text inbound messages carry a type-specific object:
|
|
114
|
+
|
|
115
|
+
| type | object | key fields |
|
|
116
|
+
|---|---|---|
|
|
117
|
+
| `image` | `message.image` | `id` (media id), `mime_type`, `sha256`, `caption?` |
|
|
118
|
+
| `audio` | `message.audio` | `id`, `mime_type`, `voice` (true = voice note), `sha256` |
|
|
119
|
+
| `video` | `message.video` | `id`, `mime_type`, `sha256`, `caption?` |
|
|
120
|
+
| `document` | `message.document` | `id`, `mime_type`, `filename`, `sha256`, `caption?` |
|
|
121
|
+
| `sticker` | `message.sticker` | `id`, `mime_type`, `animated`, `sha256` |
|
|
122
|
+
| `location` | `message.location` | `latitude`, `longitude`, `name?`, `address?` |
|
|
123
|
+
| `contacts` | `message.contacts[]` | `name`, `phones[]`, `emails[]`, ... |
|
|
124
|
+
| `reaction` | `message.reaction` | `message_id` (wamid being reacted to), `emoji` |
|
|
125
|
+
|
|
126
|
+
`message.kapso.has_media` is `true` for image/audio/video/document/sticker. `message.kapso.content` carries a text representation where one exists (caption, filename). `message.transcript` may be present for audio if Kapso pre-transcribed it.
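One way to normalise the per-type objects is a helper that returns the type, a best-effort text, and the media id. This is a sketch under the field names in the table above; `summarize_message` is a hypothetical helper, and it ignores `message.transcript` because its exact shape is not documented here.

```python
MEDIA_TYPES = {"image", "audio", "video", "document", "sticker"}

def summarize_message(message: dict) -> dict:
    """Return the message type, a best-effort text, and the media id if any."""
    msg_type = message.get("type", "")
    body = message.get(msg_type)
    if not isinstance(body, dict):  # e.g. `contacts` is a list, or the key is missing
        body = {}
    kapso = message.get("kapso") or {}

    if msg_type == "text":
        text = body.get("body")
    else:
        # Prefer Kapso's text representation, then fall back to the caption.
        text = kapso.get("content") or body.get("caption")

    media_id = body.get("id") if msg_type in MEDIA_TYPES else None
    return {"type": msg_type, "text": text, "media_id": media_id}
```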
|
|
127
|
+
|
|
128
|
+
### Downloading media
|
|
129
|
+
|
|
130
|
+
Media messages carry a Meta **media id** (`message.<type>.id`), not a URL. Two-step download via the Kapso proxy:
|
|
131
|
+
|
|
132
|
+
1. Resolve the media id to a temporary URL + metadata:
|
|
133
|
+
```bash
|
|
134
|
+
curl -s -H "X-API-Key: $API_KEY" \
  "$API_BASE/meta/whatsapp/v24.0/<MEDIA_ID>"
|
|
136
|
+
# → { "url": "https://lookaside.fbsbx.com/...", "mime_type": "...", "file_size": ..., "id": "...", "sha256": "..." }
|
|
137
|
+
```
|
|
138
|
+
2. Download the binary from that `url` (Meta lookaside URLs expire fast — download immediately):
|
|
139
|
+
```bash
|
|
140
|
+
curl -sL -H "X-API-Key: $API_KEY" "<url>" -o /tmp/media.bin
|
|
141
|
+
```
|
|
142
|
+
|
|
143
|
+
NB: verify the exact proxy path against a real media message if your swarm has only handled `text` inbound so far. If the lookaside `url` 403s with `X-API-Key`, retry through `$KAPSO_API_BASE_URL/meta/whatsapp/...`.
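The same two-step download can be sketched in Python with only the standard library. This is a rough sketch, not a verified client: `media_lookup_url` and `download_media` are hypothetical helpers, the `opener` parameter exists only so the flow can be exercised without network access, and the proxy path carries the same caveat as the note above.

```python
import json
import urllib.request

def media_lookup_url(api_base: str, media_id: str, version: str = "v24.0") -> str:
    """URL for step 1: resolve a media id to a temporary download URL."""
    return f"{api_base.rstrip('/')}/meta/whatsapp/{version}/{media_id}"

def download_media(api_base, api_key, media_id, dest_path, opener=urllib.request.urlopen):
    """Resolve the media id, then immediately fetch the binary to dest_path."""
    headers = {"X-API-Key": api_key}

    # Step 1: metadata lookup through the Kapso proxy.
    lookup = urllib.request.Request(media_lookup_url(api_base, media_id), headers=headers)
    with opener(lookup) as resp:
        meta = json.loads(resp.read())

    # Step 2: lookaside URLs expire fast, so download right away.
    fetch = urllib.request.Request(meta["url"], headers=headers)
    with opener(fetch) as resp, open(dest_path, "wb") as out:
        out.write(resp.read())
    return meta
```

The returned metadata includes `mime_type`, which is useful for picking a file extension before handing the file to a transcription or vision step.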
|
|
144
|
+
|
|
145
|
+
### Recommended handling per type (proposal — adapt to what your swarm has installed)
|
|
146
|
+
|
|
147
|
+
- **audio / voice notes** → download the media. If the swarm has a speech-to-text capability (e.g. ElevenLabs Scribe, Whisper, or a transcription skill/tool installed on some role), transcribe via that path and feed the transcript into the conversation as if it were a text message. If no STT capability is available, download and process the raw media, or acknowledge and ask the sender to send text. Check which role owns audio/transcription in your swarm before assuming it exists.
|
|
148
|
+
- **image** → download, then describe / answer with a vision-capable model if one is available. A screenshot captioned "debug this" should get a real answer, not "I can't read images".
|
|
149
|
+
- **document** → download; read the text content (PDF/txt/etc.) and act on it.
|
|
150
|
+
- **video** → acknowledge + ask for specifics unless there is a clear transcription need.
|
|
151
|
+
- **location** → use `latitude` / `longitude` directly.
|
|
152
|
+
- **sticker** → treat as a lightweight reaction; usually no substantive reply needed.
|
|
153
|
+
- **reaction** → a user reacting to one of the swarm's messages. Usually acknowledge silently — do NOT trigger a full reply loop. (A debounce node, where present, naturally drops reaction-only events.)
|
|
154
|
+
- **contacts** → extract the shared contact info; act per the conversation.
|
|
155
|
+
|
|
156
|
+
## Resolve a contact to a swarm user
|
|
157
|
+
|
|
158
|
+
The goal is to map an inbound Kapso contact (`conversation.contact_name` + `message.from` phone) to whatever user identity your swarm tracks. Two paths in order:
|
|
159
|
+
|
|
160
|
+
1. By name: `resolve-user name:"<contact_name>"` (fuzzy substring match). Returns the canonical profile if one exists. Useful for contacts already registered in the swarm's user registry.
|
|
161
|
+
2. If no match — the contact is unknown. Lead can run `manage-user create name:"<contact_name>" notes:"WhatsApp +<phone>"` to register them. Workers should NOT create users autonomously; ask Lead.
|
|
162
|
+
|
|
163
|
+
Always include the phone number in the `manage-user` `notes` field so future lookups by phone work (until the user registry has a dedicated `phone` column).
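Since Kapso delivers numbers without the leading `+`, it helps to keep the two forms explicit when building the notes value. A small sketch with hypothetical helper names (`to_api_phone`, `to_display_phone`, `user_notes`):

```python
import re

def to_api_phone(raw: str) -> str:
    """Digits only, as Kapso sends and expects (E.164 without the leading +)."""
    return re.sub(r"\D", "", raw)

def to_display_phone(raw: str) -> str:
    """E.164 with a leading +, for humans and for the notes field."""
    return "+" + to_api_phone(raw)

def user_notes(message: dict) -> str:
    """Build the notes string from an inbound `message` object."""
    return f"WhatsApp {to_display_phone(message['from'])}"
```

For example, `user_notes({"from": "15551234567"})` yields `WhatsApp +15551234567`, matching the `notes:"WhatsApp +<phone>"` convention above.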
|
|
164
|
+
|
|
165
|
+
## Read conversation context
|
|
166
|
+
|
|
167
|
+
Use the Kapso platform endpoints via curl (no CLI needed):
|
|
168
|
+
|
|
169
|
+
```bash
|
|
170
|
+
API_BASE=$(get-config KAPSO_API_BASE_URL) # https://api.kapso.ai
|
|
171
|
+
API_KEY=$(get-config KAPSO_API_KEY)
|
|
172
|
+
PHONE_NUMBER_ID=$(get-config KAPSO_PHONE_NUMBER_ID)
|
|
173
|
+
|
|
174
|
+
# List conversations for the swarm's number
|
|
175
|
+
curl -s -H "X-API-Key: $API_KEY" \
|
|
176
|
+
"$API_BASE/platform/v1/whatsapp/conversations?phone_number_id=$PHONE_NUMBER_ID&status=active" | jq
|
|
177
|
+
|
|
178
|
+
# Get a single conversation
|
|
179
|
+
curl -s -H "X-API-Key: $API_KEY" \
|
|
180
|
+
"$API_BASE/platform/v1/whatsapp/conversations/<conversation_id>" | jq
|
|
181
|
+
|
|
182
|
+
# List messages for a conversation — USE THE QUERY-PARAM FORM.
|
|
183
|
+
# (The conversation-scoped path /conversations/<id>/messages returns 404.)
|
|
184
|
+
curl -s -H "X-API-Key: $API_KEY" \
|
|
185
|
+
"$API_BASE/platform/v1/whatsapp/messages?conversation_id=<conversation_id>&limit=20" | jq
|
|
186
|
+
|
|
187
|
+
# Single message detail
|
|
188
|
+
curl -s -H "X-API-Key: $API_KEY" \
|
|
189
|
+
"$API_BASE/platform/v1/whatsapp/messages/<wamid>" | jq
|
|
190
|
+
```
|
|
191
|
+
|
|
192
|
+
The message list is returned newest-first.
|
|
193
|
+
|
|
194
|
+
## Send a free-form text (within the 24h session window)
|
|
195
|
+
|
|
196
|
+
Per WhatsApp policy, free-form text is only allowed within 24h of the last inbound message. Outside that window you MUST use a pre-approved template (see "Send a template" below).
|
|
197
|
+
|
|
198
|
+
**Common case shortcut:** call the `send-whatsapp-message` MCP tool — it wraps exactly this REST call. The recipe below is the canonical reference and the fallback when you need fields the tool doesn't expose.
|
|
199
|
+
|
|
200
|
+
```bash
|
|
201
|
+
TO="15551234567"
|
|
202
|
+
TEXT="Hi 👋"
|
|
203
|
+
|
|
204
|
+
curl -s -X POST -H "X-API-Key: $API_KEY" -H "Content-Type: application/json" \
|
|
205
|
+
-d "{
|
|
206
|
+
\"messaging_product\": \"whatsapp\",
|
|
207
|
+
\"recipient_type\": \"individual\",
|
|
208
|
+
\"to\": \"$TO\",
|
|
209
|
+
\"type\": \"text\",
|
|
210
|
+
\"text\": { \"preview_url\": false, \"body\": \"$TEXT\" }
|
|
211
|
+
}" \
|
|
212
|
+
"$API_BASE/meta/whatsapp/v24.0/$PHONE_NUMBER_ID/messages" | jq
|
|
213
|
+
```
|
|
214
|
+
|
|
215
|
+
Returns `{ "messages": [{ "id": "wamid..." }] }` on success. Log the wamid.
|
|
216
|
+
|
|
217
|
+
### Quote-reply (thread to the original message)
|
|
218
|
+
|
|
219
|
+
Add a `context` object to make the message render as a reply to a specific inbound message. The `reply-whatsapp-message` MCP tool wraps exactly this; use the raw recipe when you need to combine quote-reply with media / templates / reactions (the tool only does text).
|
|
220
|
+
|
|
221
|
+
```json
|
|
222
|
+
{
|
|
223
|
+
"messaging_product": "whatsapp",
|
|
224
|
+
"recipient_type": "individual",
|
|
225
|
+
"to": "<phone>",
|
|
226
|
+
"context": { "message_id": "<inbound_wamid>" },
|
|
227
|
+
"type": "text",
|
|
228
|
+
"text": { "preview_url": false, "body": "<reply>" }
|
|
229
|
+
}
|
|
230
|
+
```
|
|
231
|
+
|
|
232
|
+
Prefer quote-replies when answering a specific question — it keeps long conversations legible.
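Interpolating the text straight into a shell-quoted JSON string breaks as soon as the body contains a double quote or a newline. A safer option is to let a JSON encoder build the body. The sketch below assumes only the payload fields shown in the recipes above; `text_payload` and `sent_wamid` are hypothetical helper names.

```python
import json
from typing import Optional

def text_payload(to: str, body: str, reply_to: Optional[str] = None) -> str:
    """Serialize a free-form text message; `reply_to` adds the quote-reply context."""
    payload = {
        "messaging_product": "whatsapp",
        "recipient_type": "individual",
        "to": to,
        "type": "text",
        "text": {"preview_url": False, "body": body},
    }
    if reply_to:
        payload["context"] = {"message_id": reply_to}
    # json.dumps escapes quotes and newlines that would break a hand-built string.
    return json.dumps(payload, ensure_ascii=False)

def sent_wamid(response_text: str) -> str:
    """Pull the outbound wamid out of a successful send response."""
    return json.loads(response_text)["messages"][0]["id"]
```

The serialized string can be passed to curl with `-d @-` on stdin, and the value from `sent_wamid` is what should be logged in the task output.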
|
|
233
|
+
|
|
234
|
+
## Sending media (image, document, audio, video)
|
|
235
|
+
|
|
236
|
+
Two-step pipeline through Kapso's Meta proxy: **upload, then send by id**. Sending by `id` is more reliable than `link` (no public-host requirement) — validated 2026-05-20.
|
|
237
|
+
|
|
238
|
+
### 1. Upload
|
|
239
|
+
|
|
240
|
+
```bash
|
|
241
|
+
curl -s -X POST -H "X-API-Key: $API_KEY" \
|
|
242
|
+
-F "messaging_product=whatsapp" \
|
|
243
|
+
-F "type=<mime>" \
|
|
244
|
+
-F "file=@/path/to/file.ext;type=<mime>" \
|
|
245
|
+
"$API_BASE/meta/whatsapp/v24.0/$PHONE_NUMBER_ID/media"
|
|
246
|
+
# → {"id":"<media-id>"}
|
|
247
|
+
```
|
|
248
|
+
|
|
249
|
+
### 2. Send by id
|
|
250
|
+
|
|
251
|
+
```json
|
|
252
|
+
{ "type": "image", "image": { "id": "<id>", "caption": "..." } }
|
|
253
|
+
{ "type": "document", "document": { "id": "<id>", "filename": "name.ext", "caption": "..." } }
|
|
254
|
+
{ "type": "audio", "audio": { "id": "<id>" } }
|
|
255
|
+
{ "type": "video", "video": { "id": "<id>", "caption": "..." } }
|
|
256
|
+
```
|
|
257
|
+
|
|
258
|
+
Quote-reply works on media too — add `"context": { "message_id": "<wamid>" }` at the top level.
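The send-by-id bodies above show only the type-specific part. A sketch of the full payload follows, assuming the same envelope fields (`messaging_product`, `recipient_type`, `to`) as the text recipe; `media_payload` is a hypothetical helper.

```python
MEDIA_TYPES = ("image", "document", "audio", "video")

def media_payload(to, media_type, media_id, caption=None, filename=None, reply_to=None) -> dict:
    """Build a send-by-id payload for an already uploaded media id."""
    if media_type not in MEDIA_TYPES:
        raise ValueError(f"unsupported media type: {media_type}")

    media = {"id": media_id}
    if caption and media_type != "audio":      # the audio example carries no caption
        media["caption"] = caption
    if filename and media_type == "document":  # filename only applies to documents
        media["filename"] = filename

    payload = {
        "messaging_product": "whatsapp",
        "recipient_type": "individual",
        "to": to,
        "type": media_type,
        media_type: media,
    }
    if reply_to:
        payload["context"] = {"message_id": reply_to}  # quote-reply, top level
    return payload
```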
|
|
259
|
+
|
|
260
|
+
### Wide images: pad to ~square, send as image (validated 2026-05-20)
|
|
261
|
+
|
|
262
|
+
WhatsApp scales `type:image` to bubble width + recompresses, so a wide 1200×630 social card renders as a tiny shrunken strip. **The fix is NOT `type:document`** — a `.png` sent as a document shows a plain file card with NO inline preview (must tap+download). Bad UX both ways.
|
|
263
|
+
|
|
264
|
+
Correct approach: **letterbox/pad the wide image onto a ~1:1 (1080×1080) or 4:5 (1080×1350) canvas** with a solid bg fill (white, or a colour sampled from the card's corner), card centered, then send THAT as `type:image`. WhatsApp shows ~1:1–4:5 images large WITH a preview and won't shrink them.
|
|
265
|
+
|
|
266
|
+
Pad with Pillow (ImageMagick is NOT installed in workers; `pip`/`python3 -c` with PIL works):
|
|
267
|
+
|
|
268
|
+
```python
|
|
269
|
+
from PIL import Image
|
|
270
|
+
src = Image.open("in.png").convert("RGB"); w,h = src.size
|
|
271
|
+
bg = src.getpixel((0,0)) # sample corner for fill
|
|
272
|
+
card = src.resize((1080, round(h*1080/w)), Image.LANCZOS)
|
|
273
|
+
canvas = Image.new("RGB", (1080,1080), bg)
|
|
274
|
+
canvas.paste(card, (0,(1080-card.height)//2)); canvas.save("out.png")
|
|
275
|
+
```
|
|
276
|
+
|
|
277
|
+
Reserve `type:document` for ACTUAL files — PDFs (which DO render a preview), spreadsheets, etc. — never for images.
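The padding arithmetic can be isolated from Pillow so the same numbers work for either canvas. This is a sketch with a hypothetical helper name; unlike the snippet above it also handles sources that are taller than the canvas ratio, by fitting on height instead of width.

```python
def letterbox_geometry(w: int, h: int, canvas=(1080, 1080)):
    """Return ((new_w, new_h), (x, y)): the resized card size and its paste offset."""
    cw, ch = canvas
    if w * ch >= h * cw:
        # Source is relatively wider than the canvas: fit to canvas width.
        new_w, new_h = cw, max(1, round(h * cw / w))
    else:
        # Source is relatively taller than the canvas: fit to canvas height.
        new_w, new_h = max(1, round(w * ch / h)), ch
    offset = ((cw - new_w) // 2, (ch - new_h) // 2)
    return (new_w, new_h), offset
```

For a 1200×630 social card on the 1080×1080 canvas this gives a 1080×567 card pasted at `(0, 256)`; the size goes to `resize` and the offset to `paste`.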
|
|
278
|
+
|
|
279
|
+
### Voice notes (PTT play bar)
|
|
280
|
+
|
|
281
|
+
For PTT (play bar in the bubble), the audio MUST be `audio/ogg` with Opus. MP3 sends as a generic audio attachment, no PTT bar.
|
|
282
|
+
```bash
|
|
283
|
+
ffmpeg -i in.mp3 -c:a libopus -b:a 32k -application voip out.ogg
|
|
284
|
+
```
|
|
285
|
+
|
|
286
|
+
## Mark as read + typing indicator
|
|
287
|
+
|
|
288
|
+
The mark-as-read endpoint doubles as the typing indicator. POST to the same `/messages` endpoint with `status: "read"` and an optional `typing_indicator`:
|
|
289
|
+
|
|
290
|
+
```bash
|
|
291
|
+
curl -s -X POST -H "X-API-Key: $API_KEY" -H "Content-Type: application/json" \
|
|
292
|
+
-d '{
|
|
293
|
+
"messaging_product": "whatsapp",
|
|
294
|
+
"status": "read",
|
|
295
|
+
"message_id": "<inbound_wamid>",
|
|
296
|
+
"typing_indicator": { "type": "text" }
|
|
297
|
+
}' \
|
|
298
|
+
"$API_BASE/meta/whatsapp/v24.0/$PHONE_NUMBER_ID/messages"
|
|
299
|
+
# → {"success":true}
|
|
300
|
+
```
|
|
301
|
+
|
|
302
|
+
The typing indicator ("typing…" dots) auto-clears after ~25 seconds OR the moment you send any message. For long-running work, re-fire this call every <25s (e.g. right before you POST your reply) to keep the dots visible. Drop the `typing_indicator` field to mark-as-read only.
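A sketch of the payload plus a refresh check for long-running work. The helper names and the 5-second safety margin are assumptions; only the payload fields and the ~25-second lifetime come from the text above.

```python
TYPING_TTL_SECONDS = 25  # approximate lifetime of the typing dots

def read_receipt_payload(inbound_wamid: str, typing: bool = True) -> dict:
    """Mark a message as read, optionally showing the typing indicator."""
    payload = {
        "messaging_product": "whatsapp",
        "status": "read",
        "message_id": inbound_wamid,
    }
    if typing:
        payload["typing_indicator"] = {"type": "text"}
    return payload

def needs_refresh(last_sent_at: float, now: float, margin: float = 5.0) -> bool:
    """True once the indicator is close enough to expiry to be re-sent."""
    return now - last_sent_at >= TYPING_TTL_SECONDS - margin
```

A worker loop could call `needs_refresh` between steps and re-POST the payload whenever it returns true.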
|
|
303
|
+
|
|
304
|
+
## Send a reaction
|
|
305
|
+
|
|
306
|
+
```bash
|
|
307
|
+
curl -s -X POST -H "X-API-Key: $API_KEY" -H "Content-Type: application/json" \
|
|
308
|
+
-d '{
|
|
309
|
+
"messaging_product": "whatsapp",
|
|
310
|
+
"recipient_type": "individual",
|
|
311
|
+
"to": "<phone>",
|
|
312
|
+
"type": "reaction",
|
|
313
|
+
"reaction": { "message_id": "<wamid>", "emoji": "👀" }
|
|
314
|
+
}' \
|
|
315
|
+
"$API_BASE/meta/whatsapp/v24.0/$PHONE_NUMBER_ID/messages"
|
|
316
|
+
```
|
|
317
|
+
|
|
318
|
+
A user can have only ONE reaction per message — sending a new emoji REPLACES the previous one (no explicit remove needed). Send `"emoji": ""` to clear a reaction entirely.
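Because a new reaction replaces the previous one, a workflow can move a message through states by re-sending with a different emoji. A minimal sketch follows; the state names in `REACTIONS` are hypothetical labels chosen for illustration.

```python
REACTIONS = {"seen": "👀", "done": "✅", "failed": "❌", "clear": ""}

def reaction_payload(to: str, target_wamid: str, state: str) -> dict:
    """Build a reaction payload; the "clear" state sends an empty emoji."""
    return {
        "messaging_product": "whatsapp",
        "recipient_type": "individual",
        "to": to,
        "type": "reaction",
        "reaction": {"message_id": target_wamid, "emoji": REACTIONS[state]},
    }
```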
|
|
319
|
+
|
|
320
|
+
## Send a template (outside the 24h window)
|
|
321
|
+
|
|
322
|
+
If `send-whatsapp-message` returns `sessionWindowExpired: true`, fall through to this path. WhatsApp only allows pre-approved templates outside the 24h customer-service window.
|
|
323
|
+
|
|
324
|
+
```bash
|
|
325
|
+
curl -s -X POST -H "X-API-Key: $API_KEY" -H "Content-Type: application/json" \
|
|
326
|
+
-d '{
|
|
327
|
+
"messaging_product": "whatsapp",
|
|
328
|
+
"to": "<phone>",
|
|
329
|
+
"type": "template",
|
|
330
|
+
"template": {
|
|
331
|
+
"name": "<template_name>",
|
|
332
|
+
"language": { "code": "en_US" }
|
|
333
|
+
}
|
|
334
|
+
}' \
|
|
335
|
+
"$API_BASE/meta/whatsapp/v24.0/$PHONE_NUMBER_ID/messages"
|
|
336
|
+
```
|
|
337
|
+
|
|
338
|
+
List approved templates first: `GET $API_BASE/platform/v1/whatsapp/templates?phone_number_id=$PHONE_NUMBER_ID`.
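A local pre-check can avoid a wasted free-form attempt, although WhatsApp remains the authority on the window. The sketch below assumes the last inbound time is available as unix seconds in string form, like `message.timestamp`; both helper names are hypothetical.

```python
SESSION_WINDOW_SECONDS = 24 * 60 * 60

def within_session_window(last_inbound_ts: str, now: float) -> bool:
    """Advisory check: was the last inbound message less than 24h ago?"""
    return now - int(last_inbound_ts) < SESSION_WINDOW_SECONDS

def template_payload(to: str, name: str, language: str = "en_US") -> dict:
    """Build the template message body shown in the recipe above."""
    return {
        "messaging_product": "whatsapp",
        "to": to,
        "type": "template",
        "template": {"name": name, "language": {"code": language}},
    }
```

If the check passes but the send still reports `sessionWindowExpired: true`, the template path remains the fallback.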
|
|
339
|
+
|
|
340
|
+
## Webhook signature verification
|
|
341
|
+
|
|
342
|
+
Every Kapso webhook delivery includes `X-Webhook-Signature: <hex>` (HMAC-SHA256 of the raw body using `KAPSO_WEBHOOK_HMAC_SECRET`). The native handler (`/api/integrations/kapso/webhook`, PR #560) and the workflow webhook trigger both verify automatically — the trigger's `hmacHeader` is `X-Webhook-Signature` and `hmacSecret` resolves from swarm config.
|
|
343
|
+
|
|
344
|
+
To verify manually:
|
|
345
|
+
|
|
346
|
+
```bash
|
|
347
|
+
printf '%s' "$RAW_BODY" | openssl dgst -sha256 -hmac "$HMAC_SECRET" -hex | awk '{print $2}'
|
|
348
|
+
# Compare hex output with X-Webhook-Signature header (constant-time compare in production).
|
|
349
|
+
```
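For a constant-time comparison, a sketch with Python's standard `hmac` module follows. It assumes the header carries a bare hex digest as described above; `signature_is_valid` is a hypothetical helper name.

```python
import hashlib
import hmac

def signature_is_valid(raw_body: bytes, header_value: str, secret: str) -> bool:
    """Check X-Webhook-Signature against HMAC-SHA256 of the raw request body."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = header_value.strip().lower()
    # compare_digest runs in constant time; bytes avoid errors on non-ASCII input.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

The digest must be computed over the raw bytes exactly as received. Re-serializing parsed JSON can change whitespace or key order and invalidate the signature.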
|
|
350
|
+
|
|
351
|
+
## Reply etiquette
|
|
352
|
+
|
|
353
|
+
- Same language as the inbound message — match whatever the human wrote in.
|
|
354
|
+
- Brief. WhatsApp is not Slack — 1-3 short messages max.
|
|
355
|
+
- Identify yourself if it's a first interaction in the conversation, e.g. "Hi! This is the swarm's WhatsApp assistant."
|
|
356
|
+
- Quote-reply (`context.message_id`) when answering a specific question.
|
|
357
|
+
- If you can't help (no skill for the request, out of scope) — say so and either escalate to Lead or ask the human to use another channel.
|
|
358
|
+
- Always log the outbound wamid in your task output so it's traceable.
|
|
359
|
+
|
|
360
|
+
## Where this fits in the swarm
|
|
361
|
+
|
|
362
|
+
Two inbound paths can exist (PR #560 adds the native one; a workflow path is the alternative):
|
|
363
|
+
|
|
364
|
+
- **Native handler** (`/api/integrations/kapso/webhook`, PR #560) — fires for any phone number registered via `register-kapso-number`. Verifies HMAC, dedupes by message id (KV `integrations:kapso:dedupe`, 24h TTL), reads the routing mapping from KV (`integrations:kapso:numbers`), and either dispatches a `kapso-inbound` task or delegates to a workflow trigger (advanced override). Also emits a `kapso.message.received` event on the workflow event bus.
|
|
365
|
+
- **Workflow path** — fires for unregistered numbers (or numbers whose mapping points at a workflow). A typical inbound-handling workflow chains: a react-eyes step (mark read + typing + 👀) → a debounce step (collapse rapid-fire bursts) → a gate → an agent-task triage step → a finalize step (✅/❌ reaction).
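The dedupe step in the native handler can be pictured as a TTL set keyed by message id. The sketch below is an in-memory stand-in for illustration only; the real handler uses the KV namespace named above, and its implementation is not shown in this document. `SeenMessages` is a hypothetical class.

```python
import time

DEDUPE_TTL_SECONDS = 24 * 60 * 60

class SeenMessages:
    """In-memory TTL set: reports whether a message id is new within the TTL."""

    def __init__(self, ttl: float = DEDUPE_TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock
        self._expires = {}  # message id -> expiry time

    def first_time(self, message_id: str) -> bool:
        now = self._clock()
        # Drop entries whose TTL has elapsed.
        self._expires = {k: v for k, v in self._expires.items() if v > now}
        if message_id in self._expires:
            return False
        self._expires[message_id] = now + self._ttl
        return True
```

A handler would dispatch only when `first_time(message["id"])` returns true, so webhook retries do not create duplicate tasks.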
|
|
366
|
+
|
|
367
|
+
**Debounce / batching:** a debounce step waits a few seconds after each message, and only the LAST message of a burst proceeds to the agent task, so a user firing 3 quick messages produces ONE task, not three. The agent is told the `batchSize` and should read trailing history and answer the whole burst in one reply. When more than one message is collapsed, the user can be shown a "🧵 Got your N messages" note.
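The collapsing behaviour can be illustrated offline over a list of messages. This is a sketch, not the workflow's debounce node: the 5-second quiet period and the `collapse_bursts` name are assumptions, and a live debounce works on arrival time rather than on a sorted list.

```python
def collapse_bursts(messages, quiet_seconds: int = 5):
    """Group messages into bursts; keep the last message of each burst and its size."""
    bursts = []
    for msg in sorted(messages, key=lambda m: int(m["timestamp"])):
        ts = int(msg["timestamp"])  # unix seconds, delivered as a string
        if bursts and ts - bursts[-1]["last_ts"] <= quiet_seconds:
            # Still inside the quiet period: extend the current burst.
            bursts[-1]["message"] = msg
            bursts[-1]["last_ts"] = ts
            bursts[-1]["batchSize"] += 1
        else:
            bursts.append({"message": msg, "last_ts": ts, "batchSize": 1})
    return [{"message": b["message"], "batchSize": b["batchSize"]} for b in bursts]
```

Three messages sent two seconds apart collapse into one entry with `batchSize` 3, while a message a minute later starts a new burst.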
|
|
368
|
+
|
|
369
|
+
The agent-task triages like any other interaction; route heavier work to specialists via `send-task` (always include the WhatsApp source context so they can reply back).
|
|
370
|
+
|
|
371
|
+
HMAC verification is enforced (signed mode) on both paths.
|
|
372
|
+
|
|
373
|
+
## Common gotchas
|
|
374
|
+
|
|
375
|
+
- Phone numbers from Kapso are E.164 **without `+`** (e.g. `15551234567`). Add `+` when displaying to humans, drop it when calling the API.
|
|
376
|
+
- `message.text.body` is only present for `type:"text"`. For other types read `message.<type>` (see the table above) or `message.kapso.content` for a text representation.
|
|
377
|
+
- Outbound status events (`delivered`, `read`) are NOT a customer interaction — skip them. Filter by `message.kapso.direction == "inbound"`.
|
|
378
|
+
- Real inbound messages commonly arrive with `status: "delivered"` (delivered to the swarm). Do NOT skip on status — only `direction` signals inbound vs outbound.
|
|
379
|
+
- Kapso sometimes sends test payloads with `"test": true` and `wamid.TEST_*` ids. Don't reply to test payloads — just complete the task with a note.
|
|
380
|
+
- `KAPSO_PHONE_NUMBER_ID` is the **swarm's own** number, not the recipient's. The recipient is in `message.from` / `conversation.phone_number`.
|
|
381
|
+
- The message-list endpoint is `/platform/v1/whatsapp/messages?conversation_id=X` — the conversation-scoped `/conversations/<id>/messages` path 404s.
|
|
382
|
+
- **Wide images shrink — pad them, don't send as document.** `type:image` scales to bubble width; a wide social card becomes a strip. Sending it as `type:document` removes the preview entirely. Fix: pad onto a ~1:1/4:5 canvas and send as `type:image`. See "Sending media".
|
|
383
|
+
- **MCP tools cover text-only.** `send-whatsapp-message` and `reply-whatsapp-message` are deliberately thin — templates / media / reactions / typing / mark-as-read are NOT in the tool surface. For those, use the REST recipes in this skill directly.
|