remote-pi 0.1.3 → 0.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (99)
  1. package/README.md +160 -40
  2. package/dist/bin/supervisord.d.ts +2 -0
  3. package/dist/bin/supervisord.js +44 -0
  4. package/dist/bin/supervisord.js.map +1 -0
  5. package/dist/config.d.ts +44 -13
  6. package/dist/config.js +61 -22
  7. package/dist/config.js.map +1 -1
  8. package/dist/daemon/client.d.ts +20 -0
  9. package/dist/daemon/client.js +128 -0
  10. package/dist/daemon/client.js.map +1 -0
  11. package/dist/daemon/control_protocol.d.ts +100 -0
  12. package/dist/daemon/control_protocol.js +63 -0
  13. package/dist/daemon/control_protocol.js.map +1 -0
  14. package/dist/daemon/id.d.ts +18 -0
  15. package/dist/daemon/id.js +30 -0
  16. package/dist/daemon/id.js.map +1 -0
  17. package/dist/daemon/install.d.ts +132 -0
  18. package/dist/daemon/install.js +312 -0
  19. package/dist/daemon/install.js.map +1 -0
  20. package/dist/daemon/registry.d.ts +47 -0
  21. package/dist/daemon/registry.js +123 -0
  22. package/dist/daemon/registry.js.map +1 -0
  23. package/dist/daemon/rpc_child.d.ts +76 -0
  24. package/dist/daemon/rpc_child.js +130 -0
  25. package/dist/daemon/rpc_child.js.map +1 -0
  26. package/dist/daemon/supervisor.d.ts +38 -0
  27. package/dist/daemon/supervisor.js +301 -0
  28. package/dist/daemon/supervisor.js.map +1 -0
  29. package/dist/index.d.ts +62 -8
  30. package/dist/index.js +1231 -303
  31. package/dist/index.js.map +1 -1
  32. package/dist/mesh/canonical.d.ts +30 -0
  33. package/dist/mesh/canonical.js +61 -0
  34. package/dist/mesh/canonical.js.map +1 -0
  35. package/dist/mesh/client.d.ts +31 -0
  36. package/dist/mesh/client.js +56 -0
  37. package/dist/mesh/client.js.map +1 -0
  38. package/dist/mesh/encoding.d.ts +36 -0
  39. package/dist/mesh/encoding.js +53 -0
  40. package/dist/mesh/encoding.js.map +1 -0
  41. package/dist/mesh/self_revoke.d.ts +111 -0
  42. package/dist/mesh/self_revoke.js +182 -0
  43. package/dist/mesh/self_revoke.js.map +1 -0
  44. package/dist/mesh/siblings.d.ts +62 -0
  45. package/dist/mesh/siblings.js +95 -0
  46. package/dist/mesh/siblings.js.map +1 -0
  47. package/dist/mesh/types.d.ts +34 -0
  48. package/dist/mesh/types.js +11 -0
  49. package/dist/mesh/types.js.map +1 -0
  50. package/dist/mesh/verify.d.ts +17 -0
  51. package/dist/mesh/verify.js +77 -0
  52. package/dist/mesh/verify.js.map +1 -0
  53. package/dist/pairing/qr.d.ts +16 -5
  54. package/dist/pairing/qr.js +27 -8
  55. package/dist/pairing/qr.js.map +1 -1
  56. package/dist/pairing/storage.d.ts +41 -0
  57. package/dist/pairing/storage.js +160 -21
  58. package/dist/pairing/storage.js.map +1 -1
  59. package/dist/protocol/types.d.ts +23 -0
  60. package/dist/session/broker.d.ts +74 -0
  61. package/dist/session/broker.js +142 -4
  62. package/dist/session/broker.js.map +1 -1
  63. package/dist/session/broker_remote.d.ts +110 -0
  64. package/dist/session/broker_remote.js +397 -0
  65. package/dist/session/broker_remote.js.map +1 -0
  66. package/dist/session/cwd_lock.d.ts +28 -0
  67. package/dist/session/cwd_lock.js +89 -0
  68. package/dist/session/cwd_lock.js.map +1 -0
  69. package/dist/session/global_config.d.ts +9 -0
  70. package/dist/session/global_config.js +9 -0
  71. package/dist/session/global_config.js.map +1 -1
  72. package/dist/session/leader_election.d.ts +16 -0
  73. package/dist/session/leader_election.js +22 -0
  74. package/dist/session/leader_election.js.map +1 -1
  75. package/dist/session/local_config.d.ts +12 -5
  76. package/dist/session/local_config.js +24 -3
  77. package/dist/session/local_config.js.map +1 -1
  78. package/dist/session/peer.d.ts +28 -1
  79. package/dist/session/peer.js +69 -2
  80. package/dist/session/peer.js.map +1 -1
  81. package/dist/session/peer_inventory.d.ts +13 -0
  82. package/dist/session/peer_inventory.js +48 -0
  83. package/dist/session/peer_inventory.js.map +1 -0
  84. package/dist/session/setup_wizard.d.ts +32 -8
  85. package/dist/session/setup_wizard.js +45 -33
  86. package/dist/session/setup_wizard.js.map +1 -1
  87. package/dist/session/tools.d.ts +15 -7
  88. package/dist/session/tools.js +139 -31
  89. package/dist/session/tools.js.map +1 -1
  90. package/dist/transport/pi_forward_client.d.ts +29 -0
  91. package/dist/transport/pi_forward_client.js +62 -0
  92. package/dist/transport/pi_forward_client.js.map +1 -0
  93. package/dist/ui/footer.js +8 -6
  94. package/dist/ui/footer.js.map +1 -1
  95. package/docs/daemon.md +289 -0
  96. package/package.json +8 -2
  97. package/service-templates/launchd.plist.template +35 -0
  98. package/service-templates/systemd.service.template +19 -0
  99. package/skills/agent-network/SKILL.md +273 -294
@@ -1,17 +1,17 @@
1
1
  ---
2
2
  name: agent-network
3
- description: Use when you (a Pi agent) are running inside a local agent session — i.e., when the Pi footer shows "📡 <session-name>". This skill teaches how to receive messages from other agents, how to reply in a correlatable way, how to ask things of other agents without losing track, and how to act when you don't yet have the context you need.
3
+ description: Use when you (a Pi agent) are running inside a local agent session — i.e., when the Pi footer shows "📡 <session-name>". This skill teaches how to discover who's online (`list_peers`), how to send messages with delivery status (`agent_send` + ACK), how replies arrive in a future turn, how cross-PC addressing works (`<pc_label>:<peer>`), and the retry matrix for the four ACK statuses.
4
4
  ---
5
5
 
6
- # Agent Network (skill — message protocol for Pi agents)
6
+ # Agent Network (skill — event-driven message protocol)
7
7
 
8
8
  You are connected to a **local agent session** over a Unix Domain Socket.
9
- Other Pi agents running on the same machine, in the same session, can send
10
- you messages. You can send messages to them too.
9
+ Other Pi agents on the same machine, in the same session, can send you
10
+ messages and you can send messages to them.
11
11
 
12
- This skill teaches how to participate in that network reliably. Read it to
13
- the end before acting — understanding the protocol avoids silence and
14
- deadlocks.
12
+ This skill teaches how to participate in that network reliably. Read it
13
+ to the end before acting — the protocol is **event-driven**, not
14
+ request/reply, and getting that wrong leaves coordination broken.
15
15
 
16
16
  ---
17
17
 
@@ -22,8 +22,33 @@ session broker filters before delivery. You will never see messages
22
22
  intended for other agents or "broadcast with `exclude_self`".
23
23
 
24
24
  **Practical consequence**: if a message arrived in your inbox, someone
25
- wanted your attention. Don't ignore it. Don't assume it was for someone
26
- else.
25
+ wanted your attention. Don't ignore it.
26
+
27
+ ---
28
+
29
+ ## First thing to do in a new session: `list_peers`
30
+
31
+ Before you send anything, find out who's actually online. `list_peers` is
32
+ a cheap metadata tool that returns the current inventory:
33
+
34
+ ```
35
+ list_peers()
36
+ → { peers: ["backend", "frontend", "casa:agent-1", "trab:worker"] }
37
+ ```
38
+
39
+ The reply is **synchronous** (broker resolves in milliseconds — this is
40
+ not a turn of another agent). Use it freely:
41
+
42
+ - At the start of a session, to see what mesh you're in
43
+ - After receiving `peer_joined` / `peer_left` to refresh
44
+ - Before any `agent_send` whose target name is uncertain
45
+
46
+ **Entry shape:**
47
+ - `backend` → local peer (this machine, same UDS broker)
48
+ - `casa:agent-1` → cross-PC peer on the Pi labeled `casa` (this Owner's
49
+ other machine, reached through the relay)
50
+
51
+ You are excluded from the result — no need to filter yourself out.
27
52
 
28
53
  ---
29
54
 
@@ -43,36 +68,85 @@ Every message has 5 fields:
43
68
 
44
69
  | Field | Meaning |
45
70
  |---|---|
46
- | `from` | Who sent it. Use this to know who to reply to |
47
- | `to` | You (or "broadcast", or a list of names including yours) |
48
- | `id` | Unique identifier of this specific message |
49
- | `re` | If this message is a REPLY to another, echoes that one's `id`. Otherwise `null` |
50
- | `body` | Free-form content. String or JSON object, sender's choice |
71
+ | `from` | Who sent it. Use this to know who to reply to. |
72
+ | `to` | You (or "broadcast", or a list of names including yours). |
73
+ | `id` | Unique identifier of this specific message. |
74
+ | `re` | If this message is a REPLY to another, echoes that one's `id`. Otherwise `null`. |
75
+ | `body` | Free-form content. String or JSON object, sender's choice. |
51
76
 
52
77
  ---
53
78
 
54
- ## When you receive a message
79
+ ## How sending works: `agent_send` returns an ACK status
80
+
81
+ `agent_send` is the **only** tool you need to talk to peers. Every
82
+ unicast call returns a status that tells you what happened at the
83
+ recipient. **Always inspect the status — it dictates what to do next.**
84
+
85
+ | Status | What it means | What you do |
86
+ |---|---|---|
87
+ | `received` | Peer was idle, broker handed the envelope over, peer will process it in its upcoming turn. | Move on. The reply (if any) arrives later in your inbox as a normal envelope with `re=<your-send-id>`. |
88
+ | `busy` | Peer is mid-turn — envelope **dropped**. | Retry 2× with backoff (2s, 5s). If still busy, abandon or escalate to the human. You own the retry. |
89
+ | `denied` | Peer explicitly refused the message. | Do NOT retry. Report to the user. |
90
+ | `timeout` | No ACK in 5s. Transport error — broker may be down, peer disappeared mid-handshake. | Treat as transient. Retry once after a longer delay (10s+), then escalate. |
91
+ | `sent` | You used `to: "broadcast"` or an array. There is no single ACK target. | Move on. Broadcasts are fire-and-forget. |
92
+ | `refused` | The tool refused your call locally (e.g., you tried to message yourself, or you're not in a session). | Fix the call. Don't retry the same arguments. |
93
+
94
+ `busy` is the most common non-trivial answer. Two **new** messages aimed
95
+ at the same peer in quick succession will see the second one as `busy`
96
+ — the peer can only be processing one turn at a time.
97
+
98
+ **Replies are exempt from busy gating.** A message with `re=<some-id>`
99
+ (an answer to something the recipient asked) is always delivered,
100
+ because it resolves pending state at the recipient rather than starting
101
+ a new turn for them. So if you fan out questions to several peers in
102
+ the same turn, every peer's reply will reach you even when you're still
103
+ processing — they all flow into your inbox for the next turn.
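The status table above can be read as a small decision function. The following is a minimal TypeScript sketch, not code from the package: the names `SendStatus`, `Action`, and `actionFor` are hypothetical, and only the status strings and the actions they map to come from the table.

```typescript
// Hypothetical names for illustration; only the six status strings and the
// recommended reactions are taken from the table in the skill.
type SendStatus = "received" | "busy" | "denied" | "timeout" | "sent" | "refused";

type Action =
  | "move_on"                  // nothing more to do for this send
  | "retry_with_backoff"       // sender owns the retry (2s, then 5s)
  | "retry_once_then_escalate" // transient transport problem (10s+ delay)
  | "report_to_user"           // explicit refusal, never retry
  | "fix_call";                // local refusal, the arguments are wrong

function actionFor(status: SendStatus): Action {
  switch (status) {
    case "received":
    case "sent":
      return "move_on";
    case "busy":
      return "retry_with_backoff";
    case "timeout":
      return "retry_once_then_escalate";
    case "denied":
      return "report_to_user";
    case "refused":
      return "fix_call";
  }
}
```

Writing the switch over a closed union means the compiler flags any status that is added later but not handled, which is one way to keep the "always inspect the status" rule honest.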
104
+
105
+ ---
106
+
107
+ ## How receiving works: replies arrive in a future turn
108
+
109
+ You **do not block waiting** for a peer's content reply. The model is
110
+ push-based:
111
+
112
+ 1. You call `agent_send` → status `received`.
113
+ 2. Your turn continues. You might do other work, or finish.
114
+ 3. **Later** — possibly several turns later — the peer finishes its
115
+ own turn, processes your message, and sends a reply.
116
+ 4. The reply lands in your inbox as a normal envelope. You see it on
117
+ your next turn input, with `re` set to the `id` you sent earlier.
118
+
119
+ You do not need a wait/poll/sleep. The Pi runtime delivers the reply as
120
+ a new turn input the moment it arrives.
55
121
 
56
- Do this in order, don't skip steps:
122
+ ### Concrete walk-through
57
123
 
58
- 1. **Look at `body`** to understand what's being asked
59
- 2. **Look at `from`** to know who to reply to
60
- 3. **Look at `id`** — this is the `correlation_id` you'll need to echo
61
- 4. **Execute the work** described in `body`
62
- 5. **Reply** with a new message:
63
- - `to`: the `from` of the original message
64
- - `id`: a fresh UUID v7 (your reply has its own identity)
65
- - `re`: the `id` of the original message (correlation)
66
- - `from`: your name
67
- - `body`: your answer
124
+ You (name: `orq`) ask `backend` a question:
68
125
 
69
- **Always reply.** If the sender sent something that clearly expects a
70
- reply (not a broadcast announcement), silence breaks their coordination.
71
- Even errors must be replied (with `body.status: "error"`).
126
+ ```
127
+ agent_send({ to: "backend", body: { q: "what's the JWT shape?" } })
128
+ { status: "received", target: "backend" }
129
+ ```
130
+
131
+ You finish your current turn (maybe reply to the user, maybe do other
132
+ sends). Turn ends.
72
133
 
73
- ### Concrete example
134
+ A few seconds later, the runtime hands you a new turn with this input:
135
+
136
+ ```
137
+ [agent-network] message from "backend" (id=<new-id>, re=<your-id>):
138
+ { "shape": { "sub": "string", "exp": "number", "roles": ["string"] } }
139
+ (This is a reply to a previous message of yours.)
140
+ ```
74
141
 
75
- You (name: `backend`) receive:
142
+ You correlate by looking at `re` — it matches the `id` you got back
143
+ when you originally sent. Now you have your answer. Use it.
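The correlation step in the walk-through can be sketched as a lookup keyed by the `id` of each sent message. This is an illustrative TypeScript sketch under assumptions: `Envelope` mirrors the five fields listed in the skill, while `PendingQuestions` and its methods are hypothetical helpers that do not exist in the package.

```typescript
// Envelope fields as listed in the skill; `body` is left opaque.
interface Envelope {
  from: string;
  to: string | string[];
  id: string;
  re: string | null;
  body: unknown;
}

// Hypothetical helper: remembers what each sent id was asking about.
class PendingQuestions {
  private readonly pending = new Map<string, string>();

  // Call after a send returns `received`, with the id of the sent message.
  track(sentId: string, note: string): void {
    this.pending.set(sentId, note);
  }

  // Returns the note for the question this envelope answers, or null when the
  // envelope is a new message or a reply to something no longer tracked.
  resolve(incoming: Envelope): string | null {
    if (incoming.re === null) return null;
    const note = this.pending.get(incoming.re);
    if (note === undefined) return null;
    this.pending.delete(incoming.re);
    return note;
  }
}
```

A `null` result for an envelope whose `re` is set corresponds to the "late reply" case described later in the skill: the question is no longer pending, so the reply can be ignored.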
144
+
145
+ ---
146
+
147
+ ## How to REPLY when you receive a message
148
+
149
+ You receive:
76
150
 
77
151
  ```json
78
152
  {
@@ -80,350 +154,255 @@ You (name: `backend`) receive:
80
154
  "to": "backend",
81
155
  "id": "abc-uuid",
82
156
  "re": null,
83
- "body": {
84
- "task": "Implement the POST /auth/login endpoint",
85
- "context_ref": "./contracts/auth.md"
86
- }
157
+ "body": { "task": "Implement POST /auth/login" }
87
158
  }
88
159
  ```
89
160
 
90
- You do the work. You reply:
161
+ When you have something to say back (an answer, an error, a status),
162
+ you send back another envelope **with `re` set to the original `id`**:
91
163
 
92
- ```json
93
- {
94
- "from": "backend",
95
- "to": "orchestrator",
96
- "id": "xyz-uuid",
97
- "re": "abc-uuid",
98
- "body": {
99
- "status": "done",
100
- "summary": "Endpoint implemented per contract",
101
- "files_changed": ["src/auth/login.ts", "src/auth/jwt.ts"]
102
- }
103
- }
164
+ ```
165
+ agent_send({
166
+ to: "orchestrator",
167
+ body: { status: "done", files_changed: [...] },
168
+ re: "abc-uuid"
169
+ })
104
170
  ```
105
171
 
106
- The orchestrator correlates via `re === "abc-uuid"` and knows this was
107
- the reply to their task. Without `re`, they receive the message but
108
- can't match it against the question and wait until timeout.
172
+ The orchestrator correlates the reply via `re === "abc-uuid"`. Without
173
+ `re`, they receive your message but cannot match it against the
174
+ question and the coordination drifts. **Always echo `re` on a reply.**
109
175
 
110
176
  ---
111
177
 
112
- ## When you need to ask another agent (mid-task)
113
-
114
- Before replying to a task, you may discover you need info from another
115
- agent. Typical scenario: you are `frontend`, you received a task to
116
- implement the login screen, but you don't know the exact JWT shape the
117
- `backend` exposes.
118
-
119
- **Correct flow** (synchronous via request/reply):
120
-
121
- 1. Pause your current task (don't reply to the orchestrator yet)
122
- 2. Send a message to `backend`:
123
- ```json
124
- {
125
- "from": "frontend",
126
- "to": "backend",
127
- "id": "new-uuid",
128
- "re": null,
129
- "body": {
130
- "question": "What's the exact payload shape of the JWT returned by POST /auth/login?",
131
- "context": "needed for FE parsing"
132
- }
133
- }
134
- ```
135
- 3. **Wait for the reply** with `re === "new-uuid"`
136
- 4. Use the received info to complete your original task
137
- 5. Reply to the orchestrator (with `re === "<original task id>"`)
138
-
139
- The transport layer (`agent_request()`) blocks until the reply arrives
140
- or times out. Use a reasonable timeout (30–60s for simple questions).
141
-
142
- ### Limits
143
-
144
- - **Ask focused questions**, not disguised delegations. "What's the
145
- shape of X?" is fine. "Can you implement Y for me?" is not — that's
146
- work the orchestrator should distribute.
147
- - **Maximum 1 hop**: if you asked B, and B needs to ask C to answer
148
- you, B should **fail** with `status: "blocked"` and let the
149
- orchestrator re-plan. Don't chain A → B → C → ...
150
- - **Timeout mandatory**: never wait indefinitely. If no reply in 60s,
151
- fail with `status: "blocked"` in your answer to the orchestrator,
152
- citing which peer didn't respond.
178
+ ## Asking multiple peers at once
153
179
 
154
- ---
180
+ You frequently need info from several agents before you can proceed.
181
+ You can fire multiple `agent_send` in the same turn — each returns its
182
+ own ACK status. Then your turn ends, and the replies arrive in future
183
+ turns as they come in.
155
184
 
156
- ## Asking multiple agents in parallel
185
+ ```typescript
186
+ // In one turn:
187
+ agent_send({ to: "backend", body: { q: "JWT shape?" } }); // -> received
188
+ agent_send({ to: "frontend", body: { q: "theme tokens?" } }); // -> received
189
+ agent_send({ to: "infra", body: { q: "ETA for Y?" } }); // -> busy — retry next turn
190
+ ```
157
191
 
158
- You frequently need info from **several agents at once** before you can
159
- proceed. The transport supports this natively: every `agent_request()`
160
- returns a `Promise`, and each request has a unique `id` so the pending
161
- map demuxes replies correctly. **Multiple requests in flight never get
162
- confused with each other.**
192
+ In a later turn, you might see two of the three replies; the third
193
+ might arrive a turn after that. Track which `id` corresponds to which
194
+ question (the ACK return shape includes `id`, store it).
163
195
 
164
- Don't serialize what can run in parallel. Sequential = sum of all
165
- latencies. Parallel = max of latencies. With 3 agents at 200ms each:
166
- serial is 600ms, parallel is 200ms.
196
+ Don't assume replies arrive in send order. Use `re` to identify what
197
+ each reply is for.
167
198
 
168
- ### Pattern 1 — wait for all (most common)
199
+ ### When to retry
169
200
 
170
- ```typescript
171
- const [beAnswer, feAnswer] = await Promise.all([
172
- agent_request("backend", { question: "JWT shape?" }),
173
- agent_request("frontend", { question: "current theme tokens?" }),
174
- ]);
175
- // both arrived, you have both answers, continue your work
176
- ```
201
+ A `busy` peer might be free in a few seconds. The skill recommends:
177
202
 
178
- ### Pattern 2: fan-out structured
203
+ - First try returns `busy` → wait ~2s, try again
204
+ - Still `busy` → wait ~5s, try again
205
+ - Still `busy` → abandon (report to human) or escalate to orchestrator
179
206
 
180
- ```typescript
181
- const peers = ["backend", "frontend", "infra"];
182
- const answers = await Promise.all(
183
- peers.map((p) => agent_request(p, { question: "ETA for Y?" }))
184
- );
185
- // answers[i] correlates to peers[i] by array index
186
- ```
207
+ Retries are **your** responsibility as the sender. The broker does not
208
+ queue messages.
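The backoff schedule above could be wrapped in a small helper. The sketch below is an assumption-laden illustration, not the package's implementation: `send` stands in for a single `agent_send` call, `sleep` for whatever delay mechanism the caller has, and `sendWithBusyRetry` is a hypothetical name. Only the 2s and 5s delays and the "sender owns the retry" rule come from the skill.

```typescript
// Hypothetical wrapper; `send` and `sleep` are injected by the caller and are
// not package APIs. The default delays follow the schedule in the skill.
type AckStatus = "received" | "busy" | "denied" | "timeout";

async function sendWithBusyRetry(
  send: () => Promise<AckStatus>,
  sleep: (ms: number) => Promise<void>,
  delaysMs: readonly number[] = [2_000, 5_000],
): Promise<AckStatus> {
  let status = await send();
  for (const delayMs of delaysMs) {
    if (status !== "busy") break; // only `busy` is retried here
    await sleep(delayMs);
    status = await send(); // the broker does not queue, so resend explicitly
  }
  return status; // still "busy" at this point means: abandon or escalate
}
```

Injecting `sleep` keeps the helper independent of any particular timer and makes the schedule easy to check without waiting in real time.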
187
209
 
188
- ### Pattern 3 — race (first answer wins)
210
+ ---
189
211
 
190
- ```typescript
191
- const winner = await Promise.race([
192
- agent_request("worker-1", taskBody),
193
- agent_request("worker-2", taskBody),
194
- ]);
195
- // useful for redundant queries; losing requests still finish
196
- // in the background but their replies are silently dropped
197
- ```
212
+ ## Cross-PC addressing (`<pc_label>:<peer>`)
198
213
 
199
- ### Pattern 4: tolerant of partial failure
214
+ When the Owner has paired multiple Pis (e.g. "casa" and "trab"), peers
215
+ on the other machine appear in `list_peers` with a prefix:
200
216
 
201
- ```typescript
202
- const settled = await Promise.allSettled([
203
- agent_request("a", q1, 30_000),
204
- agent_request("b", q2, 30_000),
205
- ]);
206
- const okReplies = settled
207
- .filter((r) => r.status === "fulfilled")
208
- .map((r) => r.value);
209
- const failures = settled
210
- .filter((r) => r.status === "rejected")
211
- .map((r) => r.reason);
212
- // proceed with what you got; report failures honestly
217
+ ```
218
+ { peers: ["backend", "frontend", "casa:agent-1", "trab:worker"] }
213
219
  ```
214
220
 
215
- ### Limits (same as 1-on-1 questions)
216
-
217
- - **Max 1 hop still applies to fan-out.** You can ask N agents in
218
- parallel, but each of them must reply directly to you. They cannot
219
- themselves fan-out to satisfy your question. If B needs C and D to
220
- answer your question, B replies `status: "blocked"` and the
221
- orchestrator re-plans.
222
- - **Per-request timeout**: each call has its own timer. One slow agent
223
- doesn't block the others — `Promise.all` rejects fast on first
224
- failure (use `allSettled` if you need tolerance).
225
- - **Focused questions, not delegations** — same rule as 1-on-1.
221
+ To send to a remote peer, use the prefixed name verbatim:
226
222
 
227
- ### Mental model
223
+ ```
224
+ agent_send({ to: "casa:agent-1", body: { ... } })
225
+ ```
228
226
 
229
- The `pending` map inside the transport correlates replies by their `re`
230
- field against the original `id`s you sent. As long as `id`s are unique
231
- (UUID v7, guaranteed), N parallel requests stay isolated. You can have
232
- dozens in flight without confusion — though if you need that many,
233
- question whether you should be a worker instead of an orchestrator.
227
+ The transport (relay) routes it across the mesh; the cross-PC peer
228
+ receives it as if it were local. Behavior matches single-PC:
229
+ `received | busy | denied | timeout` semantics are identical.
234
230
 
235
- ---
231
+ When you **reply** to a cross-PC message, use the original sender's
232
+ `from` verbatim — it already carries the prefix:
236
233
 
237
- ## Advanced addressing
234
+ ```
235
+ Incoming: { from: "casa:sess-3", to: "agent-1", id: "abc", re: null, ... }
236
+ Reply: agent_send({ to: "casa:sess-3", body: {...}, re: "abc" })
237
+ ```
238
238
 
239
- ### Broadcast
239
+ You do NOT prefix your own outgoing `from` — that rewrite happens at the
240
+ broker layer.
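The addressing rules above can be sketched as two small helpers. This is an illustrative TypeScript sketch: `parsePeerAddress` and `replyTarget` are hypothetical names, and splitting on the first `:` is an assumption that the skill implies but does not state.

```typescript
// Hypothetical helpers; assumes the first ":" separates the PC label from the
// peer name, and that local peer names contain no ":".
interface PeerAddress {
  pcLabel: string | null; // null for a local peer
  peer: string;
}

function parsePeerAddress(name: string): PeerAddress {
  const idx = name.indexOf(":");
  if (idx === -1) return { pcLabel: null, peer: name };
  return { pcLabel: name.slice(0, idx), peer: name.slice(idx + 1) };
}

// A reply goes to the sender's `from` verbatim, prefix included; the sender
// never rewrites or strips the prefix itself.
function replyTarget(incoming: { from: string }): string {
  return incoming.from;
}
```

Parsing is only useful for display or grouping; for sending, the prefixed string from `list_peers` or from an incoming `from` is passed through unchanged.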
240
241
 
241
- `to: "broadcast"` delivers to everyone except the sender. Use rarely:
242
+ **Failure modes specific to cross-PC:**
242
243
 
243
- - Announcements: "wave 2 started", "leader changed to X"
244
- - Questions: no one replies to broadcasts, because no one knows
245
- who's supposed to answer
244
+ - `denied`: remote PC's broker has no peer by that local name (peer left
245
+ recently, or your cache is stale; call `list_peers` again)
246
+ - `timeout`: the other PC is offline or the relay is unreachable. The
247
+ relay also synthesises a `transport_error` envelope (with `from:
248
+ "_relay"`) for offline/not_authorized/bad_envelope — you'll see it in
249
+ the inbox as a reply with `re=<your-send-id>` and `body.type:
250
+ "transport_error"`. Treat exactly like timeout.
246
251
 
247
- ### Multicast
252
+ ---
248
253
 
249
- `to: ["backend", "frontend"]` delivers to the listed recipients. Useful
250
- for directed notifications, e.g.: "both of you: stop touching
251
- `contracts/` while I update it".
254
+ ## Broadcast and multicast
252
255
 
253
- Each recipient gets the same message (same `id`). If you reply, `re`
254
- correlates normally.
256
+ `to: "broadcast"` delivers to every other peer. `to: ["a", "b"]`
257
+ delivers to the listed names.
255
258
 
256
- ### Self
259
+ - ✅ Announcements: "wave 2 started", "I'm taking the lock on /contracts"
260
+ - ❌ Questions: nobody knows who's supposed to answer — replies will be
261
+ uncorrelated
257
262
 
258
- You never receive your own messages (even on broadcast). No need to
259
- filter; the broker does it.
263
+ Broadcast/multicast skip the ACK protocol entirely: the tool returns
264
+ `status: "sent"` immediately. You don't know who received it. If you
265
+ need delivery confirmation, use multiple unicast sends.
260
266
 
261
267
  ---
262
268
 
263
- ## Auto-discovery of who's in the session
269
+ ## Staying current: peer_joined / peer_left + `list_peers`
264
270
 
265
- You may receive, at some point after joining, `system` events from the
266
- broker:
271
+ You may receive, at any time, `system` events from the broker:
267
272
 
268
273
  ```json
269
- {
270
- "from": "broker",
271
- "to": "backend",
272
- "id": "uuid",
273
- "re": null,
274
- "body": {
275
- "type": "peer_joined",
276
- "name": "frontend",
277
- "capabilities": ["typescript", "react"]
278
- }
279
- }
274
+ { "from": "broker", "to": "backend", "id": "uuid", "re": null,
275
+ "body": { "type": "peer_joined", "name": "frontend" } }
280
276
  ```
281
277
 
282
278
  ```json
283
- {
284
- "from": "broker",
285
- "to": "backend",
286
- "id": "uuid",
287
- "re": null,
288
- "body": {
289
- "type": "peer_left",
290
- "name": "frontend"
291
- }
292
- }
279
+ { "from": "broker", "to": "backend", "id": "uuid", "re": null,
280
+ "body": { "type": "peer_left", "name": "frontend" } }
293
281
  ```
294
282
 
295
- Use these events to know who's online. Keep a mental list (or session
296
- state) of active peers. Don't ask a peer you know is offline.
283
+ Use these to track who's online. Don't ask peers you know are offline.
297
284
 
298
- If you need to list active peers on demand, ask the broker:
285
+ If you missed events (just woke up, or your view feels stale), call
286
+ `list_peers` — it returns the authoritative snapshot in milliseconds.
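Combining incremental events with an occasional snapshot could look like the sketch below. It is a hedged illustration: `PeerView` and `PeerEvent` are hypothetical names, the event shape follows the `peer_joined` / `peer_left` JSON examples, and the snapshot shape follows the `{ peers: [...] }` result shown for `list_peers`.

```typescript
// Hypothetical tracker; event and snapshot shapes follow the examples in the
// skill, everything else is illustrative.
type PeerEvent = { type: "peer_joined" | "peer_left"; name: string };

class PeerView {
  private peers = new Set<string>();

  // Replace the whole view with a fresh list_peers snapshot.
  refresh(snapshot: { peers: string[] }): void {
    this.peers = new Set(snapshot.peers);
  }

  // Apply one broker event incrementally.
  apply(event: PeerEvent): void {
    if (event.type === "peer_joined") this.peers.add(event.name);
    else this.peers.delete(event.name);
  }

  isOnline(name: string): boolean {
    return this.peers.has(name);
  }
}
```

Treating the snapshot as a full replacement rather than a merge means any missed `peer_left` event is corrected on the next refresh.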
299
287
 
300
- ```json
301
- {
302
- "from": "backend",
303
- "to": "broker",
304
- "id": "uuid",
305
- "re": null,
306
- "body": { "type": "list_peers" }
307
- }
308
- ```
309
-
310
- The broker replies with `body: { peers: [...] }`.
288
+ Do **not** send a `list_peers` envelope to the broker via `agent_send`.
289
+ That's the old pre-tool pattern: it worked, but the reply arrived in a
290
+ future turn and the ACK status didn't carry the peer list. The dedicated
291
+ `list_peers` tool is strictly better — synchronous, typed return.
311
292
 
312
293
  ---
313
294
 
314
295
  ## Situations where you're in doubt
315
296
 
316
- ### "I received a message I don't understand"
297
+ ### "I received a task I don't understand"
317
298
 
318
- Reply with `status: "error"` and say what was unclear. Don't go silent.
299
+ Reply with `status: "error"` in the body, echoing the original `id` in
300
+ `re`. Don't go silent.
319
301
 
320
- ```json
321
- {
322
- "from": "backend",
323
- "to": "<original sender>",
324
- "id": "...",
325
- "re": "<original id>",
326
- "body": {
327
- "status": "error",
328
- "summary": "I didn't understand the request. The 'task' field is unclear."
329
- }
330
- }
331
- ```
302
+ ### "I received a message with `re` set, but I never sent that question"
303
+
304
+ Late reply to something that already wrapped up. Ignore. Don't reply
305
+ to a reply.
332
306
 
333
- ### "I received a message with `re` set, but I never sent a request"
307
+ ### "I'm in a session but no message ever arrives"
334
308
 
335
- Probably a late reply to a request that already timed out or was
336
- cancelled. Ignore silently. Don't reply.
309
+ Normal. You only receive when someone addresses you. Keep working on
310
+ the current task. Don't poll the broker.
337
311
 
338
- ### "I received a message without `re`, but it's clearly a reply"
312
+ ### "I sent something but got `timeout`"
339
313
 
340
- Treat it as a new message (task). The sender didn't follow protocol
341
- you can't correlate it with your original request even if you sent one.
342
- If genuinely confused, reply asking: "Is this message a reply to
343
- something? I didn't see a `re`."
314
+ The broker didn't ACK in 5s. Either the broker is restarting (failover)
315
+ or the peer disappeared between registration and delivery. Retry once
316
+ after ~10s; if still timeout, treat as transport failure and escalate.
344
317
 
345
- ### "I'm in a session but no message ever arrives"
318
+ ### "The leader died (peer_left from `broker` for the leader)"
319
+
320
+ The transport layer automatically promotes another peer to leader. Your
321
+ client reconnects transparently in ~500ms. During that window,
322
+ `agent_send` may return `timeout` — retry once after a beat before
323
+ giving up.
324
+
325
+ ---
346
326
 
347
- Normal. You only receive when someone addresses you. Keep working in
348
- solo mode until someone calls. Don't poll the broker periodically.
327
+ ## Legacy: `agent_request` is deprecated
349
328
 
350
- ### "The leader died (peer_left event from `broker`)"
329
+ You may see references to a tool called `agent_request` that takes a
330
+ target + body and **blocks the entire turn** waiting for the peer's
331
+ content reply. It still works, but emits a deprecation warning on use
332
+ and will be removed.
351
333
 
352
- The transport layer will automatically promote another peer to leader.
353
- You (client) will reconnect transparently in ~500ms. During that
354
- window, your `send/request` calls may fail retry once after 1s
355
- before propagating an error.
334
+ **Why deprecated:**
335
+
336
+ - Blocks your turn while a peer thinks, which costs tokens and wall time
337
+ - No ACK signal — you can't tell `busy` from `pondering` from `gone`
338
+ - Pairs badly with parallel multi-peer questions
339
+
340
+ **Migration:** every `agent_request` call becomes an `agent_send`. The
341
+ reply arrives in a future turn (see the walk-through above). Treat the
342
+ inbox as your event loop, not your call stack.
356
343
 
357
344
  ---
358
345
 
359
346
  ## Single-page summary
360
347
 
361
- 1. You only receive what's addressed to you. Don't filter. Trust the broker.
362
- 2. Every reply carries `re` = `id` of the original message. Without it,
363
- the sender can't correlate.
364
- 3. Reply's `to` = question's `from`.
365
- 4. Always reply (success or error) when you receive something that
366
- looks like a task.
367
- 5. You can ask other agents mid-task (request/reply, synchronous), but:
368
- - Max 1 hop
369
- - Always with timeout
370
- - Read-only question ("what is X?"), not delegation ("do Y")
371
- 6. **You can ask multiple agents in parallel** with `Promise.all` —
372
- each request's `id` keeps replies isolated. Don't serialize what can
373
- run in parallel.
374
- 7. Broadcast is for announcements, not questions.
375
- 8. When confused, reply with `status: "error"` instead of staying silent.
376
-
377
- That skill is everything you need to participate in the session without
378
- breaking other agents' flow. Re-read it when in doubt.
348
+ 1. **Discover first**: `list_peers()` returns `{peers: string[]}`:
349
+ locals plus `<pc>:<peer>` cross-PC entries. Synchronous. Self-excluded.
350
+ 2. **Send tool**: `agent_send({to, body, re?})`. Returns `{status, ...}`;
351
+ always inspect.
352
+ 3. **Unicast status**: `received | busy | denied | timeout`. Retry on
353
+ `busy` with backoff; abandon on `denied`; retry once, then escalate, on `timeout`.
354
+ 4. **Broadcast/multicast**: status is `sent`. Fire-and-forget.
355
+ 5. **Replies**: come back **in a future turn** as a normal inbound
356
+ envelope with `re=<your-send-id>`. Correlate by `re`.
357
+ 6. When YOU reply to a peer, set `re` to their original `id`, and use
358
+ their `from` verbatim as your `to` (including any `<pc>:` prefix).
359
+ 7. You never receive your own messages. The broker filters.
360
+ 8. The broker does not queue. If a peer is busy, your message is
361
+ **dropped**; you own the retry.
362
+ 9. `agent_request` is deprecated. Migrate to `agent_send` + inbox.
363
+
364
+ That's the whole protocol. Re-read it when in doubt.
379
365
 
380
366
  ---
381
367
 
382
368
  ## Mini-FAQ
383
369
 
384
370
  **Q: Can I send a message to myself?**
385
- A: No. Both `agent_send` and `agent_request` refuse early with an
386
- error (`"cannot agent_send to yourself"`) when `to` matches your
387
- assigned name. The broker also drops unicast self-loops as a second
388
- line of defense. There's no upside — just do the work directly
389
- instead of round-tripping through the network.
390
-
391
- **Q: What happens to messages I sent before the recipient joined?**
392
- A: The broker drops them with a warning log. There is no persistent
393
- message queue. If you need delivery guarantees, wait for the
394
- `peer_joined` event before sending.
395
-
396
- **Q: Can I have the same name as another agent?**
397
- A: No. The broker auto-suffixes (e.g., you asked for `backend`, you
398
- get `backend#2` in `register_ack`). Use the name the broker gave you
399
- (`name_assigned`) in all your messages.
371
+ A: No. `agent_send` refuses early with `status: "refused"` when `to`
372
+ matches your assigned name. The broker also drops unicast self-loops
373
+ as a second line of defense.
374
+
375
+ **Q: What if the peer never replies to my message?**
376
+ A: Then you never see a reply. There is no implicit timeout — your
377
+ own send returned `received` (the broker handed it over), the peer
378
+ just chose not to answer. If a reply is important, the agent-network
379
+ skill in the peer's process should make them reply. If they're a
380
+ non-Pi process that just listens, you live with the silence.
381
+
382
+ **Q: How many sends can I fire in one turn?**
383
+ A: No hard limit. But if you fire 10+ unicasts, question whether
384
+ you should be a worker instead of an orchestrator. Workers answer
385
+ narrow; orchestrators dispatch wide.
386
+
387
+ **Q: Is order preserved?**
388
+ A: Per-pair, yes — the broker is FIFO. Across pairs, replies arrive
389
+ whenever the senders finish. Don't assume reply order matches send
390
+ order.
400
391
 
401
392
  **Q: Can `body` be binary?**
402
- A: Not directly. Use base64 inside a string if needed. But you're
403
- probably using this for text/JSON — don't make binary the use case.
404
-
405
- **Q: Is there message priority?**
406
- A: Not in MVP. Order is FIFO of arrival at the broker. If you need
407
- priority, open an issue.
408
-
409
- **Q: How do I discover other peers' capabilities (stack, role)?**
410
- A: `peer_joined` events carry `capabilities` in `body`. Save them when
411
- peers enter. Or ask the broker via `list_peers`.
393
+ A: Not directly. Use base64 inside a string if you must. JSON is the
394
+ intended payload.
412
395
 
413
396
  **Q: Can I disconnect any time?**
414
397
  A: Yes. The transport sends `peer_left` automatically when you close.
415
- Other agents will see you go.
416
-
417
- **Q: How many parallel requests is "too many"?**
418
- A: There's no hard limit, but if you're firing 10+ in parallel,
419
- question whether you're the wrong layer. Orchestrators dispatch wide;
420
- workers should answer narrow. If you're a worker fanning out to many
421
- peers, you may be doing the orchestrator's job.
398
+ Other agents see you go.
422
399
 
423
400
  ---
424
401
 
425
402
  ## See also
426
403
 
427
- - [`plan/19-agent-network-rfc.md`](../plan/19-agent-network-rfc.md) — motivation and context
428
- - [`plan/19-agent-network.md`](../plan/19-agent-network.md) — implementation plan
429
- - `~/.pi/remote/sessions/<name>/audit.jsonl` — append-only log of everything that passed through the broker (read-only audit). Legacy path preserved (2026-05-21 decision — no storage migration)
404
+ - [`plan/19-agent-network.md`](../plan/19-agent-network.md) — original protocol design
405
+ - [`plan/25-pc-mesh-bootstrap.md`](../plan/25-pc-mesh-bootstrap.md) — ACK protocol motivation + Wave 0 + cross-PC plans
406
+ - `~/.pi/remote/sessions/<name>/audit.jsonl` — append-only log of every
407
+ envelope that passed through the broker, with `ack_status` per entry
408
+ for cross-checking what really happened.