agent-relay 1.0.6 → 1.0.8

Files changed (72)
  1. package/README.md +18 -6
  2. package/dist/cli/index.d.ts +2 -0
  3. package/dist/cli/index.d.ts.map +1 -1
  4. package/dist/cli/index.js +344 -3
  5. package/dist/cli/index.js.map +1 -1
  6. package/dist/daemon/agent-registry.d.ts +60 -0
  7. package/dist/daemon/agent-registry.d.ts.map +1 -0
  8. package/dist/daemon/agent-registry.js +158 -0
  9. package/dist/daemon/agent-registry.js.map +1 -0
  10. package/dist/daemon/connection.d.ts +11 -1
  11. package/dist/daemon/connection.d.ts.map +1 -1
  12. package/dist/daemon/connection.js +31 -2
  13. package/dist/daemon/connection.js.map +1 -1
  14. package/dist/daemon/index.d.ts +2 -0
  15. package/dist/daemon/index.d.ts.map +1 -1
  16. package/dist/daemon/index.js +2 -0
  17. package/dist/daemon/index.js.map +1 -1
  18. package/dist/daemon/registry.d.ts +9 -0
  19. package/dist/daemon/registry.d.ts.map +1 -0
  20. package/dist/daemon/registry.js +9 -0
  21. package/dist/daemon/registry.js.map +1 -0
  22. package/dist/daemon/router.d.ts +34 -2
  23. package/dist/daemon/router.d.ts.map +1 -1
  24. package/dist/daemon/router.js +111 -1
  25. package/dist/daemon/router.js.map +1 -1
  26. package/dist/daemon/server.d.ts +1 -0
  27. package/dist/daemon/server.d.ts.map +1 -1
  28. package/dist/daemon/server.js +60 -13
  29. package/dist/daemon/server.js.map +1 -1
  30. package/dist/dashboard/public/index.html +625 -16
  31. package/dist/dashboard/server.d.ts +1 -1
  32. package/dist/dashboard/server.d.ts.map +1 -1
  33. package/dist/dashboard/server.js +125 -7
  34. package/dist/dashboard/server.js.map +1 -1
  35. package/dist/index.d.ts +1 -0
  36. package/dist/index.d.ts.map +1 -1
  37. package/dist/protocol/types.d.ts +15 -1
  38. package/dist/protocol/types.d.ts.map +1 -1
  39. package/dist/storage/adapter.d.ts +53 -0
  40. package/dist/storage/adapter.d.ts.map +1 -1
  41. package/dist/storage/adapter.js +3 -0
  42. package/dist/storage/adapter.js.map +1 -1
  43. package/dist/storage/sqlite-adapter.d.ts +58 -1
  44. package/dist/storage/sqlite-adapter.d.ts.map +1 -1
  45. package/dist/storage/sqlite-adapter.js +374 -47
  46. package/dist/storage/sqlite-adapter.js.map +1 -1
  47. package/dist/utils/project-namespace.d.ts.map +1 -1
  48. package/dist/utils/project-namespace.js +22 -1
  49. package/dist/utils/project-namespace.js.map +1 -1
  50. package/dist/wrapper/client.d.ts +22 -3
  51. package/dist/wrapper/client.d.ts.map +1 -1
  52. package/dist/wrapper/client.js +59 -9
  53. package/dist/wrapper/client.js.map +1 -1
  54. package/dist/wrapper/parser.d.ts +110 -4
  55. package/dist/wrapper/parser.d.ts.map +1 -1
  56. package/dist/wrapper/parser.js +296 -84
  57. package/dist/wrapper/parser.js.map +1 -1
  58. package/dist/wrapper/tmux-wrapper.d.ts +100 -9
  59. package/dist/wrapper/tmux-wrapper.d.ts.map +1 -1
  60. package/dist/wrapper/tmux-wrapper.js +441 -83
  61. package/dist/wrapper/tmux-wrapper.js.map +1 -1
  62. package/docs/AGENTS.md +27 -27
  63. package/docs/CHANGELOG.md +1 -1
  64. package/docs/DESIGN_V2.md +1079 -0
  65. package/docs/INTEGRATION-GUIDE.md +926 -0
  66. package/docs/PROPOSAL-trajectories.md +1582 -0
  67. package/docs/PROTOCOL.md +3 -3
  68. package/docs/SCALING_ANALYSIS.md +280 -0
  69. package/docs/TMUX_IMPLEMENTATION_NOTES.md +9 -9
  70. package/docs/TMUX_IMPROVEMENTS.md +968 -0
  71. package/docs/competitive-analysis-mcp-agent-mail.md +389 -0
  72. package/package.json +6 -2
@@ -0,0 +1,1079 @@
1
+ # Agent-Relay v2 Design Document
2
+
3
+ ## Overview
4
+
5
+ This document outlines improvements to agent-relay while preserving its core philosophy: **simple, transparent agent-to-agent communication via terminal output patterns**.
6
+
7
+ The `->relay:` pattern is the killer feature. Agents communicate naturally by just printing text. No APIs, no SDKs, no special integrations. This must remain the foundation.
8
+
9
+ ---
10
+
11
+ ## Current Pain Points
12
+
13
+ ### 1. Ephemeral Storage (`/tmp`)
14
+ - Data lives in `/tmp/agent-relay/<hash>/`
15
+ - Cleared on reboot (macOS/Linux)
16
+ - Message history lost unexpectedly
17
+
18
+ ### 2. Dead Code
19
+ - ACK/NACK protocol defined but not implemented
20
+ - Session resume tokens always return `RESUME_TOO_OLD`
21
+ - PostgreSQL adapter throws "not implemented"
22
+
23
+ ### 3. Memory Leaks
24
+ - `sentMessageHashes` Set grows unbounded
25
+ - Long-running sessions will OOM
26
+
27
+ ### 4. Polling Overhead
28
+ - `capture-pane` every 200ms consumes CPU
29
+ - Latency up to 200ms for message detection
30
+
31
+ ### 5. Fragile Injection Timing
32
+ - 1.5s idle detection is a heuristic
33
+ - Race conditions if agent outputs during injection
34
+
35
+ ---
36
+
37
+ ## Design Principles
38
+
39
+ 1. **Keep it simple** - Every feature must justify its complexity
40
+ 2. **Terminal-native** - Users stay in tmux, not a browser
41
+ 3. **Pattern-based** - `->relay:` is the API
42
+ 4. **Zero config** - Works out of the box
43
+ 5. **Debuggable** - Easy to understand what's happening
44
+
45
+ ---
46
+
47
+ ## Proposed Changes
48
+
49
+ ### Phase 1: Foundation Fixes
50
+
51
+ #### 1.1 Persistent Storage Location
52
+
53
+ Move from `/tmp` to XDG-compliant location:
54
+
55
+ ```
56
+ ~/.local/share/agent-relay/ # XDG_DATA_HOME fallback
57
+ ├── projects/
58
+ │ └── <project-hash>/
59
+ │ ├── relay.sock # Unix socket
60
+ │ ├── messages.db # SQLite
61
+ │ └── agents.json # Connected agents
62
+ └── config.json # Global settings (optional)
63
+ ```
64
+
65
+ **Migration path:**
66
+ - Check for existing `/tmp/agent-relay/` data on startup
67
+ - Offer one-time migration prompt
68
+ - Fresh installs use the new location directly (nothing to migrate)
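The location rule above could be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual code: `resolveProjectDir`, `EnvLike`, and the parameters are hypothetical names, and POSIX-style paths are assumed.

```typescript
// Hypothetical helper (not part of agent-relay): picks the per-project data directory.
type EnvLike = Record<string, string | undefined>;

function resolveProjectDir(env: EnvLike, homeDir: string, projectHash: string): string {
  const xdg = env['XDG_DATA_HOME'];
  // The XDG Base Directory spec treats relative values as invalid, so they are ignored.
  const base = xdg !== undefined && xdg.startsWith('/') ? xdg : `${homeDir}/.local/share`;
  // Trim trailing slashes so the joined path has no doubled separators.
  return `${base.replace(/\/+$/, '')}/agent-relay/projects/${projectHash}`;
}
```

Passing the environment and home directory as arguments keeps the function easy to test; a caller would likely pass `process.env` and `os.homedir()`.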
69
+
70
+ #### 1.2 Remove Dead Code
71
+
72
+ Delete these unimplemented features:
73
+
74
+ | Feature | Location | Action |
75
+ |---------|----------|--------|
76
+ | ACK handling | `connection.ts:114-116` | Remove |
77
+ | Resume tokens | `connection.ts:140-143` | Remove |
78
+ | PostgreSQL adapter | `storage/adapter.ts:152-162` | Remove |
79
+ | Topic subscriptions | `router.ts` | Keep but mark experimental |
80
+
81
+ **Protocol simplification:**
82
+ ```typescript
83
+ // Before: 10 message types
84
+ type MessageType = 'HELLO' | 'WELCOME' | 'SEND' | 'DELIVER' | 'ACK' |
85
+ 'PING' | 'PONG' | 'SUBSCRIBE' | 'UNSUBSCRIBE' | 'ERROR' | 'BYE';
86
+
87
+ // After: 6 message types
88
+ type MessageType = 'HELLO' | 'WELCOME' | 'SEND' | 'DELIVER' |
89
+ 'PING' | 'PONG' | 'ERROR';
90
+ ```
91
+
92
+ #### 1.3 Fix Memory Leak
93
+
94
+ Replace unbounded Set with LRU cache:
95
+
96
+ ```typescript
97
+ // Before
98
+ private sentMessageHashes: Set<string> = new Set();
99
+
100
+ // After
101
+ import { LRUCache } from 'lru-cache';
102
+
103
+ private sentMessageHashes = new LRUCache<string, boolean>({
104
+ max: 10000, // Max 10k unique messages tracked
105
+ ttl: 1000 * 60 * 60, // Expire after 1 hour
106
+ });
107
+ ```
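If adding the `lru-cache` dependency is undesirable, a similar bound can be sketched with a plain `Map`. The class below is a hypothetical, dependency-free approximation: it evicts the oldest inserted hash rather than the least recently used one, and `BoundedSeenSet` and its injectable clock are names invented for this example.

```typescript
// Dependency-free sketch: a size- and age-bounded "seen" set.
// Eviction is oldest-inserted-first, which approximates (but is not) true LRU.
class BoundedSeenSet {
  private readonly entries = new Map<string, number>(); // hash -> expiry time (ms)
  private readonly max: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(max: number, ttlMs: number, now: () => number = () => Date.now()) {
    this.max = max;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  has(hash: string): boolean {
    const expiry = this.entries.get(hash);
    if (expiry === undefined) return false;
    if (expiry <= this.now()) {
      this.entries.delete(hash); // expired: forget it
      return false;
    }
    return true;
  }

  add(hash: string): void {
    this.entries.delete(hash); // re-adding moves the hash to the newest position
    this.entries.set(hash, this.now() + this.ttlMs);
    if (this.entries.size > this.max) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

With `new BoundedSeenSet(10000, 1000 * 60 * 60)` the memory bound matches the limits proposed above.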
108
+
109
+ #### 1.4 Simplify Binary Protocol
110
+
111
+ Replace 4-byte length prefix with newline-delimited JSON:
112
+
113
+ ```typescript
114
+ // Before: Binary framing
115
+ [4-byte length][JSON payload]
116
+
117
+ // After: NDJSON (newline-delimited JSON)
118
+ {"v":1,"type":"SEND","to":"Bob","payload":{"body":"Hello"}}\n
119
+ {"v":1,"type":"DELIVER","from":"Alice","payload":{"body":"Hello"}}\n
120
+ ```
121
+
122
+ **Benefits:**
123
+ - Human-readable when debugging (`nc -U relay.sock`)
124
+ - Simpler parser (~20 lines vs ~50 lines)
125
+ - Standard format (NDJSON)
126
+
127
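+ **Trade-off:** Each frame must fit on one line, so the serialized JSON cannot contain raw newlines. `JSON.stringify` escapes newlines inside string values as `\n`, and we already sanitize newlines for injection (`replace(/[\r\n]+/g, ' ')`), so this is acceptable.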
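A minimal sketch of NDJSON framing, to show how small the parser can be. The names `encodeFrame` and `NdjsonDecoder` are illustrative, and the sketch assumes chunks arrive as already-decoded strings (for example after calling `socket.setEncoding('utf8')`).

```typescript
// Sketch of NDJSON framing; names are illustrative, not the package's API.
function encodeFrame(message: unknown): string {
  // JSON.stringify escapes newlines inside strings, so the only raw "\n" is the terminator.
  return JSON.stringify(message) + '\n';
}

class NdjsonDecoder {
  private buffer = '';

  // Feed one chunk from the socket; returns every message completed by this chunk.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split('\n');
    this.buffer = lines.pop() ?? ''; // keep the trailing partial line for the next chunk
    return lines
      .filter((line) => line.trim() !== '')
      .map((line) => JSON.parse(line) as unknown);
  }
}
```

A production decoder would probably also cap the buffer size and catch `JSON.parse` errors so one malformed line cannot crash the daemon.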
128
+
129
+ ---
130
+
131
+ ### Phase 2: Reliability Improvements
132
+
133
+ #### 2.1 Improved Injection Strategy
134
+
135
+ Replace time-based idle detection with input buffer detection:
136
+
137
+ ```typescript
138
+ // Current: Wait 1.5s after last output (fragile)
139
+ if (Date.now() - lastOutputTime > 1500) {
140
+ inject();
141
+ }
142
+
143
+ // Proposed: Check if input line is empty
144
+ async function isInputClear(): Promise<boolean> {
145
+ // Capture current pane content
146
+ const { stdout } = await execAsync(
147
+ `tmux capture-pane -t ${session} -p -J`
148
+ );
149
+ const lines = stdout.split('\n');
150
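+ const lastLine = lines.filter((line) => line.trim() !== '').pop() ?? ''; // last non-empty line: the captured text ends with a newline (and often blank rows)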
151
+
152
+ // Check if last line is just a prompt (no partial input)
153
+ return /^[>$%#➜]\s*$/.test(lastLine);
154
+ }
155
+ ```
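The prompt check can be separated from the tmux call so it is testable without a live session. The helper below is a sketch: `isInputClearFromCapture` is a hypothetical name, and the regex simply mirrors the one in the proposal, so it inherits the same limits (custom prompts will not match).

```typescript
// Sketch: decide whether the pane's input line is empty, given captured pane text.
// PROMPT_ONLY mirrors the regex in the proposal; the helper name is hypothetical.
const PROMPT_ONLY = /^[>$%#➜]\s*$/;

function isInputClearFromCapture(paneText: string): boolean {
  // Skip blank rows so the check looks at the last line that has content.
  const lines = paneText.split('\n').filter((line) => line.trim() !== '');
  const lastLine = lines[lines.length - 1] ?? '';
  return PROMPT_ONLY.test(lastLine);
}
```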
156
+
157
+ #### 2.2 Bracketed Paste Mode
158
+
159
+ Use bracketed paste for safer injection:
160
+
161
+ ```typescript
162
+ // Wrap injection in bracketed paste markers
163
+ const PASTE_START = '\x1b[200~';
164
+ const PASTE_END = '\x1b[201~';
165
+
166
+ async function injectSafe(text: string): Promise<void> {
167
+ await sendKeysLiteral(PASTE_START + text + PASTE_END);
168
+ await sendKeys('Enter');
169
+ }
170
+ ```
171
+
172
+ **Benefits:**
173
+ - Prevents shell interpretation of special characters
174
+ - Atomic paste (no interleaving)
175
+ - Supported by most modern terminals/shells
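One detail worth sketching: if the injected text itself contains the paste-end marker, the paste would close early and the remainder would be typed as ordinary keystrokes. The hypothetical `wrapBracketedPaste` helper below guards against that by dropping ESC bytes before wrapping; whether stripping or rejecting is the better policy is left open.

```typescript
const PASTE_START = '\x1b[200~';
const PASTE_END = '\x1b[201~';

// Sketch: build the bracketed-paste payload for a message body.
function wrapBracketedPaste(text: string): string {
  // Drop ESC bytes so the text cannot carry its own PASTE_END and close the paste early.
  const safe = text.replace(/\x1b/g, '');
  return PASTE_START + safe + PASTE_END;
}
```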
176
+
177
+ #### 2.3 Message Queue for Offline Agents
178
+
179
+ Queue messages when target agent is disconnected:
180
+
181
+ ```typescript
182
+ interface QueuedMessage {
183
+ id: string;
184
+ from: string;
185
+ to: string;
186
+ payload: SendPayload;
187
+ queuedAt: number;
188
+ attempts: number;
189
+ }
190
+
191
+ // In router.ts
192
+ if (!targetConnection || targetConnection.state !== 'ACTIVE') {
193
+ this.messageQueue.enqueue({
194
+ id: envelope.id,
195
+ from: connection.agentName,
196
+ to: envelope.to,
197
+ payload: envelope.payload,
198
+ queuedAt: Date.now(),
199
+ attempts: 0,
200
+ });
201
+
202
+ // Notify sender
203
+ connection.send({
204
+ type: 'QUEUED',
205
+ id: envelope.id,
206
+ reason: 'recipient_offline',
207
+ });
208
+ }
209
+
210
+ // On agent connect, flush queued messages
211
+ onAgentConnect(agentName: string) {
212
+ const queued = this.messageQueue.getForRecipient(agentName);
213
+ for (const msg of queued) {
214
+ this.deliverMessage(msg);
215
+ this.messageQueue.remove(msg.id);
216
+ }
217
+ }
218
+ ```
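The router code above relies on a `messageQueue` with `enqueue`, `getForRecipient`, and `remove`. A minimal in-memory sketch of that object follows. The per-recipient cap and the simplified `body` field (standing in for `SendPayload`) are assumptions made for the example; the proposal does not specify a cap or whether the queue is persisted.

```typescript
interface QueuedMessage {
  id: string;
  from: string;
  to: string;
  body: string; // simplified stand-in for the SendPayload type
  queuedAt: number;
  attempts: number;
}

// Sketch of an in-memory offline queue with a per-recipient cap (the cap is an assumption).
class MessageQueue {
  private readonly byRecipient = new Map<string, QueuedMessage[]>();
  private readonly maxPerRecipient: number;

  constructor(maxPerRecipient: number = 100) {
    this.maxPerRecipient = maxPerRecipient;
  }

  enqueue(msg: QueuedMessage): void {
    const list = this.byRecipient.get(msg.to) ?? [];
    list.push(msg);
    if (list.length > this.maxPerRecipient) list.shift(); // drop the oldest when full
    this.byRecipient.set(msg.to, list);
  }

  // Returns a copy so callers can remove entries while iterating the result.
  getForRecipient(agentName: string): QueuedMessage[] {
    return (this.byRecipient.get(agentName) ?? []).slice();
  }

  remove(id: string): void {
    this.byRecipient.forEach((list, to) => {
      const kept = list.filter((m) => m.id !== id);
      if (kept.length === 0) this.byRecipient.delete(to);
      else this.byRecipient.set(to, kept);
    });
  }
}
```

Because `getForRecipient` returns a copy, the flush loop in `onAgentConnect` can call `remove` for each delivered message safely.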
219
+
220
+ ---
221
+
222
+ ### Phase 3: Developer Experience
223
+
224
+ #### 3.1 Structured Logging
225
+
226
+ Replace scattered `console.log` with leveled logging:
227
+
228
+ ```typescript
229
+ import { createLogger } from './logger.js';
230
+
231
+ const log = createLogger('daemon');
232
+
233
+ log.info('Agent registered', { name: 'Alice', cli: 'claude' });
234
+ log.debug('Message routed', { from: 'Alice', to: 'Bob', id: '...' });
235
+ log.error('Connection failed', { error: err.message });
236
+ ```
237
+
238
+ Output format (when `DEBUG=agent-relay`):
239
+ ```
240
+ [14:23:01.234] INFO daemon: Agent registered name=Alice cli=claude
241
+ [14:23:01.456] DEBUG router: Message routed from=Alice to=Bob id=abc123
242
+ ```
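The line format shown above could be produced by a small formatter like the one below. This is a sketch only: `formatLogLine` is a hypothetical name, timestamps use local time, and field values are rendered with `String()`, which is an assumption about how nested values should print.

```typescript
type Level = 'DEBUG' | 'INFO' | 'ERROR';

// Formats one log line as "[HH:MM:SS.mmm] LEVEL scope: message key=value ...".
function formatLogLine(
  time: Date,
  level: Level,
  scope: string,
  message: string,
  fields: Record<string, unknown> = {},
): string {
  const pad = (n: number, width: number): string => String(n).padStart(width, '0');
  const stamp =
    `${pad(time.getHours(), 2)}:${pad(time.getMinutes(), 2)}:` +
    `${pad(time.getSeconds(), 2)}.${pad(time.getMilliseconds(), 3)}`;
  const extras = Object.keys(fields).map((key) => `${key}=${String(fields[key])}`);
  return [`[${stamp}]`, level, `${scope}:`, message].concat(extras).join(' ');
}
```

A `createLogger(scope)` wrapper would then bind the scope and decide, based on `DEBUG`, whether to emit each level.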
243
+
244
+ #### 3.2 Health Check Endpoint
245
+
246
+ Add simple HTTP health check (optional, disabled by default):
247
+
248
+ ```typescript
249
+ // Enable with: agent-relay up --health-port 3889
250
+ // Or: AGENT_RELAY_HEALTH_PORT=3889
251
+
252
+ GET http://localhost:3889/health
253
+ {
254
+ "status": "ok",
255
+ "uptime": 3600,
256
+ "agents": ["Alice", "Bob"],
257
+ "messages": {
258
+ "sent": 42,
259
+ "delivered": 41,
260
+ "queued": 1
261
+ }
262
+ }
263
+ ```
264
+
265
+ #### 3.3 CLI Improvements
266
+
267
+ ```bash
268
+ # Current
269
+ agent-relay up
270
+ agent-relay -n Alice claude
271
+ agent-relay status
272
+ agent-relay read <id>
273
+
274
+ # Add
275
+ agent-relay agents # List connected agents
276
+ agent-relay send Alice "Hello" # Send from CLI (for testing)
277
+ agent-relay logs # Tail daemon logs
278
+ agent-relay logs Alice # Tail agent's relay activity
279
+ ```
280
+
281
+ ---
282
+
283
+ ### Phase 4: Optional Enhancements
284
+
285
+ #### 4.1 WebSocket Streaming (Optional)
286
+
287
+ Replace polling with real-time output streaming from a read-only PTY attach (`node-pty`):
288
+
289
+ ```typescript
290
+ // Instead of polling capture-pane, attach via PTY
291
+ import { spawn } from 'node-pty';
292
+
293
+ const pty = spawn('tmux', ['attach-session', '-t', session, '-r'], {
294
+ // Read-only attach
295
+ });
296
+
297
+ pty.onData((data) => {
298
+ // Real-time output, no polling
299
+ const { commands } = parser.parse(data);
300
+ for (const cmd of commands) {
301
+ sendRelayCommand(cmd);
302
+ }
303
+ });
304
+ ```
305
+
306
+ **Trade-offs:**
307
+ | Aspect | Polling | WebSocket/PTY |
308
+ |--------|---------|---------------|
309
+ | Latency | 0-200ms | ~1-10ms |
310
+ | CPU | Higher | Lower |
311
+ | Complexity | Simple | More complex |
312
+ | Dependencies | None | node-pty |
313
+
314
+ **Recommendation:** Keep polling as default, offer streaming as `--experimental-streaming` flag.
315
+
316
+ #### 4.2 Message Encryption (Optional)
317
+
318
+ For sensitive inter-agent communication:
319
+
320
+ ```typescript
321
+ // Generate per-project key on first run
322
+ const projectKey = await generateKey();
323
+ fs.writeFileSync(keyPath, projectKey, { mode: 0o600 });
324
+
325
+ // Encrypt message bodies
326
+ const encrypted = await encrypt(payload.body, projectKey);
327
+ ```
328
+
329
+ **Scope:** Only encrypt message body, not metadata (to/from/timestamp).
330
+
331
+ ---
332
+
333
+ ## Migration Plan
334
+
335
+ ### v1.x → v2.0
336
+
337
+ 1. **Storage migration**
338
+ - Detect existing `/tmp/agent-relay/` data
339
+ - Copy to `~/.local/share/agent-relay/`
340
+ - Remove old location after successful migration
341
+
342
+ 2. **Protocol compatibility**
343
+ - v2 daemon accepts both binary and NDJSON
344
+ - v2 clients send NDJSON only
345
+ - Deprecation warning for binary clients
346
+
347
+ 3. **Breaking changes**
348
+ - Remove ACK/resume/PostgreSQL (unused)
349
+ - Change default storage location
350
+
351
+ ---
352
+
353
+ ## File Structure (Post-Refactor)
354
+
355
+ ```
356
+ src/
357
+ ├── cli/
358
+ │ └── index.ts # CLI entry point
359
+ ├── daemon/
360
+ │ ├── server.ts # Main daemon
361
+ │ ├── connection.ts # Connection handling (simplified)
362
+ │ └── router.ts # Message routing + queue
363
+ ├── wrapper/
364
+ │ ├── tmux-wrapper.ts # Agent wrapper
365
+ │ ├── parser.ts # ->relay: pattern parser
366
+ │ └── client.ts # Relay client
367
+ ├── protocol/
368
+ │ └── types.ts # Message types (reduced)
369
+ ├── storage/
370
+ │ └── sqlite-adapter.ts # SQLite only (removed abstraction)
371
+ └── utils/
372
+ ├── logger.ts # Structured logging
373
+ ├── paths.ts # XDG-compliant paths
374
+ └── lru-cache.ts # For deduplication
375
+ ```
376
+
377
+ ---
378
+
379
+ ## Success Metrics
380
+
381
+ | Metric | Current | Target |
382
+ |--------|---------|--------|
383
+ | Lines of code | ~2500 | ~2800 (with TUI) |
384
+ | Message types | 10 | 8 (added GROUP, TOPIC) |
385
+ | Max agents | ~3 practical | 10+ comfortable |
386
+ | Dependencies | 12 | 14 (adds blessed for TUI) |
387
+ | Memory (1hr session) | Unbounded | <100MB (10 agents) |
388
+ | Message detection latency | 0-200ms | 0-200ms |
389
+ | Data persistence | Lost on reboot | Permanent |
390
+ | Visibility | None | TUI dashboard |
391
+
392
+ ---
393
+
394
+ ## Phase 5: Multi-Agent Coordination (5-10 Agents)
395
+
396
+ Scaling from 2-3 agents to 5-10 requires better visibility, organization, and coordination patterns.
397
+
398
+ ### 5.1 Agent Groups
399
+
400
+ Group agents for targeted messaging:
401
+
402
+ ```bash
403
+ # Define groups in teams.json
404
+ {
405
+ "groups": {
406
+ "backend": ["ApiDev", "DbAdmin", "AuthService"],
407
+ "frontend": ["UiDev", "Stylist"],
408
+ "review": ["Reviewer", "QA"]
409
+ }
410
+ }
411
+
412
+ # Send to group
413
+ ->relay:@backend We need to refactor the user service
414
+ # → Message delivered to ApiDev, DbAdmin, AuthService
415
+
416
+ # Broadcast to all
417
+ ->relay:* Starting deployment in 5 minutes
418
+ ```
419
+
420
+ **Implementation:**
421
+ ```typescript
422
+ // In router.ts
423
+ route(from: Connection, envelope: Envelope<SendPayload>) {
424
+ const to = envelope.to;
425
+
426
+ if (to === '*') {
427
+ this.broadcast(from, envelope);
428
+ } else if (to.startsWith('@')) {
429
+ // Group message
430
+ const groupName = to.slice(1);
431
+ const members = this.groups.get(groupName) || [];
432
+ for (const member of members) {
433
+ if (member !== from.agentName) {
434
+ this.sendTo(member, envelope);
435
+ }
436
+ }
437
+ } else {
438
+ this.sendTo(to, envelope);
439
+ }
440
+ }
441
+ ```
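The same routing decision can be expressed as a pure function that returns recipient names, which makes the group and broadcast rules easy to unit-test. `resolveRecipients` and its parameters are hypothetical; treating an unknown group as an empty list follows the `|| []` in the implementation above.

```typescript
// Sketch: expand a `to` field into concrete recipient names.
function resolveRecipients(
  to: string,
  sender: string,
  groups: Map<string, string[]>,
  connected: string[],
): string[] {
  const isFanOut = to === '*' || to.startsWith('@');
  let targets: string[];
  if (to === '*') {
    targets = connected; // broadcast to everyone currently connected
  } else if (to.startsWith('@')) {
    targets = groups.get(to.slice(1)) ?? []; // unknown group -> nobody
  } else {
    targets = [to]; // direct message
  }
  // Never echo a broadcast or group message back to its sender; de-duplicate names.
  const filtered = isFanOut ? targets.filter((name) => name !== sender) : targets;
  return Array.from(new Set(filtered));
}
```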
442
+
443
+ ### 5.2 Terminal-Based Dashboard (TUI)
444
+
445
+ A simple terminal UI for monitoring all agents without leaving the terminal:
446
+
447
+ ```bash
448
+ agent-relay watch
449
+ ```
450
+
451
+ ```
452
+ ┌─ Agent Relay ──────────────────────────────────────────────┐
453
+ │ Agents (7 connected, 1 offline) │
454
+ ├─────────────────────────────────────────────────────────────┤
455
+ │ ● Coordinator idle 2m msgs: 12↑ 8↓ │
456
+ │ ● ApiDev active msgs: 5↑ 14↓ typing... │
457
+ │ ● DbAdmin active msgs: 3↑ 6↓ │
458
+ │ ● AuthService idle 45s msgs: 2↑ 4↓ │
459
+ │ ● UiDev active msgs: 8↑ 10↓ typing... │
460
+ │ ● Stylist idle 5m msgs: 1↑ 2↓ │
461
+ │ ● Reviewer active msgs: 0↑ 15↓ │
462
+ │ ○ QA offline queued: 3 │
463
+ ├─────────────────────────────────────────────────────────────┤
464
+ │ Recent Messages │
465
+ │ 14:23:01 ApiDev → DbAdmin: Can you check the user table? │
466
+ │ 14:23:15 DbAdmin → ApiDev: Schema looks correct │
467
+ │ 14:23:30 Coordinator → @backend: Stand up in 5 mins │
468
+ │ 14:24:01 UiDev → Reviewer: PR ready for auth flow │
469
+ ├─────────────────────────────────────────────────────────────┤
470
+ │ [a]ttach [s]end [g]roups [q]uit │
471
+ └─────────────────────────────────────────────────────────────┘
472
+ ```
473
+
474
+ **Features:**
475
+ - Real-time agent status (active/idle/offline)
476
+ - Message counts and queue depth
477
+ - Recent message feed
478
+ - Quick attach to any agent's tmux session
479
+ - Send messages from dashboard
480
+
481
+ **Implementation:** Use `blessed` or `ink` for terminal UI:
482
+ ```typescript
483
+ // src/cli/watch.ts
484
+ import blessed from 'blessed';
485
+
486
+ const screen = blessed.screen({ smartCSR: true });
487
+ const agentList = blessed.list({
488
+ parent: screen,
489
+ label: 'Agents',
490
+ // ...
491
+ });
492
+
493
+ // Subscribe to daemon events via WebSocket
494
+ const ws = new WebSocket(`ws+unix://${socketPath}`);
495
+ ws.on('message', (data) => {
496
+ const event = JSON.parse(data.toString()); // data arrives as a Buffer
497
+ updateDisplay(event);
498
+ });
499
+ ```
500
+
501
+ ### 5.3 Coordination Patterns
502
+
503
+ #### Pattern 1: Coordinator Agent
504
+
505
+ One agent orchestrates the others:
506
+
507
+ ```
508
+ Coordinator
509
+ ├── ->relay:ApiDev Implement /api/users endpoint
510
+ ├── ->relay:DbAdmin Create users table
511
+ └── ->relay:UiDev Build user profile page
512
+
513
+ ApiDev → Coordinator: Done, endpoint at /api/users
514
+ DbAdmin → Coordinator: Table created with schema...
515
+ UiDev → Coordinator: Need API spec first
516
+
517
+ Coordinator → UiDev: Here's the spec: GET /api/users...
518
+ ```
519
+
520
+ #### Pattern 2: Pipeline
521
+
522
+ Agents pass work sequentially:
523
+
524
+ ```
525
+ Developer → Reviewer → QA → Deployer
526
+
527
+ ->relay:Reviewer PR #123 ready for review
528
+
529
+ ->relay:QA Review passed, ready for testing
530
+
531
+ ->relay:Deployer Tests passed, deploy when ready
532
+ ```
533
+
534
+ #### Pattern 3: Pub/Sub Topics
535
+
536
+ Agents subscribe to topics of interest:
537
+
538
+ ```bash
539
+ # Agent subscribes to topic
540
+ ->relay:+security-alerts
541
+
542
+ # Any agent can publish
543
+ ->relay:#security-alerts Found SQL injection in auth.ts
544
+
545
+ # All subscribers receive the message
546
+ ```
547
+
548
+ **Implementation:**
549
+ ```typescript
550
+ // Topic syntax:
551
+ //   ->relay:+topic-name       Subscribe
552
+ //   ->relay:-topic-name       Unsubscribe
553
+ //   ->relay:#topic-name msg   Publish to topic
554
+
555
+ // In parser.ts
556
+ const TOPIC_SUBSCRIBE = /^->relay:\+(\S+)$/;
557
+ const TOPIC_UNSUBSCRIBE = /^->relay:-(\S+)$/;
558
+ const TOPIC_PUBLISH = /^->relay:#(\S+)\s+(.+)$/;
559
+ ```
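A sketch of how the three regexes could be combined into one parser step. The `TopicCommand` type and `parseTopicCommand` function are hypothetical names; only the regexes come from the proposal.

```typescript
const TOPIC_SUBSCRIBE = /^->relay:\+(\S+)$/;
const TOPIC_UNSUBSCRIBE = /^->relay:-(\S+)$/;
const TOPIC_PUBLISH = /^->relay:#(\S+)\s+(.+)$/;

type TopicCommand =
  | { kind: 'subscribe'; topic: string }
  | { kind: 'unsubscribe'; topic: string }
  | { kind: 'publish'; topic: string; body: string };

// Returns the parsed command, or null when the line is not a topic command.
function parseTopicCommand(line: string): TopicCommand | null {
  const text = line.trim();
  const sub = TOPIC_SUBSCRIBE.exec(text);
  if (sub) return { kind: 'subscribe', topic: sub[1] ?? '' };
  const unsub = TOPIC_UNSUBSCRIBE.exec(text);
  if (unsub) return { kind: 'unsubscribe', topic: unsub[1] ?? '' };
  const pub = TOPIC_PUBLISH.exec(text);
  if (pub) return { kind: 'publish', topic: pub[1] ?? '', body: pub[2] ?? '' };
  return null;
}
```

Ordinary direct messages such as `->relay:Bob hi` fall through to `null`, so the existing message parser can handle them unchanged.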
560
+
561
+ ### 5.4 Tmux Layout Helper
562
+
563
+ Quickly set up multi-agent tmux layouts:
564
+
565
+ ```bash
566
+ # Create tiled layout with all agents
567
+ agent-relay layout tile
568
+
569
+ # Create layout from teams.json
570
+ agent-relay layout teams
571
+
572
+ # Custom layout
573
+ agent-relay layout grid 3x3
574
+ ```
575
+
576
+ **Generated tmux layout:**
577
+ ```
578
+ ┌─────────────┬─────────────┬─────────────┐
579
+ │ Coordinator │ ApiDev │ DbAdmin │
580
+ ├─────────────┼─────────────┼─────────────┤
581
+ │ AuthService │ UiDev │ Stylist │
582
+ ├─────────────┼─────────────┼─────────────┤
583
+ │ Reviewer │ QA │ (empty) │
584
+ └─────────────┴─────────────┴─────────────┘
585
+ ```
586
+
587
+ **Implementation:**
588
+ ```bash
589
+ #!/bin/bash
590
+ # agent-relay layout tile
591
+ AGENTS=$(agent-relay agents --json | jq -r '.[].name')
592
+ COUNT=$(echo "$AGENTS" | wc -l)
593
+
594
+ tmux new-session -d -s relay-overview
595
+ for agent in $AGENTS; do
596
+ tmux split-window -t relay-overview
597
+ tmux send-keys -t relay-overview "TMUX= tmux attach -t 'relay-$agent-*'" Enter # clearing TMUX permits the nested attach
598
+ done
599
+ tmux select-layout -t relay-overview tiled
600
+ tmux attach -t relay-overview
601
+ ```
602
+
603
+ ### 5.5 Agent Roles & Capabilities
604
+
605
+ Define what each agent can do:
606
+
607
+ ```json
608
+ // teams.json
609
+ {
610
+ "agents": {
611
+ "Coordinator": {
612
+ "role": "coordinator",
613
+ "canMessage": ["*"],
614
+ "canReceiveFrom": ["*"]
615
+ },
616
+ "ApiDev": {
617
+ "role": "developer",
618
+ "groups": ["backend"],
619
+ "canMessage": ["Coordinator", "@backend", "Reviewer"],
620
+ "canReceiveFrom": ["Coordinator", "@backend"]
621
+ },
622
+ "Reviewer": {
623
+ "role": "reviewer",
624
+ "canMessage": ["Coordinator", "QA"],
625
+ "canReceiveFrom": ["*"]
626
+ }
627
+ }
628
+ }
629
+ ```
630
+
631
+ **Use cases:**
632
+ - Prevent junior agents from messaging senior ones directly
633
+ - Ensure QA only receives from Reviewer (enforced pipeline)
634
+ - Coordinator can message anyone
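One way the router might enforce these rules is sketched below for direct agent-to-agent messages. Several points are assumptions rather than part of the proposal: an `@group` pattern matches agents that list that group, a message needs both the sender's `canMessage` and the recipient's `canReceiveFrom` to allow it, and agents missing from the config are denied. `AgentRule`, `matchesPattern`, and `canSend` are hypothetical names.

```typescript
interface AgentRule {
  groups?: string[];
  canMessage: string[];
  canReceiveFrom: string[];
}

// Does `pattern` ("*", "@group", or an agent name) cover `agent`?
function matchesPattern(pattern: string, agent: string, rules: Record<string, AgentRule>): boolean {
  if (pattern === '*') return true;
  if (pattern.startsWith('@')) {
    const memberOf = rules[agent]?.groups ?? [];
    return memberOf.indexOf(pattern.slice(1)) !== -1;
  }
  return pattern === agent;
}

// Allowed only if the sender may send AND the recipient may receive.
// Agents missing from the config are denied (an assumption; the proposal does not say).
function canSend(from: string, to: string, rules: Record<string, AgentRule>): boolean {
  const sender = rules[from];
  const recipient = rules[to];
  if (!sender || !recipient) return false;
  return (
    sender.canMessage.some((p) => matchesPattern(p, to, rules)) &&
    recipient.canReceiveFrom.some((p) => matchesPattern(p, from, rules))
  );
}
```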
635
+
636
+ ### 5.6 Message Priority & Filtering
637
+
638
+ With more agents, message prioritization becomes important:
639
+
640
+ ```bash
641
+ # Urgent message (interrupts immediately)
642
+ ->relay:!ApiDev Production is down, check auth service
643
+
644
+ # Normal message (waits for idle)
645
+ ->relay:ApiDev When you have time, review this PR
646
+
647
+ # Low priority (batched, delivered during quiet periods)
648
+ ->relay:?ApiDev FYI: Updated the style guide
649
+ ```
650
+
651
+ **Injection behavior:**
652
+ | Priority | Syntax | Behavior |
653
+ |----------|--------|----------|
654
+ | Urgent | `->relay:!Name` | Inject immediately, even if busy |
655
+ | Normal | `->relay:Name` | Wait for idle (current behavior) |
656
+ | Low | `->relay:?Name` | Batch and deliver during long idle |
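Parsing the priority prefix is a small step before routing. The sketch below assumes the prefix is always the first character of the target, as in the table; `Priority` and `parseTarget` are names invented for the example.

```typescript
type Priority = 'urgent' | 'normal' | 'low';

// Splits the target of `->relay:<target>` into a priority and an agent name.
function parseTarget(target: string): { priority: Priority; name: string } {
  if (target.startsWith('!')) return { priority: 'urgent', name: target.slice(1) };
  if (target.startsWith('?')) return { priority: 'low', name: target.slice(1) };
  return { priority: 'normal', name: target };
}
```

The wrapper could then pick an injection strategy from `priority` while the router continues to address messages by `name`.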
657
+
658
+ ### 5.7 Status Broadcasts
659
+
660
+ Agents automatically announce state changes:
661
+
662
+ ```typescript
663
+ // Automatic status messages
664
+ ->relay:* STATUS: ApiDev is now idle
665
+ ->relay:* STATUS: Reviewer completed task (closed PR #123)
666
+ ->relay:* STATUS: QA disconnected
667
+
668
+ // Agents can filter these
669
+ // In wrapper config:
670
+ {
671
+ "hideStatusMessages": true, // Don't inject STATUS broadcasts
672
+ "showStatusInLogs": true // But log them for visibility
673
+ }
674
+ ```
675
+
676
+ ---
677
+
678
+ ## Why They Scale Better (And How We Can Too)
679
+
680
+ ### The Scaling Problem
681
+
682
+ With 2-3 agents, our current approach works well:
683
+ - Open 2-3 terminal tabs
684
+ - Switch between them manually
685
+ - Remember who's doing what
686
+
687
+ With 5-10 agents, this breaks down:
688
+
689
+ | Problem | Impact at 5-10 Agents |
690
+ |---------|----------------------|
691
+ | **No visibility** | Can't see what all agents are doing at once |
692
+ | **No status** | Don't know if agent is busy, idle, or stuck |
693
+ | **Lost context** | Forget which agent is working on what |
694
+ | **Message chaos** | Too many messages to track manually |
695
+ | **Terminal sprawl** | 10 tabs is unmanageable |
696
+
697
+ ### Why Their Approach Scales
698
+
699
+ ```
700
+ ┌─────────────────────────────────────────────────────────────────┐
701
+ │ THEIR ARCHITECTURE │
702
+ ├─────────────────────────────────────────────────────────────────┤
703
+ │ │
704
+ │ ┌─────────────────────────────────────────────────────────┐ │
705
+ │ │ BROWSER DASHBOARD │ │
706
+ │ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
707
+ │ │ │ Agent 1 │ │ Agent 2 │ │ Agent 3 │ │ Agent 4 │ ... │ │
708
+ │ │ │ ● active│ │ ○ idle │ │ ● active│ │ ✗ error │ │ │
709
+ │ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │ │
710
+ │ │ │ │
711
+ │ │ [Live message feed] [Inbox: 3 unread] [Agent graph] │ │
712
+ │ └─────────────────────────────────────────────────────────┘ │
713
+ │ │ │
714
+ │ Single pane of glass │
715
+ │ │
716
+ └─────────────────────────────────────────────────────────────────┘
717
+
718
+ Key insight: ONE place to see EVERYTHING
719
+ ```
720
+
721
+ Their specific advantages at scale:
722
+
723
+ | Feature | Why It Helps at Scale |
724
+ |---------|----------------------|
725
+ | **Dashboard** | See all 10 agents at once without switching |
726
+ | **Activity state** | Know instantly who's busy vs idle |
727
+ | **Message inbox** | Messages don't disappear into terminal history |
728
+ | **Agent discovery** | Auto-finds agents, no manual tracking |
729
+ | **Persistent storage** | Query historical messages anytime |
730
+
731
+ ### How We Keep Our Strengths AND Scale
732
+
733
+ The goal: **Single pane of glass, but in the terminal**
734
+
735
+ ```
736
+ ┌─────────────────────────────────────────────────────────────────┐
737
+ │ OUR IMPROVED ARCHITECTURE │
738
+ ├─────────────────────────────────────────────────────────────────┤
739
+ │ │
740
+ │ ┌─────────────────────────────────────────────────────────┐ │
741
+ │ │ TUI DASHBOARD (agent-relay watch) │ │
742
+ │ │ │ │
743
+ │ │ Agents: Status: Messages: │ │
744
+ │ │ ● Coordinator active 12↑ 8↓ │ │
745
+ │ │ ● ApiDev typing... 5↑ 14↓ │ │
746
+ │ │ ● DbAdmin idle 30s 3↑ 6↓ │ │
747
+ │ │ ○ QA offline queued: 3 │ │
748
+ │ │ │ │
749
+ │ │ [Press 'a' to attach, 's' to send, 'q' to quit] │ │
750
+ │ └─────────────────────────────────────────────────────────┘ │
751
+ │ │ │
752
+ │ │ 'a' to attach │
753
+ │ ▼ │
754
+ │ ┌─────────────────────────────────────────────────────────┐ │
755
+ │ │ NATIVE TMUX SESSION │ │
756
+ │ │ │ │
757
+ │ │ claude> Working on the API endpoint... │ │
758
+ │ │ ->relay:DbAdmin Need the users table schema │ │
759
+ │ │ │ │
760
+ │ └─────────────────────────────────────────────────────────┘ │
761
+ │ │ │
762
+ │ │ Ctrl+B d to detach │
763
+ │ ▼ │
764
+ │ Back to TUI dashboard │
765
+ │ │
766
+ └─────────────────────────────────────────────────────────────────┘
767
+
768
+ Key insight: TUI for overview, native tmux for work
769
+ ```
770
+
771
+ ### Specific Scaling Improvements
772
+
773
+ #### 1. Daemon Event Stream
774
+
775
+ The daemon must broadcast events, not just route messages:
776
+
777
+ ```typescript
778
+ // NEW: Daemon broadcasts events to all listeners
779
+ interface DaemonEvent {
780
+ type: 'agent_connected' | 'agent_disconnected' | 'agent_active' |
781
+ 'agent_idle' | 'message_sent' | 'message_delivered' | 'message_queued';
782
+ timestamp: number;
783
+ data: Record<string, unknown>;
784
+ }
785
+
786
+ // In daemon/server.ts
787
+ class Daemon {
788
+ private eventSubscribers: Set<Connection> = new Set();
789
+
790
+ broadcast(event: DaemonEvent): void {
791
+ const envelope = { type: 'EVENT', event };
792
+ for (const subscriber of this.eventSubscribers) {
793
+ subscriber.send(envelope);
794
+ }
795
+ }
796
+
797
+ // Called when agent output detected
798
+ onAgentActivity(agentName: string): void {
799
+ this.broadcast({
800
+ type: 'agent_active',
801
+ timestamp: Date.now(),
802
+ data: { agent: agentName }
803
+ });
804
+ }
805
+ }
806
+ ```
807
+
808
+ #### 2. Activity Reporting from Wrapper
809
+
810
+ Wrappers must report activity state to daemon:
811
+
812
+ ```typescript
813
+ // In tmux-wrapper.ts
814
+ private reportActivity(): void {
815
+ const now = Date.now();
816
+ const timeSinceOutput = now - this.lastOutputTime;
817
+
818
+ let state: 'active' | 'idle' | 'typing';
819
+ if (timeSinceOutput < 1000) {
820
+ state = 'active';
821
+ } else if (this.detectTypingIndicator()) {
822
+ state = 'typing'; // Agent is thinking/working
823
+ } else if (timeSinceOutput < 30000) {
824
+ state = 'idle';
825
+ } else {
826
+ state = 'idle'; // 30s or more without output is currently reported the same as a short idle
827
+ }
828
+
829
+ // Only send if state changed
830
+ if (state !== this.lastReportedState) {
831
+ this.client.sendStatus(state);
832
+ this.lastReportedState = state;
833
+ }
834
+ }
835
+
836
+ private detectTypingIndicator(): boolean {
837
+ // Claude Code shows "[1/418]" when thinking
838
+ // Detect this pattern in recent output
839
+ return /\[\d+\/\d+\]/.test(this.recentOutput);
840
+ }
841
+ ```
842
+
843
+ #### 3. TUI Dashboard Implementation
844
+
845
+ ```typescript
846
+ // src/cli/watch.ts
847
+ import blessed from 'blessed';
848
+
849
+ export async function watchCommand(socketPath: string): Promise<void> {
850
+ const screen = blessed.screen({ smartCSR: true });
851
+
852
+ // Agent list panel
853
+ const agentList = blessed.list({
854
+ parent: screen,
855
+ label: ' Agents ',
856
+ top: 0,
857
+ left: 0,
858
+ width: '50%',
859
+ height: '60%',
860
+ border: { type: 'line' },
861
+ style: {
862
+ selected: { bg: 'blue' }
863
+ },
864
+ keys: true,
865
+ vi: true,
866
+ });
867
+
868
+ // Message feed panel
869
+ const messageFeed = blessed.log({
870
+ parent: screen,
871
+ label: ' Messages ',
872
+ top: 0,
873
+ right: 0,
874
+ width: '50%',
875
+ height: '60%',
876
+ border: { type: 'line' },
877
+ scrollable: true,
878
+ });
879
+
880
+ // Status bar
881
+ const statusBar = blessed.box({
882
+ parent: screen,
883
+ bottom: 0,
884
+ height: 3,
885
+ content: ' [a]ttach [s]end [r]efresh [q]uit ',
886
+ });
887
+
888
+ // Connect to daemon event stream
889
+ const client = new RelayClient({ socketPath, subscribe: true });
890
+
891
+ client.onEvent = (event: DaemonEvent) => {
892
+ switch (event.type) {
893
+ case 'agent_connected':
894
+ updateAgentList();
895
+ break;
896
+ case 'message_sent':
897
+ messageFeed.log(`${event.data.from} → ${event.data.to}: ${event.data.preview}`);
898
+ break;
899
+ // ...
900
+ }
901
+ screen.render();
902
+ };
903
+
904
+ // Keyboard handlers
905
+ screen.key(['a'], () => attachToSelected(screen, socketPath));
906
+ screen.key(['s'], () => showSendDialog());
907
+ screen.key(['q'], () => process.exit(0));
908
+
909
+ screen.render();
910
+ }
911
+
912
+ function attachToSelected(screen: blessed.Widgets.Screen, socketPath: string): void {
913
+ const agent = getSelectedAgent();
914
+ // Detach from blessed, attach to tmux
915
+ screen.destroy();
916
+ execSync(`tmux attach-session -t relay-${agent}-*`, { stdio: 'inherit' });
917
+ // When user detaches (Ctrl+B d), restart watch
918
+ watchCommand(socketPath);
919
+ }
920
+ ```
921
+
922
+ #### 4. Message History Query
923
+
924
+ ```typescript
925
+ // src/cli/index.ts
926
+ program
927
+ .command('history')
928
+ .description('Show message history')
929
+ .option('-n <count>', 'Number of messages', '20')
930
+ .option('-f, --from <agent>', 'Filter by sender')
931
+ .option('-t, --to <agent>', 'Filter by recipient')
932
+ .option('--since <time>', 'Since time (e.g., "1h", "2024-01-01")')
933
+ .action(async (options) => {
934
+ const messages = await queryMessages({
935
+ limit: parseInt(options.n, 10),
936
+ from: options.from,
937
+ to: options.to,
938
+ since: parseTime(options.since),
939
+ });
940
+
941
+ for (const msg of messages) {
942
+ console.log(`${msg.timestamp} ${msg.from} → ${msg.to}: ${msg.body.slice(0, 80)}`);
943
+ }
944
+ });
945
+ ```
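The `--since` option above accepts both relative values ("1h") and absolute dates ("2024-01-01"). A minimal sketch of how a `parseTime` helper might handle both forms follows. The helper body, the set of unit letters, and the choice to return epoch milliseconds are all assumptions, since the source only shows the call site.

```typescript
// Hypothetical helper: the name matches the call site above, the body is an assumption.
const UNIT_MS: Record<string, number> = {
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

function parseTime(input: string | undefined, now: number = Date.now()): number | undefined {
  // --since omitted: no lower bound on the query.
  if (input === undefined) return undefined;

  // Relative form: an integer followed by one unit letter, e.g. "90s" or "1h".
  const relative = /^(\d+)([smhd])$/.exec(input.trim());
  if (relative) {
    return now - Number(relative[1]) * UNIT_MS[relative[2]];
  }

  // Absolute form: anything Date.parse understands, e.g. "2024-01-01".
  const absolute = Date.parse(input);
  if (Number.isNaN(absolute)) {
    throw new Error(`Unrecognized --since value: ${input}`);
  }
  return absolute;
}
```

Returning `undefined` when the flag is absent lets the query layer skip the time filter instead of guessing a default window.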
946
+
947
+ #### 5. Agent Summary Command
948
+
949
+ ```bash
950
+ $ agent-relay agents
951
+
952
+ NAME          STATUS    MESSAGES    LAST ACTIVE
953
+ ───────────────────────────────────────────────
954
+ Coordinator   active    12↑ 8↓      now
955
+ ApiDev        typing    5↑ 14↓      now
956
+ DbAdmin       idle      3↑ 6↓       30s ago
957
+ AuthService   idle      2↑ 4↓       2m ago
958
+ QA            offline   queued: 3   5m ago
959
+
960
+ Total: 5 agents (1 active, 1 typing, 2 idle, 1 offline)
961
+ ```
962
+
963
+ Implementation:
964
+ ```typescript
965
+ program
966
+ .command('agents')
967
+ .description('List connected agents with status')
968
+ .option('--json', 'Output as JSON')
969
+ .action(async (options) => {
970
+ const agents = await getAgentStatus(socketPath);
971
+
972
+ if (options.json) {
973
+ console.log(JSON.stringify(agents, null, 2));
974
+ return;
975
+ }
976
+
977
+ console.log('NAME          STATUS    MESSAGES    LAST ACTIVE');
978
+ console.log('───────────────────────────────────────────────');
979
+
980
+ for (const agent of agents) {
981
+ const status = agent.status.padEnd(9);
982
+ const msgs = `${agent.sent}↑ ${agent.received}↓`.padEnd(11);
983
+ const lastActive = formatRelativeTime(agent.lastActive);
984
+ console.log(`${agent.name.padEnd(13)} ${status} ${msgs} ${lastActive}`);
985
+ }
986
+ });
987
+ ```
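The `agents` command above relies on a `formatRelativeTime` helper to produce values such as `now`, `30s ago`, and `2m ago`. One way it could be written is sketched below, assuming `lastActive` is an epoch timestamp in milliseconds and that anything under five seconds counts as `now`. Both of those are assumptions, not details taken from the source.

```typescript
// Hypothetical helper: assumes lastActive is epoch milliseconds.
function formatRelativeTime(lastActive: number, now: number = Date.now()): string {
  // Clamp to zero so a slightly fast clock never yields a negative age.
  const seconds = Math.max(0, Math.floor((now - lastActive) / 1000));

  if (seconds < 5) return 'now'; // threshold is an assumption
  if (seconds < 60) return `${seconds}s ago`;

  const minutes = Math.floor(seconds / 60);
  if (minutes < 60) return `${minutes}m ago`;

  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h ago`;

  return `${Math.floor(hours / 24)}d ago`;
}
```

Taking `now` as a parameter keeps the function deterministic, which makes the table output easy to test.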
988
+
989
+ ### Scaling Comparison: Before vs After
990
+
991
+ | Capability | Current | After Improvements |
992
+ |------------|---------|-------------------|
993
+ | **See all agents** | Switch tabs manually | `agent-relay watch` TUI |
994
+ | **Agent status** | None | active/idle/typing/offline |
995
+ | **Message history** | Lost in scrollback | `agent-relay history` |
996
+ | **Quick attach** | Remember session names | Press 'a' in TUI |
997
+ | **Send from CLI** | Must be in session | `agent-relay send Bob "msg"` |
998
+ | **Agent list** | `tmux ls \| grep relay` | `agent-relay agents` |
999
+
1000
+ ### Architecture Changes for Scale
1001
+
1002
+ ```
1003
+ CURRENT (doesn't scale):
1004
+
1005
+ Wrapper 1 ──┐
1006
+ Wrapper 2 ──┼──► Daemon ──► SQLite
1007
+ Wrapper 3 ──┘
1008
+
1009
+ └──► User switches between terminal tabs
1010
+
1011
+
1012
+ IMPROVED (scales to 10+):
1013
+
1014
+ Wrapper 1 ──┐ ┌──► TUI Dashboard
1015
+ Wrapper 2 ──┼──► Daemon ◄────────┼──► CLI queries
1016
+ Wrapper 3 ──┤ │ └──► Health checks
1017
+ ... │ │
1018
+ Wrapper 10 ─┘ ▼
1019
+ SQLite
1020
+
1021
+ └──► Persistent history
1022
+ └──► Agent registry
1023
+ └──► Message queue
1024
+ ```
1025
+
1026
+ Key changes:
1027
+ 1. **Daemon becomes event hub** - broadcasts state changes
1028
+ 2. **Wrappers report status** - not just messages
1029
+ 3. **TUI provides overview** - single pane of glass
1030
+ 4. **CLI provides queries** - history, agents, send
1031
+ 5. **Storage is durable** - survives restarts
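The "event hub" role in the first key change can be illustrated with a small in-process sketch. The `EventHub` class, its method names, and the `name` field on `agent_connected` are hypothetical. Only the event type names and the `from`/`to`/`preview` fields come from the TUI example earlier in this document. A real daemon would fan events out over the Unix socket rather than call listeners directly.

```typescript
// Event shapes: type names follow the TUI example; the `name` field is an assumption.
type DaemonEvent =
  | { type: 'agent_connected'; data: { name: string } }
  | { type: 'message_sent'; data: { from: string; to: string; preview: string } };

type Listener = (event: DaemonEvent) => void;

// Hypothetical hub: subscribers (TUI, CLI queries, health checks) register a callback.
class EventHub {
  private listeners = new Set<Listener>();

  // Returns an unsubscribe function so a client can detach cleanly.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Delivers one state change to every current subscriber.
  broadcast(event: DaemonEvent): void {
    this.listeners.forEach((listener) => listener(event));
  }
}
```

With this shape, the TUI and the CLI query path would both be ordinary subscribers, so adding another consumer does not require changes to the wrappers.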
1032
+
1033
+ ---
1034
+
1035
+ ## Non-Goals
1036
+
1037
+ - **Browser Dashboard**: Out of scope. TUI (`agent-relay watch`) provides visibility.
1038
+ - **Multi-host support**: Single-machine focus. Use SSH to reach a remote host.
1039
+ - **Agent memory/RAG**: Separate concern. Agents manage their own context.
1040
+ - **Authentication**: Unix socket permissions are sufficient for local use.
1041
+
1042
+ ---
1043
+
1044
+ ## Timeline
1045
+
1046
+ | Phase | Scope | Effort |
1047
+ |-------|-------|--------|
1048
+ | Phase 1 | Foundation fixes | 2-3 days |
1049
+ | Phase 2 | Reliability | 2-3 days |
1050
+ | Phase 3 | DX improvements | 1-2 days |
1051
+ | Phase 4 | Optional enhancements | As needed |
1052
+ | Phase 5 | Multi-agent coordination | 3-5 days |
1053
+
1054
+ ### Phase 5 Breakdown
1055
+
1056
+ | Feature | Effort | Priority |
1057
+ |---------|--------|----------|
1058
+ | Agent groups (`->relay:@groupname`) | 1 day | P1 |
1059
+ | TUI dashboard (`agent-relay watch`) | 2 days | P1 |
1060
+ | Tmux layout helper | 0.5 day | P2 |
1061
+ | Message priority (`!`, `?`) | 0.5 day | P2 |
1062
+ | Pub/sub topics | 1 day | P3 |
1063
+ | Agent roles/permissions | 1 day | P3 |
1064
+
1065
+ ---
1066
+
1067
+ ## Open Questions
1068
+
1069
+ 1. **NDJSON vs Binary**: Is the simplicity worth losing multi-line message support?
1070
+ - Mitigation: JSON string escaping encodes newlines as `\n`, so a multi-line body still fits on one NDJSON line (already done)
1071
+
1072
+ 2. **Polling interval**: Should 200ms be configurable?
1073
+ - Proposal: Add `--poll-interval` flag, default 200ms
1074
+
1075
+ 3. **Message TTL**: How long to queue messages for offline agents?
1076
+ - Proposal: 24 hours default, configurable
1077
+
1078
+ 4. **Backward compatibility**: How long to support v1 binary protocol?
1079
+ - Proposal: One minor version (v2.1 removes it)
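On the first open question, the sketch below shows why NDJSON framing does not lose multi-line bodies: `JSON.stringify` escapes embedded newlines, so each message occupies exactly one line on the wire. The `Envelope` shape and the function names are assumptions for illustration and are not the package's actual wire format.

```typescript
// Hypothetical message shape; field names mirror the history example above.
interface Envelope {
  from: string;
  to: string;
  body: string;
}

// One message per line: JSON.stringify escapes any newline inside body as \n.
function encodeLine(msg: Envelope): string {
  return JSON.stringify(msg) + '\n';
}

// Splits a buffer of complete lines back into messages.
function decodeLines(buffer: string): Envelope[] {
  return buffer
    .split('\n')
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as Envelope);
}
```

This sketch assumes the buffer ends on a line boundary. A streaming reader would also need to hold back a trailing partial line until the rest arrives.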