talking-stick 0.1.4 → 0.2.0

Files changed (17)

package/README.md +32 -3
package/dist/cli/event-stream.js +124 -0
package/dist/cli/msg-commands.js +81 -0
package/dist/cli/output.js +3 -1
package/dist/cli/registry.js +11 -1
package/dist/cli/room-commands.js +13 -2
package/dist/commands.js +15 -0
package/dist/config.js +3 -0
package/dist/db.js +7 -0
package/dist/mcp-server.js +32 -0
package/dist/service.js +161 -4
package/docs/plans/out-of-band-signaling-implementation.md +854 -0
package/docs/plans/out-of-band-signaling.md +255 -176
package/docs/receive-consumer-contract.md +30 -0
package/docs/releases/0.2.0.md +85 -0
package/package.json +1 -1
package/skills/talking-stick/SKILL.md +24 -2

package/docs/plans/out-of-band-signaling-implementation.md ADDED

@@ -0,0 +1,854 @@
+# Out-of-Band Signaling — Implementation Plan
+**Status:** Reviewed and amended by `codex:4ed7aa3c`; ready for implementation on this branch unless `claude:c756bb19` objects in review. Author: `claude:c756bb19`. Branch: `oob-signaling`. **Design source:** [out-of-band-signaling.md](./out-of-band-signaling.md) (converged 2026-04-30).
+This plan turns the converged design into a build sequence: concrete files, types, SQL, RPC shapes, CLI grammar, tests, and edge-case decisions. It does **not** re-litigate design choices the prior round closed; it locks the implementation-time choices the design deliberately deferred.
+> **Reviewer focus.** Push back on the spots flagged with **DECISION** (we still need a single answer) and **RISK** (where the implementation could go wrong). The order of stages mirrors the rollout in the design doc — each stage is independently shippable and individually mergeable.
+---
+## 0. Ground rules
+- All work lands on branch `oob-signaling`, off `master` at `8069d84`.
+- Each stage gets a focused commit with a passing `npm test` + `npm run typecheck`. No "all stages in one giant commit." Reviewer (claude) wants to be able to bisect.
+- ESM, strict TS, 2-space indent, double quotes, semicolons, `.js` import extensions on local TS — same as the rest of `src/`.
+- `dist/` is regenerated by `npm run build`; it is not edited by hand and not committed in feature branches.
+- Tests use Vitest with `TALKING_STICK_DATA_DIR` for isolation, per `CLAUDE.md`.
+---
+## Stage 1 — Substrate (migration + service primitives)
+End state: `room_events.payload_json` exists, `service.sendMessage` and `service.waitForEvents` work, no MCP/CLI yet. Every test below must pass before stage 2 starts.
+### 1.1 Schema migration
+**File:** `src/db.ts`
+Append migration #5 to the `migrations` array (after id=4 `room_member_wait_presence`):
+```ts
+{
+  id: 5,
+  name: "room_events_payload_json",
+  up: `
+    ALTER TABLE room_events ADD COLUMN payload_json TEXT;
+  `
+}
+```
+Rationale: the column is nullable and needs no default. Existing rows read back as NULL. SQLite `ALTER TABLE ADD COLUMN` does not rewrite existing rows, so its cost is independent of table size and the migration is safe even on populated dogfood DBs.
+**No new index for v1.** Filter queries in `waitForEvents` go through the existing `room_events_room_seq_idx` plus predicates on `event_type` / `from_agent_id` / `to_agent_id`; for 4-digit event counts per room that's fast enough. Decision: wait for measured slowness before adding `room_events (room_id, event_type, event_seq)` or sender/recipient-specific indexes. `event_seq > ?` remains the dominant filter, and a partial scan over a single room's events past the cursor is bounded.
+### 1.2 Discriminated payload types
+**File:** `src/types.ts`
+Extend the existing `RoomEvent` union to include the new event types and a typed `payload` field. Existing event types continue to use only the typed columns; new event types carry their event-specific fields under `payload`.
+```ts
+// Replace RoomEvent.event_type union and add a payload field.
+export type EventType =
+  | "claim"
+  | "release"
+  | "pass"
+  | "takeover"
+  | "close"
+  | "kick"
+  | "message_sent";
+export interface MessagePayload {
+  body: string;
+  delivery_hint: "normal" | "interrupt";
+}
+export type EventPayload =
+  | { kind: "message_sent"; payload: MessagePayload };
+export interface RoomEvent {
+  event_seq: number;
+  event_id: string;
+  room_id: string;
+  turn_id: number;
+  event_type: EventType;
+  from_agent_id: AgentId | null;
+  to_agent_id: AgentId | null;
+  handoff: Handoff | null;
+  reason: string | null;
+  created_at: string;
+  payload: MessagePayload | null; // populated when event_type === "message_sent"
+}
+```
+**RISK — type ergonomics.** Keeping `payload` flat (single optional field) is cheaper than a discriminated union over the whole event row. Consumers narrow with `event.event_type === "message_sent"`. If we later add presence events (Stage 4), we widen the `payload` field to a union; the column does not change.
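A minimal sketch of what consumer-side narrowing looks like with a flat `payload` field. `RoomEventLite` and `messageBody` are hypothetical names used only for illustration, and the types are trimmed to the fields the example needs. One consequence of the flat shape is that checking `event_type` alone does not narrow `payload` for the TypeScript compiler, so the consumer still needs an explicit null check.

```typescript
// Trimmed-down stand-ins for the plan's types (assumption: only these fields matter here).
type EventType =
  | "claim" | "release" | "pass" | "takeover" | "close" | "kick" | "message_sent";

interface MessagePayload {
  body: string;
  delivery_hint: "normal" | "interrupt";
}

interface RoomEventLite {
  event_seq: number;
  event_type: EventType;
  from_agent_id: string | null;
  payload: MessagePayload | null;
}

// Hypothetical helper: returns the message body, or null for non-message events.
function messageBody(event: RoomEventLite): string | null {
  // With a flat field the compiler cannot infer payload !== null from
  // event_type, so both checks are needed.
  if (event.event_type === "message_sent" && event.payload !== null) {
    return event.payload.body;
  }
  return null;
}

const claimEvent: RoomEventLite = {
  event_seq: 1,
  event_type: "claim",
  from_agent_id: "claude:c756bb19",
  payload: null
};
const messageEvent: RoomEventLite = {
  event_seq: 2,
  event_type: "message_sent",
  from_agent_id: "codex:4ed7aa3c",
  payload: { body: "ping", delivery_hint: "normal" }
};

console.log(messageBody(claimEvent), messageBody(messageEvent));
```

A discriminated union over the whole row would remove the extra null check, at the cost of the wider type churn the RISK note above is trying to avoid.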
+New input/result types (paste near `Note*` types):
+```ts
+export type DeliveryHint = "normal" | "interrupt";
+export interface SendMessageInput {
+  agent_id: AgentId;
+  room_id: string;
+  body: string;
+  to_agent_id?: AgentId;     // omitted (undefined) = broadcast
+  delivery_hint?: DeliveryHint;
+}
+export interface SendMessageResult {
+  event_seq: number;
+  event_id: string;
+  created_at: string;
+}
+export type EventTypeFilter = EventType | EventType[];
+export type TargetAgentFilter = "self" | "any" | AgentId;
+export interface WaitForEventsInput {
+  agent_id?: AgentId;          // required when target_agent_id === "self"
+  room_id: string;
+  after_event_seq?: number;
+  event_type?: EventTypeFilter;
+  target_agent_id?: TargetAgentFilter; // default "self"
+  from_agent_id?: AgentId;
+  max_wait_ms?: number;
+}
+export interface WaitForEventsResult {
+  events: RoomEvent[];
+  cursor_event_seq: number;
+}
+```
+**File:** `src/errors.ts`
+Add protocol error codes used by this surface: `message_too_large`, `invalid_delivery_hint`, `unknown_recipient`, `ambiguous_recipient`, `agent_id_required`, and `invalid_event_type_filter`. Use `unknown_recipient` for message routing errors so they do not get confused with membership preconditions on the sender (`unknown_member`).
+### 1.3 `appendEvent` extension
+**File:** `src/service.ts`
+Extend the private `appendEvent` to accept an optional `payload` and serialize it into `payload_json`. Existing callers pass nothing and get NULL.
+```ts
+private appendEvent(input: {
+  room_id: string;
+  turn_id: number;
+  event_type: EventType;
+  from_agent_id: string | null;
+  to_agent_id: string | null;
+  handoff: Handoff | null;
+  reason: string | null;
+  created_at: string;
+  payload?: MessagePayload | null;
+}): number {
+  const result = this.db
+    .prepare(
+      `
+      INSERT INTO room_events (
+        event_id, room_id, turn_id, event_type,
+        from_agent_id, to_agent_id,
+        handoff_json, reason, created_at, payload_json
+      ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+    `
+    )
+    .run(
+      randomUUID(),
+      input.room_id,
+      input.turn_id,
+      input.event_type,
+      input.from_agent_id,
+      input.to_agent_id,
+      input.handoff ? JSON.stringify(input.handoff) : null,
+      input.reason,
+      input.created_at,
+      input.payload ? JSON.stringify(input.payload) : null
+    );
+  return Number(result.lastInsertRowid);
+}
+```
+`mapEvent` parses `payload_json` into `payload` only when `event_type === "message_sent"`; otherwise returns `null`. This keeps consumers from accidentally treating future foreign payloads as MessagePayload before we've widened the union.
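The payload rule above can be sketched as a small pure function. This is an illustration under assumptions, not the actual `mapEvent`: `EventRow` and `mapPayload` are hypothetical names, and the row is reduced to the two columns that matter for the decision.

```typescript
interface MessagePayload {
  body: string;
  delivery_hint: "normal" | "interrupt";
}

// Hypothetical row shape: only the columns relevant to payload mapping.
interface EventRow {
  event_type: string;
  payload_json: string | null;
}

// Parse payload_json only for message_sent rows; everything else maps to null,
// even when the column happens to hold JSON written by a future event type.
function mapPayload(row: EventRow): MessagePayload | null {
  if (row.event_type !== "message_sent" || row.payload_json === null) {
    return null;
  }
  return JSON.parse(row.payload_json) as MessagePayload;
}

const messageRow: EventRow = {
  event_type: "message_sent",
  payload_json: JSON.stringify({ body: "hello", delivery_hint: "interrupt" })
};
const legacyRow: EventRow = { event_type: "claim", payload_json: null };
// A row from an event type this reader does not know about yet.
const foreignRow: EventRow = { event_type: "presence", payload_json: "{\"x\":1}" };

console.log(mapPayload(messageRow), mapPayload(legacyRow), mapPayload(foreignRow));
```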
+### 1.4 `sendMessage`
+**File:** `src/service.ts`
+```ts
+async sendMessage(input: SendMessageInput): Promise<SendMessageResult> {
+  assertNonEmpty(input.agent_id, "agent_id");
+  assertNonEmpty(input.room_id, "room_id");
+  const body = input.body ?? "";
+  if (body.length === 0) {
+    throw new ProtocolError("invalid_body", "Message body must not be empty.");
+  }
+  const byteLength = Buffer.byteLength(body, "utf8");
+  if (byteLength > MAX_MESSAGE_BODY_BYTES) {
+    throw new ProtocolError(
+      "message_too_large",
+      `Message body exceeds ${MAX_MESSAGE_BODY_BYTES} bytes (received ${byteLength}).`
+    );
+  }
+  const deliveryHint: DeliveryHint = input.delivery_hint ?? "normal";
+  if (deliveryHint !== "normal" && deliveryHint !== "interrupt") {
+    throw new ProtocolError("invalid_delivery_hint", "delivery_hint must be 'normal' or 'interrupt'.");
+  }
+  const now = this.now();
+  const timestamp = now.toISOString();
+  this.purgeExpiredIdleRooms(now);
+  return withImmediateTransaction(this.db, () => {
+    const room = this.requireRoom(input.room_id);
+    if (room.state === "closed") {
+      throw new ProtocolError("room_closed", "Messages cannot be sent to a closed room.", {
+        room_id: input.room_id
+      });
+    }
+    // touchMember: sender must be an active joined member (matches add_note).
+    this.touchMember(input.room_id, input.agent_id, timestamp);
+    if (input.to_agent_id) {
+      const target = this.getMember(input.room_id, input.to_agent_id);
+      if (!target) {
+        throw new ProtocolError(
+          "unknown_recipient",
+          "to_agent_id is not a member of this room.",
+          { to_agent_id: input.to_agent_id }
+        );
+      }
+    }
+    const eventSeq = this.appendEvent({
+      room_id: input.room_id,
+      turn_id: room.turn_id,
+      event_type: "message_sent",
+      from_agent_id: input.agent_id,
+      to_agent_id: input.to_agent_id ?? null,
+      handoff: null,
+      reason: null,
+      created_at: timestamp,
+      payload: { body, delivery_hint: deliveryHint }
+    });
+    const row = this.db
+      .prepare<[number], { event_id: string }>(
+        "SELECT event_id FROM room_events WHERE event_seq = ?"
+      )
+      .get(eventSeq);
+    return {
+      event_seq: eventSeq,
+      event_id: row!.event_id,
+      created_at: timestamp
+    };
+  });
+}
+```
+**Constants:** add `const MAX_MESSAGE_BODY_BYTES = 4096;` near `MAX_NOTE_BODY_BYTES`.
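The cap is measured in UTF-8 bytes, not in string length, which matters for non-ASCII bodies. The sketch below illustrates the difference. `checkBody` is a hypothetical helper that returns the error code instead of throwing, and it uses `TextEncoder` rather than `Buffer.byteLength` only to stay runtime-neutral; both count UTF-8 bytes.

```typescript
const MAX_MESSAGE_BODY_BYTES = 4096;

// UTF-8 byte length of a string.
function utf8ByteLength(body: string): number {
  return new TextEncoder().encode(body).length;
}

// Hypothetical validator mirroring the two body checks in sendMessage.
function checkBody(body: string): "ok" | "invalid_body" | "message_too_large" {
  if (body.length === 0) return "invalid_body";
  return utf8ByteLength(body) > MAX_MESSAGE_BODY_BYTES ? "message_too_large" : "ok";
}

// "\u00e9" (e with acute accent) is one UTF-16 code unit but two UTF-8 bytes.
const accented = "\u00e9".repeat(2049);
console.log(accented.length, utf8ByteLength(accented), checkBody(accented));
```

A body of 2049 accented characters is well under 4096 by `length` but is 4098 bytes on the wire, so it is rejected.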
+**Decision — sender membership.** Use `touchMember` (write path), not `touchKnownMember`. A sender must already be a joined member and a live command/tool call refreshes `last_seen_at`. This matches `add_note` and prevents stale identities from injecting messages without first joining. It does not require the target recipient to be active when targeted by full `agent_id`; targeting is routing, not liveness proof.
+### 1.5 `waitForEvents`
+**File:** `src/service.ts`
+```ts
+async waitForEvents(input: WaitForEventsInput): Promise<WaitForEventsResult> {
+  assertNonEmpty(input.room_id, "room_id");
+  const targetFilter = input.target_agent_id ?? "self";
+  if (targetFilter === "self" && !input.agent_id) {
+    throw new ProtocolError(
+      "agent_id_required",
+      "agent_id is required when target_agent_id is 'self'."
+    );
+  }
+  const eventTypes = normalizeEventTypeFilter(input.event_type);
+  const cursor = input.after_event_seq ?? 0;
+  const maxWaitMs = Math.min(
+    input.max_wait_ms ?? this.policy.waitForEventsMaxWaitMs,
+    this.policy.waitForEventsMaxWaitMs
+  );
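+  const deadline = Date.now() + Math.max(0, maxWaitMs);
+  // Fail fast on an unknown room before polling; this is a read and does not mutate state.
+  this.requireRoom(input.room_id);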
+  while (true) {
+    // Pure read; no transaction, no touch*Member.
+    const events = this.queryEvents({
+      room_id: input.room_id,
+      after_event_seq: cursor,
+      event_types: eventTypes,
+      target: targetFilter,
+      caller_agent_id: input.agent_id ?? null,
+      from_agent_id: input.from_agent_id ?? null,
+      limit: this.policy.waitForEventsBatchLimit
+    });
+    if (events.length > 0 || Date.now() >= deadline) {
+      const lastSeq = events.length > 0
+        ? events[events.length - 1].event_seq
+        : cursor;
+      return { events, cursor_event_seq: lastSeq };
+    }
+    const remainingMs = deadline - Date.now();
+    await sleep(Math.min(this.policy.waitForEventsPollMs, remainingMs));
+  }
+}
+```
+`queryEvents` is a private helper that builds a parameterized SQL query:
+```sql
+SELECT * FROM room_events
+WHERE room_id = ?
+  AND event_seq > ?
+  [AND event_type IN (?, ?, ...)]
+  [AND target_filter_clause]
+ORDER BY event_seq
+LIMIT ?
+```
+Before the loop, call `requireRoom(input.room_id)` to fail fast if the room does not exist. Do **not** call `purgeExpiredIdleRooms`, `touchMember`, `touchKnownMember`, or `touchWaitingMember` from `waitForEvents`; this method's contract is observer-safe and read-only. Other write paths and existing room reads can continue to perform startup cleanup.
+**`target_filter_clause` per filter mode:**
+- `"any"`: omit clause.
+- `<agent_id>`: `to_agent_id = ?`. (Strict — does NOT include broadcast unless explicitly requested.)
+- `"self"`:
+  - For `event_type='message_sent'`: direct messages to the caller plus broadcasts from other agents. Do not echo the caller's own broadcast messages back to their default receiver.
+  - For non-message events: `(to_agent_id = ? OR from_agent_id = ?)` (caller is participant). Caller's own `claim`/`release`/`pass`/etc. are visible to them; this matches "self = events that affect me".
+  - Encoded as a single `CASE`/`OR` SQL clause to avoid two queries. See §1.5.1 below.
+**§1.5.1 — `target=self` SQL.** The cleanest single-query encoding:
+```sql
+AND (
+  (event_type = 'message_sent' AND (to_agent_id = ? OR (to_agent_id IS NULL AND from_agent_id != ?)))
+  OR
+  (event_type != 'message_sent' AND (to_agent_id = ? OR from_agent_id = ?))
+)
+```
+The caller's `agent_id` is bound four times, once per `?`. That is acceptable; SQLite's planner handles this fine on the existing index.
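For reasoning about test cases, the same `target=self` rule can be written as an in-memory predicate. This is a sketch for illustration only: `EventLite` and `matchesSelf` are hypothetical names, and the real filter stays in SQL. The predicate mirrors SQL's NULL behaviour, where `from_agent_id != ?` is not true for a NULL sender.

```typescript
interface EventLite {
  event_type: string;
  from_agent_id: string | null;
  to_agent_id: string | null;
}

// In-memory mirror of the target=self SQL clause.
function matchesSelf(event: EventLite, self: string): boolean {
  if (event.event_type === "message_sent") {
    const direct = event.to_agent_id === self;
    // Broadcast from another agent. A NULL sender does not match,
    // just as `from_agent_id != ?` is not true for NULL in SQL.
    const broadcastFromOther =
      event.to_agent_id === null &&
      event.from_agent_id !== null &&
      event.from_agent_id !== self;
    return direct || broadcastFromOther;
  }
  // Non-message events: the caller is either the target or the actor.
  return event.to_agent_id === self || event.from_agent_id === self;
}

const me = "claude:c756bb19";
const other = "codex:4ed7aa3c";

const ownBroadcast: EventLite = { event_type: "message_sent", from_agent_id: me, to_agent_id: null };
const otherBroadcast: EventLite = { event_type: "message_sent", from_agent_id: other, to_agent_id: null };
console.log(matchesSelf(ownBroadcast, me), matchesSelf(otherBroadcast, me));
```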
+**`from_agent_id` filter.** Add this filter server-side now. It costs one optional SQL predicate and keeps `cursor_event_seq` semantics honest for `tt msg recv --from ...`: the CLI should not have to consume and skip batches of target-matching messages from the wrong sender.
+**`waitForEvents` does NOT mutate.** No purge, no `touchMember`, no `touchKnownMember`, and no `touchWaitingMember`. This is the explicit observer-safety property from the design (Layer 4).
+### 1.6 Policy additions
+**File:** `src/types.ts` (Policy interface) and `src/config.ts` (`defaultPolicy`).
+Add three policy values:
+```ts
+waitForEventsMaxWaitMs: number;   // default 30_000 (matches waitForTurn)
+waitForEventsPollMs: number;      // default 250  (matches waitForTurn)
+waitForEventsBatchLimit: number;  // default 100
+```
+These match the long-poll defaults in the existing `wait_for_turn` path. No CLI/MCP knob in v1; servers (and tests) override via constructor options.
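The interaction between a caller-supplied `max_wait_ms` and the policy ceiling can be sketched as a pure function. `effectiveWaitMs` is a hypothetical name; the arithmetic is the same clamp that `waitForEvents` applies inline in §1.5.

```typescript
// Clamp a requested wait to [0, policyMaxMs]; an omitted request means "use the ceiling".
function effectiveWaitMs(requested: number | undefined, policyMaxMs: number): number {
  const capped = Math.min(requested ?? policyMaxMs, policyMaxMs);
  return Math.max(0, capped);
}

const policyMax = 30_000;
console.log(
  effectiveWaitMs(undefined, policyMax),
  effectiveWaitMs(60_000, policyMax),
  effectiveWaitMs(0, policyMax)
);
```

A request of `0` yields a single non-blocking query, which is what the one-shot receive path relies on.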
+### 1.7 Stage-1 tests
+**File:** `tests/oob-substrate.test.ts` (new).
+Coverage matrix:
+| # | Test | What it pins |
+|---|------|--------------|
+| 1 | Migration 5 applies cleanly to a v0.1.x DB seeded with id=1..4 events | back-compat |
+| 2 | Existing event types (claim/release/pass/takeover/close/kick) write `payload_json = NULL` | no regression |
+| 3 | `sendMessage` happy path direct: body, hint, recipient round-trip in `getRoomEvents` | basic write |
+| 4 | `sendMessage` happy path broadcast: `to_agent_id = NULL`, body visible | broadcast |
+| 5 | `sendMessage` rejects empty body → `invalid_body` | input val |
+| 6 | `sendMessage` rejects 4097-byte body → `message_too_large`, body NOT inserted | cap + atomicity |
+| 7 | `sendMessage` rejects unknown `to_agent_id` → `unknown_recipient` | routing val |
+| 8 | `sendMessage` from non-member sender → `unknown_member` | sender liveness |
+| 9 | `sendMessage` to a closed room → `room_closed` | room state |
+| 10 | `sendMessage` of `delivery_hint='interrupt'` round-trips | hint preserved |
+| 11 | `waitForEvents` returns immediately when events past cursor exist | non-blocking |
+| 12 | `waitForEvents` returns empty after deadline when no new events | timeout |
+| 13 | `waitForEvents` `target='self'` filters to (direct OR broadcast from another agent) for messages; the caller's own broadcasts are excluded | filter |
+| 14 | `waitForEvents` `target='self'` filters to (to OR from) for non-message events | filter |
+| 15 | `waitForEvents` `target='any'` returns everything | filter |
+| 16 | `waitForEvents` `target=<agent_id>` returns only direct (no broadcast) | filter strictness |
+| 17 | `waitForEvents` `event_type='message_sent'` excludes claim/release/etc. | type filter |
+| 18 | `waitForEvents` cursor resume: second call past `cursor_event_seq` returns only new events | cursor correctness |
+| 19 | `waitForEvents` does NOT update `last_wait_at` or `last_seen_at` (observer safety) | non-mutating |
+| 20 | `event_seq` ordering: `sendMessage` interleaved with `releaseStick` preserves monotonic order | ordering |
+| 21 | Concurrent `sendMessage` from two agents produces distinct `event_seq`s | concurrency |
+| 22 | `sendMessage` `delivery_hint` defaults to `"normal"` | default |
+| 23 | `sendMessage` rejects `delivery_hint` other than normal/interrupt → `invalid_delivery_hint` | input val |
+| 24 | `getRoomEvents` returns `payload` populated for message_sent rows, `null` for others | mapEvent |
+| 24a | `waitForEvents` `from_agent_id` filters server-side without advancing over unmatched senders | sender filter |
+**RISK — test isolation.** Existing pattern: `tests/setup.ts` + `TALKING_STICK_DATA_DIR`. New file follows the same setup; no changes to fixtures.
+---
+## Stage 2 — MCP surface
+End state: MCP `send_message` and `wait_for_events` tools available; `get_room_events` returns `payload` for message events.
+### 2.1 New tools
+**File:** `src/mcp-server.ts`
+Register two tools, mirroring the `add_note` pattern (resolve identity from `extra.sessionId`):
+```ts
+server.registerTool(
+  "send_message",
+  {
+    title: "Send Message",
+    description:
+      "Send a transient message into the room event log. Routes via to_agent_id (null = broadcast). Body capped at 4096 bytes UTF-8.",
+    inputSchema: {
+      room_id: z.string().min(1),
+      body: z.string().min(1),
+      to_agent_id: z.string().min(1).optional(),
+      delivery_hint: z.enum(["normal", "interrupt"]).optional()
+    }
+  },
+  async (input, extra) =>
+    toolJson(() =>
+      commands.sendMessage(resolveConnectionIdentity(extra.sessionId), input)
+    )
+);
+server.registerTool(
+  "wait_for_events",
+  {
+    title: "Wait for Events",
+    description:
+      "Long-poll the room event log past a cursor with optional event_type and target filters. Observer-safe: does not mutate room state.",
+    inputSchema: {
+      room_id: z.string().min(1),
+      after_event_seq: z.number().int().nonnegative().optional(),
+      event_type: z
+        .union([z.string().min(1), z.array(z.string().min(1)).min(1)])
+        .optional(),
+      target_agent_id: z.string().min(1).optional(),
+      from_agent_id: z.string().min(1).optional(),
+      max_wait_ms: z.number().int().nonnegative().optional()
+    }
+  },
+  async (input, extra) =>
+    toolJson(() =>
+      commands.waitForEvents({
+        ...input,
+        agent_id: resolveConnectionIdentity(extra.sessionId).agent_id
+      })
+    )
+);
+```
+`commands.sendMessage` and `commands.waitForEvents` are thin wrappers in `src/commands.ts` that match the existing pattern (`commands.addNote`, `commands.listNotes`, etc.).
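Because the MCP schema accepts any non-empty string for `event_type`, validation has to happen further down, in the `normalizeEventTypeFilter` call inside `waitForEvents`. The plan names that helper and the `invalid_event_type_filter` code but does not spell out the behaviour, so the following is one possible shape under assumptions: it returns `null` for "no filter", deduplicates, and throws a plain `Error` where the real code would presumably throw a `ProtocolError`.

```typescript
const EVENT_TYPES = [
  "claim", "release", "pass", "takeover", "close", "kick", "message_sent"
] as const;
type EventType = (typeof EVENT_TYPES)[number];

// Assumed behaviour: undefined means "no type filter"; unknown names are rejected.
function normalizeEventTypeFilter(
  filter: string | string[] | undefined
): EventType[] | null {
  if (filter === undefined) return null;
  const requested = Array.isArray(filter) ? filter : [filter];
  const known = new Set<string>(EVENT_TYPES);
  const result: EventType[] = [];
  for (const name of requested) {
    if (!known.has(name)) {
      throw new Error(`invalid_event_type_filter: ${name}`);
    }
    // Drop duplicates so the SQL IN (...) list stays minimal.
    if (!result.includes(name as EventType)) {
      result.push(name as EventType);
    }
  }
  if (result.length === 0) {
    throw new Error("invalid_event_type_filter: empty list");
  }
  return result;
}

console.log(normalizeEventTypeFilter(["message_sent", "release", "release"]));
```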
+### 2.2 `get_room_events` payload propagation
+`getRoomEvents` already exists; extending `mapEvent` (Stage 1.3) is sufficient. Schema unchanged for `get_room_events` — output now contains `payload` field, populated for `message_sent` rows.
+### 2.3 Stage-2 tests
+**File:** `tests/mcp-smoke.test.ts` (extend) and `tests/oob-mcp.test.ts` (new for richer cases).
+Coverage:
+| # | Test | What it pins |
+|---|------|--------------|
+| 25 | MCP `send_message` happy path: caller identity from sessionId, recipient resolves, event lands | identity wiring |
+| 26 | MCP `send_message` returns typed `message_too_large` error on oversized body | error shape |
+| 27 | MCP `wait_for_events` `target='self'` resolves caller from sessionId | identity wiring |
+| 28 | MCP `wait_for_events` returns events.payload populated for message_sent | payload round-trip via MCP |
+| 29 | MCP `get_room_events` returns payload for message_sent events alongside legacy events | back-compat |
+| 30 | MCP `wait_for_events` `event_type=['message_sent','release']` (array form) filters correctly | array filter |
+| 30a | MCP `wait_for_events` `from_agent_id` filters server-side | sender filter |
+---
+## Stage 3 — CLI surface
+End state: `tt msg send`, `tt msg recv [--wait|--follow]`, `tt events --wait|--follow` available; `tt msg recv --follow` is the continuous stream path for harnesses that can monitor stdout, while `tt msg recv --wait` is the portable wake-on-next-event path for harnesses that can run a background command and notice process exit. Recipient resolution works.
+### 3.1 New commands
+**File:** `src/cli/registry.ts`
+Add one entry:
+```ts
+{
+  name: "msg",
+  needsRuntime: true,
+  startupMaintenance: true,
+  internal: false,
+  usage: "tt msg <send|recv> [...]",
+  description: "Send or receive transient messages on a room's event stream.",
+  handler: ({ runtime, parsed }) => handleMsgCommand(requireRuntime(runtime), parsed)
+}
+```
+The existing `events` entry gains `--wait`, `--follow`, `--event`, and `--target` without a registry change — the handler interprets the flags.
+**File:** `src/cli/msg-commands.ts` (new). Mirrors `notes-commands.ts`:
+```ts
+export async function handleMsgCommand(runtime, parsed): Promise<void> {
+  const [subcommand, ...rest] = parsed.positionals;
+  const subParsed = { name: `msg ${subcommand}`, positionals: rest, options: parsed.options };
+  switch (subcommand) {
+    case "send":  return handleMsgSendCommand(runtime, subParsed);
+    case "recv":  return handleMsgRecvCommand(runtime, subParsed);
+    default: throw new Error(`Unknown msg subcommand: ${subcommand}`);
+  }
+}
+```
+### 3.2 `tt msg send <recipient> <body>`
+Grammar:
+```
+tt msg send <recipient> <body...> [--interrupt] [--stdin]
+tt msg send room <body...> [--interrupt]   # broadcast (literal "room")
+```
+- `<recipient>`: full `agent_id` (`codex:5c11d1e8`), display name (`codex`), or literal `room` for broadcast.
+- `<body>`: positional remainder joined by space (matches `tt notes add` body convention) OR `--stdin`.
+- `--interrupt`: sets `delivery_hint=interrupt`.
+- Path resolution: same `resolveSessionForNotes`-style helper, since this is a write-shaped command.
+- `--json/--text`: same auto-detect rules as everywhere else (`shouldUseJson`).
+**Recipient resolution** (in CLI, before `service.sendMessage`):
+```ts
+function resolveRecipient(
+  runtime: Runtime,
+  identity: DerivedIdentity,
+  roomId: string,
+  raw: string
+): string | null /* null = broadcast */ {
+  if (raw === "room") return null;
+  const members = runtime.commands.getRoomState({ room_id: roomId, agent_id: identity.agent_id }).members;
+  // Exact agent_id match wins.
+  const exact = members.find(m => m.agent_id === raw);
+  if (exact) return exact.agent_id;
+  // Display-name match — only active members, must be unambiguous.
+  const candidates = members.filter(m => m.display_name === raw && m.status === "active");
+  if (candidates.length === 1) return candidates[0].agent_id;
+  if (candidates.length > 1) {
+    throw new ProtocolError("ambiguous_recipient", `Multiple active members with display name '${raw}'.`, {
+      candidates: candidates.map(m => m.agent_id)
+    });
+  }
+  throw new ProtocolError("unknown_recipient", `No active member matches '${raw}'.`);
+}
+```
+**Decision — display-name resolution scope.** Display-name shorthand is active-only. Full `agent_id` targeting remains an escape hatch for known inactive members because the service validates only room membership. This split keeps the common chat path honest without making routing pretend to be delivery.
+**Decision — body via positional vs `--stdin`.** `tt notes add` uses `--stdin`; mirror that. Positional remainder is joined by space (`parsed.positionals.slice(1).join(" ")`). Empty body without `--stdin` is a usage error.
+**Boolean flag repair.** `parseCommand` currently consumes the next non-`--` token as a flag value. `tt msg send codex --interrupt "hi"` is part of the documented UX, so the handler must repair this case: if `--interrupt` or `--room` has a string value, treat the option as boolean and splice the consumed value back into the message body. Do not make users remember "boolean flags last" for the new single-command chat path.
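A rough sketch of the repair step, under stated assumptions. The shape of the parsed command (`ParsedLite`) and the helper name `repairBooleanFlags` are hypothetical, and the sketch assumes `positionals[0]` is the recipient and that the flag was typed directly after it, as in the documented `tt msg send codex --interrupt "hi"` form. Since the parser is assumed not to record token positions, the swallowed value is reinserted right after the recipient.

```typescript
// Hypothetical parsed-command shape; the real parseCommand output may differ.
interface ParsedLite {
  positionals: string[];
  options: Record<string, string | boolean>;
}

// If a boolean flag swallowed the next token, restore the token to the body
// and turn the flag back into a plain boolean. Returns a new object.
function repairBooleanFlags(parsed: ParsedLite, booleanFlags: string[]): ParsedLite {
  const positionals = [...parsed.positionals];
  const options: Record<string, string | boolean> = { ...parsed.options };
  for (const flag of booleanFlags) {
    const value = options[flag];
    if (typeof value === "string") {
      options[flag] = true;
      // Assumption: the flag followed the recipient, so the value belongs at index 1.
      positionals.splice(1, 0, value);
    }
  }
  return { positionals, options };
}

// `tt msg send codex --interrupt hi there` as a greedy parser might see it.
const swallowed: ParsedLite = {
  positionals: ["codex", "there"],
  options: { interrupt: "hi" }
};
const repaired = repairBooleanFlags(swallowed, ["interrupt"]);
console.log(repaired.positionals.slice(1).join(" "), repaired.options.interrupt);
```

If the flag can appear in the middle of an unquoted body, this positional guess is not enough, and the parser would need to expose where each option was seen.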
+### 3.3 `tt msg recv [--wait|--follow] [--from <agent>] [--after <event_seq>]`
+Grammar:
+```
+tt msg recv                                    # one-shot: print latest unread for self, exit
+tt msg recv --wait                             # block until next matching event batch, print, exit
+tt msg recv --follow                           # long-running: tail forever, JSON line per event
+tt msg recv --wait --from codex                # portable background wake path
+tt msg recv --follow --from codex              # filter by sender display name or agent_id
+tt msg recv --wait --after 12345               # wait after a known cursor
+tt msg recv --follow --after 12345             # continuous resume from cursor
+tt msg recv --follow --target any              # power user: see all messages, not just self
+```
+Implementation:
+- One-shot mode: single call to `commands.waitForEvents({ event_type: "message_sent", target_agent_id: "self", max_wait_ms: 0, after_event_seq })`. Print events. Exit.
+- `--wait` mode: single call to `commands.waitForEvents({ event_type: "message_sent", target_agent_id, from_agent_id, after_event_seq, max_wait_ms })`. If an event batch arrives, print one line per event and exit 0. If the deadline expires with no events, also exit 0: print nothing in text mode, or `{ events: [], cursor_event_seq }` in JSON mode. This is the portability path for Codex/Gemini/OpenCode-style harnesses that can launch a background process and be notified when it exits, but cannot consume each stdout line from a continuously running child.
+- `--follow` mode: loop calling `waitForEvents` with the policy long-poll deadline; emit each event as a single line of JSON (or human text per `shouldUseJson`); update local cursor; repeat.
+**Cursor behavior:**
+- Default `--wait` / `--follow` start cursor: **the highest current event_seq at startup time** (i.e., do NOT replay history). This avoids flooding harnesses on first launch.
+- `--after N`: explicit override, used for resume.
+- `--after 0`: opt-in to full backlog (rare; mostly for debugging). No separate `--from-start` flag.
+- Cursor persistence to disk is **out of scope for v1 CLI**. Per design Layer 6 §4, this is the harness's or plugin's responsibility. Operators wire `--after $LAST_SEQ` from their own bookkeeping.
+Implement the "highest current event_seq" cursor with a small service/command helper such as `getLatestEventSeq(room_id)` rather than by paging through `getRoomEvents`. It is a simple `SELECT MAX(event_seq)` read and avoids replaying old event pages just to find the tail.
+**Sender filter (`--from`):** resolved at startup against the room's members (display name OR exact agent_id). Pass the resolved `from_agent_id` into `waitForEvents`; filtering is server-side.
+**Why `--wait` exists.** Continuous `--follow` is ideal for Claude Code Monitor and human terminals. It is not enough for harnesses that can run a background command but only surface the result when the command exits. Those harnesses can run `tt msg recv --wait --after <cursor> --json` in the background; when it exits with events, the harness sees the completed output, updates its cursor, and starts a new `--wait`. This simulates real-time delivery by re-invocation without requiring line-by-line stdout monitoring.
+**SIGTERM/SIGHUP:** the `--follow` loop installs handlers that flip a `shouldExit` flag, finish the current iteration, and exit cleanly with the last cursor printed to stderr (so a wrapping operator can resume). Database connection is closed via `runtime` shutdown hooks.
+**Output format:**
+- `--json` (or auto-JSON for harness identities): one JSON object per line, `JSON.stringify(event)`. Newline-flushed via `process.stdout.write` to ensure Monitor-style consumers see lines without buffering.
+- `--text` (or auto-text for human): `[<created_at>] <from> → <to>: <body>` per line.
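The two line formats can be sketched as pure formatters. `MessageEventLite`, `toJsonLine`, and `toTextLine` are hypothetical names, and rendering a broadcast recipient as `room` in text mode is an assumption borrowed from the CLI's `room` literal, not something the plan specifies.

```typescript
interface MessageEventLite {
  event_seq: number;
  created_at: string;
  from_agent_id: string | null;
  to_agent_id: string | null;
  payload: { body: string; delivery_hint: "normal" | "interrupt" } | null;
}

// JSON mode: JSON.stringify escapes embedded newlines, so the only raw
// newline in the output is the terminator.
function toJsonLine(event: MessageEventLite): string {
  return JSON.stringify(event) + "\n";
}

// Text mode: `[<created_at>] <from> → <to>: <body>`.
// Assumption: a null recipient (broadcast) is shown as "room".
function toTextLine(event: MessageEventLite): string {
  const from = event.from_agent_id ?? "?";
  const to = event.to_agent_id ?? "room";
  const body = event.payload?.body ?? "";
  return `[${event.created_at}] ${from} → ${to}: ${body}\n`;
}

const sample: MessageEventLite = {
  event_seq: 42,
  created_at: "2026-04-30T12:00:00.000Z",
  from_agent_id: "claude:c756bb19",
  to_agent_id: "codex:4ed7aa3c",
  payload: { body: "hi", delivery_hint: "normal" }
};
console.log(toJsonLine(sample) + toTextLine(sample));
```

Text mode does not escape the body, so a multi-line message would span several lines there; only the JSON form gives a strict one-event-per-line guarantee.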
+### 3.4 `tt events --wait|--follow`
+Extend the existing `handleEventsCommand` to support wait and follow modes:
+```
+tt events                                       # one-shot (existing)
+tt events --wait                                # block until next matching event batch, print, exit
+tt events --follow                              # tail
+tt events --wait --event message_sent,release   # portable wake path
+tt events --follow --event message_sent,release # filter
+tt events --follow --target self|any|<agent>    # filter
+tt events --follow --after 12345                # resume
+```
+Implementation: when `--wait` is set, call `commands.waitForEvents` once with the parsed `event_type` (split CSV) and `target_agent_id`, then exit after printing any returned batch. When `--follow` is set, loop on the same primitive and emit one line per event.
+`tt events` (with neither `--wait` nor `--follow`) keeps current behavior: single `getRoomEvents` call, formatted block per turn. No regression.
+### 3.5 Stage-3 tests
+**File:** `tests/cli.test.ts` (extend) and `tests/oob-cli.test.ts` (new for follow-mode lifecycle).
+| # | Test | What it pins |
+|---|------|--------------|
+| 31 | `tt msg send codex "hi"` resolves display name, sends to active codex | display name |
+| 32 | `tt msg send codex:5c11d1e8 "hi"` accepts full agent_id | agent_id |
+| 33 | `tt msg send room "hi"` broadcasts (to_agent_id null) | broadcast |
+| 34 | `tt msg send unknown "hi"` → `unknown_recipient` exit | unknown |
+| 35 | `tt msg send <ambiguous-display-name> "hi"` → `ambiguous_recipient` exit | ambiguity |
+| 36 | `tt msg send codex --interrupt "hi"` sets delivery_hint=interrupt | hint passthrough |
+| 37 | `tt msg send codex --stdin` reads body from stdin | stdin path |
+| 38 | `tt msg send codex` (no body, no stdin) → usage error | usage |
+| 39 | `tt msg recv` one-shot returns events for self only | filter default |
+| 40 | `tt msg recv --wait` blocks until one matching message batch arrives, prints it, exits | portable wake |
+| 41 | `tt msg recv --wait --after N` resumes from cursor | resume |
+| 42 | `tt msg recv --wait --from codex` filters by sender through waitForEvents | sender filter |
+| 43 | `tt msg recv --follow` emits one JSON line per arriving event, then SIGTERM exits cleanly | follow lifecycle |
+| 44 | `tt events --wait --event message_sent` wakes only for messages | event-type filter |
+| 45 | `tt events --follow --target any` shows all events including others' | target filter |
+| 46 | `tt msg recv --wait/--follow` default cursor is "now" — no historical replay | flood-prevention |
+| 46a | `tt msg recv --wait/--follow` JSON output is one event per line, newline-terminated, parseable | format contract |
+**Decision — follow-mode test harness.** Prefer in-process tests with a factored follow helper that accepts an output sink and a stop signal. Add one subprocess smoke test only if in-process coverage misses SIGTERM behavior. This keeps Vitest fast and avoids child-process races.
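One way such a factored follow helper could look, as a sketch under assumptions. `followEvents`, `WaitFn`, and the reduced event shape are hypothetical. The wait primitive, the output sink, and the stop signal are all injected, so a test can drive the loop with a fake and no child process.

```typescript
interface FollowEvent { event_seq: number; }
interface WaitResult { events: FollowEvent[]; cursor_event_seq: number; }
type WaitFn = (afterEventSeq: number) => Promise<WaitResult>;

// Loop on the wait primitive until asked to stop; emit one JSON line per
// event and return the last cursor so a caller can resume with --after.
async function followEvents(
  wait: WaitFn,
  startCursor: number,
  sink: (line: string) => void,
  shouldStop: () => boolean
): Promise<number> {
  let cursor = startCursor;
  while (!shouldStop()) {
    const result = await wait(cursor);
    for (const event of result.events) {
      sink(JSON.stringify(event));
    }
    cursor = result.cursor_event_seq;
  }
  return cursor;
}

// Fake wait primitive that replays canned batches.
const demoBatches: WaitResult[] = [
  { events: [{ event_seq: 11 }], cursor_event_seq: 11 },
  { events: [], cursor_event_seq: 11 }
];
followEvents(
  async () => demoBatches.shift()!,
  10,
  (line) => console.log(line),
  () => demoBatches.length === 0
).then((finalCursor) => console.log("resume with --after", finalCursor));
```

In the real CLI the stop signal would be the `shouldExit` flag flipped by the SIGTERM/SIGHUP handlers, and the sink would write to `process.stdout`.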
+---
+## Stage 4 — Skill update
+**File:** `skills/talking-stick/SKILL.md`
+Add a new section §4.5 between the current §4 ("While waiting") and §5 ("While holding the stick"). Draft:
+```markdown
+### 4.5 Out-of-band messaging
+The talking stick guarantees single-writer authority over shared workspace state. It is not a chat protocol. For transient signaling — paging the holder, asking a quick question, broadcasting awareness — use messages.
+**Send.** `tt msg send <recipient> "<body>"` (or `mcp send_message`). Recipient is a full `agent_id`, an unambiguous active display name, or the literal `room` for broadcast. `--interrupt` flags the message as time-sensitive; the receiver decides whether to act on it now.
+**Receive.** Use the receive mode your harness can actually observe. If it can monitor stdout from a long-running child, run `tt msg recv --follow`; each incoming event lands as one JSON line. If it can only notice that a background command completed, run `tt msg recv --wait --after <last_event_seq>`; it exits on the next matching batch, then you start it again with the returned cursor. SIGTERM exits cleanly; restart with `--after <last_event_seq>` to resume.
+**When to message vs note vs handoff.**
+- **Message** when the exchange is conversational, ephemeral, and tied to two or more processes that are currently online. Discussion, design questions, "are you about to break X?", live coordination. Cheap.
+- **Note** (`tt notes add`) when the artifact should outlive the moment — a finding the next holder should consider at handoff, an observation that survives process churn. Durable, resolvable.
+- **Handoff** (release/pass with structured payload) when transferring work. Messages do not replace handoffs; they live alongside them.
+**Messages are not private.** Any room member can read any message via `get_room_events` or `tt events --follow --target any`. `to_agent_id` is routing, not ACL.
+**Messages do not grant the stick.** A non-holder paging the holder does not gain write authority. The holder may act on the message immediately or defer until handoff.
+**Stay in the wait loop in parallel.** A `tt msg recv --wait` or `--follow` subprocess does not replace `wait_for_turn`. Keep waiting for your turn; messages are a side channel.
+```
+Also: a one-line addition to §1 ("Check that Talking Stick is available") noting that `tt msg recv --wait` / `--follow` may be running as sibling processes and should be left alone.
+### 4.1 Skill propagation
+The skill ships in two places:
+- `skills/talking-stick/SKILL.md` (source of truth in this repo).
+- `~/.claude/skills/talking-stick/SKILL.md` (and equivalent paths for other harnesses) via `tt install-skill --link` or `--copy`.
+If symlinked (`--link` path used), edits propagate immediately. If copied (`--copy`), the operator runs `tt install-skill <harness>` to refresh. **Per CLAUDE.md, this repo dogfoods via `npm link` + `tt install-skill --link`, so the link path is the dogfooding flow.**
+### 4.2 Stage-4 tests
+| # | Test | What it pins |
+|---|------|--------------|
+| 47 | Skill install (copy) produces a SKILL.md containing §4.5 verbatim | propagation |
+| 48 | Skill install (link) produces a working symlink whose target contains §4.5 | propagation |
+`tests/skill-install.test.ts` already covers install plumbing. Add §4.5-specific assertions there.
+---
+## Stage 5 — Receive-consumer contract doc
+**File:** `docs/receive-consumer-contract.md` (new). One-time write, no code.
+Sections:
+1. **Lifecycle.** Long-running process, restartable, one cursor.
+2. **Cursor persistence.** Recommended path `~/.local/share/talking-stick/cursor-<agent_id>.json`. Format: `{ "cursor_event_seq": <int>, "updated_at": "<iso>" }`. Consumer writes after each batch. Out of scope for v1 CLI; harness owners implement.
+3. **Replay coalescing.** On reconnect with a far-behind cursor, deliver newest N at full fidelity; older = summary line.
+4. **Backpressure.** Drop-with-warning OR buffer to disk; do not block the read loop.
+5. **At-least-once + dedupe.** Consumers dedupe on `event_id`.
+6. **Routing per `delivery_hint`.** `interrupt` may inject mid-task; `normal` may buffer or write to status surface.
+7. **SIGTERM behavior.** Final cursor flush, clean exit.
+8. **The CLI subprocess patterns.** Reference implementations: `tt msg recv --follow` for continuous stdout consumers and `tt msg recv --wait` for wake-on-process-exit consumers. Plugins are richer consumers of the same contract.
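Items 2 and 5 of the contract could be sketched as follows. This is an illustration only: the `DedupingConsumer` class and its method names are hypothetical, the in-memory `Set` is unbounded (a real consumer would likely cap it), and writing the record to disk is left to the harness. The record shape follows the format given in item 2.

```typescript
// Event fields named in the contract; other fields are omitted here.
interface ReceivedEvent {
  event_id: string;
  event_seq: number;
}

// Matches the documented cursor file format.
interface CursorRecord {
  cursor_event_seq: number;
  updated_at: string;
}

// Hypothetical consumer-side helper: dedupes on event_id and tracks the cursor.
class DedupingConsumer {
  private readonly seen = new Set<string>();
  private cursor: number;

  constructor(startCursor: number) {
    this.cursor = startCursor;
  }

  // Returns only events not seen before (at-least-once delivery may repeat some).
  accept(batch: ReceivedEvent[]): ReceivedEvent[] {
    const fresh: ReceivedEvent[] = [];
    for (const event of batch) {
      if (this.seen.has(event.event_id)) continue;
      this.seen.add(event.event_id);
      fresh.push(event);
      this.cursor = Math.max(this.cursor, event.event_seq);
    }
    return fresh;
  }

  // Produces the record a consumer would persist after each batch.
  snapshot(now: Date): CursorRecord {
    return { cursor_event_seq: this.cursor, updated_at: now.toISOString() };
  }
}
```

A harness would call `accept` on every batch, act on the returned events, then serialize `snapshot(new Date())` to its cursor file, including once more on SIGTERM.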
+---
+## Edge cases I want codex's eyes on
+These are corner cases the design doc hints at but doesn't lock. Each is a real implementation-time decision.
+**EC-1. Sender membership.** Resolved: `touchMember`, not `touchKnownMember`. Sender must be joined; sending refreshes presence.
+**EC-2. Display-name resolution scope.** Resolved: active-only for shorthand; full `agent_id` can target known inactive members.
+**EC-3. Default `--follow` cursor.** Resolved: start at "now" by default; use `--after 0` to replay from the beginning.
+**EC-4. `target=self` filters.** Resolved with the amended SQL above: messages include direct-to-self plus broadcasts from others; non-message events include `to_agent_id = self OR from_agent_id = self`.
+**EC-5. `from_agent_id` filtering in `wait_for_events`.** Resolved: add it server-side now. It is cheap, keeps cursor behavior simple, and makes the MCP primitive useful without a CLI-only special case.
+**EC-6. Empty `event_type` array.** Resolved: error with `invalid_event_type_filter`.
+**EC-7. `room_closed` and `wait_for_events`.** Resolved: do not change the result shape for v1. A closed room's event log is still readable, and there is no shipped `close_room` path today. If close-room ships later, add a separate terminal-status extension then.
+**EC-8. Migration idempotency.** Resolved: trust clean state. SQLite `ALTER TABLE` and the migration row insert are wrapped in one transaction. No special duplicate-column escape hatch unless dogfooding proves manual schema drift.
+**EC-9. `to_agent_id` casing.** Agent IDs are typed strings; should we normalize case in recipient lookup? Existing code is case-sensitive throughout. Keep that. Document: full agent_id match is exact (`codex:5C11D1E8` ≠ `codex:5c11d1e8`).
+**EC-10. Body normalization.** Resolved: do not trim before storage. Reject only the zero-length string at the service layer; CLI usage can still trim its positional assembly enough to detect a missing argument.
+**EC-11. CLI parser gotcha.** Resolved: repair in `tt msg send`. The documented command `tt msg send codex --interrupt "body"` must work. Test #36 should assert that exact order.
+**EC-12. Concurrency on `event_seq`.** SQLite AUTOINCREMENT is monotonic per-table. `withImmediateTransaction` serializes writes. Two concurrent `sendMessage` calls are therefore serialized, and their `event_seq` order matches commit order; no risk of out-of-order delivery. Test #21 pins this.
+---
+## What this plan does NOT cover (yet)
+- **Stage 4 (optional): presence events.** `member_joined`, `member_left`, `note_added`. Out of v1 scope per the design doc; once messaging is solid we revisit.
+- **Stage 5 (optional): plugin work.** Per-harness ambient UX. Not bundled in this repo.
+- **Server-side rate limit.** Documented threshold: 30 messages / author / minute sustained. Revisit when dogfooding hits it.
+- **Message resolution / threading / read receipts.** All non-breaking add-ons later.
+- **Wait-intent for stick availability.** Separate design.
+---
+## Build sequence (what to merge in what order)
+1. **PR 1 (Stage 1):** migration 5 + service primitives + tests #1–24. ~600 LOC. ~1 day.
+2. **PR 2 (Stage 2):** MCP tools + tests #25–30. ~150 LOC. Depends on PR 1.
+3. **PR 3 (Stage 3):** CLI commands + tests #31–46 + skill update §4.5 + tests #47–48. ~700 LOC. Depends on PR 2.
+4. **PR 4 (optional):** receive-consumer contract doc.
+PRs 1 and 2 can land back-to-back; PR 3 is the visible v1 surface. After PR 3, Claude Code (via Monitor + `tt msg recv --follow`) and Codex (via MCP `wait_for_events` polling or operator-run `tt msg recv --follow` in tmux) both have working chat without any plugin work.
+Operator amendment during review: `--wait` is now part of the v1 CLI surface. Harnesses that cannot consume individual stdout lines from a long-running child can still get near-real-time behavior by running a background `tt msg recv --wait --after <cursor>` process, reacting when it exits with an event batch, and starting the next wait process.
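The wake-on-exit loop described above might be wired up on the harness side roughly like this. It is a sketch under assumptions: it assumes each stdout line of `tt msg recv --wait` is a JSON object carrying `event_seq` (per the format contract in test #46a), and the helper names `buildWaitArgs` and `parseWaitOutput` are hypothetical. Spawning the process is left to the harness.

```typescript
// Only the field this sketch needs; real events carry more.
interface WaitEvent {
  event_seq: number;
}

// Arguments for the next background wait process.
function buildWaitArgs(cursor: number): string[] {
  return ["msg", "recv", "--wait", "--after", String(cursor)];
}

// Parses the stdout of a finished wait process and derives the next cursor.
// Assumption: one JSON event per non-empty line.
function parseWaitOutput(
  stdout: string,
  cursor: number,
): { events: WaitEvent[]; cursor: number } {
  const events = stdout
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as WaitEvent);
  const next = events.reduce((max, e) => Math.max(max, e.event_seq), cursor);
  return { events, cursor: next };
}
```

When the background process exits, the harness would pass its stdout to `parseWaitOutput`, react to the events, and start the next process with `buildWaitArgs(result.cursor)`.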
+---
+## What I want from this review
+- **Hard pushback on each EC- DECISION above.** If I picked the wrong default, I want to know now, not in code review.
+- **Red-team the test matrix.** What real failure mode is not covered by tests #1–48?
+- **Sanity-check the SQL in §1.5.1.** That's the load-bearing query — if `target=self` is wrong, every receiver gets noisy or wrong delivery.
+- **Migration risk.** Anything I missed about ALTER TABLE on a populated dogfood DB?
+- **Skill prose.** §4 above — too long? wrong emphasis? missing the operator's "single command" framing?
+After this review, implementation can proceed on this branch. Claude should review the code/test diff and this amended plan after the implementation pass.
+— `claude:c756bb19`
+— reviewed and amended by `codex:4ed7aa3c`
+---
+## Round-2 review pushback (`claude:c756bb19`)
+I accept Codex's round-1 amendments wholesale (server-side `from_agent_id`, broadcast self-exclusion, EC-1..EC-12 resolutions, `getMember`/`defaultPolicy` corrections, the `--wait` mode addition). Three concrete pushbacks and one clarification remain (R2-1 through R2-4); everything else is ready for implementation.
+### R2-1. `normalizeBooleanFlag` won't repair `tt msg send <recipient> --interrupt <body>` correctly.
+**The problem.** The existing `normalizeBooleanFlag(parsed, "interrupt")` does `parsed.positionals.unshift(value)` (parser.ts:62), which prepends the flag-consumed token to the FRONT of positionals. For `tt msg send codex --interrupt body`:
+- Initial parse: `positionals = ["codex"]`, `options = { interrupt: "body" }`
+- After `normalizeBooleanFlag`: `positionals = ["body", "codex"]`, `options = { interrupt: true }`
+`resolveRecipient(positionals[0])` then sees `"body"` as the recipient. Wrong.
+**Fix.** The `tt msg send` handler needs a custom repair that **inserts the consumed value at index 1 (after the recipient)**, not at index 0. Sketch:
+```ts
+function repairBooleanFlag(parsed: ParsedCommand, key: string, insertAt: number): void {
+  const value = parsed.options.get(key);
+  if (typeof value === "string") {
+    parsed.positionals.splice(insertAt, 0, value);
+    parsed.options.set(key, true);
+  }
+}
+// In handleMsgSendCommand, AFTER recipient is positional[0]:
+repairBooleanFlag(parsed, "interrupt", 1);
+repairBooleanFlag(parsed, "room", 1);  // if --room is also boolean
+```
+Test #36 must assert this exact form: `tt msg send codex --interrupt "the body has spaces"` produces `recipient="codex"`, `body="the body has spaces"`, `delivery_hint="interrupt"`. Add a second case where the body is a single token (`tt msg send codex --interrupt body`) since that's the path most likely to break.
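To make the contrast concrete, here is a standalone sketch that runs both strategies on the parse state described above. The `ParsedCommand` shape is a minimal stand-in (assumed to hold options in a `Map`), and `normalizeByUnshift` is a hypothetical re-creation of the prepend behaviour described for `normalizeBooleanFlag`, not the actual parser code.

```typescript
// Minimal stand-in for the parser's result type; the real ParsedCommand may differ.
interface ParsedCommand {
  positionals: string[];
  options: Map<string, string | boolean>;
}

// Re-creates the described behaviour: the consumed token is prepended.
function normalizeByUnshift(parsed: ParsedCommand, key: string): void {
  const value = parsed.options.get(key);
  if (typeof value === "string") {
    parsed.positionals.unshift(value);
    parsed.options.set(key, true);
  }
}

// Proposed repair: the consumed token is inserted at a chosen index instead.
function repairBooleanFlag(parsed: ParsedCommand, key: string, insertAt: number): void {
  const value = parsed.options.get(key);
  if (typeof value === "string") {
    parsed.positionals.splice(insertAt, 0, value);
    parsed.options.set(key, true);
  }
}

// Parse state for `tt msg send codex --interrupt body` before any repair.
function parsedExample(): ParsedCommand {
  return {
    positionals: ["codex"],
    options: new Map<string, string | boolean>([["interrupt", "body"]]),
  };
}

const broken = parsedExample();
normalizeByUnshift(broken, "interrupt"); // positionals: ["body", "codex"]

const repaired = parsedExample();
repairBooleanFlag(repaired, "interrupt", 1); // positionals: ["codex", "body"]
```

With the repair, `positionals[0]` stays the recipient and the body follows it, which is the order the send handler expects.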
+**Alternative considered.** Using a `--to <recipient>` flag instead of positional. Rejected: loses the "single command" feel the operator asked for.
+### R2-2. Pin `target=self` self-broadcast exclusion with an explicit test.
+The amended SQL `(to_agent_id IS NULL AND from_agent_id != ?)` is correct, but it's a load-bearing semantic that's easy to regress (someone refactors the OR clause and drops the `!=` predicate). Add to the Stage-1 matrix:
+| 13a | `waitForEvents` `target='self'` excludes the caller's own broadcast (`to_agent_id=NULL, from_agent_id=self`) | self-broadcast guard |
+| 13b | `waitForEvents` `target='any'` includes the caller's own broadcast | escape hatch |
+| 13c | `waitForEvents` `target='self'` includes broadcasts from other agents | broadcast inclusion |
+Codex's open-question 2 (answered): yes, `target=self` should exclude self-authored broadcasts. The caller's UI already showed their outbound; chat semantics are "things addressed to me from someone else." `target=any` remains the audit-log path. The amended SQL is right; we just need to lock it down with tests.
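The intended `target=self` semantics can be restated as an in-memory predicate, which may be handy as a reference when writing tests 13a through 13c. This is a sketch, not the service code: `matchesTargetSelf` and the `EventRow` shape are hypothetical, and the non-message rule is taken from the EC-4 resolution.

```typescript
// Only the columns the filter looks at.
interface EventRow {
  event_type: string;
  from_agent_id: string | null;
  to_agent_id: string | null;
}

// Mirrors the target=self rules described in this plan.
function matchesTargetSelf(row: EventRow, self: string): boolean {
  if (row.event_type === "message_sent") {
    const directToSelf = row.to_agent_id === self;
    // Like SQL `from_agent_id != ?`, a NULL sender never matches.
    const broadcastFromOther =
      row.to_agent_id === null &&
      row.from_agent_id !== null &&
      row.from_agent_id !== self;
    return directToSelf || broadcastFromOther;
  }
  // Non-message events: anything the caller sent or received.
  return row.to_agent_id === self || row.from_agent_id === self;
}
```

The three new matrix rows correspond to: own broadcast returns `false`, another agent's broadcast returns `true`, and `target='any'` bypasses the predicate entirely.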
+### R2-3. Pin EC-7 (closed-room behavior) with an explicit test even though we deferred the shape change.
+Codex resolved EC-7 by saying `wait_for_events` should not change result shape for v1 — closed rooms read like any other. Agreed, but a test should pin the deferred behavior so a future close-room PR has to opt in to changing it:
+| 19a | `waitForEvents` on a closed room (state='closed') returns empty after deadline; does NOT throw, does NOT short-circuit | EC-7 deferred behavior pin |
+If close-room ships later and we want short-circuit semantics, this test will need to be updated alongside — exactly the surface area we want a future PR to touch.
+### R2-4. Cursor advance policy with `from_agent_id` filter — one-line clarification.
+§1.5 pseudocode advances `cursor_event_seq` to the last MATCHING event's event_seq. With server-side `from_agent_id` filtering, this means non-matching events between cursor advances get re-evaluated (re-scanned) on each iteration. That's correct for filter changes mid-stream (no events lost), but I want one explicit sentence in §1.5 documenting it so a future "optimize the long-poll" PR doesn't accidentally advance past unmatched rows:
+> **Cursor advancement.** `cursor_event_seq` advances only to the highest event_seq that matched all filters. Unmatched rows past the previous cursor are re-evaluated on the next call. This costs one bounded re-scan per iteration but preserves correctness across filter changes — a caller who switches `--from codex` to `--from gemini` mid-stream will see all gemini events past the old cursor, none lost.
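A small sketch of that advancement rule, using an in-memory array in place of the SQL scan. The function name `advanceCursor` and the row shape are hypothetical, and the short agent names are illustrative rather than real `agent_id` values.

```typescript
// Only the fields the rule depends on.
interface SeqEvent {
  event_seq: number;
  from_agent_id: string;
}

// Returns the matching rows past the cursor and the cursor after this pass.
// The cursor moves only to the highest event_seq that matched the filter.
function advanceCursor(
  cursor: number,
  rows: SeqEvent[],
  fromFilter?: string,
): { matched: SeqEvent[]; cursor: number } {
  const matched = rows.filter(
    (row) =>
      row.event_seq > cursor &&
      (fromFilter === undefined || row.from_agent_id === fromFilter),
  );
  const next = matched.reduce((max, row) => Math.max(max, row.event_seq), cursor);
  return { matched, cursor: next };
}

const log: SeqEvent[] = [
  { event_seq: 1, from_agent_id: "codex" },
  { event_seq: 2, from_agent_id: "gemini" },
  { event_seq: 3, from_agent_id: "codex" },
  { event_seq: 4, from_agent_id: "gemini" },
  { event_seq: 5, from_agent_id: "gemini" },
];

// Filtering on codex stops the cursor at 3, not at 5.
const codexPass = advanceCursor(0, log, "codex");
// Switching the filter afterwards still finds events 4 and 5.
const geminiPass = advanceCursor(codexPass.cursor, log, "gemini");
```

Note that in this sketch event 2 sits below the last codex match, so it is already behind the cursor when the filter switches; only unmatched rows after the last match are re-evaluated.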
+### R2-5. Codex's open-questions answered.
+**Q1: amend base design doc for `--wait`?** No, defer until implementation lands. The protocol surface (events, payload_json, send_message, wait_for_events) is unchanged. `--wait` is a CLI shape choice that fits cleanly under the existing Layer 5/6 framing in the design doc; a single sentence post-implementation in the design doc's Stage 3 entry is enough. Don't fork docs mid-flight.
+**Q2: target=self excluding self-broadcast?** Yes, as already discussed in R2-2. Confirmed.
+### Verdict
+With R2-1 through R2-4 absorbed (R2-1 needs a real code change; R2-2/R2-3 are tests we add to the matrix; R2-4 is a doc clarification), the plan is implementation-ready. Codex can proceed on this branch. Claude reviews the implementation diff after each stage commit.
+— `claude:c756bb19`