talking-stick 0.1.4 → 0.3.0

@@ -0,0 +1,32 @@
1
+ # Receive Consumer Contract
2
+
3
+ `send_message` appends `message_sent` events to the room event log. `wait_for_events` is the canonical receive primitive. CLI consumers (`tt events --wait`, `tt events --follow`, `tt msg recv --wait`, `tt msg recv --follow`) and future harness-native consumers should share the same cursor and retry rules.
4
+
5
+ ## Delivery
6
+
7
+ - Delivery is at least once. Consumers must tolerate duplicates after restart.
8
+ - `event_seq` is monotonic per database and is the receive cursor.
9
+ - Consumers should persist the highest processed `event_seq` after each emitted batch.
10
+ - Directed messages are routing only. Any room member can read messages through `get_room_events` or `tt events --target any`.
11
+
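Reviewer sketch (not part of the package): the delivery rules above amount to a small amount of consumer-side state. The TypeScript below is a minimal illustration, assuming a hypothetical in-memory `ConsumerState` and an event shape that carries only the two fields the contract names, `event_id` and `event_seq`. A real consumer would also persist `cursor` to disk after each batch.

```typescript
// Hypothetical shapes for illustration; only `event_id` and `event_seq`
// are named by the contract, everything else here is an assumption.
interface RoomEvent {
  event_id: string;
  event_seq: number;
}

interface ConsumerState {
  cursor: number;     // highest event_seq seen so far (the receive cursor)
  seen: Set<string>;  // event_ids already emitted, used to drop replays
}

// Returns only the events not seen before and advances the cursor in place.
function processBatch(state: ConsumerState, batch: RoomEvent[]): RoomEvent[] {
  const fresh: RoomEvent[] = [];
  for (const ev of batch) {
    // The cursor tracks the highest sequence number, duplicates included.
    if (ev.event_seq > state.cursor) state.cursor = ev.event_seq;
    // At-least-once delivery: a restart may replay an event we already handled.
    if (state.seen.has(ev.event_id)) continue;
    state.seen.add(ev.event_id);
    fresh.push(ev);
  }
  return fresh;
}
```

After each call, `state.cursor` is the value a restarted consumer would pass as `--after`.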
12
+ ## Receive Modes
13
+
14
+ - Use `tt events --follow --after <cursor>` when the harness can monitor stdout from a long-running child. Each output line is one `RoomEvent` JSON object.
15
+ - Use `tt events --wait --after <cursor>` when the harness can only notice process completion. The process exits after the next matching batch or timeout; restart it with the latest processed cursor.
16
+ - Use `tt msg recv --wait` or `tt msg recv --follow` only when the consumer intentionally wants messages without turn handoffs.
17
+ - If no `--after` is supplied in `--wait` or `--follow` mode, the CLI starts from the current event tail to avoid flooding a new consumer with history.
18
+ - A one-shot `tt msg recv --after <cursor>` is a non-blocking drain operation.
19
+
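Reviewer sketch (not part of the package): a harness that monitors `--follow` stdout receives arbitrary chunks, not whole lines, so it has to buffer until a newline arrives before parsing each `RoomEvent`. The class below is a minimal illustration; `JsonLineBuffer` is a hypothetical name, and the sketch assumes UTF-8 text chunks and one JSON object per line as stated above.

```typescript
// Hypothetical helper: accumulates stdout chunks and yields one parsed
// JSON value per completed line.
class JsonLineBuffer {
  private pending = "";

  // Feed one chunk; returns the parsed value of every line it completed.
  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last piece is either an incomplete line or "" after a trailing newline.
    this.pending = lines.pop() ?? "";
    const parsed: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed !== "") parsed.push(JSON.parse(trimmed)); // throws on malformed JSON
    }
    return parsed;
  }
}
```

A caller may prefer to catch the `JSON.parse` error and log the offending line rather than let one malformed line end the read loop.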
20
+ ## Filtering
21
+
22
+ - `target=self` is the default for `--wait` and `--follow`. It receives direct messages to the caller plus broadcasts from other agents. It excludes the caller's own broadcasts.
23
+ - `target=any` receives all matching events/messages and is intended for audit/debug views.
24
+ - `--from <agent>` resolves a full `agent_id` or unambiguous active display name and is enforced server-side.
25
+
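Reviewer sketch (not part of the package): the `target=self` rule for messages can be read as a two-branch predicate. The TypeScript below is an illustration only. The field names `to_agent_id` and `from_agent_id` are taken from the 0.2.0 release notes in this diff, `MessageRow` is a hypothetical type, and the real filter runs server-side in SQL.

```typescript
// Hypothetical row shape; a null `to_agent_id` is assumed to mean a room broadcast.
interface MessageRow {
  from_agent_id: string;
  to_agent_id: string | null;
}

// True when a message belongs in the caller's default (target=self) view.
function visibleToSelf(row: MessageRow, caller: string): boolean {
  // Direct messages addressed to the caller are always included.
  if (row.to_agent_id === caller) return true;
  // Broadcasts are included unless the caller sent them.
  return row.to_agent_id === null && row.from_agent_id !== caller;
}
```

Under this reading, a direct message the caller sent to someone else is also excluded from the caller's own `target=self` view; `target=any` would show it.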
26
+ ## Consumer Responsibilities
27
+
28
+ - Keep `wait_for_turn` / `tt wait` running separately. Receive processes do not claim or grant the stick, even when they return pass, release, or assignment events.
29
+ - Treat an event wake as a prompt to read, reply, or retry `tt wait`. It is not permission to mutate shared files; only a `your_turn` wait result with a live guardian grants ownership.
30
+ - Decide how to surface `delivery_hint=interrupt`; the server only records the hint.
31
+ - Dedupe on `event_id` if restart replay is possible.
32
+ - Treat message bodies as room-visible text, not private data.
@@ -0,0 +1,85 @@
1
+ # Talking Stick 0.2.0
2
+
3
+ Date: 2026-04-30
4
+
5
+ Minor release that adds **out-of-band messaging** between agents in a room. Two agents — typically the holder and a non-holder, or two non-holders — can now exchange short conversational messages without passing the stick. The protocol substrate is one new column on `room_events`; the surface is two MCP tools (`send_message`, `wait_for_events`) and three CLI commands (`tt msg send`, `tt msg recv`, `tt events --wait|--follow`).
6
+
7
+ The feature targets **Vignette H** from the design doc: holder + non-holder alternating short messages on a sub-question, paying ~80 tokens of body per round-trip instead of the ~600 tokens of structured-handoff scaffolding when the discussion would have otherwise required `pass_stick`/`release_stick` ping-pong.
8
+
9
+ ## Added
10
+
11
+ ### Out-of-band messaging
12
+
13
+ Three CLI commands. All wrap the same MCP/service primitives.
14
+
15
+ ```bash
16
+ tt msg send <recipient|room> "<body>" [--interrupt] [--stdin] [--path DIR]
17
+ tt msg recv [--wait|--follow] [--from agent] [--after N] [--target self|any|agent] [--path DIR]
18
+ tt events --wait|--follow [--event TYPE[,TYPE]] [--target self|any|agent]
19
+ ```
20
+
21
+ - `<recipient>` is a full `agent_id`, an unambiguous active display name (`codex`, `claude`), or the literal `room` for broadcast.
22
+ - `--interrupt` marks the message time-sensitive. The receiving harness or operator decides whether to act on it now; the protocol delivers, the consumer routes.
23
+ - `tt msg recv --follow` is a long-running tail (one JSON line per event) suited to harnesses that can monitor child stdout (Claude Code Monitor, terminals).
24
+ - `tt msg recv --wait` exits on the next matching batch — ideal for harnesses that can launch a background command and notice when it completes; restart with `--after <last_event_seq>` to resume.
25
+
26
+ The matching MCP tools are `send_message` (write) and `wait_for_events` (observer-safe long-poll). `get_room_events` now returns parsed `payload` for `message_sent` rows alongside the existing `handoff` field for legacy event types.
27
+
28
+ ### Observer-safe event long-poll
29
+
30
+ `wait_for_events` is non-mutating by contract. It does not call `touchMember`, `touchKnownMember`, `touchWaitingMember`, or `purgeExpiredIdleRooms`. The only read it performs at entry is `requireRoom` for fail-fast on a missing room. Non-holders can long-poll the event log freely without disturbing the `last_wait_at` / `last_seen_at` bookkeeping that drives turn fairness.
31
+
32
+ ### `getLatestEventSeq` cursor helper
33
+
34
+ `tt msg recv --wait|--follow` defaults to "start at now" — the highest `event_seq` in the room at startup time — so first-launch receivers don't replay history. Implemented as a single `SELECT MAX(event_seq) FROM room_events WHERE room_id = ?`, exposed on the service and commands layer. Operators wire `--after $LAST_SEQ` from their own bookkeeping when resuming after a crash; cursor persistence to disk is the harness's or plugin's responsibility per the receive-consumer contract.
35
+
36
+ ### Splice-at-1 parser repair for boolean flags after positionals
37
+
38
+ The CLI parser consumes the next non-`--` token as a flag's value. That meant `tt msg send codex --interrupt body` would parse `interrupt="body"` and leave `codex` as the only positional. The handler now repairs this case by splicing the consumed value at positional index 1 (after the recipient), so `tt msg send codex --interrupt "body"` produces `recipient=codex`, `body="body"`, `delivery_hint=interrupt`. The existing `normalizeBooleanFlag` helper unshifts to the front (correct for `tt notes add --stdin` etc.); this new repair handles the `<positional> <body>` shape without weakening the generic parser.
39
+
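Reviewer sketch (not part of the package): the splice-at-1 repair described above can be illustrated on an already-parsed argument structure. `Parsed` and `repairBooleanAfterPositional` are hypothetical names; the actual handler may be structured differently.

```typescript
// Hypothetical parser output: positionals after the subcommand, plus flags.
interface Parsed {
  positionals: string[];
  flags: Record<string, string | boolean>;
}

// If a boolean flag swallowed the next token as its value, put that token
// back at positional index 1 (right after the recipient) and set the flag to true.
function repairBooleanAfterPositional(parsed: Parsed, flag: string): Parsed {
  const value = parsed.flags[flag];
  if (typeof value !== "string") return parsed; // flag absent or already boolean
  const positionals = parsed.positionals.slice();
  positionals.splice(1, 0, value);
  return { positionals, flags: { ...parsed.flags, [flag]: true } };
}
```

Inserting at index 1 rather than appending keeps the original token order when more positionals follow the swallowed one.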
40
+ ### Receive-consumer contract
41
+
42
+ [`docs/receive-consumer-contract.md`](../receive-consumer-contract.md) documents the lifecycle expected of any receive consumer (CLI subprocess, future plugin, harness adapter): cursor persistence, replay coalescing on far-behind cursors, backpressure (drop-with-warning, never block the read loop), at-least-once delivery + dedupe on `event_id`, SIGTERM clean exit with the last cursor flushed to stderr.
43
+
44
+ ## Skill
45
+
46
+ The bundled skill at [`skills/talking-stick/SKILL.md`](../../skills/talking-stick/SKILL.md) gains a new §4.5 *Out-of-band messaging* section:
47
+
48
+ - send via `tt msg send <recipient> "<body>"` or MCP `send_message`
49
+ - receive via `tt msg recv --wait` or `--follow` depending on what the harness can observe
50
+ - when to message (conversational, ephemeral, between live processes) vs note (durable, resolvable artifacts) vs handoff (transfer of work)
51
+ - messages are routing not ACL — `to_agent_id` is delivery, not privacy
52
+ - messages do not grant the stick — paging the holder gets attention, not write authority
53
+ - a `tt msg recv` subprocess does not replace `wait_for_turn` — keep waiting for your turn in parallel
54
+
55
+ The skill also picks up a small note in §1 reminding harnesses that sibling `tt msg recv --wait` / `--follow` subprocesses may be running and should be left alone unless the operator says otherwise.
56
+
57
+ ## Migration
58
+
59
+ `room_events` gains a nullable `payload_json TEXT` column (migration #5). `ALTER TABLE ADD COLUMN` is O(1) on populated tables; existing rows back-fill to NULL; legacy event types continue to write NULL via the optional `payload?` parameter on `appendEvent`. No action required by operators on upgrade — the column is invisible to v0.1.x clients.
60
+
61
+ ## Design properties pinned by tests
62
+
63
+ - **Self-broadcast exclusion** for `target=self`: caller's own broadcasts (`to_agent_id IS NULL AND from_agent_id = caller`) are excluded from their default receive view; the audit path (`target=any`) still includes them. The SQL clause is `(event_type='message_sent' AND (to_agent_id = ? OR (to_agent_id IS NULL AND from_agent_id != ?)))` — pinned by tests 13a/13b/13c in `tests/oob-substrate.test.ts`.
64
+ - **Closed-room behavior** (deferred): `wait_for_events` on a `state='closed'` room returns empty after deadline; no short-circuit, no error. Pinned by test 19a so a future `close_room` PR has to opt in to changing it.
65
+ - **Body cap.** 4096 bytes UTF-8; rejected with typed `message_too_large`. No silent truncation.
66
+ - **Sender filter** (`from_agent_id`) applied server-side, so cursor advancement under `tt msg recv --from <agent>` is honest.
67
+ - **SIGTERM lifecycle** for `tt msg recv --follow` covered by a real subprocess test that spawns the CLI, sends a message via MCP, asserts the JSON line on stdout, sends SIGTERM, and verifies clean exit.
68
+
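Reviewer sketch (not part of the package): the body cap is stated in UTF-8 bytes, not characters, so a length check has to count encoded bytes. The TypeScript below is a minimal illustration; `utf8ByteLength` and `checkBody` are hypothetical names, and only the 4096-byte limit and the `message_too_large` code come from the notes above.

```typescript
const MAX_BODY_BYTES = 4096;

// Counts the UTF-8 encoded size of a string from its UTF-16 code units.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit < 0x80) {
      bytes += 1;
    } else if (unit < 0x800) {
      bytes += 2;
    } else if (
      unit >= 0xd800 && unit <= 0xdbff && i + 1 < text.length &&
      text.charCodeAt(i + 1) >= 0xdc00 && text.charCodeAt(i + 1) <= 0xdfff
    ) {
      bytes += 4; // a surrogate pair is one code point encoded in 4 bytes
      i++;        // skip the low surrogate
    } else {
      bytes += 3; // rest of the BMP; a lone surrogate is counted as 3 bytes
    }
  }
  return bytes;
}

// Rejects oversized bodies with a typed error instead of truncating them.
function checkBody(body: string): { ok: boolean; error?: string } {
  return utf8ByteLength(body) <= MAX_BODY_BYTES
    ? { ok: true }
    : { ok: false, error: "message_too_large" };
}
```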
69
+ ## Verification
70
+
71
+ ```bash
72
+ npm run typecheck # clean
73
+ npm run build # clean
74
+ npm test # 263 passed (was 257 before fd67873)
75
+ tt --help | grep "tt msg" # tt msg send/recv visible
76
+ ```
77
+
78
+ End-to-end dogfood pre-release: claude (MCP) ↔ codex (MCP) ↔ codex (CLI) round-tripped 6 messages (events 668→675) in the live coordination room with zero `pass_stick`/`release_stick` calls during the chat. Both `target=self` (excludes own broadcast) and `target=any` (includes own broadcast) verified in production.
79
+
80
+ ## Plan and design
81
+
82
+ - [`docs/plans/out-of-band-signaling.md`](../plans/out-of-band-signaling.md) — converged design (commit 8069d84)
83
+ - [`docs/plans/out-of-band-signaling-implementation.md`](../plans/out-of-band-signaling-implementation.md) — file-by-file build sequence with R1/R2 review history
84
+ - [`docs/receive-consumer-contract.md`](../receive-consumer-contract.md) — lifecycle, cursor, replay, backpressure
85
+ - [`skills/talking-stick/SKILL.md`](../../skills/talking-stick/SKILL.md) §4.5 — when-to-message-vs-note-vs-handoff guidance
@@ -0,0 +1,77 @@
1
+ # Talking Stick 0.3.0
2
+
3
+ Date: 2026-05-05
4
+
5
+ Breaking release that makes the `tt` CLI the only harness integration contract.
6
+ Talking Stick no longer installs or serves an MCP adapter. Agents coordinate by
7
+ running `tt` subprocesses for join/wait/handoff, notes, messages, and event
8
+ receive.
9
+
10
+ ## Breaking Changes
11
+
12
+ ### MCP server surface removed
13
+
14
+ Removed the MCP stdio server implementation, `tt mcp` command registration,
15
+ MCP-specific tests, and the `@modelcontextprotocol/sdk` dependency. The package
16
+ exports no MCP server helpers. `tt --help` no longer advertises MCP startup, and
17
+ `tt install` no longer writes MCP server config.
18
+
19
+ ### `tt install` is skill-only
20
+
21
+ `tt install <harness>` now installs or refreshes the bundled
22
+ `talking-stick` skill for Claude Code, Codex, Gemini, and OpenCode. The older
23
+ `tt install-skill` and `tt uninstall-skill` command surface is gone because
24
+ `tt install` / `tt uninstall` own skill installation directly.
25
+
26
+ ## Migration
27
+
28
+ ### Stale MCP cleanup
29
+
30
+ Updates remove stale Talking Stick MCP registrations from older installs instead
31
+ of keeping the broken dual integration path alive.
32
+
33
+ Cleanup runs from:
34
+
35
+ - package postinstall when installed under `node_modules/talking-stick`
36
+ - `tt self-update` after the package manager command returns
37
+ - the first normal installed-package `tt` invocation after a package-version
38
+ change
39
+ - explicit `tt install` and `tt uninstall`
40
+
41
+ Each run appends JSONL audit entries to
42
+ `${TALKING_STICK_DATA_DIR}/update-migrations.log`. OpenCode cleanup is
43
+ shape-strict: only the canonical `mcp.talking-stick` entry with `["tt", "mcp"]`
44
+ is removed. Claude Code, Codex, and Gemini use their native `mcp remove`
45
+ commands for the old `talking-stick` server name.
46
+
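Reviewer sketch (not part of the package): "shape-strict" cleanup can be pictured as removing the entry only when it matches the canonical form exactly. The TypeScript below is an illustration under stated assumptions: the array is assumed to live under a `command` key, and `removeCanonicalEntry` is a hypothetical name. The notes above confirm only the `mcp.talking-stick` path and the `["tt", "mcp"]` value.

```typescript
type Json = { [key: string]: unknown };

function isRecord(value: unknown): value is Json {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Removes mcp["talking-stick"] only if its (assumed) `command` is exactly ["tt", "mcp"].
// Anything else, including a hand-edited variant, is left untouched.
function removeCanonicalEntry(config: Json): { config: Json; removed: boolean } {
  const mcp = config["mcp"];
  if (!isRecord(mcp)) return { config, removed: false };
  const entry = mcp["talking-stick"];
  if (!isRecord(entry)) return { config, removed: false };
  const cmd = entry["command"];
  const canonical =
    Array.isArray(cmd) && cmd.length === 2 && cmd[0] === "tt" && cmd[1] === "mcp";
  if (!canonical) return { config, removed: false };
  // Copy every other server entry so unrelated registrations survive.
  const rest: Json = {};
  for (const key of Object.keys(mcp)) {
    if (key !== "talking-stick") rest[key] = mcp[key];
  }
  return { config: { ...config, mcp: rest }, removed: true };
}
```

Refusing to touch near-matches trades completeness for safety: a customized entry stays in place for the operator to review.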
47
+ ## CLI-Only Runtime
48
+
49
+ The bundled skill now teaches harnesses to start
50
+ `tt events --follow --json` as the ambient receiver, keep
51
+ `tt wait --json` running for turn ownership, and verify the
52
+ returned guardian pid before long edits. `tt msg recv` remains a messages-only
53
+ fallback; the unified event stream is the primary OOB path because turn
54
+ handoffs and messages share one ordered feed.
55
+
56
+ For harnesses that cannot consume a long-running stdout stream, the documented
57
+ fallback is `tt events --wait --after <cursor> --json` as an observer-only wake
58
+ process alongside the normal `tt wait --json` ownership loop. Event wakes do not
59
+ grant the stick; agents still need a `your_turn` wait result and live guardian
60
+ before editing.
61
+
62
+ CLI identity resolution now prefers stable harness ancestry over transient
63
+ terminal ids when no explicit harness session id exists. That keeps repeated
64
+ shell-outs from one harness attached to the same room member.
65
+
66
+ ## Verification
67
+
68
+ ```bash
69
+ npm run typecheck
70
+ npm test
71
+ npm run build
72
+ git diff --check
73
+ ```
74
+
75
+ Stage validation covered the migration runner, install/uninstall/self-update
76
+ cleanup wiring, child-process CLI receive behavior, guardian repair, full-suite
77
+ tests after MCP deletion, and built `dist/` output with no MCP/server files.
@@ -742,8 +742,9 @@ Recommended defaults (product scale, sized for real agent work rather than chat
742
742
  owner_lease_ttl_ms = 45 * 60 * 1000; // 45 minutes
743
743
  heartbeat_interval_ms = 5 * 60 * 1000; // 5 minutes
744
744
  claim_ttl_ms = 20 * 60 * 1000; // 20 minutes
745
- wait_for_turn_max_wait_ms = 30 * 1000; // 30 seconds
745
+ wait_for_turn_max_wait_ms = 110 * 1000; // 110 seconds
746
746
  wait_for_turn_poll_ms = 250; // transport polling cadence
747
+ wait_for_events_max_wait_ms = 110 * 1000; // 110 seconds
747
748
  presence_ttl_ms = 4 * 60 * 60 * 1000; // 4 hours
748
749
  waiter_grace_ms = 10 * 1000; // 10 seconds
749
750
  ```
@@ -1158,5 +1159,5 @@ presence TTL: 4 hours
1158
1159
  close semantics: no `close_room` tool in the MVP implementation;
1159
1160
  rooms remain resumable and can become dormant
1160
1161
  when nobody is live
1161
- wait_for_turn max wait: 30 seconds, polled at 250 ms
1162
+ wait_for_turn max wait: 110 seconds, polled at 250 ms
1162
1163
  ```
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "talking-stick",
3
- "version": "0.1.4",
4
- "description": "MCP coordination server for path-scoped agent handoffs.",
3
+ "version": "0.3.0",
4
+ "description": "CLI coordination tool for path-scoped agent handoffs.",
5
5
  "type": "module",
6
6
  "bin": {
7
7
  "tt": "dist/cli.js"
@@ -11,18 +11,19 @@
11
11
  },
12
12
  "files": [
13
13
  "dist",
14
+ "scripts",
14
15
  "skills",
15
16
  "docs",
16
17
  "README.md"
17
18
  ],
18
19
  "scripts": {
19
20
  "build": "tsc -p tsconfig.build.json && chmod +x dist/cli.js",
21
+ "postinstall": "node scripts/postinstall-mcp-cleanup.cjs",
20
22
  "prepare": "tsc -p tsconfig.build.json && chmod +x dist/cli.js",
21
23
  "test": "vitest run",
22
24
  "typecheck": "tsc -p tsconfig.json --noEmit"
23
25
  },
24
26
  "dependencies": {
25
- "@modelcontextprotocol/sdk": "^1.29.0",
26
27
  "better-sqlite3": "^12.9.0",
27
28
  "zod": "^3.25.76"
28
29
  },
@@ -0,0 +1,25 @@
+ #!/usr/bin/env node
+ const { spawnSync } = require("node:child_process");
+ const fs = require("node:fs");
+ const path = require("node:path");
+
+ const cliPath = path.resolve(__dirname, "..", "dist", "cli.js");
+ const packageRoot = path.resolve(__dirname, "..").replace(/\\/g, "/");
+
+ if (
+   process.env.TALKING_STICK_DISABLE_MCP_MIGRATION ||
+   !packageRoot.includes("/node_modules/talking-stick") ||
+   !fs.existsSync(cliPath)
+ ) {
+   process.exit(0);
+ }
+
+ spawnSync(process.execPath, [cliPath, "migrate-mcp", "--reason", "update", "--quiet"], {
+   stdio: "ignore",
+   env: {
+     ...process.env,
+     TALKING_STICK_SKIP_STARTUP_MAINTENANCE: "1"
+   }
+ });
+
+ process.exit(0);
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: talking-stick
3
- description: Use when working in a repo that coordinates multiple agent harnesses with Talking Stick (`tt` / `talking-stick`), or when the user asks you to avoid parallel work, wait your turn, pass structured handoffs, or coordinate with Claude, Codex, Gemini, or OpenCode in the same workspace. Also use when a workspace contains a `.talking-stick/` marker or when the MCP tools `list_rooms`, `join_path`, `leave_room`, `kick_member`, `wait_for_turn`, `heartbeat`, `release_stick`, `pass_stick`, `takeover_stick`, `get_room_state`, `get_room_events`, `add_note`, or `list_notes` are available.
3
+ description: Use when working in a repo that coordinates multiple agent harnesses with Talking Stick (`tt` / `talking-stick`), or when the user asks you to avoid parallel work, wait your turn, pass structured handoffs, or coordinate with Claude, Codex, Gemini, or OpenCode in the same workspace. Also use when a workspace contains a `.talking-stick/` marker.
4
4
  ---
5
5
 
6
6
  This skill teaches a harness how to behave in a Talking Stick workspace.
@@ -18,46 +18,67 @@ Use this skill when any of these are true:
18
18
  - the user mentions `talking-stick`, `tt`, handoffs, turn-taking, or avoiding parallel work
19
19
  - the repo is known to use Talking Stick coordination
20
20
  - a `.talking-stick/` marker exists
21
- - the Talking Stick MCP tools are available in the current harness
22
21
 
23
22
  Do not use this skill for ordinary single-agent work in repos that are not using Talking Stick.
24
23
 
25
24
  ## Workflow
26
25
 
27
- ### 1. Check that Talking Stick is actually available
26
+ ### 1. Use The CLI
28
27
 
29
- Prefer the Talking Stick MCP tools when they are available. If they are not available but the `tt` CLI is on `PATH`, use the CLI instead (`tt list`, `tt join`, `tt leave`, `tt kick`, `tt wait`, `tt state`, `tt release`, `tt pass`, `tt assign`, `tt take`). Do not treat missing MCP tools alone as proof that coordination is unavailable.
28
+ Use the `tt` CLI for all Talking Stick coordination. Do not use old Talking Stick MCP tools for repo coordination, even if an older install exposes them; the CLI is the source of truth. Current updates should remove stale Talking Stick MCP registrations automatically.
30
29
 
31
- If coordination is required and neither the MCP tools nor the `tt` CLI are available, say so briefly and ask the user whether they want to install or enable Talking Stick first. Do not pretend coordination is active.
30
+ Useful commands:
31
+
32
+ - `tt whoami --json`
33
+ - `tt join --json`
34
+ - `tt wait --json`
35
+ - `tt try --json`
36
+ - `tt state --json`
37
+ - `tt events --after N --target any --json`
38
+ - `tt notes add "..." --json`
39
+ - `tt notes list --json`
40
+ - `tt events --follow --json`
41
+ - `tt msg send <recipient|room> "..." --json`
42
+ - `tt msg recv --follow --json` (messages-only fallback when an event-stream consumer is too broad)
43
+ - `tt release --stdin`
44
+ - `tt assign <agent_id|next> --stdin`
45
+ - `tt take --reason "..." --json`
46
+
47
+ Some workspaces may also have sibling receive processes running `tt events --follow`, `tt msg recv --wait`, or `tt msg recv --follow`; leave them alone unless the operator explicitly asks you to stop or restart them.
48
+
49
+ If coordination is required and `tt` is unavailable, say so briefly and ask the user whether they want to install or enable Talking Stick first. Do not pretend coordination is active.
32
50
 
33
51
  Human CLI runs silently keep already-installed Claude Code, Codex, and OpenCode skill copies/symlinks aligned with the bundled Talking Stick skill. This is best effort and only updates existing installs; Gemini skills are registry-managed and should be refreshed with `tt install gemini` when needed.
34
52
 
35
- ### 2. Join the workspace room once
53
+ ### 2. Join The Workspace Room Once
36
54
 
37
- On the first substantial task in a Talking Stick workspace:
55
+ On the first substantial task in a Talking Stick workspace, run:
38
56
 
39
- 1. call `join_path` with the current workspace path
40
- 2. keep the returned `room_id`
41
- 3. note the returned policy, especially `heartbeatIntervalMs`
57
+ ```sh
58
+ tt join --json
59
+ ```
42
60
 
43
- If the workspace is nested, accept the resolved canonical path the server returns.
61
+ Keep the returned room id and canonical path in mind. The current working directory is the implicit path for normal commands; pass an explicit path only when coordinating a different directory or intentionally selecting a nested room.
44
62
 
45
- ### 3. Wait before doing shared work
63
+ Right after joining, start a background ambient receiver so direct messages and turn passes/reservations surface as soon as they happen instead of waiting for the next time you poll:
46
64
 
47
- Before making shared edits or running owner-style actions, call `wait_for_turn`.
65
+ ```sh
66
+ tt events --follow --json
67
+ ```
48
68
 
49
- Use the `room_id` returned by `join_path`. Do not pass the original filesystem path to `wait_for_turn`; path resolution belongs to `join_path`, and waiting must target the exact resolved room. This avoids ambiguity when a nested workspace resolves to an ancestor room or when multiple rooms could exist under the same tree.
69
+ For `tt events --wait` and `tt events --follow`, the default target is `self`; add `--target any` only for audit/debug views. If your harness can stream a child process's stdout into the model's context (Claude Code's Monitor, Codex `attach`-style), this is enough: each line becomes an event you see mid-task. If your harness can only notice that a backgrounded command exits, use the polling fallback in §4.5. Without an ambient receiver, neither messages nor turn handoffs reach you between deliberate `tt wait` / `tt events` calls.
50
70
 
51
- Keep the wait input minimal:
71
+ The ambient receiver is not a turn claimant. It never grants the stick and never starts the lease guardian. Keep using `tt wait --json` for ownership.
52
72
 
53
- ```json
54
- {
55
- "room_id": "<room_id from join_path>",
56
- "max_wait_ms": 110000
57
- }
73
+ ### 3. Wait Before Shared Work
74
+
75
+ Before making shared edits or running owner-style actions, run:
76
+
77
+ ```sh
78
+ tt wait --json
58
79
  ```
59
80
 
60
- `max_wait_ms` is optional. Use the longest client-safe wait you can support: 110000 ms is a good MCP default when the harness can tolerate it; 180000 ms is fine only when the tool/client timeout is known to exceed that. If the call times out at the harness layer, fall back to a shorter value and call again. Do not send `cursor`, even if an old tool schema still exposes it; `wait_for_turn` is cursor-free, and resumable event replay belongs to `get_room_events`.
81
+ The default wait timeout is `110s`, which is the normal active-coordination setting. If your harness has a shorter tool timeout, override with the longest safe value and immediately wait again when it returns without granting the turn. Do not busy-loop with short waits.
61
82
 
62
83
  Possible outcomes:
63
84
 
@@ -66,131 +87,153 @@ Possible outcomes:
66
87
  - `takeover_available`: surface the reason and make takeover explicit
67
88
  - `closed`: stop and explain that the room is closed
68
89
 
69
- ### 4. While waiting
70
-
71
- **Prefer to run the wait in the background.** If your harness supports running a command or subtask in the background, launch the wait (`wait_for_turn` or `tt wait`) as a background process so your foreground stays free for other work — reading, planning, answering the operator — until your turn arrives. Blocking the whole harness on the wait defeats the point.
72
-
73
- **Prefer wait cycles over scheduled wakeups.** A direct `wait_for_turn` long-poll keeps your cadence aligned with other agents and usually notices a released stick within the same cycle. Use scheduling only when your harness cannot keep a wait running in the background, or when it must return control between checks.
74
-
75
- Wakeup pattern:
90
+ A successful `tt wait` or `tt take` starts an internal `tt guard` lease guardian and returns `guardian_pid` in JSON. Verify the field is present and the pid is alive before you start a long edit; the guardian is what keeps your lease from expiring after the foreground `tt wait` process exits. If `guardian_pid` is missing or the pid is gone, stop, run `tt wait` again to repair the guardian (it will detect the existing ownership and respawn the guardian), and only then continue. Do not kill that guardian.
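Reviewer sketch (not part of the package): the outcome handling and guardian check described above can be summarized as a small decision function. The TypeScript below is an illustration only. The `status` field name, the `NextStep` labels, and the injected `isAlive` probe are assumptions; `guardian_pid` and the outcome names come from the skill text (`not_yet` appears in the older wording).

```typescript
type WaitStatus = "your_turn" | "not_yet" | "takeover_available" | "closed";

// Hypothetical shape of the parsed `tt wait --json` result.
interface WaitResult {
  status: WaitStatus;     // assumed field name
  guardian_pid?: number;  // named in the skill text
}

type NextStep = "edit" | "repair_guardian" | "wait_again" | "surface_takeover" | "stop";

// Maps a wait result to what the harness should do next.
function nextStep(result: WaitResult, isAlive: (pid: number) => boolean): NextStep {
  switch (result.status) {
    case "your_turn": {
      // Ownership alone is not enough: the lease guardian must be running.
      const pid = result.guardian_pid;
      return pid !== undefined && isAlive(pid) ? "edit" : "repair_guardian";
    }
    case "not_yet":
      return "wait_again";
    case "takeover_available":
      return "surface_takeover";
    case "closed":
      return "stop";
  }
}
```

In Node, one way to build `isAlive` is a `try`/`catch` around `process.kill(pid, 0)`, which checks that the process exists without sending a signal.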
76
91
 
77
- 1. Probe `wait_for_turn` with `max_wait_ms: 0`.
78
- 2. If it returns `not_yet`, schedule a wakeup and return control to the harness. Keep active multi-agent wakeups tight: use 60-120 s, and never more than 120 s unless the operator explicitly pauses the room or the task is blocked outside the room.
79
- 3. On wakeup, repeat from step 1.
92
+ ### 4. While Waiting
80
93
 
81
- Scheduled wakeups are a fallback, not a reason to check in more slowly than agents using `wait_for_turn` directly. If your harness has neither background work nor wakeups, fall back to synchronous long-polls with the longest client-safe `max_wait_ms` from §3.
94
+ Prefer to run `tt wait` in the background if your harness supports background commands. That keeps the foreground free for reading, planning, answering the operator, and watching OOB messages until your turn arrives.
82
95
 
83
- Whether the wait runs in the foreground or the background, call it **once** with the client-safe `max_wait_ms` budget from above and let the server long-poll. When it returns without `your_turn`, call it again. Do not busy-loop with short waits; that generates log noise and burns cache without buying anything.
96
+ Prefer wait cycles over scheduled wakeups. A direct long-poll stays aligned with other agents and usually notices a released stick within the same cycle. Use scheduled wakeups only when your harness cannot keep a wait running in the background.
84
97
 
85
- Coordination is meant to be lightweight. `wait_for_turn` is the only long-running call you should make. Room-inspection RPCs (`get_room_state`, `get_room_events`) exist to answer specific questions ("who holds the stick right now?", "what was in my predecessor's handoff?"); do not call them on a timer or repeatedly just to check on another agent's progress. If you find yourself inspecting the room more than a few times per turn, stop; long-poll on `wait_for_turn` instead and trust the protocol.
98
+ Do not replace `tt wait` with an event receiver. `tt events --wait` is only a wake channel for messages and handoff/reservation events. If it exits with a pass, release, assignment, or message, process the event, then run or continue `tt wait --json`; do not touch shared files unless that wait returns `your_turn` and a live `guardian_pid`.
86
99
 
87
100
  If you do not have the stick:
88
101
 
89
102
  - do not make shared repo changes
90
103
  - do not silently race another harness
91
- - it is fine to read, plan, review, or help the user think — or any other work that does not mutate shared state
104
+ - it is fine to read, plan, review, or help the user think
92
105
  - tell the user who currently holds or is reserved the turn when that is useful
93
106
 
94
- The wait is for *active* non-mutating work, not idle sleep. Re-read the holder's last handoff, follow up on its `artifacts[]`, investigate the area they are touching, and rethink the plan from your own angle. If you find something the holder should know — a missed invariant, a related bug, a sharper plan — leave a note with `add_note` rather than sitting on it until your next turn. Notes do not grant permission to edit shared files; they are observations and pointers, not coordination bypasses. The point: while you wait you can still move the work forward by feeding the holder, not by stalling.
107
+ The wait is for active non-mutating work, not idle sleep. Re-read the holder's last handoff, follow up on its `artifacts[]`, investigate the area they are touching, and rethink the plan from your own angle. If you find something the holder should know, leave a durable note:
95
108
 
96
- When you do take the stick, first read the attached handoff and load any useful `artifacts[]`, then run `list_notes` once so you see what other members left for you. The owner's turn is the right place to act on a note, not to debate it with its author mid-turn.
109
+ ```sh
110
+ tt notes add "Finding or pointer for the current/next holder." --json
111
+ ```
97
112
 
98
- ### 5. While holding the stick
113
+ Room inspection exists to answer specific questions, not to poll. Do not run `tt state` after a routine `tt wait`; the wait result already says who owns or is reserved for the turn. Use `tt state`, `tt events --target any`, and `tt notes list` sparingly when the wait result is insufficient or you are debugging stale members, takeover, or history.
99
114
 
100
- If the task may run longer than a few minutes, heartbeat periodically.
115
+ When you do take the stick, first read the attached handoff and load any useful `artifacts[]`, then run `tt notes list --json` once so you see what other members left for you.
101
116
 
102
- Use the cadence from `join_path.policy.heartbeatIntervalMs` when available. Do not invent your own cadence if the server already told you one.
117
+ ### 4.5 Out-Of-Band Messaging
103
118
 
104
- **Holding the stick is for active work.** The moment you stop actively editing, reasoning through edits, or asking the operator a blocking question, release or pass. Do not idle-hold the room while waiting on long verification, non-blocking operator input, CI, or any other pause where another harness could make progress.
119
+ The talking stick guarantees single-writer authority over shared workspace state. It is not a chat protocol. For transient signaling, use messages.
105
120
 
106
- ### 6. Takeover is explicit
121
+ Send:
107
122
 
108
- If `wait_for_turn` reports `takeover_available`:
123
+ ```sh
124
+ tt msg send <recipient|room> "message body" --json
125
+ ```
109
126
 
110
- - explain why takeover is available (`owner_timeout`, `owner_gone`, `claim_timeout`, `recipient_gone`)
111
- - do not silently take over just because it is possible
112
- - if takeover is chosen, call `takeover_stick`
113
- - after takeover, call `get_room_events` so you can reconstruct the last handoff before touching code
127
+ Recipient is a full `agent_id`, an unambiguous active display name, or the literal `room` for broadcast. `--interrupt` marks the message as time-sensitive; the receiver decides whether to act on it now.
114
128
 
115
- If the operator explicitly tells you to take over despite a reservation or live owner, use the CLI path when available: `tt take --operator-requested --reason "<operator requested takeover>"`. Do not invent this override yourself; it is for direct operator intervention.
129
+ Receive with the mode your harness can observe. The recommended primary path is the unified event stream you started in §2:
116
130
 
117
- ### 7. Finish with a real handoff
131
+ ```sh
132
+ tt events --follow --json
133
+ ```
118
134
 
119
- When you are done with your turn, default to `release_stick`.
135
+ That command streams direct messages, broadcasts, and turn passes/reservations for you as a single ordered feed, with one JSON event per line. Use it whenever your harness can stream a child process's stdout into the model's context. If the harness can only notice that a backgrounded command has exited, use the polling fallbacks:
120
136
 
121
- **Default to `release_stick`.** Releasing lets the server pick the next fair waiter: a recent waiter that is new or has gone longest without holding the stick. If the best-known candidate is between wait polls, the room can briefly stay claimable instead of pinning a stale reservation. This keeps the room open instead of silently turning agent-to-agent handoffs into a duopoly.
137
+ ```sh
138
+ tt events --wait --after <last_event_seq> --json # all event types
139
+ tt msg recv --wait --after <last_event_seq> --json # messages only
140
+ ```
122
141
 
123
- Use `pass_stick` only when you have a concrete reason a specific named member must go next:
142
+ Restart with the returned cursor to resume. `tt msg recv --follow` still exists for harnesses that want a messages-only feed, but the event stream is preferred because turn handoffs use the same channel and a messages-only consumer silently misses them.
124
143
 
125
- - they have unique context the next step requires
126
- - they hold a credential or capability others lack
127
- - the operator explicitly addressed the work to them
144
+ For Codex-style harnesses that cannot consume a continuous stdout stream, the safe loop is: keep `tt wait --json` as the ownership wait, and separately run `tt events --wait --after <last_event_seq> --json` as a short-lived wake process. An event wake can tell you to read, reply, or retry `tt wait`; it is never permission to edit.
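
The short-lived wake process described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the package's own implementation: it assumes every event in the `--json` output carries an integer `event_seq` field, that a non-zero exit from `tt events --wait` means the pass should be abandoned, and that a local file (the hypothetical `.tt-cursor`) is an acceptable place to persist the cursor. The helper names `highest_event_seq` and `wake_once` are illustrative and are not part of `tt`.

```sh
# Hypothetical helper: print the highest event_seq seen on stdin,
# or the fallback cursor ($1) when no event carries one.
# Assumes each event is JSON containing "event_seq": <integer>.
highest_event_seq() {
  fallback="$1"
  max=$(grep -o '"event_seq"[[:space:]]*:[[:space:]]*[0-9][0-9]*' \
        | grep -o '[0-9][0-9]*$' \
        | sort -n | tail -n 1)
  printf '%s\n' "${max:-$fallback}"
}

# Hypothetical helper: run one wake pass, emit the batch, then persist the cursor.
# The cursor is written only after the batch has been emitted, so a crash in
# between replays events instead of dropping them (at-least-once delivery).
wake_once() {
  cursor_file="${1:-.tt-cursor}"   # hypothetical cursor location
  cursor=""
  [ -f "$cursor_file" ] && cursor=$(cat "$cursor_file")
  if [ -n "$cursor" ]; then
    batch=$(tt events --wait --after "$cursor" --json) || return 1
  else
    # No saved cursor yet: let the CLI choose its own starting point.
    batch=$(tt events --wait --json) || return 1
  fi
  printf '%s\n' "$batch"
  new_cursor=$(printf '%s\n' "$batch" | highest_event_seq "$cursor")
  [ -n "$new_cursor" ] && printf '%s\n' "$new_cursor" > "$cursor_file"
  return 0
}
```

A harness would call `wake_once` in the background, read whatever it prints, and then decide whether to reply, read notes, or retry `tt wait --json`. Nothing in this helper touches ownership, which matches the rule that an event wake is never permission to edit.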
128
145
 
129
- Otherwise release. Ping-ponging `pass_stick` between two agents is an antipattern because it can lock humans out of their own room.
146
+ Messages are public room events. Any room member can read them with `tt events --target any`. `to_agent_id` is routing, not an ACL.
130
147
 
131
- Always include a non-empty handoff.
148
+ Messages do not grant the stick. A non-holder paging the holder does not gain write authority. Keep waiting for your turn; messages are only a side channel.
132
149
 
133
- **Keep handoffs tight.** Handoffs are persisted in the event log and re-read on claims. Aim for roughly 150-300 words of `status`; reference commits by SHA instead of restating diffs, and use `artifacts[]` with path, line range, and role instead of pasting code. The handoff is the headline; long-form context belongs in `docs/` or a note.
150
+ ### 5. While Holding The Stick
134
151
 
135
- Minimum handoff quality:
152
+ Holding the stick is for active work. The moment you stop actively editing, reasoning through edits, or asking the operator a blocking question, release or assign the turn. Do not idle-hold the room while waiting on long verification, non-blocking operator input, CI, or any other pause where another harness could make progress.
136
153
 
137
- - `status`: what you finished, what changed, and what remains true
138
- - `next_action`: the concrete next step for the next owner
154
+ The `tt guard` process spawned by `tt wait` keeps the lease alive during active work. Later owner commands such as `tt release`, `tt assign`, and `tt take` must run under the same harness identity. If identity is ambiguous, use the exact active id with `TT_HARNESS_AGENT_ID=<agent_id>`.
139
155
 
140
- Add `artifacts`, `open_questions`, and `do_not` when they will save the next harness real time or prevent rework.
156
+ ### 6. Takeover Is Explicit
157
+
158
+ If `tt wait` reports `takeover_available`:
159
+
160
+ - explain why takeover is available (`owner_timeout`, `owner_gone`, `claim_timeout`, `recipient_gone`)
161
+ - do not silently take over just because it is possible
162
+ - if takeover is chosen, run `tt take --reason "..." --json`
163
+ - after takeover, run `tt events --target any --json` so you can reconstruct the last handoff before touching code
164
+
165
+ If the operator explicitly tells you to take over despite a reservation or live owner, use:
166
+
167
+ ```sh
168
+ tt take --operator-requested --reason "operator requested takeover" --json
169
+ ```
170
+
171
+ Do not invent this override yourself; it is for direct operator intervention.
172
+
173
+ ### 7. Finish With A Real Handoff
141
174
 
142
- Example:
175
+ When you are done with your turn, default to releasing:
143
176
 
144
- ```json
177
+ ```sh
178
+ tt release --stdin <<'JSON'
145
179
  {
146
- "status": "Added the MCP smoke test and verified it against two clients sharing one SQLite database.",
147
- "next_action": "Run the same handoff path through the human CLI and confirm pass/release behavior matches the MCP flow.",
180
+ "status": "Updated the CLI-only coordination plan and the bundled skill so harnesses use tt subprocesses for join, wait, OOB messaging, notes, and handoffs.",
181
+ "next_action": "Review the plan and then start the code-removal pass.",
148
182
  "artifacts": [
149
183
  {
150
- "path": "tests/mcp-smoke.test.ts",
184
+ "path": "docs/plans/2026-05-05-cli-only-coordination.md",
151
185
  "role": "review",
152
- "note": "End-to-end MCP adapter smoke coverage."
186
+ "note": "CLI-only migration plan."
153
187
  }
154
- ],
155
- "open_questions": [
156
- "Should tt install default to copy or link for local development?"
157
188
  ]
158
189
  }
190
+ JSON
159
191
  ```
160
192
 
161
- **`pass_stick` requires the target to be an active room member.** If the intended recipient's harness session has ended and they show as `inactive` in `get_room_state.members`, `pass_stick` can return `unknown_member`. Use `release_stick` instead; the next fair waiter can claim through the normal sequence path.
193
+ Use `tt assign <agent_id> . --stdin` only when a specific named member must go next:
162
194
 
163
- Remember that the operator can join their own room as `human:<user>`. Default behavior should leave room for them to claim turns naturally; releasing rather than passing keeps that door open.
195
+ - they have unique context the next step requires
196
+ - they hold a credential or capability others lack
197
+ - the operator explicitly addressed the work to them
164
198
 
165
- ### 8. After passing or releasing, stay in the loop
199
+ Otherwise release. Pinning turns between two agents is an antipattern because it can lock humans out of their own room.
200
+
201
+ Always include a non-empty handoff. Keep it tight: aim for roughly 150-300 words of `status`; reference commits by SHA instead of restating diffs, and use `artifacts[]` with path and role instead of pasting code.
202
+
203
+ Minimum handoff quality:
204
+
205
+ - `status`: what you finished, what changed, and what remains true
206
+ - `next_action`: the concrete next step for the next owner
207
+
208
+ Add `artifacts`, `open_questions`, and `do_not` when they will save the next harness real time or prevent rework.
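
One way to enforce the minimum handoff quality before releasing is to build the JSON through a small guard. The sketch below is illustrative only: `json_escape` and `make_handoff` are hypothetical helpers, not `tt` commands, and the escaping assumes single-line strings (it handles backslashes and double quotes but not newlines or control characters).

```sh
# Hypothetical helper: escape backslashes and double quotes for a JSON string.
# Assumes the input is a single line of text.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: print a minimal handoff object, refusing empty fields.
#   $1 = status, $2 = next_action
make_handoff() {
  handoff_status="$1"
  handoff_next="$2"
  if [ -z "$handoff_status" ] || [ -z "$handoff_next" ]; then
    echo "handoff needs a non-empty status and next_action" >&2
    return 1
  fi
  printf '{"status":"%s","next_action":"%s"}\n' \
    "$(json_escape "$handoff_status")" \
    "$(json_escape "$handoff_next")"
}

# Possible usage, piping the object to the documented stdin form:
#   make_handoff "Updated the plan." "Review the plan." | tt release --stdin
```

The guard only covers the two required fields; `artifacts`, `open_questions`, and `do_not` would still need to be added by hand when they are useful.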
166
209
 
167
- **The default after `release_stick` or `pass_stick` is to re-enter the wait loop and keep waiting until your next turn arrives.** Do not stop and ask the operator whether they want you back in the loop. Do not treat a handoff as end-of-session. In a multi-agent workspace, the expectation is: work on your turn, hand off, wait for your next turn, repeat.
210
+ ### 8. After Release, Stay In The Loop
168
211
 
169
- Stopping to ask questions after every handoff defeats the coordination protocol the operator wired you into a room so that you *would* keep showing up without being asked.
212
+ The default after `tt release` or `tt assign` is to re-enter the wait loop and keep waiting until your next turn arrives. Do not stop and ask the operator whether they want you back in the loop. Do not treat a handoff as end-of-session.
170
213
 
171
214
  Exit the wait loop only when one of these is true:
172
215
 
173
- - the shared task is explicitly finished (the operator said so, or the final handoff marks the work complete)
216
+ - the shared task is explicitly finished
174
217
  - you are the only active member and there is no one to hand off to
175
- - the operator gives a direct redirect or stop ("that's enough," "drop out of the room," a new unrelated task, etc.)
218
+ - the operator gives a direct redirect or stop
176
219
 
177
- In every other case: after `release_stick` or `pass_stick`, go straight back into the wait loop (ideally backgrounded — see §4).
220
+ In every other case, after `tt release` or `tt assign`, go straight back into `tt wait --json`.
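
The release-then-wait habit can be captured in a single wrapper so the wait is never forgotten. This is a sketch under assumptions: `release_and_wait` is a hypothetical helper name, the handoff is assumed to live in a JSON file passed as the first argument, and a non-zero exit from `tt release` is assumed to mean the release did not happen.

```sh
# Hypothetical helper: release with a handoff file, then re-enter the wait loop.
#   $1 = path to a handoff JSON file (illustrative argument)
release_and_wait() {
  handoff_file="$1"
  # Feed the handoff on stdin, as in the heredoc form of `tt release --stdin`.
  tt release --stdin < "$handoff_file" || return 1
  # Go straight back to waiting for the next turn.
  tt wait --json
}
```

If the release fails, the wrapper returns without waiting so the caller can inspect the error while still holding the stick.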
178
221
 
179
- If the operator tells you to drop out of coordination, call `leave_room` or `tt leave`. Rooms with no active members are deleted instead of kept as history, and long-idle rooms may be purged on later invocations.
222
+ If the operator tells you to drop out of coordination, run `tt leave --json`. Rooms with no active members are deleted instead of kept as history, and long-idle rooms may be purged on later invocations.
180
223
 
181
- If the room state shows ghost members from past sessions whose processes are gone (visible as `inactive last seen ...` in `tt state`), call `kick_member` / `tt kick <agent_id>` to evict them. This is the right tool when liveness has already decided the target is dead — pass `force: true` only when the operator explicitly tells you to remove a still-active member.
224
+ If the room state shows ghost members from past sessions whose processes are gone, run `tt kick <agent_id> --json` to evict them. Use `--force` only when the operator explicitly tells you to remove a still-active member.
182
225
 
183
- ## Recovery and Inspection
226
+ ## Recovery And Inspection
184
227
 
185
228
  Use these reads when you need context:
186
229
 
187
- - `list_rooms`: discover active rooms under a path
188
- - `leave_room`: explicitly remove your membership from a room
189
- - `kick_member`: evict an idle member whose process is gone (use `force: true` only on operator instruction)
190
- - `get_room_state`: authoritative current room projection
191
- - `get_room_events`: replay recent claims, releases, passes, and takeovers
230
+ - `tt list --json`: discover active rooms under the current path
231
+ - `tt state --json`: authoritative current room projection
232
+ - `tt events --target any --json`: replay recent claims, releases, assignments, messages, and takeovers
233
+ - `tt notes list --json`: list durable notes
234
+ - `tt whoami --explain`: inspect identity resolution
192
235
 
193
- Prefer `get_room_state` over guessing from local memory when ownership may have changed.
236
+ Prefer `tt state` over guessing from local memory when ownership may have changed and you are not already looking at a fresh `tt wait` result.
194
237
 
195
238
  ## Behavior Priorities
196
239