oxtail 0.10.3 → 0.12.0

package/AGENTS.md CHANGED
@@ -58,6 +58,8 @@ The v0.9/v0.10.1 changes close the public dogfooding gaps found by real peer tra
58
58
 
59
59
  ## Recently shipped
60
60
 
61
+ - **Wake hardening (v0.12.0 — issues #5/#6/#7, the v0.7-review backlog).** Three deferred wake items, landed together. **#6 (security):** wake send-keys now only ever target the pane the live process tree says hosts the peer's `server_pid` (`chooseVerifiedWakePane` → `currentPaneForServerPid`), never the peer's self-written `tmux_pane`/`tmux_session`; unverifiable ⇒ refuse (`skipped_no_target`). Registry-sourced tmux ids are shape-validated (`isValidTmuxPane`/`isValidTmuxSession`) and a spoofed `TMUX_PANE` env is ignored. This removed the cached-pane and session-name send-keys fallbacks (legit peers always register a real pane; churn is handled by re-resolution). **#5 (debounce):** all wake paths funnel through `wakePeer`, which coalesces repeat wakes to the same peer within `OXTAIL_WAKE_DEBOUNCE_MS` (default 1s, in-memory per process) ⇒ `skipped_debounced`. **#7 (observability):** a `wake_outcome` trace event per wake; `oxtail diagnose` summarizes wake_status counts by tool from `MCP_TRACE_FILE`; a scheduled `codex-drift.yml` fails if Codex's `PASTE_ENTER_SUPPRESS_WINDOW` drifts past our 500ms gap. New modules: `src/wake-debounce.ts`, `src/diagnose.ts`; `chooseVerifiedWakePane` in `src/registry.ts`.
62
+ - **Wake-on-reply (Slice 1, peer-messaging refinement push).** A `send_message` that carries `reply_to` now auto-wakes the original requester **by default** (explicit `wake:"off"` opts out), closing the observed stranding where a peer's async reply to an idle requester forced a human to relay it. The reply path is a separate, stricter gate than the lenient `wake:"auto"` path (`src/autowake.ts`): it fires only for a **fresh-idle** target (idle marker newer than `OXTAIL_AUTOWAKE_FRESH_IDLE_MS`, default 5m) — stale/unknown/missing/busy ⇒ `skipped_no_fresh_idle`, never a best-effort wake — and adds a **per-target rate limit** (`skipped_rate_limited`), a persistent **one-wake dedupe** keyed on `(session_id, reply_to)` (`skipped_deduped`, GC'd by age) to survive duplicate/late hook drains, an `OXTAIL_AUTOWAKE=off` kill-switch, and a best-effort `skipped_store_error` degrade so a broken dedupe store can never turn an already-enqueued reply into a tool error. Target is resolved by `client.session_id` with the pane re-resolved immediately before send-keys (no `server_pid`/stale-pane reuse). Response surfaces `wake_status` + `wake_reason:"reply_to_default"`. **Coverage caveat:** the fresh-idle gate keys on the busy/idle marker that only the Claude Code hooks maintain, so this slice reaches a **hooked Claude Code requester** (the observed case). A Codex / hookless-Claude requester has no idle marker ⇒ `skipped_no_fresh_idle` (reach it with explicit `wake:"auto"`); closing that direction is **Slice 2** (`expects_reply:true` — a requester-side waiter signal), deliberately not faked here with a blind `unknown ⇒ wake` that would reintroduce the active-waiter double-wake.
61
63
  - **Protocol hardening (v0.10.1).** `ask_peer` now stamps outbound messages with `request_id`; reply-to-capable peers answer with `send_message({ reply_to: request_id })`, and the waiter ignores stale same-peer messages. Explicit identity claims are monotonic, so stale automatic detection cannot clobber a real client session id. PreToolUse/Stop hook pushes are body-budgeted and labeled as peer context, not user authority.
62
64
  - **Deliver-on-complete and state-gated wake (v0.9).** The Stop hook delivers waiting messages at turn end, closing the text-only-turn gap left by PreToolUse. `UserPromptSubmit`/`Stop` maintain a busy/idle flag so `send_message({ wake: "auto" })` nudges idle peers without typing into a busy composer. Sticky Codex claim recovery keeps identity across MCP child restarts.
63
65
  - **Per-client wake routing (v0.7, refined).** `ask_peer` routes its wake mechanism per `client_type`. **Codex**: paste-burst-aware send-keys (500ms gap between text and Enter) — verified to submit. **Claude Code**: same send-keys mechanism without the gap (no paste-burst in its TUI) — verified end-to-end 2026-05-13 against `oxtail-claudejr`. v0.7 originally fail-fasted Claude Code targets under a hook-catalog argument; the follow-up restored symmetric wake after falsifying that conclusion empirically. Response includes a `wake_status` field for caller diagnostics. Pre-wake pane re-resolution closes the stale-pane-ID race from v0.6. `OXTAIL_ASK_PEER_WAKE_STRATEGY=auto|legacy|off` env override for rollback. Issue #3 has the spike findings.
package/README.md CHANGED
@@ -36,7 +36,7 @@ args = ["-y", "oxtail@0.10.1"]
36
36
 
37
37
  ```sh
38
38
  mkdir -p ~/.claude/commands
39
- curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.10.1/.claude/commands/oxtail-join.md \
39
+ curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.12.0/.claude/commands/oxtail-join.md \
40
40
  -o ~/.claude/commands/oxtail-join.md
41
41
  ```
42
42
 
@@ -44,9 +44,9 @@ curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.10.1/.claude/command
44
44
 
45
45
  ```sh
46
46
  mkdir -p ~/.codex/skills/oxtail-join/agents
47
- curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.10.1/integrations/codex/oxtail-join/SKILL.md \
47
+ curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.12.0/integrations/codex/oxtail-join/SKILL.md \
48
48
  -o ~/.codex/skills/oxtail-join/SKILL.md
49
- curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.10.1/integrations/codex/oxtail-join/agents/openai.yaml \
49
+ curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.12.0/integrations/codex/oxtail-join/agents/openai.yaml \
50
50
  -o ~/.codex/skills/oxtail-join/agents/openai.yaml
51
51
  ```
52
52
 
@@ -65,13 +65,13 @@ Contributing? `git clone https://github.com/d4j3y2k/oxtail && cd oxtail && npm i
65
65
  - `read_session` — the recent transcript of a peer session, as clean per-turn messages when the peer is oxtail-aware (Claude Code and Codex CLI), or as raw tmux pane text otherwise. Accepts a tmux session name OR a `client_session_id` UUID; an ambiguous tmux name returns `ambiguous-target` with the candidate UUIDs. Transcript reads are **budgeted** so a casual read can't blow your context window: by default the last 20 messages and ~24KB of text (newest-first), per-message ISO timestamps omitted. `count_truncated` / `bytes_truncated` say which budget bit; raise `limit` + `max_bytes` to pull more, set `include_timestamps: true` to keep timestamps, and pass `tail_scan: true` to read the file tail without parsing the whole transcript (qualifies `total_messages` via `total_messages_exact`).
66
66
  - `claim_session` — single-shot session registration. The routine path: `Bash echo $CLAUDE_CODE_SESSION_ID` (or `$CODEX_THREAD_ID` for Codex) → `claim_session({ session_id })`. Returns `{ ok, session_id, transcript_path }`.
67
67
  - `set_my_state` — write a small "state card" onto this session's registry entry so peers can see what we're doing without reading our transcript. v1 surfaces a single field, `purpose` (≤200 chars).
68
- - `send_message` — **fire-and-forget** message to a peer. Target is a tmux session name or a raw `client_session_id` UUID. Body ≤ 8KB. Delivery is async via the peer's mailbox file. By default does **not** wake an idle peer; pass `wake: "auto"` to nudge one (state-gated — see [Waking an idle peer](#waking-an-idle-peer)). Replies to `ask_peer` should pass `reply_to: "<request_id>"` when the inbound message carries a `request_id`. (v0.5+)
68
+ - `send_message` — **fire-and-forget** message to a peer. Target is a tmux session name or a raw `client_session_id` UUID. Body ≤ 8KB. Delivery is async via the peer's mailbox file. A plain message does **not** wake an idle peer; pass `wake: "auto"` to nudge one (state-gated — see [Waking an idle peer](#waking-an-idle-peer)). Replies to `ask_peer` should pass `reply_to: "<request_id>"` when the inbound message carries a `request_id` — and a reply **auto-wakes the requester by default** (strictly gated; `wake: "off"` opts out). (v0.5+)
69
69
  - `read_my_messages` — drain this session's mailbox and return any queued messages. Messages include `from_session_id`, server-stamped `origin: "peer"`, and optional `request_id` / `reply_to`. Codex peers (and unhooked Claude Code) poll this; Claude Code peers with the hooks installed see messages mid-turn (PreToolUse) or at turn end (Stop) instead. (v0.5+)
70
70
  - `ask_peer` — **delegate-and-wait**. Enqueues a message with a `request_id` and blocks server-side until the peer replies with `send_message({ reply_to: request_id })` or the timeout elapses. Default timeout is 45s (`OXTAIL_ASK_PEER_TIMEOUT_MS`), and each call may pass `timeout_ms`. New peers use strict `reply_to` correlation; legacy/no-capability peers fall back to best-effort first-message matching and the response reports `correlation: "uncorrelated"`. That legacy path may stale-match old same-peer chatter, so callers should treat `uncorrelated` as compatibility-only. Use `send_message` for fire-and-forget. (v0.7+)
71
71
  - `register_my_session` — pin this MCP server's `session_id` directly. Kept for debugging; prefer `claim_session`.
72
72
  - `get_my_session` — return this MCP server's own registry entry plus a per-strategy detection diagnosis. Useful for debugging.
73
73
 
74
- See [design principles](https://github.com/d4j3y2k/oxtail/blob/v0.10.1/AGENTS.md) for scope and architecture.
74
+ See [design principles](https://github.com/d4j3y2k/oxtail/blob/v0.12.0/AGENTS.md) for scope and architecture.
75
75
 
76
76
  ## Usage from an agent
77
77
 
@@ -147,7 +147,18 @@ send_message({ target: "<peer>", body: "...", wake: "auto" })
147
147
  // → { ok: true, message_id, ..., wake_status: "fired" | "skipped_busy" | "skipped_no_target" | "disabled" }
148
148
  ```
149
149
 
150
- It is **state-gated** off the activity flag above: if the peer is mid-turn (`busy`), the wake is skipped (`skipped_busy`) because its PreToolUse/Stop hooks will deliver during the turn — no point typing into a busy composer. Idle, unknown (hooks not installed), or stale-busy peers get a per-client `tmux send-keys` wake (Codex gets the paste-burst-aware gap; Claude Code does not). `wake: "off"` (the default) preserves the pure fire-and-forget contract.
150
+ It is **state-gated** off the activity flag above: if the peer is mid-turn (`busy`), the wake is skipped (`skipped_busy`) because its PreToolUse/Stop hooks will deliver during the turn — no point typing into a busy composer. Idle, unknown (hooks not installed), or stale-busy peers get a per-client `tmux send-keys` wake (Codex gets the paste-burst-aware gap; Claude Code does not). `wake: "off"` preserves the pure fire-and-forget contract.
151
+
152
+ **Wake-on-reply (the default for replies).** A reply — a `send_message` that carries `reply_to` — auto-wakes the requester **by default**, so an awaited answer doesn't strand an idle peer and force a human to relay it. You don't have to remember `wake: "auto"`; pass `wake: "off"` to opt out.
153
+
154
+ ```js
155
+ send_message({ target: "<requester>", body: "...", reply_to: "<request_id>" })
156
+ // → { ok: true, ..., wake_status: "...", wake_reason: "reply_to_default" }
157
+ ```
158
+
159
+ The reply path is deliberately **stricter** than explicit `wake: "auto"`. It fires only when the target is **freshly idle** — an `idle` activity marker newer than `OXTAIL_AUTOWAKE_FRESH_IDLE_MS` (default 5 min). Stale, unknown, missing, or busy state yields `skipped_no_fresh_idle` (no best-effort wake — typing unprompted into a terminal that may be unattended is the risk we refuse to take). Two more guards bound it: a **per-target rate limit** (`OXTAIL_AUTOWAKE_MIN_INTERVAL_MS`, default 4s → `skipped_rate_limited`) since one wake already drains the whole mailbox, and a **one-wake dedupe** keyed on `(session_id, reply_to)` (`skipped_deduped`) so a duplicate or late hook drain of the same reply can't re-fire. If the dedupe/rate store is somehow unwritable the wake degrades to `skipped_store_error` rather than failing the (already-delivered) message. The env kill-switch `OXTAIL_AUTOWAKE=off` disables reply auto-wake entirely (`wake_status: "disabled"`). Every outcome that reaches the gate surfaces a `wake_status`; the reply path also stamps `wake_reason: "reply_to_default"` (present even on a resolve error like `ambiguous-target`, where there's no single target to wake).
160
+
161
+ **Coverage (which requesters this reaches).** The fresh-idle gate keys on the requester's busy/idle activity marker, which only the Claude Code hooks maintain. So wake-on-reply currently closes the stranding for a **hooked Claude Code requester** (the originally-observed case: a peer's async reply to an idle Claude session). A **Codex** requester — or a Claude requester without the hooks installed — has no idle marker, so a reply with `wake` unset returns `skipped_no_fresh_idle` and is **not** auto-woken; reach it with an explicit `wake: "auto"`, which always takes the lenient wake path (idle/unknown/stale all wake; only a fresh-`busy` peer is skipped) and bypasses the strict fresh-idle gate even for a reply. Closing the Codex/unhooked-requester direction *by default* needs a requester-side waiter signal (`expects_reply`), which is the next slice — a blind `unknown ⇒ wake` default is deliberately avoided because it reintroduces the double-wake-an-active-waiter risk this gate exists to prevent.
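The coverage rule above suggests that a replying agent can pick its `wake` option up front rather than resending after a `skipped_no_fresh_idle`. The sketch below is illustrative only: `buildReplyArgs` and the `requesterIsHooked` flag are hypothetical names, not part of oxtail, and how a caller learns whether the requester has the Claude Code hooks installed is left as an assumption.

```javascript
// Hypothetical helper (not an oxtail API): build send_message arguments for a reply.
// Assumption: the caller already knows whether the requester maintains an idle marker.
function buildReplyArgs({ target, body, requestId, requesterIsHooked }) {
  // A reply always carries reply_to, which selects the strict reply-default wake path.
  const args = { target, body, reply_to: requestId };
  if (!requesterIsHooked) {
    // Codex or hookless Claude: no idle marker, so the strict gate would return
    // skipped_no_fresh_idle. Explicit wake:"auto" takes the lenient path instead.
    args.wake = "auto";
  }
  return args;
}
```

For a hooked Claude Code requester the `wake` field is simply left unset, so the fresh-idle gate, rate limit, and dedupe described above still apply.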
151
162
 
152
163
  **Codex and the wake matrix.** The send-keys wake needs a tmux pane. A Codex peer running **outside tmux** has none, so it returns `wake_status: "skipped_no_target"` — its idle delivery stays poll-based (`read_my_messages`). Run Codex **inside a tmux pane** to get symmetric idle-wake; the routing already handles the Codex paste-burst case.
153
164
 
@@ -161,6 +172,8 @@ If you have a hook installed on a managed event that isn't from Terminator and i
161
172
 
162
173
  oxtail trusts any process running as the **same local user** to enqueue messages. The mailbox directory is mode `0o700` (private), so other users on the host cannot read or write. **On a shared-tenancy box (containers, multi-user dev hosts, etc.), do not run oxtail-aware agents:** any local process under your user can inject `<system-reminder>` content directly into a Claude session. The threat boundary is the same as `~/.ssh/` — what your user processes do, you trust.
163
174
 
175
+ Within that boundary oxtail still *narrows* redirectable side effects, as defense-in-depth rather than a hard boundary: wake keystrokes only go to the pane the process tree confirms hosts the target's `server_pid`, never a self-written `tmux_pane`/`tmux_session` (see [Pane targeting](#pane-targeting-verified)), and an accepted registry entry can't borrow another pid — its `server_pid` must match its own `<pid>.json` filename. So one peer's entry can't masquerade as hosting another agent to redirect that agent's wake. A same-user process can still overwrite any registry file outright (that's the trust boundary above); what it can't do is smuggle a pid mismatch past a reader.
176
+
164
177
  ## Delegate-and-wait (v0.10.1)
165
178
 
166
179
  `ask_peer` extends v0.5's mailbox transport into a blocking primitive:
@@ -171,7 +184,7 @@ ask_peer({ target, body })
171
184
  ok: true,
172
185
  message_id,
173
186
  request_id,
174
- wake_status: "fired" | "skipped_unsupported" | "skipped_no_target" | "disabled",
187
+ wake_status: "fired" | "skipped_busy" | "skipped_debounced" | "skipped_no_target" | "disabled",
175
188
  reply: { id, body, enqueued_at, from_session_id, reply_to, correlation } | null,
176
189
  correlation: "correlated" | "uncorrelated" | "none",
177
190
  timeout_ms,
@@ -179,7 +192,7 @@ ask_peer({ target, body })
179
192
  }
180
193
  ```
181
194
 
182
- `wake_status` distinguishes the four outcomes a caller may need to handle differently. `fired` means the wake was attempted (or the reply arrived during the grace window, so no wake was needed). `skipped_unsupported` is reserved; no client currently returns this in auto mode (both Codex and Claude Code wake via send-keys). `skipped_no_target` means no tmux pane/session resolved for the target. `disabled` means `OXTAIL_ASK_PEER_WAKE_STRATEGY=off` is in effect.
195
+ `wake_status` distinguishes the outcomes a caller may need to handle differently. `fired` means the wake was attempted (or the reply arrived during the grace window, so no wake was needed). `skipped_busy` means the peer is mid-turn (its hooks/poll will deliver; we still poll for the reply). `skipped_debounced` means a wake fired for this peer moments ago and this one was coalesced. `skipped_no_target` means no process-tree-verified pane resolved for the target. `disabled` means `OXTAIL_ASK_PEER_WAKE_STRATEGY=off` is in effect. (`skipped_unsupported` is reserved; no client currently returns it.)
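As a rough illustration of how a caller might branch on these outcomes, the sketch below turns an `ask_peer` result into a one-line summary. `describeAskPeerResult` and the note strings are hypothetical; only the field names (`wake_status`, `reply`, `correlation`, `timed_out`) and status values come from the response shape shown above.

```javascript
// Hypothetical caller-side helper (not part of oxtail).
// Notes paraphrase the wake_status meanings documented above.
const WAKE_STATUS_NOTES = new Map([
  ["fired", "wake attempted (or the reply arrived in the grace window)"],
  ["skipped_busy", "peer is mid-turn; its hooks or polling will deliver"],
  ["skipped_debounced", "a wake fired moments ago and this one was coalesced"],
  ["skipped_no_target", "no process-tree-verified pane resolved"],
  ["disabled", "OXTAIL_ASK_PEER_WAKE_STRATEGY=off is in effect"],
]);

function describeAskPeerResult(result) {
  // Unknown values (including the reserved skipped_unsupported) get a generic note.
  const note = WAKE_STATUS_NOTES.get(result.wake_status) ?? "unrecognized wake_status";
  if (result.reply) return `reply received (${result.correlation}); ${note}`;
  if (result.timed_out) return `timed out; ${note}`;
  return `no reply; ${note}`;
}
```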
183
196
 
184
197
  `timed_out` is `true` only when the poll loop ran to its deadline without a reply.
185
198
 
@@ -209,9 +222,13 @@ ask_peer({ target, body })
209
222
  4. Poll the caller's mailbox at 200ms. For reply-to-capable peers, only a message with both `from_session_id == target.session_id` and `reply_to == request_id` satisfies the wait; non-matching messages stay in the mailbox untouched. Legacy/no-capability peers are best-effort and are marked `correlation: "uncorrelated"`; this preserves old peers but can stale-match old same-peer chatter.
210
223
  5. Return the reply on match, or `{ reply: null, timed_out: true, wake_status, correlation: "none" }` after the timeout. Late replies fall back to the normal v0.5 hook / `read_my_messages` path — never lost, just delivered out of band.
211
224
 
212
- ### Pane staleness
225
+ ### Pane targeting (verified)
226
+
227
+ A peer's cached `tmux_pane` / `tmux_session` are written by the peer into its **own** registry file, so they aren't trustworthy targets for keystrokes — a malicious local peer could point them at someone else's pane. The **only** send-keys target oxtail uses is the pane the live process tree says currently hosts the peer's `server_pid` (resolved at wake-time via `ps`/`tmux` ancestry — unforgeable by editing a JSON file). This also handles pane-id churn for free: the pane is always re-resolved fresh. If `server_pid` can't be bound to any live pane, oxtail **refuses** to wake (`wake_status: "skipped_no_target"`) rather than fall back to a self-written value. `server_pid` itself is self-written too, so registry entries whose `server_pid` doesn't match their own `<pid>.json` filename are rejected — a forged entry can't borrow another process's pane. The pane id that does reach `tmux` is shape-validated (`%\d+`); session names are no longer used as a send-keys target at all. (Hardening from issue #6.)
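Two of the checks described above are simple enough to sketch: the `%\d+` shape check on a pane id, and the rule that an entry's `server_pid` must match its own `<pid>.json` filename. This is a minimal sketch under assumptions, not oxtail's implementation; `looksLikeTmuxPaneId` and `serverPidMatchesFilename` are hypothetical names, and the entry is assumed to be an already-parsed object with a numeric `server_pid`.

```javascript
const path = require("node:path");

// Shape check: a tmux pane id is a percent sign followed by digits, nothing else.
// Anchoring both ends rejects strings that merely contain a valid-looking id.
function looksLikeTmuxPaneId(value) {
  return typeof value === "string" && /^%\d+$/.test(value);
}

// Reject a registry entry whose self-written server_pid disagrees with the
// pid encoded in its own filename (assumed layout: "<pid>.json").
function serverPidMatchesFilename(filePath, entry) {
  const match = /^(\d+)\.json$/.exec(path.basename(filePath));
  if (!match) return false;
  return Number.isInteger(entry.server_pid) && String(entry.server_pid) === match[1];
}
```

Neither check makes a same-user attacker harmless (the trust boundary above still holds); they only narrow where keystrokes can be redirected.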
213
228
 
214
- Pane targeting can go stale: `tmux_pane` is cached at server startup, but tmux can reuse pane ids after a pane is killed. v0.7 re-resolves the pane from the peer's `server_pid` at wake-time (via process-tree ancestry), preferring the live pane id over the cached one. If the peer is no longer in any tmux pane (orphaned), oxtail falls back to the registered tmux session name. If both targeting attempts fail, `wake_status` returns `skipped_no_target`.
229
+ ### Wake debouncing
230
+
231
+ All wake paths funnel through one place, which **coalesces** rapid repeat wakes to the same peer: if a wake fired for a peer within `OXTAIL_WAKE_DEBOUNCE_MS` (default 1s), a follow-up wake is skipped (`wake_status: "skipped_debounced"`) and relies on the still-pending response. This keeps a retried `ask_peer`, two callers racing the same peer, or a polling loop from stacking notification lines into the peer's composer. In-memory and per-process by design. (Issue #5.)
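The coalescing behaviour can be pictured as a small in-memory map from peer to last-fired time. The sketch below assumes that a skipped wake does not extend the window and that the caller supplies the clock; `createWakeDebouncer` is a hypothetical name, and the real logic lives in `src/wake-debounce.ts`, which is not shown here.

```javascript
// Hypothetical in-memory debouncer (per process, lost on restart by design).
function createWakeDebouncer(windowMs = 1000) {
  const lastFiredAt = new Map(); // peer id -> timestamp (ms) of the last fired wake

  return function tryWake(peerId, nowMs) {
    const last = lastFiredAt.get(peerId);
    if (last !== undefined && nowMs - last < windowMs) {
      // A wake already fired inside the window: coalesce this one.
      return "skipped_debounced";
    }
    lastFiredAt.set(peerId, nowMs);
    return "fired";
  };
}
```

Passing `nowMs` in rather than calling `Date.now()` inside keeps the sketch deterministic under test.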
215
232
 
216
233
  ### Constraints
217
234
 
@@ -270,10 +287,24 @@ When a strategy doesn't fire, it returns an abstention with a `reason` (e.g. `"2
270
287
 
271
288
  If `MCP_TRACE_FILE` is set in the environment, every detection run appends an NDJSON record with trigger, winning strategy, per-strategy outcomes, and `next_step`. Useful for diagnosing unresolved `client_session_id`s in the wild.
272
289
 
290
+ ### Diagnosing wakes (`oxtail diagnose`)
291
+
292
+ The same `MCP_TRACE_FILE` also captures a `wake_outcome` record for every wake (which tool drove it and the resulting `wake_status`). Run:
293
+
294
+ ```sh
295
+ oxtail diagnose
296
+ ```
297
+
298
+ to get a summary — counts by `wake_status`, broken down by tool — so "is the wake mechanism working in my environment?" is one command instead of grepping JSONL. With `MCP_TRACE_FILE` unset it just prints how to enable tracing. (Issue #7.)
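For a sense of what such a summary involves, the sketch below counts `wake_status` values per tool from NDJSON text. The record shape is an assumption: the field names `event`, `tool`, and `wake_status` are guesses for illustration, and `summarizeWakeOutcomes` is a hypothetical function, not the code behind `oxtail diagnose`.

```javascript
// Hypothetical summarizer over NDJSON trace text.
// Assumed record shape: { event: "wake_outcome", tool: "...", wake_status: "..." }.
function summarizeWakeOutcomes(ndjsonText) {
  const counts = {};
  for (const line of ndjsonText.split("\n")) {
    if (!line.trim()) continue; // skip blank lines
    let record;
    try {
      record = JSON.parse(line);
    } catch {
      continue; // tolerate a truncated or corrupt line
    }
    if (record === null || typeof record !== "object") continue;
    if (record.event !== "wake_outcome") continue; // ignore detection records
    const tool = String(record.tool);
    const status = String(record.wake_status);
    counts[tool] ??= {};
    counts[tool][status] = (counts[tool][status] ?? 0) + 1;
  }
  return counts;
}
```

In practice the text would come from reading the file named by `MCP_TRACE_FILE`.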
299
+
300
+ A scheduled CI job (`.github/workflows/codex-drift.yml`, also runnable on demand) fetches Codex's upstream `PASTE_ENTER_SUPPRESS_WINDOW` and fails if it drifts past oxtail's 500ms Codex wake gap — so a future Codex release that would break the wake surfaces as a red job rather than a silent field regression.
301
+
273
302
  ## Status
274
303
 
275
- v0.10.1. Completes the autonomous peer-messaging matrix and hardens the protocol: a message reaches a Claude Code peer whether it's mid-turn, finishing, or fully idle, and delegate-and-wait replies are correlated by `request_id` / `reply_to` for upgraded peers.
304
+ v0.12.0. Pushes the autonomous peer-messaging matrix toward zero human relay, then hardens the wake path.
276
305
 
306
+ - **Wake-on-reply (v0.11.0).** A reply — `send_message` with `reply_to` — auto-wakes a freshly-idle requester by default, so an awaited answer doesn't strand an idle peer. Strictly gated (fresh-idle only, per-target rate limit, one-wake dedupe, `OXTAIL_AUTOWAKE=off` kill-switch). `wake:"off"` opts out; explicit `wake:"auto"` is the escape hatch for a requester without an idle marker (Codex / hookless Claude).
307
+ - **Wake hardening (v0.12.0).** Wake keystrokes only ever target the pane the process tree confirms hosts the peer's `server_pid` — never a self-written `tmux_pane`/`tmux_session`, and registry entries whose `server_pid` doesn't match their filename are rejected. Rapid repeat wakes to one peer are coalesced (`skipped_debounced`). `oxtail diagnose` summarizes wake outcomes from `MCP_TRACE_FILE`, and a scheduled CI job flags drift in Codex's paste-burst window before it can break the wake.
277
308
  - **Correlated delegate-and-wait.** `ask_peer` now sends a `request_id`; upgraded peers reply with `send_message({ reply_to })`, and the waiter ignores same-peer chatter that does not match. Legacy peers are still supported, but their replies are marked `correlation: "uncorrelated"`.
278
309
  - **Identity monotonicity.** `claim_session` / `register_my_session` and sticky-claim recovery are authoritative after they set a session id; later automatic detection cannot clobber a claimed id with stale env data.
279
310
  - **Hook push budgeting and provenance.** PreToolUse/Stop delivery stamps `origin: "peer"`, reminds receivers that peer messages are not user authority, and caps hook-injected body text via `OXTAIL_HOOK_MAX_BODY_CHARS`.
@@ -0,0 +1,238 @@
1
+ // Slice 1 — wake-on-reply (interim liveness patch).
2
+ //
3
+ // When a `send_message` carries a `reply_to` (i.e. it is answering an earlier
4
+ // ask) and the caller did NOT explicitly pass `wake:"off"`, oxtail auto-wakes
5
+ // the original requester so an awaited answer doesn't strand an idle peer and
6
+ // force a human relay. This module is the GATE that decides whether that
7
+ // reply-default wake is allowed to fire. The actual send-keys is left to the
8
+ // caller (server.ts `wakePeer`) so this module stays free of tmux/process
9
+ // concerns and is unit-testable against a temp directory.
10
+ //
11
+ // The guards are deliberately conservative. A reply auto-wake types into the
12
+ // peer's terminal WITHOUT the human at that terminal having asked for anything
13
+ // this turn, so we only do it when ALL of these hold:
14
+ // 1. kill-switch `OXTAIL_AUTOWAKE` is not "off"
15
+ // 2. the target is FRESH-IDLE — its activity marker says "idle" AND is newer
16
+ // than a max-age threshold. Stale / unknown / missing ⇒ no wake (we do NOT
17
+ // fall back to a best-effort wake the way the lenient wake:auto path does).
18
+ // 3. we have not woken this target too recently (per-target rate limit)
19
+ // 4. we have not already woken for THIS exact (session_id, reply_to) — a
20
+ // one-wake dedupe that survives duplicate / late hook drains.
21
+ //
22
+ // Everything is keyed on the target's `client.session_id` (the agent identity,
23
+ // per AGENTS.md), never server_pid / tmux name.
24
+ import { createHash } from "node:crypto";
25
+ import { closeSync, mkdirSync, openSync, readdirSync, statSync, unlinkSync, utimesSync, writeFileSync, } from "node:fs";
26
+ import { homedir } from "node:os";
27
+ import { join } from "node:path";
28
+ function envPosInt(name, def, env = process.env) {
29
+ const v = env[name];
30
+ if (!v)
31
+ return def;
32
+ const n = Number(v);
33
+ return Number.isFinite(n) && n > 0 ? n : def;
34
+ }
35
+ // Fresh-idle window: how recently the peer must have gone idle for a reply
36
+ // auto-wake to fire. This is the MAX-AGE threshold the spec calls for, and it
37
+ // is intentionally a SEPARATE, stricter gate than the 10-minute busy-TTL used
38
+ // by the lenient `wake:"auto"` path: that path wakes on idle/unknown/stale, but
39
+ // a reply auto-wake fires unprompted, so we cap how old "idle" may be before we
40
+ // stop trusting that the peer is still sitting at its prompt. The 5-minute
41
+ // default leans conservative (an unprompted wake into a possibly-unattended
42
+ // terminal is the risk) while still covering a normal minute-scale
43
+ // ask→work→reply round-trip; raise it via OXTAIL_AUTOWAKE_FRESH_IDLE_MS if
44
+ // dogfooding shows replies regularly land later.
45
+ export const FRESH_IDLE_MAX_AGE_MS = envPosInt("OXTAIL_AUTOWAKE_FRESH_IDLE_MS", 5 * 60 * 1000);
46
+ // Per-target rate limit: the minimum gap between two reply auto-wakes to the
47
+ // same session_id. A single recent wake already pulls an idle peer into a turn
48
+ // that drains its whole mailbox, so additional keystroke wakes inside this
49
+ // window are redundant noise into a terminal. Conservative by design.
50
+ export const MIN_INTERVAL_MS = envPosInt("OXTAIL_AUTOWAKE_MIN_INTERVAL_MS", 4000);
51
+ // One-wake dedupe lifetime: how long a (session_id, reply_to) wake record is
52
+ // honored before it is GC'd. Comfortably longer than any single ask/reply
53
+ // round-trip so a late/duplicate hook drain of the same reply can't re-wake.
54
+ export const DEDUPE_TTL_MS = envPosInt("OXTAIL_AUTOWAKE_DEDUPE_TTL_MS", 60 * 60 * 1000);
55
+ // The kill-switch. Any casing of "off" disables reply auto-wake entirely.
56
+ export function autowakeKillSwitchOff(env = process.env) {
57
+ return String(env.OXTAIL_AUTOWAKE ?? "").trim().toLowerCase() === "off";
58
+ }
59
+ // FRESH-IDLE gate. Only a recent "idle" marker qualifies. A negative age means
60
+ // the activity file's mtime is in the future (clock skew) — untrusted, treated
61
+ // as not-fresh.
62
+ export function isFreshIdle(act, maxAgeMs = FRESH_IDLE_MAX_AGE_MS) {
63
+ if (!act || act.status !== "idle")
64
+ return false;
65
+ return act.ageMs >= 0 && act.ageMs < maxAgeMs;
66
+ }
67
+ // --- persistent dedupe / rate-limit store ------------------------------------
68
+ // One small file per record under ~/.oxtail/autowake/. mtime is the source of
69
+ // truth (driven by the injected nowMs so the store is deterministic in tests);
70
+ // the body is a debug breadcrumb. GC'd by age.
71
+ export function defaultAutowakeDir() {
72
+ return join(homedir(), ".oxtail", "autowake");
73
+ }
74
+ function hash(s) {
75
+ // reply_to is caller-controlled, so never build a filename from it directly.
76
+ return createHash("sha256").update(s).digest("hex").slice(0, 32);
77
+ }
78
+ function dedupePath(dir, sessionId, replyTo) {
79
+ // JSON-encode the pair so the boundary is unambiguous: reply_to is
80
+ // caller-controlled and could otherwise be crafted to collide with a
81
+ // different (sessionId, replyTo) split under a plain separator.
82
+ return join(dir, `d-${hash(JSON.stringify([sessionId, replyTo]))}`);
83
+ }
84
+ function ratePath(dir, sessionId) {
85
+ return join(dir, `r-${hash(sessionId)}`);
86
+ }
87
+ function setMtime(path, nowMs) {
88
+ const t = nowMs / 1000;
89
+ try {
90
+ utimesSync(path, t, t);
91
+ }
92
+ catch {
93
+ // best effort — mtime drives TTL math, but a failure here only makes the
94
+ // record look fresher/staler by the small real-vs-injected clock delta.
95
+ }
96
+ }
97
+ // Read-only: has a wake for this (session_id, reply_to) happened within the TTL?
98
+ export function isDuplicateWake(dir, sessionId, replyTo, nowMs, ttlMs = DEDUPE_TTL_MS) {
99
+ try {
100
+ const st = statSync(dedupePath(dir, sessionId, replyTo));
101
+ return nowMs - st.mtimeMs < ttlMs;
102
+ }
103
+ catch {
104
+ return false;
105
+ }
106
+ }
107
+ // Read-only: have we woken this target within the min-interval window?
108
+ export function isRateLimited(dir, sessionId, nowMs, minIntervalMs = MIN_INTERVAL_MS) {
109
+ try {
110
+ const st = statSync(ratePath(dir, sessionId));
111
+ return nowMs - st.mtimeMs < minIntervalMs;
112
+ }
113
+ catch {
114
+ return false;
115
+ }
116
+ }
117
+ function stampRate(dir, sessionId, nowMs) {
118
+ const p = ratePath(dir, sessionId);
119
+ try {
120
+ writeFileSync(p, String(nowMs));
121
+ setMtime(p, nowMs);
122
+ }
123
+ catch {
124
+ // best effort
125
+ }
126
+ }
127
+ // Atomically claim the (session_id, reply_to) wake slot. Returns true if THIS
128
+ // caller won (no fresh record existed) and may proceed to fire; false if a
129
+ // concurrent / duplicate claim already holds it. On a win, also stamps the
130
+ // per-target rate record so distinct replies inside MIN_INTERVAL_MS are
131
+ // suppressed. A stale record (older than TTL) is cleared first so the slot can
132
+ // be reclaimed after the GC horizon.
133
+ export function claimWake(dir, sessionId, replyTo, nowMs, ttlMs = DEDUPE_TTL_MS) {
134
+ mkdirSync(dir, { recursive: true });
135
+ const dpath = dedupePath(dir, sessionId, replyTo);
136
+ try {
137
+ const st = statSync(dpath);
138
+ if (nowMs - st.mtimeMs >= ttlMs)
139
+ unlinkSync(dpath);
140
+ }
141
+ catch (e) {
142
+ // ENOENT = no prior record (the common path) → fine. Any OTHER error (e.g.
143
+ // failing to unlink a STALE record because the store is unhealthy) must
144
+ // propagate so the caller degrades to skipped_store_error — otherwise the
145
+ // imminent openSync("wx") EEXIST on the un-removed stale record would be
146
+ // misreported as a genuine dedupe hit.
147
+ if (e.code !== "ENOENT")
148
+ throw e;
149
+ }
150
+ let won = false;
151
+ try {
152
+ const fd = openSync(dpath, "wx"); // atomic create-exclusive: closes the race
153
+ try {
154
+ writeFileSync(fd, JSON.stringify({ sessionId, replyTo, at: nowMs }));
155
+ }
156
+ finally {
157
+ closeSync(fd);
158
+ }
159
+ setMtime(dpath, nowMs);
160
+ won = true;
161
+ }
162
+ catch (e) {
163
+ // EEXIST: a fresh claim already exists → genuine duplicate (skip, no throw).
164
+ // Any OTHER error means the store itself is unusable (e.g. a permission
165
+ // problem) — don't misreport it as a duplicate; rethrow so the caller can
166
+ // degrade it to a deterministic store-error status instead of silently
167
+ // suppressing a legitimate wake.
168
+ if (e.code === "EEXIST") {
169
+ won = false;
170
+ }
171
+ else {
172
+ throw e;
173
+ }
174
+ }
175
+ if (won)
176
+ stampRate(dir, sessionId, nowMs);
177
+ return won;
178
+ }
179
+ // Remove autowake records older than the dedupe TTL. Cheap, low-volume dir;
180
+ // run opportunistically on each decision so records can't accumulate.
181
+ export function gcAutowake(dir, nowMs, ttlMs = DEDUPE_TTL_MS) {
182
+ let names;
183
+ try {
184
+ names = readdirSync(dir);
185
+ }
186
+ catch {
187
+ return; // dir not created yet
188
+ }
189
+ for (const name of names) {
190
+ if (name[0] !== "d" && name[0] !== "r")
191
+ continue;
192
+ const p = join(dir, name);
193
+ try {
194
+ const st = statSync(p);
195
+ if (nowMs - st.mtimeMs >= ttlMs)
196
+ unlinkSync(p);
197
+ }
198
+ catch {
199
+ // best effort
200
+ }
201
+ }
202
+ }
203
+ // The decision. Pure of tmux/process concerns: given the target identity, the
204
+ // reply_to, a snapshot of the target's activity, the current time, and the
205
+ // store directory, return whether the reply-default wake may fire. The caller
206
+ // performs the actual send-keys when fire === true.
207
+ export function decideReplyAutoWake(input) {
208
+ const { dir, sessionId, replyTo, activity, nowMs } = input;
209
+ if (autowakeKillSwitchOff(input.env))
210
+ return { fire: false, status: "disabled" };
211
+ // Identity is required: dedupe/rate/activity all key on session_id, and
212
+ // without it we cannot confirm fresh-idle. An unclaimed peer is never auto-woken.
213
+ if (!sessionId)
214
+ return { fire: false, status: "skipped_no_fresh_idle" };
215
+ if (!isFreshIdle(activity))
216
+ return { fire: false, status: "skipped_no_fresh_idle" };
217
+ // Wake bookkeeping is best-effort: send_message has ALREADY enqueued the
218
+ // reply by the time we run, so a broken dedupe/rate store (e.g. ~/.oxtail/
219
+ // autowake is a file, or a permission error) must degrade to a deterministic
220
+ // status — NEVER throw, which would surface as a tool error on an already-
221
+ // delivered message and invite a duplicate retry.
222
+ try {
223
+ gcAutowake(dir, nowMs); // opportunistic sweep before we read/claim
224
+ // Read-only dedupe first so a sequential duplicate reply reports the precise
225
+ // reason; then the per-target rate limit; then an atomic claim to close the
226
+ // concurrent-duplicate race (and to stamp the rate record on success).
227
+ if (isDuplicateWake(dir, sessionId, replyTo, nowMs))
228
+ return { fire: false, status: "skipped_deduped" };
229
+ if (isRateLimited(dir, sessionId, nowMs))
230
+ return { fire: false, status: "skipped_rate_limited" };
231
+ if (!claimWake(dir, sessionId, replyTo, nowMs))
232
+ return { fire: false, status: "skipped_deduped" };
233
+ }
234
+ catch {
235
+ return { fire: false, status: "skipped_store_error" };
236
+ }
237
+ return { fire: true };
238
+ }
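The dedupe claim in the hunk above depends on a create-exclusive open (`"wx"`) being atomic: exactly one caller creates the record, and every later caller gets `EEXIST`. The following is a minimal standalone sketch of that idea, not the shipped module. `claimOnce` and `demoClaim` are hypothetical names, the record filename is made up, and the sketch works in a throwaway temp directory rather than the real store. It uses dynamic `import()` only so that it runs as either CommonJS or an ES module.

```javascript
// Hypothetical sketch: claimOnce/demoClaim are illustrative names, not oxtail APIs.
async function demoClaim() {
  // Dynamic import works from both CommonJS and ES modules.
  const fs = await import("node:fs");
  const os = await import("node:os");
  const path = await import("node:path");

  // Returns true if THIS call created the record, false if it already existed.
  function claimOnce(file, payload) {
    let fd;
    try {
      fd = fs.openSync(file, "wx"); // fails with EEXIST if the file exists
    } catch (e) {
      if (e.code === "EEXIST") return false; // genuine duplicate claim
      throw e; // any other error: the store itself is unusable
    }
    try {
      fs.writeFileSync(fd, JSON.stringify(payload));
    } finally {
      fs.closeSync(fd);
    }
    return true;
  }

  // Throwaway directory and a made-up record name, for illustration only.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "claim-demo-"));
  const file = path.join(dir, "d-session-reply");
  try {
    const first = claimOnce(file, { at: Date.now() });
    const second = claimOnce(file, { at: Date.now() });
    return { first, second };
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}
```

Calling `demoClaim()` resolves to an object whose `first` is the winning claim and whose `second` is the duplicate. Rethrowing everything except `EEXIST` matches the diff's intent that a broken store is reported as a store error rather than as a dedupe hit.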
@@ -0,0 +1,75 @@
1
+ // Issue #7 — `oxtail diagnose`.
2
+ //
3
+ // The wake mechanism is environment-sensitive (tmux present? peer in a pane?
4
+ // Codex paste-burst gap still sufficient?). When it silently doesn't work, a
5
+ // user otherwise has to spelunk MCP_TRACE_FILE by hand. This summarizes the
6
+ // `wake_outcome` trace events oxtail emits — counts by wake_status, broken down
7
+ // by which tool drove the wake — so "is wake working here?" is one command.
8
+ import { readFileSync } from "node:fs";
9
+ // Keep only `wake_outcome` events, newest `limit`, and tally them. Malformed
10
+ // JSONL lines are skipped (a trace file can be concurrently appended).
11
+ export function summarizeWakeOutcomes(lines, limit = 200) {
12
+ const outcomes = [];
13
+ for (const line of lines) {
14
+ if (!line.trim())
15
+ continue;
16
+ let rec;
17
+ try {
18
+ rec = JSON.parse(line);
19
+ }
20
+ catch {
21
+ continue;
22
+ }
23
+ if (rec.event === "wake_outcome")
24
+ outcomes.push(rec);
25
+ }
26
+ const recent = limit > 0 ? outcomes.slice(-limit) : outcomes;
27
+ const byStatus = {};
28
+ const byVia = {};
29
+ for (const r of recent) {
30
+ const status = String(r.wake_status ?? "unknown");
31
+ const via = String(r.via ?? "unknown");
32
+ byStatus[status] = (byStatus[status] ?? 0) + 1;
33
+ const viaBucket = (byVia[via] ??= {});
34
+ viaBucket[status] = (viaBucket[status] ?? 0) + 1;
35
+ }
36
+ return { total: recent.length, considered: outcomes.length, byStatus, byVia };
37
+ }
38
+ function sortedCounts(counts) {
39
+ return Object.entries(counts).sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]));
40
+ }
41
+ export function formatWakeSummary(s) {
42
+ if (s.total === 0) {
43
+ return "oxtail diagnose: no wake_outcome events in the trace yet (no ask_peer / wake:auto / reply-default wakes recorded).";
44
+ }
45
+ const lines = [];
46
+ const capped = s.considered > s.total ? ` (newest ${s.total} of ${s.considered})` : ` (${s.total})`;
47
+ lines.push(`oxtail diagnose — wake outcomes${capped}:`);
48
+ for (const [status, n] of sortedCounts(s.byStatus)) {
49
+ lines.push(` ${status}: ${n}`);
50
+ }
51
+ lines.push("by tool:");
52
+ for (const [via, counts] of Object.entries(s.byVia).sort()) {
53
+ const parts = sortedCounts(counts).map(([st, n]) => `${st} ${n}`);
54
+ lines.push(` ${via}: ${parts.join(", ")}`);
55
+ }
56
+ return lines.join("\n");
57
+ }
58
+ // CLI entry. Returns a process exit code; `out` is injectable for tests.
59
+ export function runDiagnose(traceFile, out = console.log) {
60
+ if (!traceFile) {
61
+ out("oxtail diagnose: MCP_TRACE_FILE is not set, so there is no trace data to summarize.");
62
+ out("Set MCP_TRACE_FILE=/path/to/oxtail-trace.jsonl in the oxtail MCP server's env (e.g. in .mcp.json / ~/.claude.json / ~/.codex/config.toml), reproduce some wakes, then re-run `oxtail diagnose`.");
63
+ return 0;
64
+ }
65
+ let content;
66
+ try {
67
+ content = readFileSync(traceFile, "utf8");
68
+ }
69
+ catch {
70
+ out(`oxtail diagnose: could not read trace file ${traceFile} (set MCP_TRACE_FILE and reproduce some wakes first).`);
71
+ return 1;
72
+ }
73
+ out(formatWakeSummary(summarizeWakeOutcomes(content.split("\n"))));
74
+ return 0;
75
+ }
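To make the tally in `summarizeWakeOutcomes` concrete, here is a minimal standalone sketch that counts `wake_status` values from JSONL trace lines. `tallyWakeOutcomes` is a hypothetical name and the sample trace lines are invented for illustration. The sketch also skips JSON values that are not objects, which is an extra guard of this sketch rather than something the diff shows.

```javascript
// Hypothetical sketch: tallyWakeOutcomes is not an oxtail export.
function tallyWakeOutcomes(lines) {
  const byStatus = {};
  for (const line of lines) {
    if (!line.trim()) continue; // blank line
    let rec;
    try {
      rec = JSON.parse(line);
    } catch {
      continue; // malformed or half-written line
    }
    // Extra guard (assumption of this sketch): ignore non-object JSON values.
    if (rec === null || typeof rec !== "object") continue;
    if (rec.event !== "wake_outcome") continue;
    const status = String(rec.wake_status ?? "unknown");
    byStatus[status] = (byStatus[status] ?? 0) + 1;
  }
  return byStatus;
}

// Made-up trace lines, including a truncated one such as a concurrent append could leave.
const sampleTrace = [
  '{"event":"wake_outcome","via":"ask_peer","wake_status":"fired"}',
  '{"event":"wake_outcome","via":"send_message","wake_status":"skipped_debounced"}',
  '{"event":"wake_outcome","via":"ask_peer","wake_status":"fired"}',
  '{"event":"other"}',
  '{"event":"wake_outc',
  "",
];
const sampleCounts = tallyWakeOutcomes(sampleTrace);
```

Here `sampleCounts` holds one counter per status seen; the unrelated event, the truncated line, and the blank line contribute nothing.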
package/dist/registry.js CHANGED
@@ -42,6 +42,73 @@ function ensureDir() {
42
42
  function entryPath(pid) {
43
43
  return join(registryDir(), `${pid}.json`);
44
44
  }
45
+ // tmux's own identifiers, used to sanitize registry-sourced values before they
46
+ // reach a `tmux` command. A pane id is always `%<n>`; a session name, per tmux's
47
+ // rules for names we create, is `[A-Za-z0-9_-]+`. Validating defends against a
48
+ // malicious local peer writing a crafted `tmux_pane`/`tmux_session` into its own
49
+ // registry file to redirect or trick our wake send-keys (issue #6).
50
+ export function isValidTmuxPane(s) {
51
+ return /^%\d+$/.test(s);
52
+ }
53
+ export function isValidTmuxSession(s) {
54
+ return /^[A-Za-z0-9_-]+$/.test(s);
55
+ }
56
+ // The ONLY trustworthy send-keys target for waking a peer: the pane the live
57
+ // process tree says currently hosts the peer's `server_pid`. This is computed
58
+ // from `ps`/`tmux` state (currentPaneForServerPid), so it cannot be forged by a
59
+ // peer editing its own `~/.oxtail/sessions/<pid>.json` — unlike the cached
60
+ // `tmux_pane`/`tmux_session` fields, which the peer self-writes. Returns null
61
+ // (caller must refuse to wake) when:
62
+ // - the peer never registered a pane: a legit tmux-hosted peer always does
63
+ // (its session is derived from the pane), so a pane-less/session-only entry
64
+ // is hand-written or spoofed and must never be blind-fired; gating on a
65
+ // registered pane also avoids fishing for a pane from server_pid alone,
66
+ // which in tests can collide with the test runner's own pane.
67
+ // - server_pid isn't under any live tmux pane: we can't bind a trustworthy
68
+ // target, so we refuse rather than fall back to the self-written cached value.
69
+ // - the resolved pane isn't a well-formed pane id (tmux output anomaly).
70
+ // resolvePane is injected in tests; production uses currentPaneForServerPid.
71
+ export function chooseVerifiedWakePane(peer, resolvePane = currentPaneForServerPid) {
72
+ if (!peer.tmux_pane)
73
+ return null;
74
+ const live = resolvePane(peer.server_pid);
75
+ if (!live || !isValidTmuxPane(live))
76
+ return null;
77
+ return live;
78
+ }
79
+ // Extract the pid a registry filename encodes: `<pid>.json` → pid, else null.
80
+ export function filenamePid(file) {
81
+ const m = /^(\d+)\.json$/.exec(file);
82
+ if (!m)
83
+ return null;
84
+ const pid = Number(m[1]);
85
+ return Number.isInteger(pid) && pid > 0 ? pid : null;
86
+ }
87
+ // Read + parse a registry file, enforcing the provenance invariant that a
88
+ // process only ever writes its OWN `<pid>.json`: the parsed `server_pid` MUST
89
+ // equal the pid in the filename. register() always writes them in agreement, so
90
+ // a mismatch means the entry was hand-forged to borrow another process's pid —
91
+ // the #6 redirect where a peer self-writes `server_pid: <victimPid>` so that
92
+ // chooseVerifiedWakePane → currentPaneForServerPid resolves (and wakes) the
93
+ // victim's pane. Such entries, plus non-`<pid>.json` names and parse failures,
94
+ // are rejected (returns null) so no raw-registry reader trusts them. The
95
+ // local-user trust boundary still holds (a same-user process can overwrite any
96
+ // file), but this stops one peer's entry from impersonating another pid.
97
+ export function readEntryFile(dir, file) {
98
+ const fnamePid = filenamePid(file);
99
+ if (fnamePid === null)
100
+ return null;
101
+ let entry;
102
+ try {
103
+ entry = JSON.parse(readFileSync(join(dir, file), "utf8"));
104
+ }
105
+ catch {
106
+ return null;
107
+ }
108
+ if (entry.server_pid !== fnamePid)
109
+ return null;
110
+ return entry;
111
+ }
45
112
  function resolveTmuxSessionFromPane(pane) {
46
113
  if (!pane)
47
114
  return null;
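The hunk above adds two independent checks: a shape check on tmux pane ids, and a provenance check that a registry entry's `server_pid` equals the pid encoded in its filename. The sketch below restates both in standalone form so the accept/reject cases are easy to see. `isPaneId`, `pidFromFilename`, and `entryIsTrusted` are hypothetical names, and the sample inputs are invented.

```javascript
// Hypothetical sketch: these helper names are illustrative, not oxtail exports.
const isPaneId = (s) => /^%\d+$/.test(s);

// "<pid>.json" -> pid, anything else -> null.
function pidFromFilename(file) {
  const m = /^(\d+)\.json$/.exec(file);
  if (!m) return null;
  const pid = Number(m[1]);
  return Number.isInteger(pid) && pid > 0 ? pid : null;
}

// An entry is accepted only if its server_pid matches the pid in its filename.
function entryIsTrusted(file, entry) {
  const pid = pidFromFilename(file);
  return pid !== null && entry.server_pid === pid;
}

// Made-up inputs covering the accept and reject cases.
const checks = {
  goodPane: isPaneId("%12"),
  injectedPane: isPaneId("%12; kill-server"),
  ownEntry: entryIsTrusted("4242.json", { server_pid: 4242 }),
  borrowedPid: entryIsTrusted("4242.json", { server_pid: 1 }),
  oddName: entryIsTrusted("notes.json", { server_pid: 1 }),
};
```

Only `goodPane` and `ownEntry` pass. As the diff's own comment notes, this does not defend against a same-user process overwriting another file; it only stops one entry from claiming a different pid.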
@@ -120,7 +187,10 @@ export function findTmuxPaneByAncestry(startPid, panePids, ppids) {
120
187
  return null;
121
188
  }
122
189
  export function resolveTmuxPane(env = process.env, pid = process.pid) {
123
- if (env.TMUX_PANE)
190
+ // TMUX_PANE is a peer-controllable env var: only trust it if it has tmux's
191
+ // pane-id shape (%N). A spoofed/malformed value falls through to process-tree
192
+ // ancestry, which can't be forged by editing the environment (issue #6).
193
+ if (env.TMUX_PANE && isValidTmuxPane(env.TMUX_PANE))
124
194
  return env.TMUX_PANE;
125
195
  return findTmuxPaneByAncestry(pid, listTmuxPanePids(), listAllPpids());
126
196
  }
@@ -194,16 +264,10 @@ function gcDeadSiblings(entry) {
194
264
  if (!existsSync(dir))
195
265
  return;
196
266
  for (const file of readdirSync(dir)) {
197
- if (!file.endsWith(".json"))
198
- continue;
267
+ const other = readEntryFile(dir, file);
268
+ if (!other)
269
+ continue; // skip non-<pid>.json, parse errors, and forged entries
199
270
  const full = join(dir, file);
200
- let other;
201
- try {
202
- other = JSON.parse(readFileSync(full, "utf8"));
203
- }
204
- catch {
205
- continue;
206
- }
207
271
  if (other.server_pid === entry.server_pid)
208
272
  continue;
209
273
  if (other.client.session_id !== sid)
@@ -267,16 +331,10 @@ export function readAll() {
267
331
  return [];
268
332
  const live = [];
269
333
  for (const file of readdirSync(dir)) {
270
- if (!file.endsWith(".json"))
271
- continue;
334
+ const entry = readEntryFile(dir, file);
335
+ if (!entry)
336
+ continue; // non-<pid>.json, parse error, or forged server_pid
272
337
  const full = join(dir, file);
273
- let entry;
274
- try {
275
- entry = JSON.parse(readFileSync(full, "utf8"));
276
- }
277
- catch {
278
- continue;
279
- }
280
338
  if (!isAlive(entry.server_pid)) {
281
339
  // Reap-deferral: a dead child's mailbox may still hold undrained mail
282
340
  // that the session's union-drain (PreToolUse hook + read_my_messages)
@@ -342,15 +400,9 @@ export function sessionPidsForId(sessionId) {
342
400
  return [];
343
401
  const entries = [];
344
402
  for (const file of readdirSync(dir)) {
345
- if (!file.endsWith(".json"))
346
- continue;
347
- let e;
348
- try {
349
- e = JSON.parse(readFileSync(join(dir, file), "utf8"));
350
- }
351
- catch {
352
- continue;
353
- }
403
+ const e = readEntryFile(dir, file);
404
+ if (!e)
405
+ continue; // skip non-<pid>.json, parse errors, and forged entries
354
406
  if (e.client.session_id === sessionId)
355
407
  entries.push(e);
356
408
  }
package/dist/server.js CHANGED
@@ -10,9 +10,11 @@ import { dirname, join, sep } from "node:path";
10
10
  import { clientFromHandshake, detectClient, enrichWithDiagnosis, transcriptPathFor, } from "./clients.js";
11
11
  import { isAbstain } from "./detect/index.js";
12
12
  import { trace } from "./trace.js";
13
- import { buildEntry, currentPaneForServerPid, findByTmuxSession, readAll, refreshTmuxBinding, register, sessionPidsForId, unregister, } from "./registry.js";
13
+ import { buildEntry, chooseVerifiedWakePane, findByTmuxSession, readAll, refreshTmuxBinding, register, sessionPidsForId, unregister, } from "./registry.js";
14
14
  import * as mailbox from "./mailbox.js";
15
15
  import { recoverClaim, resolveAncestors, writeClaim } from "./claims.js";
16
+ import { decideReplyAutoWake, defaultAutowakeDir } from "./autowake.js";
17
+ import { markWoke, newWakeDebounceStore, recentlyWoke } from "./wake-debounce.js";
16
18
  // CLI subcommand dispatch must run before any MCP setup so that
17
19
  // `npx oxtail install-hook` doesn't open an MCP transport or register a
18
20
  // session. Use named exports and await them; calling `await import(...)`
@@ -32,6 +34,10 @@ import { recoverClaim, resolveAncestors, writeClaim } from "./claims.js";
32
34
  await mod.uninstall();
33
35
  process.exit(0);
34
36
  }
37
+ if (sub === "diagnose") {
38
+ const { runDiagnose } = await import("./diagnose.js");
39
+ process.exit(runDiagnose(process.env.MCP_TRACE_FILE));
40
+ }
35
41
  }
36
42
  import { readClaudeTranscript, readCodexTranscript, } from "./transcripts.js";
37
43
  // Single builder for every readSession return so the field set (including the
@@ -1002,7 +1008,7 @@ function resolveTarget(target, caller) {
1002
1008
  server.registerTool("send_message", {
1003
1009
  description: [
1004
1010
  "Fire-and-forget message to a peer in the same project root. Target: a tmux session name OR a client_session_id (UUID). Async via the peer's mailbox — delivered mid-turn (PreToolUse hook) or next-turn (read_my_messages); cross-project targets are rejected.",
1005
- "By default does NOT wake an idle peer. Pass wake:\"auto\" to nudge one via per-client send-keys, state-gated (skipped if the peer is mid-turn). Response then carries wake_status: \"fired\" | \"skipped_busy\" | \"skipped_no_target\" | \"disabled\".",
1011
+ "A plain message does NOT wake an idle peer. Pass wake:\"auto\" to nudge one via per-client send-keys, state-gated (skipped if the peer is mid-turn). EXCEPTION (wake-on-reply): when you set reply_to, this auto-wakes the requester by default so your answer doesn't strand them idle — pass wake:\"off\" to suppress. The reply-default wake is strictly gated: it fires only for a FRESHLY-IDLE requester (one whose Claude Code hooks maintain a fresh idle marker), with a per-target rate limit and a one-wake dedupe; env kill-switch OXTAIL_AUTOWAKE=off. A requester with no idle marker (Codex, or Claude without the hooks) returns skipped_no_fresh_idle and is NOT auto-woken — use explicit wake:\"auto\" for those. Response carries wake_status (\"fired\" | \"skipped_busy\" | \"skipped_debounced\" | \"skipped_no_fresh_idle\" | \"skipped_rate_limited\" | \"skipped_deduped\" | \"skipped_store_error\" | \"skipped_no_target\" | \"disabled\") and, on the reply path, wake_reason:\"reply_to_default\".",
1006
1012
  "Body is verbatim — wrap in <system-reminder>...</system-reminder> yourself if you want that framing. When replying to ask_peer, include reply_to: request_id from the inbound message. For a blocking send-and-wait, use ask_peer instead.",
1007
1013
  ].join(" "),
1008
1014
  inputSchema: {
@@ -1020,7 +1026,7 @@ server.registerTool("send_message", {
1020
1026
  wake: z
1021
1027
  .enum(["off", "auto"])
1022
1028
  .optional()
1023
- .describe('Wake strategy. "off" (default): pure fire-and-forget, no nudge. "auto": nudge an idle peer via per-client send-keys, state-gated (skipped if the peer is mid-turn). Response carries wake_status when set.'),
1029
+ .describe('Wake strategy. Default (unset): no nudge for a plain message, but a reply (reply_to set) auto-wakes a freshly-idle requester. "off": pure fire-and-forget, no nudge even for a reply. "auto": nudge an idle peer via per-client send-keys, state-gated (skipped if the peer is mid-turn). Response carries wake_status when set.'),
1024
1030
  reply_to: z
1025
1031
  .string()
1026
1032
  .min(1)
@@ -1035,11 +1041,14 @@ server.registerTool("send_message", {
1035
1041
  }, async ({ target, body, wake, reply_to, source_message_id }) => {
1036
1042
  const resolved = resolveTarget(target, entry);
1037
1043
  if (!resolved.ok) {
1038
- const wake_status = wake === "auto" ? resolveErrorWakeStatus(resolved.error) : undefined;
1044
+ const replyDefault = replyAutoWakeTriggered(wake, reply_to);
1045
+ const wakeIntended = wake === "auto" || replyDefault;
1046
+ const wake_status = wakeIntended ? resolveErrorWakeStatus(resolved.error) : undefined;
1039
1047
  return jsonResult({
1040
1048
  schema_version: 1,
1041
1049
  ...resolved,
1042
1050
  ...(wake_status ? { wake_status } : {}),
1051
+ ...(replyDefault ? { wake_reason: "reply_to_default" } : {}),
1043
1052
  });
1044
1053
  }
1045
1054
  const peer = resolved.entry;
@@ -1048,7 +1057,15 @@ server.registerTool("send_message", {
1048
1057
  reply_to,
1049
1058
  source_message_id,
1050
1059
  });
1051
- const wake_status = wake === "auto" ? await wakeForSend(peer) : undefined;
1060
+ const { wake_status, wake_reason } = await resolveSendWake(peer, wake, reply_to);
1061
+ if (wake_status) {
1062
+ trace("wake_outcome", {
1063
+ via: wake_reason === "reply_to_default" ? "reply_default" : "send_message",
1064
+ wake_status,
1065
+ target_session_id: peer.client.session_id,
1066
+ client_type: peer.client.type,
1067
+ });
1068
+ }
1052
1069
  return jsonResult({
1053
1070
  schema_version: 1,
1054
1071
  ok: true,
@@ -1056,6 +1073,7 @@ server.registerTool("send_message", {
1056
1073
  target_session_id: peer.client.session_id,
1057
1074
  target_server_pid: peer.server_pid,
1058
1075
  ...(wake_status ? { wake_status } : {}),
1076
+ ...(wake_reason ? { wake_reason } : {}),
1059
1077
  });
1060
1078
  });
1061
1079
  // read_my_messages budget. A session's union drain can return a backlog; cap
@@ -1246,11 +1264,11 @@ function askPeerDelay(ms, signal) {
1246
1264
  // parsed as a key event. The -l flag neutralizes any tmux keysequences a
1247
1265
  // malicious peer could plant in its registry entry.
1248
1266
  //
1249
- // Pane targeting can go stale: tmux_pane is cached at server startup
1250
- // (registry resolveTmuxPane), but Terminator-style window churn can move or
1251
- // close the pane after registration. send-keys against a dead pane id
1252
- // errors; if pane targeting fails and a sessionName is also available,
1253
- // retry against it (targets the session's currently-active pane).
1267
+ // askPeerWakeImpl keeps a generic pane→sessionName retry for its own unit
1268
+ // tests, but PRODUCTION wakePeer now passes only the process-tree-verified pane
1269
+ // (sessionName = null): a self-written tmux_session is not a trustworthy
1270
+ // send-keys target (issue #6), and pane-id churn is handled by re-resolving the
1271
+ // pane from server_pid on every wake rather than by a session fallback.
1254
1272
  async function defaultFireWakeKeystrokes(target, clientType) {
1255
1273
  execFileSync("tmux", ["send-keys", "-t", target, "-l", ASK_PEER_WAKE_TEXT], {
1256
1274
  stdio: ["ignore", "pipe", "pipe"],
@@ -1297,46 +1315,61 @@ export async function askPeerWakeImpl(pane, sessionName, fire) {
1297
1315
  // peer's client_type. Returns the wake_status that should surface in the
1298
1316
  // ask_peer response so callers can distinguish "we tried, no answer" from
1299
1317
  // "we didn't try because the client can't be woken."
1318
+ // In-memory per-process wake-debounce state, keyed by peer session_id. Coalesces
1319
+ // rapid repeat wakes to the same peer across all wake paths (issue #5).
1320
+ const wakeDebounce = newWakeDebounceStore();
1300
1321
  async function wakePeer(peer) {
1301
1322
  if (ASK_PEER_WAKE_STRATEGY === "off") {
1302
1323
  trace("ask_peer_wake_skipped", { reason: "strategy-off" });
1303
1324
  return "disabled";
1304
1325
  }
1305
1326
  const clientType = peer.client.type;
1306
- if (!peer.tmux_pane && !peer.tmux_session) {
1307
- return "skipped_no_target";
1308
- }
1309
- // Race-fix: tmux_pane is cached at registration but pane ids can be reused
1310
- // by tmux after a pane is killed. If we send-keys against a reused id we
1311
- // wake the wrong shell. When the peer registered WITH a cached pane,
1312
- // re-resolve from its server_pid at wake-time and prefer the live value.
1313
- // If the peer registered without a pane (no TMUX_PANE in env, no ancestry
1314
- match), skip the re-resolution entirely: fishing for a pane based on
1315
- // server_pid alone is unsafe (server_pid may not even still be alive, and
1316
- // in tests it can coincide with the test runner's process tree).
1317
- const livePane = peer.tmux_pane
1318
- ? currentPaneForServerPid(peer.server_pid)
1319
- : null;
1320
- if (peer.tmux_pane && livePane && livePane !== peer.tmux_pane) {
1321
- trace("ask_peer_wake_pane_refreshed", {
1327
+ // #5: coalesce a rapid repeat wake to the same peer (concurrent/retried
1328
+ // ask_peer, polling loops) so we don't stack a second notification line into
1329
+ // its composer. Keyed on session_id; an unclaimed peer (no id) isn't debounced.
1330
+ const sid = peer.client.session_id;
1331
+ if (sid && recentlyWoke(wakeDebounce, sid, Date.now())) {
1332
+ trace("ask_peer_wake_skipped", { reason: "debounced", target_session_id: sid });
1333
+ return "skipped_debounced";
1334
+ }
1335
+ // Security (#6): tmux_pane / tmux_session come from the peer's OWN registry
1336
+ // file, so a malicious local peer could point them at someone else's pane or
1337
+ // session to redirect our wake keystrokes. The ONLY trustworthy send-keys
1338
+ // target is the pane the live process tree says currently hosts the peer's
1339
+ // server_pid — chooseVerifiedWakePane resolves that and refuses (returns null)
1340
+ // when it can't be verified, instead of falling back to the self-written
1341
+ // cached pane or tmux_session. This also subsumes the old stale-pane re-
1342
+ // resolution race fix: we ALWAYS use the freshly process-tree-resolved pane.
1343
+ const verifiedPane = chooseVerifiedWakePane(peer);
1344
+ if (!verifiedPane) {
1345
+ trace("ask_peer_wake_skipped", {
1346
+ reason: "no-verified-pane",
1322
1347
  cached: peer.tmux_pane,
1323
- live: livePane,
1324
1348
  server_pid: peer.server_pid,
1349
+ target_session_id: peer.client.session_id,
1325
1350
  });
1351
+ return "skipped_no_target";
1326
1352
  }
1327
- else if (peer.tmux_pane && !livePane) {
1328
- trace("ask_peer_wake_pane_orphaned", {
1353
+ if (verifiedPane !== peer.tmux_pane) {
1354
+ trace("ask_peer_wake_pane_refreshed", {
1329
1355
  cached: peer.tmux_pane,
1356
+ live: verifiedPane,
1330
1357
  server_pid: peer.server_pid,
1331
1358
  });
1332
1359
  }
1333
- const effectivePane = livePane ?? peer.tmux_pane;
1334
1360
  // Legacy mode bypasses per-client routing: every wake is the v0.6 sequence
1335
1361
  // (no inter-keystroke delay). Cast to "unknown" so defaultFireWakeKeystrokes
1336
1362
  // skips the Codex delay branch.
1337
1363
  const fireType = ASK_PEER_WAKE_STRATEGY === "legacy" ? "unknown" : clientType;
1338
1364
  const fire = (target) => defaultFireWakeKeystrokes(target, fireType);
1339
- const ok = await askPeerWakeImpl(effectivePane, peer.tmux_session, fire);
1365
+ // #5: stamp the debounce BEFORE the (possibly async, paste-burst-delayed) fire
1366
+ // so a concurrent second wakePeer for this peer — which runs while we're
1367
+ // awaiting send-keys — sees the stamp and coalesces instead of double-firing.
1368
+ if (sid)
1369
+ markWoke(wakeDebounce, sid, Date.now());
1370
+ // No session-name fallback: a self-written tmux_session could target another
1371
+ // session, and the verified pane already handles pane-id churn. Pass null.
1372
+ const ok = await askPeerWakeImpl(verifiedPane, null, fire);
1340
1373
  return ok ? "fired" : "skipped_no_target";
1341
1374
  }
1342
1375
  // --- send_message wake:auto gating -------------------------------------------
@@ -1376,6 +1409,63 @@ async function wakeForSend(peer) {
1376
1409
  }
1377
1410
  return wakePeer(peer);
1378
1411
  }
1412
+ // --- Slice 1: wake-on-reply (reply_to default) -------------------------------
1413
+ // A send_message that carries a reply_to is answering an earlier ask. The wake
1414
+ // arg is a three-way for a reply:
1415
+ // unset → the STRICT reply-default auto-wake (fresh-idle only, rate limit,
1416
+ // one-wake dedupe, env kill-switch — autowake.ts). wake_reason:
1417
+ // "reply_to_default".
1418
+ // "auto" → the caller explicitly opts into the LENIENT wakeForSend path
1419
+ // (idle/unknown/stale all wake; only fresh-busy is skipped). This is
1420
+ // the escape hatch for a requester with no idle marker — a Codex or
1421
+ // hookless-Claude requester that the strict gate skips as
1422
+ // skipped_no_fresh_idle. Not flagged reply_to_default: the caller
1423
+ // asked for it explicitly.
1424
+ // "off" → no wake at all.
1425
+ // Here we just wire identity/activity/time into the strict gate and fire the
1426
+ // existing send-keys path when it says go.
1427
+ //
1428
+ // Note (per Codex's slice-1 correction): the fresh-idle gate makes an explicit
1429
+ // "is the requester actively blocked in ask_peer?" suppression unnecessary —
1430
+ // an active waiter is mid-turn and therefore marked busy, so it never reads as
1431
+ // fresh-idle. That holds only as long as the busy/idle freshness is correct;
1432
+ // it is not an independent proof.
1433
+ //
1434
+ // Triggers the STRICT reply-default path: a reply (reply_to set) with wake
1435
+ // UNSET. Explicit "auto"/"off" opt out of the strict path (auto → lenient,
1436
+ // off → none), so this is false for them.
1437
+ function replyAutoWakeTriggered(wake, replyTo) {
1438
+ return !!replyTo && wake === undefined;
1439
+ }
1440
+ async function autoWakeOnReply(peer, replyTo) {
1441
+ const sid = peer.client.session_id;
1442
+ const decision = decideReplyAutoWake({
1443
+ dir: defaultAutowakeDir(),
1444
+ sessionId: sid ?? null,
1445
+ replyTo,
1446
+ activity: readActivity(sid),
1447
+ nowMs: Date.now(),
1448
+ });
1449
+ if (!decision.fire) {
1450
+ trace("autowake_reply_skipped", { target_session_id: sid, status: decision.status });
1451
+ return decision.status;
1452
+ }
1453
+ trace("autowake_reply_fire", { target_session_id: sid });
1454
+ return wakePeer(peer);
1455
+ }
1456
+ // Resolve the wake for a send_message. The strict reply-default path engages
1457
+ // only for a reply with wake UNSET; an explicit wake:"auto" always means the
1458
+ // lenient wakeForSend path (even for a reply — the Codex/hookless escape hatch),
1459
+ // and wake:"off" means no wake. Returns the status + reason to surface.
1460
+ async function resolveSendWake(peer, wake, replyTo) {
1461
+ if (replyAutoWakeTriggered(wake, replyTo)) {
1462
+ return { wake_status: await autoWakeOnReply(peer, replyTo), wake_reason: "reply_to_default" };
1463
+ }
1464
+ if (wake === "auto") {
1465
+ return { wake_status: await wakeForSend(peer) };
1466
+ }
1467
+ return {};
1468
+ }
1379
1469
  // Poll my mailbox at ASK_PEER_POLL_MS until a matching reply lands or the
1380
1470
  // deadline elapses. Each tick checks mtime first and only acquires the
1381
1471
  // mailbox lock when there's a probable hit. The lock is held only inside
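The three-way handling of `wake` for a reply, described in the comments of the hunk above, reduces to a small decision table. A minimal sketch, assuming only what those comments state; `sendWakePath` and its return labels are hypothetical and exist only for this illustration.

```javascript
// Hypothetical sketch: sendWakePath and its labels are illustrative only.
function sendWakePath(wake, replyTo) {
  if (replyTo && wake === undefined) return "reply_default"; // strict gate
  if (wake === "auto") return "auto"; // lenient gate, even for a reply
  return "none"; // wake:"off", or a plain message with wake unset
}

// Made-up request id "req-1" stands in for a real reply_to value.
const wakePaths = [
  sendWakePath(undefined, "req-1"), // reply, wake unset
  sendWakePath("auto", "req-1"), // reply, explicit auto
  sendWakePath("off", "req-1"), // reply, explicit off
  sendWakePath(undefined, undefined), // plain message
];
```

The point of the table is that an explicit `wake` value always wins: the strict reply-default path is reached only when `reply_to` is set and `wake` is left unset.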
@@ -1500,6 +1590,12 @@ server.registerTool("ask_peer", {
1500
1590
  // send_message wake:auto. (Codex has no activity file, so it is never
1501
1591
  // detected busy and still fires — unchanged for that client.)
1502
1592
  wakeStatus = await wakeForSend(peer);
1593
+ trace("wake_outcome", {
1594
+ via: "ask_peer",
1595
+ wake_status: wakeStatus,
1596
+ target_session_id: peer.client.session_id,
1597
+ client_type: peer.client.type,
1598
+ });
1503
1599
  if (wakeStatus === "skipped_unsupported") {
1504
1600
  // Reserved branch. No client currently returns skipped_unsupported
1505
1601
  // in auto mode (Codex and Claude Code both wake via send-keys).
@@ -0,0 +1,45 @@
1
+ // Issue #5 — per-peer wake debouncer.
2
+ //
3
+ // Every wake fires `tmux send-keys` into the peer's composer. When the same peer
4
+ // is woken again within a fraction of a second — a caller retrying ask_peer, two
5
+ // callers targeting the same peer concurrently, or a polling loop — oxtail blasts
6
+ // a second WAKE_TEXT line on top of the first, which (with the Codex paste-burst
7
+ // gap) can land inside an already-active turn. This debouncer coalesces those:
8
+ // if a wake fired for a peer within a short window, subsequent wakes are skipped
9
+ // and rely on the still-pending response.
10
+ //
11
+ // Deliberately in-memory and per-process (state lives on the calling oxtail
12
+ // server): the common burst — one caller hammering one peer — is same-process,
13
+ // and cross-process coordination is out of scope for this slice. All wake paths
14
+ // (ask_peer, send_message wake:"auto", the reply-default wake) funnel through
15
+ // wakePeer, so one check there covers them all.
16
+ function envPosInt(name, def, env = process.env) {
17
+ const v = env[name];
18
+ if (!v)
19
+ return def;
20
+ const n = Number(v);
21
+ return Number.isFinite(n) && n > 0 ? n : def;
22
+ }
23
+ // Default 1s — long enough to swallow a rapid retry / concurrent double-wake,
24
+ // short enough that a genuinely separate follow-up wake a moment later still
25
+ // lands. Tunable via OXTAIL_WAKE_DEBOUNCE_MS.
26
+ export const WAKE_DEBOUNCE_MS = envPosInt("OXTAIL_WAKE_DEBOUNCE_MS", 1000);
27
+ export function newWakeDebounceStore() {
28
+ return new Map();
29
+ }
30
+ // True if a wake fired for this key within the window — i.e. skip this one.
31
+ export function recentlyWoke(store, key, nowMs, windowMs = WAKE_DEBOUNCE_MS) {
32
+ const last = store.get(key);
33
+ return last !== undefined && nowMs - last < windowMs;
34
+ }
35
+ // Record that a wake fired for this key. Opportunistically evicts stale entries
36
+ // so the map can't grow unbounded across many short-lived peers.
37
+ export function markWoke(store, key, nowMs, windowMs = WAKE_DEBOUNCE_MS) {
38
+ store.set(key, nowMs);
39
+ if (store.size > 256) {
40
+ for (const [k, t] of store) {
41
+ if (nowMs - t > windowMs * 10)
42
+ store.delete(k);
43
+ }
44
+ }
45
+ }
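The debounce in this file is a last-fired timestamp per key plus a window comparison. The sketch below is a standalone restatement with explicit timestamps so the boundary behaviour is visible. `tryWake` and `lastWake` are hypothetical names, the 1000ms window copies the default from the diff, and stale-entry eviction is left out for brevity.

```javascript
// Hypothetical sketch: tryWake/lastWake are illustrative; eviction is omitted.
const WINDOW_MS = 1000; // mirrors the default shown in the diff
const lastWake = new Map();

// Returns "fired" or "skipped_debounced" for a wake attempt at time nowMs.
function tryWake(sessionId, nowMs) {
  const last = lastWake.get(sessionId);
  if (last !== undefined && nowMs - last < WINDOW_MS) return "skipped_debounced";
  lastWake.set(sessionId, nowMs); // stamp before firing so concurrent callers coalesce
  return "fired";
}

// Made-up peers and timestamps (milliseconds).
const debounceRun = [
  tryWake("peer-a", 0), // first wake
  tryWake("peer-a", 400), // inside the window for peer-a
  tryWake("peer-b", 400), // different key, independent window
  tryWake("peer-a", 1000), // exactly at the window edge
];
```

Because the comparison is a strict `<`, a wake that arrives exactly `WINDOW_MS` after the previous one is allowed through, and each peer has its own window.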
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "oxtail",
3
- "version": "0.10.3",
3
+ "version": "0.12.0",
4
4
  "private": false,
5
5
  "type": "module",
6
6
  "description": "Coordination layer for parallel AI coding agent sessions, exposed over MCP.",
@@ -0,0 +1,63 @@
+ #!/usr/bin/env node
+ // Issue #7 — drift detector for Codex's paste-burst window.
+ //
+ // oxtail's Codex wake inserts a 500ms gap (ASK_PEER_CODEX_SUBMIT_DELAY_MS)
+ // between the typed wake text and Enter, to outlast Codex's paste-burst
+ // PASTE_ENTER_SUPPRESS_WINDOW — a private constant tested at 120ms. If Codex
+ // bumps that window past our gap in a future release, our wake silently
+ // regresses to "Enter gets swallowed" with no signal pointing at the cause.
+ //
+ // This script fetches the upstream constant and exits non-zero if it changed
+ // (or moved/renamed). Run on a schedule (see .github/workflows/codex-drift.yml)
+ // so drift surfaces as a failing job rather than a silent field regression.
+
+ const URL =
+   "https://raw.githubusercontent.com/openai/codex/main/codex-rs/tui/src/bottom_pane/paste_burst.rs";
+ const EXPECTED_MS = 120; // value oxtail's 500ms gap was verified against
+ const OUR_GAP_MS = 500; // ASK_PEER_CODEX_SUBMIT_DELAY_MS in src/server.ts
+
+ async function fetchSource(attempts = 3) {
+   let lastErr;
+   for (let i = 0; i < attempts; i++) {
+     try {
+       const res = await fetch(URL);
+       if (res.ok) return await res.text();
+       lastErr = new Error(`HTTP ${res.status}`);
+     } catch (e) {
+       lastErr = e;
+     }
+     await new Promise((r) => setTimeout(r, 1000 * (i + 1)));
+   }
+   throw lastErr;
+ }
+
+ let src;
+ try {
+   src = await fetchSource();
+ } catch (e) {
+   console.error(`drift-check: could not fetch paste_burst.rs (${e?.message ?? e}). Transient — re-run.`);
+   process.exit(2);
+ }
+
+ const m = src.match(/PASTE_ENTER_SUPPRESS_WINDOW[\s\S]{0,120}?from_millis\((\d+)\)/);
+ if (!m) {
+   console.error(
+     "drift-check: PASTE_ENTER_SUPPRESS_WINDOW / from_millis(...) not found upstream — Codex may have renamed or restructured the paste-burst logic. Re-verify oxtail's Codex wake gap (ASK_PEER_CODEX_SUBMIT_DELAY_MS) by hand.",
+   );
+   process.exit(1);
+ }
+
+ const ms = Number(m[1]);
+ if (ms !== EXPECTED_MS) {
+   const stillSafe = ms < OUR_GAP_MS;
+   console.error(
+     `drift-check: PASTE_ENTER_SUPPRESS_WINDOW changed ${EXPECTED_MS}ms -> ${ms}ms. ` +
+       `oxtail's gap is ${OUR_GAP_MS}ms — ` +
+       (stillSafe
+         ? "still larger, so wake should still submit, but update EXPECTED_MS here once re-verified."
+         : "NO LONGER LARGER: Codex wake will regress (Enter swallowed). Bump ASK_PEER_CODEX_SUBMIT_DELAY_MS in src/server.ts."),
+   );
+   process.exit(1);
+ }
+
+ console.log(`drift-check: PASTE_ENTER_SUPPRESS_WINDOW still ${ms}ms; oxtail gap ${OUR_GAP_MS}ms — OK.`);
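The extraction and comparison logic of the script above can be exercised offline, without the network fetch. The sketch below reuses the script's regular expression and thresholds; the `classify` helper, its result labels, and the shape of the sample Rust line are assumptions made for illustration and are not taken from the Codex source.

```javascript
// Same pattern as the drift-check script: the constant name, then up to
// 120 characters (lazy), then the from_millis(<digits>) call.
const PATTERN = /PASTE_ENTER_SUPPRESS_WINDOW[\s\S]{0,120}?from_millis\((\d+)\)/;

// Hypothetical helper: maps a source text to one of four outcomes.
// The real script exits 1 for "not_found" and for both "drifted_*" cases.
function classify(src, expectedMs = 120, gapMs = 500) {
  const m = src.match(PATTERN);
  if (!m) return "not_found";
  const ms = Number(m[1]);
  if (ms === expectedMs) return "ok";
  return ms < gapMs ? "drifted_still_safe" : "drifted_unsafe";
}

// Assumed shape of the upstream declaration, for testing only.
const sample = (ms) =>
  `const PASTE_ENTER_SUPPRESS_WINDOW: Duration = Duration::from_millis(${ms});`;

const outcomes = {
  unchanged: classify(sample(120)),
  bumped: classify(sample(200)),
  atGap: classify(sample(500)),
  renamed: classify("const PASTE_WINDOW: Duration = Duration::from_millis(120);"),
};
console.log(outcomes);
```

The `atGap` case shows that the safety check is strict: an upstream window equal to the 500ms gap is already treated as unsafe, matching `ms < OUR_GAP_MS` in the script.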