
oxtail 0.11.0 → 0.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/AGENTS.md +1 -0
package/README.md +29 -9
package/dist/diagnose.js +75 -0
package/dist/registry.js +80 -28
package/dist/server.js +62 -28
package/dist/wake-debounce.js +45 -0
package/package.json +1 -1
package/scripts/check-codex-paste-burst.mjs +63 -0

package/AGENTS.md CHANGED

@@ -58,6 +58,7 @@ The v0.9/v0.10.1 changes close the public dogfooding gaps found by real peer tra
 ## Recently shipped
+- **Wake hardening (v0.12.0 — issues #5/#6/#7, the v0.7-review backlog).** Three deferred wake items, landed together. **#6 (security):** wake send-keys now only ever target the pane the live process tree says hosts the peer's `server_pid` (`chooseVerifiedWakePane` → `currentPaneForServerPid`), never the peer's self-written `tmux_pane`/`tmux_session`; unverifiable ⇒ refuse (`skipped_no_target`). Registry-sourced tmux ids are shape-validated (`isValidTmuxPane`/`isValidTmuxSession`) and a spoofed `TMUX_PANE` env is ignored. This removed the cached-pane and session-name send-keys fallbacks (legit peers always register a real pane; churn is handled by re-resolution). **#5 (debounce):** all wake paths funnel through `wakePeer`, which coalesces repeat wakes to the same peer within `OXTAIL_WAKE_DEBOUNCE_MS` (default 1s, in-memory per process) ⇒ `skipped_debounced`. **#7 (observability):** a `wake_outcome` trace event per wake; `oxtail diagnose` summarizes wake_status counts by tool from `MCP_TRACE_FILE`; a scheduled `codex-drift.yml` fails if Codex's `PASTE_ENTER_SUPPRESS_WINDOW` drifts past our 500ms gap. New modules: `src/wake-debounce.ts`, `src/diagnose.ts`; `chooseVerifiedWakePane` in `src/registry.ts`.
 - **Wake-on-reply (Slice 1, peer-messaging refinement push).** A `send_message` that carries `reply_to` now auto-wakes the original requester **by default** (explicit `wake:"off"` opts out), closing the observed stranding where a peer's async reply to an idle requester forced a human to relay it. The reply path is a separate, stricter gate than the lenient `wake:"auto"` path (`src/autowake.ts`): it fires only for a **fresh-idle** target (idle marker newer than `OXTAIL_AUTOWAKE_FRESH_IDLE_MS`, default 5m) — stale/unknown/missing/busy ⇒ `skipped_no_fresh_idle`, never a best-effort wake — and adds a **per-target rate limit** (`skipped_rate_limited`), a persistent **one-wake dedupe** keyed on `(session_id, reply_to)` (`skipped_deduped`, GC'd by age) to survive duplicate/late hook drains, an `OXTAIL_AUTOWAKE=off` kill-switch, and a best-effort `skipped_store_error` degrade so a broken dedupe store can never turn an already-enqueued reply into a tool error. Target is resolved by `client.session_id` with the pane re-resolved immediately before send-keys (no `server_pid`/stale-pane reuse). Response surfaces `wake_status` + `wake_reason:"reply_to_default"`. **Coverage caveat:** the fresh-idle gate keys on the busy/idle marker that only the Claude Code hooks maintain, so this slice reaches a **hooked Claude Code requester** (the observed case). A Codex / hookless-Claude requester has no idle marker ⇒ `skipped_no_fresh_idle` (reach it with explicit `wake:"auto"`); closing that direction is **Slice 2** (`expects_reply:true` — a requester-side waiter signal), deliberately not faked here with a blind `unknown ⇒ wake` that would reintroduce the active-waiter double-wake.
 - **Protocol hardening (v0.10.1).** `ask_peer` now stamps outbound messages with `request_id`; reply-to-capable peers answer with `send_message({ reply_to: request_id })`, and the waiter ignores stale same-peer messages. Explicit identity claims are monotonic, so stale automatic detection cannot clobber a real client session id. PreToolUse/Stop hook pushes are body-budgeted and labeled as peer context, not user authority.
 - **Deliver-on-complete and state-gated wake (v0.9).** The Stop hook delivers waiting messages at turn end, closing the text-only-turn gap left by PreToolUse. `UserPromptSubmit`/`Stop` maintain a busy/idle flag so `send_message({ wake: "auto" })` nudges idle peers without typing into a busy composer. Sticky Codex claim recovery keeps identity across MCP child restarts.

package/README.md CHANGED

@@ -36,7 +36,7 @@ args = ["-y", "oxtail@0.10.1"]
 ```sh
 mkdir -p ~/.claude/commands
-curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.10.1/.claude/commands/oxtail-join.md \
+curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.12.0/.claude/commands/oxtail-join.md \
   -o ~/.claude/commands/oxtail-join.md
 ```
@@ -44,9 +44,9 @@ curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.10.1/.claude/command
 ```sh
 mkdir -p ~/.codex/skills/oxtail-join/agents
-curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.10.1/integrations/codex/oxtail-join/SKILL.md \
+curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.12.0/integrations/codex/oxtail-join/SKILL.md \
   -o ~/.codex/skills/oxtail-join/SKILL.md
-curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.10.1/integrations/codex/oxtail-join/agents/openai.yaml \
+curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.12.0/integrations/codex/oxtail-join/agents/openai.yaml \
   -o ~/.codex/skills/oxtail-join/agents/openai.yaml
 ```
@@ -71,7 +71,7 @@ Contributing? `git clone https://github.com/d4j3y2k/oxtail && cd oxtail && npm i
 - `register_my_session` — pin this MCP server's `session_id` directly. Kept for debugging; prefer `claim_session`.
 - `get_my_session` — return this MCP server's own registry entry plus a per-strategy detection diagnosis. Useful for debugging.
-See [design principles](https://github.com/d4j3y2k/oxtail/blob/v0.10.1/AGENTS.md) for scope and architecture.
+See [design principles](https://github.com/d4j3y2k/oxtail/blob/v0.12.0/AGENTS.md) for scope and architecture.
 ## Usage from an agent
@@ -172,6 +172,8 @@ If you have a hook installed on a managed event that isn't from Terminator and i
 oxtail trusts any process running as the **same local user** to enqueue messages. The mailbox directory is mode `0o700` (private), so other users on the host cannot read or write. **On a shared-tenancy box (containers, multi-user dev hosts, etc.), do not run oxtail-aware agents:** any local process under your user can inject `<system-reminder>` content directly into a Claude session. The threat boundary is the same as `~/.ssh/` — what your user processes do, you trust.
+Within that boundary oxtail still *narrows* redirectable side effects, as defense-in-depth rather than a hard boundary: wake keystrokes only go to the pane the process tree confirms hosts the target's `server_pid`, never a self-written `tmux_pane`/`tmux_session` (see [Pane targeting](#pane-targeting-verified)), and an accepted registry entry can't borrow another pid — its `server_pid` must match its own `<pid>.json` filename. So one peer's entry can't masquerade as hosting another agent to redirect that agent's wake. A same-user process can still overwrite any registry file outright (that's the trust boundary above); what it can't do is smuggle a pid mismatch past a reader.
 ## Delegate-and-wait (v0.10.1)
 `ask_peer` extends v0.5's mailbox transport into a blocking primitive:
@@ -182,7 +184,7 @@ ask_peer({ target, body })
       ok: true,
       message_id,
       request_id,
-      wake_status: "fired" | "skipped_unsupported" | "skipped_no_target" | "disabled",
+      wake_status: "fired" | "skipped_busy" | "skipped_debounced" | "skipped_no_target" | "disabled",
       reply: { id, body, enqueued_at, from_session_id, reply_to, correlation } | null,
       correlation: "correlated" | "uncorrelated" | "none",
       timeout_ms,
@@ -190,7 +192,7 @@ ask_peer({ target, body })
     }
 ```
-`wake_status` distinguishes the four outcomes a caller may need to handle differently. `fired` means the wake was attempted (or the reply arrived during the grace window, so no wake was needed). `skipped_unsupported` is reserved — no client currently returns this in auto mode (both Codex and Claude Code wake via send-keys). `skipped_no_target` means no tmux pane/session resolved for the target. `disabled` means `OXTAIL_ASK_PEER_WAKE_STRATEGY=off` is in effect.
+`wake_status` distinguishes the outcomes a caller may need to handle differently. `fired` means the wake was attempted (or the reply arrived during the grace window, so no wake was needed). `skipped_busy` means the peer is mid-turn (its hooks/poll will deliver — we still poll for the reply). `skipped_debounced` means a wake fired for this peer moments ago and this one was coalesced. `skipped_no_target` means no process-tree-verified pane resolved for the target. `disabled` means `OXTAIL_ASK_PEER_WAKE_STRATEGY=off` is in effect. (`skipped_unsupported` is reserved — no client currently returns it.)
 `timed_out` is `true` only when the poll loop ran to its deadline without a reply.
@@ -220,9 +222,13 @@ ask_peer({ target, body })
 4. Poll the caller's mailbox at 200ms. For reply-to-capable peers, only a message with both `from_session_id == target.session_id` and `reply_to == request_id` satisfies the wait; non-matching messages stay in the mailbox untouched. Legacy/no-capability peers are best-effort and are marked `correlation: "uncorrelated"`; this preserves old peers but can stale-match old same-peer chatter.
 5. Return the reply on match, or `{ reply: null, timed_out: true, wake_status, correlation: "none" }` after the timeout. Late replies fall back to the normal v0.5 hook / `read_my_messages` path — never lost, just delivered out of band.
-### Pane staleness
+### Pane targeting (verified)
+A peer's cached `tmux_pane` / `tmux_session` are written by the peer into its **own** registry file, so they aren't trustworthy targets for keystrokes — a malicious local peer could point them at someone else's pane. The **only** send-keys target oxtail uses is the pane the live process tree says currently hosts the peer's `server_pid` (resolved at wake-time via `ps`/`tmux` ancestry — unforgeable by editing a JSON file). This also handles pane-id churn for free: the pane is always re-resolved fresh. If `server_pid` can't be bound to any live pane, oxtail **refuses** to wake (`wake_status: "skipped_no_target"`) rather than fall back to a self-written value. `server_pid` itself is self-written too, so registry entries whose `server_pid` doesn't match their own `<pid>.json` filename are rejected — a forged entry can't borrow another process's pane. The pane id that does reach `tmux` is shape-validated (`%\d+`); session names are no longer used as a send-keys target at all. (Hardening from issue #6.)
+### Wake debouncing
-Pane targeting can go stale: `tmux_pane` is cached at server startup, but tmux can reuse pane ids after a pane is killed. v0.7 re-resolves the pane from the peer's `server_pid` at wake-time (via process-tree ancestry), preferring the live pane id over the cached one. If the peer is no longer in any tmux pane (orphaned), oxtail falls back to the registered tmux session name. If both targeting attempts fail, `wake_status` returns `skipped_no_target`.
+All wake paths funnel through one place, which **coalesces** rapid repeat wakes to the same peer: if a wake fired for a peer within `OXTAIL_WAKE_DEBOUNCE_MS` (default 1s), a follow-up wake is skipped (`wake_status: "skipped_debounced"`) and relies on the still-pending response. This keeps a retried `ask_peer`, two callers racing the same peer, or a polling loop from stacking notification lines into the peer's composer. In-memory and per-process by design. (Issue #5.)
 ### Constraints
@@ -281,10 +287,24 @@ When a strategy doesn't fire, it returns an abstention with a `reason` (e.g. `"2
 If `MCP_TRACE_FILE` is set in the environment, every detection run appends an NDJSON record with trigger, winning strategy, per-strategy outcomes, and `next_step`. Useful for diagnosing unresolved `client_session_id`s in the wild.
+### Diagnosing wakes (`oxtail diagnose`)
+The same `MCP_TRACE_FILE` also captures a `wake_outcome` record for every wake (which tool drove it and the resulting `wake_status`). Run:
+```sh
+oxtail diagnose
+```
+to get a summary — counts by `wake_status`, broken down by tool — so "is the wake mechanism working in my environment?" is one command instead of grepping JSONL. With `MCP_TRACE_FILE` unset it just prints how to enable tracing. (Issue #7.)
+A scheduled CI job (`.github/workflows/codex-drift.yml`, also runnable on demand) fetches Codex's upstream `PASTE_ENTER_SUPPRESS_WINDOW` and fails if it drifts past oxtail's 500ms Codex wake gap — so a future Codex release that would break the wake surfaces as a red job rather than a silent field regression.
 ## Status
-v0.10.1. Completes the autonomous peer-messaging matrix and hardens the protocol: a message reaches a Claude Code peer whether it's mid-turn, finishing, or fully idle, and delegate-and-wait replies are correlated by `request_id` / `reply_to` for upgraded peers.
+v0.12.0. Pushes the autonomous peer-messaging matrix toward zero human relay, then hardens the wake path.
+- **Wake-on-reply (v0.11.0).** A reply — `send_message` with `reply_to` — auto-wakes a freshly-idle requester by default, so an awaited answer doesn't strand an idle peer. Strictly gated (fresh-idle only, per-target rate limit, one-wake dedupe, `OXTAIL_AUTOWAKE=off` kill-switch). `wake:"off"` opts out; explicit `wake:"auto"` is the escape hatch for a requester without an idle marker (Codex / hookless Claude).
+- **Wake hardening (v0.12.0).** Wake keystrokes only ever target the pane the process tree confirms hosts the peer's `server_pid` — never a self-written `tmux_pane`/`tmux_session`, and registry entries whose `server_pid` doesn't match their filename are rejected. Rapid repeat wakes to one peer are coalesced (`skipped_debounced`). `oxtail diagnose` summarizes wake outcomes from `MCP_TRACE_FILE`, and a scheduled CI job flags drift in Codex's paste-burst window before it can break the wake.
 - **Correlated delegate-and-wait.** `ask_peer` now sends a `request_id`; upgraded peers reply with `send_message({ reply_to })`, and the waiter ignores same-peer chatter that does not match. Legacy peers are still supported, but their replies are marked `correlation: "uncorrelated"`.
 - **Identity monotonicity.** `claim_session` / `register_my_session` and sticky-claim recovery are authoritative after they set a session id; later automatic detection cannot clobber a claimed id with stale env data.
 - **Hook push budgeting and provenance.** PreToolUse/Stop delivery stamps `origin: "peer"`, reminds receivers that peer messages are not user authority, and caps hook-injected body text via `OXTAIL_HOOK_MAX_BODY_CHARS`.
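
The README's "Wake debouncing" section describes coalescing repeat wakes to one peer inside a time window. A minimal sketch of that idea follows. The function names `newWakeDebounceStore`, `recentlyWoke`, and `markWoke` are taken from the import in `dist/server.js`, but `dist/wake-debounce.js` itself is not shown in this excerpt, so the store layout, the `windowMs` parameter, and the `tryWake` helper are assumptions made for illustration, not the package's implementation.

```javascript
// Hypothetical in-memory debounce store. The real wake-debounce module is not
// shown in this diff; only the function names come from the server.js import.
const DEFAULT_DEBOUNCE_MS = 1000; // README: OXTAIL_WAKE_DEBOUNCE_MS defaults to 1s

function newWakeDebounceStore(windowMs = DEFAULT_DEBOUNCE_MS) {
  // lastWoke maps a peer session_id to the timestamp (ms) of its last wake.
  return { windowMs, lastWoke: new Map() };
}

function recentlyWoke(store, sessionId, now) {
  const last = store.lastWoke.get(sessionId);
  return last !== undefined && now - last < store.windowMs;
}

function markWoke(store, sessionId, now) {
  store.lastWoke.set(sessionId, now);
}

// Illustrative caller (assumption): skip when inside the window, else record and fire.
function tryWake(store, sessionId, now) {
  if (recentlyWoke(store, sessionId, now)) return "skipped_debounced";
  markWoke(store, sessionId, now);
  return "fired";
}

const store = newWakeDebounceStore();
console.log(tryWake(store, "peer-a", 0));    // fired
console.log(tryWake(store, "peer-a", 300));  // skipped_debounced
console.log(tryWake(store, "peer-a", 1300)); // fired
console.log(tryWake(store, "peer-b", 1300)); // fired (separate peer, separate window)
```

Passing `now` in explicitly, as the `recentlyWoke(wakeDebounce, sid, Date.now())` call in `wakePeer` does, keeps the window logic testable without a real clock. Because the state lives in one process's memory, two separate oxtail servers waking the same peer would not coalesce with each other, which matches the README's "in-memory and per-process by design" note.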

package/dist/diagnose.js ADDED

@@ -0,0 +1,75 @@
+// Issue #7 — `oxtail diagnose`.
+//
+// The wake mechanism is environment-sensitive (tmux present? peer in a pane?
+// Codex paste-burst gap still sufficient?). When it silently doesn't work, a
+// user otherwise has to spelunk MCP_TRACE_FILE by hand. This summarizes the
+// `wake_outcome` trace events oxtail emits — counts by wake_status, broken down
+// by which tool drove the wake — so "is wake working here?" is one command.
+import { readFileSync } from "node:fs";
+// Keep only `wake_outcome` events, newest `limit`, and tally them. Malformed
+// JSONL lines are skipped (a trace file can be concurrently appended).
+export function summarizeWakeOutcomes(lines, limit = 200) {
+    const outcomes = [];
+    for (const line of lines) {
+        if (!line.trim())
+            continue;
+        let rec;
+        try {
+            rec = JSON.parse(line);
+        }
+        catch {
+            continue;
+        }
+        if (rec.event === "wake_outcome")
+            outcomes.push(rec);
+    }
+    const recent = limit > 0 ? outcomes.slice(-limit) : outcomes;
+    const byStatus = {};
+    const byVia = {};
+    for (const r of recent) {
+        const status = String(r.wake_status ?? "unknown");
+        const via = String(r.via ?? "unknown");
+        byStatus[status] = (byStatus[status] ?? 0) + 1;
+        const viaBucket = (byVia[via] ??= {});
+        viaBucket[status] = (viaBucket[status] ?? 0) + 1;
+    }
+    return { total: recent.length, considered: outcomes.length, byStatus, byVia };
+}
+function sortedCounts(counts) {
+    return Object.entries(counts).sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]));
+}
+export function formatWakeSummary(s) {
+    if (s.total === 0) {
+        return "oxtail diagnose: no wake_outcome events in the trace yet (no ask_peer / wake:auto / reply-default wakes recorded).";
+    }
+    const lines = [];
+    const capped = s.considered > s.total ? ` (newest ${s.total} of ${s.considered})` : ` (${s.total})`;
+    lines.push(`oxtail diagnose — wake outcomes${capped}:`);
+    for (const [status, n] of sortedCounts(s.byStatus)) {
+        lines.push(`  ${status}: ${n}`);
+    }
+    lines.push("by tool:");
+    for (const [via, counts] of Object.entries(s.byVia).sort()) {
+        const parts = sortedCounts(counts).map(([st, n]) => `${st} ${n}`);
+        lines.push(`  ${via}: ${parts.join(", ")}`);
+    }
+    return lines.join("\n");
+}
+// CLI entry. Returns a process exit code; `out` is injectable for tests.
+export function runDiagnose(traceFile, out = console.log) {
+    if (!traceFile) {
+        out("oxtail diagnose: MCP_TRACE_FILE is not set, so there is no trace data to summarize.");
+        out("Set MCP_TRACE_FILE=/path/to/oxtail-trace.jsonl in the oxtail MCP server's env (e.g. in .mcp.json / ~/.claude.json / ~/.codex/config.toml), reproduce some wakes, then re-run `oxtail diagnose`.");
+        return 0;
+    }
+    let content;
+    try {
+        content = readFileSync(traceFile, "utf8");
+    }
+    catch {
+        out(`oxtail diagnose: could not read trace file ${traceFile} (set MCP_TRACE_FILE and reproduce some wakes first).`);
+        return 1;
+    }
+    out(formatWakeSummary(summarizeWakeOutcomes(content.split("\n"))));
+    return 0;
+}
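
To make the tally in `summarizeWakeOutcomes` concrete, here is a condensed restatement run against a few sample trace lines. The field names `event`, `via`, and `wake_status` come from the code above; the sample records, the `detect` event name, and the `tallyWakeOutcomes` helper are invented for illustration. The sketch also adds a `null` guard that the shipped function does not appear to have: `JSON.parse("null")` succeeds and returns `null`, so a line containing a bare `null` would presumably throw on `rec.event`.

```javascript
// Condensed restatement of the tally (no `limit` handling). Not the shipped code.
function tallyWakeOutcomes(lines) {
  const byStatus = {};
  const byVia = {};
  for (const line of lines) {
    if (!line.trim()) continue;
    let rec;
    try {
      rec = JSON.parse(line);
    } catch {
      continue; // malformed or half-written line: skip it
    }
    // Extra guard (assumption): JSON.parse can legitimately return null.
    if (rec === null || rec.event !== "wake_outcome") continue;
    const status = String(rec.wake_status ?? "unknown");
    const via = String(rec.via ?? "unknown");
    byStatus[status] = (byStatus[status] ?? 0) + 1;
    const bucket = (byVia[via] ??= {});
    bucket[status] = (bucket[status] ?? 0) + 1;
  }
  return { byStatus, byVia };
}

// Made-up trace lines, including a non-wake event, a truncated line, and a blank.
const sampleTrace = [
  '{"event":"wake_outcome","via":"send_message","wake_status":"fired"}',
  '{"event":"wake_outcome","via":"send_message","wake_status":"skipped_debounced"}',
  '{"event":"wake_outcome","via":"reply_default","wake_status":"fired"}',
  '{"event":"detect","winner":"env"}',
  '{"event":"wake_outcome","via":"reply_def',
  "",
];

const summary = tallyWakeOutcomes(sampleTrace);
console.log(summary.byStatus); // { fired: 2, skipped_debounced: 1 }
console.log(summary.byVia);
```

Only the three complete `wake_outcome` records are counted; the truncated line is dropped silently, which is the behavior the source comment motivates with concurrent appends to the trace file.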

package/dist/registry.js CHANGED

@@ -42,6 +42,73 @@ function ensureDir() {
 function entryPath(pid) {
     return join(registryDir(), `${pid}.json`);
 }
+// tmux's own identifiers, used to sanitize registry-sourced values before they
+// reach a `tmux` command. A pane id is always `%<n>`; a session name, per tmux's
+// rules for names we create, is `[A-Za-z0-9_-]+`. Validating defends against a
+// malicious local peer writing a crafted `tmux_pane`/`tmux_session` into its own
+// registry file to redirect or trick our wake send-keys (issue #6).
+export function isValidTmuxPane(s) {
+    return /^%\d+$/.test(s);
+}
+export function isValidTmuxSession(s) {
+    return /^[A-Za-z0-9_-]+$/.test(s);
+}
+// The ONLY trustworthy send-keys target for waking a peer: the pane the live
+// process tree says currently hosts the peer's `server_pid`. This is computed
+// from `ps`/`tmux` state (currentPaneForServerPid), so it cannot be forged by a
+// peer editing its own `~/.oxtail/sessions/<pid>.json` — unlike the cached
+// `tmux_pane`/`tmux_session` fields, which the peer self-writes. Returns null
+// (caller must refuse to wake) when:
+//   - the peer never registered a pane: a legit tmux-hosted peer always does
+//     (its session is derived from the pane), so a pane-less/session-only entry
+//     is hand-written or spoofed and must never be blind-fired; gating on a
+//     registered pane also avoids fishing for a pane from server_pid alone,
+//     which in tests can collide with the test runner's own pane.
+//   - server_pid isn't under any live tmux pane: we can't bind a trustworthy
+//     target, so we refuse rather than fall back to the self-written cached value.
+//   - the resolved pane isn't a well-formed pane id (tmux output anomaly).
+// resolvePane is injected in tests; production uses currentPaneForServerPid.
+export function chooseVerifiedWakePane(peer, resolvePane = currentPaneForServerPid) {
+    if (!peer.tmux_pane)
+        return null;
+    const live = resolvePane(peer.server_pid);
+    if (!live || !isValidTmuxPane(live))
+        return null;
+    return live;
+}
+// Extract the pid a registry filename encodes: `<pid>.json` → pid, else null.
+export function filenamePid(file) {
+    const m = /^(\d+)\.json$/.exec(file);
+    if (!m)
+        return null;
+    const pid = Number(m[1]);
+    return Number.isInteger(pid) && pid > 0 ? pid : null;
+}
+// Read + parse a registry file, enforcing the provenance invariant that a
+// process only ever writes its OWN `<pid>.json`: the parsed `server_pid` MUST
+// equal the pid in the filename. register() always writes them in agreement, so
+// a mismatch means the entry was hand-forged to borrow another process's pid —
+// the #6 redirect where a peer self-writes `server_pid: <victimPid>` so that
+// chooseVerifiedWakePane → currentPaneForServerPid resolves (and wakes) the
+// victim's pane. Such entries, plus non-`<pid>.json` names and parse failures,
+// are rejected (returns null) so no raw-registry reader trusts them. The
+// local-user trust boundary still holds (a same-user process can overwrite any
+// file), but this stops one peer's entry from impersonating another pid.
+export function readEntryFile(dir, file) {
+    const fnamePid = filenamePid(file);
+    if (fnamePid === null)
+        return null;
+    let entry;
+    try {
+        entry = JSON.parse(readFileSync(join(dir, file), "utf8"));
+    }
+    catch {
+        return null;
+    }
+    if (entry.server_pid !== fnamePid)
+        return null;
+    return entry;
+}
 function resolveTmuxSessionFromPane(pane) {
     if (!pane)
         return null;
@@ -120,7 +187,10 @@ export function findTmuxPaneByAncestry(startPid, panePids, ppids) {
     return null;
 }
 export function resolveTmuxPane(env = process.env, pid = process.pid) {
-    if (env.TMUX_PANE)
+    // TMUX_PANE is a peer-controllable env var: only trust it if it has tmux's
+    // pane-id shape (%N). A spoofed/malformed value falls through to process-tree
+    // ancestry, which can't be forged by editing the environment (issue #6).
+    if (env.TMUX_PANE && isValidTmuxPane(env.TMUX_PANE))
         return env.TMUX_PANE;
     return findTmuxPaneByAncestry(pid, listTmuxPanePids(), listAllPpids());
 }
@@ -194,16 +264,10 @@ function gcDeadSiblings(entry) {
     if (!existsSync(dir))
         return;
     for (const file of readdirSync(dir)) {
-        if (!file.endsWith(".json"))
-            continue;
+        const other = readEntryFile(dir, file);
+        if (!other)
+            continue; // skip non-<pid>.json, parse errors, and forged entries
         const full = join(dir, file);
-        let other;
-        try {
-            other = JSON.parse(readFileSync(full, "utf8"));
-        }
-        catch {
-            continue;
-        }
         if (other.server_pid === entry.server_pid)
             continue;
         if (other.client.session_id !== sid)
@@ -267,16 +331,10 @@ export function readAll() {
         return [];
     const live = [];
     for (const file of readdirSync(dir)) {
-        if (!file.endsWith(".json"))
-            continue;
+        const entry = readEntryFile(dir, file);
+        if (!entry)
+            continue; // non-<pid>.json, parse error, or forged server_pid
         const full = join(dir, file);
-        let entry;
-        try {
-            entry = JSON.parse(readFileSync(full, "utf8"));
-        }
-        catch {
-            continue;
-        }
         if (!isAlive(entry.server_pid)) {
             // Reap-deferral: a dead child's mailbox may still hold undrained mail
             // that the session's union-drain (PreToolUse hook + read_my_messages)
@@ -342,15 +400,9 @@ export function sessionPidsForId(sessionId) {
         return [];
     const entries = [];
     for (const file of readdirSync(dir)) {
-        if (!file.endsWith(".json"))
-            continue;
-        let e;
-        try {
-            e = JSON.parse(readFileSync(join(dir, file), "utf8"));
-        }
-        catch {
-            continue;
-        }
+        const e = readEntryFile(dir, file);
+        if (!e)
+            continue; // skip non-<pid>.json, parse errors, and forged entries
         if (e.client.session_id === sessionId)
             entries.push(e);
     }
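
The two issue #6 checks in this file (filename/`server_pid` agreement and the process-tree-verified pane) can be exercised without touching the filesystem or tmux. The sketch below restates `isValidTmuxPane`, `filenamePid`, and `chooseVerifiedWakePane` from the hunks above. `acceptEntry` is a hypothetical stand-in for `readEntryFile` that takes the JSON text directly, and `stubResolvePane` with its pid and pane values is an assumption standing in for `currentPaneForServerPid`.

```javascript
// Restated from the diff above.
function isValidTmuxPane(s) {
  return /^%\d+$/.test(s);
}

function filenamePid(file) {
  const m = /^(\d+)\.json$/.exec(file);
  if (!m) return null;
  const pid = Number(m[1]);
  return Number.isInteger(pid) && pid > 0 ? pid : null;
}

// Hypothetical helper: same accept/reject rule as readEntryFile, minus the file read.
function acceptEntry(file, jsonText) {
  const pid = filenamePid(file);
  if (pid === null) return null;
  let entry;
  try {
    entry = JSON.parse(jsonText);
  } catch {
    return null;
  }
  if (entry === null || typeof entry !== "object") return null;
  return entry.server_pid === pid ? entry : null;
}

// Restated from the diff above; the resolver is always injected here.
function chooseVerifiedWakePane(peer, resolvePane) {
  if (!peer.tmux_pane) return null;
  const live = resolvePane(peer.server_pid);
  return live && isValidTmuxPane(live) ? live : null;
}

// Stub process-tree lookup (assumption): pid 4242 currently lives in pane %7.
const stubResolvePane = (pid) => (pid === 4242 ? "%7" : null);

const honest = acceptEntry("4242.json", '{"server_pid":4242,"tmux_pane":"%3"}');
const forged = acceptEntry("9001.json", '{"server_pid":4242,"tmux_pane":"%3"}');

console.log(forged); // null
console.log(chooseVerifiedWakePane(honest, stubResolvePane)); // %7
console.log(chooseVerifiedWakePane({ server_pid: 5555, tmux_pane: "%7" }, stubResolvePane)); // null
```

The honest entry's cached `%3` is ignored in favor of the live `%7`, the entry in `9001.json` claiming pid 4242 is rejected before any pane lookup happens, and a peer whose pid maps to no pane gets no target even though its self-written `tmux_pane` names a real pane.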

package/dist/server.js CHANGED

@@ -10,10 +10,11 @@ import { dirname, join, sep } from "node:path";
 import { clientFromHandshake, detectClient, enrichWithDiagnosis, transcriptPathFor, } from "./clients.js";
 import { isAbstain } from "./detect/index.js";
 import { trace } from "./trace.js";
-import { buildEntry, currentPaneForServerPid, findByTmuxSession, readAll, refreshTmuxBinding, register, sessionPidsForId, unregister, } from "./registry.js";
+import { buildEntry, chooseVerifiedWakePane, findByTmuxSession, readAll, refreshTmuxBinding, register, sessionPidsForId, unregister, } from "./registry.js";
 import * as mailbox from "./mailbox.js";
 import { recoverClaim, resolveAncestors, writeClaim } from "./claims.js";
 import { decideReplyAutoWake, defaultAutowakeDir } from "./autowake.js";
+import { markWoke, newWakeDebounceStore, recentlyWoke } from "./wake-debounce.js";
 // CLI subcommand dispatch must run before any MCP setup so that
 // `npx oxtail install-hook` doesn't open an MCP transport or register a
 // session. Use named exports and await them; calling `await import(...)`
@@ -33,6 +34,10 @@ import { decideReplyAutoWake, defaultAutowakeDir } from "./autowake.js";
         await mod.uninstall();
         process.exit(0);
     }
+    if (sub === "diagnose") {
+        const { runDiagnose } = await import("./diagnose.js");
+        process.exit(runDiagnose(process.env.MCP_TRACE_FILE));
+    }
 }
 import { readClaudeTranscript, readCodexTranscript, } from "./transcripts.js";
 // Single builder for every readSession return so the field set (including the
@@ -1003,7 +1008,7 @@ function resolveTarget(target, caller) {
 server.registerTool("send_message", {
     description: [
         "Fire-and-forget message to a peer in the same project root. Target: a tmux session name OR a client_session_id (UUID). Async via the peer's mailbox — delivered mid-turn (PreToolUse hook) or next-turn (read_my_messages); cross-project targets are rejected.",
-        "A plain message does NOT wake an idle peer. Pass wake:\"auto\" to nudge one via per-client send-keys, state-gated (skipped if the peer is mid-turn). EXCEPTION (wake-on-reply): when you set reply_to, this auto-wakes the requester by default so your answer doesn't strand them idle — pass wake:\"off\" to suppress. The reply-default wake is strictly gated: it fires only for a FRESHLY-IDLE requester (one whose Claude Code hooks maintain a fresh idle marker), with a per-target rate limit and a one-wake dedupe; env kill-switch OXTAIL_AUTOWAKE=off. A requester with no idle marker (Codex, or Claude without the hooks) returns skipped_no_fresh_idle and is NOT auto-woken — use explicit wake:\"auto\" for those. Response carries wake_status (\"fired\" | \"skipped_busy\" | \"skipped_no_fresh_idle\" | \"skipped_rate_limited\" | \"skipped_deduped\" | \"skipped_store_error\" | \"skipped_no_target\" | \"disabled\") and, on the reply path, wake_reason:\"reply_to_default\".",
+        "A plain message does NOT wake an idle peer. Pass wake:\"auto\" to nudge one via per-client send-keys, state-gated (skipped if the peer is mid-turn). EXCEPTION (wake-on-reply): when you set reply_to, this auto-wakes the requester by default so your answer doesn't strand them idle — pass wake:\"off\" to suppress. The reply-default wake is strictly gated: it fires only for a FRESHLY-IDLE requester (one whose Claude Code hooks maintain a fresh idle marker), with a per-target rate limit and a one-wake dedupe; env kill-switch OXTAIL_AUTOWAKE=off. A requester with no idle marker (Codex, or Claude without the hooks) returns skipped_no_fresh_idle and is NOT auto-woken — use explicit wake:\"auto\" for those. Response carries wake_status (\"fired\" | \"skipped_busy\" | \"skipped_debounced\" | \"skipped_no_fresh_idle\" | \"skipped_rate_limited\" | \"skipped_deduped\" | \"skipped_store_error\" | \"skipped_no_target\" | \"disabled\") and, on the reply path, wake_reason:\"reply_to_default\".",
         "Body is verbatim — wrap in <system-reminder>...</system-reminder> yourself if you want that framing. When replying to ask_peer, include reply_to: request_id from the inbound message. For a blocking send-and-wait, use ask_peer instead.",
     ].join(" "),
     inputSchema: {
@@ -1053,6 +1058,14 @@ server.registerTool("send_message", {
         source_message_id,
     });
     const { wake_status, wake_reason } = await resolveSendWake(peer, wake, reply_to);
+    if (wake_status) {
+        trace("wake_outcome", {
+            via: wake_reason === "reply_to_default" ? "reply_default" : "send_message",
+            wake_status,
+            target_session_id: peer.client.session_id,
+            client_type: peer.client.type,
+        });
+    }
     return jsonResult({
         schema_version: 1,
         ok: true,
@@ -1251,11 +1264,11 @@ function askPeerDelay(ms, signal) {
 // parsed as a key event. The -l flag neutralizes any tmux keysequences a
 // malicious peer could plant in its registry entry.
 //
-// Pane targeting can go stale: tmux_pane is cached at server startup
-// (registry resolveTmuxPane), but Terminator-style window churn can move or
-// close the pane after registration. send-keys against a dead pane id
-// errors; if pane targeting fails and a sessionName is also available,
-// retry against it (targets the session's currently-active pane).
+// askPeerWakeImpl keeps a generic pane→sessionName retry for its own unit
+// tests, but PRODUCTION wakePeer now passes only the process-tree-verified pane
+// (sessionName = null): a self-written tmux_session is not a trustworthy
+// send-keys target (issue #6), and pane-id churn is handled by re-resolving the
+// pane from server_pid on every wake rather than by a session fallback.
 async function defaultFireWakeKeystrokes(target, clientType) {
     execFileSync("tmux", ["send-keys", "-t", target, "-l", ASK_PEER_WAKE_TEXT], {
         stdio: ["ignore", "pipe", "pipe"],
@@ -1302,46 +1315,61 @@ export async function askPeerWakeImpl(pane, sessionName, fire) {
 // peer's client_type. Returns the wake_status that should surface in the
 // ask_peer response so callers can distinguish "we tried, no answer" from
 // "we didn't try because the client can't be woken."
+// In-memory per-process wake-debounce state, keyed by peer session_id. Coalesces
+// rapid repeat wakes to the same peer across all wake paths (issue #5).
+const wakeDebounce = newWakeDebounceStore();
 async function wakePeer(peer) {
     if (ASK_PEER_WAKE_STRATEGY === "off") {
         trace("ask_peer_wake_skipped", { reason: "strategy-off" });
         return "disabled";
     }
     const clientType = peer.client.type;
-    if (!peer.tmux_pane && !peer.tmux_session) {
-        return "skipped_no_target";
-    }
-    // Race-fix: tmux_pane is cached at registration but pane ids can be reused
-    // by tmux after a pane is killed. If we send-keys against a reused id we
-    // wake the wrong shell. When the peer registered WITH a cached pane,
-    // re-resolve from its server_pid at wake-time and prefer the live value.
-    // If the peer registered without a pane (no TMUX_PANE in env, no ancestry
-    // match), skip the re-resolution entirely — fishing for a pane based on
-    // server_pid alone is unsafe (server_pid may not even still be alive, and
-    // in tests it can coincide with the test runner's process tree).
-    const livePane = peer.tmux_pane
-        ? currentPaneForServerPid(peer.server_pid)
-        : null;
-    if (peer.tmux_pane && livePane && livePane !== peer.tmux_pane) {
-        trace("ask_peer_wake_pane_refreshed", {
+    // #5: coalesce a rapid repeat wake to the same peer (concurrent/retried
+    // ask_peer, polling loops) so we don't stack a second notification line into
+    // its composer. Keyed on session_id; an unclaimed peer (no id) isn't debounced.
+    const sid = peer.client.session_id;
+    if (sid && recentlyWoke(wakeDebounce, sid, Date.now())) {
+        trace("ask_peer_wake_skipped", { reason: "debounced", target_session_id: sid });
+        return "skipped_debounced";
+    }
+    // Security (#6): tmux_pane / tmux_session come from the peer's OWN registry
+    // file, so a malicious local peer could point them at someone else's pane or
+    // session to redirect our wake keystrokes. The ONLY trustworthy send-keys
+    // target is the pane the live process tree says currently hosts the peer's
+    // server_pid — chooseVerifiedWakePane resolves that and refuses (returns null)
+    // when it can't be verified, instead of falling back to the self-written
+    // cached pane or tmux_session. This also subsumes the old stale-pane re-
+    // resolution race fix: we ALWAYS use the freshly process-tree-resolved pane.
+    const verifiedPane = chooseVerifiedWakePane(peer);
+    if (!verifiedPane) {
+        trace("ask_peer_wake_skipped", {
+            reason: "no-verified-pane",
             cached: peer.tmux_pane,
-            live: livePane,
             server_pid: peer.server_pid,
+            target_session_id: peer.client.session_id,
         });
+        return "skipped_no_target";
     }
-    else if (peer.tmux_pane && !livePane) {
-        trace("ask_peer_wake_pane_orphaned", {
+    if (verifiedPane !== peer.tmux_pane) {
+        trace("ask_peer_wake_pane_refreshed", {
             cached: peer.tmux_pane,
+            live: verifiedPane,
             server_pid: peer.server_pid,
         });
     }
-    const effectivePane = livePane ?? peer.tmux_pane;
     // Legacy mode bypasses per-client routing: every wake is the v0.6 sequence
     // (no inter-keystroke delay). Cast to "unknown" so defaultFireWakeKeystrokes
     // skips the Codex delay branch.
     const fireType = ASK_PEER_WAKE_STRATEGY === "legacy" ? "unknown" : clientType;
     const fire = (target) => defaultFireWakeKeystrokes(target, fireType);
-    const ok = await askPeerWakeImpl(effectivePane, peer.tmux_session, fire);
+    // #5: stamp the debounce BEFORE the (possibly async, paste-burst-delayed) fire
+    // so a concurrent second wakePeer for this peer — which runs while we're
+    // awaiting send-keys — sees the stamp and coalesces instead of double-firing.
+    if (sid)
+        markWoke(wakeDebounce, sid, Date.now());
+    // No session-name fallback: a self-written tmux_session could target another
+    // session, and the verified pane already handles pane-id churn. Pass null.
+    const ok = await askPeerWakeImpl(verifiedPane, null, fire);
     return ok ? "fired" : "skipped_no_target";
 }
 // --- send_message wake:auto gating -------------------------------------------
@@ -1562,6 +1590,12 @@ server.registerTool("ask_peer", {
             // send_message wake:auto. (Codex has no activity file, so it is never
             // detected busy and still fires — unchanged for that client.)
             wakeStatus = await wakeForSend(peer);
+            trace("wake_outcome", {
+                via: "ask_peer",
+                wake_status: wakeStatus,
+                target_session_id: peer.client.session_id,
+                client_type: peer.client.type,
+            });
             if (wakeStatus === "skipped_unsupported") {
                 // Reserved branch. No client currently returns skipped_unsupported
                 // in auto mode (Codex and Claude Code both wake via send-keys).
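
The stamp-before-await ordering in `wakePeer` can be illustrated with a minimal sketch. It assumes a simplified stand-in, `wakeOnce`, and a fake `fakeFire` that resolves after a short timer; neither name exists in oxtail, and the real function also does pane verification and tracing that are omitted here. The point it shows is that an `async` function runs synchronously up to its first `await`, so a second call issued in the same tick already sees the stamp.

```javascript
// Hypothetical reduction of wakePeer's debounce ordering; not oxtail's code.
const WINDOW_MS = 1000; // stand-in for WAKE_DEBOUNCE_MS
const store = new Map(); // stand-in for the per-process wakeDebounce store
let fired = 0;

// Fake send-keys: resolves asynchronously, like the paste-burst-delayed fire.
function fakeFire() {
    return new Promise((resolve) => {
        setTimeout(() => {
            fired += 1;
            resolve(true);
        }, 20);
    });
}

async function wakeOnce(sid, nowMs) {
    const last = store.get(sid);
    if (last !== undefined && nowMs - last < WINDOW_MS) {
        return "skipped_debounced";
    }
    // Stamp BEFORE awaiting, so a concurrent caller coalesces.
    store.set(sid, nowMs);
    const ok = await fakeFire();
    return ok ? "fired" : "skipped_no_target";
}

// Two wakes for the same peer, issued back to back in the same tick.
const first = wakeOnce("peer-a", 0);
const second = wakeOnce("peer-a", 5);
const stampAfterBothCalls = store.get("peer-a");

const outcome = Promise.all([first, second]).then((statuses) => {
    console.log(statuses, "fired:", fired);
    return statuses;
});
```

If the stamp were written after the `await` instead, both calls would pass the check and the fake fire would run twice.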

package/dist/wake-debounce.js ADDED

@@ -0,0 +1,45 @@
+// Issue #5 — per-peer wake debouncer.
+//
+// Every wake fires `tmux send-keys` into the peer's composer. When the same peer
+// is woken again within a fraction of a second — a caller retrying ask_peer, two
+// callers targeting the same peer concurrently, or a polling loop — oxtail blasts
+// a second WAKE_TEXT line on top of the first, which (with the Codex paste-burst
+// gap) can land inside an already-active turn. This debouncer coalesces those:
+// if a wake fired for a peer within a short window, subsequent wakes are skipped
+// and rely on the still-pending response.
+//
+// Deliberately in-memory and per-process (state lives on the calling oxtail
+// server): the common burst — one caller hammering one peer — is same-process,
+// and cross-process coordination is out of scope for this slice. All wake paths
+// (ask_peer, send_message wake:"auto", the reply-default wake) funnel through
+// wakePeer, so one check there covers them all.
+function envPosInt(name, def, env = process.env) {
+    const v = env[name];
+    if (!v)
+        return def;
+    const n = Number(v);
+    return Number.isFinite(n) && n > 0 ? n : def;
+}
+// Default 1s — long enough to swallow a rapid retry / concurrent double-wake,
+// short enough that a genuinely separate follow-up wake a moment later still
+// lands. Tunable via OXTAIL_WAKE_DEBOUNCE_MS.
+export const WAKE_DEBOUNCE_MS = envPosInt("OXTAIL_WAKE_DEBOUNCE_MS", 1000);
+export function newWakeDebounceStore() {
+    return new Map();
+}
+// True if a wake fired for this key within the window — i.e. skip this one.
+export function recentlyWoke(store, key, nowMs, windowMs = WAKE_DEBOUNCE_MS) {
+    const last = store.get(key);
+    return last !== undefined && nowMs - last < windowMs;
+}
+// Record that a wake fired for this key. Opportunistically evicts stale entries
+// so the map can't grow unbounded across many short-lived peers.
+export function markWoke(store, key, nowMs, windowMs = WAKE_DEBOUNCE_MS) {
+    store.set(key, nowMs);
+    if (store.size > 256) {
+        for (const [k, t] of store) {
+            if (nowMs - t > windowMs * 10)
+                store.delete(k);
+        }
+    }
+}
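
As a rough illustration of how the two helpers above behave, the following sketch copies `recentlyWoke` and `markWoke` locally (without `export`) and drives them with fixed timestamps rather than `Date.now()`. The peer keys, the timestamps, and the `WINDOW_MS` constant are assumptions made for the example, not values taken from oxtail's tests.

```javascript
// Local copies of the helpers from dist/wake-debounce.js so the snippet runs
// on its own. WINDOW_MS is a stand-in for WAKE_DEBOUNCE_MS (default 1000).
const WINDOW_MS = 1000;

function recentlyWoke(store, key, nowMs, windowMs = WINDOW_MS) {
    const last = store.get(key);
    return last !== undefined && nowMs - last < windowMs;
}

function markWoke(store, key, nowMs, windowMs = WINDOW_MS) {
    store.set(key, nowMs);
    if (store.size > 256) {
        for (const [k, t] of store) {
            if (nowMs - t > windowMs * 10)
                store.delete(k);
        }
    }
}

// Hypothetical timeline for one peer.
const store = new Map();
const timeline = [];
timeline.push(recentlyWoke(store, "peer-a", 1000)); // never woken: not debounced
markWoke(store, "peer-a", 1000);
timeline.push(recentlyWoke(store, "peer-a", 1500)); // 500ms later: inside the window
timeline.push(recentlyWoke(store, "peer-a", 2000)); // exactly 1000ms later: bound is exclusive
timeline.push(recentlyWoke(store, "peer-b", 1500)); // a different key is unaffected

// The eviction sweep only runs once the map holds more than 256 keys.
const crowded = new Map();
for (let i = 0; i < 257; i++) markWoke(crowded, `p${i}`, 0);
const sizeBeforeSweep = crowded.size;
markWoke(crowded, "fresh", 20000); // 20000 - 0 > 10 * 1000, so the old keys are swept
const sizeAfterSweep = crowded.size;

console.log(timeline, sizeBeforeSweep, sizeAfterSweep);
// prints [ false, true, false, false ] 257 1
```

Two details are worth noting: the window comparison is strict (`<`), so a wake exactly one window later is allowed through, and stale entries linger until the map grows past 256 keys, because the sweep is opportunistic rather than timer-driven.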

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "oxtail",
-  "version": "0.11.0",
+  "version": "0.12.0",
   "private": false,
   "type": "module",
   "description": "Coordination layer for parallel AI coding agent sessions, exposed over MCP.",

package/scripts/check-codex-paste-burst.mjs ADDED

@@ -0,0 +1,63 @@
+#!/usr/bin/env node
+// Issue #7 — drift detector for Codex's paste-burst window.
+//
+// oxtail's Codex wake inserts a 500ms gap (ASK_PEER_CODEX_SUBMIT_DELAY_MS)
+// between the typed wake text and Enter, to outlast Codex's paste-burst
+// PASTE_ENTER_SUPPRESS_WINDOW — a private constant tested at 120ms. If Codex
+// bumps that window past our gap in a future release, our wake silently
+// regresses to "Enter gets swallowed" with no signal pointing at the cause.
+//
+// This script fetches the upstream constant and exits non-zero if it changed
+// (or moved/renamed). Run on a schedule (see .github/workflows/codex-drift.yml)
+// so drift surfaces as a failing job rather than a silent field regression.
+const URL =
+  "https://raw.githubusercontent.com/openai/codex/main/codex-rs/tui/src/bottom_pane/paste_burst.rs";
+const EXPECTED_MS = 120; // value oxtail's 500ms gap was verified against
+const OUR_GAP_MS = 500; // ASK_PEER_CODEX_SUBMIT_DELAY_MS in src/server.ts
+async function fetchSource(attempts = 3) {
+  let lastErr;
+  for (let i = 0; i < attempts; i++) {
+    try {
+      const res = await fetch(URL);
+      if (res.ok) return await res.text();
+      lastErr = new Error(`HTTP ${res.status}`);
+    } catch (e) {
+      lastErr = e;
+    }
+    await new Promise((r) => setTimeout(r, 1000 * (i + 1)));
+  }
+  throw lastErr;
+}
+let src;
+try {
+  src = await fetchSource();
+} catch (e) {
+  console.error(`drift-check: could not fetch paste_burst.rs (${e?.message ?? e}). Transient — re-run.`);
+  process.exit(2);
+}
+const m = src.match(/PASTE_ENTER_SUPPRESS_WINDOW[\s\S]{0,120}?from_millis\((\d+)\)/);
+if (!m) {
+  console.error(
+    "drift-check: PASTE_ENTER_SUPPRESS_WINDOW / from_millis(...) not found upstream — Codex may have renamed or restructured the paste-burst logic. Re-verify oxtail's Codex wake gap (ASK_PEER_CODEX_SUBMIT_DELAY_MS) by hand.",
+  );
+  process.exit(1);
+}
+const ms = Number(m[1]);
+if (ms !== EXPECTED_MS) {
+  const stillSafe = ms < OUR_GAP_MS;
+  console.error(
+    `drift-check: PASTE_ENTER_SUPPRESS_WINDOW changed ${EXPECTED_MS}ms -> ${ms}ms. ` +
+      `oxtail's gap is ${OUR_GAP_MS}ms — ` +
+      (stillSafe
+        ? "still larger, so wake should still submit, but update EXPECTED_MS here once re-verified."
+        : "NO LONGER LARGER: Codex wake will regress (Enter swallowed). Bump ASK_PEER_CODEX_SUBMIT_DELAY_MS in src/server.ts."),
+  );
+  process.exit(1);
+}
+console.log(`drift-check: PASTE_ENTER_SUPPRESS_WINDOW still ${ms}ms; oxtail gap ${OUR_GAP_MS}ms — OK.`);
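
The script's decision logic can be exercised offline with a small sketch. It reuses the same regular expression and thresholds, but the `classify` helper, its return labels, and the sample Rust line are assumptions made for illustration; the sample is not a quote of the upstream `paste_burst.rs`, whose exact declaration may differ.

```javascript
// Same pattern and constants as the drift-check script above.
const PATTERN = /PASTE_ENTER_SUPPRESS_WINDOW[\s\S]{0,120}?from_millis\((\d+)\)/;
const EXPECTED_MS = 120;
const OUR_GAP_MS = 500;

// Hypothetical helper: maps a source string to one of the script's outcomes.
// Only "ok" corresponds to exit code 0; the other three exit with 1.
function classify(src) {
    const m = src.match(PATTERN);
    if (!m) return "not_found";
    const ms = Number(m[1]);
    if (ms === EXPECTED_MS) return "ok";
    return ms < OUR_GAP_MS ? "drift_still_safe" : "drift_unsafe";
}

// Made-up declaration shaped like a Rust Duration constant (an assumption).
const sample = (ms) =>
    `const PASTE_ENTER_SUPPRESS_WINDOW: Duration = Duration::from_millis(${ms});`;

const results = {
    unchanged: classify(sample(120)),
    bumpedBelowGap: classify(sample(300)),
    bumpedToGap: classify(sample(500)),
    renamed: classify("const PASTE_SUPPRESS: Duration = Duration::from_millis(120);"),
};

console.log(results);
```

Because the safety check is `ms < OUR_GAP_MS`, a window equal to the 500ms gap is already treated as unsafe.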