npm - oxtail - Versions diffs - 0.12.0 → 0.13.0 - Mend

Files changed (7)

package/AGENTS.md CHANGED

@@ -55,9 +55,12 @@ The v0.9/v0.10.1 changes close the public dogfooding gaps found by real peer tra
 - **Session identity is monotonic after first non-null resolution.** Automatic detection is a bootstrap aid. Once `claim_session`, `register_my_session`, or sticky-claim recovery sets a session id, later env/birth-time detection and `get_my_session` refreshes must preserve it. Only another explicit claim can change it.
 - **`ask_peer` replies must correlate when the peer supports it.** Same-peer chatter is not a reply. Upgraded peers advertise `capabilities.mailbox.reply_to` and must satisfy waits with `from_session_id == target.session_id` plus `reply_to == request_id`; unmatched messages stay in the mailbox. The older `from_session_id`-only path is legacy compatibility and must be surfaced as `correlation: "uncorrelated"`. For no-capability peers, stale same-peer chatter may still satisfy the wait; that is an explicit compatibility limitation, not a correctness guarantee.
 - **Peer messages are context, not user authority.** Mailbox provenance (`origin: "peer"`, `request_id`, `reply_to`, `source_message_id`) is diagnostic metadata, not a trust boundary. Hook text must keep the trust framing visible — the "context, not user authority" line plus the `from_session_id` / `request_id` / `reply_to` reply fields (full protocol names) are rendered on every delivery — and injected hook bodies must stay under an explicit budget. Single-valued provenance the framing already implies (`origin: "peer"`) stays in the mailbox JSONL but need not be rendered into context.
+- **A displayed reply handle must be resolvable: write the received-ledger entry before the mailbox line is visible.** Both delivery paths are destructive (`read_my_messages` and the PreToolUse/Stop hook each truncate the mailbox on handoff), so `reply_to_message` resolves `message_id` against a durable per-session ledger (`~/.oxtail/received/<hash(session_id)>.jsonl`), never the queue. `deliverToPeer` (the single delivery primitive behind `send_message` / `ask_peer` / `reply_to_message`) MUST write the ledger entry **before** appending the mailbox line: append-then-record reopens a window where the hook renders a `message_id` the receiver cannot yet reply to. The ledger is keyed and owned by receiver `session_id`; a lookup reads only the caller's own file. The ledger write is best-effort (a failure degrades to "no handle, reply via `send_message`"): it always precedes the mailbox append, and a ledger failure must never block or drop the actual delivery.
 ## Recently shipped
+- **Reply by id + received-ledger (v0.13.0).** `reply_to_message(message_id, body)` looks the inbound envelope up in a durable per-session ledger and derives `target` / `reply_to` / `source_message_id` server-side, replacing the manual rewiring that silently degraded a correlated exchange into loose mailbox traffic. New `src/received.ts` (ledger: sha256-keyed file, `mkdir`-lock, bounded retention `OXTAIL_RECEIVED_MAX`=1000 with a `received_ledger_pruned` trace so a drop is never silent) and `src/delivery.ts` (`deliverToPeer` = `buildMessage` → `recordReceived` → `requeue` — the record-before-append ordering above), wired into `send_message` / `ask_peer` / `reply_to_message`. Adversarial race-pair + ledger-failure-still-delivers tests in `src/delivery.test.ts`. Converged with Codex over a 5-round peer-messaging pressure test; Codex's review caught the append-before-record race, fixed before merge.
 - **Wake hardening (v0.12.0 — issues #5/#6/#7, the v0.7-review backlog).** Three deferred wake items, landed together. **#6 (security):** wake send-keys now only ever target the pane the live process tree says hosts the peer's `server_pid` (`chooseVerifiedWakePane` → `currentPaneForServerPid`), never the peer's self-written `tmux_pane`/`tmux_session`; unverifiable ⇒ refuse (`skipped_no_target`). Registry-sourced tmux ids are shape-validated (`isValidTmuxPane`/`isValidTmuxSession`) and a spoofed `TMUX_PANE` env is ignored. This removed the cached-pane and session-name send-keys fallbacks (legit peers always register a real pane; churn is handled by re-resolution). **#5 (debounce):** all wake paths funnel through `wakePeer`, which coalesces repeat wakes to the same peer within `OXTAIL_WAKE_DEBOUNCE_MS` (default 1s, in-memory per process) ⇒ `skipped_debounced`. **#7 (observability):** a `wake_outcome` trace event per wake; `oxtail diagnose` summarizes wake_status counts by tool from `MCP_TRACE_FILE`; a scheduled `codex-drift.yml` fails if Codex's `PASTE_ENTER_SUPPRESS_WINDOW` drifts past our 500ms gap. New modules: `src/wake-debounce.ts`, `src/diagnose.ts`; `chooseVerifiedWakePane` in `src/registry.ts`.
 - **Wake-on-reply (Slice 1, peer-messaging refinement push).** A `send_message` that carries `reply_to` now auto-wakes the original requester **by default** (explicit `wake:"off"` opts out), closing the observed stranding where a peer's async reply to an idle requester forced a human to relay it. The reply path is a separate, stricter gate than the lenient `wake:"auto"` path (`src/autowake.ts`): it fires only for a **fresh-idle** target (idle marker newer than `OXTAIL_AUTOWAKE_FRESH_IDLE_MS`, default 5m) — stale/unknown/missing/busy ⇒ `skipped_no_fresh_idle`, never a best-effort wake — and adds a **per-target rate limit** (`skipped_rate_limited`), a persistent **one-wake dedupe** keyed on `(session_id, reply_to)` (`skipped_deduped`, GC'd by age) to survive duplicate/late hook drains, an `OXTAIL_AUTOWAKE=off` kill-switch, and a best-effort `skipped_store_error` degrade so a broken dedupe store can never turn an already-enqueued reply into a tool error. Target is resolved by `client.session_id` with the pane re-resolved immediately before send-keys (no `server_pid`/stale-pane reuse). Response surfaces `wake_status` + `wake_reason:"reply_to_default"`. **Coverage caveat:** the fresh-idle gate keys on the busy/idle marker that only the Claude Code hooks maintain, so this slice reaches a **hooked Claude Code requester** (the observed case). A Codex / hookless-Claude requester has no idle marker ⇒ `skipped_no_fresh_idle` (reach it with explicit `wake:"auto"`); closing that direction is **Slice 2** (`expects_reply:true` — a requester-side waiter signal), deliberately not faked here with a blind `unknown ⇒ wake` that would reintroduce the active-waiter double-wake.
 - **Protocol hardening (v0.10.1).** `ask_peer` now stamps outbound messages with `request_id`; reply-to-capable peers answer with `send_message({ reply_to: request_id })`, and the waiter ignores stale same-peer messages. Explicit identity claims are monotonic, so stale automatic detection cannot clobber a real client session id. PreToolUse/Stop hook pushes are body-budgeted and labeled as peer context, not user authority.
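
The `ask_peer` correlation rule in the AGENTS.md hunk above can be sketched as a small predicate. This is a minimal illustration, not oxtail's waiter code: `classifyReply` and its arguments are hypothetical names, and only the field names (`from_session_id`, `reply_to`) and the `correlated` / `uncorrelated` labels come from the text.

```javascript
// Hypothetical helper (assumption): decides whether a mailbox message
// satisfies an ask_peer wait, following the rule described in AGENTS.md.
function classifyReply(msg, target, requestId, peerSupportsReplyTo) {
  // Only messages from the asked peer are ever candidates.
  if (msg.from_session_id !== target.session_id) return null;
  if (peerSupportsReplyTo) {
    // Upgraded peer: the reply must name this request explicitly;
    // anything else stays in the mailbox.
    return msg.reply_to === requestId ? "correlated" : null;
  }
  // Legacy peer: same-peer chatter may satisfy the wait, flagged as such.
  return "uncorrelated";
}

const target = { session_id: "peer-1" };
const chatter = { from_session_id: "peer-1", body: "unrelated" };
const answer = { from_session_id: "peer-1", reply_to: "req-42", body: "done" };

console.log(classifyReply(chatter, target, "req-42", true));  // null
console.log(classifyReply(answer, target, "req-42", true));   // correlated
console.log(classifyReply(chatter, target, "req-42", false)); // uncorrelated
```

The third call shows the compatibility limitation the text names: for a peer without the `reply_to` capability, stale chatter is indistinguishable from a real answer.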

package/README.md CHANGED

@@ -36,7 +36,7 @@ args = ["-y", "oxtail@0.10.1"]
 ```sh
 mkdir -p ~/.claude/commands
-curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.12.0/.claude/commands/oxtail-join.md \
+curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.13.0/.claude/commands/oxtail-join.md \
   -o ~/.claude/commands/oxtail-join.md
 ```
@@ -44,9 +44,9 @@ curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.12.0/.claude/command
 ```sh
 mkdir -p ~/.codex/skills/oxtail-join/agents
-curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.12.0/integrations/codex/oxtail-join/SKILL.md \
+curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.13.0/integrations/codex/oxtail-join/SKILL.md \
   -o ~/.codex/skills/oxtail-join/SKILL.md
-curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.12.0/integrations/codex/oxtail-join/agents/openai.yaml \
+curl -L https://raw.githubusercontent.com/d4j3y2k/oxtail/v0.13.0/integrations/codex/oxtail-join/agents/openai.yaml \
   -o ~/.codex/skills/oxtail-join/agents/openai.yaml
 ```
@@ -68,10 +68,11 @@ Contributing? `git clone https://github.com/d4j3y2k/oxtail && cd oxtail && npm i
 - `send_message` — **fire-and-forget** message to a peer. Target is a tmux session name or a raw `client_session_id` UUID. Body ≤ 8KB. Delivery is async via the peer's mailbox file. A plain message does **not** wake an idle peer; pass `wake: "auto"` to nudge one (state-gated — see [Waking an idle peer](#waking-an-idle-peer)). Replies to `ask_peer` should pass `reply_to: "<request_id>"` when the inbound message carries a `request_id` — and a reply **auto-wakes the requester by default** (strictly gated; `wake: "off"` opts out). (v0.5+)
 - `read_my_messages` — drain this session's mailbox and return any queued messages. Messages include `from_session_id`, server-stamped `origin: "peer"`, and optional `request_id` / `reply_to`. Codex peers (and unhooked Claude Code) poll this; Claude Code peers with the hooks installed see messages mid-turn (PreToolUse) or at turn end (Stop) instead. (v0.5+)
 - `ask_peer` — **delegate-and-wait**. Enqueues a message with a `request_id` and blocks server-side until the peer replies with `send_message({ reply_to: request_id })` or the timeout elapses. Default timeout is 45s (`OXTAIL_ASK_PEER_TIMEOUT_MS`), and each call may pass `timeout_ms`. New peers use strict `reply_to` correlation; legacy/no-capability peers fall back to best-effort first-message matching and the response reports `correlation: "uncorrelated"`. That legacy path may stale-match old same-peer chatter, so callers should treat `uncorrelated` as compatibility-only. Use `send_message` for fire-and-forget. (v0.7+)
+- `reply_to_message` — **reply by `message_id`**. The atomic, correlation-safe alternative to hand-wiring `send_message`'s `target` + `reply_to`: pass the `message_id` the hook or `read_my_messages` showed you and the server looks the inbound envelope up in this session's durable **received-ledger**, derives the reply target (the original sender), carries `reply_to: request_id` when the inbound was an `ask_peer` (keeping the exchange correlated), and stamps `source_message_id`. Replying to a plain `send_message` works too — it just omits `reply_to`. Ownership is structural (you can only reply to a message delivered to *you*); fail-closed on an unknown/aged-out id. Same wake semantics as `send_message`, including the wake-on-reply default. (v0.13+)
 - `register_my_session` — pin this MCP server's `session_id` directly. Kept for debugging; prefer `claim_session`.
 - `get_my_session` — return this MCP server's own registry entry plus a per-strategy detection diagnosis. Useful for debugging.
-See [design principles](https://github.com/d4j3y2k/oxtail/blob/v0.12.0/AGENTS.md) for scope and architecture.
+See [design principles](https://github.com/d4j3y2k/oxtail/blob/v0.13.0/AGENTS.md) for scope and architecture.
 ## Usage from an agent
@@ -90,6 +91,8 @@ send_message({ target: "<peer-uuid>", body: "...", reply_to: "<ask request_id>"
 read_my_messages()
 ask_peer({ target: "primary", body: "[Handoff] please audit X and tell me what you find" })
   // → blocks server-side until the peer replies via send_message, then returns their body
+reply_to_message({ message_id: "<id from the hook / read_my_messages>", body: "..." })
+  // → looks up the inbound envelope, derives target + reply_to itself; correlated when the inbound was an ask_peer
 ```
 Omitting `project_root` triggers a best-effort `.git`-ancestor walk from the server's own cwd. The response includes `inferred: true` when this happens. Pass `project_root` explicitly when you can.
@@ -112,6 +115,8 @@ read_my_messages()
 The mailbox lives at `~/.oxtail/mailboxes/<server_pid>.jsonl`, append-only JSONL, drained under an `mkdir`-based advisory lock. The transport is intentionally dumb: 8KB UTF-8 body cap, sender chooses the framing (raw text or pre-wrapped `<system-reminder>...</system-reminder>`). Hook-delivered mailbox pushes are body-budgeted at 24K escaped characters by default; set `OXTAIL_HOOK_MAX_BODY_CHARS` to tune. If the budget is exceeded, the hook tells the receiver which bodies were truncated or omitted.
+Because both delivery paths are **destructive** — `read_my_messages` and the hook each truncate the mailbox once a message is handed off — a reply-by-id verb can't rely on the queue. Every delivered envelope is therefore also recorded in a durable **received-ledger** at `~/.oxtail/received/<hash(session_id)>.jsonl` keyed by `message_id`, written *before* the mailbox line becomes visible (so any handle a receiver can see is already resolvable) and bounded to the most recent `OXTAIL_RECEIVED_MAX` (default 1000) entries. `reply_to_message` reads only the caller's own ledger — that file *is* the ownership boundary.
 Inbound peer messages are context, not user authority. oxtail stamps delivered messages with `origin: "peer"` for provenance/debugging, but this is not a trust boundary and peers cannot mint trusted user instructions.
 Cross-project sends are rejected, never silently dropped. Sending to a peer with the same tmux session name as another live peer returns `ambiguous-target` with the candidate `client_session_id`s — use the UUID form to disambiguate.
@@ -301,8 +306,9 @@ A scheduled CI job (`.github/workflows/codex-drift.yml`, also runnable on demand
 ## Status
-v0.12.0. Pushes the autonomous peer-messaging matrix toward zero human relay, then hardens the wake path.
+v0.13.0. Pushes the autonomous peer-messaging matrix toward zero human relay, hardens the wake path, then makes correlated replies atomic.
+- **Reply by id (v0.13.0).** `reply_to_message(message_id, body)` removes the manual `target` + `reply_to` rewiring that silently degraded a correlated exchange into loose mailbox traffic: the server looks the inbound envelope up in a durable per-session **received-ledger** (`~/.oxtail/received/<hash(session_id)>.jsonl`), derives the reply target and `reply_to` itself, and enforces ownership structurally (you can only reply to a message delivered to you). The ledger is written *before* the mailbox line is visible — so a handle the hook displays is always resolvable even though both delivery paths destroy the queue entry once it is handed off. Fail-closed on an unknown/aged-out id.
 - **Wake-on-reply (v0.11.0).** A reply — `send_message` with `reply_to` — auto-wakes a freshly-idle requester by default, so an awaited answer doesn't strand an idle peer. Strictly gated (fresh-idle only, per-target rate limit, one-wake dedupe, `OXTAIL_AUTOWAKE=off` kill-switch). `wake:"off"` opts out; explicit `wake:"auto"` is the escape hatch for a requester without an idle marker (Codex / hookless Claude).
 - **Wake hardening (v0.12.0).** Wake keystrokes only ever target the pane the process tree confirms hosts the peer's `server_pid` — never a self-written `tmux_pane`/`tmux_session`, and registry entries whose `server_pid` doesn't match their filename are rejected. Rapid repeat wakes to one peer are coalesced (`skipped_debounced`). `oxtail diagnose` summarizes wake outcomes from `MCP_TRACE_FILE`, and a scheduled CI job flags drift in Codex's paste-burst window before it can break the wake.
 - **Correlated delegate-and-wait.** `ask_peer` now sends a `request_id`; upgraded peers reply with `send_message({ reply_to })`, and the waiter ignores same-peer chatter that does not match. Legacy peers are still supported, but their replies are marked `correlation: "uncorrelated"`.
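
The reply-by-id behavior the README describes (derive `target`, `reply_to` and `source_message_id` from the inbound envelope, fail closed otherwise) can be sketched as a pure function. This is a simplified model under stated assumptions: `planReply` is a hypothetical name and the ledger is modeled as an in-memory array. The error strings and field names are the ones that appear in this diff.

```javascript
// Hypothetical pure function (assumption): the real tool also resolves the
// peer, delivers the message and handles wake; this only plans the reply.
function planReply(mySessionId, ledger, messageId) {
  if (!mySessionId) return { ok: false, error: "no-session-id" };
  const inbound = ledger.find((m) => m.id === messageId);
  // Fail closed: never guess a target for an unknown or aged-out id.
  if (!inbound) return { ok: false, error: "message-not-found" };
  if (!inbound.from_session_id) return { ok: false, error: "no-reply-target" };
  return {
    ok: true,
    target: inbound.from_session_id,  // the original sender
    reply_to: inbound.request_id,     // undefined for a plain send_message
    source_message_id: messageId,
    correlation: inbound.request_id ? "correlated" : "uncorrelated",
  };
}

const ledger = [
  { id: "a1", from_session_id: "peer-1", request_id: "req-7", body: "audit X?" },
  { id: "b2", from_session_id: "peer-2", body: "fyi" },
];

console.log(planReply("me", ledger, "a1").correlation); // correlated
console.log(planReply("me", ledger, "b2").correlation); // uncorrelated
console.log(planReply("me", ledger, "zz").error);       // message-not-found
```

Because the function only ever receives the caller's own ledger, it cannot produce a reply to a message delivered to someone else, which is the structural ownership property the README points to.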

package/dist/delivery.js ADDED

@@ -0,0 +1,32 @@
+import * as mailbox from "./mailbox.js";
+import { recordReceived } from "./received.js";
+import { trace } from "./trace.js";
+// Deliver a message to a peer's mailbox, recording the durable reply-handle in
+// the receiver's ledger BEFORE the mailbox line becomes visible. The ordering is
+// the correctness guarantee: a hook/poll drainer can only observe the mailbox
+// line after the append, which happens strictly after the ledger write — so any
+// message_id a receiver can drain/render already has a ledger entry behind it.
+// The reverse order (append, then record) left a window where the hook rendered
+// a handle reply_to_message could not yet resolve (the race Codex caught).
+//
+// receiverSessionId may be null/empty (an unclaimed peer): then there is no
+// ledger to own the handle and we skip the record — reply_to_message simply
+// won't find it, which is the documented fall-back-to-send_message path.
+//
+// The ledger write is best-effort: a ledger failure must NEVER drop the actual
+// delivery. Worst case the reply handle is missing and the peer falls back to
+// send_message — never the reverse (a visible line with no handle on success),
+// because record precedes append.
+export function deliverToPeer(receiverSessionId, targetPid, body, fromSessionId, options = {}) {
+    const msg = mailbox.buildMessage(body, fromSessionId, options);
+    if (receiverSessionId) {
+        try {
+            recordReceived(receiverSessionId, msg);
+        }
+        catch (e) {
+            trace("received_ledger_write_failed", { message_id: msg.id, error: String(e) });
+        }
+    }
+    mailbox.requeue(targetPid, msg);
+    return msg;
+}
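
The record-before-append ordering in `deliverToPeer` can be illustrated with in-memory stand-ins. In this sketch the `Map` ledger, the array mailbox, `makeDelivery` and the `failLedger` switch are all assumptions made for illustration; only the ordering and the best-effort ledger write mirror the code above.

```javascript
// In-memory stand-ins (assumptions): a Map plays the received-ledger,
// an array plays the mailbox file, and `events` records the step order.
function makeDelivery({ failLedger = false } = {}) {
  const ledger = new Map();
  const mailbox = [];
  const events = [];
  function deliver(msg) {
    try {
      if (failLedger) throw new Error("ledger unavailable");
      ledger.set(msg.id, msg); // step 1: record the reply handle
      events.push("record");
    } catch (e) {
      events.push("record_failed"); // best-effort: swallow and continue
    }
    mailbox.push(msg); // step 2: only now does the line become visible
    events.push("append");
  }
  // Destructive drain: hands over everything and empties the queue.
  function drain() {
    return mailbox.splice(0, mailbox.length);
  }
  return { ledger, mailbox, events, deliver, drain };
}

const okPath = makeDelivery();
okPath.deliver({ id: "m1", body: "hello" });
const seen = okPath.drain();
console.log(okPath.events.join(" -> "));     // record -> append
console.log(okPath.ledger.has(seen[0].id));  // true
console.log(okPath.mailbox.length);          // 0

const degraded = makeDelivery({ failLedger: true });
degraded.deliver({ id: "m2", body: "still delivered" });
console.log(degraded.events.join(" -> "));   // record_failed -> append
console.log(degraded.drain().length);        // 1
console.log(degraded.ledger.has("m2"));      // false
```

On the normal path every id a drainer can see is already in the ledger, even after the queue is emptied. On the degraded path the message still arrives and only the reply handle is missing, which matches the fall-back-to-`send_message` case described in the comments.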

package/dist/mailbox.js CHANGED

@@ -107,8 +107,12 @@ export function serializeMailboxLine(msg) {
     }
     return line;
 }
-export function enqueue(target_pid, body, from_session_id, options = {}) {
-    const msg = {
+// Mint a message envelope WITHOUT writing it anywhere. Split out from enqueue so
+// a higher layer (delivery.ts) can record the durable received-ledger entry
+// BEFORE the mailbox line becomes visible — the ordering that guarantees any
+// message_id a receiver can drain/render already has a ledger entry behind it.
+export function buildMessage(body, from_session_id, options = {}) {
+    return {
         schema_version: 1,
         id: randomBytes(8).toString("hex"),
         body,
@@ -120,10 +124,12 @@ export function enqueue(target_pid, body, from_session_id, options = {}) {
         ...(options.reply_to ? { reply_to: options.reply_to } : {}),
         ...(options.source_message_id ? { source_message_id: options.source_message_id } : {}),
     };
-    const line = serializeMailboxLine(msg);
+}
+export function enqueue(target_pid, body, from_session_id, options = {}) {
+    const msg = buildMessage(body, from_session_id, options);
     acquireLock(target_pid);
     try {
-        appendFileSync(mailboxPath(target_pid), line);
+        appendFileSync(mailboxPath(target_pid), serializeMailboxLine(msg));
     }
     finally {
         releaseLock(target_pid);

package/dist/received.js ADDED

@@ -0,0 +1,176 @@
+import { createHash } from "node:crypto";
+import { mkdirSync, readFileSync, rmdirSync, statSync, writeFileSync, } from "node:fs";
+import { homedir } from "node:os";
+import { join } from "node:path";
+import { trace } from "./trace.js";
+// The received-message ledger: a durable, per-session index of every inbound
+// envelope, keyed by message_id. It exists because both delivery paths are
+// DESTRUCTIVE — mailbox.drain() truncates the queue to 0 after a read, and the
+// PreToolUse hook does `:> "$m"` after rendering messages into model context.
+// So once a message is delivered, the mailbox no longer holds it. A reply verb
+// (reply_to_message) that looks a message up by id therefore cannot rely on the
+// mailbox; it needs this separate ledger.
+//
+// Correctness hinges on ORDERING, enforced by delivery.ts: the ledger entry is
+// written BEFORE the mailbox line becomes visible. A drainer can only observe
+// the line after the append, which happens strictly after this write — so any
+// message_id a receiver can see has a ledger entry behind it. (The reverse order
+// left a window where the hook rendered a handle reply_to_message couldn't yet
+// resolve — the race Codex caught in review.)
+//
+// Ownership is structural: the ledger lives at received/<hash(session_id)>, and
+// lookups only ever read the caller's own file. You can only reply to a message
+// that was delivered to YOU.
+function receivedDir() {
+    // Resolved lazily so tests can swap HOME between cases (mirrors mailbox.ts).
+    return join(homedir(), ".oxtail", "received");
+}
+// Hash the session_id into the filename (mirrors claims.ts) so two distinct ids
+// can never collide onto one ledger file — a lossy character-sanitize could map
+// different sessions to the same path. UUIDs are already path-safe; the hash is
+// defensive and collision-free.
+function ledgerKey(sessionId) {
+    return createHash("sha256").update(sessionId).digest("hex").slice(0, 32);
+}
+function ledgerPath(sessionId) {
+    return join(receivedDir(), `${ledgerKey(sessionId)}.jsonl`);
+}
+function lockPath(sessionId) {
+    return `${ledgerPath(sessionId)}.lock`;
+}
+// Lock idiom mirrors mailbox.ts (mkdir-based, staleness-cleared). The ledger
+// read-modify-write is small (bounded by receivedMax() lines) so the lock
+// window is short.
+const LOCK_STALE_MS = 30_000;
+const LOCK_RETRY_LIMIT = 50;
+const LOCK_RETRY_DELAY_MS = 10;
+// Bounded retention: keep at most this many of the most-recent inbound messages
+// per session. Read lazily so tests can tune it per-case. Generous by default so
+// a realistic mailbox burst (read_my_messages budgets 50/drain) can't push a
+// just-displayed handle out of the ledger before the receiver replies; when the
+// cap DOES bite, recordReceived traces the drop so it is never silent.
+export function receivedMax() {
+    const n = Number(process.env.OXTAIL_RECEIVED_MAX);
+    return Number.isFinite(n) && n > 0 ? Math.floor(n) : 1000;
+}
+function sleepSync(ms) {
+    Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
+}
+function acquireLock(sessionId) {
+    mkdirSync(receivedDir(), { recursive: true, mode: 0o700 });
+    const lock = lockPath(sessionId);
+    for (let i = 0; i < LOCK_RETRY_LIMIT; i++) {
+        try {
+            mkdirSync(lock, { mode: 0o700 });
+            return;
+        }
+        catch (e) {
+            const err = e;
+            if (err.code !== "EEXIST")
+                throw err;
+            try {
+                const st = statSync(lock);
+                if (Date.now() - st.mtimeMs > LOCK_STALE_MS) {
+                    try {
+                        rmdirSync(lock);
+                        trace("received_lock_stale_clear", { session_id: sessionId });
+                    }
+                    catch {
+                        // raced with another clearer; fall through to retry
+                    }
+                    continue;
+                }
+            }
+            catch {
+                // stat may race; just retry
+            }
+            sleepSync(LOCK_RETRY_DELAY_MS);
+        }
+    }
+    throw new Error(`could not acquire received-ledger lock for ${sessionId}`);
+}
+function releaseLock(sessionId) {
+    try {
+        rmdirSync(lockPath(sessionId));
+    }
+    catch {
+        // ignore ENOENT / not-empty / EPERM
+    }
+}
+function readLines(sessionId) {
+    try {
+        const raw = readFileSync(ledgerPath(sessionId), "utf8");
+        if (!raw)
+            return [];
+        return raw.split("\n").filter((l) => l.length > 0);
+    }
+    catch (e) {
+        const err = e;
+        if (err.code === "ENOENT")
+            return [];
+        throw err;
+    }
+}
+// Append an inbound envelope to the receiver's ledger and prune to receivedMax()
+// (oldest dropped first). Called by delivery.ts BEFORE the mailbox append.
+export function recordReceived(receiverSessionId, msg) {
+    if (!receiverSessionId)
+        return;
+    acquireLock(receiverSessionId);
+    try {
+        const lines = readLines(receiverSessionId);
+        lines.push(JSON.stringify(msg));
+        const max = receivedMax();
+        let pruned = lines;
+        if (lines.length > max) {
+            pruned = lines.slice(lines.length - max);
+            // No silent caps: a dropped handle becomes reply_to_message
+            // "message-not-found", so surface that the bound bit.
+            trace("received_ledger_pruned", {
+                session_id: receiverSessionId,
+                dropped: lines.length - max,
+                kept: max,
+            });
+        }
+        writeFileSync(ledgerPath(receiverSessionId), pruned.join("\n") + "\n", {
+            mode: 0o600,
+        });
+    }
+    finally {
+        releaseLock(receiverSessionId);
+    }
+}
+// Look up a previously-received envelope by message_id in this session's ledger.
+// Newest-first scan (ids are unique, so the first match is the only match).
+// Returns null when not found / aged out — the fail-closed signal the reply
+// verb turns into message-not-found. Read under the same lock so a concurrent
+// recordReceived rewrite can't be observed half-written.
+export function lookupReceived(receiverSessionId, messageId) {
+    if (!receiverSessionId)
+        return null;
+    acquireLock(receiverSessionId);
+    try {
+        const lines = readLines(receiverSessionId);
+        for (let i = lines.length - 1; i >= 0; i--) {
+            let parsed;
+            try {
+                parsed = JSON.parse(lines[i]);
+            }
+            catch {
+                continue;
+            }
+            if (parsed &&
+                typeof parsed === "object" &&
+                parsed.id === messageId) {
+                return parsed;
+            }
+        }
+        return null;
+    }
+    finally {
+        releaseLock(receiverSessionId);
+    }
+}
+export function receivedFilePath(sessionId) {
+    return ledgerPath(sessionId);
+}
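
The retention and lookup logic in `received.js` can be exercised without touching the filesystem. The sketch below is a simplified model: `parseMax`, `appendAndPrune` and `lookupNewestFirst` are hypothetical names operating on an in-memory array of JSONL lines, with no locking. The parsing rule, the oldest-first pruning and the newest-first scan follow the code above.

```javascript
// Mirrors the receivedMax() parsing rule: any non-finite or non-positive
// value falls back to the default; fractional values are floored.
function parseMax(raw, fallback = 1000) {
  const n = Number(raw);
  return Number.isFinite(n) && n > 0 ? Math.floor(n) : fallback;
}

// Append one serialized envelope and keep only the newest `max` lines.
function appendAndPrune(lines, entry, max) {
  const next = [...lines, entry];
  const dropped = Math.max(0, next.length - max);
  return { kept: next.slice(dropped), dropped };
}

// Scan newest-first, skipping lines that fail to parse; null means
// "not found or aged out".
function lookupNewestFirst(lines, id) {
  for (let i = lines.length - 1; i >= 0; i--) {
    let parsed;
    try {
      parsed = JSON.parse(lines[i]);
    } catch {
      continue;
    }
    if (parsed && typeof parsed === "object" && parsed.id === id) return parsed;
  }
  return null;
}

console.log(parseMax(undefined)); // 1000
console.log(parseMax("2.9"));     // 2
console.log(parseMax("-5"));      // 1000

const start = [JSON.stringify({ id: "a" }), JSON.stringify({ id: "b" })];
const { kept, dropped } = appendAndPrune(start, JSON.stringify({ id: "c" }), 2);
console.log(dropped);                            // 1
console.log(lookupNewestFirst(kept, "a"));       // null
console.log(lookupNewestFirst(kept, "c").id);    // c
```

The `null` for `"a"` is the aged-out case: once the cap bites, the oldest handle stops resolving, which is why the real code emits a `received_ledger_pruned` trace rather than dropping entries silently.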

package/dist/server.js CHANGED

@@ -12,6 +12,8 @@ import { isAbstain } from "./detect/index.js";
 import { trace } from "./trace.js";
 import { buildEntry, chooseVerifiedWakePane, findByTmuxSession, readAll, refreshTmuxBinding, register, sessionPidsForId, unregister, } from "./registry.js";
 import * as mailbox from "./mailbox.js";
+import * as received from "./received.js";
+import { deliverToPeer } from "./delivery.js";
 import { recoverClaim, resolveAncestors, writeClaim } from "./claims.js";
 import { decideReplyAutoWake, defaultAutowakeDir } from "./autowake.js";
 import { markWoke, newWakeDebounceStore, recentlyWoke } from "./wake-debounce.js";
@@ -1053,7 +1055,11 @@ server.registerTool("send_message", {
     }
     const peer = resolved.entry;
     const fromSessionId = entry.client.session_id ?? undefined;
-    const msg = mailbox.enqueue(peer.server_pid, body, fromSessionId, {
+    // deliverToPeer records the durable reply-handle in the recipient's ledger
+    // BEFORE the mailbox line is visible, so a later reply_to_message(message_id)
+    // resolves even after the destructive mailbox/hook drain — and never sees a
+    // displayed-but-unrecorded handle (record precedes append).
+    const msg = deliverToPeer(peer.client.session_id, peer.server_pid, body, fromSessionId, {
         reply_to,
         source_message_id,
     });
@@ -1076,6 +1082,100 @@ server.registerTool("send_message", {
         ...(wake_reason ? { wake_reason } : {}),
     });
 });
+server.registerTool("reply_to_message", {
+    description: [
+        "Reply to a specific inbound peer message by its message_id — the atomic, correlation-safe alternative to hand-wiring send_message's target + reply_to. The server looks the message up in this session's durable received-ledger, so you pass only the message_id the PreToolUse hook or read_my_messages already showed you; it derives the reply target (the original sender), carries reply_to=request_id when the inbound was an ask_peer (keeping the exchange correlated), and sets source_message_id for provenance. Replying to a plain send_message works too — it just omits reply_to. Ownership is structural: you can only reply to a message delivered to you.",
+        "Delivery + wake match send_message exactly, including the wake-on-reply default: when the inbound carried a request_id and you leave wake unset, a freshly-idle requester is auto-woken; pass wake:\"auto\" to nudge any idle peer, or wake:\"off\" to suppress. Fail-closed: an unknown or aged-out message_id returns error message-not-found instead of guessing a target.",
+    ].join(" "),
+    inputSchema: {
+        message_id: z
+            .string()
+            .min(1)
+            .describe("The message_id of the inbound peer message you are replying to, as shown by the PreToolUse hook or read_my_messages."),
+        body: z
+            .string()
+            .min(1)
+            .refine((s) => Buffer.byteLength(s, "utf8") <= 8192, {
+            message: "body exceeds 8192 UTF-8 bytes",
+        })
+            .describe("Reply body, ≤8KB UTF-8. Verbatim."),
+        wake: z
+            .enum(["off", "auto"])
+            .optional()
+            .describe('Wake strategy, same semantics as send_message. Unset: wake-on-reply default (auto-wakes a freshly-idle requester when the inbound was an ask_peer). "auto": nudge any idle peer. "off": no nudge.'),
+    },
+}, async ({ message_id, body, wake }) => {
+    const myId = entry.client.session_id;
+    if (!myId) {
+        return jsonResult({
+            schema_version: 1,
+            ok: false,
+            error: "no-session-id",
+            message: "This session has not claimed a session_id, so it has no received-ledger to reply from. Call claim_session first.",
+        });
+    }
+    const inbound = received.lookupReceived(myId, message_id);
+    if (!inbound) {
+        return jsonResult({
+            schema_version: 1,
+            ok: false,
+            error: "message-not-found",
+            message: `No received message ${message_id} in this session's ledger (it may have aged out of retention, or predates reply_to_message). Fall back to send_message with an explicit target.`,
+        });
+    }
+    const targetSid = inbound.from_session_id;
+    if (!targetSid) {
+        return jsonResult({
+            schema_version: 1,
+            ok: false,
+            error: "no-reply-target",
+            message: `Inbound message ${message_id} has no from_session_id, so there is no peer to reply to.`,
+        });
+    }
+    const replyTo = inbound.request_id; // undefined when the inbound was a plain send_message
+    const resolved = resolveTarget(targetSid, entry);
+    if (!resolved.ok) {
+        const replyDefault = replyAutoWakeTriggered(wake, replyTo);
+        const wakeIntended = wake === "auto" || replyDefault;
+        const wake_status = wakeIntended ? resolveErrorWakeStatus(resolved.error) : undefined;
+        return jsonResult({
+            schema_version: 1,
+            ...resolved,
+            in_reply_to_message_id: message_id,
+            original_from_session_id: targetSid,
+            ...(wake_status ? { wake_status } : {}),
+            ...(replyDefault ? { wake_reason: "reply_to_default" } : {}),
+        });
+    }
+    const peer = resolved.entry;
+    const fromSessionId = entry.client.session_id ?? undefined;
+    // Record the reply itself into the original asker's ledger (record-before-
+    // append) so replies can be replied to in turn — chained correlation.
+    const msg = deliverToPeer(peer.client.session_id, peer.server_pid, body, fromSessionId, {
+        reply_to: replyTo,
+        source_message_id: message_id,
+    });
+    const { wake_status, wake_reason } = await resolveSendWake(peer, wake, replyTo);
+    if (wake_status) {
+        trace("wake_outcome", {
+            via: wake_reason === "reply_to_default" ? "reply_default" : "reply_to_message",
+            wake_status,
+            target_session_id: peer.client.session_id,
+            client_type: peer.client.type,
+        });
+    }
+    return jsonResult({
+        schema_version: 1,
+        ok: true,
+        message_id: msg.id,
+        in_reply_to_message_id: message_id,
+        target_session_id: peer.client.session_id,
+        target_server_pid: peer.server_pid,
+        correlation: replyTo ? "correlated" : "uncorrelated",
+        ...(wake_status ? { wake_status } : {}),
+        ...(wake_reason ? { wake_reason } : {}),
+    });
+});
 // read_my_messages budget. A session's union drain can return a backlog; cap
 // how much one call hands back so a flood (or a peer spamming near-8KB bodies)
 // can't blow the caller's context in a single drain. Overflow is NOT dropped or
@@ -1556,7 +1656,9 @@ server.registerTool("ask_peer", {
     const requestId = randomBytes(8).toString("hex");
     const requireReplyTo = peerSupportsReplyTo(peer);
     const fromSessionId = entry.client.session_id ?? undefined;
-    const msg = mailbox.enqueue(peer.server_pid, body, fromSessionId, {
+    // Record-before-append (mirrors send_message): lets the peer answer with
+    // reply_to_message(message_id) instead of hand-wiring target + reply_to.
+    const msg = deliverToPeer(expectedSessionId, peer.server_pid, body, fromSessionId, {
         request_id: requestId,
     });
     const startedAt = Date.now();

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "oxtail",
-  "version": "0.12.0",
+  "version": "0.13.0",
   "private": false,
   "type": "module",
   "description": "Coordination layer for parallel AI coding agent sessions, exposed over MCP.",