npm - zidane - Versions diffs - 5.1.19 → 5.1.21 - Mend

zidane 5.1.19 → 5.1.21

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/dist/tools-CMVruxF0.js.map +1 -1
package/dist/tui.js +4 -0
package/dist/tui.js.map +1 -1
package/docs/ARCHITECTURE.md +34 -2
package/docs/CHAT.md +104 -33
package/docs/SKILL.md +3 -2
package/docs/TUI.md +25 -8
package/package.json +1 -1

package/docs/ARCHITECTURE.md

@@ -176,9 +176,40 @@ Built-in tools are opinionated about output sizes — drop your v2 `tool:transfo
 | `shell` | Tail-priority truncation at `maxOutputBytes=32768` (32 KiB, combined stdout+stderr). Head trim marker: `…(N bytes truncated from head)…`. `0` disables. UTF-8 never splits mid-codepoint. Appends `(exit N, Nms)` footer + surfaces non-empty stderr by default (`metadata: false` opts out). |
 | `write_file` | Reads existing content; returns `Created` / `Updated` / `No change needed: …` so the model detects no-ops without a separate read. Race window in shared docker/sandbox contexts documented and accepted. |
 | `edit` | Fails clearly on non-unique `old_string` (unless `replace_all: true`). On not-found, includes a nearest-match preview so the model recovers without re-reading. |
-| `multi_edit` | Sequential edits to one file. **Ungated** (no `_outcomes` in input — the SDK default): atomic, first failure aborts. **Gated** (chat layer injects `_outcomes` per-hunk via `injectOutcomesIntoInput`): each hunk reported independently into a structured `Edited <path>: N/M applied · K denied · …` body — `parseEditOutcomesFromResult` re-parses for transcript replay. See CHAT.md → Per-edit approval. |
+| `multi_edit` | Sequential edits to one file. **Single-mode atomic** — applies every edit in input order against the file as left by the previous step; first tool-level failure aborts the batch with the legacy `multi_edit error: edit #N <reason>` string, success returns `Edited <path>: applied N edits (M replacements).`. The tool body knows nothing about approvals: per-hunk decisions are a host concern enforced upstream — the chat layer's `tool:gate` rebinds `ctx.input.edits` to the approved subset (rebind, not mutate — the model's original `tool_call` block in `session.turns` stays untouched) before the body runs, and a `tool:transform` hook appends an `<edit-outcomes>` block to the result so live + replayed transcripts share the same per-hunk view (`parseEditOutcomesFromResult` re-parses on reload). See CHAT.md → Per-edit approval. |
 | `grep` | Wraps `rg` when present (with explicit `.` path to avoid stdin hangs). Bun.Glob fallback otherwise. `head_limit=250`, `offset` paginates. |
+## Per-edit approval — harness purity
+The harness has no notion of "partial approval". Edit-family tools (`edit`, `multi_edit`, `write_file`) are single-mode and atomic; per-hunk decisions live entirely above the loop. The split is deliberate — SDK consumers (CI agents, headless pipelines, custom hosts) keep the legacy contract; the per-hunk UX is purely a chat-layer concern.
+Three loop-visible artifacts carry the decision through:
+1. **Input rebind at `tool:gate`** — the host's gate handler (`src/tui/app.tsx`'s `applyGate`) computes the approved subset, then assigns a fresh shallow-clone to `ctx.input` whose `edits` array is filtered. The model's original `tool_call.input` in `session.turns` is never mutated (the rebind produces a new object); the tool body sees the smaller, all-approved batch and runs unchanged.
+2. **Pending-annotation map** — keyed by `tool_call.id`, holds the 1:1 `EditOutcome[]` (over the model's ORIGINAL hunks). Lives in the host's React tree (`pendingAnnotationsRef` in the TUI).
+3. **`tool:transform` annotation** — the host appends an `<edit-outcomes>` block to `ctx.result` when at least one hunk wasn't applied. Bubbles to `child:tool:transform` for subagent-issued calls via `BUBBLED_MUTABLE_EVENTS`.
+**Wire format** (canonical shape for live emit + persisted replay):
+```
+Edited path/to/foo.ts: applied 2 edits (3 replacements).
+<edit-outcomes>
+#1 applied
+#2 denied: denied by user
+#3 applied
+</edit-outcomes>
+```
+- Opening + closing tags each on their own line.
+- One line per hunk: `#<1-based-index> <kind>[: <reason>]`.
+- `kind ∈ {applied | denied | skipped | failed}`.
+- Block emitted ONLY when at least one hunk is NOT applied. All-applied calls fall through to the legacy summary alone.
+A fully-denied call (every hunk rejected) skips the substitute path; the harness writes `Blocked: User denied this tool call` as the tool_result and the host emits a synthetic `tool-result` event with body `[fully denied] <edit-outcomes>…</edit-outcomes>` for live display only (persisted history stays terse).
+Replay path: `parseEditOutcomesFromResult(text)` (from `zidane/chat`) recovers the `EditOutcome[]` from the annotation block. `eventsFromTurns` pairs `tool_call` ↔ `tool_result` by `callId` and re-attaches the outcomes onto the `'tool'` event so reloaded transcripts paint identical per-hunk badges.
 ## Tool argument auto-coercion
 `validateToolArgs` runs between `tool:gate` and `tool:before`:
@@ -459,11 +490,12 @@ The child's lifecycle also bubbles to the parent hook surface with `childId` + `
 ```
 child:stream:text / child:stream:thinking / child:stream:end
 child:tool:gate / child:mcp:tool:gate          ← share the child's ctx — parent mutations propagate
+child:tool:transform                           ← share the child's ctx — parent mutations propagate
 child:tool:before / child:tool:after / child:tool:error
 child:turn:after
 ```
-The two `child:*:gate` events are special: the bubbled ctx is the same reference the child loop awaits on, so a parent listener can refuse / substitute a subagent's tool call without registering on the child agent.
+`BUBBLED_MUTABLE_EVENTS` (`src/tools/spawn.ts`) is the canonical list: `tool:gate`, `mcp:tool:gate`, `tool:transform`. The bubbled ctx is the same reference the child loop awaits on, so a parent listener can refuse / substitute / annotate a subagent's tool call without registering on the child agent. The chat layer's per-edit annotation flow registers on both `tool:transform` and `child:tool:transform` so subagent-issued `multi_edit` calls also get `<edit-outcomes>` blocks appended to their results.
 ## Dependency Graph
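
Reviewer note on the ARCHITECTURE.md hunks above: the `<edit-outcomes>` wire format is small enough to sketch end to end. The block below is a hypothetical re-implementation written only from the rules stated in the diff (tags on their own lines, one `#<1-based-index> <kind>[: <reason>]` line per hunk, `null` when the block is missing or malformed). The names `buildAnnotation` and `parseAnnotation` are stand-ins, not the package's `buildEditOutcomesAnnotation` / `parseEditOutcomesFromResult`, and the strict `1..N` ordering check is an assumption.

```typescript
// Hypothetical sketch of the <edit-outcomes> wire format described above.
// Not the zidane implementation; names and strictness are assumptions.
type EditOutcomeKind = 'applied' | 'denied' | 'skipped' | 'failed'

interface EditOutcome {
  kind: EditOutcomeKind
  reason?: string
}

const KINDS: readonly EditOutcomeKind[] = ['applied', 'denied', 'skipped', 'failed']

// Render one line per hunk: `#<1-based-index> <kind>[: <reason>]`,
// wrapped in opening and closing tags that each sit on their own line.
function buildAnnotation(outcomes: readonly EditOutcome[]): string {
  const lines = outcomes.map((o, i) =>
    o.reason ? `#${i + 1} ${o.kind}: ${o.reason}` : `#${i + 1} ${o.kind}`,
  )
  return ['<edit-outcomes>', ...lines, '</edit-outcomes>'].join('\n')
}

// Recover a 0-based EditOutcome[] from a tool result body.
// Returns null when the block is missing or any line inside it is malformed.
function parseAnnotation(text: string): EditOutcome[] | null {
  const lines = text.split('\n')
  const open = lines.indexOf('<edit-outcomes>')
  if (open === -1) return null
  const close = lines.indexOf('</edit-outcomes>', open + 1)
  if (close === -1) return null

  const outcomes: EditOutcome[] = []
  for (const line of lines.slice(open + 1, close)) {
    const m = /^#(\d+) (\w+)(?:: (.*))?$/.exec(line)
    if (!m) return null
    const kind = m[2] as EditOutcomeKind
    if (KINDS.indexOf(kind) === -1) return null
    // Assumption: indexes must run 1..N in order, 1:1 with the original hunks.
    if (Number(m[1]) !== outcomes.length + 1) return null
    const reason = m[3]
    outcomes.push(reason === undefined ? { kind } : { kind, reason })
  }
  return outcomes
}
```

Keeping the legacy summary line outside the block means a consumer that only reads the first line still sees the pre-existing contract, while a replay path can look for the tags and ignore everything else.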

package/docs/CHAT.md

@@ -88,7 +88,9 @@ const agent = createAgent({
 | Hook | Purpose | Consumer in chat layer |
 |---|---|---|
-| `tool:gate`, `child:tool:gate`, `mcp:tool:gate`, `child:mcp:tool:gate` | Approval gate | `useSafeModeActions().requestApproval` + (for file-edit tools) `resolveApprovalForPayload` / `injectOutcomesIntoInput`. See **Per-edit approval**. |
+| `tool:gate`, `child:tool:gate`, `mcp:tool:gate`, `child:mcp:tool:gate` | Approval gate | `useSafeModeActions().requestApproval(name, input, originator?)` + (for file-edit tools) `resolveApprovalForPayload`; partial → rebind `ctx.input` to the approved subset + stash outcomes in a pending-annotation map. See **Per-edit approval**. |
+| `tool:transform`, `child:tool:transform` | Per-edit annotation | Look up `callId` in the pending-annotation map and append `buildEditOutcomesAnnotation(outcomes)` to `ctx.result`. Clear the entry. |
+| `agent:done` | Stranded-annotation sweep | `pendingAnnotations.clear()` — covers completed / aborted / error paths so a never-fired `tool:transform` doesn't leak. |
 | `mcp:auth:required`, `mcp:auth:url`, `mcp:auth:success`, `mcp:auth:error`, `mcp:connect` | OAuth badge state | `useMcpAuthDispatch` → `reduceMcpAuth` |
 | `stream:thinking`, `stream:text`, `child:stream:thinking`, `child:stream:text` | Streaming deltas | `useStreamBuffer().queueStreamDelta` |
 | `tool:before`, `tool:after`, `mcp:tool:after`, `child:tool:before`, `child:tool:after` | Tool call/result events | `stream.appendImmediate` |
@@ -123,8 +125,8 @@ The table below indexes every named export; sections further down dive into the
 | `config` + `config-context` | `resolveConfig`, `useConfig`, `ChatOptions`, `ResolvedConfig`, `ResolvedPaths`, `ModelInfo`, `ProviderRegistry`. See **Required options**. |
 | `credentials` | AI-provider credential store. `setProviderCredential`, `readProviderCredential`, `removeProviderCredential`, `readCredentials`, `writeCredentials`, `credentialsPath`, `applyApiKeyEnv` (called by `resolveConfig` before any factory runs). Owner-only file mode. |
 | `discovery-context` + `discovery-slot` | Live catalog plumbing: `DiscoveryProvider`, `useDiscovery`, `useDiscoveryOptional`, `createDiscoverySlot` (stale-while-revalidate primitive). Propagates fresh catalogs (files, skills, MCPs) into open modals. See **Discovery context** below. |
-| `edit-approval` | Pure helpers bridging the safe-mode gate to the `multi_edit` tool's per-edit `_outcomes` side channel — `resolveApprovalForPayload`, `maskToOutcomeKinds`, `injectOutcomesIntoInput`, `parseEditOutcomesFromResult`, `summarizeOutcomes`, `ResolvedApproval`. See **Per-edit approval**. |
-| `edit-diff` | Diff plumbing — `extractEditPayload`, `previewEditPayload`, `applyEditPayload`, `buildUnifiedDiff`, `buildContextualDiff`, `computeLineDiff`, `computeInlineDiff`, `splitLines`, `tokenize`, `filetypeFromPath`, `readEditOutcomes`, `OUTCOMES_INPUT_KEY`. See **Edit-diff rendering**. |
+| `edit-approval` | Pure helpers bridging the safe-mode gate to the host-side per-hunk flow — `resolveApprovalForPayload`, `maskToOutcomeKinds`, `buildEditOutcomesAnnotation`, `parseEditOutcomesFromResult`, `summarizeOutcomes`, `ResolvedApproval`. See **Per-edit approval**. |
+| `edit-diff` | Diff plumbing — `extractEditPayload`, `previewEditPayload`, `applyEditPayload`, `buildUnifiedDiff`, `buildContextualDiff`, `computeLineDiff`, `computeInlineDiff`, `splitLines`, `tokenize`, `filetypeFromPath`. See **Edit-diff rendering**. |
 | `enabled-toggle-set` | `useEnabledToggleSet({ catalog, keyOf, settingKey })` — generic state machine for `enabledSkills` / `enabledMcps` (undefined = all enabled, `[]` = off, `[names]` = allowlist). |
 | `files-discovery` | `listProjectFiles({ cwd, signal? })` → `FileEntry[]`, gitignored paths excluded. |
 | `format` | `fmtTokens`, `ageString`, `shortId`, `compactPath` — display-only helpers. |
@@ -425,9 +427,37 @@ type ApprovalDecision
     | 'accept-safelist'
     | 'deny'
     | { kind: 'partial', mask: readonly boolean[] }
+type ApprovalOriginator
+  = | { kind: 'parent' }
+    | { kind: 'child', label: string }      // label is the `child-N` tag
+type RequestApproval = (
+  tool: string,
+  input: Record<string, unknown>,
+  originator?: ApprovalOriginator,
+) => Promise<ApprovalDecision>
+interface ApprovalRequest {
+  id: string
+  tool: string
+  input: Record<string, unknown>
+  resolve: (decision: ApprovalDecision) => void
+  /** Caller attribution. Absent ≡ `{ kind: 'parent' }`. */
+  originator?: ApprovalOriginator
+}
 ```
-`ApprovalDecision` is a discriminated union — exhaustive switches must handle the object form, emitted by the per-edit modal for file-edit tools (`edit` / `multi_edit` / `write_file`). The `mask` is 1:1 with `EditPayload.hunks`; `true` = apply, `false` = deny. Bridge it to the tool body via `resolveApprovalForPayload` (see **Per-edit approval**); non-edit gates collapse `partial` to `allow`.
+`ApprovalDecision` is a discriminated union — exhaustive switches must handle the object form, emitted by the per-edit modal for file-edit tools (`edit` / `multi_edit` / `write_file`). The `mask` is 1:1 with `EditPayload.hunks`; `true` = apply, `false` = deny. Bridge it via `resolveApprovalForPayload` (see **Per-edit approval**); non-edit gates collapse `partial` to `allow`.
+`ApprovalOriginator` lets the modal show ` · child-N` attribution when a subagent's gate bubbles up through the parent's hook bus. The TUI's `applyGate` reads `ctx.childId` (set by `BUBBLED_MUTABLE_EVENTS` in `src/tools/spawn.ts`) and threads it through:
+```ts
+const originator: ApprovalOriginator = ctx.childId
+  ? { kind: 'child', label: ctx.childId }
+  : { kind: 'parent' }
+const decision = await requestApproval(name, input, originator)
+```
 ```tsx
 const { requestApproval, resolveHead, denyAll } = useSafeModeActions()
@@ -441,7 +471,7 @@ const pending = queue[0] ?? null
 `accept-safelist` calls `addToSafelist(dataDir, projectDir, suggestSafelistEntry(tool, input))` so subsequent calls with the same shape skip the gate. The safelist lives at `<dataDir>/projects.json` (user dir, never the project dir). `partial` never writes a safelist entry — the safelist key is the tool name + an arg shape, with no hunk-level scope.
-**Parallel-call deny semantics**: each pending approval resolves on its own. A single `deny` does **not** cascade through the queue — the model receives `Blocked: User denied this tool call` as that one tool's result and the turn continues; other parallel approvals stay queued and prompt independently. The user's explicit "stop everything" gesture is the host-level `esc abort run` shortcut, which calls `denyAll()` + aborts the active `agent.run()` in one shot.
+**Parallel-call deny semantics**: each pending approval resolves on its own. A single `deny` does **not** cascade through the queue — the gate handler sets `ctx.block = true` + `ctx.reason = 'User denied this tool call'` for THAT call and returns, so the model receives `Blocked: User denied this tool call` as that one tool's result and the turn continues; other parallel approvals stay queued and prompt independently. The user's explicit "stop everything" gesture is the host-level `esc abort run` shortcut, which calls `cancelRunOnDenial()` → `denyAll()` + `agent.abort()` in one shot.
 ## Interactions
@@ -575,16 +605,16 @@ interface HunkResolution {
 Wire:
-- `tool:before` hook → read pre-write content for `write_file` (other edit tools carry it in the input) → call `extractEditPayload` → attach the result as `StreamEvent.edit` on the `'tool'` event. The hook also runs `readEditOutcomes(input, hunks.length)` so any `_outcomes` previously injected by the gate ride through onto `payload.outcomes`.
+- `tool:before` hook → read pre-write content for `write_file` (other edit tools carry it in the input) → call `extractEditPayload` → attach the result as `StreamEvent.edit` on the `'tool'` event. The host skips this default emit when the gate already painted a synthetic event for a partial / fully-denied call (otherwise the transcript would show one row with full hunks + a second with the reduced subset).
 - Renderer reads `event.edit` and renders accordingly. When `payload.outcomes` is populated, each hunk row carries an applied / denied / skipped / failed badge; pass the array through `summarizeOutcomes` for the header tally.
-- Suppress the paired success `tool-result` (`isEditErrorResult` is the gate). The structured `multi_edit` body counts as "success" only when every edit applied — `Edited <path>: N/N applied · 0 denied · 0 skipped · 0 failed`. Mixed / failed bodies stay visible.
+- Suppress the paired success `tool-result` (`isEditErrorResult` is the gate). Success means "every hunk applied" — the result body is the legacy `Edited <path>: applied N edits (M replacements).` summary alone, no annotation block. Any presence of `<edit-outcomes>` (mixed / denied / failed) or a `[fully denied]` body keeps the result visible alongside the diff so the user reads the per-hunk reasons.
 - Historical replay (from `eventsFromTurns`) has no pre-write snapshot for `write_file`; the diff renders all-add, matching git's "new file" convention. Outcomes are reconstructed from the persisted `tool_result` body via `parseEditOutcomesFromResult` and re-attached to the `'tool'` event, so a reloaded transcript shows the same per-hunk badges live capture displayed.
 `theme.surfaces.diff` carries the row colors (`addBg`, `removeBg`, optional `*ContentBg`, `addFg`, `removeFg`); built-in themes pre-mix translucent reds/greens so terminals without alpha-blend get a legible effect, and CSS hosts can use the same hex values as `background-color`.
 ## Per-edit approval
-The `multi_edit` tool can accept or reject **individual hunks** instead of all-or-nothing. The user-side decision lives in `ApprovalDecision`'s `{ kind: 'partial', mask }` shape; the bridge to the tool body is a host-only side channel on `input._outcomes` (`OUTCOMES_INPUT_KEY`); the result body carries the post-apply state back in a stable line shape that replay can re-parse.
+Edit-family tools (`edit`, `multi_edit`, `write_file`) can accept or reject **individual hunks** instead of all-or-nothing. The user-side decision lives in `ApprovalDecision`'s `{ kind: 'partial', mask }` shape; the bridge is purely host-side — the harness stays pure (single-mode atomic `multi_edit` body, no side channels on `tool_call.input`). The host's `tool:gate` handler rebinds `ctx.input.edits` to the approved subset before the body runs; a paired `tool:transform` hook appends an `<edit-outcomes>` annotation block to the result so the renderer can paint per-hunk badges live + on replay.
 ```ts
 type EditOutcomeKind = 'applied' | 'denied' | 'skipped' | 'failed' | 'pending'
@@ -608,11 +638,9 @@ interface EditPayload {
 import {
   resolveApprovalForPayload,
   maskToOutcomeKinds,
-  injectOutcomesIntoInput,
+  buildEditOutcomesAnnotation,
   parseEditOutcomesFromResult,
   summarizeOutcomes,
-  OUTCOMES_INPUT_KEY,
-  readEditOutcomes,
   type ResolvedApproval,
 } from 'zidane/chat'
 ```
@@ -621,11 +649,9 @@ import {
 |---|---|
 | `resolveApprovalForPayload(decision, payload)` | Turn an `ApprovalDecision` into `{ outcomes, shouldBlock, syntheticEvent }`. Pure — does not mutate `input` / `payload`. |
 | `maskToOutcomeKinds(mask, fallbackLength, deniedReason?)` | Convert a boolean mask to a 1:1 `EditOutcome[]`. Missing entries default to `applied` (no-decision = keep). |
-| `injectOutcomesIntoInput(input, outcomes)` | In-place writer that stamps `input._outcomes` so the tool body honors per-hunk decisions. |
-| `parseEditOutcomesFromResult(text)` | Re-parse the structured `multi_edit` body back into outcomes. Used by `eventsFromTurns` so replay shows the same badges as live capture. Returns `null` for legacy / unstructured bodies. |
+| `buildEditOutcomesAnnotation(outcomes)` | Render an `EditOutcome[]` as the wire-format `<edit-outcomes>…</edit-outcomes>` block — body APPENDed to a tool result (joined with `\n\n`). Idempotent on missing reasons. |
+| `parseEditOutcomesFromResult(text)` | Re-parse the annotation block back into outcomes. Used by `eventsFromTurns` so replay shows the same badges as live capture. Returns `null` when the block is missing / malformed. |
 | `summarizeOutcomes(outcomes)` | `{ applied, denied, skipped, failed, pending, total }` tally for the header badge. |
-| `readEditOutcomes(input, length)` | Read `input._outcomes` back off the tool call's input — padded to `length` with `applied`. Used by `extractEditPayload` to thread outcomes onto the `tool:before` event. |
-| `OUTCOMES_INPUT_KEY` (`'_outcomes'`) | Side-channel field name. **Not** in the tool's JSON `inputSchema` — the model never sees it. |
 ```ts
 interface ResolvedApproval {
@@ -638,23 +664,45 @@ interface ResolvedApproval {
 }
 ```
+### Wire format
+Canonical annotation block — emitted ONLY when at least one hunk is NOT applied. All-applied calls fall through to the legacy `Edited <path>: applied N edits (M replacements).` summary alone.
+```
+Edited src/foo.ts: applied 2 edits (3 replacements).
+<edit-outcomes>
+#1 applied
+#2 denied: denied by user
+#3 applied
+</edit-outcomes>
+```
+- Opening + closing tags each on their own line.
+- One line per hunk: `#<1-based-index> <kind>[: <reason>]`.
+- `kind ∈ {applied | denied | skipped | failed}`.
+- Indexes are 1-based against the **model's original** `edits` array — `parseEditOutcomesFromResult` re-keys to a 0-based `EditOutcome[]`.
+A fully-denied call (every hunk rejected) skips the substitute path entirely: the harness writes `Blocked: User denied this tool call` to the persisted tool_result, and the host emits a synthetic `tool-result` event with body `[fully denied] <edit-outcomes>…</edit-outcomes>` for live display only. `isEditErrorResult` keeps that synthetic body visible alongside the diff.
 ### Gate handler
-End-to-end example showing how a host wires a `tool:gate` handler against the helpers:
+End-to-end pattern for a host wiring `tool:gate` against the helpers. The harness stays pure: the handler **rebinds** `ctx.input` (does not mutate the model's `tool_call.input` reference on the persisted assistant turn) and stashes outcomes for a paired `tool:transform` to append on the way out:
 ```ts
 import {
+  buildEditOutcomesAnnotation,
   extractEditPayload,
-  injectOutcomesIntoInput,
   resolveApprovalForPayload,
   toolCallPreview,
 } from 'zidane/chat'
+const pendingAnnotations = new Map<string, readonly EditOutcome[]>()
 agent.hooks.hook('tool:gate', async (ctx) => {
   if (ctx.block) return                                   // upstream already refused
-  const decision = await requestApproval(ctx.name, ctx.input)
+  const decision = await requestApproval(ctx.name, ctx.input, originator)
   const editPayload = extractEditPayload(ctx.name, ctx.input)
   if (!editPayload) {
     if (decision === 'deny') {
@@ -676,6 +724,12 @@ agent.hooks.hook('tool:gate', async (ctx) => {
         edit: resolved.syntheticEvent,
         ...(ctx.turnId ? { turnId: ctx.turnId } : {}),
       })
+      stream.appendImmediate({
+        kind: 'tool-result',
+        text: `[fully denied] ${buildEditOutcomesAnnotation(resolved.outcomes)}`,
+        tool: ctx.name,
+        ...(ctx.turnId ? { turnId: ctx.turnId } : {}),
+      })
     }
     ctx.block = true
     ctx.reason = 'User denied this tool call'
@@ -683,30 +737,47 @@ agent.hooks.hook('tool:gate', async (ctx) => {
   }
   if (resolved.syntheticEvent) {
-    injectOutcomesIntoInput(ctx.input, resolved.outcomes)  // partial — tool body skips denied hunks
+    // Rebind — fresh shallow clone whose `edits` is the approved subset.
+    // The model's original `tool_call.input` on the persisted turn stays
+    // intact because the rebind produces a new object.
+    const reducedEdits = reduceEditsByOutcomes(ctx.input.edits, resolved.outcomes)
+    ctx.input = { ...ctx.input, edits: reducedEdits }
+    pendingAnnotations.set(ctx.callId, resolved.outcomes)
   }
-  // accept-* (or partial that collapsed to all-applied) → fall through, no mutation.
+  // accept-* (or partial that collapsed to all-applied) → fall through.
+})
+agent.hooks.hook('tool:transform', (ctx) => {
+  const outcomes = pendingAnnotations.get(ctx.callId)
+  if (!outcomes) return
+  pendingAnnotations.delete(ctx.callId)
+  const annotation = buildEditOutcomesAnnotation(outcomes)
+  ctx.result = typeof ctx.result === 'string'
+    ? (ctx.result.length === 0 ? annotation : `${ctx.result}\n\n${annotation}`)
+    : [...ctx.result, { type: 'text', text: `\n${annotation}` }]
 })
+// Same handler registered on `child:tool:transform` so subagent calls
+// also get annotated — `BUBBLED_MUTABLE_EVENTS` shares the child ctx.
+agent.hooks.hook('child:tool:transform', /* same body */)
 ```
+Wipe `pendingAnnotations` on `agent:done` (covers completed / aborted / error paths) — `tool:transform` doesn't fire when `validation:reject` or a throwing `tool:before` synthesizes the tool_result, so entries can otherwise strand across runs. The TUI also clears the map defensively in its session-teardown path (`pendingAnnotationsRef.current.clear()`).
 ### `multi_edit` tool body shape
-The tool body branches on `_outcomes` presence:
+Single-mode atomic — the harness has no notion of approvals. Applies every edit in input order against the file as left by the previous step. First tool-level failure aborts the batch with the legacy `multi_edit error: edit #N <reason>` string; success returns `Edited <path>: applied N edits (M replacements).`. SDK consumers (CI agents, headless harnesses, pipelines parsing the result) see exactly the legacy contract.
-- **Gated** (host injected `_outcomes`) — never aborts on per-edit tool-level failure. Each step is reported into a structured body via `formatMultiEditReport`:
+When a host rebinds `ctx.input.edits` to the approved subset at the gate (above), the body sees a smaller all-approved batch and still runs the same atomic semantics. The per-hunk outcomes ride into the wire / persisted history as the appended `<edit-outcomes>` block — the model and any downstream parser see one self-describing tool_result.
-  ```
-  Edited src/foo.ts: 2/3 applied · 1 denied · 0 skipped · 0 failed
-    #1 applied: replaced 2 occurrences
-    #2 denied: denied by user
-    #3 applied: replaced 1 occurrence
-  ```
+### Modal lifecycle
-  Anchored line shape `  #N <outcome>[: <detail>]` keeps `parseEditOutcomesFromResult` trivially regex-able.
+The chat layer makes no assumption about how the host renders the approval surface. Two contracts a GUI must honor:
-- **Ungated** (programmatic / SDK call, no `_outcomes`) — original pre-2026 atomic semantics. First tool-level failure aborts the batch with the legacy `multi_edit error: edit #N <reason>` string; success returns the legacy `Edited <path>: applied N edits (M replacements).` summary. SDK consumers (CI agents, headless harnesses, pipelines parsing the result) keep working unchanged.
+- **Display the head of `useSafeModeQueue()` only.** The queue is FIFO; parallel tool calls prompt in arrival order. Resolve with `resolveHead(decision)` to pop the head — never resolve other entries directly, since the underlying `Promise` resolvers are owned by the provider.
+- **Key the rendered surface on `request.id`.** Back-to-back approvals in the same queue tick must force-remount the modal so per-call UI state (per-hunk mask, focused row, zone, etc.) doesn't carry across. The TUI's `ChatScreen` does this with `<FileEditApprovalModal key={fileEditPending.id} … />` (`src/tui/screens.tsx`).
-The two modes are mutually exclusive — `_outcomes` flips the contract. The model never opts in: `_outcomes` is omitted from the tool's JSON `inputSchema`, so the LLM can't write it.
+For hosts that route some decisions through a separate modal stack (the TUI inlines `FileEditApprovalModal` in the transcript slot while non-file-edit gates render as an `ApprovalBlock` below the prompt), push an empty placeholder onto the global modal context for the inline surface so `useModalAwareFocus()` keeps releasing background focus — the focus contract is "any modal-aware surface = background blurred", and an inline modal is still a modal by that definition.
 ## Tool call display
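
Reviewer note on the gate handler hunk above: the sample calls `reduceEditsByOutcomes`, which the diff neither imports nor defines. The sketch below shows one way the mask → outcomes → reduced-edits step could look. Everything here is hypothetical: `maskToOutcomes`, `reduceEdits`, and `rebindInput` are illustrative names, the `Edit` shape borrows `old_string` / `new_string` from the `edit` tool description, and treating a missing outcome as `applied` mirrors what the docs say about `maskToOutcomeKinds` rather than any confirmed behavior of the real helper.

```typescript
// Hypothetical sketch of the partial-approval rebind; not the zidane source.
type EditOutcomeKind = 'applied' | 'denied' | 'skipped' | 'failed'

interface EditOutcome {
  kind: EditOutcomeKind
  reason?: string
}

// Assumed hunk shape, borrowed from the `edit` tool's field names.
interface Edit {
  old_string: string
  new_string: string
}

interface ToolInput {
  path: string
  edits: readonly Edit[]
}

// Convert a boolean mask to a 1:1 outcome list.
// Missing entries default to `applied` (no decision = keep).
function maskToOutcomes(
  mask: readonly boolean[],
  length: number,
  deniedReason = 'denied by user',
): EditOutcome[] {
  return Array.from({ length }, (_, i): EditOutcome =>
    mask[i] === false ? { kind: 'denied', reason: deniedReason } : { kind: 'applied' },
  )
}

// Keep only the hunks whose outcome is `applied`.
function reduceEdits<T>(edits: readonly T[], outcomes: readonly EditOutcome[]): T[] {
  return edits.filter((_, i) => (outcomes[i]?.kind ?? 'applied') === 'applied')
}

// Rebind, not mutate: return a fresh shallow clone so the object the
// model originally produced keeps its full `edits` array.
function rebindInput(input: ToolInput, outcomes: readonly EditOutcome[]): ToolInput {
  return { ...input, edits: reduceEdits(input.edits, outcomes) }
}
```

Because `rebindInput` returns a new object, a host can assign the result to `ctx.input` while any reference held elsewhere (for instance, the persisted turn) still points at the untouched original.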
@@ -1198,7 +1269,7 @@ const text = turnAsText(turn)
 **Render an edit diff** — call `extractEditPayload` in `tool:before` (passing pre-write content for `write_file`), persist on the `'tool'` `StreamEvent.edit` field, render via `buildUnifiedDiff(payload)` + `filetypeFromPath(payload.path)` for syntax highlighting. Theme via `theme.surfaces.diff`. For a renderer-faithful preview (matches the tool body's lenient resolver, real file line numbers, per-hunk resolvability metadata), use `previewEditPayload(payload, priorContent, contextLines?)` and pass `result.diffText` to the diff renderable; consult `result.resolution[i].resolved` to decide whether to badge the hunk as unresolvable.
-**Wire per-edit approval** — in the `tool:gate` handler, call `extractEditPayload(name, input)`. If it returns a payload, route the decision through `resolveApprovalForPayload(decision, payload)`: on `shouldBlock`, emit `resolved.syntheticEvent` as a `'tool'` event with `outcomes` (so the transcript shows the denied diff) and set `ctx.block = true`; on partial-accept, call `injectOutcomesIntoInput(ctx.input, resolved.outcomes)` so the `multi_edit` body skips denied hunks. The tool result body's structured shape (`Edited <path>: N/M applied · …`) is parsed back by `parseEditOutcomesFromResult` during replay, so reloaded transcripts re-attach the same per-hunk badges.
+**Wire per-edit approval** — in the `tool:gate` handler, call `extractEditPayload(name, input)`. If it returns a payload, route the decision through `resolveApprovalForPayload(decision, payload)`: on `shouldBlock`, emit `resolved.syntheticEvent` as a `'tool'` event (so the transcript shows the intended diff with denied badges), emit a paired `'tool-result'` event with body `[fully denied] ${buildEditOutcomesAnnotation(resolved.outcomes)}` for live display, and set `ctx.block = true`; on partial-accept, **rebind** `ctx.input` to a fresh shallow-clone whose `edits` is the approved subset and stash `resolved.outcomes` in a per-callId pending-annotation map. A paired `tool:transform` handler reads back the map and appends `buildEditOutcomesAnnotation(outcomes)` to `ctx.result`, so the wire / persisted history carries the per-hunk decisions for replay. Register the same handler on `child:tool:transform` for subagent-issued calls. Wipe the map on `agent:done` to catch paths where `tool:transform` never fires (validation reject, throwing `tool:before`). See **Per-edit approval** for the full gate handler skeleton.
 **Drive a model picker** — `buildModelCatalog(providers, modelsFor, currentPick)` + `filterModelCatalog(catalog, query)` + `indexOfEntry`. Gate an effort sub-picker on `modelSupportsReasoning(descriptor, modelId)`.
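
Reviewer note on the CHAT.md changes above: the pending-annotation map has three touch points (stash at `tool:gate`, consume at `tool:transform`, sweep on `agent:done`). A minimal sketch of that lifecycle follows, assuming string results only. The `PendingAnnotations` class and `appendAnnotation` helper are hypothetical wrappers invented for illustration; the documented host code uses a plain `Map` directly.

```typescript
// Hypothetical sketch of the pending-annotation lifecycle; not the zidane source.
interface EditOutcome {
  kind: 'applied' | 'denied' | 'skipped' | 'failed'
  reason?: string
}

class PendingAnnotations {
  private readonly byCallId = new Map<string, readonly EditOutcome[]>()

  // Gate side: remember the 1:1 outcomes for a partially approved call.
  stash(callId: string, outcomes: readonly EditOutcome[]): void {
    this.byCallId.set(callId, outcomes)
  }

  // Default-emit side: lets a `tool:before` handler skip its own event
  // when the gate already painted one for this call.
  has(callId: string): boolean {
    return this.byCallId.has(callId)
  }

  // Transform side: read and clear in one step so an entry is used once.
  take(callId: string): readonly EditOutcome[] | undefined {
    const outcomes = this.byCallId.get(callId)
    this.byCallId.delete(callId)
    return outcomes
  }

  // Run-end side: drop entries whose transform never fired.
  sweep(): void {
    this.byCallId.clear()
  }

  get size(): number {
    return this.byCallId.size
  }
}

// Join the annotation onto a string result with a blank line between,
// or use the annotation alone when the result is empty.
function appendAnnotation(result: string, annotation: string): string {
  return result.length === 0 ? annotation : `${result}\n\n${annotation}`
}
```

Folding read and delete into `take` keeps the "clear the entry" step from being forgotten in the transform handler, and `sweep` covers the stranded-entry paths the diff lists (validation reject, throwing `tool:before`).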

package/docs/SKILL.md

@@ -217,7 +217,7 @@ Alias only when semantically equivalent. `shell → Bash` is safe; `list_files
 | `readFile` | Line range, default `offset=1, limit=2000`, 256 KiB cap. Paging footer; binary marker. |
 | `writeFile` | Returns `Created` / `Updated` / `No change needed: …` for no-op detection. |
 | `edit` | Surgical `old_string` → `new_string`. Clear errors on non-unique (unless `replace_all`) / not-found (with nearest-match preview). |
-| `multiEdit` | Sequential edits to one file. Atomic by default; the chat layer's per-edit gate flips it into per-hunk outcome reporting via the `_outcomes` side channel (see `docs/CHAT.md`). |
+| `multiEdit` | Sequential edits to one file. Single-mode atomic — first failure aborts with `multi_edit error: edit #N <reason>`; success returns `Edited <path>: applied N edits (M replacements).`. Per-edit approval is a host concern: the chat layer's `tool:gate` rebinds `ctx.input.edits` to the user-approved subset before the body runs, then a `tool:transform` hook appends an `<edit-outcomes>` annotation to the result so the renderer can paint per-hunk badges (see `docs/CHAT.md`). |
 | `listFiles` | Directory listing. |
 | `glob` | `**`, `*`, `?` pattern matching via Bun.Glob; shells out in docker/sandbox. |
 | `grep` | ripgrep + Bun.Glob fallback. Full Claude Code Grep semantics. `head_limit=250`, `offset` paginates. |
@@ -324,11 +324,12 @@ Mutable ctx fields: `tool:gate` (`block`, `reason`, `result`), `tool:transform`
 ```
 child:stream:text / child:stream:thinking / child:stream:end
 child:tool:gate / child:mcp:tool:gate    ← mutable: block/reason/result propagate to the child
+child:tool:transform                     ← mutable: parent can rewrite the child's tool_result
 child:tool:before / child:tool:after / child:tool:error
 child:turn:after
 ```
-Render nested activity without listening on the child instance. The `child:*:gate` events are unique: they share the same `ctx` reference the child's loop awaits on, so a parent gate handler can refuse / substitute a child's tool call without registering on the child agent.
+Render nested activity without listening on the child instance. The `child:*:gate` and `child:tool:transform` events are special: they share the same `ctx` reference the child's loop awaits on (see `BUBBLED_MUTABLE_EVENTS` in `src/tools/spawn.ts`), so a parent gate handler can refuse / substitute / annotate a child's tool call without registering on the child agent. The chat layer's per-edit annotation flow relies on `child:tool:transform` to append `<edit-outcomes>` blocks onto a subagent's `multi_edit` / `edit` / `write_file` results.
 ### Hook recipes
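
Note on the `multiEdit` and `child:tool:transform` changes above (illustration only, not part of the package diff): the flow they describe, where a gate rebinds `ctx.input.edits` to the approved subset and a transform hook appends an `<edit-outcomes>` annotation, can be sketched roughly as follows. Every name here (`ToolCtx`, `gate`, `transform`, `pending`) is hypothetical, and the line format inside the annotation is invented for the example; zidane's real types and annotation layout are not shown in this diff and may differ.

```typescript
// Hypothetical shapes for illustration; zidane's real ctx and outcome types may differ.
type Edit = { old_string: string; new_string: string };
type Outcome = { index: number; status: "applied" | "denied" };

interface ToolCtx {
  callId: string;
  input: { path: string; edits: Edit[] };
  result?: string;
}

// Outcomes stashed by the gate and read back by the transform, keyed by call id.
const pending = new Map<string, Outcome[]>();

// Gate step: record a full-length outcome list, then rebind ctx.input to a
// fresh object that holds only the approved edits. The object the model
// originally produced is never mutated.
function gate(ctx: ToolCtx, mask: boolean[]): void {
  const original = ctx.input;
  pending.set(
    ctx.callId,
    original.edits.map((_, index): Outcome => ({
      index,
      status: mask[index] ? "applied" : "denied",
    })),
  );
  ctx.input = { ...original, edits: original.edits.filter((_, i) => mask[i]) };
}

// Transform step: append the per-hunk decisions to the tool result.
// The "index: status" line format is an assumption made for this sketch.
function transform(ctx: ToolCtx): void {
  const outcomes = pending.get(ctx.callId);
  if (!outcomes) return;
  pending.delete(ctx.callId);
  const body = outcomes.map((o) => `${o.index}: ${o.status}`).join("\n");
  ctx.result = `${ctx.result ?? ""}\n<edit-outcomes>\n${body}\n</edit-outcomes>`;
}
```

Because the gate assigns a new object instead of filtering `edits` in place, whatever still holds a reference to the original input (in the docs' terms, the persisted `tool_call` block) keeps the full edit list.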

package/docs/TUI.md CHANGED

@@ -274,9 +274,9 @@ Hosts adding a new tool: extend `TOOL_DISPLAY` in `zidane/chat` with a `{ displa
 `Settings.showEditDiffs` (default **on**) renders `edit` / `multi_edit` / `write_file` tool calls as a unified diff via the native `<diff>` renderable. Wire path:
-- `tool:before` hook reads pre-write content for `write_file` (`edit` / `multi_edit` carry the old text in their input), then calls `extractEditPayload(name, input, priorContent)` from `zidane/chat`. The resulting `EditPayload` rides on the `'tool'` `StreamEvent.edit` field. `extractEditPayload` also reads back any `_outcomes` previously injected by the gate, so `payload.outcomes` rides through onto the transcript event.
+- `tool:before` hook reads pre-write content for `write_file` (`edit` / `multi_edit` carry the old text in their input), then calls `extractEditPayload(name, input, priorContent)` from `zidane/chat`. The resulting `EditPayload` rides on the `'tool'` `StreamEvent.edit` field. The default emit short-circuits when `pendingAnnotationsRef.current.has(callId)` is true — the gate handler already painted a synthetic `'tool'` event with the FULL hunks + outcomes, so emitting again with the reduced input would paint a misleading second row.
 - `EditDiffBlock` consumes that payload and calls `buildUnifiedDiff(payload)` + `filetypeFromPath(payload.path)` for syntax highlighting. When `payload.outcomes` is populated (live partial-approval or persisted replay), each hunk row carries an applied / denied / skipped / failed badge and the header line shows the tally via `summarizeOutcomes`.
-- The paired `tool-result` is suppressed (`isEditErrorResult` is the gate) only for **all-applied** outcomes. The new structured `multi_edit` body shape (`Edited <path>: N/M applied · K denied · …`) stays visible whenever any hunk wasn't applied, so the user reads the per-edit reasons next to the diff. Errors (`Edit error: …`, `Tool failed: …`, legacy `multi_edit error: …`) always bypass suppression.
+- The paired `tool-result` is suppressed (`isEditErrorResult` is the gate) only for **all-applied** outcomes — the result body is the legacy `Edited <path>: applied N edits (M replacements).` summary with no annotation. Presence of an `<edit-outcomes>…</edit-outcomes>` block (mixed / denied / failed) or a `[fully denied] …` body keeps the row visible so the user reads the per-hunk reasons next to the diff. Errors (`Edit error: …`, `Tool failed: …`, `multi_edit error: …`) always bypass suppression.
 - Historical replay from persisted turns has no pre-write snapshot for `write_file` — the diff renders all-add, matching git's "new file" convention. Outcomes are reconstructed from the persisted `tool_result` body via `parseEditOutcomesFromResult` and re-attached, so reloaded transcripts show the same per-hunk badges live capture displayed.
 Diff row colors come from `theme.surfaces.diff` (`addBg`, `removeBg`, optional `*ContentBg` for a deeper content-column hue, `addFg` / `removeFg` for the gutter glyphs). Built-in themes pre-mix translucent reds/greens against their primary surface so terminals without true alpha-blend still read cleanly.
@@ -289,15 +289,19 @@ File-edit tools (`edit` / `multi_edit` / `write_file`) get their own approval su
 - **`MultiEditApprovalModal`** (multi-step `multi_edit`, ≥ 2 hunks) — per-hunk toggle list + focused-hunk diff panel + action bar. List zone: `↑` / `↓` move cursor, `space` toggles the focused hunk, `y` / `n` set all on/off. Actions zone: `←` / `→` cycle, `↵` confirm. `tab` cycles between zones. `a` / `s` / `p` / `d` bulk shortcuts work in either zone. The first action label is dynamic — `Apply all`, `Apply N/M`, `Nothing selected` — and submits one of:
   - All hunks on → bulk decision (`accept-once` / `accept-session` / `accept-safelist`) preserving the safelist path identical to single-edit.
   - All hunks off → `'deny'`.
-  - Mixed mask → `{ kind: 'partial', mask }` — `gateDecision` calls `resolveApprovalForPayload` + `injectOutcomesIntoInput` to bake the mask into `input._outcomes`. See [Per-edit approval in CHAT.md](./CHAT.md#per-edit-approval) for the helper contract.
+  - Mixed mask → `{ kind: 'partial', mask }` — `applyGate` calls `resolveApprovalForPayload`, then **rebinds** `ctx.input` to a fresh shallow-clone whose `edits` is the approved subset (the model's original `tool_call.input` on the persisted assistant turn stays untouched) and stashes the full-length `EditOutcome[]` in `pendingAnnotationsRef.current` keyed by `ctx.callId`. The paired `tool:transform` / `child:tool:transform` hook reads back the entry and appends a `buildEditOutcomesAnnotation(outcomes)` block to `ctx.result`, so the wire / persisted history carries the per-hunk decisions for replay. See [Per-edit approval in CHAT.md](./CHAT.md#per-edit-approval) for the full contract.
-**Inline mount**. The modal renders inside `ChatScreen`'s transcript slot via `flexGrow: 1` — the transcript stays mounted with `visible: !fileEditPending` so its memoized turn anchors and scrollbox state survive the open/close cycle. The prompt + footer stay anchored at the bottom. `ChatScreen` pushes an empty `<></>` placeholder onto the global modal context so `useModalAwareFocus()` keeps releasing background focus and the app-level `esc` / `ctrl+s` shortcuts stay gated; the actual UI is sibling, not stacked.
+**Inline mount**. The modal renders inside `ChatScreen`'s transcript slot via `flexGrow: 1`. The transcript stays mounted with `visible: !fileEditPending` — OpenTUI maps that to Yoga's `Display.None`, removing the box from layout entirely so the sibling modal claims the slot via its own `flexGrow: 1`. Memoized turn anchors, scrollbox position, and lazy markdown chrome all survive the open → close round-trip. The prompt + footer stay anchored at the bottom (queue-only while `busy`). `ChatScreen` pushes an empty `<></>` placeholder onto the global modal context so `useModalAwareFocus()` keeps releasing background focus and the app-level `esc` / `ctrl+s` shortcuts stay gated; the actual UI is sibling, not stacked.
-**Title layout**. Left-aligned title rides the top border (`edit approval` / `multi-edit approval`). The right-aligned filename + ` · N/M selected` suffix paints via a sibling `<text style={{ position: 'absolute', top: 0, right: 1 }}>` — OpenTUI's box scissor rect excludes the border row, so a child node can't paint over the top border (same trick `TitleOverlay` and `CompletionPopup` use). `rightTitleFilename(targetPath, termWidth, leftLen, suffixLen)` budgets the basename so the suffix never gets pushed off the right edge on a narrow terminal.
+**Force-remount on queue advance**. The modal is keyed on `request.id` (`<FileEditApprovalModal key={fileEditPending.id} … />`). Without the key, `MultiEditApprovalModal`'s `mask` / `cursor` / `zone` and `SingleEditApprovalModal`'s `selected` would carry over from the previous call when the safe-mode queue pops one approval and exposes the next. React re-uses the component instance otherwise and the user would see a checkbox list whose state matches the prior modal.
+**Originator attribution**. The right-side title overlay appends ` · child-N` when `request.originator` is `{ kind: 'child', label: 'child-N' }`. `applyGate` builds the originator from `ctx.childId` (set by `BUBBLED_MUTABLE_EVENTS` for subagent-issued gates) and passes it as the third arg to `requestApproval`. Parent calls show no suffix.
+**Title layout**. Left-aligned title rides the top border (`edit approval` / `multi-edit approval`). The right-aligned filename + ` · child-N` + ` · N/M selected` suffix paints via a sibling `<text style={{ position: 'absolute', top: 0, right: 1 }}>` — OpenTUI's box scissor rect excludes the border row, so a child node can't paint over the top border (same trick `TitleOverlay` and `CompletionPopup` use). `rightTitleFilename(targetPath, termWidth, leftLen, suffixLen)` budgets the basename so the suffixes never get pushed off the right edge on a narrow terminal.
 **Lenient preview**. Both layouts run the focused hunk's diff through `previewEditPayload(payload, priorContent, contextLines)` from `zidane/chat` — the model-faithful resolver (curly-quote recovery, line-number-prefix stripping, model-side `<n>`→`<name>` desanitize) so what the user previews matches what the tool would actually apply. Unresolved hunks (resolver couldn't locate `old_string`, or matched ambiguously without `replace_all`) render an `UnresolvedHunkPanel` instead of a blank diff: red-bordered, headlined with the failure mode, showing the model's intended `old_string` / `new_string` so the user can deny intelligently. The multi-edit list flags the same rows with a red `⚠` glyph.
-`isFileEditTool(tool)` (exported from `file-edit-approval-modal.tsx`) is the gate routing predicate. The set is `{'edit', 'multi_edit', 'write_file'}`; everything else stays on the inline `ApprovalBlock` path.
+`isFileEditTool(tool)` (exported from `file-edit-approval-modal.tsx`) is the gate routing predicate. The set is `{'edit', 'multi_edit', 'write_file'}`; everything else stays on the inline `ApprovalBlock` path. For a fully-denied file-edit call, `applyGate` skips the substitute path entirely — sets `ctx.block = true` + emits a synthetic `tool-result` event with body `[fully denied] <edit-outcomes>…</edit-outcomes>` for live display. The persisted result stays the terse `Blocked: User denied this tool call` the harness writes.
 ## Settings rows
@@ -354,6 +358,19 @@ File discovery uses `git ls-files --cached --others --exclude-standard` when ava
 Adding a third provider: build it in `zidane/chat` against `CompletionProvider<TItem>`, pass it alongside the others in `completionProviders` — the popup picks it up automatically.
+## Transcript scrolling
+`<Transcript>` mounts a `<scrollbox>` with OpenTUI's native auto-pin: `stickyScroll` + `stickyStart="bottom"`. The scrollbox stays pinned to the bottom while content grows, and detaches the moment the user scrolls up — re-attaches as soon as they hit the bottom edge again. This is the path that keeps streamed text + tool results glued under the cursor while a run is busy without manual `requestAnimationFrame` glue in React.
+Two scroll effects layer on top, both keyed off React state:
+- **Auto-pin** (native): OpenTUI handles the `[items, busy]` → "stay at bottom" loop. No React work; survives a 60 fps stream without forcing re-layouts.
+- **`selectedTurnId` effect** (React): `useEffect([selectedTurnId, anchors, ownership])` calls `scrollbox.scrollChildIntoView(id)` via `requestAnimationFrame` so OpenTUI's layout pass has settled before measuring. Snaps to `scrollHeight` for the last turn (or a turn that owns the last result-only row) so a tall response's tail stays visible.
+The two coexist — selecting a turn jumps to it; exiting select-turn mode (`esc`) plus the next streamed delta re-engages the native sticky-bottom.
+`Settings.smoothStreaming` (default **on**) drip-feeds streamed text at a smooth cadence (typewriter) instead of in stream bursts; the auto-pin loop tracks each delta naturally because each tick still grows the scrollbox content.
 ## Select-turn mode
 Press `ctrl+s` on the chat screen (idle, ≥ 1 turn, no pending approval / interaction) to enter select-turn mode:
@@ -458,7 +475,7 @@ src/tui/
   index.tsx                     runTui + public exports
   app.tsx                       App + AppShell + ThemedShell — provider stack + agent lifecycle
   screens.tsx                   AuthScreen, SessionsScreen, ChatScreen, wizard steps, PromptBlock, ApprovalBlock, QueuedMessagesBlock
-  components.tsx                Transcript, EventLine, MarkdownBlock, SubagentBlock, Footer, Spinner, StatusSpinner, TitleOverlay, EditDiffBlock (outcomes-aware: per-hunk applied/denied/skipped/failed badges via `summarizeOutcomes`), ToolCallBlock, UserPromptBlock
+  components.tsx                Transcript (OpenTUI `stickyScroll` + `stickyStart="bottom"` for auto-pin; `selectedTurnId` scroll effect on top), EventLine, MarkdownBlock, SubagentBlock, Footer, Spinner, StatusSpinner, TitleOverlay, EditDiffBlock (outcomes-aware: per-hunk applied/denied/skipped/failed badges via `summarizeOutcomes`), ToolCallBlock, UserPromptBlock
   modal.tsx                     ModalRoot + Modal + useModalAwareFocus
   model-picker.tsx              Cross-provider searchable model list modal
   effort-picker.tsx             Reasoning-effort modal
@@ -468,7 +485,7 @@ src/tui/
   mcps-settings.tsx             MCP servers list + toggle modal (standalone for embedders)
   session-details-modal.tsx     Stats + delete / export / title / compact (ctrl+x)
   turn-details-modal.tsx        Fork / delete / copy (opened from select-turn mode)
-  file-edit-approval-modal.tsx  FileEditApprovalModal + SingleEditApprovalModal + MultiEditApprovalModal + UnresolvedHunkPanel. Inline-mounted in ChatScreen's transcript slot — bridges ApprovalDecision (including { kind: 'partial', mask }) to the gate via the helpers in `zidane/chat`'s `edit-approval` module.
+  file-edit-approval-modal.tsx  FileEditApprovalModal + SingleEditApprovalModal + MultiEditApprovalModal + UnresolvedHunkPanel + `originatorSuffix` (` · child-N` attribution). Inline-mounted in ChatScreen's transcript slot, keyed on `request.id` for force-remount on queue advance — bridges ApprovalDecision (including { kind: 'partial', mask }) to the gate via the helpers in `zidane/chat`'s `edit-approval` module.
   interaction-block.tsx         InteractionBlock + plan picker + question wizard
   toggle-list-modal.tsx         Generic checkbox-list modal (ToggleListModal)
   completion-popup.tsx          Provider-agnostic autocomplete popover
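
Note on the `tool-result` suppression rule in the TUI.md hunks above (illustration only, not part of the package diff): the rule keeps the result row visible for errors, for `[fully denied]` bodies, and for any body that carries an `<edit-outcomes>` block, and hides it only for the plain all-applied summary. A minimal sketch of such a predicate, assuming prefix matching on the body text, might look like this. The function name `keepsResultRowVisible` is hypothetical; the package's own check is named `isEditErrorResult` and its actual logic is not shown in this diff.

```typescript
// Error prefixes listed in the TUI.md text; matching by prefix is an assumption.
const ERROR_PREFIXES = ["Edit error:", "Tool failed:", "multi_edit error:"];

// Hypothetical predicate: true when the paired tool-result row should stay visible.
function keepsResultRowVisible(body: string): boolean {
  // Errors always bypass suppression.
  if (ERROR_PREFIXES.some((prefix) => body.startsWith(prefix))) return true;
  // A fully denied call is shown so the user can read the per-hunk reasons.
  if (body.startsWith("[fully denied]")) return true;
  // Any appended <edit-outcomes> block means at least one hunk was not applied.
  return /<edit-outcomes>[\s\S]*<\/edit-outcomes>/.test(body);
}
```

Under this sketch, a body such as `Edited <path>: applied N edits (M replacements).` with no annotation is the only case that returns `false`, which matches the all-applied suppression described in the hunk.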

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "zidane",
-  "version": "5.1.19",
+  "version": "5.1.21",
   "description": "an agent that goes straight to the goal",
   "type": "module",
   "private": false,