zidane 5.1.20 → 5.1.21

@@ -176,9 +176,40 @@ Built-in tools are opinionated about output sizes — drop your v2 `tool:transfo
176
176
  | `shell` | Tail-priority truncation at `maxOutputBytes=32768` (32 KiB, combined stdout+stderr). Head trim marker: `…(N bytes truncated from head)…`. `0` disables. UTF-8 never splits mid-codepoint. Appends `(exit N, Nms)` footer + surfaces non-empty stderr by default (`metadata: false` opts out). |
177
177
  | `write_file` | Reads existing content; returns `Created` / `Updated` / `No change needed: …` so the model detects no-ops without a separate read. Race window in shared docker/sandbox contexts documented and accepted. |
178
178
  | `edit` | Fails clearly on non-unique `old_string` (unless `replace_all: true`). On not-found, includes a nearest-match preview so the model recovers without re-reading. |
179
- | `multi_edit` | Sequential edits to one file. **Ungated** (no `_outcomes` in input; the SDK default): atomic, first failure aborts. **Gated** (chat layer injects `_outcomes` per-hunk via `injectOutcomesIntoInput`): each hunk reported independently into a structured `Edited <path>: N/M applied · K denied · …` body that `parseEditOutcomesFromResult` re-parses for transcript replay. See CHAT.md → Per-edit approval. |
179
+ | `multi_edit` | Sequential edits to one file. **Single-mode atomic** applies every edit in input order against the file as left by the previous step; first tool-level failure aborts the batch with the legacy `multi_edit error: edit #N <reason>` string, success returns `Edited <path>: applied N edits (M replacements).`. The tool body knows nothing about approvals: per-hunk decisions are a host concern enforced upstream — the chat layer's `tool:gate` rebinds `ctx.input.edits` to the approved subset (rebind, not mutate — the model's original `tool_call` block in `session.turns` stays untouched) before the body runs, and a `tool:transform` hook appends an `<edit-outcomes>` block to the result so live + replayed transcripts share the same per-hunk view (`parseEditOutcomesFromResult` re-parses on reload). See CHAT.md → Per-edit approval. |
180
180
  | `grep` | Wraps `rg` when present (with explicit `.` path to avoid stdin hangs). Bun.Glob fallback otherwise. `head_limit=250`, `offset` paginates. |
181
181
 
182
+ ## Per-edit approval — harness purity
183
+
184
+ The harness has no notion of "partial approval". Edit-family tools (`edit`, `multi_edit`, `write_file`) are single-mode and atomic; per-hunk decisions live entirely above the loop. The split is deliberate — SDK consumers (CI agents, headless pipelines, custom hosts) keep the legacy contract; the per-hunk UX is purely a chat-layer concern.
185
+
186
+ Three loop-visible artifacts carry the decision through:
187
+
188
+ 1. **Input rebind at `tool:gate`** — the host's gate handler (`src/tui/app.tsx`'s `applyGate`) computes the approved subset, then assigns a fresh shallow-clone to `ctx.input` whose `edits` array is filtered. The model's original `tool_call.input` in `session.turns` is never mutated (the rebind produces a new object); the tool body sees the smaller, all-approved batch and runs unchanged.
189
+ 2. **Pending-annotation map** — keyed by `tool_call.id`, holds the 1:1 `EditOutcome[]` (over the model's ORIGINAL hunks). Lives in the host's React tree (`pendingAnnotationsRef` in the TUI).
190
+ 3. **`tool:transform` annotation** — the host appends an `<edit-outcomes>` block to `ctx.result` when at least one hunk wasn't applied. Bubbles to `child:tool:transform` for subagent-issued calls via `BUBBLED_MUTABLE_EVENTS`.
191
+
192
+ **Wire format** (canonical shape for live emit + persisted replay):
193
+
194
+ ```
195
+ Edited path/to/foo.ts: applied 2 edits (3 replacements).
196
+
197
+ <edit-outcomes>
198
+ #1 applied
199
+ #2 denied: denied by user
200
+ #3 applied
201
+ </edit-outcomes>
202
+ ```
203
+
204
+ - Opening + closing tags each on their own line.
205
+ - One line per hunk: `#<1-based-index> <kind>[: <reason>]`.
206
+ - `kind ∈ {applied | denied | skipped | failed}`.
207
+ - Block emitted ONLY when at least one hunk is NOT applied. All-applied calls fall through to the legacy summary alone.
208
+
209
+ A fully-denied call (every hunk rejected) skips the substitute path; the harness writes `Blocked: User denied this tool call` as the tool_result and the host emits a synthetic `tool-result` event with body `[fully denied] <edit-outcomes>…</edit-outcomes>` for live display only (persisted history stays terse).
210
+
211
+ Replay path: `parseEditOutcomesFromResult(text)` (from `zidane/chat`) recovers the `EditOutcome[]` from the annotation block. `eventsFromTurns` pairs `tool_call` ↔ `tool_result` by `callId` and re-attaches the outcomes onto the `'tool'` event so reloaded transcripts paint identical per-hunk badges.
212
+
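The wire format above can be sketched as a small formatter and parser pair. This is a minimal illustration of the documented line shape only; `formatEditOutcomes` and `parseEditOutcomes` are hypothetical names, not the `buildEditOutcomesAnnotation` / `parseEditOutcomesFromResult` exports, whose actual implementations are not shown in this diff.

```ts
// Hypothetical stand-ins for illustration; not the zidane/chat implementation.
type EditOutcomeKind = 'applied' | 'denied' | 'skipped' | 'failed'

interface EditOutcome {
  kind: EditOutcomeKind
  reason?: string
}

const KINDS: readonly string[] = ['applied', 'denied', 'skipped', 'failed']

// Render one `#<1-based-index> <kind>[: <reason>]` line per hunk, wrapped in the tags.
function formatEditOutcomes(outcomes: readonly EditOutcome[]): string {
  const lines = outcomes.map((o, i) =>
    o.reason ? `#${i + 1} ${o.kind}: ${o.reason}` : `#${i + 1} ${o.kind}`)
  return ['<edit-outcomes>', ...lines, '</edit-outcomes>'].join('\n')
}

// Recover a 0-based outcome array from a tool result body.
// Returns null when the block is missing or any line is malformed.
function parseEditOutcomes(text: string): EditOutcome[] | null {
  const lines = text.split('\n').map(line => line.trim())
  const open = lines.indexOf('<edit-outcomes>')
  if (open === -1) return null
  const close = lines.indexOf('</edit-outcomes>', open + 1)
  if (close === -1) return null
  const outcomes: EditOutcome[] = []
  for (const line of lines.slice(open + 1, close)) {
    const match = /^#(\d+) ([a-z]+)(?:: (.*))?$/.exec(line)
    if (!match || !KINDS.includes(match[2])) return null
    // Indexes must run 1, 2, 3, ... in order.
    if (Number(match[1]) !== outcomes.length + 1) return null
    const kind = match[2] as EditOutcomeKind
    outcomes.push(match[3] === undefined ? { kind } : { kind, reason: match[3] })
  }
  return outcomes
}
```

Because the legacy summary line sits outside the tags, a parser written this way can ignore it entirely, which is one plausible reason the block is appended rather than merged into the summary.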
182
213
  ## Tool argument auto-coercion
183
214
 
184
215
  `validateToolArgs` runs between `tool:gate` and `tool:before`:
@@ -459,11 +490,12 @@ The child's lifecycle also bubbles to the parent hook surface with `childId` + `
459
490
  ```
460
491
  child:stream:text / child:stream:thinking / child:stream:end
461
492
  child:tool:gate / child:mcp:tool:gate ← share the child's ctx — parent mutations propagate
493
+ child:tool:transform ← share the child's ctx — parent mutations propagate
462
494
  child:tool:before / child:tool:after / child:tool:error
463
495
  child:turn:after
464
496
  ```
465
497
 
466
- The two `child:*:gate` events are special: the bubbled ctx is the same reference the child loop awaits on, so a parent listener can refuse / substitute a subagent's tool call without registering on the child agent.
498
+ `BUBBLED_MUTABLE_EVENTS` (`src/tools/spawn.ts`) is the canonical list: `tool:gate`, `mcp:tool:gate`, `tool:transform`. The bubbled ctx is the same reference the child loop awaits on, so a parent listener can refuse / substitute / annotate a subagent's tool call without registering on the child agent. The chat layer's per-edit annotation flow registers on both `tool:transform` and `child:tool:transform` so subagent-issued `multi_edit` calls also get `<edit-outcomes>` blocks appended to their results.
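The shared-reference behaviour described above can be illustrated with a toy hook bus. This is only a sketch of the idea: `HookBus`, `TransformCtx`, and the synchronous dispatch are assumptions made for the example, and the real bubbling logic in `src/tools/spawn.ts` is not shown in this diff.

```ts
// Toy hook bus for illustration; hypothetical, not the zidane hook surface.
type Handler<C> = (ctx: C) => void

class HookBus<C> {
  private handlers = new Map<string, Handler<C>[]>()

  hook(event: string, handler: Handler<C>): void {
    const list = this.handlers.get(event) ?? []
    list.push(handler)
    this.handlers.set(event, list)
  }

  call(event: string, ctx: C): void {
    for (const handler of this.handlers.get(event) ?? []) handler(ctx)
  }
}

interface TransformCtx { callId: string, result: string, childId?: string }

const parentBus = new HookBus<TransformCtx>()
const childBus = new HookBus<TransformCtx>()

// Bubble: forward the SAME ctx object to the parent, tagged with the child's id.
childBus.hook('tool:transform', (ctx) => {
  ctx.childId = 'child-1'
  parentBus.call('child:tool:transform', ctx)
})

// Parent listener annotates the child's result without touching the child agent.
parentBus.hook('child:tool:transform', (ctx) => {
  ctx.result = `${ctx.result}\n\n<edit-outcomes>\n#1 denied: denied by user\n</edit-outcomes>`
})

const bubbledCtx: TransformCtx = { callId: 'call-1', result: 'summary line' }
childBus.call('tool:transform', bubbledCtx)
// The child loop reads `bubbledCtx.result` afterwards and sees the parent's annotation.
```

Passing a copy of the ctx instead of the reference would make the parent's write invisible to the child loop, which is why the list of mutable bubbled events matters.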
467
499
 
468
500
  ## Dependency Graph
469
501
 
package/docs/CHAT.md CHANGED
@@ -88,7 +88,9 @@ const agent = createAgent({
88
88
 
89
89
  | Hook | Purpose | Consumer in chat layer |
90
90
  |---|---|---|
91
- | `tool:gate`, `child:tool:gate`, `mcp:tool:gate`, `child:mcp:tool:gate` | Approval gate | `useSafeModeActions().requestApproval` + (for file-edit tools) `resolveApprovalForPayload` / `injectOutcomesIntoInput`. See **Per-edit approval**. |
91
+ | `tool:gate`, `child:tool:gate`, `mcp:tool:gate`, `child:mcp:tool:gate` | Approval gate | `useSafeModeActions().requestApproval(name, input, originator?)` + (for file-edit tools) `resolveApprovalForPayload`; partial → rebind `ctx.input` to the approved subset + stash outcomes in a pending-annotation map. See **Per-edit approval**. |
92
+ | `tool:transform`, `child:tool:transform` | Per-edit annotation | Look up `callId` in the pending-annotation map and append `buildEditOutcomesAnnotation(outcomes)` to `ctx.result`. Clear the entry. |
93
+ | `agent:done` | Stranded-annotation sweep | `pendingAnnotations.clear()` — covers completed / aborted / error paths so a never-fired `tool:transform` doesn't leak. |
92
94
  | `mcp:auth:required`, `mcp:auth:url`, `mcp:auth:success`, `mcp:auth:error`, `mcp:connect` | OAuth badge state | `useMcpAuthDispatch` → `reduceMcpAuth` |
93
95
  | `stream:thinking`, `stream:text`, `child:stream:thinking`, `child:stream:text` | Streaming deltas | `useStreamBuffer().queueStreamDelta` |
94
96
  | `tool:before`, `tool:after`, `mcp:tool:after`, `child:tool:before`, `child:tool:after` | Tool call/result events | `stream.appendImmediate` |
@@ -123,8 +125,8 @@ The table below indexes every named export; sections further down dive into the
123
125
  | `config` + `config-context` | `resolveConfig`, `useConfig`, `ChatOptions`, `ResolvedConfig`, `ResolvedPaths`, `ModelInfo`, `ProviderRegistry`. See **Required options**. |
124
126
  | `credentials` | AI-provider credential store. `setProviderCredential`, `readProviderCredential`, `removeProviderCredential`, `readCredentials`, `writeCredentials`, `credentialsPath`, `applyApiKeyEnv` (called by `resolveConfig` before any factory runs). Owner-only file mode. |
125
127
  | `discovery-context` + `discovery-slot` | Live catalog plumbing: `DiscoveryProvider`, `useDiscovery`, `useDiscoveryOptional`, `createDiscoverySlot` (stale-while-revalidate primitive). Propagates fresh catalogs (files, skills, MCPs) into open modals. See **Discovery context** below. |
126
- | `edit-approval` | Pure helpers bridging the safe-mode gate to the `multi_edit` tool's per-edit `_outcomes` side channel — `resolveApprovalForPayload`, `maskToOutcomeKinds`, `injectOutcomesIntoInput`, `parseEditOutcomesFromResult`, `summarizeOutcomes`, `ResolvedApproval`. See **Per-edit approval**. |
127
- | `edit-diff` | Diff plumbing — `extractEditPayload`, `previewEditPayload`, `applyEditPayload`, `buildUnifiedDiff`, `buildContextualDiff`, `computeLineDiff`, `computeInlineDiff`, `splitLines`, `tokenize`, `filetypeFromPath`, `readEditOutcomes`, `OUTCOMES_INPUT_KEY`. See **Edit-diff rendering**. |
128
+ | `edit-approval` | Pure helpers bridging the safe-mode gate to the host-side per-hunk flow — `resolveApprovalForPayload`, `maskToOutcomeKinds`, `buildEditOutcomesAnnotation`, `parseEditOutcomesFromResult`, `summarizeOutcomes`, `ResolvedApproval`. See **Per-edit approval**. |
129
+ | `edit-diff` | Diff plumbing — `extractEditPayload`, `previewEditPayload`, `applyEditPayload`, `buildUnifiedDiff`, `buildContextualDiff`, `computeLineDiff`, `computeInlineDiff`, `splitLines`, `tokenize`, `filetypeFromPath`. See **Edit-diff rendering**. |
128
130
  | `enabled-toggle-set` | `useEnabledToggleSet({ catalog, keyOf, settingKey })` — generic state machine for `enabledSkills` / `enabledMcps` (undefined = all enabled, `[]` = off, `[names]` = allowlist). |
129
131
  | `files-discovery` | `listProjectFiles({ cwd, signal? })` → `FileEntry[]`, gitignored paths excluded. |
130
132
  | `format` | `fmtTokens`, `ageString`, `shortId`, `compactPath` — display-only helpers. |
@@ -425,9 +427,37 @@ type ApprovalDecision
425
427
  | 'accept-safelist'
426
428
  | 'deny'
427
429
  | { kind: 'partial', mask: readonly boolean[] }
430
+
431
+ type ApprovalOriginator
432
+ = | { kind: 'parent' }
433
+ | { kind: 'child', label: string } // label is the `child-N` tag
434
+
435
+ type RequestApproval = (
436
+ tool: string,
437
+ input: Record<string, unknown>,
438
+ originator?: ApprovalOriginator,
439
+ ) => Promise<ApprovalDecision>
440
+
441
+ interface ApprovalRequest {
442
+ id: string
443
+ tool: string
444
+ input: Record<string, unknown>
445
+ resolve: (decision: ApprovalDecision) => void
446
+ /** Caller attribution. Absent ≡ `{ kind: 'parent' }`. */
447
+ originator?: ApprovalOriginator
448
+ }
428
449
  ```
429
450
 
430
- `ApprovalDecision` is a discriminated union — exhaustive switches must handle the object form, emitted by the per-edit modal for file-edit tools (`edit` / `multi_edit` / `write_file`). The `mask` is 1:1 with `EditPayload.hunks`; `true` = apply, `false` = deny. Bridge it to the tool body via `resolveApprovalForPayload` (see **Per-edit approval**); non-edit gates collapse `partial` to `allow`.
451
+ `ApprovalDecision` is a discriminated union — exhaustive switches must handle the object form, emitted by the per-edit modal for file-edit tools (`edit` / `multi_edit` / `write_file`). The `mask` is 1:1 with `EditPayload.hunks`; `true` = apply, `false` = deny. Bridge it via `resolveApprovalForPayload` (see **Per-edit approval**); non-edit gates collapse `partial` to `allow`.
452
+
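As a rough illustration of handling the object form exhaustively, the sketch below collapses `partial` to `allow` for a non-edit gate. The union here is a simplified local copy (the full `ApprovalDecision` type is only partly visible in this diff and may have more members), and `collapseForNonEditGate` is a hypothetical helper name.

```ts
// Simplified local copy for illustration; the real union may have more members.
type ApprovalDecision =
  | 'allow'
  | 'accept-safelist'
  | 'deny'
  | { kind: 'partial', mask: readonly boolean[] }

type StringDecision = Exclude<ApprovalDecision, object>

// Hypothetical helper: what a non-edit gate does with each decision.
function collapseForNonEditGate(decision: ApprovalDecision): StringDecision {
  if (typeof decision === 'object') {
    // Non-edit tools have no hunks, so the per-hunk mask is ignored.
    return 'allow'
  }
  switch (decision) {
    case 'allow':
    case 'accept-safelist':
    case 'deny':
      return decision
    default: {
      // Exhaustiveness check: this stops compiling if a new string member is added.
      const unreachable: never = decision
      return unreachable
    }
  }
}
```

Narrowing on `typeof decision === 'object'` first keeps the `switch` limited to string literals, so the `never` assignment catches any string member added later.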
453
+ `ApprovalOriginator` lets the modal show ` · child-N` attribution when a subagent's gate bubbles up through the parent's hook bus. The TUI's `applyGate` reads `ctx.childId` (set by `BUBBLED_MUTABLE_EVENTS` in `src/tools/spawn.ts`) and threads it through:
454
+
455
+ ```ts
456
+ const originator: ApprovalOriginator = ctx.childId
457
+ ? { kind: 'child', label: ctx.childId }
458
+ : { kind: 'parent' }
459
+ const decision = await requestApproval(name, input, originator)
460
+ ```
431
461
 
432
462
  ```tsx
433
463
  const { requestApproval, resolveHead, denyAll } = useSafeModeActions()
@@ -441,7 +471,7 @@ const pending = queue[0] ?? null
441
471
 
442
472
  `accept-safelist` calls `addToSafelist(dataDir, projectDir, suggestSafelistEntry(tool, input))` so subsequent calls with the same shape skip the gate. The safelist lives at `<dataDir>/projects.json` (user dir, never the project dir). `partial` never writes a safelist entry — the safelist key is the tool name + an arg shape, with no hunk-level scope.
443
473
 
444
- **Parallel-call deny semantics**: each pending approval resolves on its own. A single `deny` does **not** cascade through the queue — the model receives `Blocked: User denied this tool call` as that one tool's result and the turn continues; other parallel approvals stay queued and prompt independently. The user's explicit "stop everything" gesture is the host-level `esc abort run` shortcut, which calls `denyAll()` + aborts the active `agent.run()` in one shot.
474
+ **Parallel-call deny semantics**: each pending approval resolves on its own. A single `deny` does **not** cascade through the queue — the gate handler sets `ctx.block = true` + `ctx.reason = 'User denied this tool call'` for THAT call and returns, so the model receives `Blocked: User denied this tool call` as that one tool's result and the turn continues; other parallel approvals stay queued and prompt independently. The user's explicit "stop everything" gesture is the host-level `esc abort run` shortcut, which calls `cancelRunOnDenial()` → `denyAll()` + `agent.abort()` in one shot.
445
475
 
446
476
  ## Interactions
447
477
 
@@ -575,16 +605,16 @@ interface HunkResolution {
575
605
 
576
606
  Wire:
577
607
 
578
- - `tool:before` hook → read pre-write content for `write_file` (other edit tools carry it in the input) → call `extractEditPayload` → attach the result as `StreamEvent.edit` on the `'tool'` event. The hook also runs `readEditOutcomes(input, hunks.length)` so any `_outcomes` previously injected by the gate ride through onto `payload.outcomes`.
608
+ - `tool:before` hook → read pre-write content for `write_file` (other edit tools carry it in the input) → call `extractEditPayload` → attach the result as `StreamEvent.edit` on the `'tool'` event. The host skips this default emit when the gate already painted a synthetic event for a partial / fully-denied call (otherwise the transcript would show one row with full hunks + a second with the reduced subset).
579
609
  - Renderer reads `event.edit` and renders accordingly. When `payload.outcomes` is populated, each hunk row carries an applied / denied / skipped / failed badge; pass the array through `summarizeOutcomes` for the header tally.
580
- - Suppress the paired success `tool-result` (`isEditErrorResult` is the gate). The structured `multi_edit` body counts as "success" only when every edit applied (`Edited <path>: N/N applied · 0 denied · 0 skipped · 0 failed`). Mixed / failed bodies stay visible.
610
+ - Suppress the paired success `tool-result` (`isEditErrorResult` is the gate). Success means "every hunk applied": the result body is the legacy `Edited <path>: applied N edits (M replacements).` summary alone, no annotation block. Any presence of `<edit-outcomes>` (mixed / denied / failed) or a `[fully denied]` body keeps the result visible alongside the diff so the user reads the per-hunk reasons.
581
611
  - Historical replay (from `eventsFromTurns`) has no pre-write snapshot for `write_file`; the diff renders all-add, matching git's "new file" convention. Outcomes are reconstructed from the persisted `tool_result` body via `parseEditOutcomesFromResult` and re-attached to the `'tool'` event, so a reloaded transcript shows the same per-hunk badges live capture displayed.
582
612
 
583
613
  `theme.surfaces.diff` carries the row colors (`addBg`, `removeBg`, optional `*ContentBg`, `addFg`, `removeFg`); built-in themes pre-mix translucent reds/greens so terminals without alpha-blend get a legible effect, and CSS hosts can use the same hex values as `background-color`.
584
614
 
585
615
  ## Per-edit approval
586
616
 
587
- The `multi_edit` tool can accept or reject **individual hunks** instead of all-or-nothing. The user-side decision lives in `ApprovalDecision`'s `{ kind: 'partial', mask }` shape; the bridge to the tool body is a host-only side channel on `input._outcomes` (`OUTCOMES_INPUT_KEY`); the result body carries the post-apply state back in a stable line shape that replay can re-parse.
617
+ Edit-family tools (`edit`, `multi_edit`, `write_file`) can accept or reject **individual hunks** instead of all-or-nothing. The user-side decision lives in `ApprovalDecision`'s `{ kind: 'partial', mask }` shape; the bridge is purely host-side — the harness stays pure (single-mode atomic `multi_edit` body, no side channels on `tool_call.input`). The host's `tool:gate` handler rebinds `ctx.input.edits` to the approved subset before the body runs; a paired `tool:transform` hook appends an `<edit-outcomes>` annotation block to the result so the renderer can paint per-hunk badges live + on replay.
588
618
 
589
619
  ```ts
590
620
  type EditOutcomeKind = 'applied' | 'denied' | 'skipped' | 'failed' | 'pending'
@@ -608,11 +638,9 @@ interface EditPayload {
608
638
  import {
609
639
  resolveApprovalForPayload,
610
640
  maskToOutcomeKinds,
611
- injectOutcomesIntoInput,
641
+ buildEditOutcomesAnnotation,
612
642
  parseEditOutcomesFromResult,
613
643
  summarizeOutcomes,
614
- OUTCOMES_INPUT_KEY,
615
- readEditOutcomes,
616
644
  type ResolvedApproval,
617
645
  } from 'zidane/chat'
618
646
  ```
@@ -621,11 +649,9 @@ import {
621
649
  |---|---|
622
650
  | `resolveApprovalForPayload(decision, payload)` | Turn an `ApprovalDecision` into `{ outcomes, shouldBlock, syntheticEvent }`. Pure — does not mutate `input` / `payload`. |
623
651
  | `maskToOutcomeKinds(mask, fallbackLength, deniedReason?)` | Convert a boolean mask to a 1:1 `EditOutcome[]`. Missing entries default to `applied` (no-decision = keep). |
624
- | `injectOutcomesIntoInput(input, outcomes)` | In-place writer that stamps `input._outcomes` so the tool body honors per-hunk decisions. |
625
- | `parseEditOutcomesFromResult(text)` | Re-parse the structured `multi_edit` body back into outcomes. Used by `eventsFromTurns` so replay shows the same badges as live capture. Returns `null` for legacy / unstructured bodies. |
652
+ | `buildEditOutcomesAnnotation(outcomes)` | Render an `EditOutcome[]` as the wire-format `<edit-outcomes>…</edit-outcomes>` block — body APPENDed to a tool result (joined with `\n\n`). Idempotent on missing reasons. |
653
+ | `parseEditOutcomesFromResult(text)` | Re-parse the annotation block back into outcomes. Used by `eventsFromTurns` so replay shows the same badges as live capture. Returns `null` when the block is missing / malformed. |
626
654
  | `summarizeOutcomes(outcomes)` | `{ applied, denied, skipped, failed, pending, total }` tally for the header badge. |
627
- | `readEditOutcomes(input, length)` | Read `input._outcomes` back off the tool call's input — padded to `length` with `applied`. Used by `extractEditPayload` to thread outcomes onto the `tool:before` event. |
628
- | `OUTCOMES_INPUT_KEY` (`'_outcomes'`) | Side-channel field name. **Not** in the tool's JSON `inputSchema` — the model never sees it. |
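The mask-to-outcomes and tally rows of the table can be sketched as follows. Both functions are hypothetical stand-ins written from the table descriptions alone; in particular, treating `fallbackLength` as the hunk count and using `'denied by user'` as the default reason are assumptions.

```ts
// Hypothetical stand-ins; written from the table descriptions, not the library source.
type EditOutcomeKind = 'applied' | 'denied' | 'skipped' | 'failed' | 'pending'

interface EditOutcome {
  kind: EditOutcomeKind
  reason?: string
}

// Convert a boolean mask to a 1:1 outcome array.
// Assumption: `fallbackLength` is the hunk count; entries missing from the mask stay `applied`.
function maskToOutcomes(
  mask: readonly boolean[],
  fallbackLength: number,
  deniedReason = 'denied by user',
): EditOutcome[] {
  const length = Math.max(mask.length, fallbackLength)
  return Array.from({ length }, (_, i): EditOutcome =>
    mask[i] === false ? { kind: 'denied', reason: deniedReason } : { kind: 'applied' })
}

// Count outcomes per kind for a header badge.
function tallyOutcomes(outcomes: readonly EditOutcome[]) {
  const counts = { applied: 0, denied: 0, skipped: 0, failed: 0, pending: 0, total: outcomes.length }
  for (const outcome of outcomes) counts[outcome.kind] += 1
  return counts
}
```

For example, `tallyOutcomes(maskToOutcomes([true, false], 3))` counts two applied hunks and one denied hunk, because the third hunk has no mask entry and defaults to `applied`.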
629
655
 
630
656
  ```ts
631
657
  interface ResolvedApproval {
@@ -638,23 +664,45 @@ interface ResolvedApproval {
638
664
  }
639
665
  ```
640
666
 
667
+ ### Wire format
668
+
669
+ Canonical annotation block — emitted ONLY when at least one hunk is NOT applied. All-applied calls fall through to the legacy `Edited <path>: applied N edits (M replacements).` summary alone.
670
+
671
+ ```
672
+ Edited src/foo.ts: applied 2 edits (3 replacements).
673
+
674
+ <edit-outcomes>
675
+ #1 applied
676
+ #2 denied: denied by user
677
+ #3 applied
678
+ </edit-outcomes>
679
+ ```
680
+
681
+ - Opening + closing tags each on their own line.
682
+ - One line per hunk: `#<1-based-index> <kind>[: <reason>]`.
683
+ - `kind ∈ {applied | denied | skipped | failed}`.
684
+ - Indexes are 1-based against the **model's original** `edits` array — `parseEditOutcomesFromResult` re-keys to a 0-based `EditOutcome[]`.
685
+
686
+ A fully-denied call (every hunk rejected) skips the substitute path entirely: the harness writes `Blocked: User denied this tool call` to the persisted tool_result, and the host emits a synthetic `tool-result` event with body `[fully denied] <edit-outcomes>…</edit-outcomes>` for live display only. `isEditErrorResult` keeps that synthetic body visible alongside the diff.
687
+
641
688
  ### Gate handler
642
689
 
643
- End-to-end example showing how a host wires a `tool:gate` handler against the helpers:
690
+ End-to-end pattern for a host wiring `tool:gate` against the helpers. The harness stays pure: the handler **rebinds** `ctx.input` (does not mutate the model's `tool_call.input` reference on the persisted assistant turn) and stashes outcomes for a paired `tool:transform` to append on the way out:
644
691
 
645
692
  ```ts
646
693
  import {
694
+ buildEditOutcomesAnnotation,
647
695
  extractEditPayload,
648
- injectOutcomesIntoInput,
649
696
  resolveApprovalForPayload,
650
697
  toolCallPreview,
651
698
  } from 'zidane/chat'
652
699
 
700
+ const pendingAnnotations = new Map<string, readonly EditOutcome[]>()
701
+
653
702
  agent.hooks.hook('tool:gate', async (ctx) => {
654
703
  if (ctx.block) return // upstream already refused
655
704
 
656
- const decision = await requestApproval(ctx.name, ctx.input)
657
-
705
+ const decision = await requestApproval(ctx.name, ctx.input, originator)
658
706
  const editPayload = extractEditPayload(ctx.name, ctx.input)
659
707
  if (!editPayload) {
660
708
  if (decision === 'deny') {
@@ -676,6 +724,12 @@ agent.hooks.hook('tool:gate', async (ctx) => {
676
724
  edit: resolved.syntheticEvent,
677
725
  ...(ctx.turnId ? { turnId: ctx.turnId } : {}),
678
726
  })
727
+ stream.appendImmediate({
728
+ kind: 'tool-result',
729
+ text: `[fully denied] ${buildEditOutcomesAnnotation(resolved.outcomes)}`,
730
+ tool: ctx.name,
731
+ ...(ctx.turnId ? { turnId: ctx.turnId } : {}),
732
+ })
679
733
  }
680
734
  ctx.block = true
681
735
  ctx.reason = 'User denied this tool call'
@@ -683,30 +737,47 @@ agent.hooks.hook('tool:gate', async (ctx) => {
683
737
  }
684
738
 
685
739
  if (resolved.syntheticEvent) {
686
- injectOutcomesIntoInput(ctx.input, resolved.outcomes) // partial: tool body skips denied hunks
740
+ // Rebind: fresh shallow clone whose `edits` is the approved subset.
741
+ // The model's original `tool_call.input` on the persisted turn stays
742
+ // intact because the rebind produces a new object.
743
+ const reducedEdits = reduceEditsByOutcomes(ctx.input.edits, resolved.outcomes)
744
+ ctx.input = { ...ctx.input, edits: reducedEdits }
745
+ pendingAnnotations.set(ctx.callId, resolved.outcomes)
687
746
  }
688
- // accept-* (or partial that collapsed to all-applied) → fall through, no mutation.
747
+ // accept-* (or partial that collapsed to all-applied) → fall through.
748
+ })
749
+
750
+ agent.hooks.hook('tool:transform', (ctx) => {
751
+ const outcomes = pendingAnnotations.get(ctx.callId)
752
+ if (!outcomes) return
753
+ pendingAnnotations.delete(ctx.callId)
754
+ const annotation = buildEditOutcomesAnnotation(outcomes)
755
+ ctx.result = typeof ctx.result === 'string'
756
+ ? (ctx.result.length === 0 ? annotation : `${ctx.result}\n\n${annotation}`)
757
+ : [...ctx.result, { type: 'text', text: `\n${annotation}` }]
689
758
  })
759
+
760
+ // Same handler registered on `child:tool:transform` so subagent calls
761
+ // also get annotated — `BUBBLED_MUTABLE_EVENTS` shares the child ctx.
762
+ agent.hooks.hook('child:tool:transform', /* same body */)
690
763
  ```
691
764
 
765
+ Wipe `pendingAnnotations` on `agent:done` (covers completed / aborted / error paths) — `tool:transform` doesn't fire when `validation:reject` or a throwing `tool:before` synthesizes the tool_result, so entries can otherwise strand across runs. The TUI also clears the map defensively in its session-teardown path (`pendingAnnotationsRef.current.clear()`).
766
+
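The gate handler above calls `reduceEditsByOutcomes` without showing it. One plausible shape is sketched below, together with the rebind step; the signature and the "missing outcome means keep" rule are assumptions, since the real helper is not part of this diff.

```ts
// Hypothetical sketch; the real helper's signature is not shown in this diff.
interface EditOutcome {
  kind: 'applied' | 'denied' | 'skipped' | 'failed' | 'pending'
}

// Keep a hunk when its outcome is `applied`, or when no outcome was recorded for it.
function reduceEditsByOutcomes<T>(edits: readonly T[], outcomes: readonly EditOutcome[]): T[] {
  return edits.filter((_, i) => (outcomes[i]?.kind ?? 'applied') === 'applied')
}

// Rebind, not mutate: the spread creates a new object, so the original input
// (standing in for the model's persisted `tool_call.input`) is left untouched.
const originalInput = { path: 'src/foo.ts', edits: ['hunk-1', 'hunk-2', 'hunk-3'] }
const gateOutcomes: EditOutcome[] = [{ kind: 'applied' }, { kind: 'denied' }, { kind: 'applied' }]
const reboundInput = {
  ...originalInput,
  edits: reduceEditsByOutcomes(originalInput.edits, gateOutcomes),
}
```

`Array.prototype.filter` returns a new array, so neither the original object nor its `edits` array is modified by the rebind.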
692
767
  ### `multi_edit` tool body shape
693
768
 
694
- The tool body branches on `_outcomes` presence:
769
+ Single-mode atomic — the harness has no notion of approvals. Applies every edit in input order against the file as left by the previous step. First tool-level failure aborts the batch with the legacy `multi_edit error: edit #N <reason>` string; success returns `Edited <path>: applied N edits (M replacements).`. SDK consumers (CI agents, headless harnesses, pipelines parsing the result) see exactly the legacy contract.
695
770
 
696
- - **Gated** (host injected `_outcomes`): never aborts on per-edit tool-level failure. Each step is reported into a structured body via `formatMultiEditReport`:
771
+ When a host rebinds `ctx.input.edits` to the approved subset at the gate (above), the body sees a smaller all-approved batch and still runs the same atomic semantics. The per-hunk outcomes ride into the wire / persisted history as the appended `<edit-outcomes>` block — the model and any downstream parser see one self-describing tool_result.
697
772
 
698
- ```
699
- Edited src/foo.ts: 2/3 applied · 1 denied · 0 skipped · 0 failed
700
- #1 applied: replaced 2 occurrences
701
- #2 denied: denied by user
702
- #3 applied: replaced 1 occurrence
703
- ```
773
+ ### Modal lifecycle
704
774
 
705
- Anchored line shape ` #N <outcome>[: <detail>]` keeps `parseEditOutcomesFromResult` trivially regex-able.
775
+ The chat layer makes no assumption about how the host renders the approval surface. Two contracts a GUI must honor:
706
776
 
707
- - **Ungated** (programmatic / SDK call, no `_outcomes`): original pre-2026 atomic semantics. First tool-level failure aborts the batch with the legacy `multi_edit error: edit #N <reason>` string; success returns the legacy `Edited <path>: applied N edits (M replacements).` summary. SDK consumers (CI agents, headless harnesses, pipelines parsing the result) keep working unchanged.
777
+ - **Display the head of `useSafeModeQueue()` only.** The queue is FIFO; parallel tool calls prompt in arrival order. Resolve with `resolveHead(decision)` to pop the head; never resolve other entries directly, since the underlying `Promise` resolvers are owned by the provider.
778
+ - **Key the rendered surface on `request.id`.** Back-to-back approvals in the same queue tick must force-remount the modal so per-call UI state (per-hunk mask, focused row, zone, etc.) doesn't carry across. The TUI's `ChatScreen` does this with `<FileEditApprovalModal key={fileEditPending.id} … />` (`src/tui/screens.tsx`).
708
779
 
709
- The two modes are mutually exclusive: `_outcomes` flips the contract. The model never opts in: `_outcomes` is omitted from the tool's JSON `inputSchema`, so the LLM can't write it.
780
+ For hosts that route some decisions through a separate modal stack (the TUI inlines `FileEditApprovalModal` in the transcript slot while non-file-edit gates render as an `ApprovalBlock` below the prompt), push an empty placeholder onto the global modal context for the inline surface so `useModalAwareFocus()` keeps releasing background focus — the focus contract is "any modal-aware surface = background blurred", and an inline modal is still a modal by that definition.
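The FIFO contract in the list above can be sketched with a small queue class. `ApprovalQueue` is a hypothetical, framework-free stand-in for whatever the safe-mode provider does internally; only the names `requestApproval`, `resolveHead`, and `denyAll` come from the docs, and the two-value `Decision` type is a simplification.

```ts
// Hypothetical stand-in for the provider's queue; simplified decision type.
type Decision = 'allow' | 'deny'

interface ApprovalRequest {
  id: string
  tool: string
  resolve: (decision: Decision) => void
}

class ApprovalQueue {
  private queue: ApprovalRequest[] = []
  private nextId = 0

  // The Promise executor runs synchronously, so the request is queued before this returns.
  requestApproval(tool: string): Promise<Decision> {
    return new Promise<Decision>((resolve) => {
      this.queue.push({ id: `req-${this.nextId++}`, tool, resolve })
    })
  }

  // The only entry a UI should render.
  head(): ApprovalRequest | null {
    return this.queue[0] ?? null
  }

  // Pop the head and settle its promise; later entries stay queued.
  resolveHead(decision: Decision): void {
    const request = this.queue.shift()
    request?.resolve(decision)
  }

  // "Stop everything": settle every pending request as denied.
  denyAll(): void {
    for (const request of this.queue.splice(0)) request.resolve('deny')
  }
}
```

A renderer would key its modal on `head()?.id`, so that when `resolveHead` promotes the next request the surface remounts with fresh per-call state.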
710
781
 
711
782
  ## Tool call display
712
783
 
@@ -1198,7 +1269,7 @@ const text = turnAsText(turn)
1198
1269
 
1199
1270
  **Render an edit diff** — call `extractEditPayload` in `tool:before` (passing pre-write content for `write_file`), persist on the `'tool'` `StreamEvent.edit` field, render via `buildUnifiedDiff(payload)` + `filetypeFromPath(payload.path)` for syntax highlighting. Theme via `theme.surfaces.diff`. For a renderer-faithful preview (matches the tool body's lenient resolver, real file line numbers, per-hunk resolvability metadata), use `previewEditPayload(payload, priorContent, contextLines?)` and pass `result.diffText` to the diff renderable; consult `result.resolution[i].resolved` to decide whether to badge the hunk as unresolvable.
1200
1271
 
1201
- **Wire per-edit approval** — in the `tool:gate` handler, call `extractEditPayload(name, input)`. If it returns a payload, route the decision through `resolveApprovalForPayload(decision, payload)`: on `shouldBlock`, emit `resolved.syntheticEvent` as a `'tool'` event with `outcomes` (so the transcript shows the denied diff) and set `ctx.block = true`; on partial-accept, call `injectOutcomesIntoInput(ctx.input, resolved.outcomes)` so the `multi_edit` body skips denied hunks. The tool result body's structured shape (`Edited <path>: N/M applied · …`) is parsed back by `parseEditOutcomesFromResult` during replay, so reloaded transcripts re-attach the same per-hunk badges.
1272
+ **Wire per-edit approval** — in the `tool:gate` handler, call `extractEditPayload(name, input)`. If it returns a payload, route the decision through `resolveApprovalForPayload(decision, payload)`: on `shouldBlock`, emit `resolved.syntheticEvent` as a `'tool'` event (so the transcript shows the intended diff with denied badges), emit a paired `'tool-result'` event with body `[fully denied] ${buildEditOutcomesAnnotation(resolved.outcomes)}` for live display, and set `ctx.block = true`; on partial-accept, **rebind** `ctx.input` to a fresh shallow-clone whose `edits` is the approved subset and stash `resolved.outcomes` in a per-callId pending-annotation map. A paired `tool:transform` handler reads back the map and appends `buildEditOutcomesAnnotation(outcomes)` to `ctx.result`, so the wire / persisted history carries the per-hunk decisions for replay. Register the same handler on `child:tool:transform` for subagent-issued calls. Wipe the map on `agent:done` to catch paths where `tool:transform` never fires (validation reject, throwing `tool:before`). See **Per-edit approval** for the full gate handler skeleton.
1202
1273
 
1203
1274
  **Drive a model picker** — `buildModelCatalog(providers, modelsFor, currentPick)` + `filterModelCatalog(catalog, query)` + `indexOfEntry`. Gate an effort sub-picker on `modelSupportsReasoning(descriptor, modelId)`.
1204
1275
 
package/docs/SKILL.md CHANGED
@@ -217,7 +217,7 @@ Alias only when semantically equivalent. `shell → Bash` is safe; `list_files
217
217
  | `readFile` | Line range, default `offset=1, limit=2000`, 256 KiB cap. Paging footer; binary marker. |
218
218
  | `writeFile` | Returns `Created` / `Updated` / `No change needed: …` for no-op detection. |
219
219
  | `edit` | Surgical `old_string` → `new_string`. Clear errors on non-unique (unless `replace_all`) / not-found (with nearest-match preview). |
220
- | `multiEdit` | Sequential edits to one file. Atomic by default; the chat layer's per-edit gate flips it into per-hunk outcome reporting via the `_outcomes` side channel (see `docs/CHAT.md`). |
220
+ | `multiEdit` | Sequential edits to one file. Single-mode atomic: first failure aborts with `multi_edit error: edit #N <reason>`; success returns `Edited <path>: applied N edits (M replacements).`. Per-edit approval is a host concern: the chat layer's `tool:gate` rebinds `ctx.input.edits` to the user-approved subset before the body runs, then a `tool:transform` hook appends an `<edit-outcomes>` annotation to the result so the renderer can paint per-hunk badges (see `docs/CHAT.md`). |
221
221
  | `listFiles` | Directory listing. |
222
222
  | `glob` | `**`, `*`, `?` pattern matching via Bun.Glob; shells out in docker/sandbox. |
223
223
  | `grep` | ripgrep + Bun.Glob fallback. Full Claude Code Grep semantics. `head_limit=250`, `offset` paginates. |
@@ -324,11 +324,12 @@ Mutable ctx fields: `tool:gate` (`block`, `reason`, `result`), `tool:transform`
324
324
  ```
325
325
  child:stream:text / child:stream:thinking / child:stream:end
326
326
  child:tool:gate / child:mcp:tool:gate ← mutable: block/reason/result propagate to the child
327
+ child:tool:transform ← mutable: parent can rewrite the child's tool_result
327
328
  child:tool:before / child:tool:after / child:tool:error
328
329
  child:turn:after
329
330
  ```
330
331
 
331
- Render nested activity without listening on the child instance. The `child:*:gate` events are unique: they share the same `ctx` reference the child's loop awaits on, so a parent gate handler can refuse / substitute a child's tool call without registering on the child agent.
332
+ Render nested activity without listening on the child instance. The `child:*:gate` and `child:tool:transform` events are special: they share the same `ctx` reference the child's loop awaits on (see `BUBBLED_MUTABLE_EVENTS` in `src/tools/spawn.ts`), so a parent gate handler can refuse / substitute / annotate a child's tool call without registering on the child agent. The chat layer's per-edit annotation flow relies on `child:tool:transform` to append `<edit-outcomes>` blocks onto a subagent's `multi_edit` / `edit` / `write_file` results.
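
To illustrate why sharing the `ctx` reference matters, here is a small self-contained model of bubbling a mutable event to a parent. `TinyHooks` and `runChildTool` are hypothetical names invented for this sketch, and the emit is synchronous for brevity; it does not reproduce `BUBBLED_MUTABLE_EVENTS` or the real hook system.

```typescript
// Hypothetical context shape; the real ctx carries more fields.
type TransformCtx = { name: string; result: string; childId?: string }
type Handler = (ctx: TransformCtx) => void

// Minimal event registry, synchronous to keep the sketch short.
class TinyHooks {
  private handlers = new Map<string, Handler[]>()

  on(event: string, handler: Handler): void {
    const list = this.handlers.get(event) ?? []
    list.push(handler)
    this.handlers.set(event, list)
  }

  emit(event: string, ctx: TransformCtx): void {
    for (const handler of this.handlers.get(event) ?? []) handler(ctx)
  }
}

// The child emits locally, then bubbles the SAME object to the parent
// under a child: prefix, so parent mutations are visible to the child.
function runChildTool(child: TinyHooks, parent: TinyHooks, childId: string): string {
  const ctx: TransformCtx = { name: 'multi_edit', result: 'Edited a.ts: applied 1 edits (1 replacements).' }
  child.emit('tool:transform', ctx)
  ctx.childId = childId // assumed attribution field
  parent.emit('child:tool:transform', ctx)
  return ctx.result // reflects whatever the parent handler wrote
}
```

If the child bubbled a copy instead of the original object, the parent's rewrite would be lost, which is the distinction the paragraph above draws between mutable and read-only bubbled events.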
332
333
 
333
334
  ### Hook recipes
334
335
 
package/docs/TUI.md CHANGED
@@ -274,9 +274,9 @@ Hosts adding a new tool: extend `TOOL_DISPLAY` in `zidane/chat` with a `{ displa
274
274
 
275
275
  `Settings.showEditDiffs` (default **on**) renders `edit` / `multi_edit` / `write_file` tool calls as a unified diff via the native `<diff>` renderable. Wire path:
276
276
 
277
- - `tool:before` hook reads pre-write content for `write_file` (`edit` / `multi_edit` carry the old text in their input), then calls `extractEditPayload(name, input, priorContent)` from `zidane/chat`. The resulting `EditPayload` rides on the `'tool'` `StreamEvent.edit` field. `extractEditPayload` also reads back any `_outcomes` previously injected by the gate, so `payload.outcomes` rides through onto the transcript event.
277
+ - `tool:before` hook reads pre-write content for `write_file` (`edit` / `multi_edit` carry the old text in their input), then calls `extractEditPayload(name, input, priorContent)` from `zidane/chat`. The resulting `EditPayload` rides on the `'tool'` `StreamEvent.edit` field. The default emit short-circuits when `pendingAnnotationsRef.current.has(callId)` is true: the gate handler already painted a synthetic `'tool'` event with the FULL hunks + outcomes, so emitting again with the reduced input would paint a misleading second row.
278
278
  - `EditDiffBlock` consumes that payload and calls `buildUnifiedDiff(payload)` + `filetypeFromPath(payload.path)` for syntax highlighting. When `payload.outcomes` is populated (live partial-approval or persisted replay), each hunk row carries an applied / denied / skipped / failed badge and the header line shows the tally via `summarizeOutcomes`.
279
- - The paired `tool-result` is suppressed (`isEditErrorResult` is the gate) only for **all-applied** outcomes. The new structured `multi_edit` body shape (`Edited <path>: N/M applied · K denied · …`) stays visible whenever any hunk wasn't applied, so the user reads the per-edit reasons next to the diff. Errors (`Edit error: …`, `Tool failed: …`, legacy `multi_edit error: …`) always bypass suppression.
279
+ - The paired `tool-result` is suppressed (`isEditErrorResult` is the gate) only for **all-applied** outcomes, where the result body is the legacy `Edited <path>: applied N edits (M replacements).` summary with no annotation. Presence of an `<edit-outcomes>…</edit-outcomes>` block (mixed / denied / failed) or a `[fully denied] …` body keeps the row visible so the user reads the per-hunk reasons next to the diff. Errors (`Edit error: …`, `Tool failed: …`, `multi_edit error: …`) always bypass suppression.
280
280
  - Historical replay from persisted turns has no pre-write snapshot for `write_file` — the diff renders all-add, matching git's "new file" convention. Outcomes are reconstructed from the persisted `tool_result` body via `parseEditOutcomesFromResult` and re-attached, so reloaded transcripts show the same per-hunk badges live capture displayed.
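
The visibility rules in the list above can be approximated with a small predicate. This is a sketch based only on the body shapes quoted in this section; `keepResultRowVisible` is a hypothetical name and is not the library's `isEditErrorResult`, whose exact checks are not shown here.

```typescript
// Error prefixes quoted in the docs; these always keep the row visible.
const ERROR_PREFIXES = ['Edit error:', 'Tool failed:', 'multi_edit error:']

// Returns true when the paired tool-result row should stay on screen.
function keepResultRowVisible(body: string): boolean {
  const text = body.trimStart()
  // Errors bypass suppression.
  if (ERROR_PREFIXES.some(prefix => text.startsWith(prefix))) return true
  // A fully denied call carries its own marker.
  if (text.startsWith('[fully denied]')) return true
  // Any per-hunk annotation block means at least one hunk was not plainly applied.
  return /<edit-outcomes>[\s\S]*<\/edit-outcomes>/.test(text)
}
```

Under these assumptions a plain `Edited <path>: applied N edits (M replacements).` summary is the only `multi_edit` body that gets hidden.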
281
281
 
282
282
  Diff row colors come from `theme.surfaces.diff` (`addBg`, `removeBg`, optional `*ContentBg` for a deeper content-column hue, `addFg` / `removeFg` for the gutter glyphs). Built-in themes pre-mix translucent reds/greens against their primary surface so terminals without true alpha-blend still read cleanly.
@@ -289,15 +289,19 @@ File-edit tools (`edit` / `multi_edit` / `write_file`) get their own approval su
289
289
  - **`MultiEditApprovalModal`** (multi-step `multi_edit`, ≥ 2 hunks) — per-hunk toggle list + focused-hunk diff panel + action bar. List zone: `↑` / `↓` move cursor, `space` toggles the focused hunk, `y` / `n` set all on/off. Actions zone: `←` / `→` cycle, `↵` confirm. `tab` cycles between zones. `a` / `s` / `p` / `d` bulk shortcuts work in either zone. The first action label is dynamic — `Apply all`, `Apply N/M`, `Nothing selected` — and submits one of:
290
290
  - All hunks on → bulk decision (`accept-once` / `accept-session` / `accept-safelist`) preserving the safelist path identical to single-edit.
291
291
  - All hunks off → `'deny'`.
292
- - Mixed mask → `{ kind: 'partial', mask }` — `gateDecision` calls `resolveApprovalForPayload` + `injectOutcomesIntoInput` to bake the mask into `input._outcomes`. See [Per-edit approval in CHAT.md](./CHAT.md#per-edit-approval) for the helper contract.
292
+ - Mixed mask → `{ kind: 'partial', mask }` — `applyGate` calls `resolveApprovalForPayload`, then **rebinds** `ctx.input` to a fresh shallow-clone whose `edits` is the approved subset (the model's original `tool_call.input` on the persisted assistant turn stays untouched) and stashes the full-length `EditOutcome[]` in `pendingAnnotationsRef.current` keyed by `ctx.callId`. The paired `tool:transform` / `child:tool:transform` hook reads back the entry and appends a `buildEditOutcomesAnnotation(outcomes)` block to `ctx.result`, so the wire / persisted history carries the per-hunk decisions for replay. See [Per-edit approval in CHAT.md](./CHAT.md#per-edit-approval) for the full contract.
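
As a rough illustration of the mixed-mask case, the sketch below turns a mask into an approved subset plus a full-length outcome list while leaving the original input untouched. The `Edit`, `MultiEditInput`, and `GateOutcome` shapes and the `applyMask` helper are assumptions for this example; the real flow goes through `resolveApprovalForPayload`.

```typescript
// Hypothetical shapes for illustration only.
type Edit = { old_string: string; new_string: string }
type MultiEditInput = { path: string; edits: Edit[] }
type GateOutcome = { index: number; status: 'approved' | 'denied' }

// Builds a fresh input holding only the approved edits (rebind, not mutate)
// and one outcome per ORIGINAL hunk so indices stay aligned with the model's call.
function applyMask(
  input: MultiEditInput,
  mask: boolean[],
): { input: MultiEditInput; outcomes: GateOutcome[] } {
  if (mask.length !== input.edits.length) {
    throw new Error('mask length must match edits length')
  }
  const outcomes = mask.map((on, index): GateOutcome => ({
    index,
    status: on ? 'approved' : 'denied',
  }))
  // Shallow clone: new object and new array, same edit objects.
  const rebound: MultiEditInput = {
    ...input,
    edits: input.edits.filter((_, i) => mask[i] === true),
  }
  return { input: rebound, outcomes }
}
```

A gate handler would then assign the returned input to `ctx.input` and stash the outcomes under `ctx.callId`; because the original object is never modified, the persisted assistant turn still shows every hunk the model asked for.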
293
293
 
294
- **Inline mount**. The modal renders inside `ChatScreen`'s transcript slot via `flexGrow: 1`; the transcript stays mounted with `visible: !fileEditPending` so its memoized turn anchors and scrollbox state survive the open/close cycle. The prompt + footer stay anchored at the bottom. `ChatScreen` pushes an empty `<></>` placeholder onto the global modal context so `useModalAwareFocus()` keeps releasing background focus and the app-level `esc` / `ctrl+s` shortcuts stay gated; the actual UI is sibling, not stacked.
294
+ **Inline mount**. The modal renders inside `ChatScreen`'s transcript slot via `flexGrow: 1`. The transcript stays mounted with `visible: !fileEditPending`; OpenTUI maps that to Yoga's `Display.None`, removing the box from layout entirely so the sibling modal claims the slot via its own `flexGrow: 1`. Memoized turn anchors, scrollbox position, and lazy markdown chrome all survive the open/close round-trip. The prompt + footer stay anchored at the bottom (queue-only while `busy`). `ChatScreen` pushes an empty `<></>` placeholder onto the global modal context so `useModalAwareFocus()` keeps releasing background focus and the app-level `esc` / `ctrl+s` shortcuts stay gated; the actual UI is sibling, not stacked.
295
295
 
296
- **Title layout**. Left-aligned title rides the top border (`edit approval` / `multi-edit approval`). The right-aligned filename + ` · N/M selected` suffix paints via a sibling `<text style={{ position: 'absolute', top: 0, right: 1 }}>`: OpenTUI's box scissor rect excludes the border row, so a child node can't paint over the top border (same trick `TitleOverlay` and `CompletionPopup` use). `rightTitleFilename(targetPath, termWidth, leftLen, suffixLen)` budgets the basename so the suffix never gets pushed off the right edge on a narrow terminal.
296
+ **Force-remount on queue advance**. The modal is keyed on `request.id` (`<FileEditApprovalModal key={fileEditPending.id} />`). Without the key, `MultiEditApprovalModal`'s `mask` / `cursor` / `zone` and `SingleEditApprovalModal`'s `selected` would carry over from the previous call when the safe-mode queue pops one approval and exposes the next. React re-uses the component instance otherwise and the user would see a checkbox list whose state matches the prior modal.
297
+
298
+ **Originator attribution**. The right-side title overlay appends ` · child-N` when `request.originator` is `{ kind: 'child', label: 'child-N' }`. `applyGate` builds the originator from `ctx.childId` (set by `BUBBLED_MUTABLE_EVENTS` for subagent-issued gates) and passes it as the third arg to `requestApproval`. Parent calls show no suffix.
299
+
300
+ **Title layout**. Left-aligned title rides the top border (`edit approval` / `multi-edit approval`). The right-aligned filename + ` · child-N` + ` · N/M selected` suffix paints via a sibling `<text style={{ position: 'absolute', top: 0, right: 1 }}>` — OpenTUI's box scissor rect excludes the border row, so a child node can't paint over the top border (same trick `TitleOverlay` and `CompletionPopup` use). `rightTitleFilename(targetPath, termWidth, leftLen, suffixLen)` budgets the basename so the suffixes never get pushed off the right edge on a narrow terminal.
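
One way to picture the basename budgeting is the sketch below. It takes the same four arguments the docs list for `rightTitleFilename`, but the body is a guess: `budgetBasename`, the 4-cell margin, and the head-truncation with `…` are assumptions, and it treats every character as one terminal cell.

```typescript
// Hypothetical budgeting helper; not the library's rightTitleFilename.
function budgetBasename(
  targetPath: string,
  termWidth: number,
  leftLen: number,
  suffixLen: number,
): string {
  // Keep only the last path segment; fall back to the whole path if it ends in '/'.
  const base = targetPath.split('/').pop() || targetPath
  // Reserve the left title, the suffixes, and an assumed 4-cell margin for borders and padding.
  const budget = termWidth - leftLen - suffixLen - 4
  if (budget <= 0) return '' // no room: drop the filename, keep the suffixes
  if (base.length <= budget) return base
  if (budget === 1) return '…'
  // Truncate from the head so the extension stays readable.
  return '…' + base.slice(base.length - (budget - 1))
}
```

Because the suffix length is subtracted before the basename is measured, the filename is the part that shrinks on a narrow terminal while the ` · child-N` and ` · N/M selected` suffixes stay intact.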
297
301
 
298
302
  **Lenient preview**. Both layouts run the focused hunk's diff through `previewEditPayload(payload, priorContent, contextLines)` from `zidane/chat` — the model-faithful resolver (curly-quote recovery, line-number-prefix stripping, model-side `<n>`→`<name>` desanitize) so what the user previews matches what the tool would actually apply. Unresolved hunks (resolver couldn't locate `old_string`, or matched ambiguously without `replace_all`) render an `UnresolvedHunkPanel` instead of a blank diff: red-bordered, headlined with the failure mode, showing the model's intended `old_string` / `new_string` so the user can deny intelligently. The multi-edit list flags the same rows with a red `⚠` glyph.
299
303
 
300
- `isFileEditTool(tool)` (exported from `file-edit-approval-modal.tsx`) is the gate routing predicate. The set is `{'edit', 'multi_edit', 'write_file'}`; everything else stays on the inline `ApprovalBlock` path.
304
+ `isFileEditTool(tool)` (exported from `file-edit-approval-modal.tsx`) is the gate routing predicate. The set is `{'edit', 'multi_edit', 'write_file'}`; everything else stays on the inline `ApprovalBlock` path. For a fully-denied file-edit call, `applyGate` skips the substitute path entirely — sets `ctx.block = true` + emits a synthetic `tool-result` event with body `[fully denied] <edit-outcomes>…</edit-outcomes>` for live display. The persisted result stays the terse `Blocked: User denied this tool call` the harness writes.
301
305
 
302
306
  ## Settings rows
303
307
 
@@ -354,6 +358,19 @@ File discovery uses `git ls-files --cached --others --exclude-standard` when ava
354
358
 
355
359
  Adding a third provider: build it in `zidane/chat` against `CompletionProvider<TItem>`, pass it alongside the others in `completionProviders` — the popup picks it up automatically.
356
360
 
361
+ ## Transcript scrolling
362
+
363
+ `<Transcript>` mounts a `<scrollbox>` with OpenTUI's native auto-pin: `stickyScroll` + `stickyStart="bottom"`. The scrollbox stays pinned to the bottom while content grows, and detaches the moment the user scrolls up — re-attaches as soon as they hit the bottom edge again. This is the path that keeps streamed text + tool results glued under the cursor while a run is busy without manual `requestAnimationFrame` glue in React.
364
+
365
+ Two scroll effects layer on top, both keyed off React state:
366
+
367
+ - **Auto-pin** (native): OpenTUI handles the `[items, busy]` → "stay at bottom" loop. No React work; survives a 60 fps stream without forcing re-layouts.
368
+ - **`selectedTurnId` effect** (React): `useEffect([selectedTurnId, anchors, ownership])` calls `scrollbox.scrollChildIntoView(id)` via `requestAnimationFrame` so OpenTUI's layout pass has settled before measuring. Snaps to `scrollHeight` for the last turn (or a turn that owns the last result-only row) so a tall response's tail stays visible.
369
+
370
+ The two coexist — selecting a turn jumps to it; exiting select-turn mode (`esc`) plus the next streamed delta re-engages the native sticky-bottom.
371
+
372
+ `Settings.smoothStreaming` (default **on**) drip-feeds streamed text at a smooth cadence (typewriter) instead of in stream bursts; the auto-pin loop tracks each delta naturally because each tick still grows the scrollbox content.
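
The pin, detach, and re-attach behaviour described in this section can be modelled in a few lines. This is a conceptual sketch of sticky-bottom scrolling in general, with a hypothetical `StickyScrollModel` class; it does not call or reproduce OpenTUI's `<scrollbox>` implementation.

```typescript
// Conceptual model of a bottom-pinned scroll area, measured in rows.
class StickyScrollModel {
  viewportHeight: number
  contentHeight = 0
  scrollTop = 0
  private pinned = true // starts attached to the bottom

  constructor(viewportHeight: number) {
    this.viewportHeight = viewportHeight
  }

  private maxScroll(): number {
    return Math.max(0, this.contentHeight - this.viewportHeight)
  }

  // Content grew (for example a streamed delta): follow the tail only while pinned.
  grow(rows: number): void {
    this.contentHeight += rows
    if (this.pinned) this.scrollTop = this.maxScroll()
  }

  // User scroll: detach when leaving the bottom edge, re-attach on reaching it again.
  scrollBy(delta: number): void {
    const next = Math.min(this.maxScroll(), Math.max(0, this.scrollTop + delta))
    this.scrollTop = next
    this.pinned = next >= this.maxScroll()
  }

  isPinned(): boolean {
    return this.pinned
  }
}
```

In this model a typewriter-style stream is just a sequence of small `grow` calls, which is why per-tick deltas keep the view at the bottom without extra bookkeeping.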
373
+
357
374
  ## Select-turn mode
358
375
 
359
376
  Press `ctrl+s` on the chat screen (idle, ≥ 1 turn, no pending approval / interaction) to enter select-turn mode:
@@ -458,7 +475,7 @@ src/tui/
458
475
  index.tsx runTui + public exports
459
476
  app.tsx App + AppShell + ThemedShell — provider stack + agent lifecycle
460
477
  screens.tsx AuthScreen, SessionsScreen, ChatScreen, wizard steps, PromptBlock, ApprovalBlock, QueuedMessagesBlock
461
- components.tsx Transcript, EventLine, MarkdownBlock, SubagentBlock, Footer, Spinner, StatusSpinner, TitleOverlay, EditDiffBlock (outcomes-aware: per-hunk applied/denied/skipped/failed badges via `summarizeOutcomes`), ToolCallBlock, UserPromptBlock
478
+ components.tsx Transcript (OpenTUI `stickyScroll` + `stickyStart="bottom"` for auto-pin; `selectedTurnId` scroll effect on top), EventLine, MarkdownBlock, SubagentBlock, Footer, Spinner, StatusSpinner, TitleOverlay, EditDiffBlock (outcomes-aware: per-hunk applied/denied/skipped/failed badges via `summarizeOutcomes`), ToolCallBlock, UserPromptBlock
462
479
  modal.tsx ModalRoot + Modal + useModalAwareFocus
463
480
  model-picker.tsx Cross-provider searchable model list modal
464
481
  effort-picker.tsx Reasoning-effort modal
@@ -468,7 +485,7 @@ src/tui/
468
485
  mcps-settings.tsx MCP servers list + toggle modal (standalone for embedders)
469
486
  session-details-modal.tsx Stats + delete / export / title / compact (ctrl+x)
470
487
  turn-details-modal.tsx Fork / delete / copy (opened from select-turn mode)
471
- file-edit-approval-modal.tsx FileEditApprovalModal + SingleEditApprovalModal + MultiEditApprovalModal + UnresolvedHunkPanel. Inline-mounted in ChatScreen's transcript slot — bridges ApprovalDecision (including { kind: 'partial', mask }) to the gate via the helpers in `zidane/chat`'s `edit-approval` module.
488
+ file-edit-approval-modal.tsx FileEditApprovalModal + SingleEditApprovalModal + MultiEditApprovalModal + UnresolvedHunkPanel + `originatorSuffix` (` · child-N` attribution). Inline-mounted in ChatScreen's transcript slot, keyed on `request.id` for force-remount on queue advance — bridges ApprovalDecision (including { kind: 'partial', mask }) to the gate via the helpers in `zidane/chat`'s `edit-approval` module.
472
489
  interaction-block.tsx InteractionBlock + plan picker + question wizard
473
490
  toggle-list-modal.tsx Generic checkbox-list modal (ToggleListModal)
474
491
  completion-popup.tsx Provider-agnostic autocomplete popover
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "zidane",
3
- "version": "5.1.20",
3
+ "version": "5.1.21",
4
4
  "description": "an agent that goes straight to the goal",
5
5
  "type": "module",
6
6
  "private": false,