npm - zidane - Versions diffs - 5.4.0 → 5.4.1 - Mend

zidane 5.4.0 → 5.4.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (29)

package/dist/chat.d.ts +59 -2
package/dist/chat.d.ts.map +1 -1
package/dist/chat.js +2 -2
package/dist/index-CrqFoaQA.d.ts.map +1 -1
package/dist/index.js +3 -3
package/dist/{login-bK0EP8La.js → login-8c5C0FYq.js} +2 -2
package/dist/{login-bK0EP8La.js.map → login-8c5C0FYq.js.map} +1 -1
package/dist/{presets-M8f6lDnW.js → presets-Ck4VusTo.js} +2 -2
package/dist/{presets-M8f6lDnW.js.map → presets-Ck4VusTo.js.map} +1 -1
package/dist/presets.js +1 -1
package/dist/{tools-DKdyPoUf.js → tools-PQH1Ge4M.js} +95 -22
package/dist/tools-PQH1Ge4M.js.map +1 -0
package/dist/tools.js +1 -1
package/dist/{transcript-anchors-Fgh_rZ04.d.ts → transcript-anchors-ByB2MSCB.d.ts} +17 -2
package/dist/transcript-anchors-ByB2MSCB.d.ts.map +1 -0
package/dist/tui.d.ts +2 -2
package/dist/tui.d.ts.map +1 -1
package/dist/tui.js +52 -23
package/dist/tui.js.map +1 -1
package/dist/{turn-operations-DDokWR8p.js → turn-operations-Bqs4YbbH.js} +128 -4
package/dist/turn-operations-Bqs4YbbH.js.map +1 -0
package/docs/ARCHITECTURE.md +6 -5
package/docs/CHAT.md +21 -3
package/docs/SKILL.md +1 -1
package/docs/TUI.md +2 -2
package/package.json +1 -1
package/dist/tools-DKdyPoUf.js.map +0 -1
package/dist/transcript-anchors-Fgh_rZ04.d.ts.map +0 -1
package/dist/turn-operations-DDokWR8p.js.map +0 -1

package/docs/ARCHITECTURE.md CHANGED

@@ -187,18 +187,19 @@ Built-in tools are opinionated about output sizes — drop your v2 `tool:transfo
 | `shell` | Tail-priority truncation at `maxOutputBytes=32768` (32 KiB, combined stdout+stderr). Head trim marker: `…(N bytes truncated from head)…`. `0` disables. UTF-8 never splits mid-codepoint. Appends `(exit N, Nms)` footer + surfaces non-empty stderr by default (`metadata: false` opts out). |
 | `write_file` | Reads existing content; returns `Created` / `Updated` / `No change needed: …` so the model detects no-ops without a separate read. Race window in shared docker/sandbox contexts documented and accepted. |
 | `edit` | Fails clearly on non-unique `old_string` (unless `replace_all: true`). On not-found, includes a nearest-match preview so the model recovers without re-reading. |
-| `multi_edit` | Sequential edits to one file. **Single-mode atomic** — applies every edit in input order against the file as left by the previous step; first tool-level failure aborts the batch with the legacy `multi_edit error: edit #N <reason>` string, success returns `Edited <path>: applied N edits (M replacements).`. The tool body knows nothing about approvals: per-hunk decisions are a host concern enforced upstream — the chat layer's `tool:gate` rebinds `ctx.input.edits` to the approved subset (rebind, not mutate — the model's original `tool_call` block in `session.turns` stays untouched) before the body runs, and a `tool:transform` hook appends an `<edit-outcomes>` block to the result so live + replayed transcripts share the same per-hunk view (`parseEditOutcomesFromResult` re-parses on reload). See CHAT.md → Per-edit approval. |
+| `multi_edit` | Sequential edits to one file. **Best-effort, per-hunk** — each step runs against the file as left by the previous APPLIED step; a per-step failure (`old_string` not found, ambiguous without `replace_all`, identical strings, malformed input) is reported in the result but doesn't block siblings. Writes the file iff at least one step applied. All-applied returns the legacy `Edited <path>: applied N edits (M replacements).` so the renderer's success-suppression keeps the diff alone; any failure emits `Edited <path>: applied N of M edits …` (or `multi_edit error: no edits applied to <path> (M attempted).` when nothing landed) plus per-failure lines + an `<edit-outcomes>` block. Approval is a host concern: the chat layer's `tool:gate` rebinds `ctx.input.edits` to the approved subset (rebind, not mutate — the model's original `tool_call` block in `session.turns` stays untouched), and a `tool:transform` hook merges the body's subset-keyed outcomes with the approval-side denied/skipped entries (`mergeApprovalAndBodyOutcomes`) before appending the canonical `<edit-outcomes>` block. `parseEditOutcomesFromResult` re-parses it on reload. See CHAT.md → Per-edit approval. |
 | `grep` | Wraps `rg` when present (with explicit `.` path to avoid stdin hangs). Bun.Glob fallback otherwise. `head_limit=250`, `offset` paginates. |
 ## Per-edit approval — harness purity
-The harness has no notion of "partial approval". Edit-family tools (`edit`, `multi_edit`, `write_file`) are single-mode and atomic; per-hunk decisions live entirely above the loop. The split is deliberate — SDK consumers (CI agents, headless pipelines, custom hosts) keep the legacy contract; the per-hunk UX is purely a chat-layer concern.
+The harness has no notion of "partial approval"; per-hunk DENIAL lives entirely above the loop. `edit` / `write_file` are single-mode (one hunk, succeed-or-error). `multi_edit` is best-effort, per-hunk on the BODY side — it reports `applied` / `failed` per step in its result text and is the only edit tool whose body emits an `<edit-outcomes>` block on its own (when any step failed). The chat layer merges that body-side block with approval-side decisions before the canonical block lands on the wire.
-Three loop-visible artifacts carry the decision through:
+Four loop-visible artifacts carry the decision through:
 1. **Input rebind at `tool:gate`** — the host's gate handler (`src/tui/app.tsx`'s `applyGate`) computes the approved subset, then assigns a fresh shallow-clone to `ctx.input` whose `edits` array is filtered. The model's original `tool_call.input` in `session.turns` is never mutated (the rebind produces a new object); the tool body sees the smaller, all-approved batch and runs unchanged.
 2. **Pending-annotation map** — keyed by `tool_call.id`, holds the 1:1 `EditOutcome[]` (over the model's ORIGINAL hunks). Lives in the host's React tree (`pendingAnnotationsRef` in the TUI).
-3. **`tool:transform` annotation** — the host appends an `<edit-outcomes>` block to `ctx.result` when at least one hunk wasn't applied. Bubbles to `child:tool:transform` for subagent-issued calls via `BUBBLED_MUTABLE_EVENTS`.
+3. **Body-side outcomes** — `multi_edit`'s best-effort body emits an `<edit-outcomes>` block in its result whenever any step failed, keyed against the approved SUBSET it actually saw (in subset-position order).
+4. **`tool:transform` annotation merge** — the host strips the body's block (when present), merges it into the approval-side 1:1 outcomes via `mergeApprovalAndBodyOutcomes` (each approval `applied` placeholder is replaced by the body's next outcome), rewrites the body's header so subset-relative `N of M edits` becomes original-total (`rewriteMultiEditHeader`), and appends the canonical merged block. The merged outcomes also flow back into the in-flight `'tool'` event (`updateToolEventOutcomes`) so live diff badges reflect the post-merge truth without waiting for a session reload. Bubbles to `child:tool:transform` for subagent-issued calls via `BUBBLED_MUTABLE_EVENTS`.
 **Wire format** (canonical shape for live emit + persisted replay):
@@ -214,7 +215,7 @@ Edited path/to/foo.ts: applied 2 edits (3 replacements).
 - Opening + closing tags each on their own line.
 - One line per hunk: `#<1-based-index> <kind>[: <reason>]`.
-- `kind ∈ {applied | denied | skipped | failed}`.
+- `kind ∈ {applied | denied | skipped | failed}`. `failed` originates in `multi_edit`'s body (per-step tool-level rejection); `denied` / `skipped` originate at the gate; `applied` is the resolved positive outcome.
 - Block emitted ONLY when at least one hunk is NOT applied. All-applied calls fall through to the legacy summary alone.
 A fully-denied call (every hunk rejected) skips the substitute path; the harness writes `Blocked: User denied this tool call` as the tool_result and the host emits a synthetic `tool-result` event with body `[fully denied] <edit-outcomes>…</edit-outcomes>` for live display only (persisted history stays terse).
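
The wire-format rules above (tags on their own lines, one `#<1-based-index> <kind>[: <reason>]` line per hunk) can be read back with a few lines of TypeScript. This is a minimal sketch, not zidane's `parseEditOutcomesFromResult`: the names `parseOutcomesSketch`, `Outcome`, and `OutcomeKind` are hypothetical, the reason text in the sample is made up, and returning `null` on any malformed line is an assumption of the sketch.

```ts
// Hypothetical names throughout; this is NOT zidane's parseEditOutcomesFromResult.
type OutcomeKind = 'applied' | 'denied' | 'skipped' | 'failed'

interface Outcome {
  index: number // 1-based, exactly as written on the wire
  kind: OutcomeKind
  reason?: string
}

const KINDS: readonly string[] = ['applied', 'denied', 'skipped', 'failed']

function parseOutcomesSketch(text: string): Outcome[] | null {
  const lines = text.split('\n')
  // Only the FIRST block is considered; tags must sit on their own lines.
  const open = lines.indexOf('<edit-outcomes>')
  if (open === -1) return null
  const close = lines.indexOf('</edit-outcomes>', open + 1)
  if (close === -1) return null

  const outcomes: Outcome[] = []
  for (const line of lines.slice(open + 1, close)) {
    const m = /^#(\d+) ([a-z]+)(?:: (.*))?$/.exec(line)
    if (m === null) return null // assumption: one bad line rejects the block
    const [, index, kind, reason] = m
    if (!KINDS.includes(kind)) return null
    const outcome: Outcome = { index: Number(index), kind: kind as OutcomeKind }
    if (reason !== undefined) outcome.reason = reason
    outcomes.push(outcome)
  }
  return outcomes
}

// Illustrative tool result: summary line, blank line, then the block.
const sample = [
  'Edited path/to/foo.ts: applied 2 edits (3 replacements).',
  '',
  '<edit-outcomes>',
  '#1 applied',
  '#2 denied: user rejected',
  '#3 applied',
  '</edit-outcomes>',
].join('\n')

const parsed = parseOutcomesSketch(sample)
```

A result with no block (the all-applied case) yields `null` here, so a caller can fall back to the legacy summary line alone.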

package/docs/CHAT.md CHANGED

@@ -712,7 +712,12 @@ Wire:
 ## Per-edit approval
-Edit-family tools (`edit`, `multi_edit`, `write_file`) can accept or reject **individual hunks** instead of all-or-nothing. The user-side decision lives in `ApprovalDecision`'s `{ kind: 'partial', mask }` shape; the bridge is purely host-side — the harness stays pure (single-mode atomic `multi_edit` body, no side channels on `tool_call.input`). The host's `tool:gate` handler rebinds `ctx.input.edits` to the approved subset before the body runs; a paired `tool:transform` hook appends an `<edit-outcomes>` annotation block to the result so the renderer can paint per-hunk badges live + on replay.
+Edit-family tools (`edit`, `multi_edit`, `write_file`) can accept or reject **individual hunks** instead of all-or-nothing. The user-side decision lives in `ApprovalDecision`'s `{ kind: 'partial', mask }` shape. Two layers cooperate:
+- **`multi_edit` body** is best-effort: a per-step failure (`old_string` not found, ambiguous, identical strings) is recorded against that step alone — siblings still run. The body emits an `<edit-outcomes>` block in its result whenever any step failed, keyed against the approved SUBSET it actually saw.
+- **The host** rebinds `ctx.input.edits` to the approved subset at `tool:gate` and (in `tool:transform`) merges the body's subset-keyed outcomes with its 1:1 approval-side decisions via `mergeApprovalAndBodyOutcomes`, then re-appends the canonical annotation block.
+Net effect: the canonical wire-format `<edit-outcomes>` block carries `applied` / `denied` / `skipped` / `failed` 1:1 with the model's ORIGINAL edits, for live + replay capture alike.
 ```ts
 type EditOutcomeKind = 'applied' | 'denied' | 'skipped' | 'failed' | 'pending'
@@ -738,6 +743,8 @@ import {
   maskToOutcomeKinds,
   buildEditOutcomesAnnotation,
   parseEditOutcomesFromResult,
+  mergeApprovalAndBodyOutcomes,
+  stripEditOutcomesAnnotation,
   summarizeOutcomes,
   type ResolvedApproval,
 } from 'zidane/chat'
@@ -749,6 +756,9 @@ import {
 | `maskToOutcomeKinds(mask, fallbackLength, deniedReason?)` | Convert a boolean mask to a 1:1 `EditOutcome[]`. Missing entries default to `applied` (no-decision = keep). |
 | `buildEditOutcomesAnnotation(outcomes)` | Render an `EditOutcome[]` as the wire-format `<edit-outcomes>…</edit-outcomes>` block — body APPENDed to a tool result (joined with `\n\n`). Idempotent on missing reasons. |
 | `parseEditOutcomesFromResult(text)` | Re-parse the annotation block back into outcomes. Used by `eventsFromTurns` so replay shows the same badges as live capture. Returns `null` when the block is missing / malformed. |
+| `mergeApprovalAndBodyOutcomes(approval, body)` | Fold a body-side subset-keyed outcome list (e.g. from `multi_edit`'s best-effort body) into the approval-side 1:1 outcomes — each `applied` placeholder is replaced by the body's next outcome in subset order. Returns a fresh array. |
+| `stripEditOutcomesAnnotation(text)` | Peel the first `<edit-outcomes>` block (and one leading `\n\n` separator) out of a body. Used by `tool:transform` to remove a body-emitted block before re-appending the merged version — the parser is anchored on the FIRST block. |
+| `rewriteMultiEditHeader(text, merged, path)` | Rewrite a `multi_edit` body header so its `N of M edits` / `N attempted` counts reflect the merged outcomes (= model's original edit-list total), not the subset the body actually saw after gate rebinding. Preserves the body-side replacements count and pluralization. |
 | `summarizeOutcomes(outcomes)` | `{ applied, denied, skipped, failed, pending, total }` tally for the header badge. |
 ```ts
@@ -864,9 +874,17 @@ Wipe `pendingAnnotations` on `agent:done` (covers completed / aborted / error pa
 ### `multi_edit` tool body shape
-Single-mode atomic — the harness has no notion of approvals. Applies every edit in input order against the file as left by the previous step. First tool-level failure aborts the batch with the legacy `multi_edit error: edit #N <reason>` string; success returns `Edited <path>: applied N edits (M replacements).`. SDK consumers (CI agents, headless harnesses, pipelines parsing the result) see exactly the legacy contract.
+Best-effort, per-hunk — the harness has no notion of approvals, but the body itself records failures per-step instead of bailing on the first. Applies every edit in input order against the file as left by the previous APPLIED step; a per-step rejection (`old_string` not found, ambiguous without `replace_all`, identical, malformed) is recorded against THAT step alone, and siblings still run. The file is written iff at least one step applied.
+Result shapes:
+- **All applied** → legacy `Edited <path>: applied N edits (M replacements).` summary, NO annotation block. Renderer suppression (`isEditErrorResult`) keeps the diff alone.
+- **Mixed** → `Edited <path>: applied N of M edits (R replacements).` + per-failure lines (`edit #K failed: <reason>`) + `<edit-outcomes>` block keyed against the input the body saw.
+- **All failed** → `multi_edit error: no edits applied to <path> (M attempted).` + per-failure lines + `<edit-outcomes>` block. The `multi_edit error:` prefix is preserved so existing visibility / log filters keep working.
+SDK consumers (CI agents, headless harnesses, pipelines parsing the result) get a self-describing result on success AND on partial failure — re-issuing just the failed steps is a parse-and-resubmit instead of "re-run the whole batch".
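
The best-effort semantics described above can be sketched against an in-memory string. This is an illustration under stated assumptions, not the `multi_edit` body itself: `applyBestEffortSketch`, `EditStep`, and `StepResult` are hypothetical names, the failure reasons are placeholder wording, and treating an empty `old_string` as malformed is a choice made for the sketch.

```ts
// Hypothetical sketch of best-effort, per-step editing; not zidane's implementation.
interface EditStep {
  old_string: string
  new_string: string
  replace_all?: boolean
}

type StepResult =
  | { kind: 'applied'; replacements: number }
  | { kind: 'failed'; reason: string }

function applyBestEffortSketch(content: string, edits: EditStep[]) {
  let current = content // file as left by the previous APPLIED step
  const results: StepResult[] = []

  for (const edit of edits) {
    if (edit.old_string === '') {
      results.push({ kind: 'failed', reason: 'malformed input' })
      continue
    }
    if (edit.old_string === edit.new_string) {
      results.push({ kind: 'failed', reason: 'identical strings' })
      continue
    }
    // Non-overlapping occurrence count of old_string in the current content.
    const parts = current.split(edit.old_string)
    const count = parts.length - 1
    if (count === 0) {
      results.push({ kind: 'failed', reason: 'old_string not found' })
    } else if (count > 1 && !edit.replace_all) {
      results.push({ kind: 'failed', reason: 'ambiguous without replace_all' })
    } else {
      current = parts.join(edit.new_string)
      results.push({ kind: 'applied', replacements: count })
    }
  }

  // A failed step never blocks its siblings; write only if something applied.
  const shouldWrite = results.some(r => r.kind === 'applied')
  // 1-based indices of failed steps, ready for a resubmit.
  const failedSteps = results.flatMap((r, i) => (r.kind === 'failed' ? [i + 1] : []))
  return { content: current, results, shouldWrite, failedSteps }
}

const demo = applyBestEffortSketch('const a = 1\nconst b = 2\n', [
  { old_string: 'a = 1', new_string: 'a = 10' },
  { old_string: 'missing', new_string: 'x' },
  { old_string: 'const', new_string: 'let', replace_all: true },
])
```

In the demo, step 2 fails while steps 1 and 3 still land, so `demo.failedSteps` holds only `2` and a caller could re-issue that step alone.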
-When a host rebinds `ctx.input.edits` to the approved subset at the gate (above), the body sees a smaller all-approved batch and still runs the same atomic semantics. The per-hunk outcomes ride into the wire / persisted history as the appended `<edit-outcomes>` block — the model and any downstream parser see one self-describing tool_result.
+When a host rebinds `ctx.input.edits` to the approved subset at the gate (above), the body sees a smaller batch and records outcomes against THAT subset. The host's `tool:transform` hook merges the body's subset-keyed outcomes with the approval-side decisions via `mergeApprovalAndBodyOutcomes` and rewrites the body's header (`rewriteMultiEditHeader`) so its `N of M` counts reflect the model's original total, not the subset. The merged outcomes also flow into the in-flight `'tool'` event via `updateToolEventOutcomes` so the live diff badges catch up — no session reload needed.
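
The merge step can be pictured as follows: each approval-side `applied` placeholder takes the body's next outcome, in subset order. This is a minimal sketch assuming that rule only; `mergeSketch` is a hypothetical stand-in for `mergeApprovalAndBodyOutcomes`, whose real signature and edge-case handling are not shown in this diff, and keeping the placeholder when the body list runs short is an assumption.

```ts
// Hypothetical stand-in for the documented merge; not zidane's code.
type Kind = 'applied' | 'denied' | 'skipped' | 'failed'

interface Outcome {
  kind: Kind
  reason?: string
}

function mergeSketch(approval: Outcome[], body: Outcome[]): Outcome[] {
  let next = 0 // cursor into the body's subset-keyed outcomes
  return approval.map((outcome) => {
    // Gate-side decisions (denied / skipped) pass through untouched.
    if (outcome.kind !== 'applied') return { ...outcome }
    const fromBody = body[next++]
    // Assumption: if the body reported fewer outcomes, keep the placeholder.
    return fromBody ? { ...fromBody } : { ...outcome }
  })
}

// The user approved hunks 2, 4 and 5 of five; the rest were denied at the gate.
const mask = [false, true, false, true, true]
const approval: Outcome[] = mask.map((ok): Outcome => ({ kind: ok ? 'applied' : 'denied' }))

// The body saw only the three approved hunks; its second one failed.
const body: Outcome[] = [
  { kind: 'applied' },
  { kind: 'failed', reason: 'old_string not found' },
  { kind: 'applied' },
]

const merged = mergeSketch(approval, body)
const mergedKinds = merged.map(o => o.kind)
```

Here `mergedKinds` lines up 1:1 with the model's original five edits as denied, applied, denied, failed, applied, and `approval` itself is left unmodified because the sketch returns a fresh array.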
 ### Modal lifecycle

package/docs/SKILL.md CHANGED

@@ -217,7 +217,7 @@ Alias only when semantically equivalent. `shell → Bash` is safe; `list_files
 | `readFile` | Line range, default `offset=1, limit=2000`, 256 KiB cap. Paging footer; binary marker. |
 | `writeFile` | Returns `Created` / `Updated` / `No change needed: …` for no-op detection. |
 | `edit` | Surgical `old_string` → `new_string`. Clear errors on non-unique (unless `replace_all`) / not-found (with nearest-match preview). |
-| `multiEdit` | Sequential edits to one file. Single-mode atomic — first failure aborts with `multi_edit error: edit #N <reason>`; success returns `Edited <path>: applied N edits (M replacements).`. Per-edit approval is a host concern: the chat layer's `tool:gate` rebinds `ctx.input.edits` to the user-approved subset before the body runs, then a `tool:transform` hook appends an `<edit-outcomes>` annotation to the result so the renderer can paint per-hunk badges (see `docs/CHAT.md`). |
+| `multiEdit` | Sequential edits to one file. Best-effort, per-hunk — each step runs against the file as left by the previous APPLIED step; a per-step rejection is recorded against that step alone, siblings still run. All-applied → legacy `Edited <path>: applied N edits (M replacements).`; any failure → header + per-failure lines + `<edit-outcomes>` block. Per-edit approval is a host concern: the chat layer's `tool:gate` rebinds `ctx.input.edits` to the user-approved subset before the body runs, then a `tool:transform` hook merges body + approval outcomes and appends the canonical `<edit-outcomes>` annotation (see `docs/CHAT.md`). |
 | `listFiles` | Directory listing. |
 | `glob` | `**`, `*`, `?` pattern matching via Bun.Glob; shells out in docker/sandbox. |
 | `grep` | ripgrep + Bun.Glob fallback. Full Claude Code Grep semantics. `head_limit=250`, `offset` paginates. |

package/docs/TUI.md CHANGED

@@ -292,7 +292,7 @@ File-edit tools (`edit` / `multi_edit` / `write_file`) get their own approval su
 - **`MultiEditApprovalModal`** (multi-step `multi_edit`, ≥ 2 hunks) — per-hunk toggle list + focused-hunk diff panel + action bar. List zone: `↑` / `↓` move cursor, `space` toggles the focused hunk, `y` / `n` set all on/off. Actions zone: `←` / `→` cycle, `↵` confirm. `tab` cycles between zones. `a` / `s` / `p` / `d` bulk shortcuts work in either zone. The first action label is dynamic — `Apply all`, `Apply N/M`, `Nothing selected` — and submits one of:
   - All hunks on → bulk decision (`accept-once` / `accept-session` / `accept-safelist`) preserving the safelist path identical to single-edit.
   - All hunks off → `'deny'`.
-  - Mixed mask → `{ kind: 'partial', mask }` — `applyGate` calls `resolveApprovalForPayload`, then **rebinds** `ctx.input` to a fresh shallow-clone whose `edits` is the approved subset (the model's original `tool_call.input` on the persisted assistant turn stays untouched) and stashes the full-length `EditOutcome[]` in `pendingAnnotationsRef.current` keyed by `ctx.callId`. The paired `tool:transform` / `child:tool:transform` hook reads back the entry and appends a `buildEditOutcomesAnnotation(outcomes)` block to `ctx.result`, so the wire / persisted history carries the per-hunk decisions for replay. See [Per-edit approval in CHAT.md](./CHAT.md#per-edit-approval) for the full contract.
+  - Mixed mask → `{ kind: 'partial', mask }` — `applyGate` calls `resolveApprovalForPayload`, then **rebinds** `ctx.input` to a fresh shallow-clone whose `edits` is the approved subset (the model's original `tool_call.input` on the persisted assistant turn stays untouched) and stashes the full-length `EditOutcome[]` in `pendingAnnotationsRef.current` keyed by `ctx.callId`. The paired `tool:transform` / `child:tool:transform` hook reads the entry, MERGES with any body-emitted `<edit-outcomes>` block (`multi_edit`'s best-effort body emits one when any step failed) via `mergeApprovalAndBodyOutcomes`, strips the body's block, rewrites the body's header so `N of M` counts reflect the model's original total (`rewriteMultiEditHeader`), and appends the canonical merged annotation. The merged outcomes also flow into the in-flight `'tool'` event (`updateToolEventOutcomes` via `stream.flushAndUpdate`) so the live diff badges catch up to body-side failures the gate couldn't anticipate — no reload required. A partially-approved `multi_edit` whose body subsequently has a step-level failure surfaces as e.g. `[denied, applied, denied, failed, applied]` end-to-end — live AND on replay. See [Per-edit approval in CHAT.md](./CHAT.md#per-edit-approval) for the full contract.
 **Inline mount**. The modal renders inside `ChatScreen`'s transcript slot via `flexGrow: 1`. The transcript stays mounted with `visible: !fileEditPending` — OpenTUI maps that to Yoga's `Display.None`, removing the box from layout entirely so the sibling modal claims the slot via its own `flexGrow: 1`. Memoized turn anchors, scrollbox position, and lazy markdown chrome all survive the open → close round-trip. The prompt + footer stay anchored at the bottom (queue-only while `busy`). `ChatScreen` pushes an empty `<></>` placeholder onto the global modal context so `useModalAwareFocus()` keeps releasing background focus and the app-level `esc` / `ctrl+s` shortcuts stay gated; the actual UI is sibling, not stacked.
@@ -435,7 +435,7 @@ Each row carries an OAuth status badge driven by the `mcp-auth-state` reducer:
 | Badge | Status | Meaning |
 |---|---|---|
 | `✓ authed` | `{ kind: 'authed' }` | tokens stored + bootstrap connected |
-| `⚠ needs login` | `{ kind: 'needs-auth', reason: 'no-tokens' \| 'auto-promoted' }` | explicit OAuth config or 401 + RFC 9728 metadata detected |
+| `! needs login` | `{ kind: 'needs-auth', reason: 'no-tokens' \| 'auto-promoted' }` | explicit OAuth config or 401 + RFC 9728 metadata detected |
 | `… authorizing` | `{ kind: 'authorizing', url? }` | interactive login in flight |
 | `✗ <error>` | `{ kind: 'error', error }` | last login attempt failed; message is verbatim |
 | (blank) | `{ kind: 'idle' }` | nothing to say (stdio server, or not yet bootstrapped) |

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "zidane",
-  "version": "5.4.0",
+  "version": "5.4.1",
   "description": "an agent that goes straight to the goal",
   "type": "module",
   "private": false,