npm - zidane - Versions diffs - 4.0.2 → 4.1.4 - Mend

zidane 4.0.2 → 4.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (77)

package/README.md +196 -614
package/dist/agent-BoV5Twdl.d.ts +2347 -0
package/dist/agent-BoV5Twdl.d.ts.map +1 -0
package/dist/contexts-3Arvn7yR.js +321 -0
package/dist/contexts-3Arvn7yR.js.map +1 -0
package/dist/contexts.d.ts +2 -25
package/dist/contexts.js +2 -10
package/dist/errors-D1lhd6mX.js +118 -0
package/dist/errors-D1lhd6mX.js.map +1 -0
package/dist/index-28otmfLX.d.ts +400 -0
package/dist/index-28otmfLX.d.ts.map +1 -0
package/dist/index-BfSdALzk.d.ts +113 -0
package/dist/index-BfSdALzk.d.ts.map +1 -0
package/dist/index-DPsd0qwm.d.ts +254 -0
package/dist/index-DPsd0qwm.d.ts.map +1 -0
package/dist/index.d.ts +5 -95
package/dist/index.js +141 -271
package/dist/index.js.map +1 -0
package/dist/interpolate-CukJwP2G.js +887 -0
package/dist/interpolate-CukJwP2G.js.map +1 -0
package/dist/mcp-8wClKY-3.js +771 -0
package/dist/mcp-8wClKY-3.js.map +1 -0
package/dist/mcp.d.ts +2 -4
package/dist/mcp.js +2 -13
package/dist/messages-z5Pq20p7.js +1020 -0
package/dist/messages-z5Pq20p7.js.map +1 -0
package/dist/presets-Cs7_CsMk.js +39 -0
package/dist/presets-Cs7_CsMk.js.map +1 -0
package/dist/presets.d.ts +2 -43
package/dist/presets.js +2 -17
package/dist/providers-CX-R-Oy-.js +969 -0
package/dist/providers-CX-R-Oy-.js.map +1 -0
package/dist/providers.d.ts +2 -4
package/dist/providers.js +3 -23
package/dist/session/sqlite.d.ts +7 -12
package/dist/session/sqlite.d.ts.map +1 -0
package/dist/session/sqlite.js +67 -79
package/dist/session/sqlite.js.map +1 -0
package/dist/session-Cn68UASv.js +440 -0
package/dist/session-Cn68UASv.js.map +1 -0
package/dist/session.d.ts +2 -4
package/dist/session.js +3 -27
package/dist/skills.d.ts +3 -322
package/dist/skills.js +24 -47
package/dist/skills.js.map +1 -0
package/dist/stats-DoKUtF5T.js +58 -0
package/dist/stats-DoKUtF5T.js.map +1 -0
package/dist/tools-DpeWKzP1.js +3941 -0
package/dist/tools-DpeWKzP1.js.map +1 -0
package/dist/tools.d.ts +3 -95
package/dist/tools.js +2 -40
package/dist/tui.d.ts +533 -0
package/dist/tui.d.ts.map +1 -0
package/dist/tui.js +2004 -0
package/dist/tui.js.map +1 -0
package/dist/types-Bx_F8jet.js +39 -0
package/dist/types-Bx_F8jet.js.map +1 -0
package/dist/types.d.ts +4 -55
package/dist/types.js +4 -28
package/package.json +38 -4
package/dist/agent-BAHrGtqu.d.ts +0 -2425
package/dist/chunk-4ILGBQ23.js +0 -803
package/dist/chunk-4LPBN547.js +0 -3540
package/dist/chunk-64LLNY7F.js +0 -28
package/dist/chunk-6STZTA4N.js +0 -830
package/dist/chunk-7GQ7P6DM.js +0 -566
package/dist/chunk-IC7FT4OD.js +0 -37
package/dist/chunk-JCOB6IYO.js +0 -22
package/dist/chunk-JH6IAAFA.js +0 -28
package/dist/chunk-LNN5UTS2.js +0 -97
package/dist/chunk-PMCQOMV4.js +0 -490
package/dist/chunk-UD25QF3H.js +0 -304
package/dist/chunk-W57VY6DJ.js +0 -834
package/dist/sandbox-D7v6Wy62.d.ts +0 -28
package/dist/skills-use-DwZrNmcw.d.ts +0 -80
package/dist/types-Bai5rKpa.d.ts +0 -89
package/dist/validation-Pm--dQEU.d.ts +0 -185

package/README.md CHANGED

@@ -2,38 +2,34 @@
 # Zidane
-An agent that goes straight to the goal.
-Minimal TypeScript agent loop built with [Bun](https://bun.sh).
-Hook into every step using [hookable](https://github.com/unjs/hookable).
-Built to be embedded.
+An agent that goes straight to the goal. Minimal TypeScript agent loop built with [Bun](https://bun.sh), hookable via [hookable](https://github.com/unjs/hookable). Built to be embedded.
 ## Features
-A small, hookable core with sensible defaults so most consumers don't write a single hook. Built around three principles: **token discipline by default** (cache, dedup, compaction, byte-accounting), **self-healing on the fault paths** (auto-coerce args, hallucinated-tool fallback, error rewriting), and **provider parity** (server-side features on Anthropic, client-side equivalents everywhere else).
+Small, hookable core with sensible defaults. Three principles:
+- **Token discipline** — cache, dedup, compaction, byte-accounting.
+- **Self-healing fault paths** — auto-coerce args, hallucinated-tool fallback, error rewriting.
+- **Provider parity** — server-side features on Anthropic, client-side equivalents elsewhere.
-- 🧠 **Multi-provider, multi-auth** — Anthropic, OpenAI Codex, OpenRouter, Cerebras, plus a generic `openaiCompat` factory (Baseten, Fireworks, Groq, local servers). OAuth + API key, auto-refreshing tokens. Anthropic accepts opt-in `extraBetas` and `contextManagement` for first-party features.
-- 🪝 **Streaming, hookable turn loop** — text/thinking deltas, tool calls, MCP, sessions, skills, spawn, OAuth, validation, budgets — all observable (and most mutatable) via typed hook events. Per-request `system:transform` hook for runtime-derived prompt sections.
-- 🛠 **Tools first-class** — `shell`, `read_file`, `write_file`, `edit`, `multi_edit`, `glob`, `grep`, `spawn`, human-in-the-loop, plus any [MCP](https://modelcontextprotocol.io) server. Sequential or parallel, per-call gates (`tool:gate` with writable `block` / `result` / `runToolCounts`), validation auto-coerce (`"true"` → `true`), hallucinated-tool fallback (`tool:unknown`), error rewriting (`tool:error` → `result`). Optional **progressive disclosure** (`behavior.toolDisclosure: 'lazy'`) hides MCP schemas behind a `tool_search` native tool when context budget matters more than upfront discovery — gated server-side and in-loop so a model can't bypass.
-- ✂️ **Token-aware ergonomics** — paginated reads with a "how to page" footer, 8 KB tail-truncated `shell`, idempotent `write_file`; `outputBytes` surfaced on every tool/MCP hook. `behavior.toolOutputBudget` injects a "summarize" nudge when a turn's outputs exceed the cap; `behavior.toolBudgets` caps per-tool call counts (`'steer'` or `'block'`); `behavior.thinkingDecay` tapers reasoning budget per turn.
-- 🗜 **Context discipline** — auto-injected `cache_control` breakpoints (Anthropic + OpenRouter); server-side compaction via `context-management-2025-06-27` on Anthropic, `behavior.compactStrategy: 'tail'` on everyone else. Per-session `read_file` dedup + opt-in `requireReadBeforeEdit` guard kill stale-content edits; `behavior.dedupTools` generalizes the same pattern to arbitrary tools (`todowrite`, `execute_sql`, …).
-- 🎯 **Reasoning + structured output** — thinking levels (`off` / `minimal` / `low` / `medium` / `high` / `adaptive`) with optional exact budgets; force the final answer to a JSON Schema (Zod v4 interop), no brittle parsing.
-- 💾 **Sessions, skills, multimodal** — pluggable session stores (memory / SQLite / remote / file-map), incremental persistence; [Agent Skills](https://agentskills.io/specification) spec-aligned with `allowed-tools` enforcement + resume rehydration; images + documents via `PromptPart[]`, tools can return image blocks routed natively on vision providers or via companion messages elsewhere.
-- 🧵 **Sub-agents + execution contexts** — delegate to child agents with inherited or overridden preset (child events bubble to the parent); run tools in-process, Docker, or any `SandboxProvider` (E2B / Rivet / custom). Parallel MCP bootstrap with `agent.warmup()` + `eager: true` to hide cold starts.
-- 🧭 **Typed errors + 1000+ tests** — `AgentContextExceededError` / `AgentProviderError` / `AgentAbortedError` instead of sniffing strings. Suite runs in under 2s with mock providers + mock execution contexts, zero API keys.
+- 🧠 **Providers** — Anthropic, OpenAI Codex, OpenRouter, Cerebras, plus `openaiCompat` (Baseten, Fireworks, Groq, local). OAuth + API key with auto-refresh.
+- 🪝 **Hookable turn loop** — every text/thinking delta, tool call, MCP, session, skill, spawn, OAuth, validation, and budget event is observable and (mostly) mutable.
+- 🛠 **First-class tools** — `shell`, `read_file`, `write_file`, `edit`, `multi_edit`, `glob`, `grep`, `spawn`, human-in-the-loop, plus any [MCP](https://modelcontextprotocol.io) server. Per-call gates, arg auto-coerce, hallucinated-tool fallback, error rewriting. Lazy MCP disclosure via `tool_search`.
+- ✂️ **Token-aware** — paginated reads, tail-truncated `shell`, idempotent `write_file`; `outputBytes` everywhere. `toolOutputBudget`, `toolBudgets`, `thinkingDecay`.
+- 🗜 **Context discipline** — `cache_control` breakpoints; server-side compaction on Anthropic, client-side `compactStrategy: 'tail'` elsewhere. Per-session read dedup + `requireReadBeforeEdit`; generalized `dedupTools`.
+- 🎯 **Reasoning + structured output** — thinking levels with optional exact budgets; force final response to a JSON Schema (Zod v4 interop).
+- 💾 **Sessions, skills, multimodal** — pluggable stores, incremental persistence; [Agent Skills](https://agentskills.io/specification) spec; images + documents via `PromptPart[]`.
+- 🧵 **Sub-agents + execution contexts** — child events bubble to parent; run tools in-process, Docker, or any `SandboxProvider`.
+- 🧭 **Typed errors + 1000+ tests** — `AgentContextExceededError` / `AgentProviderError` / `AgentAbortedError`. Suite under 2s with mocks.
 ## Quickstart
 ```bash
 bun install
-bun run auth                                    # Anthropic + OpenAI Codex OAuth
+bun run auth                                    # Anthropic + OpenAI Codex OAuth (--openai / --anthropic to scope)
 bun start --prompt "create a hello world app"
 ```
-`auth` runs both OAuth flows by default. Pass `--openai` or `--anthropic` to authenticate only one provider; the npm script form works too, e.g. `npm run auth --openai`.
 ## Agent Setup
 ```ts
@@ -50,69 +46,65 @@ const stats = await agent.run({ prompt: 'build a REST API' })
 console.log(`Done in ${stats.turns} turns`)
 ```
-All options on `createAgent`:
+`createAgent` options:
 ```ts
 createAgent({
-  provider,                          // required: LLM provider
-  name: 'basic',                     // optional display name (shown in traces/logs)
-  system: 'You are a helpful...',    // default system prompt
-  tools: { shell, readFile },        // tool set (default: no tools)
-  toolAliases: { shell: 'Bash' },    // map canonical names to LLM-facing names
-  session,                           // session for persistence
-  behavior: {                        // agent-level defaults
-    toolExecution: 'parallel',       // or 'sequential' (default: parallel)
-    maxTurns: 50,                    // max loop iterations
-    maxTokens: 16384,                // max tokens per LLM response
-    thinkingBudget: 10240,           // exact thinking token budget
-    thinkingDecay: { afterTurn: 5, factor: 0.5, floor: 1024 }, // taper budget per run-relative turn
-    cache: true,                     // prompt-cache breakpoints on supported providers (default: true)
-    toolOutputBudget: 32768,         // soft per-turn cap on tool-output bytes (off by default)
-    dedupReads: true,                // dedup identical re-reads of the same file in `read_file` (default: true)
-    dedupTools: { todowrite: i => JSON.stringify(i.todos) }, // generic per-tool argument dedup
-    requireReadBeforeEdit: false,    // refuse `edit` / `multi_edit` against unread or stale files (default: false)
-    toolBudgets: { todowrite: { max: 6, onExceed: 'steer' } }, // per-tool soft call caps
-    compactStrategy: 'off',          // client-side tail compaction for non-Anthropic providers — 'off' | 'tail' (default: 'off')
-    compactThreshold: 131_072,       // bytes threshold that triggers tail compaction (default: 128 KiB)
-    compactKeepTurns: 4,             // trailing turns left intact during compaction (default: 4)
-    toolDisclosure: 'eager',         // 'eager' | 'lazy' — hide MCP schemas behind tool_search (default: 'eager')
-    toolSearch: { tool: true, limit: 20 }, // tune the auto-injected tool_search (no-op in eager mode)
+  provider,                          // required
+  name: 'basic',                     // display name (traces/logs)
+  system: 'You are a helpful...',
+  tools: { shell, readFile },        // default: {}
+  toolAliases: { shell: 'Bash' },    // canonical → LLM-facing names
+  session,
+  behavior: {
+    toolExecution: 'parallel',       // 'parallel' | 'sequential' (default: parallel)
+    maxTurns: 50,
+    maxTokens: 16384,
+    thinkingBudget: 10240,
+    thinkingDecay: { afterTurn: 5, factor: 0.5, floor: 1024 },
+    cache: true,                     // prompt-cache breakpoints
+    toolOutputBudget: 32768,         // soft per-turn byte cap (off by default)
+    dedupReads: true,                // dedup re-reads in `read_file`
+    dedupTools: { todowrite: i => JSON.stringify(i.todos) },
+    requireReadBeforeEdit: false,    // refuse edits against unread/stale files
+    toolBudgets: { todowrite: { max: 6, onExceed: 'steer' } },
+    compactStrategy: 'off',          // 'off' | 'tail' (non-Anthropic compaction)
+    compactThreshold: 131_072,       // 128 KiB
+    compactKeepTurns: 4,
+    toolDisclosure: 'eager',         // 'eager' | 'lazy' (hide MCP schemas behind tool_search)
+    toolSearch: { tool: true, limit: 20 },
   },
-  execution: createProcessContext(), // where tools run
-  mcpServers: [],                    // MCP tool servers
-  eager: true,                       // pre-warm MCP bootstrap in the background (default: false)
-  skills: {},                        // skills configuration
+  execution: createProcessContext(),
+  mcpServers: [],
+  eager: true,                       // pre-warm MCP in background
+  skills: {},
 })
 ```
-Presets are just `Partial<AgentOptions>` — spread them in, override any field:
+Presets are `Partial<AgentOptions>` — spread, override:
 ```ts
 createAgent({ ...basic, provider, system: 'be concise' })
 ```
-All options on `agent.run()`:
+`agent.run()` options:
 ```ts
 await agent.run({
-  prompt: 'your task',       // optional when session has existing turns
+  prompt: 'your task',       // optional when session has turns
   model: 'claude-opus-4-6',
   system: 'be concise',
   thinking: 'medium',        // off | minimal | low | medium | high
-  behavior: {                // per-run overrides
-    maxTurns: 10,
-    maxTokens: 4096,
-    thinkingBudget: 8192,
-  },
-  tools: {},                 // override tools for this run ({} = no tools)
-  images: [],                // base64 images
+  behavior: { maxTurns: 10, maxTokens: 4096, thinkingBudget: 8192 },
+  tools: {},                 // {} = no tools for this run
+  images: [],
   signal: abortController.signal,
 })
 ```
-`prompt` is optional when a session with existing turns is provided — the agent resumes from the last turn. This supports apps where the user message is persisted to the session before the agent runs (e.g. WebSocket → session → queue → agent).
+`prompt` is optional when the session already has turns — the agent resumes. Useful when the user message is persisted before the run (WebSocket → session → queue → agent).
-Precedence: `run.behavior` > `agent.behavior` > hardcoded defaults.
+Precedence: `run.behavior` > `agent.behavior` > defaults.
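
The precedence rule in the line above can be illustrated with a minimal sketch that is not part of the diffed README. `Behavior`, `DEFAULTS`, and `resolveBehavior` are hypothetical names, and the default values are placeholders borrowed from the example options, not zidane's documented defaults. The sketch only shows one plausible way a three-level override could be resolved with object spreads.

```ts
// Hypothetical subset of the behavior options; the real type has more fields.
interface Behavior {
  maxTurns: number
  maxTokens: number
  thinkingBudget: number
}

// Placeholder defaults for illustration only (assumed, not taken from zidane).
const DEFAULTS: Behavior = { maxTurns: 50, maxTokens: 16384, thinkingBudget: 10240 }

// Later spreads win: run-level fields override agent-level fields,
// which in turn override the defaults.
// Caveat: a key explicitly set to `undefined` would also override;
// a real implementation might filter such keys first.
function resolveBehavior(
  agent: Partial<Behavior> = {},
  run: Partial<Behavior> = {},
): Behavior {
  return { ...DEFAULTS, ...agent, ...run }
}

const resolved = resolveBehavior({ maxTurns: 20, maxTokens: 8192 }, { maxTurns: 10 })
console.log(resolved)
// maxTurns comes from the run, maxTokens from the agent, thinkingBudget from DEFAULTS
```

A field left unset at both the run and agent level falls through to the defaults, which is the behavior the precedence line describes.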
 ## CLI
@@ -130,7 +122,7 @@ bun start \
 ## Providers
-All providers accept runtime credentials via a params object. Env vars are fallbacks.
+All providers accept runtime credentials via params. Env vars are fallbacks.
 ### Anthropic
@@ -138,17 +130,15 @@ All providers accept runtime credentials via a params object. Env vars are fallb
 import { anthropic } from 'zidane/providers'
 anthropic({ apiKey: 'sk-ant-...' })
-anthropic({ access: 'sk-ant-oat-...' })                      // OAuth
-anthropic({ access: 'sk-ant-oat-...', refresh: '...', expires: Date.now() + 3600_000 }) // auto-refresh
-anthropic({ apiKey: '...', defaultModel: 'claude-sonnet-4-6' })
+anthropic({ access: 'sk-ant-oat-...', refresh: '...', expires: Date.now() + 3600_000 }) // OAuth + auto-refresh
-// Opt into first-party Anthropic betas + server-side context compaction:
+// First-party betas + server-side compaction:
 anthropic({
   apiKey: '...',
   extraBetas: [
-    'context-management-2025-06-27',     // server-side, token-accurate compaction
-    'token-efficient-tools-2026-03-28',  // ~4.5% output token reduction
-    'interleaved-thinking-2025-05-14',   // think between tool calls in one turn
+    'context-management-2025-06-27',     // token-accurate compaction
+    'token-efficient-tools-2026-03-28',  // ~4.5% output reduction
+    'interleaved-thinking-2025-05-14',   // think between tool calls
   ],
   contextManagement: {
     edits: [{
@@ -161,55 +151,22 @@ anthropic({
 })
 ```
-Fallback: `params.apiKey` > `params.access` > `ANTHROPIC_API_KEY` env > `.credentials.json`
+Fallback: `params.apiKey` > `params.access` > `ANTHROPIC_API_KEY` env > `.credentials.json`. `extraBetas` merge with OAuth defaults and de-dupe. `contextManagement` is sent as `context_management`; pair with the matching beta. Non-Anthropic equivalent: `behavior.compactStrategy: 'tail'`.
-`extraBetas` are merged with the OAuth defaults (`claude-code-20250219`, `oauth-2025-04-20`) and de-duped. `contextManagement` is sent on the request body as `context_management`; pair it with the `context-management-2025-06-27` beta. For non-Anthropic providers, see `behavior.compactStrategy: 'tail'` for the client-side fallback.
+`extraBodyParams` passes un-typed Messages API fields through (factory options win on collision). Use when Anthropic ships a beta before zidane has a knob. `openaiCompat` accepts the same field (e.g. `reasoning_effort`, `metadata`, OpenRouter `provider` routing).
-`extraBodyParams` is a generic forward-compat pass-through for un-typed Messages API fields. Spread into the request before the typed core, so explicit factory options always win on collision. Use it when Anthropic ships a new beta before zidane has a dedicated knob:
+### OpenRouter / OpenAI / Cerebras
 ```ts
-anthropic({
-  apiKey: '...',
-  extraBetas: ['some-future-beta'],
-  extraBodyParams: { future_field: { /* ... */ } },
-})
-```
-`openaiCompat` accepts the same `extraBodyParams` for OpenAI-style endpoints (e.g. `reasoning_effort`, `metadata`, OpenRouter `provider` routing).
-### OpenRouter
-```ts
-import { openrouter } from 'zidane/providers'
+import { openrouter, openai, cerebras } from 'zidane/providers'
 openrouter({ apiKey: 'sk-or-...', defaultModel: 'google/gemini-pro' })
-```
-Fallback: `params.apiKey` > `OPENROUTER_API_KEY` env
-### OpenAI
-```ts
-import { openai } from 'zidane/providers'
-openai()                                                // OpenAI Codex OAuth
-openai({ access: 'eyJ...', defaultModel: 'gpt-5.4' })
+openai()                                                                  // OpenAI Codex OAuth
 openai({ access: 'eyJ...', refresh: '...', expires: Date.now() + 3600_000, accountId: 'acct_123' })
-```
-Fallback: `params.apiKey` > `params.access` > `OPENAI_CODEX_API_KEY` env > `.credentials.json`
-Pass the full OAuth credential fields (`access`, `refresh`, `expires`, plus provider extras like `accountId`) to let the provider auto-refresh tokens without reading `.credentials.json`.
-### Cerebras
-```ts
-import { cerebras } from 'zidane/providers'
 cerebras({ apiKey: 'csk-...', defaultModel: 'zai-glm-4.7' })
 ```
-Fallback: `params.apiKey` > `CEREBRAS_API_KEY` env
+Fallbacks: `params.apiKey` > `params.access` (Codex) > `<PROVIDER>_API_KEY` env > `.credentials.json` (Codex). Pass full OAuth fields on `openai()` to auto-refresh without reading `.credentials.json`.
 ### OpenAI-compatible (custom endpoints)
@@ -222,265 +179,127 @@ openaiCompat({
   name: 'baseten',
   apiKey: process.env.BASETEN_API_KEY!,
   baseURL: process.env.BASETEN_PROXY_URL!,
-  authHeader: { name: 'Authorization', scheme: 'Api-Key' },  // vendor-specific scheme
+  authHeader: { name: 'Authorization', scheme: 'Api-Key' },  // vendor-specific
   capabilities: { vision: false, imageInToolResult: false },
-  cacheBreakpoints: false,                                    // set true only for endpoints that honor `cache_control`
+  cacheBreakpoints: false,                                    // true only when endpoint honors `cache_control`
 })
 ```
-`openrouter` and `cerebras` are thin wrappers around this factory with vendor defaults pinned. Reach for `openaiCompat` directly when adding a new backend instead of forking a bespoke provider.
+`openrouter` and `cerebras` are thin wrappers with vendor defaults pinned. Use `openaiCompat` directly for new backends.
 ### Prompt caching
-Enabled by default via `behavior.cache`. The provider inserts `cache_control: { type: 'ephemeral' }` markers on the three largest stable prefixes — system prompt, tool definitions, and the last message's final content block — so the shared prefix is served from cache across turns.
+`behavior.cache` (default on). `cache_control: { type: 'ephemeral' }` is inserted on three stable prefixes (system, last tool, last message's final block). Hits + writes surface on `TurnUsage.cacheRead` / `cacheCreation` via the `usage` hook.
 | Provider | Behavior |
 |---|---|
-| `anthropic` | Breakpoints honored natively. |
-| `openrouter` | Breakpoints forwarded; Anthropic + Gemini routes honor them, OpenAI / DeepSeek / Grok / Groq / Moonshot routes cache automatically and ignore the markers. |
-| `openaiCompat` | Opt-in via `cacheBreakpoints: true`. Default off so strict-schema endpoints (OpenAI direct, most OSS servers) don't reject unknown fields. |
-| `cerebras` | Off (factory doesn't enable breakpoints). |
-| `openai` (Codex) | Not affected — separate wire format (pi-ai). |
-Cache hits + writes land on `TurnUsage.cacheRead` / `TurnUsage.cacheCreation` and are surfaced via the `usage` hook.
+| `anthropic` | Honored natively. |
+| `openrouter` | Forwarded; Anthropic + Gemini honor; OpenAI / DeepSeek / Grok / Groq / Moonshot cache automatically and ignore the markers. |
+| `openaiCompat` | Opt-in via `cacheBreakpoints: true`. Off by default. |
+| `cerebras` | Off. |
+| `openai` (Codex) | Not affected (pi-ai wire format). |
 ## Presets
-Reusable slices of `AgentOptions` — spread them into `createAgent()`.
-The `basic` preset bundles:
+`basic` ships:
 | Tool | Description |
 |---|---|
-| `shell` | Execute shell commands. Combined stdout+stderr tail-truncated at 8 KB by default; `maxOutputBytes: 0` disables |
-| `readFile` | Read a file by line range. Default: lines 1..2000, byte cap 64 KB. Truncation footer documents how to page; binary files return a marker instead of mojibake |
-| `writeFile` | Write a file. Returns `Created` / `Updated` / `No change needed: …` so the model can detect no-ops without a separate read |
-| `edit` | Surgical replace of `old_string` → `new_string`. Fails clearly on non-unique matches (unless `replace_all`) and on not-found (with a nearest-match preview) |
-| `multiEdit` | Atomic list of edits to one file. All-or-nothing: any failed edit prevents the write |
-| `listFiles` | List directory contents |
-| `spawn` | Spawn a sub-agent |
+| `shell` | Combined stdout+stderr tail-truncated at 8 KB. `maxOutputBytes: 0` disables. |
+| `readFile` | Line range, default 1..2000, 64 KB cap. Paging footer; binary marker. |
+| `writeFile` | Returns `Created` / `Updated` / `No change needed: …` for no-op detection. |
+| `edit` | Surgical `old_string` → `new_string`. Clear errors on non-unique / not-found (with nearest-match preview). |
+| `multiEdit` | Atomic edits to one file. All-or-nothing. |
+| `listFiles` | Directory listing. |
+| `spawn` | Sub-agent. |
-Opt-in tools available via `import { glob, grep, createInteractionTool } from 'zidane'`:
+Opt-in (via `import from 'zidane'`): `glob` (Bun.Glob; shells out in docker/sandbox), `grep` (ripgrep + Bun.Glob fallback; full Claude Code Grep semantics), `createInteractionTool` (HITL factory).
-| Tool | Description |
-|---|---|
-| `glob` | Bun.Glob-backed pattern matching (in-process); shells out in docker/sandbox |
-| `grep` | ripgrep-backed regex search (with a Bun.Glob fallback). `output_mode`, `-i / -n / -A / -B / -C`, `multiline`, `head_limit`, `offset` — Claude Code Grep semantics |
-| `createInteractionTool` | Human-in-the-loop factory |
-The three `skills_use` / `skills_read` / `skills_run_script` tools auto-inject when the skills catalog is non-empty.
-Define a custom preset:
+`skills_use` / `skills_read` / `skills_run_script` auto-inject when the skills catalog is non-empty.
 ```ts
 import { basicTools, definePreset } from 'zidane/presets'
-const researcher = definePreset({
-  name: 'researcher',
-  system: 'You are a research assistant.',
-  tools: { ...basicTools },
-})
-createAgent({ ...researcher, provider })
-```
-For pure chat with no tools, omit `tools` or pass `{}` at run time:
-```ts
-createAgent({ provider })                                 // no tools
-await agent.run({ prompt: 'just chat', tools: {} })       // override for one run
+createAgent({ ...definePreset({ name: 'researcher', tools: basicTools }), provider })
+createAgent({ provider })                            // no tools
+await agent.run({ prompt: 'just chat', tools: {} }) // no tools for one run
 ```
 ## Thinking
-Extended reasoning with named levels or exact token budgets.
+Named levels or exact budgets. Traces persist as `{ type: 'thinking', text }` blocks and stream via `stream:thinking`. Supported by Anthropic (native) and OpenRouter/Cerebras (`reasoning_content`/`reasoning` SSE fields).
 | Level | Default budget |
 |---|---|
 | `off` | disabled |
-| `minimal` | 1,024 tokens |
-| `low` | 4,096 tokens |
-| `medium` | 10,240 tokens |
-| `high` | 32,768 tokens |
-| `adaptive` | model self-budgets per turn |
+| `minimal` | 1,024 |
+| `low` | 4,096 |
+| `medium` | 10,240 |
+| `high` | 32,768 |
+| `adaptive` | model self-budgets |
 ```ts
-// Named level
-await agent.run({ prompt: 'solve this', thinking: 'high' })
-// Exact budget (overrides level default)
-await agent.run({ prompt: 'solve this', thinking: 'high', behavior: { thinkingBudget: 50000 } })
-// Adaptive — model self-budgets, but `thinkingBudget` caps the response envelope
-// (max_tokens) to soft-bound runaway thinking on Anthropic.
-await agent.run({ prompt: 'solve this', thinking: 'adaptive', behavior: { thinkingBudget: 32000 } })
-// Agent-level default
-const agent = createAgent({ ...basic, provider, behavior: { thinkingBudget: 16384 } })
+await agent.run({ prompt: '…', thinking: 'high' })
+await agent.run({ prompt: '…', thinking: 'high', behavior: { thinkingBudget: 50000 } })  // exact
+await agent.run({ prompt: '…', thinking: 'adaptive', behavior: { thinkingBudget: 32000 } })
 ```
-Thinking traces are stored in session turns as `{ type: 'thinking', text }` content blocks and streamed live via the `stream:thinking` hook. Supported by Anthropic (native) and OpenRouter/Cerebras (`reasoning_content`/`reasoning` SSE fields).
-`adaptive` is Anthropic-specific (`thinking.type='adaptive'`) and avoids the `thinking.type='enabled'` deprecation warning on opus 4.6+. It has no native budget knob — when `thinkingBudget` is paired with `adaptive`, zidane caps `max_tokens = min(maxTokens, thinkingBudget)` so unbounded reasoning can't run away. Other providers fall back to no reasoning when `adaptive` is selected.
+`adaptive` is Anthropic-only (`thinking.type='adaptive'`, avoids the opus 4.6+ deprecation warning). Pairing it with `thinkingBudget` caps `max_tokens = min(maxTokens, thinkingBudget)` to bound runaway reasoning. Other providers fall back to no reasoning on `adaptive`.
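
The cap described above, `max_tokens = min(maxTokens, thinkingBudget)`, can be sketched as follows. This is an illustration under stated assumptions, not zidane's code: `effectiveMaxTokens` is a hypothetical helper, and leaving `maxTokens` unchanged for every non-adaptive level is an assumption, since the README only specifies the adaptive case.

```ts
type ThinkingLevel = 'off' | 'minimal' | 'low' | 'medium' | 'high' | 'adaptive'

// Hypothetical helper: name and signature are illustrative only.
function effectiveMaxTokens(
  level: ThinkingLevel,
  maxTokens: number,
  thinkingBudget?: number,
): number {
  // Adaptive mode has no native budget knob, so the response envelope is capped instead.
  if (level === 'adaptive' && thinkingBudget !== undefined)
    return Math.min(maxTokens, thinkingBudget)
  // Assumption: other levels leave max_tokens as configured.
  return maxTokens
}

console.log(effectiveMaxTokens('adaptive', 65536, 32000)) // 32000: the budget is the tighter bound
console.log(effectiveMaxTokens('adaptive', 16384, 32000)) // 16384: maxTokens is already tighter
console.log(effectiveMaxTokens('high', 16384, 50000))     // 16384: no cap outside adaptive (assumed)
```

Because the cap is a `min`, a `thinkingBudget` larger than `maxTokens` has no effect in adaptive mode; it only bites when it is the smaller of the two.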
 ## Hooks
-Every hook receives a mutable context object.
-### Turn lifecycle
-```ts
-agent.hooks.hook('turn:before', (ctx) => {
-  // ctx.turn, ctx.turnId, ctx.options (StreamOptions)
-})
-agent.hooks.hook('turn:after', (ctx) => {
-  // ctx.turn, ctx.turnId, ctx.usage, ctx.message (full SessionTurn)
-  // Always fires — even if the provider throws mid-stream
-  // Turn is guaranteed to be in agent.turns before this fires
-})
-agent.hooks.hook('usage', (ctx) => {
-  // ctx.turn, ctx.turnId, ctx.usage (per-turn)
-  // ctx.totalIn, ctx.totalOut (running parent-loop totals — children fold in
-  // post-loop and are visible on `agent:done`)
-})
+Hooks fire at every lifecycle point via [hookable](https://github.com/unjs/hookable). Awaited in registration order; ctx is shared per firing (last-writer wins). See `docs/SKILL.md` for the full hook reference table.
-agent.hooks.hook('agent:done', (ctx) => {
-  // ctx.totalIn / ctx.totalOut / ctx.cost — cumulative across parent loop +
-  //   every recursively-spawned sub-agent
-  // ctx.turns, ctx.elapsed — parent-loop view (use `flattenTurns(ctx).length`
-  //   for tree-wide turn counts, `statsByModel(ctx)` for per-model breakdown)
-  // ctx.children? — per-child stats in completion order
-  // ctx.output — structured output (when behavior.schema is set)
-  // Fires on all exit paths: completion, maxTurns, and abort
-})
-```
-### Streaming
-```ts
-agent.hooks.hook('stream:text', (ctx) => {
-  // ctx.delta, ctx.text, ctx.turnId
-})
-agent.hooks.hook('stream:end', (ctx) => {
-  // ctx.text (final), ctx.turnId
-  // Only fires when there is text content (not on tool-only turns)
-})
-agent.hooks.hook('stream:thinking', (ctx) => {
-  // ctx.delta, ctx.thinking (accumulated), ctx.turnId
-  // Fires when the model streams reasoning traces (Anthropic, OpenRouter)
-})
-agent.hooks.hook('oauth:refresh', (ctx) => {
-  // ctx.provider, ctx.providerId, ctx.source
-  // ctx.previousCredentials, ctx.credentials
-  // Fires when an OAuth token is refreshed from passed credentials or .credentials.json
-})
-```
-### Tool execution
-All tool hooks include `turnId` and `callId` for correlation. Typed via `ToolHookContext`.
+### Practical examples
 ```ts
+// Refuse or substitute a tool call.
 agent.hooks.hook('tool:gate', (ctx) => {
-  // ctx.turnId, ctx.callId, ctx.name, ctx.input, ctx.runToolCounts
   if (ctx.name === 'shell' && String(ctx.input.command).includes('rm -rf')) {
     ctx.block = true
     ctx.reason = 'dangerous command'
   }
-  // Substitute a successful result without running the tool — mirrors
-  // tool:unknown / tool:error. When both are set, `block` wins.
   if (ctx.name === 'todowrite' && (ctx.runToolCounts.todowrite ?? 0) > 0)
-    ctx.result = 'Already recorded; no-op.'
+    ctx.result = 'Already recorded; no-op.'  // `block` wins if both set
 })
-agent.hooks.hook('tool:before', (ctx) => { /* ctx.turnId, ctx.callId, ctx.name, ctx.input, ctx.runToolCounts, ctx.coercions? */ })
-agent.hooks.hook('tool:after', (ctx) => { /* + ctx.result, ctx.outputBytes, ctx.runToolCounts, ctx.coercions? */ })
-agent.hooks.hook('tool:error', (ctx) => {
-  // + ctx.error. Mutate ctx.result to substitute the payload sent back to the
-  // model in place of the default `Tool error: <msg>` — useful for OSS-model
-  // error rewriting (collapse stack traces, prepend recovery hints).
-})
+// Redact secrets before the model sees a tool result.
 agent.hooks.hook('tool:transform', (ctx) => {
-  // + ctx.result, ctx.isError, ctx.outputBytes (pre-mutation), ctx.coercions? — mutate result/isError to modify.
-  // Built-in tools already truncate; use this hook for consumer concerns the framework can't infer,
-  // e.g. redacting secrets in tool output before they reach the model.
   if (typeof ctx.result === 'string')
     ctx.result = ctx.result.replace(/\b(API_KEY|TOKEN|PASSWORD)\s*=\s*\S+/gi, '$1=<redacted>')
 })
+// Substitute for hallucinated tool names instead of erroring.
 agent.hooks.hook('tool:unknown', (ctx) => {
-  // Fires when the model invents a tool name (or calls one no longer registered).
-  // Mutate ctx.result to substitute a friendly response, set ctx.suppressError = true
-  // to skip the companion `tool:error`.
   if (ctx.name === 'EnterPlanMode') {
     ctx.result = 'EnterPlanMode is not available — use shell to draft a plan as comments.'
     ctx.suppressError = true
   }
 })
-agent.hooks.hook('validation:reject', (ctx) => {
-  // Fires when arg validation rejects the input even after auto-coercion attempts.
-  // Observational — the model still receives `Validation error: …` for the retry.
-  // ctx.reason, ctx.schema
-})
-agent.hooks.hook('validation:coerce', (ctx) => {
-  // Fires when validation auto-healed at least one field. Never fires on
-  // perfectly-typed inputs. ctx.coercions lists the field names that were changed.
-  // Symmetric counterpart to `validation:reject` — useful for "model wrongness rate".
-})
-```
-`ctx.coercions` (when present) is the same `readonly string[]` exposed via `validation:coerce`. The field is **omitted** from `tool:before` / `tool:after` / `tool:transform` ctx when no coercion happened, so it never noises up the happy path. Listeners can `if (ctx.coercions)` guard.
-MCP tool hooks mirror the same pattern with `server` and `tool` fields. Typed via `McpToolHookContext`.
-```ts
-agent.hooks.hook('mcp:tool:gate', (ctx) => { /* ctx.turnId, ctx.callId, ctx.server, ctx.tool, ctx.input, ctx.block, ctx.reason */ })
-agent.hooks.hook('mcp:tool:before', (ctx) => { /* ctx.turnId, ctx.callId, ctx.server, ctx.tool, ctx.input */ })
-agent.hooks.hook('mcp:tool:after', (ctx) => { /* + ctx.result, ctx.outputBytes */ })
-agent.hooks.hook('mcp:tool:transform', (ctx) => { /* + ctx.result, ctx.outputBytes — mutate to modify */ })
-agent.hooks.hook('mcp:tool:error', (ctx) => { /* + ctx.error */ })
-```
-`outputBytes` measures the wire size of the tool's result. On `*:transform` it's the **pre-mutation** size (a truncation handler can size-budget); on `*:after` it's the **post-mutation** size that goes to the model. `toolOutputByteLength(content)` exported from `zidane` reproduces the formula.
-### Context transform
-Prune messages before each LLM call:
+// Per-turn observation.
+agent.hooks.hook('turn:after', (ctx) => { /* ctx.turn, ctx.usage, ctx.message — always fires */ })
+agent.hooks.hook('stream:text', (ctx) => { /* ctx.delta, ctx.text */ })
+agent.hooks.hook('agent:done', (ctx) => { /* AgentStats — cumulative incl. children */ })
-```ts
+// Mutate messages / system before the provider call.
 agent.hooks.hook('context:transform', (ctx) => {
-  if (ctx.messages.length > 30)
-    ctx.messages.splice(2, ctx.messages.length - 30)
+  if (ctx.messages.length > 30) ctx.messages.splice(2, ctx.messages.length - 30)
 })
-```
-### System transform
-Mutate the system prompt per request — useful for runtime-derived sections (files already read in the session, live tool budgets, skill activation reminders). Fires after `context:transform`, before the request goes out. `messages` is read-only here.
-```ts
 agent.hooks.hook('system:transform', (ctx) => {
-  // ctx.system, ctx.messages (readonly), ctx.turn, ctx.turnId, ctx.session?
   if (ctx.session && ctx.turn > 1)
     ctx.system += `\n\n## Reminder: keep responses concise after turn ${ctx.turn}.`
 })
 ```
-Cache breakpoints land naturally inside the provider after this hook, so repeated turns with the same derived system text still hit the cache.
+Mutable hooks: `tool:gate` (`block` / `reason` / `result`), `tool:transform` (`result` / `isError`), `tool:error` + `tool:unknown` (`result`), `context:transform` (`messages`), `system:transform` + `system:before` (`system`), `skills:catalog` (`catalog`), `mcp:tool:gate` (`block` / `reason` / `result`), `mcp:tool:transform` (`result`). All tool hooks include `turnId` + `callId`. `outputBytes` is **pre-mutation** on `*:transform`, **post-mutation** on `*:after` — reproduce via `toolOutputByteLength()`. `ctx.coercions` is **omitted** when no coercion happened — guard with `if (ctx.coercions)`.
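The `tool:transform` and `context:transform` bodies in the diff above are plain string and array operations, so they can be checked outside an agent. The sketch below lifts them into two standalone functions; `redactSecrets` and `pruneMessages` are hypothetical names chosen for illustration and are not part of zidane's API.

```ts
// Same pattern as the tool:transform example: NAME=value pairs for three names.
const SECRET_PATTERN = /\b(API_KEY|TOKEN|PASSWORD)\s*=\s*\S+/gi

// Hypothetical helper: replace the value of each matched pair with a placeholder.
function redactSecrets(text: string): string {
  return text.replace(SECRET_PATTERN, '$1=<redacted>')
}

// Hypothetical helper: same splice as the context:transform example.
// Keeps the first two messages, then drops the oldest ones after them
// until `cap` messages remain. Mutates and returns the same array.
function pruneMessages<T>(messages: T[], cap = 30): T[] {
  if (messages.length > cap)
    messages.splice(2, messages.length - cap)
  return messages
}

const redacted = redactSecrets('export API_KEY=abc123 and token = xyz')
const pruned = pruneMessages(Array.from({ length: 40 }, (_, i) => i))
console.log(redacted, pruned.length)
```

One limitation worth knowing: `\b` treats `_` as a word character, so a variable such as `MY_TOKEN=...` is not matched by this pattern. Extend the alternation if your tool output uses prefixed names.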
 ### Hook recipes
-Three patterns that don't have a built-in default. Copy-paste and tune.
+Three patterns the framework can't auto-infer. Copy-paste and tune.
 ```ts
-// 1. Truncate MCP tool results.
-//    Built-in tools (shell, read_file) already tail-truncate; MCP server outputs
-//    don't, since their sizes vary wildly and zidane can't pick a sane default
-//    on their behalf. Apply the same shape to mcp:tool:transform.
+// 1. Truncate MCP tool results — sizes vary too much for a default.
 agent.hooks.hook('mcp:tool:transform', (ctx) => {
   if (ctx.outputBytes <= 8192 || typeof ctx.result !== 'string')
     return
@@ -496,7 +315,7 @@ agent.hooks.hook('tool:unknown', (ctx) => {
   }
 })
-// 3. Drop old turns once the conversation grows past a soft cap.
+// 3. Drop old turns past a soft cap.
 agent.hooks.hook('context:transform', (ctx) => {
   const KEEP_RECENT = 30
   if (ctx.messages.length > KEEP_RECENT) {
@@ -506,92 +325,49 @@ agent.hooks.hook('context:transform', (ctx) => {
 })
 ```
-`mcp:tool:transform`, `tool:unknown`, and `context:transform` are the highest-leverage entries on the surface for the cases v3 doesn't auto-handle. Most production agents end up with one of each.
+`mcp:tool:transform`, `tool:unknown`, and `context:transform` are the highest-leverage entries the framework doesn't auto-handle. Most production agents end up with one of each.
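The body of recipe 1 is cut off by the hunk boundary, so the following is not the README's code. It is a minimal sketch of one way to tail-truncate a string to a byte budget, assuming the budget is counted in UTF-8 bytes (the actual formula behind `outputBytes` is whatever `toolOutputByteLength()` implements). `tailTruncate` and the marker text are hypothetical.

```ts
const utf8Encoder = new TextEncoder()
const utf8Decoder = new TextDecoder() // non-fatal: invalid bytes decode to U+FFFD

// Hypothetical helper: keep roughly the last `maxBytes` UTF-8 bytes of `text`
// and prepend a marker saying how much was dropped.
function tailTruncate(text: string, maxBytes: number): string {
  const bytes = utf8Encoder.encode(text)
  if (bytes.length <= maxBytes)
    return text
  const dropped = bytes.length - maxBytes
  // The cut can land inside a multi-byte character; the orphaned bytes decode
  // to U+FFFD, so strip any replacement characters at the start of the tail.
  const tail = utf8Decoder.decode(bytes.subarray(dropped)).replace(/^\uFFFD+/, '')
  return `[truncated ${dropped} bytes]\n${tail}`
}

console.log(tailTruncate('abcdefghij', 4))
```

Inside a `mcp:tool:transform` handler this would be applied as `ctx.result = tailTruncate(ctx.result, 8192)` after the size and type guard shown in the recipe. The marker line adds a few bytes on top of the budget, which is usually acceptable for a soft cap.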
 ### Per-turn output budget
-When working with OSS models that return large tool outputs, set `behavior.toolOutputBudget` to inject a "summarize before continuing" message after any turn whose combined post-`tool:transform` tool-output bytes exceed the cap. Off by default.
-```ts
-const agent = createAgent({
-  ...basic,
-  provider,
-  behavior: { toolOutputBudget: 32768 },
-})
-agent.hooks.hook('budget:exceeded', (ctx) => {
-  console.warn(`turn ${ctx.turn}: ${ctx.bytes} > ${ctx.budget} bytes`)
-})
-agent.hooks.hook('tool-budget:exceeded', (ctx) => {
-  // Per-tool counterpart, fires when `behavior.toolBudgets[ctx.tool]` trips.
-  // ctx.tool, ctx.count, ctx.max, ctx.turnId, ctx.mode ('steer' | 'block')
-  console.warn(`tool ${ctx.tool} hit cap (${ctx.count}/${ctx.max}, mode=${ctx.mode})`)
-})
-```
+`behavior.toolOutputBudget` injects a "summarize before continuing" message when a turn's combined post-`tool:transform` bytes exceed the cap. Off by default. Subscribe via `budget:exceeded` (byte) and `tool-budget:exceeded` (per-tool, fields: `tool, count, max, turnId, mode`).
 ### Client-side context compaction (non-Anthropic)
-For non-Anthropic providers (cerebras / openai-compat / openrouter on OSS models), `behavior.compactStrategy: 'tail'` elides older `tool_result` blocks from the wire-level message list once their combined size exceeds `compactThreshold`. The newest `compactKeepTurns` messages stay intact so the model retains the freshest tool context.
-```ts
-const agent = createAgent({
-  ...basic,
-  provider: cerebras({ apiKey: '...' }),
-  behavior: {
-    compactStrategy: 'tail',
-    compactThreshold: 131_072,  // 128 KiB; default
-    compactKeepTurns: 4,        // default
-  },
-})
-```
-Anthropic users should prefer the server-side `context-management-2025-06-27` beta (token-accurate, configured via `anthropic({ extraBetas, contextManagement })`) — `'tail'` is a client-side approximation that exists because OSS-model providers have no server-side equivalent.
+`behavior.compactStrategy: 'tail'` elides older `tool_result` blocks once their combined size exceeds `compactThreshold` (default 128 KiB); the newest `compactKeepTurns` (default 4) stay intact. Anthropic users should prefer the server-side `context-management-2025-06-27` beta via `anthropic({ extraBetas, contextManagement })` — token-accurate.
 ### Read dedup + read-before-edit guard
-`behavior.dedupReads` (on by default) — `read_file` returns a short `"unchanged since the previous read"` stub instead of re-emitting bytes when the model re-reads the same file with the same slice. Per-session content-hash; requires a session.
-`behavior.requireReadBeforeEdit` (off by default) — `edit` and `multi_edit` reject when the file hasn't been read in the session, or when its on-disk content has drifted since the last read. Eliminates the silent-corruption case where a model edits against bytes it "remembers" but no longer reflect reality. Recommended on for stricter eval-grade runs.
+- `behavior.dedupReads` (default **on**) — `read_file` returns `"unchanged since the previous read"` on identical re-reads. Per-session content-hash.
+- `behavior.requireReadBeforeEdit` (default **off**) — `edit` / `multi_edit` reject when the file hasn't been read this session or has drifted. Recommended for eval-grade runs.
 ### Generic per-tool dedup
-`behavior.dedupTools` extends the read-file pattern to arbitrary tools. Provide a hasher per tool keyed by canonical name; identical inputs replay the prior result without re-running the tool. Requires a session.
-The hasher contract has **three return values, three meanings** — pick deliberately:
+`behavior.dedupTools` extends the pattern to arbitrary tools via a hasher keyed by canonical name. Requires a session. Hasher contract — **three returns, three meanings**:
 | Return | Meaning |
 |---|---|
-| non-empty string | Cache key for this call. Equal keys replay the prior result. |
-| `undefined` | **Skip dedup for this call.** Tool runs normally; nothing recorded. |
+| non-empty string | Cache key. Equal keys replay the prior result. |
+| `undefined` | Skip dedup for this call. Tool runs normally. |
 | `''` or non-string | Treated as `undefined` (defensive). |
 ```ts
 behavior: {
   dedupTools: {
-    // Always cache by full input — every identical re-call dedups.
     todowrite: input => JSON.stringify(input),
-    // Cache by a normalized subset; non-cacheable shapes opt out via `undefined`.
     execute_sql: (input) => {
       const q = typeof input.query === 'string' ? input.query.trim().toLowerCase() : undefined
-      if (!q || q.includes('now()') || q.includes('random()')) return undefined
+      if (!q || q.includes('now()') || q.includes('random()')) return undefined  // non-cacheable
       return q
     },
   },
 }
 ```
-The `undefined` opt-out is **not** the same as `JSON.stringify(input)` — that would dedup against the verbatim input. Use `undefined` to mean "this specific call is not cacheable" (timestamps baked in, randomness, debug flags).
-Tools with side effects or non-deterministic outputs (network, time, randomness) **must not** be listed — there is no safety net beyond the consumer's hasher. For MCP tools, key by the namespaced wire name (`mcp_<server>_<tool>`).
+Tools with side effects or non-determinism (network, time, randomness) **must not** be listed. For MCP tools, key by the namespaced wire name (`mcp_<server>_<tool>`).
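To make the three-way hasher contract concrete, here is a small standalone sketch of how a dedup layer could interpret hasher returns. It is an illustration under assumptions, not zidane's implementation: `toCacheKey` and `withDedup` are hypothetical names, tools are modelled as synchronous functions, and the cache is a plain in-memory `Map`. The `sqlHasher` body is the `execute_sql` hasher from the example above.

```ts
type ToolInput = Record<string, unknown>
type Hasher = (input: ToolInput) => unknown

// Hypothetical: collapse a hasher return into "cache key" or "skip dedup".
// Non-empty string -> key; undefined, '' and non-strings -> skip.
function toCacheKey(ret: unknown): string | undefined {
  return typeof ret === 'string' && ret.length > 0 ? ret : undefined
}

// Hypothetical: wrap a tool so equal keys replay the prior result.
function withDedup(hasher: Hasher, run: (input: ToolInput) => string): (input: ToolInput) => string {
  const seen = new Map<string, string>()
  return (input) => {
    const key = toCacheKey(hasher(input))
    if (key === undefined)
      return run(input) // skipped: tool runs normally, nothing recorded
    const prior = seen.get(key)
    if (prior !== undefined)
      return prior // replay without re-running the tool
    const result = run(input)
    seen.set(key, result)
    return result
  }
}

// Hasher from the README example: normalized query, opt out on volatile SQL.
const sqlHasher: Hasher = (input) => {
  const q = typeof input.query === 'string' ? input.query.trim().toLowerCase() : undefined
  if (!q || q.includes('now()') || q.includes('random()')) return undefined
  return q
}

let sqlRuns = 0
const executeSql = withDedup(sqlHasher, () => { sqlRuns += 1; return 'rows' })
executeSql({ query: 'SELECT 1' })
executeSql({ query: '  select 1 ' }) // same normalized key, replayed
executeSql({ query: 'select now()' }) // opted out, runs
executeSql({ query: 'select now()' }) // opted out again, runs
console.log(sqlRuns)
```

The design point the sketch highlights is that returning `undefined` is a per-call decision, while leaving a tool out of `dedupTools` is a per-tool decision.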
 ### Per-tool call budgets
-`behavior.toolBudgets` caps per-tool calls per run. Two reactions:
-- `'steer'` — let the call run, but emit a synthetic user message after the turn nudging the model to commit and finish. Fires once per tool per run.
-- `'block'` — refuse subsequent calls with `Blocked: <reason>`.
+`behavior.toolBudgets` caps per-tool calls per run. `'steer'` lets the call run then nudges the model to commit (once per tool per run); `'block'` refuses with `Blocked: <reason>`.
 ```ts
 behavior: {
@@ -602,11 +378,11 @@ behavior: {
 }
 ```
-Pass a function for custom messages: `onExceed: ctx => ({ mode: 'steer', message: '...' })`. Subscribe to `tool-budget:exceeded` for telemetry. Counts include dedup hits — by design, since both eat against agent-loop sanity.
+Pass a function for custom messages: `onExceed: ctx => ({ mode: 'steer', message: '...' })`. Counts include dedup hits — by design.
 ### Adaptive thinking budget
-`behavior.thinkingDecay` tapers the thinking budget across turns. Late turns are usually checkpoint / cleanup work where reasoning rarely pays for itself.
+`behavior.thinkingDecay` tapers thinking across turns. Late turns are usually checkpoint work where reasoning rarely pays off.
 ```ts
 behavior: {
@@ -616,81 +392,51 @@ behavior: {
 }
 ```
-Pass a function for arbitrary curves: `thinkingDecay: (turn, base) => base / Math.sqrt(turn)`. No-op when `thinkingBudget` is unset. Honored by every provider that respects `thinkingBudget`.
+Pass a function for arbitrary curves: `thinkingDecay: (turn, base) => base / Math.sqrt(turn)`. No-op when `thinkingBudget` is unset.
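To get a feel for the inverse-square-root curve quoted above, the sketch below tabulates it for the first few turns. It assumes turns are numbered from 1 (turn 0 would divide by zero) and rounds to whole tokens for display; whether zidane rounds, clamps, or numbers turns this way is not stated in the README. `budgetSchedule` is a hypothetical helper.

```ts
type DecayFn = (turn: number, base: number) => number

// The curve from the README text.
const sqrtDecay: DecayFn = (turn, base) => base / Math.sqrt(turn)

// Hypothetical helper: budget per turn for turns 1..turns, rounded for display.
function budgetSchedule(base: number, turns: number, decay: DecayFn): number[] {
  return Array.from({ length: turns }, (_, i) => Math.round(decay(i + 1, base)))
}

const schedule = budgetSchedule(8192, 4, sqrtDecay)
console.log(schedule) // first entry is 8192, fourth entry is 4096
```

With this curve the budget halves by turn 4 and halves again by turn 16, so it tapers quickly at first and slowly later.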
 ## Steering and Follow-up
-### Steering
-Inject a message while the agent is working. Delivered between tool calls.
+- `agent.steer(msg)` — inject mid-run, delivered between tool calls.
+- `agent.followUp(msg)` — queue for after the run finishes.
 ```ts
 agent.steer('focus only on the tests directory')
-```
-### Follow-up
-Queue messages that extend the conversation after the agent finishes.
-```ts
 agent.followUp('now write tests for what you built')
 ```
 ## Sub-agent Spawning
-The `spawn` tool delegates tasks to child agents that run independently.
+`spawn` delegates to independent child agents. Children inherit the parent's preset (tools, system, aliases, MCP servers, skills, behavior) by default. Pass `preset` on `createSpawnTool()` to override per child.
 ```ts
-import { basicTools, definePreset } from 'zidane/presets'
-import { createSpawnTool } from 'zidane/tools'
+import { basicTools, definePreset, createSpawnTool } from 'zidane'
-const orchestrator = definePreset({
+definePreset({
   name: 'orchestrator',
   tools: {
     ...basicTools,
-    spawn: createSpawnTool({
-      maxConcurrent: 5,
-      model: 'claude-haiku-4-5-20251001',
-      thinking: 'low',
-    }),
+    spawn: createSpawnTool({ maxConcurrent: 5, model: 'claude-haiku-4-5-20251001', thinking: 'low' }),
   },
 })
 ```
-Children inherit the parent's preset (tools, system prompt, aliases, MCP servers, skills, behavior) and can spawn their own children. Pass `preset` on `createSpawnTool()` to override the inherited slice per child.
 ## Interaction Tool
-Let the agent pause and request structured input from the outside world. Not included in any preset by default.
+Pause the agent and request structured input. Not in any preset by default. `onRequest` may be async — the agent waits. Return a string or object.
 ```ts
-import { basicTools, definePreset } from 'zidane/presets'
-import { createInteractionTool } from 'zidane/tools'
+import { createInteractionTool } from 'zidane'
 const askUser = createInteractionTool({
   name: 'ask_user',
-  schema: {
-    type: 'object',
-    properties: { question: { type: 'string' } },
-    required: ['question'],
-  },
-  onRequest: async (payload) => {
-    const answer = await promptUser(payload.question)
-    return { answer }
-  },
-})
-const interactive = definePreset({
-  name: 'interactive',
-  tools: { ...basicTools, ask_user: askUser },
+  schema: { type: 'object', properties: { question: { type: 'string' } }, required: ['question'] },
+  onRequest: async ({ question }) => ({ answer: await promptUser(question) }),
 })
 ```
-`onRequest` can be async — the agent waits for the response. Return a string or object (objects are JSON-stringified).
 ## Sessions
-Sessions give an agent persistent turn history and run metadata across calls.
+Persistent turn history + run metadata across calls. Turns persist incrementally — a crash leaves history up to the last completed turn.
 ```ts
 import { createAgent, createSession, createSqliteStore } from 'zidane'
@@ -703,45 +449,13 @@ await agent.run({ prompt: 'hello' })
 await session.save()
 ```
-Turns are persisted incrementally after each turn — not as a full save. If the agent crashes, you have turns up to the last completed turn.
-### Storage backends
-```ts
-import { createMemoryStore, createRemoteStore, createFileMapStore } from 'zidane/session'
-import { createSqliteStore } from 'zidane/session/sqlite'   // separate subpath (Bun-only)
-createMemoryStore()                                    // in-memory, no persistence
-createSqliteStore({ path: './sessions.db' })           // SQLite via bun:sqlite — WAL mode, per-turn flush
-createRemoteStore({ url: 'https://api.example.com' })  // HTTP REST API
-createFileMapStore(hostAdapter)                        // bridge to any { get, save, delete } file-map backend
-```
-`createSqliteStore` lives on its own subpath because it depends on `bun:sqlite`. Non-Bun consumers importing from `zidane` or `zidane/session` never evaluate that module.
-### Restoring a session
+Storage backends — `createMemoryStore()` (in-memory), `createSqliteStore({ path })` from `zidane/session/sqlite` (Bun-only subpath; WAL, per-turn flush), `createRemoteStore({ url })` (HTTP), `createFileMapStore(adapter)` (any `{ get, save, delete }` backend; `turns.jsonl` + `meta.json`).
-```ts
-import { loadSession } from 'zidane/session'
-const session = await loadSession(store, 'my-session')
-if (session) {
-  const agent = createAgent({ ...basic, provider, session })
-  await agent.run({ prompt: 'continue' })
-}
-```
-### Session hooks
-```ts
-agent.hooks.hook('session:start', (ctx) => { /* ctx.sessionId, ctx.runId, ctx.prompt */ })
-agent.hooks.hook('session:end', (ctx) => { /* ctx.sessionId, ctx.runId, ctx.status, ctx.turnRange */ })
-agent.hooks.hook('session:turns', (ctx) => { /* ctx.sessionId, ctx.turns (SessionTurn[]), ctx.count */ })
-```
+Restore via `await loadSession(store, id)`. Session hooks: `session:start`, `session:turns`, `session:end` (always fires, carries `turnRange`).
 ## MCP Servers
-Connect any MCP-compatible tool server. Tools are namespaced as `mcp_{server}_{tool}`.
+Connect any MCP server. Tools are namespaced `mcp_{server}_{tool}`. Connections are lazy (first `run()`) and reused; all servers bootstrap in parallel.
 ```ts
 const agent = createAgent({
@@ -754,93 +468,55 @@ const agent = createAgent({
 })
 ```
-MCP servers can live on a preset too (they're just `AgentOptions` fields). Connections are lazy (first `run()`) and reused.
-Set `bootstrapTimeout` to cap how long a slow `connect + listTools` phase can delay the first model request. Per-server `disclosure: 'lazy' | 'eager'` overrides the agent-wide `behavior.toolDisclosure` (see [Progressive tool disclosure](#progressive-tool-disclosure)).
+Per-server `disclosure: 'lazy' | 'eager'` overrides `behavior.toolDisclosure` (see [Progressive tool disclosure](#progressive-tool-disclosure)).
 ### Hiding bootstrap latency
-Every server is bootstrapped in parallel, but the first `run()` still waits for the slowest one. Two knobs to hide the cost:
+The first `run()` still waits on the slowest server. Two knobs:
 ```ts
-// Option 1 — pre-warm manually behind other setup work.
-const agent = createAgent({ provider, mcpServers })
-await Promise.all([agent.warmup(), authenticate(), loadConfig()])
-await agent.run({ prompt: 'go' })                 // no MCP wait here
-// Option 2 — let createAgent kick the warmup off for you.
-const agent = createAgent({ provider, mcpServers, eager: true })
-// ... unrelated startup work ...
-await agent.run({ prompt: 'go' })                 // awaits the in-flight warmup
+await Promise.all([agent.warmup(), authenticate(), loadConfig()])  // pre-warm manually
+const agent = createAgent({ provider, mcpServers, eager: true })   // or kick off automatically
 ```
-`warmup()` is idempotent and safe to call from multiple callers concurrently. Failures are surfaced on the next `warmup()` / `run()` rather than crashing the eager kickoff.
-### Observability
+`warmup()` is idempotent and concurrency-safe. Failures surface on the next `warmup()` / `run()`, not on the eager kickoff.
-Two hooks fire around each per-server bootstrap, regardless of success:
+Two hooks fire per bootstrap regardless of outcome — attribute cold-start latency per server:
 ```ts
-agent.hooks.hook('mcp:bootstrap:start', ({ name, transport }) => { /* ... */ })
 agent.hooks.hook('mcp:bootstrap:end', (ctx) => {
-  // ctx.name, ctx.transport, ctx.durationMs
-  // ctx.ok === true  → ctx.toolCount
-  // ctx.ok === false → ctx.error
+  // ctx.name, ctx.transport, ctx.durationMs, ctx.ok
+  // ok ? ctx.toolCount : ctx.error
 })
 ```
-Use these to attribute cold-start latency per server — the only way to know if a specific MCP (e.g. a remote GitHub MCP) is the one stretching your first `run()`.
 ## Progressive tool disclosure
-When MCP brings hundreds of tools, every turn ships every schema in the tool list. `behavior.toolDisclosure: 'lazy'` flips MCP tools to a name-only catalog in the system prompt and auto-injects a `tool_search` native tool the model uses to load schemas on demand. Native (non-MCP) tools and skill tools are always eager — only MCP tools are eligible for lazy disclosure.
+With hundreds of MCP tools, every turn ships every schema. `behavior.toolDisclosure: 'lazy'` flips MCP tools to a name-only catalog and auto-injects a `tool_search` native tool. Native + skill tools stay eager.
 ```ts
 const agent = createAgent({
   ...basic,
   provider,
   mcpServers: [
-    { name: 'github', transport: 'stdio', command: 'gh-mcp' },         // 200+ tools
-    { name: 'fs', transport: 'stdio', command: 'fs-mcp', disclosure: 'eager' }, // override per-server
+    { name: 'github', transport: 'stdio', command: 'gh-mcp' },                    // 200+ tools
+    { name: 'fs', transport: 'stdio', command: 'fs-mcp', disclosure: 'eager' },   // per-server override
   ],
-  behavior: {
-    toolDisclosure: 'lazy',
-    toolSearch: { limit: 20 },                  // default cap on results per call
-  },
+  behavior: { toolDisclosure: 'lazy', toolSearch: { limit: 20 } },
 })
 ```
-The catalog appended to the system prompt looks like:
-```
-<searchable_tools>
-  <server name="github">
-    <tool name="mcp_github_search_issues">Search GitHub issues by query.</tool>
-    <tool name="mcp_github_create_pr">Open a pull request.</tool>
-    …
-  </server>
-</searchable_tools>
-```
-`tool_search` accepts `query` (substring), `names` (explicit), `server` (bulk-unlock one server), and `limit`. Surfaced tools persist for the rest of the run; the loop rebuilds the wire-level tool list each turn so the next provider call advertises them.
-Two hard guarantees:
-- **Hard gate.** A `tool:gate` middleware refuses dispatch on lazy tools the model hasn't surfaced yet — production providers already enforce this server-side, but the in-loop gate covers custom / mock / lenient providers and any path where a model quotes a name straight from the catalog. Refusal text points the model at `tool_search` so it self-corrects.
-- **Aliasing-safe.** Catalog and `tool_search` results show the **wire** (`toolAliases`-rewritten) name — the only one the provider accepts. The unlock set is keyed by canonical name so dispatch / `session.turns` / hook contexts stay alias-stable.
-Cost model: each `tool_search` call appends to the wire-level tool list, advancing the provider's tool-list cache breakpoint. That costs one cache miss per discovery wave; subsequent turns with the same unlocked set hit cache normally. With many lazy tools and few discovery waves, this still beats eager (which always sends every schema) — but it's not a free optimisation.
+System prompt gains `<searchable_tools>` with `name + description` per lazy tool. `tool_search` accepts `query` (substring), `names`, `server`, `limit` — matches unlock for the rest of the run. A `tool:gate` middleware refuses dispatch on un-surfaced lazy tools (covers custom/mock providers; production providers also refuse server-side). Catalog + search results show the **wire** name; the unlock set keys on canonical so dispatch and `session.turns` stay alias-stable.
-Opt out via `behavior.toolSearch.tool: false` (the catalog still emits, the call-to-action prose drops). Pre-existing host tool named `tool_search` shadows the auto-injection — see the JSDoc on `behavior.toolSearch` for the host-defined-tool semantics.
+Cost: one cache miss per discovery wave (the tool list grows); subsequent turns hit cache. Opt out via `behavior.toolSearch.tool: false` (catalog still emits, call-to-action drops). A pre-existing host tool named `tool_search` shadows the auto-injection.
 ## Skills
-Reusable instruction packages following the [Agent Skills](https://agentskills.io/specification) open standard.
-### SKILL.md format
+Reusable instruction packages following the [Agent Skills](https://agentskills.io/specification) standard.
 ```
 my-skill/
-  SKILL.md
+  SKILL.md       # frontmatter + instructions
   scripts/       # optional
   references/    # optional
   assets/        # optional
@@ -856,92 +532,52 @@ allowed-tools: Bash Read Write
 paths: "src/**/*.ts, test/**/*.ts"
 ---
-Full instructions the agent receives when this skill activates.
+Full instructions the agent receives on activation.
 ```
-### Discovery
-Scan paths in priority order (first found wins):
-1. `{cwd}/.agents/skills`
-2. `{cwd}/.zidane/skills`
-3. `~/.agents/skills`
-4. `~/.zidane/skills`
-### Configuration
+Default scan paths (first found wins): `{cwd}/.agents/skills`, `{cwd}/.zidane/skills`, `~/.agents/skills`, `~/.zidane/skills`. Instructions support `!\`command\`` — runs during resolution; output replaces the placeholder.
 ```ts
 import { createAgent, defineSkill } from 'zidane'
-const agent = createAgent({
+createAgent({
   ...basic,
   provider,
   skills: {
     scan: ['./custom-skills'],
-    write: [
-      defineSkill({
-        name: 'review',
-        description: 'Code review guidelines.',
-        instructions: 'Review for correctness and test coverage.',
-      }),
-    ],
+    write: [defineSkill({ name: 'review', description: 'Code review.', instructions: '...' })],
     exclude: ['deprecated-skill'],
     enabled: ['review', 'deploy'],
   },
 })
 ```
-Instructions support `!\`command\`` for dynamic content — commands run during resolution and output replaces the placeholder.
 ## Execution Contexts
-An execution context defines **where** tools run. Defaults to in-process.
-### Docker
+Where tools run. Defaults to in-process. Docker isolates; sandbox runs remotely (E2B, Rivet, custom).
 ```ts
-import { createAgent, createDockerContext } from 'zidane'
+import { createDockerContext, createSandboxContext } from 'zidane'
-const agent = createAgent({
-  ...basic,
-  provider,
-  execution: createDockerContext({
-    image: 'node:22',
-    cwd: '/workspace',
-    limits: { memory: 512, cpu: '1.0' },
-  }),
-})
-```
-### Sandbox (remote)
-Implement `SandboxProvider` for your provider (E2B, Rivet, etc.):
-```ts
-import { createAgent, createSandboxContext } from 'zidane'
-const agent = createAgent({
-  ...basic,
-  provider,
-  execution: createSandboxContext(myProvider),
-})
+createDockerContext({ image: 'node:22', cwd: '/workspace', limits: { memory: 512, cpu: '1.0' } })
+createSandboxContext(myProvider)  // implement SandboxProvider
 ```
 ## State Management
 ```ts
-agent.isRunning           // is a run in progress?
-agent.turns               // conversation history (SessionTurn[])
-agent.abort()             // cancel the current run
-agent.reset()             // clear messages and queues
-await agent.warmup()      // pre-connect MCP (idempotent, safe to call concurrently)
-await agent.destroy()     // clean up context + MCP connections
-await agent.waitForIdle() // wait for current run to complete
+agent.isRunning           // run in progress?
+agent.turns               // SessionTurn[]
+agent.abort()             // cancel current run
+agent.reset()             // clear turns + queues
+await agent.warmup()      // pre-connect MCP (idempotent)
+await agent.destroy()     // clean up context + MCP
+await agent.waitForIdle() // wait for run to complete
 ```
 ## Message Format
-All messages use a canonical format. Providers convert to/from wire formats internally.
+Canonical format. Providers convert to/from wire formats internally.
 ```ts
 type SessionContentBlock =
@@ -954,24 +590,15 @@ type SessionContentBlock =
 type ToolResultContent =
   | { type: 'text', text: string }
   | { type: 'image', mediaType: string, data: string }
-interface SessionMessage {
-  role: 'user' | 'assistant'
-  content: SessionContentBlock[]
-}
 ```
-Tool results can carry structured content — pure-text tools keep returning a `string`, tools that produce images (MCP browser servers, screenshot tools) return a `ToolResultContent[]` that the loop routes natively on providers with `imageInToolResult: true` and via a companion user message elsewhere. Use `toolResultToText(output)` to flatten when a consumer only handles strings.
-Converters for external interop:
+Image-producing tools (MCP browsers, screenshots) return `ToolResultContent[]` — routed natively on providers with `imageInToolResult: true`, via companion user message elsewhere. Flatten with `toolResultToText(output)`.
-```ts
-import { fromAnthropic, toAnthropic, fromOpenAI, toOpenAI, autoDetectAndConvert } from 'zidane'
-```
+External interop converters: `fromAnthropic`, `toAnthropic`, `fromOpenAI`, `toOpenAI`, `autoDetectAndConvert` (re-exported from `zidane`).
 ## Typed Errors
-Provider failures are wrapped into typed error classes before leaving `agent.run()` — match on `instanceof` instead of sniffing strings.
+Provider failures are wrapped before leaving `agent.run()`. Match on `instanceof`, not strings. Every provider ships `classifyError(err)`; unrecognized errors fall through as `AgentProviderError`. Abort paths (`agent.abort()` / `AbortSignal`) always produce `AgentAbortedError`.
 ```ts
 import { AgentAbortedError, AgentContextExceededError, AgentProviderError } from 'zidane'
@@ -980,23 +607,17 @@ try {
   await agent.run({ prompt })
 }
 catch (err) {
-  if (err instanceof AgentContextExceededError) {
-    // prune history, retry
-  }
-  else if (err instanceof AgentAbortedError) {
-    // user cancelled
-  }
+  if (err instanceof AgentContextExceededError) { /* prune history, retry */ }
+  else if (err instanceof AgentAbortedError) { /* user cancelled */ }
   else if (err instanceof AgentProviderError) {
     console.error(`${err.provider}: ${err.message} (${err.providerCode})`)
   }
 }
 ```
-Every provider ships a `classifyError(err)` that maps native errors into a `ClassifiedError` union — unrecognized errors fall through as `AgentProviderError`. Abort paths (`agent.abort()` or a triggered `AbortSignal`) always produce `AgentAbortedError` regardless of classification.
 ## Structured Output
-Force the agent's final response to match a JSON Schema via provider-level tool forcing.
+Force the final response to a JSON Schema via provider-level tool forcing. Lands on `stats.output` and fires the `output` hook (`ctx.output`, `ctx.schema`).
 ```ts
 const stats = await agent.run({
@@ -1009,84 +630,45 @@ const stats = await agent.run({
     },
   },
 })
 console.log(stats.output) // { name: 'Alice', age: 30 }
 ```
-The `output` hook fires when structured output is extracted:
-```ts
-agent.hooks.hook('output', (ctx) => {
-  // ctx.output — the parsed JSON matching the schema
-  // ctx.schema — the schema that was enforced
-})
-```
-### Zod v4 integration
-Use `zodToJsonSchema` to normalize `z.toJsonSchema()` output for tool schemas:
-```ts
-import { z } from 'zod'
-import { zodToJsonSchema } from 'zidane'
-const schema = zodToJsonSchema(z.toJsonSchema(z.object({ name: z.string() })))
-```
+For Zod v4, normalize via `zodToJsonSchema(z.toJsonSchema(schema))` — strips `$schema` (some providers reject it).
 ## Usage Tracking
-`stats.totalIn` / `stats.totalOut` / `stats.cost` are **cumulative** — parent
-loop plus every recursively-spawned sub-agent. `stats.turns` and
-`stats.turnUsage` cover the parent loop only; reach for the helpers below for
-tree-wide breakdowns.
+`stats.totalIn` / `stats.totalOut` / `stats.cost` are **cumulative** (parent + recursive children). `stats.turns` and `stats.turnUsage` cover the parent loop only. Use helpers for tree-wide breakdowns.
 ```ts
 import { flattenTurns, statsByModel } from 'zidane'
-const stats = await agent.run({ prompt: 'hello' })
-stats.totalIn                   // cumulative input tokens (parent + recursive children)
-stats.totalOut                  // cumulative output tokens
-stats.cost                      // cumulative USD cost (when reported by provider)
-stats.turnUsage                 // TurnUsage[] — parent loop only
-stats.children                  // ChildRunStats[] — recursive subtree, completion order
-stats.timeTillFirstTokenMs      // ms from run() start to the first stream/tool event
-flattenTurns(stats)             // every TurnUsage in the tree, parent first then DFS children
-statsByModel(stats)             // Map<modelId, { input, output, cost, cacheRead, cacheCreation, turns }>
-```
-## Types
+stats.totalIn / stats.totalOut / stats.cost  // cumulative
+stats.turnUsage                              // parent loop only
+stats.children                               // ChildRunStats[] in completion order
+stats.timeTillFirstTokenMs                   // ms to first stream/tool event
-All types are available from `zidane/types`:
-```ts
-import type { Agent, SessionTurn, TurnUsage, Provider, ToolDef, ValidationResult } from 'zidane/types'
-// Hook context types for typed event handlers
-import type { ToolHookContext, McpToolHookContext, SessionHookContext, StreamHookContext } from 'zidane/types'
+flattenTurns(stats)   // every TurnUsage in tree (DFS)
+statsByModel(stats)   // Map<modelId, { input, output, cost, cacheRead, cacheCreation, turns }>
 ```
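The tree-wide helpers can be pictured with a small self-contained sketch. The shapes below (`TurnUsageSketch`, `RunStatsSketch`) are assumptions made for illustration and are not zidane's actual `TurnUsage` or `ChildRunStats` types; the two functions only mimic the documented behaviour (parent turns first, then children depth-first, and per-model totals) and are not the library's code.

```ts
// Assumed, simplified shapes: the real zidane types may differ.
interface TurnUsageSketch {
  model: string
  input: number
  output: number
}

interface RunStatsSketch {
  turnUsage: TurnUsageSketch[] // this loop's turns only
  children: RunStatsSketch[] // sub-agent runs, completion order
}

// Parent turns first, then each child's subtree depth-first.
function flattenTurnsSketch(stats: RunStatsSketch): TurnUsageSketch[] {
  const turns = [...stats.turnUsage]
  for (const child of stats.children) {
    turns.push(...flattenTurnsSketch(child))
  }
  return turns
}

interface ModelTotals {
  input: number
  output: number
  turns: number
}

// Aggregate token counts per model across the whole tree.
function totalsByModelSketch(stats: RunStatsSketch): Map<string, ModelTotals> {
  const totals = new Map<string, ModelTotals>()
  for (const turn of flattenTurnsSketch(stats)) {
    const entry = totals.get(turn.model) ?? { input: 0, output: 0, turns: 0 }
    entry.input += turn.input
    entry.output += turn.output
    entry.turns += 1
    totals.set(turn.model, entry)
  }
  return totals
}

// Parent on model-a, one child on model-b, one grandchild on model-a.
const tree: RunStatsSketch = {
  turnUsage: [{ model: 'model-a', input: 100, output: 20 }],
  children: [
    {
      turnUsage: [{ model: 'model-b', input: 50, output: 10 }],
      children: [
        { turnUsage: [{ model: 'model-a', input: 30, output: 5 }], children: [] },
      ],
    },
  ],
}

const flat = flattenTurnsSketch(tree)
const byModel = totalsByModelSketch(tree)
console.log(flat.map(t => t.model)) // [ 'model-a', 'model-b', 'model-a' ]
console.log(byModel.get('model-a')) // { input: 130, output: 25, turns: 2 }
```

The parent-only `turnUsage` array holds a single entry here, while the flattened view holds three, which is the distinction the cumulative fields and the helpers are meant to cover.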
-Helpers (re-exported from the main entry):
+## Types & Helpers
-```ts
-import { toolResultToText, toolOutputByteLength, validateToolArgs } from 'zidane'
-```
+Types are available from `zidane/types` (`Agent`, `SessionTurn`, `TurnUsage`, `Provider`, `ToolDef`, `ValidationResult`, hook contexts).
-- `toolResultToText(content)` — flatten `string | ToolResultContent[]` to a string for logging.
-- `toolOutputByteLength(content)` — same formula the loop uses for `outputBytes`.
-- `validateToolArgs(input, schema)` — the validator the loop runs between `tool:gate` and `tool:before`. Useful for unit tests of consumer tool definitions.
+Helpers re-exported from `zidane`:
+- `toolResultToText(content)` — flatten for logging.
+- `toolOutputByteLength(content)` — same formula as `outputBytes`.
+- `validateToolArgs(input, schema)` — the loop's validator.
-## Testing
+## Testing & Benchmarks
 ```bash
 bun test
 ```
-1000+ tests with mock provider and execution context. No API keys or Docker needed; the suite runs in under 2 seconds.
-## Benchmarks
+1000+ tests run against a mock provider and execution context in under 2 s, with no API keys or Docker.
-Harness integrations for running zidane against public agent benchmarks live in [`benchmarks/`](./benchmarks). First integration: [Terminal-Bench](./benchmarks/terminal-bench).
+Benchmark harnesses live in [`benchmarks/`](./benchmarks). First integration: [Terminal-Bench](./benchmarks/terminal-bench).
 ## License