zidane 1.3.1 → 1.5.0

package/README.md CHANGED
@@ -4,189 +4,118 @@
4
4
 
5
5
  An agent that goes straight to the goal.
6
6
 
7
- Minimal TypeScript agent loop built with [Bun](https://bun.sh).
8
-
9
- Hook into every step of the agent's execution using [hookable](https://github.com/unjs/hookable).
10
-
11
- Built to be embedded in other projects easily, extended through [providers](#providers), [harnesses](#harnesses), and [execution contexts](#execution-contexts).
7
+ Minimal TypeScript agent loop built with [Bun](https://bun.sh). Hook into every step using [hookable](https://github.com/unjs/hookable). Built to be embedded.
12
8
 
13
9
  ## Quickstart
14
10
 
15
11
  ```bash
16
- # Install
17
12
  bun install
18
-
19
- # Authenticate with Anthropic OAuth (Claude Pro/Max)
20
- bun run auth
21
-
22
- # Run
23
- bun start --prompt "create a hello world express app"
24
- ```
25
-
26
- ## CLI
27
-
28
- ```bash
29
- bun start \
30
- --prompt "your task" \ # required
31
- --model claude-opus-4-6 \ # model id (default: claude-opus-4-6)
32
- --provider anthropic \ # anthropic | openrouter | cerebras
33
- --harness basic \ # tool set to use
34
- --system "be concise" \ # system prompt
35
- --thinking off \ # off | minimal | low | medium | high
36
- --context process \ # process | docker
37
- --mcp '{"name":"fs","transport":"stdio","command":"npx","args":["-y","@modelcontextprotocol/server-filesystem","."]}'
13
+ bun run auth # Anthropic OAuth
14
+ bun start --prompt "create a hello world app"
38
15
  ```
39
16
 
40
- The `--mcp` flag accepts a JSON object matching `McpServerConfig`. It can be passed multiple times.
41
-
42
- ## Execution Contexts
43
-
44
- An execution context defines **where** the agent's tools run. All tool operations (shell, filesystem) go through it.
45
-
46
- ### In-process (default)
47
-
48
- Runs in the same Node/Bun process. No isolation, fastest.
17
+ ## Agent Setup
49
18
 
50
19
  ```ts
51
- import { createAgent, createProcessContext } from 'zidane'
20
+ import { createAgent, anthropic } from 'zidane'
21
+ import { basic } from 'zidane'
52
22
 
53
23
  const agent = createAgent({
54
- harness,
55
- provider,
56
- // execution defaults to createProcessContext()
24
+ provider: anthropic({ apiKey: 'sk-ant-...' }),
25
+ harness: basic,
57
26
  })
58
- ```
59
-
60
- ### Docker
61
-
62
- Full container isolation via [dockerode](https://github.com/apocas/dockerode). Configurable resource limits.
63
27
 
64
- ```bash
65
- # CLI
66
- bun start --prompt "run uname -a" --context docker
67
- bun start --prompt "build the app" --context docker --image node:22 --cwd /workspace
28
+ const stats = await agent.run({ prompt: 'build a REST API' })
29
+ console.log(`Done in ${stats.turns} turns`)
68
30
  ```
69
31
 
70
- ```ts
71
- import { createAgent, createDockerContext } from 'zidane'
32
+ All options on `createAgent`:
72
33
 
73
- const agent = createAgent({
74
- harness,
75
- provider,
76
- execution: createDockerContext({
77
- image: 'node:22',
78
- cwd: '/workspace',
79
- limits: { memory: 512, cpu: '1.0' },
80
- }),
34
+ ```ts
35
+ createAgent({
36
+ provider, // required: LLM provider
37
+ harness: basic, // tool set (default: noTools)
38
+ enableTools: true, // false for pure chat mode
39
+ toolExecution: 'sequential', // or 'parallel'
40
+ maxTurns: 50, // max loop iterations
41
+ maxTokens: 16384, // max tokens per LLM response
42
+ thinkingBudget: 10240, // exact thinking token budget
43
+ execution: createProcessContext(), // where tools run
44
+ mcpServers: [], // MCP tool servers
45
+ session, // session for persistence
46
+ skills: {}, // skills configuration
81
47
  })
82
48
  ```
83
49
 
84
- Requires `dockerode` as a peer dependency: `bun add dockerode`
85
-
86
- ### Sandbox (remote)
87
-
88
- Offloads execution to a remote sandbox API. Implement the `SandboxProvider` interface for your provider (Rivet, E2B, etc.).
50
+ All options on `agent.run()`:
89
51
 
90
52
  ```ts
91
- import { createAgent, createSandboxContext } from 'zidane'
92
- import type { SandboxProvider } from 'zidane'
93
-
94
- const myProvider: SandboxProvider = {
95
- name: 'my-sandbox',
96
- spawn: async (config) => { /* ... */ },
97
- exec: async (id, command) => { /* ... */ },
98
- readFile: async (id, path) => { /* ... */ },
99
- writeFile: async (id, path, content) => { /* ... */ },
100
- listFiles: async (id, path) => { /* ... */ },
101
- destroy: async (id) => { /* ... */ },
102
- }
103
-
104
- const agent = createAgent({
105
- harness,
106
- provider,
107
- execution: createSandboxContext(myProvider),
53
+ await agent.run({
54
+ prompt: 'your task', // required
55
+ model: 'claude-opus-4-6',
56
+ system: 'be concise',
57
+ thinking: 'medium', // off | minimal | low | medium | high
58
+ thinkingBudget: 8192, // overrides level-based default
59
+ maxTurns: 10, // overrides agent-level default
60
+ maxTokens: 4096, // overrides agent-level default
61
+ images: [], // base64 images
62
+ signal: abortController.signal,
108
63
  })
109
64
  ```
110
65
 
111
- ### Execution Context Interface
112
-
113
- All contexts implement the same interface:
114
-
115
- ```ts
116
- interface ExecutionContext {
117
- type: 'process' | 'docker' | 'sandbox'
118
- capabilities: { shell, filesystem, network, gpu }
119
- spawn(config?): Promise<ExecutionHandle>
120
- exec(handle, command, options?): Promise<ExecResult>
121
- readFile(handle, path): Promise<string>
122
- writeFile(handle, path, content): Promise<void>
123
- listFiles(handle, path): Promise<string[]>
124
- destroy(handle): Promise<void>
125
- }
126
- ```
66
+ Per-run options override agent-level defaults. Agent-level defaults override hardcoded defaults.
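The layering described above (per-run over agent-level over hardcoded) can be pictured as a chain of fallbacks. The sketch below is illustrative only: `resolveOptions`, `HARDCODED_DEFAULTS`, and the default values in it are assumptions made for the example, not zidane's internals.

```ts
// Illustrative sketch: names and default values are assumptions, not zidane internals.
interface LoopOptions {
  maxTurns: number
  maxTokens: number
}

// Hypothetical hardcoded defaults; the README does not state the real values.
const HARDCODED_DEFAULTS: LoopOptions = { maxTurns: 50, maxTokens: 16384 }

// Each option falls through: per-run value, then agent-level value, then hardcoded default.
function resolveOptions(
  agentLevel: Partial<LoopOptions> = {},
  perRun: Partial<LoopOptions> = {},
): LoopOptions {
  return {
    maxTurns: perRun.maxTurns ?? agentLevel.maxTurns ?? HARDCODED_DEFAULTS.maxTurns,
    maxTokens: perRun.maxTokens ?? agentLevel.maxTokens ?? HARDCODED_DEFAULTS.maxTokens,
  }
}
```

Using `??` rather than `||` means an explicit `0` at a higher-priority layer is respected instead of being treated as "unset".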
127
67
 
128
- Access the context from a running agent:
68
+ ## CLI
129
69
 
130
- ```ts
131
- agent.execution // ExecutionContext
132
- agent.execution.type // 'process' | 'docker' | 'sandbox'
133
- agent.handle // ExecutionHandle (after first run)
134
- await agent.destroy() // clean up context resources
70
+ ```bash
71
+ bun start \
72
+ --prompt "your task" \ # required
73
+ --model claude-opus-4-6 \ # model id
74
+ --provider anthropic \ # anthropic | openrouter | cerebras
75
+ --harness basic \ # tool set
76
+ --system "be concise" \ # system prompt
77
+ --thinking off \ # off | minimal | low | medium | high
78
+ --context process \ # process | docker
79
+ --mcp '{"name":"fs","transport":"stdio","command":"npx","args":["-y","@modelcontextprotocol/server-filesystem","."]}'
135
80
  ```
136
81
 
137
82
  ## Providers
138
83
 
139
- ### Anthropic
84
+ All providers accept runtime credentials via a params object; environment variables are used as fallbacks.
140
85
 
141
- Direct Anthropic API with OAuth and API key support.
86
+ ### Anthropic
142
87
 
143
- ```bash
144
- # OAuth (Claude Pro/Max subscription)
145
- bun run auth
88
+ ```ts
89
+ import { anthropic } from 'zidane'
146
90
 
147
- # Or API key
148
- ANTHROPIC_API_KEY=sk-ant-... bun start --prompt "hello"
91
+ anthropic({ apiKey: 'sk-ant-...' })
92
+ anthropic({ access: 'sk-ant-oat-...' }) // OAuth
93
+ anthropic({ apiKey: '...', defaultModel: 'claude-sonnet-4-6' })
149
94
  ```
150
95
 
96
+ Fallback: `params.apiKey` > `params.access` > `ANTHROPIC_API_KEY` env > `.credentials.json`
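Read left to right, the fallback chain above is a first-match lookup. A minimal sketch of that order follows; `resolveAnthropicCredential`, its return shape, and the `stored` argument (standing in for whatever `.credentials.json` contains) are hypothetical names chosen for illustration.

```ts
// Illustrative sketch of the lookup order; every name here is hypothetical.
interface AnthropicParams {
  apiKey?: string
  access?: string
}

interface ResolvedCredential {
  value: string
  source: 'params.apiKey' | 'params.access' | 'ANTHROPIC_API_KEY' | '.credentials.json'
}

function resolveAnthropicCredential(
  params: AnthropicParams,
  env: Record<string, string | undefined>,
  stored?: string, // assumed: a credential already read from .credentials.json
): ResolvedCredential | undefined {
  // Explicit params win over everything else.
  if (params.apiKey) return { value: params.apiKey, source: 'params.apiKey' }
  if (params.access) return { value: params.access, source: 'params.access' }
  // Then the environment variable.
  const fromEnv = env['ANTHROPIC_API_KEY']
  if (fromEnv) return { value: fromEnv, source: 'ANTHROPIC_API_KEY' }
  // Finally the stored credentials file.
  if (stored) return { value: stored, source: '.credentials.json' }
  return undefined
}
```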
97
+
151
98
  ### OpenRouter
152
99
 
153
- Access 200+ models through OpenRouter's unified API.
100
+ ```ts
101
+ import { openrouter } from 'zidane'
154
102
 
155
- ```bash
156
- OPENROUTER_API_KEY=sk-or-... bun start \
157
- --provider openrouter \
158
- --model anthropic/claude-sonnet-4-6 \
159
- --prompt "hello"
103
+ openrouter({ apiKey: 'sk-or-...', defaultModel: 'google/gemini-pro' })
160
104
  ```
161
105
 
162
- ### Cerebras
163
-
164
- Ultra-fast inference on Cerebras wafer-scale hardware.
106
+ Fallback: `params.apiKey` > `OPENROUTER_API_KEY` env
165
107
 
166
- ```bash
167
- CEREBRAS_API_KEY=csk-... bun start \
168
- --provider cerebras \
169
- --model zai-glm-4.7 \
170
- --prompt "hello"
171
- ```
172
-
173
- ## Thinking
108
+ ### Cerebras
174
109
 
175
- Extended reasoning for complex tasks. Maps to Anthropic's thinking API or OpenRouter's `:thinking` variant.
110
+ ```ts
111
+ import { cerebras } from 'zidane'
176
112
 
177
- ```bash
178
- bun start --prompt "solve this proof" --thinking high
113
+ cerebras({ apiKey: 'csk-...', defaultModel: 'zai-glm-4.7' })
179
114
  ```
180
115
 
181
- | Level | Budget |
182
- |---|---|
183
- | `off` | disabled |
184
- | `minimal` | 1k tokens |
185
- | `low` | 4k tokens |
186
- | `medium` | 10k tokens |
187
- | `high` | 32k tokens |
116
+ Fallback: `params.apiKey` > `CEREBRAS_API_KEY` env
188
117
 
189
- ## Tools (Harnesses)
118
+ ## Harnesses
190
119
 
191
120
  Tools are grouped into **harnesses**. The `basic` harness includes:
192
121
 
@@ -196,231 +125,192 @@ Tools are grouped into **harnesses**. The `basic` harness includes:
196
125
  | `read_file` | Read file contents |
197
126
  | `write_file` | Write/create files |
198
127
  | `list_files` | List directory contents |
199
- | `spawn` | Spawn a sub-agent for a task |
128
+ | `spawn` | Spawn a sub-agent |
200
129
 
201
- All paths are sandboxed to the working directory.
202
-
203
- Define a custom harness with `defineHarness`:
130
+ Define a custom harness:
204
131
 
205
132
  ```ts
206
- import { defineHarness } from 'zidane'
133
+ import { defineHarness, basicTools } from 'zidane'
207
134
 
208
135
  const harness = defineHarness({
209
136
  name: 'researcher',
210
137
  system: 'You are a research assistant.',
211
138
  tools: { ...basicTools },
212
- mcpServers: [
213
- { name: 'filesystem', transport: 'stdio', command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '.'] },
214
- ],
215
139
  })
216
140
  ```
217
141
 
218
- ## Sub-agent Spawning
219
-
220
- The `spawn` tool lets the agent delegate tasks to child agents. Children run independently and return their result as a tool response.
221
-
222
- ### Static spawn tool
142
+ For pure chat with no tools:
223
143
 
224
144
  ```ts
225
- import { spawn, basicTools, defineHarness } from 'zidane'
226
-
227
- const harness = defineHarness({
228
- name: 'orchestrator',
229
- tools: { ...basicTools, spawn },
230
- })
145
+ const agent = createAgent({ provider, enableTools: false })
231
146
  ```
232
147
 
233
- Children inherit the parent's harness (and can spawn their own children).
148
+ ## Thinking
234
149
 
235
- ### Configurable factory
150
+ Extended reasoning with named levels or exact token budgets.
236
151
 
237
- Use `createSpawnTool` when you need custom concurrency limits, model overrides, or lifecycle callbacks.
152
+ | Level | Default budget |
153
+ |---|---|
154
+ | `off` | disabled |
155
+ | `minimal` | 1,024 tokens |
156
+ | `low` | 4,096 tokens |
157
+ | `medium` | 10,240 tokens |
158
+ | `high` | 32,768 tokens |
238
159
 
239
160
  ```ts
240
- import { createSpawnTool } from 'zidane'
241
-
242
- const spawnTool = createSpawnTool({
243
- maxConcurrent: 5,
244
- model: 'claude-haiku-4-5-20251001',
245
- system: 'You are a focused sub-agent.',
246
- thinking: 'low',
247
- onSpawn: (child) => console.log(`started ${child.id}`),
248
- onComplete: (child, stats) => console.log(`${child.id} done in ${stats.turns} turns`),
249
- })
161
+ // Named level
162
+ await agent.run({ prompt: 'solve this', thinking: 'high' })
250
163
 
251
- const harness = defineHarness({
252
- name: 'orchestrator',
253
- tools: { spawn: spawnTool },
254
- })
164
+ // Exact budget (overrides level default)
165
+ await agent.run({ prompt: 'solve this', thinking: 'high', thinkingBudget: 50000 })
166
+
167
+ // Agent-level default
168
+ const agent = createAgent({ provider, harness, thinkingBudget: 16384 })
255
169
  ```
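The relationship between a named level and an exact budget can be sketched as a table lookup with an optional override. This is a sketch under assumptions: `resolveThinkingBudget` is a hypothetical helper, and representing `off` as a budget of `0` is a choice made for the example.

```ts
// Illustrative sketch: budgets come from the table above; the helper itself is hypothetical.
type ThinkingLevel = 'off' | 'minimal' | 'low' | 'medium' | 'high'

const LEVEL_BUDGETS: Record<ThinkingLevel, number> = {
  off: 0, // assumption: "disabled" is modeled as a zero budget
  minimal: 1024,
  low: 4096,
  medium: 10240,
  high: 32768,
}

// An exact budget, when given, overrides the level's default.
function resolveThinkingBudget(level: ThinkingLevel, exactBudget?: number): number {
  return exactBudget ?? LEVEL_BUDGETS[level]
}
```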
256
170
 
257
- ## MCP Servers
171
+ ## Hooks
258
172
 
259
- Connect any MCP-compatible tool server. Tools are namespaced as `mcp_{serverName}_{toolName}`.
173
+ Every hook receives a mutable context object.
260
174
 
261
- ### Agent-level
175
+ ### Turn lifecycle
262
176
 
263
177
  ```ts
264
- const agent = createAgent({
265
- harness,
266
- provider,
267
- mcpServers: [
268
- { name: 'filesystem', transport: 'stdio', command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '.'] },
269
- { name: 'search', transport: 'sse', url: 'http://localhost:3001/sse' },
270
- { name: 'api', transport: 'streamable-http', url: 'http://localhost:3002/mcp' },
271
- ],
178
+ agent.hooks.hook('turn:before', (ctx) => {
179
+ // ctx.turn, ctx.turnId, ctx.options (StreamOptions)
272
180
  })
273
- ```
274
181
 
275
- ### Harness-level
276
-
277
- MCP servers can also be declared on the harness so they're shared across all agents using it.
182
+ agent.hooks.hook('turn:after', (ctx) => {
183
+ // ctx.turn, ctx.turnId, ctx.usage { input, output }
184
+ // Always fires even if the provider throws mid-stream
185
+ })
278
186
 
279
- ```ts
280
- const harness = defineHarness({
281
- name: 'with-mcp',
282
- tools: { ...basicTools },
283
- mcpServers: [
284
- { name: 'db', transport: 'stdio', command: 'node', args: ['db-server.js'] },
285
- ],
187
+ agent.hooks.hook('agent:done', (ctx) => {
188
+ // ctx.totalIn, ctx.totalOut, ctx.turns, ctx.elapsed, ctx.children?
189
+ // Fires on all exit paths: completion, maxTurns, and abort
286
190
  })
287
191
  ```
288
192
 
289
- MCP connections are made lazily on the first `run()` call and reused across subsequent runs. They are closed when `agent.destroy()` is called.
290
-
291
- ## Sessions
193
+ ### Streaming
292
194
 
293
- Sessions give an agent persistent identity, turn history, and run metadata across multiple calls or restarts. Each message exchange is a `SessionTurn` with its own UUID, enabling real-time multiplayer streaming.
195
+ ```ts
196
+ agent.hooks.hook('stream:text', (ctx) => {
197
+ // ctx.delta, ctx.text, ctx.turnId, ctx.blockIndex
198
+ })
294
199
 
295
- ### SessionTurn
200
+ agent.hooks.hook('stream:end', (ctx) => {
201
+ // ctx.text (final), ctx.turnId, ctx.blockIndex
202
+ // Only fires when there is text content (not on tool-only turns)
203
+ })
204
+ ```
296
205
 
297
- Every message in a session is a turn:
206
+ ### Tool execution
298
207
 
299
208
  ```ts
300
- interface SessionTurn {
301
- id: string // UUID generated by store or crypto.randomUUID()
302
- role: 'user' | 'assistant' | 'system'
303
- content: SessionContentBlock[] // same format used by providers
304
- usage?: TurnUsage // token usage (assistant turns only)
305
- createdAt: number // timestamp
306
- }
209
+ agent.hooks.hook('tool:before', (ctx) => { /* ctx.name, ctx.input */ })
210
+ agent.hooks.hook('tool:after', (ctx) => { /* ctx.name, ctx.input, ctx.result */ })
211
+ agent.hooks.hook('tool:error', (ctx) => { /* ctx.name, ctx.input, ctx.error */ })
307
212
  ```
308
213
 
309
- ### Creating a session
214
+ ### Tool gate
310
215
 
311
- `createSession` is async, so stores can generate IDs server-side (e.g. Supabase).
216
+ Block a tool from running:
312
217
 
313
218
  ```ts
314
- import { createSession, createMemoryStore } from 'zidane/session'
315
-
316
- // In-memory (default, no persistence)
317
- const session = await createSession({ id: 'my-session', agentId: 'my-agent' })
318
-
319
- // With a store for persistence
320
- const store = createMemoryStore()
321
- const session = await createSession({ id: 'my-session', store })
219
+ agent.hooks.hook('tool:gate', (ctx) => {
220
+ if (ctx.name === 'shell' && String(ctx.input.command).includes('rm -rf')) {
221
+ ctx.block = true
222
+ ctx.reason = 'dangerous command'
223
+ }
224
+ })
322
225
  ```
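For more than one dangerous pattern, the gate logic can be pulled into a small function and tested without an agent. The sketch below assumes the hook context has the `name`, `input`, `block`, and `reason` fields used in the example above; `GateContext`, `DENIED_PATTERNS`, and `gateShellCommand` are hypothetical names.

```ts
// Illustrative sketch: the context shape is inferred from the hook example, not a published type.
interface GateContext {
  name: string
  input: Record<string, unknown>
  block?: boolean
  reason?: string
}

// Hypothetical deny-list; extend it to suit your environment.
const DENIED_PATTERNS: RegExp[] = [/\brm\s+-rf\b/, /\bmkfs\b/]

// Marks the context as blocked when a shell command matches any denied pattern.
function gateShellCommand(ctx: GateContext, denied: RegExp[] = DENIED_PATTERNS): void {
  if (ctx.name !== 'shell') return
  const command = String(ctx.input['command'] ?? '')
  const hit = denied.find((pattern) => pattern.test(command))
  if (hit) {
    ctx.block = true
    ctx.reason = `command matches denied pattern ${hit}`
  }
}
```

If the context shape holds, it could be registered as `agent.hooks.hook('tool:gate', (ctx) => gateShellCommand(ctx))`. Substring or regex matching is easy to evade, so treat it as a guard rail rather than a security boundary.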
323
226
 
324
- ### Storage backends
227
+ ### Tool transform
325
228
 
326
- Three built-in stores are available. All implement the full `SessionStore` interface including incremental operations.
229
+ Modify tool output before it's sent back to the model:
327
230
 
328
231
  ```ts
329
- import { createMemoryStore, createSqliteStore, createRemoteStore } from 'zidane/session'
232
+ agent.hooks.hook('tool:transform', (ctx) => {
233
+ if (ctx.result.length > 5000)
234
+ ctx.result = ctx.result.slice(0, 5000) + '\n... (truncated)'
235
+ })
236
+ ```
330
237
 
331
- // In-memory, fast, no disk I/O, lost on process restart
332
- const memStore = createMemoryStore()
238
+ ### Context transform
333
239
 
334
- // SQLite, persistent, zero-dependency (uses Bun's built-in SQLite)
335
- const sqliteStore = createSqliteStore({ path: './sessions.db' })
240
+ Prune messages before each LLM call:
336
241
 
337
- // Remote HTTP, delegates to a custom REST API
338
- const remoteStore = createRemoteStore({ url: 'https://api.example.com/sessions' })
242
+ ```ts
243
+ agent.hooks.hook('context:transform', (ctx) => {
244
+ if (ctx.messages.length > 30)
245
+ ctx.messages.splice(2, ctx.messages.length - 30)
246
+ })
339
247
  ```
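It may help to see what that `splice` call keeps. Removing `length - 30` entries starting at index 2 leaves the first two messages and the most recent 28. The sketch below reproduces the same arithmetic on a plain array; `pruneMessages` and its parameters are hypothetical and stand in for the hook body.

```ts
// Illustrative sketch: same splice arithmetic as the hook above, on a plain array.
function pruneMessages<T>(messages: T[], keep = 30, head = 2): T[] {
  if (messages.length > keep) {
    // Drop the oldest entries after the first `head`, so exactly `keep` remain.
    messages.splice(head, messages.length - keep)
  }
  return messages
}

// 40 labelled messages: m0 .. m39
const sample = pruneMessages(Array.from({ length: 40 }, (_, i) => `m${i}`))
console.log(sample.length, sample.slice(0, 3))
```

With 40 messages, indices 2 through 11 are removed, so the array becomes `m0, m1, m12, ..., m39`.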
340
248
 
341
- ### SessionStore interface
249
+ ## Steering and Follow-up
250
+
251
+ ### Steering
252
+
253
+ Inject a message while the agent is working. Delivered between tool calls.
342
254
 
343
255
  ```ts
344
- interface SessionStore {
345
- // Optional: server-side ID generation
346
- generateSessionId?: () => string | Promise<string>
347
- generateTurnId?: () => string | Promise<string>
348
-
349
- // Core CRUD
350
- load: (sessionId: string) => Promise<SessionData | null>
351
- save: (session: SessionData) => Promise<void>
352
- delete: (sessionId: string) => Promise<void>
353
- list: (filter?) => Promise<string[]>
354
-
355
- // Incremental operations (avoids full re-save)
356
- appendTurns: (sessionId: string, turns: SessionTurn[]) => Promise<void>
357
- getTurns: (sessionId: string, from?: number, limit?: number) => Promise<SessionTurn[]>
358
- updateRun: (sessionId: string, run: SessionRun) => Promise<void>
359
- updateStatus: (sessionId: string, status: SessionStatus) => Promise<void>
360
- }
256
+ agent.steer('focus only on the tests directory')
361
257
  ```
362
258
 
363
- Custom ID generation lets external databases (e.g. Supabase) provide UUIDs server-side, keeping IDs in sync:
259
+ ### Follow-up
260
+
261
+ Queue messages that extend the conversation after the agent finishes.
364
262
 
365
263
  ```ts
366
- const store = createRemoteStore({ url: '...' })
367
- store.generateTurnId = async () => {
368
- const { data } = await supabase.rpc('gen_random_uuid')
369
- return data
370
- }
264
+ agent.followUp('now write tests for what you built')
371
265
  ```
372
266
 
373
- ### Agent integration
267
+ ## Sub-agent Spawning
268
+
269
+ The `spawn` tool delegates tasks to child agents that run independently.
374
270
 
375
271
  ```ts
376
- const agent = createAgent({
377
- harness,
378
- provider,
379
- session,
380
- })
272
+ import { createSpawnTool, defineHarness, basicTools } from 'zidane'
381
273
 
382
- await agent.run({ prompt: 'hello' })
383
- await session.save() // persist to store
274
+ const harness = defineHarness({
275
+ name: 'orchestrator',
276
+ tools: {
277
+ ...basicTools,
278
+ spawn: createSpawnTool({
279
+ maxConcurrent: 5,
280
+ model: 'claude-haiku-4-5-20251001',
281
+ thinking: 'low',
282
+ }),
283
+ },
284
+ })
384
285
  ```
385
286
 
386
- Turns are persisted incrementally after each agent turn via `appendTurns` — not as a full document save. If the agent crashes mid-run, you still have turns up to the last completed turn.
287
+ Children inherit the parent's harness and can spawn their own children.
387
288
 
388
- ### Session status
289
+ ## Sessions
389
290
 
390
- Sessions track their status: `'idle' | 'running' | 'completed' | 'error'`. The agent updates it automatically during runs.
291
+ Sessions give an agent persistent turn history and run metadata across calls.
391
292
 
392
293
  ```ts
393
- session.status // 'idle'
394
- await agent.run({ prompt: 'go' })
395
- // idle → running → completed (or error)
396
- ```
294
+ import { createAgent, createSession, createSqliteStore } from 'zidane'
397
295
 
398
- ### Session hooks
296
+ const store = createSqliteStore({ path: './sessions.db' })
297
+ const session = await createSession({ store })
399
298
 
400
- ```ts
401
- agent.hooks.hook('session:start', (ctx) => {
402
- // ctx.sessionId, ctx.runId, ctx.prompt
403
- })
299
+ const agent = createAgent({ harness, provider, session })
300
+ await agent.run({ prompt: 'hello' })
301
+ await session.save()
302
+ ```
404
303
 
405
- agent.hooks.hook('session:end', (ctx) => {
406
- // ctx.sessionId, ctx.runId
407
- // ctx.status: 'completed' | 'aborted' | 'error'
408
- })
304
+ Turns are persisted incrementally after each turn rather than in one full save. If the agent crashes mid-run, every turn up to the last completed one is already stored.
409
305
 
410
- agent.hooks.hook('session:turns', (ctx) => {
411
- // ctx.sessionId, ctx.count
412
- // fired after each turn (incremental sync)
413
- })
306
+ ### Storage backends
414
307
 
415
- agent.hooks.hook('session:save', (ctx) => {
416
- // ctx.sessionId
417
- // fired after session.save() completes
418
- })
308
+ ```ts
309
+ import { createMemoryStore, createSqliteStore, createRemoteStore } from 'zidane/session'
419
310
 
420
- agent.hooks.hook('session:meta', (ctx) => {
421
- // ctx.sessionId, ctx.key, ctx.value
422
- // fired when session.setMeta() is called
423
- })
311
+ createMemoryStore() // in-memory, no persistence
312
+ createSqliteStore({ path: './sessions.db' }) // SQLite (Bun built-in)
313
+ createRemoteStore({ url: 'https://api.example.com' }) // HTTP REST API
424
314
  ```
425
315
 
426
316
  ### Restoring a session
@@ -431,227 +321,144 @@ import { loadSession } from 'zidane/session'
431
321
  const session = await loadSession(store, 'my-session')
432
322
  if (session) {
433
323
  const agent = createAgent({ harness, provider, session })
434
- await agent.run({ prompt: 'continue from before' })
324
+ await agent.run({ prompt: 'continue' })
435
325
  }
436
326
  ```
437
327
 
438
- ## Hooks
439
-
440
- The agent uses [hookable](https://github.com/unjs/hookable) for lifecycle events. Every hook receives a mutable context object.
441
-
442
- ### Lifecycle
328
+ ### Session hooks
443
329
 
444
330
  ```ts
445
- agent.hooks.hook('system:before', (ctx) => {
446
- // ctx.system: system prompt text
447
- })
448
-
449
- agent.hooks.hook('turn:before', (ctx) => {
450
- // ctx.turn: turn number
451
- // ctx.turnId: UUID for this turn (generated before LLM call)
452
- // ctx.options: StreamOptions being sent to provider
453
- })
454
-
455
- agent.hooks.hook('turn:after', (ctx) => {
456
- // ctx.turn, ctx.turnId, ctx.usage { input, output }
457
- })
458
-
459
- agent.hooks.hook('agent:done', (ctx) => {
460
- // ctx.totalIn, ctx.totalOut, ctx.turns, ctx.elapsed, ctx.children?
461
- })
462
-
463
- agent.hooks.hook('agent:abort', () => {
464
- // fired when agent.abort() is called
465
- })
331
+ agent.hooks.hook('session:start', (ctx) => { /* ctx.sessionId, ctx.runId, ctx.prompt */ })
332
+ agent.hooks.hook('session:end', (ctx) => { /* ctx.sessionId, ctx.runId, ctx.status */ })
333
+ agent.hooks.hook('session:turns', (ctx) => { /* ctx.sessionId, ctx.count */ })
466
334
  ```
467
335
 
468
- ### Streaming
469
-
470
- ```ts
471
- agent.hooks.hook('stream:text', (ctx) => {
472
- // ctx.delta: new text chunk
473
- // ctx.text: accumulated text so far
474
- // ctx.turnId: UUID of the turn being streamed
475
- // ctx.blockIndex: content block index within the turn
476
- })
477
-
478
- agent.hooks.hook('stream:end', (ctx) => {
479
- // ctx.text: final complete text
480
- // ctx.turnId, ctx.blockIndex
481
- })
482
- ```
336
+ ## MCP Servers
483
337
 
484
- ### Tool Execution
338
+ Connect any MCP-compatible tool server. Tools are namespaced as `mcp_{server}_{tool}`.
485
339
 
486
340
  ```ts
487
- agent.hooks.hook('tool:before', (ctx) => {
488
- // ctx.name, ctx.input
489
- })
490
-
491
- agent.hooks.hook('tool:after', (ctx) => {
492
- // ctx.name, ctx.input, ctx.result
493
- })
494
-
495
- agent.hooks.hook('tool:error', (ctx) => {
496
- // ctx.name, ctx.input, ctx.error
341
+ const agent = createAgent({
342
+ harness,
343
+ provider,
344
+ mcpServers: [
345
+ { name: 'fs', transport: 'stdio', command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '.'] },
346
+ { name: 'api', transport: 'streamable-http', url: 'http://localhost:3002/mcp' },
347
+ ],
497
348
  })
498
349
  ```
499
350
 
500
- ### Tool Gate: block execution
351
+ MCP servers can also be declared on the harness. Connections are opened lazily on the first `run()` call and reused across subsequent runs.
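The `mcp_{server}_{tool}` naming scheme is simple enough to sketch directly. `mcpToolName` is a hypothetical helper, and the tool name in the usage line is only an example.

```ts
// Illustrative sketch of the naming scheme; the helper is hypothetical.
function mcpToolName(server: string, tool: string): string {
  return `mcp_${server}_${tool}`
}

console.log(mcpToolName('fs', 'read_file'))
```

Because both server and tool names may contain underscores, splitting a namespaced name back into its parts is ambiguous. The sketch therefore only builds names and does not parse them.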
501
352
 
502
- Mutate `ctx.block = true` to prevent a tool from running.
353
+ ## Skills
503
354
 
504
- ```ts
505
- agent.hooks.hook('tool:gate', (ctx) => {
506
- if (ctx.name === 'shell' && String(ctx.input.command).includes('rm -rf')) {
507
- ctx.block = true
508
- ctx.reason = 'dangerous command'
509
- }
510
- })
511
- ```
355
+ Reusable instruction packages following the [Agent Skills](https://agentskills.io/specification) open standard.
512
356
 
513
- ### Tool Transform: modify output
357
+ ### SKILL.md format
514
358
 
515
- Mutate `ctx.result` or `ctx.isError` to transform tool results before they're sent back to the model.
516
-
517
- ```ts
518
- agent.hooks.hook('tool:transform', (ctx) => {
519
- if (ctx.result.length > 5000)
520
- ctx.result = ctx.result.slice(0, 5000) + '\n... (truncated)'
521
- })
522
359
  ```
523
-
524
- ### Context Transform: prune messages
525
-
526
- Mutate `ctx.messages` before each LLM call for context window management.
527
-
528
- ```ts
529
- agent.hooks.hook('context:transform', (ctx) => {
530
- if (ctx.messages.length > 30)
531
- ctx.messages.splice(2, ctx.messages.length - 30)
532
- })
360
+ my-skill/
361
+ SKILL.md
362
+ scripts/ # optional
363
+ references/ # optional
364
+ assets/ # optional
533
365
  ```
534
366
 
535
- ### Spawn hooks
367
+ ```markdown
368
+ ---
369
+ name: my-skill
370
+ description: When to activate this skill.
371
+ model: claude-opus-4-6
372
+ thinking: low
373
+ allowed-tools: Bash Read Write
374
+ paths: "src/**/*.ts, test/**/*.ts"
375
+ ---
536
376
 
537
- Fired by the `spawn` tool when child agents are created.
538
-
539
- ```ts
540
- agent.hooks.hook('spawn:before', (ctx) => {
541
- // ctx.id: child agent id (e.g. 'child-1')
542
- // ctx.task: the task prompt given to the child
543
- })
544
-
545
- agent.hooks.hook('spawn:complete', (ctx) => {
546
- // ctx.id, ctx.task
547
- // ctx.stats: AgentStats from the child run
548
- })
549
-
550
- agent.hooks.hook('spawn:error', (ctx) => {
551
- // ctx.id, ctx.task, ctx.error
552
- })
377
+ Full instructions the agent receives when this skill activates.
553
378
  ```
554
379
 
555
- ### MCP hooks
380
+ ### Discovery
556
381
 
557
- Fired during MCP server lifecycle.
382
+ Scan paths in priority order (first found wins):
558
383
 
559
- ```ts
560
- agent.hooks.hook('mcp:connect', (ctx) => {
561
- // ctx.name: server name
562
- // ctx.transport: 'stdio' | 'sse' | 'streamable-http'
563
- // ctx.tools: namespaced tool names discovered on this server
564
- })
384
+ 1. `{cwd}/.agents/skills`
385
+ 2. `{cwd}/.zidane/skills`
386
+ 3. `~/.agents/skills`
387
+ 4. `~/.zidane/skills`
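The priority order above can be sketched as an ordered list plus a first-match lookup. This reads "first found wins" as applying per skill name, which is an assumption. `skillScanPaths`, `findSkill`, and the injected `exists` check are hypothetical names for the example.

```ts
// Illustrative sketch: ordering taken from the list above; helper names are hypothetical.
function skillScanPaths(cwd: string, home: string): string[] {
  return [
    `${cwd}/.agents/skills`,
    `${cwd}/.zidane/skills`,
    `${home}/.agents/skills`,
    `${home}/.zidane/skills`,
  ]
}

// Returns the first SKILL.md found for `name`, walking the paths in priority order.
// `exists` is injected so the sketch stays independent of any filesystem API.
function findSkill(
  name: string,
  paths: string[],
  exists: (file: string) => boolean,
): string | undefined {
  for (const dir of paths) {
    const candidate = `${dir}/${name}/SKILL.md`
    if (exists(candidate)) return candidate
  }
  return undefined
}
```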
565
388
 
566
- agent.hooks.hook('mcp:error', (ctx) => {
567
- // ctx.name: server name
568
- // ctx.error: connection error
569
- })
389
+ ### Configuration
570
390
 
571
- agent.hooks.hook('mcp:close', (ctx) => {
572
- // ctx.name: server name being closed
573
- })
574
-
575
- agent.hooks.hook('mcp:tool:before', (ctx) => {
576
- // ctx.server: MCP server name
577
- // ctx.tool: original tool name (not namespaced)
578
- // ctx.input: tool arguments
579
- })
580
-
581
- agent.hooks.hook('mcp:tool:after', (ctx) => {
582
- // ctx.server, ctx.tool, ctx.input
583
- // ctx.result: tool output string
584
- })
391
+ ```ts
392
+ import { createAgent, defineSkill } from 'zidane'
585
393
 
586
- agent.hooks.hook('mcp:tool:error', (ctx) => {
587
- // ctx.server, ctx.tool, ctx.input, ctx.error
394
+ const agent = createAgent({
395
+ harness,
396
+ provider,
397
+ skills: {
398
+ scan: ['./custom-skills'],
399
+ write: [
400
+ defineSkill({
401
+ name: 'review',
402
+ description: 'Code review guidelines.',
403
+ instructions: 'Review for correctness and test coverage.',
404
+ }),
405
+ ],
406
+ exclude: ['deprecated-skill'],
407
+ enabled: ['review', 'deploy'],
408
+ },
588
409
  })
589
410
  ```
590
411
 
591
- ### Steering inject
412
+ Instructions support `` !`command` `` for dynamic content: commands run during resolution, and their output replaces the placeholder.
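The placeholder substitution can be sketched as a regex replace over the instruction text. `renderInstructions` is a hypothetical helper, the command runner is injected rather than spawning a real shell, and trimming the output is an assumption made for the example.

```ts
// Illustrative sketch: replaces each !`command` placeholder with the runner's output.
function renderInstructions(text: string, run: (command: string) => string): string {
  return text.replace(/!`([^`]+)`/g, (_match, command: string) => run(command).trim())
}

// A fake runner stands in for real command execution.
const rendered = renderInstructions('Current branch: !`git branch --show-current`', () => 'main\n')
console.log(rendered)
```

Injecting the runner keeps the substitution testable and makes it clear where command execution, and any sandboxing around it, would plug in.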
592
413
 
593
- ```ts
594
- agent.hooks.hook('steer:inject', (ctx) => {
595
- // ctx.message: the steering message being injected
596
- })
597
- ```
598
-
599
- ## Steering and Follow-up
414
+ ## Execution Contexts
600
415
 
601
- ### Steering: interrupt mid-run
416
+ An execution context defines **where** tools run. Defaults to in-process.
602
417
 
603
- Inject a message while the agent is working. Delivered between tool calls, skipping remaining tools in the current turn.
418
+ ### Docker
604
419
 
605
420
  ```ts
606
- agent.hooks.hook('tool:after', () => {
607
- agent.steer('focus only on the tests directory')
421
+ import { createAgent, createDockerContext } from 'zidane'
422
+
423
+ const agent = createAgent({
424
+ harness,
425
+ provider,
426
+ execution: createDockerContext({
427
+ image: 'node:22',
428
+ cwd: '/workspace',
429
+ limits: { memory: 512, cpu: '1.0' },
430
+ }),
608
431
  })
609
432
  ```
610
433
 
611
- ### Follow-up: continue after done
434
+ ### Sandbox (remote)
612
435
 
613
- Queue messages that extend the conversation after the agent finishes.
436
+ Implement `SandboxProvider` for your provider (E2B, Rivet, etc.):
614
437
 
615
438
  ```ts
616
- agent.followUp('now write tests for what you built')
617
- agent.followUp('then update the README')
618
- ```
619
-
620
- ## Parallel Tool Execution
621
-
622
- Execute multiple tool calls from a single turn concurrently.
439
+ import { createAgent, createSandboxContext } from 'zidane'
623
440
 
624
- ```ts
625
441
  const agent = createAgent({
626
442
  harness,
627
443
  provider,
628
- toolExecution: 'parallel', // default: 'sequential'
444
+ execution: createSandboxContext(myProvider),
629
445
  })
630
446
  ```
631
447
 
632
- ## Image Content
633
-
634
- Pass images alongside the prompt.
448
+ ## State Management
635
449
 
636
450
  ```ts
637
- import { readFileSync } from 'fs'
638
-
639
- await agent.run({
640
- prompt: 'describe this screenshot',
641
- images: [{
642
- type: 'image',
643
- source: {
644
- type: 'base64',
645
- media_type: 'image/png',
646
- data: readFileSync('screenshot.png').toString('base64'),
647
- },
648
- }],
649
- })
451
+ agent.isRunning // is a run in progress?
452
+ agent.messages // conversation history
453
+ agent.abort() // cancel the current run
454
+ agent.reset() // clear messages and queues
455
+ await agent.destroy() // clean up context + MCP connections
456
+ await agent.waitForIdle() // wait for current run to complete
650
457
  ```
651
458
 
652
459
  ## Message Format
653
460
 
654
- All messages in zidane use the canonical `SessionMessage` format, with or without sessions:
461
+ All messages use a canonical format. Providers convert to/from wire formats internally.
655
462
 
656
463
  ```ts
657
464
  type SessionContentBlock =
@@ -667,7 +474,7 @@ interface SessionMessage {
667
474
  }
668
475
  ```
669
476
 
670
- Providers convert to and from native wire formats internally. Converters are available for external interop:
477
+ Converters for external interop:
671
478
 
672
479
  ```ts
673
480
  import { fromAnthropic, toAnthropic, fromOpenAI, toOpenAI, autoDetectAndConvert } from 'zidane'
@@ -675,101 +482,10 @@ import { fromAnthropic, toAnthropic, fromOpenAI, toOpenAI, autoDetectAndConvert
675
482
 
676
483
  ## Usage Tracking
677
484
 
678
- Every turn reports token usage. Provider-specific fields are optional:
679
-
680
- ```ts
681
- interface TurnUsage {
682
- input: number
683
- output: number
684
- cacheCreation?: number // Anthropic: tokens written to cache
685
- cacheRead?: number // Anthropic: tokens read from cache
686
- thinking?: number // thinking tokens used
687
- cost?: number // USD cost reported by provider (e.g. OpenRouter)
688
- }
689
- ```
690
-
691
- Per-turn data is available on `AgentStats` and `SessionRun`:
692
-
693
485
  ```ts
694
486
  const stats = await agent.run({ prompt: 'hello' })
695
- stats.turnUsage // TurnUsage[] per turn
696
- stats.cost // total cost (sum of per-turn costs, if reported)
697
-
698
- // In session runs
699
- session.runs[0].turnUsage // per-turn breakdown
700
- session.runs[0].totalUsage // aggregated TurnUsage
701
- session.runs[0].cost // total cost for this run
702
- ```
703
-
704
- ## State Management
705
-
706
- ```ts
707
- agent.isRunning // boolean: is a run in progress?
708
- agent.messages // SessionMessage[]: conversation history
709
- agent.execution // ExecutionContext: where tools run
710
- agent.handle // ExecutionHandle: spawned context handle
711
- agent.abort() // cancel the current run
712
- agent.reset() // clear messages and queues
713
- await agent.destroy() // clean up execution context and MCP connections
714
- await agent.waitForIdle() // wait for current run to complete
715
- ```
716
-
717
- ## Project Structure
718
-
719
- ```
720
- src/
721
- types.ts shared types
722
- agent.ts createAgent, AgentHooks, state management
723
- loop.ts turn execution loop
724
- start.ts CLI entrypoint
725
- auth.ts Anthropic OAuth flow
726
- index.ts package exports
727
- contexts/
728
- types.ts ExecutionContext interface, capabilities
729
- process.ts in-process context (default)
730
- docker.ts Docker container context
731
- sandbox.ts remote sandbox context
732
- index.ts barrel exports
733
- tools/
734
- index.ts tool exports
735
- validation.ts tool argument validation
736
- shell.ts shell tool
737
- read-file.ts read_file tool
738
- write-file.ts write_file tool
739
- list-files.ts list_files tool
740
- spawn.ts spawn tool and createSpawnTool factory
741
- providers/
742
- index.ts Provider interface
743
- openai-compat.ts shared OpenAI-compatible utilities
744
- anthropic.ts Anthropic provider
745
- openrouter.ts OpenRouter provider
746
- cerebras.ts Cerebras provider
747
- harnesses/
748
- index.ts HarnessConfig, defineHarness, ToolContext
749
- basic.ts basic harness (shell, read, write, list, spawn)
750
- mcp/
751
- index.ts MCP server connection and tool discovery
752
- session/
753
- index.ts Session interface, createSession, loadSession
754
- messages.ts SessionMessage converters (Anthropic/OpenAI)
755
- memory.ts in-memory session store
756
- sqlite.ts SQLite-backed session store
757
- remote.ts HTTP remote session store
758
- output/
759
- terminal.ts terminal rendering (md4x)
760
- test/
761
- mock-provider.ts mock provider for testing
762
- mock-context.ts mock execution context for testing
763
- agent.test.ts agent loop tests
764
- contexts.test.ts execution context tests
765
- harness.test.ts harness tests
766
- mcp.test.ts MCP connection and hook tests
767
- spawn.test.ts spawn tool and hook tests
768
- validation.test.ts validation tests
769
- providers.test.ts provider tests
770
- openai-compat.test.ts OpenAI-compat utility tests
771
- session.test.ts session store and agent integration tests
772
- session-messages.test.ts SessionMessage converter tests
487
+ stats.turnUsage // TurnUsage[] per-turn { input, output, cacheCreation?, cacheRead?, thinking?, cost? }
488
+ stats.cost // total USD cost (if reported by provider)
773
489
  ```
774
490
 
775
491
  ## Testing
@@ -778,9 +494,8 @@ test/
778
494
  bun test
779
495
  ```
780
496
 
781
- 300 tests with mock provider and mock execution context, no LLM calls or Docker needed.
497
+ 430+ tests with mock provider and execution context. No API keys or Docker needed.
782
498
 
783
499
  ## License
784
500
 
785
501
  ISC
786
-
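Note on the Usage Tracking hunk above: the 1.5.0 README drops the `TurnUsage` interface listing and the `session.runs[0].totalUsage` example, keeping only `stats.turnUsage` and `stats.cost`. Whether the aggregated field still exists in 1.5.0 cannot be determined from this diff. If it does not, a caller could aggregate per-turn usage with something like the sketch below. The interface shape is copied from the removed README lines; `sumUsage` and the sample data are hypothetical and are not part of zidane.

```typescript
// Shape taken from the TurnUsage interface shown in the removed README lines.
interface TurnUsage {
  input: number
  output: number
  cacheCreation?: number
  cacheRead?: number
  thinking?: number
  cost?: number
}

// Hypothetical helper (not a zidane export): fold per-turn usage into one total.
function sumUsage(turns: TurnUsage[]): TurnUsage {
  const total: TurnUsage = { input: 0, output: 0 }
  for (const turn of turns) {
    total.input += turn.input
    total.output += turn.output
    // Optional fields stay undefined unless at least one turn reported them.
    for (const key of ['cacheCreation', 'cacheRead', 'thinking', 'cost'] as const) {
      const value = turn[key]
      if (value !== undefined) total[key] = (total[key] ?? 0) + value
    }
  }
  return total
}

// Made-up sample data standing in for `stats.turnUsage`.
const sample: TurnUsage[] = [
  { input: 120, output: 40, cacheRead: 100 },
  { input: 80, output: 25, cost: 0.002 },
]
const total = sumUsage(sample)
console.log(total)
```

Leaving unreported optional fields as `undefined` rather than `0` preserves the distinction between "provider reported zero" and "provider did not report this field".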