zidane 1.1.5 → 1.2.0

package/README.md CHANGED
@@ -8,6 +8,8 @@ Minimal TypeScript agent loop built with [Bun](https://bun.sh).
8
8
 
9
9
  Hook into every step of the agent's execution using [hookable](https://github.com/unjs/hookable).
10
10
 
11
+ Built to be embedded in other projects easily, extended through [providers](#providers), [harnesses](#harnesses), and [execution contexts](#execution-contexts).
12
+
11
13
  ## Quickstart
12
14
 
13
15
  ```bash
@@ -30,7 +32,106 @@ bun start \
30
32
  --provider anthropic \ # anthropic | openrouter | cerebras
31
33
  --harness basic \ # tool set to use
32
34
  --system "be concise" \ # system prompt
33
- --thinking off # off | minimal | low | medium | high
35
+ --thinking off \ # off | minimal | low | medium | high
36
+ --context process \ # process | docker
37
+ --mcp '{"name":"fs","transport":"stdio","command":"npx","args":["-y","@modelcontextprotocol/server-filesystem","."]}'
38
+ ```
39
+
40
+ The `--mcp` flag accepts a JSON object matching `McpServerConfig`. It can be passed multiple times.
41
+
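As an illustration of how a repeatable `--mcp` flag can be consumed, the sketch below collects every `--mcp <json>` pair from an argument array. The `McpServerConfig` shape is inferred from the examples in this README and may not match the real type; `parseMcpFlags` is a hypothetical helper, not part of zidane's CLI.

```typescript
// Assumed shape, inferred from the README examples; the real McpServerConfig may differ.
type McpServerConfig =
  | { name: string; transport: 'stdio'; command: string; args?: string[] }
  | { name: string; transport: 'sse' | 'streamable-http'; url: string }

// Hypothetical helper: gather every `--mcp <json>` pair from an argv-style array.
function parseMcpFlags(argv: string[]): McpServerConfig[] {
  const configs: McpServerConfig[] = []
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] !== '--mcp') continue
    const raw = argv[i + 1]
    if (raw === undefined) throw new Error('--mcp requires a JSON argument')
    const parsed = JSON.parse(raw) as { name?: unknown; transport?: unknown } | null
    // Only the two fields shared by every transport are checked here.
    if (parsed === null || typeof parsed.name !== 'string' || typeof parsed.transport !== 'string')
      throw new Error(`invalid --mcp config: ${raw}`)
    configs.push(parsed as McpServerConfig)
    i++ // skip the JSON value that was just consumed
  }
  return configs
}
```

Each occurrence of the flag yields one entry, in the order given on the command line.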
42
+ ## Execution Contexts
43
+
44
+ An execution context defines **where** the agent's tools run. All tool operations (shell, filesystem) go through it.
45
+
46
+ ### In-process (default)
47
+
48
+ Runs in the same Node/Bun process. No isolation, fastest.
49
+
50
+ ```ts
51
+ import { createAgent, createProcessContext } from 'zidane'
52
+
53
+ const agent = createAgent({
54
+ harness,
55
+ provider,
56
+ // execution defaults to createProcessContext()
57
+ })
58
+ ```
59
+
60
+ ### Docker
61
+
62
+ Full container isolation via [dockerode](https://github.com/apocas/dockerode). Configurable resource limits.
63
+
64
+ ```bash
65
+ # CLI
66
+ bun start --prompt "run uname -a" --context docker
67
+ bun start --prompt "build the app" --context docker --image node:22 --cwd /workspace
68
+ ```
69
+
70
+ ```ts
71
+ import { createAgent, createDockerContext } from 'zidane'
72
+
73
+ const agent = createAgent({
74
+ harness,
75
+ provider,
76
+ execution: createDockerContext({
77
+ image: 'node:22',
78
+ cwd: '/workspace',
79
+ limits: { memory: 512, cpu: '1.0' },
80
+ }),
81
+ })
82
+ ```
83
+
84
+ Requires `dockerode` as a peer dependency: `bun add dockerode`
85
+
86
+ ### Sandbox (remote)
87
+
88
+ Offloads execution to a remote sandbox API. Implement the `SandboxProvider` interface for your provider (Rivet, E2B, etc.).
89
+
90
+ ```ts
91
+ import { createAgent, createSandboxContext } from 'zidane'
92
+ import type { SandboxProvider } from 'zidane'
93
+
94
+ const myProvider: SandboxProvider = {
95
+ name: 'my-sandbox',
96
+ spawn: async (config) => { /* ... */ },
97
+ exec: async (id, command) => { /* ... */ },
98
+ readFile: async (id, path) => { /* ... */ },
99
+ writeFile: async (id, path, content) => { /* ... */ },
100
+ listFiles: async (id, path) => { /* ... */ },
101
+ destroy: async (id) => { /* ... */ },
102
+ }
103
+
104
+ const agent = createAgent({
105
+ harness,
106
+ provider,
107
+ execution: createSandboxContext(myProvider),
108
+ })
109
+ ```
110
+
111
+ ### Execution Context Interface
112
+
113
+ All contexts implement the same interface:
114
+
115
+ ```ts
116
+ interface ExecutionContext {
117
+ type: 'process' | 'docker' | 'sandbox'
118
+ capabilities: { shell, filesystem, network, gpu }
119
+ spawn(config?): Promise<ExecutionHandle>
120
+ exec(handle, command, options?): Promise<ExecResult>
121
+ readFile(handle, path): Promise<string>
122
+ writeFile(handle, path, content): Promise<void>
123
+ listFiles(handle, path): Promise<string[]>
124
+ destroy(handle): Promise<void>
125
+ }
126
+ ```
127
+
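To make the interface concrete, here is a minimal in-memory context written against a locally declared copy of it. The README abbreviates the parameter and result types, so `ExecutionHandle`, `ExecResult`, and the boolean capability flags below are assumptions, and `createInMemoryContext` is a hypothetical name rather than a zidane export.

```typescript
// Assumed shapes: the README does not spell out these types.
interface ExecutionHandle { id: string }
interface ExecResult { stdout: string; stderr: string; exitCode: number }
interface Capabilities { shell: boolean; filesystem: boolean; network: boolean; gpu: boolean }

interface ExecutionContext {
  type: 'process' | 'docker' | 'sandbox'
  capabilities: Capabilities
  spawn(config?: Record<string, unknown>): Promise<ExecutionHandle>
  exec(handle: ExecutionHandle, command: string): Promise<ExecResult>
  readFile(handle: ExecutionHandle, path: string): Promise<string>
  writeFile(handle: ExecutionHandle, path: string, content: string): Promise<void>
  listFiles(handle: ExecutionHandle, path: string): Promise<string[]>
  destroy(handle: ExecutionHandle): Promise<void>
}

// Hypothetical context that keeps files in memory and refuses shell commands.
function createInMemoryContext(): ExecutionContext {
  const stores = new Map<string, Map<string, string>>() // handle id -> path -> content
  let nextId = 0
  const storeFor = (handle: ExecutionHandle): Map<string, string> => {
    const store = stores.get(handle.id)
    if (!store) throw new Error(`unknown handle: ${handle.id}`)
    return store
  }
  return {
    type: 'sandbox',
    capabilities: { shell: false, filesystem: true, network: false, gpu: false },
    async spawn() {
      const handle = { id: `mem-${++nextId}` }
      stores.set(handle.id, new Map())
      return handle
    },
    async exec(handle, command) {
      storeFor(handle) // reject handles that were never spawned or already destroyed
      return { stdout: '', stderr: `shell not supported: ${command}`, exitCode: 127 }
    },
    async readFile(handle, path) {
      const content = storeFor(handle).get(path)
      if (content === undefined) throw new Error(`no such file: ${path}`)
      return content
    },
    async writeFile(handle, path, content) {
      storeFor(handle).set(path, content)
    },
    async listFiles(handle, path) {
      const prefix = path.endsWith('/') ? path : `${path}/`
      return Array.from(storeFor(handle).keys()).filter(p => p.startsWith(prefix))
    },
    async destroy(handle) {
      stores.delete(handle.id)
    },
  }
}
```

Because every operation goes through a handle, the same calling code can be pointed at a process, a container, or a stub like this one.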
128
+ Access the context from a running agent:
129
+
130
+ ```ts
131
+ agent.execution // ExecutionContext
132
+ agent.execution.type // 'process' | 'docker' | 'sandbox'
133
+ agent.handle // ExecutionHandle (after first run)
134
+ await agent.destroy() // clean up context resources
34
135
  ```
35
136
 
36
137
  ## Providers
@@ -69,8 +170,6 @@ CEREBRAS_API_KEY=csk-... bun start \
69
170
  --prompt "hello"
70
171
  ```
71
172
 
72
- Available models: `zai-glm-4.7`, `gpt-oss-120b`
73
-
74
173
  ## Thinking
75
174
 
76
175
  Extended reasoning for complex tasks. Maps to Anthropic's thinking API or OpenRouter's `:thinking` variant.
@@ -97,9 +196,187 @@ Tools are grouped into **harnesses**. The `basic` harness includes:
97
196
  | `read_file` | Read file contents |
98
197
  | `write_file` | Write/create files |
99
198
  | `list_files` | List directory contents |
199
+ | `spawn` | Spawn a sub-agent for a task |
100
200
 
101
201
  All paths are sandboxed to the working directory.
102
202
 
203
+ Define a custom harness with `defineHarness`:
204
+
205
+ ```ts
206
+ import { defineHarness } from 'zidane'
207
+
208
+ const harness = defineHarness({
209
+ name: 'researcher',
210
+ system: 'You are a research assistant.',
211
+ tools: { ...basicTools },
212
+ mcpServers: [
213
+ { name: 'filesystem', transport: 'stdio', command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '.'] },
214
+ ],
215
+ })
216
+ ```
217
+
218
+ ## Sub-agent Spawning
219
+
220
+ The `spawn` tool lets the agent delegate tasks to child agents. Children run independently and return their result as a tool response.
221
+
222
+ ### Static spawn tool
223
+
224
+ ```ts
225
+ import { spawn, basicTools, defineHarness } from 'zidane'
226
+
227
+ const harness = defineHarness({
228
+ name: 'orchestrator',
229
+ tools: { ...basicTools, spawn },
230
+ })
231
+ ```
232
+
233
+ Children inherit the parent's harness (and can spawn their own children).
234
+
235
+ ### Configurable factory
236
+
237
+ Use `createSpawnTool` when you need custom concurrency limits, model overrides, or lifecycle callbacks.
238
+
239
+ ```ts
240
+ import { createSpawnTool } from 'zidane'
241
+
242
+ const spawnTool = createSpawnTool({
243
+ maxConcurrent: 5,
244
+ model: 'claude-haiku-4-5-20251001',
245
+ system: 'You are a focused sub-agent.',
246
+ thinking: 'low',
247
+ onSpawn: (child) => console.log(`started ${child.id}`),
248
+ onComplete: (child, stats) => console.log(`${child.id} done in ${stats.turns} turns`),
249
+ })
250
+
251
+ const harness = defineHarness({
252
+ name: 'orchestrator',
253
+ tools: { spawn: spawnTool },
254
+ })
255
+ ```
256
+
257
+ ## MCP Servers
258
+
259
+ Connect any MCP-compatible tool server. Tools are namespaced as `mcp_{serverName}_{toolName}`.
260
+
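The naming scheme can be sketched as a pair of helpers. Both functions are hypothetical and only illustrate the `mcp_{serverName}_{toolName}` pattern; how zidane resolves names internally is not shown in this README.

```typescript
// Build the namespaced name described above: mcp_{serverName}_{toolName}.
function namespaceTool(serverName: string, toolName: string): string {
  return `mcp_${serverName}_${toolName}`
}

// Hypothetical reverse lookup. Server and tool names may themselves contain
// underscores, so the prefix is matched against known server names
// (longest first) instead of splitting on "_".
function resolveTool(
  namespaced: string,
  serverNames: string[],
): { server: string; tool: string } | undefined {
  const byLength = [...serverNames].sort((a, b) => b.length - a.length)
  for (const server of byLength) {
    const prefix = `mcp_${server}_`
    if (namespaced.startsWith(prefix) && namespaced.length > prefix.length)
      return { server, tool: namespaced.slice(prefix.length) }
  }
  return undefined
}
```

For example, `namespaceTool('filesystem', 'read_file')` yields `mcp_filesystem_read_file`.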
261
+ ### Agent-level
262
+
263
+ ```ts
264
+ const agent = createAgent({
265
+ harness,
266
+ provider,
267
+ mcpServers: [
268
+ { name: 'filesystem', transport: 'stdio', command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '.'] },
269
+ { name: 'search', transport: 'sse', url: 'http://localhost:3001/sse' },
270
+ { name: 'api', transport: 'streamable-http', url: 'http://localhost:3002/mcp' },
271
+ ],
272
+ })
273
+ ```
274
+
275
+ ### Harness-level
276
+
277
+ MCP servers can also be declared on the harness so they're shared across all agents using it.
278
+
279
+ ```ts
280
+ const harness = defineHarness({
281
+ name: 'with-mcp',
282
+ tools: { ...basicTools },
283
+ mcpServers: [
284
+ { name: 'db', transport: 'stdio', command: 'node', args: ['db-server.js'] },
285
+ ],
286
+ })
287
+ ```
288
+
289
+ MCP connections are made lazily on the first `run()` call and reused across subsequent runs. They are closed when `agent.destroy()` is called.
290
+
291
+ ## Sessions
292
+
293
+ Sessions give an agent persistent identity, message history, and run metadata across multiple calls or restarts.
294
+
295
+ ### Creating a session
296
+
297
+ ```ts
298
+ import { createSession, createMemoryStore } from 'zidane/session'
299
+
300
+ // In-memory (default, no persistence)
301
+ const session = createSession({ id: 'my-session', agentId: 'my-agent' })
302
+
303
+ // With a store for persistence
304
+ const store = createMemoryStore()
305
+ const session = createSession({ id: 'my-session', store })
306
+ ```
307
+
308
+ ### Storage backends
309
+
310
+ Three built-in stores are available:
311
+
312
+ ```ts
313
+ import { createMemoryStore, createSqliteStore, createRemoteStore } from 'zidane/session'
314
+
315
+ // In-memory, fast, no disk I/O, lost on process restart
316
+ const memStore = createMemoryStore()
317
+
318
+ // SQLite, persistent, zero-dependency (uses Bun's built-in SQLite)
319
+ const sqliteStore = createSqliteStore({ path: './sessions.db' })
320
+
321
+ // Remote HTTP, delegates to a custom REST API
322
+ const remoteStore = createRemoteStore({ url: 'https://api.example.com/sessions' })
323
+ ```
324
+
325
+ ### Agent integration
326
+
327
+ ```ts
328
+ const agent = createAgent({
329
+ harness,
330
+ provider,
331
+ session,
332
+ })
333
+
334
+ await agent.run({ prompt: 'hello' })
335
+ await session.save() // persist to store
336
+ ```
337
+
338
+ ### Session hooks
339
+
340
+ ```ts
341
+ agent.hooks.hook('session:start', (ctx) => {
342
+ // ctx.sessionId, ctx.runId, ctx.prompt
343
+ })
344
+
345
+ agent.hooks.hook('session:end', (ctx) => {
346
+ // ctx.sessionId, ctx.runId
347
+ // ctx.status: 'completed' | 'aborted' | 'error'
348
+ })
349
+
350
+ agent.hooks.hook('session:messages', (ctx) => {
351
+ // ctx.sessionId, ctx.count
352
+ // fired after each turn (live message sync)
353
+ })
354
+
355
+ agent.hooks.hook('session:save', (ctx) => {
356
+ // ctx.sessionId
357
+ // fired after session.save() completes
358
+ })
359
+
360
+ agent.hooks.hook('session:meta', (ctx) => {
361
+ // ctx.sessionId, ctx.key, ctx.value
362
+ // fired when session.setMeta() is called
363
+ })
364
+ ```
365
+
366
+ Messages are synced to the session after every turn, not just at run start/end. If the agent crashes mid-run, you still have messages up to the last completed turn.
367
+
368
+ ### Restoring a session
369
+
370
+ ```ts
371
+ import { loadSession } from 'zidane/session'
372
+
373
+ const session = await loadSession(store, 'my-session')
374
+ if (session) {
375
+ const agent = createAgent({ harness, provider, session })
376
+ await agent.run({ prompt: 'continue from before' })
377
+ }
378
+ ```
379
+
103
380
  ## Hooks
104
381
 
105
382
  The agent uses [hookable](https://github.com/unjs/hookable) for lifecycle events. Every hook receives a mutable context object.
@@ -108,12 +385,12 @@ The agent uses [hookable](https://github.com/unjs/hookable) for lifecycle events
108
385
 
109
386
  ```ts
110
387
  agent.hooks.hook('system:before', (ctx) => {
111
- // ctx.system system prompt text
388
+ // ctx.system: system prompt text
112
389
  })
113
390
 
114
391
  agent.hooks.hook('turn:before', (ctx) => {
115
- // ctx.turn turn number
116
- // ctx.options StreamOptions being sent to provider
392
+ // ctx.turn: turn number
393
+ // ctx.options: StreamOptions being sent to provider
117
394
  })
118
395
 
119
396
  agent.hooks.hook('turn:after', (ctx) => {
@@ -121,7 +398,7 @@ agent.hooks.hook('turn:after', (ctx) => {
121
398
  })
122
399
 
123
400
  agent.hooks.hook('agent:done', (ctx) => {
124
- // ctx.totalIn, ctx.totalOut, ctx.turns, ctx.elapsed
401
+ // ctx.totalIn, ctx.totalOut, ctx.turns, ctx.elapsed, ctx.children?
125
402
  })
126
403
 
127
404
  agent.hooks.hook('agent:abort', () => {
@@ -133,12 +410,12 @@ agent.hooks.hook('agent:abort', () => {
133
410
 
134
411
  ```ts
135
412
  agent.hooks.hook('stream:text', (ctx) => {
136
- // ctx.delta new text chunk
137
- // ctx.text accumulated text so far
413
+ // ctx.delta: new text chunk
414
+ // ctx.text: accumulated text so far
138
415
  })
139
416
 
140
417
  agent.hooks.hook('stream:end', (ctx) => {
141
- // ctx.text final complete text
418
+ // ctx.text: final complete text
142
419
  })
143
420
  ```
144
421
 
@@ -158,7 +435,7 @@ agent.hooks.hook('tool:error', (ctx) => {
158
435
  })
159
436
  ```
160
437
 
161
- ### Tool Gate block execution
438
+ ### Tool Gate: block execution
162
439
 
163
440
  Mutate `ctx.block = true` to prevent a tool from running.
164
441
 
@@ -171,7 +448,7 @@ agent.hooks.hook('tool:gate', (ctx) => {
171
448
  })
172
449
  ```
173
450
 
174
- ### Tool Transform modify output
451
+ ### Tool Transform: modify output
175
452
 
176
453
  Mutate `ctx.result` or `ctx.isError` to transform tool results before they're sent back to the model.
177
454
 
@@ -182,7 +459,7 @@ agent.hooks.hook('tool:transform', (ctx) => {
182
459
  })
183
460
  ```
184
461
 
185
- ### Context Transform prune messages
462
+ ### Context Transform: prune messages
186
463
 
187
464
  Mutate `ctx.messages` before each LLM call for context window management.
188
465
 
@@ -193,9 +470,73 @@ agent.hooks.hook('context:transform', (ctx) => {
193
470
  })
194
471
  ```
195
472
 
196
- ## Steering & Follow-up
473
+ ### Spawn hooks
474
+
475
+ Fired by the `spawn` tool when child agents are created.
476
+
477
+ ```ts
478
+ agent.hooks.hook('spawn:before', (ctx) => {
479
+ // ctx.id: child agent id (e.g. 'child-1')
480
+ // ctx.task: the task prompt given to the child
481
+ })
482
+
483
+ agent.hooks.hook('spawn:complete', (ctx) => {
484
+ // ctx.id, ctx.task
485
+ // ctx.stats: AgentStats from the child run
486
+ })
487
+
488
+ agent.hooks.hook('spawn:error', (ctx) => {
489
+ // ctx.id, ctx.task, ctx.error
490
+ })
491
+ ```
492
+
493
+ ### MCP hooks
494
+
495
+ Fired during MCP server lifecycle.
496
+
497
+ ```ts
498
+ agent.hooks.hook('mcp:connect', (ctx) => {
499
+ // ctx.name: server name
500
+ // ctx.transport: 'stdio' | 'sse' | 'streamable-http'
501
+ // ctx.tools: namespaced tool names discovered on this server
502
+ })
503
+
504
+ agent.hooks.hook('mcp:error', (ctx) => {
505
+ // ctx.name: server name
506
+ // ctx.error: connection error
507
+ })
508
+
509
+ agent.hooks.hook('mcp:close', (ctx) => {
510
+ // ctx.name: server name being closed
511
+ })
512
+
513
+ agent.hooks.hook('mcp:tool:before', (ctx) => {
514
+ // ctx.server: MCP server name
515
+ // ctx.tool: original tool name (not namespaced)
516
+ // ctx.input: tool arguments
517
+ })
518
+
519
+ agent.hooks.hook('mcp:tool:after', (ctx) => {
520
+ // ctx.server, ctx.tool, ctx.input
521
+ // ctx.result: tool output string
522
+ })
523
+
524
+ agent.hooks.hook('mcp:tool:error', (ctx) => {
525
+ // ctx.server, ctx.tool, ctx.input, ctx.error
526
+ })
527
+ ```
528
+
529
+ ### Steering inject
530
+
531
+ ```ts
532
+ agent.hooks.hook('steer:inject', (ctx) => {
533
+ // ctx.message: the steering message being injected
534
+ })
535
+ ```
536
+
537
+ ## Steering and Follow-up
197
538
 
198
- ### Steering interrupt mid-run
539
+ ### Steering: interrupt mid-run
199
540
 
200
541
  Inject a message while the agent is working. Delivered between tool calls, skipping remaining tools in the current turn.
201
542
 
@@ -205,7 +546,7 @@ agent.hooks.hook('tool:after', () => {
205
546
  })
206
547
  ```
207
548
 
208
- ### Follow-up continue after done
549
+ ### Follow-up, continue after done
209
550
 
210
551
  Queue messages that extend the conversation after the agent finishes.
211
552
 
@@ -220,7 +561,7 @@ Execute multiple tool calls from a single turn concurrently.
220
561
 
221
562
  ```ts
222
563
  const agent = createAgent({
223
- harness: 'basic',
564
+ harness,
224
565
  provider,
225
566
  toolExecution: 'parallel', // default: 'sequential'
226
567
  })
@@ -246,13 +587,68 @@ await agent.run({
246
587
  })
247
588
  ```
248
589
 
590
+ ## Message Format
591
+
592
+ All messages in zidane use the canonical `SessionMessage` format, with or without sessions:
593
+
594
+ ```ts
595
+ type SessionContentBlock =
596
+ | { type: 'text', text: string }
597
+ | { type: 'image', mediaType: string, data: string }
598
+ | { type: 'tool_call', id: string, name: string, input: Record<string, unknown> }
599
+ | { type: 'tool_result', callId: string, output: string, isError?: boolean }
600
+ | { type: 'thinking', text: string }
601
+
602
+ interface SessionMessage {
603
+ role: 'user' | 'assistant'
604
+ content: SessionContentBlock[]
605
+ }
606
+ ```
607
+
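As a sketch of working with this format, the helpers below walk `SessionMessage` content blocks. The types are copied from the definitions above; `textOf` and `unansweredToolCalls` are hypothetical helpers, not zidane exports.

```typescript
type SessionContentBlock =
  | { type: 'text', text: string }
  | { type: 'image', mediaType: string, data: string }
  | { type: 'tool_call', id: string, name: string, input: Record<string, unknown> }
  | { type: 'tool_result', callId: string, output: string, isError?: boolean }
  | { type: 'thinking', text: string }

interface SessionMessage {
  role: 'user' | 'assistant'
  content: SessionContentBlock[]
}

// Concatenate the visible text of a message, skipping thinking and tool blocks.
function textOf(message: SessionMessage): string {
  let text = ''
  for (const block of message.content) {
    if (block.type === 'text') text += block.text
  }
  return text
}

// Ids of tool calls that have no matching tool_result anywhere in the history.
function unansweredToolCalls(messages: SessionMessage[]): string[] {
  const answered = new Set<string>()
  const calls: string[] = []
  for (const message of messages) {
    for (const block of message.content) {
      if (block.type === 'tool_call') calls.push(block.id)
      else if (block.type === 'tool_result') answered.add(block.callId)
    }
  }
  return calls.filter(id => !answered.has(id))
}
```

Switching on the `type` discriminant lets TypeScript narrow each block, so `block.text` and `block.callId` are only reachable where they exist.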
608
+ Providers convert to and from native wire formats internally. Converters are available for external interop:
609
+
610
+ ```ts
611
+ import { fromAnthropic, toAnthropic, fromOpenAI, toOpenAI, autoDetectAndConvert } from 'zidane'
612
+ ```
613
+
614
+ ## Usage Tracking
615
+
616
+ Every turn reports token usage. Provider-specific fields are optional:
617
+
618
+ ```ts
619
+ interface TurnUsage {
620
+ input: number
621
+ output: number
622
+ cacheCreation?: number // Anthropic: tokens written to cache
623
+ cacheRead?: number // Anthropic: tokens read from cache
624
+ thinking?: number // thinking tokens used
625
+ cost?: number // USD cost reported by provider (e.g. OpenRouter)
626
+ }
627
+ ```
628
+
629
+ Per-turn data is available on `AgentStats` and `SessionRun`:
630
+
631
+ ```ts
632
+ const stats = await agent.run({ prompt: 'hello' })
633
+ stats.turnUsage // TurnUsage[] per turn
634
+ stats.cost // total cost (sum of per-turn costs, if reported)
635
+
636
+ // In session runs
637
+ session.runs[0].turnUsage // per-turn breakdown
638
+ session.runs[0].totalUsage // aggregated TurnUsage
639
+ session.runs[0].cost // total cost for this run
640
+ ```
641
+
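One way to derive an aggregated `TurnUsage` from the per-turn entries is sketched below. `sumUsage` is a hypothetical helper, and leaving optional fields unset when no turn reports them is an assumption about how totals should behave, not a description of zidane's own aggregation.

```typescript
interface TurnUsage {
  input: number
  output: number
  cacheCreation?: number
  cacheRead?: number
  thinking?: number
  cost?: number
}

// Hypothetical helper: fold per-turn usage into a single total.
function sumUsage(turns: TurnUsage[]): TurnUsage {
  const total: TurnUsage = { input: 0, output: 0 }
  const optionalKeys = ['cacheCreation', 'cacheRead', 'thinking', 'cost'] as const
  for (const turn of turns) {
    total.input += turn.input
    total.output += turn.output
    for (const key of optionalKeys) {
      const value = turn[key]
      // Optional fields stay undefined unless at least one turn reported them.
      if (value !== undefined) total[key] = (total[key] ?? 0) + value
    }
  }
  return total
}
```

Keeping unreported fields undefined distinguishes "the provider did not report cost" from "the run cost nothing".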
249
642
  ## State Management
250
643
 
251
644
  ```ts
252
- agent.isRunning // boolean is a run in progress?
253
- agent.messages // Message[] conversation history
254
- agent.abort() // cancel the current run
255
- agent.reset() // clear messages and queues
645
+ agent.isRunning // boolean: is a run in progress?
646
+ agent.messages // SessionMessage[]: conversation history
647
+ agent.execution // ExecutionContext: where tools run
648
+ agent.handle // ExecutionHandle: spawned context handle
649
+ agent.abort() // cancel the current run
650
+ agent.reset() // clear messages and queues
651
+ await agent.destroy() // clean up execution context and MCP connections
256
652
  await agent.waitForIdle() // wait for current run to complete
257
653
  ```
258
654
 
@@ -261,12 +657,25 @@ await agent.waitForIdle() // wait for current run to complete
261
657
  ```
262
658
  src/
263
659
  types.ts shared types
264
- agent.ts createAgent, state management
660
+ agent.ts createAgent, AgentHooks, state management
265
661
  loop.ts turn execution loop
266
662
  start.ts CLI entrypoint
267
663
  auth.ts Anthropic OAuth flow
664
+ index.ts package exports
665
+ contexts/
666
+ types.ts ExecutionContext interface, capabilities
667
+ process.ts in-process context (default)
668
+ docker.ts Docker container context
669
+ sandbox.ts remote sandbox context
670
+ index.ts barrel exports
268
671
  tools/
672
+ index.ts tool exports
269
673
  validation.ts tool argument validation
674
+ shell.ts shell tool
675
+ read-file.ts read_file tool
676
+ write-file.ts write_file tool
677
+ list-files.ts list_files tool
678
+ spawn.ts spawn tool and createSpawnTool factory
270
679
  providers/
271
680
  index.ts Provider interface
272
681
  openai-compat.ts shared OpenAI-compatible utilities
@@ -274,14 +683,31 @@ src/
274
683
  openrouter.ts OpenRouter provider
275
684
  cerebras.ts Cerebras provider
276
685
  harnesses/
277
- index.ts harness registry
278
- basic.ts shell, read, write, list tools
686
+ index.ts HarnessConfig, defineHarness, ToolContext
687
+ basic.ts basic harness (shell, read, write, list, spawn)
688
+ mcp/
689
+ index.ts MCP server connection and tool discovery
690
+ session/
691
+ index.ts Session interface, createSession, loadSession
692
+ messages.ts SessionMessage converters (Anthropic/OpenAI)
693
+ memory.ts in-memory session store
694
+ sqlite.ts SQLite-backed session store
695
+ remote.ts HTTP remote session store
279
696
  output/
280
697
  terminal.ts terminal rendering (md4x)
281
698
  test/
282
699
  mock-provider.ts mock provider for testing
283
- agent.test.ts agent test suite (30 tests)
700
+ mock-context.ts mock execution context for testing
701
+ agent.test.ts agent loop tests
702
+ contexts.test.ts execution context tests
703
+ harness.test.ts harness tests
704
+ mcp.test.ts MCP connection and hook tests
705
+ spawn.test.ts spawn tool and hook tests
284
706
  validation.test.ts validation tests
707
+ providers.test.ts provider tests
708
+ openai-compat.test.ts OpenAI-compat utility tests
709
+ session.test.ts session store and agent integration tests
710
+ session-messages.test.ts SessionMessage converter tests
285
711
  ```
286
712
 
287
713
  ## Testing
@@ -290,8 +716,9 @@ test/
290
716
  bun test
291
717
  ```
292
718
 
293
- 30 tests with a mock provider no LLM calls needed.
719
+ 300 tests with mock provider and mock execution context, no LLM calls or Docker needed.
294
720
 
295
721
  ## License
296
722
 
297
723
  ISC
724
+