npm - zidane - Versions diffs - 1.1.5 → 1.3.1 - Mend

zidane 1.1.5 → 1.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (25)

package/README.md +516 -27
package/dist/agent-DxIUxou4.d.ts +319 -0
package/dist/chunk-26LIQARN.js +109 -0
package/dist/chunk-IC2WAUBZ.js +416 -0
package/dist/chunk-LS57GDAV.js +365 -0
package/dist/chunk-N523NBO2.js +45 -0
package/dist/chunk-PNKVD2UK.js +26 -0
package/dist/chunk-SWS5624X.js +1107 -0
package/dist/harnesses.d.ts +7 -24
package/dist/harnesses.js +6 -1
package/dist/index.d.ts +49 -82
package/dist/index.js +51 -255
package/dist/mcp.d.ts +7 -0
package/dist/mcp.js +11 -0
package/dist/providers.d.ts +65 -1
package/dist/providers.js +37 -157
package/dist/session.d.ts +193 -0
package/dist/session.js +27 -0
package/dist/spawn-bEqlGUVT.d.ts +63 -0
package/dist/tools.d.ts +28 -0
package/dist/tools.js +20 -0
package/dist/types-CLRMCak3.d.ts +104 -0
package/package.json +15 -1
package/dist/chunk-ECE5USCO.js +0 -125
package/dist/index-ByJfS-kX.d.ts +0 -101

package/README.md CHANGED

@@ -8,6 +8,8 @@ Minimal TypeScript agent loop built with [Bun](https://bun.sh).
 Hook into every step of the agent's execution using [hookable](https://github.com/unjs/hookable).
+Built to be embedded in other projects easily, extended through [providers](#providers), [harnesses](#harnesses), and [execution contexts](#execution-contexts).
 ## Quickstart
 ```bash
@@ -30,7 +32,106 @@ bun start \
   --provider anthropic    \   # anthropic | openrouter | cerebras
   --harness basic         \   # tool set to use
   --system "be concise"   \   # system prompt
-  --thinking off              # off | minimal | low | medium | high
+  --thinking off          \   # off | minimal | low | medium | high
+  --context process       \   # process | docker
+  --mcp '{"name":"fs","transport":"stdio","command":"npx","args":["-y","@modelcontextprotocol/server-filesystem","."]}'
+```
+The `--mcp` flag accepts a JSON object matching `McpServerConfig`. It can be passed multiple times.
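Note on the repeated `--mcp` flag: the sketch below shows one way a CLI could collect several `--mcp` JSON values into a list. It is not taken from the package. `McpServerConfigSketch` and `parseMcpFlags` are hypothetical names, and the field list is inferred from the README examples, so the real `McpServerConfig` may differ.

```typescript
// Hypothetical stand-in for McpServerConfig; fields inferred from the README examples.
interface McpServerConfigSketch {
  name: string
  transport: 'stdio' | 'sse' | 'streamable-http'
  command?: string
  args?: string[]
  url?: string
}

// Collect every `--mcp <json>` pair from an argv-style array.
function parseMcpFlags(argv: string[]): McpServerConfigSketch[] {
  const servers: McpServerConfigSketch[] = []
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] !== '--mcp') continue
    const raw = argv[i + 1]
    if (raw === undefined) throw new Error('--mcp requires a JSON argument')
    const parsed = JSON.parse(raw) as Partial<McpServerConfigSketch>
    if (typeof parsed.name !== 'string' || typeof parsed.transport !== 'string') {
      throw new Error('--mcp JSON must include "name" and "transport"')
    }
    servers.push(parsed as McpServerConfigSketch)
    i++ // skip the JSON value that was just consumed
  }
  return servers
}

const parsedMcpServers = parseMcpFlags([
  '--prompt', 'hello',
  '--mcp', '{"name":"fs","transport":"stdio","command":"npx","args":["-y","@modelcontextprotocol/server-filesystem","."]}',
  '--mcp', '{"name":"search","transport":"sse","url":"http://localhost:3001/sse"}',
])
console.log(parsedMcpServers.map(server => server.name))
```

Each occurrence of the flag contributes one entry, and malformed JSON surfaces as a thrown error rather than a silently skipped server.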
+## Execution Contexts
+An execution context defines **where** the agent's tools run. All tool operations (shell, filesystem) go through it.
+### In-process (default)
+Runs in the same Node/Bun process. No isolation, fastest.
+```ts
+import { createAgent, createProcessContext } from 'zidane'
+const agent = createAgent({
+  harness,
+  provider,
+  // execution defaults to createProcessContext()
+})
+```
+### Docker
+Full container isolation via [dockerode](https://github.com/apocas/dockerode). Configurable resource limits.
+```bash
+# CLI
+bun start --prompt "run uname -a" --context docker
+bun start --prompt "build the app" --context docker --image node:22 --cwd /workspace
+```
+```ts
+import { createAgent, createDockerContext } from 'zidane'
+const agent = createAgent({
+  harness,
+  provider,
+  execution: createDockerContext({
+    image: 'node:22',
+    cwd: '/workspace',
+    limits: { memory: 512, cpu: '1.0' },
+  }),
+})
+```
+Requires `dockerode` as a peer dependency: `bun add dockerode`
+### Sandbox (remote)
+Offloads execution to a remote sandbox API. Implement the `SandboxProvider` interface for your provider (Rivet, E2B, etc.).
+```ts
+import { createAgent, createSandboxContext } from 'zidane'
+import type { SandboxProvider } from 'zidane'
+const myProvider: SandboxProvider = {
+  name: 'my-sandbox',
+  spawn: async (config) => { /* ... */ },
+  exec: async (id, command) => { /* ... */ },
+  readFile: async (id, path) => { /* ... */ },
+  writeFile: async (id, path, content) => { /* ... */ },
+  listFiles: async (id, path) => { /* ... */ },
+  destroy: async (id) => { /* ... */ },
+}
+const agent = createAgent({
+  harness,
+  provider,
+  execution: createSandboxContext(myProvider),
+})
+```
+### Execution Context Interface
+All contexts implement the same interface:
+```ts
+interface ExecutionContext {
+  type: 'process' | 'docker' | 'sandbox'
+  capabilities: { shell, filesystem, network, gpu }
+  spawn(config?): Promise<ExecutionHandle>
+  exec(handle, command, options?): Promise<ExecResult>
+  readFile(handle, path): Promise<string>
+  writeFile(handle, path, content): Promise<void>
+  listFiles(handle, path): Promise<string[]>
+  destroy(handle): Promise<void>
+}
+```
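Note on the interface above: a minimal in-memory implementation can help when reading the contract, for example as a test double. Everything in this sketch is an assumption layered on the README: the `ExecutionHandle` and `ExecResult` field shapes, the boolean `capabilities`, and the `createMemoryContext` name are illustrative and not the package's actual types.

```typescript
// Local stand-ins for zidane's types; the field shapes are assumptions.
interface ExecutionHandle { id: string }
interface ExecResult { stdout: string; stderr: string; exitCode: number }

interface ExecutionContextSketch {
  type: 'process' | 'docker' | 'sandbox'
  capabilities: { shell: boolean; filesystem: boolean; network: boolean; gpu: boolean }
  spawn(): Promise<ExecutionHandle>
  exec(handle: ExecutionHandle, command: string): Promise<ExecResult>
  readFile(handle: ExecutionHandle, path: string): Promise<string>
  writeFile(handle: ExecutionHandle, path: string, content: string): Promise<void>
  listFiles(handle: ExecutionHandle, path: string): Promise<string[]>
  destroy(handle: ExecutionHandle): Promise<void>
}

// In-memory context: files live in a Map, and exec only records the command.
function createMemoryContext(): ExecutionContextSketch & { commands: string[] } {
  const filesByHandle = new Map<string, Map<string, string>>() // handle id -> path -> content
  const commands: string[] = []
  let counter = 0

  const filesFor = (handle: ExecutionHandle): Map<string, string> => {
    const files = filesByHandle.get(handle.id)
    if (!files) throw new Error(`unknown or destroyed handle: ${handle.id}`)
    return files
  }

  return {
    type: 'sandbox',
    capabilities: { shell: true, filesystem: true, network: false, gpu: false },
    commands,
    async spawn(): Promise<ExecutionHandle> {
      const handle = { id: `mem-${++counter}` }
      filesByHandle.set(handle.id, new Map<string, string>())
      return handle
    },
    async exec(handle: ExecutionHandle, command: string): Promise<ExecResult> {
      filesFor(handle) // rejects if the handle was destroyed
      commands.push(command)
      return { stdout: '', stderr: '', exitCode: 0 }
    },
    async readFile(handle: ExecutionHandle, path: string): Promise<string> {
      const content = filesFor(handle).get(path)
      if (content === undefined) throw new Error(`no such file: ${path}`)
      return content
    },
    async writeFile(handle: ExecutionHandle, path: string, content: string): Promise<void> {
      filesFor(handle).set(path, content)
    },
    async listFiles(handle: ExecutionHandle, path: string): Promise<string[]> {
      const prefix = path.endsWith('/') ? path : `${path}/`
      return Array.from(filesFor(handle).keys()).filter(p => p.startsWith(prefix))
    },
    async destroy(handle: ExecutionHandle): Promise<void> {
      filesByHandle.delete(handle.id)
    },
  }
}
```

Because every operation takes the handle explicitly, one context object can serve several spawned environments, and `destroy` invalidates only the handle it is given.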
+Access the context from a running agent:
+```ts
+agent.execution          // ExecutionContext
+agent.execution.type     // 'process' | 'docker' | 'sandbox'
+agent.handle             // ExecutionHandle (after first run)
+await agent.destroy()    // clean up context resources
 ```
 ## Providers
@@ -69,8 +170,6 @@ CEREBRAS_API_KEY=csk-... bun start \
   --prompt "hello"
 ```
-Available models: `zai-glm-4.7`, `gpt-oss-120b`
 ## Thinking
 Extended reasoning for complex tasks. Maps to Anthropic's thinking API or OpenRouter's `:thinking` variant.
@@ -97,9 +196,245 @@ Tools are grouped into **harnesses**. The `basic` harness includes:
 | `read_file` | Read file contents |
 | `write_file` | Write/create files |
 | `list_files` | List directory contents |
+| `spawn` | Spawn a sub-agent for a task |
 All paths are sandboxed to the working directory.
+Define a custom harness with `defineHarness`:
+```ts
+import { defineHarness } from 'zidane'
+const harness = defineHarness({
+  name: 'researcher',
+  system: 'You are a research assistant.',
+  tools: { ...basicTools },
+  mcpServers: [
+    { name: 'filesystem', transport: 'stdio', command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '.'] },
+  ],
+})
+```
+## Sub-agent Spawning
+The `spawn` tool lets the agent delegate tasks to child agents. Children run independently and return their result as a tool response.
+### Static spawn tool
+```ts
+import { spawn, basicTools, defineHarness } from 'zidane'
+const harness = defineHarness({
+  name: 'orchestrator',
+  tools: { ...basicTools, spawn },
+})
+```
+Children inherit the parent's harness (and can spawn their own children).
+### Configurable factory
+Use `createSpawnTool` when you need custom concurrency limits, model overrides, or lifecycle callbacks.
+```ts
+import { createSpawnTool } from 'zidane'
+const spawnTool = createSpawnTool({
+  maxConcurrent: 5,
+  model: 'claude-haiku-4-5-20251001',
+  system: 'You are a focused sub-agent.',
+  thinking: 'low',
+  onSpawn: (child) => console.log(`started ${child.id}`),
+  onComplete: (child, stats) => console.log(`${child.id} done in ${stats.turns} turns`),
+})
+const harness = defineHarness({
+  name: 'orchestrator',
+  tools: { spawn: spawnTool },
+})
+```
+## MCP Servers
+Connect any MCP-compatible tool server. Tools are namespaced as `mcp_{serverName}_{toolName}`.
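Note on the naming scheme: only the `mcp_{serverName}_{toolName}` pattern comes from the README. The two helpers below are hypothetical and sketch how a namespaced name could be built and mapped back to its server and original tool name. The reverse lookup is a heuristic that needs the list of known server names, because both parts may contain underscores.

```typescript
// Build the namespaced name described in the README.
function namespaceMcpTool(serverName: string, toolName: string): string {
  return `mcp_${serverName}_${toolName}`
}

// Map a namespaced name back to { server, tool } using the known server names.
function resolveMcpTool(
  namespaced: string,
  serverNames: string[],
): { server: string; tool: string } | null {
  // Try longer server names first so 'fs_local' is preferred over 'fs'.
  const byLength = serverNames.slice().sort((a, b) => b.length - a.length)
  for (const server of byLength) {
    const prefix = `mcp_${server}_`
    if (namespaced.startsWith(prefix) && namespaced.length > prefix.length) {
      return { server, tool: namespaced.slice(prefix.length) }
    }
  }
  return null
}

const namespacedTool = namespaceMcpTool('filesystem', 'read_file')
console.log(namespacedTool, resolveMcpTool(namespacedTool, ['search', 'filesystem']))
```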
+### Agent-level
+```ts
+const agent = createAgent({
+  harness,
+  provider,
+  mcpServers: [
+    { name: 'filesystem', transport: 'stdio', command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '.'] },
+    { name: 'search', transport: 'sse', url: 'http://localhost:3001/sse' },
+    { name: 'api', transport: 'streamable-http', url: 'http://localhost:3002/mcp' },
+  ],
+})
+```
+### Harness-level
+MCP servers can also be declared on the harness so they're shared across all agents using it.
+```ts
+const harness = defineHarness({
+  name: 'with-mcp',
+  tools: { ...basicTools },
+  mcpServers: [
+    { name: 'db', transport: 'stdio', command: 'node', args: ['db-server.js'] },
+  ],
+})
+```
+MCP connections are made lazily on the first `run()` call and reused across subsequent runs. They are closed when `agent.destroy()` is called.
+## Sessions
+Sessions give an agent persistent identity, turn history, and run metadata across multiple calls or restarts. Each message exchange is a `SessionTurn` with its own UUID, enabling real-time multiplayer streaming.
+### SessionTurn
+Every message in a session is a turn:
+```ts
+interface SessionTurn {
+  id: string                      // UUID — generated by store or crypto.randomUUID()
+  role: 'user' | 'assistant' | 'system'
+  content: SessionContentBlock[]  // same format used by providers
+  usage?: TurnUsage               // token usage (assistant turns only)
+  createdAt: number               // timestamp
+}
+```
+### Creating a session
+`createSession` is async — stores can generate IDs server-side (e.g. Supabase).
+```ts
+import { createSession, createMemoryStore } from 'zidane/session'
+// In-memory (default, no persistence)
+const session = await createSession({ id: 'my-session', agentId: 'my-agent' })
+// With a store for persistence
+const store = createMemoryStore()
+const session = await createSession({ id: 'my-session', store })
+```
+### Storage backends
+Three built-in stores are available. All implement the full `SessionStore` interface including incremental operations.
+```ts
+import { createMemoryStore, createSqliteStore, createRemoteStore } from 'zidane/session'
+// In-memory, fast, no disk I/O, lost on process restart
+const memStore = createMemoryStore()
+// SQLite, persistent, zero-dependency (uses Bun's built-in SQLite)
+const sqliteStore = createSqliteStore({ path: './sessions.db' })
+// Remote HTTP, delegates to a custom REST API
+const remoteStore = createRemoteStore({ url: 'https://api.example.com/sessions' })
+```
+### SessionStore interface
+```ts
+interface SessionStore {
+  // Optional: server-side ID generation
+  generateSessionId?: () => string | Promise<string>
+  generateTurnId?: () => string | Promise<string>
+  // Core CRUD
+  load: (sessionId: string) => Promise<SessionData | null>
+  save: (session: SessionData) => Promise<void>
+  delete: (sessionId: string) => Promise<void>
+  list: (filter?) => Promise<string[]>
+  // Incremental operations (avoids full re-save)
+  appendTurns: (sessionId: string, turns: SessionTurn[]) => Promise<void>
+  getTurns: (sessionId: string, from?: number, limit?: number) => Promise<SessionTurn[]>
+  updateRun: (sessionId: string, run: SessionRun) => Promise<void>
+  updateStatus: (sessionId: string, status: SessionStatus) => Promise<void>
+}
+```
+Custom ID generation lets external databases (e.g. Supabase) provide UUIDs server-side, keeping IDs in sync:
+```ts
+const store = createRemoteStore({ url: '...' })
+store.generateTurnId = async () => {
+  const { data } = await supabase.rpc('gen_random_uuid')
+  return data
+}
+```
+### Agent integration
+```ts
+const agent = createAgent({
+  harness,
+  provider,
+  session,
+})
+await agent.run({ prompt: 'hello' })
+await session.save() // persist to store
+```
+Turns are persisted incrementally after each agent turn via `appendTurns` — not as a full document save. If the agent crashes mid-run, you still have turns up to the last completed turn.
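Note on incremental persistence: the sketch below illustrates why appending per turn preserves history after a crash. It models only `appendTurns` and `getTurns`. `TurnSketch`, `AppendOnlyTurnStore`, and `runUntilCrash` are hypothetical names, and treating `from` as a zero-based offset is an assumption about the real store.

```typescript
// Trimmed stand-in for SessionTurn; the real type also carries content and usage.
interface TurnSketch {
  id: string
  role: 'user' | 'assistant' | 'system'
  createdAt: number
}

// Only the two incremental methods of SessionStore are modelled here.
class AppendOnlyTurnStore {
  private turnsBySession = new Map<string, TurnSketch[]>()

  async appendTurns(sessionId: string, turns: TurnSketch[]): Promise<void> {
    const existing = this.turnsBySession.get(sessionId) ?? []
    this.turnsBySession.set(sessionId, existing.concat(turns))
  }

  // `from` is treated as a zero-based offset (an assumption).
  async getTurns(sessionId: string, from = 0, limit?: number): Promise<TurnSketch[]> {
    const all = this.turnsBySession.get(sessionId) ?? []
    return limit === undefined ? all.slice(from) : all.slice(from, from + limit)
  }
}

// Simulate a run that fails during its third turn, before that turn is appended.
async function runUntilCrash(turnStore: AppendOnlyTurnStore, sessionId: string): Promise<void> {
  for (let turn = 1; turn <= 3; turn++) {
    if (turn === 3) throw new Error('simulated crash mid-turn')
    await turnStore.appendTurns(sessionId, [
      { id: `turn-${turn}`, role: 'assistant', createdAt: Date.now() },
    ])
  }
}
```

After `runUntilCrash` rejects, `getTurns` still returns the two completed turns, which is the property the paragraph above describes.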
+### Session status
+Sessions track their status: `'idle' | 'running' | 'completed' | 'error'`. The agent updates it automatically during runs.
+```ts
+session.status // 'idle'
+await agent.run({ prompt: 'go' })
+// idle → running → completed (or error)
+```
+### Session hooks
+```ts
+agent.hooks.hook('session:start', (ctx) => {
+  // ctx.sessionId, ctx.runId, ctx.prompt
+})
+agent.hooks.hook('session:end', (ctx) => {
+  // ctx.sessionId, ctx.runId
+  // ctx.status: 'completed' | 'aborted' | 'error'
+})
+agent.hooks.hook('session:turns', (ctx) => {
+  // ctx.sessionId, ctx.count
+  // fired after each turn (incremental sync)
+})
+agent.hooks.hook('session:save', (ctx) => {
+  // ctx.sessionId
+  // fired after session.save() completes
+})
+agent.hooks.hook('session:meta', (ctx) => {
+  // ctx.sessionId, ctx.key, ctx.value
+  // fired when session.setMeta() is called
+})
+```
+### Restoring a session
+```ts
+import { loadSession } from 'zidane/session'
+const session = await loadSession(store, 'my-session')
+if (session) {
+  const agent = createAgent({ harness, provider, session })
+  await agent.run({ prompt: 'continue from before' })
+}
+```
 ## Hooks
 The agent uses [hookable](https://github.com/unjs/hookable) for lifecycle events. Every hook receives a mutable context object.
@@ -108,20 +443,21 @@ The agent uses [hookable](https://github.com/unjs/hookable) for lifecycle events
 ```ts
 agent.hooks.hook('system:before', (ctx) => {
-  // ctx.system — system prompt text
+  // ctx.system: system prompt text
 })
 agent.hooks.hook('turn:before', (ctx) => {
-  // ctx.turn — turn number
-  // ctx.options — StreamOptions being sent to provider
+  // ctx.turn: turn number
+  // ctx.turnId: UUID for this turn (generated before LLM call)
+  // ctx.options: StreamOptions being sent to provider
 })
 agent.hooks.hook('turn:after', (ctx) => {
-  // ctx.turn, ctx.usage { input, output }
+  // ctx.turn, ctx.turnId, ctx.usage { input, output }
 })
 agent.hooks.hook('agent:done', (ctx) => {
-  // ctx.totalIn, ctx.totalOut, ctx.turns, ctx.elapsed
+  // ctx.totalIn, ctx.totalOut, ctx.turns, ctx.elapsed, ctx.children?
 })
 agent.hooks.hook('agent:abort', () => {
@@ -133,12 +469,15 @@ agent.hooks.hook('agent:abort', () => {
 ```ts
 agent.hooks.hook('stream:text', (ctx) => {
-  // ctx.delta — new text chunk
-  // ctx.text — accumulated text so far
+  // ctx.delta: new text chunk
+  // ctx.text: accumulated text so far
+  // ctx.turnId: UUID of the turn being streamed
+  // ctx.blockIndex: content block index within the turn
 })
 agent.hooks.hook('stream:end', (ctx) => {
-  // ctx.text — final complete text
+  // ctx.text: final complete text
+  // ctx.turnId, ctx.blockIndex
 })
 ```
@@ -158,7 +497,7 @@ agent.hooks.hook('tool:error', (ctx) => {
 })
 ```
-### Tool Gate — block execution
+### Tool Gate: block execution
 Mutate `ctx.block = true` to prevent a tool from running.
@@ -171,7 +510,7 @@ agent.hooks.hook('tool:gate', (ctx) => {
 })
 ```
-### Tool Transform — modify output
+### Tool Transform: modify output
 Mutate `ctx.result` or `ctx.isError` to transform tool results before they're sent back to the model.
@@ -182,7 +521,7 @@ agent.hooks.hook('tool:transform', (ctx) => {
 })
 ```
-### Context Transform — prune messages
+### Context Transform: prune messages
 Mutate `ctx.messages` before each LLM call for context window management.
@@ -193,9 +532,73 @@ agent.hooks.hook('context:transform', (ctx) => {
 })
 ```
-## Steering & Follow-up
+### Spawn hooks
-### Steering — interrupt mid-run
+Fired by the `spawn` tool when child agents are created.
+```ts
+agent.hooks.hook('spawn:before', (ctx) => {
+  // ctx.id: child agent id (e.g. 'child-1')
+  // ctx.task: the task prompt given to the child
+})
+agent.hooks.hook('spawn:complete', (ctx) => {
+  // ctx.id, ctx.task
+  // ctx.stats: AgentStats from the child run
+})
+agent.hooks.hook('spawn:error', (ctx) => {
+  // ctx.id, ctx.task, ctx.error
+})
+```
+### MCP hooks
+Fired during MCP server lifecycle.
+```ts
+agent.hooks.hook('mcp:connect', (ctx) => {
+  // ctx.name: server name
+  // ctx.transport: 'stdio' | 'sse' | 'streamable-http'
+  // ctx.tools: namespaced tool names discovered on this server
+})
+agent.hooks.hook('mcp:error', (ctx) => {
+  // ctx.name: server name
+  // ctx.error: connection error
+})
+agent.hooks.hook('mcp:close', (ctx) => {
+  // ctx.name: server name being closed
+})
+agent.hooks.hook('mcp:tool:before', (ctx) => {
+  // ctx.server: MCP server name
+  // ctx.tool: original tool name (not namespaced)
+  // ctx.input: tool arguments
+})
+agent.hooks.hook('mcp:tool:after', (ctx) => {
+  // ctx.server, ctx.tool, ctx.input
+  // ctx.result: tool output string
+})
+agent.hooks.hook('mcp:tool:error', (ctx) => {
+  // ctx.server, ctx.tool, ctx.input, ctx.error
+})
+```
+### Steering inject
+```ts
+agent.hooks.hook('steer:inject', (ctx) => {
+  // ctx.message: the steering message being injected
+})
+```
+## Steering and Follow-up
+### Steering: interrupt mid-run
 Inject a message while the agent is working. Delivered between tool calls, skipping remaining tools in the current turn.
@@ -205,7 +608,7 @@ agent.hooks.hook('tool:after', () => {
 })
 ```
-### Follow-up — continue after done
+### Follow-up, continue after done
 Queue messages that extend the conversation after the agent finishes.
@@ -220,7 +623,7 @@ Execute multiple tool calls from a single turn concurrently.
 ```ts
 const agent = createAgent({
-  harness: 'basic',
+  harness,
   provider,
   toolExecution: 'parallel', // default: 'sequential'
 })
@@ -246,13 +649,68 @@ await agent.run({
 })
 ```
+## Message Format
+All messages in zidane use the canonical `SessionMessage` format, with or without sessions:
+```ts
+type SessionContentBlock =
+  | { type: 'text', text: string }
+  | { type: 'image', mediaType: string, data: string }
+  | { type: 'tool_call', id: string, name: string, input: Record<string, unknown> }
+  | { type: 'tool_result', callId: string, output: string, isError?: boolean }
+  | { type: 'thinking', text: string }
+interface SessionMessage {
+  role: 'user' | 'assistant'
+  content: SessionContentBlock[]
+}
+```
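Note on working with this format: because `SessionContentBlock` is a discriminated union on `type`, consumers can narrow blocks with a plain `type` check. The sketch below repeats the README types and adds two hypothetical helpers, `messageText` and `unansweredToolCalls`. It assumes that `callId` on a `tool_result` refers to the `id` of the matching `tool_call`.

```typescript
type SessionContentBlock =
  | { type: 'text', text: string }
  | { type: 'image', mediaType: string, data: string }
  | { type: 'tool_call', id: string, name: string, input: Record<string, unknown> }
  | { type: 'tool_result', callId: string, output: string, isError?: boolean }
  | { type: 'thinking', text: string }

interface SessionMessage {
  role: 'user' | 'assistant'
  content: SessionContentBlock[]
}

// Concatenate only the visible text blocks, skipping thinking and tool blocks.
function messageText(message: SessionMessage): string {
  let text = ''
  for (const block of message.content) {
    if (block.type === 'text') text += block.text
  }
  return text
}

// Return ids of tool calls that have no tool_result yet (assumes callId === tool_call id).
function unansweredToolCalls(messages: SessionMessage[]): string[] {
  const answered = new Set<string>()
  const calls: string[] = []
  for (const message of messages) {
    for (const block of message.content) {
      if (block.type === 'tool_call') calls.push(block.id)
      else if (block.type === 'tool_result') answered.add(block.callId)
    }
  }
  return calls.filter(id => !answered.has(id))
}

const assistantMessage: SessionMessage = {
  role: 'assistant',
  content: [
    { type: 'thinking', text: 'I should look at the directory first.' },
    { type: 'text', text: 'Checking. ' },
    { type: 'tool_call', id: 'call-1', name: 'list_files', input: { path: '.' } },
    { type: 'tool_call', id: 'call-2', name: 'read_file', input: { path: 'README.md' } },
  ],
}

const sampleConversation: SessionMessage[] = [
  { role: 'user', content: [{ type: 'text', text: 'list files' }] },
  assistantMessage,
  { role: 'user', content: [{ type: 'tool_result', callId: 'call-1', output: 'README.md' }] },
]

console.log(messageText(assistantMessage), unansweredToolCalls(sampleConversation))
```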
+Providers convert to and from native wire formats internally. Converters are available for external interop:
+```ts
+import { fromAnthropic, toAnthropic, fromOpenAI, toOpenAI, autoDetectAndConvert } from 'zidane'
+```
+## Usage Tracking
+Every turn reports token usage. Provider-specific fields are optional:
+```ts
+interface TurnUsage {
+  input: number
+  output: number
+  cacheCreation?: number  // Anthropic: tokens written to cache
+  cacheRead?: number      // Anthropic: tokens read from cache
+  thinking?: number       // thinking tokens used
+  cost?: number           // USD cost reported by provider (e.g. OpenRouter)
+}
+```
+Per-turn data is available on `AgentStats` and `SessionRun`:
+```ts
+const stats = await agent.run({ prompt: 'hello' })
+stats.turnUsage   // TurnUsage[] per turn
+stats.cost        // total cost (sum of per-turn costs, if reported)
+// In session runs
+session.runs[0].turnUsage   // per-turn breakdown
+session.runs[0].totalUsage  // aggregated TurnUsage
+session.runs[0].cost        // total cost for this run
+```
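Note on aggregation: the README describes the total cost as the sum of per-turn costs when reported. A minimal sketch of such an aggregation follows. `sumUsage` is a hypothetical helper, not the package's implementation, and leaving an optional field undefined when no turn reports it is a design assumption.

```typescript
interface TurnUsage {
  input: number
  output: number
  cacheCreation?: number
  cacheRead?: number
  thinking?: number
  cost?: number
}

// Sum per-turn usage; optional fields stay undefined unless at least one turn reports them.
function sumUsage(turns: TurnUsage[]): TurnUsage {
  const total: TurnUsage = { input: 0, output: 0 }
  for (const turn of turns) {
    total.input += turn.input
    total.output += turn.output
    if (turn.cacheCreation !== undefined) total.cacheCreation = (total.cacheCreation ?? 0) + turn.cacheCreation
    if (turn.cacheRead !== undefined) total.cacheRead = (total.cacheRead ?? 0) + turn.cacheRead
    if (turn.thinking !== undefined) total.thinking = (total.thinking ?? 0) + turn.thinking
    if (turn.cost !== undefined) total.cost = (total.cost ?? 0) + turn.cost
  }
  return total
}

const usageTotal = sumUsage([
  { input: 100, output: 20, cost: 0.25 },
  { input: 150, output: 30, cacheRead: 80 },
  { input: 50, output: 10, cost: 0.5 },
])
console.log(usageTotal)
```

Keeping unreported fields undefined rather than zero lets a caller distinguish "the provider reported no cost" from "the cost was zero".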
 ## State Management
 ```ts
-agent.isRunning   // boolean — is a run in progress?
-agent.messages    // Message[] — conversation history
-agent.abort()     // cancel the current run
-agent.reset()     // clear messages and queues
+agent.isRunning        // boolean: is a run in progress?
+agent.messages         // SessionMessage[]: conversation history
+agent.execution        // ExecutionContext: where tools run
+agent.handle           // ExecutionHandle: spawned context handle
+agent.abort()          // cancel the current run
+agent.reset()          // clear messages and queues
+await agent.destroy()  // clean up execution context and MCP connections
 await agent.waitForIdle() // wait for current run to complete
 ```
@@ -261,12 +719,25 @@ await agent.waitForIdle() // wait for current run to complete
 ```
 src/
   types.ts              shared types
-  agent.ts              createAgent, state management
+  agent.ts              createAgent, AgentHooks, state management
   loop.ts               turn execution loop
   start.ts              CLI entrypoint
   auth.ts               Anthropic OAuth flow
+  index.ts              package exports
+  contexts/
+    types.ts            ExecutionContext interface, capabilities
+    process.ts          in-process context (default)
+    docker.ts           Docker container context
+    sandbox.ts          remote sandbox context
+    index.ts            barrel exports
   tools/
+    index.ts            tool exports
     validation.ts       tool argument validation
+    shell.ts            shell tool
+    read-file.ts        read_file tool
+    write-file.ts       write_file tool
+    list-files.ts       list_files tool
+    spawn.ts            spawn tool and createSpawnTool factory
   providers/
     index.ts            Provider interface
     openai-compat.ts    shared OpenAI-compatible utilities
@@ -274,14 +745,31 @@ src/
     openrouter.ts       OpenRouter provider
     cerebras.ts         Cerebras provider
   harnesses/
-    index.ts            harness registry
-    basic.ts            shell, read, write, list tools
+    index.ts            HarnessConfig, defineHarness, ToolContext
+    basic.ts            basic harness (shell, read, write, list, spawn)
+  mcp/
+    index.ts            MCP server connection and tool discovery
+  session/
+    index.ts            Session interface, createSession, loadSession
+    messages.ts         SessionMessage converters (Anthropic/OpenAI)
+    memory.ts           in-memory session store
+    sqlite.ts           SQLite-backed session store
+    remote.ts           HTTP remote session store
   output/
     terminal.ts         terminal rendering (md4x)
 test/
   mock-provider.ts      mock provider for testing
-  agent.test.ts         agent test suite (30 tests)
+  mock-context.ts       mock execution context for testing
+  agent.test.ts         agent loop tests
+  contexts.test.ts      execution context tests
+  harness.test.ts       harness tests
+  mcp.test.ts           MCP connection and hook tests
+  spawn.test.ts         spawn tool and hook tests
   validation.test.ts    validation tests
+  providers.test.ts     provider tests
+  openai-compat.test.ts OpenAI-compat utility tests
+  session.test.ts       session store and agent integration tests
+  session-messages.test.ts  SessionMessage converter tests
 ```
 ## Testing
@@ -290,8 +778,9 @@ test/
 bun test
 ```
-30 tests with a mock provider — no LLM calls needed.
+300 tests with mock provider and mock execution context, no LLM calls or Docker needed.
 ## License
 ISC