@robota-sdk/agent-sdk 3.0.0-beta.60 → 3.0.0-beta.61

package/README.md CHANGED
@@ -1,6 +1,6 @@
  # @robota-sdk/agent-sdk
 
- Programmatic SDK for building AI agents with Robota. Provides `InteractiveSession` as the central client-facing API, `query()` for one-shot use, session management, built-in tools, permissions, hooks, streaming, and context loading.
+ Programmatic SDK for building AI agents with Robota. Provides `InteractiveSession` as the central client-facing API, `createQuery()` for one-shot use, session management, SDK-owned command/common APIs, permissions, hooks, streaming, context loading, bounded prompt file references, and context reference inventory.
 
  This is the **assembly layer** of the Robota ecosystem — it composes lower-level packages (`agent-core`, `agent-tools`, `agent-sessions`, `agent-provider-anthropic`) into a cohesive SDK.
 
@@ -15,40 +15,52 @@ pnpm add @robota-sdk/agent-sdk
  ## Quick Start
 
  ```typescript
- import { query } from '@robota-sdk/agent-sdk';
+ import { createQuery } from '@robota-sdk/agent-sdk';
+ import { AnthropicProvider } from '@robota-sdk/agent-provider-anthropic';
+
+ const provider = new AnthropicProvider({ apiKey: process.env.ANTHROPIC_API_KEY });
+ const query = createQuery({ provider });
 
  // Simple one-shot query
  const response = await query('Show me the file list');
 
  // With options
- const response = await query('Analyze the code', {
+ const queryWithOptions = createQuery({
+ provider,
  cwd: '/path/to/project',
  permissionMode: 'acceptEdits',
  maxTurns: 10,
  onTextDelta: (delta) => process.stdout.write(delta),
  });
+
+ const detailedResponse = await queryWithOptions('Analyze the code');
  ```
 
  ## Features
 
  - **InteractiveSession** — Event-driven session wrapper (composition over Session). Central client-facing API for CLI, web, API server, or any other client
  - **SystemCommandExecutor + ISystemCommand** — SDK-level command execution infrastructure for product-composed command modules
- - **CommandRegistry, BuiltinCommandSource, SkillCommandSource** — Slash command registry and discovery (owned by SDK; agent-cli re-exports `CommandRegistry` from here)
- - **query()** — Single entry point for one-shot AI agent interactions with streaming support
- - **createSession()** — Assembly factory: wires tools, provider, config, and context into a Session
- - **Built-in Tools** — Bash, Read, Write, Edit, Glob, Grep, WebFetch, WebSearch are assembled for sessions; six local file/process/search exports are re-exported from `@robota-sdk/agent-tools`
+ - **CommandRegistry, BuiltinCommandSource, SkillCommandSource** — Command registry and SDK common discovery APIs. User-visible built-ins are composed through `agent-command-*` packages.
+ - **Model Command Common APIs** — Provider-neutral `/model` helpers that resolve active provider catalogs and optionally invoke provider-owned refresh hooks
+ - **createQuery()** — Provider-bound factory for one-shot AI agent interactions with streaming support
+ - **Session assembly** — Internal factory wires tools, provider, config, and context for `InteractiveSession`
+ - **Built-in Tools** — Bash, Read, Write, Edit, Glob, Grep, WebFetch, WebSearch are assembled for SDK sessions; direct tool usage imports from `@robota-sdk/agent-tools`
+ - **Sandbox Execution** — Optional `sandboxClient` injection routes Bash and core file tools through a provider-backed execution plane; `workspaceManifest` can prepare a fresh sandbox workspace before session creation
+ - **Sandbox Hydration** — Snapshot-capable sandbox clients persist `sandboxSnapshotId` on shutdown and restore it before saved message replay on non-fork resume
  - **Agent Tool** — Sub-agent session creation for multi-agent workflows
  - **Permissions** — 3-step evaluation (deny list, allow list, mode policy) with four modes: `plan`, `default`, `acceptEdits`, `bypassPermissions`
  - **Hooks** — `PreToolUse`, `PostToolUse`, `PreCompact`, `PostCompact`, `SessionStart`, `UserPromptSubmit`, `Stop` events with shell command execution
  - **Streaming** — Real-time text delta callbacks via `onTextDelta`
  - **Context Loading** — AGENTS.md / CLAUDE.md walk-up discovery and system prompt assembly
- - **Config Loading** — 6-file settings merge with provider profiles, legacy provider compatibility, and `$ENV:VAR` substitution for provider API keys
+ - **Prompt File References** — Path-like `@file` prompt references are resolved by the SDK under the session `cwd`, bounded by size/recursion limits, recorded as structured history events, and registered as observed context references
+ - **Context Reference Inventory** — Manual `/context add` references are stored by `InteractiveSession`, included in future prompt model input, and exposed through SDK command common APIs
+ - **Config Loading** — 6-file settings merge with provider profiles, legacy provider compatibility, and `$ENV:VAR` substitution for provider credentials
  - **Context Window Management** — Token tracking, configurable auto-compaction (default ~83.5%), manual `session.compact()`
  - **Background Jobs** — Runtime-managed subagent tasks with transcripts and task snapshots
  - **Agent Batch Jobs** — `Agent({ jobs: [...] })` starts explicit parallel subagent requests deterministically
  - **Edit Checkpoints** — Checkpoint/rewind support for safer edit workflows
  - **Project Memory** — Command-driven memory capture and retrieval surfaces
- - **Replay Events** — Session execution can forward provider/tool boundary events into append-only logs
+ - **Replay Events** — Session execution can forward provider/tool boundary events and provider-native raw payload events into append-only logs
  - **Bundle Plugin System** — Install and manage reusable extensions packaged as bundle plugins
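
The Permissions bullet above names a three-step evaluation: deny list, then allow list, then mode policy. The following is a minimal sketch of that ordering only. Every name in it (`evaluatePermission`, `modePolicy`, `PermissionRules`, the `Decision` type) is hypothetical, and the per-mode policy is an assumption made for illustration; the README does not spell out what each mode allows.

```typescript
// Hypothetical sketch: none of these names are taken from @robota-sdk/agent-sdk.
type PermissionMode = 'plan' | 'default' | 'acceptEdits' | 'bypassPermissions';
type Decision = 'allow' | 'deny' | 'ask';

interface PermissionRules {
  deny: string[];
  allow: string[];
}

// Assumed tool groupings, used only to make the mode policy concrete.
const READ_ONLY_TOOLS = new Set(['Read', 'Glob', 'Grep']);
const EDIT_TOOLS = new Set(['Write', 'Edit']);

// Step 3: fall back to the mode policy (this table is an assumption).
function modePolicy(mode: PermissionMode, toolName: string): Decision {
  if (mode === 'bypassPermissions') return 'allow';
  if (READ_ONLY_TOOLS.has(toolName)) return 'allow';
  if (mode === 'plan') return 'deny';
  if (mode === 'acceptEdits' && EDIT_TOOLS.has(toolName)) return 'allow';
  return 'ask';
}

function evaluatePermission(
  toolName: string,
  mode: PermissionMode,
  rules: PermissionRules,
): Decision {
  // Step 1: an explicit deny always wins, whatever the mode.
  if (rules.deny.includes(toolName)) return 'deny';
  // Step 2: an explicit allow short-circuits the mode policy.
  if (rules.allow.includes(toolName)) return 'allow';
  // Step 3: no explicit rule matched, so the mode decides.
  return modePolicy(mode, toolName);
}
```

Because the deny list is checked first, a denied tool stays denied even in `bypassPermissions`, which is the main reason to order the steps this way.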
 
  ## Architecture
@@ -59,10 +71,10 @@ agent-sdk (assembly layer)
  │ └── Session ← generic session (agent-sessions)
  ├── SystemCommandExecutor ← SDK-level command execution
  ├── CommandRegistry / BuiltinCommandSource / SkillCommandSource
- ├── Agent tool batch jobs and background orchestration
+ ├── Agent runtime dependencies and background orchestration
  ├── Edit checkpoints and command-driven memory
- ├── query() ← one-shot entry point
- ├── createSession() ← assembly factory
+ ├── createQuery() ← one-shot entry point factory
+ ├── createSession() ← internal assembly factory
  └── deps:
  agent-sessions (Session, SessionStore)
  agent-tools (tool infrastructure + 8 built-in tools)
@@ -82,19 +94,22 @@ The SDK is **pure TypeScript with no React dependency**. The CLI is a thin TUI-o
  `InteractiveSession` wraps `Session` (composition over inheritance) to provide event-driven interaction for any client. It manages streaming text accumulation, tool execution state tracking, prompt queuing, abort orchestration, and message history. Logic that was previously embedded in CLI React hooks now lives here.
 
  ```typescript
- import { InteractiveSession } from '@robota-sdk/agent-sdk';
+ import { InteractiveSession, createProjectSessionStore } from '@robota-sdk/agent-sdk';
  import type { IInteractiveSessionOptions } from '@robota-sdk/agent-sdk';
 
+ const cwd = process.cwd();
+ const sessionStore = createProjectSessionStore(cwd);
+
  const session = new InteractiveSession({
  config,
  context,
  projectInfo,
- sessionStore, // SessionStore instance for persistence
- resumeSessionId, // Session ID to restore (optional)
+ sessionStore, // SDK-owned project-local persistence facade
+ resumeSessionId, // Session ID to restore, including sandbox snapshot when available
  forkSession, // Session ID to fork from (optional)
  permissionMode: 'default',
  maxTurns: 10,
- cwd: process.cwd(),
+ cwd,
  permissionHandler: async (toolName, toolArgs) => ({ allowed: true }),
  });
 
@@ -127,9 +142,17 @@ session.on('interrupted', (result) => {
  // Submit a prompt (queues if already executing, max 1 queued)
  await session.submit('Explain this code');
 
+ // Path-like @file references are expanded into model-only prompt context by the SDK.
+ // The user-visible history keeps the original prompt plus a structured file-reference event.
+ await session.submit('Explain @AGENTS.md and @docs/SPEC.md');
+
  // Submit with display override (shown in UI) and raw input (for hook matching)
  await session.submit(fullPrompt, '/audit', '/rulebased-harness:audit');
 
+ // Execute slash commands through the command layer. With the skills command module composed,
+ // `/audit src/index.ts` is normalized by SDK to command "skills" with args "audit src/index.ts".
+ await session.executeCommand('audit', 'src/index.ts');
+
  // Abort current execution and clear queue
  session.abort();
 
@@ -184,21 +207,38 @@ executor.register({
 
  // List all commands
  executor.listCommands(); // ISystemCommand[]
- executor.hasCommand('mode'); // boolean
+ executor.hasCommand('permissions'); // boolean
  ```
 
- Product built-ins are supplied as `agent-command-*` modules. For example, `/help` is owned by `@robota-sdk/agent-command-help`, while `/compact` is owned by `@robota-sdk/agent-command-compact`.
+ SDK core does not own user-visible built-in commands. Product built-ins are supplied as `agent-command-*` modules. SDK command identity is slash-free (`skills`, `help`, `compact`); UI shells render and parse those commands as slash syntax such as `/skills`, `/help`, and `/compact`.
+
+ Command modules may use SDK common APIs for shared provider-neutral behavior. For `/model`, the SDK
+ resolves the active provider from settings, reads provider-owned fallback metadata from injected
+ `IProviderDefinition` records, and can invoke provider-owned catalog refresh hooks. The CLI/TUI must
+ only render command results and must not own provider model lists.
+
+ For provider setup, the SDK projects provider-owned setup help links from `IProviderDefinition`
+ records into generic prompt descriptions. The CLI/TUI renders those descriptions without owning
+ provider-specific API key or console URLs.
+
+ For `/permissions`, the SDK owns permission-mode constants, subcommand descriptors, validation,
+ state formatting, and command-host adapter access. The command module owns user-visible behavior and
+ keeps permission-mode changes under `/permissions [mode]`.
+
+ For `/validate-session`, the session command module calls SDK session command APIs to locate and
+ validate the current JSONL session log. Hosts may override `validateCurrentSessionReplayLog()` in
+ `ICommandHostContext` when they own a non-file-backed session log store.
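
The added text above says command identity is slash-free while UI shells render and parse slash syntax, and the `session.executeCommand` example in this diff notes that `/audit src/index.ts` becomes command `skills` with args `audit src/index.ts` once the skills module is composed. The sketch below shows one way a shell-side parser could honor that split. `parseSlashInput`, `renderSlash`, and `ParsedCommand` are hypothetical names, and folding the skill fallback into the parser is an assumption; in the package that normalization is described as SDK behavior.

```typescript
// Hypothetical sketch: not the SDK's parser, just the slash-free identity idea.
interface ParsedCommand {
  name: string; // slash-free command identity, e.g. "help"
  args: string; // remaining text after the command name
}

function parseSlashInput(
  input: string,
  registered: ReadonlySet<string>,
  skillNames: ReadonlySet<string>,
): ParsedCommand | null {
  const trimmed = input.trim();
  if (!trimmed.startsWith('/')) return null; // ordinary prompt, not a command

  const body = trimmed.slice(1);
  const spaceIndex = body.search(/\s/);
  const head = spaceIndex === -1 ? body : body.slice(0, spaceIndex);
  const rest = spaceIndex === -1 ? '' : body.slice(spaceIndex).trim();
  if (head === '') return null;

  // A registered command keeps its own slash-free identity.
  if (registered.has(head)) return { name: head, args: rest };

  // Assumed fallback: a known skill name is routed to the "skills" command,
  // with the skill name moved to the front of the args.
  if (registered.has('skills') && skillNames.has(head)) {
    return { name: 'skills', args: rest === '' ? head : `${head} ${rest}` };
  }
  return null;
}

// Rendering is the inverse: only the UI layer adds the slash.
function renderSlash(name: string): string {
  return `/${name}`;
}
```

Keeping the slash out of the stored identity means a non-terminal host (web UI, API server) can present the same commands with a different syntax.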
 
  ### CommandRegistry, BuiltinCommandSource, SkillCommandSource
 
  These classes provide slash command discovery and aggregation for clients that expose a command palette or autocomplete UI.
 
  ```typescript
- import { CommandRegistry, BuiltinCommandSource, SkillCommandSource } from '@robota-sdk/agent-sdk';
+ import { CommandRegistry } from '@robota-sdk/agent-sdk';
+ import { createSkillsCommandModule } from '@robota-sdk/agent-command-skills';
 
  const registry = new CommandRegistry();
- registry.addSource(new BuiltinCommandSource());
- registry.addSource(new SkillCommandSource(process.cwd()));
+ registry.addModule(createSkillsCommandModule({ cwd: process.cwd() }));
 
  // Get all commands (returns ICommand[])
  const commands = registry.getCommands();
@@ -210,59 +250,100 @@ const filtered = registry.getCommands('mod'); // matches "mode", "model"
  registry.resolveQualifiedName('audit'); // "my-plugin:audit"
  ```
 
- `SkillCommandSource` discovers skills from (highest priority first):
+ `SkillCommandSource` is the SDK common API used by the skills command module. It discovers skills from (highest priority first):
 
  - `<cwd>/.claude/skills/*/SKILL.md`
  - `<cwd>/.claude/commands/*.md` (Claude Code compatible)
  - `~/.robota/skills/*/SKILL.md`
  - `<cwd>/.agents/skills/*/SKILL.md`
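
The list above gives four skill locations in priority order. A minimal sketch of how such an ordering could be applied follows. `skillSearchRoots`, `mergeByPriority`, and `SkillCandidate` are hypothetical names, and the rule that a same-named skill from a higher-priority location shadows lower ones is an assumption read into "highest priority first"; the diff does not state how name collisions are resolved.

```typescript
// Hypothetical sketch: illustrates priority ordering, not SkillCommandSource itself.
interface SkillCandidate {
  name: string;
  path: string;
}

// The four roots from the list above, highest priority first.
function skillSearchRoots(cwd: string, home: string): string[] {
  return [
    `${cwd}/.claude/skills`,
    `${cwd}/.claude/commands`,
    `${home}/.robota/skills`,
    `${cwd}/.agents/skills`,
  ];
}

// `sources` holds the candidates found under each root, in the same order.
// Assumption: the first occurrence of a name wins and later ones are ignored.
function mergeByPriority(sources: SkillCandidate[][]): SkillCandidate[] {
  const byName = new Map<string, SkillCandidate>();
  for (const source of sources) {
    for (const candidate of source) {
      if (!byName.has(candidate.name)) byName.set(candidate.name, candidate);
    }
  }
  return Array.from(byName.values());
}
```

The filesystem walk is left out on purpose so the ordering rule can be read and tested on its own.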
 
- ### query()
+ Model-invocable skills are exposed to the model as metadata only when the session has a composed
+ model-invocable `skills` command descriptor. `@robota-sdk/agent-command-skills` owns `skills` and
+ activates skills through the SDK host API. Models use the SDK-projected `robota_command_skills`
+ tool with skill arguments in `args`. Mentioning a skill in ordinary prose,
+ recommending a skill in assistant text, or matching a natural-language phrase in SDK/TUI code does
+ not activate the skill.
+
+ ### createQuery()
 
  ```typescript
- import { query } from '@robota-sdk/agent-sdk';
+ import { createQuery } from '@robota-sdk/agent-sdk';
+ import { AnthropicProvider } from '@robota-sdk/agent-provider-anthropic';
+
+ const provider = new AnthropicProvider({ apiKey: process.env.ANTHROPIC_API_KEY });
+ const query = createQuery({ provider });
 
  const response = await query('Show me the file list');
 
- const response = await query('Analyze the code', {
+ const queryWithOptions = createQuery({
+ provider,
  cwd: '/path/to/project',
  permissionMode: 'acceptEdits',
  maxTurns: 10,
- autoCompactThreshold: 0.75,
  onTextDelta: (delta) => process.stdout.write(delta),
- onCompact: () => console.log('Context compacted'),
  });
- ```
-
- ### createSession()
 
- ```typescript
- import { createSession, loadConfig, loadContext, detectProject } from '@robota-sdk/agent-sdk';
+ const detailedResponse = await queryWithOptions('Analyze the code');
+ ```
 
- const [config, context, projectInfo] = await Promise.all([
- loadConfig(cwd),
- loadContext(cwd),
- detectProject(cwd),
- ]);
+ ### Session Assembly
 
- const session = createSession({ config, context, terminal, projectInfo, permissionMode });
- const response = await session.run('Hello');
- ```
+ `createSession()`, `loadConfig()`, `loadContext()`, and `detectProject()` are internal SDK assembly
+ details. Use `InteractiveSession` for event-driven sessions or `createQuery()` for prompt-only
+ one-shot calls.
 
  ### Built-in Tools
 
- `@robota-sdk/agent-sdk` re-exports 6 of the 8 built-in tools from `@robota-sdk/agent-tools`:
+ `@robota-sdk/agent-sdk` assembles built-in tools for SDK sessions, but direct tool usage imports
+ from the owner package:
 
  ```typescript
- import { bashTool, readTool, writeTool, editTool, globTool, grepTool } from '@robota-sdk/agent-sdk';
+ import {
+ bashTool,
+ editTool,
+ globTool,
+ grepTool,
+ readTool,
+ webFetchTool,
+ webSearchTool,
+ writeTool,
+ } from '@robota-sdk/agent-tools';
  ```
 
- `webFetchTool` and `webSearchTool` are **not** re-exported from `@robota-sdk/agent-sdk`. Import them directly from the owning package:
+ ### Sandbox Execution
+
+ SDK sessions can receive a provider-neutral sandbox client. When provided, Bash, Read, Write, and Edit use the sandbox execution plane instead of the host process/filesystem:
 
  ```typescript
- import { webFetchTool, webSearchTool } from '@robota-sdk/agent-tools';
+ import { InteractiveSession } from '@robota-sdk/agent-sdk';
+ import { AnthropicProvider } from '@robota-sdk/agent-provider-anthropic';
+ import { E2BSandboxClient } from '@robota-sdk/agent-tools';
+ import type { IWorkspaceManifest } from '@robota-sdk/agent-tools';
+ import { Sandbox } from 'e2b';
+
+ const provider = new AnthropicProvider({ apiKey: process.env.ANTHROPIC_API_KEY });
+ const sandbox = await Sandbox.create();
+ const workspaceManifest: IWorkspaceManifest = {
+ entries: {
+ 'task.md': { type: 'file', content: 'Review this repository.\n' },
+ repo: { type: 'gitRepo', url: 'https://github.com/example/project.git', ref: 'main' },
+ output: { type: 'dir' },
+ },
+ };
+
+ const session = new InteractiveSession({
+ cwd: process.cwd(),
+ provider,
+ sandboxClient: new E2BSandboxClient({ sandbox }),
+ workspaceManifest,
+ reversibleExecution: { mode: 'local-first' },
+ });
  ```
 
+ `E2BSandboxClient` is a structural adapter owned by `agent-tools`, and it does not make `e2b` a dependency of `agent-sdk`. Install and create the concrete provider SDK in the application layer, then pass the adapter into `InteractiveSession`. `workspaceManifest` also uses the `agent-tools` contract; SDK applies it once before constructing the underlying `Session`.
+
+ When `sessionStore` and a snapshot-capable `sandboxClient` are both provided, `InteractiveSession.shutdown()` stores `sandboxSnapshotId` in the session record. A later non-fork `resumeSessionId` restore calls `sandboxClient.restore(snapshotId)` before saved messages are injected back into the `Session`. Forked sessions intentionally do not hydrate the previous sandbox reference because provider pause/resume references can be one-to-one.
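
The hydration paragraph above describes an ordering: persist a snapshot id on shutdown, restore it before message replay on a non-fork resume, and skip the restore for forks. The sketch below models only that ordering. `SnapshotSandboxClient`, `SessionRecord`, `shutdownSession`, `resumeSession`, and the `snapshot()` method are hypothetical; only `restore(snapshotId)` and `sandboxSnapshotId` are named in the diff, and their real signatures may differ.

```typescript
// Hypothetical sketch of the hydration ordering; not InteractiveSession's code.
interface SnapshotSandboxClient {
  snapshot(): Promise<string>; // assumed method name
  restore(snapshotId: string): Promise<void>; // named in the README text
}

interface SessionRecord {
  id: string;
  messages: string[];
  sandboxSnapshotId?: string;
}

// On shutdown, capture a snapshot and keep its id in the session record.
async function shutdownSession(
  record: SessionRecord,
  client: SnapshotSandboxClient,
): Promise<SessionRecord> {
  const sandboxSnapshotId = await client.snapshot();
  return { ...record, sandboxSnapshotId };
}

// On resume, restore the sandbox first, then replay saved messages.
// A fork skips the restore so two sessions never share one snapshot reference.
async function resumeSession(
  record: SessionRecord,
  client: SnapshotSandboxClient,
  options: { fork: boolean },
  injectMessage: (message: string) => void,
): Promise<void> {
  if (!options.fork && record.sandboxSnapshotId !== undefined) {
    await client.restore(record.sandboxSnapshotId);
  }
  for (const message of record.messages) {
    injectMessage(message);
  }
}
```

Restoring before replay matters because replayed messages may refer to files that exist only in the snapshotted workspace.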
+
  ## Subagent Sessions
 
  `createSubagentSession()` creates an isolated child session for delegating subtasks. The subagent receives pre-resolved config and context from the parent — it does not load config files or context from disk. Callers may provide a stable `sessionId` and `sessionLogger` so the child session writes a durable transcript.
@@ -288,7 +369,13 @@ Built-in agents: `general-purpose` (full tool access), `Explore` (read-only, Hai
 
  `createAgentTool()` wraps subagent creation into a tool the AI can invoke directly. The parent session's hooks, permissions, and context are forwarded to the child.
 
- Background subagent lifecycle events are persisted through `InteractiveSession` when a `SessionStore` is configured. Streaming chunks are written to append-only JSONL logs/transcripts rather than rewriting the main session JSON per token.
+ Background subagent lifecycle events are persisted through `InteractiveSession` when an SDK session persistence facade is configured. Streaming chunks are written to append-only JSONL logs/transcripts rather than rewriting the main session JSON per token.
+
+ ## Replay-Grade Session Events
+
+ `Session.run()` forwards core execution events through the session logger. Current events include provider request envelopes, provider-native raw request/response/stream payloads, provider-normalized responses, assistant message commits, tool batch starts, tool execution requests, and tool execution results.
+
+ Provider-native payload events are emitted by concrete provider packages through `IChatOptions.onProviderNativeRawPayload`, then redacted and externalized by the session logger before they are written to disk. The SDK exposes session command APIs so command modules such as `/validate-session` can validate replay coverage without adding file-log logic to CLI/TUI hosts.
 
  ## Hook Executors (SDK-Specific)
 
@@ -372,7 +459,7 @@ Settings are merged from lowest to highest priority:
  }
  ```
 
- `currentProvider` selects the active entry from `providers`. Qwen Model Studio profiles use `type: "qwen"` with the documented DashScope OpenAI-compatible `baseURL`. Gemma-family local models should use a `type: "gemma"` profile so provider-specific stream projection is applied. The resolved SDK config normalizes the active profile into `provider.name`, `provider.model`, `provider.apiKey`, optional `provider.baseURL`, and optional `provider.timeout`. The legacy `provider` object remains supported when `currentProvider` is not configured.
+ `currentProvider` selects the active entry from `providers`. Qwen Model Studio profiles use `type: "qwen"` with the documented DashScope OpenAI-compatible `baseURL`. DeepSeek API profiles use `type: "deepseek"` with the documented DeepSeek OpenAI-compatible `baseURL`. Gemma-family local models should use a `type: "gemma"` profile so provider-specific stream projection is applied. The resolved SDK config normalizes the active profile into `provider.name`, `provider.model`, `provider.apiKey`, optional `provider.baseURL`, and optional `provider.timeout`. The legacy `provider` object remains supported when `currentProvider` is not configured.
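
The paragraph above describes selecting the active profile via `currentProvider`, normalizing it into a flat `provider` object, and falling back to the legacy `provider` object. Together with the `$ENV:VAR` substitution mentioned in the Features list, that flow might look roughly like the sketch below. The settings shape, the `resolveProvider` and `substituteEnv` names, and the choice to derive `name` from the profile `type` are all assumptions; the actual settings schema is not visible in this diff.

```typescript
// Hypothetical sketch: the settings shape and function names are assumptions.
interface ProviderProfile {
  type: string;
  model: string;
  apiKey?: string;
  baseURL?: string;
  timeout?: number;
}

interface ResolvedProvider {
  name: string;
  model: string;
  apiKey?: string;
  baseURL?: string;
  timeout?: number;
}

interface Settings {
  currentProvider?: string;
  providers?: Record<string, ProviderProfile>;
  provider?: ResolvedProvider; // legacy single-provider object
}

type Env = Record<string, string | undefined>;

// Replace a whole-value "$ENV:NAME" reference with the environment value.
function substituteEnv(value: string | undefined, env: Env): string | undefined {
  const prefix = '$ENV:';
  if (value === undefined || !value.startsWith(prefix)) return value;
  return env[value.slice(prefix.length)];
}

function resolveProvider(settings: Settings, env: Env): ResolvedProvider {
  const { currentProvider, providers } = settings;
  if (currentProvider !== undefined) {
    const profile = providers?.[currentProvider];
    if (profile === undefined) {
      throw new Error(`Unknown provider profile: ${currentProvider}`);
    }
    return {
      name: profile.type, // assumption: name comes from the profile type
      model: profile.model,
      apiKey: substituteEnv(profile.apiKey, env),
      baseURL: profile.baseURL,
      timeout: profile.timeout,
    };
  }
  // Legacy path: used only when currentProvider is not configured.
  if (settings.provider !== undefined) {
    return { ...settings.provider, apiKey: substituteEnv(settings.provider.apiKey, env) };
  }
  throw new Error('No provider configured');
}
```

Passing the environment in as a parameter, rather than reading `process.env` inside the function, keeps the sketch testable; a caller would typically pass `process.env`.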
 
  ## Permission Modes
 