@standardagents/skill 0.14.1 → 0.15.1

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@standardagents/skill",
-   "version": "0.14.1",
+   "version": "0.15.1",
    "private": false,
    "publishConfig": {
      "access": "public",
@@ -334,10 +334,12 @@ This is the single biggest gap in most coding-agent-built tools. **Do not write
  | What you need | Use this | Don't use |
  |---|---|---|
  | Store a file between turns | `state.writeFile` / `state.readFile` | S3, external blob store |
- | Persist structured data across turns | `state.context` (in-memory) + `state.writeFile` JSON (durable) | External KV, Redis |
+ | Persist small structured data across turns | `state.getValue` / `state.setValue` durable KV | External KV, Redis |
+ | Persist large or binary data across turns | `state.writeFile` / `state.readFile` | External blob store |
  | Trigger work later | `state.scheduleEffect` | External cron, queue service |
  | Invoke another tool from inside a tool | `state.invokeTool` / `state.queueTool` | Re-implementing tool logic inline |
- | Read/write config and secrets | `state.env` / `state.setEnv` | `process.env`, `.env` files |
+ | Read/write config and secrets | `state.env` / `state.envType` / `state.setEnv` | `process.env`, `.env` files |
+ | Run isolated deterministic code | `state.runCode` with explicit bridges | `eval`, child processes, implicit runtime access |
  | Search files the thread has seen | `state.grepFiles` / `state.findFiles` | Reimplementing search |
  | Escalate / report status to the parent | `state.notifyParent` / `state.setStatus` | Custom message bus |
  | Load a sibling prompt / agent / model | `state.loadPrompt` / `state.loadAgent` / `state.loadModel` | Duplicating the definition |
@@ -360,26 +362,94 @@ Logs getLogs
  Resources   loadModel, loadPrompt, loadAgent,
              getChildThread, getParentThread,
              getPromptNames, getAgentNames, getModelNames
- Env         env, setEnv
+ Env         env, envType, setEnv
  Parent      notifyParent, setStatus
  Tools       queueTool, invokeTool
  Effects     scheduleEffect, getScheduledEffects, removeScheduledEffect
  Events      emit
  Context     context (Record<string, unknown>, in-memory only)
+ KV          getValue, setValue
  Files       writeFile, readFile, readFileStream, statFile, readdirFile,
              unlinkFile, mkdirFile, rmdirFile, getFileStats,
              grepFiles, findFiles, getFileThumbnail
- Execution   execution, terminate
+ Execution   execution, terminate, runCode
  ```

+ ### Durable key-value store
+
+ Use `state.getValue()` and `state.setValue()` for small per-thread durable JSON values such as counters, checkpoints, cursors, tool state, and user preferences. Values survive restarts and are scoped to the current thread.
+
+ ```ts
+ const count = (await state.getValue<number>('invocation_count')) ?? 0;
+ await state.setValue('invocation_count', count + 1);
+ ```
+
+ `setValue(key, null)` and `setValue(key, undefined)` delete the key. For larger payloads, binary data, user-visible artifacts, or content that should be shared as a file, use `state.writeFile()` / `state.readFile()` instead.
+
+ ### Sandboxed code execution
+
+ Use `state.runCode()` for model- or user-authored JavaScript/TypeScript instead of `eval` or `new Function`. The sandbox runs in Cloudflare Dynamic Workers, has no implicit thread state, filesystem, network, timers, or host globals, and only receives capabilities you explicitly bridge through `imports` or `globals`.
+
+ ```ts
+ const run = state.runCode(
+   `
+     import { readFile } from "fs";
+
+     export async function summarize(path: string) {
+       const text = await readFile(path);
+       return text.slice(0, 200);
+     }
+   `,
+   {
+     language: 'typescript',
+     execute: { fn: 'summarize', args: ['/notes/input.txt'] },
+     imports: {
+       fs: {
+         readFile: async (path: string) => {
+           const file = await state.readFile(path);
+           return file ? new TextDecoder().decode(file) : '';
+         },
+       },
+     },
+   },
+ );
+
+ const result = await run;
+ ```
+
+ By default, `runCode()` executes the `default` export with no args. Use `execute: { fn, args }` to call a named export or pass arguments; `fn: 'default'` calls the default export with args. Use `modules` to provide local relative ES modules. The result is a status object: successful runs return `status: 'success'`, `result`, `logs`, `reports`, and `durationMs`; failed runs return an error status and an `error` object. Call `run.terminate(reason)` from your own timeout budget when needed.
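
The result handling described above could look roughly like the following. This is a minimal sketch, not the package's own code: the `RunOutcome` types, the `CodeRun` interface, and the helpers `awaitWithBudget`, `isSuccess`, and `summarizeOutcome` are hypothetical names inferred from the prose, the exact failure status string is unknown, and the sketch assumes that calling `terminate(reason)` causes the pending run to settle.

```ts
// Hypothetical shapes inferred from the description above; the real types may differ.
type RunSuccess = {
  status: 'success';
  result: unknown;
  logs: unknown[];
  reports: unknown[];
  durationMs: number;
};
type RunFailure = { status: string; error: { message?: string } };
type RunOutcome = RunSuccess | RunFailure;

// Assumed handle returned by state.runCode(): awaitable, with terminate().
interface CodeRun extends PromiseLike<RunOutcome> {
  terminate(reason: string): void;
}

// Enforce a caller-owned time budget. Assumes terminate() makes the run settle.
async function awaitWithBudget(run: CodeRun, budgetMs: number): Promise<RunOutcome> {
  const timer = setTimeout(() => run.terminate(`exceeded ${budgetMs}ms budget`), budgetMs);
  try {
    return await run;
  } finally {
    clearTimeout(timer);
  }
}

function isSuccess(outcome: RunOutcome): outcome is RunSuccess {
  return outcome.status === 'success';
}

// Turn either outcome into a short string suitable for a tool result.
function summarizeOutcome(outcome: RunOutcome): string {
  if (isSuccess(outcome)) {
    return `ok in ${outcome.durationMs}ms: ${JSON.stringify(outcome.result)}`;
  }
  return `failed (${outcome.status}): ${outcome.error.message ?? 'unknown error'}`;
}
```

Branching on `status === 'success'` rather than on a specific failure string keeps the sketch independent of whatever error status the runtime actually reports.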
+
  ### Notes on a few that are easy to misuse

- - **`state.context`** is in-memory for the *current execution*. It is not durable across thread restarts. For durable structured state, write a JSON file with `state.writeFile`.
+ - **`state.context`** is in-memory for the *current execution*. It is not durable across thread restarts. For durable structured state, use `state.getValue` / `state.setValue`.
+ - **`state.getValue` / `state.setValue`** are durable per-thread JSON storage. Use them for small structured state; use files for larger content or artifacts.
+ - **`state.env` / `state.envType` / `state.setEnv`** are for runtime configuration and secrets. `state.env(name)` resolves thread -> account -> instance -> agent -> prompt. `state.envType(name)` returns `'secret'` by default; `'text'` means the value may be shown in tool output. `state.setEnv(name, value, { type: 'text' | 'secret' })` writes thread-scoped env and propagates to active descendants. Omit `type` only when preserving the existing type is intentional; new keys default to secret.
+ - **`state.runCode`** runs JavaScript or TypeScript in an isolated Dynamic Worker sandbox. The sandbox does not receive `ThreadState`, env, files, network, or host globals implicitly. Pass exact capabilities through `imports`, `globals`, `modules`, and `execute`. It executes exported values/functions; code such as `console.log(fibonacci(5))` can log, but it must still `export default ...` or export a named function/value for the host to receive a result.
  - **`state.scheduleEffect`** runs a named effect after a delay. It survives restarts. This is your cron, your queue, and your retry timer all in one.
  - **`state.invokeTool` vs `state.queueTool`** — `invokeTool` runs synchronously and returns the result; `queueTool` schedules the call to run later in the normal tool-call flow. Prefer `queueTool` when the model should see the result as a regular tool call.
  - **`state.notifyParent`** — for resumable subagents with `parentCommunication: 'explicit'`, this is the only way the child talks to the parent. Use it sparingly; every notification interrupts the parent.
  - **File attachments** use the path convention `/attachments/{filename}.{ext}`. Always use this path when passing files between agents — the runtime copies them across thread filesystems automatically.

+ ### Sandboxed code env bridge pattern
+
+ When building a coding agent that runs user-authored code, do **not** expose all thread env and do **not** rely on `process.env`. Use an explicit allowlist stored in durable KV:
+
+ 1. A setup tool such as `set_code_envs` (called `set_code_run_envs` in the built-in sandboxed coding agent) receives the required env names and stores the allowlist with `state.setValue(...)`.
+ 2. The execution tool reads that allowlist with `state.getValue(...)`.
+ 3. For each allowed name, the execution tool resolves `await state.env(name)` and `await state.envType(name)`. If a value is missing, it may create an empty thread env entry with `state.setEnv(name, '', { type: 'secret' })` so the UI can prompt the user.
+ 4. The execution tool calls `state.runCode(source, { imports: { env: { env: allowedValues } }, ... })`, exposing only the whitelisted env object.
+ 5. The tool redacts values whose `envType` is `secret` from results, reports, logs, and errors. Values marked `text` may appear.
+
+ The prompt for a coding agent must explain this exact interface. Tell the model to call the allowlist setup tool before running code, then import a static object from `"env"`:
+
+ ```ts
+ import { env } from "env";
+
+ const apiKey = env.WEATHER_API_KEY;
+ ```
+
+ Also tell it what **not** to do: do not read `process.env`, do not call `env()` as a function, do not `await` env values, do not use named env imports, and do not pass env values around as tool arguments.
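
Steps 3 through 5 of the pattern above can be sketched as plain helpers. Everything here is illustrative: the `EnvReader` interface is a hypothetical subset of `ThreadState`, the return types of `env` and `envType` are assumptions, and `resolveAllowedEnv`, `toEnvObject`, and `redactSecrets` are made-up names rather than package APIs.

```ts
type EnvType = 'text' | 'secret';

interface ResolvedEnv {
  name: string;
  value: string;
  type: EnvType;
}

// Hypothetical subset of ThreadState; return types are assumptions.
interface EnvReader {
  env(name: string): Promise<string | undefined>;
  envType(name: string): Promise<EnvType>;
}

// Step 3: resolve only the allowlisted names, skipping missing or blank values.
async function resolveAllowedEnv(state: EnvReader, allowlist: string[]): Promise<ResolvedEnv[]> {
  const resolved: ResolvedEnv[] = [];
  for (const name of allowlist) {
    const value = await state.env(name);
    if (value === undefined || value === '') continue;
    resolved.push({ name, value, type: await state.envType(name) });
  }
  return resolved;
}

// Step 4: build the static object the sandbox would import from "env".
function toEnvObject(entries: ResolvedEnv[]): Record<string, string> {
  const out: Record<string, string> = {};
  for (const entry of entries) out[entry.name] = entry.value;
  return out;
}

// Step 5: replace every occurrence of a secret value; text values pass through.
function redactSecrets(text: string, entries: ResolvedEnv[]): string {
  let out = text;
  for (const entry of entries) {
    if (entry.type === 'secret' && entry.value.length > 0) {
      out = out.split(entry.value).join('[REDACTED]');
    }
  }
  return out;
}
```

Under these assumptions, the object from `toEnvObject` is what would be passed as `imports: { env: { env: allowedValues } }`, and `redactSecrets` would be applied to serialized results, logs, reports, and error messages before they are returned to the model.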
+
  ---

  ## Tools
@@ -390,6 +460,16 @@ A "tool" is anything an agent can call. There are three kinds:
  2. **Subprompts** — prompts exposed as tools via their `toolDescription`. A single-step LLM call. Use for switching models on a focused task (image generation, polished writing, JSON extraction).
  3. **Subagents** — full agents exposed as tools via `exposeAsTool: true` on the agent definition. Use when you need iteration, QA, reflection, or long-lived addressable behavior. Always `dual_ai`.

+ ### Provider-visible argument schemas
+
+ Tool argument schemas are sent to model providers, and strict tool-calling providers commonly reject JSON Schema objects unless every object schema has `additionalProperties: false`. This applies recursively, not just at the top level. If a tool has nested object args, arrays of objects, `anyOf` object branches, prompt `requiredSchema`, or an agent exposed as a tool, verify the model-facing schema contains `additionalProperties: false` for every `type: "object"`.
+
+ Common failure modes:
+
+ - `z.record(...)` can emit `propertyNames`, which some providers reject for tool schemas. Prefer explicit `z.object({ ... })` shapes for provider-visible tool args.
+ - `z.object({}).catchall(...)` can emit `additionalProperties: {}`. Strict providers expect the boolean `false`, not an object schema.
+ - "Arbitrary object" tool args are a poor fit for strict provider tool schemas. Prefer a JSON string for truly arbitrary payloads, or define the object properties explicitly.
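
The recursive requirement above can be checked mechanically before a schema reaches a provider. The following is a minimal sketch, assuming the model-facing schema is available as a plain JSON Schema object; `findLooseObjects` is a hypothetical helper, not part of the package, and it only walks `properties`, `items`, `anyOf`, `oneOf`, and `allOf`.

```ts
type JsonSchema = { [key: string]: unknown };

function isSchema(value: unknown): value is JsonSchema {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Returns the path of every object schema that lacks `additionalProperties: false`.
function findLooseObjects(schema: JsonSchema, path = '$'): string[] {
  const problems: string[] = [];

  // `type` may be a single string or an array such as ["object", "null"].
  const types: unknown[] = Array.isArray(schema.type) ? schema.type : [schema.type];
  if (types.includes('object') && schema.additionalProperties !== false) {
    problems.push(path);
  }

  // Recurse into named properties.
  if (isSchema(schema.properties)) {
    for (const [key, child] of Object.entries(schema.properties)) {
      if (isSchema(child)) {
        problems.push(...findLooseObjects(child, `${path}.properties.${key}`));
      }
    }
  }

  // Recurse into array item schemas.
  if (isSchema(schema.items)) {
    problems.push(...findLooseObjects(schema.items, `${path}.items`));
  }

  // Recurse into every branch of the combinator keywords.
  for (const keyword of ['anyOf', 'oneOf', 'allOf']) {
    const branches = schema[keyword];
    if (Array.isArray(branches)) {
      branches.forEach((branch, index) => {
        if (isSchema(branch)) {
          problems.push(...findLooseObjects(branch, `${path}.${keyword}[${index}]`));
        }
      });
    }
  }

  return problems;
}
```

An empty result means every object schema the walker reached is closed. A value such as `additionalProperties: {}` is reported, which matches the `catchall` failure mode listed above.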
+
  ### `PromptDefinition` cheat sheet

  A prompt is what actually gets sent to the LLM at one step. Set on each prompt file in `agents/prompts/`. For full signatures, read the spec types from `node_modules/@standardagents/spec/dist/` (or browse `packages/spec/src/` on GitHub), and see `agents/prompts/AGENTS.md`.
@@ -525,7 +605,7 @@ A second tool-config form exists for plain callables (`{ name, env, options }`),
  These are correct as written in the spec — internalize them.

  1. **Parents always create children.**
-    - Explicitly via the built-in `subagent_create` tool.
+    - Explicitly via the built-in `subagent_create` tool, which requires a non-empty `name` for the child instance.
     - Implicitly via `immediate: { ... }` — the child spawns the moment the parent prompt activates, before any LLM step.
  2. **Children only communicate back to their parents.** Two flavors:
     - **Implicit**: the child auto-queues a message to the parent when the session ends (via `sessionStop`, `sessionFail`, or `maxSessionTurns`). Default for all subagents.
@@ -545,6 +625,9 @@ Hooks extend agent behavior without modifying core logic. Defined via `defineHoo

  | Hook | Execution Point | Purpose |
  |---|---|---|
+ | `after_thread_created` | After thread creation, before execution | Initialize thread state |
+ | `after_subagent_created` | On parent after child thread creation | Track or initialize child relationships |
+ | `after_system_message` | After system message render | Transform dynamic system instructions |
  | `filter_messages` | Before LLM context assembly | Filter/transform message history |
  | `prefilter_llm_history` | After context assembly | Final adjustments before LLM request |
  | `before_create_message` | Before message insert | Transform message before storage |
@@ -559,7 +642,26 @@ Hooks extend agent behavior without modifying core logic. Defined via `defineHoo

  ## Variables and environment

- Variables let tools, prompts, and agents declare dynamic values they need. Two types:
+ Variables let tools, prompts, and agents declare dynamic values they need. Declare them on prompts, tools, and agents with `variables`:
+
+ ```ts
+ variables: [
+   {
+     name: 'LOCATION',
+     type: 'text',
+     required: true,
+     description: 'City or ZIP code to use for weather lookups.',
+   },
+   {
+     name: 'WEATHER_API_KEY',
+     type: 'secret',
+     required: true,
+     description: 'Weather API credential used only inside tools.',
+   },
+ ];
+ ```
+
+ Two value types are supported:

  - **`text`** — simple string. Safe to render in prompts.
  - **`secret`** — encrypted; **MUST only be used inside tools**. Never reference a secret in prompt text and never return it to the model. A `GMAIL_API_KEY` is `secret`; a `LOCATION` is `text`.
@@ -568,6 +670,17 @@ When a thread is created, all required variables in the agent graph must be prov

  Scoped variables (`scoped: true`) do not inherit from parent thread env — they reset for the declaring agent's subtree. Use this when a subagent must run with different config from its parent (e.g., a per-instance Slack channel ID).

+ Thread env values are editable from the thread metadata UI. `ThreadState.setEnv(name, "")` intentionally creates a blank thread env entry that still appears in that UI. Prefer the explicit form when writing new values:
+
+ ```ts
+ await state.setEnv('LOCATION', 'Charlottesville, VA', { type: 'text' });
+ await state.setEnv('WEATHER_API_KEY', token, { type: 'secret' });
+ ```
+
+ In code, read values through `await state.env('NAME')`, check display policy with `await state.envType('NAME')`, and write thread-scoped values with `await state.setEnv('NAME', value, { type: 'text' | 'secret' })`. Use `text` only for values that are safe in prompts, tool output, logs, and errors. Use `secret` for tokens, API keys, credentials, and anything that should be redacted.
+
+ Undeclared thread-only env keys are treated as write-only secrets when scanned for the UI, so the key is visible and editable but arbitrary stored values are not echoed back.
+
  ---

  ## Implementation checking