npm - @standardagents/skill - Versions diffs - 0.14.1 → 0.15.1 - Mend

@standardagents/skill 0.14.1 → 0.15.1

Files changed (2)

package/package.json +1 -1
package/skills/agentbuilder/SKILL.md +120 -7

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@standardagents/skill",
-  "version": "0.14.1",
+  "version": "0.15.1",
   "private": false,
   "publishConfig": {
     "access": "public",

package/skills/agentbuilder/SKILL.md CHANGED

@@ -334,10 +334,12 @@ This is the single biggest gap in most coding-agent-built tools. **Do not write
 | What you need | Use this | Don't use |
 |---|---|---|
 | Store a file between turns | `state.writeFile` / `state.readFile` | S3, external blob store |
-| Persist structured data across turns | `state.context` (in-memory) + `state.writeFile` JSON (durable) | External KV, Redis |
+| Persist small structured data across turns | `state.getValue` / `state.setValue` durable KV | External KV, Redis |
+| Persist large or binary data across turns | `state.writeFile` / `state.readFile` | External blob store |
 | Trigger work later | `state.scheduleEffect` | External cron, queue service |
 | Invoke another tool from inside a tool | `state.invokeTool` / `state.queueTool` | Re-implementing tool logic inline |
-| Read/write config and secrets | `state.env` / `state.setEnv` | `process.env`, `.env` files |
+| Read/write config and secrets | `state.env` / `state.envType` / `state.setEnv` | `process.env`, `.env` files |
+| Run isolated deterministic code | `state.runCode` with explicit bridges | `eval`, child processes, implicit runtime access |
 | Search files the thread has seen | `state.grepFiles` / `state.findFiles` | Reimplementing search |
 | Escalate / report status to the parent | `state.notifyParent` / `state.setStatus` | Custom message bus |
 | Load a sibling prompt / agent / model | `state.loadPrompt` / `state.loadAgent` / `state.loadModel` | Duplicating the definition |
@@ -360,26 +362,94 @@ Logs         getLogs
 Resources    loadModel, loadPrompt, loadAgent,
              getChildThread, getParentThread,
              getPromptNames, getAgentNames, getModelNames
-Env          env, setEnv
+Env          env, envType, setEnv
 Parent       notifyParent, setStatus
 Tools        queueTool, invokeTool
 Effects      scheduleEffect, getScheduledEffects, removeScheduledEffect
 Events       emit
 Context      context (Record<string, unknown>, in-memory only)
+KV           getValue, setValue
 Files        writeFile, readFile, readFileStream, statFile, readdirFile,
              unlinkFile, mkdirFile, rmdirFile, getFileStats,
              grepFiles, findFiles, getFileThumbnail
-Execution    execution, terminate
+Execution    execution, terminate, runCode
 ```
+### Durable key-value store
+Use `state.getValue()` and `state.setValue()` for small per-thread durable JSON values such as counters, checkpoints, cursors, tool state, and user preferences. Values survive restarts and are scoped to the current thread.
+```ts
+const count = (await state.getValue<number>('invocation_count')) ?? 0;
+await state.setValue('invocation_count', count + 1);
+```
+`setValue(key, null)` and `setValue(key, undefined)` delete the key. For larger payloads, binary data, user-visible artifacts, or content that should be shared as a file, use `state.writeFile()` / `state.readFile()` instead.
+### Sandboxed code execution
+Use `state.runCode()` for model- or user-authored JavaScript/TypeScript instead of `eval` or `new Function`. The sandbox runs in Cloudflare Dynamic Workers, has no implicit thread state, filesystem, network, timers, or host globals, and only receives capabilities you explicitly bridge through `imports` or `globals`.
+```ts
+const run = state.runCode(
+  `
+    import { readFile } from "fs";
+    export async function summarize(path: string) {
+      const text = await readFile(path);
+      return text.slice(0, 200);
+    }
+  `,
+  {
+    language: 'typescript',
+    execute: { fn: 'summarize', args: ['/notes/input.txt'] },
+    imports: {
+      fs: {
+        readFile: async (path: string) => {
+          const file = await state.readFile(path);
+          return file ? new TextDecoder().decode(file) : '';
+        },
+      },
+    },
+  },
+);
+const result = await run;
+```
+By default, `runCode()` executes the `default` export with no args. Use `execute: { fn, args }` to call a named export or pass arguments; `fn: 'default'` calls the default export with args. Use `modules` to provide local relative ES modules. The result is a status object: successful runs return `status: 'success'`, `result`, `logs`, `reports`, and `durationMs`; failed runs return an error status and an `error` object. Call `run.terminate(reason)` from your own timeout budget when needed.
 ### Notes on a few that are easy to misuse
-- **`state.context`** is in-memory for the *current execution*. It is not durable across thread restarts. For durable structured state, write a JSON file with `state.writeFile`.
+- **`state.context`** is in-memory for the *current execution*. It is not durable across thread restarts. For durable structured state, use `state.getValue` / `state.setValue`.
+- **`state.getValue` / `state.setValue`** are durable per-thread JSON storage. Use them for small structured state; use files for larger content or artifacts.
+- **`state.env` / `state.envType` / `state.setEnv`** are for runtime configuration and secrets. `state.env(name)` resolves thread -> account -> instance -> agent -> prompt. `state.envType(name)` returns `'secret'` by default; `'text'` means the value may be shown in tool output. `state.setEnv(name, value, { type: 'text' | 'secret' })` writes thread-scoped env and propagates to active descendants. Omit `type` only when preserving the existing type is intentional; new keys default to secret.
+- **`state.runCode`** runs JavaScript or TypeScript in an isolated Dynamic Worker sandbox. The sandbox does not receive `ThreadState`, env, files, network, or host globals implicitly. Pass exact capabilities through `imports`, `globals`, `modules`, and `execute`. It executes exported values/functions; code such as `console.log(fibonacci(5))` can log, but it must still `export default ...` or export a named function/value for the host to receive a result.
 - **`state.scheduleEffect`** runs a named effect after a delay. It survives restarts. This is your cron, your queue, and your retry timer all in one.
 - **`state.invokeTool` vs `state.queueTool`** — `invokeTool` runs synchronously and returns the result; `queueTool` schedules the call to run later in the normal tool-call flow. Prefer `queueTool` when the model should see the result as a regular tool call.
 - **`state.notifyParent`** — for resumable subagents with `parentCommunication: 'explicit'`, this is the only way the child talks to the parent. Use it sparingly; every notification interrupts the parent.
 - **File attachments** use the path convention `/attachments/{filename}.{ext}`. Always use this path when passing files between agents — the runtime copies them across thread filesystems automatically.
+### Sandboxed code env bridge pattern
+When building a coding agent that runs user-authored code, do **not** expose all thread env and do **not** rely on `process.env`. Use an explicit allowlist stored in durable KV:
+1. A setup tool such as `set_code_envs` (called `set_code_run_envs` in the built-in sandboxed coding agent) receives the required env names and stores the allowlist with `state.setValue(...)`.
+2. The execution tool reads that allowlist with `state.getValue(...)`.
+3. For each allowed name, the execution tool resolves `await state.env(name)` and `await state.envType(name)`. If a value is missing, it may create an empty thread env entry with `state.setEnv(name, '', { type: 'secret' })` so the UI can prompt the user.
+4. The execution tool calls `state.runCode(source, { imports: { env: { env: allowedValues } }, ... })`, exposing only the whitelisted env object.
+5. The tool redacts values whose `envType` is `secret` from results, reports, logs, and errors. Values marked `text` may appear.
+The prompt for a coding agent must explain this exact interface. Tell the model to call the allowlist setup tool before running code, then import a static object from `"env"`:
+```ts
+import { env } from "env";
+const apiKey = env.WEATHER_API_KEY;
+```
+Also tell it what **not** to do: do not read `process.env`, do not call `env()` as a function, do not `await` env values, do not use named env imports, and do not pass env values around as tool arguments.
 ---
 ## Tools
@@ -390,6 +460,16 @@ A "tool" is anything an agent can call. There are three kinds:
 2. **Subprompts** — prompts exposed as tools via their `toolDescription`. A single-step LLM call. Use for switching models on a focused task (image generation, polished writing, JSON extraction).
 3. **Subagents** — full agents exposed as tools via `exposeAsTool: true` on the agent definition. Use when you need iteration, QA, reflection, or long-lived addressable behavior. Always `dual_ai`.
+### Provider-visible argument schemas
+Tool argument schemas are sent to model providers, and strict tool-calling providers commonly reject JSON Schema objects unless every object schema has `additionalProperties: false`. This applies recursively, not just at the top level. If a tool has nested object args, arrays of objects, `anyOf` object branches, prompt `requiredSchema`, or an agent exposed as a tool, verify the model-facing schema contains `additionalProperties: false` for every `type: "object"`.
+Common failure modes:
+- `z.record(...)` can emit `propertyNames`, which some providers reject for tool schemas. Prefer explicit `z.object({ ... })` shapes for provider-visible tool args.
+- `z.object({}).catchall(...)` can emit `additionalProperties: {}`. Strict providers expect the boolean `false`, not an object schema.
+- "Arbitrary object" tool args are a poor fit for strict provider tool schemas. Prefer a JSON string for truly arbitrary payloads, or define the object properties explicitly.
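The recursive `additionalProperties: false` requirement described above can be checked mechanically before a schema reaches a provider. The following is a minimal sketch, not part of the package: `JsonSchema` and `findLooseObjects` are hypothetical names, and the walker only covers `properties`, `items`, and `anyOf` / `oneOf` / `allOf`, which is an assumption about which keywords a tool schema uses.

```ts
// Hypothetical, simplified JSON Schema shape for this sketch only.
type JsonSchema = {
  type?: string | string[];
  properties?: Record<string, JsonSchema>;
  items?: JsonSchema | JsonSchema[];
  anyOf?: JsonSchema[];
  oneOf?: JsonSchema[];
  allOf?: JsonSchema[];
  additionalProperties?: boolean | JsonSchema;
  [key: string]: unknown;
};

// Returns the path of every object schema whose additionalProperties
// is anything other than the boolean false.
function findLooseObjects(schema: JsonSchema, path = "$"): string[] {
  const problems: string[] = [];
  const types: Array<string | undefined> = Array.isArray(schema.type)
    ? schema.type
    : [schema.type];

  if (types.some((t) => t === "object") && schema.additionalProperties !== false) {
    problems.push(path);
  }

  // Nested object properties.
  const props = schema.properties ?? {};
  for (const key of Object.keys(props)) {
    problems.push(...findLooseObjects(props[key], `${path}.properties.${key}`));
  }

  // Array items, in either single-schema or tuple form.
  if (Array.isArray(schema.items)) {
    schema.items.forEach((child, i) => {
      problems.push(...findLooseObjects(child, `${path}.items[${i}]`));
    });
  } else if (schema.items !== undefined) {
    problems.push(...findLooseObjects(schema.items, `${path}.items`));
  }

  // Composition branches.
  for (const keyword of ["anyOf", "oneOf", "allOf"] as const) {
    (schema[keyword] ?? []).forEach((child, i) => {
      problems.push(...findLooseObjects(child, `${path}.${keyword}[${i}]`));
    });
  }
  return problems;
}

const example: JsonSchema = {
  type: "object",
  additionalProperties: false,
  properties: {
    // Missing additionalProperties entirely.
    filters: { type: "object", properties: { tag: { type: "string" } } },
    // The catchall-style form: an object schema instead of the boolean false.
    items: { type: "array", items: { type: "object", additionalProperties: {} } },
  },
};

console.log(findLooseObjects(example));
// → [ '$.properties.filters', '$.properties.items.items' ]
```

An empty result means every `type: "object"` node the walker visited carries `additionalProperties: false`. It does not guarantee a given provider accepts the schema, since providers may reject other keywords such as `propertyNames`.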
 ### `PromptDefinition` cheat sheet
 A prompt is what actually gets sent to the LLM at one step. Set on each prompt file in `agents/prompts/`. For full signatures, read the spec types from `node_modules/@standardagents/spec/dist/` (or browse `packages/spec/src/` on GitHub), and see `agents/prompts/AGENTS.md`.
@@ -525,7 +605,7 @@ A second tool-config form exists for plain callables (`{ name, env, options }`),
 These are correct as written in the spec — internalize them.
 1. **Parents always create children.**
-   - Explicitly via the built-in `subagent_create` tool.
+   - Explicitly via the built-in `subagent_create` tool, which requires a non-empty `name` for the child instance.
    - Implicitly via `immediate: { ... }` — the child spawns the moment the parent prompt activates, before any LLM step.
 2. **Children only communicate back to their parents.** Two flavors:
    - **Implicit**: the child auto-queues a message to the parent when the session ends (via `sessionStop`, `sessionFail`, or `maxSessionTurns`). Default for all subagents.
@@ -545,6 +625,9 @@ Hooks extend agent behavior without modifying core logic. Defined via `defineHoo
 | Hook | Execution Point | Purpose |
 |---|---|---|
+| `after_thread_created` | After thread creation, before execution | Initialize thread state |
+| `after_subagent_created` | On parent after child thread creation | Track or initialize child relationships |
+| `after_system_message` | After system message render | Transform dynamic system instructions |
 | `filter_messages` | Before LLM context assembly | Filter/transform message history |
 | `prefilter_llm_history` | After context assembly | Final adjustments before LLM request |
 | `before_create_message` | Before message insert | Transform message before storage |
@@ -559,7 +642,26 @@ Hooks extend agent behavior without modifying core logic. Defined via `defineHoo
 ## Variables and environment
-Variables let tools, prompts, and agents declare dynamic values they need. Two types:
+Variables let tools, prompts, and agents declare dynamic values they need. Declare them on prompts, tools, and agents with `variables`:
+```ts
+variables: [
+  {
+    name: 'LOCATION',
+    type: 'text',
+    required: true,
+    description: 'City or ZIP code to use for weather lookups.',
+  },
+  {
+    name: 'WEATHER_API_KEY',
+    type: 'secret',
+    required: true,
+    description: 'Weather API credential used only inside tools.',
+  },
+];
+```
+Two value types are supported:
 - **`text`** — simple string. Safe to render in prompts.
 - **`secret`** — encrypted; **MUST only be used inside tools**. Never reference a secret in prompt text and never return it to the model. A `GMAIL_API_KEY` is `secret`; a `LOCATION` is `text`.
@@ -568,6 +670,17 @@ When a thread is created, all required variables in the agent graph must be prov
 Scoped variables (`scoped: true`) do not inherit from parent thread env — they reset for the declaring agent's subtree. Use this when a subagent must run with different config from its parent (e.g., a per-instance Slack channel ID).
+Thread env values are editable from the thread metadata UI. `ThreadState.setEnv(name, "")` intentionally creates a blank thread env entry that still appears in that UI. Prefer the explicit form when writing new values:
+```ts
+await state.setEnv('LOCATION', 'Charlottesville, VA', { type: 'text' });
+await state.setEnv('WEATHER_API_KEY', token, { type: 'secret' });
+```
+In code, read values through `await state.env('NAME')`, check display policy with `await state.envType('NAME')`, and write thread-scoped values with `await state.setEnv('NAME', value, { type: 'text' | 'secret' })`. Use `text` only for values that are safe in prompts, tool output, logs, and errors. Use `secret` for tokens, API keys, credentials, and anything that should be redacted.
+Undeclared thread-only env keys are treated as write-only secrets when scanned for the UI, so the key is visible and editable but arbitrary stored values are not echoed back.
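The `text` / `secret` display policy above implies a redaction pass over anything a tool returns. A minimal sketch of that pass, assuming the tool has already resolved each allowed name into a value and a type: `ResolvedEnv`, `toSandboxEnv`, `redactSecrets`, and the `[REDACTED]` marker are hypothetical names chosen for illustration, not part of the package API.

```ts
type EnvType = "text" | "secret";

// Hypothetical shape: what a tool might hold after resolving the value
// and display type for each allowed env name.
interface ResolvedEnv {
  value: string;
  type: EnvType;
}

// Flattens resolved entries into the plain object a sandbox would receive.
function toSandboxEnv(env: Record<string, ResolvedEnv>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const name of Object.keys(env)) {
    out[name] = env[name].value;
  }
  return out;
}

// Masks every occurrence of a secret value in text bound for output.
function redactSecrets(text: string, env: Record<string, ResolvedEnv>): string {
  const secrets = Object.keys(env)
    .map((name) => env[name])
    // Skip blank entries: splitting on "" would mangle the whole string.
    .filter((entry) => entry.type === "secret" && entry.value.length > 0)
    .map((entry) => entry.value)
    // Longest first, so a secret containing a shorter secret is masked whole.
    .sort((a, b) => b.length - a.length);

  let out = text;
  for (const secret of secrets) {
    out = out.split(secret).join("[REDACTED]");
  }
  return out;
}

const exampleEnv: Record<string, ResolvedEnv> = {
  WEATHER_API_KEY: { value: "sk-12345", type: "secret" },
  LOCATION: { value: "Charlottesville, VA", type: "text" },
};

console.log(redactSecrets("key=sk-12345 city=Charlottesville, VA", exampleEnv));
// → key=[REDACTED] city=Charlottesville, VA
```

The same function would be applied to results, logs, reports, and error messages before they leave the tool. Exact-match masking is a simplification: it does not catch a secret that the executed code has encoded or split across strings.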
 ---
 ## Implementation checking