npm - @zibby/skills - Versions diffs - 0.1.11 → 0.1.12 - Mend

@zibby/skills 0.1.11 → 0.1.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

package/dist/package.json +2 -2
package/docs/cli-reference.md +120 -256
package/docs/cloning-repositories.md +2 -2
package/docs/cloud/bundles.md +92 -0
package/docs/cloud/dedicated-egress.md +140 -0
package/docs/cloud/env-vars.md +144 -0
package/docs/cloud/limits.md +81 -0
package/docs/cloud/logs.md +104 -0
package/docs/cloud/triggering.md +114 -0
package/docs/concepts/agents.md +112 -0
package/docs/concepts/graph.md +83 -0
package/docs/concepts/sessions.md +70 -0
package/docs/concepts/skills.md +84 -0
package/docs/concepts/state.md +106 -0
package/docs/get-started/deploy.md +75 -0
package/docs/get-started/install.md +58 -0
package/docs/get-started/run-locally.md +94 -0
package/docs/get-started/trigger-and-logs.md +90 -0
package/docs/get-started/your-first-workflow.md +66 -0
package/docs/intro.md +37 -65
package/docs/legacy/test-automation.md +110 -0
package/docs/packages/agent-workflow.md +88 -0
package/docs/packages/cli.md +42 -207
package/docs/packages/core.md +40 -224
package/docs/recipes/index.md +62 -0
package/docs/recipes/test.md +154 -0
package/package.json +2 -2

package/docs/concepts/graph.md ADDED

@@ -0,0 +1,83 @@
+---
+sidebar_position: 1
+title: Graph & nodes
+---
+# Graph & nodes
+A **graph** is a directed graph of agent invocations. You declare it in code:
+```js
+import { WorkflowGraph } from '@zibby/agent-workflow';
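+import { z } from '@zibby/core';
+// Plan, Done, and Status are Zod schemas (built with `z`), assumed to be defined elsewhere in this file.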
+const graph = new WorkflowGraph()
+  .addNode('plan',     { prompt: 'List 3 tasks for: {{goal}}', outputSchema: Plan,   agent: 'claude' })
+  .addNode('execute',  { prompt: 'Do task: {{task}}',           outputSchema: Done,   agent: 'cursor' })
+  .addNode('verify',   { prompt: 'Verify: {{result}}',          outputSchema: Status, agent: 'codex'  })
+  .addEdge('plan', 'execute')
+  .addEdge('execute', 'verify')
+  .setEntryPoint('plan');
+```
+## Node config
+Every node accepts:
+| Field | Required | Description |
+|---|---|---|
+| `prompt` | yes | Either a string template (`{{state.X}}` interpolated) or a function `({ input, state }) => string`. |
+| `outputSchema` | yes | A Zod schema. The node's output is validated against this before downstream nodes see it. Validation failure = node failure. |
+| `agent` | no | Agent strategy override, one of `'cursor'`, `'claude'`, `'codex'`, `'gemini'`, or `'assistant'`. Falls back to the project default if omitted. |
+| `retries` | no | Number of times to retry on failure (default: 0). |
+| `skills` | no | Array of skill IDs to enable for this node — see [Skills](./skills). |
+| `onComplete` | no | Async callback invoked with the node's validated output. |
+## Edges
+Three forms:
+```js
+// Linear: A → B
+graph.addEdge('A', 'B');
+// Conditional branching: state-driven routing
+graph.addConditionalEdges('classify', (state) => {
+  if (state.classify.severity === 'critical') return 'pageOncall';
+  if (state.classify.severity === 'high')     return 'createIncident';
+  return 'logAndExit';
+});
+// Multi-conditional with named labels (cleaner for many branches)
+graph.addConditionalEdges('classify', {
+  routes: (state) => state.classify.severity,
+  labels: { critical: 'pageOncall', high: 'createIncident', _default: 'logAndExit' },
+});
+```
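The labeled form above can be read as a lookup with a fallback. A minimal sketch of that reading, assuming `_default` is used whenever the `routes` result has no matching label (`resolveRoute` is a hypothetical helper, not part of `@zibby/agent-workflow`):

```javascript
// Sketch only: resolveRoute is a hypothetical helper, not a library API.
function resolveRoute({ routes, labels }, state) {
  const key = routes(state);
  // Only own properties count as labels, so inherited keys never match.
  const hasLabel = Object.prototype.hasOwnProperty.call(labels, key);
  const target = hasLabel ? labels[key] : labels._default;
  if (target === undefined) {
    throw new Error(`No label for "${key}" and no _default fallback`);
  }
  return target;
}

const edge = {
  routes: (state) => state.classify.severity,
  labels: { critical: 'pageOncall', high: 'createIncident', _default: 'logAndExit' },
};

console.log(resolveRoute(edge, { classify: { severity: 'critical' } })); // pageOncall
console.log(resolveRoute(edge, { classify: { severity: 'low' } }));      // logAndExit
```

Keeping the mapping in a plain object makes the set of possible targets visible at a glance, which is harder with a chain of `if` statements.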
+## Entry point
+Exactly one node is the entry point:
+```js
+graph.setEntryPoint('plan');
+```
+This is the node that runs first when `graph.run()` is called.
+## Running
+```js
+const { state } = await graph.run(agent, {
+  input: { goal: 'add a dark-mode toggle' },
+  agentType: 'cursor',
+});
+console.log(state.verify.status);
+```
+Each node's output lands at `state[nodeName]`, so downstream prompts can reference `{{state.plan.tasks}}` or `state.plan.tasks` from a function prompt.
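To make the "output lands at `state[nodeName]`" rule concrete, here is a toy linear runner. It is a sketch under stated assumptions, not the library's engine: `MiniGraph` is a hypothetical class, nodes are plain synchronous functions instead of agent calls, and cycles are rejected only so the loop is guaranteed to end.

```javascript
// Sketch only: MiniGraph is a hypothetical stand-in for illustration.
class MiniGraph {
  constructor() {
    this.nodes = new Map(); // node name -> function(state) => output
    this.edges = new Map(); // node name -> next node name (linear edges only)
    this.entry = null;
  }
  addNode(name, run) { this.nodes.set(name, run); return this; }
  addEdge(from, to) { this.edges.set(from, to); return this; }
  setEntryPoint(name) { this.entry = name; return this; }

  run(input) {
    if (!this.nodes.has(this.entry)) throw new Error('Entry point is not a registered node');
    let state = { input };
    const visited = new Set();
    for (let name = this.entry; name !== undefined; name = this.edges.get(name)) {
      if (visited.has(name)) throw new Error(`Cycle detected at "${name}"`);
      visited.add(name);
      const node = this.nodes.get(name);
      if (!node) throw new Error(`Unknown node "${name}"`);
      // Each output is stored under the node's own name in a new state object.
      state = { ...state, [name]: node(state) };
    }
    return { state };
  }
}

const { state } = new MiniGraph()
  .addNode('plan', ({ input }) => ({ tasks: [`draft ${input.goal}`] }))
  .addNode('execute', (s) => ({ done: s.plan.tasks.length }))
  .addEdge('plan', 'execute')
  .setEntryPoint('plan')
  .run({ goal: 'dark mode' });

console.log(state.execute.done); // 1
```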
+## State immutability
+State is append-only: every node completion creates a new state object. Earlier values are preserved in the state's `_history` array, and `state.rollback(N)` drops the latest N writes, which is useful for retries.

package/docs/concepts/sessions.md ADDED

@@ -0,0 +1,70 @@
+---
+sidebar_position: 5
+title: Sessions & artifacts
+---
+# Sessions & artifacts
+Every workflow run — local or cloud — produces a **session folder** under `.zibby/output/sessions/<sessionId>/`. The folder is the canonical record of what happened.
+## What's inside
+```
+.zibby/output/sessions/1777678254943_ymcw/
+├── result.json              # final structured output (Zod-validated)
+├── raw_stream_output.txt    # every byte the agent emitted, raw
+├── events.json              # JSONL execution log: per-node start/end, validation, retries
+├── .session-info.json       # session metadata (start time, agent, model, workflow type)
+└── nodes/
+    ├── plan/
+    │   ├── prompt.txt         # exact prompt sent to the agent
+    │   ├── raw_output.txt     # what the agent returned, pre-validation
+    │   └── result.json        # the validated output
+    ├── implement/...
+    └── verify/...
+```
+## Session IDs
+Format: `<unix_ms>_<random>`. Generated when the graph starts running.
+You can override via:
+```bash
+zibby workflow run my-pipeline --session 1777678254943_ymcw
+```
+Useful for replay: re-run from a saved input/state without paying for earlier nodes again.
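For scripts that need to create or read IDs in the `<unix_ms>_<random>` format, a minimal sketch follows. The four lowercase letters in the suffix are an assumption inferred from the example ID above, and both function names are hypothetical.

```javascript
// Sketch only: suffix length and alphabet are guesses based on "1777678254943_ymcw".
function makeSessionId(now = Date.now(), rand = Math.random) {
  const alphabet = 'abcdefghijklmnopqrstuvwxyz';
  let suffix = '';
  for (let i = 0; i < 4; i++) {
    suffix += alphabet[Math.floor(rand() * alphabet.length)];
  }
  return `${now}_${suffix}`;
}

// Split an ID back into its start time and random suffix.
function parseSessionId(id) {
  const match = /^(\d+)_([a-z]+)$/.exec(id);
  if (!match) throw new Error(`Not a session ID: ${id}`);
  return { startedAt: new Date(Number(match[1])), suffix: match[2] };
}

console.log(makeSessionId());
console.log(parseSessionId('1777678254943_ymcw').suffix); // ymcw
```

Because the timestamp comes first, session folders can be ordered chronologically by numeric prefix.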
+## Replay a session
+```bash
+zibby workflow run my-pipeline \
+  --session 1777678254943_ymcw \
+  --node verify   # only re-run the 'verify' node
+```
+The graph reads `state.plan` and `state.implement` from the saved session and only invokes `verify` again, which makes iterating on the last node cheap when debugging.
+## Cloud sessions
+Cloud runs land in the same `.zibby/output/sessions/` layout, just inside the ECS container's `/workspace/`. The session folder is uploaded to S3 at the end of each run; `zibby workflow logs <uuid> -t` streams CloudWatch logs in real time.
+To download a finished cloud session locally:
+```bash
+zibby workflow download <uuid>
+```
+This pulls the workflow source (so you can edit and redeploy) plus the most recent execution's session folder.
+## Studio integration
+[Zibby Studio](https://zibby.app) is a desktop UI that watches `.zibby/output/sessions/`. Anything that writes a session folder shows up in Studio automatically — pin a session, watch state evolve live, or stop a workflow from the Stop button.
+The protocol is documented and stable:
+- `__WORKFLOW_GRAPH_LOG__` markers in stdout signal node begin/end events
+- `.zibby-studio-stop` file is the kill switch — Studio writes it, the runtime checks for it between nodes
+- `ZIBBY_RUN_SOURCE=studio` env var tells the runtime "you were spawned by Studio"
+- `stoppedByStudio: true` is returned from `graph.run()` when the kill switch fires
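The kill-switch item above amounts to a check between nodes. A rough sketch of that control flow, assuming nodes run sequentially; `runSequence` and `shouldStop` are hypothetical names, and the real runtime's file location and check timing are not specified here.

```javascript
// Sketch only: shouldStop is injected so the example runs without touching the filesystem.
// A real check might be: () => fs.existsSync('.zibby-studio-stop') (path is an assumption).
function runSequence(nodes, { shouldStop = () => false, input = {} } = {}) {
  let state = { input };
  for (const { name, run } of nodes) {
    // The check happens before each node, so a node in progress is never interrupted.
    if (shouldStop()) return { state, stoppedByStudio: true };
    state = { ...state, [name]: run(state) };
  }
  return { state, stoppedByStudio: false };
}

let checks = 0;
const result = runSequence(
  [
    { name: 'plan', run: () => ({ tasks: ['t1'] }) },
    { name: 'verify', run: () => ({ status: 'ok' }) },
  ],
  { shouldStop: () => checks++ >= 1 } // simulate the stop file appearing after the first node
);

console.log(result.stoppedByStudio); // true
console.log(Object.keys(result.state)); // [ 'input', 'plan' ]
```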

package/docs/concepts/skills.md ADDED

@@ -0,0 +1,84 @@
+---
+sidebar_position: 4
+title: Skills
+---
+# Skills
+A **skill** is a named bundle of MCP tools (and optional prompt fragments) that a node can opt into. Skills let you compose tool access per-node without giving every node every tool.
+## Built-in skills
+`@zibby/skills` ships these:
+| Skill ID | What it adds |
+|---|---|
+| `browser` | Playwright MCP — browse, click, fill, screenshot |
+| `github` | GitHub MCP — issues, PRs, file edits, branches |
+| `jira` | Jira MCP — tickets, comments, transitions |
+| `slack` | Slack MCP — read/write channels, threads, DMs |
+| `memory` | Test memory database — version-controlled (Dolt) knowledge from prior runs |
+## Enabling on a node
+```js
+import { registerSkill } from '@zibby/agent-workflow';
+import { browserSkill } from '@zibby/skills';
+registerSkill(browserSkill);
+graph.addNode('research', {
+  prompt: 'Find pricing for {{input.product}}',
+  outputSchema: Price,
+  agent: 'cursor',
+  skills: ['browser'],
+});
+```
+Two effects:
+1. The agent gets the Browser MCP tools at this node only.
+2. The skill's prompt fragment (telling the agent how to use the tools) is appended to the prompt.
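The two effects can be pictured with a small in-memory registry. This is an illustrative sketch, not the `registerSkill` implementation: `registerSkillSketch` and `resolveNode` are hypothetical, and joining fragments with a blank line is an assumption.

```javascript
// Sketch only: a toy registry showing per-node tool scoping and prompt fragments.
const registry = new Map();

function registerSkillSketch(skill) {
  registry.set(skill.id, skill);
}

function resolveNode(node) {
  const skills = (node.skills ?? []).map((id) => {
    const skill = registry.get(id);
    if (!skill) throw new Error(`Skill "${id}" is not registered`);
    return skill;
  });
  return {
    // Effect 1: only the tools of the listed skills are exposed to this node.
    tools: skills.flatMap((s) => s.tools ?? []),
    // Effect 2: each skill's fragment is appended after the node's own prompt.
    prompt: [node.prompt, ...skills.map((s) => s.promptFragment).filter(Boolean)].join('\n\n'),
  };
}

registerSkillSketch({
  id: 'pdf',
  tools: ['pdfExtract', 'pdfRender'],
  promptFragment: 'When working with PDFs, use the pdfExtract tool first.',
});

console.log(resolveNode({ prompt: 'Summarize the report', skills: ['pdf'] }).tools);
console.log(resolveNode({ prompt: 'Plan the work' }).tools); // []
```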
+## Custom skills
+Implement the `Skill` shape:
+```js
+import { registerSkill } from '@zibby/agent-workflow';
+registerSkill({
+  id: 'pdf',
+  serverName: 'pdf-mcp',
+  tools: ['pdfExtract', 'pdfRender'],
+  promptFragment: 'When working with PDFs, use the pdfExtract tool first.',
+});
+```
+Now any node can opt in via `skills: ['pdf']`.
+## Why per-node, not per-graph?
+Two reasons:
+- **Tool scoping** — a node that's planning shouldn't have Slack write access; a node that's posting status shouldn't have file-write tools. Per-node skills give you least-privilege automatically.
+- **Prompt size** — every enabled skill adds prompt fragments. Don't pay for tools you won't use.
+## Function skills (no MCP server needed)
+For lightweight cases, register a function skill:
+```js
+import { registerSkill } from '@zibby/agent-workflow';
+registerSkill({
+  id: 'pricing',
+  type: 'function',
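+  fn: async ({ product }) => {
+    // Encode the product name so spaces and symbols are safe in the query string.
+    const r = await fetch(`https://my.api/price?p=${encodeURIComponent(product)}`);
+    return r.json();
+  },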
+  description: 'Look up product pricing.',
+});
+```
+The agent sees `pricing` as a callable tool and gets your function's return value back as a tool result. There is no MCP server to spin up.

package/docs/concepts/state.md ADDED

@@ -0,0 +1,106 @@
+---
+sidebar_position: 3
+title: State & schema
+---
+# State & schema
+Every workflow run carries one **state** object that flows from node to node. State is the only handoff mechanism: there are no shared globals and no hidden context.
+## Shape
+```js
+state = {
+  input: { /* whatever the trigger passed */ },
+  plan: { tasks: ['t1', 't2'] },        // ← from node 'plan'
+  implement: { diff: '...' },            // ← from node 'implement'
+  verify: { status: 'ok' },              // ← from node 'verify'
+}
+```
+When a node completes, its validated output lands at `state[nodeName]`.
+## Schema-validated handoff
+Every node's `outputSchema` is a Zod schema. It runs *before* downstream nodes see the output:
+```js
+import { z } from '@zibby/core';
+const Plan = z.object({
+  tasks: z.array(z.string()).min(1),
+  priority: z.enum(['low', 'normal', 'high']),
+});
+graph.addNode('plan', {
+  prompt: 'Triage this ticket.',
+  outputSchema: Plan,
+  agent: 'claude',
+});
+```
+If the agent returns malformed output, the node fails. Combined with `retries: N`, you get cheap automatic recovery from one-off LLM hallucinations.
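How validation and `retries` interact can be sketched as a loop. The code below is an assumption-laden illustration: `validatePlan` is a hand-written stand-in that mirrors the `Plan` schema and returns a `{ success, data }` / `{ success, error }` result in the style of Zod's `safeParse`, `runWithRetries` is hypothetical, and the agent call is synchronous for brevity.

```javascript
// Sketch only: a hand-rolled validator standing in for a Zod schema.
function validatePlan(value) {
  const priorities = ['low', 'normal', 'high'];
  const ok =
    value !== null && typeof value === 'object' &&
    Array.isArray(value.tasks) && value.tasks.length >= 1 &&
    value.tasks.every((t) => typeof t === 'string') &&
    priorities.includes(value.priority);
  return ok
    ? { success: true, data: value }
    : { success: false, error: 'output does not match Plan' };
}

// retries = N means up to N extra attempts after the first one.
function runWithRetries(callAgent, validate, retries = 0) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    const result = validate(callAgent(attempt));
    if (result.success) return { output: result.data, attempts: attempt + 1 };
    lastError = result.error;
  }
  throw new Error(`Node failed after ${retries + 1} attempt(s): ${lastError}`);
}

// First answer is malformed (empty task list, no priority); the second is valid.
const answers = [{ tasks: [] }, { tasks: ['t1'], priority: 'high' }];
const outcome = runWithRetries((i) => answers[i], validatePlan, 1);
console.log(outcome.attempts); // 2
```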
+Downstream nodes can reference plan output directly:
+```js
+graph.addNode('implement', {
+  prompt: ({ state }) => `Implement these tasks:\n${state.plan.tasks.map(t => `- ${t}`).join('\n')}`,
+  outputSchema: Implementation,
+  agent: 'cursor',
+});
+```
+## Function vs. template prompts
+Two prompt forms:
+```js
+// String template — variables interpolated from state.
+graph.addNode('plan', { prompt: 'Plan: {{input.goal}}', ... });
+// Function — full programmatic control.
+graph.addNode('plan', {
+  prompt: ({ input, state }) => {
+    const ctx = state.context?.summary ?? '';
+    return `Plan: ${input.goal}\n\nContext:\n${ctx}`;
+  },
+  ...
+});
+```
+Functions get `{ input, state, getAll, get }` so they can introspect state without throwing on missing keys.
+## Skill hints
+If a node opts into [skills](./skills), the framework appends prompt fragments that tell the agent how to use those tools. You don't write that boilerplate — register the skill once, list it on the node:
+```js
+graph.addNode('search', {
+  prompt: 'Find info about {{input.query}}',
+  outputSchema: Results,
+  agent: 'cursor',
+  skills: ['browser'],   // appends Browser MCP usage instructions to the prompt
+});
+```
+## Rollback
+State is history-tracked. To revert the last N writes (typical use case: a node's output passes Zod validation but the *content* is wrong, and you want to retry with extra instructions):
+```js
+const recovered = state.rollback(2);
+```
+`rollback` returns a fresh state with the last 2 writes dropped. Use it inside `onComplete` callbacks or custom retry logic.
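One way to picture history-tracked state is a chain of immutable snapshots. The sketch below is a guess at the general idea rather than the library's data structure: `createState` and `write` are hypothetical, and storing whole snapshots in `_history` is an assumption.

```javascript
// Sketch only: every write returns a new frozen state; _history keeps prior snapshots.
function createState(values = {}, history = []) {
  return Object.freeze({
    ...values,
    _history: history,
    write(nodeName, output) {
      return createState({ ...values, [nodeName]: output }, [...history, values]);
    },
    rollback(n) {
      // Clamp n so that rolling back too far simply returns the initial state.
      const drop = Math.min(Math.max(n, 0), history.length);
      if (drop === 0) return createState(values, history);
      const keep = history.length - drop;
      return createState(history[keep], history.slice(0, keep));
    },
  });
}

const s0 = createState({ input: { goal: 'dark mode' } });
const s1 = s0.write('plan', { tasks: ['t1'] });
const s2 = s1.write('implement', { diff: '...' });

const recovered = s2.rollback(1);
console.log(recovered.plan.tasks); // [ 't1' ]
console.log(recovered.implement);  // undefined
console.log(s2.implement.diff);    // '...' (the original state is untouched)
```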
+## Reading state in templates
+Inside the `prompt` string, `{{x}}` resolves first against `state.x`, then against `input.x`. Dotted paths work:
+```
+{{state.plan.tasks.length}}
+{{input.ticket}}
+```
+Function prompts give you the same access without the templating layer.
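The lookup order described above can be sketched as a small renderer. This is not the framework's template engine: `renderTemplate` and `getPath` are hypothetical, explicit `state.` and `input.` prefixes are assumed to select the source directly, and unresolved placeholders are left as-is by choice.

```javascript
// Sketch only: walk a dotted path, returning undefined if any segment is missing.
function getPath(obj, path) {
  return path.split('.').reduce((cur, key) => (cur == null ? undefined : cur[key]), obj);
}

function renderTemplate(template, { state = {}, input = {} }) {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match, path) => {
    let value;
    if (path.startsWith('state.')) value = getPath(state, path.slice('state.'.length));
    else if (path.startsWith('input.')) value = getPath(input, path.slice('input.'.length));
    else value = getPath(state, path) ?? getPath(input, path); // state first, then input
    return value === undefined ? match : String(value);
  });
}

const rendered = renderTemplate(
  '{{state.plan.tasks.length}} tasks for {{input.ticket}} ({{goal}})',
  { state: { plan: { tasks: ['a', 'b'] } }, input: { ticket: 'BUG-123', goal: 'triage' } }
);
console.log(rendered); // 2 tasks for BUG-123 (triage)
```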

package/docs/get-started/deploy.md ADDED

@@ -0,0 +1,75 @@
+---
+sidebar_position: 4
+title: 4. Deploy to cloud
+pagination_prev: get-started/run-locally
+pagination_next: get-started/trigger-and-logs
+---
+# Ship a workflow to Zibby Cloud
+```bash
+zibby workflow deploy my-pipeline
+```
+If you have multiple projects, the CLI prompts you to pick one. On success it prints a UUID:
+```
+✔ Deployed my-pipeline (v1)
+✔ Bundle ready (78s) — runtime npm install eliminated
+  UUID:        2b1ea07f-3ede-4bfd-a51d-431f0bab008e
+  Next steps:
+    zibby workflow run my-pipeline         Run locally
+    zibby workflow trigger 2b1ea07f-...    Run in cloud
+    zibby workflow list                    View all workflows
+```
+The UUID is **canonical** — it never changes once issued. All cloud commands (`trigger`, `logs`, `download`, `delete`) take that UUID as their identifier. The CLI caches it in `.zibby/workflows/my-pipeline/.zibby-deploy.json` so you don't have to remember it. Commit that file to git; collaborators share the same canonical reference.
+## What deploy actually does
+Two phases:
+1. **Source upload** — your workflow folder (sources only, no `node_modules`) is uploaded as a JSON payload to S3 via a presigned URL. The CLI also resolves your `.zibby.config.mjs` (if present at project root) and ships it inside the bundle as `zibby.config.json`, so the cloud sees the same config as your local runs.
+2. **Bundle build (Heroku-style)** — a CodeBuild job downloads the sources, runs `npm install --omit=dev`, packages the result as a tarball, and uploads it to S3. The tarball is what each cloud execution downloads at trigger time, so there's **no `npm install` at runtime** — workflows boot in seconds.
+You'll see a live spinner with the active build step:
+```
+⠹ Building bundle on Zibby Cloud... [3/4] Installing dependencies — 32s
+```
+Pass `--verbose` if you want to see raw CodeBuild logs.
+## Re-deploys keep the same UUID
+Once a workflow has a UUID, every subsequent `deploy` increments the version but keeps the UUID stable:
+```
+v1 → v2 → v3 → ...    (same UUID, same trigger URL)
+```
+So your `curl` calls and CI integrations don't break across deploys.
+## Naming vs. UUIDs
+A clean mental model:
+- **Workflow folder name** (`my-pipeline`) is *local*: used by `workflow new`, `run`, `start`, `deploy`. It's just a directory name.
+- **UUID** (`2b1ea07f-...`) is *canonical*: used by `trigger`, `logs`, `download`, `delete`. Stable across deploys.
+`workflow list` shows both:
+```
+┌─────────────────────────────────────┬──────────────┬──────────┬─────┐
+│ UUID                                │ Name         │ Project  │ Ver │
+├─────────────────────────────────────┼──────────────┼──────────┼─────┤
+│ 2b1ea07f-3ede-4bfd-a51d-431f0bab008e│ my-pipeline  │ Zibby UI │ 3   │
+│ -                                   │ scratchpad   │ -        │ -   │
+└─────────────────────────────────────┴──────────────┴──────────┴─────┘
+```
+`-` in the UUID column means a local-only workflow that hasn't been deployed yet.
+→ Next: [Trigger & tail logs](./trigger-and-logs)

package/docs/get-started/install.md ADDED

@@ -0,0 +1,58 @@
+---
+sidebar_position: 1
+title: 1. Install
+pagination_prev: intro
+pagination_next: get-started/your-first-workflow
+---
+# Install the CLI
+```bash
+npm install -g @zibby/cli
+```
+Verify:
+```bash
+zibby --version
+```
+You should see `zibby v0.1.x`.
+## No global install? Use npx
+Every example below works with `npx @zibby/cli` instead of `zibby`:
+```bash
+npx @zibby/cli workflow new my-pipeline
+```
+The first call downloads the package; subsequent calls use the cached copy.
+## Log in
+Cloud features (deploy, trigger, logs) need authentication:
+```bash
+zibby login
+```
+This opens your browser and writes a session token to `~/.zibby/config.json`. The session stays valid for 30 days.
+For CI/CD, set `ZIBBY_API_KEY` in the environment instead — the CLI prefers that over the saved session.
+## Pick an agent runtime
+Workflows hand off to external coding-agent CLIs at runtime. You'll need at least one of these installed (locally — the cloud runtime has them pre-installed):
+| Agent | How to install | Auth |
+|---|---|---|
+| **Cursor** | Install [Cursor](https://cursor.com), or `npm i -g cursor-agent` | `cursor-agent login` (or `CURSOR_API_KEY`) |
+| **Claude Code** | `npm i -g @anthropic-ai/claude-agent-sdk` | `ANTHROPIC_API_KEY` |
+| **Codex** | `npm i -g @openai/codex` | `OPENAI_API_KEY` |
+| **Gemini** | `npm i -g @google/gemini-cli` | `GOOGLE_API_KEY` |
+| **Assistant** | None — uses OpenAI Assistants API | `OPENAI_API_KEY` |
+You only need one. The cloud runtime supports all five out of the box.
+→ Next: [Your first workflow](./your-first-workflow)

package/docs/get-started/run-locally.md ADDED

@@ -0,0 +1,94 @@
+---
+sidebar_position: 3
+title: 3. Run it locally
+pagination_prev: get-started/your-first-workflow
+pagination_next: get-started/deploy
+---
+# Run a workflow locally
+```bash
+zibby workflow run my-pipeline
+```
+One-shot: loads `graph.mjs`, instantiates your `WorkflowAgent` class, runs the graph against a real agent, prints results, exits. Output lands in `.zibby/output/sessions/<sessionId>/`:
+- `result.json` — the final structured output (Zod-validated)
+- `raw_stream_output.txt` — every byte the agent emitted
+- `events.json` — JSONL execution log: which node ran when, what it received, what it returned, retries
+- `.session-info.json` — session metadata
+The flag surface mirrors `zibby workflow trigger` (cloud) on purpose: the call you make locally is the same call you make against the cloud, just with the verb flipped.
+## Pass input
+Most workflows take input. Edit `graph.mjs` to define an input schema and reference `state.input`:
+```js
+graph.addNode('plan', {
+  prompt: ({ input }) => `Triage this ticket: ${input.ticket}`,
+  outputSchema: z.object({ tasks: z.array(z.string()) }),
+  agent: 'claude',
+});
+```
+Then pass input via `--input`:
+```bash
+zibby workflow run my-pipeline --input '{"ticket":"BUG-123"}'
+```
+Or `--param key=value` (repeatable, dot-notation supported):
+```bash
+zibby workflow run my-pipeline -p ticket=BUG-123 -p priority=high
+zibby workflow run my-pipeline -p user.name=Alice -p user.role=admin
+```
+Or `--input-file payload.json` for larger payloads.
+Precedence: `--param` > `--input` > `--input-file`. Same as `trigger`.
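The precedence rule and dot-notation can be illustrated with a small merge function. This is a sketch of one plausible reading, not the CLI's code: `buildInput` and `setPath` are hypothetical, the merge of the two JSON sources is assumed to be shallow, and `--param` values are kept as strings.

```javascript
// Sketch only: set a value at a dotted path, copying nested objects on the way down.
function setPath(target, dottedKey, value) {
  const keys = dottedKey.split('.');
  let cur = target;
  for (const key of keys.slice(0, -1)) {
    const existing = cur[key];
    cur[key] = existing !== null && typeof existing === 'object' ? { ...existing } : {};
    cur = cur[key];
  }
  cur[keys[keys.length - 1]] = value;
}

// Lowest precedence first: --input-file, then --input, then each --param.
function buildInput({ fileInput = {}, jsonInput = {}, params = [] } = {}) {
  const merged = { ...fileInput, ...jsonInput };
  for (const param of params) {
    const eq = param.indexOf('=');
    if (eq <= 0) throw new Error(`Expected key=value, got "${param}"`);
    setPath(merged, param.slice(0, eq), param.slice(eq + 1));
  }
  return merged;
}

const input = buildInput({
  fileInput: { ticket: 'FILE-1', priority: 'low' },
  jsonInput: { ticket: 'JSON-2' },
  params: ['ticket=BUG-123', 'user.name=Alice', 'user.role=admin'],
});
console.log(input);
// { ticket: 'BUG-123', priority: 'low', user: { name: 'Alice', role: 'admin' } }
```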
+## Iterating
+Each run is a fresh process — re-edit `graph.mjs` or any node file and re-run. There's no daemon to restart; tool errors and crashes don't leave a server hanging on port 3848.
+If you want auto-rerun on file change, run it under `nodemon`:
+```bash
+npx nodemon --ext mjs,js --exec "zibby workflow run my-pipeline -p ticket=BUG-123"
+```
+Or add it to your workflow's `package.json`:
+```json
+{
+  "scripts": {
+    "dev": "nodemon --ext mjs,js --exec \"zibby workflow run my-pipeline\""
+  }
+}
+```
+Then `npm run dev`.
+## Inspect a run
+After a run finishes, the CLI prints the session ID. Open the session folder to inspect every step:
+```bash
+ls .zibby/output/sessions/<sessionId>/
+cat .zibby/output/sessions/<sessionId>/result.json | jq
+```
+## Studio (long-lived server, optional)
+Studio is a desktop UI for browsing live and past runs. It connects to a local HTTP server, so when you use Studio you start that instead:
+```bash
+zibby workflow start my-pipeline   # long-lived server on :3848 (Studio talks to this)
+zibby studio                       # launch the desktop app
+```
+For CLI-only iteration, stick with `zibby workflow run`.
+→ Next: [Deploy to cloud](./deploy)

package/docs/get-started/trigger-and-logs.md ADDED

@@ -0,0 +1,90 @@
+---
+sidebar_position: 5
+title: 5. Trigger & tail logs
+pagination_prev: get-started/deploy
+---
+# Run a deployed workflow and watch logs
+## Trigger
+```bash
+zibby workflow trigger 2b1ea07f-3ede-4bfd-a51d-431f0bab008e
+```
+The CLI returns immediately with a job ID:
+```
+✔ Workflow triggered successfully
+  Job Details:
+    Job ID:    ee333411-22e6-4733-b790-d480af3f662e
+    Status:    running
+    Version:   3
+    Triggered: 02/05/2026, 10:23:13
+  Monitor execution:
+    zibby workflow logs 2b1ea07f-...
+    zibby workflow logs 2b1ea07f-... -t
+```
+Pass input the same way as locally:
+```bash
+zibby workflow trigger <uuid> -p ticket=BUG-123
+zibby workflow trigger <uuid> --input '{"ticket":"BUG-123"}'
+zibby workflow trigger <uuid> --input-file ./input.json
+```
+If you omit the UUID, the CLI shows an interactive picker over your deployed workflows.
+## Tail logs (Heroku-style)
+```bash
+zibby workflow logs 2b1ea07f-3ede-4bfd-a51d-431f0bab008e -t
+```
+`-t` follows live. Without `-t` it dumps the last execution and exits.
+```
+  Streaming logs for workflow 2b1ea07f-3ede-4bfd-a51d-431f0bab008e...
+  Press Ctrl+C to stop.
+2026-05-02 23:30:51.345  zibby v0.1.x
+────────────────────────────────────────────────────────────
+ Workflow:  my-pipeline
+ Job:       ee333411-...
+ Project:   6b60049d-...
+ Agent:     cursor (model: auto)
+────────────────────────────────────────────────────────────
+[setup] Bundle extracted (3.2s)
+[setup] Loaded MyPipelineWorkflow
+[setup] Registered 5 agent strategies (...)
+┌ example
+│ ◆ Model: auto | key: ***bc97
+│ ...
+└ done 19.4s
+[done] my-pipeline completed in 19.4s
+```
+The stream covers the full execution end-to-end — agent reasoning, tool calls, schema validation, completion summary.
+## Multiple executions
+Workflow UUIDs follow the workflow, not a single run. After one execution finishes, `logs -t` waits for the next trigger of the same workflow and auto-switches to streaming it. Like `heroku logs --tail`.
+To exit after the current run, press Ctrl+C; there's no "exit on completion" mode (yet).
+## Triggering from anywhere
+The CLI is the easiest way, but workflows expose an HTTP API for programmatic triggering — see [Cloud → Triggering programmatically](../cloud/triggering).
+## You're done
+You've now scaffolded, run locally, deployed, triggered, and tailed a workflow. The rest of the docs go deeper:
+- [Concepts](../concepts/graph) — how the graph engine works
+- [CLI Reference](../cli-reference) — every command
+- [Cloud](../cloud/triggering) — HTTP triggers, log archives, bundle internals
+- [Packages](../packages/agent-workflow) — package-level docs