npm - @zibby/cli - Versions diffs - 0.4.16 → 0.4.18 - Mend

@zibby/cli 0.4.16 → 0.4.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (40)

package/dist/templates/.claude/commands/add-skill.md ADDED

@@ -0,0 +1,83 @@
+---
+description: Add a custom MCP skill to a Zibby workflow
+argument-hint: <workflow-name> <skill-purpose-or-mcp-server-name>
+---
+# /add-skill
+The user wants to add a custom skill (MCP tool bundle) to a workflow.
+**Arguments:** $ARGUMENTS
+## Steps
+1. **Identify the MCP server** the skill wraps:
+   - If the user named one (e.g. "slack", "linear", "filesystem"), find
+     the official MCP server. Standard ones live at
+     `@modelcontextprotocol/server-<name>`.
+   - If unsure, ask: "Which MCP server should this skill wrap, or
+     should it be a JS-only middleware?"
+2. **Find the workflow** at `.zibby/workflows/<name>/`. Create a
+   `skills/` subfolder if it doesn't exist.
+3. **Write `skills/<id>.mjs`:**
+   ```js
+   import { registerSkill } from '@zibby/agent-workflow';
+   registerSkill({
+     id: 'slack',                       // referenced by node `skills: ['slack']`
+     serverName: 'slack-mcp',
+     command: 'npx',
+     args: ['-y', '@modelcontextprotocol/server-slack'],
+     allowedTools: ['mcp__slack__*'],   // pattern of tools the agent gets access to
+     envKeys: ['SLACK_BOT_TOKEN'],
+     description: 'Read channels, post messages, search history',
+   });
+   ```
+4. **Import the skill file from `graph.mjs`** at the TOP, before
+   `new WorkflowGraph()`:
+   ```js
+   import './skills/slack.mjs';        // side-effect: registers the skill
+   import { WorkflowGraph } from '@zibby/agent-workflow';
+   // ...
+   ```
+5. **Opt nodes into the skill:**
+   ```js
+   graph.addNode('post_summary', {
+     ...postSummaryNode,
+     skills: ['slack'],                // ← agent gets slack tools here
+   });
+   ```
+6. **Document the env requirement.** Add to the workflow's README or
+   tell the user which env var they need to set:
+   - Locally: `export SLACK_BOT_TOKEN=xoxb-...` or put in `.env`
+   - Cloud: `zibby workflow env set <workflow> SLACK_BOT_TOKEN=...`
+7. **Validate + test:**
+   ```bash
+   zibby workflow validate <name>
+   zibby workflow run <name> -p ...
+   ```
+   The agent should now have access to `mcp__slack__*` tools in the
+   nodes that opted in.
+## When NOT to use a custom skill
+- If the work can be done with plain Node.js (HTTP call, file write,
+  git command) — use a custom-code node with `execute()` instead. MCP
+  skills are for tool surfaces the agent decides to use, not for
+  deterministic glue.
+## When to use `middleware` instead of MCP
+If you don't have an MCP server but want to attach a JS helper that
+nodes can use, see the "Custom skill via a non-MCP function" section
+of `.claude/CLAUDE.md`. This is rare — prefer MCP when one exists.
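
Note on `allowedTools` in the file above: the diff describes `'mcp__slack__*'` only as a "pattern of tools" and does not show how the runtime expands it. The following is a minimal sketch, assuming `*` stands for any run of characters and everything else matches literally. `toolPatternToRegExp` and `isToolAllowed` are hypothetical helpers, not part of `@zibby/agent-workflow`, and the tool names are illustrative only.

```javascript
// Hypothetical helper (not from @zibby/agent-workflow): turn a pattern such as
// 'mcp__slack__*' into an anchored RegExp. '*' matches any run of characters;
// every other character is escaped and matched literally.
function toolPatternToRegExp(pattern) {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// Returns true when toolName matches at least one pattern in allowedTools.
function isToolAllowed(toolName, allowedTools) {
  return allowedTools.some((pattern) => toolPatternToRegExp(pattern).test(toolName));
}

// Illustrative tool names; the real server may expose different ones.
console.log(isToolAllowed('mcp__slack__post_message', ['mcp__slack__*']));  // true
console.log(isToolAllowed('mcp__linear__create_issue', ['mcp__slack__*'])); // false
```

Under this reading, a node that opts into `skills: ['slack']` would see every tool whose name starts with `mcp__slack__` and nothing from other servers.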

package/dist/templates/.claude/commands/new-workflow.md ADDED

@@ -0,0 +1,61 @@
+---
+description: Scaffold a new Zibby workflow from a natural-language description
+argument-hint: <description of what the workflow should do>
+---
+# /new-workflow
+You're about to create a new Zibby workflow. The user's request is:
+**$ARGUMENTS**
+## Steps
+1. **Sketch the graph.** Based on the user's request, decide:
+   - How many nodes? (Typically 2-5. More than 7 is usually a sign of
+     overdesign — collapse adjacent nodes.)
+   - Which nodes need an LLM (judgement, generation) vs custom-code
+     (deterministic: git ops, HTTP, file IO)?
+   - What's the linear sequence vs conditional branching?
+   - What's the final output the user cares about?
+2. **Pick a workflow name** — kebab-case, ≤24 chars, descriptive.
+   Examples: `code-review`, `pr-summary`, `nightly-changelog`.
+3. **Run the scaffold:**
+   ```bash
+   zibby workflow new <name>
+   ```
+   This creates `.zibby/workflows/<name>/` with starter files.
+4. **Edit the files** in this order (read CLAUDE.md §1 if you've forgotten the shapes):
+   - `workflow.json` — set `name`, `description`, `defaultAgent`
+   - `nodes/*.mjs` — one file per node, each with `name`, `outputSchema`
+     (Zod), and either `prompt` (LLM) or `execute` (custom-code)
+   - `graph.mjs` — wire them up with `addNode` + `addEdge` + `setEntryPoint`
+5. **Validate** (this catches 80% of mistakes before running anything):
+   ```bash
+   zibby workflow validate <name>
+   ```
+   Fix any reported issues.
+6. **Test locally** with a realistic input:
+   ```bash
+   zibby workflow run <name> -p <key>=<value>
+   ```
+   Watch the timeline. If a node fails, the `raw` field shows what the
+   agent actually returned vs what the schema expected.
+7. **Report back to the user** with:
+   - The workflow path
+   - The local test result
+   - The exact `zibby workflow run` command they can use
+   - Ask if they want to deploy
+## DO NOT
+- Don't deploy without asking (`zibby workflow deploy` has cost)
+- Don't use `state.set()` / `state.get()` inside `execute()` — just `return`
+- Don't skip `zibby workflow validate` — it catches schema typos fast
+- Don't add nodes the request didn't ask for
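
Note on step 2 of the file above: the naming rule (kebab-case, at most 24 characters) is easy to check before running the scaffold. The following is a small sketch; `checkWorkflowName` is a hypothetical helper, and the exact rule the CLI enforces (if any) is not shown in this diff. Kebab-case is assumed here to mean lowercase letters and digits separated by single hyphens.

```javascript
// Assumed definition of kebab-case: lowercase alphanumeric words joined by single hyphens.
const KEBAB_CASE = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Hypothetical pre-flight check; returns a list of problems (empty when the name is fine).
function checkWorkflowName(name) {
  const problems = [];
  if (name.length > 24) problems.push('longer than 24 characters');
  if (!KEBAB_CASE.test(name)) problems.push('not kebab-case');
  return problems;
}

console.log(checkWorkflowName('nightly-changelog')); // []
console.log(checkWorkflowName('Code_Review'));       // [ 'not kebab-case' ]
```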

package/dist/templates/.claude/commands/validate-workflow.md ADDED

@@ -0,0 +1,67 @@
+---
+description: Statically validate a Zibby workflow + run it locally with sample input
+argument-hint: <workflow-name> [optional input as key=value pairs]
+---
+# /validate-workflow
+The user wants to verify a workflow works before deploying.
+**Arguments:** $ARGUMENTS
+## Steps
+1. **Static validation first** (fast — does NOT call any LLM):
+   ```bash
+   zibby workflow validate <name>
+   ```
+   Checks:
+   - Graph topology (entry point set, edges reach END, no orphan nodes)
+   - Every node has `outputSchema`
+   - Every `skills: ['x']` reference is registered
+   - Zod schemas parse cleanly
+   If this fails, **fix the reported issues before running anything
+   else**. Validation errors mean the workflow can't possibly work.
+2. **Local dry-run** with realistic input:
+   ```bash
+   zibby workflow run <name> -p key1=value1 -p key2=value2
+   ```
+   Watch the timeline (`┌ nodeName … └ done`). Each node should:
+   - Show timing under ~30s for LLM nodes, <1s for custom-code
+   - Print its output
+   - Hand off to the next node
+3. **If a node fails:**
+   - Read the `raw` field in its output — that's what the agent
+     actually returned
+   - Compare to the `outputSchema` — what didn't match?
+   - Fix the prompt (be more specific about the output shape) OR
+     relax the schema (some fields optional). Prefer fixing prompts.
+4. **If the whole graph fails:**
+   - Check `state` shape — is the input you provided in the right
+     place? Top-level keys, not nested under `input`.
+   - Check the entry point — `graph.setEntryPoint('first_node')`.
+5. **Report back:**
+   - Validation result (pass / fail + what)
+   - Local run result (pass / fail + which node)
+   - If failed: a one-line diagnosis + a proposed fix
+   - If passed: the exact command the user can use to deploy
+## DO
+- Run `validate`, then `run`, then `deploy`, in that order. Cost increases 10× at each step.
+- Use realistic inputs (`-p`) — defaults are usually placeholders.
+## DON'T
+- Don't deploy a workflow that hasn't passed local run.
+- Don't suppress / ignore Zod errors — they're telling you the agent
+  produced something the next node won't accept.
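
Note on step 1 of the file above: the topology checks it lists (entry point set, edges reach END, no orphan nodes) can be illustrated with plain data. This is a sketch only. The `{ nodes, edges, entryPoint }` input shape and the `validateTopology` name are assumptions made for the example; the diff does not show how `zibby workflow validate` is implemented, and this does not use the real `WorkflowGraph` class.

```javascript
// Hypothetical topology check over plain data:
//   nodes: string[], edges: [from, to][], entryPoint: string
// Returns a list of human-readable issues (empty when the graph looks sound).
function validateTopology({ nodes, edges, entryPoint }) {
  const issues = [];
  const declared = new Set(nodes);

  if (!entryPoint) issues.push('entry point is not set');
  else if (!declared.has(entryPoint)) issues.push(`entry point '${entryPoint}' is not a declared node`);

  // Build adjacency lists, skipping edges that mention undeclared nodes.
  const next = new Map(nodes.map((n) => [n, []]));
  for (const [from, to] of edges) {
    if (!declared.has(from)) { issues.push(`edge from undeclared node '${from}'`); continue; }
    if (to !== 'END' && !declared.has(to)) { issues.push(`edge to undeclared node '${to}'`); continue; }
    next.get(from).push(to);
  }

  // Orphans: nodes that cannot be reached from the entry point.
  const reachable = new Set();
  const stack = declared.has(entryPoint) ? [entryPoint] : [];
  while (stack.length > 0) {
    const node = stack.pop();
    if (node === 'END' || reachable.has(node)) continue;
    reachable.add(node);
    stack.push(...next.get(node));
  }
  for (const n of nodes) {
    if (!reachable.has(n)) issues.push(`node '${n}' is an orphan (unreachable from the entry point)`);
  }

  // Termination: grow the set of nodes that can reach 'END' until it stops changing.
  const canEnd = new Set(['END']);
  let changed = true;
  while (changed) {
    changed = false;
    for (const n of nodes) {
      if (!canEnd.has(n) && next.get(n).some((t) => canEnd.has(t))) {
        canEnd.add(n);
        changed = true;
      }
    }
  }
  for (const n of reachable) {
    if (!canEnd.has(n)) issues.push(`node '${n}' has no path to END`);
  }
  return issues;
}
```

For a `plan → implement → verify → END` chain with `plan` as the entry point the function returns an empty list; adding a node with no incoming edge yields a single orphan report.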

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@zibby/cli",
-  "version": "0.4.16",
+  "version": "0.4.18",
   "description": "Zibby CLI - Test automation generator and runner",
   "type": "module",
   "bin": {
@@ -34,13 +34,14 @@
   "dependencies": {
     "@aws-sdk/client-sqs": "^3.1038.0",
     "@zibby/agent-workflow": "^0.3.2",
-    "@zibby/core": "^0.3.6",
-    "@zibby/ui-memory": "^1.0.0",
+    "@zibby/core": "^0.4.0",
     "@zibby/skills": "^0.1.11",
+    "@zibby/ui-memory": "^1.0.0",
     "adm-zip": "^0.5.17",
     "chalk": "^5.3.0",
     "cli-highlight": "^2.1.11",
     "commander": "^12.0.0",
+    "cronstrue": "^3.14.0",
     "dotenv": "^17.4.1",
     "express": "^4.18.2",
     "glob": "^13.0.0",
@@ -54,6 +55,7 @@
   },
   "files": [
     "dist/",
+    "templates/",
     "README.md",
     "LICENSE"
   ],

package/templates/.claude/CLAUDE.md ADDED

@@ -0,0 +1,425 @@
+# Zibby project — how to write and ship workflows
+This file is auto-loaded by Claude Code / Cursor / Codex when working in
+this repo. It's the canonical reference for building Zibby workflows.
+You are an AI agent. The user describes what they want; you write the
+workflow code (graph + nodes + skills), test it locally, and deploy. The
+user shouldn't need to read the code you produce — they just describe
+the intent.
+---
+## 0. The 30-second tour
+```
+.zibby/workflows/<name>/
+├── workflow.json     # name, description, version, default agent
+├── graph.mjs         # graph topology (nodes + edges)
+├── nodes/            # one file per node — prompt + outputSchema + skills
+│   ├── plan.mjs
+│   ├── implement.mjs
+│   └── verify.mjs
+└── package.json      # @zibby/agent-workflow + zod, nothing else needed
+```
+Lifecycle:
+```bash
+zibby workflow new <name>            # scaffold (creates .zibby/workflows/<name>/)
+zibby workflow run <name> -p key=val # run locally — one-shot, no server
+zibby workflow start <name>          # run locally with hot-reload (server)
+zibby workflow validate <name>       # static check (graph topology, schemas, skills)
+zibby workflow deploy <name>         # push to Zibby Cloud, returns UUID
+zibby workflow trigger <uuid> -p ... # remote run
+zibby workflow schedule <uuid> set …  # recurring cron run (Unix 5-field)
+zibby workflow logs -t               # tail cloud logs
+```
+**Always `zibby workflow run` before `deploy`.** Local run is ~5s cold
+start; cloud is ~60s. Iterating in cloud is 12× slower.
+---
+## 1. Anatomy of a workflow
+### `workflow.json`
+```json
+{
+  "name": "code-review",
+  "description": "Review a git diff and return structured findings",
+  "version": "0.1.0",
+  "defaultAgent": "claude",
+  "stateSchema": {
+    "diff": "string",
+    "findings": "array"
+  }
+}
+```
+`defaultAgent` is one of: `claude`, `cursor`, `codex`, `gemini`. Any node
+can override with `agent: 'cursor'` in its config.
+### `graph.mjs`
+```js
+import { WorkflowGraph } from '@zibby/agent-workflow';
+import { z } from 'zod';
+import { planNode } from './nodes/plan.mjs';
+import { implementNode } from './nodes/implement.mjs';
+import { verifyNode } from './nodes/verify.mjs';
+export default function buildGraph() {
+  const graph = new WorkflowGraph();
+  graph.addNode('plan',      planNode);
+  graph.addNode('implement', implementNode);
+  graph.addNode('verify',    verifyNode);
+  graph.addEdge('plan', 'implement');
+  graph.addEdge('implement', 'verify');
+  graph.addEdge('verify', 'END');         // 'END' is the terminal sentinel
+  graph.setEntryPoint('plan');
+  return graph;
+}
+```
+### A node — `nodes/plan.mjs`
+```js
+import { z } from 'zod';
+export const planNode = {
+  name: 'plan',
+  agent: 'claude',                        // optional — falls back to defaultAgent
+  outputSchema: z.object({
+    steps: z.array(z.string()),
+    risks: z.array(z.string()),
+  }),
+  prompt: (state) => `
+You are planning a code change. The user wants:
+${state.userRequest}
+Return:
+- steps: ordered list of actions to take
+- risks: anything that might go wrong
+`,
+};
+```
+Three required fields on every LLM node: `name`, `outputSchema` (a Zod
+schema), `prompt` (string or function of state).
+### A custom-code node (no LLM)
+```js
+import { z } from 'zod';                  // needed for outputSchema below
+export const fetchDiffNode = {
+  name: 'fetch_diff',
+  outputSchema: z.object({ diff: z.string(), filesChanged: z.array(z.string()) }),
+  execute: async (context) => {
+    const { execSync } = await import('node:child_process');
+    const diff = execSync('git diff --staged', { encoding: 'utf-8' });
+    const filesChanged = execSync('git diff --staged --name-only', { encoding: 'utf-8' })
+      .trim().split('\n').filter(Boolean);
+    return { diff, filesChanged };
+  },
+};
+```
+Custom-code nodes use `execute(context)` instead of `prompt`. They skip
+the LLM entirely. Use them for deterministic work: git ops, file IO,
+HTTP calls, parsing.
+---
+## 2. State — read and write
+```js
+// READ — state is passed to prompt fns + execute fns as the first arg
+prompt: (state) => `Plan a change for: ${state.userRequest}`
+execute: async (state) => {
+  const diff = state.diff;                // read prior node output here
+  // ...
+}
+```
+**Each node's output is stored at `state[nodeName]`.** If `plan` returns
+`{ steps: [...], risks: [...] }`, the next node reads them as
+`state.plan.steps` and `state.plan.risks`.
+Initial input goes at the TOP of state (not nested under `input`). When
+the user runs `zibby workflow run code-review -p userRequest="fix login"`,
+the first node sees `state.userRequest = "fix login"`.
+**Don't use `state.set()` / `state.get()` inside `execute()`** — those
+are internal to the graph runtime. Just `return` a plain object that
+matches your `outputSchema`. The runtime puts it under `state[nodeName]`.
+---
+## 3. Conditional routing
+```js
+graph.addConditionalEdges('verify', (state) => {
+  if (state.verify.passed) return 'END';
+  if (state.verify.retries < 3) return 'implement';
+  return 'fail';
+});
+```
+The router function receives the full state, returns the name of the
+next node (or `'END'`). All possible target nodes must also be declared
+elsewhere via `addNode`.
+---
+## 4. Skills — pluggable MCP tool bundles
+A **skill** is a named bundle of MCP tools a node can opt into. Built-in
+skills: `browser`, `memory`. Custom skills: register them yourself.
+### Using a skill in a node
+```js
+export const navigateNode = {
+  name: 'navigate',
+  skills: ['browser'],                    // ← opt in
+  outputSchema: z.object({ url: z.string(), title: z.string() }),
+  prompt: (state) => `Navigate to ${state.target} and report the title.`,
+};
+```
+The agent (claude/cursor/codex/gemini) auto-discovers the skill's tools.
+`browser` exposes `mcp__playwright__browser_navigate`, `_snapshot`,
+`_click`, etc. The agent's `allowedTools` is set automatically — you
+don't list each tool.
+### Writing a custom skill
+```js
+// .zibby/workflows/<name>/skills/slack.mjs
+import { registerSkill } from '@zibby/agent-workflow';
+registerSkill({
+  id: 'slack',                            // referenced by `skills: ['slack']`
+  serverName: 'slack-mcp',
+  command: 'npx',
+  args: ['-y', '@modelcontextprotocol/server-slack'],
+  allowedTools: ['mcp__slack__*'],
+  envKeys: ['SLACK_BOT_TOKEN'],           // required env to run
+  description: 'Read channels, post messages, search history',
+});
+```
+Then import it from your `graph.mjs` BEFORE building the graph:
+```js
+import './skills/slack.mjs';              // side-effect: registers the skill
+import { WorkflowGraph } from '@zibby/agent-workflow';
+// ...
+```
+Skills register on `globalThis` so any node that runs in this process
+can use them. In cloud, set required `envKeys` via
+`zibby workflow env set slack ZIBBY_SECRET SLACK_BOT_TOKEN=xoxb-...`.
+### Custom skill via a non-MCP function
+If you don't have an MCP server, expose a Node function directly:
+```js
+registerSkill({
+  id: 'jira-fetch',
+  resolve: () => null,                    // no MCP — use middleware instead
+  middleware: async () => async (nodeName, next, state) => {
+    // attach a JS-only helper. Less common — prefer MCP when possible.
+    state.jiraFetch = async (issueId) => { /* ... */ };
+    return next();
+  },
+});
+```
+---
+## 5. Agent strategies — claude / cursor / codex / gemini
+The same node can run under any agent: each one reads `prompt` and returns
+output matching `outputSchema`. Differences:
+| Agent  | Best for                          | Auth env                          |
+|--------|-----------------------------------|-----------------------------------|
+| claude | Reasoning, planning, structured output | `ANTHROPIC_API_KEY` or `CLAUDE_CODE_OAUTH_TOKEN` |
+| cursor | Code editing in a repo            | `CURSOR_API_KEY`                 |
+| codex  | Direct shell + code generation    | `OPENAI_API_KEY`                 |
+| gemini | Cheap volume tasks                | `GEMINI_API_KEY`                 |
+Pick per-node (override `defaultAgent`):
+```js
+graph.addNode('plan',      { ...planNode,      agent: 'claude' });
+graph.addNode('implement', { ...implementNode, agent: 'cursor' });
+graph.addNode('verify',    { ...verifyNode,    agent: 'codex'  });
+```
+Model overrides go in `.zibby.config.mjs`:
+```js
+export default {
+  models: {
+    plan:      'claude-opus-4-7',
+    implement: 'cursor-fast',
+    verify:    'gpt-5-codex',
+  },
+};
+```
+---
+## 6. Running and shipping
+### Local dry-run (FAST — do this every iteration)
+```bash
+zibby workflow run code-review -p userRequest="add rate limiting to /api"
+```
+One-shot run. ~5s cold, then ~2s per warm iteration when the agent has
+cached prompts. Reads + writes to your local filesystem (no cloud
+upload).
+### Static validate (FASTER — does NOT run any agent)
+```bash
+zibby workflow validate code-review
+```
+Checks: graph topology (no orphan nodes, entry point exists, edges
+reach END), node shapes (every node has `outputSchema`), skill
+references (every `skills: ['x']` is registered), schema validity. If
+this fails, `run` will definitely fail too — fix it first.
+### Deploy
+```bash
+zibby workflow deploy code-review
+#  → returns UUID, caches in .zibby-deploy.json
+```
+Bundles the workflow folder, uploads to Zibby Cloud. Cloud runs on
+Fargate, picks up per-node API keys you set via `zibby workflow env`.
+### Trigger remote run
+```bash
+zibby workflow trigger <uuid> -p userRequest="fix login bug" -t
+```
+`-t` tails logs in Heroku style. Ctrl-C stops the tail (the workflow keeps
+running in the cloud).
+### Schedule recurring runs (cron)
+```bash
+# Set / update a schedule (standard Unix 5-field cron)
+zibby workflow schedule <uuid> set "0 9 * * 1-5"                          # 9am weekdays UTC
+zibby workflow schedule <uuid> set "*/15 * * * *"                         # every 15 min
+zibby workflow schedule <uuid> set "0 9 * * *" --tz America/Los_Angeles   # 9am LA time
+zibby workflow schedule <uuid> set "0 0 * * *" -p kind=nightly            # fixed input params
+# Inspect / clear
+zibby workflow schedule <uuid>                                            # show current
+zibby workflow schedule <uuid> clear                                      # remove
+```
+Backed by EventBridge Scheduler. Each scheduled fire goes through the
+same `runWorkflow` path as manual + webhook triggers — same Fargate
+container, same memory sync, same logs.
+Cron format is **standard Unix 5-field** (`minute hour day-of-month
+month day-of-week`). Use `crontab.guru` to debug expressions. The
+backend converts to EventBridge's 6-field form internally; you never
+see AWS's flavor.
+**Minimum interval is 5 minutes.** `* * * * *`, `*/2 * * * *`, etc.
+are rejected with 400. Use `*/5 * * * *` (every 5 min) or slower —
+this matches GitHub Actions cron's rule. Each fire = a full Fargate
+run = compute + LLM tokens + workflow-execution quota burn, so
+sub-5-min runs aren't economical at our pricing. If you genuinely
+need sub-minute reactivity, use the **webhook trigger** instead.
+A workflow has at most ONE schedule. `set` is upsert (create if absent,
+update if present). When the workflow is deleted, the schedule is
+cleaned up too.
+---
+## 7. Common pitfalls — read before debugging
+| Symptom                                          | Fix                                            |
+|--------------------------------------------------|------------------------------------------------|
+| `Node 'X' must define outputSchema`              | Add `outputSchema: z.object({...})` to the node config |
+| `Skill 'foo' not registered`                     | Import the skill file in graph.mjs BEFORE `new WorkflowGraph()` |
+| `state.previousNode is undefined` in a prompt    | Wrong order — add `graph.addEdge('previousNode', 'thisNode')` |
+| Custom-code node returns nothing                 | `return` an object matching `outputSchema`. `undefined` = failure |
+| Agent ignores your skill's tools                 | Add `skills: ['name']` to the node config, not just register |
+| Zod error: "Expected string, received undefined" | The previous node's outputSchema doesn't match its return — fix the producer, not the consumer |
+| `workflow trigger` works but `run` doesn't       | Local run reads env from `.env` / shell; cloud reads from `zibby workflow env`. Set both. |
+| Hangs forever on a node                          | Add `retries: 0` to fail fast while debugging; check the prompt isn't asking the agent to wait |
+---
+## 8. The agent's job (this is YOU)
+When the user says **"write me a workflow that does X"**:
+1. **Sketch the graph first.** What are the 2-5 nodes? Which can be
+   custom-code (deterministic) and which need an LLM (judgement)?
+2. **Run `zibby workflow new <name>`** to scaffold.
+3. **Edit the files** — `graph.mjs`, `nodes/*.mjs`. Use Zod for every
+   schema. Default to `claude` unless the user has a preference.
+4. **Run `zibby workflow validate <name>`.** Fix any reported issues.
+5. **Run `zibby workflow run <name> -p ...`** with a realistic input.
+   Watch the timeline output. If a node fails, read its `raw` output
+   to understand what the LLM returned vs what the schema expected.
+6. **Iterate on prompts** — the user shouldn't need to. If a node's
+   output doesn't match the schema, tighten the prompt or relax the
+   schema (in that order).
+7. **Once it works locally**, ask the user if they want to deploy.
+   Don't deploy without asking — cloud has cost.
+Read `.claude/commands/` for slash commands the user can invoke:
+`/new-workflow`, `/add-node`, `/add-skill`, `/validate-workflow`.
+---
+## 9. Quick reference
+```js
+import { WorkflowGraph, registerSkill, registerStrategy, AgentStrategy } from '@zibby/agent-workflow';
+import { z } from 'zod';
+// Graph
+const graph = new WorkflowGraph();
+graph.addNode(name, { prompt, outputSchema, execute, skills, agent, retries });
+graph.addEdge(from, to);
+graph.addConditionalEdges(from, (state) => 'nextNodeName');
+graph.setEntryPoint(name);
+// Skill
+registerSkill({ id, serverName, command, args, allowedTools, envKeys });
+// Strategy (rare — only when wrapping a non-built-in LLM)
+class MyAgent extends AgentStrategy { /* implement invoke() */ }
+registerStrategy(new MyAgent());
+// State (inside execute / prompt fns)
+//   READ:  state.someKey   or   state.previousNode.field
+//   WRITE: return { ... } from execute() — runtime puts it at state[nodeName]
+```
+That's the whole API. Anything else is either a convenience or a
+footgun — when in doubt, prefer the above.
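
Note on the scheduling rule in section 6 of the file above: the 5-minute minimum can be approximated locally before calling `zibby workflow schedule`. The following is a rough sketch, not the backend's validation (which this diff does not show). It looks only at the minute field of a 5-field expression and assumes the schedule can fire in consecutive hours, so it may flag expressions whose hour field would in fact keep runs further apart. `expandMinuteField`, `smallestMinuteGap`, and `isTooFrequent` are hypothetical names.

```javascript
// Expand a cron minute field ('*', '*/15', '0,30', '10-20/5', ...) into sorted minutes.
function expandMinuteField(field) {
  const minutes = new Set();
  for (const part of field.split(',')) {
    const [range, stepText] = part.split('/');
    const step = stepText === undefined ? 1 : Number(stepText);
    let start;
    let end;
    if (range === '*') {
      start = 0;
      end = 59;
    } else if (range.includes('-')) {
      [start, end] = range.split('-').map(Number);
    } else {
      start = Number(range);
      end = stepText === undefined ? start : 59; // 'a/n' is read as "from a, every n"
    }
    const valid = [start, end, step].every(Number.isInteger)
      && step >= 1 && start >= 0 && end <= 59 && start <= end;
    if (!valid) throw new Error(`unsupported minute field: ${part}`);
    for (let m = start; m <= end; m += step) minutes.add(m);
  }
  return [...minutes].sort((a, b) => a - b);
}

// Smallest gap (in minutes) between two fires, including the wrap into the next hour.
function smallestMinuteGap(expr) {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) throw new Error('expected a 5-field cron expression');
  const minutes = expandMinuteField(fields[0]);
  let gap = 60 - minutes[minutes.length - 1] + minutes[0];
  for (let i = 1; i < minutes.length; i += 1) {
    gap = Math.min(gap, minutes[i] - minutes[i - 1]);
  }
  return gap;
}

// True when the expression would fire more often than every minGap minutes.
function isTooFrequent(expr, minGap = 5) {
  return smallestMinuteGap(expr) < minGap;
}

console.log(isTooFrequent('*/2 * * * *'));  // true
console.log(isTooFrequent('*/5 * * * *'));  // false
console.log(isTooFrequent('0 9 * * 1-5'));  // false
```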