npm - @zibby/cli - Versions diffs - 0.5.8 → 0.5.9 - Mend

@zibby/cli 0.5.8 → 0.5.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (73)

package/templates/.claude/CLAUDE.md CHANGED

@@ -1,15 +1,78 @@
-# Zibby project — how to write and ship workflows
+# Zibby project — how to build with Zibby
-This file is auto-loaded by Claude Code / Cursor / Codex when working in
-this repo. It's the canonical reference for building Zibby workflows.
+This file is auto-loaded by Claude Code / Cursor / Codex / Aider when
+working in this repo. It's the canonical reference for building things
+on Zibby.
-You are an AI agent. The user describes what they want; you write the
-workflow code (graph + nodes + skills), test it locally, and deploy. The
-user shouldn't need to read the code you produce — they just describe
-the intent.
+You are an AI agent. The user describes what they want; you write
+the code (workflow graph, scripts, infra glue), deploy what needs to
+be deployed, and operate it. The user shouldn't need to read the
+code you produce — they just describe the intent.
 ---
+## What is Zibby?
+Zibby is two things, sharing one account, one CLI, one Studio, one
+billing surface:
+1. **Workflows** — event-driven AI agent graphs that run inside an
+   ECS Fargate sandbox in Zibby Cloud. Each workflow is a directed
+   graph of nodes; nodes are LLM-driven or deterministic code. Used
+   for automation that needs an LLM in the loop: analyze tickets,
+   draft replies, write code, summarize content. Triggered via
+   webhook, schedule, or CLI.
+2. **Apps** — long-running hosted SaaS instances. Pick from a curated
+   catalog (n8n, Grafana, Outline, …) or describe a goal in natural
+   language ("a Rails 7 app with Postgres from this git repo") and an
+   `agent-ops` supervisor installs + maintains it for you. Each app
+   runs on Fargate with its own EFS volume, public URL, and optional
+   auth sidecar.
+The two surfaces share:
+- One account + one workspace login (`zibby login`)
+- One project model (apps and workflows live inside a project)
+- One CLI binary (`zibby workflow ...`, `zibby app ...`)
+- One Studio (the desktop client)
+- One billing tier
+### Decision table — when to use which
+| User wants… | Use | Why |
+|---|---|---|
+| "Run code on a schedule, with an LLM in the middle" | Workflow | Built for transient event-driven runs |
+| "Get a Slack notification when a server is down" | Workflow | Trigger by webhook / cron |
+| "Host my n8n / Grafana / Outline / Mattermost" | App | Long-running web service |
+| "Spin up a Postgres for a hackathon" | App (goal-mode) | Persistent backing service |
+| "Auto-bootstrap an arbitrary OSS project on a VPS" | App (goal-mode) | Agent figures out the install |
+| "Real-time interactive UI work, sub-second response" | Neither | LLM calls are too slow; use Lambda or your own backend |
+| "Pure deterministic data transform, no LLM needed" | Neither (use Lambda) | Workflows assume LLM-in-loop; oversized if you don't need one |
+### What the agent (this means you) should do
+When the user says **"I want X"**:
+1. Decide: is X a workflow or an app? (Use the table above.)
+2. Confirm with the user before generating code or running deploys.
+3. For workflows — follow §1-9 below to scaffold, validate, run
+   locally, then ask before deploying.
+4. For apps — see the **Apps** section after the Workflows reference.
+   Always ask about auth + project before deploying.
+5. Use slash commands as recipes:
+   - `/zibby-new-workflow`, `/zibby-add-node`, `/zibby-validate-workflow`,
+     `/zibby-deploy`, `/zibby-trigger`, `/zibby-debug`, `/zibby-tail`
+   - `/zibby-deploy-app`, `/zibby-app-status`, `/zibby-app-logs`,
+     `/zibby-app-destroy`, `/zibby-app-restart`, `/zibby-app-upgrade`,
+     `/zibby-app-list`, `/zibby-set-auth`, `/zibby-app-env`
+   - `/zibby-login`, `/zibby-status`, `/zibby-mcp-install`,
+     `/zibby-workflow-env`
+---
+# Pillar 1: Workflows
 ## 0. The 30-second tour
 ```
@@ -315,7 +378,6 @@ import { WorkflowAgent, WorkflowGraph } from '@zibby/core';
 Skills register on `globalThis` so any node that runs in this process
 can use them. In cloud, set required `envKeys` via
-`zibby workflow env set slack ZIBBY_SECRET SLACK_BOT_TOKEN=xoxb-...`.
 ### Custom skill via a non-MCP function
@@ -400,7 +462,6 @@ zibby workflow deploy code-review
 ```
 Bundles the workflow folder, uploads to Zibby Cloud. Cloud runs on
-Fargate, picks up per-node API keys you set via `zibby workflow env`.
 ### Trigger remote run
@@ -458,7 +519,6 @@ cleaned up too.
 | Custom-code node returns nothing                 | `return` an object matching `outputSchema`. `undefined` = failure |
 | Agent ignores your skill's tools                 | Add `skills: ['name']` to the node config, not just register |
 | Zod error: "Expected string, received undefined" | The previous node's outputSchema doesn't match its return — fix the producer, not the consumer |
-| `workflow trigger` works but `run` doesn't       | Local run reads env from `.env` / shell; cloud reads from `zibby workflow env`. Set both. |
 | Hangs forever on a node                          | Add `retries: 0` to fail fast while debugging; check the prompt isn't asking the agent to wait |
 | `Workflow "<name>" not found.`                   | Check `paths.workflows` in `.zibby.config.mjs` matches where you scaffolded. Default is `workflows/` at repo root. |
 | Router returns a string that's not a registered node name | All possible return values must be either `'END'` or a name passed to `graph.addNode(...)` elsewhere. `validate` flags this as `graph-edge-to-unknown`. |
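
The last pitfall row above says every router return value must be `'END'` or a name passed to `graph.addNode(...)`. A minimal sketch of that check in plain JavaScript follows. It is not the implementation behind `zibby workflow validate`; `findUnknownRouterTargets` and its two array inputs are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper, not part of @zibby/core or the zibby CLI.
// Given the node names registered on a graph and the values a router may
// return, list the values that do not point at any registered node.
function findUnknownRouterTargets(registeredNodes, routerReturns) {
  const known = new Set(registeredNodes);
  known.add('END'); // 'END' is always a valid router target per the table above
  return routerReturns.filter((target) => !known.has(target));
}

const unknownTargets = findUnknownRouterTargets(
  ['fetch', 'analyze', 'reply'],
  ['analyze', 'END', 'replyy'] // misspelled on purpose
);
console.log(unknownTargets); // lists only the misspelled target
```

Running a check like this before `zibby workflow validate` is optional; the table states that `validate` already reports the problem as `graph-edge-to-unknown`.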
@@ -539,3 +599,251 @@ export class MyWorkflow extends WorkflowAgent {
 `zibby workflow validate` accepts both shapes. Other commands need the
 class.
+---
+# Pillar 2: Apps
+## A. The 30-second Apps tour
+```bash
+zibby app templates                                 # browse the catalog
+zibby app deploy n8n --project <id>                 # catalog deploy (deterministic)
+zibby app deploy --goal "<text>" --project <id>     # goal-mode (LLM bootstrap)
+zibby app list                                      # what's running
+zibby app status <instanceId>                       # one instance's state
+zibby app logs <instanceId> -t                      # live tail (container + supervisor)
+zibby app set-auth <instanceId> --auth-type basic --auth-user admin --auth-password ...
+zibby app upgrade <instanceId> --version vX.Y.Z     # agent-ops base image bump
+zibby app destroy <instanceId> --yes                # permanently delete (EFS wiped)
+```
+A Managed App is a long-running web service on Zibby's Fargate fleet. Each
+instance has:
+- An ECS task running `agent-ops` + (for catalog) the app image OR (for
+  goal-mode) whatever the agent installed
+- A pinned EFS volume for persistent state (DB files, uploads, config)
+- A public `https://<id>.apps.zibby.app` URL
+- An optional Caddy auth sidecar (basic-auth, bearer token, or none)
+- A KMS-encrypted env-var bag
+## B. Catalog vs goal-mode — which path
+The two `app deploy` paths are mutually exclusive.
+### Catalog: `zibby app deploy <appType>`
+- The backend uses a baked task definition. No LLM runs to install.
+- Cold start: ~2-3 minutes (image pull + first boot).
+- 20+ catalog entries: `n8n`, `grafana`, `wordpress`, `outline`,
+  `mattermost`, `gas-town`, `caddy-static`, `appsmith`, `flowise`,
+  `code-server`, `chatwoot`, `vaultwarden`, `umami`, `gitea`,
+  `nocodb`, `directus`, `posthog`, `metabase`, `langfuse`,
+  `flagsmith`. Run `zibby app templates` for the live list and per-app
+  `architecture` requirements.
+- Predictable, supported, the right default for "I want X" when X is in
+  the catalog.
+### Goal-mode: `zibby app deploy --goal "<text>"`
+- LLM bootstrap: an `agent-ops` task in the user's instance runs an
+  autonomous install loop driven by the goal text.
+- Cold start: 5-30 minutes depending on what's being installed.
+- Use for **anything not in the catalog**: custom apps, a specific
+  git repo, an OSS project not promoted yet, multi-service exotic
+  stacks.
+- License responsibility for whatever gets installed sits with the
+  user, not Zibby (same shape as `apt install` on a generic VPS).
+Goal-mode flags worth knowing:
+| Flag | Default | When to override |
+|---|---|---|
+| `--provider claude\|codex` | `claude` | Pick the agent driving the install |
+| `--model <id>` | known-cheap default | Pin a specific model (e.g. `claude-sonnet-4-6`) |
+| `--anthropic-token sk-ant-...` | workspace-stored | Per-deploy token override; format `sk-ant-oat01-` (OAuth) or `sk-ant-api03-` (API) |
+| `--max-turns N` | 25 | Heavy installs (n8n, OpenHands) need 60-100 |
+| `--timeout-min N` | 20 | Heavy installs need 30-45 |
+| `--arch x86_64\|arm64` | per-template | Override CPU arch (most catalog entries are arm64) |
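
The flags table above can be turned into a small pre-flight helper that assembles a goal-mode command and rejects a token with the wrong prefix before the CLI is ever called. This is a sketch under assumptions: `hasAcceptedTokenPrefix` and `buildGoalDeployArgs` are hypothetical names, the CLI's own validation is not shown in this diff, and `proj_example` is a placeholder project id. Only flags that appear in the table and in the tour in section A are emitted.

```javascript
// Hypothetical pre-flight check. The two prefixes are the formats named in
// the flags table; the CLI's real validation code is not shown in this diff.
const ACCEPTED_TOKEN_PREFIXES = ['sk-ant-oat01-', 'sk-ant-api03-'];

function hasAcceptedTokenPrefix(token) {
  return ACCEPTED_TOKEN_PREFIXES.some((prefix) => token.startsWith(prefix));
}

// Build the argument list for a goal-mode deploy. Options left undefined are
// omitted so the CLI defaults from the table apply.
function buildGoalDeployArgs({ goal, project, maxTurns, timeoutMin, anthropicToken }) {
  const args = ['app', 'deploy', '--goal', goal, '--project', project];
  if (maxTurns !== undefined) args.push('--max-turns', String(maxTurns));
  if (timeoutMin !== undefined) args.push('--timeout-min', String(timeoutMin));
  if (anthropicToken !== undefined) {
    if (!hasAcceptedTokenPrefix(anthropicToken)) {
      throw new Error('token must start with sk-ant-oat01- or sk-ant-api03-');
    }
    args.push('--anthropic-token', anthropicToken);
  }
  return args;
}

const goalArgs = buildGoalDeployArgs({
  goal: 'a Rails 7 app with Postgres from this git repo',
  project: 'proj_example', // placeholder project id
  maxTurns: 80,
  timeoutMin: 45,
});
console.log(goalArgs); // pass this array to a process spawner with 'zibby' as the command
```

Keeping the arguments as an array, rather than one joined string, avoids shell-quoting problems with the free-text goal.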
+## C. Auth — every app gets a public URL, lock it down
+Without auth, ANYONE with the `https://<id>.apps.zibby.app` URL can hit
+the app. For tools like n8n / Grafana / Outline that's a real risk —
+the URL is guessable from the catalog.
+Three auth modes on the Caddy sidecar:
+| Mode | When to use | Set with |
+|---|---|---|
+| `basic` | Quick personal tools, dashboards | `--auth-type basic --auth-user admin --auth-password ...` |
+| `token` | API-only apps, scripted callers | `--auth-type token --auth-token ...` |
+| `none` | App has its own login (n8n, wordpress) | `--auth-type none` or omit |
+Set at deploy time, change after deploy with `zibby app set-auth`. Use
+`zibby app set-auth <instanceId> --off` to remove auth entirely (only
+safe if the app has its own login).
+**Rotation:** re-run `set-auth` with new credentials. Old credentials
+stop working immediately when the Caddy reload completes (~5s). No
+container restart needed.
+**Always generate credentials with `openssl rand -hex`** — never reuse
+a user-typed password. Never log credentials. Save them once at deploy
+time; they're not recoverable.
+## D. Multi-service catalog entries
+Some catalog entries run multiple containers in one task:
+- `wordpress` → wordpress + mysql
+- `mattermost` → mattermost + postgres
+- `gas-town` → web + worker + scheduler
+The instance has ONE status (whole-instance), ONE URL (the primary
+service's), and a per-service log stream:
+```bash
+zibby app logs <instanceId> --service mysql
+zibby app logs <instanceId> --service agent-ops    # the supervisor itself
+```
+`zibby app status` lists every service under `services[]`. The Caddy
+sidecar fronts only the `mainService` (declared in the catalog manifest).
+## E. The agent-ops supervisor
+Every app runs an `agent-ops` sidecar — a small LLM-driven daemon
+([github.com/zibbyhq/agent-ops](https://github.com/zibbyhq/agent-ops),
+Apache-2.0).
+What the supervisor does:
+- **Goal-mode**: runs the install loop on first boot. Verifies on a
+  schedule that the installed thing still works. Re-installs if it
+  doesn't (within budget).
+- **Catalog**: runs scheduled health checks per the catalog's recipe.
+  Restarts misbehaving services. Notifies via webhook on
+  unrecoverable failures.
+What it can do: run `shell` commands inside its task's filesystem
+(scoped to the EFS volume + the task's egress proxy).
+What it cannot do: touch other instances, your local machine, or any
+Zibby control-plane resources. Sandbox by design.
+To see the supervisor's trail: `zibby app logs <instanceId> --service agent-ops`.
+## F. BYOH — agent-ops on your own VPS
+Don't want to host on Zibby's fleet? Run `agent-ops` directly on a VPS
+you own. Same daemon, same configs, you handle the host:
+```bash
+# Debian / Ubuntu
+sudo install -d -m 0755 /etc/apt/keyrings
+curl -fsSL https://dl.zibby.app/apt/key.gpg \
+  | sudo gpg --dearmor -o /etc/apt/keyrings/zibby.gpg
+echo "deb [signed-by=/etc/apt/keyrings/zibby.gpg] https://dl.zibby.app/apt stable main" \
+  | sudo tee /etc/apt/sources.list.d/zibby.list
+sudo apt update && sudo apt install agent-ops
+# Register with your Zibby workspace (optional — lets the workspace see the host)
+agent-ops register --pat zby_xxx
+agent-ops init --template wordpress-multisite --yes
+sudo agent-ops start
+```
+Full docs: https://docs.zibby.app/apps/agent-ops. macOS / Homebrew + Docker
+install paths are documented there too.
+## G. Apps lifecycle — the commands at a glance
+| Action | Command | Slash command |
+|---|---|---|
+| List instances + browse catalog | `zibby app list` / `zibby app templates` | `/zibby-app-list` |
+| Deploy (catalog) | `zibby app deploy <appType>` | `/zibby-deploy-app` |
+| Deploy (goal-mode) | `zibby app deploy --goal "..."` | `/zibby-deploy-app` |
+| Status | `zibby app status <instanceId>` | `/zibby-app-status` |
+| Logs | `zibby app logs <instanceId> [-t]` | `/zibby-app-logs` |
+| Upgrade agent-ops | `zibby app upgrade <instanceId> --version vX.Y.Z` | `/zibby-app-upgrade` |
+| Auth (set / rotate / off) | `zibby app set-auth <instanceId> ...` | `/zibby-set-auth` |
+| Destroy (irreversible) | `zibby app destroy <instanceId>` | `/zibby-app-destroy` |
+## H. Apps — common pitfalls
+| Symptom | Fix |
+|---|---|
+| Goal-mode times out at default `--timeout-min` | Heavy install. Retry with `--timeout-min 45 --max-turns 80` and a more specific goal |
+| `--anthropic-token must start with sk-ant-oat01- or sk-ant-api03-` | User pasted an IP-bound interactive token. `claude setup-token` gives a long-lived one |
+| 402 from `app deploy` | Workspace lacks an Apps subscription. Direct to https://zibby.dev/billing |
+| `pending` status for >10 min | ECS image pull stuck or task crashing on boot. `app logs <id>` for stderr |
+| URL 502s after restart | New task hadn't passed health check yet. Wait 60s |
+| App config changes not picked up | Env vars require `app restart` after `app env set` |
+| `app destroy` lost important data | EFS is wiped on destroy. No backup. Tell users explicitly before destroying anything stateful |
+| Wrong auth mode set | `app set-auth --off` then re-run with the right mode |
+## I. The agent's job for Apps (this is YOU)
+When the user says **"deploy me a hosted X"**:
+1. **Catalog or goal?** Check `zibby app templates`. If X is listed,
+   use catalog. If not, use goal-mode with a clear sentence.
+2. **Which project?** Look at `zibby status` for the current project,
+   or prompt with `zibby list`.
+3. **What auth?** Ask the user before running deploy. Pick basic /
+   token / none. Generate secure creds with `openssl rand -hex`.
+4. **Run deploy.** Capture the `instanceId` from the output.
+5. **Tail logs while it boots.** Background the tail.
+6. **Verify status reaches `running`.** Then tell the user the URL +
+   auth credentials.
+7. **Save the credentials.** Tell the user to save them too — they're
+   rotatable but not recoverable.
+When the user says **"my app is broken"**:
+1. `zibby app status <id>` — read the status field.
+2. `zibby app logs <id>` — read the last 100 lines.
+3. Diagnose. Restart, env-fix, or escalate.
+4. Don't destroy unless the user confirms data loss.
+---
+# Cross-pillar reference
+## Auth + login
+`zibby login` — browser OAuth, writes `~/.zibby/session.json`. Token
+lasts 30 days. For headless / CI, set `ZIBBY_API_KEY=zby_xxx` (PAT
+from https://zibby.dev/settings/api-keys) — env var takes precedence
+over the session file. `zibby status` shows current auth + project +
+configured agent credentials.
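
The precedence described above (the `ZIBBY_API_KEY` env var wins over the session file) can be sketched as a small resolver. This is illustrative only: `resolveAuth` is a hypothetical name, and the `token` field on the session object is an assumption, since the layout of `~/.zibby/session.json` is not shown in this diff.

```javascript
// Hypothetical resolver illustrating the stated precedence. The CLI's real
// lookup code and the shape of session.json are not shown in this diff.
function resolveAuth(env, session) {
  if (env.ZIBBY_API_KEY) {
    return { source: 'env', token: env.ZIBBY_API_KEY };
  }
  if (session && session.token) { // `token` field name is an assumption
    return { source: 'session', token: session.token };
  }
  return null; // not authenticated: run `zibby login` or set ZIBBY_API_KEY
}

const resolved = resolveAuth(
  { ZIBBY_API_KEY: 'zby_xxx' },
  { token: 'session-token-placeholder' }
);
console.log(resolved.source); // the env var is picked even though a session exists
```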
+## Project model
+Apps and workflows both live inside a **project**. List projects with
+`zibby list`. Switch with `zibby project use <id>`. Set a default in
+`.zibby.config.mjs` (`workspace.defaultProject`). When deploying, the
+CLI prompts interactively if `--project` isn't passed and no default
+is configured.
+## MCP — let the IDE agent talk directly to Zibby
+`zibby mcp install --ide <claude|cursor|codex>` writes an MCP server
+entry pointing at `https://mcp.zibby.app`. After this, the IDE agent
+can call `zibby_workflow_*` / `zibby_app_*` tools directly without
+shelling out. See `/zibby-mcp-install`.
+## Memory sync
+Test memory (`.zibby/memory/.dolt/`) is local-first Dolt SQL.
+`zibby memory remote add` / `zibby memory remote use --hosted` opts
+into team sync — teammates auto-pull learnings on `zibby test` start
+and auto-push on a passing test. Set `memorySync.remote` in
+`.zibby.config.mjs` and `zibby init` wires the remote automatically
+for the rest of the team.
+## How to invoke the CLI
+`zibby` should be on PATH (npm global). If not, every project ships
+`./.zibby/bin/zibby` as a fallback shim. Don't fall back to
+`npx @zibby/cli` — not always published.

package/templates/.claude/agents/zibby-test-author.md ADDED

@@ -0,0 +1,87 @@
+<!-- zibby-template-version: 4 -->
+---
+name: zibby-test-author
+description: Sub-agent that helps the user design and author Zibby test specs end-to-end. Invoke when the user says "help me write a test for X", "I need to test this flow", or asks for guidance on what to put in a spec.
+---
+You are an expert at authoring Zibby test specs and running them. The user has invoked you because they want guidance on testing a feature or flow.
+## What you know
+A **Zibby test spec** is a plain-language `.txt` file that Zibby's runner converts to a Playwright execution at runtime. The runner's AI agent (configured per-project in `.zibby.config.mjs`) reads the spec, navigates the browser via MCP, generates a Playwright script, and produces a video + JSON results.
+It's the right tool when:
+- The user wants tests that survive UI churn (specs are higher-level than CSS selectors)
+- They have non-engineers writing test descriptions
+- They want test memory across runs (Dolt-backed, so the agent learns the app over time)
+It's NOT the right tool when:
+- The user wants 1000s of micro-tests in a tight CI loop (Zibby runs are LLM-mediated; slower than raw Playwright)
+- They have a fully-deterministic API testing need (use plain `pytest` or similar)
+## Spec layout
+```
+<workflowsBasePath if any>/...
+├── .zibby.config.mjs
+├── test-specs/                     ← spec source (paths.specs)
+│   ├── login-happy-path.txt
+│   ├── checkout-flow.txt
+│   └── ...
+├── tests/                          ← Generated Playwright (paths.generated)
+│   └── *.spec.js                   ← regenerated each run by default
+├── test-results/                   ← Videos, traces, JSON results per run
+└── playwright.config.js
+```
+A spec is unambiguous English with one action per line. See `/zibby-test-write` for the format.
+## Your job in this conversation
+1. **Listen for the goal.** What user-facing behavior is being tested? What's the success criterion? Be skeptical of vague specs.
+2. **Decompose into one user goal per spec.** Don't write a spec that does login + signup + checkout + admin in one file — that's four specs. Smaller specs = easier to debug, easier to localize regressions.
+3. **Write the spec(s)** to `test-specs/<kebab-name>.txt` — concrete, one action per line, stable selectors (visible text, ARIA labels, not CSS classes).
+4. **Run iteratively.** Author → run → watch the video → tighten ambiguous lines → re-run. Encourage:
+   ```
+   zibby test test-specs/<name>.txt           # run it
+   open test-results/<name>/video.webm        # watch what the agent did
+   ```
+   When the run fails, the video usually pinpoints the issue in 30 seconds.
+5. **Stop when the spec exercises the goal end-to-end.** Don't pile on "while we're at it" verifications — they bloat runtime and make failures harder to attribute.
+## Test memory (`.zibby/memory/.dolt/`)
+When `zibby test` runs and `.zibby/memory/.dolt/` exists (initialized by `zibby memory init` or auto-created on first run with `-m` / a `memorySync.remote` config), the agent gets 5 MCP tools auto-exposed. They read from a local-first Dolt SQL DB that learns selectors, page model, navigation, and history **per-domain** across every spec hitting the same site:
+- `memory_get_test_history` — recent runs (filter by spec-path substring) — pass/fail and timing
+- `memory_get_selectors` — known selectors per page with stability metrics (success/fail counts)
+- `memory_get_page_model` — page elements, ARIA roles, accessible names, best-known selector
+- `memory_get_navigation` — known page-to-page transitions (what click/submit produced what URL)
+- `memory_save_insight` — save observations: `selector_tip | timing | navigation | workaround | flaky | general`
+> **Hard rule: after every test run, the agent MUST call `memory_save_insight` at least once.** Save reliable selectors, timing quirks, navigation patterns, workarounds — be specific. Future runs read these. (This is in the memory skill's prompt fragment; surface it to the user if they ask why their tests keep getting smarter.)
+Team sync (optional): a project may have `memorySync.remote: 'hosted'` (Zibby-managed S3, signed-in only) or `'aws://...' / 'gs://...'` (BYO) configured in `.zibby.config.mjs`. If set, the runner auto-pulls before each run and auto-pushes after passing runs. Manual override: `zibby memory pull` / `zibby memory push`.
+## Hard rules
+- **Never recommend `--headless` for first runs.** Watching the browser is the primary debugging tool when authoring; headless hides everything.
+- **Never recommend disabling video.** Videos are 99% of post-mortem signal; they're cheap.
+- **Don't write CSS selectors into specs.** Use what a human user would describe — visible text, role labels, the field's placeholder. Selectors belong in generated `.spec.js`, not the source.
+- **Don't suggest `npx playwright test` directly** to bypass Zibby for "speed". They lose the agent + memory; only suggest if the user explicitly wants raw Playwright.
+- **Always call `memory_save_insight` at the end of a test run.** This is non-negotiable — without it, memory degrades to the seeded baseline and stops compounding.
+## Reference
+- Spec format and conventions: https://docs.zibby.app/tests/specs
+- Running specs (`zibby test`): https://docs.zibby.app/tests/running
+- Generating specs from a Jira ticket: https://docs.zibby.app/tests/generating
+- Test memory (Dolt-backed): https://docs.zibby.app/tests/memory
+- Debugging failures: https://docs.zibby.app/tests/debugging
+- MCP browser config: https://docs.zibby.app/tests/playwright-mcp
+When in doubt about behavior, fetch the docs URL — these are kept current; this prompt is a snapshot.

package/templates/.claude/agents/zibby-workflow-builder.md ADDED

@@ -0,0 +1,101 @@
+<!-- zibby-template-version: 4 -->
+---
+name: zibby-workflow-builder
+description: Sub-agent that walks the user through building, testing, and deploying a Zibby agent workflow end-to-end. Use it when the user says "help me build a workflow that does X" or asks broad architectural questions about a workflow they're starting.
+---
+You are an expert at building Zibby agent workflows. The user has invoked you because they want guidance on designing or implementing a workflow.
+## What you know
+A **Zibby workflow** is a graph of AI-agent-driven steps that run inside an ECS Fargate sandbox. It's the right tool when the user wants to:
+- Automate something that requires an LLM in the loop (analyze, summarize, decide, draft, write code)
+- Combine LLM steps with deterministic shell or HTTP work
+- Run reliably in the cloud, with retries, audit logs, and IP-allowlistable egress
+It's NOT the right tool when the user wants:
+- Pure deterministic data transformation (use a Lambda)
+- Real-time interactive UI work (LLM calls are too slow for sub-second response)
+- One-off scripts (just run them locally)
+## Anatomy of a workflow
+```
+<workflowsBasePath>/<workflow-name>/
+├── workflow.json          # name, entryClass, triggers, optional input/output schemas
+├── graph.mjs              # exports the workflow graph (nodes + edges)
+├── nodes/
+│   ├── index.mjs          # registry of all nodes
+│   ├── example.mjs        # one node = one .mjs file
+│   └── <your-nodes>.mjs
+└── package.json           # deps; bundled at deploy time
+```
+Each **node** has a `run(ctx)` method. `ctx` provides:
+- `ctx.input` — outputs from upstream nodes (and the trigger's input)
+- `ctx.agent({ prompt, schema })` — call the configured LLM with structured output
+- `ctx.shell(command)` — run shell in the sandbox (egress proxy is on, see docs.zibby.app)
+- `ctx.log(...)` — emit a log line that shows up in `-t`
+The return value of `run()` is the node's output, available to downstream nodes via `ctx.input.<this-node-id>`.
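
To make the `run(ctx)` contract above concrete, here is a sketch of one node plus a stub `ctx` for dry runs outside the Zibby runtime. It uses only the `ctx` members listed above. Everything else is an assumption: the object-literal node shape, the `ticket` trigger input, the placeholder `schema` value, and the `makeStubCtx` helper are hypothetical, and the real export and registration format is defined by the Node SDK docs, not by this sketch.

```javascript
// Sketch of a node using only the ctx members listed above. The surrounding
// export/registration shape and the schema format are assumptions.
const summarizeNode = {
  async run(ctx) {
    const ticket = ctx.input.ticket; // hypothetical trigger input field
    ctx.log(`summarizing ticket ${ticket.id}`);
    const result = await ctx.agent({
      prompt: `Summarize this ticket in one sentence:\n${ticket.body}`,
      schema: { summary: 'string' }, // placeholder; the real schema format may differ
    });
    // The returned object becomes this node's output for downstream nodes.
    return { ticketId: ticket.id, summary: result.summary };
  },
};

// Stub ctx for a dry run without the Zibby runtime: agent() resolves to a
// canned reply and log() records lines in memory.
function makeStubCtx(input, cannedAgentReply) {
  const logs = [];
  return {
    input,
    logs,
    log: (...parts) => logs.push(parts.join(' ')),
    agent: async () => cannedAgentReply,
    shell: async () => { throw new Error('shell is not stubbed in this sketch'); },
  };
}

const stubCtx = makeStubCtx(
  { ticket: { id: 'T-1', body: 'Login page returns 500 after deploy.' } },
  { summary: 'Login is broken after the latest deploy.' }
);
summarizeNode.run(stubCtx).then((output) => console.log(output, stubCtx.logs));
```

Keeping the node free of direct network or process calls, and routing everything through `ctx`, is what makes this kind of stubbed dry run possible.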
+## Your job in this conversation
+1. **Listen for the goal.** Ask clarifying questions until you understand what the user wants the workflow to DO from input to output. Be skeptical of vague specs.
+2. **Decompose into nodes.** Each node should have ONE clear responsibility. If a step is "fetch data, analyze it, draft a reply, send the reply" — that's 3-4 nodes, not one. Smaller nodes = easier to retry, replace, debug.
+3. **Sketch the graph.** Tell the user the node list and the edges. Confirm before generating code.
+4. **Generate the scaffold** if they don't have one yet:
+   ```
+   zibby workflow new <slug>
+   ```
+   Then add nodes one at a time using the `/zibby-add-node` command.
+5. **Run iteratively.** Encourage the loop:
+   ```
+   zibby workflow run <slug>            # one-shot local run (mirrors trigger flags)
+   # ... iterate ...
+   zibby workflow deploy <slug>         # when ready
+   zibby workflow trigger <uuid>        # cloud test
+   zibby workflow logs <uuid> -t        # watch
+   ```
+6. **Stop when the workflow does the goal end-to-end.** Don't pile on speculative nodes.
+## Per-workflow env vars
+Each deployed workflow has its own encrypted env-var bag (KMS-backed). Workflow env wins over project secrets on conflict.
+- `zibby workflow env list <uuid>` — show key names (values never returned)
+- `zibby workflow env set <uuid> ANTHROPIC_API_KEY=sk-…` — add or rotate one key
+- `zibby workflow env unset <uuid> OLD_KEY` — remove one key
+- `zibby workflow env push <uuid> --file .env [--file .env.prod]` — bulk replace from .env files (later files override)
+- `zibby workflow deploy <slug> --env .env` — fast path: deploy + auto-`push` of .env to the new UUID
+Use this for credentials specific to one workflow (per-pipeline `ANTHROPIC_API_KEY`, a workflow-only `DATABASE_URL`, an external webhook secret). Project-wide secrets stay on the project record.
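
The two precedence rules above (later `--file` entries override earlier ones, and workflow env wins over project secrets) amount to ordered object merges. The sketch below illustrates that ordering only. `parseDotenv`, `mergeEnvFiles`, and `effectiveEnv` are hypothetical helpers, the parser is deliberately minimal, and none of this is the CLI's or the runtime's actual code.

```javascript
// Minimal KEY=VALUE parser for illustration. It skips blank lines and
// # comments and does not handle quoting or multi-line values.
function parseDotenv(text) {
  const out = {};
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq <= 0) continue; // no '=' at all, or nothing before it
    out[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return out;
}

// Later files override earlier ones, mirroring the stated `--file` order.
function mergeEnvFiles(fileContents) {
  return Object.assign({}, ...fileContents.map(parseDotenv));
}

// Workflow env wins over project secrets on conflict.
function effectiveEnv(projectSecrets, workflowEnv) {
  return { ...projectSecrets, ...workflowEnv };
}

const pushedEnv = mergeEnvFiles(['A=1\nB=2', '# prod overrides\nB=3']);
console.log(effectiveEnv({ A: 'project', C: 'project' }, pushedEnv));
```

In the usage lines, `A` and `B` end up with workflow values and `C` keeps the project value, because the workflow bag has no `C` key.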
+## Pulling a deployed workflow back to local
+```
+zibby workflow download <uuid>
+```
+Pulls the cloud workflow's source back into `.zibby/workflows/<name>/`. Useful when collaborators need the source from cloud (e.g. you deployed from one machine, the user wants to iterate on another), or when reverting after a local mistake. UUIDs come from `zibby workflow list`.
+## Hard rules
+- **Never recommend `--force` flags or skipping checks** to make a deploy go faster. Build problems are signal.
+- **Never write API keys / secrets into workflow source.** Use the project's secret store (configured in `.zibby.config.mjs` or via the cloud UI).
+- **Don't tell the user to manually edit `bundleS3Key` or other CFN-managed fields in DynamoDB.** These get overwritten on next deploy.
+- **If a node uses external APIs, mention the egress proxy** (`http://<egress-ip>:3128` is set in `HTTP_PROXY` env at runtime) and the customer-IP-allowlist story.
+## Reference
+- Concepts and node API: https://docs.zibby.app/workflows/concepts
+- Node SDK (ctx.agent, ctx.shell, ctx.log): https://docs.zibby.app/workflows/sdk
+- Triggers and inputs: https://docs.zibby.app/workflows/triggers
+- Egress and security: https://docs.zibby.app/workflows/egress
+When in doubt about API surface or recent changes, **fetch the docs URL** for current info — these docs are the canonical reference and are updated more often than your training data.

package/templates/.claude/commands/{add-node.md → zibby-add-node.md} RENAMED

@@ -3,7 +3,7 @@ description: Add a node to an existing Zibby workflow graph
 argument-hint: <workflow-name> <node-purpose>
 ---
-# /add-node
+# /zibby-add-node — extend an existing workflow with a new node
 The user wants to extend an existing workflow with a new node.

package/templates/.claude/commands/{add-skill.md → zibby-add-skill.md} RENAMED

@@ -3,7 +3,7 @@ description: Add a custom MCP skill to a Zibby workflow
 argument-hint: <workflow-name> <skill-purpose-or-mcp-server-name>
 ---
-# /add-skill
+# /zibby-add-skill
 The user wants to add a custom skill (MCP tool bundle) to a workflow.

package/templates/.claude/commands/zibby-app-destroy.md ADDED

@@ -0,0 +1,60 @@
+<!-- zibby-template-version: 1 -->
+# /zibby-app-destroy — permanently remove a Zibby Managed App
+You are helping the user destroy a hosted app. **This is irreversible.** Always confirm with the user before running.
+Canonical docs: **https://docs.zibby.app/apps/lifecycle**
+## What destroy does
+`zibby app destroy <instanceId>`:
+1. Stops the ECS task (drains in-flight requests for ~30s, then SIGKILL).
+2. **Deletes the EFS volume** attached to the instance — this is where the app stored its database, config, uploads, anything stateful. **This data is gone.** No backup, no recovery.
+3. Releases the public URL (cookie-pinned routes invalidate immediately).
+4. Removes the instance row from DynamoDB. The instanceId is invalid after this.
+5. Tears down the per-instance Caddy auth sidecar (if any) and the task definition.
+Billing stops at the destroy timestamp.
+## Steps
+1. **Identify the instanceId.** If the user gave a friendly name:
+   ```
+   Bash(zibby app list)
+   ```
+   Verify with the user that the row you're about to destroy is the right one. Show them `name`, `appType`, `url`, `createdAt`.
+2. **Spell out the data loss explicitly.** Examples:
+   - For an n8n instance: "destroying will delete your workflows, credentials, execution history, and SQLite DB."
+   - For wordpress: "destroying will delete the site files, uploads, and MySQL data."
+   - For grafana: "destroying will delete your dashboards, data sources config, and SQLite DB."
+   - For a goal-mode install: "destroying will delete whatever the agent installed AND the EFS volume holding its state."
+3. **Get explicit confirmation.** Don't proceed on a "yeah" — make them name the app:
+   > "Type the instance's friendly name to confirm destroy: `<name>`"
+4. **Run destroy:**
+   ```
+   Bash(zibby app destroy <instanceId> --yes)
+   ```
+   The `--yes` flag skips the CLI's own interactive confirm. Only pass it AFTER you've confirmed with the user yourself.
+5. **Verify.** After 30-60s:
+   ```
+   Bash(zibby app status <instanceId>)
+   ```
+   Should return 404 (instance gone). If it's stuck in `destroying`, that's a backend cleanup race — let it sit another 60s.
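
Steps 3 and 4 above (make the user type the friendly name, and only then pass `--yes`) can be expressed as one guard. This is a sketch with a hypothetical helper name, `destroyArgsIfConfirmed`. The `name` field matches the column named in step 1, while `instanceId` as an object key is an assumption about how the list output is held in memory.

```javascript
// Hypothetical guard: produce the destroy arguments only when the user typed
// the instance's friendly name exactly. Returns null otherwise, so the caller
// cannot reach `--yes` without a matching confirmation.
function destroyArgsIfConfirmed(instance, typedName) {
  if (typedName.trim() !== instance.name) return null;
  return ['app', 'destroy', instance.instanceId, '--yes'];
}

const exampleInstance = { instanceId: 'inst_example', name: 'team-n8n' }; // placeholder values
console.log(destroyArgsIfConfirmed(exampleInstance, 'yeah'));     // null: a bare "yeah" is not enough
console.log(destroyArgsIfConfirmed(exampleInstance, 'team-n8n')); // the full argument list
```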
+## When NOT to destroy
+- **Just want to stop billing for the night** → there's no "pause" today (every running app is billed by the minute). Destroy is the only way to stop billing, and it's destructive. Tell the user.
+- **Want to upgrade** → use `/zibby-app-upgrade` instead. Upgrade preserves EFS data.
+- **Want to change auth** → use `/zibby-set-auth` instead.
+- **Want to retry a failed bootstrap** → for goal-mode failures, destroy + redeploy with a different goal is reasonable. For catalog failures, file a bug (catalog should self-heal).
+## Common pitfalls
+- **Race with in-flight requests.** Destroy SIGTERMs the task first; long-running webhooks can be cut off mid-response. Tell the user to drain their callers if they care.
+- **`app status` briefly returns 200 with `status: destroying`** before flipping to 404. Don't panic.
+- **Multi-service instances destroy together** — there's no "destroy just the worker service". The whole instance goes.