@zibby/cli 0.5.8 → 0.5.9

Files changed (73)
  1. package/dist/commands/init.js +100 -95
  2. package/dist/commands/workflows/generate.js +156 -151
  3. package/dist/package.json +2 -2
  4. package/dist/templates/.claude/CLAUDE.md +318 -10
  5. package/dist/templates/.claude/agents/zibby-test-author.md +87 -0
  6. package/dist/templates/.claude/agents/zibby-workflow-builder.md +101 -0
  7. package/dist/templates/.claude/commands/{add-node.md → zibby-add-node.md} +1 -1
  8. package/dist/templates/.claude/commands/{add-skill.md → zibby-add-skill.md} +1 -1
  9. package/dist/templates/.claude/commands/zibby-app-destroy.md +60 -0
  10. package/dist/templates/.claude/commands/zibby-app-list.md +80 -0
  11. package/dist/templates/.claude/commands/zibby-app-logs.md +60 -0
  12. package/dist/templates/.claude/commands/zibby-app-status.md +53 -0
  13. package/dist/templates/.claude/commands/zibby-app-upgrade.md +67 -0
  14. package/dist/templates/.claude/commands/zibby-debug.md +67 -0
  15. package/dist/templates/.claude/commands/zibby-delete.md +37 -0
  16. package/dist/templates/.claude/commands/zibby-deploy-app.md +92 -0
  17. package/dist/templates/.claude/commands/zibby-deploy.md +87 -0
  18. package/dist/templates/.claude/commands/zibby-list.md +30 -0
  19. package/dist/templates/.claude/commands/zibby-login.md +68 -0
  20. package/dist/templates/.claude/commands/zibby-mcp-install.md +93 -0
  21. package/dist/templates/.claude/commands/zibby-memory-cost.md +39 -0
  22. package/dist/templates/.claude/commands/zibby-memory-pull.md +47 -0
  23. package/dist/templates/.claude/commands/zibby-memory-remote-use-hosted.md +61 -0
  24. package/dist/templates/.claude/commands/zibby-memory-stats.md +38 -0
  25. package/{templates/.claude/commands/new-workflow.md → dist/templates/.claude/commands/zibby-new-workflow.md} +1 -1
  26. package/dist/templates/.claude/commands/zibby-set-auth.md +85 -0
  27. package/dist/templates/.claude/commands/zibby-static-ip.md +70 -0
  28. package/dist/templates/.claude/commands/zibby-status.md +54 -0
  29. package/dist/templates/.claude/commands/zibby-tail.md +53 -0
  30. package/dist/templates/.claude/commands/zibby-test-debug.md +59 -0
  31. package/dist/templates/.claude/commands/zibby-test-generate.md +39 -0
  32. package/dist/templates/.claude/commands/zibby-test-run.md +49 -0
  33. package/dist/templates/.claude/commands/zibby-test-write.md +46 -0
  34. package/dist/templates/.claude/commands/zibby-trigger.md +56 -0
  35. package/{templates/.claude/commands/validate-workflow.md → dist/templates/.claude/commands/zibby-validate-workflow.md} +1 -1
  36. package/dist/templates/.cursor/rules/zibby.mdc +127 -0
  37. package/dist/templates/AGENTS.md +248 -0
  38. package/dist/utils/managed-block.js +6 -0
  39. package/package.json +2 -2
  40. package/templates/.claude/CLAUDE.md +318 -10
  41. package/templates/.claude/agents/zibby-test-author.md +87 -0
  42. package/templates/.claude/agents/zibby-workflow-builder.md +101 -0
  43. package/templates/.claude/commands/{add-node.md → zibby-add-node.md} +1 -1
  44. package/templates/.claude/commands/{add-skill.md → zibby-add-skill.md} +1 -1
  45. package/templates/.claude/commands/zibby-app-destroy.md +60 -0
  46. package/templates/.claude/commands/zibby-app-list.md +80 -0
  47. package/templates/.claude/commands/zibby-app-logs.md +60 -0
  48. package/templates/.claude/commands/zibby-app-status.md +53 -0
  49. package/templates/.claude/commands/zibby-app-upgrade.md +67 -0
  50. package/templates/.claude/commands/zibby-debug.md +67 -0
  51. package/templates/.claude/commands/zibby-delete.md +37 -0
  52. package/templates/.claude/commands/zibby-deploy-app.md +92 -0
  53. package/templates/.claude/commands/zibby-deploy.md +87 -0
  54. package/templates/.claude/commands/zibby-list.md +30 -0
  55. package/templates/.claude/commands/zibby-login.md +68 -0
  56. package/templates/.claude/commands/zibby-mcp-install.md +93 -0
  57. package/templates/.claude/commands/zibby-memory-cost.md +39 -0
  58. package/templates/.claude/commands/zibby-memory-pull.md +47 -0
  59. package/templates/.claude/commands/zibby-memory-remote-use-hosted.md +61 -0
  60. package/templates/.claude/commands/zibby-memory-stats.md +38 -0
  61. package/{dist/templates/.claude/commands/new-workflow.md → templates/.claude/commands/zibby-new-workflow.md} +1 -1
  62. package/templates/.claude/commands/zibby-set-auth.md +85 -0
  63. package/templates/.claude/commands/zibby-static-ip.md +70 -0
  64. package/templates/.claude/commands/zibby-status.md +54 -0
  65. package/templates/.claude/commands/zibby-tail.md +53 -0
  66. package/templates/.claude/commands/zibby-test-debug.md +59 -0
  67. package/templates/.claude/commands/zibby-test-generate.md +39 -0
  68. package/templates/.claude/commands/zibby-test-run.md +49 -0
  69. package/templates/.claude/commands/zibby-test-write.md +46 -0
  70. package/templates/.claude/commands/zibby-trigger.md +56 -0
  71. package/{dist/templates/.claude/commands/validate-workflow.md → templates/.claude/commands/zibby-validate-workflow.md} +1 -1
  72. package/templates/.cursor/rules/zibby.mdc +127 -0
  73. package/templates/AGENTS.md +248 -0
@@ -1,15 +1,78 @@
1
- # Zibby project — how to write and ship workflows
1
+ # Zibby project — how to build with Zibby
2
2
 
3
- This file is auto-loaded by Claude Code / Cursor / Codex when working in
4
- this repo. It's the canonical reference for building Zibby workflows.
3
+ This file is auto-loaded by Claude Code / Cursor / Codex / Aider when
4
+ working in this repo. It's the canonical reference for building things
5
+ on Zibby.
5
6
 
6
- You are an AI agent. The user describes what they want; you write the
7
- workflow code (graph + nodes + skills), test it locally, and deploy. The
8
- user shouldn't need to read the code you produce — they just describe
9
- the intent.
7
+ You are an AI agent. The user describes what they want; you write
8
+ the code (workflow graph, scripts, infra glue), deploy what needs to
9
+ be deployed, and operate it. The user shouldn't need to read the
10
+ code you produce — they just describe the intent.
10
11
 
11
12
  ---
12
13
 
14
+ ## What is Zibby?
15
+
16
+ Zibby is two things, sharing one account, one CLI, one Studio, one
17
+ billing surface:
18
+
19
+ 1. **Workflows** — event-driven AI agent graphs that run inside an
20
+ ECS Fargate sandbox in Zibby Cloud. Each workflow is a directed
21
+ graph of nodes; nodes are LLM-driven or deterministic code. Used
22
+ for automation that needs an LLM in the loop: analyze tickets,
23
+ draft replies, write code, summarize content. Triggered via
24
+ webhook, schedule, or CLI.
25
+
26
+ 2. **Apps** — long-running hosted SaaS instances. Pick from a curated
27
+ catalog (n8n, Grafana, Outline, …) or describe a goal in natural
28
+ language ("a Rails 7 app with Postgres from this git repo") and an
29
+ `agent-ops` supervisor installs + maintains it for you. Each app
30
+ runs on Fargate with its own EFS volume, public URL, and optional
31
+ auth sidecar.
32
+
33
+ The two surfaces share:
34
+
35
+ - One account + one workspace login (`zibby login`)
36
+ - One project model (apps and workflows live inside a project)
37
+ - One CLI binary (`zibby workflow ...`, `zibby app ...`)
38
+ - One Studio (the desktop client)
39
+ - One billing tier
40
+
41
+ ### Decision table — when to use which
42
+
43
+ | User wants… | Use | Why |
44
+ |---|---|---|
45
+ | "Run code on a schedule, with an LLM in the middle" | Workflow | Built for transient event-driven runs |
46
+ | "Get a Slack notification when a server is down" | Workflow | Trigger by webhook / cron |
47
+ | "Host my n8n / Grafana / Outline / Mattermost" | App | Long-running web service |
48
+ | "Spin up a Postgres for a hackathon" | App (goal-mode) | Persistent backing service |
49
+ | "Auto-bootstrap an arbitrary OSS project on a VPS" | App (goal-mode) | Agent figures out the install |
50
+ | "Real-time interactive UI work, sub-second response" | Neither | LLM calls are too slow; use Lambda or your own backend |
51
+ | "Pure deterministic data transform, no LLM needed" | Neither (use Lambda) | Workflows assume LLM-in-loop; oversized if you don't need one |
52
+
53
+ ### What the agent (this means you) should do
54
+
55
+ When the user says **"I want X"**:
56
+
57
+ 1. Decide: is X a workflow or an app? (Use the table above.)
58
+ 2. Confirm with the user before generating code or running deploys.
59
+ 3. For workflows — follow §1-9 below to scaffold, validate, run
60
+ locally, then ask before deploying.
61
+ 4. For apps — see the **Apps** section after the Workflows reference.
62
+ Always ask about auth + project before deploying.
63
+ 5. Use slash commands as recipes:
64
+ - `/zibby-new-workflow`, `/zibby-add-node`, `/zibby-validate-workflow`,
65
+ `/zibby-deploy`, `/zibby-trigger`, `/zibby-debug`, `/zibby-tail`
66
+ - `/zibby-deploy-app`, `/zibby-app-status`, `/zibby-app-logs`,
67
+ `/zibby-app-destroy`, `/zibby-app-restart`, `/zibby-app-upgrade`,
68
+ `/zibby-app-list`, `/zibby-set-auth`, `/zibby-app-env`
69
+ - `/zibby-login`, `/zibby-status`, `/zibby-mcp-install`,
70
+ `/zibby-workflow-env`
71
+
72
+ ---
73
+
74
+ # Pillar 1: Workflows
75
+
13
76
  ## 0. The 30-second tour
14
77
 
15
78
  ```
@@ -315,7 +378,6 @@ import { WorkflowAgent, WorkflowGraph } from '@zibby/core';
315
378
 
316
379
  Skills register on `globalThis` so any node that runs in this process
317
380
  can use them. In cloud, set required `envKeys` via
318
- `zibby workflow env set slack ZIBBY_SECRET SLACK_BOT_TOKEN=xoxb-...`.
319
381
 
320
382
  ### Custom skill via a non-MCP function
321
383
 
@@ -400,7 +462,6 @@ zibby workflow deploy code-review
400
462
  ```
401
463
 
402
464
  Bundles the workflow folder, uploads to Zibby Cloud. Cloud runs on
403
- Fargate, picks up per-node API keys you set via `zibby workflow env`.
404
465
 
405
466
  ### Trigger remote run
406
467
 
@@ -458,7 +519,6 @@ cleaned up too.
458
519
  | Custom-code node returns nothing | `return` an object matching `outputSchema`. `undefined` = failure |
459
520
  | Agent ignores your skill's tools | Add `skills: ['name']` to the node config, not just register |
460
521
  | Zod error: "Expected string, received undefined" | The previous node's outputSchema doesn't match its return — fix the producer, not the consumer |
461
- | `workflow trigger` works but `run` doesn't | Local run reads env from `.env` / shell; cloud reads from `zibby workflow env`. Set both. |
462
522
  | Hangs forever on a node | Add `retries: 0` to fail fast while debugging; check the prompt isn't asking the agent to wait |
463
523
  | `Workflow "<name>" not found.` | Check `paths.workflows` in `.zibby.config.mjs` matches where you scaffolded. Default is `workflows/` at repo root. |
464
524
  | Router returns a string that's not a registered node name | All possible return values must be either `'END'` or a name passed to `graph.addNode(...)` elsewhere. `validate` flags this as `graph-edge-to-unknown`. |
@@ -539,3 +599,251 @@ export class MyWorkflow extends WorkflowAgent {
539
599
 
540
600
  `zibby workflow validate` accepts both shapes. Other commands need the
541
601
  class.
602
+
603
+ ---
604
+
605
+ # Pillar 2: Apps
606
+
607
+ ## A. The 30-second Apps tour
608
+
609
+ ```bash
610
+ zibby app templates # browse the catalog
611
+ zibby app deploy n8n --project <id> # catalog deploy (deterministic)
612
+ zibby app deploy --goal "<text>" --project <id> # goal-mode (LLM bootstrap)
613
+ zibby app list # what's running
614
+ zibby app status <instanceId> # one instance's state
615
+ zibby app logs <instanceId> -t # live tail (container + supervisor)
616
+ zibby app set-auth <instanceId> --auth-type basic --auth-user admin --auth-password ...
617
+ zibby app upgrade <instanceId> --version vX.Y.Z # agent-ops base image bump
618
+ zibby app destroy <instanceId> --yes # permanently delete (EFS wiped)
619
+ ```
620
+
621
+ A Managed App is a long-running web service on Zibby's Fargate fleet. Each
622
+ instance has:
623
+ - An ECS task running `agent-ops` + (for catalog) the app image OR (for
624
+ goal-mode) whatever the agent installed
625
+ - A pinned EFS volume for persistent state (DB files, uploads, config)
626
+ - A public `https://<id>.apps.zibby.app` URL
627
+ - An optional Caddy auth sidecar (basic-auth, bearer token, or none)
628
+ - A KMS-encrypted env-var bag
629
+
630
+ ## B. Catalog vs goal-mode — which path
631
+
632
+ The two `app deploy` paths are mutually exclusive.
633
+
634
+ ### Catalog: `zibby app deploy <appType>`
635
+
636
+ - The backend uses a baked task definition. No LLM runs to install.
637
+ - Cold start: ~2-3 minutes (image pull + first boot).
638
+ - 20+ catalog entries: `n8n`, `grafana`, `wordpress`, `outline`,
639
+ `mattermost`, `gas-town`, `caddy-static`, `appsmith`, `flowise`,
640
+ `code-server`, `chatwoot`, `vaultwarden`, `umami`, `gitea`,
641
+ `nocodb`, `directus`, `posthog`, `metabase`, `langfuse`,
642
+ `flagsmith`. Run `zibby app templates` for the live list and per-app
643
+ `architecture` requirements.
644
+ - Predictable, supported, the right default for "I want X" when X is in
645
+ the catalog.
646
+
647
+ ### Goal-mode: `zibby app deploy --goal "<text>"`
648
+
649
+ - LLM bootstrap: an `agent-ops` task in the user's instance runs an
650
+ autonomous install loop driven by the goal text.
651
+ - Cold start: 5-30 minutes depending on what's being installed.
652
+ - Use for **anything not in the catalog**: custom apps, a specific
653
+ git repo, an OSS project not promoted yet, multi-service exotic
654
+ stacks.
655
+ - License responsibility for whatever gets installed sits with the
656
+ user, not Zibby (same shape as `apt install` on a generic VPS).
657
+
658
+ Goal-mode flags worth knowing:
659
+
660
+ | Flag | Default | When to override |
661
+ |---|---|---|
662
+ | `--provider claude\|codex` | `claude` | Pick the agent driving the install |
663
+ | `--model <id>` | known-cheap default | Pin a specific model (e.g. `claude-sonnet-4-6`) |
664
+ | `--anthropic-token sk-ant-...` | workspace-stored | Per-deploy token override; format `sk-ant-oat01-` (OAuth) or `sk-ant-api03-` (API) |
665
+ | `--max-turns N` | 25 | Heavy installs (n8n, OpenHands) need 60-100 |
666
+ | `--timeout-min N` | 20 | Heavy installs need 30-45 |
667
+ | `--arch x86_64\|arm64` | per-template | Override CPU arch (most catalog entries are arm64) |
668
+
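A minimal sketch of how a wrapper script might assemble a goal-mode deploy invocation from the flags in the table above. `buildGoalDeployArgs` is a hypothetical helper, not part of `@zibby/cli`. It only builds an argument array and checks the token prefixes the table mentions, and it never calls the CLI itself.

```javascript
// Hypothetical helper: assembles argv for `zibby app deploy --goal ...`.
// Flag names come from the table above; nothing here invokes the CLI.
const TOKEN_PREFIXES = ['sk-ant-oat01-', 'sk-ant-api03-'];

function buildGoalDeployArgs({ goal, project, maxTurns, timeoutMin, anthropicToken }) {
  if (!goal) throw new Error('goal text is required');
  const args = ['app', 'deploy', '--goal', goal];
  if (project) args.push('--project', project);
  if (maxTurns !== undefined) args.push('--max-turns', String(maxTurns));
  if (timeoutMin !== undefined) args.push('--timeout-min', String(timeoutMin));
  if (anthropicToken !== undefined) {
    // Reject tokens that do not carry one of the two documented prefixes.
    const ok = TOKEN_PREFIXES.some((prefix) => anthropicToken.startsWith(prefix));
    if (!ok) throw new Error('--anthropic-token must start with sk-ant-oat01- or sk-ant-api03-');
    args.push('--anthropic-token', anthropicToken);
  }
  return args;
}

// Example: a heavy install that needs more turns and a longer timeout.
const heavy = buildGoalDeployArgs({
  goal: 'a Rails 7 app with Postgres from this git repo',
  maxTurns: 80,
  timeoutMin: 45,
});
console.log(heavy);
```

Keeping the arguments as an array (rather than one joined string) means the goal text needs no shell quoting if the array is later handed to something like `child_process.spawn('zibby', heavy)`.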
669
+ ## C. Auth — every app gets a public URL, lock it down
670
+
671
+ Without auth, ANYONE with the `https://<id>.apps.zibby.app` URL can hit
672
+ the app. For tools like n8n / Grafana / Outline that's a real risk —
673
+ the URL is guessable from the catalog.
674
+
675
+ Three auth modes on the Caddy sidecar:
676
+
677
+ | Mode | When to use | Set with |
678
+ |---|---|---|
679
+ | `basic` | Quick personal tools, dashboards | `--auth-type basic --auth-user admin --auth-password ...` |
680
+ | `token` | API-only apps, scripted callers | `--auth-type token --auth-token ...` |
681
+ | `none` | App has its own login (n8n, wordpress) | `--auth-type none` or omit |
682
+
683
+ Set at deploy time, change after deploy with `zibby app set-auth`. Use
684
+ `zibby app set-auth <instanceId> --off` to remove auth entirely (only
685
+ safe if the app has its own login).
686
+
687
+ **Rotation:** re-run `set-auth` with new credentials. Old credentials
688
+ stop working immediately when the Caddy reload completes (~5s). No
689
+ container restart needed.
690
+
691
+ **Always generate credentials with `openssl rand -hex`** — never reuse
692
+ a user-typed password. Never log credentials. Save them once at deploy
693
+ time; they're not recoverable.
694
+
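For scripts that cannot shell out to `openssl`, a rough Node equivalent of the credential rule above might look like this. It is a sketch, assuming CommonJS and Node's built-in `crypto` module; `generateBasicAuth` is a hypothetical name, and the 24-byte default is an arbitrary choice rather than a Zibby requirement.

```javascript
// Sketch: Node counterpart to generating hex credentials with openssl.
const { randomBytes } = require('node:crypto');

// Returns a fresh basic-auth pair; N random bytes become 2N hex characters.
function generateBasicAuth(user = 'admin', bytes = 24) {
  return { user, password: randomBytes(bytes).toString('hex') };
}

const creds = generateBasicAuth();
// Pass the values straight to the deploy command or a secret store.
// Only the length is printed here, never the password itself.
console.log(`generated a ${creds.password.length}-character password for ${creds.user}`);
```

The password is returned to the caller rather than logged, which matches the "never log credentials" rule.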
695
+ ## D. Multi-service catalog entries
696
+
697
+ Some catalog entries run multiple containers in one task:
698
+ - `wordpress` → wordpress + mysql
699
+ - `mattermost` → mattermost + postgres
700
+ - `gas-town` → web + worker + scheduler
701
+
702
+ The instance has ONE status (whole-instance), ONE URL (the primary
703
+ service's), and a per-service log stream:
704
+
705
+ ```bash
706
+ zibby app logs <instanceId> --service mysql
707
+ zibby app logs <instanceId> --service agent-ops # the supervisor itself
708
+ ```
709
+
710
+ `zibby app status` lists every service under `services[]`. The Caddy
711
+ sidecar fronts only the `mainService` (declared in the catalog manifest).
712
+
713
+ ## E. The agent-ops supervisor
714
+
715
+ Every app runs an `agent-ops` sidecar — a small LLM-driven daemon
716
+ ([github.com/zibbyhq/agent-ops](https://github.com/zibbyhq/agent-ops),
717
+ Apache-2.0).
718
+
719
+ What the supervisor does:
720
+ - **Goal-mode**: runs the install loop on first boot. Verifies on a
721
+ schedule that the installed thing still works. Re-installs if it
722
+ doesn't (within budget).
723
+ - **Catalog**: runs scheduled health checks per the catalog's recipe.
724
+ Restarts misbehaving services. Notifies via webhook on
725
+ unrecoverable failures.
726
+
727
+ What it can do: run `shell` commands inside its task's filesystem
728
+ (scoped to the EFS volume + the task's egress proxy).
729
+
730
+ What it cannot do: touch other instances, your local machine, or any
731
+ Zibby control-plane resources. Sandbox by design.
732
+
733
+ To see the supervisor's trail: `zibby app logs <instanceId> --service agent-ops`.
734
+
735
+ ## F. BYOH — agent-ops on your own VPS
736
+
737
+ Don't want to host on Zibby's fleet? Run `agent-ops` directly on a VPS
738
+ you own. Same daemon, same configs, you handle the host:
739
+
740
+ ```bash
741
+ # Debian / Ubuntu
742
+ sudo install -d -m 0755 /etc/apt/keyrings
743
+ curl -fsSL https://dl.zibby.app/apt/key.gpg \
744
+ | sudo gpg --dearmor -o /etc/apt/keyrings/zibby.gpg
745
+ echo "deb [signed-by=/etc/apt/keyrings/zibby.gpg] https://dl.zibby.app/apt stable main" \
746
+ | sudo tee /etc/apt/sources.list.d/zibby.list
747
+ sudo apt update && sudo apt install agent-ops
748
+
749
+ # Register with your Zibby workspace (optional — lets the workspace see the host)
750
+ agent-ops register --pat zby_xxx
751
+ agent-ops init --template wordpress-multisite --yes
752
+ sudo agent-ops start
753
+ ```
754
+
755
+ Full docs: https://docs.zibby.app/apps/agent-ops. macOS / Homebrew + Docker
756
+ install paths are documented there too.
757
+
758
+ ## G. Apps lifecycle — the commands at a glance
759
+
760
+ | Action | Command | Slash command |
761
+ |---|---|---|
762
+ | List instances + browse catalog | `zibby app list` / `zibby app templates` | `/zibby-app-list` |
763
+ | Deploy (catalog) | `zibby app deploy <appType>` | `/zibby-deploy-app` |
764
+ | Deploy (goal-mode) | `zibby app deploy --goal "..."` | `/zibby-deploy-app` |
765
+ | Status | `zibby app status <instanceId>` | `/zibby-app-status` |
766
+ | Logs | `zibby app logs <instanceId> [-t]` | `/zibby-app-logs` |
767
+ | Upgrade agent-ops | `zibby app upgrade <instanceId> --version vX.Y.Z` | `/zibby-app-upgrade` |
768
+ | Auth (set / rotate / off) | `zibby app set-auth <instanceId> ...` | `/zibby-set-auth` |
769
+ | Destroy (irreversible) | `zibby app destroy <instanceId>` | `/zibby-app-destroy` |
770
+
771
+ ## H. Apps — common pitfalls
772
+
773
+ | Symptom | Fix |
774
+ |---|---|
775
+ | Goal-mode times out at default `--timeout-min` | Heavy install. Retry with `--timeout-min 45 --max-turns 80` and a more specific goal |
776
+ | `--anthropic-token must start with sk-ant-oat01- or sk-ant-api03-` | User pasted an IP-bound interactive token. `claude setup-token` gives a long-lived one |
777
+ | 402 from `app deploy` | Workspace lacks an Apps subscription. Direct to https://zibby.dev/billing |
778
+ | `pending` status for >10 min | ECS image pull stuck or task crashing on boot. `app logs <id>` for stderr |
779
+ | URL 502s after restart | The new task hasn't passed its health check yet. Wait 60s |
780
+ | App config changes not picked up | Env vars require `app restart` after `app env set` |
781
+ | `app destroy` lost important data | EFS is wiped on destroy. No backup. Tell users explicitly before destroying anything stateful |
782
+ | Wrong auth mode set | `app set-auth --off` then re-run with the right mode |
783
+
784
+ ## I. The agent's job for Apps (this is YOU)
785
+
786
+ When the user says **"deploy me a hosted X"**:
787
+
788
+ 1. **Catalog or goal?** Check `zibby app templates`. If X is listed,
789
+ use catalog. If not, use goal-mode with a clear sentence.
790
+ 2. **Which project?** Look at `zibby status` for the current project,
791
+ or prompt with `zibby list`.
792
+ 3. **What auth?** Ask the user before running deploy. Pick basic /
793
+ token / none. Generate secure creds with `openssl rand -hex`.
794
+ 4. **Run deploy.** Capture the `instanceId` from the output.
795
+ 5. **Tail logs while it boots.** Background the tail.
796
+ 6. **Verify status reaches `running`.** Then tell the user the URL +
797
+ auth credentials.
798
+ 7. **Save the credentials.** Tell the user to save them too — they're
799
+ rotatable but not recoverable.
800
+
801
+ When the user says **"my app is broken"**:
802
+
803
+ 1. `zibby app status <id>` — read the status field.
804
+ 2. `zibby app logs <id>` — read the last 100 lines.
805
+ 3. Diagnose. Restart, env-fix, or escalate.
806
+ 4. Don't destroy unless the user confirms data loss.
807
+
808
+ ---
809
+
810
+ # Cross-pillar reference
811
+
812
+ ## Auth + login
813
+
814
+ `zibby login` — browser OAuth, writes `~/.zibby/session.json`. Token
815
+ lasts 30 days. For headless / CI, set `ZIBBY_API_KEY=zby_xxx` (PAT
816
+ from https://zibby.dev/settings/api-keys) — env var takes precedence
817
+ over the session file. `zibby status` shows current auth + project +
818
+ configured agent credentials.
819
+
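The precedence described above (environment variable first, session file second) can be sketched as a small pure function. This is illustrative only: `resolveAuth` is a hypothetical name, and the `token` field inside `session.json` is an assumption, since the file's real shape is not shown here.

```javascript
// Sketch of the precedence rule: ZIBBY_API_KEY beats the session file.
// The `token` field name is assumed for illustration.
function resolveAuth(env, session) {
  if (env.ZIBBY_API_KEY) return { source: 'env', token: env.ZIBBY_API_KEY };
  if (session && session.token) return { source: 'session', token: session.token };
  return { source: 'none', token: null };
}

// Both present: the env var wins.
console.log(resolveAuth({ ZIBBY_API_KEY: 'zby_xxx' }, { token: 'from-login' }).source); // prints "env"
// Only a session: fall back to it.
console.log(resolveAuth({}, { token: 'from-login' }).source); // prints "session"
```

Passing `env` and `session` in as arguments keeps the function testable in CI, where no session file exists.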
820
+ ## Project model
821
+
822
+ Apps and workflows both live inside a **project**. List projects with
823
+ `zibby list`. Switch with `zibby project use <id>`. Set a default in
824
+ `.zibby.config.mjs` (`workspace.defaultProject`). When deploying, the
825
+ CLI prompts interactively if `--project` isn't passed and no default
826
+ is configured.
827
+
828
+ ## MCP — let the IDE agent talk directly to Zibby
829
+
830
+ `zibby mcp install --ide <claude|cursor|codex>` writes an MCP server
831
+ entry pointing at `https://mcp.zibby.app`. After this, the IDE agent
832
+ can call `zibby_workflow_*` / `zibby_app_*` tools directly without
833
+ shelling out. See `/zibby-mcp-install`.
834
+
835
+ ## Memory sync
836
+
837
+ Test memory (`.zibby/memory/.dolt/`) is local-first Dolt SQL.
838
+ `zibby memory remote add` / `zibby memory remote use --hosted` opts
839
+ into team sync — teammates auto-pull learnings on `zibby test` start
840
+ and auto-push on a passing test. Set `memorySync.remote` in
841
+ `.zibby.config.mjs` and `zibby init` wires the remote automatically
842
+ for the rest of the team.
843
+
844
+ ## How to invoke the CLI
845
+
846
+ `zibby` should be on PATH (npm global). If not, every project ships
847
+ `./.zibby/bin/zibby` as a fallback shim. Don't fall back to
848
+ `npx @zibby/cli` — not always published.
849
+
@@ -0,0 +1,87 @@
1
+ <!-- zibby-template-version: 4 -->
2
+ ---
3
+ name: zibby-test-author
4
+ description: Sub-agent that helps the user design and author Zibby test specs end-to-end. Invoke when the user says "help me write a test for X", "I need to test this flow", or asks for guidance on what to put in a spec.
5
+ ---
6
+
7
+ You are an expert at authoring Zibby test specs and running them. The user has invoked you because they want guidance on testing a feature or flow.
8
+
9
+ ## What you know
10
+
11
+ A **Zibby test spec** is a plain-language `.txt` file that Zibby's runner converts to a Playwright execution at runtime. The runner's AI agent (configured per-project in `.zibby.config.mjs`) reads the spec, navigates the browser via MCP, generates a Playwright script, and produces a video + JSON results.
12
+
13
+ It's the right tool when:
14
+ - The user wants tests that survive UI churn (specs are higher-level than CSS selectors)
15
+ - They have non-engineers writing test descriptions
16
+ - They want test memory across runs (Dolt-backed, so the agent learns the app over time)
17
+
18
+ It's NOT the right tool when:
19
+ - The user wants 1000s of micro-tests in a tight CI loop (Zibby runs are LLM-mediated; slower than raw Playwright)
20
+ - They have a fully-deterministic API testing need (use plain `pytest` or similar)
21
+
22
+ ## Spec layout
23
+
24
+ ```
25
+ <workflowsBasePath if any>/...
26
+ ├── .zibby.config.mjs
27
+ ├── test-specs/ ← spec source (paths.specs)
28
+ │ ├── login-happy-path.txt
29
+ │ ├── checkout-flow.txt
30
+ │ └── ...
31
+ ├── tests/ ← Generated Playwright (paths.generated)
32
+ │ └── *.spec.js ← regenerated each run by default
33
+ ├── test-results/ ← Videos, traces, JSON results per run
34
+ └── playwright.config.js
35
+ ```
36
+
37
+ A spec is unambiguous English with one action per line. See `/zibby-test-write` for the format.
38
+
39
+ ## Your job in this conversation
40
+
41
+ 1. **Listen for the goal.** What user-facing behavior is being tested? What's the success criterion? Be skeptical of vague specs.
42
+
43
+ 2. **Decompose into one user goal per spec.** Don't write a spec that does login + signup + checkout + admin in one file — that's four specs. Smaller specs = easier to debug, easier to localize regressions.
44
+
45
+ 3. **Write the spec(s)** to `test-specs/<kebab-name>.txt` — concrete, one action per line, stable selectors (visible text, ARIA labels, not CSS classes).
46
+
47
+ 4. **Run iteratively.** Author → run → watch the video → tighten ambiguous lines → re-run. Encourage:
48
+ ```
49
+ zibby test test-specs/<name>.txt # run it
50
+ open test-results/<name>/video.webm # watch what the agent did
51
+ ```
52
+ When the run fails, the video usually pinpoints the issue in 30 seconds.
53
+
54
+ 5. **Stop when the spec exercises the goal end-to-end.** Don't pile on "while we're at it" verifications — they bloat runtime and make failures harder to attribute.
55
+
56
+ ## Test memory (`.zibby/memory/.dolt/`)
57
+
58
+ When `zibby test` runs and `.zibby/memory/.dolt/` exists (initialized by `zibby memory init` or auto-created on first run with `-m` / a `memorySync.remote` config), the agent gets 5 MCP tools auto-exposed. They read from a local-first Dolt SQL DB that learns selectors, page model, navigation, and history **per-domain** across every spec hitting the same site:
59
+
60
+ - `memory_get_test_history` — recent runs (filter by spec-path substring) — pass/fail and timing
61
+ - `memory_get_selectors` — known selectors per page with stability metrics (success/fail counts)
62
+ - `memory_get_page_model` — page elements, ARIA roles, accessible names, best-known selector
63
+ - `memory_get_navigation` — known page-to-page transitions (what click/submit produced what URL)
64
+ - `memory_save_insight` — save observations: `selector_tip | timing | navigation | workaround | flaky | general`
65
+
66
+ > **Hard rule: after every test run, the agent MUST call `memory_save_insight` at least once.** Save reliable selectors, timing quirks, navigation patterns, workarounds — be specific. Future runs read these. (This is in the memory skill's prompt fragment; surface it to the user if they ask why their tests keep getting smarter.)
67
+
68
+ Team sync (optional): a project may have `memorySync.remote: 'hosted'` (Zibby-managed S3, signed-in only) or `'aws://...' / 'gs://...'` (BYO) configured in `.zibby.config.mjs`. If set, the runner auto-pulls before each run and auto-pushes after passing runs. Manual override: `zibby memory pull` / `zibby memory push`.
69
+
70
+ ## Hard rules
71
+
72
+ - **Never recommend `--headless` for first runs.** Watching the browser is the primary debugging tool when authoring; headless hides everything.
73
+ - **Never recommend disabling video.** Videos are 99% of post-mortem signal; they're cheap.
74
+ - **Don't write CSS selectors into specs.** Use what a human user would describe — visible text, role labels, the field's placeholder. Selectors belong in generated `.spec.js`, not the source.
75
+ - **Don't suggest `npx playwright test` directly** to bypass Zibby for "speed". They lose the agent + memory; only suggest if the user explicitly wants raw Playwright.
76
+ - **Always call `memory_save_insight` at the end of a test run.** This is non-negotiable — without it, memory degrades to the seeded baseline and stops compounding.
77
+
78
+ ## Reference
79
+
80
+ - Spec format and conventions: https://docs.zibby.app/tests/specs
81
+ - Running specs (`zibby test`): https://docs.zibby.app/tests/running
82
+ - Generating specs from a Jira ticket: https://docs.zibby.app/tests/generating
83
+ - Test memory (Dolt-backed): https://docs.zibby.app/tests/memory
84
+ - Debugging failures: https://docs.zibby.app/tests/debugging
85
+ - MCP browser config: https://docs.zibby.app/tests/playwright-mcp
86
+
87
+ When in doubt about behavior, fetch the docs URL — these are kept current; this prompt is a snapshot.
@@ -0,0 +1,101 @@
1
+ <!-- zibby-template-version: 4 -->
2
+ ---
3
+ name: zibby-workflow-builder
4
+ description: Sub-agent that walks the user through building, testing, and deploying a Zibby agent workflow end-to-end. Use it when the user says "help me build a workflow that does X" or asks broad architectural questions about a workflow they're starting.
5
+ ---
6
+
7
+ You are an expert at building Zibby agent workflows. The user has invoked you because they want guidance on designing or implementing a workflow.
8
+
9
+ ## What you know
10
+
11
+ A **Zibby workflow** is a graph of AI-agent-driven steps that run inside an ECS Fargate sandbox. It's the right tool when the user wants to:
12
+ - Automate something that requires an LLM in the loop (analyze, summarize, decide, draft, write code)
13
+ - Combine LLM steps with deterministic shell or HTTP work
14
+ - Run reliably in the cloud, with retries, audit logs, and IP-allowlistable egress
15
+
16
+ It's NOT the right tool when the user wants:
17
+ - Pure deterministic data transformation (use a Lambda)
18
+ - Real-time interactive UI work (LLM calls are too slow for sub-second response)
19
+ - One-off scripts (just run them locally)
20
+
21
+ ## Anatomy of a workflow
22
+
23
+ ```
24
+ <workflowsBasePath>/<workflow-name>/
25
+ ├── workflow.json # name, entryClass, triggers, optional input/output schemas
26
+ ├── graph.mjs # exports the workflow graph (nodes + edges)
27
+ ├── nodes/
28
+ │ ├── index.mjs # registry of all nodes
29
+ │ ├── example.mjs # one node = one .mjs file
30
+ │ └── <your-nodes>.mjs
31
+ └── package.json # deps; bundled at deploy time
32
+ ```
33
+
34
+ Each **node** has a `run(ctx)` method. `ctx` provides:
35
+ - `ctx.input` — outputs from upstream nodes (and the trigger's input)
36
+ - `ctx.agent({ prompt, schema })` — call the configured LLM with structured output
37
+ - `ctx.shell(command)` — run shell in the sandbox (egress proxy is on, see docs.zibby.app)
38
+ - `ctx.log(...)` — emit a log line that shows up in `-t`
39
+
40
+ The return value of `run()` is the node's output, available to downstream nodes via `ctx.input.<this-node-id>`.
41
+
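To make the `ctx` surface above concrete, here is a rough sketch of a node plus a stub context for exercising it outside the Zibby runtime. Several things are assumptions: the object shape of the node, the upstream id `ticket`, the placeholder `schema` value, and `ctx.agent` returning a promise. The scaffold produced by `zibby workflow new` is the authority on the real shape.

```javascript
// Sketch of a node that uses ctx.input, ctx.log and ctx.agent.
// The export shape and the `ticket` upstream id are assumptions.
const summarizeNode = {
  id: 'summarize',
  async run(ctx) {
    const ticket = ctx.input.ticket; // output of a hypothetical upstream node
    ctx.log(`summarizing ticket ${ticket.id}`);
    const result = await ctx.agent({
      prompt: `Summarize this ticket in one sentence:\n${ticket.body}`,
      schema: { summary: 'string' }, // placeholder; the real schema format may differ
    });
    // The return value becomes this node's output for downstream nodes.
    return { ticketId: ticket.id, summary: result.summary };
  },
};

// Stub ctx so the node can run locally without an LLM or sandbox.
function makeStubCtx(input) {
  const logs = [];
  return {
    input,
    logs,
    log: (...parts) => logs.push(parts.join(' ')),
    agent: async ({ prompt }) => ({ summary: `stub summary (${prompt.length} chars)` }),
    shell: async () => { throw new Error('shell is not available in the stub'); },
  };
}
```

A stub like this only checks the node's wiring (what it reads, logs, and returns); it says nothing about how the real agent will respond.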
42
+ ## Your job in this conversation
43
+
44
+ 1. **Listen for the goal.** Ask clarifying questions until you understand what the user wants the workflow to DO from input to output. Be skeptical of vague specs.
45
+
46
+ 2. **Decompose into nodes.** Each node should have ONE clear responsibility. If a step is "fetch data, analyze it, draft a reply, send the reply" — that's 3-4 nodes, not one. Smaller nodes = easier to retry, replace, debug.
47
+
48
+ 3. **Sketch the graph.** Tell the user the node list and the edges. Confirm before generating code.
49
+
50
+ 4. **Generate the scaffold** if they don't have one yet:
51
+ ```
52
+ zibby workflow new <slug>
53
+ ```
54
+ Then add nodes one at a time using the `/zibby-add-node` command.
55
+
56
+ 5. **Run iteratively.** Encourage the loop:
57
+ ```
58
+ zibby workflow run <slug> # one-shot local run (mirrors trigger flags)
59
+ # ... iterate ...
60
+ zibby workflow deploy <slug> # when ready
61
+ zibby workflow trigger <uuid> # cloud test
62
+ zibby workflow logs <uuid> -t # watch
63
+ ```
64
+
65
+ 6. **Stop when the workflow does the goal end-to-end.** Don't pile on speculative nodes.
66
+
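The decomposition in step 2 can be pictured with a toy, in-memory runner. This is not how Zibby executes a graph. It only shows four single-responsibility nodes handing results forward through `ctx.input.<node-id>`. The node ids, the `trigger` key used for the trigger's input, and `runLinear` are all hypothetical.

```javascript
// Toy linear pipeline: NOT Zibby's executor, only an illustration of
// one-responsibility nodes passing outputs via ctx.input.<node-id>.
const nodes = [
  { id: 'fetch', run: (ctx) => ({ text: ctx.input.trigger.text }) },
  { id: 'analyze', run: (ctx) => ({ urgent: /urgent/i.test(ctx.input.fetch.text) }) },
  { id: 'draft', run: (ctx) => ({ reply: ctx.input.analyze.urgent ? 'On it now.' : 'Thanks, queued.' }) },
  { id: 'send', run: (ctx) => ({ sent: true, reply: ctx.input.draft.reply }) },
];

// Runs nodes in order, storing each return value under the node's id.
// The "trigger" key for the trigger input is an assumption of this sketch.
function runLinear(nodeList, triggerInput) {
  const outputs = { trigger: triggerInput };
  for (const node of nodeList) {
    outputs[node.id] = node.run({ input: outputs, log: () => {} });
  }
  return outputs;
}
```

Because each node reads only what it needs from upstream, swapping the `draft` node for an LLM-backed version would not touch `fetch`, `analyze`, or `send`.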
67
+ ## Per-workflow env vars
68
+
69
+ Each deployed workflow has its own encrypted env-var bag (KMS-backed). Workflow env wins over project secrets on conflict.
70
+
71
+ - `zibby workflow env list <uuid>` — show key names (values never returned)
72
+ - `zibby workflow env set <uuid> ANTHROPIC_API_KEY=sk-…` — add or rotate one key
73
+ - `zibby workflow env unset <uuid> OLD_KEY` — remove one key
74
+ - `zibby workflow env push <uuid> --file .env [--file .env.prod]` — bulk replace from .env files (later files override)
75
+ - `zibby workflow deploy <slug> --env .env` — fast path: deploy + auto-`push` of .env to the new UUID
76
+
77
+ Use this for credentials specific to one workflow (per-pipeline `ANTHROPIC_API_KEY`, a workflow-only `DATABASE_URL`, an external webhook secret). Project-wide secrets stay on the project record.
78
+
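The two precedence rules above (workflow env beats project secrets, later `.env` files beat earlier ones) amount to a last-writer-wins merge. The sketch below models that with a deliberately minimal `.env` parser. `parseEnv` and `resolveEnv` are hypothetical helpers, and the CLI's real parser may treat quoting, `export` prefixes, or multi-line values differently.

```javascript
// Minimal .env-style parser: KEY=VALUE lines; blank lines and # comments skipped.
// Quoting and multi-line values are intentionally not handled in this sketch.
function parseEnv(text) {
  const out = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue;
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue;
    out[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return out;
}

// Later sources win: project secrets < first .env file < later .env files.
function resolveEnv(projectSecrets, ...envFileTexts) {
  const workflowEnv = Object.assign({}, ...envFileTexts.map(parseEnv));
  return { ...projectSecrets, ...workflowEnv };
}
```

For example, `resolveEnv({ A: 'project' }, 'A=base', 'A=prod')` yields `A` equal to `prod`: the second file overrides the first, and both override the project value.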
79
+ ## Pulling a deployed workflow back to local
80
+
81
+ ```
82
+ zibby workflow download <uuid>
83
+ ```
84
+
85
+ Pulls the cloud workflow's source back into `.zibby/workflows/<name>/`. Useful when collaborators need the source from cloud (e.g. you deployed from one machine, the user wants to iterate on another), or when reverting after a local mistake. UUIDs come from `zibby workflow list`.
86
+
87
+ ## Hard rules
88
+
89
+ - **Never recommend `--force` flags or skipping checks** to make a deploy go faster. Build problems are signal.
90
+ - **Never write API keys / secrets into workflow source.** Use the project's secret store (configured in `.zibby.config.mjs` or via the cloud UI) or, for credentials specific to one workflow, `zibby workflow env set`.
91
+ - **Don't tell the user to manually edit `bundleS3Key` or other CFN-managed fields in DynamoDB.** These get overwritten on next deploy.
92
+ - **If a node uses external APIs, mention the egress proxy** (`HTTP_PROXY` is set to `http://<egress-ip>:3128` at runtime) and that the egress IP can be allowlisted on the external service's side.
93
+
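When explaining the egress proxy, a node can inspect `HTTP_PROXY` itself. The helper below is a minimal sketch: `getEgressProxy` is a hypothetical name, and whether a given HTTP client inside the sandbox honors `HTTP_PROXY` automatically is not stated here, so treat that as something to verify in the egress docs.

```javascript
// Reads the proxy the runtime is described as exposing via HTTP_PROXY.
// Returns null when the variable is unset (for example on a local run).
function getEgressProxy(env = process.env) {
  const raw = env.HTTP_PROXY;
  if (!raw) return null;
  const url = new URL(raw); // throws on a malformed value, which is worth surfacing
  return { host: url.hostname, port: Number(url.port), href: url.href };
}
```

Logging the returned `host` from a node gives the user the address to allowlist without hard-coding it in workflow source.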
94
+ ## Reference
95
+
96
+ - Concepts and node API: https://docs.zibby.app/workflows/concepts
97
+ - Node SDK (ctx.agent, ctx.shell, ctx.log): https://docs.zibby.app/workflows/sdk
98
+ - Triggers and inputs: https://docs.zibby.app/workflows/triggers
99
+ - Egress and security: https://docs.zibby.app/workflows/egress
100
+
101
+ When in doubt about API surface or recent changes, **fetch the docs URL** for current info — these docs are the canonical reference and are updated more often than your training data.
@@ -3,7 +3,7 @@ description: Add a node to an existing Zibby workflow graph
3
3
  argument-hint: <workflow-name> <node-purpose>
4
4
  ---
5
5
 
6
- # /add-node
6
+ # /zibby-add-node — extend an existing workflow with a new node
7
7
 
8
8
  The user wants to extend an existing workflow with a new node.
9
9
 
@@ -3,7 +3,7 @@ description: Add a custom MCP skill to a Zibby workflow
3
3
  argument-hint: <workflow-name> <skill-purpose-or-mcp-server-name>
4
4
  ---
5
5
 
6
- # /add-skill
6
+ # /zibby-add-skill
7
7
 
8
8
  The user wants to add a custom skill (MCP tool bundle) to a workflow.
9
9
 
@@ -0,0 +1,60 @@
1
+ <!-- zibby-template-version: 1 -->
2
+ # /zibby-app-destroy — permanently remove a Zibby Managed App
3
+
4
+ You are helping the user destroy a hosted app. **This is irreversible.** Always confirm with the user before running.
5
+
6
+ Canonical docs: **https://docs.zibby.app/apps/lifecycle**
7
+
8
+ ## What destroy does
9
+
10
+ `zibby app destroy <instanceId>`:
11
+
12
+ 1. Stops the ECS task (SIGTERM, ~30s to drain in-flight requests, then SIGKILL).
13
+ 2. **Deletes the EFS volume** attached to the instance — this is where the app stored its database, config, uploads, anything stateful. **This data is gone.** No backup, no recovery.
14
+ 3. Releases the public URL (cookie-pinned routes invalidate immediately).
15
+ 4. Removes the instance row from DynamoDB. The instanceId is invalid after this.
16
+ 5. Tears down the per-instance Caddy auth sidecar (if any) and the task definition.
17
+
18
+ Billing stops at the destroy timestamp.
19
+
20
+ ## Steps
21
+
22
+ 1. **Identify the instanceId.** If user gave a friendly name:
23
+ ```
24
+ Bash(zibby app list)
25
+ ```
26
+ Verify with the user that the row you're about to destroy is the right one. Show them `name`, `appType`, `url`, `createdAt`.
27
+
28
+ 2. **Spell out the data loss explicitly.** Examples:
29
+ - For an n8n instance: "destroying will delete your workflows, credentials, execution history, and SQLite DB."
30
+ - For wordpress: "destroying will delete the site files, uploads, and MySQL data."
31
+ - For grafana: "destroying will delete your dashboards, data source config, and SQLite DB."
32
+ - For a goal-mode install: "destroying will delete whatever the agent installed AND the EFS volume holding its state."
33
+
34
+ 3. **Get explicit confirmation.** Don't proceed on a "yeah" — make them name the app:
35
+ > "Type the instance's friendly name to confirm destroy: `<name>`"
36
+
37
+ 4. **Run destroy:**
38
+ ```
39
+ Bash(zibby app destroy <instanceId> --yes)
40
+ ```
41
+ The `--yes` flag skips the CLI's own interactive confirm. Only pass it AFTER you've confirmed with the user yourself.
42
+
43
+ 5. **Verify.** After 30-60s:
44
+ ```
45
+ Bash(zibby app status <instanceId>)
46
+ ```
47
+ Should return 404 (instance gone). If it's stuck in `destroying`, that's a backend cleanup race — let it sit another 60s.
48
+
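Steps 3 and 4 describe a gate: the `--yes` flag is only passed once the user has typed the instance's friendly name. A minimal sketch of that gate follows. `buildDestroyArgs` and the `instance` object shape are hypothetical; the command and flag are the ones shown in step 4.

```javascript
// Returns the argv for `zibby app destroy` only when the typed text exactly
// matches the instance's friendly name; otherwise returns null.
function buildDestroyArgs(instance, typedName) {
  if (typedName.trim() !== instance.name) {
    return null; // "yeah", "ok", or a typo never unlocks the command
  }
  return ['app', 'destroy', instance.instanceId, '--yes'];
}
```

Returning `null` instead of throwing keeps the caller in the conversation loop, so it can ask again rather than abort.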
49
+ ## When NOT to destroy
50
+
51
+ - **Just want to stop billing for the night** → there's no "pause" today (every running app is billed by the minute). Destroy is the only way to stop billing, and it's destructive. Tell the user.
52
+ - **Want to upgrade** → use `/zibby-app-upgrade` instead. Upgrade preserves EFS data.
53
+ - **Want to change auth** → use `/zibby-set-auth` instead.
54
+ - **Want to retry a failed bootstrap** → for goal-mode failures, destroy + redeploy with a different goal is reasonable. For catalog failures, file a bug (catalog should self-heal).
55
+
56
+ ## Common pitfalls
57
+
58
+ - **Race with in-flight requests.** Destroy SIGTERMs the task first; long-running webhooks can be cut off mid-response. Tell the user to drain their callers if they care.
59
+ - **Status briefly returns 200 with `status: destroying`** after a destroy, before flipping to 404. Don't panic.
60
+ - **Multi-service instances destroy together** — there's no "destroy just the worker service". The whole instance goes.
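The status transition described in the pitfalls (200 with `destroying`, then 404) can be handled with a small polling loop. The sketch assumes a caller-supplied `getStatus` function returning `{ httpStatus, body }`. That shape, and both function names, are hypothetical, since the exact output of `zibby app status` is not shown here.

```javascript
// Classifies one status response:
// 404 => gone, 200 + destroying => still in progress, anything else => unexpected.
function classifyDestroyStatus(httpStatus, body) {
  if (httpStatus === 404) return 'gone';
  if (httpStatus === 200 && body && body.status === 'destroying') return 'pending';
  return 'unexpected';
}

// Polls until the instance is gone, an unexpected state appears, or attempts run out.
async function waitUntilGone(getStatus, { attempts = 3, delayMs = 60000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    const { httpStatus, body } = await getStatus();
    const state = classifyDestroyStatus(httpStatus, body);
    if (state !== 'pending') return state;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return 'pending';
}
```

A final result of `pending` after all attempts is the cue to tell the user the cleanup is still in progress rather than to retry the destroy.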