npm - @zibby/skills - Versions diffs - 0.1.28 → 0.1.30 - Mend

@zibby/skills 0.1.28 → 0.1.30


Files changed (16)

package/docs/apps/goal-mode.md ADDED

@@ -0,0 +1,175 @@
+---
+sidebar_position: 6
+title: Goal-mode deploys
+---
+# Goal-mode deploys
+**Describe what you want. Get a deployment.**
+Goal-mode is `zibby app deploy --goal "..."` — a free-form natural-language install path for any app not in the catalog. Claude writes the bash, agent-ops runs and supervises it inside the container, and you get back a stable HTTPS URL pointing at a running app, encrypted EFS volume and all.
+```bash
+zibby app deploy --goal "Install n8n on port 5678 with sqlite persistence" \
+  --project <project-id> \
+  --name automations
+```
+That's it. ~5 minutes wall-clock on a healthy run, $0.05-$0.30 in Claude tokens, and you have n8n.
+## When goal-mode works well
+- The install fits in **30 min wall-clock and 8 GB RAM**. (Most things: anything pip / npm / cargo / apt / single Docker run. Not things that need to compile LLVM from source.)
+- The app exposes a **single HTTP port** for verification. (Multi-port apps work — agent-ops just verifies the main one.)
+- You're OK with the app starting on a fresh, empty EFS volume. (There is no "restore from my existing database" step; load any existing data yourself after the deploy.)
+- The upstream has a **documented install path**. Random unmaintained GitHub repos with no README work less well than mainstream projects.
+If your install needs more than that — long compile steps, custom kernel modules, a 50 GB pre-trained model download — bring your own host. Goal-mode isn't trying to replace EC2.
+## How it works
+```
+zibby app deploy --goal "Install n8n on port 5678 ..."
+       │
+       ▼
+backend POST /apps:
+  - extracts verify port from "on port NNNN" (sniffs goal text)
+  - splices AGENT_OPS_BOOTSTRAP_MODE=agent_script
+  - splices BOOTSTRAP_PROMPT=<your goal text>
+  - splices customer's BYOK Claude token (env or --anthropic-token)
+  - splices model / max-turns / timeout / token-budget flags
+  - splices AGENT_OPS_BOOTSTRAP_SYSTEM_RULES (curated house rules)
+  - defaults to 4 vCPU / 8 GB Fargate (heavier than the catalog tiers,
+    because installs are CPU-spiky)
+       │
+       ▼
+container starts. agent-ops runs the agent_script loop:
+  ┌── Phase 1: PLAN ─────────────────────────────────────────────┐
+  │ Claude with Write+Read tools only — no Bash, no Edit.        │
+  │ Reads your goal + house rules. Writes one complete bash      │
+  │ script to /tmp/install.sh. ~2 turns, ~$0.05.                 │
+  └──────────────────────────────────────────────────────────────┘
+                              │
+                              ▼
+  ┌── Phase 2: SUPERVISE LOOP ───────────────────────────────────┐
+  │ agent-ops execs /bin/bash /tmp/install.sh in a process group │
+  │ (so we can kill the whole tree on intervene).                │
+  │                                                               │
+  │ Every 30s:                                                   │
+  │   - snapshot stdout/stderr tail + proc status + idle time    │
+  │   - send to Claude (text-only, no tools)                     │
+  │   - Claude returns one JSON line:                            │
+  │       {"verdict":"continue","note":"..."}                    │
+  │       {"verdict":"done","note":"app responding on :5678"}    │
+  │       {"verdict":"intervene","reason":"...","note":"..."}    │
+  │                                                               │
+  │ continue: log progress, keep polling                         │
+  │ done:     write success, leave the nohup'd app running       │
+  │ intervene: SIGTERM the pgroup, 5s grace, SIGKILL, replan     │
+  │                                                               │
+  │ Auto-short-circuit: if proc EXITED with code 0 AND verify    │
+  │ port returns HTTP 200-499, agent-ops declares done without   │
+  │ asking the supervisor. Stops false-positive intervenes when  │
+  │ the app went into the background and Claude can't see its    │
+  │ "startup logs" in the snapshot anymore.                      │
+  └──────────────────────────────────────────────────────────────┘
+                              │
+              intervene? ───→ back to Phase 1, with stderr +
+                              stdout + exit code as new context.
+                              Claude REWRITES the script (not patches).
+                              Phase 2 starts fresh.
+                              │
+                              ▼
+  Hard caps (whichever hits first):
+    - 5 iterations
+    - 30 min wall-clock (configurable via --timeout-min)
+    - $1.00 token budget
+```
+Every iteration's script + supervisor turns + final status are persisted under `/var/lib/agent-ops/agent_script-state/` on the per-instance EFS volume. `zibby app logs <id>` surfaces the supervisor verdicts in real time; the persisted state is there for post-mortems.
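The two mechanical rules in the diagram, the auto-short-circuit and the hard caps, reduce to a few integer comparisons. The sketch below is illustrative only: the function names, the argument order, and the choice to pass money in cents are assumptions, and the real agent-ops implementation is not part of this diff.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names and argument order are assumptions.

# Auto-short-circuit: the install script exited 0 AND the verify port
# answered with an HTTP status from 200 through 499.
# One way to get the status: curl -s -o /dev/null -w '%{http_code}' <url>
should_short_circuit() {
  local exit_code="$1" http_status="$2"
  (( exit_code == 0 && http_status >= 200 && http_status <= 499 ))
}

# Hard caps: 5 iterations, a wall-clock limit (default 30 min), $1.00 budget.
# Spend is passed in cents so the arithmetic stays in integers.
hit_hard_cap() {
  local iterations="$1" elapsed_min="$2" spent_cents="$3" timeout_min="${4:-30}"
  (( iterations >= 5 || elapsed_min >= timeout_min || spent_cents >= 100 ))
}

if should_short_circuit 0 200; then echo "done"; fi
if hit_hard_cap 2 12 35; then echo "stop"; else echo "keep going"; fi
```

The optional fourth argument to `hit_hard_cap` stands in for `--timeout-min`; whether the iteration cap counts completed or started iterations is a guess here.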
+## CLI flags
+The flags that matter for goal-mode:
+| Flag | What |
+|---|---|
+| `--goal "<text>"` | Free-form install description. Mutually exclusive with `[appType]`. |
+| `--model <name>` | Claude model — `claude-sonnet-4-6` (default), `claude-opus-4-8` (heavier installs), `claude-haiku-4-5-20251001` (cheaper). |
+| `--max-turns <n>` | Claude subprocess max turns, 1-200 (default 25). Bump for heavy installs (n8n, OpenHands) that need many supervisor checks. |
+| `--timeout-min <n>` | Bootstrap wall-clock minutes, 1-120 (default 30). |
+| `--anthropic-token <token>` | Per-deploy Claude credential override. Must start with `sk-ant-oat01-` (OAuth) or `sk-ant-api03-` (API key). Also accepts the `ZIBBY_ANTHROPIC_TOKEN` env var. Falls back to workspace credentials if absent. |
+| `--name <name>` | Display name for `zibby app list` / dashboard. |
+| `--auth-type / --auth-user / --auth-password / --auth-token` | Optional Caddy auth sidecar in front of the installed app. See [Auth proxy](./auth). |
+Goal-mode tasks default to **4 vCPU / 8 GB** Fargate — heavier than catalog tiers — because `npm install -g n8n` and friends are CPU/memory spiky. You can still pass `--cpu` / `--memory` to override.
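The CLI reference describes `--cpu` as Fargate CPU units (`1024` per vCPU) and `--memory` as MB (`2048` for 2 GB). A small conversion helper, sketched here under those two stated ratios, can make overrides less error-prone; `fargate_flags` is a hypothetical name, not part of the `zibby` CLI.

```bash
#!/usr/bin/env bash
# Hypothetical helper: turn "vCPU, GB" into the values --cpu / --memory expect.
# Assumes 1 vCPU = 1024 CPU units and 1 GB = 1024 MB.
fargate_flags() {
  local vcpu="$1" mem_gb="$2"
  # awk is used so fractional vCPU values such as 0.5 work.
  awk -v c="$vcpu" -v m="$mem_gb" \
    'BEGIN { printf "--cpu %d --memory %d\n", c * 1024, m * 1024 }'
}

fargate_flags 4 8     # goal-mode default size
fargate_flags 0.5 1   # Light tier size
```

Fargate only accepts certain CPU/memory pairings, so an arbitrary combination may be rejected even when the arithmetic is right.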
+## Cost expectations
+Per goal-mode deploy:
+- **Compute** — 5-15 min of 4 vCPU / 8 GB Fargate at standard Fargate pricing. Single-digit cents per deploy.
+- **Claude tokens** — typically $0.05-$0.30 on Sonnet (the default). Opus-4-8 can hit $1.00 if the install takes 4-5 intervene iterations. Hard-capped at $1.00 by agent-ops — beyond that the deploy fails.
+- **Ongoing** — once the app is running, the deploy is exactly like any catalog app: per-minute Fargate billing at the resource tier you ended up with. No Claude tokens spent after the install converges.
+In practice, budget $0.20-$0.50 per goal-mode deploy attempt, including failed ones. Re-runs after a fix are cheaper because the planner gets shorter context.
+## When it converges, when it loops
+It converges fast when:
+- The install is a single package manager call + a config file + a port to listen on.
+- The app's own README has the exact install steps in a copy-pasteable block.
+- You include the port in your goal (`"on port 5678"`) — saves the planner a guess.
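The backend is described as sniffing the verify port from an "on port NNNN" phrase in the goal text. A rough sketch of that kind of extraction follows; the function name and the exact regex are assumptions, since the backend's real matching rules are not shown in this diff.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pull the port out of an "on port NNNN" phrase.
extract_verify_port() {
  local goal="$1"
  # Up to five digits, followed by a non-digit or the end of the string.
  local re='[Oo]n port ([0-9]{1,5})([^0-9]|$)'
  if [[ "$goal" =~ $re ]]; then
    local port="${BASH_REMATCH[1]}"
    # 10# forces base 10 so a leading zero is not read as octal.
    if (( 10#$port >= 1 && 10#$port <= 65535 )); then
      printf '%s\n' "$port"
      return 0
    fi
  fi
  return 1
}

extract_verify_port "Install n8n on port 5678 with sqlite persistence"   # prints 5678
```

A goal without the phrase makes the helper return non-zero, which corresponds to the case where the planner has to guess the port.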
+It loops or fails when:
+- The install hits interactive prompts because the planner forgot to pass `-y` to `apt-get install` or to set `DEBIAN_FRONTEND=noninteractive`.
+- The app needs a sister service (Postgres + Redis + web) and you didn't tell it. Goal-mode does one task; for multi-service, prefer a [catalog multi-service entry](./index#multi-service-entries) or split into multiple deploys.
+- The download is huge (multi-GB models) and exceeds the 30-minute wall-clock cap. Pass `--timeout-min 60` if you know that's coming.
+If it fails, the supervisor verdicts in the logs tell you exactly what went wrong on each iteration — paste the goal + the failure output back into the next `--goal "..."` with a hint and it usually converges.
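Each supervisor verdict is a single JSON line with a `verdict` of `continue`, `done`, or `intervene`. When reading a captured line by hand or in a script, something like the following can classify it. This is a sketch: the helper names are hypothetical, the sample line is made up, and a regex is only adequate because the documented shape is flat and one line long (a JSON parser such as `jq` is the safer choice otherwise).

```bash
#!/usr/bin/env bash
# Hypothetical helper: read the verdict out of one supervisor JSON line.
verdict_of() {
  local line="$1"
  local re='"verdict"[[:space:]]*:[[:space:]]*"(continue|done|intervene)"'
  if [[ "$line" =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

# Map each verdict to the action the supervise loop is described as taking.
next_action() {
  case "$(verdict_of "$1")" in
    "continue")  echo "keep polling" ;;
    "done")      echo "record success, leave the app running" ;;
    "intervene") echo "kill the process group and replan" ;;
    *)           echo "unrecognized verdict" ;;
  esac
}

# Sample line invented for illustration.
next_action '{"verdict":"intervene","reason":"apt lock held","note":"stalled"}'
```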
+## License responsibility
+Goal-mode is intentionally a different licensing posture from the catalog:
+- **Catalog apps** — Zibby pre-cleared the license. We're confident we can ship that bundle as a paid host.
+- **Goal-mode** — **you** are directing the install. You named the upstream project, you accepted whatever license terms apply, you decided to run it on infrastructure you're paying for. Same model as deploying it on your own EC2 instance — Zibby is the compute provider, not the redistributor.
+This is why n8n (Sustainable Use License — forbids paid commercial hosting by a third party) isn't in the catalog but **can** be installed via goal-mode: when you direct the install, you're the operator. The SUL is between you and n8n GmbH, not between Zibby and them.
+If you're unsure whether your install is fine for goal-mode, read the upstream license. If it requires you-as-the-operator to accept terms before running it, you're the one accepting — make sure that's a thing you're allowed to do for your use case.
+## Worked example: n8n
+```bash
+zibby app deploy --goal "Install n8n on port 5678 with sqlite persistence" \
+  --project <project-id> \
+  --name automations \
+  --max-turns 40
+```
+Streaming output (abbreviated):
+```
+↑ Goal-mode deploy: "Install n8n on port 5678 with sqlite persistence"
+  Fargate task: 4 vCPU / 8 GB
+  model: claude-sonnet-4-6, max-turns: 40, timeout: 30 min
+  phase 1: planning install script…
+    plan turn 1/40: reading house rules
+    plan turn 2/40: wrote /tmp/install.sh (47 lines)
+  phase 2: executing /tmp/install.sh under supervision…
+    [30s ] supervisor: continue — apt-get update in progress
+    [60s ] supervisor: continue — installing nodejs 20 from nodesource
+    [120s] supervisor: continue — npm install -g n8n (compiling sqlite3)
+    [240s] supervisor: continue — n8n starting, binding to :5678
+    [270s] auto-short-circuit: process exit 0, port 5678 returns 200
+✔ Deployed (instanceId: f1e2d3c4)
+→ Public URL: https://f1e2d3c4.apps.zibby.dev
+```
+Total: ~4.5 min, 1 iteration, $0.07 in Claude tokens. Open the URL, set up your n8n admin account, you're done.
+→ Next: [Auth proxy](./auth) (put basic auth in front of the install you just did) or [Agent operator](./agent-ops) (how the supervise loop works in detail)

package/docs/apps/index.md CHANGED

@@ -5,15 +5,31 @@ title: Apps overview
 # Managed Apps
-One-click hosted instances of open-source tools (n8n, Grafana, Open WebUI, draw.io, Gas Town, …), each private to your project — with an **autonomous agent-ops sidecar** that handles health checks, self-healing, and upgrades on its own.
+Long-lived, per-tenant containers running open-source tools — each behind a stable HTTPS URL, on encrypted EFS, with an **autonomous agent-ops sidecar** that handles health checks, self-healing, and upgrades on its own.
 ```bash
 zibby app templates              # browse the catalog
-zibby app deploy n8n              # one-click — ECS service + EFS volume + ALB target group
+zibby app deploy grafana          # one-click — ECS service + EFS volume + ALB target group
 zibby app logs <id> -t            # tail logs, SSE auto-reconnect
 zibby app status <id>             # uptime, cost, version, agent-ops activity
 ```
+## Two paths to a deployment
+There are two ways to land a container on the apps fleet, and you pick by **whether the thing you want is in our catalog**:
+| | **Catalog** | **Goal-mode** |
+|---|---|---|
+| Trigger | `zibby app deploy <slug>` | `zibby app deploy --goal "..."` |
+| Source | Curated bundle (image + EFS layout + defaults) | Free-form natural-language install |
+| Time to go live | ~45-90 s | 2-15 min (Claude writes the install script; agent-ops runs it) |
+| Licensing | Pre-cleared by Zibby | You direct the install; you accept the upstream license |
+| Best for | Anything in the 20-app catalog | n8n, random GitHub project, anything not in the catalog |
+Both paths land in the same shape — Fargate task, per-instance EFS volume, ALB target group, agent-ops sidecar — and look identical to every downstream `zibby app logs/status/upgrade` command. The only difference is **who wrote the install recipe**.
+See [Goal-mode deploys](./goal-mode) for the long form.
 ## Why apps (not workflows)
 Both are pillars of Zibby Cloud. Pick by **how long the thing needs to run**:
@@ -24,7 +40,7 @@ Both are pillars of Zibby Cloud. Pick by **how long the thing needs to run**:
 | Surface | A graph of agent CLI calls | A whole open-source application |
 | Billing | Per execution | Per minute, while running |
 | Persistence | Session JSONL + S3 artifacts | Encrypted-at-rest EFS volume |
-| Best for | "When ticket lands, classify it" | "Host n8n for the team" |
+| Best for | "When ticket lands, classify it" | "Host Grafana for the team" |
 If you find yourself wanting to **run an open-source web app behind a stable URL**, that's an App. If you want **agent-driven business logic that fires on events**, that's a Workflow.
@@ -36,22 +52,78 @@ If you find yourself wanting to **run an open-source web app behind a stable URL
 - **Per-minute Fargate billing** — including the agent-ops sidecar, pause-to-stop billing
 - **agent-ops sidecar** (see [Agent operator](./agent-ops)) — hourly health checks, self-healing, upgrades
 - **SSE log streaming** — `zibby app logs -t` tails any container from anywhere
+- **Optional auth proxy** — `--auth-type basic|token` puts a Caddy sidecar in front of the app (see [Auth proxy](./auth))
 - **Dedicated egress IP addon** — pin outbound HTTPS through one whitelistable IP for self-hosted GitLab / Salesforce / Oracle Cloud
 ## The catalog
-Each marketplace entry is a curated bundle: container image, EFS volume layout, ALB wiring, secrets pattern, resource defaults. Today's catalog:
+Each catalog entry is a curated bundle: container image, EFS volume layout, ALB wiring, secrets pattern, resource defaults. Today's catalog is **20 apps**, grouped by what they're for:
+### AI
+| App | Tier | Rate | What it does |
+|---|---|---|---|
+| **Open WebUI** | Heavy | $0.25/hr | ChatGPT-style UI for Ollama / OpenAI-compatible endpoints |
+| **OpenHands** | Heavy | $0.25/hr | AI software-engineer agent (V1) |
+| **Gas Town** | Light | $0.05/hr | Multi-agent workspace — coordinate Claude, Codex, Cursor, Gemini |
+### Data + APIs
+| App | Tier | Rate | What it does |
+|---|---|---|---|
+| **PostgREST** | Standard | $0.10/hr | Auto-generated REST API on top of any Postgres schema |
+| **Mathesar** | Heavy | $0.25/hr | Spreadsheet-style front-end for Postgres |
+| **PocketBase** | Light | $0.05/hr | Single-file backend (Auth + DB + file storage + realtime) |
+### Knowledge + docs
+| App | Tier | Rate | What it does |
+|---|---|---|---|
+| **Docmost** | Heavy | $0.25/hr | Wiki + collaboration (multi-service: web + Postgres + Redis) |
+| **SiYuan** | Heavy | $0.25/hr | Notion-like knowledge base, local-first |
+| **draw.io** | Light | $0.05/hr | Diagrams + flowcharts (client-side editor) |
+### Monitoring + observability
+| App | Tier | Rate | What it does |
+|---|---|---|---|
+| **Grafana** | Light | $0.05/hr | Dashboards for metrics, logs, traces |
+| **OpenObserve** | Heavy | $0.25/hr | Unified logs + metrics + traces |
+| **Uptime Kuma** | Light | $0.05/hr | Self-hosted Pingdom-alt |
+| **Beszel** | Light | $0.05/hr | Lightweight single-host server monitor |
+| **ChangeDetection.io** | Standard | $0.10/hr | Web-page change watcher |
+### Identity
-| App | Category | Tier | Rate |
+| App | Tier | Rate | What it does |
 |---|---|---|---|
-| **n8n** | Workflow automation | Light | $0.05/hr |
-| **Grafana** | Metrics + dashboards | Light | $0.05/hr |
-| **Gas Town** | Multi-agent workspace | Light | $0.05/hr |
-| **draw.io** | Diagrams + flowcharts | Light | $0.05/hr |
-| **Open WebUI** | ChatGPT-style UI for Ollama | Heavy | $0.25/hr |
+| **Authentik** | Heavy | $0.25/hr | SSO / IdP |
+| **Zitadel** | Heavy | $0.25/hr | SSO / IdP (alt) |
+### Productivity
+| App | Tier | Rate | What it does |
+|---|---|---|---|
+| **Glance** | Light | $0.05/hr | Personal dashboard |
+| **Homepage** | Standard | $0.10/hr | Self-hosted homepage / app launcher |
+| **Gotify** | Light | $0.05/hr | Self-hosted push-notification + webhook server |
 `zibby app templates` is the canonical, always-up-to-date list — the table above is a snapshot.
+### Multi-service entries
+A catalog entry can declare more than one container in the same task — useful for apps that need a DB + cache alongside the web tier. **Docmost** is the live example: web + `postgres:16-alpine` + `redis:7-alpine` sharing localhost and per-volume EFS access points. You don't have to think about it — `zibby app deploy docmost` reads identically — but `zibby app logs --service db` lets you scope log tails to one container.
+### Not in the catalog? Use goal-mode
+The catalog only includes apps whose licenses permit Zibby (a paid host) to ship them as a one-click bundle. Apps under the Sustainable Use License — most famously **n8n** — aren't in the catalog because the SUL forbids paid commercial hosting. They can still be deployed via goal-mode:
+```bash
+zibby app deploy --goal "Install n8n on port 5678 with sqlite persistence"
+```
+The customer (you) is directing the install — Zibby just provides compute. License terms of whatever you install are between you and the upstream project. See [Goal-mode deploys](./goal-mode).
 ## How tiers work
 The catalog groups apps into three resource tiers:
@@ -59,10 +131,10 @@ The catalog groups apps into three resource tiers:
 | Tier | CPU | RAM | Rate |
 |---|---|---|---|
 | **Light** | 0.5 vCPU | 1 GB | $0.05/hr |
-| **Standard** | 1 vCPU | 2 GB | $0.12/hr |
+| **Standard** | 1 vCPU | 2 GB | $0.10/hr |
 | **Heavy** | 2 vCPU | 4 GB | $0.25/hr |
-Per-instance resource overrides are supported when you need to bump CPU / memory for one specific deployment without forking the catalog entry. See [Managing instances → resource overrides](./managing#resource-overrides).
+Per-instance resource overrides are supported when you need to bump CPU / memory for one specific deployment without forking the catalog entry. See [Managing instances → resource overrides](./managing#resource-overrides). Goal-mode deploys default to 4 vCPU / 8 GB to give the install agent enough headroom.
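For a rough sense of what a tier costs over time, the hourly rates in the table can be multiplied out. The sketch below assumes billing is simply rate times running time at per-minute granularity; `estimate_cost` is a hypothetical helper, and add-ons or other pricing rules are not modeled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: cost in dollars for an hourly rate and a run time in minutes.
estimate_cost() {
  local rate_per_hour="$1" minutes="$2"
  awk -v r="$rate_per_hour" -v m="$minutes" \
    'BEGIN { printf "%.2f\n", r * m / 60 }'
}

estimate_cost 0.05 1440    # Light tier running for one day
estimate_cost 0.25 43200   # Heavy tier running for 30 days
```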
 ## Pricing model

package/docs/apps/managing.md CHANGED

@@ -16,7 +16,7 @@ zibby app list --project <project-id>     # scope to one project
 ```
 ID         Name         App         Tier    Status    Hourly    Uptime
-a1b2c3d4   automations  n8n@1.97.1  Light   running   $0.05/hr  7d 14h
+a1b2c3d4   wiki         docmost     Heavy   running   $0.25/hr  7d 14h
 a8f7e6d5   metrics      grafana     Light   running   $0.05/hr  21d 3h
 b2c3d4e5   webui        open-webui  Heavy   paused    —         —
 ```
@@ -57,7 +57,7 @@ Behind the scenes:
 3. ALB drains old tasks while new ones come up; the listener serves the new tasks once they pass health checks
 4. Old tasks shut down
-A load-bearing n8n stays serving traffic the whole time. `--yes` skips the confirmation prompt for automation.
+A load-bearing Grafana stays serving traffic the whole time. `--yes` skips the confirmation prompt for automation.
 ## Restart
@@ -96,7 +96,7 @@ Changes apply on the next task restart. Use `zibby app restart` to roll immediat
 Default resources come from the catalog entry's tier. To bump CPU / memory for one instance:
 ```bash
-zibby app deploy n8n --project <id> --cpu 1024 --memory 2048   # 1 vCPU / 2 GB
+zibby app deploy grafana --project <id> --cpu 1024 --memory 2048   # 1 vCPU / 2 GB
 ```
 Per-instance overrides survive upgrades; the upgrade flow re-registers the task definition with the same override values unless `--reset-resources` is passed.

package/docs/cli-reference.md CHANGED

@@ -267,17 +267,18 @@ Options on `add`:
 ## App commands {#app-commands}
-`zibby app` manages [Managed App instances](./apps/) — hosted open-source tools (n8n, Grafana, …) with an autonomous agent-ops sidecar. Each verb is keyed by **instance ID** (`a1b2c3d4`-style); `zibby app list` shows IDs alongside display names.
+`zibby app` manages [Managed App instances](./apps/) — hosted open-source tools (Grafana, Open WebUI, Docmost, OpenHands, and 16 more in the catalog, plus anything you install via [goal-mode](./apps/goal-mode)) with an autonomous agent-ops sidecar. Each verb is keyed by **instance ID** (`a1b2c3d4`-style); `zibby app list` shows IDs alongside display names.
 | Command | What it does |
 |---|---|
-| [`zibby app templates`](#app-templates) | Browse the catalog (n8n, grafana, gas-town, drawio, open-webui, …) |
+| [`zibby app templates`](#app-templates) | Browse the catalog (grafana, uptime-kuma, open-webui, openhands, docmost, …) |
 | [`zibby app list`](#app-list) | List deployed instances under your account |
-| [`zibby app deploy <appType>`](#app-deploy) | Deploy an app from the catalog |
+| [`zibby app deploy <appType>`](#app-deploy) | Deploy an app from the catalog, or `--goal "..."` for free-form goal-mode |
 | [`zibby app status <id>`](#app-status) | One-screen summary: status, resources, URL, last agent-ops run |
-| [`zibby app logs <id>`](#app-logs) | Logs from app + agent-ops, with `-t` tail mode |
+| [`zibby app logs <id>`](#app-logs) | Logs from app + agent-ops, with `-t` tail mode; `--service <name>` to scope multi-service |
 | [`zibby app upgrade <id>`](#app-upgrade) | Zero-downtime roll to the catalog's current image |
 | [`zibby app restart <id>`](#app-restart) | Force ECS service to roll the running tasks |
+| [`zibby app set-auth <id>`](#app-set-auth) | Add / rotate / remove the optional Caddy auth proxy |
 | [`zibby app update-credential <id>`](#app-update-credential) | Rotate a BYOK credential and restart |
 | [`zibby app destroy <id>`](#app-destroy) | Tear down service + volume (data permanently deleted) |
@@ -303,17 +304,41 @@ Options:
 ### app deploy {#app-deploy}
 ```bash
-zibby app deploy n8n --project <project-id> --name automations
+zibby app deploy grafana --project <project-id> --name metrics
+zibby app deploy --goal "Install n8n on port 5678 with sqlite persistence" --project <id> --name automations
 ```
+Two modes:
+- **Catalog**: pass an `appType` (slug from `zibby app templates`).
+- **Goal-mode**: pass `--goal "<install description>"` instead of `appType`. Claude writes the install script and agent-ops runs it under supervision inside the container. See [Goal-mode deploys](./apps/goal-mode).
 Options:
 - `--project <id>` — interactive picker if omitted
 - `--name <name>` — display name in the dashboard / `zibby app list` (defaults to `appType`)
+- `--provider <name>` — `claude` (default) or `codex` — picks which BYOK credential to inject
+- `--arch <name>` — `x86_64` or `arm64` (defaults to catalog's first listed arch)
+- `--api-key <key>` — Zibby API key (or `ZIBBY_API_KEY` env)
 - `--cpu <units>` — Fargate CPU units (e.g. `1024` for 1 vCPU; default from tier)
 - `--memory <mb>` — Fargate memory in MB (e.g. `2048` for 2 GB; default from tier)
-- `--api-key <key>` — API key (or `ZIBBY_API_KEY` env)
-Returns an `instanceId` and the public URL.
+Goal-mode + planner options (used by `--goal` deploys and cheatsheet-mode catalog entries):
+- `--goal "<text>"` — free-form install description. Mutually exclusive with `[appType]`.
+- `--model <name>` — Claude model identifier. E.g. `claude-sonnet-4-6` (default), `claude-opus-4-8`, `claude-haiku-4-5-20251001`. Overrides the agent-ops bootstrap default.
+- `--anthropic-token <token>` — per-deploy Claude credential override. Must start with `sk-ant-oat01-` (OAuth) or `sk-ant-api03-` (API key). SENSITIVE. Also accepts the `ZIBBY_ANTHROPIC_TOKEN` env var. Falls back to workspace credentials if absent.
+- `--max-turns <n>` — Claude subprocess max turns, 1-200 (default 25). Bump for heavy installs.
+- `--timeout-min <n>` — bootstrap wall-clock minutes, 1-120 (default 30).
+Auth proxy options (opts into a Caddy sidecar on port 8888 — see [Auth proxy](./apps/auth)):
+- `--auth-type <kind>` — `basic`, `token`, or `none` (default `none`).
+- `--auth-user <name>` — required for `--auth-type basic`. Printable ASCII, no spaces, 1-64 chars.
+- `--auth-password <pass>` — required for `--auth-type basic`. SENSITIVE. 8-256 chars. Also accepts `ZIBBY_APP_AUTH_PASSWORD` env.
+- `--auth-token <token>` — optional for `--auth-type token`. If omitted, backend auto-generates a 32-char URL-safe token and returns it ONCE on deploy. Also accepts `ZIBBY_APP_AUTH_TOKEN` env.
+Returns an `instanceId` and the public URL. If `--auth-type token` was used without `--auth-token`, the generated token is included in the response and shown ONCE — save it then, you can't retrieve it later.
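Some of the documented constraints can be checked locally before a deploy is attempted. The following is a sketch of such pre-flight checks, covering only the token prefix and the password length; the function names are hypothetical and the CLI's own validation may differ.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight checks mirroring the documented constraints.

# Classify a Claude credential by its documented prefix.
classify_anthropic_token() {
  case "$1" in
    sk-ant-oat01-?*) echo "oauth" ;;
    sk-ant-api03-?*) echo "api-key" ;;
    *)               echo "invalid"; return 1 ;;
  esac
}

# --auth-password is documented as 8-256 characters.
valid_auth_password() {
  local len="${#1}"
  (( len >= 8 && len <= 256 ))
}

# Placeholder values, not real credentials.
classify_anthropic_token "sk-ant-api03-EXAMPLE-NOT-A-REAL-KEY"
valid_auth_password 'S0me-long-passphrase!' && echo "password length ok"
```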
 ### app status {#app-status}
@@ -331,9 +356,17 @@ zibby app logs a1b2c3d4 -t                    # tail mode, polls every 3s, SSE a
 zibby app logs a1b2c3d4 --lines 1000          # bigger window
 zibby app logs a1b2c3d4 --json                # raw JSON lines
 zibby app logs a1b2c3d4 --verbose             # full line including JSON body
+zibby app logs a1b2c3d4 -t --service db       # scope to one container in a multi-service entry
 ```
-Logs cover **both** containers — the app and the agent-ops sidecar — prefixed by source. Default output is the parsed `<time>  <msg>` summary.
+Logs cover **all** containers in the task — the app(s), the agent-ops sidecar, and (if enabled) the Caddy auth proxy — prefixed by source. Default output is the parsed `<time>  <msg>` summary.
+Options:
+- `-t, --follow` — live tail
+- `--lines <n>` — initial window size (default 200)
+- `--json` / `--verbose` — output format toggles
+- `--service <name>` — scope to one container by name (e.g. `db` on docmost). Useful for multi-service catalog entries.
 ### app upgrade {#app-upgrade}
@@ -353,6 +386,38 @@ zibby app restart a1b2c3d4
 Forces the ECS service to roll the current tasks without changing the task definition. Useful when the app gets wedged on a stuck connection.
+### app set-auth {#app-set-auth}
+Add, rotate, or remove the [Caddy auth proxy](./apps/auth) on an existing instance.
+```bash
+# Add basic auth to a previously-unauthenticated instance
+zibby app set-auth a1b2c3d4 --auth-type basic --auth-user admin --auth-password 'S0me-long-passphrase!'
+# Rotate just the password (basic auth must already be on)
+zibby app set-auth a1b2c3d4 --auth-password 'N3w-passphrase-2026!'
+# Switch from basic to token auth (caller-supplied)
+zibby app set-auth a1b2c3d4 --auth-type token --auth-token "$(cat ~/.secrets/bearer.txt)"
+# Switch to token auth with a freshly-generated token (returned ONCE in response)
+zibby app set-auth a1b2c3d4 --auth-type token
+# Strip auth entirely — Caddy container is removed; ALB routes straight to the app
+zibby app set-auth a1b2c3d4 --off
+```
+PATCH semantics: omitted flags keep their current values. The change triggers a rolling ECS task replacement (~60-90 s); the app container keeps its EFS data, and only the proxy config / container set changes.
+Options:
+- `--auth-type <kind>` — `basic`, `token`, or `none`
+- `--auth-user <name>` — required when setting `--auth-type basic`
+- `--auth-password <pass>` — set / rotate the basic-auth password. Also accepts `ZIBBY_APP_AUTH_PASSWORD` env.
+- `--auth-token <token>` — set / rotate the bearer token. If `--auth-type token` is set without this flag, backend generates a 32-char URL-safe token and returns it once. Also accepts `ZIBBY_APP_AUTH_TOKEN` env.
+- `--off` — remove auth entirely. Equivalent to `--auth-type none`.
+- `--yes` — skip confirmation prompt.
 ### app update-credential {#app-update-credential}
 ```bash

package/docs/intro.md CHANGED

@@ -56,7 +56,7 @@ zibby template add <name>                  # add a template later (overwrites =
 - **Run anywhere** — local with hot reload, or cloud with Heroku-style bundles (~3s cold start).
 - **Session replay** — every run lands as on-disk JSONL + artifacts. Re-run any node via `--session <id> --node <name>`.
 - **Cloud-native** — SSE log streaming, dedicated egress IPs for firewalled GitLab / GitHub Enterprise / Salesforce.
-- **Hosted apps too** — [Managed Apps](./apps/) host open-source tools (n8n, Grafana, Open WebUI, draw.io) with an autonomous agent-ops sidecar that handles health checks, self-healing, and upgrades.
+- **Hosted apps too** — [Managed Apps](./apps/) host open-source tools (Grafana, Open WebUI, Docmost, Uptime Kuma, Authentik, …) from a curated catalog, OR deploy anything else via natural-language [goal-mode](./apps/goal-mode). Every instance ships with an autonomous agent-ops sidecar that handles health checks, self-healing, and upgrades.
 - **Drive it from your AI agent** — [`@zibby/mcp-cli`](./packages/mcp-cli) exposes deploy / trigger / logs / debug as MCP tools. Add one snippet to Claude Code, Cursor, Codex, or Gemini and they call Zibby directly from chat. See [Use from your AI agent](./get-started/use-from-agents).
 ## Two product surfaces
@@ -66,7 +66,7 @@ zibby template add <name>                  # add a template later (overwrites =
 | Lifetime | Per trigger (seconds-minutes) | Long-lived |
 | Surface | Graph of agent CLI calls | A whole open-source application |
 | Billing | Per execution | Per minute, while running |
-| Best for | "When ticket lands, classify it" | "Host n8n for the team" |
+| Best for | "When ticket lands, classify it" | "Host Grafana for the team" |
 Pick by how long the thing needs to run — see [Apps overview](./apps/) for the decision tree.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@zibby/skills",
-  "version": "0.1.28",
+  "version": "0.1.30",
   "description": "Built-in skill definitions for Zibby test automation framework",
   "type": "module",
   "main": "dist/index.js",