@agent-compose/cli 0.2.1 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

package/README.md +4 -1
package/dist/index.js +124 -23
package/package.json +11 -11
package/skills/ac:generate-workflow.md +353 -122
package/skills/ac:invoke.md +31 -9
package/skills/ac:register-invoke.md +44 -7
package/skills/ac:register.md +4 -1
package/skills/ac:schedule.md +132 -0
package/skills/ac:setup.md +4 -1
package/skills/ac:snapshots.md +10 -2
package/skills/ac:tutorial.md +375 -0
package/skills/ac:demo.md +0 -134

package/skills/ac:schedule.md ADDED

@@ -0,0 +1,132 @@
+---
+name: ac:schedule
+description: Create, list, and delete cron schedules attached to registered workflows.
+argument-hint: <create|list|delete> [args]
+allowed-tools: Bash(agentc *)
+---
+# Schedules
+Cron schedules attached to a registered workflow. A workflow can have N
+named schedules with distinct cadences; each tick dispatches a fresh
+run tagged with the schedule's id + name so the dashboard attributes
+the run to its trigger and filters Runs by schedule.
+> **Mental model.** A schedule is a `(name, workflow, cron)` triple
+> scoped to a factory. Names are unique per factory; ids are stable
+> across re-registers (which makes "show me every run this schedule
+> ever fired" work even when the target template gets new versions).
+> Schedules are independent of workflow lifecycle — deleting a schedule
+> doesn't touch the workflow, deleting a workflow cascades to its
+> schedules.
+## Context
+Registered workflows:
+```
+!`agentc list 2>/dev/null | awk '{print $1}'`
+```
+Existing schedules:
+```
+!`agentc schedule list 2>/dev/null`
+```
+## Plan
+Based on `$ARGUMENTS`:
+| Arguments | Action |
+|---|---|
+| `create <name> --workflow <wf> --cron <expr>` | Create a schedule |
+| `list` | List schedules in the active factory |
+| `delete <id>` | Delete a schedule by id |
+| *(none)* | Ask which subcommand and walk through args |
+### Create
+```bash
+agentc schedule create <name> --workflow <wf> --cron '<expr>'
+```
+- `<name>` — kebab-case, unique per factory (e.g. `nightly-triage`).
+- `--workflow <wf>` — must be a registered workflow in the same factory.
+- `--cron '<expr>'` — UTC, standard 5-field cron. Quote it (shells eat
+  the wildcards otherwise). Common patterns:
+  - `'0 9 * * *'` — daily at 09:00 UTC
+  - `'*/15 * * * *'` — every 15 minutes
+  - `'0 * * * 1-5'` — hourly on weekdays
+- `--factory <slug>` — pin to a specific factory (default: active).
+Read the `--cron` expression back to the user in plain English to
+confirm, because the symbol soup hides bugs. If unsure, use `cronstrue`
+(the dashboard renders the human form on the Schedules tab).
+### List
+```bash
+agentc schedule list
+agentc schedule list --factory my-factory
+```
+Output is one row per schedule:
+```
+<name>  <workflow>  <cron>  next: <ISO>  <schedule-id>
+```
+The dashboard's `/factories/<slug>/workflows` → **Schedules** tab is
+the richer view: human cadence (cronstrue), next/last fire times,
+delete button, click-through to the Runs view filtered by schedule.
+### Delete
+```bash
+agentc schedule delete <schedule-id>
+```
+Confirm with the user first — "Delete schedule `<name>`? It will stop
+firing immediately; existing runs are unaffected." Schedule ids come
+from `agentc schedule list` or the dashboard.
+## Shorthand on register
+`agentc register <file> --schedule '<cron>'` auto-creates a schedule
+**named after the workflow** at register time. Useful when a workflow
+has exactly one cadence and you don't need a separate name:
+```bash
+agentc register pipeline.ts --schedule '0 9 * * *'
+# → registers "pipeline" + creates schedule "pipeline" at 09:00 UTC
+```
+Re-registering with the same cron is a no-op; changing the cron
+updates the existing schedule's row. Drop `--schedule` (or pass
+`null`) to clear the auto-shorthand schedule. Schedules created via
+`agentc schedule create` with a different name are left alone.
+For workflows with multiple cadences (e.g. weekday + weekend), use
+`agentc schedule create` explicitly — the register shorthand only
+manages a single same-name schedule.
+## Filtering runs
+Every run dispatched by a schedule carries `_scheduleId` +
+`_scheduleName` on its metadata. The dashboard's Schedules tab makes
+the schedule name a link straight into a filtered Runs view; the
+chip on each row links back. From the CLI, the metadata is visible
+on `agentc logs <run-id>` and via the SDK's `getRunById`.
+## Next Steps
+- **Schedule created** → "Wait for the first tick (next fire shown in
+  `agentc schedule list`); the dispatched run shows up in the dashboard
+  Runs view with a `schedule by <name>` link."
+- **`workflow "..." not found`** → Register the target template first
+  (`/ac:register <file.ts>`) — schedules can only target workflows
+  that already exist.
+- **`schedule "..." already exists`** → Name collision. Either use
+  `agentc schedule delete <id>` first or pick a new name.
+- **Scheduled runs never start** → Check the workflow's `bootFrom`
+  (`/ac:snapshots`) — runs against a missing snapshot fail at
+  dispatch. The schedule still ticks; only the workflow run fails.
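
The `--cron` patterns listed in the Create section of `ac:schedule.md` can be sanity-checked offline before a schedule is created. The following is a minimal sketch of a 5-field, UTC-only cron matcher; it is not the scheduler agent-compose ships, and `expandField` and `cronMatches` are hypothetical names introduced here for illustration. It assumes day-of-week runs 0-6 with 0 = Sunday, and that when both day-of-month and day-of-week are restricted a match on either one counts (the classic cron convention); the actual server may differ.

```typescript
// Expand one cron field ("*", "*/15", "1-5", "0,30", "9") into the values it allows.
function expandField(field: string, min: number, max: number): Set<number> {
  const values = new Set<number>();
  for (const part of field.split(",")) {
    const m = /^(?:\*|(\d+)(?:-(\d+))?)(?:\/(\d+))?$/.exec(part);
    if (!m) throw new Error(`unsupported cron field part: "${part}"`);
    const start: string | undefined = m[1];
    const end: string | undefined = m[2];
    const stepText: string | undefined = m[3];
    const step = stepText === undefined ? 1 : Number(stepText);
    if (step < 1) throw new Error(`step must be >= 1 in "${part}"`);
    const lo = start === undefined ? min : Number(start);
    // "5" means just 5; "5/10" and "*" run to the end of the field's range.
    const hi =
      end !== undefined
        ? Number(end)
        : start === undefined || stepText !== undefined
          ? max
          : lo;
    if (lo < min || hi > max || lo > hi) {
      throw new Error(`"${part}" is outside ${min}-${max}`);
    }
    for (let v = lo; v <= hi; v += step) values.add(v);
  }
  return values;
}

// True when `when` (interpreted in UTC) falls on a tick of the 5-field expression.
function cronMatches(expr: string, when: Date): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error(`expected 5 cron fields, got ${fields.length}`);
  }
  const [minute, hour, dom, month, dow] = fields as [string, string, string, string, string];
  const minuteOk = expandField(minute, 0, 59).has(when.getUTCMinutes());
  const hourOk = expandField(hour, 0, 23).has(when.getUTCHours());
  const monthOk = expandField(month, 1, 12).has(when.getUTCMonth() + 1);
  const domOk = expandField(dom, 1, 31).has(when.getUTCDate());
  const dowOk = expandField(dow, 0, 6).has(when.getUTCDay());
  // Assumption: both day fields restricted means "either matches".
  const dayOk = dom !== "*" && dow !== "*" ? domOk || dowOk : domOk && dowOk;
  return minuteOk && hourOk && monthOk && dayOk;
}

// 2024-01-01 was a Monday; 2024-01-06 was a Saturday.
cronMatches("0 9 * * *", new Date("2024-01-01T09:00:00Z"));    // true
cronMatches("*/15 * * * *", new Date("2024-01-01T03:50:00Z")); // false
cronMatches("0 * * * 1-5", new Date("2024-01-06T13:00:00Z"));  // false
```

A checker like this only confirms that an expression parses and ticks when expected; it does not replace reading the cadence back to the user in plain English.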

package/skills/ac:setup.md CHANGED

@@ -86,6 +86,9 @@ This lists registered workflows. If it returns without error, setup is complete.
 > "You're all set. Here's what you can do now:"
 >
-> - `/ac:demo` — watch AI agents build a browser game end-to-end
+> - `/ac:tutorial` — hands-on walkthrough of register, invoke, schedule, events, and snapshots (~10 min)
 > - `agentc invoke <workflow> --follow` — run any registered workflow
 > - `/ac:generate-workflow` — build your own workflow from a description
+> - `/ac:schedule create <name> --workflow <wf> --cron '<expr>'` — fire
+>   a workflow on a cron schedule (each tick shows up in the Runs view
+>   with a `schedule by <name>` link)

package/skills/ac:snapshots.md CHANGED

@@ -107,8 +107,16 @@ export default defineWorkflow({
 // On every successful run — captures the sandbox VM after the last step:
 defineWorkflow({ snapshots: { saveLatest: true }, run: ... });
-// Retain one per step (storage scales with step count):
-defineWorkflow({ snapshots: { saveLatest: true, retainSteps: true }, run: ... });
+// Retain one snapshot per step (storage scales linearly with step count).
+// Most useful with step-form workflows where each `.step(definedStep)`
+// is a distinct engine step; a run-form workflow has only one outer
+// "run" step, so `retainSteps: true` there behaves the same as
+// `saveLatest: true` alone.
+defineWorkflow({ id: "pipeline", input, output, snapshots: { saveLatest: true, retainSteps: true } })
+  .step(fetchStep)
+  .step(scoreStep)
+  .step(pickTopStep)
+  .build();
 ```
 Per-invocation override:

package/skills/ac:tutorial.md ADDED

@@ -0,0 +1,375 @@
+---
+name: ac:tutorial
+description: Hands-on tutorial — register, invoke, schedule, emit events, capture snapshots. ~10 minutes against a running stack.
+effort: high
+allowed-tools: Bash(agentc *) Bash(git *) Bash(open *) Read Write Edit
+---
+# Tutorial — agent-compose
+A hands-on walkthrough of every surface that matters. Builds one tiny
+workflow and uses it to demonstrate the five things you'll actually
+care about in production:
+1. **Author + register** a workflow.
+2. **Invoke** it and watch the run live.
+3. **Schedule** it to fire on a cron.
+4. **Emit events** from inside the workflow + inspect them.
+5. **Snapshot** the sandbox state + boot another workflow from it.
+Should take ~10 minutes against the hosted server (a bit longer
+locally — the runner sandbox boots fresh each time).
+## Pre-flight
+```
+!`agentc auth status`
+```
+If the line above shows no API key, run `/ac:setup` first.
+## Mental model
+Three primitives. Read once, refer back when something doesn't make
+sense:
+> **Workflow** — the unit of work you author. Two shapes:
+> `async (ctx, sandbox) => T` (run-form, single body) or
+> `defineWorkflow({...}).step(s1).step(s2).build()` (step-form, typed
+> pipeline). This tutorial uses run-form because it's the smaller
+> mental load; step-form is the right shape when the work decomposes
+> into typed phases.
+> **Agent loop** — `agent({ runtime, prompt, ... })`. Lives inside the
+> workflow body. Drives one iteration of an LLM agent against the
+> runner sandbox until the model emits `exit_signal: true` or hits
+> the budget.
+> **Runtime** — `claudeRuntime` (built-in, wraps the Claude CLI).
+> `openAIDesktopRuntime` wraps OpenAI computer-use. Write your own with
+> `defineRuntime`.
+Everything else — schedules, events, snapshots, secrets — wraps these
+three.
+## Step 1 — Scaffold
+Create `hello-agent.ts` next to where the user wants to work. Walk
+them through the file before writing — point at `ctx`, `sandbox`,
+`agent`, the prompt, the tool allowlist, the budget, and the
+`ctx.reportEvent` call (we'll use that in step 4). Confirm before
+saving.
+```ts
+// hello-agent.ts
+import { defineWorkflow, agent, claudeRuntime } from "@agent-compose/sdk";
+const PROMPT = `
+You're verifying agent-compose is wired up correctly.
+Run \`echo "hello from sandbox $(hostname)"\` once, then emit a
+<status> block with a summary describing what you observed and
+exit_signal: true.
+`;
+export default defineWorkflow({
+  async run(ctx, sandbox) {
+    const result = await agent({
+      sandbox,
+      runtime: claudeRuntime,
+      prompt:  PROMPT,
+      tools:   ["Bash"],
+      budget:  { turnsPerIteration: 6, maxIterations: 1 },
+    });
+    // Emit a custom event so step 4 has something to inspect.
+    await ctx.reportEvent("tutorial.verified", {
+      summary: result.status?.summary ?? "(no summary)",
+      attributes: { completed: !!result.status?.completed },
+    });
+    return { ok: !!result.status?.completed, summary: result.status?.summary };
+  },
+});
+```
+## Step 2 — Register and invoke
+```bash
+agentc register hello-agent.ts
+agentc invoke hello-agent --follow
+```
+> "The CLI bundles the workflow source + the runtime source via
+> `bundleWorkflow`, then POSTs to `/api/v1/factories/<slug>/templates`.
+> The server creates a runner sandbox, drops the bundle in, and starts
+> Node. Inside, our workflow runs `agent` once — Claude spawns, runs
+> the bash command, emits its `<status>`, and exits. Server marks the
+> run complete; the CLI prints elapsed time and the final outcome."
+Watch the SSE-streamed events live — point out:
+- `step_started` / `step_completed` (timeline phases).
+- `agent_event` lines (model output streaming live).
+- `event_reported` for the `tutorial.verified` event we emitted (step 4
+  picks this up).
+- `run_complete` with elapsed time.
+Copy the run id from the final output — needed for step 4.
+## Step 3 — Schedule it
+Schedules fire your workflow on a cron. Names are unique per factory
+and ids are stable across re-registers, so historical runs stay
+attributed even when the target template gets new versions.
+```bash
+agentc schedule create hello-every-2min \
+  --workflow hello-agent \
+  --cron '*/2 * * * *'
+```
+Wait two minutes for the first tick, then open the dashboard's Runs view (`agentc list` only confirms the template is registered):
+```bash
+agentc list                                  # registered templates
+# In the dashboard, /factories/<slug>/workflows → Schedules tab shows
+# `hello-every-2min` with the cadence in plain English and the next
+# fire time. Clicking the name filters Runs to just this schedule.
+```
+When you're done, clean up:
+```bash
+agentc schedule list                         # find the id
+agentc schedule delete <schedule-id>
+```
+> "Schedules are independent of the workflow — deleting a schedule
+> doesn't touch the template, deleting the template cascades to its
+> schedules. A workflow can have N named schedules for different
+> cadences (weekday vs weekend, hourly vs daily, etc)."
+## Step 4 — Inspect events
+`ctx.reportEvent(...)` inside the workflow body emits structured
+events the dashboard timeline renders, downstream workflows subscribe
+to, and analytics queries scan. List the events on the run we did in
+step 2:
+```bash
+agentc events list <run-id>
+```
+You should see `tutorial.verified` with the summary we emitted.
+Send a synthetic event from the CLI to verify the write path (useful
+for testing downstream subscribers without re-running the workflow):
+```bash
+agentc events send <run-id> tutorial.poke \
+  --body '{"source":"tutorial"}' \
+  --summary "manual poke from /ac:tutorial"
+```
+List again — `tutorial.poke` is now there too.
+> "Events are idempotent on `(run, name, idempotency-key)`, propagate
+> to subscribers, and live in the same timeline as lifecycle events.
+> Production workflows emit them via `ctx.reportEvent` — the CLI form
+> is for testing + verification."
+## Step 5 — Snapshots
+A snapshot is the on-disk state of a runner VM at the end of a
+successful step. Other workflows boot from it instead of running
+their own setup. Useful for:
+- Pre-baked sandbox environments (Claude CLI installed, deps fetched,
+  caches warm).
+- Re-running a workflow with the same starting state every time.
+For the tutorial, capture one from the next `hello-agent` run by
+adding `snapshots: { saveLatest: true }` to the workflow:
+```ts
+export default defineWorkflow({
+  snapshots: { saveLatest: true },              // ← add this
+  async run(ctx, sandbox) { /* unchanged */ },
+});
+```
+Re-register and invoke:
+```bash
+agentc register hello-agent.ts
+agentc invoke hello-agent --follow
+```
+After the run completes, list captured snapshots:
+```bash
+agentc snapshot list --workflow hello-agent
+```
+Copy the `snap_…` id. Now author a *second* workflow that boots from
+it instead of starting from a blank sandbox:
+```ts
+// hello-consumer.ts
+import { defineWorkflow } from "@agent-compose/sdk";
+export default defineWorkflow({
+  snapshots: { bootFrom: { snapshotId: "snap_PASTE_ID_HERE" } },
+  async run(ctx, sandbox) {
+    const result = await sandbox.commands.run("ls -la /home/user");
+    return { listing: result.stdout };
+  },
+});
+```
+```bash
+agentc register hello-consumer.ts
+agentc invoke hello-consumer --follow
+```
+`hello-consumer` starts in the exact state `hello-agent` left the
+sandbox in — same files, same processes, same env. No setup re-runs.
+> "Snapshot ids are the unit of identity. There's no 'latest of
+> workflow X' resolution at dispatch time — operators pick an id from
+> `agentc snapshot list` (or the dashboard's snapshots page) and
+> paste it. Deleting a snapshot makes any workflow that references
+> it fail with a 503 at dispatch; the consuming workflow needs to be
+> re-registered with a different id."
+## Step 6 — From your own code (SDK)
+Everything you just did with the CLI is also available programmatically
+via `@agent-compose/sdk`. The CLI is a thin wrapper around the same
+`AgentComposeClient` — what you see in `agentc <verb>` is what your
+backend code does. Useful for:
+- Triggering workflows from another service (webhook handlers, queue
+  consumers, internal dashboards).
+- Listing runs / schedules from your own admin UI.
+- Wiring agent-compose into a CI pipeline without shelling out to
+  `agentc`.
+Install it as a dependency of the calling project (it's a public npm
+package):
+```bash
+npm install @agent-compose/sdk     # or bun add @agent-compose/sdk
+```
+Construct a client with an API key (mint one in the dashboard's
+Settings → API Keys; the CLI uses the same shape):
+```ts
+import { AgentComposeClient } from "@agent-compose/sdk";
+const client = new AgentComposeClient({
+  baseURL: process.env.AGENTC_URL ?? "http://localhost:8080",
+  apiKey:  process.env.AGENTC_API_KEY!,     // ac_…
+});
+```
+> "The SDK is Bearer-authed (Authorization: Bearer ac_…). It's the same
+> entry point the CLI uses — `agentc invoke foo` is literally
+> `client.invoke('foo', …)`. Browser code talks to the server via a
+> cookie-bound dashboard-internal fetch wrapper instead, but that's
+> a dashboard-only seam; server-to-server is the SDK."
+### Invoke a workflow + wait for the result
+```ts
+// Fire-and-forget — returns a runId you can stream later.
+const { runId } = await client.invoke("hello-agent", { input: {} });
+// Block until the run settles, return the typed output:
+const result = await client.invokeAndWait<{ ok: boolean; summary?: string }>(
+  "hello-agent",
+  { input: {}, timeoutMs: 120_000 },
+);
+console.log(result.outcome, result.output?.summary);
+```
+### Schedules (matches `agentc schedule …`)
+```ts
+const schedules = await client.listSchedules();          // → ScheduleRow[]
+await client.createSchedule({
+  name:     "nightly",
+  workflow: "hello-agent",
+  cron:     "0 9 * * *",                                 // UTC
+});
+await client.deleteSchedule(scheduleId);
+```
+### Events (matches `ctx.reportEvent` inside the workflow + `agentc events`)
+```ts
+// Read what a run emitted:
+const events = await client.listRunEvents(runId);
+// Or factory-wide (audit / cross-workflow analytics):
+const recent = await client.listFactoryEvents({ limit: 200 });
+// Send a synthetic event into a run from outside:
+await client.reportEvent(runId, {
+  name:    "deploy.completed",
+  summary: "deploy to prod succeeded",
+  body:    { sha: "abc123" },
+});
+```
+### Snapshots (matches `agentc snapshot list`)
+```ts
+const captured = await client.listSnapshots({ workflowName: "hello-agent" });
+const perRun   = await client.listRunSnapshots(runId);
+```
+To boot another workflow from a captured snapshot, set
+`snapshots: { bootFrom: { snapshotId: "snap_…" } }` in
+`defineWorkflow(...)` — same shape the CLI's register step writes.
+### Stream a run's logs
+```ts
+for await (const line of client.streamRunLogs(runId)) {
+  console.log(`[${line.stream}] ${line.line}`);
+}
+```
+The full set of methods is on `AgentComposeClient` — the same surface
+the dashboard playground and `/ac:logs` skill use. See the SDK's
+[`README`](../../sdk/README.md) for the type-checked reference.
+## Step 7 — Where to go next
+> "You've now used every primitive. To go further:"
+>
+> - `/ac:generate-workflow` — describe a multi-phase workflow in
+>   plain English; Claude scaffolds the file with `agent` blocks per
+>   phase and the right input/output schemas.
+> - `/ac:generate-agent` — scaffold one agent loop's prompt + snippet
+>   to slot into an existing workflow.
+> - `/ac:secrets` — wire up brokered network secrets (GCP Secret
+>   Manager → injected at the Vercel firewall).
+> - `/ac:schedule` — the dedicated schedules walkthrough.
+> - `/ac:snapshots` — list, inspect, delete captured snapshots.
+> - `/ac:events` — full events reference.
+> - Read [`docs/how-it-works.md`](../../docs/how-it-works.md) for the
+>   multi-agent example (plan + parallel implementers).
+## Cleanup
+```bash
+agentc schedule delete <hello-every-2min-id>   # if you created one
+agentc snapshot delete <run-id> <snap-id>      # if you captured one
+rm hello-agent.ts hello-consumer.ts            # local files
+```
+Or keep them as references — registered templates stay in the team
+factory until you `agentc factory ...` them out.
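
Step 4 of the tutorial states that events are idempotent on `(run, name, idempotency-key)`. The sketch below models what that guarantee means for a caller, using an in-memory store. It is an illustration only: `EventLog`, `ReportedEvent`, and the `idempotencyKey` field name are hypothetical, and the diff does not show how the server or SDK actually carries the key.

```typescript
// Hypothetical event shape; only the (runId, name, idempotencyKey) triple matters here.
interface ReportedEvent {
  runId: string;
  name: string;
  idempotencyKey: string;
  summary?: string;
  body?: unknown;
}

// In-memory model of an event log that deduplicates on (run, name, idempotency-key).
class EventLog {
  private readonly byKey = new Map<string, ReportedEvent>();

  // Stores the event and returns true, or returns false if the triple was already seen.
  report(event: ReportedEvent): boolean {
    // JSON-encode the triple so separators inside the values cannot collide.
    const key = JSON.stringify([event.runId, event.name, event.idempotencyKey]);
    if (this.byKey.has(key)) return false;
    this.byKey.set(key, event);
    return true;
  }

  // Events recorded for one run, in insertion order.
  list(runId: string): ReportedEvent[] {
    return Array.from(this.byKey.values()).filter((e) => e.runId === runId);
  }
}

const log = new EventLog();
const poke = { runId: "run_1", name: "tutorial.poke", idempotencyKey: "poke-1" };
log.report(poke);                                   // true: first delivery is stored
log.report(poke);                                   // false: retry with the same key is dropped
log.report({ ...poke, idempotencyKey: "poke-2" });  // true: a new key is a new event
```

Under this model a sender can retry safely after a timeout by reusing the same key, and a deliberate second event needs a fresh key.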

package/skills/ac:demo.md DELETED

@@ -1,134 +0,0 @@
----
-name: ac:demo
-description: First-run walkthrough — verify auth, scaffold a sample workflow, register it, invoke, and watch the agent run live.
-effort: high
-allowed-tools: Bash(agentc *) Bash(git *) Bash(open *) Read Write Edit
----
-# Demo — agent-compose
-A guided first-run walkthrough. Verify auth, generate a tiny sample
-workflow, register it, dispatch it, and watch a single agent loop run
-end-to-end. Should take ~5 minutes against the hosted server (a bit
-longer locally — the runner sandbox boots fresh).
-## Context
-```
-!`agentc auth status`
-```
-If the line above shows no API key, run `/ac:setup` first (sign in to
-the dashboard, mint an `ac_…` key, `agentc auth login <key>`). That
-skill walks the full bootstrap.
-## Steps
-### 1. Briefly explain the model
-Make sure the user understands the three primitives before you start
-typing. Keep it brief — they can read [`docs/how-it-works.md`](../../docs/how-it-works.md)
-for the full version.
-> **Workflow** — `async (ctx, sandbox) => T`. The function you author.
-> Owns the run end-to-end: orchestrates phases, kicks off agent loops,
-> writes metadata, returns a structured result.
-> **Agent loop** — `agent({ runtime, prompt, ... })`. Embedded inside
-> the workflow body. Drives one iteration cycle of an LLM agent against
-> the runner sandbox until the model emits `exit_signal: true` (or
-> hits the iteration budget).
-> **Runtime** — `claudeRuntime` (built-in) wraps the Claude CLI.
-> `openAIDesktopRuntime` wraps OpenAI computer-use. You can write your
-> own with `defineRuntime`.
-### 2. Scaffold a sample workflow
-Create a temporary `hello-agent.ts` next to where the user wants to
-work. Keep it minimal — one phase, no fancy plumbing:
-```ts
-// hello-agent.ts
-import { defineWorkflow, agent, claudeRuntime } from "@agent-compose/sdk";
-const PROMPT = `
-You're verifying agent-compose is wired up correctly.
-Run \`echo "hello from sandbox $(hostname)"\` once, then emit a
-<status> block with summary describing what you observed and
-exit_signal: true.
-`;
-export default defineWorkflow({
-  async run(ctx, sandbox) {
-    const result = await agent({
-      sandbox,
-      runtime: claudeRuntime,
-      prompt:  PROMPT,
-      tools:   ["Bash"],
-      budget:  { turnsPerIteration: 6, maxIterations: 1 },
-    });
-    await ctx.setMetadata({ summary: result.status?.summary });
-    return { ok: !!result.status?.completed, summary: result.status?.summary };
-  },
-});
-```
-Walk the user through the file — point at `ctx`, `sandbox`,
-`agent`, the prompt, the tool allowlist, the budget. Confirm before
-writing.
-### 3. Register
-```bash
-agentc register hello-agent.ts
-```
-> "The CLI bundles the workflow source + the runtime source via
-> `bundleWorkflow`, then POSTs to `/api/v1/templates`. Once registered,
-> anyone with an `invoke`-scoped API key can call it."
-### 4. Invoke and watch live
-```bash
-agentc invoke hello-agent --follow
-```
-> "The server creates a runner sandbox, drops the bundle in, and starts
-> Node. Inside the sandbox, our workflow runs `agent` once — Claude
-> spawns, runs the bash command, emits its `<status>`, and exits.
-> Server marks the run complete; the CLI prints the final outcome and
-> elapsed time."
-Watch the SSE-streamed events together — point out:
-- `step_started` / `step_completed` (timeline phases).
-- `agent_event` lines (model output streaming live).
-- `run_complete` with elapsed time.
-### 5. Browse the run in the dashboard
-Once the run finishes, open the run detail page:
-```bash
-# pull the dashboard URL from auth status; substitute the run id
-open "$(agentc auth status | awk '/^Dashboard:/ {print $2}')/workflows/<run-id>"
-```
-The dashboard shows the same lifecycle events plus the captured
-artifacts (logs, metadata) the workflow emitted.
-### 6. What to build next
-> "That was the simplest possible workflow. To go further:"
->
-> - `/ac:generate-workflow` — describe a multi-phase workflow in plain
->   English; Claude scaffolds the full file with `agent` blocks per phase.
-> - `/ac:generate-agent` — scaffold one phase's prompt + `agent` snippet
->   to slot into an existing workflow.
-> - `/ac:generate-runtime` — write a custom runtime for a model the SDK
->   doesn't ship.
-> - Read [`docs/how-it-works.md`](../../docs/how-it-works.md) for the
->   multi-agent example (plan + parallel implementers).
-Clean up: `rm hello-agent.ts` (or keep it as a reference).