npm - @keystrokehq/cli - Versions diffs - 0.1.15 → 0.1.16 - Mend

@keystrokehq/cli 0.1.15 → 0.1.16


Files changed (31)

package/dist/skills-bundle/skills/keystroke-skills/SKILL.md CHANGED

@@ -28,10 +28,15 @@ At runtime → `/workspace/agent/skills/support/`.
 ## SKILL.md
-Follow [Agent Skills](https://agentskills.io/specification): YAML frontmatter with `name` (matches folder) and `description`. Keep the body short; put detail in `references/`.
+Follow [Agent Skills](https://agentskills.io/specification): YAML frontmatter with `name` (matches the folder) and `description` are the required fields; everything else (e.g. `metadata`) is optional. Keep the body short; put detail in `references/`.
+## Two kinds of skills (don't confuse them)
+- **`src/skills/`** — *project* Agent Skills attached to your agents via `skills: [...]`. They deploy with your project; inspect deployed ones with `keystroke skill list`.
+- **`.agents/skills/`** — the *bundled coding-agent* guides (like this one) that `keystroke init` scaffolds. Refresh them to the current CLI version with `keystroke skills sync` (note the plural `skills`).
 ## External registries
-Browse [skills.sh](https://skills.sh) — copy into `src/skills/{key}/` and fix `name` to match the folder.
+Browse [skills.sh](https://skills.sh) — copy into `src/skills/{name}/` and fix `name` to match the folder.
 Related: [agents](.agents/skills/keystroke-agents/SKILL.md), [files](.agents/skills/keystroke-files/SKILL.md).

package/dist/skills-bundle/skills/keystroke-triggers/SKILL.md CHANGED

@@ -7,18 +7,18 @@ metadata:
 # Triggers
-Triggers **attach** a source to a workflow. No business logic here — only schedule, endpoint, validation, and filters.
+Triggers **attach** a source to a target (a workflow or an agent). No business logic here — only schedule, endpoint, validation, and filters.
-Attachment id: `{triggerKey}:{workflowKey}` (e.g. `signup:signup-pipeline`).
+Attachment id: `{sourceSlug}:{targetSlug}` (e.g. `signup:signup-pipeline`), where the suffix is the workflow's (or agent's) `slug`.
 ## Cron
 ```ts
-import { defineCronSource } from "@keystrokehq/trigger";
+import { defineCronSource } from "@keystrokehq/keystroke/trigger";
 import workflow from "../workflows/morning-check";
 export default defineCronSource({
-  key: "morning-check",
+  slug: "morning-check",
   schedule: "0 9 * * *",
 }).attach({ workflow });
 ```
@@ -26,12 +26,12 @@ export default defineCronSource({
 ## Webhook
 ```ts
-import { defineWebhookSource } from "@keystrokehq/trigger";
+import { defineWebhookSource } from "@keystrokehq/keystroke/trigger";
 import { z } from "zod";
 import workflow from "../workflows/signup-pipeline";
 export default defineWebhookSource({
-  key: "signup",
+  slug: "signup",
   endpoint: "signup",
   request: z.object({
     name: z.string().trim().min(1),
@@ -47,20 +47,22 @@ Use optional Zod `filter` for extra constraints beyond `request`:
 filter: z.object({ type: z.literal("invoice.paid") }),
 ```
+**Webhooks ack asynchronously.** The POST returns immediately — `202 { runId }` when a binding matches, or `{ ok: true, skipped: true }` when nothing does. The workflow runs in the background; its output is **not** returned in the HTTP response (there's no "respond to webhook"). To return data to the caller, make an outbound call from the workflow, or have the caller poll the run via the runs API / `keystroke workflow runs get`.
 ### Shared endpoint (e.g. Stripe)
-Multiple trigger files can use the same `endpoint` — one URL `POST /triggers/{endpoint}`, each with its own `key`, `request`, `filter`, and `transform`. Unmatched payloads return `{ ok: true, skipped: true }`.
+Multiple trigger files can use the same `endpoint` — one URL `POST /triggers/{endpoint}`, each with its own `slug`, `request`, `filter`, and `transform`. Unmatched payloads return `{ ok: true, skipped: true }`.
 ```ts
 // src/triggers/stripe-invoice-paid.ts
 export default defineWebhookSource({
-  key: "stripe-invoice-paid",
+  slug: "stripe-invoice-paid",
   endpoint: "stripe",
   request: z.object({ type: z.string(), data: z.object({ id: z.string() }) }),
   filter: z.object({ type: z.literal("invoice.paid") }),
 }).attach({ workflow: invoicePaidWorkflow, transform: (p) => ({ invoiceId: p.data.id }) });
-// src/triggers/stripe-subscription-deleted.ts — same endpoint, different key/schema/filter
+// src/triggers/stripe-subscription-deleted.ts — same endpoint, different slug/schema/filter
 ```
 List or inspect all triggers on an endpoint:
@@ -71,21 +73,23 @@ keystroke trigger get stripe          # same rows as list --endpoint (shared rou
 keystroke trigger url stripe          # one webhook URL for the route
 ```
-Use each trigger's `key` for `trigger get` / run history (`keystroke trigger runs list stripe-invoice-paid:…`).
+Use each trigger's `slug` for `trigger get` / run history (`keystroke trigger runs list stripe-invoice-paid:…`).
 ## Poll
 ```ts
-import { definePollSource } from "@keystrokehq/trigger";
+import { definePollSource } from "@keystrokehq/keystroke/trigger";
 import workflow from "../workflows/new-inbox";
 export default definePollSource({
-  key: "new-inbox",
+  slug: "new-inbox",
   schedule: "*/5 * * * *",
   run: () => ({ emails: [] }),
 }).attach({ workflow });
 ```
+Poll filtering uses a **function predicate** (not a Zod schema like webhooks) — either `.filter((result) => …)` chained on the source, or a `filter:` option. Returning falsy skips the tick. Group polls that should run together with `definePollSource({ id: "...", … })`.
 ### Ephemeral poll (agents)
 Agents can register a scheduled codemode script with `set_trigger`:
@@ -93,7 +97,7 @@ Agents can register a scheduled codemode script with `set_trigger`:
 ```ts
 set_trigger({
   kind: "poll",
-  key: "inbox",
+  slug: "inbox",
   schedule: "*/5 * * * *",
   code: [
     'const emails = await tools["list-emails"]({ query: "is:unread" });',
@@ -101,14 +105,22 @@ set_trigger({
     "console.log(JSON.stringify({ count: emails.items.length, items: emails.items }));",
   ].join("\n"),
   prompt: "You have {{payload.count}} unread emails.",
+  lifecycle: { maxExecutions: 10 }, // optional: cap runs (also `until`)
 });
 ```
 - Write the script the same way you would for codemode (`bash` + `js-exec`).
 - `console.log(JSON.stringify(result))` when there is work to do.
-- Log nothing (or `null`) to skip — skipped ticks do not count toward `maxExecutions`.
+- Log nothing (or `null`) to skip — skipped ticks do not count toward `lifecycle.maxExecutions`.
 - Prompt interpolation matches webhooks (`{{payload.path}}`).
+## Attachment patterns
+- **Attach to an agent** instead of a workflow: `.attach({ agent, prompt })` — the source's payload drives the agent prompt (interpolated like webhooks).
+- **Fan-out to multiple targets**: chain `.attach(...).attach(...)` (or export an array of attachments) to bind one source to several workflows/agents.
+- **Shared source definitions** can live under `src/triggers/sources/` — that subfolder is excluded from attachment discovery, so it's a safe place for source defs you import elsewhere.
+- Sources also accept optional `name` / `description` metadata.
 ## Develop & audit
 While building, invoke the workflow directly:
@@ -120,11 +132,12 @@ keystroke workflow run signup-pipeline --input '{"name":"Ada","email":"a@acme.co
 Inspect trigger-driven runs:
 ```bash
-keystroke trigger runs list signup:signup-pipeline
+keystroke trigger runs list signup:signup-pipeline   # --limit / --cursor / --trigger-type
 keystroke trigger runs get signup:signup-pipeline <run-id> --include workflows,trace
-keystroke trigger poll <poll-attachment-id>    # on-demand poll
+keystroke trigger poll <poll-attachment-id>    # on-demand poll (--group to run a poll group)
+keystroke trigger attachment disable <trigger-slug> <attachment-id>   # pause; `enable` to resume
 ```
 Check `src/triggers/` in your project for existing patterns before adding new ones.
-Related: [workflows](.agents/skills/keystroke-workflows/SKILL.md), [gateways](.agents/skills/keystroke-gateways/SKILL.md).
+Related: [workflows](.agents/skills/keystroke-workflows/SKILL.md), [channels](.agents/skills/keystroke-channels/SKILL.md).
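
Note on the attachment id change in this file: the id is now described as `{sourceSlug}:{targetSlug}`. A minimal sketch of building and splitting such an id follows. `formatAttachmentId` and `parseAttachmentId` are hypothetical helper names used only for illustration; they are not part of any `@keystrokehq` package, and the sketch assumes the source slug itself contains no colon.

```typescript
// Hypothetical helpers for illustration; not exported by any @keystrokehq package.
type AttachmentId = { sourceSlug: string; targetSlug: string };

// Join the two slugs in the documented `{sourceSlug}:{targetSlug}` order.
function formatAttachmentId({ sourceSlug, targetSlug }: AttachmentId): string {
  return `${sourceSlug}:${targetSlug}`;
}

// Split on the first colon only (assumption: the source slug has no colon).
function parseAttachmentId(id: string): AttachmentId {
  const index = id.indexOf(":");
  if (index <= 0 || index === id.length - 1) {
    throw new Error(`Expected "<sourceSlug>:<targetSlug>", got "${id}"`);
  }
  return { sourceSlug: id.slice(0, index), targetSlug: id.slice(index + 1) };
}
```

With the example from the diff, `formatAttachmentId({ sourceSlug: "signup", targetSlug: "signup-pipeline" })` gives the same string that is passed to `keystroke trigger runs list`.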

package/dist/skills-bundle/skills/keystroke-workflows/SKILL.md CHANGED

@@ -12,8 +12,8 @@ Workflows are **deterministic orchestration**: Zod input/output, a `run` functio
 ## Example: action chain
 ```ts
-import { defineWorkflow } from "@keystrokehq/workflow";
-import { postMessage } from "@keystrokehq/slack/actions";
+import { defineWorkflow } from "@keystrokehq/keystroke/workflow";
+import { slackSendMessage } from "@keystrokehq/slack/actions";
 import { z } from "zod";
 import { researchSignup } from "../actions/research-signup";
 import { signupBriefMessage } from "../lib/signup";
@@ -24,12 +24,16 @@ export default defineWorkflow({
   output: z.object({ brief: z.string(), channel: z.string(), ts: z.string() }),
   async run(input) {
     const { brief } = await researchSignup.run(input);
-    return postMessage.run({ channel: "#pipeline", text: signupBriefMessage({ ...input, brief }) });
+    const sent = await slackSendMessage.run({
+      channel: "#pipeline",
+      markdown_text: signupBriefMessage({ ...input, brief }),
+    });
+    return { brief, ...sent }; // sent provides { channel, ts }; merge in brief for the output schema
   },
 });
 ```
-`research-signup` calls an agent; `postMessage` is an integration action used **directly as a step** here — never wrapped in a custom action. Keep orchestration in the workflow; an action is a single leaf step and never calls another action.
+`research-signup` calls an agent; `slackSendMessage` is an integration action used **directly as a step** here — never wrapped in a custom action. Note the `run` return is spread to satisfy the `output` schema (the Slack action only returns `channel`/`ts`). Keep orchestration in the workflow; an action is a single leaf step and never calls another action.
 ## Run & audit
@@ -41,18 +45,29 @@ keystroke workflow runs get signup-pipeline <run-id> --include steps,trace
 ## How workflows get invoked
-| From             | How                                                                   |
-| ---------------- | --------------------------------------------------------------------- |
-| CLI              | `keystroke workflow run {key} --input '{...}'`                        |
-| Trigger          | cron / webhook / poll attachment in `src/triggers/`                   |
-| Agent tool       | `defineWorkflowTool(workflow)` from `@keystrokehq/runtime` on an agent |
-| Another workflow | call from `run`                                                       |
+| From       | How                                                                   |
+| ---------- | --------------------------------------------------------------------- |
+| CLI        | `keystroke workflow run {slug} --input '{...}'`                       |
+| HTTP       | `POST /workflows/{slug}`                                              |
+| Trigger    | cron / webhook / poll attachment in `src/triggers/`                   |
+| Agent tool | `defineWorkflowTool(workflow)` from `@keystrokehq/runtime` on an agent |
-`key` must be unique across agents, workflows, triggers, and actions.
+There is no first-class "call another workflow" step. To share logic between workflows, extract it into `src/lib/` (or an action, which adds typed IO and its own durable checkpoint). To run a workflow as its own tracked run, expose it as an agent tool (`defineWorkflowTool`) or invoke it over HTTP. Calling another workflow's `.run()` directly skips that workflow's input/output validation (the durable step context is preserved inside a run, but the IO schemas are not re-applied).
+The workflow `slug` is its identifier and route key, and must be unique across agents, workflows, triggers, and actions.
+## Durability (the model you must design for)
+Workflows are **durable**: each `await` of an action, agent `.prompt()`, or `promptLlm` is recorded as a `step_completed` event. If a later step fails, the run **replays** the log — completed steps return their recorded result instead of running again — and resumes at the first unfinished step. Two rules follow:
+- **Side effects go inside steps.** Code in the `run` body that isn't a step (network calls, writes, `Date.now()`, random) re-executes on every replay. Wrap it in an action/agent/`promptLlm` call so it's recorded once.
+- **Keep control flow deterministic, steps idempotent.** Branch on input and recorded step results, not on values that change between attempts. A step can be retried after a transient failure, so design actions to tolerate being called twice with the same input. Retries are automatic — you don't write retry loops; the runtime emits `step_retrying` / `step_failed`.
+Step ids are correlation ids: `step:<slug>#<occurrence>` (`#0`, `#1`, …), or `step:<id>` when pinned with `.stepId()`. If an action's position can shift, pin a stable `.stepId()` so replays line up. Durable waits (`ctx.sleep`, `ctx.hook`) suspend the run without holding a process open. See [authoring.md](references/authoring.md).
 ## Testing
-Unit-test through `executeWorkflow` from `@keystrokehq/workflow` — never call `workflow.run(...)` directly (skips validation + action context). Stub costly actions by seeding `step_completed` events in a `MemoryEventLog`. See [testing.md](references/testing.md).
+Unit-test through `executeWorkflow` from `@keystrokehq/keystroke/workflow` — never call `workflow.run(...)` directly (skips validation + action context). Stub costly actions by seeding `step_completed` events in a `MemoryEventLog`. See [testing.md](references/testing.md).
 ## Next references
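
Note on the new Durability section in this file: the replay behaviour it describes can be modelled with a toy event log. The sketch below is not Keystroke's engine. `createStepRunner` and `demoReplay` are hypothetical names, the log is a plain `Map`, and steps are synchronous here for brevity, whereas real steps are awaited. It only illustrates two points from the text: completed steps return their recorded result on a later attempt, and ids follow the `step:<slug>#<occurrence>` scheme.

```typescript
// Toy model of log-based replay; illustrative only, not the Keystroke runtime.
type EventLog = Map<string, unknown>;

function createStepRunner(log: EventLog) {
  // Occurrence counters restart on every attempt so ids line up across replays.
  const occurrences = new Map<string, number>();
  return function step<T>(slug: string, work: () => T): T {
    const n = occurrences.get(slug) ?? 0;
    occurrences.set(slug, n + 1);
    const id = `step:${slug}#${n}`;
    if (log.has(id)) return log.get(id) as T; // replay: reuse the recorded result
    const result = work();
    log.set(id, result); // record only after the step completes
    return result;
  };
}

function demoReplay() {
  const log: EventLog = new Map();
  let researchCalls = 0;
  let sendShouldFail = true;

  const attempt = (): string => {
    const step = createStepRunner(log);
    const brief = step("research", () => {
      researchCalls += 1;
      return "brief";
    });
    return step("send", () => {
      if (sendShouldFail) {
        sendShouldFail = false;
        throw new Error("transient failure");
      }
      return `${brief} sent`;
    });
  };

  try {
    attempt(); // fails at "send"; "research" is already recorded
  } catch {
    // a real runtime would re-enqueue the run here
  }
  const output = attempt(); // "research" is replayed from the log, "send" runs
  return { output, researchCalls, stepIds: Array.from(log.keys()) };
}
```

In this model the `research` body runs once across both attempts, which is why the text asks for side effects to live inside steps rather than in the bare `run` body.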

package/dist/skills-bundle/skills/keystroke-workflows/references/authoring.md CHANGED

@@ -10,7 +10,11 @@ async run(input) {
   const summary = await support.prompt({
     message: `Summarize signup research:\n${brief}`,
   });
-  return postMessage.run({ channel: "#pipeline", text: signupBriefMessage({ ...input, brief }) });
+  const sent = await slackSendMessage.run({
+    channel: "#pipeline",
+    markdown_text: signupBriefMessage({ ...input, brief }),
+  });
+  return { brief, ...sent };
 }
 ```
@@ -18,7 +22,7 @@ Actions are leaf units: an action never calls another action — compose them in
 ## Agents as workflow steps
-Import the agent and call `.prompt()` directly in the workflow `run`. Each call is a durable step keyed as `step:<agentKey>#<occurrence>` (same scheme as actions). No wrapper action required when the step is just "prompt this agent."
+Import the agent and call `.prompt()` directly in the workflow `run`. Each call is a durable step keyed as `step:<agent-slug>#<occurrence>` (same scheme as actions). No wrapper action required when the step is just "prompt this agent."
 ```ts
 const result = await support.prompt({
@@ -28,6 +32,110 @@ const result = await support.prompt({
 Use a wrapper action only when the agent call should also be an agent tool, or when the step bundles non-prompt logic.
+For an explicit step id, actions chain `.stepId("x")`; agent prompts take it as an option: `agent.prompt(input, { stepId: "x" })`.
+## LLM steps (`promptLlm`)
+For a one-shot LLM call without a full agent, use `promptLlm` (from `@keystrokehq/keystroke/workflow`). It's a first-class durable step keyed `step:promptLlm#<occurrence>`. The signature is `promptLlm(prompt, opts)` — the prompt string comes **first**, options second:
+```ts
+import { promptLlm } from "@keystrokehq/keystroke/workflow";
+// Returns a string when no outputSchema is given:
+const summary = await promptLlm(`Summarize:\n${brief}`, {
+  model: "anthropic/claude-sonnet-4-6",
+});
+```
+Pass `outputSchema` (Zod) to get a parsed, schema-validated object back instead of a string — the return type is inferred from the schema:
+```ts
+import { z } from "zod";
+const Category = z.object({ category: z.enum(["bug", "feature", "question"]) });
+const { category } = await promptLlm(`Classify this ticket:\n${ticket}`, {
+  model: "anthropic/claude-sonnet-4-6",
+  outputSchema: Category,
+});
+```
+Other options: `system`, `thinkingLevel`, `temperature`, `maxTokens`, `stepId`.
+## Durability & retries
+As each step completes, its result is written to a run event log. If a later step fails and the run is retried, Keystroke replays the log: completed steps return their recorded result instead of running again, and execution resumes at the first unfinished step. Design for this:
+- **Put side effects inside steps.** Work in the `run` body that isn't an action / agent / `promptLlm` call (a raw `fetch`, a DB write, `Date.now()`, `Math.random()`) runs again on every replay. Move it into a step so it's recorded once.
+- **Steps should be idempotent.** A step can be retried after a transient failure, so an action should tolerate being called twice with the same input (use idempotency keys for external writes where it matters).
+- **Keep control flow deterministic.** Branch on the run's input and recorded step results — not on wall-clock time or randomness — so replays take the same path. Each step is recorded under `step:<slug>#<occurrence>`; pin `.stepId("x")` when an action's order can shift so replays still line up. Ids must be unique within a run.
+You don't manage retries yourself. The runtime records `step_completed`, `step_retrying`, and `step_failed` events and surfaces them in `keystroke workflow runs get <slug> <run-id> --include steps,trace`.
+## Parallel steps
+Independent steps run concurrently with `Promise.all` — each `.run()` / `.prompt()` / `promptLlm` inside it is still its own durable step, recorded and replayed individually:
+```ts
+const [enriched, scored] = await Promise.all([
+  enrichLead.run({ leadId }),
+  scoreLead.run({ leadId }),
+]);
+```
+Correlation ids are assigned per slug in array order (`step:enrich-lead#0`, `step:score-lead#0`). If the branches can change order between attempts (conditionally included, dynamic arrays), pin a stable `.stepId()` on each so replays line up:
+```ts
+const results = await Promise.all(
+  leads.map((lead) => enrichLead.run({ leadId: lead.id }).stepId(`enrich:${lead.id}`)),
+);
+```
+There's no built-in concurrency limit or fan-out helper — `Promise.all` runs everything at once. For bounded concurrency, batch the array yourself (e.g. chunk and `await` each chunk).
+## Handling failures
+Retries are **queue-level**: a failed run is re-enqueued (default 3 attempts, exponential backoff) and replays from the event log, so completed steps don't re-run. Each failed attempt emits `step_retrying`; the final failure emits `step_failed`. You don't write retry loops — make steps idempotent so a retried step is safe.
+There's no built-in saga, compensation engine, or dead-letter queue. When a later step fails and you need to undo earlier work, orchestrate it yourself with `try/catch` in `run` — catching the error lets the run complete normally (so attach a `compensated` flag to the output rather than rethrowing if you've handled it):
+```ts
+async run(input) {
+  await reserveBooking.run({ id: input.bookingId });
+  try {
+    await chargeCustomer.run({ id: input.bookingId });
+    return { ok: true };
+  } catch (error) {
+    await releaseBooking.run({ id: input.bookingId }); // best-effort cleanup
+    return { ok: false, compensated: true };
+  }
+}
+```
+If you let the error propagate instead, the run fails and the queue retries it — fine for transient failures, but it replays from the log, so any cleanup must itself be a recorded, idempotent step.
+## Durable run context (`ctx`)
+`run(input, ctx)` receives a second argument exposing `runId`, `trigger` (`api` | `cron` | `webhook` | `poll` | `retry`), and the durable wait primitives `ctx.sleep(...)` and `ctx.hook(...)`. Both **suspend** the run (the replay engine resumes it later) so long waits and external callbacks don't hold a process open.
+```ts
+// Durable delay: number (ms), duration string, or a Date
+await ctx.sleep("1h");
+// Durable hook: suspend until an external system POSTs to the resume URL
+async run(input, ctx) {
+  const approval = ctx.hook<{ approved: boolean }>();
+  await slackSendMessage.run({
+    channel: "#approvals",
+    markdown_text: `Approve deploy? ${approval.resumeUrl}`,
+  });
+  const { approved } = await approval; // suspends here until resumed
+  return { approved };
+}
+```
+`ctx.hook(options?)` takes an optional `token` (reuse a stable resume token) and `schema` (Zod validation of the resume payload). The returned handle exposes `token` and `resumeUrl` before you await it — surface that URL so the approver/callback can resume the run.
 ## Agents inside actions
 When an action must call an agent (e.g. the action is reused as an agent tool), call `.prompt()` inside the action's `run`. That prompt is not a separate workflow step — the action is.
@@ -36,6 +144,10 @@ When an action must call an agent (e.g. the action is reused as an agent tool),
 Sync the app into `src/apps/`, author with `app.action()`, then `keystroke connect <slug>` before testing. Catalog integration actions from npm packages work once the app is connected.
-## Keys
+## Slugs
+Workflow `slug`, action `slug`, agent `slug`, and trigger `slug` share one namespace — pick unique names (`signup-pipeline`, not `pipeline`). The workflow's `slug` is also its HTTP route segment (`POST /workflows/{slug}`).
+## Subscription mode
-Workflow `key`, action `key`, agent `key`, and trigger `key` share one namespace — pick unique names (`signup-pipeline`, not `pipeline`).
+`defineWorkflow` accepts an optional `subscription: { mode: "system" | "subscribable" }` to control how the workflow can be subscribed to. Omit it for the default behavior.
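
Note on the Parallel steps section added in this file: it suggests batching the array yourself when `Promise.all` would start too much work at once. One possible shape for that is sketched below. `chunk` and `mapInChunks` are hypothetical helpers written for this note, not Keystroke APIs, and the worker is any function that returns a promise.

```typescript
// Hypothetical helpers for bounded concurrency; not part of any @keystrokehq package.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

async function mapInChunks<T, R>(
  items: readonly T[],
  size: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, size)) {
    // At most `size` workers are in flight; the next batch waits for this one.
    const settled = await Promise.all(batch.map((item) => worker(item)));
    results.push(...settled);
  }
  return results;
}
```

Inside a workflow `run`, the worker would presumably be a step call such as the pinned `enrichLead.run(...).stepId(...)` shown in the diff, so each item is still recorded under its own stable step id. Results come back in input order because batches are processed sequentially.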

package/dist/skills-bundle/skills/keystroke-workflows/references/testing.md CHANGED

@@ -2,15 +2,15 @@
 ## Unit tests (in-process, no server)
-Test through `executeWorkflow` from `@keystrokehq/workflow` — it parses input/output and wires up the action runner. **Never call `workflow.run(...)` directly**: that skips Zod validation and has no action context, so any action step throws.
+Test through `executeWorkflow` from `@keystrokehq/keystroke/workflow` — it parses input/output and wires up the action runner. **Never call `workflow.run(...)` directly in a test**: called at top level (outside `executeWorkflow`) it skips Zod validation and has no action context, so any action step throws.
-`executeWorkflow` runs the workflow via the durable replay engine and resolves to a `ReplayResult`: `{ status: "completed", output }`, `{ status: "failed", error }`, or `{ status: "suspended", items }` (when the body hits `ctx.sleep`/`ctx.hook`). Assert on `status` and read `output`.
+`executeWorkflow` runs the workflow via the durable replay engine and resolves to a `ReplayResult`: `{ status: "completed", output }`, `{ status: "failed", error }`, `{ status: "suspended", items }` (when the body hits `ctx.sleep`/`ctx.hook`), or `{ status: "canceled" }`. Assert on `status` and read `output`.
 `keystroke init` scaffolds Vitest and a starter test (e.g. `src/workflows/greeting.test.ts`):
 ```ts
 import { describe, expect, it } from "vitest";
-import { executeWorkflow } from "@keystrokehq/workflow";
+import { executeWorkflow } from "@keystrokehq/keystroke/workflow";
 import greeting from "./greeting";
 describe("greeting workflow", () => {
@@ -27,7 +27,7 @@ For multi-step workflows:
 ```ts
 import { describe, expect, it } from "vitest";
-import { executeWorkflow } from "@keystrokehq/workflow";
+import { executeWorkflow } from "@keystrokehq/keystroke/workflow";
 import workflow from "../signup-pipeline";
 it("runs the pipeline", async () => {
@@ -39,7 +39,7 @@ it("runs the pipeline", async () => {
   expect(result.status).toBe("completed");
   if (result.status === "completed") {
-    expect(result.output).toMatchObject({ channel: "#signups" });
+    expect(result.output).toMatchObject({ channel: "#pipeline" });
   }
 });
 ```
@@ -51,7 +51,7 @@ Deterministic actions (pure logic, no network/LLM) need no mocks — run the wor
 Steps are checkpointed in the durable event log as `step_completed` events keyed by a stable **correlation id**. Before running a step the action runner checks the log for that correlation id; pre-seed a `step_completed` event to skip real work (LLM/agent/HTTP calls) and feed a fixture:
 ```ts
-import { executeWorkflow, MemoryEventLog } from "@keystrokehq/workflow";
+import { executeWorkflow, MemoryEventLog } from "@keystrokehq/keystroke/workflow";
 import workflow from "../signup-pipeline";
 it("drafts without calling the agent", async () => {
@@ -83,7 +83,7 @@ it("drafts without calling the agent", async () => {
 Rules:
 - Pass a fixed `runId` and the same `MemoryEventLog` to `executeWorkflow`.
-- The correlation id is `step:<actionKey>#<occurrence>` — `#0` for the first call to that action, `#1` for the second, and so on — unless the step used `.stepId("x")`, in which case it is `step:x`.
+- The correlation id is `step:<slug>#<occurrence>` — `#0` for the first call to that action/agent, `#1` for the second, and so on — unless an explicit step id was set (`action.stepId("x")` or `agent.prompt(input, { stepId: "x" })`), in which case it is `step:x`.
 - The action **output** is stored as the event `data` directly (no wrapper). It is re-validated against the action's output schema, so it must be schema-valid.
 This is the preferred mock: schema-checked, no module-mock hoisting, and it still exercises the real `run` orchestration (branching, loops, output shape) — only the action bodies are stubbed.
@@ -92,11 +92,19 @@ This is the preferred mock: schema-checked, no module-mock hoisting, and it stil
 When you'd rather stub the underlying dependency (e.g. the agent inside an action) or assert how it was called, use `vi.mock`:
+`agent.prompt()` resolves to a `PromptResponse` — `{ sessionId, messages, error, canceled?, output? }` (`output` is the parsed result when the prompt was called with an `outputSchema`), **not** a bare messages array. Mock that shape:
 ```ts
 import { vi } from "vitest";
 vi.mock("../agents/support", () => ({
-  default: { prompt: vi.fn().mockResolvedValue([{ role: "assistant", content: "mocked" }]) },
+  default: {
+    prompt: vi.fn().mockResolvedValue({
+      sessionId: "test-session",
+      messages: [{ role: "assistant", content: "mocked" }],
+      error: null,
+    }),
+  },
 }));
 ```
@@ -108,7 +116,7 @@ Validation happens at the `executeWorkflow` boundary, so contract tests are free
 await expect(executeWorkflow(workflow, { email: "x" } as never)).rejects.toThrow();
 ```
-## Debug runs against a running server (CLI)
+## Debug runs from the CLI
 ```bash
 keystroke workflow run signup-pipeline --input '{"name":"Ada","email":"ada@acme.com","company":"Acme"}'

package/dist/templates/hello-world/README.md CHANGED

@@ -4,26 +4,35 @@ Keystroke project — agents, workflows, actions, and triggers under `src/`. See
 ## Getting started
+Keystroke is deploy-first: build in `src/`, deploy to your platform project, then run and inspect what's deployed with the CLI.
 ```bash
-pnpm install   # @keystrokehq/* from GitHub Packages (see .npmrc)
-# .env is created from .env.example on init — set ANTHROPIC_API_KEY and integration keys
-keystroke start   # API :3002, dashboard :3000
+pnpm install                          # @keystrokehq/* from GitHub Packages (see .npmrc)
+keystroke auth login                  # once
+keystroke deploy --project <id>       # build + ship src/ (see: keystroke project list)
 ```
-Login at `http://localhost:3000` (`admin@example.com` / `adminadmin` by default).
-Prompt the hello agent:
+<!-- example:start -->
+Prompt the hello agent and run the greeting workflow against your deployed project:
 ```bash
 keystroke agent prompt hello --message "Hi"
+keystroke workflow run greeting --input '{"name":"Ada"}'
 ```
-Run the greeting workflow:
+Use `--filter agents/hello` (or `workflows/greeting`) to redeploy a single module fast.
+<!-- example:end -->
+## Run locally (optional)
+To iterate offline without deploying, run a local server. Set `ANTHROPIC_API_KEY` and any integration keys in `.env` (created from `.env.example` on init), then:
 ```bash
-keystroke workflow run greeting --input '{"name":"Ada"}'
+keystroke dev   # watch src/, rebuild, restart the API (API :3002, dashboard :3000)
 ```
+The same `keystroke agent` / `keystroke workflow` commands then target your local server. Login at `http://localhost:3000` (`admin@example.com` / `adminadmin` by default).
 ## Linting
 ```bash
@@ -39,7 +48,9 @@ pnpm test:unit         # src/**/*.test.ts
 pnpm test:int          # src/**/*.int.test.ts (needs ANTHROPIC_API_KEY for agent tests)
 ```
+<!-- example:start -->
 Example: `src/workflows/greeting.test.ts` uses `executeWorkflow` to test the `greeting` workflow in-process.
+<!-- example:end -->
 ## Layout

package/dist/templates/hello-world/src/workflows/greeting.test.ts CHANGED

@@ -6,6 +6,6 @@ describe("greeting workflow", () => {
   it("returns a greeting for a name", async () => {
     const result = await executeWorkflow(greeting, { name: "Ada" });
-    expect(result).toEqual({ greeting: "Hello, Ada!" });
+    expect(result).toEqual({ status: "completed", output: { greeting: "Hello, Ada!" } });
   });
 });
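
Note on this test change: the assertion now expects the wrapped result (`status` plus `output`) rather than the bare workflow output. A test helper that narrows on `status` might look like the sketch below. The `ReplayResult` type here is a local model built from the four shapes listed in testing.md; the type actually exported by the package may differ, and `unwrapCompleted` is a hypothetical helper name.

```typescript
// Local model of the result shapes listed in testing.md; the real exported type may differ.
type ReplayResult<O> =
  | { status: "completed"; output: O }
  | { status: "failed"; error: unknown }
  | { status: "suspended"; items: unknown[] }
  | { status: "canceled" };

// Hypothetical test helper: return the output, or fail loudly on any other status.
function unwrapCompleted<O>(result: ReplayResult<O>): O {
  if (result.status !== "completed") {
    throw new Error(`Expected a completed run, got "${result.status}"`);
  }
  return result.output;
}
```

Narrowing on `status` first keeps the test readable when a run fails or suspends, since the error names the status instead of reporting a deep-equality mismatch.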

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@keystrokehq/cli",
-  "version": "0.1.15",
+  "version": "0.1.16",
   "repository": {
     "type": "git",
     "url": "git+https://github.com/keystrokehq/keystroke.git",
@@ -40,9 +40,9 @@
     "tsx": "^4.22.3",
     "typescript": "^6.0.3",
     "vitest": "^4.1.7",
-    "@keystrokehq/oxlint-config": "0.0.4",
     "@keystrokehq/tsconfig": "0.0.3",
     "@keystrokehq/tsdown-config": "0.0.3",
+    "@keystrokehq/oxlint-config": "0.0.4",
     "@keystrokehq/vitest-config": "0.0.5"
   },
   "scripts": {

package/dist/skills-bundle/skills/keystroke-cli/references/api-targets.md DELETED

@@ -1,87 +0,0 @@
-# API target routing
-The CLI resolves a `baseUrl` before runtime commands (`workflow`, `agent`, `trigger`, `app`, `connect`, `health`). Implementation: `apps/cli/src/resolve-api-target.ts`.
-## Mental model
-- **Local** = your **keystroke server** (`keystroke dev` / `keystroke start`). Use for authoring, unit/integration tests in the repo, and manual runs while iterating.
-- **Cloud** = **keystroke platform** control plane routes to a deployed keystroke server (`/api/projects/:projectId/*`). Use for live invocation, listing triggers on deploy, webhook URLs, and auditing production runs.
-Auth is independent of target. `keystroke auth login` stores a bearer token in the OS keychain keyed by `webUrl` hostname and persists matching `webUrl` / `platformUrl` in config. Cloud API calls attach that token; local dev often runs without auth unless `BETTER_AUTH_SECRET` is set.
-## Config (`~/.keystroke`)
-| Key               | Default                    | Role                             |
-| ----------------- | -------------------------- | -------------------------------- |
-| `platformUrl`     | `https://api.keystroke.ai` | Keystroke platform control plane |
-| `webUrl`          | `https://keystroke.ai` | Dashboard + device login         |
-| `activeProjectId` | unset                      | Deployed project id (persistent) |
-| `apiTarget`       | inferred                   | `local` or `platform`            |
-Local commands use `PUBLIC_PLATFORM_URL` from the project `.env` (dev default `http://localhost:3002` when unset). They do not read `platformUrl` when `apiTarget=local`, so cloud login does not redirect local workflow runs to production.
-If `apiTarget` is unset and `activeProjectId` exists, effective target is **platform** (backward compatible). Explicit `apiTarget=local` overrides that without clearing `activeProjectId`.
-`keystroke config show` prints effective `apiTarget` and any active dev session.
-## Switching without losing cloud state
-```bash
-keystroke deploy --project proj_abc     # activeProjectId=proj_abc, apiTarget=platform
-keystroke config use local              # apiTarget=local; proj_abc unchanged
-keystroke workflow run foo --input '{}' # → localhost:3002
-keystroke config use cloud              # apiTarget=platform; same proj_abc
-keystroke trigger list                  # → platform runtime
-```
-Swap cloud project:
-```bash
-keystroke config use project proj_xyz
-```
-Target another project once:
-```bash
-keystroke --project proj_xyz workflow runs list my-workflow
-```
-## Dev session auto-local
-`keystroke dev` writes `dev-session.json` beside config (pid, port, serverUrl). While that process is alive, resolved target is **local** even if `apiTarget=platform`. Stopping dev removes the session; routing falls back to config.
-`--project` still wins over the dev session when you need to hit cloud while dev is up.
-## Global flags
-| Flag             | Effect                                                                     |
-| ---------------- | -------------------------------------------------------------------------- |
-| `--local`        | Force local API (`PUBLIC_PLATFORM_URL` / project `.env`) for this command                  |
-| `--project <id>` | Force platform runtime for that project; does not update `activeProjectId` |
-## Platform projects
-Platform projects are deploy targets on the control plane. List or create them before the first deploy:
-```bash
-keystroke project list
-keystroke project create --name "My app"
-keystroke deploy --project <id>              # activates runtime; sets activeProjectId
-keystroke config use project <id>            # swap default cloud target
-```
-New projects start **inactive**. The CLI hints with `keystroke deploy --project <id>` when appropriate.
-## When agents should choose local
-- User is editing `src/` and wants to test a workflow or agent change
-- `keystroke dev` or `keystroke start` is running (or should be started first)
-- Debugging with fast iteration; no deploy mentioned
-## When agents should choose cloud
-- User asks about deployed triggers, webhook URLs, or production runs
-- After `keystroke deploy` or when operating an existing platform project
-- Listing or invoking resources on the live environment
-If unsure, run `keystroke config show` and prefer **local** when the task is development/testing.