@codemation/agent-skills 0.2.0 → 0.4.0

@@ -2,277 +2,77 @@
2
2
  name: codemation-workflow-dsl
3
3
  description: Guides Codemation workflow authoring. Use when creating or updating workflow definitions in `src/workflows` — manual-trigger flows via `workflow("...").manualTrigger(...)`, or cron/webhook/other triggers via `createWorkflowBuilder({id, name}).trigger(...)`.
4
4
  compatibility: Designed for Codemation apps and plugins that author workflows.
5
+ tags: workflow, dsl, authoring
6
+ uses: "@codemation/core-nodes, @codemation/host"
5
7
  ---
6
8
 
7
9
  # Codemation Workflow DSL
8
10
 
9
- ## Use this skill when
11
+ ## Mental model
10
12
 
11
- Authoring or reviewing workflow definitions under `src/workflows/`.
13
+ A workflow definition describes how items move from a trigger through downstream node steps. Items carry data in `item.json`; earlier outputs are available through `ctx.data`. Activations are batch-shaped, but most node steps execute per item. Every workflow definition finishes with `.build()`, which validates node ids and throws a `WorkflowDefinitionError` on a duplicate or empty id.
12
14
 
13
- Do not use this skill for CLI-only troubleshooting or deep host architecture questions unless they directly affect workflow authoring.
15
+ ## When to use / when NOT
14
16
 
15
- ## Discovering nodes and patterns
17
+ Use this skill when authoring or reviewing workflow definitions under `src/workflows/`.
18
+ Do not use for CLI-only troubleshooting or deep host architecture questions unless they directly affect workflow authoring.
16
19
 
17
- **Always call `find_examples` first** when you need to learn how to use a node or build a workflow pattern.
18
-
19
- ### Why examples are the canonical reference
20
-
21
- Examples in the catalog typecheck, lint, and are verified by CI. They show the exact import paths, constructor signatures, and DSL shape that work in a real project — more efficiently than reading schema definitions or grepping framework source.
22
-
23
- ### When to call `find_examples` first
24
-
25
- - Before writing any workflow that uses an unfamiliar node.
26
- - When you need a pattern (polling, branching, sub-workflow, agent with tools, etc.) and aren't sure of the exact API.
27
- - As your first step — before `read_skill`, before `search_capabilities`, before reading any file.
28
-
29
- ### Query patterns
30
-
31
- Call `find_examples` in two ways:
32
-
33
- ```ts
34
- // By node name:
35
- find_examples({ query: "HttpRequest" });
36
- find_examples({ query: "AIAgent" });
37
- find_examples({ query: "CronTrigger" });
38
-
39
- // By use case / intent:
40
- find_examples({ query: "poll API and write to database" });
41
- find_examples({ query: "AIAgent multi-step pipeline" });
42
- find_examples({ query: "gmail trigger classify email" });
43
- ```
44
-
45
- Mix both: `find_examples({ query: "AIAgent gmail classify" })` works too.
46
-
47
- ### Install state in results
48
-
49
- Every search result includes `installed: boolean` and `requiresInstall: string[]`. Use these to plan installs (`install_package`) before adapting an example. If `installed` is `false` or `requiresInstall` is non-empty, call `install_package` for each missing package before writing any workflow code that imports them.
50
-
51
- ### When find_examples returns zero hits
52
-
53
- Stop. Do not improvise from memory. Do one of:
54
-
55
- 1. **Ask the user**: "I don't have an example for `<query>`. Would you like me to adapt the closest match (`<nearest>`) or should a proper example be added first?"
56
- 2. **Adapt the closest near-miss** — only with the user's explicit confirmation that the approach is reasonable.
57
-
58
- Do not attempt to infer node behavior by grepping framework source code (e.g. `node_modules/@codemation/*`). Examples convey the same information more efficiently and are authoritative.
59
-
60
- ## When no example matches — the self-solving fallback chain
61
-
62
- If `find_examples` returns no good match for your query, **do not ask the user**. The user is non-technical and can't help you pick between framework primitives. Solve it using this fixed chain:
63
-
64
- ### Tier 1 — Retry with intent variations
65
-
66
- Re-query with the underlying intent: a different verb, a more generic term, the closest standard pattern. Example: no hit for `"google sheets append row"` → retry `"http POST bearer credential"` or `"REST API call with credential"`.
67
-
68
- ### Tier 2 — Custom REST node (preferred for HTTP APIs)
69
-
70
- If the task is "call an external HTTP API," use `defineRestNode`. Always works.
71
-
72
- `find_examples({ query: "defineRestNode" })` → returns the canonical templates:
73
-
74
- - `custom-rest-node-simple.example.ts` — basic shape
75
- - `custom-rest-node-with-credential.example.ts` — with bearer/OAuth credential slot
76
-
77
- Adapt these to the specific endpoint + payload shape needed.
78
-
79
- ### Tier 3 — Raw HttpRequest (inline, one-off)
80
-
81
- If the call is one-shot inline in a workflow and you don't need to define a reusable node, use the `HttpRequest` config class.
82
-
83
- `find_examples({ query: "HttpRequest" })` → `node-httprequest.example.ts`
84
-
85
- ### Tier 4 — defineNode (non-HTTP custom logic)
86
-
87
- If the task isn't an HTTP call (data transformation, business logic, anything stateful), use `defineNode`.
88
-
89
- `find_examples({ query: "defineNode template" })` → `custom-node-template.example.ts`
90
-
91
- ### What NOT to do
92
-
93
- - Do NOT ask the user "should I use HttpRequest or defineRestNode?" — they can't help; pick using the chain.
94
- - Do NOT grep `node_modules/@codemation/*` for node implementations — the templates above are the canonical reference.
95
- - Do NOT invent a custom solution outside this chain.
96
-
97
- ### Surfacing what you did
98
-
99
- After building, your final message to the concierge should state the technique used, e.g.:
100
-
101
- > "Built using `defineRestNode` for the Google Sheets append call (no first-class Sheets node yet)."
102
-
103
- This is informational, not a request for approval.
104
-
105
- ## There are TWO authoring APIs — pick by trigger type
106
-
107
- | Trigger | API to use | Import | Available chain helpers |
108
- | ----------------------------------------------------------- | ------------------------------------------------------------------ | --------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
109
- | **Manual** (one-shot, optionally seeded with default items) | `workflow("id").manualTrigger(...)` | `import { workflow } from "@codemation/host"` | Full fluent sugar: `.map`, `.if`, `.switch`, `.split`, `.agent`, `.node`, `.then`, `.build` |
110
- | **Cron, Webhook, Test, or any non-manual trigger** | `createWorkflowBuilder({ id, name }).trigger(new XxxTrigger(...))` | `import { createWorkflowBuilder, CronTrigger, WebhookTrigger } from "@codemation/core-nodes"` | Low-level `.then(new SomeNodeConfig(...))` only — **no** `.map`/`.if`/`.agent`/`.node` sugar |
111
-
112
- **Why two APIs?** `workflow("...")` returns a `WorkflowAuthoringBuilder` that _only_ exposes `.name()` and `.manualTrigger(...)`. Once you call `.manualTrigger(...)`, you get a `WorkflowChain` that has all the fluent helpers. For any other trigger, you must use the lower-level `createWorkflowBuilder({id, name}).trigger(new Trigger(...))` path — the result is a `ChainCursor` whose only chain method is `.then(new NodeConfig(...))`. You compose by passing node config classes directly: `new Callback(...)`, `new HttpRequest(...)`, `new AIAgent(...)`, `new If(...)`, `new Split(...)`, etc.
113
-
114
- If you find yourself wanting `.map` or `.if` on a cron workflow, you have two options: (a) accept the verbose `.then(new Callback(...))` style, or (b) wrap the cron-trigger cursor explicitly: `new WorkflowChain(builder.trigger(new CronTrigger(...)))` — but this is rare in practice; production cron workflows use plain `.then(new ConfigClass(...))`.
115
-
116
- ## Core mental model
117
-
118
- 1. A workflow definition describes how items move from a trigger through downstream steps.
119
- 2. Activations are **batch-shaped** (`Items`); many steps use **per-item** execution (`execute`, including helper **`defineNode`**) with optional **`inputSchema`** and **`itemExpr`** on config fields. Batch reshape steps (split/filter/aggregate, **`defineBatchNode`**) work on the whole batch.
120
- 3. Fluent callback helpers (manual-trigger only) follow the runtime item contract: `.map(...)`, `.if(...)`, and `.switch({ resolveCaseKey })` receive `(item, ctx)`. Row fields live under `item.json`; earlier completed outputs are available through `ctx.data`.
121
- 4. Finish every workflow definition with `.build()`.
122
-
123
- ## Authoring rules
124
-
125
- 1. **Pick the API by trigger type** (see table above). Don't try to call `.trigger(...)` on the `workflow(...)` builder — it doesn't exist there.
126
- 2. Keep workflow files focused on orchestration and named steps.
127
- 3. Use custom nodes when a callback grows into reusable product logic.
128
- 4. Distinguish **batch activations** from **per-item node bodies**: custom nodes from **`defineNode`** implement **`execute`** per item unless you chose **`defineBatchNode`** for batch **`run`**.
129
- 5. **Collection nodes (`collectionInsertNode`, `collectionGetNode`, `collectionListNode`, etc.) use `.then(node.create(...))` instead of `.node(label, node, opts)`.** TypeScript's inference can't bridge the recursive `ParamDeep` constraint when the node config contains `z.record(...)` fields. See `node-collection-crud.example.ts` for the canonical pattern.
130
-
131
- ## Node ids and stability
132
-
133
- Every node in a workflow definition has an `id`. When no explicit `id:` is given, `WorkflowBuilder` derives one by slugifying the node's `name` label: lowercase, non-alphanumeric runs replaced with `-`, trimmed. `"Send Email"` becomes `"send-email"`.
134
-
135
- `.build()` throws `WorkflowDefinitionError` if any node ends up with an empty id (blank label and no explicit `id`) or if two nodes share the same id. The check covers agent connection children (model + tools) as well.
136
-
137
- For nodes that hold credential bindings, the binding is keyed by `(workflowId, nodeId, slotKey)`. Renaming a node's label changes its slug-derived id and orphans the binding — the operator must re-attach the credential in the UI. Prefer stable labels or set an explicit `id:` on credential-using nodes:
20
+ ## Quickstart: pick API by trigger type
138
21
 
139
22
  ```ts
140
- .node("Send notification", SendEmailNodeConfig, {
141
- id: "send-notification", // stable even if the label is later renamed
142
- // ...
143
- })
144
- ```
145
-
146
- ### Collision gotcha — set explicit ids on every node
147
-
148
- Auto-derived ids can also **collide** when a trigger and a downstream node share a label. Example:
149
-
150
- ```ts
151
- // ❌ Auto-derived ids collide: both slugify to "classify-feedback"
152
- workflow("wf.feedback")
153
- .manualTrigger("Classify feedback", {
154
- /* ... */
155
- })
156
- .agent("Classify feedback", {
157
- /* ... */
23
+ // Manual trigger — full fluent sugar (.map, .if, .switch, .agent, .node, .then)
24
+ import { workflow } from "@codemation/host";
25
+ export default workflow("wf.example")
26
+ .manualTrigger("Start", {
27
+ /* seed items */
158
28
  })
159
- .build(); // throws WorkflowDefinitionError: duplicate nodeId "classify-feedback"
29
+ .map(/* ... */)
30
+ .build();
160
31
 
161
- // Explicit id on the AIAgent disambiguates
162
- workflow("wf.feedback")
163
- .manualTrigger("Classify feedback", {
164
- /* ... */
165
- })
166
- .agent("Classify feedback", { id: "classify-feedback-agent" /* ... */ })
32
+ // Cron / webhook / any other trigger — low-level .then(new NodeConfig(...)) only
33
+ import { createWorkflowBuilder, CronTrigger } from "@codemation/core-nodes";
34
+ export default createWorkflowBuilder({ id: "wf.example", name: "Example" })
35
+ .trigger(new CronTrigger("Daily", { schedule: "0 9 * * *", timezone: "UTC" }))
36
+ .then(/* new SomeNodeConfig(...) */)
167
37
  .build();
168
38
  ```
169
39
 
170
- **Recommendation: always set an explicit `id:` on every node.** It's a few extra characters that buys you:
171
-
172
- 1. Stable credential bindings across label renames (above)
173
- 2. No collision build errors when refactoring labels
174
- 3. Stable references for any downstream code that addresses nodes by id (e.g. pinned-output state, test-suite assertions, audit-log entries)
175
-
176
- The slug-derived default exists for quick prototyping; production workflows should declare ids.
177
-
178
- ## Typical flow
40
+ For full patterns (multi-step pipelines, branching, SubWorkflow, binary, agent tools, TestTrigger, and complete working examples), use your harness's example-discovery tool: `find_examples({ query: "..." })`. Useful queries: `"CronTrigger"`, `"if branch"`, `"AIAgent multi-step"`, `"SubWorkflow binary"`, `"TestTrigger assertion"`.
179
41
 
180
- **Manual trigger (fluent):**
42
+ ## Decision branches & gotchas
181
43
 
182
- 1. `workflow("wf.example.id")`.
183
- 2. `.name("Display name")` (optional — defaults to the id).
184
- 3. `.manualTrigger("Start", { /* default item json */ })`.
185
- 4. Chain transformations: `.map(...)`, `.if(...)`, `.switch(...)`, `.split(...)`, `.agent(...)`, `.node(...)`, `.then(...)`.
186
- 5. `.build()`.
44
+ **Two authoring APIs — pick by trigger type.** `workflow("id").manualTrigger(...)` returns a `WorkflowChain` with full fluent helpers (`.map`, `.if`, `.switch`, `.split`, `.agent`, `.node`). `createWorkflowBuilder({id, name}).trigger(new XxxTrigger(...))` returns a `ChainCursor` whose only chain method is `.then(new NodeConfig(...))`. Do NOT call `.trigger(...)` on the `workflow(...)` builder — it doesn't exist there.
187
45
 
188
- **Cron / webhook (low-level):**
46
+ **Node ids and stability.** When no explicit `id:` is given, the engine slugifies the node's `name` label (lowercase, non-alphanumeric runs → `-`, trimmed). `"Send Email"` → `"send-email"`. For nodes that hold credential bindings, the binding is keyed by `(workflowId, nodeId, slotKey)`, so renaming a label changes the slug-derived id and orphans the binding. **Set explicit `id:` on every credential-using node.** `.build()` throws `WorkflowDefinitionError` on empty or duplicate ids.
189
47
 
190
- 1. `createWorkflowBuilder({ id: "wf.example.id", name: "Display name" })`.
191
- 2. `.trigger(new CronTrigger("Label", { schedule, timezone }))` or `.trigger(new WebhookTrigger("Label", { endpointKey, methods }))`.
192
- 3. Chain with `.then(new SomeNodeConfig(...))` repeatedly. Common configs: `Callback`, `HttpRequest`, `AIAgent`, `If`, `Split`, `Merge`, `SubWorkflow`.
193
- 4. `.build()`.
48
+ **Id collision pitfall.** A manual-trigger label and a downstream agent label that share the same string both slugify to the same id — `.build()` throws. Fix: add `id: "...-agent"` to disambiguate.
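
The slug rule and the duplicate-id check described above can be sketched in a few lines. This is a minimal illustration, not the engine's actual code: `slugifyLabel` and `findDuplicateIds` are hypothetical helper names, and the exact character classes the real slugifier uses are an assumption.

```ts
// Hypothetical helper: lowercase, collapse each run of non-alphanumerics to "-", trim.
function slugifyLabel(label: string): string {
  return label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // one "-" per run of non-alphanumeric characters
    .replace(/^-+|-+$/g, ""); // drop leading and trailing "-"
}

// Hypothetical helper: returns every id that appears more than once.
function findDuplicateIds(ids: string[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const id of ids) {
    if (seen.has(id)) duplicates.add(id);
    seen.add(id);
  }
  return Array.from(duplicates);
}

// A trigger and an agent that share a label derive the same id.
const derivedIds = ["Classify feedback", "Classify feedback"].map(slugifyLabel);
const collisions = findDuplicateIds(derivedIds); // ["classify-feedback"]
```

An explicit `id:` on either node removes it from the slug path, which is why the collision disappears.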
194
49
 
195
- ## Built-in triggers
50
+ **Collection nodes** use `.then(node.create(...))` instead of `.node(label, node, opts)` — TypeScript can't infer the `ParamDeep` constraint via the fluent helper. See `find_examples({ query: "collection crud" })`.
196
51
 
197
- - **`ManualTrigger`**: one-shot manual run, optionally seeded with default items. Use the fluent shortcut: `workflow("id").manualTrigger(name, items?)`. The shortcut internally wires up `createWorkflowBuilder(...).trigger(new ManualTrigger(...))` and wraps the result in `WorkflowChain` so you get the full fluent sugar.
198
- - **`WebhookTrigger`** — fires on an incoming HTTP request. Construct with `new WebhookTrigger(name, { endpointKey, methods })`. Attach via `createWorkflowBuilder({id, name}).trigger(new WebhookTrigger(...))`.
199
- - **`CronTrigger`** — fires on a cron schedule. Construct with `new CronTrigger(name, { schedule, timezone? })`. Attach via `createWorkflowBuilder({id, name}).trigger(new CronTrigger(...))`. The expression is validated at workflow build time. Each tick emits one item: `{ firedAt: string, scheduledFor: string }` (both ISO-8601). Defaults to UTC — always supply `timezone` for DST-sensitive schedules.
52
+ **Install state in example results.** Every `find_examples` result includes `installed: boolean` and `requiresInstall: string[]`. If `installed` is `false` or `requiresInstall` is non-empty, call `install_package` for each missing package before writing any workflow code that imports them.
200
53
 
201
- ## Agent tools (callable helpers)
54
+ **When no example matches — self-solving fallback chain.**
202
55
 
203
- - For **inline** agent tools in workflow files (no separate `@tool()` class), use **`callableTool(...)`** from `@codemation/core`: supply `name`, Zod `inputSchema` / `outputSchema`, and `execute({ input, item, ctx, ... })`. **`CallableToolFactory.callableTool(...)`** is the same implementation if you prefer the factory style.
204
- - Prefer **plugin `Tool` classes** when the tool is reusable across packages; use **`AgentToolFactory.asTool(...)`** when exposing an existing runnable node to the agent.
56
+ 1. Retry with intent variations (different verb, more generic term).
57
+ 2. For HTTP APIs: `find_examples({ query: "defineRestNode" })` covers basic and credential-slotted REST.
58
+ 3. For one-shot inline HTTP: `find_examples({ query: "HttpRequest" })`.
59
+ 4. For non-HTTP custom logic: `find_examples({ query: "defineNode template" })`.
60
+ Do NOT ask the user to pick between primitives — they can't help; use the chain. Do NOT grep `node_modules/@codemation/*` for node implementations — examples are authoritative. Surface the technique used in your reply.
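
One way to read tiers 2 to 4 of the chain above is as a small decision function. The sketch below is an interpretation under stated assumptions: `TaskShape`, its two flags, and `pickFallbackTechnique` are hypothetical names, and tier 1 (retrying the query) is assumed to have already failed.

```ts
type Technique = "defineRestNode" | "HttpRequest" | "defineNode";

// Hypothetical description of the task at hand.
interface TaskShape {
  callsHttpApi: boolean; // does the step call an external HTTP API?
  oneOffInline: boolean; // is it a single inline call with no need for a reusable node?
}

function pickFallbackTechnique(task: TaskShape): Technique {
  if (!task.callsHttpApi) return "defineNode"; // non-HTTP custom logic
  return task.oneOffInline ? "HttpRequest" : "defineRestNode";
}

const choice = pickFallbackTechnique({ callsHttpApi: true, oneOffInline: false });
```

The returned name is also the query to pass to `find_examples` and the technique to surface in the final reply.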
205
61
 
206
- ## Workflow agent authoring
62
+ **Workflow testing.** Three built-in nodes from `@codemation/core-nodes`: `TestTrigger` (yields one item per test case), `IsTestRun` (routes `true`/`false` by `ctx.testContext`), `Assertion` (emits `AssertionResult[]`, sets `emitsAssertions: true`). See `references/workflow-testing.md` for authoring details.
207
63
 
208
- - Use `.agent(...)` for fluent workflow-defined agent steps.
209
- - Define agent messages with `messages`, not a workflow-specific prompt shortcut.
210
- - Use a static `messages` array for fixed prompts.
211
- - Use `itemExpr(...)` when agent messages depend on the current item.
212
- - Use fluent `.map((item, ctx) => ...)` when workflow data itself needs reshaping before the agent step.
213
- - `model` may be a provider string such as `"openai:gpt-4o-mini"` or a `ChatModelConfig`.
64
+ **SubWorkflow binary.** `item.binary` slots pass transparently through SubWorkflow boundaries in both directions — no special config needed. Both runs share the same `BinaryStorage` singleton.
214
65
 
215
- ## Workflow testing nodes
66
+ **Verify your workflow.** Call `verify_workflow({ path: "src/workflows/my-workflow.ts" })` instead of running `pnpm typecheck` yourself. Returns `{ ok, data: { typecheck, lint, build, structure }, hint? }`.
216
67
 
217
- Codemation ships first-class **workflow tests**: each test case is one full workflow run, persisted with assertion records. Three nodes from `@codemation/core-nodes`:
68
+ ## Anti-patterns
218
69
 
219
- 1. **`TestTrigger`**: drop alongside live triggers. Author callback `generateItems(ctx)` returns an `AsyncIterable<Item>`; the orchestrator dispatches one workflow run per yielded item with `executionOptions.testContext` set. `triggerKind: "test"` is set automatically — live activation skips it.
220
- 2. **`IsTestRun`** — per-item router with `true` / `false` ports. Routes `true` iff `ctx.testContext` is set. Use it to skip side-effects in tests (don't actually send a real reply).
221
- 3. **`Assertion`**: generic callback emitter; returns `AssertionResult[]`. Each result is `{ name, score: 0..1, passThreshold?, errored?, expected?, actual?, message?, details? }`; pass/fail derives from `score >= (passThreshold ?? 0.5)` (use `score: 1`/`0` for boolean checks, set `passThreshold` for continuous metrics, `errored: true` for assertion-code crashes). Each result becomes one emitted item on `main` and one persisted `TestAssertion` row when running inside a test. Sets `emitsAssertions: true` so the host persister identifies it.
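
The pass/fail rule quoted above can be written out directly. This is a sketch only: `AssertionResultSketch` is a hypothetical, trimmed-down interface using the field names from the text, and how `errored` interacts with pass/fail is not stated in the source, so it is left out.

```ts
// Hypothetical subset of the result shape described above.
interface AssertionResultSketch {
  name: string;
  score: number; // expected to lie in 0..1
  passThreshold?: number; // falls back to 0.5 when omitted
}

// Pass/fail as described: score >= (passThreshold ?? 0.5).
function passes(result: AssertionResultSketch): boolean {
  return result.score >= (result.passThreshold ?? 0.5);
}

const booleanCheck = passes({ name: "reply is polite", score: 1 }); // true
const metricCheck = passes({ name: "similarity", score: 0.7, passThreshold: 0.8 }); // false
```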
222
-
223
- Authors invoke a TestSuiteRun from the canvas **Tests tab** or via `POST /api/workflows/:id/test-suite-runs`. The orchestrator caps concurrency (default 4, configurable per trigger) and aggregates results into `succeeded | failed | partial | cancelled | errored`.
224
-
225
- Custom nodes can also read `ctx.testContext?.{testSuiteRunId, testCaseIndex}` directly — useful for synthetic outputs in test mode without `IsTestRun` branching.
226
-
227
- ## Binary slots across SubWorkflow boundaries
228
-
229
- `item.binary` (the map of named `BinaryAttachment` records) is carried transparently through SubWorkflow boundaries in both directions:
230
-
231
- - **Parent → child**: binary slots attached before the SubWorkflow node are visible inside the child run. `ctx.binary.openReadStream(attachment)` works in the child because both runs share the same `BinaryStorage`.
232
- - **Child → parent**: slots attached inside the child are returned with the item and visible in the parent's continuation nodes.
233
-
234
- This requires no special configuration in production — the shared `BinaryStorage` DI singleton is what makes cross-run byte reads possible.
235
-
236
- ### SubWorkflow + binary example (manual trigger)
237
-
238
- ```ts
239
- import { workflow } from "@codemation/host";
240
- import { Callback, SubWorkflow } from "@codemation/core-nodes";
241
-
242
- // Manual-trigger flow — uses the fluent `.map`/`.then` sugar.
243
- export default workflow("wf.parent")
244
- .manualTrigger<{ url: string }>("Start", { url: "" })
245
- // Attach a binary slot before the sub-workflow:
246
- .map(async (item, ctx) => {
247
- const att = await ctx.binary.attach({
248
- name: "doc",
249
- body: Buffer.from("..."),
250
- mimeType: "application/pdf",
251
- filename: "doc.pdf",
252
- });
253
- return ctx.binary.withAttachment(item, "doc", att);
254
- })
255
- // Sub-workflow receives item with binary["doc"] populated:
256
- .then(new SubWorkflow("ParseDoc", { workflowId: "wf.child" }))
257
- // Continuation: both parent "doc" slot and any child-added slots are visible here.
258
- .map((item) => item)
259
- .build();
260
- ```
70
+ - Do not call `.trigger(...)` on the `workflow(...)` manual builder; use `createWorkflowBuilder(...)` for non-manual triggers.
71
+ - Do not rely on slug-derived node ids for production workflows with credential bindings; always set an explicit `id:`.
72
+ - Do not improvise from memory when `find_examples` returns zero hits — use the fallback chain above.
261
73
 
262
74
  ## Read next when needed
263
75
 
264
- - Read `references/builder-patterns.md` for item-flow rules and fluent authoring patterns.
265
- - Read `references/workflow-testing.md` for TestTrigger / IsTestRun / Assertion authoring with full examples.
266
- - Read `references/complete-example.md` for a single dense end-to-end workflow example that exercises most authoring features (CronTrigger, map, if, agent, callableTool, itemExpr, ctx.data, ctx.binary, node with explicit id, build).
267
-
268
- ## Verifying your workflow
269
-
270
- After writing or modifying a workflow file, call `verify_workflow({ path })` instead of running `pnpm typecheck` yourself. The tool runs typecheck + lint + DSL build + structure dump in one round-trip and returns a structured envelope:
271
-
272
- ```ts
273
- verify_workflow({ path: "src/workflows/my-workflow.ts" });
274
- // → { ok: true, data: { typecheck: "ok", lint: "ok", build: "ok", structure: { id, name, trigger, nodes, edges, activation } } }
275
- // → { ok: false, error: "...", data: { typecheck: {...}, lint: {...}, build: {...}, structure: null }, hint: "..." }
276
- ```
277
-
278
- A failed `ok: false` result includes a `hint` field that points at the specific fix needed. Fix the reported errors and call `verify_workflow` again — do not report done until `ok: true`.
76
+ - `references/builder-patterns.md`: item-flow rules and fluent authoring patterns.
77
+ - `references/workflow-testing.md`: TestTrigger / IsTestRun / Assertion with full examples.
78
+ - `references/complete-example.md`: dense end-to-end example covering most authoring features.
@@ -0,0 +1,142 @@
1
+ ---
2
+ name: codemation-workspace-files
3
+ description: ListWorkspaceFiles + ReadWorkspaceFile nodes — read files from the shared workspace pool. Covers read-by-filename (latest-wins), pinned fileId, binary slot handoff, and the raw-upload → concierge-digests → workflow-reads-derived-file pattern. Read before building any workflow that reads workspace files.
4
+ compatibility: Codemation core-nodes-workspace-files. Requires WORKSPACE_ID and BLOB_STORAGE_* env vars.
5
+ tags: workspace, files, binary, storage, read, csv, json
6
+ uses: "@codemation/core-nodes-workspace-files"
7
+ ---
8
+
9
+ # Codemation Workspace Files
10
+
11
+ ## Mental model
12
+
13
+ Workflows **read** the shared workspace file pool; they do **not** write to it. Files are
14
+ created and managed on the control-plane side (the Files UI, the concierge, the
15
+ DocumentScanner). The framework's role is to provide `ListWorkspaceFiles` and
16
+ `ReadWorkspaceFile` as pure read nodes.
17
+
18
+ The **headline scenario** is: a user uploads a raw PDF; the concierge digests it into a
19
+ structured JSON; the workflow reads the _derived JSON_, not the raw bytes. Workflows
20
+ never touch raw uploads directly.
21
+
22
+ ## When to use / when NOT
23
+
24
+ Use `ReadWorkspaceFile` when a workflow needs data that lives in the workspace pool
25
+ (pricing sheets, config JSON, concierge-derived documents, CSV exports).
26
+
27
+ Use `ListWorkspaceFiles` to discover what files exist or to drive a fan-out (one item per file).
28
+
29
+ Do NOT use these nodes to write files — writing is mediated by the control plane (CP) and deferred to v2.
30
+
31
+ Do NOT base64-encode bytes onto `item.json`. Binary payloads always flow through
32
+ `item.binary` via `ctx.binary`.
33
+
34
+ ## Quickstart
35
+
36
+ ```ts
37
+ import { readWorkspaceFileNode } from "@codemation/core-nodes-workspace-files";
38
+
39
+ // Read the latest "pricing.csv" by name — picks up the newest upload automatically.
40
+ readWorkspaceFileNode.create({ filename: "pricing.csv", binarySlot: "data" }, "Read pricing CSV", "read-pricing-csv");
41
+ ```
42
+
43
+ ```ts
44
+ // Pin to an exact version — a later upload never changes what this reads.
45
+ readWorkspaceFileNode.create(
46
+ { fileId: "abc123def456", binarySlot: "data" },
47
+ "Read pinned pricing CSV",
48
+ "read-pricing-pinned",
49
+ );
50
+ ```
51
+
52
+ For full patterns (parse the bytes, scenario walkthrough, list + filter), use your
53
+ harness's example-discovery tool: `find_examples({ query: "workspace files" })`.
54
+
55
+ ## Resolution modes
56
+
57
+ | Mode | Config | Behaviour |
58
+ | ------------------------- | ------------------------- | -------------------------------------------------------------------------------------------------- |
59
+ | **latest-wins** (default) | `filename: "pricing.csv"` | Reads the **newest** file with that name. Next upload of the same name is what the next run reads. |
60
+ | **pinned fileId** | `fileId: "abc123..."` | Reads that exact, immutable version forever. A new upload never changes this ref. |
61
+
62
+ Use **latest-wins** for "always use the current sheet" patterns.
63
+ Use **pinned fileId** for reproducible/auditable runs (e.g., regression tests, compliance audits).
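
The two resolution modes can be illustrated with a small in-memory model. This is a sketch under assumptions, not the node's implementation: `WorkspaceFileMeta`, `ReadRef`, and `resolveFile` are hypothetical names, and "newest" is assumed to mean the greatest `lastModified` timestamp.

```ts
interface WorkspaceFileMeta {
  fileId: string;
  filename: string;
  lastModified: string; // ISO 8601
}

interface ReadRef {
  filename?: string;
  fileId?: string;
}

function resolveFile(files: WorkspaceFileMeta[], ref: ReadRef): WorkspaceFileMeta | undefined {
  // Pinned mode: an explicit fileId takes precedence over filename.
  if (ref.fileId !== undefined) {
    return files.find((f) => f.fileId === ref.fileId);
  }
  if (ref.filename === undefined) {
    throw new Error("Either filename or fileId must be set");
  }
  // Latest-wins mode: newest file with the requested name.
  const matches = files.filter((f) => f.filename === ref.filename);
  matches.sort((a, b) => Date.parse(b.lastModified) - Date.parse(a.lastModified));
  return matches[0];
}

const pool: WorkspaceFileMeta[] = [
  { fileId: "v1", filename: "pricing.csv", lastModified: "2024-01-01T00:00:00Z" },
  { fileId: "v2", filename: "pricing.csv", lastModified: "2024-02-01T00:00:00Z" },
];
const latest = resolveFile(pool, { filename: "pricing.csv" }); // the "v2" record
const pinned = resolveFile(pool, { fileId: "v1" }); // the "v1" record, regardless of later uploads
```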
64
+
65
+ ## Binary slot handoff
66
+
67
+ `ReadWorkspaceFile` streams the file's bytes into `item.binary[binarySlot]` (default `"data"`).
68
+ The node emits:
69
+
70
+ ```ts
71
+ {
72
+ fileId: string;
73
+ filename: string;
74
+ contentType: string;
75
+ size: number; // bytes
76
+ lastModified: string; // ISO 8601
77
+ binarySlot: string; // e.g. "data"
78
+ }
79
+ ```
80
+
81
+ Downstream nodes read the bytes via `ctx.binary.openReadStream(item.binary["data"])`.
82
+ The bytes are **never** base64-encoded on `item.json`.
83
+
84
+ ## Concierge → digest → workflow pattern
85
+
86
+ This is the intended headline flow:
87
+
88
+ ```
89
+ User uploads PDF → CP Files UI stores it in the workspace pool
90
+ Concierge sees upload → DocumentScanner digests it → writes "report-digested.json" back
91
+ Workflow runs (schedule/webhook) → ReadWorkspaceFile("report-digested.json") → acts
92
+ ```
93
+
94
+ The workflow is **decoupled** from the upload event. It reads the _derived_ file that the
95
+ concierge produced, not the raw upload. The concierge's job is to bridge the raw-upload world
96
+ and the structured-data world.
97
+
98
+ Key boundaries:
99
+
100
+ - **CP side (write)**: raw file ingest, concierge digest, derived file write, Files UI.
101
+ - **Workflow side (read)**: `ReadWorkspaceFile` + `ListWorkspaceFiles` only.
102
+
103
+ ## Anti-patterns
104
+
105
+ - Do NOT tell users to read the raw PDF upload in a workflow — point at the concierge-derived JSON.
106
+ - Do NOT base64-encode file bytes onto `item.json` — use `item.binary[slot]` + `ctx.binary`.
107
+ - Do NOT attempt to write a file from a workflow node — there is no write surface in v1.
108
+ - Do NOT assume `WORKSPACE_ID` is always set — in local dev without CP integration, the storage
109
+ token resolves to `undefined`. Add a guard if your workflow runs in dev mode.
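
A dev-mode guard for the last point might look like the sketch below. `requireWorkspaceId` is a hypothetical helper, and checking the environment variable directly (rather than the resolved storage token) is an assumption about where the guard is easiest to place.

```ts
// Hypothetical guard: fail fast with a clear message when WORKSPACE_ID is missing.
// The environment is passed in explicitly so the function stays easy to test.
function requireWorkspaceId(env: Record<string, string | undefined>): string {
  const id = env["WORKSPACE_ID"];
  if (id === undefined || id.trim() === "") {
    throw new Error("WORKSPACE_ID is not set; workspace file nodes cannot resolve storage here");
  }
  return id;
}

const workspaceId = requireWorkspaceId({ WORKSPACE_ID: "ws-local" });
```

In a real workflow project the call would typically receive `process.env`.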
110
+
111
+ ## Node reference
112
+
113
+ ### `listWorkspaceFilesNode`
114
+
115
+ ```ts
116
+ listWorkspaceFilesNode.create(
117
+ {
118
+ filenameFilter?: string; // optional substring match (case-insensitive)
119
+ },
120
+ "List files",
121
+ "list-files",
122
+ )
123
+ ```
124
+
125
+ Output per item: `{ fileId, filename, contentType, size, lastModified }`. Sorted newest-first.
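
The filter and ordering described for `listWorkspaceFilesNode` can be modelled as follows. This is an illustrative sketch: `ListedFile` and `listFiles` are hypothetical names, and sorting by `lastModified` is an assumption about what "newest-first" means.

```ts
interface ListedFile {
  fileId: string;
  filename: string;
  contentType: string;
  size: number; // bytes
  lastModified: string; // ISO 8601
}

// Case-insensitive substring filter, then newest-first ordering. The input array is not mutated.
function listFiles(files: ListedFile[], filenameFilter?: string): ListedFile[] {
  const needle = filenameFilter === undefined ? undefined : filenameFilter.toLowerCase();
  return files
    .filter((f) => needle === undefined || f.filename.toLowerCase().indexOf(needle) !== -1)
    .sort((a, b) => Date.parse(b.lastModified) - Date.parse(a.lastModified));
}

const listed = listFiles(
  [
    { fileId: "a", filename: "Pricing-2023.csv", contentType: "text/csv", size: 10, lastModified: "2023-06-01T00:00:00Z" },
    { fileId: "b", filename: "report.json", contentType: "application/json", size: 20, lastModified: "2024-03-01T00:00:00Z" },
    { fileId: "c", filename: "pricing-2024.csv", contentType: "text/csv", size: 30, lastModified: "2024-01-15T00:00:00Z" },
  ],
  "PRICING",
);
```

Feeding each returned record's `fileId` into a read step is one way to drive the one-item-per-file fan-out mentioned earlier.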
126
+
127
+ ### `readWorkspaceFileNode`
128
+
129
+ ```ts
130
+ readWorkspaceFileNode.create(
131
+ {
132
+ filename?: string; // latest-wins resolution
133
+ fileId?: string; // pinned resolution (takes precedence over filename)
134
+ binarySlot?: string; // default: "data"
135
+ maxBytes?: number; // default: 100 MiB — raise for large files
136
+ },
137
+ "Read file",
138
+ "read-file",
139
+ )
140
+ ```
141
+
142
+ Either `filename` or `fileId` must be set. Output: metadata JSON + bytes in `item.binary[binarySlot]`.