@codemation/agent-skills 0.3.0 → 0.5.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (46)
  1. package/CHANGELOG.md +182 -0
  2. package/dist/metadata.json +383 -36
  3. package/package.json +3 -1
  4. package/skills/builder/ai-agent/SKILL.md +314 -0
  5. package/skills/builder/ai-agent/references/anti-patterns.md +24 -0
  6. package/skills/{codemation-cli → builder/cli}/SKILL.md +1 -8
  7. package/skills/builder/connect-external-systems/SKILL.md +191 -0
  8. package/skills/builder/credential-development/SKILL.md +86 -0
  9. package/skills/{codemation-credential-development → builder/credential-development}/references/credential-patterns.md +3 -3
  10. package/skills/builder/custom-node-development/SKILL.md +61 -0
  11. package/skills/builder/custom-node-development/references/credential-aware-nodes.md +52 -0
  12. package/skills/builder/custom-node-development/references/define-batch-node.md +54 -0
  13. package/skills/{codemation-custom-node-development → builder/custom-node-development}/references/define-node-per-item.md +14 -14
  14. package/skills/{codemation-custom-node-development → builder/custom-node-development}/references/node-patterns.md +33 -49
  15. package/skills/builder/document-ai/SKILL.md +167 -0
  16. package/skills/builder/execution-context/SKILL.md +436 -0
  17. package/skills/{codemation-framework-concepts → builder/framework-concepts}/SKILL.md +10 -18
  18. package/skills/builder/gmail/SKILL.md +327 -0
  19. package/skills/builder/human-in-the-loop/SKILL.md +82 -0
  20. package/skills/{codemation-mcp-capabilities → builder/mcp-capabilities}/SKILL.md +4 -11
  21. package/skills/{codemation-mcp-capabilities → builder/mcp-capabilities}/references/agent-with-mcp.ts +1 -1
  22. package/skills/builder/msgraph/SKILL.md +338 -0
  23. package/skills/builder/odoo/SKILL.md +498 -0
  24. package/skills/{codemation-plugin-development → builder/plugin-development}/SKILL.md +4 -7
  25. package/skills/{codemation-plugin-development → builder/plugin-development}/references/plugin-anatomy.md +36 -15
  26. package/skills/{codemation-plugin-development → builder/plugin-development}/references/plugin-structure.md +2 -2
  27. package/skills/builder/rest-node/SKILL.md +148 -0
  28. package/skills/builder/testing/SKILL.md +142 -0
  29. package/skills/builder/workflow-dsl/SKILL.md +493 -0
  30. package/skills/builder/workspace-files/SKILL.md +191 -0
  31. package/skills/concierge/credentials/SKILL.md +91 -0
  32. package/skills/concierge/intake-automation-playbook/SKILL.md +78 -0
  33. package/skills/concierge/scenario-invoice-to-accounting/SKILL.md +48 -0
  34. package/skills/concierge/scenario-procurement-intake/SKILL.md +58 -0
  35. package/skills/codemation-ai-agent-node/SKILL.md +0 -66
  36. package/skills/codemation-ai-agent-node/references/anti-patterns.md +0 -11
  37. package/skills/codemation-credential-development/SKILL.md +0 -57
  38. package/skills/codemation-custom-node-development/SKILL.md +0 -61
  39. package/skills/codemation-custom-node-development/references/credential-aware-nodes.md +0 -38
  40. package/skills/codemation-custom-node-development/references/define-batch-node.md +0 -38
  41. package/skills/codemation-workflow-dsl/SKILL.md +0 -78
  42. package/skills/codemation-workflow-dsl/references/builder-patterns.md +0 -120
  43. package/skills/codemation-workflow-dsl/references/complete-example.md +0 -263
  44. package/skills/codemation-workflow-dsl/references/workflow-testing.md +0 -194
  45. package/skills/{codemation-cli → builder/cli}/references/command-map.md +0 -0
  46. package/skills/{codemation-framework-concepts → builder/framework-concepts}/references/architecture-map.md +0 -0
@@ -6,7 +6,7 @@ A credential binding is stored as `(workflowId, nodeId, slotKey)`. The `nodeId`
6
6
 
7
7
  For production workflows with credential-using nodes, prefer an explicit `id:` on the node config:
8
8
 
9
- ```ts
9
+ ```text
10
10
  .node("Fetch from API", MyApiNodeConfig, {
11
11
  id: "fetch-from-api", // stable across label renames
12
12
  credentials: { apiKey: myApiCredential },
@@ -38,7 +38,7 @@ Register the credential type from the app or plugin boundary:
38
38
 
39
39
  Helper-defined nodes can request credentials directly:
40
40
 
41
- ```ts
41
+ ```text
42
42
  credentials: {
43
43
  myService: myServiceCredential,
44
44
  }
@@ -59,7 +59,7 @@ See **`packages/core/docs/credential-ui-fields.md`** in the repository root layo
59
59
 
60
60
  For credentials that go through the OAuth2 redirect flow (Microsoft Graph, Slack, GitHub, Notion, etc.), declare the authorize and token URLs directly on the credential's `auth` definition. The host's `OAuth2ProviderRegistry` substitutes `{publicFieldKey}` placeholders from the credential's public config at connect time (URL-encoded).
61
61
 
62
- ```ts
62
+ ```text
63
63
  auth: {
64
64
  kind: "oauth2",
65
65
  // providerId is a free-form label for telemetry / DB rows / Better Auth provider naming.
@@ -0,0 +1,61 @@
1
+ ---
2
+ name: custom-node-development
3
+ description: Authors a reusable Codemation node with defineNode(...) (per-item execute) or defineBatchNode(...) (batch run), including credential slots, binary payloads, and the class-based fallback. Use when creating or updating custom nodes in an app or plugin package.
4
+ tags: node, custom, plugin
5
+ uses: "@codemation/core"
6
+ ---
7
+
8
+ # Codemation Custom Node Development
9
+
10
+ Custom nodes are the extension point for reusable business logic that doesn't belong inline in a workflow callback. `defineNode(...)` wraps a per-item `execute` function with a typed contract (config, credential slots, output shape); the engine calls it once per item. `defineBatchNode(...)` is the batch variant for logic that must see all items at once. A node definition exposes `.create(config, name?, id?)` to wire it into a workflow.
11
+
12
+ ## Per-item vs batch
13
+
14
+ **`defineNode(...)` (per-item)** — the engine calls `execute(args, context)` once per item. This is the right default for the vast majority of nodes: straightforward logic, credential slots, input schema, optional fan-out.
15
+
16
+ **`defineBatchNode(...)` (batch)** — the engine calls `run(items, context)` with the full activation batch. Use only when the node genuinely needs to see all items at once (aggregation, bulk API calls, cross-item correlation).
17
+
18
+ When in doubt, start with `defineNode`.
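The difference between the two calling conventions can be sketched outside the framework. The snippet below is a minimal illustration only: `runPerItem` and `runBatch` are hypothetical stand-ins for what the engine does, not functions from `@codemation/core`.

```typescript
// Hypothetical stand-ins for the engine's two calling conventions.
// Neither function exists in @codemation/core; they only model the behavior described above.
type Json = Record<string, unknown>;

// Per-item: the callback sees one item at a time.
function runPerItem<TIn extends Json, TOut>(
  items: readonly TIn[],
  execute: (input: TIn, itemIndex: number) => TOut,
): TOut[] {
  return items.map((input, itemIndex) => execute(input, itemIndex));
}

// Batch: the callback sees the whole activation batch in a single call.
function runBatch<TIn extends Json, TOut>(
  items: readonly TIn[],
  run: (items: readonly TIn[]) => TOut[],
): TOut[] {
  return run(items);
}

const orders = [{ amount: 10 }, { amount: 30 }, { amount: 60 }];

// Fits per-item: each output depends only on its own input.
const withCents = runPerItem(orders, (order) => ({ ...order, amountCents: order.amount * 100 }));

// Needs batch: each output depends on the total across all items.
const withShare = runBatch(orders, (all) => {
  const total = all.reduce((sum, order) => sum + order.amount, 0);
  return all.map((order) => ({ ...order, share: order.amount / total }));
});
```

If a callback only ever reads its own item, as `withCents` does, the per-item shape is enough. A value such as `share` cannot be computed without the other items, which is the case the batch variant is meant for.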
19
+
20
+ ## Node rules
21
+
22
+ 1. Keep nodes deterministic and focused.
23
+ 2. Request credentials through named slots — never hard-code secrets.
24
+ 3. Put **static** options (credentials, retry policy, labels) on `input` (the config defaults); read them in `execute` via the second arg's `config`.
25
+ 4. **Emit files with `ctx.binary`, not base64 in `json`** — base64 in `item.json` bloats persisted run data. See `references/node-patterns.md`.
26
+ 5. Drop to class-based node APIs only when you need constructor-injected collaborators, decorators, or deeper runtime metadata.
27
+
28
+ ## Minimal `defineNode` example
29
+
30
+ `execute(args, context)` receives `args = { input, item, itemIndex, items, ctx }` and `context = { config, credentials, execution }`. `input` is the per-item `item.json`; `config` is the resolved static config declared in `input:`.
31
+
32
+ ```ts
33
+ import { defineNode } from "@codemation/core";
34
+
35
+ export const normalizeTextField = defineNode({
36
+ key: "example.normalize-text-field",
37
+ title: "Normalize text field",
38
+ icon: "lucide:case-lower",
39
+ input: {
40
+ field: "text",
41
+ trim: true as boolean,
42
+ lowercase: true as boolean,
43
+ },
44
+ execute({ input }, { config }) {
45
+ const rawValue = String((input as Record<string, unknown>)[config.field as string] ?? "");
46
+ let normalized = rawValue;
47
+ if (config.trim) normalized = normalized.trim();
48
+ if (config.lowercase) normalized = normalized.toLowerCase();
49
+ return { ...(input as Record<string, unknown>), [config.field as string]: normalized };
50
+ },
51
+ });
52
+ ```
53
+
54
+ Wire it into a workflow with `normalizeTextField.create({ field: "text" }, "Normalize text", "normalize-text")`.
55
+
56
+ ## Read next
57
+
58
+ - `references/define-node-per-item.md` — `defineNode(...)` contract, `inputSchema`, fan-out, and assertion nodes.
59
+ - `references/define-batch-node.md` — `defineBatchNode(...)` contract and when to choose batch over per-item.
60
+ - `references/credential-aware-nodes.md` — credential slots and typed sessions.
61
+ - `references/node-patterns.md` — binary payloads (`ctx.binary`, `attach`, `withAttachment`), fan-out shapes, polling-trigger binary patterns, and HTTP binary round-trips.
@@ -0,0 +1,52 @@
1
+ # Credential-Aware Nodes
2
+
3
+ Load this when your node needs a typed credential (OAuth token, API key, or any `defineCredential(...)` type) injected at runtime.
4
+
5
+ ## Core rule
6
+
7
+ Request credentials through **named slots** on the node config instead of hard-coding secrets. The framework resolves the slot to a live typed session at execution time.
8
+
9
+ ## Adding a credential slot to `defineNode`
10
+
11
+ Declare slots in `credentials:` (slot name → credential type). Each slot becomes an accessor `credentials.<slot>()` that resolves a live, typed session at execution time:
12
+
13
+ ```ts
14
+ import { defineCredential, defineNode } from "@codemation/core";
15
+
16
+ const myApiCredentialType = defineCredential({
17
+ key: "example.my-api",
18
+ label: "My API",
19
+ public: { baseUrl: "string" },
20
+ secret: { accessToken: "password" },
21
+ createSession(args) {
22
+ return {
23
+ baseUrl: String(args.publicConfig.baseUrl ?? ""),
24
+ accessToken: String(args.material.accessToken ?? ""),
25
+ };
26
+ },
27
+ test() {
28
+ return { status: "healthy", testedAt: new Date().toISOString() };
29
+ },
30
+ });
31
+
32
+ export const callApiNode = defineNode({
33
+ key: "example.call-api",
34
+ title: "Call My API",
35
+ credentials: { api: myApiCredentialType }, // slot name → credential type
36
+ async execute(_args, { credentials }) {
37
+ const session = (await credentials.api()) as { baseUrl: string; accessToken: string };
38
+ const response = await fetch(`${session.baseUrl}/data`, {
39
+ headers: { Authorization: `Bearer ${session.accessToken}` },
40
+ });
41
+ return await response.json();
42
+ },
43
+ });
44
+ ```
45
+
46
+ ## Typed sessions
47
+
48
+ `credentials.<slot>()` returns exactly what the credential type's `createSession(...)` returns. The framework handles binding, storage, and error propagation — your node only consumes the session.
49
+
50
+ ## Testing credential-aware nodes
51
+
52
+ Inject a fake credential session through `WorkflowTestKit` rather than live credentials. See `credential-development` for the full `defineCredential(...)` story.
@@ -0,0 +1,54 @@
1
+ # Define Batch Node
2
+
3
+ Load this when you need to author a `defineBatchNode(...)` node that processes all items in one call.
4
+
5
+ ## When to use `defineBatchNode` instead of `defineNode`
6
+
7
+ - The node must see the **entire activation batch** at once (e.g. an aggregation, a bulk API call, or a node that correlates items against each other).
8
+ - Legacy batch semantics are required by the calling workflow.
9
+ - You need the same contract as built-in batch-shaped nodes such as `Aggregate`.
10
+
11
+ For the common case (one-item-at-a-time logic), prefer `defineNode` — the engine handles iteration for you.
12
+
13
+ ## Minimal skeleton
14
+
15
+ `run(items, context)` receives plain JSON values (`TInputJson[]`), not `Item` wrappers, and returns one output per row. The example below ranks rows against the full batch — impossible per-item:
16
+
17
+ ```ts
18
+ import { defineBatchNode } from "@codemation/core";
19
+
20
+ type SaleRow = Readonly<{ salesRepId: string; revenueUsd: number }>;
21
+ type RankedSaleRow = SaleRow & Readonly<{ rank: number; pctOfTotal: number }>;
22
+
23
+ export const rankSalesByRevenue = defineBatchNode<
24
+ "example.rank-sales-by-revenue",
25
+ Record<string, never>,
26
+ SaleRow,
27
+ RankedSaleRow
28
+ >({
29
+ key: "example.rank-sales-by-revenue",
30
+ title: "Rank sales by revenue",
31
+ icon: "lucide:bar-chart-2",
32
+ run(items) {
33
+ const total = items.reduce((sum, row) => sum + row.revenueUsd, 0);
34
+ const sorted = [...items].sort((a, b) => b.revenueUsd - a.revenueUsd);
35
+ const rankMap = new Map(sorted.map((row, index) => [row.salesRepId, index + 1]));
36
+ return items.map((row) => ({
37
+ ...row,
38
+ rank: rankMap.get(row.salesRepId) ?? items.length,
39
+ pctOfTotal: total > 0 ? Math.round((row.revenueUsd / total) * 10000) / 100 : 0,
40
+ }));
41
+ },
42
+ });
43
+ ```
44
+
45
+ ## Contract
46
+
47
+ - `run(items, context)` is called **once**, on the last item in the activation; intermediate items return `[]` internally.
48
+ - `items` is plain `TInputJson[]`; return an array of output JSON values.
49
+ - The context object exposes `config`, `credentials`, and `execution` (same as `defineNode`).
50
+ - Batch nodes have no `inputSchema` — type the rows through the generic parameters.
51
+
52
+ ## Advanced fallback
53
+
54
+ Reach for class-based node APIs when constructor-injected collaborators are required, plugin packaging needs the lower-level runtime contract, or decorators/persisted metadata need tighter control.
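To check the ranking logic from the skeleton without the engine, the body of `run(items)` can be lifted into a plain function. This is a sketch for illustration: `rankSales` and the sample rows are hypothetical and are not part of the package.

```typescript
type SaleRow = Readonly<{ salesRepId: string; revenueUsd: number }>;
type RankedSaleRow = SaleRow & Readonly<{ rank: number; pctOfTotal: number }>;

// Same logic as the `run(items)` callback in the skeleton, as a standalone function.
function rankSales(items: readonly SaleRow[]): RankedSaleRow[] {
  const total = items.reduce((sum, row) => sum + row.revenueUsd, 0);
  // Sort a copy so the caller's order is preserved in the output.
  const sorted = [...items].sort((a, b) => b.revenueUsd - a.revenueUsd);
  const rankMap = new Map(sorted.map((row, index) => [row.salesRepId, index + 1] as const));
  return items.map((row) => ({
    ...row,
    rank: rankMap.get(row.salesRepId) ?? items.length,
    // Percentage of the batch total, rounded to two decimals.
    pctOfTotal: total > 0 ? Math.round((row.revenueUsd / total) * 10000) / 100 : 0,
  }));
}

// Hypothetical sample data.
const ranked = rankSales([
  { salesRepId: "ana", revenueUsd: 250 },
  { salesRepId: "ben", revenueUsd: 500 },
  { salesRepId: "cy", revenueUsd: 250 },
]);
```

Two properties of this logic are worth knowing before reusing it. Tied revenues receive distinct ranks in input order, because the sort is stable. Rows that share a `salesRepId` overwrite each other in `rankMap`, so the key is assumed to be unique per batch.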
@@ -13,19 +13,27 @@ Load this when you need to author a `defineNode(...)` node that processes one it
13
13
 
14
14
  ```ts
15
15
  import { defineNode } from "@codemation/core";
16
- import { z } from "zod";
17
16
 
18
17
  export const uppercaseNode = defineNode({
19
18
  key: "example.uppercase",
20
19
  title: "Uppercase field",
21
20
  icon: "lucide:languages", // optional — Lucide, builtin:, si:, or image URL
22
- inputSchema: z.object({ field: z.string() }),
23
- async execute({ input, item }, { config }) {
24
- return { ...input, [config.field]: String(input.field).toUpperCase() };
21
+ input: { field: "text" },
22
+ execute({ input }, { config }) {
23
+ const value = String((input as Record<string, unknown>)[config.field as string] ?? "");
24
+ return { ...(input as Record<string, unknown>), [config.field as string]: value.toUpperCase() };
25
25
  },
26
26
  });
27
27
  ```
28
28
 
29
+ To validate and type the per-item payload, pass a Zod `inputSchema`:
30
+
31
+ ```text
32
+ import { z } from "zod";
33
+ inputSchema: z.object({ field: z.string() }),
34
+ // → `input` is typed; the engine validates each item before calling execute.
35
+ ```
36
+
29
37
  ## Contract
30
38
 
31
39
  - `execute(args, context)` is called **once per item** by the engine.
@@ -42,17 +50,9 @@ Place **static** options (credentials, retry policy, labels) on `config`; place
42
50
 
43
51
  Supply `inputSchema` (Zod) to get typed `input` in `execute` and to drive the canvas form. The engine validates items against it before calling `execute`.
44
52
 
45
- ## Testing with `WorkflowTestKit`
46
-
47
- ```ts
48
- import { createEngineTestKit, registerDefinedNodes } from "@codemation/core/testing";
49
-
50
- const kit = createEngineTestKit();
51
- registerDefinedNodes([uppercaseNode]);
52
- const result = await kit.runNode(uppercaseNode, { json: { field: "hello" } });
53
- ```
53
+ ## Testing
54
54
 
55
- Use `WorkflowTestKit` from `@codemation/core/testing` for engine-backed tests without the host.
55
+ Use `WorkflowTestKit` from `@codemation/core/testing` for engine-backed tests without the host: construct it, register your defined nodes through its registration context, then run a workflow that wires the node via `.create(...)` and assert on the emitted items.
56
56
 
57
57
  ## Custom assertion nodes
58
58
 
@@ -13,23 +13,21 @@ Use `defineNode(...)` when:
13
13
  ## Standard helper shape (`execute`)
14
14
 
15
15
  ```ts
16
+ import { defineNode } from "@codemation/core";
17
+
16
18
  export const uppercaseNode = defineNode({
17
19
  key: "example.uppercase",
18
20
  title: "Uppercase field",
19
21
  icon: "lucide:languages",
20
- input: {
21
- field: "string",
22
- },
22
+ input: { field: "text" },
23
23
  execute({ input }, { config }) {
24
- return {
25
- ...input,
26
- [config.field]: String(input[config.field as keyof typeof input] ?? "").toUpperCase(),
27
- };
24
+ const value = String((input as Record<string, unknown>)[config.field as string] ?? "");
25
+ return { ...(input as Record<string, unknown>), [config.field as string]: value.toUpperCase() };
28
26
  },
29
27
  });
30
28
  ```
31
29
 
32
- Optional **`icon`** is forwarded to the generated node config for the canvas (Lucide `lucide:…`, **`builtin:…`** / **`si:…`**, or image URLs). See `packages/core-nodes/src/canvasIconName.ts` and the Next host `WorkflowCanvasNodeIcon` resolver.
30
+ Optional **`icon`** is forwarded to the generated node config for the canvas (Lucide `lucide:…`, **`builtin:…`** / **`si:…`**, or image URLs).
33
31
 
34
32
  ## Batch helper shape (`defineBatchNode`)
35
33
 
@@ -88,20 +86,23 @@ Binary slots attached inside a node survive SubWorkflow boundaries with no extra
88
86
  ### Pattern: attach in a node, read in the parent after SubWorkflow
89
87
 
90
88
  ```ts
89
+ import { defineNode } from "@codemation/core";
90
+
91
91
  // Child node — attaches a slot and returns the modified item.
92
+ // Binary lives on `ctx` (the first arg), not the second context param.
92
93
  export const parseAndStoreNode = defineNode({
93
94
  key: "example.parse-store",
94
95
  title: "Parse and Store",
95
- inputSchema: z.object({ filename: z.string() }),
96
- async execute({ input, item }, { binary }) {
96
+ input: { filename: "out" },
97
+ async execute({ item, ctx }, { config }) {
97
98
  const bytes = Buffer.from("...parsed content...");
98
- const att = await binary.attach({
99
+ const att = await ctx.binary.attach({
99
100
  name: "parsed",
100
101
  body: bytes,
101
102
  mimeType: "text/plain",
102
- filename: `${input.filename}.txt`,
103
+ filename: `${config.filename}.txt`,
103
104
  });
104
- return binary.withAttachment(item, "parsed", att);
105
+ return ctx.binary.withAttachment(item, "parsed", att);
105
106
  },
106
107
  });
107
108
  ```
@@ -110,32 +111,18 @@ After `SubWorkflowNode` returns, the parent's continuation nodes see `item.binar
110
111
 
111
112
  ### Testing binary across SubWorkflow with `WorkflowTestKit`
112
113
 
113
- ```ts
114
- import { DefaultExecutionContextFactory, InMemoryBinaryStorage } from "@codemation/core";
115
- import { createEngineTestKit } from "@codemation/core/testing";
116
- import { ItemHarnessNodeConfig } from "@codemation/core/testing";
117
-
118
- const storage = new InMemoryBinaryStorage();
119
- const kit = createEngineTestKit({
120
- executionContextFactory: new DefaultExecutionContextFactory(storage),
121
- });
114
+ `@codemation/core/testing` exposes `WorkflowTestKit` and `ItemHarnessNodeConfig`. Use `ItemHarnessNodeConfig` (NOT `CallbackNodeConfig`) for harness nodes that must modify items — its callback receives `{ item, ctx }`, calls `ctx.binary.attach(...)`, and returns `ctx.binary.withAttachment(item, slot, att)`:
122
115
 
123
- // Use ItemHarnessNodeConfig (NOT CallbackNodeConfig) for nodes that must modify items:
116
+ ```text
124
117
  const attachNode = new ItemHarnessNodeConfig(
125
118
  "Attach",
126
119
  z.unknown(),
127
120
  async ({ item, ctx }) => {
128
- const att = await ctx.binary.attach({
129
- name: "doc",
130
- body: Buffer.from("content"),
131
- mimeType: "application/pdf",
132
- filename: "doc.pdf",
133
- });
134
- return ctx.binary.withAttachment(item as Item, "doc", att);
121
+ const att = await ctx.binary.attach({ name: "doc", body: Buffer.from("content"), mimeType: "application/pdf", filename: "doc.pdf" });
122
+ return ctx.binary.withAttachment(item, "doc", att);
135
123
  },
136
124
  { id: "attach" },
137
125
  );
138
- // CallbackNodeConfig is fine for assertion-only (observe) nodes — it echoes input unchanged.
139
126
  ```
140
127
 
141
128
  Important: `CallbackNodeConfig` discards its callback return value and always echoes input items. Never use it for nodes that must attach binary or transform items.
@@ -144,23 +131,17 @@ Important: `CallbackNodeConfig` discards its callback return value and always ec
144
131
 
145
132
  Use `OutlookAttachmentDownload` from `@codemation/core-nodes-msgraph` when you have already obtained attachment metadata (filename, contentType, id) and want to download only specific attachments.
146
133
 
147
- ```ts
148
- import { onNewMsGraphMailTrigger, outlookAttachmentDownloadNode } from "@codemation/core-nodes-msgraph";
149
-
150
- workflow("wf.download-resumes")
151
- .trigger(onNewMsGraphMailTrigger, { mailbox: "me", folderId: "inbox" })
152
- .then(
153
- outlookAttachmentDownloadNode.create(
154
- {
155
- messageId: "", // falls back to item.json when empty
156
- attachmentId: "", // falls back to item.json when empty
157
- binarySlot: "resume",
158
- sizeCapBytes: 10 * 1024 * 1024,
159
- },
160
- "DownloadResume",
161
- ),
162
- )
163
- .build();
134
+ ```text
135
+ // from @codemation/core-nodes-msgraph
136
+ outlookAttachmentDownloadNode.create(
137
+ {
138
+ messageId: "", // falls back to item.json when empty
139
+ attachmentId: "", // falls back to item.json when empty
140
+ binarySlot: "resume",
141
+ sizeCapBytes: 10 * 1024 * 1024,
142
+ },
143
+ "DownloadResume",
144
+ )
164
145
  ```
165
146
 
166
147
  Key constraints:
@@ -195,7 +176,7 @@ export default workflow("wf.download-pdf")
195
176
 
196
177
  ### Upload binary bytes from a slot
197
178
 
198
- ```ts
179
+ ```text
199
180
  new HttpRequest("UploadResume", {
200
181
  method: "POST",
201
182
  url: "https://api.example.com/files",
@@ -207,6 +188,9 @@ new HttpRequest("UploadResume", {
207
188
  ### Download then upload (full round-trip)
208
189
 
209
190
  ```ts
191
+ import { HttpRequest } from "@codemation/core-nodes";
192
+ import { workflow } from "@codemation/host";
193
+
210
194
  export default workflow("wf.mirror-pdf")
211
195
  .manualTrigger<{ sourceUrl: string; targetUrl: string }>("Start", { sourceUrl: "", targetUrl: "" })
212
196
  .then(new HttpRequest("Download", { urlField: "sourceUrl", responseFormat: "binary", responseBinarySlot: "file" }))
@@ -0,0 +1,167 @@
1
+ ---
2
+ name: document-ai
3
+ description: Extracts markdown text and structured fields from a document, invoice, or image with the managed codemationDocumentScannerNode — no Azure or BYOK credential needed. Use this whenever a workflow scans an attachment (PDF, receipt, photo) and reads back markdown or per-field values.
4
+ compatibility: Codemation core-nodes. Requires @codemation/core-nodes import.
5
+ tags: ocr, document, invoice, image, scan, extract, markdown, fields, confidence, managed, binary
6
+ uses: "@codemation/core-nodes"
7
+ ---
8
+
9
+ # Codemation Document Scanner
10
+
11
+ `codemationDocumentScannerNode` reads the bytes of a binary attachment off `item.binary` and returns
12
+ `{ markdown, fields }` from the **managed** Codemation doc-scanner service. It signs each call with the
13
+ workspace pairing secret over HMAC — the workspace holds **no Azure key**. This is the default for any
14
+ scanning step on the managed platform.
15
+
16
+ **Discipline:** author straight from this file, then run `verify_workflow` and fix only what it flags.
17
+ Use `workflow-dsl` for the surrounding builder, trigger, and flow-control surface.
18
+
19
+ ## A complete scanning workflow
20
+
21
+ A webhook receives a file upload; the framework places the bytes at `item.binary["data"]`; the node
22
+ analyzes them and replaces the item payload with `{ markdown, fields }`.
23
+
24
+ ```typescript
25
+ import { createWorkflowBuilder, WebhookTrigger, codemationDocumentScannerNode } from "@codemation/core-nodes";
26
+
27
+ export default createWorkflowBuilder({
28
+ id: "wf.scan-document",
29
+ name: "Scan an uploaded document",
30
+ })
31
+ .trigger(
32
+ new WebhookTrigger("Receive upload", { endpointKey: "doc-upload", methods: ["POST"] }, undefined, {
33
+ id: "receive-upload",
34
+ description: "Accepts an uploaded file and starts the scan.",
35
+ }),
36
+ )
37
+ .then(
38
+ codemationDocumentScannerNode.create(
39
+ {
40
+ binaryField: "data", // key on item.binary holding the bytes — default "data"
41
+ analyzerType: "auto", // routes on Content-Type; set explicitly when you know the class
42
+ },
43
+ "Scan document",
44
+ { id: "scan-document", description: "Reads the uploaded file and pulls out its text and key fields." },
45
+ ),
46
+ )
47
+ .build();
48
+ ```
49
+
50
+ `.create(config, label, idOrOptions)` is the `defineNode` call shape — same for every built-in node here.
51
+ The third argument takes either a bare `"nodeId"` string OR an options object `{ id, description }` — **always
52
+ pass the object so the scan step carries a plain-language `description`** (it is a node like any other; a
53
+ document/OCR step with no description is the most-forgotten gap). Set an explicit `id` whenever a downstream
54
+ node references this output or the label may change later.
55
+
56
+ ## Choosing `analyzerType`
57
+
58
+ Default is `"auto"`. Set an explicit type whenever you know the content class — it avoids re-routing
59
+ and self-documents the workflow.
60
+
61
+ | `analyzerType` | When | Field extraction |
62
+ | -------------- | ---------------------------------------------------- | ------------------ |
63
+ | `"document"` | General PDFs, Word, HTML, text-heavy files | Yes |
64
+ | `"invoice"` | Invoices and receipts | Yes |
65
+ | `"image"` | Photos, screenshots, diagrams | No (markdown only) |
66
+ | `"auto"` | Unknown mime type — `image/*` → image, else document | Depends on routing |
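The `"auto"` row can be read as a small routing rule. The sketch below mirrors the table only; `resolveAnalyzer` is a hypothetical helper, and the managed service's actual routing code is not shown in this package.

```typescript
type AnalyzerType = "document" | "invoice" | "image" | "auto";
type ResolvedAnalyzer = Exclude<AnalyzerType, "auto">;

// Hypothetical helper that follows the table: explicit types pass through,
// "auto" picks "image" for image/* content types and "document" otherwise.
function resolveAnalyzer(analyzerType: AnalyzerType, contentType: string | undefined): ResolvedAnalyzer {
  if (analyzerType !== "auto") return analyzerType;
  const mime = (contentType ?? "").trim().toLowerCase();
  return mime.startsWith("image/") ? "image" : "document";
}
```

Under this reading, `"auto"` never selects `"invoice"`, so invoice-specific field extraction has to be requested explicitly.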
67
+
68
+ ## Output shape
69
+
70
+ The node replaces the item payload with this shape (`DocScannerOutput`):
71
+
72
+ ```text
73
+ {
74
+ markdown: string; // full text rendering of the document
75
+ fields: Record<string, {
76
+ value: unknown; // scalar, ISO date string, nested object, or array
77
+ confidence: number | null; // 0–1 when includeConfidence:true; otherwise null
78
+ }>;
79
+ }
80
+ ```
81
+
82
+ `item.json.markdown` is the Markdown rendering. `item.json.fields` is a flat-or-nested map of
83
+ structured fields the analyzer found — sparse or empty for generic documents (best-effort), and
84
+ always `{}` for `analyzerType: "image"`.
85
+
86
+ ## Per-field confidence (opt-in)
87
+
88
+ By default `confidence` is `null` on every field — this keeps token cost low for the common case that
89
+ only needs `value`. Set `includeConfidence: true` to populate it:
90
+
91
+ ```typescript
92
+ import { createWorkflowBuilder, WebhookTrigger, codemationDocumentScannerNode } from "@codemation/core-nodes";
93
+
94
+ export default createWorkflowBuilder({ id: "wf.scan-invoice", name: "Scan invoice with confidence" })
95
+ .trigger(new WebhookTrigger("Receive invoice", { endpointKey: "invoice-upload", methods: ["POST"] }))
96
+ .then(
97
+ codemationDocumentScannerNode.create(
98
+ {
99
+ analyzerType: "invoice",
100
+ includeConfidence: true, // fields now carry confidence 0–1
101
+ },
102
+ "Scan invoice",
103
+ "scan-invoice",
104
+ ),
105
+ )
106
+ .build();
107
+ ```
108
+
109
+ **Cost:** enabling confidence routes to a confidence-enabled analyzer variant, roughly doubling the
110
+ contextualization token count for `document`/`invoice`. Only enable it when downstream logic reads
111
+ `field.confidence`. `image` (and auto-routed-to-image) requests ignore the flag silently — confidence
112
+ stays `null`, never a 400.
113
+
114
+ ## Consuming fields downstream
115
+
116
+ Read named fields off `item.json.fields` in a following step. Type the `Callback` input with
117
+ `DocScannerOutput` so the field map is inferred:
118
+
119
+ ```typescript
120
+ import { Callback } from "@codemation/core-nodes";
121
+ import type { DocScannerOutput } from "@codemation/core-nodes";
122
+
123
+ const useFields = new Callback<DocScannerOutput, { vendorName?: string; vendorConfidence: number | null }>(
124
+ "Use invoice fields",
125
+ (items) =>
126
+ items.map((item) => {
127
+ const vendorName = item.json.fields["VendorName"]?.value as string | undefined;
128
+ const vendorConfidence = item.json.fields["VendorName"]?.confidence ?? null;
129
+ return { ...item, json: { vendorName, vendorConfidence } };
130
+ }),
131
+ { id: "use-invoice-fields" },
132
+ );
133
+ ```
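When several fields are read, the guard on `fields[...]` and the confidence check can be factored into one helper. This is a minimal sketch under stated assumptions: `ScannerOutput` is a local copy of the shape listed under "Output shape", `readStringField` and the `0.8` threshold are hypothetical, and the sample values are made up.

```typescript
// Local copy of the documented output shape, so the sketch runs without the package.
type ScannedField = { value: unknown; confidence: number | null };
type ScannerOutput = { markdown: string; fields: Record<string, ScannedField> };

// Hypothetical helper: returns a string field only when it is present and,
// if a confidence score exists, the score is at least `minConfidence`.
function readStringField(output: ScannerOutput, name: string, minConfidence = 0.8): string | undefined {
  const field = output.fields[name];
  if (field === undefined || typeof field.value !== "string") return undefined;
  // A null confidence means the score was not requested, so it is not treated as a failure.
  if (field.confidence !== null && field.confidence < minConfidence) return undefined;
  return field.value;
}

// Made-up sample output for illustration.
const scanned: ScannerOutput = {
  markdown: "# Invoice 42",
  fields: {
    VendorName: { value: "Acme BV", confidence: 0.93 },
    InvoiceId: { value: "42", confidence: 0.41 },
    CustomerName: { value: "Globex", confidence: null },
  },
};

const vendor = readStringField(scanned, "VendorName");
const invoiceId = readStringField(scanned, "InvoiceId");
const customer = readStringField(scanned, "CustomerName");
const missing = readStringField(scanned, "PurchaseOrder");
```

Treating `null` confidence as acceptable is a design choice for workflows that leave `includeConfidence` off. A stricter workflow could reject unscored fields instead.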
134
+
135
+ ## Config reference
136
+
137
+ ```text
138
+ codemationDocumentScannerNode.create(
139
+ {
140
+ binaryField?: string; // key on item.binary — default "data"
141
+ analyzerType?: "document" | "invoice" | "image" | "auto"; // default "auto"
142
+ contentType?: string; // MIME override — falls back to the attachment's mimeType
143
+ includeConfidence?: boolean; // default false — see the cost note above
144
+ maxBytes?: number; // size cap before reading — default 50 MiB
145
+ },
146
+ label?: string, // node label on the canvas
147
+ nodeId?: string, // explicit stable id, or an options object { id, description }; set when output is used downstream
148
+ )
149
+ ```
150
+
151
+ ## Gotchas
152
+
153
+ - **Bytes come from `item.binary`, never base64 on `item.json`.** The node reads
154
+ `item.binary[binaryField]`; a webhook, Gmail, or `readWorkspaceFileNode` must have attached the bytes
155
+ to that slot upstream. Missing attachment → the node throws.
156
+ - **The output replaces the item payload.** After scanning, `item.json` is `{ markdown, fields }` — the
157
+ original payload is gone. Carry forward anything you still need before this step, or read it from a
158
+ retained binary slot.
159
+ - **`fields` is best-effort.** Guard every `fields["Name"]?.value` access; the analyzer may not find it.
160
+ - **Managed only.** The node needs `DOC_SCANNER_GATEWAY_URL` + workspace pairing in the env; it throws a
161
+ clear error when run outside a paired workspace.
162
+
163
+ ## Read next when needed
164
+
165
+ - `workflow-dsl` — builder, triggers, flow control, the per-item contract.
166
+ - `workspace-files` — read a stored file's bytes into `item.binary` before scanning.
167
+ - `ai-agent` — pass `item.json.markdown` to an LLM for summarization or extraction.