workflowskill 0.2.1 → 0.3.1
- package/README.md +147 -106
- package/dist/index.d.mts +28 -84
- package/dist/index.d.mts.map +1 -1
- package/dist/index.mjs +1782 -16
- package/dist/index.mjs.map +1 -1
- package/package.json +2 -10
- package/skill/SKILL.md +123 -345
- package/dist/cli/index.d.mts +0 -1
- package/dist/cli/index.mjs +0 -187
- package/dist/cli/index.mjs.map +0 -1
- package/dist/runtime-CD81H1bx.mjs +0 -1977
- package/dist/runtime-CD81H1bx.mjs.map +0 -1
- package/dist/web-scrape-GeEM_JNl.mjs +0 -142
- package/dist/web-scrape-GeEM_JNl.mjs.map +0 -1
package/skill/SKILL.md
CHANGED
|
@@ -1,19 +1,13 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: workflow-author
|
|
3
3
|
description: Generate valid WorkflowSkill YAML from natural language descriptions. Teaches Claude Code to author executable workflow definitions.
|
|
4
|
-
version: 0.1.0
|
|
5
|
-
tags:
|
|
6
|
-
- workflow
|
|
7
|
-
- automation
|
|
8
|
-
- authoring
|
|
9
|
-
- code-generation
|
|
10
4
|
---
|
|
11
5
|
|
|
12
6
|
# WorkflowSkill Author
|
|
13
7
|
|
|
14
8
|
You are a workflow authoring assistant. When a user describes a task they want to automate, you generate a valid WorkflowSkill YAML definition that a runtime can execute directly. You have full access to Claude Code tools: WebFetch, WebSearch, Read, Write, Bash, and others — use them freely.
|
|
15
9
|
|
|
16
|
-
**Sections:** Authoring Process | YAML Structure | Step Type Reference | Output Resolution | Expression Language | Iteration Patterns |
|
|
10
|
+
**Sections:** Authoring Process | YAML Structure | Step Type Reference | Output Resolution | Expression Language | Iteration Patterns | Authoring Rules | Output Format | Validation
|
|
17
11
|
|
|
18
12
|
## How WorkflowSkill Works
|
|
19
13
|
|
|
@@ -23,28 +17,31 @@ A WorkflowSkill is a declarative workflow definition embedded in a SKILL.md file
|
|
|
23
17
|
- **Outputs**: Typed results the workflow produces
|
|
24
18
|
- **Steps**: An ordered sequence of operations
|
|
25
19
|
|
|
26
|
-
Each step is one of
|
|
20
|
+
Each step is one of four types:
|
|
27
21
|
|
|
28
|
-
| Type
|
|
29
|
-
|
|
30
|
-
| `tool`
|
|
31
|
-
| `
|
|
32
|
-
| `
|
|
33
|
-
| `
|
|
34
|
-
|
|
22
|
+
| Type | Purpose |
|
|
23
|
+
| ------------- | -------------------------------------------------------------------------------- |
|
|
24
|
+
| `tool` | Invoke a registered tool via the host's ToolAdapter (APIs, functions, LLM calls) |
|
|
25
|
+
| `transform` | Filter, map, or sort data |
|
|
26
|
+
| `conditional` | Branch execution based on a condition |
|
|
27
|
+
| `exit` | Terminate the workflow early with a status |
|
|
28
|
+
|
|
29
|
+
All external calls — including LLM inference — go through `tool` steps. The runtime itself has no LLM dependency. The host registers whatever tools are available in the deployment context.
|
|
35
30
|
|
|
36
31
|
## Authoring Process
|
|
37
32
|
|
|
38
33
|
The user should never have to think about workflow internals. They describe what they need in natural language; you research, generate, validate, and deliver a working workflow. No proposal step, no asking for confirmation mid-flow. The output should feel like magic.
|
|
39
34
|
|
|
40
35
|
### Phase 1: Understand
|
|
36
|
+
|
|
41
37
|
- Read the request carefully. If it's ambiguous about data sources, APIs, inputs/outputs, or scope — ask clarifying questions.
|
|
42
38
|
- Ask at most 2-3 focused questions at a time. Offer specific options.
|
|
43
39
|
- Bad: "What do you want to do?" Good: "Should results be filtered by date, category, or both?"
|
|
44
40
|
- If the request is clear, skip directly to Research.
|
|
45
41
|
|
|
46
42
|
### Phase 2: Research
|
|
47
|
-
|
|
43
|
+
|
|
44
|
+
- **Confirm available tools first.** The tools available in `tool` steps are the tools registered in the current runtime context — in an interactive agent session, these are the tools listed in your context. No built-in tools are provided by the runtime. All tool names depend on what the host registers. Do not assume any specific tool exists.
|
|
48
45
|
- If the workflow involves APIs, web services, or web scraping, investigate before generating:
|
|
49
46
|
1. **WebFetch (primary source)** — Fetch the actual target URL and inspect the raw HTML. This is the ground truth. Look for:
|
|
50
47
|
- The repeating container element (e.g., `li.result-row`, `div.job-card`)
|
|
@@ -57,19 +54,20 @@ The user should never have to think about workflow internals. They describe what
|
|
|
57
54
|
- **Do not guess selectors.** If you cannot verify the HTML structure, tell the user what you need.
|
|
58
55
|
|
|
59
56
|
### Phase 3: Generate
|
|
57
|
+
|
|
60
58
|
Design the workflow internally following this checklist, then write the `.md` file:
|
|
61
59
|
|
|
62
|
-
1. **Identify data sources** — What tools or APIs are needed? These become `tool` steps.
|
|
63
|
-
2. **Identify
|
|
64
|
-
3. **Identify
|
|
65
|
-
4. **Identify
|
|
66
|
-
5. **
|
|
67
|
-
6. **
|
|
68
|
-
7. **Add error handling** — Mark non-critical steps with `on_error: ignore`. Add `retry` policies for flaky APIs.
|
|
60
|
+
1. **Identify data sources and operations** — What tools or APIs are needed? These become `tool` steps. All external calls (including LLM inference) are tool steps.
|
|
61
|
+
2. **Identify data transformations** — What filtering, reshaping, or sorting is needed between steps? These become `transform` steps.
|
|
62
|
+
3. **Identify decision points** — Where does execution branch? These become `conditional` steps.
|
|
63
|
+
4. **Identify exit conditions** — When should the workflow stop early? These become `exit` steps with `condition` guards.
|
|
64
|
+
5. **Wire steps together** — Use `$steps.<id>.output` references to connect outputs to inputs.
|
|
65
|
+
6. **Add error handling** — Mark non-critical steps with `on_error: ignore`. Add `retry` policies for flaky APIs.
|
|
69
66
|
|
|
70
67
|
Write the workflow `.md` file using the Write tool.
|
|
71
68
|
|
|
72
69
|
### Phase 4: Validate & Test
|
|
70
|
+
|
|
73
71
|
- Validate the workflow against the runtime. If validation fails, fix the errors and revalidate.
|
|
74
72
|
- Run the workflow to verify it works end-to-end.
|
|
75
73
|
- For workflows with conditional exits, test both execution paths (e.g., "results found" vs. "no results"). If the primary path targets data that might currently be empty, test with known data to verify the non-empty path works.
|
|
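Reviewer sketch: the Phase 3 checklist in the hunk above, written against the four 0.3.1 step types. This is an illustration under stated assumptions, not text from the package. The tool name `api.list_items` is borrowed from the examples elsewhere in this file and stands in for whatever the host registers. The `min_score` input, the `score` field, and the `{ items: [...] }` response shape are hypothetical.

```yaml
inputs:
  api_url:
    type: string
    default: "https://api.example.com/items"
  min_score:                # hypothetical input, for illustration only
    type: int
    default: 10

outputs:
  items:
    type: array
    value: $steps.sort_items.output.items   # checklist item 5: wire the last step to the output

steps:
  # Checklist item 1: a data source becomes a tool step.
  - id: get_listing
    type: tool
    tool: api.list_items    # placeholder; use a tool name the host actually registers
    retry: { max: 3, delay: "2s", backoff: 1.5 }    # checklist item 6: retry for a flaky API
    inputs:
      url: { type: string, value: $inputs.api_url }
    outputs:
      items: { type: array, value: $result.items }  # assumes the tool returns { items: [...] }

  # Checklist item 2: filtering and sorting become transform steps.
  - id: keep_high_scores
    type: transform
    operation: filter
    where: $item.score >= $inputs.min_score
    inputs:
      items: { type: array, value: $steps.get_listing.output.items }
    outputs:
      items: { type: array }

  - id: sort_items
    type: transform
    operation: sort
    field: score
    direction: desc
    inputs:
      items: { type: array, value: $steps.keep_high_scores.output.items }
    outputs:
      items: { type: array }
```

Checklist items 3 and 4 (decision points and exit conditions) are left out here to keep the sketch short; a workflow that can return no data would normally add an `exit` guard after `get_listing`.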
@@ -78,41 +76,42 @@ Write the workflow `.md` file using the Write tool.
|
|
|
78
76
|
## YAML Structure
|
|
79
77
|
|
|
80
78
|
```yaml
|
|
81
|
-
inputs:
|
|
79
|
+
inputs: # object keyed by name — NOT an array
|
|
82
80
|
<name>:
|
|
83
81
|
type: string | int | float | boolean | array | object
|
|
84
|
-
default: <optional>
|
|
82
|
+
default: <optional> # default value for optional inputs
|
|
85
83
|
|
|
86
|
-
outputs:
|
|
84
|
+
outputs: # object keyed by name — NOT an array
|
|
87
85
|
<name>:
|
|
88
86
|
type: string | int | float | boolean | array | object
|
|
89
|
-
value: <$expression>
|
|
87
|
+
value: <$expression> # optional — resolves from $steps context after all steps
|
|
90
88
|
|
|
91
89
|
steps:
|
|
92
90
|
- id: <unique_identifier>
|
|
93
|
-
type: tool |
|
|
91
|
+
type: tool | transform | conditional | exit
|
|
94
92
|
description: <what this step does>
|
|
95
93
|
# Type-specific fields (see below)
|
|
96
|
-
inputs:
|
|
94
|
+
inputs: # object keyed by name (the field is "inputs", not "params")
|
|
97
95
|
<name>:
|
|
98
|
-
type: <type>
|
|
99
|
-
value: <$expression or literal>
|
|
96
|
+
type: <type> # required
|
|
97
|
+
value: <$expression or literal> # the value: expression ($-prefixed) or literal
|
|
100
98
|
outputs:
|
|
101
99
|
<name>:
|
|
102
100
|
type: <type>
|
|
103
|
-
value: <$expression>
|
|
101
|
+
value: <$expression> # optional — maps from $result (raw executor result)
|
|
104
102
|
# Optional common fields:
|
|
105
|
-
condition: <expression>
|
|
106
|
-
each: <expression>
|
|
107
|
-
delay: "<duration>"
|
|
108
|
-
on_error: fail | ignore
|
|
103
|
+
condition: <expression> # guard: skip if false
|
|
104
|
+
each: <expression> # iterate over array
|
|
105
|
+
delay: "<duration>" # inter-iteration pause (requires each). e.g., "1s", "500ms"
|
|
106
|
+
on_error: fail | ignore # default: fail
|
|
109
107
|
retry:
|
|
110
|
-
max: <int>
|
|
111
|
-
delay: "<duration>"
|
|
108
|
+
max: <int> # not "max_attempts"
|
|
109
|
+
delay: "<duration>" # e.g., "1s", "500ms" — not "backoff_ms"
|
|
112
110
|
backoff: <float>
|
|
113
111
|
```
|
|
114
112
|
|
|
115
113
|
**Step input rules:**
|
|
114
|
+
|
|
116
115
|
- Every step input requires `type`.
|
|
117
116
|
- Use `value` for both expressions and literals. Strings starting with `$` are auto-detected as expressions.
|
|
118
117
|
- Expressions: `value: $inputs.query`, `value: $steps.prev.output.field`
|
|
@@ -124,38 +123,22 @@ steps:
|
|
|
124
123
|
## Step Type Reference
|
|
125
124
|
|
|
126
125
|
### Tool Step
|
|
126
|
+
|
|
127
127
|
```yaml
|
|
128
128
|
- id: fetch_data
|
|
129
129
|
type: tool
|
|
130
|
-
tool: api.
|
|
130
|
+
tool: api.get_items
|
|
131
131
|
inputs:
|
|
132
|
-
|
|
132
|
+
url:
|
|
133
133
|
type: string
|
|
134
|
-
value: $inputs.
|
|
135
|
-
|
|
136
|
-
|
|
137
|
-
|
|
138
|
-
value: $result.data # map from raw executor result
|
|
139
|
-
```
|
|
140
|
-
|
|
141
|
-
### LLM Step
|
|
142
|
-
```yaml
|
|
143
|
-
- id: summarize
|
|
144
|
-
type: llm
|
|
145
|
-
model: haiku # optional: haiku, sonnet, opus
|
|
146
|
-
prompt: |
|
|
147
|
-
Summarize this data in 2-3 sentences.
|
|
148
|
-
Data: $steps.fetch_data.output.result
|
|
149
|
-
|
|
150
|
-
Write only the summary — no formatting, no preamble.
|
|
151
|
-
inputs:
|
|
152
|
-
data:
|
|
153
|
-
type: object
|
|
154
|
-
value: $steps.fetch_data.output.result
|
|
134
|
+
value: $inputs.url
|
|
135
|
+
limit:
|
|
136
|
+
type: int
|
|
137
|
+
value: 50
|
|
155
138
|
outputs:
|
|
156
|
-
|
|
157
|
-
type:
|
|
158
|
-
value: $result
|
|
139
|
+
results:
|
|
140
|
+
type: array
|
|
141
|
+
value: $result.items # map from raw executor result
|
|
159
142
|
```
|
|
160
143
|
|
|
161
144
|
### Transform Step
|
|
@@ -163,6 +146,7 @@ steps:
|
|
|
163
146
|
Transform steps operate on **arrays only** (filter, map, sort). They require an `items` input of type `array` and always output an `items` array. Do NOT use transform steps to extract fields from a single object — use an exit step with `$`-references for that.
|
|
164
147
|
|
|
165
148
|
**filter:**
|
|
149
|
+
|
|
166
150
|
```yaml
|
|
167
151
|
- id: filter_items
|
|
168
152
|
type: transform
|
|
@@ -178,6 +162,7 @@ Transform steps operate on **arrays only** (filter, map, sort). They require an
|
|
|
178
162
|
```
|
|
179
163
|
|
|
180
164
|
### Transform Step (map)
|
|
165
|
+
|
|
181
166
|
```yaml
|
|
182
167
|
- id: reshape
|
|
183
168
|
type: transform
|
|
@@ -215,15 +200,16 @@ When you have parallel arrays from different steps that need to be combined into
|
|
|
215
200
|
type: array
|
|
216
201
|
```
|
|
217
202
|
|
|
218
|
-
This is a pure data operation — never use
|
|
203
|
+
This is a pure data operation — never use a tool step for merging or zipping arrays when a transform step suffices.
|
|
219
204
|
|
|
220
205
|
### Transform Step (sort)
|
|
206
|
+
|
|
221
207
|
```yaml
|
|
222
208
|
- id: sort_results
|
|
223
209
|
type: transform
|
|
224
210
|
operation: sort
|
|
225
211
|
field: score
|
|
226
|
-
direction: desc
|
|
212
|
+
direction: desc # or asc (default)
|
|
227
213
|
inputs:
|
|
228
214
|
items:
|
|
229
215
|
type: array
|
|
@@ -256,6 +242,7 @@ Use exit steps for **conditional early termination** — to stop the workflow wh
|
|
|
256
242
|
`status` must be `success` or `failed` — those are the only valid values.
|
|
257
243
|
|
|
258
244
|
Early exit on empty result (success):
|
|
245
|
+
|
|
259
246
|
```yaml
|
|
260
247
|
- id: early_exit
|
|
261
248
|
type: exit
|
|
@@ -269,6 +256,7 @@ Early exit on empty result (success):
|
|
|
269
256
|
```
|
|
270
257
|
|
|
271
258
|
Early exit on error condition (failed):
|
|
259
|
+
|
|
272
260
|
```yaml
|
|
273
261
|
- id: guard_empty
|
|
274
262
|
type: exit
|
|
@@ -284,10 +272,10 @@ For normal workflow output, prefer `value` on workflow outputs instead of a trai
|
|
|
284
272
|
|
|
285
273
|
## Output Resolution
|
|
286
274
|
|
|
287
|
-
| Context
|
|
288
|
-
|
|
289
|
-
| Step output `value`
|
|
290
|
-
| Workflow output `value` | `$steps.<id>.output` | After all steps complete
|
|
275
|
+
| Context | Reference | When resolved |
|
|
276
|
+
| ----------------------- | -------------------- | ------------------------------- |
|
|
277
|
+
| Step output `value` | `$result` | Immediately after step executes |
|
|
278
|
+
| Workflow output `value` | `$steps.<id>.output` | After all steps complete |
|
|
291
279
|
|
|
292
280
|
Workflow outputs use `value` to map data from step results:
|
|
293
281
|
|
|
@@ -295,10 +283,11 @@ Workflow outputs use `value` to map data from step results:
|
|
|
295
283
|
outputs:
|
|
296
284
|
title:
|
|
297
285
|
type: string
|
|
298
|
-
value: $steps.fetch.output.title
|
|
286
|
+
value: $steps.fetch.output.title # resolved after all steps complete
|
|
299
287
|
```
|
|
300
288
|
|
|
301
289
|
**Resolution rules:**
|
|
290
|
+
|
|
302
291
|
1. **Normal completion** — each workflow output with `value` (an expression) is resolved from the final runtime context using `$steps` references.
|
|
303
292
|
2. **Exit step fires** — the exit step's `output` takes precedence. Its keys are matched against the declared workflow output keys.
|
|
304
293
|
3. **No value, no exit** — outputs are matched by key name against the last executed step's output (legacy behavior).
|
|
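Reviewer sketch: how resolution rules 1 and 2 from the hunk above might interact in a single workflow. This is an illustration under stated assumptions, not text from the package. The tool name and the `{ items: [...] }` response shape are placeholders, and the `.length` check and the empty `inputs`/`outputs` on the exit step follow the guard example that appears in the 0.2.1 side of this diff.

```yaml
inputs:
  api_url:
    type: string
    default: "https://api.example.com/items"

outputs:
  results:
    type: array
    value: $steps.fetch.output.items   # rule 1: resolved from $steps after all steps complete

steps:
  - id: fetch
    type: tool
    tool: api.list_items      # placeholder; use a tool name the host actually registers
    inputs:
      url: { type: string, value: $inputs.api_url }
    outputs:
      items: { type: array, value: $result.items }  # $result is only valid here, on step outputs

  - id: guard_empty
    type: exit
    condition: $steps.fetch.output.items.length == 0
    status: success
    output: { results: [] }   # rule 2: key matches the declared workflow output "results"
    inputs: {}
    outputs: {}
```

If `guard_empty` fires, its `output` supplies `results` and the workflow-level `value` expression is not used. Otherwise the workflow runs to completion and `results` is resolved from `$steps.fetch.output.items`.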
@@ -309,52 +298,27 @@ outputs:
|
|
|
309
298
|
|
|
310
299
|
```yaml
|
|
311
300
|
outputs:
|
|
312
|
-
|
|
313
|
-
type:
|
|
314
|
-
value: $result.
|
|
301
|
+
results:
|
|
302
|
+
type: array
|
|
303
|
+
value: $result.items # maps from the tool's raw response
|
|
315
304
|
```
|
|
316
305
|
|
|
317
306
|
This is useful when the raw executor result has a different shape than what downstream steps need. Outputs without `value` pass through from the raw result by key name.
|
|
318
307
|
|
|
319
|
-
**LLM step outputs require `value`.** LLM steps return the model's raw text (parsed as JSON when valid). Without `value`, downstream `$steps.<id>.output.<key>` references fail for plain text responses.
|
|
320
|
-
|
|
321
|
-
**Default: plain text with `value: $result`.** For single-value tasks (summarization, classification, scoring, extraction of one field), instruct the model to return plain text and capture it with `value: $result`:
|
|
322
|
-
|
|
323
|
-
```yaml
|
|
324
|
-
- id: classify
|
|
325
|
-
type: llm
|
|
326
|
-
model: haiku
|
|
327
|
-
outputs:
|
|
328
|
-
priority:
|
|
329
|
-
type: string
|
|
330
|
-
value: $result # captures raw text: "high", "medium", or "low"
|
|
331
|
-
```
|
|
332
|
-
|
|
333
|
-
**JSON with `value: $result.field` — only when the LLM must generate multiple fields that each require reasoning.** If the output has multiple fields but only one requires LLM judgment, use plain text for the LLM and a `map` transform to zip the LLM output with structural data:
|
|
334
|
-
|
|
335
|
-
```yaml
|
|
336
|
-
- id: analyze
|
|
337
|
-
type: llm
|
|
338
|
-
outputs:
|
|
339
|
-
analysis:
|
|
340
|
-
type: object
|
|
341
|
-
value: $result # parsed JSON object
|
|
342
|
-
```
|
|
343
|
-
|
|
344
308
|
## Expression Language
|
|
345
309
|
|
|
346
310
|
Use `$`-prefixed references to wire data between steps:
|
|
347
311
|
|
|
348
|
-
| Reference
|
|
349
|
-
|
|
350
|
-
| `$inputs.name`
|
|
351
|
-
| `$steps.<id>.output`
|
|
352
|
-
| `$steps.<id>.output.field`
|
|
353
|
-
| `$item`
|
|
354
|
-
| `$index`
|
|
355
|
-
| `$result`
|
|
356
|
-
| `$steps.<id>.output.field[0]` | First element of an array field
|
|
357
|
-
| `$item[$index]`
|
|
312
|
+
| Reference | Resolves To |
|
|
313
|
+
| ----------------------------- | ----------------------------------------------------------------- |
|
|
314
|
+
| `$inputs.name` | Workflow input parameter |
|
|
315
|
+
| `$steps.<id>.output` | A step's full output |
|
|
316
|
+
| `$steps.<id>.output.field` | A specific field from output |
|
|
317
|
+
| `$item` | Current item in `each` or transform iteration |
|
|
318
|
+
| `$index` | Current index in iteration |
|
|
319
|
+
| `$result` | Raw executor result (only valid in step output `value`) |
|
|
320
|
+
| `$steps.<id>.output.field[0]` | First element of an array field |
|
|
321
|
+
| `$item[$index]` | Nested array element at computed index (only valid inside `each`) |
|
|
358
322
|
|
|
359
323
|
Operators: `==`, `!=`, `>`, `<`, `>=`, `<=`, `&&`, `||`, `!`, `contains`
|
|
360
324
|
|
|
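Reviewer sketch: the hunk above removes the guidance for `llm` step outputs, and the 0.3.1 text routes LLM inference through `tool` steps instead. The fragment shows one way a 0.2.1 `summarize` step might be re-expressed. It is an assumption-laden illustration: `llm.complete`, its `instruction` and `data` inputs, and the `{ text: ... }` response shape are all hypothetical, since the runtime ships no built-in tools and every tool name depends on the host.

```yaml
# 0.2.1 (removed): a dedicated step type with model and prompt fields.
#   - id: summarize
#     type: llm
#     model: haiku
#     prompt: ...

# 0.3.1: the same call expressed as a tool step.
- id: summarize
  type: tool
  tool: llm.complete          # HYPOTHETICAL name; use the exact tool the host registers
  inputs:
    instruction:              # hypothetical input name
      type: string
      value: "Summarize this data in 2-3 sentences. Write only the summary."
    data:                     # hypothetical input name
      type: array
      value: $steps.fetch_data.output.results
  outputs:
    summary:
      type: string
      value: $result.text     # assumes the hypothetical tool returns { text: "..." }
```

The mapping on `summary` relies only on the documented `$result` behaviour for step outputs. Which inputs the tool accepts and what it returns must be checked against the tools actually listed in the runtime context.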
@@ -389,9 +353,9 @@ inputs:
|
|
|
389
353
|
|
|
390
354
|
### Iterating with `each` on Tool Steps
|
|
391
355
|
|
|
392
|
-
When you need to call
|
|
356
|
+
When you need to call a tool once per item in a list, use `each` on a tool step. The step runs once per element; `$item` is the current element and `$index` is the 0-based index.
|
|
393
357
|
|
|
394
|
-
**Rate limiting:** The runtime executes iterations sequentially. **Always add `delay` to every `each` loop that calls an external service.** `delay: "1s"` waits 1 second between iterations (not after the last). External APIs rate-limit without warning; a missing `delay` is a latent failure.
|
|
358
|
+
**Rate limiting:** The runtime executes iterations sequentially. **Always add `delay` to every `each` loop that calls an external service.** `delay: "1s"` waits 1 second between iterations (not after the last). External APIs rate-limit without warning; a missing `delay` is a latent failure. `delay: "2s"` is a safe default for most APIs. Always prefer a bulk API endpoint that returns all data in one request. When per-item fetching is unavoidable, add `delay`, a preceding filter step to cap the count (see the `slice_items` step in the example below), and include `retry` with `backoff`.
|
|
395
359
|
|
|
396
360
|
**Output collection:** Each iteration's output is collected into an array. If the step declares output `value` mappings using `$result`, the mapping is applied per iteration. The step record's `output` is the array of per-iteration mapped results.
|
|
397
361
|
|
|
@@ -400,50 +364,53 @@ steps:
|
|
|
400
364
|
- id: get_ids
|
|
401
365
|
type: tool
|
|
402
366
|
tool: api.list_items
|
|
367
|
+
inputs:
|
|
368
|
+
url: { type: string, value: $inputs.api_url }
|
|
403
369
|
outputs:
|
|
404
|
-
|
|
405
|
-
type: array
|
|
370
|
+
items: { type: array, value: $result.items }
|
|
406
371
|
|
|
407
372
|
- id: fetch_details
|
|
408
373
|
type: tool
|
|
409
|
-
tool: api.get_item
|
|
410
|
-
each: $steps.get_ids.output.
|
|
411
|
-
|
|
374
|
+
tool: api.get_item
|
|
375
|
+
each: $steps.get_ids.output.items # iterate over items array
|
|
376
|
+
delay: "2s" # required: rate limit between calls
|
|
377
|
+
on_error: ignore # skip failed fetches, continue
|
|
412
378
|
inputs:
|
|
413
|
-
|
|
379
|
+
id:
|
|
414
380
|
type: string
|
|
415
|
-
value:
|
|
381
|
+
value: $item.id # each item's ID from the listing
|
|
416
382
|
outputs:
|
|
417
383
|
title:
|
|
418
384
|
type: string
|
|
419
|
-
value: $result.
|
|
420
|
-
|
|
421
|
-
type:
|
|
422
|
-
value: $result.
|
|
385
|
+
value: $result.title # mapped per iteration via $result
|
|
386
|
+
summary:
|
|
387
|
+
type: string
|
|
388
|
+
value: $result.summary
|
|
423
389
|
```
|
|
424
390
|
|
|
425
|
-
After this step, `$steps.fetch_details.output` is an array of `{ title,
|
|
391
|
+
After this step, `$steps.fetch_details.output` is an array of `{ title, summary }` objects — one per iteration. Use `$steps.fetch_details.output` (the whole array) in downstream steps or workflow outputs.
|
|
426
392
|
|
|
427
393
|
**Workflow output for each+tool:**
|
|
394
|
+
|
|
428
395
|
```yaml
|
|
429
396
|
outputs:
|
|
430
397
|
details:
|
|
431
398
|
type: array
|
|
432
|
-
value: $steps.fetch_details.output
|
|
399
|
+
value: $steps.fetch_details.output # the collected array of per-iteration results
|
|
433
400
|
```
|
|
434
401
|
|
|
435
402
|
**Pattern: List → Slice → Fetch Details**
|
|
436
403
|
|
|
437
|
-
Full example
|
|
404
|
+
Full example fetching a listing then fetching each detail via `each`:
|
|
438
405
|
|
|
439
406
|
```yaml
|
|
440
407
|
inputs:
|
|
408
|
+
api_url:
|
|
409
|
+
type: string
|
|
410
|
+
default: "https://api.example.com/items"
|
|
441
411
|
count:
|
|
442
412
|
type: int
|
|
443
413
|
default: 10
|
|
444
|
-
base_url:
|
|
445
|
-
type: string
|
|
446
|
-
default: "https://api.example.com/item/"
|
|
447
414
|
|
|
448
415
|
outputs:
|
|
449
416
|
items:
|
|
@@ -451,231 +418,46 @@ outputs:
|
|
|
451
418
|
value: $steps.fetch_details.output
|
|
452
419
|
|
|
453
420
|
steps:
|
|
454
|
-
- id:
|
|
421
|
+
- id: get_listing
|
|
455
422
|
type: tool
|
|
456
423
|
tool: api.list_items
|
|
457
424
|
inputs:
|
|
458
|
-
url: { type: string, value:
|
|
425
|
+
url: { type: string, value: $inputs.api_url }
|
|
459
426
|
outputs:
|
|
460
|
-
|
|
427
|
+
items: { type: array, value: $result.items }
|
|
461
428
|
|
|
462
|
-
- id:
|
|
429
|
+
- id: slice_items
|
|
463
430
|
type: transform
|
|
464
431
|
operation: filter
|
|
465
|
-
where: $index < $inputs.count
|
|
432
|
+
where: $index < $inputs.count # cap iteration count to avoid rate limiting
|
|
466
433
|
inputs:
|
|
467
|
-
items: { type: array, value: $steps.
|
|
434
|
+
items: { type: array, value: $steps.get_listing.output.items }
|
|
468
435
|
outputs:
|
|
469
|
-
|
|
436
|
+
items: { type: array }
|
|
470
437
|
|
|
471
438
|
- id: fetch_details
|
|
472
439
|
type: tool
|
|
473
440
|
tool: api.get_item
|
|
474
|
-
each: $steps.
|
|
475
|
-
delay: "2s"
|
|
476
|
-
retry: { max: 3, delay: "2s", backoff: 1.5 }
|
|
477
|
-
on_error: ignore
|
|
478
|
-
inputs:
|
|
479
|
-
url:
|
|
480
|
-
type: string
|
|
481
|
-
value: "${inputs.base_url}${item}.json"
|
|
482
|
-
outputs:
|
|
483
|
-
title: { type: string, value: $result.body.title }
|
|
484
|
-
score: { type: int, value: $result.body.score }
|
|
485
|
-
url: { type: string, value: $result.body.url }
|
|
486
|
-
```
|
|
487
|
-
|
|
488
|
-
### Iterating with `each` on LLM Steps
|
|
489
|
-
|
|
490
|
-
When you have an array of items that each need LLM reasoning (summarization, classification, extraction), use `each` on the LLM step — just like tool steps. **Do not dump the entire array into a single prompt.** Always add `delay` to LLM `each` loops — LLM APIs enforce token-per-minute limits, and `delay: "1s"` is the minimum safe default.
|
|
491
|
-
|
|
492
|
-
**Why iterate instead of batch:**
|
|
493
|
-
- **Token bounds** — Each call processes one item, so prompt size is predictable and bounded. Batching N items risks hitting context limits when items are large (e.g., HTML content).
|
|
494
|
-
- **Error isolation** — If one item produces malformed output, only that item fails. With `on_error: ignore`, the rest succeed. Batching loses *all* results if the model returns one malformed JSON array.
|
|
495
|
-
- **Prompt simplicity** — "Summarize this one item" is a trivial prompt. "Parse N items and return an N-element array with exact positional correspondence" is fragile and error-prone.
|
|
496
|
-
|
|
497
|
-
**Pattern: `each` + LLM with plain text output + map transform**
|
|
498
|
-
|
|
499
|
-
Use plain text output (`value: $result`) for the LLM step, then zip the LLM results with structural data from the source array using a `map` transform:
|
|
500
|
-
|
|
501
|
-
```yaml
|
|
502
|
-
steps:
|
|
503
|
-
- id: fetch_items
|
|
504
|
-
type: tool
|
|
505
|
-
tool: api.get_item # platform-specific; use web.scrape for HTML pages
|
|
506
|
-
each: $steps.get_ids.output.ids
|
|
507
|
-
delay: "2s"
|
|
508
|
-
on_error: ignore
|
|
509
|
-
inputs:
|
|
510
|
-
url:
|
|
511
|
-
type: string
|
|
512
|
-
value: "${inputs.base_url}${item}.json"
|
|
513
|
-
outputs:
|
|
514
|
-
title: { type: string, value: $result.body.title }
|
|
515
|
-
content: { type: string, value: $result.body.content }
|
|
516
|
-
|
|
517
|
-
- id: summarize
|
|
518
|
-
type: llm
|
|
519
|
-
model: haiku
|
|
520
|
-
each: $steps.fetch_items.output # iterate over the collected array
|
|
521
|
-
delay: "1s" # rate-limit: 1s pause between iterations
|
|
522
|
-
on_error: ignore # skip items that fail
|
|
523
|
-
description: Summarize each item individually
|
|
524
|
-
prompt: |
|
|
525
|
-
Summarize this item in 1-2 sentences.
|
|
526
|
-
|
|
527
|
-
Title: $item.title
|
|
528
|
-
Content: $item.content
|
|
529
|
-
|
|
530
|
-
Write only the summary — no formatting, no preamble.
|
|
531
|
-
inputs:
|
|
532
|
-
item:
|
|
533
|
-
type: object
|
|
534
|
-
value: $item
|
|
535
|
-
outputs:
|
|
536
|
-
summary:
|
|
537
|
-
type: string
|
|
538
|
-
value: $result # plain text — one summary per iteration
|
|
539
|
-
|
|
540
|
-
- id: combine_results
|
|
541
|
-
type: transform
|
|
542
|
-
operation: map
|
|
543
|
-
description: Zip summaries with source data
|
|
544
|
-
expression:
|
|
545
|
-
title: $item.title
|
|
546
|
-
description: $steps.summarize.output[$index].summary
|
|
547
|
-
inputs:
|
|
548
|
-
items: { type: array, value: $steps.fetch_items.output }
|
|
549
|
-
outputs:
|
|
550
|
-
items: { type: array }
|
|
551
|
-
```
|
|
552
|
-
|
|
553
|
-
After this, `$steps.combine_results.output.items` is an array of `{ title, description }` objects — structural data from the tool step, LLM-generated text from the summarize step.
|
|
554
|
-
|
|
555
|
-
**Why plain text over JSON for `each` + LLM:**
|
|
556
|
-
- **Fence risk** — Models frequently wrap JSON in markdown fences (`` ```json...``` ``) despite explicit instructions. The runtime parses with `JSON.parse`, which rejects fenced output. Plain text has no parsing step — what the model writes is what you get.
|
|
557
|
-
- **Silent failures** — When JSON parsing fails, the output stays as a raw string. `$result.field` on a string returns `undefined`, which propagates as `{}` downstream. No error is thrown — the workflow "succeeds" with empty objects.
|
|
558
|
-
- **Structural data doesn't need LLM generation** — Fields like `title`, `id`, `score` already exist in the source data. Only the LLM-generated field (summary, classification, score) needs to come from the model. Use a `map` transform to zip them together.
|
|
559
|
-
|
|
560
|
-
**Workflow output for each+LLM:**
|
|
561
|
-
```yaml
|
|
562
|
-
outputs:
|
|
563
|
-
summaries:
|
|
564
|
-
type: array
|
|
565
|
-
value: $steps.combine_results.output.items # the zipped array
|
|
566
|
-
```
|
|
567
|
-
|
|
568
|
-
**Anti-pattern — JSON output in `each` + LLM:**
|
|
569
|
-
```yaml
|
|
570
|
-
# BAD: model may return fenced JSON → parse fails → $result.field returns undefined → silent {}
|
|
571
|
-
- id: summarize
|
|
572
|
-
type: llm
|
|
573
|
-
each: $steps.fetch_items.output
|
|
574
|
-
prompt: |
|
|
575
|
-
Return a JSON object with "title" and "description" fields.
|
|
576
|
-
Respond with raw JSON only — no markdown fences.
|
|
577
|
-
outputs:
|
|
578
|
-
title: { type: string, value: $result.title } # undefined if fenced
|
|
579
|
-
description: { type: string, value: $result.description } # undefined if fenced
|
|
580
|
-
```
|
|
581
|
-
|
|
582
|
-
**Anti-pattern — batching all items into one prompt:**
|
|
583
|
-
```yaml
|
|
584
|
-
# BAD: unbounded prompt size, all-or-nothing failure, complex output format
|
|
585
|
-
- id: summarize
|
|
586
|
-
type: llm
|
|
587
|
-
prompt: |
|
|
588
|
-
Summarize each job below...
|
|
589
|
-
Jobs: $steps.fetch_items.output # dumps entire array into prompt
|
|
590
|
-
outputs:
|
|
591
|
-
roles: { type: array, value: $result } # one malformed response loses everything
|
|
592
|
-
```
|
|
593
|
-
|
|
594
|
-
**When bulk IS acceptable:** Use a single LLM call with the full array only when the task requires cross-item reasoning — ranking, deduplication, holistic comparison, or generating a unified summary across all items. If each item can be processed independently, always use `each`.
|
|
595
|
-
|
|
596
|
-
## Web Scraping Pattern
|
|
597
|
-
|
|
598
|
-
When a workflow fetches HTML and extracts structured data, follow this recipe:
|
|
599
|
-
|
|
600
|
-
### Step pattern: scrape → guard
|
|
601
|
-
|
|
602
|
-
`web.scrape` fetches the URL and applies CSS selectors in one step:
|
|
603
|
-
|
|
604
|
-
```yaml
|
|
605
|
-
steps:
|
|
606
|
-
- id: scrape_data
|
|
607
|
-
type: tool
|
|
608
|
-
tool: web.scrape
|
|
609
|
-
retry: { max: 3, delay: "2s", backoff: 1.5 }
|
|
610
|
-
inputs:
|
|
611
|
-
url: { type: string, value: "https://example.com/search" }
|
|
612
|
-
headers:
|
|
613
|
-
type: object
|
|
614
|
-
value: { "User-Agent": "Mozilla/5.0", "Accept": "text/html" }
|
|
615
|
-
selector: { type: string, value: "li.result-item" }
|
|
616
|
-
fields:
|
|
617
|
-
type: object
|
|
618
|
-
value:
|
|
619
|
-
title: "h3.title"
|
|
620
|
-
url: "a.link @href"
|
|
621
|
-
id: "@data-pid"
|
|
622
|
-
limit: { type: int, value: 50 }
|
|
623
|
-
outputs:
|
|
624
|
-
items: { type: array, value: $result.results }
|
|
625
|
-
|
|
626
|
-
- id: guard_empty
|
|
627
|
-
type: exit
|
|
628
|
-
condition: $steps.scrape_data.output.items.length == 0
|
|
629
|
-
status: success
|
|
630
|
-
output: { results: [] }
|
|
631
|
-
inputs: {}
|
|
632
|
-
outputs: {}
|
|
633
|
-
```
|
|
634
|
-
|
|
635
|
-
### Multi-page scraping with `each`
|
|
636
|
-
|
|
637
|
-
When scraping multiple pages, combine `web.scrape` with `each`:
|
|
638
|
-
|
|
639
|
-
```yaml
|
|
640
|
-
steps:
|
|
641
|
-
- id: get_page_list
|
|
642
|
-
type: tool
|
|
643
|
-
tool: api.get_sitemap # returns list of page URLs to scrape
|
|
644
|
-
outputs:
|
|
645
|
-
pages:
|
|
646
|
-
type: array
|
|
647
|
-
|
|
648
|
-
- id: scrape_pages
|
|
649
|
-
type: tool
|
|
650
|
-
tool: web.scrape
|
|
651
|
-
each: $steps.get_page_list.output.pages
|
|
441
|
+
each: $steps.slice_items.output.items
|
|
652
442
|
delay: "2s"
|
|
653
443
|
retry: { max: 3, delay: "2s", backoff: 1.5 }
|
|
654
444
|
on_error: ignore
|
|
655
445
|
inputs:
|
|
656
|
-
|
|
657
|
-
selector: { type: string, value: "article.post" }
|
|
658
|
-
fields:
|
|
659
|
-
type: object
|
|
660
|
-
value:
|
|
661
|
-
title: "h1.title"
|
|
662
|
-
body: "div.content"
|
|
446
|
+
id: { type: string, value: $item.id }
|
|
663
447
|
outputs:
|
|
664
|
-
|
|
448
|
+
title: { type: string, value: $result.title }
|
|
449
|
+
summary: { type: string, value: $result.summary }
|
|
450
|
+
score: { type: string, value: $result.score }
|
|
665
451
|
```
|
|
666
452
|
|
|
667
|
-
### Selector research
|
|
668
|
-
|
|
669
|
-
Follow the Research protocol (Authoring Process, Phase 2) before writing selectors. Every selector must be verified against actual fetched HTML.
|
|
670
|
-
|
|
671
453
|
## Authoring Rules
|
|
672
454
|
|
|
673
|
-
1. **
|
|
674
|
-
2. **Use
|
|
455
|
+
1. **Use tool steps for all external calls.** Every interaction with an API, database, or LLM is a tool step. The runtime dispatches tool steps to whatever tools the host registers — the workflow author should use the exact tool names available in this deployment context. Do not invent tool names.
|
|
456
|
+
2. **Use transforms for pure data operations.** Filtering, reshaping, sorting, and field extraction are structural operations — use `transform` steps. Do not use tool steps to reshape data that can be expressed as a transform.
|
|
675
457
|
3. **Always declare inputs and outputs.** They enable validation and composability.
|
|
676
458
|
4. **Use `value` on workflow outputs** to explicitly map step results to workflow outputs. Use `$steps.<id>.output.<field>` expressions. This is preferred over exit steps for producing output.
|
|
677
|
-
5. **Use `value` on step outputs** to map fields from the raw executor result using `$result`.
|
|
678
|
-
6. **Use `each` for per-item processing** on
|
|
459
|
+
5. **Use `value` on step outputs** to map fields from the raw executor result using `$result`. Useful when the tool's response shape differs from what downstream steps need.
|
|
460
|
+
6. **Use `each` for per-item processing** on tool steps. Always include `delay` on every `each` loop that calls an external service — `delay: "2s"` is a safe default. See _Iteration Patterns_.
|
|
679
461
|
7. **Add `on_error: ignore` for non-critical steps** like notifications.
|
|
680
462
|
8. **Add `retry` for external API calls** (tool steps that might fail transiently).
|
|
681
463
|
9. **Use `condition` guards for early exits** rather than letting empty data flow through.
|
|
@@ -684,11 +466,9 @@ Follow the Research protocol (Authoring Process, Phase 2) before writing selecto
|
|
|
684
466
|
12. **`condition` on a `conditional` step is the branch condition**, not a guard.
|
|
685
467
|
13. **Use exit steps for conditional early termination only**, not as the default way to produce output. Exit output keys must match the declared workflow output keys.
|
|
686
468
|
14. **Transform steps are for arrays only.** Never use a transform to extract fields from a single object.
|
|
687
|
-
15. **Use `map` with `$index` for cross-array merging.** When multiple steps produce parallel arrays, use a `map` transform with bracket indexing (`$steps.other.output.field[$index]`) to zip them into structured objects.
|
|
688
|
-
16. **
|
|
689
|
-
17. **
|
|
690
|
-
18. **Prefer bulk endpoints over per-item iteration.** When per-item `each` + tool calls (including `web.scrape`) are unavoidable, always add `delay: "2s"` (minimum), cap iteration count, and add `retry` with `backoff`. `delay` is not optional — external sites and APIs rate-limit without warning and delays are free. Same applies to `each` + `llm` steps: always add `delay: "1s"`. See *Iteration Patterns*.
|
|
691
|
-
19. **Prefer plain text LLM output over JSON.** For single-value tasks (summarization, classification, scoring), use `value: $result` and instruct the model to return plain text. Reserve JSON (`value: $result.field`) for multi-field output where every field requires LLM reasoning. In `each` + LLM patterns, always use plain text + a `map` transform to zip LLM output with structural data from the source array. See *Iteration Patterns*.
|
|
469
|
+
15. **Use `map` with `$index` for cross-array merging.** When multiple steps produce parallel arrays, use a `map` transform with bracket indexing (`$steps.other.output.field[$index]`) to zip them into structured objects.
|
|
470
|
+
16. **Guard expensive steps behind deterministic exits.** Pattern: fetch → filter → exit guard → expensive tool. Use deterministic expressions (e.g., `$item.department == "Engineering"` or `$item.title contains "Product Manager"`) in `transform filter` steps before any costly tool call. See _Patterns_.
|
|
471
|
+
17. **Prefer bulk endpoints over per-item iteration.** When per-item `each` + tool calls are unavoidable, always add `delay: "2s"` (minimum), cap iteration count, and add `retry` with `backoff`. `delay` is not optional — external APIs rate-limit without warning. See _Iteration Patterns_.
|
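Reviewer sketch, not part of the package: rules 6, 8 and 17 combine `delay` on `each` loops with `retry` and `backoff`. The Python below estimates how much waiting those settings could add. It assumes each retry waits `delay * backoff ** attempt`, which this diff does not confirm as the runtime's behavior; `parse_duration`, `retry_waits` and `worst_case_wait` are hypothetical helper names.

```python
import re

# Hypothetical unit table; only the "2s" form appears in the diff itself.
_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0}


def parse_duration(text):
    """Convert a duration string such as "2s" into seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def retry_waits(max_retries, delay, backoff):
    """Wait before each retry, ASSUMING delay * backoff ** attempt."""
    base = parse_duration(delay)
    return [base * backoff ** attempt for attempt in range(max_retries)]


def worst_case_wait(items, each_delay, retry):
    """Upper bound on seconds spent waiting, excluding the calls themselves."""
    between_items = parse_duration(each_delay) * max(items - 1, 0)
    per_item = sum(retry_waits(retry["max"], retry["delay"], retry["backoff"]))
    return between_items + items * per_item


# Settings copied from the example step: delay "2s", retry max 3, backoff 1.5.
schedule = retry_waits(3, "2s", 1.5)
total = worst_case_wait(10, "2s", {"max": 3, "delay": "2s", "backoff": 1.5})
```

Under that assumption, a 10-item loop could spend close to two minutes waiting if every call exhausts its retries, which is one reason rule 17 asks for a cap on the iteration count.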
|
692
472
|
|
|
693
473
|
## Output Format
|
|
694
474
|
|
|
@@ -710,9 +490,9 @@ description: Fetches data and outputs a specific field
|
|
|
710
490
|
|
|
711
491
|
` `` `workflow
|
|
712
492
|
inputs:
|
|
713
|
-
|
|
493
|
+
url:
|
|
714
494
|
type: string
|
|
715
|
-
default: "
|
|
495
|
+
default: "https://api.example.com/items"
|
|
716
496
|
|
|
717
497
|
outputs:
|
|
718
498
|
name:
|
|
@@ -722,21 +502,22 @@ outputs:
|
|
|
722
502
|
steps:
|
|
723
503
|
- id: fetch
|
|
724
504
|
type: tool
|
|
725
|
-
tool:
|
|
505
|
+
tool: api.get_item
|
|
726
506
|
inputs:
|
|
727
|
-
|
|
507
|
+
url:
|
|
728
508
|
type: string
|
|
729
|
-
value: $inputs.
|
|
509
|
+
value: $inputs.url
|
|
730
510
|
outputs:
|
|
731
511
|
name:
|
|
732
512
|
type: string
|
|
733
|
-
value: $result.
|
|
513
|
+
value: $result.name
|
|
734
514
|
` `` `
|
|
735
515
|
```
|
|
736
516
|
|
|
737
517
|
## Validation
|
|
738
518
|
|
|
739
519
|
After writing the file, always validate it against the runtime. The validation checklist:
|
|
520
|
+
|
|
740
521
|
- [ ] All step IDs are unique
|
|
741
522
|
- [ ] All `$steps` references point to earlier steps
|
|
742
523
|
- [ ] All tools referenced are confirmed available in this deployment context
|
|
@@ -745,11 +526,8 @@ After writing the file, always validate it against the runtime. The validation c
|
|
|
745
526
|
- [ ] `each` not used on exit or conditional steps
|
|
746
527
|
- [ ] Workflow outputs have `value` mapping to `$steps` references
|
|
747
528
|
- [ ] Step output `value` uses `$result` (not `$steps`)
|
|
748
|
-
- [ ] LLM step outputs have `value` using `$result`
|
|
749
529
|
- [ ] All `${}` template references resolve to declared inputs/steps
|
|
750
|
-
- [ ]
|
|
751
|
-
- [ ]
|
|
752
|
-
- [ ] Every `each` loop that calls an external service has `delay` (tool steps: `"2s"` minimum; LLM steps: `"1s"` minimum)
|
|
753
|
-
- [ ] `each` + `web.scrape` steps are bounded (preceded by a cap) and have `retry` with `backoff`
|
|
530
|
+
- [ ] Every `each` loop that calls an external service has `delay` (`"2s"` minimum)
|
|
531
|
+
- [ ] `each` + tool steps with per-item fetching are bounded (preceded by a cap) and have `retry` with `backoff`
|
|
754
532
|
|
|
755
533
|
If validation fails, fix the errors and revalidate.
|
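Reviewer sketch, not part of the package: a few of the checklist items above can be checked mechanically. The Python below works on a workflow already parsed into plain dicts and lists. It is not the runtime's validator; `check_workflow` and its error messages are hypothetical, and it assumes step IDs contain only letters, digits and underscores.

```python
import re

# Captures the step ID in references like $steps.fetch.output.name.
STEP_REF = re.compile(r"\$steps\.([A-Za-z_][A-Za-z0-9_]*)")


def _strings(node):
    """Yield every string nested anywhere inside a parsed YAML value."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from _strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from _strings(value)


def check_workflow(workflow):
    """Return a list of problems covering four checklist items."""
    errors = []
    seen = set()
    for step in workflow.get("steps", []):
        step_id = step.get("id")
        # All step IDs are unique.
        if step_id in seen:
            errors.append(f"duplicate step id: {step_id}")
        # All $steps references point to earlier steps.
        for text in _strings(step):
            for ref in STEP_REF.findall(text):
                if ref not in seen:
                    errors.append(f"{step_id}: $steps.{ref} is not an earlier step")
        if "each" in step:
            # each is not used on exit or conditional steps.
            if step.get("type") in ("exit", "conditional"):
                errors.append(f"{step_id}: each is not allowed on this step type")
            # Every each loop on a tool step has a delay.
            if step.get("type") == "tool" and "delay" not in step:
                errors.append(f"{step_id}: each loop on a tool step has no delay")
        seen.add(step_id)
    return errors


# Example input: the second step references a step that was never declared
# and iterates without a delay.
workflow = {
    "steps": [
        {"id": "fetch", "type": "tool", "tool": "api.get_item",
         "inputs": {"url": {"type": "string", "value": "$inputs.url"}}},
        {"id": "details", "type": "tool", "tool": "api.get_item",
         "each": "$steps.slice_items.output.items",
         "inputs": {"id": {"type": "string", "value": "$item.id"}}},
    ]
}
problems = check_workflow(workflow)
```

Checks that depend on the deployment, such as whether a tool name is actually registered, are left out because they cannot be decided from the workflow alone.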