pi-taskflow 0.0.4 → 0.0.6
- package/DESIGN.md +15 -1
- package/README.md +273 -54
- package/examples/conditional-research.json +56 -0
- package/examples/guarded-refactor.json +50 -0
- package/extensions/agents.ts +8 -1
- package/extensions/index.ts +30 -15
- package/extensions/interpolate.ts +231 -0
- package/extensions/render.ts +14 -3
- package/extensions/runner.ts +61 -78
- package/extensions/runtime.ts +369 -46
- package/extensions/schema.ts +85 -2
- package/extensions/store.ts +29 -3
- package/extensions/usage.ts +42 -0
- package/package.json +2 -2
- package/skills/taskflow/SKILL.md +79 -0
- package/skills/taskflow/configuration.md +275 -0
package/DESIGN.md
CHANGED
|
@@ -204,6 +204,19 @@ pi-taskflow/
|
|
|
204
204
|
| `map` | **Dynamic fan-out** over an upstream array, one agent per item | ≤concurrency |
|
|
205
205
|
| `gate` | Quality gate / adversarial review (can decide whether the flow continues) | 1+ |
|
|
206
206
|
| `reduce` | Aggregate multiple results into one (synthesize) | 1 |
|
|
207
|
+
| `approval` | **Human-in-the-loop**: pause and wait for approve / reject / edit | 1 |
|
|
208
|
+
| `flow` | Run a **saved taskflow** as a single phase (composition and reuse) | sub-flow's concurrency |
|
|
209
|
+
|
|
210
|
+
### 3.3b Control-flow / reliability fields (any phase)
|
|
211
|
+
|
|
212
|
+
| Field | Semantics |
|
|
213
|
+
|------|------|
|
|
214
|
+
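| `when` | Conditional guard: the phase is skipped when the expression is false. Supports `{refs}`, `== != < > <= >=`, `&& \|\| !`, parentheses, and string/number literals. Parse failures fail open (the phase still runs) |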
|
|
215
|
+
| `join` | Dependency join: `all` (default, waits for every dep) / `any` (OR-join, runs as soon as any one dep completes) |
|
|
216
|
+
| `retry` | `{max, backoffMs, factor}`: retry on failure, delay = `backoffMs * factor^attempt` |
|
|
217
|
+
| `use` / `with` | Name and arguments of the `flow` sub-flow (string argument values are interpolated) |
|
|
218
|
+
|
|
219
|
+
Top-level `budget: {maxUSD, maxTokens}`: the run stops once cumulative cost or tokens exceed the limit (remaining phases are skipped, run state `blocked`).
|
|
207
220
|
|
|
208
221
|
### 3.4 Template interpolation
|
|
209
222
|
|
|
@@ -293,7 +306,8 @@ export async function runTaskflow(def, args, ctx): Promise<TaskflowResult>
|
|
|
293
306
|
| **v0.1** | DSL + schema + runtime (agent/parallel/map/reduce) + `taskflow` tool + `/tf run` + memory isolation + streaming progress | ✅ Released (npm 0.0.1) |
|
|
294
307
|
| **v0.2** | Save / dynamic command registration + cross-session resume + real `gate` gating + interactive run-history TUI | ✅ Done (npm 0.0.3) |
|
|
295
308
|
| **v0.3** | examples + SKILL.md (teaches the LLM to write definitions) + YAML support + npm release | 🚧 examples/SKILL/npm done; YAML pending |
|
|
296
|
-
| **v0.
|
|
309
|
+
| **v0.6** | Control flow & reliability: `when` conditional branching + `join:any` OR-join + declarative `retry` + `approval` human-in-the-loop + `flow` sub-flow composition + `budget` cost cap | ✅ Done |
|
|
310
|
+
| **v0.7+** | True background execution (detached + polling) + event/cron triggers + cost **estimation** + mermaid DAG export + built-in `deep-research` workflow | ⏳ Pending |
|
|
297
311
|
|
|
298
312
|
---
|
|
299
313
|
|
package/README.md
CHANGED
|
@@ -1,94 +1,269 @@
|
|
|
1
|
-
|
|
1
|
+
<div align="center">
|
|
2
|
+
|
|
3
|
+
<img src="./assets/hero.png" alt="pi-taskflow — declarative, multi-phase subagent workflows" width="880">
|
|
4
|
+
|
|
5
|
+
<p>
|
|
6
|
+
<a href="https://www.npmjs.com/package/pi-taskflow"><img src="https://img.shields.io/npm/v/pi-taskflow?style=flat-square&color=B692FF&label=npm" alt="npm version"></a>
|
|
7
|
+
<a href="./LICENSE"><img src="https://img.shields.io/badge/license-MIT-43D9AD?style=flat-square" alt="MIT license"></a>
|
|
8
|
+
<a href="https://pi.dev"><img src="https://img.shields.io/badge/for-Pi%20coding%20agent-6E8BFF?style=flat-square" alt="for the Pi coding agent"></a>
|
|
9
|
+
</p>
|
|
10
|
+
|
|
11
|
+
</div>
|
|
2
12
|
|
|
3
13
|
> Lightweight workflow orchestration for the [Pi coding agent](https://pi.dev).
|
|
4
14
|
|
|
5
|
-
|
|
6
|
-
|
|
7
|
-
|
|
8
|
-
|
|
15
|
+
**Orchestrate your Pi subagents. Not by prompting — by declaring.**
|
|
16
|
+
|
|
17
|
+
If you've used the built-in subagent tool's `task` / `tasks` / `chain`, you
|
|
18
|
+
already know the shorthand; the only difference is that your runs become tracked, resumable, and
|
|
19
|
+
saveable as a one-word `/tf:<name>` command.
|
|
9
20
|
|
|
10
21
|
```bash
|
|
11
22
|
pi install npm:pi-taskflow
|
|
12
23
|
```
|
|
13
24
|
|
|
25
|
+
Fan out one subagent per item, gate the results with an adversarial review, and
|
|
26
|
+
get back only the final report — none of the intermediate transcripts ever touch
|
|
27
|
+
your conversation.
|
|
28
|
+
|
|
14
29
|
## Why
|
|
15
30
|
|
|
16
|
-
|
|
17
|
-
coordinated steps
|
|
18
|
-
or a
|
|
19
|
-
|
|
31
|
+
The built-in subagent tool is great for a single delegated task. But when a job
|
|
32
|
+
needs many coordinated steps, fan-out over dozens of items, cross-checked review,
|
|
33
|
+
or a repeatable pipeline, you want orchestration — without the intermediate
|
|
34
|
+
transcripts eating your context window.
|
|
20
35
|
|
|
21
36
|
`pi-taskflow` moves the plan into a small declarative definition. The runtime
|
|
22
37
|
holds the DAG, the loops, and the intermediate results; your context receives
|
|
23
38
|
only the final phase's output.
|
|
24
39
|
|
|
25
|
-
| | `subagent` | `pi-taskflow` |
|
|
40
|
+
| | `subagent` tool | `pi-taskflow` |
|
|
26
41
|
|---|---|---|
|
|
27
42
|
| Who drives | the model, turn by turn | the runtime, from a definition |
|
|
28
43
|
| Intermediate results | in your context window | in the runtime (not your context) |
|
|
29
44
|
| Reusable | re-described each time | saved as `/tf:<name>` |
|
|
30
45
|
| Scale | a few tasks | dynamic `map` fan-out |
|
|
31
|
-
| Resumable | no | yes (cross-session) |
|
|
46
|
+
| Resumable | no | yes (cross-session, cached phases skip) |
|
|
47
|
+
| Quality gates | no | `gate` phases with `VERDICT: BLOCK / PASS` |
|
|
48
|
+
| Progress visibility | opaque while running | live DAG render with timing + cost |
|
|
49
|
+
| Ergonomics | inline JSON each time | shorthand (`task`/`tasks`/`chain`) or DSL |
|
|
50
|
+
|
|
51
|
+
## Show me
|
|
52
|
+
|
|
53
|
+
Describe a pipeline once, then run it from a pi session by name:
|
|
54
|
+
|
|
55
|
+
> `/tf:summarize-files dir=src`
|
|
56
|
+
|
|
57
|
+
The runtime fans out one subagent per file, merges the summaries in a `reduce`
|
|
58
|
+
phase, and returns only the final overview. Every intermediate transcript stays
|
|
59
|
+
in the runtime — never in your context window. (Full definition in
|
|
60
|
+
[Quickstart](#then-go-declarative) below.)
|
|
61
|
+
|
|
62
|
+
## Quickstart
|
|
63
|
+
|
|
64
|
+
### Shorthand: same effort as `subagent`, but tracked & resumable
|
|
65
|
+
|
|
66
|
+
**Single task** — one agent, one job:
|
|
67
|
+
|
|
68
|
+
```jsonc
|
|
69
|
+
{ "task": "Summarize the architecture of src/", "agent": "explorer" }
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
**Parallel tasks** — fire several at once, outputs merge:
|
|
32
73
|
|
|
33
|
-
|
|
74
|
+
```jsonc
|
|
75
|
+
{ "tasks": [
|
|
76
|
+
{ "task": "Audit auth in src/api", "agent": "analyst" },
|
|
77
|
+
{ "task": "Audit input validation in src/api", "agent": "analyst" }
|
|
78
|
+
] }
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
**Chain** — sequential, each step sees the previous one's output:
|
|
82
|
+
|
|
83
|
+
```jsonc
|
|
84
|
+
{ "chain": [
|
|
85
|
+
{ "task": "List the public API of src/lib", "agent": "scout" },
|
|
86
|
+
{ "task": "Write docs for:\n{previous.output}", "agent": "writer" }
|
|
87
|
+
] }
|
|
88
|
+
```
|
|
34
89
|
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
run concurrently (bounded by `concurrency`).
|
|
90
|
+
`agent` is optional (defaults to the first available agent). Add `name` to label
|
|
91
|
+
the run and enable saving it as a reusable command.
|
|
38
92
|
|
|
39
|
-
|
|
93
|
+
Try it inline — tell the model something like:
|
|
40
94
|
|
|
41
|
-
|
|
42
|
-
|------|---------|
|
|
43
|
-
| `agent` | one subagent runs `task` |
|
|
44
|
-
| `parallel` | run `branches[]` concurrently |
|
|
45
|
-
| `map` | fan out over an array — one subagent per item, `{item}` bound |
|
|
46
|
-
| `gate` | quality/adversarial-review step |
|
|
47
|
-
| `reduce` | aggregate several phases' outputs into one |
|
|
95
|
+
> Run a chain: first explore the auth flow, then summarize findings.
|
|
48
96
|
|
|
49
|
-
|
|
97
|
+
The model calls the `taskflow` tool; you get live progress, per-step timing,
|
|
98
|
+
token cost, and a run record. Ask to `save` it and you get `/tf:<name>`.
|
|
50
99
|
|
|
51
|
-
|
|
52
|
-
- `{steps.ID.output}` — a prior phase's text output
|
|
53
|
-
- `{steps.ID.json}` / `{steps.ID.json.field}` — prior output parsed as JSON
|
|
54
|
-
- `{item}` / `{item.field}` — current item inside a `map` phase
|
|
55
|
-
- `{previous.output}` — the immediately-upstream phase output
|
|
100
|
+
### Then go declarative
|
|
56
101
|
|
|
57
|
-
|
|
102
|
+
When your pipeline outgrows the shorthand — when you need dynamic fan-out,
|
|
103
|
+
intermediate JSON routing, or quality gates — graduate to the full DSL:
|
|
58
104
|
|
|
59
105
|
```jsonc
|
|
60
106
|
{
|
|
61
107
|
"name": "summarize-files",
|
|
108
|
+
"description": "Discover files, summarize each, produce a report",
|
|
62
109
|
"args": { "dir": { "default": "." } },
|
|
63
|
-
"concurrency":
|
|
110
|
+
"concurrency": 8,
|
|
64
111
|
"phases": [
|
|
65
112
|
{ "id": "discover", "type": "agent", "agent": "scout",
|
|
66
|
-
"task": "List source files under {args.dir}
|
|
113
|
+
"task": "List source files under {args.dir} (non-recursive).\nOutput ONLY a JSON array [{\"file\":\"\"}]. No prose.",
|
|
67
114
|
"output": "json" },
|
|
68
|
-
{ "id": "summarize", "type": "map",
|
|
69
|
-
"
|
|
115
|
+
{ "id": "summarize", "type": "map",
|
|
116
|
+
"over": "{steps.discover.json}", "as": "item",
|
|
117
|
+
"agent": "scout",
|
|
118
|
+
"task": "Read {item.file} and give a one-sentence summary.",
|
|
70
119
|
"dependsOn": ["discover"] },
|
|
71
|
-
{ "id": "report", "type": "reduce", "from": ["summarize"],
|
|
120
|
+
{ "id": "report", "type": "reduce", "from": ["summarize"],
|
|
121
|
+
"agent": "writer",
|
|
72
122
|
"task": "Combine into a short overview:\n{steps.summarize.output}",
|
|
73
123
|
"dependsOn": ["summarize"], "final": true }
|
|
74
124
|
]
|
|
75
125
|
}
|
|
76
126
|
```
|
|
77
127
|
|
|
78
|
-
|
|
128
|
+
What this does:
|
|
129
|
+
|
|
130
|
+
1. **`discover`** — an agent lists every file in the directory and outputs a JSON array.
|
|
131
|
+
2. **`summarize`** — a `map` fans out, spawning one subagent per file in parallel
|
|
132
|
+
(throttled to 8 concurrent). Each gets `{item.file}` bound to its file path.
|
|
133
|
+
3. **`report`** — a `reduce` merges all summaries into one clean overview.
|
|
134
|
+
|
|
135
|
+
Intermediate outputs never enter your context. The runtime owns them. You get
|
|
136
|
+
only the final report back.
|
|
79
137
|
|
|
80
|
-
|
|
138
|
+
Save it once → `/tf:summarize-files` forever.
|
|
139
|
+
|
|
140
|
+
## Watch it run
|
|
141
|
+
|
|
142
|
+
This is the live progress render for a real run — the `self-improve` flow that
|
|
143
|
+
writes and verifies its own test suites, caught here mid-block by a quality gate:
|
|
81
144
|
|
|
82
145
|
```
|
|
83
|
-
/
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
|
|
88
|
-
|
|
146
|
+
⊗ taskflow self-improve 6/7 · blocked · $0.095
|
|
147
|
+
✓ discover agent deepseek-v4-flash 10t ↑38k ↓6.7k $0.011
|
|
148
|
+
┌ ✓ write-runner-tests agent claude-sonnet-4-6 10t ↑13 ↓6.6k $0.020
|
|
149
|
+
├ ✓ write-store-tests agent claude-sonnet-4-6 10t ↑11 ↓10k $0.018
|
|
150
|
+
├ ✓ write-agents-tests agent claude-sonnet-4-6 10t ↑28 ↓13k $0.030
|
|
151
|
+
└ ✓ fix-stability agent claude-sonnet-4-6 10t ↑13 ↓3.9k $0.012
|
|
152
|
+
✓ verify gate BLOCK 3 type errors in test files deepseek-v4-flash
|
|
153
|
+
⊘ report reduce skipped · Gate blocked ↳ fix-stability
|
|
89
154
|
```
|
|
90
155
|
|
|
91
|
-
|
|
156
|
+
**How to read it — the layout *is* the DAG:**
|
|
157
|
+
|
|
158
|
+
- **Header** — `⊗` means the flow is blocked (a gate halted it); `6/7` phases
|
|
159
|
+
processed, aggregate cost `$0.095`.
|
|
160
|
+
- **Status icons** — `✓` done, `◐` running, `✗` failed, `⊘` skipped, `○` pending.
|
|
161
|
+
- **Rail `┌ ├ └`** — phases in the same DAG layer, running concurrently. The four
|
|
162
|
+
`write-*`/`fix-stability` tasks all fan out from `discover`. A blank gutter is
|
|
163
|
+
a single-phase layer.
|
|
164
|
+
- **`↳`** — a long (layer-skipping) dependency. `report` depends on `verify` (the
|
|
165
|
+
adjacent layer, implied by position) *and* `fix-stability` two layers back, so
|
|
166
|
+
only that skip edge is annotated.
|
|
167
|
+
- **Gate** — `verify` emitted `VERDICT: BLOCK`, so the runtime skipped `report`
|
|
168
|
+
and ended the run as `blocked`, surfacing the reason.
|
|
169
|
+
- **Detail** — per phase: model, token counts (`↑`in `↓`out), cost, and timing.
|
|
170
|
+
Fan-out phases also show sub-task progress.
|
|
171
|
+
|
|
172
|
+
## Phase types
|
|
173
|
+
|
|
174
|
+
| type | meaning | required fields |
|
|
175
|
+
|------|---------|-----------------|
|
|
176
|
+
| `agent` | one subagent runs a single task | `task` |
|
|
177
|
+
| `parallel` | run `branches[]` concurrently | `branches` (array of `{task, agent?}`) |
|
|
178
|
+
| `map` | fan out over an array — one subagent per item, `{item}` bound | `over`, `task` |
|
|
179
|
+
| `gate` | quality/review step that can **halt the flow** | `task` |
|
|
180
|
+
| `reduce` | aggregate `from[]` phase outputs into one | `from`, `task` |
|
|
181
|
+
| `approval` | **human-in-the-loop** pause — approve / reject / edit before continuing | — |
|
|
182
|
+
| `flow` | run a **saved sub-flow** as one phase (composition/reuse) | `use` |
|
|
183
|
+
|
|
184
|
+
Every phase needs `id`. Optional fields: `agent`, `dependsOn`, `output`,
|
|
185
|
+
`model`, `thinking`, `tools`, `cwd`, `concurrency`, `final`, `optional`,
|
|
186
|
+
`when` (conditional guard), `join` (`all`\|`any` dependency join), `retry`
|
|
187
|
+
(`{max, backoffMs, factor}`), and `with` (args for a `flow` phase).
|
|
188
|
+
Run-wide: `budget: {maxUSD, maxTokens}` halts the flow when exceeded.
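As a rough illustration of the run-wide `budget` check, the sketch below compares accumulated usage against the two optional ceilings. The `Budget` and `UsageTotals` shapes and the `budgetExceeded` helper are hypothetical names chosen for this example, and treating "exceeded" as strictly greater than the limit is an assumption; the runtime's actual bookkeeping may differ.

```typescript
// Hypothetical shapes mirroring the documented `budget: {maxUSD, maxTokens}` field.
interface Budget {
  maxUSD?: number;
  maxTokens?: number;
}

// Hypothetical running totals for a flow (names are illustrative only).
interface UsageTotals {
  costUSD: number;
  tokens: number;
}

// Returns true once either ceiling has been passed.
// Assumption: "exceeded" means strictly greater than the limit.
function budgetExceeded(budget: Budget | undefined, usage: UsageTotals): boolean {
  if (!budget) return false; // no budget declared: never halts
  const overCost = budget.maxUSD !== undefined && usage.costUSD > budget.maxUSD;
  const overTokens = budget.maxTokens !== undefined && usage.tokens > budget.maxTokens;
  return overCost || overTokens;
}
```

A scheduler built along these lines would call the check before starting each pending phase and mark the remaining phases as skipped when it returns true.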
|
|
189
|
+
|
|
190
|
+
### Control flow & reliability
|
|
191
|
+
|
|
192
|
+
- **`when`** — skip a phase unless an expression is truthy. Supports `{refs}`,
|
|
193
|
+
`== != < > <= >=`, `&& || !`, parentheses, and quoted strings/numbers, e.g.
|
|
194
|
+
`"when": "{steps.triage.json.route} == deep"`. Pair with `join: "any"` on the
|
|
195
|
+
merge phase to build real if/else routing. Parse errors **fail open**.
|
|
196
|
+
- **`join: "any"`** — an OR-join: the phase runs as soon as *one* dependency
|
|
197
|
+
completes (default `"all"` waits for every dep).
|
|
198
|
+
- **`retry`** — `{ "max": 2, "backoffMs": 500, "factor": 2 }` retries a failing
|
|
199
|
+
subagent with fixed (`factor:1`) or exponential backoff; usage is summed and
|
|
200
|
+
the attempt count shows as `↻N` in the TUI.
|
|
201
|
+
- **`approval`** — pause for a human (`select`: Approve / Reject / Edit). Reject
|
|
202
|
+
halts the flow; Edit injects the typed note as the phase output for downstream
|
|
203
|
+
steps. Non-interactive runs auto-approve.
|
|
204
|
+
- **`flow`** — `{ "type": "flow", "use": "deep-research", "with": { "topic": "{item}" } }`
|
|
205
|
+
runs a saved flow as a phase (recursion is detected and rejected).
|
|
206
|
+
- **`budget`** — a run-wide `{maxUSD, maxTokens}` ceiling; once exceeded, pending
|
|
207
|
+
phases are skipped (and in-flight fan-out stops spawning) and the run is
|
|
208
|
+
`blocked`.
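To make the `retry` schedule concrete, here is a minimal sketch of the documented delay formula `backoffMs * factor^attempt`. The `RetryPolicy` interface and both helper names are hypothetical; the sketch also assumes that `attempt` starts at 0 and that an omitted `factor` behaves like `1` (fixed backoff), neither of which is confirmed by the text above.

```typescript
// Hypothetical shape mirroring the documented `retry: {max, backoffMs, factor}` field.
interface RetryPolicy {
  max: number;
  backoffMs: number;
  factor?: number; // assumed to default to 1 (fixed backoff)
}

// delay = backoffMs * factor^attempt, with `attempt` assumed to be 0-based.
function retryDelayMs(policy: RetryPolicy, attempt: number): number {
  const factor = policy.factor ?? 1;
  return policy.backoffMs * Math.pow(factor, attempt);
}

// The delays waited before each of the `max` retries.
function retrySchedule(policy: RetryPolicy): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < policy.max; attempt++) {
    delays.push(retryDelayMs(policy, attempt));
  }
  return delays;
}
```

Under these assumptions, `{ "max": 2, "backoffMs": 500, "factor": 2 }` waits 500 ms and then 1000 ms, while a policy without `factor` waits the same `backoffMs` every time.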
|
|
209
|
+
|
|
210
|
+
### `output` format
|
|
211
|
+
|
|
212
|
+
- `output: "text"` (default) — the raw subagent output.
|
|
213
|
+
- `output: "json"` — the subagent output is parsed as JSON and exposed via
|
|
214
|
+
`{steps.ID.json}` / `{steps.ID.json.field}`. Set this on phases whose output
|
|
215
|
+
a downstream `map` or `reduce` needs to consume as structured data.
|
|
216
|
+
|
|
217
|
+
There is no `output: "file"`. For file-based output, have the agent write to
|
|
218
|
+
disk with a `write` tool call.
|
|
219
|
+
|
|
220
|
+
### Gate phases (quality control)
|
|
221
|
+
|
|
222
|
+
A `gate` runs an agent to review upstream output and can **block the rest
|
|
223
|
+
of the workflow**. End the gate task's instructions by asking the agent to
|
|
224
|
+
emit a verdict the runtime can read:
|
|
225
|
+
|
|
226
|
+
- a final line `VERDICT: PASS` or `VERDICT: BLOCK` (also accepts `OK`, `FAIL`,
|
|
227
|
+
`STOP`, `REJECT`, `HALT` — last occurrence wins), or
|
|
228
|
+
- JSON like `{"continue": false, "reason": "missing auth checks"}` /
|
|
229
|
+
`{"verdict": "block", "reason": "..."}`.
|
|
230
|
+
|
|
231
|
+
On **BLOCK**, downstream phases are skipped and the run ends as `blocked` with
|
|
232
|
+
the reason surfaced. **Ambiguous output fails open** (treated as PASS) — a gate
|
|
233
|
+
never halts the flow by accident.
|
|
234
|
+
|
|
235
|
+
```
|
|
236
|
+
Review the audit results below. If any endpoint is missing auth, end with
|
|
237
|
+
"VERDICT: BLOCK" and a one-line reason; otherwise end with "VERDICT: PASS".
|
|
238
|
+
|
|
239
|
+
{steps.audit.output}
|
|
240
|
+
```
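The verdict rules above can be sketched as a small parser. This is an illustration under stated assumptions, not the runtime's code: `GateVerdict` and `parseGateVerdict` are hypothetical names, the mapping of `OK` to pass and of `FAIL` / `STOP` / `REJECT` / `HALT` to block is inferred from the word list, and the sketch does not try to extract a reason from the marker form.

```typescript
// Hypothetical result shape; the runtime's real types are not shown here.
interface GateVerdict {
  pass: boolean;
  reason: string | null;
}

// Assumed mapping: PASS/OK continue, every other accepted word blocks.
const BLOCKING_WORDS = ["BLOCK", "FAIL", "STOP", "REJECT", "HALT"];

function parseGateVerdict(output: string): GateVerdict {
  // Marker form: scan every `VERDICT: <WORD>` and keep the last occurrence.
  const marker = /VERDICT:\s*(PASS|OK|BLOCK|FAIL|STOP|REJECT|HALT)\b/gi;
  let lastWord: string | null = null;
  let match: RegExpExecArray | null;
  while ((match = marker.exec(output)) !== null) {
    lastWord = match[1].toUpperCase();
  }
  if (lastWord !== null) {
    return { pass: BLOCKING_WORDS.indexOf(lastWord) === -1, reason: null };
  }

  // JSON form: {"continue": false, ...} or {"verdict": "block", ...}.
  try {
    const parsed = JSON.parse(output.trim());
    if (parsed !== null && typeof parsed === "object") {
      const blocked =
        parsed.continue === false ||
        String(parsed.verdict).toLowerCase() === "block";
      if (blocked) {
        return { pass: false, reason: typeof parsed.reason === "string" ? parsed.reason : null };
      }
    }
  } catch {
    // Not JSON: fall through to the fail-open default.
  }

  // Ambiguous output fails open, so a missing verdict never halts the flow.
  return { pass: true, reason: null };
}
```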
|
|
241
|
+
|
|
242
|
+
## Interpolation
|
|
243
|
+
|
|
244
|
+
| placeholder | resolves to |
|
|
245
|
+
|---|---|
|
|
246
|
+
| `{args.X}` | invocation argument |
|
|
247
|
+
| `{steps.ID.output}` | a prior phase's text output |
|
|
248
|
+
| `{steps.ID.json}` | prior output parsed as JSON (or `{steps.ID.json.field}`) |
|
|
249
|
+
| `{item}` / `{item.field}` | current item inside a `map` phase |
|
|
250
|
+
| `{previous.output}` | the immediately-upstream phase output |
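The placeholder table can be read as dotted-path lookups into a per-phase context. The sketch below shows one way that could work; it is not the package's `interpolate.ts`. `InterpolationContext`, `lookup`, and `interpolate` are hypothetical names, and two behaviors are assumptions: unresolved placeholders are left untouched, and non-string values are rendered with `JSON.stringify`.

```typescript
// Hypothetical context holding everything a placeholder may refer to.
interface InterpolationContext {
  args: Record<string, unknown>;
  steps: Record<string, { output: string; json?: unknown }>;
  item?: unknown; // only bound inside a `map` phase
  previous?: { output: string };
}

// Walk a dotted path such as ["steps", "discover", "json", "0"].
function lookup(root: unknown, path: string[]): unknown {
  let current: unknown = root;
  for (const key of path) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Replace `{a.b.c}` references; anything that does not resolve stays as written.
function interpolate(template: string, ctx: InterpolationContext): string {
  const placeholder = /\{([A-Za-z_][\w-]*(?:\.[\w-]+)*)\}/g;
  return template.replace(placeholder, (whole: string, ref: string) => {
    const value = lookup(ctx, ref.split("."));
    if (value === undefined) return whole; // assumption: unresolved refs are kept
    return typeof value === "string" ? value : JSON.stringify(value);
  });
}
```

Because the pattern only matches identifier-like paths, literal JSON inside a task string (for example `[{"file":""}]`) passes through unchanged.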
|
|
251
|
+
|
|
252
|
+
## Commands
|
|
253
|
+
|
|
254
|
+
Saved flows become slash-command shortcuts. All commands run inside a pi session:
|
|
255
|
+
|
|
256
|
+
| Command | What it does |
|
|
257
|
+
|---|---|
|
|
258
|
+
| `/tf list` | List all saved flows |
|
|
259
|
+
| `/tf run <name> [args]` | Run a saved flow (e.g. `/tf run summarize-files dir=src`) |
|
|
260
|
+
| `/tf show <name>` | Print a flow's definition |
|
|
261
|
+
| `/tf runs` | Browse recent run history (interactive TUI) |
|
|
262
|
+
| `/tf resume <runId>` | Continue a paused/failed run — cached phases skip automatically |
|
|
263
|
+
| `/tf:<name> [args]` | Shortcut — runs the flow in one tap |
|
|
264
|
+
|
|
265
|
+
Tool actions (used by the model): `run` (inline `define` or saved `name`),
|
|
266
|
+
`save`, `resume`, `list`.
|
|
92
267
|
|
|
93
268
|
## Storage
|
|
94
269
|
|
|
@@ -98,30 +273,74 @@ Tool actions: `run` (inline `define` or saved `name`), `save`, `resume`, `list`.
|
|
|
98
273
|
.pi/taskflows/runs/<runId>.json # run state (resume); gitignore this
|
|
99
274
|
```
|
|
100
275
|
|
|
276
|
+
Agent discovery scope (set via `agentScope` in the flow definition):
|
|
277
|
+
|
|
278
|
+
| value | discovers agents from |
|
|
279
|
+
|---|---|
|
|
280
|
+
| `"user"` (default) | `~/.pi/agent/agents/*.md` |
|
|
281
|
+
| `"project"` | `.pi/agents/*.md` (walks up the tree) |
|
|
282
|
+
| `"both"` | user + project; project wins on name collision |
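The precedence rule for `"both"` can be sketched as a name-keyed merge in which project agents are written last. `AgentInfo` and `mergeAgents` are hypothetical names for this example, and the two input lists stand in for whatever the discovery step reads from the user and project directories.

```typescript
type AgentScope = "user" | "project" | "both";

// Hypothetical minimal agent record; real agent configs carry more fields.
interface AgentInfo {
  name: string;
  source: "user" | "project";
}

// Merge by name; project entries are inserted last so they win on collision.
function mergeAgents(
  scope: AgentScope,
  userAgents: AgentInfo[],
  projectAgents: AgentInfo[],
): AgentInfo[] {
  const byName = new Map<string, AgentInfo>();
  if (scope !== "project") {
    for (const agent of userAgents) byName.set(agent.name, agent);
  }
  if (scope !== "user") {
    for (const agent of projectAgents) byName.set(agent.name, agent);
  }
  return Array.from(byName.values());
}
```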
|
|
283
|
+
|
|
101
284
|
## Agents
|
|
102
285
|
|
|
103
|
-
Taskflow reuses your existing pi
|
|
104
|
-
`.pi/agents/*.md`)
|
|
105
|
-
|
|
286
|
+
Taskflow reuses your existing pi agent files (`~/.pi/agent/agents/*.md`,
|
|
287
|
+
`.pi/agents/*.md`). Reference agents by `name` in a phase or shorthand.
|
|
288
|
+
|
|
289
|
+
When running a phase, the runtime extracts the agent's `systemPrompt` from its
|
|
290
|
+
`.md` frontmatter and passes it via `--append-system-prompt` (written to a temp
|
|
291
|
+
file). Phase-level overrides for `model`, `thinking`, and `tools` are passed as
|
|
292
|
+
`--model` / `--thinking` / `--tools` flags to the subagent invocation.
|
|
293
|
+
|
|
294
|
+
Settings from `~/.pi/agent/settings.json` (the `subagents.agentOverrides` map)
|
|
295
|
+
are honored, letting you tweak model, thinking, or tools per agent across all flows.
|
|
296
|
+
|
|
297
|
+
## Status & limits
|
|
298
|
+
|
|
299
|
+
- **v0.0.6** — control flow & reliability: conditional `when` guards, `join: any`
|
|
300
|
+
OR-joins, declarative `retry`/backoff, `approval` (human-in-the-loop) phases,
|
|
301
|
+
`flow` (saved sub-flow composition), and run-wide `budget` caps — on top of the
|
|
302
|
+
DSL + DAG runtime (`agent`/`parallel`/`map`/`gate`/`reduce`),
|
|
303
|
+
inline + saved flows, cross-session resume, live progress, isolated context.
|
|
304
|
+
Default `concurrency` is 8 (set on the flow; per-phase `concurrency` overrides
|
|
305
|
+
for that phase).
|
|
306
|
+
- A run executes as one streaming tool call (live progress while it runs).
|
|
307
|
+
- `map` requires the upstream phase to emit a JSON array (`output: "json"`).
|
|
308
|
+
- Gate verdicts are **fail-open**: if the agent output contains no recognizable
|
|
309
|
+
verdict marker (`VERDICT: BLOCK/PASS/OK/FAIL/STOP/REJECT/HALT` or
|
|
310
|
+
`{continue: false}` / `{verdict: "block"}`), the gate passes. This prevents
|
|
311
|
+
an accidental missing verdict from blocking your workflow.
|
|
312
|
+
|
|
313
|
+
### What it doesn't do (yet)
|
|
314
|
+
|
|
315
|
+
- **No detached background execution.** A run needs the pi session to stay open.
|
|
316
|
+
True background execution (and event/cron triggers on top of it) is on the
|
|
317
|
+
roadmap.
|
|
318
|
+
- **No `output: "file"`.** Outputs are text/JSON only. Write files via agent
|
|
319
|
+
tool calls if needed.
|
|
320
|
+
- **`map` requires a JSON array.** The `over` field must resolve to
|
|
321
|
+
`{steps.ID.json}` where the upstream phase emitted `output: "json"`. If the
|
|
322
|
+
source is a plain text list, wrap it in a single-agent phase that outputs JSON.
|
|
323
|
+
- **Cycles are rejected at validation.** The DAG must be acyclic.
|
|
106
324
|
|
|
107
325
|
## Development
|
|
108
326
|
|
|
109
327
|
```bash
|
|
110
328
|
npm install
|
|
111
329
|
npm run typecheck
|
|
112
|
-
node --experimental-strip-types --test test/interpolate.test.ts
|
|
330
|
+
node --experimental-strip-types --test test/interpolate.test.ts \
|
|
331
|
+
test/condition.test.ts test/schema.test.ts test/usage.test.ts \
|
|
332
|
+
test/runtime.test.ts test/features.test.ts test/runner.test.ts \
|
|
333
|
+
test/store.test.ts test/agents.test.ts test/render.test.ts test/desugar.test.ts
|
|
113
334
|
|
|
114
335
|
# real end-to-end (spawns live subagents; needs model access)
|
|
115
336
|
PI_TASKFLOW_PI_BIN=pi node --experimental-strip-types test/e2e.mts
|
|
116
337
|
```
|
|
117
338
|
|
|
118
|
-
##
|
|
339
|
+
## Contributing
|
|
119
340
|
|
|
120
|
-
|
|
121
|
-
|
|
122
|
-
|
|
123
|
-
detached background execution is on the roadmap.
|
|
124
|
-
- `map` requires the upstream phase to emit a JSON array (`output: "json"`).
|
|
341
|
+
Contributions welcome! This is a young project — open an issue or PR on
|
|
342
|
+
[GitHub](https://github.com/heggria/pi-taskflow). Tests live in `test/`, the
|
|
343
|
+
runtime in `extensions/`.
|
|
125
344
|
|
|
126
345
|
## License
|
|
127
346
|
|
|
@@ -0,0 +1,56 @@
|
|
|
1
|
+
{
|
|
2
|
+
"name": "conditional-research",
|
|
3
|
+
"description": "Triage a topic, route to a deep or quick research branch (conditional `when` + OR-join), gate the result, and report.",
|
|
4
|
+
"version": 1,
|
|
5
|
+
"args": {
|
|
6
|
+
"topic": { "description": "The subject to research" }
|
|
7
|
+
},
|
|
8
|
+
"concurrency": 4,
|
|
9
|
+
"agentScope": "user",
|
|
10
|
+
"budget": { "maxUSD": 1.0 },
|
|
11
|
+
"phases": [
|
|
12
|
+
{
|
|
13
|
+
"id": "triage",
|
|
14
|
+
"type": "agent",
|
|
15
|
+
"agent": "analyst",
|
|
16
|
+
"task": "Decide how much research the topic \"{args.topic}\" needs. Output ONLY JSON: {\"route\": \"deep\"} for broad/ambiguous topics, or {\"route\": \"quick\"} for narrow/well-defined ones.",
|
|
17
|
+
"output": "json"
|
|
18
|
+
},
|
|
19
|
+
{
|
|
20
|
+
"id": "deep",
|
|
21
|
+
"type": "agent",
|
|
22
|
+
"agent": "explorer",
|
|
23
|
+
"when": "{steps.triage.json.route} == deep",
|
|
24
|
+
"dependsOn": ["triage"],
|
|
25
|
+
"task": "Do a thorough, multi-angle investigation of \"{args.topic}\". Cover background, key players, trade-offs, and open questions.",
|
|
26
|
+
"retry": { "max": 2, "backoffMs": 500, "factor": 2 }
|
|
27
|
+
},
|
|
28
|
+
{
|
|
29
|
+
"id": "quick",
|
|
30
|
+
"type": "agent",
|
|
31
|
+
"agent": "explorer",
|
|
32
|
+
"when": "{steps.triage.json.route} == quick",
|
|
33
|
+
"dependsOn": ["triage"],
|
|
34
|
+
"task": "Give a concise, focused answer on \"{args.topic}\" with the 3 most important points.",
|
|
35
|
+
"retry": { "max": 2, "backoffMs": 500, "factor": 2 }
|
|
36
|
+
},
|
|
37
|
+
{
|
|
38
|
+
"id": "review",
|
|
39
|
+
"type": "gate",
|
|
40
|
+
"agent": "critic",
|
|
41
|
+
"join": "any",
|
|
42
|
+
"from": ["deep", "quick"],
|
|
43
|
+
"dependsOn": ["deep", "quick"],
|
|
44
|
+
"task": "Review the research below for unsupported claims. If it is solid, end with 'VERDICT: PASS'; if it is too thin, end with 'VERDICT: BLOCK' and one reason.\n\n{steps.deep.output}{steps.quick.output}"
|
|
45
|
+
},
|
|
46
|
+
{
|
|
47
|
+
"id": "report",
|
|
48
|
+
"type": "reduce",
|
|
49
|
+
"from": ["review"],
|
|
50
|
+
"dependsOn": ["review"],
|
|
51
|
+
"agent": "doc-writer",
|
|
52
|
+
"task": "Write a clean markdown brief on \"{args.topic}\" from the validated research:\n\n{steps.deep.output}{steps.quick.output}",
|
|
53
|
+
"final": true
|
|
54
|
+
}
|
|
55
|
+
]
|
|
56
|
+
}
|
|
@@ -0,0 +1,50 @@
|
|
|
1
|
+
{
|
|
2
|
+
"name": "guarded-refactor",
|
|
3
|
+
"description": "Plan a change, pause for human approval before touching code (HITL), then implement with retries and a final review gate. Budget-capped.",
|
|
4
|
+
"version": 1,
|
|
5
|
+
"args": {
|
|
6
|
+
"target": { "default": "src", "description": "Directory or module to refactor" },
|
|
7
|
+
"goal": { "description": "What the refactor should achieve" }
|
|
8
|
+
},
|
|
9
|
+
"concurrency": 4,
|
|
10
|
+
"agentScope": "user",
|
|
11
|
+
"budget": { "maxUSD": 2.0 },
|
|
12
|
+
"phases": [
|
|
13
|
+
{
|
|
14
|
+
"id": "plan",
|
|
15
|
+
"type": "agent",
|
|
16
|
+
"agent": "planner",
|
|
17
|
+
"task": "Produce a concrete, step-by-step refactor plan for {args.target} to achieve: {args.goal}. List the files you would change and the risks."
|
|
18
|
+
},
|
|
19
|
+
{
|
|
20
|
+
"id": "approve",
|
|
21
|
+
"type": "approval",
|
|
22
|
+
"dependsOn": ["plan"],
|
|
23
|
+
"task": "Review the refactor plan above. Approve to proceed, reject to abort, or edit to add constraints the implementer must follow."
|
|
24
|
+
},
|
|
25
|
+
{
|
|
26
|
+
"id": "implement",
|
|
27
|
+
"type": "agent",
|
|
28
|
+
"agent": "executor_code",
|
|
29
|
+
"dependsOn": ["approve"],
|
|
30
|
+
"task": "Implement the approved plan for {args.target}.\nPlan:\n{steps.plan.output}\nExtra human guidance (if any):\n{steps.approve.output}",
|
|
31
|
+
"retry": { "max": 1, "backoffMs": 1000 }
|
|
32
|
+
},
|
|
33
|
+
{
|
|
34
|
+
"id": "review",
|
|
35
|
+
"type": "gate",
|
|
36
|
+
"agent": "reviewer",
|
|
37
|
+
"dependsOn": ["implement"],
|
|
38
|
+
"task": "Review the implementation report below. If it is correct and complete, end with 'VERDICT: PASS'; otherwise 'VERDICT: BLOCK' with reasons.\n\n{steps.implement.output}"
|
|
39
|
+
},
|
|
40
|
+
{
|
|
41
|
+
"id": "summary",
|
|
42
|
+
"type": "reduce",
|
|
43
|
+
"from": ["review"],
|
|
44
|
+
"dependsOn": ["review"],
|
|
45
|
+
"agent": "doc-writer",
|
|
46
|
+
"task": "Write a short changelog entry summarizing what was done:\n\n{steps.implement.output}",
|
|
47
|
+
"final": true
|
|
48
|
+
}
|
|
49
|
+
]
|
|
50
|
+
}
|
package/extensions/agents.ts
CHANGED
|
@@ -55,7 +55,14 @@ function loadAgentsFromDir(dir: string, source: "user" | "project"): AgentConfig
|
|
|
55
55
|
continue;
|
|
56
56
|
}
|
|
57
57
|
|
|
58
|
-
const { frontmatter, body } =
|
|
58
|
+
const { frontmatter, body } = (() => {
|
|
59
|
+
try {
|
|
60
|
+
return parseFrontmatter<Record<string, string>>(content);
|
|
61
|
+
} catch {
|
|
62
|
+
// A single malformed agent file must not break discovery for every flow.
|
|
63
|
+
return { frontmatter: {} as Record<string, string>, body: "" };
|
|
64
|
+
}
|
|
65
|
+
})();
|
|
59
66
|
if (!frontmatter.name || !frontmatter.description) continue;
|
|
60
67
|
|
|
61
68
|
const tools = frontmatter.tools
|
package/extensions/index.ts
CHANGED
|
@@ -18,8 +18,8 @@ import { Type } from "typebox";
|
|
|
18
18
|
import { type AgentScope, discoverAgents, readSubagentSettings } from "./agents.ts";
|
|
19
19
|
import { renderRunResult, summarizeRun } from "./render.ts";
|
|
20
20
|
import { RunHistoryComponent, type RunHistoryResult } from "./runs-view.ts";
|
|
21
|
-
import { executeTaskflow, type RuntimeResult } from "./runtime.ts";
|
|
22
|
-
import { finalPhase, type Taskflow, validateTaskflow, desugar, isShorthand } from "./schema.ts";
|
|
21
|
+
import { executeTaskflow, type ApprovalDecision, type ApprovalRequest, type RuntimeResult } from "./runtime.ts";
|
|
22
|
+
import { finalPhase, resolveArgs, type Taskflow, validateTaskflow, desugar, isShorthand } from "./schema.ts";
|
|
23
23
|
import {
|
|
24
24
|
getFlow,
|
|
25
25
|
listFlows,
|
|
@@ -86,17 +86,6 @@ const TaskflowParams = Type.Object({
|
|
|
86
86
|
),
|
|
87
87
|
});
|
|
88
88
|
|
|
89
|
-
function resolveArgs(def: Taskflow, provided: Record<string, unknown> | undefined): Record<string, unknown> {
|
|
90
|
-
const args: Record<string, unknown> = {};
|
|
91
|
-
for (const [key, spec] of Object.entries(def.args ?? {})) {
|
|
92
|
-
if (provided && key in provided) args[key] = provided[key];
|
|
93
|
-
else if (spec.default !== undefined) args[key] = spec.default;
|
|
94
|
-
}
|
|
95
|
-
// also pass through any extra provided args
|
|
96
|
-
if (provided) for (const [k, v] of Object.entries(provided)) if (!(k in args)) args[k] = v;
|
|
97
|
-
return args;
|
|
98
|
-
}
|
|
99
|
-
|
|
100
89
|
function makeRunState(def: Taskflow, args: Record<string, unknown>, cwd: string): RunState {
|
|
101
90
|
return {
|
|
102
91
|
runId: newRunId(def.name),
|
|
@@ -153,6 +142,29 @@ async function runFlow(
|
|
|
153
142
|
(heartbeat as { unref?: () => void }).unref?.();
|
|
154
143
|
}
|
|
155
144
|
|
|
145
|
+
// Human-in-the-loop approver — only when an interactive UI is available.
|
|
146
|
+
const requestApproval = ctx.hasUI
|
|
147
|
+
? async (req: ApprovalRequest): Promise<ApprovalDecision> => {
|
|
148
|
+
if (req.upstream?.trim()) {
|
|
149
|
+
const snip = req.upstream.replace(/\s+/g, " ").trim();
|
|
150
|
+
ctx.ui.notify(`[${def.name}/${req.phaseId}] ${snip.length > 280 ? `${snip.slice(0, 280)}…` : snip}`, "info");
|
|
151
|
+
}
|
|
152
|
+
const choice = await ctx.ui.select(
|
|
153
|
+
`Taskflow approval — ${req.phaseId}: ${req.message}`,
|
|
154
|
+
["Approve", "Reject", "Edit / add guidance"],
|
|
155
|
+
{ signal },
|
|
156
|
+
);
|
|
157
|
+
if (!choice || choice === "Reject") return { decision: "reject" };
|
|
158
|
+
if (choice.startsWith("Edit")) {
|
|
159
|
+
const note = await ctx.ui.input("Guidance passed downstream as this phase's output", "type guidance…", {
|
|
160
|
+
signal,
|
|
161
|
+
});
|
|
162
|
+
return { decision: "edit", note: note ?? "" };
|
|
163
|
+
}
|
|
164
|
+
return { decision: "approve" };
|
|
165
|
+
}
|
|
166
|
+
: undefined;
|
|
167
|
+
|
|
156
168
|
try {
|
|
157
169
|
const result = await executeTaskflow(state, {
|
|
158
170
|
cwd: ctx.cwd,
|
|
@@ -160,6 +172,8 @@ async function runFlow(
|
|
|
160
172
|
globalThinking: settings.globalThinking,
|
|
161
173
|
signal,
|
|
162
174
|
persist: persistThrottled,
|
|
175
|
+
requestApproval,
|
|
176
|
+
loadFlow: (name: string) => getFlow(ctx.cwd, name)?.def,
|
|
163
177
|
});
|
|
164
178
|
return result;
|
|
165
179
|
} finally {
|
|
@@ -199,11 +213,12 @@ export default function (pi: ExtensionAPI) {
|
|
|
199
213
|
label: "Taskflow",
|
|
200
214
|
description: [
|
|
201
215
|
"Orchestrate a multi-phase workflow of subagents from a declarative definition.",
|
|
202
|
-
"Phases (agent, parallel, map, gate, reduce) form a DAG; intermediate outputs stay out of your context — only the final phase output is returned.",
|
|
216
|
+
"Phases (agent, parallel, map, gate, reduce, approval, flow) form a DAG; intermediate outputs stay out of your context — only the final phase output is returned.",
|
|
203
217
|
"Use action=run with an inline `define` (you write the DSL) or a saved `name`.",
|
|
204
218
|
"For simple non-DAG delegations (like the subagent tool) skip the DSL: pass `task` (+optional `agent`) for one task, `tasks:[{task,agent?}]` to run in parallel, or `chain:[{task,agent?}]` to run sequentially (reference the prior step with {previous.output}).",
|
|
205
219
|
"Use action=save to persist a definition as a reusable /tf:<name> command. action=resume continues a paused run. action=list shows saved flows.",
|
|
206
|
-
"DSL: {name, args?, concurrency?, phases:[{id, type, agent, task, dependsOn?, over?(map), as?(map), branches?(parallel), from?(reduce), output?:'json', final?}]}.",
|
|
220
|
+
"DSL: {name, args?, concurrency?, budget?:{maxUSD,maxTokens}, phases:[{id, type, agent, task, dependsOn?, join?:'all'|'any', when?, retry?:{max,backoffMs,factor}, over?(map), as?(map), branches?(parallel), from?(reduce), use?(flow), with?(flow), output?:'json', final?}]}.",
|
|
221
|
+
"Phase types: agent (one subagent), parallel (static branches), map (dynamic fan-out over an array), gate (VERDICT: PASS/BLOCK quality gate), reduce (aggregate from N phases), approval (human-in-the-loop pause), flow (run a saved sub-flow). join:'any' is an OR-join; when is a conditional guard; retry adds backoff; budget caps run cost.",
|
|
207
222
|
"Interpolation: {args.X}, {steps.ID.output}, {steps.ID.json}, {item} (map), {previous.output}.",
|
|
208
223
|
].join(" "),
|
|
209
224
|
parameters: TaskflowParams,
|