pi-taskflow 0.0.18 → 0.0.20
- package/CHANGELOG.md +26 -0
- package/README.md +2 -2
- package/extensions/approval-view.ts +264 -0
- package/extensions/detached-runner.ts +79 -0
- package/extensions/index.ts +83 -10
- package/extensions/interpolate.ts +1 -1
- package/extensions/runner.ts +18 -3
- package/extensions/runtime.ts +13 -10
- package/extensions/store.ts +39 -6
- package/package.json +2 -2
- package/skills/taskflow/SKILL.md +118 -15
- package/skills/taskflow/configuration.md +115 -5
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,32 @@

 All notable changes to pi-taskflow are documented here. This project follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) format.

+## [0.0.20] — 2026-06-10
+
+### Added
+
+- **Background (detached) execution — `detach: true`.** Run a taskflow in a detached child process without blocking the current session. Pass `detach: true` and get a `runId` back immediately; the flow executes in the background, persisting state to the store. Status polled via `/tf runs` and `resume` works as normal.
+- `extensions/detached-runner.ts` (new): lightweight child-process entry script — reads serialized context, calls `executeTaskflow`, persists terminal state.
+- `extensions/index.ts`: `detach: Boolean` parameter on the taskflow tool + child-process spawn logic (records PID in `RunState`).
+- `extensions/store.ts`: `RunState` gains `pid?: number` + `detached?: boolean` fields; `isProcessAlive(pid)` stale-PID helper.
+- Design: entry-point spawn wrapper — zero changes to the 1340-line `runtime.ts` core, no new phase type, no DSL version bump, fully backward-compatible.
+- Approval phases auto-reject in background mode. Idle watchdog kills stalled children. Stale PID detection via signal-0 probe.
+- 8 new tests (`test/detached.test.ts`): process-alive, PID persistence, end-to-end detached, crash→failed, resume after failure, stale PID, backward compat.
+
+### Fixed
+
+- `approvalView` initialization robustness: throws a clear error when the approval view module is unavailable, preventing silent failures in detached/background mode.
+
+## [0.0.19] — 2026-06-10
+
+### Documentation
+
+- **Closed the SKILL coverage gap — the LLM can now author every shipped feature.** A schema-vs-SKILL.md audit (`docs/internal/skill-coverage-audit.md`, machine-checked + cross-adversarial reviewed) found several implemented + tested features that were undocumented in the LLM-facing skill, so the model never generated them. All ~46 user-facing schema fields are now documented across SKILL.md + configuration.md.
+- **SKILL.md**: phase-type table now lists all 9 types (added `loop`, `tournament`) with a “details” column pointing each to its section; new **Loop phases** (`until`/`maxIterations`/`convergence`) and **Tournament phases** (`variants`/`judge`/`mode`/`judgeAgent`) sections; `eval` (zero-token machine gate) and `onBlock: "retry"` (self-healing rework loop) folded into the Gate section; cross-run `cache` pointer + `optional` + static `branches` notes.
+- **SKILL.md**: new **Operating a run** section — run lifecycle (`running → completed/blocked/failed/paused`), cache-aware resume, when to resume vs. re-run, budget-mid-run behavior, and run inspection. Clarified action semantics (`define` vs `name`, save scope/collision, `verify`/`agents` actions).
+- **configuration.md**: new **§2.1 Context pre-reading** (`context`/`contextLimit` — resolution order, per-file 8000-char cap, 200k total cap) and **§8 Cross-run caching** (`cache.scope`, `ttl`, full `fingerprint` prefix table for git/glob/glob!/file/env). Fixed a stale “5 phase types” → 9 cross-file drift.
+- Every documented JSON example validates against the live schema; all run-status/resume claims verified against the runtime (`blocked` is terminal; `paused`/`failed` are resumable). 560 tests pass, zero regression.
+
+### CI
+
+- GitHub Packages publish is now best-effort (`continue-on-error`) so an unscoped-package 404 there can never block the npm publish or the GitHub Release.
+
 ## [0.0.18] — 2026-06-09

 ### Added
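Note on the 0.0.20 entry: the changelog names a signal-0 probe and an `isProcessAlive(pid)` helper in `store.ts`, but the helper body is not part of this excerpt. The following is a minimal sketch of the general technique, assuming Node.js. It is an illustration only and may differ from what the package ships.

```typescript
// Hypothetical sketch of a signal-0 liveness probe; not the package's code.
function isProcessAlive(pid: number): boolean {
  // Only plain positive integers are PIDs; 0 and negatives address process groups.
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    // Signal 0 runs the existence and permission checks without delivering a signal.
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but belongs to another user.
    // Anything else (typically ESRCH) is treated as "not running".
    return (err as { code?: string }).code === "EPERM";
  }
}
```

A probe like this is only a hint: the operating system can reuse a PID for an unrelated process, so a stored PID that answers the probe is not proof that the original run is still executing.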
package/README.md
CHANGED
@@ -131,7 +131,7 @@ The Pi ecosystem now has **20+ delegation, workflow, and orchestration extension
 - **`pi-subagents` / `@gotgenes/pi-subagents`** are the mature picks for ad-hoc "use reviewer on this diff" delegation and background jobs. `pi-taskflow` is for when those delegations need to become a *repeatable, resumable pipeline*.
 - **`pi-pipeline` / `pi-agent-flow`** ship *opinionated, fixed* flows. `pi-taskflow` ships an *empty canvas*: you (or the model) declare the graph that fits the job.

-> The honest one-liner: **`pi-taskflow` is the only Pi extension that gives you a *declarative, verifiable, resumable* DAG of task nodes — saved as a one-word command, with zero runtime dependencies and context isolation by design.** Where code-mode workflows let the model *script* the work, `pi-taskflow` lets it *declare a graph the runtime can prove correct before running.* The known gaps it's closing next:
+> The honest one-liner: **`pi-taskflow` is the only Pi extension that gives you a *declarative, verifiable, resumable* DAG of task nodes — saved as a one-word command, with zero runtime dependencies and context isolation by design.** Where code-mode workflows let the model *script* the work, `pi-taskflow` lets it *declare a graph the runtime can prove correct before running.* The known gaps it's closing next: worktree isolation (see [`STRATEGY.md`](./STRATEGY.md)).

 ## 30-second start

@@ -641,7 +641,7 @@ Our `self-improve` flow is a 10-phase DAG — it audits the codebase, patches de

 Known boundaries (tracked, bounded — no surprises mid-flow):

-- **
+- **Detached background execution (new).** Add `detach: true` to `action: "run"` to spawn the flow in a detached child process. The tool returns immediately with the `runId`; the flow continues running even if the host session exits. Status is polled via the store (`/tf runs` or `action: "resume"`). Approval phases auto-reject in detached mode.
 - **No `output: "file"`.** Outputs are text/JSON only — write files via an agent's `write` tool call.
 - **`map` requires a JSON array.** The `over` field must resolve to a `{steps.ID.json}` array. Wrap a text list in a single-agent `output: "json"` phase first.
 - **The DAG must be acyclic.** Cycles are rejected at validation.
package/extensions/approval-view.ts
@@ -0,0 +1,264 @@
+/**
+ * Modal approval dialog for `approval` phases (ctx.ui.custom with overlay).
+ *
+ * Rendered as a centered bordered popup: the full upstream output (e.g. a
+ * plan) is shown in a scrollable viewport so long content can be reviewed
+ * before deciding. Every line is padded to the full dialog width so the
+ * overlay composites cleanly (no see-through, no ghosting in scrollback).
+ *
+ * While the dialog is open, SGR mouse reporting is enabled so the wheel
+ * scrolls the viewport instead of the terminal scrollback. It is restored
+ * on dispose.
+ *
+ * Keys: wheel/↑↓ scroll · PgUp/PgDn page · Home/End jump ·
+ * a/Enter approve · e edit (guidance) · r/Esc reject.
+ */
+
+import type { Theme } from "@earendil-works/pi-coding-agent";
+import { matchesKey, truncateToWidth, visibleWidth, wrapTextWithAnsi } from "@earendil-works/pi-tui";
+
+export type ApprovalChoice = "approve" | "reject" | "edit";
+
+export interface ApprovalViewOptions {
+  /** Header title, e.g. "Taskflow approval — flow/phase". */
+  title: string;
+  /** Interpolated approval prompt. */
+  message: string;
+  /** Full upstream phase output (the content being approved). */
+  upstream?: string;
+}
+
+/** Minimal writer used to toggle terminal mouse reporting. */
+export interface TerminalWriter {
+  write(data: string): void;
+}
+
+const FALLBACK_ROWS = 24;
+/** Wheel ticks scroll this many lines. */
+const WHEEL_STEP = 3;
+/** SGR mouse sequence: ESC [ < B ; X ; Y (M|m) */
+const MOUSE_SGR = /^\x1b\[<(\d+);(\d+);(\d+)([Mm])$/;
+/** Enable basic mouse tracking + SGR encoding. */
+const MOUSE_ON = "\x1b[?1000h\x1b[?1006h";
+/** Restore: disable SGR encoding + mouse tracking. */
+const MOUSE_OFF = "\x1b[?1006l\x1b[?1000l";
+
+export class ApprovalViewComponent {
+  private theme: Theme;
+  private opts: ApprovalViewOptions;
+  private onDone: (choice: ApprovalChoice) => void;
+  private getRows: () => number;
+  private term?: TerminalWriter;
+  private scrollOffset = 0;
+  private cachedWidth?: number;
+  private cachedBody?: string[];
+  private mouseEnabled = false;
+  private decided = false;
+
+  constructor(
+    theme: Theme,
+    opts: ApprovalViewOptions,
+    onDone: (choice: ApprovalChoice) => void,
+    getRows?: () => number,
+    term?: TerminalWriter,
+  ) {
+    this.theme = theme;
+    this.opts = opts;
+    this.onDone = onDone;
+    this.getRows = getRows ?? (() => FALLBACK_ROWS);
+    this.term = term;
+    this.enableMouse();
+  }
+
+  private enableMouse(): void {
+    if (this.term && !this.mouseEnabled) {
+      try {
+        this.term.write(MOUSE_ON);
+        this.mouseEnabled = true;
+      } catch {
+        // non-tty / closed stream — wheel support is best-effort
+      }
+    }
+  }
+
+  /** Restore terminal mouse state. Idempotent; call from the overlay's dispose. */
+  dispose(): void {
+    if (this.term && this.mouseEnabled) {
+      this.mouseEnabled = false;
+      try {
+        this.term.write(MOUSE_OFF);
+      } catch {
+        // ignore
+      }
+    }
+  }
+
+  private decide(choice: ApprovalChoice): void {
+    if (this.decided) return;
+    this.decided = true;
+    this.dispose();
+    this.onDone(choice);
+  }
+
+  private rows(): number {
+    try {
+      return this.getRows() || FALLBACK_ROWS;
+    } catch {
+      return FALLBACK_ROWS;
+    }
+  }
+
+  /** Visible body height given the message height — dialog targets ~80% of the terminal. */
+  private maxVisible(msgRows: number): number {
+    const avail = Math.max(10, Math.floor(this.rows() * 0.8));
+    // Chrome: top border, message rows, separator, scroll info, separator, hints, bottom border.
+    const chrome = 1 + msgRows + 1 + 1 + 1 + 1 + 1;
+    return Math.max(3, Math.min(avail - chrome, 60));
+  }
+
+  /** Wrap the upstream text to the viewport width (cached per width). */
+  private bodyLines(innerW: number): string[] {
+    if (this.cachedBody && this.cachedWidth === innerW) return this.cachedBody;
+    const out: string[] = [];
+    const upstream = (this.opts.upstream ?? "").replace(/\r\n/g, "\n").trimEnd();
+    if (upstream) {
+      for (const raw of upstream.split("\n")) {
+        if (!raw.trim()) {
+          out.push("");
+          continue;
+        }
+        for (const l of wrapTextWithAnsi(raw, innerW)) out.push(l);
+      }
+    }
+    this.cachedWidth = innerW;
+    this.cachedBody = out;
+    return out;
+  }
+
+  private msgLines(innerW: number): string[] {
+    const out: string[] = [];
+    for (const raw of this.opts.message.split("\n")) {
+      for (const l of wrapTextWithAnsi(raw, innerW)) out.push(l);
+    }
+    return out.length ? out : [""];
+  }
+
+  private maxOffset(totalLines: number, visible: number): number {
+    return Math.max(0, totalLines - visible);
+  }
+
+  private clampScroll(delta: number): void {
+    const total = this.cachedBody?.length ?? 0;
+    const visible = this.maxVisible(1);
+    const cap = this.maxOffset(total, visible);
+    this.scrollOffset = Math.max(0, Math.min(cap, this.scrollOffset + delta));
+  }
+
+  handleInput(data: string): void {
+    // Mouse events (SGR) — wheel scrolls, everything else is swallowed.
+    const mouse = MOUSE_SGR.exec(data);
+    if (mouse) {
+      const b = Number(mouse[1]);
+      if (b & 64) {
+        // Wheel: low two bits 0 = up, 1 = down.
+        if ((b & 3) === 0) this.clampScroll(-WHEEL_STEP);
+        else if ((b & 3) === 1) this.clampScroll(WHEEL_STEP);
+      }
+      return;
+    }
+    // Decisions
+    if (matchesKey(data, "return") || data === "a" || data === "y") {
+      this.decide("approve");
+      return;
+    }
+    if (data === "e") {
+      this.decide("edit");
+      return;
+    }
+    if (matchesKey(data, "escape") || matchesKey(data, "ctrl+c") || data === "r" || data === "n") {
+      this.decide("reject");
+      return;
+    }
+    // Scrolling (only meaningful when a body exists)
+    const page = this.maxVisible(1);
+    if (matchesKey(data, "up") || data === "k") {
+      this.clampScroll(-1);
+    } else if (matchesKey(data, "down") || data === "j") {
+      this.clampScroll(1);
+    } else if (matchesKey(data, "pageUp") || matchesKey(data, "ctrl+u")) {
+      this.clampScroll(-page);
+    } else if (matchesKey(data, "pageDown") || matchesKey(data, "ctrl+d") || matchesKey(data, "space")) {
+      this.clampScroll(page);
+    } else if (matchesKey(data, "home") || data === "g") {
+      this.scrollOffset = 0;
+    } else if (matchesKey(data, "end") || data === "G") {
+      this.clampScroll(Number.MAX_SAFE_INTEGER);
+    }
+  }
+
+  /** Pad `content` with spaces to exactly `w` visible columns (ANSI-aware). */
+  private pad(content: string, w: number): string {
+    const t = truncateToWidth(content, w);
+    return t + " ".repeat(Math.max(0, w - visibleWidth(t)));
+  }
+
+  /** A full-width dialog row: │ <content padded> │ */
+  private row(content: string, width: number): string {
+    const th = this.theme;
+    const inner = this.pad(content, Math.max(1, width - 4));
+    return th.fg("border", "│") + " " + inner + " " + th.fg("border", "│");
+  }
+
+  private hrule(width: number, left: string, right: string): string {
+    const th = this.theme;
+    return th.fg("border", left + "─".repeat(Math.max(0, width - 2)) + right);
+  }
+
+  render(width: number): string[] {
+    const th = this.theme;
+    const innerW = Math.max(20, width - 4);
+    const lines: string[] = [];
+
+    // Top border with embedded title
+    const title = truncateToWidth(` ${this.opts.title} `, Math.max(0, width - 6));
+    const fill = Math.max(0, width - 4 - visibleWidth(title));
+    lines.push(
+      th.fg("border", "╭─") + th.fg("accent", title) + th.fg("border", "─".repeat(fill) + "─╮"),
+    );
+
+    // Approval prompt
+    const msg = this.msgLines(innerW);
+    for (const l of msg) lines.push(this.row(th.fg("text", l), width));
+
+    // Scrollable upstream body
+    const body = this.bodyLines(innerW);
+    const visible = this.maxVisible(msg.length);
+    const cap = this.maxOffset(body.length, visible);
+    this.scrollOffset = Math.min(this.scrollOffset, cap);
+    if (body.length > 0) {
+      lines.push(this.hrule(width, "├", "┤"));
+      const slice = body.slice(this.scrollOffset, this.scrollOffset + visible);
+      while (slice.length < Math.min(visible, body.length)) slice.push("");
+      for (const l of slice) lines.push(this.row(l, width));
+      if (cap > 0) {
+        const above = this.scrollOffset;
+        const below = Math.max(0, body.length - visible - this.scrollOffset);
+        lines.push(
+          this.row(th.fg("dim", `↑${above} more · ↓${below} more (${body.length} lines)`), width),
+        );
+      }
+    }
+
+    // Key hints
+    lines.push(this.hrule(width, "├", "┤"));
+    const scrollHint = cap > 0 ? "wheel/↑↓/PgUp/PgDn scroll · " : "";
+    lines.push(this.row(th.fg("dim", `${scrollHint}a/Enter approve · e edit · r/Esc reject`), width));
+    lines.push(this.hrule(width, "╰", "╯"));
+    return lines;
+  }
+
+  invalidate(): void {
+    this.cachedWidth = undefined;
+    this.cachedBody = undefined;
+  }
+}
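Note on the wheel handling in `handleInput`: with SGR encoding, a mouse report arrives as `ESC [ < B ; X ; Y` followed by `M` or `m`. The dialog treats a button code `B` with bit 64 set as a wheel event and reads the low two bits as the direction. The standalone sketch below isolates that decoding step so it can be checked without a terminal. The names `decodeWheel` and `Wheel` are hypothetical and do not appear in the package.

```typescript
// Sketch of the SGR wheel decoding used by the dialog, isolated for illustration.
type Wheel = "up" | "down" | null;

// Same shape as the MOUSE_SGR pattern in the diff: ESC [ < B ; X ; Y (M|m)
const SGR_MOUSE = /^\x1b\[<(\d+);(\d+);(\d+)([Mm])$/;

function decodeWheel(data: string): Wheel {
  const match = SGR_MOUSE.exec(data);
  if (!match) return null; // not a mouse report at all
  const button = Number(match[1]);
  if ((button & 64) === 0) return null; // clicks and drags are not wheel events
  const direction = button & 3; // modifier bits (shift, meta, ctrl) sit above the low two bits
  if (direction === 0) return "up";
  if (direction === 1) return "down";
  return null; // horizontal wheel codes are ignored here, as in the diff
}
```

Masking with `& 3` instead of comparing against exact button codes means a wheel tick with a modifier key held is still recognised as a scroll.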
package/extensions/detached-runner.ts
@@ -0,0 +1,79 @@
+/**
+ * Detached runner — spawned as a child process for background (detached) runs.
+ *
+ * Reads a context JSON file (path passed as argv[2]), calls executeTaskflow,
+ * and persists the terminal state. Top-level try/catch writes status "failed"
+ * on crash. Approval phases auto-reject in detached mode (no interactive
+ * approver available).
+ *
+ * This file is NOT imported by index.ts — it is spawned via `child_process.spawn`.
+ */
+
+import { readFileSync } from "node:fs";
+import { type AgentScope, discoverAgents, readSubagentSettings } from "./agents.ts";
+import { executeTaskflow } from "./runtime.ts";
+import { getFlow, loadRun, saveRun, DEFAULT_KEPT_RUNS, DEFAULT_RUN_AGE_DAYS } from "./store.ts";
+
+interface DetachContext {
+  runId: string;
+  defName: string;
+  args: Record<string, unknown>;
+  cwd: string;
+}
+
+const contextPath = process.argv[2];
+if (!contextPath) {
+  console.error("[detached-runner] Missing context file path argument");
+  process.exit(1);
+}
+
+let ctx: DetachContext;
+try {
+  ctx = JSON.parse(readFileSync(contextPath, "utf-8")) as DetachContext;
+} catch (e) {
+  console.error(`[detached-runner] Failed to read context: ${e instanceof Error ? e.message : String(e)}`);
+  process.exit(1);
+}
+
+const cleanupConfig = { maxKeep: DEFAULT_KEPT_RUNS, maxAgeDays: DEFAULT_RUN_AGE_DAYS };
+
+try {
+  const state = loadRun(ctx.cwd, ctx.runId);
+  if (!state) {
+    console.error(`[detached-runner] Run not found: ${ctx.runId}`);
+    process.exit(1);
+  }
+
+  // Re-discover agents using the same settings as the host session.
+  const settings = readSubagentSettings();
+  cleanupConfig.maxKeep = settings.taskflow.maxKeptRuns;
+  cleanupConfig.maxAgeDays = settings.taskflow.maxRunAgeDays;
+  const scope: AgentScope = state.def.agentScope ?? "user";
+  const { agents } = discoverAgents(ctx.cwd, scope, settings.modelRoles, settings.taskflow);
+
+  const result = await executeTaskflow(state, {
+    cwd: ctx.cwd,
+    agents,
+    globalThinking: settings.globalThinking,
+    persist: (s) => saveRun(s, cleanupConfig),
+    // No requestApproval — approval phases auto-reject in detached mode
+    // (fail-open: phase records the rejection, run continues).
+    loadFlow: (name: string) => getFlow(ctx.cwd, name)?.def,
+  });
+
+  saveRun(result.state, cleanupConfig);
+} catch (e) {
+  // Top-level catch: persist failure so the host can poll the terminal state.
+  const message = e instanceof Error ? e.message : String(e);
+  console.error(`[detached-runner] Fatal: ${message}`);
+  try {
+    const state = loadRun(ctx.cwd, ctx.runId);
+    if (state && state.status === "running") {
+      state.status = "failed";
+      saveRun(state, cleanupConfig);
+    }
+  } catch {
+    // Best-effort — if we can't even load the state, there's nothing to persist.
+  }
+  process.exit(1);
+}
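Note on the context hand-off: the host serializes `{ runId, defName, args, cwd }` to a JSON file in the temp directory, and the runner reads it back from the path in `argv[2]` with a plain type cast. The sketch below shows the same round trip with a shape check on the reading side. The helper names `writeContext` and `readContext` are hypothetical, and the validation is an assumption about what a stricter reader might do, not something the diff contains.

```typescript
import { readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Same fields as the DetachContext interface in the diff.
interface DetachContext {
  runId: string;
  defName: string;
  args: Record<string, unknown>;
  cwd: string;
}

// Hypothetical helper: write the context where the diff does (os.tmpdir()).
function writeContext(ctx: DetachContext): string {
  const file = join(tmpdir(), `taskflow-detach-${ctx.runId}.json`);
  writeFileSync(file, JSON.stringify(ctx), "utf-8");
  return file;
}

// Hypothetical helper: parse and check the shape instead of casting blindly.
function readContext(file: string): DetachContext {
  const parsed: unknown = JSON.parse(readFileSync(file, "utf-8"));
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("context must be a JSON object");
  }
  const c = parsed as Partial<DetachContext>;
  if (typeof c.runId !== "string" || typeof c.defName !== "string" || typeof c.cwd !== "string") {
    throw new Error("context is missing runId, defName, or cwd");
  }
  return { runId: c.runId, defName: c.defName, args: c.args ?? {}, cwd: c.cwd };
}
```

Failing fast on a malformed file gives a clearer message than a later `loadRun` call with an undefined `cwd` or `runId` would.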
package/extensions/index.ts
CHANGED
@@ -27,6 +27,7 @@ import { Type } from "typebox";
 import { type AgentScope, discoverAgents, readSubagentSettings, shouldSyncBuiltinAgentsToProject, syncBuiltinAgentsToProject } from "./agents.ts";
 import { renderRunResult, summarizeRun } from "./render.ts";
 import { RunHistoryComponent, type RunHistoryResult } from "./runs-view.ts";
+import { ApprovalViewComponent, type ApprovalChoice } from "./approval-view.ts";
 import { executeTaskflow, type ApprovalDecision, type ApprovalRequest, type RuntimeResult } from "./runtime.ts";
 import { finalPhase, resolveArgs, type Taskflow, validateTaskflow, desugar, isShorthand } from "./schema.ts";
 import {
@@ -110,6 +111,11 @@ const TaskflowParams = Type.Object({
         "Destructive: overwrites modelRoles in settings.json. Required for mode='apply-defaults'.",
     }),
   ),
+  detach: Type.Optional(
+    Type.Boolean({
+      description: "Run in background (detached child process); return runId immediately. Status polled via store.",
+    }),
+  ),
 });

 function makeRunState(def: Taskflow, args: Record<string, unknown>, cwd: string): RunState {
@@ -166,19 +172,51 @@ async function runFlow(
   }

   // Human-in-the-loop approver — only when an interactive UI is available.
+  // Renders a centered modal popup (TUI overlay) with a scrollable viewport
+  // so long upstream output (e.g. a plan) can be reviewed in full before
+  // deciding (mouse wheel / ↑↓ / PgUp / PgDn to scroll).
   const requestApproval = ctx.hasUI
     ? async (req: ApprovalRequest): Promise<ApprovalDecision> => {
-
-
-
-
-
-
-
-
+        const choice = await ctx.ui.custom<ApprovalChoice>(
+          (tui, theme, _kb, done) => {
+            const view = new ApprovalViewComponent(
+              theme,
+              {
+                title: `Taskflow approval — ${def.name}/${req.phaseId}`,
+                message: req.message,
+                upstream: req.upstream,
+              },
+              done,
+              () => tui.terminal.rows,
+              tui.terminal,
+            );
+            const onAbort = () => done("reject");
+            signal?.addEventListener("abort", onAbort, { once: true });
+            return {
+              render: (w: number) => view.render(w),
+              invalidate: () => view.invalidate(),
+              handleInput: (data: string) => {
+                view.handleInput(data);
+                tui.requestRender();
+              },
+              dispose: () => {
+                view.dispose();
+                signal?.removeEventListener("abort", onAbort);
+              },
+            };
+          },
+          {
+            overlay: true,
+            overlayOptions: {
+              width: "80%",
+              minWidth: 60,
+              maxHeight: "85%",
+              anchor: "center",
+            },
+          },
         );
-        if (
-        if (choice
+        if (choice === "reject") return { decision: "reject" };
+        if (choice === "edit") {
           const note = await ctx.ui.input("Guidance passed downstream as this phase's output", "type guidance…", {
             signal,
           });
@@ -614,6 +652,41 @@ export default function (pi: ExtensionAPI) {
       for (const w of v.warnings) {
         console.warn(`[taskflow:${def.name}] ${w}`);
       }
+      // Detached (background) execution: spawn a child process and return immediately.
+      if (params.detach) {
+        const state = makeRunState(def, args, ctx.cwd);
+        state.detached = true;
+        saveRun(state);
+
+        // Serialize context for the detached runner script.
+        const { writeFileSync } = await import("node:fs");
+        const { spawn } = await import("node:child_process");
+        const os = await import("node:os");
+        const path = await import("node:path");
+        const tmpFile = path.join(os.tmpdir(), `taskflow-detach-${state.runId}.json`);
+        writeFileSync(tmpFile, JSON.stringify({
+          runId: state.runId,
+          defName: def.name,
+          args,
+          cwd: ctx.cwd,
+        }));
+
+        const runnerScript = path.join(path.dirname(new URL(import.meta.url).pathname), "detached-runner.ts");
+        const child = spawn(process.execPath, ["--experimental-strip-types", runnerScript, tmpFile], {
+          detached: true,
+          stdio: "ignore",
+        });
+        child.unref();
+
+        state.pid = child.pid ?? undefined;
+        saveRun(state);
+
+        return {
+          content: [{ type: "text", text: `Taskflow '${def.name}' started in background (pid: ${child.pid}). Run id: ${state.runId}` }],
+          details: { action, state, message: state.runId } satisfies TaskflowDetails,
+        };
+      }
+
       const result = await runFlow(def, args, ctx, signal, onUpdate as any);
       // Surface the validation warnings in the tool result so the model
       // can acknowledge or fix them, and the user sees them in the chat.
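Note on the spawn step: the diff resolves the runner script from `new URL(import.meta.url).pathname`, then starts it with `detached: true`, `stdio: "ignore"`, and `child.unref()` so the host can exit without waiting. A URL's `pathname` stays percent-encoded (a space is `%20`) and on Windows begins with `/C:/`, so the sketch below uses Node's `fileURLToPath` for the path step instead. This is a hedged alternative under those assumptions, not the package's code. The names `siblingPath` and `spawnDetached` are hypothetical.

```typescript
import { spawn } from "node:child_process";
import { dirname, join } from "node:path";
import { fileURLToPath } from "node:url";

// Hypothetical helper: resolve a file that sits next to the module at `moduleUrl`.
// fileURLToPath decodes percent-escapes and handles Windows drive letters.
function siblingPath(moduleUrl: string, fileName: string): string {
  return join(dirname(fileURLToPath(moduleUrl)), fileName);
}

// Hypothetical helper: start a script that outlives the parent process.
function spawnDetached(nodeArgs: string[], scriptPath: string, scriptArgs: string[]): number | undefined {
  const child = spawn(process.execPath, [...nodeArgs, scriptPath, ...scriptArgs], {
    detached: true, // new process group, so the child survives the parent's exit
    stdio: "ignore", // no pipes are kept open between parent and child
  });
  // A failed spawn is reported through an "error" event; without a listener it would throw.
  child.on("error", (err) => console.error(`spawn failed: ${err.message}`));
  child.unref(); // do not keep the parent's event loop alive for this child
  return child.pid; // undefined when the spawn itself failed
}
```

In an ES module a call might look like `spawnDetached(["--experimental-strip-types"], siblingPath(import.meta.url, "detached-runner.ts"), [tmpFile])`, mirroring the arguments in the diff.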
package/extensions/interpolate.ts
CHANGED
@@ -260,7 +260,7 @@ function tokenize(input: string): Tok[] {
       continue;
     }
     // number
-    const numMatch = /^-?\d+(?:\.\d+)?/.exec(input.slice(i));
+    const numMatch = /^-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?/.exec(input.slice(i));
     if (numMatch) {
       toks.push({ t: "num", v: Number(numMatch[0]) });
       i += numMatch[0].length;
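Note on the tokenizer change: the new pattern adds an optional exponent group, so a literal such as `1.5e-3` is consumed as a single number token. The sketch below runs both patterns from the diff against the same input. The helper `leadingNumber` is hypothetical and exists only for the comparison.

```typescript
// The two number patterns from the diff, before and after the change.
const OLD_NUMBER = /^-?\d+(?:\.\d+)?/;
const NEW_NUMBER = /^-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?/;

// Hypothetical helper: return the numeric prefix a pattern would consume, if any.
function leadingNumber(pattern: RegExp, input: string): string | null {
  const match = pattern.exec(input);
  return match ? match[0] : null;
}

// The old pattern stops before the exponent; the new one includes it.
const oldPrefix = leadingNumber(OLD_NUMBER, "1.5e-3 + 2");
const newPrefix = leadingNumber(NEW_NUMBER, "1.5e-3 + 2");

// An "e" with no digits after it is not an exponent, so only the mantissa matches.
const danglingE = leadingNumber(NEW_NUMBER, "2e");
```

Because the exponent group is optional and requires at least one digit, inputs without a valid exponent tokenize exactly as they did before the change.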
package/extensions/runner.ts
CHANGED
@@ -13,6 +13,15 @@ import { withFileMutationQueue } from "@earendil-works/pi-coding-agent";
 import type { AgentConfig } from "./agents.ts";
 import { emptyUsage, type UsageStats } from "./usage.ts";

+const activeChildren = new Set<number>();
+const killAll = () => {
+  for (const pid of activeChildren) {
+    try { process.kill(pid, "SIGKILL"); } catch { /* already dead */ }
+  }
+};
+process.on("exit", killAll);
+process.on("SIGTERM", () => { killAll(); process.exit(143); });
+
 export interface RunResult {
   agent: string;
   task: string;
@@ -345,6 +354,7 @@ export async function runAgentTask(
     shell: false,
     stdio: ["ignore", "pipe", "pipe"],
   });
+  if (proc.pid) activeChildren.add(proc.pid);
   let buffer = "";

   // Idle watchdog: a subagent that goes silent on stdout for too long is
@@ -389,13 +399,18 @@ export async function runAgentTask(
   // Cap prevents OOM from verbose tool output (e.g., npm install). 64 KB is
   // generous for error diagnosis while preventing memory exhaustion.
   const STDERR_MAX_LEN = 64 * 1024;
+  let stderrCapped = false;
   proc.stderr.on("data", (data) => {
-
-
-
+    if (!stderrCapped) {
+      result.stderr += data.toString();
+      if (result.stderr.length >= STDERR_MAX_LEN) {
+        result.stderr = result.stderr.slice(0, STDERR_MAX_LEN) + "\n[...stderr truncated at 64KB]";
+        stderrCapped = true;
+      }
     }
   });
   proc.on("close", (code, signal) => {
+    if (proc.pid) activeChildren.delete(proc.pid);
     clearTimers();
     if (buffer.trim()) processLine(buffer);
     if (code === null && signal) killedBySignal = signal;
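Note on the stderr cap: the new handler appends chunks until the accumulated text reaches the limit, then truncates once, appends a marker, and ignores everything after that. The sketch below isolates that pattern with a small limit so the behavior is easy to follow. The `createCappedBuffer` factory is hypothetical and is not how the package structures the code.

```typescript
// Hypothetical helper isolating the "append until capped, then ignore" pattern.
function createCappedBuffer(maxLen: number, marker: string) {
  let text = "";
  let capped = false;
  return {
    append(chunk: string): void {
      if (capped) return; // once capped, later chunks are dropped
      text += chunk;
      if (text.length >= maxLen) {
        // Cut back to the limit and record that truncation happened, exactly once.
        text = text.slice(0, maxLen) + marker;
        capped = true;
      }
    },
    value(): string {
      return text;
    },
  };
}

// Usage with a 10-character limit in place of the 64 KB used in the diff.
const demoBuffer = createCappedBuffer(10, "[cut]");
demoBuffer.append("12345");
demoBuffer.append("678901234"); // crosses the limit: truncated and marked
demoBuffer.append("ignored"); // dropped, the buffer is already capped
```

The explicit flag matters because the stored text is longer than the limit once the marker is appended, so a length check alone would re-truncate and re-append the marker on every later chunk.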
package/extensions/runtime.ts
CHANGED
@@ -1025,6 +1025,7 @@ async function executePhase(
     // Using indexOf on the stable `ran` array is reference-based and correct even
     // when two variants produce byte-identical output.
     const ranIdx = (r: RunResult) => ran.indexOf(r) + 1;
+    const budgetSkipCount = results.filter((r) => r.stopReason === "budget-skipped").length;

     // All competitors failed → the tournament fails (nothing to judge).
     if (ok.length === 0) {
@@ -1033,6 +1034,7 @@ async function executePhase(
         status: "failed",
         usage: variantUsage,
         error: `tournament '${phase.id}': all ${competitors.length} variants failed`,
+        budgetTruncated: budgetSkipCount > 0 || undefined,
         tournament: { variants: competitors.length, winner: 0, mode },
         inputHash: hashInput(phase.id, "tournament", String(competitors.length)),
         endedAt: Date.now(),
@@ -1047,6 +1049,7 @@ async function executePhase(
         json: parseJson ? safeParse(ok[0].output) : undefined,
         usage: variantUsage,
         model: ok[0].model,
+        budgetTruncated: budgetSkipCount > 0 || undefined,
         tournament: { variants: competitors.length, winner: ranIdx(ok[0]), mode, reason: "only surviving variant" },
         inputHash: hashInput(phase.id, "tournament", String(competitors.length)),
         endedAt: Date.now(),
@@ -1062,6 +1065,7 @@ async function executePhase(
         json: parseJson ? safeParse(ok[0].output) : undefined,
         usage: variantUsage,
         model: ok[0].model,
+        budgetTruncated: budgetSkipCount > 0 || undefined,
         warnings: ["judge skipped: run aborted or budget exceeded"],
         tournament: { variants: competitors.length, winner: ranIdx(ok[0]), mode, reason: "judge skipped" },
         inputHash: hashInput(phase.id, "tournament", String(competitors.length)),
@@ -1095,6 +1099,7 @@ async function executePhase(
         json: parseJson ? safeParse(ok[0].output) : undefined,
         usage: judgeUsage,
         model: ok[0].model,
+        budgetTruncated: budgetSkipCount > 0 || undefined,
         warnings: [`judge failed (${judgeRes.errorMessage ?? "error"}); used variant ${ranIdx(ok[0])}`],
         tournament: { variants: competitors.length, winner: ranIdx(ok[0]), mode, reason: "judge failed" },
         inputHash: hashInput(phase.id, "tournament", String(competitors.length)),
@@ -1117,6 +1122,7 @@ async function executePhase(
       json: parseJson ? safeParse(output) : undefined,
       usage: judgeUsage,
       model: mode === "aggregate" ? judgeRes.model : chosen.model,
+      budgetTruncated: budgetSkipCount > 0 || undefined,
       warnings: winnerIneligible ? [`judge picked an ineligible variant; used variant ${winnerIdx}`] : undefined,
       tournament: { variants: competitors.length, winner: winnerIdx, mode, reason },
       inputHash: hashInput(phase.id, "tournament", String(competitors.length), mode),
@@ -1398,12 +1404,10 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
|
|
|
1398
1404
|
let gateBlocked = false;
|
|
1399
1405
|
let gateReason = "";
|
|
1400
1406
|
let gateOutput = "";
|
|
1401
|
-
// `budgetBlocked` gates the skipping of remaining phases once the cap is hit
|
|
1402
|
-
//
|
|
1403
|
-
//
|
|
1404
|
-
// very last phase, with nothing left to skip, must NOT mark a good run failed).
|
|
1407
|
+
// `budgetBlocked` gates the skipping of remaining phases once the cap is hit
|
|
1408
|
+
// and also drives the terminal "blocked" status — a maxUSD ceiling must never
|
|
1409
|
+
// silently do nothing.
|
|
1405
1410
|
let budgetBlocked = false;
|
|
1406
|
-
let budgetSkipped = false;
|
|
1407
1411
|
let budgetReason = "";
|
|
1408
1412
|
const byId = new Map(def.phases.map((p) => [p.id, p]));
|
|
1409
1413
|
|
|
@@ -1442,7 +1446,7 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
|
|
|
1442
1446
|
}
|
|
1443
1447
|
|
|
1444
1448
|
if (skipReason) {
|
|
1445
|
-
if (skipReason.startsWith("Budget exceeded"))
|
|
1449
|
+
if (skipReason.startsWith("Budget exceeded")) budgetBlocked = true;
|
|
1446
1450
|
state.phases[phase.id] = {
|
|
1447
1451
|
id: phase.id,
|
|
1448
1452
|
status: "skipped",
|
|
@@ -1485,7 +1489,6 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
|
|
|
1485
1489
|
// A fan-out cut short by the cap is itself a budget skip.
|
|
1486
1490
|
if (ps.budgetTruncated) {
|
|
1487
1491
|
budgetBlocked = true;
|
|
1488
|
-
budgetSkipped = true;
|
|
1489
1492
|
if (!budgetReason) budgetReason = "fan-out truncated by budget";
|
|
1490
1493
|
}
|
|
1491
1494
|
// Budget ceiling: once exceeded, remaining phases are skipped.
|
|
@@ -1494,7 +1497,7 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
|
|
|
1494
1497
|
// the budget is detected as exceeded. This bounded overshoot is
|
|
1495
1498
|
// acceptable: budgetBlocked prevents cascading into subsequent layers.
|
|
1496
1499
|
const ob = overBudget(state);
|
|
1497
|
-
if (ob.over
|
|
1500
|
+
if (ob.over) {
|
|
1498
1501
|
budgetBlocked = true;
|
|
1499
1502
|
budgetReason = ob.reason;
|
|
1500
1503
|
}
|
|
@@ -1517,7 +1520,7 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
|
|
|
1517
1520
|
|
|
1518
1521
|
state.status = aborted
|
|
1519
1522
|
? "paused"
|
|
1520
|
-
: gateBlocked ||
|
|
1523
|
+
: gateBlocked || budgetBlocked
|
|
1521
1524
|
? "blocked"
|
|
1522
1525
|
: anyFailed
|
|
1523
1526
|
? "failed"
|
|
@@ -1527,7 +1530,7 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
|
|
|
1527
1530
|
let finalOutput = finalState?.output ?? "(no output)";
|
|
1528
1531
|
if (gateBlocked) {
|
|
1529
1532
|
finalOutput = `Gate blocked the workflow.${gateReason ? `\nReason: ${gateReason}` : ""}${gateOutput ? `\n\n${gateOutput}` : ""}`;
|
|
1530
|
-
} else if (
|
|
1533
|
+
} else if (budgetBlocked) {
|
|
1531
1534
|
finalOutput = `Budget exceeded — run halted.${budgetReason ? `\nReason: ${budgetReason}` : ""}${finalState?.output ? `\n\n${finalState.output}` : ""}`;
|
|
1532
1535
|
}
|
|
1533
1536
|
|
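Reviewer sketch for the `runtime.ts` hunks above: the separate `budgetSkipped` flag is removed and `budgetBlocked` alone now drives the terminal status, so any budget hit ends the run as `blocked`. The precedence can be illustrated as below. `resolveRunStatus` and `RunFlags` are hypothetical names, and the `"completed"` fallback is an assumption taken from the SKILL.md lifecycle text, since the hunk cuts the ternary off after `"failed"`.

```typescript
// Hypothetical helper mirroring the status ternary shown in the hunk above.
// The "completed" fallback is an assumption; the diff does not show that branch.
type RunStatus = "paused" | "blocked" | "failed" | "completed";

interface RunFlags {
  aborted: boolean;
  gateBlocked: boolean;
  budgetBlocked: boolean;
  anyFailed: boolean;
}

function resolveRunStatus(flags: RunFlags): RunStatus {
  if (flags.aborted) return "paused";
  if (flags.gateBlocked || flags.budgetBlocked) return "blocked";
  if (flags.anyFailed) return "failed";
  return "completed";
}

// A budget hit takes precedence over a failed phase.
const budgetHit: RunFlags = { aborted: false, gateBlocked: false, budgetBlocked: true, anyFailed: true };
console.log(resolveRunStatus(budgetHit)); // "blocked"
```

Only an abort outranks the gate and budget checks; that ordering is read directly from the ternary in the diff.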
package/extensions/store.ts
CHANGED
|
@@ -84,6 +84,10 @@ export interface RunState {
|
|
|
84
84
|
createdAt: number;
|
|
85
85
|
updatedAt: number;
|
|
86
86
|
cwd: string;
|
|
87
|
+
/** OS PID of a detached runner process (set only for background runs). */
|
|
88
|
+
pid?: number;
|
|
89
|
+
/** True for runs spawned via `detach: true` (background execution). */
|
|
90
|
+
detached?: boolean;
|
|
87
91
|
}
|
|
88
92
|
|
|
89
93
|
// ---------------------------------------------------------------------------
|
|
@@ -458,10 +462,21 @@ function cleanupTerminalRuns(
|
|
|
458
462
|
}
|
|
459
463
|
|
|
460
464
|
// Sort terminal by updatedAt desc (newest first).
|
|
461
|
-
|
|
465
|
+
// Filter out entries with corrupt updatedAt (non-numeric/NaN) BEFORE sorting
|
|
466
|
+
// to prevent NaN from corrupting sort order. Corrupt entries cannot be
|
|
467
|
+
// reliably aged, so they are always moved to toRemove.
|
|
468
|
+
const cleanTerminal: RunIndexEntry[] = [];
|
|
469
|
+
for (const e of terminal) {
|
|
470
|
+
if (typeof e.updatedAt === "number" && !Number.isNaN(e.updatedAt)) {
|
|
471
|
+
cleanTerminal.push(e);
|
|
472
|
+
} else {
|
|
473
|
+
toRemove.push(e);
|
|
474
|
+
}
|
|
475
|
+
}
|
|
476
|
+
cleanTerminal.sort((a, b) => b.updatedAt - a.updatedAt);
|
|
462
477
|
|
|
463
|
-
for (let i = 0; i <
|
|
464
|
-
const e =
|
|
478
|
+
for (let i = 0; i < cleanTerminal.length; i++) {
|
|
479
|
+
const e = cleanTerminal[i]!;
|
|
465
480
|
const expiredByAge = now - e.updatedAt > maxAgeMs;
|
|
466
481
|
const excessByCount = i >= maxKeep;
|
|
467
482
|
if (expiredByAge || excessByCount) {
|
|
@@ -473,7 +488,7 @@ function cleanupTerminalRuns(
|
|
|
473
488
|
|
|
474
489
|
// Commit the pruned index while holding the lock so a concurrent
|
|
475
490
|
// updateIndexEntry cannot interleave and lose entries.
|
|
476
|
-
const remaining =
|
|
491
|
+
const remaining = cleanTerminal.filter((e) => !toRemove.includes(e));
|
|
477
492
|
writeIndex(runsRoot, [...active, ...remaining]);
|
|
478
493
|
});
|
|
479
494
|
|
|
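Reviewer sketch for the `cleanupTerminalRuns` hunk above: corrupt timestamps are separated out before sorting, then the valid entries are pruned by age and by count. The standalone version below illustrates that order of operations. `IndexEntry` and `pruneTerminal` are hypothetical stand-ins, not the package's `RunIndexEntry` or its real function signature.

```typescript
// Illustrative only: IndexEntry and pruneTerminal are hypothetical names.
interface IndexEntry {
  runId: string;
  updatedAt: unknown; // may be corrupt when read back from disk
}

type ValidEntry = IndexEntry & { updatedAt: number };

function pruneTerminal(
  terminal: IndexEntry[],
  now: number,
  maxAgeMs: number,
  maxKeep: number,
): { keep: IndexEntry[]; remove: IndexEntry[] } {
  const clean: ValidEntry[] = [];
  const remove: IndexEntry[] = [];
  // Entries that cannot be aged reliably always go to `remove`.
  for (const e of terminal) {
    if (typeof e.updatedAt === "number" && !Number.isNaN(e.updatedAt)) {
      clean.push(e as ValidEntry);
    } else {
      remove.push(e);
    }
  }
  // Newest first; safe because every remaining updatedAt is a real number.
  clean.sort((a, b) => b.updatedAt - a.updatedAt);
  const keep: IndexEntry[] = [];
  clean.forEach((e, i) => {
    const expiredByAge = now - e.updatedAt > maxAgeMs;
    const excessByCount = i >= maxKeep;
    if (expiredByAge || excessByCount) {
      remove.push(e);
    } else {
      keep.push(e);
    }
  });
  return { keep, remove };
}
```

Filtering first matters because a comparator that returns `NaN` gives the sort no consistent ordering, which is the failure mode the added comments in the hunk describe.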
@@ -783,8 +798,12 @@ export function listRuns(cwd: string, limit = 20): RunState[] {
|
|
|
783
798
|
}
|
|
784
799
|
|
|
785
800
|
// Sort by updatedAt desc, slice to limit.
|
|
786
|
-
entries
|
|
787
|
-
|
|
801
|
+
// Filter out entries with non-numeric/NaN updatedAt BEFORE sorting to
|
|
802
|
+
// prevent NaN from corrupting V8's sort order (which can displace valid
|
|
803
|
+
// entries when a limit is applied).
|
|
804
|
+
const valid = entries.filter((e) => typeof e.updatedAt === "number" && !Number.isNaN(e.updatedAt));
|
|
805
|
+
valid.sort((a, b) => b.updatedAt - a.updatedAt);
|
|
806
|
+
const sliced = valid.slice(0, limit);
|
|
788
807
|
|
|
789
808
|
// Read full RunState for each entry.
|
|
790
809
|
const runs: RunState[] = [];
|
|
@@ -804,6 +823,20 @@ export function hashInput(...parts: string[]): string {
|
|
|
804
823
|
return crypto.createHash("sha256").update(parts.join("\u0000")).digest("hex").slice(0, 16);
|
|
805
824
|
}
|
|
806
825
|
|
|
826
|
+
/**
|
|
827
|
+
* Check whether a process with the given PID is still alive.
|
|
828
|
+
* Uses signal 0 (no signal sent) — succeeds if the process exists and we have
|
|
829
|
+
* permission to signal it, throws ESRCH if it doesn't exist.
|
|
830
|
+
*/
|
|
831
|
+
export function isProcessAlive(pid: number): boolean {
|
|
832
|
+
try {
|
|
833
|
+
process.kill(pid, 0);
|
|
834
|
+
return true;
|
|
835
|
+
} catch {
|
|
836
|
+
return false;
|
|
837
|
+
}
|
|
838
|
+
}
|
|
839
|
+
|
|
807
840
|
/**
|
|
808
841
|
* Write a file atomically: write to a unique temp file in the same directory,
|
|
809
842
|
* then rename over the target (rename is atomic on the same filesystem). Prevents
|
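Reviewer sketch for the `isProcessAlive` helper added in the last hunk above: its doc comment mentions `ESRCH`, but the bare `catch` also returns `false` when the signal is refused with `EPERM`, which means the process exists but belongs to another user. A possible stricter variant is sketched below. `isAliveStrict` and `KillFn` are hypothetical names, and the kill function is injected so the errno handling can be exercised without touching real processes.

```typescript
// Hypothetical variant, not the package's implementation.
type KillFn = (pid: number, signal: number) => unknown;

function isAliveStrict(pid: number, kill: KillFn): boolean {
  try {
    kill(pid, 0); // signal 0: existence/permission check only, nothing is delivered
    return true;
  } catch (err) {
    const code = (err as { code?: string } | null)?.code;
    // EPERM: the process exists but we may not signal it, so it is still alive.
    // ESRCH (or anything else): treat the process as gone.
    return code === "EPERM";
  }
}
```

In Node this could be called as `isAliveStrict(pid, (p, s) => process.kill(p, s))`. Whether the stricter behaviour is wanted for stale-PID detection is a design choice for the package authors.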
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "pi-taskflow",
|
|
3
|
-
"version": "0.0.
|
|
3
|
+
"version": "0.0.20",
|
|
4
4
|
"description": "A declarative, verifiable graph of task nodes for the Pi coding agent — not a workflow you script, but a DAG you declare: statically verified before it runs, with dynamic fan-out, gates, isolated subagent context, resumable runs, and saveable commands.",
|
|
5
5
|
"keywords": [
|
|
6
6
|
"pi-package",
|
|
@@ -37,7 +37,7 @@
|
|
|
37
37
|
],
|
|
38
38
|
"scripts": {
|
|
39
39
|
"typecheck": "tsc --noEmit",
|
|
40
|
-
"test": "PI_TASKFLOW_BUILTIN_AGENTS_DIR= node --experimental-strip-types --test test/interpolate.test.ts test/condition.test.ts test/schema.test.ts test/usage.test.ts test/runtime.test.ts test/features.test.ts test/runner.test.ts test/store.test.ts test/agents.test.ts test/init.test.ts test/render.test.ts test/desugar.test.ts test/cache.test.ts test/loop.test.ts test/tournament.test.ts test/verify.test.ts test/gate-eval.test.ts test/transient-error.test.ts test/runtime-branches.test.ts test/interpolate-extended.test.ts test/store-extended.test.ts test/flow-def.test.ts",
|
|
40
|
+
"test": "PI_TASKFLOW_BUILTIN_AGENTS_DIR= node --experimental-strip-types --test test/interpolate.test.ts test/condition.test.ts test/schema.test.ts test/usage.test.ts test/runtime.test.ts test/features.test.ts test/runner.test.ts test/store.test.ts test/agents.test.ts test/init.test.ts test/render.test.ts test/approval-view.test.ts test/desugar.test.ts test/cache.test.ts test/loop.test.ts test/tournament.test.ts test/verify.test.ts test/gate-eval.test.ts test/transient-error.test.ts test/runtime-branches.test.ts test/interpolate-extended.test.ts test/store-extended.test.ts test/flow-def.test.ts test/detached.test.ts",
|
|
41
41
|
"test:e2e": "PI_TASKFLOW_PI_BIN=pi node --experimental-strip-types test/e2e.mts",
|
|
42
42
|
"test:dogfood-cache": "node --experimental-strip-types test/dogfood-cache.mts"
|
|
43
43
|
},
|
package/skills/taskflow/SKILL.md
CHANGED
|
@@ -79,15 +79,17 @@ Call the `taskflow` tool. To run a brand-new flow you write inline, pass
|
|
|
79
79
|
|
|
80
80
|
### Phase types
|
|
81
81
|
|
|
82
|
-
| type | meaning |
|
|
83
|
-
|
|
84
|
-
| `agent` | one subagent runs `task` |
|
|
85
|
-
| `parallel` | run `branches[]` concurrently |
|
|
86
|
-
| `map` | fan out over `over` (an array) — one subagent per item, `{item}` bound |
|
|
87
|
-
| `gate` | quality/review step that can **halt the flow**
|
|
88
|
-
| `reduce` | aggregate `from[]` phases into one output |
|
|
89
|
-
| `approval` | **human-in-the-loop** pause: ask a person to approve / reject / edit before continuing |
|
|
90
|
-
| `flow` | run a **sub-flow** as one phase — **saved** (`use`) or **runtime-generated** (`def`) |
|
|
82
|
+
| type | meaning | details |
|
|
83
|
+
|------|---------|---------|
|
|
84
|
+
| `agent` | one subagent runs `task` | DSL shape |
|
|
85
|
+
| `parallel` | run `branches[]` concurrently | Conditional routing |
|
|
86
|
+
| `map` | fan out over `over` (an array) — one subagent per item, `{item}` bound | DSL shape |
|
|
87
|
+
| `gate` | quality/review step that can **halt the flow** | Gate phases |
|
|
88
|
+
| `reduce` | aggregate `from[]` phases into one output | DSL shape |
|
|
89
|
+
| `approval` | **human-in-the-loop** pause: ask a person to approve / reject / edit before continuing | Approval phases |
|
|
90
|
+
| `flow` | run a **sub-flow** as one phase — **saved** (`use`) or **runtime-generated** (`def`) | Sub-flows |
|
|
91
|
+
| `loop` | repeat a body until a condition / convergence / `maxIterations` | Loop phases |
|
|
92
|
+
| `tournament` | run N competing `variants`, a `judge` picks the best or aggregates | Tournament phases |
|
|
91
93
|
|
|
92
94
|
### Control-flow fields (any phase)
|
|
93
95
|
|
|
@@ -100,7 +102,9 @@ Call the `taskflow` tool. To run a brand-new flow you write inline, pass
|
|
|
100
102
|
### Conditional routing (when + gate/branches)
|
|
101
103
|
|
|
102
104
|
Pair `when` with an upstream phase that emits a decision to build real if/else
|
|
103
|
-
routing. Use `join: "any"` on the merge phase so it runs whichever branch fired
|
|
105
|
+
routing. Use `join: "any"` on the merge phase so it runs whichever branch fired. For
|
|
106
|
+
static (non-conditional) concurrency, a `parallel` phase runs fixed `branches[]`
|
|
107
|
+
instead — `{ "type": "parallel", "branches": [{"task":"..."}, {"task":"...","agent":"reviewer"}] }`.
|
|
104
108
|
|
|
105
109
|
```jsonc
|
|
106
110
|
{ "id": "triage", "type": "agent", "agent": "analyst", "output": "json",
|
|
@@ -125,6 +129,7 @@ deciding. The (interpolated) `task` is the prompt shown.
|
|
|
125
129
|
- **Edit** → the typed note becomes this phase's `output`, so you can inject
|
|
126
130
|
guidance mid-run: reference it downstream with `{steps.<id>.output}`.
|
|
127
131
|
- **Non-interactive** runs (headless/CI/print mode) **auto-approve** and record it.
|
|
132
|
+
- **Background (detached)** runs **auto-reject** (no interactive approver) — downstream sees the rejection; the flow continues (fail-open).
|
|
128
133
|
|
|
129
134
|
```jsonc
|
|
130
135
|
{ "id": "checkpoint", "type": "approval", "dependsOn": ["plan"],
|
|
@@ -176,6 +181,62 @@ so round N's plan depends on round N-1's **result** (not a one-shot fan-out):
|
|
|
176
181
|
the declarative equivalent of `for (...) { read result; decide next }`. See
|
|
177
182
|
`examples/dynamic-plan-execute.json` and `examples/iterative-replan.json`.
|
|
178
183
|
|
|
184
|
+
### Loop phases (iterate until done)
|
|
185
|
+
|
|
186
|
+
A `loop` phase runs its body repeatedly, exposing each iteration's output as
|
|
187
|
+
`{steps.<thisId>.output}` / `.json` so the next round can react to the last. It
|
|
188
|
+
stops on the first of: `until` truthy, **convergence** (output stops changing),
|
|
189
|
+
or `maxIterations` (hard cap). This is the declarative "keep going until good
|
|
190
|
+
enough" — the runtime always terminates (the cap is mandatory).
|
|
191
|
+
|
|
192
|
+
- `until` — stop condition, same operators as `when` (a parse error stops the loop, fail-safe).
|
|
193
|
+
- `maxIterations` — hard iteration cap (required to bound the loop).
|
|
194
|
+
- `convergence` — `true` to stop early when an iteration's output equals the previous one.
|
|
195
|
+
|
|
196
|
+
```jsonc
|
|
197
|
+
{
|
|
198
|
+
"id": "refine",
|
|
199
|
+
"type": "loop",
|
|
200
|
+
"agent": "executor",
|
|
201
|
+
"maxIterations": 5,
|
|
202
|
+
"until": "{steps.refine.json.done} == true",
|
|
203
|
+
"convergence": true,
|
|
204
|
+
"task": "Improve the draft. When nothing else needs fixing, output JSON {\"done\":true,\"draft\":\"...\"}; otherwise {\"done\":false,\"draft\":\"...\"}.",
|
|
205
|
+
"output": "json",
|
|
206
|
+
"final": true
|
|
207
|
+
}
|
|
208
|
+
```
|
|
209
|
+
|
|
210
|
+
For data-dependent **replanning** each round, pair a `loop` body that emits a
|
|
211
|
+
plan with `flow{def}` (see Sub-flows above). See `examples/iterative-replan.json`.
|
|
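Reviewer sketch of the three stop conditions described in the Loop phases section above (`until`, convergence, `maxIterations`). This is a simplified synchronous model, not the runtime's code: `runLoop`, `IterationFn` and `UntilFn` are hypothetical names, `until` is modelled as a callback rather than a `when`-style expression, and checking `until` before convergence is an assumption.

```typescript
// Sketch only: all names here are hypothetical.
type IterationFn = (iteration: number, previous: string | undefined) => string;
type UntilFn = (output: string) => boolean;

interface LoopOutcome {
  output: string | undefined;
  iterations: number;
  stoppedBy: "until" | "convergence" | "maxIterations";
}

function runLoop(
  body: IterationFn,
  opts: { maxIterations: number; until?: UntilFn; convergence?: boolean },
): LoopOutcome {
  let previous: string | undefined;
  let iterations = 0;
  while (iterations < opts.maxIterations) {
    iterations++;
    const output = body(iterations, previous);
    if (opts.until?.(output)) return { output, iterations, stoppedBy: "until" };
    if (opts.convergence && output === previous) {
      return { output, iterations, stoppedBy: "convergence" };
    }
    previous = output;
  }
  // The hard cap guarantees termination even if neither condition ever fires.
  return { output: previous, iterations, stoppedBy: "maxIterations" };
}
```

Each iteration receives the previous output, which corresponds to the `{steps.<thisId>.output}` reference in the documented DSL.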
212
|
+
|
|
213
|
+
### Tournament phases (N variants, judge picks best)
|
|
214
|
+
|
|
215
|
+
A `tournament` phase runs `variants` competing attempts in parallel, then a
|
|
216
|
+
**judge** sub-phase selects the winner (`mode: "best"`) or merges them
|
|
217
|
+
(`mode: "aggregate"`). Use it when one shot is unreliable and you want the best
|
|
218
|
+
of several drafts, or a synthesis of diverse approaches.
|
|
219
|
+
|
|
220
|
+
- `variants` — the competing attempts: a number (run the same `task` N times) or an array of `{task, agent?}` for genuinely different approaches.
|
|
221
|
+
- `mode` — `"best"` (judge picks one winner, default) or `"aggregate"` (judge merges all into one output).
|
|
222
|
+
- `judge` — the judge's rubric/instructions (how to choose or merge).
|
|
223
|
+
- `judgeAgent` — *(optional)* the agent that runs the judge step; defaults to the phase `agent`.
|
|
224
|
+
- Fail-open: if the judge's pick is unparseable, variant 1 is returned (work is never lost).
|
|
225
|
+
|
|
226
|
+
```jsonc
|
|
227
|
+
{
|
|
228
|
+
"id": "headline",
|
|
229
|
+
"type": "tournament",
|
|
230
|
+
"agent": "executor",
|
|
231
|
+
"variants": 3,
|
|
232
|
+
"mode": "best",
|
|
233
|
+
"judge": "Pick the clearest, most accurate headline. End with: WINNER: <n>.",
|
|
234
|
+
"task": "Write one headline for the article below.\n\n{steps.draft.output}",
|
|
235
|
+
"dependsOn": ["draft"],
|
|
236
|
+
"final": true
|
|
237
|
+
}
|
|
238
|
+
```
|
|
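Reviewer sketch of the fail-open rule in the Tournament phases section above. It assumes the judge ends its reply with `WINNER: <n>` (1-based), as the example prompt requests. `pickWinner` is a hypothetical helper; the runtime's real parser is not shown in this diff, and treating an out-of-range number the same as an unparseable reply is an assumption.

```typescript
// Hypothetical helper: extracts a 1-based winner index from the judge's reply.
function pickWinner(judgeReply: string, variantCount: number): { winner: number; reason: string } {
  const match = /WINNER:\s*(\d+)/i.exec(judgeReply);
  const n = parseInt(match?.[1] ?? "", 10);
  if (Number.isInteger(n) && n >= 1 && n <= variantCount) {
    return { winner: n, reason: "judge pick" };
  }
  // Fail-open: no usable pick, so variant 1 is returned and work is not lost.
  return { winner: 1, reason: "fallback to variant 1" };
}

console.log(pickWinner("Draft 2 is the clearest. WINNER: 2.", 3).winner); // 2
```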
239
|
+
|
|
179
240
|
### Budget (cost / token caps)
|
|
180
241
|
|
|
181
242
|
Add a run-wide ceiling at the top level. When accumulated cost/tokens exceed it,
|
|
@@ -206,6 +267,30 @@ Review the audit results below. If any endpoint is missing auth, end with
|
|
|
206
267
|
{steps.audit.output}
|
|
207
268
|
```
|
|
208
269
|
|
|
270
|
+
**Zero-token machine checks (`eval`).** Before spending a token on the LLM gate,
|
|
271
|
+
list machine-checkable assertions in `eval`. If **all** pass, the gate
|
|
272
|
+
auto-passes with **no LLM call**; if any fails, it falls through to the LLM
|
|
273
|
+
`task` (the qualitative residue). Each entry supports the `when` operators plus
|
|
274
|
+
`X contains Y` (substring). A parse error fails **open** (consistent with the
|
|
275
|
+
gate invariant).
|
|
276
|
+
|
|
277
|
+
```jsonc
|
|
278
|
+
{ "id": "quality", "type": "gate", "dependsOn": ["build","test"],
|
|
279
|
+
"eval": ["{steps.build.output} contains BUILD SUCCESS", "{steps.test.json.failures} == 0"],
|
|
280
|
+
"task": "Review the diff for subtle logic errors a linter can't catch. VERDICT: PASS or BLOCK." }
|
|
281
|
+
```
|
|
282
|
+
|
|
283
|
+
**Self-healing (`onBlock: "retry"`).** By default a blocking gate halts the run
|
|
284
|
+
(`onBlock: "halt"`). With `onBlock: "retry"` the gate instead **re-runs its
|
|
285
|
+
upstream `dependsOn` phases and re-evaluates**, up to `retry.max` rounds (or
|
|
286
|
+
until PASS / budget / abort) — a generate→critique→regenerate rework loop.
|
|
287
|
+
|
|
288
|
+
```jsonc
|
|
289
|
+
{ "id": "spec-gate", "type": "gate", "onBlock": "retry", "retry": { "max": 3 },
|
|
290
|
+
"dependsOn": ["implement"],
|
|
291
|
+
"task": "Does the implementation satisfy ALL acceptance criteria? VERDICT: PASS or BLOCK with reasons." }
|
|
292
|
+
```
|
|
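Reviewer sketch of the `onBlock: "retry"` rework loop described above. It is a synchronous simplification with budget and abort checks omitted. `runGateWithRetry`, `evaluateGate` and `rerunUpstream` are hypothetical names, and not counting the first evaluation against `retry.max` is an assumption the documentation does not settle.

```typescript
// Sketch under assumptions: both callbacks are hypothetical stand-ins.
type Verdict = "PASS" | "BLOCK";

function runGateWithRetry(
  evaluateGate: () => Verdict,
  rerunUpstream: () => void,
  retryMax: number,
): { verdict: Verdict; reworkRounds: number } {
  let verdict = evaluateGate();
  let reworkRounds = 0;
  while (verdict === "BLOCK" && reworkRounds < retryMax) {
    rerunUpstream(); // regenerate the `dependsOn` phases
    reworkRounds++;
    verdict = evaluateGate(); // critique the new output
  }
  return { verdict, reworkRounds };
}
```

If the verdict is still `BLOCK` after `retryMax` rounds, the caller would fall back to the halting behaviour.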
293
|
+
|
|
209
294
|
### Structured-verify phases (v0.0.8.1)
|
|
210
295
|
|
|
211
296
|
A "verify" phase typically runs `npx tsc --noEmit && npm test && git diff --stat`
|
|
@@ -343,16 +428,34 @@ variables, and storage paths — read `configuration.md` (next to this file).
|
|
|
343
428
|
Quick reference:
|
|
344
429
|
|
|
345
430
|
- **Flow:** `name`, `description`, `concurrency` (default 8), `budget` (`maxUSD`/`maxTokens`), `agentScope` (user|project|both), `args`, `strictInterpolation`.
|
|
346
|
-
- **Phase:** `model`, `thinking`, `tools` (whitelist), `cwd`, `output:"json"`, `concurrency` (map/parallel fan-out), `when`, `join` (all|any), `retry`, `use`/`with` (flow), `final`.
|
|
431
|
+
- **Phase:** `model`, `thinking`, `tools` (whitelist), `cwd`, `output:"json"`, `concurrency` (map/parallel fan-out), `when`, `join` (all|any), `retry`, `use`/`with` (flow), `optional` (fail-soft — a failed/blocked phase won't abort the run), `final`.
|
|
432
|
+
- **Cross-run caching:** add `cache: { "scope": "cross-run" }` to a phase to memoize its output across runs (same input → instant reuse, zero tokens). See `configuration.md` for `ttl`, `fingerprint` (git/glob/file/env invalidation), and scope options.
|
|
347
433
|
- **Precedence (model/thinking/tools):** phase value → agent frontmatter (resolved via `modelRoles`) → global/default.
|
|
348
434
|
- **Concurrency:** same-layer phases use `flow.concurrency`; a `map`/`parallel` phase uses `phase.concurrency ?? flow.concurrency ?? 8`.
|
|
349
435
|
|
|
350
436
|
## Actions
|
|
351
437
|
|
|
352
|
-
- `action: "run"` — run inline `define` or a saved `name` (with optional `args`).
|
|
353
|
-
- `action: "save"` — persist `define` (scope `project` or `user`); becomes `/tf:<name>`.
|
|
354
|
-
- `action: "resume"` — continue a paused/failed run by `runId
|
|
355
|
-
- `action: "list"` — list saved flows.
|
|
438
|
+
- `action: "run"` — run an inline `define` (a one-off DAG) **or** a saved `name` (with optional `args`). Use `define` for an ad-hoc flow; use `name` to invoke something previously saved. Add `detach: true` to run in the background (returns immediately with the runId; poll the store for status).
|
|
439
|
+
- `action: "save"` — persist `define` (scope `project` — default, committed/shared — or `user`); it becomes `/tf:<name>`. On a name collision, project overrides user.
|
|
440
|
+
- `action: "resume"` — continue a paused/failed run by `runId`.
|
|
441
|
+
- `action: "list"` — list saved flows. `action: "verify"` — static-check a `define` (zero tokens). `action: "agents"` — list available agents.
|
|
442
|
+
|
|
443
|
+
## Background (detached) runs
|
|
444
|
+
|
|
445
|
+
Add `detach: true` to `action: "run"` to spawn the flow in a detached child process. The tool returns immediately with the `runId`; the flow continues running even if the host session exits. Status is polled via the store (`/tf runs` or `action: "resume"`).
|
|
446
|
+
|
|
447
|
+
- **Approval phases auto-reject** in detached mode (no interactive approver). Downstream phases see the rejection; the flow continues (fail-open).
|
|
448
|
+
- **Crash resilience:** if the detached process crashes, the store persists `status: "failed"`; resume with `action: "resume"`.
|
|
449
|
+
- **Same flow, both modes:** a flow can run foreground or background — `detach` is a dispatch-time decision, not a flow property.
|
|
450
|
+
|
|
451
|
+
## Operating a run (lifecycle, resume, inspection)
|
|
452
|
+
|
|
453
|
+
A run moves through: **running →** `completed` (a `final` phase produced output) **/** `blocked` (a gate emitted BLOCK, an `approval` was rejected, or the `budget` cap was hit) **/** `failed` (a non-`optional` phase errored) **/** `paused` (the run was aborted). `failed` and `paused` runs are resumable; `blocked` is terminal (fix the gate/budget and re-run).
|
|
454
|
+
|
|
455
|
+
- **Resume is cache-aware.** `action: "resume"` re-runs only what didn't finish: every phase already `done` is reused from its recorded output (within-run cache), so resuming after a crash or a `blocked`/`failed` stop never repeats completed work. A phase that was mid-flight is re-executed cleanly (stale `error`/`endedAt` are cleared first).
|
|
456
|
+
- **When to resume vs. re-run.** Resume when the inputs are unchanged and you just want to continue/retry the tail (fixed a gate, raised the budget, approved a checkpoint). Re-run from scratch when the task or upstream inputs changed — resume would reuse now-stale outputs. (For reuse *across* runs, opt a phase into `cache: {scope:"cross-run"}` — see configuration.md.)
|
|
457
|
+
- **Budget mid-run.** When the run-wide `budget` is exceeded, remaining phases are skipped and an in-flight `map`/`parallel` stops spawning new items; the run ends `blocked` with the partial outputs preserved.
|
|
458
|
+
- **Inspect runs.** `/tf runs` lists recent runs with status; `/tf show <name>` prints a saved flow's definition. Run state lives at `<project .pi>/taskflows/runs/<runId>.json` (gitignored).
|
|
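Reviewer sketch of the "Resume is cache-aware" bullet above: phases already `done` are reused, and everything else is reset so it re-executes cleanly. `PhaseRecord` and `planResume` are hypothetical; the status values other than `done`, `failed` and `skipped` are assumptions, as is re-running `skipped` phases on resume.

```typescript
// Hypothetical shapes, not the store's real types.
interface PhaseRecord {
  id: string;
  status: "done" | "running" | "failed" | "skipped" | "pending";
  output?: string;
  error?: string;
  endedAt?: number;
}

function planResume(phases: PhaseRecord[]): { reused: string[]; rerun: PhaseRecord[] } {
  const reused: string[] = [];
  const rerun: PhaseRecord[] = [];
  for (const p of phases) {
    if (p.status === "done") {
      reused.push(p.id); // recorded output is kept as-is
    } else {
      // Drop stale output/error/endedAt so the phase starts from a clean record.
      rerun.push({ id: p.id, status: "pending" });
    }
  }
  return { reused, rerun };
}
```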
356
459
|
|
|
357
460
|
## User commands
|
|
358
461
|
|
|
package/skills/taskflow/configuration.md
CHANGED
|
@@ -50,7 +50,7 @@ Keys of each object in `phases[]`. Some only apply to specific `type`s.
|
|
|
50
50
|
```jsonc
|
|
51
51
|
{
|
|
52
52
|
"id": "audit", // required, unique — referenced via {steps.audit.output}
|
|
53
|
-
"type": "map", // agent | parallel | map | gate | reduce (default: agent)
|
|
53
|
+
"type": "map", // agent | parallel | map | gate | reduce | approval | flow | loop | tournament (default: agent)
|
|
54
54
|
"agent": "analyst", // agent name to run this phase
|
|
55
55
|
"task": "Audit {item.route}…",
|
|
56
56
|
"dependsOn": ["discover"],// DAG edges
|
|
@@ -71,7 +71,7 @@ Keys of each object in `phases[]`. Some only apply to specific `type`s.
|
|
|
71
71
|
| Key | Applies to | Default | Notes |
|
|
72
72
|
|-----|-----------|---------|-------|
|
|
73
73
|
| `id` | all | — | **Required, unique.** Used in `{steps.<id>…}`. |
|
|
74
|
-
| `type` | all | `agent` | One of the
|
|
74
|
+
| `type` | all | `agent` | One of the 9 phase types (agent, parallel, map, gate, reduce, approval, flow, loop, tournament). |
|
|
75
75
|
| `agent` | all | first available | Agent name; resolved from the scoped pool. |
|
|
76
76
|
| `task` | agent, gate, map, reduce | — | Prompt; supports interpolation. Required for these types. |
|
|
77
77
|
| `over` | map | — | **Required for map.** Must resolve to an array. |
|
|
@@ -85,8 +85,52 @@ Keys of each object in `phases[]`. Some only apply to specific `type`s.
|
|
|
85
85
|
| `tools` | all | agent default | Whitelist of tools for the subagent. See §5. |
|
|
86
86
|
| `cwd` | all | flow cwd | Run this phase's subagent in a different directory. |
|
|
87
87
|
| `concurrency` | map, parallel | flow concurrency | Fan-out cap for this phase only. See §4. |
|
|
88
|
+
| `context` | all | — | File paths / `{steps.X}` refs to **pre-read and inject** before the task. See §2.1. |
|
|
89
|
+
| `contextLimit` | all | `8000` | Max characters read **per file** in `context`. See §2.1. |
|
|
90
|
+
| `cache` | all | `run-only` | Per-phase cache policy (`scope`/`ttl`/`fingerprint`). See §11. |
|
|
88
91
|
| `final` | all | last phase | Exactly one phase may be `final`; its output is returned. |
|
|
89
92
|
|
|
93
|
+
> Gate-only control fields (`eval`, `onBlock`) and the loop/tournament control
|
|
94
|
+
> fields (`until`/`maxIterations`/`convergence`, `variants`/`judge`/`judgeAgent`/`mode`)
|
|
95
|
+
> are documented in `SKILL.md` next to their phase types.
|
|
96
|
+
|
|
97
|
+
---
|
|
98
|
+
|
|
99
|
+
## 2.1 Context pre-reading (`context` / `contextLimit`)
|
|
100
|
+
|
|
101
|
+
Instead of making a subagent *discover* files by exploring (an O(N²) turn-cost
|
|
102
|
+
spiral), you can **pre-read** known files and inject their contents ahead of the
|
|
103
|
+
task prompt. List file paths and/or `{steps.X}` refs in `context`; the runtime
|
|
104
|
+
resolves interpolated refs first, then reads each file and prepends labeled
|
|
105
|
+
blocks to the task.
|
|
106
|
+
|
|
107
|
+
```jsonc
|
|
108
|
+
{
|
|
109
|
+
"id": "review",
|
|
110
|
+
"type": "agent",
|
|
111
|
+
"agent": "reviewer",
|
|
112
|
+
"context": ["src/auth.ts", "src/middleware.ts", "{steps.spec.output}"],
|
|
113
|
+
"contextLimit": 12000,
|
|
114
|
+
"task": "Review the auth flow against the spec above. VERDICT: PASS or BLOCK.",
|
|
115
|
+
"dependsOn": ["spec"]
|
|
116
|
+
}
|
|
117
|
+
```
|
|
118
|
+
|
|
119
|
+
**Behavior & limits (all enforced in the runtime):**
|
|
120
|
+
|
|
121
|
+
| Aspect | Rule |
|
|
122
|
+
|--------|------|
|
|
123
|
+
| Resolution order | interpolate `{steps.X}` / `{args.X}` refs **first**, then read file paths. |
|
|
124
|
+
| Per-file cap | `contextLimit` characters per file (default **8000**); longer files are truncated with a marker. |
|
|
125
|
+
| Total cap | the combined injected block is hard-capped at **200,000 chars**; overflow is truncated with a notice. |
|
|
126
|
+
| Unreadable file | skipped with a `console.warn` (never aborts the phase). |
|
|
127
|
+
| JSON-looking entry | a value that looks like a JSON blob (not a path) is diagnosed and skipped, not read as a file. |
|
|
128
|
+
|
|
129
|
+
Use `context` for **known, bounded** inputs (a handful of source files, an
|
|
130
|
+
upstream phase's output). For large/unknown exploration, let the agent use its
|
|
131
|
+
`read`/`grep` tools instead — pre-reading hundreds of files just hits the total
|
|
132
|
+
cap.
|
|
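Reviewer sketch of the two caps listed in the behaviour table above: a per-file `contextLimit` (default 8000) and a 200,000-character total. `buildContextBlock` is a hypothetical helper. The label format and the truncation markers are assumptions, and file reading and `{steps.X}` interpolation are left out.

```typescript
// Illustrative only: the caps come from the documentation above, the rest is assumed.
const DEFAULT_CONTEXT_LIMIT = 8000;
const TOTAL_CONTEXT_CAP = 200000;

function buildContextBlock(
  files: Array<{ path: string; content: string }>,
  contextLimit: number = DEFAULT_CONTEXT_LIMIT,
): string {
  const blocks = files.map(({ path, content }) => {
    // Per-file cap: keep the first `contextLimit` characters and mark the cut.
    const body = content.length > contextLimit
      ? content.slice(0, contextLimit) + "\n[truncated]"
      : content;
    return "--- " + path + " ---\n" + body;
  });
  const joined = blocks.join("\n\n");
  // Total cap: applies to the combined block, whatever the per-file limit is.
  return joined.length > TOTAL_CONTEXT_CAP
    ? joined.slice(0, TOTAL_CONTEXT_CAP) + "\n[context truncated]"
    : joined;
}
```

With the default limit, roughly 25 full-size files already reach the total cap, which is why the section recommends `context` only for a handful of known inputs.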
133
|
+
|
|
90
134
|
---
|
|
91
135
|
|
|
92
136
|
## 3. Declaring & passing arguments
|
|
@@ -209,7 +253,73 @@ Taskflow shares the subagent settings file at `~/.pi/agent/settings.json`:
|
|
|
209
253
|
|
|
210
254
|
---
|
|
211
255
|
|
|
212
|
-
## 8.
|
|
256
|
+
## 8. Cross-run caching (`cache`)
|
|
257
|
+
|
|
258
|
+
By default every phase is **`run-only`**: completed phases are reused only when
|
|
259
|
+
you *resume the same run* (the historical behavior). Opt a phase into the
|
|
260
|
+
persistent **cross-run** memoization store to reuse an identical-input result
|
|
261
|
+
from *any prior run* — instant, zero tokens. See `docs/rfc-cross-run-memoization.md`
|
|
262
|
+
for the design.
|
|
263
|
+
|
|
264
|
+
```jsonc
|
|
265
|
+
{
|
|
266
|
+
"id": "summarize-deps",
|
|
267
|
+
"type": "agent",
|
|
268
|
+
"agent": "writer",
|
|
269
|
+
"task": "Summarize the dependency tree of this repo.",
|
|
270
|
+
"cache": {
|
|
271
|
+
"scope": "cross-run",
|
|
272
|
+
"ttl": "6h",
|
|
273
|
+
"fingerprint": ["git:HEAD", "file:package-lock.json"]
|
|
274
|
+
}
|
|
275
|
+
}
|
|
276
|
+
```
|
|
277
|
+
|
|
278
|
+
### `scope`
|
|
279
|
+
|
|
280
|
+
| Value | Meaning |
|
|
281
|
+
|-------|---------|
|
|
282
|
+
| `run-only` (default) | Reuse only within a resumed run — exactly the historical behavior. |
|
|
283
|
+
| `cross-run` | Reuse an identical-input result from **any** prior run (the persistent store). |
|
|
284
|
+
| `off` | Never reuse, even within a run (force re-execution every time). |
|
|
285
|
+
|
|
286
|
+
### `ttl` (cross-run only)
|
|
287
|
+
|
|
288
|
+
Max age before a cross-run hit is treated as a miss: e.g. `"30m"`, `"6h"`, `"7d"`.
|
|
289
|
+
Omit for no time bound. A hit older than the TTL re-executes the phase.
|
|
290
|
+
|
|
291
|
+
### `fingerprint` (cross-run only)
|
|
292
|
+
|
|
293
|
+
The cache key is normally `phaseId + agent + model + interpolated-task`. A
|
|
294
|
+
fingerprint folds **“did the world change?”** signals into that key, so an
|
|
295
|
+
external change becomes a cache **miss** even when the task text is identical.
|
|
296
|
+
Each entry is one of:
|
|
297
|
+
|
|
298
|
+
| Entry | Becomes a miss when… | Resolves to |
|
|
299
|
+
|-------|----------------------|-------------|
|
|
300
|
+
| `git:HEAD` / `git:<ref>` | the commit moves | the resolved SHA (30s timeout → `<timeout>`; no git → `<no-git>`) |
|
|
301
|
+
| `glob:<pattern>` | the **set of matching paths** changes | sorted path list (mtime-free) |
|
|
302
|
+
| `glob!:<pattern>` | the **contents** of matching files change | content hashes (capped at 5000 matches) |
|
|
303
|
+
| `file:<path>` | that file's content changes | sha256 of the file (>10 MB or missing → `<skip>`/`<missing>`) |
|
|
304
|
+
| `env:<NAME>` | the env var changes | the env value |
|
|
305
|
+
|
|
306
|
+
### What is cached, and when
|
|
307
|
+
|
|
308
|
+
- Only phases whose **`status` is `done`** and that **were not themselves a cache
|
|
309
|
+
hit** are written to the store (no re-storing a value just read).
|
|
310
|
+
- The store is keyed by the full input hash + fingerprint, tagged with
|
|
311
|
+
`flowName`/`phaseId`/`runId`/`model` for inspection and LRU eviction.
|
|
312
|
+
- Cross-run reuse is **safe by construction**: a different agent, model, task, or
|
|
313
|
+
fingerprint produces a different key, so stale results are never served.
|
|
314
|
+
|
|
315
|
+
> **When to use it:** expensive, deterministic phases whose inputs rarely change
|
|
316
|
+
> (dependency summaries, doc generation, repeated audits of the same tree). For
|
|
317
|
+
> phases that *should* re-run every time (anything reading live external state
|
|
318
|
+
> without a fingerprint), leave the default `run-only` or set `off`.
|
|
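Reviewer sketch of the `ttl` rule from the section above: a cross-run hit older than the TTL counts as a miss, and an omitted `ttl` means no time bound. `parseTtlMs` and `isCacheHitFresh` are hypothetical helpers. Only the `m`/`h`/`d` suffixes shown in the documented examples are handled, and treating an unparseable value as a miss is an assumption.

```typescript
// Hypothetical helpers: the real store may accept more units or formats.
function parseTtlMs(ttl: string): number | undefined {
  const match = /^(\d+)([mhd])$/.exec(ttl.trim());
  if (!match) return undefined;
  const amount = Number(match[1]);
  const unit = match[2];
  const unitMs = unit === "m" ? 60000 : unit === "h" ? 3600000 : 86400000;
  return amount * unitMs;
}

function isCacheHitFresh(storedAt: number, now: number, ttl?: string): boolean {
  if (ttl === undefined) return true; // omitted ttl: no time bound
  const ttlMs = parseTtlMs(ttl);
  if (ttlMs === undefined) return false; // assumption: unparseable ttl is treated as a miss
  return now - storedAt <= ttlMs;
}

console.log(parseTtlMs("6h")); // 21600000
```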
319
|
+
|
|
320
|
+
---
|
|
321
|
+
|
|
322
|
+
## 9. Environment variables
|
|
213
323
|
|
|
214
324
|
| Variable | Effect |
|
|
215
325
|
|----------|--------|
|
|
@@ -217,7 +327,7 @@ Taskflow shares the subagent settings file at `~/.pi/agent/settings.json`:
|
|
|
217
327
|
|
|
218
328
|
---
|
|
219
329
|
|
|
220
|
-
##
|
|
330
|
+
## 10. Storage & file locations
|
|
221
331
|
|
|
222
332
|
| What | Path | Commit? |
|
|
223
333
|
|------|------|---------|
|
|
@@ -233,7 +343,7 @@ Taskflow shares the subagent settings file at `~/.pi/agent/settings.json`:
|
|
|
233
343
|
|
|
234
344
|
---
|
|
235
345
|
|
|
236
|
-
##
|
|
346
|
+
## 11. Quick recipes
|
|
237
347
|
|
|
238
348
|
**Pin a strong model only for the review gate:**
|
|
239
349
|
```jsonc
|