pi-taskflow 0.0.19 → 0.0.20

package/CHANGELOG.md CHANGED
@@ -2,6 +2,20 @@
2
2
 
3
3
  All notable changes to pi-taskflow are documented here. This project follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) format.
4
4
 
5
+ ## [0.0.20] — 2026-06-10
6
+
7
+ ### Added
8
+ - **Background (detached) execution — `detach: true`.** Run a taskflow in a detached child process without blocking the current session. Pass `detach: true` and get a `runId` back immediately; the flow executes in the background, persisting state to the store. Poll status via `/tf runs`; `resume` works as normal.
9
+ - `extensions/detached-runner.ts` (new): lightweight child-process entry script — reads serialized context, calls `executeTaskflow`, persists terminal state.
10
+ - `extensions/index.ts`: `detach: Boolean` parameter on the taskflow tool + child-process spawn logic (records PID in `RunState`).
11
+ - `extensions/store.ts`: `RunState` gains `pid?: number` + `detached?: boolean` fields; `isProcessAlive(pid)` stale-PID helper.
12
+ - Design: entry-point spawn wrapper — zero changes to the 1340-line `runtime.ts` core, no new phase type, no DSL version bump, fully backward-compatible.
13
+ - Approval phases auto-reject in background mode. Idle watchdog kills stalled children. Stale PID detection via signal-0 probe.
14
+ - 8 new tests (`test/detached.test.ts`): process-alive, PID persistence, end-to-end detached, crash→failed, resume after failure, stale PID, backward compat.
15
+
16
+ ### Fixed
17
+ - `approvalView` initialization robustness: throws a clear error when the approval view module is unavailable, preventing silent failures in detached/background mode.
18
+
5
19
  ## [0.0.19] — 2026-06-10
6
20
 
7
21
  ### Documentation
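The 0.0.20 entry above mentions stale-PID detection via a signal-0 probe. The sketch below illustrates that general technique only; it is not the package's `isProcessAlive`, which this diff shows only in part. The name `probePid` and the choice to treat `EPERM` as "alive" are assumptions made for illustration.

```typescript
// Hypothetical helper (not from the package): probe a PID with signal 0.
// `process.kill(pid, 0)` delivers no signal; it only runs the existence and
// permission checks, and throws if the process cannot be signalled.
function probePid(pid: number): boolean {
  // Guard against 0 and negative values, which address process groups
  // rather than a single process.
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    process.kill(pid, 0);
    return true;
  } catch (e) {
    // Assumption: EPERM means the process exists but is owned by someone
    // else, so it counts as alive. ESRCH (no such process) counts as dead.
    return (e as { code?: string }).code === "EPERM";
  }
}

// Usage: the current process is always alive from its own point of view.
console.log(probePid(process.pid));
```

A probe like this cannot distinguish the original process from an unrelated one that later reused the same PID, so it is best read as a hint rather than proof that a run is still active.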
package/README.md CHANGED
@@ -131,7 +131,7 @@ The Pi ecosystem now has **20+ delegation, workflow, and orchestration extension
131
131
  - **`pi-subagents` / `@gotgenes/pi-subagents`** are the mature picks for ad-hoc "use reviewer on this diff" delegation and background jobs. `pi-taskflow` is for when those delegations need to become a *repeatable, resumable pipeline*.
132
132
  - **`pi-pipeline` / `pi-agent-flow`** ship *opinionated, fixed* flows. `pi-taskflow` ships an *empty canvas*: you (or the model) declare the graph that fits the job.
133
133
 
134
- > The honest one-liner: **`pi-taskflow` is the only Pi extension that gives you a *declarative, verifiable, resumable* DAG of task nodes — saved as a one-word command, with zero runtime dependencies and context isolation by design.** Where code-mode workflows let the model *script* the work, `pi-taskflow` lets it *declare a graph the runtime can prove correct before running.* The known gaps it's closing next: loop-until-done, worktree isolation, and non-blocking background runs (see [`STRATEGY.md`](./STRATEGY.md)).
134
+ > The honest one-liner: **`pi-taskflow` is the only Pi extension that gives you a *declarative, verifiable, resumable* DAG of task nodes — saved as a one-word command, with zero runtime dependencies and context isolation by design.** Where code-mode workflows let the model *script* the work, `pi-taskflow` lets it *declare a graph the runtime can prove correct before running.* The known gaps it's closing next: worktree isolation (see [`STRATEGY.md`](./STRATEGY.md)).
135
135
 
136
136
  ## 30-second start
137
137
 
@@ -641,7 +641,7 @@ Our `self-improve` flow is a 10-phase DAG — it audits the codebase, patches de
641
641
 
642
642
  Known boundaries (tracked, bounded — no surprises mid-flow):
643
643
 
644
- - **No detached background execution.** A run needs the Pi session open. True background execution (and event/cron triggers on top of it) is on the roadmap.
644
+ - **Detached background execution (new).** Add `detach: true` to `action: "run"` to spawn the flow in a detached child process. The tool returns immediately with the `runId`; the flow continues running even if the host session exits. Status is polled via the store (`/tf runs` or `action: "resume"`). Approval phases auto-reject in detached mode.
645
645
  - **No `output: "file"`.** Outputs are text/JSON only — write files via an agent's `write` tool call.
646
646
  - **`map` requires a JSON array.** The `over` field must resolve to a `{steps.ID.json}` array. Wrap a text list in a single-agent `output: "json"` phase first.
647
647
  - **The DAG must be acyclic.** Cycles are rejected at validation.
@@ -0,0 +1,264 @@
1
+ /**
2
+ * Modal approval dialog for `approval` phases (ctx.ui.custom with overlay).
3
+ *
4
+ * Rendered as a centered bordered popup: the full upstream output (e.g. a
5
+ * plan) is shown in a scrollable viewport so long content can be reviewed
6
+ * before deciding. Every line is padded to the full dialog width so the
7
+ * overlay composites cleanly (no see-through, no ghosting in scrollback).
8
+ *
9
+ * While the dialog is open, SGR mouse reporting is enabled so the wheel
10
+ * scrolls the viewport instead of the terminal scrollback. It is restored
11
+ * on dispose.
12
+ *
13
+ * Keys: wheel/↑↓ scroll · PgUp/PgDn page · Home/End jump ·
14
+ * a/Enter approve · e edit (guidance) · r/Esc reject.
15
+ */
16
+
17
+ import type { Theme } from "@earendil-works/pi-coding-agent";
18
+ import { matchesKey, truncateToWidth, visibleWidth, wrapTextWithAnsi } from "@earendil-works/pi-tui";
19
+
20
+ export type ApprovalChoice = "approve" | "reject" | "edit";
21
+
22
+ export interface ApprovalViewOptions {
23
+ /** Header title, e.g. "Taskflow approval — flow/phase". */
24
+ title: string;
25
+ /** Interpolated approval prompt. */
26
+ message: string;
27
+ /** Full upstream phase output (the content being approved). */
28
+ upstream?: string;
29
+ }
30
+
31
+ /** Minimal writer used to toggle terminal mouse reporting. */
32
+ export interface TerminalWriter {
33
+ write(data: string): void;
34
+ }
35
+
36
+ const FALLBACK_ROWS = 24;
37
+ /** Wheel ticks scroll this many lines. */
38
+ const WHEEL_STEP = 3;
39
+ /** SGR mouse sequence: ESC [ < B ; X ; Y (M|m) */
40
+ const MOUSE_SGR = /^\x1b\[<(\d+);(\d+);(\d+)([Mm])$/;
41
+ /** Enable basic mouse tracking + SGR encoding. */
42
+ const MOUSE_ON = "\x1b[?1000h\x1b[?1006h";
43
+ /** Restore: disable SGR encoding + mouse tracking. */
44
+ const MOUSE_OFF = "\x1b[?1006l\x1b[?1000l";
45
+
46
+ export class ApprovalViewComponent {
47
+ private theme: Theme;
48
+ private opts: ApprovalViewOptions;
49
+ private onDone: (choice: ApprovalChoice) => void;
50
+ private getRows: () => number;
51
+ private term?: TerminalWriter;
52
+ private scrollOffset = 0;
53
+ private cachedWidth?: number;
54
+ private cachedBody?: string[];
55
+ private mouseEnabled = false;
56
+ private decided = false;
57
+
58
+ constructor(
59
+ theme: Theme,
60
+ opts: ApprovalViewOptions,
61
+ onDone: (choice: ApprovalChoice) => void,
62
+ getRows?: () => number,
63
+ term?: TerminalWriter,
64
+ ) {
65
+ this.theme = theme;
66
+ this.opts = opts;
67
+ this.onDone = onDone;
68
+ this.getRows = getRows ?? (() => FALLBACK_ROWS);
69
+ this.term = term;
70
+ this.enableMouse();
71
+ }
72
+
73
+ private enableMouse(): void {
74
+ if (this.term && !this.mouseEnabled) {
75
+ try {
76
+ this.term.write(MOUSE_ON);
77
+ this.mouseEnabled = true;
78
+ } catch {
79
+ // non-tty / closed stream — wheel support is best-effort
80
+ }
81
+ }
82
+ }
83
+
84
+ /** Restore terminal mouse state. Idempotent; call from the overlay's dispose. */
85
+ dispose(): void {
86
+ if (this.term && this.mouseEnabled) {
87
+ this.mouseEnabled = false;
88
+ try {
89
+ this.term.write(MOUSE_OFF);
90
+ } catch {
91
+ // ignore
92
+ }
93
+ }
94
+ }
95
+
96
+ private decide(choice: ApprovalChoice): void {
97
+ if (this.decided) return;
98
+ this.decided = true;
99
+ this.dispose();
100
+ this.onDone(choice);
101
+ }
102
+
103
+ private rows(): number {
104
+ try {
105
+ return this.getRows() || FALLBACK_ROWS;
106
+ } catch {
107
+ return FALLBACK_ROWS;
108
+ }
109
+ }
110
+
111
+ /** Visible body height given the message height — dialog targets ~80% of the terminal. */
112
+ private maxVisible(msgRows: number): number {
113
+ const avail = Math.max(10, Math.floor(this.rows() * 0.8));
114
+ // Chrome: top border, message rows, separator, scroll info, separator, hints, bottom border.
115
+ const chrome = 1 + msgRows + 1 + 1 + 1 + 1 + 1;
116
+ return Math.max(3, Math.min(avail - chrome, 60));
117
+ }
118
+
119
+ /** Wrap the upstream text to the viewport width (cached per width). */
120
+ private bodyLines(innerW: number): string[] {
121
+ if (this.cachedBody && this.cachedWidth === innerW) return this.cachedBody;
122
+ const out: string[] = [];
123
+ const upstream = (this.opts.upstream ?? "").replace(/\r\n/g, "\n").trimEnd();
124
+ if (upstream) {
125
+ for (const raw of upstream.split("\n")) {
126
+ if (!raw.trim()) {
127
+ out.push("");
128
+ continue;
129
+ }
130
+ for (const l of wrapTextWithAnsi(raw, innerW)) out.push(l);
131
+ }
132
+ }
133
+ this.cachedWidth = innerW;
134
+ this.cachedBody = out;
135
+ return out;
136
+ }
137
+
138
+ private msgLines(innerW: number): string[] {
139
+ const out: string[] = [];
140
+ for (const raw of this.opts.message.split("\n")) {
141
+ for (const l of wrapTextWithAnsi(raw, innerW)) out.push(l);
142
+ }
143
+ return out.length ? out : [""];
144
+ }
145
+
146
+ private maxOffset(totalLines: number, visible: number): number {
147
+ return Math.max(0, totalLines - visible);
148
+ }
149
+
150
+ private clampScroll(delta: number): void {
151
+ const total = this.cachedBody?.length ?? 0;
152
+ const visible = this.maxVisible(1);
153
+ const cap = this.maxOffset(total, visible);
154
+ this.scrollOffset = Math.max(0, Math.min(cap, this.scrollOffset + delta));
155
+ }
156
+
157
+ handleInput(data: string): void {
158
+ // Mouse events (SGR) — wheel scrolls, everything else is swallowed.
159
+ const mouse = MOUSE_SGR.exec(data);
160
+ if (mouse) {
161
+ const b = Number(mouse[1]);
162
+ if (b & 64) {
163
+ // Wheel: low two bits 0 = up, 1 = down.
164
+ if ((b & 3) === 0) this.clampScroll(-WHEEL_STEP);
165
+ else if ((b & 3) === 1) this.clampScroll(WHEEL_STEP);
166
+ }
167
+ return;
168
+ }
169
+ // Decisions
170
+ if (matchesKey(data, "return") || data === "a" || data === "y") {
171
+ this.decide("approve");
172
+ return;
173
+ }
174
+ if (data === "e") {
175
+ this.decide("edit");
176
+ return;
177
+ }
178
+ if (matchesKey(data, "escape") || matchesKey(data, "ctrl+c") || data === "r" || data === "n") {
179
+ this.decide("reject");
180
+ return;
181
+ }
182
+ // Scrolling (only meaningful when a body exists)
183
+ const page = this.maxVisible(1);
184
+ if (matchesKey(data, "up") || data === "k") {
185
+ this.clampScroll(-1);
186
+ } else if (matchesKey(data, "down") || data === "j") {
187
+ this.clampScroll(1);
188
+ } else if (matchesKey(data, "pageUp") || matchesKey(data, "ctrl+u")) {
189
+ this.clampScroll(-page);
190
+ } else if (matchesKey(data, "pageDown") || matchesKey(data, "ctrl+d") || matchesKey(data, "space")) {
191
+ this.clampScroll(page);
192
+ } else if (matchesKey(data, "home") || data === "g") {
193
+ this.scrollOffset = 0;
194
+ } else if (matchesKey(data, "end") || data === "G") {
195
+ this.clampScroll(Number.MAX_SAFE_INTEGER);
196
+ }
197
+ }
198
+
199
+ /** Pad `content` with spaces to exactly `w` visible columns (ANSI-aware). */
200
+ private pad(content: string, w: number): string {
201
+ const t = truncateToWidth(content, w);
202
+ return t + " ".repeat(Math.max(0, w - visibleWidth(t)));
203
+ }
204
+
205
+ /** A full-width dialog row: │ <content padded> │ */
206
+ private row(content: string, width: number): string {
207
+ const th = this.theme;
208
+ const inner = this.pad(content, Math.max(1, width - 4));
209
+ return th.fg("border", "│") + " " + inner + " " + th.fg("border", "│");
210
+ }
211
+
212
+ private hrule(width: number, left: string, right: string): string {
213
+ const th = this.theme;
214
+ return th.fg("border", left + "─".repeat(Math.max(0, width - 2)) + right);
215
+ }
216
+
217
+ render(width: number): string[] {
218
+ const th = this.theme;
219
+ const innerW = Math.max(20, width - 4);
220
+ const lines: string[] = [];
221
+
222
+ // Top border with embedded title
223
+ const title = truncateToWidth(` ${this.opts.title} `, Math.max(0, width - 6));
224
+ const fill = Math.max(0, width - 4 - visibleWidth(title));
225
+ lines.push(
226
+ th.fg("border", "╭─") + th.fg("accent", title) + th.fg("border", "─".repeat(fill) + "─╮"),
227
+ );
228
+
229
+ // Approval prompt
230
+ const msg = this.msgLines(innerW);
231
+ for (const l of msg) lines.push(this.row(th.fg("text", l), width));
232
+
233
+ // Scrollable upstream body
234
+ const body = this.bodyLines(innerW);
235
+ const visible = this.maxVisible(msg.length);
236
+ const cap = this.maxOffset(body.length, visible);
237
+ this.scrollOffset = Math.min(this.scrollOffset, cap);
238
+ if (body.length > 0) {
239
+ lines.push(this.hrule(width, "├", "┤"));
240
+ const slice = body.slice(this.scrollOffset, this.scrollOffset + visible);
241
+ while (slice.length < Math.min(visible, body.length)) slice.push("");
242
+ for (const l of slice) lines.push(this.row(l, width));
243
+ if (cap > 0) {
244
+ const above = this.scrollOffset;
245
+ const below = Math.max(0, body.length - visible - this.scrollOffset);
246
+ lines.push(
247
+ this.row(th.fg("dim", `↑${above} more · ↓${below} more (${body.length} lines)`), width),
248
+ );
249
+ }
250
+ }
251
+
252
+ // Key hints
253
+ lines.push(this.hrule(width, "├", "┤"));
254
+ const scrollHint = cap > 0 ? "wheel/↑↓/PgUp/PgDn scroll · " : "";
255
+ lines.push(this.row(th.fg("dim", `${scrollHint}a/Enter approve · e edit · r/Esc reject`), width));
256
+ lines.push(this.hrule(width, "╰", "╯"));
257
+ return lines;
258
+ }
259
+
260
+ invalidate(): void {
261
+ this.cachedWidth = undefined;
262
+ this.cachedBody = undefined;
263
+ }
264
+ }
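The wheel handling in `handleInput` above can be exercised in isolation. The following is a minimal sketch that reuses the same SGR pattern and bit tests as the file; the helper name `wheelDirection` and its return type are hypothetical and exist only for this example.

```typescript
// Same shape as MOUSE_SGR in the file above: ESC [ < B ; X ; Y (M|m)
const SGR_MOUSE = /^\x1b\[<(\d+);(\d+);(\d+)([Mm])$/;

type WheelDirection = "up" | "down" | null;

// Hypothetical helper: classify one raw input chunk as a vertical wheel event.
function wheelDirection(data: string): WheelDirection {
  const m = SGR_MOUSE.exec(data);
  if (!m) return null; // not an SGR mouse report
  const button = Number(m[1]);
  if ((button & 64) === 0) return null; // bit 6 clear: a click or drag, not a wheel
  const low = button & 3; // low two bits select the wheel direction
  if (low === 0) return "up";
  if (low === 1) return "down";
  return null; // other wheel codes (for example horizontal) are ignored
}

// Usage: button code 64 is wheel up, 65 is wheel down.
console.log(wheelDirection("\x1b[<64;10;5M"), wheelDirection("\x1b[<65;10;5M"));
```

Because only the low two bits and bit 6 are inspected, modifier bits such as Shift (value 4) do not change the result, which matches the behaviour of the bit tests in the component.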
@@ -0,0 +1,79 @@
1
+ /**
2
+ * Detached runner — spawned as a child process for background (detached) runs.
3
+ *
4
+ * Reads a context JSON file (path passed as argv[2]), calls executeTaskflow,
5
+ * and persists the terminal state. Top-level try/catch writes status "failed"
6
+ * on crash. Approval phases auto-reject in detached mode (no interactive
7
+ * approver available).
8
+ *
9
+ * This file is NOT imported by index.ts — it is spawned via `child_process.spawn`.
10
+ */
11
+
12
+ import { readFileSync } from "node:fs";
13
+ import { type AgentScope, discoverAgents, readSubagentSettings } from "./agents.ts";
14
+ import { executeTaskflow } from "./runtime.ts";
15
+ import { getFlow, loadRun, saveRun, DEFAULT_KEPT_RUNS, DEFAULT_RUN_AGE_DAYS } from "./store.ts";
16
+
17
+ interface DetachContext {
18
+ runId: string;
19
+ defName: string;
20
+ args: Record<string, unknown>;
21
+ cwd: string;
22
+ }
23
+
24
+ const contextPath = process.argv[2];
25
+ if (!contextPath) {
26
+ console.error("[detached-runner] Missing context file path argument");
27
+ process.exit(1);
28
+ }
29
+
30
+ let ctx: DetachContext;
31
+ try {
32
+ ctx = JSON.parse(readFileSync(contextPath, "utf-8")) as DetachContext;
33
+ } catch (e) {
34
+ console.error(`[detached-runner] Failed to read context: ${e instanceof Error ? e.message : String(e)}`);
35
+ process.exit(1);
36
+ }
37
+
38
+ const cleanupConfig = { maxKeep: DEFAULT_KEPT_RUNS, maxAgeDays: DEFAULT_RUN_AGE_DAYS };
39
+
40
+ try {
41
+ const state = loadRun(ctx.cwd, ctx.runId);
42
+ if (!state) {
43
+ console.error(`[detached-runner] Run not found: ${ctx.runId}`);
44
+ process.exit(1);
45
+ }
46
+
47
+ // Re-discover agents using the same settings as the host session.
48
+ const settings = readSubagentSettings();
49
+ cleanupConfig.maxKeep = settings.taskflow.maxKeptRuns;
50
+ cleanupConfig.maxAgeDays = settings.taskflow.maxRunAgeDays;
51
+ const scope: AgentScope = state.def.agentScope ?? "user";
52
+ const { agents } = discoverAgents(ctx.cwd, scope, settings.modelRoles, settings.taskflow);
53
+
54
+ const result = await executeTaskflow(state, {
55
+ cwd: ctx.cwd,
56
+ agents,
57
+ globalThinking: settings.globalThinking,
58
+ persist: (s) => saveRun(s, cleanupConfig),
59
+ // No requestApproval — approval phases auto-reject in detached mode
60
+ // (fail-open: phase records the rejection, run continues).
61
+ loadFlow: (name: string) => getFlow(ctx.cwd, name)?.def,
62
+ });
63
+
64
+ saveRun(result.state, cleanupConfig);
65
+ } catch (e) {
66
+ // Top-level catch: persist failure so the host can poll the terminal state.
67
+ const message = e instanceof Error ? e.message : String(e);
68
+ console.error(`[detached-runner] Fatal: ${message}`);
69
+ try {
70
+ const state = loadRun(ctx.cwd, ctx.runId);
71
+ if (state && state.status === "running") {
72
+ state.status = "failed";
73
+ saveRun(state, cleanupConfig);
74
+ }
75
+ } catch {
76
+ // Best-effort — if we can't even load the state, there's nothing to persist.
77
+ }
78
+ process.exit(1);
79
+ }
@@ -27,6 +27,7 @@ import { Type } from "typebox";
27
27
  import { type AgentScope, discoverAgents, readSubagentSettings, shouldSyncBuiltinAgentsToProject, syncBuiltinAgentsToProject } from "./agents.ts";
28
28
  import { renderRunResult, summarizeRun } from "./render.ts";
29
29
  import { RunHistoryComponent, type RunHistoryResult } from "./runs-view.ts";
30
+ import { ApprovalViewComponent, type ApprovalChoice } from "./approval-view.ts";
30
31
  import { executeTaskflow, type ApprovalDecision, type ApprovalRequest, type RuntimeResult } from "./runtime.ts";
31
32
  import { finalPhase, resolveArgs, type Taskflow, validateTaskflow, desugar, isShorthand } from "./schema.ts";
32
33
  import {
@@ -110,6 +111,11 @@ const TaskflowParams = Type.Object({
110
111
  "Destructive: overwrites modelRoles in settings.json. Required for mode='apply-defaults'.",
111
112
  }),
112
113
  ),
114
+ detach: Type.Optional(
115
+ Type.Boolean({
116
+ description: "Run in background (detached child process); return runId immediately. Status polled via store.",
117
+ }),
118
+ ),
113
119
  });
114
120
 
115
121
  function makeRunState(def: Taskflow, args: Record<string, unknown>, cwd: string): RunState {
@@ -166,19 +172,51 @@ async function runFlow(
166
172
  }
167
173
 
168
174
  // Human-in-the-loop approver — only when an interactive UI is available.
175
+ // Renders a centered modal popup (TUI overlay) with a scrollable viewport
176
+ // so long upstream output (e.g. a plan) can be reviewed in full before
177
+ // deciding (mouse wheel / ↑↓ / PgUp / PgDn to scroll).
169
178
  const requestApproval = ctx.hasUI
170
179
  ? async (req: ApprovalRequest): Promise<ApprovalDecision> => {
171
- if (req.upstream?.trim()) {
172
- const snip = req.upstream.replace(/\s+/g, " ").trim();
173
- ctx.ui.notify(`[${def.name}/${req.phaseId}] ${snip.length > 280 ? `${snip.slice(0, 280)}…` : snip}`, "info");
174
- }
175
- const choice = await ctx.ui.select(
176
- `Taskflow approval — ${req.phaseId}: ${req.message}`,
177
- ["Approve", "Reject", "Edit / add guidance"],
178
- { signal },
180
+ const choice = await ctx.ui.custom<ApprovalChoice>(
181
+ (tui, theme, _kb, done) => {
182
+ const view = new ApprovalViewComponent(
183
+ theme,
184
+ {
185
+ title: `Taskflow approval — ${def.name}/${req.phaseId}`,
186
+ message: req.message,
187
+ upstream: req.upstream,
188
+ },
189
+ done,
190
+ () => tui.terminal.rows,
191
+ tui.terminal,
192
+ );
193
+ const onAbort = () => done("reject");
194
+ signal?.addEventListener("abort", onAbort, { once: true });
195
+ return {
196
+ render: (w: number) => view.render(w),
197
+ invalidate: () => view.invalidate(),
198
+ handleInput: (data: string) => {
199
+ view.handleInput(data);
200
+ tui.requestRender();
201
+ },
202
+ dispose: () => {
203
+ view.dispose();
204
+ signal?.removeEventListener("abort", onAbort);
205
+ },
206
+ };
207
+ },
208
+ {
209
+ overlay: true,
210
+ overlayOptions: {
211
+ width: "80%",
212
+ minWidth: 60,
213
+ maxHeight: "85%",
214
+ anchor: "center",
215
+ },
216
+ },
179
217
  );
180
- if (!choice || choice === "Reject") return { decision: "reject" };
181
- if (choice.startsWith("Edit")) {
218
+ if (choice === "reject") return { decision: "reject" };
219
+ if (choice === "edit") {
182
220
  const note = await ctx.ui.input("Guidance passed downstream as this phase's output", "type guidance…", {
183
221
  signal,
184
222
  });
@@ -614,6 +652,41 @@ export default function (pi: ExtensionAPI) {
614
652
  for (const w of v.warnings) {
615
653
  console.warn(`[taskflow:${def.name}] ${w}`);
616
654
  }
655
+ // Detached (background) execution: spawn a child process and return immediately.
656
+ if (params.detach) {
657
+ const state = makeRunState(def, args, ctx.cwd);
658
+ state.detached = true;
659
+ saveRun(state);
660
+
661
+ // Serialize context for the detached runner script.
662
+ const { writeFileSync } = await import("node:fs");
663
+ const { spawn } = await import("node:child_process");
664
+ const os = await import("node:os");
665
+ const path = await import("node:path");
666
+ const tmpFile = path.join(os.tmpdir(), `taskflow-detach-${state.runId}.json`);
667
+ writeFileSync(tmpFile, JSON.stringify({
668
+ runId: state.runId,
669
+ defName: def.name,
670
+ args,
671
+ cwd: ctx.cwd,
672
+ }));
673
+
674
+ const runnerScript = path.join(path.dirname(new URL(import.meta.url).pathname), "detached-runner.ts");
675
+ const child = spawn(process.execPath, ["--experimental-strip-types", runnerScript, tmpFile], {
676
+ detached: true,
677
+ stdio: "ignore",
678
+ });
679
+ child.unref();
680
+
681
+ state.pid = child.pid ?? undefined;
682
+ saveRun(state);
683
+
684
+ return {
685
+ content: [{ type: "text", text: `Taskflow '${def.name}' started in background (pid: ${child.pid}). Run id: ${state.runId}` }],
686
+ details: { action, state, message: state.runId } satisfies TaskflowDetails,
687
+ };
688
+ }
689
+
617
690
  const result = await runFlow(def, args, ctx, signal, onUpdate as any);
618
691
  // Surface the validation warnings in the tool result so the model
619
692
  // can acknowledge or fix them, and the user sees them in the chat.
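The hunk above locates `detached-runner.ts` by taking `new URL(import.meta.url).pathname`. A URL pathname stays percent-encoded (a space appears as `%20`), and on Windows it carries a leading slash before the drive letter. One portable alternative, sketched here under the assumption that the runner sits next to the calling module, is Node's `fileURLToPath` from `node:url`. The helper name `siblingScriptPath` is hypothetical.

```typescript
import * as path from "node:path";
import { fileURLToPath, pathToFileURL } from "node:url";

// Hypothetical helper: resolve a script that lives beside the module
// identified by `moduleUrl` (typically `import.meta.url`).
function siblingScriptPath(moduleUrl: string, scriptName: string): string {
  return path.join(path.dirname(fileURLToPath(moduleUrl)), scriptName);
}

// Usage with a directory name that contains a space.
const exampleUrl = pathToFileURL(path.resolve("my ext", "index.ts")).href;
const resolved = siblingScriptPath(exampleUrl, "detached-runner.ts");

// For comparison: the pathname-based approach keeps the percent-encoding.
const naive = path.join(path.dirname(new URL(exampleUrl).pathname), "detached-runner.ts");

console.log(resolved);
console.log(naive);
```

The comparison only matters when the install path contains characters that URLs encode, or on Windows; on a plain POSIX path both approaches give the same string.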
@@ -260,7 +260,7 @@ function tokenize(input: string): Tok[] {
260
260
  continue;
261
261
  }
262
262
  // number
263
- const numMatch = /^-?\d+(?:\.\d+)?/.exec(input.slice(i));
263
+ const numMatch = /^-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?/.exec(input.slice(i));
264
264
  if (numMatch) {
265
265
  toks.push({ t: "num", v: Number(numMatch[0]) });
266
266
  i += numMatch[0].length;
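The effect of the widened number pattern can be checked outside the tokenizer. This sketch copies the old and new regular expressions from the hunk above; the helper `matchNumber` is hypothetical and only returns the matched prefix.

```typescript
// Patterns copied from the hunk above (before and after the change).
const OLD_NUM = /^-?\d+(?:\.\d+)?/;
const NEW_NUM = /^-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?/;

// Hypothetical helper: return the numeric prefix a pattern would consume.
function matchNumber(re: RegExp, input: string): string | null {
  const m = re.exec(input);
  return m ? m[0] : null;
}

// The old pattern stops before the exponent, leaving "e5" to be tokenized
// as something else; the new pattern consumes the whole literal.
console.log(matchNumber(OLD_NUM, "1e5 > 2"), matchNumber(NEW_NUM, "1e5 > 2"));
```

An exponent marker with no digits after it (as in `1e`) is not consumed by the new pattern, because the exponent group requires at least one digit.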
@@ -13,6 +13,15 @@ import { withFileMutationQueue } from "@earendil-works/pi-coding-agent";
13
13
  import type { AgentConfig } from "./agents.ts";
14
14
  import { emptyUsage, type UsageStats } from "./usage.ts";
15
15
 
16
+ const activeChildren = new Set<number>();
17
+ const killAll = () => {
18
+ for (const pid of activeChildren) {
19
+ try { process.kill(pid, "SIGKILL"); } catch { /* already dead */ }
20
+ }
21
+ };
22
+ process.on("exit", killAll);
23
+ process.on("SIGTERM", () => { killAll(); process.exit(143); });
24
+
16
25
  export interface RunResult {
17
26
  agent: string;
18
27
  task: string;
@@ -345,6 +354,7 @@ export async function runAgentTask(
345
354
  shell: false,
346
355
  stdio: ["ignore", "pipe", "pipe"],
347
356
  });
357
+ if (proc.pid) activeChildren.add(proc.pid);
348
358
  let buffer = "";
349
359
 
350
360
  // Idle watchdog: a subagent that goes silent on stdout for too long is
@@ -389,13 +399,18 @@ export async function runAgentTask(
389
399
  // Cap prevents OOM from verbose tool output (e.g., npm install). 64 KB is
390
400
  // generous for error diagnosis while preventing memory exhaustion.
391
401
  const STDERR_MAX_LEN = 64 * 1024;
402
+ let stderrCapped = false;
392
403
  proc.stderr.on("data", (data) => {
393
- result.stderr += data.toString();
394
- if (result.stderr.length >= STDERR_MAX_LEN) {
395
- result.stderr = result.stderr.slice(0, STDERR_MAX_LEN) + "\n[...stderr truncated at 64KB]";
404
+ if (!stderrCapped) {
405
+ result.stderr += data.toString();
406
+ if (result.stderr.length >= STDERR_MAX_LEN) {
407
+ result.stderr = result.stderr.slice(0, STDERR_MAX_LEN) + "\n[...stderr truncated at 64KB]";
408
+ stderrCapped = true;
409
+ }
396
410
  }
397
411
  });
398
412
  proc.on("close", (code, signal) => {
413
+ if (proc.pid) activeChildren.delete(proc.pid);
399
414
  clearTimers();
400
415
  if (buffer.trim()) processLine(buffer);
401
416
  if (code === null && signal) killedBySignal = signal;
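The `stderrCapped` flag in the hunk above stops the handler from re-appending and re-slicing on every chunk once the cap has been reached. A minimal standalone sketch of the same pattern follows; `CappedBuffer` and `appendCapped` are hypothetical names, and the cap is a parameter here so the behaviour can be seen with short strings.

```typescript
const MAX_LEN = 64 * 1024;
const NOTICE = "\n[...stderr truncated at 64KB]";

interface CappedBuffer {
  text: string;
  capped: boolean;
}

// Hypothetical helper: append a chunk until the cap is hit, then ignore
// all later chunks so the buffer is truncated exactly once.
function appendCapped(buf: CappedBuffer, chunk: string, maxLen: number = MAX_LEN): void {
  if (buf.capped) return;
  buf.text += chunk;
  if (buf.text.length >= maxLen) {
    buf.text = buf.text.slice(0, maxLen) + NOTICE;
    buf.capped = true;
  }
}

// Usage with a tiny cap of 10 characters.
const demo: CappedBuffer = { text: "", capped: false };
appendCapped(demo, "hello", 10);
appendCapped(demo, "world!!", 10);
appendCapped(demo, "ignored", 10);
console.log(demo.text);
```

Note that `String.prototype.length` counts UTF-16 code units rather than bytes, so a cap expressed this way is approximate for non-ASCII output.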
@@ -1025,6 +1025,7 @@ async function executePhase(
1025
1025
  // Using indexOf on the stable `ran` array is reference-based and correct even
1026
1026
  // when two variants produce byte-identical output.
1027
1027
  const ranIdx = (r: RunResult) => ran.indexOf(r) + 1;
1028
+ const budgetSkipCount = results.filter((r) => r.stopReason === "budget-skipped").length;
1028
1029
 
1029
1030
  // All competitors failed → the tournament fails (nothing to judge).
1030
1031
  if (ok.length === 0) {
@@ -1033,6 +1034,7 @@ async function executePhase(
1033
1034
  status: "failed",
1034
1035
  usage: variantUsage,
1035
1036
  error: `tournament '${phase.id}': all ${competitors.length} variants failed`,
1037
+ budgetTruncated: budgetSkipCount > 0 || undefined,
1036
1038
  tournament: { variants: competitors.length, winner: 0, mode },
1037
1039
  inputHash: hashInput(phase.id, "tournament", String(competitors.length)),
1038
1040
  endedAt: Date.now(),
@@ -1047,6 +1049,7 @@ async function executePhase(
1047
1049
  json: parseJson ? safeParse(ok[0].output) : undefined,
1048
1050
  usage: variantUsage,
1049
1051
  model: ok[0].model,
1052
+ budgetTruncated: budgetSkipCount > 0 || undefined,
1050
1053
  tournament: { variants: competitors.length, winner: ranIdx(ok[0]), mode, reason: "only surviving variant" },
1051
1054
  inputHash: hashInput(phase.id, "tournament", String(competitors.length)),
1052
1055
  endedAt: Date.now(),
@@ -1062,6 +1065,7 @@ async function executePhase(
1062
1065
  json: parseJson ? safeParse(ok[0].output) : undefined,
1063
1066
  usage: variantUsage,
1064
1067
  model: ok[0].model,
1068
+ budgetTruncated: budgetSkipCount > 0 || undefined,
1065
1069
  warnings: ["judge skipped: run aborted or budget exceeded"],
1066
1070
  tournament: { variants: competitors.length, winner: ranIdx(ok[0]), mode, reason: "judge skipped" },
1067
1071
  inputHash: hashInput(phase.id, "tournament", String(competitors.length)),
@@ -1095,6 +1099,7 @@ async function executePhase(
1095
1099
  json: parseJson ? safeParse(ok[0].output) : undefined,
1096
1100
  usage: judgeUsage,
1097
1101
  model: ok[0].model,
1102
+ budgetTruncated: budgetSkipCount > 0 || undefined,
1098
1103
  warnings: [`judge failed (${judgeRes.errorMessage ?? "error"}); used variant ${ranIdx(ok[0])}`],
1099
1104
  tournament: { variants: competitors.length, winner: ranIdx(ok[0]), mode, reason: "judge failed" },
1100
1105
  inputHash: hashInput(phase.id, "tournament", String(competitors.length)),
@@ -1117,6 +1122,7 @@ async function executePhase(
1117
1122
  json: parseJson ? safeParse(output) : undefined,
1118
1123
  usage: judgeUsage,
1119
1124
  model: mode === "aggregate" ? judgeRes.model : chosen.model,
1125
+ budgetTruncated: budgetSkipCount > 0 || undefined,
1120
1126
  warnings: winnerIneligible ? [`judge picked an ineligible variant; used variant ${winnerIdx}`] : undefined,
1121
1127
  tournament: { variants: competitors.length, winner: winnerIdx, mode, reason },
1122
1128
  inputHash: hashInput(phase.id, "tournament", String(competitors.length), mode),
@@ -1398,12 +1404,10 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
1398
1404
  let gateBlocked = false;
1399
1405
  let gateReason = "";
1400
1406
  let gateOutput = "";
1401
- // `budgetBlocked` gates the skipping of remaining phases once the cap is hit.
1402
- // `budgetSkipped` records that a phase was *actually* skipped/truncated for
1403
- // budget only then is the run terminal-status "blocked" (a cap crossed by the
1404
- // very last phase, with nothing left to skip, must NOT mark a good run failed).
1407
+ // `budgetBlocked` gates the skipping of remaining phases once the cap is hit
1408
+ // and also drives the terminal "blocked" status — a maxUSD ceiling must never
1409
+ // silently do nothing.
1405
1410
  let budgetBlocked = false;
1406
- let budgetSkipped = false;
1407
1411
  let budgetReason = "";
1408
1412
  const byId = new Map(def.phases.map((p) => [p.id, p]));
1409
1413
 
@@ -1442,7 +1446,7 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
1442
1446
  }
1443
1447
 
1444
1448
  if (skipReason) {
1445
- if (skipReason.startsWith("Budget exceeded")) budgetSkipped = true;
1449
+ if (skipReason.startsWith("Budget exceeded")) budgetBlocked = true;
1446
1450
  state.phases[phase.id] = {
1447
1451
  id: phase.id,
1448
1452
  status: "skipped",
@@ -1485,7 +1489,6 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
1485
1489
  // A fan-out cut short by the cap is itself a budget skip.
1486
1490
  if (ps.budgetTruncated) {
1487
1491
  budgetBlocked = true;
1488
- budgetSkipped = true;
1489
1492
  if (!budgetReason) budgetReason = "fan-out truncated by budget";
1490
1493
  }
1491
1494
  // Budget ceiling: once exceeded, remaining phases are skipped.
@@ -1494,7 +1497,7 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
1494
1497
  // the budget is detected as exceeded. This bounded overshoot is
1495
1498
  // acceptable: budgetBlocked prevents cascading into subsequent layers.
1496
1499
  const ob = overBudget(state);
1497
- if (ob.over && !budgetBlocked) {
1500
+ if (ob.over) {
1498
1501
  budgetBlocked = true;
1499
1502
  budgetReason = ob.reason;
1500
1503
  }
@@ -1517,7 +1520,7 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
1517
1520
 
1518
1521
  state.status = aborted
1519
1522
  ? "paused"
1520
- : gateBlocked || budgetSkipped
1523
+ : gateBlocked || budgetBlocked
1521
1524
  ? "blocked"
1522
1525
  : anyFailed
1523
1526
  ? "failed"
@@ -1527,7 +1530,7 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
1527
1530
  let finalOutput = finalState?.output ?? "(no output)";
1528
1531
  if (gateBlocked) {
1529
1532
  finalOutput = `Gate blocked the workflow.${gateReason ? `\nReason: ${gateReason}` : ""}${gateOutput ? `\n\n${gateOutput}` : ""}`;
1530
- } else if (budgetSkipped) {
1533
+ } else if (budgetBlocked) {
1531
1534
  finalOutput = `Budget exceeded — run halted.${budgetReason ? `\nReason: ${budgetReason}` : ""}${finalState?.output ? `\n\n${finalState.output}` : ""}`;
1532
1535
  }
1533
1536
 
@@ -84,6 +84,10 @@ export interface RunState {
84
84
  createdAt: number;
85
85
  updatedAt: number;
86
86
  cwd: string;
87
+ /** OS PID of a detached runner process (set only for background runs). */
88
+ pid?: number;
89
+ /** True for runs spawned via `detach: true` (background execution). */
90
+ detached?: boolean;
87
91
  }
88
92
 
89
93
  // ---------------------------------------------------------------------------
@@ -458,10 +462,21 @@ function cleanupTerminalRuns(
  }

  // Sort terminal by updatedAt desc (newest first).
- terminal.sort((a, b) => b.updatedAt - a.updatedAt);
+ // Filter out entries with corrupt updatedAt (non-numeric/NaN) BEFORE sorting
+ // to prevent NaN from corrupting sort order. Corrupt entries cannot be
+ // reliably aged, so they are always moved to toRemove.
+ const cleanTerminal: RunIndexEntry[] = [];
+ for (const e of terminal) {
+ if (typeof e.updatedAt === "number" && !Number.isNaN(e.updatedAt)) {
+ cleanTerminal.push(e);
+ } else {
+ toRemove.push(e);
+ }
+ }
+ cleanTerminal.sort((a, b) => b.updatedAt - a.updatedAt);

- for (let i = 0; i < terminal.length; i++) {
- const e = terminal[i]!;
+ for (let i = 0; i < cleanTerminal.length; i++) {
+ const e = cleanTerminal[i]!;
  const expiredByAge = now - e.updatedAt > maxAgeMs;
  const excessByCount = i >= maxKeep;
  if (expiredByAge || excessByCount) {
@@ -473,7 +488,7 @@ function cleanupTerminalRuns(

  // Commit the pruned index while holding the lock so a concurrent
  // updateIndexEntry cannot interleave and lose entries.
- const remaining = terminal.filter((e) => !toRemove.includes(e));
+ const remaining = cleanTerminal.filter((e) => !toRemove.includes(e));
  writeIndex(runsRoot, [...active, ...remaining]);
  });
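
The pruning change in `cleanupTerminalRuns` can be read as three steps: partition out entries with a corrupt `updatedAt`, sort the rest newest first, then drop anything too old or beyond the keep count. The sketch below is a simplified, self-contained illustration of those steps, not the package's code. `pruneTerminal` and `IndexEntry` are hypothetical names, and locking and index writes are left out.

```typescript
// Hypothetical, simplified stand-in for the index entry type used in the diff.
interface IndexEntry {
  runId: string;
  updatedAt: number; // may be NaN or a non-number at runtime if the index is corrupt
}

function pruneTerminal(
  terminal: IndexEntry[],
  now: number,
  maxAgeMs: number,
  maxKeep: number,
): { keep: IndexEntry[]; remove: IndexEntry[] } {
  const clean: IndexEntry[] = [];
  const remove: IndexEntry[] = [];

  // Step 1: corrupt timestamps cannot be aged, so they are always removed.
  for (const e of terminal) {
    if (typeof e.updatedAt === "number" && !Number.isNaN(e.updatedAt)) {
      clean.push(e);
    } else {
      remove.push(e);
    }
  }

  // Step 2: newest first, now that every comparison yields a real number.
  clean.sort((a, b) => b.updatedAt - a.updatedAt);

  // Step 3: drop entries that are too old or beyond the keep count.
  const keep: IndexEntry[] = [];
  clean.forEach((e, i) => {
    const expiredByAge = now - e.updatedAt > maxAgeMs;
    const excessByCount = i >= maxKeep;
    (expiredByAge || excessByCount ? remove : keep).push(e);
  });

  return { keep, remove };
}

const now = 1_000_000;
const result = pruneTerminal(
  [
    { runId: "a", updatedAt: now - 10 },
    { runId: "b", updatedAt: NaN },
    { runId: "c", updatedAt: now - 5_000 },
    { runId: "d", updatedAt: now - 20 },
  ],
  now,
  1_000, // maxAgeMs
  1, // maxKeep
);
console.log(result.keep.map((e) => e.runId)); // [ 'a' ]
console.log(result.remove.map((e) => e.runId)); // [ 'b', 'd', 'c' ]
```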
 
@@ -783,8 +798,12 @@ export function listRuns(cwd: string, limit = 20): RunState[] {
  }

  // Sort by updatedAt desc, slice to limit.
- entries.sort((a, b) => b.updatedAt - a.updatedAt);
- const sliced = entries.slice(0, limit);
+ // Filter out entries with non-numeric/NaN updatedAt BEFORE sorting to
+ // prevent NaN from corrupting V8's sort order (which can displace valid
+ // entries when a limit is applied).
+ const valid = entries.filter((e) => typeof e.updatedAt === "number" && !Number.isNaN(e.updatedAt));
+ valid.sort((a, b) => b.updatedAt - a.updatedAt);
+ const sliced = valid.slice(0, limit);

  // Read full RunState for each entry.
  const runs: RunState[] = [];
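
Why the filter has to come before the sort: a subtraction comparator returns `NaN` whenever either operand is `NaN`, and the sort treats a `NaN` comparator result as "equal", so the ordering becomes inconsistent and a limit applied afterwards can cut off valid entries. The sketch below illustrates this under simplified assumptions; `Entry`, `byNewest` and `newestValid` are hypothetical names, not identifiers from the package.

```typescript
// Hypothetical minimal entry shape for illustration.
interface Entry {
  runId: string;
  updatedAt: number;
}

// Same comparator shape as in the diff: newest first.
const byNewest = (a: Entry, b: Entry): number => b.updatedAt - a.updatedAt;

// Filter, then sort, then slice, as in the patched listRuns.
function newestValid(entries: Entry[], limit: number): Entry[] {
  return entries
    .filter((e) => typeof e.updatedAt === "number" && !Number.isNaN(e.updatedAt))
    .sort(byNewest)
    .slice(0, limit);
}

const corrupt: Entry = { runId: "x", updatedAt: NaN };
const ok: Entry = { runId: "y", updatedAt: 5 };

// The comparator gives no usable ordering once NaN is involved.
console.log(Number.isNaN(byNewest(corrupt, ok))); // true

const top = newestValid(
  [{ runId: "p", updatedAt: 1 }, corrupt, ok, { runId: "q", updatedAt: 9 }],
  2,
);
console.log(top.map((e) => e.runId)); // [ 'q', 'y' ]
```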
@@ -804,6 +823,20 @@ export function hashInput(...parts: string[]): string {
  return crypto.createHash("sha256").update(parts.join("\u0000")).digest("hex").slice(0, 16);
  }

+ /**
+ * Check whether a process with the given PID is still alive.
+ * Uses signal 0 (no signal sent) — succeeds if the process exists and we have
+ * permission to signal it, throws ESRCH if it doesn't exist.
+ */
+ export function isProcessAlive(pid: number): boolean {
+ try {
+ process.kill(pid, 0);
+ return true;
+ } catch {
+ return false;
+ }
+ }
+
  /**
  * Write a file atomically: write to a unique temp file in the same directory,
  * then rename over the target (rename is atomic on the same filesystem). Prevents
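
One detail of the `isProcessAlive` helper added above is worth noting. The bare `catch` returns `false` for every error, including `EPERM`, which `process.kill(pid, 0)` raises when the process exists but the caller is not allowed to signal it. For runs spawned by the same user this should not matter. The sketch below shows a stricter variant purely as an illustration; `isProcessAliveStrict` is a hypothetical name and is not part of the package.

```typescript
// Hypothetical stricter variant of the helper in the diff.
function isProcessAliveStrict(pid: number): boolean {
  try {
    process.kill(pid, 0); // signal 0 sends nothing; it only checks existence and permission
    return true;
  } catch (err) {
    // EPERM: the process exists but we may not signal it, so it still counts as alive.
    // ESRCH (no such process) and anything else are treated as not alive.
    return (err as { code?: string }).code === "EPERM";
  }
}

// The current process is certainly alive.
console.log(isProcessAliveStrict(process.pid)); // true
```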
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "pi-taskflow",
- "version": "0.0.19",
+ "version": "0.0.20",
  "description": "A declarative, verifiable graph of task nodes for the Pi coding agent — not a workflow you script, but a DAG you declare: statically verified before it runs, with dynamic fan-out, gates, isolated subagent context, resumable runs, and saveable commands.",
  "keywords": [
  "pi-package",
@@ -37,7 +37,7 @@
  ],
  "scripts": {
  "typecheck": "tsc --noEmit",
- "test": "PI_TASKFLOW_BUILTIN_AGENTS_DIR= node --experimental-strip-types --test test/interpolate.test.ts test/condition.test.ts test/schema.test.ts test/usage.test.ts test/runtime.test.ts test/features.test.ts test/runner.test.ts test/store.test.ts test/agents.test.ts test/init.test.ts test/render.test.ts test/desugar.test.ts test/cache.test.ts test/loop.test.ts test/tournament.test.ts test/verify.test.ts test/gate-eval.test.ts test/transient-error.test.ts test/runtime-branches.test.ts test/interpolate-extended.test.ts test/store-extended.test.ts test/flow-def.test.ts",
+ "test": "PI_TASKFLOW_BUILTIN_AGENTS_DIR= node --experimental-strip-types --test test/interpolate.test.ts test/condition.test.ts test/schema.test.ts test/usage.test.ts test/runtime.test.ts test/features.test.ts test/runner.test.ts test/store.test.ts test/agents.test.ts test/init.test.ts test/render.test.ts test/approval-view.test.ts test/desugar.test.ts test/cache.test.ts test/loop.test.ts test/tournament.test.ts test/verify.test.ts test/gate-eval.test.ts test/transient-error.test.ts test/runtime-branches.test.ts test/interpolate-extended.test.ts test/store-extended.test.ts test/flow-def.test.ts test/detached.test.ts",
  "test:e2e": "PI_TASKFLOW_PI_BIN=pi node --experimental-strip-types test/e2e.mts",
  "test:dogfood-cache": "node --experimental-strip-types test/dogfood-cache.mts"
  },
@@ -129,6 +129,7 @@ deciding. The (interpolated) `task` is the prompt shown.
  - **Edit** → the typed note becomes this phase's `output`, so you can inject
  guidance mid-run: reference it downstream with `{steps.<id>.output}`.
  - **Non-interactive** runs (headless/CI/print mode) **auto-approve** and record it.
+ - **Background (detached)** runs **auto-reject** (no interactive approver) — downstream sees the rejection; the flow continues (fail-open).

  ```jsonc
  { "id": "checkpoint", "type": "approval", "dependsOn": ["plan"],
@@ -434,11 +435,19 @@ Quick reference:

  ## Actions

- - `action: "run"` — run an inline `define` (a one-off DAG) **or** a saved `name` (with optional `args`). Use `define` for an ad-hoc flow; use `name` to invoke something previously saved.
+ - `action: "run"` — run an inline `define` (a one-off DAG) **or** a saved `name` (with optional `args`). Use `define` for an ad-hoc flow; use `name` to invoke something previously saved. Add `detach: true` to run in the background (returns immediately with the runId; poll the store for status).
  - `action: "save"` — persist `define` (scope `project` — default, committed/shared — or `user`); it becomes `/tf:<name>`. On a name collision, project overrides user.
  - `action: "resume"` — continue a paused/failed run by `runId`.
  - `action: "list"` — list saved flows. `action: "verify"` — static-check a `define` (zero tokens). `action: "agents"` — list available agents.

+ ## Background (detached) runs
+
+ Add `detach: true` to `action: "run"` to spawn the flow in a detached child process. The tool returns immediately with the `runId`; the flow continues running even if the host session exits. Status is polled via the store (`/tf runs` or `action: "resume"`).
+
+ - **Approval phases auto-reject** in detached mode (no interactive approver). Downstream phases see the rejection; the flow continues (fail-open).
+ - **Crash resilience:** if the detached process crashes, the store persists `status: "failed"`; resume with `action: "resume"`.
+ - **Same flow, both modes:** a flow can run foreground or background — `detach` is a dispatch-time decision, not a flow property.
+
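
The "spawn and return immediately" behaviour described in this section can be sketched with Node's `child_process.spawn` and its `detached` option. This is a minimal illustration under stated assumptions, not the code in `extensions/index.ts`: `spawnDetached` and `DetachedHandle` are hypothetical names, the `"run-demo"` id is made up, and the inline `-e` script stands in for the real runner entry script.

```typescript
import { spawn } from "node:child_process";

// Hypothetical handle; `pid` and `detached` mirror the new RunState fields in the diff.
interface DetachedHandle {
  runId: string;
  pid: number | undefined;
  detached: boolean;
}

function spawnDetached(runId: string, script: string): DetachedHandle {
  const child = spawn(process.execPath, ["-e", script], {
    detached: true, // the child can keep running after the parent exits
    stdio: "ignore", // no pipes that would tie the child to the parent
  });
  child.unref(); // do not keep the parent's event loop alive for the child
  return { runId, pid: child.pid, detached: true };
}

// Returns right away; the child keeps running on its own for a short while.
const handle = spawnDetached("run-demo", "setTimeout(() => {}, 100)");
console.log(handle.runId, handle.pid);
```

The recorded PID is what makes a later liveness check (such as the signal-0 probe in `isProcessAlive`) possible when a run still reports as running.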
  ## Operating a run (lifecycle, resume, inspection)

  A run moves through: **running →** `completed` (a `final` phase produced output) **/** `blocked` (a gate emitted BLOCK, an `approval` was rejected, or the `budget` cap was hit) **/** `failed` (a non-`optional` phase errored) **/** `paused` (the run was aborted). `failed` and `paused` runs are resumable; `blocked` is terminal (fix the gate/budget and re-run).