pi-taskflow 0.0.25 → 0.0.26

package/CHANGELOG.md CHANGED
@@ -2,6 +2,116 @@
2
2
 
3
3
  All notable changes to pi-taskflow are documented here. This project follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) format.
4
4
 
5
+ ## [0.0.26] — 2026-06-25
6
+
7
+ > Foundation release: **the convergence roadmap's H1 lands** — a real FlowIR
8
+ > compile seam (M1), a declared dependency plane (M2), and a
9
+ > backward-compatible cache-key migration. v0.0.25 made incremental recompute
10
+ > *trustworthy*; this release makes the contract underneath it *real*: the
11
+ > recompute frontier now reasons over **observed ∪ declared** dependencies, the
12
+ > flow definition compiles through a typed IR surface instead of an inlined
13
+ > hash, and folding the definition into the cache key no longer evicts every
14
+ > pre-existing cross-run entry.
15
+
16
+ ### Added
17
+ - **FlowIR compile seam (M1).** New `extensions/flowir/{index,translate,meta}.ts`
18
+ exposes `compileTaskflowToIR(def) → { ir, meta, hash, usedFallbackHash,
19
+ warnings, errors }` — a typed, never-throwing projection of a desugared flow
20
+ into a content-addressed IR. The runtime now routes `flowDefHash` through this
21
+ seam instead of inlining it. `translate` is currently a 1:1 stub projection
22
+ (so `usedFallbackHash` is `true` and the hash equals the vendored
23
+ `flowDefHash`); it becomes the genuine overstory compiler once that kernel is
24
+ vendored, at which point the cache-key version advances `v2: → v3:`.
25
+ - **`/tf ir <flow>` command + `ir` tool action.** Renders the compiled IR plus
26
+ its hash and any structured `CompileError[]` — zero tokens, no LLM.
27
+ - **Declared dependency plane (M2).** `compileTaskflowToIR` synthesizes per-phase
28
+ `DeclaredDeps { reads, writes }` from interpolation refs
29
+ (`task`/`over`/`when`/`until`/`eval`/`branches`/`with`/`context`) and
30
+ `dependsOn`, attaches them to `ir.meta.declaredDeps`, and persists them to
31
+ `RunState`. `/tf recompute` now computes its stale frontier over
32
+ **observed ∪ declared** rather than observed-only, so a dependency that
33
+ was declared but never interpolated at runtime is no longer missed.
34
+ - **Tests: 753 → 802** (+49) across new suites: `flowir.test.ts`,
35
+ `flowir-declared.test.ts`, `stale-union.test.ts` (incl. a 500-iteration
36
+ property test proving the union frontier is never narrower than observed-only),
37
+ `recompute-union.test.ts`, `cache-migration.test.ts`, plus `e2e-flowir.mts`
38
+ and `e2e-cache-migration.mts`.
39
+
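The union rule described in the M2 entry can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's `computeStaleFrontier`: `ReadMap`, `unionReadMaps`, and `staleFrontier` are hypothetical names, and the phase ids are invented for the example.

```typescript
// Hypothetical shape: phase id -> upstream phase ids it reads.
type ReadMap = Record<string, string[]>;

/** Merge observed and declared read edges per phase, dropping duplicates. */
function unionReadMaps(observed: ReadMap, declared: ReadMap): ReadMap {
  const out: ReadMap = {};
  const ids = Object.keys(observed).concat(Object.keys(declared));
  for (const id of ids) {
    const merged = (observed[id] ?? []).concat(declared[id] ?? []);
    out[id] = merged.filter((dep, i) => merged.indexOf(dep) === i);
  }
  return out;
}

/** Phases that transitively read any seed (the seeds themselves included). */
function staleFrontier(reads: ReadMap, seeds: string[]): Set<string> {
  const stale = new Set<string>(seeds);
  let grew = true;
  while (grew) {
    grew = false;
    for (const id of Object.keys(reads)) {
      if (!stale.has(id) && reads[id].some((up) => stale.has(up))) {
        stale.add(id);
        grew = true;
      }
    }
  }
  return stale;
}

// `report` declares a read of `build` that was never observed at runtime.
const observed: ReadMap = { plan: [], build: ["plan"], report: [] };
const declared: ReadMap = { report: ["build"] };

const observedOnly = staleFrontier(observed, ["plan"]);
const unioned = staleFrontier(unionReadMaps(observed, declared), ["plan"]);
```

In this example the observed-only frontier stops at `build`, while the unioned frontier also reaches `report`. Because the union only adds edges, it can never produce a narrower frontier than the observed map alone.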
40
+ ### Fixed
41
+ - **Cache-key migration no longer evicts existing cross-run entries.** Folding
42
+ `flowdef:` into the key previously invalidated every pre-existing cross-run
43
+ cache entry on upgrade (a one-time miss-storm). `cacheKey` is now versioned
44
+ (`v2:flowdef:`) with a **3-tier lookup**: new key → bare `flowdef:` key →
45
+ legacy (no-flowdef) key. Old entries still hit for one release cycle; there is
46
+ no write-through on a fallback hit (legacy entries age out naturally), and
47
+ every tier still includes `flow:${name}` so two different flows can never
48
+ collide.
49
+ - **Declared plane and recompute guard now see `loop.until` and `gate.eval`.**
50
+ `collectRefs` skipped `until` (loop convergence) and `eval[]` (gate zero-token
51
+ checks), so a dependency expressed only in those fields was absent from the
52
+ declared plane and from the `dryRun:false` unobserved-dependency guard. Both
53
+ are now scanned. (Closes the two MEDIUM findings from the H1 risk review.)
54
+
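A read-only tiered lookup of the kind described in the cache-key entry might look like the sketch below. The real `cacheKeys` layout is not shown in this diff, so the key strings, `TieredKeys`, `tieredKeys`, and `lookup` are all assumptions made for illustration.

```typescript
// Hypothetical key layout; only the tier order is taken from the changelog.
interface TieredKeys {
  key: string; // tier 1: versioned "v2:flowdef:" form
  bareKey: string; // tier 2: un-versioned "flowdef:" form
  legacyKey: string; // tier 3: pre-flowdef form
}

function tieredKeys(flowName: string, defHash: string, input: string): TieredKeys {
  const scoped = `flow:${flowName}|${input}`; // every tier stays flow-scoped
  return {
    key: `v2:flowdef:${defHash}|${scoped}`,
    bareKey: `flowdef:${defHash}|${scoped}`,
    legacyKey: scoped,
  };
}

/** First tier that hits wins; a fallback hit is never written back. */
function lookup<T>(
  store: Map<string, T>,
  keys: TieredKeys,
): { value: T; tier: number } | undefined {
  const tiers = [keys.key, keys.bareKey, keys.legacyKey];
  for (let i = 0; i < tiers.length; i++) {
    const value = store.get(tiers[i]);
    if (value !== undefined) return { value, tier: i + 1 };
  }
  return undefined;
}
```

Leaving out the write-through keeps the fallback path free of side effects: a legacy entry keeps serving hits until it ages out, and no entry is ever duplicated under a newer key.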
55
+ ### Compatibility
56
+ - **Backward compatible.** `RunState.flowDefHash` and `RunState.declaredDeps`
57
+ are optional — pre-0.0.26 run states load unchanged. A compile/hash failure
58
+ fails open: `usedFallbackHash` stays set, cross-run cache is disabled for that
59
+ run, and the key degrades to a flow-scoped (collision-free) form. The one
60
+ observable change on upgrade is a single re-execution of in-flight phases
61
+ whose stored `inputHash` predates the `v2:` prefix.
62
+
63
+ ## [0.0.25] — 2026-06-24
64
+
65
+ > Correctness release: **incremental recompute is now trustworthy.** `/tf
66
+ > recompute` shipped in the prior line as a promising idea — force-rerun a seed,
67
+ > walk its stale frontier, let the cache cut off untouched downstreams. But the
68
+ > dependency graph it walked was a half-truth: reads observed only inside a
69
+ > `when` guard or an `eval` gate were never recorded, a loop that read its own
70
+ > output **deadlocked the scheduler**, and a `{previous.output}` chain could be
71
+ > silently skipped — each one a path where "only rerun what changed" quietly
72
+ > reused **stale** upstream state and returned a wrong answer that *looked*
73
+ > incrementally correct. This release closes all of them: the observed readSet
74
+ > is now complete, the recompute order unions declared **and** observed edges,
75
+ > and real (`dryRun:false`) recomputation refuses to run when it cannot prove
76
+ > the frontier is sound. The headline feature finally earns its safety claim —
77
+ > the difference between *looks* incremental and *provably* incremental.
78
+
79
+ ### Added
80
+ - **Safety guard for real recomputation.** `recomputeTaskflow` with `dryRun:false`
81
+ now refuses to run flows whose dependencies cannot be fully observed through
82
+ the captured readSet: Shared Context Tree (`shareContext` / `contextSharing`),
83
+ `flow` phases, `context:` file pre-reads, and interpolation placeholders such
84
+ as `{previous.output}`, `{args.X}`, or `{item.X}`. This prevents silently
85
+ reusing stale upstream state.
86
+ - **Regression tests** in `test/recompute.test.ts`:
87
+ - observed-read edges still order recomputation even without an explicit
88
+ `dependsOn` declaration;
89
+ - `{previous.output}` chains are rejected for real recomputation;
90
+ - `recomputeTaskflow` returns a fresh `RunState` and does not mutate the
91
+ caller's state.
92
+
93
+ ### Fixed
94
+ - **Loop self-read no longer deadlocks recompute.** A loop whose `until`
95
+ condition references its own prior output (e.g. `{steps.refine.output}`)
96
+ produced a self-edge in the observed-dependency graph, causing `topoLayers` to
97
+ schedule the phase with a permanently non-zero indegree. `observedDeps()` now
98
+ filters self-references so scheduling remains sound.
99
+ - **`when` condition upstream reads are captured.** Conditions are now evaluated
100
+ inside `executePhaseInner` with the same `onRead` hook used by the phase task,
101
+ so upstream refs observed only in a `when` guard are recorded in
102
+ `PhaseState.reads`.
103
+ - **Gate `eval` upstream reads are captured.** The machine-check `eval` branch
104
+ now receives the shared `onRead` hook, and the resulting readSet is persisted
105
+ when an eval-only gate skips the LLM call.
106
+ - **Recompute topo-order now unions declared and observed edges.** Previously
107
+ the recompute order only respected declared `dependsOn`, which could place a
108
+ downstream phase before its observed-but-not-declared upstream refreshed and
109
+ cause false early-cutoff. The scheduling graph now merges both edge sets.
110
+ - **Recompute no longer mutates the caller's RunState.** `recomputeTaskflow`
111
+ clones the input state via `structuredClone` before modifying it.
112
+ - **Help text accuracy.** `/tf` command and tool-action descriptions updated to
113
+ match the new `recompute` and provenance behavior.
114
+
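The loop self-read fix can be illustrated with a simplified layering routine. This is a sketch only: `withoutSelfEdges` and `layerPhases` are hypothetical stand-ins for the package's `observedDeps()` and `topoLayers`, and the example assumes every dependency is itself a key of the map.

```typescript
// Hypothetical shape: phase id -> phase ids it depends on.
type DepMap = Record<string, string[]>;

/** Drop self-references so a phase never waits on itself. */
function withoutSelfEdges(deps: DepMap): DepMap {
  const out: DepMap = {};
  for (const id of Object.keys(deps)) {
    out[id] = deps[id].filter((d) => d !== id);
  }
  return out;
}

/** Layer phases so each runs after its dependencies; report any left over. */
function layerPhases(deps: DepMap): { layers: string[][]; stuck: string[] } {
  const done = new Set<string>();
  const layers: string[][] = [];
  const ids = Object.keys(deps);
  for (;;) {
    const ready = ids.filter(
      (id) => !done.has(id) && deps[id].every((d) => done.has(d)),
    );
    if (ready.length === 0) break;
    ready.forEach((id) => done.add(id));
    layers.push(ready);
  }
  return { layers, stuck: ids.filter((id) => !done.has(id)) };
}

// `refine` reads its own prior output, which shows up as a self-edge.
const raw: DepMap = { draft: [], refine: ["draft", "refine"], publish: ["refine"] };

const broken = layerPhases(raw); // refine can never become ready
const fixed = layerPhases(withoutSelfEdges(raw)); // draft, then refine, then publish
```

With the self-edge in place, `refine` and everything downstream of it are never scheduled. Filtering the edge first lets all three phases land in successive layers.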
5
115
  ## [0.0.24] — 2026-06-23
6
116
 
7
117
  > Feature release: **`/tf compile`** — turn the declared DAG into a Mermaid
@@ -0,0 +1,73 @@
1
+ /**
2
+ * Public entry point for the FlowIR compile seam.
3
+ *
4
+ * `compileTaskflowToIR` is the read-only, content-addressed IR projection used
5
+ * by:
6
+ * - `/tf ir <flow>` / `action=ir` — render the compiled IR + hash (0 tokens)
7
+ * - the runtime (`runTaskflowLayers`) — fold `ir.hash` into the cache key
8
+ * (== `flowDefHash` in the stub; the overstory-canonical hash once the
9
+ * genuine compiler is vendored) and persist `ir.meta.declaredDeps` to
10
+ * `RunState` (M2 declared plane).
11
+ *
12
+ * The stub hash reuses the already-vendored overstory `flowDefHash` algorithm
13
+ * (./hash.ts) so pi-taskflow and overstory share one byte-identical hashing
14
+ * contract today. `usedFallbackHash` is `true` in the stub (the genuine
15
+ * overstory `hashIR` is not yet wired); it flips to `false` once the compiler
16
+ * is vendored, at which point the cache key's `v2:` prefix advances to `v3:`
17
+ * (see docs/internal/cache-migration.md).
18
+ *
19
+ * Pure + async (Web Crypto). Never throws — a hash failure leaves `hash`
20
+ * unset and `usedFallbackHash` true; the runtime degrades to the safe
21
+ * flowName-only cache key (cross-run disabled for that run).
22
+ *
23
+ * @see docs/internal/overstory-convergence-roadmap.md §3 (M1)
24
+ * @see docs/internal/rfc-flowir-compilation.md
25
+ */
26
+
27
+ import type { Taskflow } from "../schema.ts";
28
+ import { flowDefHash } from "./hash.ts";
29
+ import { translateTaskflow } from "./translate.ts";
30
+ import type { TaskflowIR } from "./meta.ts";
31
+
32
+ /**
33
+ * Compile a (desugared) `Taskflow` into its content-addressed IR.
34
+ *
35
+ * The returned `hash` is, in the stub, exactly `flowDefHash(def)` — the
36
+ * overstory-vendored canonical-JSON + SHA-256-truncation contract. The
37
+ * `usedFallbackHash` flag records that this is the *fallback* hash (non-IR-
38
+ * canonical): in the stub it is always `true`, because `translateTaskflow`
39
+ * reports `true` unconditionally and a hash-compute failure also sets it.
40
+ *
41
+ * Never throws. Returns structured diagnostics so `/tf ir` on a broken flow
42
+ * yields a clean error table instead of crashing.
43
+ */
44
+ export async function compileTaskflowToIR(def: Taskflow): Promise<TaskflowIR> {
45
+ const t = translateTaskflow(def);
46
+ let hash: string | undefined;
47
+ try {
48
+ hash = await flowDefHash(def);
49
+ } catch {
50
+ hash = undefined;
51
+ }
52
+ return {
53
+ ir: t.ir,
54
+ meta: t.meta,
55
+ hash,
56
+ warnings: t.warnings,
57
+ errors: t.errors,
58
+ // Stub: translateTaskflow returns `usedFallbackHash: true` unconditionally,
59
+ // so the flag is always `true` today. The `hash === undefined` term matters
60
+ // only once the genuine overstory compiler can return `false`.
61
+ usedFallbackHash: t.usedFallbackHash || hash === undefined,
62
+ };
63
+ }
64
+
65
+ export type {
66
+ CompileError,
67
+ CompileWarning,
68
+ DeclaredDeps,
69
+ FlowIR,
70
+ FlowIRNode,
71
+ TaskflowIR,
72
+ TaskflowIRMeta,
73
+ } from "./meta.ts";
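The doc comment above describes the vendored hash as canonical JSON followed by a truncated SHA-256, but `./hash.ts` itself is not part of this diff. The sketch below covers only the canonicalization step, under the assumption that "canonical" means recursively sorted object keys. `canonicalJson` is a hypothetical helper, not the vendored implementation, and the digest step is omitted.

```typescript
/**
 * Serialize a JSON-like value with object keys sorted recursively, so two
 * definitions that differ only in key order produce the same string.
 * Assumes plain JSON data (no functions, symbols, or cyclic references).
 */
function canonicalJson(value: unknown): string {
  if (value === undefined) return "null"; // mirrors JSON.stringify inside arrays
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) {
    return `[${value.map((v) => canonicalJson(v)).join(",")}]`;
  }
  const rec = value as Record<string, unknown>;
  const body = Object.keys(rec)
    .filter((k) => rec[k] !== undefined) // undefined properties are omitted
    .sort()
    .map((k) => `${JSON.stringify(k)}:${canonicalJson(rec[k])}`);
  return `{${body.join(",")}}`;
}

const a = canonicalJson({ b: 1, a: { d: 2, c: 3 } });
const b = canonicalJson({ a: { c: 3, d: 2 }, b: 1 });
```

Both calls yield the same string, which is the property a content-addressed key depends on: the hash should change only when the definition's content changes, not when its keys are reordered.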
@@ -0,0 +1,126 @@
1
+ /**
2
+ * Type definitions for the FlowIR compile seam.
3
+ *
4
+ * This is the **stub/projection** layer of the overstory-convergence roadmap's
5
+ * M1 slice: we project a pi-taskflow `Taskflow` into a `FlowIR` shape that
6
+ * mirrors overstory's IR contract *structurally* (nodes with `inject`/`emits`)
7
+ * without yet compiling to overstory's native inject/emits model. The hash
8
+ * contract (overstory's `hashIR` algorithm) is shared via `flowDefHash` — see
9
+ * `./hash.ts`. When the genuine overstory compiler is vendored later, the
10
+ * `usedFallbackHash` flag flips to `false` and `ir` becomes the canonical IR;
11
+ * until then this seam is read-only, pure, and never throws.
12
+ *
13
+ * Pure module: no IO, no Date, no randomness. Type-only where possible.
14
+ *
15
+ * @see docs/internal/overstory-convergence-roadmap.md §3 (M1)
16
+ * @see docs/internal/rfc-flowir-compilation.md
17
+ */
18
+
19
+ import type { Budget, Taskflow } from "../schema.ts";
20
+
21
+ // ---------------------------------------------------------------------------
22
+ // Declared dependency plane (compile-time, M2)
23
+ // ---------------------------------------------------------------------------
24
+
25
+ /**
26
+ * A phase's *declared* (static) dependency footprint, synthesized at compile
27
+ * time from `{steps.X}` interpolation refs (via `collectRefs`) plus `dependsOn`.
28
+ * `reads` = the upstream step ids statically referenced by this phase's
29
+ * task/over/when/until/eval/branches/with/context; `writes` = its own step id.
30
+ *
31
+ * This is the *declared* plane — distinct from the *observed* readSet captured
32
+ * at runtime (M3 `PhaseState.reads`). The two are reconciled by a **union**
33
+ * (`observed ∪ declared`) in `computeStaleFrontier` / `recomputeTaskflow` so a
34
+ * declared-but-unobserved edge (e.g. a `when` ref that never fired) is still
35
+ * treated as a dependency for staleness propagation. JSON-safe `Record` shape
36
+ * (not `Map`) so it round-trips through `RunState` persistence.
37
+ */
38
+ export interface DeclaredDeps {
39
+ /** Upstream step ids statically referenced by this phase's interpolation. */
40
+ reads: string[];
41
+ /** Step id(s) this phase emits — currently `[phase.id]` (1:1 projection). */
42
+ writes: string[];
43
+ }
44
+
45
+ // ---------------------------------------------------------------------------
46
+ // FlowIR (1:1 projection of a Taskflow)
47
+ // ---------------------------------------------------------------------------
48
+
49
+ /**
50
+ * A single IR node — one per pi-taskflow phase. `kind` is the native phase type
51
+ * (a 1:1 projection; the overstory-native kind lowering is deferred per roadmap
52
+ * §6.1). `inject`/`emits` mirror overstory's contract: a node *injects*
53
+ * (reads) the outputs of its upstream nodes and *emits* (writes) its own.
54
+ */
55
+ export interface FlowIRNode {
56
+ id: string;
57
+ /** pi-taskflow phase type (1:1 projection; `agent`|`parallel`|`map`|…). */
58
+ kind: string;
59
+ /** Synthesized declared reads: the `{steps.X}` refs this node's task
60
+ * interpolates. (overstory-native `inject` lowering is deferred.) */
61
+ inject: string[];
62
+ /** What this node emits — currently `[id]` (1:1 projection). */
63
+ emits: string[];
64
+ /** Raw `when` guard passthrough (stub: not rewritten to IR conditions). */
65
+ when?: string;
66
+ }
67
+
68
+ /** The compiled IR: a flat list of nodes plus flow-level metadata. */
69
+ export interface FlowIR {
70
+ name: string;
71
+ nodes: FlowIRNode[];
72
+ args?: Taskflow["args"];
73
+ budget?: Budget;
74
+ concurrency?: number;
75
+ }
76
+
77
+ // ---------------------------------------------------------------------------
78
+ // Compile diagnostics
79
+ // ---------------------------------------------------------------------------
80
+
81
+ /** A hard compile error (none in the stub; reserved for the genuine compiler). */
82
+ export interface CompileError {
83
+ phaseId?: string;
84
+ code: string;
85
+ message: string;
86
+ }
87
+
88
+ /** A non-fatal advisory (e.g. a `{steps.X}` ref not reachable via dependsOn). */
89
+ export interface CompileWarning {
90
+ phaseId?: string;
91
+ message: string;
92
+ }
93
+
94
+ // ---------------------------------------------------------------------------
95
+ // Meta + composite return type
96
+ // ---------------------------------------------------------------------------
97
+
98
+ /**
99
+ * Compile-time metadata attached to the IR. `declaredDeps` is the M2 declared
100
+ * plane (per-phase `DeclaredDeps`); `sidecar` carries every pi-taskflow-specific
101
+ * field not represented in `FlowIRNode` so the projection is lossless and can
102
+ * round-trip back to a runnable `Taskflow`.
103
+ */
104
+ export interface TaskflowIRMeta {
105
+ sourceFlowName: string;
106
+ /** Per-phase declared dependency footprint (M2). JSON-safe. */
107
+ declaredDeps: Record<string, DeclaredDeps>;
108
+ /** Pi-taskflow-specific fields preserved verbatim for round-trip. */
109
+ sidecar: Record<string, unknown>;
110
+ }
111
+
112
+ /**
113
+ * The composite compile result (RFC §5). `ir`/`hash` are present unless
114
+ * synthesis failed (stub never fails). `usedFallbackHash` is `true` whenever
115
+ * the stub cannot produce an overstory-canonical hash (always, in the stub:
116
+ * the hash is the `flowDefHash` fallback; flips to `false` once the genuine
117
+ * compiler is vendored).
118
+ */
119
+ export interface TaskflowIR {
120
+ ir?: FlowIR;
121
+ meta: TaskflowIRMeta;
122
+ hash?: string;
123
+ warnings: CompileWarning[];
124
+ errors: CompileError[];
125
+ usedFallbackHash: boolean;
126
+ }
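The `DeclaredDeps` comment notes that the declared plane uses a JSON-safe `Record` rather than a `Map` so that it survives `RunState` persistence. A small sketch of the difference, using invented phase ids and a local copy of the interface:

```typescript
// Local copy of the interface from the diff, repeated so the sketch is self-contained.
interface DeclaredDeps {
  reads: string[];
  writes: string[];
}

const asRecord: Record<string, DeclaredDeps> = {
  build: { reads: ["plan"], writes: ["build"] },
};
const asMap = new Map<string, DeclaredDeps>([["build", asRecord.build]]);

// A plain object serializes its entries; a Map has no enumerable own
// properties, so JSON.stringify emits an empty object for it.
const recordJson = JSON.stringify(asRecord);
const mapJson = JSON.stringify(asMap);

const restored = JSON.parse(recordJson) as Record<string, DeclaredDeps>;
```

Here `mapJson` is `"{}"`, so a `Map`-based plane would be silently emptied on save, while `restored` carries the same reads and writes as the original record.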
@@ -0,0 +1,163 @@
1
+ /**
2
+ * FlowIR translation — the 1:1 projection of a pi-taskflow `Taskflow` into the
3
+ * `FlowIR` shape.
4
+ *
5
+ * **Stub/projection** (M1): this is a *structural* mirror, NOT a compile to
6
+ * overstory's native inject/emits model (which expects an explicit emit
7
+ * declaration pi-taskflow doesn't have — see roadmap §6.1). Each phase becomes
8
+ * one `FlowIRNode`; `inject` is synthesized from `{steps.X}` interpolation refs
9
+ * (`collectRefs`), `emits` is `[phase.id]`. The overstory-native `kind` lowering
10
+ * is deliberately deferred.
11
+ *
12
+ * Pure, synchronous, never throws. Used by `compileTaskflowToIR` (./index.ts).
13
+ *
14
+ * @see docs/internal/overstory-convergence-roadmap.md §3 (M1)
15
+ */
16
+
17
+ import { collectRefs, type Phase, type Taskflow } from "../schema.ts";
18
+ import type {
19
+ CompileError,
20
+ CompileWarning,
21
+ DeclaredDeps,
22
+ FlowIR,
23
+ FlowIRNode,
24
+ TaskflowIRMeta,
25
+ } from "./meta.ts";
26
+
27
+ // ---------------------------------------------------------------------------
28
+ // Sidecar: the pi-taskflow-specific fields not represented in FlowIRNode.
29
+ // Everything preserved verbatim so the projection is lossless and can
30
+ // round-trip back to a runnable Taskflow. Defined as an explicit allowlist:
31
+ // only the fields named here are copied, so a field added to the DSL must
32
+ // also be added to this list or the sidecar will drop it.
33
+ // ---------------------------------------------------------------------------
34
+
35
+ const SIDECAR_PHASE_FIELDS = [
36
+ "task",
37
+ "over",
38
+ "as",
39
+ "branches",
40
+ "from",
41
+ "use",
42
+ "def",
43
+ "with",
44
+ "until",
45
+ "maxIterations",
46
+ "convergence",
47
+ "variants",
48
+ "judge",
49
+ "judgeAgent",
50
+ "mode",
51
+ "dependsOn",
52
+ "join",
53
+ "when",
54
+ "retry",
55
+ "output",
56
+ "model",
57
+ "thinking",
58
+ "tools",
59
+ "cwd",
60
+ "context",
61
+ "contextLimit",
62
+ "onBlock",
63
+ "eval",
64
+ "cache",
65
+ "shareContext",
66
+ "optional",
67
+ "final",
68
+ "concurrency",
69
+ ] as const;
70
+
71
+ /** Build the per-phase sidecar record (verbatim copy of non-IR fields). */
72
+ function sidecarForPhase(phase: Phase): Record<string, unknown> {
73
+ const out: Record<string, unknown> = {};
74
+ const rec = phase as Record<string, unknown>;
75
+ for (const k of SIDECAR_PHASE_FIELDS) {
76
+ if (k in rec && rec[k] !== undefined) out[k] = rec[k];
77
+ }
78
+ return out;
79
+ }
80
+
81
+ /**
82
+ * Translate a desugared `Taskflow` into a 1:1 `FlowIR` projection + declared
83
+ * dependency metadata. Never throws: malformed input yields warnings/errors in
84
+ * the return value, not an exception (so `/tf ir` on a broken flow still
85
+ * produces a structured diagnostic rather than crashing the tool).
86
+ *
87
+ * `usedFallbackHash` is `true` unconditionally in the stub: the hash produced
88
+ * is `flowDefHash` (the definition fingerprint), NOT the overstory-IR-canonical
89
+ * hash, so callers can never mistake a stub hash for a canonical one. It flips
90
+ * to `false` only once the genuine overstory compiler is vendored and the hash
91
+ * is IR-canonical; a `when` guard remains a *future* fallback driver then.
92
+ */
93
+ export function translateTaskflow(def: Taskflow): {
94
+ ir: FlowIR;
95
+ meta: TaskflowIRMeta;
96
+ warnings: CompileWarning[];
97
+ errors: CompileError[];
98
+ usedFallbackHash: boolean;
99
+ } {
100
+ const warnings: CompileWarning[] = [];
101
+ const errors: CompileError[] = [];
102
+ const declaredDeps: Record<string, DeclaredDeps> = {};
103
+ const sidecarPhases: Record<string, unknown> = {};
104
+
105
+ // In the stub the hash is ALWAYS the fallback (flowDefHash — the definition
106
+ // fingerprint, not the overstory-IR-canonical hash). The `when` guard is a
107
+ // *future* driver (the genuine compiler can't lower conditions → fallback);
108
+ // today the stub unconditionally uses the fallback so callers can never
109
+ // mistake a stub hash for a canonical one. Flips to `false` only once the
110
+ // genuine overstory compiler is vendored and the hash is IR-canonical.
111
+ const usedFallbackHash = true;
112
+
113
+ const nodes: FlowIRNode[] = def.phases.map((phase) => {
114
+ const refs = collectRefs(phase);
115
+ // declared reads: the {steps.X} refs this phase statically references.
116
+ const reads = refs.steps.filter((id) => id !== phase.id);
117
+ declaredDeps[phase.id] = { reads, writes: [phase.id] };
118
+
119
+ // Advisory: a {steps.X} ref whose target doesn't exist (mirrors the
120
+ // validation check but non-fatal here — validation is the source of
121
+ // truth; this is a read-only diagnostic).
122
+ const knownIds = new Set(def.phases.map((p) => p.id));
123
+ for (const r of refs.steps) {
124
+ if (r !== phase.id && !knownIds.has(r)) {
125
+ warnings.push({
126
+ phaseId: phase.id,
127
+ message: `references {steps.${r}.*} but no phase '${r}' exists`,
128
+ });
129
+ }
130
+ }
131
+
132
+ if (phase.when !== undefined) {
133
+ // `when` is a future fallback driver; today the stub is always fallback.
134
+ // (Kept as a structural marker on the node for round-trip.)
135
+ }
136
+
137
+ sidecarPhases[phase.id] = sidecarForPhase(phase);
138
+
139
+ return {
140
+ id: phase.id,
141
+ kind: phase.type ?? "agent",
142
+ inject: reads,
143
+ emits: [phase.id],
144
+ when: phase.when,
145
+ } satisfies FlowIRNode;
146
+ });
147
+
148
+ const ir: FlowIR = {
149
+ name: def.name,
150
+ nodes,
151
+ args: def.args,
152
+ budget: def.budget,
153
+ concurrency: def.concurrency,
154
+ };
155
+
156
+ const meta: TaskflowIRMeta = {
157
+ sourceFlowName: def.name,
158
+ declaredDeps,
159
+ sidecar: { phases: sidecarPhases },
160
+ };
161
+
162
+ return { ir, meta, warnings, errors, usedFallbackHash };
163
+ }
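`translateTaskflow` relies on `collectRefs` from `../schema.ts`, which is not included in this diff. The sketch below shows one way declared reads could be synthesized from `{steps.X}` placeholders. `stepRefs` and `declaredReads` are hypothetical helpers, and the id grammar (letters, digits, `_`, `-`) is an assumption.

```typescript
/** Pull unique step ids out of `{steps.<id>...}` placeholders, in first-seen order. */
function stepRefs(text: string): string[] {
  const ids: string[] = [];
  // Assumed id grammar; the package's actual rules may differ.
  const re = /\{steps\.([A-Za-z0-9_-]+)[^}]*\}/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    if (ids.indexOf(m[1]) === -1) ids.push(m[1]);
  }
  return ids;
}

/** Declared reads for one phase: refs across its fields, minus self-references. */
function declaredReads(phaseId: string, fields: string[]): string[] {
  return stepRefs(fields.join("\n")).filter((id) => id !== phaseId);
}

// A loop phase whose `until` condition mentions its own output.
const reads = declaredReads("refine", [
  "Improve {steps.draft.output}", // task
  "{steps.refine.output} == {steps.draft.output}", // until
]);
```

The self-reference to `refine` is filtered out, matching the `id !== phase.id` filter in `translateTaskflow`, so only `draft` remains as a declared read.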
@@ -45,7 +45,8 @@ import {
45
45
  } from "./store.ts";
46
46
  import { CacheStore } from "./cache.ts";
47
47
  import { safeParse } from "./interpolate.ts";
48
- import { formatWhyStale, readMapOf } from "./stale.ts";
48
+ import { declaredReadMapOfDef, formatWhyStale, readMapOf } from "./stale.ts";
49
+ import type { TaskflowIR } from "./flowir/index.ts";
49
50
  import {
50
51
  isValidKey,
51
52
  queueSpawn,
@@ -86,8 +87,8 @@ const ShorthandStep = Type.Object(
86
87
  );
87
88
 
88
89
  const TaskflowParams = Type.Object({
89
- action: StringEnum(["run", "save", "resume", "list", "agents", "init", "verify", "compile", "provenance", "why-stale", "recompute", "cache-clear"] as const, {
90
- description: "What to do: run a flow, save a definition, resume a paused run, list saved flows, list available agents, init model role configuration, verify the DAG, compile the DAG to a Mermaid diagram + verification report, show observed readSet provenance, explain why a run is stale, minimally recompute a stale run, or clear the cross-run memoization cache",
90
+ action: StringEnum(["run", "save", "resume", "list", "agents", "init", "verify", "compile", "ir", "provenance", "why-stale", "recompute", "cache-clear"] as const, {
91
+ description: "What to do: run a flow, save a definition, resume a paused run, list saved flows, list available agents, init model role configuration, verify the DAG, compile the DAG to a Mermaid diagram + verification report, compile to FlowIR + content hash, show observed readSet provenance, explain why a run is stale, minimally recompute a stale run, or clear the cross-run memoization cache",
91
92
  default: "run",
92
93
  }),
93
94
  name: Type.Optional(Type.String({ description: "Name of a saved flow (for run/save without inline define)" })),
@@ -151,6 +152,43 @@ const TaskflowParams = Type.Object({
151
152
  ),
152
153
  });
153
154
 
155
+ function formatFlowIR(ir: TaskflowIR): string {
156
+ const lines: string[] = [];
157
+ lines.push(`# FlowIR — "${ir.meta.sourceFlowName}"`);
158
+ lines.push("");
159
+ if (ir.hash) {
160
+ lines.push(`**content hash:** \`${ir.hash}\`${ir.usedFallbackHash ? " (fallback — stub projection)" : " (overstory-canonical)"}`);
161
+ lines.push("");
162
+ } else {
163
+ lines.push("**content hash:** _(unavailable — computation failed)_");
164
+ lines.push("");
165
+ }
166
+ if (ir.errors.length) {
167
+ lines.push(`## Errors (${ir.errors.length})`);
168
+ for (const e of ir.errors) lines.push(`- [${e.code}]${e.phaseId ? ` [${e.phaseId}]` : ""}: ${e.message}`);
169
+ lines.push("");
170
+ }
171
+ if (ir.warnings.length) {
172
+ lines.push(`## Warnings (${ir.warnings.length})`);
173
+ for (const w of ir.warnings) lines.push(`- ${w.phaseId ? `[${w.phaseId}] ` : ""}${w.message}`);
174
+ lines.push("");
175
+ }
176
+ lines.push("## Nodes (1:1 projection)");
177
+ lines.push("");
178
+ for (const n of ir.ir?.nodes ?? []) {
179
+ lines.push(`- **${n.id}** (kind: \`${n.kind}\`) inject:[${n.inject.join(", ")}] emits:[${n.emits.join(", ")}]${n.when ? ` when: \`${n.when}\`` : ""}`);
180
+ }
181
+ lines.push("");
182
+ lines.push("## Declared dependencies (M2)");
183
+ lines.push("");
184
+ lines.push("| phase | reads | writes |");
185
+ lines.push("|-------|-------|--------|");
186
+ for (const [id, deps] of Object.entries(ir.meta.declaredDeps)) {
187
+ lines.push(`| ${id} | ${deps.reads.join(", ") || "—"} | ${deps.writes.join(", ")} |`);
188
+ }
189
+ return lines.join("\n");
190
+ }
191
+
154
192
  function formatProvenance(run: RunState): string {
155
193
  const lines: string[] = [];
156
194
  lines.push(`Provenance — run ${run.runId} · flow "${run.flowName}" · ${run.status}`);
@@ -666,6 +704,46 @@ export default function (pi: ExtensionAPI) {
666
704
  };
667
705
  }
668
706
 
707
+ if (action === "ir") {
708
+ const { compileTaskflowToIR } = await import("./flowir/index.ts");
709
+ // Resolve definition: inline define (object or JSON/fenced string), shorthand,
710
+ // or saved name. Mirrors action=compile / action=verify.
711
+ let def: Taskflow | undefined;
712
+ let resolvedDefine: unknown = params.define;
713
+ if (typeof resolvedDefine === "string") {
714
+ const parsed = safeParse(resolvedDefine);
715
+ if (parsed && typeof parsed === "object") resolvedDefine = parsed;
716
+ }
717
+ if (resolvedDefine) {
718
+ const d = resolvedDefine as Record<string, unknown>;
719
+ if (typeof d === "object" && d !== null && Array.isArray(d.phases)) {
720
+ def = d as unknown as Taskflow;
721
+ } else if (isShorthand(resolvedDefine)) {
722
+ try {
723
+ def = desugar(resolvedDefine) as Taskflow;
724
+ } catch (e) {
725
+ return errorResult(action, `Invalid shorthand: ${e instanceof Error ? e.message : String(e)}`);
726
+ }
727
+ }
728
+ } else if (params.name) {
729
+ const saved = getFlow(ctx.cwd, params.name);
730
+ if (saved) def = saved.def;
731
+ }
732
+ if (!def) {
733
+ return errorResult(action, "Provide 'define' (DSL) or 'name' (saved flow) to compile to IR.");
734
+ }
735
+ // Schema validation first so a malformed graph gives a clean error.
736
+ const vr = validateTaskflow(def, { cwd: ctx.cwd ? String(ctx.cwd) : undefined });
737
+ if (!vr.ok) {
738
+ return errorResult(action, `Schema validation failed:\n${vr.errors.join("\n")}`);
739
+ }
740
+ const ir = await compileTaskflowToIR(def);
741
+ return {
742
+ content: [{ type: "text", text: formatFlowIR(ir) }],
743
+ details: { action } satisfies TaskflowDetails,
744
+ };
745
+ }
746
+
669
747
  if (action === "cache-clear") {
670
748
  const removed = new CacheStore(ctx.cwd).clear();
671
749
  return {
@@ -701,9 +779,10 @@ export default function (pi: ExtensionAPI) {
701
779
  const run = loadRun(ctx.cwd, params.runId);
702
780
  if (!run) return errorResult(action, `Run not found: ${params.runId}`);
703
781
  const reads = readMapOf(run.phases);
782
+ const declared = declaredReadMapOfDef(run.def);
704
783
  const seeds = params.phaseId ? [String(params.phaseId)] : [];
705
784
  return {
706
- content: [{ type: "text", text: formatWhyStale(run.runId, run.flowName, reads, seeds) }],
785
+ content: [{ type: "text", text: formatWhyStale(run.runId, run.flowName, reads, seeds, declared) }],
707
786
  details: { action } satisfies TaskflowDetails,
708
787
  };
709
788
  }
@@ -931,7 +1010,7 @@ export default function (pi: ExtensionAPI) {
931
1010
  pi.registerCommand("tf", {
932
1011
  description: "Taskflow: list | run <name> | show <name> | compile <name> | runs | init",
933
1012
  getArgumentCompletions: (prefix) => {
934
- const subs = ["list", "run", "show", "runs", "resume", "init", "save", "verify", "compile", "provenance", "why-stale", "recompute"];
1013
+ const subs = ["list", "run", "show", "runs", "resume", "init", "save", "verify", "compile", "ir", "provenance", "why-stale", "recompute"];
935
1014
  const items = subs.map((s) => ({ value: s, label: s }));
936
1015
  const filtered = items.filter((i) => i.value.startsWith(prefix));
937
1016
  return filtered.length > 0 ? filtered : null;
@@ -987,6 +1066,30 @@ export default function (pi: ExtensionAPI) {
987
1066
  return;
988
1067
  }
989
1068
 
1069
+ if (sub === "ir") {
1070
+ if (!arg) {
1071
+ ctx.ui.notify("Usage: /tf ir <name>", "warning");
1072
+ return;
1073
+ }
1074
+ const flowName = arg.trim().split(/\s+/)[0];
1075
+ const flow = getFlow(ctx.cwd, flowName);
1076
+ if (!flow) {
1077
+ ctx.ui.notify(`Flow not found: ${flowName}`, "error");
1078
+ return;
1079
+ }
1080
+ // Schema-validate before compiling so a malformed saved flow yields a
1081
+ // clean error rather than a half-rendered report (mirrors action=ir).
1082
+ const vr = validateTaskflow(flow.def, { cwd: ctx.cwd ? String(ctx.cwd) : undefined });
1083
+ if (!vr.ok) {
1084
+ ctx.ui.notify(`Schema validation failed:\n${vr.errors.join("\n")}`, "error");
1085
+ return;
1086
+ }
1087
+ const { compileTaskflowToIR } = await import("./flowir/index.ts");
1088
+ const ir = await compileTaskflowToIR(flow.def);
1089
+ ctx.ui.notify(formatFlowIR(ir), "info");
1090
+ return;
1091
+ }
1092
+
990
1093
  if (sub === "provenance") {
991
1094
  if (!arg) {
992
1095
  ctx.ui.notify("Usage: /tf provenance <runId>", "warning");
@@ -1013,7 +1116,8 @@ export default function (pi: ExtensionAPI) {
1013
1116
  return;
1014
1117
  }
1015
1118
  const reads = readMapOf(run.phases);
1016
- ctx.ui.notify(formatWhyStale(run.runId, run.flowName, reads, rest), "info");
1119
+ const declared = declaredReadMapOfDef(run.def);
1120
+ ctx.ui.notify(formatWhyStale(run.runId, run.flowName, reads, rest, declared), "info");
1017
1121
  return;
1018
1122
  }
1019
1123
 
@@ -20,8 +20,8 @@ import { type Budget, type CacheScope, dependenciesOf, finalPhase, LOOP_DEFAULT_
20
20
  import { verifyTaskflow } from "./verify.ts";
21
21
  import { hashInput, newRunId, type PhaseState, type RunState, runsDir } from "./store.ts";
22
22
  import { CacheStore, resolveFingerprint } from "./cache.ts";
23
- import { flowDefHash } from "./flowir/hash.ts";
24
- import { computeStaleFrontier, readMapOf } from "./stale.ts";
23
+ import { compileTaskflowToIR } from "./flowir/index.ts";
24
+ import { computeStaleFrontier, declaredReadMapOfDef, readMapOf } from "./stale.ts";
25
25
  import { ctxDirFor, drainPendingSpawns, initCtxDir, registerNode, setNodeStatus, type SpawnAssignment } from "./context-store.ts";
26
26
  import { allocateWorkspace, isWorkspaceKeyword, type Workspace } from "./workspace.ts";
27
27
 
@@ -923,7 +923,7 @@ async function executePhaseInner(
923
923
  }
924
924
  if (allPassed) {
925
925
  // All evals passed — skip the LLM gate, return an auto-pass.
926
- const inputHash = cacheKey(cc, [phase.id, "eval-skip"]);
926
+ const inputHash = cacheKeys(cc, [phase.id, "eval-skip"]).key;
927
927
  const ps: PhaseState = {
928
928
  id: phase.id,
929
929
  status: "done",
@@ -943,8 +943,9 @@ async function executePhaseInner(
943
943
  const refWarning = warnUnresolvedRefs(phase.id, interp.missing);
944
944
  const fullTask = preRead + text;
945
945
  const agentName = resolveAgent(phase.agent, deps, state);
946
- const inputHash = cacheKey(cc, [phase.id, agentName, phase.model ?? "", fullTask]);
947
- const cached = cachedPhase(cc, inputHash);
946
+ const ck = cacheKeys(cc, [phase.id, agentName, phase.model ?? "", fullTask]);
947
+ const inputHash = ck.key;
948
+ const cached = cachedPhase(cc, ck);
948
949
  if (cached) return cached;
949
950
 
950
951
  const r = await runOne(agentName, fullTask, liveSink(state, phase.id, emitProgress), nodeIdFor());
@@ -1003,7 +1004,7 @@ async function executePhaseInner(
1003
1004
  const retryCtx = buildInterpolationContext(state, lastCompletedOutput(state, phase));
1004
1005
  const retryText = interpolate(phase.task ?? "", retryCtx).text;
1005
1006
  const retryTask = preRead + retryText;
1006
- const retryIH = cacheKey(cc, [phase.id, agentName, phase.model ?? "", retryTask]);
1007
+ const retryIH = cacheKeys(cc, [phase.id, agentName, phase.model ?? "", retryTask]).key;
1007
1008
  const retryR = await runOne(agentName, retryTask, liveSink(state, phase.id, emitProgress));
1008
1009
  gatePs = resultToPhaseState(phase.id, retryR, retryIH, parseJson);
1009
1010
  if (gatePs.status === "done") gatePs.gate = parseGateVerdict(retryR.output);
@@ -1025,8 +1026,9 @@ async function executePhaseInner(
1025
1026
  task: preRead + r.text,
1026
1027
  };
1027
1028
  });
1028
- const inputHash = cacheKey(cc, [phase.id, phase.model ?? "", JSON.stringify(branches)]);
1029
- const cached = cachedPhase(cc, inputHash);
1029
+ const ck = cacheKeys(cc, [phase.id, phase.model ?? "", JSON.stringify(branches)]);
1030
+ const inputHash = ck.key;
1031
+ const cached = cachedPhase(cc, ck);
1030
1032
  if (cached) return cached;
1031
1033
 
1032
1034
  const results = await runFanout(branches);
@@ -1066,8 +1068,9 @@ async function executePhaseInner(
1066
1068
  task: preRead + interpolate(phase.task ?? "", localCtx).text,
1067
1069
  };
1068
1070
  });
1069
- const inputHash = cacheKey(cc, [phase.id, phase.model ?? "", JSON.stringify(tasks)]);
1070
- const cached = cachedPhase(cc, inputHash);
1071
+ const ck = cacheKeys(cc, [phase.id, phase.model ?? "", JSON.stringify(tasks)]);
1072
+ const inputHash = ck.key;
1073
+ const cached = cachedPhase(cc, ck);
1071
1074
  if (cached) return cached;
1072
1075
 
1073
1076
  const results = await runFanout(tasks);
@@ -1087,8 +1090,9 @@ async function executePhaseInner(
1087
1090
  const readRefs: string[] = [];
1088
1091
  const ctx = buildInterpolationContext(state, previousOutput, undefined, (ref) => readRefs.push(ref));
1089
1092
  const message = interpolate(phase.task ?? "Approve to continue?", ctx).text;
1090
- const inputHash = cacheKey(cc, [phase.id, phase.model ?? "", "approval", message]);
1091
- const cached = cachedPhase(cc, inputHash);
1093
+ const ck = cacheKeys(cc, [phase.id, phase.model ?? "", "approval", message]);
1094
+ const inputHash = ck.key;
1095
+ const cached = cachedPhase(cc, ck);
1092
1096
  if (cached) return cached;
1093
1097
 
1094
1098
  // Non-interactive (headless/CI/detached): auto-REJECT, fail-open, but record it.
@@ -1232,8 +1236,9 @@ async function executePhaseInner(
1232
1236
  // that a different generated plan yields a different key (and an identical plan
1233
1237
  // hits cache). For saved flows the name is the identity (historical behavior).
1234
1238
  const flowIdentity = hasDef ? `def:${JSON.stringify(subDef)}` : `flow:${name}`;
1235
- const inputHash = cacheKey(cc, [phase.id, flowIdentity, preRead, JSON.stringify(subArgs)]);
1236
- const cached = cachedPhase(cc, inputHash);
1239
+ const ck = cacheKeys(cc, [phase.id, flowIdentity, preRead, JSON.stringify(subArgs)]);
1240
+ const inputHash = ck.key;
1241
+ const cached = cachedPhase(cc, ck);
1237
1242
  if (cached) return cached;
1238
1243
 
1239
1244
  const live = state.phases[phase.id];
@@ -1607,7 +1612,7 @@ function lastCompletedOutput(state: RunState, phase: Phase): string | undefined
1607
1612
  * scope, optional TTL, and a pre-resolved fingerprint string so each phase-type
1608
1613
  * branch can fold it into its inputHash and consult the cross-run store uniformly.
1609
1614
  */
1610
- interface PhaseCacheCtx {
1615
+ export interface PhaseCacheCtx {
1611
1616
  scope: CacheScope;
1612
1617
  ttlMs?: number;
1613
1618
  fingerprint: string;
@@ -1638,20 +1643,50 @@ interface PhaseCacheCtx {
1638
1643
  }
1639
1644
 
1640
1645
  /** Fold the phase fingerprint into the base hash parts to form the final cache key. */
1641
- function cacheKey(cc: PhaseCacheCtx, baseParts: string[]): string {
1646
+ /** A computed cache identity: the new (versioned) key plus the read-only
1647
+ * fallback keys used to honor entries written by older releases. The `key`
1648
+ * is what we WRITE under and what `PhaseState.inputHash` carries; the
1649
+ * `legacyKey`/`bareKey` are consulted READ-ONLY on a miss so an upgrade
1650
+ * never produces a miss-storm. See docs/internal/cache-migration.md. */
1651
+ export interface CacheKeys {
1652
+ /** Current key: folds `v2:flowdef:<hash>` (the overstory content fingerprint). */
1653
+ key: string;
1654
+ /** Pre-flowDefHash-era key: the flowdef line OMITTED entirely. Read-only. */
1655
+ legacyKey: string;
1656
+ /** Bare (unversioned) `flowdef:` key — written by pre-H1 code that folded
1657
+ * the hash without a `v2:` prefix. Read-only. Removed in v0.1.0. */
1658
+ bareKey: string;
1659
+ }
1660
+
1661
+ /** Fold the phase fingerprint into the base hash parts to form the cache keys.
1662
+ *
1663
+ * Three keys are produced for backward compatibility (see
1664
+ * docs/internal/cache-migration.md):
1665
+ * - `key` : `v2:flowdef:<hash>` — the current write key.
1666
+ * - `legacyKey`: the flowdef line omitted — pre-flowDefHash entries.
1667
+ * - `bareKey` : bare `flowdef:<hash>` (unversioned) — pre-H1 entries that
1668
+ * folded the hash without the `v2:` prefix.
1669
+ * `cachedPhase` consults all three READ-ONLY on a miss; `recordCache` writes
1670
+ * only `key`. This means an upgrade never produces a miss-storm: existing
1671
+ * entries (whichever shape) still hit, and new writes converge on `key`. */
1672
+ export function cacheKeys(cc: PhaseCacheCtx, baseParts: string[]): CacheKeys {
1642
1673
  // Fold the full cache identity into the hash: flow name (prevents collisions
1643
1674
  // across different flows that share a phase.id + task + model), the per-phase
1644
1675
  // thinking/tools config (changing either changes the subagent's output), the
1645
1676
  // resolved context pre-read content, and the world-state fingerprint.
1646
- const parts = [
1647
- `flow:${cc.flowName}`,
1648
- `flowdef:${cc.flowDefHash ?? ""}`,
1677
+ const tail = [
1649
1678
  ...baseParts,
1650
1679
  `think:${cc.thinking ?? ""}`,
1651
1680
  `tools:${JSON.stringify(cc.tools ?? [])}`,
1652
1681
  `ctx:${cc.preRead ?? ""}`,
1653
1682
  ];
1654
- return cc.fingerprint ? hashInput(...parts, cc.fingerprint) : hashInput(...parts);
1683
+ const fold = (parts: string[]): string =>
1684
+ cc.fingerprint ? hashInput(...parts, cc.fingerprint) : hashInput(...parts);
1685
+ return {
1686
+ key: fold([`flow:${cc.flowName}`, `v2:flowdef:${cc.flowDefHash ?? ""}`, ...tail]),
1687
+ legacyKey: fold([`flow:${cc.flowName}`, ...tail]),
1688
+ bareKey: fold([`flow:${cc.flowName}`, `flowdef:${cc.flowDefHash ?? ""}`, ...tail]),
1689
+ };
1655
1690
  }
1656
1691
 
1657
1692
  /**
@@ -1660,31 +1695,39 @@ function cacheKey(cc: PhaseCacheCtx, baseParts: string[]): string {
1660
1695
  * - "run-only": within-run resume only (historical behavior).
1661
1696
  * - "cross-run": within-run first, then the persistent cross-run store.
1662
1697
  * On a cross-run hit, usage is zeroed and `cacheHit` records the source.
1698
+ *
1699
+ * The cross-run read is THREE-TIER and READ-ONLY for fallback keys: it tries
1700
+ * `keys.key` (current `v2:flowdef:` shape) first, then `keys.bareKey` (pre-H1
1701
+ * bare `flowdef:`), then `keys.legacyKey` (pre-flowDefHash, no flowdef line).
1702
+ * A hit on ANY tier is restored as a cache hit; we do NOT write-through (no
1703
+ * re-store under the new key) so the cache size stays stable and the legacy
1704
+ * entry ages out naturally. See docs/internal/cache-migration.md.
1663
1705
  */
1664
- function cachedPhase(cc: PhaseCacheCtx, inputHash: string): PhaseState | null {
1706
+ function cachedPhase(cc: PhaseCacheCtx, keys: CacheKeys): PhaseState | null {
1665
1707
  if (cc.scope === "off") return null;
1666
1708
  if (cc.forceRerun) return null;
1667
1709
 
1668
1710
  // 1. within-run resume (fastest; always allowed unless scope is off)
1669
- if (cc.prior && cc.prior.status === "done" && cc.prior.inputHash === inputHash) {
1711
+ if (cc.prior && cc.prior.status === "done" && cc.prior.inputHash === keys.key) {
1670
1712
  return { ...cc.prior, status: "done" };
1671
1713
  }
1672
1714
 
1673
- // 2. cross-run memoization (opt-in)
1715
+ // 2. cross-run memoization (opt-in) — three-tier read-only fallback.
1674
1716
  if (cc.scope === "cross-run") {
1675
- const e = cc.store.get(inputHash, cc.ttlMs);
1676
- if (e) {
1717
+ for (const k of [keys.key, keys.bareKey, keys.legacyKey]) {
1718
+ const e = cc.store.get(k, cc.ttlMs);
1719
+ if (!e) continue;
1677
1720
  // If we stored the full PhaseState, restore it (preserving gate,
1678
1721
  // approval, reads, loop/tournament metadata, warnings) and just mark
1679
1722
  // the cache hit + zero usage. Fallback to the legacy trimmed surface
1680
1723
  // for entries written before this change.
1681
1724
  if (e.state) {
1682
- return { ...e.state, inputHash, usage: emptyUsage(), cacheHit: "cross-run", endedAt: Date.now() };
1725
+ return { ...e.state, inputHash: keys.key, usage: emptyUsage(), cacheHit: "cross-run", endedAt: Date.now() };
1683
1726
  }
1684
1727
  return {
1685
1728
  id: cc.phaseId,
1686
1729
  status: "done",
1687
- inputHash,
1730
+ inputHash: keys.key,
1688
1731
  output: e.output,
1689
1732
  json: e.json,
1690
1733
  model: e.model,
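The three-tier, read-only lookup in the hunk above can be sketched in isolation. This is a minimal sketch, not the package's implementation: `hashParts`, `deriveKeys`, and `lookup` are hypothetical names, SHA-256 is an assumed stand-in for whatever `hashInput` really does, and the store is a plain `Map` rather than `CacheStore`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for the package's `hashInput`; the real one may differ.
const hashParts = (...parts: string[]): string =>
  createHash("sha256").update(parts.join("\n")).digest("hex");

interface Keys {
  key: string;       // current shape: `v2:flowdef:<hash>`
  bareKey: string;   // unversioned `flowdef:<hash>`
  legacyKey: string; // flowdef line omitted entirely
}

// Derive all three key shapes from the same inputs, mirroring the diff's layout.
function deriveKeys(flowName: string, flowDefHash: string, tail: string[]): Keys {
  return {
    key: hashParts(`flow:${flowName}`, `v2:flowdef:${flowDefHash}`, ...tail),
    bareKey: hashParts(`flow:${flowName}`, `flowdef:${flowDefHash}`, ...tail),
    legacyKey: hashParts(`flow:${flowName}`, ...tail),
  };
}

// Read-only tiered lookup: the first hit wins and nothing is re-stored under `key`.
function lookup<T>(store: Map<string, T>, keys: Keys): { tier: keyof Keys; value: T } | null {
  for (const tier of ["key", "bareKey", "legacyKey"] as const) {
    const value = store.get(keys[tier]);
    if (value !== undefined) return { tier, value };
  }
  return null;
}

// An entry written by an older release (legacy shape) still hits after the upgrade.
const keys = deriveKeys("demo", "abc123", ["phase-1", "think:", "tools:[]", "ctx:"]);
const store = new Map<string, string>([[keys.legacyKey, "old output"]]);
const hit = lookup(store, keys);
console.log(hit?.tier, store.size);
```

Because the fallback tiers are never written through, the store keeps its size on a legacy hit; the old entry is expected to age out by TTL while new writes land under `key`.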
@@ -1868,6 +1911,7 @@ function hasUnobservedDependencies(state: RunState): boolean {
1868
1911
  if (p.context && p.context.length > 0) return true;
1869
1912
  if (scan(p.task ?? "")) return true;
1870
1913
  if (p.when && scan(p.when)) return true;
1914
+ if (p.until && scan(p.until)) return true;
1871
1915
  if (Array.isArray(p.eval) && p.eval.some(scan)) return true;
1872
1916
  }
1873
1917
  return false;
@@ -1893,7 +1937,12 @@ export async function recomputeTaskflow(
1893
1937
  // replay; only the caller decides whether to persist the new state.
1894
1938
  const newState = structuredClone(state) as RunState;
1895
1939
  const reads = readMapOf(newState.phases);
1896
- const frontier = computeStaleFrontier(reads, seeds);
1940
+ // M2: derive the declared read-map fresh from the def so the frontier uses
1941
+ // the UNION (observed ∪ declared). Derived here (not read from the persisted
1942
+ // `RunState.declaredDeps`) so old runs — pre-H1, no persisted declaredDeps —
1943
+ // also get union semantics. The persisted field is audit/provenance only.
1944
+ const declared = declaredReadMapOfDef(newState.def);
1945
+ const frontier = computeStaleFrontier(reads, seeds, declared);
1897
1946
  const allIds = Object.keys(newState.phases);
1898
1947
 
1899
1948
  if (opts.dryRun) {
@@ -1927,20 +1976,26 @@ export async function recomputeTaskflow(
1927
1976
 
1928
1977
  // Real recompute: topological order over the frontier so a downstream always
1929
1978
  // sees its (already-refreshed) upstreams when it re-evaluates its cache key.
1930
- // The order must respect both declared dependsOn AND observed reads, because
1931
- // pi-taskflow allows interpolation refs without an explicit dependsOn edge.
1979
+ // The order must respect declared dependsOn, observed reads, AND declared
1980
+ // reads (M2 union): pi-taskflow allows interpolation refs without an
1981
+ // explicit dependsOn edge, and a declared-but-unobserved edge (e.g. a `when`
1982
+ // ref that never fired) must still order the reader after its upstream so
1983
+ // the reader evaluates its cache key against the refreshed upstream (no
1984
+ // false early-cutoff).
1932
1985
  const seedSet = new Set(seeds);
1933
- function observedDeps(phaseId: string): string[] {
1986
+ function depsFor(phaseId: string): string[] {
1934
1987
  // A phase reading its own prior output (e.g. a loop `until` checking
1935
1988
  // `{steps.thisId.output}`) must not create a self-edge in the scheduling
1936
1989
  // graph — otherwise topoLayers would deadlock on the self-loop.
1937
- return (newState.phases[phaseId]?.reads ?? [])
1990
+ const observed = (newState.phases[phaseId]?.reads ?? [])
1938
1991
  .map((r) => r.stepId)
1939
1992
  .filter((id) => id !== phaseId);
1993
+ const declared_ = (declared.get(phaseId) ?? []).filter((id) => id !== phaseId);
1994
+ return [...new Set([...observed, ...declared_])];
1940
1995
  }
1941
1996
  const augmentedPhases = newState.def.phases.map((p) => ({
1942
1997
  ...p,
1943
- dependsOn: [...new Set([...(p.dependsOn ?? []), ...observedDeps(p.id)])],
1998
+ dependsOn: [...new Set([...(p.dependsOn ?? []), ...depsFor(p.id)])],
1944
1999
  }));
1945
2000
  const order = topoLayers(augmentedPhases)
1946
2001
  .flat()
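The ordering rule described in this hunk (explicit `dependsOn`, plus observed reads, plus declared reads, minus self-references) can be sketched as follows. `layersOf` is a hypothetical Kahn-style layering written for illustration; the package's own `topoLayers` may behave differently, and the phase ids are invented.

```typescript
type PhaseLite = { id: string; dependsOn?: string[] };

// Hypothetical Kahn-style layering: each layer holds the phases whose
// dependencies have all been emitted in earlier layers.
function layersOf(phases: PhaseLite[]): string[][] {
  const remaining = new Map<string, Set<string>>(
    phases.map((p): [string, Set<string>] => [p.id, new Set(p.dependsOn ?? [])]),
  );
  const out: string[][] = [];
  while (remaining.size > 0) {
    const ready = [...remaining.entries()]
      .filter(([, deps]) => [...deps].every((d) => !remaining.has(d)))
      .map(([id]) => id);
    // A self-edge would leave its phase permanently "not ready" and end up here.
    if (ready.length === 0) throw new Error("cycle in dependency graph");
    for (const id of ready) remaining.delete(id);
    out.push(ready);
  }
  return out;
}

const observed = new Map<string, string[]>([["c", ["b"]]]);
const declared = new Map<string, string[]>([["d", ["c", "d"]]]); // includes a self-ref
const defPhases: PhaseLite[] = [{ id: "d" }, { id: "c" }, { id: "b", dependsOn: ["a"] }, { id: "a" }];

// Union of explicit, observed, and declared edges, with self-references dropped.
const augmented = defPhases.map((p) => {
  const extra = [...(observed.get(p.id) ?? []), ...(declared.get(p.id) ?? [])]
    .filter((id) => id !== p.id);
  return { ...p, dependsOn: [...new Set([...(p.dependsOn ?? []), ...extra])] };
});

const order = layersOf(augmented).flat();
console.log(order.join(" -> "));
```

In this example `d` has no explicit or observed edge at all; only the declared edge places it after `c`, which is the case the comment calls a declared-but-unobserved dependency.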
@@ -2016,9 +2071,23 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
2016
2071
  // Reused by every phase, persisted on the RunState for audit/resume.
2017
2072
  // Never throws into the run — a hash failure leaves the field unset and the
2018
2073
  // cache key degrades to the legacy flowName-only shape.
2074
+ //
2075
+ // Routed through the FlowIR compile seam (M1): `compileTaskflowToIR`
2076
+ // produces the content-addressed IR whose `hash` (== flowDefHash in the
2077
+ // stub) folds into the cache key, and whose `meta.declaredDeps` (M2 declared
2078
+ // plane) is persisted for audit/provenance. The declared plane is also
2079
+ // derived fresh from `def` in recompute (so old runs get union semantics
2080
+ // too); the persisted copy is for display.
2019
2081
  if (state.flowDefHash === undefined) {
2020
2082
  try {
2021
- state.flowDefHash = await flowDefHash(def);
2083
+ const ir = await compileTaskflowToIR(def);
2084
+ state.flowDefHash = ir.hash ?? "failed";
2085
+ state.declaredDeps = ir.meta.declaredDeps;
2086
+ if (ir.errors.length) {
2087
+ console.warn(
2088
+ `[taskflow] IR compile errors for '${def.name}': ${ir.errors.map((e) => e.message).join("; ")}`,
2089
+ );
2090
+ }
2022
2091
  } catch (e) {
2023
2092
  // Fail-safe: warn loudly rather than silently degrading to the legacy
2024
2093
  // flowName-only key, which would reopen the cross-flow collision hole.
@@ -781,7 +781,7 @@ export function validateTaskflow(def: unknown, opts: ValidationOptions = {}): Va
781
781
  return { ok: errors.length === 0, errors, warnings };
782
782
  }
783
783
 
784
- function collectRefs(phase: Phase): { steps: string[]; args: string[] } {
784
+ export function collectRefs(phase: Phase): { steps: string[]; args: string[] } {
785
785
  const steps = new Set<string>();
786
786
  const args = new Set<string>();
787
787
  const scan = (s: string | undefined) => {
@@ -795,6 +795,8 @@ function collectRefs(phase: Phase): { steps: string[]; args: string[] } {
795
795
  scan(phase.task);
796
796
  scan(phase.over);
797
797
  scan(phase.when);
798
+ scan(phase.until);
799
+ for (const e of phase.eval ?? []) scan(e);
798
800
  for (const b of phase.branches ?? []) scan(b.task);
799
801
  for (const v of Object.values(phase.with ?? {})) if (typeof v === "string") scan(v);
800
802
  for (const c of phase.context ?? []) scan(c);
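The effect of also scanning `until` and `eval` can be illustrated with a reduced version of the reference collector. This is a sketch under stated assumptions: the `STEP_REF` pattern, `stepRefs`, and `declaredReads` are hypothetical, only four fields are scanned, and the real `collectRefs` covers more fields and may use a broader placeholder grammar.

```typescript
interface PhaseLite {
  id: string;
  task?: string;
  when?: string;
  until?: string;
  eval?: string[];
}

// Assumed placeholder syntax `{steps.<id>...}`; the package's real pattern may be broader.
const STEP_REF = /\{steps\.([A-Za-z0-9_-]+)/g;

// Collect the distinct step ids referenced anywhere in the scanned fields.
function stepRefs(phase: PhaseLite): string[] {
  const found = new Set<string>();
  const scan = (s: string | undefined): void => {
    if (!s) return;
    for (const m of s.matchAll(STEP_REF)) {
      if (m[1]) found.add(m[1]);
    }
  };
  scan(phase.task);
  scan(phase.when);
  scan(phase.until);
  for (const e of phase.eval ?? []) scan(e);
  return [...found];
}

// Declared read-map: phase id -> referenced step ids, with self-references excluded.
function declaredReads(phases: PhaseLite[]): Map<string, string[]> {
  const m = new Map<string, string[]>();
  for (const p of phases) {
    const reads = stepRefs(p).filter((id) => id !== p.id);
    if (reads.length) m.set(p.id, reads);
  }
  return m;
}

const demo = declaredReads([
  {
    id: "loop",
    task: "Refine {steps.draft.output}",
    until: "{steps.loop.output} contains DONE",
    eval: ["{steps.lint.json.ok}"],
  },
  { id: "draft", task: "Write a first draft" },
]);
console.log(demo.get("loop"));
```

The `lint` reference appears only inside `eval`, so a collector that skipped that field would miss the edge; the `loop` self-reference in `until` is found but filtered out before it can become a self-edge.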
@@ -23,6 +23,7 @@
23
23
  */
24
24
 
25
25
  import type { PhaseState } from "./store.ts";
26
+ import { collectRefs, type Taskflow } from "./schema.ts";
26
27
 
27
28
  // ---------------------------------------------------------------------------
28
29
  // Read graph
@@ -41,13 +42,24 @@ export function readMapOf(phases: Record<string, PhaseState>): ReadMap {
41
42
  return m;
42
43
  }
43
44
 
44
- /** Phases that directly read `phaseId` (its immediate dependents). */
45
- export function dependentsOf(reads: ReadMap, phaseId: string): string[] {
46
- const out: string[] = [];
45
+ /** Phases that directly read `phaseId` (its immediate dependents).
46
+ *
47
+ * When `declared` is provided, the dependent set is the **union** of
48
+ * observed dependents (from `reads`) and declared dependents (from
49
+ * `declared`) — a declared-but-unobserved edge (e.g. a `when` ref that never
50
+ * fired) still counts as a dependency for staleness propagation (M2 union).
51
+ * `declared` undefined → observed-only (backward-compatible). */
52
+ export function dependentsOf(reads: ReadMap, phaseId: string, declared?: ReadMap): string[] {
53
+ const out = new Set<string>();
47
54
  for (const [reader, deps] of reads) {
48
- if (deps.includes(phaseId)) out.push(reader);
55
+ if (deps.includes(phaseId)) out.add(reader);
56
+ }
57
+ if (declared) {
58
+ for (const [reader, deps] of declared) {
59
+ if (deps.includes(phaseId)) out.add(reader);
60
+ }
49
61
  }
50
- return out;
62
+ return [...out];
51
63
  }
52
64
 
53
65
  // ---------------------------------------------------------------------------
@@ -56,27 +68,53 @@ export function dependentsOf(reads: ReadMap, phaseId: string): string[] {
56
68
 
57
69
  /**
58
70
  * The set of phases that are stale if `seeds` change, transitively. A reader
59
- * is stale if ANY phase it observed-reading is stale (union/I5: when in doubt,
60
- * assume dependency). Includes the seeds themselves.
71
+ * is stale if ANY phase it reads (observed OR declared) is stale
72
+ * (union/I5: when in doubt, assume dependency). Includes the seeds themselves.
73
+ *
74
+ * When `declared` is provided, the read graph used for propagation is the
75
+ * **union** of `reads` (observed, M3) and `declared` (M2 compile-time refs):
76
+ * a declared-but-unobserved edge still propagates staleness. `declared`
77
+ * undefined → observed-only (backward-compatible, identical to pre-M2).
61
78
  *
62
79
  * Deterministic. O(phases + read-edges). Cycles in the read graph (which a
63
80
  * correct DAG can't produce, but a pathological one could) terminate because a
64
81
  * phase is enqueued at most once.
65
82
  */
66
- export function computeStaleFrontier(reads: ReadMap, seeds: Iterable<string>): Set<string> {
83
+ export function computeStaleFrontier(reads: ReadMap, seeds: Iterable<string>, declared?: ReadMap): Set<string> {
67
84
  const stale = new Set<string>();
68
85
  const queue: string[] = [...seeds];
69
86
  while (queue.length) {
70
87
  const s = queue.shift() as string;
71
88
  if (stale.has(s)) continue;
72
89
  stale.add(s);
73
- for (const dep of dependentsOf(reads, s)) {
90
+ for (const dep of dependentsOf(reads, s, declared)) {
74
91
  if (!stale.has(dep)) queue.push(dep);
75
92
  }
76
93
  }
77
94
  return stale;
78
95
  }
79
96
 
97
+ // ---------------------------------------------------------------------------
98
+ // Declared-plane derivation (M2)
99
+ // ---------------------------------------------------------------------------
100
+
101
+ /** Build a declared ReadMap from a flow definition: each phase's `collectRefs`
102
+ * `{steps.X}` refs become its declared reads (self-refs excluded so a loop
103
+ * `until` checking `{steps.thisId.output}` doesn't create a self-edge).
104
+ *
105
+ * Pure. Used by `recomputeTaskflow` and `/tf why-stale` so union (observed ∪
106
+ * declared) semantics apply to old runs too (pre-H1 runs have no persisted
107
+ * `RunState.declaredDeps` — deriving from `def` keeps recompute sound). */
108
+ export function declaredReadMapOfDef(def: Taskflow): ReadMap {
109
+ const m: ReadMap = new Map();
110
+ for (const p of def.phases) {
111
+ const refs = collectRefs(p);
112
+ const reads = refs.steps.filter((id) => id !== p.id);
113
+ if (reads.length) m.set(p.id, reads);
114
+ }
115
+ return m;
116
+ }
117
+
80
118
  // ---------------------------------------------------------------------------
81
119
  // Rendering
82
120
  // ---------------------------------------------------------------------------
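The difference between observed-only and union propagation can be seen on a small graph. The sketch below re-implements the breadth-first walk from the hunk above in a self-contained form; `readersOf` and `frontier` are illustrative names, and the phase ids are invented.

```typescript
type ReadMap = Map<string, string[]>;

// Readers of `id` across any number of planes (observed, declared, ...).
function readersOf(id: string, ...planes: (ReadMap | undefined)[]): string[] {
  const out = new Set<string>();
  for (const plane of planes) {
    if (!plane) continue;
    for (const [reader, deps] of plane) {
      if (deps.includes(id)) out.add(reader);
    }
  }
  return [...out];
}

// Breadth-first closure over the read graph; each phase is added at most once,
// so the walk terminates even if the graph contains a cycle.
function frontier(reads: ReadMap, seeds: string[], declared?: ReadMap): Set<string> {
  const stale = new Set<string>();
  const queue = [...seeds];
  while (queue.length) {
    const s = queue.shift() as string;
    if (stale.has(s)) continue;
    stale.add(s);
    for (const r of readersOf(s, reads, declared)) {
      if (!stale.has(r)) queue.push(r);
    }
  }
  return stale;
}

// `report` was observed reading `analyze`; the refs to `fetch` never fired at
// runtime, so they exist only in the declared plane.
const reads: ReadMap = new Map([["report", ["analyze"]]]);
const declared: ReadMap = new Map([
  ["analyze", ["fetch"]],
  ["notify", ["fetch"]],
]);

const observedOnly = [...frontier(reads, ["fetch"])].sort();
const union = [...frontier(reads, ["fetch"], declared)].sort();
console.log(observedOnly, union);
```

With observed reads alone, changing `fetch` invalidates nothing else; with the declared plane folded in, staleness reaches `analyze` and `notify` directly and `report` transitively through the observed edge.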
@@ -85,12 +123,18 @@ export function computeStaleFrontier(reads: ReadMap, seeds: Iterable<string>): S
85
123
  * Render either the full observed dependency graph (no seeds) or the stale
86
124
  * frontier given assumed-changed seeds. Each stale phase lists the stale
87
125
  * upstreams that caused it (its "why").
126
+ *
127
+ * When `declared` is provided, the frontier is the **union** (observed ∪
128
+ * declared) and a stale phase's "why" annotates edges present only in the
129
+ * declared plane (not observed at runtime) with `(declared)`. `declared`
130
+ * undefined → observed-only rendering (backward-compatible).
88
131
  */
89
132
  export function formatWhyStale(
90
133
  runId: string,
91
134
  flowName: string,
92
135
  reads: ReadMap,
93
136
  seeds: readonly string[],
137
+ declared?: ReadMap,
94
138
  ): string {
95
139
  const lines: string[] = [];
96
140
  lines.push(`why-stale — run ${runId} · flow "${flowName}"`);
@@ -98,26 +142,32 @@ export function formatWhyStale(
98
142
 
99
143
  if (seeds.length === 0) {
100
144
  // No seeds → show the full observed dependency graph (who reads what).
101
- if (reads.size === 0) {
145
+ if (reads.size === 0 && (!declared || declared.size === 0)) {
102
146
  lines.push("(No observed readSets in this run — provenance is empty.)");
103
147
  return lines.join("\n");
104
148
  }
105
149
  lines.push("Observed dependency graph (who reads what):");
106
150
  lines.push("");
107
- for (const [reader, deps] of reads) {
108
- lines.push(`■ ${reader} reads: ${deps.join(", ")}`);
151
+ const allReaders = new Set<string>([...reads.keys(), ...(declared?.keys() ?? [])]);
152
+ for (const reader of allReaders) {
153
+ const obs = reads.get(reader) ?? [];
154
+ const dec = declared?.get(reader) ?? [];
155
+ const parts: string[] = [];
156
+ for (const d of obs) parts.push(d);
157
+ for (const d of dec) if (!obs.includes(d)) parts.push(`${d} (declared)`);
158
+ lines.push(`■ ${reader} reads: ${parts.join(", ") || "(none)"}`);
109
159
  }
110
160
  lines.push("");
111
161
  lines.push("Pass a phase id to compute its stale frontier: /tf why-stale <runId> <phaseId>");
112
162
  return lines.join("\n");
113
163
  }
114
164
 
115
- const frontier = computeStaleFrontier(reads, seeds);
165
+ const frontier = computeStaleFrontier(reads, seeds, declared);
116
166
  const seedSet = new Set(seeds);
117
167
  lines.push(`Assuming changed: ${[...seedSet].join(", ")}`);
118
168
  lines.push("");
119
169
  if (frontier.size <= seedSet.size) {
120
- lines.push(`Stale frontier: only the seed(s) themselves — nothing else observed-reading them.`);
170
+ lines.push(`Stale frontier: only the seed(s) themselves — nothing else reads them.`);
121
171
  return lines.join("\n");
122
172
  }
123
173
  lines.push(`Stale frontier (transitive, ${frontier.size} phases):`);
@@ -127,10 +177,16 @@ export function formatWhyStale(
127
177
  if (seedSet.has(id)) {
128
178
  lines.push(` ■ ${id} (changed — seed)`);
129
179
  } else {
130
- // Why is it stale? The stale upstreams it read.
131
- const deps = reads.get(id) ?? [];
132
- const causes = deps.filter((d) => frontier.has(d));
133
- lines.push(` ■ ${id} ← reads ${causes.length ? causes.join(", ") : "(nothing stale?)"}`);
180
+ // Why is it stale? The stale upstreams it read (observed ∪ declared).
181
+ const obs = reads.get(id) ?? [];
182
+ const dec = declared?.get(id) ?? [];
183
+ const obsCauses = obs.filter((d) => frontier.has(d));
184
+ const decCauses = dec.filter((d) => frontier.has(d) && !obs.includes(d));
185
+ const causeStr = [
186
+ ...obsCauses,
187
+ ...decCauses.map((d) => `${d} (declared)`),
188
+ ].join(", ");
189
+ lines.push(` ■ ${id} ← reads ${causeStr || "(nothing stale?)"}`);
134
190
  }
135
191
  }
136
192
  return lines.join("\n");
@@ -21,6 +21,7 @@ import * as path from "node:path";
21
21
  import { getAgentDir } from "@earendil-works/pi-coding-agent";
22
22
  import type { Taskflow } from "./schema.ts";
23
23
  import type { UsageStats } from "./usage.ts";
24
+ import type { DeclaredDeps } from "./flowir/meta.ts";
24
25
 
25
26
  export interface SavedFlow {
26
27
  name: string;
@@ -103,6 +104,16 @@ export interface RunState {
103
104
  * re-run always reuses them. Filled once at run start; persisted for
104
105
  * audit/resume consistency. */
105
106
  flowDefHash?: string | "failed";
107
+ /** Per-phase *declared* dependency footprint (M2), synthesized at compile
108
+ * time from `{steps.X}` interpolation refs via `compileTaskflowToIR`.
109
+ * This is the *declared* plane — distinct from the *observed* readSet
110
+ * (`PhaseState.reads`, captured at runtime). Recompute staleness uses the
111
+ * **union** (observed ∪ declared) so a declared-but-unobserved edge (e.g.
112
+ * a `when` ref that never fired) still propagates. JSON-safe `Record`
113
+ * shape so it round-trips through persistence. Audit/provenance only —
114
+ * recompute derives this fresh from `def` so old runs (pre-H1) also get
115
+ * union semantics. */
116
+ declaredDeps?: Record<string, DeclaredDeps>;
106
117
  }
107
118
 
108
119
  // ---------------------------------------------------------------------------
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "pi-taskflow",
3
- "version": "0.0.25",
3
+ "version": "0.0.26",
4
4
  "description": "A declarative, verifiable graph of task nodes for the Pi coding agent — not a workflow you script, but a DAG you declare: statically verified before it runs, with dynamic fan-out, gates, isolated subagent context, resumable runs, and saveable commands.",
5
5
  "keywords": [
6
6
  "pi-package",