pi-taskflow 0.0.27 → 0.0.28
- package/CHANGELOG.md +54 -0
- package/README.md +3 -3
- package/extensions/flowir/index.ts +2 -0
- package/extensions/flowir/phasefp.ts +121 -0
- package/extensions/index.ts +55 -16
- package/extensions/runtime.ts +327 -27
- package/extensions/schema.ts +37 -0
- package/extensions/store.ts +12 -4
- package/package.json +1 -1
- package/skills/taskflow/SKILL.md +49 -1
- package/skills/taskflow/configuration.md +22 -0
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,60 @@
 
 All notable changes to pi-taskflow are documented here. This project follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) format.
 
+## [0.0.28] — 2026-06-27
+
+> Granular-reuse release: **incremental recompute goes from whole-flow to
+> per-phase and per-item.** v0.0.27 *proved* the recompute cost win; this
+> release makes that win far larger and easier to opt into. Editing one phase
+> now invalidates only that phase and its transitive dependents (a sibling keeps
+> its cache hit), a `map` phase re-executes only the items that actually changed,
+> and a single `incremental` flag flips a whole flow into cross-run reuse without
+> annotating every phase.
+
+### Added
+- **Per-phase structural sub-fingerprint (`v3:phasefp`).** The cache key now
+  folds a per-phase fingerprint — the phase plus its transitive `dependsOn ∪ from`
+  closure — instead of the whole-flow `v2:flowdef` hash. Editing phase B
+  invalidates only B and its dependents; an independent sibling A keeps its hit.
+  `cacheKeys` emits a 4-tier read ladder (`v3:phasefp` write → `v2:flowdef` →
+  bare flowdef → legacy, all read-only) so the upgrade is additive — no
+  miss-storm for unchanged flows. Fail-open: any per-phase error degrades that
+  phase to the whole-flow hash. Soundness fallback to whole-flow when per-phase
+  invalidation can't be statically guaranteed (flow-wide `contextSharing`, any
+  `shareContext` phase in the closure, `join: "any"`, or sub-flow inner phases).
+  (`extensions/flowir/phasefp.ts`, `test/cache-phasefp.test.ts` — 11 tests.)
+- **Per-item cross-run caching for `map` phases.** When one of N items changes
+  between runs, only that item re-executes (N−1 cache hits) while the whole-map
+  fast path and every soundness fallback stay intact. Per-item keys omit the
+  structural fingerprint (which hashes the whole `over` source) so changing one
+  item no longer moves every key at once; they fold `[phase.id, it.agent, model,
+  it.task]` + the world-state tail, so task/agent/upstream/world changes still
+  invalidate the right items. Disabled (whole-map only) under run-only/off scope,
+  `shareContext`/flow-wide `contextSharing`, or inside a runtime-generated
+  sub-flow. (`test/cache-peritem.test.ts` — 11 tests.)
+- **`incremental` flag** — flow-level (`TaskflowSchema.incremental`) and
+  invocation-level (`run` tool arg). Defaults every phase to `scope:"cross-run"`
+  so re-running a flow reuses unchanged phases across runs/sessions, without
+  annotating each phase. The invocation arg wins over the flow field; per-phase
+  cache settings and the cross-run-blocked types (gate/approval/loop/tournament)
+  still take precedence; default remains the safe `run-only` (fresh each run).
+  (`resolveCacheScope` in `extensions/index.ts`, `test/incremental-flag.test.ts`.)
+- **Reuse reporting.** The end-of-run cache report and `/tf recompute` now show
+  reused-vs-executed counts and a per-phase "Why" trace (the explainable-
+  reactivity view: `▲ rerun / ✂ cutoff / ✓ reused / ✗ failed`, with `← causedBy`).
+  Dollar figures are reported only for within-run reuse, where the prior usage is
+  preserved; cross-run hits are counted but never attributed an invented saving.
+  (`summarizeReuse` / `RecomputeDecision` in `extensions/runtime.ts`,
+  `test/reuse-summary.test.ts`.)
+- Tests: 804 → 846 (+42).
+
+### Changed
+- **`phaseFingerprint` strips more policy fields** (`cache`, `retry`,
+  `concurrency`, `final`): none changes a phase's subagent *output*, so a no-op
+  config tweak no longer causes false cache invalidation.
+- **README** test count and feature line refreshed (804 → 846 across 46 files);
+  `per-item map caching` added to the headline capabilities.
+
 ## [0.0.27] — 2026-06-25
 
 > Evidence release: **the incremental-recompute cost win is now proven, not
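The per-item keying described in the 0.0.28 changelog entry can be illustrated with a small sketch. It is not the package's implementation: `SketchItem`, `perItemKey`, the model name, and the `worldTail` string are hypothetical, and the key is left as a plain canonical string where the real code presumably hashes it. The point it shows is that a key built from `[phase.id, agent, model, task]` plus a world-state tail, and not from the whole `over` array, moves only for the item that changed.

```typescript
// Hypothetical item shape for illustration; the real map item type may differ.
interface SketchItem {
  agent: string;
  task: string;
}

// Assumed key layout: [phaseId, agent, model, task, worldTail].
// The whole `over` array is deliberately NOT part of the key, so editing one
// item cannot move the keys of its neighbours. A real implementation would
// hash this string; the sketch keeps it readable.
function perItemKey(phaseId: string, item: SketchItem, model: string, worldTail: string): string {
  return JSON.stringify([phaseId, item.agent, model, item.task, worldTail]);
}

const firstRun: SketchItem[] = [
  { agent: "reviewer", task: "review a.ts" },
  { agent: "reviewer", task: "review b.ts" },
  { agent: "reviewer", task: "review c.ts" },
];

// Second run: only the middle item's task text changed.
const secondRun: SketchItem[] = firstRun.map((it, i) =>
  i === 1 ? { agent: it.agent, task: "review b2.ts" } : it,
);

const firstKeys = firstRun.map((it) => perItemKey("review", it, "model-x", "world-v1"));
const secondKeys = secondRun.map((it) => perItemKey("review", it, "model-x", "world-v1"));

// Indices whose key moved between runs, i.e. the items that would re-execute.
const changedItems = secondKeys
  .map((k, i) => (k === firstKeys[i] ? -1 : i))
  .filter((i) => i >= 0);
```

Under these assumptions `changedItems` contains only index 1, which matches the "one changed item re-executes, N−1 cache hits" behaviour the entry describes.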
package/README.md
CHANGED
@@ -8,7 +8,7 @@
 <a href="./LICENSE"><img src="https://img.shields.io/badge/license-MIT-43D9AD?style=flat-square" alt="MIT license"></a>
 <a href="#whats-inside"><img src="https://img.shields.io/badge/runtime%20deps-0-43D9AD?style=flat-square" alt="zero runtime dependencies"></a>
 <a href="https://github.com/heggria/pi-taskflow/actions/workflows/ci.yml"><img src="https://img.shields.io/github/actions/workflow/status/heggria/pi-taskflow/ci.yml?branch=main&style=flat-square&label=CI" alt="CI status"></a>
-<a href="#whats-inside"><img src="https://img.shields.io/badge/tests-
+<a href="#whats-inside"><img src="https://img.shields.io/badge/tests-846-6E8BFF?style=flat-square" alt="846 tests"></a>
 <a href="#whats-inside"><img src="https://img.shields.io/badge/dogfooded-%E2%9C%93-43D9AD?style=flat-square" alt="dogfooded"></a>
 <a href="https://pi.dev"><img src="https://img.shields.io/badge/for-Pi%20coding%20agent-B692FF?style=flat-square" alt="for the Pi coding agent"></a>
 </p>
@@ -728,12 +728,12 @@ Copy one into `.pi/taskflows/<name>.json` (or `~/.pi/agent/taskflows/`) and it r
 
 <div align="center">
 
-**0 runtime dependencies** · **
+**0 runtime dependencies** · **846 tests** · **9 phase types** · **shared context tree** · **cross-session resume** · **cross-run memoization** · **per-item map caching** · **incremental recompute** · **FlowIR compile seam** · **detached execution** · **`compile` Mermaid renderer** · **~9k LOC runtime**
 
 </div>
 
 - **Zero runtime dependencies.** No `dependencies` field — the runtime is built entirely on Node built-ins (`fs` / `path` / `os` / `child_process` / `crypto`). The file lock is `fs.openSync("wx")`, not a third-party library.
-- **
+- **846 tests across 46 test files** covering concurrency, atomic file locking (8-process race regressions), path-traversal hardening, cross-session resume, cross-run cache freshness (flow/thinking/tools key isolation, fingerprint invalidation, TTL/LRU eviction), backward-compatible cache-key migration (4-tier legacy fallback), per-phase structural sub-fingerprint (v3:phasefp — editing one phase invalidates only it and its dependents), per-item map caching (one changed item re-executes, N−1 cache hits), the `incremental` flag (run-wide cross-run default), reuse reporting, the FlowIR compile seam (determinism, declared-plane synthesis), incremental recompute (early-cutoff propagation, partial cascade strictly < full, observed ∪ declared union frontier), gate verdicts, budget caps, retry/backoff, approval flows, loop termination, tournament judging, sub-flow composition, the shared context tree (blackboard reuse, supervision spawn, subflow validation/nesting), workspace isolation (temp/dedicated/worktree lifecycle, fail-open degrade, dynamic-flow rejection), dynamic sub-flow security hardening, detached execution (PID persistence, stale detection, crash→failed, resume after failure), live run-history refresh, callback isolation, the idle watchdog, model-role init config, parseModelFromLabel with parenthesized-model-name regression, and multi-fence `safeParse` recovery, plus the `compile` Mermaid renderer (id-collision disambiguation, markdown-injection hardening, and full verify-overlay category coverage).
 - **Hardened by design.** Path-traversal defense (lexical + `realpath` containment check), runId validation, HTML/error sanitization, atomic writes, stale-lock stealing via `rename`, and an idle watchdog that kills wedged subagents (SIGTERM → SIGKILL after 5 minutes of silence). Dynamic sub-flows additionally get breadth caps, `cwd` containment, budget clamping, nesting depth caps, and prototype-pollution defense.
 - **Dogfooded.** Every new feature has to survive the project's own `self-improve` taskflow before it ships.
 
package/extensions/flowir/phasefp.ts
ADDED
@@ -0,0 +1,121 @@
+/**
+ * Per-phase structural sub-fingerprint (M6).
+ *
+ * `phaseFingerprint` produces a content-addressed hash of ONLY the subset of
+ * the flow definition that can affect a single phase's subagent output: the
+ * phase itself plus its transitive dependency closure. Folding this into the
+ * cross-run cache key (instead of the whole-flow `flowDefHash`) means editing
+ * phase B invalidates only B and its transitive dependents — independent
+ * sibling phase A keeps its cache hit.
+ *
+ * ## Soundness (the fallback gate)
+ *
+ * Per-phase invalidation is only sound when a phase's *real* dependencies are
+ * fully captured by the static `dependsOn ∪ from` closure. Three cases break
+ * that guarantee, so `phaseFingerprint` returns `undefined` for them and the
+ * caller falls back to the whole-flow `flowDefHash` (safe, = pre-M6 behavior):
+ *
+ *   1. **Shared Context Tree** (`def.contextSharing === true` or any closure
+ *      member has `shareContext === true`): a sharing phase can read sibling
+ *      blackboard writes OUTSIDE its declared deps, so the static closure
+ *      under-approximates real reads.
+ *   2. **`flow` phase in the closure** (`type === "flow"`): a `flow` phase's
+ *      sub-structure is resolved at runtime (inline `def`) or from a saved
+ *      flow (`use`) and is not statically visible here. Editing the saved
+ *      sub-flow would not move this phase's sub-fingerprint.
+ *   3. **`join: "any"` phase** (`phase.join === "any"`): validation exempts it
+ *      from the `{steps.X}`-must-be-in-`dependsOn` check, so it may read
+ *      phases outside its static closure. The closure under-approximates its
+ *      real reads, so fall back to whole-flow invalidation.
+ *
+ * `cache`, `retry`, `concurrency`, and `final` are stripped from each phase
+ * before hashing: none of them changes the subagent's OUTPUT (they are policy,
+ * execution mechanics, or result selection). `cache`'s sub-fields
+ * (`scope`/`ttl`/`fingerprint`) reach the cache key through other paths
+ * (`cc.scope` gates the lookup, `cc.ttlMs` governs expiry, `cc.fingerprint` is
+ * in the key tail). Every other `Phase` field is hashed. `PhaseSchema` uses
+ * `additionalProperties: false`, so no surprise field can be missed.
+ *
+ * Pure + async (Web Crypto via `hashCanonical`). Reuses the vendored
+ * `canonicalJson`/`hashCanonical` (byte-identical to overstory's contract) so
+ * the sub-fingerprint shares one hashing contract with `flowDefHash`. Never
+ * throws — callers wrap in try/catch and degrade to `flowDefHash`.
+ *
+ * @see docs/internal/cache-migration.md (v3:phasefp tier)
+ */
+
+import { transitiveDependencies, type Phase, type Taskflow } from "../schema.ts";
+import { canonicalJson, hashCanonical } from "./hash.ts";
+
+/** Fields stripped before hashing because they do NOT affect a phase's
+ *  subagent OUTPUT, only execution mechanics or result selection — folding
+ *  them in would cause false cache invalidation on a no-op config change:
+ *   - `cache`: policy object; its sub-fields reach the key via
+ *     `cc.scope`/`cc.ttlMs`/`cc.fingerprint`.
+ *   - `retry`: retry/backoff is execution mechanics; a successful phase
+ *     produces the same output regardless of how many attempts it took.
+ *   - `concurrency`: fan-out parallelism; does not change any item's output.
+ *   - `final`: marks which phase's output is the flow result; does not change
+ *     the phase's own output. */
+const PHASE_FP_STRIP = ["cache", "retry", "concurrency", "final"] as const;
+
+/** Clone a phase into a plain record with policy fields removed. */
+function stripPolicy(phase: Phase): Record<string, unknown> {
+  const rec = phase as unknown as Record<string, unknown>;
+  const out: Record<string, unknown> = {};
+  for (const k of Object.keys(rec)) {
+    if ((PHASE_FP_STRIP as readonly string[]).includes(k)) continue;
+    out[k] = rec[k];
+  }
+  return out;
+}
+
+/**
+ * Per-phase structural sub-fingerprint.
+ *
+ * @returns the hex hash, or `undefined` when per-phase soundness cannot be
+ *   guaranteed (caller falls back to the whole-flow `flowDefHash`). Never
+ *   throws.
+ */
+export async function phaseFingerprint(def: Taskflow, phaseId: string): Promise<string | undefined> {
+  const phases = def.phases as Phase[];
+  const byId = new Map(phases.map((p) => [p.id, p]));
+  const phase = byId.get(phaseId);
+  if (!phase) return undefined;
+
+  // --- Soundness gate: fall back to whole-flow when static closure is unsafe. ---
+  // Flow-wide context sharing enables cross-sibling reads outside declared deps.
+  if (def.contextSharing === true) return undefined;
+  // A `join: "any"` phase may interpolate `{steps.X.*}` refs to phases OUTSIDE
+  // its declared dependsOn (validation deliberately exempts it — schema.ts), so
+  // the static closure under-approximates its real reads. Fall back to
+  // whole-flow invalidation rather than rely on the key tail alone (which would
+  // be an undocumented coupling). Safe, = pre-M6 behavior.
+  if (phase.join === "any") return undefined;
+
+  const closureIds = transitiveDependencies(phases, phaseId);
+  const closurePhases: Phase[] = [];
+  for (const id of closureIds) {
+    const p = byId.get(id);
+    if (!p) continue; // unknown dep — validation reports elsewhere
+    // Per-phase sharing: this closure member can read sibling blackboard
+    // writes outside its own declared deps.
+    if (p.shareContext === true) return undefined;
+    // A flow phase's sub-structure is runtime/saved-flow-resolved and not
+    // statically visible — editing it would not move the sub-fingerprint.
+    if ((p.type ?? "agent") === "flow") return undefined;
+    closurePhases.push(p);
+  }
+  // The self phase's own sharing/type is part of the closure too.
+  if (phase.shareContext === true) return undefined;
+  if ((phase.type ?? "agent") === "flow") return undefined;
+
+  // --- Build the canonical payload. ---
+  // `deps` is the SORTED transitive closure (self excluded). canonicalJson
+  // sorts OBJECT keys but preserves ARRAY order, so we sort the array
+  // explicitly for determinism independent of dependency walk order.
+  const depsPayload = closurePhases.map((p) => ({ id: p.id, def: stripPolicy(p) }));
+  const payload = { self: stripPolicy(phase), deps: depsPayload };
+
+  return hashCanonical(canonicalJson(payload));
+}
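To make the closure idea in `phasefp.ts` concrete, here is a simplified, synchronous sketch. It makes several assumptions: `SketchPhase`, `closureOf`, `strip`, and `sketchFingerprint` are hypothetical names; only `dependsOn` is followed (the real closure is `dependsOn ∪ from`); only `retry` stands in for the stripped policy fields; and the canonical string is returned directly where the real code hashes it with `hashCanonical`. The sketch sorts the dependency ids explicitly, as the comment in the source describes.

```typescript
// Hypothetical, reduced phase shape for illustration only.
interface SketchPhase {
  id: string;
  task: string;
  dependsOn?: string[];
  retry?: number; // stands in for the policy fields stripped before hashing
}

// Transitive dependsOn closure of `id`, self excluded, sorted for determinism.
function closureOf(phases: SketchPhase[], id: string): string[] {
  const byId = new Map<string, SketchPhase>();
  for (const p of phases) byId.set(p.id, p);
  const seen = new Set<string>();
  const stack: string[] = [...(byId.get(id)?.dependsOn ?? [])];
  while (stack.length > 0) {
    const next = stack.pop() as string;
    if (seen.has(next)) continue;
    seen.add(next);
    for (const d of byId.get(next)?.dependsOn ?? []) stack.push(d);
  }
  seen.delete(id);
  return Array.from(seen).sort();
}

// Keep only output-affecting fields, in a fixed key order.
function strip(p: SketchPhase): { id: string; task: string; dependsOn: string[] } {
  return { id: p.id, task: p.task, dependsOn: p.dependsOn ?? [] };
}

// Canonical string over the phase plus its closure (a real version would hash it).
function sketchFingerprint(phases: SketchPhase[], id: string): string | undefined {
  const self = phases.find((p) => p.id === id);
  if (!self) return undefined;
  const deps = closureOf(phases, id)
    .map((depId) => phases.find((p) => p.id === depId))
    .filter((p): p is SketchPhase => p !== undefined)
    .map(strip);
  return JSON.stringify({ self: strip(self), deps });
}

const flowBefore: SketchPhase[] = [
  { id: "A", task: "scan docs" },
  { id: "B", task: "scan code" },
  { id: "C", task: "summarize {steps.B.output}", dependsOn: ["B"] },
];
// Edit phase B only.
const flowAfter: SketchPhase[] = flowBefore.map((p) =>
  p.id === "B" ? { ...p, task: "scan code and tests" } : p,
);

// Phases whose fingerprint moved: B itself and its dependent C, but not A.
const movedPhases = flowBefore
  .map((p) => p.id)
  .filter((id) => sketchFingerprint(flowBefore, id) !== sketchFingerprint(flowAfter, id));
```

In this toy flow the independent sibling `A` keeps its fingerprint, which is the property the file's header comment claims for the real function.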
package/extensions/index.ts
CHANGED
@@ -28,7 +28,7 @@ import { type AgentScope, discoverAgents, readSubagentSettings, shouldSyncBuilti
 import { renderRunResult, summarizeRun } from "./render.ts";
 import { RunHistoryComponent, type RunHistoryResult } from "./runs-view.ts";
 import { ApprovalViewComponent, type ApprovalChoice } from "./approval-view.ts";
-import { executeTaskflow, recomputeTaskflow, type ApprovalDecision, type ApprovalRequest, type RecomputeReport, type RuntimeDeps, type RuntimeResult } from "./runtime.ts";
+import { executeTaskflow, recomputeTaskflow, summarizeReuse, type ApprovalDecision, type ApprovalRequest, type RecomputeReport, type RuntimeDeps, type RuntimeResult } from "./runtime.ts";
 import { type UsageStats } from "./usage.ts";
 import { finalPhase, resolveArgs, type Taskflow, validateTaskflow, desugar, isShorthand } from "./schema.ts";
 import {
@@ -150,6 +150,12 @@ const TaskflowParams = Type.Object({
       description: "Run in background (detached child process); return runId immediately. Status polled via store.",
     }),
   ),
+  incremental: Type.Optional(
+    Type.Boolean({
+      description:
+        "For action=run: default every phase to cross-run caching so re-running the flow reuses unchanged phases across runs/sessions (incremental recompute). Overrides the flow's own `incremental` field. Per-phase cache settings and cross-run-blocked types (gate/approval/loop/tournament) still take precedence. Omit to use the flow's setting (default: run-only — fresh each run).",
+    }),
+  ),
 });
 
 function formatFlowIR(ir: TaskflowIR): string {
@@ -225,6 +231,17 @@ function formatRecompute(r: RecomputeReport): string {
     if (r.cutoff.length > 0) lines.push(` → saved ${r.cutoff.length} re-execution(s).`);
   }
   lines.push(`✓ reused (outside frontier): ${r.reused.join(", ") || "—"}`);
+  // Per-phase "why" — the explainable-reactivity trace (like React DevTools
+  // telling you why each component re-rendered). Only shown when present.
+  if (r.decisions && r.decisions.length > 0) {
+    const glyph: Record<string, string> = { rerun: "▲", cutoff: "✂", reused: "✓", failed: "✗" };
+    lines.push("");
+    lines.push("Why:");
+    for (const d of r.decisions) {
+      const cause = d.causedBy && d.causedBy.length ? ` ← ${d.causedBy.join(", ")}` : "";
+      lines.push(` ${glyph[d.outcome] ?? "•"} ${d.phaseId}: ${d.reason}${cause}`);
+    }
+  }
   return lines.join("\n");
 }
 
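As a rough illustration of what the "Why" trace added to `formatRecompute` produces, the sketch below re-creates the rendering loop on made-up data. `SketchDecision` and `renderWhy` are hypothetical; the field names (`phaseId`, `outcome`, `reason`, `causedBy`) are taken from how the hunk above reads each decision, and the reason strings are invented for the example.

```typescript
// Assumed shape, inferred from the fields the hunk reads on each decision.
interface SketchDecision {
  phaseId: string;
  outcome: "rerun" | "cutoff" | "reused" | "failed";
  reason: string;
  causedBy?: string[];
}

// Mirrors the loop in the hunk: one line per decision, glyph by outcome,
// and an optional "← cause, cause" suffix.
function renderWhy(decisions: SketchDecision[]): string[] {
  const glyph: Record<string, string> = { rerun: "▲", cutoff: "✂", reused: "✓", failed: "✗" };
  return decisions.map((d) => {
    const cause = d.causedBy && d.causedBy.length ? ` ← ${d.causedBy.join(", ")}` : "";
    return ` ${glyph[d.outcome] ?? "•"} ${d.phaseId}: ${d.reason}${cause}`;
  });
}

// Invented decisions: an edited phase, a dependent cut off early, an untouched sibling.
const whyLines = renderWhy([
  { phaseId: "scout", outcome: "rerun", reason: "task text changed" },
  { phaseId: "plan", outcome: "cutoff", reason: "upstream output unchanged", causedBy: ["scout"] },
  { phaseId: "lint", outcome: "reused", reason: "outside stale frontier" },
]);
```

With this input the second line reads ` ✂ plan: upstream output unchanged ← scout`, so a reader can see both the outcome and which upstream phase triggered the check.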
@@ -242,6 +259,18 @@ function makeRunState(def: Taskflow, args: Record<string, unknown>, cwd: string)
   };
 }
 
+/** Resolve the run-wide default cache scope from the incremental flags. The
+ *  invocation-level override (the `incremental` tool arg) wins; otherwise the
+ *  flow's own `incremental` field; otherwise the safe `run-only` default
+ *  (each run starts fresh — cross-run reuse is opt-in). Exported for testing. */
+export function resolveCacheScope(
+  incrementalOverride: boolean | undefined,
+  flowIncremental: boolean | undefined,
+): "cross-run" | "run-only" {
+  const on = typeof incrementalOverride === "boolean" ? incrementalOverride : flowIncremental;
+  return on === true ? "cross-run" : "run-only";
+}
+
 async function runFlow(
   def: Taskflow,
   args: Record<string, unknown>,
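The precedence implemented by `resolveCacheScope` is easy to check by hand. The snippet below copies the function body from the hunk above (without the `export`) and evaluates the four interesting combinations; the `scopes` object and its property names are only there for the illustration.

```typescript
// Copied from the hunk above so the example is self-contained.
function resolveCacheScope(
  incrementalOverride: boolean | undefined,
  flowIncremental: boolean | undefined,
): "cross-run" | "run-only" {
  const on = typeof incrementalOverride === "boolean" ? incrementalOverride : flowIncremental;
  return on === true ? "cross-run" : "run-only";
}

const scopes = {
  // No flag anywhere: the safe default, each run starts fresh.
  neither: resolveCacheScope(undefined, undefined), // "run-only"
  // Flow opts in, invocation is silent: the flow field applies.
  flowOnly: resolveCacheScope(undefined, true), // "cross-run"
  // Invocation says false: it wins even though the flow opted in.
  argOffBeatsFlow: resolveCacheScope(false, true), // "run-only"
  // Invocation says true: it wins even though the flow opted out.
  argOnBeatsFlow: resolveCacheScope(true, false), // "cross-run"
};
```

Note that an explicit `false` from the invocation is a real override, not "unset": only `undefined` falls through to the flow's own field.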
@@ -249,6 +278,9 @@ async function runFlow(
   signal: AbortSignal | undefined,
   onUpdate: ((p: AgentToolResult<TaskflowDetails>) => void) | undefined,
   existing?: RunState,
+  // Invocation-level incremental override: when set, wins over def.incremental.
+  // undefined → fall back to the flow's own `incremental` field (default off).
+  incrementalOverride?: boolean,
 ): Promise<RuntimeResult> {
   const state = existing ?? makeRunState(def, args, ctx.cwd);
 
@@ -374,11 +406,15 @@ async function runFlow(
     persist: persistThrottled,
     requestApproval,
     loadFlow: (name: string) => getFlow(ctx.cwd, name)?.def,
-    // Cross-run cache is opt-in
-    //
-    //
-    //
-
+    // Cross-run cache is opt-in. By default a real run is `run-only` (fresh
+    // each run): defaulting every phase to cross-run silently persists
+    // outputs and can serve stale results for phases whose agents read files
+    // at runtime (those files are not in the cache key). A user opts in
+    // explicitly — the invocation `incremental` arg wins, else the flow's
+    // own `incremental` field, else the safe run-only default. All the
+    // soundness fallbacks (blocked types, per-phase fingerprint, shareContext)
+    // still apply per phase inside executePhase.
+    cacheScopeDefault: resolveCacheScope(incrementalOverride, def.incremental),
   });
   // Auto-report cache savings at the end of a real run so the user sees the
   // M1-M5 effect without running a separate /tf command.
@@ -958,7 +994,7 @@ export default function (pi: ExtensionAPI) {
         };
       }
 
-      const result = await runFlow(def, args, ctx, signal, onUpdate as any);
+      const result = await runFlow(def, args, ctx, signal, onUpdate as any, undefined, params.incremental as boolean | undefined);
       // Surface the validation warnings in the tool result so the model
       // can acknowledge or fix them, and the user sees them in the chat.
       if (v.warnings.length) {
@@ -1399,15 +1435,18 @@ function errorResult(action: string, message: string): ToolResult {
   };
 }
 
-function formatCacheReport(state: RunState,
-  const
-
-
-  //
-  //
-  //
-
-
+function formatCacheReport(state: RunState, _totalUsage: UsageStats): string {
+  const r = summarizeReuse(state);
+  const reused = r.reusedRunOnly + r.reusedCrossRun;
+  if (reused === 0) return ""; // nothing reused — no incremental story to tell
+  // Honest framing: report reused-vs-executed counts, and a dollar figure only
+  // for within-run reuse (where the prior usage is preserved). Cross-run hits
+  // zero their usage, so their original cost is genuinely unknown — we say
+  // "reused" without inventing a savings number for them.
+  const parts: string[] = [`♻️ ${reused}/${r.done} phase(s) reused (${r.executed} executed this run)`];
+  if (r.savedUSD > 0) parts.push(`~$${r.savedUSD.toFixed(4)} of re-execution avoided`);
+  if (r.reusedCrossRun > 0) parts.push(`${r.reusedCrossRun} from cross-run cache`);
+  return parts.join(" · ");
 }
 
 function finalResult(action: string, result: RuntimeResult): ToolResult {
package/extensions/runtime.ts
CHANGED
@@ -20,7 +20,7 @@ import { type Budget, type CacheScope, dependenciesOf, finalPhase, LOOP_DEFAULT_
 import { verifyTaskflow } from "./verify.ts";
 import { hashInput, newRunId, type PhaseState, type RunState, runsDir } from "./store.ts";
 import { CacheStore, resolveFingerprint } from "./cache.ts";
-import { compileTaskflowToIR } from "./flowir/index.ts";
+import { compileTaskflowToIR, phaseFingerprint } from "./flowir/index.ts";
 import { computeStaleFrontier, declaredReadMapOfDef, readMapOf } from "./stale.ts";
 import { ctxDirFor, drainPendingSpawns, initCtxDir, registerNode, setNodeStatus, type SpawnAssignment } from "./context-store.ts";
 import { allocateWorkspace, isWorkspaceKeyword, type Workspace } from "./workspace.ts";
@@ -72,6 +72,55 @@ export interface RuntimeResult {
   finalOutput: string;
   ok: boolean;
   totalUsage: UsageStats;
+  /** Incremental-reuse summary: how many phases were reused from cache vs.
+   *  freshly executed this run, and the cost the reused work would otherwise
+   *  have incurred (known only for within-run resume; cross-run hits zero
+   *  their usage so their original cost is not recoverable). Optional &
+   *  additive — callers that ignore it are unaffected. */
+  reuse?: ReuseSummary;
+}
+
+/** A run's incremental-reuse accounting (see RuntimeResult.reuse). */
+export interface ReuseSummary {
+  /** Phases that completed by executing a subagent this run. */
+  executed: number;
+  /** Phases served from the within-run resume cache (no new tokens). */
+  reusedRunOnly: number;
+  /** Phases restored from the cross-run store (no new tokens). */
+  reusedCrossRun: number;
+  /** Total phases that reached `done` (executed + reused). */
+  done: number;
+  /** USD the within-run-reused phases would have cost if re-executed (their
+   *  preserved prior usage). Cross-run hits are excluded (cost not recoverable). */
+  savedUSD: number;
+}
+
+/** Compute the incremental-reuse summary from a run's terminal phase states.
+ *  Pure, total, never throws. A phase is "reused" iff it carries a `cacheHit`
+ *  marker (set by `cachedPhase` for both within-run resume and cross-run hits). */
+export function summarizeReuse(state: RunState): ReuseSummary {
+  let executed = 0;
+  let reusedRunOnly = 0;
+  let reusedCrossRun = 0;
+  let savedUSD = 0;
+  for (const ps of Object.values(state.phases)) {
+    if (ps.status !== "done") continue;
+    if (ps.cacheHit === "run-only") {
+      reusedRunOnly++;
+      savedUSD += ps.usage?.cost ?? 0; // within-run resume preserves prior usage
+    } else if (ps.cacheHit === "cross-run") {
+      reusedCrossRun++; // cross-run hits zero their usage — cost not recoverable
+    } else {
+      executed++;
+    }
+  }
+  return {
+    executed,
+    reusedRunOnly,
+    reusedCrossRun,
+    done: executed + reusedRunOnly + reusedCrossRun,
+    savedUSD,
+  };
 }
 
 function buildInterpolationContext(
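A small worked example may help in reading the `ReuseSummary` accounting. The sketch below re-implements the counting rules of `summarizeReuse` on a reduced, hypothetical state shape (`SketchPhaseState`, `sketchSummarize`, and the phase names and costs are all invented), then builds a report line in the same way `formatCacheReport` in `extensions/index.ts` does.

```typescript
// Reduced, hypothetical stand-ins for PhaseState / ReuseSummary.
interface SketchPhaseState {
  status: "done" | "failed" | "pending";
  cacheHit?: "run-only" | "cross-run";
  usage?: { cost: number };
}
interface SketchReuse {
  executed: number;
  reusedRunOnly: number;
  reusedCrossRun: number;
  done: number;
  savedUSD: number;
}

// Same counting rules as the hunk: only `done` phases count; a dollar figure
// is accumulated only for within-run ("run-only") hits.
function sketchSummarize(phases: Record<string, SketchPhaseState>): SketchReuse {
  const r: SketchReuse = { executed: 0, reusedRunOnly: 0, reusedCrossRun: 0, done: 0, savedUSD: 0 };
  for (const id of Object.keys(phases)) {
    const ps = phases[id];
    if (!ps || ps.status !== "done") continue;
    if (ps.cacheHit === "run-only") {
      r.reusedRunOnly++;
      r.savedUSD += ps.usage?.cost ?? 0;
    } else if (ps.cacheHit === "cross-run") {
      r.reusedCrossRun++;
    } else {
      r.executed++;
    }
  }
  r.done = r.executed + r.reusedRunOnly + r.reusedCrossRun;
  return r;
}

const reuse = sketchSummarize({
  scout: { status: "done", usage: { cost: 0.02 } }, // executed this run
  plan: { status: "done", cacheHit: "run-only", usage: { cost: 0.0125 } },
  lint: { status: "done", cacheHit: "cross-run" }, // cost unknown, not counted
  ship: { status: "failed" }, // ignored: never reached `done`
});

const reusedTotal = reuse.reusedRunOnly + reuse.reusedCrossRun;
const reportLine = [
  `♻️ ${reusedTotal}/${reuse.done} phase(s) reused (${reuse.executed} executed this run)`,
  `~$${reuse.savedUSD.toFixed(4)} of re-execution avoided`,
  `${reuse.reusedCrossRun} from cross-run cache`,
].join(" · ");
```

Here two of three completed phases are reused, but only the within-run hit contributes to `savedUSD` (0.0125); the cross-run hit is counted without a dollar amount, and the failed phase is left out entirely.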
@@ -120,6 +169,31 @@ function resultToPhaseState(id: string, r: RunResult, inputHash: string, parseJs
   };
 }
 
+/**
+ * Synthesize a 0-token `RunResult` from a cached per-item `PhaseState` so a
+ * cross-run per-item cache hit flows through `mergePhaseState` as a normal
+ * successful fan-out item. `stopReason: "cache-hit"` is NOT in `isFailed`'s
+ * failure set (only "error"/"aborted"/non-zero exit), so the item counts as
+ * success. Usage is `emptyUsage()` — a cached item spent no new tokens this
+ * run, so `mergePhaseState`'s `aggregateUsage` charges nothing for it.
+ *
+ * Used only by the `map` per-item cache path (see `runFanout`). Fail-open by
+ * construction: this is only reached AFTER a successful `cachedPhase` lookup,
+ * so `ps.output` is always present.
+ */
+function phaseStateToRunResult(ps: PhaseState, it: { agent: string; task: string }): RunResult {
+  return {
+    agent: it.agent,
+    task: it.task,
+    exitCode: 0,
+    output: ps.output ?? "",
+    stderr: "",
+    usage: emptyUsage(),
+    model: ps.model,
+    stopReason: "cache-hit",
+  };
+}
+
 /** Convert observed read refs (e.g. "steps.scout.output") into a structured
  *  readSet keyed by upstream phase id, tagging each with the version
  *  (= inputHash) that was current when read. Only `steps.*` refs are upstream
@@ -277,12 +351,20 @@ function mergePhaseState(
   const model = ran.find((r) => r.model !== undefined)?.model;
   // Combine outputs as a labelled list; also expose a JSON array of outputs.
   // For failed items, use the error message instead of the useless placeholder.
-
+  // Labels are positionally aligned to the ORIGINAL `over` array: we iterate
+  // over ALL results (including budget-skipped, which are filtered to null) and
+  // use `results.length` as N, so item k's label reads `[k/N]` matching its
+  // position in `over` — not its rank among non-skipped items. Per-item cache
+  // hits (`stopReason: "cache-hit"`) are not budget-skipped, so they keep their
+  // original positional label.
+  const combinedText = results
     .map((r, i) => {
-
+      if (r.stopReason === "budget-skipped") return null;
+      const label = `### [${i + 1}/${results.length}] ${r.agent}${isFailed(r) ? " (failed)" : ""}`;
       const content = isFailed(r) ? (r.errorMessage || r.stderr || r.output) : r.output;
       return `${label}\n\n${content}`;
     })
+    .filter((x): x is string => x !== null)
     .join("\n\n---\n\n");
   // Only successful runs feed the parsed JSON array (no error/skip strings).
   const jsonArray = parseJson ? ran.filter((r) => !isFailed(r)).map((r) => safeParse(r.output) ?? r.output) : undefined;
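The positional-label behaviour in the `mergePhaseState` hunk above comes from mapping over every result first and dropping the skipped ones afterwards. A reduced sketch follows, with a hypothetical `SketchResult` type and `positionalLabels` helper, and with failure handling left out.

```typescript
// Hypothetical, reduced result shape; the real RunResult has more fields.
interface SketchResult {
  agent: string;
  output: string;
  stopReason?: string;
}

// Map over ALL results so `i` is the position in the original `over` array,
// turn budget-skipped entries into null, then drop the nulls.
function positionalLabels(results: SketchResult[]): string[] {
  return results
    .map((r, i) =>
      r.stopReason === "budget-skipped" ? null : `### [${i + 1}/${results.length}] ${r.agent}`,
    )
    .filter((x): x is string => x !== null);
}

const labels = positionalLabels([
  { agent: "worker", output: "first" },
  { agent: "worker", output: "", stopReason: "budget-skipped" },
  { agent: "worker", output: "third", stopReason: "cache-hit" },
]);
// labels: ["### [1/3] worker", "### [3/3] worker"]
```

Filtering before mapping would instead label the third item `[2/2]`, losing its position in `over`; a per-item cache hit is kept because `"cache-hit"` is not the skipped marker.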
@@ -721,6 +803,7 @@ async function executePhaseInner(
     flowName: state.flowName,
     runId: state.runId,
     flowDefHash: state.flowDefHash === "failed" ? undefined : state.flowDefHash,
+    phaseFp: state.phaseFingerprints?.[phase.id],
     forceRerun: opts?.forceRerun,
     thinking: phase.thinking,
     tools: phase.tools,
@@ -820,7 +903,14 @@ async function executePhaseInner(
   const parseJson = phase.output === "json";
 
   // Runs a list of sub-tasks with live fan-out progress + aggregate live usage/activity.
-
+  // `perItem` (map only) enables per-item cross-run caching: each item is looked
+  // up in the cache before spawning a subagent, and a successful fresh item is
+  // recorded so a later run with that item unchanged hits per-item. When
+  // `perItem` is undefined (parallel, or non-cacheable maps) the path is inert.
+  const runFanout = async (
+    items: Array<{ agent: string; task: string }>,
+    perItem?: { keyOf: (idx: number) => CacheKeys | null; cc: PhaseCacheCtx },
+  ): Promise<RunResult[]> => {
     let done = 0;
     let running = 0;
     let failed = 0;
@@ -854,6 +944,28 @@ async function executePhaseInner(
           stopReason: "budget-skipped",
         } satisfies RunResult;
       }
+      // Per-item cross-run cache lookup (map only). A hit synthesizes a 0-token
+      // RunResult and returns immediately — the item never spawns a subagent and
+      // never reaches the ctx_spawn drain below (a cached item can't have queued
+      // new spawns). Fail-open: any error in the lookup path degrades to executing.
+      if (perItem) {
+        try {
+          const ckItem = perItem.keyOf(idx);
+          if (ckItem) {
+            const hit = cachedPhase(perItem.cc, ckItem);
+            if (hit) {
+              done++;
+              const synth = phaseStateToRunResult(hit, it);
+              liveUsages[idx] = emptyUsage();
+              if (hit.model) latestModel = hit.model;
+              refresh();
+              return synth;
+            }
+          }
+        } catch {
+          /* fail-open: a cache read error must never sink the item */
+        }
+      }
       running++;
       refresh();
       if (ctxDir) {
@@ -869,6 +981,23 @@ async function executePhaseInner(
       done++;
       if (isFailed(r)) failed++;
       liveUsages[idx] = r.usage;
+      // Per-item cross-run cache record (map only): persist a successful fresh
+      // item so a later run with this item unchanged hits per-item instead of
+      // re-running. Failed and budget-skipped items are never cached (a stale
+      // failure would be served on the next run). Fail-open: a write error never
+      // sinks the item — the fresh `r` is already in hand and flows downstream.
+      if (perItem && !isFailed(r) && r.stopReason !== "budget-skipped") {
+        try {
+          const ckItem = perItem.keyOf(idx);
|
|
992
|
+
if (ckItem) {
|
|
993
|
+
const ccItem: PhaseCacheCtx = { ...perItem.cc, phaseId: `${phase.id}#item${idx}` };
|
|
994
|
+
const itemPs = resultToPhaseState(`${phase.id}#item${idx}`, r, ckItem.key, parseJson);
|
|
995
|
+
recordCache(ccItem, itemPs);
|
|
996
|
+
}
|
|
997
|
+
} catch {
|
|
998
|
+
/* fail-open: cache write must never sink the item */
|
|
999
|
+
}
|
|
1000
|
+
}
|
|
872
1001
|
if (ctxDir) {
|
|
873
1002
|
try {
|
|
874
1003
|
const itemNid = nodeIdFor(String(idx));
|
|
@@ -1068,12 +1197,59 @@ async function executePhaseInner(
|
|
|
1068
1197
|
task: preRead + interpolate(phase.task ?? "", localCtx).text,
|
|
1069
1198
|
};
|
|
1070
1199
|
});
|
|
1200
|
+
// Per-item caching is sound ONLY when ALL of:
|
|
1201
|
+
// - cross-run scope: run-only has no persistent store, so per-item entries
|
|
1202
|
+
// could never be re-read (no point keying them).
|
|
1203
|
+
// - no Shared Context Tree (`!sharing`): a sharing map item can read sibling
|
|
1204
|
+
// blackboard writes OUTSIDE its declared deps, so the per-item key (which
|
|
1205
|
+
// folds only the item's own task) under-approximates real reads and could
|
|
1206
|
+
// serve a stale result. Fall back to whole-map.
|
|
1207
|
+
// - not inside a runtime-generated sub-flow (`def:` frame in the stack):
|
|
1208
|
+
// such flows are untrusted / possibly non-deterministic, so per-item reuse
|
|
1209
|
+
// is unsafe. Fall back to whole-map (which still applies breadth caps).
|
|
1210
|
+
// `undefined phaseFingerprint` is NOT a blocker for soundness — it is a
|
|
1211
|
+
// DELIBERATE design choice: per-item keys omit BOTH phaseFp and flowDefHash
|
|
1212
|
+
// (via ccPerItem below) so a changing `over` cannot move unchanged items'
|
|
1213
|
+
// keys. See ccPerItem for the full soundness argument.
|
|
1214
|
+
const perItemCacheable =
|
|
1215
|
+
cc.scope === "cross-run" &&
|
|
1216
|
+
!sharing &&
|
|
1217
|
+
!(deps._stack ?? []).some((s) => s.startsWith("def:"));
|
|
1218
|
+
// Per-item cache context: structural fingerprints (phaseFp + flowDefHash)
|
|
1219
|
+
// are OMITTED so a changing `over` cannot move unchanged items' keys. Both
|
|
1220
|
+
// fingerprints hash `over` (the array source); folding either into a
|
|
1221
|
+
// per-item key means editing one item invalidates EVERY per-item key at
|
|
1222
|
+
// once (no partial reuse) — the bug fixed here. A single item's output is
|
|
1223
|
+
// fully specified by `it.task` (template + {item}/{as} value + any
|
|
1224
|
+
// upstream-output refs + args) + `it.agent` + model + thinking/tools/preRead
|
|
1225
|
+
// + the world-state `fingerprint`; `over` only determines WHICH items
|
|
1226
|
+
// exist, not WHAT any item computes. `flowName` is retained for cross-flow
|
|
1227
|
+
// collision prevention. Soundness: docs/internal/cache-migration.md.
|
|
1228
|
+
// NB: perItemCacheable already gates on scope === "cross-run", which is
|
|
1229
|
+
// blocked upstream when flowDefHash === "failed", so ccPerItem is only
|
|
1230
|
+
// built when flowDefHash is a real hash (or already undefined) — setting
|
|
1231
|
+
// it to undefined here is a safe no-op for the failed case.
|
|
1232
|
+
const ccPerItem: PhaseCacheCtx = { ...cc, phaseFp: undefined, flowDefHash: undefined };
|
|
1233
|
+
// Pre-compute per-item CacheKeys once so the lookup and the record path use
|
|
1234
|
+
// the IDENTICAL key (built from ccPerItem, NOT the whole-phase cc). The
|
|
1235
|
+
// per-item key folds `it.agent` (Arbiter fix): a different agent means
|
|
1236
|
+
// different output, so a per-item key WITHOUT the agent could serve a stale
|
|
1237
|
+
// cross-agent hit when only `phase.agent` changed (the whole-map key would
|
|
1238
|
+
// correctly miss via JSON.stringify(tasks), but per-item keys would not).
|
|
1239
|
+
const perItemKeys: (CacheKeys | null)[] = perItemCacheable
|
|
1240
|
+
? tasks.map((it) => cacheKeys(ccPerItem, [phase.id, it.agent, phase.model ?? "", it.task]))
|
|
1241
|
+
: tasks.map(() => null);
|
|
1242
|
+
const perItem = perItemCacheable
|
|
1243
|
+
? { keyOf: (idx: number): CacheKeys | null => perItemKeys[idx] ?? null, cc: ccPerItem }
|
|
1244
|
+
: undefined;
|
|
1245
|
+
// Whole-map key keeps the FULL cc (phaseFp + flowDefHash) so its fast path
|
|
1246
|
+
// and any pre-existing whole-map entries are unchanged (backward compat).
|
|
1071
1247
|
const ck = cacheKeys(cc, [phase.id, phase.model ?? "", JSON.stringify(tasks)]);
|
|
1072
1248
|
const inputHash = ck.key;
|
|
1073
1249
|
const cached = cachedPhase(cc, ck);
|
|
1074
1250
|
if (cached) return cached;
|
|
1075
1251
|
|
|
1076
|
-
const results = await runFanout(tasks);
|
|
1252
|
+
const results = await runFanout(tasks, perItem);
|
|
1077
1253
|
const ps = mergePhaseState(phase.id, results, inputHash, parseJson);
|
|
1078
1254
|
if (readRefs.length) ps.reads = readRefsToReads(readRefs, state);
|
|
1079
1255
|
if (mapTruncated) {
|
|
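Review note: the comments in this hunk argue that a per-item key must not fold `phaseFp` or `flowDefHash`, because both hash `over`. The sketch below illustrates the consequence under simplifying assumptions: keys are left as joined strings instead of being hashed, and only the parts visible in this diff are folded (the real `cacheKeys` appends further parts that are not shown here). `joinKey`, `perItemKey`, and `wholeMapKey` are hypothetical names.

```typescript
interface MapTask { agent: string; task: string }

// Stand-in for hashing: the joined parts are enough to show which inputs move a key.
const joinKey = (parts: string[]): string => parts.join("\u001f");

// Per-item key: flow name + phase id + agent + model + resolved task; no structural fingerprint.
function perItemKey(flowName: string, phaseId: string, model: string, it: MapTask): string {
  return joinKey([`flow:${flowName}`, phaseId, it.agent, model, it.task]);
}

// Whole-map key: folds the structural fingerprint and every task at once.
function wholeMapKey(flowName: string, phaseFp: string, phaseId: string, model: string, tasks: MapTask[]): string {
  return joinKey([`flow:${flowName}`, `v3:phasefp:${phaseFp}`, phaseId, model, JSON.stringify(tasks)]);
}

const run1: MapTask[] = [
  { agent: "auditor", task: "audit a.ts" },
  { agent: "auditor", task: "audit b.ts" },
];
// Second run: item 1 was edited and one item was appended.
const run2: MapTask[] = [
  { agent: "auditor", task: "audit a.ts" },
  { agent: "auditor", task: "audit b2.ts" },
  { agent: "auditor", task: "audit c.ts" },
];

const keys1 = run1.map((it) => perItemKey("audit", "audit-each", "", it));
const keys2 = run2.map((it) => perItemKey("audit", "audit-each", "", it));

// Which items of the second run could be served from entries written by the first.
const reusable = keys2.map((k) => keys1.indexOf(k) !== -1); // [true, false, false]
const wholeMapMoved =
  wholeMapKey("audit", "fp1", "audit-each", "", run1) !== wholeMapKey("audit", "fp2", "audit-each", "", run2);
console.log(reusable, wholeMapMoved);
```

Changing the agent on an item changes its key as well, which matches the "Arbiter fix" comment about folding `it.agent`.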
@@ -1635,6 +1811,12 @@ export interface PhaseCacheCtx {
|
|
|
1635
1811
|
* key so two structurally-different flows that share a name can never
|
|
1636
1812
|
* collide, and a changed flow never serves a stale cross-run hit. */
|
|
1637
1813
|
flowDefHash?: string | "failed";
|
|
1814
|
+
/** Per-phase structural sub-fingerprint (M6). When present, folds into the
|
|
1815
|
+
* key as `v3:phasefp:<subfp>` so editing phase B invalidates only B + its
|
|
1816
|
+
* transitive dependents. When absent (sub-flow inner states, or a phase
|
|
1817
|
+
* for which per-phase soundness couldn't be guaranteed), `cacheKeys`
|
|
1818
|
+
* falls back to `flowDefHash` — preserving pre-M6 whole-flow behavior. */
|
|
1819
|
+
phaseFp?: string;
|
|
1638
1820
|
/** Force this phase to re-execute, ignoring the within-run prior AND the
|
|
1639
1821
|
* cross-run store (M5 recompute seed). Downstream phases are NOT forced —
|
|
1640
1822
|
* they re-evaluate naturally: if the seed's new output changed their
|
|
@@ -1646,27 +1828,34 @@ export interface PhaseCacheCtx {
|
|
|
1646
1828
|
/** A computed cache identity: the new (versioned) key plus the read-only
|
|
1647
1829
|
* fallback keys used to honor entries written by older releases. The `key`
|
|
1648
1830
|
* is what we WRITE under and what `PhaseState.inputHash` carries; the
|
|
1649
|
-
* `
|
|
1650
|
-
* never produces a miss-storm. See docs/internal/cache-migration.md. */
|
|
1831
|
+
* `v2Key`/`bareKey`/`legacyKey` are consulted READ-ONLY on a miss so an
|
|
1832
|
+
* upgrade never produces a miss-storm. See docs/internal/cache-migration.md. */
|
|
1651
1833
|
export interface CacheKeys {
|
|
1652
|
-
/** Current key: folds `
|
|
1834
|
+
/** Current key: folds `v3:phasefp:<subfp>` (the per-phase structural
|
|
1835
|
+
* sub-fingerprint; degrades to the whole-flow hash when per-phase
|
|
1836
|
+
* soundness couldn't be guaranteed). */
|
|
1653
1837
|
key: string;
|
|
1654
|
-
/** Pre-
|
|
1655
|
-
|
|
1838
|
+
/** Pre-M6 key: `v2:flowdef:<flowDefHash>` (whole-flow fingerprint).
|
|
1839
|
+
* Read-only. */
|
|
1840
|
+
v2Key: string;
|
|
1656
1841
|
/** Bare (unversioned) `flowdef:` key — written by pre-H1 code that folded
|
|
1657
1842
|
* the hash without a `v2:` prefix. Read-only. Removed in v0.1.0. */
|
|
1658
1843
|
bareKey: string;
|
|
1844
|
+
/** Pre-flowDefHash-era key: the flowdef line OMITTED entirely. Read-only. */
|
|
1845
|
+
legacyKey: string;
|
|
1659
1846
|
}
|
|
1660
1847
|
|
|
1661
1848
|
/** Fold the phase fingerprint into the base hash parts to form the cache keys.
|
|
1662
1849
|
*
|
|
1663
|
-
*
|
|
1850
|
+
* Four keys are produced for backward compatibility (see
|
|
1664
1851
|
* docs/internal/cache-migration.md):
|
|
1665
|
-
* - `key` : `
|
|
1852
|
+
* - `key` : `v3:phasefp:<subfp>` — the current write key (per-phase
|
|
1853
|
+
* structural sub-fingerprint; falls back to the whole-flow hash when
|
|
1854
|
+
* `cc.phaseFp` is absent).
|
|
1855
|
+
* - `v2Key` : `v2:flowdef:<flowDefHash>` — pre-M6 whole-flow key.
|
|
1856
|
+
* - `bareKey` : bare `flowdef:<flowDefHash>` (unversioned) — pre-H1 entries.
|
|
1666
1857
|
* - `legacyKey`: the flowdef line omitted — pre-flowDefHash entries.
|
|
1667
|
-
*
|
|
1668
|
-
* folded the hash without the `v2:` prefix.
|
|
1669
|
-
* `cachedPhase` consults all three READ-ONLY on a miss; `recordCache` writes
|
|
1858
|
+
* `cachedPhase` consults all four READ-ONLY on a miss; `recordCache` writes
|
|
1670
1859
|
* only `key`. This means an upgrade never produces a miss-storm: existing
|
|
1671
1860
|
* entries (whichever shape) still hit, and new writes converge on `key`. */
|
|
1672
1861
|
export function cacheKeys(cc: PhaseCacheCtx, baseParts: string[]): CacheKeys {
|
|
@@ -1682,10 +1871,15 @@ export function cacheKeys(cc: PhaseCacheCtx, baseParts: string[]): CacheKeys {
|
|
|
1682
1871
|
];
|
|
1683
1872
|
const fold = (parts: string[]): string =>
|
|
1684
1873
|
cc.fingerprint ? hashInput(...parts, cc.fingerprint) : hashInput(...parts);
|
|
1874
|
+
// Per-phase sub-fingerprint; falls back to the whole-flow hash when absent
|
|
1875
|
+
// (sub-flow inner states, or soundness fallback) — preserving pre-M6 behavior.
|
|
1876
|
+
const fp = cc.phaseFp ?? cc.flowDefHash ?? "";
|
|
1877
|
+
const fdh = cc.flowDefHash ?? "";
|
|
1685
1878
|
return {
|
|
1686
|
-
key: fold([`flow:${cc.flowName}`, `
|
|
1879
|
+
key: fold([`flow:${cc.flowName}`, `v3:phasefp:${fp}`, ...tail]),
|
|
1880
|
+
v2Key: fold([`flow:${cc.flowName}`, `v2:flowdef:${fdh}`, ...tail]),
|
|
1881
|
+
bareKey: fold([`flow:${cc.flowName}`, `flowdef:${fdh}`, ...tail]),
|
|
1687
1882
|
legacyKey: fold([`flow:${cc.flowName}`, ...tail]),
|
|
1688
|
-
bareKey: fold([`flow:${cc.flowName}`, `flowdef:${cc.flowDefHash ?? ""}`, ...tail]),
|
|
1689
1883
|
};
|
|
1690
1884
|
}
|
|
1691
1885
|
|
|
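Review note: a compact sketch of the four-key ladder that `cacheKeys` produces and the read order `cachedPhase` is documented to use. It assumes keys are plain joined strings (the real code hashes them via `hashInput`), ignores TTL, and uses a `Map` as a stand-in store; `sketchCacheKeys`, `readLadder`, and the `Ladder*` types are hypothetical names.

```typescript
interface LadderCtx { flowName: string; phaseFp?: string; flowDefHash?: string }
interface LadderKeys { key: string; v2Key: string; bareKey: string; legacyKey: string }

const joinParts = (parts: string[]): string => parts.join("\u001f");

function sketchCacheKeys(cc: LadderCtx, tail: string[]): LadderKeys {
  // Per-phase fingerprint, degrading to the whole-flow hash when absent.
  const fp = cc.phaseFp ?? cc.flowDefHash ?? "";
  const fdh = cc.flowDefHash ?? "";
  const flow = `flow:${cc.flowName}`;
  return {
    key: joinParts([flow, `v3:phasefp:${fp}`, ...tail]),
    v2Key: joinParts([flow, `v2:flowdef:${fdh}`, ...tail]),
    bareKey: joinParts([flow, `flowdef:${fdh}`, ...tail]),
    legacyKey: joinParts([flow, ...tail]),
  };
}

const TIERS: (keyof LadderKeys)[] = ["key", "v2Key", "bareKey", "legacyKey"];

// Try each tier in order; a hit is returned as-is and never re-stored under `key`.
function readLadder(
  store: Map<string, string>,
  keys: LadderKeys,
): { tier: keyof LadderKeys; value: string } | null {
  for (const tier of TIERS) {
    const value = store.get(keys[tier]);
    if (value !== undefined) return { tier, value };
  }
  return null;
}

const ladderKeys = sketchCacheKeys({ flowName: "audit", phaseFp: "p1", flowDefHash: "f1" }, ["phase-a", "task text"]);
// An entry written by an older release under the v2 shape.
const oldStore = new Map<string, string>([[ladderKeys.v2Key, "cached output"]]);
const ladderHit = readLadder(oldStore, ladderKeys);
console.log(ladderHit, oldStore.size);
```

Because the read never writes through, the store still holds exactly one entry after the hit, which is the "cache size stays stable" property the doc comment calls out.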
@@ -1696,9 +1890,10 @@ export function cacheKeys(cc: PhaseCacheCtx, baseParts: string[]): CacheKeys {
|
|
|
1696
1890
|
* - "cross-run": within-run first, then the persistent cross-run store.
|
|
1697
1891
|
* On a cross-run hit, usage is zeroed and `cacheHit` records the source.
|
|
1698
1892
|
*
|
|
1699
|
-
* The cross-run read is
|
|
1700
|
-
* `keys.key` (current `
|
|
1701
|
-
*
|
|
1893
|
+
* The cross-run read is FOUR-TIER and READ-ONLY for fallback keys: it tries
|
|
1894
|
+
* `keys.key` (current `v3:phasefp:` shape) first, then `keys.v2Key` (pre-M6
|
|
1895
|
+
* `v2:flowdef:`), then `keys.bareKey` (pre-H1 bare `flowdef:`), then
|
|
1896
|
+
* `keys.legacyKey` (pre-flowDefHash, no flowdef line).
|
|
1702
1897
|
* A hit on ANY tier is restored as a cache hit; we do NOT write-through (no
|
|
1703
1898
|
* re-store under the new key) so the cache size stays stable and the legacy
|
|
1704
1899
|
* entry ages out naturally. See docs/internal/cache-migration.md.
|
|
@@ -1707,14 +1902,17 @@ function cachedPhase(cc: PhaseCacheCtx, keys: CacheKeys): PhaseState | null {
|
|
|
1707
1902
|
if (cc.scope === "off") return null;
|
|
1708
1903
|
if (cc.forceRerun) return null;
|
|
1709
1904
|
|
|
1710
|
-
// 1. within-run resume (fastest; always allowed unless scope is off)
|
|
1905
|
+
// 1. within-run resume (fastest; always allowed unless scope is off). Flag
|
|
1906
|
+
// it as a `run-only` cache hit so the run summary can count it as reused
|
|
1907
|
+
// work (it spent no new tokens). The prior usage is preserved verbatim so
|
|
1908
|
+
// the summary can report what the reuse would otherwise have cost.
|
|
1711
1909
|
if (cc.prior && cc.prior.status === "done" && cc.prior.inputHash === keys.key) {
|
|
1712
|
-
return { ...cc.prior, status: "done" };
|
|
1910
|
+
return { ...cc.prior, status: "done", cacheHit: "run-only" };
|
|
1713
1911
|
}
|
|
1714
1912
|
|
|
1715
|
-
// 2. cross-run memoization (opt-in) —
|
|
1913
|
+
// 2. cross-run memoization (opt-in) — four-tier read-only fallback.
|
|
1716
1914
|
if (cc.scope === "cross-run") {
|
|
1717
|
-
for (const k of [keys.key, keys.bareKey, keys.legacyKey]) {
|
|
1915
|
+
for (const k of [keys.key, keys.v2Key, keys.bareKey, keys.legacyKey]) {
|
|
1718
1916
|
const e = cc.store.get(k, cc.ttlMs);
|
|
1719
1917
|
if (!e) continue;
|
|
1720
1918
|
// If we stored the full PhaseState, restore it (preserving gate,
|
|
@@ -1895,6 +2093,22 @@ export interface RecomputeReport {
|
|
|
1895
2093
|
/** Phases in the frontier whose inputHash did NOT move → cached result
|
|
1896
2094
|
* reused, no re-execution (early cutoff). Empty in dry-run (unknowable). */
|
|
1897
2095
|
readonly cutoff: readonly string[];
|
|
2096
|
+
/** Per-phase decision trace: WHY each phase was rerun / cut off / reused.
|
|
2097
|
+
* The "explainable reactivity" layer — like React DevTools telling you why
|
|
2098
|
+
* a component re-rendered. Additive; callers that ignore it are unaffected. */
|
|
2099
|
+
readonly decisions: readonly RecomputeDecision[];
|
|
2100
|
+
}
|
|
2101
|
+
|
|
2102
|
+
/** Why a single phase landed in its recompute outcome. */
|
|
2103
|
+
export interface RecomputeDecision {
|
|
2104
|
+
readonly phaseId: string;
|
|
2105
|
+
/** What happened (real run) or would happen (dry-run). */
|
|
2106
|
+
readonly outcome: "rerun" | "cutoff" | "reused" | "failed";
|
|
2107
|
+
/** Human-readable cause. */
|
|
2108
|
+
readonly reason: string;
|
|
2109
|
+
/** The upstream phase(s) that caused this outcome, when applicable
|
|
2110
|
+
* (e.g. the changed upstreams that forced a rerun). */
|
|
2111
|
+
readonly causedBy?: readonly string[];
|
|
1898
2112
|
}
|
|
1899
2113
|
|
|
1900
2114
|
/** Scan a flow for dependencies that cannot be observed through the readSet.
|
|
@@ -1946,6 +2160,30 @@ export async function recomputeTaskflow(
|
|
|
1946
2160
|
const allIds = Object.keys(newState.phases);
|
|
1947
2161
|
|
|
1948
2162
|
if (opts.dryRun) {
|
|
2163
|
+
// Explain each phase WITHOUT executing: a frontier phase "may rerun"
|
|
2164
|
+
// because it (transitively) reads a changed seed; everything else is
|
|
2165
|
+
// reused as unreachable. We name the in-frontier upstream(s) as the cause.
|
|
2166
|
+
const seedSet0 = new Set(seeds);
|
|
2167
|
+
const upstreamsOf = (id: string): string[] => {
|
|
2168
|
+
const observed = (newState.phases[id]?.reads ?? []).map((r) => r.stepId).filter((u) => u !== id);
|
|
2169
|
+
const decl = (declared.get(id) ?? []).filter((u) => u !== id);
|
|
2170
|
+
return [...new Set([...observed, ...decl])];
|
|
2171
|
+
};
|
|
2172
|
+
const decisions: RecomputeDecision[] = allIds.map((id) => {
|
|
2173
|
+
if (!frontier.has(id)) {
|
|
2174
|
+
return { phaseId: id, outcome: "reused", reason: "not reachable from any changed seed" };
|
|
2175
|
+
}
|
|
2176
|
+
if (seedSet0.has(id)) {
|
|
2177
|
+
return { phaseId: id, outcome: "rerun", reason: "forced by recompute request (seed)" };
|
|
2178
|
+
}
|
|
2179
|
+
const causes = upstreamsOf(id).filter((u) => frontier.has(u));
|
|
2180
|
+
return {
|
|
2181
|
+
phaseId: id,
|
|
2182
|
+
outcome: "rerun",
|
|
2183
|
+
reason: "reads a phase in the stale frontier; may re-run if that upstream's output moves",
|
|
2184
|
+
causedBy: causes.length ? causes : undefined,
|
|
2185
|
+
};
|
|
2186
|
+
});
|
|
1949
2187
|
return {
|
|
1950
2188
|
report: {
|
|
1951
2189
|
dryRun: true,
|
|
@@ -1954,6 +2192,7 @@ export async function recomputeTaskflow(
|
|
|
1954
2192
|
rerun: [...frontier],
|
|
1955
2193
|
reused: allIds.filter((id) => !frontier.has(id)),
|
|
1956
2194
|
cutoff: [],
|
|
2195
|
+
decisions,
|
|
1957
2196
|
},
|
|
1958
2197
|
state: newState,
|
|
1959
2198
|
};
|
|
@@ -2003,6 +2242,11 @@ export async function recomputeTaskflow(
|
|
|
2003
2242
|
.filter((id) => frontier.has(id));
|
|
2004
2243
|
const rerun: string[] = [];
|
|
2005
2244
|
const cutoff: string[] = [];
|
|
2245
|
+
const decisions: RecomputeDecision[] = [];
|
|
2246
|
+
// Phases whose OUTPUT actually moved this recompute (seed forced, or result
|
|
2247
|
+
// changed). Used to attribute a downstream rerun to the specific upstream(s)
|
|
2248
|
+
// that changed — the "why" of the decision trace.
|
|
2249
|
+
const outputMoved = new Set<string>();
|
|
2006
2250
|
const noop = () => {};
|
|
2007
2251
|
let aborted = false;
|
|
2008
2252
|
for (const id of order) {
|
|
@@ -2015,17 +2259,50 @@ export async function recomputeTaskflow(
|
|
|
2015
2259
|
const phase = newState.def.phases.find((p) => p.id === id);
|
|
2016
2260
|
if (!phase) continue;
|
|
2017
2261
|
const before = newState.phases[id]?.inputHash;
|
|
2018
|
-
const
|
|
2262
|
+
const isSeed = seedSet.has(id);
|
|
2263
|
+
const execOpts = isSeed ? { forceRerun: true } : undefined;
|
|
2264
|
+
// The upstream(s) of this phase whose output moved — the cause of a rerun.
|
|
2265
|
+
const changedUpstreams = depsFor(id).filter((u) => outputMoved.has(u));
|
|
2019
2266
|
try {
|
|
2020
2267
|
const ps = await executePhase(phase, newState, deps, newState.phases[id], noop, 0, execOpts);
|
|
2021
2268
|
newState.phases[id] = ps;
|
|
2022
2269
|
// A phase counts as "rerun" if it was a forced seed OR its result moved;
|
|
2023
2270
|
// otherwise it hit its cache (inputHash unchanged) → early cutoff.
|
|
2024
|
-
if (
|
|
2025
|
-
|
|
2271
|
+
if (isSeed || ps.inputHash !== before) {
|
|
2272
|
+
rerun.push(id);
|
|
2273
|
+
outputMoved.add(id);
|
|
2274
|
+
decisions.push(
|
|
2275
|
+
isSeed
|
|
2276
|
+
? { phaseId: id, outcome: "rerun", reason: "forced by recompute request (seed)" }
|
|
2277
|
+
: {
|
|
2278
|
+
phaseId: id,
|
|
2279
|
+
outcome: "rerun",
|
|
2280
|
+
reason: "input changed — an upstream's output moved",
|
|
2281
|
+
causedBy: changedUpstreams.length ? changedUpstreams : undefined,
|
|
2282
|
+
},
|
|
2283
|
+
);
|
|
2284
|
+
} else {
|
|
2285
|
+
cutoff.push(id);
|
|
2286
|
+
decisions.push({
|
|
2287
|
+
phaseId: id,
|
|
2288
|
+
outcome: "cutoff",
|
|
2289
|
+
reason: "input unchanged — upstream(s) re-ran but produced identical output (early cutoff)",
|
|
2290
|
+
causedBy: depsFor(id).filter((u) => frontier.has(u)).length
|
|
2291
|
+
? depsFor(id).filter((u) => frontier.has(u))
|
|
2292
|
+
: undefined,
|
|
2293
|
+
});
|
|
2294
|
+
}
|
|
2026
2295
|
} catch {
|
|
2027
2296
|
// A failing recompute phase is recorded as rerun (it was attempted).
|
|
2028
2297
|
rerun.push(id);
|
|
2298
|
+
outputMoved.add(id);
|
|
2299
|
+
decisions.push({ phaseId: id, outcome: "failed", reason: "re-execution attempted but the phase failed" });
|
|
2300
|
+
}
|
|
2301
|
+
}
|
|
2302
|
+
// Frontier-external phases were never touched — record them as reused.
|
|
2303
|
+
for (const id of allIds) {
|
|
2304
|
+
if (!frontier.has(id)) {
|
|
2305
|
+
decisions.push({ phaseId: id, outcome: "reused", reason: "not reachable from any changed seed" });
|
|
2029
2306
|
}
|
|
2030
2307
|
}
|
|
2031
2308
|
return {
|
|
@@ -2036,6 +2313,7 @@ export async function recomputeTaskflow(
|
|
|
2036
2313
|
rerun,
|
|
2037
2314
|
reused: allIds.filter((id) => !frontier.has(id)),
|
|
2038
2315
|
cutoff,
|
|
2316
|
+
decisions,
|
|
2039
2317
|
},
|
|
2040
2318
|
state: newState,
|
|
2041
2319
|
};
|
|
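Review note: the new `decisions` array is additive, so a caller can render it without touching `rerun`/`reused`/`cutoff`. The sketch below shows one way a consumer might format the trace. The `Decision` interface mirrors the `RecomputeDecision` shape from this diff; `formatDecision` and `countByOutcome` are hypothetical helpers, and the phase id `lint` is made up for the example.

```typescript
// Local mirror of the RecomputeDecision shape shown in the diff.
interface Decision {
  readonly phaseId: string;
  readonly outcome: "rerun" | "cutoff" | "reused" | "failed";
  readonly reason: string;
  readonly causedBy?: readonly string[];
}

// One line per phase: id, outcome, reason, and the upstreams that caused it (if any).
function formatDecision(d: Decision): string {
  const cause = d.causedBy && d.causedBy.length ? ` (caused by: ${d.causedBy.join(", ")})` : "";
  return `${d.phaseId}: ${d.outcome}: ${d.reason}${cause}`;
}

// Tally of outcomes, e.g. for a one-line run summary.
function countByOutcome(ds: readonly Decision[]): Record<Decision["outcome"], number> {
  const counts: Record<Decision["outcome"], number> = { rerun: 0, cutoff: 0, reused: 0, failed: 0 };
  for (const d of ds) counts[d.outcome]++;
  return counts;
}

const demoDecisions: Decision[] = [
  { phaseId: "discover", outcome: "rerun", reason: "forced by recompute request (seed)" },
  {
    phaseId: "audit-each",
    outcome: "rerun",
    reason: "reads a phase in the stale frontier; may re-run if that upstream's output moves",
    causedBy: ["discover"],
  },
  { phaseId: "lint", outcome: "reused", reason: "not reachable from any changed seed" },
];

const traceLines = demoDecisions.map(formatDecision);
const outcomeCounts = countByOutcome(demoDecisions);
console.log(traceLines.join("\n"), outcomeCounts);
```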
@@ -2099,6 +2377,27 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
|
|
|
2099
2377
|
}
|
|
2100
2378
|
}
|
|
2101
2379
|
|
|
2380
|
+
// M6: per-phase structural sub-fingerprints. Computed once per run (when
|
|
2381
|
+
// cross-run is potentially active) so editing phase B invalidates only B +
|
|
2382
|
+
// its transitive dependents, not independent siblings. Each value is either
|
|
2383
|
+
// a precise per-phase hash or the whole-flow `flowDefHash` (soundness
|
|
2384
|
+
// fallback for shareContext / `flow` phases). Skipped entirely when
|
|
2385
|
+
// `flowDefHash === "failed"` (cross-run is disabled for the run anyway).
|
|
2386
|
+
// Never throws into the run — a per-phase error degrades that phase to the
|
|
2387
|
+
// whole-flow hash (safe, = pre-M6 behavior).
|
|
2388
|
+
if (state.flowDefHash !== "failed" && state.phaseFingerprints === undefined) {
|
|
2389
|
+
const whole = state.flowDefHash ?? "";
|
|
2390
|
+
const map: Record<string, string> = {};
|
|
2391
|
+
for (const p of def.phases) {
|
|
2392
|
+
try {
|
|
2393
|
+
map[p.id] = (await phaseFingerprint(def, p.id)) ?? whole;
|
|
2394
|
+
} catch {
|
|
2395
|
+
map[p.id] = whole; // fail-open → whole-flow scope
|
|
2396
|
+
}
|
|
2397
|
+
}
|
|
2398
|
+
state.phaseFingerprints = map;
|
|
2399
|
+
}
|
|
2400
|
+
|
|
2102
2401
|
state.status = "running";
|
|
2103
2402
|
safeEmit(deps, state);
|
|
2104
2403
|
|
|
@@ -2238,5 +2537,6 @@ async function runTaskflowLayers(state: RunState, deps: RuntimeDeps): Promise<Ru
|
|
|
2238
2537
|
finalOutput,
|
|
2239
2538
|
ok: state.status === "completed",
|
|
2240
2539
|
totalUsage,
|
|
2540
|
+
reuse: summarizeReuse(state),
|
|
2241
2541
|
};
|
|
2242
2542
|
}
|
package/extensions/schema.ts
CHANGED
|
@@ -284,6 +284,12 @@ export const TaskflowSchema = Type.Object(
|
|
|
284
284
|
"Enable the Shared Context Tree for ALL phases in this flow (shorthand for setting shareContext on every phase). Default false.",
|
|
285
285
|
}),
|
|
286
286
|
),
|
|
287
|
+
incremental: Type.Optional(
|
|
288
|
+
Type.Boolean({
|
|
289
|
+
description:
|
|
290
|
+
"Default every phase to cross-run caching (scope:'cross-run') so re-running this flow reuses unchanged phases across runs/sessions. Equivalent to setting cache:{scope:'cross-run'} on every phase; per-phase cache settings and the cross-run-blocked types (gate/approval/loop/tournament) still take precedence. Default false (run-only — each run starts fresh unless a phase opts in). A run-time `incremental` argument overrides this.",
|
|
291
|
+
}),
|
|
292
|
+
),
|
|
287
293
|
phases: Type.Array(PhaseSchema, { minItems: 1, description: "Ordered phase definitions (DAG via dependsOn)" }),
|
|
288
294
|
},
|
|
289
295
|
{ additionalProperties: false },
|
|
@@ -855,6 +861,37 @@ export function dependenciesOf(phase: Phase): string[] {
|
|
|
855
861
|
return Array.from(set);
|
|
856
862
|
}
|
|
857
863
|
|
|
864
|
+
/**
|
|
865
|
+
* Transitive upstream dependency closure of a phase: every id reachable via
|
|
866
|
+
* `dependsOn ∪ from`, including indirect ancestors. Cycle-safe (visited set).
|
|
867
|
+
* Returns the closure EXCLUDING `phaseId` itself. Sorted for deterministic
|
|
868
|
+
* hashing. Shares the exact edge semantics with `topoLayers`/`detectCycle` so
|
|
869
|
+
* the closure is complete for every valid flow (validation already rejects
|
|
870
|
+
* `{steps.X}` refs that aren't reachable via these edges, except for
|
|
871
|
+
* `join: "any"` phases — handled by callers as needed).
|
|
872
|
+
*
|
|
873
|
+
* Hoisted out of `validateTaskflow` so `phaseFingerprint` (M6) and validation
|
|
874
|
+
* share one source of truth for "what does this phase structurally depend on".
|
|
875
|
+
*/
|
|
876
|
+
export function transitiveDependencies(phases: Phase[], phaseId: string): string[] {
|
|
877
|
+
const byId = new Map(phases.map((p) => [p.id, p]));
|
|
878
|
+
const seen = new Set<string>();
|
|
879
|
+
const queue: string[] = [];
|
|
880
|
+
const seed = byId.get(phaseId);
|
|
881
|
+
if (seed) for (const d of dependenciesOf(seed)) queue.push(d);
|
|
882
|
+
while (queue.length) {
|
|
883
|
+
const id = queue.shift()!;
|
|
884
|
+
if (seen.has(id)) continue;
|
|
885
|
+
if (!byId.has(id)) continue; // unknown dep — validation reports elsewhere
|
|
886
|
+
seen.add(id);
|
|
887
|
+
const dep = byId.get(id)!;
|
|
888
|
+
for (const d of dependenciesOf(dep)) {
|
|
889
|
+
if (!seen.has(d)) queue.push(d);
|
|
890
|
+
}
|
|
891
|
+
}
|
|
892
|
+
return Array.from(seen).sort();
|
|
893
|
+
}
|
|
894
|
+
|
|
858
895
|
/** Topologically ordered layers; phases in the same layer can run concurrently. */
|
|
859
896
|
export function topoLayers(phases: Phase[]): Phase[][] {
|
|
860
897
|
const byId = new Map(phases.map((p) => [p.id, p]));
|
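Review note: `transitiveDependencies` is what makes "editing B invalidates only B and its dependents" computable. The sketch below re-implements the same breadth-first closure on a reduced phase shape and derives the set of phases affected by an edit. `MiniPhase`, `closureOf`, and `invalidatedBy` are hypothetical names, and treating `from` as an array of ids is an assumption made for the example.

```typescript
// Reduced phase shape; `from` as string[] is an assumption for this sketch.
interface MiniPhase { id: string; dependsOn?: string[]; from?: string[] }

// Direct edges: dependsOn ∪ from, de-duplicated.
function directDeps(p: MiniPhase): string[] {
  return Array.from(new Set([...(p.dependsOn ?? []), ...(p.from ?? [])]));
}

// Transitive upstream closure, sorted for deterministic hashing.
function closureOf(phases: MiniPhase[], phaseId: string): string[] {
  const byId = new Map(phases.map((p): [string, MiniPhase] => [p.id, p]));
  const seen = new Set<string>();
  const queue: string[] = [];
  const start = byId.get(phaseId);
  if (start) queue.push(...directDeps(start));
  while (queue.length) {
    const id = queue.shift() as string;
    const dep = byId.get(id);
    if (seen.has(id) || !dep) continue; // already visited, or unknown id
    seen.add(id);
    queue.push(...directDeps(dep));
  }
  return Array.from(seen).sort();
}

// Phases whose fingerprint would move when `edited` changes:
// the phase itself plus every phase that has it in its closure.
function invalidatedBy(phases: MiniPhase[], edited: string): string[] {
  return phases
    .filter((p) => p.id === edited || closureOf(phases, p.id).indexOf(edited) !== -1)
    .map((p) => p.id)
    .sort();
}

const demoPhases: MiniPhase[] = [
  { id: "a" },
  { id: "b" },
  { id: "c", dependsOn: ["b"] },
  { id: "d", dependsOn: ["c"], from: ["a"] },
];

console.log(closureOf(demoPhases, "d")); // ["a", "b", "c"]
console.log(invalidatedBy(demoPhases, "b")); // ["b", "c", "d"]
```

In this example the independent sibling `a` is not in the invalidated set, so it would keep its cache hit after an edit to `b`.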
package/extensions/store.ts
CHANGED
|
@@ -42,10 +42,11 @@ export interface PhaseState {
|
|
|
42
42
|
model?: string;
|
|
43
43
|
error?: string;
|
|
44
44
|
inputHash?: string;
|
|
45
|
-
/** When this result was served from cache
|
|
46
|
-
* cross-run
|
|
47
|
-
*
|
|
48
|
-
|
|
45
|
+
/** When this result was served from cache instead of executed:
|
|
46
|
+
* 'cross-run' = restored from the persistent cross-run store;
|
|
47
|
+
* 'run-only' = within-run resume (a prior attempt with the same inputHash).
|
|
48
|
+
* A phase with this set spent no new tokens this run. */
|
|
49
|
+
cacheHit?: "cross-run" | "run-only";
|
|
49
50
|
startedAt?: number;
|
|
50
51
|
endedAt?: number;
|
|
51
52
|
/** Live fan-out progress for map/parallel phases. */
|
|
@@ -114,6 +115,13 @@ export interface RunState {
|
|
|
114
115
|
* recompute derives this fresh from `def` so old runs (pre-H1) also get
|
|
115
116
|
* union semantics. */
|
|
116
117
|
declaredDeps?: Record<string, DeclaredDeps>;
|
|
118
|
+
/** Per-phase structural sub-fingerprints (M6). Computed once per run
|
|
119
|
+
* alongside `flowDefHash`. Each value is either a precise per-phase hash
|
|
120
|
+
* (when sound) or the whole-flow `flowDefHash` (fallback for
|
|
121
|
+
* shareContext / `flow` phases). Folded into the cross-run cache key as
|
|
122
|
+
* `v3:phasefp:<subfp>` so editing phase B invalidates only B + its
|
|
123
|
+
* transitive dependents. Audit/resume only — recompute derives fresh. */
|
|
124
|
+
phaseFingerprints?: Record<string, string>;
|
|
117
125
|
}
|
|
118
126
|
|
|
119
127
|
// ---------------------------------------------------------------------------
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "pi-taskflow",
|
|
3
|
-
"version": "0.0.
|
|
3
|
+
"version": "0.0.28",
|
|
4
4
|
"description": "A declarative, verifiable graph of task nodes for the Pi coding agent — not a workflow you script, but a DAG you declare: statically verified before it runs, with dynamic fan-out, gates, isolated subagent context, resumable runs, and saveable commands.",
|
|
5
5
|
"keywords": [
|
|
6
6
|
"pi-package",
|
package/skills/taskflow/SKILL.md
CHANGED
|
@@ -549,10 +549,58 @@ Quick reference:
|
|
|
549
549
|
|
|
550
550
|
- **Flow:** `name`, `description`, `concurrency` (default 8), `budget` (`maxUSD`/`maxTokens`), `agentScope` (user|project|both), `args`, `strictInterpolation`.
|
|
551
551
|
- **Phase:** `model`, `thinking`, `tools` (whitelist), `cwd`, `output:"json"`, `concurrency` (map/parallel fan-out), `when`, `join` (all|any), `retry`, `use`/`with` (flow), `optional` (fail-soft — a failed/blocked phase won't abort the run), `final`.
|
|
552
|
-
- **Cross-run caching:** add `cache: { "scope": "cross-run" }` to a phase to memoize its output across runs (same input → instant reuse, zero tokens). See `configuration.md` for `ttl`, `fingerprint` (git/glob/file/env invalidation), and
|
|
552
|
+
- **Cross-run caching:** add `cache: { "scope": "cross-run" }` to a phase to memoize its output across runs (same input → instant reuse, zero tokens), or set `incremental: true` at the flow level (or pass `incremental: true` to `run`) to default every phase to cross-run reuse. See `configuration.md` for `ttl`, `fingerprint` (git/glob/file/env invalidation), scope options, and the `incremental` precedence rules.
|
|
553
553
|
- **Precedence (model/thinking/tools):** phase value → agent frontmatter (resolved via `modelRoles`) → global/default.
|
|
554
554
|
- **Concurrency:** same-layer phases use `flow.concurrency`; a `map`/`parallel` phase uses `phase.concurrency ?? flow.concurrency ?? 8`.
|
|
555
555
|
|
|
556
|
+
### Per-item map caching (cross-run)
|
|
557
|
+
|
|
558
|
+
A `map` phase with `cache: { "scope": "cross-run" }` is cached **per item**, not
|
|
559
|
+
just as a whole. When one of N items changes between runs, only that item
|
|
560
|
+
re-executes — the other N−1 are served from the cross-run cache for $0.
|
|
561
|
+
|
|
562
|
+
```jsonc
|
|
563
|
+
{ "id": "audit-each", "type": "map",
|
|
564
|
+
"over": "{steps.discover.json.files}", // array from an upstream phase
|
|
565
|
+
"task": "audit {item}",
|
|
566
|
+
"cache": { "scope": "cross-run" }, // ← enables per-item reuse
|
|
567
|
+
"dependsOn": ["discover"], "final": true }
|
|
568
|
+
```
|
|
569
|
+
|
|
570
|
+
How it works:
|
|
571
|
+
|
|
572
|
+
- The **whole-map** entry is still checked first (fast path): an identical
|
|
573
|
+
re-run is a single $0 hit and never enters the fan-out.
|
|
574
|
+
- On a whole-map miss, each item is looked up individually before it spawns a
|
|
575
|
+
subagent; a hit returns a 0-token synthesized result. Successful fresh items
|
|
576
|
+
are recorded so a later run with that item unchanged reuses them.
|
|
577
|
+
- Per-item keys fold the item's resolved task **and agent** (so changing
|
|
578
|
+
`phase.agent` invalidates every item), plus the phase sub-fingerprint,
|
|
579
|
+
`thinking`/`tools`, and any `fingerprint` entries — exactly like a standalone
|
|
580
|
+
cross-run phase.
|
|
581
|
+
|
|
582
|
+
Automatic fallbacks (per-item disables and the whole-map path is used):
|
|
583
|
+
|
|
584
|
+
- `shareContext: true` on the phase, or flow-wide `contextSharing: true` — a
|
|
585
|
+
sharing item can read sibling blackboard writes outside its declared deps, so
|
|
586
|
+
the per-item key would under-approximate real reads.
|
|
587
|
+
- The map runs **inside a runtime-generated sub-flow** (a `flow { def }` phase
|
|
588
|
+
or a `ctx_spawn({subflow})`) — untrusted / possibly non-deterministic.
|
|
589
|
+
- `scope: "run-only"` (default) or `"off"` — no persistent store to reuse from.
|
|
590
|
+
|
|
591
|
+
Notes & limitations:
|
|
592
|
+
|
|
593
|
+
- Duplicate items (identical task + agent) share a single entry — reuse is
|
|
594
|
+
content-addressable, not positional.
|
|
595
|
+
- Failed items and **budget-skipped** items are never cached, so they always
|
|
596
|
+
re-execute on the next run.
|
|
597
|
+
- `{steps.<map>.json[k]}` indexes the k-th **successful** item (not the k-th
|
|
598
|
+
position in `over`); the merged `output` text, however, IS positionally
|
|
599
|
+
aligned with `over` (labels read `[k/N]`).
|
|
600
|
+
- Within-run resume of a partially-completed map is not supported (only
|
|
601
|
+
fully-completed maps resume within a run); cross-run per-item reuse covers the
|
|
602
|
+
common case.
|
|
603
|
+
|
|
556
604
|
## Actions
|
|
557
605
|
|
|
558
606
|
- `action: "run"` — run an inline `define` (a one-off DAG) **or** a saved `name` (with optional `args`). Use `define` for an ad-hoc flow; use `name` to invoke something previously saved. Add `detach: true` to run in the background (returns immediately with the runId; poll the store for status).
|
|
@@ -283,6 +283,28 @@ for the design.
|
|
|
283
283
|
| `cross-run` | Reuse an identical-input result from **any** prior run (the persistent store). |
|
|
284
284
|
| `off` | Never reuse, even within a run (force re-execution every time). |
|
|
285
285
|
|
|
286
|
+
### Flow-wide opt-in: `incremental`
|
|
287
|
+
|
|
288
|
+
Rather than annotating every phase with `cache: { "scope": "cross-run" }`, set
|
|
289
|
+
`incremental: true` at the **flow** level (or pass `incremental: true` as the
|
|
290
|
+
`run` tool argument) to default *every* phase to cross-run reuse:
|
|
291
|
+
|
|
292
|
+
```jsonc
|
|
293
|
+
{
|
|
294
|
+
"name": "audit",
|
|
295
|
+
"incremental": true, // ← every phase defaults to scope:"cross-run"
|
|
296
|
+
"phases": [ /* ... */ ]
|
|
297
|
+
}
|
|
298
|
+
```
|
|
299
|
+
|
|
300
|
+
Precedence: the invocation `incremental` argument wins over the flow's
|
|
301
|
+
`incremental` field, which is in turn overridden by any **per-phase** `cache`
|
|
302
|
+
setting. The cross-run-blocked phase types (`gate`/`approval`/`loop`/
|
|
303
|
+
`tournament`) and all per-phase soundness fallbacks still apply. The default
|
|
304
|
+
remains `run-only` (each run starts fresh unless something opts in), because
|
|
305
|
+
cross-run reuse silently persists outputs and can serve stale results for phases
|
|
306
|
+
whose agents read files at runtime.
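
One way to read the precedence rule is sketched below. The function and type names (`resolveScope`, `PhaseLike`, `FlowLike`) are hypothetical, and the ordering assumed here (per-phase `cache` first, then the invocation argument, then the flow field, then the `run-only` default) is an interpretation of the paragraph above, not confirmed behavior. The cross-run-blocked phase types and the per-phase soundness fallbacks are deliberately not modeled.

```typescript
type CacheScope = "run-only" | "cross-run" | "off";

// Hypothetical minimal shapes for illustration only.
interface PhaseLike { id: string; cache?: { scope?: CacheScope } }
interface FlowLike { incremental?: boolean; phases: PhaseLike[] }

function resolveScope(
  phase: PhaseLike,
  flow: FlowLike,
  invocationIncremental?: boolean, // the `run` tool's `incremental` argument
): CacheScope {
  // An explicit per-phase setting overrides any flow-wide default.
  if (phase.cache?.scope !== undefined) return phase.cache.scope;
  // The invocation argument wins over the flow's own field.
  const incremental = invocationIncremental ?? flow.incremental ?? false;
  // With nothing opting in, the default stays run-only.
  return incremental ? "cross-run" : "run-only";
}
```

Under this reading, passing `incremental: false` at invocation turns reuse off for a flow that declares `incremental: true`, while a phase with an explicit `cache.scope` keeps its own setting either way.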
|
|
307
|
+
|
|
286
308
|
### `ttl` (cross-run only)
|
|
287
309
|
|
|
288
310
|
Max age before a cross-run hit is treated as a miss: e.g. `"30m"`, `"6h"`, `"7d"`.
|