opencode-plugin-flow 3.0.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,2661 @@
1
+ # Changelog
2
+
3
+ ## [Unreleased]
4
+
5
+ ## [3.0.0] - 2026-06-12
6
+
7
+ The skills-first inversion: skills carry the workflow, the plugin shrinks to a state backend
8
+
9
+ Flow 3.0.0 completes Phases 2–4 of the skills-first overhaul plan (`docs/plans/skills-first-overhaul-2026-06-12.md`). Four hand-authored skills (`flow`, `flow-plan`, `flow-run`, `flow-review`, with `references/` rubrics, worked examples, and a recovery playbook) are now the primary instruction surface; they are checked into the repo, synced at startup, user-editable, and per-project overridable. The generated prompt/skill/contract projections, mode contracts, capture scripts, and parity tests are deleted — there is no generated source left.
10
+
11
+ The tool surface consolidates from 18 tools to 7: `flow_status` (state, readiness, and a computed next step — absorbing `flow_doctor` and `flow_auto_prepare`), `flow_plan_save`, `flow_plan_approve`, `flow_run_start`, `flow_feature_complete`, `flow_review_record`, and `flow_session`. The 15 retired v2 tool names stay registered for one minor cycle as hidden redirect stubs that return an error naming the replacement tool and its key arguments, so resumed v2 sessions degrade gracefully; the stubs are scheduled for removal in v3.1. Agents collapse from 6 to 1 (`flow-reviewer`, read-only via native per-agent permissions); the 9 command names are unchanged and are now thin pointers into the skills. The session schema stays at v1, so existing `.flow/**` sessions resume under 3.0.0 unchanged (covered by a v2-session fixture test).
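A minimal sketch of how a hidden redirect stub for a retired tool name might behave. The error shape, the `redirectRetiredTool` helper, and the `tool_retired` code are hypothetical; only the two redirects into `flow_status` are taken from the entry above, and the key-argument lists are left empty because the changelog does not name them.

```typescript
// Hypothetical shapes: the changelog does not publish the stub's real error format.
type RetiredToolError = {
  status: "error";
  code: "tool_retired";
  retiredTool: string;
  replacement: string;
  keyArguments: string[];
  message: string;
};

// Only the two redirects the changelog states explicitly. The argument lists
// are left empty because the changelog does not name them.
const RETIRED_TOOLS = new Map<string, { replacement: string; keyArguments: string[] }>([
  ["flow_doctor", { replacement: "flow_status", keyArguments: [] }],
  ["flow_auto_prepare", { replacement: "flow_status", keyArguments: [] }],
]);

function redirectRetiredTool(name: string): RetiredToolError | undefined {
  const target = RETIRED_TOOLS.get(name);
  if (target === undefined) return undefined; // not a retired name
  const args = target.keyArguments.length > 0 ? target.keyArguments.join(", ") : "none";
  return {
    status: "error",
    code: "tool_retired",
    retiredTool: name,
    replacement: target.replacement,
    keyArguments: target.keyArguments,
    message: `${name} was retired in v3; call ${target.replacement} instead (key arguments: ${args}).`,
  };
}
```

Returning a structured error rather than silently forwarding the call keeps a resumed v2 session from acting on stale argument shapes while still telling the agent where to go next.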
12
+
13
+ The runtime now enforces four hard invariants directly instead of a nine-gate matrix: a feature cannot complete without recorded validation evidence; a session cannot close as `completed` while planned features are below the plan's completion target (the close returns a structured `unfinished_features` error); an approved plan cannot be mutated without an explicit reset; and a strict review policy requires a recorded reviewer decision before completion. `flow_status` stays readable when a persisted session id is malformed — the `session_artifacts` readiness check degrades to a failing check with remediation instead of crashing. Stack-standards profiling, the review scope/coverage accounting layers, and the `flow-auto` coordination lane are deleted; repo profiling and review judgment move to the skills.
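Two of the four invariants can be sketched as plain guard functions. This is an illustration under stated assumptions, not Flow's runtime code: the `Feature` and `Session` shapes, the reading of the completion target as a count of completed features, and the function names are all hypothetical. Only the `unfinished_features` error name comes from the entry above.

```typescript
// Hypothetical data shapes for illustration only.
type Feature = {
  id: string;
  status: "pending" | "completed";
  validationEvidence: string[];
};

type Session = {
  features: Feature[];
  completionTarget: number; // assumed: minimum number of completed features
};

type CloseResult =
  | { ok: true }
  | { ok: false; error: "unfinished_features"; unfinished: string[] };

// Invariant 1: a feature cannot complete without recorded validation evidence.
function canCompleteFeature(feature: Feature): boolean {
  return feature.validationEvidence.some((entry) => entry.trim().length > 0);
}

// Invariant 2: a session cannot close as completed below the completion target.
function closeAsCompleted(session: Session): CloseResult {
  const unfinished = session.features
    .filter((feature) => feature.status !== "completed")
    .map((feature) => feature.id);
  const completedCount = session.features.length - unfinished.length;
  if (completedCount >= session.completionTarget) return { ok: true };
  return { ok: false, error: "unfinished_features", unfinished };
}
```

Expressing each invariant as its own small predicate is one way to keep the checks independent of one another, in contrast to a gate matrix where the conditions interact.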
14
+
15
+ The check pipeline drops from ~20 gate steps to 6 (typecheck, lint, build, test, install smoke, bundle sanity); the bench suite stays runnable outside `check`. A new manual golden-transcript eval lane (`bun run evals:golden`) drives `opencode run` headless against fixture repos and asserts outcomes from the persisted `.flow/**` state; it needs a model key and is not part of CI. The minified bundle is ~166 KB (budget 200 KiB), down from ~307 KB at 2.1.0 and 752 KB at v2.0.56.
16
+
17
+ Install pin changes to `"plugin": ["opencode-plugin-flow@3"]`. Restart OpenCode once after upgrading so re-synced skills are discovered.
18
+
19
+ Not-tested: Live OpenCode UI runtime interaction; the golden-transcript eval lane against a real model (harness dry-run only).
20
+
21
+ ## [2.1.0] - 2026-06-12
22
+
23
+ npm distribution, Promise-based permission API, and the skills-first overhaul Phase 1
24
+
25
+ Flow 2.1.0 is the first release of the skills-first overhaul plan (`docs/plans/skills-first-overhaul-2026-06-12.md`) and changes how Flow is installed without changing workflow behavior.
26
+
27
+ Flow is now distributed through npm: add `"plugin": ["opencode-plugin-flow@2"]` to `opencode.json` and OpenCode installs the package (with `zod` resolved as a regular dependency) at startup. The curl installer, the bundled `~/.config/opencode/plugins/flow.js` slot, the release skill tarball, and `src/installer.ts` are retired. A pre-npm `flow.js` copy keeps working for this minor cycle, but plugin startup and `/flow-doctor` warn about the double-load risk. `bunx opencode-plugin-flow uninstall` (new `dist/cli.js` bin) removes Flow-owned global skills and the pre-npm copy, and prints the `opencode.json` cleanup step.
28
+
29
+ Generated global skills are now synced by the plugin at startup with folder marker files (`.flow-skill-version`) instead of being copied by an installer: folders without the Flow marker are never touched, pristine pre-npm hash-locked installs are adopted in place, and a user-edited `SKILL.md` is backed up to `SKILL.md.backup` before an update replaces it. Restart OpenCode once after the first install or an update so newly synced skills are discovered.
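The sync rules above can be read as a small decision table. The sketch below is one possible interpretation, not the plugin's implementation: the `SkillFolder` fields, the action names, and `planSkillSync` are hypothetical, and it assumes that a pristine pre-npm install is the only marker-less folder that may be adopted.

```typescript
// Hypothetical view of one skill folder on disk.
type SkillFolder = {
  hasFlowMarker: boolean;          // a `.flow-skill-version` file is present
  isPristinePreNpmInstall: boolean; // assumed: content matches a known pre-npm hash
  skillFileEdited: boolean;        // `SKILL.md` differs from what Flow last wrote
  markerVersion?: string;
};

type SyncAction = "skip" | "adopt" | "up_to_date" | "update" | "backup_then_update";

function planSkillSync(folder: SkillFolder, packageVersion: string): SyncAction {
  if (!folder.hasFlowMarker) {
    // Folders Flow does not own are never touched, except pristine pre-npm
    // installs, which are adopted in place.
    return folder.isPristinePreNpmInstall ? "adopt" : "skip";
  }
  if (folder.markerVersion === packageVersion) return "up_to_date";
  // A user-edited SKILL.md is backed up to SKILL.md.backup before replacement.
  return folder.skillFileEdited ? "backup_then_update" : "update";
}
```

Keeping the decision pure (no file I/O) makes the ownership rules easy to test separately from the code that actually copies files.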
30
+
31
+ The `effect` beta dependency is gone: `@opencode-ai/plugin` is pinned at 1.17.3, where `ToolContext.ask` returns a `Promise`, so the hidden-workspace permission flow now awaits the host directly and refuses the mutation if a pre-1.15.5 host hands back a non-Promise. With `zod` external and full minification the plugin bundle drops from 752 KB to ~307 KB (gate at 320 KB; the sub-100 KB target lands with the Phase 2/3 runtime simplification).
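The "await the host, refuse on a non-Promise" rule might look roughly like the following. The `ask` callback here is a stand-in with an assumed zero-argument signature, not the real `ToolContext.ask` from `@opencode-ai/plugin`; `requirePromise` and `askThenMutate` are hypothetical helper names.

```typescript
// Throws if the host handed back anything other than a Promise.
function requirePromise(result: unknown): Promise<unknown> {
  if (!(result instanceof Promise)) {
    throw new Error("Host returned a non-Promise from ask(); refusing the mutation.");
  }
  return result;
}

// Assumed contract: the returned Promise rejects when permission is denied,
// so `mutate` only runs after the host has resolved the request.
async function askThenMutate(ask: () => unknown, mutate: () => void): Promise<void> {
  await requirePromise(ask());
  mutate();
}
```

Failing closed on an unexpected return type means an older host cannot accidentally let a hidden-workspace mutation through without an answered permission request.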
32
+
33
+ The install smoke now runs against the packed npm tarball end-to-end (pack → extract → plugin startup → skill sync markers → tool session → uninstall CLI), the release workflow publishes to npm (requires the `NPM_TOKEN` secret) and attaches the tarball to the GitHub release, and smoke/eval evidence moved from `prompt-exports/` to the gitignored `.release-artifacts/`.
34
+
35
+ Not-tested: Live OpenCode UI runtime interaction; npm registry installation through a real OpenCode host (publish-and-install spike pending).
36
+
37
+ ## [2.0.56] - 2026-06-01
38
+
39
+ Force review target rendering through the installed review surface
40
+
41
+ Flow 2.0.56 follows the 2.0.55 provenance grounding release with a narrower installation-surface hardening patch. A live broad review from the newest plugin could still omit the `Review target` section because the model-facing audit flow could hand-write the final response or call the renderer without target provenance.
42
+
43
+ The `/flow-review` command and auditor agent now share explicit renderer-backed final-output rules: build the structured ledger, call `flow_review_render`, include `reviewTarget` for the repository actually reviewed, and return the renderer's `report` field verbatim. The render tool now rejects missing `reviewTarget` for human-facing output, including default, `human`, and `both` views, while preserving target-less `structured` output for legacy raw JSON compatibility.
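The conditional requirement on `reviewTarget` can be sketched as a single check. The request and result shapes below are hypothetical; the view names and the rule (default, `human`, and `both` need a target, `structured` does not) come from the entry above.

```typescript
type RenderView = "human" | "structured" | "both";

// Hypothetical request shape; the real tool schema is not shown in the changelog.
type RenderRequest = {
  view?: RenderView;
  reviewTarget?: { repoRoot: string };
};

type RenderCheck = { ok: true } | { ok: false; error: string };

function checkRenderRequest(request: RenderRequest): RenderCheck {
  const humanFacing = request.view !== "structured"; // default, human, both
  if (humanFacing && request.reviewTarget === undefined) {
    const label = request.view ?? "default";
    return { ok: false, error: `reviewTarget is required for the ${label} view` };
  }
  return { ok: true };
}
```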
44
+
45
+ The tool guidance now advertises the same conditional requirement so the model-facing tool description and runtime validation agree. If rendering fails, the prompt no longer invites a full hand-written substitute review; it limits fallback output to the tool failure, minimal target/provenance details, and retry guidance.
46
+
47
+ Constraint: Human-facing review output must be rendered through `flow_review_render` with explicit reviewed-target provenance
48
+ Constraint: Preserve target-less `view: structured` compatibility for legacy raw JSON report consumers
49
+ Constraint: Keep the patch limited to review rendering guidance, adapter schema enforcement, and regression coverage
50
+ Rejected: Make `reviewTarget` mandatory in the base `ReviewReportSchema` | that would break structured legacy reports and broaden the migration beyond the installed human-review path
51
+ Rejected: Let auditor prompts hand-write a full fallback report after render failure | that reopens the path that omitted target provenance
52
+ Rejected: Add new commands, runtime tools, prompt modes, install paths, package exports, dependencies, or persisted session migrations | this release hardens the existing review surface only
53
+ Confidence: high
54
+ Scope-risk: low
55
+ Reversibility: clean
56
+ Directive: Treat review target provenance as a renderer-enforced contract for human-facing audit output; future review prompt changes should preserve the tool-backed final response path
57
+ Tested: `bun test tests/config/prompt-contracts.test.ts tests/prompt-snapshot.test.ts tests/config/tool-schemas.test.ts tests/runtime-review-render.test.ts && bun run typecheck && bun run lint`; `bun run check` (passed for package `2.0.56`, including typecheck, review/prompt capture checks, dependency contract, architecture seams, fresh-surface terminology, deadcode and export budget checks, build, release hygiene, pack invariants, completion lane, replay tests, cold-start budget, bundle sanity, full suite, lint, bench smoke, and bench gate); `bun run smoke:release` (passed for package `2.0.56`, generated release assets and OpenCode-oriented smoke evidence under `prompt-exports/release-smoke/`, real OpenCode CLI/UI not invoked)
58
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.56` before push
59
+
60
+ ## [2.0.55] - 2026-06-01
61
+
62
+ Ground broad reviews in target provenance
63
+
64
+ Flow 2.0.55 tightens the review ledger after a broad codebase review exposed a confusing provenance failure mode: review output could mix findings from one repository with a file-map or context artifact from another. Audit reports can now record the reviewed target, reviewed context artifacts, and structured finding locations so final review lore has a concrete repository identity instead of relying on prose alone.
65
+
66
+ The report schema now validates safe repo-relative finding paths, rejects context artifacts unless a review target is present, and refuses context artifacts whose `repoRoot` does not match the recorded review target. Human review output now renders the target repository, branch/commit metadata when available, context artifact provenance, primary finding locations, and related-location reasons.
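As a rough illustration of the path and provenance rules, the sketch below rejects absolute, URI, home-relative, and root-escaping paths, and requires context artifacts to share the review target's root. The function names and the exact rejection rules are assumptions; the real schema may be stricter or differ in detail.

```typescript
// Returns true only for paths that stay inside the reviewed repository root.
function isSafeRepoRelativePath(candidate: string): boolean {
  if (candidate.length === 0) return false;
  if (candidate.startsWith("/") || candidate.startsWith("\\")) return false; // absolute
  if (/^[A-Za-z][A-Za-z0-9+.-]*:/.test(candidate)) return false; // URI scheme or drive letter
  if (candidate.startsWith("~")) return false; // home-relative

  // Walk the segments and reject any point where ".." climbs above the root.
  let depth = 0;
  for (const segment of candidate.split(/[\\/]+/)) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      depth -= 1;
      if (depth < 0) return false;
    } else {
      depth += 1;
    }
  }
  return true;
}

// Context artifacts need a review target, and must point at the same root.
function artifactsMatchTarget(
  reviewTarget: { repoRoot: string } | undefined,
  artifacts: { repoRoot: string }[],
): boolean {
  if (artifacts.length === 0) return true;
  if (reviewTarget === undefined) return false;
  return artifacts.every((artifact) => artifact.repoRoot === reviewTarget.repoRoot);
}
```

Note that this sketch allows `a/../b`, since it never leaves the root; a stricter validator could reject any `..` segment outright.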
67
+
68
+ The release deliberately keeps this as an audit-report grounding patch. It does not add commands, runtime tools, prompt modes, install paths, package exports, dependencies, or a persisted session migration. `reviewTarget` remains optional for legacy reports, but any report that supplies context artifacts must bind them to the same target root.
69
+
70
+ Constraint: Keep broad-review findings tied to explicit repository provenance when context artifacts are present
71
+ Constraint: Use safe repo-relative finding paths only; reject absolute paths, URI paths, home-relative paths, and traversal outside the reviewed root
72
+ Constraint: Preserve compatibility for legacy audit reports by making `reviewTarget` optional unless context artifacts are supplied
73
+ Rejected: Make `reviewTarget` mandatory for every historical report in this patch | that would turn a grounding fix into a broader schema migration
74
+ Rejected: Accept file maps or context artifacts from a different repo root | mixed provenance was the reviewed failure mode this release closes
75
+ Rejected: Add new commands, runtime tools, prompt modes, install paths, package exports, or dependencies | the change is audit schema, prompt-contract, presenter, and regression coverage only
76
+ Confidence: high
77
+ Scope-risk: low
78
+ Reversibility: clean
79
+ Directive: Keep review evidence grounded in the actual reviewed repository; future audit-report expansions should preserve explicit target identity before rendering final findings
80
+ Tested: `bun test tests/runtime/audit-report-provenance.test.ts tests/runtime/evidence-packets.test.ts tests/config/tool-schemas.test.ts tests/prompt-snapshot.test.ts tests/prompt-behavior-eval.test.ts && bun run typecheck && bun run lint`; `bun run check` (passed for package `2.0.55`, including typecheck, review/prompt capture checks, dependency contract, architecture seams, fresh-surface terminology, deadcode and export budget checks, build, release hygiene, pack invariants, completion lane, replay tests, cold-start budget, bundle sanity, full suite `693 pass`, lint, bench smoke, and bench gate); `bun run smoke:release` (passed for package `2.0.55`, generated release assets and OpenCode-oriented smoke evidence under `prompt-exports/release-smoke/`, real OpenCode CLI/UI not invoked)
81
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.55` before push
82
+
83
+ ## [2.0.54] - 2026-06-01
84
+
85
+ Patch compact-context cleanup after the OpenCode rebuild
86
+
87
+ Flow 2.0.54 follows the already-published 2.0.53 rebuild with a narrow safety patch for compact OpenCode context replacement. The adapter now removes stale pre-rebuild Flow bullets such as prior summaries, next steps, next commands, latest validation, and standards-profile lines before appending the fresh active-session context.
88
+
89
+ The regression coverage now proves that host-owned context and non-Flow handoff lines survive the cleanup while obsolete Flow-managed bullets are stripped from both system-transform and session-compacting paths. This keeps the lighter default coding flow from carrying stale runtime instructions forward after compaction.
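The cleanup described above amounts to filtering out only the lines Flow owns before appending fresh context. In this sketch the bullet prefixes are invented for illustration (the changelog names the categories, not the literal strings), and `replaceFlowContext` is a hypothetical helper.

```typescript
// Hypothetical prefixes standing in for Flow-managed bullet lines.
const FLOW_MANAGED_PREFIXES = [
  "- Flow summary:",
  "- Flow next step:",
  "- Flow next command:",
  "- Flow latest validation:",
  "- Flow standards profile:",
];

function isFlowManaged(line: string): boolean {
  const trimmed = line.trimStart();
  return FLOW_MANAGED_PREFIXES.some((prefix) => trimmed.startsWith(prefix));
}

// Drops stale Flow-managed bullets, keeps everything else in order,
// then appends the fresh active-session context.
function replaceFlowContext(existing: string[], fresh: string[]): string[] {
  return [...existing.filter((line) => !isFlowManaged(line)), ...fresh];
}
```

Matching on an explicit allow-list of Flow-owned prefixes, rather than on "any bullet line", is what lets host-owned and non-Flow handoff lines survive.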
90
+
91
+ The release deliberately does not rewrite the published `v2.0.53` tag or GitHub release. It preserves the public package name, install paths, tool names, generated skill names, SDK/zod boundary, and the ordinary-vs-strict completion policy introduced in 2.0.53.
92
+
93
+ Constraint: Keep `v2.0.53` immutable and publish this fix as a normal follow-up patch release
94
+ Constraint: Strip only Flow-managed compact-context bullets; retain host-owned and non-Flow handoff context
95
+ Constraint: Keep the OpenCode rebuild public surface unchanged from 2.0.53
96
+ Rejected: Force-push `main` or move the published `v2.0.53` tag | released artifacts should remain immutable once published
97
+ Rejected: Broaden compaction cleanup to arbitrary bullet lines | that could delete host context Flow does not own
98
+ Rejected: Add new commands, tools, modes, install paths, or package exports for this patch | the fix is release hygiene plus stale-context filtering only
99
+ Confidence: high
100
+ Scope-risk: low
101
+ Reversibility: clean
102
+ Directive: Treat release tags as immutable; when a post-tag fix is needed, publish a patch release with focused lore and regression coverage rather than rewriting the prior release
103
+ Tested: `bun test tests/config/plugin-surface.test.ts`; `bun run check`; `bun run smoke:release`
104
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.54` before push
105
+
106
+ ## [2.0.53] - 2026-06-01
107
+
108
+ Rebuild the OpenCode plugin default flow
109
+
110
+ Flow 2.0.53 keeps the public package and install story unchanged: the package remains `opencode-plugin-flow`, the package entry remains `dist/index.js`, the global OpenCode plugin target remains `~/.config/opencode/plugins/flow.js`, and generated skills still install under `~/.config/opencode/skills`.
111
+
112
+ The default coding workflow is lighter. Ordinary implementation sessions now use compact active-session context, slimmer command/agent prompts, targeted validation plus `featureReview` payloads for feature completion, and broad validation plus `finalReview` payloads for final completion without requiring separately recorded reviewer decisions on the ordinary path.
113
+
114
+ Strict review remains available and enforced for `review`, `review_and_fix`, and explicit `deliveryPolicy.strictReview` sessions. Those paths still require reviewer decisions and the applicable review-scope and finding-accounting evidence before completion.
115
+
116
+ Constraint: Keep public tool names, package exports, install paths, generated skill names, SDK/zod strictness, and snapshot persistence unchanged
117
+ Constraint: Keep ordinary completion validation-backed even when recorded reviewer decisions are no longer mandatory on the default implementation path
118
+ Constraint: Keep strict review/accounting gates required for review/review-and-fix/explicit strict sessions
119
+ Rejected: Rename tools, package, install path, or skills as part of the rebuild | this release is an in-place OpenCode plugin rebuild
120
+ Rejected: Delete stable public tools only to reduce surface counts | continuity-preserving stability is preferred over breaking existing OpenCode workflows
121
+ Rejected: Update `bench/BASELINE.md` for this pass | benchmark gate still passes against the existing baseline, and no new baseline convention was needed
122
+ Confidence: high
123
+ Scope-risk: medium
124
+ Reversibility: moderate
125
+ Directive: Keep future default-flow simplification budget-backed with cold-start, prompt/context, tool schema, and ordinary-vs-strict smoke coverage
126
+ Tested: `bun run check` passed for package `2.0.53` (typecheck; review and prompt capture checks; dependency contract; architecture seams; fresh-surface, deadcode, and deadcode-export budget checks; build; release hygiene; pack invariants OK for version `2.0.53`; completion lane; replay tests; cold-start budget; bundle sanity; full suite `689 pass`, `0 fail`, `1` snapshot, `7154` expect calls; lint; `bench:smoke`; and `bench:gate` with `12` baseline rows passed). `bun run smoke:release` passed for package `2.0.53`, generated release assets and OpenCode-oriented smoke evidence under `prompt-exports/release-smoke/`, and did not invoke a real OpenCode CLI/UI host.
127
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted CI/release workflow run for tag `v2.0.53` before push
128
+
129
+ ## [2.0.52] - 2026-05-31
130
+
131
+ Tighten the cleanup gate lore
132
+
133
+ Flow 2.0.52 turns the simplification pass into an enforced maintenance boundary. Unused value exports are now driven to zero and guarded by a deadcode export budget, schema barrels stop re-exporting private internals, prompt fragments keep only active generated-prompt inputs, and adapter/runtime helper exports are narrowed to the surfaces that are still actually consumed.
134
+
135
+ The release also removes the old final-review compatibility lore. Prior `oracleRefs`, `test_oracle`, and `test_oracle_authenticity` payload terms are rejected instead of canonicalized, final-review behavior schemas live in their own module, and detailed final-review gates require meaningful trimmed integration and regression evidence rather than whitespace-only placeholders.
136
+
137
+ The release deliberately does not add slash commands, runtime tools, prompt modes, state paths, package exports, installer behavior, dependencies, live OpenCode UI automation, or a persisted session schema version. It intentionally contracts internal and legacy review payload surfaces while preserving the existing `@opencode-ai/plugin` and `zod` compatibility boundary.
138
+
139
+ Constraint: Keep unused value exports at zero through `bun run check:deadcode-exports`
140
+ Constraint: Treat legacy final-review terms as invalid input, not compatibility aliases
141
+ Constraint: Keep `@opencode-ai/plugin` at `1.14.48` and `zod` at `4.1.8`; this release changes no dependency compatibility boundary
142
+ Rejected: Keep broad runtime/schema barrels for possible future callers | the package public API remains root-only, and unused internal exports were hiding ownership
143
+ Rejected: Preserve old final-review aliases after the cleanup | backwards compatibility is intentionally out of scope for this simplification release
144
+ Rejected: Count whitespace-only detailed-review checks as evidence | detailed final review requires meaningful integration and regression accounting
145
+ Confidence: high
146
+ Scope-risk: medium
147
+ Reversibility: moderate
148
+ Directive: Keep future simplification work budget-backed and source-owned; do not widen internal exports or revive compatibility aliases without a current consumer and release-note justification
149
+ Tested: `bun test tests/runtime/final-review-contracts.test.ts` (36 pass, 0 fail, 197 expect() calls); `bun run check` (release gate passed: dependency contract OK with project/plugin/root `zod=4.1.8`, architecture seams OK, fresh surfaces OK, deadcode export budget OK with `exports=0`, release hygiene OK, pack invariants OK for version `2.0.52`, bundle sanity OK, full suite 676 pass/0 fail, lint passed, bench smoke and bench gate passed); `bun run smoke:release` (passed for package `2.0.52`, wrote release-smoke evidence under `prompt-exports/release-smoke/`, real OpenCode CLI not invoked); RepoPrompt MCP code review found one P1 detailed-final-review evidence issue before release, and this patch fixes it
150
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.52` before push
151
+
152
+ ## [2.0.51] - 2026-05-21
153
+
154
+ Keep runtime projections out of handoff lore
155
+
156
+ Flow 2.0.51 tightens the review follow-up from the handoff-evidence release. Worker review guidance now uses the exact `handoffMode` vocabulary everywhere: `task_subagent` for an actual `flow-reviewer` Task handoff, `inline_role` for inline approval fallback, and `not_supported` only when Task is unavailable or denied.
157
+
158
+ The release also renames derived task-progress presentation from `handoff: runtime_projection` to `projection: runtime_projection`. Runtime status, history, and rendered session docs still expose that rows are derived, but they no longer place runtime projections under a handoff label that could be mistaken for real Task/subagent telemetry.
159
+
160
+ The release deliberately does not add slash commands, runtime tools, prompt modes, state paths, package exports, installer behavior, dependencies, persisted schema migrations, or live OpenCode UI automation. It preserves the existing `@opencode-ai/plugin` and `zod` compatibility boundary.
161
+
162
+ Constraint: Keep `handoffMode` labels exact across flow-auto and flow-worker guidance: `task_subagent`, `inline_role`, and `not_supported`
163
+ Constraint: Keep runtime task-progress rows visibly derived without implying they are handoff decisions or OpenCode Task telemetry
164
+ Constraint: Keep `@opencode-ai/plugin` at `1.14.48` and `zod` at `4.1.8`; this release changes no dependency compatibility boundary
165
+ Rejected: Let worker review guidance use shorthand `inline` wording | that drifts from the canonical `inline_role` label and weakens prompt scoring
166
+ Rejected: Keep rendering projections as `handoff: runtime_projection` | the label still reads like handoff telemetry even when the value is neutral
167
+ Rejected: Persist new handoff telemetry in this patch | the fix is presentation and prompt-contract precision, not a schema expansion
168
+ Confidence: high
169
+ Scope-risk: low
170
+ Reversibility: clean
171
+ Directive: Keep actual handoff decisions in coordinator/worker output separate from derived runtime projections; use projection labels for derived rows until real Task telemetry is intentionally persisted
172
+ Tested: `bun test tests/runtime-summary.test.ts tests/runtime-operator-history.test.ts tests/runtime-actionable-metadata.test.ts tests/render-fixtures.test.ts tests/runtime/render-snapshot.test.ts tests/config/prompt-contracts.test.ts tests/mode-contracts.test.ts` (83 pass, 0 fail, 1 snapshot, 1342 expect() calls); `bun run check` (release gate passed: dependency contract OK with project/plugin/root `zod=4.1.8`, architecture seams OK, fresh surfaces OK, release hygiene OK, pack invariants OK for version `2.0.51`, bundle sanity OK, full suite 675 pass/0 fail, lint passed, bench smoke and bench gate passed); `bun run smoke:release` (passed for package `2.0.51`, wrote release-smoke evidence under `prompt-exports/release-smoke/`, real OpenCode CLI not invoked); RepoPrompt MCP code review found two P2 findings before release, and this patch fixes both
173
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.51` before push
174
+
175
+ ## [2.0.50] - 2026-05-14
176
+
177
+ Clarify flow-auto handoff evidence
178
+
179
+ Flow 2.0.50 makes `/flow-auto` subagent coordination more explicit without overclaiming what runtime summaries can prove. Flow-auto prompts now require phase-boundary handoff reporting for planning, execution, and review with the exact modes `task_subagent`, `inline_role`, and `not_supported`, so agents expose whether work used an actual OpenCode Task/subagent, stayed inline, or could not use Task.
180
+
181
+ The release also fixes the reviewed runtime-summary ambiguity: derived task-progress rows are now labeled only as `runtime_projection` instead of hardcoding `not_supported`. This keeps status/history/rendered session summaries useful while avoiding a false claim that Task/subagent invocation was unavailable when the row is only a projection.
182
+
183
+ The release deliberately does not add slash commands, runtime tools, prompt modes, state paths, package exports, installer behavior, dependencies, persisted schema migrations, or live OpenCode UI automation. It preserves the existing `@opencode-ai/plugin` and `zod` compatibility boundary.
184
+
185
+ Constraint: Distinguish actual coordinator handoff decisions from derived runtime progress projections
186
+ Constraint: Keep the flow-auto handoff vocabulary narrow and exact: `task_subagent`, `inline_role`, and `not_supported`
187
+ Constraint: Keep `@opencode-ai/plugin` at `1.14.48` and `zod` at `4.1.8`; this release changes no dependency compatibility boundary
188
+ Rejected: Treat runtime task-progress rows as proof of actual child sessions | those rows are derived presentation state, not OpenCode Task telemetry
189
+ Rejected: Continue rendering projected rows as `not_supported` | that falsely implies Task was unavailable, denied, or not permission-allowed
190
+ Rejected: Persist new handoff telemetry in this patch | neutral projection labeling fixes the reviewed correctness issue without widening persisted schema
191
+ Confidence: high
192
+ Scope-risk: moderate
193
+ Reversibility: clean
194
+ Directive: Keep actual handoff decisions in prompt/coordinator output separate from runtime-derived presentation rows unless real Task telemetry is intentionally persisted later
195
+ Tested: `bun test tests/runtime-summary.test.ts tests/runtime-operator-history.test.ts tests/render-fixtures.test.ts tests/runtime/render-snapshot.test.ts tests/config/prompt-contracts.test.ts tests/mode-contracts.test.ts tests/prompt-mode-behavior-eval.test.ts tests/docs-semantic-parity.test.ts tests/docs-stale-reference-policy.test.ts tests/docs-tool-parity.test.ts && bun run typecheck` (95 pass, 0 fail, 1 snapshot, 1960 expect() calls; `tsc --noEmit` passed); `bun run check` (release gate passed: dependency contract OK with project/plugin/root `zod=4.1.8`, architecture seams OK, fresh surfaces OK, release hygiene OK, pack invariants OK for version `2.0.50`, bundle sanity OK, full suite 675 pass/0 fail, lint passed, bench smoke and bench gate passed); `bun run smoke:release` (passed for package `2.0.50`, wrote release-smoke evidence under `prompt-exports/release-smoke/`, real OpenCode CLI not invoked); RepoPrompt review found one semantic blocker before release, and the implemented fix removed hardcoded runtime-projection `not_supported` evidence
196
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.50` before push
197
+
198
+ ## [2.0.49] - 2026-05-14
199
+
200
+ Make final-review behavior accounting proportional
201
+
202
+ Flow 2.0.49 narrows final-review behavior-risk accounting so broad review scopes no longer require temporal ledgers from declared or unchanged context alone. Grounded review context still triggers concrete async, lifecycle, state, and test-evidence accounting when changed artifacts or reviewed relationships prove those risks are relevant.
203
+
204
+ The release also rejects duplicate behavior and validation risk classes after canonicalization, tightens prompt-capture scoring so validation coverage only counts when its command was recorded, and updates prompt guidance to follow `deliveryPolicy.finalReviewPolicy` without padding non-required behavior classes with `not_applicable` entries.
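Rejecting duplicates "after canonicalization" means two spellings of the same class count as one. The sketch below assumes a simple canonical form (trimmed, lowercased, spaces and hyphens folded to underscores); the actual canonicalization rules are not given in the changelog, and `findDuplicateRiskClasses` is a hypothetical name.

```typescript
// Assumed canonical form; the real rules may differ.
function canonicalizeRiskClass(raw: string): string {
  return raw.trim().toLowerCase().replace(/[\s-]+/g, "_");
}

// Returns each canonical class that appears more than once, in first-repeat order.
function findDuplicateRiskClasses(classes: string[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const raw of classes) {
    const canonical = canonicalizeRiskClass(raw);
    if (seen.has(canonical)) {
      duplicates.add(canonical);
    } else {
      seen.add(canonical);
    }
  }
  return Array.from(duplicates);
}
```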
205
+
206
+ The release deliberately does not add slash commands, runtime tools, prompt modes, state paths, package exports, installer behavior, dependencies, persisted schema migrations, or live OpenCode UI automation. It preserves the existing `@opencode-ai/plugin` and `zod` compatibility boundary.
207
+
208
+ Constraint: Keep final-review strictness proportional to grounded changed artifacts, review context, and validation evidence rather than declared scope breadth alone
209
+ Constraint: Preserve strict behavior accounting for grounded async/event ordering, lifecycle, state, and test-evidence risks
210
+ Constraint: Keep `@opencode-ai/plugin` at `1.14.48` and `zod` at `4.1.8`; this release changes no dependency compatibility boundary
211
+ Rejected: Pad every broad review with non-required `not_applicable` behavior classes | that encourages shallow ledger noise and obscures real required risks
212
+ Rejected: Let validationCoverage satisfy prompt behavior scoring without a recorded validation command | detached validation coverage is weaker than the runtime ledger contract
213
+ Rejected: Derive temporal behavior risks from declared review scope alone | broad review declarations are accounting targets, not proof that async/lifecycle/state behavior changed
214
+ Confidence: high
215
+ Scope-risk: moderate
216
+ Reversibility: clean
217
+ Directive: Keep final-review behavior ledgers grounded in changed artifacts, explicit relationships, and recorded validation evidence; do not reintroduce padded non-applicable ledgers for proportional reviews
218
+ Tested: `bun test tests/runtime/final-review-contracts.test.ts tests/review-prompt-capture.test.ts` (44 pass, 0 fail, 252 expect() calls); `bun run check` (release gate passed: dependency contract OK with project/plugin/root `zod=4.1.8`, architecture seams OK, fresh surfaces OK, pack invariants OK for version `2.0.49`, bundle sanity OK, full suite 675 pass/0 fail, lint passed, bench smoke and bench gate passed); `bun run smoke:release` (passed for package `2.0.49`, wrote release-smoke evidence under `prompt-exports/release-smoke/`, real OpenCode CLI not invoked); RepoPrompt code review found two fixable findings before release; follow-up Oracle review of the implemented fixes found no blockers
219
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.49` before push
220
+
221
+ ## [2.0.48] - 2026-05-14
222
+
223
+ Lock planning validation and post-save artifact recovery
224
+
225
+ Flow 2.0.48 closes the release-review gaps around planning-session durability and OpenCode planning payload validation. `flow_plan_start` now preserves the saved source-of-truth session when artifact rendering fails after persistence, returning a structured `partial_success` response with artifact-sync failure metadata instead of throwing after the mutation was already durable.
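The persistence-first behavior can be sketched as "save, then render inside a guard". The result shape and the injected `save` and `renderArtifacts` callbacks are hypothetical; only the `partial_success` status name and the idea of artifact-sync failure metadata come from the entry above.

```typescript
type PlanStartResult =
  | { status: "ok"; sessionId: string }
  | {
      status: "partial_success";
      sessionId: string;
      artifactSync: { failed: true; reason: string };
    };

function startPlan(
  save: () => string,                       // persists the session, returns its id
  renderArtifacts: (sessionId: string) => void, // derived docs, may throw
): PlanStartResult {
  // A failure here propagates: nothing was saved, so it is a real failure.
  const sessionId = save();
  try {
    renderArtifacts(sessionId);
    return { status: "ok", sessionId };
  } catch (error) {
    // The session is already durable; report the stale artifacts instead of throwing.
    const reason = error instanceof Error ? error.message : String(error);
    return { status: "partial_success", sessionId, artifactSync: { failed: true, reason } };
  }
}
```

Only the rendering step sits inside the `try`, so callers can tell an unsaved mutation (an exception) apart from a saved session whose derived artifacts need repair.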
226
+
227
+ Planning payload strictness is now enforced at the adapter-facing runtime parse boundaries for `flow_plan_apply` and `flow_plan_context_record`. The outer `flow_plan_apply` payload, nested `plan`, nested optional `planning`, and `flow_plan_context_record` planning context reject unknown keys, while simple status/history tools remain intentionally tolerant. New execution-path tests call the actual OpenCode tool `execute()` wrappers to prove validation failures short-circuit through structured JSON errors, not only direct schema `safeParse()` checks.
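A strict parse boundary of this kind can be illustrated without a schema library. The allowed key names below (`summary`, `features`, `notes`) and the `validation_failed` error code are invented for the example; only the three strict levels (outer payload, nested `plan`, optional nested `planning`) follow the entry above.

```typescript
type StrictParseResult =
  | { ok: true }
  | { ok: false; error: "validation_failed"; unknownKeys: string[] };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Lists keys of `value` that are not in `allowed`, prefixed with their location.
function extraKeys(value: Record<string, unknown>, allowed: string[], prefix: string): string[] {
  const allowedSet = new Set(allowed);
  return Object.keys(value)
    .filter((key) => !allowedSet.has(key))
    .map((key) => prefix + key);
}

function checkPlanApplyPayload(payload: unknown): StrictParseResult {
  if (!isRecord(payload)) return { ok: false, error: "validation_failed", unknownKeys: [] };
  const extras = extraKeys(payload, ["plan", "planning"], "");
  if (isRecord(payload.plan)) {
    extras.push(...extraKeys(payload.plan, ["summary", "features"], "plan."));
  }
  if (isRecord(payload.planning)) {
    extras.push(...extraKeys(payload.planning, ["notes"], "planning."));
  }
  return extras.length === 0
    ? { ok: true }
    : { ok: false, error: "validation_failed", unknownKeys: extras };
}
```

Returning a structured result instead of throwing mirrors the point made above: the tool wrapper can short-circuit with a JSON error before any mutation runs.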
228
+
229
+ The release deliberately does not add slash commands, runtime tools, prompt modes, state paths, package exports, installer behavior, dependencies, persisted schema migrations, or live OpenCode UI automation. It preserves the existing `@opencode-ai/plugin` and `zod` compatibility boundary.
230
+
231
+ Constraint: Preserve persistence-first planning semantics while making stale artifact rendering an explicit repair signal after saved state
232
+ Constraint: Tighten only the adapter-facing planning payload boundaries; simple read/status tool payloads remain tolerant by contract
233
+ Constraint: Keep `@opencode-ai/plugin` at `1.14.48` and `zod` at `4.1.8`; this release changes no dependency compatibility boundary
234
+ Rejected: Treat post-save artifact rendering failure as a total `flow_plan_start` failure | callers need to know the session was saved and only derived artifacts need repair
235
+ Rejected: Make every tool schema strict | the strictness contract is scoped to planning payload boundaries and preserves tolerant simple tool behavior
236
+ Rejected: Add a new workspace result-kind hierarchy in this release | the current structured `partial_success` response covers callers without widening the runtime action API
237
+ Confidence: high
238
+ Scope-risk: moderate
239
+ Reversibility: clean
240
+ Directive: Keep planning payload strictness scoped to documented adapter-facing boundaries, and keep post-persistence artifact failures distinguishable from unsaved mutation failures
241
+ Tested: `bun test tests/config/tool-schemas.test.ts` (18 pass, 0 fail, 515 expect() calls); `bun run typecheck`; `bunx biome check tests/config/tool-schemas.test.ts`; `bun run check` (release gate passed: dependency contract OK with project/plugin/root `zod=4.1.8`, architecture seams OK, fresh surfaces OK, pack invariants OK for version `2.0.48`, bundle sanity OK, full suite 663 pass/0 fail, lint passed, bench smoke and bench gate passed); `bun run smoke:release` (passed for package `2.0.48`, wrote release-smoke evidence under `prompt-exports/release-smoke/`, real OpenCode CLI not invoked); RepoPrompt Oracle architect review of the execution-path strictness follow-up returned APPROVE
242
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.48` before push
243
+
244
+ ## [2.0.47] - 2026-05-14
245
+
246
+ Make release smoke safe for reusable evidence directories
247
+
248
+ Flow 2.0.47 turns the release-smoke review fixes into a releasable contract. The new `smoke:release` wrapper now treats `--no-keep-assets` as a disposable-output mode with a side-effect-free preflight: if any known release-smoke asset, evidence, or checklist file already exists, the command refuses before writing anything, preserving existing diagnostics instead of partially overwriting them.
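The preflight idea can be sketched as a pure function. This is an assumption-laden illustration: `preflightDisposableOutput`, the injected `exists` predicate, and the exact file list are hypothetical, although the two evidence file names appear in this entry's `Tested:` line.

```typescript
type PreflightResult = { ok: true } | { ok: false; existing: string[] };

// Pure check: it only asks whether files exist and never writes or deletes,
// so a refusal leaves the output directory exactly as it was.
function preflightDisposableOutput(
  knownFiles: readonly string[],
  exists: (file: string) => boolean,
): PreflightResult {
  const existing = knownFiles.filter((file) => exists(file));
  return existing.length === 0 ? { ok: true } : { ok: false, existing };
}

// Hypothetical list of files a disposable run would produce.
const KNOWN_FILES = [
  "flow.js",
  "release-smoke-evidence.json",
  "manual-live-opencode-checklist.md",
];
```

In a real script the `exists` predicate could wrap `existsSync` from `node:fs` applied to paths under the output directory; injecting it keeps the check easy to test.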
249
+
250
+ The wrapper also separates disposable release assets from diagnostic evidence. Successful disposable runs remove only generated upload assets while retaining the JSON/Markdown smoke evidence and manual-live checklist paths that release and PR notes depend on. Child smoke failures now preserve the expected child evidence paths and read the child JSON when available, so the wrapper failure evidence remains useful instead of pointing at missing diagnostics.
251
+
252
+ The release deliberately does not add slash commands, runtime tools, prompt modes, state paths, package exports, installer behavior, dependencies, persisted schema migrations, or live OpenCode UI automation. It preserves the existing `@opencode-ai/plugin` and `zod` compatibility boundary.
253
+
254
+ Constraint: Harden release-smoke artifact handling without changing the OpenCode plugin runtime surface or release asset format
255
+ Constraint: Keep automated smoke bounded to prepared local assets and temporary workspace state; live OpenCode UI validation remains a manual release step
256
+ Constraint: Keep `@opencode-ai/plugin` at `1.14.48` and `zod` at `4.1.8`; this release changes no dependency compatibility boundary
257
+ Rejected: Delete diagnostic evidence during disposable cleanup | emitted evidence paths must remain readable after the command exits
258
+ Rejected: Let `--no-keep-assets` overwrite stale evidence/checklist files before refusing | that makes the safety mode unsafe for reusable directories
259
+ Rejected: Broaden this patch into GitHub workflow or product-surface changes | the current change is a local release-smoke contract and documentation hardening patch
260
+ Confidence: high
261
+ Scope-risk: narrow
262
+ Reversibility: clean
263
+ Directive: Keep `smoke:release` evidence paths durable after each run, and keep manual live OpenCode validation reported separately from automated smoke claims
264
+ Tested: `bun test tests/cross-area/opencode-smoke.test.ts` (7 pass, 0 fail); `bun test tests/docs-stale-reference-policy.test.ts tests/docs-semantic-parity.test.ts` (5 pass, 0 fail); `bun run smoke:release -- --skip-build --output-dir <tmp> --no-keep-assets` (passed and retained only diagnostic evidence/checklist files); `bun run smoke:release` (passed for package `2.0.47`, wrote release-smoke evidence under `prompt-exports/release-smoke/`, real OpenCode CLI not invoked); manual no-clobber repro for pre-existing `release-smoke-evidence.json` and `manual-live-opencode-checklist.md` (exit 1, original contents unchanged); `bunx biome check scripts/cross-area/release-smoke.mjs tests/cross-area/opencode-smoke.test.ts package.json docs/release-process.md docs/architecture/maintainer-risk-checklist.md --files-ignore-unknown=true`; RepoPrompt code review of the uncommitted release-smoke diff found no actionable blockers or important suggestions; `bun run check` (release gate passed: dependency contract OK with project/plugin/root `zod=4.1.8`, architecture seams OK, pack invariants OK for version `2.0.47`, bundle sanity OK, full suite 659 pass/0 fail, lint passed, bench smoke and bench gate passed)
265
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.47` before push
266
+
267
+ ## [2.0.46] - 2026-05-14
268
+
269
+ Prove release smoke against the artifacts users receive
270
+
271
+ Flow 2.0.46 closes the release-readiness evidence gap around OpenCode installation smoke testing. The release workflow now prepares the same uploadable assets it publishes (`flow.js`, `flow-skills.tar.gz`, `install.sh`, and `uninstall.sh`) before running `smoke:opencode`, then passes those exact local candidates into the smoke runner instead of relying on source-generated defaults.
272
+
273
+ The new smoke runner installs into a temporary `HOME`, imports the installed plugin, checks the public bundle surface, checks generated skills, exercises a minimal `.flow/**` runtime session in an isolated temporary worktree, runs uninstall, and emits JSON/Markdown evidence that clearly separates automated host-boundary smoke from manual live OpenCode UI validation. Local development keeps the default source-generated smoke path, while tests prove the release-shaped explicit-asset path records the expected asset provenance and workspace isolation.
274
+
275
+ The release deliberately does not add slash commands, runtime tools, prompt modes, state paths, package exports, installer behavior, dependencies, persisted schema migrations, or live OpenCode UI automation. It preserves the existing `@opencode-ai/plugin` and `zod` compatibility boundary.
276
+
277
+ Constraint: Strengthen release evidence for uploadable OpenCode assets without publishing source-generated smoke results as if they proved the release archive
278
+ Constraint: Keep automated smoke bounded to local file assets and temporary workspace state; live OpenCode UI validation remains a manual release step
279
+ Constraint: Keep `@opencode-ai/plugin` at `1.14.48` and `zod` at `4.1.8`; this release changes no dependency compatibility boundary
280
+ Rejected: Treat default source-generated smoke evidence as release-asset validation | that could overclaim readiness for the files users actually install from GitHub releases
281
+ Rejected: Invoke a real OpenCode CLI/UI host from CI | the automated smoke should stay deterministic and sandboxed while documenting the remaining manual host validation
282
+ Rejected: Broaden this patch into product-surface changes | the current change is release evidence and documentation only
283
+ Confidence: high
284
+ Scope-risk: moderate
285
+ Reversibility: clean
286
+ Directive: Keep automated smoke evidence tied to the actual prepared release assets, and keep manual live OpenCode validation reported separately from CI smoke claims
287
+ Tested: `bun test tests/cross-area/opencode-smoke.test.ts` (1 pass, 0 fail); `bun run smoke:opencode -- --skip-build --json <tmp>/default-evidence.json --summary <tmp>/default-evidence.md` (passed with `assetSource=generated-defaults` and isolated worktree `.flow/**` evidence); `bun test tests/cross-area/install-lifecycle.test.ts tests/smoke/dist-load.test.ts` (3 pass, 0 fail); RepoPrompt Oracle review of the release-smoke diff found no concrete must-fix findings; `bun run check` (release gate passed: dependency contract OK with project/plugin/root `zod=4.1.8`, architecture seams OK, fresh surfaces OK, bundle sanity OK, full suite 653 pass/0 fail, lint passed, bench smoke and bench gate passed)
288
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.46` before push
289
+
290
+ ## [2.0.45] - 2026-05-14
291
+
292
+ Harden review-schema and activation rollback contracts
293
+
294
+ Flow 2.0.45 closes the review-found safety gaps in the OpenCode adapter and live session activation paths. Hidden workspace mutations now fail closed when the adapter needs OpenCode edit approval but no `ToolContext.ask` bridge is available, preventing permissionless writes under hidden project roots while preserving direct runtime support for hidden home workspaces.
295
+
296
+ Final-review tool schemas now advertise the same structured `reviewContextPack` shape that runtime validation accepts. The adapter no longer blesses compact `unknown[]` context, relationship, or validation-evidence entries that would pass the public OpenCode raw schema and then fail during runtime parsing; parity tests now prove both sides accept structured evidence and reject compact drift, including nested final reviews in worker completion payloads.
297
+
298
+ Live session activation is rollback-safe across both save-driven promotion and direct `activateSession()` promotion. If a requested stored session cannot be promoted after the prior active session is parked, Flow restores the prior active session where possible. Rollback failures now raise `SessionActivationRollbackError` with structured `promotionError`, `rollbackError`, and `rollbackPhase` diagnostics instead of burying one side of the failure in prose.
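A minimal sketch of the rollback-safe promotion follows. The error class name and its three diagnostic fields come from the note above; the constructor signature, the `ActivationSteps` interface, the `activateWithRollback` helper, and the `"restore_prior_active"` phase value are assumptions for illustration.

```typescript
class SessionActivationRollbackError extends Error {
  promotionError: unknown;
  rollbackError: unknown;
  rollbackPhase: string;

  constructor(promotionError: unknown, rollbackError: unknown, rollbackPhase: string) {
    super("Session promotion failed and the prior active session could not be restored");
    this.name = "SessionActivationRollbackError";
    this.promotionError = promotionError;
    this.rollbackError = rollbackError;
    this.rollbackPhase = rollbackPhase;
  }
}

// Hypothetical step callbacks standing in for the filesystem operations.
interface ActivationSteps {
  parkPrior(): void;
  promoteTarget(): void;
  restorePrior(): void;
}

function activateWithRollback(steps: ActivationSteps): void {
  steps.parkPrior();
  try {
    steps.promoteTarget();
  } catch (promotionError) {
    try {
      steps.restorePrior();
    } catch (rollbackError) {
      // Both failures are preserved as structured fields.
      throw new SessionActivationRollbackError(promotionError, rollbackError, "restore_prior_active");
    }
    // Rollback succeeded: surface the original promotion failure.
    throw promotionError;
  }
}
```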
299
+
300
+ The release deliberately does not add slash commands, runtime tools, prompt modes, state paths, package exports, installer behavior, dependencies, or persisted schema migrations. It preserves the existing `@opencode-ai/plugin` and `zod` compatibility boundary.
301
+
302
+ Constraint: Close reviewed adapter/runtime schema drift and activation rollback diagnostics without widening Flow's public command/tool surface or persisted session schema
303
+ Constraint: Keep `@opencode-ai/plugin` at `1.14.48` and `zod` at `4.1.8`; this release changes no dependency compatibility boundary
304
+ Constraint: Preserve hidden-home runtime usability while failing closed only at the OpenCode adapter mutation-permission boundary
305
+ Rejected: Normalize compact string-array review context into structured runtime evidence | compact entries lack required path/reason/relationship fields and would create ambiguous review evidence
306
+ Rejected: Add new recovery commands or state paths for activation rollback failures | the existing session layout can remain coherent with narrower rollback behavior and structured diagnostics
307
+ Rejected: Broaden the release into general runtime simplification | the current change is a focused review-fix patch with behavior-locked tests
308
+ Confidence: high
309
+ Scope-risk: moderate
310
+ Reversibility: clean
311
+ Directive: Keep OpenCode raw schemas execution-valid against runtime parsers, keep hidden-root mutation permission fail-closed, and preserve structured diagnostics for multi-error filesystem recovery paths
312
+ Tested: `bun test tests/config/tool-schemas.test.ts tests/runtime-session-persistence.test.ts && bun run typecheck` (37 pass, 0 fail); RepoPrompt Oracle review of the review-fix diff found no concrete must-fix regression or test inadequacy; `bun run check` (release gate passed: dependency contract OK with project/plugin/root `zod=4.1.8`, architecture seams OK, bundle sanity reported 7 agents, 9 commands, 18 tools, full suite 652 pass/0 fail, lint passed, bench smoke and bench gate passed)
313
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.45` before push
314
+
315
+ ## [2.0.44] - 2026-05-14
316
+
317
+ Lock maintenance guardrails before release
318
+
319
+ Flow 2.0.44 turns the latest quality review into small, behavior-locked maintenance guardrails. `flow_reset_feature` ownership metadata now points at the actual review transition owner, and descriptor/core-action parity coverage verifies emitted events and invariant IDs so registry drift is caught before release.
320
+
321
+ The release also clarifies architecture governance without widening the hard seam checker: `src/prompts/**` and `src/audit/**` are documented as governed projection surfaces outside the `core`/`workflow`/`runtime`/`adapters` layer-edge checker, with tests proving the checker does not partially enforce those projection imports.
322
+
323
+ Runtime mutation finalization now distinguishes artifact-rendering failures after persistence from unsaved mutation failures. Successful and no-op mutations whose `.flow` source-of-truth state is already saved report partial success with artifact-sync failure metadata, while failed transitions that persist recovery state remain failures and include the same artifact-sync evidence. Session persistence policy is also explicit: version `1` sessions remain the only supported persisted schema, and future versions are rejected rather than silently downgraded.
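The version-1-only policy amounts to an explicit check at load time. The sketch below assumes the persisted session carries a top-level `version` field; the function name and result shape are hypothetical.

```typescript
const SUPPORTED_SESSION_VERSION = 1;

type VersionCheck = { ok: true; version: 1 } | { ok: false; reason: string };

// Reject anything that is not exactly version 1 rather than guessing at a downgrade.
function checkSessionVersion(raw: unknown): VersionCheck {
  const version =
    typeof raw === "object" && raw !== null
      ? (raw as { version?: unknown }).version
      : undefined;
  if (version === SUPPORTED_SESSION_VERSION) {
    return { ok: true, version: SUPPORTED_SESSION_VERSION };
  }
  return { ok: false, reason: `Unsupported session version: ${String(version)}` };
}
```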
324
+
325
+ The release deliberately does not add slash commands, runtime tools, prompt modes, state paths, package exports, installer behavior, dependencies, or schema migrations. It preserves the existing `@opencode-ai/plugin` and `zod` compatibility boundary.
326
+
327
+ Constraint: Reduce registry, seam-policy, mutation-finalization, and session-version maintenance risk without widening Flow's public command/tool surface or persisted schema contract
328
+ Constraint: Keep `@opencode-ai/plugin` at `1.14.48` and `zod` at `4.1.8`; this release changes no dependency compatibility boundary
329
+ Constraint: Preserve persistence-first session semantics while making stale artifact rendering an explicit partial-success/recovery signal
330
+ Rejected: Collapse all tool/schema/mode metadata into a new manifest in this release | a broad projection rewrite would be riskier than focused parity coverage for the reviewed drift
331
+ Rejected: Expand the hard architecture seam checker to prompts/audit immediately | those surfaces intentionally project cross-cutting contracts and need projection-specific governance before layer-edge denial
332
+ Rejected: Add session migration logic for hypothetical future versions | the current release only documents and tests the existing version-1-only policy
333
+ Confidence: high
334
+ Scope-risk: moderate
335
+ Reversibility: clean
336
+ Directive: Keep future registry/projection changes covered by parity tests, keep prompt/audit governance explicit, and treat artifact sync failures after saved state as projection repair work rather than lost mutation state
337
+ Tested: `bun run typecheck`; `bun test tests/config/tool-schemas.test.ts tests/mode-contracts.test.ts tests/cross-area/architecture-seams.test.ts tests/runtime-mutation-finalization.test.ts tests/runtime-session-persistence.test.ts tests/cross-area/module-scope-schemas.test.ts` (55 pass, 0 fail); `bun run check:architecture-seams:enforce`; RepoPrompt Oracle review of the uncommitted maintenance diff found no actionable findings; `bun run check` (release gate passed: dependency contract OK with project/plugin/root `zod=4.1.8`, architecture seams OK, pack invariants OK for version `2.0.44`, bundle sanity reported 7 agents, 9 commands, 18 tools, full suite 644 pass/0 fail, lint passed, bench smoke and bench gate passed)
338
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.44` before push
339
+
340
+ ## [2.0.43] - 2026-05-14
341
+
342
+ Return attachment ownership to native OpenCode
343
+
344
+ Flow 2.0.43 removes the Flow-owned OpenCode attachment materialization bridge after confirming native OpenCode image/file attachment handling is the correct owner for ordinary chat context. The plugin no longer registers chat or command attachment capture hooks, and `flow_auto_prepare` no longer reports attachment availability, `attachmentGuidance`, or materialization requirements.
345
+
346
+ The public Flow tool surface shrinks from nineteen tools to eighteen by removing `flow_attachments_materialize`. The deleted adapter attachment store, selection, policy, materialization tool, and behavior tests remove the in-memory byte-retention and workspace-file write path that had made Flow responsible for a host capability it should not duplicate.
347
+
348
+ The release keeps the change intentionally scoped: no slash commands, state paths, package exports, installer behavior, runtime workflow modes, persisted session schemas, package dependencies, or OpenCode SDK compatibility boundaries change. Native OpenCode remains responsible for attached images/files; Flow remains responsible for workflow JSON/state under `.flow/**` and derived docs.
349
+
350
+ Constraint: Remove Flow-owned attachment capture and materialization from normal public operation while preserving native OpenCode attachment behavior as host/model context
351
+ Constraint: Keep the release as a surface contraction, not a replacement attachment system; no new tools, commands, state paths, package exports, dependencies, or workflow modes are added
352
+ Constraint: Keep `@opencode-ai/plugin` at `1.14.48` and `zod` at `4.1.8`; this release changes no dependency compatibility boundary
353
+ Rejected: Keep `flow_attachments_materialize` as a hidden compatibility shim | retaining the obsolete workspace-mutating bridge would preserve the ownership confusion and public-surface maintenance cost
354
+ Rejected: Rebuild a smaller Flow attachment cache with byte caps | the product decision is to let native OpenCode own ordinary attachments rather than make Flow a second attachment subsystem
355
+ Rejected: Rewrite historical release/investigation notes that described the previous bridge | historical docs should remain evidence of prior decisions, while current docs and contracts now describe native ownership
356
+ Confidence: high
357
+ Scope-risk: moderate
358
+ Reversibility: clean
359
+ Directive: Do not reintroduce Flow-owned attachment capture/materialization unless a new explicit product requirement proves native OpenCode context is insufficient and includes a fresh threat model, byte-retention policy, workspace-write policy, and tests
360
+ Tested: `bun test tests/config/plugin-surface.test.ts tests/config/tool-schemas.test.ts tests/config/prompt-contracts.test.ts tests/auto-prepare.test.ts tests/runtime-tools-metadata.test.ts tests/runtime-tool-routing.test.ts tests/smoke/dist-load.test.ts tests/mode-contracts.test.ts tests/docs-tool-parity.test.ts tests/descriptor-family-parity.test.ts tests/protocol-parity.test.ts` (90 pass, 0 fail); `bun test tests/config/plugin-surface.test.ts tests/auto-prepare.test.ts tests/config/tool-schemas.test.ts tests/docs-tool-parity.test.ts` (34 pass, 0 fail); `bun run typecheck`; `bun run lint`; RepoPrompt Oracle review of the uncommitted attachment-downgrade diff found no issues; `bun run check` (release gate passed: dependency contract OK with project/plugin/root `zod=4.1.8`, architecture seams OK, pack invariants OK for version `2.0.43`, bundle sanity reported 7 agents, 9 commands, 18 tools, full suite 635 pass/0 fail, lint passed, bench smoke and bench gate passed)
361
+ Not-tested: Live OpenCode UI runtime interaction with image attachments; live GitHub-hosted release workflow run for tag `v2.0.43` before push
362
+
363
+ ## [2.0.42] - 2026-05-14
364
+
365
+ Simplify maintainer risk guidance and completion-gate internals
366
+
367
+ Flow 2.0.42 continues the delete-first simplification line without widening the public surface. The maintainer risk checklist is now a compact non-canonical pointer to the canonical maintainer contract and contributor map instead of a second copy of boundary policy, reducing current-facing documentation drift risk.
368
+
369
+ The runtime cleanup stays intentionally small: `execution-completion-validation.ts` no longer carries a private reviewer-decision helper with duplicated final-path logic. The non-final reviewer-decision behavior is inlined where it is used, while completion gate ordering, failure messages, recovery metadata, final-review coverage checks, schemas, persistence behavior, package exports, dependencies, installer behavior, slash commands, runtime tools, state paths, and workflow modes remain unchanged.
370
+
371
+ Fresh simplification metrics after the pass: runtime files `124`, runtime LOC `17,480`, large runtime files `7`, top-5 runtime-file LOC share `9.7%`, and architecture seam violations `0`. The largest runtime files are now `schema-review-shared.ts` (`353` LOC), `execution-completion-validation.ts` (`347`), `session-presenters.ts` (`341`), `final-review-coverage.ts` (`332`), and `session-actions.ts` (`326`). Compared with v2.0.41, the release lowers the completion-validation hotspot by `10` LOC and keeps seam violations at zero.
372
+
373
+ Constraint: Preserve Flow's public tool names, command names, runtime response envelopes, `.flow/**` state paths, package exports, installer behavior, completion/reviewer gate semantics, and OpenCode SDK dependency contract while simplifying duplicated guidance and private helper logic
374
+ Constraint: Keep `@opencode-ai/plugin` at `1.14.48` and `zod` at `4.1.8`; this release changes no dependency compatibility boundary
375
+ Constraint: Keep final-feature reviewer approval, final-review coverage, failure messages, and recovery metadata behavior locked while deleting only private redundant logic
376
+ Rejected: Delete or weaken completion/reviewer/final-review gates for a larger LOC reduction | those checks are load-bearing runtime safety contracts
377
+ Rejected: Keep detailed boundary-hotspot guidance duplicated in the non-canonical checklist | duplicated current-facing authority increases drift risk relative to the canonical docs
378
+ Rejected: Broaden the cleanup into additional runtime hotspots before release | one hotspot per pass keeps the simplification diff reviewable and behavior-locked
379
+ Confidence: high
380
+ Scope-risk: narrow
381
+ Reversibility: clean
382
+ Directive: Continue simplification one hotspot at a time; keep detailed current-facing authority in `docs/maintainer-contract.md` and `docs/contributor-map.md`, and preserve completion-gate tests before editing reviewer/final-review validation paths
383
+ Tested: `bun test tests/docs-stale-reference-policy.test.ts tests/docs-semantic-parity.test.ts` (5 pass, 0 fail); `bun run test:fast` (16 pass, 0 fail); `bun run gate:completion-lane` (124 pass, 0 fail); `bun run check:architecture-seams:enforce` (0 blocked imports); `bun run typecheck`; `bun run report:runtime-simplification-metrics` (`runtime.totalLoc=17480`, `seamViolationCount=0`); `bun run check` (release gate passed: dependency contract OK with project/plugin/root `zod=4.1.8`, pack invariants OK for version `2.0.42`, 662 pass, 0 fail, lint passed, bench smoke passed, bench gate passed)
384
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.42` before push
385
+
386
+ ## [2.0.41] - 2026-05-14
387
+
388
+ Publish release-only continuity evidence
389
+
390
+ Flow 2.0.41 is a release-only patch that carries forward the v2.0.40 runtime contract under a fresh package version and tag. There were no implementation commits after `v2.0.40`; this release records that continuity explicitly so the GitHub release notes, package version, changelog, and release archive all describe the same artifact boundary.
391
+
392
+ The public surface remains frozen: no slash commands, runtime tools, prompt contracts, state paths, package exports, dependencies, installer behavior, OpenCode SDK contract, or workflow modes change. The existing `@opencode-ai/plugin` and `zod` versions remain aligned at the already-reviewed SDK compatibility point.
393
+
394
+ This release deliberately does not claim new runtime behavior or simplification metrics beyond the v2.0.40 baseline. Its purpose is to publish a clean release note in the lore format before any further runtime or adapter work resumes.
395
+
396
+ Constraint: Preserve the exact v2.0.40 runtime behavior, public command/tool surface, package exports, installer behavior, `.flow/**` state paths, and OpenCode SDK dependency contract while issuing a fresh release tag
397
+ Constraint: Keep `@opencode-ai/plugin` at `1.14.48` and `zod` at `4.1.8`; this release changes no dependency compatibility boundary
398
+ Rejected: Add a runtime code change merely to justify the patch version | the requested release can be represented honestly as release metadata plus package version alignment
399
+ Rejected: Recompute simplification claims as if behavior changed | no implementation delta exists after `v2.0.40`, so v2.0.41 should point to continuity instead of new metrics
400
+ Confidence: high
401
+ Scope-risk: narrow
402
+ Reversibility: clean
403
+ Directive: Treat v2.0.41 as a release-only continuity marker; future behavior changes should start from v2.0.40/v2.0.41's frozen surface assumptions and carry their own focused verification evidence
404
+ Tested: `git describe --tags --abbrev=0` returned `v2.0.40`; `git log v2.0.40..HEAD` returned no commits before the release metadata bump; `bun run check` (release gate passed: dependency contract OK with project/plugin/root `zod=4.1.8`, pack invariants OK for version `2.0.41`, 662 pass, 0 fail, lint passed, bench smoke passed, bench gate passed)
405
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.41` before push
406
+
407
+ ## [2.0.40] - 2026-05-14
408
+
409
+ Close quality-review seams before further runtime work
410
+
411
+ Flow 2.0.40 closes the quality-review fixes that were intentionally kept inside the current public surface. Shared semantic invariant IDs now live in `src/core/protocols/semantic-invariants.ts`, while runtime/domain keeps the invariant descriptor ownership; this removes the remaining need for the deleted `src/workflow/domain.ts` bridge and makes the architecture seam check enforce the intended dependency directions across `core`, `workflow`, `runtime`, and `adapters`.
412
+
413
+ The release also hardens session and adapter failure edges without changing Flow's operator-facing contract. Session persistence now writes the target session before promoting it active, active-session swaps preserve a coherent stored/parked layout before directory synchronization, Flow Core query dispatch rejects unknown query names explicitly, and failed attachment materialization closes the file handle without attempting a path-based cleanup that could race a swapped destination.
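Explicit rejection of unknown query names can be sketched as a guarded lookup. The query names and handlers below are invented for the example, and `dispatchQuery` is a hypothetical stand-in for the Flow Core entrypoint.

```typescript
// Hypothetical query names and handlers, for illustration only.
const queryHandlers: Record<string, () => string> = {
  status: () => "status-result",
  history: () => "history-result",
};

function dispatchQuery(name: string): string {
  // Own-property check so inherited keys such as "toString" are rejected too.
  const handler = Object.prototype.hasOwnProperty.call(queryHandlers, name)
    ? queryHandlers[name]
    : undefined;
  if (handler === undefined) {
    throw new Error(`Unknown Flow Core query: ${name}`);
  }
  return handler();
}
```

The runtime check matters because names arriving from a tool call are plain strings, whatever the static types say.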
414
+
415
+ Fresh simplification metrics after the pass: runtime files `124`, runtime LOC `17,490`, large runtime files `7`, top-5 runtime-file LOC share `9.8%`, and architecture seam violations `0`. The largest runtime files are now `execution-completion-validation.ts` (`357` LOC), `schema-review-shared.ts` (`353`), `session-presenters.ts` (`341`), `final-review-coverage.ts` (`332`), and `session-actions.ts` (`326`). Compared with v2.0.39, large-file count and top-five concentration stay flat while the seam rules become stricter.
416
+
417
+ The release deliberately does not add slash commands, runtime tools, state paths, package exports, dependencies, installer behavior, or new workflow modes. The OpenCode adapter remains the host integration layer; the new seam rule blocks inward dependencies on adapters, not the adapter itself.
418
+
419
+ Constraint: Preserve Flow's public tool names, command names, runtime response envelopes, `.flow/**` state paths, package exports, installer behavior, and OpenCode SDK dependency contract while closing reviewed quality seams
420
+ Constraint: Keep shared invariant identity in `core` while runtime/domain remains the owner of runtime semantic descriptors and validation behavior
421
+ Constraint: Keep the OpenCode adapter as an outward host boundary; block `runtime -> adapters`, `workflow -> adapters`, and inner-layer adapter dependencies instead of deleting adapter support
422
+ Rejected: Keep `src/workflow/domain.ts` as a dead compatibility bridge | deadcode and seam ownership showed the invariant ID belongs in core protocols
423
+ Rejected: Repair attachment materialization failures by unlinking the destination path after open failure | path-based cleanup can delete a swapped target; leaking an already-created file is safer than TOCTOU deletion
424
+ Rejected: Treat unknown Flow Core query names as unreachable by type alone | runtime entrypoints still need explicit rejection before dispatch
425
+ Confidence: high
426
+ Scope-risk: moderate
427
+ Reversibility: clean
428
+ Directive: Keep future shared IDs core-owned, keep adapter dependencies one-way from `adapters/opencode` into runtime/core, and update seam docs/tests together whenever layer rules change
429
+ Tested: `bun run report:runtime-simplification-metrics`; RepoPrompt review of the uncommitted quality-fix diff; `bun run check` (release gate passed: 662 pass, 0 fail; build, release hygiene, pack invariants, completion lane, full tests, lint, bench smoke, and bench gate passed)
430
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.40` before push
431
+
432
+ ## [2.0.39] - 2026-05-14
433
+
434
+ Harden runtime quality fixes before release
435
+
436
+ Flow 2.0.39 hardens the runtime quality fixes from the latest review pass without widening Flow's public surface. Unknown Flow Core commands now fail explicitly before dispatch, same-worktree save queues recover after a rejected save task, mutation finalization returns the saved session consistently through direct and composite response values, and final-review legacy terminology normalization is centralized behind a shared domain helper.
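Queue recovery after a rejected save can be illustrated with a promise chain whose tail never rejects. `SaveQueue` and `enqueue` are hypothetical names; this is one common way to serialize tasks, not necessarily how Flow does it.

```typescript
class SaveQueue {
  private tail: Promise<void> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    // Start the task only after the previous one has settled.
    const run = this.tail.then(() => task());
    // The tail swallows the outcome so one rejection cannot poison later saves;
    // the caller still observes the rejection through the returned promise.
    this.tail = run.then(
      () => undefined,
      () => undefined,
    );
    return run;
  }
}
```

A queue that chained directly on `run` would stay rejected forever after the first failure, which is the failure mode the fix describes.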
437
+
438
+ The release keeps the compatibility intent narrow: legacy `oracleRefs` and `test_oracle_authenticity` inputs remain accepted where prior schemas allowed them, canonical output continues to prefer `testEvidenceRefs`, and conflict cases remain rejected. Focused regression tests lock the queue-recovery, invalid-command, saved-session substitution, and final-review canonicalization edges.
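The compatibility rule can be sketched as a small canonicalizer. The field names `oracleRefs` and `testEvidenceRefs` come from the note; treating them as string arrays, and defining a "conflict" as both fields present with different contents, are assumptions made for this example.

```typescript
// Assumed input shape: both fields are optional arrays of reference strings.
interface FinalReviewRefsInput {
  oracleRefs?: string[];
  testEvidenceRefs?: string[];
}

function sameRefs(a: string[], b: string[]): boolean {
  return a.length === b.length && a.every((ref, i) => ref === b[i]);
}

// Accept the legacy field on input, always emit the canonical field on output.
function canonicalizeEvidenceRefs(input: FinalReviewRefsInput): { testEvidenceRefs: string[] } {
  const { oracleRefs, testEvidenceRefs } = input;
  if (oracleRefs && testEvidenceRefs && !sameRefs(oracleRefs, testEvidenceRefs)) {
    // Assumed conflict rule: disagreeing legacy and canonical values are rejected.
    throw new Error("oracleRefs and testEvidenceRefs disagree");
  }
  return { testEvidenceRefs: testEvidenceRefs ?? oracleRefs ?? [] };
}
```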
439
+
440
+ Fresh simplification metrics after the pass: runtime files `124`, runtime LOC `17,461`, large runtime files `7`, top-5 runtime-file LOC share `9.8%`, and architecture seam violations `0`. The largest runtime files are now `execution-completion-validation.ts` (`357` LOC), `schema-review-shared.ts` (`353`), `session-presenters.ts` (`341`), `final-review-coverage.ts` (`332`), and `session-actions.ts` (`326`). Compared with v2.0.38, large runtime files dropped from `8` to `7` while preserving the existing command catalog and runtime response envelopes.
441
+
442
+ The release deliberately does not add slash commands, runtime tools, state paths, package exports, dependencies, installer behavior, or new workflow modes. `final-review-canonicalization.ts` is internal runtime-domain structure, not a new package surface.
443
+
444
+ Constraint: Preserve Flow's public tool names, command names, facade imports, `.flow/**` state paths, persisted session schema, and runtime response envelopes while hardening reviewed runtime edge cases
445
+ Constraint: Keep legacy final-review input compatibility for `oracleRefs` and `test_oracle_authenticity` while canonicalizing persisted/normalized output to test-evidence terminology
446
+ Constraint: Keep `zod` and `@opencode-ai/plugin` aligned at the previously released SDK contract; this release changes no dependencies
447
+ Rejected: Broaden the command catalog or add fallback dispatch for unknown Flow Core commands | explicit rejection prevents accidental mutation routing
448
+ Rejected: Rewrite final-review coverage policy while extracting terminology helpers | compatibility canonicalization and policy changes need separate review surfaces
449
+ Rejected: Add a broader response-shaping abstraction for mutation finalization now | the current helper fixes the saved-session consistency bug without widening the application boundary
450
+ Confidence: high
451
+ Scope-risk: moderate
452
+ Reversibility: clean
453
+ Directive: Keep future final-review terminology compatibility in `final-review-canonicalization.ts`, and preserve explicit unknown-command tests whenever Flow Core command catalogs change
454
+ Tested: `bun run report:runtime-simplification-metrics`; RepoPrompt review of uncommitted diff found no blockers; `bun run typecheck`; `bun test tests/runtime-session-persistence.test.ts tests/session-engine.test.ts tests/runtime/final-review-contracts.test.ts tests/completion-gates.test.ts tests/transitions-consolidation.test.ts`; `bun run lint`; `bun test`; `bun run check` (release gate passed: 653 pass, 0 fail; build, release hygiene, pack invariants, completion lane, full tests, lint, bench smoke, and bench gate passed)
455
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.39` before push
456
+
457
+ ## [2.0.38] - 2026-05-14
458
+
459
+ Lower final-review behavior validation concentration behind a ledger seam
460
+
461
+ Flow 2.0.38 continues the runtime simplification line by moving behavior-ledger validation mechanics out of `final-review-behavior-validation.ts` and into `final-review-behavior-ledger-validation.ts`. The established facade remains intact: existing imports for `BehaviorValidationLedgerTarget` and `behaviorValidationLedgerFailureReasons()` continue to resolve through `final-review-behavior-validation.ts`, while grounding and path-safety checks stay in the original validation module.
462
+
463
+ The release keeps the extraction behavior-locked around final-review accounting: required behavior risks, `needs_fix` rejection, `not_applicable` handling, validation command normalization, validation coverage command mapping, and failure-reason ordering remain covered by the existing final-review and completion-gate contracts. The new ledger module imports risk shapes type-only, avoiding a runtime circular dependency with `final-review-behavior-risks.ts`.
464
+
465
+ Fresh simplification metrics after the pass: runtime files `123`, runtime LOC `17,410`, large runtime files `8`, top-5 runtime-file LOC share `9.9%`, and architecture seam violations `0`. The largest runtime files are now `schema-review-shared.ts` (`366` LOC), `execution-completion-validation.ts` (`357`), `session-presenters.ts` (`341`), `final-review-coverage.ts` (`332`), and `session-actions.ts` (`326`). Compared with v2.0.37, large runtime files dropped from `9` to `8`, and `final-review-behavior-validation.ts` is no longer a large runtime file.
466
+
467
+ The release deliberately does not add slash commands, runtime tools, state paths, package exports, dependencies, installer behavior, or new workflow modes. `final-review-behavior-ledger-validation.ts` is internal runtime-domain structure, not a new package surface.
468
+
469
+ Constraint: Preserve Flow's final-review behavior accounting semantics, facade imports, public domain barrel exports, `.flow/**` state paths, and runtime response envelopes while lowering validation hotspot concentration
470
+ Constraint: Keep grounding/path-safety validation separate from behavior-ledger validation without changing failure-reason ordering or validation command normalization
471
+ Constraint: Keep `zod` and `@opencode-ai/plugin` aligned at the previously released SDK contract; this release is structural simplification only
472
+ Rejected: Rewrite final-review coverage policy during the extraction | that would mix behavior change with boundary cleanup
473
+ Rejected: Export the ledger helper as a new package surface | the helper is internal structure behind the existing final-review behavior validation facade
474
+ Rejected: Merge grounding and ledger normalization into a broader utility now | the duplicated helper is non-blocking and a broader utility would widen the cleanup diff
475
+ Confidence: high
476
+ Scope-risk: narrow
477
+ Reversibility: clean
478
+ Directive: Keep future behavior-ledger rules in `final-review-behavior-ledger-validation.ts`; keep grounding and path-safety checks in `final-review-behavior-validation.ts` unless the final-review policy boundary itself changes
479
+ Tested: `bun run report:runtime-simplification-metrics`; `bun run typecheck`; `bun test tests/runtime/final-review-contracts.test.ts tests/completion-gates.test.ts` (65 pass, 0 fail); `bun run check:architecture-seams:enforce`; `bun run deadcode`; RepoPrompt code review found no blockers for the uncommitted extraction; `bun run check` (release gate passed)
480
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.38` before push
481
+
482
+ ## [2.0.37] - 2026-05-14
483
+
484
+ Move mutation finalization behind a behavior-locked boundary
485
+
486
+ Flow 2.0.37 completes the cleanup slice intentionally left outside v2.0.36: mutation persistence, failed-attempt clearing, no-op finalization, save/sync ordering, and mutation response shaping now live in `session-mutation-finalization.ts` instead of the `session-engine.ts` orchestration facade. The public engine entry points remain stable, including `executeTransitionAtRoot()`, `runSessionMutationActionAtRoot()`, and the existing mutation result type exports.
487
+
488
+ The release adds focused behavior-lock coverage around the risky state-machine edges before relying on the extraction: failed transitions with and without sessions, `recordFailure` projection persistence, no-op save/sync behavior, failed-attempt clear policies, saved-session response substitution, and the existing explicit-empty-options no-sync behavior. This keeps the simplification measurable without changing Flow's tool names, command catalog, `.flow/**` state paths, persisted session schema, or runtime response envelopes.
489
+
490
+ Fresh simplification metrics after the pass: runtime files `122`, runtime LOC `17,391`, large runtime files `9`, top-5 runtime-file LOC share `10.5%`, and architecture seam violations `0`. The largest runtime files are now `final-review-behavior-validation.ts` (`424` LOC), `schema-review-shared.ts` (`366`), `execution-completion-validation.ts` (`357`), `session-presenters.ts` (`341`), and `final-review-coverage.ts` (`332`). Compared with v2.0.36, large runtime files dropped from `10` to `9`, and `session-engine.ts` is no longer a top-five runtime file.
491
+
492
+ The release deliberately does not add slash commands, runtime tools, state paths, package exports, dependencies, installer behavior, or new workflow modes. `session-mutation-finalization.ts` is internal runtime structure, not a new package surface.
493
+
494
+ Constraint: Preserve Flow's public tool names, command names, facade imports, `.flow/**` state paths, persisted session schema, and runtime response envelopes while moving mutation finalization out of the engine facade
495
+ Constraint: Keep transition semantics in `src/runtime/transitions/**`; this release only moves application-layer persistence/finalization mechanics
496
+ Constraint: Keep `zod` and `@opencode-ai/plugin` aligned at the previously released SDK contract; this release is structural simplification only
497
+ Rejected: Rewrite mutation actions or transition reducers during the extraction | that would mix behavior change with boundary cleanup
498
+ Rejected: Export the finalization helper through the application barrel or package surface | the helper is internal structure behind `session-engine.ts`
499
+ Rejected: Preserve mutation behavior only through broad integration tests | save/sync and failed-attempt edge cases need focused behavior locks
500
+ Confidence: high
501
+ Scope-risk: moderate
502
+ Reversibility: clean
503
+ Directive: Keep future mutation persistence and finalization changes inside `session-mutation-finalization.ts` unless session loading/orchestration behavior changes; preserve focused behavior-lock tests for every state-machine edge moved
504
+ Tested: `bun run report:runtime-simplification-metrics`; `bun test tests/session-engine.test.ts`; `bun run typecheck`; `bun run lint`; `bun run check:architecture-seams:enforce`; RepoPrompt code review found no P0/P1/P2 findings for the uncommitted extraction; `bun run check` (release gate passed)
505
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.37` before push
506
+
507
+ ## [2.0.36] - 2026-05-13
508
+
509
+ Lower runtime hotspot concentration with behavior-locked seams
510
+
511
+ Flow 2.0.36 continues the runtime simplification line by reducing three concentrated application/root hotspots without changing Flow's public tool surface, command catalog, persisted `.flow/**` state paths, or workflow semantics. The release extracts feature drilldown presentation, task-progress row projection, and generic session read/workspace action-runner plumbing into narrower modules while keeping the established facade imports and runtime response envelopes stable.
512
+
513
+ Session presentation now delegates feature drilldown collection through `session-presenter-drilldowns.ts`, so active, stored, parked, and completed session responses keep their existing shape while the presenter file no longer owns drilldown lookup details. Task progress now separates shared row modeling from review/validation/failure row builders through `summary-task-progress-model.ts` and `summary-task-progress-review.ts`, with `projectTaskProgress()` remaining the integration point. Session engine read/workspace action envelopes now live in `session-engine-action-runner.ts`, while mutation persistence, failed-attempt clearing, no-op handling, save/sync ordering, and default runtime ports remain owned by `session-engine.ts`.
514
+
515
+ Fresh simplification metrics after the pass: runtime files `121`, runtime LOC `17,327`, large runtime files `10`, top-5 runtime-file LOC share `10.5%`, and architecture seam violations `0`. The largest runtime files are now `final-review-behavior-validation.ts` (`424` LOC), `schema-review-shared.ts` (`366`), `execution-completion-validation.ts` (`357`), `session-presenters.ts` (`341`), and `session-engine.ts` (`339`). Compared with v2.0.35's release metrics, large runtime files dropped from `11` to `10` and top-5 concentration dropped from `11.4%` to `10.5%`.
516
+
517
+ The release deliberately does not add slash commands, runtime tools, state paths, package exports, dependencies, installer behavior, or new workflow modes. It also does not move mutation persistence out of `session-engine.ts`; that remains a separate, higher-risk cleanup slice requiring its own behavior lock.
518
+
519
+ Constraint: Preserve Flow's public tool names, command names, facade imports, `.flow/**` state paths, and runtime response envelopes while lowering hotspot concentration
520
+ Constraint: Keep mutation persistence, failed-attempt clearing, no-op semantics, and save/sync ordering owned by `session-engine.ts` during this release
521
+ Constraint: Keep `zod` and `@opencode-ai/plugin` aligned at the previously released SDK contract; this release is structural simplification only
522
+ Rejected: Move mutation persistence during the action-runner extraction | state-machine movement is higher risk and needs a separate behavior lock
523
+ Rejected: Collapse task-progress and presenter extractions into broad runtime redesign | smaller seams are easier to review, verify, and reverse
524
+ Rejected: Add package exports or runtime tools for internal helper modules | the helpers are internal structure, not public product surface
525
+ Confidence: high
526
+ Scope-risk: moderate
527
+ Reversibility: clean
528
+ Directive: Future simplification should keep facade imports stable, add focused tests before extracting behavior, and record fresh runtime metrics before release
529
+ Tested: `bun run report:runtime-simplification-metrics`; `bun test tests/runtime-summary.test.ts`; `bun test tests/session-engine.test.ts`; `bun test tests/runtime-tools.test.ts tests/runtime-tools-metadata.test.ts`; `bun run typecheck`; `bun run lint`; `bun run check:architecture-seams:enforce`; focused RepoPrompt reviews found no blockers for the task-progress and session-engine slices; `bun run check` (639 pass, 0 fail; build, release hygiene, pack invariants, completion lane, full tests, lint, bench smoke, and bench gate passed)
530
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.36` before push
531
+
532
+ ## [2.0.35] - 2026-05-13
533
+
534
+ Upgrade the OpenCode SDK contract without widening Flow's surface
535
+
536
+ Flow 2.0.35 moves the adapter to the current OpenCode plugin SDK contract by pinning `@opencode-ai/plugin` to `1.14.48`, keeping `zod` aligned at `4.1.8`, and adding the `effect` runtime needed to execute SDK permission prompts. Tool context and result typing now come from the SDK boundary instead of local drift-prone aliases, while Flow's public tool names, command names, state paths, and workflow semantics remain unchanged.
537
+
538
+ The release hardens permission handling around the SDK 1.14 `context.ask()` contract. Hidden workspace mutations and attachment materialization now run returned `Effect` values through the bundled Effect runner, and regression coverage proves both successful permission effects and denied permission effects before writes occur. Bundle sanity now exercises a production-built permission-gated tool against an injected peer plugin mock and verifies the permission `Effect` body runs exactly once.
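The two properties the regression coverage targets (the permission body runs exactly once, and a denial stops the write) can be sketched without the SDK. Everything here is a stand-in: `LazyEffect` is a plain thunk rather than an `Effect` from the `effect` package, and the `AskContext` shape and `writeWithPermission` name are hypothetical, not the `@opencode-ai/plugin` API.

```typescript
// Stand-in for a lazy permission effect. The real SDK returns an `Effect`
// that Flow runs through the bundled Effect runner; this thunk only mimics laziness.
type LazyEffect<A> = () => Promise<A>;

// Hypothetical context shape for illustration only.
interface AskContext {
  ask: (request: { permission: string }) => LazyEffect<void>;
}

async function writeWithPermission(ctx: AskContext, write: () => void): Promise<void> {
  const effect = ctx.ask({ permission: "edit" });
  // Nothing happens until the effect is run; run it exactly once.
  await effect();
  // Only reached when the permission effect succeeded.
  write();
}

async function demoPermission(): Promise<{ runs: number; writes: number; denied: boolean }> {
  let runs = 0;
  let writes = 0;
  const allow: AskContext = { ask: () => async () => { runs += 1; } };
  const deny: AskContext = { ask: () => async () => { throw new Error("denied"); } };

  await writeWithPermission(allow, () => { writes += 1; });

  let denied = false;
  try {
    await writeWithPermission(deny, () => { writes += 1; });
  } catch {
    denied = true;
  }
  return { runs, writes, denied };
}
```

Because the effect is lazy, forgetting to run it would silently skip the prompt, which is why counting executions is a useful check.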
539
+
540
+ The cold-start budget now imports the built package in a release-like isolated package with the real `@opencode-ai/plugin` peer and aligned `zod`, so dependency-resolution evidence tracks the upgraded SDK contract more closely. The bundle remains below budget at `823,737` bytes with the peer dependency externalized and no inlined OpenCode client symbols.
541
+
542
+ This release also keeps the simplification work moving by extracting auto-prepare presentation into `session-auto-prepare-presenter.ts` and updating maintainer/contributor docs around protocol projections and release evidence. Fresh runtime metrics: runtime files `117`, runtime LOC `17,259`, large runtime files `11`, top-5 runtime-file LOC share `11.4%`, and architecture seam violations `0`. The largest runtime files are now `session-presenters.ts` (`449` LOC), `final-review-behavior-validation.ts` (`424`), `schema-review-shared.ts` (`366`), `summary-task-progress.ts` (`366`), and `session-engine.ts` (`358`).
543
+
544
+ The release deliberately does not add slash commands, runtime tools, state paths, package exports, installer destinations, or new workflow modes. Backward compatibility with older OpenCode plugin SDK permission shapes is not preserved; this release intentionally upgrades the plugin to the SDK 1.14 contract.
545
+
546
+ Constraint: Align Flow's adapter with `@opencode-ai/plugin` `1.14.48` while keeping `zod` matched to the SDK's effective `4.1.8` contract
547
+ Constraint: Keep the shipped plugin as a single built entry with `@opencode-ai/plugin` externalized and the Effect permission runner bundled
548
+ Constraint: Preserve existing Flow tool names, command names, `.flow/**` state paths, and runtime workflow semantics while upgrading the host SDK boundary
549
+ Rejected: Keep compatibility with the older promise-like permission mock shape | the plugin is intentionally moving to SDK 1.14 `Effect` permissions
550
+ Rejected: Externalize `effect` from the bundle | release-installed single-file plugins need permission prompts to run without resolving an extra runtime peer
551
+ Rejected: Change `zod` independently of the OpenCode plugin SDK | tool arg compatibility depends on the SDK's effective schema runtime
552
+ Confidence: high
553
+ Scope-risk: moderate
554
+ Reversibility: clean
555
+ Directive: Future SDK upgrades must verify the plugin peer version, effective `zod` version, permission `Effect` execution, bundle peer externalization, and cold-start import evidence before release
556
+ Tested: `bun run report:runtime-simplification-metrics`; `bun install`; `bun test tests/runtime-operator-tools.test.ts tests/attachment-materialization.test.ts` (46 pass, 0 fail); `bun run build && node ./scripts/cross-area/bundle-sanity.mjs` (`permissionAskRuns: 1`); `bun run typecheck`; `bun run check:dependency-contract`; `bun run check:cold-start-budget`; `bun run lint`; `bun run check` (634 pass, 0 fail; build, release hygiene, pack invariants, completion lane, full tests, lint, bench smoke, and bench gate passed)
557
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.35` before push
558
+
559
+ ## [2.0.34] - 2026-05-13
560
+
561
+ Keep simplification seams measurable and compatibility-stable
562
+
563
+ Flow 2.0.34 completes the next simplification pass across adapter projections, review-domain validation, rendering projections, and session live-storage boundaries. The change moves implementation detail into narrower helper modules while preserving the existing OpenCode tool names, runtime action bindings, public schema surfaces, `.flow/**` state paths, and facade imports.
564
+
565
+ The release makes the OpenCode core-action projection seam explicit through `core-action-projection.ts`, splits final-review context grounding and review-scope ledger validation into focused domain helpers, separates task-progress row selection from render/presenter code, and centralizes active/stored/completed session live-storage helpers. A follow-up compatibility fix keeps `openCodeToolCoreSummary()` tolerant for stale projected core actions by returning `null` instead of throwing, while strict descriptor metadata lookup remains fail-fast.
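The split between a tolerant summary helper and a fail-fast metadata lookup might look like the following. This is a sketch under assumptions: the descriptor shape, the sample `plan.save` entry, and the names `requireDescriptor` and `coreSummaryOrNull` are hypothetical and do not reproduce `openCodeToolCoreSummary()` or Flow's descriptor catalog.

```typescript
// Hypothetical descriptor shape and catalog.
interface CoreActionDescriptor {
  action: string;
  summary: string;
}

const descriptors = new Map<string, CoreActionDescriptor>([
  ["plan.save", { action: "plan.save", summary: "Persist a draft plan" }],
]);

// Strict lookup: unknown actions are a programming error, so fail fast.
function requireDescriptor(action: string): CoreActionDescriptor {
  const descriptor = descriptors.get(action);
  if (!descriptor) {
    throw new Error(`Unknown core action: ${action}`);
  }
  return descriptor;
}

// Tolerant lookup: stale projected actions yield null so guidance rendering can continue.
function coreSummaryOrNull(action: string): string | null {
  return descriptors.get(action)?.summary ?? null;
}

const staleSummary = coreSummaryOrNull("removed.action");
```

Keeping both entry points lets host-facing guidance degrade quietly while internal metadata paths still surface catalog drift immediately.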
566
+
567
+ Fresh metrics after the pass: runtime files `116`, runtime LOC `17,249`, large runtime files `11`, top-5 runtime-file LOC share `12%`, and architecture seam violations `0`. The largest runtime files are now `session-presenters.ts` (`553` LOC), `final-review-behavior-validation.ts` (`424`), `schema-review-shared.ts` (`366`), `summary-task-progress.ts` (`366`), and `session-engine.ts` (`358`).
568
+
569
+ The release deliberately does not add slash commands, runtime tools, state paths, package exports, dependencies, installer behavior, or workflow semantics. `zod` remains aligned with `@opencode-ai/plugin`; this is internal structure, compatibility preservation, and regression coverage only.
570
+
571
+ Constraint: Preserve existing OpenCode tool names, schema owners, generated guidance/projection shape, and runtime action bindings while reducing adapter projection duplication
572
+ Constraint: Preserve review-domain validation semantics, review decision normalization behavior, and session persistence precedence across active, stored, and completed sessions
573
+ Constraint: Keep `.flow/**` path authority and rendered-doc-as-derived-artifact rules unchanged while centralizing live-storage helpers
574
+ Rejected: Make projection-summary rendering strict at the public helper boundary | stale generated projection data previously returned `null` and host guidance should stay tolerant
575
+ Rejected: Add new runtime surfaces or state paths during simplification | this release is behavior-preserving boundary repair, not product expansion
576
+ Rejected: Change `zod` or plugin SDK versions | tool arg compatibility depends on keeping the effective SDK contract aligned
577
+ Confidence: high
578
+ Scope-risk: moderate
579
+ Reversibility: clean
580
+ Directive: Future simplification should keep public facades stable, add focused regression coverage for extracted seams, and record fresh runtime metrics before release
581
+ Tested: `bun run report:runtime-simplification-metrics`; `bun test tests/config/tool-schemas.test.ts tests/descriptor-family-parity.test.ts` (22 pass, 0 fail); `bun run typecheck`; targeted `bunx biome check` on projection fix files; focused RepoPrompt review found no P0/P1/P2 findings for the compatibility fix; `bun run check` (631 pass, 0 fail; build, release hygiene, pack invariants, completion lane, full tests, lint, bench smoke, and bench gate passed)
582
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.34` before push
583
+
584
+ ## [2.0.33] - 2026-05-13
585
+
586
+ Lower runtime hotspot concentration behind stable facades
587
+
588
+ Flow 2.0.33 continues the runtime simplification work with two behavior-preserving extractions. Task-progress projection now lives in `src/runtime/summary-task-progress.ts`, while `summary-projections.ts` keeps the existing export path as a compatibility facade. Final-review behavior validation now lives in `src/runtime/domain/final-review-behavior-validation.ts`, while `final-review-behavior-risks.ts` remains the public facade for the existing risk and ledger contract.
589
+
590
+ The release updates the runtime complexity baseline after the split: runtime files move from `105` to `107`, runtime LOC from `17,005` to `17,040`, large files remain at `14`, and top-5 runtime-file LOC share drops from `15.0%` to `14.1%`. The former largest hotspot, `final-review-behavior-risks.ts`, is no longer a top-five runtime file.
591
+
592
+ The release deliberately does not add slash commands, runtime tools, state paths, package exports, dependencies, installer behavior, or workflow semantics. `zod` remains aligned with `@opencode-ai/plugin`; this is internal structure, compatibility preservation, and metrics documentation only.
593
+
594
+ Constraint: Preserve existing imports from `summary-projections.ts` and `final-review-behavior-risks.ts` while relocating implementation details
595
+ Constraint: Keep final-review failure ordering, ledger validation semantics, task-progress row projection, and architecture seams unchanged
596
+ Constraint: Keep the simplification measurable through `report:runtime-simplification-metrics` rather than adding a new hard gate
597
+ Rejected: Rename or remove facade exports | downstream adapters, audit schemas, and tests still depend on the established import paths
598
+ Rejected: Reshape session lifecycle or persistence in this release | persisted-state changes carry higher release risk than pure extraction
599
+ Rejected: Add new dependencies or package exports | the release is structural simplification only
600
+ Confidence: high
601
+ Scope-risk: moderate
602
+ Reversibility: clean
603
+ Directive: Future simplification should keep facade barrels stable, update runtime metrics with fresh command output, and add direct regression coverage before moving persisted-session behavior
604
+ Tested: `bun run typecheck`; `bun run report:runtime-simplification-metrics`; `bun test tests/runtime-summary.test.ts tests/cross-area/summarize-goldens.test.ts tests/runtime/final-completion-gates.test.ts tests/runtime/final-review-contracts.test.ts tests/reviewer-decision-scope.test.ts tests/completion-gates.test.ts tests/cross-area/architecture-seams.test.ts tests/cross-area/module-scope-schemas.test.ts` (111 pass, 0 fail); `bun run check`; focused RepoPrompt review found no code/API/architecture issues after docs metric correction
605
+ Not-tested: Live OpenCode UI runtime interaction; live GitHub-hosted release workflow run for tag `v2.0.33` before push
606
+
607
+ ## [2.0.32] - 2026-05-13
608
+
609
+ Make runtime simplification seams explicit
610
+
611
+ Flow 2.0.32 completes a behavior-preserving simplification pass across the runtime and OpenCode adapter. The release splits the runtime schema barrel into focused schema subdomains, decomposes review-scope accounting into target, validation, recovery, and shared evidence modules, and moves session JSON I/O into a narrow `session-workspace-io` boundary for strict parsing, cache clone isolation, atomic writes, and cache invalidation.
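For readers unfamiliar with the pattern, an atomic JSON write paired with a strict read is commonly done with a temp file and a rename. The sketch below uses only Node's standard `fs`, `os`, and `path` modules; the function names, the `.tmp` suffix, and the interpretation of "strict parsing" as "must be a JSON object" are assumptions, not the contents of Flow's `session-workspace-io` module.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write to a sibling temp file, then rename over the target.
// A rename within one directory replaces the file in a single step on POSIX filesystems,
// so readers never observe a half-written session file.
function writeJsonAtomic(path: string, value: unknown): void {
  const tmp = `${path}.tmp`;
  writeFileSync(tmp, JSON.stringify(value, null, 2));
  renameSync(tmp, path);
}

// Assumption: "strict" here means invalid JSON or a non-object payload is an error.
function readJsonStrict(path: string): Record<string, unknown> {
  const parsed: unknown = JSON.parse(readFileSync(path, "utf8"));
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error(`Expected a JSON object in ${path}`);
  }
  return parsed as Record<string, unknown>;
}

const dir = mkdtempSync(join(tmpdir(), "flow-sketch-"));
const sessionPath = join(dir, "session.json");
writeJsonAtomic(sessionPath, { id: "demo", status: "active" });
const loaded = readJsonStrict(sessionPath);
```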
612
+
613
+ The OpenCode attachment path now has a dedicated `attachment-selection` helper with explicit current-message priority, latest-batch fallback, skipped-only batch reporting, and duplicate filename selector handling. Focused tests lock those extracted behaviors, while schema barrel parity and session I/O tests guard the new seams against drift.
614
+
615
+ The release deliberately does not add slash commands, runtime tools, state paths, package exports, dependencies, installer behavior, or new workflow semantics. `zod` remains aligned with `@opencode-ai/plugin`; this is structural simplification plus regression coverage only.
616
+
617
+ Constraint: Preserve public runtime schema imports while moving implementation into narrower subdomain files
618
+ Constraint: Keep session persistence semantics unchanged while isolating strict JSON and atomic-write behavior
619
+ Constraint: Keep OpenCode attachment materialization behavior stable while making selection rules directly testable
620
+ Rejected: Rewrite runtime workflows while simplifying modules | behavior changes would obscure whether the seam extraction was safe
621
+ Rejected: Add new package exports or dependencies | the release is internal structure and test hardening only
622
+ Rejected: Leave extracted helper behavior covered only through integration tests | narrow seams need direct regression coverage
623
+ Confidence: high
624
+ Scope-risk: moderate
625
+ Reversibility: clean
626
+ Directive: Future simplification work should keep facade barrels stable, add focused tests for every extracted seam, and update runtime subdomain metrics when file boundaries move
627
+ Tested: `bun test tests/attachment-selection.test.ts tests/session-workspace-io.test.ts tests/schema-equivalence.test-d.ts`; `bun run typecheck`; `bun run lint`; `bun run check` (626 pass, 0 fail; build, architecture seams, dependency contract, completion lane, full tests, lint, bench smoke, and bench gate passed); focused Oracle review reported no must-fix findings
628
+ Not-tested: Live OpenCode UI attachment materialization; live GitHub-hosted release workflow run for tag `v2.0.32` before push
629
+
630
+ ## [2.0.31] - 2026-05-10
631
+
632
+ Make completion-review recovery evidence explicit
633
+
634
+ Flow 2.0.31 closes the completion/final-review recovery gap found during review of the uncommitted changes: `reviewScopeLedger` evidence must now be grounded in changed artifacts or a review context pack instead of passing through file-target self-reference alone, and `retryPolicy.mustChangeEvidenceRefs` now reports whether generated scaffold evidence actually needs replacement.
635
+
636
+ The release also makes latest failed-attempt recovery safer and easier to operate. Successful retries clear only matching failed tool attempts, explicit reset still clears all failed-attempt state, repeated same-category failures are counted and surfaced, and operator recovery hints are compacted into concise single-line output.
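The clearing rules above can be modeled with a small tracker. This is a sketch, not Flow's state model: the `FailedAttempts` class, its method names, and the tool and category strings are hypothetical, and matching is assumed to be by tool name.

```typescript
// Hypothetical record of one failing tool/category pair.
interface FailedAttempt {
  tool: string;
  category: string;
  count: number;
}

class FailedAttempts {
  private attempts: FailedAttempt[] = [];

  // Repeated failures in the same category for the same tool are counted, not duplicated.
  recordFailure(tool: string, category: string): void {
    const existing = this.attempts.find((a) => a.tool === tool && a.category === category);
    if (existing) {
      existing.count += 1;
    } else {
      this.attempts.push({ tool, category, count: 1 });
    }
  }

  // A successful retry clears only the attempts recorded for that tool.
  recordSuccess(tool: string): void {
    this.attempts = this.attempts.filter((a) => a.tool !== tool);
  }

  // Explicit reset clears everything.
  reset(): void {
    this.attempts = [];
  }

  list(): readonly FailedAttempt[] {
    return this.attempts;
  }
}

const tracker = new FailedAttempts();
tracker.recordFailure("tool_a", "validation");
tracker.recordFailure("tool_a", "validation");
tracker.recordFailure("tool_b", "schema");
const repeatedCount = tracker.list()[0]?.count;
tracker.recordSuccess("tool_a");
const remainingTools = tracker.list().map((a) => a.tool);
```

After `tool_a` succeeds, the `tool_b` failure stays visible, which is the behavior the release describes as preserving still-actionable recovery work.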
637
+
638
+ The release keeps the 2.0.29/2.0.30 reasoning-effort posture intact: `/flow-doctor` verifies Flow-injected agent `reasoningEffort` budgets and command routing while explicitly not claiming proof of OpenCode host-effective session reasoning. It deliberately does not add slash commands, runtime tools, state paths, package exports, dependencies, generated skills, installer behavior, or new workflow semantics. `zod` remains aligned with `@opencode-ai/plugin`.
639
+
640
+ Constraint: Keep final-review and completion recovery evidence grounded in concrete changed artifacts or review context
641
+ Constraint: Preserve actionable failed-attempt visibility until the matching failed tool succeeds or reset is explicit
642
+ Constraint: Treat reasoningEffort diagnostics as Flow-injected config verification, not host-effective runtime proof
643
+ Rejected: Accept file-target self-reference as standalone review evidence | it can make unreviewed targets appear grounded
644
+ Rejected: Clear all failed-attempt state on unrelated successful mutations | it can hide still-actionable recovery work
645
+ Rejected: Change dependencies, installer surfaces, or OpenCode model/provider ownership | this release is runtime contract and diagnostics hardening only
646
+ Confidence: high
647
+ Scope-risk: moderate
648
+ Reversibility: clean
649
+ Directive: Future review-scope recovery changes must keep validation, scaffold examples, retry policy, and operator guidance in lockstep
650
+ Tested: `bun run typecheck`; `bun test tests/completion-gates.test.ts tests/runtime/final-review-contracts.test.ts tests/runtime-operator-tools.test.ts tests/runtime-summary.test.ts` (108 pass, 0 fail); scoped Oracle review of implemented fixes found no blockers; `bun run check`
651
+ Not-tested: Live OpenCode UI verification of host-effective `reasoningEffort`; live GitHub-hosted release workflow run for tag `v2.0.31` before push
652
+
653
+ ## [2.0.30] - 2026-05-10
654
+
655
+ Keep release diagnostics and cleanup evidence current
656
+
657
+ Flow 2.0.30 makes `/flow-doctor detail` report the injected command routing and agent `reasoningEffort` budget map so operators can verify the lane-aware agent configuration introduced in 2.0.29 without inspecting generated config by hand. The config check now fails when `/flow-review` is not routed through `flow-auditor` or when any built-in Flow agent drifts from its expected reasoning budget.
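A config check of this kind could be sketched as below. The `InjectedConfig` shape, the `checkConfig` name, and the failure message wording are assumptions for illustration; only the `/flow-review` to `flow-auditor` routing and the high and low budgets for `flow-auditor` and `flow-control` are taken from the 2.0.29 and 2.0.30 notes, and the remaining built-in agents are omitted.

```typescript
type ReasoningEffort = "low" | "medium" | "high";

// Hypothetical shape of the injected configuration being inspected.
interface InjectedConfig {
  commands: Record<string, { agent: string }>;
  agents: Record<string, { reasoningEffort?: ReasoningEffort }>;
}

// Partial expectation map; the other built-in agents are left out of this sketch.
const expectedBudgets: Record<string, ReasoningEffort> = {
  "flow-auditor": "high",
  "flow-control": "low",
};

function checkConfig(config: InjectedConfig): string[] {
  const failures: string[] = [];
  const reviewAgent = config.commands["/flow-review"]?.agent;
  if (reviewAgent !== "flow-auditor") {
    failures.push(`/flow-review routed to ${reviewAgent ?? "nothing"}, expected flow-auditor`);
  }
  for (const [agent, expected] of Object.entries(expectedBudgets)) {
    const actual = config.agents[agent]?.reasoningEffort;
    if (actual !== expected) {
      failures.push(`${agent}: expected ${expected}, got ${actual ?? "unset"}`);
    }
  }
  return failures;
}

const driftFailures = checkConfig({
  commands: { "/flow-review": { agent: "flow-control" } },
  agents: { "flow-auditor": { reasoningEffort: "high" }, "flow-control": { reasoningEffort: "medium" } },
});
```

Returning a list of failures rather than throwing on the first one lets a detail view report routing drift and budget drift together.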
658
+
659
+ The release also completes a cleanup pass: generated `prompt-exports/` output is now ignored by source control, stale benchmark baseline/result rows for removed workflow event/checkpoint/projection/tool benchmarks are deleted, and completed, dated reasoning-level plan/review docs are removed after a check confirmed that nothing references them. Prompt/render fixtures, historical release/investigation docs, current benchmark files, package scripts, runtime state, tool schemas, and dependency pins remain intact.
660
+
661
+ The release deliberately does not add slash commands, runtime tools, state paths, tool payload schemas, package exports, dependencies, generated skills, installer behavior, or Flow workflow semantics. It keeps `zod` aligned with `@opencode-ai/plugin` and treats this as diagnostics plus source-hygiene consolidation only.
662
+
663
+ Constraint: Keep the 2.0.29 lane-aware agent configuration observable through `/flow-doctor detail`
664
+ Constraint: Keep generated prompt-eval exports out of source control while preserving CI generation and upload paths
665
+ Constraint: Preserve benchmark gates by pruning only rows with no current benchmark emitter
666
+ Rejected: Delete prompt/render fixtures | tests own them as regression corpora and no replacement-coverage proof was established
667
+ Rejected: Delete historical release or investigation docs | maintainer policy treats them as historical evidence by default
668
+ Rejected: Change package scripts, runtime semantics, tool schemas, or dependencies | this release is diagnostics and cleanup only
669
+ Confidence: high
670
+ Scope-risk: narrow
671
+ Reversibility: clean
672
+ Directive: Future cleanup should prove benchmark row coverage both ways: stale rows are absent and every active benchmark has a baseline/result row
673
+ Tested: `bun test tests/runtime-operator-tools.test.ts`; `bun run check:fresh-surfaces`; `bun run deadcode`; `bun run typecheck`; `bun run bench:gate`; `bun run check`
674
+ Not-tested: Live OpenCode UI verification of `/flow-doctor detail`; live GitHub-hosted release workflow run for tag `v2.0.30` before push
675
+
676
+ ## [2.0.29] - 2026-05-10
677
+
678
+ Make Flow agent reasoning budgets lane-aware
679
+
680
+ Flow 2.0.29 adds lane-appropriate OpenCode `reasoningEffort` hints to every built-in Flow agent while preserving user-owned model and provider selection. Planning, planning research, worker review, and standalone audit now receive high reasoning; autonomous coordination receives medium reasoning; focused worker and control lanes receive low reasoning.
681
+
682
+ Standalone `/flow-review` now runs through a dedicated read-only `flow-auditor` agent instead of the low-reasoning `flow-control` agent. The audit renderer is restricted to the standalone audit mode, control prompts no longer advertise audit rendering, and maintainer docs/tests now lock the audit/control split.
683
+
684
+ The release deliberately does not add slash commands, runtime tools, state paths, tool payload schemas, package exports, dependencies, generated skills, installer behavior, or Flow workflow semantics. It treats `reasoningEffort` as pass-through OpenCode agent metadata only, with no provider-specific `model`, `variant`, or nested reasoning config emitted by Flow.
685
+
686
+ Constraint: Keep public command names stable while changing the internal `/flow-review` backing agent
687
+ Constraint: Treat `reasoningEffort` as OpenCode-owned pass-through metadata, not Flow runtime behavior
688
+ Constraint: Preserve existing task handoff permissions and read-only postures for planner, reviewer, control, and audit agents
689
+ Rejected: Bind standalone audit to low-reasoning `flow-control` | audit depth and coverage calibration need a review-class lane
690
+ Rejected: Add model or provider defaults | operators and OpenCode config should keep owning model choice
691
+ Rejected: Broaden the change into runtime state, tool schemas, generated skills, installers, or dependency updates | this release is adapter config and contract alignment only
692
+ Confidence: high
693
+ Scope-risk: narrow
694
+ Reversibility: clean
695
+ Directive: Keep future agent-budget changes tested through command bindings, emitted agent config, read-only permissions, and audit/control tool access boundaries
696
+ Tested: `bun test tests/config/plugin-surface.test.ts tests/mode-contracts.test.ts`; `bun test tests/config/prompt-contracts.test.ts`; `bun test tests/mode-contracts.test.ts tests/config/plugin-surface.test.ts tests/prompt-mode-capture.test.ts tests/config/prompt-contracts.test.ts`; `bun run typecheck`; `bun run check`
697
+ Not-tested: Live OpenCode UI verification that providers honor each emitted `reasoningEffort`; live GitHub-hosted release workflow run for tag `v2.0.29` before push
698
+
699
+ ## [2.0.28] - 2026-05-10
700
+
701
+ Make Flow skills install globally with the plugin
702
+
703
+ Flow 2.0.28 changes the generated OpenCode skill lifecycle from project-local workspace files to the documented global OpenCode skill directory. Source installs and release installs now place `flow-plan`, `flow-run`, and `flow-review` under `~/.config/opencode/skills/**`, matching the global `flow.js` plugin location and making the guidance available without per-workspace `--project` targeting.
704
+
705
+ The source installer, release installer, release workflow, generated skill docs, README, maintainer docs, and focused lifecycle tests were updated together. Install still preflights same-name skill files before mutating the global plugin, and uninstall removes only intact Flow-generated skills while preserving user-managed global skills with the same names.
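The "remove only intact Flow-generated skills" rule can be sketched as a marker-plus-hash classification. Everything here is an assumption for illustration: the marker format, the function names, and the FNV-1a-style hash over UTF-16 code units, which is used only to keep the sketch dependency-free. A real installer would more plausibly use a cryptographic hash.

```typescript
// Hypothetical ownership marker written as the first line of a generated skill.
const MARKER_PREFIX = "<!-- flow-generated hash:";
const MARKER_SUFFIX = " -->";

// Small non-cryptographic hash; a stand-in for whatever hash the installer uses.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Prefixes a skill body with a marker recording the body's hash.
function stampSkill(body: string): string {
  return `${MARKER_PREFIX}${fnv1a(body)}${MARKER_SUFFIX}\n${body}`;
}

type SkillOwnership = "flow-intact" | "flow-edited" | "user-managed";

// Only "flow-intact" files would be safe to overwrite or remove.
function classifySkill(fileText: string): SkillOwnership {
  const newline = fileText.indexOf("\n");
  const firstLine = newline === -1 ? fileText : fileText.slice(0, newline);
  if (!firstLine.startsWith(MARKER_PREFIX) || !firstLine.endsWith(MARKER_SUFFIX)) {
    return "user-managed";
  }
  const recorded = firstLine.slice(MARKER_PREFIX.length, -MARKER_SUFFIX.length);
  const body = newline === -1 ? "" : fileText.slice(newline + 1);
  return fnv1a(body) === recorded ? "flow-intact" : "flow-edited";
}
```

Under this scheme a preflight would abort before touching the plugin whenever any same-name skill classifies as `flow-edited` or `user-managed`.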
706
+
707
+ The release deliberately does not add commands, tools, runtime modes, package exports, dependencies, state paths, or new Flow workflow semantics. It narrows the installation surface to global OpenCode assets only and removes the `--project` lifecycle option rather than introducing another scope selector.
708
+
709
+ Constraint: Follow OpenCode's documented global skill discovery path at `~/.config/opencode/skills/<name>/SKILL.md`
710
+ Constraint: Keep generated skill overwrite/removal guarded by Flow-owned markers and hashes
711
+ Constraint: Preserve user-managed same-name global skills during uninstall
712
+ Rejected: Keep project-local skills as the default | the requested release is global-only skill installation
713
+ Rejected: Add `project|global|both` scope flags | global-only avoids duplicate skill names and per-workspace install drift
714
+ Rejected: Expand workflow semantics | this release changes install location only
715
+ Confidence: high
716
+ Scope-risk: narrow
717
+ Reversibility: clean
718
+ Directive: Keep Flow skill install, release asset packaging, generated skill docs, and uninstall safety checks aligned on `~/.config/opencode/skills/**`
719
+ Tested: `bun run check` (typecheck, prompt captures, dependency and architecture contracts, fresh-surface terminology, dead-code scan, build, release hygiene, pack invariants, completion-lane gate, runtime replay tests, cold-start budget, bundle sanity, full test suite, Biome lint, bench smoke, and bench gate); focused installer/release lifecycle tests; Oracle review of the global-only install diff
720
+ Not-tested: Live `curl .../install.sh | bash` against the GitHub-hosted `v2.0.28` assets before tag push; live OpenCode UI skill discovery after installing from the GitHub release
721
+
722
+ ## [2.0.27] - 2026-05-10
723
+
724
+ Make the release installer install project-local skills
725
+
726
+ Flow 2.0.27 fixes the release installer gap exposed after 2.0.26: `curl .../install.sh | bash` now installs both the global `flow.js` plugin and the generated `flow-plan`, `flow-run`, and `flow-review` skills into the current workspace by default. Operators can pass `--project <path>` through Bash to install those skills into another workspace while keeping the plugin in the canonical global OpenCode plugin slot.
727
+
728
+ The release workflow now publishes a `flow-skills.tar.gz` asset generated from the same source skill bundle used by the Bun installer. The release `install.sh` downloads that asset, preflights existing skill files, refuses to overwrite user-managed or user-edited skill files, then extracts the generated skills. The release `uninstall.sh` mirrors the workspace target behavior and removes only intact generated skills after the same preflight while still clearing the canonical global plugin file.
729
+
730
+ The release deliberately does not add commands, tools, runtime modes, package exports, dependencies, state paths, or new Flow workflow semantics. It keeps the fix scoped to release asset packaging and install/uninstall parity with the already generated project-local guidance surface.
731
+
732
+ Constraint: Keep the curl installer useful for users who install outside the plugin repository checkout
733
+ Constraint: Generate release skill files from the same source bundle as the Bun installer
734
+ Constraint: Preserve user-edited skill files by failing before plugin or skill removal when skill preflight fails
735
+ Rejected: Leave release install as plugin-only | README and 2.0.26 behavior promised project-local guidance skills
736
+ Rejected: Inline three skill documents directly into `install.sh` | generated tarball assets keep the script smaller and source-owned
737
+ Rejected: Add new runtime/plugin behavior | this is release packaging parity, not a workflow expansion
738
+ Confidence: high
739
+ Scope-risk: narrow
740
+ Reversibility: clean
741
+ Directive: Keep release install assets and source install skill generation in sync; future release-script changes must test both plugin and skill installation paths
742
+ Tested: `bun run check` (typecheck, prompt captures, dependency and architecture contracts, fresh-surface terminology, dead-code scan, build, release hygiene, pack invariants, completion-lane gate, runtime replay tests, cold-start budget, bundle sanity, full test suite, Biome lint, bench smoke, and bench gate)
743
+ Not-tested: Live `curl .../install.sh | bash` against the GitHub-hosted `v2.0.27` assets before tag push; live OpenCode UI skill discovery after installing from the GitHub release
744
+
745
+ ## [2.0.26] - 2026-05-10
746
+
747
+ Make OpenCode guidance installable and uninstall cleanup authoritative
748
+
749
+ Flow 2.0.26 moves the OpenCode guidance surface into generated project-local skills and makes source installs target an explicit workspace with `--project`. The global plugin remains installed at the canonical OpenCode plugin slot, while `flow-plan`, `flow-run`, and `flow-review` skills are generated under the target workspace so prompt guidance follows the project being operated on rather than the plugin repository checkout.
750
+
751
+ The install and uninstall lifecycle now snapshots plugin and skill state before mutating either side. If a later step fails, Flow restores the prior plugin file and generated skill files, preventing partial installs or partial removals. Uninstall also clears an incompatible canonical `flow.js` file after skill preflight succeeds, matching the release-script behavior for stale plugin cleanup without silently deleting user-edited generated skills.
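The snapshot-then-restore behavior described above can be illustrated with an in-memory stand-in for the filesystem. This is a sketch under stated assumptions: `FileStore`, `runWithRollback`, and the example paths are hypothetical, and real lifecycle code would operate on files rather than a `Map`.

```typescript
// In-memory stand-in for the plugin and skill files (path -> content).
type FileStore = Map<string, string>;

// Runs the steps in order; if any step throws, restores every tracked path
// to its pre-run content (or absence) and rethrows the original error.
function runWithRollback(
  store: FileStore,
  trackedPaths: string[],
  steps: Array<(store: FileStore) => void>,
): void {
  // Snapshot before mutating anything; undefined means "did not exist".
  const snapshot = new Map<string, string | undefined>();
  for (const path of trackedPaths) {
    snapshot.set(path, store.get(path));
  }
  try {
    for (const step of steps) {
      step(store);
    }
  } catch (error) {
    snapshot.forEach((content, path) => {
      if (content === undefined) {
        store.delete(path);
      } else {
        store.set(path, content);
      }
    });
    throw error;
  }
}
```

Tracking absent files as `undefined` matters for installs: a rollback must delete files the failed run created, not only restore the ones it overwrote.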
752
+
753
+ Prompt and descriptor guidance were refreshed for the newer OpenCode plugin surface, including tighter tool descriptions, generated skill docs, updated prompt/eval fixtures, and maintainer/development docs that describe the plugin-versus-skill split. The release deliberately keeps the public package export, dependency set, runtime state paths, command names, and Flow workflow semantics unchanged.
754
+
755
+ Constraint: Keep the global plugin path canonical while making project-local guidance skills install into the operator-selected workspace
756
+ Constraint: Preserve user edits by refusing to remove modified generated skills and rolling back plugin/skill lifecycle failures
757
+ Constraint: Align source uninstall with release-script stale `flow.js` cleanup without widening package exports or runtime modes
758
+ Rejected: Keep installing skills into the plugin repo cwd | source installs need to target the workspace where OpenCode will load guidance
759
+ Rejected: Leave partial lifecycle failures for manual cleanup | plugin and skill mutation must be transaction-like for local install safety
760
+ Rejected: Add new commands or runtime state paths | this release is install/guidance hygiene, not workflow expansion
761
+ Confidence: high
762
+ Scope-risk: moderate
763
+ Reversibility: clean
764
+ Directive: Keep future OpenCode guidance changes generated, fixture-backed, and project-local; do not re-couple generated skills to the plugin repository checkout
765
+ Tested: `bun run check` (typecheck, prompt captures, dependency and architecture contracts, fresh-surface terminology, dead-code scan, build, release hygiene, pack invariants, completion-lane gate, runtime replay tests, cold-start budget, bundle sanity, full test suite, Biome lint, bench smoke, and bench gate)
766
+ Not-tested: Live OpenCode UI skill loading in an installed external workspace; live GitHub-hosted `release.yml` run for tag `v2.0.26` before push
767
+
768
+ ## [2.0.25] - 2026-05-09
769
+
770
+ Make final-review evidence terminology neutral
771
+
772
+ Flow 2.0.25 removes legacy proof terminology from active final-review, audit, prompt, and fixture contracts. Behavior-risk accounting now uses evidence-oriented names such as `test_evidence_authenticity`, `test_evidence`, `testEvidenceRefs`, and `requireTestEvidenceOrGap`, so reviewers describe what validation proves or leaves as a gap without relying on the previous loaded label.
773
+
774
+ The runtime preserves compatibility for existing persisted sessions and older tool payloads. Legacy input values are accepted at schema and normalization boundaries, mapped to canonical evidence fields, and rejected when old and new reference arrays conflict. Normalized worker final-review history now drops legacy reference fields before persistence, keeping canonical output clean while preserving read compatibility.
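A minimal sketch of that normalization boundary follows. The changelog does not spell out the retired field name, so `legacyEvidenceRefs` is a hypothetical stand-in, as is the `normalizeEvidenceRefs` helper. Accepting both fields when they carry identical values is also an assumption; the source only says that conflicting arrays are rejected.

```typescript
interface FinalReviewInput {
  testEvidenceRefs?: string[];
  legacyEvidenceRefs?: string[]; // hypothetical stand-in for the retired field
}

interface NormalizedFinalReview {
  testEvidenceRefs: string[];
}

// Maps legacy input onto the canonical field and drops the legacy field,
// so persisted output only ever carries canonical terminology.
function normalizeEvidenceRefs(input: FinalReviewInput): NormalizedFinalReview {
  const canonical = input.testEvidenceRefs;
  const legacy = input.legacyEvidenceRefs;
  if (canonical !== undefined && legacy !== undefined) {
    const same =
      canonical.length === legacy.length &&
      canonical.every((ref, index) => ref === legacy[index]);
    if (!same) {
      throw new Error("legacy and canonical evidence refs conflict");
    }
  }
  return { testEvidenceRefs: [...(canonical ?? legacy ?? [])] };
}
```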
775
+
776
+ Prompt, audit, generated prompt, eval, render, benchmark, and schema fixtures were updated together. Active source, tests, and generated prompt surfaces no longer contain the old whole-word terminology, while historical docs and dedicated compatibility tests remain the only intentional legacy references.
777
+
778
+ The release deliberately does not add commands, tools, runtime modes, package exports, dependencies, state paths, or new worker/reviewer payload requirements. It preserves `zod` / `@opencode-ai/plugin` alignment, accepts only a narrow 5 KiB bundle-budget increase for release-bound schema normalization, and treats this as terminology/schema-normalization cleanup rather than a behavioral expansion.
779
+
780
+ Constraint: Keep final-review behavior accounting semantics unchanged while renaming the active vocabulary
781
+ Constraint: Preserve backward input compatibility for persisted sessions and older tool payloads
782
+ Constraint: Emit canonical evidence terminology from normalized runtime outputs
783
+ Constraint: Preserve `zod` / `@opencode-ai/plugin` alignment and public tool transport shape
784
+ Constraint: Accept only a narrow 5 KiB bundle-budget increase for release-bound schema normalization
785
+ Rejected: Remove legacy parsing outright | existing persisted sessions and older callers still need read/input compatibility
786
+ Rejected: Dual-write legacy fields in new persisted output | that would keep the old vocabulary active instead of making it compatibility-only
787
+ Rejected: Broaden the release into new review gates or tool surfaces | the requested change is terminology cleanup with compatibility shims
788
+ Confidence: high
789
+ Scope-risk: moderate
790
+ Reversibility: clean
791
+ Directive: Keep future review evidence terminology canonical in active prompts, schemas, generated surfaces, and fixtures; legacy names should stay confined to compatibility shims/tests
792
+ Tested: active terminology searches for source/test/bench and generated prompt surfaces; `bun run typecheck`; `bun run lint`; targeted schema/final-review/worker-result tests; `bun test`; `bun run build`; `bun run check`
793
+ Not-tested: Live OpenCode UI final-review submission using legacy payload terms; live GitHub-hosted CI/release workflow run for tag `v2.0.25` before push
794
+
795
+ ## [2.0.24] - 2026-05-09
796
+
797
+ Make review-scope recovery scaffold-safe
798
+
799
+ Flow 2.0.24 hardens review and review-and-fix completion recovery around `reviewScopeLedger`. Recovery details now label `exampleReviewScopeLedger` as scaffold-only guidance, and generated example entries carry an explicit scaffold residual-risk placeholder so the runtime can distinguish guidance from reviewer evidence.
800
+
801
+ The runtime now rejects blind scaffold replay. Worker completion payloads and final reviewer decisions that resubmit the scaffold placeholder unchanged fail with review-scope accounting recovery, while a retry with evidence-grounded scope entries and truthful residual risk succeeds. This keeps structured recovery helpful without allowing the runtime-provided example to become fake review evidence.
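One way to picture the replay check is as a comparison against the scaffold placeholder. The placeholder text, the `ScopeLedgerEntry` shape, and the `findScaffoldReplays` helper below are assumptions for illustration; the changelog does not show the actual placeholder or ledger schema.

```typescript
// Hypothetical placeholder carried by runtime-generated example entries.
const SCAFFOLD_RESIDUAL_RISK = "SCAFFOLD: replace with assessed residual risk";

interface ScopeLedgerEntry {
  scope: string;
  residualRisk: string;
}

// Returns the scopes whose residual risk is still the unchanged placeholder.
// A non-empty result would route the submission back into recovery.
function findScaffoldReplays(ledger: ScopeLedgerEntry[]): string[] {
  return ledger
    .filter((entry) => entry.residualRisk.trim() === SCAFFOLD_RESIDUAL_RISK)
    .map((entry) => entry.scope);
}
```

For example, a ledger that copies the example for `src/runtime` but writes a real assessment for `tests` would be rejected for `src/runtime` only.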
802
+
803
+ Prompt, command, and OpenCode tool guidance now tell workers and reviewers to reassess every declared scope, replace scaffold residual risk, avoid resending identical decisions, and use finding refs only when mapped to the declared scope. The investigation notes also clarify that the observed “runtime provided the missing scope ledger entries” retry wording was expected recovery behavior with misleading prose, not proof that review work had already been done.
804
+
805
+ The release deliberately does not add commands, tools, runtime modes, package exports, dependencies, state paths, worker/reviewer payload shapes, or automatic finding-ref assignment. It preserves `zod` / `@opencode-ai/plugin` alignment, keeps review-scope recovery as guidance rather than evidence, and records the separate newest-OpenCode-plugin regression investigation as pending documentation only.
806
+
807
+ Constraint: Treat recovery examples as scaffold-only guidance, never as completed review evidence
808
+ Constraint: Require evidence-grounded `reviewScopeLedger` entries with truthful residual risk before review/review-and-fix completion can pass
809
+ Constraint: Preserve the existing public tool surface and payload shape while tightening validation semantics
810
+ Constraint: Keep dependency and SDK-boundary versions unchanged
811
+ Rejected: Let agents replay `exampleReviewScopeLedger` unchanged | that can convert runtime guidance into unsupported review evidence
812
+ Rejected: Auto-assign closed finding refs from recovery candidates | candidates still require reliable scope mapping before they become ledger evidence
813
+ Rejected: Add a new recovery or review tool | existing structured recovery details and retry paths are sufficient
814
+ Confidence: high
815
+ Scope-risk: moderate
816
+ Reversibility: clean
817
+ Directive: Keep future recovery examples explicitly labeled as non-evidence and pair any scaffold payloads with validation that rejects unchanged placeholders
818
+ Tested: `bun test tests/completion-gates.test.ts tests/runtime/final-review-contracts.test.ts tests/runtime-tools.test.ts tests/config/prompt-contracts.test.ts tests/recovery-hint-parity.test.ts tests/protocol-parity.test.ts` (130 pass, 1229 expect calls); `bunx biome check src/runtime/domain/review-scope-accounting.ts tests/completion-gates.test.ts tests/runtime/final-review-contracts.test.ts --files-ignore-unknown=true`; `bun run typecheck`; `bun run check`
819
+ Not-tested: Live OpenCode UI final-review recovery retry; live newest-OpenCode-plugin regression reproduction; live GitHub-hosted CI/release workflow run for tag `v2.0.24` before push
820
+
821
+ ## [2.0.23] - 2026-05-08
822
+
823
+ Make singleton runtime retries idempotent and artifact-repairable
824
+
825
+ Flow 2.0.23 narrows retry noise around singleton runtime transitions without expanding the public surface. Review, plan approval, and run-start paths now distinguish requested tool metadata from persisted state more clearly, and identical singleton retries no-op instead of rewriting state where the runtime can prove the requested transition is already applied.
826
+
827
+ The release also makes no-op mutation retries artifact-repairable. A lost-response retry that reloads an already-mutated session still skips the session-state save, but it now runs artifact sync when the action's `syncArtifacts` contract allows it. This preserves idempotent state writes while letting retries repair missing rendered artifacts after a partial save/sync failure.
828
+
829
+ Execution-start retry handling is now aligned with prompt guidance. An implicit `flow_run_start({})` retry no-ops only when the current active feature is already `in_progress`; explicit attempts to switch to a different feature while one is active still fail. Review-record behavior is documented as it currently works: recording an identical decision is a no-op, a changed decision overwrites the singleton record, and no reviewer history is appended.
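The run-start retry rules can be sketched as a pure decision function. The `Session` shape, the `decideRunStart` name, and the outcome type are hypothetical. Treating an explicit request for the already-active feature as a no-op is an assumption; the changelog only covers the implicit retry and the different-feature case.

```typescript
type FeatureStatus = "pending" | "in_progress" | "completed";

interface Session {
  activeFeatureId?: string;
  features: Record<string, FeatureStatus>;
}

type StartOutcome =
  | { kind: "noop"; saveState: false; syncArtifacts: boolean }
  | { kind: "start"; featureId: string }
  | { kind: "error"; reason: string };

function decideRunStart(
  session: Session,
  requestedFeatureId?: string,
  syncArtifactsAllowed = true,
): StartOutcome {
  const active = session.activeFeatureId;
  if (active !== undefined && session.features[active] === "in_progress") {
    if (requestedFeatureId === undefined || requestedFeatureId === active) {
      // Already applied: skip the state save, but still allow artifact repair.
      return { kind: "noop", saveState: false, syncArtifacts: syncArtifactsAllowed };
    }
    return { kind: "error", reason: `feature ${active} is already in progress` };
  }
  const next =
    requestedFeatureId ??
    Object.keys(session.features).find((id) => session.features[id] === "pending");
  if (next === undefined || session.features[next] !== "pending") {
    return { kind: "error", reason: "no startable feature" };
  }
  return { kind: "start", featureId: next };
}
```

Keeping `saveState` and `syncArtifacts` as separate flags mirrors the distinction the release draws: a no-op retry never rewrites session state, yet it may still repair missing rendered artifacts.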
830
+
831
+ The release deliberately does not add commands, tools, runtime modes, package exports, dependencies, state paths, worker/reviewer payload shapes, or history-appending completion idempotency. It keeps completion calls non-idempotent without new worker evidence, preserves snapshot-primary runtime persistence, keeps prompts/docs descriptive rather than authoritative over runtime semantics, and accepts a narrow 6 KiB bundle-budget increase for release-bound retry/idempotency metadata and guidance.
832
+
833
+ Constraint: Treat runtime tool metadata as request progress until the structured response confirms persisted state
834
+ Constraint: Preserve singleton no-op behavior only where the runtime can prove the same transition is already applied
835
+ Constraint: Keep no-op retries artifact-repairable without saving session state again
836
+ Constraint: Preserve completion history semantics; do not make `flow_run_complete_feature` idempotent without new worker evidence
837
+ Constraint: Accept only a narrow 6 KiB bundle-budget increase for release-bound retry/idempotency metadata and prompt guidance
838
+ Rejected: Make all repeated runtime calls successful no-ops | history-appending completion and changed reviewer decisions carry new evidence/state and must remain explicit
839
+ Rejected: Add new runtime status or retry tools | existing status, recovery metadata, and no-op transitions are sufficient
840
+ Rejected: Broaden prompt guidance into runtime authority | runtime transitions remain the behavior source of truth; prompts only describe safe retry boundaries
841
+ Confidence: high
842
+ Scope-risk: moderate
843
+ Reversibility: clean
844
+ Directive: Keep future retry/idempotency work transition-specific, test-backed, and explicit about whether artifacts sync, session state saves, or history rows are appended
845
+ Tested: `bun test tests/runtime-tools-metadata.test.ts tests/runtime-tools.test.ts tests/config/prompt-contracts.test.ts tests/reviewer-decision-scope.test.ts tests/runtime/plan-and-tool-schema-contracts.test.ts tests/prompt-snapshot.test.ts tests/prompt-eval-corpus.test.ts` (76 pass, 1314 expect calls); focused runtime/prompt/tool/docs gate bundle (133 pass, 2587 expect calls); `bun run eval:prompt-capture:check`; `bun run eval:review-capture:check`; `bun run typecheck`; final focused RepoPrompt review for doc/export blockers; `bun run check`
846
+ Not-tested: Live OpenCode UI session exercising lost-response retry rendering; live GitHub-hosted CI/release workflow run for tag `v2.0.23` before push
847
+
848
+ ## [2.0.22] - 2026-05-08
849
+
850
+ Make attachment materialization runtime-guided and content-policy explicit
851
+
852
+ Flow 2.0.22 tightens the `/flow-auto` attachment contract introduced in 2.0.21. Auto preparation now exposes attachment availability as runtime-owned `attachmentGuidance`, and prompt/mode contracts instruct coordinators to materialize attachments only when `attachmentGuidance.materializationRequired` is true, using the provided tool and args before planning, repository inspection, or Task/subagent handoff.
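As a rough illustration of that contract, a coordinator could derive its preparation calls from the guidance alone. Only `attachmentGuidance.materializationRequired` is named in the changelog; the `tool` and `args` field names and the `planPreparation` helper are assumptions.

```typescript
interface AttachmentGuidance {
  materializationRequired: boolean;
  tool?: string; // assumed field name for the provided tool
  args?: Record<string, unknown>; // assumed field name for the provided args
}

interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// Returns the tool calls to make before planning, inspection, or handoff.
// The decision follows the runtime flag, never the wording of the goal.
function planPreparation(guidance: AttachmentGuidance): ToolCall[] {
  if (!guidance.materializationRequired) {
    return [];
  }
  if (guidance.tool === undefined) {
    throw new Error("materialization is required but no tool was provided");
  }
  return [{ tool: guidance.tool, args: guidance.args ?? {} }];
}
```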
853
+
854
+ The release also clarifies that supported attachment formats are MIME/content-policy based rather than filename-extension trust based. Captured file names are used only for safe slug bases; materialization normalizes MIME, requires matching data URL MIME, verifies image magic bytes, and writes canonical extensions from the validated MIME policy. A JPEG uploaded with a misleading `.png` filename therefore imports as `.jpg`, not as the user-provided extension.
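The extension-selection rule can be sketched as follows. The function names and error messages are hypothetical, and the byte signatures are the commonly documented ones. The AVIF test only looks for an `ftyp` box with the `avif` major brand, which is a simplification.

```typescript
type ImageMime = "image/png" | "image/jpeg" | "image/webp" | "image/gif" | "image/avif";

const CANONICAL_EXTENSION: Record<ImageMime, string> = {
  "image/png": ".png",
  "image/jpeg": ".jpg",
  "image/webp": ".webp",
  "image/gif": ".gif",
  "image/avif": ".avif",
};

function ascii(bytes: Uint8Array, start: number, end: number): string {
  return String.fromCharCode(...Array.from(bytes.slice(start, end)));
}

// Identifies an allowlisted image type from its leading bytes.
function sniffImageMime(bytes: Uint8Array): ImageMime | undefined {
  const startsWith = (signature: number[]) => signature.every((b, i) => bytes[i] === b);
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "image/png";
  if (startsWith([0xff, 0xd8, 0xff])) return "image/jpeg";
  const head = ascii(bytes, 0, 12);
  if (head.startsWith("GIF87a") || head.startsWith("GIF89a")) return "image/gif";
  if (head.startsWith("RIFF") && head.slice(8, 12) === "WEBP") return "image/webp";
  if (head.slice(4, 12) === "ftypavif") return "image/avif";
  return undefined;
}

// The filename is deliberately not a parameter: only the declared MIME,
// the data URL MIME, and the bytes decide the output extension.
function canonicalExtension(declaredMime: string, dataUrlMime: string, bytes: Uint8Array): string {
  const declared = declaredMime.trim().toLowerCase();
  if (declared !== dataUrlMime.trim().toLowerCase()) {
    throw new Error("declared MIME does not match data URL MIME");
  }
  const sniffed = sniffImageMime(bytes);
  if (sniffed === undefined || sniffed !== declared) {
    throw new Error("payload bytes do not match an allowed image MIME");
  }
  return CANONICAL_EXTENSION[sniffed];
}
```

With this shape, a JPEG payload named `photo.png` still yields `.jpg`, because the name never reaches the validation path.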
855
+
856
+ The release deliberately does not expand attachment ingress. It adds no commands, runtime modes, package exports, dependencies, state paths, persisted attachment indexes, SVG support, raw base64 transport, filesystem path imports, `file:` imports, or HTTP URL imports. It keeps binary assets outside `.flow/**`, preserves `zod` / `@opencode-ai/plugin` alignment, and accepts a narrow 4 KiB bundle-budget increase for runtime attachment guidance snapshots and coordinator instructions.
857
+
858
+ Constraint: Treat attachment materialization as a runtime-guided preparation contract, not a prompt-inferred goal classification
859
+ Constraint: Keep the format restriction MIME/content-policy based; never trust the uploaded filename extension for validation or output extension selection
860
+ Constraint: Preserve the 2.0.21 ingress boundary: supported `data:` image attachments only, root-bound destinations, no `.flow/**` asset writes, no dependency-version changes, and only a narrow 4 KiB bundle-budget increase
861
+ Rejected: Materialize based on whether prose appears attachment-dependent | the runtime already knows current/latest attachment availability and skipped unsupported records
862
+ Rejected: Preserve user-supplied filename extensions | filenames are untrusted metadata and may not match the payload MIME or bytes
863
+ Rejected: Broaden attachment sources or formats in this patch | SVG, raw base64, filesystem, `file:`, and HTTP sources need separate threat-model review
864
+ Confidence: high
865
+ Scope-risk: moderate
866
+ Reversibility: clean
867
+ Directive: Keep future attachment changes driven by explicit runtime attachment guidance and validated MIME/content policy, not model inference or filename suffixes
868
+ Tested: `bun test tests/attachment-materialization.test.ts tests/auto-prepare.test.ts tests/config/plugin-surface.test.ts tests/config/prompt-contracts.test.ts tests/runtime-tool-routing.test.ts tests/prompt-mode-behavior-eval.test.ts tests/docs-stale-reference-policy.test.ts` (86 pass, 1298 expect calls); `bun run typecheck`; `bun run eval:prompt-capture:check`; `bunx biome check ... --files-ignore-unknown=true`; `bun run check`
869
+ Not-tested: Live OpenCode UI attachment upload session; live GitHub-hosted CI/release workflow run for tag `v2.0.22` before push
870
+
871
+ ## [2.0.21] - 2026-05-08
872
+
873
+ Materialize OpenCode image attachments before Flow automation planning
874
+
875
+ Flow 2.0.21 adds a narrow attachment-ingress bridge for `/flow-auto` goals that depend on user-supplied images. The OpenCode plugin now captures supported chat/file parts for the active session and exposes `flow_attachments_materialize`, a coordinator-only tool that imports PNG, JPEG, WebP, GIF, and AVIF `data:` attachments into explicit workspace asset paths before planning, implementation inspection, or Task/subagent handoff.
876
+
877
+ The release is a deliberate surface-freeze exception: it adds one public Flow tool because the previous behavior left chat-visible image attachments unavailable as shell-readable project files, forcing manual user file placement. The new tool is bounded to `/flow-auto`, returns workspace-relative paths for plan/evidence handoff, and keeps binary assets outside `.flow/**`; Flow session state remains snapshot-primary and derived docs remain markdown artifacts only.
878
+
879
+ The materialization path is intentionally conservative. It allowlists image MIME types, keeps SVG unsupported, rejects raw base64, filesystem, `file:`, and HTTP URL sources, enforces data-size limits before decode, sanitizes filenames, prevents traversal and `.flow/**` destinations, rejects symlink destination ancestry, and writes final files with exclusive-create semantics, resolving name collisions through deterministic suffixes. Unsupported or stale attachments are reported as skipped metadata instead of silently falling back to older captured files.
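Some of these destination guards can be sketched with plain string handling. This is an illustration only: `resolveAssetPath` and the `-1`, `-2` suffix format are assumptions, and the symlink-ancestry and exclusive-write checks need real filesystem calls that are not modeled here.

```typescript
// Resolves a workspace-relative destination, rejecting unsafe targets and
// picking a deterministic suffix when the name is already taken.
function resolveAssetPath(relativeDest: string, existing: Set<string>): string {
  const unified = relativeDest.replace(/\\/g, "/");
  if (unified.startsWith("/") || /^[A-Za-z]:/.test(unified)) {
    throw new Error("destination must be workspace-relative");
  }
  const segments = unified.split("/").filter((s) => s !== "" && s !== ".");
  if (segments.length === 0) throw new Error("destination is empty");
  if (segments.includes("..")) throw new Error("path traversal is not allowed");
  if (segments[0] === ".flow") throw new Error(".flow/** is reserved for Flow state");

  const candidate = segments.join("/");
  if (!existing.has(candidate)) return candidate;

  // Split "dir/name.ext" into stem and extension, ignoring dotfiles.
  const dot = candidate.lastIndexOf(".");
  const slash = candidate.lastIndexOf("/");
  const hasExtension = dot > slash + 1;
  const stem = hasExtension ? candidate.slice(0, dot) : candidate;
  const extension = hasExtension ? candidate.slice(dot) : "";
  for (let n = 1; ; n++) {
    const next = `${stem}-${n}${extension}`;
    if (!existing.has(next)) return next;
  }
}
```

Counting upward from 1 keeps the result reproducible for a given set of existing files, which is one reading of "deterministic" in the text above.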
880
+
881
+ Prompt, mode, descriptor, docs, and schema contracts now teach `flow-auto` to follow the runtime `attachmentGuidance.materializationRequired` field from `flow_auto_prepare`: when true, materialize with the provided tool/args before planning or handoff; when false, do not call the tool. Planner/worker/reviewer handoffs should receive concrete imported paths rather than chat-only attachment references.
882
+
883
+ The release deliberately does not add commands, runtime modes, package exports, dependencies, state paths, worker/reviewer payload shapes, evidence-packet binary transport, or persisted attachment indexes. It preserves `zod` / `@opencode-ai/plugin` alignment and accepts a narrow bundle budget increase for the release-bound attachment capture, policy, and root-safe materialization guards.
884
+
885
+ Constraint: Add exactly one narrow `/flow-auto` workspace tool to bridge supported OpenCode image attachments into project asset files
886
+ Constraint: Keep imported binary assets outside `.flow/**`; Flow-owned state remains session JSON plus derived markdown docs
887
+ Constraint: Preserve worker/reviewer/tool JSON payload contracts except for the explicit `flow_attachments_materialize` raw arg schema
888
+ Constraint: Preserve `zod` / `@opencode-ai/plugin` alignment; no dependency-version changes
889
+ Constraint: Accept only a narrow bundle budget increase for attachment capture, validation, and exclusive-write safety guards
890
+ Rejected: Treat chat attachments as filesystem files without materialization | OpenCode file parts are model-visible context and may not be shell-readable workspace paths
891
+ Rejected: Store attachment bytes or imported assets under `.flow/**` | Flow state paths are runtime/session artifacts, not project asset storage
892
+ Rejected: Support SVG, raw base64, filesystem paths, `file:`, or HTTP URLs now | those sources need separate trusted-origin and threat-model review
893
+ Rejected: Reuse evidence packets or worker artifacts as binary transport | those contracts are metadata references, not attachment byte channels
894
+ Confidence: high
895
+ Scope-risk: moderate
896
+ Reversibility: clean
897
+ Directive: Keep future attachment ingress narrow, permissioned, root-bound, and explicitly documented before expanding formats or source URL support
898
+ Tested: `bun test tests/attachment-materialization.test.ts tests/config/plugin-surface.test.ts tests/config/tool-schemas.test.ts tests/runtime-tools-metadata.test.ts tests/config/prompt-contracts.test.ts tests/mode-contracts.test.ts tests/prompt-mode-behavior-eval.test.ts tests/smoke/dist-load.test.ts` (88 pass, 2095 expect calls); `bun run typecheck`; `bun run check` before tag push
899
+ Not-tested: Live OpenCode UI attachment upload session; live GitHub-hosted CI/release workflow run for tag `v2.0.21` before push
900
+
901
+ ## [2.0.20] - 2026-05-08
902
+
903
+ Make feature identifiers drill down to Flow-rendered feature docs
904
+
905
+ Flow 2.0.20 connects visible `featureId` references in status, history, execution, review, and metadata surfaces to the existing Flow-rendered per-feature markdown artifact. Feature drilldowns are presentation-only targets over `.flow/active`, `.flow/stored`, and `.flow/completed` docs; canonical session state, worker results, reviewer decisions, and tool argument schemas remain unchanged.
906
+
907
+ The release hardens path handling around explicit drilldown sources. Caller-provided session directories and session paths must resolve under the expected Flow lifecycle root, and malformed or missing docs degrade to unavailable drilldown metadata instead of breaking read-only status/history responses. This preserves passive feature inspection without creating subagent sessions or widening runtime persistence.
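The degrade-instead-of-fail behavior might look like the sketch below. The `features/<featureId>.md` layout, the `Drilldown` type, and the `featureDocDrilldown` helper are hypothetical; only the three lifecycle roots and the available/missing/unavailable outcomes are drawn from the changelog.

```typescript
type Lifecycle = "active" | "stored" | "completed";

type Drilldown =
  | { status: "available" | "missing"; path: string }
  | { status: "unavailable"; reason: string };

// Derives a feature-doc target without ever throwing, so read-only
// status/history responses keep working when inputs are malformed.
function featureDocDrilldown(
  sessionDir: string,
  lifecycle: Lifecycle,
  featureId: string,
  renderedDocs: Set<string>,
): Drilldown {
  const root = `.flow/${lifecycle}/`;
  const normalized = sessionDir.replace(/\\/g, "/").replace(/\/+$/, "");
  const segments = normalized.split("/");
  if (!`${normalized}/`.startsWith(root) || segments.includes("..") || segments.includes(".")) {
    return { status: "unavailable", reason: "session directory is outside the lifecycle root" };
  }
  if (!/^[A-Za-z0-9][A-Za-z0-9._-]*$/.test(featureId)) {
    return { status: "unavailable", reason: "malformed feature id" };
  }
  const path = `${normalized}/features/${featureId}.md`; // hypothetical layout
  return { status: renderedDocs.has(path) ? "available" : "missing", path };
}
```

A `missing` result still carries the intended path, matching the description of pruned or not-yet-rendered docs that point at where the artifact would live.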
908
+
909
+ Status and history presenters now own the fallback resolver for feature docs. Active, stored, and completed sessions can expose available feature-doc targets when rendered docs exist, while pruned or not-yet-rendered feature docs surface as missing drilldowns that still point at the intended artifact location.
910
+
911
+ The release deliberately does not add commands, tools, runtime modes, package exports, dependencies, state paths, or worker/reviewer schema changes. It keeps feature drilldowns in summary, presenter, history, and metadata layers, matching the existing Flow artifact lifecycle rather than introducing child-session navigation for passive inspection. It accepts a narrow 5 KiB bundle budget increase for release-bound drilldown presentation and path-hardening code.
912
+
913
+ Constraint: Treat feature drilldown as a derived presentation model over Flow-owned rendered artifacts and session history
914
+ Constraint: Keep active/stored/completed session path derivation root-bound under `.flow/**`
915
+ Constraint: Preserve worker/reviewer/tool JSON schemas and persisted session shape; no dependency-version changes
916
+ Constraint: Accept only a narrow 5 KiB bundle budget increase for release-bound feature drilldown presentation and path-hardening code
917
+ Rejected: Open or create subagent sessions for passive `featureId` inspection | subagents represent delegated planner/worker/reviewer work, while feature docs already provide the passive detail target
918
+ Rejected: Add a new persisted drilldown index | status/history can derive the target from existing session roots and feature ids
919
+ Rejected: Fail status/history when optional feature docs are malformed or missing | drilldown is presentation-only and must degrade without blocking read surfaces
920
+ Confidence: high
921
+ Scope-risk: moderate
922
+ Reversibility: clean
923
+ Directive: Keep future feature-inspection UX in artifact/presenter metadata layers unless a concrete runtime state requirement justifies schema expansion
924
+ Tested: `bun test tests/feature-doc-drilldown.test.ts tests/runtime-operator-history.test.ts tests/runtime-summary.test.ts tests/runtime-tools-metadata.test.ts` (46 pass, 338 expect calls); `bun run typecheck`; `bun run check` before tag push
925
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.20` before push
926
+
927
+ ## [2.0.19] - 2026-05-08
928
+
929
+ Make subagent work visible without weakening Flow runtime contracts
930
+
931
+ Flow 2.0.19 adds a derived task-progress projection for Flow sessions so operators can see planning, execution, validation, review, final-review, and recovery work as concise task rows in status summaries, history responses, rendered session docs, and OpenCode action metadata. The projection is presentation-only: worker results, reviewer decisions, tool payloads, and persisted session schemas remain the runtime-owned machine contracts.
932
+
933
+ The release also updates prompt guidance around role-aware Task/subagent handoffs. Coordinators can split independent planning, implementation, and review work into fresh child contexts, while leaf reviewer/audit roles stay evidence-backed report producers rather than recursive orchestrators. This keeps the investigation recommendation grounded in current runtime ownership: prompts describe orchestration, and Flow tools still own state transitions.
934
+
935
+ Stored-session history now keeps parked-session UX consistent. When `flow_history_show` displays a non-completed stored session, task-progress rows and operator summaries point to session activation instead of direct work that would not update the parked runtime state. Completed and active session summaries keep their existing task-progress behavior.
936
+
937
+ The release deliberately does not add commands, tools, runtime modes, state paths, package exports, dependencies, or worker/reviewer schema changes. It accepts a narrow bundle budget increase for the release-bound task-progress presentation code while preserving direct JSON tool contracts, `zod` / `@opencode-ai/plugin` alignment, and snapshot-primary runtime persistence.
938
+
939
+ Constraint: Keep task/session progress as a derived presentation model over canonical session, history, validation, review, and tool metadata
940
+ Constraint: Preserve worker/reviewer/tool JSON schemas and direct OpenCode `tool(...)` arg shapes without stringified or nested transport wrappers
941
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this release
942
+ Constraint: Accept only a narrow 8 KiB bundle budget increase for release-bound task-progress projection and presentation code
943
+ Rejected: Replace worker/reviewer JSON with prose | runtime schemas, adapter schemas, completion gates, and regression tests depend on strict machine-readable payloads
944
+ Rejected: Add first-class child-session tree persistence now | current runtime history is flat and feature/session-oriented, so projection-first UX avoids a schema migration
945
+ Rejected: Let parked stored sessions show direct work next steps | direct work outside Flow would not update parked runtime records, so activation must be the visible next action
946
+ Confidence: high
947
+ Scope-risk: moderate
948
+ Reversibility: clean
949
+ Directive: Keep future subagent/task UX improvements in summary, presenter, render, history, and metadata layers unless a concrete runtime requirement justifies persisted child-session modeling
950
+ Tested: `bun test tests/runtime-operator-history.test.ts tests/runtime-summary.test.ts tests/runtime-tools-metadata.test.ts tests/runtime-actionable-metadata.test.ts tests/config/prompt-contracts.test.ts` (62 pass, 735 expect calls); `bun run typecheck`; `bun run check` before tag push
951
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.19` before push
952
+
953
+ ## [2.0.18] - 2026-05-08
954
+
955
+ Restore the hosted generated-drift lane after the Flow Core release
956
+
957
+ Flow 2.0.18 is a fix-forward release for the hosted CI failure observed immediately after `v2.0.17`. The `v2.0.17` release workflow published successfully, but the main-branch CI generated-drift preflight still invoked Bun with the old test-name pattern `descriptor family parity`. The descriptor suite was intentionally renamed around the smaller OpenCode registry, so the hosted pattern matched zero tests even though the local full `bun run check` path had passed.
958
+
959
+ This release keeps the Flow Core snapshot-first simplification from `v2.0.17` unchanged. It updates the generated-drift package script to use the renamed descriptor parity suite selector, preserving the same registry/projection/docs parity surface while matching the current test contract.
960
+
961
+ The release deliberately does not add commands, tools, runtime modes, state paths, package exports, dependencies, or behavior changes. It only restores hosted CI coverage for generated descriptor drift after the descriptor-suite rename.
962
+
963
+ Constraint: Fix the hosted CI lane without rewriting the already-pushed `v2.0.17` tag or weakening generated-drift coverage
964
+ Constraint: Preserve the `v2.0.17` Flow Core snapshot-first product contract unchanged
965
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this patch
966
+ Rejected: Force-move `v2.0.17` | the release workflow already succeeded and the tag was pushed, so fix-forward is safer and more auditable
967
+ Rejected: Remove descriptor parity from generated-drift checks | that would weaken the release gate that caught the stale test selector
968
+ Rejected: Keep the stale `descriptor family parity` selector | it no longer names the active descriptor parity suite and matched zero hosted tests
969
+ Confidence: high
970
+ Scope-risk: narrow
971
+ Reversibility: clean
972
+ Directive: Keep generated-drift selectors synchronized with descriptor parity suite names when the suite is renamed
973
+ Tested: `bun run check:generated-drift`; `bun run check`
974
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.18` before push
975
+
976
+ ## [2.0.17] - 2026-05-08
977
+
978
+ Make Flow Core snapshot-first and retire replay infrastructure
979
+
980
+ Flow 2.0.17 completes the current simplification pass by freezing the supported Flow Core vNext contract around runtime transitions, the session-engine persistence boundary, and snapshot-first active/stored/completed session state. The new Flow Core facade exposes compact command and query names without becoming a second state engine; transitions still own behavior, and the session engine still owns load -> transition -> save -> render synchronization.
981
+
982
+ The release replaces duplicated OpenCode descriptor/projection metadata with a smaller tool registry that keeps tool names, runtime bindings, mode visibility, descriptions, and docs metadata in one local surface. Generated projections and docs rows now derive from that smaller registry instead of preserving a broader duplicated descriptor family.
983
+
984
+ Replay/event/checkpoint/projection persistence is intentionally retired as a product-supported surface. The live product path was already snapshot-primary, so the release deletes the core workflow replay wrappers, event/checkpoint/projection stores, replay tests, and event-store benchmark while keeping runtime transition invariants, session history, rendered artifacts, and the new snapshot persistence gate. Historical release and investigation docs may still mention the retired replay architecture as historical evidence.
985
+
986
+ Strict review governance is narrowed to review/review-and-fix or explicit strict review modes. Ordinary implementation flows keep compact completion safety, while supplied final-review behavior evidence is still sanity-checked so approved/passing final reviews cannot carry `needs_fix`, unsafe refs, or validation refs that were not actually recorded.
987
+
988
+ The release deliberately does not add commands, tools, runtime modes, package exports, dependencies, or looser completion paths. It retires unsupported replay state surfaces in favor of the documented snapshot-first contract and keeps `zod` / `@opencode-ai/plugin` alignment unchanged.
989
+
990
+ Constraint: Preserve runtime transition authority and session-engine snapshot persistence while deleting duplicated replay/product metadata surfaces
991
+ Constraint: Retire `.flow/events`, `.flow/checkpoints`, and `.flow/projections` as supported product state paths without changing active/stored/completed session snapshots
992
+ Constraint: Keep command names, tool names, package exports, dependencies, and `zod` / `@opencode-ai/plugin` alignment stable
993
+ Rejected: Keep event/checkpoint/projection stores as dormant compatibility code | dead product surfaces would keep release gates and architecture shaped around unsupported replay behavior
994
+ Rejected: Treat the new Flow Core facade as a new state engine | it is only a command/query boundary over existing runtime application handlers and transitions
995
+ Rejected: Drop all behavior-evidence checks outside strict review mode | supplied invalid evidence must still fail even when strict completeness is optional
996
+ Confidence: high
997
+ Scope-risk: broad
998
+ Reversibility: moderate
999
+ Directive: Keep Flow Core vNext snapshot-first unless a future release deliberately reintroduces event-sourced persistence with migration, public state-path docs, and replay gates
1000
+ Tested: `bun test tests/runtime-tool-routing.test.ts tests/completion-gates.test.ts` (47 pass, 319 expect calls); `bun run typecheck`; touched-file Biome; `bun run check`
1001
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.17` before push
1002
+
1003
+ ## [2.0.16] - 2026-05-07
1004
+
1005
+ Harden review-scope recovery accounting before release
1006
+
1007
+ Flow 2.0.16 closes the release-readiness gaps in the review-scope accounting contract. Recovery examples now stay conservative: they list closed finding refs as candidates, but no longer assign every closed finding to every declared scope. Agents must map findings to the specific scope they actually prove before using `finding_closed`.
1008
+
1009
+ Completion and reviewer validation now route review-scope failures through structured failure kinds instead of substring matching. Worker completion recovery remains tied to worker evidence, while final-reviewer recovery now points at the recorded final reviewer decision when the reviewer approval ledger is the failing artifact.
1010
+
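As an illustration of routing on structured failure kinds rather than message substrings, a discriminated union is one way to keep the worker and final-reviewer recovery targets explicit. The kind names and fields below are hypothetical and are not taken from Flow's runtime.

```typescript
// Hypothetical failure kinds; the real runtime's names and fields may differ.
type ReviewScopeFailure =
  | { kind: "worker_ledger_invalid"; featureId: string }
  | { kind: "final_reviewer_ledger_invalid"; decisionRef: string };

// Pick the artifact that recovery guidance should point at.
// The switch is exhaustive over `kind`, so no error text is ever parsed.
function recoveryTarget(failure: ReviewScopeFailure): string {
  switch (failure.kind) {
    case "worker_ledger_invalid":
      return `worker evidence for feature ${failure.featureId}`;
    case "final_reviewer_ledger_invalid":
      return `recorded final reviewer decision ${failure.decisionRef}`;
  }
}
```

Rewording an error message then cannot silently change which recovery path is selected.
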
1011
+ Historical completed feature evidence is accepted only when its `reviewScopeLedger` is structurally valid and covers every declared scope for that feature. Recursive glob review targets also use standard globstar semantics, so `src/**/*.ts` grounds both `src/index.ts` and nested TypeScript paths without broadening unsupported bracket or brace glob syntax.
1012
+
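The globstar behavior described above can be approximated with a small glob-to-regex translation. This is a sketch under assumptions, not the matcher Flow ships: `globToRegExp` and `matchesGlob` are hypothetical names, and only `*`, `**`, and `?` are supported, with bracket and brace syntax rejected outright.

```typescript
// Translate a path glob into an anchored RegExp.
// Returns null for unsupported bracket/brace syntax instead of guessing.
function globToRegExp(glob: string): RegExp | null {
  if (/[\[\]{}]/.test(glob)) return null;
  let source = "";
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith("**/", i)) {
      source += "(?:[^/]+/)*"; // zero or more whole directories
      i += 3;
    } else if (glob.startsWith("**", i)) {
      source += ".*"; // trailing globstar: anything, including slashes
      i += 2;
    } else if (glob.charAt(i) === "*") {
      source += "[^/]*"; // stays within one path segment
      i += 1;
    } else if (glob.charAt(i) === "?") {
      source += "[^/]"; // exactly one non-separator character
      i += 1;
    } else {
      source += glob.charAt(i).replace(/[.+^$()|\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp(`^${source}$`);
}

// Unsupported patterns never match, so they cannot act as broad evidence.
function matchesGlob(glob: string, path: string): boolean {
  const re = globToRegExp(glob);
  return re !== null && re.test(path);
}
```

Because `**/` may match zero directories, `src/**/*.ts` covers `src/index.ts` as well as nested files, while a wrong extension such as `src/a.tsx` still fails.
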
1013
+ The release deliberately does not add commands, tools, runtime modes, state paths, package exports, dependencies, or looser completion paths. It narrows recovery guidance and historical evidence reuse while preserving the existing review/review-and-fix surface. The bundle sanity ceiling moves from 708 KiB to 716 KiB to account for the added release-critical recovery checks while preserving a fixed budget check.
1014
+
1015
+ Constraint: Preserve strict review and review-and-fix completion gates without changing persisted session shape
1016
+ Constraint: Keep recovery details machine-readable while avoiding automatic closed-finding-to-scope assignment
1017
+ Constraint: Accept only a narrow 8 KiB bundle budget increase for release-critical recovery/accounting checks
1018
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this patch
1019
+ Rejected: Populate every recovery example scope with all closed finding refs | that overstates which findings were actually mapped to each declared scope
1020
+ Rejected: Select review-scope recovery by matching error-message substrings | structured failure kinds are safer and keep worker vs final-reviewer recovery targets explicit
1021
+ Rejected: Let partial historical ledgers or one-directory-only globstar behavior satisfy final accounting | both would silently drop declared review scope evidence
1022
+ Confidence: high
1023
+ Scope-risk: moderate
1024
+ Reversibility: clean
1025
+ Directive: Keep review-scope ledgers scoped and evidence-grounded; historical completions can contribute only after every declared feature scope is accounted for
1026
+ Tested: `bun test tests/completion-gates.test.ts tests/runtime-tools.test.ts` (48 pass, 395 expect calls); `bun run typecheck`; targeted Biome; targeted completion/recovery gates and `bun run check` before tag push
1027
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.16` before push
1028
+
1029
+ ## [2.0.15] - 2026-05-07
1030
+
1031
+ Preserve review-and-fix closure obligations across planning refreshes
1032
+
1033
+ Flow 2.0.15 closes the follow-up release gap in the new `planning.reviewFindings` remediation contract. Active `review_and_fix` sessions can still refresh planning evidence, but they can no longer remove recorded findings through `record_planning_context` while the plan depends on those findings for completion. Final completion now keeps the original remediation obligation intact until every planned finding is closed with fix, test, validation, ledger, final-review, and reviewer-approval evidence.
1034
+
1035
+ The release also consolidates review-finding closure policy into a small runtime-domain helper instead of leaving closure ledger checks and planned-finding closure checks split across transition-local helpers. Completion transitions still own recovery routing and gate order, while the domain helper owns the closure-policy text and missing-ref calculation.
1036
+
1037
+ Final-review coverage gap accounting now treats whitespace-only `suggestedValidation` entries as missing. If a review context pack records coverage gaps, the final review must carry those gaps forward and provide real follow-up validation guidance rather than satisfying the contract with blank strings.
1038
+
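A minimal sketch of the whitespace check, assuming each coverage gap carries a `suggestedValidation` array of strings. The `CoverageGap` shape and the `gapsMissingValidation` helper are illustrative, not Flow's real schema.

```typescript
// Assumed shape: only `suggestedValidation` is named in the changelog.
interface CoverageGap {
  surface: string;
  suggestedValidation: string[];
}

// Return the gaps whose follow-up guidance is absent or blank.
function gapsMissingValidation(gaps: CoverageGap[]): CoverageGap[] {
  return gaps.filter(
    (gap) => !gap.suggestedValidation.some((entry) => entry.trim().length > 0),
  );
}
```

Trimming before the length check is what makes `"   "` count as missing rather than as guidance.
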
1039
+ The release deliberately does not add commands, tools, runtime modes, state paths, package exports, dependencies, or looser completion paths. It narrows the existing review-and-fix contract and keeps the review-first/remediation split from 2.0.14 intact.
1040
+
1041
+ Constraint: Preserve strict `review_and_fix` finding closure after planning context refreshes without changing persisted session shape
1042
+ Constraint: Keep completion/reviewer gate recovery behavior unchanged while moving closure-policy checks into a focused domain helper
1043
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this patch
1044
+ Rejected: Treat empty `planning.reviewFindings` refreshes as valid during active `review_and_fix` execution | this would erase the runtime-owned remediation baseline before final completion
1045
+ Rejected: Store a new persisted immutable findings baseline in this release | guarding the mutation ingress fixes the bypass without a migration or state-shape change
1046
+ Rejected: Let whitespace-only suggested validation satisfy coverage gaps | blank follow-up guidance weakens final-review evidence quality
1047
+ Confidence: high
1048
+ Scope-risk: moderate
1049
+ Reversibility: clean
1050
+ Directive: Do not remove `planning.reviewFindings` from an active `review_and_fix` session unless replanning out of remediation mode first; every planned finding must remain auditable through closure evidence before final completion
1051
+ Tested: `bun run typecheck`; `bun run lint`; `bun test tests/completion-gates.test.ts tests/runtime/evidence-packets.test.ts tests/runtime/final-review-contracts.test.ts` (63 pass, 405 expect calls); targeted release gates and `bun run check` before tag push
1052
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.15` before push
1053
+
1054
+ ## [2.0.14] - 2026-05-07
1055
+
1056
+ Route no-findings review-and-fix work through review-first discovery
1057
+
1058
+ Flow 2.0.14 fixes the review-and-fix quality regression where broad codebase review requests with no concrete findings could be planned as a single `review_and_fix` feature and then degrade into repeated completion-payload retries. Planning now has an explicit `planning.reviewFindings` context ledger for concrete existing review findings, and `review_and_fix` plan application fails fast when that ledger is empty.
1059
+
1060
+ No-findings review-and-fix requests now stay in `goalMode: review` for audit/discovery first. Once a review produces concrete findings, a remediation replan can use `goalMode: review_and_fix` with those findings recorded in `planning.reviewFindings`, preserving the strict finding-to-fix-to-validation chain.
1061
+
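The routing rule can be pictured as a fail-fast check at plan application. The sketch below is an assumption-laden illustration: `PlanRequest`, `checkPlanApplication`, and the `"implementation"` mode string are hypothetical, while `review`, `review_and_fix`, and `reviewFindings` come from the text above.

```typescript
// "implementation" is an assumed literal; the other two modes are named in the changelog.
type GoalMode = "implementation" | "review" | "review_and_fix";

interface PlanRequest {
  goalMode: GoalMode;
  reviewFindings: string[]; // stands in for planning.reviewFindings
}

type PlanCheck = { ok: true } | { ok: false; reason: string };

// Reject remediation plans that have no concrete findings to remediate.
function checkPlanApplication(plan: PlanRequest): PlanCheck {
  if (plan.goalMode === "review_and_fix" && plan.reviewFindings.length === 0) {
    return {
      ok: false,
      reason: "review_and_fix requires recorded findings; start with goalMode: review",
    };
  }
  return { ok: true };
}
```

A broad audit request with no findings would therefore be planned as `review` first and replanned as `review_and_fix` only once findings exist.
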
1062
+ The release deliberately keeps the existing completion gates strict. Real `review_and_fix` remediation still requires closure evidence, review-scope accounting, final-review evidence, and reviewer approval; this patch changes when remediation mode may start, not what it must prove before completion.
1063
+
1064
+ Prompt contracts, planner/auto/planning-researcher guidance, and prompt-mode calibration fixtures now mirror the runtime rule: no findings means review-first discovery, known findings means strict remediation. Regression coverage locks both paths, including inline-only `planning.reviewFindings` acceptance and audit-only no-findings calibration.
1065
+
1066
+ Constraint: Add only a narrow planning-context contract for concrete review findings; do not add commands, tools, runtime modes, state paths, package exports, or dependencies
1067
+ Constraint: Accept a narrow raw tool-schema budget increase for `planning.reviewFindings` while keeping the bundle sanity budget unchanged
1068
+ Constraint: Preserve strict `review_and_fix` completion gates for actual remediation with known findings
1069
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this patch
1070
+ Rejected: Loosen `reviewFindingClosures`, `reviewScopeLedger`, or final-review requirements for no-change completions | that would make shallow review-and-fix completion easier instead of forcing discovery first
1071
+ Rejected: Keep broad no-findings review-and-fix as a single remediation feature | it frames the agent around completion accounting before findings exist
1072
+ Rejected: Infer known findings from natural-language goals alone | `planning.reviewFindings` gives the runtime and prompts a concrete, auditable prerequisite
1073
+ Confidence: high
1074
+ Scope-risk: moderate
1075
+ Reversibility: clean
1076
+ Directive: Use `goalMode: review_and_fix` only after concrete findings are recorded in `planning.reviewFindings`; broad review-and-fix/codebase-review requests without findings must start as `goalMode: review`
1077
+ Tested: `bun test tests/config/prompt-contracts.test.ts tests/plan-graph-validation.test.ts tests/prompt-mode-behavior-eval.test.ts tests/prompt-mode-capture.test.ts tests/runtime/plan-and-tool-schema-contracts.test.ts tests/completion-gates.test.ts tests/runtime/evidence-packets.test.ts` (91 pass, 1262 expect calls); `bun test tests/config/tool-schemas.test.ts`; `bun run build` plus bundle sanity at 720706 bytes; `bun run typecheck`; `bun run lint`; Oracle review found no blockers and P2 follow-ups were applied; `bun run check`
1078
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.14` before push
1079
+
1080
+ ## [2.0.13] - 2026-05-07
1081
+
1082
+ Dedupe final-completion tool guidance in subagent prompts
1083
+
1084
+ Flow 2.0.13 narrows the worker and autonomous subagent prompt surfaces so final-completion tools appear as concrete calls only where they are operationally needed. The hard allowed-tool contract still lists `flow_review_record_final` and `flow_run_complete_feature`, and the workflow steps still name the exact persistence calls for the final-review and completion gates.
1085
+
1086
+ The surrounding policy fragments and role examples now refer to the canonical feature/final review-record runtime tool instead of repeating the same literal tool names. This preserves the final-completion path while reducing prompt noise that could make subagents treat the guidance as multiple independent obligations.
1087
+
1088
+ Regression coverage now counts rendered worker and auto prompt occurrences for `flow_review_record_final` and `flow_run_complete_feature`, keeping both bounded to two appearances per subagent prompt while preserving presence checks and mode-contract/tool-surface parity.
1089
+
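An occurrence-bound check of this kind might look like the following. `countOccurrences` and `withinBound` are hypothetical helpers, and the prompt text used with them would be whatever the renderer produces; only the two tool names and the two-appearance bound come from the changelog.

```typescript
// Count non-overlapping literal occurrences of `needle` in `text`.
function countOccurrences(text: string, needle: string): number {
  if (needle === "") return 0;
  return text.split(needle).length - 1;
}

// True when the tool name is present but appears no more than `max` times.
function withinBound(prompt: string, toolName: string, max: number): boolean {
  const count = countOccurrences(prompt, toolName);
  return count >= 1 && count <= max;
}
```

Checking presence and the upper bound together guards against both dropping a required call and reintroducing repeated mentions.
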
1090
+ Constraint: Reduce prompt repetition without renaming tools, adding tools, changing runtime state, or weakening final completion gates
1091
+ Constraint: Keep the mode contract as the public allowed-tool source of truth
1092
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this patch
1093
+ Rejected: Remove exact tool names from workflow steps | subagents still need precise final-review and completion persistence calls at the point of action
1094
+ Rejected: Keep all repeated exact names in fragments and role examples | redundant literal mentions increase prompt noise without adding contract clarity
1095
+ Rejected: Add runtime idempotency changes in this release | the issue addressed here is prompt duplication, not evidence of duplicate runtime registrations
1096
+ Confidence: high
1097
+ Scope-risk: narrow
1098
+ Reversibility: clean
1099
+ Directive: Keep exact final-completion tool names in allowed-tool contracts and concrete workflow actions; use canonical prose elsewhere unless a literal call is required
1100
+ Tested: rendered prompt occurrence check for worker/auto subagents; `bun test tests/config/prompt-contracts.test.ts tests/mode-contracts.test.ts`; `bun run typecheck`; `bun run lint`; Oracle review found no blocker; `bun run check`
1101
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.13` before push
1102
+
1103
+ ## [2.0.12] - 2026-05-07
1104
+
1105
+ Harden review-scope evidence grounding after false-negative review passes
1106
+
1107
+ Flow 2.0.12 closes the false-negative gaps found in the review-scope accounting gates. Review and review-and-fix completions now require each `reviewScopeLedger` entry to cite concrete artifact evidence grounded in the declared scope; validation commands can still supplement evidence, but they can no longer be the only proof for every target.
1108
+
1109
+ The release tightens scope matching without expanding the public tool surface. File targets accept line-suffixed artifact refs, glob targets use path-aware `*`, `**`, and `?` matching instead of prefix matching, and unsupported bracket/brace glob syntax is conservatively rejected rather than treated as broad evidence. Domain and surface targets now apply kind-aware grounding, while workflow/custom targets require explicit path-like targets instead of fuzzy substring matches.
1110
+
1111
+ Behavior-led final-review evidence was also narrowed so non-concrete declared scope labels such as `runtime` or wildcard targets cannot ground behavior refs. Regression coverage locks the previous misses: unrelated concrete paths for `domain:runtime`, nested/wrong-extension glob refs, validation-only ledger evidence, file line refs, and behavior refs grounded only by non-concrete scope labels.
1112
+
1113
+ The bundle sanity ceiling moves from 700 KiB to 704 KiB to account for the additional release-gate safety logic while preserving a fixed budget check.
1114
+
1115
+ Constraint: Fix review false negatives without adding commands, tools, state paths, package exports, or dependencies
1116
+ Constraint: Accept only a narrow 4 KiB bundle budget increase for the new safety checks
1117
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this patch
1118
+ Rejected: Let validation commands alone close every declared review scope | generic commands do not prove which scope was reviewed
1119
+ Rejected: Keep prefix-only glob matching | it accepts nested and wrong-extension files outside the declared pattern
1120
+ Rejected: Use non-concrete scope labels as behavior evidence | labels such as `runtime` are audit scope metadata, not artifact refs
1121
+ Confidence: high
1122
+ Scope-risk: moderate
1123
+ Reversibility: clean
1124
+ Directive: Treat review-scope ledger evidence as concrete scoped proof; use validation commands as supporting evidence, not as a substitute for grounded artifact refs
1125
+ Tested: `bun test tests/runtime/final-review-contracts.test.ts tests/completion-gates.test.ts` (49 pass, 338 expect calls); `bun test tests/config/tool-schemas.test.ts` (10 pass, 419 expect calls); `bun run typecheck`; `bun run lint`; `bun run check`
1126
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.12` before push
1127
+
1128
+ ## [2.0.11] - 2026-05-06
1129
+
1130
+ Require review-scope accounting before broad audit completion
1131
+
1132
+ Flow 2.0.11 hardens broad review and review-and-fix workflows with a runtime-owned review scope ledger. Review-shaped plans must now declare an effective scope through `reviewScope` or `fileTargets`. Final completion cannot reduce a full audit request to one closed finding unless every declared target is accounted for as reviewed with no findings, finding closed, deferred, out of scope, or blocked with evidence and residual risk.
1133
+
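The accounting rule can be sketched as a check that every declared target has a usable ledger entry. The entry shape below is hypothetical: `finding_closed`, `deferred`, `out_of_scope`, and `blocked` appear in this changelog, whereas `reviewed_no_findings`, `scopeId`, `evidence`, and `residualRisk` are illustrative names. Requiring a residual-risk note for the non-reviewed statuses is one reading of the rule, not a confirmed contract.

```typescript
type ScopeStatus =
  | "reviewed_no_findings" // assumed identifier for "reviewed with no findings"
  | "finding_closed"
  | "deferred"
  | "out_of_scope"
  | "blocked";

interface ScopeLedgerEntry {
  scopeId: string;
  status: ScopeStatus;
  evidence: string[];
  residualRisk?: string;
}

// Return declared scope ids that no valid ledger entry accounts for.
function unaccountedScopes(declared: string[], ledger: ScopeLedgerEntry[]): string[] {
  const accounted = new Set<string>();
  for (const entry of ledger) {
    const hasEvidence = entry.evidence.some((ref) => ref.trim() !== "");
    const needsRisk =
      entry.status === "deferred" ||
      entry.status === "out_of_scope" ||
      entry.status === "blocked";
    const hasRisk = (entry.residualRisk ?? "").trim() !== "";
    if (hasEvidence && (!needsRisk || hasRisk)) accounted.add(entry.scopeId);
  }
  return declared.filter((id) => !accounted.has(id));
}
```

Completion would be allowed only when this list is empty, so fixing one file cannot quietly stand in for the rest of the declared audit scope.
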
1134
+ The release keeps artifact-derived final-review coverage separate from audit-scope closure. `reviewScopeLedger` is carried through worker results, execution history, and final reviewer approvals, while implementation-mode one-file workflows remain valid without the new ledger. Historical completed feature closures can satisfy final review-and-fix scope where appropriate, but failed historical attempts cannot be cited as completion evidence.
1135
+
1136
+ The OpenCode adapter, descriptors, prompt contracts, recovery guidance, generated completion-gate projections, architecture notes, and prompt snapshots now surface the new scope-accounting contract. Regression coverage models broad one-file fixes, multi-feature historical closures, failed-attempt evidence rejection, plan scope requirements, effective scope-id collisions, and the preserved implementation-mode path.
1137
+
1138
+ Constraint: Add audit-scope completion accounting without requiring edits to every declared target file
1139
+ Constraint: Keep final-review `reviewedSurfaces` artifact-derived; do not overload it into a whole-audit ledger
1140
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this patch
1141
+ Rejected: Treat broad review completion as mutation-count coverage | legitimate audits may fix one file while still reviewing or deferring the rest of the declared scope
1142
+ Rejected: Infer audit breadth from natural-language goals at completion time | structured `reviewScope` / `fileTargets` gives the runtime an auditable source of truth
1143
+ Rejected: Let failed historical attempts satisfy final reviewer `finding_closed` scope entries | rejected attempts can contain unsupported closure refs and must not become completion evidence
1144
+ Confidence: high
1145
+ Scope-risk: moderate
1146
+ Reversibility: clean
1147
+ Directive: For `review` and `review_and_fix` plans, declare scope explicitly and close it with `reviewScopeLedger`; use `deferred`, `out_of_scope`, or `blocked` for honest residual-risk accounting rather than narrowing silently
1148
+ Tested: `bun run lint`; `bun run typecheck`; `bun test` (543 pass, 0 fail, 1 snapshot, 17064 expect calls); Oracle review follow-ups fixed and revalidated with targeted completion, final-review, prompt, plan, schema, protocol, recovery, and snapshot suites
1149
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.11` before push
1150
+
1151
+ ## [2.0.10] - 2026-05-06
1152
+
1153
+ Require behavior-grounded final review approvals
1154
+
1155
+ Flow 2.0.10 hardens final-review approval from surface accounting into behavior-grounded evidence. Live final-review and `flow_review_record_final` inputs now require explicit `evidenceRefs`, while persisted sessions keep their backward-compatible parse path. Reviewer-decision normalization preserves behavior checks and validation coverage, so approval-time validation can fail fast before shallow approvals are recorded.
1156
+
1157
+ The release adds a required behavior-risk ledger for async ordering, lifecycle reentrancy, state rollback, persistence recovery, interaction, accessibility, and test-oracle authenticity risks. Runtime-derived required risks must be passed or gap-recorded with grounded refs and validation coverage, and behavior refs are normalized against safe repo-relative evidence. Source-only multi-domain app changes now trigger behavior accounting. Audit reports cross-check behavior validation refs against `validationRun` while still allowing grounded `needs_fix` findings.
1158
+
1159
+ The release also records the soft-focus final-review miss investigation and removes redundant Knip ignore configuration after deadcode diagnostics proved the ignored type-contract files no longer need explicit suppression.
1160
+
1161
+ Constraint: Harden final-review approval contracts without adding commands, tools, state paths, package exports, or dependency versions
1162
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this patch
1163
+ Rejected: Let persisted-session compatibility defaults leak into live `flow_review_record_final` args | omitted evidence refs could parse as present and weaken final-review recording
1164
+ Rejected: Allow runtime-derived required behavior risks to use prose-only `not_applicable` | shallow approvals could bypass async/lifecycle/state accounting without proof or gap records
1165
+ Rejected: Reuse final-approval `needs_fix` rejection unchanged for audit reports | audits must be able to report grounded behavior findings that still need fixes
1166
+ Confidence: high
1167
+ Scope-risk: moderate
1168
+ Reversibility: clean
1169
+ Directive: Treat final-review approval as behavior-ledger validation, not just surface coverage; keep required risks grounded in changed artifacts, context packs, tests, or validation evidence before approving final work
1170
+ Tested: `bun test tests/runtime/final-review-contracts.test.ts tests/completion-gates.test.ts tests/config/tool-schemas.test.ts tests/runtime/evidence-packets.test.ts` (56 pass, 700 expect calls); `bun run typecheck`; `bun run deadcode`; `bun run check:fresh-surfaces`; `bun test tests/docs-stale-reference-policy.test.ts` (3 pass); `bun run check`
1171
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.10` before push
1172
+
1173
+ ## [2.0.9] - 2026-05-06
1174
+
1175
+ Refresh planning evidence packets without preserving stale context
1176
+
1177
+ Flow 2.0.9 turns planning context evidence into an explicit durable packet ledger while keeping the workflow surface stable. Planning, execution, review, and final-review schemas can now carry source-backed evidence packets for selected context, exclusions, relationship hypotheses, ambiguity notes, covered findings, and validation evidence, and runtime planning context merges those packets through a shared domain helper instead of duplicating merge behavior across transitions.
1178
+
1179
+ The release also closes the review risks found during hardening. Same-id evidence packets now refresh wholesale so replans can retract stale source refs or selected/excluded context instead of unioning obsolete evidence forever. Prompt guidance is split between runtime-owner and read-only roles, so planning researcher and reviewer prompts return evidence for a planner/coordinator/runtime owner to persist rather than telling read-only roles to call planning runtime tools. Tool schema budgets were tightened around the measured evidence-packet growth so future unrelated schema bloat still fails fast.
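The refresh-versus-union distinction can be sketched in a few lines. The packet fields and the `mergeEvidencePackets` name below are assumptions for illustration only, not the shared domain helper itself; the point is that an incoming packet with an existing id replaces the stored packet whole, so retracted refs do not survive a replan.

```typescript
// Hypothetical packet shape; only the "same id replaces wholesale" rule comes from the notes above.
interface EvidencePacket {
  id: string;
  sourceRefs: string[];
  selectedContext: string[];
  excludedContext: string[];
}

function mergeEvidencePackets(
  existing: EvidencePacket[],
  incoming: EvidencePacket[],
): EvidencePacket[] {
  const byId = new Map<string, EvidencePacket>();
  for (const packet of existing) byId.set(packet.id, packet);
  // A same-id packet overwrites the old one entirely: no array union, so stale refs disappear.
  for (const packet of incoming) byId.set(packet.id, packet);
  return Array.from(byId.values());
}
```

Additive evidence would use a new packet id, which lands as a separate entry instead of being folded into an older packet.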
1180
+
1181
+ Constraint: Add source-backed planning/review evidence packets without adding commands, tools, state paths, package exports, or dependency versions
1182
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this patch
1183
+ Rejected: Preserve same-id packet arrays by unioning old and new context | stale refs and selected/excluded context would survive replans and weaken evidence accuracy
1184
+ Rejected: Reuse one prompt fragment for runtime owners and read-only roles | it gives reviewers/researchers contradictory persistence instructions
1185
+ Rejected: Leave broad raw-schema ceilings after evidence-packet growth | oversized budgets hide unrelated future tool-schema drift
1186
+ Confidence: high
1187
+ Scope-risk: moderate
1188
+ Reversibility: clean
1189
+ Directive: Treat same-id evidence packets as refreshes, not append logs; use new packet ids for additive evidence and keep runtime-tool persistence instructions out of read-only prompt surfaces
1190
+ Tested: `bun run typecheck`; `bun run lint`; `bun test tests/runtime/evidence-packets.test.ts tests/config/tool-schemas.test.ts tests/config/prompt-contracts.test.ts tests/runtime-hooks.test.ts tests/runtime/workflow-core-reducer.test.ts` (54 pass, 950 expect calls); `bun run check`
1191
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.9` before push
1192
+
1193
+ ## [2.0.8] - 2026-05-06
1194
+
1195
+ Ground final review coverage in canonical evidence
1196
+
1197
+ Flow 2.0.8 hardens final-review coverage by introducing a typed `reviewContextPack` for changed files, connected context, relationship edges, validation evidence, suggested validation, and coverage gaps. Runtime normalization now carries that pack through final-review and worker-completion paths, while prompt, audit, schema, and capture fixtures teach reviewers that changed files are the review seed rather than the review boundary.
1198
+
1199
+ The release also closes the main trust-boundary risk in that new evidence ledger. Review surfaces such as tests, release, operator, tooling, docs, shared surfaces, and integration points now need grounded canonical path or relationship evidence instead of self-reported labels. Validation evidence must match actual worker validation commands, and the OpenCode tool surface rejects empty or unknown top-level `reviewContextPack` payloads while keeping the raw schema compact enough for the existing size budget.
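A minimal sketch of the two checks named above, assuming camelCase key names derived from the listed pack contents (`changedFiles`, `connectedContext`, and so on). The actual key names, the raw schema, and the runtime parser are not shown in this changelog, so treat every identifier here as hypothetical.

```typescript
// Assumed top-level keys, derived from the prose description of the pack.
const KNOWN_PACK_KEYS: readonly string[] = [
  "changedFiles",
  "connectedContext",
  "relationshipEdges",
  "validationEvidence",
  "suggestedValidation",
  "coverageGaps",
];

// Compact top-level check: reject empty payloads and unknown keys, leave nested validation to a stricter parser.
function checkPackTopLevel(pack: Record<string, unknown>): string[] {
  const keys = Object.keys(pack);
  if (keys.length === 0) return ["reviewContextPack must not be empty"];
  return keys
    .filter((key) => KNOWN_PACK_KEYS.indexOf(key) === -1)
    .map((key) => `unknown top-level key: ${key}`);
}

// Validation evidence counts only if the worker actually ran the claimed command.
function validationIsGrounded(claimed: string[], workerCommands: string[]): boolean {
  const ran = new Set(workerCommands);
  return claimed.every((command) => ran.has(command));
}
```

Splitting the shallow shape check from nested validation mirrors the trade-off described in the entry: the tool-facing schema stays small while strict parsing happens later.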
1200
+
1201
+ Constraint: Improve final-review context discovery without adding new commands, tools, state paths, package exports, or dependency versions
1202
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this patch
1203
+ Rejected: Let reviewer-supplied `includedContext.surface` or `reason` satisfy concrete review surfaces | self-attested labels can spoof coverage and weaken final-review gates
1204
+ Rejected: Reuse the full runtime `reviewContextPack` schema directly in OpenCode raw tool args | it exceeds the tool schema size budget, so runtime strict parsing owns nested validation while the compact raw schema pins top-level shape
1205
+ Confidence: high
1206
+ Scope-risk: moderate
1207
+ Reversibility: clean
1208
+ Directive: Treat `reviewContextPack` as a grounded evidence ledger: labels may describe context, but coverage gates must derive concrete surfaces from canonical paths, relationships, and validation commands
1209
+ Tested: `bun test tests/runtime/final-review-contracts.test.ts tests/config/tool-schemas.test.ts` (18 pass, 444 expect calls); `bun run typecheck`; `bun run lint`; `bun run check` (520 pass in full suite, completion/replay gates, build, release hygiene, pack invariants, lint, bench smoke, and bench gate passed); Oracle review found no blockers and P2 follow-ups were applied
1210
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.8` before push
1211
+
1212
+ ## [2.0.7] - 2026-05-05
1213
+
1214
+ Keep broad review-and-fix planning evidence-first
1215
+
1216
+ Flow 2.0.7 adds a dedicated read-only `flow-planning-researcher` subagent so broad goals such as full codebase review followed by fixes can gather repository profile, package-manager, stack, standards, and validation evidence before the runtime plan is finalized. The planner and autonomous coordinator now route broad review-and-fix/codebase-review requests through that research surface when findings do not yet exist, preserving the distinction between planning evidence, review findings, and execution fixes.
1217
+
1218
+ The release deliberately keeps findings out of the planning phase. The researcher may recommend a review-first decomposition and evidence packet, but it must not invent defects, claim closure evidence, mutate `.flow`, call Flow runtime tools, or edit repository files. Permission and prompt-mode contracts now make the new agent read-only, while config and capture/eval tests prove the handoff path stays bounded.
1219
+
1220
+ Constraint: Improve full-codebase review-and-fix planning without letting the planning phase invent findings or bypass runtime-owned review/execution gates
1221
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this patch
1222
+ Rejected: Teach `flow-planner` to both research and speculate fixes for broad review requests | that collapses audit discovery into planning and encourages unsupported findings
1223
+ Rejected: Add a new command, tool, state path, or runtime mode | a bounded read-only subagent and prompt/permission contracts solve the workflow gap with less public surface churn
1224
+ Confidence: high
1225
+ Scope-risk: moderate
1226
+ Reversibility: clean
1227
+ Directive: Keep broad review-and-fix goals review-first: planning may preserve evidence and recommend audit scope, but findings and closure evidence belong only to review/execution records
1228
+ Tested: `bun run lint`; `bun run typecheck`; `bun test tests/prompt-mode-behavior-eval.test.ts tests/prompt-mode-capture.test.ts tests/config/prompt-contracts.test.ts tests/config/plugin-surface.test.ts tests/mode-contracts.test.ts` (58 pass); `bun run check:generated-drift`; `bun run eval:prompt-capture:check`; `bun run check` (514 pass, bundle sanity 6 agents / 9 commands / 18 tools, bench gate passed)
1229
+ Not-tested: Live OpenCode Flow session routing through `flow-planning-researcher`; live GitHub-hosted CI/release workflow runs for tag `v2.0.7` before push
1230
+
1231
+ ## [2.0.6] - 2026-05-05
1232
+
1233
+ Enforce strict checkpoint integrity and replay durability over legacy compatibility
1234
+
1235
+ Flow 2.0.6 hardens workflow persistence by binding checkpoints to event-log prefixes, validating explicit replay resume offsets, and fsyncing event-log directories after append operations. The release also clarifies post-rename durability error semantics in session writes, keeps strict review-input contracts at the tool boundary, and adds larger replay benchmark coverage plus targeted regression tests.
1236
+
1237
+ This release intentionally drops backward compatibility for deprecated checkpoint artifacts that lack integrity metadata. Legacy checkpoints are treated as invalid by design so replay safety does not depend on stale format tolerance.
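The sketch below shows one way a checkpoint could be bound to an event-log prefix and how a resume offset might be validated. Only the `eventPrefixHash` field name appears in these notes; `eventCount`, the function names, and the hash are assumptions. FNV-1a is used purely as a dependency-free stand-in, and a real implementation would presumably use a cryptographic hash.

```typescript
// Non-cryptographic stand-in hash (FNV-1a, 32-bit) so the sketch has no dependencies.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

interface Checkpoint {
  eventCount: number; // hypothetical: how many log lines the checkpoint covers
  eventPrefixHash?: string; // hash of exactly those lines; absent on legacy checkpoints
}

function prefixHash(eventLines: string[], count: number): string {
  return fnv1a(eventLines.slice(0, count).join("\n"));
}

// Returns the offset to resume replay from, or throws if the checkpoint cannot be trusted.
function resumeOffset(checkpoint: Checkpoint, eventLines: string[]): number {
  if (checkpoint.eventPrefixHash === undefined) {
    throw new Error("legacy checkpoint without integrity metadata");
  }
  const count = checkpoint.eventCount;
  if (!Number.isInteger(count) || count < 0 || count > eventLines.length) {
    throw new Error("resume offset is outside the event log");
  }
  if (prefixHash(eventLines, count) !== checkpoint.eventPrefixHash) {
    throw new Error("checkpoint does not match the event-log prefix");
  }
  return count;
}
```

Under these assumptions a checkpoint with no hash is rejected outright rather than accepted through a fallback, which matches the forward-only stance of this release.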
1238
+
1239
+ Constraint: Prioritize replay/checkpoint integrity and durability guarantees over legacy checkpoint compatibility
1240
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this patch
1241
+ Rejected: Accept legacy checkpoints that lack `eventPrefixHash` through a compatibility fallback | deprecated state formats undermine deterministic replay integrity
1242
+ Confidence: high
1243
+ Scope-risk: moderate
1244
+ Reversibility: clean
1245
+ Directive: Treat persistence schema hardening as forward-only when it enforces integrity contracts; document intentional incompatibilities in release notes
1246
+ Tested: `bun test tests/runtime/workflow-persistence.test.ts tests/replay/replay-persistence-gate.test.ts tests/atomic-writes.test.ts tests/runtime/workspace-cache.test.ts tests/runtime/workflow-core-reducer.test.ts tests/runtime/semantic-invariants.test.ts` (40 pass); `bun x tsc --noEmit`
1247
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.6` before push
1248
+
1249
+ ## [2.0.5] - 2026-05-05
1250
+
1251
+ Simplify governance surfaces by removing stale audit artifacts
1252
+
1253
+ Flow 2.0.5 is a docs-only consolidation release focused on reducing governance drift and maintenance overhead. The oversized deep-audit investigation artifact was removed, the runtime complexity baseline doc was renamed for clearer intent, and gate-matrix wording was corrected so generated-drift coverage is described accurately.
1254
+
1255
+ This release does not change runtime behavior, commands, tool schemas, state paths, or dependencies. It narrows the maintenance surface so contract documentation stays executable and easier to keep in sync.
1256
+
1257
+ Constraint: Keep release scope docs-only with no runtime/tool/dependency changes
1258
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency or SDK-boundary changes in this patch
1259
+ Rejected: Keep large point-in-time audit logs in-repo as active governance artifacts | they add drift risk and duplicate canonical contract surfaces
1260
+ Confidence: high
1261
+ Scope-risk: narrow
1262
+ Reversibility: clean
1263
+ Directive: Keep one canonical runtime complexity baseline surface and avoid reintroducing long-lived duplicated audit artifacts
1264
+ Tested: `bun run check`
1265
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.5` before push
1266
+
1267
+ ## [2.0.4] - 2026-05-04
1268
+
1269
+ Harden stored-session parking after missing `.flow/stored` recovery
1270
+
1271
+ Flow 2.0.4 fixes a portability risk in session parking during `flow_plan_start` and `flow_session_activate` by ensuring the stored root exists before rename, rather than pre-creating the destination leaf directory. This keeps the missing-`.flow/stored` recovery path robust across filesystems where rename-to-existing-directory behavior is stricter.
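The parking fix can be pictured with Node's standard `fs` API: create the parent of the destination, then rename, and never create the destination leaf first. The `parkSessionDir` helper and the `.flow/active` path are hypothetical names for this sketch; only `.flow/stored` appears in these notes.

```typescript
import { existsSync, mkdirSync, mkdtempSync, renameSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical helper: move a session directory under the stored root.
function parkSessionDir(activeDir: string, storedLeafDir: string): void {
  // Ensure only the parent root exists; `recursive: true` is a no-op when it already does.
  mkdirSync(dirname(storedLeafDir), { recursive: true });
  // The destination leaf must not be pre-created: some platforms refuse to rename onto an existing directory.
  renameSync(activeDir, storedLeafDir);
}

// Usage in a throwaway workspace where `.flow/stored` is missing.
const workspace = mkdtempSync(join(tmpdir(), "flow-sketch-"));
const active = join(workspace, ".flow", "active", "session-1"); // hypothetical layout
const stored = join(workspace, ".flow", "stored", "session-1");
mkdirSync(active, { recursive: true });
parkSessionDir(active, stored);
console.log(existsSync(stored), existsSync(active));
```

Re-ensuring the parent on every call costs one extra system call but removes the dependence on cached directory state, which appears to be the trade-off the release accepts.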
1272
+
1273
+ The release also updates runtime coverage for the missing-stored-root paths and adjusts workspace mkdir-caching expectations to match the new explicit re-ensure behavior. Prompt-mode capture fixtures are refreshed for providerless metadata wording without changing runtime behavior.
1274
+
1275
+ Constraint: Keep runtime behavior stable while fixing cross-platform rename safety when `.flow/stored` is recreated
1276
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency or tool-schema expansion in this patch
1277
+ Rejected: Keep creating `getStoredSessionDir(...)` before rename | destination-leaf precreation can fail rename on stricter filesystem semantics
1278
+ Confidence: high
1279
+ Scope-risk: narrow
1280
+ Reversibility: clean
1281
+ Directive: For directory move safety, create parent roots before rename and avoid precreating rename destination leaves
1282
+ Tested: `bun test` with 509 passing tests; `bun run check` (typecheck, prompt capture checks, dependency contract, architecture seams enforce, deadcode, build, release hygiene, pack invariants, completion-lane gate, replay gates, cold-start budget, full tests, lint, bench smoke, bench gate)
1283
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.4` before push
1284
+
1285
+ ## [2.0.3] - 2026-05-04
1286
+
1287
+ Enforce architecture seams and continue runtime simplification
1288
+
1289
+ Flow 2.0.3 hardens architecture boundaries and continues runtime decomposition while keeping the public Flow command/tool surface stable. CI now enforces blocked cross-layer imports, runtime/session responsibilities are split into clearer lifecycle/recovery/rendering seams, and simplification metrics are recorded so maintainers can track complexity reduction without guessing.
1290
+
1291
+ The release keeps dependency alignment unchanged (`zod` remains aligned with `@opencode-ai/plugin`) and avoids user-facing expansion. This pass is focused on maintainability and guardrails: stronger seam checks, clearer ownership docs, and runtime simplification follow-through with explicit verification.
1292
+
1293
+ Constraint: Keep behavior stable while reducing coupling and making architecture drift visible in CI
1294
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; no dependency-version changes in this simplification pass
1295
+ Rejected: Keep seam checks report-only after violations reached zero | hard enforcement is required to prevent regression
1296
+ Confidence: high
1297
+ Scope-risk: moderate
1298
+ Reversibility: clean
1299
+ Directive: Maintain seam enforcement in CI and route new cross-layer shared contracts through seam-safe workflow/runtime boundaries
1300
+ Tested: `bun run check` (typecheck, dependency contract, architecture seams enforce, fresh-surface policy, deadcode, build, release hygiene, pack invariants, completion lane, replay gates, cold-start budget, full tests, lint, bench smoke, bench gate); `bun test` with 507 passing tests
1301
+ Not-tested: Live GitHub-hosted CI/release workflow runs for tag `v2.0.3` before push
1302
+
1303
+ ## [2.0.2] - 2026-05-04
1304
+
1305
+ Make workflow contracts descriptor-driven and evidence-aware
1306
+
1307
+ Flow 2.0.2 turns the new workflow surface hardening into explicit descriptor and evidence contracts. OpenCode tool metadata now flows through descriptor families that project host tool order, docs rows, prompt visibility, schema ownership, runtime bindings, and verification anchors from one reviewable source instead of relying on parallel hand-maintained lists.
1308
+
1309
+ Completion gates now have runtime-owned descriptor projections for feature, final, and review-and-fix paths. The generated guidance ties recovery kinds, required artifacts, predicate owners, and architecture documentation back to the same completion-gate table, while tests prove descriptor parity against runtime recovery metadata, docs output, and semantic invariant order.
1310
+
1311
+ Standalone review and audit surfaces now carry optional evidence packets for selected context, exclusions, ambiguity, validation notes, and already-covered findings. The review command, audit contract, schemas, snapshots, and evidence-packet tests keep those packets as read-only support metadata for coverage ledgers and findings rather than a replacement for concrete file evidence.
1312
+
1313
+ The release deliberately keeps the public command/tool names, package entrypoint, state paths, dependency versions, and OpenCode plugin SDK/Zod alignment unchanged. The descriptor split adds internal generated surfaces, but the transition-module budget, prompt snapshot, parity tests, and docs contracts now make that expansion explicit and guarded.
1314
+
1315
+ Constraint: Improve maintainer confidence in tool, completion, and review contracts without adding user-facing commands, tools, state paths, dependencies, or package exports
1316
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; this release changes schemas and docs contracts without changing dependency versions
1317
+ Rejected: Keep descriptor data duplicated across generated projections, docs, prompt guidance, and tests | drift was the primary risk, so one projected contract is easier to review and verify
1318
+ Rejected: Treat evidence packets as replacement review ledgers | coverage and findings still need concrete evidence; packets only preserve boundaries, exclusions, ambiguity, and validation context
1319
+ Rejected: Hide the extra transition modules by weakening maintainability checks silently | the module budget now records the intentional completion-gate/projection split instead of allowing unexplained growth
1320
+ Confidence: high
1321
+ Scope-risk: moderate
1322
+ Reversibility: clean
1323
+ Directive: When changing tool descriptors, completion gates, or review evidence packets, update the generated projections/snapshots and run the descriptor, completion-gate, evidence-packet, prompt-contract, dependency, and full release gates before tagging
1324
+ Tested: `bun run typecheck`; `bun test tests/transitions-consolidation.test.ts tests/prompt-snapshot.test.ts tests/runtime/evidence-packets.test.ts tests/config/prompt-contracts.test.ts`; `bun test` with 501 passing tests; `bun run check` before tag
1325
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v2.0.2` before push
1326
+
1327
+ ## [2.0.1] - 2026-05-03
1328
+
1329
+ Make standards research guidance prompt-driven and safer
1330
+
1331
+ Flow 2.0.1 simplifies standards guidance around the actual agent workflow: planning and runtime prompts now tell agents to use available MCP research tools first, especially Ref for official documentation and Exa for current ecosystem research, with generic websearch/webfetch only as fallback. This replaces the more complicated detector-oriented approach with prompt-level guidance that fits how agents already decide which tools are available.
1332
+
1333
+ The standards profile still records local stack evidence, local guidance, config-derived rules, and research gaps, but the scanner now recognizes OpenCode Plugin SDK and Zod surfaces so planning can ask for relevant official-doc and ecosystem research when those tools appear in the repository. Cached planning context and compaction output include standards summaries without re-scanning every hook path.
1334
+
1335
+ The release hardens the standards-profile cache and parser boundaries found during review. Dynamic cached standards snippets are now quoted and framed as generated evidence rather than executable instructions. External standards cache expiry applies to official/external-priority rules as well as recorded external sources, and JSONC config parsing no longer accepts token-spliced or unterminated block-comment inputs as valid standards evidence.
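To make the two JSONC failure modes concrete, here is a small comment stripper written for this note, not the parser in Flow. It throws on an unterminated block comment and replaces each comment with a space, so a comment cannot splice two token fragments into one (for example `tr/**/ue` must not become `true`). Trailing commas are deliberately not handled.

```typescript
// Illustrative JSONC comment stripper; function names are hypothetical.
function stripJsoncComments(text: string): string {
  let out = "";
  let i = 0;
  while (i < text.length) {
    const ch = text.charAt(i);
    const next = text.charAt(i + 1);
    if (ch === '"') {
      // Copy string literals verbatim so "//" inside a string is not treated as a comment.
      const start = i;
      i++;
      while (i < text.length && text.charAt(i) !== '"') {
        i += text.charAt(i) === "\\" ? 2 : 1;
      }
      if (i >= text.length) throw new Error("unterminated string literal");
      i++; // consume the closing quote
      out += text.slice(start, i);
    } else if (ch === "/" && next === "/") {
      while (i < text.length && text.charAt(i) !== "\n") i++;
      out += " "; // keep a separator where the comment was
    } else if (ch === "/" && next === "*") {
      const end = text.indexOf("*/", i + 2);
      if (end === -1) throw new Error("unterminated block comment");
      out += " "; // a space prevents token splicing across the comment
      i = end + 2;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

function parseJsonc(text: string): unknown {
  return JSON.parse(stripJsoncComments(text));
}
```

Because the comment becomes whitespace, `tr/**/ue` turns into `tr ue`, which `JSON.parse` rejects instead of reading it as the literal `true`.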
1336
+
1337
+ No commands, tools, state paths, runtime modes, dependency versions, or public package entrypoints changed. The release deliberately keeps external research as a prompt instruction instead of adding MCP-server detection or tool availability plumbing to the standards scanner.
1338
+
1339
+ Constraint: Improve standards research behavior without adding new commands, tools, state paths, dependencies, or MCP-server detection machinery
1340
+ Constraint: Keep local repo guidance ahead of official docs, and official docs ahead of broader Exa/websearch synthesis
1341
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin`; this release changes guidance and detector evidence only, not dependency versions
1342
+ Rejected: Detect installed MCP servers in the standards scanner | prompt guidance is enough, avoids stale availability state, and lets agents use the tools actually exposed in their session
1343
+ Rejected: Store live research results automatically during repository scanning | the scanner should identify stack evidence and research gaps, not perform side-effectful or network-dependent research during local cache refresh
1344
+ Rejected: Trust cached standards snippets as prompt text | cached profile values can originate from repository-controlled files and must remain quoted/generated evidence
1345
+ Confidence: high
1346
+ Scope-risk: moderate
1347
+ Reversibility: clean
1348
+ Directive: Keep standards research prompt-driven unless a future design adds explicit, tested MCP availability contracts; never inject raw repo-derived standards text into system or compaction context without quoting and generated-evidence framing
1349
+ Tested: `bun test tests/runtime-hooks.test.ts tests/stack-standards-profile.test.ts`; `bun run typecheck`; `bun run lint`; `bun run eval:prompt-capture:check`; `bun run eval:review-capture:check`; Oracle review found an unterminated block-comment parser gap, fixed with regression coverage; `bun run check` end-to-end with 479 passing tests, replay gates, lint, build, release hygiene, pack invariants, cold-start budget, bundle sanity, bench smoke, and bench gate
1350
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v2.0.1` before push
1351
+
1352
+ ## [2.0.0] - 2026-05-03
1353
+
1354
+ Rebuild Flow around a fresh workflow core
1355
+
1356
+ Flow 2.0.0 turns the ground-up rewrite investigation into the new architecture. Flow is now organized around a deterministic workflow core with action/event contracts, append-only workflow event logs, replay checkpoints, projection rendering, generated role protocols, generated OpenCode adapter projections, and replay/property release gates.
1357
+
1358
+ The OpenCode integration is now a host adapter rather than the semantic center. Core workflow modules own action decisions, events, reducers, invariant mappings, policy facades, and role protocols. OpenCode plugin registration, config injection, tool registration, tool guidance, and SDK raw-shape concerns live under `src/adapters/opencode/**`, with tests guarding that the core does not import the adapter or `@opencode-ai/plugin`.
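A boundary test of the kind mentioned above might look roughly like the following sketch. It scans source text for import specifiers with a simple regular expression and flags core files that reach into the adapter or `@opencode-ai/plugin`. The `src/core/` file paths, the function names, and the regex-based scanning are assumptions; the real guard may well use a proper module graph.

```typescript
// Specifiers a core module must not import (taken from the paragraph above).
const BLOCKED_FOR_CORE: RegExp[] = [
  /^@opencode-ai\/plugin$/,
  /(^|\/)adapters\/opencode(\/|$)/,
];

// Rough extraction of `from "x"`, `import "x"`, and `import("x")` specifiers.
function importSpecifiers(source: string): string[] {
  const pattern = /(?:from|import)\s*\(?\s*["']([^"']+)["']/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const spec = match[1];
    if (spec !== undefined) found.push(spec);
  }
  return found;
}

// Maps file path to source text and returns "file -> specifier" violations.
function coreSeamViolations(coreFiles: Record<string, string>): string[] {
  const violations: string[] = [];
  for (const file of Object.keys(coreFiles)) {
    for (const spec of importSpecifiers(coreFiles[file] ?? "")) {
      if (BLOCKED_FOR_CORE.some((rule) => rule.test(spec))) {
        violations.push(`${file} -> ${spec}`);
      }
    }
  }
  return violations;
}
```

A regex scan can miss unusual import forms, so this is best read as the shape of the check rather than a complete detector.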
1359
+
1360
+ Persistence now records workflow evolution through `.flow/events/<session-id>.jsonl`, `.flow/checkpoints/<session-id>.json`, and `.flow/projections/<session-id>/`. Session JSON and readable session docs remain user-visible runtime artifacts, while replay/checkpoint/projection tests prove the event model. The release also adds event-store benchmarks and wires replay/fresh-surface gates into `bun run check`.
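The event-log and checkpoint model can be illustrated with a pure reducer. The event types, the state shape, and the function names below are invented for this sketch; only the idea of replaying JSONL events and resuming from a checkpoint comes from the paragraph above. The property worth noticing is that resuming from a checkpoint yields the same state as a full replay.

```typescript
// Hypothetical events and state; the real Flow contracts are not shown here.
type WorkflowEvent =
  | { type: "plan_approved" }
  | { type: "feature_completed"; featureId: string };

interface WorkflowState {
  planApproved: boolean;
  completed: string[];
}

const initialState: WorkflowState = { planApproved: false, completed: [] };

// Pure reducer: returns a new state and never mutates its input.
function reduceEvent(state: WorkflowState, event: WorkflowEvent): WorkflowState {
  switch (event.type) {
    case "plan_approved":
      return { ...state, planApproved: true };
    case "feature_completed":
      return { ...state, completed: state.completed.concat(event.featureId) };
  }
}

// One JSON object per line, as in a `.jsonl` event log.
function parseEventLog(jsonl: string): WorkflowEvent[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as WorkflowEvent);
}

// Replay from the start, or from a checkpointed state at a known offset.
function replay(
  events: WorkflowEvent[],
  from: WorkflowState = initialState,
  offset = 0,
): WorkflowState {
  let state = from;
  for (const event of events.slice(offset)) state = reduceEvent(state, event);
  return state;
}
```

Keeping the reducer pure is what makes a checkpoint safe to reuse: the checkpointed state plus the remaining events fully determine the result.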
1361
+
1362
+ This major release intentionally removes old transport and internal file surfaces. JSON-string tool transport fields and nested worker payload forms are rejected. Root tool barrels and root tool-guidance indirections were deleted. Active docs, prompts, and tests now point at the fresh adapter/core/persistence/protocol paths, and `check:fresh-surfaces` guards active surfaces against stale terminology and deleted-path drift.
1363
+
1364
+ The README was reorganized for end users: `/flow-auto <goal>` is now the clear default path, maintainer details moved under Contributing, and state-on-disk documentation describes events, checkpoints, projections, locks, and workspace-local state without asking users to understand internals first.
1365
+
1366
+ Constraint: Ship the fresh-start architecture as a major release instead of preserving old internal surfaces
1367
+ Constraint: Keep the public package API root-only while allowing internal files, state implementation, and tool transport details to change freely
1368
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin` and verify tool argument compatibility after removing string transport fields
1369
+ Rejected: Preserve root `src/tools/**` barrels as convenience imports | fresh architecture should not keep internal file surfaces alive only for old callers
1370
+ Rejected: Keep JSON-string tool transport fields | direct schema-first raw object arguments are simpler, stricter, and covered by adapter boundary tests
1371
+ Rejected: Keep snapshot-compatibility checkpoints | append-only events plus replay checkpoints are the new persistence model and are guarded by replay tests
1372
+ Rejected: Leave release checks unchanged | the rewrite needs permanent replay and fresh-surface guards so old patterns do not creep back
1373
+ Confidence: high
1374
+ Scope-risk: broad
1375
+ Reversibility: messy
1376
+ Directive: Do not reintroduce root tool barrels, string transport fields, snapshot-compatibility persistence, or prompt/docs policy duplication without a reviewed replacement decision and release-note rationale
1377
+ Tested: `bun run check:fresh-surfaces`; `bun run deadcode`; `bun run lint`; `bun run typecheck`; `bun test` with 474 passing tests; `bun run test:replay`; `bun run bench:smoke`; `bun run bench:gate`; `bun run check` end-to-end; dependency contract confirmed `zod=4.1.8` aligned with the OpenCode plugin SDK effective zod
1378
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v2.0.0` before push
1379
+
1380
+ ## [1.0.63] - 2026-05-03
1381
+
1382
+ Preserve rich review packet boundaries
1383
+
1384
+ Flow 1.0.63 turns the Dual-flow review investigation into a reusable standalone review-prompt contract. `/flow-review` now treats rich user review packets as structured input instead of loose prose, preserving selected context, exclusions, relationship hypotheses, ambiguities, known exclusions, already-covered findings, evidence requirements, and done-when criteria before deriving findings.
1385
+
1386
+ The providerless review-capture harness now models those packet boundaries directly. Capture scenarios can include a `reviewPacket`, generated prompt packets render explicit `<review-packet>` sections, generated capture templates carry packet expectations, and offline scoring can fail captures that ignore selected-context limits or count excluded surfaces as directly reviewed.
1387
+
1388
+ The release strengthens prompt-quality eval coverage without expanding runtime schemas, command names, tool names, state paths, dependencies, or public plugin entrypoints. The existing structured review ledger remains the output contract; packet semantics are layered in front of it through prompt wording, capture fixtures, prompt snapshots, and behavior-eval scoring.
1389
+
1390
+ Constraint: Improve standalone review input quality without weakening existing output-ledger validation or runtime final-review gates
1391
+ Constraint: Keep packet semantics prompt/capture/eval scoped; do not add runtime schemas, commands, tools, state paths, dependencies, or public package surface
1392
+ Rejected: Copy the Dual-flow prompt verbatim | the useful pattern is first-class packet boundaries, not project-specific Phaser/DOM lifecycle wording
1393
+ Rejected: Expand `ReviewReportSchema` for selected-context or exclusion fields in this release | existing `discoveredSurfaces`, `coverageNotes`, findings, and next steps can carry the evidence while evals prove the prompt-level contract first
1394
+ Rejected: Score packet preservation only in static prompt snapshots | generated providerless captures also need packet expectations so manual model outputs can be scored against the boundary contract
1395
+ Confidence: high
1396
+ Scope-risk: moderate
1397
+ Reversibility: clean
1398
+ Directive: When changing `/flow-review` or review-capture prompts, preserve packet-boundary handling and keep exclusions/ambiguities as coverage/process evidence unless direct code evidence proves a defect
1399
+ Tested: Targeted prompt/capture/eval/snapshot tests; `bun run eval:review-capture:check`; `bun run typecheck`; `bun run lint`; Oracle review found capture-scoring and exclusion-violation gaps, both fixed; `bun run check`
1400
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.63` before push
1401
+
1402
+ ## [1.0.62] - 2026-05-03
1403
+
1404
+ Make Flow prompts more concise without weakening gates
1405
+
1406
+ Flow 1.0.62 trims prompt wording after the GPT-5.5 prompting review while keeping Flow's runtime contracts unchanged. Planner, worker, reviewer, auto, and command examples now use shorter wording where the prior text repeated the same intent, so the prompt surface stays closer to a compact work contract instead of accumulating ritual phrasing.
1407
+
1408
+ The standalone audit prompt now shares one read-only boundary rule between the command and auditor agent surfaces. That boundary is phrased around state mutation rather than tool-name prohibition, so `/flow-review` remains clearly read-only while still allowing the deterministic `flow_review_render` report renderer. The release also keeps the review snapshot and prompt-contract assertion aligned with the clarified boundary.
1409
+
1410
+ No commands, tools, state paths, runtime modes, schemas, package dependencies, or public plugin entrypoints changed. This is a prompt-expression cleanup only: runtime policy and transitions remain the source of truth, and the existing prompt/eval harness continues to guard mode boundaries, untrusted argument handling, review/final completion gates, evidence calibration, and release hygiene.
1411
+
1412
+ Constraint: Improve prompt concision without changing Flow runtime semantics, command/tool names, state paths, schemas, dependencies, or public package surface
1413
+ Constraint: Keep `/flow-review` read-only while preserving the `flow_review_render` renderer path
1414
+ Rejected: Rewrite the prompt stack around a new template shape | the existing structured sections and eval corpus already encode important workflow contracts, so a broad rewrite would add unnecessary regression risk
1415
+ Rejected: Remove safety/review/finalization rules to save tokens | those gates are release-critical and protected by prompt-mode contracts and behavior evals
1416
+ Rejected: Add live provider eval automation in this release | useful, but separate from the requested concise prompt cleanup and not needed to ship this wording-only release
1417
+ Confidence: high
1418
+ Scope-risk: narrow
1419
+ Reversibility: clean
1420
+ Directive: When making future prompt-concision edits, delete duplicate phrasing before weakening gates; keep shared audit boundary wording compatible with `flow_review_render`
1421
+ Tested: Fresh implementation subagent; targeted prompt/eval tests; Oracle review found no blocking findings and one wording risk, fixed; `bun run typecheck`; `bun run check`
1422
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.62` before push
1423
+
1424
+ ## [1.0.61] - 2026-05-03
1425
+
1426
+ Make Flow reviews challenge adversarial failure modes
1427
+
1428
+ Flow 1.0.61 turns the missed-review lesson into a general review-quality contract instead of memorizing a project-specific bug. Standalone `/flow-review` audits and the `flow-reviewer` approval gate now require reviewers to select applicable adversarial failure-mode classes before calling behavior clean: lifecycle/reentrancy/idempotency, async race and event ordering, persistence failure and recovery, interaction geometry and hit-testing, accessibility semantics and live regions, and test-oracle authenticity.
1429
+
1430
+ The audit contract now asks reviewers to record checked classes or meaningful gaps in the existing coverage ledger, findings, or next steps. Test-surface review must also say whether the evidence exercises a normal product path rather than only a shortcut setup. The reviewer contract uses the existing summary, integration/regression checks, blocking findings, follow-ups, and suggested validation fields, so this strengthens review behavior without adding commands, tools, state paths, package dependencies, or a new result schema.
1431
+
1432
+ The release also strengthens the prompt-quality harness. The offline behavior rubric adds `failure_modes_accounted`, raises the structured review capture threshold to 9/9, adds a regression case for otherwise polished reviews that omit failure-mode accounting, and adds a providerless capture scenario focused on adversarial failure-mode coverage. Prompt snapshots and prompt-eval snippets lock the new wording across auditor, command, reviewer, and contract surfaces.
1433
+
1434
+ Constraint: Improve review quality generally without hardcoding the PracticeScene incident or adding new public Flow commands, tools, state paths, dependencies, or result-schema fields
1435
+ Constraint: Keep failure-mode checks applicable and evidence-based so reviewers record checked paths or gaps without inventing irrelevant findings
1436
+ Rejected: Add a narrow checklist for the reported Phaser/DOM control issues | it would train Flow on one incident instead of reusable review reasoning
1437
+ Rejected: Add new structured reviewer fields for failure-mode checks | existing summary, integrationChecks, regressionChecks, blockingFindings, followUps, suggestedValidation, coverageNotes, findings, and nextSteps can carry the evidence without schema churn
1438
+ Rejected: Treat validation success as enough review evidence | the missed issues were lifecycle, interaction, accessibility, and test-oracle reasoning gaps that passing happy-path checks can miss
1439
+ Confidence: high
1440
+ Scope-risk: moderate
1441
+ Reversibility: clean
1442
+ Directive: When changing review prompts or behavior evals, keep adversarial failure-mode accounting broad, applicable, and evidence-backed; do not collapse it into a project-specific checklist
1443
+ Tested: `bun run typecheck`; `bun run eval:review-capture:check`; `bun run eval:prompt-capture:check`; targeted prompt/eval tests; changed-file and package-scope Biome checks; `bun run check`
1444
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.61` before push
1445
+
1446
+ ## [1.0.60] - 2026-05-03
1447
+
1448
+ Close review findings with runtime evidence
1449
+
1450
+ Flow 1.0.60 turns the review-remediation lesson from the Soft Focus investigation into a runtime and prompt contract. Worker results can now carry a `reviewFindingClosures` ledger that maps each reviewed finding to fix references, test references, validation commands, and residual risk. Flow persists that ledger into execution history and renders it in feature docs so review-fix claims are inspectable after the run instead of disappearing into generic decisions.
1451
+
1452
+ The completion gate now enforces the ledger for `review_and_fix` sessions. Successful review-fix completion requires closure evidence, requires every closure to be `closed`, requires closed findings to name fix/test/validation evidence, and rejects validation references that were not recorded in the current `validationRun`. Prompt contracts and behavior evals now train workers to produce the ledger and reviewers to reject missing or unsupported closure claims.
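The completion rules above can be sketched as a small validator. This is a minimal illustration, not Flow's implementation: only `reviewFindingClosures`, `validationRun`, the `review_and_fix` goal mode, and the closure status names appear in this changelog. Every other field name, the `implementation` goal mode, and the function name are hypothetical, and the sketch assumes a closed finding must name all three evidence kinds.

```typescript
// Hypothetical shapes for illustration; field names other than
// `reviewFindingClosures` and `validationRun` are assumptions.
type ClosureStatus = "closed" | "partially_closed" | "blocked";

interface FindingClosure {
  findingId: string;
  status: ClosureStatus;
  fixRefs: string[];
  testRefs: string[];
  validationCommands: string[];
  residualRisk?: string;
}

interface WorkerResult {
  goalMode: "implementation" | "review_and_fix";
  reviewFindingClosures?: FindingClosure[];
  validationRun: { command: string }[];
}

// Returns a list of gate violations; an empty list means completion may proceed.
function checkReviewFixCompletion(result: WorkerResult): string[] {
  // The ledger is additive for ordinary sessions and enforced only for review-fix.
  if (result.goalMode !== "review_and_fix") return [];

  const closures = result.reviewFindingClosures ?? [];
  if (closures.length === 0) return ["missing closure evidence"];

  const recorded = new Set(result.validationRun.map((v) => v.command));
  const errors: string[] = [];

  for (const c of closures) {
    if (c.status !== "closed") {
      errors.push(`${c.findingId}: status is ${c.status}, expected closed`);
      continue;
    }
    // Assumption: all three evidence kinds are required for a closed finding.
    if (c.fixRefs.length === 0 || c.testRefs.length === 0 || c.validationCommands.length === 0) {
      errors.push(`${c.findingId}: closed without fix/test/validation evidence`);
    }
    for (const cmd of c.validationCommands) {
      if (!recorded.has(cmd)) {
        errors.push(`${c.findingId}: validation "${cmd}" not in current validationRun`);
      }
    }
  }
  return errors;
}
```

Returning a list of violations rather than a boolean keeps each rejected closure inspectable, which matches the stated goal of making review-fix claims auditable after the run.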
1453
+
1454
+ The release also clarifies the operator boundary around parked sessions and standalone review. History/show responses label stored non-completed sessions as parked/inactive and warn that direct work outside Flow will not update runtime state, reviewer records, validation records, or completion artifacts. The README now documents that `/flow-review` is read-only and that direct Codex/RepoPrompt follow-up fixes bypass Flow runtime records unless remediation proceeds through Flow execution gates. A superseded complexity-reduction investigation note was removed so the current docs do not keep stale maintenance guidance alongside the newer review-remediation contract.
1455
+
1456
+ Constraint: Preserve existing command names, tool names, state paths, package API, and dependency versions while adding review-fix evidence accounting
1457
+ Constraint: Keep `reviewFindingClosures` additive for ordinary implementation sessions and enforce it only where `goalMode` is `review_and_fix`
1458
+ Rejected: Infer exact original-finding coverage from the latest reviewer projection | reviewer decisions can be overwritten by later approval, so exact all-finding matching needs a first-class original-finding store
1459
+ Rejected: Allow `partially_closed` or `blocked` closure entries on `status: ok` review-fix completion | successful completion should mean every listed finding is closed; unresolved entries belong in `needs_input` or continued work
1460
+ Rejected: Treat stale parked session docs as proof of runtime corruption | the actionable fix is clearer parked-session UX and bypass-boundary documentation
1461
+ Confidence: high
1462
+ Scope-risk: moderate
1463
+ Reversibility: clean
1464
+ Directive: When changing review remediation, keep the closure ledger tied to code/test/validation evidence and do not let `review_and_fix` completion pass with unresolved findings
1465
+ Tested: `bun run lint`; `bun run typecheck`; `bun run eval:prompt-capture:check`; targeted runtime, prompt, and operator-history tests; Oracle review found one review-fix closure-status gap and minor assertion/parked-flag improvements, all fixed; `bun run test -- --timeout 30000` with 445 passing tests; `bun run build`; `bun run check`
1466
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.60` before push
1467
+
1468
+ ## [1.0.59] - 2026-05-02
1469
+
1470
+ Ground Flow planning in cached stack standards evidence
1471
+
1472
+ Flow 1.0.59 makes planning context more explicit before execution by recording a runtime-owned stack and standards profile alongside the existing repo profile, package-manager detection, research, and decision logs. Planning now captures local stack evidence, local guidance, standards precedence, and bounded official-doc research gaps so agents can prefer repository rules and existing package scripts before reaching for external assumptions.
1473
+
1474
+ The cached profile is reused outside active Flow sessions without making every hook invocation rescan the workspace. The cache is fingerprinted against relevant local evidence, keyed by workspace/start-directory/package-manager context, and external guidance expires after 30 days. The cache writer now stores only the strict stack/standards profile payload, so records written by `flow_plan_context_record` and `flow_plan_apply` remain readable by the strict cache parser.
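The cache behavior described above might look roughly like the following. It is a sketch under assumptions: the type names, field names, and `lookupProfile` are hypothetical, and the real cache is persisted rather than held in a `Map`. It shows the three properties the entry names: a key built from workspace, start directory, and package manager; a fingerprint comparison that is skipped entirely when no entry exists; and a 30-day expiry that applies only to external guidance.

```typescript
// Hypothetical names throughout; only the keying dimensions, the fingerprint
// check, and the 30-day expiry come from the changelog entry.
interface CacheContext {
  workspaceRoot: string;
  startDirectory: string;
  packageManager: string;
}

interface CachedProfile {
  evidenceFingerprint: string;
  externalGuidanceFetchedAt: number; // epoch milliseconds
}

const EXTERNAL_GUIDANCE_TTL_MS = 30 * 24 * 60 * 60 * 1000;

function cacheKey(ctx: CacheContext): string {
  // JSON array encoding avoids collisions between path separators and delimiters.
  return JSON.stringify([ctx.workspaceRoot, ctx.startDirectory, ctx.packageManager]);
}

function lookupProfile(
  cache: Map<string, CachedProfile>,
  ctx: CacheContext,
  computeFingerprint: () => string,
  now: number,
): { profile: CachedProfile; externalGuidanceExpired: boolean } | undefined {
  const entry = cache.get(cacheKey(ctx));
  // No-cache path stays cheap: never fingerprint when there is nothing to compare.
  if (!entry) return undefined;
  // Local evidence changed, so the cached profile no longer applies.
  if (entry.evidenceFingerprint !== computeFingerprint()) return undefined;
  return {
    profile: entry,
    externalGuidanceExpired: now - entry.externalGuidanceFetchedAt >= EXTERNAL_GUIDANCE_TTL_MS,
  };
}
```

Passing the fingerprint as a callback is one way to honor the rejected alternative recorded below: the workspace scan runs only after a cache entry is known to exist.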
1475
+
1476
+ The release also keeps the OpenCode plugin surface compatible while simplifying read-only agent restrictions: read-only Flow agents now rely on permission-only restrictions instead of the deprecated boolean `tools` config, and tests lock the tool schema, plugin surface, runtime hooks, package-manager detection, and cache read/write contracts.
1477
+
1478
+ Constraint: Preserve Flow command names, tool names, state paths, dependency versions, and root package API while enriching planning context
1479
+ Constraint: Keep `zod` aligned with `@opencode-ai/plugin` and preserve direct SDK arg-shape compatibility
1480
+ Rejected: Let planning tools cache the whole planning object | strict cache reads reject unrelated planning keys and would silently ignore the profile
1481
+ Rejected: Recompute stack/standards fingerprints before checking cache existence | no-cache hook paths should stay cheap
1482
+ Rejected: Keep duplicate workspace-boundary traversal helpers | package detection and profile detection need one containment rule to avoid drift
1483
+ Confidence: high
1484
+ Scope-risk: moderate
1485
+ Reversibility: clean
1486
+ Directive: When changing stack/standards profiling, keep cache writes strict, cache reads cheap on the no-cache path, and local repo guidance ahead of official or web guidance
1487
+ Tested: `bun test tests/auto-prepare.test.ts tests/stack-standards-profile.test.ts tests/package-manager-detection.test.ts tests/runtime-hooks.test.ts`; `bun run typecheck`; `bunx biome check src/runtime/application/package-manager.ts src/runtime/application/stack-standards-profile.ts src/runtime/application/workspace-boundaries.ts src/tools/runtime-tools/planning-tools.ts tests/auto-prepare.test.ts`; Oracle review of scoped fix found no P0/P1/P2 findings; `bun run check` including typecheck, prompt capture checks, dependency contract, deadcode, build, release hygiene, pack invariants, completion-lane gate, cold-start budget, bundle sanity, full test suite, lint, and bench smoke
1488
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.59` before push
1489
+
1490
+ ## [1.0.58] - 2026-05-02
1491
+
1492
+ Preserve observability while enforcing release hygiene
1493
+
1494
+ Flow 1.0.58 narrows the no-console guidance so cleanup does not silently delete meaningful operator or diagnostic signals. Workflow prompts now tell agents to inspect existing logging, telemetry, and CLI-output patterns before changing `console.*`, classify each occurrence, remove only temporary debug noise, and replace intentional observability with the repo's existing logger, telemetry API, injected logger, or explicit stdout/stderr stream writes while preserving severity, message intent, and key context.
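The classify-before-edit rule can be modeled as a small decision function. This sketch is illustrative only: Flow expresses the rule as prompt guidance, not as code, and every name here (`ConsoleOccurrence`, `applyCleanup`, the two classification labels) is hypothetical. It shows temporary debug noise being dropped while intentional diagnostics are routed to an existing sink with severity, message, and key context preserved.

```typescript
// Hypothetical model of the decision rule; not part of Flow's source.
type Severity = "info" | "warn" | "error";
type ConsoleUse = "temporary_debug" | "intentional_observability";

interface ConsoleOccurrence {
  use: ConsoleUse;
  severity: Severity;
  message: string;
  context: Record<string, string>;
}

// Stands in for the repo's existing logger, telemetry API, or stream writer.
type Sink = (line: string) => void;

function formatDiagnostic(o: ConsoleOccurrence): string {
  const context = Object.entries(o.context)
    .map(([key, value]) => `${key}=${value}`)
    .join(" ");
  const head = `[${o.severity}] ${o.message}`;
  return context ? `${head} ${context}` : head;
}

// Returns how many occurrences were removed as temporary debug noise.
function applyCleanup(occurrences: ConsoleOccurrence[], sink: Sink): number {
  let removed = 0;
  for (const o of occurrences) {
    if (o.use === "temporary_debug") {
      removed += 1; // safe to delete: carries no operator signal
      continue;
    }
    sink(formatDiagnostic(o)); // replaced, not deleted
  }
  return removed;
}
```

In a CLI, the sink could be an explicit stderr write; in a host repository with its own logger, the sink would wrap that logger instead of introducing a new dependency.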
1495
+
1496
+ The release also makes the guard reviewable. Reviewer contracts now reject release-bound debug artifacts, deleted observability without an equivalent replacement, and newly invented logging or telemetry dependencies unless that dependency was explicitly approved. Maintainer docs and release-hygiene failures explain the same decision tree, while prompt-contract tests and behavior evals lock worker and reviewer regressions for deleted observability and unapproved dependency invention.
1497
+
1498
+ Constraint: Preserve Flow's existing command/tool/state surface while improving prompt guidance for observability-safe console cleanup
1499
+ Constraint: Keep release hygiene focused on debug artifacts, not on reducing production operator diagnostics
1500
+ Rejected: Keep a delete-only no-console rule | it can lower observability by removing intentional failure and operator signals
1501
+ Rejected: Add a logging or telemetry dependency | the right replacement depends on each host repo's existing facilities and no dependency was requested
1502
+ Rejected: Weaken the release-hygiene scanner | raw console/debugger artifacts should still fail release-bound checks; the fix belongs in guidance and review contracts
1503
+ Confidence: high
1504
+ Scope-risk: narrow
1505
+ Reversibility: clean
1506
+ Directive: When changing console/release-hygiene guidance, preserve the classify-before-edit rule and keep worker/reviewer behavior evals covering both deleted observability and unapproved dependency invention
1507
+ Tested: `bun test tests/config/prompt-contracts.test.ts tests/prompt-mode-behavior-eval.test.ts`; `bun run lint`; Oracle review of diff snapshot `2026-05-02/1905` found the remaining test-update and dependency-invention gaps, both fixed; `bun run check` including typecheck, prompt capture checks, dependency contract, deadcode, build, release hygiene, pack invariants, completion-lane gate, cold-start budget, bundle sanity, 431 tests across 78 files, lint, and bench smoke
1508
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.58` before push
1509
+
1510
+ ## [1.0.57] - 2026-05-02
1511
+
1512
+ Seal Flow review handoffs at prompt and permission boundaries
1513
+
1514
+ Flow 1.0.57 hardens the read-only review and fresh-context handoff contracts without adding commands, tools, state paths, dependencies, or public package surface. The autonomous and worker prompts now describe bounded Task-tool handoffs to the planner, worker, and reviewer roles, while the config keeps those handoffs narrow: `flow-auto` may delegate only to the Flow role agents it coordinates, `flow-worker` may delegate only to `flow-reviewer`, and read-only agents explicitly deny Task delegation.
1515
+
1516
+ The standalone `/flow-review` prompt now binds its structured ledger to the renderer transport shape directly. It tells the model to call `flow_review_render` with `{ reviewJson: JSON.stringify(ledger), view }` and clarifies that `reviewJson` must be the actual serialized JSON string, not a nested object or the literal pseudo-code text. Prompt contract tests and the committed review snapshot lock that wording so future prompt edits do not reopen the full-codebase review instability.
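The transport shape can be illustrated with a short sketch. The `ReviewLedger` fields, the `"summary"` view value, and both helper functions are hypothetical; only the `{ reviewJson: JSON.stringify(ledger), view }` argument shape comes from this entry. The checker shows why the two failure cases named above (a nested object and the literal pseudo-code text) are rejected.

```typescript
// Hypothetical ledger shape; the real ledger schema is not shown in this changelog.
interface ReviewLedger {
  summary: string;
  findings: string[];
}

interface RenderArgs {
  reviewJson: string;
  view: string;
}

// Correct: serialize the ledger object into an actual JSON string.
function buildRenderArgs(ledger: ReviewLedger, view: string): RenderArgs {
  return { reviewJson: JSON.stringify(ledger), view };
}

// A strict check of the transport boundary, as a renderer might apply it.
function isValidTransport(args: { reviewJson: unknown }): boolean {
  if (typeof args.reviewJson !== "string") return false; // nested object
  try {
    const parsed: unknown = JSON.parse(args.reviewJson);
    return typeof parsed === "object" && parsed !== null;
  } catch {
    return false; // not JSON at all, e.g. the literal pseudo-code text
  }
}
```

For example, `buildRenderArgs({ summary: "ok", findings: [] }, "summary")` yields a `reviewJson` string that parses back to the ledger, whereas passing the ledger object itself or the text `"JSON.stringify(ledger)"` fails the check.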
1517
+
1518
+ Constraint: Preserve Flow's existing command/tool/state surface while making fresh-context handoffs and standalone review rendering deterministic
1519
+ Constraint: Keep read-only agents read-only across edit, bash, and Task/subagent boundaries
1520
+ Rejected: Add a new review transport compatibility path | strict renderer input is deterministic and the bug was missing model-facing prompt guidance
1521
+ Rejected: Let read-only agents omit `permission.task` | OpenCode defaults are permissive enough that read-only boundaries should be explicit
1522
+ Rejected: Document the renderer transport in README | it is an internal prompt/tool contract, not a user-facing command behavior change
1523
+ Confidence: high
1524
+ Scope-risk: narrow
1525
+ Reversibility: clean
1526
+ Directive: When changing Flow role handoffs or JSON-renderer prompts, update config permissions, prompt contracts, and snapshots together; do not rely on omitted permissions or implicit model inference at transport boundaries
1527
+ Tested: `bun test tests/config/plugin-surface.test.ts tests/config/prompt-contracts.test.ts tests/mode-contracts.test.ts tests/prompt-snapshot.test.ts`; `bun run typecheck`; `bun run lint`; Oracle review found no blocker/regression findings after the edits; `bun run check` including typecheck, prompt capture checks, dependency contract, deadcode, build, release hygiene, pack invariants, completion-lane gate, cold-start budget, bundle sanity, 431 tests across 78 files, lint, and bench smoke
1528
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.57` before push
1529
+
1530
+ ## [1.0.56] - 2026-05-02
1531
+
1532
+ Make Flow uninstall clear the canonical plugin slot
1533
+
1534
+ Flow 1.0.56 fix-forwards the install UX repair by making uninstall follow the same canonical-slot contract. The release `uninstall.sh` and `bun run uninstall:opencode` now remove `~/.config/opencode/plugins/flow.js` whenever it exists, regardless of whether the file carries the Flow ownership header.
1535
+
1536
+ The previous ownership-marker refusal was too conservative for the actual user path: it could leave users stuck with a stale or incompatible `flow.js` after they explicitly asked Flow to uninstall. Uninstall now creates no backup file and issues no manual-delete instruction; the command owns exactly one canonical plugin filename and simply clears it.
1537
+
1538
+ Constraint: The user-facing uninstall command must resolve a blocked canonical `flow.js` slot without requiring manual filesystem cleanup
1539
+ Constraint: Keep the fix narrow to the OpenCode plugin file; do not alter Flow runtime behavior, commands, tools, state paths, prompt contracts, dependencies, or package surface
1540
+ Rejected: Keep ownership-marker refusal | it preserves theoretical safety while failing the practical uninstall use case
1541
+ Rejected: Move unowned files to backup | it adds surprising filesystem behavior and leaves another artifact for users to reason about
1542
+ Rejected: Add confirmation or force flags | the command is already explicit and extra surface would complicate a single-path cleanup contract
1543
+ Confidence: high
1544
+ Scope-risk: narrow
1545
+ Reversibility: clean
1546
+ Directive: Treat install and uninstall as authoritative for the canonical `flow.js` plugin slot; do not reintroduce marker-based refusal unless the plugin supports multiple install targets
1547
+ Tested: `bun test tests/install.test.ts tests/cross-area/install-lifecycle.test.ts`; `bun run check` including typecheck, eval captures, dependency contract, deadcode, build, release hygiene, pack invariants, completion-lane gate, cold-start budget, bundle sanity, 429 tests across 78 files, lint, and bench smoke
1548
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.56` before push
1549
+
1550
+ ## [1.0.55] - 2026-05-02
1551
+
1552
+ Make Flow install overwrite stale plugin files
1553
+
1554
+ Flow 1.0.55 fixes an install-time UX failure where an existing `flow.js` at the canonical OpenCode plugin path blocked installation unless the user manually deleted it first. Installing Flow is now authoritative for that target path: both `bun run install:opencode` and the release `install.sh` overwrite an existing `flow.js` and stamp the replacement with the Flow ownership header.
1555
+
1556
+ The uninstall safety boundary stays intact. `uninstall` still refuses to remove files that do not carry the Flow ownership marker, so installation is convenient while destructive removal remains ownership-gated. No commands, tools, runtime workflow semantics, state paths, prompt contracts, dependencies, or public plugin surface changed.
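The install/uninstall asymmetry in 1.0.55 can be sketched as two pure decision functions. The marker text and all function names are hypothetical, since the changelog does not show the actual ownership header or script internals. Note that 1.0.56, listed above, later replaced the marker-gated uninstall with unconditional removal.

```typescript
// Hypothetical marker; the real ownership header text is not shown in this changelog.
const OWNERSHIP_MARKER = "// flow-owned-plugin";

function stampBundle(bundle: string): string {
  return `${OWNERSHIP_MARKER}\n${bundle}`;
}

function hasOwnershipMarker(contents: string): boolean {
  return contents.startsWith(OWNERSHIP_MARKER);
}

// Install is authoritative: whatever currently exists at the target path is
// ignored, and the stamped replacement is always written.
function planInstall(_existing: string | undefined, bundle: string): { write: string } {
  return { write: stampBundle(bundle) };
}

type UninstallDecision = "nothing_to_remove" | "remove" | "refuse_unowned";

// 1.0.55 rule: uninstall removes only files that carry the ownership marker.
function planUninstall(existing: string | undefined): UninstallDecision {
  if (existing === undefined) return "nothing_to_remove";
  return hasOwnershipMarker(existing) ? "remove" : "refuse_unowned";
}
```

Because install always stamps the file it writes, reinstalling over an unowned `flow.js` also makes that path removable by a later uninstall.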
1557
+
1558
+ Constraint: The canonical OpenCode plugin filename is `flow.js`, so install must be able to replace stale or incompatible files at that path without manual cleanup
1559
+ Constraint: Preserve uninstall ownership protection while making install idempotent and user-friendly
1560
+ Rejected: Keep refusing unowned `flow.js` files | it blocks the expected install path and forces users into manual file deletion
1561
+ Rejected: Add a new force flag | explicit Flow install is already the user intent, and another option would add surface area for a patch UX fix
1562
+ Rejected: Weaken uninstall ownership checks | install and uninstall have different risk profiles; deleting unowned files should remain protected
1563
+ Confidence: high
1564
+ Scope-risk: narrow
1565
+ Reversibility: clean
1566
+ Directive: Treat install as authoritative for the canonical `flow.js` target, but do not let uninstall remove unowned plugin files without an explicit future contract change
1567
+ Tested: `bun test tests/install.test.ts tests/cross-area/install-lifecycle.test.ts`; `bun run check` including typecheck, eval captures, dependency contract, deadcode, build, release hygiene, pack invariants, completion-lane gate, cold-start budget, bundle sanity, 429 tests across 78 files, lint, and bench smoke
1568
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.55` before push
1569
+
1570
+ ## [1.0.54] - 2026-05-02
1571
+
1572
+ Lower maintainer risk without changing runtime behavior
1573
+
1574
+ Flow 1.0.54 completes the current maintainability-risk pass by making the active contract map and runtime test layout easier to audit. The release keeps `docs/maintainer-contract.md` and `docs/contributor-map.md` as the current-facing source of truth for commands, tools, state paths, invariants, surface-freeze rules, and required checks.
1575
+
1576
+ The runtime test suite now carries less review gravity. Operator history/session lifecycle coverage, session persistence/rendering coverage, execution-history rendering, replanning behavior, actionable needs-input metadata, and tool persistence all live in focused behavior-named suites instead of broad catch-all files. The assertions were moved rather than relaxed, and no runtime source, command names, tool schemas, state paths, dependency versions, or public plugin surface changed.
1577
+
1578
+ Constraint: Address the framework-complexity review by reducing contributor comprehension risk, not by rewriting runtime architecture
1579
+ Constraint: Preserve Flow runtime behavior, command names, tool schemas, state paths, dependencies, package surface, and completion/reviewer gates
1580
+ Rejected: Flatten or rewrite the runtime | the review risk was maintainability and contract drift, not broken workflow semantics
1581
+ Rejected: Delete or weaken contract tests | coverage is a project strength; the problem was review load per file
1582
+ Rejected: Add more current-facing documentation layers | the maintainer contract and contributor map should remain the short current truth instead of creating parallel sources
1583
+ Confidence: high
1584
+ Scope-risk: narrow
1585
+ Reversibility: clean
1586
+ Directive: Keep future runtime and config coverage organized by behavioral concern, and do not expand commands, tools, prompt contracts, state paths, or runtime modes unless the release records the retirement/replacement tradeoff
1587
+ Tested: `bun test tests/runtime.test.ts tests/runtime-execution-history.test.ts tests/runtime-replanning.test.ts tests/runtime-actionable-metadata.test.ts tests/runtime-tool-persistence.test.ts tests/docs-stale-reference-policy.test.ts tests/docs-semantic-parity.test.ts`; `bun run lint`; `bun run typecheck`; `bun run check` including eval captures, dependency contract, deadcode, build, release hygiene, pack invariants, completion-lane gate, cold-start budget, bundle sanity, 429 tests across 78 files, lint, and bench smoke; Oracle review of diff snapshots `2026-05-02/1527` and `2026-05-02/1540`
1588
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.54` before push
1589
+
1590
+ ## [1.0.53] - 2026-05-02
1591
+
1592
+ Retire factory artifact lore from the current maintenance surface
1593
+
1594
+ Flow 1.0.53 closes the remaining maintainability risk from the repo-local factory artifact cleanup. The release removes the tracked local process artifact tree and its taxonomy document, keeps regenerated local artifacts ignored, and preserves the hidden-workspace permission contract through generic hidden-root tests instead of a named retired directory.
1595
+
1596
+ The stale-reference guard is now stricter: the retired factory artifact name may appear only in the policy test and ignore configuration, not in changelog, release, investigation, docs, source, or fixture text. Older historical notes were rewritten to describe the same decisions as retired process-artifact context, so contributors no longer see a deleted artifact tree presented as current project lore.
1597
+
1598
+ Constraint: Remove contributor-confusing process artifacts without changing Flow runtime behavior, command names, tool schemas, state paths, dependencies, or public plugin surface
1599
+ Constraint: Keep hidden-workspace permission coverage after removing the named retired artifact tree
1600
+ Rejected: Preserve the literal retired artifact name in historical markdown | it kept passing the policy but still looked like current institutional lore to contributors
1601
+ Rejected: Delete historical release/investigation context wholesale | concise supersession wording keeps the audit trail without reviving a dead surface
1602
+ Rejected: Weaken stale-reference policy to avoid release-note failures | the policy now has narrower allowances and explicit generated-artifact handling instead
1603
+ Confidence: high
1604
+ Scope-risk: narrow
1605
+ Reversibility: clean
1606
+ Directive: Do not reintroduce repo-local process artifact directories as source-controlled workflow truth; if a future hidden workspace example is needed, use generic sentinel names in tests and document the actual current owner
1607
+ Tested: `bun test tests/docs-stale-reference-policy.test.ts`; active grep for retired factory references outside the stale-policy test and ignore configuration; `bun run check` including typecheck, eval captures, dependency contract, deadcode, build, release hygiene, pack invariants, completion-lane gate, cold-start budget, bundle sanity, 429 tests, lint, and bench smoke
1608
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.53` before push
1609
+
1610
+ ## [1.0.52] - 2026-05-02
1611
+
1612
+ Keep the release-note artifact inside the stale-reference policy
1613
+
1614
+ Flow 1.0.52 is a fix-forward release for the `v1.0.51` hosted release failure. The `v1.0.51` tag correctly validated the package/changelog version evidence, but the release job creates a generated `release-notes.md` file from `CHANGELOG.md` before running `bun run check`. The new stale-reference policy test scanned that generated file and rejected the same historical retired-path references that are intentionally allowed in the changelog.
1615
+
1616
+ This release keeps the stale-reference guard intact while adding the generated release-note artifact to the same historical-evidence allowlist as `CHANGELOG.md`. The maintainer contract now names that generated artifact explicitly, so future changes do not accidentally treat hosted release notes as current contract documentation. No runtime behavior, command names, tool schemas, state paths, dependency versions, or public plugin surface changed.
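A minimal sketch of the allowlist idea follows. It is not the code in `tests/docs-stale-reference-policy.test.ts`: the function name, input shape, and violation format are hypothetical, and the successor-breadcrumb allowance is omitted for brevity. It shows retired paths being tolerated in `CHANGELOG.md` and the generated `release-notes.md` while still being flagged everywhere else.

```typescript
// Files treated as historical evidence; both names come from this entry.
const HISTORICAL_ARTIFACTS = new Set<string>(["CHANGELOG.md", "release-notes.md"]);

interface ScannedFile {
  path: string;
  text: string;
}

// Returns one "<file>: <retired path>" entry per disallowed reference.
function findStaleReferences(files: ScannedFile[], retiredPaths: string[]): string[] {
  const violations: string[] = [];
  for (const file of files) {
    // Historical artifacts may mention retired paths without failing the policy.
    if (HISTORICAL_ARTIFACTS.has(file.path)) continue;
    for (const retired of retiredPaths) {
      if (file.text.includes(retired)) {
        violations.push(`${file.path}: ${retired}`);
      }
    }
  }
  return violations;
}
```

Keeping the generated file in the same set as the changelog means the policy still runs over every other document, so the guard is narrowed rather than weakened.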
1617
+
1618
+ Constraint: Preserve the stale-reference policy while allowing the hosted release workflow's generated changelog excerpt
1619
+ Constraint: Do not force-move the already-pushed failed `v1.0.51` tag
1620
+ Rejected: Remove stale-reference scanning from `bun test` | the policy is useful and should remain part of the normal release gate
1621
+ Rejected: Remove stale-reference scanning from generated release notes | the generated artifact follows the changelog and should stay policy-covered
1622
+ Rejected: Force-move `v1.0.51` | the tag was already pushed and failed in hosted release, so fix-forward keeps history explicit
1623
+ Confidence: high
1624
+ Scope-risk: narrow
1625
+ Reversibility: clean
1626
+ Directive: Keep generated `release-notes.md` aligned with `CHANGELOG.md` in stale-reference policy decisions because the hosted release workflow materializes it before `bun run check`
1627
+ Tested: Hosted `v1.0.51` release run failed specifically in `tests/docs-stale-reference-policy.test.ts` against generated `release-notes.md`; `bun test tests/docs-stale-reference-policy.test.ts`; `bun run check` including 427 tests, build, deadcode, release hygiene, pack invariants, completion-lane gate, cold-start budget, bundle sanity, and bench smoke
1628
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.52` before push
1629
+
1630
+ ## [1.0.51] - 2026-05-02
1631
+
1632
+ Reduce maintainer contract drift after the framework-complexity review
1633
+
1634
+ Flow 1.0.51 is a maintenance release that addresses the main remaining risk from the maintainability review: contributor comprehension and contract drift as the plugin has grown into a small workflow runtime. The release replaces scattered current-truth language with `docs/maintainer-contract.md`, adds contributor and process-orientation maps, records the current release posture, and removes stale implementation/migration docs that no longer describe current behavior.
1635
+
1636
+ The test cleanup continues the same direction without weakening coverage. The former `tests/config.test.ts` and `tests/runtime-completion-contracts.test.ts` suites are now split by concern, with successor breadcrumbs at the top of each new file. A new stale-reference policy test allows old paths only in historical artifacts or explicit successor breadcrumbs, so release notes and retired validation evidence stay auditable without leaking retired paths back into current docs or source.
1637
+
1638
+ Constraint: Address maintainability and contributor-orientation risk without changing runtime behavior, command names, tool schemas, state paths, dependencies, or public plugin surface
1639
+ Constraint: Preserve historical release evidence without letting retired artifact names look current
1640
+ Rejected: Keep retired process-artifact names in release notes | those names created more contributor ambiguity than audit value after the artifact tree was removed
1641
+ Rejected: Keep `IMPLEMENTATION_PLAN.md` and v2 migration docs with banners | the maintainer contract now owns current behavior, and keeping superseded docs would preserve the ambiguity this release is removing
1642
+ Rejected: Split every large test file at once | this release establishes the pattern on the highest-risk contract suites while avoiding unnecessary churn
1643
+ Confidence: high
1644
+ Scope-risk: moderate
1645
+ Reversibility: clean
1646
+ Directive: Treat `docs/maintainer-contract.md` as the current contract map; stale retired-path references belong only in historical artifacts or explicit successor breadcrumbs guarded by `tests/docs-stale-reference-policy.test.ts`
1647
+ Tested: Existing `tests/config.test.ts` behavior locked before the split with 41 passing tests; split config suites preserved 41 passing tests; targeted docs/prompt/config checks passed; `bun run typecheck`; `bun run lint`; `bun test`; stale-reference scan reviewed remaining historical references; Oracle review found no blockers; `bun run check` including 427 tests, build, deadcode, release hygiene, pack invariants, completion-lane gate, cold-start budget, bundle sanity, and bench smoke
1648
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.51` before push
1649
+
1650
+ ## [1.0.50] - 2026-05-02
1651
+
1652
+ Make runtime-tool tests easier to review
1653
+
1654
+ Flow 1.0.50 turns the runtime-tool test cleanup into a release-visible maintenance pass. The former `tests/runtime-tools.test.ts` monolith is now split by concern: operator/control tools, completion and recovery, reviewer/reset behavior, review rendering, metadata, and runtime hooks. The original completion/recovery file is down to a focused 864 lines while the extracted suites keep the same behavioral coverage under targeted and full repo verification.
1655
+
1656
+ This release also removes the repeated local `toolContext` test helper from runtime-oriented test files by centralizing it in `tests/runtime-test-helpers.ts`. Per-file temp-directory cleanup remains local so test lifecycle ownership stays explicit, and no runtime source, package dependencies, or public plugin behavior changed.
1657
+
1658
+ Constraint: Preserve Flow runtime behavior while reducing test-suite review gravity
1659
+ Constraint: Keep the cleanup bounded to tests and existing helper surfaces with no new dependencies
1660
+ Rejected: Split `runtime-operator-tools.test.ts` further | it is still large but cohesive, and further slicing would add churn without a clear reviewability win
1661
+ Rejected: Centralize temp-directory cleanup | per-file cleanup keeps test lifecycle ownership visible and avoids shared teardown magic
1662
+ Confidence: high
1663
+ Scope-risk: narrow
1664
+ Reversibility: clean
1665
+ Directive: Keep runtime-tool tests grouped by behavior concern; only introduce shared test helpers when duplication crosses multiple files and the helper has no hidden lifecycle side effects
1666
+ Tested: `bun test tests/runtime-tools.test.ts tests/runtime-reviewer-reset.test.ts tests/runtime-operator-tools.test.ts tests/runtime-review-render.test.ts tests/runtime-tools-metadata.test.ts tests/runtime-hooks.test.ts`; broader targeted suite including runtime completion, summary, path traversal, and runtime transition tests; grep confirmed no remaining local `function toolContext` definitions under `tests`; Oracle review found no blocking issues; `bun run typecheck`; `bun run lint`; `bun run check` including 426 tests, build, deadcode, release hygiene, pack invariants, cold-start budget, bundle sanity, and bench smoke
1667
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.50` before push
1668
+
1669
+ ## [1.0.49] - 2026-05-02
1670
+
1671
+ Build release artifacts before randomized CI tests
1672
+
1673
+ Flow 1.0.49 completes the randomized-regression CI rollout by making the randomized scripts self-preparing on clean checkouts. The `v1.0.48` release workflow published successfully, but main CI still failed because the randomized-regression job runs from a fresh workspace and the install lifecycle tests load `dist/index.js`; local runs and `bun run check` had passed because they build before running the test suite.
1674
+
1675
+ `test:ci` and `test:randomized:regression` now run the existing build step before randomized tests. This keeps the new hosted randomized gate blocking, preserves the explicit 30s timeout and seed coverage, and brings both scripts in line with the build-before-tests assumption already used by the release and full-check paths.
1676
+
1677
+ Constraint: Keep randomized-regression blocking on main while making it reliable on clean GitHub-hosted checkouts
1678
+ Constraint: Reuse the existing build script instead of changing install lifecycle tests or runtime behavior
1679
+ Rejected: Remove install lifecycle tests from randomized CI | they cover release-bound assets and should remain in the randomized suite
1680
+ Rejected: Depend on a prebuilt dist artifact in CI | a clean checkout must build the release artifact before tests that load it
1681
+ Confidence: high
1682
+ Scope-risk: narrow
1683
+ Reversibility: clean
1684
+ Directive: Any CI test script that includes install lifecycle or dist-load coverage must build `dist` first
1685
+ Tested: `rm -rf dist && bun run test:randomized:regression`; `bun run check`; workflow YAML parse; release guard smoke for `v1.0.49`; hosted `v1.0.48` release run succeeded with release evidence and published assets; hosted `v1.0.48` CI isolated the failure to missing clean-checkout `dist/index.js`
1686
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.49` before push
1687
+
1688
+ ## [1.0.48] - 2026-05-02
1689
+
1690
+ Keep randomized regression CI stable on hosted Linux
1691
+
1692
+ Flow 1.0.48 fixes the post-release CI signal from the new randomized regression job. The `v1.0.47` release workflow published successfully, including the release evidence artifact, but the main-branch randomized regression job exposed that the existing concurrent filesystem stress test can exceed Bun's default 5s test timeout on GitHub-hosted Linux under seed `42` even when its assertions pass locally and under the standard release check.
1693
+
1694
+ The randomized scripts now use the repo's established explicit 30s test timeout convention for heavier suites. This preserves randomized ordering and the seed-1/seed-42 regression coverage while avoiding false red builds from a known filesystem-concurrency stress test running slower on hosted infrastructure.
1695
+
1696
+ Constraint: Keep randomized regression coverage enabled instead of backing out the new CI job
1697
+ Constraint: Avoid changing runtime concurrency behavior for a timeout-only hosted test failure
1698
+ Rejected: Remove the concurrent write test from randomized CI | it is exactly the kind of filesystem safety coverage the regression job should retain
1699
+ Rejected: Increase only the single test timeout in code | the randomized script is the hosted gate and should carry the heavier-suite timeout consistently across seeds
1700
+ Confidence: high
1701
+ Scope-risk: narrow
1702
+ Reversibility: clean
1703
+ Directive: Keep randomized CI seeded and timeout-explicit when it includes filesystem stress tests on hosted runners
1704
+ Tested: `bun run check`; `bun run test:ci`; hosted `v1.0.47` release run succeeded with release evidence and published assets; hosted `v1.0.47` CI isolated the failure to randomized-regression seed `42` timeout in `tests/cross-area/concurrent-writes.test.ts`
1705
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.48` before push
1706
+
1707
+ ## [1.0.47] - 2026-05-02
1708
+
1709
+ Restore hosted release evidence after the guardrail rollout
1710
+
1711
+ Flow 1.0.47 is a fix-forward release for the new release-evidence workflow. The `v1.0.46` tag proved the version/changelog guard and full `bun run check` path on GitHub, but asset preparation failed before publication because the evidence writer embedded nested command substitutions directly inside `echo` strings. This release assigns those evidence values before writing the artifact, making the script parse cleanly under actionlint and Bash.
1712
+
1713
+ The release keeps the 1.0.46 hardening intact: completion-lane invariants remain a named gate, randomized test scripts remain explicit, hosted randomized regression coverage remains available, and the package API boundary remains root-only. No runtime behavior or dependency versions changed in this fix-forward pass.
1714
+
1715
+ Constraint: Fix the hosted release failure without rewriting the already-pushed `v1.0.46` tag or weakening the new release guards
1716
+ Constraint: Preserve the release evidence artifact contract while avoiding shell quoting that actionlint cannot parse
1717
+ Rejected: Force-move `v1.0.46` | the tag was already pushed and should remain an auditable failed release attempt
1718
+ Rejected: Remove release evidence generation | the evidence artifact is the purpose of the hardening and should be repaired, not bypassed
1719
+ Confidence: high
1720
+ Scope-risk: narrow
1721
+ Reversibility: clean
1722
+ Directive: Keep workflow shell snippets actionlint-clean; assign complex command substitutions before echoing evidence values
1723
+ Tested: `bun run check`; `bun run test:ci`; workflow YAML parse; release guard smoke for `v1.0.47`; hosted `v1.0.46` run proved the tag/changelog guard and `bun run check` before failing in asset preparation
1724
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.47` before push
1725
+
1726
+ ## [1.0.46] - 2026-05-02
1727
+
1728
+ Make release confidence visible before cutting the next tag
1729
+
1730
+ Flow 1.0.46 turns the post-review hardening pass into release-visible guardrails. CI now exposes the completion-lane invariant gate directly, `bun run check` includes that gate, and randomized tests have named scripts plus a regression-strength hosted job for main pushes or manual dispatch. The release workflow now refuses mismatched tags, missing changelog headings, and empty or heading-only notes, and it uploads release evidence before publishing assets.
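The tag and changelog guard described above can be sketched roughly as follows. This is a minimal illustration, not Flow's actual release script: `checkRelease`, `GuardInput`, and the error strings are hypothetical names, and the heading format is assumed from the `## [x.y.z] - date` headings in this changelog.

```typescript
// Hypothetical release guard; every name here is illustrative.
interface GuardInput {
  tag: string; // e.g. "v1.0.46"
  packageVersion: string; // e.g. "1.0.46" from package.json
  changelog: string; // full CHANGELOG.md text
}

function checkRelease({ tag, packageVersion, changelog }: GuardInput): string[] {
  const errors: string[] = [];
  if (tag !== `v${packageVersion}`) {
    errors.push(`tag ${tag} does not match package version ${packageVersion}`);
  }

  const lines = changelog.split("\n");
  const heading = `## [${packageVersion}]`;
  const start = lines.findIndex((line) => line.startsWith(heading));
  if (start === -1) {
    errors.push(`missing changelog heading ${heading}`);
    return errors;
  }

  // Notes run until the next version heading (or the end of the file).
  const rest = lines.slice(start + 1);
  const end = rest.findIndex((line) => line.startsWith("## ["));
  const body = (end === -1 ? rest : rest.slice(0, end)).join("\n").trim();
  if (body.length === 0) {
    errors.push(`changelog entry for ${packageVersion} has no body`);
  }
  return errors;
}
```

Returning a list of errors instead of throwing on the first one lets a workflow step report every problem with a tag in a single run.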
1731
+
1732
+ The release also documents the package boundary in the README and maintainer docs: consumers should import only from `opencode-plugin-flow`, while deep paths remain unsupported internals. Historical evidence docs now call out stale version snapshots explicitly, and the dependency-contract check has a small helper-backed regression for missing plugin `zod` metadata without changing dependency versions.
1733
+
1734
+ Constraint: Improve release confidence using existing Bun/GitHub Actions surfaces without adding dependencies or changing Flow runtime behavior
1735
+ Constraint: Keep the public package API root-only and keep `zod` aligned with the OpenCode plugin SDK contract
1736
+ Rejected: Add a separate release dry-run workflow | the tag workflow now has local smoke coverage and hosted proof comes from the actual release path
1737
+ Rejected: Make `test:ci` deterministic-only | CI confidence depends on preserving randomized ordering while adding explicit regression-strength runs
1738
+ Rejected: Widen package exports for convenience | deep imports would create accidental semver promises for internal runtime modules
1739
+ Confidence: high
1740
+ Scope-risk: moderate
1741
+ Reversibility: clean
1742
+ Directive: Keep release notes body-bearing and version-matched; do not bypass the release evidence guard when publishing future tags
1743
+ Tested: `bun run check`; `bun run test:randomized:regression`; `bun run test:ci`; workflow YAML parse; release guard smoke including heading-only rejection; remote tag resolution for `actions/upload-artifact@v6`, `actions/checkout@v6`, and `actions/setup-node@v6`; Oracle review of the hardening diff
1744
+ Not-tested: Live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.46` before push
1745
+
1746
+ ## [1.0.45] - 2026-05-02
1747
+
1748
+ Reduce cleanup complexity while fencing the package API boundary
1749
+
1750
+ Flow 1.0.45 turns the complexity-reduction investigation into a guarded cleanup release. Shared final-review fixtures replace repeated completion-gate literals, final-review coverage now uses a centralized surface taxonomy, audit report coverage summaries are normalized in one place, and the legacy application `tool-runtime.ts` adapter has been removed after moving active callers to their owning runtime modules.
1751
+
1752
+ The release also makes the supported package boundary explicit. `package.json` now exports only the root plugin entrypoint, and pack invariants fail if future changes widen `main` or `exports` without review. This keeps unsupported deep imports from becoming accidental compatibility promises while preserving the shipped OpenCode plugin surface.
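A pack invariant of this kind could look something like the sketch below. It is an assumption-laden illustration: `exportViolations` and the simplified `Manifest` shape are hypothetical, and the expected entry path is passed in because the changelog does not state the package's actual `main` value.

```typescript
// Hypothetical pack-invariant check; the manifest shape is simplified.
interface Manifest {
  main?: string;
  exports?: unknown;
}

function exportViolations(manifest: Manifest, expectedEntry: string): string[] {
  const violations: string[] = [];
  if (manifest.main !== expectedEntry) {
    violations.push(`main must be ${expectedEntry}`);
  }

  const exportsField = manifest.exports;
  if (typeof exportsField !== "object" || exportsField === null || Array.isArray(exportsField)) {
    violations.push('exports must be an object with a single "." entry');
    return violations;
  }

  // Any key other than "." would expose a deep-import subpath.
  for (const key of Object.keys(exportsField)) {
    if (key !== ".") violations.push(`unexpected export subpath: ${key}`);
  }
  if (!("." in exportsField)) violations.push('missing root export "."');
  return violations;
}
```

With a root-only `exports` map, Node's resolver rejects deep imports with `ERR_PACKAGE_PATH_NOT_EXPORTED`, which is the error the blocked deep-import smoke test in this entry expects.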
1753
+
1754
+ Constraint: Preserve Flow command behavior and final-completion semantics while removing stale internal wrappers and repeated test/report logic
1755
+ Constraint: Keep the public package API to the root plugin entrypoint instead of reintroducing internal runtime export surface
1756
+ Rejected: Keep `src/runtime/application/tool-runtime.ts` as a compatibility shim | the package ships a bundled root plugin entrypoint and active internal callers now import owner modules directly
1757
+ Rejected: Deduplicate prompt policy wording in this pass | those prompt snippets are intentionally product-facing and pinned by prompt/eval tests
1758
+ Rejected: Treat repo-local process artifacts as package dead code | release checks do not ship those artifacts, but repo tests and process notes still used parts of that tree as fixtures/support context at the time
1759
+ Confidence: high
1760
+ Scope-risk: moderate
1761
+ Reversibility: clean
1762
+ Directive: Do not widen `package.json#exports` or restore application-barrel compatibility helpers without a reviewed public-API decision and pack-invariant update
1763
+ Tested: `bun run check`; package import smoke for root `opencode-plugin-flow`; blocked deep-import smoke expecting `ERR_PACKAGE_PATH_NOT_EXPORTED`; targeted pack-invariants and dist-load tests; Oracle review of the cleanup diff
1764
+ Not-tested: Live GitHub-hosted `release.yml` run for tag `v1.0.45` before push
1765
+
1766
+ ## [1.0.44] - 2026-05-02
1767
+
1768
+ Make dead-code cleanup explicit before the next release
1769
+
1770
+ Flow 1.0.44 removes unused internal declarations that had drifted behind the runtime and tool-schema refactors. The cleanup deletes dead command constants, schema/type aliases, path wrappers, transition helpers, render helpers, and a test-only helper type without changing the user-facing Flow commands or runtime behavior.
1771
+
1772
+ The configured deadcode gate already had no unused files or dependencies, so this release keeps that gate intact and treats the broader export-level scan as advisory. Only high-confidence in-repository dead declarations were removed; schema barrels and intentionally exported validation surfaces remain in place where the current package structure still relies on them as internal boundaries.
1773
+
1774
+ Constraint: Keep the cleanup deletion-only and avoid changing Flow command behavior, package dependencies, or the zod/plugin SDK alignment
1775
+ Rejected: Remove every export reported by broad `knip` output | many reports are intentional internal schema/barrel surfaces rather than safely deletable runtime code
1776
+ Rejected: Update README command documentation | the documented `/flow-*` commands and behavior did not change
1777
+ Confidence: high
1778
+ Scope-risk: narrow
1779
+ Reversibility: clean
1780
+ Directive: Treat `bun run deadcode` as the release gate for unused files/dependencies and broad `knip` export output as advisory unless a symbol has no in-repository use or API reason to remain
1781
+ Tested: `bun run typecheck`; `bun run deadcode`; `bun test`; `bun run lint`; `bun run build`; targeted README reference search for removed symbols; Oracle review of the cleanup diff
1782
+ Not-tested: Live GitHub-hosted `release.yml` run for tag `v1.0.44` before push
1783
+
1784
+
1785
+ ## [1.0.43] - 2026-05-02
1786
+
1787
+ Make release safety explicit while removing deprecated install heuristics
1788
+
1789
+ Flow 1.0.43 hardens the release path without adding compatibility shims or speculative layers. Install and uninstall now protect `~/.config/opencode/plugins/flow.js` with an explicit Flow ownership marker, refuse to overwrite or remove unowned plugin files, and write release-installer downloads through temporary files before atomically moving the managed plugin into place. Deprecated unmarked-plugin signature heuristics were removed, so ownership is no longer inferred from brittle tool-name substrings.
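The marker-plus-atomic-move pattern might be sketched like this. The marker text, `installPlugin`, and the temp-file naming are assumptions made for illustration; the changelog does not show the real marker format or installer code.

```typescript
import { existsSync, mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Assumed marker text; the real marker format is not shown in this changelog.
const OWNERSHIP_MARKER = "// managed-by: opencode-plugin-flow";

function isOwned(path: string): boolean {
  return readFileSync(path, "utf8").startsWith(OWNERSHIP_MARKER);
}

function installPlugin(target: string, source: string): void {
  if (existsSync(target) && !isOwned(target)) {
    throw new Error(`refusing to overwrite unowned plugin file: ${target}`);
  }
  // Write the marked file next to the target, then rename it into place.
  // A rename within one filesystem replaces the target atomically, so an
  // interrupted install never leaves a partial or unmarked target file.
  const temp = join(dirname(target), `.flow.js.${process.pid}.tmp`);
  writeFileSync(temp, `${OWNERSHIP_MARKER}\n${source}`);
  renameSync(temp, target);
}

// Usage: install into a scratch directory instead of the real plugin path.
const dir = mkdtempSync(join(tmpdir(), "flow-demo-"));
const target = join(dir, "flow.js");
installPlugin(target, "export default {};");
```

Writing the marker into the temp file before the rename is what avoids the failure mode named in the rejected alternative for this release: a direct write could be interrupted and leave an unmarked file that later safety checks refuse to touch.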
1790
+
1791
+ Session persistence is safer across process boundaries. Lifecycle operations now share the session save lock, session writes use a filesystem-backed lock in addition to the in-process queue, and lock cleanup is covered by a stricter absence check. Final completion is also stricter: lite-lane final completion now requires a recorded final reviewer decision instead of accepting an in-band worker final review as a substitute.
1792
+
1793
+ The release checks now include pack invariants and cold-start budget enforcement in `bun run check`, while the cold-start harness loads the real plugin peer and zod packages instead of maintaining a local shim. The cleanup pass removed test-only runtime export surface and synthetic legacy install tests, keeping the diff smaller and easier to reason about.
1794
+
1795
+ Constraint: Ship the hardening work without adding new dependencies, deprecated compatibility branches, or avoidable public API surface
1796
+ Constraint: Keep zod aligned with the OpenCode plugin SDK's effective zod version while expanding release checks
1797
+ Rejected: Keep legacy unmarked-plugin detection | substring ownership heuristics can misclassify unrelated plugins and preserve deprecated behavior
1798
+ Rejected: Let release install write directly to `flow.js` before adding ownership metadata | interrupted installs could leave an unmarked file that future safety checks reject
1799
+ Rejected: Export queue internals for tests | production barrels should not grow test-only API surface
1800
+ Confidence: high
1801
+ Scope-risk: moderate
1802
+ Reversibility: clean
1803
+ Directive: Keep plugin ownership marker-based; do not reintroduce legacy signature ownership detection without a reviewed migration design and false-positive analysis
1804
+ Tested: `bun run check`; targeted install, lifecycle, atomic write, workspace cache, completion gate, runtime contract, and runtime tool tests
1805
+ Not-tested: Live GitHub-hosted `release.yml` run for tag `v1.0.43` before push
1806
+
1807
+ ## [1.0.42] - 2026-05-01
1808
+
1809
+ Make coding guidelines enforceable before release instead of relying on reviewer memory
1810
+
1811
+ Flow 1.0.42 turns maintainability guidance into a checked workflow contract. Production source now enables Biome's no-console rule, release builds drop bundled console calls, release checks scan `src` and `dist/index.js`, and the installer path uses explicit stdout/stderr stream adapters instead of raw console calls. This means debug-only `console.*` calls and `debugger` statements cannot slip into the release-bound plugin or its built artifact while development scripts and tests can still emit intentional operator output.
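As a rough picture of what a release-hygiene scan over `src` and `dist/index.js` checks for, consider the sketch below. It is deliberately naive and hypothetical: `scanSource` and the pattern are illustrative, it works line by line, and it does not understand block comments or string literals the way a real linter rule such as Biome's would.

```typescript
// Hypothetical hygiene scan; Flow's real check may use different rules.
interface Finding {
  line: number;
  text: string;
}

// Matches common console calls and bare debugger statements.
const FORBIDDEN = /\bconsole\.(log|debug|info|warn|error|trace)\s*\(|\bdebugger\b/;

function scanSource(source: string): Finding[] {
  return source
    .split("\n")
    .map((text, index) => ({ line: index + 1, text: text.trim() }))
    // Skip single-line comments so commented-out examples are not flagged.
    .filter(({ text }) => !text.startsWith("//") && FORBIDDEN.test(text));
}
```

Output routed through an explicit stream, for example `process.stdout.write(...)`, passes this scan, which mirrors the stdout/stderr adapter approach described above.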
1812
+
1813
+ The same principle is now present in Flow's planning, execution, autonomous, and review prompts. Planner and worker surfaces explicitly plan and complete against coding guidelines, small diffs, existing scripts/utilities, release hygiene, and test coverage. Reviewer surfaces treat release hygiene and missing tests as review concerns, so workflow approvals should reject debug instrumentation instead of treating it as an afterthought.
1814
+
1815
+ Constraint: Convert best-practice guidance into automated release gates and prompt contracts without adding dependencies or weakening the existing Bun-first validation path
1816
+ Constraint: Preserve intentional CLI output by routing release-bound messages through explicit stdout/stderr adapters rather than banning operator-facing output entirely
1817
+ Rejected: Rely on prose-only maintainer guidance | release hygiene must fail mechanically before a tag can ship
1818
+ Rejected: Enable no-console globally for scripts and tests | development-only tooling needs intentional stdout/stderr output and is not part of the shipped plugin artifact
1819
+ Confidence: high
1820
+ Scope-risk: moderate
1821
+ Reversibility: clean
1822
+ Directive: Keep future workflow/prompt changes aligned with the coding-guidelines gate; do not reintroduce raw console calls or debugger statements into release-bound source without replacing the gate with an equally strict reviewed alternative
1823
+ Tested: `bun run check`; `bun test tests/cross-area/release-hygiene.test.ts tests/biome-adoption.test.ts tests/config.test.ts tests/mode-contracts.test.ts`; `bun run build && bun run check:release-hygiene`
1824
+ Not-tested: Live GitHub-hosted `release.yml` run for tag `v1.0.42` before push
1825
+
1826
+ ## [1.0.41] - 2026-05-01
1827
+
1828
+ Make Flow's long-running modes narrate their work without adding noise
1829
+
1830
+ Flow 1.0.41 turns operator feedback into an explicit prompt contract across the main workflow surfaces. `/flow-auto`, `/flow-run`, `/flow-plan`, `flow-planner`, and `flow-worker` now require concise phase-boundary progress updates before and after planning, execution, validation, review, recovery/reset, and finalization. The autonomous coordinator also carries a concrete checkpoint list so users can see what phase Flow is in, what action is happening next, why it matters, and what evidence came out of the phase.
1831
+
1832
+ The same expectation is now reflected in mode contracts and the read-only `/flow-review` surface: longer reviews should announce mapping, evidence inspection, and rendering phases, while control/history/reset operations should provide one before/after update when they perform multi-step runtime work. The reviewer JSON contract remains intentionally strict so machine-readable approval decisions do not get polluted with user-facing narration.
1833
+
1834
+ Constraint: Improve user-visible workflow transparency without weakening runtime-owned state transitions or strict reviewer JSON output
1835
+ Constraint: Keep feedback concise enough for OpenCode users to understand progress without flooding the transcript with raw tool JSON or minor file-read narration
1836
+ Rejected: Add a new runtime event-streaming API | prompt-level phase guidance solves the immediate UX gap without expanding the public tool surface
1837
+ Rejected: Apply progress prose to `flow-reviewer` output | reviewer decisions must remain exact JSON for downstream runtime gates
1838
+ Confidence: high
1839
+ Scope-risk: narrow
1840
+ Reversibility: clean
1841
+ Directive: Keep future prompt/mode changes aligned with the shared operator-progress contract so new workflow surfaces explain phase, action, evidence, and next step consistently
1842
+ Tested: `bun run check`; `bun test --randomize` (seed `2415184663`); `node ./scripts/cross-area/pack-invariants.mjs`; `node ./scripts/cross-area/bench-gate.mjs`; `node ./scripts/cross-area/cold-start-budget.mjs`; targeted prompt/config snapshot and capture checks
1843
+ Not-tested: Live OpenCode transcript behavior before pushing tag `v1.0.41`
1844
+
1845
+ ## [1.0.40] - 2026-05-01
1846
+
1847
+ Close review-found correctness and release hardening gaps before the next publish
1848
+
1849
+ Flow 1.0.40 turns the full-codebase review findings into runtime, audit, release, and validation guards. Completion thresholds now fail fast when they exceed the active plan size, including narrowed approval/select paths and legacy invalid sessions that reach execution. Session persistence now uses unique atomic-write temp files and validates cached reads by content hash, so same-size external rewrites cannot be served stale. History lookup tolerates stale active pointers by falling back to stored or completed copies, and the compaction hook now uses the same safe session loader as system-context injection.
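To see why hash-based validation catches rewrites that size-based validation misses, here is a small dependency-free sketch. `HashValidatedCache` and `digest` are hypothetical names, and the FNV-1a digest is only a stand-in; a real implementation would more plausibly use a cryptographic hash such as SHA-256.

```typescript
// Stand-in digest (32-bit FNV-1a) so the sketch has no dependencies.
function digest(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

interface CacheEntry<T> {
  hash: string;
  value: T;
}

class HashValidatedCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private parse: (raw: string) => T;
  parseCount = 0;

  constructor(parse: (raw: string) => T) {
    this.parse = parse;
  }

  // `raw` is the file's current content; the caller always re-reads it,
  // which is the "extra read" the cache trades for correctness.
  get(path: string, raw: string): T {
    const hash = digest(raw);
    const cached = this.entries.get(path);
    if (cached !== undefined && cached.hash === hash) return cached.value;
    this.parseCount += 1;
    const value = this.parse(raw);
    this.entries.set(path, { hash, value });
    return value;
  }
}
```

In this sketch the cache only saves the parse and validation step, not the read: `{"n":1}` and `{"n":2}` have the same length, yet they produce different digests, so the second read is reparsed instead of being served stale.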
1850
+
1851
+ The release lane is also stricter: source builds no longer require Python, bundle sanity is current with the actual history/config/tool surfaces and is part of `bun run check`, CI runs for docs/bench/config-relevant changes, and generated release install assets are tag-pinned instead of silently fetching `latest`. `/flow-review` now wraps raw arguments in a tagged untrusted-data block, downgrades full-audit claims unless all major surface categories are directly evidenced, and includes synthesized `not_run` validation accounting in structured output as well as the human report.
1852
+
1853
+ Constraint: Fix the concrete review findings without adding dependencies or broad rewrites
1854
+ Constraint: Preserve the existing Flow tool surface and zod/plugin SDK alignment while tightening runtime and release gates
1855
+ Rejected: Clamp impossible completion thresholds silently | rejecting invalid plans makes the author repair the plan instead of hiding a broken completion contract
1856
+ Rejected: Keep mtime/size-only session cache validation for performance | correctness across external writers is more important than avoiding the extra read
1857
+ Rejected: Leave bundle-sanity as a manual side script | release confidence depends on checking the packaged plugin surface after build
1858
+ Confidence: high
1859
+ Scope-risk: moderate
1860
+ Reversibility: clean
1861
+ Directive: Keep completion-policy validation paired with any future plan-subsetting path; keep bundle-sanity expected surfaces in lockstep with config/tool additions
1862
+ Tested: `bun run check`; `bun test --randomize` (seed `1395117911`); `node ./scripts/cross-area/pack-invariants.mjs`; `node ./scripts/cross-area/bench-gate.mjs`; `node ./scripts/cross-area/cold-start-budget.mjs`; targeted regression tests for runtime tools, session cache, atomic writes, install lifecycle, prompt snapshot, and config
1863
+ Not-tested: Live GitHub-hosted `release.yml` run for tag `v1.0.40` before push
1864
+
1865
+ ## [1.0.39] - 2026-04-30
1866
+
1867
+ Make prompt quality a mode-wide offline contract instead of review-only tuning
1868
+
1869
+ Flow 1.0.39 extends the providerless prompt-quality loop beyond `/flow-review` into the main Flow modes. Prompt mode boundaries now live in one canonical contract that records each surface's source files, mutation limits, expected tools, forbidden tools, required behavior, and stop condition. The behavior eval and capture harnesses use that contract to check planner, worker, auto, reviewer, run, and control outputs offline, including structured tool-call intent when available and safer affirmative matching when only prose is captured.
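A mode contract of the kind described above, and an offline check against captured tool calls, might be sketched as follows. The shape is inferred from the fields this entry lists; the actual `src/prompts/mode-contracts.ts` may differ, and `checkToolCalls` plus the field names are hypothetical.

```typescript
// Hypothetical contract shape, reduced to the fields needed for a tool check.
interface ModeContract {
  surface: string;
  mutates: boolean;
  expectedTools: string[];
  forbiddenTools: string[];
}

function checkToolCalls(contract: ModeContract, calledTools: string[]): string[] {
  const problems: string[] = [];
  for (const tool of calledTools) {
    if (contract.forbiddenTools.includes(tool)) {
      problems.push(`${contract.surface} called forbidden tool ${tool}`);
    }
  }
  for (const tool of contract.expectedTools) {
    if (!calledTools.includes(tool)) {
      problems.push(`${contract.surface} never called expected tool ${tool}`);
    }
  }
  return problems;
}
```

Because the check needs only a contract and a list of captured tool names, it can run in CI without provider credentials, which is the property this release is built around.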
1870
+
1871
+ This keeps prompt iteration practical: maintainers can export capture prompts, score real outputs, promote calibrated captures into regressions, and publish combined prompt-eval summaries without adding model-provider credentials or direct API calls. The README and development guide now document that workflow so future prompt changes are treated as product changes, not unreviewed wording edits.
1872
+
1873
+ Constraint: Improve prompt quality across Flow modes without adding a model API dependency, new credentials, or new runtime tooling
1874
+ Constraint: Keep mode safety explicit enough that read-only and mutating surfaces can be mechanically checked against their intended Flow tool boundaries
1875
+ Rejected: Add live provider-backed evals as the first step | credential requirements and provider variance would make local and CI prompt checks harder to run consistently
1876
+ Rejected: Keep prompt-mode expectations only in prose fixtures | mode boundaries would drift as prompts, command templates, and tool permissions evolve
1877
+ Confidence: high
1878
+ Scope-risk: moderate
1879
+ Reversibility: clean
1880
+ Directive: Keep `src/prompts/mode-contracts.ts`, prompt fixtures, capture scenarios, and config/tool permissions aligned whenever adding or changing a Flow mode
1881
+ Tested: `bun run check`; `bun run eval:review-capture:check`; `bun run eval:prompt-capture:check`; `bun test tests/prompt-mode-behavior-eval.test.ts tests/prompt-mode-capture.test.ts tests/mode-contracts.test.ts`
1882
+ Not-tested: Live OpenCode tool-call trace ingestion; live GitHub-hosted `release.yml` run for tag `v1.0.39` before push
1883
+
1884
+ ## [1.0.38] - 2026-04-30
1885
+
1886
+ Keep GitHub Actions on the current Node runtime before the runner deprecation becomes noise
1887
+
1888
+ Flow 1.0.38 refreshes the CI and release workflow action surface so the project no longer emits Node 20 deprecation warnings on GitHub-hosted runs. The workflow now uses `actions/checkout@v6`, `dorny/paths-filter@v4`, and `actions/upload-artifact@v6`, with the explicit `pull-requests: read` permission that the paths-filter PR mode documents. This keeps the release lane boring: prompt-quality artifacts still upload, tag releases still build from the same Bun pipeline, and the workflow warnings no longer distract from real failures.
1889
+
1890
+ Constraint: Remove GitHub Actions Node 20 deprecation warnings without changing the Bun validation/release commands or adding new release tooling
1891
+ Constraint: Keep the CI change detector working for pull requests after moving `dorny/paths-filter` to its Node 24 major
1892
+ Rejected: Silence the warnings with a runner-level Node override | that would hide stale action pins instead of keeping the workflow surface current
1893
+ Rejected: Replace `paths-filter` with hand-written git diff shell logic | larger behavior change for a release hygiene fix
1894
+ Confidence: high
1895
+ Scope-risk: narrow
1896
+ Reversibility: clean
1897
+ Directive: Keep workflow action major bumps paired with their documented permission/runtime requirements; do not remove `pull-requests: read` while PR change detection depends on `paths-filter`
1898
+ Tested: `bun run check`; Ruby YAML parse of `.github/workflows/*.yml`
1899
+ Not-tested: Local `actionlint` because the Docker daemon was unavailable and no standalone `actionlint` binary was installed; live GitHub-hosted `ci.yml` and `release.yml` runs for tag `v1.0.38` before push
1900
+
1901
+ ## [1.0.37] - 2026-04-30
1902
+
1903
+ Turn Flow review quality into a measurable offline regression loop
1904
+
1905
+ Flow 1.0.37 strengthens `/flow-review` from a well-worded prompt into a prompt-quality system. The review lane now carries sharper evidence rules, an explicit `hardening_opportunity` taxonomy for useful non-blocking improvements, stricter schema and normalization guards around full-audit claims, deterministic behavior evals for calibrated versus overconfident outputs, captured real-output fixtures, prompt snapshots, and providerless review-capture packets that can score structured model/plugin output without calling a model API or requiring credentials. The prompt eval report now includes behavior-eval artifacts so prompt changes can be reviewed as testable product surfaces instead of prose-only edits.
1906
+
1907
+ Constraint: Improve review output quality and regression detection without adding provider-specific API calls, credentials, or new dependencies
1908
+ Constraint: Keep `/flow-review` read-only and renderer-backed while making requested depth, achieved depth, coverage evidence, and validation honesty mechanically checkable
1909
+ Rejected: Add a direct OpenAI live-eval harness | it would introduce credentials and provider lock-in for a workflow that can be captured and scored offline through the actual plugin surface
1910
+ Rejected: Continue tuning prompt wording without captured-output fixtures | prompt quality would remain subjective and regressions would only be noticed after disappointing real reviews
1911
+ Confidence: high
1912
+ Scope-risk: moderate
1913
+ Reversibility: clean
1914
+ Directive: When review output disappoints, capture the structured ledger and promote it into the behavior fixture corpus; keep prompt contracts, schemas, renderer ordering, and eval fixtures aligned when changing review semantics
1915
+ Tested: `bun run check`; `bun run eval:review-capture:check`; `bun test tests/review-prompt-capture.test.ts tests/prompt-behavior-eval.test.ts`; `bun run typecheck`; `bun run lint`; `bun run report:prompt-eval`
1916
+ Not-tested: Live GitHub-hosted `release.yml` run for tag `v1.0.37` before push; fully automated live model-in-the-loop evals (omitted by design); manual plugin-surface captures from multiple external repositories
1917
+
1918
+ ## [1.0.36] - 2026-04-30
1919
+
1920
+ Simplify the Flow review lane and tighten the user-facing docs without weakening runtime execution gates
1921
+
1922
+ Flow 1.0.36 shrinks the standalone `/flow-review` surface again. This release keeps the renderer-backed human report, but reduces review-lane contract and prompt duplication, removes duplicated coverage bookkeeping, trims semantic repair logic back to structural depth calibration, and keeps the stricter boundary that rejects the old review payload shape instead of carrying a compatibility shim the product no longer wants. It also cleans up the README so end users see the practical Flow entry points first (`/flow-auto`, `/flow-plan`, `/flow-review`) without being dropped straight into internal runtime terminology, while keeping maintainer detail in `docs/development.md`. Runtime-owned `flow-auto` completion/review semantics remain intact.
1923
+
1924
+ Constraint: Improve review-lane maintainability and end-user clarity without weakening runtime-owned completion, validation, or final-review semantics
1925
+ Constraint: Keep zod aligned with the plugin SDK's effective contract while preserving the existing canonical tool surface and thin JSON transport boundaries
1926
+ Rejected: Keep a legacy review-ledger compatibility parser in `flow_review_render` | it preserved an old internal shape the product no longer wants and added code without meaningful user value
1927
+ Rejected: Re-expand presenter/normalizer heuristics to rewrite reviewer meaning | that would hide simplification gains behind more semantic repair layers instead of making the boundary cleaner
1928
+ Confidence: high
1929
+ Scope-risk: moderate
1930
+ Reversibility: clean
1931
+ Directive: Keep `/flow-review` small and human-first, and preserve the boundary where runtime owns execution truth while the review lane owns presentation only
1932
+ Tested: `bun run check`; `bun test tests/runtime-tools.test.ts tests/config.test.ts tests/smoke/dist-load.test.ts tests/docs-tool-parity.test.ts tests/prompt-eval-corpus.test.ts`; `bunx biome check README.md docs/development.md tests/config.test.ts tests/prompt-eval-corpus.test.ts tests/docs-tool-parity.test.ts --files-ignore-unknown=true`
1933
+ Not-tested: Live GitHub-hosted `release.yml` run for tag `v1.0.36` before push; long-lived in-flight callers still emitting the pre-simplification review payload shape
1934
+
1935
+ ## [1.0.35] - 2026-04-30
1936
+
1937
+ Render `/flow-review` output through a deterministic human-first presenter instead of raw ledger text
1938
+
1939
+ Flow 1.0.35 keeps the structured review ledger for coverage and autonomous use, but stops relying on prompt prose alone for the final review output. This release adds a strict review-report schema, a renderer-backed `flow_review_render` read-only runtime tool, and a deterministic presenter that emits Conclusion, Top findings, Recommended next actions, and Coverage notes by default. The review command surface now builds the structured ledger, passes it through the renderer, and only emits raw structured JSON when explicitly requested.
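The idea of a deterministic presenter can be illustrated with the sketch below. The ledger shape, the `renderReview` name, the severity ordering, and the Markdown heading style are all assumptions; only the four section names come from this entry.

```typescript
// Hypothetical ledger shape; the real review-report schema is not shown here.
type Severity = "high" | "medium" | "low";

interface ReviewLedger {
  conclusion: string;
  findings: { severity: Severity; summary: string }[];
  nextActions: string[];
  coverageNotes: string[];
}

const SEVERITY_RANK: Record<Severity, number> = { high: 0, medium: 1, low: 2 };

function renderReview(ledger: ReviewLedger): string {
  // Sort a copy so output order never depends on the order the model emitted.
  const findings = [...ledger.findings].sort(
    (a, b) =>
      SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] ||
      (a.summary < b.summary ? -1 : a.summary > b.summary ? 1 : 0),
  );
  const section = (title: string, items: string[]): string =>
    [`## ${title}`, ...(items.length > 0 ? items.map((item) => `- ${item}`) : ["- None"])].join("\n");

  return [
    `## Conclusion\n${ledger.conclusion}`,
    section("Top findings", findings.map((f) => `[${f.severity}] ${f.summary}`)),
    section("Recommended next actions", ledger.nextActions),
    section("Coverage notes", ledger.coverageNotes),
  ].join("\n\n");
}
```

Since the renderer is a pure function of the ledger, two ledgers that differ only in the order of their findings render to identical text, which removes one source of model formatting drift.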
1940
+
1941
+ Constraint: Improve human readability for `/flow-review` without weakening the structured coverage ledger or adding a mutating review subsystem
1942
+ Constraint: Keep zod aligned with the plugin SDK's effective contract while preserving the existing thin JSON transport boundary and read-only command behavior
1943
+ Rejected: Keep tuning prompt wording alone | output quality would still depend too heavily on model formatting drift
1944
+ Rejected: Replace the structured review ledger with free-form markdown only | would weaken machine-readable coverage accounting and reduce autonomous reuse
1945
+ Confidence: high
1946
+ Scope-risk: moderate
1947
+ Reversibility: clean
1948
+ Directive: Keep the review ledger as the canonical machine contract, but route user-facing review output through deterministic rendering rather than raw ledger dumps
1949
+ Tested: `bun run check`; `bun test tests/config.test.ts tests/prompt-eval-corpus.test.ts tests/runtime-tools.test.ts tests/smoke/dist-load.test.ts tests/docs-tool-parity.test.ts`; `bun run typecheck && bun run build`
1950
+ Not-tested: Live GitHub-hosted `release.yml` run for tag `v1.0.35` before push; real OpenCode review sessions across multiple repositories after the renderer-backed output change
1951
+
1952
+ ## [1.0.34] - 2026-04-30
1953
+
1954
+ Replace the audit-heavy review lane with one clear `/flow-review` surface and make final approval evidence real
1955
+
1956
+ Flow 1.0.34 removes the separate audit/report-history product story and replaces it with one user-facing read-only review command. `/flow-review` is now the only review surface Flow exposes, the old saved-audit plumbing and compatibility aliases are gone, and the docs/prompts/status copy now describe review depth directly instead of a parallel audit subsystem. This release also hardens completion: final review policy is runtime-owned; `flow-auto` defaults its final completion gate to a detailed cross-feature review; final reviewer decisions and worker payloads carry explicit depth, typed reviewed surfaces, and artifact-backed evidence refs; and completion rejects claimed coverage that is not grounded in the current run’s changed artifacts and validation commands.
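
The grounding rule at the end of the paragraph above can be pictured with a small check like the one below. This is only a sketch under assumed names: `RunRecord`, `EvidenceRef`, and both functions are hypothetical and do not reflect Flow's real payload fields.

```typescript
// Hypothetical shapes for illustration; the real worker and reviewer payloads are not shown here.
interface RunRecord {
  changedArtifacts: string[];
  validationCommands: string[];
}

interface EvidenceRef {
  kind: "artifact" | "validation";
  ref: string;
}

// Return every evidence ref that does not point at something the current
// run actually changed or actually executed.
function findUngroundedEvidence(run: RunRecord, refs: EvidenceRef[]): EvidenceRef[] {
  const artifacts = new Set(run.changedArtifacts);
  const commands = new Set(run.validationCommands);
  return refs.filter((ref) =>
    ref.kind === "artifact" ? !artifacts.has(ref.ref) : !commands.has(ref.ref),
  );
}

// Reject completion when any claimed coverage is not grounded in the run.
function assertCoverageGrounded(run: RunRecord, refs: EvidenceRef[]): void {
  const ungrounded = findUngroundedEvidence(run, refs);
  if (ungrounded.length > 0) {
    const listed = ungrounded.map((ref) => `${ref.kind}:${ref.ref}`).join(", ");
    throw new Error(`ungrounded evidence: ${listed}`);
  }
}
```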
1957
+
1958
+ Constraint: Simplify the review UX without weakening the final completion gate or reintroducing prompt-only review semantics
1959
+ Constraint: Keep zod aligned with the plugin SDK's effective contract while strengthening final-review validation through existing thin JSON tool boundaries
1960
+ Rejected: Keep separate `/flow-reviews` or legacy `/flow-audit` surfaces for saved-history browsing | they added user-facing complexity without serving the primary product goal of “review now and show the result”
1961
+ Rejected: Enforce a hard two-phase finalization flow that blocks recording final approval until matching worker evidence is already persisted | stronger in theory, but it reduces workflow flexibility and requires a larger sequencing redesign than this release needs
1962
+ Confidence: high
1963
+ Scope-risk: broad
1964
+ Reversibility: messy
1965
+ Directive: Keep the public review surface singular and keep final-review rigor runtime-owned; do not reintroduce hidden report-history UX or prompt-only final-review depth claims without fresh evidence
1966
+ Tested: `bun run check`
1967
+ Not-tested: Live GitHub-hosted `release.yml` run for tag `v1.0.34` before push; real OpenCode end-to-end sessions exercising both `broad` and `detailed` final-review policies after this release
1968
+
1969
+ ## [1.0.33] - 2026-04-30
1970
+
1971
+ Make Flow audits always available while shrinking the GPT-5.5 audit-history payload
1972
+
1973
+ Flow 1.0.33 finishes the audit-surface stabilization work. `flow_audit_reports` now uses a thin `requestJson` transport so OpenCode sends a much smaller provider-facing schema to OpenAI, while the runtime still keeps strict validation and legacy direct-object fallback for internal callers. This release also removes the obsolete audit env-gating and makes `/flow-audit` and `/flow-audits` available by default, updates the control/audit prompts to use the new wrapper contract, and cleans up docs/tests to match the always-on audit surface.
1974
+
1975
+ Constraint: Preserve audit behavior and internal direct-call compatibility while shrinking the provider-facing tool contract and removing obsolete audit env-gate complexity
1976
+ Constraint: Keep zod aligned with the plugin SDK's effective contract while changing only the audit surface that reproduced GPT-5.5 instability
1977
+ Rejected: Keep the audit env gates and only update docs | would leave the fixed audit surface hidden behind obsolete setup and preserve avoidable runtime/config complexity
1978
+ Rejected: Split `flow_audit_reports` into multiple new public tools immediately | the thin wrapper solved the GPT-5.5 failure with a smaller blast radius and preserved the existing command UX
1979
+ Confidence: high
1980
+ Scope-risk: moderate
1981
+ Reversibility: clean
1982
+ Directive: Keep heavy provider-facing tool inputs on thin JSON-string transports and do not reintroduce raw multiplexed audit schemas or env-gated audit defaults without fresh host evidence
1983
+ Tested: `bun run check`; `bun test tests/config.test.ts tests/runtime-tools.test.ts tests/smoke/dist-load.test.ts tests/docs-tool-parity.test.ts tests/protocol-parity.test.ts tests/cross-area/install-lifecycle.test.ts tests/cross-area/manual-flow.test.ts tests/cross-area/resume-flow.test.ts tests/cross-area/dependency-contract.test.ts tests/cross-area/pack-invariants.test.ts tests/cross-area/next-command-coverage.test.ts`; live no-env OpenCode `gpt-5.5` runs for `/flow-audits`, `/flow-audits show latest`, `/flow-audits compare latest latest`, `/flow-audits compare latest <older>`, `/flow-audits compare <older> latest`, `/flow-audits compare <reportA> <reportB>`, and `/flow-audit quick smoke audit; do not persist if no findings`; live no-env `gpt-5.4` controls for `/flow-audits show latest` and `/flow-audit quick smoke audit; do not persist if no findings`
1984
+ Not-tested: Long-running full-surface broad persisted GPT-5.5 audits on a repository with real active Flow session state; live GitHub-hosted `release.yml` run for tag `v1.0.33` before push
1985
+
1986
+ ## [1.0.32] - 2026-04-29
1987
+
1988
+ Flatten the /flow-audit command text so the direct OpenAI path avoids markup-heavy prompts
1989
+
1990
+ Flow 1.0.32 keeps the audit-on-control-agent routing from 1.0.31, but removes the remaining XML-style and section-rendered command framing from `FLOW_AUDIT_COMMAND_TEMPLATE`. The audit command now uses a compact plain-text instruction block with the same audited behavior constraints, while leaving the other Flow command surfaces untouched. This narrows the remaining provider-specific risk to a simpler text prompt instead of a markup-heavy command expansion.
1991
+
1992
+ Constraint: Preserve the audit command behavior and tested audit semantics while simplifying the provider-facing command text as much as possible
1993
+ Constraint: Keep zod aligned with the plugin SDK's effective contract and avoid broad prompt-system churn when only the audit command path is under suspicion
1994
+ Rejected: Convert every Flow command template away from structured sections | unnecessary blast radius when only `/flow-audit` is implicated
1995
+ Rejected: Keep XML-style task framing on `/flow-audit` while changing only tool/config routing | left the OpenAI-facing audit command payload unnecessarily markup-heavy
1996
+ Confidence: medium
1997
+ Scope-risk: narrow
1998
+ Reversibility: clean
1999
+ Directive: If provider-specific issues persist, prefer targeted simplification on the implicated surface instead of flattening unrelated Flow prompts
2000
+ Tested: `bun run check`; `bun test tests/cross-area/pack-invariants.test.ts tests/config.test.ts tests/smoke/dist-load.test.ts tests/prompt-eval-corpus.test.ts`
2001
+ Not-tested: Actual OpenCode direct OpenAI host behavior on the user's machine after the plain-text `/flow-audit` command change; live GitHub-hosted `release.yml` run for tag `v1.0.32` before push
2002
+
2003
+ ## [1.0.31] - 2026-04-29
2004
+
2005
+ Route /flow-audit through the stable control-agent path instead of a dedicated audit agent
2006
+
2007
+ Flow 1.0.31 removes the remaining dedicated primary-agent path from `/flow-audit`. The audit command still uses the same audit command template, tools, and structured contract, but it now runs through the existing `flow-control` agent rather than a separate `flow-auditor` agent. This keeps the audit behavior intact while eliminating one more OpenCode-specific command/agent surface that could diverge on the direct OpenAI provider path.
2008
+
2009
+ Constraint: Preserve the audit command behavior and read-only guarantees while removing the extra primary-agent surface from the audit path
2010
+ Constraint: Keep zod aligned with the plugin SDK's effective contract and avoid adding runtime complexity while isolating a provider-specific failure mode
2011
+ Rejected: Keep the dedicated flow-auditor path and continue trimming only prompt text | left an extra OpenCode primary-agent path in place even after prompt-surface reduction
2012
+ Rejected: Merge audit into the normal execution lane | would blur the audit/execution boundary instead of removing only the unstable surface
2013
+ Confidence: medium
2014
+ Scope-risk: narrow
2015
+ Reversibility: clean
2016
+ Directive: Prefer reusing stable agent surfaces for specialized commands when a separate primary-agent path is not buying real capability
2017
+ Tested: `bun run build && bun test tests/config.test.ts tests/smoke/dist-load.test.ts tests/prompt-eval-corpus.test.ts tests/docs-tool-parity.test.ts`; `bun run check`
2018
+ Not-tested: Actual OpenCode direct OpenAI host behavior on the user's machine after routing `/flow-audit` through `flow-control`; live GitHub-hosted `release.yml` run for tag `v1.0.31` before push
2019
+
2020
+ ## [1.0.30] - 2026-04-29
2021
+
2022
+ Trim the audit prompt surface so direct OpenAI audit requests stay within provider limits
2023
+
2024
+ Flow 1.0.30 targets the remaining `/flow-audit` instability on the direct OpenAI provider path. The prior releases reduced audit tool registration pressure, but the full audit agent still expanded into a very large command + agent + contract prompt surface. This release removes the large embedded audit-contract examples, trims duplicated audit guidance across the auditor prompt and audit command template, and keeps the tested audit contract semantics while materially shrinking the prompt payload seen by OpenCode when `/flow-audit` is invoked.
2025
+
2026
+ Constraint: Preserve the existing audit semantics and tested contract phrases while materially shrinking the direct OpenAI audit prompt surface
2027
+ Constraint: Keep zod aligned with the plugin SDK's effective contract and avoid introducing new runtime or tool complexity while debugging a provider-specific request failure
2028
+ Rejected: Keep the oversized embedded audit examples and continue bisecting only tool/config surfaces | the direct OpenAI failure occurs at audit invocation time and the prompt payload was still substantially larger than other Flow surfaces
2029
+ Rejected: Replace the audit contract with a looser summary-only prompt | would reduce size by weakening the structured audit guarantees instead of preserving them
2030
+ Confidence: medium
2031
+ Scope-risk: narrow
2032
+ Reversibility: clean
2033
+ Directive: Keep future audit prompt additions compact; if a new audit requirement needs long examples, prefer test fixtures and docs over embedding large examples into the provider-facing prompt surface
2034
+ Tested: `bun test tests/config.test.ts tests/prompt-eval-corpus.test.ts`; `bun run check`
2035
+ Not-tested: Actual OpenCode direct OpenAI host behavior on the user's machine after the audit prompt trim; live GitHub-hosted `release.yml` run for tag `v1.0.30` before push
2036
+
2037
+ ## [1.0.29] - 2026-04-29
2038
+
2039
+ Split the audit-tool gate so host instability can be isolated to one remaining tool
2040
+
2041
+ Flow 1.0.29 narrows the OpenCode instability diagnosis further. The prior patch reduced the audit tool surface from four tools to two, but `FLOW_ENABLE_AUDIT_TOOLS=1` still reproduced the host failure. This release keeps the umbrella audit-tools flag, but adds separate `FLOW_ENABLE_AUDIT_REPORTS_TOOL=1` and `FLOW_ENABLE_AUDIT_WRITE_TOOL=1` gates so the remaining failure can be isolated to the saved-audit read tool, the audit artifact write tool, or only their combined registration.
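
One way to read the three gates is sketched below. The flag names come from this entry and the tool names from the 1.0.28 entry, but the resolution logic (the umbrella flag implies both tools, and only the literal value `1` enables a gate) is an assumption, and `enabledAuditTools` is a hypothetical helper.

```typescript
// Assumed semantics: a gate is on only when its variable is exactly "1".
type Env = Record<string, string | undefined>;

function isOn(env: Env, name: string): boolean {
  return env[name] === "1";
}

// Decide which audit tools would be registered for a given environment.
function enabledAuditTools(env: Env): string[] {
  const umbrella = isOn(env, "FLOW_ENABLE_AUDIT_TOOLS");
  const tools: string[] = [];
  if (umbrella || isOn(env, "FLOW_ENABLE_AUDIT_REPORTS_TOOL")) {
    tools.push("flow_audit_reports"); // saved-audit read tool
  }
  if (umbrella || isOn(env, "FLOW_ENABLE_AUDIT_WRITE_TOOL")) {
    tools.push("flow_audit_write_report"); // audit artifact write tool
  }
  return tools;
}
```

Under these assumptions, a host bisect would run three sessions (reports only, write only, umbrella) and note which one first reproduces the failure.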
2042
+
2043
+ Constraint: Preserve the current audit behavior while adding the smallest possible host-bisect surface around the two remaining audit tools
2044
+ Constraint: Keep zod aligned with the plugin SDK's effective contract and avoid widening runtime semantics just to add finer diagnostic control
2045
+ Rejected: Leave the two-tool audit surface unsplit | not enough information to isolate whether one remaining tool or the combined registration breaks OpenCode
2046
+ Rejected: Add more runtime or prompt complexity before isolating the exact failing tool presence | the next useful signal is host bisecting, not more internal machinery
2047
+ Confidence: high
2048
+ Scope-risk: narrow
2049
+ Reversibility: clean
2050
+ Directive: Use the new reports-tool and write-tool gates to identify whether OpenCode breaks on one remaining audit tool or only on the combined audit-tools surface before changing the default again
2051
+ Tested: `bun run build && bun test tests/smoke/dist-load.test.ts tests/config.test.ts tests/docs-tool-parity.test.ts`; `bun run check`
2052
+ Not-tested: Actual OpenCode host stability on the user's machine for `FLOW_ENABLE_AUDIT_REPORTS_TOOL=1`, `FLOW_ENABLE_AUDIT_WRITE_TOOL=1`, or `FLOW_ENABLE_AUDIT_TOOLS=1`; live GitHub-hosted `release.yml` run for tag `v1.0.29` before push
2053
+
2054
+ ## [1.0.28] - 2026-04-29
2055
+
2056
+ Reduce audit tool registration pressure without dropping audit functionality
2057
+
2058
+ Flow 1.0.28 keeps the audit lane available, but trims the host-facing audit tool surface that was destabilizing OpenCode when `FLOW_ENABLE_AUDIT_TOOLS=1` was enabled. This release collapses the three saved-audit read tools into one multiplexed `flow_audit_reports` tool, keeps `flow_audit_write_report` as the separate permissioned write boundary, and lazy-loads audit runtime modules from those tool entrypoints so the audit runtime is not pulled in eagerly just because the tool surface is registered.
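
The two ideas in this entry, a single multiplexed read tool and a lazily loaded runtime, might look roughly like the sketch below. The `action` names, the request shape, and the in-file `loadAuditRuntime` stand-in are all hypothetical; a real plugin would more likely use a dynamic `import()` of a separate module.

```typescript
// Hypothetical stand-in for the audit runtime module.
interface AuditRuntime {
  list(): string[];
  show(id: string): string;
  compare(left: string, right: string): string;
}

// Counts loads so the laziness is observable in this sketch.
let loads = 0;

async function loadAuditRuntime(): Promise<AuditRuntime> {
  loads += 1;
  return {
    list: () => ["report-1", "report-2"],
    show: (id) => `contents of ${id}`,
    compare: (left, right) => `diff ${left}..${right}`,
  };
}

// Memoize the promise: nothing loads at registration time, and the
// runtime is loaded at most once, on the first tool call.
let runtime: Promise<AuditRuntime> | undefined;

function getRuntime(): Promise<AuditRuntime> {
  runtime ??= loadAuditRuntime();
  return runtime;
}

// One tool, several read actions, discriminated by `action`.
type AuditReportsRequest =
  | { action: "list" }
  | { action: "show"; id: string }
  | { action: "compare"; left: string; right: string };

async function flowAuditReports(request: AuditReportsRequest): Promise<string> {
  const audit = await getRuntime();
  switch (request.action) {
    case "list":
      return audit.list().join("\n");
    case "show":
      return audit.show(request.id);
    case "compare":
      return audit.compare(request.left, request.right);
  }
}
```

The host then sees one tool definition instead of three, while the write tool stays a separate entrypoint so its permission boundary is not mixed into the read path.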
2059
+
2060
+ Constraint: Preserve audit functionality while reducing the global tool surface seen by OpenCode when audit tools are enabled
2061
+ Constraint: Keep zod aligned with the plugin SDK's effective contract and avoid widening runtime behavior just to shrink the audit entrypoints
2062
+ Rejected: Keep four separate audit tools and only tweak wording or guidance | did not address the host-facing tool registration surface implicated by `FLOW_ENABLE_AUDIT_TOOLS=1`
2063
+ Rejected: Remove audit tools again from the default diagnostic path | would preserve stability at the cost of leaving the real audit-tools regression unresolved
2064
+ Confidence: medium
2065
+ Scope-risk: moderate
2066
+ Reversibility: clean
2067
+ Directive: If OpenCode still destabilizes with `FLOW_ENABLE_AUDIT_TOOLS=1`, investigate host interaction with the remaining two audit tools before re-expanding the audit surface
2068
+ Tested: `bun test tests/config.test.ts tests/runtime-tools.test.ts tests/smoke/dist-load.test.ts tests/docs-tool-parity.test.ts`; `bun run check`
2069
+ Not-tested: Actual OpenCode host stability on the user's machine with `FLOW_ENABLE_AUDIT_TOOLS=1`; live GitHub-hosted `release.yml` run for tag `v1.0.28` before push
2070
+
2071
+ ## [1.0.27] - 2026-04-29
2072
+
2073
+ Add a diagnostic audit-surface matrix so host instability can be bisected cleanly
2074
+
2075
+ Flow 1.0.27 keeps the safer core-only default from 1.0.26, but adds fine-grained audit reintroduction switches so host instability can be isolated instead of argued about. This release introduces independent config, tools, and guidance gates for the audit lane, updates the built-dist and config coverage to exercise those partial combinations, and documents the matrix so host testing can identify the smallest unstable audit surface.
2076
+
2077
+ Constraint: Preserve the stable core-only default while making audit reintroduction granular enough to diagnose host-side instability
2078
+ Constraint: Keep zod aligned with the plugin SDK's effective contract and avoid widening runtime complexity just to add diagnostic knobs
2079
+ Rejected: Re-enable the full audit lane by default immediately | unsafe without host evidence about which audit sub-surface causes instability
2080
+ Rejected: Leave only one all-or-nothing audit flag | insufficient for meaningful host bisecting
2081
+ Confidence: high
2082
+ Scope-risk: narrow
2083
+ Reversibility: clean
2084
+ Directive: Use the new audit config/tools/guidance gate matrix to identify the smallest unstable OpenCode surface before changing the default again
2085
+ Tested: `bun run build && bun test tests/config.test.ts tests/smoke/dist-load.test.ts tests/docs-tool-parity.test.ts`; `bun run check`
2086
+ Not-tested: Actual OpenCode host behavior on the user’s machine across the diagnostic gate combinations; live GitHub-hosted `release.yml` run for tag `v1.0.27` before push
2087
+
2088
+ ## [1.0.26] - 2026-04-29
2089
+
2090
+ Keep Flow stable by making the audit lane an explicit opt-in surface
2091
+
2092
+ Flow 1.0.26 finishes the audit-lane stability correction. This release extracts audit behavior into a dedicated boundary, removes audit agents, commands, tools, and audit-specific guidance from the default plugin surface, keeps ordinary Flow behavior on the smaller core-only path, and preserves the full audit lane behind the explicit `FLOW_ENABLE_AUDIT_SURFACE=1` opt-in. It also tightens host-safety behavior so the plugin’s system and compacting hooks no-op when no usable workspace context exists.
2093
+
2094
+ Constraint: The default OpenCode plugin surface had to shrink materially without removing the audit feature entirely
2095
+ Constraint: Keep zod aligned with the plugin SDK's effective contract and preserve existing Flow core behavior while moving audit behind an opt-in gate
2096
+ Rejected: Leave audit enabled by default and only trim prompt text | insufficient to reduce the host-visible global surface that was destabilizing OpenCode
2097
+ Rejected: Remove audit entirely | the feature remains useful when explicitly enabled
2098
+ Confidence: high
2099
+ Scope-risk: moderate
2100
+ Reversibility: clean
2101
+ Directive: Keep audit opt-in unless real host-side evidence shows the default Flow surface can safely absorb that extra global footprint again
2102
+ Tested: `bun run check`; `bun test tests/runtime-tools.test.ts tests/smoke/dist-load.test.ts tests/cross-area/install-lifecycle.test.ts`; `bun test tests/cross-area/manual-flow.test.ts tests/cross-area/autonomous-flow.test.ts tests/cross-area/resume-flow.test.ts`; `bun test tests/protocol-parity.test.ts tests/package-manager-detection.test.ts tests/runtime-summary.test.ts`; `bun test tests/docs-tool-parity.test.ts tests/transitions-consolidation.test.ts tests/prompt-eval-corpus.test.ts`; `bun test tests/runtime/render-snapshot.test.ts tests/runtime/render-incremental.test.ts`; `bun test tests/cross-area/module-scope-schemas.test.ts tests/helpers.test.ts tests/workspace-root-guard.test.ts`; `bun run report:prompt-eval`; direct source/dist/plugin hook gate inspections
2103
+ Not-tested: Actual OpenCode host behavior on the user’s machine with this build installed; live GitHub-hosted `release.yml` run for tag `v1.0.26` before push
2104
+
2105
+ ## [1.0.25] - 2026-04-29
2106
+
2107
+ Shrink the global Flow tool surface so ordinary OpenCode requests stay stable
2108
+
2109
+ Flow 1.0.25 is a stability patch aimed at the plugin itself. This release moves the heaviest Flow tool payloads behind thin JSON-string wrapper fields, preserves strict runtime validation after decode, rejects malformed or duplicate-key JSON wrapper payloads, and adds a schema-budget regression so the plugin cannot quietly reintroduce a global tool-definition payload large enough to destabilize ordinary OpenCode requests.
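
A minimal sketch of the wrapper pattern follows: the tool exposes one string field, and the runtime decodes it, rejects malformed or duplicate-key JSON, and then validates the decoded object. The field name `payloadJson`, the `CompletionPayload` shape, and both function names are hypothetical, and the hand-written checks stand in for the zod validation the project uses.

```typescript
// Return the first key repeated within a single object, or null.
// Assumes `text` has already been accepted by JSON.parse.
function findDuplicateKey(text: string): string | null {
  // Match whole string literals plus the structural characters we track.
  const tokens = text.match(/"(?:[^"\\]|\\.)*"|[{}\[\]:]/g) ?? [];
  const stack: Array<Set<string> | null> = []; // a Set per object, null per array
  let previous = "";
  for (const token of tokens) {
    if (token === "{") stack.push(new Set<string>());
    else if (token === "[") stack.push(null);
    else if (token === "}" || token === "]") stack.pop();
    else if (token === ":") {
      // In valid JSON, a colon outside a string always follows an object key.
      const key = JSON.parse(previous) as string;
      const seen = stack[stack.length - 1];
      if (seen) {
        if (seen.has(key)) return key;
        seen.add(key);
      }
    }
    previous = token;
  }
  return null;
}

// Hypothetical payload shape used only for this illustration.
interface CompletionPayload {
  featureId: string;
  summary: string;
}

// Decode the thin string field, then validate strictly inside the runtime.
function decodePayload(payloadJson: string): CompletionPayload {
  let decoded: unknown;
  try {
    decoded = JSON.parse(payloadJson);
  } catch {
    throw new Error("payloadJson is not valid JSON");
  }
  const duplicate = findDuplicateKey(payloadJson);
  if (duplicate !== null) throw new Error(`payloadJson repeats key "${duplicate}"`);
  if (typeof decoded !== "object" || decoded === null || Array.isArray(decoded)) {
    throw new Error("payloadJson must decode to an object");
  }
  const { featureId, summary } = decoded as Record<string, unknown>;
  if (typeof featureId !== "string" || typeof summary !== "string") {
    throw new Error("featureId and summary must be strings");
  }
  return { featureId, summary };
}
```

`JSON.parse` silently keeps the last value when a key repeats, so the duplicate check has to look at the raw text rather than the parsed result.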
2110
+
2111
+ Constraint: Keep zod aligned with the plugin SDK's effective contract and preserve tool-boundary compatibility without adding dependencies
2112
+ Constraint: Preserve runtime validation semantics while materially shrinking the SDK-facing tool schema surface seen by OpenCode
2113
+ Rejected: Add new runtime state or split Flow into more plugins | unnecessary complexity when the main issue was the global tool-schema payload size
2114
+ Confidence: medium
2115
+ Scope-risk: moderate
2116
+ Reversibility: clean
2117
+ Directive: If future Flow tools need large structured payloads, keep the SDK-facing schema thin and validate the decoded object inside the runtime boundary
2118
+ Tested: `bun run typecheck`; `bun run check`
2119
+ Not-tested: Live OpenCode interactive stability under the user’s exact workload; live GitHub-hosted `release.yml` run for tag `v1.0.25` before push
2120
+
2121
+ ## [1.0.24] - 2026-04-29
2122
+
2123
+ Remove audit-lane prompt contradictions before the next audit release
2124
+
2125
+ Flow 1.0.24 is a narrow audit-lane patch. It resolves the contradiction that told the auditor to stay read-only while also persisting reports, clarifies that `flow_audit_write_report` is the single sanctioned export write, keeps persisted artifact paths out of the audit report contract, and makes the contract examples schema-valid so the audit lane teaches one consistent output shape.
2126
+
2127
+ Constraint: Audit export must remain the only sanctioned write from the audit lane without widening execution or session-mutation permissions
2128
+ Constraint: Final audit output must stay a single contract-valid JSON object even when persistence returns extra metadata
2129
+ Rejected: Add artifact-path fields to the audit contract | would widen the chat/output contract instead of fixing the contradictory guidance
2130
+ Confidence: high
2131
+ Scope-risk: narrow
2132
+ Reversibility: clean
2133
+ Directive: If the audit lane keeps export metadata, keep it in tool responses and persisted artifacts rather than widening the audit report payload
2134
+ Tested: `bun test tests/config.test.ts tests/prompt-eval-corpus.test.ts tests/runtime-tools.test.ts tests/session-engine.test.ts tests/audit-report-contracts.test.ts`; `bun run report:prompt-eval`; `bun run check`
2135
+ Not-tested: Live GitHub-hosted `release.yml` run for tag `v1.0.24` before push
2136
+
2137
+ ## [1.0.23] - 2026-04-29
2138
+
2139
+ Make saved audit comparisons more trustworthy before cutting the next audit-capable release
2140
+
2141
+ Flow 1.0.23 turns the new saved-audit lane into a releasable surface. This patch adds structured compare output for persisted audits, keeps compare in the read-only control lane, improves rename and retitle handling so obvious churn does not degrade into noisy add/remove output, and exposes match provenance so operators can see when a diff came from an exact key versus a heuristic pairing.
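
To make the exact-key versus heuristic distinction concrete, here is a small two-pass pairing sketch. The `AuditItem` fields and the heuristic (same category and same evidence) are assumptions chosen for illustration; the changelog only says that matching uses evidence/category heuristics and reports provenance.

```typescript
// Hypothetical audit item shape; not the persisted audit contract.
interface AuditItem {
  key?: string;
  category: string;
  evidence: string;
  title: string;
}

interface Match {
  before: AuditItem;
  after: AuditItem;
  provenance: "exact-key" | "heuristic";
}

interface Comparison {
  matches: Match[];
  added: AuditItem[];
  removed: AuditItem[];
}

// Remove and return the first element that satisfies the predicate.
function take<T>(list: T[], predicate: (item: T) => boolean): T | undefined {
  const index = list.findIndex(predicate);
  return index >= 0 ? list.splice(index, 1)[0] : undefined;
}

function pairAuditItems(before: AuditItem[], after: AuditItem[]): Comparison {
  const remaining = [...after];
  const matches: Match[] = [];
  const unmatched: AuditItem[] = [];
  const removed: AuditItem[] = [];

  // Pass 1: pair items whose stable keys are identical.
  for (const item of before) {
    const hit =
      item.key === undefined ? undefined : take(remaining, (candidate) => candidate.key === item.key);
    if (hit) matches.push({ before: item, after: hit, provenance: "exact-key" });
    else unmatched.push(item);
  }

  // Pass 2: pair leftovers that share category and evidence, so a retitled
  // finding is reported as one changed item instead of an add plus a remove.
  for (const item of unmatched) {
    const hit = take(
      remaining,
      (candidate) => candidate.category === item.category && candidate.evidence === item.evidence,
    );
    if (hit) matches.push({ before: item, after: hit, provenance: "heuristic" });
    else removed.push(item);
  }

  return { matches, added: remaining, removed };
}
```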
2142
+
2143
+ Constraint: Audit comparison must stay read-only and must not add new workflow state or execution lanes
2144
+ Constraint: Tool arg schemas must remain aligned with the plugin SDK's effective zod contract while still accepting the full persisted audit contract
2145
+ Rejected: Add a separate semantic-identity subsystem for audit diffs | too much new state and complexity for a patch release
2146
+ Confidence: medium
2147
+ Scope-risk: moderate
2148
+ Reversibility: clean
2149
+ Directive: If compare matching grows beyond evidence/category heuristics, add an explicit stable audit item identity before widening the algorithm further
2150
+ Tested: `bun run report:prompt-eval`; `bun run check`
2151
+ Not-tested: Live GitHub-hosted `release.yml` run for tag `v1.0.23` before push
2152
+
2153
+ ## [1.0.22] - 2026-04-29
2154
+
2155
+ Restore release confidence after the v1.0.21 packaging-gate regression
2156
+
2157
+ Flow 1.0.22 is a narrow corrective patch release. It keeps the prompt-system and eval infrastructure introduced on `main`, but fixes the release-blocking pack-invariants regression that came from hardcoding the previous version in the packaging test happy path. The goal of this release is to make the current fixed `main` state the official tagged release without introducing new behavioral scope.
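
The regression described above came from a version literal baked into a test. A minimal sketch of the dynamic alternative is shown below, assuming the changelog uses `## [x.y.z]` headings as this file does; both helper names are hypothetical and this is not the actual pack-invariants test.

```typescript
// Build the changelog heading prefix expected for a given package version.
function changelogHeadingFor(version: string): string {
  return `## [${version}]`;
}

// True when the changelog has a release heading for the version declared in
// package.json. The expectation is derived at run time instead of hardcoded.
function changelogCoversVersion(changelog: string, packageJsonText: string): boolean {
  const { version } = JSON.parse(packageJsonText) as { version?: unknown };
  if (typeof version !== "string") {
    throw new Error("package.json has no string version");
  }
  const heading = changelogHeadingFor(version);
  return changelog.split("\n").some((line) => line.startsWith(heading));
}
```

A test built on this reads `package.json` and `CHANGELOG.md` from disk and asserts the result, so bumping the version never requires touching the test itself.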
2158
+
2159
+ Constraint: Packaging and changelog version checks must stay aligned with the active package version at release time
2160
+ Constraint: This patch should avoid widening the prompt/runtime surface beyond the already-verified `main` state
2161
+ Rejected: Retag `v1.0.21` in place | rewriting an already-pushed tag is riskier and less auditable than a clean patch release
2162
+ Confidence: high
2163
+ Scope-risk: narrow
2164
+ Reversibility: clean
2165
+ Directive: Keep release-version assertions dynamic anywhere a test derives expectations directly from `package.json`
2166
+ Tested: `bun run report:prompt-eval`; `bun run check`
2167
+ Not-tested: Live GitHub-hosted `release.yml` run for the new tag before push
2168
+
2169
+ ## [1.0.21] - 2026-04-29
2170
+
2171
+ Improve prompt-system reliability with adaptive context and first-party eval coverage
2172
+
2173
+ Flow 1.0.21 turns the recent prompt work into a first-party, CI-visible release surface. This release adds adaptive system-context injection grounded in persisted runtime state, expands prompt coverage across command, prompt, and contract surfaces, splits the eval corpus into maintainable first-party fixtures, and publishes a reusable prompt-eval coverage summary artifact for CI validation and inspection.
2174
+
2175
+ Constraint: Runtime semantics, completion gates, and recovery behavior remain runtime-owned rather than moving into prompt-only logic
2176
+ Constraint: Prompt evals must stay first-party and must not depend on external process artifacts
2177
+ Rejected: Add a model-graded prompt harness in this release | higher complexity before the static corpus and coverage model fully matured
2178
+ Confidence: high
2179
+ Scope-risk: moderate
2180
+ Directive: Expand corpus coverage before adding materially more prompt complexity, and keep any new eval fixtures grouped by surface under `tests/__fixtures__/prompt-evals/`
2181
+ Tested: `bun run report:prompt-eval`; `bun run typecheck`; `bun run build`; `bun run lint`; `bun test tests/prompt-eval-corpus.test.ts tests/config.test.ts tests/runtime-tools.test.ts`
2182
+ Not-tested: Live GitHub Actions artifact upload path in GitHub-hosted CI
2183
+
2184
+ ## [1.0.20] - 2026-04-28
2185
+
2186
+ ### Highlights
2187
+
2188
+ Flow 1.0.20 preserves the plugin’s strong autonomous core while reducing how much workflow machinery users have to think about. This release makes compact status and doctor summaries more action-oriented, clarifies that repo scripts are the primary execution contract, trims prompt-law duplication where runtime already owns semantics, and relaxes a small amount of architecture-coupled test friction without weakening safety or completion guarantees.
2189
+
2190
+ ### Added
2191
+
2192
+ - Added compact operator-summary guidance that prioritizes current action, blocker, next step, and next command over workflow taxonomy.
2193
+ - Added stronger script-first prompt coverage so planner, worker, and autonomous coordinator paths treat `package.json` scripts as the primary execution contract.
2194
+ - Added explicit prompt/schema reminders that planning-only context such as package-manager ambiguity belongs in `planning`, not inside `plan`.
2195
+
2196
+ ### Changed
2197
+
2198
+ - Changed compact status and doctor output to emphasize what Flow is doing now and what the operator should do next, while keeping richer runtime detail in structured and detailed views.
2199
+ - Changed planner/worker/auto wording to invoke existing package scripts through the detected package manager or repo convention before falling back to raw manager-specific commands.
2200
+ - Trimmed prompt-law and contract duplication where runtime already enforces completion, recovery, and gating semantics.
2201
+ - Relaxed a subset of wording- and partition-coupled tests so future maintenance can focus more on behavior and invariants than on exact prose or file ownership narratives.
2202
+
2203
+ ### Fixed
2204
+
2205
+ - Fixed the remaining prompt ambiguity around script-first behavior so autonomous execution no longer implies that package-manager-native commands should outrank existing scripts.
2206
+ - Fixed documentation drift introduced by compact operator summaries by clarifying that lane/laneReason detail remains available in structured and detailed views.
2207
+ - Fixed lite-lane parity coverage regressions introduced during simplification by restoring targeted prompt assertions for lite-lane completion and retry guidance.
2208
+
2209
+ ## [1.0.19] - 2026-04-28
2210
+
2211
+ ### Highlights
2212
+
2213
+ Flow 1.0.19 makes package-manager detection safer and more repo-aware. This release teaches Flow to detect package-manager evidence from the active subdirectory upward in monorepos, refuses to guess when one directory contains conflicting lockfile families, and records that ambiguity explicitly so execution can stay on known package scripts instead of drifting into Bun-by-default behavior.
2214
+
2215
+ ### Added
2216
+
2217
+ - Added a dedicated runtime package-manager detector that walks from the active tool directory up to the Flow workspace root.
2218
+ - Added explicit planning-state tracking for ambiguous package-manager evidence so Flow can record uncertainty instead of silently guessing.
2219
+ - Added regression coverage for monorepo subpackage detection, relative tool directories, root fallback behavior, outside-root rejection, and ambiguous same-directory lockfiles.
2220
+
2221
+ ### Changed
2222
+
2223
+ - Changed `flow_plan_start` to persist the nearest detected package manager for the active package scope instead of always using workspace-root evidence.
2224
+ - Updated planner, worker, and autonomous coordinator guidance to prefer existing `package.json` scripts and avoid guessing manager-specific commands when package-manager evidence is ambiguous.
2225
+ - Updated README and development guidance to explain monorepo-aware detection and the new ambiguity-safe behavior.
2226
+
2227
+ ### Fixed
2228
+
2229
+ - Fixed the remaining root bias where monorepo subpackages could inherit the workspace-root package manager even when package-local evidence existed.
2230
+ - Fixed the relative-directory resolution bug so package-manager detection now resolves relative tool directories against the Flow workspace root instead of `process.cwd()`.
2231
+ - Fixed the safety gap where conflicting lockfile families in the same directory previously forced an arbitrary precedence-based guess.
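
The detection rules listed in this release can be sketched as a single upward walk. This is an illustrative reconstruction, not Flow's detector: the `Detection` type, the function name, and the injectable `exists` parameter are assumptions, and the lockfile table covers only the common lockfile names.

```typescript
import { existsSync } from "node:fs";
import * as path from "node:path";

type Detection =
  | { kind: "detected"; manager: string; directory: string }
  | { kind: "ambiguous"; managers: string[]; directory: string }
  | { kind: "none" };

// Lockfile name -> package-manager family.
const LOCKFILES: Record<string, string> = {
  "bun.lock": "bun",
  "bun.lockb": "bun",
  "package-lock.json": "npm",
  "pnpm-lock.yaml": "pnpm",
  "yarn.lock": "yarn",
};

function detectPackageManager(
  workspaceRoot: string,
  toolDirectory: string,
  exists: (file: string) => boolean = existsSync,
): Detection {
  const root = path.resolve(workspaceRoot);
  // Relative tool directories resolve against the workspace root, not the process cwd.
  let directory = path.resolve(root, toolDirectory);

  // Refuse to inspect anything outside the workspace root.
  const relative = path.relative(root, directory);
  if (relative === ".." || relative.startsWith(`..${path.sep}`) || path.isAbsolute(relative)) {
    throw new Error(`tool directory is outside the workspace root: ${directory}`);
  }

  // Walk from the active directory up to the root; the nearest evidence wins.
  for (;;) {
    const families = new Set<string>();
    for (const [file, manager] of Object.entries(LOCKFILES)) {
      if (exists(path.join(directory, file))) families.add(manager);
    }
    if (families.size === 1) {
      return { kind: "detected", manager: [...families][0] ?? "", directory };
    }
    if (families.size > 1) {
      // Conflicting families in one directory: record ambiguity instead of guessing.
      return { kind: "ambiguous", managers: [...families].sort(), directory };
    }
    if (directory === root) return { kind: "none" };
    directory = path.dirname(directory);
  }
}
```

Returning an explicit `ambiguous` result, rather than picking by precedence, is what lets the caller fall back to known `package.json` scripts.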
2232
+
2233
+ ## [1.0.18] - 2026-04-28
2234
+
2235
+ ### Highlights
2236
+
2237
+ Flow 1.0.18 improves subagent efficiency without expanding the runtime role model. This release teaches workers to classify feature workstreams up front, normalizes validator-safe command evidence around `bun run check` and `bun run format_check`, surfaces lane-selection reasons more consistently in operator-facing outputs, and documents that true runtime-level parallel feature execution remains intentionally deferred.
2238
+
2239
+ ### Added
2240
+
2241
+ - Added explicit `core-worker` workstream classes for implementation, test-only/coverage/tooling, validation-only, and release/integration work.
2242
+ - Added a required worker orientation reference alongside the existing architecture and validation guidance. That reference pointed to a repo-local process artifact tree that has since been retired.
2243
+ - Added stronger protocol-parity coverage for lite-lane semantics, reviewer-persistence requirements, final-completion-path guidance, and recovery/replan expectations.
2244
+
2245
+ ### Changed
2246
+
2247
+ - Normalized worker verification guidance so `bun run check` is the default aggregate proof, with clearer workstream-specific expectations for when scoped sub-checks should be expanded.
2248
+ - Updated the shared formatter-safe validation alias to use a Biome check command with formatter enabled, linter disabled, and assist enforcement disabled.
2249
+ - Exposed `laneReason` more consistently in operator-facing runtime summaries and concrete session-detail payloads.
2250
+ - Clarified maintainer and README guidance around lane visibility, validator-safe commands, and the intentional deferral of runtime-level parallel feature execution.
2251
+
2252
+ ### Fixed
2253
+
2254
+ - Fixed the worker-procedure mismatch that had been forcing implementation, test-only, validation-only, and release/integration work through the same overly rigid checklist.
2255
+ - Fixed ambiguity around formatter-only validation guidance by aligning the shared alias, environment notes, and validator docs on one canonical command surface.
2256
+ - Fixed small release-surface/documentation inconsistencies uncovered during the final review pass.
2257
+
2258
+ ## [1.0.17] - 2026-04-28
2259
+
2260
+ ### Highlights
2261
+
2262
+ Flow 1.0.17 focuses on maintainability rather than new behavior. This release thins the OpenCode tool-schema adapter, moves lite-lane plan auto-approval into the runtime application layer, splits completion-path logic into smaller runtime-owned modules, and converts completion recovery mapping into a descriptor-driven policy while keeping the public tool surface and runtime semantics intact.
2263
+
2264
+ ### Added
2265
+
2266
+ - Added focused completion-path modules under `src/runtime/transitions/` for normalization, validation, and finalization so the protected completion lane is easier to inspect and maintain.
2267
+ - Added explicit post-refactor verification coverage for the changed runtime/application, completion, recovery, and tool-adapter seams.
2268
+
2269
+ ### Changed
2270
+
2271
+ - Simplified `src/tools/schemas.ts` by removing dead manual tool-arg type exports while preserving the SDK-facing arg-shape surface and the raw-vs-runtime worker schema distinction.
2272
+ - Moved lite-lane draft-plan auto-approval from tool-layer orchestration into `src/runtime/application/session-actions.ts`, keeping the outward `autoApproved` contract unchanged.
2273
+ - Split `src/runtime/transitions/execution-completion.ts` into smaller normalization, validation, and finalization modules while preserving completion gate ordering, failure-path persistence, and lite-lane behavior.
2274
+ - Reworked `src/runtime/transitions/recovery.ts` around a descriptor-driven completion recovery mapping while preserving canonical recovery metadata, error codes, and resolution hints.
2275
+ - Reduced wording-coupled test assertions where they were locking prose instead of behavior, while preserving semantic contract checks.
2276
+
2277
+ ### Removed
2278
+
2279
+ - Removed dead session-tool root helper exports from `src/tools/session-tools/shared.ts`.
2280
+ - Removed redundant manual tool-arg type exports from `src/tools/schemas.ts` that were no longer used by the runtime tool surface.
2281
+
2282
+ ### Fixed
2283
+
2284
+ - Preserved the runtime-owned lite auto-approval behavior without requiring a second tool-layer mutation branch.
2285
+ - Kept completion-path recovery and validation semantics green after the completion module split and recovery refactor.
2286
+ - Kept the generated dist surface stable at five agents, eight commands, and seventeen tools.
2287
+
2288
+ ## [1.0.16] - 2026-04-28
2289
+
2290
+ ### Highlights
2291
+
2292
+ Flow 1.0.16 tightens hidden-workspace permission behavior so only Flow's own `.flow` state stays auto-allowed. When the effective mutable workspace root is another hidden directory, Flow now asks for permission before writing Flow state there while still leaving normal project-root `.flow` behavior unchanged.
2293
+
2294
+ ### Added
2295
+
2296
+ - Added a shared mutable-workspace permission gate in `src/tools/mutable-workspace-permission.ts` so mutating Flow tools consistently request approval before writing `.flow/**` under hidden workspace roots other than `.flow`.
2297
+ - Added targeted runtime-tool coverage for the three key behaviors: hidden workspace roots prompt, normal project roots with hidden subdirectories do not prompt, and `.flow` itself remains auto-allowed.
2298
+
2299
+ ### Changed
2300
+
2301
+ - Routed mutating runtime and session tool entrypoints through the new permission gate instead of silently allowing all hidden workspace roots.
2302
+ - Updated workspace-safety documentation to explain when Flow prompts for hidden workspace roots versus when it continues writing to the normal project-root `.flow/**` subtree.
2303
+ - Clarified mutable-root remediation text so `$HOME` rejection explains that Flow needs a real project/worktree subdirectory rather than suggesting a trusted-root override.
2304
+
2305
+ ### Fixed
2306
+
2307
+ - Fixed the remaining mismatch where hidden directories could still become mutable Flow roots without an approval prompt.
2308
+ - Preserved the normal no-prompt path for the standard project-root `.flow/**` state directory and the existing hard block on `$HOME` itself as a mutable root.
2309
+
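The three behaviors covered by the 1.0.16 tests can be summarized in a few lines. This is only a sketch: `decideFlowStateWrite`, `lastSegment`, and the rule "a root is hidden when its final path segment starts with a dot" are assumptions made for illustration, and the separate hard block on `$HOME` is not modeled.

```typescript
type WriteDecision = "allow" | "ask";

// Final path segment, tolerating "/" and "\" separators and trailing slashes.
function lastSegment(root: string): string {
  const parts = root.split(/[\\\/]+/).filter((part) => part.length > 0);
  return parts[parts.length - 1] ?? "";
}

// Assumed rule: only the workspace root itself is inspected. Hidden
// subdirectories inside a normal project root never trigger a prompt.
function decideFlowStateWrite(workspaceRoot: string): WriteDecision {
  const name = lastSegment(workspaceRoot);
  const hidden = name.startsWith(".") && name !== "." && name !== "..";
  if (hidden && name !== ".flow") return "ask";
  return "allow";
}
```

Under these assumptions, `/work/.scratch` prompts, while `/work/project` and `/work/project/.flow` do not.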
2310
+ ## [1.0.15] - 2026-04-28
2311
+
2312
+ ### Highlights
2313
+
2314
+ Flow 1.0.15 restores the default external-directory permission prompt for mutating agents without weakening Flow's mutable workspace-root guard. This release removes the over-broad OpenCode permission override that had turned cross-project access into a hard deny, while also trimming duplication in runtime guidance derivation and session-tool wrapper plumbing.
2315
+
2316
+ ### Changed
2317
+
2318
+ - Removed the explicit `external_directory: "deny"` override from `flow-worker` and `flow-auto` so OpenCode host/default permission prompting can apply again when work legitimately reaches outside the current project.
2319
+ - Simplified `src/runtime/summary.ts` by routing guidance shaping more directly through `deriveSessionOperatorState(...)` instead of re-deriving the same major phase branches locally.
2320
+ - Consolidated repeated session-tool read/workspace dispatch boilerplate into narrow helpers in `src/tools/session-tools/shared.ts`, with follow-on cleanup in the history, planning, and lifecycle tool registrations.
2321
+
2322
+ ### Fixed
2323
+
2324
+ - Fixed the regression where recent workspace-safety hardening suppressed the preferred ask-for-permission behavior for external-directory access by forcing a hard deny at the agent config layer.
2325
+ - Preserved the mutable-root safety boundary enforced by `src/runtime/workspace-root.ts` and `src/runtime/application/tool-runtime.ts`, so suspicious roots like home-level dot-directories still cannot silently host Flow state.
2326
+
2327
+ ## [1.0.14] - 2026-04-21
2328
+
2329
+ ### Highlights
2330
+
2331
+ Flow 1.0.14 focuses on durability after the recent runtime simplification work. This release removes the last SDK/runtime arg-shape bridge helper by aligning the `zod` contract with the plugin SDK, adds executable dependency and completion-lane guardrails, compresses redundant architecture/governance docs, and simplifies the main runtime hotspots without changing the operator-facing surface.
2332
+
2333
+ ### Added
2334
+
2335
+ - Added `scripts/cross-area/dependency-contract.mjs` plus `tests/cross-area/dependency-contract.test.ts` to verify that the repo and `@opencode-ai/plugin` still share the same effective `zod` contract.
2336
+ - Added `scripts/cross-area/check-completion-lane.mjs` and the `bun run check:completion-lane` package script so completion-path edits have an explicit protected verification lane.
2337
+ - Added a documented completion-path protection rule for `src/runtime/transitions/execution-completion.ts`, including a file-level warning and maintainer guidance in `docs/architecture/maintainer-risk-checklist.md`.
2338
+ - Added a stricter dependency-alignment check to `tests/config.test.ts` so SDK/runtime shape compatibility is guarded by CI instead of maintainer memory alone.
2339
+
2340
+ ### Changed
2341
+
2342
+ - Pinned `zod` to `4.1.8` to align with `@opencode-ai/plugin@1.3.10` and remove the remaining direct tool-arg bridge helper from the runtime tool surface.
2343
+ - Simplified the runtime application hotspots in `src/runtime/application/session-actions.ts`, `src/runtime/application/session-engine.ts`, and `src/runtime/application/tool-runtime.ts` by deleting duplicated response, dispatch, and workspace-root plumbing.
2344
+ - Simplified `src/runtime/summary.ts` and `src/runtime/transitions/execution-completion.ts` by centralizing repeated projection and completion-path shaping logic while preserving runtime semantics.
2345
+ - Clarified the public product surface in the README and development guide without shrinking the current 5-agent / 8-command / 17-tool surface.
2346
+ - Reframed `docs/migration/v2-tool-contract.md` as the current canonical tool-contract reference instead of a lingering migration note.
2347
+
2348
+ ### Removed
2349
+
2350
+ - Removed the last explicit SDK/runtime arg-shape bridge helper and the scattered direct bridge casts that existed around the runtime tool surface.
2351
+ - Removed redundant architecture-history documents that were no longer the canonical source of maintainer guidance:
2352
+ - `docs/architecture/bridge-hotspots.md`
2353
+ - `docs/architecture/bridge-seam-owners.md`
2354
+ - `docs/architecture/semantic-invariant-equivalence-matrix.md`
2355
+ - `docs/architecture/surface-matrix.md`
2356
+
2357
+ ### Fixed
2358
+
2359
+ - Fixed the residual risk that future dependency bumps could silently reintroduce the `zod` seam without an executable check.
2360
+ - Fixed stale maintainer guidance that still referenced non-existent response-shaping files after the runtime/application consolidation.
2361
+ - Reduced the chance that future completion-path edits can land without running the highest-signal contract and runtime suites first.
2362
+
2363
+ ## [1.0.13] - 2026-04-21
2364
+
2365
+ ### Highlights
2366
+
2367
+ Flow 1.0.13 consolidates the runtime architecture around clearer engine, action, and presentation boundaries while also making small tasks less ceremonial. This release adds runtime-owned read/mutation/workspace action families, splits low-level operator derivation from higher-level session view models, centralizes doctor/status/history presentation in the runtime application layer, and introduces adaptive lite/standard/strict execution guidance with real lite-lane behavior reductions.
2368
+
2369
+ ### Added
2370
+
2371
+ - Added `src/runtime/application/session-engine.ts` as the shared runtime engine for read, mutation, and workspace action execution.
2372
+ - Added runtime-owned action catalogs for mutation, read, and workspace flows in `src/runtime/application/session-actions.ts`, `src/runtime/application/session-read-actions.ts`, and `src/runtime/application/session-workspace-actions.ts`.
2373
+ - Added runtime-owned doctor and presenter modules in `src/runtime/application/doctor-checks.ts`, `src/runtime/application/doctor-report.ts`, `src/runtime/application/session-presenters.ts`, and `src/runtime/application/operator-presenters.ts`.
2374
+ - Added `src/runtime/session-operator-state.ts` to own low-level lane, blocker, and next-command derivation.
2375
+ - Added `tests/session-engine.test.ts` to verify the named action families and centralized engine boundaries directly.
2376
+
2377
+ ### Changed
2378
+
2379
+ - Introduced adaptive rigor with runtime-owned `lite`, `standard`, and `strict` lanes plus shared operator fields such as `phase`, `blocker`, `reason`, `nextStep`, and `nextCommand`.
2380
+ - Reduced lite-lane ceremony by auto-approving simple draft plans, accepting in-band final review payloads where appropriate, and returning retryable non-human blockers directly to `ready`.
2381
+ - Moved status, history, auto-prepare, activation, closure, and doctor reporting onto runtime-owned presenters and action dispatch instead of tool-local orchestration.
2382
+ - Split high-level session view-model derivation from lower-level operator-state derivation so runtime semantics are easier to maintain and extend.
2383
+ - Consolidated tiny dispatch-only modules back into the paired action-family modules to reduce glue-file sprawl without reintroducing ambiguous ownership.
2384
+
2385
+ ### Removed
2386
+
2387
+ - Removed obsolete tool-layer response and doctor helper files from `src/tools/session-tools/` now that runtime application presenters own those responsibilities.
2388
+ - Removed the standalone dispatch-only runtime application files after folding that logic into the corresponding action modules.
2389
+
2390
+ ### Fixed
2391
+
2392
+ - Fixed the remaining mismatch where session-oriented tools still owned their own response/report assembly instead of using runtime-owned presenters.
2393
+ - Fixed the last architecture drift where operator/status derivation and session view-model derivation were mixed in one place without a clean boundary.
2394
+ - Reduced the risk of future semantic drift by keeping tool adapters thin and routing runtime behavior through a smaller number of authoritative modules.
2395
+
2396
+ ## [1.0.12] - 2026-04-20
2397
+
2398
+ ### Highlights
2399
+
2400
+ Flow 1.0.12 hardens workspace safety so Flow can no longer silently create or mutate session state in unrelated directories such as home-level dot-config trees. This release adds explicit mutable-workspace root validation, keeps history/status-style reads non-mutating, denies external-directory access for the mutating agents, and surfaces the resolved workspace root plus rejection reasons in operator-facing tooling.
2401
+
2402
+ ### Added
2403
+
2404
+ - Added `src/runtime/workspace-root.ts` as the shared owner for mutable workspace-root normalization, trusted-root inspection, and explicit rejection errors.
2405
+ - Added runtime regression coverage in `tests/workspace-root-guard.test.ts` for direct session-layer writes, trusted suspicious roots, and read-only history behavior on empty workspaces.
2406
+ - Added helper coverage for multi-root `FLOW_TRUSTED_WORKSPACE_ROOTS` configuration using the platform path delimiter.
2407
+
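A minimal sketch of multi-root parsing follows. The function names and the trimming and empty-entry handling are assumptions; only the variable name `FLOW_TRUSTED_WORKSPACE_ROOTS`, the use of the platform path delimiter, and exact-path matching come from this changelog. The delimiter is passed in so the sketch stays platform-neutral; in Node it would typically be `path.delimiter` from `node:path`.

```typescript
// Split a delimiter-separated list of roots, dropping blank entries.
function parseTrustedRoots(raw: string | undefined, delimiter: string): string[] {
  if (raw === undefined) return [];
  return raw
    .split(delimiter)
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Trust is exact: a trusted parent does not implicitly trust its children.
function isExplicitlyTrusted(root: string, trustedRoots: string[]): boolean {
  return trustedRoots.indexOf(root) !== -1;
}
```

For example, `parseTrustedRoots("/srv/.agents:/opt/.work", ":")` yields two roots on POSIX-style platforms, and the same call with `";"` handles Windows-style lists.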
2408
+ ### Changed
2409
+
2410
+ - Split Flow workspace resolution into read-only vs mutating paths so status/doctor/history remain readable while mutating actions require an intentional project root.
2411
+ - Hardened the runtime/session write surface so `saveSession`, `saveSessionState`, `syncSessionArtifacts`, workspace setup, activation, closure, and delete flows all validate mutable roots instead of trusting arbitrary strings.
2412
+ - Updated `flow_status` and `flow_doctor` payloads to report the resolved workspace root, its source, whether mutation is allowed, and the concrete rejection reason when Flow blocks a root.
2413
+ - Denied `external_directory` access for `flow-worker` and `flow-auto` as defense-in-depth at the OpenCode agent permission layer.
2414
+ - Clarified README guidance for exact trusted-root overrides, including multiple roots via `FLOW_TRUSTED_WORKSPACE_ROOTS`.
2415
+
2416
+ ### Fixed
2417
+
2418
+ - Fixed the gap that let Flow persist state under suspicious hidden roots; Flow now rejects such roots unless the exact path is explicitly trusted.
2419
+ - Fixed history and stored-session inspection so read-only commands no longer create `.flow/` directories as a side effect on otherwise empty workspaces.
2420
+ - Fixed the remaining gap where lower-level runtime session helpers could bypass the tool-layer workspace safety checks.
2421
+
2422
+ ## [1.0.11] - 2026-04-20
2423
+
2424
+ ### Highlights
2425
+
2426
+ Flow 1.0.11 hardens the new runtime-first simplification work so semantic parity is verified by executable contracts instead of fragile wording checks. This release adds a runtime-owned semantic invariant registry, explicit docs parity markers, stronger protocol/docs parity tests, and supporting architecture artifacts for bridge ownership and strictness.
2427
+
2428
+ ### Added
2429
+
2430
+ - Added `src/runtime/domain/semantic-invariants.ts` as the runtime-owned registry for stable semantic invariant IDs, expectation constants, and owner references.
2431
+ - Added `tests/runtime/semantic-invariants.test.ts` to verify completion-gate order, completion-policy thresholds, decision-gate surfacing, review-scope payload binding, recovery next-action metadata, and canonical tool-surface invariants.
2432
+ - Added `tests/docs-semantic-parity.test.ts` and `tests/docs-tool-parity.test.ts` to keep canonical docs and runtime tool surfaces aligned.
2433
+ - Added architecture references for invariant ownership and rollout planning in `docs/architecture/invariant-matrix.md`, `docs/architecture/strictness-contract.md`, `docs/architecture/semantic-invariant-equivalence-matrix.md`, `docs/architecture/bridge-hotspots.md`, `docs/architecture/bridge-seam-owners.md`, and `docs/architecture/surface-matrix.md`.
2434
+
2435
+ ### Changed
2436
+
2437
+ - Made runtime/domain, runtime/transitions, and runtime/schema the explicit normative owners of Flow workflow semantics, while prompt/contracts/docs now reference runtime-owned invariant IDs instead of re-owning policy.
2438
+ - Replaced brittle semantic wording checks with runtime-derived invariant coverage and explicit `[semantic-invariant]` markers in the canonical architecture docs.
2439
+ - Strengthened maintainer/release guidance and phase checklists so semantic parity, docs parity, and bridge strictness are part of the blocking verification path.
2440
+
2441
+ ### Fixed
2442
+
2443
+ - Fixed the remaining semantic-parity drift risk by verifying invariant owner file/symbol references directly from the runtime catalog.
2444
+ - Fixed the docs semantic-parity gate so it now requires the full runtime-owned invariant catalog, including `tools.canonical_surface.no_raw_wrappers`.
2445
+ - Reduced false positives in owner-resolution checks by allowing more legitimate declaration/export forms instead of only narrow declaration regex matches.
2446
+
2447
+ ## [1.0.10] - 2026-04-19
2448
+
2449
+ ### Highlights
2450
+
2451
+ Flow 1.0.10 makes the control surfaces easier to scan without weakening the runtime tool contract. This release adds `flow_doctor`, introduces runtime guidance plus canonical operator summaries for status-oriented surfaces, defaults `/flow-status` and `/flow-doctor` to compact operator-friendly views, and keeps the fuller structured view available on demand.
2452
+
2453
+ ### Added
2454
+
2455
+ - Added the `flow_doctor` runtime/control surface for non-destructive readiness checks covering install health, command injection, workspace writability, session artifacts, and current next-step guidance.
2456
+ - Added runtime-owned `guidance` and canonical `operatorSummary` fields for `flow_status`, `flow_history_show`, and `flow_doctor`.
2457
+ - Added compact vs detailed status/doctor view support so the default command path is easier for humans to scan while the detailed machine-readable shape remains available.
2458
+
2459
+ ### Changed
2460
+
2461
+ - Updated `/flow-status` and `/flow-doctor` command/control guidance to prefer compact operator-facing summaries by default, with `detail`/`detailed`/`full`/`json` forms opting into the fuller structured view.
2462
+ - Aligned `flow_history_show` next-action guidance so `guidance.nextCommand`, `operatorSummary`, and the top-level `nextCommand` now point to the same follow-up action.
2463
+ - Improved control-surface summaries so `flow-doctor` now leads with doctor-specific warn/fail/ok outcomes instead of reusing a session-only status summary.
2464
+
2465
+ ### Fixed
2466
+
2467
+ - Fixed the previous mismatch where history/show responses could present different next commands depending on whether the caller looked at `guidance`, `operatorSummary`, or the top-level response.
2468
+ - Reduced compact-mode payload cost by emitting minified JSON for compact `flow_status` and `flow_doctor` responses.
2469
+ - Reduced test duplication around doctor/install setup while keeping full release-gate coverage green.
2470
+
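The compact-by-default behavior can be sketched as below. The `resolveView` and `renderPayload` names, the case-insensitive matching, and the two-space indentation of the detailed view are assumptions; the changelog only states which argument forms opt into the fuller view and that compact responses are minified JSON.

```typescript
type View = "compact" | "detailed";

// Argument forms that opt into the fuller structured view.
const DETAILED_FORMS = ["detail", "detailed", "full", "json"];

function resolveView(arg?: string): View {
  const normalized = (arg ?? "").trim().toLowerCase();
  return DETAILED_FORMS.indexOf(normalized) !== -1 ? "detailed" : "compact";
}

// Compact output is minified; detailed output is pretty-printed here
// purely for readability (an assumption of this sketch).
function renderPayload(payload: unknown, view: View): string {
  return view === "detailed" ? JSON.stringify(payload, null, 2) : JSON.stringify(payload);
}
```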
2471
+ ## [1.0.9] - 2026-04-19
2472
+
2473
+ ### Highlights
2474
+
2475
+ Flow 1.0.9 turns the new workflow semantics into explicit runtime behavior. This release adds a runtime `decisionGate`, requires structured replan reasons, makes session close outcomes explicit through `flow_session_close`, and updates Flow's prompts, summaries, and docs to match the stricter workflow model.
2476
+
2477
+ ### Added
2478
+
2479
+ - Added runtime-owned decision-gate derivation so blocking planning decisions are surfaced in session summaries as `decisionGate`.
2480
+ - Added structured replan metadata requirements: `replanReason`, `failedAssumption`, and `recommendedAdjustment`.
2481
+ - Added explicit session closure metadata for `completed`, `deferred`, and `abandoned` outcomes.
2482
+
2483
+ ### Changed
2484
+
2485
+ - Replaced the old session-close flow with explicit `flow_session_close` semantics and made the closure kind required.
2486
+ - Updated runtime summaries, rendered session docs, and reviewer records to expose decision gates, closure state, and review purpose more clearly.
2487
+ - Updated planner/auto contracts and README/development docs to describe runtime-backed decision taxonomy, delivery policy, and active/stored/completed history behavior.
2488
+
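As a rough illustration of the stricter inputs, the sketch below validates a required closure kind and reports missing replan fields. The field names and the three closure outcomes come from this entry; the function names, error message, and the "non-empty string" rule are assumptions rather than the runtime's actual schema.

```typescript
type ClosureKind = "completed" | "deferred" | "abandoned";

const CLOSURE_KINDS: ClosureKind[] = ["completed", "deferred", "abandoned"];

// The closure kind is required, so anything else is rejected outright.
function parseClosureKind(value: unknown): ClosureKind {
  if (typeof value === "string" && (CLOSURE_KINDS as string[]).indexOf(value) !== -1) {
    return value as ClosureKind;
  }
  throw new Error("closure kind must be one of: " + CLOSURE_KINDS.join(", "));
}

interface ReplanMetadata {
  replanReason: string;
  failedAssumption: string;
  recommendedAdjustment: string;
}

const REPLAN_FIELDS: (keyof ReplanMetadata)[] = [
  "replanReason",
  "failedAssumption",
  "recommendedAdjustment",
];

// Returns the names of required replan fields that are absent or blank.
function missingReplanFields(input: Partial<ReplanMetadata>): (keyof ReplanMetadata)[] {
  return REPLAN_FIELDS.filter((field) => {
    const value = input[field];
    return typeof value !== "string" || value.trim().length === 0;
  });
}
```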
2489
+ ## [1.0.8] - 2026-04-19
2490
+
2491
+ ### Highlights
2492
+
2493
+ Flow 1.0.8 finishes the session-storage redesign around explicit `active/`, `stored/`, and `completed/` directories. This release removes the old pointer-file model, aligns runtime/tool/test terminology with the new completed-history behavior, and simplifies completed-session storage logic so the filesystem layout, runtime behavior, and docs all say the same thing.
2494
+
2495
+ ### Changed
2496
+
2497
+ - Replaced the old `.flow/active` pointer plus `.flow/sessions/` and `.flow/archive/` layout with directory-based `.flow/active/<session-id>/`, `.flow/stored/<session-id>/`, and `.flow/completed/<session-id>-<timestamp>/`.
2498
+ - Updated session persistence, activation, history lookup, render syncing, and control-tool payloads to use `stored` and `completed` terminology consistently.
2499
+ - Centralized completed-session naming, collision handling, and lookup logic in a shared runtime storage helper to reduce duplication and layering drift.
2500
+
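The directory layout above lends itself to a small path-building sketch. The three directory patterns are taken from this entry; the helper names, the compact UTC timestamp format, and the numeric collision suffix are assumptions chosen for illustration.

```typescript
function activeDir(sessionId: string): string {
  return ".flow/active/" + sessionId;
}

function storedDir(sessionId: string): string {
  return ".flow/stored/" + sessionId;
}

// Builds `<session-id>-<timestamp>` and appends `-1`, `-2`, ... when the
// name is already taken. `taken` stands in for a directory listing.
function completedDirName(sessionId: string, closedAt: Date, taken: ReadonlySet<string>): string {
  // Assumed format: ISO-8601 UTC with punctuation removed, millisecond precision.
  const stamp = closedAt.toISOString().replace(/[-:.]/g, "");
  const base = sessionId + "-" + stamp;
  if (!taken.has(base)) return base;
  for (let attempt = 1; ; attempt++) {
    const candidate = base + "-" + attempt;
    if (!taken.has(candidate)) return candidate;
  }
}

function completedDir(sessionId: string, closedAt: Date, taken: ReadonlySet<string>): string {
  return ".flow/completed/" + completedDirName(sessionId, closedAt, taken);
}
```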
2501
+ ### Removed
2502
+
2503
+ - Removed the active-session pointer-file model from runtime persistence.
2504
+ - Removed the remaining archive-oriented runtime/test terminology in favor of completed-session wording.
2505
+ - Removed the redundant whitespace-only goal regression file after folding that coverage into the path-traversal suite.
2506
+
2507
+ ## [1.0.7] - 2026-04-19
2508
+
2509
+ ### Highlights
2510
+
2511
+ Flow 1.0.7 simplifies the plugin around a canonical-only runtime and install surface. This release removes deprecated raw-wrapper guidance, deletes the unused `requireFinalReview` knob, tightens prompt/runtime parity coverage, clarifies session-tool ownership boundaries, and drops legacy install/session-migration compatibility paths in favor of the current canonical layouts.
2512
+
2513
+ ### Changed
2514
+
2515
+ - Simplified Flow's canonical tool guidance, runtime boundaries, and session-tool module structure with stronger guardrails and protocol-parity coverage.
2516
+ - Removed the legacy `requireFinalReview` completion-policy field while keeping final review enforced by the final completion path.
2517
+ - Updated README, maintainer docs, and migration notes to reflect the current canonical-only behavior and risk checklist.
2518
+
2519
+ ### Removed
2520
+
2521
+ - Removed legacy raw-wrapper guidance and the unused contract-normalization seam.
2522
+ - Removed legacy install-path compatibility; Flow now installs and uninstalls only at `~/.config/opencode/plugins/flow.js`.
2523
+ - Removed legacy `.flow/session.json` auto-migration support; Flow now expects the current session-history layout only.
2524
+
2525
+ ## [1.0.5] - 2026-04-19
2526
+
2527
+ ### Highlights
2528
+
2529
+ Flow 1.0.5 restores reliable curl-based uninstall behavior from release artifacts by making uninstall idempotent and user-friendly when no plugin file is present.
2530
+
2531
+ ### Fixed
2532
+
2533
+ - Fixed `uninstall.sh` from release downloads to always succeed cleanly when Flow is already absent.
2534
+ - Added an explicit informational message when no plugin file is found at canonical or legacy install paths.
2535
+
2536
+ ## [1.0.4] - 2026-04-19
2537
+
2538
+ ### Highlights
2539
+
2540
+ Flow 1.0.4 cleans up the deterministic planning-context release by restoring the changelog structure and simplifying the new planning-context tool implementation. This keeps the 1.0.3 feature behavior intact while tightening release metadata and runtime-tool maintainability.
2541
+
2542
+ ### Changed
2543
+
2544
+ - Restored the missing markdown heading structure for the 1.0.2 changelog entry.
2545
+ - Simplified `flow_plan_context_record` by removing the redundant raw-input cast and consolidating schema imports.
2546
+ - Revalidated the full release suite after the cleanup.
2547
+
2548
+ ## [1.0.3] - 2026-04-19
2549
+
2550
+ ### Highlights
2551
+
2552
+ Flow 1.0.3 adds deterministic planning context capture before planning, including repo-profile persistence, optional research notes, and decision logging. Autonomous mode now pauses on meaningful unresolved decisions with a recommended path, and the README workflow docs/diagram now reflect that behavior.
2553
+
2554
+ ### Added
2555
+
2556
+ - Added `flow_plan_context_record` to persist repo profile, research, implementation approach, and planning decision logs into the active session.
2557
+ - Added planning decision schemas and decision-log rendering in Flow session summaries.
2558
+
2559
+ ### Changed
2560
+
2561
+ - Updated planner and autonomous prompts to detect stack context first and research only when local repo evidence is insufficient for a high-confidence path.
2562
+ - Restricted explicit decision gating to `/flow-auto`, where unresolved meaningful decisions now stop with options, rationale, and a recommended path.
2563
+ - Updated `README.md` prose and Mermaid workflow diagram to document deterministic planning context, research triggers, and `/flow-auto` decision pauses.
2564
+
2565
+ ## [1.0.2] - 2026-04-19
2566
+
2567
+ ### Highlights
2568
+
2569
+ Flow 1.0.2 extends strict malformed-JSON hardening to persisted session loading and legacy session migration. Session files now reject duplicate keys and other malformed object shapes consistently before runtime schema validation.
2570
+
2571
+ ### Changed
2572
+
2573
+ - Reused the strict JSON object parser for persisted `.flow` session loading and legacy session migration.
2574
+ - Added regression tests covering duplicate-key failures in active and legacy session JSON.
2575
+ - Reduced remaining production malformed-JSON exposure to local tooling/script parse sites rather than runtime session ingestion.
2576
+
2577
+ ## [1.0.1] - 2026-04-19
2578
+
2579
+ ### Highlights
2580
+
2581
+ Flow 1.0.1 hardens reviewer and worker contract ingestion so malformed raw JSON can no longer silently leak into runtime persistence. The release adds strict object scanning, duplicate-key detection, clearer malformed-payload recovery codes, and safer raw wrapper tools for reviewer/final-review/worker completion ingestion.
2582
+
2583
+ ### Added
2584
+
2585
+ - Added `src/runtime/contract-normalization.ts` with strict raw JSON contract parsing and normalization for reviewer and worker payloads.
2586
+ - Added raw-ingestion runtime tools for feature review, final review, and worker completion persistence.
2587
+ - Added regression coverage for duplicate keys, trailing text, non-object payloads, schema failures, and raw-wrapper recovery behavior.
2588
+
2589
+ ### Changed
2590
+
2591
+ - Updated Flow worker/auto command guidance to route reviewer and worker persistence through the safer `*_from_raw` tools.
2592
+ - Marked direct structured persistence tools as low-level/internal so the safer raw-ingestion wrappers are the preferred path.
2593
+ - Improved malformed-payload recovery metadata to surface precise error codes such as `duplicate_json_key`, `trailing_text`, `non_object_payload`, and `schema_validation_failed`.
2594
+
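Plain `JSON.parse` silently keeps the last value when a key is repeated, which is why a strict scan is needed before schema validation. The sketch below shows one way such a scan could work; it is not the code in `src/runtime/contract-normalization.ts`. The function name, result shape, and the `invalid_json` catch-all code are assumptions, while `duplicate_json_key`, `trailing_text`, and `non_object_payload` are the codes named in this entry (`schema_validation_failed` would come from a later schema step that is not shown).

```typescript
type StrictJsonErrorCode =
  | "duplicate_json_key"
  | "trailing_text"
  | "non_object_payload"
  | "invalid_json"; // hypothetical catch-all, not named in the changelog

type StrictJsonResult =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; code: StrictJsonErrorCode };

function parseStrictJsonObject(text: string): StrictJsonResult {
  let pos = 0;
  const fail = (code: StrictJsonErrorCode): never => {
    throw code;
  };
  const skipWhitespace = (): void => {
    while (pos < text.length && " \t\n\r".indexOf(text.charAt(pos)) !== -1) pos++;
  };
  // Consumes a string literal at `pos` and returns its decoded value,
  // so escaped and unescaped spellings of the same key compare equal.
  const scanString = (): string => {
    const start = pos++;
    while (pos < text.length && text.charAt(pos) !== '"') {
      pos += text.charAt(pos) === "\\" ? 2 : 1;
    }
    if (pos >= text.length) fail("invalid_json");
    pos++;
    return JSON.parse(text.slice(start, pos)) as string;
  };
  const scanValue = (): void => {
    skipWhitespace();
    const ch = text.charAt(pos);
    if (ch === '"') scanString();
    else if (ch === "{") scanObject();
    else if (ch === "[") scanArray();
    else {
      const scalar = /^(?:-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?|true|false|null)/.exec(text.slice(pos));
      if (scalar === null) fail("invalid_json");
      else pos += scalar[0].length;
    }
  };
  function scanArray(): void {
    pos++; // consume "["
    skipWhitespace();
    if (text.charAt(pos) === "]") { pos++; return; }
    for (;;) {
      scanValue();
      skipWhitespace();
      const separator = text.charAt(pos++);
      if (separator === "]") return;
      if (separator !== ",") fail("invalid_json");
    }
  }
  function scanObject(): void {
    const seen = new Set<string>(); // keys seen in this object only
    pos++; // consume "{"
    skipWhitespace();
    if (text.charAt(pos) === "}") { pos++; return; }
    for (;;) {
      skipWhitespace();
      if (text.charAt(pos) !== '"') fail("invalid_json");
      const key = scanString();
      if (seen.has(key)) fail("duplicate_json_key");
      seen.add(key);
      skipWhitespace();
      if (text.charAt(pos++) !== ":") fail("invalid_json");
      scanValue();
      skipWhitespace();
      const separator = text.charAt(pos++);
      if (separator === "}") return;
      if (separator !== ",") fail("invalid_json");
    }
  }
  try {
    skipWhitespace();
    if (text.charAt(pos) !== "{") fail("non_object_payload");
    scanObject();
    skipWhitespace();
    if (pos < text.length) fail("trailing_text");
    // The scan only checks structure; the value itself comes from JSON.parse.
    return { ok: true, value: JSON.parse(text) as Record<string, unknown> };
  } catch (err) {
    const code: StrictJsonErrorCode = typeof err === "string" ? (err as StrictJsonErrorCode) : "invalid_json";
    return { ok: false, code };
  }
}
```

The scan tracks keys per object, so the same key may appear in a parent and a nested object without being flagged.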
2595
+ ## [1.0.0] - 2026-04-18
2596
+
2597
+ ### Highlights
2598
+
2599
+ Flow 1.0.0 delivers the full six-milestone overhaul proposed for the OpenCode plugin: stricter foundations, correctness hardening for session persistence and validation, schema unification, a transition-layer refactor, measured rendering and bundle work, and final alignment with current OpenCode plugin APIs and release workflows. Measured wins from the shipped benchmarks include a bundled runtime reduced to 455,166 bytes, transition reducers improved by 51.13% to 90.38% versus baseline, and a warm `saveSession` path that stays at 777.70 µs average while the unchanged-session write path stays within the ≤ 1.0 ms release gate.
2600
+
2601
+ ### Added
2602
+
2603
+ - Added the `/flow-history show <session-id>` control command so archived and active stored sessions can be inspected directly by id.
2604
+ - Added canonical installation support for `~/.config/opencode/plugins/flow.js` while preserving legacy installs at `~/.opencode/plugins/flow.js` when they already exist.
2605
+ - Added a `mitata` benchmark harness under `bench/` with `bun run bench` for the full suite and `bun run bench:smoke` for the CI-sized reducer smoke gate.
2606
+ - Added committed benchmark baselines in `bench/BASELINE.md` and post-optimization comparisons in `bench/RESULTS.md`.
2607
+ - Added golden markdown fixtures under `tests/__fixtures__/render/` for empty, single-feature, mid-execution, 20-feature, all-completed, and 100-feature session shapes.
2608
+ - Added a pack-invariants verification script and test coverage to keep published package contents and CHANGELOG versioning in sync.
2609
+ - Added the `experimental.session.compacting` hook so Flow session context is appended during OpenCode session compaction.
2610
+ - Added metadata emission for all 15 Flow tools through `context.metadata({ title, metadata })` without changing the string-returning tool contract.
2611
+ - Added plugin-internal logging via `ctx.client.app.log(...)` in the plugin hot path.
2612
+ - Added a committed `CHANGELOG.md` as a release artifact shipped with the package.
2613
+ - Added a Migration / Upgrade section to the README to explain the canonical plugin path and legacy compatibility behavior.
2614
+ - Added a GitHub release workflow that extracts the matching CHANGELOG section and uses it to populate release notes on tag pushes.
2615
+
2616
+ ### Changed
2617
+
2618
+ - Tightened TypeScript with six additional strict flags: `noUncheckedIndexedAccess`, `exactOptionalPropertyTypes`, `noImplicitOverride`, `noFallthroughCasesInSwitch`, `verbatimModuleSyntax`, and `isolatedModules`.
2619
+ - Adopted Biome as the repo-wide formatter and linter, and wired `bun run lint` plus formatter checks into the project validation flow.
2620
+ - Consolidated transition logic from the earlier 15-file layout into six transition modules while preserving the public transition surface.
2621
+ - Unified runtime schemas under `src/runtime/schema.ts` so tool-layer shapes derive from the runtime source of truth instead of duplicating schema definitions.
2622
+ - Centralized slash-command identifiers and shared error helpers in `src/runtime/constants.ts` and `src/runtime/errors.ts`.
2623
+ - Reworked session persistence to use atomic temp-file-plus-rename writes with a per-worktree in-process save lock.
2624
+ - Hardened path handling so session ids, feature ids, and derived paths reject traversal and malformed components before filesystem access.
2625
+ - Made workspace setup idempotent, including `.flow/.gitignore` maintenance that preserves custom lines while restoring required entries.
2626
+ - Switched archive naming to millisecond-precision timestamps with collision retry suffixes and matching history parsing.
2627
+ - Removed repeated runtime reparsing by parsing tool arguments once at the boundary and operating on typed runtime data internally.
2628
+ - Replaced broad transition cloning with narrower immutable updates in the reducer hot path.
2629
+ - Added incremental markdown rendering with hash-based `writeDocIfChanged` behavior so unchanged saves skip redundant doc writes.
2630
+ - Added read caching keyed by session file metadata and workspace-preparation caching to reduce repeated filesystem work.
2631
+ - Optimized the bundle by externalizing the `@opencode-ai/plugin` peer dependency and building with syntax and whitespace minification plus external sourcemaps.
2632
+ - Updated `bun run check` ordering so the build step runs before tests, matching fresh-CI release conditions where `dist/` does not exist yet.
2633
+ - Restricted publishable package contents to `dist/`, `LICENSE`, `README.md`, and `CHANGELOG.md` plus npm's auto-included `package.json`.
2634
+
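The persistence and rendering entries above (atomic temp-file-plus-rename writes and hash-based `writeDocIfChanged` behavior) can be illustrated with a minimal sketch. This is not Flow's implementation: `writeDocIfChangedSketch`, `lastWrittenHash`, and `demo` are hypothetical names, the in-memory hash cache is an assumption about how unchanged saves might be detected, and the per-worktree save lock is omitted.

```typescript
import { createHash } from "node:crypto";
import { mkdtemp, readFile, rename, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { basename, dirname, join } from "node:path";

// Assumption: remember the hash of the last content written per path.
const lastWrittenHash = new Map<string, string>();

const sha256 = (text: string): string =>
  createHash("sha256").update(text).digest("hex");

// Hypothetical helper: returns true if the file was written,
// false if the content hash matched the previous write.
async function writeDocIfChangedSketch(
  path: string,
  content: string,
): Promise<boolean> {
  const hash = sha256(content);
  if (lastWrittenHash.get(path) === hash) return false;

  // Write to a sibling temp file, then rename over the target so readers
  // never observe a partially written document.
  const tmp = join(
    dirname(path),
    `.${basename(path)}.${process.pid}.${Date.now()}.tmp`,
  );
  await writeFile(tmp, content, "utf8");
  await rename(tmp, path);

  lastWrittenHash.set(path, hash);
  return true;
}

// Demo: the second save with identical content is skipped.
async function demo(): Promise<[boolean, boolean, string]> {
  const dir = await mkdtemp(join(tmpdir(), "flow-sketch-"));
  const file = join(dir, "index.md");
  const first = await writeDocIfChangedSketch(file, "# Session\n");
  const second = await writeDocIfChangedSketch(file, "# Session\n");
  return [first, second, await readFile(file, "utf8")];
}
```

The temp file lives in the same directory as the target so the rename stays on one filesystem. A cache like this would go stale if another process edited the file, which is one reason a real implementation might also key on file metadata.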
2635
+ ### Breaking
2636
+
2637
+ - New installs now target `~/.config/opencode/plugins/flow.js` as the canonical plugin path, while legacy `~/.opencode/plugins/flow.js` installs remain compatibility-only.
2638
+ - The mission intentionally introduced `.flow/` storage and session-format changes, so users may need to restart active Flow sessions after upgrading to 1.0.0.
2639
+ - Flow tools now return strings and emit UI metadata through the `context.metadata({ title, metadata })` side effect, instead of following the earlier `{ title, metadata, output }`-style contract.
2640
+ - The `bun run check` pipeline now builds before testing, which changes the execution order expected by downstream automation.
2641
+
2642
+ ### Fixed
2643
+
2644
+ - Fixed `clearExecution` immutability so transition helpers no longer mutate caller-owned execution state.
2645
+ - Fixed `toArchiveTimestamp` formatting to strip the trailing `Z` while preserving millisecond precision for archive directory names.
2646
+ - Fixed recovery resolution-hint parity so recovery metadata remains byte-for-byte aligned with the documented contract.
2647
+ - Fixed incremental-render idempotency for VAL-PERF-006 by removing the stray `- updated:` line from unchanged index markdown output.
2648
+ - Fixed fixture determinism by adding `setNowIsoOverride`-based time control for snapshot and benchmark-adjacent tests.
2649
+
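For the `toArchiveTimestamp` entry above, one plausible shape of the fix is sketched here. The function name `toArchiveTimestampSketch` is hypothetical, and the real helper may normalize other characters as well; the sketch only shows stripping the trailing `Z` while keeping milliseconds.

```typescript
// Hypothetical sketch: Date.prototype.toISOString() always returns UTC in the
// form YYYY-MM-DDTHH:mm:ss.sssZ, so removing the final "Z" keeps millisecond
// precision intact.
function toArchiveTimestampSketch(date: Date): string {
  return date.toISOString().replace(/Z$/, "");
}

// Example: 12 June 2026, 10:20:30.123 UTC
// toArchiveTimestampSketch(new Date(Date.UTC(2026, 5, 12, 10, 20, 30, 123)))
// returns "2026-06-12T10:20:30.123"
```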
2650
+ ### Performance
2651
+
2652
+ - Bundle size dropped from the original pre-mission ~0.99 MB baseline to a 455,166-byte release asset.
2653
+ - `transition reducer / applyPlan` improved from 19.97 µs to 9.76 µs average (-51.13%).
2654
+ - `transition reducer / approvePlan` improved from 49.06 µs to 9.64 µs average (-80.35%).
2655
+ - `transition reducer / startRun` improved from 77.63 µs to 11.82 µs average (-84.77%).
2656
+ - `transition reducer / completeRun` improved from 139.61 µs to 13.43 µs average (-90.38%).
2657
+ - `warm saveSession cycle` held at 777.70 µs average with the incremental writer enabled, staying below the release gate for unchanged-session saves.
2658
+ - `full saveSession cycle / 20-feature plan` measured 3.76 ms average after M5 versus a 3.38 ms baseline; the cold-path regression is explicitly documented as a trade-off for warm-save wins.
2659
+ - `session save round-trip` measured 2.45 ms average after optimization work versus 1.91 ms baseline, with the extra cache invalidation and render bookkeeping called out in benchmark notes.
2660
+ - `markdown render / index` measured 3.87 µs average after the renderer rewrite versus 3.52 µs baseline, with the small fixed-cost increase documented as the price of skipped writes on unchanged saves.
2661
+ - `markdown render / feature` measured 793.16 ns average after optimization versus 766.81 ns baseline, remaining within the 5% tolerance gate.
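
The percentages quoted in the performance entries above follow from the usual relative-change formula, (after − before) / before × 100. A small sketch for rechecking them, with `percentChange` as a hypothetical helper name:

```typescript
// Relative change in percent; negative values mean the operation got faster.
function percentChange(before: number, after: number): number {
  return ((after - before) / before) * 100;
}

// Reducer benchmarks from the list above:
// percentChange(19.97, 9.76).toFixed(2)   returns "-51.13"  (applyPlan)
// percentChange(49.06, 9.64).toFixed(2)   returns "-80.35"  (approvePlan)
// percentChange(77.63, 11.82).toFixed(2)  returns "-84.77"  (startRun)
// percentChange(139.61, 13.43).toFixed(2) returns "-90.38"  (completeRun)

// The feature-render entry: 766.81 ns baseline versus 793.16 ns afterwards is
// an increase of about 3.44%, which is inside the stated 5% tolerance gate.
// percentChange(766.81, 793.16).toFixed(2) returns "3.44"
```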