@kontourai/flow-agents 1.3.0 → 2.0.0

Files changed (214)
  1. package/.github/CODEOWNERS +29 -0
  2. package/.github/actions/trust-verify/action.yml +145 -0
  3. package/.github/workflows/ci.yml +11 -4
  4. package/.github/workflows/kit-gates-demo.yml +2 -2
  5. package/.github/workflows/publish-npm.yml +10 -2
  6. package/.github/workflows/release-please.yml +1 -1
  7. package/.github/workflows/trust-reconcile.yml +113 -0
  8. package/AGENTS.md +13 -0
  9. package/CHANGELOG.md +103 -0
  10. package/CONTRIBUTING.md +4 -4
  11. package/README.md +1 -0
  12. package/agents/tool-planner.json +1 -1
  13. package/build/src/cli/console-learning-projection.d.ts +1 -0
  14. package/build/src/cli/effective-backlog-settings.d.ts +1 -0
  15. package/build/src/cli/fixture-retirement-audit.d.ts +2 -0
  16. package/build/src/cli/init.d.ts +17 -0
  17. package/build/src/cli/init.js +242 -20
  18. package/build/src/cli/kit.d.ts +1 -0
  19. package/build/src/cli/promote-workflow-artifact.d.ts +1 -0
  20. package/build/src/cli/publish-change-helper.d.ts +1 -0
  21. package/build/src/cli/pull-work-provider.d.ts +1 -0
  22. package/build/src/cli/runtime-adapter.d.ts +1 -0
  23. package/build/src/cli/telemetry-doctor.d.ts +1 -0
  24. package/build/src/cli/usage-feedback.d.ts +1 -0
  25. package/build/src/cli/utterance-check.d.ts +1 -0
  26. package/build/src/cli/validate-hook-influence.d.ts +1 -0
  27. package/build/src/cli/validate-source-tree.d.ts +1 -0
  28. package/build/src/cli/validate-workflow-artifacts.d.ts +2 -0
  29. package/build/src/cli/validate-workflow-artifacts.js +19 -2
  30. package/build/src/cli/verify.d.ts +1 -0
  31. package/build/src/cli/verify.js +90 -0
  32. package/build/src/cli/veritas-governance.d.ts +1 -0
  33. package/build/src/cli/workflow-artifact-cleanup-audit.d.ts +1 -0
  34. package/build/src/cli/workflow-sidecar.d.ts +324 -0
  35. package/build/src/cli/workflow-sidecar.js +1973 -90
  36. package/build/src/cli.d.ts +2 -0
  37. package/build/src/cli.js +2 -3
  38. package/build/src/flow-kit/validate.d.ts +81 -0
  39. package/build/src/index.d.ts +5 -0
  40. package/build/src/index.js +36 -0
  41. package/build/src/lib/args.d.ts +8 -0
  42. package/build/src/lib/flow-resolver.d.ts +82 -0
  43. package/build/src/lib/flow-resolver.js +237 -0
  44. package/build/src/lib/fs.d.ts +7 -0
  45. package/build/src/lib/workflow-learning-projection.d.ts +132 -0
  46. package/build/src/runtime-adapters.d.ts +18 -0
  47. package/build/src/tools/build-universal-bundles.d.ts +2 -0
  48. package/build/src/tools/build-universal-bundles.js +34 -22
  49. package/build/src/tools/common.d.ts +9 -0
  50. package/build/src/tools/generate-context-map.d.ts +2 -0
  51. package/build/src/tools/generate-context-map.js +3 -16
  52. package/build/src/tools/validate-package.d.ts +2 -0
  53. package/build/src/tools/validate-source-tree.d.ts +2 -0
  54. package/build/src/tools/validate-source-tree.js +42 -162
  55. package/context/contracts/artifact-contract.md +10 -0
  56. package/context/contracts/delivery-contract.md +1 -0
  57. package/context/contracts/review-contract.md +1 -0
  58. package/context/contracts/verification-contract.md +2 -0
  59. package/context/gate-awareness.md +39 -0
  60. package/context/scripts/hooks/stop-goal-fit.js +632 -70
  61. package/docs/adr/0001-flow-agents-consumes-flow.md +1 -1
  62. package/docs/adr/0002-flow-kits-as-extension-unit.md +1 -1
  63. package/docs/adr/0004-gates-expect-surface-claims.md +2 -0
  64. package/docs/adr/0005-kubernetes-inspired-resource-contracts.md +2 -0
  65. package/docs/adr/0007-skill-audit.md +1 -1
  66. package/docs/adr/0009-canonical-hook-core-kit-boundary.md +95 -0
  67. package/docs/adr/0010-workflow-trust-state-as-hachure-bundle.md +139 -0
  68. package/docs/adr/0011-mcp-posture.md +100 -0
  69. package/docs/adr/0012-agent-coordination-as-liveness-claims.md +119 -0
  70. package/docs/adr/0013-context-lifecycle.md +151 -0
  71. package/docs/adr/0014-core-vs-domain-kit-boundary.md +143 -0
  72. package/docs/adr/0015-flow-flow-agents-boundary-reconciliation.md +120 -0
  73. package/docs/adr/0016-three-hard-boundary-model.md +71 -0
  74. package/docs/adr/0017-anti-gaming-trust-security-model.md +155 -0
  75. package/docs/agent-system-guidebook.md +5 -12
  76. package/docs/context-map.md +4 -10
  77. package/docs/developer-architecture.md +14 -0
  78. package/docs/index.md +3 -2
  79. package/docs/integrations/framework-adapter.md +19 -6
  80. package/docs/integrations/index.md +2 -2
  81. package/docs/north-star.md +4 -4
  82. package/docs/operating-layers.md +3 -3
  83. package/docs/plans/adr-0010-phase2-gate-recompute.md +55 -0
  84. package/docs/repository-structure.md +2 -2
  85. package/docs/skills-map.md +1 -0
  86. package/docs/spec/runtime-hook-surface.md +78 -10
  87. package/docs/standards-register.md +3 -3
  88. package/docs/survey-utterance-check.md +1 -1
  89. package/docs/trust-anchor-adoption.md +197 -0
  90. package/docs/verifiable-trust.md +95 -0
  91. package/docs/veritas-integration.md +2 -2
  92. package/docs/workflow-usage-guide.md +69 -0
  93. package/evals/acceptance/DEMO-false-completion.md +144 -0
  94. package/evals/acceptance/demo-cast.sh +92 -0
  95. package/evals/acceptance/demo-false-completion.sh +72 -0
  96. package/evals/acceptance/demo-real-evidence.sh +104 -0
  97. package/evals/acceptance/demo.tape +29 -0
  98. package/evals/acceptance/prove-capture-teeth-declared.sh +335 -0
  99. package/evals/acceptance/prove-capture-teeth.sh +114 -0
  100. package/evals/acceptance/prove-teeth.sh +105 -0
  101. package/evals/ci/antigaming-suite.sh +54 -0
  102. package/evals/ci/run-baseline.sh +2 -0
  103. package/evals/fixtures/flow-kit-repository/invalid-missing-extension-asset/flows/review.flow.json +26 -0
  104. package/evals/fixtures/flow-kit-repository/invalid-missing-extension-asset/kit.json +20 -0
  105. package/evals/fixtures/flow-kit-repository/valid-unknown-extension/flows/review.flow.json +26 -0
  106. package/evals/fixtures/flow-kit-repository/valid-unknown-extension/kit.json +18 -0
  107. package/evals/integration/test_builder_step_producers.sh +379 -0
  108. package/evals/integration/test_bundle_install.sh +35 -71
  109. package/evals/integration/test_bundle_lifecycle.sh +39 -2
  110. package/evals/integration/test_captured_fail_reconciliation.sh +820 -0
  111. package/evals/integration/test_checkpoint_signing.sh +489 -0
  112. package/evals/integration/test_claim_lookup.sh +352 -0
  113. package/evals/integration/test_command_log_integrity.sh +275 -0
  114. package/evals/integration/test_context_map.sh +0 -2
  115. package/evals/integration/test_dual_emit_flow_step.sh +278 -0
  116. package/evals/integration/test_enforcer_expects_driven.sh +281 -0
  117. package/evals/integration/test_evidence_capture_hook.sh +185 -0
  118. package/evals/integration/test_flow_kit_repository.sh +2 -0
  119. package/evals/integration/test_flowdef_session_activation.sh +273 -0
  120. package/evals/integration/test_flowdef_session_history_preservation.sh +250 -0
  121. package/evals/integration/test_gate_bypass_chain.sh +448 -0
  122. package/evals/integration/test_gate_lockdown.sh +1137 -0
  123. package/evals/integration/test_gate_review_inquiry_records.sh +399 -0
  124. package/evals/integration/test_goal_fit_escape_hatch.sh +73 -0
  125. package/evals/integration/test_goal_fit_hook.sh +69 -4
  126. package/evals/integration/test_goal_fit_rederive.sh +263 -0
  127. package/evals/integration/test_hook_category_behaviors.sh +14 -0
  128. package/evals/integration/test_install_merge.sh +1176 -0
  129. package/evals/integration/test_mint_attestation.sh +373 -0
  130. package/evals/integration/test_phase_map_and_gate_claim.sh +365 -0
  131. package/evals/integration/test_publish_delivery.sh +269 -0
  132. package/evals/integration/test_reconcile_soundness.sh +528 -0
  133. package/evals/integration/test_resolvefirststep_security.sh +208 -0
  134. package/evals/integration/test_session_resume_roundtrip.sh +286 -0
  135. package/evals/integration/test_trust_checkpoint.sh +325 -0
  136. package/evals/integration/test_trust_reconcile.sh +293 -0
  137. package/evals/integration/test_verify_cli.sh +208 -0
  138. package/evals/integration/test_workflow_sidecar_writer.sh +549 -34
  139. package/evals/lib/node.sh +0 -6
  140. package/evals/run.sh +47 -0
  141. package/evals/static/test_library_exports.sh +85 -0
  142. package/evals/static/test_universal_bundles.sh +15 -0
  143. package/evals/static/test_workflow_skills.sh +6 -13
  144. package/install.sh +0 -7
  145. package/integrations/strands-ts/README.md +25 -15
  146. package/integrations/veritas/flow-agents.adapter.json +1 -2
  147. package/kits/builder/flows/build.flow.json +59 -12
  148. package/kits/builder/kit.json +85 -15
  149. package/kits/builder/skills/continue-work/SKILL.md +116 -0
  150. package/kits/builder/skills/deliver/SKILL.md +36 -6
  151. package/kits/builder/skills/design-probe/SKILL.md +28 -0
  152. package/kits/builder/skills/execute-plan/SKILL.md +9 -1
  153. package/kits/builder/skills/gate-review/SKILL.md +234 -0
  154. package/kits/builder/skills/learning-review/SKILL.md +30 -0
  155. package/kits/builder/skills/pickup-probe/SKILL.md +29 -0
  156. package/kits/builder/skills/plan-work/SKILL.md +13 -1
  157. package/kits/builder/skills/pull-work/SKILL.md +19 -0
  158. package/kits/knowledge/adapters/default-store/index.js +38 -0
  159. package/kits/knowledge/adapters/flow-runner/index.js +1620 -0
  160. package/kits/knowledge/adapters/obsidian-store/index.js +36 -6
  161. package/kits/knowledge/docs/store-contract.md +314 -0
  162. package/kits/knowledge/evals/audit-freshness/suite.test.js +368 -0
  163. package/kits/knowledge/evals/canonicalize-category/suite.test.js +383 -0
  164. package/kits/knowledge/evals/contract-suite/suite.test.js +111 -0
  165. package/kits/knowledge/evals/detect-contradictions/suite.test.js +324 -0
  166. package/kits/knowledge/evals/entities/suite.test.js +40 -0
  167. package/kits/knowledge/evals/glossary-sync/suite.test.js +416 -0
  168. package/kits/knowledge/evals/hygiene-review/suite.test.js +396 -0
  169. package/kits/knowledge/evals/retirement/suite.test.js +145 -0
  170. package/kits/knowledge/flows/audit-freshness.flow.json +44 -0
  171. package/kits/knowledge/flows/canonicalize-category.flow.json +44 -0
  172. package/kits/knowledge/flows/detect-contradictions.flow.json +44 -0
  173. package/kits/knowledge/flows/glossary-sync.flow.json +61 -0
  174. package/kits/knowledge/flows/hygiene-review.flow.json +43 -0
  175. package/kits/knowledge/kit.json +51 -1
  176. package/package.json +13 -4
  177. package/packaging/conformance/README.md +10 -2
  178. package/packaging/conformance/fixtures/evidence-capture--allow-records-command.json +29 -0
  179. package/packaging/conformance/fixtures/stop-goal-fit--block-bundle-disputed-claim.json +29 -0
  180. package/packaging/conformance/fixtures/stop-goal-fit--block-capture-contradicts-claimed-pass.json +30 -0
  181. package/packaging/conformance/fixtures/stop-goal-fit--block-mode.json +23 -0
  182. package/packaging/conformance/fixtures/stop-goal-fit--off-mode.json +24 -0
  183. package/packaging/conformance/fixtures/stop-goal-fit--warn-active-delivery.json +5 -2
  184. package/packaging/conformance/fixtures/stop-goal-fit--warn-no-bundle.json +23 -0
  185. package/packaging/conformance/fixtures/workflow-steering--reground-active-prompt.json +30 -0
  186. package/packaging/conformance/fixtures/workflow-steering--reground-session-start.json +30 -0
  187. package/packaging/conformance/run-conformance.js +1 -1
  188. package/scripts/README.md +2 -1
  189. package/scripts/build-universal-bundles.js +0 -1
  190. package/scripts/ci/mint-attestation.js +221 -0
  191. package/scripts/ci/trust-reconcile.js +545 -0
  192. package/scripts/hooks/config-protection.js +423 -1
  193. package/scripts/hooks/evidence-capture.js +348 -0
  194. package/scripts/hooks/lib/liveness-read.js +113 -0
  195. package/scripts/hooks/run-hook.js +6 -1
  196. package/scripts/hooks/stop-goal-fit.js +1471 -79
  197. package/scripts/hooks/workflow-steering.js +135 -5
  198. package/scripts/install-codex-home.sh +39 -0
  199. package/scripts/install-merge.js +330 -0
  200. package/src/cli/init.ts +218 -20
  201. package/src/cli/validate-workflow-artifacts.ts +18 -2
  202. package/src/cli/verify.ts +100 -0
  203. package/src/cli/workflow-sidecar.ts +2093 -84
  204. package/src/cli.ts +2 -3
  205. package/src/index.ts +53 -0
  206. package/src/lib/flow-resolver.ts +284 -0
  207. package/src/tools/build-universal-bundles.ts +34 -21
  208. package/src/tools/generate-context-map.ts +3 -17
  209. package/src/tools/validate-source-tree.ts +44 -104
  210. package/tsconfig.json +1 -0
  211. package/build/src/tools/filter-installed-packs.js +0 -135
  212. package/packaging/packs.json +0 -49
  213. package/scripts/filter-installed-packs.js +0 -2
  214. package/src/tools/filter-installed-packs.ts +0 -132
@@ -57,11 +57,25 @@ Canonical hook scripts in `scripts/hooks/` use the following exit code contract
57
57
 
58
58
  Adapters translate these exit codes into the host-native response format. The `claude-hook-adapter.js` and `codex-hook-adapter.js` wrappers perform this translation, and all errors fail open so hook runtime failures never block agent work.
59
59
 
60
+ ### Block Reason Channel
61
+
62
+ A block (exit `2` → deny) is only useful if the agent learns *why* it was blocked and how to proceed. When a policy blocks, the hook script writes a human-readable reason — for example, config-protection's "Fix the source code … instead of weakening the config." The adapter **must surface that reason to the model** through the host's native deny-reason mechanism, **not only to a log or stderr**, where it dies before the agent sees it. A deny without a model-visible reason makes the agent retry the same blocked action instead of self-correcting.
63
+
64
+ | Host surface | Model-facing reason channel |
65
+ | --- | --- |
66
+ | Claude Code | `hookSpecificOutput.permissionDecisionReason` (preToolUse); `reason` (stop) |
67
+ | Codex | `hookSpecificOutput.permissionDecisionReason` (preToolUse); `reason` (stop) |
68
+ | opencode | the thrown error message on the blocked `tool.execute.before` (surfaced as the tool result) |
69
+ | pi | the `reason` field of the `{ block: true, reason }` tool-call result |
70
+ | Native pre-dispatch host (e.g. an orchestration layer) | the blocked call's tool-result text |
71
+
72
+ The reason text is the canonical steering message: it should tell the agent what to do *instead* (edit the source, not the generated artifact), so the agent can self-correct on the next turn. An adapter that denies the call but drops the reason to a log only is a **conformance gap** — record it in the adapter's conformance declaration.
73
+
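As a rough illustration of the reason channel, the sketch below maps a canonical hook result onto a Claude Code / Codex style response. The function name `toHostResponse` and the event strings are hypothetical, and the `hookEventName`, `permissionDecision`, and `decision` fields are assumptions about the host payload; only `permissionDecisionReason` and `reason` come from the table above. It also assumes the script's reason text has been captured as a string.

```javascript
// Sketch only: translate a canonical hook exit code plus its reason text
// into a host-native response. `toHostResponse` is a hypothetical helper.
function toHostResponse(event, exitCode, reasonText) {
  if (exitCode !== 2) {
    // Exit 0 (allow) and hook runtime errors (fail open) carry no deny payload.
    return {};
  }
  // Never deny silently: fall back to a generic reason if the script gave none.
  const reason = reasonText.trim() || 'Blocked by policy (no reason provided).';
  if (event === 'preToolUse') {
    return {
      hookSpecificOutput: {
        hookEventName: 'PreToolUse', // assumed host field
        permissionDecision: 'deny', // assumed host field
        permissionDecisionReason: reason, // model-facing channel from the table
      },
    };
  }
  if (event === 'stop') {
    return { decision: 'block', reason }; // `decision` is an assumed host field
  }
  return {};
}
```

The point of the sketch is that the reason travels in the response body the model reads, rather than being written to a log alone.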
60
74
  ---
61
75
 
62
76
  ## 2. Policy Classes
63
77
 
64
- Flow Agents currently ships four canonical policy classes. Each policy class has a canonical hook script under `scripts/hooks/` and may be wired to one or more canonical trigger events.
78
+ Flow Agents currently ships five canonical policy classes. Each policy class has a canonical hook script under `scripts/hooks/` and may be wired to one or more canonical trigger events.
65
79
 
66
80
  ### 2.1 Workflow Steering
67
81
 
@@ -69,14 +83,14 @@ Flow Agents currently ships four canonical policy classes. Each policy class has
69
83
 
70
84
  **Canonical script**: `scripts/hooks/workflow-steering.js`
71
85
 
72
- **Canonical trigger event**: `userPromptSubmit` (ambient state guidance), `postToolUse` (after `InvokeSubagents` tool calls)
86
+ **Canonical trigger event**: `userPromptSubmit` and `agentSpawn`/`SessionStart` (active-goal re-grounding), `postToolUse` (after `InvokeSubagents` tool calls)
73
87
 
74
88
  **Inputs consumed**:
75
89
  - `.flow-agents/<slug>/state.json` — current workflow phase and status
76
90
  - `.flow-agents/<slug>/critique.json` — open critique findings
77
91
  - `docs/context-map.md` — structure hint for repo navigation
78
92
 
79
- **Decision contract**: Non-blocking. Always exits 0. Appends steering text to the agent's context via `additionalContext` in the hook response. Does not block any action.
93
+ **Decision contract**: Non-blocking. Always exits 0. Appends steering text to the agent's context via `additionalContext` in the hook response. Does not block any action. It re-grounds the active workflow goal (status, phase, recorded next step) at the start of every user turn — not only for flagged/blocked states — and on `SessionStart`, which fires after context compaction and on resume. This is the mechanism that keeps an in-flight goal alive across context loss instead of relying on the model voluntarily re-reading the sidecar.
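A minimal sketch of the re-grounding step, assuming the parsed `state.json` exposes `status`, `phase`, and `next_action` fields. Those field names and the helper `buildSteeringResponse` are hypothetical; the source only says that status, phase, and the recorded next step are re-grounded through `additionalContext`.

```javascript
// Sketch only: build the non-blocking steering response from the sidecar state.
// Field names on `state` are assumptions made for illustration.
function buildSteeringResponse(state) {
  if (!state || typeof state !== 'object') {
    return {}; // no active workflow: stay silent, still exit 0
  }
  const lines = [
    'Active workflow goal (re-grounded from the sidecar):',
    `- status: ${state.status ?? 'unknown'}`,
    `- phase: ${state.phase ?? 'unknown'}`,
    `- next step: ${state.next_action ?? 'not recorded'}`,
  ];
  return { additionalContext: lines.join('\n') };
}
```

Because the response only adds context, the same function can serve `userPromptSubmit` and `SessionStart` without any blocking path.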
80
94
 
81
95
  **Degradation when host lacks trigger**: If the host has no `userPromptSubmit`-equivalent hook, workflow steering is silent. The agent receives no ambient phase reminders at turn start. This is a capability loss, not a blocking failure. Log the gap in the adapter's conformance declaration as `userPromptSubmit: no native equivalent — steering context injection unavailable`.
82
96
 
@@ -115,11 +129,32 @@ Flow Agents currently ships four canonical policy classes. Each policy class has
115
129
  - `.flow-agents/<slug>/state.json` — workflow phase and next action
116
130
  - `.flow-agents/<slug>/evidence.json` — verification verdict and NOT_VERIFIED gaps
117
131
  - `.flow-agents/<slug>/critique.json` — critique status and open findings
118
- - `FLOW_AGENTS_GOAL_FIT_STRICT` env var `true` to make blocking (exit 2) instead of warning-only (exit 0)
132
+ - `.flow-agents/<slug>/command-log.jsonl`, the deterministic capture log written by the Evidence Capture policy (see §2.5); cross-referenced against `evidence.json` claimed-pass command checks
133
+ - `.flow-agents/<slug>/acceptance.json` — acceptance criteria; a criterion's `command`-kind `evidence_ref` (`excerpt`) is the most-trusted backstop command
134
+ - `FLOW_AGENTS_GOAL_FIT_MODE` env var — `block` | `warn` | `off` (the legacy `FLOW_AGENTS_GOAL_FIT_STRICT=true` is an alias for `block`)
135
+ - `FLOW_AGENTS_GOAL_FIT_MAX_BLOCKS` env var — consecutive-identical-block cap before the escape hatch releases (default 3)
136
+ - `FLOW_AGENTS_GOAL_FIT_BACKSTOP` env var — `block` (default) | `off`/`warn` | `skip`; controls the capture backstop re-run (see Capture cross-reference below)
137
+ - `FLOW_AGENTS_GOAL_FIT_BACKSTOP_TIMEOUT_MS` env var — per-backstop-command timeout in ms (default 120000; runaway commands are SIGKILL'd)
138
+ - `FLOW_AGENTS_GOAL_FIT_RECHECK` env var — `true` opts into re-running the model's free-form `evidence.checks[].command` (the RCE-risky path; off by default)
119
139
 
120
140
  **Decision contract**:
121
- - Default mode: warning-only (exits 0). Writes guidance to stderr.
122
- - Strict mode (`FLOW_AGENTS_GOAL_FIT_STRICT=true`): blocking (exits 2) when the active workflow artifact has state, Definition Of Done, Goal Fit, or sidecar issues that classify as blocking.
141
+ - `warn` (canonical engine default): exits 0, writes guidance to stderr. Non-blocking.
142
+ - `block`: exits 2 when the active workflow artifact has state, Definition Of Done, Goal Fit, evidence, sidecar, or capture cross-reference issues that classify as blocking. Shipped L2 runtime configs (Claude Code, Codex) set `block` by default, overridable per-operator via the env var.
143
+ - `off`: silent (exits 0, no stderr).
144
+ - Escape hatch: in `block` mode the same goal-fit gap is refused up to `FLOW_AGENTS_GOAL_FIT_MAX_BLOCKS` (default 3) consecutive times, then released (exit 0 with a loud notice) so a genuinely-unsatisfiable goal cannot trap the agent. A changing gap resets the streak.
145
+
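The mode resolution and escape hatch described above can be sketched as follows. This is an illustration under stated assumptions, not the shipped logic: `resolveMode` and `decideStop` are hypothetical names, the sketch lets `FLOW_AGENTS_GOAL_FIT_MODE` win over the legacy alias when both are set, and it reads "refused up to N consecutive times, then released" as releasing on attempt N+1.

```javascript
// Sketch only: resolve the goal-fit mode from the environment.
function resolveMode(env) {
  const mode = String(env.FLOW_AGENTS_GOAL_FIT_MODE || '').toLowerCase();
  if (mode === 'block' || mode === 'warn' || mode === 'off') return mode;
  if (env.FLOW_AGENTS_GOAL_FIT_STRICT === 'true') return 'block'; // legacy alias
  return 'warn'; // canonical engine default
}

// Sketch only: decide the exit code for one stop attempt.
// `gapKey` is a hypothetical stable identifier for the current goal-fit gap;
// `streak` is the { key, count } carried over from the previous attempt.
function decideStop(mode, gapKey, streak, maxBlocks = 3) {
  if (!gapKey || mode !== 'block') {
    return { exitCode: 0, released: false, streak: null };
  }
  // An identical gap extends the streak; a changed gap starts over at 1.
  const count = streak && streak.key === gapKey ? streak.count + 1 : 1;
  if (count > maxBlocks) {
    // Escape hatch: release so an unsatisfiable goal cannot trap the agent.
    return { exitCode: 0, released: true, streak: null };
  }
  return { exitCode: 2, released: false, streak: { key: gapKey, count } };
}
```

Keying the streak on the gap itself is what lets a changing gap reset the counter.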
146
+ **Capture cross-reference (capture-first determinism)**: For each `evidence.checks[]` of `kind:"command"` claiming `status:"pass"` that carries a `command`, the gate cross-references the deterministic capture log (`command-log.jsonl`, §2.5) *before* trusting the model's claim:
147
+
148
+ 1. **Log shows the command ran and FAILED** → this is a caught false-completion → a blocking goal-fit gap (feeds the existing block/`MAX_BLOCKS` machinery).
149
+ 2. **Log shows the command ran and PASSED** → satisfied deterministically, with no re-run.
150
+ 3. **Log has NO execution for that claimed-pass command** (it was never actually run) → resolve a TRUSTED command to re-run as a thin backstop, in priority order:
151
+ - **(a) acceptance criterion** — the `command`-kind evidence ref of the matching `acceptance.json` criterion (authored upfront, most trusted).
152
+ - **(b) declared manifest target** — the project's own declared `package.json` `scripts.{test,build,lint}` (or `typecheck`), `Makefile` target, `cargo test`/`build`, `tox`/`pytest`, or `just`/`task` target. The NAMED declared target is run — never an arbitrary allowlisted string. (`veritas readiness` is just one such declared command — no special-casing.)
153
+ - **(c) model free-form command** — `evidence.checks[].command`, ONLY when `FLOW_AGENTS_GOAL_FIT_RECHECK=true` (opt-in; the RCE-risky path).
154
+
155
+ If the resolved backstop re-run fails, it is a caught false-completion. If NO trusted command resolves, the gate records `NOT_VERIFIED` — never a guess, never a silent pass, never auto-running an unlisted string.
156
+
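The three-way cross-reference and the backstop priority order above can be sketched as a pure function. All names here (`classifyClaim`, `resolveTrustedCommand`, the `sources` object, the verdict strings) are hypothetical. The sketch also assumes exact string matching between the claimed command and the logged command, a `fail` value for failed runs, and that the latest matching log entry wins; the source does not specify any of these.

```javascript
// Sketch only: pick the trusted backstop command in priority order (a) > (b) > (c).
function resolveTrustedCommand(check, sources, env) {
  if (sources.acceptanceCommand) {
    return { command: sources.acceptanceCommand, source: 'acceptance' };
  }
  if (sources.manifestTarget) {
    return { command: sources.manifestTarget, source: 'manifest' };
  }
  if (env.FLOW_AGENTS_GOAL_FIT_RECHECK === 'true' && check.command) {
    return { command: check.command, source: 'model' }; // opt-in, RCE-risky path
  }
  return null;
}

// Sketch only: classify one evidence check against the capture log entries.
function classifyClaim(check, captureLog, sources, env) {
  if (check.kind !== 'command' || check.status !== 'pass' || !check.command) {
    return { verdict: 'not-applicable' };
  }
  const runs = captureLog.filter(
    (e) => e.command === check.command &&
      (e.observedResult === 'pass' || e.observedResult === 'fail'),
  );
  if (runs.length > 0) {
    const latest = runs[runs.length - 1]; // assumption: latest entry wins
    return latest.observedResult === 'pass'
      ? { verdict: 'satisfied' }
      : { verdict: 'caught-false-completion' };
  }
  const trusted = resolveTrustedCommand(check, sources, env);
  return trusted
    ? { verdict: 'needs-backstop', backstop: trusted }
    : { verdict: 'NOT_VERIFIED' };
}
```

Note that the model's own command is only ever returned when the operator has opted in, so an unlisted string is never run by default.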
157
+ **Backstop guardrails**: each backstop command runs under a per-command timeout (`FLOW_AGENTS_GOAL_FIT_BACKSTOP_TIMEOUT_MS`, default 120s; runaway commands are killed). The trusted-source backstop (a/b) rides `block` mode by default but is operator-disablable for latency: `FLOW_AGENTS_GOAL_FIT_BACKSTOP=off` (re-run becomes warn-only, never blocks) or `=skip` (no re-run at all → record `NOT_VERIFIED`). The arbitrary-model-command backstop (c) is opt-in only via `FLOW_AGENTS_GOAL_FIT_RECHECK`.
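A hedged sketch of these guardrails using Node's `child_process.spawnSync`, which supports `timeout` and `killSignal` options. The helper names `backstopPolicy` and `runBackstop` are hypothetical, and treating an unrecognized `FLOW_AGENTS_GOAL_FIT_BACKSTOP` value as `block` is an assumption of this sketch.

```javascript
const { spawnSync } = require('node:child_process');

// Sketch only: interpret the backstop env vars described above.
function backstopPolicy(env) {
  const known = ['block', 'off', 'warn', 'skip'];
  const raw = String(env.FLOW_AGENTS_GOAL_FIT_BACKSTOP || 'block').toLowerCase();
  const mode = known.includes(raw) ? raw : 'block'; // assumption: unknown -> block
  const parsed = Number.parseInt(env.FLOW_AGENTS_GOAL_FIT_BACKSTOP_TIMEOUT_MS, 10);
  return {
    rerun: mode !== 'skip', // skip: no re-run, record NOT_VERIFIED
    blocking: mode === 'block', // off/warn: re-run is warn-only
    timeoutMs: Number.isInteger(parsed) && parsed > 0 ? parsed : 120000,
  };
}

// Sketch only: run one trusted backstop command under a hard timeout.
function runBackstop(command, timeoutMs) {
  const result = spawnSync(command, {
    shell: true,
    timeout: timeoutMs,
    killSignal: 'SIGKILL', // runaway commands are killed, not waited on
    stdio: 'ignore',
  });
  const timedOut = Boolean(result.error && result.error.code === 'ETIMEDOUT');
  return {
    passed: !timedOut && result.status === 0,
    timedOut,
    exitCode: result.status,
  };
}
```

A caller would consult `backstopPolicy(process.env)` first and only invoke `runBackstop` when `rerun` is true.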
123
158
 
124
159
  **Degradation when host lacks trigger**: If the host has no stop hook, stop-goal-fit cannot fire. The agent may complete without the check. Log the gap as `stop: no native equivalent — stop-goal-fit policy unavailable`.
125
160
 
@@ -136,10 +171,35 @@ Flow Agents currently ships four canonical policy classes. Each policy class has
136
171
  - `SA_HOOK_INPUT_TRUNCATED` env var — whether input was truncated (truncated payloads are blocked unconditionally)
137
172
  - Protected file set: `.eslintrc*`, `eslint.config.*`, `.prettierrc*`, `prettier.config.*`, `biome.json`, `biome.jsonc`, `.ruff.toml`, `ruff.toml`, `.shellcheckrc`, `.stylelintrc*`, `.markdownlint*`
138
173
 
139
- **Decision contract**: Blocking (exits 2) when the target file basename is in the protected set. Writes a descriptive message to stderr directing the agent to fix source instead. Exits 0 (allow) otherwise.
174
+ **Decision contract**: Blocking (exits 2) when the target file basename is in the protected set. Writes a descriptive message directing the agent to fix source instead, which the adapter surfaces to the model as the deny reason (see [Block Reason Channel](#block-reason-channel)). Exits 0 (allow) otherwise.
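The basename check can be sketched as below, using the protected file set listed above. The helper names and the wording of the reason message are hypothetical, and the `truncated` flag is passed in as a boolean because the encoding of `SA_HOOK_INPUT_TRUNCATED` is not stated in the source.

```javascript
const path = require('node:path');

// Protected file set, copied from the list above.
const PROTECTED_PATTERNS = [
  '.eslintrc*', 'eslint.config.*', '.prettierrc*', 'prettier.config.*',
  'biome.json', 'biome.jsonc', '.ruff.toml', 'ruff.toml',
  '.shellcheckrc', '.stylelintrc*', '.markdownlint*',
];

// Convert a simple `*` glob into an anchored regular expression.
function patternToRegExp(pattern) {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, '\\$&') // escape regex metacharacters except `*`
    .replace(/\*/g, '.*');
  return new RegExp(`^${escaped}$`);
}

const PROTECTED_REGEXPS = PROTECTED_PATTERNS.map(patternToRegExp);

function isProtected(filePath) {
  const base = path.basename(filePath);
  return PROTECTED_REGEXPS.some((re) => re.test(base));
}

// Sketch only: exit 2 with a steering reason, or exit 0 to allow.
function decideConfigProtection(filePath, truncated) {
  if (truncated) {
    return { exitCode: 2, reason: 'Hook input was truncated; refusing to evaluate it.' };
  }
  if (isProtected(filePath)) {
    return {
      exitCode: 2,
      reason: `Do not edit ${path.basename(filePath)}; fix the source code instead.`,
    };
  }
  return { exitCode: 0 };
}
```

Matching on the basename means the directory a config file lives in does not affect the decision.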
140
175
 
141
176
  **Degradation when host lacks trigger**: If the host has no `preToolUse`-equivalent blocking hook, config protection cannot veto tool calls. The agent may modify linter configs without interception. Log the gap as `preToolUse: no native blocking equivalent — config-protection policy unavailable`.
142
177
 
178
+ ### 2.5 Evidence Capture (capture-first determinism)
179
+
180
+ **Intent**: Make evidence about what actually ran *machine-recorded at the source* rather than transcribed later by the model. `evidence.json` is the model's narration and can claim a test passed when it did not. The capture policy deterministically records every command/shell tool execution and its observed result to an append-only log, which the Stop-Goal-Fit gate (§2.3) cross-references against the model's claims. This makes re-running at the gate a thin backstop, not the primary check.
181
+
182
+ **Canonical script**: `scripts/hooks/evidence-capture.js`
183
+
184
+ **Canonical trigger event**: `postToolUse` (after command/shell tool calls)
185
+
186
+ **Inputs consumed**:
187
+ - `tool_name` + `tool_input.command` — identifies a command/shell execution (a command string present, with a command-shaped tool name; when no tool name is present but a command string is, it is still captured).
188
+ - `tool_response` / `tool_output` / `error` — the host tool result (per §1, `postToolUse`); the source of the deterministically-observed outcome.
189
+ - `.flow-agents/current.json` (`active_slug` / `artifact_dir`) then newest-mtime `state.json` — resolves the active artifact dir, the same way Workflow Steering and Stop-Goal-Fit do.
190
+
191
+ **Output**: appends one JSON object per line to `.flow-agents/<slug>/command-log.jsonl`:
192
+
193
+ ```json
194
+ { "command": "npm test", "observedResult": "pass", "exitCode": 0, "capturedAt": "2026-06-23T00:00:00Z", "source": "postToolUse-capture" }
195
+ ```
196
+
197
+ **Exit-code handling (deterministic observation only)**: a clean integer exit code is host-dependent. The policy extracts the real exit code where the host surfaces one (`tool_response`/`tool_output` `.exitCode`/`.exit_code`/`.status`/`.code`/`.returnCode`, or top-level equivalents) and sets `observedResult` to `pass` iff that code is `0`. When no clean integer exit code is present, `exitCode` is recorded as `null` and `observedResult` is inferred *only* from deterministic failure signals — a non-empty `error`, a `success:false`/`failed:true`/`is_error:true` flag, or a non-empty stderr with no stdout. Plain stdout text is never scanned for the words "error"/"fail"; the model's narration is never consulted.
198
+
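The observation rules above can be sketched roughly as follows. The function names are hypothetical. Two points are assumptions of this sketch rather than documented behavior: failed runs are recorded as `fail`, and when neither a clean exit code nor a failure signal is present the result is recorded as `unknown`. The sketch also looks for failure flags both at the top level and inside `tool_response` / `tool_output`, which the source does not pin down.

```javascript
const EXIT_KEYS = ['exitCode', 'exit_code', 'status', 'code', 'returnCode'];

function candidateContainers(payload) {
  return [payload.tool_response, payload.tool_output, payload]
    .filter((c) => c && typeof c === 'object');
}

// Return the first clean integer exit code the host surfaced, else null.
function findExitCode(payload) {
  for (const container of candidateContainers(payload)) {
    for (const key of EXIT_KEYS) {
      if (Number.isInteger(container[key])) return container[key];
    }
  }
  return null;
}

const nonEmpty = (v) => typeof v === 'string' && v.trim() !== '';

// Deterministic failure signals only; stdout text is never scanned for keywords.
function hasFailureSignal(payload) {
  return candidateContainers(payload).some((c) => {
    const hasError = typeof c.error === 'string' ? nonEmpty(c.error) : Boolean(c.error);
    const flagged = c.success === false || c.failed === true || c.is_error === true;
    const stderrOnly = nonEmpty(c.stderr) && !nonEmpty(c.stdout);
    return hasError || flagged || stderrOnly;
  });
}

function observe(payload) {
  const exitCode = findExitCode(payload);
  if (exitCode !== null) {
    return { exitCode, observedResult: exitCode === 0 ? 'pass' : 'fail' };
  }
  // No clean exit code: infer from failure signals, otherwise stay undecided.
  return { exitCode: null, observedResult: hasFailureSignal(payload) ? 'fail' : 'unknown' };
}
```

A non-integer `status` such as a string label is skipped, so it is never mistaken for an exit code.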
199
+ **Decision contract**: Non-blocking. Always exits 0 and echoes stdin. Idempotent/append-only. Fail-open on any error — a capture failure must never block the agent or corrupt the log. Only records when an active workflow artifact dir resolves (otherwise there is nothing to anchor the log to).
200
+
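A minimal sketch of the append-only, fail-open write, assuming the active artifact directory has already been resolved. `appendCapture` is a hypothetical helper; the entry fields mirror the example log line shown earlier in this section.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Sketch only: append one capture entry; never throw, never block.
function appendCapture(artifactDir, command, observed, now = new Date()) {
  if (!artifactDir) {
    return false; // no active workflow artifact dir: nothing to anchor the log to
  }
  try {
    const entry = {
      command,
      observedResult: observed.observedResult,
      exitCode: observed.exitCode,
      capturedAt: now.toISOString(),
      source: 'postToolUse-capture',
    };
    // One JSON object per line; appendFileSync never rewrites earlier entries.
    fs.appendFileSync(
      path.join(artifactDir, 'command-log.jsonl'),
      `${JSON.stringify(entry)}\n`,
    );
    return true;
  } catch {
    return false; // fail open: a capture failure must not block the agent
  }
}
```

The boolean return value is only for the caller's own diagnostics; the hook itself would still exit 0 either way.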
201
+ **Degradation when host lacks trigger**: If the host has no `postToolUse` hook, command results are not captured. The Stop gate then has no capture log to cross-reference and falls back to its trusted backstop re-run (§2.3) for claimed-pass command checks. Log the gap as `postToolUse: no native equivalent — evidence capture unavailable; Stop gate relies on backstop re-run only`.
202
+
143
203
  ---
144
204
 
145
205
  ## 3. Hook Profiles
@@ -190,8 +250,10 @@ The adapter implements L1 plus all blocking policy classes.
190
250
  **Required**:
191
251
  - L1 steering and stop telemetry.
192
252
  - Config protection fires on `preToolUse` and can block (exit 2 translates to a deny response).
253
+ - Every block surfaces its reason to the model through the host's deny-reason channel (see [Block Reason Channel](#block-reason-channel)), not only to a log.
193
254
  - Quality gate fires on `postToolUse`.
194
- - Stop-goal-fit fires on `stop` with `FLOW_AGENTS_GOAL_FIT_STRICT` configurable (default may be warning mode; strict mode must be possible to enable).
255
+ - Stop-goal-fit fires on `stop` with `FLOW_AGENTS_GOAL_FIT_MODE` configurable. Shipped L2 configs default to `block`; the canonical engine default remains `warn`, and any mode must be operator-overridable.
256
+ - Workflow steering additionally re-grounds the active goal on `agentSpawn`/`SessionStart` so an in-flight goal survives context compaction and resume.
195
257
 
196
258
  **Permitted gaps**: None. All four policy classes are wired. Any missing host trigger must be documented as a named gap in the adapter's conformance declaration.
197
259
 
@@ -439,8 +501,9 @@ For structured `run()` responses (native import form), the return value is:
439
501
  |-------------|-------------|--------------------|--------------------|
440
502
  | config-protection | Fail-closed (exit 2 on protected file) | Yes — hook runtime errors exit 0 | Yes (preToolUse) |
441
503
  | quality-gate | Fail-open (exit 0 always) | Yes | No |
442
- | stop-goal-fit | Fail-open by default; fail-closed with `FLOW_AGENTS_GOAL_FIT_STRICT=true` | Yes — hook runtime errors exit 0 | Yes (stop, strict mode only) |
504
+ | stop-goal-fit | Engine default warn (fail-open); blocks in `FLOW_AGENTS_GOAL_FIT_MODE=block` (shipped L2 default) | Yes — hook runtime errors exit 0 | Yes (stop, block mode) |
443
505
  | workflow-steering | Fail-open (exit 0 always) | Yes | No |
506
+ | evidence-capture | Fail-open (exit 0 always) | Yes — capture errors never block or corrupt the log | No |
444
507
 
445
508
  **Telemetry**: Always fail-open. Hook runtime errors in telemetry scripts must never block agent work.
446
509
 
@@ -456,7 +519,12 @@ For structured `run()` responses (native import form), the return value is:
456
519
  | `SA_HOOK_INPUT_MAX_BYTES` | Integer string | `config-protection.js` |
457
520
  | `SA_QUALITY_GATE_FIX` | `true` / `false` | `quality-gate.js` |
458
521
  | `SA_QUALITY_GATE_STRICT` | `true` / `false` | `quality-gate.js` |
459
- | `FLOW_AGENTS_GOAL_FIT_STRICT` | `true` / `false` | `stop-goal-fit.js` |
522
+ | `FLOW_AGENTS_GOAL_FIT_MODE` | `block` / `warn` / `off` | `stop-goal-fit.js` |
523
+ | `FLOW_AGENTS_GOAL_FIT_MAX_BLOCKS` | Integer string (default 3) | `stop-goal-fit.js` |
524
+ | `FLOW_AGENTS_GOAL_FIT_STRICT` | `true` / `false` (legacy alias for mode=block) | `stop-goal-fit.js` |
525
+ | `FLOW_AGENTS_GOAL_FIT_BACKSTOP` | `block` (default) / `off` (=`warn`) / `skip` | `stop-goal-fit.js` |
526
+ | `FLOW_AGENTS_GOAL_FIT_BACKSTOP_TIMEOUT_MS` | Integer string (default 120000) | `stop-goal-fit.js` |
527
+ | `FLOW_AGENTS_GOAL_FIT_RECHECK` | `true` / `false` (opt-in re-run of model free-form command) | `stop-goal-fit.js` |
460
528
  | `FLOW_AGENTS_REQUIRE_SIDECARS` | `true` / `false` | `stop-goal-fit.js` |
461
529
  | `FLOW_AGENTS_REQUIRE_CRITIQUE` | `true` / `false` | `stop-goal-fit.js` |
462
530
  | `FLOW_AGENTS_HOOK_RUNTIME` | `claude-code`, `codex`, etc. | Hook adapters (forwarded to scripts) |
@@ -57,8 +57,8 @@ Flow Agents may need local schemas for reliability glue that existing standards
57
57
  | Critique record | Reviewer passes, findings, severity, and resolution state for critique loops | `.flow-agents/<slug>/critique.json` | Draft schema: `schemas/workflow-critique.schema.json` |
58
58
  | Release readiness | Merge, release, deploy, hold, rollback, docs, and operational readiness decisions | `.flow-agents/<slug>/release.json` | Draft schema: `schemas/workflow-release.schema.json` |
59
59
  | Learning record | Repeated failure, correction, pattern, and recommended system update | `.flow-agents/<slug>/learning.json` or `.telemetry/outcomes.jsonl` | Draft schema: `schemas/workflow-learning.schema.json` |
60
- | Context map | Compact project map: structure, commands, conventions, test strategy, packs, and recent state | Generated under `.flow-agents/` or configurable cache | Planned |
61
- | Pack manifest | Core and optional pack composition for a target install | `packaging/packs.json` plus generated export catalog metadata | Draft manifest: `packaging/packs.json` |
60
+ | Context map | Compact project map: structure, commands, conventions, test strategy, Kits, and recent state | Generated under `.flow-agents/` or configurable cache | Planned |
61
+ | Kit Catalog | Product-facing catalog of Flow Kits and their activation, layered over the always-installed standalone base | `kits/catalog.json` plus generated export catalog metadata | Catalog: `kits/catalog.json` |
62
62
  | Governance adapter | Optional bridge from Flow Agents evidence gates to tools such as Veritas | `context/contracts/governance-adapter-contract.md` | Draft contract |
63
63
 
64
64
  These formats should be treated as contracts once introduced. Breaking changes require schema version bumps and migration notes.
@@ -93,4 +93,4 @@ Before merging a new schema, file format, or artifact:
93
93
  - Is the new format schema-described?
94
94
  - Is there a human-readable representation?
95
95
  - Can another tool consume or export it?
96
- - Does this belong in core or an optional pack?
96
+ - Does this belong in the standalone base or an opinionated Flow Kit?
@@ -289,7 +289,7 @@ Flow Agents does not own trust claim models, inquiry semantics, or extractor imp
289
289
 
290
290
  - Do not make `@kontourai/survey` a mandatory dependency of flow-agents.
291
291
  - Do not copy Survey's extraction or inquiry schemas into flow-agents.
292
- - Do not auto-register the hook in the default pack; it is opt-in only.
292
+ - Do not auto-register the hook in the standalone base; it is opt-in only.
293
293
  - Do not make the hook blocking without explicit `mode: "strict"` or the env override.
294
294
  - Do not silently decide anything. The hook injects guidance; the agent decides next steps.
295
295
 
@@ -0,0 +1,197 @@
1
+ ---
2
+ title: "Trust Anchor Adoption — Add the CI Trust Anchor to Your Repo"
3
+ ---
4
+
5
+ # Trust Anchor Adoption
6
+
7
+ This guide explains how to add the Flow Agents CI trust anchor to any repository that
8
+ uses Flow Agents. The anchor is a required CI job that re-runs your canonical
9
+ verification fresh in a clean environment and reconciles the agent's claimed passes
10
+ against the real CI results. It is the external, un-disablable check that closes the
11
+ loop on agent self-reporting.
12
+
13
+ See [ADR 0017](adr/0017-anti-gaming-trust-security-model.md) for the full security
14
+ model and threat analysis.
15
+
16
+ ## What the Trust Anchor Does
17
+
18
+ 1. **Re-runs verification fresh.** In a clean CI environment the agent does not
19
+ control, it runs your declared verify command (build + tests + lint). Real exit
20
+ codes. No agent influence.
21
+
22
+ 2. **Reconciles the delivered bundle.** If the agent published a `delivery/trust.bundle`
23
+ with the PR, the anchor cross-checks every claimed-pass command against CI's own
24
+ fresh results. Divergences (claimed pass + CI fail, laundered command, claim with
25
+ no evidence, checkpoint-only bundle) fail the job with a clear diagnostic.
26
+
27
+ 3. **Fails closed on compile-only.** If no comprehensive verify command is configured,
28
+ the anchor refuses to pass — preventing a "build only" attestation that misses tests.
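
The reconciliation in item 2 can be pictured with a minimal sketch. This is not the anchor's actual implementation: the claim shape (`command`, `status`, `evidence`), the `freshResults` map, and the `"command not re-run by CI"` reason are all hypothetical, chosen only to illustrate cross-checking claimed passes against CI's own exit codes.

```javascript
// Sketch only. Assumed (hypothetical) shapes:
//   claims       = [{ command, status, evidence }]  as carried by the bundle
//   freshResults = { [command]: exitCode }          from CI's own fresh re-run
function reconcile(claims, freshResults) {
  const divergences = [];
  for (const claim of claims) {
    // Only claimed passes are cross-checked in this sketch.
    if (claim.status !== "pass") continue;

    // A claimed pass with nothing backing it is itself a divergence.
    if (!Array.isArray(claim.evidence) || claim.evidence.length === 0) {
      divergences.push({ command: claim.command, reason: "claim with no evidence" });
      continue;
    }

    const wasRun = Object.prototype.hasOwnProperty.call(freshResults, claim.command);
    if (!wasRun) {
      // Hypothetical reason: CI has no fresh result to compare against.
      divergences.push({ command: claim.command, reason: "command not re-run by CI" });
    } else if (freshResults[claim.command] !== 0) {
      divergences.push({ command: claim.command, reason: "claimed pass + CI fail" });
    }
  }
  return divergences;
}

const found = reconcile(
  [{ command: "npm test", status: "pass", evidence: ["captured-log"] }],
  { "npm test": 1 }
);
// Any divergence should fail the job (non-zero exit).
process.exitCode = found.length > 0 ? 1 : 0;
```

Returning a list of divergences rather than a boolean lets the job print one diagnostic per mismatched command before it fails.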
29
+
30
+ ## Step 1 — The Agent Publishes a Bundle
31
+
32
+ Flow Agents' deliver skill calls `publishDelivery`, which writes `delivery/trust.bundle`
33
+ to the repository with `git add -f` during the `record-release` step. This file carries
34
+ the session's evidence and claims to CI so the anchor can reconcile them.
35
+
36
+ You do not need to configure this — it is part of the deliver skill workflow. The bundle
37
+ is gitignored by default (the deliver skill force-adds it for the PR commit only).
38
+
39
+ ## Step 2 — Add the Composite Action
40
+
41
+ In your repo, create or update a CI workflow file (e.g.
42
+ `.github/workflows/trust-verify.yml`):
43
+
44
+ ```yaml
45
+ name: Trust Verify
46
+
47
+ on:
48
+ pull_request:
49
+ push:
50
+ branches: ["main"]
51
+ workflow_dispatch:
52
+
53
+ permissions:
54
+ contents: read
55
+
56
+ concurrency:
57
+ group: trust-verify-${{ github.ref }}
58
+ cancel-in-progress: true
59
+
60
+ jobs:
61
+ trust-verify:
62
+ name: Trust Verify
63
+ runs-on: ubuntu-latest
64
+ timeout-minutes: 15
65
+ # Add id-token: write here if you enable sign: true (Sigstore attestation).
66
+ permissions:
67
+ contents: read
68
+
69
+ steps:
70
+ - name: Checkout
71
+ uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
72
+
73
+ - uses: kontourai/flow-agents/.github/actions/trust-verify@<SHA>
74
+ with:
75
+ # Declare your comprehensive verify command: build + tests + lint.
76
+ # The agent must run this same command locally (via trust-reconcile-verify).
77
+ verify-command: "npm run build && npm test && npm run lint"
78
+ # bundle: defaults to delivery/trust.bundle (auto-discovered if present)
79
+ # sign: false (set to true + add id-token: write for Sigstore attestation)
80
+ ```
81
+
82
+ Replace `<SHA>` with the pinned commit SHA of the `kontourai/flow-agents` release you
83
+ are adopting. Pin to a SHA (not a tag) for supply-chain security.
84
+
85
+ **To find the SHA**: look at the
86
+ [flow-agents releases](https://github.com/kontourai/flow-agents/releases) or pin to
87
+ the commit SHA of `main` HEAD after reviewing the CHANGELOG.
88
+
89
+ ## Step 3 — Arm It as a Required Status Check
90
+
91
+ The action reports results but is advisory until you arm it server-side:
92
+
93
+ 1. Go to **Settings → Branches** in your GitHub repository.
94
+ 2. Edit (or create) the branch protection rule for `main`.
95
+ 3. Under **Require status checks to pass before merging**, add **`Trust Verify`**.
96
+ 4. Check **Require branches to be up to date before merging**.
97
+ 5. Enable **Do not allow bypassing the above settings** (the "enforce admins" option).
98
+
99
+ Once armed, no PR can merge past a `Trust Verify` failure — including ones pushed by
100
+ the agent.
101
+
102
+ ## Step 4 — Protect the Verify Config
103
+
104
+ CODEOWNERS prevents the agent from quietly weakening the verify command. Add entries
105
+ for the files that declare what CI runs:
106
+
107
+ ```
108
+ # Trust anchor config — requires owner review.
109
+ # An agent cannot weaken verify-command without a human approving the change.
110
+ .github/workflows/trust-verify.yml @your-org/owners
111
+ package.json @your-org/owners
112
+ ```
113
+
114
+ Adjust paths and team names for your repo structure.
115
+
116
+ ## Configuring the Verify Command
117
+
118
+ The anchor fails closed if it cannot find a comprehensive verify command. Provide it
119
+ one of three ways (in priority order):
120
+
121
+ 1. **Action input** `verify-command` (recommended for the composite action).
122
+ 2. **`TRUST_RECONCILE_COMMANDS` environment variable** (comma- or newline-separated).
123
+ 3. **`package.json` `scripts["trust-reconcile-verify"]`** — the anchor auto-discovers
124
+ this key. Add it to your `package.json`:
125
+
126
+ ```json
127
+ {
128
+ "scripts": {
129
+ "trust-reconcile-verify": "npm run build && npm test && npm run lint"
130
+ }
131
+ }
132
+ ```
133
+
134
+ Then you can also run it locally:
135
+ ```
136
+ npx @kontourai/flow-agents verify
137
+ ```
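
The priority order above can be sketched as a small resolver. This is an illustration under assumptions, not the anchor's code: the function name `resolveVerifyCommands` and its argument shape are hypothetical, and returning the script body for the `package.json` case is a guess at how the discovered script is used.

```javascript
// Sketch only: resolveVerifyCommands and its argument shape are hypothetical.
function resolveVerifyCommands({ actionInput, env = {}, packageJson = {} }) {
  // 1. Action input wins when present.
  if (typeof actionInput === "string" && actionInput.trim()) {
    return [actionInput.trim()];
  }

  // 2. TRUST_RECONCILE_COMMANDS: comma- or newline-separated list.
  const fromEnv = String(env.TRUST_RECONCILE_COMMANDS || "")
    .split(/[,\n]/)
    .map((part) => part.trim())
    .filter(Boolean);
  if (fromEnv.length > 0) return fromEnv;

  // 3. package.json scripts["trust-reconcile-verify"] (auto-discovered).
  const script = packageJson.scripts && packageJson.scripts["trust-reconcile-verify"];
  if (typeof script === "string" && script.trim()) {
    return [script.trim()]; // assumption: the script body is what gets run
  }

  // Fail closed: nothing comprehensive was configured.
  throw new Error("no comprehensive trust-reconcile-verify configured");
}

const commands = resolveVerifyCommands({
  env: { TRUST_RECONCILE_COMMANDS: "npm run build,npm test" },
});
console.log(commands);
```

Throwing when all three sources are empty mirrors the fail-closed behavior: the absence of a verify command is treated as an error rather than a pass.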
138
+
139
+ ## Local Use
140
+
141
+ The `flow-agents verify` CLI subcommand runs the same trust-reconcile logic locally:
142
+
143
+ ```bash
144
+ # Install (or npx):
145
+ npm install -D @kontourai/flow-agents
146
+
147
+ # Re-run verify + reconcile against a delivered bundle:
148
+ npx @kontourai/flow-agents verify \
149
+ --commands "npm run build,npm test" \
150
+ --bundle delivery/trust.bundle
151
+
152
+ # Auto-discover bundle + verify command from package.json:
153
+ npx @kontourai/flow-agents verify
154
+
155
+ # Help:
156
+ npx @kontourai/flow-agents verify --help
157
+ ```
158
+
159
+ Exit codes: 0 = clean (fresh verify passed, no divergence); 1 = failed/divergence.
160
+
161
+ ## Mirror: Flow Agents' Own Setup
162
+
163
+ Flow Agents uses the same pattern in its own repository:
164
+
165
+ - **`scripts/ci/trust-reconcile.js`** — the anchor script (runs in
166
+ `.github/workflows/trust-reconcile.yml`).
167
+ - **`package.json` `trust-reconcile-verify`** — `npm run build && npm run eval:static`.
168
+ - **`evals/ci/antigaming-suite.sh`** — the regression suite that proves the gate and
169
+ anchor work; runs in the required `ci.yml` lane.
170
+ - **Branch protection** on `main` — `Trust Reconcile` required, `enforce_admins` on.
171
+
172
+ ## Adoption Checklist
173
+
174
+ - [ ] Deliver skill is configured and publishes `delivery/trust.bundle`.
175
+ - [ ] `.github/workflows/trust-verify.yml` added and the composite action is pinned.
176
+ - [ ] `verify-command` declares a comprehensive verify (build + tests + lint).
177
+ - [ ] `Trust Verify` added as a required, no-bypass status check on `main`.
178
+ - [ ] CODEOWNERS entry protects `trust-verify.yml` and `package.json`.
179
+ - [ ] (Optional) `scripts["trust-reconcile-verify"]` in `package.json` for local use.
180
+ - [ ] (Optional) `sign: true` + `id-token: write` for Sigstore attestation.
181
+
182
+ ## Troubleshooting
183
+
184
+ **"no comprehensive trust-reconcile-verify configured"**: Provide `verify-command` in
185
+ the action input, set `TRUST_RECONCILE_COMMANDS`, or add `scripts["trust-reconcile-verify"]`
186
+ to `package.json`. The anchor refuses to attest a compile-only check.
187
+
188
+ **"trust divergence: agent claimed X passed; CI fresh run = FAIL"**: The agent's
189
+ local environment or shell profile produced a false pass. The anchor correctly flagged
190
+ the mismatch. Fix the underlying test failure.
191
+
192
+ **"trust divergence: command contains exit-code-laundering operator"**: A claimed
193
+ command used `||`, `; true`, or `; exit 0`. These mask real exit codes. Remove them.
194
+
195
+ **"checkpoint-only bundle cannot be reconciled per-command"**: A `delivery/trust.bundle`
196
+ was expected but only `delivery/trust.checkpoint.json` was found. The deliver skill
197
+ publishes the full bundle; ensure it ran correctly.
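
For the exit-code-laundering diagnostic above, a rough sketch of how such operators might be spotted follows. The function name `findLaunderingOperators` and the regular expressions are assumptions for illustration; the anchor's real detection may be stricter or differently scoped. Only the three operators named in this guide are covered.

```javascript
// Sketch only: hypothetical patterns for the three operators named above.
const LAUNDERING_PATTERNS = [
  { name: "||", re: /\|\|/ },
  { name: "; true", re: /;\s*true\b/ },
  { name: "; exit 0", re: /;\s*exit\s+0\b/ },
];

// Returns the names of every laundering operator found in a claimed command.
function findLaunderingOperators(command) {
  return LAUNDERING_PATTERNS
    .filter((pattern) => pattern.re.test(command))
    .map((pattern) => pattern.name);
}

console.log(findLaunderingOperators("npm test || true"));
console.log(findLaunderingOperators("npm run build && npm test"));
```

A plain text match like this one would also flag `||` inside a quoted argument, so it errs toward rejecting a command rather than letting a masked exit code through.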
@@ -0,0 +1,95 @@
1
+ # Verifiable Trust — why "done" actually means done
2
+
3
+ > **The problem with autonomous coding agents: they grade their own homework.**
4
+ > An agent writes the code, runs the tests, and reports "all green, shipped." If it's
5
+ > wrong — or if it learns it can just *say* the tests passed — you find out in production.
6
+ > Flow Agents is built so an agent **can't** mark work complete that isn't.
7
+
8
+ ## The one-line pitch
9
+
10
+ Flow Agents treats "the work is done" as a **claim that must be proven**, not a status the
11
+ agent gets to assert. Completion is gated by **evidence the system re-derives itself**, and
12
+ the authoritative check runs in **CI — an environment the agent can't disable or fake** —
13
+ with **cryptographically signed provenance** of exactly what was verified.
14
+
15
+ Most agent frameworks trust the model's self-report. Flow Agents doesn't trust the agent,
16
+ its claims, *or* its environment.
17
+
18
+ ## What that buys you
19
+
20
+ - **"Done" you can rely on.** A finished task ends with real evidence — tests, build, lint,
21
+ review findings, captured command results — and the gate *re-derives* the verdict from
22
+ that evidence. A claimed pass that contradicts a captured failure is **blocked**, not shipped.
23
+ - **Anti-gaming by design.** The gate independently captures real command results and
24
+ reconciles them against the agent's claims — namespace-agnostic, and independent of any
25
+ status the agent self-declares. Tricks like `npm test || true` (laundering the exit code)
26
+ are rejected.
27
+ - **An external anchor that can't be switched off.** On every pull request, CI re-runs the
28
+ verification *fresh* in a clean environment and **fails the merge on any divergence**
29
+ between what the agent claimed and what CI actually observed. The agent can tamper with its
30
+ own machine all it likes; it can't reach into CI.
31
+ - **Signed provenance.** CI mints a Sigstore-signed attestation over its *own* results. The
32
+ agent has no signing identity, so a fabricated "green" can't be signed — you get a
33
+ tamper-evident, externally-verifiable record of what shipped.
34
+ - **The gate can't be silently weakened.** The anti-gaming test suite runs as a **required
35
+ CI check**, and the gate/CI/verify config require **code-owner review** — so a change that
36
+ guts the protections can't merge.
37
+
38
+ ## Who it's for
39
+
40
+ - **Solo builders and teams shipping agent-written code.** Run the agent, and trust that
41
+ what it marks "done" is verified — not just asserted. Less re-checking, fewer "it said it
42
+ passed but it didn't" surprises.
43
+ - **Unattended / AFK and overnight agents.** When you're not watching, the gate is. An agent
44
+ running autonomously can't quietly ship broken work past a green self-report.
45
+ - **High-assurance, regulated, and audited environments.** Every delivery carries a signed,
46
+ reproducible record of *what was verified and how* — provenance you can hand to an auditor,
47
+ not a screenshot of a passing run.
48
+ - **Multi-agent and at-scale delivery.** Every agent's output is held to the same external,
49
+ un-gameable bar — so you can fan out work without fanning out the risk that one agent
50
+ learns to game its gate.
51
+ - **Platform teams adopting agents.** Add the trust anchor as a required check in your repos
52
+ and get a consistent, enforced "agents must prove it" policy across every team.
53
+
54
+ ## How it's different
55
+
56
+ | | Typical agent setup | Flow Agents |
57
+ |---|---|---|
58
+ | "Is it done?" | The agent says so | Re-derived from independent evidence |
59
+ | Failure hiding | Easy (claim pass, launder exit codes) | Caught — captured results reconcile against claims |
60
+ | Where trust lives | In the agent's environment | **External** — CI re-runs fresh, agent can't disable it |
61
+ | Provenance | A log line | **Sigstore-signed** attestation of CI's own results |
62
+ | Tampering with the gate | Possible | Required tests + code-owner review block it |
63
+
64
+ ## The honest part
65
+
66
+ This is a **defense-in-depth bar-raiser, not a magic wall** — and the docs say so plainly.
67
+ The local gate raises the cost of casual or direct self-tampering; the *real* tamper-proof
68
+ boundary is **external**: CI's fresh re-run, the CI-minted signatures, and human (owner)
69
+ review — none of which the agent can reach. Known residuals (and their mitigations) are
70
+ documented openly rather than hidden, because overstating security is its own risk. We'd
71
+ rather you trust this *because* you can see where the lines are.
72
+
73
+ > This posture wasn't designed on paper and declared safe — it was **earned by an adversarial
74
+ > loop**: independent reviewers repeatedly tried to defeat the gate (and found real holes we
75
+ > closed) until a round came back clean. That loop is now part of the policy, and the
76
+ > regression suite that proves it runs on every change.
77
+
78
+ ## Add it to your repo
79
+
80
+ The same external anchor works in **any** repo that uses Flow Agents — add the
81
+ [`trust-verify` composite action](trust-anchor-adoption.md) as a required check, or run it
82
+ locally / in any CI with `npx @kontourai/flow-agents verify`. See the
83
+ [Trust Anchor Adoption guide](trust-anchor-adoption.md) for the full wiring (publish the
84
+ bundle → add the action → make it a required, no-bypass check + CODEOWNERS).
85
+
86
+ ## Learn more
87
+
88
+ - **Add the anchor to your repo:** [Trust Anchor Adoption guide](trust-anchor-adoption.md)
89
+ - **Architecture, threat model, and residuals:** [ADR 0017 — The Anti-Gaming Trust Security
90
+ Model](adr/0017-anti-gaming-trust-security-model.md)
91
+ - **The trust state model it builds on:** [ADR 0010 — Workflow Trust State as a Hachure
92
+ Bundle](adr/0010-workflow-trust-state-as-hachure-bundle.md), [ADR 0004 — Gates Expect
93
+ Surface Claims](adr/0004-gates-expect-surface-claims.md)
94
+ - **Turning on the external teeth** (admin, one-time): the CI anchor + code-owner review are
95
+ armed by two server-side branch-protection settings — see the activation note in ADR 0017.
@@ -30,7 +30,7 @@ The user sees a clear result: pass, fail, hold, or not verified. The implementat
30
30
 
31
31
  | Area | Flow Agents Owns | Veritas Owns |
32
32
  | --- | --- | --- |
33
- | Workflow | Agent-facing workflow packs, harness hooks, sidecars, release decisions, learning loops | None |
33
+ | Workflow | Agent-facing workflow skills, harness hooks, sidecars, release decisions, learning loops | None |
34
34
  | Flow | Process steps, gates, transitions, Flow Runs, exceptions, and Flow Reports | None |
35
35
  | Governance | When to ask for governance evidence | Repo standards, authority settings, evidence checks |
36
36
  | Evidence | `evidence.json`, `standard_refs`, `external_evidence`, acceptance mapping | Native Veritas reports and rule results |
@@ -157,7 +157,7 @@ Current local configuration in this repo is limited to:
157
157
  ## Non-Goals
158
158
 
159
159
  - Do not vendor Veritas source into Flow Agents.
160
- - Do not make Veritas mandatory for the core pack.
160
+ - Do not make Veritas mandatory for the standalone base.
161
161
  - Do not duplicate Veritas policy schemas inside Flow Agents.
162
162
  - Do not make knowledge, meeting, or sales workflows depend on development governance tooling.
163
163
  - Do not bootstrap `.veritas/repo-map.json` from Flow Agents in this slice. Native Veritas repository setup remains future Veritas-owned or adapter-owned work.
@@ -278,6 +278,34 @@ npm run workflow:sidecar -- init-plan .flow-agents/<slug>/<slug>--deliver.md \
278
278
  --next-action "<next step>"
279
279
  ```
280
280
 
281
+ #### Deterministic slug from a work-item ref
282
+
283
+ For issue-backed sessions, pass `--work-item <owner/repo#id>` instead of `--task-slug`. The
284
+ derived slug has the format `<owner>-<repo>-<id>` — for example:
285
+
286
+ ```bash
287
+ npm run workflow:sidecar -- ensure-session \
288
+ --work-item "kontourai/flow-agents#161" \
289
+ --source-request "Implement #161" \
290
+ --summary "Deterministic slug demo."
291
+ # Creates .flow-agents/kontourai-flow-agents-161/
292
+ ```
293
+
294
+ The slug is deterministic and idempotent: any agent or worktree that runs `ensure-session
295
+ --work-item kontourai/flow-agents#161` will land in the same directory. This makes liveness
296
+ collision-detection work correctly — the `subjectId` written to `liveness/events.jsonl` equals
297
+ `workItemSlug(ref)` (i.e. `kontourai-flow-agents-161`), so a double-hold on the same issue is
298
+ detectable via `liveness status --subject kontourai-flow-agents-161` (see
299
+ [ADR 0012](adr/0012-agent-coordination-as-liveness-claims.md)).
300
+
301
+ Rules:
302
+ - `--task-slug` always wins when both flags are supplied (back-compat).
303
+ - Omitting both flags still dies with `--task-slug is required`.
304
+ - The `id` part after `#` must be a plain integer (GitHub issue number). Non-integer ids are
305
+ rejected.
306
+ - Issue-backed sessions should prefer `--work-item` over hand-supplied `--task-slug` so that
307
+ liveness subjectId alignment is automatic.
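
The derivation and the rules above can be sketched as follows. The text names `workItemSlug(ref)`, but this body is an illustrative guess at its behavior, not the shipped code; `resolveSlug` is a hypothetical helper, and any extra sanitization the real function applies (casing, special characters) is not covered.

```javascript
// Sketch only: derives <owner>-<repo>-<id> from an "owner/repo#id" ref.
function workItemSlug(ref) {
  // The id after "#" must be a plain integer; anything else is rejected.
  const match = /^([^\s\/#]+)\/([^\s\/#]+)#(\d+)$/.exec(ref);
  if (!match) throw new Error(`invalid work-item ref: ${ref}`);
  const [, owner, repo, id] = match;
  return `${owner}-${repo}-${id}`;
}

// Hypothetical helper showing flag precedence.
function resolveSlug({ taskSlug, workItem }) {
  if (taskSlug) return taskSlug; // --task-slug always wins (back-compat)
  if (workItem) return workItemSlug(workItem);
  throw new Error("--task-slug is required");
}

console.log(resolveSlug({ workItem: "kontourai/flow-agents#161" }));
```

Because the slug is a pure function of the ref, two agents given the same `--work-item` value compute the same directory and the same liveness `subjectId` without coordinating.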
308
+
281
309
  Reviewer Markdown artifacts can be imported into `critique.json`:
282
310
 
283
311
  ```bash
@@ -441,3 +469,44 @@ Retrospective:
441
469
  ```text
442
470
  Use learning-review. Capture facts, decisions, gaps, follow-ups, and durable knowledge updates from this completed or failed workflow.
443
471
  ```
472
+
473
+ ## Resumable sessions
474
+
475
+ When a session resumes (after context compaction, an agent restart, or a cross-session
476
+ handoff), the workflow-steering hook emits a `RESUME:` block on `SessionStart` that
477
+ gives the resuming agent immediate situational awareness without blocking or auto-deciding.
478
+
479
+ The `RESUME:` block supplements the existing `STATE:` line and contains:
480
+
481
+ - **Header** — `RESUME: <slug> status:<status> phase:<phase>` — quick orientation.
482
+ - **Next action** — the `next_action.summary`, capped at 240 characters (not truncated to 80), so the agent can re-ground to the exact recorded next step.
483
+ - **Plan** — path to the plan artifact (`<slug>--plan-work.md` from `state.json artifact_paths` or conventional fallback).
484
+ - **Next step** — the first `handoff.json next_steps` entry.
485
+ - **Blockers** — any recorded blockers from `handoff.json`, or "none".
486
+ - **Trust** — `Trust: N verified / M disputed / T total` from reading `trust.bundle`. Each disputed or unknown claim is listed with its id and a copy-pasteable remedy command: `npm run workflow:sidecar -- claim <id> <dir>`.
487
+ - **Liveness advisory** (when applicable) — `[LIVENESS WARNING: another agent appears live on this work: actor <X>, last seen <T>]` when the shared liveness stream (`.flow-agents/liveness/events.jsonl`, ADR 0012) contains a fresh claim or heartbeat from a different actor for the same slug. This is advisory only — the hook exits 0 regardless.
488
+ - **Route hint** — `To continue: resume this work. Or run pull-work to assess WIP and start new/parallel work.` — always routes the resume-vs-parallel decision through `pull-work` rather than auto-taking it.
489
+
490
+ The `RESUME:` block appears on `SessionStart` only. `UserPromptSubmit` and `PostToolUse`
491
+ behavior is unchanged.
492
+
493
+ All reads are fail-open: a missing `handoff.json`, `trust.bundle`, or liveness stream
494
+ degrades gracefully — the section is omitted or shows "no data", and the hook never throws.
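
A minimal sketch of how such a block might be assembled, including the fail-open behavior, is shown here. Everything about the function is hypothetical: `buildResumeBlock`, its argument shape, the `Next action:` and `Blockers:` labels, and the claim `status` values are assumptions, and the plan, next-step, and liveness sections are left out for brevity. Only the header format, the 240-character cap, the `Trust:` line, the remedy command, and the route hint come from the description above.

```javascript
// Sketch only: argument shape and line labels are hypothetical.
function buildResumeBlock({ slug, state, handoff, claims, dir }) {
  const lines = [`RESUME: ${slug} status:${state.status} phase:${state.phase}`];

  const summary = state.next_action && state.next_action.summary;
  if (summary) lines.push(`Next action: ${String(summary).slice(0, 240)}`);

  // Fail-open: a missing handoff.json simply omits the section.
  if (handoff) {
    const blockers = Array.isArray(handoff.blockers) ? handoff.blockers : [];
    lines.push(`Blockers: ${blockers.length > 0 ? blockers.join("; ") : "none"}`);
  }

  // Fail-open: a missing trust.bundle omits the trust section.
  if (Array.isArray(claims)) {
    const verified = claims.filter((c) => c.status === "verified").length;
    const disputed = claims.filter((c) => c.status === "disputed").length;
    lines.push(`Trust: ${verified} verified / ${disputed} disputed / ${claims.length} total`);
    for (const claim of claims) {
      if (claim.status !== "verified") {
        // Copy-pasteable remedy for each disputed or unknown claim.
        lines.push(`  ${claim.id}: npm run workflow:sidecar -- claim ${claim.id} ${dir}`);
      }
    }
  }

  lines.push("To continue: resume this work. Or run pull-work to assess WIP and start new/parallel work.");
  return lines.join("\n");
}

console.log(buildResumeBlock({
  slug: "demo",
  state: { status: "in_progress", phase: "execute", next_action: { summary: "Run tests" } },
  handoff: null,
  claims: [{ id: "c1", status: "verified" }, { id: "c2", status: "disputed" }],
  dir: ".flow-agents/demo",
}));
```

Each optional input is checked before use, so a missing file drops its section instead of raising, which matches the "never throws" intent.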
495
+
496
+ The liveness freshness check is read-only (ADR 0012). Writing or excluding liveness claims
497
+ is scoped to issue #151 (a later slice). The session-level event log (Layer 2) is also
498
+ deferred.
499
+
500
+ ### Shared liveness helper
501
+
502
+ The freshness logic is centralised in `scripts/hooks/lib/liveness-read.js` (pure CJS,
503
+ zero dependencies). It exports:
504
+
505
+ - `readLivenessEvents(streamPath)` — reads a `.flow-agents/liveness/events.jsonl` file
506
+ line-by-line, JSON-parses each, and tolerates malformed lines.
507
+ - `freshHolders(events, slug, selfActor, nowMs)` — returns actors (excluding `selfActor`)
508
+ who hold a within-TTL claim or heartbeat on `subjectId === slug`.
509
+
510
+ Both the hook (`scripts/hooks/workflow-steering.js`) and the compiled CLI
511
+ (`build/src/cli/workflow-sidecar.js`) consume this helper so the TTL/freshness logic lives
512
+ in one place.
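
The two exports could look roughly like the sketch below. The signatures follow the description above, but the bodies are guesses rather than the contents of `liveness-read.js`: the event fields `actor`, `type`, and `at`, the optional `ttlMs` parameter with its default, and the `parseLivenessLines` helper are all hypothetical. Release events that would end a claim early are not modeled.

```javascript
const fs = require("node:fs");

// Hypothetical TTL; the real value is not stated in this document.
const ASSUMED_TTL_MS = 10 * 60 * 1000;

// Hypothetical helper: parse JSONL text, skipping malformed lines.
function parseLivenessLines(text) {
  const events = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    try {
      events.push(JSON.parse(trimmed));
    } catch {
      // Tolerate malformed lines: skip and keep reading.
    }
  }
  return events;
}

// Fail-open read: a missing or unreadable stream yields no events.
function readLivenessEvents(streamPath) {
  try {
    return parseLivenessLines(fs.readFileSync(streamPath, "utf8"));
  } catch {
    return [];
  }
}

// Other actors with a within-TTL claim or heartbeat on subjectId === slug.
function freshHolders(events, slug, selfActor, nowMs, ttlMs = ASSUMED_TTL_MS) {
  const lastSeen = new Map();
  for (const ev of events) {
    if (!ev || ev.subjectId !== slug) continue;
    if (ev.type !== "claim" && ev.type !== "heartbeat") continue;
    if (!ev.actor || ev.actor === selfActor) continue;
    const seenMs = Date.parse(ev.at);
    if (Number.isNaN(seenMs) || nowMs - seenMs > ttlMs) continue;
    if (!lastSeen.has(ev.actor) || lastSeen.get(ev.actor) < seenMs) {
      lastSeen.set(ev.actor, seenMs);
    }
  }
  return [...lastSeen].map(([actor, lastSeenMs]) => ({ actor, lastSeenMs }));
}

const events = readLivenessEvents(".flow-agents/liveness/events.jsonl");
console.log(freshHolders(events, "kontourai-flow-agents-161", "agent-a", Date.now()));
```

Passing `nowMs` in rather than reading the clock inside `freshHolders` keeps the function pure, which makes the TTL logic straightforward to test from both the hook and the CLI.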