@kontourai/flow-agents 1.4.0 → 2.0.1

Files changed (184)
  1. package/.github/CODEOWNERS +29 -0
  2. package/.github/actions/trust-verify/action.yml +145 -0
  3. package/.github/workflows/ci.yml +11 -4
  4. package/.github/workflows/kit-gates-demo.yml +2 -2
  5. package/.github/workflows/publish-npm.yml +10 -2
  6. package/.github/workflows/release-please.yml +1 -1
  7. package/.github/workflows/runtime-compat.yml +1 -1
  8. package/.github/workflows/trust-reconcile.yml +113 -0
  9. package/AGENTS.md +13 -0
  10. package/CHANGELOG.md +103 -0
  11. package/CONTRIBUTING.md +4 -4
  12. package/README.md +1 -0
  13. package/agents/tool-planner.json +1 -1
  14. package/build/src/cli/init.js +242 -20
  15. package/build/src/cli/validate-workflow-artifacts.js +19 -2
  16. package/build/src/cli/verify.d.ts +1 -0
  17. package/build/src/cli/verify.js +90 -0
  18. package/build/src/cli/workflow-sidecar.d.ts +316 -8
  19. package/build/src/cli/workflow-sidecar.js +1996 -91
  20. package/build/src/cli.js +2 -3
  21. package/build/src/lib/flow-resolver.d.ts +111 -0
  22. package/build/src/lib/flow-resolver.js +308 -0
  23. package/build/src/tools/build-universal-bundles.js +34 -22
  24. package/build/src/tools/generate-context-map.js +3 -16
  25. package/build/src/tools/validate-source-tree.d.ts +1 -1
  26. package/build/src/tools/validate-source-tree.js +42 -162
  27. package/context/contracts/artifact-contract.md +10 -0
  28. package/context/contracts/delivery-contract.md +1 -0
  29. package/context/contracts/review-contract.md +1 -0
  30. package/context/contracts/verification-contract.md +2 -0
  31. package/context/gate-awareness.md +39 -0
  32. package/context/scripts/hooks/stop-goal-fit.js +632 -70
  33. package/docs/adr/0001-flow-agents-consumes-flow.md +1 -1
  34. package/docs/adr/0002-flow-kits-as-extension-unit.md +1 -1
  35. package/docs/adr/0004-gates-expect-surface-claims.md +2 -0
  36. package/docs/adr/0005-kubernetes-inspired-resource-contracts.md +2 -0
  37. package/docs/adr/0007-skill-audit.md +1 -1
  38. package/docs/adr/0009-canonical-hook-core-kit-boundary.md +95 -0
  39. package/docs/adr/0010-workflow-trust-state-as-hachure-bundle.md +139 -0
  40. package/docs/adr/0011-mcp-posture.md +100 -0
  41. package/docs/adr/0012-agent-coordination-as-liveness-claims.md +119 -0
  42. package/docs/adr/0013-context-lifecycle.md +151 -0
  43. package/docs/adr/0014-core-vs-domain-kit-boundary.md +143 -0
  44. package/docs/adr/0015-flow-flow-agents-boundary-reconciliation.md +120 -0
  45. package/docs/adr/0016-three-hard-boundary-model.md +71 -0
  46. package/docs/adr/0017-anti-gaming-trust-security-model.md +155 -0
  47. package/docs/agent-system-guidebook.md +5 -12
  48. package/docs/context-map.md +4 -10
  49. package/docs/index.md +3 -2
  50. package/docs/integrations/framework-adapter.md +19 -6
  51. package/docs/integrations/index.md +2 -2
  52. package/docs/north-star.md +4 -4
  53. package/docs/operating-layers.md +3 -3
  54. package/docs/plans/adr-0010-phase2-gate-recompute.md +55 -0
  55. package/docs/repository-structure.md +2 -2
  56. package/docs/skills-map.md +1 -0
  57. package/docs/spec/runtime-hook-surface.md +62 -9
  58. package/docs/standards-register.md +3 -3
  59. package/docs/survey-utterance-check.md +1 -1
  60. package/docs/trust-anchor-adoption.md +197 -0
  61. package/docs/verifiable-trust.md +95 -0
  62. package/docs/veritas-integration.md +2 -2
  63. package/docs/workflow-usage-guide.md +69 -0
  64. package/evals/acceptance/DEMO-false-completion.md +144 -0
  65. package/evals/acceptance/demo-cast.sh +92 -0
  66. package/evals/acceptance/demo-false-completion.sh +72 -0
  67. package/evals/acceptance/demo-real-evidence.sh +104 -0
  68. package/evals/acceptance/demo.tape +29 -0
  69. package/evals/acceptance/prove-capture-teeth-declared.sh +335 -0
  70. package/evals/acceptance/prove-capture-teeth.sh +114 -0
  71. package/evals/acceptance/prove-teeth.sh +105 -0
  72. package/evals/ci/antigaming-suite.sh +55 -0
  73. package/evals/ci/run-baseline.sh +2 -0
  74. package/evals/fixtures/flow-kit-repository/invalid-missing-extension-asset/flows/review.flow.json +26 -0
  75. package/evals/fixtures/flow-kit-repository/invalid-missing-extension-asset/kit.json +20 -0
  76. package/evals/fixtures/flow-kit-repository/valid-unknown-extension/flows/review.flow.json +26 -0
  77. package/evals/fixtures/flow-kit-repository/valid-unknown-extension/kit.json +18 -0
  78. package/evals/integration/test_builder_step_producers.sh +379 -0
  79. package/evals/integration/test_bundle_install.sh +35 -71
  80. package/evals/integration/test_bundle_lifecycle.sh +39 -2
  81. package/evals/integration/test_captured_fail_reconciliation.sh +820 -0
  82. package/evals/integration/test_checkpoint_signing.sh +489 -0
  83. package/evals/integration/test_claim_lookup.sh +352 -0
  84. package/evals/integration/test_command_log_fork_classification.sh +134 -0
  85. package/evals/integration/test_command_log_integrity.sh +275 -0
  86. package/evals/integration/test_context_map.sh +0 -2
  87. package/evals/integration/test_dual_emit_flow_step.sh +278 -0
  88. package/evals/integration/test_enforcer_expects_driven.sh +281 -0
  89. package/evals/integration/test_evidence_capture_hook.sh +185 -0
  90. package/evals/integration/test_flow_kit_repository.sh +2 -0
  91. package/evals/integration/test_flowdef_session_activation.sh +273 -0
  92. package/evals/integration/test_flowdef_session_history_preservation.sh +250 -0
  93. package/evals/integration/test_gate_bypass_chain.sh +448 -0
  94. package/evals/integration/test_gate_lockdown.sh +1137 -0
  95. package/evals/integration/test_gate_review_inquiry_records.sh +399 -0
  96. package/evals/integration/test_goal_fit_escape_hatch.sh +73 -0
  97. package/evals/integration/test_goal_fit_hook.sh +69 -4
  98. package/evals/integration/test_goal_fit_rederive.sh +263 -0
  99. package/evals/integration/test_install_merge.sh +1176 -0
  100. package/evals/integration/test_kit_identity_trust.sh +393 -0
  101. package/evals/integration/test_mint_attestation.sh +373 -0
  102. package/evals/integration/test_phase_map_and_gate_claim.sh +365 -0
  103. package/evals/integration/test_publish_delivery.sh +269 -0
  104. package/evals/integration/test_reconcile_soundness.sh +528 -0
  105. package/evals/integration/test_resolvefirststep_security.sh +208 -0
  106. package/evals/integration/test_session_resume_roundtrip.sh +286 -0
  107. package/evals/integration/test_trust_checkpoint.sh +325 -0
  108. package/evals/integration/test_trust_reconcile.sh +293 -0
  109. package/evals/integration/test_verify_cli.sh +208 -0
  110. package/evals/integration/test_workflow_sidecar_writer.sh +549 -34
  111. package/evals/lib/node.sh +0 -6
  112. package/evals/run.sh +47 -0
  113. package/evals/static/test_workflow_skills.sh +6 -13
  114. package/install.sh +0 -7
  115. package/integrations/strands-ts/README.md +25 -15
  116. package/integrations/veritas/flow-agents.adapter.json +1 -2
  117. package/kits/builder/flows/build.flow.json +59 -12
  118. package/kits/builder/kit.json +85 -15
  119. package/kits/builder/skills/continue-work/SKILL.md +116 -0
  120. package/kits/builder/skills/deliver/SKILL.md +36 -6
  121. package/kits/builder/skills/design-probe/SKILL.md +28 -0
  122. package/kits/builder/skills/execute-plan/SKILL.md +9 -1
  123. package/kits/builder/skills/gate-review/SKILL.md +234 -0
  124. package/kits/builder/skills/learning-review/SKILL.md +30 -0
  125. package/kits/builder/skills/pickup-probe/SKILL.md +29 -0
  126. package/kits/builder/skills/plan-work/SKILL.md +13 -1
  127. package/kits/builder/skills/pull-work/SKILL.md +19 -0
  128. package/kits/knowledge/adapters/default-store/index.js +38 -0
  129. package/kits/knowledge/adapters/flow-runner/index.js +1620 -0
  130. package/kits/knowledge/adapters/obsidian-store/index.js +36 -6
  131. package/kits/knowledge/docs/store-contract.md +314 -0
  132. package/kits/knowledge/evals/audit-freshness/suite.test.js +368 -0
  133. package/kits/knowledge/evals/canonicalize-category/suite.test.js +383 -0
  134. package/kits/knowledge/evals/contract-suite/suite.test.js +111 -0
  135. package/kits/knowledge/evals/detect-contradictions/suite.test.js +324 -0
  136. package/kits/knowledge/evals/entities/suite.test.js +40 -0
  137. package/kits/knowledge/evals/glossary-sync/suite.test.js +416 -0
  138. package/kits/knowledge/evals/hygiene-review/suite.test.js +396 -0
  139. package/kits/knowledge/evals/retirement/suite.test.js +145 -0
  140. package/kits/knowledge/flows/audit-freshness.flow.json +44 -0
  141. package/kits/knowledge/flows/canonicalize-category.flow.json +44 -0
  142. package/kits/knowledge/flows/detect-contradictions.flow.json +44 -0
  143. package/kits/knowledge/flows/glossary-sync.flow.json +61 -0
  144. package/kits/knowledge/flows/hygiene-review.flow.json +43 -0
  145. package/kits/knowledge/kit.json +51 -1
  146. package/package.json +6 -6
  147. package/packaging/conformance/README.md +10 -2
  148. package/packaging/conformance/fixtures/evidence-capture--allow-records-command.json +29 -0
  149. package/packaging/conformance/fixtures/stop-goal-fit--block-bundle-disputed-claim.json +29 -0
  150. package/packaging/conformance/fixtures/stop-goal-fit--block-capture-contradicts-claimed-pass.json +30 -0
  151. package/packaging/conformance/fixtures/stop-goal-fit--block-mode.json +23 -0
  152. package/packaging/conformance/fixtures/stop-goal-fit--off-mode.json +24 -0
  153. package/packaging/conformance/fixtures/stop-goal-fit--warn-active-delivery.json +5 -2
  154. package/packaging/conformance/fixtures/stop-goal-fit--warn-no-bundle.json +23 -0
  155. package/packaging/conformance/fixtures/workflow-steering--reground-active-prompt.json +30 -0
  156. package/packaging/conformance/fixtures/workflow-steering--reground-session-start.json +30 -0
  157. package/packaging/conformance/run-conformance.js +1 -1
  158. package/scripts/README.md +2 -1
  159. package/scripts/build-universal-bundles.js +0 -1
  160. package/scripts/ci/mint-attestation.js +221 -0
  161. package/scripts/ci/trust-reconcile.js +545 -0
  162. package/scripts/hooks/config-protection.js +423 -1
  163. package/scripts/hooks/evidence-capture.js +348 -0
  164. package/scripts/hooks/lib/liveness-read.js +113 -0
  165. package/scripts/hooks/run-hook.js +6 -1
  166. package/scripts/hooks/stop-goal-fit.js +1524 -79
  167. package/scripts/hooks/workflow-steering.js +135 -5
  168. package/scripts/install-codex-home.sh +39 -0
  169. package/scripts/install-merge.js +330 -0
  170. package/scripts/repair-command-log.js +115 -0
  171. package/src/cli/init.ts +218 -20
  172. package/src/cli/validate-workflow-artifacts.ts +18 -2
  173. package/src/cli/verify.ts +100 -0
  174. package/src/cli/workflow-sidecar.ts +2127 -84
  175. package/src/cli.ts +2 -3
  176. package/src/lib/flow-resolver.ts +369 -0
  177. package/src/tools/build-universal-bundles.ts +34 -21
  178. package/src/tools/generate-context-map.ts +3 -17
  179. package/src/tools/validate-source-tree.ts +44 -104
  180. package/build/src/tools/filter-installed-packs.d.ts +0 -2
  181. package/build/src/tools/filter-installed-packs.js +0 -135
  182. package/packaging/packs.json +0 -49
  183. package/scripts/filter-installed-packs.js +0 -2
  184. package/src/tools/filter-installed-packs.ts +0 -132
@@ -10,19 +10,20 @@ const dirDescriptions = {
  context: "Shared contracts, routing notes, templates, and reusable guidance.",
  docs: "Long-lived project documentation and GitHub Pages content.",
  evals: "Static, integration, install, and behavioral eval fixtures.",
- powers: "Optional MCP/tool integration packs.",
+ powers: "Optional MCP/tool capability bundles.",
  prompts: "Reusable prompt entry points.",
  schemas: "JSON Schema contracts for machine-readable workflow artifacts.",
  scripts: "Build, validation, hook, telemetry, workflow, and import/export utilities.",
  skills: "On-demand capability instructions and workflow primitives.",
  };
- const workflowSkills = new Set(["idea-to-backlog", "pull-work", "plan-work", "execute-plan", "review-work", "verify-work", "evidence-gate", "release-readiness", "learning-review", "deliver", "fix-bug", "tdd-workflow"]);
+ const workflowSkills = new Set(["idea-to-backlog", "pull-work", "plan-work", "execute-plan", "review-work", "verify-work", "evidence-gate", "gate-review", "release-readiness", "learning-review", "deliver", "continue-work", "fix-bug", "tdd-workflow"]);
  const commands = [
  ["Source tree", "npm run validate:source"],
  ["Static suite", "bash evals/run.sh static"],
  ["Integration suite", "bash evals/run.sh integration"],
  ["Workflow artifacts", "npm run workflow:validate-artifacts -- --require-sidecars --require-critique .flow-agents/<slug>"],
  ["Workflow sidecars", "npm run workflow:sidecar -- --help"],
+ ["Claim lookup", "npm run workflow:sidecar -- claim <id> <dir>"],
  ["Context map drift", "npm run context-map:check"],
  ["Bundle build", "npm run build:bundles"],
  ];
@@ -154,17 +155,6 @@ function powers() {
  const dir = path.join(root, "powers");
  return fs.readdirSync(dir).sort().flatMap((name) => exists(path.join(dir, name, "POWER.md")) ? [[name, rel(path.join(dir, name, "POWER.md"))]] : []);
  }
- function packs() {
- const data = loadJson(path.join(root, "packaging/packs.json"));
- return (data.packs ?? []).map((pack) => [
- String(pack.name ?? ""),
- pack.default ? "yes" : "no",
- String(Array.isArray(pack.skills) ? pack.skills.length : 0),
- String(Array.isArray(pack.agents) ? pack.agents.length : 0),
- String(Array.isArray(pack.powers) ? pack.powers.length : 0),
- oneLine(String(pack.description ?? "")),
- ]);
- }
  function latestRuntimeStates(includeRuntime) {
  if (!includeRuntime) {
  return [
@@ -205,9 +195,6 @@ function render(includeRuntime) {
  "## Support Skills", "", ...markdownTable(["Skill", "Source", "When To Load"], supportRows), "",
  "## Agents", "", ...markdownTable(["Agent", "Model", "Tools", "Role"], agents()), "",
  "## Optional Powers", "", ...markdownTable(["Power", "Source"], powers()), "",
- "## Packs", "",
- "Pack composition is defined in `packaging/packs.json`. The current builder exports pack metadata in bundle catalogs, and generated install scripts support opt-in `FLOW_AGENTS_PACKS` filtering while leaving all packs installed by default.", "",
- ...markdownTable(["Pack", "Default", "Skills", "Agents", "Powers", "Purpose"], packs()), "",
  "## Current Workflow State", "", ...latestRuntimeStates(includeRuntime), "",
  "## Context Loading Rules", "",
  "- For delivery work, load `deliver`, then the specific primitive skill for the current phase.",
@@ -1,2 +1,2 @@
  #!/usr/bin/env node
- export declare function main(argv?: string[]): number;
+ export declare function main(argv?: string[]): Promise<number>;
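With `main` now declared as returning `Promise<number>`, a caller has to wait for the promise before handing the code to `process.exit`, which is the shape the `scripts/kit.js` wrapper elsewhere in this diff already uses. A minimal sketch of that launcher pattern follows; the `main` body and the `launch` helper are hypothetical stand-ins, not the package's code.

```javascript
// Hypothetical stand-in for the module's exported main(); the real one
// runs the validation and resolves to a process exit code.
async function main(argv = []) {
  return argv.includes("--fail") ? 1 : 0;
}

// Wait for the promise to settle, then pass the numeric code on.
// Calling exit(main(argv)) directly would pass a pending Promise instead.
function launch(argv, exit) {
  return main(argv).then((code) => exit(code));
}

// Usage with a logging stand-in for process.exit:
launch(["--fail"], (code) => console.log(`exit code: ${code}`));
```

In a real wrapper the `exit` argument would be `process.exit`; it is injected here only so the sketch can run without terminating the process.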
@@ -2,7 +2,7 @@
  import fs from "node:fs";
  import { fileURLToPath } from "node:url";
  import path from "node:path";
- import { spawnSync } from "node:child_process";
+ import { validateKitRepository as validateFlowKitRepository } from "../flow-kit/validate.js";
  import { loadJson, readText, rel, root, walkFiles } from "./common.js";
  class Reporter {
  errors = [];
@@ -11,14 +11,10 @@ class Reporter {
  this.fail(message); }
  }
  const manifestPath = path.join(root, "packaging/manifest.json");
- const packsPath = path.join(root, "packaging/packs.json");
  const kitsCatalogPath = path.join(root, "kits/catalog.json");
  const flowRoot = process.env.FLOW_CLI_ROOT ? path.resolve(process.env.FLOW_CLI_ROOT) : "";
  const flowSchemaPath = flowRoot ? path.join(flowRoot, "schemas", "flow-definition.schema.json") : "";
  const flowCliPath = flowRoot ? ["dist/cli.js", "src/cli.js"].map((candidate) => path.join(flowRoot, candidate)).find((candidate) => fs.existsSync(candidate)) ?? path.join(flowRoot, "dist/cli.js") : "";
- const kitIdRe = /^[a-z][a-z0-9-]*(?:\.[a-z][a-z0-9-]*)*$/;
- const kitAssetSections = new Set(["skills", "docs", "adapters", "evals", "assets"]);
- const kitTopLevelKeys = new Set(["schema_version", "id", "name", "product_name", "description", "flows", ...kitAssetSections]);
  const textRefExtensions = new Set([".md", ".yaml", ".yml", ".json", ".sh", ".js", ".toml"]);
  const ignoredRefDirs = new Set(["node_modules", "__pycache__", ".pytest_cache", ".cache"]);
  const legacyRefRe = /(?<![A-Za-z0-9_.-])(?:agents|agent-cards|context|evals|lib|powers|prompts|scripts|skills)\/[A-Za-z0-9_./@:+-]+/g;
@@ -32,10 +28,8 @@ const mirroredFiles = new Map([
  ]);
  const publicScriptWrappers = new Map([
  ["scripts/build-universal-bundles.js", { target: "../build/src/tools/build-universal-bundles.js", significantLines: [
- "// Supports FLOW_AGENTS_PACKS through the TypeScript bundle builder.",
  'import("../build/src/tools/build-universal-bundles.js").then(({ main }) => process.exit(main()));',
  ] }],
- ["scripts/filter-installed-packs.js", { target: "../build/src/tools/filter-installed-packs.js", significantLines: ['import("../build/src/tools/filter-installed-packs.js").then(({ main }) => process.exit(main(process.argv.slice(2))));'] }],
  ["scripts/generate-context-map.js", { target: "../build/src/tools/generate-context-map.js", significantLines: ['import("../build/src/tools/generate-context-map.js").then(({ main }) => process.exit(main(process.argv.slice(2))));'] }],
  ["scripts/kit.js", { target: "../build/src/cli/kit.js", significantLines: ['import("../build/src/cli/kit.js").then(({ main }) => main().then((code) => process.exit(code)));'] }],
  ["scripts/pull-work-provider.js", { target: "../build/src/cli/pull-work-provider.js", significantLines: ['import("../build/src/cli/pull-work-provider.js").then(({ main }) => process.exit(main()));'] }],
@@ -62,6 +56,7 @@ const hookFilePolicies = new Map([
  ["scripts/hooks/codex-telemetry-hook.js", { category: "telemetry shim", requiredNeedles: ["codex", "telemetry"] }],
  ["scripts/hooks/run-hook.js", { category: "hook runner", requiredNeedles: ["isHookEnabled", "Path traversal rejected"] }],
  ["scripts/hooks/config-protection.js", { category: "policy hook", requiredNeedles: ["Config Protection Hook"] }],
+ ["scripts/hooks/evidence-capture.js", { category: "policy hook", requiredNeedles: ["Evidence Capture Hook"] }],
  ["scripts/hooks/governance-audit.sh", { category: "policy hook", requiredNeedles: ["governance-audit.sh", "audit_emit"] }],
  ["scripts/hooks/opencode-hook-adapter.js", { category: "runtime adapter", requiredNeedles: ["opencode", "run-hook.js"] }],
  ["scripts/hooks/opencode-telemetry-hook.js", { category: "telemetry shim", requiredNeedles: ["opencode", "telemetry"] }],
@@ -78,6 +73,7 @@ const hookFilePolicies = new Map([
  ["scripts/hooks/desktop-notify.sh", { category: "local notification helper", requiredNeedles: ["desktop-notify.sh", "osascript"] }],
  ["scripts/hooks/lib/audit-transport.sh", { category: "shared hook library", requiredNeedles: ["audit_emit"] }],
  ["scripts/hooks/lib/hook-flags.js", { category: "shared hook library", requiredNeedles: ["isHookEnabled"] }],
+ ["scripts/hooks/lib/liveness-read.js", { category: "shared hook library", requiredNeedles: ["freshHolders", "readLivenessEvents"] }],
  ["scripts/hooks/lib/patterns.sh", { category: "shared hook library", requiredNeedles: ["_detect_secrets"] }],
  ["scripts/hooks/lib/resolve-formatter.js", { category: "shared hook library", requiredNeedles: ["resolveFormatter"] }],
  ]);
@@ -196,81 +192,7 @@ function validateManifest(reporter, manifest, agentNames) {
  for (const agent of manifest.codex?.excluded_agents ?? [])
  reporter.check(agentNames.has(agent), `${rel(manifestPath)}: codex excluded agent '${agent}' does not exist`);
  }
- function validatePacksManifest(reporter, agentNames) {
- const data = tryLoadJson(packsPath, reporter);
- if (!data || typeof data !== "object")
- return;
- reporter.check(data.schema_version === "1.0", `${rel(packsPath)}: schema_version must be 1.0`);
- reporter.check(Array.isArray(data.packs) && data.packs.length > 0, `${rel(packsPath)}: packs must be a non-empty list`);
- const skillNames = new Set(fs.readdirSync(path.join(root, "skills")).filter((name) => fs.existsSync(path.join(root, "skills", name, "SKILL.md"))));
- const powerNames = new Set(fs.readdirSync(path.join(root, "powers")).filter((name) => fs.existsSync(path.join(root, "powers", name, "POWER.md"))));
- const names = new Set();
- const defaults = new Set();
- const assigned = { skills: new Set(), agents: new Set(), powers: new Set() };
- (Array.isArray(data.packs) ? data.packs : []).forEach((pack, index) => {
- const name = pack?.name;
- if (typeof name !== "string" || !/^[a-z][a-z0-9-]*$/.test(name)) {
- reporter.fail(`${rel(packsPath)}: packs[${index}].name must be a kebab-case string`);
- return;
- }
- if (names.has(name))
- reporter.fail(`${rel(packsPath)}: duplicate pack name '${name}'`);
- names.add(name);
- if (pack.default === true)
- defaults.add(name);
- reporter.check(typeof pack.description === "string" && !!pack.description, `${rel(packsPath)}: pack '${name}' missing description`);
- for (const [field, available] of [["skills", skillNames], ["agents", agentNames], ["powers", powerNames]]) {
- const values = pack[field] ?? [];
- reporter.check(Array.isArray(values), `${rel(packsPath)}: pack '${name}' .${field} must be a list`);
- const seen = new Set();
- for (const value of Array.isArray(values) ? values : []) {
- if (typeof value !== "string") {
- reporter.fail(`${rel(packsPath)}: pack '${name}' .${field} entry is not a string`);
- continue;
- }
- if (seen.has(value))
- reporter.fail(`${rel(packsPath)}: pack '${name}' has duplicate ${field} entry '${value}'`);
- seen.add(value);
- assigned[field].add(value);
- reporter.check(available.has(value), `${rel(packsPath)}: pack '${name}' references missing ${field.slice(0, -1)} '${value}'`);
- }
- }
- });
- reporter.check(defaults.has("core"), `${rel(packsPath)}: core pack must be default`);
- const missingSkills = [...skillNames].filter((name) => !assigned.skills.has(name)).sort();
- reporter.check(missingSkills.length === 0, `${rel(packsPath)}: skills missing from all packs: ${missingSkills.join(", ")}`);
- }
- function safeLocalPath(baseDir, pathText, label, reporter) {
- if (typeof pathText !== "string" || !pathText) {
- reporter.fail(`${label} must be a non-empty relative path`);
- return undefined;
- }
- if (path.isAbsolute(pathText)) {
- reporter.fail(`${label} must be relative; absolute paths are not allowed`);
- return undefined;
- }
- if (pathText.split(/[\\/]/).includes("..")) {
- reporter.fail(`${label} must stay inside the kit directory; '..' path traversal is not allowed`);
- return undefined;
- }
- return path.join(baseDir, pathText);
- }
- function validateFlowDefinitionShape(file, data, reporter) {
- const localCli = flowCliPath;
- if (fs.existsSync(localCli)) {
- const result = spawnSync("node", [localCli, "validate-definition", file, "--json"], { encoding: "utf8" });
- if (result.status !== 0)
- reporter.fail(`${rel(file)}: Flow validation failed: ${(result.stderr || result.stdout).trim()}`);
- return;
- }
- if (!data || typeof data !== "object") {
- reporter.fail(`${rel(file)}: Flow Definition must be an object`);
- return;
- }
- for (const key of ["id", "version", "steps", "gates"])
- reporter.check(key in data, `${rel(file)}: missing .${key}`);
- }
- function validateKitRepository(kitDir, reporter) {
+ async function validateKitRepository(kitDir, reporter) {
  if (!fs.existsSync(kitDir) || !fs.statSync(kitDir).isDirectory()) {
  reporter.fail(`${rel(kitDir)}: kit directory does not exist`);
  return;
@@ -279,78 +201,10 @@ function validateKitRepository(kitDir, reporter) {
  reporter.check(fs.existsSync(kitJson), `${rel(kitDir)}: missing kit.json at repository root`);
  if (!fs.existsSync(kitJson))
  return;
- const data = tryLoadJson(kitJson, reporter);
- if (!data || typeof data !== "object")
- return;
- const unknownKeys = Object.keys(data).filter((key) => !kitTopLevelKeys.has(key)).sort();
- if (unknownKeys.length)
- reporter.fail(`${rel(kitJson)}: unsupported fields ${unknownKeys.join(", ")}; remove them or add them to the Flow Kit Repository contract`);
- reporter.check(data.schema_version === "1.0", `${rel(kitJson)}: .schema_version must be "1.0"`);
- reporter.check(typeof data.id === "string" && kitIdRe.test(data.id), `${rel(kitJson)}: .id must be a stable kebab-case string`);
- reporter.check(typeof data.name === "string" && !!data.name.trim(), `${rel(kitJson)}: .name must be a non-empty string`);
- for (const section of [...kitAssetSections].sort())
- if (section in data) {
- if (!Array.isArray(data[section])) {
- reporter.fail(`${rel(kitJson)}: .${section} must be a list of relative asset paths or objects with path`);
- continue;
- }
- const seenPaths = new Set();
- const seenIds = new Set();
- data[section].forEach((entry, index) => {
- const pathValue = typeof entry === "string" ? entry : entry?.path;
- const assetId = typeof entry === "object" ? entry.id : undefined;
- if (typeof entry === "object") {
- const unknown = Object.keys(entry).filter((key) => !["id", "path", "description"].includes(key)).sort();
- if (unknown.length)
- reporter.fail(`${rel(kitJson)}: ${section}[${index}] has unsupported fields ${unknown.join(", ")}; use id, path, or description`);
- }
- if (assetId !== undefined && (typeof assetId !== "string" || !kitIdRe.test(assetId)))
- reporter.fail(`${rel(kitJson)}: ${section}[${index}].id must be a stable dot/kebab-case string`);
- const assetPath = safeLocalPath(kitDir, pathValue, `${rel(kitJson)}: ${section}[${index}].path`, reporter);
- if (!assetPath)
- return;
- if (seenPaths.has(String(pathValue)))
- reporter.fail(`${rel(kitJson)}: ${section}[${index}].path duplicates '${pathValue}'; declare each asset once`);
- seenPaths.add(String(pathValue));
- if (typeof assetId === "string") {
- if (seenIds.has(assetId))
- reporter.fail(`${rel(kitJson)}: ${section}[${index}].id duplicates '${assetId}'; use a unique asset id`);
- seenIds.add(assetId);
- }
- reporter.check(fs.existsSync(assetPath), `${rel(kitJson)}: ${section}[${index}].path points at missing asset: ${pathValue}; add the file or remove the entry`);
- });
- }
- if (!Array.isArray(data.flows) || !data.flows.length) {
- reporter.fail(`${rel(kitJson)}: .flows must be a non-empty list; add at least one Flow Definition entry`);
- return;
- }
- const seenIds = new Set();
- const seenPaths = new Set();
- data.flows.forEach((flow, index) => {
- if (!flow || typeof flow !== "object") {
- reporter.fail(`${rel(kitJson)}: flows[${index}] must be an object with id and path`);
- return;
- }
- if (typeof flow.id !== "string" || !kitIdRe.test(flow.id))
- reporter.fail(`${rel(kitJson)}: flows[${index}].id must be a stable dot/kebab-case string`);
- else if (seenIds.has(flow.id))
- reporter.fail(`${rel(kitJson)}: flows[${index}].id duplicates '${flow.id}'; use a unique Flow id`);
- else
- seenIds.add(flow.id);
- const flowPath = safeLocalPath(kitDir, flow.path, `${rel(kitJson)}: flows[${index}].path`, reporter);
- if (!flowPath)
- return;
- if (seenPaths.has(String(flow.path))) {
- reporter.fail(`${rel(kitJson)}: flows[${index}].path duplicates '${flow.path}'; declare each Flow Definition once`);
- return;
- }
- seenPaths.add(String(flow.path));
- reporter.check(fs.existsSync(flowPath), `${rel(kitJson)}: flows[${index}].path points at missing Flow Definition: ${flow.path}; add the file or fix the path`);
- if (fs.existsSync(flowPath))
- validateFlowDefinitionShape(flowPath, tryLoadJson(flowPath, reporter), reporter);
- });
+ for (const error of await validateFlowKitRepository(kitDir))
+ reporter.fail(error);
  }
- function validateKits(reporter) {
+ async function validateKits(reporter) {
  reporter.check(fs.existsSync(path.join(root, "kits")), "kits directory missing");
  const catalog = tryLoadJson(kitsCatalogPath, reporter);
  const kits = catalog?.kits;
@@ -362,17 +216,17 @@ function validateKits(reporter) {
  console.log(fs.existsSync(localCli) ? `info: validating kit Flow Definitions with Flow CLI at ${localCli}` : `warning: Flow validator unavailable; source-tree check only verifies Flow Definition top-level shape`);
  else
  console.log("warning: Flow schema not configured; source-tree check only verifies Flow Definition top-level shape. Set FLOW_CLI_ROOT to enable Flow CLI validation. Container validation (kit.json core fields) will delegate to 'flow validate-kit' from @kontourai/flow when FLOW_CLI_ROOT is available.");
- kits.forEach((entry, index) => {
+ for (const [index, entry] of kits.entries()) {
  const kitText = typeof entry === "string" ? entry : ["path", "directory", "dir", "id", "name"].map((key) => entry?.[key]).find((value) => typeof value === "string" && value);
  if (!kitText) {
  reporter.fail(`${rel(kitsCatalogPath)}: kits[${index}] missing path, directory, dir, id, or name`);
- return;
+ continue;
  }
  const kitRef = String(kitText).startsWith("kits/") ? path.join(root, kitText) : path.join(root, "kits", kitText);
  const kitDir = path.basename(kitRef) === "kit.json" ? path.dirname(kitRef) : kitRef;
  reporter.check(fs.existsSync(kitDir) && fs.statSync(kitDir).isDirectory(), `${rel(kitsCatalogPath)}: kits[${index}] points at missing kit folder: ${kitText}`);
- validateKitRepository(kitDir, reporter);
- });
+ await validateKitRepository(kitDir, reporter);
+ }
  }
  function validateAgentPaths(reporter, manifest) {
  for (const file of walkFiles(path.join(root, "agents")).filter((item) => item.endsWith(".json"))) {
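The move from `kits.forEach(...)` to a `for...of` loop matters once `validateKitRepository` is async: `Array.prototype.forEach` ignores the promises its callback returns, so the caller would carry on before any validator had finished. A small sketch of the difference, using a hypothetical `validateOne` in place of the real validator:

```javascript
// Hypothetical async validator standing in for validateKitRepository.
async function validateOne(name, errors) {
  await Promise.resolve(); // simulate asynchronous work
  if (!name) errors.push("missing kit name");
}

// forEach discards the returned promises, so the error list is read
// before any validator has resumed after its first await.
function countWithForEach(names) {
  const errors = [];
  names.forEach((name) => { validateOne(name, errors); });
  return errors.length;
}

// for...of inside an async function waits for each validator in turn.
async function countWithForOf(names) {
  const errors = [];
  for (const name of names) {
    await validateOne(name, errors);
  }
  return errors.length;
}

console.log(countWithForEach(["", "builder"])); // 0: failure not yet recorded
countWithForOf(["", "builder"]).then((n) => console.log(n)); // 1
```

The same reasoning explains why `return` inside the old callback becomes `continue` in the loop body.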
@@ -480,6 +334,32 @@ function validatePublicScriptWrappers(reporter) {
  reporter.check(JSON.stringify(significantLines) === JSON.stringify(policy.significantLines), `${file}: public wrapper must match the exact thin launcher body for ${policy.target}`);
  }
  }
+ function validateAdrNumbers(reporter) {
+ // Each ADR (a docs/adr file with an `# ADR NNNN:` heading) must own a unique
+ // number, and its filename prefix must match that number. Companion/index docs
+ // without an ADR heading (e.g. a numbered skill-audit tied to an ADR) are
+ // intentionally skipped. Guards against concurrent number collisions like the
+ // duplicate ADR 0014 from PRs #180/#172.
+ const adrDir = path.join(root, "docs/adr");
+ if (!fs.existsSync(adrDir))
345
+ return;
346
+ const byNumber = new Map();
347
+ for (const file of walkFiles(adrDir)) {
348
+ if (path.extname(file) !== ".md")
349
+ continue;
350
+ const heading = readText(file).match(/^#\s+ADR\s+(\d{4}):/m);
351
+ if (!heading)
352
+ continue; // not an ADR decision doc
353
+ const num = heading[1];
354
+ reporter.check(path.basename(file).startsWith(`${num}-`), `${rel(file)}: ADR heading number ${num} does not match the filename prefix`);
355
+ const list = byNumber.get(num) ?? [];
356
+ list.push(rel(file));
357
+ byNumber.set(num, list);
358
+ }
359
+ for (const [num, files] of byNumber) {
360
+ reporter.check(files.length === 1, `docs/adr: duplicate ADR number ${num} — ${files.join(", ")}. ADR numbers must be unique; renumber one.`);
361
+ }
362
+ }
483
363
  function validateHookInventory(reporter) {
484
364
  const readme = readText(path.join(root, "scripts/README.md"));
485
365
  const hookFiles = walkFiles(path.join(root, "scripts/hooks"))
@@ -605,7 +485,7 @@ function validateNoFirstPartyPythonCommands(reporter) {
  reporter.fail(`${relative}: direct first-party Python command reference is not allowed; use npm/flow-agents TypeScript commands`);
  }
  }
- export function main(argv = process.argv.slice(2)) {
+ export async function main(argv = process.argv.slice(2)) {
  const kitIndex = argv.indexOf("--kit");
  if (kitIndex >= 0) {
  const kitDir = argv[kitIndex + 1];
@@ -619,7 +499,7 @@ export function main(argv = process.argv.slice(2)) {
  console.log(`info: validating kit Flow Definitions with Flow CLI at ${localCli}`);
  else
  console.log("warning: Flow validation surface unavailable; local kit check uses the minimal Flow Definition fallback");
- validateKitRepository(path.resolve(kitDir), reporter);
+ await validateKitRepository(path.resolve(kitDir), reporter);
  if (reporter.errors.length) {
  console.log("Flow Kit repository validation failed:");
  for (const error of reporter.errors)
@@ -635,14 +515,14 @@ export function main(argv = process.argv.slice(2)) {
  validateAgentCards(reporter, agentNames);
  validatePowers(reporter);
  validateManifest(reporter, manifest, agentNames);
- validatePacksManifest(reporter, agentNames);
- validateKits(reporter);
+ await validateKits(reporter);
  validateAgentPaths(reporter, manifest);
  validateLegacyRefs(reporter);
  validateMirrors(reporter);
  validateUsageFeedbackFiles(reporter);
  validatePublicScriptWrappers(reporter);
  validateHookInventory(reporter);
+ validateAdrNumbers(reporter);
  validateFixtureOwnership(reporter);
  validatePackageCommandSurface(reporter);
  validateNoFirstPartyPythonFiles(reporter);
@@ -672,5 +552,5 @@ catch {
  return process.argv[1];
  } })();
  if (_selfRealPath === _argv1RealPath) {
- process.exitCode = main();
+ main().then((code) => { process.exitCode = code; });
  }
@@ -20,6 +20,16 @@ The artifact root is local working memory unless a workflow explicitly promotes
  - Do not commit local workflow runtime roots such as `.flow-agents/<slug>/` as durable policy unless a repository-specific contract explicitly says that artifact is promoted.
  - Do not commit local workflow runtime roots such as `.flow-agents/<slug>/`; final acceptance must promote durable content before merge.

+ ## Persistence Integrity
+
+ Writing a durable artifact must **fail loud, never fail-open.** If a record (state, evidence, a
+ trust.bundle, a claim) cannot be persisted — a missing dependency, a validation failure, an I/O
+ error — the operation **fails with the reason**; it must not return success while silently
+ dropping the write. A silently-skipped persist is **data loss**, not a degraded mode, and is
+ invisible to the caller that depended on it. Callers act on persistence **return values**, not
+ just thrown exceptions. (See #160: an ignored `{written:false}` from the bundle writer dropped
+ records under concurrency.)
+
  ## Required Artifact Types

  ### Structured Sidecars
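Note on the Persistence Integrity rule added in this hunk: the following is a minimal sketch of a caller that acts on a writer's return value instead of relying on exceptions alone. `writeRecord` and `persistOrThrow` are hypothetical names, and the `{ written, reason }` result shape is an assumption modelled on the `{written:false}` value the text mentions. It is not the package's bundle writer.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical writer that reports failure through its return value.
// The { written, reason } shape is an assumption for this sketch.
function writeRecord(file, record) {
  if (record == null || typeof record.id !== "string") {
    return { written: false, reason: "record is missing a string id" };
  }
  try {
    // Append one JSON document per line.
    fs.writeFileSync(file, JSON.stringify(record) + "\n", { flag: "a" });
    return { written: true };
  } catch (error) {
    return { written: false, reason: error.message };
  }
}

// Fail-loud caller: a written:false result is turned into an error that
// carries the reason, so the write cannot be dropped silently.
function persistOrThrow(file, record) {
  const result = writeRecord(file, record);
  if (!result.written) {
    throw new Error(`persist failed for ${file}: ${result.reason}`);
  }
  return result;
}
```

The fail-open variant this rule forbids would call `writeRecord(file, record)` and continue without inspecting the result.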
@@ -61,6 +61,7 @@ After CI passes and the work is merged, released, or otherwise accepted:
  - [ ] durable docs link back to the provider record, archived plan, or session artifact when useful
  - [ ] local `.flow-agents/` runtime artifacts remain untracked, and durable outcomes are promoted before merge to `main`
  - [ ] follow-up issues or learning-review items created for deferred work
+ - [ ] **workspace cleaned up after a confirmed merge**: the merge is verified from the provider's own merge record (a merge commit / `mergedAt`), not a green check or a command exit code; then the isolated worktree is removed and the now-merged branch is deleted locally and on the remote, honoring the `worktree_lifecycle` (`retain_until: pr_merged`) recorded at selection. Never delete a branch or worktree before the merge is confirmed. A delivery is not complete while it leaves a stale worktree or merged branch behind.

  ## Distribution Rule

@@ -59,6 +59,7 @@ All reviewers are read-only reporters. They may inspect files, run read-only ana
  Attempt relevant perspectives and record findings:

  - Code quality: readability, naming, function/file size, error handling, duplication, maintainability
+ - Failure handling: callers act on failure *return values*, not just exceptions — flag fail-open on any data-persisting path (per the persistence-integrity invariant in the artifact contract)
  - Correctness risks: edge cases, unintended behavior, unsafe assumptions, missing tests
  - Standards fit: project conventions, local architecture, public contracts, documented decisions
  - Security: secrets, injection, XSS, path traversal, auth/authz, unsafe external calls, vulnerable dependencies
@@ -29,6 +29,8 @@ Attempt relevant phases and record evidence:

  If a tool or environment is unavailable, mark that phase `NOT_VERIFIED` with the reason. Do not skip silently.

+ A flaky or intermittently-failing test is a real defect — a race, a fail-open, or nondeterminism — not noise. Root-cause it; never re-run to green or mark it `skip`/`pass` to move on. An operation that can pass without doing its job is a failure, not a flake.
+
  Provider-check gaps are risk-based:

  - Docs-only changes may use `SKIP` / `skip` for missing provider checks only when the report names the skipped check, explains why local docs evidence is enough, and the repository does not require the missing check.
@@ -0,0 +1,39 @@
+ # Gate Awareness
+
+ This repo runs three active gates implemented as Claude Code hook scripts. Every agent working here should know when each gate fires, what it checks, and what the correct posture is when a gate blocks or when a suspected block does not appear.
+
+ ## Active Gates
+
+ **goal-fit/Stop** (`scripts/hooks/stop-goal-fit.js`): fires on the agent Stop event (before the agent final-answers as complete). The gate reads `.flow-agents/` to find the most recent active workflow artifact and checks for: an incomplete Definition Of Done section, an incomplete or absent Goal Fit Gate section, open items in Final Acceptance when status is delivered, failing or NOT_VERIFIED checks in `evidence.json`, open sidecar issues (state.json showing non-done status, critique.json with open findings), and evidence cross-reference failures (the capture log in `command-log.jsonl` contradicting a claimed-pass command check in `evidence.json`). In `block` mode the gate exits 2, which prevents the Stop. The canonical engine default is `warn` (exit 0 with guidance on stderr); shipped runtime configs such as Claude Code at L2 set `block` so the installed product enforces. The gate releases automatically after a configurable number of consecutive identical blocks (default 3) to surface the situation to the human rather than looping forever.
+
+ **evidence-capture** (`scripts/hooks/evidence-capture.js`): fires as a postToolUse hook on every shell or command tool execution. It deterministically records the actual command result — not the model's narration about it — to `.flow-agents/<slug>/command-log.jsonl` as an append-only JSONL log. Each record captures the command string, observed result (pass/fail), exit code when available, and a timestamp. Non-blocking; always exits 0. Fail-open: a capture failure never blocks the agent or corrupts the log.
+
+ **reground** (`scripts/hooks/workflow-steering.js`): fires on `SessionStart` and `UserPromptSubmit` to re-inject the active workflow phase, goal, and next-step from `state.json` into the agent turn. This is what keeps an in-flight goal alive through context compaction and session resume without requiring the agent to voluntarily re-read sidecars. The hook also fires after subagent calls (use_subagent) to inject phase-transition reminders tailored to the completing subagent (planner, worker, reviewer, verifier). Non-blocking; always exits 0.
+
+ ## A Block Is The System Working
+
+ When the goal-fit/Stop gate blocks, that is the system functioning correctly, not an obstacle to route around. The gate blocked because it found a genuine gap: an open Definition Of Done item, a failed evidence check, a sidecar showing non-done status, or a command the capture log shows actually failed while the evidence claims it passed. Routing around the block, silencing the hook, or suppressing the exit code treats a functioning quality gate as an error to ignore. It is not. Address the gap the gate named.
+
+ ## Judge Gate Correctness
+
+ A block demands evaluation, not blind obedience and not blind routing-around. When the goal-fit/Stop gate fires, ask: is this a true-block or a false-block?
+
+ A true-block is a case where the gate is correct: a real gap exists — an unchecked Definition Of Done item, a command that genuinely failed, a missing sidecar, an open review finding — and the system is right to prevent delivery until the gap is closed. The correct response to a true-block is to close the gap, then re-attempt.
+
+ A false-block is a case where the gate has a genuine bug or is acting on stale or corrupt data — for example, a sidecar that was incorrectly written, a `command-log.jsonl` entry that misrecorded a passing command as a failure due to a capture-hook defect, or a `state.json` that was never updated to `done` even though the work is complete.
+
+ The path to a clean pass is always to **produce real evidence**, never to make the proof say what you want: run the command so the capture hook records the real result, finish the missing Definition-of-Done item, write the sidecar the flow forgot. Proof artifacts are not yours to hand-author into a pass — `command-log.jsonl` is owned by the capture hook and must never be hand-edited, and a verdict you write for yourself is not evidence of anything. Correcting a genuinely-wrong artifact is a last resort: do it transparently, note it as a correction, and prefer regenerating it through the tool that owns it. If the only way you can see to clear a block is to edit the proof, that is the signal to stop and surface the situation, not to proceed.
+
+ Do not conflate "inconvenient" with "false-block." If the gap named by the gate is real, it is a true-block regardless of how close to done the work feels.
+
+ ## Missed-Block Diagnostic
+
+ When a gate does not fire and you suspect it should have, the gate is almost never defective. The goal-fit/Stop gate only knows what the flow recorded in `.flow-agents/<slug>/`. It cross-references `evidence.json` command checks against `command-log.jsonl`. A suspected missed block nearly always means the flow did not record the evidence, not that the gate failed to evaluate it.
+
+ Start diagnosis here:
+
+ 1. Check `.flow-agents/<slug>/command-log.jsonl` — was the relevant command captured? If the evidence-capture hook was not active when the command ran (for example, the session predated the hook or the artifact directory was not yet resolved), the log will have no entry for that command and the Stop gate will see no contradiction to raise.
+ 2. Check `.flow-agents/<slug>/evidence.json` — does the relevant check exist with kind `command` and status `pass`? The gate only cross-references checks that are explicitly recorded in `evidence.json` as command-kind claimed passes. If the check was never written there, the gate has nothing to cross-reference.
+ 3. If both files are present and consistent but the block still did not fire, verify that the artifact directory the gate found is the one you expect (`state.json` newest-mtime resolution) and that the workflow artifact has the correct type and status to be treated as active.
+
+ A gate defect is a last resort diagnosis, not a first assumption.
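Note on the Missed-Block Diagnostic above: steps 1 and 2 can be approximated by hand with a small read-only script. The sketch below is an illustration under stated assumptions, not the logic in `stop-goal-fit.js`. The field names (`checks`, `kind`, `status`, `command`, `result`) are guesses based on the prose description of the two files, and treating the latest log entry per command as authoritative is also an assumption.

```javascript
// Parse append-only JSONL text into records, skipping blank lines.
function parseJsonl(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Compare claimed-pass command checks (evidence.json) with captured results
// (command-log.jsonl). All field names here are assumed, not confirmed.
function crossReference(evidence, logRecords) {
  const latest = new Map();
  for (const record of logRecords) {
    latest.set(record.command, record); // later entries replace earlier ones
  }
  const findings = [];
  for (const check of evidence.checks ?? []) {
    // Only command-kind checks claimed as passing are cross-referenced.
    if (check.kind !== "command" || check.status !== "pass") continue;
    const seen = latest.get(check.command);
    if (!seen) {
      // Step 1: nothing captured, so there is no contradiction to raise.
      findings.push({ command: check.command, finding: "not-captured" });
    } else if (seen.result !== "pass") {
      // The capture log contradicts the claimed pass.
      findings.push({ command: check.command, finding: "contradicted" });
    }
  }
  return findings;
}
```

A `not-captured` finding points at the flow not recording evidence, while a `contradicted` finding is the case where a block would be expected. The script only reads the two files, so `command-log.jsonl` is never edited by hand.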