@kontourai/flow-agents 0.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (418)
  1. package/.githooks/pre-push +11 -0
  2. package/.github/workflows/ci.yml +210 -0
  3. package/.github/workflows/docs-pages.yml +52 -0
  4. package/.github/workflows/publish-npm.yml +104 -0
  5. package/AGENTS.md +26 -0
  6. package/CHANGELOG.md +66 -0
  7. package/CODE_OF_CONDUCT.md +25 -0
  8. package/CONTEXT.md +300 -0
  9. package/CONTRIBUTING.md +44 -0
  10. package/LICENSE +201 -0
  11. package/README.md +129 -0
  12. package/SECURITY.md +33 -0
  13. package/agent-cards/dev.json +19 -0
  14. package/agents/dev.json +127 -0
  15. package/agents/tool-code-reviewer.json +61 -0
  16. package/agents/tool-dependencies-updater.json +118 -0
  17. package/agents/tool-explore-config.json +92 -0
  18. package/agents/tool-explore-deps.json +92 -0
  19. package/agents/tool-explore-entry.json +92 -0
  20. package/agents/tool-explore-patterns.json +92 -0
  21. package/agents/tool-explore-structure.json +92 -0
  22. package/agents/tool-explore-tests.json +92 -0
  23. package/agents/tool-planner.json +57 -0
  24. package/agents/tool-playwright.json +145 -0
  25. package/agents/tool-security-reviewer.json +56 -0
  26. package/agents/tool-verifier.json +61 -0
  27. package/agents/tool-worker.json +58 -0
  28. package/build/src/cli/console-learning-projection.js +123 -0
  29. package/build/src/cli/docs-preview.js +39 -0
  30. package/build/src/cli/effective-backlog-settings.js +102 -0
  31. package/build/src/cli/export-bookmarks.js +38 -0
  32. package/build/src/cli/fixture-retirement-audit.js +140 -0
  33. package/build/src/cli/flow-kit.js +138 -0
  34. package/build/src/cli/import-bookmarks.js +50 -0
  35. package/build/src/cli/init.js +239 -0
  36. package/build/src/cli/instinct-cli.js +93 -0
  37. package/build/src/cli/promote-workflow-artifact.js +63 -0
  38. package/build/src/cli/publish-change-helper.js +154 -0
  39. package/build/src/cli/pull-work-provider.js +469 -0
  40. package/build/src/cli/runtime-adapter.js +23 -0
  41. package/build/src/cli/telemetry-doctor.js +221 -0
  42. package/build/src/cli/usage-feedback.js +443 -0
  43. package/build/src/cli/validate-hook-influence.js +152 -0
  44. package/build/src/cli/validate-source-tree.js +31 -0
  45. package/build/src/cli/validate-workflow-artifacts.js +486 -0
  46. package/build/src/cli/veritas-governance.js +262 -0
  47. package/build/src/cli/workflow-artifact-cleanup-audit.js +272 -0
  48. package/build/src/cli/workflow-sidecar.js +816 -0
  49. package/build/src/cli.js +89 -0
  50. package/build/src/flow-kit/validate.js +75 -0
  51. package/build/src/lib/args.js +45 -0
  52. package/build/src/lib/fs.js +62 -0
  53. package/build/src/lib/workflow-learning-projection.js +334 -0
  54. package/build/src/runtime-adapters.js +146 -0
  55. package/build/src/tools/build-universal-bundles.js +397 -0
  56. package/build/src/tools/common.js +56 -0
  57. package/build/src/tools/filter-installed-packs.js +132 -0
  58. package/build/src/tools/generate-context-map.js +198 -0
  59. package/build/src/tools/validate-package.js +64 -0
  60. package/build/src/tools/validate-source-tree.js +622 -0
  61. package/console.telemetry.json +176 -0
  62. package/context/base-rules.md +17 -0
  63. package/context/code-review-standards.md +62 -0
  64. package/context/coding-standards.md +42 -0
  65. package/context/common/orchestrators.md +12 -0
  66. package/context/common/subagents.md +28 -0
  67. package/context/contracts/artifact-contract.md +182 -0
  68. package/context/contracts/builder-kit-workflow-state-contract.md +319 -0
  69. package/context/contracts/delivery-contract.md +69 -0
  70. package/context/contracts/execution-contract.md +53 -0
  71. package/context/contracts/governance-adapter-contract.md +67 -0
  72. package/context/contracts/planning-contract.md +85 -0
  73. package/context/contracts/review-contract.md +104 -0
  74. package/context/contracts/sandbox-policy.md +52 -0
  75. package/context/contracts/verification-contract.md +134 -0
  76. package/context/contracts/work-item-contract.md +215 -0
  77. package/context/deferred/demo-mode.md +33 -0
  78. package/context/deferred/languages/go.md +31 -0
  79. package/context/deferred/languages/python.md +31 -0
  80. package/context/deferred/languages/typescript.md +34 -0
  81. package/context/deferred/parallelization.md +35 -0
  82. package/context/deferred/worktree-isolation.md +24 -0
  83. package/context/development-workflow.md +50 -0
  84. package/context/scripts/context-budget/budget-scan.sh +166 -0
  85. package/context/scripts/detect-tools.sh +3 -0
  86. package/context/scripts/discover-agents.sh +28 -0
  87. package/context/scripts/git-status.sh +49 -0
  88. package/context/scripts/hooks/config-protection.js +79 -0
  89. package/context/scripts/hooks/desktop-notify.sh +39 -0
  90. package/context/scripts/hooks/governance-audit.sh +135 -0
  91. package/context/scripts/hooks/lib/audit-transport.sh +40 -0
  92. package/context/scripts/hooks/lib/hook-flags.js +49 -0
  93. package/context/scripts/hooks/lib/patterns.sh +57 -0
  94. package/context/scripts/hooks/lib/resolve-formatter.js +80 -0
  95. package/context/scripts/hooks/post-edit-accumulator.js +66 -0
  96. package/context/scripts/hooks/pre-commit-quality.js +194 -0
  97. package/context/scripts/hooks/quality-gate.js +93 -0
  98. package/context/scripts/hooks/report-only-guard.js +21 -0
  99. package/context/scripts/hooks/run-hook.js +136 -0
  100. package/context/scripts/hooks/stop-format-typecheck.js +141 -0
  101. package/context/scripts/hooks/stop-goal-fit.js +337 -0
  102. package/context/scripts/hooks/workflow-steering.js +250 -0
  103. package/context/scripts/telemetry/console-presets.sh +14 -0
  104. package/context/scripts/telemetry/install-console-config.sh +214 -0
  105. package/context/scripts/telemetry/lib/config.sh +85 -0
  106. package/context/scripts/telemetry/lib/enrich.sh +115 -0
  107. package/context/scripts/telemetry/lib/redact.sh +22 -0
  108. package/context/scripts/telemetry/lib/session.sh +63 -0
  109. package/context/scripts/telemetry/lib/transport.sh +183 -0
  110. package/context/scripts/telemetry/lib/usage.sh +29 -0
  111. package/context/scripts/telemetry/sync-agents.sh +173 -0
  112. package/context/scripts/telemetry/telemetry.conf +23 -0
  113. package/context/scripts/telemetry/telemetry.sh +387 -0
  114. package/context/scripts/validate-package.sh +89 -0
  115. package/context/settings/backlog-provider-settings.json +54 -0
  116. package/context/templates/core/identity.md +26 -0
  117. package/context/templates/core/user.md +15 -0
  118. package/docs/_config.yml +15 -0
  119. package/docs/_layouts/default.html +87 -0
  120. package/docs/adr/0001-flow-agents-consumes-flow.md +77 -0
  121. package/docs/adr/0002-flow-kits-as-extension-unit.md +13 -0
  122. package/docs/adr/0003-flow-agents-coordinates-kits-and-adapters.md +13 -0
  123. package/docs/adr/0004-gates-expect-surface-claims.md +15 -0
  124. package/docs/adr/0005-kubernetes-inspired-resource-contracts.md +48 -0
  125. package/docs/adr/0006-typescript-first-source-policy.md +98 -0
  126. package/docs/agent-system-guidebook.md +391 -0
  127. package/docs/agent-usage-feedback-loop.md +351 -0
  128. package/docs/assets/favicon.svg +13 -0
  129. package/docs/assets/og-image.png +0 -0
  130. package/docs/assets/site.css +774 -0
  131. package/docs/assets/site.js +139 -0
  132. package/docs/configurable-workflow-routing.md +174 -0
  133. package/docs/context-map.md +145 -0
  134. package/docs/developer-architecture.md +145 -0
  135. package/docs/developer-hook-setup.md +61 -0
  136. package/docs/fixture-ownership.md +44 -0
  137. package/docs/flow-kit-repository-contract.md +180 -0
  138. package/docs/index.md +129 -0
  139. package/docs/kontour-resource-contract.md +358 -0
  140. package/docs/migrations.md +64 -0
  141. package/docs/north-star.md +322 -0
  142. package/docs/operating-layers.md +110 -0
  143. package/docs/repository-structure.md +132 -0
  144. package/docs/sandbox-policy.md +56 -0
  145. package/docs/skills-map.md +203 -0
  146. package/docs/standards-register.md +96 -0
  147. package/docs/veritas-integration.md +165 -0
  148. package/docs/work-item-adapters.md +72 -0
  149. package/docs/workflow-artifact-lifecycle.md +141 -0
  150. package/docs/workflow-eval-strategy.md +295 -0
  151. package/docs/workflow-shared-contracts.md +51 -0
  152. package/docs/workflow-usage-guide.md +443 -0
  153. package/evals/ARCHITECTURE.md +143 -0
  154. package/evals/CONVENTIONS.md +58 -0
  155. package/evals/README.md +128 -0
  156. package/evals/acceptance/run.sh +29 -0
  157. package/evals/acceptance/test_claude_harness.sh +242 -0
  158. package/evals/acceptance/test_codex_harness.sh +108 -0
  159. package/evals/acceptance/test_kiro_harness.sh +128 -0
  160. package/evals/cases/dev/404.html +97 -0
  161. package/evals/cases/dev/code-review.yaml +44 -0
  162. package/evals/cases/dev/dashboard.html +300 -0
  163. package/evals/cases/dev/deliver.yaml +66 -0
  164. package/evals/cases/dev/dependency-update.yaml +16 -0
  165. package/evals/cases/dev/explore.yaml +20 -0
  166. package/evals/cases/dev/index.html +370 -0
  167. package/evals/cases/dev/package-lock.json +28 -0
  168. package/evals/cases/dev/package.json +16 -0
  169. package/evals/cases/dev/plan-work.yaml +20 -0
  170. package/evals/cases/dev/promptfooconfig.yaml +666 -0
  171. package/evals/cases/dev/search-first.yaml +20 -0
  172. package/evals/cases/dev/tdd-workflow.yaml +48 -0
  173. package/evals/cases/dev/verify-work.yaml +44 -0
  174. package/evals/cases/dev/workflow.yaml +34 -0
  175. package/evals/ci/run-baseline.sh +283 -0
  176. package/evals/fixtures/backlog-provider-settings/global-default.json +44 -0
  177. package/evals/fixtures/backlog-provider-settings/project-override.json +53 -0
  178. package/evals/fixtures/builder-kit-workflow-state/baseline-freshness-resolution-hint.json +139 -0
  179. package/evals/fixtures/builder-kit-workflow-state/direct-primitive-stop.json +59 -0
  180. package/evals/fixtures/builder-kit-workflow-state/empty-board-route-shape.json +55 -0
  181. package/evals/fixtures/builder-kit-workflow-state/happy-path.json +71 -0
  182. package/evals/fixtures/builder-kit-workflow-state/mid-work-resume.json +80 -0
  183. package/evals/fixtures/builder-kit-workflow-state/missing-prestep-recovery.json +65 -0
  184. package/evals/fixtures/builder-kit-workflow-state/product-build-chaining.json +60 -0
  185. package/evals/fixtures/builder-kit-workflow-state/stale-continuation-requires-new-probe.json +57 -0
  186. package/evals/fixtures/console-learning-projection/artifacts/console-learning-correction/learning.json +50 -0
  187. package/evals/fixtures/console-learning-projection/artifacts/console-learning-open-route/learning.json +41 -0
  188. package/evals/fixtures/flow-kit-repository/invalid-absolute-path/kit.json +8 -0
  189. package/evals/fixtures/flow-kit-repository/invalid-asset-section/flows/review.flow.json +6 -0
  190. package/evals/fixtures/flow-kit-repository/invalid-asset-section/kit.json +11 -0
  191. package/evals/fixtures/flow-kit-repository/invalid-duplicate-flow/flows/review.flow.json +6 -0
  192. package/evals/fixtures/flow-kit-repository/invalid-duplicate-flow/kit.json +9 -0
  193. package/evals/fixtures/flow-kit-repository/invalid-id/flows/review.flow.json +6 -0
  194. package/evals/fixtures/flow-kit-repository/invalid-id/kit.json +8 -0
  195. package/evals/fixtures/flow-kit-repository/invalid-malformed-json/kit.json +8 -0
  196. package/evals/fixtures/flow-kit-repository/invalid-missing-flow/kit.json +8 -0
  197. package/evals/fixtures/flow-kit-repository/invalid-missing-id/flows/review.flow.json +6 -0
  198. package/evals/fixtures/flow-kit-repository/invalid-missing-id/kit.json +7 -0
  199. package/evals/fixtures/flow-kit-repository/invalid-missing-schema-version/flows/review.flow.json +6 -0
  200. package/evals/fixtures/flow-kit-repository/invalid-missing-schema-version/kit.json +7 -0
  201. package/evals/fixtures/flow-kit-repository/invalid-name/flows/review.flow.json +6 -0
  202. package/evals/fixtures/flow-kit-repository/invalid-name/kit.json +8 -0
  203. package/evals/fixtures/flow-kit-repository/invalid-schema-version/flows/review.flow.json +6 -0
  204. package/evals/fixtures/flow-kit-repository/invalid-schema-version/kit.json +8 -0
  205. package/evals/fixtures/flow-kit-repository/invalid-traversal/kit.json +8 -0
  206. package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/adapters/example.json +3 -0
  207. package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/assets/example.txt +1 -0
  208. package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/docs/README.md +3 -0
  209. package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/flows/runtime.flow.json +26 -0
  210. package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/kit-evals/example.json +3 -0
  211. package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/kit-skills/mixed/SKILL.md +3 -0
  212. package/evals/fixtures/flow-kit-repository/mixed-runtime-kit/kit.json +44 -0
  213. package/evals/fixtures/flow-kit-repository/valid-local-kit/docs/README.md +3 -0
  214. package/evals/fixtures/flow-kit-repository/valid-local-kit/flows/review.flow.json +26 -0
  215. package/evals/fixtures/flow-kit-repository/valid-local-kit/kit.json +20 -0
  216. package/evals/fixtures/hook-influence/cases.json +336 -0
  217. package/evals/fixtures/pull-work-provider/github-issues.json +170 -0
  218. package/evals/fixtures/pull-work-wip-shepherding/global-wip-informs.json +43 -0
  219. package/evals/fixtures/pull-work-wip-shepherding/personal-wip-blocks.json +42 -0
  220. package/evals/fixtures/surface-trust/accepted-claim-trust-report.json +31 -0
  221. package/evals/fixtures/surface-trust/artifact-absent.json +19 -0
  222. package/evals/fixtures/surface-trust/integrity-mismatch-trust-report.json +32 -0
  223. package/evals/fixtures/surface-trust/missing-authority-trust-report.json +27 -0
  224. package/evals/fixtures/surface-trust/provider-absent.json +19 -0
  225. package/evals/fixtures/surface-trust/rejected-claim-trust-report.json +30 -0
  226. package/evals/fixtures/surface-trust/stale-claim-trust-snapshot.json +31 -0
  227. package/evals/fixtures/usage-feedback/sample-full.jsonl +11 -0
  228. package/evals/fixtures/usage-feedback/sample-outcomes.jsonl +1 -0
  229. package/evals/fixtures/veritas-governance-adapter/fake-veritas-pass.sh +18 -0
  230. package/evals/fixtures/veritas-governance-adapter/fake-veritas-secret-fail.sh +10 -0
  231. package/evals/fixtures/veritas-governance-adapter/fake-veritas-unconfigured.sh +4 -0
  232. package/evals/integration/test_bundle_install.sh +541 -0
  233. package/evals/integration/test_console_learning_projection.sh +192 -0
  234. package/evals/integration/test_context_map.sh +65 -0
  235. package/evals/integration/test_effective_backlog_settings.sh +58 -0
  236. package/evals/integration/test_fixture_retirement_audit.sh +58 -0
  237. package/evals/integration/test_flow_agents_statusline.sh +93 -0
  238. package/evals/integration/test_flow_kit_repository.sh +90 -0
  239. package/evals/integration/test_goal_fit_hook.sh +482 -0
  240. package/evals/integration/test_hook_category_behaviors.sh +190 -0
  241. package/evals/integration/test_hook_influence_cases.sh +69 -0
  242. package/evals/integration/test_local_flow_kit_install.sh +145 -0
  243. package/evals/integration/test_publish_change_helper.sh +176 -0
  244. package/evals/integration/test_pull_work_provider.sh +140 -0
  245. package/evals/integration/test_runtime_adapter_activation.sh +106 -0
  246. package/evals/integration/test_telemetry.sh +485 -0
  247. package/evals/integration/test_telemetry_doctor.sh +193 -0
  248. package/evals/integration/test_usage_feedback_dashboard.sh +169 -0
  249. package/evals/integration/test_usage_feedback_global.sh +117 -0
  250. package/evals/integration/test_usage_feedback_import.sh +227 -0
  251. package/evals/integration/test_usage_feedback_outcomes.sh +165 -0
  252. package/evals/integration/test_usage_feedback_report.sh +263 -0
  253. package/evals/integration/test_veritas_governance_adapter.sh +235 -0
  254. package/evals/integration/test_workflow_artifact_cleanup_audit.sh +287 -0
  255. package/evals/integration/test_workflow_artifacts.sh +1247 -0
  256. package/evals/integration/test_workflow_sidecar_writer.sh +2112 -0
  257. package/evals/integration/test_workflow_steering_hook.sh +337 -0
  258. package/evals/lib/assertions/delegated-to.js +40 -0
  259. package/evals/lib/assertions/max-tool-calls.js +15 -0
  260. package/evals/lib/assertions/no-write-tools.js +27 -0
  261. package/evals/lib/assertions/pass-at-k.js +39 -0
  262. package/evals/lib/assertions/telemetry-utils.js +105 -0
  263. package/evals/lib/assertions/tool-called.js +39 -0
  264. package/evals/lib/assertions/verify-after-fix.js +61 -0
  265. package/evals/lib/claude-judge.sh +40 -0
  266. package/evals/lib/claude-provider.sh +74 -0
  267. package/evals/lib/codex-judge.sh +39 -0
  268. package/evals/lib/codex-provider.sh +81 -0
  269. package/evals/lib/eval-dev.sh +5 -0
  270. package/evals/lib/eval-judge.sh +22 -0
  271. package/evals/lib/eval-provider.sh +26 -0
  272. package/evals/lib/eval-report.sh +73 -0
  273. package/evals/lib/kiro-dev.sh +4 -0
  274. package/evals/lib/kiro-judge.sh +17 -0
  275. package/evals/lib/kiro-provider.sh +62 -0
  276. package/evals/lib/node.sh +111 -0
  277. package/evals/promptfooconfig.yaml +70 -0
  278. package/evals/run.sh +309 -0
  279. package/evals/static/test_evidence_refs.sh +141 -0
  280. package/evals/static/test_package.sh +407 -0
  281. package/evals/static/test_repo_hooks.sh +68 -0
  282. package/evals/static/test_universal_bundles.sh +274 -0
  283. package/evals/static/test_workflow_skills.sh +1207 -0
  284. package/install.sh +64 -0
  285. package/integrations/veritas/flow-agents.adapter.json +138 -0
  286. package/integrations/veritas/flow-agents.authority-settings.json +26 -0
  287. package/integrations/veritas/flow-agents.repo-standards.json +82 -0
  288. package/kits/builder/flows/build.flow.json +218 -0
  289. package/kits/builder/flows/shape.flow.json +127 -0
  290. package/kits/builder/kit.json +19 -0
  291. package/kits/catalog.json +11 -0
  292. package/package.json +130 -0
  293. package/packaging/README.md +60 -0
  294. package/packaging/manifest.json +173 -0
  295. package/packaging/packs.json +69 -0
  296. package/powers/dependency-checker/POWER.md +20 -0
  297. package/powers/dependency-checker/mcp.json +20 -0
  298. package/powers/playwright/POWER.md +25 -0
  299. package/powers/playwright/mcp.json +12 -0
  300. package/prompts/code-audit.md +123 -0
  301. package/prompts/kcommit.md +88 -0
  302. package/schemas/backlog-provider-settings.schema.json +138 -0
  303. package/schemas/workflow-acceptance.schema.json +216 -0
  304. package/schemas/workflow-critique.schema.json +113 -0
  305. package/schemas/workflow-evidence.schema.json +357 -0
  306. package/schemas/workflow-handoff.schema.json +52 -0
  307. package/schemas/workflow-learning.schema.json +223 -0
  308. package/schemas/workflow-release.schema.json +172 -0
  309. package/schemas/workflow-state.schema.json +80 -0
  310. package/scripts/README.md +111 -0
  311. package/scripts/build-universal-bundles.js +3 -0
  312. package/scripts/check-content-boundary.cjs +99 -0
  313. package/scripts/context-budget/budget-scan.sh +166 -0
  314. package/scripts/detect-tools.sh +3 -0
  315. package/scripts/discover-agents.sh +28 -0
  316. package/scripts/effective-backlog-settings.js +2 -0
  317. package/scripts/filter-installed-packs.js +2 -0
  318. package/scripts/flow-kit.js +2 -0
  319. package/scripts/generate-context-map.js +2 -0
  320. package/scripts/git-status.sh +49 -0
  321. package/scripts/hooks/claude-hook-adapter.js +174 -0
  322. package/scripts/hooks/claude-telemetry-hook.js +115 -0
  323. package/scripts/hooks/codex-hook-adapter.js +176 -0
  324. package/scripts/hooks/codex-telemetry-hook.js +95 -0
  325. package/scripts/hooks/config-protection.js +79 -0
  326. package/scripts/hooks/desktop-notify.sh +39 -0
  327. package/scripts/hooks/governance-audit.sh +135 -0
  328. package/scripts/hooks/lib/audit-transport.sh +40 -0
  329. package/scripts/hooks/lib/hook-flags.js +49 -0
  330. package/scripts/hooks/lib/patterns.sh +57 -0
  331. package/scripts/hooks/lib/resolve-formatter.js +80 -0
  332. package/scripts/hooks/post-edit-accumulator.js +66 -0
  333. package/scripts/hooks/pre-commit-quality.js +194 -0
  334. package/scripts/hooks/quality-gate.js +93 -0
  335. package/scripts/hooks/report-only-guard.js +21 -0
  336. package/scripts/hooks/run-hook.js +136 -0
  337. package/scripts/hooks/stop-format-typecheck.js +141 -0
  338. package/scripts/hooks/stop-goal-fit.js +337 -0
  339. package/scripts/hooks/workflow-steering.js +250 -0
  340. package/scripts/install-codex-home.sh +106 -0
  341. package/scripts/package.json +3 -0
  342. package/scripts/promote-workflow-artifact.js +2 -0
  343. package/scripts/publish-change-helper.js +2 -0
  344. package/scripts/pull-work-provider.js +2 -0
  345. package/scripts/setup-repo-hooks.sh +8 -0
  346. package/scripts/statusline/flow-agents-statusline.js +157 -0
  347. package/scripts/telemetry/console-presets.sh +14 -0
  348. package/scripts/telemetry/install-console-config.sh +214 -0
  349. package/scripts/telemetry/lib/config.sh +85 -0
  350. package/scripts/telemetry/lib/enrich.sh +115 -0
  351. package/scripts/telemetry/lib/redact.sh +22 -0
  352. package/scripts/telemetry/lib/session.sh +63 -0
  353. package/scripts/telemetry/lib/transport.sh +183 -0
  354. package/scripts/telemetry/lib/usage.sh +29 -0
  355. package/scripts/telemetry/sync-agents.sh +173 -0
  356. package/scripts/telemetry/telemetry.conf +23 -0
  357. package/scripts/telemetry/telemetry.sh +387 -0
  358. package/scripts/usage-feedback.js +2 -0
  359. package/scripts/validate-hook-influence-cases.js +2 -0
  360. package/scripts/validate-package.sh +89 -0
  361. package/scripts/validate-source-tree.js +9 -0
  362. package/skills/agentic-engineering/SKILL.md +62 -0
  363. package/skills/browser-test/SKILL.md +51 -0
  364. package/skills/builder-shape/SKILL.md +76 -0
  365. package/skills/context-budget/SKILL.md +40 -0
  366. package/skills/deliver/SKILL.md +241 -0
  367. package/skills/dependency-update/SKILL.md +68 -0
  368. package/skills/design-probe/SKILL.md +107 -0
  369. package/skills/eval-rebuild/SKILL.md +39 -0
  370. package/skills/evidence-gate/SKILL.md +186 -0
  371. package/skills/execute-plan/SKILL.md +110 -0
  372. package/skills/explore/SKILL.md +137 -0
  373. package/skills/feedback-loop/SKILL.md +87 -0
  374. package/skills/fix-bug/SKILL.md +133 -0
  375. package/skills/frontend-design/SKILL.md +80 -0
  376. package/skills/github-cli/SKILL.md +63 -0
  377. package/skills/idea-to-backlog/SKILL.md +267 -0
  378. package/skills/knowledge-capture/SKILL.md +55 -0
  379. package/skills/learning-review/SKILL.md +115 -0
  380. package/skills/pickup-probe/SKILL.md +114 -0
  381. package/skills/plan-work/SKILL.md +176 -0
  382. package/skills/pull-work/SKILL.md +309 -0
  383. package/skills/release-readiness/SKILL.md +121 -0
  384. package/skills/review-work/SKILL.md +161 -0
  385. package/skills/search-first/SKILL.md +66 -0
  386. package/skills/tdd-workflow/SKILL.md +140 -0
  387. package/skills/verify-work/SKILL.md +109 -0
  388. package/src/cli/console-learning-projection.ts +140 -0
  389. package/src/cli/effective-backlog-settings.ts +99 -0
  390. package/src/cli/fixture-retirement-audit.ts +154 -0
  391. package/src/cli/flow-kit.ts +139 -0
  392. package/src/cli/init.ts +248 -0
  393. package/src/cli/promote-workflow-artifact.ts +64 -0
  394. package/src/cli/publish-change-helper.ts +143 -0
  395. package/src/cli/pull-work-provider.ts +481 -0
  396. package/src/cli/runtime-adapter.ts +24 -0
  397. package/src/cli/telemetry-doctor.ts +243 -0
  398. package/src/cli/usage-feedback.ts +418 -0
  399. package/src/cli/validate-hook-influence.ts +119 -0
  400. package/src/cli/validate-source-tree.ts +30 -0
  401. package/src/cli/validate-workflow-artifacts.ts +411 -0
  402. package/src/cli/veritas-governance.ts +322 -0
  403. package/src/cli/workflow-artifact-cleanup-audit.ts +281 -0
  404. package/src/cli/workflow-sidecar.ts +676 -0
  405. package/src/cli.ts +95 -0
  406. package/src/flow-kit/validate.ts +74 -0
  407. package/src/lib/args.ts +43 -0
  408. package/src/lib/fs.ts +62 -0
  409. package/src/lib/workflow-learning-projection.ts +491 -0
  410. package/src/runtime-adapters.ts +154 -0
  411. package/src/tools/build-universal-bundles.ts +366 -0
  412. package/src/tools/common.ts +61 -0
  413. package/src/tools/filter-installed-packs.ts +129 -0
  414. package/src/tools/generate-context-map.ts +199 -0
  415. package/src/tools/validate-package.ts +57 -0
  416. package/src/tools/validate-source-tree.ts +488 -0
  417. package/tsconfig.json +19 -0
  418. package/veritas.claims.json +6 -0
@@ -0,0 +1,121 @@
1
+ ---
2
+ name: "release-readiness"
3
+ description: "Decide whether evidence-backed work is ready to merge, release, deploy, or hold. Use after evidence-gate PASS, before merge/release/deploy, and for post-deploy verification planning."
4
+ ---
5
+
6
+ # Release Readiness
7
+
8
+ Turn a clean evidence result into an explicit release, deploy, or hold decision.
9
+
10
+ Release Readiness is not Evidence Gate. Evidence Gate decides whether completed work is trustworthy enough to publish or continue fixing. Release Readiness assumes evidence is clean and then checks the real publish/release surface: committed diff, pushed branch, provider change record or explicit no-provider-change reason, provider checks, ownership, rollout timing, rollback, observability, docs, and post-deploy verification.
11
+
12
+ ## Contract
13
+
14
+ - Use only after `evidence-gate` has produced `PASS`, or explicitly record why release review is blocked.
15
+ - Use only after the verified changes have been committed, pushed, and represented by a provider change record or an explicit no-provider-change decision.
16
+ - Do not fix code or weaken release criteria.
17
+ - Do not deploy unless the user explicitly asks and the environment is clear.
18
+ - Treat merge, release, deploy, and post-deploy verification as separate gates.
19
+ - Record rollback, observability, and ownership before approving release.
20
+ - Treat final acceptance documentation as part of release readiness: CI/merge should leave durable docs or an explicit reason they are not needed.
21
+
22
+ ## Inputs
23
+
24
+ - Evidence gate artifact and verdict.
25
+ - Issue/brief, commit SHA, pushed branch, provider change link or no-provider-change reason, provider check run links, and changed-file summary.
26
+ - Release notes, migration notes, feature flags, deploy target, and rollback plan.
27
+ - Known incidents, freezes, dependency risk, and owner availability.
28
+
29
+ ## Artifact Contract
30
+
31
+ Create or update `.flow-agents/<slug>/<slug>--release-readiness.md` with:
32
+
33
+ - `release_scope`: work included, excluded, issue/provider-change links
34
+ - `evidence_reference`: evidence artifact, verdict, residual risks
35
+ - `risk_review`: migrations, data, security, dependencies, flags, compatibility
36
+ - `operational_plan`: deploy target, order, owner, timing, comms
37
+ - `rollback_plan`: trigger, steps, owner, expected recovery signal
38
+ - `observability_plan`: metrics, logs, traces, alerts, dashboards
39
+ - `post_deploy_checks`: checks, commands, URLs, timing, expected signals
40
+ - `final_acceptance_docs`: long-lived docs updated, archived `.flow-agents/<slug>/` links, and deferred docs follow-ups
41
+ - `decision`: MERGE, RELEASE, DEPLOY, HOLD, or ROLLBACK_REQUIRED
42
+
43
+ When the repository provides `npm run workflow:sidecar --`, also write `release.json` with:
44
+
45
+ ```bash
46
+ npm run workflow:sidecar -- record-release .flow-agents/<slug> \
47
+ --decision merge \
48
+ --scope "..." \
49
+ --evidence-ref evidence.json \
50
+ --gate-json '{"name":"merge","status":"pass","summary":"..."}' \
51
+ --rollback-json '{"status":"not_required","summary":"...","owner":"..."}' \
52
+ --observability-json '{"status":"not_required","summary":"..."}' \
53
+ --docs-json '{"status":"updated","summary":"..."}' \
54
+ --summary "..."
55
+ ```
56
+
57
+ Use additional `--gate-json` and `--post-deploy-json` values for release, deploy, docs, and post-deploy gates as needed.
58
+
59
+ After writing `release.json`, run artifact validation when available. If `record-release` is unavailable or blocked, keep the release decision as `HOLD` in the Markdown artifact and record the sidecar-write or validation blocker as a `NOT_VERIFIED` evidence gap until the structured release record can be written or the gap is explicitly accepted.
60
+
61
+ ## Workflow
62
+
63
+ ### 1. Confirm Evidence
64
+
65
+ Verify the evidence verdict is `PASS`, current, and tied to the release scope. If scope changed, return to `evidence-gate`.
66
+
67
+ Then verify the publish-change gate:
68
+
69
+ - verified diff is committed
70
+ - branch is pushed
71
+ - provider change record is open or updated, or a no-provider-change reason is explicitly recorded
72
+ - provider checks / CI are linked and their status is known
73
+ - GitHub PRs remain the first `ChangeProvider` adapter example: for GitHub, the provider change record is the PR and provider checks include PR checks
74
+
75
+ If these are missing, return `HOLD` and route back to `publish-change` or `evidence-gate`; do not make a merge/release/deploy decision from local verification alone. Missing provider checks also return `HOLD` unless the risk class supports accepting the gap and the no-check reason is recorded.
76
+
77
+ For Flow Agents source changes, the default provider check is the GitHub Actions `Flow Agents CI / Builder Kit Baseline` job. Its summary should list command results, artifact names, and skipped live lanes. A passing baseline supports merge readiness for deterministic workflow, docs, hook, package, and bundle checks, but it does not prove live GitHub mutation, LLM acceptance, or Veritas/governance provider evidence unless those lanes are explicitly run and linked.
78
+
79
+ ### 2. Review Release Risk
80
+
81
+ Check migrations, feature flags, config, dependency changes, compatibility, security-sensitive paths, customer impact, and deploy timing.
82
+
83
+ ### 3. Plan Operation
84
+
85
+ Record who owns merge/release/deploy, where it happens, when it happens, what communication is required, and what must not be included.
86
+
87
+ ### 4. Plan Rollback
88
+
89
+ Define rollback trigger, rollback steps, owner, data recovery limits, and the signal that confirms recovery.
90
+
91
+ ### 5. Plan Post-Deploy Verification
92
+
93
+ Map expected behavior to production-like checks, telemetry, dashboards, logs, smoke tests, or manual verification. High-risk work must have direct runtime evidence planned.
94
+
95
+ ### 6. Decide
96
+
97
+ Produce one verdict:
98
+
99
+ - `MERGE`: safe to merge, release/deploy not yet authorized.
100
+ - `RELEASE`: safe to cut or publish a release artifact.
101
+ - `DEPLOY`: safe to deploy with recorded owner and checks.
102
+ - `HOLD`: blocked by risk, timing, missing evidence, or ownership.
103
+ - `ROLLBACK_REQUIRED`: deployed state is unsafe and rollback should be considered.
104
+
105
+ ### 7. Promote Delivery Knowledge
106
+
107
+ When CI has passed and merge/release acceptance is clear, require a docs decision:
108
+
109
+ - update long-lived docs with what changed, how to use it, and important why/how decisions; or
110
+ - record why no durable docs are needed; and
111
+ - link back to the archived `.flow-agents/<slug>/` plan/session artifact for implementation history.
112
+
113
+ ## Gates
114
+
115
+ - Merge Gate: evidence is current, scope matches, CI is acceptable, and review ownership is clear.
116
+ - Release Gate: versioning, notes, compatibility, and artifact risk are clear.
117
+ - Deploy Gate: deploy owner, target, rollback, and observability are ready.
118
+ - Post-Deploy Gate: checks are scheduled or completed and signals are recorded.
119
+ - Docs Gate: durable docs are updated, intentionally skipped, or routed to an owned follow-up.
120
+
121
+ After deployment evidence exists, hand off to `learning-review` to capture outcomes.
@@ -0,0 +1,161 @@
1
+ ---
2
+ name: "review-work"
3
+ description: "Review primitive - run report-only code, security, dependency, architecture/standards, and IaC/policy critique before verification; records findings through the critique artifact/sink, currently critique.json locally."
4
+ ---
5
+
6
+ # Review
7
+
8
+ Session file in, critique verdict out. Delegates to review agents and records findings separately from verification evidence.
9
+
10
+ ## Why This Is Separate From Verify
11
+
12
+ Verification is not critique.
13
+
14
+ Review asks whether the implementation should change: maintainability, security, architecture, standards, edge cases, and risky assumptions.
15
+
16
+ Verify asks whether the behavior is proven: build, lint/types, tests, browser/runtime evidence, acceptance criteria, and Goal Fit.
17
+
18
+ Keeping them separate makes failures route cleanly:
19
+
20
+ - The critique artifact/sink says what a reviewer thinks should be fixed or accepted; the current local sidecar materialization is `critique.json`.
21
+ - `evidence.json` says what was proven, failed, or could not be verified.
22
+
23
+ ## Agents
24
+
25
+ | Agent | Role |
26
+ |---|---|
27
+ | tool-code-reviewer | Quality, maintainability, correctness, architecture fit, and project standards |
28
+ | tool-security-reviewer | Security review when risk triggers are present |
29
+ | tool-dependencies-updater | Dependency review when package manifests, dependency manifests, package manager config, or lockfiles change |
30
+ | configured architecture/domain/IaC/policy reviewer | Optional reviewer when the project or user configures one |
31
+
32
+ ## Shared Contracts
33
+
34
+ Follow:
35
+ - `context/contracts/artifact-contract.md`
36
+ - `context/contracts/review-contract.md`
37
+ - `context/contracts/planning-contract.md` for acceptance criteria and Definition Of Done context
38
+
39
+ ## Read-Only Rule (STRICT)
40
+
41
+ Reviewers NEVER modify source code:
42
+ - No code patches
43
+ - No format fixes
44
+ - No lint autofixes
45
+ - No "found and fixed"
46
+
47
+ If a fix is needed, report it as a finding. The orchestrator routes it back to execution.
48
+
49
+ ## Input
50
+
51
+ - Session file path in `.flow-agents/<slug>/` when available
52
+ - Plan artifact or implementation summary
53
+ - Modified files from execution progress or `git diff --name-only`
54
+ - Project standards, especially `context/code-review-standards.md` when present
55
+ - Security trigger context when present
56
+ - Dependency trigger context when package manifests, dependency manifests, package manager config, or lockfiles change
57
+ - IaC/policy trigger context when infrastructure, deployment, or policy files change
58
+
59
+ ## Security Review Triggers
60
+
61
+ Run `tool-security-reviewer` when modified files or the task touch:
62
+
63
+ - authentication or authorization
64
+ - user input handling, forms, query params, headers, templates, or serialization
65
+ - database queries, migrations, schemas, or persistence
66
+ - filesystem paths, uploads, downloads, archives, or generated files
67
+ - API endpoints, webhooks, external API calls, or network operations
68
+ - cryptography, token handling, secrets, payments, billing, CI, deployment, or feature flags
69
+
70
+ If trigger detection is uncertain for a substantial change, run the security reviewer or record the security review as `not_verified`.
71
+
72
+ ## Dependency Review Triggers
73
+
74
+ Delegate to `tool-dependencies-updater` when modified files include package
75
+ manifests, dependency manifests, package manager configuration, or lockfiles.
76
+ Common triggers include `package.json`, `package-lock.json`, `pnpm-lock.yaml`,
77
+ `yarn.lock`, `requirements.txt`, `pyproject.toml`, `poetry.lock`,
78
+ `Pipfile.lock`, `Cargo.toml`, `Cargo.lock`, `go.mod`, `go.sum`, `pom.xml`,
79
+ `build.gradle`, `Gemfile.lock`, `composer.lock`, NuGet lock files, Docker base
80
+ image dependency declarations, and dependency update policy files.
81
+
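A minimal sketch of dependency trigger detection, assuming the modified files arrive as repository-relative paths (for example from `git diff --name-only`). The `dependency_triggers` helper is hypothetical. It matches only the file names listed explicitly above; NuGet lock files, Docker base image declarations, and dependency update policy files would need project-specific rules.

```python
from pathlib import PurePosixPath

# File names taken from the trigger list above; anything else is out of scope here.
DEPENDENCY_FILES = frozenset({
    "package.json", "package-lock.json", "pnpm-lock.yaml", "yarn.lock",
    "requirements.txt", "pyproject.toml", "poetry.lock", "Pipfile.lock",
    "Cargo.toml", "Cargo.lock", "go.mod", "go.sum", "pom.xml",
    "build.gradle", "Gemfile.lock", "composer.lock",
})


def dependency_triggers(modified_files):
    """Return the modified paths whose base name is a known dependency file."""
    return [
        path for path in modified_files
        if PurePosixPath(path).name in DEPENDENCY_FILES
    ]


if __name__ == "__main__":
    changed = ["src/app.ts", "package.json", "services/api/go.sum"]
    # A non-empty result means the dependency lane should run.
    print(dependency_triggers(changed))
```

Matching on the base name keeps nested workspaces (such as `services/api/go.sum`) in scope without hard-coding directory layouts.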
82
+ The dependency lane is report-only. It may inspect version, advisory, and
83
+ lockfile risk using configured read-only tooling, but review-work must not add
84
+ package-registry behavior, update dependencies, or install scanners. If the
85
+ required dependency review cannot run, record the dependency lane as
86
+ `not_verified`.
87
+
88
+ ## IaC/Policy Review Triggers
89
+
90
+ Run security and configured IaC/policy review when modified files touch
91
+ infrastructure as code, policy as code, cloud/deployment config, Terraform,
92
+ OpenTofu, CloudFormation, Kubernetes manifests, Helm charts, Dockerfiles,
93
+ Compose files, GitHub Actions, CI/CD permissions, IAM, OPA/Rego, Sentinel, or
94
+ environment provisioning.
95
+
96
+ IaC/policy scanner guidance is vendor-neutral. Acceptable scanner classes
97
+ include Checkov, tfsec, Trivy, Semgrep, and project-configured policy scanners;
98
+ do not hard-require one vendor. Treat scanner output as report-only critique
99
+ input. If repo-local scanner tooling is unavailable, record the IaC/policy lane
100
+ as `not_verified` instead of installing tools or silently passing it.
101
+
102
+ ## Workflow
103
+
104
+ 1. Read the session file to find the plan artifact path and modified files.
105
+ 2. Mark review as in progress. Markdown session files may use human-readable progress labels such as `reviewing`, but machine-readable workflow sidecars must use canonical `state.status` and `state.phase` values. For review-work, keep the lifecycle phase in execution and record critique results through the critique artifact/sink, currently `critique.json` locally:
106
+
107
+ ```bash
108
+ npm run workflow:sidecar -- advance-state <artifact-dir> \
109
+ --status in_progress \
110
+ --phase execution \
111
+ --summary "Review in progress." \
112
+ --next-action "Resolve review findings, then run verify-work."
113
+ ```
114
+
115
+ 3. Delegate in parallel:
116
+
117
+ ```text
118
+ tool-code-reviewer:
119
+ - Modified files
120
+ - Plan/acceptance criteria and user outcome
121
+ - context/code-review-standards.md when present
122
+ - Architecture, standards, maintainability, and correctness focus
123
+ - todo_file path or artifact root for writing a review artifact
124
+
125
+ tool-security-reviewer (when triggered):
126
+ - Modified files
127
+ - Security-sensitive areas to inspect
128
+ - Dependency/security commands available to run read-only
129
+
130
+ tool-dependencies-updater (when dependency triggers are present):
131
+ - Modified package manifests, dependency manifests, package manager config, and lockfiles
132
+ - Plan/acceptance criteria and dependency-risk context
133
+ - Read-only dependency review focus; no dependency updates or registry behavior changes from review-work
134
+
135
+ configured IaC/policy reviewer or repo-local scanner output (when IaC/policy triggers are present):
136
+ - Modified Terraform, Kubernetes, Docker, Helm, GitHub Actions, policy-as-code, cloud, or deployment files
137
+ - Read-only scanner classes such as Checkov, tfsec, Trivy, and Semgrep when already configured
138
+ - Missing scanner or reviewer recorded as not_verified
139
+ ```
140
+
141
+ 4. Import or record reviewer results into the critique artifact/sink, currently `critique.json` locally:
142
+
143
+ ```bash
144
+ npm run workflow:sidecar -- import-critique <artifact-dir> <review-artifact> \
145
+ --reviewer tool-code-reviewer
146
+ ```
147
+
148
+ Use `record-critique` directly when the reviewer returns structured findings instead of a Markdown artifact.
149
+
150
+ 5. Route on critique status:
151
+ - **pass/comment with no blocking findings** -> proceed to `verify-work`
152
+ - **fail** -> route findings back to `execute-plan` or `plan-work`
153
+ - **not_verified** -> surface the gap for user decision or run the missing reviewer
154
+
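The routing in step 5 can be sketched as a pure function. The `route_critique` name, the `blocking_findings` count, and the returned labels are illustrative assumptions. The sketch also assumes that a `pass` or `comment` status with blocking findings routes like `fail`, which the list above does not state explicitly.

```python
def route_critique(status: str, blocking_findings: int = 0) -> str:
    """Hypothetical router: map a critique status to the next workflow step."""
    if status == "not_verified":
        # Surface the gap for a user decision, or run the missing reviewer.
        return "user-decision"
    if status == "fail" or blocking_findings > 0:
        # Findings go back to execution (or to plan-work when the plan itself is wrong).
        return "execute-plan"
    if status in ("pass", "comment"):
        return "verify-work"
    raise ValueError(f"unknown critique status: {status!r}")
```

Raising on unknown statuses keeps a typo from silently clearing the critique gate.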
155
+ ## Output
156
+
157
+ - Review artifacts written by reviewers when available
158
+ - Critique artifact/sink updated with reviewer verdicts and findings; locally this is currently `critique.json`
159
+ - Session or sidecar state updated so the next step is clear
160
+
161
+ Do not treat a clean review as proof that the feature works. It only clears the critique gate; `verify-work` still has to collect evidence.
@@ -0,0 +1,66 @@
1
+ ---
2
+ name: search-first
3
+ description: "Research-before-coding workflow. Search for existing tools, libraries, and patterns before writing custom code."
4
+ ---
5
+
6
+ # Search-First
7
+
8
+ Research before building. Every implementation task starts here.
9
+
10
+ ## Workflow
11
+
12
+ ### 1. Need Analysis
13
+ Define clearly before searching:
14
+ - What functionality is needed (inputs, outputs, behavior)
15
+ - Language and framework constraints
16
+ - Performance, size, or license requirements
17
+
18
+ ### 2. Parallel Search
19
+ Search all sources simultaneously:
20
+ - **Codebase** — grep/code search for existing implementations or utilities
21
+ - **Package registries** — npm (`npmjs.com`), PyPI (`pypi.org`), crates.io, Go modules
22
+ - **GitHub** — code search for patterns, reference implementations
23
+ - **Web** — blog posts, Stack Overflow, official docs for recommended approaches
24
+
25
+ ### 3. Evaluate Candidates
26
+
27
+ Score each candidate (1-5) on:
28
+
29
+ | Criterion | Weight | What to check |
30
+ |-----------|--------|---------------|
31
+ | Functionality | High | Does it solve the actual need? |
32
+ | Maintenance | High | Last release, open issues, bus factor |
33
+ | Community | Medium | Downloads, stars, dependents |
34
+ | Documentation | Medium | API docs, examples, migration guides |
35
+ | License | High | Compatible with project? (MIT/Apache preferred) |
36
+ | Dependencies | Medium | Transitive dep count, known vulnerabilities |
37
+
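One way to combine the scores is a weighted average. The table gives only High and Medium labels, so the numeric weights below (High = 3, Medium = 2) and the `weighted_score` helper are assumptions for illustration.

```python
# Assumed numeric weights: High = 3, Medium = 2 (the table only names the levels).
WEIGHTS = {
    "functionality": 3,
    "maintenance": 3,
    "community": 2,
    "documentation": 2,
    "license": 3,
    "dependencies": 2,
}


def weighted_score(scores: dict) -> float:
    """Return the weighted average of 1-5 scores across all six criteria."""
    for criterion in WEIGHTS:
        value = scores[criterion]  # KeyError if a criterion was not scored
        if not 1 <= value <= 5:
            raise ValueError(f"{criterion} score must be between 1 and 5")
    total = sum(WEIGHTS[c] * scores[c] for c in WEIGHTS)
    return total / sum(WEIGHTS.values())


if __name__ == "__main__":
    candidate = {
        "functionality": 5, "maintenance": 4, "community": 3,
        "documentation": 4, "license": 5, "dependencies": 2,
    }
    print(weighted_score(candidate))  # 60 / 15 = 4.0
```

A single number is only a tie-breaker: a failing License or Functionality score should still rule a candidate out regardless of the average.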
38
+ ### 4. Decide
39
+
40
+ - **Adopt** — exact match, well-maintained, good community → install and use directly
41
+ - **Extend** — partial match, solid core → wrap with thin adapter layer
42
+ - **Build** — nothing suitable, unique requirements → write minimal custom code
43
+
44
+ ### 5. Implement
45
+ - Adopt: install package, write integration code
46
+ - Extend: install package, write wrapper/adapter
47
+ - Build: write minimal custom implementation, document why existing solutions were rejected
48
+
49
+ ## Search Shortcuts
50
+
51
+ | Category | First check |
52
+ |----------|-------------|
53
+ | HTTP client | axios (JS), httpx (Python), net/http (Go) |
54
+ | Validation | Zod (TS), Pydantic (Python), validator (Go) |
55
+ | CLI parsing | commander/yargs (JS), click/typer (Python), cobra (Go) |
56
+ | Testing | Jest/Vitest (TS), pytest (Python), testing (Go) |
57
+ | Date/time | date-fns (JS), pendulum (Python), time (Go) |
58
+ | Logging | pino/winston (JS), structlog (Python), slog (Go) |
59
+
60
+ ## Anti-Patterns
61
+
62
+ - **Jumping to code** — writing custom implementations without searching first
63
+ - **Ignoring existing solutions** — the codebase already has a utility for this
64
+ - **Over-customizing** — wrapping a library so heavily it's harder than building from scratch
65
+ - **Dependency bloat** — adding a 50KB package for one function (just copy the function)
66
+ - **Stale picks** — choosing unmaintained packages because they were popular 3 years ago
@@ -0,0 +1,140 @@
1
+ ---
2
+ name: "tdd-workflow"
3
+ description: "Test-driven development — RED → GREEN → REFACTOR with git checkpoints. Wraps plan-work → execute-plan → review-work → verify-work with test-first constraints and coverage gates."
4
+ ---
5
+
6
+ # TDD Workflow
7
+
8
+ Test-driven development orchestrator. Wraps the standard plan → execute → review → verify chain with test-first constraints.
9
+
10
+ ## When to Activate
11
+
12
+ - User mentions "TDD", "test-driven", or "write tests first"
13
+ - User asks to build something and mentions test coverage requirements
14
+
15
+ ## Agents
16
+
17
+ Same as deliver (inherited from primitives):
18
+
19
+ | Agent | Used by |
20
+ |---|---|
21
+ | tool-planner | plan-work (with TDD constraints) |
22
+ | tool-worker (x4) | execute-plan (tests first, then implementation) |
23
+ | tool-code-reviewer | review-work |
24
+ | tool-security-reviewer | review-work (conditional) |
25
+ | tool-verifier | verify-work (with coverage check) |
26
+ | tool-playwright | verify-work (if UI) |
27
+
28
+ ## Orchestrator Rule
29
+
30
+ Same as deliver: you never touch source files. You coordinate the primitives with TDD-specific context.
31
+
32
+ ## Workflow
33
+
34
+ ### 1. Create session file
35
+
36
+ Filename: `<branch>--tdd-<slug>.md`
37
+ Set `status: planning`, `type: tdd`, `iteration: 0`
38
+
39
+ ### 2. Plan (plan-work with TDD constraint)
40
+
41
+ Invoke plan-work with an additional constraint:
42
+ ```
43
+ Constraint: TEST-FIRST DEVELOPMENT
44
+ - Plan MUST include test files as separate tasks in Wave 1
45
+ - Each feature task must have a corresponding test task that precedes it
46
+ - Test tasks specify: test file path, test cases to write, expected failures
47
+ - Implementation tasks specify: which tests they make pass
48
+ - Include a final "coverage check" task
49
+ ```
50
+
51
+ Present plan to user. Get approval.
52
+
53
+ ### 3. Execute RED phase
54
+
55
+ Invoke execute-plan for Wave 1 only (test tasks):
56
+ - tool-worker writes test files
57
+ - After Wave 1 completes, run the tests — they MUST fail (RED)
58
+ - If tests pass (no RED state), the tests are wrong — flag to user
59
+ - Git checkpoint: `test: add failing tests for <feature>`
60
+
61
+ ### 4. Execute GREEN phase
62
+
63
+ Invoke execute-plan for Wave 2 (implementation tasks):
64
+ - tool-worker writes minimal code to make tests pass
65
+ - After Wave 2 completes, run the tests — they MUST pass (GREEN)
66
+ - If tests still fail, loop: re-invoke execute-plan with failure context
67
+ - Git checkpoint: `feat: implement <feature> (tests passing)`
68
+
69
+ ### 5. Execute REFACTOR phase
70
+
71
+ Invoke execute-plan for Wave 3 (refactor tasks, if any):
72
+ - tool-worker improves code quality while keeping tests green
73
+ - After Wave 3, run tests again — must still pass
74
+ - Git checkpoint: `refactor: clean up <feature>`
75
+
76
+ ### 6. Review (review-work)
77
+
78
+ Invoke `review-work` after GREEN/REFACTOR and before verification. Review findings must be fixed, accepted, deferred, or marked false positive before delivery.
79
+
80
+ ### 7. Verify (verify-work with coverage gate)
81
+
82
+ Invoke verify-work with additional context:
83
+ ```
84
+ Additional verification: Check test coverage.
85
+ Run coverage command and verify >= 80% on changed files.
86
+ Include coverage % in the verification report.
87
+ If coverage < 80%, verdict is FAIL with coverage gap details.
88
+ ```
89
+
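A sketch of the coverage gate, assuming the coverage tool can report covered and total line counts per file. The text above does not say whether 80% applies per file or in aggregate; this example takes the aggregate over changed files, and treats a changed file with no coverage data as `NOT_VERIFIED`. Both choices, and the `coverage_gate` name, are assumptions.

```python
def coverage_gate(per_file, changed_files, target=80.0):
    """Return (verdict, percent) for aggregate line coverage on changed files.

    per_file maps a path to a (covered_lines, total_lines) tuple.
    """
    covered = total = 0
    for path in changed_files:
        if path not in per_file:
            # Assumption: missing data is a gap, not a pass or a fail.
            return ("NOT_VERIFIED", None)
        file_covered, file_total = per_file[path]
        covered += file_covered
        total += file_total
    if total == 0:
        return ("NOT_VERIFIED", None)
    percent = round(100 * covered / total, 1)
    return ("PASS" if percent >= target else "FAIL", percent)


if __name__ == "__main__":
    report = {"src/a.py": (45, 50), "src/b.py": (30, 50)}
    print(coverage_gate(report, ["src/a.py", "src/b.py"]))  # ('FAIL', 75.0)
```

Returning the percentage alongside the verdict lets the verification report include the coverage figure and the gap details in one place.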
90
+ ### 8. Route on verdict
91
+
92
+ Same as deliver:
93
+ - **Clean review + all PASS + coverage >= 80%** → deliver
94
+ - **Any FAIL or coverage < 80%** → loop (re-plan failing items)
95
+ - **NOT_VERIFIED** → surface to user
96
+
97
+ ### 9. Deliver
98
+
99
+ Same as deliver, plus:
100
+ - Report TDD cycle summary: RED → GREEN → REFACTOR with checkpoint SHAs
101
+ - Report final coverage %
102
+
103
+ ## Session File Format
104
+
105
+ ```markdown
106
+ # TDD: <Goal one-liner>
107
+
108
+ branch: <branch>
109
+ created: <date>
110
+ status: planning | red | green | refactor | verifying | delivered
111
+ type: tdd
112
+ iteration: 0
113
+ coverage_target: 80
114
+
115
+ ## Plan
116
+ (from plan-work)
117
+
118
+ ## RED Phase
119
+ - Tests written: <list>
120
+ - All failing: YES/NO
121
+ - Checkpoint: <SHA>
122
+
123
+ ## GREEN Phase
124
+ - Implementation: <list>
125
+ - All passing: YES/NO
126
+ - Checkpoint: <SHA>
127
+
128
+ ## REFACTOR Phase
129
+ - Changes: <list>
130
+ - Tests still passing: YES/NO
131
+ - Checkpoint: <SHA>
132
+
133
+ ## Verification Report
134
+ (from verify-work)
135
+
136
+ ## History
137
+ - iteration 1: RED ✓, GREEN ✓, REFACTOR ✓, coverage 85%
138
+ ```
139
+
140
+ {context?}
@@ -0,0 +1,109 @@
1
+ ---
2
+ name: "verify-work"
3
+ description: "Verification primitive — session file path to structured evidence verdict via tool-verifier + tool-playwright. Reads acceptance criteria from plan artifact."
4
+ ---
5
+
6
+ # Verify
7
+
8
+ Session file in, structured evidence verdict out. Delegates to tool-verifier and tool-playwright.
9
+
10
+ Verification is not critique. Run `review-work` first when the task needs maintainability, security, architecture, or standards review. Verification should start only after the required critique gate has been recorded or explicitly marked `not_verified`. `verify-work` records proof in `evidence.json`; `review-work` records critique through the critique artifact/sink, currently `critique.json` locally.
11
+
12
+ ## Agents
13
+
14
+ | Agent | Role |
15
+ |---|---|
16
+ | tool-verifier | Code verification, acceptance criteria checking, structured verdicts |
17
+ | tool-playwright | Visual verification, screenshots, accessibility checks |
18
+
19
+ ## Orchestrator Rule
20
+
21
+ You do not review source files. You delegate to tool-verifier and tool-playwright, then read the verdict artifact.
22
+
23
+ ## Shared Contracts
24
+
25
+ Follow:
26
+ - `context/contracts/artifact-contract.md`
27
+ - `context/contracts/verification-contract.md`
28
+ - `context/contracts/planning-contract.md` for acceptance criteria and Definition Of Done
29
+
30
+ This skill owns orchestration and routing. The verification contract owns phases, report-only behavior, verdict rules, report shape, Goal Fit checks, and `NOT_VERIFIED` handling.
31
+
32
+ ## Read-Only Rule (STRICT)
33
+
34
+ **Verifiers NEVER modify source code.** tool-verifier and tool-playwright are read-only reporters:
35
+ - They may run commands (build, test, lint) but NEVER apply fixes
36
+ - No format fixes, no lint auto-fixes, no "1 format fix applied"
37
+ - No code patches, no "found and fixed" — report findings only
38
+ - If a fix is needed, report it as a finding. The orchestrator routes it back to execute-plan.
39
+
40
+ ## Input
41
+
42
+ - **Session file path**: the session file in `.flow-agents/<slug>/` (preferred)
43
+ - The session file references the plan artifact (which has acceptance criteria) and execution progress (which has modified files)
44
+ - If NO session file exists, delegate to tool-verifier directly (see Standalone Verification below)
45
+
46
+ ## Standalone Verification (no session file)
47
+
48
+ When invoked without a session file (e.g., user says "verify this project" or "run verification"):
49
+
50
+ 1. Delegate to tool-verifier with:
51
+ - The user's verification request
52
+ - The current working directory
53
+ - Modified files from `git diff --name-only` (if available)
54
+ 2. Delegate to tool-playwright in parallel if UI changes are mentioned
55
+ 3. Read the verdict and report to the user
56
+
57
+ Skip session file lookup — go straight to delegation.
58
+
59
+ ## Workflow (with session file)
60
+
61
+ 1. Read the session file to find the plan artifact path and modified files
62
+ 2. Confirm the review-before-verify gate: the critique artifact/sink should show the required review pass, blocking findings, or an explicit `not_verified` gap. If the critique gate is missing for work that requires review, stop and route to `review-work` instead of treating verification as a substitute critique.
63
+ 3. Set session file `status: verifying` and update `state.json` phase/status. Use `npm run workflow:sidecar -- advance-state <artifact-dir> --status verifying --phase verification --summary ... --next-action ...` when the repository provides it.
64
+ 4. Delegate in parallel:
65
+ ```
66
+ tool-verifier:
67
+ - Acceptance criteria from plan artifact
68
+ - Acceptance criteria from acceptance.json when present
69
+ - Definition Of Done and stop-short risks from plan artifact
70
+ - Modified files from execution progress
71
+ - Requirement to preserve each AC id and map it to command/test evidence plus structured source evidence refs when implementation behavior is claimed
72
+ - Evidence ref schema: objects with `kind`, `url`, `file`, `line_start`, `line_end`, and `excerpt` where applicable; source refs require local file/line/excerpt and should use immutable GitHub blob permalinks pinned to a commit SHA when provider URLs are available
73
+ - Build/test commands from AGENTS.md or plan
74
+ - todo_file path for writing verdict artifact
75
+ - Workflow artifact root path; append verifier progress with record-agent-event
76
+
77
+ tool-playwright (if UI changes exist):
78
+ - Pages/components to check
79
+ - Expected visual state
80
+ - Workflow artifact root path; append browser evidence or blockers with record-agent-event
81
+ ```
82
+ 5. Read the verdict artifact: `<session-basename>-review.md`
83
+ 6. Update session file: paste verdict summary into `## Verification Report`
84
+ 7. Write or update `evidence.json` with verification checks, top-level verdict, and `not_verified_gaps`
85
+ - use `npm run workflow:sidecar -- record-evidence <artifact-dir> --verdict ... --check-json ...` when the repository provides it
86
+ - `checks[].artifact_refs` must use structured evidence ref objects, not legacy strings
87
+ 8. Update `acceptance.json` with criterion statuses and structured evidence refs
88
+ - `criteria[].evidence_refs` must use structured evidence refs and map each AC id to command/test proof plus source refs for behavior claims
89
+ - when source refs are missing for a behavior claim, mark the criterion `not_verified` or record an accepted gap instead of using broad prose-only evidence
90
+ 9. Route on verdicts:
91
+ - **All PASS** → set `status: verified`
92
+ - **Any FAIL** → set `status: failed`, list failures
93
+ - **Any NOT_VERIFIED** → set Markdown status `needs-decision`, set `state.json` status `needs_decision`, and surface to user
94
+
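The routing in step 9 can be sketched as a roll-up over per-check verdicts. The list above does not say which rule wins when a run contains both `FAIL` and `NOT_VERIFIED`; this example assumes `FAIL` takes precedence, and that an empty check set is a gap rather than a pass. The `route_verdicts` helper is hypothetical and returns the Markdown status only.

```python
def route_verdicts(checks: dict) -> tuple:
    """Return (markdown_status, check_ids) from a mapping of check id to verdict."""
    allowed = {"PASS", "FAIL", "NOT_VERIFIED"}
    unknown = sorted(k for k, v in checks.items() if v not in allowed)
    if unknown:
        raise ValueError(f"unknown verdict for checks: {unknown}")
    failures = sorted(k for k, v in checks.items() if v == "FAIL")
    gaps = sorted(k for k, v in checks.items() if v == "NOT_VERIFIED")
    if failures:
        # Assumption: FAIL outranks NOT_VERIFIED when both are present.
        return ("failed", failures)
    if gaps or not checks:
        # state.json would carry `needs_decision` for the same outcome.
        return ("needs-decision", gaps)
    return ("verified", [])
```

For example, `route_verdicts({"AC-1": "PASS", "AC-2": "NOT_VERIFIED"})` returns `("needs-decision", ["AC-2"])`; the `AC-n` ids are placeholders.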
95
+ ## Verification Contract
96
+
97
+ tool-verifier writes the verdict artifact using `context/contracts/verification-contract.md`.
98
+
99
+ You do not override verdicts. FAIL is FAIL until re-verified. `NOT_VERIFIED` items are surfaced to the user so they can decide whether to accept, fix, or skip. A technically green build is not enough for PASS when the `Definition Of Done` says the user still cannot run, understand, inspect, or act on the result.
100
+
101
+ ## Output
102
+
103
+ - Verdict artifact: `<session-basename>-review.md`
104
+ - Session file updated with verification report and status
105
+ - Structured sidecars updated: `state.json`, `acceptance.json`, and `evidence.json`
106
+ - Acceptance evidence preserves AC ids and uses structured evidence refs; prose-only behavior claims are not clean verification evidence
107
+ - Verdict follows `context/contracts/verification-contract.md`
108
+
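A minimal sketch of a structured evidence ref check, using the field names from the evidence ref schema above (`kind`, `url`, `file`, `line_start`, `line_end`, `excerpt`). The `"source"` kind value, the `EvidenceRef` class, and the `source_ref_problems` helper are assumptions; the real schema lives in the verification contract. The URL pattern reflects the usual shape of a GitHub blob permalink pinned to a 40-character commit SHA.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# github.com/<owner>/<repo>/blob/<40-hex commit SHA>/<path>
PERMALINK = re.compile(r"^https://github\.com/[^/]+/[^/]+/blob/[0-9a-f]{40}/")


@dataclass
class EvidenceRef:
    kind: str  # "source" is an assumed value for source-code refs
    url: Optional[str] = None
    file: Optional[str] = None
    line_start: Optional[int] = None
    line_end: Optional[int] = None
    excerpt: Optional[str] = None


def source_ref_problems(ref: EvidenceRef) -> List[str]:
    """List what a source ref is missing; other kinds are not checked here."""
    if ref.kind != "source":
        return []
    problems = []
    # Source refs require a local file, line, and excerpt.
    for field_name in ("file", "line_start", "excerpt"):
        if getattr(ref, field_name) in (None, ""):
            problems.append(f"missing {field_name}")
    if None not in (ref.line_start, ref.line_end) and ref.line_end < ref.line_start:
        problems.append("line_end before line_start")
    # Provider URLs should be immutable permalinks, not branch links.
    if ref.url and not PERMALINK.match(ref.url):
        problems.append("url is not pinned to a commit SHA")
    return problems
```

A criterion whose source refs return any problems would be marked `not_verified` or recorded as an accepted gap, rather than passed on prose alone.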
109
+ If `record-evidence` or artifact validation is unavailable or blocked, keep the verdict explicit and record the sidecar-write gap as `NOT_VERIFIED`. Do not convert verifier output into `PASS` without structured evidence when sidecars are required.