@event4u/agent-config 1.13.0 → 1.15.0

Files changed (291)
  1. package/.agent-src/commands/agent-handoff.md +4 -1
  2. package/.agent-src/commands/agent-status.md +3 -0
  3. package/.agent-src/commands/agents-audit.md +4 -0
  4. package/.agent-src/commands/agents-cleanup.md +6 -1
  5. package/.agent-src/commands/agents-prepare.md +3 -0
  6. package/.agent-src/commands/analyze-reference-repo.md +4 -0
  7. package/.agent-src/commands/bug-fix.md +7 -3
  8. package/.agent-src/commands/bug-investigate.md +4 -0
  9. package/.agent-src/commands/chat-history-checkpoint.md +126 -0
  10. package/.agent-src/commands/chat-history-clear.md +6 -1
  11. package/.agent-src/commands/chat-history-resume.md +7 -2
  12. package/.agent-src/commands/chat-history.md +7 -2
  13. package/.agent-src/commands/check-current-md.md +137 -0
  14. package/.agent-src/commands/commit-in-chunks.md +118 -0
  15. package/.agent-src/commands/commit.md +4 -0
  16. package/.agent-src/commands/compress.md +37 -2
  17. package/.agent-src/commands/context-create.md +4 -0
  18. package/.agent-src/commands/context-refactor.md +4 -0
  19. package/.agent-src/commands/copilot-agents-init.md +3 -0
  20. package/.agent-src/commands/copilot-agents-optimize.md +3 -0
  21. package/.agent-src/commands/create-pr-description.md +4 -0
  22. package/.agent-src/commands/create-pr.md +4 -0
  23. package/.agent-src/commands/do-and-judge.md +4 -1
  24. package/.agent-src/commands/do-in-steps.md +3 -0
  25. package/.agent-src/commands/e2e-heal.md +4 -0
  26. package/.agent-src/commands/e2e-plan.md +4 -0
  27. package/.agent-src/commands/estimate-ticket.md +4 -1
  28. package/.agent-src/commands/feature-dev.md +4 -0
  29. package/.agent-src/commands/feature-explore.md +4 -0
  30. package/.agent-src/commands/feature-plan.md +4 -0
  31. package/.agent-src/commands/feature-refactor.md +4 -0
  32. package/.agent-src/commands/feature-roadmap.md +6 -0
  33. package/.agent-src/commands/fix-ci.md +4 -0
  34. package/.agent-src/commands/fix-portability.md +5 -2
  35. package/.agent-src/commands/fix-pr-bot-comments.md +4 -0
  36. package/.agent-src/commands/fix-pr-comments.md +4 -0
  37. package/.agent-src/commands/fix-pr-developer-comments.md +4 -0
  38. package/.agent-src/commands/fix-references.md +3 -0
  39. package/.agent-src/commands/fix-seeder.md +4 -0
  40. package/.agent-src/commands/implement-ticket.md +39 -13
  41. package/.agent-src/commands/jira-ticket.md +4 -0
  42. package/.agent-src/commands/judge.md +3 -0
  43. package/.agent-src/commands/memory-add.md +5 -3
  44. package/.agent-src/commands/memory-full.md +5 -2
  45. package/.agent-src/commands/memory-promote.md +7 -6
  46. package/.agent-src/commands/mode.md +3 -0
  47. package/.agent-src/commands/module-create.md +4 -0
  48. package/.agent-src/commands/module-explore.md +4 -0
  49. package/.agent-src/commands/onboard.md +33 -0
  50. package/.agent-src/commands/optimize-agents.md +4 -0
  51. package/.agent-src/commands/optimize-augmentignore.md +12 -0
  52. package/.agent-src/commands/optimize-rtk-filters.md +3 -0
  53. package/.agent-src/commands/optimize-skills.md +4 -0
  54. package/.agent-src/commands/override-create.md +4 -0
  55. package/.agent-src/commands/override-manage.md +4 -0
  56. package/.agent-src/commands/package-reset.md +3 -0
  57. package/.agent-src/commands/package-test.md +3 -0
  58. package/.agent-src/commands/prepare-for-review.md +4 -0
  59. package/.agent-src/commands/project-analyze.md +4 -0
  60. package/.agent-src/commands/project-health.md +4 -0
  61. package/.agent-src/commands/propose-memory.md +6 -8
  62. package/.agent-src/commands/quality-fix.md +4 -0
  63. package/.agent-src/commands/refine-ticket.md +12 -7
  64. package/.agent-src/commands/review-changes.md +39 -8
  65. package/.agent-src/commands/review-routing.md +4 -0
  66. package/.agent-src/commands/roadmap-create.md +18 -0
  67. package/.agent-src/commands/roadmap-execute.md +14 -1
  68. package/.agent-src/commands/rule-compliance-audit.md +4 -0
  69. package/.agent-src/commands/set-cost-profile.md +11 -0
  70. package/.agent-src/commands/sync-agent-settings.md +12 -0
  71. package/.agent-src/commands/sync-gitignore.md +3 -0
  72. package/.agent-src/commands/tests-create.md +4 -0
  73. package/.agent-src/commands/tests-execute.md +6 -3
  74. package/.agent-src/commands/threat-model.md +4 -0
  75. package/.agent-src/commands/update-form-request-messages.md +4 -0
  76. package/.agent-src/commands/upstream-contribute.md +4 -0
  77. package/.agent-src/commands/work.md +161 -0
  78. package/.agent-src/guidelines/agent-infra/engineering-memory-data-format.md +2 -6
  79. package/.agent-src/guidelines/agent-infra/layered-settings.md +0 -1
  80. package/.agent-src/guidelines/agent-infra/memory-access.md +0 -7
  81. package/.agent-src/guidelines/agent-infra/role-contracts.md +2 -4
  82. package/.agent-src/guidelines/agent-infra/self-improvement-pipeline.md +0 -1
  83. package/.agent-src/guidelines/php/patterns/strategy.md +180 -2
  84. package/.agent-src/personas/README.md +0 -1
  85. package/.agent-src/rules/artifact-drafting-protocol.md +7 -2
  86. package/.agent-src/rules/artifact-engagement-recording.md +133 -0
  87. package/.agent-src/rules/ask-when-uncertain.md +18 -13
  88. package/.agent-src/rules/augment-portability.md +64 -37
  89. package/.agent-src/rules/autonomous-execution.md +158 -0
  90. package/.agent-src/rules/chat-history-cadence.md +109 -0
  91. package/.agent-src/rules/chat-history-ownership.md +123 -0
  92. package/.agent-src/rules/chat-history-visibility.md +96 -0
  93. package/.agent-src/rules/cli-output-handling.md +27 -4
  94. package/.agent-src/rules/command-suggestion.md +134 -0
  95. package/.agent-src/rules/commit-policy.md +109 -0
  96. package/.agent-src/rules/direct-answers.md +114 -0
  97. package/.agent-src/rules/docs-sync.md +36 -0
  98. package/.agent-src/rules/downstream-changes.md +10 -9
  99. package/.agent-src/rules/improve-before-implement.md +9 -6
  100. package/.agent-src/rules/language-and-tone.md +85 -6
  101. package/.agent-src/rules/non-destructive-by-default.md +117 -0
  102. package/.agent-src/rules/package-ci-checks.md +4 -0
  103. package/.agent-src/rules/preservation-guard.md +20 -0
  104. package/.agent-src/rules/roadmap-progress-sync.md +159 -27
  105. package/.agent-src/rules/role-mode-adherence.md +1 -1
  106. package/.agent-src/rules/scope-control.md +42 -1
  107. package/.agent-src/rules/size-enforcement.md +2 -3
  108. package/.agent-src/rules/skill-quality.md +3 -8
  109. package/.agent-src/rules/ui-audit-before-build.md +106 -0
  110. package/.agent-src/rules/user-interaction.md +107 -51
  111. package/.agent-src/scripts/update_roadmap_progress.py +73 -9
  112. package/.agent-src/skills/blade-ui/SKILL.md +47 -3
  113. package/.agent-src/skills/command-routing/SKILL.md +32 -0
  114. package/.agent-src/skills/command-writing/SKILL.md +52 -2
  115. package/.agent-src/skills/description-assist/SKILL.md +21 -0
  116. package/.agent-src/skills/estimate-ticket/SKILL.md +0 -1
  117. package/.agent-src/skills/existing-ui-audit/SKILL.md +202 -0
  118. package/.agent-src/skills/fe-design/SKILL.md +78 -61
  119. package/.agent-src/skills/file-editor/SKILL.md +9 -0
  120. package/.agent-src/skills/finishing-a-development-branch/SKILL.md +4 -0
  121. package/.agent-src/skills/flux/SKILL.md +31 -4
  122. package/.agent-src/skills/guideline-writing/SKILL.md +24 -2
  123. package/.agent-src/skills/learning-to-rule-or-skill/SKILL.md +51 -9
  124. package/.agent-src/skills/livewire/SKILL.md +49 -4
  125. package/.agent-src/skills/md-language-check/SKILL.md +103 -0
  126. package/.agent-src/skills/php-coder/SKILL.md +24 -0
  127. package/.agent-src/skills/react-shadcn-ui/SKILL.md +121 -0
  128. package/.agent-src/skills/refine-prompt/SKILL.md +220 -0
  129. package/.agent-src/skills/refine-ticket/SKILL.md +32 -28
  130. package/.agent-src/skills/roadmap-management/SKILL.md +24 -11
  131. package/.agent-src/skills/rule-writing/SKILL.md +23 -1
  132. package/.agent-src/skills/skill-writing/SKILL.md +3 -5
  133. package/.agent-src/skills/upstream-contribute/SKILL.md +3 -3
  134. package/.agent-src/skills/using-git-worktrees/SKILL.md +3 -1
  135. package/.agent-src/templates/AGENTS.md +24 -6
  136. package/.agent-src/templates/agent-settings.md +149 -0
  137. package/.agent-src/templates/roadmaps.md +11 -4
  138. package/.agent-src/templates/scripts/implement_ticket/__init__.py +63 -26
  139. package/.agent-src/templates/scripts/implement_ticket/__main__.py +8 -2
  140. package/.agent-src/templates/scripts/memory_lookup.py +1 -1
  141. package/.agent-src/templates/scripts/telemetry/__init__.py +42 -0
  142. package/.agent-src/templates/scripts/telemetry/aggregator.py +154 -0
  143. package/.agent-src/templates/scripts/telemetry/boundary.py +171 -0
  144. package/.agent-src/templates/scripts/telemetry/engagement.py +238 -0
  145. package/.agent-src/templates/scripts/telemetry/report_renderer.py +170 -0
  146. package/.agent-src/templates/scripts/telemetry/settings.py +112 -0
  147. package/.agent-src/templates/scripts/telemetry_record.py +166 -0
  148. package/.agent-src/templates/scripts/telemetry_report.py +161 -0
  149. package/.agent-src/templates/scripts/telemetry_status.py +142 -0
  150. package/.agent-src/templates/scripts/work_engine/__init__.py +58 -0
  151. package/.agent-src/templates/scripts/work_engine/__main__.py +9 -0
  152. package/.agent-src/templates/scripts/work_engine/cli.py +195 -0
  153. package/.agent-src/templates/scripts/work_engine/cli_args.py +116 -0
  154. package/.agent-src/templates/scripts/{implement_ticket → work_engine}/delivery_state.py +10 -3
  155. package/.agent-src/templates/scripts/work_engine/directives/__init__.py +33 -0
  156. package/.agent-src/templates/scripts/work_engine/directives/backend/__init__.py +98 -0
  157. package/.agent-src/templates/scripts/{implement_ticket/steps → work_engine/directives/backend}/analyze.py +1 -1
  158. package/.agent-src/templates/scripts/{implement_ticket/steps → work_engine/directives/backend}/implement.py +3 -3
  159. package/.agent-src/templates/scripts/{implement_ticket/steps → work_engine/directives/backend}/memory.py +2 -2
  160. package/.agent-src/templates/scripts/{implement_ticket/steps → work_engine/directives/backend}/plan.py +2 -2
  161. package/.agent-src/templates/scripts/work_engine/directives/backend/refine.py +396 -0
  162. package/.agent-src/templates/scripts/{implement_ticket/steps → work_engine/directives/backend}/report.py +37 -5
  163. package/.agent-src/templates/scripts/{implement_ticket/steps → work_engine/directives/backend}/test.py +2 -2
  164. package/.agent-src/templates/scripts/{implement_ticket/steps → work_engine/directives/backend}/verify.py +2 -2
  165. package/.agent-src/templates/scripts/work_engine/directives/mixed/__init__.py +116 -0
  166. package/.agent-src/templates/scripts/work_engine/directives/mixed/contract.py +254 -0
  167. package/.agent-src/templates/scripts/work_engine/directives/mixed/stitch.py +229 -0
  168. package/.agent-src/templates/scripts/work_engine/directives/mixed/ui.py +231 -0
  169. package/.agent-src/templates/scripts/work_engine/directives/ui/__init__.py +113 -0
  170. package/.agent-src/templates/scripts/work_engine/directives/ui/_passthrough.py +44 -0
  171. package/.agent-src/templates/scripts/work_engine/directives/ui/apply.py +241 -0
  172. package/.agent-src/templates/scripts/work_engine/directives/ui/audit.py +414 -0
  173. package/.agent-src/templates/scripts/work_engine/directives/ui/design.py +335 -0
  174. package/.agent-src/templates/scripts/work_engine/directives/ui/polish.py +510 -0
  175. package/.agent-src/templates/scripts/work_engine/directives/ui/review.py +468 -0
  176. package/.agent-src/templates/scripts/work_engine/directives/ui_trivial/__init__.py +119 -0
  177. package/.agent-src/templates/scripts/work_engine/directives/ui_trivial/_skipped.py +37 -0
  178. package/.agent-src/templates/scripts/work_engine/directives/ui_trivial/apply.py +165 -0
  179. package/.agent-src/templates/scripts/work_engine/directives/ui_trivial/refine.py +66 -0
  180. package/.agent-src/templates/scripts/work_engine/directives/ui_trivial/report.py +62 -0
  181. package/.agent-src/templates/scripts/work_engine/directives/ui_trivial/test.py +115 -0
  182. package/.agent-src/templates/scripts/work_engine/dispatcher.py +331 -0
  183. package/.agent-src/templates/scripts/work_engine/emitters.py +43 -0
  184. package/.agent-src/templates/scripts/work_engine/errors.py +19 -0
  185. package/.agent-src/templates/scripts/work_engine/hook_bootstrap.py +76 -0
  186. package/.agent-src/templates/scripts/work_engine/hooks/__init__.py +54 -0
  187. package/.agent-src/templates/scripts/work_engine/hooks/builtin/__init__.py +32 -0
  188. package/.agent-src/templates/scripts/work_engine/hooks/builtin/_chat_history_base.py +103 -0
  189. package/.agent-src/templates/scripts/work_engine/hooks/builtin/chat_history_append.py +44 -0
  190. package/.agent-src/templates/scripts/work_engine/hooks/builtin/chat_history_halt_append.py +42 -0
  191. package/.agent-src/templates/scripts/work_engine/hooks/builtin/chat_history_heartbeat.py +50 -0
  192. package/.agent-src/templates/scripts/work_engine/hooks/builtin/chat_history_turn_check.py +49 -0
  193. package/.agent-src/templates/scripts/work_engine/hooks/builtin/directive_set_guard.py +53 -0
  194. package/.agent-src/templates/scripts/work_engine/hooks/builtin/halt_surface_audit.py +50 -0
  195. package/.agent-src/templates/scripts/work_engine/hooks/builtin/state_shape_validation.py +52 -0
  196. package/.agent-src/templates/scripts/work_engine/hooks/builtin/trace.py +84 -0
  197. package/.agent-src/templates/scripts/work_engine/hooks/context.py +66 -0
  198. package/.agent-src/templates/scripts/work_engine/hooks/events.py +44 -0
  199. package/.agent-src/templates/scripts/work_engine/hooks/exceptions.py +79 -0
  200. package/.agent-src/templates/scripts/work_engine/hooks/registry.py +60 -0
  201. package/.agent-src/templates/scripts/work_engine/hooks/runner.py +73 -0
  202. package/.agent-src/templates/scripts/work_engine/hooks/settings.py +141 -0
  203. package/.agent-src/templates/scripts/work_engine/input_builders.py +163 -0
  204. package/.agent-src/templates/scripts/work_engine/intent/__init__.py +47 -0
  205. package/.agent-src/templates/scripts/work_engine/intent/classify.py +280 -0
  206. package/.agent-src/templates/scripts/work_engine/migration/__init__.py +8 -0
  207. package/.agent-src/templates/scripts/work_engine/migration/v0_to_v1.py +231 -0
  208. package/.agent-src/templates/scripts/{implement_ticket → work_engine}/persona_policy.py +1 -1
  209. package/.agent-src/templates/scripts/work_engine/resolvers/__init__.py +22 -0
  210. package/.agent-src/templates/scripts/work_engine/resolvers/diff.py +106 -0
  211. package/.agent-src/templates/scripts/work_engine/resolvers/file.py +113 -0
  212. package/.agent-src/templates/scripts/work_engine/resolvers/prompt.py +90 -0
  213. package/.agent-src/templates/scripts/work_engine/scoring/__init__.py +14 -0
  214. package/.agent-src/templates/scripts/work_engine/scoring/confidence.py +300 -0
  215. package/.agent-src/templates/scripts/work_engine/stack/__init__.py +31 -0
  216. package/.agent-src/templates/scripts/work_engine/stack/detect.py +187 -0
  217. package/.agent-src/templates/scripts/work_engine/state.py +641 -0
  218. package/.agent-src/templates/scripts/work_engine/state_io.py +202 -0
  219. package/.claude-plugin/marketplace.json +105 -2
  220. package/AGENTS.md +38 -8
  221. package/CHANGELOG.md +609 -0
  222. package/README.md +136 -14
  223. package/config/agent-settings.template.yml +45 -0
  224. package/config/gitignore-block.txt +4 -0
  225. package/docs/MIGRATION.md +122 -0
  226. package/docs/architecture.md +111 -35
  227. package/docs/contracts/STABILITY.md +95 -0
  228. package/docs/contracts/adr-chat-history-split.md +132 -0
  229. package/docs/contracts/adr-command-suggestion.md +146 -0
  230. package/docs/contracts/adr-implement-ticket-runtime.md +122 -0
  231. package/docs/contracts/adr-product-ui-track.md +384 -0
  232. package/docs/contracts/adr-prompt-driven-execution.md +187 -0
  233. package/docs/contracts/agent-memory-contract.md +149 -0
  234. package/docs/contracts/artifact-engagement-flow.md +262 -0
  235. package/docs/contracts/command-clusters.md +126 -0
  236. package/docs/contracts/command-suggestion-flow.md +148 -0
  237. package/docs/contracts/implement-ticket-flow.md +628 -0
  238. package/docs/contracts/linear-ai-rules-inclusion.md +143 -0
  239. package/docs/contracts/linear-ai-three-layers.md +131 -0
  240. package/docs/contracts/rule-interactions.md +107 -0
  241. package/docs/contracts/rule-interactions.yml +142 -0
  242. package/docs/contracts/ui-stack-extension.md +236 -0
  243. package/docs/contracts/ui-track-flow.md +338 -0
  244. package/docs/development.md +1 -1
  245. package/docs/getting-started.md +3 -3
  246. package/docs/installation.md +124 -2
  247. package/docs/migrations/commands-1.15.0.md +112 -0
  248. package/docs/showcase.md +204 -0
  249. package/docs/ui-track-mental-model.md +121 -0
  250. package/package.json +1 -1
  251. package/scripts/agent-config +199 -0
  252. package/scripts/audit_cloud_compatibility.py +288 -0
  253. package/scripts/build_cloud_bundle.py +458 -0
  254. package/scripts/build_linear_digest.py +263 -0
  255. package/scripts/chat_history.py +796 -7
  256. package/scripts/check_compression.py +139 -0
  257. package/scripts/check_iron_law_prominence.py +143 -0
  258. package/scripts/check_md_language.py +159 -0
  259. package/scripts/check_portability.py +38 -0
  260. package/scripts/check_public_links.py +185 -0
  261. package/scripts/check_references.py +1 -0
  262. package/scripts/check_reply_consistency.py +140 -0
  263. package/scripts/command_suggester/__init__.py +51 -0
  264. package/scripts/command_suggester/cooldown.py +132 -0
  265. package/scripts/command_suggester/loader.py +70 -0
  266. package/scripts/command_suggester/match.py +180 -0
  267. package/scripts/command_suggester/rank.py +120 -0
  268. package/scripts/command_suggester/render.py +86 -0
  269. package/scripts/command_suggester/sanitize.py +113 -0
  270. package/scripts/command_suggester/settings.py +125 -0
  271. package/scripts/command_suggester/types.py +78 -0
  272. package/scripts/hooks/augment-chat-history.sh +56 -0
  273. package/scripts/install-hooks.sh +67 -0
  274. package/scripts/install.py +150 -33
  275. package/scripts/lint_marketplace.py +27 -0
  276. package/scripts/lint_no_new_atomic_commands.py +179 -0
  277. package/scripts/lint_rule_interactions.py +149 -0
  278. package/scripts/memory_lookup.py +1 -1
  279. package/scripts/migrate_command_suggestions.py +151 -0
  280. package/scripts/release.py +297 -64
  281. package/scripts/schemas/command.schema.json +41 -0
  282. package/scripts/skill_linter.py +81 -0
  283. package/scripts/sync_agent_settings.py +42 -12
  284. package/scripts/update_counts.py +10 -0
  285. package/templates/consumer-settings/augment-cli-hooks.json +54 -0
  286. package/templates/consumer-settings/claude-settings.json +55 -1
  287. package/.agent-src/rules/chat-history.md +0 -171
  288. package/.agent-src/templates/scripts/implement_ticket/cli.py +0 -171
  289. package/.agent-src/templates/scripts/implement_ticket/dispatcher.py +0 -134
  290. package/.agent-src/templates/scripts/implement_ticket/steps/__init__.py +0 -49
  291. package/.agent-src/templates/scripts/implement_ticket/steps/refine.py +0 -140
@@ -0,0 +1,384 @@
+ ---
+ stability: stable
+ ---
+
+ # ADR — Product UI Track: audit-as-hard-gate, design-review loop, stack dispatch
+
+ > **Status:** Decided · R3 Phases 1–6 shipped · 2026-05-01
+ > **Context:** [`ui-track-flow.md`](ui-track-flow.md) ·
+ > [`road-to-product-ui-track.md`](../../agents/roadmaps/road-to-product-ui-track.md) ·
+ > [`road-to-product-ui-track-followup.md`](../../agents/roadmaps/archive/road-to-product-ui-track-followup.md)
+ > **Builds on:** [`adr-prompt-driven-execution.md`](adr-prompt-driven-execution.md)
+ > — R2 envelope routing and the band-action gate that R3 widens to UI.
+ > **Defers to:** Roadmap 4 (`road-to-visual-review-loop.md`, stub) for
+ > headless-browser screenshot capture and visual-regression assertions.
+
+ ## Decision
+
+ R3 ships four directive sets — `backend` (R1/R2), **`ui`**, **`ui-trivial`**,
+ and **`mixed`** — dispatched at the engine boundary on
+ `state.directive_set`. The slot wiring is fixed by
+ [`directives/ui/__init__.py`](../../.agent-src.uncompressed/templates/scripts/work_engine/directives/ui/__init__.py),
+ [`directives/ui_trivial/__init__.py`](../../.agent-src.uncompressed/templates/scripts/work_engine/directives/ui_trivial/__init__.py),
+ and [`directives/mixed/__init__.py`](../../.agent-src.uncompressed/templates/scripts/work_engine/directives/mixed/__init__.py);
+ the contract for each lives in [`ui-track-flow.md`](ui-track-flow.md).
+
+ The UI set drives `audit → design → apply → review → polish → report`
+ with three load-bearing properties:
+
+ 1. **Existing-UI-audit is a hard gate.** No `apply` runs without
+ `state.ui_audit` populated. The gate lives at directive level **and**
+ at always-on rule level ([`ui-audit-before-build`](../../.agent-src.uncompressed/rules/ui-audit-before-build.md))
+ so an agent acting outside the engine cannot bypass it.
+ 2. **Design brief is locked microcopy.** `apply` consumes the brief
+ verbatim — `PLACEHOLDER_PATTERNS` (`<placeholder>`, `lorem`, `todo:`,
+ `tbd`, `xxx`) reject at producer and consumer.
+ 3. **Polish has a hard 2-round ceiling.** Schema-level validation
+ (`work_engine.state._validate_ui_polish`) rejects `rounds > 2` on
+ disk; after round 2 the engine halts with ship-as-is / abort / hand-off.
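A minimal sketch of the placeholder rejection described in property 2. The five patterns are the ones listed above; the helper names `contains_placeholder` and `validate_microcopy`, the dict-shaped brief, and the case-insensitive substring match are assumptions for illustration, not the engine's actual code.

```python
# The five patterns come from the ADR text. Everything else here
# (function names, matching strategy) is a hypothetical illustration.
PLACEHOLDER_PATTERNS = ("<placeholder>", "lorem", "todo:", "tbd", "xxx")


def contains_placeholder(text: str) -> bool:
    """Return True when any placeholder pattern appears in the text."""
    lowered = text.lower()
    return any(pattern in lowered for pattern in PLACEHOLDER_PATTERNS)


def validate_microcopy(brief: dict[str, str]) -> list[str]:
    """Return the keys of brief entries that still contain placeholders."""
    return [key for key, value in brief.items() if contains_placeholder(value)]
```

Running the same check on both the producer (design) and consumer (apply) side would give the "reject at producer and consumer" behaviour. A plain substring match can flag legitimate copy that happens to contain one of the patterns, so a real implementation may need word boundaries.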
+
+ Halt budget on the happy path: **2 user halts** — audit pick + design
+ sign-off. Apply / review / polish run silently when their producers
+ write clean envelopes.
+
+ ## Why this was a real question
+
+ R1 + R2 produced a competent backend executor. UI work was the gap. The
+ package already shipped UI skills (`fe-design`, `blade-ui`, `livewire`,
+ `flux`) but they were flat tools: an agent could call `flux` directly,
+ write a component that duplicated three existing primitives, ignore the
+ project's design tokens, and ship microcopy with `<placeholder>` strings
+ intact. Tests passed; the result felt undesigned.
+
+ Three options were on the table:
+
+ 1. **More UI skills, no directive set** — add `react-shadcn-ui`,
+ `existing-ui-audit`, `ui-design-brief` as flat skills and let
+ `/work` route to them via the existing dispatcher. Rejected:
+ the audit step is the load-bearing piece, and a flat skill cannot
+ enforce "no apply before audit". Rules can advise; only the
+ dispatcher can refuse.
+ 2. **One mega-skill (`product-ui`) that orchestrates internally** —
+ single SKILL.md that runs audit → design → apply → review →
+ polish in one shot. Rejected: the engine's halt-budget,
+ sentinel-based replay, and Golden Transcript suite are designed
+ for slot-level granularity. Folding five steps into one skill
+ blinds the freeze-guard to mid-flow regressions.
+ 3. **A new directive set per intent (chosen)** — `ui` for build /
+ improve, `ui-trivial` for provably bounded edits, `mixed` for
+ prompts that touch both layers. Adopted because each slot keeps
+ its own sentinel, halt surface, and replay coverage; the audit
+ gate is enforced at the dispatcher boundary and at the agent
+ boundary; and stack-specific implementation is dispatched without
+ widening the directive set.
+
+ ## Audit as a hard gate (the load-bearing piece)
+
+ The Lovable-grade differentiator is **"audit existing UI first, design
+ before code, polish before ship"**. Everything in R3 follows from that.
+
+ The audit gate refuses `apply` until `state.ui_audit` is well-formed:
+ either ≥1 `components_found` entry, or `greenfield=True` with a user-chosen
+ `greenfield_decision` ∈ `{scaffold, bare, external_reference}`. An empty
+ dict, `None`, or a populated dict without those keys is **not** findings;
+ the gate emits `@agent-directive: existing-ui-audit` and refuses to advance.
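The well-formedness rule above can be sketched as a purely structural predicate. This is an illustration under assumptions: the function name `audit_is_well_formed` is hypothetical, and the audit is assumed to be a plain dict keyed by the field names quoted in the text.

```python
from typing import Any, Optional

# Allowed decisions are taken from the ADR; the function name is hypothetical.
GREENFIELD_DECISIONS = {"scaffold", "bare", "external_reference"}


def audit_is_well_formed(ui_audit: Optional[dict[str, Any]]) -> bool:
    """Return True only when the audit carries real findings."""
    if not isinstance(ui_audit, dict) or not ui_audit:
        return False  # None or an empty dict is not findings
    if ui_audit.get("components_found"):
        return True  # at least one existing component was found
    # Greenfield only counts once the user has picked a decision.
    return (
        ui_audit.get("greenfield") is True
        and ui_audit.get("greenfield_decision") in GREENFIELD_DECISIONS
    )
```

A dict that is populated but carries neither key path, for example `{"notes": "looked around"}`, returns `False`, which matches the "populated dict without those keys" case.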
+
+ Two enforcement layers, deliberately redundant:
+
+ - **Dispatcher layer** —
+ [`directives/ui/audit.py`](../../.agent-src.uncompressed/templates/scripts/work_engine/directives/ui/audit.py)
+ refuses to write `outcomes["refine"] = "success"` without a populated
+ audit. Purely structural; no LLM, no heuristic.
+ - **Agent layer** — [`ui-audit-before-build`](../../.agent-src.uncompressed/rules/ui-audit-before-build.md)
+ is an always-on rule that fires when the agent is about to write a
+ component file outside the engine (free-form edit, side conversation,
+ cloud surface). The rule encodes the same Iron Law in prose so cloud
+ agents that don't ship the engine still honour it.
+
+ **Why two layers.** A single layer would leak: cloud surfaces and
+ free-form edits bypass the dispatcher entirely. A rule alone would not
+ hold under engine-driven runs because rules don't refuse exit codes.
+ Belt-and-suspenders is cheap (no shared state, no double-write) and
+ the failure modes are different enough to justify it.
+
+ ### Confidence-path resolution
+
+ Audit findings carry a confidence label and per-candidate similarity.
+ [`directives/ui/audit.py::_decide_path`](../../.agent-src.uncompressed/templates/scripts/work_engine/directives/ui/audit.py)
+ resolves to one of:
+
+ - `high_confidence` — confidence `high` + ≥1 match with similarity
+ ≥ `STRONG_SIMILARITY = 0.7` and no runner-up within
+ `TIE_GAP = 0.05`. Audit folds findings into the design brief — no
+ separate halt.
+ - `ambiguous` — populated but below the high-confidence threshold or
+ with a tie. Numbered-options halt records `audit_path = "ambiguous"`
+ + `candidate_pick`.
+ - `greenfield` — no `components_found`, `greenfield = True`, user picks
+ `scaffold` / `bare` / `external_reference`.
+
+ Constants are named, exported, and imported by tests so a re-tune
+ re-captures Goldens explicitly rather than drifting silently.
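A hedged sketch of the three-way resolution, not the actual `_decide_path`. The constants and path names come from the text; the dict keys `confidence` and `similarity`, the inclusive tie comparison, and the assumption that the audit has already passed the gate's well-formedness check are illustrative choices.

```python
# Constants as named in the ADR. Key names and tie semantics are assumptions.
STRONG_SIMILARITY = 0.7
TIE_GAP = 0.05


def decide_path(audit: dict) -> str:
    """Classify a well-formed audit as high_confidence, ambiguous, or greenfield."""
    matches = audit.get("components_found") or []
    if not matches:
        return "greenfield"
    scores = sorted((m["similarity"] for m in matches), reverse=True)
    strong = scores[0] >= STRONG_SIMILARITY
    # A runner-up "within TIE_GAP" is read here as a gap of at most TIE_GAP.
    tied = len(scores) > 1 and scores[0] - scores[1] <= TIE_GAP
    if audit.get("confidence") == "high" and strong and not tied:
        return "high_confidence"
    return "ambiguous"
```

Keeping the thresholds as module-level names is what lets tests import them, so a re-tune shows up as an explicit change rather than a silent drift.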
+
+ ## Design-review polish loop
+
+ `apply` writes the rendered envelope. `review` (stack-dispatched) emits
+ `findings` + `review_clean`. `polish` runs a bounded fix loop with a
+ hard ceiling of `POLISH_CEILING = 2` rounds, validated at three layers
+ (in-memory state, on-disk schema, dispatcher).
+
+ | `review_clean` | `rounds` | Behaviour |
+ |---|---|---|
+ | `True` | any | `SUCCESS` — advance to report |
+ | `False` | `< 2` | `BLOCKED` + `@agent-directive: ui-polish-<stack>`; agent applies fixes, re-runs review, increments `rounds` |
+ | `False` | `== 2` | `BLOCKED` numbered options: ship-as-is / abort / hand off |
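The table above can be read as a small decision function. This is a sketch under assumptions: the function name, the `(status, detail)` return shape, and the `ValueError` for out-of-range rounds are hypothetical; only `POLISH_CEILING`, the statuses, and the directive string come from the text.

```python
POLISH_CEILING = 2  # value stated in the ADR


def polish_outcome(review_clean: bool, rounds: int, stack: str) -> tuple[str, str]:
    """Return (status, detail) for the polish slot, following the table."""
    if rounds > POLISH_CEILING:
        # Mirrors the schema-level rejection of rounds > 2.
        raise ValueError(f"rounds={rounds} exceeds ceiling {POLISH_CEILING}")
    if review_clean:
        return "SUCCESS", "advance to report"
    if rounds < POLISH_CEILING:
        return "BLOCKED", f"@agent-directive: ui-polish-{stack}"
    return "BLOCKED", "options: ship-as-is / abort / hand off"
```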
+
+ **Why ceiling at 2.** Three rounds is one round too many: by round 3
+ the agent is either making cosmetic tweaks the user can't notice, or
+ spinning on a design problem the engine cannot resolve. Halting at 2
+ hands the decision to the user with the cheapest possible context
+ (two completed rounds of evidence). Rate of false-ceiling-hits will
+ be visible in delivery reports; a tune to 3 stays a one-line constant
+ change with a Golden re-capture.
+
+ **Token-violation extraction.** Findings with `kind == "token_violation"`
+ carry `category` and `value`. Polish classifies against
+ `state.ui_audit.design_tokens`:
+
+ - Matched value → fix uses the named token.
+ - Unmatched value repeated > `TOKEN_REPEAT_THRESHOLD = 2` times →
+ emits `polish_token_extraction_pending` to extract a new token
+ before the next round runs. One-off unmatched values stay inline.
+
+ This is the only place R3 mutates the design system. It runs only
+ after a finding identifies a violation, and only when the same value
+ appears three or more times — single-use values are not worth the
+ ceremony.
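A minimal sketch of the token-violation classification, assuming `design_tokens` maps a category to a `{token_name: value}` dict. That shape, the function name, and the result keys are hypothetical; the `kind`, `category`, and `value` fields and the threshold come from the text.

```python
from collections import Counter

TOKEN_REPEAT_THRESHOLD = 2  # value stated in the ADR


def classify_violations(findings: list[dict], design_tokens: dict[str, dict[str, str]]) -> dict:
    """Split token violations into named-token fixes, extraction candidates, and inline values."""
    violations = [f for f in findings if f.get("kind") == "token_violation"]
    counts = Counter((f["category"], f["value"]) for f in violations)
    result: dict = {"use_token": {}, "extract": [], "inline": []}
    for (category, value), count in counts.items():
        known = design_tokens.get(category, {})
        name = next((n for n, v in known.items() if v == value), None)
        if name is not None:
            result["use_token"][(category, value)] = name  # fix uses the named token
        elif count > TOKEN_REPEAT_THRESHOLD:
            result["extract"].append((category, value))  # three or more uses
        else:
            result["inline"].append((category, value))  # not worth a new token
    return result
```

Any non-empty `extract` list would correspond to emitting `polish_token_extraction_pending` before the next round.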
+
+ ## Halt budget — why max 2 on the happy path
+
+ The Lovable feel comes from "user decides once, agent runs". Three
+ halts (audit + design + polish-final) is the obvious shape but
+ empirically wrong: by the time polish needs sign-off, the user is
+ context-switched away. Two halts that pin the **decisive** choices
+ (which existing UI to extend, what microcopy to ship) buy the agent
+ the runway to finish silently.
+
+ Additional halts surface only on real ambiguity:
+ greenfield-undecided, shadcn-version-mismatch, audit-ambiguous,
+ placeholder rejection, polish round (per dirty review, capped at 2),
+ polish ceiling. GT-U11 (high-confidence) and GT-U12 (ambiguous) pin
+ the budget at 1 and 2 halts respectively; a regression that adds a
+ third halt fails replay.
+
+ ## Trivial path and reclassification
+
+ For provably bounded edits (single class swap, copy tweak, one-prop
+ adjustment), the Phase-1 intent classifier writes
+ `directive_set = "ui-trivial"`. The slot wiring collapses to
+ `refine → ⊘ → ⊘ → ⊘ → apply → test → ⊘ → report` with
+ `MAX_FILES = 1` and `MAX_LINES_CHANGED = 5` enforced inside
+ [`directives/ui_trivial/apply.py`](../../.agent-src.uncompressed/templates/scripts/work_engine/directives/ui_trivial/apply.py).
+
+ **Mandatory reclassification at apply time.** When a trivial edit
+ exceeds the preconditions, apply flips
+ `state.directive_set = "ui"` and the dispatcher restarts at audit.
+ The reclassification is loud (delivery report records it) and
+ counted (delivery report tracks the rate). Silent skips are the
+ failure mode to watch for.
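The bounds check and reclassification can be sketched as follows. The two constants and the `directive_set` values come from the text; the function name, the inclusive comparison, and the `reclassified` flag standing in for the delivery-report record are assumptions.

```python
MAX_FILES = 1          # values stated in the ADR
MAX_LINES_CHANGED = 5


def apply_trivial(state: dict, files_changed: int, lines_changed: int) -> dict:
    """Keep the trivial path while both bounds hold; otherwise escalate to the full UI set."""
    within_bounds = files_changed <= MAX_FILES and lines_changed <= MAX_LINES_CHANGED
    if not within_bounds:
        state["directive_set"] = "ui"   # dispatcher restarts at audit
        state["reclassified"] = True    # hypothetical flag so the report can record it
    return state
```

Recording the flip in state rather than silently switching sets is what keeps the reclassification loud and countable.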
+
+ **Why the bypass exists.** Without it, "fix the typo on the login
+ button" runs the full audit + design loop. That is six slots and up
+ to two happy-path halts the user does not want to take. The bypass
+ restores common sense without weakening the audit gate: the gate
+ stays in force whenever the preconditions don't hold.
+
+ ## Stack detection and dispatch
+
+ [`scripts/work_engine/stack/detect.py`](../../.agent-src.uncompressed/templates/scripts/work_engine/stack/detect.py)
+ reads `composer.json` and `package.json` once, applies a four-rule
+ priority table, and writes `state.stack.frontend`:
+
+ 1. `livewire/livewire` + `livewire/flux` → `blade-livewire-flux`
+ 2. `react` + (`@radix-ui/*` OR `shadcn-ui` OR `components.json`) → `react-shadcn`
+ 3. `vue` (any major) → `vue`
+ 4. otherwise → `plain`
+
+ Cached on `state.stack` against the manifest `mtime`. A `composer.json`
+ or `package.json` change invalidates the cache; same conversation
+ re-detects on the next dispatch.
+
+ Errors (missing file, malformed JSON) downgrade to `plain` rather than
+ raising — a wrong stack label is recoverable (audit catches it, user
+ can override), a crash mid-dispatch is not.
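A simplified sketch of the four-rule priority table with the downgrade-on-error behaviour. It is not the contents of `detect.py`: the helper names, the manifest sections inspected (`require`, `require-dev`, `dependencies`, `devDependencies`), and the reading of `components.json` as a file at the project root are assumptions, and the `mtime` cache is omitted.

```python
import json
from pathlib import Path


def _load(path: Path) -> dict:
    """Return parsed JSON, or an empty dict when the file is missing or malformed."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def _deps(manifest: dict, *sections: str) -> set[str]:
    """Collect dependency names from the given manifest sections."""
    names: set[str] = set()
    for section in sections:
        entries = manifest.get(section)
        if isinstance(entries, dict):
            names.update(entries)
    return names


def detect_frontend(root: Path) -> str:
    """Apply the priority table top to bottom; the first matching rule wins."""
    php = _deps(_load(root / "composer.json"), "require", "require-dev")
    js = _deps(_load(root / "package.json"), "dependencies", "devDependencies")
    if {"livewire/livewire", "livewire/flux"} <= php:
        return "blade-livewire-flux"
    has_shadcn = (
        any(name.startswith("@radix-ui/") for name in js)
        or "shadcn-ui" in js
        or (root / "components.json").is_file()
    )
    if "react" in js and has_shadcn:
        return "react-shadcn"
    if "vue" in js:
        return "vue"
    return "plain"
```

Because `_load` swallows read and parse errors, a broken manifest simply contributes no dependencies, and detection falls through to whichever later rule still matches, ending at `plain`.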
214
+
215
+ `apply`, `review`, `polish` each route on `state.stack.frontend` to a
216
+ stack-specific directive (`ui-apply-blade-livewire-flux`,
217
+ `ui-apply-react-shadcn`, `ui-apply-vue`, `ui-apply-plain`). The
218
+ dispatch table shape is identical across all three slots; adding a
219
+ stack adds three rows, one per slot.
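
One possible shape for the dispatch table, shown as a sketch. Only the `ui-apply-*` directive names appear in this ADR; the `ui-review-*` and `ui-polish-*` names are assumed to follow the same pattern, and the fallback to `plain` for an unknown stack label is an assumption as well.

```python
STACKS = ("blade-livewire-flux", "react-shadcn", "vue", "plain")
SLOTS = ("apply", "review", "polish")

# One row per (slot, stack); the table shape is identical for all three slots.
DISPATCH = {
    slot: {stack: f"ui-{slot}-{stack}" for stack in STACKS}
    for slot in SLOTS
}


def route(slot: str, frontend: str) -> str:
    """Resolve the directive for a slot from state.stack.frontend."""
    table = DISPATCH[slot]
    return table.get(frontend, table["plain"])  # assumed fallback
```

Under this layout, adding a stack means appending one name to `STACKS`, which yields exactly three new rows.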
220
+
221
+ ## fe-design migration — reference, not executor
222
+
223
+ `fe-design` previously functioned as both heuristics catalogue and
224
+ direct executor. R3 splits the responsibilities:
225
+
226
+ - **Reference (kept):** layout patterns, form / table design,
227
+ responsive strategy, a11y heuristics. Cited by
228
+ [`directives/ui/design.py`](../../.agent-src.uncompressed/templates/scripts/work_engine/directives/ui/design.py).
229
+ - **Executor (removed):** code-writing responsibilities migrated to
230
+ the stack-specific apply / review / polish skills (`flux`,
231
+ `livewire`, `blade-ui`, `react-shadcn-ui`, `ui-apply-vue`).
232
+
233
+ The split lets the design step stay framework-agnostic while
234
+ implementation skills stay focused on a single stack's idioms.
235
+
236
+ ## Mixed orchestration — contract first, then UI, then stitch
237
+
238
+ When a single input touches both layers, `mixed` runs
239
+ `refine → memory → analyze → contract → ui → stitch → verify → report`.
240
+ Sentinels:
241
+
242
+ - `state.contract.contract_confirmed` — UI sub-flow refuses to start
243
+ without it (defense-in-depth even if `outcomes["plan"] == "success"`).
244
+ - `state.ui_review.review_clean` — mixed `ui` step's success condition.
245
+ - `state.stitch.verdict = "success"` — stitch's success condition;
246
+ `blocked` / `partial` halts unless `state.stitch.integration_confirmed`
247
+ flips.
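
The three sentinels can be read as small predicates over the state. The sketch below assumes the state is a nested dict and uses hypothetical function names; it only illustrates the conditions listed above.

```python
def ui_may_start(state: dict) -> bool:
    """The UI sub-flow refuses to start without a confirmed contract."""
    return state.get("contract", {}).get("contract_confirmed") is True


def ui_step_succeeded(state: dict) -> bool:
    """The mixed `ui` step succeeds only on a clean review."""
    return state.get("ui_review", {}).get("review_clean") is True


def stitch_outcome(state: dict) -> str:
    """Return 'continue' or 'halt' for the stitch step."""
    stitch = state.get("stitch", {})
    verdict = stitch.get("verdict")
    if verdict == "success":
        return "continue"
    # blocked / partial halts unless the user confirmed the integration
    if verdict in ("blocked", "partial") and stitch.get("integration_confirmed") is True:
        return "continue"
    return "halt"
```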
248
+
249
+ **Why contract first.** UI shape and microcopy depend on the API
250
+ surface (field names, error codes, paginated vs streamed). Building
251
+ UI before locking the contract leads to two rounds of churn — once
252
+ when the contract surfaces a constraint the design ignored, once
253
+ when the integration test reveals a field-name mismatch. Locking
254
+ the contract first costs one halt and saves an entire polish round.
255
+
256
+ ## Tradeoffs accepted
257
+
258
+ - **Three new directive sets** — bigger surface to maintain, more
259
+ Goldens (12 GT-U entries vs. R2's 4 GT-P). Mitigated by shared
260
+ refine/report handlers and the strict-verb replay harness.
261
+ - **Hard 2-round polish ceiling** — some legitimate edge cases
262
+ (a11y compliance edits across multiple components) need 3 rounds.
263
+ Mitigated by ship-as-is / hand-off as explicit user choices at
264
+ the ceiling halt.
265
+ - **Stack detection is heuristic, not full-AST** — a project with
266
+ `react` + `flux` (e.g. test scaffold) misclassifies as
267
+ `react-shadcn`. Mitigated by user-override path: explicit `intent`
268
+ in prompt overrides detection.
269
+ - **Two-layer audit enforcement** — duplicates the gate logic in
270
+ rule prose and dispatcher Python. Accepted: cloud surfaces don't
271
+ ship the engine, and engine-driven flows can't rely on rule
272
+ enforcement. Both layers are needed; the cost is documentation.
273
+
274
+ ## Non-goals
275
+
276
+ - Does **not** replace `/implement-ticket` or `/work` — same engine,
277
+ new directive sets only.
278
+ - Does **not** introduce a `/build` or `/build-screen` entrypoint.
279
+ Intent classification at refine is enough; new commands wait on
280
+ evidence of mis-routing.
281
+ - Does **not** pin a removal date for `fe-design`; the reference
282
+ positioning is stable.
283
+ - Does **not** ship visual review (headless browser, screenshot
284
+ capture, a11y tooling) — Roadmap 4 stub.
285
+ - Does **not** ship project-local design-memory beyond what the
286
+ audit step already extracts; the audit is the memory.
287
+
288
+ ## Consequences — unblocks
289
+
290
+ - **Roadmap 4** (visual review loop) can register a post-polish
291
+ visual-assertion step against the existing `state.ui_apply`
292
+ envelope.
293
+ - **New stacks** (Svelte, SolidJS, Astro) plug in via the
294
+ extension recipe ([`ui-stack-extension.md`](ui-stack-extension.md)) —
295
+ detector heuristic + apply/review/polish skills + Golden fixture.
296
+ No engine change.
297
+ - **Mixed entrypoints** (`/refactor`, `/spike`, future) compose by
298
+ building a new envelope kind plus a new directive set; the audit
299
+ gate is reusable when the work touches UI.
300
+
301
+ ## Follow-ups (not part of this ADR)
302
+
303
+ - Track reclassification rate (`ui-trivial` → `ui` flips) in
304
+ delivery reports; if > 30 % of trivial runs reclassify, the
305
+ classifier needs tuning.
306
+ - Track polish-ceiling-hit rate; if > 10 % of UI runs hit the
307
+ ceiling, raise to 3 with a Golden re-capture.
308
+ - Decide whether `state.ui_audit` should expire on long-running
309
+ state files (currently sticky for the lifetime of the file).
310
+ - Evaluate whether `react-shadcn` should split into
311
+ `react-shadcn-v2` / `react-shadcn-v3` once shadcn ships its next
312
+ major; the audit version-mismatch warning surfaces this signal
313
+ without forcing a split today.
314
+
315
+ ## R4 amendment — Visual Review Loop (2026-05-01)
316
+
317
+ > **Status:** Decided · R4 Phases 0–4 shipped · Phase 5 in progress
318
+ > **Roadmap:** [`road-to-visual-review-loop.md`](../../agents/roadmaps/road-to-visual-review-loop.md)
319
+
320
+ R4 narrows the polish-termination contract from "subjective ceiling
321
+ only" to "subjective ceiling **plus** objective a11y block", adds a
322
+ preview envelope on `state.ui_review.preview` (engine reads, skill
323
+ writes — the engine never spawns a browser), and threads an a11y
324
+ baseline through the existing audit step so pre-existing violations
325
+ stay informational while new ones block.
326
+
327
+ ### Acceptance criteria — locked verbatim from the roadmap
328
+
329
+ - **AC #1 — Objective polish anchoring:** Polish loop terminates when
330
+ (a) `findings` is empty OR (b) `rounds == POLISH_CEILING` AND no
331
+ `a11y_violation` entries remain at severity ≥ floor. Round 2 with
332
+ remaining a11y findings halts via `polish_a11y_blocking`, not via
333
+ `polish_ceiling_reached`.
334
+ - **AC #2 — Preview envelope:** `state.ui_review.preview` shape
335
+ validated by engine; `render_ok: False` halts via
336
+ `preview_render_failed`; trivial path bypasses the envelope
337
+ entirely.
338
+ - **AC #3 — A11y baseline:** `state.ui_audit.a11y_baseline` is read
339
+ by the review gate; only NEW/CHANGED violations are actionable.
340
+ Pre-existing violations stay in findings as informational, never
341
+ block polish.
342
+ - **AC #4 — Goldens pinned:** GT-U13, GT-U14, GT-U15 captured and
343
+ replay-byte-equal. GT-U4 still passes (regression guard for the
344
+ narrowed `polish_ceiling_reached`).
345
+ - **AC #5 — `task ci` green** with all three new baselines wired
346
+ into the harness and CHECKSUMS regenerated.
347
+ - **AC #6 — Contracts updated:** `ui-track-flow.md` and
348
+ `adr-product-ui-track.md` reflect the new gates and the
349
+ polish-termination rewrite.
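
For AC #3, a baseline comparison could look like the sketch below. The violation identity used here, a `(rule_id, target)` pair, is an assumption made for illustration; the ADR does not define how NEW/CHANGED is decided, and `split_by_baseline` is a hypothetical name.

```python
def split_by_baseline(violations: list, baseline: list) -> tuple:
    """Partition violations into (actionable, informational)."""
    known = {(b["rule_id"], b["target"]) for b in baseline}
    actionable, informational = [], []
    for violation in violations:
        key = (violation["rule_id"], violation["target"])
        # Pre-existing violations stay visible but never block polish.
        bucket = informational if key in known else actionable
        bucket.append(violation)
    return actionable, informational
```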
350
+
351
+ ### Termination rewrite — load-bearing change
352
+
353
+ The pre-R4 polish step had one ceiling halt
354
+ (`polish_ceiling_reached`) with three options: ship as-is / abort /
355
+ hand off. R4 keeps that branch for **subjective** findings (visual
356
+ polish, design tweaks) and adds an explicit **objective** branch
357
+ when remaining findings include `a11y_violation` entries:
358
+
359
+ - `polish_a11y_blocking` halt with options:
360
+ 1. **Extend** — sets `state.ui_polish.extension_used = True`; the
361
+ schema validator widens `rounds` from `[0, 2]` to `[0, 3]` only
362
+ while the flag is set. Once spent, the Extend option disappears
363
+ on re-entry.
364
+ 2. **Accept** — appends rule ids to
365
+ `state.ui_review.a11y.accepted_violations`; on replay the review
366
+ gate's `_apply_a11y_gate` filters those rule ids before
367
+ synthesising findings, so the run round-trips through `SUCCESS`
368
+ idempotently.
369
+ 3. **Abort** — drops the UI request.
370
+
371
+ The two halts share the polish step but never fire together: the
372
+ explicit branch in `polish.run()` checks for `a11y_violation` entries
373
+ before falling through to the subjective ceiling.
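
The branch order can be sketched as a pure function. This is not `polish.run()` itself: the finding shape (`kind`, `severity`), the numeric `floor`, and the return labels `done` / `continue` are assumptions, while the two halt codes and the `[0, 2]` to `[0, 3]` widening come from the text above.

```python
POLISH_CEILING = 2  # Extend allows exactly one extra round


def round_limit(extension_used: bool) -> int:
    """Rounds allowed: 2 normally, 3 while the extension flag is set."""
    return POLISH_CEILING + 1 if extension_used else POLISH_CEILING


def polish_outcome(findings: list, rounds: int,
                   extension_used: bool = False, floor: int = 1) -> str:
    """Return 'done', 'continue', or one of the two ceiling halt codes."""
    if not findings:
        return "done"
    if rounds < round_limit(extension_used):
        return "continue"
    # The objective branch is checked first, so the two halts never fire together.
    blocking = [
        f for f in findings
        if f.get("kind") == "a11y_violation" and f.get("severity", 0) >= floor
    ]
    if blocking:
        return "polish_a11y_blocking"
    return "polish_ceiling_reached"


def a11y_halt_options(extension_used: bool) -> list:
    """Extend is offered once; after it is spent only Accept and Abort remain."""
    options = ["accept", "abort"]
    return options if extension_used else ["extend"] + options
```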
374
+
375
+ ### Iron law
376
+
377
+ The engine **never** renders. Playwright integration lives in the
378
+ stack-specific review skills. The engine reads
379
+ `state.ui_review.preview`; the skill writes it. `render_ok: False`
380
+ halts via `preview_render_failed`; `skipped: True` bypasses the gate idempotently;
381
+ `render_ok: True` with `screenshot_path` set threads the path into
382
+ the delivery report's `artifacts` list. This boundary keeps the
383
+ package Python + Bash and pushes the browser dependency into
384
+ consumer-project setup.
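
The read side of that boundary might look like the following sketch. `preview_gate` is a hypothetical name, the envelope is assumed to be a plain dict, and checking `skipped` before `render_ok` is an assumption about precedence.

```python
from typing import Optional


def preview_gate(preview: dict, artifacts: list) -> Optional[str]:
    """Return a halt code, or None when the run may proceed.

    The engine only reads the envelope; a review skill wrote it earlier.
    """
    if preview.get("skipped") is True:
        return None  # bypass; repeating the call changes nothing
    if preview.get("render_ok") is False:
        return "preview_render_failed"
    path = preview.get("screenshot_path")
    if preview.get("render_ok") is True and path and path not in artifacts:
        artifacts.append(path)  # threaded into the delivery report
    return None
```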
@@ -0,0 +1,187 @@
1
+ ---
2
+ stability: stable
3
+ ---
4
+
5
+ # ADR — Prompt-Driven Execution: `/work` and the confidence-band gate
6
+
7
+ > **Status:** Decided · R2 Phases 1–6 shipped · 2026-04-28
8
+ > **Context:** [`implement-ticket-flow.md`](implement-ticket-flow.md) ·
9
+ > [`road-to-prompt-driven-execution.md`](../../agents/roadmaps/archive/road-to-prompt-driven-execution.md)
10
+ > **Builds on:** [`adr-work-engine-rename.md`](adr-work-engine-rename.md) —
11
+ > the universal-dispatcher refactor that this entrypoint slots into.
12
+ > **Defers to:** Roadmap 3 (`road-to-product-ui-track.md`) for UI- and
13
+ > mixed-intent prompts; this ADR is backend-only.
14
+
15
+ ## Decision
16
+
17
+ A second top-level slash command, **`/work`**, drives a free-form prompt
18
+ through the same `work_engine` dispatcher that backs `/implement-ticket`.
19
+ The two commands differ only in the input envelope they build:
20
+
21
+ | Command | Subcommand | Envelope | Trigger |
22
+ |---|---|---|---|
23
+ | `/implement-ticket` | `./agent-config implement-ticket` | `input.kind="ticket"` | Ticket id, URL, or pasted ticket payload |
24
+ | `/work` | `./agent-config work` | `input.kind="prompt"` | Free-form goal — no ticket id, no AC yet |
25
+
26
+ The engine routes on `input.kind` at the dispatcher boundary; downstream
27
+ directives (`memory`, `analyze`, `plan`, `implement`, `test`, `verify`,
28
+ `report`) are envelope-agnostic.
29
+
30
+ ## Why this was a real question
31
+
32
+ Roadmap 1 shipped a ticket-shaped contract: the engine assumed
33
+ `state.ticket["acceptance_criteria"]` would always be populated by the
34
+ caller. R2 broke that assumption — a free-form prompt arrives as a
35
+ single string with no AC and no scope boundary. Three options were on
36
+ the table:
37
+
38
+ 1. **One command, two modes** — overload `/implement-ticket` with a
39
+ `--prompt` flag. Rejected: the slash-command surface is the public
40
+ contract, and the dispatch decision (ticket vs. prompt) belongs at
41
+ the entrypoint, not behind a flag.
42
+ 2. **Two commands, two engines** — fork the dispatcher. Rejected:
43
+ doubles the maintenance surface and re-introduces the
44
+ ticket-as-name-lock that
45
+ [`adr-work-engine-rename.md`](adr-work-engine-rename.md) just
46
+ removed.
47
+ 3. **Two commands, one engine, two envelopes (chosen)** — `/work` and
48
+ `/implement-ticket` are sibling envelope-builders over the same
49
+ `work_engine`. Adopted because it preserves the freeze-guard on
50
+ `/implement-ticket` (R1 goldens stay byte-equal) and keeps a single
51
+ place to add R3's UI-intent envelope later.
52
+
53
+ ## Naming — `/work` over `/do`, `/execute`, `/build`
54
+
55
+ Phase 1 locked the name. Considered alternatives:
56
+
57
+ | Name | Rejected because |
58
+ |---|---|
59
+ | `/do` | Prefix-collision with `/do-and-judge`, `/do-in-steps` — surfaces as ambiguous in autocomplete and command-routing |
60
+ | `/execute` | Reads as "run this artisan command for me" rather than "drive an end-to-end flow" |
61
+ | `/build` | Dominated by build-system semantics in most ecosystems (CI builds, asset builds) |
62
+ | `/work` | Not rejected (chosen): pairs naturally with the underlying `work_engine` module name; reads as "do the work end-to-end" |
63
+
64
+ `/do` lost on prefix-collision alone: however lean the name reads, the
65
+ autocomplete ambiguity is an everyday cost. `/work` was the next-best
66
+ option, and it also matches the engine module name.
67
+
68
+ ## Confidence-band gate (the load-bearing piece of R2)
69
+
70
+ A free-form prompt has no AC contract. Before R2, every downstream
71
+ gate would either trip on missing AC or fabricate one silently. The
72
+ gate solves this with a **deterministic, heuristic-only scorer** at the
73
+ `refine` boundary.
74
+
75
+ **Single source of truth:**
76
+ [`scripts/work_engine/scoring/confidence.py`](../../.agent-src/templates/scripts/work_engine/scoring/confidence.py).
77
+ The rubric, dimension definitions, weights, and band thresholds live in
78
+ that module. SKILL.md, this ADR, and `implement-ticket-flow.md` cite
79
+ the module — they do **not** re-derive the values. Tuning happens by
80
+ editing the constants and re-capturing goldens.
81
+
82
+ **Rubric shape** (5 dimensions × 0–2, sum / 10 → band):
83
+
84
+ - `goal_clarity` · `scope_boundary` · `ac_evidence` ·
85
+ `stack_data` · `reversibility`
86
+
87
+ **Band-action mapping:**
88
+
89
+ | Band | Score | Engine outcome | Agent surface |
90
+ |---|---|---|---|
91
+ | `high` | `≥ 0.8` | `SUCCESS` | Silent proceed; AC + assumptions land in delivery report |
92
+ | `medium` | `0.5 ≤ score < 0.8` | `PARTIAL` | Assumptions-report halt; user confirms or edits, engine re-runs |
93
+ | `low` | `< 0.5` | `BLOCKED` | One clarifying question on the weakest dimension (per `ask-when-uncertain` Iron Law) |
94
+
95
+ **Why heuristic, not LLM:** the score must be reproducible across
96
+ replays — the freeze-guard harness pins expected outcomes per fixture,
97
+ and an LLM-based scorer would drift between runs. Heuristics also
98
+ remove a network dependency from the gate.
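
The rubric shape and band mapping above reduce to a few lines. The sketch below is illustrative only: the authoritative constants live in `scoring/confidence.py`, and the function name, return shape, and tie-breaking for the weakest dimension (first in listed order) are assumptions.

```python
DIMENSIONS = ("goal_clarity", "scope_boundary", "ac_evidence",
              "stack_data", "reversibility")
OUTCOMES = {"high": "SUCCESS", "medium": "PARTIAL", "low": "BLOCKED"}


def classify(scores: dict) -> dict:
    """Map five 0-2 dimension scores to a band and an engine outcome."""
    for name in DIMENSIONS:
        if scores[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1 or 2")
    score = sum(scores[name] for name in DIMENSIONS) / 10
    if score >= 0.8:
        band = "high"
    elif score >= 0.5:
        band = "medium"
    else:
        band = "low"
    return {
        "score": score,
        "band": band,
        "outcome": OUTCOMES[band],
        # The low band asks one question about the weakest dimension.
        "weakest": min(DIMENSIONS, key=lambda name: scores[name]),
    }
```

With no randomness and no network call, the same prompt features always yield the same band, which is what the replay harness relies on.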
99
+
100
+ ## AC projection — the engine fix Phase 5 surfaced
101
+
102
+ Downstream gates (`analyze`, `plan`) read
103
+ `state.ticket["acceptance_criteria"]` — a slot the prompt envelope
104
+ never sets directly. R2's first SUCCESS path through `refine` populated
105
+ `state.input.data.reconstructed_ac` only, so `analyze` blocked with
106
+ `ticket lost its acceptance criteria` even on a clean high-band run.
107
+
108
+ Fixed at the refine boundary: `directives/backend/refine.py::_run_prompt`
109
+ mirrors `data["reconstructed_ac"]` into `state.ticket["acceptance_criteria"]`
110
+ (as an independent list copy) before SUCCESS branches. Two regression
111
+ tests pin the contract on the high-band and medium-band release paths.
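
A sketch of the projection, assuming the state is a nested dict; `project_ac` is a hypothetical name standing in for the relevant part of `_run_prompt`.

```python
def project_ac(state: dict) -> None:
    """Mirror the reconstructed AC into the slot downstream gates read."""
    reconstructed = state["input"]["data"]["reconstructed_ac"]
    # list(...) makes an independent copy, so later edits to one list
    # do not leak into the other.
    state.setdefault("ticket", {})["acceptance_criteria"] = list(reconstructed)
```

The copy matters because a shared list would let a later mutation change both slots at once.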
112
+
113
+ This is intentionally a **boundary** projection, not a parallel field
114
+ of truth: prompt envelopes carry one canonical AC list, and downstream
115
+ gates read the same slot regardless of envelope kind.
116
+
117
+ ## Golden-Transcript contract
118
+
119
+ Phase 5 captured `GT-P1..GT-P4` against the live engine and pinned them
120
+ alongside the R1 goldens:
121
+
122
+ - `GT-P1` — high-band happy path (6 cycles → exit 0)
123
+ - `GT-P2` — medium-band release after assumption confirmation (7 cycles → exit 0)
124
+ - `GT-P3` — low-band one-question halt (2 cycles → exit 1)
125
+ - `GT-P4` — UI-intent rejection with R3 pointer (2 cycles → exit 1)
126
+
127
+ `task golden-replay` runs all 9 transcripts (R1 + R2) on every PR. The
128
+ R1 freeze-guard from
129
+ [`adr-work-engine-rename.md`](adr-work-engine-rename.md) stays in force
130
+ — GT-1..GT-5 remain byte-equal across the R2 changes.
131
+
132
+ ## Deferred to Roadmap 3
133
+
134
+ R2 deliberately does **not** ship:
135
+
136
+ - `directives/ui/` and `directives/mixed/` — UI-shaped prompts are
137
+ rejected at the band-action gate (`stack_data` scores `0`) with a
138
+ `@agent-directive: r3-pointer` halt.
139
+ - Existing-UI-audit pre-step, design-review polish loop, microcopy /
140
+ a11y / states directives.
141
+ - `input.kind="diff"` and `input.kind="file"` resolvers.
142
+
143
+ The rejection path (`GT-P4`) pins this boundary so an R3 dispatcher
144
+ addition doesn't silently widen `/work`'s surface.
145
+
146
+ ## Tradeoffs accepted
147
+
148
+ - **Two top-level commands** sharing a state file. Mitigated by the
149
+ envelope-collision halt: a `.work-state.json` carrying `input.kind="ticket"`
150
+ refuses an incoming `/work` invocation, and vice versa.
151
+ - **Heuristic scorer drift** as the rubric matures — `goal_clarity` is
152
+ the most likely dimension to be retuned. Mitigated by the
153
+ freeze-guard: every threshold change re-captures `GT-P1..P4` and a
154
+ PR reviewer signs off.
155
+ - **No telemetry** — confidence scores are not aggregated. A
156
+ false-medium / false-low rate would only surface through user
157
+ reports. Telemetry is deferred indefinitely (roadmap "Future-track
158
+ recipe — deferred").
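
The envelope-collision halt from the first tradeoff can be sketched as a check on the loaded state. The function name, the location of `input.kind` inside `.work-state.json`, and the halt message are assumptions made for illustration.

```python
from typing import Optional


def envelope_collision(existing_state: Optional[dict], incoming_kind: str) -> Optional[str]:
    """Return a halt message when the state file belongs to the other command."""
    if existing_state is None:
        return None  # no state file yet, nothing to collide with
    existing_kind = existing_state.get("input", {}).get("kind")
    if existing_kind is None or existing_kind == incoming_kind:
        return None
    return (
        f"state file carries input.kind={existing_kind!r}; "
        f"refusing incoming {incoming_kind!r} envelope"
    )
```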
159
+
160
+ ## Non-goals
161
+
162
+ - Does **not** change `/implement-ticket` behavior — R1 contract pinned
163
+ by `GT-1..GT-5`.
164
+ - Does **not** introduce auto-git operations (`/commit`, `/create-pr`
165
+ remain user-gated per `scope-control`).
166
+ - Does **not** pin a removal date for any R1 surface.
167
+ - Does **not** route UI-intent prompts — that's R3.
168
+
169
+ ## Consequences — unblocks
170
+
171
+ - **R3** (`road-to-product-ui-track.md`) can register `directives/ui/`
172
+ and `directives/mixed/` against the existing dispatcher; the
173
+ envelope shape and the band-action gate are already in place.
174
+ - Future entrypoints (`/refactor`, `/spike`, …) can compose by
175
+ building a third envelope kind without touching the dispatch core.
176
+ - Memory entries cited from prompt runs land under the same retrieval
177
+ surface as ticket runs — the engineering-memory contract is
178
+ envelope-agnostic.
179
+
180
+ ## Follow-ups (not part of this ADR)
181
+
182
+ - Tune confidence thresholds against real-world prompts as usage
183
+ grows; refresh `GT-P1..P4` on any change.
184
+ - Decide whether `assumptions_confirmed` should expire on long-running
185
+ flows (currently sticky for the lifetime of the state file).
186
+ - Revisit telemetry once R3 lands — UI directives may produce richer
187
+ signals than backend-only flows.