@zigrivers/scaffold 2.1.2 → 2.38.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (391)
  1. package/README.md +505 -119
  2. package/dist/cli/commands/build.d.ts.map +1 -1
  3. package/dist/cli/commands/build.js +94 -14
  4. package/dist/cli/commands/build.js.map +1 -1
  5. package/dist/cli/commands/build.test.js +30 -5
  6. package/dist/cli/commands/build.test.js.map +1 -1
  7. package/dist/cli/commands/check.d.ts +12 -0
  8. package/dist/cli/commands/check.d.ts.map +1 -0
  9. package/dist/cli/commands/check.js +311 -0
  10. package/dist/cli/commands/check.js.map +1 -0
  11. package/dist/cli/commands/check.test.d.ts +2 -0
  12. package/dist/cli/commands/check.test.d.ts.map +1 -0
  13. package/dist/cli/commands/check.test.js +412 -0
  14. package/dist/cli/commands/check.test.js.map +1 -0
  15. package/dist/cli/commands/complete.d.ts +12 -0
  16. package/dist/cli/commands/complete.d.ts.map +1 -0
  17. package/dist/cli/commands/complete.js +101 -0
  18. package/dist/cli/commands/complete.js.map +1 -0
  19. package/dist/cli/commands/complete.test.d.ts +2 -0
  20. package/dist/cli/commands/complete.test.d.ts.map +1 -0
  21. package/dist/cli/commands/complete.test.js +133 -0
  22. package/dist/cli/commands/complete.test.js.map +1 -0
  23. package/dist/cli/commands/dashboard.d.ts.map +1 -1
  24. package/dist/cli/commands/dashboard.js +12 -8
  25. package/dist/cli/commands/dashboard.js.map +1 -1
  26. package/dist/cli/commands/info.d.ts.map +1 -1
  27. package/dist/cli/commands/info.js +4 -0
  28. package/dist/cli/commands/info.js.map +1 -1
  29. package/dist/cli/commands/knowledge.d.ts.map +1 -1
  30. package/dist/cli/commands/knowledge.js +6 -2
  31. package/dist/cli/commands/knowledge.js.map +1 -1
  32. package/dist/cli/commands/knowledge.test.js +16 -11
  33. package/dist/cli/commands/knowledge.test.js.map +1 -1
  34. package/dist/cli/commands/next.d.ts.map +1 -1
  35. package/dist/cli/commands/next.js +41 -13
  36. package/dist/cli/commands/next.js.map +1 -1
  37. package/dist/cli/commands/next.test.js +3 -0
  38. package/dist/cli/commands/next.test.js.map +1 -1
  39. package/dist/cli/commands/reset.d.ts +1 -0
  40. package/dist/cli/commands/reset.d.ts.map +1 -1
  41. package/dist/cli/commands/reset.js +179 -67
  42. package/dist/cli/commands/reset.js.map +1 -1
  43. package/dist/cli/commands/reset.test.js +360 -0
  44. package/dist/cli/commands/reset.test.js.map +1 -1
  45. package/dist/cli/commands/rework.d.ts +20 -0
  46. package/dist/cli/commands/rework.d.ts.map +1 -0
  47. package/dist/cli/commands/rework.js +332 -0
  48. package/dist/cli/commands/rework.js.map +1 -0
  49. package/dist/cli/commands/rework.test.d.ts +2 -0
  50. package/dist/cli/commands/rework.test.d.ts.map +1 -0
  51. package/dist/cli/commands/rework.test.js +297 -0
  52. package/dist/cli/commands/rework.test.js.map +1 -0
  53. package/dist/cli/commands/run.d.ts.map +1 -1
  54. package/dist/cli/commands/run.js +59 -31
  55. package/dist/cli/commands/run.js.map +1 -1
  56. package/dist/cli/commands/run.test.js +288 -6
  57. package/dist/cli/commands/run.test.js.map +1 -1
  58. package/dist/cli/commands/skill.d.ts +12 -0
  59. package/dist/cli/commands/skill.d.ts.map +1 -0
  60. package/dist/cli/commands/skill.js +123 -0
  61. package/dist/cli/commands/skill.js.map +1 -0
  62. package/dist/cli/commands/skill.test.d.ts +2 -0
  63. package/dist/cli/commands/skill.test.d.ts.map +1 -0
  64. package/dist/cli/commands/skill.test.js +297 -0
  65. package/dist/cli/commands/skill.test.js.map +1 -0
  66. package/dist/cli/commands/skip.d.ts +1 -1
  67. package/dist/cli/commands/skip.d.ts.map +1 -1
  68. package/dist/cli/commands/skip.js +123 -57
  69. package/dist/cli/commands/skip.js.map +1 -1
  70. package/dist/cli/commands/skip.test.js +91 -0
  71. package/dist/cli/commands/skip.test.js.map +1 -1
  72. package/dist/cli/commands/status.d.ts +1 -0
  73. package/dist/cli/commands/status.d.ts.map +1 -1
  74. package/dist/cli/commands/status.js +57 -10
  75. package/dist/cli/commands/status.js.map +1 -1
  76. package/dist/cli/commands/status.test.js +81 -0
  77. package/dist/cli/commands/status.test.js.map +1 -1
  78. package/dist/cli/commands/update.test.js +252 -0
  79. package/dist/cli/commands/update.test.js.map +1 -1
  80. package/dist/cli/commands/version.test.js +171 -1
  81. package/dist/cli/commands/version.test.js.map +1 -1
  82. package/dist/cli/index.d.ts.map +1 -1
  83. package/dist/cli/index.js +8 -0
  84. package/dist/cli/index.js.map +1 -1
  85. package/dist/core/adapters/adapter.d.ts +14 -0
  86. package/dist/core/adapters/adapter.d.ts.map +1 -1
  87. package/dist/core/adapters/adapter.js.map +1 -1
  88. package/dist/core/adapters/adapter.test.js +10 -0
  89. package/dist/core/adapters/adapter.test.js.map +1 -1
  90. package/dist/core/adapters/claude-code.d.ts.map +1 -1
  91. package/dist/core/adapters/claude-code.js +47 -10
  92. package/dist/core/adapters/claude-code.js.map +1 -1
  93. package/dist/core/adapters/claude-code.test.js +41 -20
  94. package/dist/core/adapters/claude-code.test.js.map +1 -1
  95. package/dist/core/adapters/codex.d.ts.map +1 -1
  96. package/dist/core/adapters/codex.js +5 -1
  97. package/dist/core/adapters/codex.js.map +1 -1
  98. package/dist/core/adapters/codex.test.js +5 -0
  99. package/dist/core/adapters/codex.test.js.map +1 -1
  100. package/dist/core/adapters/universal.d.ts.map +1 -1
  101. package/dist/core/adapters/universal.js +0 -1
  102. package/dist/core/adapters/universal.js.map +1 -1
  103. package/dist/core/adapters/universal.test.js +5 -0
  104. package/dist/core/adapters/universal.test.js.map +1 -1
  105. package/dist/core/assembly/context-gatherer.d.ts.map +1 -1
  106. package/dist/core/assembly/context-gatherer.js +5 -2
  107. package/dist/core/assembly/context-gatherer.js.map +1 -1
  108. package/dist/core/assembly/engine.d.ts.map +1 -1
  109. package/dist/core/assembly/engine.js +10 -2
  110. package/dist/core/assembly/engine.js.map +1 -1
  111. package/dist/core/assembly/engine.test.js +19 -0
  112. package/dist/core/assembly/engine.test.js.map +1 -1
  113. package/dist/core/assembly/knowledge-loader.d.ts +25 -0
  114. package/dist/core/assembly/knowledge-loader.d.ts.map +1 -1
  115. package/dist/core/assembly/knowledge-loader.js +75 -2
  116. package/dist/core/assembly/knowledge-loader.js.map +1 -1
  117. package/dist/core/assembly/knowledge-loader.test.js +388 -1
  118. package/dist/core/assembly/knowledge-loader.test.js.map +1 -1
  119. package/dist/core/assembly/meta-prompt-loader.d.ts +6 -0
  120. package/dist/core/assembly/meta-prompt-loader.d.ts.map +1 -1
  121. package/dist/core/assembly/meta-prompt-loader.js +41 -25
  122. package/dist/core/assembly/meta-prompt-loader.js.map +1 -1
  123. package/dist/core/assembly/preset-loader.d.ts +10 -0
  124. package/dist/core/assembly/preset-loader.d.ts.map +1 -1
  125. package/dist/core/assembly/preset-loader.js +26 -1
  126. package/dist/core/assembly/preset-loader.js.map +1 -1
  127. package/dist/core/assembly/preset-loader.test.js +65 -1
  128. package/dist/core/assembly/preset-loader.test.js.map +1 -1
  129. package/dist/core/assembly/update-mode.d.ts.map +1 -1
  130. package/dist/core/assembly/update-mode.js +10 -4
  131. package/dist/core/assembly/update-mode.js.map +1 -1
  132. package/dist/core/assembly/update-mode.test.js +47 -0
  133. package/dist/core/assembly/update-mode.test.js.map +1 -1
  134. package/dist/core/dependency/dependency.d.ts.map +1 -1
  135. package/dist/core/dependency/dependency.js +3 -2
  136. package/dist/core/dependency/dependency.js.map +1 -1
  137. package/dist/core/dependency/dependency.test.js +2 -0
  138. package/dist/core/dependency/dependency.test.js.map +1 -1
  139. package/dist/core/dependency/eligibility.js +3 -3
  140. package/dist/core/dependency/eligibility.js.map +1 -1
  141. package/dist/core/dependency/eligibility.test.js +2 -0
  142. package/dist/core/dependency/eligibility.test.js.map +1 -1
  143. package/dist/core/dependency/graph.d.ts.map +1 -1
  144. package/dist/core/dependency/graph.js +4 -0
  145. package/dist/core/dependency/graph.js.map +1 -1
  146. package/dist/core/dependency/graph.test.d.ts +2 -0
  147. package/dist/core/dependency/graph.test.d.ts.map +1 -0
  148. package/dist/core/dependency/graph.test.js +262 -0
  149. package/dist/core/dependency/graph.test.js.map +1 -0
  150. package/dist/core/rework/phase-selector.d.ts +24 -0
  151. package/dist/core/rework/phase-selector.d.ts.map +1 -0
  152. package/dist/core/rework/phase-selector.js +98 -0
  153. package/dist/core/rework/phase-selector.js.map +1 -0
  154. package/dist/core/rework/phase-selector.test.d.ts +2 -0
  155. package/dist/core/rework/phase-selector.test.d.ts.map +1 -0
  156. package/dist/core/rework/phase-selector.test.js +138 -0
  157. package/dist/core/rework/phase-selector.test.js.map +1 -0
  158. package/dist/dashboard/generator.d.ts +48 -17
  159. package/dist/dashboard/generator.d.ts.map +1 -1
  160. package/dist/dashboard/generator.js +75 -5
  161. package/dist/dashboard/generator.js.map +1 -1
  162. package/dist/dashboard/generator.test.js +213 -5
  163. package/dist/dashboard/generator.test.js.map +1 -1
  164. package/dist/dashboard/template.d.ts +1 -1
  165. package/dist/dashboard/template.d.ts.map +1 -1
  166. package/dist/dashboard/template.js +755 -114
  167. package/dist/dashboard/template.js.map +1 -1
  168. package/dist/e2e/knowledge.test.js +4 -3
  169. package/dist/e2e/knowledge.test.js.map +1 -1
  170. package/dist/e2e/pipeline.test.js +2 -0
  171. package/dist/e2e/pipeline.test.js.map +1 -1
  172. package/dist/e2e/rework.test.d.ts +6 -0
  173. package/dist/e2e/rework.test.d.ts.map +1 -0
  174. package/dist/e2e/rework.test.js +226 -0
  175. package/dist/e2e/rework.test.js.map +1 -0
  176. package/dist/index.js +0 -0
  177. package/dist/project/adopt.test.js +2 -0
  178. package/dist/project/adopt.test.js.map +1 -1
  179. package/dist/project/claude-md.js +2 -2
  180. package/dist/project/claude-md.js.map +1 -1
  181. package/dist/project/claude-md.test.js +4 -4
  182. package/dist/project/claude-md.test.js.map +1 -1
  183. package/dist/project/detector.d.ts.map +1 -1
  184. package/dist/project/detector.js +4 -1
  185. package/dist/project/detector.js.map +1 -1
  186. package/dist/project/frontmatter.d.ts.map +1 -1
  187. package/dist/project/frontmatter.js +54 -15
  188. package/dist/project/frontmatter.js.map +1 -1
  189. package/dist/project/frontmatter.test.js +2 -2
  190. package/dist/project/frontmatter.test.js.map +1 -1
  191. package/dist/state/rework-manager.d.ts +16 -0
  192. package/dist/state/rework-manager.d.ts.map +1 -0
  193. package/dist/state/rework-manager.js +126 -0
  194. package/dist/state/rework-manager.js.map +1 -0
  195. package/dist/state/rework-manager.test.d.ts +2 -0
  196. package/dist/state/rework-manager.test.d.ts.map +1 -0
  197. package/dist/state/rework-manager.test.js +191 -0
  198. package/dist/state/rework-manager.test.js.map +1 -0
  199. package/dist/state/state-manager.d.ts +13 -0
  200. package/dist/state/state-manager.d.ts.map +1 -1
  201. package/dist/state/state-manager.js +39 -2
  202. package/dist/state/state-manager.js.map +1 -1
  203. package/dist/state/state-manager.test.js +74 -1
  204. package/dist/state/state-manager.test.js.map +1 -1
  205. package/dist/state/state-migration.d.ts +23 -0
  206. package/dist/state/state-migration.d.ts.map +1 -0
  207. package/dist/state/state-migration.js +144 -0
  208. package/dist/state/state-migration.js.map +1 -0
  209. package/dist/state/state-migration.test.d.ts +2 -0
  210. package/dist/state/state-migration.test.d.ts.map +1 -0
  211. package/dist/state/state-migration.test.js +451 -0
  212. package/dist/state/state-migration.test.js.map +1 -0
  213. package/dist/types/assembly.d.ts +2 -0
  214. package/dist/types/assembly.d.ts.map +1 -1
  215. package/dist/types/dependency.d.ts +2 -2
  216. package/dist/types/dependency.d.ts.map +1 -1
  217. package/dist/types/frontmatter.d.ts +100 -7
  218. package/dist/types/frontmatter.d.ts.map +1 -1
  219. package/dist/types/frontmatter.js +89 -1
  220. package/dist/types/frontmatter.js.map +1 -1
  221. package/dist/types/index.d.ts +1 -0
  222. package/dist/types/index.d.ts.map +1 -1
  223. package/dist/types/index.js +1 -0
  224. package/dist/types/index.js.map +1 -1
  225. package/dist/types/lock.d.ts +1 -1
  226. package/dist/types/lock.d.ts.map +1 -1
  227. package/dist/types/rework.d.ts +36 -0
  228. package/dist/types/rework.d.ts.map +1 -0
  229. package/dist/types/rework.js +2 -0
  230. package/dist/types/rework.js.map +1 -0
  231. package/dist/utils/errors.d.ts +1 -0
  232. package/dist/utils/errors.d.ts.map +1 -1
  233. package/dist/utils/errors.js +8 -0
  234. package/dist/utils/errors.js.map +1 -1
  235. package/dist/utils/fs.d.ts +6 -0
  236. package/dist/utils/fs.d.ts.map +1 -1
  237. package/dist/utils/fs.js +13 -0
  238. package/dist/utils/fs.js.map +1 -1
  239. package/dist/validation/config-validator.test.d.ts +2 -0
  240. package/dist/validation/config-validator.test.d.ts.map +1 -0
  241. package/dist/validation/config-validator.test.js +210 -0
  242. package/dist/validation/config-validator.test.js.map +1 -0
  243. package/dist/validation/dependency-validator.test.d.ts +2 -0
  244. package/dist/validation/dependency-validator.test.d.ts.map +1 -0
  245. package/dist/validation/dependency-validator.test.js +215 -0
  246. package/dist/validation/dependency-validator.test.js.map +1 -0
  247. package/dist/validation/frontmatter-validator.test.d.ts +2 -0
  248. package/dist/validation/frontmatter-validator.test.d.ts.map +1 -0
  249. package/dist/validation/frontmatter-validator.test.js +371 -0
  250. package/dist/validation/frontmatter-validator.test.js.map +1 -0
  251. package/dist/validation/state-validator.test.d.ts +2 -0
  252. package/dist/validation/state-validator.test.d.ts.map +1 -0
  253. package/dist/validation/state-validator.test.js +325 -0
  254. package/dist/validation/state-validator.test.js.map +1 -0
  255. package/dist/wizard/suggestion.test.d.ts +2 -0
  256. package/dist/wizard/suggestion.test.d.ts.map +1 -0
  257. package/dist/wizard/suggestion.test.js +115 -0
  258. package/dist/wizard/suggestion.test.js.map +1 -0
  259. package/dist/wizard/wizard.d.ts.map +1 -1
  260. package/dist/wizard/wizard.js +34 -1
  261. package/dist/wizard/wizard.js.map +1 -1
  262. package/knowledge/core/adr-craft.md +57 -0
  263. package/knowledge/core/ai-memory-management.md +246 -0
  264. package/knowledge/core/api-design.md +8 -0
  265. package/knowledge/core/automated-review-tooling.md +203 -0
  266. package/knowledge/core/claude-md-patterns.md +254 -0
  267. package/knowledge/core/coding-conventions.md +246 -0
  268. package/knowledge/core/database-design.md +8 -0
  269. package/knowledge/core/design-system-tokens.md +469 -0
  270. package/knowledge/core/dev-environment.md +223 -0
  271. package/knowledge/core/domain-modeling.md +8 -0
  272. package/knowledge/core/eval-craft.md +1008 -0
  273. package/knowledge/core/git-workflow-patterns.md +200 -0
  274. package/knowledge/core/multi-model-review-dispatch.md +250 -0
  275. package/knowledge/core/operations-runbook.md +40 -225
  276. package/knowledge/core/project-structure-patterns.md +231 -0
  277. package/knowledge/core/review-step-template.md +247 -0
  278. package/knowledge/core/{security-review.md → security-best-practices.md} +9 -1
  279. package/knowledge/core/system-architecture.md +5 -1
  280. package/knowledge/core/task-decomposition.md +174 -36
  281. package/knowledge/core/task-tracking.md +225 -0
  282. package/knowledge/core/tech-stack-selection.md +214 -0
  283. package/knowledge/core/testing-strategy.md +63 -70
  284. package/knowledge/core/user-stories.md +69 -60
  285. package/knowledge/core/user-story-innovation.md +70 -0
  286. package/knowledge/core/ux-specification.md +18 -148
  287. package/knowledge/execution/enhancement-workflow.md +201 -0
  288. package/knowledge/execution/task-claiming-strategy.md +130 -0
  289. package/knowledge/execution/tdd-execution-loop.md +172 -0
  290. package/knowledge/execution/worktree-management.md +205 -0
  291. package/knowledge/finalization/apply-fixes-and-freeze.md +177 -14
  292. package/knowledge/finalization/developer-onboarding.md +4 -0
  293. package/knowledge/finalization/implementation-playbook.md +83 -5
  294. package/knowledge/product/gap-analysis.md +5 -1
  295. package/knowledge/product/prd-craft.md +55 -34
  296. package/knowledge/product/prd-innovation.md +12 -0
  297. package/knowledge/product/vision-craft.md +213 -0
  298. package/knowledge/review/review-adr.md +44 -0
  299. package/knowledge/review/{review-api-contracts.md → review-api-design.md} +47 -1
  300. package/knowledge/review/{review-database-schema.md → review-database-design.md} +40 -1
  301. package/knowledge/review/review-domain-modeling.md +38 -1
  302. package/knowledge/review/review-implementation-tasks.md +108 -1
  303. package/knowledge/review/review-methodology.md +11 -0
  304. package/knowledge/review/review-operations.md +67 -0
  305. package/knowledge/review/review-prd.md +46 -0
  306. package/knowledge/review/review-security.md +65 -0
  307. package/knowledge/review/review-system-architecture.md +32 -2
  308. package/knowledge/review/review-testing-strategy.md +62 -0
  309. package/knowledge/review/review-user-stories.md +65 -0
  310. package/knowledge/review/{review-ux-spec.md → review-ux-specification.md} +50 -2
  311. package/knowledge/review/review-vision.md +255 -0
  312. package/knowledge/tools/release-management.md +222 -0
  313. package/knowledge/tools/session-analysis.md +215 -0
  314. package/knowledge/tools/version-strategy.md +200 -0
  315. package/knowledge/validation/critical-path-analysis.md +1 -1
  316. package/knowledge/validation/cross-phase-consistency.md +12 -0
  317. package/knowledge/validation/decision-completeness.md +13 -1
  318. package/knowledge/validation/dependency-validation.md +12 -0
  319. package/knowledge/validation/scope-management.md +12 -0
  320. package/knowledge/validation/traceability.md +12 -0
  321. package/methodology/README.md +37 -0
  322. package/methodology/custom-defaults.yml +44 -4
  323. package/methodology/deep.yml +43 -3
  324. package/methodology/mvp.yml +43 -3
  325. package/package.json +4 -3
  326. package/pipeline/architecture/review-architecture.md +36 -13
  327. package/pipeline/architecture/system-architecture.md +24 -9
  328. package/pipeline/build/multi-agent-resume.md +245 -0
  329. package/pipeline/build/multi-agent-start.md +236 -0
  330. package/pipeline/build/new-enhancement.md +456 -0
  331. package/pipeline/build/quick-task.md +381 -0
  332. package/pipeline/build/single-agent-resume.md +210 -0
  333. package/pipeline/build/single-agent-start.md +207 -0
  334. package/pipeline/consolidation/claude-md-optimization.md +76 -0
  335. package/pipeline/consolidation/workflow-audit.md +77 -0
  336. package/pipeline/decisions/adrs.md +21 -7
  337. package/pipeline/decisions/review-adrs.md +32 -11
  338. package/pipeline/environment/ai-memory-setup.md +76 -0
  339. package/pipeline/environment/automated-pr-review.md +76 -0
  340. package/pipeline/environment/design-system.md +75 -0
  341. package/pipeline/environment/dev-env-setup.md +68 -0
  342. package/pipeline/environment/git-workflow.md +73 -0
  343. package/pipeline/finalization/apply-fixes-and-freeze.md +17 -6
  344. package/pipeline/finalization/developer-onboarding-guide.md +23 -9
  345. package/pipeline/finalization/implementation-playbook.md +43 -14
  346. package/pipeline/foundation/beads.md +71 -0
  347. package/pipeline/foundation/coding-standards.md +71 -0
  348. package/pipeline/foundation/project-structure.md +73 -0
  349. package/pipeline/foundation/tdd.md +64 -0
  350. package/pipeline/foundation/tech-stack.md +74 -0
  351. package/pipeline/integration/add-e2e-testing.md +80 -0
  352. package/pipeline/modeling/domain-modeling.md +23 -8
  353. package/pipeline/modeling/review-domain-modeling.md +35 -11
  354. package/pipeline/parity/platform-parity-review.md +90 -0
  355. package/pipeline/planning/implementation-plan-review.md +67 -0
  356. package/pipeline/planning/implementation-plan.md +110 -0
  357. package/pipeline/pre/create-prd.md +34 -10
  358. package/pipeline/pre/innovate-prd.md +46 -15
  359. package/pipeline/pre/innovate-user-stories.md +47 -14
  360. package/pipeline/pre/review-prd.md +29 -8
  361. package/pipeline/pre/review-user-stories.md +34 -8
  362. package/pipeline/pre/user-stories.md +23 -8
  363. package/pipeline/quality/create-evals.md +106 -0
  364. package/pipeline/quality/operations.md +46 -17
  365. package/pipeline/quality/review-operations.md +32 -11
  366. package/pipeline/quality/review-security.md +34 -12
  367. package/pipeline/quality/review-testing.md +37 -14
  368. package/pipeline/quality/security.md +36 -10
  369. package/pipeline/quality/story-tests.md +75 -0
  370. package/pipeline/specification/api-contracts.md +28 -8
  371. package/pipeline/specification/database-schema.md +29 -8
  372. package/pipeline/specification/review-api.md +32 -11
  373. package/pipeline/specification/review-database.md +32 -11
  374. package/pipeline/specification/review-ux.md +34 -12
  375. package/pipeline/specification/ux-spec.md +35 -13
  376. package/pipeline/validation/critical-path-walkthrough.md +45 -11
  377. package/pipeline/validation/cross-phase-consistency.md +45 -11
  378. package/pipeline/validation/decision-completeness.md +45 -11
  379. package/pipeline/validation/dependency-graph-validation.md +46 -11
  380. package/pipeline/validation/implementability-dry-run.md +46 -11
  381. package/pipeline/validation/scope-creep-check.md +46 -11
  382. package/pipeline/validation/traceability-matrix.md +51 -11
  383. package/pipeline/vision/create-vision.md +267 -0
  384. package/pipeline/vision/innovate-vision.md +157 -0
  385. package/pipeline/vision/review-vision.md +149 -0
  386. package/skills/multi-model-dispatch/SKILL.md +326 -0
  387. package/skills/scaffold-pipeline/SKILL.md +210 -0
  388. package/skills/scaffold-runner/SKILL.md +619 -0
  389. package/pipeline/planning/implementation-tasks.md +0 -57
  390. package/pipeline/planning/review-tasks.md +0 -38
  391. package/pipeline/quality/testing-strategy.md +0 -42
@@ -0,0 +1,247 @@
1
+ ---
2
+ name: review-step-template
3
+ description: Shared template pattern for review pipeline steps including multi-model dispatch, finding severity, and resolution workflow
4
+ topics: [review, template, multi-model, quality-gates, methodology]
5
+ ---
6
+
7
+ # Review Step Template
8
+
9
+ ## Summary
10
+
11
+ This entry documents the common structure shared by all 15+ review pipeline steps. Individual review steps customize this structure with artifact-specific failure modes and review passes, but the scaffolding is consistent across all reviews.
12
+
13
+ **Purpose pattern**: Every review step targets domain-specific failure modes for a given artifact — not generic quality checks. Each pass has a specific focus, concrete checking instructions, and example findings.
14
+
15
+ **Standard inputs**: The primary artifact being reviewed, upstream artifacts for cross-reference validation, and the `review-methodology` knowledge entry plus an artifact-specific review knowledge entry.
16
+
17
+ **Standard outputs**: Review document (`docs/reviews/review-{artifact}.md`), updated primary artifact with P0/P1 fixes applied, and at depth 4+: multi-model artifacts (`codex-review.json`, `gemini-review.json`, `review-summary.md`) under `docs/reviews/{artifact}/`.
18
+
19
+ **Finding severity**: P0 (blocking — must fix), P1 (significant — fix before implementation), P2 (improvement — fix if time permits), P3 (nitpick — log for later).
20
+
21
+ **Methodology scaling**: Depth 1-2 runs top passes only (P0 focus). Depth 3 runs all passes. Depth 4-5 adds multi-model dispatch to Codex/Gemini with finding synthesis.
22
+
23
+ **Mode detection**: First review runs all passes from scratch. Re-review preserves prior findings, marks resolved ones, and reports NEW/EXISTING/RESOLVED status.
24
+
25
+ **Frontmatter conventions**: A review step's `order` is its creation step's order + 10; review steps always include `review-methodology` in `knowledge-base` and are never conditional.
26
+
27
+ ## Deep Guidance
28
+
29
+ ### Purpose Pattern
30
+
31
+ Every review step follows the pattern:
32
+
33
+ > Review **[artifact]** targeting **[domain]**-specific failure modes.
34
+
35
+ The review does not check generic quality ("is this document complete?"). Instead, it runs artifact-specific passes that target the known ways that artifact type fails. Each pass has a specific focus, concrete checking instructions, and example findings.
36
+
37
+ ### Standard Inputs
38
+
39
+ Every review step reads:
40
+ - **Primary artifact**: The document being reviewed (e.g., `docs/domain-models.md`, `docs/api-contracts.md`)
41
+ - **Upstream artifacts**: Documents the primary artifact was built from (e.g., PRD, domain models, ADRs) -- used for cross-reference validation
42
+ - **Knowledge base entries**: `review-methodology` (shared process) + artifact-specific review knowledge (e.g., `review-api-design`, `review-database-design`)
43
+
44
+ ### Standard Outputs
45
+
46
+ Every review step produces:
47
+ - **Review document**: `docs/reviews/review-{artifact}.md` -- findings organized by pass, with severity and trace information
48
+ - **Updated artifact**: The primary artifact with fixes applied for P0/P1 findings
49
+ - **Depth 4+ multi-model artifacts** (when methodology depth >= 4):
50
+   - `docs/reviews/{artifact}/codex-review.json` -- Codex independent review findings
51
+   - `docs/reviews/{artifact}/gemini-review.json` -- Gemini independent review findings
52
+   - `docs/reviews/{artifact}/review-summary.md` -- Synthesized findings from all models
53
+
54
+ ### Finding Severity Levels
55
+
56
+ All review steps use the same four-level severity scale:
57
+
58
+ | Level | Name | Meaning | Action |
59
+ |-------|------|---------|--------|
60
+ | P0 | Blocking | Cannot proceed to downstream steps without fixing | Must fix before moving on |
61
+ | P1 | Significant | Downstream steps can proceed but will encounter problems | Fix before implementation |
62
+ | P2 | Improvement | Artifact works but could be better | Fix if time permits |
63
+ | P3 | Nitpick | Style or preference | Log for future cleanup |
64
+
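The severity table above can be sketched as a small lookup. This is a minimal illustration, not code from the package: the names `Severity`, `REQUIRED_ACTION`, `blocks_downstream`, and `highest_first` are hypothetical.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Four-level scale from the table; a lower value is more severe."""
    P0 = 0  # Blocking
    P1 = 1  # Significant
    P2 = 2  # Improvement
    P3 = 3  # Nitpick


# Action column of the table, keyed by level.
REQUIRED_ACTION = {
    Severity.P0: "Must fix before moving on",
    Severity.P1: "Fix before implementation",
    Severity.P2: "Fix if time permits",
    Severity.P3: "Log for future cleanup",
}


def blocks_downstream(severity: Severity) -> bool:
    """Only P0 findings stop downstream steps from proceeding."""
    return severity is Severity.P0


def highest_first(findings):
    """Sort (finding_id, Severity) pairs with the most severe first."""
    return sorted(findings, key=lambda item: item[1])
```

Using an ordered enum keeps "highest severity first" sorting trivial, which matches how findings are listed within each pass.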
65
+ ### Finding Format
66
+
67
+ Each finding includes:
68
+ - **Pass**: Which review pass discovered it (e.g., "Pass 3 -- Auth/AuthZ Coverage")
69
+ - **Priority**: P0-P3
70
+ - **Location**: Specific section, line, or element in the artifact
71
+ - **Issue**: What is wrong, with concrete details
72
+ - **Impact**: What goes wrong downstream if this is not fixed
73
+ - **Recommendation**: Specific fix, not just "fix this"
74
+ - **Trace**: Link back to upstream artifact that establishes the requirement (e.g., "PRD Section 3.2 -> Architecture DF-005")
75
+
76
+ ### Example Finding
77
+
78
+ ```markdown
79
+ ### Finding F-003 (P1)
80
+ - **Pass**: Pass 2 — Entity Coverage
81
+ - **Location**: docs/domain-models/order.md, Section "Order Aggregate"
82
+ - **Issue**: Order aggregate does not include a `cancellationReason` field, but PRD
83
+ Section 4.1 requires cancellation reason tracking for analytics.
84
+ - **Impact**: Implementation will lack cancellation reason; analytics pipeline will
85
+ receive null values, causing dashboard gaps.
86
+ - **Recommendation**: Add `cancellationReason: CancellationReason` value object to
87
+ Order aggregate with enum values: USER_REQUEST, PAYMENT_FAILED, OUT_OF_STOCK,
88
+ ADMIN_ACTION.
89
+ - **Trace**: PRD §4.1 → User Story US-014 → Domain Model: Order Aggregate
90
+ ```
91
+
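As a rough sketch, the finding fields listed above could be modelled as a record that renders itself in the example's layout. The `Finding` class, its field names, and the validation rules are assumptions made for illustration; the package does not necessarily represent findings this way.

```python
from dataclasses import dataclass, fields

VALID_PRIORITIES = ("P0", "P1", "P2", "P3")


@dataclass
class Finding:
    finding_id: str
    priority: str
    review_pass: str
    location: str
    issue: str
    impact: str
    recommendation: str
    trace: str

    def __post_init__(self):
        # Reject priorities outside the four-level scale.
        if self.priority not in VALID_PRIORITIES:
            raise ValueError(f"unknown priority: {self.priority}")
        # Every field is required; an empty one makes the finding unactionable.
        empty = [f.name for f in fields(self) if not getattr(self, f.name).strip()]
        if empty:
            raise ValueError(f"empty fields: {', '.join(empty)}")

    def to_markdown(self) -> str:
        """Render the finding in the bullet layout shown in the example."""
        return "\n".join([
            f"### Finding {self.finding_id} ({self.priority})",
            f"- **Pass**: {self.review_pass}",
            f"- **Location**: {self.location}",
            f"- **Issue**: {self.issue}",
            f"- **Impact**: {self.impact}",
            f"- **Recommendation**: {self.recommendation}",
            f"- **Trace**: {self.trace}",
        ])
```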
92
+ ### Review Document Structure
93
+
94
+ Every review output document follows a consistent structure:
95
+
96
+ ```markdown
97
+ # Review: [Artifact Name]
98
+
99
+ **Date**: YYYY-MM-DD
100
+ **Methodology**: deep | mvp | custom:depth(N)
101
+ **Status**: INITIAL | RE-REVIEW
102
+ **Models**: Claude | Claude + Codex | Claude + Codex + Gemini
103
+
104
+ ## Findings Summary
105
+ - Total findings: N (P0: X, P1: Y, P2: Z, P3: W)
106
+ - Passes run: N of M
107
+ - Artifacts checked: [list]
108
+
109
+ ## Findings by Pass
110
+
111
+ ### Pass 1 — [Pass Name]
112
+ [Findings listed by severity, highest first]
113
+
114
+ ### Pass 2 — [Pass Name]
115
+ ...
116
+
117
+ ## Resolution Log
118
+ | Finding | Severity | Status | Resolution |
119
+ |---------|----------|--------|------------|
120
+ | F-001 | P0 | RESOLVED | Fixed in commit abc123 |
121
+ | F-002 | P1 | EXISTING | Deferred — tracked in ADR-015 |
122
+
123
+ ## Multi-Model Synthesis (depth 4+)
124
+ ### Convergent Findings
125
+ [Issues found by 2+ models — high confidence]
126
+
127
+ ### Divergent Findings
128
+ [Issues found by only one model — requires manual triage]
129
+ ```
130
+
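The convergent/divergent split in the synthesis section of the template can be illustrated as follows. This sketch assumes each model's findings have already been normalized to comparable issue keys (how two models' findings are judged "the same issue" is not specified in the source); `synthesize` is a hypothetical helper.

```python
from collections import defaultdict


def synthesize(findings_by_model):
    """Split issues into convergent (2+ models) and divergent (one model).

    findings_by_model maps a model name to a set of normalized issue keys.
    Returns two dicts mapping issue key -> sorted list of reporting models.
    """
    models_by_issue = defaultdict(set)
    for model, issues in findings_by_model.items():
        for issue in issues:
            models_by_issue[issue].add(model)

    convergent = {}
    divergent = {}
    for issue, models in models_by_issue.items():
        target = convergent if len(models) >= 2 else divergent
        target[issue] = sorted(models)
    return convergent, divergent
```

Convergent entries are the high-confidence ones; divergent entries still need manual triage rather than automatic dismissal.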
131
+ ### Methodology Scaling Pattern
132
+
133
+ Review steps scale their thoroughness based on the methodology depth setting:
134
+
135
+ #### Depth 1-2 (MVP/Minimal)
136
+ - Run only the highest-impact passes (typically passes 1-3)
137
+ - Single-model review only
138
+ - Focus on P0 findings; skip P2/P3
139
+ - Abbreviated finding descriptions
140
+
141
+ #### Depth 3 (Standard)
142
+ - Run all review passes
143
+ - Single-model review
144
+ - Report all severity levels
145
+ - Full finding descriptions with trace information
146
+
147
+ #### Depth 4-5 (Comprehensive)
148
+ - Run all review passes
149
+ - Multi-model dispatch: send the artifact to Codex and Gemini for independent analysis
150
+ - Synthesize findings from all models, flagging convergent findings (multiple models found the same issue) as higher confidence
151
+ - Cross-artifact consistency checks against all upstream documents
152
+ - Full finding descriptions with detailed trace and impact analysis
153
+
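The three depth tiers above amount to a small decision table. The sketch below is one way to express it; `review_plan` and its return keys are hypothetical, and the cap of three passes at depth 1-2 is taken from the "typically passes 1-3" wording rather than a fixed rule.

```python
def review_plan(depth: int, total_passes: int) -> dict:
    """Map a methodology depth (1-5) to the review scope described above."""
    if not 1 <= depth <= 5:
        raise ValueError("depth must be between 1 and 5")

    if depth <= 2:
        # MVP/minimal: highest-impact passes only, P2/P3 skipped.
        return {
            "passes": min(3, total_passes),
            "models": ["Claude"],
            "severities": ["P0", "P1"],
            "cross_artifact_checks": False,
        }
    if depth == 3:
        # Standard: every pass, single model, all severities.
        return {
            "passes": total_passes,
            "models": ["Claude"],
            "severities": ["P0", "P1", "P2", "P3"],
            "cross_artifact_checks": False,
        }
    # Comprehensive: every pass plus multi-model dispatch and synthesis.
    return {
        "passes": total_passes,
        "models": ["Claude", "Codex", "Gemini"],
        "severities": ["P0", "P1", "P2", "P3"],
        "cross_artifact_checks": True,
    }
```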
154
+ ### Depth Scaling Example
155
+
156
+ At depth 2 (MVP), a domain model review might produce:
157
+
158
+ ```markdown
159
+ # Review: Domain Models (MVP)
160
+ ## Findings Summary
161
+ - Total findings: 3 (P0: 1, P1: 2)
162
+ - Passes run: 3 of 10
163
+ ## Findings
164
+ ### F-001 (P0) — Missing aggregate root for Payment bounded context
165
+ ### F-002 (P1) — Order entity lacks status field referenced in user stories
166
+ ### F-003 (P1) — No domain event defined for order completion
167
+ ```
168
+
169
+ At depth 5 (comprehensive), the same review would run all 10 passes, dispatch to
170
+ Codex and Gemini, and produce a full synthesis with 15-30 findings across all
171
+ severity levels.
172
+
173
+ ### Mode Detection Pattern
174
+
175
+ Every review step checks whether this is a first review or a re-review:
176
+
177
+ **First review**: No prior review document exists. Run all passes from scratch.
178
+
179
+ **Re-review**: A prior review document exists (`docs/reviews/review-{artifact}.md`). The step:
180
+ 1. Reads the prior review findings
181
+ 2. Checks which findings were addressed (fixed in the artifact)
182
+ 3. Marks resolved findings as "RESOLVED" rather than removing them
183
+ 4. Runs all passes again looking for new issues or regressions
184
+ 5. Reports findings as "NEW", "EXISTING" (still unfixed), or "RESOLVED"
185
+
186
+ This preserves review history and makes progress visible.
187
+
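The re-review bookkeeping can be sketched as a comparison of finding IDs between the prior review and the current run. This assumes findings keep stable IDs across reviews, which the source does not state; in practice matching may need to compare finding content. `classify_findings` is a hypothetical name.

```python
def classify_findings(prior_ids, current_ids):
    """Label each finding NEW, EXISTING, or RESOLVED for a re-review.

    prior_ids:   IDs of findings open in the prior review document.
    current_ids: IDs of findings raised by the current run of all passes.
    """
    prior = set(prior_ids)
    current = set(current_ids)
    statuses = {}
    for finding_id in prior:
        # Still reported -> unfixed; no longer reported -> keep as RESOLVED.
        statuses[finding_id] = "EXISTING" if finding_id in current else "RESOLVED"
    for finding_id in current - prior:
        statuses[finding_id] = "NEW"
    return statuses
```

Resolved findings stay in the result instead of being dropped, which is what keeps the review history visible.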
188
+ ### Resolution Workflow
189
+
190
+ The standard workflow from review to resolution:
191
+
192
+ 1. **Review**: Run the review step, producing findings
193
+ 2. **Triage**: Categorize findings by severity; confirm P0s are genuine blockers
194
+ 3. **Fix**: Update the primary artifact to address P0 and P1 findings
195
+ 4. **Re-review**: Run the review step again in re-review mode
196
+ 5. **Verify**: Confirm all P0 findings are resolved; P1 findings are resolved or have documented justification for deferral
197
+ 6. **Proceed**: Move to the next pipeline phase
198
+
199
+ For depth 4+ reviews, the multi-model dispatch happens in both the initial review and the re-review, ensuring fixes do not introduce new issues visible to other models.
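The Verify step can be read as a gate over the re-review output. A minimal sketch under stated assumptions: the `Finding` record and its `deferral_reason` field are hypothetical, chosen only to illustrate the rule that P0 findings must be resolved and P1 findings must be resolved or carry a documented justification.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    id: str
    severity: str              # "P0", "P1", ...
    status: str                # "NEW", "EXISTING", or "RESOLVED"
    deferral_reason: str = ""  # hypothetical field for documented deferrals


def can_proceed(findings):
    """Return True when the Verify step would allow moving to the next phase."""
    for f in findings:
        if f.status == "RESOLVED":
            continue
        if f.severity == "P0":
            return False  # an open P0 always blocks
        if f.severity == "P1" and not f.deferral_reason.strip():
            return False  # an open P1 needs a documented justification
    return True
```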
200
+
201
+ ### Frontmatter Pattern
202
+
203
+ Review steps follow a consistent frontmatter structure:
204
+
205
+ ```yaml
206
+ ---
207
+ name: review-{artifact}
208
+ description: "Review {artifact} for completeness, consistency, and downstream readiness"
209
+ phase: "{phase-slug}"
210
+ order: {N}20 # Reviews are always 10 after their creation step
211
+ dependencies: [{creation-step}]
212
+ outputs: [docs/reviews/review-{artifact}.md, docs/reviews/{artifact}/review-summary.md, docs/reviews/{artifact}/codex-review.json, docs/reviews/{artifact}/gemini-review.json]
213
+ conditional: null
214
+ knowledge-base: [review-methodology, review-{artifact-domain}]
215
+ ---
216
+ ```
217
+
218
+ Key conventions:
219
+ - Review steps always have order = creation step order + 10
220
+ - Primary output uses `review-` prefix; multi-model directory uses bare artifact name
221
+ - Knowledge base always includes `review-methodology` plus a domain-specific entry
222
+ - Reviews are never conditional — if the creation step ran, the review runs
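The conventions above are mechanical enough to derive from the creation step. A minimal sketch, assuming a hypothetical `review_step_fields` helper; the artifact and step names in the usage note are made up for illustration.

```python
def review_step_fields(artifact, creation_step, creation_order):
    """Build the conventional frontmatter fields for a review step (sketch)."""
    return {
        "name": f"review-{artifact}",
        "order": creation_order + 10,  # reviews sit 10 after their creation step
        "dependencies": [creation_step],
        "outputs": [
            f"docs/reviews/review-{artifact}.md",           # primary output
            f"docs/reviews/{artifact}/review-summary.md",   # multi-model directory
            f"docs/reviews/{artifact}/codex-review.json",
            f"docs/reviews/{artifact}/gemini-review.json",
        ],
        "conditional": None,  # reviews are never conditional
    }
```

For example, `review_step_fields("domain-models", "domain-modeling", 510)` yields `order: 520` and a primary output of `docs/reviews/review-domain-models.md`.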
223
+
224
+ ### Common Anti-Patterns
225
+
226
+ #### Reviewing Without Upstream Context
227
+ Running a review without loading the upstream artifacts that define requirements.
228
+ The review cannot verify traceability if it does not have the PRD, domain models,
229
+ or ADRs that establish what the artifact should contain.
230
+
231
+ #### Severity Inflation
232
+ Marking everything as P0 to force immediate action. This undermines the severity
233
+ system and causes triage fatigue. Reserve P0 for genuine blockers where downstream
234
+ steps will fail or produce incorrect output.
235
+
236
+ #### Fix Without Re-Review
237
+ Applying fixes to findings without re-running the review. Fixes can introduce new
238
+ issues or incompletely address the original finding. Always re-review after fixes.
239
+
240
+ #### Ignoring Convergent Multi-Model Findings
241
+ When multiple models independently find the same issue, that finding carries high confidence.
242
+ Dismissing convergent findings without strong justification undermines the value
243
+ of multi-model review.
244
+
245
+ #### Removing Prior Findings
246
+ Deleting findings from a re-review output instead of marking them RESOLVED. This
247
+ loses review history and makes it impossible to track what was caught and fixed.
@@ -1,9 +1,11 @@
1
1
  ---
2
- name: security-review
2
+ name: security-best-practices
3
3
  description: OWASP Top 10, authentication, authorization, data protection, and threat modeling
4
4
  topics: [security, owasp, authentication, authorization, threat-modeling, secrets-management, dependency-auditing]
5
5
  ---
6
6
 
7
+ ## Summary
8
+
7
9
  ## OWASP Top 10
8
10
 
9
11
  The OWASP Top 10 represents the most critical security risks to web applications. Every project should evaluate each risk and implement appropriate mitigations.
@@ -55,6 +57,8 @@ Sensitive data exposed due to weak or missing encryption.
55
57
  - Hash passwords with bcrypt, scrypt, or Argon2 (NEVER MD5 or SHA-256 for passwords)
56
58
  - Don't store sensitive data you don't need — the safest data is data you don't have
57
59
 
60
+ ## Deep Guidance
61
+
58
62
  ### A03: Injection
59
63
 
60
64
  Untrusted data sent to an interpreter as part of a command or query, causing unintended execution.
@@ -521,3 +525,7 @@ Protect against compromised dependencies:
521
525
  **No rate limiting.** Login endpoints with unlimited attempts allow brute-force password attacks. API endpoints with no rate limits allow denial of service. Fix: implement rate limiting on all public endpoints. Start with conservative limits. Use exponential backoff for authentication failures.
522
526
 
523
527
  **Ignoring dependency vulnerabilities.** Running `npm audit` shows 47 vulnerabilities but nobody addresses them because "they're all low severity." Fix: set a policy and enforce it in CI. Critical and high vulnerabilities block deployment. Medium vulnerabilities have an SLA for resolution.
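The policy in the last fix could be enforced with a small CI gate. A minimal sketch, assuming the audit tool's report has already been reduced to a count per severity label; `deployment_allowed` and the label strings are hypothetical, not part of any audit tool's API.

```python
# Severities that block deployment under the policy described above.
BLOCKING_SEVERITIES = ("critical", "high")


def deployment_allowed(counts):
    """Return True when no blocking-severity vulnerability is present.

    counts maps a severity label to the number of open vulnerabilities.
    Lower severities do not block here; tracking their SLA is out of scope.
    """
    return all(counts.get(level, 0) == 0 for level in BLOCKING_SEVERITIES)
```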
528
+
529
+ ## See Also
530
+
531
+ - [operations-runbook](../core/operations-runbook.md) — Logging and monitoring sensitive data
@@ -1,9 +1,11 @@
1
1
  ---
2
2
  name: system-architecture
3
3
  description: Architecture patterns, component design, and project structure
4
- topics: [architecture, components, modules, data-flows, project-structure, state-management]
4
+ topics: [architecture, components, modules, data-flow, project-structure, state-management]
5
5
  ---
6
6
 
7
+ ## Summary
8
+
7
9
  ## Architecture Patterns
8
10
 
9
11
  ### Layered Architecture
@@ -81,6 +83,8 @@ For most scaffold pipeline projects:
81
83
  4. Use **microservices** only if you have multiple teams that need independent deployment, or specific services with dramatically different scaling needs.
82
84
  5. Avoid **layered** unless the application is genuinely simple (CRUD with minimal business logic).
83
85
 
86
+ ## Deep Guidance
87
+
84
88
  ## Component Design
85
89
 
86
90
  ### Identifying Components from Domain Models
@@ -4,11 +4,52 @@ description: Breaking architecture into implementable tasks with dependency anal
4
4
  topics: [tasks, decomposition, dependencies, user-stories, parallelization, sizing, critical-path]
5
5
  ---
6
6
 
7
- ## User Stories to Tasks
7
+ # Task Decomposition
8
8
 
9
- > **Note:** User stories are created as an upstream artifact in the pre-pipeline phase and available at `docs/user-stories.md`. This section covers how to consume stories and derive implementation tasks from them.
9
+ Expert knowledge for breaking user stories into implementable tasks with dependency analysis, sizing, parallelization, and agent context requirements.
10
+
11
+ ## Summary
12
+
13
+ ### Story-to-Task Mapping
14
+
15
+ User stories bridge PRD features and implementation tasks. Each story decomposes into tasks following the technical layers needed. Every task must trace back to a user story, and every story to a PRD feature (PRD Feature → US-xxx → Task BD-xxx).
16
+
17
+ ### Task Sizing
18
+
19
+ Each task should be completable in a single AI agent session (30-90 minutes of agent time). A well-sized task has a clear title (usable as commit message), touches 1-3 application files (hard limit; justify exceptions), produces ~150 lines of net-new application code (excluding tests and generated files), and has no ambiguity about "done."
20
+
21
+ Five rules govern agent-friendly task sizing:
22
+ 1. **Three-File Rule** — Max 3 application files modified (test files excluded)
23
+ 2. **150-Line Budget** — Max ~150 lines of net-new application code per task
24
+ 3. **Single-Concern Rule** — One task does one thing (no "and" connecting unrelated work)
25
+ 4. **Decision-Free Execution** — All design decisions resolved in the task description; agents implement, they don't architect
26
+ 5. **Test Co-location** — Tests live in the same task as the code they test; no deferred testing
27
+
28
+ Split large tasks by layer (API, UI, DB, tests), by feature slice (happy path, validation, edge cases), or by entity. Combine tiny tasks that touch the same file and have no independent value.
29
+
30
+ ### Dependency Types
31
+
32
+ - **Logical** — Task B requires Task A's output (endpoint needs DB schema)
33
+ - **File contention** — Two tasks modify the same file (merge conflict risk)
34
+ - **Infrastructure** — Task requires setup that must exist first (DB, auth, CI)
35
+ - **Knowledge** — Task benefits from understanding gained in another task
36
+
37
+ Only logical, file contention, and infrastructure dependencies should be formal constraints.
38
+
39
+ ### Definition of Done
40
+
41
+ 1. Acceptance criteria from the user story are met
42
+ 2. Unit tests pass (for new logic)
43
+ 3. Integration tests pass (for API endpoints or component interactions)
44
+ 4. No linting or type errors
45
+ 5. Code follows project coding standards
46
+ 6. Changes committed with proper message format
47
+
48
+ ## Deep Guidance
49
+
50
+ ### From Stories to Tasks — Extended
10
51
 
11
- ### From Stories to Tasks
52
+ > **Note:** User stories are created as an upstream artifact in the pre-pipeline phase and available at `docs/user-stories.md`. This section covers how to consume stories and derive implementation tasks from them.
12
53
 
13
54
  User stories bridge the gap between what the business wants (PRD features) and what developers build (implementation tasks). Every PRD feature maps to one or more user stories (created in the pre-pipeline), and every user story should map to one or more implementation tasks.
14
55
 
@@ -115,16 +156,19 @@ This traceability ensures:
115
156
  - No orphan tasks exist (every task serves a purpose)
116
157
  - Impact analysis is possible (changing a PRD feature reveals which tasks are affected)
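These traceability guarantees can be checked mechanically once the mappings are written down. A minimal sketch, assuming two hypothetical dictionaries (story to PRD feature, task to story); the IDs follow the `US-xxx` and `BD-xxx` patterns used in this document.

```python
def traceability_gaps(story_to_feature, task_to_story):
    """Return (orphan_tasks, stories_without_tasks), both sorted lists."""
    # A task is an orphan when its story is unknown.
    orphan_tasks = sorted(
        task for task, story in task_to_story.items()
        if story not in story_to_feature
    )
    # A story is uncovered when no task points at it.
    covered = set(task_to_story.values())
    stories_without_tasks = sorted(
        story for story in story_to_feature if story not in covered
    )
    return orphan_tasks, stories_without_tasks
```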
117
158
 
118
- ## Task Sizing
159
+ ### Task Sizing — Extended
119
160
 
120
- ### Right-Sizing for Agent Sessions
161
+ #### Right-Sizing for Agent Sessions
121
162
 
122
163
  Each task should be completable in a single AI agent session (typically 30-90 minutes of agent time). Tasks that are too large overflow the context window; tasks that are too small create unnecessary coordination overhead.
123
164
 
124
165
  **A well-sized task:**
125
166
  - Has a clear, specific title that could be a commit message
126
- - Touches 1-5 files (not counting test files)
127
- - Produces a testable, verifiable result
167
+ - Touches 1-3 application files (hard limit; test files excluded from count)
168
+ - Produces ~150 lines of net-new application code (excluding tests and generated files)
169
+ - Does exactly one thing (passes the single-concern test: describable without "and")
170
+ - Requires no design decisions from the agent (all choices resolved in the description)
171
+ - Includes co-located tests (the task isn't done until tests pass)
128
172
  - Has no ambiguity about what "done" means
129
173
  - Can be code-reviewed independently
130
174
 
@@ -136,7 +180,7 @@ Each task should be completable in a single AI agent session (typically 30-90 mi
136
180
  | "Create Button component" | "Build form components (Input, Select, Textarea) with validation states" | "Create the full design system" |
137
181
  | "Add index to users table" | "Create database schema for user management with migration" | "Set up the entire database" |
138
182
 
139
- ### Splitting Large Tasks
183
+ #### Splitting Large Tasks
140
184
 
141
185
  When a task is too large, split along these axes:
142
186
 
@@ -163,7 +207,7 @@ When a task is too large, split along these axes:
163
207
  - The task involves more than 2 architectural boundaries (e.g., database + API + frontend + auth)
164
208
  - You can't describe what "done" looks like in 2-3 sentences
165
209
 
166
- ### Combining Small Tasks
210
+ #### Combining Small Tasks
167
211
 
168
212
  If multiple tiny tasks touch the same file and have no independent value, combine them:
169
213
 
@@ -172,20 +216,9 @@ If multiple tiny tasks touch the same file and have no independent value, combin
172
216
 
173
217
  The test: would the small task result in a useful commit on its own? If not, combine.
174
218
 
175
- ### Definition of Done
176
-
177
- Every task needs a clear definition of done. Standard criteria:
178
-
179
- 1. All acceptance criteria from the user story are met
180
- 2. Unit tests pass (for new logic)
181
- 3. Integration tests pass (for API endpoints or component interactions)
182
- 4. No linting or type errors
183
- 5. Code follows project coding standards
184
- 6. Changes are committed with proper message format
219
+ ### Dependency Analysis — Extended
185
220
 
186
- ## Dependency Analysis
187
-
188
- ### Types of Dependencies
221
+ #### Types of Dependencies
189
222
 
190
223
  **Logical dependencies:** Task B requires Task A's output. The API endpoint task depends on the database schema task because the endpoint queries tables that must exist first.
191
224
 
@@ -195,7 +228,7 @@ Every task needs a clear definition of done. Standard criteria:
195
228
 
196
229
  **Knowledge dependencies:** A task requires understanding gained from completing another task. The developer who builds the auth system understands the auth patterns needed by other features.
197
230
 
198
- ### Building Dependency Graphs (DAGs)
231
+ #### Building Dependency Graphs (DAGs)
199
232
 
200
233
  A dependency graph is a directed acyclic graph (DAG) where:
201
234
  - Nodes are tasks
@@ -210,7 +243,7 @@ A dependency graph is a directed acyclic graph (DAG) where:
210
243
  4. Draw an edge from producer to consumer
211
244
  5. Check for cycles (if A depends on B and B depends on A, something is wrong — split or reorganize)
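The cycle check in step 5 can be done with a depth-first search. A minimal sketch, assuming the graph is stored as a hypothetical `deps` mapping from each task to the tasks it depends on.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of tasks, or None if acyclic."""
    unvisited, in_progress, done = 0, 1, 2
    state = {}
    path = []

    def visit(task):
        state[task] = in_progress
        path.append(task)
        for dep in deps.get(task, []):
            dep_state = state.get(dep, unvisited)
            if dep_state == in_progress:
                # dep is already on the current path, so the path loops back.
                return path[path.index(dep):] + [dep]
            if dep_state == unvisited:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[task] = done
        return None

    for task in deps:
        if state.get(task, unvisited) == unvisited:
            cycle = visit(task)
            if cycle:
                return cycle
    return None
```

The returned list starts and ends with the same task, which makes the offending chain easy to paste into a plan review.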
212
245
 
213
- ### Detecting Cycles
246
+ #### Detecting Cycles
214
247
 
215
248
  Cycles indicate a modeling problem. Common causes and fixes:
216
249
 
@@ -218,7 +251,7 @@ Cycles indicate a modeling problem. Common causes and fixes:
218
251
  - **Feature interaction:** Feature X needs Feature Y's component, and Feature Y needs Feature X's component. Fix: extract the shared component into its own task.
219
252
  - **Testing dependency:** "Can't test A without B, can't test B without A." Fix: use mocks/stubs to break the cycle during testing. The integration test that tests both together becomes a separate task.
220
253
 
221
- ### Finding Critical Path
254
+ #### Finding Critical Path
222
255
 
223
256
  The critical path is the longest chain of dependent tasks from start to finish. It determines the minimum project duration.
224
257
 
@@ -235,7 +268,7 @@ The critical path is the longest chain of dependent tasks from start to finish.
235
268
  - To shorten the project, focus on splitting or accelerating critical-path tasks
236
269
  - Non-critical-path tasks have "float" — they can be delayed without affecting the project end date
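Finding the longest chain is a longest-path computation on the dependency DAG. A minimal sketch, assuming an acyclic graph and a hypothetical per-task duration estimate (the unit is arbitrary); `critical_path` is an illustrative name.

```python
from functools import lru_cache


def critical_path(durations, deps):
    """Return (total_duration, path) for the longest dependency chain.

    durations maps task -> estimated duration; deps maps task -> the tasks
    it depends on. The graph must be acyclic.
    """
    @lru_cache(maxsize=None)
    def longest_ending_at(task):
        best_total, best_path = 0, ()
        for dep in deps.get(task, ()):
            total, path = longest_ending_at(dep)
            if total > best_total:
                best_total, best_path = total, path
        return best_total + durations[task], best_path + (task,)

    return max((longest_ending_at(t) for t in durations), key=lambda r: r[0])
```

Tasks that do not appear on the returned path are the ones with float.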
237
270
 
238
- ### Dependency Documentation
271
+ #### Dependency Documentation
239
272
 
240
273
  For each dependency, document:
241
274
 
@@ -245,9 +278,9 @@ For each dependency, document:
245
278
  | BD-12 -> BD-13 | File contention | Both modify src/routes/index.ts | Medium — merge conflict risk |
246
279
  | BD-01 -> BD-* | Infrastructure | BD-01 sets up the database; everything needs it | High — blocks all work |
247
280
 
248
- ## Parallelization
281
+ ### Parallelization and Wave Planning
249
282
 
250
- ### Identifying Independent Tasks
283
+ #### Identifying Independent Tasks
251
284
 
252
285
  Tasks are safe to run in parallel when:
253
286
  - They have no shared dependencies (no common prerequisite still in progress)
@@ -267,7 +300,7 @@ Tasks are safe to run in parallel when:
267
300
  - Tasks that modify the same shared utility file
268
301
  - Tasks where one produces test fixtures the other consumes
269
302
 
270
- ### Managing Shared-State Tasks
303
+ #### Managing Shared-State Tasks
271
304
 
272
305
  When tasks must share state (database, shared configuration, route registry):
273
306
 
@@ -277,7 +310,7 @@ When tasks must share state (database, shared configuration, route registry):
277
310
 
278
311
  **Feature flags:** Both tasks can merge independently. A feature flag controls which one is active. Integrate them in a separate task after both complete.
279
312
 
280
- ### Merge Strategies for Parallel Work
313
+ #### Merge Strategies for Parallel Work
281
314
 
282
315
  When parallel tasks produce branches that must be merged to main:
283
316
 
@@ -285,7 +318,7 @@ When parallel tasks produce branches that must be merged to main:
285
318
  - **First-in wins:** The first task to merge gets a clean merge. Subsequent tasks must rebase and resolve conflicts.
286
319
  - **Minimize shared files:** Design the task decomposition to minimize file overlap. Feature-based directory structure helps enormously.
287
320
 
288
- ### Wave Planning
321
+ #### Wave Planning
289
322
 
290
323
  Organize tasks into waves based on the dependency graph:
291
324
 
@@ -298,9 +331,9 @@ Wave 4 (depends on Wave 3): End-to-end tests, performance optimization, polish
298
331
 
299
332
  Each wave's tasks can run in parallel. Wave N+1 starts only when all its dependencies in Wave N are complete. The number of parallel agents should match the number of independent tasks in the current wave.
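Waves can be derived from the dependency graph by repeatedly peeling off the tasks whose dependencies are all complete. A minimal sketch, assuming every task appears as a key in a hypothetical `deps` mapping; the task names in the usage note are illustrative.

```python
def plan_waves(deps):
    """Group tasks into waves; each wave depends only on earlier waves.

    deps maps task -> set of tasks it depends on. Every task, including
    those with no dependencies, must appear as a key.
    """
    remaining = {task: set(d) for task, d in deps.items()}
    waves = []
    while remaining:
        ready = sorted(task for task, d in remaining.items() if not d)
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        for task in ready:
            del remaining[task]
        for d in remaining.values():
            d.difference_update(ready)  # these dependencies are now satisfied
    return waves
```

With `{"setup": set(), "schema": {"setup"}, "ui": {"setup"}, "api": {"schema"}, "e2e": {"api", "ui"}}` this yields four waves, and the length of each wave is the number of agents that can usefully run in parallel.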
300
333
 
301
- ## Agent Context
334
+ ### Agent Context Requirements
302
335
 
303
- ### What Context Each Task Needs
336
+ #### What Context Each Task Needs
304
337
 
305
338
  Every task description should specify what documents and code the implementing agent needs to read:
306
339
 
@@ -321,7 +354,7 @@ Produces:
321
354
  - tests/features/auth/register.integration.test.ts
322
355
  ```
323
356
 
324
- ### Handoff Information
357
+ #### Handoff Information
325
358
 
326
359
  When a task produces output that another task consumes, specify the handoff:
327
360
 
@@ -338,7 +371,7 @@ Consuming tasks:
338
371
  BD-30 (onboarding flow) expects the response shape above
339
372
  ```
340
373
 
341
- ### Assumed Prior Work
374
+ #### Assumed Prior Work
342
375
 
343
376
  Explicitly state what the agent can assume exists:
344
377
 
@@ -353,7 +386,112 @@ Does NOT assume:
353
386
  - Any auth endpoints exist (this is the first)
354
387
  ```
355
388
 
356
- ## Common Pitfalls
389
+ ### Agent Executability Heuristics
390
+
391
+ Five formalized rules for ensuring tasks are the right size for AI agent execution. These are hard rules with an escape hatch — tasks exceeding limits must be split unless the author provides explicit justification via `<!-- agent-size-exception: reason -->`.
392
+
393
+ #### Rule 1: Three-File Rule
394
+
395
+ A task modifies at most 3 application files (test files don't count toward this limit). If it would touch more, split by layer or concern.
396
+
397
+ **Why 3:** Reading 3 files plus their context (imports, types, interfaces) consumes roughly 40-60% of a standard agent context window, leaving room for the task description, test code, and reasoning. At 5+ files, context pressure causes agents to lose track of cross-file consistency.
398
+
399
+ **Splitting when exceeded:**
400
+ - 4 files across 2 layers → split into one task per layer
401
+ - 5 files in the same layer → split by entity or concern within the layer
402
+ - Config files touched alongside application files → separate config task if non-trivial
403
+
404
+ #### Rule 2: 150-Line Budget
405
+
406
+ A task produces at most ~150 lines of net-new application code (excluding tests, generated files, and config). This keeps the entire change reviewable in one screen and within agent context budgets.
407
+
408
+ **Why 150:** Agent output quality degrades measurably after ~200 lines of new code in a single session. At 150 lines, the agent can hold the entire change in context while writing tests and verifying correctness.
409
+
410
+ **Estimating line count from task descriptions:**
411
+ - A CRUD endpoint with validation: ~80-120 lines
412
+ - A UI component with state management: ~100-150 lines
413
+ - A database migration with seed data: ~50-80 lines
414
+ - A full feature slice (API + UI + tests): ~300+ lines — MUST split
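Rules 1 and 2 are numeric limits, so a plan linter can check them directly. A minimal sketch, assuming test files can be recognized by path markers; `TEST_MARKERS` and `size_violations` are hypothetical names, and the marker list is a guess that a real project would adapt.

```python
# Assumed path fragments that identify test files (project-specific).
TEST_MARKERS = (".test.", ".spec.", "/tests/")


def size_violations(files, net_new_lines, max_files=3, max_lines=150):
    """Return the names of the sizing rules a task breaks (possibly empty)."""
    app_files = [
        path for path in files
        if not any(marker in path for marker in TEST_MARKERS)
    ]
    violations = []
    if len(app_files) > max_files:
        violations.append("three-file")
    if net_new_lines > max_lines:
        violations.append("150-line")
    return violations
```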
415
+
416
+ #### Rule 3: Single-Concern Rule
417
+
418
+ A task does exactly one thing. The test: can you describe what this task does in one sentence without "and"?
419
+
420
+ **Passes the test:**
421
+ - "Implement the user registration endpoint with input validation" (validation is part of the endpoint)
422
+ - "Create the order model with database migration" (migration is part of model creation)
423
+
424
+ **Fails the test:**
425
+ - "Add the API endpoint AND update the dashboard" — two tasks
426
+ - "Implement authentication AND set up the database" — two tasks
427
+ - "Build the payment form AND integrate with Stripe AND add webhook handling" — three tasks
428
+
429
+ **Splitting signals:**
430
+ - Task description contains "and" connecting unrelated work
431
+ - Task spans multiple architectural layers (API + frontend + database in one task)
432
+ - Task affects multiple bounded contexts or feature domains
433
+ - Task has acceptance criteria for two distinct user-facing behaviors
434
+
435
+ #### Rule 4: Decision-Free Execution
436
+
437
+ The task description must resolve all design decisions upfront. The agent implements; it does not architect. No task should require the agent to:
438
+
439
+ - Choose between patterns (repository vs active record, REST vs GraphQL)
440
+ - Select libraries or tools
441
+ - Decide module structure or file organization
442
+ - Determine API contract shapes (these come from upstream specs)
443
+
444
+ **Red flags in task descriptions:**
445
+ - "Choose the best approach for..."
446
+ - "Determine whether to use X or Y"
447
+ - "Decide how to structure..."
448
+ - "Evaluate options for..."
449
+ - "Select the most appropriate..."
450
+ - "Figure out the best way to..."
451
+
452
+ If a task contains any of these, the decision belongs in the task description — resolved by the plan author — not left to agent judgment. Local implementation choices (variable names, loop style, internal helper structure) are fine.
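The red-flag phrases listed above lend themselves to a simple text scan. A minimal sketch: `find_red_flags` is a hypothetical helper, and plain case-insensitive substring matching is an assumption that will miss paraphrases.

```python
# Phrases taken from the red-flag list above, lowercased for matching.
RED_FLAGS = (
    "choose the best approach for",
    "determine whether to use",
    "decide how to structure",
    "evaluate options for",
    "select the most appropriate",
    "figure out the best way to",
)


def find_red_flags(description):
    """Return the red-flag phrases that appear in a task description."""
    text = description.lower()
    return [phrase for phrase in RED_FLAGS if phrase in text]
```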
453
+
454
+ #### Rule 5: Test Co-location
455
+
456
+ Tests live in the same task as the code they test. The task follows TDD: write the failing test, then the implementation, then verify. The task isn't done until tests pass.
457
+
458
+ **Anti-pattern:** "Tasks 1-8: implement features. Task 9: write tests for everything." This produces untestable code, violates TDD, and creates a single massive testing task that exceeds all size limits.
459
+
460
+ **What co-location looks like:**
461
+ ```
462
+ Task: Implement user registration endpoint
463
+ 1. Write failing integration test (POST /register with valid data → 201)
464
+ 2. Implement endpoint to make test pass
465
+ 3. Write failing validation test (invalid email → 400)
466
+ 4. Add validation to make test pass
467
+ 5. Commit
468
+ ```
469
+
470
+ #### Escape Hatch
471
+
472
+ If a task genuinely can't be split further without creating tasks that have no independent value, add an explicit annotation in the task description: `<!-- agent-size-exception: [reason] -->`. The review pass flags unjustified exceptions but accepts reasoned ones.
473
+
474
+ **Valid exception reasons:**
475
+ - "Migration task touches 4 files but they're all trivial one-line renames"
476
+ - "Config file changes across 4 files are mechanical and identical in structure"
477
+ - "Test setup file is large but generated from a template"
478
+
479
+ **Invalid exception reasons:**
480
+ - "It's easier to do it all at once" (convenience is not a justification)
481
+ - "The files are related" (related files can still be separate tasks)
482
+ - "It would create too many tasks" (many small tasks are preferable to a few large ones)
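A review pass needs to find the annotation and its reason before it can judge whether the exception is justified. A minimal sketch, assuming the annotation appears verbatim as an HTML comment in the task description; `size_exception_reason` is a hypothetical helper, and judging the reason itself is left to the reviewer.

```python
import re

# Matches <!-- agent-size-exception: reason --> and captures the reason.
EXCEPTION_RE = re.compile(
    r"<!--\s*agent-size-exception:\s*(.*?)\s*-->", re.DOTALL
)


def size_exception_reason(task_description):
    """Return the stated exception reason, or None if absent or empty."""
    match = EXCEPTION_RE.search(task_description)
    if match is None:
        return None
    reason = match.group(1).strip()
    return reason or None  # an empty reason counts as no justification
```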
483
+
484
+ #### Concrete "Too Big" Examples
485
+
486
+ | Task (Too Big) | Violations | Split Into |
487
+ |---------------|-----------|------------|
488
+ | "Implement user authentication" (8+ files, registration + login + reset + middleware) | Three-File, Single-Concern | 4 tasks: registration endpoint, login endpoint, password reset flow, auth middleware |
489
+ | "Build the settings page with all preferences" (6 files, multiple forms + APIs) | Three-File, 150-Line, Single-Concern | Per-group: profile settings, notification settings, security settings |
490
+ | "Set up database with all migrations and seed data" (10+ files, every entity) | Three-File, 150-Line | Per-entity: users table, orders table, products table, then seed data task |
491
+ | "Create API client with retry, caching, and auth" (4 concerns in one module) | Single-Concern, Decision-Free | 3 tasks: base client with auth, retry middleware, cache layer |
492
+ | "Implement the dashboard with charts, filters, and real-time updates" (5+ files, 300+ lines) | All five rules | 4 tasks: dashboard layout + routing, chart components, filter system, WebSocket integration |
493
+
494
+ ### Common Pitfalls
357
495
 
358
496
  **Tasks too vague.** "Implement backend" or "Set up auth" with no acceptance criteria, no file paths, and no test requirements. An agent receiving this task will guess wrong about scope, structure, and conventions. Fix: every task must specify exact files to create/modify, acceptance criteria, and test requirements.
359
497