@harness-engineering/cli 1.23.0 → 1.23.2

Files changed (423)
  1. package/dist/agents/commands/codex/harness/add-harness-component/SKILL.md +21 -12
  2. package/dist/agents/commands/codex/harness/cleanup-dead-code/SKILL.md +9 -0
  3. package/dist/agents/commands/codex/harness/detect-doc-drift/SKILL.md +9 -0
  4. package/dist/agents/commands/codex/harness/enforce-architecture/SKILL.md +5 -15
  5. package/dist/agents/commands/codex/harness/harness-architecture-advisor/SKILL.md +5 -15
  6. package/dist/agents/commands/codex/harness/harness-autopilot/SKILL.md +10 -0
  7. package/dist/agents/commands/codex/harness/harness-brainstorming/SKILL.md +9 -0
  8. package/dist/agents/commands/codex/harness/harness-code-review/SKILL.md +5 -15
  9. package/dist/agents/commands/codex/harness/harness-codebase-cleanup/SKILL.md +9 -0
  10. package/dist/agents/commands/codex/harness/harness-debugging/SKILL.md +9 -0
  11. package/dist/agents/commands/codex/harness/harness-dependency-health/SKILL.md +8 -0
  12. package/dist/agents/commands/codex/harness/harness-docs-pipeline/SKILL.md +9 -0
  13. package/dist/agents/commands/codex/harness/harness-execution/SKILL.md +9 -0
  14. package/dist/agents/commands/codex/harness/harness-hotspot-detector/SKILL.md +8 -0
  15. package/dist/agents/commands/codex/harness/harness-impact-analysis/SKILL.md +8 -0
  16. package/dist/agents/commands/codex/harness/harness-integrity/SKILL.md +8 -0
  17. package/dist/agents/commands/codex/harness/harness-onboarding/SKILL.md +18 -10
  18. package/dist/agents/commands/codex/harness/harness-perf/SKILL.md +9 -0
  19. package/dist/agents/commands/codex/harness/harness-planning/SKILL.md +10 -0
  20. package/dist/agents/commands/codex/harness/harness-refactoring/SKILL.md +9 -0
  21. package/dist/agents/commands/codex/harness/harness-release-readiness/SKILL.md +9 -0
  22. package/dist/agents/commands/codex/harness/harness-roadmap/SKILL.md +10 -1
  23. package/dist/agents/commands/codex/harness/harness-security-scan/SKILL.md +5 -15
  24. package/dist/agents/commands/codex/harness/harness-skill-authoring/SKILL.md +20 -1
  25. package/dist/agents/commands/codex/harness/harness-soundness-review/SKILL.md +10 -0
  26. package/dist/agents/commands/codex/harness/harness-supply-chain-audit/SKILL.md +8 -0
  27. package/dist/agents/commands/codex/harness/harness-tdd/SKILL.md +9 -0
  28. package/dist/agents/commands/codex/harness/harness-test-advisor/SKILL.md +8 -0
  29. package/dist/agents/commands/codex/harness/harness-verification/SKILL.md +9 -0
  30. package/dist/agents/commands/codex/harness/harness-verify/SKILL.md +8 -0
  31. package/dist/agents/commands/codex/harness/initialize-harness-project/SKILL.md +22 -13
  32. package/dist/agents/commands/cursor/harness/add-harness-component.mdc +12 -12
  33. package/dist/agents/commands/cursor/harness/harness-onboarding.mdc +10 -10
  34. package/dist/agents/commands/cursor/harness/harness-roadmap.mdc +1 -1
  35. package/dist/agents/commands/cursor/harness/initialize-harness-project.mdc +13 -13
  36. package/dist/agents/skills/claude-code/add-harness-component/SKILL.md +21 -12
  37. package/dist/agents/skills/claude-code/align-documentation/SKILL.md +9 -0
  38. package/dist/agents/skills/claude-code/check-mechanical-constraints/SKILL.md +9 -0
  39. package/dist/agents/skills/claude-code/cleanup-dead-code/SKILL.md +11 -0
  40. package/dist/agents/skills/claude-code/detect-doc-drift/SKILL.md +9 -0
  41. package/dist/agents/skills/claude-code/enforce-architecture/SKILL.md +5 -15
  42. package/dist/agents/skills/claude-code/harness-accessibility/SKILL.md +10 -0
  43. package/dist/agents/skills/claude-code/harness-api-design/SKILL.md +5 -15
  44. package/dist/agents/skills/claude-code/harness-architecture-advisor/SKILL.md +5 -15
  45. package/dist/agents/skills/claude-code/harness-auth/SKILL.md +5 -15
  46. package/dist/agents/skills/claude-code/harness-autopilot/SKILL.md +10 -0
  47. package/dist/agents/skills/claude-code/harness-brainstorming/SKILL.md +9 -0
  48. package/dist/agents/skills/claude-code/harness-caching/SKILL.md +10 -0
  49. package/dist/agents/skills/claude-code/harness-chaos/SKILL.md +10 -0
  50. package/dist/agents/skills/claude-code/harness-code-review/SKILL.md +5 -15
  51. package/dist/agents/skills/claude-code/harness-codebase-cleanup/SKILL.md +11 -0
  52. package/dist/agents/skills/claude-code/harness-compliance/SKILL.md +10 -0
  53. package/dist/agents/skills/claude-code/harness-containerization/SKILL.md +10 -0
  54. package/dist/agents/skills/claude-code/harness-data-pipeline/SKILL.md +10 -0
  55. package/dist/agents/skills/claude-code/harness-data-validation/SKILL.md +10 -0
  56. package/dist/agents/skills/claude-code/harness-database/SKILL.md +5 -15
  57. package/dist/agents/skills/claude-code/harness-debugging/SKILL.md +9 -0
  58. package/dist/agents/skills/claude-code/harness-dependency-health/SKILL.md +10 -0
  59. package/dist/agents/skills/claude-code/harness-deployment/SKILL.md +5 -15
  60. package/dist/agents/skills/claude-code/harness-design/SKILL.md +10 -0
  61. package/dist/agents/skills/claude-code/harness-design-mobile/SKILL.md +10 -0
  62. package/dist/agents/skills/claude-code/harness-design-system/SKILL.md +10 -0
  63. package/dist/agents/skills/claude-code/harness-design-web/SKILL.md +10 -0
  64. package/dist/agents/skills/claude-code/harness-diagnostics/SKILL.md +9 -0
  65. package/dist/agents/skills/claude-code/harness-docs-pipeline/SKILL.md +11 -0
  66. package/dist/agents/skills/claude-code/harness-dx/SKILL.md +10 -0
  67. package/dist/agents/skills/claude-code/harness-e2e/SKILL.md +9 -0
  68. package/dist/agents/skills/claude-code/harness-event-driven/SKILL.md +10 -0
  69. package/dist/agents/skills/claude-code/harness-execution/SKILL.md +9 -0
  70. package/dist/agents/skills/claude-code/harness-feature-flags/SKILL.md +10 -0
  71. package/dist/agents/skills/claude-code/harness-git-workflow/SKILL.md +10 -0
  72. package/dist/agents/skills/claude-code/harness-hotspot-detector/SKILL.md +10 -0
  73. package/dist/agents/skills/claude-code/harness-i18n/SKILL.md +10 -0
  74. package/dist/agents/skills/claude-code/harness-i18n-process/SKILL.md +10 -0
  75. package/dist/agents/skills/claude-code/harness-i18n-workflow/SKILL.md +10 -0
  76. package/dist/agents/skills/claude-code/harness-impact-analysis/SKILL.md +10 -0
  77. package/dist/agents/skills/claude-code/harness-incident-response/SKILL.md +10 -0
  78. package/dist/agents/skills/claude-code/harness-infrastructure-as-code/SKILL.md +10 -0
  79. package/dist/agents/skills/claude-code/harness-integration-test/SKILL.md +9 -0
  80. package/dist/agents/skills/claude-code/harness-integrity/SKILL.md +10 -0
  81. package/dist/agents/skills/claude-code/harness-knowledge-mapper/SKILL.md +9 -0
  82. package/dist/agents/skills/claude-code/harness-load-testing/SKILL.md +10 -0
  83. package/dist/agents/skills/claude-code/harness-ml-ops/SKILL.md +10 -0
  84. package/dist/agents/skills/claude-code/harness-mobile-patterns/SKILL.md +10 -0
  85. package/dist/agents/skills/claude-code/harness-mutation-test/SKILL.md +9 -0
  86. package/dist/agents/skills/claude-code/harness-observability/SKILL.md +10 -0
  87. package/dist/agents/skills/claude-code/harness-onboarding/SKILL.md +18 -10
  88. package/dist/agents/skills/claude-code/harness-parallel-agents/SKILL.md +9 -0
  89. package/dist/agents/skills/claude-code/harness-perf/SKILL.md +11 -0
  90. package/dist/agents/skills/claude-code/harness-perf-tdd/SKILL.md +10 -0
  91. package/dist/agents/skills/claude-code/harness-planning/SKILL.md +10 -0
  92. package/dist/agents/skills/claude-code/harness-pre-commit-review/SKILL.md +9 -0
  93. package/dist/agents/skills/claude-code/harness-product-spec/SKILL.md +10 -0
  94. package/dist/agents/skills/claude-code/harness-property-test/SKILL.md +10 -0
  95. package/dist/agents/skills/claude-code/harness-refactoring/SKILL.md +9 -0
  96. package/dist/agents/skills/claude-code/harness-release-readiness/SKILL.md +11 -0
  97. package/dist/agents/skills/claude-code/harness-resilience/SKILL.md +10 -0
  98. package/dist/agents/skills/claude-code/harness-roadmap/SKILL.md +10 -1
  99. package/dist/agents/skills/claude-code/harness-roadmap-pilot/SKILL.md +8 -0
  100. package/dist/agents/skills/claude-code/harness-secrets/SKILL.md +10 -0
  101. package/dist/agents/skills/claude-code/harness-security-review/SKILL.md +10 -0
  102. package/dist/agents/skills/claude-code/harness-security-scan/SKILL.md +5 -15
  103. package/dist/agents/skills/claude-code/harness-skill-authoring/SKILL.md +29 -1
  104. package/dist/agents/skills/claude-code/harness-soundness-review/SKILL.md +10 -0
  105. package/dist/agents/skills/claude-code/harness-sql-review/SKILL.md +10 -0
  106. package/dist/agents/skills/claude-code/harness-state-management/SKILL.md +10 -0
  107. package/dist/agents/skills/claude-code/harness-supply-chain-audit/SKILL.md +10 -0
  108. package/dist/agents/skills/claude-code/harness-tdd/SKILL.md +9 -0
  109. package/dist/agents/skills/claude-code/harness-test-advisor/SKILL.md +10 -0
  110. package/dist/agents/skills/claude-code/harness-test-data/SKILL.md +10 -0
  111. package/dist/agents/skills/claude-code/harness-ux-copy/SKILL.md +10 -0
  112. package/dist/agents/skills/claude-code/harness-verification/SKILL.md +9 -0
  113. package/dist/agents/skills/claude-code/harness-verify/SKILL.md +10 -0
  114. package/dist/agents/skills/claude-code/harness-visual-regression/SKILL.md +10 -0
  115. package/dist/agents/skills/claude-code/initialize-harness-project/SKILL.md +22 -13
  116. package/dist/agents/skills/claude-code/validate-context-engineering/SKILL.md +9 -0
  117. package/dist/agents/skills/codex/add-harness-component/SKILL.md +21 -12
  118. package/dist/agents/skills/codex/align-documentation/SKILL.md +9 -0
  119. package/dist/agents/skills/codex/check-mechanical-constraints/SKILL.md +9 -0
  120. package/dist/agents/skills/codex/cleanup-dead-code/SKILL.md +11 -0
  121. package/dist/agents/skills/codex/detect-doc-drift/SKILL.md +9 -0
  122. package/dist/agents/skills/codex/enforce-architecture/SKILL.md +5 -15
  123. package/dist/agents/skills/codex/harness-accessibility/SKILL.md +10 -0
  124. package/dist/agents/skills/codex/harness-api-design/SKILL.md +5 -15
  125. package/dist/agents/skills/codex/harness-architecture-advisor/SKILL.md +5 -15
  126. package/dist/agents/skills/codex/harness-auth/SKILL.md +5 -15
  127. package/dist/agents/skills/codex/harness-autopilot/SKILL.md +10 -0
  128. package/dist/agents/skills/codex/harness-brainstorming/SKILL.md +9 -0
  129. package/dist/agents/skills/codex/harness-caching/SKILL.md +10 -0
  130. package/dist/agents/skills/codex/harness-chaos/SKILL.md +10 -0
  131. package/dist/agents/skills/codex/harness-code-review/SKILL.md +5 -15
  132. package/dist/agents/skills/codex/harness-codebase-cleanup/SKILL.md +11 -0
  133. package/dist/agents/skills/codex/harness-compliance/SKILL.md +10 -0
  134. package/dist/agents/skills/codex/harness-containerization/SKILL.md +10 -0
  135. package/dist/agents/skills/codex/harness-data-pipeline/SKILL.md +10 -0
  136. package/dist/agents/skills/codex/harness-data-validation/SKILL.md +10 -0
  137. package/dist/agents/skills/codex/harness-database/SKILL.md +5 -15
  138. package/dist/agents/skills/codex/harness-debugging/SKILL.md +9 -0
  139. package/dist/agents/skills/codex/harness-dependency-health/SKILL.md +10 -0
  140. package/dist/agents/skills/codex/harness-deployment/SKILL.md +5 -15
  141. package/dist/agents/skills/codex/harness-design/SKILL.md +10 -0
  142. package/dist/agents/skills/codex/harness-design-mobile/SKILL.md +10 -0
  143. package/dist/agents/skills/codex/harness-design-system/SKILL.md +10 -0
  144. package/dist/agents/skills/codex/harness-design-web/SKILL.md +10 -0
  145. package/dist/agents/skills/codex/harness-diagnostics/SKILL.md +9 -0
  146. package/dist/agents/skills/codex/harness-docs-pipeline/SKILL.md +11 -0
  147. package/dist/agents/skills/codex/harness-dx/SKILL.md +10 -0
  148. package/dist/agents/skills/codex/harness-e2e/SKILL.md +9 -0
  149. package/dist/agents/skills/codex/harness-event-driven/SKILL.md +10 -0
  150. package/dist/agents/skills/codex/harness-execution/SKILL.md +9 -0
  151. package/dist/agents/skills/codex/harness-feature-flags/SKILL.md +10 -0
  152. package/dist/agents/skills/codex/harness-git-workflow/SKILL.md +10 -0
  153. package/dist/agents/skills/codex/harness-hotspot-detector/SKILL.md +10 -0
  154. package/dist/agents/skills/codex/harness-i18n/SKILL.md +10 -0
  155. package/dist/agents/skills/codex/harness-i18n-process/SKILL.md +10 -0
  156. package/dist/agents/skills/codex/harness-i18n-workflow/SKILL.md +10 -0
  157. package/dist/agents/skills/codex/harness-impact-analysis/SKILL.md +10 -0
  158. package/dist/agents/skills/codex/harness-incident-response/SKILL.md +10 -0
  159. package/dist/agents/skills/codex/harness-infrastructure-as-code/SKILL.md +10 -0
  160. package/dist/agents/skills/codex/harness-integration-test/SKILL.md +9 -0
  161. package/dist/agents/skills/codex/harness-integrity/SKILL.md +10 -0
  162. package/dist/agents/skills/codex/harness-knowledge-mapper/SKILL.md +9 -0
  163. package/dist/agents/skills/codex/harness-load-testing/SKILL.md +10 -0
  164. package/dist/agents/skills/codex/harness-ml-ops/SKILL.md +10 -0
  165. package/dist/agents/skills/codex/harness-mobile-patterns/SKILL.md +10 -0
  166. package/dist/agents/skills/codex/harness-mutation-test/SKILL.md +9 -0
  167. package/dist/agents/skills/codex/harness-observability/SKILL.md +10 -0
  168. package/dist/agents/skills/codex/harness-onboarding/SKILL.md +18 -10
  169. package/dist/agents/skills/codex/harness-parallel-agents/SKILL.md +9 -0
  170. package/dist/agents/skills/codex/harness-perf/SKILL.md +11 -0
  171. package/dist/agents/skills/codex/harness-perf-tdd/SKILL.md +10 -0
  172. package/dist/agents/skills/codex/harness-planning/SKILL.md +10 -0
  173. package/dist/agents/skills/codex/harness-pre-commit-review/SKILL.md +9 -0
  174. package/dist/agents/skills/codex/harness-product-spec/SKILL.md +10 -0
  175. package/dist/agents/skills/codex/harness-property-test/SKILL.md +10 -0
  176. package/dist/agents/skills/codex/harness-refactoring/SKILL.md +9 -0
  177. package/dist/agents/skills/codex/harness-release-readiness/SKILL.md +11 -0
  178. package/dist/agents/skills/codex/harness-resilience/SKILL.md +10 -0
  179. package/dist/agents/skills/codex/harness-roadmap/SKILL.md +10 -1
  180. package/dist/agents/skills/codex/harness-roadmap-pilot/SKILL.md +8 -0
  181. package/dist/agents/skills/codex/harness-secrets/SKILL.md +10 -0
  182. package/dist/agents/skills/codex/harness-security-review/SKILL.md +10 -0
  183. package/dist/agents/skills/codex/harness-security-scan/SKILL.md +5 -15
  184. package/dist/agents/skills/codex/harness-skill-authoring/SKILL.md +29 -1
  185. package/dist/agents/skills/codex/harness-soundness-review/SKILL.md +10 -0
  186. package/dist/agents/skills/codex/harness-sql-review/SKILL.md +10 -0
  187. package/dist/agents/skills/codex/harness-state-management/SKILL.md +10 -0
  188. package/dist/agents/skills/codex/harness-supply-chain-audit/SKILL.md +10 -0
  189. package/dist/agents/skills/codex/harness-tdd/SKILL.md +9 -0
  190. package/dist/agents/skills/codex/harness-test-advisor/SKILL.md +10 -0
  191. package/dist/agents/skills/codex/harness-test-data/SKILL.md +10 -0
  192. package/dist/agents/skills/codex/harness-ux-copy/SKILL.md +10 -0
  193. package/dist/agents/skills/codex/harness-verification/SKILL.md +9 -0
  194. package/dist/agents/skills/codex/harness-verify/SKILL.md +10 -0
  195. package/dist/agents/skills/codex/harness-visual-regression/SKILL.md +10 -0
  196. package/dist/agents/skills/codex/initialize-harness-project/SKILL.md +22 -13
  197. package/dist/agents/skills/codex/validate-context-engineering/SKILL.md +9 -0
  198. package/dist/agents/skills/cursor/add-harness-component/SKILL.md +21 -12
  199. package/dist/agents/skills/cursor/align-documentation/SKILL.md +9 -0
  200. package/dist/agents/skills/cursor/check-mechanical-constraints/SKILL.md +9 -0
  201. package/dist/agents/skills/cursor/cleanup-dead-code/SKILL.md +11 -0
  202. package/dist/agents/skills/cursor/detect-doc-drift/SKILL.md +9 -0
  203. package/dist/agents/skills/cursor/enforce-architecture/SKILL.md +5 -15
  204. package/dist/agents/skills/cursor/harness-accessibility/SKILL.md +10 -0
  205. package/dist/agents/skills/cursor/harness-api-design/SKILL.md +5 -15
  206. package/dist/agents/skills/cursor/harness-architecture-advisor/SKILL.md +5 -15
  207. package/dist/agents/skills/cursor/harness-auth/SKILL.md +5 -15
  208. package/dist/agents/skills/cursor/harness-autopilot/SKILL.md +10 -0
  209. package/dist/agents/skills/cursor/harness-brainstorming/SKILL.md +9 -0
  210. package/dist/agents/skills/cursor/harness-caching/SKILL.md +10 -0
  211. package/dist/agents/skills/cursor/harness-chaos/SKILL.md +10 -0
  212. package/dist/agents/skills/cursor/harness-code-review/SKILL.md +5 -15
  213. package/dist/agents/skills/cursor/harness-codebase-cleanup/SKILL.md +11 -0
  214. package/dist/agents/skills/cursor/harness-compliance/SKILL.md +10 -0
  215. package/dist/agents/skills/cursor/harness-containerization/SKILL.md +10 -0
  216. package/dist/agents/skills/cursor/harness-data-pipeline/SKILL.md +10 -0
  217. package/dist/agents/skills/cursor/harness-data-validation/SKILL.md +10 -0
  218. package/dist/agents/skills/cursor/harness-database/SKILL.md +5 -15
  219. package/dist/agents/skills/cursor/harness-debugging/SKILL.md +9 -0
  220. package/dist/agents/skills/cursor/harness-dependency-health/SKILL.md +10 -0
  221. package/dist/agents/skills/cursor/harness-deployment/SKILL.md +5 -15
  222. package/dist/agents/skills/cursor/harness-design/SKILL.md +10 -0
  223. package/dist/agents/skills/cursor/harness-design-mobile/SKILL.md +10 -0
  224. package/dist/agents/skills/cursor/harness-design-system/SKILL.md +10 -0
  225. package/dist/agents/skills/cursor/harness-design-web/SKILL.md +10 -0
  226. package/dist/agents/skills/cursor/harness-diagnostics/SKILL.md +9 -0
  227. package/dist/agents/skills/cursor/harness-docs-pipeline/SKILL.md +11 -0
  228. package/dist/agents/skills/cursor/harness-dx/SKILL.md +10 -0
  229. package/dist/agents/skills/cursor/harness-e2e/SKILL.md +9 -0
  230. package/dist/agents/skills/cursor/harness-event-driven/SKILL.md +10 -0
  231. package/dist/agents/skills/cursor/harness-execution/SKILL.md +9 -0
  232. package/dist/agents/skills/cursor/harness-feature-flags/SKILL.md +10 -0
  233. package/dist/agents/skills/cursor/harness-git-workflow/SKILL.md +10 -0
  234. package/dist/agents/skills/cursor/harness-hotspot-detector/SKILL.md +10 -0
  235. package/dist/agents/skills/cursor/harness-i18n/SKILL.md +10 -0
  236. package/dist/agents/skills/cursor/harness-i18n-process/SKILL.md +10 -0
  237. package/dist/agents/skills/cursor/harness-i18n-workflow/SKILL.md +10 -0
  238. package/dist/agents/skills/cursor/harness-impact-analysis/SKILL.md +10 -0
  239. package/dist/agents/skills/cursor/harness-incident-response/SKILL.md +10 -0
  240. package/dist/agents/skills/cursor/harness-infrastructure-as-code/SKILL.md +10 -0
  241. package/dist/agents/skills/cursor/harness-integration-test/SKILL.md +9 -0
  242. package/dist/agents/skills/cursor/harness-integrity/SKILL.md +10 -0
  243. package/dist/agents/skills/cursor/harness-knowledge-mapper/SKILL.md +9 -0
  244. package/dist/agents/skills/cursor/harness-load-testing/SKILL.md +10 -0
  245. package/dist/agents/skills/cursor/harness-ml-ops/SKILL.md +10 -0
  246. package/dist/agents/skills/cursor/harness-mobile-patterns/SKILL.md +10 -0
  247. package/dist/agents/skills/cursor/harness-mutation-test/SKILL.md +9 -0
  248. package/dist/agents/skills/cursor/harness-observability/SKILL.md +10 -0
  249. package/dist/agents/skills/cursor/harness-onboarding/SKILL.md +18 -10
  250. package/dist/agents/skills/cursor/harness-parallel-agents/SKILL.md +9 -0
  251. package/dist/agents/skills/cursor/harness-perf/SKILL.md +11 -0
  252. package/dist/agents/skills/cursor/harness-perf-tdd/SKILL.md +10 -0
  253. package/dist/agents/skills/cursor/harness-planning/SKILL.md +10 -0
  254. package/dist/agents/skills/cursor/harness-pre-commit-review/SKILL.md +9 -0
  255. package/dist/agents/skills/cursor/harness-product-spec/SKILL.md +10 -0
  256. package/dist/agents/skills/cursor/harness-property-test/SKILL.md +10 -0
  257. package/dist/agents/skills/cursor/harness-refactoring/SKILL.md +9 -0
  258. package/dist/agents/skills/cursor/harness-release-readiness/SKILL.md +11 -0
  259. package/dist/agents/skills/cursor/harness-resilience/SKILL.md +10 -0
  260. package/dist/agents/skills/cursor/harness-roadmap/SKILL.md +10 -1
  261. package/dist/agents/skills/cursor/harness-roadmap-pilot/SKILL.md +8 -0
  262. package/dist/agents/skills/cursor/harness-secrets/SKILL.md +10 -0
  263. package/dist/agents/skills/cursor/harness-security-review/SKILL.md +10 -0
  264. package/dist/agents/skills/cursor/harness-security-scan/SKILL.md +5 -15
  265. package/dist/agents/skills/cursor/harness-skill-authoring/SKILL.md +29 -1
  266. package/dist/agents/skills/cursor/harness-soundness-review/SKILL.md +10 -0
  267. package/dist/agents/skills/cursor/harness-sql-review/SKILL.md +10 -0
  268. package/dist/agents/skills/cursor/harness-state-management/SKILL.md +10 -0
  269. package/dist/agents/skills/cursor/harness-supply-chain-audit/SKILL.md +10 -0
  270. package/dist/agents/skills/cursor/harness-tdd/SKILL.md +9 -0
  271. package/dist/agents/skills/cursor/harness-test-advisor/SKILL.md +10 -0
  272. package/dist/agents/skills/cursor/harness-test-data/SKILL.md +10 -0
  273. package/dist/agents/skills/cursor/harness-ux-copy/SKILL.md +10 -0
  274. package/dist/agents/skills/cursor/harness-verification/SKILL.md +9 -0
  275. package/dist/agents/skills/cursor/harness-verify/SKILL.md +10 -0
  276. package/dist/agents/skills/cursor/harness-visual-regression/SKILL.md +10 -0
  277. package/dist/agents/skills/cursor/initialize-harness-project/SKILL.md +22 -13
  278. package/dist/agents/skills/cursor/validate-context-engineering/SKILL.md +9 -0
  279. package/dist/agents/skills/gemini-cli/add-harness-component/SKILL.md +21 -12
  280. package/dist/agents/skills/gemini-cli/align-documentation/SKILL.md +9 -0
  281. package/dist/agents/skills/gemini-cli/check-mechanical-constraints/SKILL.md +9 -0
  282. package/dist/agents/skills/gemini-cli/cleanup-dead-code/SKILL.md +11 -0
  283. package/dist/agents/skills/gemini-cli/detect-doc-drift/SKILL.md +9 -0
  284. package/dist/agents/skills/gemini-cli/enforce-architecture/SKILL.md +5 -15
  285. package/dist/agents/skills/gemini-cli/harness-accessibility/SKILL.md +10 -0
  286. package/dist/agents/skills/gemini-cli/harness-api-design/SKILL.md +5 -15
  287. package/dist/agents/skills/gemini-cli/harness-architecture-advisor/SKILL.md +5 -15
  288. package/dist/agents/skills/gemini-cli/harness-auth/SKILL.md +5 -15
  289. package/dist/agents/skills/gemini-cli/harness-autopilot/SKILL.md +10 -0
  290. package/dist/agents/skills/gemini-cli/harness-brainstorming/SKILL.md +9 -0
  291. package/dist/agents/skills/gemini-cli/harness-caching/SKILL.md +10 -0
  292. package/dist/agents/skills/gemini-cli/harness-chaos/SKILL.md +10 -0
  293. package/dist/agents/skills/gemini-cli/harness-code-review/SKILL.md +5 -15
  294. package/dist/agents/skills/gemini-cli/harness-codebase-cleanup/SKILL.md +11 -0
  295. package/dist/agents/skills/gemini-cli/harness-compliance/SKILL.md +10 -0
  296. package/dist/agents/skills/gemini-cli/harness-containerization/SKILL.md +10 -0
  297. package/dist/agents/skills/gemini-cli/harness-data-pipeline/SKILL.md +10 -0
  298. package/dist/agents/skills/gemini-cli/harness-data-validation/SKILL.md +10 -0
  299. package/dist/agents/skills/gemini-cli/harness-database/SKILL.md +5 -15
  300. package/dist/agents/skills/gemini-cli/harness-debugging/SKILL.md +9 -0
  301. package/dist/agents/skills/gemini-cli/harness-dependency-health/SKILL.md +10 -0
  302. package/dist/agents/skills/gemini-cli/harness-deployment/SKILL.md +5 -15
  303. package/dist/agents/skills/gemini-cli/harness-design/SKILL.md +10 -0
  304. package/dist/agents/skills/gemini-cli/harness-design-mobile/SKILL.md +10 -0
  305. package/dist/agents/skills/gemini-cli/harness-design-system/SKILL.md +10 -0
  306. package/dist/agents/skills/gemini-cli/harness-design-web/SKILL.md +10 -0
  307. package/dist/agents/skills/gemini-cli/harness-diagnostics/SKILL.md +9 -0
  308. package/dist/agents/skills/gemini-cli/harness-docs-pipeline/SKILL.md +11 -0
  309. package/dist/agents/skills/gemini-cli/harness-dx/SKILL.md +10 -0
  310. package/dist/agents/skills/gemini-cli/harness-e2e/SKILL.md +9 -0
  311. package/dist/agents/skills/gemini-cli/harness-event-driven/SKILL.md +10 -0
  312. package/dist/agents/skills/gemini-cli/harness-execution/SKILL.md +9 -0
  313. package/dist/agents/skills/gemini-cli/harness-feature-flags/SKILL.md +10 -0
  314. package/dist/agents/skills/gemini-cli/harness-git-workflow/SKILL.md +10 -0
  315. package/dist/agents/skills/gemini-cli/harness-hotspot-detector/SKILL.md +10 -0
  316. package/dist/agents/skills/gemini-cli/harness-i18n/SKILL.md +10 -0
  317. package/dist/agents/skills/gemini-cli/harness-i18n-process/SKILL.md +10 -0
  318. package/dist/agents/skills/gemini-cli/harness-i18n-workflow/SKILL.md +10 -0
  319. package/dist/agents/skills/gemini-cli/harness-impact-analysis/SKILL.md +10 -0
  320. package/dist/agents/skills/gemini-cli/harness-incident-response/SKILL.md +10 -0
  321. package/dist/agents/skills/gemini-cli/harness-infrastructure-as-code/SKILL.md +10 -0
  322. package/dist/agents/skills/gemini-cli/harness-integration-test/SKILL.md +9 -0
  323. package/dist/agents/skills/gemini-cli/harness-integrity/SKILL.md +10 -0
  324. package/dist/agents/skills/gemini-cli/harness-knowledge-mapper/SKILL.md +9 -0
  325. package/dist/agents/skills/gemini-cli/harness-load-testing/SKILL.md +10 -0
  326. package/dist/agents/skills/gemini-cli/harness-ml-ops/SKILL.md +10 -0
  327. package/dist/agents/skills/gemini-cli/harness-mobile-patterns/SKILL.md +10 -0
  328. package/dist/agents/skills/gemini-cli/harness-mutation-test/SKILL.md +9 -0
  329. package/dist/agents/skills/gemini-cli/harness-observability/SKILL.md +10 -0
  330. package/dist/agents/skills/gemini-cli/harness-onboarding/SKILL.md +18 -10
  331. package/dist/agents/skills/gemini-cli/harness-parallel-agents/SKILL.md +9 -0
  332. package/dist/agents/skills/gemini-cli/harness-perf/SKILL.md +11 -0
  333. package/dist/agents/skills/gemini-cli/harness-perf-tdd/SKILL.md +10 -0
  334. package/dist/agents/skills/gemini-cli/harness-planning/SKILL.md +10 -0
  335. package/dist/agents/skills/gemini-cli/harness-pre-commit-review/SKILL.md +9 -0
  336. package/dist/agents/skills/gemini-cli/harness-product-spec/SKILL.md +10 -0
  337. package/dist/agents/skills/gemini-cli/harness-property-test/SKILL.md +10 -0
  338. package/dist/agents/skills/gemini-cli/harness-refactoring/SKILL.md +9 -0
  339. package/dist/agents/skills/gemini-cli/harness-release-readiness/SKILL.md +11 -0
  340. package/dist/agents/skills/gemini-cli/harness-resilience/SKILL.md +10 -0
  341. package/dist/agents/skills/gemini-cli/harness-roadmap/SKILL.md +10 -1
  342. package/dist/agents/skills/gemini-cli/harness-roadmap-pilot/SKILL.md +8 -0
  343. package/dist/agents/skills/gemini-cli/harness-secrets/SKILL.md +10 -0
  344. package/dist/agents/skills/gemini-cli/harness-security-review/SKILL.md +10 -0
  345. package/dist/agents/skills/gemini-cli/harness-security-scan/SKILL.md +5 -15
  346. package/dist/agents/skills/gemini-cli/harness-skill-authoring/SKILL.md +29 -1
  347. package/dist/agents/skills/gemini-cli/harness-soundness-review/SKILL.md +10 -0
  348. package/dist/agents/skills/gemini-cli/harness-sql-review/SKILL.md +10 -0
  349. package/dist/agents/skills/gemini-cli/harness-state-management/SKILL.md +10 -0
  350. package/dist/agents/skills/gemini-cli/harness-supply-chain-audit/SKILL.md +10 -0
  351. package/dist/agents/skills/gemini-cli/harness-tdd/SKILL.md +9 -0
  352. package/dist/agents/skills/gemini-cli/harness-test-advisor/SKILL.md +10 -0
  353. package/dist/agents/skills/gemini-cli/harness-test-data/SKILL.md +10 -0
  354. package/dist/agents/skills/gemini-cli/harness-ux-copy/SKILL.md +10 -0
  355. package/dist/agents/skills/gemini-cli/harness-verification/SKILL.md +9 -0
  356. package/dist/agents/skills/gemini-cli/harness-verify/SKILL.md +10 -0
  357. package/dist/agents/skills/gemini-cli/harness-visual-regression/SKILL.md +10 -0
  358. package/dist/agents/skills/gemini-cli/initialize-harness-project/SKILL.md +22 -13
  359. package/dist/agents/skills/gemini-cli/validate-context-engineering/SKILL.md +9 -0
  360. package/dist/agents-md-HCCCO5PK.js +9 -0
  361. package/dist/{architecture-EDSBAGR4.js → architecture-S2H624W7.js} +5 -5
  362. package/dist/{assess-project-CEDY4JU3.js → assess-project-XSGK44S5.js} +1 -1
  363. package/dist/bin/harness-mcp.js +18 -18
  364. package/dist/bin/harness.js +124 -35
  365. package/dist/{check-phase-gate-N3DTKFCZ.js → check-phase-gate-UGBJ237T.js} +5 -5
  366. package/dist/{chunk-AIBAYANF.js → chunk-2DHX6TAP.js} +4 -4
  367. package/dist/{chunk-ENA4O4WD.js → chunk-2GT3HO2T.js} +3 -3
  368. package/dist/{chunk-TJ6NLLAY.js → chunk-2YA4XRI3.js} +5 -5
  369. package/dist/{chunk-GZKSBLQL.js → chunk-35EQ5UEI.js} +1 -1
  370. package/dist/{chunk-T5QWCVGK.js → chunk-4FHBPA3E.js} +11 -3
  371. package/dist/{chunk-ERS5EVUZ.js → chunk-5LMZA5LZ.js} +10 -10
  372. package/dist/{chunk-SM22U22L.js → chunk-BK52Z6DR.js} +869 -419
  373. package/dist/{chunk-5SWE24IG.js → chunk-CLD4KL7O.js} +342 -72
  374. package/dist/{chunk-OD3S2NHN.js → chunk-E2GTL3YS.js} +1 -1
  375. package/dist/{chunk-YLN34N65.js → chunk-FP53DDB5.js} +1 -1
  376. package/dist/{chunk-TLDCCPUZ.js → chunk-I47JLISV.js} +1 -1
  377. package/dist/{chunk-AKVG4MMZ.js → chunk-KC5CTCEL.js} +9 -9
  378. package/dist/{chunk-26AUZBV4.js → chunk-KTL3PHNQ.js} +6445 -6222
  379. package/dist/{chunk-DBSOCI3G.js → chunk-KV4M6Y5J.js} +1 -1
  380. package/dist/{chunk-FIAPHX37.js → chunk-LM5Z2WCA.js} +1 -1
  381. package/dist/{chunk-SD3SQOZ2.js → chunk-LOUH2LIC.js} +1 -1
  382. package/dist/{chunk-QUKH6QCJ.js → chunk-MHOO7NLG.js} +11 -11
  383. package/dist/{chunk-HT4VPPB4.js → chunk-MZAHE4DK.js} +12 -12
  384. package/dist/{chunk-A4AI3H3R.js → chunk-NKL53UBL.js} +6 -6
  385. package/dist/{chunk-GJRUIXUK.js → chunk-PGF44T2D.js} +6 -6
  386. package/dist/{chunk-H7Y5CKTM.js → chunk-Q3XYV5UC.js} +1 -1
  387. package/dist/{chunk-TD6MQUV2.js → chunk-S5ZXT3TZ.js} +1 -1
  388. package/dist/{chunk-6KWBH4EO.js → chunk-UGD37ECK.js} +5 -5
  389. package/dist/{chunk-XDAIFVGC.js → chunk-V27WDRYV.js} +603 -525
  390. package/dist/{chunk-YQ6KC6TE.js → chunk-YDRB55Q4.js} +1 -1
  391. package/dist/{chunk-2LAEDVOC.js → chunk-ZRYDYDB2.js} +6 -6
  392. package/dist/{chunk-LIWGCYON.js → chunk-ZYJJUPNE.js} +1 -1
  393. package/dist/ci-workflow-I3V7FZNV.js +9 -0
  394. package/dist/{create-skill-U3XCFRZN.js → create-skill-AO25CJFM.js} +2 -2
  395. package/dist/{dist-USY2C5JL.js → dist-666AAZQ6.js} +1 -1
  396. package/dist/{dist-DZ63LLUD.js → dist-KQSTRP36.js} +1 -1
  397. package/dist/{dist-YIKUBJLQ.js → dist-MKWF5CXR.js} +7 -3
  398. package/dist/{dist-OEXTQQZC.js → dist-WU3TVNNG.js} +7 -1
  399. package/dist/{docs-F5G7NAFF.js → docs-R7UVQBMQ.js} +5 -5
  400. package/dist/engine-JGI3MWAC.js +9 -0
  401. package/dist/{entropy-A5Q2USYX.js → entropy-IDHIG7HS.js} +4 -4
  402. package/dist/{feedback-2EU25RIW.js → feedback-JZETY4UR.js} +1 -1
  403. package/dist/{generate-agent-definitions-HNJHO5YQ.js → generate-agent-definitions-D7B25YTM.js} +6 -6
  404. package/dist/{graph-loader-XULF5QF7.js → graph-loader-BJULJYGG.js} +1 -1
  405. package/dist/index.d.ts +20 -16
  406. package/dist/index.js +54 -54
  407. package/dist/loader-E4KNTOP2.js +11 -0
  408. package/dist/mcp-67I2DBNM.js +37 -0
  409. package/dist/{performance-YAY2A6A6.js → performance-744OSR6P.js} +5 -5
  410. package/dist/{review-pipeline-YD4WI3JM.js → review-pipeline-HIO7HBW4.js} +1 -1
  411. package/dist/runtime-JXQ26U4Z.js +10 -0
  412. package/dist/{security-IBSUKMVD.js → security-GDKHVFUC.js} +1 -1
  413. package/dist/{validate-NHXWKMCR.js → validate-2IUR3OWX.js} +5 -5
  414. package/dist/validate-cross-check-AM4T6P2K.js +9 -0
  415. package/package.json +5 -5
  416. package/dist/agents-md-GLKJSGKT.js +0 -9
  417. package/dist/ci-workflow-LE3QF4FP.js +0 -9
  418. package/dist/engine-LX5RVGXN.js +0 -9
  419. package/dist/loader-GWIEW4HM.js +0 -11
  420. package/dist/mcp-ID3LR6JB.js +0 -37
  421. package/dist/runtime-UJ4YO4CA.js +0 -10
  422. package/dist/validate-cross-check-R3GV2MLM.js +0 -9
  423. package/dist/{chunk-CJDVBBPB.js → chunk-3ISINLYT.js} +1 -1
@@ -256,6 +256,15 @@ describe('ProjectService contract', () => {
256
256
  });
257
257
  ```
258
258
 
259
+ ## Rationalizations to Reject
260
+
261
+ | Rationalization | Why It Is Wrong |
262
+ | ----------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
263
+ | "Testing the happy path is sufficient -- error scenarios are edge cases" | The success criteria require error scenarios (400, 401, 403, 404, 500, timeout) for all public endpoints. Error paths are where real-world failures happen. |
264
+ | "We can test against the staging environment instead of setting up local mocks" | No integration tests that require external staging environments for CI. Tests must run with local test doubles. |
265
+ | "The consumer contract changed, so I will update the consumer test to match the provider" | Contract changes must be coordinated. The provider may have introduced a bug, not an intentional change. |
266
+ | "Tests pass when I run them in order, so they are fine" | Phase 4 requires running tests in random order. Any test that fails only in a specific order has a shared-state bug. |
267
+
259
268
  ## Gates
260
269
 
261
270
  - **No integration tests that require external staging environments for CI.** Every integration test must run with local test doubles (mocks, containers, in-memory databases). Tests that fail without a staging VPN are not integration tests -- they are environment tests.
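
The last row of the table in this hunk says that a test which fails only in a specific order has a shared-state bug. A minimal TypeScript sketch of that failure mode follows. The names `sharedCache`, `writer`, `reader`, and `runInOrder` are hypothetical illustrations and do not come from the skill file.

```typescript
// Hypothetical module-level state that two tests both touch.
const sharedCache = new Map<string, number>();

type TestCase = { name: string; run: () => boolean };

const writer: TestCase = {
  name: "writes to cache",
  run: () => {
    sharedCache.set("user", 1);
    return sharedCache.get("user") === 1;
  },
};

// Passes only when it runs before the writer: a hidden order dependence.
const reader: TestCase = {
  name: "expects empty cache",
  run: () => sharedCache.size === 0,
};

// Runs the tests in the given order and returns the names of the failures.
function runInOrder(order: TestCase[]): string[] {
  sharedCache.clear(); // simulate a fresh process before each full run
  return order.filter((t) => !t.run()).map((t) => t.name);
}

const forward = runInOrder([writer, reader]);  // ["expects empty cache"]
const reversed = runInOrder([reader, writer]); // []
```

The same two tests pass in one order and fail in the other, which is the kind of defect a random-order run is meant to surface. Resetting the state in per-test setup would remove the dependence.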
@@ -122,6 +122,16 @@ Rules:
122
122
  - [ ] Unified report follows the exact format
123
123
  - [ ] Overall verdict correctly reflects both mechanical and review results
124
124
 
125
+ ## Rationalizations to Reject
126
+
127
+ These are common rationalizations that sound reasonable but lead to incorrect results. When you catch yourself thinking any of these, stop and follow the documented process instead.
128
+
129
+ | Rationalization | Why It Is Wrong |
130
+ | -------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
131
+ | "All three mechanical checks failed, but I should still run the AI review to get useful feedback" | When ALL three checks fail, stop immediately. Do not proceed to Phase 2. AI review on code that does not compile is wasted effort. |
132
+ | "The security scanner found a warning but it is not high severity, so it should not affect the overall result" | Error-severity security findings are blocking. The distinction is severity, not the agent's opinion of importance. |
133
+ | "The AI review flagged an architectural concern as blocking, so the integrity check should fail" | Only runtime errors, data loss, and security vulnerabilities count as blocking review findings. Architectural concerns are noted but do not block. |
134
+
125
135
  ## Examples
126
136
 
127
137
  ### Example: All Clear
@@ -162,6 +162,15 @@ This ensures subsequent graph queries (impact analysis, drift detection) include
162
162
  - Report follows the structured output format
163
163
  - All findings are backed by graph query evidence (with graph) or directory/file analysis (without graph)
164
164
 
165
+ ## Rationalizations to Reject
166
+
167
+ | Rationalization | Why It Is Wrong |
168
+ | --------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
169
+ | "The graph is a few commits behind, but it is close enough for knowledge mapping" | If the graph is more than 10 commits behind, run harness scan before proceeding. A stale graph produces a knowledge map with missing modules. |
170
+ | "No graph exists, so this skill cannot produce useful output" | The fallback strategy is explicit: use directory structure and file analysis. Fallback completeness is ~50%, significantly better than nothing. |
171
+ | "The existing AGENTS.md is outdated, so I will overwrite it with the generated version" | Never overwrite without confirmation. Existing AGENTS.md may contain carefully authored context the graph cannot infer. |
172
+ | "The module descriptions I inferred from function names are accurate enough" | Inferred descriptions are starting points. Phase 3 (AUDIT) exists to identify coverage gaps. Name-based inference misses purpose, constraints, and relationships. |
173
+
165
174
  ## Examples
166
175
 
167
176
  ### Example: Generating AGENTS.md from Graph
@@ -259,6 +259,16 @@ Phase 4: ANALYZE
259
259
  Recommendation: Add DataLoader for orders resolver, re-test after fix
260
260
  ```
261
261
 
262
+ ## Rationalizations to Reject
263
+
264
+ | Rationalization | Reality |
265
+ | ----------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
266
+ | "The smoke test passed, so the full load test will probably be fine too." | A smoke test at 1-2 VUs tells you the script runs — it says nothing about behavior at 100 or 1000 VUs. Connection pool exhaustion, lock contention, and GC pressure only appear under load. Smoke passing is the floor, not the ceiling. |
267
+ | "Staging is smaller than production, so results won't be accurate anyway — no point running the full test." | Staging results are always useful as a proxy: they reveal algorithmic bottlenecks, N+1 queries, and missing indexes that scale identically regardless of instance count. Document the scale factor and use it. Do not skip testing because the environment is imperfect. |
268
+ | "We haven't changed the API, so the old load test baselines still apply." | Baselines go stale when dependencies update, traffic patterns shift, or adjacent services change. A deployment that adds one middleware layer or changes a database index can move p99 by 200ms. Baselines must be re-validated, not assumed. |
269
+ | "The p95 threshold is arbitrary — let's just relax it until the test passes." | A threshold without a documented basis is a guess. A threshold lowered to make a failing test pass is a suppressed regression. Thresholds must be derived from SLOs or measured baselines. If the SLO is wrong, change the SLO explicitly with stakeholder sign-off. |
270
+ | "We'll run the soak test later — we just need to ship the load test first." | Soak tests catch failures that only emerge over hours: memory leaks, connection pool exhaustion, log file growth. If the feature involves a long-lived process, background worker, or WebSocket, skipping the soak test means the failure surfaces in production. |
271
+
262
272
  ## Gates
263
273
 
264
274
  - **No load tests against production without explicit human approval.** Load tests can cause real outages. The target environment must be verified as non-production before execution. If production testing is required, a `[checkpoint:human-verify]` must be passed with documented approval.
@@ -326,6 +326,16 @@ Phase 4: VALIDATE
326
326
  After fixes: projected NEEDS_ATTENTION (missing precision/recall metrics)
327
327
  ```
328
328
 
329
+ ## Rationalizations to Reject
330
+
331
+ | Rationalization | Reality |
332
+ | ----------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
333
+ | "We re-trained with more data but the architecture is the same — the previous evaluation still applies." | Evaluation results are bound to a specific model artifact, not to the architecture. A re-trained model with different weights can have dramatically different failure modes even if accuracy appears similar. Every model version that goes to production must be evaluated against the golden set, not inherited from its predecessor. |
334
+ | "The model file is only 8MB — committing it to git is more convenient than setting up an artifact store." | Model files in git corrupt repository history, explode clone times for all contributors, and cannot be versioned alongside experiment metadata. Convenience now creates permanent technical debt. The artifact store setup is a one-time cost; git pollution is permanent. |
335
+ | "Loading the model inside the request handler is simpler — the model is small enough that latency won't be noticeable." | Per-request model loading adds I/O and deserialization on every inference call, holds no persistent state across requests, and collapses under any meaningful concurrency. "Small enough" is a guess without measurement. Models must be loaded at startup and held in memory. |
336
+ | "We can add experiment tracking after we get the model working — right now we just need to iterate quickly." | Experiment tracking is hardest to add retroactively because you cannot reconstruct the conditions of runs you did not log. The runs being executed without tracking right now are the ones producing the model that may go to production. Log them now or accept that the model is not reproducible. |
337
+ | "The prompt template is short enough to read in context — version controlling it adds unnecessary process." | Prompts embedded in application code change silently when developers edit them, have no history of what changed and why, and cannot be evaluated independently. A prompt is a model artifact. It requires the same versioning, evaluation, and promotion discipline as model weights. |
338
+
329
339
  ## Gates
330
340
 
331
341
  - **No deploying models without evaluation.** A model that has not been evaluated against a golden set or baseline cannot be promoted to production. This is always an error.
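
The table row on per-request model loading can be illustrated with a small sketch. `loadModel` here is a hypothetical stand-in for an expensive deserialization step, and `loadCount` exists only to make the difference observable. None of these names come from the skill file.

```typescript
type Model = { predict: (x: number) => number };

let loadCount = 0;

// Hypothetical stand-in for reading and deserializing a model artifact.
function loadModel(): Model {
  loadCount += 1;
  return { predict: (x) => x * 2 };
}

// Anti-pattern: every request pays the load cost again.
function handlePerRequest(x: number): number {
  return loadModel().predict(x);
}

// Loaded once at startup and held in memory for all requests.
const sharedModel = loadModel();
function handleShared(x: number): number {
  return sharedModel.predict(x);
}

const loadsBefore = loadCount;            // 1 (the startup load)
[1, 2, 3].forEach((x) => handleShared(x));
const loadsAfterShared = loadCount;       // still 1
[1, 2, 3].forEach((x) => handlePerRequest(x));
const loadsAfterPerRequest = loadCount;   // 4
```

Both handlers return the same predictions. Only the number of loads differs, and that number grows with traffic in the per-request variant.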
@@ -311,6 +311,16 @@ Phase 4: VALIDATE
311
311
  Store submission ready: PASS
312
312
  ```
313
313
 
314
+ ## Rationalizations to Reject
315
+
316
+ | Rationalization | Reality |
317
+ | -------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
318
+ | "We request all permissions at launch to get them out of the way — users can deny them if they want." | App stores treat permissions-at-launch as a review red flag and users deny at much higher rates when there is no contextual explanation. Permissions requested at the moment they are needed, with a sentence explaining why, consistently achieve higher grant rates and reduce store rejection risk. |
319
+ | "Universal Links are optional — the URL scheme fallback works fine for deep linking." | URL scheme fallbacks (`myapp://`) can be claimed by any installed app on the device. A malicious or coincidentally named app can intercept links intended for yours. Universal Links with verified `apple-app-site-association` files are cryptographically bound to your domain and cannot be hijacked. |
320
+ | "The push notification handler works in foreground and background — we can handle the terminated state separately after launch." | Users often first interact with an app by tapping a push notification when the app is terminated. The cold-start tap handler is commonly the first impression. Shipping without it means a class of users experiences a broken entry point from day one. |
321
+ | "The staging configuration is slightly different but we'll remember to change it before the App Store build." | "Remember to change it" is not a process. Staging URLs, debug API keys, and sandbox APNs environments in production builds have shipped before and will again. Separate build configurations and environment-specific entitlement files are the only reliable mitigation. |
322
+ | "The privacy manifest requirement is new — we'll add it in the next release after the store flags it." | Apple has enforced PrivacyInfo.xcprivacy requirements for new submissions and updates since May 2024. Submitting without it results in rejection, which blocks the entire release. Adding it retroactively under rejection pressure is strictly more costly than adding it now. |
323
+
314
324
  ## Gates
315
325
 
316
326
  - **No missing permission usage descriptions.** Every permission requested in code must have a corresponding usage description in the platform manifest. Missing descriptions cause automatic App Store rejection on iOS and are a best practice requirement on Android.
@@ -236,6 +236,15 @@ mvn org.pitest:pitest-maven:mutationCoverage
236
236
  # Report generated at target/pit-reports/index.html
237
237
  ```
238
238
 
239
+ ## Rationalizations to Reject
240
+
241
+ | Rationalization | Why It Is Wrong |
242
+ | ---------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------- |
243
+ | "We have 80% line coverage, so test quality is already good" | Line coverage measures execution, not verification. Mutation testing reveals missing assertions and weak assertions. |
244
+ | "The survived mutants are in non-critical utility code, so we can ignore them" | Every survived mutant must be either addressed with a test or explicitly justified as an equivalent mutant. |
245
+ | "I will write a test that targets the specific mutation to kill it" | No gaming the mutation score. Every new test must test a meaningful behavior, not just kill a specific mutant. |
246
+ | "The test suite has some failures, but we can still run mutation testing to see what we learn" | No mutation testing against a failing test suite. Mutations against broken tests produce garbage results. |
247
+
239
248
  ## Gates
240
249
 
241
250
  - **No mutation testing against a failing test suite.** All tests must pass before mutants are generated. Running mutations against broken tests produces garbage results. Fix the tests first.
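
The first table row in this hunk contrasts line coverage with verification. The sketch below shows one way that gap can look. `isAdult`, the hand-written mutant, and both tests are hypothetical, and a real mutation tool would generate the mutant automatically.

```typescript
type Impl = (age: number) => boolean;

// Original implementation and a hypothetical boundary mutant (>= changed to >).
const isAdult: Impl = (age) => age >= 18;
const isAdultMutant: Impl = (age) => age > 18;

// Weak test: executes every line of the function but never probes the boundary.
const weakTest = (impl: Impl): boolean => impl(30) === true;

// Stronger test: asserts the boundary, which is a meaningful behavior of the function.
const boundaryTest = (impl: Impl): boolean =>
  impl(30) === true && impl(18) === true && impl(17) === false;

// A mutant "survives" when the test still passes against it.
const survives = (test: (impl: Impl) => boolean): boolean => test(isAdultMutant);

const weakSurvives = survives(weakTest);         // true: coverage without verification
const boundarySurvives = survives(boundaryTest); // false: the mutant is killed
```

The boundary test is written against the requirement ("18 counts as adult") rather than against the mutation, which is the distinction the third table row draws.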
@@ -268,6 +268,16 @@ Phase 4: VALIDATE
268
268
  Result: WARN -- 3 instrumentation gaps, alerting needs SLO alignment
269
269
  ```
270
270
 
271
+ ## Rationalizations to Reject
272
+
273
+ | Rationalization | Reality |
274
+ | -------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
275
+ | "We can see what's happening in CloudWatch logs — we don't need structured logging" | Unstructured log lines cannot be queried, aggregated, or correlated across services. When an incident spans three services, searching for a request ID across unstructured logs is manual forensics. Structured logging is not a nicety — it is the foundation for incident response. |
276
+ | "We'll add alerting once we've seen a few incidents and know what to alert on" | The first incident is the worst time to define alerting. SLO-based burn rate alerts can be defined from traffic patterns before any incidents occur. Waiting for incidents to define thresholds means every early failure goes undetected. |
277
+ | "User ID is a useful label for the latency metric — it helps us debug per-user issues" | User ID as a metric label creates one time series per user, which at 100,000 users means 100,000 label combinations. High-cardinality labels exhaust metric storage, cause query timeouts, and make the entire metrics system unstable. Use logs for per-user debugging; use metrics for aggregate signals. |
278
+ | "The tracing library is initialized, so we have distributed tracing" | Initializing the library creates root spans but does not propagate context across HTTP boundaries, instrument database calls, or connect traces to logs. Trace initialization without verified end-to-end propagation produces disconnected, useless traces. |
279
+ | "We have alerts — they're just not linked to runbooks yet" | An alert that fires at 3am without a runbook link requires the on-call engineer to start debugging from scratch. The absence of a runbook is not a documentation gap; it is a mean-time-to-recover multiplier. |
280
+
271
281
  ## Gates
272
282
 
273
283
  - **No sensitive data in logs.** If PII, credentials, or tokens are detected in log output, it is a blocking finding. The logging configuration must sanitize or redact sensitive fields before any other improvements are made.
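
The cardinality row in the table above rests on simple arithmetic: the number of time series for a metric is the product of the distinct values of each label. A rough sketch follows. The label names and counts are illustrative assumptions, apart from the 100,000 users mentioned in the table.

```typescript
// Distinct time series = product of the number of distinct values per label.
function seriesCount(labelCardinalities: Record<string, number>): number {
  return Object.values(labelCardinalities).reduce((acc, n) => acc * n, 1);
}

// Hypothetical latency metric labeled by route and status only.
const aggregateSeries = seriesCount({ route: 20, status: 5 }); // 100

// Same metric with a user_id label added at 100,000 users.
const perUserSeries = seriesCount({ route: 20, status: 5, user_id: 100000 }); // 10000000
```

Adding the one per-user label multiplies the series count by the number of users, which is why the table recommends logs for per-user debugging and metrics for aggregate signals.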
@@ -23,7 +23,7 @@
23
23
  - Constraints and forbidden patterns
24
24
  - Any special instructions or warnings
25
25
 
26
- 2. **Read `harness.yaml`.** Extract:
26
+ 2. **Read `harness.config.json`.** Extract:
27
27
  - Project name and stack
28
28
  - Adoption level (basic, intermediate, advanced)
29
29
  - Layer definitions and their directory mappings
@@ -48,7 +48,7 @@
48
48
  2. **Map the architecture.** Walk the directory structure and identify:
49
49
  - Top-level organization pattern (monorepo, single package, workspace)
50
50
  - Source code location and entry points
51
- - Layer boundaries (from `harness.yaml` and actual directory structure)
51
+ - Layer boundaries (from `harness.config.json` and actual directory structure)
52
52
  - Shared utilities or common modules
53
53
  - Configuration files and their purposes
54
54
 
@@ -61,7 +61,7 @@
61
61
  - Code formatting (detect from config files: `.prettierrc`, `.eslintrc`, `biome.json`)
62
62
 
63
63
  4. **Map the constraints.** Identify what is restricted:
64
- - Forbidden imports (from `harness.yaml` dependency constraints)
64
+ - Forbidden imports (from `harness.config.json` dependency constraints)
65
65
  - Layer boundary rules (which layers can import from which)
66
66
  - Linting rules that encode architectural decisions
67
67
  - Any constraints documented in `AGENTS.md` that are not yet automated
@@ -95,8 +95,8 @@ Graph queries produce a complete architecture map in seconds, including transiti
95
95
 
96
96
  ### Phase 3: ORIENT — Identify Adoption Level and Maturity
97
97
 
98
- 1. **Confirm the adoption level** matches what `harness.yaml` declares:
99
- - Basic: `AGENTS.md` and `harness.yaml` exist but no layers or constraints
98
+ 1. **Confirm the adoption level** matches what `harness.config.json` declares:
99
+ - Basic: `AGENTS.md` and `harness.config.json` exist but no layers or constraints
100
100
  - Intermediate: Layers defined, dependency constraints enforced, at least one custom skill
101
101
  - Advanced: Personas, state management, learnings, CI integration
102
102
 
@@ -184,21 +184,29 @@ Graph queries produce a complete architecture map in seconds, including transiti
184
184
  - **`harness check-deps`** — Run to verify dependency constraints are passing, which confirms layer boundaries are respected.
185
185
  - **`harness state show`** — View current state to understand where the last session left off.
186
186
  - **`AGENTS.md`** — Primary source of project context and agent instructions.
187
- - **`harness.yaml`** — Source of structural configuration (layers, constraints, skills).
187
+ - **`harness.config.json`** — Source of structural configuration (layers, constraints, skills).
188
188
  - **`.harness/learnings.md`** — Historical context and institutional knowledge.
189
189
 
190
190
  ## Success Criteria
191
191
 
192
- - All four configuration sources were read (`AGENTS.md`, `harness.yaml`, `.harness/learnings.md`, `.harness/state.json`)
192
+ - All four configuration sources were read (`AGENTS.md`, `harness.config.json`, `.harness/learnings.md`, `.harness/state.json`)
193
193
  - Technology stack is accurately identified (language, framework, test runner, build tool)
194
194
  - Architecture is mapped with correct layer boundaries and dependency directions
195
195
  - Conventions are identified from actual code patterns, not assumed
196
- - Constraints are enumerated from both `harness.yaml` and `AGENTS.md`
196
+ - Constraints are enumerated from both `harness.config.json` and `AGENTS.md`
197
197
  - Adoption level is confirmed (not just declared — validated)
198
198
  - A structured orientation summary is produced with all sections filled
199
199
  - The "Getting Started" section is actionable and tailored to the audience
200
200
  - `harness validate` was run and results are reported
201
201
 
202
+ ## Rationalizations to Reject
203
+
204
+ | Rationalization | Reality |
205
+ | -------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
206
+ | "I can skip reading .harness/learnings.md since it is just historical notes" | Learnings contain hard-won insights from previous sessions -- decisions made, gotchas discovered, patterns that worked or failed. Skipping them means repeating mistakes already diagnosed. |
207
+ | "The harness.config.json says intermediate, so I can report that without validation" | Declared adoption level must be confirmed, not assumed. A project that declares intermediate but fails harness validate is not truly intermediate. |
208
+ | "I will map the architecture by reading the directory names since that is faster than checking conventions in actual code" | Conventions must be identified from actual code patterns, not assumed from directory structure. File naming, import style, and error handling can only be verified by reading real source files. |
209
+
202
210
  ## Examples
203
211
 
204
212
  ### Example: Onboarding to an Intermediate TypeScript Project
@@ -211,7 +219,7 @@ Read AGENTS.md:
211
219
  - Stack: TypeScript, Express, Vitest, PostgreSQL
212
220
  - Conventions: zod validation, repository pattern, kebab-case files
213
221
 
214
- Read harness.yaml:
222
+ Read harness.config.json:
215
223
  - Level: intermediate
216
224
  - Layers: presentation (src/routes/), business (src/services/), data (src/repositories/)
217
225
  - Constraints: presentation → business OK, business → data OK, data → presentation FORBIDDEN
@@ -258,7 +266,7 @@ Produce orientation with all sections. Getting Started for this context:
258
266
 
259
267
  ```
260
268
  Read AGENTS.md — exists, minimal content
261
- Read harness.yaml — level: basic, no layers defined
269
+ Read harness.config.json — level: basic, no layers defined
262
270
  No .harness/learnings.md
263
271
  No .harness/state.json
264
272
  ```
@@ -159,6 +159,15 @@ For each independent task, write a focused agent brief:
159
159
  - `harness validate` passes after integration
160
160
  - No agent modified files outside its declared scope
161
161
 
162
+ ## Rationalizations to Reject
163
+
164
+ | Rationalization | Why It Is Wrong |
165
+ | -------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
166
+ | "These two tasks touch different functions in the same file, so they are independent enough" | If both tasks write to the same file, they are NOT independent. Even different functions in the same file create merge conflicts.              |
167
+ | "I verified independence manually -- no need to run check_task_independence" | Manual verification misses transitive dependency overlap. check_task_independence with graph-expanded analysis catches transitive conflicts. |
168
+ | "There are only 2 independent tasks, but parallelism would save time" | NOT when there are fewer than 3 independent tasks. Coordination overhead outweighs parallelism benefit for 2 tasks. |
169
+ | "Each agent's tests pass, so integration is fine" | Step 4 requires running the FULL test suite after integration. Parallel changes can cause integration failures that individual test runs miss. |
170
+
162
171
  ## Examples
163
172
 
164
173
  ### Example: Parallel Implementation of Three Independent Services
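
The first and third rows of the table in this hunk describe two rules: tasks that write to the same file are not independent, and fewer than 3 independent tasks are not worth parallelizing. A simplified sketch of the direct-overlap part is shown below. `Task`, `findFileConflicts`, and `shouldParallelize` are hypothetical names, and this check does not cover the transitive dependency overlap that the table attributes to `check_task_independence`.

```typescript
type Task = { id: string; files: string[] };

// Returns pairs of task ids that write to at least one common file.
function findFileConflicts(tasks: Task[]): Array<[string, string]> {
  const conflicts: Array<[string, string]> = [];
  tasks.forEach((a, i) => {
    const seen = new Set(a.files);
    for (const b of tasks.slice(i + 1)) {
      if (b.files.some((f) => seen.has(f))) conflicts.push([a.id, b.id]);
    }
  });
  return conflicts;
}

// Parallelize only when no files overlap and there are at least 3 tasks.
function shouldParallelize(tasks: Task[]): boolean {
  return tasks.length >= 3 && findFileConflicts(tasks).length === 0;
}

const overlapping: Task[] = [
  { id: "A", files: ["src/user.ts"] },
  { id: "B", files: ["src/user.ts", "src/order.ts"] },
  { id: "C", files: ["src/notify.ts"] },
];
const conflicts = findFileConflicts(overlapping); // [["A", "B"]]
const parallelOk = shouldParallelize(overlapping); // false
```

A passing result from a check like this is necessary but not sufficient: two tasks with disjoint file sets can still conflict through a shared import, which is the case graph-expanded analysis is meant to catch.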
@@ -187,6 +187,17 @@ This phase runs only when `.bench.ts` files exist in the project. If none are fo
187
187
  - Gate decision is recorded in state
188
188
  - `harness validate` passes after enforcement
189
189
 
190
+ ## Rationalizations to Reject
191
+
192
+ These are common rationalizations that sound reasonable but lead to incorrect results. When you catch yourself thinking any of these, stop and follow the documented process instead.
193
+
194
+ | Rationalization | Why It Is Wrong |
195
+ | ------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
196
+ | "The cyclomatic complexity is 16 but the function is straightforward, so I can override the Tier 1 threshold" | Tier 1 violations are non-negotiable blockers. No merge with Tier 1 performance violations. If a threshold needs adjustment, reconfigure with documented justification. |
197
+ | "The benchmark regression is only 6% and it is probably just noise" | The noise margin (default 3%) is applied before flagging. A 6% regression on a perf-critical path exceeds the Tier 1 threshold even after noise consideration. |
198
+ | "The working tree has a small uncommitted change but it should not affect benchmark results" | No running benchmarks with a dirty working tree. Uncommitted changes invalidate benchmark results. |
199
+ | "I will update the baselines to match the new performance numbers rather than fixing the regression" | Baselines must come from fresh runs against committed code. Silently moving the goalposts defeats the purpose of performance gates. |
200
+
190
201
  ## Examples
191
202
 
192
203
  ### Example: PR with High Complexity Function
@@ -235,6 +235,16 @@ harness check-perf — complexity reduced from 12 to 8 (improvement)
235
235
  harness perf baselines update — new baseline saved
236
236
  ```
237
237
 
238
+ ## Rationalizations to Reject
239
+
240
+ | Rationalization | Reality |
241
+ | --------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
242
+ | "The correctness test is green, I'll add the benchmark later when we know performance is an issue." | The benchmark is not optional — it is the mechanism that defines "performance issue." Without a baseline captured at implementation time, you have nothing to compare against when a regression appears months later. Later never comes. |
243
+ | "I'll skip the REFACTOR phase since the spec doesn't mention performance requirements." | The spec not mentioning a requirement means there is no user-facing SLO, not that performance is irrelevant. The benchmark still captures the baseline that future work must not regress from. Phase 3 is optional; the benchmark file is not. |
244
+ | "The benchmark results vary too much between runs to be meaningful — I'll just omit it." | Variance is a signal, not a reason to skip. High variance means the benchmark needs warmup iterations, more samples, or isolation from I/O. Fix the benchmark, do not delete it. An absent benchmark offers zero protection against regressions. |
245
+ | "This function is only called during startup, so its performance doesn't matter at runtime." | Startup performance determines deployment speed, lambda cold-start latency, and test suite duration. "Not in the hot path at runtime" does not mean performance is free to ignore. Measure it so the baseline exists if startup behavior changes. |
246
+ | "We already have an integration test that covers this — writing a separate benchmark would be redundant." | Integration tests verify correctness under realistic conditions. Benchmarks measure isolated performance with precise input control. An integration test that passes in 2 seconds tells you nothing about whether the function itself takes 1ms or 800ms. |
247
+
238
248
  ## Gates
239
249
 
240
250
  - **No code before test AND benchmark.** Both must exist before implementation begins.
@@ -468,6 +468,16 @@ When `docs/changes/` exists in the project, produce `docs/changes/<feature>/delt
468
468
  - When `rigorLevel` is `standard` and task count < 8, the skeleton is skipped
469
469
  - The skeleton format is lightweight (~200 tokens): numbered groups with task count and time estimates
470
470
 
471
+ ## Rationalizations to Reject
472
+
473
+ | Rationalization | Reality |
474
+ | ------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
475
+ | "The task is conceptually clear so I do not need to include exact code in the plan" | Every task must have exact file paths, exact code, and exact commands. If you cannot write the code in the plan, you do not understand the task well enough to plan it. |
476
+ | "This task touches 5 files but it is logically one unit of work, so splitting it would add overhead" | Tasks touching more than 3 files must be split. The overhead of splitting is far less than the cost of a failed oversized task. |
477
+ | "Tests for this task can be added in a follow-up task since the implementation is straightforward" | No skipping TDD in tasks. Every code-producing task must start with writing a test. "Add tests later" is explicitly forbidden. |
478
+ | "The spec does not cover this edge case, but I can fill in the gap during planning" | When the spec is missing information, do not fill in the gaps yourself. Escalate. Filling gaps silently creates undocumented design decisions that no one reviewed. |
479
+ | "I discovered we need an additional file during decomposition, but updating the file map is just bookkeeping" | The file map must be complete. Every file that will be created or modified must appear in the file map before task decomposition. |
480
+
471
481
  ## Examples
472
482
 
473
483
  ### Example: Planning a User Notification Feature
@@ -284,6 +284,15 @@ fi
284
284
  - [ ] AI review focused on high-signal issues only (no style nits)
285
285
  - [ ] Report follows the structured format exactly
286
286
 
287
+ ## Rationalizations to Reject
288
+
289
+ | Rationalization | Why It Is Wrong |
290
+ | ----------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
291
+ | "The lint errors are just warnings, so I can proceed to AI review" | The gate is absolute: any mechanical check failure means STOP. AI review does not run until lint, typecheck, and tests all pass. |
292
+ | "This is a docs-only change but let me run AI review anyway for thoroughness" | The fast path is mandatory. If only docs/config files changed, AI review is skipped. Running it anyway wastes tokens. |
293
+ | "The AI found a style issue, so I should block the commit" | AI review observations are advisory only. Only mechanical check failures block the commit. |
294
+ | "I will skip the security scan since this is an internal endpoint" | Phase 3 runs the security scanner against all staged source files regardless of exposure. Hardcoded secrets and injection are blocking even in internal code. |
295
+
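The gate rules in the table above can be read as a small decision function. The following is a minimal sketch, not the skill's actual implementation: `decide`, `GateDecision`, and the `DOCS_CONFIG_SUFFIXES` set are hypothetical names, and the list of extensions treated as docs/config is an assumption, since the source does not enumerate them.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import List

# Assumption: the source does not say which extensions count as docs/config.
DOCS_CONFIG_SUFFIXES = {".md", ".txt", ".json", ".yaml", ".yml"}


@dataclass
class GateDecision:
    commit_allowed: bool
    run_ai_review: bool
    notes: List[str] = field(default_factory=list)


def decide(changed_files, blocking_failures, ai_observations=()):
    """Apply the gate order: blocking checks, then fast path, then advisory AI review.

    blocking_failures is assumed to hold lint, typecheck, test, and blocking
    security-scan failures collected by earlier phases.
    """
    # Any blocking failure means STOP; AI review never runs.
    if blocking_failures:
        return GateDecision(False, False, [f"STOP: {f}" for f in blocking_failures])

    # Fast path: docs/config-only changes skip AI review entirely.
    docs_only = all(
        PurePosixPath(path).suffix in DOCS_CONFIG_SUFFIXES for path in changed_files
    )
    if docs_only:
        return GateDecision(True, False, ["fast path: docs/config only, AI review skipped"])

    # AI observations are recorded but never block the commit.
    return GateDecision(True, True, [f"advisory: {o}" for o in ai_observations])
```

Note that AI observations only ever end up in `notes`; nothing in that branch can set `commit_allowed` to `False`.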
287
296
  ## Examples
288
297
 
289
298
  ### Example: Clean Commit
@@ -197,6 +197,16 @@
197
197
  - Output format matches existing project conventions when they exist
198
198
  - Generated PRD is saved to the correct directory with consistent naming
199
199
 
200
+ ## Rationalizations to Reject
201
+
202
+ | Rationalization | Why It Is Wrong |
203
+ | -------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
204
+ | "The feature request is clear enough -- I can skip the ambiguity check and start writing stories" | The gate: no generating specs from ambiguous input without clarification. Missing actors or undefined triggers lead to untestable acceptance criteria. |
205
+ | "This acceptance criterion is understood by the team, so it does not need to be formally testable" | No untestable acceptance criteria is a hard gate. Every criterion must be verifiable by an automated test or specific manual procedure. |
206
+ | "The happy path scenarios are enough -- edge cases are unlikely" | The skill requires at least one unwanted-behavior criterion for every user-facing action. Edge cases are where production bugs live. |
207
+ | "The existing PRD is outdated, so I will just replace it with a fresh one" | No overwriting existing specs is a gate. Present the diff rather than replacing the file. |
208
+ | "We can figure out the success metrics later during implementation" | Every success metric must be measurable, time-bound, and specific at spec time. |
209
+
200
210
  ## Examples
201
211
 
202
212
  ### Example: GitHub Issue to PRD for Team Notifications
@@ -266,6 +266,16 @@ def test_sort_handles_floats(xs):
266
266
  assert result[i] <= result[i + 1]
267
267
  ```
268
268
 
269
+ ## Rationalizations to Reject
270
+
271
+ | Rationalization | Reality |
272
+ | ------------------------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
273
+ | "We already have example-based tests that cover the edge cases — property tests would just be redundant." | Example-based tests cover the cases the author thought of. Property tests cover the cases they did not. The entire value of generative testing is that it explores regions of the input space that human intuition misses — off-by-one errors, Unicode combining characters, signed integer overflow at boundaries. |
274
+ | "The generator keeps producing rejected inputs, so I'll just filter more aggressively to make the test pass faster." | Heavy `filter` usage is a symptom of a broken generator, not a solution. Each rejected sample wastes an iteration, and `filter` destroys the shrinking chain, leaving you with an unhelpful counterexample when a bug is found. Rewrite the generator using `map` and `flatMap` to construct valid inputs directly. |
275
+ | "The counterexample is too strange to be a real-world case — I'll just increase the iteration count so it appears less often." | A shrunk counterexample that triggers a property failure is a real bug by definition. "Unlikely in practice" is not a property of correctness — the question is whether the invariant holds. If the counterexample is a valid input the function might receive, fix the function. If it is not a valid input, constrain the generator. |
276
+ | "This function has too many invariants to specify — I'll just skip property testing and trust the unit tests." | Complex functions with many invariants are exactly the functions most in need of property testing. High complexity means a larger bug-hiding surface. Start with the most important invariants (no-crash, round-trip, idempotence) rather than attempting to encode all properties at once. |
277
+ | "Property tests are too slow — they'll block CI for 10 minutes." | Run 100 iterations on PR, 10,000 iterations nightly. The CI time argument justifies reducing iteration count, never eliminating property tests entirely. A suite that runs 0 property tests finds 0 edge cases. |
278
+
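The advice to construct valid inputs rather than filter them can be illustrated without any property-testing library. This is a rough sketch using only the standard library. `evens_by_filter` and `evens_by_map` are hypothetical helpers, and "even integers" stands in for whatever validity constraint a real generator has. The mapping step plays the role that `map` plays in a generator combinator API.

```python
import random


def evens_by_filter(rng, n):
    """Rejection sampling: draw any integer and discard the odd ones."""
    kept, rejected = [], 0
    while len(kept) < n:
        x = rng.randint(-1000, 1000)
        if x % 2 == 0:
            kept.append(x)
        else:
            rejected += 1  # a wasted draw
    return kept, rejected


def evens_by_map(rng, n):
    """Construction: draw any integer k and map it to 2 * k, so nothing is rejected."""
    return [2 * rng.randint(-500, 500) for _ in range(n)], 0


if __name__ == "__main__":
    rng = random.Random(0)
    _, wasted = evens_by_filter(rng, 100)
    print("draws wasted by filtering:", wasted)
    _, wasted = evens_by_map(rng, 100)
    print("draws wasted by mapping:", wasted)
```

Because every mapped value is valid by construction, a smaller `k` always yields a smaller valid input, which is the property a shrinker relies on.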
269
279
  ## Gates
270
280
 
271
281
  - **No property tests without shrinking.** If the framework's automatic shrinking is disabled or the generator uses patterns that break shrinking (excessive `filter`), counterexamples will be unhelpfully large. Fix the generator to support shrinking.
@@ -134,6 +134,15 @@ Skipping this step means subsequent graph queries (impact analysis, dependency h
134
134
  - No behavioral changes were introduced (the test suite is the proof)
135
135
  - No dead code was left behind (run `harness cleanup` to verify)
136
136
 
137
+ ## Rationalizations to Reject
138
+
139
+ | Rationalization | Reality |
140
+ | ------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
141
+ | "The tests are mostly passing, so I can start refactoring and fix the remaining failures as I go" | All tests must pass BEFORE refactoring starts. If tests are not green before you start, you are not refactoring -- you are debugging. |
142
+ | "This refactoring changes a small amount of behavior, but it is a clear improvement" | Refactoring must not change behavior. The test suite is the proof. If the refactoring requires changing tests, you may be changing behavior. |
143
+ | "I will make several changes at once and run tests at the end since each change is small" | Tests must run after EVERY single change. If a test breaks, you must undo the LAST change immediately. |
144
+ | "The refactoring did not produce a measurable improvement, but the code is different so it must be somewhat better" | If the refactoring introduced no measurable improvement, revert the entire sequence. Refactoring for its own sake is churn. |
145
+
137
146
  ## Examples
138
147
 
139
148
  ### Example: Moving business logic out of a UI component
@@ -537,6 +537,17 @@ This framing is informational — it does not block anything. It gives the team
537
537
  8. Monorepo support: each package is audited independently with per-package results in the report
538
538
  9. `harness validate` passes after the skill's SKILL.md and skill.yaml are written
539
539
 
540
+ ## Rationalizations to Reject
541
+
542
+ These are common rationalizations that sound reasonable but lead to incorrect results. When you catch yourself thinking any of these, stop and follow the documented process instead.
543
+
544
+ | Rationalization | Why It Is Wrong |
545
+ | ----------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------- |
546
+ | "The MAINTAIN phase takes too long, so I will skip dispatching the 4 maintenance agents" | No skipping the MAINTAIN phase. Maintenance checks catch issues that release-specific checks miss. |
547
+ | "This auto-fix is obviously correct, so I can apply it without prompting the user" | No auto-fix without prompting. Every fix must be presented to the human before being applied. |
548
+ | "Most checks pass and only a few warnings remain, so the release is ready" | A "mostly passing" report is not a passing report. The result is PASS only when zero failures exist across all categories. |
549
+ | "The previous run found these issues and I fixed them, so I can trust the cached results" | Session resumption requires re-running all checks. Code may have changed since the last run. |
550
+
540
551
  ## Examples
541
552
 
542
553
  ### Example: First Run on a Monorepo with Gaps
@@ -240,6 +240,16 @@ Phase 4: VALIDATE
240
240
  Redis fallback serves from LRU when Redis is down
241
241
  ```
242
242
 
243
+ ## Rationalizations to Reject
244
+
245
+ | Rationalization | Reality |
246
+ | ----------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
247
+ | "That third-party API has 99.99% uptime — we don't need a circuit breaker" | 99.99% uptime still allows about 52 minutes of downtime per year (0.01% of 525,600 minutes). That downtime will not occur as one predictable window; it tends to arrive as degraded responses and timeouts, often during a traffic spike. Without a circuit breaker, every caller blocks for the full timeout duration, exhausting thread pools and cascading across the system. |
248
+ | "We have retry logic, so failures are handled" | Retry logic without a circuit breaker amplifies failures. When the downstream service is degraded, retries multiply the load on an already struggling system. Circuit breakers and retries are complementary controls, not alternatives. |
249
+ | "The fallback adds complexity — we'll add it if the circuit breaker actually opens" | A circuit breaker without a fallback is a different kind of failure mode, not resilience. When the circuit opens, users see an error instead of a degraded-but-functional experience. Fallbacks must be designed and tested before the circuit ever opens in production. |
250
+ | "Our database connection pool is 100 connections — that's plenty" | Connection pool size without query timeouts means slow queries hold connections indefinitely. A single slow query spike can exhaust the pool, causing every subsequent request to wait. Pool sizing and query timeouts are both required. |
251
+ | "The service is internal — it doesn't need rate limiting" | Internal services are often called by automated processes, CI pipelines, and batch jobs that can spike traffic in ways user-facing services do not. Missing rate limiting on internal services is a common cause of self-inflicted outages during deployments and data migrations. |
252
+
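To make the pairing of circuit breaker and fallback concrete, here is a minimal single-threaded sketch. It is an illustration under assumptions, not the skill's reference implementation: the `CircuitBreaker` class, its `failure_threshold` and `reset_after` parameters, and their default values are all hypothetical.

```python
import time


class CircuitBreaker:
    """Fail fast to a fallback once a dependency keeps failing."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_after = reset_after              # seconds to wait before a trial call
        self.clock = clock                          # injectable for tests
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, operation, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: skip the dependency and serve the degraded result.
                return fallback()
            # Otherwise half-open: let one trial call through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            # Open on reaching the threshold, or re-open after a failed trial call.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each outbound request, for example `breaker.call(fetch_live_prices, serve_cached_prices)`, where both function names are placeholders. While the circuit is open, the caller returns immediately instead of blocking for the full timeout.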
243
253
  ## Gates
244
254
 
245
255
  - **No retry on non-idempotent operations without idempotency keys.** Retrying a POST or DELETE that lacks an idempotency mechanism can cause data duplication or data loss. This is a blocking finding. The operation must be made idempotent before retry logic is added.
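One way to satisfy this gate is to attach a single idempotency key to the logical operation and reuse it on every retry. The sketch below is an in-memory simulation, not a real payment API: `PaymentService`, `LostResponse`, and `charge_with_retry` are hypothetical names, and a production system would persist the key-to-receipt mapping durably.

```python
import uuid


class LostResponse(Exception):
    """The request was applied but the response never reached the client."""


class PaymentService:
    """Hypothetical server that deduplicates charges by idempotency key."""

    def __init__(self, drop_first_response=False):
        self.charges = []
        self._receipts = {}
        self._drop = drop_first_response

    def charge(self, idempotency_key, amount):
        # Apply the charge only the first time this key is seen.
        if idempotency_key not in self._receipts:
            receipt = {"id": len(self.charges) + 1, "amount": amount}
            self.charges.append(receipt)
            self._receipts[idempotency_key] = receipt
        if self._drop:
            self._drop = False
            raise LostResponse("charge applied, response lost")
        return self._receipts[idempotency_key]


def charge_with_retry(service, amount, attempts=3):
    # One key for the whole logical operation, reused on every attempt.
    key = str(uuid.uuid4())
    for attempt in range(attempts):
        try:
            return service.charge(key, amount)
        except LostResponse:
            if attempt == attempts - 1:
                raise
```

If the key were regenerated inside the loop, the retry after a lost response would create a second charge, which is the duplication this gate is meant to prevent.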
@@ -42,7 +42,7 @@ If the human has not seen and approved the milestone groupings and feature list,
42
42
  - Has spec + plan but no implementation -> `planned`
43
43
  - Has spec but no plan -> `backlog`
44
44
  - Has plan but no spec -> `planned` (unusual, flag for human review)
45
- 6. Detect project name from `harness.yaml` `project` field, or `package.json` `name` field, or directory name as fallback.
45
+ 6. Detect project name from `harness.config.json` `project` field, or `package.json` `name` field, or directory name as fallback.
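The fallback order in step 6 can be sketched as follows. This is an illustrative reading of the rule, not the CLI's actual code: `detect_project_name` and `_read_field` are hypothetical helpers, and the sketch assumes `project` and `name` are top-level string fields in their respective JSON files.

```python
import json
from pathlib import Path


def _read_field(path, key):
    """Return a non-empty string field from a JSON file, or None if unavailable."""
    try:
        value = json.loads(path.read_text(encoding="utf-8")).get(key)
    except (OSError, ValueError, AttributeError):
        # Missing file, invalid JSON, or a top-level value that is not an object.
        return None
    return value if isinstance(value, str) and value.strip() else None


def detect_project_name(root):
    """Try harness.config.json, then package.json, then the directory name."""
    root = Path(root).resolve()
    return (
        _read_field(root / "harness.config.json", "project")
        or _read_field(root / "package.json", "name")
        or root.name
    )
```

A missing or malformed file is treated the same as a missing field, so detection falls through to the next source rather than failing.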
46
46
 
47
47
  Present scan summary:
48
48
 
@@ -457,6 +457,15 @@ Choice?
457
457
  19. `--query` filters features by status or milestone and displays results with milestone context
458
458
  20. `--query` errors gracefully when no roadmap exists, directing the user to `--create`
459
459
 
460
+ ## Rationalizations to Reject
461
+
462
+ | Rationalization | Reality |
463
+ | ----------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- |
464
+ | "The feature list looks correct, so I can skip the PROPOSE phase and write the roadmap directly" | The Iron Law: never write docs/roadmap.md without the human confirming the proposed structure first. |
465
+ | "This sync detected a status change and the inference is clearly correct, so I can apply it without confirmation" | The sync PROPOSE phase requires presenting proposed changes and waiting for human confirmation. The human-always-wins rule applies. |
466
+ | "The existing roadmap is outdated, so I will recreate it with --create to get a fresh start" | No overwriting an existing roadmap without explicit user consent. Silent overwrites destroy prior manual edits and status tracking. |
467
+ | "There is no roadmap yet but the user asked me to add a feature, so I will create one as a side effect of --add" | When the roadmap does not exist, --add must error with a clear message directing the user to --create. |
468
+
460
469
  ## Examples
461
470
 
462
471
  ### Example: `--create` -- Bootstrap a Roadmap from Existing Artifacts
@@ -150,6 +150,14 @@ Proceed with Feature A? (y/n/pick another)
150
150
  7. Transition routes to brainstorming (no spec) or autopilot (spec exists)
151
151
  8. `harness validate` passes after all changes
152
152
 
153
+ ## Rationalizations to Reject
154
+
155
+ | Rationalization | Reality |
156
+ | ----------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
157
+ | "The top-scored candidate is obviously correct, so I can assign it without asking the human" | The Iron Law: never assign or transition without the human confirming the recommendation first. |
158
+ | "Affinity data is not available so the scoring is degraded -- I should just pick the first planned item" | Proceed without affinity scoring by zeroing out the affinity weight. Position and dependents signals still produce meaningful rankings. |
159
+ | "The feature has no spec, but I can skip brainstorming and jump straight to planning since the summary is clear enough" | A feature with no spec routes to brainstorming; a feature with a spec routes to autopilot. A one-line roadmap summary is not a spec. |
160
+
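Zeroing out the affinity weight can be pictured with a simple weighted sum. The sketch below is hypothetical: the source names the position, dependents, and affinity signals but not their weights or scales, so the weight values, the 0 to 1 signal range, and the `rank_candidates` function are assumptions for illustration only.

```python
def rank_candidates(candidates, weights, affinity_available=True):
    """Rank candidates by weighted score, highest first."""
    w = dict(weights)
    if not affinity_available:
        # Degrade gracefully: drop the affinity term instead of abandoning scoring.
        w["affinity"] = 0.0

    def score(candidate):
        return (
            w["position"] * candidate["position"]
            + w["dependents"] * candidate["dependents"]
            + w["affinity"] * candidate.get("affinity", 0.0)
        )

    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    # Assumed weights and signal values, each signal normalized to 0..1.
    weights = {"position": 0.5, "dependents": 0.3, "affinity": 0.2}
    candidates = [
        {"name": "A", "position": 1.0, "dependents": 0.0},
        {"name": "B", "position": 0.5, "dependents": 1.0},
    ]
    ranked = rank_candidates(candidates, weights, affinity_available=False)
    print([c["name"] for c in ranked])
```

The ranking is only a recommendation; per the first row of the table, the human still confirms before anything is assigned.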
153
161
  ## Examples
154
162
 
155
163
  ### Example: Pick Next Item from a Multi-Milestone Roadmap
@@ -278,6 +278,16 @@ Phase 4: VALIDATE
278
278
  Result: FAIL -- rotation required before deployment, history rewrite recommended
279
279
  ```
280
280
 
281
+ ## Rationalizations to Reject
282
+
283
+ | Rationalization | Reality |
284
+ | ----------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
285
+ | "That key is read-only so it's not a big deal if it leaks" | Read-only credentials still enable data exfiltration, reconnaissance, and discovery of other vulnerabilities. A leaked read-only database credential exposes every row in the database. Scope does not eliminate risk. |
286
+ | "We removed it from the file — it's cleaned up now" | Removing a secret from the current tree does not remove it from git history. Anyone with a clone of the repository can recover the secret with `git log -p`. Rotation is required regardless of file deletion. |
287
+ | "That's a test environment key, not production" | Test environment credentials are frequently reused, shared informally, and rotated less often. Leaked test keys also reveal credential patterns and naming conventions that help attackers guess production secrets. |
288
+ | "It's in a private repo so only our team can see it" | Private repos are accessed by CI/CD systems, third-party integrations, contractors, and former employees. Repository access controls are not a substitute for secret externalization. Breaches routinely originate from compromised internal access. |
289
+ | "We'll move it to an environment variable before we deploy" | Intent does not prevent exposure. The secret is in the codebase now and may already be in commit history, CI logs, or developer machine caches. Remediation must happen at the moment of detection, not at deployment time. |
290
+
281
291
  ## Gates
282
292
 
283
293
  - **No CRITICAL findings may remain unaddressed.** Production credentials exposed in source code are blocking. Execution halts until the credential is rotated and the code is remediated.
@@ -174,6 +174,16 @@ Threat Model:
174
174
  - **`query_graph` / `get_relationships`** — Used in threat modeling phase for data flow tracing
175
175
  - **`get_impact`** — Understand blast radius of security-sensitive changes
176
176
 
177
+ ## Rationalizations to Reject
178
+
179
+ | Rationalization | Reality |
180
+ | -------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
181
+ | "The scanner didn't flag it so it must be fine" | Mechanical scanners catch pattern-level issues. They cannot trace user input across multiple function calls to a dangerous sink, detect authorization logic flaws, or evaluate whether a fallback chain fails open. The AI review phase exists precisely because scanners miss semantic vulnerabilities. |
182
+ | "This endpoint is behind authentication so we don't need to validate input" | Authentication and input validation are orthogonal controls. Authenticated users can still send malicious payloads. Authenticated SQL injection, SSRF, and path traversal are well-documented attack patterns against internal-only endpoints. |
183
+ | "The vulnerability requires knowing our internal schema to exploit" | Security through obscurity is not a control. Internal schema details leak through error messages, API responses, documentation, and employee turnover. Rate the vulnerability based on its impact assuming the attacker knows the system. |
184
+ | "We'll add rate limiting and input validation later once the feature ships" | Security controls added after deployment require re-testing and re-review. Shipping without them creates an exposure window and establishes technical debt that is systematically deprioritized once the feature is live. |
185
+ | "That's an OWASP theoretical risk — our app isn't targeted by sophisticated attackers" | OWASP findings are exploited by automated scanners, not just sophisticated attackers. Opportunistic bots continuously probe for SQL injection, XSS, and auth bypass. Unpatched OWASP Top 10 issues are routinely exploited within hours of exposure. |
186
+
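The point that authentication and input validation are orthogonal can be shown with a small SQL injection example. This is a minimal sketch using Python's standard `sqlite3` module. The `users` table and both function names are hypothetical, and `find_user_unsafe` is vulnerable on purpose.

```python
import sqlite3


def find_user(conn, username):
    # Placeholder binding keeps the input as data, never as SQL text.
    return conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    ).fetchall()


def find_user_unsafe(conn, username):
    # Vulnerable on purpose: string interpolation lets the input rewrite the query.
    return conn.execute(
        f"SELECT id, username FROM users WHERE username = '{username}'"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])

    payload = "' OR '1'='1"  # could be sent by any authenticated user
    print("parameterized:", find_user(conn, payload))
    print("interpolated: ", find_user_unsafe(conn, payload))
```

Whether the caller is logged in makes no difference to either function; only the way the query is built decides whether the payload is treated as data or as SQL.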
177
187
  ## Gates
178
188
 
179
189
  - **Mechanical scanner must run before AI review.** The scanner catches what patterns can catch; AI reviews what remains.
@@ -94,21 +94,11 @@ These apply to ALL skills. If you catch yourself doing any of these, STOP.
94
94
 
95
95
  ## Rationalizations to Reject
96
96
 
97
- ### Universal
98
-
99
- These reasoning patterns sound plausible but lead to bad outcomes. Reject them.
100
-
101
- - **"It's probably fine"** "Probably" is not evidence. Verify before asserting.
102
- - **"This is best practice"** — Best practice in what context? Cite the source and
103
- confirm it applies to this codebase.
104
- - **"We can fix it later"** — If it is worth flagging, it is worth documenting now
105
- with a concrete follow-up plan.
106
-
107
- ### Domain-Specific
108
-
109
- - **"No attacker would find this"** — Security by obscurity. If the code is wrong, flag it regardless of discoverability.
110
- - **"We're behind a firewall"** — Network boundaries change. Code should be secure at every layer regardless of deployment topology.
111
- - **"The framework handles this for us"** — Verify the framework's actual behavior. Misuse of a secure framework is still insecure.
97
+ | Rationalization | Reality |
98
+ | ----------------------------------- | -------------------------------------------------------------------------------------------------- |
99
+ | "No attacker would find this" | Security by obscurity. If the code is wrong, flag it regardless of discoverability. |
100
+ | "We're behind a firewall" | Network boundaries change. Code should be secure at every layer regardless of deployment topology. |
101
+ | "The framework handles this for us" | Verify the framework's actual behavior. Misuse of a secure framework is still insecure. |
112
102
 
113
103
  ## Escalation
114
104