workflow-ai 1.1.0 → 1.2.1

This diff shows the contents of publicly released package versions as they appear in their respective public registries. It is provided for informational purposes only.
Files changed (593)
  1. package/CHANGELOG.md +49 -0
  2. package/README.md +105 -7
  3. package/configs/pipeline.yaml +23 -2
  4. package/package.json +44 -44
  5. package/src/lib/operations/tickets.mjs +305 -207
  6. package/src/lib/utils.mjs +286 -286
  7. package/src/runner.mjs +314 -34
  8. package/src/scripts/check-conditions.js +2 -2
  9. package/src/scripts/get-next-id.js +144 -41
  10. package/src/scripts/move-ticket.js +225 -68
  11. package/src/scripts/pick-next-task.js +753 -93
  12. package/src/skills/coach/SKILL.md +1 -1
  13. package/src/skills/manual-testing/SKILL.md +2 -0
  14. package/src/scripts/tests/timeout-cascade.test.js +0 -28
  15. package/src/skills/analyze-report/README.md +0 -44
  16. package/src/skills/analyze-report/algorithms/progress-assessment.md +0 -108
  17. package/src/skills/analyze-report/knowledge/analysis-frameworks.md +0 -66
  18. package/src/skills/analyze-report/knowledge/report-structure.md +0 -61
  19. package/src/skills/analyze-report/scripts/calc-plan-metrics.js +0 -234
  20. package/src/skills/analyze-report/templates/analysis-report.md +0 -80
  21. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/claude-sonnet/trial-1.md +0 -5
  22. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/claude-sonnet/trial-2.md +0 -98
  23. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/claude-sonnet/trial-3.md +0 -99
  24. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/judge.json +0 -163
  25. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-deepseek/trial-1.md +0 -89
  26. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-deepseek/trial-2.md +0 -88
  27. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-deepseek/trial-3.md +0 -100
  28. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-glm/trial-1.md +0 -77
  29. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-glm/trial-2.md +0 -64
  30. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-glm/trial-3.md +0 -110
  31. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-minimax/trial-1.md +0 -74
  32. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-minimax/trial-2.md +0 -38
  33. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-minimax/trial-3.md +0 -61
  34. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/meta.json +0 -115
  35. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001-evidence-from-log.yaml +0 -60
  36. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/claude-sonnet/trial-1.md +0 -90
  37. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/claude-sonnet/trial-2.md +0 -89
  38. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/claude-sonnet/trial-3.md +0 -5
  39. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/judge.json +0 -163
  40. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-deepseek/trial-1.md +0 -84
  41. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-deepseek/trial-2.md +0 -77
  42. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-deepseek/trial-3.md +0 -89
  43. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-glm/trial-1.md +0 -103
  44. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-glm/trial-2.md +0 -103
  45. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-glm/trial-3.md +0 -103
  46. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-minimax/trial-1.md +0 -93
  47. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-minimax/trial-2.md +0 -93
  48. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-minimax/trial-3.md +0 -86
  49. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/meta.json +0 -115
  50. package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002-result-block-format.yaml +0 -44
  51. package/src/skills/analyze-report/tests/fixtures/REPORT-002-incorrect-attribution.md +0 -27
  52. package/src/skills/analyze-report/tests/fixtures/pipeline-2026-04-06_qa-001-skip.log +0 -32
  53. package/src/skills/analyze-report/tests/index.yaml +0 -25
  54. package/src/skills/analyze-report/tests/rubrics/evidence-from-log.md +0 -22
  55. package/src/skills/analyze-report/tests/rubrics/result-block-format.md +0 -22
  56. package/src/skills/analyze-report/workflows/progress.md +0 -158
  57. package/src/skills/analyze-report/workflows/retrospective.md +0 -143
  58. package/src/skills/coach/README.md +0 -43
  59. package/src/skills/coach/SKILL.md.legacy +0 -157
  60. package/src/skills/coach/algorithms/gap-analysis.md +0 -69
  61. package/src/skills/coach/algorithms/improvement-prioritization.md +0 -62
  62. package/src/skills/coach/algorithms/skill-scoring.md +0 -80
  63. package/src/skills/coach/knowledge/audit-applied-changes-clean.txt +0 -11
  64. package/src/skills/coach/knowledge/backlog-management.md +0 -67
  65. package/src/skills/coach/knowledge/backlog-management.md.legacy +0 -90
  66. package/src/skills/coach/knowledge/common-antipatterns.md +0 -76
  67. package/src/skills/coach/knowledge/prompt-engineering.md +0 -45
  68. package/src/skills/coach/knowledge/shared-knowledge-guide.md +0 -44
  69. package/src/skills/coach/knowledge/skill-anatomy.md +0 -49
  70. package/src/skills/coach/knowledge/test-authorship.md +0 -141
  71. package/src/skills/coach/templates/audit-report.md +0 -39
  72. package/src/skills/coach/templates/coach-backlog-init.yaml +0 -14
  73. package/src/skills/coach/templates/coach-backlog-init.yaml.legacy +0 -10
  74. package/src/skills/coach/templates/improvement-plan.md +0 -42
  75. package/src/skills/coach/templates/new-skill.md +0 -95
  76. package/src/skills/coach/tests/cases/TC-COACH-001/current/claude-sonnet/trial-1.md +0 -58
  77. package/src/skills/coach/tests/cases/TC-COACH-001/current/claude-sonnet/trial-2.md +0 -65
  78. package/src/skills/coach/tests/cases/TC-COACH-001/current/claude-sonnet/trial-3.md +0 -58
  79. package/src/skills/coach/tests/cases/TC-COACH-001/current/judge.json +0 -151
  80. package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-deepseek/trial-1.md +0 -46
  81. package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-deepseek/trial-2.md +0 -0
  82. package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-deepseek/trial-3.md +0 -75
  83. package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-glm/trial-1.md +0 -81
  84. package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-glm/trial-2.md +0 -101
  85. package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-glm/trial-3.md +0 -91
  86. package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-minimax/trial-1.md +0 -48
  87. package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-minimax/trial-2.md +0 -30
  88. package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-minimax/trial-3.md +0 -55
  89. package/src/skills/coach/tests/cases/TC-COACH-001/current/meta.json +0 -94
  90. package/src/skills/coach/tests/cases/TC-COACH-001-evidence-based-temporal-diagram.yaml +0 -53
  91. package/src/skills/coach/tests/cases/TC-COACH-002/current/claude-sonnet/trial-1.md +0 -46
  92. package/src/skills/coach/tests/cases/TC-COACH-002/current/claude-sonnet/trial-2.md +0 -50
  93. package/src/skills/coach/tests/cases/TC-COACH-002/current/claude-sonnet/trial-3.md +0 -48
  94. package/src/skills/coach/tests/cases/TC-COACH-002/current/judge.json +0 -151
  95. package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-deepseek/trial-1.md +0 -0
  96. package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-deepseek/trial-2.md +0 -37
  97. package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-deepseek/trial-3.md +0 -30
  98. package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-glm/trial-1.md +0 -23
  99. package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-glm/trial-2.md +0 -29
  100. package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-glm/trial-3.md +0 -35
  101. package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-minimax/trial-1.md +0 -13
  102. package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-minimax/trial-2.md +0 -19
  103. package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-minimax/trial-3.md +0 -33
  104. package/src/skills/coach/tests/cases/TC-COACH-002/current/meta.json +0 -94
  105. package/src/skills/coach/tests/cases/TC-COACH-002-root-cause-first.yaml +0 -57
  106. package/src/skills/coach/tests/fixtures/pipeline-2026-04-06_id-collision.log +0 -77
  107. package/src/skills/coach/tests/index.yaml +0 -29
  108. package/src/skills/coach/tests/rubrics/calibration/evidence-based-bad.md +0 -13
  109. package/src/skills/coach/tests/rubrics/calibration/evidence-based-good.md +0 -29
  110. package/src/skills/coach/tests/rubrics/evidence-based.md +0 -26
  111. package/src/skills/coach/tests/rubrics/root-cause-first.md +0 -21
  112. package/src/skills/coach/workflows/analyze.md +0 -79
  113. package/src/skills/coach/workflows/analyze.md.legacy +0 -64
  114. package/src/skills/coach/workflows/audit.md +0 -74
  115. package/src/skills/coach/workflows/audit.md.legacy +0 -59
  116. package/src/skills/coach/workflows/create.md +0 -80
  117. package/src/skills/coach/workflows/create.md.legacy +0 -67
  118. package/src/skills/coach/workflows/improve.md +0 -71
  119. package/src/skills/coach/workflows/improve.md.legacy +0 -60
  120. package/src/skills/coach/workflows/research.md +0 -55
  121. package/src/skills/coach/workflows/review.md +0 -52
  122. package/src/skills/coach/workflows/review.md.legacy +0 -48
  123. package/src/skills/coach/workflows/test.md +0 -97
  124. package/src/skills/create-plan/README.md +0 -39
  125. package/src/skills/create-plan/algorithms/risk-assessment.md +0 -73
  126. package/src/skills/create-plan/knowledge/plan-completeness.md +0 -67
  127. package/src/skills/create-plan/knowledge/plan-lifecycle.md +0 -33
  128. package/src/skills/create-plan/knowledge/task-verification-pairs.md +0 -151
  129. package/src/skills/create-plan/knowledge/test-hygiene.md +0 -47
  130. package/src/skills/create-plan/scripts/validate-completeness.js +0 -182
  131. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/claude-sonnet/trial-1.md +0 -5
  132. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/claude-sonnet/trial-2.md +0 -39
  133. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/claude-sonnet/trial-3.md +0 -35
  134. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/judge.json +0 -167
  135. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-deepseek/trial-1.md +0 -5
  136. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-deepseek/trial-2.md +0 -10
  137. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-deepseek/trial-3.md +0 -5
  138. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-glm/trial-1.md +0 -26
  139. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-glm/trial-2.md +0 -86
  140. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-glm/trial-3.md +0 -5
  141. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-minimax/trial-1.md +0 -11
  142. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-minimax/trial-2.md +0 -15
  143. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-minimax/trial-3.md +0 -14
  144. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/meta.json +0 -119
  145. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001-validate-completeness.yaml +0 -41
  146. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/claude-sonnet/trial-1.md +0 -25
  147. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/claude-sonnet/trial-2.md +0 -30
  148. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/claude-sonnet/trial-3.md +0 -37
  149. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/judge.json +0 -164
  150. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-deepseek/trial-1.md +0 -3
  151. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-deepseek/trial-2.md +0 -11
  152. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-deepseek/trial-3.md +0 -13
  153. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-glm/trial-1.md +0 -44
  154. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-glm/trial-2.md +0 -5
  155. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-glm/trial-3.md +0 -49
  156. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-minimax/trial-1.md +0 -6
  157. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-minimax/trial-2.md +0 -11
  158. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-minimax/trial-3.md +0 -16
  159. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/meta.json +0 -116
  160. package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002-task-granularity.yaml +0 -39
  161. package/src/skills/create-plan/tests/index.yaml +0 -25
  162. package/src/skills/create-plan/tests/rubrics/task-granularity.md +0 -21
  163. package/src/skills/create-plan/tests/rubrics/validate-completeness.md +0 -21
  164. package/src/skills/create-plan/workflows/create.md +0 -136
  165. package/src/skills/create-report/README.md +0 -40
  166. package/src/skills/create-report/algorithms/metric-calculation.md +0 -93
  167. package/src/skills/create-report/knowledge/report-metrics.md +0 -82
  168. package/src/skills/create-report/scripts/calc-metrics.js +0 -383
  169. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/claude-sonnet/trial-1.md +0 -25
  170. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/claude-sonnet/trial-2.md +0 -26
  171. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/claude-sonnet/trial-3.md +0 -28
  172. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/judge.json +0 -163
  173. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-deepseek/trial-1.md +0 -4
  174. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-deepseek/trial-2.md +0 -3
  175. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-deepseek/trial-3.md +0 -6
  176. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-glm/trial-1.md +0 -8
  177. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-glm/trial-2.md +0 -12
  178. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-glm/trial-3.md +0 -7
  179. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-minimax/trial-1.md +0 -12
  180. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-minimax/trial-2.md +0 -22
  181. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-minimax/trial-3.md +0 -13
  182. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/meta.json +0 -115
  183. package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001-root-cause-attribution.yaml +0 -57
  184. package/src/skills/create-report/tests/index.yaml +0 -20
  185. package/src/skills/create-report/tests/rubrics/root-cause-attribution.md +0 -21
  186. package/src/skills/create-report/workflows/standard.md +0 -175
  187. package/src/skills/decompose-gaps/README.md +0 -39
  188. package/src/skills/decompose-gaps/algorithms/scope-check.md +0 -110
  189. package/src/skills/decompose-gaps/knowledge/scope-validation.md +0 -65
  190. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/claude-sonnet/trial-1.md +0 -41
  191. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/claude-sonnet/trial-2.md +0 -41
  192. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/claude-sonnet/trial-3.md +0 -56
  193. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/judge.json +0 -164
  194. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-deepseek/trial-1.md +0 -25
  195. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-deepseek/trial-2.md +0 -17
  196. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-deepseek/trial-3.md +0 -22
  197. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-glm/trial-1.md +0 -25
  198. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-glm/trial-2.md +0 -5
  199. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-glm/trial-3.md +0 -29
  200. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-minimax/trial-1.md +0 -27
  201. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-minimax/trial-2.md +0 -35
  202. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-minimax/trial-3.md +0 -18
  203. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/meta.json +0 -116
  204. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001-scope-exclusion.yaml +0 -46
  205. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/claude-sonnet/trial-1.md +0 -27
  206. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/claude-sonnet/trial-2.md +0 -30
  207. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/claude-sonnet/trial-3.md +0 -27
  208. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/judge.json +0 -163
  209. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-deepseek/trial-1.md +0 -0
  210. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-deepseek/trial-2.md +0 -15
  211. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-deepseek/trial-3.md +0 -7
  212. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-glm/trial-1.md +0 -21
  213. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-glm/trial-2.md +0 -38
  214. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-glm/trial-3.md +0 -16
  215. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-minimax/trial-1.md +0 -5
  216. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-minimax/trial-2.md +0 -10
  217. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-minimax/trial-3.md +0 -9
  218. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/meta.json +0 -115
  219. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002-glob-before-write.yaml +0 -36
  220. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/claude-sonnet/trial-1.md +0 -30
  221. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/claude-sonnet/trial-2.md +0 -30
  222. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/claude-sonnet/trial-3.md +0 -30
  223. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/judge.json +0 -165
  224. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-deepseek/trial-1.md +0 -5
  225. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-deepseek/trial-2.md +0 -26
  226. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-deepseek/trial-3.md +0 -5
  227. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-glm/trial-1.md +0 -39
  228. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-glm/trial-2.md +0 -37
  229. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-glm/trial-3.md +0 -45
  230. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-minimax/trial-1.md +0 -26
  231. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-minimax/trial-2.md +0 -27
  232. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-minimax/trial-3.md +0 -7
  233. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/meta.json +0 -117
  234. package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003-parent-plan-mandatory.yaml +0 -41
  235. package/src/skills/decompose-gaps/tests/index.yaml +0 -30
  236. package/src/skills/decompose-gaps/tests/rubrics/glob-before-write.md +0 -21
  237. package/src/skills/decompose-gaps/tests/rubrics/parent-plan-mandatory.md +0 -22
  238. package/src/skills/decompose-gaps/tests/rubrics/scope-exclusion.md +0 -21
  239. package/src/skills/decompose-gaps/workflows/decompose.md +0 -123
  240. package/src/skills/decompose-plan/README.md +0 -43
  241. package/src/skills/decompose-plan/algorithms/deduplication.md +0 -101
  242. package/src/skills/decompose-plan/knowledge/atomicity-checklist.md +0 -139
  243. package/src/skills/decompose-plan/knowledge/capabilities.md +0 -68
  244. package/src/skills/decompose-plan/knowledge/human-task-rules.md +0 -82
  245. package/src/skills/decompose-plan/knowledge/scope-guard-checklist.md +0 -73
  246. package/src/skills/decompose-plan/scripts/check-atomicity-limit.js +0 -47
  247. package/src/skills/decompose-plan/scripts/check-duplicates.js +0 -323
  248. package/src/skills/decompose-plan/scripts/verify-atomicity.js +0 -408
  249. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/claude-sonnet/trial-1.md +0 -30
  250. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/claude-sonnet/trial-2.md +0 -36
  251. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/claude-sonnet/trial-3.md +0 -37
  252. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/judge.json +0 -163
  253. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-deepseek/trial-1.md +0 -20
  254. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-deepseek/trial-2.md +0 -17
  255. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-deepseek/trial-3.md +0 -28
  256. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-glm/trial-1.md +0 -114
  257. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-glm/trial-2.md +0 -137
  258. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-glm/trial-3.md +0 -188
  259. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-minimax/trial-1.md +0 -0
  260. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-minimax/trial-2.md +0 -32
  261. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-minimax/trial-3.md +0 -110
  262. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/meta.json +0 -115
  263. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001-atomicity-no-1to1.yaml +0 -56
  264. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/claude-sonnet/trial-1.md +0 -47
  265. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/claude-sonnet/trial-2.md +0 -54
  266. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/claude-sonnet/trial-3.md +0 -43
  267. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/judge.json +0 -163
  268. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-deepseek/trial-1.md +0 -15
  269. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-deepseek/trial-2.md +0 -5
  270. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-deepseek/trial-3.md +0 -12
  271. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-glm/trial-1.md +0 -34
  272. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-glm/trial-2.md +0 -30
  273. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-glm/trial-3.md +0 -35
  274. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-minimax/trial-1.md +0 -0
  275. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-minimax/trial-2.md +0 -31
  276. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-minimax/trial-3.md +0 -0
  277. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/meta.json +0 -115
  278. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002-get-next-id-mandatory.yaml +0 -44
  279. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/claude-sonnet/trial-1.md +0 -21
  280. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/claude-sonnet/trial-2.md +0 -38
  281. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/claude-sonnet/trial-3.md +0 -30
  282. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/judge.json +0 -163
  283. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-deepseek/trial-1.md +0 -31
  284. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-deepseek/trial-2.md +0 -35
  285. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-deepseek/trial-3.md +0 -48
  286. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-glm/trial-1.md +0 -167
  287. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-glm/trial-2.md +0 -62
  288. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-glm/trial-3.md +0 -174
  289. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-minimax/trial-1.md +0 -0
  290. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-minimax/trial-2.md +0 -0
  291. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-minimax/trial-3.md +0 -0
  292. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/meta.json +0 -115
  293. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003-verbatim-dod-transfer.yaml +0 -42
  294. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/claude-sonnet/trial-1.md +0 -55
  295. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/claude-sonnet/trial-2.md +0 -49
  296. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/claude-sonnet/trial-3.md +0 -49
  297. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/judge.json +0 -163
  298. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-deepseek/trial-1.md +0 -104
  299. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-deepseek/trial-2.md +0 -45
  300. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-deepseek/trial-3.md +0 -58
  301. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-glm/trial-1.md +0 -193
  302. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-glm/trial-2.md +0 -202
  303. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-glm/trial-3.md +0 -155
  304. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-minimax/trial-1.md +0 -52
  305. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-minimax/trial-2.md +0 -17
  306. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-minimax/trial-3.md +0 -0
  307. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/meta.json +0 -115
  308. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004-executor-atomicity.yaml +0 -64
  309. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/claude-sonnet/trial-1.md +0 -59
  310. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/claude-sonnet/trial-2.md +0 -204
  311. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/claude-sonnet/trial-3.md +0 -213
  312. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/judge.json +0 -163
  313. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-deepseek/trial-1.md +0 -0
  314. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-deepseek/trial-2.md +0 -57
  315. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-deepseek/trial-3.md +0 -54
  316. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-glm/trial-1.md +0 -147
  317. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-glm/trial-2.md +0 -165
  318. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-glm/trial-3.md +0 -133
  319. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-minimax/trial-1.md +0 -81
  320. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-minimax/trial-2.md +0 -108
  321. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-minimax/trial-3.md +0 -3
  322. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/meta.json +0 -114
  323. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005-capabilities-registry.yaml +0 -78
  324. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/claude-sonnet/trial-1.md +0 -225
  325. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/claude-sonnet/trial-2.md +0 -66
  326. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/claude-sonnet/trial-3.md +0 -36
  327. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/judge.json +0 -163
  328. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-deepseek/trial-1.md +0 -42
  329. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-deepseek/trial-2.md +0 -67
  330. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-deepseek/trial-3.md +0 -40
  331. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-glm/trial-1.md +0 -122
  332. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-glm/trial-2.md +0 -131
  333. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-glm/trial-3.md +0 -138
  334. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-minimax/trial-1.md +0 -41
  335. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-minimax/trial-2.md +0 -88
  336. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-minimax/trial-3.md +0 -0
  337. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/meta.json +0 -115
  338. package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006-dod-threshold.yaml +0 -72
  339. package/src/skills/decompose-plan/tests/index.yaml +0 -45
  340. package/src/skills/decompose-plan/tests/rubrics/atomicity-no-1to1.md +0 -21
  341. package/src/skills/decompose-plan/tests/rubrics/capabilities-registry.md +0 -21
  342. package/src/skills/decompose-plan/tests/rubrics/dod-threshold.md +0 -21
  343. package/src/skills/decompose-plan/tests/rubrics/executor-atomicity.md +0 -21
  344. package/src/skills/decompose-plan/tests/rubrics/get-next-id-mandatory.md +0 -21
  345. package/src/skills/decompose-plan/tests/rubrics/verbatim-dod-transfer.md +0 -21
  346. package/src/skills/decompose-plan/workflows/decompose.md +0 -305
  347. package/src/skills/deep-research/README.md +0 -36
  348. package/src/skills/deep-research/algorithms/source-scoring.md +0 -63
  349. package/src/skills/deep-research/algorithms/synthesis.md +0 -67
  350. package/src/skills/deep-research/knowledge/data-validation.md +0 -44
  351. package/src/skills/deep-research/knowledge/perplexity-config.md +0 -30
  352. package/src/skills/deep-research/knowledge/research-methodology.md +0 -54
  353. package/src/skills/deep-research/knowledge/source-evaluation.md +0 -33
  354. package/src/skills/deep-research/scripts/perplexity-research.js +0 -315
  355. package/src/skills/deep-research/templates/brief-summary.md +0 -25
  356. package/src/skills/deep-research/templates/research-report.md +0 -76
  357. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/claude-haiku/trial-1.md +0 -48
  358. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/claude-haiku/trial-2.md +0 -88
  359. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/claude-haiku/trial-3.md +0 -56
  360. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/judge.json +0 -163
  361. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-free/trial-1.md +0 -58
  362. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-free/trial-2.md +0 -249
  363. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-free/trial-3.md +0 -44
  364. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm/trial-1.md +0 -96
  365. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm/trial-2.md +0 -56
  366. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm/trial-3.md +0 -94
  367. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm-air/trial-1.md +0 -11
  368. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm-air/trial-2.md +0 -1
  369. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm-air/trial-3.md +0 -1
  370. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/meta.json +0 -115
  371. package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001-self-check-url.yaml +0 -58
  372. package/src/skills/deep-research/tests/index.yaml +0 -20
  373. package/src/skills/deep-research/tests/rubrics/self-check-url.md +0 -34
  374. package/src/skills/deep-research/workflows/base-checklist.md +0 -19
  375. package/src/skills/deep-research/workflows/benchmark.md +0 -38
  376. package/src/skills/deep-research/workflows/competitor.md +0 -44
  377. package/src/skills/deep-research/workflows/custom.md +0 -32
  378. package/src/skills/deep-research/workflows/market.md +0 -44
  379. package/src/skills/deep-research/workflows/technology.md +0 -40
  380. package/src/skills/deep-research/workflows/trend.md +0 -40
  381. package/src/skills/execute-task/README.md +0 -44
  382. package/src/skills/execute-task/algorithms/execution-strategy.md +0 -136
  383. package/src/skills/execute-task/knowledge/context-checkpoints.md +0 -75
  384. package/src/skills/execute-task/knowledge/ticket-structure.md +0 -70
  385. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/claude-haiku/trial-1.md +0 -5
  386. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/claude-haiku/trial-2.md +0 -5
  387. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/claude-haiku/trial-3.md +0 -5
  388. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/judge.json +0 -124
  389. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-free/trial-1.md +0 -4
  390. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-free/trial-2.md +0 -4
  391. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-free/trial-3.md +0 -4
  392. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-glm-air/trial-1.md +0 -4
  393. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-glm-air/trial-2.md +0 -4
  394. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-glm-air/trial-3.md +0 -11
  395. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/meta.json +0 -88
  396. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001-no-ticket-creation.yaml +0 -48
  397. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/claude-haiku/trial-1.md +0 -5
  398. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/claude-haiku/trial-2.md +0 -6
  399. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/claude-haiku/trial-3.md +0 -5
  400. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/judge.json +0 -124
  401. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-free/trial-1.md +0 -4
  402. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-free/trial-2.md +0 -4
  403. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-free/trial-3.md +0 -8
  404. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-glm-air/trial-1.md +0 -9
  405. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-glm-air/trial-2.md +0 -26
  406. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-glm-air/trial-3.md +0 -4
  407. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/meta.json +0 -89
  408. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002-no-duplicate-dod.yaml +0 -44
  409. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/claude-haiku/trial-1.md +0 -5
  410. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/claude-haiku/trial-2.md +0 -5
  411. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/claude-haiku/trial-3.md +0 -5
  412. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/judge.json +0 -46
  413. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/meta.json +0 -37
  414. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003-verification-proportionality.yaml +0 -46
  415. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/claude-haiku/trial-1.md +0 -18
  416. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/claude-haiku/trial-2.md +0 -16
  417. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/claude-haiku/trial-3.md +0 -14
  418. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/judge.json +0 -124
  419. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-free/trial-1.md +0 -5
  420. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-free/trial-2.md +0 -5
  421. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-free/trial-3.md +0 -1
  422. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-glm-air/trial-1.md +0 -8
  423. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-glm-air/trial-2.md +0 -5
  424. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-glm-air/trial-3.md +0 -4
  425. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/meta.json +0 -89
  426. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004-no-foreign-ticket-edit.yaml +0 -50
  427. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/claude-haiku/trial-1.md +0 -5
  428. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/claude-haiku/trial-2.md +0 -5
  429. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/claude-haiku/trial-3.md +0 -5
  430. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/judge.json +0 -124
  431. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-free/trial-1.md +0 -15
  432. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-free/trial-2.md +0 -4
  433. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-free/trial-3.md +0 -5
  434. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-glm-air/trial-1.md +0 -11
  435. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-glm-air/trial-2.md +0 -11
  436. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-glm-air/trial-3.md +0 -4
  437. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/meta.json +0 -88
  438. package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005-ticket-fields-updated.yaml +0 -39
  439. package/src/skills/execute-task/tests/fixtures/IMPL-902-create-file.md +0 -41
  440. package/src/skills/execute-task/tests/fixtures/IMPL-904-current-task.md +0 -40
  441. package/src/skills/execute-task/tests/fixtures/IMPL-906-fill-ticket.md +0 -42
  442. package/src/skills/execute-task/tests/fixtures/QA-901-button-click.md +0 -41
  443. package/src/skills/execute-task/tests/fixtures/QA-903-visual-figma.md +0 -40
  444. package/src/skills/execute-task/tests/fixtures/TASK-905-done-with-typo.md +0 -36
  445. package/src/skills/execute-task/tests/index.yaml +0 -39
  446. package/src/skills/execute-task/tests/rubrics/no-duplicate-dod.md +0 -22
  447. package/src/skills/execute-task/tests/rubrics/no-foreign-ticket-edit.md +0 -20
  448. package/src/skills/execute-task/tests/rubrics/no-ticket-creation.md +0 -21
  449. package/src/skills/execute-task/tests/rubrics/ticket-fields-updated.md +0 -23
  450. package/src/skills/execute-task/tests/rubrics/verification-proportionality.md +0 -22
  451. package/src/skills/execute-task/workflows/execute.md +0 -104
  452. package/src/skills/manual-testing/README.md +0 -63
  453. package/src/skills/manual-testing/algorithms/blocked-tool-strategy.md +0 -74
  454. package/src/skills/manual-testing/algorithms/bug-severity.md +0 -73
  455. package/src/skills/manual-testing/algorithms/mcp-budget.md +0 -97
  456. package/src/skills/manual-testing/algorithms/test-prioritization.md +0 -69
  457. package/src/skills/manual-testing/knowledge/browser-extension-testing.md +0 -102
  458. package/src/skills/manual-testing/knowledge/browser-tools.md +0 -114
  459. package/src/skills/manual-testing/knowledge/desktop-tools-advanced.md +0 -92
  460. package/src/skills/manual-testing/knowledge/desktop-tools-core.md +0 -76
  461. package/src/skills/manual-testing/knowledge/sandbox-advanced.md +0 -83
  462. package/src/skills/manual-testing/knowledge/sandbox-core.md +0 -67
  463. package/src/skills/manual-testing/knowledge/stateful-edge-cases.md +0 -69
  464. package/src/skills/manual-testing/knowledge/test-case-design.md +0 -107
  465. package/src/skills/manual-testing/knowledge/testing-types.md +0 -45
  466. package/src/skills/manual-testing/templates/bug-report.md +0 -52
  467. package/src/skills/manual-testing/templates/test-case.md +0 -34
  468. package/src/skills/manual-testing/templates/test-plan.md +0 -97
  469. package/src/skills/manual-testing/templates/test-session-report.md +0 -56
  470. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/claude-sonnet/trial-1.md +0 -34
  471. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/claude-sonnet/trial-2.md +0 -32
  472. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/claude-sonnet/trial-3.md +0 -30
  473. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/judge.json +0 -163
  474. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-deepseek/trial-1.md +0 -0
  475. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-deepseek/trial-2.md +0 -7
  476. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-deepseek/trial-3.md +0 -0
  477. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-glm/trial-1.md +0 -4
  478. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-glm/trial-2.md +0 -15
  479. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-glm/trial-3.md +0 -8
  480. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-minimax/trial-1.md +0 -5
  481. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-minimax/trial-2.md +0 -7
  482. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-minimax/trial-3.md +0 -7
  483. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/meta.json +0 -114
  484. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001-sandbox-mandatory.yaml +0 -38
  485. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/claude-sonnet/trial-1.md +0 -44
  486. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/claude-sonnet/trial-2.md +0 -32
  487. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/claude-sonnet/trial-3.md +0 -47
  488. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/judge.json +0 -163
  489. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-deepseek/trial-1.md +0 -19
  490. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-deepseek/trial-2.md +0 -15
  491. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-deepseek/trial-3.md +0 -24
  492. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-glm/trial-1.md +0 -19
  493. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-glm/trial-2.md +0 -13
  494. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-glm/trial-3.md +0 -18
  495. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-minimax/trial-1.md +0 -21
  496. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-minimax/trial-2.md +0 -15
  497. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-minimax/trial-3.md +0 -14
  498. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/meta.json +0 -114
  499. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002-visual-tc-screenshot.yaml +0 -37
  500. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-003/current/claude-sonnet/trial-1.md +0 -76
  501. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-003/current/claude-sonnet/trial-2.md +0 -71
  502. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-003/current/claude-sonnet/trial-3.md +0 -85
  503. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-003/current/judge.json +0 -46
  504. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-003/current/meta.json +0 -36
  505. package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-003-qa-non-ui-assertion.yaml +0 -65
  506. package/src/skills/manual-testing/tests/index.yaml +0 -30
  507. package/src/skills/manual-testing/tests/last-run-tc001-sonnet.log +0 -140
  508. package/src/skills/manual-testing/tests/last-run-tc002.log +0 -1
  509. package/src/skills/manual-testing/tests/last-run.log +0 -1469
  510. package/src/skills/manual-testing/tests/rubrics/qa-non-ui-assertion.md +0 -31
  511. package/src/skills/manual-testing/tests/rubrics/sandbox-mandatory.md +0 -20
  512. package/src/skills/manual-testing/tests/rubrics/visual-tc-screenshot.md +0 -21
  513. package/src/skills/manual-testing/workflows/acceptance.md +0 -80
  514. package/src/skills/manual-testing/workflows/exploratory.md +0 -84
  515. package/src/skills/manual-testing/workflows/regression.md +0 -76
  516. package/src/skills/manual-testing/workflows/smoke.md +0 -109
  517. package/src/skills/manual-testing/workflows/test-plan.md +0 -75
  518. package/src/skills/review-result/README.md +0 -59
  519. package/src/skills/review-result/algorithms/verification.md +0 -112
  520. package/src/skills/review-result/knowledge/baseline-snapshot-validation.md +0 -67
  521. package/src/skills/review-result/knowledge/dod-patterns.md +0 -116
  522. package/src/skills/review-result/knowledge/test-hygiene.md +0 -44
  523. package/src/skills/review-result/scripts/verify-artifacts.js +0 -497
  524. package/src/skills/review-result/templates/verdict.md +0 -153
  525. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-haiku/trial-1.md +0 -22
  526. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-haiku/trial-2.md +0 -7
  527. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-haiku/trial-3.md +0 -21
  528. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-sonnet/trial-1.md +0 -6
  529. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-sonnet/trial-2.md +0 -6
  530. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-sonnet/trial-3.md +0 -6
  531. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/judge.json +0 -164
  532. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-deepseek/trial-1.md +0 -5
  533. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-deepseek/trial-2.md +0 -7
  534. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-deepseek/trial-3.md +0 -6
  535. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-glm/trial-1.md +0 -49
  536. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-glm/trial-2.md +0 -28
  537. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-glm/trial-3.md +0 -37
  538. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-minimax/trial-1.md +0 -22
  539. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-minimax/trial-2.md +0 -13
  540. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-minimax/trial-3.md +0 -21
  541. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/meta.json +0 -116
  542. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001-visual-tc-trigger.yaml +0 -51
  543. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-haiku/trial-1.md +0 -23
  544. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-haiku/trial-2.md +0 -22
  545. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-haiku/trial-3.md +0 -28
  546. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-sonnet/trial-1.md +0 -4
  547. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-sonnet/trial-2.md +0 -4
  548. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-sonnet/trial-3.md +0 -4
  549. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/judge.json +0 -163
  550. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-deepseek/trial-1.md +0 -4
  551. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-deepseek/trial-2.md +0 -0
  552. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-deepseek/trial-3.md +0 -4
  553. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-glm/trial-1.md +0 -39
  554. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-glm/trial-2.md +0 -25
  555. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-glm/trial-3.md +0 -32
  556. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-minimax/trial-1.md +0 -34
  557. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-minimax/trial-2.md +0 -8
  558. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-minimax/trial-3.md +0 -23
  559. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/meta.json +0 -115
  560. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002-path-line-suffix.yaml +0 -39
  561. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/claude-sonnet/trial-1.md +0 -40
  562. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/claude-sonnet/trial-2.md +0 -15
  563. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/claude-sonnet/trial-3.md +0 -7
  564. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/judge.json +0 -163
  565. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/kilo-deepseek/trial-1.md +0 -5
  566. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/kilo-deepseek/trial-2.md +0 -5
  567. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/kilo-deepseek/trial-3.md +0 -11
  568. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/kilo-glm/trial-1.md +0 -16
  569. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/kilo-glm/trial-2.md +0 -18
  570. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/kilo-glm/trial-3.md +0 -17
  571. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/kilo-minimax/trial-1.md +0 -17
  572. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/kilo-minimax/trial-2.md +0 -31
  573. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/kilo-minimax/trial-3.md +0 -5
  574. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003/current/meta.json +0 -115
  575. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-003-test-isolation.yaml +0 -50
  576. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-004/current/claude-sonnet/trial-1.md +0 -5
  577. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-004/current/claude-sonnet/trial-2.md +0 -5
  578. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-004/current/claude-sonnet/trial-3.md +0 -6
  579. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-004/current/judge.json +0 -46
  580. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-004/current/meta.json +0 -37
  581. package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-004-baseline-snapshot.yaml +0 -50
  582. package/src/skills/review-result/tests/fixtures/IMPL-902-path-with-line.md +0 -43
  583. package/src/skills/review-result/tests/fixtures/QA-901-visual-button.md +0 -46
  584. package/src/skills/review-result/tests/fixtures/QA-904-test-isolation-violation/QA-904.md +0 -51
  585. package/src/skills/review-result/tests/fixtures/QA-904-test-isolation-violation/example-test.mjs +0 -36
  586. package/src/skills/review-result/tests/fixtures/QA-905-baseline-regex-instead-of-snapshot/QA-905.md +0 -62
  587. package/src/skills/review-result/tests/fixtures/QA-905-baseline-regex-instead-of-snapshot/baseline.test.mjs +0 -124
  588. package/src/skills/review-result/tests/index.yaml +0 -35
  589. package/src/skills/review-result/tests/rubrics/baseline-snapshot.md +0 -20
  590. package/src/skills/review-result/tests/rubrics/path-line-suffix.md +0 -19
  591. package/src/skills/review-result/tests/rubrics/test-isolation.md +0 -20
  592. package/src/skills/review-result/tests/rubrics/visual-tc-trigger.md +0 -19
  593. package/src/skills/review-result/workflows/review.md +0 -209
@@ -1,115 +0,0 @@
- {
-   "date": "2026-04-20T12:21:11.683Z",
-   "skill_sha": "7d62ab4",
-   "status": "passed",
-   "duration_ms": 471969,
-   "l1_skipped": true,
-   "per_model": {
-     "claude-sonnet": {
-       "passed": true,
-       "errored": false,
-       "pass_count": 3,
-       "error_count": 0,
-       "total": 3,
-       "threshold": 2
-     },
-     "kilo-glm": {
-       "passed": true,
-       "errored": false,
-       "pass_count": 3,
-       "error_count": 0,
-       "total": 3,
-       "threshold": 2
-     },
-     "kilo-minimax": {
-       "passed": true,
-       "errored": false,
-       "pass_count": 3,
-       "error_count": 0,
-       "total": 3,
-       "threshold": 2
-     },
-     "kilo-deepseek": {
-       "passed": true,
-       "errored": false,
-       "pass_count": 2,
-       "error_count": 0,
-       "total": 3,
-       "threshold": 2
-     }
-   },
-   "rubric_scores": [
-     {
-       "agentId": "claude-sonnet",
-       "trial": 1,
-       "score": 5,
-       "errored": false
-     },
-     {
-       "agentId": "claude-sonnet",
-       "trial": 2,
-       "score": 5,
-       "errored": false
-     },
-     {
-       "agentId": "claude-sonnet",
-       "trial": 3,
-       "score": 5,
-       "errored": false
-     },
-     {
-       "agentId": "kilo-deepseek",
-       "trial": 1,
-       "score": 1,
-       "errored": false
-     },
-     {
-       "agentId": "kilo-deepseek",
-       "trial": 2,
-       "score": 5,
-       "errored": false
-     },
-     {
-       "agentId": "kilo-deepseek",
-       "trial": 3,
-       "score": 5,
-       "errored": false
-     },
-     {
-       "agentId": "kilo-glm",
-       "trial": 1,
-       "score": 5,
-       "errored": false
-     },
-     {
-       "agentId": "kilo-glm",
-       "trial": 2,
-       "score": 5,
-       "errored": false
-     },
-     {
-       "agentId": "kilo-glm",
-       "trial": 3,
-       "score": 5,
-       "errored": false
-     },
-     {
-       "agentId": "kilo-minimax",
-       "trial": 1,
-       "score": 5,
-       "errored": false
-     },
-     {
-       "agentId": "kilo-minimax",
-       "trial": 2,
-       "score": 5,
-       "errored": false
-     },
-     {
-       "agentId": "kilo-minimax",
-       "trial": 3,
-       "score": 5,
-       "errored": false
-     }
-   ]
- }
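The relationship between `rubric_scores` and `per_model` in the removed `meta.json` can be sketched as follows. This is only an illustration, not the package's actual aggregation code: the helper name `summarizePerModel` is hypothetical, and it assumes that a trial counts as a pass when its score reaches a `pass_threshold` of 4 (the value used by the TC-ANALYZE-REPORT-002 case in this diff) and that a model passes when `pass_count` reaches `threshold`.

```javascript
// Hypothetical helper (not part of workflow-ai): rebuild per-model verdicts
// from a list of rubric scores.
// Assumptions: a trial passes when score >= passThreshold, and a model passes
// when pass_count >= threshold (2 of 3 trials in the data above).
function summarizePerModel(rubricScores, passThreshold = 4, threshold = 2) {
  const perModel = {};
  for (const { agentId, score, errored } of rubricScores) {
    // Create the per-model entry on first sight of this agent.
    const entry = (perModel[agentId] ??= {
      pass_count: 0,
      error_count: 0,
      total: 0,
      threshold,
    });
    entry.total += 1;
    if (errored) entry.error_count += 1;
    else if (score >= passThreshold) entry.pass_count += 1;
  }
  for (const entry of Object.values(perModel)) {
    entry.passed = entry.pass_count >= entry.threshold;
  }
  return perModel;
}

// The kilo-deepseek trials from the hunk above: one low score, two high ones.
const sample = [
  { agentId: "kilo-deepseek", trial: 1, score: 1, errored: false },
  { agentId: "kilo-deepseek", trial: 2, score: 5, errored: false },
  { agentId: "kilo-deepseek", trial: 3, score: 5, errored: false },
];
const summary = summarizePerModel(sample);
console.log(summary["kilo-deepseek"]);
```

Under these assumptions the single score of 1 does not fail the model, which matches the recorded `"pass_count": 2` and `"passed": true` for `kilo-deepseek`.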
@@ -1,44 +0,0 @@
- id: TC-ANALYZE-REPORT-002
- title: "Analyzer emits ---RESULT--- with completed or has_gaps"
- origin:
-   chg: [CHG-070]
-   incidents:
-     - "workflowAiVsCode: CHG-070 - runner did not parse output without ---RESULT---"
-   principle: "Mandatory RESULT block"
-   backlog_sources:
-     - "d:/Dev/workflowAiVsCode/.workflow/coach-backlog.yaml"
- tags: [result-format, pipeline-integration]
- severity: high
-
- scenario:
-   system_prompt_file: "../../SKILL.md"
-   extra_instructions: |
-     Analyze the progress of plan PLAN-001.
-
- assertions:
-   static:
-     - kind: skill_contains
-       file: "SKILL.md"
-       pattern: "---RESULT---"
-       reason: "The RESULT block format must be documented."
-
-   deterministic:
-     - kind: output_contains_all
-       values: ["---RESULT---"]
-     - kind: output_matches
-       regex: "status:\\s*(completed|has_gaps)"
-       reason: "Only two statuses are allowed."
-     - kind: output_does_not_contain
-       values: ["status: ok", "status: done", "status: default"]
-       reason: "Forbidden statuses."
-
-   rubric:
-     - kind: llm_judge
-       rubric_file: "../rubrics/result-block-format.md"
-       criterion: "Did the agent end the analysis with a ---RESULT--- block whose status is strictly completed or has_gaps, in a format the pipeline can parse?"
-       pass_threshold: 4
-       trials: 3
-       aggregate: majority
-
- execution:
-   timeout_s: 1200
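The three deterministic assertion kinds in this case can be illustrated with a small checker. This is a sketch under stated assumptions, not the test harness shipped with the package: `checkDeterministic` is a hypothetical name, the assertion objects mirror the YAML fields above, and the sample outputs (including the exact layout of the RESULT block) are made up for illustration.

```javascript
// Hypothetical evaluator for the deterministic assertion kinds shown above.
// Returns a list of human-readable failures; an empty list means all passed.
function checkDeterministic(output, assertions) {
  const failures = [];
  for (const a of assertions) {
    if (a.kind === "output_contains_all") {
      for (const v of a.values) {
        if (!output.includes(v)) failures.push(`missing: ${v}`);
      }
    } else if (a.kind === "output_matches") {
      if (!new RegExp(a.regex).test(output)) failures.push(`no match: ${a.regex}`);
    } else if (a.kind === "output_does_not_contain") {
      for (const v of a.values) {
        if (output.includes(v)) failures.push(`forbidden: ${v}`);
      }
    }
  }
  return failures;
}

// Same values as the YAML case; the regex string decodes to status:\s*(completed|has_gaps).
const assertions = [
  { kind: "output_contains_all", values: ["---RESULT---"] },
  { kind: "output_matches", regex: "status:\\s*(completed|has_gaps)" },
  { kind: "output_does_not_contain", values: ["status: ok", "status: done", "status: default"] },
];

// Made-up agent outputs: one conforming, one using a forbidden status.
const good = "Analysis finished.\n---RESULT---\nstatus: has_gaps";
const bad = "Analysis finished.\nstatus: done";

console.log(checkDeterministic(good, assertions)); // prints []
console.log(checkDeterministic(bad, assertions).length); // prints 3
```

The `bad` output fails all three checks at once: the marker is missing, no allowed status is present, and a forbidden status appears.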
@@ -1,27 +0,0 @@
- # REPORT-002: Analysis of the PLAN-001 iteration
-
- **Date:** 2026-04-06
- **Plan:** PLAN-001
- **Source:** iteration no. 3
-
- ## Executive Summary
-
- Five tickets were processed during the iteration. One ticket (QA-001) was skipped
- because its relevance was determined incorrectly. The stage logic needs to be revisited.
-
- ## Problems
-
- ### QA-001: ticket skipped
-
- - **Status in iteration:** skipped
- - **Root cause:** `check-conditions.js`: the stage wrongly determined that the launch
-   conditions were not met and passed the ticket on without processing.
- - **Affected components:** `check-conditions.js`, ticket dependencies.
- - **Priority:** HIGH
- - **Proposed action:** fix the dependency-check logic in `check-conditions.js`.
-
- ## Recommendations
-
- | # | Action | Priority |
- |---|--------|----------|
- | 1 | Revisit the thresholds in `check-conditions.js` | HIGH |
@@ -1,32 +0,0 @@
1
- [2026-04-06 12:00:00] [INFO] [PipelineRunner] Step 312
2
- [2026-04-06 12:00:00] [INFO] [PipelineRunner] Current stage: pick-next-task
3
- [2026-04-06 12:00:00] [INFO] [pick-next-task] START stage="pick-next-task" agent="script-pick" skill="undefined"
4
- [2026-04-06 12:00:00] [INFO] [pick-next-task] OUTPUT ↓
5
- [2026-04-06 12:00:00] [INFO] [pick-next-task] Selected ticket: QA-001 (plan PLAN-001, status=ready)
6
- [2026-04-06 12:00:00] [INFO] [pick-next-task] COMPLETE stage="pick-next-task" ticket_id="QA-001" status="picked"
7
- [2026-04-06 12:00:00] [INFO] [PipelineRunner] Step 313
8
- [2026-04-06 12:00:00] [INFO] [PipelineRunner] Current stage: check-conditions
9
- [2026-04-06 12:00:00] [INFO] [check-conditions] START stage="check-conditions" agent="script-check" ticket_id="QA-001"
10
- [2026-04-06 12:00:00] [INFO] [check-conditions] RUN node .workflow/src/scripts/check-conditions.js QA-001
11
- [2026-04-06 12:00:00] [INFO] [check-conditions] OUTPUT ↓
12
- [2026-04-06 12:00:00] [INFO] [check-conditions] Conditions evaluation for QA-001:
13
- [2026-04-06 12:00:00] [INFO] [check-conditions] - dependencies.resolved: true
14
- [2026-04-06 12:00:00] [INFO] [check-conditions] - prerequisites.met: true
15
- [2026-04-06 12:00:00] [INFO] [check-conditions] - blocking_tickets: []
16
- [2026-04-06 12:00:00] [INFO] [check-conditions] Result: conditions_ok
17
- [2026-04-06 12:00:00] [INFO] [check-conditions] COMPLETE stage="check-conditions" ticket_id="QA-001" status="conditions_ok"
18
- [2026-04-06 12:00:00] [INFO] [PipelineRunner] Step 314
19
- [2026-04-06 12:00:00] [INFO] [PipelineRunner] Current stage: check-relevance
20
- [2026-04-06 12:00:00] [INFO] [check-relevance] START stage="check-relevance" agent="script-relevance" ticket_id="QA-001"
21
- [2026-04-06 12:00:00] [INFO] [check-relevance] RUN node .workflow/src/scripts/check-relevance.js QA-001
22
- [2026-04-06 12:00:00] [INFO] [check-relevance] OUTPUT ↓
23
- [2026-04-06 12:00:00] [INFO] [check-relevance] Relevance evaluation for QA-001:
24
- [2026-04-06 12:00:00] [INFO] [check-relevance] - dependencies.status: inactive
25
- [2026-04-06 12:00:00] [INFO] [check-relevance] - decision: irrelevant (dependencies inactive)
26
- [2026-04-06 12:00:00] [INFO] [check-relevance] COMPLETE stage="check-relevance" ticket_id="QA-001" status="irrelevant" reason="dependencies_inactive"
27
- [2026-04-06 12:00:00] [INFO] [PipelineRunner] Step 315
28
- [2026-04-06 12:00:00] [INFO] [PipelineRunner] Current stage: skip-ticket
29
- [2026-04-06 12:00:00] [INFO] [skip-ticket] START stage="skip-ticket" agent="script-skip" ticket_id="QA-001"
30
- [2026-04-06 12:00:00] [INFO] [skip-ticket] OUTPUT ↓
31
- [2026-04-06 12:00:00] [INFO] [skip-ticket] Moving QA-001 → skipped/ (reason from check-relevance: dependencies_inactive)
32
- [2026-04-06 12:00:00] [INFO] [skip-ticket] COMPLETE stage="skip-ticket" ticket_id="QA-001" status="skipped"
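The log excerpt above follows a regular `[timestamp] [LEVEL] [source] message` layout, and its `COMPLETE` lines carry `key="value"` pairs. A minimal sketch of how such lines could be parsed to recover which stage decided what, assuming that layout holds for the whole log. The helper name `parseCompleteLines` is hypothetical and not part of the package:

```javascript
// Hypothetical helper: extracts COMPLETE records from pipeline log text.
// Assumes each line looks like: [timestamp] [LEVEL] [source] message
function parseCompleteLines(logText) {
  const linePattern = /^\[([^\]]+)\] \[([^\]]+)\] \[([^\]]+)\] (.*)$/;
  const records = [];
  logText.split("\n").forEach((line, index) => {
    const match = linePattern.exec(line.trim());
    if (!match || !match[4].startsWith("COMPLETE ")) return;
    const fields = {};
    // Collect key="value" pairs such as stage="check-relevance".
    for (const [, key, value] of match[4].matchAll(/(\w+)="([^"]*)"/g)) {
      fields[key] = value;
    }
    // Line numbers are 1-based so they can be cited as log evidence.
    records.push({ line: index + 1, source: match[3], ...fields });
  });
  return records;
}

const sample = [
  '[2026-04-06 12:00:00] [INFO] [check-relevance] - decision: irrelevant (dependencies inactive)',
  '[2026-04-06 12:00:00] [INFO] [check-relevance] COMPLETE stage="check-relevance" ticket_id="QA-001" status="irrelevant" reason="dependencies_inactive"',
].join("\n");

console.log(parseCompleteLines(sample));
```

Keeping the line number in each record makes it possible to cite the exact log entry later, which the rubric for this test case asks for.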
@@ -1,25 +0,0 @@
1
- version: 1
2
- skill: analyze-report
3
- generated_at: "2026-04-17"
4
-
5
- execution:
6
- target_agents:
7
- - claude-sonnet
8
- - kilo-glm
9
- - kilo-minimax
10
- - kilo-deepseek
11
- judge_agent: claude-opus
12
- default_timeout_s: 1200
13
- baseline_ref: origin/main
14
-
15
- cases:
16
- - id: TC-ANALYZE-REPORT-001
17
- file: cases/TC-ANALYZE-REPORT-001-evidence-from-log.yaml
18
- origin_chg: [CHG-028, CHG-048]
19
- tags: [evidence, log-verification]
20
- severity: high
21
- - id: TC-ANALYZE-REPORT-002
22
- file: cases/TC-ANALYZE-REPORT-002-result-block-format.yaml
23
- origin_chg: [CHG-070]
24
- tags: [result-format, pipeline-integration]
25
- severity: high
@@ -1,22 +0,0 @@
1
- # Rubric: Verifying every finding against the pipeline log
2
-
3
- ## Criterion
4
-
5
- **Did the agent verify the finding from the input report against an explicit reference to the pipeline log, rather than taking the report's attribution on faith?**
6
-
7
- The agent must:
8
- 1. Find the root-cause attribution in the previous report (in this scenario: `check-conditions.js`).
9
- 2. Compare it with the actual entry in the pipeline log (in this scenario: stage `check-relevance`, decision `irrelevant` with reason `dependencies_inactive`).
10
- 3. Flag the discrepancy and name the correct source stage by exactly the name under which it appears in the log.
11
- 4. Not carry the attribution over from the report unchecked: evidence from the log takes precedence over the report text.
12
- 5. Accompany the conclusion with a confidence level (`[HIGH]`/`[MEDIUM]`/`[LOW]`) and a reference to the specific log entry (stage + reason, or line number).
13
-
14
- ## Scale (1–5)
15
-
16
- | Score | Description |
17
- |------|----------|
18
- | **5** | The agent explicitly stated the discrepancy between the report and the log, named the stage from the log (`check-relevance`) backed by a specific entry (reason=`dependencies_inactive` and/or a line number), rejected the report's attribution (`check-conditions`), and gave a confidence level. |
19
- | **4** | The discrepancy is noted and the correct stage is named with a reference to the log, but without quoting the line or without a confidence level. |
20
- | **3** | The correct stage is named, but without explicitly pointing out the discrepancy with the report or without grounding in the log. |
21
- | **2** | The agent mixed both attributions without choosing one, or marked it as "unclear" without attempting to check the log. |
22
- | **1** | The agent reproduced the report's attribution (`check-conditions`) as fact, without mentioning the log or detecting the discrepancy. |
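The comparison this rubric rewards can be sketched as a small function. This is only an illustration under stated assumptions: the record shape (`line`, `stage`, `ticket_id`, `status`, `reason`) and the rule that the deciding stage is the one whose record carries a `reason` are guesses based on the log excerpt in this test case, and `verifyAttribution` is a hypothetical name.

```javascript
// Hypothetical sketch: compare a report's root-cause attribution with the log.
// Assumes records shaped like { line, stage, ticket_id, status, reason } and
// that the deciding stage is the one whose record carries a `reason`.
function verifyAttribution(claimedStage, ticketId, records) {
  // The report may name the script file; the log names the stage.
  const claimed = claimedStage.replace(/\.js$/, "");
  const decision = records.find((r) => r.ticket_id === ticketId && r.reason);
  if (!decision) {
    // No evidence: report LOW confidence instead of guessing a culprit.
    return { confidence: "LOW", note: "evidence not found" };
  }
  return {
    confidence: "HIGH",
    matches: decision.stage === claimed,
    stage: decision.stage,
    evidence: `line ${decision.line}: stage=${decision.stage} reason=${decision.reason}`,
  };
}

const records = [
  { line: 6, stage: "pick-next-task", ticket_id: "QA-001", status: "picked" },
  { line: 17, stage: "check-conditions", ticket_id: "QA-001", status: "conditions_ok" },
  { line: 26, stage: "check-relevance", ticket_id: "QA-001", status: "irrelevant", reason: "dependencies_inactive" },
];

const verdict = verifyAttribution("check-conditions.js", "QA-001", records);
console.log(verdict);
```

With the scenario's data the claimed stage and the logged stage differ, which is the discrepancy a top-scoring answer is expected to call out.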
@@ -1,22 +0,0 @@
1
- # Rubric: Mandatory `---RESULT---` block with status `completed` or `has_gaps`
2
-
3
- ## Criterion
4
-
5
- **Did the agent end the analysis with a `---RESULT---` block whose status is strictly from the allowed set (`completed` | `has_gaps`), in a format the pipeline can parse?**
6
-
7
- The agent must:
8
- 1. Close the response with a block framed by `---RESULT---` markers above and below.
9
- 2. Set `status:` to exactly one of two values: `completed` or `has_gaps`. Any other value (`ok`, `done`, `default`, `success`, etc.) is an error.
10
- 3. For `has_gaps`, provide a `gaps:` field with a brief description of the gaps (1–2 sentences, actionable).
11
- 4. Match the status to the content of the analysis: `completed` only when there are no open tickets and no CRITICAL/HIGH issues; in all other cases, `has_gaps`.
12
- 5. Not duplicate the `---RESULT---` block or place it in the middle of the response: it belongs only at the end.
12
- 5. Не дублировать блок `---RESULT---` и не помещать его в середину ответа — только в конце.
13
-
14
- ## Scale (1–5)
15
-
16
- | Score | Description |
17
- |------|----------|
18
- | **5** | The `---RESULT---` block is correctly formatted at the very end, the status is from the allowed set and matches the content of the analysis, and for `has_gaps` an informative `gaps` field is present. |
19
- | **4** | The block is present and the status is valid, but the `gaps` field is weak or missing for `has_gaps`, or there is a minor formatting flaw (extra spaces, field format). |
20
- | **3** | The block is present and the status is valid, but the status does not match the analysis (the text and RESULT contradict each other). |
21
- | **2** | The block is present, but the status is not allowed (`ok`, `done`, `default`, `success`) or the `status` field is missing. |
22
- | **1** | The `---RESULT---` block is missing entirely or is damaged (a single marker, not at the end, broken YAML). |
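The structural checks in this rubric (two markers, block at the end, allowed status, `gaps` present for `has_gaps`) could be automated roughly as follows. This is a sketch, not the pipeline's actual parser: it assumes plain `key: value` lines between the two markers, and `validateResultBlock` is a hypothetical name.

```javascript
// Hypothetical validator for the trailing ---RESULT--- block.
// Assumes simple "key: value" lines between two ---RESULT--- markers.
const MARKER = "---RESULT---";
const ALLOWED_STATUSES = new Set(["completed", "has_gaps"]);

function validateResultBlock(responseText) {
  const lines = responseText.trimEnd().split("\n").map((l) => l.trim());
  const markerIndexes = lines.flatMap((l, i) => (l === MARKER ? [i] : []));
  if (markerIndexes.length !== 2) {
    return { ok: false, error: "expected exactly one block framed by two markers" };
  }
  const [open, close] = markerIndexes;
  if (close !== lines.length - 1) {
    return { ok: false, error: "block must be the last thing in the response" };
  }
  const fields = {};
  for (const line of lines.slice(open + 1, close)) {
    const separator = line.indexOf(":");
    if (separator > 0) {
      fields[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
    }
  }
  if (!ALLOWED_STATUSES.has(fields.status)) {
    return { ok: false, error: `invalid status: ${fields.status}` };
  }
  if (fields.status === "has_gaps" && !fields.gaps) {
    return { ok: false, error: "has_gaps requires a non-empty gaps field" };
  }
  return { ok: true, status: fields.status };
}

const response = [
  "Analysis text...",
  MARKER,
  "status: has_gaps",
  "gaps: example gap description",
  MARKER,
].join("\n");

console.log(validateResultBlock(response));
```

Whether the status agrees with the body of the analysis (scores 3 versus 5) still needs a human or judge model; the sketch covers only the mechanical part.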
@@ -1,158 +0,0 @@
1
- # Workflow: PROGRESS (plan progress assessment)
2
-
3
- Assess the current state of the active plan: what is done, what is in progress, what is behind, and what the risks are.
4
-
5
- ## Execution algorithm
6
-
7
- ### 1. Gather the inputs
8
-
9
- From the ticket, extract:
10
- - Which plan to analyze (path to the plan file)
11
- - Which report is the data source (path to the report file)
12
- - Context: why the analysis is needed and which decisions it will inform
13
-
14
- Read:
15
- - The plan file from `.workflow/plans/`
16
- - The report file from `.workflow/reports/`
17
- - Tickets from `.workflow/tickets/done/` (completed)
18
- - Tickets from `.workflow/tickets/in-progress/` (current)
19
- - Tickets from `.workflow/tickets/ready/` (pending)
20
- - Tickets from `.workflow/tickets/blocked/` (blocked)
21
-
22
- ### 2. Calculate progress metrics
23
-
24
- > Load `algorithms/progress-assessment.md`
25
- > Load `knowledge/analysis-frameworks.md`
26
-
27
- #### 2a. Automatic calculation (primary path)
28
-
29
- Run the metrics script:
30
-
31
- ```bash
32
- node .workflow/src/skills/analyze-report/scripts/calc-plan-metrics.js <PLAN-NNN>
33
- ```
34
-
35
- Where `<PLAN-NNN>` is the plan ID from the ticket (for example, `PLAN-002`).
36
-
37
- Read the JSON from the `---RESULT---` block. The script returns:
38
- - `distribution`: ticket distribution by status (done, in-progress, ready, blocked, backlog)
39
- - `completion_pct`: completion percentage
40
- - `avg_time_to_done`: average time to complete a ticket (days)
41
- - `blocked_rate`: percentage of blocked tickets
42
- - `rework_count`: number of tickets that required rework
43
- - `total_tickets`: total number of tickets in the plan
44
-
45
- Use these metrics as the basis for the rest of the analysis.
46
-
47
- #### 2b. Manual calculation (fallback)
48
-
49
- If the script is unavailable or returned an error, collect the data manually:
50
- - Total number of tasks in the plan
51
- - Number completed / in progress / pending / blocked
52
- - Completion percentage (by count and by complexity)
53
-
54
- ### 3. Analyze execution quality
55
-
56
- For each completed ticket:
57
- - Are all DoD criteria met?
58
- - Were there review comments?
59
- - Was the ticket re-executed?
60
-
61
- > Load `knowledge/report-structure.md`
62
-
63
- ### 4. Identify problems and risks
64
-
65
- | Category | What to look for |
66
- |-----------|-----------|
67
- | **Blockers** | Blocked tickets, dependencies |
68
- | **Deviations** | Tasks with review comments |
69
- | **Gaps** | Plan tasks not covered by tickets |
70
- | **Patterns** | Recurring problems across tickets |
71
-
72
- ### 4.1. Verify every problem against the pipeline log (MANDATORY)
73
-
74
- > **⛔ Without this step, findings have no evidence base and fall into the "guessed" category.** The evidence-based principle is violated and the report turns into guesswork.
75
-
76
- **Why:** the source report (REPORT-NNN) already lists problems, but **possibly with the wrong attribution** (see the "copying the root cause" antipattern in `create-report/workflows/standard.md`). Your job as the analyst is to **independently verify** each problem against the log rather than take the report's wording at face value.
77
-
78
- **Algorithm:**
79
-
80
- 1. **Find the log of the session** in which the problem occurred. In `.workflow/logs/`, find the `pipeline_*.log` files for the period covered by the report under analysis (by mtime or by the date range in the report).
81
-
82
- 2. **For each problem from step 4:**
83
-    1. Find the log lines that mention the problem ticket (Grep by `ticket_id`).
84
-    2. Extract the name of the stage that made the decision and its rationale (the `reason` field in `---RESULT---`).
85
-    3. Compare the attribution you found with what the report says.
86
-
87
- 3. **If the report's attribution matches the log** → the finding has **HIGH** confidence; quote the log in the report: `pipeline_*.log:NNNN`.
88
-
89
- 4. **If the report's attribution does NOT match the log** (the report blames component X, while the log shows component Y) → this is a **separate CRITICAL finding**:
90
-    - Record in the problems section: "Report REPORT-NNN misattributed the root cause of problem Z: component X is named, but the decision was actually made by component Y (log: pipeline_*.log:NNNN)".
91
-    - This signals a defect in the skill that generates reports; recommend creating a ticket to fix that skill.
92
-
93
- 5. **If the log has no data on the ticket** (for example, the incident happened before logging began) → the finding gets **LOW** confidence; state honestly in the report: "evidence is missing, manual investigation required".
94
-
95
- > **⛔ No guessing.** If you did not find the log line with the decision, **never** write "it is probably component X". Write `evidence not found, confidence LOW` instead. Guessed blame sends the coach in the wrong direction and leads to fixing the wrong components.
96
-
97
- ### 5. Formulate recommendations
98
-
99
- For each problem/risk:
100
- - **What:** a concrete action
101
- - **Why:** the reason (based on data) + **a quoted log line** as evidence (for findings with HIGH confidence)
102
- - **Priority:** CRITICAL / HIGH / MEDIUM / LOW
103
- - **Confidence:** HIGH (evidence from the log) / MEDIUM (ticket data but no log) / LOW (only circumstantial signs)
104
-
105
- ### 5.1. Verify gaps before handing off to the pipeline (MANDATORY)
106
-
107
- > **⛔ Without this step, decompose-gaps will receive duplicate or already-resolved gaps.**
108
-
109
- Before producing a `---RESULT---` with `status: has_gaps`, for every artifact that the recommendations describe as "needs to be created" (ticket, file, bug report):
110
-
111
- 1. **Glob** `.workflow/tickets/` for the artifact ID (for example, `**/XXX-NNN.md`).
112
- 2. If the file **already exists**, it is not a gap. Exclude it from the gaps description. In the recommendations section, note: "Ticket {ID} already exists on disk, no creation needed".
113
- 3. If the file **does not exist**, it is a valid gap; include it in the description.
114
-
115
- **Why:** upstream stages (execute-task) may create tickets in violation of their constraints. Passing "create ticket X" in gaps when X already exists leads to duplication or overwriting in decompose-gaps.
116
-
117
- ### 6. Determine the plan status
118
-
119
- | Progress | Quality | Blockers | Status |
120
- |----------|----------|---------|--------|
121
- | ≥80% | High | None | ✅ On the home stretch |
122
- | 50-80% | Normal | Few | 🟡 Within expectations |
123
- | 30-50% | Normal | Some | 🟠 Needs attention |
124
- | <30% | Any | Many | 🔴 Critically behind |
125
-
126
- ### 6.5. Update the plan status on completion
127
-
128
- **Completion criterion:** the plan counts as `completed` only when **both** conditions hold at the same time:
129
- 1. 100% of the plan's tickets are in the `done/` directory
130
- 2. The analysis found no gaps (`has_gaps: false`)
131
-
132
- **If both conditions hold:**
133
-
134
- Read the plan's frontmatter. If `status` is already `completed` or `archived`, skip (idempotency).
135
-
136
- Otherwise, update the plan's frontmatter: set `status: completed`, `completed_at` to the current date (ISO 8601), and `updated_at` to the current date.
137
-
138
- **If at least one condition does not hold:**
139
-
140
- Do NOT update the status, even if progress is ≥80%.
141
-
142
- > ⚠️ Important: step 6 assesses the visual progress status (≥80% → "on the home stretch"), which is **not the same** as plan completion. Setting `status: completed` is tied **exclusively** to the strict criterion (100% done + has_gaps: false), not to the visual assessment.
143
-
144
- ### 7. Produce the report
145
-
146
- > Use `templates/analysis-report.md`
147
-
148
- ### 8. Validation
149
-
150
- - [ ] All metrics are calculated from real data
151
- - [ ] Every problem is backed by a concrete example
152
- - [ ] **Every finding with HIGH confidence has a log citation** (`pipeline_*.log:NNNN`)
153
- - [ ] **Every finding without evidence is marked LOW**, with no guessing at the culprit
154
- - [ ] **Discrepancies with the source report's attribution are raised as a separate CRITICAL finding**
155
- - [ ] Recommendations are actionable (contain a concrete action)
156
- - [ ] The executive summary reflects the key findings
157
- - [ ] The plan status matches the metrics
158
- - [ ] **STOP-GATE:** If the report contains `plan_status: completed`, read the plan's frontmatter. If the plan's `status` is not `completed`, STOP: return to step 6.5, perform the update, and repeat the check.
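The strict completion rule from step 6.5 of this workflow is easy to confuse with the visual ≥80% status, so a small decision function may help make the difference concrete. This is a sketch under assumptions: the input field names, the `"active"` status value and the rule that an empty plan never counts as complete are illustrative choices, and `planStatusUpdate` is a hypothetical name.

```javascript
// Hypothetical sketch of the step 6.5 decision; field names are assumptions.
// Returns the frontmatter fields to write, or null when nothing should change.
function planStatusUpdate({ totalTickets, doneTickets, hasGaps, currentStatus }, today) {
  // Assumption: a plan with zero tickets is never treated as complete.
  const allDone = totalTickets > 0 && doneTickets === totalTickets;
  if (!allDone || hasGaps) return null; // strict criterion not met
  if (currentStatus === "completed" || currentStatus === "archived") {
    return null; // idempotent: already in a final state
  }
  return { status: "completed", completed_at: today, updated_at: today };
}

const today = new Date().toISOString().slice(0, 10); // ISO 8601 date, e.g. YYYY-MM-DD

console.log(planStatusUpdate(
  { totalTickets: 10, doneTickets: 10, hasGaps: false, currentStatus: "active" },
  today,
));
console.log(planStatusUpdate(
  { totalTickets: 10, doneTickets: 9, hasGaps: false, currentStatus: "active" },
  today,
));
```

The second call returns `null` even though progress is 90%, which is the behavior the warning in step 6.5 describes.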
@@ -1,143 +0,0 @@
1
- # Workflow: RETROSPECTIVE (retrospective analysis of a completed plan)
2
-
3
- A full analysis of a completed plan: what worked, what did not, which lessons to take away, and what to improve in the next cycle.
4
-
5
- ## Execution algorithm
6
-
7
- ### 1. Gather the inputs
8
-
9
- > **Pre-condition: check the plan status**
10
- >
11
- > Before collecting data, read the plan's frontmatter.
12
- >
13
- > If `status` is **neither** `completed` nor `archived`:
14
- > 1. Record it as a finding in the report's Lessons Learned section:
15
- >    "Process anomaly: the plan did not have `completed` status when the retrospective started; the status update in the progress analysis (step 6.5) was skipped"
16
- > 2. Update the plan's frontmatter: set `status: completed`, `completed_at` to the current date (ISO 8601), and `updated_at` to the current date.
17
- >
18
- > If `status` is already `completed` or `archived`, skip (idempotency) and continue step 1 as usual.
19
- >
20
- > ⚠️ This is a safety net. The primary path for setting `status: completed` is step 6.5 in `progress.md`. The retrospective only guards against a missed update and records it as a process problem.
21
-
22
- From the ticket, extract:
23
- - Which plan to analyze
24
- - All related reports
25
- - Context: why the retrospective is being held
26
-
27
- Read:
28
- - The plan file from `.workflow/plans/`
29
- - All reports related to the plan, from `.workflow/reports/`
30
- - All of the plan's tickets from `.workflow/tickets/done/`
31
- - Blocked tickets (if any remain) from `.workflow/tickets/blocked/`
32
-
33
- ### 2. Assess results vs goals
34
-
35
- > Load `knowledge/analysis-frameworks.md`
36
-
37
- For each plan goal:
38
-
39
- | Goal | Status | Result | Deviation |
40
- |------|--------|-----------|------------|
41
- | ... | ✅/⚠️/❌ | What was achieved | How it differs from expectations |
42
-
43
- ### 3. Analyze process effectiveness
44
-
45
- > Load `algorithms/progress-assessment.md`
46
-
47
- #### 3a. Automatic calculation (primary path)
48
-
49
- Run the metrics script:
50
-
51
- ```bash
52
- node .workflow/src/skills/analyze-report/scripts/calc-plan-metrics.js <PLAN-NNN>
53
- ```
54
-
55
- Where `<PLAN-NNN>` is the ID of the plan under analysis.
56
-
57
- Read the JSON from the `---RESULT---` block. The script returns: `completion_pct`, `blocked_rate`, `rework_count`, `avg_time_to_done`, `distribution`.
58
-
59
- Use the ready-made metrics for:
60
- - **Throughput**: `completion_pct` + `distribution` (completed / total)
61
- - **Blockers**: `blocked_rate` + `distribution.blocked`
62
- - **Rework**: `rework_count`
63
-
64
- #### 3b. Additional metrics (manual)
65
-
66
- The script does not calculate these; collect them yourself:
67
- - **Quality**: % of tickets that passed review on the first attempt (check the review history in the tickets)
68
-
69
- #### 3c. Fallback (if the script is unavailable)
70
-
71
- Collect all metrics manually:
72
- - **Throughput**: how many tickets were completed / how many were in the plan
73
- - **Quality**: % of tickets that passed review on the first attempt
74
- - **Blockers**: number and duration of blocks
75
- - **Rework**: number of tickets that required re-execution
76
-
77
- ### 4. Identify patterns
78
-
79
- **What worked well (Keep):**
80
- - Practices that led to successful results
81
- - Processes that ran smoothly
82
-
83
- **What did not work (Stop):**
84
- - Practices that led to problems
85
- - Recurring mistakes
86
-
87
- **What to try (Try):**
88
- - Improvement ideas based on the problems identified
89
-
90
- ### 4.1. Verify problem patterns against the pipeline logs (MANDATORY)
91
-
92
- > **⛔ Without this step, the retrospective reproduces attribution errors from the source reports.** The evidence-based principle is violated.
93
-
94
- **Why:** the retrospective relies on REPORT-NNN files, which may contain a **guessed** root-cause attribution (if they were generated by the old `create-report` workflow without checking the log). For Lessons Learned to be useful, their **cause-and-effect part** must rest on the log, not on a retelling of the report.
95
-
96
- **Algorithm:**
97
-
98
- 1. **Find the session logs** in `.workflow/logs/` that fall within the plan's period. There may be several `pipeline_*.log` files for different days.
99
-
100
- 2. **For each problem in the "Stop" category:**
101
-    1. Find the log lines that mention the ticket or pattern (Grep by `ticket_id` or a keyword).
102
-    2. Extract the **exact name of the stage** that made the problematic decision and its rationale (the `reason` field).
103
-    3. Compare it with what the source report says.
104
-
105
- 3. **If the source report's attribution diverges from the log**, raise it as a **separate lesson (Lesson Learned)** at CRITICAL level: "The report misattributed X; the decision was actually made by Y. This points to a defect in the skill that generates reports". Recommend creating a ticket for the fix.
106
-
107
- 4. **Every "Stop" pattern must have evidence:** a quoted log line or an explicit `evidence not found, confidence LOW` note.
108
-
109
- > **⛔ No guessing.** If there is no evidence, set the confidence level to LOW. Do not read familiar component names as the "probable cause".
110
-
111
- ### 5. Extract lessons (Lessons Learned)
112
-
113
- For each lesson:
114
- - **Observation:** what happened (fact)
115
- - **Cause:** why it happened (analysis); **for HIGH level, a quoted log line is mandatory**
116
- - **Lesson:** what follows from it (conclusion)
117
- - **Action:** what to change in the next cycle (recommendation)
118
- - **Confidence:** HIGH (evidence from the log) / MEDIUM (ticket data only) / LOW (circumstantial signs)
119
-
120
- ### 6. Formulate recommendations for the next plan
121
-
122
- - What to account for during planning
123
- - Which risks to plan for
124
- - Which processes to change
125
- - Priorities: CRITICAL / HIGH / MEDIUM / LOW
126
-
127
- ### 7. Produce the report
127
-
128
- > Use `templates/analysis-report.md`
129
-
130
- > Load `knowledge/report-structure.md`
131
-
132
- ### 8. Validation
133
-
134
- - [ ] Every plan goal is covered by the analysis (result vs expectations)
135
- - [ ] Process metrics are calculated from real data
136
- - [ ] Patterns are backed by concrete examples from tickets
137
- - [ ] **Every "Stop" pattern has evidence from the log** (`pipeline_*.log:NNNN`) or an explicit LOW mark
138
- - [ ] **Discrepancies with the source report's attribution are raised as a separate CRITICAL lesson**
139
- - [ ] Lessons contain an observation, a cause, and an action
140
- - [ ] Lessons are marked with a confidence level (HIGH/MEDIUM/LOW)
141
- - [ ] Recommendations are actionable and prioritized
142
- - [ ] The executive summary reflects the key conclusions
@@ -1,43 +0,0 @@
1
- # Coach: a modular skill
2
-
3
- A meta-skill for creating, auditing, and improving other skills. Handles `COACH-*` tickets.
4
-
5
- ## Structure
6
-
7
- ```
8
- coach/
9
- ├── SKILL.md # Core: role, routing, principles
10
- ├── workflows/ # CREATE, AUDIT, ANALYZE, IMPROVE, RESEARCH, REVIEW
11
- ├── knowledge/ # skill-anatomy, common-antipatterns, prompt-engineering,
12
- │ # backlog-management, shared-knowledge-guide
13
- ├── algorithms/ # skill-scoring, gap-analysis, improvement-prioritization
14
- ├── templates/ # new-skill, audit-report, improvement-plan
15
- └── README.md
16
- ```
17
-
18
- ## How it works
19
-
20
- 1. The agent receives a `COACH-*` ticket → **SKILL.md** determines the type → loads the **workflow**
21
- 2. The workflow references **knowledge** and **algorithms** as needed
22
- 3. The result is formatted according to a **template**
23
-
24
- ## Typical scenarios
25
-
26
- | Task | Workflow |
27
- |--------|----------|
28
- | Create a skill for a new role | `workflows/create.md` |
29
- | Full skill audit | `workflows/audit.md` |
30
- | Effectiveness analysis based on tickets | `workflows/analyze.md` |
31
- | Targeted improvement | `workflows/improve.md` |
32
- | Best-practice research | `workflows/research.md` |
33
- | Structure and quality review | `workflows/review.md` |
34
-
35
- ## How to extend
36
-
37
- | What to add | Steps |
38
- |-------------|----------|
39
- | New ticket type | Create `workflows/type.md` + a row in the SKILL.md routing |
40
- | New knowledge | Create `knowledge/name.md` + a row in the SKILL.md loading table |
41
- | New algorithm | Create `algorithms/name.md` + a row in the SKILL.md loading table |
42
- | New template | Create `templates/name.md` + a reference in the workflow |
43
- | Extending a module | Append after the `<!-- РАСШИРЕНИЕ -->` marker |