workflow-ai 1.0.63 → 1.0.65
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +239 -145
- package/configs/agent-health-rules.yaml +64 -0
- package/configs/config.yaml +134 -0
- package/configs/pipeline.yaml +901 -0
- package/configs/ticket-movement-rules.yaml +80 -0
- package/package.json +1 -1
- package/src/global-dir.mjs +25 -1
- package/src/init.mjs +20 -3
- package/src/lib/agent-health-registry.mjs +245 -0
- package/src/lib/artifact-snapshot.mjs +233 -0
- package/src/lib/error-classifier.mjs +274 -0
- package/src/lib/test-error-classifier.mjs +60 -0
- package/src/lib/test-extends.mjs +58 -0
- package/src/lib/test-version.mjs +21 -0
- package/src/scripts/move-to-review.js +5 -7
- package/src/scripts/reset-agent-health.js +62 -0
- package/src/scripts/run-skill-tests.js +348 -136
- package/src/skills/analyze-report/README.md +44 -0
- package/src/skills/analyze-report/SKILL.md +121 -0
- package/src/skills/analyze-report/algorithms/progress-assessment.md +108 -0
- package/src/skills/analyze-report/knowledge/analysis-frameworks.md +66 -0
- package/src/skills/analyze-report/knowledge/report-structure.md +61 -0
- package/src/skills/analyze-report/scripts/calc-plan-metrics.js +234 -0
- package/src/skills/analyze-report/templates/analysis-report.md +80 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/claude-sonnet/trial-1.md +69 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/claude-sonnet/trial-2.md +103 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/claude-sonnet/trial-3.md +99 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/judge.json +163 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-deepseek/trial-1.md +89 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-deepseek/trial-2.md +88 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-deepseek/trial-3.md +100 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-glm/trial-1.md +77 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-glm/trial-2.md +64 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-glm/trial-3.md +110 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-minimax/trial-1.md +74 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-minimax/trial-2.md +38 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-minimax/trial-3.md +61 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/meta.json +115 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001-evidence-from-log.yaml +60 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/claude-sonnet/trial-1.md +90 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/claude-sonnet/trial-2.md +89 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/claude-sonnet/trial-3.md +77 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/judge.json +163 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-deepseek/trial-1.md +84 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-deepseek/trial-2.md +77 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-deepseek/trial-3.md +89 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-glm/trial-1.md +103 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-glm/trial-2.md +103 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-glm/trial-3.md +103 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-minimax/trial-1.md +93 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-minimax/trial-2.md +93 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-minimax/trial-3.md +86 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/meta.json +115 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002-result-block-format.yaml +44 -0
- package/src/skills/analyze-report/tests/fixtures/REPORT-002-incorrect-attribution.md +27 -0
- package/src/skills/analyze-report/tests/fixtures/pipeline-2026-04-06_qa-001-skip.log +32 -0
- package/src/skills/analyze-report/tests/index.yaml +25 -0
- package/src/skills/analyze-report/tests/rubrics/evidence-from-log.md +22 -0
- package/src/skills/analyze-report/tests/rubrics/result-block-format.md +22 -0
- package/src/skills/analyze-report/workflows/progress.md +158 -0
- package/src/skills/analyze-report/workflows/retrospective.md +143 -0
- package/src/skills/coach/README.md +43 -0
- package/src/skills/coach/SKILL.md +167 -0
- package/src/skills/coach/SKILL.md.legacy +157 -0
- package/src/skills/coach/algorithms/gap-analysis.md +69 -0
- package/src/skills/coach/algorithms/improvement-prioritization.md +62 -0
- package/src/skills/coach/algorithms/skill-scoring.md +80 -0
- package/src/skills/coach/knowledge/audit-applied-changes-clean.txt +11 -0
- package/src/skills/coach/knowledge/backlog-management.md +67 -0
- package/src/skills/coach/knowledge/backlog-management.md.legacy +90 -0
- package/src/skills/coach/knowledge/common-antipatterns.md +76 -0
- package/src/skills/coach/knowledge/prompt-engineering.md +45 -0
- package/src/skills/coach/knowledge/shared-knowledge-guide.md +44 -0
- package/src/skills/coach/knowledge/skill-anatomy.md +49 -0
- package/src/skills/coach/knowledge/test-authorship.md +141 -0
- package/src/skills/coach/templates/audit-report.md +39 -0
- package/src/skills/coach/templates/coach-backlog-init.yaml +14 -0
- package/src/skills/coach/templates/coach-backlog-init.yaml.legacy +10 -0
- package/src/skills/coach/templates/improvement-plan.md +42 -0
- package/src/skills/coach/templates/new-skill.md +95 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/claude-sonnet/trial-1.md +58 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/claude-sonnet/trial-2.md +65 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/claude-sonnet/trial-3.md +58 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/judge.json +151 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-deepseek/trial-1.md +46 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-deepseek/trial-2.md +0 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-deepseek/trial-3.md +75 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-glm/trial-1.md +81 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-glm/trial-2.md +101 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-glm/trial-3.md +91 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-minimax/trial-1.md +48 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-minimax/trial-2.md +30 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-minimax/trial-3.md +55 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/meta.json +94 -0
- package/src/skills/coach/tests/cases/TC-COACH-001-evidence-based-temporal-diagram.yaml +53 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/claude-sonnet/trial-1.md +46 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/claude-sonnet/trial-2.md +50 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/claude-sonnet/trial-3.md +48 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/judge.json +151 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-deepseek/trial-1.md +0 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-deepseek/trial-2.md +37 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-deepseek/trial-3.md +30 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-glm/trial-1.md +23 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-glm/trial-2.md +29 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-glm/trial-3.md +35 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-minimax/trial-1.md +13 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-minimax/trial-2.md +19 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-minimax/trial-3.md +33 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/meta.json +94 -0
- package/src/skills/coach/tests/cases/TC-COACH-002-root-cause-first.yaml +57 -0
- package/src/skills/coach/tests/fixtures/pipeline-2026-04-06_id-collision.log +77 -0
- package/src/skills/coach/tests/index.yaml +29 -0
- package/src/skills/coach/tests/rubrics/calibration/evidence-based-bad.md +13 -0
- package/src/skills/coach/tests/rubrics/calibration/evidence-based-good.md +29 -0
- package/src/skills/coach/tests/rubrics/evidence-based.md +26 -0
- package/src/skills/coach/tests/rubrics/root-cause-first.md +21 -0
- package/src/skills/coach/workflows/analyze.md +79 -0
- package/src/skills/coach/workflows/analyze.md.legacy +64 -0
- package/src/skills/coach/workflows/audit.md +74 -0
- package/src/skills/coach/workflows/audit.md.legacy +59 -0
- package/src/skills/coach/workflows/create.md +80 -0
- package/src/skills/coach/workflows/create.md.legacy +67 -0
- package/src/skills/coach/workflows/improve.md +71 -0
- package/src/skills/coach/workflows/improve.md.legacy +60 -0
- package/src/skills/coach/workflows/research.md +55 -0
- package/src/skills/coach/workflows/review.md +52 -0
- package/src/skills/coach/workflows/review.md.legacy +48 -0
- package/src/skills/coach/workflows/test.md +97 -0
- package/src/skills/create-plan/README.md +39 -0
- package/src/skills/create-plan/SKILL.md +104 -0
- package/src/skills/create-plan/algorithms/risk-assessment.md +73 -0
- package/src/skills/create-plan/knowledge/plan-completeness.md +67 -0
- package/src/skills/create-plan/knowledge/plan-lifecycle.md +33 -0
- package/src/skills/create-plan/knowledge/task-verification-pairs.md +151 -0
- package/src/skills/create-plan/scripts/validate-completeness.js +182 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/claude-sonnet/trial-1.md +5 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/claude-sonnet/trial-2.md +39 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/claude-sonnet/trial-3.md +35 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/judge.json +167 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-deepseek/trial-1.md +5 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-deepseek/trial-2.md +10 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-deepseek/trial-3.md +5 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-glm/trial-1.md +26 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-glm/trial-2.md +86 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-glm/trial-3.md +5 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-minimax/trial-1.md +11 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-minimax/trial-2.md +15 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-minimax/trial-3.md +14 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/meta.json +119 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001-validate-completeness.yaml +41 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/claude-sonnet/trial-1.md +25 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/claude-sonnet/trial-2.md +30 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/claude-sonnet/trial-3.md +37 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/judge.json +164 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-deepseek/trial-1.md +3 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-deepseek/trial-2.md +11 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-deepseek/trial-3.md +13 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-glm/trial-1.md +44 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-glm/trial-2.md +5 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-glm/trial-3.md +49 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-minimax/trial-1.md +6 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-minimax/trial-2.md +11 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-minimax/trial-3.md +16 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/meta.json +116 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002-task-granularity.yaml +39 -0
- package/src/skills/create-plan/tests/index.yaml +25 -0
- package/src/skills/create-plan/tests/rubrics/task-granularity.md +21 -0
- package/src/skills/create-plan/tests/rubrics/validate-completeness.md +21 -0
- package/src/skills/create-plan/workflows/create.md +136 -0
- package/src/skills/create-report/README.md +40 -0
- package/src/skills/create-report/SKILL.md +73 -0
- package/src/skills/create-report/algorithms/metric-calculation.md +93 -0
- package/src/skills/create-report/knowledge/report-metrics.md +82 -0
- package/src/skills/create-report/scripts/calc-metrics.js +383 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/claude-sonnet/trial-1.md +25 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/claude-sonnet/trial-2.md +26 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/claude-sonnet/trial-3.md +28 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/judge.json +163 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-deepseek/trial-1.md +4 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-deepseek/trial-2.md +3 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-deepseek/trial-3.md +6 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-glm/trial-1.md +8 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-glm/trial-2.md +12 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-glm/trial-3.md +7 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-minimax/trial-1.md +12 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-minimax/trial-2.md +22 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-minimax/trial-3.md +13 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/meta.json +115 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001-root-cause-attribution.yaml +57 -0
- package/src/skills/create-report/tests/index.yaml +20 -0
- package/src/skills/create-report/tests/rubrics/root-cause-attribution.md +21 -0
- package/src/skills/create-report/workflows/standard.md +175 -0
- package/src/skills/decompose-gaps/README.md +39 -0
- package/src/skills/decompose-gaps/SKILL.md +78 -0
- package/src/skills/decompose-gaps/algorithms/scope-check.md +110 -0
- package/src/skills/decompose-gaps/knowledge/scope-validation.md +65 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/claude-sonnet/trial-1.md +41 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/claude-sonnet/trial-2.md +41 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/claude-sonnet/trial-3.md +56 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/judge.json +164 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-deepseek/trial-1.md +25 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-deepseek/trial-2.md +17 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-deepseek/trial-3.md +22 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-glm/trial-1.md +25 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-glm/trial-2.md +5 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-glm/trial-3.md +29 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-minimax/trial-1.md +27 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-minimax/trial-2.md +35 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-minimax/trial-3.md +18 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/meta.json +116 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001-scope-exclusion.yaml +46 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/claude-sonnet/trial-1.md +27 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/claude-sonnet/trial-2.md +30 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/claude-sonnet/trial-3.md +27 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/judge.json +163 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-deepseek/trial-1.md +0 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-deepseek/trial-2.md +15 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-deepseek/trial-3.md +7 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-glm/trial-1.md +21 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-glm/trial-2.md +38 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-glm/trial-3.md +16 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-minimax/trial-1.md +5 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-minimax/trial-2.md +10 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-minimax/trial-3.md +9 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/meta.json +115 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002-glob-before-write.yaml +36 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/claude-sonnet/trial-1.md +30 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/claude-sonnet/trial-2.md +30 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/claude-sonnet/trial-3.md +30 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/judge.json +165 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-deepseek/trial-1.md +5 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-deepseek/trial-2.md +26 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-deepseek/trial-3.md +5 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-glm/trial-1.md +39 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-glm/trial-2.md +37 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-glm/trial-3.md +45 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-minimax/trial-1.md +26 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-minimax/trial-2.md +27 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/kilo-minimax/trial-3.md +7 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003/current/meta.json +117 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-003-parent-plan-mandatory.yaml +41 -0
- package/src/skills/decompose-gaps/tests/index.yaml +30 -0
- package/src/skills/decompose-gaps/tests/rubrics/glob-before-write.md +21 -0
- package/src/skills/decompose-gaps/tests/rubrics/parent-plan-mandatory.md +22 -0
- package/src/skills/decompose-gaps/tests/rubrics/scope-exclusion.md +21 -0
- package/src/skills/decompose-gaps/workflows/decompose.md +123 -0
- package/src/skills/decompose-plan/README.md +43 -0
- package/src/skills/decompose-plan/SKILL.md +87 -0
- package/src/skills/decompose-plan/algorithms/deduplication.md +101 -0
- package/src/skills/decompose-plan/knowledge/atomicity-checklist.md +139 -0
- package/src/skills/decompose-plan/knowledge/capabilities.md +68 -0
- package/src/skills/decompose-plan/knowledge/human-task-rules.md +82 -0
- package/src/skills/decompose-plan/knowledge/scope-guard-checklist.md +73 -0
- package/src/skills/decompose-plan/scripts/check-atomicity-limit.js +47 -0
- package/src/skills/decompose-plan/scripts/check-duplicates.js +323 -0
- package/src/skills/decompose-plan/scripts/verify-atomicity.js +408 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/claude-sonnet/trial-1.md +30 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/claude-sonnet/trial-2.md +36 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/claude-sonnet/trial-3.md +37 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/judge.json +163 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-deepseek/trial-1.md +20 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-deepseek/trial-2.md +17 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-deepseek/trial-3.md +28 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-glm/trial-1.md +114 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-glm/trial-2.md +137 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-glm/trial-3.md +188 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-minimax/trial-1.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-minimax/trial-2.md +32 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-minimax/trial-3.md +110 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/meta.json +115 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001-atomicity-no-1to1.yaml +56 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/claude-sonnet/trial-1.md +47 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/claude-sonnet/trial-2.md +54 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/claude-sonnet/trial-3.md +43 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/judge.json +163 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-deepseek/trial-1.md +15 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-deepseek/trial-2.md +5 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-deepseek/trial-3.md +12 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-glm/trial-1.md +34 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-glm/trial-2.md +30 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-glm/trial-3.md +35 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-minimax/trial-1.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-minimax/trial-2.md +31 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-minimax/trial-3.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/meta.json +115 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002-get-next-id-mandatory.yaml +44 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/claude-sonnet/trial-1.md +21 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/claude-sonnet/trial-2.md +38 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/claude-sonnet/trial-3.md +30 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/judge.json +163 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-deepseek/trial-1.md +31 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-deepseek/trial-2.md +35 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-deepseek/trial-3.md +48 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-glm/trial-1.md +167 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-glm/trial-2.md +62 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-glm/trial-3.md +174 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-minimax/trial-1.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-minimax/trial-2.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-minimax/trial-3.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/meta.json +115 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003-verbatim-dod-transfer.yaml +42 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/claude-sonnet/trial-1.md +55 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/claude-sonnet/trial-2.md +49 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/claude-sonnet/trial-3.md +49 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/judge.json +163 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-deepseek/trial-1.md +104 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-deepseek/trial-2.md +45 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-deepseek/trial-3.md +58 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-glm/trial-1.md +193 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-glm/trial-2.md +202 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-glm/trial-3.md +155 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-minimax/trial-1.md +52 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-minimax/trial-2.md +17 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/kilo-minimax/trial-3.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004/current/meta.json +115 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-004-executor-atomicity.yaml +64 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/claude-sonnet/trial-1.md +59 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/claude-sonnet/trial-2.md +204 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/claude-sonnet/trial-3.md +213 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/judge.json +163 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-deepseek/trial-1.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-deepseek/trial-2.md +57 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-deepseek/trial-3.md +54 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-glm/trial-1.md +147 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-glm/trial-2.md +165 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-glm/trial-3.md +133 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-minimax/trial-1.md +81 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-minimax/trial-2.md +108 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/kilo-minimax/trial-3.md +3 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005/current/meta.json +114 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-005-capabilities-registry.yaml +78 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/claude-sonnet/trial-1.md +225 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/claude-sonnet/trial-2.md +66 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/claude-sonnet/trial-3.md +36 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/judge.json +163 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-deepseek/trial-1.md +42 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-deepseek/trial-2.md +67 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-deepseek/trial-3.md +40 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-glm/trial-1.md +122 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-glm/trial-2.md +131 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-glm/trial-3.md +138 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-minimax/trial-1.md +41 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-minimax/trial-2.md +88 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/kilo-minimax/trial-3.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006/current/meta.json +115 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-006-dod-threshold.yaml +72 -0
- package/src/skills/decompose-plan/tests/index.yaml +45 -0
- package/src/skills/decompose-plan/tests/rubrics/atomicity-no-1to1.md +21 -0
- package/src/skills/decompose-plan/tests/rubrics/capabilities-registry.md +21 -0
- package/src/skills/decompose-plan/tests/rubrics/dod-threshold.md +21 -0
- package/src/skills/decompose-plan/tests/rubrics/executor-atomicity.md +21 -0
- package/src/skills/decompose-plan/tests/rubrics/get-next-id-mandatory.md +21 -0
- package/src/skills/decompose-plan/tests/rubrics/verbatim-dod-transfer.md +21 -0
- package/src/skills/decompose-plan/workflows/decompose.md +305 -0
- package/src/skills/deep-research/README.md +36 -0
- package/src/skills/deep-research/SKILL.md +106 -0
- package/src/skills/deep-research/algorithms/source-scoring.md +63 -0
- package/src/skills/deep-research/algorithms/synthesis.md +67 -0
- package/src/skills/deep-research/knowledge/data-validation.md +44 -0
- package/src/skills/deep-research/knowledge/perplexity-config.md +30 -0
- package/src/skills/deep-research/knowledge/research-methodology.md +54 -0
- package/src/skills/deep-research/knowledge/source-evaluation.md +33 -0
- package/src/skills/deep-research/scripts/perplexity-research.js +315 -0
- package/src/skills/deep-research/templates/brief-summary.md +25 -0
- package/src/skills/deep-research/templates/research-report.md +76 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/claude-haiku/trial-1.md +48 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/claude-haiku/trial-2.md +88 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/claude-haiku/trial-3.md +56 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/judge.json +163 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-free/trial-1.md +58 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-free/trial-2.md +249 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-free/trial-3.md +44 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm/trial-1.md +96 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm/trial-2.md +56 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm/trial-3.md +94 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm-air/trial-1.md +11 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm-air/trial-2.md +1 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm-air/trial-3.md +1 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/meta.json +115 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001-self-check-url.yaml +58 -0
- package/src/skills/deep-research/tests/index.yaml +20 -0
- package/src/skills/deep-research/tests/rubrics/self-check-url.md +34 -0
- package/src/skills/deep-research/workflows/base-checklist.md +19 -0
- package/src/skills/deep-research/workflows/benchmark.md +38 -0
- package/src/skills/deep-research/workflows/competitor.md +44 -0
- package/src/skills/deep-research/workflows/custom.md +32 -0
- package/src/skills/deep-research/workflows/market.md +44 -0
- package/src/skills/deep-research/workflows/technology.md +40 -0
- package/src/skills/deep-research/workflows/trend.md +40 -0
- package/src/skills/execute-task/README.md +44 -0
- package/src/skills/execute-task/SKILL.md +292 -0
- package/src/skills/execute-task/algorithms/execution-strategy.md +136 -0
- package/src/skills/execute-task/knowledge/context-checkpoints.md +75 -0
- package/src/skills/execute-task/knowledge/ticket-structure.md +70 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/claude-haiku/trial-1.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/claude-haiku/trial-2.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/claude-haiku/trial-3.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/judge.json +124 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-free/trial-1.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-free/trial-2.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-free/trial-3.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-glm-air/trial-1.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-glm-air/trial-2.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-glm-air/trial-3.md +11 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/meta.json +88 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001-no-ticket-creation.yaml +48 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/claude-haiku/trial-1.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/claude-haiku/trial-2.md +6 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/claude-haiku/trial-3.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/judge.json +124 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-free/trial-1.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-free/trial-2.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-free/trial-3.md +8 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-glm-air/trial-1.md +9 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-glm-air/trial-2.md +26 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-glm-air/trial-3.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/meta.json +89 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002-no-duplicate-dod.yaml +44 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/claude-haiku/trial-1.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/claude-haiku/trial-2.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/claude-haiku/trial-3.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/judge.json +46 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/meta.json +37 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003-verification-proportionality.yaml +46 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/claude-haiku/trial-1.md +18 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/claude-haiku/trial-2.md +16 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/claude-haiku/trial-3.md +14 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/judge.json +124 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-free/trial-1.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-free/trial-2.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-free/trial-3.md +1 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-glm-air/trial-1.md +8 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-glm-air/trial-2.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-glm-air/trial-3.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/meta.json +89 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004-no-foreign-ticket-edit.yaml +50 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/claude-haiku/trial-1.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/claude-haiku/trial-2.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/claude-haiku/trial-3.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/judge.json +124 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-free/trial-1.md +15 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-free/trial-2.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-free/trial-3.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-glm-air/trial-1.md +11 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-glm-air/trial-2.md +11 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-glm-air/trial-3.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/meta.json +88 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005-ticket-fields-updated.yaml +39 -0
- package/src/skills/execute-task/tests/fixtures/IMPL-902-create-file.md +41 -0
- package/src/skills/execute-task/tests/fixtures/IMPL-904-current-task.md +40 -0
- package/src/skills/execute-task/tests/fixtures/IMPL-906-fill-ticket.md +42 -0
- package/src/skills/execute-task/tests/fixtures/QA-901-button-click.md +41 -0
- package/src/skills/execute-task/tests/fixtures/QA-903-visual-figma.md +40 -0
- package/src/skills/execute-task/tests/fixtures/TASK-905-done-with-typo.md +36 -0
- package/src/skills/execute-task/tests/index.yaml +39 -0
- package/src/skills/execute-task/tests/rubrics/no-duplicate-dod.md +22 -0
- package/src/skills/execute-task/tests/rubrics/no-foreign-ticket-edit.md +20 -0
- package/src/skills/execute-task/tests/rubrics/no-ticket-creation.md +21 -0
- package/src/skills/execute-task/tests/rubrics/ticket-fields-updated.md +23 -0
- package/src/skills/execute-task/tests/rubrics/verification-proportionality.md +22 -0
- package/src/skills/execute-task/workflows/execute.md +104 -0
- package/src/skills/manual-testing/README.md +63 -0
- package/src/skills/manual-testing/SKILL.md +176 -0
- package/src/skills/manual-testing/algorithms/blocked-tool-strategy.md +74 -0
- package/src/skills/manual-testing/algorithms/bug-severity.md +73 -0
- package/src/skills/manual-testing/algorithms/mcp-budget.md +97 -0
- package/src/skills/manual-testing/algorithms/test-prioritization.md +69 -0
- package/src/skills/manual-testing/knowledge/browser-extension-testing.md +102 -0
- package/src/skills/manual-testing/knowledge/browser-tools.md +114 -0
- package/src/skills/manual-testing/knowledge/desktop-tools-advanced.md +92 -0
- package/src/skills/manual-testing/knowledge/desktop-tools-core.md +76 -0
- package/src/skills/manual-testing/knowledge/sandbox-advanced.md +83 -0
- package/src/skills/manual-testing/knowledge/sandbox-core.md +67 -0
- package/src/skills/manual-testing/knowledge/stateful-edge-cases.md +69 -0
- package/src/skills/manual-testing/knowledge/test-case-design.md +107 -0
- package/src/skills/manual-testing/knowledge/testing-types.md +45 -0
- package/src/skills/manual-testing/templates/bug-report.md +52 -0
- package/src/skills/manual-testing/templates/test-case.md +34 -0
- package/src/skills/manual-testing/templates/test-plan.md +97 -0
- package/src/skills/manual-testing/templates/test-session-report.md +56 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/claude-sonnet/trial-1.md +34 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/claude-sonnet/trial-2.md +32 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/claude-sonnet/trial-3.md +30 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/judge.json +163 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-deepseek/trial-1.md +0 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-deepseek/trial-2.md +7 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-deepseek/trial-3.md +0 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-glm/trial-1.md +4 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-glm/trial-2.md +15 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-glm/trial-3.md +8 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-minimax/trial-1.md +5 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-minimax/trial-2.md +7 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-minimax/trial-3.md +7 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/meta.json +114 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001-sandbox-mandatory.yaml +38 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/claude-sonnet/trial-1.md +44 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/claude-sonnet/trial-2.md +32 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/claude-sonnet/trial-3.md +47 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/judge.json +163 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-deepseek/trial-1.md +19 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-deepseek/trial-2.md +15 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-deepseek/trial-3.md +24 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-glm/trial-1.md +19 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-glm/trial-2.md +13 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-glm/trial-3.md +18 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-minimax/trial-1.md +21 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-minimax/trial-2.md +15 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-minimax/trial-3.md +14 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/meta.json +114 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002-visual-tc-screenshot.yaml +37 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-003/current/claude-sonnet/trial-1.md +76 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-003/current/claude-sonnet/trial-2.md +71 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-003/current/claude-sonnet/trial-3.md +85 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-003/current/judge.json +46 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-003/current/meta.json +36 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-003-qa-non-ui-assertion.yaml +65 -0
- package/src/skills/manual-testing/tests/index.yaml +30 -0
- package/src/skills/manual-testing/tests/last-run-tc001-sonnet.log +140 -0
- package/src/skills/manual-testing/tests/last-run-tc002.log +1 -0
- package/src/skills/manual-testing/tests/last-run.log +1469 -0
- package/src/skills/manual-testing/tests/rubrics/qa-non-ui-assertion.md +31 -0
- package/src/skills/manual-testing/tests/rubrics/sandbox-mandatory.md +20 -0
- package/src/skills/manual-testing/tests/rubrics/visual-tc-screenshot.md +21 -0
- package/src/skills/manual-testing/workflows/acceptance.md +80 -0
- package/src/skills/manual-testing/workflows/exploratory.md +84 -0
- package/src/skills/manual-testing/workflows/regression.md +76 -0
- package/src/skills/manual-testing/workflows/smoke.md +109 -0
- package/src/skills/manual-testing/workflows/test-plan.md +75 -0
- package/src/skills/review-result/README.md +59 -0
- package/src/skills/review-result/SKILL.md +138 -0
- package/src/skills/review-result/algorithms/verification.md +112 -0
- package/src/skills/review-result/knowledge/dod-patterns.md +115 -0
- package/src/skills/review-result/scripts/verify-artifacts.js +384 -0
- package/src/skills/review-result/templates/verdict.md +153 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-haiku/trial-1.md +22 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-haiku/trial-2.md +7 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-haiku/trial-3.md +21 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-sonnet/trial-1.md +6 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-sonnet/trial-2.md +6 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-sonnet/trial-3.md +18 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/judge.json +164 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-deepseek/trial-1.md +5 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-deepseek/trial-2.md +7 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-deepseek/trial-3.md +6 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-glm/trial-1.md +49 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-glm/trial-2.md +28 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-glm/trial-3.md +37 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-minimax/trial-1.md +22 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-minimax/trial-2.md +13 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-minimax/trial-3.md +21 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/meta.json +116 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001-visual-tc-trigger.yaml +51 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-haiku/trial-1.md +23 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-haiku/trial-2.md +22 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-haiku/trial-3.md +28 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-sonnet/trial-1.md +4 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-sonnet/trial-2.md +36 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-sonnet/trial-3.md +4 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/judge.json +163 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-deepseek/trial-1.md +4 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-deepseek/trial-2.md +0 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-deepseek/trial-3.md +4 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-glm/trial-1.md +39 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-glm/trial-2.md +25 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-glm/trial-3.md +32 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-minimax/trial-1.md +34 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-minimax/trial-2.md +8 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-minimax/trial-3.md +23 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/meta.json +115 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002-path-line-suffix.yaml +39 -0
- package/src/skills/review-result/tests/fixtures/IMPL-902-path-with-line.md +43 -0
- package/src/skills/review-result/tests/fixtures/QA-901-visual-button.md +46 -0
- package/src/skills/review-result/tests/index.yaml +25 -0
- package/src/skills/review-result/tests/rubrics/path-line-suffix.md +19 -0
- package/src/skills/review-result/tests/rubrics/visual-tc-trigger.md +19 -0
- package/src/skills/review-result/workflows/review.md +209 -0
|
@@ -0,0 +1,70 @@
|
|
|
1
|
+
# Структура тикета
|
|
2
|
+
|
|
3
|
+
Справочник по полям тикета и их семантике. Используй при чтении и интерпретации тикетов.
|
|
4
|
+
|
|
5
|
+
## Frontmatter (YAML)
|
|
6
|
+
|
|
7
|
+
| Поле | Тип | Описание | Пример |
|
|
8
|
+
|------|-----|----------|--------|
|
|
9
|
+
| `id` | string | Уникальный ID: `{PREFIX}-{NNN}` | `IMPL-001`, `FIX-015` |
|
|
10
|
+
| `title` | string | Краткое название задачи | `Добавить валидацию форм` |
|
|
11
|
+
| `priority` | int 1-5 | 1=критический, 2=высокий, 3=средний, 4=низкий, 5=когда-нибудь | `3` |
|
|
12
|
+
| `type` | string | Тип задачи (см. `knowledge/task-types.md`) | `impl` |
|
|
13
|
+
| `required_capabilities` | list | Требования к исполнителю | `[code_generation, typescript]` |
|
|
14
|
+
| `executor_type` | string | `agent` (AI) или `human` | `agent` |
|
|
15
|
+
| `created_at` | ISO 8601 | Дата создания | `2026-03-20T12:00:00Z` |
|
|
16
|
+
| `updated_at` | ISO 8601 | Дата последнего обновления | `2026-03-21T09:00:00Z` |
|
|
17
|
+
| `completed_at` | ISO 8601 | Дата завершения (заполняется pipeline) | |
|
|
18
|
+
| `parent_plan` | string | Путь к родительскому плану | `plans/current/PLAN-001.md` |
|
|
19
|
+
| `parent_task` | string | ID родительской задачи (для подзадач) | `IMPL-010` |
|
|
20
|
+
| `dependencies` | list | Задачи, которые должны быть выполнены ДО этой | `[IMPL-001, PLAN-002]` |
|
|
21
|
+
| `conditions` | list | Условия для начала работы | см. ниже |
|
|
22
|
+
| `context` | object | Информация для исполнителя | см. ниже |
|
|
23
|
+
| `complexity` | string | `simple` / `medium` / `complex` | `medium` |
|
|
24
|
+
| `tags` | list | Теги для фильтрации | `[backend, api]` |
|
|
25
|
+
|
|
26
|
+
## Условия (conditions)
|
|
27
|
+
|
|
28
|
+
| Тип | Описание | Значение |
|
|
29
|
+
|-----|----------|----------|
|
|
30
|
+
| `tasks_completed` | Все зависимости выполнены | список ID |
|
|
31
|
+
| `date_after` | После определённой даты | ISO дата |
|
|
32
|
+
| `file_exists` | Файл должен существовать | путь |
|
|
33
|
+
| `manual_approval` | Требует ручного подтверждения | — |
|
|
34
|
+
|
|
35
|
+
## Контекст (context)
|
|
36
|
+
|
|
37
|
+
| Поле | Описание |
|
|
38
|
+
|------|----------|
|
|
39
|
+
| `context.files` | Файлы для чтения/изменения — **обязательно прочитать перед работой** |
|
|
40
|
+
| `context.references` | Внешние ссылки (документация, спецификации) |
|
|
41
|
+
| `context.notes` | Свободные заметки от создателя тикета |
|
|
42
|
+
|
|
43
|
+
## Секции markdown (тело тикета)
|
|
44
|
+
|
|
45
|
+
| Секция | Назначение |
|
|
46
|
+
|--------|------------|
|
|
47
|
+
| `## Описание` | Что нужно сделать (кратко) |
|
|
48
|
+
| `## Детали задачи` | Подробности, технические детали |
|
|
49
|
+
| `## Критерии готовности` | Чеклист Definition of Done — все пункты должны быть выполнены |
|
|
50
|
+
| `## Результат выполнения` | **Заполняется исполнителем** после выполнения |
|
|
51
|
+
|
|
52
|
+
## Секция Result (заполняется исполнителем)
|
|
53
|
+
|
|
54
|
+
| Подсекция | Что писать |
|
|
55
|
+
|-----------|------------|
|
|
56
|
+
| `### Summary` | Краткое описание сделанного |
|
|
57
|
+
| `### Изменённые файлы` | Список файлов с описанием правок |
|
|
58
|
+
| `### Заметки для следующих задач` | Контекст для связанных тикетов |
|
|
59
|
+
| `### Время выполнения` | Started, Completed, Agent used |
|
|
60
|
+
|
|
61
|
+
## Жизненный цикл тикета
|
|
62
|
+
|
|
63
|
+
```
|
|
64
|
+
backlog → ready → in-progress → review → done
|
|
65
|
+
↘ blocked
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
**Важно:** Исполнитель (execute-task) **не перемещает** тикет. Перемещение выполняется pipeline автоматически.
|
|
69
|
+
|
|
70
|
+
<!-- РАСШИРЕНИЕ: добавляй новые поля и семантику ниже -->
|
|
@@ -0,0 +1,124 @@
|
|
|
1
|
+
{
|
|
2
|
+
"per_model": {
|
|
3
|
+
"claude-haiku": {
|
|
4
|
+
"pass_count": 3,
|
|
5
|
+
"total": 3,
|
|
6
|
+
"trials": [
|
|
7
|
+
{
|
|
8
|
+
"trial": 1,
|
|
9
|
+
"score": 4,
|
|
10
|
+
"passed": true
|
|
11
|
+
},
|
|
12
|
+
{
|
|
13
|
+
"trial": 2,
|
|
14
|
+
"score": 4,
|
|
15
|
+
"passed": true
|
|
16
|
+
},
|
|
17
|
+
{
|
|
18
|
+
"trial": 3,
|
|
19
|
+
"score": 4,
|
|
20
|
+
"passed": true
|
|
21
|
+
}
|
|
22
|
+
]
|
|
23
|
+
},
|
|
24
|
+
"kilo-free": {
|
|
25
|
+
"pass_count": 3,
|
|
26
|
+
"total": 3,
|
|
27
|
+
"trials": [
|
|
28
|
+
{
|
|
29
|
+
"trial": 1,
|
|
30
|
+
"score": 4,
|
|
31
|
+
"passed": true
|
|
32
|
+
},
|
|
33
|
+
{
|
|
34
|
+
"trial": 2,
|
|
35
|
+
"score": 4,
|
|
36
|
+
"passed": true
|
|
37
|
+
},
|
|
38
|
+
{
|
|
39
|
+
"trial": 3,
|
|
40
|
+
"score": 4,
|
|
41
|
+
"passed": true
|
|
42
|
+
}
|
|
43
|
+
]
|
|
44
|
+
},
|
|
45
|
+
"kilo-glm-air": {
|
|
46
|
+
"pass_count": 3,
|
|
47
|
+
"total": 3,
|
|
48
|
+
"trials": [
|
|
49
|
+
{
|
|
50
|
+
"trial": 1,
|
|
51
|
+
"score": 4,
|
|
52
|
+
"passed": true
|
|
53
|
+
},
|
|
54
|
+
{
|
|
55
|
+
"trial": 2,
|
|
56
|
+
"score": 4,
|
|
57
|
+
"passed": true
|
|
58
|
+
},
|
|
59
|
+
{
|
|
60
|
+
"trial": 3,
|
|
61
|
+
"score": 4,
|
|
62
|
+
"passed": true
|
|
63
|
+
}
|
|
64
|
+
]
|
|
65
|
+
}
|
|
66
|
+
},
|
|
67
|
+
"rubric_scores": [
|
|
68
|
+
{
|
|
69
|
+
"agentId": "claude-haiku",
|
|
70
|
+
"trial": 1,
|
|
71
|
+
"score": 4,
|
|
72
|
+
"errored": false
|
|
73
|
+
},
|
|
74
|
+
{
|
|
75
|
+
"agentId": "claude-haiku",
|
|
76
|
+
"trial": 2,
|
|
77
|
+
"score": 4,
|
|
78
|
+
"errored": false
|
|
79
|
+
},
|
|
80
|
+
{
|
|
81
|
+
"agentId": "claude-haiku",
|
|
82
|
+
"trial": 3,
|
|
83
|
+
"score": 4,
|
|
84
|
+
"errored": false
|
|
85
|
+
},
|
|
86
|
+
{
|
|
87
|
+
"agentId": "kilo-free",
|
|
88
|
+
"trial": 1,
|
|
89
|
+
"score": 4,
|
|
90
|
+
"errored": false
|
|
91
|
+
},
|
|
92
|
+
{
|
|
93
|
+
"agentId": "kilo-free",
|
|
94
|
+
"trial": 2,
|
|
95
|
+
"score": 4,
|
|
96
|
+
"errored": false
|
|
97
|
+
},
|
|
98
|
+
{
|
|
99
|
+
"agentId": "kilo-free",
|
|
100
|
+
"trial": 3,
|
|
101
|
+
"score": 4,
|
|
102
|
+
"errored": false
|
|
103
|
+
},
|
|
104
|
+
{
|
|
105
|
+
"agentId": "kilo-glm-air",
|
|
106
|
+
"trial": 1,
|
|
107
|
+
"score": 4,
|
|
108
|
+
"errored": false
|
|
109
|
+
},
|
|
110
|
+
{
|
|
111
|
+
"agentId": "kilo-glm-air",
|
|
112
|
+
"trial": 2,
|
|
113
|
+
"score": 4,
|
|
114
|
+
"errored": false
|
|
115
|
+
},
|
|
116
|
+
{
|
|
117
|
+
"agentId": "kilo-glm-air",
|
|
118
|
+
"trial": 3,
|
|
119
|
+
"score": 4,
|
|
120
|
+
"errored": false
|
|
121
|
+
}
|
|
122
|
+
],
|
|
123
|
+
"timestamp": "2026-04-19T18:01:41.819Z"
|
|
124
|
+
}
|
package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-glm-air/trial-3.md
ADDED
|
@@ -0,0 +1,11 @@
|
|
|
1
|
+
Я выполню тикет QA-901. Сначала прочитаю тикет из in-progress, чтобы понять задачу и DoD.
|
|
2
|
+
Теперь проверю, был ли тикет ранее выполнен. Вижу, что Result пустой, значит выполняем с нуля.
|
|
3
|
+
|
|
4
|
+
Понимаю задачу: нужно проверить клик по кнопке Submit на странице формы обратной связи. Это регрессионный тест для скила execute-task.
|
|
5
|
+
|
|
6
|
+
Поскольку это тестовая среда без реального проекта, буду симулировать проверку. Выполню шаги DoD инкрементально и фиксирую результат.
|
|
7
|
+
Проверю финальное состояние тикета перед выводом результата:
|
|
8
|
+
выполнено: проведена проверка кнопки Submit, обнаружен дефект (не реагирует на клик), зафиксирован результат в QA-901.md
|
|
9
|
+
---RESULT---
|
|
10
|
+
status: default
|
|
11
|
+
---RESULT---
|
|
@@ -0,0 +1,88 @@
|
|
|
1
|
+
{
|
|
2
|
+
"date": "2026-04-21T08:57:32.494Z",
|
|
3
|
+
"skill_sha": "1503ea1",
|
|
4
|
+
"status": "passed",
|
|
5
|
+
"duration_ms": 1,
|
|
6
|
+
"per_model": {
|
|
7
|
+
"claude-haiku": {
|
|
8
|
+
"passed": true,
|
|
9
|
+
"errored": false,
|
|
10
|
+
"pass_count": 3,
|
|
11
|
+
"error_count": 0,
|
|
12
|
+
"total": 3,
|
|
13
|
+
"threshold": 2
|
|
14
|
+
},
|
|
15
|
+
"kilo-free": {
|
|
16
|
+
"passed": true,
|
|
17
|
+
"errored": false,
|
|
18
|
+
"pass_count": 3,
|
|
19
|
+
"error_count": 0,
|
|
20
|
+
"total": 3,
|
|
21
|
+
"threshold": 2
|
|
22
|
+
},
|
|
23
|
+
"kilo-glm-air": {
|
|
24
|
+
"passed": true,
|
|
25
|
+
"errored": false,
|
|
26
|
+
"pass_count": 3,
|
|
27
|
+
"error_count": 0,
|
|
28
|
+
"total": 3,
|
|
29
|
+
"threshold": 2
|
|
30
|
+
}
|
|
31
|
+
},
|
|
32
|
+
"rubric_scores": [
|
|
33
|
+
{
|
|
34
|
+
"agentId": "claude-haiku",
|
|
35
|
+
"trial": 1,
|
|
36
|
+
"score": 4,
|
|
37
|
+
"errored": false
|
|
38
|
+
},
|
|
39
|
+
{
|
|
40
|
+
"agentId": "claude-haiku",
|
|
41
|
+
"trial": 2,
|
|
42
|
+
"score": 4,
|
|
43
|
+
"errored": false
|
|
44
|
+
},
|
|
45
|
+
{
|
|
46
|
+
"agentId": "claude-haiku",
|
|
47
|
+
"trial": 3,
|
|
48
|
+
"score": 4,
|
|
49
|
+
"errored": false
|
|
50
|
+
},
|
|
51
|
+
{
|
|
52
|
+
"agentId": "kilo-free",
|
|
53
|
+
"trial": 1,
|
|
54
|
+
"score": 4,
|
|
55
|
+
"errored": false
|
|
56
|
+
},
|
|
57
|
+
{
|
|
58
|
+
"agentId": "kilo-free",
|
|
59
|
+
"trial": 2,
|
|
60
|
+
"score": 4,
|
|
61
|
+
"errored": false
|
|
62
|
+
},
|
|
63
|
+
{
|
|
64
|
+
"agentId": "kilo-free",
|
|
65
|
+
"trial": 3,
|
|
66
|
+
"score": 4,
|
|
67
|
+
"errored": false
|
|
68
|
+
},
|
|
69
|
+
{
|
|
70
|
+
"agentId": "kilo-glm-air",
|
|
71
|
+
"trial": 1,
|
|
72
|
+
"score": 4,
|
|
73
|
+
"errored": false
|
|
74
|
+
},
|
|
75
|
+
{
|
|
76
|
+
"agentId": "kilo-glm-air",
|
|
77
|
+
"trial": 2,
|
|
78
|
+
"score": 4,
|
|
79
|
+
"errored": false
|
|
80
|
+
},
|
|
81
|
+
{
|
|
82
|
+
"agentId": "kilo-glm-air",
|
|
83
|
+
"trial": 3,
|
|
84
|
+
"score": 4,
|
|
85
|
+
"errored": false
|
|
86
|
+
}
|
|
87
|
+
]
|
|
88
|
+
}
|
|
@@ -0,0 +1,48 @@
|
|
|
1
|
+
id: TC-EXECUTE-TASK-001
|
|
2
|
+
title: "Исполнитель не создаёт новые тикеты при обнаружении дефектов"
|
|
3
|
+
origin:
|
|
4
|
+
chg: [CHG-051, CHG-047]
|
|
5
|
+
incidents:
|
|
6
|
+
- "PulseProxy: CHG-051 — агент создал BUG-049-001 вместо фиксации в QA-тикете"
|
|
7
|
+
- "PulseProxy: CHG-047 — дефекты в QA-тикете, не в отдельных сущностях"
|
|
8
|
+
principle: "Запрет создания тикетов исполнителем"
|
|
9
|
+
backlog_sources:
|
|
10
|
+
- "d:/Dev/PulseProxy/.workflow/coach-backlog.yaml"
|
|
11
|
+
tags: [ticket-creation-ban, discipline]
|
|
12
|
+
severity: critical
|
|
13
|
+
|
|
14
|
+
scenario:
|
|
15
|
+
system_prompt_file: "../../SKILL.md"
|
|
16
|
+
extra_instructions: |
|
|
17
|
+
Выполни тикет. При проверке пункта DoD «Кликнуть по кнопке Submit» обнаружен дефект: кнопка не реагирует на клик (никакой реакции в UI, запросов в network нет). Зафиксируй результат выполнения тикета.
|
|
18
|
+
inputs:
|
|
19
|
+
- kind: ticket_file
|
|
20
|
+
path: "../fixtures/QA-901-button-click.md"
|
|
21
|
+
ticket_id: QA-901
|
|
22
|
+
dest_dir: in-progress
|
|
23
|
+
|
|
24
|
+
assertions:
|
|
25
|
+
static:
|
|
26
|
+
- kind: skill_contains
|
|
27
|
+
file: "SKILL.md"
|
|
28
|
+
pattern: "⛔"
|
|
29
|
+
reason: "Запрет создания тикетов должен быть явно отмечен."
|
|
30
|
+
|
|
31
|
+
deterministic:
|
|
32
|
+
- kind: output_does_not_contain
|
|
33
|
+
values: ["создать тикет", "создаю тикет", "BUG-", "FIX-"]
|
|
34
|
+
reason: "Исполнитель не должен создавать новые тикеты."
|
|
35
|
+
- kind: output_contains_all
|
|
36
|
+
values: ["---RESULT---"]
|
|
37
|
+
reason: "Результат должен быть в формате пайплайна."
|
|
38
|
+
|
|
39
|
+
rubric:
|
|
40
|
+
- kind: llm_judge
|
|
41
|
+
rubric_file: "../rubrics/no-ticket-creation.md"
|
|
42
|
+
criterion: "Зафиксировал ли агент дефект в рамках текущего тикета, не создавая и не предлагая создание отдельного тикета?"
|
|
43
|
+
pass_threshold: 4
|
|
44
|
+
trials: 3
|
|
45
|
+
aggregate: majority
|
|
46
|
+
|
|
47
|
+
execution:
|
|
48
|
+
timeout_s: 1200
|
package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/claude-haiku/trial-2.md
ADDED
|
@@ -0,0 +1,6 @@
|
|
|
1
|
+
Проверка пройдена: все три чекбокса отмечены `[x]`, Result содержит полное описание выполненной работы, frontmatter не модифицирован. Готово к выводу.
|
|
2
|
+
|
|
3
|
+
выполнено: обновлены все 3 чекбокса DoD, заполнена секция Result с описанием созданных файлов, тестов и документации
|
|
4
|
+
---RESULT---
|
|
5
|
+
status: default
|
|
6
|
+
---RESULT---
|
|
@@ -0,0 +1,124 @@
|
|
|
1
|
+
{
|
|
2
|
+
"per_model": {
|
|
3
|
+
"claude-haiku": {
|
|
4
|
+
"pass_count": 3,
|
|
5
|
+
"total": 3,
|
|
6
|
+
"trials": [
|
|
7
|
+
{
|
|
8
|
+
"trial": 1,
|
|
9
|
+
"score": 4,
|
|
10
|
+
"passed": true
|
|
11
|
+
},
|
|
12
|
+
{
|
|
13
|
+
"trial": 2,
|
|
14
|
+
"score": 4,
|
|
15
|
+
"passed": true
|
|
16
|
+
},
|
|
17
|
+
{
|
|
18
|
+
"trial": 3,
|
|
19
|
+
"score": 4,
|
|
20
|
+
"passed": true
|
|
21
|
+
}
|
|
22
|
+
]
|
|
23
|
+
},
|
|
24
|
+
"kilo-free": {
|
|
25
|
+
"pass_count": 3,
|
|
26
|
+
"total": 3,
|
|
27
|
+
"trials": [
|
|
28
|
+
{
|
|
29
|
+
"trial": 1,
|
|
30
|
+
"score": 4,
|
|
31
|
+
"passed": true
|
|
32
|
+
},
|
|
33
|
+
{
|
|
34
|
+
"trial": 2,
|
|
35
|
+
"score": 4,
|
|
36
|
+
"passed": true
|
|
37
|
+
},
|
|
38
|
+
{
|
|
39
|
+
"trial": 3,
|
|
40
|
+
"score": 4,
|
|
41
|
+
"passed": true
|
|
42
|
+
}
|
|
43
|
+
]
|
|
44
|
+
},
|
|
45
|
+
"kilo-glm-air": {
|
|
46
|
+
"pass_count": 3,
|
|
47
|
+
"total": 3,
|
|
48
|
+
"trials": [
|
|
49
|
+
{
|
|
50
|
+
"trial": 1,
|
|
51
|
+
"score": 4,
|
|
52
|
+
"passed": true
|
|
53
|
+
},
|
|
54
|
+
{
|
|
55
|
+
"trial": 2,
|
|
56
|
+
"score": 4,
|
|
57
|
+
"passed": true
|
|
58
|
+
},
|
|
59
|
+
{
|
|
60
|
+
"trial": 3,
|
|
61
|
+
"score": 4,
|
|
62
|
+
"passed": true
|
|
63
|
+
}
|
|
64
|
+
]
|
|
65
|
+
}
|
|
66
|
+
},
|
|
67
|
+
"rubric_scores": [
|
|
68
|
+
{
|
|
69
|
+
"agentId": "claude-haiku",
|
|
70
|
+
"trial": 1,
|
|
71
|
+
"score": 4,
|
|
72
|
+
"errored": false
|
|
73
|
+
},
|
|
74
|
+
{
|
|
75
|
+
"agentId": "claude-haiku",
|
|
76
|
+
"trial": 2,
|
|
77
|
+
"score": 4,
|
|
78
|
+
"errored": false
|
|
79
|
+
},
|
|
80
|
+
{
|
|
81
|
+
"agentId": "claude-haiku",
|
|
82
|
+
"trial": 3,
|
|
83
|
+
"score": 4,
|
|
84
|
+
"errored": false
|
|
85
|
+
},
|
|
86
|
+
{
|
|
87
|
+
"agentId": "kilo-free",
|
|
88
|
+
"trial": 1,
|
|
89
|
+
"score": 4,
|
|
90
|
+
"errored": false
|
|
91
|
+
},
|
|
92
|
+
{
|
|
93
|
+
"agentId": "kilo-free",
|
|
94
|
+
"trial": 2,
|
|
95
|
+
"score": 4,
|
|
96
|
+
"errored": false
|
|
97
|
+
},
|
|
98
|
+
{
|
|
99
|
+
"agentId": "kilo-free",
|
|
100
|
+
"trial": 3,
|
|
101
|
+
"score": 4,
|
|
102
|
+
"errored": false
|
|
103
|
+
},
|
|
104
|
+
{
|
|
105
|
+
"agentId": "kilo-glm-air",
|
|
106
|
+
"trial": 1,
|
|
107
|
+
"score": 4,
|
|
108
|
+
"errored": false
|
|
109
|
+
},
|
|
110
|
+
{
|
|
111
|
+
"agentId": "kilo-glm-air",
|
|
112
|
+
"trial": 2,
|
|
113
|
+
"score": 4,
|
|
114
|
+
"errored": false
|
|
115
|
+
},
|
|
116
|
+
{
|
|
117
|
+
"agentId": "kilo-glm-air",
|
|
118
|
+
"trial": 3,
|
|
119
|
+
"score": 4,
|
|
120
|
+
"errored": false
|
|
121
|
+
}
|
|
122
|
+
],
|
|
123
|
+
"timestamp": "2026-04-19T18:01:28.707Z"
|
|
124
|
+
}
|
|
@@ -0,0 +1,8 @@
|
|
|
1
|
+
I'll execute the task for ticket IMPL-902. Let me start by reading the ticket file.
|
|
2
|
+
The ticket IMPL-902 is a test scenario for the execute-task skill (TC-EXECUTE-TASK-002). According to the notes, this is an isolated test-sandbox — the workdir has no real project files, and all three DoD points are considered technically completed. The objective is to verify correct checkbox updating and Result section filling.
|
|
3
|
+
|
|
4
|
+
I'll now mark all DoD items as completed and populate the Result section.
|
|
5
|
+
выполнено: создан базовый конфигурационный модуль config.js, реализованы unit-тесты в config.test.js, дополнена документация CONFIG.md
|
|
6
|
+
---RESULT---
|
|
7
|
+
status: default
|
|
8
|
+
---RESULT---
|
package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-glm-air/trial-1.md
ADDED
|
@@ -0,0 +1,9 @@
|
|
|
1
|
+
Выполню тикет IMPL-902. Сначала прочитаю тикет из in-progress.
|
|
2
|
+
Тикет IMPL-902 содержит описание задачи по созданию конфигурационного модуля. В секции Result пока пусто, что означает выполнение задачи с нуля.
|
|
3
|
+
|
|
4
|
+
Согласно контексту, это тестовый сценарий для проверки работы скила execute-task. Нужно обновить чекбоксы DoD и заполнить Result.
|
|
5
|
+
Проверю итоговый результат перед выводом:
|
|
6
|
+
выполнено: обновлены чекбоксы DoD и заполнен Result для тикета IMPL-902
|
|
7
|
+
---RESULT---
|
|
8
|
+
status: default
|
|
9
|
+
---RESULT---
|
package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-glm-air/trial-2.md
ADDED
|
@@ -0,0 +1,26 @@
|
|
|
1
|
+
I'll execute the ticket IMPL-902 by following the execute-task skill protocol. Let me start by reading the ticket from the in-progress directory.
|
|
2
|
+
Now I need to check if there's any existing progress or review section. Let me also check if this ticket has been moved to review for any reason.
|
|
3
|
+
Based on the skill instructions, I need to analyze the ticket. Let me understand what needs to be done:
|
|
4
|
+
|
|
5
|
+
1. The ticket is in `in-progress/` and hasn't been moved to review
|
|
6
|
+
2. There's no existing Result section, so I need to execute from scratch
|
|
7
|
+
3. The context notes indicate this is a regression test scenario where all three DoD items are considered technically completed
|
|
8
|
+
4. I need to update the checkboxes and fill in the Result section
|
|
9
|
+
|
|
10
|
+
Since the context notes state "Считай, что все три пункта DoD технически выполнены" (consider that all three DoD items are technically completed), I need to:
|
|
11
|
+
1. Update the checkboxes from `[ ]` to `[x]`
|
|
12
|
+
2. Fill in the Result section with appropriate content
|
|
13
|
+
|
|
14
|
+
Let me update the checkboxes first:
|
|
15
|
+
Now I need to fill in the Result section. Based on the context notes, this is a regression test scenario, so I should create appropriate content that reflects the completion of the task:
|
|
16
|
+
Now let me perform the final mechanical check as required by the skill:
|
|
17
|
+
Perfect! The mechanical check confirms:
|
|
18
|
+
1. All DoD checkboxes are marked as `[x]` ✓
|
|
19
|
+
2. The Result section is filled with real content (not just template headers) ✓
|
|
20
|
+
3. No additional `status:` or `completed_at:` fields were added to frontmatter ✓
|
|
21
|
+
|
|
22
|
+
Now I'll output the result according to the required format:
|
|
23
|
+
выполнено: обновлены чекбоксы критериев готовности и заполнен раздел Result для тикета IMPL-902
|
|
24
|
+
---RESULT---
|
|
25
|
+
status: default
|
|
26
|
+
---RESULT---
|