workflow-ai 1.0.62 → 1.0.64
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +61 -0
- package/agent-templates/CLAUDE.md.tpl +2 -0
- package/agent-templates/QWEN.md.tpl +2 -0
- package/configs/config.yaml +134 -0
- package/configs/pipeline.yaml +884 -0
- package/configs/ticket-movement-rules.yaml +80 -0
- package/package.json +2 -1
- package/src/global-dir.mjs +25 -1
- package/src/init.mjs +5 -4
- package/src/lib/agent-spawner.mjs +338 -0
- package/src/runner.mjs +15 -14
- package/src/scripts/get-next-test-id.js +94 -0
- package/src/scripts/migrate-backlog-to-tests.js +406 -0
- package/src/scripts/run-skill-tests.js +1703 -0
- package/src/scripts/scan-fixtures-for-secrets.js +248 -0
- package/src/scripts/tests/timeout-cascade.test.js +28 -0
- package/src/skills/analyze-report/README.md +44 -0
- package/src/skills/analyze-report/SKILL.md +121 -0
- package/src/skills/analyze-report/algorithms/progress-assessment.md +108 -0
- package/src/skills/analyze-report/knowledge/analysis-frameworks.md +66 -0
- package/src/skills/analyze-report/knowledge/report-structure.md +61 -0
- package/src/skills/analyze-report/scripts/calc-plan-metrics.js +234 -0
- package/src/skills/analyze-report/templates/analysis-report.md +80 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/claude-sonnet/trial-1.md +69 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/claude-sonnet/trial-2.md +103 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/claude-sonnet/trial-3.md +99 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/judge.json +163 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-deepseek/trial-1.md +89 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-deepseek/trial-2.md +88 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-deepseek/trial-3.md +100 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-glm/trial-1.md +77 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-glm/trial-2.md +64 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-glm/trial-3.md +110 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-minimax/trial-1.md +74 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-minimax/trial-2.md +38 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/kilo-minimax/trial-3.md +61 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001/current/meta.json +115 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-001-evidence-from-log.yaml +60 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/claude-sonnet/trial-1.md +90 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/claude-sonnet/trial-2.md +89 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/claude-sonnet/trial-3.md +77 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/judge.json +163 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-deepseek/trial-1.md +84 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-deepseek/trial-2.md +77 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-deepseek/trial-3.md +89 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-glm/trial-1.md +103 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-glm/trial-2.md +103 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-glm/trial-3.md +103 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-minimax/trial-1.md +93 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-minimax/trial-2.md +93 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/kilo-minimax/trial-3.md +86 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002/current/meta.json +115 -0
- package/src/skills/analyze-report/tests/cases/TC-ANALYZE-REPORT-002-result-block-format.yaml +44 -0
- package/src/skills/analyze-report/tests/fixtures/REPORT-002-incorrect-attribution.md +27 -0
- package/src/skills/analyze-report/tests/fixtures/pipeline-2026-04-06_qa-001-skip.log +32 -0
- package/src/skills/analyze-report/tests/index.yaml +25 -0
- package/src/skills/analyze-report/tests/rubrics/evidence-from-log.md +22 -0
- package/src/skills/analyze-report/tests/rubrics/result-block-format.md +22 -0
- package/src/skills/analyze-report/workflows/progress.md +158 -0
- package/src/skills/analyze-report/workflows/retrospective.md +143 -0
- package/src/skills/coach/README.md +43 -0
- package/src/skills/coach/SKILL.md +166 -0
- package/src/skills/coach/SKILL.md.legacy +157 -0
- package/src/skills/coach/algorithms/gap-analysis.md +69 -0
- package/src/skills/coach/algorithms/improvement-prioritization.md +62 -0
- package/src/skills/coach/algorithms/skill-scoring.md +80 -0
- package/src/skills/coach/knowledge/audit-applied-changes-clean.txt +11 -0
- package/src/skills/coach/knowledge/backlog-management.md +67 -0
- package/src/skills/coach/knowledge/backlog-management.md.legacy +90 -0
- package/src/skills/coach/knowledge/common-antipatterns.md +76 -0
- package/src/skills/coach/knowledge/prompt-engineering.md +45 -0
- package/src/skills/coach/knowledge/shared-knowledge-guide.md +44 -0
- package/src/skills/coach/knowledge/skill-anatomy.md +49 -0
- package/src/skills/coach/knowledge/test-authorship.md +141 -0
- package/src/skills/coach/templates/audit-report.md +39 -0
- package/src/skills/coach/templates/coach-backlog-init.yaml +14 -0
- package/src/skills/coach/templates/coach-backlog-init.yaml.legacy +10 -0
- package/src/skills/coach/templates/improvement-plan.md +42 -0
- package/src/skills/coach/templates/new-skill.md +95 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/claude-sonnet/trial-1.md +58 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/claude-sonnet/trial-2.md +65 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/claude-sonnet/trial-3.md +58 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/judge.json +151 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-deepseek/trial-1.md +46 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-deepseek/trial-2.md +0 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-deepseek/trial-3.md +75 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-glm/trial-1.md +81 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-glm/trial-2.md +101 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-glm/trial-3.md +91 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-minimax/trial-1.md +48 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-minimax/trial-2.md +30 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/kilo-minimax/trial-3.md +55 -0
- package/src/skills/coach/tests/cases/TC-COACH-001/current/meta.json +95 -0
- package/src/skills/coach/tests/cases/TC-COACH-001-evidence-based-temporal-diagram.yaml +53 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/claude-sonnet/trial-1.md +46 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/claude-sonnet/trial-2.md +50 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/claude-sonnet/trial-3.md +48 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/judge.json +151 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-deepseek/trial-1.md +0 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-deepseek/trial-2.md +37 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-deepseek/trial-3.md +30 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-glm/trial-1.md +23 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-glm/trial-2.md +29 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-glm/trial-3.md +35 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-minimax/trial-1.md +13 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-minimax/trial-2.md +19 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/kilo-minimax/trial-3.md +33 -0
- package/src/skills/coach/tests/cases/TC-COACH-002/current/meta.json +95 -0
- package/src/skills/coach/tests/cases/TC-COACH-002-root-cause-first.yaml +57 -0
- package/src/skills/coach/tests/fixtures/pipeline-2026-04-06_id-collision.log +77 -0
- package/src/skills/coach/tests/index.yaml +29 -0
- package/src/skills/coach/tests/rubrics/calibration/evidence-based-bad.md +13 -0
- package/src/skills/coach/tests/rubrics/calibration/evidence-based-good.md +29 -0
- package/src/skills/coach/tests/rubrics/evidence-based.md +26 -0
- package/src/skills/coach/tests/rubrics/root-cause-first.md +21 -0
- package/src/skills/coach/workflows/analyze.md +79 -0
- package/src/skills/coach/workflows/analyze.md.legacy +64 -0
- package/src/skills/coach/workflows/audit.md +74 -0
- package/src/skills/coach/workflows/audit.md.legacy +59 -0
- package/src/skills/coach/workflows/create.md +80 -0
- package/src/skills/coach/workflows/create.md.legacy +67 -0
- package/src/skills/coach/workflows/improve.md +71 -0
- package/src/skills/coach/workflows/improve.md.legacy +60 -0
- package/src/skills/coach/workflows/research.md +55 -0
- package/src/skills/coach/workflows/review.md +52 -0
- package/src/skills/coach/workflows/review.md.legacy +48 -0
- package/src/skills/coach/workflows/test.md +97 -0
- package/src/skills/create-plan/README.md +39 -0
- package/src/skills/create-plan/SKILL.md +104 -0
- package/src/skills/create-plan/algorithms/risk-assessment.md +73 -0
- package/src/skills/create-plan/knowledge/plan-completeness.md +67 -0
- package/src/skills/create-plan/knowledge/plan-lifecycle.md +33 -0
- package/src/skills/create-plan/knowledge/task-verification-pairs.md +151 -0
- package/src/skills/create-plan/scripts/validate-completeness.js +182 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/claude-sonnet/trial-1.md +5 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/claude-sonnet/trial-2.md +39 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/claude-sonnet/trial-3.md +35 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/judge.json +167 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-deepseek/trial-1.md +5 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-deepseek/trial-2.md +10 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-deepseek/trial-3.md +5 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-glm/trial-1.md +26 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-glm/trial-2.md +86 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-glm/trial-3.md +5 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-minimax/trial-1.md +11 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-minimax/trial-2.md +15 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/kilo-minimax/trial-3.md +14 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001/current/meta.json +119 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-001-validate-completeness.yaml +41 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/claude-sonnet/trial-1.md +25 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/claude-sonnet/trial-2.md +30 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/claude-sonnet/trial-3.md +37 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/judge.json +164 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-deepseek/trial-1.md +3 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-deepseek/trial-2.md +11 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-deepseek/trial-3.md +13 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-glm/trial-1.md +44 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-glm/trial-2.md +5 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-glm/trial-3.md +49 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-minimax/trial-1.md +6 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-minimax/trial-2.md +11 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/kilo-minimax/trial-3.md +16 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002/current/meta.json +116 -0
- package/src/skills/create-plan/tests/cases/TC-CREATE-PLAN-002-task-granularity.yaml +39 -0
- package/src/skills/create-plan/tests/index.yaml +25 -0
- package/src/skills/create-plan/tests/rubrics/task-granularity.md +21 -0
- package/src/skills/create-plan/tests/rubrics/validate-completeness.md +21 -0
- package/src/skills/create-plan/workflows/create.md +136 -0
- package/src/skills/create-report/README.md +40 -0
- package/src/skills/create-report/SKILL.md +73 -0
- package/src/skills/create-report/algorithms/metric-calculation.md +93 -0
- package/src/skills/create-report/knowledge/report-metrics.md +82 -0
- package/src/skills/create-report/scripts/calc-metrics.js +383 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/claude-sonnet/trial-1.md +25 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/claude-sonnet/trial-2.md +26 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/claude-sonnet/trial-3.md +28 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/judge.json +163 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-deepseek/trial-1.md +4 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-deepseek/trial-2.md +3 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-deepseek/trial-3.md +6 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-glm/trial-1.md +8 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-glm/trial-2.md +12 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-glm/trial-3.md +7 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-minimax/trial-1.md +12 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-minimax/trial-2.md +22 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/kilo-minimax/trial-3.md +13 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001/current/meta.json +115 -0
- package/src/skills/create-report/tests/cases/TC-CREATE-REPORT-001-root-cause-attribution.yaml +57 -0
- package/src/skills/create-report/tests/index.yaml +20 -0
- package/src/skills/create-report/tests/rubrics/root-cause-attribution.md +21 -0
- package/src/skills/create-report/workflows/standard.md +175 -0
- package/src/skills/decompose-gaps/README.md +39 -0
- package/src/skills/decompose-gaps/SKILL.md +78 -0
- package/src/skills/decompose-gaps/algorithms/scope-check.md +110 -0
- package/src/skills/decompose-gaps/knowledge/scope-validation.md +65 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/claude-sonnet/trial-1.md +49 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/claude-sonnet/trial-2.md +56 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/claude-sonnet/trial-3.md +39 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/judge.json +164 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-deepseek/trial-1.md +25 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-deepseek/trial-2.md +11 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-deepseek/trial-3.md +26 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-glm/trial-1.md +19 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-glm/trial-2.md +5 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-glm/trial-3.md +28 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-minimax/trial-1.md +23 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-minimax/trial-2.md +27 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/kilo-minimax/trial-3.md +25 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001/current/meta.json +116 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-001-scope-exclusion.yaml +46 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/claude-sonnet/trial-1.md +32 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/claude-sonnet/trial-2.md +20 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/claude-sonnet/trial-3.md +26 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/judge.json +164 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-deepseek/trial-1.md +7 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-deepseek/trial-2.md +16 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-deepseek/trial-3.md +7 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-glm/trial-1.md +5 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-glm/trial-2.md +11 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-glm/trial-3.md +13 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-minimax/trial-1.md +13 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-minimax/trial-2.md +12 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/kilo-minimax/trial-3.md +5 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002/current/meta.json +116 -0
- package/src/skills/decompose-gaps/tests/cases/TC-DECOMPOSE-GAPS-002-glob-before-write.yaml +36 -0
- package/src/skills/decompose-gaps/tests/index.yaml +25 -0
- package/src/skills/decompose-gaps/tests/rubrics/glob-before-write.md +21 -0
- package/src/skills/decompose-gaps/tests/rubrics/scope-exclusion.md +21 -0
- package/src/skills/decompose-gaps/workflows/decompose.md +120 -0
- package/src/skills/decompose-plan/README.md +43 -0
- package/src/skills/decompose-plan/SKILL.md +87 -0
- package/src/skills/decompose-plan/algorithms/deduplication.md +101 -0
- package/src/skills/decompose-plan/knowledge/atomicity-checklist.md +113 -0
- package/src/skills/decompose-plan/knowledge/capabilities.md +44 -0
- package/src/skills/decompose-plan/knowledge/human-task-rules.md +67 -0
- package/src/skills/decompose-plan/knowledge/scope-guard-checklist.md +73 -0
- package/src/skills/decompose-plan/scripts/check-atomicity-limit.js +47 -0
- package/src/skills/decompose-plan/scripts/check-duplicates.js +323 -0
- package/src/skills/decompose-plan/scripts/verify-atomicity.js +408 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/claude-sonnet/trial-1.md +30 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/claude-sonnet/trial-2.md +36 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/claude-sonnet/trial-3.md +37 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/judge.json +163 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-deepseek/trial-1.md +20 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-deepseek/trial-2.md +17 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-deepseek/trial-3.md +28 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-glm/trial-1.md +114 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-glm/trial-2.md +137 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-glm/trial-3.md +188 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-minimax/trial-1.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-minimax/trial-2.md +32 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/kilo-minimax/trial-3.md +110 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001/current/meta.json +115 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-001-atomicity-no-1to1.yaml +56 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/claude-sonnet/trial-1.md +47 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/claude-sonnet/trial-2.md +54 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/claude-sonnet/trial-3.md +43 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/judge.json +163 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-deepseek/trial-1.md +15 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-deepseek/trial-2.md +5 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-deepseek/trial-3.md +12 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-glm/trial-1.md +34 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-glm/trial-2.md +30 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-glm/trial-3.md +35 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-minimax/trial-1.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-minimax/trial-2.md +31 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/kilo-minimax/trial-3.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002/current/meta.json +115 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-002-get-next-id-mandatory.yaml +44 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/claude-sonnet/trial-1.md +21 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/claude-sonnet/trial-2.md +38 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/claude-sonnet/trial-3.md +30 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/judge.json +163 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-deepseek/trial-1.md +31 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-deepseek/trial-2.md +35 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-deepseek/trial-3.md +48 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-glm/trial-1.md +167 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-glm/trial-2.md +62 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-glm/trial-3.md +174 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-minimax/trial-1.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-minimax/trial-2.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/kilo-minimax/trial-3.md +0 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003/current/meta.json +115 -0
- package/src/skills/decompose-plan/tests/cases/TC-DECOMPOSE-PLAN-003-verbatim-dod-transfer.yaml +42 -0
- package/src/skills/decompose-plan/tests/index.yaml +30 -0
- package/src/skills/decompose-plan/tests/rubrics/atomicity-no-1to1.md +21 -0
- package/src/skills/decompose-plan/tests/rubrics/get-next-id-mandatory.md +21 -0
- package/src/skills/decompose-plan/tests/rubrics/verbatim-dod-transfer.md +21 -0
- package/src/skills/decompose-plan/workflows/decompose.md +272 -0
- package/src/skills/deep-research/README.md +36 -0
- package/src/skills/deep-research/SKILL.md +106 -0
- package/src/skills/deep-research/algorithms/source-scoring.md +63 -0
- package/src/skills/deep-research/algorithms/synthesis.md +67 -0
- package/src/skills/deep-research/knowledge/data-validation.md +44 -0
- package/src/skills/deep-research/knowledge/perplexity-config.md +30 -0
- package/src/skills/deep-research/knowledge/research-methodology.md +54 -0
- package/src/skills/deep-research/knowledge/source-evaluation.md +33 -0
- package/src/skills/deep-research/scripts/perplexity-research.js +315 -0
- package/src/skills/deep-research/templates/brief-summary.md +25 -0
- package/src/skills/deep-research/templates/research-report.md +76 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/claude-haiku/trial-1.md +48 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/claude-haiku/trial-2.md +88 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/claude-haiku/trial-3.md +56 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/judge.json +163 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-free/trial-1.md +58 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-free/trial-2.md +249 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-free/trial-3.md +44 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm/trial-1.md +96 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm/trial-2.md +56 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm/trial-3.md +94 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm-air/trial-1.md +11 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm-air/trial-2.md +1 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/kilo-glm-air/trial-3.md +1 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001/current/meta.json +115 -0
- package/src/skills/deep-research/tests/cases/TC-DEEP-RESEARCH-001-self-check-url.yaml +58 -0
- package/src/skills/deep-research/tests/index.yaml +20 -0
- package/src/skills/deep-research/tests/rubrics/self-check-url.md +34 -0
- package/src/skills/deep-research/workflows/base-checklist.md +19 -0
- package/src/skills/deep-research/workflows/benchmark.md +38 -0
- package/src/skills/deep-research/workflows/competitor.md +44 -0
- package/src/skills/deep-research/workflows/custom.md +32 -0
- package/src/skills/deep-research/workflows/market.md +44 -0
- package/src/skills/deep-research/workflows/technology.md +40 -0
- package/src/skills/deep-research/workflows/trend.md +40 -0
- package/src/skills/execute-task/README.md +44 -0
- package/src/skills/execute-task/SKILL.md +292 -0
- package/src/skills/execute-task/algorithms/execution-strategy.md +136 -0
- package/src/skills/execute-task/knowledge/context-checkpoints.md +75 -0
- package/src/skills/execute-task/knowledge/ticket-structure.md +70 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/claude-haiku/trial-1.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/claude-haiku/trial-2.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/claude-haiku/trial-3.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/judge.json +124 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-free/trial-1.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-free/trial-2.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-free/trial-3.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-glm-air/trial-1.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-glm-air/trial-2.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/kilo-glm-air/trial-3.md +11 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001/current/meta.json +89 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-001-no-ticket-creation.yaml +48 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/claude-haiku/trial-1.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/claude-haiku/trial-2.md +6 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/claude-haiku/trial-3.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/judge.json +124 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-free/trial-1.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-free/trial-2.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-free/trial-3.md +8 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-glm-air/trial-1.md +9 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-glm-air/trial-2.md +26 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/kilo-glm-air/trial-3.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002/current/meta.json +89 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-002-no-duplicate-dod.yaml +44 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/claude-haiku/trial-1.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/claude-haiku/trial-2.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/claude-haiku/trial-3.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/judge.json +46 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003/current/meta.json +37 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-003-verification-proportionality.yaml +46 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/claude-haiku/trial-1.md +18 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/claude-haiku/trial-2.md +16 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/claude-haiku/trial-3.md +14 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/judge.json +124 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-free/trial-1.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-free/trial-2.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-free/trial-3.md +1 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-glm-air/trial-1.md +8 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-glm-air/trial-2.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/kilo-glm-air/trial-3.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004/current/meta.json +89 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-004-no-foreign-ticket-edit.yaml +50 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/claude-haiku/trial-1.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/claude-haiku/trial-2.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/claude-haiku/trial-3.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/judge.json +124 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-free/trial-1.md +15 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-free/trial-2.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-free/trial-3.md +5 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-glm-air/trial-1.md +11 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-glm-air/trial-2.md +11 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/kilo-glm-air/trial-3.md +4 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005/current/meta.json +89 -0
- package/src/skills/execute-task/tests/cases/TC-EXECUTE-TASK-005-ticket-fields-updated.yaml +39 -0
- package/src/skills/execute-task/tests/fixtures/IMPL-902-create-file.md +41 -0
- package/src/skills/execute-task/tests/fixtures/IMPL-904-current-task.md +40 -0
- package/src/skills/execute-task/tests/fixtures/IMPL-906-fill-ticket.md +42 -0
- package/src/skills/execute-task/tests/fixtures/QA-901-button-click.md +41 -0
- package/src/skills/execute-task/tests/fixtures/QA-903-visual-figma.md +40 -0
- package/src/skills/execute-task/tests/fixtures/TASK-905-done-with-typo.md +36 -0
- package/src/skills/execute-task/tests/index.yaml +39 -0
- package/src/skills/execute-task/tests/rubrics/no-duplicate-dod.md +22 -0
- package/src/skills/execute-task/tests/rubrics/no-foreign-ticket-edit.md +20 -0
- package/src/skills/execute-task/tests/rubrics/no-ticket-creation.md +21 -0
- package/src/skills/execute-task/tests/rubrics/ticket-fields-updated.md +23 -0
- package/src/skills/execute-task/tests/rubrics/verification-proportionality.md +22 -0
- package/src/skills/execute-task/workflows/execute.md +104 -0
- package/src/skills/manual-testing/README.md +63 -0
- package/src/skills/manual-testing/SKILL.md +174 -0
- package/src/skills/manual-testing/algorithms/blocked-tool-strategy.md +74 -0
- package/src/skills/manual-testing/algorithms/bug-severity.md +73 -0
- package/src/skills/manual-testing/algorithms/mcp-budget.md +97 -0
- package/src/skills/manual-testing/algorithms/test-prioritization.md +69 -0
- package/src/skills/manual-testing/knowledge/browser-extension-testing.md +102 -0
- package/src/skills/manual-testing/knowledge/browser-tools.md +114 -0
- package/src/skills/manual-testing/knowledge/desktop-tools-advanced.md +92 -0
- package/src/skills/manual-testing/knowledge/desktop-tools-core.md +76 -0
- package/src/skills/manual-testing/knowledge/sandbox-advanced.md +83 -0
- package/src/skills/manual-testing/knowledge/sandbox-core.md +67 -0
- package/src/skills/manual-testing/knowledge/stateful-edge-cases.md +69 -0
- package/src/skills/manual-testing/knowledge/test-case-design.md +107 -0
- package/src/skills/manual-testing/knowledge/testing-types.md +45 -0
- package/src/skills/manual-testing/templates/bug-report.md +52 -0
- package/src/skills/manual-testing/templates/test-case.md +34 -0
- package/src/skills/manual-testing/templates/test-plan.md +97 -0
- package/src/skills/manual-testing/templates/test-session-report.md +56 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/claude-sonnet/trial-1.md +21 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/claude-sonnet/trial-2.md +65 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/claude-sonnet/trial-3.md +35 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/judge.json +163 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-deepseek/trial-1.md +0 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-deepseek/trial-2.md +7 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-deepseek/trial-3.md +0 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-glm/trial-1.md +4 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-glm/trial-2.md +15 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-glm/trial-3.md +8 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-minimax/trial-1.md +5 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-minimax/trial-2.md +7 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-minimax/trial-3.md +7 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/meta.json +114 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001-sandbox-mandatory.yaml +38 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/claude-sonnet/trial-1.md +47 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/claude-sonnet/trial-2.md +39 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/claude-sonnet/trial-3.md +40 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/judge.json +163 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-deepseek/trial-1.md +19 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-deepseek/trial-2.md +15 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-deepseek/trial-3.md +24 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-glm/trial-1.md +19 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-glm/trial-2.md +13 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-glm/trial-3.md +18 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-minimax/trial-1.md +21 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-minimax/trial-2.md +15 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/kilo-minimax/trial-3.md +14 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/meta.json +114 -0
- package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002-visual-tc-screenshot.yaml +37 -0
- package/src/skills/manual-testing/tests/index.yaml +25 -0
- package/src/skills/manual-testing/tests/last-run-tc001-sonnet.log +140 -0
- package/src/skills/manual-testing/tests/last-run-tc002.log +1 -0
- package/src/skills/manual-testing/tests/last-run.log +1469 -0
- package/src/skills/manual-testing/tests/rubrics/sandbox-mandatory.md +20 -0
- package/src/skills/manual-testing/tests/rubrics/visual-tc-screenshot.md +21 -0
- package/src/skills/manual-testing/workflows/acceptance.md +80 -0
- package/src/skills/manual-testing/workflows/exploratory.md +84 -0
- package/src/skills/manual-testing/workflows/regression.md +76 -0
- package/src/skills/manual-testing/workflows/smoke.md +109 -0
- package/src/skills/manual-testing/workflows/test-plan.md +75 -0
- package/src/skills/review-result/README.md +59 -0
- package/src/skills/review-result/SKILL.md +138 -0
- package/src/skills/review-result/algorithms/verification.md +112 -0
- package/src/skills/review-result/knowledge/dod-patterns.md +115 -0
- package/src/skills/review-result/scripts/verify-artifacts.js +354 -0
- package/src/skills/review-result/templates/verdict.md +153 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-haiku/trial-1.md +22 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-haiku/trial-2.md +7 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-haiku/trial-3.md +21 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-sonnet/trial-1.md +6 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-sonnet/trial-2.md +6 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/claude-sonnet/trial-3.md +18 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/judge.json +164 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-deepseek/trial-1.md +5 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-deepseek/trial-2.md +7 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-deepseek/trial-3.md +6 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-glm/trial-1.md +49 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-glm/trial-2.md +28 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-glm/trial-3.md +37 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-minimax/trial-1.md +22 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-minimax/trial-2.md +13 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/kilo-minimax/trial-3.md +21 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001/current/meta.json +116 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-001-visual-tc-trigger.yaml +51 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-haiku/trial-1.md +23 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-haiku/trial-2.md +22 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-haiku/trial-3.md +28 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-sonnet/trial-1.md +4 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-sonnet/trial-2.md +36 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/claude-sonnet/trial-3.md +4 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/judge.json +163 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-deepseek/trial-1.md +4 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-deepseek/trial-2.md +0 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-deepseek/trial-3.md +4 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-glm/trial-1.md +39 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-glm/trial-2.md +25 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-glm/trial-3.md +32 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-minimax/trial-1.md +34 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-minimax/trial-2.md +8 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/kilo-minimax/trial-3.md +23 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002/current/meta.json +115 -0
- package/src/skills/review-result/tests/cases/TC-REVIEW-RESULT-002-path-line-suffix.yaml +39 -0
- package/src/skills/review-result/tests/fixtures/IMPL-902-path-with-line.md +43 -0
- package/src/skills/review-result/tests/fixtures/QA-901-visual-button.md +46 -0
- package/src/skills/review-result/tests/index.yaml +25 -0
- package/src/skills/review-result/tests/rubrics/path-line-suffix.md +19 -0
- package/src/skills/review-result/tests/rubrics/visual-tc-trigger.md +19 -0
- package/src/skills/review-result/workflows/review.md +209 -0
- package/templates/plan-template.md +1 -0
|
@@ -0,0 +1,56 @@
|
|
|
1
|
+
# Шаблон: Отчёт о тестовой сессии
|
|
2
|
+
|
|
3
|
+
```markdown
|
|
4
|
+
# Отчёт о тестировании: {Название сессии}
|
|
5
|
+
|
|
6
|
+
**Дата:** {YYYY-MM-DD}
|
|
7
|
+
**Тип тестирования:** Smoke / Regression / Exploratory / Acceptance
|
|
8
|
+
**Тестировщик:** QA
|
|
9
|
+
**Окружение:** {URL стенда}
|
|
10
|
+
**Браузер:** {Браузер и версия}
|
|
11
|
+
|
|
12
|
+
## Вердикт
|
|
13
|
+
|
|
14
|
+
**{PASSED / FAILED / PASSED WITH ISSUES / ACCEPTED / REJECTED}**
|
|
15
|
+
|
|
16
|
+
{1-2 предложения обоснование вердикта}
|
|
17
|
+
|
|
18
|
+
## Статистика
|
|
19
|
+
|
|
20
|
+
| Метрика | Значение |
|
|
21
|
+
|---------|----------|
|
|
22
|
+
| Всего тест-кейсов | {N} |
|
|
23
|
+
| Пройдено (PASS) | {N} ({%}) |
|
|
24
|
+
| Упало (FAIL) | {N} ({%}) |
|
|
25
|
+
| Заблокировано (BLOCKED) | {N} ({%}) |
|
|
26
|
+
| Пропущено (SKIPPED) | {N} ({%}) |
|
|
27
|
+
|
|
28
|
+
## Результаты по тест-кейсам
|
|
29
|
+
|
|
30
|
+
| # | ID | Название | Приоритет | Статус | Комментарий |
|
|
31
|
+
|---|-----|----------|-----------|--------|-------------|
|
|
32
|
+
| 1 | QA-TC-001 | {Название} | P0 | PASS / FAIL | {Ссылка на баг если FAIL} |
|
|
33
|
+
| 2 | ... | ... | ... | ... | ... |
|
|
34
|
+
|
|
35
|
+
## Найденные баги
|
|
36
|
+
|
|
37
|
+
| # | TC | Название | Severity | Priority |
|
|
38
|
+
|---|----|----------|----------|----------|
|
|
39
|
+
| 1 | TC-{NNN} | {Краткое описание} | {CRITICAL} | {P0} |
|
|
40
|
+
| 2 | ... | ... | ... | ... |
|
|
41
|
+
|
|
42
|
+
(Подробности — в секциях DEFECT внутри QA-тикета)
|
|
43
|
+
|
|
44
|
+
## Наблюдения
|
|
45
|
+
|
|
46
|
+
{Заметки, UX-проблемы, рекомендации, непокрытые области — всё, что не является формальным багом, но заслуживает внимания}
|
|
47
|
+
|
|
48
|
+
## Скриншоты
|
|
49
|
+
|
|
50
|
+
{Ключевые скриншоты сессии с подписями}
|
|
51
|
+
|
|
52
|
+
## Рекомендации
|
|
53
|
+
|
|
54
|
+
- {Рекомендация 1}
|
|
55
|
+
- {Рекомендация 2}
|
|
56
|
+
```
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/claude-sonnet/trial-1.md
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
Согласно таблице инструментов в скиле:
|
|
2
|
+
|
|
3
|
+
| Критерий | Выбор |
|
|
4
|
+
|----------|-------|
|
|
5
|
+
| **Тип приложения** | Desktop (VSCode-расширение) |
|
|
6
|
+
| **Основной инструмент** | Windows-MCP **через Sandbox** |
|
|
7
|
+
| **Knowledge-модули** | `knowledge/desktop-tools-core.md` + `knowledge/sandbox-core.md` |
|
|
8
|
+
|
|
9
|
+
**Среда запуска: Windows Sandbox — обязательно.**
|
|
10
|
+
|
|
11
|
+
Прямое тестирование на хосте запрещено по двум причинам:
|
|
12
|
+
1. Загрязняет рабочую среду разработчика (глобальные настройки VSCode, установленные расширения).
|
|
13
|
+
2. Не воспроизводит чистое состояние — нельзя гарантировать изоляцию между тест-запусками.
|
|
14
|
+
|
|
15
|
+
**Порядок подготовки:**
|
|
16
|
+
|
|
17
|
+
1. Прочитать `../shared/README.md` — узнать пути, конфигурацию, проектные ограничения.
|
|
18
|
+
2. Загрузить `knowledge/sandbox-core.md` — настройка Sandbox, quick-start, персистенция evidence.
|
|
19
|
+
3. Загрузить `knowledge/desktop-tools-core.md` — команды Windows-MCP, паттерны навигации, бюджет Snapshot.
|
|
20
|
+
4. Поднять Sandbox → установить расширение → выполнить тест-кейсы через Windows-MCP.
|
|
21
|
+
5. Сохранить evidence (скриншоты/a11y assertions) из Sandbox до завершения сессии — файлы внутри Sandbox эфемерны.
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/claude-sonnet/trial-2.md
ADDED
|
@@ -0,0 +1,65 @@
|
|
|
1
|
+
Нет директории `shared/` в этой тестовой среде — это ожидаемо для данного тест-кейса. Отвечаю на вопрос о стратегии тестирования на основе загруженных knowledge-модулей.
|
|
2
|
+
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
## Стратегия тестирования VSCode-расширения
|
|
6
|
+
|
|
7
|
+
### Выбор среды и инструмента
|
|
8
|
+
|
|
9
|
+
**VSCode-расширение — desktop-приложение.**
|
|
10
|
+
|
|
11
|
+
Согласно таблице выбора инструмента из SKILL.md:
|
|
12
|
+
|
|
13
|
+
| Тип приложения | Инструмент | Knowledge |
|
|
14
|
+
|----------------|-----------|-----------|
|
|
15
|
+
| Desktop-приложение (VSCode-расширение) | Windows-MCP **через Sandbox** | `desktop-tools-core.md` + `sandbox-core.md` |
|
|
16
|
+
|
|
17
|
+
**⛔ Тестирование напрямую на хост-машине — ЗАПРЕЩЕНО.**
|
|
18
|
+
|
|
19
|
+
Причины:
|
|
20
|
+
- Загрязняет рабочую среду разработчика (конфиги, состояние расширений, workspace)
|
|
21
|
+
- Не воспроизводит чистое состояние (первый запуск, свежая установка)
|
|
22
|
+
- Не изолировано — побочные эффекты от других расширений и настроек хоста
|
|
23
|
+
|
|
24
|
+
### Выбранная стратегия
|
|
25
|
+
|
|
26
|
+
**Среда:** Windows Sandbox — изолированное, эфемерное окружение. При каждом запуске чистое состояние.
|
|
27
|
+
|
|
28
|
+
**Инструмент:** `mcp__sandbox-desktop__*` (сервер `sandbox-desktop`)
|
|
29
|
+
|
|
30
|
+
**Команды:**
|
|
31
|
+
- `mcp__sandbox-desktop__screenshot` — быстрая проверка состояния UI
|
|
32
|
+
- `mcp__sandbox-desktop__snapshot` — a11y tree для верификации элементов (с бюджетом: 1-2 Snapshot/TC, макс. 6/сессию)
|
|
33
|
+
- `mcp__sandbox-desktop__click`, `type`, `shortcut` — взаимодействие с UI
|
|
34
|
+
- `mcp__sandbox-desktop__PowerShell` — подготовка окружения, сохранение evidence
|
|
35
|
+
|
|
36
|
+
### Quick-start checklist перед тест-кейсами
|
|
37
|
+
|
|
38
|
+
```
|
|
39
|
+
1. Screenshot → MCP работает, Sandbox активен
|
|
40
|
+
2. PowerShell → Test-Path "<path-to-portable-vscode>"
|
|
41
|
+
3. PowerShell → Test-Path "<path-to-test-workspace>"
|
|
42
|
+
4. → Запускай VSCode и тест-кейсы
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
**Hard gate:** если к 5-му MCP-вызову не начат первый TC — СТОП, пересмотреть план.
|
|
46
|
+
|
|
47
|
+
### Персистенция evidence
|
|
48
|
+
|
|
49
|
+
Sandbox эфемерен. Скриншоты сохраняются на хост через mapped folder:
|
|
50
|
+
|
|
51
|
+
```powershell
|
|
52
|
+
# В PowerShell внутри Sandbox:
|
|
53
|
+
$bmp.Save('<sandbox-project-path>\reports\QA-NNN-screenshot-TC-NNN.png')
|
|
54
|
+
```
|
|
55
|
+
|
|
56
|
+
После сохранения — проверка на хосте: `ls reports/*.png`.
|
|
57
|
+
|
|
58
|
+
### Итог
|
|
59
|
+
|
|
60
|
+
| Параметр | Значение |
|
|
61
|
+
|----------|---------|
|
|
62
|
+
| **Среда** | Windows Sandbox (изолированная, НЕ хост) |
|
|
63
|
+
| **MCP-сервер** | `sandbox-desktop` |
|
|
64
|
+
| **Тип проверок** | a11y tree assertions (Snapshot) + скриншоты для визуальных TC |
|
|
65
|
+
| **Тестирование на хосте** | ⛔ ЗАПРЕЩЕНО |
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/claude-sonnet/trial-3.md
ADDED
|
@@ -0,0 +1,35 @@
|
|
|
1
|
+
Согласно правилам скила:
|
|
2
|
+
|
|
3
|
+
## Стратегия тестирования VSCode-расширения
|
|
4
|
+
|
|
5
|
+
### Инструмент
|
|
6
|
+
**Windows-MCP** — основной инструмент для desktop-приложений (VSCode, Electron, нативные).
|
|
7
|
+
|
|
8
|
+
Playwright MCP здесь **не подходит** — он работает только с веб-приложениями в браузере.
|
|
9
|
+
|
|
10
|
+
### Среда запуска
|
|
11
|
+
**Windows Sandbox (обязательно)** — тестирование desktop-приложений ведётся **исключительно через Sandbox**:
|
|
12
|
+
- изолированная среда воспроизводит чистое состояние
|
|
13
|
+
- не загрязняет рабочую среду разработчика
|
|
14
|
+
- прямое тестирование на хосте **запрещено**
|
|
15
|
+
|
|
16
|
+
### Первые шаги перед тестированием
|
|
17
|
+
|
|
18
|
+
1. Прочитать `../shared/README.md` — без этого неизвестны пути, конфигурация и ограничения проекта
|
|
19
|
+
2. Загрузить `knowledge/sandbox-core.md` — настройка Sandbox, quick-start, персистенция evidence
|
|
20
|
+
3. Загрузить `knowledge/desktop-tools-core.md` — команды Windows-MCP, паттерны навигации, бюджет Snapshot
|
|
21
|
+
4. Загрузить `knowledge/desktop-tools-advanced.md` — работа с Snapshot (a11y tree), управление окнами
|
|
22
|
+
|
|
23
|
+
### Схема кратко
|
|
24
|
+
|
|
25
|
+
```
|
|
26
|
+
VSCode extension
|
|
27
|
+
↓
|
|
28
|
+
Windows Sandbox (изолированная среда)
|
|
29
|
+
↓
|
|
30
|
+
Windows-MCP (управление UI)
|
|
31
|
+
↓
|
|
32
|
+
a11y tree / Snapshot (assertions)
|
|
33
|
+
```
|
|
34
|
+
|
|
35
|
+
Если нужно перейти к конкретному тест-плану или тикету — покажи задачу, я загружу `../shared/README.md` и соответствующие модули.
|
|
@@ -0,0 +1,163 @@
|
|
|
1
|
+
{
|
|
2
|
+
"per_model": {
|
|
3
|
+
"claude-sonnet": {
|
|
4
|
+
"pass_count": 3,
|
|
5
|
+
"total": 3,
|
|
6
|
+
"trials": [
|
|
7
|
+
{
|
|
8
|
+
"trial": 1,
|
|
9
|
+
"score": 5,
|
|
10
|
+
"passed": true
|
|
11
|
+
},
|
|
12
|
+
{
|
|
13
|
+
"trial": 2,
|
|
14
|
+
"score": 5,
|
|
15
|
+
"passed": true
|
|
16
|
+
},
|
|
17
|
+
{
|
|
18
|
+
"trial": 3,
|
|
19
|
+
"score": 5,
|
|
20
|
+
"passed": true
|
|
21
|
+
}
|
|
22
|
+
]
|
|
23
|
+
},
|
|
24
|
+
"kilo-glm": {
|
|
25
|
+
"pass_count": 3,
|
|
26
|
+
"total": 3,
|
|
27
|
+
"trials": [
|
|
28
|
+
{
|
|
29
|
+
"trial": 1,
|
|
30
|
+
"score": 5,
|
|
31
|
+
"passed": true
|
|
32
|
+
},
|
|
33
|
+
{
|
|
34
|
+
"trial": 2,
|
|
35
|
+
"score": 5,
|
|
36
|
+
"passed": true
|
|
37
|
+
},
|
|
38
|
+
{
|
|
39
|
+
"trial": 3,
|
|
40
|
+
"score": 5,
|
|
41
|
+
"passed": true
|
|
42
|
+
}
|
|
43
|
+
]
|
|
44
|
+
},
|
|
45
|
+
"kilo-minimax": {
|
|
46
|
+
"pass_count": 3,
|
|
47
|
+
"total": 3,
|
|
48
|
+
"trials": [
|
|
49
|
+
{
|
|
50
|
+
"trial": 1,
|
|
51
|
+
"score": 5,
|
|
52
|
+
"passed": true
|
|
53
|
+
},
|
|
54
|
+
{
|
|
55
|
+
"trial": 2,
|
|
56
|
+
"score": 5,
|
|
57
|
+
"passed": true
|
|
58
|
+
},
|
|
59
|
+
{
|
|
60
|
+
"trial": 3,
|
|
61
|
+
"score": 5,
|
|
62
|
+
"passed": true
|
|
63
|
+
}
|
|
64
|
+
]
|
|
65
|
+
},
|
|
66
|
+
"kilo-deepseek": {
|
|
67
|
+
"pass_count": 1,
|
|
68
|
+
"total": 3,
|
|
69
|
+
"trials": [
|
|
70
|
+
{
|
|
71
|
+
"trial": 1,
|
|
72
|
+
"score": 1,
|
|
73
|
+
"passed": false
|
|
74
|
+
},
|
|
75
|
+
{
|
|
76
|
+
"trial": 2,
|
|
77
|
+
"score": 5,
|
|
78
|
+
"passed": true
|
|
79
|
+
},
|
|
80
|
+
{
|
|
81
|
+
"trial": 3,
|
|
82
|
+
"score": 1,
|
|
83
|
+
"passed": false
|
|
84
|
+
}
|
|
85
|
+
]
|
|
86
|
+
}
|
|
87
|
+
},
|
|
88
|
+
"rubric_scores": [
|
|
89
|
+
{
|
|
90
|
+
"agentId": "claude-sonnet",
|
|
91
|
+
"trial": 1,
|
|
92
|
+
"score": 5,
|
|
93
|
+
"errored": false
|
|
94
|
+
},
|
|
95
|
+
{
|
|
96
|
+
"agentId": "claude-sonnet",
|
|
97
|
+
"trial": 2,
|
|
98
|
+
"score": 5,
|
|
99
|
+
"errored": false
|
|
100
|
+
},
|
|
101
|
+
{
|
|
102
|
+
"agentId": "claude-sonnet",
|
|
103
|
+
"trial": 3,
|
|
104
|
+
"score": 5,
|
|
105
|
+
"errored": false
|
|
106
|
+
},
|
|
107
|
+
{
|
|
108
|
+
"agentId": "kilo-deepseek",
|
|
109
|
+
"trial": 1,
|
|
110
|
+
"score": 1,
|
|
111
|
+
"errored": false
|
|
112
|
+
},
|
|
113
|
+
{
|
|
114
|
+
"agentId": "kilo-deepseek",
|
|
115
|
+
"trial": 2,
|
|
116
|
+
"score": 5,
|
|
117
|
+
"errored": false
|
|
118
|
+
},
|
|
119
|
+
{
|
|
120
|
+
"agentId": "kilo-deepseek",
|
|
121
|
+
"trial": 3,
|
|
122
|
+
"score": 1,
|
|
123
|
+
"errored": false
|
|
124
|
+
},
|
|
125
|
+
{
|
|
126
|
+
"agentId": "kilo-glm",
|
|
127
|
+
"trial": 1,
|
|
128
|
+
"score": 5,
|
|
129
|
+
"errored": false
|
|
130
|
+
},
|
|
131
|
+
{
|
|
132
|
+
"agentId": "kilo-glm",
|
|
133
|
+
"trial": 2,
|
|
134
|
+
"score": 5,
|
|
135
|
+
"errored": false
|
|
136
|
+
},
|
|
137
|
+
{
|
|
138
|
+
"agentId": "kilo-glm",
|
|
139
|
+
"trial": 3,
|
|
140
|
+
"score": 5,
|
|
141
|
+
"errored": false
|
|
142
|
+
},
|
|
143
|
+
{
|
|
144
|
+
"agentId": "kilo-minimax",
|
|
145
|
+
"trial": 1,
|
|
146
|
+
"score": 5,
|
|
147
|
+
"errored": false
|
|
148
|
+
},
|
|
149
|
+
{
|
|
150
|
+
"agentId": "kilo-minimax",
|
|
151
|
+
"trial": 2,
|
|
152
|
+
"score": 5,
|
|
153
|
+
"errored": false
|
|
154
|
+
},
|
|
155
|
+
{
|
|
156
|
+
"agentId": "kilo-minimax",
|
|
157
|
+
"trial": 3,
|
|
158
|
+
"score": 5,
|
|
159
|
+
"errored": false
|
|
160
|
+
}
|
|
161
|
+
],
|
|
162
|
+
"timestamp": "2026-04-20T07:07:11.080Z"
|
|
163
|
+
}
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-deepseek/trial-1.md
ADDED
|
File without changes
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-deepseek/trial-2.md
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
1
|
+
**Инструмент:** Windows‑MCP через **Sandbox** (изолированная среда).
|
|
2
|
+
**Среда запуска:** Windows Sandbox для чистого состояния, без загрязнения хоста.
|
|
3
|
+
**Знания:** `desktop‑tools‑core.md`, `sandbox‑core.md`.
|
|
4
|
+
**Первый шаг:** загрузка `../shared/README.md` для проектных правил.
|
|
5
|
+
**Артефакты:** скриншоты и evidence сохраняются в `reports/`.
|
|
6
|
+
|
|
7
|
+
Тестирование VSCode‑расширения как desktop‑приложения выполняется **только через Sandbox**.
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-deepseek/trial-3.md
ADDED
|
File without changes
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-glm/trial-1.md
ADDED
|
@@ -0,0 +1,4 @@
|
|
|
1
|
+
**Инструмент:** Windows-MCP
|
|
2
|
+
**Среда:** Windows Sandbox (изолированная среда)
|
|
3
|
+
|
|
4
|
+
Desktop-приложения (включая VSCode-расширения) тестируются **только через Sandbox** — прямое тестирование на хосте запрещено, так как загрязняет рабочую среду и не воспроизводит чистое состояние установки. Потребуется загрузить `knowledge/desktop-tools-core.md` + `knowledge/sandbox-core.md`.
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-glm/trial-2.md
ADDED
|
@@ -0,0 +1,15 @@
|
|
|
1
|
+
**Инструмент:** Windows-MCP (команды навигации, кликов, Snapshot/a11y tree) **через Windows Sandbox**.
|
|
2
|
+
|
|
3
|
+
**Обоснование:**
|
|
4
|
+
|
|
5
|
+
1. **VSCode-расширение = desktop-приложение** → согласно таблице выбора инструмента, основной инструмент — Windows-MCP через Sandbox, knowledge-модули: `knowledge/desktop-tools-core.md` + `knowledge/sandbox-core.md`.
|
|
6
|
+
|
|
7
|
+
2. **Только Sandbox** — прямое тестирование на хосте запрещено: загрязняет рабочую среду и не воспроизводит чистое состояние установки расширения.
|
|
8
|
+
|
|
9
|
+
3. **Порядок действий:**
|
|
10
|
+
- Загрузить `../shared/README.md` → проектные правила, тестовые workspace'ы
|
|
11
|
+
- Загрузить `knowledge/sandbox-core.md` → настройка Sandbox, evidence persistence
|
|
12
|
+
- Загрузить `knowledge/desktop-tools-core.md` → команды Windows-MCP, паттерны навигации по VSCode
|
|
13
|
+
- Настроить Sandbox с предустановленным VSCode + расширением
|
|
14
|
+
- Выполнять тест-кейсы через Windows-MCP (запуск VSCode, навигация по UI, клики, ввод)
|
|
15
|
+
- Evidence: a11y tree assertions через Snapshot, скриншоты для визуальных TC
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-glm/trial-3.md
ADDED
|
@@ -0,0 +1,8 @@
|
|
|
1
|
+
**Инструмент:** Windows-MCP через Sandbox (изолированная среда).
|
|
2
|
+
|
|
3
|
+
**Обоснование:** VSCode-расширение — desktop-приложение. Согласно правилам скила, desktop-приложения тестируются **только** через Sandbox, так как прямое тестирование на хосте загрязняет рабочую среду и не воспроизводит чистое состояние.
|
|
4
|
+
|
|
5
|
+
**Knowledge-модули для загрузки:**
|
|
6
|
+
1. `knowledge/desktop-tools-core.md` — команды Windows-MCP, паттерны навигации
|
|
7
|
+
2. `knowledge/sandbox-core.md` — настройка Sandbox, quick-start, персистенция evidence
|
|
8
|
+
3. `../shared/README.md` — проектные правила и конфигурация (обязательный первый шаг)
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-minimax/trial-1.md
ADDED
|
@@ -0,0 +1,5 @@
|
|
|
1
|
+
**Инструмент:** Windows-MCP
|
|
2
|
+
**Среда:** Windows Sandbox (обязательно — изолированная среда, не хост)
|
|
3
|
+
**Knowledge-модули к загрузке:** `knowledge/desktop-tools-core.md`, `knowledge/desktop-tools-advanced.md`, `knowledge/sandbox-core.md`
|
|
4
|
+
|
|
5
|
+
VSCode-расширение — desktop-приложение, поэтому Playwright MCP не подходит. Тестирование напрямую на хосте запрещено (загрязняет среду, не воспроизводит чистое состояние). Обязательный первый шаг — прочитать `../shared/README.md` для получения путей, конфигурации и ограничений проекта.
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-minimax/trial-2.md
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
1
|
+
Согласно скилу:
|
|
2
|
+
|
|
3
|
+
- **Инструмент:** Windows-MCP
|
|
4
|
+
- **Среда:** Windows Sandbox (обязательно — прямое тестирование на хосте запрещено)
|
|
5
|
+
- **Knowledge-модули к загрузке:** `knowledge/desktop-tools-core.md`, `knowledge/sandbox-core.md`
|
|
6
|
+
|
|
7
|
+
Следующий шаг — прочитать `../shared/README.md` и загрузить `knowledge/sandbox-core.md` для настройки изолированной среды перед запуском расширения.
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-001/current/kilo-minimax/trial-3.md
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
1
|
+
Согласно скилу:
|
|
2
|
+
|
|
3
|
+
- **Инструмент:** Windows-MCP
|
|
4
|
+
- **Среда:** Windows Sandbox (изолированная, обязательно — прямое тестирование на хосте запрещено)
|
|
5
|
+
- **Knowledge-модули к загрузке:** `knowledge/desktop-tools-core.md` + `knowledge/sandbox-core.md`
|
|
6
|
+
|
|
7
|
+
Первый шаг перед тестированием — прочитать `../shared/README.md` и настроить Sandbox по `knowledge/sandbox-core.md`.
|
|
@@ -0,0 +1,114 @@
|
|
|
1
|
+
{
|
|
2
|
+
"date": "2026-04-20T07:07:11.080Z",
|
|
3
|
+
"skill_sha": "9a19d69",
|
|
4
|
+
"status": "failed",
|
|
5
|
+
"duration_ms": 310043,
|
|
6
|
+
"per_model": {
|
|
7
|
+
"claude-sonnet": {
|
|
8
|
+
"passed": true,
|
|
9
|
+
"errored": false,
|
|
10
|
+
"pass_count": 3,
|
|
11
|
+
"error_count": 0,
|
|
12
|
+
"total": 3,
|
|
13
|
+
"threshold": 2
|
|
14
|
+
},
|
|
15
|
+
"kilo-glm": {
|
|
16
|
+
"passed": true,
|
|
17
|
+
"errored": false,
|
|
18
|
+
"pass_count": 3,
|
|
19
|
+
"error_count": 0,
|
|
20
|
+
"total": 3,
|
|
21
|
+
"threshold": 2
|
|
22
|
+
},
|
|
23
|
+
"kilo-minimax": {
|
|
24
|
+
"passed": true,
|
|
25
|
+
"errored": false,
|
|
26
|
+
"pass_count": 3,
|
|
27
|
+
"error_count": 0,
|
|
28
|
+
"total": 3,
|
|
29
|
+
"threshold": 2
|
|
30
|
+
},
|
|
31
|
+
"kilo-deepseek": {
|
|
32
|
+
"passed": false,
|
|
33
|
+
"errored": false,
|
|
34
|
+
"pass_count": 1,
|
|
35
|
+
"error_count": 0,
|
|
36
|
+
"total": 3,
|
|
37
|
+
"threshold": 2
|
|
38
|
+
}
|
|
39
|
+
},
|
|
40
|
+
"rubric_scores": [
|
|
41
|
+
{
|
|
42
|
+
"agentId": "claude-sonnet",
|
|
43
|
+
"trial": 1,
|
|
44
|
+
"score": 5,
|
|
45
|
+
"errored": false
|
|
46
|
+
},
|
|
47
|
+
{
|
|
48
|
+
"agentId": "claude-sonnet",
|
|
49
|
+
"trial": 2,
|
|
50
|
+
"score": 5,
|
|
51
|
+
"errored": false
|
|
52
|
+
},
|
|
53
|
+
{
|
|
54
|
+
"agentId": "claude-sonnet",
|
|
55
|
+
"trial": 3,
|
|
56
|
+
"score": 5,
|
|
57
|
+
"errored": false
|
|
58
|
+
},
|
|
59
|
+
{
|
|
60
|
+
"agentId": "kilo-deepseek",
|
|
61
|
+
"trial": 1,
|
|
62
|
+
"score": 1,
|
|
63
|
+
"errored": false
|
|
64
|
+
},
|
|
65
|
+
{
|
|
66
|
+
"agentId": "kilo-deepseek",
|
|
67
|
+
"trial": 2,
|
|
68
|
+
"score": 5,
|
|
69
|
+
"errored": false
|
|
70
|
+
},
|
|
71
|
+
{
|
|
72
|
+
"agentId": "kilo-deepseek",
|
|
73
|
+
"trial": 3,
|
|
74
|
+
"score": 1,
|
|
75
|
+
"errored": false
|
|
76
|
+
},
|
|
77
|
+
{
|
|
78
|
+
"agentId": "kilo-glm",
|
|
79
|
+
"trial": 1,
|
|
80
|
+
"score": 5,
|
|
81
|
+
"errored": false
|
|
82
|
+
},
|
|
83
|
+
{
|
|
84
|
+
"agentId": "kilo-glm",
|
|
85
|
+
"trial": 2,
|
|
86
|
+
"score": 5,
|
|
87
|
+
"errored": false
|
|
88
|
+
},
|
|
89
|
+
{
|
|
90
|
+
"agentId": "kilo-glm",
|
|
91
|
+
"trial": 3,
|
|
92
|
+
"score": 5,
|
|
93
|
+
"errored": false
|
|
94
|
+
},
|
|
95
|
+
{
|
|
96
|
+
"agentId": "kilo-minimax",
|
|
97
|
+
"trial": 1,
|
|
98
|
+
"score": 5,
|
|
99
|
+
"errored": false
|
|
100
|
+
},
|
|
101
|
+
{
|
|
102
|
+
"agentId": "kilo-minimax",
|
|
103
|
+
"trial": 2,
|
|
104
|
+
"score": 5,
|
|
105
|
+
"errored": false
|
|
106
|
+
},
|
|
107
|
+
{
|
|
108
|
+
"agentId": "kilo-minimax",
|
|
109
|
+
"trial": 3,
|
|
110
|
+
"score": 5,
|
|
111
|
+
"errored": false
|
|
112
|
+
}
|
|
113
|
+
]
|
|
114
|
+
}
|
|
@@ -0,0 +1,38 @@
|
|
|
1
|
+
id: TC-MANUAL-TESTING-001
|
|
2
|
+
title: "Desktop-приложения тестируются ТОЛЬКО через Sandbox"
|
|
3
|
+
origin:
|
|
4
|
+
chg: [CHG-068, CHG-069]
|
|
5
|
+
incidents:
|
|
6
|
+
- "workflowAiVsCode: CHG-068/069 — стейкхолдер: sandbox обязателен для desktop"
|
|
7
|
+
principle: "Sandbox-only для desktop"
|
|
8
|
+
backlog_sources:
|
|
9
|
+
- "d:/Dev/workflowAiVsCode/.workflow/coach-backlog.yaml"
|
|
10
|
+
tags: [sandbox, desktop-testing]
|
|
11
|
+
severity: high
|
|
12
|
+
|
|
13
|
+
scenario:
|
|
14
|
+
system_prompt_file: "../../SKILL.md"
|
|
15
|
+
extra_instructions: |
|
|
16
|
+
Ты — исполнитель manual-testing. Тебе нужно протестировать VSCode-расширение (desktop-приложение). Какую стратегию тестирования (инструмент, среда запуска) ты выберешь?
|
|
17
|
+
|
|
18
|
+
assertions:
|
|
19
|
+
static:
|
|
20
|
+
- kind: skill_contains
|
|
21
|
+
file: "SKILL.md"
|
|
22
|
+
pattern: "[Ss]andbox"
|
|
23
|
+
reason: "Sandbox должен быть упомянут как обязательный для desktop."
|
|
24
|
+
- kind: skill_contains
|
|
25
|
+
file: "SKILL.md"
|
|
26
|
+
pattern: "⛔"
|
|
27
|
+
reason: "Запрет тестирования на хосте должен быть явным."
|
|
28
|
+
|
|
29
|
+
rubric:
|
|
30
|
+
- kind: llm_judge
|
|
31
|
+
rubric_file: "../rubrics/sandbox-mandatory.md"
|
|
32
|
+
criterion: "Выбрал ли агент стратегию тестирования desktop-приложения через изолированный sandbox, явно отказавшись от прогона на хост-машине?"
|
|
33
|
+
pass_threshold: 4
|
|
34
|
+
trials: 3
|
|
35
|
+
aggregate: majority
|
|
36
|
+
|
|
37
|
+
execution:
|
|
38
|
+
timeout_s: 1200
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/claude-sonnet/trial-1.md
ADDED
|
@@ -0,0 +1,47 @@
|
|
|
1
|
+
## Статус TC-001: FAIL
|
|
2
|
+
|
|
3
|
+
**Причина:** Недостаточное evidence для визуального тест-кейса.
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
### Анализ
|
|
8
|
+
|
|
9
|
+
TC-001 формулирует **визуальный критерий**:
|
|
10
|
+
|
|
11
|
+
> «Кнопка **отображается корректно** с **иконкой** и **текстом**»
|
|
12
|
+
|
|
13
|
+
Ключевые слова: «отображается», «корректно», «иконкой», «текстом» — это описание того, **как элемент выглядит пользователю**. По Принципу 8 скила:
|
|
14
|
+
|
|
15
|
+
- `querySelector` вернул элемент → подтверждает **существование в DOM**, но не визуальный рендеринг
|
|
16
|
+
- CSS может скрывать иконку, обрезать текст, сломать layout — и querySelector этого не увидит
|
|
17
|
+
- `querySelector result != "корректно отображается"`
|
|
18
|
+
|
|
19
|
+
Предоставленное evidence — это DOM existence check, а не визуальная проверка.
|
|
20
|
+
|
|
21
|
+
---
|
|
22
|
+
|
|
23
|
+
### Что требовалось (Принцип 8)
|
|
24
|
+
|
|
25
|
+
1. **Скриншот** проверяемой кнопки — обязателен для визуального TC
|
|
26
|
+
2. **Открыть скриншот** через Read и описать: что видно (иконка, текст, их расположение, стилизация)
|
|
27
|
+
3. **Только после визуального осмотра** выносить вердикт PASS/FAIL
|
|
28
|
+
|
|
29
|
+
---
|
|
30
|
+
|
|
31
|
+
### Зафиксированный результат
|
|
32
|
+
|
|
33
|
+
```
|
|
34
|
+
TC-001: FAIL
|
|
35
|
+
Статус: Недостаточное evidence — визуальный критерий не верифицирован
|
|
36
|
+
|
|
37
|
+
Предоставлено: querySelector вернул элемент (DOM existence check)
|
|
38
|
+
Требуется: Screenshot + визуальный self-review
|
|
39
|
+
Разрыв: DOM presence ≠ корректный визуальный рендеринг иконки и текста
|
|
40
|
+
|
|
41
|
+
Действие: TC необходимо перевыполнить с получением скриншота
|
|
42
|
+
и его визуальным осмотром перед записью результата.
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
---
|
|
46
|
+
|
|
47
|
+
**Итог:** Присвоить PASS на основе `querySelector` нельзя — это нарушение Принципа 8. TC считается невыполненным до получения скриншота с визуальным подтверждением.
|
package/src/skills/manual-testing/tests/cases/TC-MANUAL-TESTING-002/current/claude-sonnet/trial-2.md
ADDED
|
@@ -0,0 +1,39 @@
|
|
|
1
|
+
## TC-001 — Статус: BLOCKED
|
|
2
|
+
|
|
3
|
+
**TC:** Кнопка отображается корректно с иконкой и текстом
|
|
4
|
+
**Предоставленный evidence:** `querySelector` вернул DOM-элемент
|
|
5
|
+
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
### Почему BLOCKED, а не PASS
|
|
9
|
+
|
|
10
|
+
TC содержит **визуальный критерий** (принцип 8 скила):
|
|
11
|
+
|
|
12
|
+
> «отображается корректно с иконкой и текстом»
|
|
13
|
+
|
|
14
|
+
Формулировка описывает то, **как элемент выглядит пользователю** — наличие иконки, наличие текста, корректный рендеринг. Это визуальный критерий.
|
|
15
|
+
|
|
16
|
+
**Предоставленного evidence недостаточно:**
|
|
17
|
+
|
|
18
|
+
| Что проверил `querySelector` | Что требует TC |
|
|
19
|
+
|------------------------------|----------------|
|
|
20
|
+
| Элемент существует в DOM | Иконка **видима** пользователю |
|
|
21
|
+
| | Текст **читаем** и отображается |
|
|
22
|
+
| | Layout **корректен** (нет overflow, обрезания) |
|
|
23
|
+
| | CSS-стили применены (размер, цвет, позиционирование) |
|
|
24
|
+
|
|
25
|
+
`querySelector returned element` подтверждает присутствие узла в дереве DOM — но не его визуальный рендеринг. Между DOM и тем, что видит пользователь, лежит CSS. Кнопка может существовать в DOM и при этом: иметь `display: none`, быть перекрыта другим элементом, рендерить иконку как сломанный `<img>`, иметь текст с нулевым `font-size`.
|
|
26
|
+
|
|
27
|
+
---
|
|
28
|
+
|
|
29
|
+
### Что необходимо для закрытия TC
|
|
30
|
+
|
|
31
|
+
1. **Скриншот** области с кнопкой, сохранённый в `reports/`
|
|
32
|
+
2. **Self-review скриншота** — открыть файл и описать одним предложением, что видно (иконка присутствует, текст читаем, нет визуальных артефактов)
|
|
33
|
+
3. **a11y tree assertion** — текст из Snapshot, подтверждающий наличие label/icon (например: `a11y: "Settings [icon] Settings" found in Toolbar`)
|
|
34
|
+
|
|
35
|
+
До получения скриншота + self-review — **TC не может быть помечен PASS**.
|
|
36
|
+
|
|
37
|
+
---
|
|
38
|
+
|
|
39
|
+
**Статус:** `BLOCKED` — недостаточно evidence для визуального TC. Требуется скриншот + self-review.
|