cokit-cli 1.2.3 → 1.2.6

Files changed (79)
  1. package/README.md +6 -7
  2. package/agents/brainstormer.agent.md +9 -2
  3. package/agents/code-reviewer.agent.md +59 -84
  4. package/agents/code-simplifier.agent.md +9 -6
  5. package/agents/debugger.agent.md +17 -8
  6. package/agents/docs-manager.agent.md +104 -8
  7. package/agents/fullstack-developer.agent.md +57 -13
  8. package/agents/git-manager.agent.md +2 -382
  9. package/agents/planner.agent.md +36 -8
  10. package/agents/researcher.agent.md +18 -3
  11. package/agents/tester.agent.md +13 -14
  12. package/agents/ui-ux-designer.agent.md +209 -33
  13. package/docs/README.md +4 -3
  14. package/docs/claudekit-porting-rules.md +182 -0
  15. package/docs/codebase-summary.md +11 -10
  16. package/docs/cokit-comprehensive-mapping-guide.md +4 -4
  17. package/docs/cokit-slides.md +1 -1
  18. package/docs/cokit-sync-and-maintenance-guide.md +8 -3
  19. package/docs/cokit-team-presentation.md +5 -5
  20. package/docs/guide-next-steps-speckit-cokit-implementation.md +1 -1
  21. package/docs/project-overview-pdr.md +1 -1
  22. package/docs/project-roadmap.md +6 -7
  23. package/package.json +1 -1
  24. package/prompts/ck-ask.prompt.md +1 -1
  25. package/prompts/ck-bootstrap.prompt.md +1 -1
  26. package/prompts/ck-cook.prompt.md +12 -12
  27. package/prompts/ck-plan-fast.prompt.md +1 -0
  28. package/prompts/ck-plan-hard.prompt.md +2 -1
  29. package/prompts/ck-plan-red-team.prompt.md +227 -0
  30. package/prompts/ck-plan.prompt.md +1 -0
  31. package/prompts/ck-simplify.prompt.md +1 -1
  32. package/skills/code-review/SKILL.md +78 -28
  33. package/skills/cook/SKILL.md +45 -11
  34. package/skills/debug/SKILL.md +112 -17
  35. package/skills/fix/SKILL.md +20 -8
  36. package/skills/frontend-design/SKILL.md +6 -3
  37. package/skills/planning/SKILL.md +47 -15
  38. package/skills/research/SKILL.md +1 -1
  39. package/skills/scout/SKILL.md +24 -11
  40. package/skills/web-testing/SKILL.md +60 -6
  41. package/skills/web-testing/references/report-format.md +57 -0
  42. package/skills/web-testing/references/test-execution-workflow.md +118 -0
  43. package/skills/web-testing/references/ui-testing-workflow.md +97 -0
  44. package/templates/repo/.github/agents/brainstormer.agent.md +9 -2
  45. package/templates/repo/.github/agents/code-reviewer.agent.md +59 -84
  46. package/templates/repo/.github/agents/code-simplifier.agent.md +9 -6
  47. package/templates/repo/.github/agents/debugger.agent.md +17 -8
  48. package/templates/repo/.github/agents/docs-manager.agent.md +104 -8
  49. package/templates/repo/.github/agents/fullstack-developer.agent.md +57 -13
  50. package/templates/repo/.github/agents/git-manager.agent.md +2 -382
  51. package/templates/repo/.github/agents/planner.agent.md +36 -8
  52. package/templates/repo/.github/agents/researcher.agent.md +18 -3
  53. package/templates/repo/.github/agents/tester.agent.md +13 -14
  54. package/templates/repo/.github/agents/ui-ux-designer.agent.md +209 -33
  55. package/templates/repo/.github/prompts/ck-ask.prompt.md +1 -1
  56. package/templates/repo/.github/prompts/ck-bootstrap.prompt.md +1 -1
  57. package/templates/repo/.github/prompts/ck-cook.prompt.md +12 -12
  58. package/templates/repo/.github/prompts/ck-plan-fast.prompt.md +1 -0
  59. package/templates/repo/.github/prompts/ck-plan-hard.prompt.md +2 -1
  60. package/templates/repo/.github/prompts/ck-plan-red-team.prompt.md +227 -0
  61. package/templates/repo/.github/prompts/ck-plan.prompt.md +1 -0
  62. package/templates/repo/.github/prompts/ck-simplify.prompt.md +1 -1
  63. package/templates/repo/.github/prompts/ck-spec-specify.prompt.md +1 -0
  64. package/templates/repo/.github/skills/code-review/SKILL.md +78 -28
  65. package/templates/repo/.github/skills/cook/SKILL.md +45 -11
  66. package/templates/repo/.github/skills/debug/SKILL.md +112 -17
  67. package/templates/repo/.github/skills/fix/SKILL.md +20 -8
  68. package/templates/repo/.github/skills/frontend-design/SKILL.md +6 -3
  69. package/templates/repo/.github/skills/planning/SKILL.md +47 -15
  70. package/templates/repo/.github/skills/research/SKILL.md +1 -1
  71. package/templates/repo/.github/skills/scout/SKILL.md +24 -11
  72. package/templates/repo/.github/skills/web-testing/SKILL.md +60 -6
  73. package/templates/repo/.github/skills/web-testing/references/report-format.md +57 -0
  74. package/templates/repo/.github/skills/web-testing/references/test-execution-workflow.md +118 -0
  75. package/templates/repo/.github/skills/web-testing/references/ui-testing-workflow.md +97 -0
  76. package/prompts/ck-journal.prompt.md +0 -19
  77. package/prompts/ck-preview.prompt.md +0 -77
  78. package/templates/repo/.github/prompts/ck-journal.prompt.md +0 -19
  79. package/templates/repo/.github/prompts/ck-preview.prompt.md +0 -77
@@ -1,12 +1,11 @@
  ---
  name: debug
- description: Debug systematically with root cause analysis before fixes. Use for bugs, test failures, unexpected behavior, performance issues, call stack tracing, multi-layer validation.
- languages: all
+ description: Debug systematically with root cause analysis before fixes. Covers bugs, test failures, log analysis, CI/CD failures, database diagnostics, system investigation, performance issues, call stack tracing, multi-layer validation.
  ---
 
- # Debugging
+ # Debugging & System Investigation
 
- Comprehensive debugging framework combining systematic investigation, root cause tracing, defense-in-depth validation, and verification protocols.
+ Comprehensive debugging framework combining systematic investigation, root cause tracing, defense-in-depth validation, verification protocols, and system-level diagnostics.
 
  ## Core Principle
 
@@ -16,29 +15,29 @@ Random fixes waste time and create new bugs. Find the root cause, fix at source,
 
  ## When to Use
 
- **Always use for:** Test failures, bugs, unexpected behavior, performance issues, build failures, integration problems, before claiming work complete
+ **Code-level:** Test failures, bugs, unexpected behavior, build failures, integration problems, before claiming work complete
+
+ **System-level:** CI/CD pipeline failures, log analysis, database diagnostics, performance bottlenecks, infrastructure issues
 
  **Especially when:** Under time pressure, "quick fix" seems obvious, tried multiple fixes, don't fully understand issue, about to claim success
 
- ## The Four Techniques
+ ## Techniques
 
  ### 1. Systematic Debugging (`references/systematic-debugging.md`)
 
- Four-phase framework ensuring proper investigation:
+ Four-phase framework:
  - Phase 1: Root Cause Investigation (read errors, reproduce, check changes, gather evidence)
  - Phase 2: Pattern Analysis (find working examples, compare, identify differences)
  - Phase 3: Hypothesis and Testing (form theory, test minimally, verify)
  - Phase 4: Implementation (create test, fix once, verify)
 
- **Key rule:** Complete each phase before proceeding. No fixes without Phase 1.
+ Complete each phase before proceeding. No fixes without Phase 1.
 
  **Load when:** Any bug/issue requiring investigation and fix
 
  ### 2. Root Cause Tracing (`references/root-cause-tracing.md`)
 
- Trace bugs backward through call stack to find original trigger.
-
- **Technique:** When error appears deep in execution, trace backward level-by-level until finding source where invalid data originated. Fix at source, not at symptom.
+ Trace bugs backward through call stack to find original trigger. Fix at source, not at symptom.
 
  **Includes:** `scripts/find-polluter.sh` for bisecting test pollution
 
@@ -46,9 +45,7 @@ Trace bugs backward through call stack to find original trigger.
 
  ### 3. Defense-in-Depth (`references/defense-in-depth.md`)
 
- Validate at every layer data passes through. Make bugs impossible.
-
- **Four layers:** Entry validation → Business logic → Environment guards → Debug instrumentation
+ Validate at every layer data passes through. Four layers: Entry validation → Business logic → Environment guards → Debug instrumentation
 
  **Load when:** After finding root cause, need to add comprehensive validation
 
@@ -58,19 +55,117 @@ Run verification commands and confirm output before claiming success.
 
  **Iron law:** NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
 
- Run the command. Read the output. Then claim the result.
-
  **Load when:** About to claim work complete, fixed, or passing
 
+ ### 5. Investigation Methodology
+
+ For system-level issues (CI/CD, infrastructure, data pipeline):
+
+ 1. **Scope** - Define what is broken and what is working
+ 2. **Gather** - Collect logs, metrics, error outputs before touching anything
+ 3. **Isolate** - Narrow to smallest reproducible case
+ 4. **Hypothesize** - Form one theory, test it, reject or confirm
+ 5. **Fix & Validate** - Fix at root, verify at every affected layer
+
+ **Load when:** Issue is not code-local — spans services, environments, or pipelines
+
+ ### 6. Log & CI/CD Analysis
+
+ Use `gh` CLI and structured queries to diagnose pipeline failures:
+
+ ```bash
+ # View failed CI run logs
+ gh run view <run-id> --log-failed
+
+ # List recent runs for a workflow
+ gh run list --workflow=<name> --limit 10
+
+ # Watch a running workflow
+ gh run watch <run-id>
+ ```
+
+ For structured logs: filter by severity, timestamp range, and correlation ID before reading raw output.
+
+ **Load when:** CI/CD failure, deployment issue, or log-driven investigation
+
+ ### 7. Performance Diagnostics
+
+ Identify bottlenecks before optimizing:
+ - Profile first — measure before guessing
+ - Check slow queries with `EXPLAIN ANALYZE` (PostgreSQL) or equivalent
+ - Identify N+1 query patterns in ORM usage
+ - Check memory allocation patterns for leaks
+ - Use `psql` for live database diagnostics
+
+ **Load when:** Slowness reported, timeout errors, resource exhaustion
+
+ ### 8. Reporting Standards
+
+ For multi-component investigations, write a structured diagnostic report:
+
+ ```
+ ## Diagnostic Report
+ - **Issue:** [one-line description]
+ - **Root Cause:** [where and why it fails]
+ - **Evidence:** [logs, output, reproduction steps]
+ - **Fix Applied:** [what was changed]
+ - **Verification:** [command run + result]
+ - **Remaining Risk:** [any open questions]
+ ```
+
+ Save to `plans/reports/debugger-{date}-{slug}.md`.
+
+ **Load when:** Investigation spans multiple components or will be shared with others
+
+ ### 9. Task Management
+
+ For multi-component investigations, track progress with a checklist rather than holding state mentally:
+
+ ```
+ - [ ] Reproduce the issue
+ - [ ] Identify root cause
+ - [ ] Fix applied
+ - [ ] Tests passing
+ - [ ] Verification complete
+ ```
+
+ Add this checklist to the active plan or investigation report. Check items off as each step completes.
+
+ **Load when:** Investigation touches 3+ components or files
+
+ ### 10. Frontend Verification
+
+ For visual bugs or UI regressions, use browser developer tools (or the `agent-browser` skill) to inspect rendering, network, and console errors directly in the browser.
+
+ Use `/ck-scout ext` to search for frontend-specific patterns before diving into devtools.
+
+ **Load when:** Visual regression, layout bug, client-side network error, or UI behavior that differs from expected
+
  ## Quick Reference
 
  ```
- Bug → systematic-debugging.md (Phase 1-4)
+ Code bug → systematic-debugging.md (Phase 1-4)
  Error deep in stack? → root-cause-tracing.md (trace backward)
  Found root cause? → defense-in-depth.md (add layers)
  About to claim success? → verification.md (verify first)
+
+ System issue → Investigation Methodology (5 steps)
+ CI/CD failure? → Log & CI/CD Analysis (gh CLI)
+ Slow/timeout? → Performance Diagnostics
+ Multi-component? → Task Management checklist + Reporting Standards
+ Visual/UI bug? → Frontend Verification (agent-browser / browser devtools)
  ```
 
+ ## Tools Integration
+
+ | Tool | Use Case |
+ |------|----------|
+ | `execute` | Run test commands, build scripts, verification steps |
+ | `gh` CLI | CI/CD log analysis, PR checks, workflow runs |
+ | `psql` | Live database diagnostics and slow query analysis |
+ | `agent-browser` skill | Frontend visual verification and network inspection |
+ | `/ck-scout` | Search codebase for related patterns before investigating |
+
  ## Red Flags
 
  Stop and follow process if thinking:
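
Reviewer note on the debug skill changes: the added "Log & CI/CD Analysis" section advises filtering structured logs by severity, timestamp range, and correlation ID before reading raw output. A minimal Python sketch of such a filter follows. It assumes JSON-lines logs with `ts`, `level`, and `correlation_id` fields. Those field names, the severity ordering, and the `filter_logs` helper are illustrative assumptions and are not part of cokit-cli.

```python
import json
from datetime import datetime

# Assumed severity names and ordering; adjust to the logger actually in use.
SEVERITY = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40, "CRITICAL": 50}


def filter_logs(lines, min_level="ERROR", start=None, end=None, correlation_id=None):
    """Return parsed records matching severity, time range, and correlation ID."""
    threshold = SEVERITY[min_level]
    matches = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # Not structured; leave it for a later raw read.
        if not isinstance(record, dict):
            continue
        if SEVERITY.get(record.get("level"), 0) < threshold:
            continue
        if correlation_id is not None and record.get("correlation_id") != correlation_id:
            continue
        if start is not None or end is not None:
            raw_ts = record.get("ts")
            if raw_ts is None:
                continue  # Cannot place the record in the window.
            # Timestamps and bounds must be consistently timezone-aware (or naive).
            ts = datetime.fromisoformat(raw_ts)
            if start is not None and ts < start:
                continue
            if end is not None and ts > end:
                continue
        matches.append(record)
    return matches


# Hypothetical sample data for illustration only.
SAMPLE = [
    '{"ts": "2024-05-01T10:00:00+00:00", "level": "INFO", "correlation_id": "req-1", "msg": "start"}',
    '{"ts": "2024-05-01T10:00:05+00:00", "level": "ERROR", "correlation_id": "req-1", "msg": "db timeout"}',
    '{"ts": "2024-05-01T10:00:07+00:00", "level": "ERROR", "correlation_id": "req-2", "msg": "bad input"}',
    "plain text line that is not JSON",
]

hits = filter_logs(SAMPLE, min_level="ERROR", correlation_id="req-1")
print([h["msg"] for h in hits])
```

Narrowing by correlation ID first usually reduces a noisy log to a single request's trail, which matches the "Isolate" step of the Investigation Methodology added in the same hunk.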
@@ -12,6 +12,7 @@ Unified skill for fixing issues of any complexity with intelligent routing.
  - `--auto` - Activate autonomous mode (**default**)
  - `--review` - Activate human-in-the-loop review mode
  - `--quick` - Activate quick mode
+ - `--parallel` - Fix 2+ independent issues concurrently using parallel agents
 
  ## Workflow
 
@@ -31,10 +32,10 @@ See `references/mode-selection.md` for question format.
 
  - Activate `debug` skill.
  - Guess all possible root causes.
- - Search in parallel to verify each hypothesis.
+ - Spawn multiple parallel search agents to verify each hypothesis.
  - Create report with all findings for the next step.
 
- ### Step 3: Complexity Assessment & Fix Implementation
+ ### Step 3: Complexity Assessment & Task Orchestration
 
  Classify before routing. See `references/complexity-assessment.md`.
 
@@ -45,17 +46,27 @@ Classify before routing. See `references/complexity-assessment.md`.
  | **Complex** | System-wide, architecture impact | `references/workflow-deep.md` |
  | **Parallel** | 2+ independent issues | Parallel `fullstack-developer` agents |
 
- ### Step 4: Fix Verification & Prevent Future Issues
+ **Task orchestration notes:**
+ - **Quick workflow:** Skip task creation — proceed directly to fix.
+ - **Moderate+ workflows:** After classifying, create a todo checklist for all phases upfront with dependencies before starting any implementation. Track each phase with checkboxes and note blockers inline.
+
+ See `references/task-orchestration.md` for checklist structure and dependency tracking patterns.
+
+ ### Step 4: Fix Implementation & Verification
 
  - Read and analyze all the implemented changes.
- - Search in parallel to find possible related code for verification.
+ - Spawn multiple parallel search agents to verify no regressions in related code.
  - Make sure these fixes don't break other parts of the codebase.
  - Prevent future issues by adding comprehensive validation.
 
  ### Step 5: Finalize
 
- - Report summary to user with confidence level/score, all the changes and related files.
- - Ask to commit via `git-manager` agent and update docs if needed via `docs-manager` agent (in parallel).
+ **MANDATORY always execute all steps:**
+
+ 1. Report summary to user with confidence score (0–10), all changes, and related files.
+ 2. Update `./docs` via ``docs-manager`` agent (NON-OPTIONAL — always run even for small fixes).
+ 3. Mark all checklist tasks complete.
+ 4. Ask user to commit via ``git-manager`` agent.
 
  ---
 
@@ -65,8 +76,8 @@ See `references/skill-activation-matrix.md` for complete matrix.
 
  **Always activate:** `debug` (all workflows)
  **Conditional:** `problem-solving`, `sequential-thinking`, `brainstorming`, `context-engineering`
- **Agents:** `debugger`, `researcher`, `planner`, `code-reviewer`, `tester`
- **Parallel:** Multiple parallel searches for scouting, terminal commands for verification
+ **Agents:** ``debugger``, ``researcher``, ``planner``, ``code-reviewer``, ``tester``
+ **Parallel patterns:** Multiple parallel search agents for scouting; parallel terminal commands for verification; parallel ``fullstack-developer`` agents for independent issues (`--parallel` flag)
 
  ## Output Format
 
@@ -85,6 +96,7 @@ Unified step markers:
  Load as needed:
  - `references/mode-selection.md` - Mode selection question format
  - `references/complexity-assessment.md` - Classification criteria
+ - `references/task-orchestration.md` - Todo checklist structure and dependency tracking
  - `references/workflow-quick.md` - Quick: debug → fix → review
  - `references/workflow-standard.md` - Standard: full pipeline
  - `references/workflow-deep.md` - Deep: research + brainstorm + plan
@@ -1,6 +1,6 @@
  ---
  name: frontend-design
- description: Create polished frontend interfaces from designs/screenshots. Use for web components, replicating UI designs, quick prototypes, avoiding AI slop.
+ description: Create polished frontend interfaces from designs/screenshots/videos. Use for web components, replicating UI designs, quick prototypes, 3D experiences, immersive interfaces, avoiding AI slop.
  ---
 
  Create distinctive, production-grade frontend interfaces. Implement real working code with exceptional aesthetic attention.
@@ -12,15 +12,18 @@ Choose workflow based on input type:
  | Input | Workflow | Reference |
  |-------|----------|-----------|
  | Screenshot | Replicate exactly | `./references/workflow-screenshot.md` |
+ | Video | Replicate with animations | `./references/workflow-video.md` |
  | Quick task | Rapid implementation | `./references/workflow-quick.md` |
  | Describe only | Document for devs | `./references/workflow-describe.md` |
+ | 3D/WebGL request | Three.js immersive | `./references/workflow-3d.md` |
+ | Complex/award-quality | Full immersive | `./references/workflow-immersive.md` |
  | From scratch | Design Thinking below | - |
 
  **All workflows**: Activate `ui-styling` skill FIRST for design patterns and component library.
 
  ## Screenshot/Video Replication (Quick Reference)
 
- 1. **Analyze** visually - extract colors, fonts, spacing, effects
+ 1. **Analyze** visually (use `ai-multimodal` skill if available) - extract colors, fonts, spacing, effects
  2. **Plan** with `ui-ux-designer` agent - create phased implementation
  3. **Implement** - match source precisely
  4. **Verify** - compare to original
@@ -45,7 +48,7 @@ Before coding, commit to a BOLD aesthetic direction:
  - **Motion**: CSS-first, anime.js for complex (`./references/animejs.md`). Orchestrated page loads > scattered micro-interactions.
  - **Spatial**: Unexpected layouts. Asymmetry. Overlap. Negative space OR controlled density.
  - **Backgrounds**: Atmosphere over solid colors. Gradients, noise, patterns, shadows, grain.
- - **Assets**: Process with ImageMagick, FFmpeg, RMBG CLI tools
+ - **Assets**: Process with ImageMagick, FFmpeg, RMBG CLI tools (or `ai-multimodal`/`media-processing` skills if available)
 
  ## Asset & Analysis References
 
@@ -7,6 +7,22 @@ description: Plan implementations, design architectures, create technical roadma
 
  Create detailed technical implementation plans through research, codebase analysis, solution design, and comprehensive documentation.
 
+ ## Workflow Modes
+
+ Default: `--auto` — analyze the task and auto-pick the most appropriate mode.
+
+ | Flag | Research | Red Team | Validation | Cook Flag |
+ |------|----------|----------|------------|-----------|
+ | `--auto` | Auto | Auto | Auto | auto |
+ | `--fast` | Skip | Skip | Skip | fast |
+ | `--hard` | Full | Yes | Yes | hard |
+ | `--parallel` | Parallel | Yes | Yes | parallel |
+ | `--two` | Full | Yes | Yes | two |
+
+ Add `--no-tasks` to any mode to skip todo checklist hydration after the plan is written.
+
+ See `references/workflow-modes.md` for detailed mode behavior.
+
  ## When to Use
 
  Use this skill when:
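
Reviewer note on the Workflow Modes table added in this hunk: the flag-to-behavior mapping can be read as a small lookup. The Python sketch below copies the table rows as data. The `resolve_mode` helper and its rejection of multiple mode flags are assumptions made for illustration; the actual rules live in `references/workflow-modes.md`, which this diff does not show.

```python
# Rows copied from the Workflow Modes table; "auto" means the skill decides per task.
MODES = {
    "--auto": {"research": "auto", "red_team": "auto", "validation": "auto", "cook_flag": "auto"},
    "--fast": {"research": "skip", "red_team": "skip", "validation": "skip", "cook_flag": "fast"},
    "--hard": {"research": "full", "red_team": "yes", "validation": "yes", "cook_flag": "hard"},
    "--parallel": {"research": "parallel", "red_team": "yes", "validation": "yes", "cook_flag": "parallel"},
    "--two": {"research": "full", "red_team": "yes", "validation": "yes", "cook_flag": "two"},
}


def resolve_mode(flags):
    """Return the settings for a list of flags, defaulting to --auto."""
    chosen = [flag for flag in flags if flag in MODES]
    if len(chosen) > 1:
        # Assumption: combining mode flags is treated as an error here.
        raise ValueError(f"conflicting mode flags: {chosen}")
    mode = chosen[0] if chosen else "--auto"
    settings = dict(MODES[mode])
    settings["mode"] = mode
    # --no-tasks may be added to any mode to skip checklist hydration.
    settings["hydrate_tasks"] = "--no-tasks" not in flags
    return settings


print(resolve_mode(["--hard", "--no-tasks"]))
```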
@@ -41,12 +57,15 @@ Load: `references/output-standards.md`
 
  ## Workflow Process
 
- 1. **Initial Analysis** → Read codebase docs, understand context
- 2. **Research Phase** → Spawn researchers, investigate approaches
- 3. **Synthesis** → Analyze reports, identify optimal solution
- 4. **Design Phase** → Create architecture, implementation design
- 5. **Plan Documentation** → Write comprehensive plan
- 6. **Review & Refine** → Ensure completeness, clarity, actionability
+ 1. **Pre-Creation Check** → Check `## Plan Context` from hook injection; follow Active Plan State rules below.
+ 2. **Mode Detection** → Use explicit flag if provided; otherwise auto-detect based on task complexity.
+ 3. **Research Phase** → Spawn parallel researcher agents to investigate approaches (skip in `--fast` mode).
+ 4. **Codebase Analysis** → Read docs in `./docs`; activate `/ck-scout` if file relationships are unclear.
+ 5. **Plan Documentation** → Write comprehensive plan via `planner` agent using the directory structure below.
+ 6. **Red Team Review** → Spawn adversarial reviewers to challenge assumptions (`--hard`, `--parallel`, `--two` modes only). See `references/workflow-modes.md`.
+ 7. **Post-Plan Validation** → Use `/ck-plan-validate` to verify completeness and coherence (`--hard`, `--parallel`, `--two` modes only).
+ 8. **Hydrate Tasks** → Create a todo checklist from plan phases with dependency annotations (default on; skip with `--no-tasks` or fewer than 3 phases).
+ 9. **Context Reminder** → Output the cook command with the absolute plan path (MANDATORY): `Use plan at: {absolute-plan-dir-path}`
 
  ## Output Requirements
 
@@ -57,13 +76,15 @@ Load: `references/output-standards.md`
  - Provide multiple options with trade-offs when appropriate
  - Fully respect the `./docs/development-rules.md` file.
 
- ## Task Integration (Optional)
+ ## Task Management
 
- When session has `TASK_LIST_ID` set (active plan):
- - Create tasks for each phase with clear subjects
- - Set dependencies: Phase N+1 `blockedBy` Phase N
- - Agents coordinate via shared task list automatically
- - Update tasks to mark progress (in_progress completed)
+ Plan files are persistent on disk. Todo checklists are session-scoped. Hydration bridges the gap by converting plan phases into trackable checklist items at plan-creation time.
+
+ - **Default:** Auto-hydrate after plan is written (create checklist with one item per phase).
+ - **Skip with:** `--no-tasks` flag or when plan has fewer than 3 phases (3-Task Rule).
+ - **Checklist format:** Include phase name, dependencies, and owning agent hint per item.
+
+ See `references/task-management.md` for checklist schema and dependency notation.
 
  ### Important
  DO NOT create plans or reports in USER directory.
@@ -95,8 +116,8 @@ Prevents version proliferation by tracking current working plan via session stat
  ### Active vs Suggested Plans
 
  Check the `## Plan Context` section injected by hooks:
- - **"Plan: {path}"** = Active plan, explicitly set via `set-active-plan.cjs` - use for reports
- - **"Suggested: {path}"** = Branch-matched, hint only - do NOT auto-use
+ - **"Plan: {path}"** = Active plan, explicitly set via `set-active-plan.cjs` use this path for all reports
+ - **"Suggested: {path}"** = Branch-matched hint only do NOT auto-use
  - **"Plan: none"** = No active plan
 
  ### Rules
@@ -117,7 +138,7 @@ All agents writing reports MUST:
  DO NOT create plans or reports in USER directory.
  ALWAYS create plans or reports in CURRENT WORKING PROJECT DIRECTORY.
 
- **Important:** Suggested plans do NOT get plan-specific reports - this prevents pollution of old plan folders.
+ **Important:** Suggested plans do NOT get plan-specific reports this prevents pollution of old plan folders.
 
  ## Quality Standards
 
@@ -129,3 +150,14 @@ ALWAYS create plans or reports in CURRENT WORKING PROJECT DIRECTORY.
  - Validate against existing codebase patterns
 
  **Remember:** Plan quality determines implementation success. Be comprehensive and consider all solution aspects.
+
+ ## References
+
+ Load as needed:
+ - `references/workflow-modes.md` - Mode behavior details and flag descriptions
+ - `references/task-management.md` - Checklist schema, dependency notation, hydration rules
+ - `references/research-phase.md` - Research phase execution
+ - `references/codebase-understanding.md` - Codebase analysis steps
+ - `references/solution-design.md` - Solution design process
+ - `references/plan-organization.md` - Plan file structure and organization
+ - `references/output-standards.md` - Task breakdown and output format standards
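
Reviewer note on the planning skill changes: the new Task Management section describes hydration as converting plan phases into checklist items, skipped when `--no-tasks` is set or the plan has fewer than 3 phases. A minimal Python sketch of that rule follows. The checklist line format, the sequential "blocked by" chain, and the `hydrate_checklist` name are assumptions; the real schema is in `references/task-management.md`, which this diff does not show.

```python
def hydrate_checklist(phases, no_tasks=False):
    """Turn plan phases into markdown checklist lines, or return [] when skipped."""
    # Skip rule from the diff: --no-tasks flag, or fewer than 3 phases.
    if no_tasks or len(phases) < 3:
        return []
    items = []
    previous = None
    for phase in phases:
        # Assumption: each phase depends on the one before it.
        blocked_by = previous if previous is not None else "none"
        items.append(
            f"- [ ] {phase['name']} (blocked by: {blocked_by}; agent: {phase['agent']})"
        )
        previous = phase["name"]
    return items


# Hypothetical plan used only for illustration.
PLAN = [
    {"name": "Phase 1: Schema", "agent": "fullstack-developer"},
    {"name": "Phase 2: API", "agent": "fullstack-developer"},
    {"name": "Phase 3: Tests", "agent": "tester"},
]

for line in hydrate_checklist(PLAN):
    print(line)
```

Each item carries the three fields the section asks for: phase name, dependencies, and an owning agent hint.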
@@ -24,7 +24,7 @@ You will employ a multi-source research strategy:
 
  1. **Search Strategy**:
  - **Gemini Toggle**: Check `$HOME/.copilot/.ck.json` (or `~/.copilot/.ck.json`) for `skills.research.useGemini` (default: `true`). If `false`, skip Gemini and use WebSearch.
- - **Gemini Model**: Read from `$HOME/.copilot/.ck.json`: `gemini.model` (default: `gemini-3.0-flash`)
+ - **Gemini Model**: Read from `$HOME/.copilot/.ck.json`: `gemini.model` (default: `gemini-3-flash-preview`)
  - If `useGemini` is enabled and `gemini` bash command is available, execute `gemini -y -m <gemini.model> "...your search prompt..."` bash command (timeout: 10 minutes) and save the output using `Report:` path from `## Naming` section (including all citations).
  - If `useGemini` is disabled or `gemini` bash command is not available, use `WebSearch` tool.
  - Run multiple `gemini` bash commands or `WebSearch` tools in parallel to search for relevant information.
@@ -29,7 +29,7 @@ Fast, token-efficient codebase scouting using parallel agents to find files need
  ## Configuration
 
  Read from `$HOME/.copilot/.ck.json`:
- - `gemini.model` - Gemini model (default: `gemini-3.0-flash`)
+ - `gemini.model` - Gemini model (default: `gemini-3-flash-preview`)
 
  ## Workflow
 
@@ -43,21 +43,33 @@ Read from `$HOME/.copilot/.ck.json`:
  - Assign each agent specific directories or patterns
  - Ensure no overlap, maximize coverage
 
- ### 3. Spawn Parallel Agents
+ ### 3. Register Scout Tasks
+
+ **Skip this step if agent count <= 2.**
+
+ - Check for existing scout tasks in the current session to avoid duplicates.
+ - Create a markdown checklist — one entry per agent — including scope metadata (directories, patterns assigned).
+ - Example checklist entry: `- [ ] Agent 1: src/api/*, src/models/* — searching for auth-related files`
+ - Reference: `references/task-management-scouting.md` for checklist format and metadata fields.
+
+ ### 4. Spawn Parallel Agents
+
  Load appropriate reference based on decision tree:
  - **Internal (Default):** `references/internal-scouting.md` (search agents)
  - **External:** `references/external-scouting.md` (Gemini/OpenCode)
 
  **Notes:**
- - Prompt detailed instructions for each agent with exact directories or files it should read
- - Remember that each agent has less than 200K tokens of context window
- - Amount of agents to-be-spawned depends on the current system resources available and amount of files to be scanned
- - Each agent must return a detailed summary report to a main agent
-
- ### 4. Collect Results
- - Timeout: 3 minutes per agent (skip non-responders)
- - Aggregate findings into single report
- - List unresolved questions at end
+ - Update each task to in_progress before spawning its agent.
+ - Prompt detailed instructions for each agent with exact directories or files it should read.
+ - Remember that each agent has less than 200K tokens of context window.
+ - Amount of agents to-be-spawned depends on the current system resources available and amount of files to be scanned.
+ - Each agent must return a detailed summary report to a main agent.
+
+ ### 5. Collect Results
+ - Timeout: 3 minutes per agent (skip non-responders).
+ - Update completed tasks to done; log timed-out agents in the report under "Unresolved Questions".
+ - Aggregate findings into single report.
+ - List unresolved questions at end.
 
  ## Report Format
 
@@ -76,3 +88,4 @@ Load appropriate reference based on decision tree:
 
  - `references/internal-scouting.md` - Using search agents
  - `references/external-scouting.md` - Using Gemini/OpenCode CLI
+ - `references/task-management-scouting.md` - Checklist format and scope metadata for scout tasks
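
Reviewer note on the scout skill changes: the new "Register Scout Tasks" step creates one checklist entry per agent with scope metadata, avoids duplicates, and is skipped when the agent count is 2 or fewer. The Python sketch below illustrates that logic. The entry layout is modelled loosely on the example entry in the diff, with a plain hyphen as separator; the `register_scout_tasks` helper and the tuple shape of its input are assumptions.

```python
def register_scout_tasks(assignments, existing_entries=()):
    """Build checklist entries for scout agents, skipping duplicates."""
    # Skip rule from the diff: no registration when agent count <= 2.
    if len(assignments) <= 2:
        return []
    seen = set(existing_entries)
    entries = []
    for number, (scopes, goal) in enumerate(assignments, start=1):
        entry = f"- [ ] Agent {number}: {', '.join(scopes)} - {goal}"
        if entry not in seen:  # Avoid re-registering an existing task.
            entries.append(entry)
            seen.add(entry)
    return entries


# Hypothetical assignments: (directories or patterns, search goal).
ASSIGNMENTS = [
    (["src/api/*", "src/models/*"], "searching for auth-related files"),
    (["src/ui/*"], "searching for login components"),
    (["tests/*"], "searching for auth test coverage"),
]

entries = register_scout_tasks(ASSIGNMENTS)
for entry in entries:
    print(entry)
```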
@@ -1,23 +1,52 @@
  ---
  name: web-testing
- description: Web testing with Playwright, Vitest, k6. E2E/unit/integration/load/security/visual/a11y testing. Use for test automation, flakiness, Core Web Vitals, mobile gestures, cross-browser.
+ description: Web testing with Playwright, Vitest, k6. E2E/unit/integration/load/security/visual/a11y testing. Multi-language support (JS/TS, Python, Go, Rust, Flutter). Use for test automation, flakiness, Core Web Vitals, mobile gestures, cross-browser.
  ---
 
  # Web Testing Skill
 
- Comprehensive web testing: unit, integration, E2E, load, security, visual regression, accessibility.
+ Comprehensive testing: unit, integration, E2E, load, security, visual regression, accessibility. Multi-language workflow orchestration with structured QA reporting.
+
+ ## Core Principle
+
+ **NEVER IGNORE FAILING TESTS.** Fix root causes, not symptoms. No mocks/cheats/tricks to pass builds.
 
  ## Quick Start
 
  ```bash
+ # JavaScript/TypeScript
  npx vitest run # Unit tests
  npx playwright test # E2E tests
- npx playwright test --ui # E2E with UI
+ npm run test:coverage # Coverage
+
+ # Python
+ pytest --cov=src # Unit + coverage
+
+ # Go / Rust / Flutter
+ go test ./... -cover # Go
+ cargo test # Rust
+ flutter test --coverage # Flutter
+
+ # Web Quality
  k6 run load-test.js # Load tests
- npx @axe-core/cli https://example.com # Accessibility
- npx lighthouse https://example.com # Performance
+ npx @axe-core/cli https://... # Accessibility
+ npx lighthouse https://... # Performance
  ```
 
+ ## Workflows
+
+ Three orchestrated workflows — load the relevant reference when needed:
+
+ | Workflow | Reference | Use When |
+ |----------|-----------|----------|
+ | Code Testing | `./references/test-execution-workflow.md` | Running unit/integration/e2e, checking coverage, validating builds |
+ | UI Testing | `./references/ui-testing-workflow.md` | Visual regression, responsive checks, accessibility audits, form automation |
+ | Report Format | `./references/report-format.md` | Generating structured QA summary reports |
+
+ **Code Testing Process:** Identify scope → Pre-flight checks → Execute tests → Analyze results → Coverage analysis → Build verification → Report
+
+ **UI Testing Process:** Discovery → Visual capture → Console errors → Network validation → Responsive testing → Form testing → Performance metrics → Report
+
  ## Testing Strategy (Choose Your Model)
 
  | Model | Structure | Best For |
@@ -26,10 +55,15 @@ npx lighthouse https://example.com # Performance
26
55
  | Trophy | Integration-heavy | Modern SPAs |
27
56
  | Honeycomb | Contract-centric | Microservices |
28
57
 
29
- `./references/testing-pyramid-strategy.md`
58
+ > `./references/testing-pyramid-strategy.md`
30
59
 
31
60
  ## Reference Documentation
32
61
 
62
+ ### Workflows & Reports
63
+ - `./references/test-execution-workflow.md` - Orchestrated code testing (multi-language)
64
+ - `./references/ui-testing-workflow.md` - Browser-based visual testing via `agent-browser`
65
+ - `./references/report-format.md` - Structured QA report template
66
+
33
67
  ### Core Testing
34
68
  - `./references/unit-integration-testing.md` - Vitest, browser mode, AAA
35
69
  - `./references/e2e-testing-playwright.md` - Fixtures, sharding, selectors
@@ -64,6 +98,17 @@ npx lighthouse https://example.com # Performance
64
98
  - `./references/pre-release-checklist.md` - Complete release checklist
65
99
  - `./references/functional-testing-checklist.md` - Feature testing
66
100
 
101
+ ## Tools Integration
102
+
103
+ - **Test runners**: Vitest, Jest, Mocha, pytest, go test, cargo test, flutter test
104
+ - **Coverage**: Istanbul/c8/nyc, pytest-cov, go cover
105
+ - **E2E**: Playwright (multi-browser, sharding)
106
+ - **Load**: k6
107
+ - **Browser**: `agent-browser` skill for UI testing
108
+ - **Analysis**: `ai-multimodal` skill (if available) for screenshot analysis
109
+ - **Debugging**: `debug` skill when tests reveal bugs
110
+ - **Thinking**: `sequential-thinking` skill for complex test failure analysis
111
+
67
112
  ## Scripts
68
113
 
69
114
  ### Initialize Playwright Project
@@ -92,3 +137,12 @@ jobs:
92
137
  - run: npm run test:a11y # Accessibility
93
138
  - run: npx lhci autorun # Performance
94
139
  ```
140
+
141
+ ## Quality Standards
142
+
143
+ - All critical paths must have test coverage
144
+ - Validate happy path AND error scenarios
145
+ - Ensure test isolation — no interdependencies
146
+ - Tests must be deterministic and reproducible
147
+ - Clean up test data after execution
148
+ - Coverage: 80%+ lines, 70%+ branches minimum
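
Reviewer note: the coverage floor in the last Quality Standards bullet can be checked mechanically. Below is a minimal Python sketch, assuming the percentages have already been extracted from whichever coverage tool produced them. The `THRESHOLDS` mapping and the `check_coverage` helper are hypothetical names for illustration and are not part of the skill.

```python
# Hypothetical thresholds mirroring the Quality Standards bullet
# (80%+ lines, 70%+ branches); not read from any real tool config.
THRESHOLDS = {"lines": 80.0, "branches": 70.0}


def check_coverage(measured: dict[str, float]) -> dict[str, str]:
    """Return PASS/FAIL per metric; a metric missing from `measured` fails."""
    status = {}
    for metric, minimum in THRESHOLDS.items():
        value = measured.get(metric)
        # Meeting the minimum exactly counts as a pass ("80%+").
        passed = value is not None and value >= minimum
        status[metric] = "PASS" if passed else "FAIL"
    return status


if __name__ == "__main__":
    print(check_coverage({"lines": 84.2, "branches": 66.0}))
```

Treating a missing metric as a failure is a design choice in this sketch: it keeps a report from passing just because a tool did not emit branch data.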
@@ -0,0 +1,57 @@
1
+ # Test Report Format
2
+
3
+ Structured QA report template. Sacrifice grammar for concision.
4
+
5
+ ## Template
6
+
7
+ ```markdown
8
+ # Test Report — {date} — {scope}
9
+
10
+ ## Test Results Overview
11
+ - **Total**: X tests
12
+ - **Passed**: X | **Failed**: X | **Skipped**: X
13
+ - **Duration**: Xs
14
+
15
+ ## Coverage Metrics
16
+ | Metric | Value | Threshold | Status |
17
+ |----------|-------|-----------|--------|
18
+ | Lines | X% | 80% | PASS/FAIL |
19
+ | Branches | X% | 70% | PASS/FAIL |
20
+ | Functions | X% | 80% | PASS/FAIL |
21
+
22
+ ## Failed Tests
23
+ ### `test/path/file.test.ts` — TestName
24
+ - **Error**: Error message
25
+ - **Stack**: Relevant stack trace (truncated)
26
+ - **Cause**: Brief root cause analysis
27
+ - **Fix**: Suggested resolution
28
+
29
+ ## UI Test Results (if applicable)
30
+ - **Pages tested**: X
31
+ - **Screenshots**: ./screenshots/
32
+ - **Console errors**: none | [list]
33
+ - **Responsive**: checked at [viewports] | skipped
34
+ - **Performance**: LCP Xs, FID Xms, CLS X
35
+
36
+ ## Build Status
37
+ - **Build**: PASS/FAIL
38
+ - **Warnings**: none | [list]
39
+ - **Dependencies**: all resolved | [issues]
40
+
41
+ ## Critical Issues
42
+ 1. [Blocking issue description + impact]
43
+
44
+ ## Recommendations
45
+ 1. [Actionable improvement with priority]
46
+
47
+ ## Unresolved Questions
48
+ - [Open questions, if any]
49
+ ```
50
+
51
+ ## Guidelines
52
+
53
+ - Include ALL failed tests with error messages — don't summarize away details
54
+ - Coverage: highlight specific uncovered files/functions, not just percentages
55
+ - Screenshots: embed paths directly in report for easy access
56
+ - Recommendations: prioritize by impact (critical > high > medium > low)
57
+ - Keep report under 200 lines — split into sections if larger scope needed
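
Reviewer note: two of the guidelines above, ordering recommendations by impact and keeping the report under 200 lines, are easy to automate. The following is a rough Python sketch under stated assumptions: recommendations arrive as `(priority, text)` pairs, and `render_recommendations` and `exceeds_limit` are hypothetical helpers, not something the package ships.

```python
# Priority order taken from the Guidelines bullet: critical > high > medium > low.
PRIORITY_ORDER = ["critical", "high", "medium", "low"]
MAX_REPORT_LINES = 200  # limit stated in the Guidelines


def render_recommendations(items: list[tuple[str, str]]) -> list[str]:
    """Sort (priority, text) pairs by impact and number them for the report.

    Raises ValueError if a priority is not one of PRIORITY_ORDER.
    """
    # sorted() is stable, so items with equal priority keep their input order.
    ranked = sorted(items, key=lambda item: PRIORITY_ORDER.index(item[0]))
    return [
        f"{number}. [{priority}] {text}"
        for number, (priority, text) in enumerate(ranked, start=1)
    ]


def exceeds_limit(report: str) -> bool:
    """True when the report is longer than the 200-line guideline."""
    return len(report.splitlines()) > MAX_REPORT_LINES
```

A caller could build the full report string from the template, then use `exceeds_limit` to decide whether to split it into sections as the last guideline suggests.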