@crewpilot/agent 1.0.0 → 2.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +131 -107
- package/dist-npm/cli.js +0 -0
- package/dist-npm/index.js +160 -127
- package/package.json +69 -69
- package/prompts/agent.md +282 -266
- package/prompts/catalyst.config.json +72 -72
- package/prompts/copilot-instructions.md +36 -36
- package/prompts/skills/assure-code-quality/SKILL.md +112 -112
- package/prompts/skills/assure-pr-intelligence/SKILL.md +148 -148
- package/prompts/skills/assure-review-functional/SKILL.md +114 -0
- package/prompts/skills/assure-review-standards/SKILL.md +106 -0
- package/prompts/skills/assure-threat-model/SKILL.md +182 -0
- package/prompts/skills/assure-vulnerability-scan/SKILL.md +146 -146
- package/prompts/skills/autopilot-meeting/SKILL.md +434 -407
- package/prompts/skills/autopilot-worker/SKILL.md +737 -623
- package/prompts/skills/daily-digest/SKILL.md +188 -167
- package/prompts/skills/deliver-change-management/SKILL.md +132 -132
- package/prompts/skills/deliver-deploy-guard/SKILL.md +144 -144
- package/prompts/skills/deliver-doc-governance/SKILL.md +130 -130
- package/prompts/skills/engineer-feature-builder/SKILL.md +270 -270
- package/prompts/skills/engineer-root-cause-analysis/SKILL.md +150 -150
- package/prompts/skills/engineer-test-first/SKILL.md +148 -148
- package/prompts/skills/insights-knowledge-base/SKILL.md +202 -181
- package/prompts/skills/insights-pattern-detection/SKILL.md +142 -142
- package/prompts/skills/strategize-architecture-planner/SKILL.md +141 -141
- package/prompts/skills/strategize-solution-design/SKILL.md +118 -118
- package/scripts/postinstall.js +108 -108
@@ -1,150 +1,150 @@ package/prompts/skills/engineer-root-cause-analysis/SKILL.md

# Root Cause Analysis

> **Pillar**: Engineer | **ID**: `engineer-root-cause-analysis`

## Purpose

Systematic debugging that finds the actual root cause, not just the symptom. Uses a structured hypothesis-test-eliminate approach with a maximum attempt budget to prevent rabbit holes.

## Activation Triggers

- "debug this", "fix this error", "why is this crashing", "investigate"
- "root cause", "not working", "broken", "unexpected behavior"
- Any error message, stack trace, or unexpected output

## Methodology

### Process Flow

```dot
digraph root_cause_analysis {
  rankdir=TB;
  node [shape=box];

  symptoms [label="Phase 1\nSymptom Collection"];
  hypotheses [label="Phase 2\nHypothesis Generation\n(2-3 ranked)"];
  eliminate [label="Phase 3\nSystematic Elimination", shape=diamond];
  root_cause [label="Phase 4\nRoot Cause Identified"];
  fix [label="Phase 5\nFix Implementation"];
  prevent [label="Phase 6\nPrevention", shape=doublecircle];
  escalate [label="Escalate to Human", shape=octagon, style=filled, fillcolor="#ff9999"];

  symptoms -> hypotheses;
  hypotheses -> eliminate;
  eliminate -> root_cause [label="confirmed"];
  eliminate -> hypotheses [label="all eliminated\nrefine"];
  eliminate -> escalate [label="max_attempts\nreached"];
  root_cause -> fix;
  fix -> prevent;
}
```

### Phase 1 — Symptom Collection

1. Gather all available evidence:
   - Error message / stack trace
   - Steps to reproduce
   - When it started (recent changes?)
   - Environment details (dev/staging/prod, OS, runtime version)
2. Reproduce the issue if possible (run the failing code/test)
3. Check the git log for recent changes to affected files

### Phase 2 — Hypothesis Generation

Generate 2-3 ranked hypotheses:

| # | Hypothesis | Likelihood | Evidence | How to Test |
|---|---|---|---|---|
| H1 | {most likely cause} | High | {what points here} | {test strategy} |
| H2 | {alternative cause} | Medium | {what points here} | {test strategy} |
| H3 | {edge case cause} | Low | {what points here} | {test strategy} |

### Phase 3 — Systematic Elimination

For each hypothesis (highest likelihood first):

1. **Test**: Run the specific validation (add logging, modify input, check state)
2. **Observe**: What happened? Does it confirm or eliminate?
3. **Record**: Document the result before moving to the next hypothesis
4. **Refine**: If partially confirmed, narrow the hypothesis

Track progress:

```
Attempt 1/5: Testing H1 — {result} → {confirmed/eliminated/narrowed}
Attempt 2/5: Testing H2 — {result} → {confirmed/eliminated/narrowed}
```

Stop at `max_attempts` from config (default: 5). If the root cause is not found, report what was eliminated and recommend next steps.
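The attempt-budget loop above can be sketched as plain code. This is an illustrative sketch only — the hypothesis shape (`id`, `test()`) and the `maxAttempts` parameter are assumptions for the example, not part of the skill's actual config schema:

```javascript
// Sketch of the Phase 3 elimination loop with an attempt budget.
// Hypotheses are ranked high → low; each test() returns
// 'confirmed', 'eliminated', or 'narrowed'.
function eliminate(hypotheses, maxAttempts = 5) {
  const log = [];
  let attempt = 0;
  for (const h of hypotheses) {
    if (attempt >= maxAttempts) {
      return { status: 'escalate', log }; // budget spent: report, don't guess
    }
    attempt += 1;
    const result = h.test();
    log.push(`Attempt ${attempt}/${maxAttempts}: Testing ${h.id} — ${result}`);
    if (result === 'confirmed') {
      return { status: 'root-cause', cause: h.id, log };
    }
    // 'eliminated' or 'narrowed': record and continue down the ranking
  }
  return { status: 'escalate', log }; // all eliminated: refine or hand off
}
```

The key property is that every exit path carries the log, so an escalation still reports what was eliminated.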

### Phase 4 — Root Cause Identification

When found:

1. State the root cause in one sentence
2. Explain the causal chain: trigger → intermediate effects → symptom
3. Explain WHY the code was vulnerable to this (design gap, missing validation, etc.)

### Phase 5 — Fix Implementation

<HARD-GATE>
Do NOT implement a fix until the root cause has been identified in Phase 4.
Do NOT fix symptoms — the fix MUST address the root cause.
If the root cause is still unknown after max_attempts, escalate to a human instead of guessing.
</HARD-GATE>

1. Implement the minimal fix
2. Add a regression test that fails without the fix
3. Run the full test suite to verify no regressions
4. If the fix reveals a systemic issue, note it for `pattern-detection`
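In miniature, a Phase 5 fix plus its regression test might look like this. Everything here is hypothetical — `parseAmount`, the null-input bug, and the test name are invented for illustration, not taken from the package:

```javascript
// Hypothetical fix: parseAmount() crashed on null input. The root cause was
// the missing guard here, not the downstream NaN it surfaced as.
function parseAmount(raw) {
  if (raw == null) return 0; // the actual fix: guard at the root cause
  return Number.parseFloat(raw) || 0;
}

// Regression test: encodes the failing input so the bug cannot return silently.
// It fails if the guard above is removed.
function testParseAmountNullInput() {
  return parseAmount(null) === 0 && parseAmount('12.5') === 12.5;
}
```

Note the test pins the originally failing input, not just the happy path — that is what makes it a regression test rather than ordinary coverage.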

### Phase 6 — Prevention

1. Identify what would have caught this earlier (better tests, type safety, validation)
2. Suggest ONE concrete prevention measure

## Tools Required

- `codebase` — Read failing code, trace call chains
- `terminal` — Run code, execute tests, check logs
- `catalyst_git_log` — Check recent changes
- `catalyst_git_diff` — Compare working vs. broken state

## Output Format

```
## [Catalyst → Root Cause Analysis]

### Symptom
{error/behavior description}

### Investigation
{hypothesis table}

### Elimination Log
{attempt-by-attempt results}

### Root Cause
**{one sentence}**

Causal chain:
{trigger} → {intermediate} → {symptom}

**Design gap**: {why the code was vulnerable}

### Fix
{code change with explanation}

### Regression Test
{test that validates the fix}

### Prevention
{one concrete measure}
```

## Chains To

- `test-first` — Write the regression test
- `code-quality` — Review the fix for quality
- `pattern-detection` — If this reveals a systemic issue
- `knowledge-base` — Store the root cause for future reference

## Anti-Patterns

- Do NOT guess the fix without testing hypotheses
- Do NOT exceed max_attempts — report what you know and escalate
- Do NOT fix the symptom without finding the root cause
- Do NOT skip the regression test
- Do NOT modify code while investigating (observe first, fix after)
@@ -1,148 +1,148 @@ package/prompts/skills/engineer-test-first/SKILL.md

# Test First

> **Pillar**: Engineer | **ID**: `engineer-test-first`

## Purpose

Test-Driven Development enforcement. Write tests before implementation, use tests as the specification, and achieve meaningful coverage — not vanity metrics.

## Activation Triggers

- "write tests", "TDD", "test first", "unit test", "test coverage"
- "add test cases", "what should I test"
- Automatically chained before `feature-builder` when enforcement is `strict`

## Methodology

### Process Flow

```dot
digraph test_first {
  rankdir=LR;
  node [shape=box];

  strategy [label="Phase 1\nTest Strategy"];
  design [label="Phase 2\nTest Case Design"];
  red [label="Phase 3\nRED\nWrite Failing Tests", style=filled, fillcolor="#ffcccc"];
  green [label="Phase 4\nGREEN\nMinimal Implementation", style=filled, fillcolor="#ccffcc"];
  refactor [label="Phase 5\nREFACTOR\nClean Up", style=filled, fillcolor="#ccccff"];
  coverage [label="Phase 6\nCoverage Check"];
  next [label="Next behavior", shape=ellipse];

  strategy -> design;
  design -> red;
  red -> red [label="test passes\nimmediately?\nfix test"];
  red -> green [label="test fails\ncorrectly"];
  green -> green [label="test still fails?\nfix code"];
  green -> refactor [label="all pass"];
  refactor -> green [label="test broke?\nfix"];
  refactor -> coverage;
  coverage -> next [label="more behaviors"];
  next -> red;
}
```

### Phase 1 — Test Strategy

1. Identify what's being tested: function, class, API endpoint, workflow
2. Determine the test type needed:
   - **Unit**: Isolated function/method behavior
   - **Integration**: Component interactions, API calls, DB queries
   - **E2E**: Full user journey (only when explicitly requested)
3. Follow existing test patterns in the project (framework, naming, file location)
4. Locate the test runner config and understand the test command

### Phase 2 — Test Case Design

For each function/component, design test cases across:

| Category | Examples |
|---|---|
| **Happy path** | Valid inputs producing expected outputs |
| **Edge cases** | Empty inputs, boundary values, single element, max size |
| **Error cases** | Invalid inputs, null/undefined, malformed data |
| **State transitions** | Before/after, concurrent access, cleanup |

Present as a test plan:

```
describe('{subject}')
  ✓ should {expected behavior} when {condition}
  ✓ should {expected behavior} when {edge case}
  ✗ should throw/return error when {invalid condition}
```
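One way to keep the plan mechanical is to render it from the Phase 2 case table. This is a sketch; the shape of the `cases` entries (`kind`, `behavior`, `condition`) is an assumption made for the example:

```javascript
// Render a Phase 2 case table as the describe/it outline shown above.
// Error cases get the ✗ marker; everything else gets ✓.
function renderPlan(subject, cases) {
  const lines = [`describe('${subject}')`];
  for (const c of cases) {
    const mark = c.kind === 'error' ? '✗' : '✓';
    lines.push(`  ${mark} should ${c.behavior} when ${c.condition}`);
  }
  return lines.join('\n');
}
```

Generating the outline from the table keeps the plan and the eventual test file in one-to-one correspondence.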

### Phase 3 — Red (Write Failing Tests)

<HARD-GATE>
Do NOT write any production/implementation code until failing tests exist.
If you find yourself writing implementation first, STOP, delete it, and write the test.
Tests MUST fail before any implementation begins. No exceptions.
</HARD-GATE>

1. Write the test cases with proper assertions
2. Use descriptive test names that read as specifications
3. Set up fixtures/mocks using the project's existing patterns
4. Run the tests — they MUST fail (red phase)
5. If tests pass immediately, they're not testing new behavior

### Phase 4 — Green (Minimal Implementation)

1. Write the minimum code to make the tests pass
2. No optimization, no elegance — just make it work
3. Run the tests — all must pass
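The red→green cycle of Phases 3–4 in miniature. The function and spec below are invented for illustration — they are not from this package:

```javascript
// RED: this spec is written before slugify() exists, so running it fails first.
function specSlugify(slugify) {
  const cases = [
    ['Hello World', 'hello-world'], // happy path
    ['  spaced  ', 'spaced'],       // edge: surrounding whitespace
    ['', ''],                       // edge: empty input
  ];
  return cases.every(([input, want]) => slugify(input) === want);
}

// GREEN: the minimal implementation that makes the spec pass — and no more.
function slugify(s) {
  return s.trim().toLowerCase().split(/\s+/).filter(Boolean).join('-');
}
```

Only once the spec passes does Phase 5 refactoring begin, with the spec re-run after each step.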

### Phase 5 — Refactor

1. Clean up the implementation while the tests stay green
2. Remove duplication
3. Improve naming
4. Extract helpers if needed
5. Run the tests after each refactor step

### Phase 6 — Coverage Check

1. Run the coverage tool
2. Check against `min_coverage` from config (default: 80%)
3. Identify uncovered branches — add tests only for meaningful gaps
4. Do NOT chase 100% — test behavior, not lines
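The Phase 6 gate reduces to one comparison against the configured floor. A minimal sketch — how `min_coverage` is actually read from the config is not shown here:

```javascript
// Phase 6 gate: compare measured coverage to the configured floor
// (min_coverage, default 80 per the skill config).
function coverageGate(measuredPct, minCoverage = 80) {
  return {
    pass: measuredPct >= minCoverage,
    message: `Coverage ${measuredPct}% (target: ${minCoverage}%)`,
  };
}
```

A failing gate should trigger the "meaningful gaps" review above, not a scramble for line-count assertions.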

## Tools Required

- `codebase` — Find existing test patterns, test config
- `findTestFiles` — Locate related test files
- `terminal` — Run tests, run coverage
- `catalyst_metrics_coverage` — Get coverage report

## Output Format

```
## [Catalyst → Test First]

### Test Strategy
- Type: {unit/integration/e2e}
- Framework: {detected framework}
- Test file: {path}

### Test Plan
{describe/it outline}

### Results
| Phase | Status |
|---|---|
| Red (failing tests) | {N} tests written, all fail ✓ |
| Green (passing) | All {N} pass ✓ |
| Refactor | Clean, tests still pass ✓ |
| Coverage | {%} (target: {min_coverage}%) |

### Tests Written
{list of test files created/modified}
```

## Chains To

- `feature-builder` — After tests are written, implement the feature
- `code-quality` — Review the implementation after the TDD cycle
- `change-management` — Commit the TDD cycle

## Anti-Patterns

- Do NOT write tests after implementation and call it TDD
- Do NOT mock everything — test real behavior where feasible
- Do NOT write tests that test the framework instead of your code
- Do NOT skip the red phase — every test must fail first
- Do NOT chase coverage numbers with meaningless assertions