qaa-agent 1.6.3 → 1.7.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +22 -0
- package/agents/qaa-analyzer.md +16 -1
- package/agents/qaa-bug-detective.md +33 -0
- package/agents/qaa-discovery.md +384 -0
- package/agents/qaa-e2e-runner.md +7 -6
- package/agents/qaa-planner.md +16 -1
- package/agents/qaa-testid-injector.md +60 -2
- package/agents/qaa-validator.md +38 -0
- package/bin/install.cjs +11 -9
- package/commands/qa-audit.md +119 -0
- package/commands/qa-create-test.md +288 -0
- package/commands/qa-fix.md +147 -0
- package/commands/qa-map.md +137 -0
- package/package.json +40 -41
- package/{.claude/settings.json → settings.json} +19 -20
- package/{.claude/skills → skills}/qa-bug-detective/SKILL.md +122 -122
- package/{.claude/skills → skills}/qa-repo-analyzer/SKILL.md +88 -88
- package/{.claude/skills → skills}/qa-self-validator/SKILL.md +109 -109
- package/{.claude/skills → skills}/qa-template-engine/SKILL.md +113 -113
- package/{.claude/skills → skills}/qa-testid-injector/SKILL.md +93 -93
- package/{.claude/skills → skills}/qa-workflow-documenter/SKILL.md +87 -87
- package/workflows/qa-gap.md +7 -1
- package/workflows/qa-start.md +25 -1
- package/workflows/qa-testid.md +29 -1
- package/workflows/qa-validate.md +5 -1
- package/.claude/commands/create-test.md +0 -164
- package/.claude/commands/qa-audit.md +0 -37
- package/.claude/commands/qa-blueprint.md +0 -54
- package/.claude/commands/qa-fix.md +0 -36
- package/.claude/commands/qa-from-ticket.md +0 -24
- package/.claude/commands/qa-gap.md +0 -20
- package/.claude/commands/qa-map.md +0 -47
- package/.claude/commands/qa-pom.md +0 -36
- package/.claude/commands/qa-pyramid.md +0 -37
- package/.claude/commands/qa-report.md +0 -38
- package/.claude/commands/qa-research.md +0 -33
- package/.claude/commands/qa-validate.md +0 -42
- package/.claude/commands/update-test.md +0 -58
- package/.claude/skills/qa-learner/SKILL.md +0 -150
- package/{.claude/commands → commands}/qa-pr.md +0 -0
- package/{.claude/commands → commands}/qa-start.md +0 -0
- package/{.claude/commands → commands}/qa-testid.md +0 -0
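Several entries above use brace rename notation: `package/{.claude/skills → skills}/…` means the path segment before the arrow moved to the segment after it. A small sketch (hypothetical helper, not part of the package) that expands the notation to the new path:

```shell
# expand_rename: resolve "{old → new}" rename notation to the new path.
# Illustrative only; the example path comes from the file list above.
expand_rename() {
  printf '%s\n' "$1" | sed -E 's/\{[^}]* → ([^}]*)\}/\1/'
}

expand_rename 'package/{.claude/skills → skills}/qa-bug-detective/SKILL.md'
# → package/skills/qa-bug-detective/SKILL.md
```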
package/commands/qa-map.md
ADDED

@@ -0,0 +1,137 @@

````markdown
# QA Codebase Map & Analysis

Deep-scan a codebase for QA-relevant information, produce a complete analysis, and generate test inventory. Runs codebase mapping (4 parallel agents) followed by full repository analysis. One command to fully understand a codebase before writing tests.

## Usage

```
/qa-map [options]
```

### Options

- No arguments — runs full map + analysis on current directory
- `--focus <area>` — run a single map area only, skip analysis (testability, risk, patterns, existing-tests)
- `--dev-repo <path>` — explicit path to developer repository
- `--qa-repo <path>` — path to existing QA repository (produces gap analysis instead of blueprint)
- `--skip-map` — skip codebase mapping, only run analysis (lighter, faster)

## What It Produces

### Stage 1: Codebase Map (4 parallel agents)

| Focus Area | Documents Produced |
|------------|-------------------|
| **testability** | TESTABILITY.md + TEST_SURFACE.md — what's testable, entry points, mocking needs |
| **risk** | RISK_MAP.md + CRITICAL_PATHS.md — business-critical paths, error handling gaps |
| **patterns** | CODE_PATTERNS.md + API_CONTRACTS.md — naming conventions, API shapes, auth patterns |
| **existing-tests** | TEST_ASSESSMENT.md + COVERAGE_GAPS.md — existing test quality, what's missing |

All documents written to `.qa-output/codebase/`.

### Stage 2: Repository Analysis

| Document | Description |
|----------|-------------|
| SCAN_MANIFEST.md | File tree, framework detection, testable surfaces |
| QA_ANALYSIS.md | Architecture overview, risk assessment, top 10 unit targets, testing pyramid |
| TEST_INVENTORY.md | Every test case with ID, target, inputs, expected outcome, priority |
| QA_REPO_BLUEPRINT.md | If no QA repo — full repo structure, configs, CI/CD strategy |
| GAP_ANALYSIS.md | If QA repo provided — coverage map, missing tests, broken tests, quality assessment |

## Instructions

1. Read `CLAUDE.md` — QA standards.

2. Create output directories:
   ```bash
   mkdir -p .qa-output/codebase
   ```

3. **Stage 1: Codebase Mapping**

   If `--skip-map` was NOT passed and `--focus` was NOT specified, spawn 4 agents in parallel (one per focus area):

   ```
   Agent(
     prompt="Analyze this codebase for QA purposes. Focus area: testability. Write TESTABILITY.md and TEST_SURFACE.md to .qa-output/codebase/. Follow your agent definition process.",
     subagent_type="general-purpose",
     execution_context="@agents/qaa-codebase-mapper.md"
   )

   Agent(
     prompt="Analyze this codebase for QA purposes. Focus area: risk. Write RISK_MAP.md and CRITICAL_PATHS.md to .qa-output/codebase/. Follow your agent definition process.",
     subagent_type="general-purpose",
     execution_context="@agents/qaa-codebase-mapper.md"
   )

   Agent(
     prompt="Analyze this codebase for QA purposes. Focus area: patterns. Write CODE_PATTERNS.md and API_CONTRACTS.md to .qa-output/codebase/. Follow your agent definition process.",
     subagent_type="general-purpose",
     execution_context="@agents/qaa-codebase-mapper.md"
   )

   Agent(
     prompt="Analyze this codebase for QA purposes. Focus area: existing-tests. Write TEST_ASSESSMENT.md and COVERAGE_GAPS.md to .qa-output/codebase/. Follow your agent definition process.",
     subagent_type="general-purpose",
     execution_context="@agents/qaa-codebase-mapper.md"
   )
   ```

   If `--focus <area>` was provided, spawn only that one agent and STOP after it completes (skip Stage 2).

   If `--skip-map` was passed, skip Stage 1 entirely and go to Stage 2.

4. When all map agents complete, print summary of documents produced.

5. **Stage 2: Repository Analysis**

   Initialize pipeline context:
   ```bash
   node bin/qaa-tools.cjs init qa-start 2>/dev/null || true
   ```

   Invoke scanner agent:

   Task(
     prompt="
       <objective>Scan repository and produce SCAN_MANIFEST.md</objective>
       <execution_context>@agents/qaa-scanner.md</execution_context>
       <files_to_read>
       - CLAUDE.md
       </files_to_read>
       <parameters>
       user_input: $ARGUMENTS
       </parameters>
     "
   )

   Invoke analyzer agent:

   Task(
     prompt="
       <objective>Analyze repository and produce QA_ANALYSIS.md, TEST_INVENTORY.md, and blueprint or gap analysis. Use codebase map documents from .qa-output/codebase/ if they exist for deeper, more accurate analysis.</objective>
       <execution_context>@agents/qaa-analyzer.md</execution_context>
       <files_to_read>
       - CLAUDE.md
       - .qa-output/SCAN_MANIFEST.md
       - .qa-output/codebase/TESTABILITY.md (if exists)
       - .qa-output/codebase/RISK_MAP.md (if exists)
       - .qa-output/codebase/CODE_PATTERNS.md (if exists)
       - .qa-output/codebase/TEST_ASSESSMENT.md (if exists)
       - .qa-output/codebase/TEST_SURFACE.md (if exists)
       - .qa-output/codebase/CRITICAL_PATHS.md (if exists)
       - .qa-output/codebase/API_CONTRACTS.md (if exists)
       - .qa-output/codebase/COVERAGE_GAPS.md (if exists)
       </files_to_read>
       <parameters>
       user_input: $ARGUMENTS
       </parameters>
     "
   )

6. Print final summary: all documents produced across both stages.
   No git operations. No test generation.
   Suggest `/qa-create-test` to generate tests from the analysis.

$ARGUMENTS
````
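The `--skip-map` / `--focus` gating described in the qa-map instructions can be sketched as a tiny dispatcher (hypothetical driver for illustration only; the real gating is performed by the Claude command itself, not a shell script):

```shell
# qa_map_gating: mirror the stage-gating rules of /qa-map.
#   --skip-map     → skip Stage 1 (mapping), run Stage 2 (analysis) only
#   --focus <area> → run one Stage 1 agent, skip Stage 2 entirely
qa_map_gating() {
  run_map=true; run_analysis=true; focus=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --skip-map) run_map=false ;;
      --focus)    focus="$2"; run_analysis=false; shift ;;
    esac
    shift
  done
  echo "map=$run_map analysis=$run_analysis focus=${focus:-all}"
}

qa_map_gating --focus risk
# → map=true analysis=false focus=risk
```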
package/package.json
CHANGED

```diff
@@ -1,41 +1,40 @@
 {
   "name": "qaa-agent",
-  "version": "1.
+  "version": "1.7.0",
   "description": "QA Automation Agent for Claude Code — multi-agent pipeline that analyzes repos, generates tests, validates, and creates PRs",
   "bin": {
     "qaa-agent": "./bin/install.cjs"
   },
   "keywords": [
     "qa",
     "automation",
     "testing",
     "claude-code",
     "playwright",
     "jest",
     "pytest",
     "ai-agent"
   ],
   "repository": {
     "type": "git",
-    "url": "https://github.com/
+    "url": "https://github.com/capmation/qaa-testing.git"
   },
   "author": "Backhaus7997",
   "license": "MIT",
   "dependencies": {
     "@playwright/mcp": "latest"
   },
   "files": [
     "bin/",
     "agents/",
     "workflows/",
     "templates/",
     "docs/",
-    "
-    "
-    "
+    "commands/",
+    "skills/",
+    "settings.json",
     ".mcp.json",
     "CLAUDE.md",
     "CHANGELOG.md"
-
-
-}
+  ]
+}
```

(Several removed lines are truncated in the registry export and are reproduced as-is.)
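The `files` allowlist change tracks the directory moves: content formerly under `.claude/` is now published at the package root. A quick sanity check that the relocated paths are covered by the new allowlist (the list is inlined from the diff above; the check itself is illustrative):

```shell
# Write the new `files` allowlist (from the diff above) to a scratch file
# and confirm the relocated top-level paths are published.
cat > /tmp/qaa-files.txt <<'EOF'
bin/
agents/
workflows/
templates/
docs/
commands/
skills/
settings.json
.mcp.json
CLAUDE.md
CHANGELOG.md
EOF
for entry in commands/ skills/ settings.json; do
  grep -qx "$entry" /tmp/qaa-files.txt && echo "published: $entry"
done
```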
package/{.claude/settings.json → settings.json}
RENAMED & CHANGED

```diff
@@ -1,20 +1,19 @@
 {
   "permissions": {
     "allow": [
       "Bash(*)",
       "Read",
       "Write",
       "Edit",
       "Glob",
       "Grep",
       "Agent",
       "WebFetch",
       "WebSearch",
       "NotebookEdit"
-
-
-
-
-
-
-}
+    ]
+  },
+  "env": {
+    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
+  }
+}
```
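The new `env` block enables the experimental agent-teams feature for sessions that load this settings file. Assuming Claude Code's usual environment-variable behavior, the same flag can also be set per shell session instead (the flag name is taken from the diff; its effect is Claude Code's, not shown here):

```shell
# Enable the experimental agent-teams feature for this shell session only,
# instead of via settings.json. Flag name taken from the diff above.
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
echo "agent teams flag: ${CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS}"
# → agent teams flag: 1
```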
package/{.claude/skills → skills}/qa-bug-detective/SKILL.md
RENAMED

@@ -1,122 +1,122 @@ (file moved from `package/.claude/skills/` to `package/skills/`; content unchanged)

````markdown
---
name: qa-bug-detective
description: QA Bug Detective. Runs generated tests and classifies failures as APPLICATION BUG, TEST CODE ERROR, ENVIRONMENT ISSUE, or INCONCLUSIVE with evidence and confidence levels. Use when user wants to run tests and classify results, investigate test failures, determine if failures are bugs or test issues, debug failing tests, triage test results, or understand why tests are failing. Triggers on "run tests", "classify failures", "why is this failing", "test failures", "debug tests", "triage results", "is this a bug or test error", "investigate failures".
---

# QA Bug Detective

## Purpose

Run generated tests and classify every failure into one of four categories with evidence and confidence levels. Auto-fix TEST CODE ERRORS when confidence is HIGH.

## Classification Decision Tree

```
Test fails
├── Syntax/import error in TEST file?
│   └── YES → TEST CODE ERROR
├── Error occurs in PRODUCTION code path?
│   ├── Known bug / unexpected behavior? → APPLICATION BUG
│   └── Code works as designed but test expectation wrong? → TEST CODE ERROR
├── Connection refused / timeout / missing env var?
│   └── YES → ENVIRONMENT ISSUE
└── Can't determine?
    └── INCONCLUSIVE
```

## Classification Categories

### APPLICATION BUG
- Error manifests in production code (not test code)
- Stack trace points to src/ or app/ code
- Behavior contradicts documented requirements or API contracts
- **Action**: Report only. NEVER auto-fix application code.

### TEST CODE ERROR
- Import/require fails (wrong path, missing module)
- Selector doesn't match current DOM
- Assertion expects wrong value (test written incorrectly)
- Missing await, wrong API usage, stale fixture reference
- **Action**: Auto-fix if HIGH confidence. Report if MEDIUM or lower.

### ENVIRONMENT ISSUE
- Connection refused (database, API, external service)
- Timeout waiting for resource
- Missing environment variable
- File/directory not found (test infrastructure)
- **Action**: Report with suggested resolution steps.

### INCONCLUSIVE
- Error is ambiguous
- Could be multiple root causes
- Insufficient data to classify
- **Action**: Report with what's known, request more info.

## Evidence Requirements

Every classification MUST include:
1. **File path**: Exact file where error occurs
2. **Line number**: Specific line of failure
3. **Error message**: Complete error text
4. **Code snippet**: The specific code proving the classification
5. **Confidence level**: HIGH / MEDIUM-HIGH / MEDIUM / LOW
6. **Reasoning**: Why this classification, not another

## Confidence Levels

| Level | Definition |
|-------|------------|
| HIGH | Clear evidence in one direction, no ambiguity |
| MEDIUM-HIGH | Strong evidence but minor ambiguity |
| MEDIUM | Evidence points one way but alternatives exist |
| LOW | Insufficient data, multiple possible causes |

## Auto-Fix Rules

Only auto-fix when:
- Classification = TEST CODE ERROR
- Confidence = HIGH
- Fix is mechanical (import path, selector, assertion value, config)

Fix types:
- Import path corrections
- Selector updates (match current DOM/data-testid)
- Assertion value updates (match current actual behavior)
- Config fixes (baseURL, timeout values)
- Missing await keywords
- Fixture path corrections

**NEVER auto-fix**: Application bugs, environment issues, anything with confidence < HIGH.

## Output: FAILURE_CLASSIFICATION_REPORT.md

```markdown
# Failure Classification Report

## Summary
| Classification | Count | Auto-Fixed | Needs Attention |
|---------------|-------|-----------|----------------|
| APPLICATION BUG | N | 0 | N |
| TEST CODE ERROR | N | N | N |
| ENVIRONMENT ISSUE | N | 0 | N |
| INCONCLUSIVE | N | 0 | N |

## Detailed Analysis

### Failure 1: [test name]
- **Classification**: [category]
- **Confidence**: [level]
- **File**: [path]:[line]
- **Error**: [message]
- **Evidence**: [code snippet + reasoning]
- **Action Taken**: [auto-fixed / reported]
- **Resolution**: [what was fixed / what needs human attention]
```

## Quality Gate

- [ ] Every failure classified with evidence
- [ ] Confidence level assigned to each
- [ ] No application bugs auto-fixed
- [ ] Auto-fixes only applied at HIGH confidence
- [ ] FAILURE_CLASSIFICATION_REPORT.md produced
````