ai-test-failure-analyzer 1.0.1__tar.gz
- ai_test_failure_analyzer-1.0.1/PKG-INFO +240 -0
- ai_test_failure_analyzer-1.0.1/README.md +203 -0
- ai_test_failure_analyzer-1.0.1/ai_test_failure_analyzer.egg-info/PKG-INFO +240 -0
- ai_test_failure_analyzer-1.0.1/ai_test_failure_analyzer.egg-info/SOURCES.txt +41 -0
- ai_test_failure_analyzer-1.0.1/ai_test_failure_analyzer.egg-info/dependency_links.txt +1 -0
- ai_test_failure_analyzer-1.0.1/ai_test_failure_analyzer.egg-info/entry_points.txt +4 -0
- ai_test_failure_analyzer-1.0.1/ai_test_failure_analyzer.egg-info/requires.txt +22 -0
- ai_test_failure_analyzer-1.0.1/ai_test_failure_analyzer.egg-info/top_level.txt +1 -0
- ai_test_failure_analyzer-1.0.1/analyzer/__init__.py +4 -0
- ai_test_failure_analyzer-1.0.1/analyzer/__main__.py +124 -0
- ai_test_failure_analyzer-1.0.1/analyzer/config.py +75 -0
- ai_test_failure_analyzer-1.0.1/analyzer/elicit.py +86 -0
- ai_test_failure_analyzer-1.0.1/analyzer/evidence/__init__.py +8 -0
- ai_test_failure_analyzer-1.0.1/analyzer/evidence/config_scan.py +87 -0
- ai_test_failure_analyzer-1.0.1/analyzer/evidence/correlator.py +182 -0
- ai_test_failure_analyzer-1.0.1/analyzer/evidence/git_scan.py +140 -0
- ai_test_failure_analyzer-1.0.1/analyzer/evidence/log_scan.py +126 -0
- ai_test_failure_analyzer-1.0.1/analyzer/github_integration.py +103 -0
- ai_test_failure_analyzer-1.0.1/analyzer/hypothesis.py +255 -0
- ai_test_failure_analyzer-1.0.1/analyzer/noise_filter.py +107 -0
- ai_test_failure_analyzer-1.0.1/analyzer/orchestrator.py +204 -0
- ai_test_failure_analyzer-1.0.1/analyzer/parsers/__init__.py +83 -0
- ai_test_failure_analyzer-1.0.1/analyzer/parsers/base.py +141 -0
- ai_test_failure_analyzer-1.0.1/analyzer/parsers/cypress_json.py +102 -0
- ai_test_failure_analyzer-1.0.1/analyzer/parsers/jest_json.py +100 -0
- ai_test_failure_analyzer-1.0.1/analyzer/parsers/junit_generic.py +118 -0
- ai_test_failure_analyzer-1.0.1/analyzer/parsers/k6_json.py +76 -0
- ai_test_failure_analyzer-1.0.1/analyzer/parsers/newman_json.py +93 -0
- ai_test_failure_analyzer-1.0.1/analyzer/parsers/playwright_json.py +176 -0
- ai_test_failure_analyzer-1.0.1/analyzer/parsers/pytest_junit.py +102 -0
- ai_test_failure_analyzer-1.0.1/analyzer/render/__init__.py +6 -0
- ai_test_failure_analyzer-1.0.1/analyzer/render/ansi.py +131 -0
- ai_test_failure_analyzer-1.0.1/analyzer/render/markdown.py +173 -0
- ai_test_failure_analyzer-1.0.1/analyzer/security.py +104 -0
- ai_test_failure_analyzer-1.0.1/analyzer/server.py +425 -0
- ai_test_failure_analyzer-1.0.1/analyzer/ui/__init__.py +0 -0
- ai_test_failure_analyzer-1.0.1/analyzer/ui/cli.py +148 -0
- ai_test_failure_analyzer-1.0.1/analyzer/ui/tui.py +238 -0
- ai_test_failure_analyzer-1.0.1/analyzer/ui/web/__init__.py +0 -0
- ai_test_failure_analyzer-1.0.1/analyzer/ui/web/app.py +147 -0
- ai_test_failure_analyzer-1.0.1/analyzer/workspace_scanner.py +95 -0
- ai_test_failure_analyzer-1.0.1/pyproject.toml +76 -0
- ai_test_failure_analyzer-1.0.1/setup.cfg +4 -0
@@ -0,0 +1,240 @@
Metadata-Version: 2.4
Name: ai-test-failure-analyzer
Version: 1.0.1
Summary: Root cause in seconds. Evidence, not intuition. Analyzes Playwright, Jest, Cypress, Newman, k6, and JUnit test failures by tracing git history, logs, and config. No guesses, no fixture noise.
Author: NashTech AI
License: MIT
Project-URL: Homepage, https://github.com/nashtech/ai-test-failure-analyzer
Keywords: mcp,testing,qa,ai,playwright,pytest,jest,cypress,newman,k6,root-cause-analysis
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Software Development :: Testing
Requires-Python: <3.15,>=3.10
Description-Content-Type: text/markdown
Requires-Dist: mcp>=1.2.0
Requires-Dist: pydantic>=2.7
Requires-Dist: pydantic-settings>=2.3
Requires-Dist: fastapi>=0.111
Requires-Dist: uvicorn[standard]>=0.30
Requires-Dist: jinja2>=3.1
Requires-Dist: sse-starlette>=2.1
Requires-Dist: questionary>=2.0
Requires-Dist: rich>=13.7
Requires-Dist: textual>=0.70
Requires-Dist: typer>=0.12
Requires-Dist: lxml>=5.2
Requires-Dist: PyGithub>=2.3
Requires-Dist: python-dotenv>=1.0
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: ruff>=0.5; extra == "dev"
Requires-Dist: mypy>=1.10; extra == "dev"
Requires-Dist: build>=1.0; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"

<div align="center">

# 🩻 ai-test-failure-analyzer

---

**Root cause in seconds. Evidence, not intuition.**

Feed it a **real** test result file — Playwright, Jest, Cypress, Newman, k6, or JUnit —
and it traces back through your **real** git history, application logs, and config
to surface the **actual** root cause, with a cited evidence chain and `file:line` precision.
No guesses. No fixture noise. No repeating the obvious.

[](https://github.com/aks-builds/ai-test-failure-analyzer/actions/workflows/ci.yml)
[](https://github.com/aks-builds/ai-test-failure-analyzer/actions/workflows/codeql.yml)
[](https://www.npmjs.com/package/ai-test-failure-analyzer)
[](https://pypi.org/project/ai-test-failure-analyzer)
[](LICENSE)
[](https://modelcontextprotocol.io)
[](skills/ai-test-failure-analyzer/SKILL.md)

<!-- HERO-START -->

<!-- HERO-END -->

🩻 A real analysis — evidence from `git`, `app.log`, and `.env` — no guesses, no fixture noise.

</div>

---

## Why ai-test-failure-analyzer

Manual test failure investigation takes 30–60 minutes: open the test output, grep through logs, dig through git history, check recent deploys, ask Slack. And you can still point at the wrong thing — especially when the test file itself has an "intentional failure" comment or a fixture designed to trigger the analyzer.

This tool does it automatically in seconds:

- Parses the test result file to extract failing tests with HTTP details
- Scans git history for high-risk commits (endpoint renames, migrations, auth changes)
- Scans application logs for ERROR/FATAL lines
- Reads config files (.env, docker-compose)
- Cross-correlates all evidence into clusters
- Forms ranked, evidence-cited hypotheses with `file:line` precision
- Never points to test fixtures or "intentional failure" comments as root causes
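
As a rough illustration of the log-scanning step in the list above, the sketch below collects `ERROR`/`FATAL` lines and tags each one with a `file:line` reference. The function name `scan_log`, the `LogHit` record, and the regular expression are assumptions made for this example; they are not taken from the package's `log_scan.py`.

```python
import re
from dataclasses import dataclass
from pathlib import Path

# Hypothetical pattern: matches ERROR or FATAL as whole words.
LEVEL_RE = re.compile(r"\b(ERROR|FATAL)\b")


@dataclass(frozen=True)
class LogHit:
    """One matching log line (illustrative record, not the package's type)."""

    path: str
    line_no: int
    level: str
    text: str

    @property
    def ref(self) -> str:
        # file:line reference of the kind the README describes
        return f"{self.path}:{self.line_no}"


def scan_log(path: Path) -> list[LogHit]:
    """Return every ERROR/FATAL line in `path` with its 1-based line number."""
    hits = []
    with path.open(encoding="utf-8", errors="replace") as handle:
        for line_no, line in enumerate(handle, start=1):
            match = LEVEL_RE.search(line)
            if match:
                hits.append(
                    LogHit(str(path), line_no, match.group(1), line.rstrip("\n"))
                )
    return hits
```

Each hit keeps the raw line next to its reference, so a report can cite the evidence and quote it without reopening the file.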

## How it's different

| | ai-test-failure-analyzer | Manual triage | Generic LLM |
|---|---|---|---|
| Evidence source | Real git/logs/config | Human memory | Training data |
| Fixture noise | Blocked by Tier-1 gate | No protection | No protection |
| `file:line` precision | ✅ | Sometimes | No |
| Works without source code | ✅ API-only mode | ✅ | ✅ |
| Repeatable | ✅ | ❌ | ❌ |
| CI-integrated | ✅ | ❌ | ❌ |

## Supported frameworks

| Framework | Format | Command |
|---|---|---|
| Playwright | JSON reporter | `playwright test --reporter=json` |
| Jest / Vitest | JSON | `jest --json --outputFile=results.json` |
| Cypress | Mochawesome JSON | `cypress run --reporter mochawesome` |
| pytest | JUnit XML | `pytest --junit-xml=results.xml` |
| Newman (Postman) | JSON | `newman run col.json --reporters json --reporter-json-export results.json` |
| k6 | Summary JSON | `k6 run --summary-export=results.json script.js` |
| REST Assured | JUnit XML | standard Maven Surefire output |
| Any JUnit-compatible | XML | TestNG, Karate, Insomnia CLI |
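
Several rows in the table above reduce to JUnit XML. As a minimal sketch of what pulling failures out of such a file can look like, the example below uses only the standard library's `xml.etree.ElementTree`. The function name `failing_tests` and the keys of the returned records are assumptions for illustration; the package lists `lxml` as a dependency, so its own parsers may work differently.

```python
import xml.etree.ElementTree as ET


def failing_tests(junit_xml: str) -> list[dict[str, str]]:
    """Return one record per <testcase> that has a <failure> or <error> child."""
    root = ET.fromstring(junit_xml)
    failures = []
    # iter() finds test cases whether the root is <testsuites> or <testsuite>.
    for case in root.iter("testcase"):
        problem = case.find("failure")
        if problem is None:
            problem = case.find("error")
        if problem is None:
            continue  # passed or skipped
        failures.append(
            {
                "name": f"{case.get('classname', '')}::{case.get('name', '')}",
                "kind": problem.tag,
                # Prefer the message attribute, fall back to the element text.
                "message": problem.get("message", "") or (problem.text or "").strip(),
            }
        )
    return failures
```

The explicit `is None` checks are deliberate: an `Element` with no children is falsy, so `case.find("failure") or case.find("error")` would skip a childless `<failure>` element.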

## Install

**npm (global — JS/CI devs):**
```bash
npm install -g ai-test-failure-analyzer
ai-analyze analyze playwright-report.json
```

**npx (zero install):**
```bash
npx ai-test-failure-analyzer analyze playwright-report.json
```

**pipx (Python devs):**
```bash
pipx install ai-test-failure-analyzer
analyzer analyze playwright-report.json
```

**Claude Code skill:**
```
/plugin install ai-test-failure-analyzer
```

**Install skill to all agents (Claude, Cursor, Codex, Gemini, Windsurf):**
```bash
ai-analyze install
```

## Usage

### CLI

```bash
ai-analyze analyze results.json
ai-analyze analyze results.json --mode api-only  # force API-only (no source scan)
ai-analyze analyze results.json --out report.md  # write report to file
ai-analyze analyze results.json --create-issue   # file GitHub issue for top hypothesis
```

### MCP server (Claude Code / Cursor)

Add to your MCP config:
```json
{
  "mcpServers": {
    "ai-test-failure-analyzer": {
      "command": "ai-analyze",
      "args": ["serve-stdio"]
    }
  }
}
```

Then ask Claude: *"Analyze the failures in playwright-report.json"*

### MCP HTTP (OpenAI / Gemini)

```bash
ai-analyze serve-http --port 8765
```

## API-only mode

No source code? No problem. When your workspace has no `src/`, `app/`, `lib/`, or `api/` directory — or when you pass `--mode api-only` — the tool switches to API-only mode.

It analyzes HTTP contract evidence directly from the test results:

```bash
ai-analyze analyze newman-results.json
# > API_ONLY mode — no workspace source detected, analyzing HTTP contract only
# Root Cause [95%] — POST /api/clips → 404 Not Found
# Endpoint moved or removed. Check API changelog or versioning.
# Evidence: response status 404 + URL /api/clips
```

Supports Newman, k6, Playwright (API tests), Jest, and any framework that records HTTP status codes.
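
The mode switch described above can be approximated in a few lines. This is a sketch only: the function name `detect_mode`, the `forced_mode` parameter, and the label `"full"` for the normal mode are assumptions for illustration (the README only names `--mode api-only`), and the directory list simply mirrors the four names given in the text.

```python
from pathlib import Path
from typing import Optional

# Directory names taken from the README text above.
SOURCE_DIRS = ("src", "app", "lib", "api")


def detect_mode(workspace: Path, forced_mode: Optional[str] = None) -> str:
    """Pick an analysis mode for a workspace.

    "full" is a hypothetical label for the normal mode; only "api-only"
    appears in the README.
    """
    if forced_mode is not None:
        return forced_mode  # an explicit choice wins over detection
    has_source = any((workspace / name).is_dir() for name in SOURCE_DIRS)
    return "full" if has_source else "api-only"
```

Checking `is_dir()` rather than `exists()` means a stray file named `lib` does not count as source code.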

## CI integration

```yaml
# .github/workflows/analyze-failures.yml
- name: Analyze test failures
  if: failure()
  run: |
    npx ai-test-failure-analyzer analyze test-results/results.json \
      --non-interactive \
      --out failure-analysis.md
- uses: actions/upload-artifact@v4
  if: failure()
  with:
    name: failure-analysis
    path: failure-analysis.md
```

## Security

- **No shell injection**: all subprocess calls use explicit argument lists
- **Path traversal protection**: all paths resolved relative to workspace root
- **Size caps**: 5 MB/file, 50 MB/scan, 200 commits max
- **Secrets redacted**: `.env` token/secret/key/password values masked in reports
- **No outbound network** from core analysis (GitHub issue creation is opt-in)
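
Two of the guarantees above, path traversal protection and secret redaction, can be sketched as follows. The helper names `resolve_in_workspace` and `redact_env_line`, the `***` mask, and the exact key pattern are assumptions for this example, not the contents of the package's `security.py`.

```python
import re
from pathlib import Path

# Keys containing any of the words the README lists are treated as sensitive.
SENSITIVE_KEY = re.compile(r"token|secret|key|password", re.IGNORECASE)


def resolve_in_workspace(workspace: Path, candidate: str) -> Path:
    """Resolve `candidate` against `workspace`, rejecting paths that escape it."""
    root = workspace.resolve()
    resolved = (root / candidate).resolve()
    try:
        # relative_to raises ValueError when `resolved` is not under `root`.
        resolved.relative_to(root)
    except ValueError:
        raise ValueError(f"path escapes workspace: {candidate}") from None
    return resolved


def redact_env_line(line: str) -> str:
    """Mask the value of a KEY=value line when the key looks sensitive."""
    key, sep, _value = line.partition("=")
    if sep and not key.lstrip().startswith("#") and SENSITIVE_KEY.search(key):
        return f"{key}=***"
    return line
```

Resolving before comparing matters: `logs/../../outside.txt` looks like it starts inside the workspace, and only the resolved form shows that it does not.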

See [SECURITY.md](SECURITY.md) for the full threat model.
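
The first and third Security bullets (explicit argument lists, a commit cap) can be illustrated together. This is a minimal sketch with hypothetical helper names `git_log_args`, `parse_log`, and `recent_commits`; the 200-commit cap comes from the README, while the `git log` format string is a choice made for this example.

```python
import subprocess
from pathlib import Path

MAX_COMMITS = 200  # cap stated in the README


def git_log_args(limit: int = MAX_COMMITS) -> list[str]:
    """Build the git command as an argument list, so no shell ever parses it."""
    limit = max(1, min(limit, MAX_COMMITS))  # clamp to the cap
    return ["git", "log", f"--max-count={limit}", "--pretty=format:%H%x09%s"]


def parse_log(output: str) -> list[tuple[str, str]]:
    """Split tab-separated `hash<TAB>subject` lines into pairs."""
    commits = []
    for line in output.splitlines():
        sha, _, subject = line.partition("\t")
        if sha:
            commits.append((sha, subject))
    return commits


def recent_commits(repo: Path, limit: int = MAX_COMMITS) -> list[tuple[str, str]]:
    """Run git without shell=True and return (hash, subject) pairs."""
    result = subprocess.run(
        git_log_args(limit), cwd=repo, capture_output=True, text=True, check=True
    )
    return parse_log(result.stdout)
```

Because the command is a list, values such as the repository path or the limit are passed to `git` as single arguments and are never interpreted by a shell.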

## Repository layout

```
analyzer/                  Python package (MCP server + CLI + analysis)
  parsers/                 Framework parsers (Playwright, Jest, Cypress, Newman, k6, JUnit)
  evidence/                Evidence collection (git, logs, config)
  render/                  Report rendering (Markdown, ANSI)
  ui/                      User interfaces (CLI, TUI, Web)
  workspace_scanner.py     Phase 0 — mode detection, noise path discovery
  noise_filter.py          Evidence filtering and hypothesis deduplication
  orchestrator.py          8-phase analysis pipeline
  hypothesis.py            Confidence scoring and hypothesis formation
bin/cli.js                 Zero-dep Node wrapper (ai-analyze command)
skills/ai-test-failure-analyzer/SKILL.md   Claude Code agent skill
.claude-plugin/            Claude marketplace manifests
tests/analyzer/            pytest test suite
.github/workflows/         CI/CD (ci, release, publish, codeql)
```

## Testing

```bash
pytest tests/analyzer -q  # Python: parsers, correlator, noise filter, workspace scanner
npm test                  # Node: CLI smoke tests
```

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).

@@ -0,0 +1,203 @@
<div align="center">

# 🩻 ai-test-failure-analyzer

---

**Root cause in seconds. Evidence, not intuition.**

Feed it a **real** test result file — Playwright, Jest, Cypress, Newman, k6, or JUnit —
and it traces back through your **real** git history, application logs, and config
to surface the **actual** root cause, with a cited evidence chain and `file:line` precision.
No guesses. No fixture noise. No repeating the obvious.

[](https://github.com/aks-builds/ai-test-failure-analyzer/actions/workflows/ci.yml)
[](https://github.com/aks-builds/ai-test-failure-analyzer/actions/workflows/codeql.yml)
[](https://www.npmjs.com/package/ai-test-failure-analyzer)
[](https://pypi.org/project/ai-test-failure-analyzer)
[](LICENSE)
[](https://modelcontextprotocol.io)
[](skills/ai-test-failure-analyzer/SKILL.md)

<!-- HERO-START -->

<!-- HERO-END -->

🩻 A real analysis — evidence from `git`, `app.log`, and `.env` — no guesses, no fixture noise.

</div>

---

## Why ai-test-failure-analyzer

Manual test failure investigation takes 30–60 minutes: open the test output, grep through logs, dig through git history, check recent deploys, ask Slack. And you can still point at the wrong thing — especially when the test file itself has an "intentional failure" comment or a fixture designed to trigger the analyzer.

This tool does it automatically in seconds:

- Parses the test result file to extract failing tests with HTTP details
- Scans git history for high-risk commits (endpoint renames, migrations, auth changes)
- Scans application logs for ERROR/FATAL lines
- Reads config files (.env, docker-compose)
- Cross-correlates all evidence into clusters
- Forms ranked, evidence-cited hypotheses with `file:line` precision
- Never points to test fixtures or "intentional failure" comments as root causes

## How it's different

| | ai-test-failure-analyzer | Manual triage | Generic LLM |
|---|---|---|---|
| Evidence source | Real git/logs/config | Human memory | Training data |
| Fixture noise | Blocked by Tier-1 gate | No protection | No protection |
| `file:line` precision | ✅ | Sometimes | No |
| Works without source code | ✅ API-only mode | ✅ | ✅ |
| Repeatable | ✅ | ❌ | ❌ |
| CI-integrated | ✅ | ❌ | ❌ |

## Supported frameworks

| Framework | Format | Command |
|---|---|---|
| Playwright | JSON reporter | `playwright test --reporter=json` |
| Jest / Vitest | JSON | `jest --json --outputFile=results.json` |
| Cypress | Mochawesome JSON | `cypress run --reporter mochawesome` |
| pytest | JUnit XML | `pytest --junit-xml=results.xml` |
| Newman (Postman) | JSON | `newman run col.json --reporters json --reporter-json-export results.json` |
| k6 | Summary JSON | `k6 run --summary-export=results.json script.js` |
| REST Assured | JUnit XML | standard Maven Surefire output |
| Any JUnit-compatible | XML | TestNG, Karate, Insomnia CLI |

## Install

**npm (global — JS/CI devs):**
```bash
npm install -g ai-test-failure-analyzer
ai-analyze analyze playwright-report.json
```

**npx (zero install):**
```bash
npx ai-test-failure-analyzer analyze playwright-report.json
```

**pipx (Python devs):**
```bash
pipx install ai-test-failure-analyzer
analyzer analyze playwright-report.json
```

**Claude Code skill:**
```
/plugin install ai-test-failure-analyzer
```

**Install skill to all agents (Claude, Cursor, Codex, Gemini, Windsurf):**
```bash
ai-analyze install
```

## Usage

### CLI

```bash
ai-analyze analyze results.json
ai-analyze analyze results.json --mode api-only  # force API-only (no source scan)
ai-analyze analyze results.json --out report.md  # write report to file
ai-analyze analyze results.json --create-issue   # file GitHub issue for top hypothesis
```

### MCP server (Claude Code / Cursor)

Add to your MCP config:
```json
{
  "mcpServers": {
    "ai-test-failure-analyzer": {
      "command": "ai-analyze",
      "args": ["serve-stdio"]
    }
  }
}
```

Then ask Claude: *"Analyze the failures in playwright-report.json"*

### MCP HTTP (OpenAI / Gemini)

```bash
ai-analyze serve-http --port 8765
```

## API-only mode

No source code? No problem. When your workspace has no `src/`, `app/`, `lib/`, or `api/` directory — or when you pass `--mode api-only` — the tool switches to API-only mode.

It analyzes HTTP contract evidence directly from the test results:

```bash
ai-analyze analyze newman-results.json
# > API_ONLY mode — no workspace source detected, analyzing HTTP contract only
# Root Cause [95%] — POST /api/clips → 404 Not Found
# Endpoint moved or removed. Check API changelog or versioning.
# Evidence: response status 404 + URL /api/clips
```

Supports Newman, k6, Playwright (API tests), Jest, and any framework that records HTTP status codes.

## CI integration

```yaml
# .github/workflows/analyze-failures.yml
- name: Analyze test failures
  if: failure()
  run: |
    npx ai-test-failure-analyzer analyze test-results/results.json \
      --non-interactive \
      --out failure-analysis.md
- uses: actions/upload-artifact@v4
  if: failure()
  with:
    name: failure-analysis
    path: failure-analysis.md
```

## Security

- **No shell injection**: all subprocess calls use explicit argument lists
- **Path traversal protection**: all paths resolved relative to workspace root
- **Size caps**: 5 MB/file, 50 MB/scan, 200 commits max
- **Secrets redacted**: `.env` token/secret/key/password values masked in reports
- **No outbound network** from core analysis (GitHub issue creation is opt-in)

See [SECURITY.md](SECURITY.md) for the full threat model.

## Repository layout

```
analyzer/                  Python package (MCP server + CLI + analysis)
  parsers/                 Framework parsers (Playwright, Jest, Cypress, Newman, k6, JUnit)
  evidence/                Evidence collection (git, logs, config)
  render/                  Report rendering (Markdown, ANSI)
  ui/                      User interfaces (CLI, TUI, Web)
  workspace_scanner.py     Phase 0 — mode detection, noise path discovery
  noise_filter.py          Evidence filtering and hypothesis deduplication
  orchestrator.py          8-phase analysis pipeline
  hypothesis.py            Confidence scoring and hypothesis formation
bin/cli.js                 Zero-dep Node wrapper (ai-analyze command)
skills/ai-test-failure-analyzer/SKILL.md   Claude Code agent skill
.claude-plugin/            Claude marketplace manifests
tests/analyzer/            pytest test suite
.github/workflows/         CI/CD (ci, release, publish, codeql)
```

## Testing

```bash
pytest tests/analyzer -q  # Python: parsers, correlator, noise filter, workspace scanner
npm test                  # Node: CLI smoke tests
```

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).

@@ -0,0 +1,240 @@
Metadata-Version: 2.4
Name: ai-test-failure-analyzer
Version: 1.0.1
Summary: Root cause in seconds. Evidence, not intuition. Analyzes Playwright, Jest, Cypress, Newman, k6, and JUnit test failures by tracing git history, logs, and config. No guesses, no fixture noise.
Author: NashTech AI
License: MIT
Project-URL: Homepage, https://github.com/nashtech/ai-test-failure-analyzer
Keywords: mcp,testing,qa,ai,playwright,pytest,jest,cypress,newman,k6,root-cause-analysis
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Software Development :: Testing
Requires-Python: <3.15,>=3.10
Description-Content-Type: text/markdown
Requires-Dist: mcp>=1.2.0
Requires-Dist: pydantic>=2.7
Requires-Dist: pydantic-settings>=2.3
Requires-Dist: fastapi>=0.111
Requires-Dist: uvicorn[standard]>=0.30
Requires-Dist: jinja2>=3.1
Requires-Dist: sse-starlette>=2.1
Requires-Dist: questionary>=2.0
Requires-Dist: rich>=13.7
Requires-Dist: textual>=0.70
Requires-Dist: typer>=0.12
Requires-Dist: lxml>=5.2
Requires-Dist: PyGithub>=2.3
Requires-Dist: python-dotenv>=1.0
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: ruff>=0.5; extra == "dev"
Requires-Dist: mypy>=1.10; extra == "dev"
Requires-Dist: build>=1.0; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"

<div align="center">

# 🩻 ai-test-failure-analyzer

---

**Root cause in seconds. Evidence, not intuition.**

Feed it a **real** test result file — Playwright, Jest, Cypress, Newman, k6, or JUnit —
and it traces back through your **real** git history, application logs, and config
to surface the **actual** root cause, with a cited evidence chain and `file:line` precision.
No guesses. No fixture noise. No repeating the obvious.

[](https://github.com/aks-builds/ai-test-failure-analyzer/actions/workflows/ci.yml)
[](https://github.com/aks-builds/ai-test-failure-analyzer/actions/workflows/codeql.yml)
[](https://www.npmjs.com/package/ai-test-failure-analyzer)
[](https://pypi.org/project/ai-test-failure-analyzer)
[](LICENSE)
[](https://modelcontextprotocol.io)
[](skills/ai-test-failure-analyzer/SKILL.md)

<!-- HERO-START -->

<!-- HERO-END -->

🩻 A real analysis — evidence from `git`, `app.log`, and `.env` — no guesses, no fixture noise.

</div>

---

## Why ai-test-failure-analyzer

Manual test failure investigation takes 30–60 minutes: open the test output, grep through logs, dig through git history, check recent deploys, ask Slack. And you can still point at the wrong thing — especially when the test file itself has an "intentional failure" comment or a fixture designed to trigger the analyzer.

This tool does it automatically in seconds:

- Parses the test result file to extract failing tests with HTTP details
- Scans git history for high-risk commits (endpoint renames, migrations, auth changes)
- Scans application logs for ERROR/FATAL lines
- Reads config files (.env, docker-compose)
- Cross-correlates all evidence into clusters
- Forms ranked, evidence-cited hypotheses with `file:line` precision
- Never points to test fixtures or "intentional failure" comments as root causes

## How it's different

| | ai-test-failure-analyzer | Manual triage | Generic LLM |
|---|---|---|---|
| Evidence source | Real git/logs/config | Human memory | Training data |
| Fixture noise | Blocked by Tier-1 gate | No protection | No protection |
| `file:line` precision | ✅ | Sometimes | No |
| Works without source code | ✅ API-only mode | ✅ | ✅ |
| Repeatable | ✅ | ❌ | ❌ |
| CI-integrated | ✅ | ❌ | ❌ |

## Supported frameworks

| Framework | Format | Command |
|---|---|---|
| Playwright | JSON reporter | `playwright test --reporter=json` |
| Jest / Vitest | JSON | `jest --json --outputFile=results.json` |
| Cypress | Mochawesome JSON | `cypress run --reporter mochawesome` |
| pytest | JUnit XML | `pytest --junit-xml=results.xml` |
| Newman (Postman) | JSON | `newman run col.json --reporters json --reporter-json-export results.json` |
| k6 | Summary JSON | `k6 run --summary-export=results.json script.js` |
| REST Assured | JUnit XML | standard Maven Surefire output |
| Any JUnit-compatible | XML | TestNG, Karate, Insomnia CLI |

## Install

**npm (global — JS/CI devs):**
```bash
npm install -g ai-test-failure-analyzer
ai-analyze analyze playwright-report.json
```

**npx (zero install):**
```bash
npx ai-test-failure-analyzer analyze playwright-report.json
```

**pipx (Python devs):**
```bash
pipx install ai-test-failure-analyzer
analyzer analyze playwright-report.json
```

**Claude Code skill:**
```
/plugin install ai-test-failure-analyzer
```

**Install skill to all agents (Claude, Cursor, Codex, Gemini, Windsurf):**
```bash
ai-analyze install
```

## Usage

### CLI

```bash
ai-analyze analyze results.json
ai-analyze analyze results.json --mode api-only  # force API-only (no source scan)
ai-analyze analyze results.json --out report.md  # write report to file
ai-analyze analyze results.json --create-issue   # file GitHub issue for top hypothesis
```

### MCP server (Claude Code / Cursor)

Add to your MCP config:
```json
{
  "mcpServers": {
    "ai-test-failure-analyzer": {
      "command": "ai-analyze",
      "args": ["serve-stdio"]
    }
  }
}
```

Then ask Claude: *"Analyze the failures in playwright-report.json"*

### MCP HTTP (OpenAI / Gemini)

```bash
ai-analyze serve-http --port 8765
```

## API-only mode

No source code? No problem. When your workspace has no `src/`, `app/`, `lib/`, or `api/` directory — or when you pass `--mode api-only` — the tool switches to API-only mode.

It analyzes HTTP contract evidence directly from the test results:

```bash
ai-analyze analyze newman-results.json
# > API_ONLY mode — no workspace source detected, analyzing HTTP contract only
# Root Cause [95%] — POST /api/clips → 404 Not Found
# Endpoint moved or removed. Check API changelog or versioning.
# Evidence: response status 404 + URL /api/clips
```

Supports Newman, k6, Playwright (API tests), Jest, and any framework that records HTTP status codes.

## CI integration

```yaml
# .github/workflows/analyze-failures.yml
- name: Analyze test failures
  if: failure()
  run: |
    npx ai-test-failure-analyzer analyze test-results/results.json \
      --non-interactive \
      --out failure-analysis.md
- uses: actions/upload-artifact@v4
  if: failure()
  with:
    name: failure-analysis
    path: failure-analysis.md
```

## Security

- **No shell injection**: all subprocess calls use explicit argument lists
- **Path traversal protection**: all paths resolved relative to workspace root
- **Size caps**: 5 MB/file, 50 MB/scan, 200 commits max
- **Secrets redacted**: `.env` token/secret/key/password values masked in reports
- **No outbound network** from core analysis (GitHub issue creation is opt-in)

See [SECURITY.md](SECURITY.md) for the full threat model.

## Repository layout

```
analyzer/                  Python package (MCP server + CLI + analysis)
  parsers/                 Framework parsers (Playwright, Jest, Cypress, Newman, k6, JUnit)
  evidence/                Evidence collection (git, logs, config)
  render/                  Report rendering (Markdown, ANSI)
  ui/                      User interfaces (CLI, TUI, Web)
  workspace_scanner.py     Phase 0 — mode detection, noise path discovery
  noise_filter.py          Evidence filtering and hypothesis deduplication
  orchestrator.py          8-phase analysis pipeline
  hypothesis.py            Confidence scoring and hypothesis formation
bin/cli.js                 Zero-dep Node wrapper (ai-analyze command)
skills/ai-test-failure-analyzer/SKILL.md   Claude Code agent skill
.claude-plugin/            Claude marketplace manifests
tests/analyzer/            pytest test suite
.github/workflows/         CI/CD (ci, release, publish, codeql)
```

## Testing

```bash
pytest tests/analyzer -q  # Python: parsers, correlator, noise filter, workspace scanner
npm test                  # Node: CLI smoke tests
```

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md).