ai-test-failure-analyzer 1.0.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (43)
  1. ai_test_failure_analyzer-1.0.1/PKG-INFO +240 -0
  2. ai_test_failure_analyzer-1.0.1/README.md +203 -0
  3. ai_test_failure_analyzer-1.0.1/ai_test_failure_analyzer.egg-info/PKG-INFO +240 -0
  4. ai_test_failure_analyzer-1.0.1/ai_test_failure_analyzer.egg-info/SOURCES.txt +41 -0
  5. ai_test_failure_analyzer-1.0.1/ai_test_failure_analyzer.egg-info/dependency_links.txt +1 -0
  6. ai_test_failure_analyzer-1.0.1/ai_test_failure_analyzer.egg-info/entry_points.txt +4 -0
  7. ai_test_failure_analyzer-1.0.1/ai_test_failure_analyzer.egg-info/requires.txt +22 -0
  8. ai_test_failure_analyzer-1.0.1/ai_test_failure_analyzer.egg-info/top_level.txt +1 -0
  9. ai_test_failure_analyzer-1.0.1/analyzer/__init__.py +4 -0
  10. ai_test_failure_analyzer-1.0.1/analyzer/__main__.py +124 -0
  11. ai_test_failure_analyzer-1.0.1/analyzer/config.py +75 -0
  12. ai_test_failure_analyzer-1.0.1/analyzer/elicit.py +86 -0
  13. ai_test_failure_analyzer-1.0.1/analyzer/evidence/__init__.py +8 -0
  14. ai_test_failure_analyzer-1.0.1/analyzer/evidence/config_scan.py +87 -0
  15. ai_test_failure_analyzer-1.0.1/analyzer/evidence/correlator.py +182 -0
  16. ai_test_failure_analyzer-1.0.1/analyzer/evidence/git_scan.py +140 -0
  17. ai_test_failure_analyzer-1.0.1/analyzer/evidence/log_scan.py +126 -0
  18. ai_test_failure_analyzer-1.0.1/analyzer/github_integration.py +103 -0
  19. ai_test_failure_analyzer-1.0.1/analyzer/hypothesis.py +255 -0
  20. ai_test_failure_analyzer-1.0.1/analyzer/noise_filter.py +107 -0
  21. ai_test_failure_analyzer-1.0.1/analyzer/orchestrator.py +204 -0
  22. ai_test_failure_analyzer-1.0.1/analyzer/parsers/__init__.py +83 -0
  23. ai_test_failure_analyzer-1.0.1/analyzer/parsers/base.py +141 -0
  24. ai_test_failure_analyzer-1.0.1/analyzer/parsers/cypress_json.py +102 -0
  25. ai_test_failure_analyzer-1.0.1/analyzer/parsers/jest_json.py +100 -0
  26. ai_test_failure_analyzer-1.0.1/analyzer/parsers/junit_generic.py +118 -0
  27. ai_test_failure_analyzer-1.0.1/analyzer/parsers/k6_json.py +76 -0
  28. ai_test_failure_analyzer-1.0.1/analyzer/parsers/newman_json.py +93 -0
  29. ai_test_failure_analyzer-1.0.1/analyzer/parsers/playwright_json.py +176 -0
  30. ai_test_failure_analyzer-1.0.1/analyzer/parsers/pytest_junit.py +102 -0
  31. ai_test_failure_analyzer-1.0.1/analyzer/render/__init__.py +6 -0
  32. ai_test_failure_analyzer-1.0.1/analyzer/render/ansi.py +131 -0
  33. ai_test_failure_analyzer-1.0.1/analyzer/render/markdown.py +173 -0
  34. ai_test_failure_analyzer-1.0.1/analyzer/security.py +104 -0
  35. ai_test_failure_analyzer-1.0.1/analyzer/server.py +425 -0
  36. ai_test_failure_analyzer-1.0.1/analyzer/ui/__init__.py +0 -0
  37. ai_test_failure_analyzer-1.0.1/analyzer/ui/cli.py +148 -0
  38. ai_test_failure_analyzer-1.0.1/analyzer/ui/tui.py +238 -0
  39. ai_test_failure_analyzer-1.0.1/analyzer/ui/web/__init__.py +0 -0
  40. ai_test_failure_analyzer-1.0.1/analyzer/ui/web/app.py +147 -0
  41. ai_test_failure_analyzer-1.0.1/analyzer/workspace_scanner.py +95 -0
  42. ai_test_failure_analyzer-1.0.1/pyproject.toml +76 -0
  43. ai_test_failure_analyzer-1.0.1/setup.cfg +4 -0
@@ -0,0 +1,240 @@
+ Metadata-Version: 2.4
+ Name: ai-test-failure-analyzer
+ Version: 1.0.1
+ Summary: Root cause in seconds. Evidence, not intuition. Analyzes Playwright, Jest, Cypress, Newman, k6, and JUnit test failures by tracing git history, logs, and config. No guesses, no fixture noise.
+ Author: NashTech AI
+ License: MIT
+ Project-URL: Homepage, https://github.com/nashtech/ai-test-failure-analyzer
+ Keywords: mcp,testing,qa,ai,playwright,pytest,jest,cypress,newman,k6,root-cause-analysis
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Intended Audience :: Developers
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Programming Language :: Python :: 3 :: Only
+ Classifier: Topic :: Software Development :: Testing
+ Requires-Python: <3.15,>=3.10
+ Description-Content-Type: text/markdown
+ Requires-Dist: mcp>=1.2.0
+ Requires-Dist: pydantic>=2.7
+ Requires-Dist: pydantic-settings>=2.3
+ Requires-Dist: fastapi>=0.111
+ Requires-Dist: uvicorn[standard]>=0.30
+ Requires-Dist: jinja2>=3.1
+ Requires-Dist: sse-starlette>=2.1
+ Requires-Dist: questionary>=2.0
+ Requires-Dist: rich>=13.7
+ Requires-Dist: textual>=0.70
+ Requires-Dist: typer>=0.12
+ Requires-Dist: lxml>=5.2
+ Requires-Dist: PyGithub>=2.3
+ Requires-Dist: python-dotenv>=1.0
+ Provides-Extra: dev
+ Requires-Dist: pytest>=8; extra == "dev"
+ Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
+ Requires-Dist: ruff>=0.5; extra == "dev"
+ Requires-Dist: mypy>=1.10; extra == "dev"
+ Requires-Dist: build>=1.0; extra == "dev"
+ Requires-Dist: twine>=5.0; extra == "dev"
+
+ <div align="center">
+
+ # 🩻 ai-test-failure-analyzer
+
+ ---
+
+ **Root cause in seconds. Evidence, not intuition.**
+
+ Feed it a **real** test result file — Playwright, Jest, Cypress, Newman, k6, or JUnit —
+ and it traces back through your **real** git history, application logs, and config
+ to surface the **actual** root cause, with a cited evidence chain and `file:line` precision.
+ No guesses. No fixture noise. No repeating the obvious.
+
+ [![CI](https://github.com/aks-builds/ai-test-failure-analyzer/actions/workflows/ci.yml/badge.svg)](https://github.com/aks-builds/ai-test-failure-analyzer/actions/workflows/ci.yml)
+ [![CodeQL](https://github.com/aks-builds/ai-test-failure-analyzer/actions/workflows/codeql.yml/badge.svg)](https://github.com/aks-builds/ai-test-failure-analyzer/actions/workflows/codeql.yml)
+ [![npm](https://img.shields.io/npm/v/ai-test-failure-analyzer)](https://www.npmjs.com/package/ai-test-failure-analyzer)
+ [![PyPI](https://img.shields.io/pypi/v/ai-test-failure-analyzer)](https://pypi.org/project/ai-test-failure-analyzer)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
+ [![MCP server](https://img.shields.io/badge/MCP-server-7c3aed)](https://modelcontextprotocol.io)
+ [![Agent Skill](https://img.shields.io/badge/Agent-Skill-7c3aed)](skills/ai-test-failure-analyzer/SKILL.md)
+
+ <!-- HERO-START -->
+ ![ai-analyze running 8-phase analysis](.github/media/hero.svg)
+ <!-- HERO-END -->
+
+ 🩻 A real analysis — evidence from `git`, `app.log`, and `.env` — no guesses, no fixture noise.
+
+ </div>
+
+ ---
+
+ ## Why ai-test-failure-analyzer
+
+ Manual test failure investigation takes 30–60 minutes: open the test output, grep through logs, dig through git history, check recent deploys, ask Slack. And you can still point at the wrong thing — especially when the test file itself has an "intentional failure" comment or a fixture designed to trigger the analyzer.
+
+ This tool does it automatically in seconds:
+
+ - Parses the test result file to extract failing tests with HTTP details
+ - Scans git history for high-risk commits (endpoint renames, migrations, auth changes)
+ - Scans application logs for ERROR/FATAL lines
+ - Reads config files (.env, docker-compose)
+ - Cross-correlates all evidence into clusters
+ - Forms ranked, evidence-cited hypotheses with `file:line` precision
+ - Never points to test fixtures or "intentional failure" comments as root causes
+
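Reviewer sketch (not part of the package): the "scans application logs for ERROR/FATAL lines" step in the list above can be pictured roughly as follows. This is a minimal illustration, assuming plain-text logs with a level token on each line; the function name `scan_log`, the record fields, and the sample log lines are hypothetical and are not taken from `analyzer/evidence/log_scan.py`.

```python
import re

# Hypothetical pattern: assumes each log line carries a level token such as ERROR or FATAL.
LEVEL_RE = re.compile(r"\b(ERROR|FATAL)\b")


def scan_log(text: str, source: str = "app.log") -> list[dict]:
    """Return one evidence record per ERROR/FATAL line, cited as source:line."""
    evidence = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        match = LEVEL_RE.search(line)
        if match:
            evidence.append(
                {
                    "cite": f"{source}:{line_no}",  # file:line style citation
                    "level": match.group(1),
                    "message": line.strip(),
                }
            )
    return evidence


if __name__ == "__main__":
    # Made-up sample log, for illustration only.
    sample = "\n".join(
        [
            "10:00:00 INFO  server started",
            "10:00:05 ERROR POST /api/clips returned 404",
            "10:00:06 FATAL database connection lost",
        ]
    )
    for item in scan_log(sample):
        print(item["cite"], item["level"])
```

Keeping the line number next to each match is what would make a `file:line` citation possible later in a report.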
+ ## How it's different
+
+ | | ai-test-failure-analyzer | Manual triage | Generic LLM |
+ |---|---|---|---|
+ | Evidence source | Real git/logs/config | Human memory | Training data |
+ | Fixture noise | Blocked by Tier-1 gate | No protection | No protection |
+ | `file:line` precision | ✅ | Sometimes | No |
+ | Works without source code | ✅ API-only mode | ✅ | ✅ |
+ | Repeatable | ✅ | ❌ | ❌ |
+ | CI-integrated | ✅ | ❌ | ❌ |
+
+ ## Supported frameworks
+
+ | Framework | Format | Command |
+ |---|---|---|
+ | Playwright | JSON reporter | `playwright test --reporter=json` |
+ | Jest / Vitest | JSON | `jest --json --outputFile=results.json` |
+ | Cypress | Mochawesome JSON | `cypress run --reporter mochawesome` |
+ | pytest | JUnit XML | `pytest --junit-xml=results.xml` |
+ | Newman (Postman) | JSON | `newman run col.json --reporters json --reporter-json-export results.json` |
+ | k6 | Summary JSON | `k6 run --summary-export=results.json script.js` |
+ | REST Assured | JUnit XML | standard Maven Surefire output |
+ | Any JUnit-compatible | XML | TestNG, Karate, Insomnia CLI |
+
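Reviewer sketch (not part of the package): several rows in the table above share the JUnit XML format, so one way to picture the "extract failing tests" step is the snippet below. It uses only the standard library's `xml.etree.ElementTree`, whereas the package declares `lxml` as a dependency; the function name `failing_tests`, the record fields, and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up JUnit XML sample, for illustration only.
SAMPLE = """<testsuites>
  <testsuite name="api" tests="2" failures="1">
    <testcase classname="tests.test_clips" name="test_create_clip">
      <failure message="assert 404 == 201">E assert 404 == 201</failure>
    </testcase>
    <testcase classname="tests.test_clips" name="test_list_clips"/>
  </testsuite>
</testsuites>"""


def failing_tests(junit_xml: str) -> list[dict]:
    """Collect test cases that contain a <failure> or <error> child."""
    root = ET.fromstring(junit_xml)
    failures = []
    # iter() finds <testcase> at any depth, so a <testsuites> root
    # and a bare <testsuite> root are both handled.
    for case in root.iter("testcase"):
        for kind in ("failure", "error"):
            node = case.find(kind)
            if node is not None:
                failures.append(
                    {
                        "test": f"{case.get('classname', '')}::{case.get('name', '')}",
                        "kind": kind,
                        "message": node.get("message", ""),
                        "detail": (node.text or "").strip(),
                    }
                )
                break  # record each test case at most once
    return failures


if __name__ == "__main__":
    for failure in failing_tests(SAMPLE):
        print(failure["test"], "->", failure["message"])
```

For untrusted report files, a hardened XML parser would be a safer choice than the standard-library one used in this sketch.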
+ ## Install
+
+ **npm (global — JS/CI devs):**
+ ```bash
+ npm install -g ai-test-failure-analyzer
+ ai-analyze analyze playwright-report.json
+ ```
+
+ **npx (zero install):**
+ ```bash
+ npx ai-test-failure-analyzer analyze playwright-report.json
+ ```
+
+ **pipx (Python devs):**
+ ```bash
+ pipx install ai-test-failure-analyzer
+ analyzer analyze playwright-report.json
+ ```
+
+ **Claude Code skill:**
+ ```
+ /plugin install ai-test-failure-analyzer
+ ```
+
+ **Install skill to all agents (Claude, Cursor, Codex, Gemini, Windsurf):**
+ ```bash
+ ai-analyze install
+ ```
+
+ ## Usage
+
+ ### CLI
+
+ ```bash
+ ai-analyze analyze results.json
+ ai-analyze analyze results.json --mode api-only # force API-only (no source scan)
+ ai-analyze analyze results.json --out report.md # write report to file
+ ai-analyze analyze results.json --create-issue # file GitHub issue for top hypothesis
+ ```
+
+ ### MCP server (Claude Code / Cursor)
+
+ Add to your MCP config:
+ ```json
+ {
+   "mcpServers": {
+     "ai-test-failure-analyzer": {
+       "command": "ai-analyze",
+       "args": ["serve-stdio"]
+     }
+   }
+ }
+ ```
+
+ Then ask Claude: *"Analyze the failures in playwright-report.json"*
+
+ ### MCP HTTP (OpenAI / Gemini)
+
+ ```bash
+ ai-analyze serve-http --port 8765
+ ```
+
+ ## API-only mode
+
+ No source code? No problem. When your workspace has no `src/`, `app/`, `lib/`, or `api/` directory — or when you pass `--mode api-only` — the tool switches to API-only mode.
+
+ It analyzes HTTP contract evidence directly from the test results:
+
+ ```bash
+ ai-analyze analyze newman-results.json
+ # > API_ONLY mode — no workspace source detected, analyzing HTTP contract only
+ # Root Cause [95%] — POST /api/clips → 404 Not Found
+ # Endpoint moved or removed. Check API changelog or versioning.
+ # Evidence: response status 404 + URL /api/clips
+ ```
+
+ Supports Newman, k6, Playwright (API tests), Jest, and any framework that records HTTP status codes.
+
+ ## CI integration
+
+ ```yaml
+ # .github/workflows/analyze-failures.yml
+ - name: Analyze test failures
+   if: failure()
+   run: |
+     npx ai-test-failure-analyzer analyze test-results/results.json \
+       --non-interactive \
+       --out failure-analysis.md
+ - uses: actions/upload-artifact@v4
+   if: failure()
+   with:
+     name: failure-analysis
+     path: failure-analysis.md
+ ```
+
+ ## Security
+
+ - **No shell injection**: all subprocess calls use explicit argument lists
+ - **Path traversal protection**: all paths resolved relative to workspace root
+ - **Size caps**: 5 MB/file, 50 MB/scan, 200 commits max
+ - **Secrets redacted**: `.env` token/secret/key/password values masked in reports
+ - **No outbound network** from core analysis (GitHub issue creation is opt-in)
+
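Reviewer sketch (not part of the package): two of the claims in the list above, path traversal protection and secret redaction, are commonly implemented along the lines below. This is a hedged illustration only; the helper names `resolve_inside` and `redact_env_line`, the `***` mask, and the example paths are hypothetical and are not taken from `analyzer/security.py`.

```python
import re
from pathlib import Path

# Key fragments named in the README: token / secret / key / password.
SENSITIVE_KEY_RE = re.compile(r"token|secret|key|password", re.IGNORECASE)


def resolve_inside(root: Path, candidate: str) -> Path:
    """Resolve candidate against root and reject anything that escapes it."""
    root = root.resolve()
    resolved = (root / candidate).resolve()  # collapses ".." and symlinks
    if not resolved.is_relative_to(root):
        raise ValueError(f"path escapes workspace root: {candidate}")
    return resolved


def redact_env_line(line: str) -> str:
    """Mask the value of KEY=value lines whose key looks sensitive."""
    key, sep, _value = line.partition("=")
    is_comment = key.lstrip().startswith("#")
    if sep and not is_comment and SENSITIVE_KEY_RE.search(key):
        return f"{key}=***"
    return line


if __name__ == "__main__":
    workspace = Path("/tmp/workspace")  # hypothetical workspace root
    print(resolve_inside(workspace, "logs/app.log"))
    try:
        resolve_inside(workspace, "../outside.txt")
    except ValueError as exc:
        print("rejected:", exc)
    print(redact_env_line("API_TOKEN=abc123"))
    print(redact_env_line("PORT=8080"))
```

A substring match on the key over-masks harmless names that happen to contain "key", which is usually the safer direction for a report that may be shared.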
+ See [SECURITY.md](SECURITY.md) for the full threat model.
+
+ ## Repository layout
+
+ ```
+ analyzer/                                  Python package (MCP server + CLI + analysis)
+   parsers/                                 Framework parsers (Playwright, Jest, Cypress, Newman, k6, JUnit)
+   evidence/                                Evidence collection (git, logs, config)
+   render/                                  Report rendering (Markdown, ANSI)
+   ui/                                      User interfaces (CLI, TUI, Web)
+   workspace_scanner.py                     Phase 0 — mode detection, noise path discovery
+   noise_filter.py                          Evidence filtering and hypothesis deduplication
+   orchestrator.py                          8-phase analysis pipeline
+   hypothesis.py                            Confidence scoring and hypothesis formation
+ bin/cli.js                                 Zero-dep Node wrapper (ai-analyze command)
+ skills/ai-test-failure-analyzer/SKILL.md   Claude Code agent skill
+ .claude-plugin/                            Claude marketplace manifests
+ tests/analyzer/                            pytest test suite
+ .github/workflows/                         CI/CD (ci, release, publish, codeql)
+ ```
+
+ ## Testing
+
+ ```bash
+ pytest tests/analyzer -q # Python: parsers, correlator, noise filter, workspace scanner
+ npm test # Node: CLI smoke tests
+ ```
+
+ ## Contributing
+
+ See [CONTRIBUTING.md](CONTRIBUTING.md).