npm - @yasserkhanorg/e2e-agents - Versions diffs - 0.9.0 → 0.11.0 - Mend

@yasserkhanorg/e2e-agents 0.9.0 → 0.11.0

This diff shows the contents of publicly available package versions as released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.

Files changed (93)

package/README.md +112 -584
package/dist/agent/api_catalog.d.ts +11 -0
package/dist/agent/api_catalog.d.ts.map +1 -0
package/dist/agent/api_catalog.js +210 -0
package/dist/agent/llm_agents_flow.d.ts +15 -0
package/dist/agent/llm_agents_flow.d.ts.map +1 -0
package/dist/agent/llm_agents_flow.js +434 -0
package/dist/agent/native_flow.d.ts +6 -0
package/dist/agent/native_flow.d.ts.map +1 -0
package/dist/agent/native_flow.js +179 -0
package/dist/agent/pipeline.d.ts +2 -25
package/dist/agent/pipeline.d.ts.map +1 -1
package/dist/agent/pipeline.js +30 -1329
package/dist/agent/pipeline_types.d.ts +54 -0
package/dist/agent/pipeline_types.d.ts.map +1 -0
package/dist/agent/pipeline_types.js +4 -0
package/dist/agent/pipeline_utils.d.ts +12 -0
package/dist/agent/pipeline_utils.d.ts.map +1 -0
package/dist/agent/pipeline_utils.js +156 -0
package/dist/agent/process_runner.d.ts +10 -0
package/dist/agent/process_runner.d.ts.map +1 -0
package/dist/agent/process_runner.js +92 -0
package/dist/agent/spec_generator.d.ts +5 -0
package/dist/agent/spec_generator.d.ts.map +1 -0
package/dist/agent/spec_generator.js +253 -0
package/dist/agent/validation_runner.d.ts +5 -0
package/dist/agent/validation_runner.d.ts.map +1 -0
package/dist/agent/validation_runner.js +77 -0
package/dist/agentic/playwright_runner.js +1 -1
package/dist/cli/commands/analyze.d.ts +3 -0
package/dist/cli/commands/analyze.d.ts.map +1 -0
package/dist/cli/commands/analyze.js +77 -0
package/dist/cli/commands/feedback.d.ts +3 -0
package/dist/cli/commands/feedback.d.ts.map +1 -0
package/dist/cli/commands/feedback.js +39 -0
package/dist/cli/commands/finalize.d.ts +3 -0
package/dist/cli/commands/finalize.d.ts.map +1 -0
package/dist/cli/commands/finalize.js +41 -0
package/dist/cli/commands/generate.d.ts +4 -0
package/dist/cli/commands/generate.d.ts.map +1 -0
package/dist/cli/commands/generate.js +108 -0
package/dist/cli/commands/heal.d.ts +3 -0
package/dist/cli/commands/heal.d.ts.map +1 -0
package/dist/cli/commands/heal.js +60 -0
package/dist/cli/commands/impact.d.ts +4 -0
package/dist/cli/commands/impact.d.ts.map +1 -0
package/dist/cli/commands/impact.js +26 -0
package/dist/cli/commands/llm_health.d.ts +2 -0
package/dist/cli/commands/llm_health.d.ts.map +1 -0
package/dist/cli/commands/llm_health.js +38 -0
package/dist/cli/commands/plan.d.ts +4 -0
package/dist/cli/commands/plan.d.ts.map +1 -0
package/dist/cli/commands/plan.js +83 -0
package/dist/cli/commands/traceability.d.ts +4 -0
package/dist/cli/commands/traceability.d.ts.map +1 -0
package/dist/cli/commands/traceability.js +77 -0
package/dist/cli/parse_args.d.ts +6 -0
package/dist/cli/parse_args.d.ts.map +1 -0
package/dist/cli/parse_args.js +216 -0
package/dist/cli/types.d.ts +70 -0
package/dist/cli/types.d.ts.map +1 -0
package/dist/cli/types.js +4 -0
package/dist/cli/usage.d.ts +2 -0
package/dist/cli/usage.d.ts.map +1 -0
package/dist/cli/usage.js +86 -0
package/dist/cli.js +26 -1057
package/dist/esm/agent/api_catalog.js +199 -0
package/dist/esm/agent/llm_agents_flow.js +421 -0
package/dist/esm/agent/native_flow.js +175 -0
package/dist/esm/agent/pipeline.js +8 -1307
package/dist/esm/agent/pipeline_types.js +3 -0
package/dist/esm/agent/pipeline_utils.js +146 -0
package/dist/esm/agent/process_runner.js +83 -0
package/dist/esm/agent/spec_generator.js +249 -0
package/dist/esm/agent/validation_runner.js +73 -0
package/dist/esm/agentic/playwright_runner.js +1 -1
package/dist/esm/cli/commands/analyze.js +74 -0
package/dist/esm/cli/commands/feedback.js +36 -0
package/dist/esm/cli/commands/finalize.js +38 -0
package/dist/esm/cli/commands/generate.js +105 -0
package/dist/esm/cli/commands/heal.js +57 -0
package/dist/esm/cli/commands/impact.js +23 -0
package/dist/esm/cli/commands/llm_health.js +35 -0
package/dist/esm/cli/commands/plan.js +80 -0
package/dist/esm/cli/commands/traceability.js +73 -0
package/dist/esm/cli/parse_args.js +210 -0
package/dist/esm/cli/types.js +3 -0
package/dist/esm/cli/usage.js +83 -0
package/dist/esm/cli.js +20 -1051
package/dist/esm/mcp-server.js +18 -1
package/dist/mcp-server.d.ts.map +1 -1
package/dist/mcp-server.js +17 -0
package/package.json +2 -4
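
Each entry in the list above has the shape `path +added -removed`. As a rough sketch, assuming every entry keeps that shape, the per-file counts could be tallied as follows. The names `FileStat`, `parseStatLine`, and `summarize` are hypothetical helpers written for illustration; they are not part of the package or of the diff viewer.

```typescript
// Hypothetical helpers for illustration only; not part of @yasserkhanorg/e2e-agents.
interface FileStat {
    path: string;
    added: number;
    removed: number;
}

// Parse one entry shaped like "package/README.md +112 -584".
// Returns null for lines that do not match that shape.
function parseStatLine(line: string): FileStat | null {
    const match = /^(\S+)\s+\+(\d+)\s+-(\d+)$/.exec(line.trim());
    if (!match) {
        return null;
    }
    const [, path = '', added = '0', removed = '0'] = match;
    return {path, added: Number(added), removed: Number(removed)};
}

// Sum additions and removals over every entry that parses.
function summarize(lines: string[]): {files: number; added: number; removed: number} {
    let files = 0;
    let added = 0;
    let removed = 0;
    for (const line of lines) {
        const stat = parseStatLine(line);
        if (stat) {
            files += 1;
            added += stat.added;
            removed += stat.removed;
        }
    }
    return {files, added, removed};
}

// Three entries copied from the list above.
const sample = [
    'package/README.md +112 -584',
    'package/dist/agent/pipeline.js +30 -1329',
    'package/dist/cli.js +26 -1057',
];
const totals = summarize(sample);
console.log(totals);
```

For these three entries, removals far outweigh additions, which fits the new `agent/*` and `cli/commands/*` files that appear as pure additions elsewhere in the list.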

package/README.md CHANGED

@@ -1,19 +1,16 @@
 # @yasserkhanorg/e2e-agents
-Framework-agnostic LLM provider library with MCP server for autonomous E2E testing.
+AI-powered E2E test impact analysis, generation, and healing for frontend repositories.
 [![npm](https://img.shields.io/npm/v/%40yasserkhanorg%2Fe2e-agents)](https://www.npmjs.com/package/@yasserkhanorg/e2e-agents)
 [![License](https://img.shields.io/badge/license-Apache%202.0-blue)](LICENSE)
 [![GitHub](https://img.shields.io/badge/github-yasserfaraazkhan%2Fe2e--agents-blue?logo=github)](https://github.com/yasserfaraazkhan/e2e-agents)
-## Overview
+## What It Does
-Pluggable LLM provider abstraction for test automation with:
-- **Anthropic Claude** — Advanced reasoning, vision support
-- **OpenAI GPT** — Official OpenAI API integration
-- **Ollama** — Free, local, privacy-first
-- **MCP Server** — 6 tools for test discovery, generation, and healing
-- **Custom Providers** — Extend with any OpenAI-compatible API
+Given a git diff, `e2e-ai-agents` determines which E2E test flows are impacted, identifies coverage gaps, and can generate or heal Playwright tests — all from the CLI.
+**Pipeline:** `impact` → `plan` → `generate` → `heal` → `finalize`
 ## Installation
@@ -21,666 +18,197 @@ Pluggable LLM provider abstraction for test automation with:
 npm install @yasserkhanorg/e2e-agents
 ```
-## Module Formats (CJS + ESM)
-This package ships both CommonJS and ESM builds:
-- `require('@yasserkhanorg/e2e-agents')` loads the CommonJS build from `dist/index.js`.
-- `import ... from '@yasserkhanorg/e2e-agents'` loads the ESM build from `dist/esm/index.js`.
-- `./mcp` follows the same pattern (`dist/mcp-server.js` for CJS, `dist/esm/mcp-server.js` for ESM).
-Node.js >= 20 is required.
-## Quick Links
-📖 **[Comprehensive Guide](E2E_AI_TESTING.md)** - In-depth documentation including:
-- How to use e2e-ai-agents in your projects
-- Real-world examples for Playwright, Cypress, Selenium
-- How Mattermost uses this package
-- Cost optimization and best practices
-## Quick Start
-### Use Claude
-```typescript
-import { AnthropicProvider } from '@yasserkhanorg/e2e-agents';
-const claude = new AnthropicProvider({
-    apiKey: process.env.ANTHROPIC_API_KEY
-});
-const response = await claude.generateText('Analyze test failure');
-console.log(response.text);
-console.log(`Cost: $${response.cost.toFixed(4)}`);
-```
-### Use OpenAI
-```typescript
-import { OpenAIProvider } from '@yasserkhanorg/e2e-agents';
-const openai = new OpenAIProvider({
-    apiKey: process.env.OPENAI_API_KEY,
-    model: 'gpt-4'
-});
-const response = await openai.generateText('Summarize test failure');
-console.log(response.text);
-```
-Tip: for accurate OpenAI cost tracking, set `costPer1MInputTokens` and `costPer1MOutputTokens` in the `OpenAIProvider` config.
-### Use Ollama (Free)
-```typescript
-import { OllamaProvider } from '@yasserkhanorg/e2e-agents';
-const ollama = new OllamaProvider({
-    model: 'deepseek-r1:7b'
-});
-const response = await ollama.generateText('Generate test case');
-console.log(response.text); // Free!
-```
-### Use Custom Provider (OpenAI-compatible endpoint)
-```typescript
-import { CustomProvider } from '@yasserkhanorg/e2e-agents';
-const custom = new CustomProvider({
-    baseUrl: 'https://your-llm-gateway.example.com/v1',
-    auth: { Authorization: `Bearer ${process.env.CUSTOM_API_KEY}` },
-    model: 'your-model-name',
-    requestFormat: 'openai'
-});
-const response = await custom.generateText('Generate test case');
-console.log(response.text);
-```
-`requestFormat` can be `'openai'`, `'anthropic'`, or `'custom'` (with `transformRequest`/`transformResponse`).
-### Factory Pattern
-```typescript
-import { LLMProviderFactory } from '@yasserkhanorg/e2e-agents';
+Requires Node.js >= 20. Ships both CommonJS and ESM builds.
-// Auto-detect from environment
-const provider = LLMProviderFactory.create({
-    type: 'anthropic',
-    config: { apiKey: process.env.ANTHROPIC_API_KEY }
-});
-```
+## CLI Commands
-### Hybrid Mode (Free + Premium)
+```bash
+# Analyze which flows are impacted by code changes
+npx e2e-ai-agents impact --path /path/to/webapp
-```typescript
-const provider = LLMProviderFactory.createHybrid({
-    primary: { type: 'ollama', config: { model: 'deepseek-r1:7b' } },
-    fallback: { type: 'anthropic', config: { apiKey: process.env.ANTHROPIC_API_KEY } },
-    useFallbackFor: ['vision'] // Only use Claude for vision
-});
+# Generate a coverage plan with gap analysis
+npx e2e-ai-agents plan --path /path/to/webapp
-await provider.generateText('Analyze code'); // Uses Ollama (free)
-await provider.analyzeImage([...], 'Compare screenshots'); // Uses Claude (vision)
-```
+# Generate tests for uncovered gaps (requires plan output)
+npx e2e-ai-agents generate --path /path/to/webapp
-## CLI: Impact and Gap Analysis
+# Heal flaky/failing specs from a Playwright report
+npx e2e-ai-agents heal --path /path/to/webapp --traceability-report ./playwright-report.json
-Run AI-driven impact analysis or gap analysis on any frontend repo.
+# Stage generated tests, commit, and open a PR
+npx e2e-ai-agents finalize-generated-tests --path /path/to/webapp --create-pr
-```bash
-npx e2e-ai-agents impact --path /path/to/webapp
-npx e2e-ai-agents gap --path /path/to/webapp
-npx e2e-ai-agents plan --path /path/to/webapp
-npx e2e-ai-agents suggest --path /path/to/webapp --mattermost
-npx e2e-ai-agents generate --path /path/to/webapp --pipeline
-npx e2e-ai-agents heal --path /path/to/webapp --traceability-report ./playwright-report.json
-npx e2e-ai-agents suggest --path /path/to/webapp
-npx e2e-ai-agents approve-and-generate --path /path/to/webapp
-npx e2e-ai-agents finalize-generated-tests --path /path/to/webapp
-npx e2e-ai-agents feedback --path /path/to/webapp --feedback-input ./feedback.json
+# Ingest test execution data for traceability
 npx e2e-ai-agents traceability-capture --path /path/to/webapp --traceability-report ./playwright-report.json
 npx e2e-ai-agents traceability-ingest --path /path/to/webapp --traceability-input ./traceability-input.json
-```
-Local approval workflow (dev/QA + AI) with one review artifact:
-```bash
-# 1) Suggest and generate local review + pending approval JSON
-node scripts/local-impact-workflow.js suggest --config ./e2e-ai-agents.config.json --since master
-# 2) Approve or reject after review
-node scripts/local-impact-workflow.js approve --config ./e2e-ai-agents.config.json --decision approve --note "QA approved"
+# Ingest recommendation feedback for calibration
+npx e2e-ai-agents feedback --path /path/to/webapp --feedback-input ./feedback.json
-# 3) Generate/heal only after approval
-node scripts/local-impact-workflow.js generate --config ./e2e-ai-agents.config.json --since master --pipeline-dry-run
-# Generates in MCP-only mode by default (AI generation/healing only).
-# Optional: tune MCP timeout per call:
-# node scripts/local-impact-workflow.js generate --config ./e2e-ai-agents.config.json --since master --pipeline-mcp-timeout-ms 120000
+# Test LLM provider connectivity
+npx e2e-ai-agents llm-health
 ```
-Generated local artifacts:
-- `<tests-root>/.e2e-ai-agents/local-impact-review.md`
-- `<tests-root>/.e2e-ai-agents/local-impact-approval.json`
-If tests live outside the app root:
+`plan` and `suggest` are aliases. Use `--help` for all available flags.
-```bash
-npx e2e-ai-agents impact --path /path/to/webapp --tests-root /path/to/e2e-tests
-```
+## Configuration
-Optional config file `e2e-ai-agents.config.json` (JSON):
+Create `e2e-ai-agents.config.json` in your project (auto-discovered):
 ```json
 {
   "path": ".",
-  "profile": "default",
+  "profile": "mattermost",
   "testsRoot": ".",
-  "flowCatalogPath": ".e2e-ai-agents/flows.json",
   "mode": "impact",
   "framework": "auto",
-  "timeLimitMinutes": 10,
-  "budget": { "maxUSD": 2, "maxTokens": 20000 },
-  "artifacts": { "mode": "commit", "specsDir": ".e2e-ai-agents/reports" },
-  "selectors": { "patchOnApply": true },
-  "testDiscovery": { "patterns": ["tests/**/*.spec.ts"] },
-  "flowDiscovery": {
-    "patterns": ["channels/src/components/**/*.{tsx,jsx}"],
-    "exclude": ["**/components/**/stories/**"]
-  },
-  "catalogScoring": {
-    "priorityScores": { "P0": 10, "P1": 6, "P2": 3 },
-    "fileMatchWeight": 1
-  },
+  "git": { "since": "origin/master" },
   "impact": {
-    "allowFallback": false,
-    "dependencyGraph": {
-      "enabled": true,
-      "maxDepth": 3,
-      "maxExpandedFiles": 1000,
-      "filePatterns": ["**/*.{ts,tsx,js,jsx}"],
-      "excludePatterns": ["**/node_modules/**", "**/.git/**", "**/dist/**", "**/build/**"],
-      "aliasRoots": ["src", "channels/src"],
-      "pathAliases": {
-        "@app/*": ["src/*"],
-        "@channels/*": ["channels/src/*"]
-      }
-    },
-    "traceability": {
-      "enabled": true,
-      "manifestPath": ".e2e-ai-agents/traceability.json",
-      "minSignalsPerTest": 1
-    },
-    "subsystemRisk": {
-      "enabled": false,
-      "mapPath": ".e2e-ai-agents/subsystem-risk-map.json",
-      "maxRulesPerFile": 4
-    },
-    "aiFlow": {
-      "enabled": true,
-      "strict": true,
-      "provider": "anthropic",
-      "contextFiles": [
-        "CLAUDE.OPTIONAL.md",
-        ".claude/CLAUDE.OPTIONAL.md"
-      ],
-      "maxFilesPerRequest": 220,
-      "maxFlowsPerRequest": 80,
-      "maxTokens": 4000,
-      "temperature": 0
-    },
-    "aiMapping": {
-      "enabled": false,
-      "provider": "anthropic",
-      "contextFiles": [
-        "CLAUDE.OPTIONAL.md",
-        ".claude/CLAUDE.OPTIONAL.md"
-      ],
-      "maxFlowsPerRequest": 30,
-      "maxCandidateTests": 400,
-      "maxTokens": 4000,
-      "temperature": 0
-    }
+    "dependencyGraph": { "enabled": true, "maxDepth": 3 },
+    "traceability": { "enabled": true },
+    "aiFlow": { "enabled": true, "provider": "anthropic" }
   },
   "pipeline": {
     "enabled": false,
     "scenarios": 3,
     "outputDir": "specs/functional/ai-assisted",
-    "heal": true,
-    "mcp": false,
-    "mcpAllowFallback": false,
-    "mcpOnly": false,
-    "mcpCommandTimeoutMs": 180000,
-    "mcpRetries": 1
+    "mcp": false
   },
-  "llm": { "provider": "anthropic", "fallback": "ollama" },
   "policy": {
-    "minConfidenceForTargeted": 60,
-    "safeMergeMinConfidence": 85,
-    "forceFullOnWarningsAtOrAbove": 2,
-    "forceFullOnP0WithGaps": true,
-    "forceFullOnRiskyFiles": true,
-    "riskyFilePatterns": ["**/auth/**", "**/permissions/**", "**/security/**", "**/*.sql"],
-    "enforcementMode": "advisory",
+    "enforcementMode": "block",
     "blockOnActions": ["must-add-tests"]
-  },
-  "flags": { "defaultState": "on" },
-  "audience": { "defaultRoles": ["member"] },
-  "blastRadius": {
-    "memberBonus": 1,
-    "guestBonus": 1,
-    "adminOnlyPenalty": -1,
-    "flagOffPenalty": -2
   }
 }
 ```
-Notes:
-- If no framework config is found, provide `testDiscovery.patterns` or `--patterns`.
-- Use `flowDiscovery.patterns` or `--flow-patterns` to customize flow scanning.
-- Use `testsRoot` when tests live outside the app root.
-- Use `flowCatalogPath` or `--flow-catalog` to provide a flow catalog for deterministic P0/P1 mapping.
-- Impact mode expects a git diff; use `--since` or add `"impact": { "allowFallback": true }` to fall back to scanning.
-- Impact analysis now uses static reverse dependency graph expansion (configurable via `impact.dependencyGraph`) to propagate changed-file impact, including alias imports via `aliasRoots` and `pathAliases`.
-- Impact analysis can use coverage-style traceability manifests (`impact.traceability`) for file->test mapping with heuristic fallback for uncovered flows.
-- Impact analysis can run AI-first flow mapping (`impact.aiFlow`) so impacted flows and priorities come from LLM reasoning rather than heuristic scoring.
-- Impact analysis can use optional Anthropic-powered AI mapping (`impact.aiMapping`) to map impacted flows to existing tests when traceability is missing/low; context is loaded from optional markdown files such as `CLAUDE.OPTIONAL.md`.
-- Impact analysis can apply subsystem-aware risk boosts and priority floors from a map (`impact.subsystemRisk`) to capture known high-blast-radius areas.
-- Diffing is computed from `merge-base(<since>, HEAD)` when available, which is the standard PR-impact baseline.
-- Reports are written under `testsRoot/.e2e-ai-agents/reports` (or app root if `testsRoot` is not set).
-- Use `approve-and-generate` for explicit approval before generating/healing tests.
-- Selector/data-testid patches are only applied when `--apply` is passed.
-- `plan` is a direct alias for `suggest`.
-- `generate` is a direct alias for `approve-and-generate`.
-- Mattermost-first strict mode is available with `--mattermost` (or `"profile": "mattermost"` in config).
-- In Mattermost mode, heuristic-only test mapping is treated as insufficient evidence and recommendations are escalated to broad runs.
-- `heal` targets flaky/failed specs from a Playwright JSON report (`--traceability-report`).
-- `--apply` remains available as a legacy shortcut for direct `gap` execution.
-- Use `--pipeline` to run the Playwright generation pipeline.
-- If `e2e-test-gen-cli.ts` exists in `testsRoot`, it is used as the advanced runner.
-- If it is absent, `@yasserkhanorg/e2e-agents` falls back to package-native generation with strategy-based templates, quality guardrails (`no test.describe`, single tag), and iterative heal attempts.
-- `--pipeline-mcp` now attempts the official Playwright Test Agent loop first (planner/generator/healer) using:
-  - `npx playwright init-agents --loop=claude --prompts`
-  - `.mcp.json` (`playwright run-test-mcp-server`)
-  - `claude -p` non-interactive orchestration
-- In MCP mode, fallback is strict by default: if official agent setup fails, generation stops instead of silently degrading.
-- Use `--pipeline-mcp-allow-fallback` (or config `pipeline.mcpAllowFallback=true`) only when you explicitly want fallback generation.
-- MCP prerequisites: Playwright config in `testsRoot` and Claude CLI installed/authenticated.
-- Use `--pipeline-mcp-timeout-ms` (or config `pipeline.mcpCommandTimeoutMs`) to limit per-command MCP wait time and fail fast in strict mode.
-- Use `--pipeline-mcp-retries` (or config `pipeline.mcpRetries`) to retry transient MCP failures while staying in AI-only mode.
-- Official MCP outputs are validated against discovered local API surface (`pw.*`, `pw.testBrowser.*`, `channelsPage.*`) to block invented methods (for example `pw.mainClient.*`).
-- If fallback is enabled and official MCP agent execution is unavailable, pipeline falls back to `e2e-test-gen` (if present) or package-native generation with warnings in report output.
-- `impact/gap` pipeline output now includes `pipeline.mcp` (`requested`, `active`, `backend`) so MCP activation is explicit.
-- `suggest` writes `.e2e-ai-agents/plan.json` with `runSet` (`smoke|targeted|full`) and confidence.
-- `suggest` also writes `.e2e-ai-agents/ci-summary.md` with CI status: `run-now`, `must-add-tests`, or `safe-to-merge`.
-- CLI policy overrides: `--policy-min-confidence`, `--policy-safe-merge-confidence`, `--policy-force-full-on-warnings`, `--policy-risky-patterns`, `--policy-enforcement-mode`, `--policy-block-actions`.
-- GitHub Actions output wiring: `--github-output $GITHUB_OUTPUT`.
-- Optional merge gating: `--fail-on-must-add-tests` exits non-zero when uncovered P0/P1 gaps are detected. Leave this flag unset for advisory-only mode.
-- `suggest` now appends run metrics to `.e2e-ai-agents/metrics.jsonl` and writes aggregated `.e2e-ai-agents/metrics-summary.json`.
-- `impact/gap` now include actionable `testSuggestions` with linked source files and skeleton test code.
-- `impact/gap` now include `impactModel` metadata (`flowMapping`, `testMapping`, `confidenceClass`, traceability stats, dependency graph stats).
-- `impact/gap` now include `runMetadata` (run id/timestamps/duration/since ref) for auditability.
-- `impact/gap` now include optional `impactModel.subsystemRisk` stats (map status, matched files/rules, boosted flows).
-- `impact/gap` pipeline result rows now include failure taxonomy (`failureCategory`, `failureCode`) when generation/heal fails.
-- `feedback` appends outcomes to `.e2e-ai-agents/feedback.json` and recomputes `.e2e-ai-agents/calibration.json`.
-- `feedback` also computes intelligent flaky scores into `.e2e-ai-agents/flaky-tests.json`.
-- `traceability-capture` converts Playwright JSON execution report + optional coverage map into `.e2e-ai-agents/traceability-input.json`.
-- `traceability-ingest` merges CI execution mappings into `.e2e-ai-agents/traceability.json` and persists rolling counts in `.e2e-ai-agents/traceability-state.json`.
-- Traceability capture flags: `--traceability-report`, `--traceability-capture-output`, `--traceability-coverage-map`, `--traceability-changed-files`.
-- Traceability ingest tuning flags: `--traceability-min-hits`, `--traceability-max-files-per-test`, `--traceability-max-age-days`.
-- Optional ownership routing for flaky alerts: `.e2e-ai-agents/subsystem-owners.json`.
-- `suggest` automatically consumes optional operational manifests:
-  - `.e2e-ai-agents/flaky-tests.json`
-  - `.e2e-ai-agents/quality-gates.json`
-- `plan.json` includes `nextActions` commands for run/approve-and-generate/heal/finalize/PR handoff.
-- `finalize-generated-tests` stages generated artifacts from `gap.json`, commits, and can open a PR with `--create-pr`.
-- Generated Mattermost Playwright tests use standalone `test(...)` style (no `test.describe`) and a single tag string.
-Programmatic API:
+Key options:
-```typescript
-import {analyzeImpact, findGaps, recommendTests, captureTraceability, ingestTraceability} from '@yasserkhanorg/e2e-agents';
+- **`testsRoot`** — path to tests when they live outside the app root
+- **`profile`** — `default` or `mattermost` (strict mode with escalation for heuristic-only mappings)
+- **`impact.dependencyGraph`** — static reverse dependency graph for transitive impact
+- **`impact.traceability`** — file-to-test mapping from CI execution data
+- **`impact.aiFlow`** — LLM-powered flow mapping (requires `ANTHROPIC_API_KEY`)
+- **`pipeline.mcp`** — use Playwright MCP server for browser-aware generation/healing
+- **`policy.enforcementMode`** — `advisory`, `warn`, or `block`
-await analyzeImpact({path: '/path/to/webapp'});
-await findGaps({path: '/path/to/webapp'});
-const suggestion = await recommendTests({path: '/path/to/webapp'});
-console.log(suggestion.plan.runSet);
+## CI Integration
-const captured = captureTraceability({
-  path: '/path/to/webapp',
-  testsRoot: '/path/to/e2e-tests/playwright',
-  reportPath: '/path/to/playwright-report.json',
-});
+### GitHub Actions
-ingestTraceability({
-  path: '/path/to/webapp',
-  testsRoot: '/path/to/e2e-tests/playwright',
-  payload: JSON.parse(require('fs').readFileSync(captured.outputPath, 'utf8')),
-});
+```yaml
+- name: Run E2E coverage check
+  env:
+    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+  run: |
+    npx e2e-ai-agents plan \
+      --config ./e2e-ai-agents.config.json \
+      --since origin/${{ github.base_ref }} \
+      --fail-on-must-add-tests \
+      --github-output "$GITHUB_OUTPUT"
 ```
-Feedback API:
+The `plan` command writes:
+- `.e2e-ai-agents/plan.json` — structured plan with `runSet`, `confidence`, `decision`
+- `.e2e-ai-agents/ci-summary.md` — markdown summary for PR comments
+- `.e2e-ai-agents/metrics-summary.json` — run metrics
-```typescript
-import {appendFeedbackAndRecompute} from '@yasserkhanorg/e2e-agents';
-appendFeedbackAndRecompute('/path/to/webapp', {
-  timestamp: new Date().toISOString(),
-  runSet: 'targeted',
-  recommendedTests: ['specs/channels/realtime.spec.ts'],
-  executedTests: ['specs/channels/realtime.spec.ts'],
-  failedTests: ['specs/channels/realtime.spec.ts'],
-  escapedFailures: []
-});
-```
+Use `--fail-on-must-add-tests` to exit non-zero when uncovered P0/P1 gaps exist. Use `--github-output` to expose outputs to subsequent workflow steps.
-Traceability ingest API:
+See [examples/github-actions/](examples/github-actions/) for a complete workflow template.
-```typescript
-import {ingestTraceability} from '@yasserkhanorg/e2e-agents';
-ingestTraceability({
-  path: '/path/to/webapp',
-  testsRoot: '/path/to/e2e-tests/playwright',
-  payload: {
-    runs: [
-      {
-        test: 'specs/channels/channels.switch.spec.ts',
-        touchedFiles: ['channels/src/components/channel_switcher/channel_switcher.tsx']
-      }
-    ]
-  },
-  options: {minHits: 2}
-});
-```
+## Pipeline Modes
-Automation API:
+### Package Native (default)
-```typescript
-import {handoffGeneratedTests} from '@yasserkhanorg/e2e-agents';
+Strategy-based Playwright test templates with quality guardrails (no `test.describe`, single tag) and iterative heal attempts.
-handoffGeneratedTests({
-  appPath: '/path/to/webapp',
-  testsRoot: '/path/to/e2e-tests/playwright',
-  createPr: true,
-});
-```
+### MCP Mode (`--pipeline-mcp`)
-CI integration template:
+Uses the official Playwright Test Agent loop (planner/generator/healer) with Claude CLI orchestration. Validates generated specs against discovered local API surface to block hallucinated methods.
-- [GitHub Actions example](examples/github-actions/pr-impact.yml)
-- The example uses Node 22 (`actions/setup-node@v4` with `node-version: 22`).
-- The example captures Playwright JSON output via `traceability-capture` and ingests it with `traceability-ingest`.
-- Feedback payload example: [examples/feedback.sample.json](examples/feedback.sample.json)
-- Subsystem owners example: [examples/subsystem-owners.sample.json](examples/subsystem-owners.sample.json)
-- Traceability ingest payload schema: [schemas/traceability-input.schema.json](schemas/traceability-input.schema.json)
-- Traceability ingest payload example: [examples/traceability-input.sample.json](examples/traceability-input.sample.json)
-- Traceability manifest example: [examples/traceability.sample.json](examples/traceability.sample.json)
-- Subsystem risk map schema: [schemas/subsystem-risk-map.schema.json](schemas/subsystem-risk-map.schema.json)
-- Subsystem risk map example: [examples/subsystem-risk-map.sample.json](examples/subsystem-risk-map.sample.json)
-- End-to-end verification steps: [examples/verification/README.md](examples/verification/README.md)
-- Impact checklist playbook: [examples/verification/IMPACT_ANALYSIS_CHECKLIST.md](examples/verification/IMPACT_ANALYSIS_CHECKLIST.md)
-- Checklist validator command: `npm run impact:checklist -- --root <tests-root>`
+- **`--pipeline-mcp-only`** — fail if MCP setup fails (no silent fallback)
+- **`--pipeline-mcp-allow-fallback`** — fall back to package-native if MCP unavailable
+- **`--pipeline-mcp-timeout-ms`** — per-command timeout
+- **`--pipeline-mcp-retries`** — retry count for transient failures
-Traceability manifest example (`.e2e-ai-agents/traceability.json`):
+### Agentic Generation (`generate` command)
-```json
-{
-  "schemaVersion": "1.0.0",
-  "tests": [
-    {
-      "test": "specs/channels/channels.switch.spec.ts",
-      "touchedFiles": ["channels/src/components/channel_switcher/channel_switcher.tsx"]
-    }
-  ]
-}
-```
+LLM-powered generate-run-fix loop: generates a spec, runs it, analyzes failures, and iterates up to `--max-attempts` times.
-Traceability ingest input example (`traceability-input.json`):
+## LLM Providers
-```json
-{
-  "runs": [
-    {
-      "test": "specs/channels/channels.switch.spec.ts",
-      "touchedFiles": ["channels/src/components/channel_switcher/channel_switcher.tsx"]
-    }
-  ]
-}
-```
-Flow catalog entries can also include optional audience and flag metadata:
+Used internally for AI enrichment, test generation, and healing.
-```json
-{
-  "id": "messaging.realtime",
-  "priority": "P0",
-  "audience": ["member", "guest"],
-  "flags": [
-    "EnableSomething",
-    { "name": "EnableEnterpriseOnly", "source": "config", "defaultState": "off" }
-  ],
-  "tests": ["specs/functional/channels/realtime.spec.ts"]
-}
-```
-## Extending with Custom Frameworks
+```bash
+# Anthropic (default)
+export ANTHROPIC_API_KEY=sk-ant-...
-### 1. Create Custom Provider
+# OpenAI
+export OPENAI_API_KEY=sk-...
-```typescript
-import { LLMProvider } from '@yasserkhanorg/e2e-agents';
-export class MyCustomProvider implements LLMProvider {
-    async generateText(prompt: string) {
-        // Your API call here
-        return {
-            text: '...',
-            cost: 0.001,
-            tokens: { input: 100, output: 50 }
-        };
-    }
-    async analyzeImage(images, prompt) {
-        throw new Error('Vision not supported');
-    }
-    async streamText(prompt) {
-        // Generator implementation
-        yield 'chunk1';
-        yield 'chunk2';
-    }
-    getUsageStats() {
-        return { /* ... */ };
-    }
-}
+# Ollama (free, local)
+export OLLAMA_BASE_URL=http://localhost:11434
+export OLLAMA_MODEL=deepseek-r1:7b
 ```
-### 2. Register with Factory
+Programmatic provider usage:
 ```typescript
-import { LLMProviderFactory } from '@yasserkhanorg/e2e-agents';
-LLMProviderFactory.register('my-provider', (config) => {
-    return new MyCustomProvider(config);
-});
+import { AnthropicProvider } from '@yasserkhanorg/e2e-agents';
-// Use it
-const provider = LLMProviderFactory.create({
-    type: 'my-provider',
-    config: { apiKey: '...' }
+const claude = new AnthropicProvider({
+    apiKey: process.env.ANTHROPIC_API_KEY
 });
+const response = await claude.generateText('Analyze test failure');
 ```
-### 3. Integrate with Test Framework
+Factory pattern with auto-detection, hybrid mode (free local + premium fallback), and custom OpenAI-compatible endpoints are also supported. See the [provider API exports](src/index.ts) for full details.
-```typescript
-// Playwright example
-import { test } from '@playwright/test';
-import { LLMProviderFactory } from '@yasserkhanorg/e2e-agents';
+## MCP Server
-const llm = LLMProviderFactory.create({
-    type: 'anthropic',
-    config: { apiKey: process.env.ANTHROPIC_API_KEY }
-});
-test('use LLM to verify UI', async ({ page }) => {
-    await page.goto('https://example.com');
-    const screenshot = await page.screenshot();
-    const analysis = await llm.analyzeImage(
-        [{ data: screenshot.toString('base64'), mimeType: 'image/png' }],
-        'Is the login button visible and correctly styled?'
-    );
-    console.log(analysis.text);
-});
-```
-## MCP Server Integration
-For Playwright test agents (v1.56+):
+Exposes 6 tools for test agents (Playwright v1.56+):
 ```typescript
 import { E2EAgentsMCPServer } from '@yasserkhanorg/e2e-agents/mcp';
 const server = new E2EAgentsMCPServer();
-const tools = server.getTools();
-// Available tools:
-// - discover_tests: Find tests needed for code changes
-// - read_file: Read repository files
-// - write_file: Create/update test files
-// - run_tests: Execute tests
-// - get_git_changes: Detect changed files
-// - get_repository_context: Gather metadata
-```
-## Configuration
-### Environment Variables
-```bash
-ANTHROPIC_API_KEY=sk-ant-...
-ANTHROPIC_MODEL=claude-sonnet-4-5-20250929
-OPENAI_API_KEY=sk-...
-OPENAI_MODEL=gpt-4
-OPENAI_BASE_URL=https://api.openai.com/v1
-OPENAI_ORG_ID=org_...
-OLLAMA_BASE_URL=http://localhost:11434
-OLLAMA_MODEL=deepseek-r1:7b
-```
-Note: If `OLLAMA_BASE_URL` points to the root host (for example, `http://localhost:11434`), it will be normalized to `/v1`.
-### Setup
-**Claude:**
-1. Get key: https://console.anthropic.com
-2. Export: `export ANTHROPIC_API_KEY=sk-ant-...`
-**OpenAI:**
-1. Get key: https://platform.openai.com
-2. Export: `export OPENAI_API_KEY=sk-...`
-**Ollama:**
-1. Install: `curl -fsSL https://ollama.com/install.sh | sh`
-2. Pull: `ollama pull deepseek-r1:7b`
-3. Run: `ollama serve`
-## Error Handling
-```typescript
-import { LLMProviderError, UnsupportedCapabilityError } from '@yasserkhanorg/e2e-agents';
-try {
-    await provider.analyzeImage([...], 'Analyze');
-} catch (error) {
-    if (error instanceof UnsupportedCapabilityError) {
-        console.log(`Not supported by: ${error.provider}`);
-    } else if (error instanceof LLMProviderError) {
-        console.log(`API error: ${error.message}`);
-    }
-}
-```
-## Performance Comparison
-| Feature | Claude | OpenAI | Ollama |
-|---------|--------|--------|--------|
-| Vision | ✅ | ✅ (model dependent) | ❌ |
-| Cost | $3-15/1M tokens | Model dependent | Free |
-| Speed | ~800ms | ~1000ms | ~3000ms |
-| Streaming | ✅ | ✅ | ✅ |
-| Local | ❌ | ❌ | ✅ |
-## Cost Optimization
-```typescript
-const stats = provider.getUsageStats();
-console.log(`Tokens: ${stats.totalTokens.toLocaleString()}`);
-console.log(`Cost: $${stats.totalCost.toFixed(2)}`);
-console.log(`Avg speed: ${stats.averageResponseTimeMs.toFixed(0)}ms`);
+// Tools: discover_tests, read_file, write_file, run_tests, get_git_changes, get_repository_context
 ```
-## Performance & Optimization (v0.3.0+)
+Security: `write_file` is restricted to test spec files (`*.spec.ts`, `*.test.ts`) and the `.e2e-ai-agents/` directory. Path traversal and symlink escape are blocked. Rate limited to 100 requests/minute.
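The `write_file` restriction described above can be pictured with a minimal sketch. This is not the package's implementation: `isWriteAllowed`, `ALLOWED_SUFFIXES`, and `ARTIFACT_DIR` are hypothetical names, and the check is purely lexical. Blocking symlink escape would additionally require comparing real paths (for example via `fs.realpathSync`), which is omitted here.

```typescript
import * as path from 'node:path';

// Hypothetical names for illustration; not exported by the package.
const ALLOWED_SUFFIXES = ['.spec.ts', '.test.ts'];
const ARTIFACT_DIR = '.e2e-ai-agents';

// Returns true when `requested` stays inside `root` and is either a test
// spec file or a file inside a `.e2e-ai-agents/` directory.
function isWriteAllowed(root: string, requested: string): boolean {
    const absRoot = path.resolve(root);
    const target = path.resolve(absRoot, requested);
    const rel = path.relative(absRoot, target);

    // Reject the root itself and anything that resolves outside of it,
    // such as "../secrets.spec.ts" or an absolute path elsewhere.
    if (rel === '' || rel === '..' || rel.startsWith('..' + path.sep) || path.isAbsolute(rel)) {
        return false;
    }

    // Allow files located under an artifact directory.
    const parentSegments = rel.split(path.sep).slice(0, -1);
    if (parentSegments.includes(ARTIFACT_DIR)) {
        return true;
    }

    // Otherwise only test spec files are writable.
    return ALLOWED_SUFFIXES.some((suffix) => target.endsWith(suffix));
}
```

Resolving the path before checking the suffix matters: a request such as `../outside/login.spec.ts` has an allowed suffix but still resolves outside the root.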
-### Logging Configuration
+## Traceability
-Control logging verbosity with the `LOG_LEVEL` environment variable:
+Build file-to-test mappings from CI execution data:
-```bash
-# Production: errors only
-LOG_LEVEL=ERROR npm start
-# Development: all messages
-LOG_LEVEL=DEBUG npm start
-```
-Supported levels: `ERROR`, `WARN`, `INFO`, `DEBUG` (default: `INFO`)
-### Caching
+1. **Capture** — extract test-file relationships from Playwright JSON reports
+2. **Ingest** — merge into a rolling manifest (`.e2e-ai-agents/traceability.json`)
+3. **Query** — impact analysis uses the manifest to map changed files to relevant tests
-Repository context and analysis data are cached internally by the tool.
-No public cache API is exposed; caching behavior is automatic.
+Tuning flags: `--traceability-min-hits`, `--traceability-max-files-per-test`, `--traceability-max-age-days`.
-### Performance Metrics (v0.3.0)
+Schemas: [schemas/traceability-input.schema.json](schemas/traceability-input.schema.json)
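The capture, ingest, and query steps above can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Observation`, `Entry`, and `Manifest` shapes, the `ingest` and `testsFor` functions, and the way `minHits` and `maxAgeDays` are interpreted are hypothetical. The real manifest format is defined by the schema linked above and may differ.

```typescript
// Hypothetical shapes; the actual format is defined by the package's schemas.
interface Observation {
    test: string;    // spec file that ran
    files: string[]; // source files the test was observed to touch
}

interface Entry {
    hits: number;     // how many runs observed this file-to-test link
    lastSeen: number; // epoch milliseconds of the most recent observation
}

// file path -> test path -> rolling entry
type Manifest = Record<string, Record<string, Entry>>;

// Ingest: merge one run's observations into the rolling manifest.
function ingest(manifest: Manifest, run: Observation[], now: number): Manifest {
    for (const { test, files } of run) {
        for (const file of files) {
            const tests = manifest[file] ?? (manifest[file] = {});
            const entry = tests[test] ?? (tests[test] = { hits: 0, lastSeen: now });
            entry.hits += 1;
            entry.lastSeen = now;
        }
    }
    return manifest;
}

// Query: map changed files to tests, keeping only links that were seen
// at least `minHits` times and within the last `maxAgeDays` days.
function testsFor(
    manifest: Manifest,
    changed: string[],
    minHits: number,
    maxAgeDays: number,
    now: number,
): string[] {
    const cutoff = now - maxAgeDays * 24 * 60 * 60 * 1000;
    const selected = new Set<string>();
    for (const file of changed) {
        const tests = manifest[file] ?? {};
        for (const test of Object.keys(tests)) {
            const entry = tests[test];
            if (entry.hits >= minHits && entry.lastSeen >= cutoff) {
                selected.add(test);
            }
        }
    }
    return Array.from(selected).sort();
}
```

Under these assumptions, a higher `minHits` trades recall for precision: a link observed only once is ignored until later runs confirm it.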
-Improvements from code quality refactoring:
+## Artifacts
-- **40% faster** stats calculation (incremental updates)
-- **30% faster** API key validation (pre-compiled patterns)
-- **90% faster** repository context (cache hits)
-- **15% smaller** bundle size (code deduplication)
-- **44 comprehensive tests** (80%+ coverage)
+| File | Written by | Purpose |
+|------|-----------|---------|
+| `plan.json` | `plan` | Coverage plan with gaps, decisions, metrics |
+| `ci-summary.md` | `plan` | Markdown for PR comments |
+| `metrics.jsonl` | `plan` | Append-only run metrics |
+| `metrics-summary.json` | `plan` | Aggregated metrics |
+| `traceability.json` | `traceability-ingest` | File-to-test manifest |
+| `traceability-state.json` | `traceability-ingest` | Rolling counts |
+| `feedback.json` | `feedback` | Recommendation outcomes |
+| `calibration.json` | `feedback` | Precision/recall calibration |
+| `flaky-tests.json` | `feedback` | Flaky test scores |
+| `agentic-summary.json` | `generate` | Agentic generation results |
-See [CHANGELOG.md](CHANGELOG.md) for detailed improvements.
-## Learn More
-For comprehensive documentation on:
-- Real-world usage examples
-- Integration with different frameworks
-- How Mattermost uses e2e-ai-agents in production
-- Cost optimization strategies
-- Security features and best practices
-👉 **See [E2E_AI_TESTING.md](E2E_AI_TESTING.md)**
+All artifacts are written under `<testsRoot>/.e2e-ai-agents/`.
 ## Production Usage
-This package is used in production by Mattermost for:
-- ✅ Automated test generation
-- ✅ Test validation and healing
-- ✅ UI screenshot analysis
-- ✅ Test data generation
-See the [Mattermost e2e-test-gen implementation](https://github.com/mattermost/mattermost/tree/master/e2e-tests/playwright) for a complete example.
+Used by [Mattermost](https://github.com/mattermost/mattermost) for CI-integrated E2E coverage gating, test generation, and spec healing. See the [Mattermost Playwright integration](https://github.com/mattermost/mattermost/tree/master/e2e-tests/playwright) for a real-world example.
 ## License