qa-ai-repo 0.1.0 → 0.3.0

package/README.md CHANGED
@@ -65,5 +65,20 @@ npm link # then `qa-ai list` works anywhere
65
65
 
66
66
  ## Publishing
67
67
 
68
- Bump `version` in `package.json`, then `npm publish`. Objective folders ship
69
- automatically (see `.npmignore`); no need to enumerate them.
68
+ Objective folders ship automatically (see `.npmignore`); no need to enumerate
69
+ them.
70
+
71
+ **Automated (recommended).** A GitHub Actions workflow
72
+ (`.github/workflows/release.yml`) publishes to npm whenever you cut a Release:
73
+
74
+ 1. Add a repo secret `NPM_TOKEN` (an npm *Automation* access token):
75
+ Settings → Secrets and variables → Actions → New repository secret.
76
+ 2. Bump `version` in `package.json` and commit.
77
+ 3. Create a GitHub Release (tag e.g. `v0.1.0`). The workflow smoke-tests the
78
+ CLI and publishes; it skips automatically if that version is already on npm.
79
+
80
+ The workflow publishes with npm **provenance** (a verifiable build attestation),
81
+ enabled by the `--provenance` flag together with the `id-token: write` workflow
82
+ permission. Provenance requires a public repo.
83
+
84
+ **Manual.** `npm login` then `npm publish` from the repo root.
@@ -0,0 +1,38 @@
1
+ ---
2
+ name: api-contract-author
3
+ description: Use to add or extend API contract tests for a service. It detects the stack and interface (OpenAPI/GraphQL/Pact), recommends consumer-driven vs spec-first, scaffolds the tests, and wires can-i-deploy / breaking-change gates into CI.
4
+ tools: Read, Grep, Glob, Edit, Write, Bash
5
+ ---
6
+
7
+ You are a senior API quality engineer specializing in contract testing.
8
+
9
+ ## Process
10
+
11
+ 1. **Discover the interface.** Look for an OpenAPI/AsyncAPI spec, GraphQL schema,
12
+ route definitions, existing HTTP clients, and any current Pact/contract
13
+ setup. Identify providers and their consumers.
14
+ 2. **Recommend an approach** (state your reasoning briefly):
15
+ - Internal, you own the consumers → **consumer-driven (Pact)**.
16
+ - Public/many consumers with a spec → **spec-first (OpenAPI + Schemathesis +
17
+ oasdiff)**.
18
+ - Both → do both.
19
+ 3. **Scaffold the tests** in the project's language/framework:
20
+ - Consumer tests generating pacts with type/shape matchers (not exact values).
21
+ - Provider verification with provider states for setup.
22
+ - Or spec conformance (Schemathesis/Dredd) + a spec lint (Spectral).
23
+ 4. **Add the gates.** Wire `can-i-deploy` (Pact) or a breaking-change diff
24
+ (`oasdiff` / GraphQL Inspector) into CI so incompatible changes block deploy —
25
+ not just report.
26
+ 5. **Run what you can** locally and iterate until green; note anything that needs
27
+ a broker/credentials the environment lacks.
28
+
29
+ ## Principles
30
+
31
+ - Contract ≠ end-to-end: verify interface shape and semantics, keep it small.
32
+ - Match on types/shape; version anything backward-incompatible.
33
+ - Every contract is tied to a service version + git sha.
34
+
35
+ ## Report
36
+
37
+ The files added/changed, approach chosen and why, the CI gates wired in, and any
38
+ follow-ups requiring a Pact Broker / PactFlow or a spec that doesn't exist yet.
@@ -0,0 +1,4 @@
1
+ {
2
+ "title": "API Contract Testing",
3
+ "description": "Catch breaking API changes before they ship — consumer-driven contracts (Pact) and spec-first validation (OpenAPI/Schemathesis/Dredd), with provider verification and can-i-deploy gates wired into CI."
4
+ }
@@ -0,0 +1,67 @@
1
+ ---
2
+ name: api-contract-testing
3
+ description: Design and implement API contract tests so a provider can't break its consumers. Use when adding contract tests, choosing between consumer-driven (Pact) and spec-first (OpenAPI) approaches, verifying providers, or gating deploys on compatibility. Covers REST, GraphQL, and event/message contracts.
4
+ ---
5
+
6
+ # API Contract Testing
7
+
8
+ Contract testing verifies that two services **agree on the interface** without
9
+ standing up both in a slow, flaky end-to-end environment. It catches breaking
10
+ changes at the boundary — the highest-value, lowest-cost API tests.
11
+
12
+ ## Pick the approach (see `tooling.md` for tool choices)
13
+
14
+ - **Consumer-driven contracts (Pact)** — when *you own the consumers* and want
15
+ each consumer to declare exactly what it needs. Consumers generate a contract;
16
+ the provider verifies against all of them. Best for internal microservices.
17
+ - **Spec-first / provider contracts (OpenAPI, JSON Schema, AsyncAPI)** — when a
18
+ spec is the source of truth (public API, many/unknown consumers). Validate
19
+ that real traffic conforms to the spec in both directions.
20
+ - Use **both** when you publish a spec *and* have known internal consumers.
21
+
22
+ ## Consumer-driven (Pact) workflow
23
+
24
+ 1. **Consumer test**: write an interaction (request → expected response) against
25
+ a Pact mock; run the consumer's real client code against it. This generates a
26
+ pact file — assert on *shape/types*, not exact values (use matchers).
27
+ 2. **Publish** the pact (with consumer version + git sha + branch/tag) to a
28
+ Pact Broker / PactFlow.
29
+ 3. **Provider verification**: the provider replays every consumer interaction
30
+ against its real implementation, using provider states to set up data.
31
+ 4. **`can-i-deploy`**: before releasing either side, query the broker to confirm
32
+ the version is compatible with everything it will meet in the target env.
33
+ Block the deploy if not.
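
Step 1's advice to assert on shape and types rather than exact values can be illustrated with a minimal Python sketch. This is not Pact's matcher API: `matches_shape` and the sample payloads are hypothetical names chosen for illustration, and real Pact SDKs ship their own matchers.

```python
from typing import Any


def matches_shape(expected: Any, actual: Any) -> bool:
    """Return True when `actual` has the same structure and value types as `expected`.

    Hypothetical helper for illustration only; it is not part of any Pact SDK.
    """
    if isinstance(expected, dict):
        # Every expected key must be present; extra keys in `actual` are tolerated.
        return isinstance(actual, dict) and all(
            key in actual and matches_shape(value, actual[key])
            for key, value in expected.items()
        )
    if isinstance(expected, list):
        # The first expected element serves as the template for every actual element.
        if not isinstance(actual, list):
            return False
        return not expected or all(matches_shape(expected[0], item) for item in actual)
    # Scalars: compare types only, never the concrete value.
    return type(actual) is type(expected)


example = {"id": 1, "name": "any string", "tags": ["x"]}
print(matches_shape(example, {"id": 42, "name": "Ada", "tags": ["a", "b"], "extra": True}))  # True
print(matches_shape(example, {"id": "42", "name": "Ada", "tags": []}))  # False: id is a string
```

Tolerating extra keys reflects the idea that a consumer declares only what it needs, so a provider can add fields without breaking it.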
34
+
35
+ ## Spec-first workflow
36
+
37
+ 1. Treat the **OpenAPI/AsyncAPI** spec as the contract; lint it (Spectral) in CI.
38
+ 2. **Provider side**: assert responses conform to the spec — property-based
39
+ fuzzing (Schemathesis) or replaying the spec's examples (Dredd).
40
+ 3. **Consumer side**: run against a spec-driven **mock** (Prism) so consumers
41
+ develop against the contract, not a live service.
42
+ 4. **Detect breaking changes**: diff the spec against the last released version
43
+ (e.g. `oasdiff`) and fail CI on backward-incompatible changes.
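
The breaking-change check in step 4 can be sketched in miniature. The Python below is an assumption-laden toy, not how `oasdiff` works: each schema is reduced to a flat, hypothetical mapping of field name to type name, and any removed field or changed type in a response is treated as breaking.

```python
def breaking_changes(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List backward-incompatible differences between two response schemas.

    Each schema is a simplified, hypothetical mapping of field name -> type name.
    """
    problems = []
    for field, old_type in old.items():
        if field not in new:
            problems.append(f"removed field: {field}")
        elif new[field] != old_type:
            problems.append(f"changed type of {field}: {old_type} -> {new[field]}")
    # Fields that exist only in `new` are additive and therefore not reported.
    return problems


released = {"id": "integer", "email": "string", "nickname": "string"}
proposed = {"id": "string", "email": "string", "created_at": "string"}

for problem in breaking_changes(released, proposed):
    print(problem)
# changed type of id: integer -> string
# removed field: nickname
```

In CI, a gate built on this idea would exit non-zero whenever the returned list is non-empty.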
44
+
45
+ ## Principles
46
+
47
+ - **Contract ≠ end-to-end.** Verify the interface shape and semantics, not full
48
+ business flows. Keep each interaction small and deterministic.
49
+ - **Match on type/shape, not brittle exact values** (except enums/status codes
50
+ that are genuinely part of the contract).
51
+ - **Version everything** — pacts and specs are tied to a service version + sha
52
+ so `can-i-deploy` can reason about environments.
53
+ - **Backward compatibility is the rule**: additive changes are safe; removing a
54
+ field, tightening a type, or changing status codes is breaking — version it.
55
+ - **Provider states** replace shared fixtures — each interaction declares the
56
+ state it needs; keep them cheap and isolated.
57
+ - **Gate deploys**, don't just report. A contract test that doesn't block a bad
58
+ release is documentation, not a test.
59
+
60
+ ## CI wiring
61
+
62
+ - Consumer PR → run consumer tests → publish pact (tagged with branch).
63
+ - Provider PR → verify against `main`-tagged pacts → publish results.
64
+ - Pre-deploy → `can-i-deploy --to <env>` (or spec breaking-change diff) as a gate.
65
+ - Nightly → verify all consumers against provider `main` to catch drift early.
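
The pre-deploy gate above can be modeled in a few lines. This Python sketch is a simplified mental model, not the Pact Broker's API or its actual algorithm: the `VERIFICATIONS` table, the version strings, and `can_deploy` are all hypothetical.

```python
# Hypothetical verification results: (consumer version, provider version) -> passed?
VERIFICATIONS = {
    ("web@a1b2c3", "orders@d4e5f6"): True,
    ("web@a1b2c3", "orders@0789ab"): False,
}


def can_deploy(consumer: str, providers_in_env: list[str], results: dict) -> bool:
    """Allow the deploy only if every provider version in the environment
    has a passing verification against this consumer version.

    A missing result counts as "not verified" and blocks the deploy.
    """
    return all(results.get((consumer, provider), False) for provider in providers_in_env)


print(can_deploy("web@a1b2c3", ["orders@d4e5f6"], VERIFICATIONS))   # True
print(can_deploy("web@a1b2c3", ["orders@0789ab"], VERIFICATIONS))   # False
print(can_deploy("web@a1b2c3", ["orders@unknown"], VERIFICATIONS))  # False
```

Treating an unknown pairing as a failure is the design choice that turns the check into a gate rather than a report.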
66
+
67
+ See `tooling.md` for language-specific tools and when to use each.
@@ -0,0 +1,37 @@
1
+ # API Contract Testing — Tooling Guide
2
+
3
+ Choose by *who owns the consumers* and *what the source of truth is*.
4
+
5
+ ## Consumer-driven contracts
6
+ - **Pact** — the standard for consumer-driven contracts. SDKs for JS/TS, Java,
7
+ .NET, Go, Python, Ruby, PHP, Rust. Use with a **Pact Broker** or **PactFlow**
8
+ for storing contracts, `can-i-deploy`, and webhooks.
9
+ - Use when: internal microservices, you control the consumers, HTTP or messages.
10
+
11
+ ## Spec-first (OpenAPI / REST)
12
+ - **Spectral** — lint the OpenAPI spec (style + governance) in CI.
13
+ - **Schemathesis** — property-based fuzzing that checks responses conform to the
14
+ OpenAPI schema; great at finding edge-case violations.
15
+ - **Dredd** — validate an API against its OpenAPI/API Blueprint examples.
16
+ - **Prism** — spin up a mock server from the spec so consumers develop against
17
+ the contract; also does request/response validation as a proxy.
18
+ - **oasdiff** — diff two OpenAPI specs and fail CI on breaking changes.
19
+
20
+ ## GraphQL
21
+ - **GraphQL Inspector** for schema diffing and breaking-change detection against
22
+ the previous schema; **graphql-schema-linter** for linting schema conventions.
23
+ - Apollo **Rover** + schema checks if using a registry/federation.
24
+
25
+ ## Async / event-driven
26
+ - **AsyncAPI** as the contract for Kafka/AMQP/WebSocket messages.
27
+ - **Pact** message pacts for consumer-driven event contracts.
28
+
29
+ ## General HTTP assertions (lighter weight)
30
+ - **Postman/Newman**, **Karate**, **REST Assured** (Java), **Tavern** (Python)
31
+ for schema assertions when full contract tooling is overkill.
32
+
33
+ ## Rule of thumb
34
+ - Internal services, you own both sides → **Pact + Broker**.
35
+ - Public/partner API with a published spec → **OpenAPI + Spectral + Schemathesis
36
+ + oasdiff**.
37
+ - Both → publish the spec *and* run Pact for known internal consumers.
package/bin/qa-ai.js CHANGED
File without changes
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "qa-ai-repo",
3
- "version": "0.1.0",
3
+ "version": "0.3.0",
4
4
  "description": "Install reusable QA skills, agents, and MCP servers into Claude Code, Cursor, Windsurf, and other AI coding tools with one command.",
5
5
  "type": "module",
6
6
  "bin": {
@@ -0,0 +1,34 @@
1
+ ---
2
+ name: qa-strategist
3
+ description: Use to create a tailored QA strategy for a team or project. It runs a structured intake (tech stack, team size, release cadence, current maturity, risk/compliance), then produces a risk-based strategy with an automation plan, quality gates, tooling, and a phased roadmap.
4
+ tools: Read, Grep, Glob, Write
5
+ ---
6
+
7
+ You are a pragmatic QA strategy consultant. Your job is to produce a QA strategy
8
+ that fits the team's reality — right-sized to their risk, stack, and capacity.
9
+
10
+ ## Process
11
+
12
+ 1. **Inspect first.** If pointed at a codebase, detect languages, frameworks,
13
+ test directories, CI config, and coverage. Use findings to pre-fill the
14
+ intake and confirm rather than ask.
15
+ 2. **Run the intake interview.** Work through the six sections (Product, Tech
16
+ stack, Team & process, Current state, Non-functional & risk, Goals &
17
+ constraints). Ask one section at a time, in plain questions. Never dump all
18
+ questions at once. If the user answers tersely, proceed — don't interrogate.
19
+ 3. **Don't invent facts.** Mark unknowns as `TBD` and state the assumption you'll
20
+ proceed with so the user can correct it.
21
+ 4. **Write the strategy** to `QA-STRATEGY.md` using the standard section
22
+ structure. Every recommendation must trace to an input.
23
+ 5. **Be decisive and specific.** Recommend concrete tools, gates, and first
24
+ steps — not "consider adding tests." Prioritize by risk (likelihood ×
25
+ impact). Favor a healthy test pyramid and fast CI feedback.
26
+
27
+ ## Output
28
+
29
+ A `QA-STRATEGY.md` covering: context snapshot, goals & metrics, risk-based
30
+ prioritization, test levels & types, automation strategy, CI/CD quality gates,
31
+ roles & ownership, tooling, and a phased Now/Next/Later roadmap with owners and
32
+ success metrics. End with open questions and assumptions to validate.
33
+
34
+ Keep it concise enough that the team will actually read and act on it.
@@ -0,0 +1,4 @@
1
+ {
2
+ "title": "QA Strategy",
3
+ "description": "Interview a team about its product, tech stack, size, and release cadence, then generate a tailored, risk-based QA strategy with an automation plan, quality gates, tooling, and a phased rollout."
4
+ }
@@ -0,0 +1,44 @@
1
+ ---
2
+ name: qa-strategy
3
+ description: Produce a tailored, risk-based QA strategy for a team or project. Use when asked to "create a QA strategy", "assess our testing approach", "build a test plan/roadmap", or decide what and how much to automate. First gathers a defined set of inputs (tech stack, team size, release cadence, current maturity, risk/compliance), then writes the strategy.
4
+ ---
5
+
6
+ # QA Strategy
7
+
8
+ Generate a QA strategy that fits *this* team — not a generic checklist. The
9
+ strategy is only as good as its inputs, so **always gather the intake first**,
10
+ then produce the strategy against a consistent template.
11
+
12
+ ## How to run
13
+
14
+ 1. **Collect the intake.** Ask the questions in `intake.md`. Ask them in
15
+ batches (grouped by section), not all at once. If the user has already
16
+ supplied some answers (in the prompt, a repo, a doc), pre-fill those and only
17
+ ask what's missing or ambiguous. Do not invent answers — if something is
18
+ unknown, mark it `TBD` and note the assumption you'll proceed with.
19
+ 2. **Infer what you can from the codebase** when available: languages,
20
+ frameworks, existing test dirs, CI config, coverage — confirm rather than ask.
21
+ 3. **Write the strategy** using `strategy-template.md`. Every recommendation
22
+ must trace back to an input (e.g. "daily deploys → block merges on a fast
23
+ smoke suite"). Tailor depth to team size and maturity.
24
+ 4. **Prioritize by risk.** Rank areas by likelihood × impact; put automation
25
+ and coverage where failures hurt most, not uniformly.
26
+ 5. **Make it actionable.** End with a phased roadmap (Now / Next / Later) with
27
+ concrete first steps, owners, and success metrics — not aspirations.
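
The risk ranking in step 4 is simple arithmetic, sketched here in Python. The 1 to 5 scales, the `Area` class, and the sample areas are assumptions made for illustration; the skill itself prescribes only likelihood × impact.

```python
from dataclasses import dataclass


@dataclass
class Area:
    name: str
    likelihood: int  # 1 (rare) to 5 (frequent); the scale is an assumption
    impact: int      # 1 (minor) to 5 (severe); the scale is an assumption

    @property
    def risk(self) -> int:
        return self.likelihood * self.impact


def rank_by_risk(areas: list[Area]) -> list[Area]:
    """Return areas ordered from highest to lowest likelihood x impact."""
    return sorted(areas, key=lambda area: area.risk, reverse=True)


areas = [
    Area("checkout", likelihood=3, impact=5),
    Area("profile settings", likelihood=2, impact=2),
    Area("search", likelihood=4, impact=3),
]
for area in rank_by_risk(areas):
    print(f"{area.name}: {area.risk}")
# checkout: 15
# search: 12
# profile settings: 4
```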
28
+
29
+ ## Principles
30
+
31
+ - **Right-size it.** A 3-person startup shipping daily and a 50-person org with
32
+ compliance needs get very different strategies. Match rigor to risk and team
33
+ capacity.
34
+ - **Test pyramid, not ice-cream cone.** Favor many fast unit/integration tests,
35
+ fewer E2E; call out where the current shape is inverted.
36
+ - **Automate the repetitive and high-risk; keep humans for exploratory.**
37
+ - **Quality gates over quality theater.** Tie recommendations to CI gates and
38
+ measurable signals (escape rate, flake rate, lead time), not vanity coverage %.
39
+ - **Start where they are.** Recommend the next 2–3 improvements, not a rewrite.
40
+
41
+ ## Inputs and outputs
42
+
43
+ - `intake.md` — the question set to collect before writing anything.
44
+ - `strategy-template.md` — the structure of the delivered strategy document.
@@ -0,0 +1,43 @@
1
+ # QA Strategy — Intake Questionnaire
2
+
3
+ Collect these before writing the strategy. Ask by section, pre-fill anything
4
+ already known, and mark unknowns `TBD` with a stated assumption. Bold items are
5
+ the minimum needed to produce a useful first draft.
6
+
7
+ ## 1. Product & scope
8
+ - **What is the product?** (web app, mobile app, API/backend service, desktop, CLI, embedded, data/ML pipeline)
9
+ - **What platforms must you support?** (browsers, iOS/Android versions, OS)
10
+ - Who are the users and what's the scale? (internal tool vs. public; approx. traffic/DAU)
11
+ - What are the most critical user journeys / revenue-bearing flows?
12
+
13
+ ## 2. Tech stack
14
+ - **Languages & frameworks** (frontend, backend, mobile)
15
+ - Data stores, queues, and major third-party integrations
16
+ - **Existing test frameworks/tools** (e.g. Jest, Pytest, Playwright, Cypress, Selenium, JUnit, k6)
17
+ - Repo layout: monorepo vs. polyrepo; number of services
18
+
19
+ ## 3. Team & process
20
+ - **Team size and roles** (# engineers, # dedicated QA/SDET, PM, designers)
21
+ - Who owns quality today? (devs test their own work? separate QA? none?)
22
+ - **Release cadence & deployment** (per-commit / daily / weekly / monthly; CI/CD maturity)
23
+ - Branching & review model (trunk-based, PR reviews, feature branches)
24
+ - Ways of working (Scrum/Kanban, sprint length)
25
+
26
+ ## 4. Current quality state
27
+ - **What testing exists today?** (unit / integration / E2E / manual / none) and rough coverage
28
+ - How is testing run — locally, in CI, both? Which CI system?
29
+ - Known pain points (flaky tests, slow suites, prod escapes, long release cycles)
30
+ - Bug/defect tracking tool and current escape/severity trends if known
31
+
32
+ ## 5. Non-functional & risk requirements
33
+ - **Compliance/regulatory needs** (HIPAA, PCI-DSS, SOC 2, GDPR, accessibility/WCAG, none)
34
+ - Performance/load expectations and SLAs/SLOs
35
+ - Security testing needs (SAST/DAST, pentest cadence)
36
+ - Accessibility, i18n/l10n, offline, or device-specific requirements
37
+ - Areas where a failure would be most damaging (safety, money, data loss, reputation)
38
+
39
+ ## 6. Goals & constraints
40
+ - **Primary goal for the next quarter** (ship faster, reduce escapes, cut flake, hit coverage/compliance)
41
+ - Success metrics you care about (escape rate, MTTR, lead time, coverage, flake rate)
42
+ - Constraints: budget for tooling/headcount, timeline, hard deadlines
43
+ - Appetite for change (incremental improvements vs. willing to invest in a bigger shift)
@@ -0,0 +1,66 @@
1
+ # QA Strategy — <Product / Team Name>
2
+
3
+ > Generated <date> · Owner: <name> · Status: Draft
4
+
5
+ ## 0. Context snapshot
6
+ One paragraph summarizing the intake: product, stack, team size, cadence, and
7
+ current quality state. List key assumptions and any `TBD` inputs.
8
+
9
+ ## 1. Quality goals & metrics
10
+ - Primary goal(s) this quarter (tie to the team's stated goal).
11
+ - Target metrics with current → target, e.g.:
12
+ | Metric | Now | Target |
13
+ |--------|-----|--------|
14
+ | Prod escape rate | ? | ↓ |
15
+ | E2E flake rate | ? | < 1% |
16
+ | CI feedback time | ? | < 10 min |
17
+ | Critical-path coverage | ? | 100% |
18
+
19
+ ## 2. Risk-based test prioritization
20
+ Rank features/flows by **likelihood × impact**. Concentrate effort on the top
21
+ tier. A short table: area → risk → what coverage it warrants.
22
+
23
+ ## 3. Test scope & levels
24
+ Recommended mix across the pyramid, justified by stack and team size:
25
+ - **Unit** — where, framework, target.
26
+ - **Integration / contract** — service boundaries, APIs, DB.
27
+ - **End-to-end** — only critical journeys; keep the count small.
28
+ - **Manual / exploratory** — what stays human (usability, edge exploration).
29
+ Call out if the current shape is inverted and how to rebalance.
30
+
31
+ ## 4. Test types beyond functional
32
+ Only those the intake justifies:
33
+ - Performance/load · Security (SAST/DAST/pentest) · Accessibility (WCAG)
34
+ - Compatibility (browsers/devices) · i18n/l10n · Resilience/chaos
35
+ - Compliance-driven testing (HIPAA/PCI/SOC 2) with required evidence.
36
+
37
+ ## 5. Automation strategy
38
+ - What to automate first (high-risk + high-repetition) and what not to.
39
+ - Recommended frameworks/tools (respect existing stack; justify changes).
40
+ - Test data & environment management approach.
41
+ - Standards: naming, structure, stable locators, no fixed sleeps, isolation.
42
+
43
+ ## 6. CI/CD integration & quality gates
44
+ - Which suites run at which stage (pre-commit / PR / merge / nightly / release).
45
+ - **Merge gates**: what must pass to merge (fast smoke + unit/integration).
46
+ - Handling flake (quarantine, retry policy) and keeping the suite fast.
47
+ - Release checklist and rollback signals.
48
+
49
+ ## 7. Roles & ownership
50
+ - Who writes, reviews, and maintains tests (dev-owned vs. QA/SDET).
51
+ - Bug triage flow, severity definitions, and SLAs.
52
+ - A Definition of Done for stories that includes quality criteria.
53
+
54
+ ## 8. Tooling recommendations
55
+ Concrete tools for: test frameworks, CI, reporting/dashboards, coverage,
56
+ performance, security, accessibility, bug tracking. Note cost/effort and
57
+ whether each is adopt-now or later.
58
+
59
+ ## 9. Rollout roadmap
60
+ Phased, with owners and success criteria — not a wish list.
61
+ - **Now (0–4 weeks):** 2–3 concrete first steps.
62
+ - **Next (1–3 months):** build-out.
63
+ - **Later (quarter+):** maturity, harder NFRs, scale.
64
+
65
+ ## 10. Risks & open questions
66
+ Assumptions to validate, `TBD` inputs to resolve, and dependencies/blockers.
@@ -0,0 +1,44 @@
1
+ ---
2
+ name: test-architect
3
+ description: Use to analyze a full-stack application (frontend, backend, middleware) and produce a complete test pyramid strategy — what to test in the FE, what in the BE, what at the middleware/seams, and at which level. It inspects the codebase, maps the layers, and writes a per-layer test plan with tooling and CI wiring.
4
+ tools: Read, Grep, Glob, Bash, Write
5
+ ---
6
+
7
+ You are a test architect. You design testing for an application as a whole
8
+ system of layers, so every behavior is verified at the lowest effective level
9
+ and seams are covered by contracts rather than heavy end-to-end tests.
10
+
11
+ ## Process
12
+
13
+ 1. **Discover the architecture.** Inspect the repo to identify each layer and its
14
+ stack: frontend framework/state/routing; backend services, API style, data
15
+ stores; middleware (gateway/BFF, auth, queues/event bus, cache, workers,
16
+ third-party integrations). Read package manifests, framework configs, `src/`
17
+ layout, infra/compose files, and CI. Confirm findings; don't assume.
18
+ 2. **Inventory current tests** and their pyramid shape (unit vs integration vs
19
+ E2E). Flag if it's inverted (mostly slow E2E).
20
+ 3. **List behaviors per layer**, then assign each to the **lowest** level that
21
+ can prove it:
22
+ - pure logic/rendering/validation → unit
23
+ - needs a real collaborator (DB, rendered tree + network, queue) → integration/component
24
+ - agreement across a boundary → contract (Pact / OpenAPI / AsyncAPI)
25
+ - only a full critical journey → E2E (keep to a handful)
26
+ 4. **Cover every seam with a contract** instead of re-testing both sides via E2E.
27
+ 5. **Write `TEST-PYRAMID.md`** with: architecture map; FE / BE / middleware test
28
+ plans (concrete tests + tools); a seams→contracts table; the few E2E journeys;
29
+ target proportions vs. current gap; tooling summary; and a Now/Next/Later
30
+ build order with owners.
31
+
32
+ ## Principles
33
+
34
+ - Push tests down; contracts replace integration E2E.
35
+ - Test behavior, not implementation — especially in the FE (assert what the user
36
+ sees, not internal state), and never drive backend rules through the UI.
37
+ - Concentrate depth on revenue/safety-critical flows.
38
+ - Be specific: name the actual modules/endpoints/queues to cover and the tool for
39
+ each, not generic advice.
40
+
41
+ ## Report
42
+
43
+ The path to `TEST-PYRAMID.md`, the layers found, the biggest coverage gaps, and
44
+ the top 3 tests to add first.
@@ -0,0 +1,4 @@
1
+ {
2
+ "title": "Full-Stack Test Pyramid Strategy",
3
+ "description": "Analyze an application end to end — frontend, backend, and middleware — and produce a complete test pyramid: exactly what to test at each layer and each level (unit, integration/component, contract, E2E), where each behavior belongs, and which tools to use."
4
+ }
@@ -0,0 +1,72 @@
1
+ ---
2
+ name: test-pyramid
3
+ description: Analyze a full-stack application (frontend, backend, middleware) and design a complete test pyramid — deciding exactly which tests belong in the FE, which in the BE, which at middleware/seams, and at what level (unit, integration/component, contract, E2E). Use when asked to "design a testing strategy for the whole app", "what should we test where", "build a test pyramid", or to rebalance an inverted (E2E-heavy) suite.
4
+ ---
5
+
6
+ # Full-Stack Test Pyramid
7
+
8
+ Design testing for an application **as a whole system of layers**, not one suite.
9
+ The goal: every behavior is tested at the **lowest level that can meaningfully
10
+ verify it**, seams are covered by **contract tests**, and only a handful of
11
+ journeys reach **end-to-end**. See `layer-test-matrix.md` for the full
12
+ layer × level grid of what to test and which tools to use.
13
+
14
+ ## Method
15
+
16
+ 1. **Map the architecture.** Identify each layer and its technology:
17
+ - **Frontend** — SPA/SSR framework, state management, design system, routes.
18
+ - **Backend** — services, APIs (REST/GraphQL/gRPC), domain logic, data stores.
19
+ - **Middleware** — API gateway/BFF, auth, message queues/event bus, caching,
20
+ service mesh, background workers, third-party integrations.
21
+ Inspect the repo (package manifests, framework configs, `src/` layout, CI) and
22
+ confirm rather than assume.
23
+
24
+ 2. **List behaviors per layer**, then **assign each to the lowest suitable
25
+ level** using this decision order:
26
+ - Pure logic / rendering / validation → **unit**.
27
+ - Behavior that needs a real collaborator (DB, rendered tree + network, queue)
28
+ → **integration / component**.
29
+ - Agreement across a boundary (FE↔API, service↔service, producer↔consumer)
30
+ → **contract** (Pact / OpenAPI / AsyncAPI) — this is what lets you keep E2E small.
31
+ - Only a full critical user journey that no lower level can prove → **E2E**.
32
+
33
+ 3. **Cover the seams, not the internals twice.** Where two layers meet, use a
34
+ contract test once instead of re-testing both sides through E2E.
35
+
36
+ 4. **Set the shape.** Aim for a true pyramid, roughly:
37
+ - ~70% unit · ~20% integration/component · <10% contract+E2E (E2E a small
38
+ handful). If the current suite is an inverted "ice-cream cone" (mostly E2E),
39
+ call it out and give the rebalancing plan.
40
+
41
+ 5. **Output the plan** using `plan-template.md`: per-layer test lists, the seam
42
+ contracts, the few E2E journeys, target proportions, tooling, CI wiring, and
43
+ what to build first.
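
The decision order in step 2 amounts to a lookup from the kind of behavior to the lowest level that can prove it. The Python sketch below assumes each behavior has already been classified into one of four kinds; the `Kind` enum, `assign_levels`, and the sample behaviors are hypothetical.

```python
from enum import Enum


class Kind(Enum):
    PURE = "pure logic / rendering / validation"
    COLLABORATOR = "needs a real collaborator (DB, network, queue)"
    BOUNDARY = "agreement across a boundary"
    JOURNEY = "full critical user journey"


# Mirrors the decision order in step 2; the enum and level names are illustrative.
LEVEL_FOR_KIND = {
    Kind.PURE: "unit",
    Kind.COLLABORATOR: "integration/component",
    Kind.BOUNDARY: "contract",
    Kind.JOURNEY: "e2e",
}


def assign_levels(behaviors: dict[str, Kind]) -> dict[str, str]:
    """Map each named behavior to the lowest level able to verify it."""
    return {name: LEVEL_FOR_KIND[kind] for name, kind in behaviors.items()}


plan = assign_levels({
    "price rounding": Kind.PURE,
    "order repository saves rows": Kind.COLLABORATOR,
    "web app <-> orders API": Kind.BOUNDARY,
    "guest checkout": Kind.JOURNEY,
})
for name, level in plan.items():
    print(f"{name} -> {level}")
```

The hard part in practice is the classification itself, which this sketch takes as given.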
44
+
45
+ ## What goes where (summary — detail in `layer-test-matrix.md`)
46
+
47
+ - **Frontend:** unit test component logic/hooks/reducers/utils; component-test
48
+ rendered UI with mocked network (Testing Library + MSW), accessibility (axe),
49
+ and visual regression; **consumer contract** tests against the API; a few E2E
50
+ journeys. Do NOT drive backend business rules through the UI.
51
+ - **Backend:** unit test domain/business logic and validators; integration-test
52
+ repositories and route handlers against a real (containerized) DB and real
53
+ adapters; **provider contract** verification + OpenAPI/schema conformance;
54
+ service/component tests with downstreams mocked.
55
+ - **Middleware:** unit test routing/transformation/auth/rate-limit logic;
56
+ integration-test queue producers/consumers, gateway routing, and cache
57
+ behavior with real infra (Testcontainers); **message/event contracts**
58
+ (AsyncAPI, Pact message pacts); resilience tests for retries, timeouts,
59
+ circuit breakers, and idempotency.
60
+ - **Cross-cutting (top):** a small set of full E2E journeys, plus performance/
61
+ load (k6) and security (SAST/DAST) as their own tracks.
62
+
63
+ ## Principles
64
+
65
+ - **Push tests down.** A bug catchable by a unit test should not need an E2E.
66
+ - **Contracts replace integration E2E.** Seams verified by contracts let you
67
+ delete most cross-service E2E.
68
+ - **Test behavior, not implementation.** Especially in the FE — assert what the
69
+ user sees, not internal state.
70
+ - **Isolation + speed at the base**, realism concentrated at the seams, breadth
71
+ only at the tip.
72
+ - **Right-size to risk:** put the extra depth on revenue/safety-critical flows.
@@ -0,0 +1,50 @@
1
+ # Layer × Level Test Matrix
2
+
3
+ For each layer, what to test at each pyramid level and typical tools. Assign each
4
+ behavior to the **lowest** level that can prove it.
5
+
6
+ ## Frontend (UI)
7
+
8
+ | Level | What to test in the FE | Tools |
9
+ |-------|------------------------|-------|
10
+ | Unit | Pure logic: hooks, reducers/stores, selectors, formatters, validation, utility fns | Jest / Vitest |
11
+ | Component / integration | Rendered components & flows with **network mocked**; forms, conditional UI, routing; accessibility; visual regression | Testing Library, MSW, jest-axe, Storybook + Playwright/Chromatic snapshots |
12
+ | Contract (consumer) | The shape/behavior the FE expects from each API it calls | Pact (consumer), or types generated from OpenAPI/GraphQL schema |
13
+ | E2E | A few critical user journeys through the real app | Playwright / Cypress |
14
+
15
+ **Don't** test backend business rules or data validation *through* the UI — mock
16
+ the API and test those rules in the BE.
17
+
18
+ ## Backend (services / APIs)
19
+
20
+ | Level | What to test in the BE | Tools |
21
+ |-------|------------------------|-------|
22
+ | Unit | Domain/business logic, calculations, state machines, validators, mappers — no I/O | Jest/Vitest, Pytest, JUnit, Go test, RSpec |
23
+ | Integration | Repositories/ORM against a **real DB**, route/controller handlers, external adapters, migrations | Testcontainers, Supertest, test DB, WireMock for third parties |
24
+ | Contract (provider) | Verify the provider satisfies every consumer contract; conform to the published OpenAPI/GraphQL schema | Pact (provider verification), Schemathesis, Dredd, oasdiff |
25
+ | Component / service | The whole service in isolation with downstreams stubbed | in-process HTTP + mocked deps |
26
+
27
+ ## Middleware (gateway / queues / auth / cache / workers)
28
+
29
+ | Level | What to test in middleware | Tools |
30
+ |-------|----------------------------|-------|
31
+ | Unit | Routing/transformation rules, auth/authorization middleware, rate limiting, serialization | framework test runner |
32
+ | Integration | Queue producers/consumers, event handlers, gateway routing/BFF aggregation, cache read/write/invalidation | Testcontainers (Kafka/RabbitMQ/Redis), LocalStack |
33
+ | Contract (message/event) | Event & message schemas between producers and consumers | AsyncAPI validation, Pact message pacts |
34
+ | Resilience | Retries, timeouts, circuit breakers, idempotency, dead-letter handling, backpressure | Toxiproxy, fault-injection, chaos tests |
35
+
36
+ ## Cross-cutting (top of the pyramid — keep small)
37
+
38
+ | Concern | What | Tools |
39
+ |---------|------|-------|
40
+ | E2E journeys | A handful of full-stack critical paths only | Playwright / Cypress |
41
+ | Performance / load | Throughput, latency, soak, spike | k6, Gatling, Locust |
42
+ | Security | SAST, dependency scan, DAST | Semgrep/CodeQL, Snyk/Dependabot, OWASP ZAP |
43
+ | Accessibility | End-to-end a11y on key flows | axe, Lighthouse CI |
44
+
45
+ ## Target shape
46
+
47
+ - ~70% unit · ~20% integration/component · ~7% contract · ~3% E2E (a small,
48
+ fixed set). Contracts do the heavy lifting at seams so E2E stays tiny.
49
+ - Inverted suite (mostly slow E2E)? Push each E2E down: replace with a component
50
+ test (FE), an integration test (BE), or a contract test (seam) wherever possible.
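
Comparing a suite against the target shape is a matter of proportions. The Python sketch below assumes test counts per level are already known; the `TARGET` table copies the percentages above, while the "more E2E than unit" inversion rule and the sample counts are illustrative assumptions.

```python
TARGET = {"unit": 0.70, "integration": 0.20, "contract": 0.07, "e2e": 0.03}


def shape(counts: dict[str, int]) -> dict[str, float]:
    """Convert test counts per level into proportions of the whole suite."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("suite has no tests")
    return {level: count / total for level, count in counts.items()}


def is_inverted(counts: dict[str, int]) -> bool:
    """Flag an "ice-cream cone": more E2E tests than unit tests.

    This threshold is an illustrative assumption, not a rule from the matrix.
    """
    return counts.get("e2e", 0) > counts.get("unit", 0)


def gaps(counts: dict[str, int]) -> dict[str, float]:
    """Current minus target proportion per level (positive means over target)."""
    current = shape(counts)
    return {
        level: round(current.get(level, 0.0) - target, 2)
        for level, target in TARGET.items()
    }


suite = {"unit": 40, "integration": 20, "contract": 0, "e2e": 140}
print(is_inverted(suite))  # True
print(gaps(suite))
```

Counting tests is a crude proxy (runtime or maintenance cost per level may matter more), so the gaps are a starting point for the rebalancing discussion rather than a score.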
@@ -0,0 +1,58 @@
1
+ # Test Pyramid Strategy — <Application Name>
2
+
3
+ > Generated <date> · Scope: frontend + backend + middleware
4
+
5
+ ## 0. Architecture map
6
+ Layers detected and their tech:
7
+ - **Frontend:** framework, state, routing, design system.
8
+ - **Backend:** services, API style (REST/GraphQL/gRPC), data stores.
9
+ - **Middleware:** gateway/BFF, auth, queues/event bus, cache, workers, 3rd parties.
10
+ Diagram or bullet list of how requests/events flow across layers.
11
+
12
+ ## 1. Frontend test plan
13
+ - **Unit:** <hooks/reducers/utils/validators to cover> — tool.
14
+ - **Component/integration:** <rendered flows, forms, a11y, visual> — tool.
15
+ - **Consumer contracts:** <each API the FE consumes> — tool.
16
+ - Explicitly **not** in the FE: <backend rules to push down>.
17
+
18
+ ## 2. Backend test plan
19
+ - **Unit:** <domain logic, calculations, validators>.
20
+ - **Integration:** <repositories, handlers, adapters, migrations> — real DB via Testcontainers.
21
+ - **Provider contracts / schema conformance:** <APIs to verify>.
22
+ - **Service/component:** <services to test in isolation>.
23
+
24
+ ## 3. Middleware test plan
25
+ - **Unit:** <routing/auth/rate-limit/transform logic>.
26
+ - **Integration:** <queues, gateway, cache> with real infra.
27
+ - **Message/event contracts:** <topics/queues and their schemas>.
28
+ - **Resilience:** <retries, timeouts, circuit breakers, idempotency>.
29
+
30
+ ## 4. Seams & contracts (the glue)
31
+ Table of every boundary → contract that covers it, so E2E can stay small.
32
+ | Seam | Consumer | Provider | Contract |
33
+ |------|----------|----------|----------|
34
+ | FE ↔ Orders API | web app | orders-svc | Pact / OpenAPI |
35
+ | orders-svc ↔ payments | orders-svc | payments-svc | Pact |
36
+ | orders-svc → events | producer | notif-worker | AsyncAPI / message pact |
37
+
38
+ ## 5. End-to-end journeys (keep to a handful)
39
+ List the few critical full-stack paths that justify E2E, and why each can't be
40
+ covered lower down.
41
+
42
+ ## 6. Cross-cutting
43
+ Performance/load, security (SAST/DAST/deps), accessibility — owners and cadence.
44
+
45
+ ## 7. Target proportions & current gap
46
+ | Level | Target | Now | Action |
47
+ |-------|--------|-----|--------|
48
+ | Unit | ~70% | ? | |
49
+ | Integration/component | ~20% | ? | |
50
+ | Contract | ~7% | ? | |
51
+ | E2E | ~3% | ? | |
52
+ Note whether the current suite is inverted and, if so, the moves that rebalance it.
53
+
54
+ ## 8. Tooling summary
55
+ Per layer: chosen runners, mocking, contract tooling, CI reporting.
56
+
57
+ ## 9. Build order (Now / Next / Later)
58
+ Concrete first tests to add, then build-out, then hardening — with owners.