@metasession.co/devaudit-cli 0.1.52 → 0.1.53

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@metasession.co/devaudit-cli",
-   "version": "0.1.52",
+   "version": "0.1.53",
    "description": "DevAudit CLI — installs, syncs, and operates the Metasession SDLC across consumer projects.",
    "type": "module",
    "bin": {
@@ -33,7 +33,7 @@
    },
    "dependencies": {
      "@clack/prompts": "^0.8.2",
-     "@metasession.co/devaudit-plugin-sdk": "^0.1.52",
+     "@metasession.co/devaudit-plugin-sdk": "^0.1.53",
      "commander": "^12.1.0",
      "consola": "^3.2.3",
      "env-paths": "^3.0.0",
@@ -39,7 +39,7 @@ These standards apply to all Metasession products, client engagements, and inter
  ### Speed over Exhaustiveness
  - Fast feedback prioritized (unit tests < 30 seconds)
  - Parallelization and sharding for E2E suites
- - Strategic test selection based on code changes
+ - Strategic test selection based on code changes — first concrete implementation is the three-tier E2E gating model (smoke / critical / regression), see Test_Strategy.md § *E2E gating model* (v0.1.53+)
  - Regression suites optimized for execution time
 
  ### Traceability
@@ -150,6 +150,18 @@ Testing effort is prioritized by risk level, determined at planning time:
 
  AI involvement in Medium or High categories raises risk by one level. The Test Strategy defines specific testing depth requirements per level.
 
+ ### E2E gate enforcement (v0.1.53+)
+
+ The MoSCoW prioritisation of acceptance criteria maps onto three E2E gates, each enforced at a different point in the workflow:
+
+ - **Must-tier ACs in the smoke subset** must pass on every push to the integration branch. Blocking — a red smoke gate stops the integration hop.
+ - **Must-tier ACs in the critical subset** must pass on every PR to the release branch. Blocking — a red critical gate stops the release.
+ - **Should/Could-tier ACs (full regression)** must pass on the next post-merge run to the release branch OR a hotfix issue is auto-filed. Not pre-merge blocking — the safety net is post-hoc triage by the operator within working hours.
+
+ Operator override on a hotfix issue (accept-with-rationale) is logged on the issue itself + carried in the next release's `test-execution-summary.md` design record (devaudit#50). The framework does not permit silently shipping a failing test — every red regression spec ends as either fixed, reverted, or accepted-with-recorded-rationale.
+
+ See Test_Strategy.md § *System Testing (E2E)* — *E2E gating model* for the tier definitions + cost philosophy.
+
  ---
 
  ## Roles & Responsibilities
@@ -38,6 +38,24 @@ Validates interactions between system components — API contracts, service inte
 
  End-to-end validation of complete user workflows from UI to database. Primary responsibility of the QA team. Automated using BDD frameworks that map acceptance criteria to executable specifications. Covers 100% of critical user paths.
 
+ #### E2E gating model — three tiers (devaudit#152 follow-up, v0.1.53)
+
+ Full E2E regression on every PR is expensive — a 30+ minute wait per release-PR blocks velocity for diminishing marginal safety once smoke covers the headline flows. The framework's gating model maps the existing MoSCoW prioritisation onto three tiers, each gated at a different point in the workflow:
+
+ | Tier | Location | When it runs | Wall-clock target | Audit role |
+ |---|---|---|---|---|
+ | **smoke** | `e2e/smoke/*.spec.ts` | every push to `$INTEGRATION_BRANCH` (via `ci.yml`) | ~3–5 min | fast feedback on every change |
+ | **critical** | `e2e/smoke/` + `e2e/critical/*.spec.ts` | PR-to-`$RELEASE_BRANCH` (via `e2e-regression.yml`) | ~10–15 min | release-readiness Must gate |
+ | **regression** | all `e2e/**/*.spec.ts` | nightly + push-to-`$RELEASE_BRANCH` + `workflow_dispatch` | ~35 min (or your project's full pack) | full audit trail + drift catch |
+
+ The mapping to MoSCoW: **Must-priority SRS items live in `e2e/smoke/` (fast feedback) and `e2e/critical/` (release gate); Should/Could items live in `e2e/` and are covered by the regression tier.** The classifier is the developer authoring the spec — see `skills/e2e-test-engineer/SKILL.md` Phase 3 for the decision tree.
+
+ **Cost philosophy.** Smoke protects every push from breaking the headline flow. Critical protects every release from a Must-tier regression. Full regression protects the audit trail + catches drift overnight. We accept that a Should/Could-tier regression *can* slip past the PR gate; we catch it on the next post-merge run + auto-file a hotfix issue. The framework prefers this over a 35-min wait on every release because operator velocity matters and the safety net stays intact.
+
+ **Post-merge safety net.** Every push to `$RELEASE_BRANCH` re-runs the full regression. On failure, `e2e-regression.yml` auto-files a `bug, priority:high` issue tagging the merge commit + the failing specs. The operator triages within working hours — hotfix forward, revert the commit, or accept-with-rationale if the failure is environmental. No automated revert (false positives + flakes + UAT-data drift are real classes; an operator triages each individually).
+
+ **Reference workflow.** A copy-pasteable `e2e-regression.yml` shape lives at `skills/e2e-test-engineer/references/e2e-regression-3-tier.yml`. Adoption is opt-in per consumer (the framework doesn't currently sync this workflow; consumers own their own `e2e-regression.yml`).
+
  ### Acceptance Testing
 
  Validates that requirements and acceptance criteria are met from a business perspective. Conducted in staging environments mirroring production. Requires sign-off from Product Managers. May include formal UAT with stakeholders for regulated features.
@@ -131,6 +131,26 @@ Resist padding. A new endpoint doesn't need a test that re-verifies login if log
 
  For each scenario, write a one-line description. Present the full grouped list to the user before writing any code: *"Here's the coverage I'd propose — anything to add or drop?"*
 
+ #### Classify each spec into a tier (devaudit#152 follow-up, v0.1.53)
+
+ When designing each scenario, also pick the tier it'll live in. Three tiers map to MoSCoW priority + gating point (see `Test_Strategy.md` § *E2E gating model*):
+
+ | Tier | File location | Picks this when… |
+ |---|---|---|
+ | **smoke** | `e2e/smoke/*.spec.ts` | Cross-cutting sanity that proves the app is up: login, basic nav, one canonical CRUD per main domain. Runs on every push to the integration branch. Keep small — total smoke wall-clock target is ~3–5 min. |
+ | **critical** | `e2e/critical/*.spec.ts` | Must-priority SRS item that breaks a headline flow if it regresses. Examples: payment authorisation, order completion, admin permission editing, RBAC enforcement on financial surfaces. Runs on PR-to-release-branch. Total critical wall-clock target ~10–15 min (includes smoke). |
+ | **regression** | `e2e/<area>/*.spec.ts` | Should/Could-priority SRS item, edge cases, less-load-bearing flows. Runs nightly + post-merge + dispatch. Total full pack can be 30+ min; that's the point of the tier. |
+
+ Decision tree, applied per scenario:
+
+ 1. **Does the spec prove a Must-priority SRS AC (or a baseline "app is up" sanity check)?** → smoke or critical.
+ 2. **Within Must: would a regression here break a headline business flow visible to a paying customer or stop a release from shipping?** → critical. Otherwise → smoke.
+ 3. **Should/Could priority, edge case, advanced flow?** → regression (file under `e2e/<area>/`, not under `e2e/smoke/` or `e2e/critical/`).
+
+ When you can't decide between critical and regression, default to **regression** — promoting a spec from regression → critical later is cheap (move the file); demoting in the other direction is rarely needed but equally cheap. The cost of putting a Should spec in critical is everyone waiting longer on every PR-to-main for a low-value signal.
+
+ Record the tier choice in the eventual `test-execution-summary.md` § *Test design* (devaudit#50) — Layers covered should name which tier each new spec landed in. Reviewers verify the tier choice is defensible during the WAIT CHECKPOINT.
+
  ### Phase 4 — Reconcile with existing tests
 
  For the area touched by the change, look at what's already there.
@@ -0,0 +1,178 @@
+ # Reference: three-tier E2E gating workflow (devaudit#152 follow-up, v0.1.53)
+ #
+ # Copy this into your consumer-owned .github/workflows/e2e-regression.yml
+ # to adopt the 3-tier model: smoke (every develop push, fast) / critical
+ # (PR-to-main, ~10-15 min target) / regression (nightly + push-to-main +
+ # dispatch, full audit trail with auto-issue on failure).
+ #
+ # The framework does NOT sync this file automatically — your consumer
+ # owns its e2e-regression.yml. Apply the patterns below to your own
+ # file; keep any consumer-specific env / matrix / runner customisations.
+ #
+ # Tier definitions:
+ # - smoke — runs on develop push via ci.yml (no change here)
+ # - critical — Playwright project that selects e2e/smoke/ + e2e/critical/
+ # - regression — Playwright project that selects all e2e/**/*.spec.ts
+ #
+ # playwright.config.ts must define the `critical` project for this to
+ # fire; if it doesn't, the gate falls back to the existing `smoke`
+ # project so PR-to-main stays green during migration.
+
+ name: E2E Regression
+
+ on:
+   pull_request:
+     branches: [main] # critical-tier gate before merge
+   push:
+     branches: [main] # full regression after merge; auto-issues on failure
+   schedule:
+     - cron: '0 2 * * *' # nightly full regression
+   workflow_dispatch:
+     inputs:
+       specs:
+         description: 'Optional: space-separated spec paths or --grep pattern for a scoped run. Empty = full regression.'
+         required: false
+
+ permissions:
+   contents: read
+   issues: write # post-merge auto-issue on regression failure
+
+ concurrency:
+   group: e2e-regression-${{ github.ref }}
+   cancel-in-progress: ${{ github.event_name == 'pull_request' }}
+
+ jobs:
+   e2e:
+     name: E2E Regression Tests
+     runs-on: ubuntu-latest # adapt to your runner; e.g. self-hosted, ubuntu-24.04
+     steps:
+       - uses: actions/checkout@v4
+         with:
+           fetch-depth: 0 # for E2E_NEW_SPECS computation
+
+       - uses: actions/setup-node@v4
+         with:
+           node-version: '22' # match your project
+           cache: 'npm'
+
+       - name: Install dependencies
+         run: npm ci --legacy-peer-deps
+
+       - name: Install Playwright browsers
+         run: npx playwright install --with-deps chromium
+
+       # Decide which Playwright project to run based on the trigger.
+       # PR-to-main uses critical with smoke fall-back; push-to-main and
+       # schedule run the full regression project; workflow_dispatch
+       # accepts an optional spec filter.
+       - name: Determine E2E project + spec selector
+         id: select
+         run: |
+           set -euo pipefail
+           EVENT="${{ github.event_name }}"
+           case "$EVENT" in
+             pull_request)
+               if grep -qE "name:\s*['\"]critical['\"]" playwright.config.ts 2>/dev/null; then
+                 echo "project=critical" >> "$GITHUB_OUTPUT"
+                 echo "Using critical-tier project (smoke + e2e/critical/)"
+               else
+                 echo "project=smoke" >> "$GITHUB_OUTPUT"
+                 echo "::warning::No 'critical' Playwright project defined; falling back to smoke. See e2e-test-engineer/references/e2e-regression-3-tier.yml + the Phase 3 tier-classification guide."
+               fi
+               echo "specs=" >> "$GITHUB_OUTPUT"
+               ;;
+             push|schedule)
+               echo "project=regression" >> "$GITHUB_OUTPUT"
+               echo "specs=" >> "$GITHUB_OUTPUT"
+               echo "Running full regression project"
+               ;;
+             workflow_dispatch)
+               echo "project=regression" >> "$GITHUB_OUTPUT"
+               echo "specs=${{ github.event.inputs.specs }}" >> "$GITHUB_OUTPUT"
+               if [ -n "${{ github.event.inputs.specs }}" ]; then
+                 echo "Scoped dispatch: ${{ github.event.inputs.specs }}"
+               fi
+               ;;
+           esac
+
+       - name: Run E2E suite
+         id: run
+         env:
+           PLAYWRIGHT_HTML_REPORTER_OPEN: never
+           PLAYWRIGHT_JSON_OUTPUT_NAME: e2e-regression-results.json
+           # Add your e2e_env values here as needed (DEVAUDIT_BASE_URL etc.)
+         run: |
+           set -euo pipefail
+           PROJECT="${{ steps.select.outputs.project }}"
+           SPECS="${{ steps.select.outputs.specs }}"
+           if [ -n "$SPECS" ]; then
+             npx playwright test --project="$PROJECT" --reporter=json,html $SPECS
+           else
+             npx playwright test --project="$PROJECT" --reporter=json,html
+           fi
+
+       - uses: actions/upload-artifact@v4
+         if: always()
+         with:
+           name: e2e-regression-report
+           path: |
+             e2e-regression-results.json
+             playwright-report/
+             test-results/
+
+       # ─────────────────────────────────────────────────────────────
+       # Post-merge auto-issue on regression failure (push:branches:[main])
+       #
+       # Catches regressions that slipped past the critical-tier PR gate.
+       # Opens a high-priority issue tagging the merge commit + the
+       # failing specs so the operator can triage within working hours.
+       # No auto-revert — that's intentionally an operator decision.
+       # ─────────────────────────────────────────────────────────────
+       - name: Open hotfix issue on post-merge regression
+         if: failure() && github.event_name == 'push' && github.ref == 'refs/heads/main'
+         env:
+           GH_TOKEN: ${{ github.token }}
+         run: |
+           set -euo pipefail
+           MERGE_SHA="${{ github.sha }}"
+           MERGE_SHA_SHORT=$(echo "$MERGE_SHA" | cut -c1-7)
+           RUN_URL="${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
+
+           # Extract failing spec names from the JSON reporter output if available.
+           FAILING=""
+           if [ -f e2e-regression-results.json ]; then
+             FAILING=$(jq -r '
+               [.. | objects | select(.status == "failed" or .status == "timedOut") | .title // empty]
+               | unique | .[]
+             ' e2e-regression-results.json 2>/dev/null | head -20 || true)
+           fi
+           if [ -z "$FAILING" ]; then
+             FAILING="(see the failing run logs — could not parse spec titles from reporter output)"
+           fi
+
+           BODY=$(cat <<EOF
+           ## Post-merge regression caught on \`main\`
+
+           The full regression suite failed on the post-merge run for commit \`${MERGE_SHA_SHORT}\`. The critical-tier PR gate let this slip through.
+
+           **Failing specs (best-effort extracted from the JSON reporter):**
+
+           \`\`\`
+           ${FAILING}
+           \`\`\`
+
+           **Triage actions:**
+
+           - [ ] Read the run log: ${RUN_URL}
+           - [ ] Pull \`e2e-regression-report\` artifact from the run; inspect \`test-results/<spec>/error-context.md\` for page state at failure
+           - [ ] Decide: hotfix on \`main\`, revert \`${MERGE_SHA_SHORT}\`, or accept-with-rationale if the failure is environmental
+           - [ ] If the failing spec is a Must-tier candidate that should have caught this pre-merge, move it from \`e2e/\` to \`e2e/critical/\` so the next PR-to-main runs it
+
+           **Auto-filed by:** \`e2e-regression.yml\` (devaudit#152 3-tier gating, v0.1.53+)
+           EOF
+           )
+
+           gh issue create \
+             --title "[hotfix] Post-merge regression on \`${MERGE_SHA_SHORT}\` — full E2E failed" \
+             --body "$BODY" \
+             --label "bug,priority:high"
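
Reviewer note (not part of the package diff): the tier rules this release introduces can be summarised in a few lines of TypeScript. The sketch below is only an illustration under stated assumptions. The names `Tier`, `Scenario`, `classifyTier`, `selectProject`, and `tierGlobs` are hypothetical and do not appear in the package; the logic paraphrases the Phase 3 decision tree and the `case "$EVENT"` block of the reference workflow as they read in this diff.

```typescript
// Hypothetical helper types; none of these names exist in the package itself.
type Tier = "smoke" | "critical" | "regression";
type Priority = "Must" | "Should" | "Could";
type TriggerEvent = "pull_request" | "push" | "schedule" | "workflow_dispatch";

interface Scenario {
  priority: Priority;
  // Baseline "app is up" sanity check (login, basic nav, one canonical CRUD).
  isBaselineSanity: boolean;
  // A regression would break a headline business flow or stop a release.
  breaksHeadlineFlow: boolean;
}

// Paraphrase of the decision tree: anything that is neither Must-priority
// nor a baseline sanity check goes to regression; within the rest,
// headline flows go to critical and everything else to smoke.
function classifyTier(s: Scenario): Tier {
  if (s.priority !== "Must" && !s.isBaselineSanity) return "regression";
  return s.breaksHeadlineFlow ? "critical" : "smoke";
}

// Paraphrase of the workflow's trigger handling: PRs run critical (or smoke
// when no `critical` Playwright project is defined); every other trigger
// runs the full regression project.
function selectProject(event: TriggerEvent, hasCriticalProject: boolean): Tier {
  if (event === "pull_request") return hasCriticalProject ? "critical" : "smoke";
  return "regression";
}

// Spec globs per tier, copied from the tier table in Test_Strategy.md.
const tierGlobs: Record<Tier, string[]> = {
  smoke: ["e2e/smoke/*.spec.ts"],
  critical: ["e2e/smoke/*.spec.ts", "e2e/critical/*.spec.ts"],
  regression: ["e2e/**/*.spec.ts"],
};
```

For instance, a Must-priority payment-authorisation scenario that breaks a headline flow classifies as `critical`, while a Could-priority edge case classifies as `regression` and is only exercised by the nightly, post-merge, and dispatch runs.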