@brunosps00/dev-workflow 0.9.0 → 0.10.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +18 -19
- package/lib/constants.js +2 -10
- package/lib/migrate-gsd.js +1 -1
- package/package.json +1 -1
- package/scaffold/en/commands/dw-autopilot.md +6 -6
- package/scaffold/en/commands/dw-brainstorm.md +1 -1
- package/scaffold/en/commands/dw-bugfix.md +1 -0
- package/scaffold/en/commands/dw-code-review.md +1 -0
- package/scaffold/en/commands/dw-commit.md +6 -0
- package/scaffold/en/commands/dw-create-techspec.md +2 -0
- package/scaffold/en/commands/dw-deep-research.md +6 -0
- package/scaffold/en/commands/dw-deps-audit.md +1 -0
- package/scaffold/en/commands/dw-find-skills.md +4 -4
- package/scaffold/en/commands/dw-fix-qa.md +1 -0
- package/scaffold/en/commands/dw-generate-pr.md +1 -0
- package/scaffold/en/commands/dw-help.md +9 -29
- package/scaffold/en/commands/dw-intel.md +1 -1
- package/scaffold/en/commands/dw-refactoring-analysis.md +2 -1
- package/scaffold/en/commands/dw-review-implementation.md +28 -2
- package/scaffold/en/commands/dw-run-plan.md +2 -2
- package/scaffold/en/templates/idea-onepager.md +1 -1
- package/scaffold/pt-br/commands/dw-autopilot.md +6 -6
- package/scaffold/pt-br/commands/dw-brainstorm.md +1 -1
- package/scaffold/pt-br/commands/dw-bugfix.md +1 -0
- package/scaffold/pt-br/commands/dw-code-review.md +1 -0
- package/scaffold/pt-br/commands/dw-commit.md +6 -0
- package/scaffold/pt-br/commands/dw-create-techspec.md +2 -0
- package/scaffold/pt-br/commands/dw-deep-research.md +6 -0
- package/scaffold/pt-br/commands/dw-deps-audit.md +1 -0
- package/scaffold/pt-br/commands/dw-find-skills.md +4 -4
- package/scaffold/pt-br/commands/dw-fix-qa.md +1 -0
- package/scaffold/pt-br/commands/dw-generate-pr.md +1 -0
- package/scaffold/pt-br/commands/dw-help.md +9 -29
- package/scaffold/pt-br/commands/dw-intel.md +1 -1
- package/scaffold/pt-br/commands/dw-refactoring-analysis.md +2 -1
- package/scaffold/pt-br/commands/dw-review-implementation.md +21 -2
- package/scaffold/pt-br/commands/dw-run-plan.md +2 -2
- package/scaffold/pt-br/templates/idea-onepager.md +1 -1
- package/scaffold/skills/dw-codebase-intel/SKILL.md +1 -0
- package/scaffold/skills/dw-codebase-intel/references/api-design-discipline.md +138 -0
- package/scaffold/skills/dw-debug-protocol/SKILL.md +106 -0
- package/scaffold/skills/dw-debug-protocol/references/error-categorization.md +127 -0
- package/scaffold/skills/dw-debug-protocol/references/non-reproducible-strategy.md +108 -0
- package/scaffold/skills/dw-debug-protocol/references/six-step-triage.md +139 -0
- package/scaffold/skills/dw-debug-protocol/references/stop-the-line.md +52 -0
- package/scaffold/skills/dw-git-discipline/SKILL.md +120 -0
- package/scaffold/skills/dw-git-discipline/references/atomic-commits-discipline.md +158 -0
- package/scaffold/skills/dw-git-discipline/references/branch-hygiene.md +150 -0
- package/scaffold/skills/dw-git-discipline/references/trunk-based-pattern.md +82 -0
- package/scaffold/skills/dw-memory/SKILL.md +1 -2
- package/scaffold/skills/dw-simplification/SKILL.md +142 -0
- package/scaffold/skills/dw-simplification/references/behavior-preserving.md +148 -0
- package/scaffold/skills/dw-simplification/references/chestertons-fence.md +152 -0
- package/scaffold/skills/dw-simplification/references/complexity-metrics.md +147 -0
- package/scaffold/skills/dw-source-grounding/SKILL.md +128 -0
- package/scaffold/skills/dw-source-grounding/references/citation-protocol.md +108 -0
- package/scaffold/skills/dw-source-grounding/references/freshness-check.md +108 -0
- package/scaffold/skills/dw-source-grounding/references/source-priority.md +146 -0
- package/scaffold/skills/dw-verify/SKILL.md +0 -1
- package/scaffold/skills/vercel-react-best-practices/SKILL.md +4 -0
- package/scaffold/skills/vercel-react-best-practices/references/perf-discipline.md +122 -0
- package/scaffold/skills/webapp-testing/SKILL.md +5 -0
- package/scaffold/skills/webapp-testing/references/security-boundary.md +115 -0
- package/scaffold/skills/webapp-testing/references/three-workflow-patterns.md +144 -0
- package/scaffold/en/commands/dw-execute-phase.md +0 -149
- package/scaffold/en/commands/dw-plan-checker.md +0 -144
- package/scaffold/en/commands/dw-quick.md +0 -103
- package/scaffold/en/commands/dw-resume.md +0 -84
- package/scaffold/pt-br/commands/dw-execute-phase.md +0 -149
- package/scaffold/pt-br/commands/dw-plan-checker.md +0 -144
- package/scaffold/pt-br/commands/dw-quick.md +0 -103
- package/scaffold/pt-br/commands/dw-resume.md +0 -84
package/scaffold/skills/dw-source-grounding/references/freshness-check.md
@@ -0,0 +1,108 @@
+# Freshness check — keeping citations valid over time
+
+A citation goes stale in two ways: the URL stops resolving (404, redirect, paywall added) OR the project's installed version moves past the version the citation pinned. Both invalidate the citation's authority. This file describes how to detect both and what to do.
+
+## Two staleness modes
+
+### Mode 1 — URL drift
+
+The URL still loads, but the content has changed (doc was rewritten, section deleted, deprecated). Or the URL 404s outright.
+
+Detection: re-fetch the URL on demand. Compare the section/heading the original citation pointed to.
+
+Action when detected:
+- If the new content still supports the original claim → update the citation's `retrieved` date.
+- If the new content contradicts or removes the claim → the citation is invalid. Find a replacement source OR revisit the decision.
+
+### Mode 2 — Version drift
+
+The URL is fine, but the project bumped from React 18.3 to React 19. The citation `[..., version: 18.3, ...]` no longer pins to the version installed.
+
+Detection: compare the cited version with the manifest's current version (`package.json` etc.). Mismatch → drift.
+
+Action:
+- If the doc has version-aware content (most modern docs do), find the equivalent for the new version.
+- If the API was renamed/removed in the new version, the underlying decision needs re-evaluation, not just a citation patch.
+
+## When to check freshness
+
+| Trigger | Check what |
+|---------|-----------|
+| Acting on an artifact older than 90 days | Both URL drift and version drift |
+| About to ship code based on a citation | URL drift (single fetch) |
+| User explicitly asks "is this still current?" | Both |
+| Routine `dw-deps-audit` | Version drift for every cited dep |
+
+Don't check on every read — that turns documentation into a trip hazard. Check at decision points: before committing, before merging, before promoting to production.
+
+## How to check programmatically
+
+For URL drift:
+
+```bash
+# Quick HEAD check — does the URL still resolve?
+curl -sI "<url>" | head -1
+# 200 OK → fine; 301/302 → follow; 404 → broken; 403 → paywall added
+
+# Content drift check — fetch and grep for the original heading
+curl -s "<url>" | grep -i "<expected-heading-or-section>"
+```
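The status-code table in the comment above can be folded into a tiny helper. A minimal sketch; `classify_status` is an invented name, not part of this skill:

```shell
# Hypothetical helper: map the HTTP status from the HEAD check to the
# drift verdict described above.
classify_status() {
  case "$1" in
    200) echo "ok" ;;
    301|302|307|308) echo "follow-redirect" ;;
    404|410) echo "broken" ;;
    401|403) echo "paywalled-or-restricted" ;;
    *) echo "inspect-manually" ;;
  esac
}

# In a live check the code would come from curl, e.g.:
#   classify_status "$(curl -s -o /dev/null -w '%{http_code}' -I "<url>")"
classify_status 200   # → ok
classify_status 404   # → broken
```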
+
+In an agent context: use `WebFetch` on the URL and confirm the cited section still exists. If a heading the citation referenced is gone, mark the citation as drift.
+
+For version drift:
+
+```bash
+# What does the manifest say now?
+node -p "require('./package.json').dependencies.react"
+# Compare to the cited version in the artifact
+```
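The comparison itself can be sketched as a pure helper; `major_drift` is an invented name, and treating a major-version mismatch as drift is an assumption this example makes:

```shell
# Hypothetical helper: compare a citation's pinned version against the
# manifest's current version; major-version mismatch counts as drift.
major_drift() {
  if [ "${1%%.*}" = "${2%%.*}" ]; then
    echo "ok"
  else
    echo "drift: cited $1, manifest has $2"
  fi
}

# Artifact pinned React 18.3; package.json now resolves to 19.0.0:
major_drift "18.3" "19.0.0"   # → drift: cited 18.3, manifest has 19.0.0
```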
+
+In an agent context: read the manifest, parse the dep version, compare to the citation's `version` field.
+
+## Updating stale citations
+
+When a citation drifts but the underlying decision still holds, update in place:
+
+```
+Before:
+[source: https://react.dev/reference/react/useEffect, version: 18.3, retrieved: 2025-09-01]
+
+After URL drift detected (page restructured):
+[source: https://react.dev/reference/react/useEffect, version: 18.3, retrieved: 2026-05-07,
+ superseded-by: https://react.dev/learn/synchronizing-with-effects, retrieved: 2026-05-07]
+
+After version drift (project moved to React 19):
+[source: https://react.dev/reference/react/useEffect, version: 19.0, retrieved: 2026-05-07,
+ previous: 18.3]
+```
+
+The artifact records the history. Reviewers can see: "this decision was first sourced for v18.3 in Sep 2025; re-verified for v19.0 in May 2026."
+
+## When the underlying decision dies
+
+Sometimes drift means the API the decision used no longer exists. Example: a decision in 2023 to use `React.useTransition` with the `isPending` first tuple element. In React 18.3 that's the API; in React 19 the API shape changed.
+
+In this case:
+
+1. Don't silently update the citation. The decision IS now invalid.
+2. Open an ADR or comment in the techspec: "decision X relied on API Y; API Y was changed in v<new>; need to revisit."
+3. Loop the user (or the next iteration of `dw-create-techspec`/`dw-deps-audit`) into the decision, with the new constraint.
+
+Quietly patching a stale citation when the underlying API is gone is a subtle category of bug. Surface it.
+
+## Bibliography rotation in long-lived artifacts
+
+For artifacts that live more than 6 months (long-running PRDs, ADRs, design docs), consider a "Sources last verified: YYYY-MM-DD" header at the top:
+
+```markdown
+---
+type: techspec
+schema_version: "1.0"
+sources_last_verified: 2026-05-07
+---
+```
+
+When an agent re-reads the artifact and the verification date is >90 days old, prompt: "Sources last verified <date>; re-run freshness check?"
+
+Cheap operational discipline; prevents silent decay.
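The 90-day prompt described above can be computed mechanically. A sketch assuming GNU `date` (the `-d` flag is not portable to BSD/macOS); `is_stale` and both date arguments are invented for illustration:

```shell
# Hypothetical staleness gate: flag sources_last_verified dates older
# than 90 days relative to "today".
is_stale() {
  last_s=$(date -u -d "$1" +%s)
  today_s=$(date -u -d "$2" +%s)
  age_days=$(( (today_s - last_s) / 86400 ))
  if [ "$age_days" -gt 90 ]; then
    echo "stale ($age_days days); re-run freshness check"
  else
    echo "fresh ($age_days days)"
  fi
}

is_stale 2026-05-07 2026-09-01   # → stale (117 days); re-run freshness check
```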
package/scaffold/skills/dw-source-grounding/references/source-priority.md
@@ -0,0 +1,146 @@
+# Source priority — what counts as authoritative
+
+Not all sources are equal. The whole point of grounded development is using authoritative sources, not the loudest ones. This file is the hierarchy.
+
+## Tier 1 — Authoritative
+
+These are the only valid PRIMARY sources. Cite these for any claim about how a library, framework, or standard works.
+
+### 1.1 Official versioned documentation
+
+The exact version's published docs:
+
+- React: `react.dev/reference/react?version=<X>`
+- Next.js: `nextjs.org/docs` (versioned via release page)
+- Python: `docs.python.org/<X.Y>/`
+- Node: `nodejs.org/api/` (version-pinned via dropdown / URL `/dist/v<X.Y.Z>/`)
+- ASP.NET Core: `learn.microsoft.com/aspnet/core/` (filter by version)
+- Rust: `doc.rust-lang.org/<X.Y>/` and `docs.rs/<crate>/<version>/`
+- Postgres: `postgresql.org/docs/<major>/`
+- AWS: `docs.aws.amazon.com/<service>/` (versioned per SDK / API)
+
+When the URL doesn't pin version, add a query (`?v=`, `?version=`) or use the docs section that explicitly states the version. If neither exists, note in the citation: `version: latest-as-of-retrieved`.
+
+### 1.2 Official changelogs and migration guides
+
+For decisions involving version transitions (upgrade from v17 to v18, etc.):
+
+- React's `react.dev/blog` for major releases
+- Next.js's `nextjs.org/docs/app/building-your-application/upgrading`
+- Maintainer-published `CHANGELOG.md` in the repo root
+
+### 1.3 Web standards & RFCs
+
+For cross-implementation behavior:
+
+- W3C specs (e.g., CSS specs)
+- WHATWG specs (e.g., HTML, Fetch, URL)
+- IETF RFCs (e.g., RFC 7807 Problem Details, RFC 9110 HTTP Semantics)
+- ECMA-262 (JavaScript spec)
+- ISO standards when relevant (e.g., ISO 8601 for dates)
+
+Cite when the question is "is this behavior portable?" or "what's the standard?"
+
+### 1.4 Compatibility tables
+
+For "does X work in browser/runtime Y?":
+
+- caniuse.com for web platform features
+- MDN's Browser Compat Data (BCD) for Web APIs
+- Compatibility tables published in maintainer docs (Postgres has them, Node has them)
+
+## Tier 2 — Acceptable as supplement, NOT primary
+
+Cite Tier 2 ONLY in addition to a Tier 1 source — never as the sole basis for a decision.
+
+### 2.1 Maintainer blog posts
+
+Examples:
+
+- `vercel.com/blog/...` — for Vercel-published deep-dives on Next.js patterns
+- `devblogs.microsoft.com/...` — for ASP.NET / .NET deep-dives
+- `engineering.<company>.com` — when the company maintains the project
+
+These are first-person accounts from the people who built the thing. Often clearer than docs. But docs are the contract; blogs are commentary.
+
+### 2.2 Conference talks (recorded)
+
+Examples:
+
+- React Conf, Next.js Conf, PyCon, RustConf
+- The talk's slides + speaker handle
+
+When citing, name the talk + year + speaker; link the video.
+
+### 2.3 GitHub issues / PRs from the maintainer
+
+Useful for understanding the WHY behind a doc statement. Cite the issue/PR number explicitly:
+
+```
+[source: https://github.com/vercel/next.js/issues/12345, retrieved: 2026-05-07]
+(maintainer thread on the rationale for App Router's caching behavior)
+```
+
+## Tier 3 — Discovery only, NEVER cite as primary
+
+These help you find the right Tier 1 doc. They do NOT support a decision on their own.
+
+### 3.1 Stack Overflow
+
+Frequently outdated, sometimes wrong, often not version-aware. Use to discover an answer's existence — then verify against Tier 1.
+
+If a Stack Overflow answer points to docs, fetch the docs and cite those instead.
+
+### 3.2 Tutorial blogs (non-maintainer)
+
+Most blog posts are static; the framework moved on. The author may not even remember writing it. Don't cite.
+
+### 3.3 LLM training data
+
+Yours included. Treated as Tier 3 for two reasons: it's stale (months to years old), and you can't link to it.
+
+### 3.4 README screenshots from random repos
+
+Someone's `examples/foo` directory isn't authoritative. The framework's official docs are.
+
+## When sources conflict
+
+Common scenario: the official docs say one thing, a maintainer blog says another, an issue thread says yet a third.
+
+Resolution:
+
+1. **Newer doc wins** if the version is the same. Docs get corrected; old blog posts don't.
+2. **Maintainer commitment wins** if it's tracked. An issue closed with `wontfix` or a PR merged with `feat:` is a binding signal.
+3. **For grey areas, surface the conflict**. Cite all three and tell the user the authoritative resolution is unclear; ask whether to consult the maintainer directly.
+
+## Examples in practice
+
+### Good — multi-source decision
+
+> Decision: use React 19 `useActionState` for the form submission flow.
+>
+> Rationale: idiomatic since React 19; replaces ad-hoc `useFormState`.
+>
+> Sources:
+> - `[source: https://react.dev/reference/react/useActionState, version: 19.0, retrieved: 2026-05-07]` — Tier 1, official API doc
+> - `[source: https://react.dev/blog/2024/12/05/react-19, retrieved: 2026-05-07]` — Tier 2, maintainer blog explaining the migration
+
+### Bad — Stack Overflow as primary
+
+> Decision: use `useEffect` cleanup with `AbortController` to cancel fetches.
+>
+> Source: a Stack Overflow answer with 1.2k upvotes.
+
+The decision is correct, but the citation isn't authoritative. The fix:
+
+> Source: `[source: https://react.dev/reference/react/useEffect#fetching-data-with-effects, version: 18.3, retrieved: 2026-05-07]` — same conclusion, authoritative origin.
+
+### Bad — version-less citation
+
+> Decision: use `Promise.withResolvers()`.
+>
+> Source: `[source: developer.mozilla.org]`
+
+`Promise.withResolvers()` is a TC39 Stage 4 feature available only in Node 22+ and recent browsers. The version matters — the cite is incomplete:
+
+> Source: `[source: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers, retrieved: 2026-05-07, runtime-support: Node 22+, Chrome 119+]`
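The version-less anti-pattern above is mechanically detectable. A sketch, assuming citations follow the `[source: ..., version: ..., retrieved: ...]` format used in this skill; `flag_versionless` is an invented name:

```shell
# Hypothetical lint: list [source: ...] citations that carry neither a
# version nor a runtime-support field.
flag_versionless() {
  grep -n '\[source:' "$1" | grep -v -e 'version:' -e 'runtime-support:'
}

f=$(mktemp)
cat > "$f" <<'EOF'
[source: https://react.dev/reference/react/useActionState, version: 19.0, retrieved: 2026-05-07]
[source: developer.mozilla.org]
EOF
flag_versionless "$f"   # → 2:[source: developer.mozilla.org]
```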
package/scaffold/skills/dw-verify/SKILL.md
@@ -177,7 +177,6 @@ This skill is invoked transparently from:
 - `/dw-bugfix` — before claiming the bug is fixed (original symptom no longer reproduces)
 - `/dw-code-review` — before emitting an APPROVED verdict
 - `/dw-generate-pr` — blocks PR creation if the session has no passing VERIFICATION REPORT post-last-edit
-- `/dw-quick` — before committing the one-off change
 
 Callers should mention this skill in their "Skills Complementares" section so the user sees the dependency.
 
package/scaffold/skills/vercel-react-best-practices/SKILL.md
@@ -144,3 +144,7 @@ Each rule file contains:
 ## Full Compiled Document
 
 For the complete guide with all rules expanded: `AGENTS.md`
+
+## References
+
+- `references/perf-discipline.md` — workflow discipline (measure → identify → fix → verify → guard) that wraps the per-rule recipes above. Use when tackling performance work; cite the metric and tool before applying any rule. Adapted from [`addyosmani/agent-skills/performance-optimization`](https://github.com/addyosmani/agent-skills/tree/main/performance-optimization) (MIT).
package/scaffold/skills/vercel-react-best-practices/references/perf-discipline.md
@@ -0,0 +1,122 @@
+# Performance discipline — measure, identify, fix, verify, guard
+
+> Adapted from [`addyosmani/agent-skills/performance-optimization`](https://github.com/addyosmani/agent-skills/tree/main/performance-optimization) (MIT). The rules below complement the per-rule recipes in `rules/` with a workflow discipline.
+
+The biggest performance mistake is fixing the wrong thing. The second biggest is "fixing" without measuring. This file establishes the workflow that prevents both.
+
+## The five-step loop
+
+### 1. Measure
+
+Don't optimize what you haven't measured.
+
+**Frontend:**
+- Lighthouse / PageSpeed Insights → composite score + breakdown.
+- DevTools Performance tab → flame graph, layout/paint timing.
+- `web-vitals` library → LCP, FID/INP, CLS, TTFB on real users.
+- Bundle analyzer (`@next/bundle-analyzer`, `webpack-bundle-analyzer`) → see what's actually shipping.
+
+**Backend:**
+- Application logs with timing (`time-X-took: Yms`).
+- DB query analyzer (`EXPLAIN ANALYZE`, slow query log).
+- APM (Datadog, New Relic, Sentry Performance) for distributed traces.
+- `top` / `htop` / process memory + CPU during load.
+
+**Output:** a baseline number with a unit. "It's slow" is not a baseline. "P95 LCP is 4.8s" is a baseline.
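Turning raw timing logs into that baseline number can be sketched with standard tools (nearest-rank P95; the log format and the helper name are assumptions):

```shell
# Hypothetical helper: nearest-rank P95 over one timing value per line.
p95() {
  sort -n | awk '{ a[NR] = $1 } END { print a[int(NR * 0.95 + 0.999)] "ms" }'
}

# Values might be extracted from logs with something like:
#   grep -o '[0-9]*ms' app.log | tr -d 'ms'
printf '%s\n' 120 80 95 400 110 90 105 130 85 100 | p95   # → 400ms
```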
+
+### 2. Identify
+
+Find where the time goes. The flame graph or trace shows it; don't guess.
+
+Common culprits:
+
+| Symptom | Likely cause |
+|---------|--------------|
+| Long initial paint | Large bundle, render-blocking resources |
+| Slow time-to-interactive | Heavy JS execution, hydration cost |
+| Layout shift | Missing dimensions, late-loaded fonts |
+| Slow API response | N+1 query, missing index, expensive computation, external call latency |
+| Memory creep | Listener leak, retained closure, unbounded cache |
+
+The cause must come from data, not pattern-matching. The same symptom can have different causes in different apps.
+
+### 3. Fix
+
+Apply the smallest fix that addresses the identified cause:
+
+- N+1 query → batch with `IN (...)`, single JOIN, or DataLoader.
+- Heavy bundle → code splitting, lazy load, dynamic imports.
+- Re-render storm → `useMemo`, `useCallback`, `memo`, signal-based state.
+- Slow API → cache, precompute, parallelize, move work to background.
+- Layout shift → reserve space (width+height attrs, CSS aspect-ratio).
+
+The `rules/` directory in this skill has tactical recipes for each. Use them when the diagnosis points there.
+
+**Don't apply optimizations preemptively.** `useMemo` everywhere = noise + cost. `memo` everywhere = stale prop bugs.
+
+### 4. Verify
+
+Re-measure with the same instrument from step 1.
+
+- Same scenario, same env (or as close as possible).
+- Multiple runs (perf is noisy; one run is not evidence).
+- Compare against baseline.
+
+If the number didn't change meaningfully (e.g., <10% improvement is below noise floor for most metrics): you fixed the wrong thing. Revert and go back to step 2.
+
+If the number improved but the user-perceived experience didn't: you optimized a metric, not a bottleneck. Rethink what to measure.
+
+### 5. Guard
+
+Prevent regression:
+
+- **Performance budgets in CI:** Lighthouse CI, bundle-size limits per route, P95 latency checks.
+- **Regression test for the specific scenario** that was slow.
+- **Monitoring in production** so future regressions surface from real users (not just CI runs that may not match prod load).
+- **Document the constraint** in code comments at the boundary that must stay fast (e.g., "this loop processes the entire user list; keep it O(n)").
+
+Without guards, every refactor risks reintroducing the bottleneck. The fix decays.
+
+## Frontend-specific patterns
+
+The `rules/` directory provides recipes for: bundle size (barrel imports, dynamic imports, defer third-party), client-side perf (passive listeners, swr dedup, localStorage schema), async patterns (parallel fetches, suspense boundaries, defer awaits), and JS micro-perf. Apply when measurement points there.
+
+**Hierarchy of impact (typical):**
+1. Bundle size — biggest impact for cold loads.
+2. Hydration cost — biggest impact for time-to-interactive.
+3. Network waterfalls — biggest impact for data-heavy pages.
+4. Re-render volume — biggest impact for interaction-heavy pages.
+5. JS micro-perf — usually irrelevant unless in a hot loop.
+
+Optimize in this order; don't jump to (5) before (1).
+
+## Backend-specific patterns
+
+| Pattern | When |
+|---------|------|
+| Add database index | Query plan shows full table scan |
+| Batch N queries into 1 | N+1 detected in trace |
+| Cache (Redis, in-memory, edge) | Same expensive computation repeats |
+| Precompute / materialize | Aggregation that runs per-request but updates rarely |
+| Background job | Work doesn't need to block the response |
+| Parallelize independent calls | Trace shows sequential awaits with no dependency |
+| Move to faster runtime / region | Network or CPU is the bottleneck after other fixes |
+
+## When NOT to optimize
+
+- The number isn't actually a problem. "P95 200ms" doesn't need optimization unless your SLA is tighter.
+- The optimization makes the code substantially harder to maintain. A 5% gain isn't worth a 50% complexity increase.
+- The optimized code can't be tested. If perf code can't be regression-tested, the next change will undo it silently.
+- You're optimizing dev-mode performance, not prod. Many tools (React, Next, Vite) have very different hot paths in dev vs prod.
+
+## Anti-patterns
+
+- "Looks slow, let me memo this" — without measurement, this just adds complexity.
+- "Add caching to fix the slow query" — caching hides the bug; the slow query reappears for the next user.
+- Profiling once, optimizing five things, never re-measuring.
+- Setting `useMemo` deps wrong — silently breaks correctness for marginal perf gain.
+- Treating Lighthouse score as the only metric — score can improve without UX improving.
+
+## Integration with dev-workflow
+
+Use with `dw-refactoring-analysis` when flagging perf-related smells: cite the metric, the measurement tool, and the suggested rule from the `rules/` directory. Without those three, a perf "smell" is a guess.
package/scaffold/skills/webapp-testing/SKILL.md
@@ -131,3 +131,8 @@ try {
 ## Helper Functions
 
 Some helper functions are available in [`test-helper.js`](./assets/test-helper.js) to simplify common tasks like waiting for elements, capturing screenshots, and handling errors. You can import and use these functions in your tests to improve readability and maintainability.
+
+## References
+
+- `references/security-boundary.md` — every byte from a browser is potentially attacker-controlled. Test that server-side authorization, validation, and CSRF protection hold even when the UI is bypassed via direct API calls or DevTools manipulation. Adapted from [`addyosmani/agent-skills/browser-devtools`](https://github.com/addyosmani/agent-skills/tree/main/browser-devtools) (MIT).
+- `references/three-workflow-patterns.md` — UI bugs vs network issues vs performance investigations are three distinct testing workflows with different signals and failure modes. Pick the right workflow for the verification you actually need; don't conflate them in a single mega-test. Adapted from the same upstream skill.
package/scaffold/skills/webapp-testing/references/security-boundary.md
@@ -0,0 +1,115 @@
+# Security boundary — every browser is hostile
+
+> Adapted from [`addyosmani/agent-skills/browser-devtools`](https://github.com/addyosmani/agent-skills/tree/main/browser-devtools) (MIT). Adopts the security-boundary principle to inform how webapp-testing scenarios should validate trust boundaries.
+
+The browser is not a secure environment. Anything that runs there — JS, CSS, HTML, devtools — is under the user's control. When you write a webapp test, you're not just verifying functionality; you're often verifying that this assumption holds.
+
+## The core principle
+
+> Every byte sent from a browser is potentially attacker-controlled, regardless of what the UI presents.
+
+The UI is a convenience for the user. The server cannot trust:
+
+- Hidden form fields (the user can edit them in DevTools).
+- Disabled buttons (the user can re-enable them).
+- Client-side validation (the user can bypass it).
+- Cookie values (the user can modify them).
+- HTTP request bodies (the user can craft any payload).
+- Headers (mostly user-controlled; a few are browser-set).
+
+Only server-side checks count for security. Client-side checks are UX, not security.
+
+## Implications for webapp testing
+
+When designing test scenarios for a webapp, validate:
+
+### 1. Server-side authorization on every action
+
+Test that a user CANNOT perform an action they shouldn't, even if they manipulate the UI to send the request:
+
+```
+- Log in as user A.
+- Attempt to access another user's resource by directly hitting the endpoint
+  (skip the UI navigation; craft the request).
+- Expected: 403 Forbidden, no data leakage in error response.
+```
+
+If the test passes only when going through the UI, the test is incomplete. Real attackers don't go through the UI.
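The scenario above can be scripted without any UI driver at all. A sketch in which the endpoint, token variable, and helper name are all invented:

```shell
# Hypothetical assertion helper for status-code expectations.
assert_status() {
  if [ "$2" = "$1" ]; then
    echo "PASS: got $2"
  else
    echo "FAIL: expected $1, got $2"
  fi
}

# In a real test the status would come from a direct, UI-bypassing call:
#   status=$(curl -s -o /dev/null -w '%{http_code}' \
#     -H "Authorization: Bearer $USER_A_TOKEN" "<user-b-resource-url>")
assert_status 403 403   # → PASS: got 403
assert_status 403 200   # → FAIL: expected 403, got 200
```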
+
+### 2. Server-side validation independent of client
+
+Test that the server rejects malformed input even when the client would normally prevent it:
+
+```
+- Open the form.
+- Use DevTools to remove the `maxlength` attribute from an input.
+- Submit an oversized value.
+- Expected: server rejects with 400/422, not 500 (server crashed because client validation was assumed).
+```
+
+### 3. Auth state cannot be forged
+
+Test that:
+
+- Modifying client-stored tokens (localStorage, sessionStorage) does not grant access.
+- Removing or modifying cookies does not grant access (or it gracefully de-authenticates).
+- Replaying captured requests after logout fails.
+
+### 4. CSRF / cross-origin protection
+
+Test that:
+
+- A request originating from a different origin (set `Origin: https://attacker.com` in test) is rejected for state-changing operations.
+- CSRF tokens (or SameSite cookie equivalents) are validated, not just included.
+
+### 5. No client-side secrets
+
+Audit the bundle for accidentally-shipped secrets:
+
+```
+- Build the production bundle.
+- grep for: API keys (hex strings, JWT structure), private endpoints, internal URLs,
+  source maps containing internal paths, debug flags left enabled.
+- Expected: nothing sensitive present.
+```
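One of the greps above can be made concrete for JWT-shaped strings (three base64url segments joined by dots); the function name and the sample bundle are invented:

```shell
# Hypothetical secret sweep: find JWT-shaped tokens in a built bundle.
scan_bundle() {
  grep -Eo 'eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+' "$1" \
    || echo "no JWT-shaped strings found"
}

b=$(mktemp)
printf 'var cfg = { token: "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxIn0.c2ln" };\n' > "$b"
scan_bundle "$b"   # → eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxIn0.c2ln
```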
|
|
75
|
+
|
|
76
|
+
## Common UI-as-security misconceptions

| Misconception | Why it fails |
|---------------|--------------|
| "Hidden field hides the value" | Visible in HTML source / DevTools |
| "Disabled button prevents action" | User can re-enable it in DevTools |
| "Client-side regex prevents bad input" | Bypassable with a crafted request |
| "Auth check on the page blocks access" | The page didn't render, but the API is still callable |
| "We minified the code" | Reverse-engineering minified code is trivial |
| "We obfuscated the API" | Network tab reveals the calls |
| "Only our app calls this endpoint" | Anyone can call any URL |

## Testing browser-side trust boundaries

When using Playwright/Puppeteer/MCP for testing:

- **Capture and replay attacks:** record a request, replay it with a modified payload, assert the server rejects it.
- **Session manipulation:** modify cookies/localStorage between actions, assert the server detects the tampering.
- **Direct API calls:** skip the UI; call endpoints directly; assert correct authorization.
- **Cross-origin simulation:** override the `Origin` header; assert correct rejection.

These tests catch bugs unit tests miss, because unit tests assume well-formed input. The browser-as-attacker tests assume malicious input.

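The replay bullet hinges on one server-side property: logout must actually revoke the session. A toy model of the check those browser tests probe (session ids and function names are hypothetical):

```javascript
// Toy server-side session store. If logout doesn't revoke the session,
// a captured request replays successfully forever.
const activeSessions = new Set();

function login(sessionId) { activeSessions.add(sessionId); }
function logout(sessionId) { activeSessions.delete(sessionId); }
function authorize(sessionId) { return activeSessions.has(sessionId); }

const captured = 'sess-123';      // attacker records a request carrying this id
login(captured);
console.log(authorize(captured)); // true — legitimate use while logged in
logout(captured);
console.log(authorize(captured)); // false — the replayed request is rejected
```

The browser test records a real request before logout and replays it after; if the replay still succeeds, revocation is only cosmetic.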
## Anti-patterns

- Testing only the happy path through the UI.
- Asserting "the button is disabled" without asserting "the API rejects the call."
- Treating client-side validation messages as if they were security checks.
- Relying on minification/obfuscation as defense.
- Testing once, never re-testing after dependency updates that change the trust surface.

## When this matters most

- Auth flows, account changes, password resets.
- Payment / billing operations.
- Data export, account deletion, irreversible actions.
- Multi-tenant boundaries (one user's data must not leak to another).
- Admin endpoints (must reject non-admin users at the server, not just hide the UI).

These are the tests that catch real production incidents.

@@ -0,0 +1,144 @@

# Three workflow patterns for browser-based testing

> Adapted from [`addyosmani/agent-skills/browser-devtools`](https://github.com/addyosmani/agent-skills/tree/main/browser-devtools) (MIT). The three workflows below organize webapp testing tasks by what's actually being verified.

Most webapp testing tasks fall into one of three workflows. Each has different goals, different signals, and different failure modes. Don't conflate them.

## Workflow 1 — UI bugs

**Goal:** verify what the user sees matches what's expected.

**Signals:**
- Screenshot diff vs reference.
- Element exists / does not exist in DOM.
- Element has expected text / attributes.
- Element is visible / styled correctly.
- Click triggers expected navigation or state change.

**Tools:**
- Playwright / Puppeteer for navigation and interaction.
- Visual regression (Percy, Chromatic, Playwright's `toHaveScreenshot`).
- Accessibility checks (`axe-core`, Playwright's accessibility snapshot).

**Common bugs caught:**
- Layout shift after image loads.
- Text wrapping that overflows containers.
- Missing focus styles on interactive elements.
- Hover/active states broken on touch devices.
- Hydration mismatch (server-rendered ≠ client-rendered DOM).

**Common bugs missed:**
- Behavior bugs (click works but state is wrong).
- Race conditions (UI state stable; network race underneath).
- Security bugs (UI hides the action; server still accepts it).

**When to use this workflow:**
- After a CSS / component refactor.
- Before / after design system migrations.
- Smoke testing critical pages on every release.

## Workflow 2 — Network issues

**Goal:** verify the client-server contract is honored under various network conditions.

**Signals:**
- Request was sent with the expected payload, headers, method, URL.
- Response was received with the expected status, body, headers.
- Retries occurred when expected (and didn't when not).
- Errors are surfaced to the UI rather than swallowed silently.
- Requests don't fire when they shouldn't (e.g., debounced search).

**Tools:**
- Playwright `page.route()` to intercept and inspect / modify requests.
- DevTools Network panel via MCP / inspection.
- Mock server / MSW for controlled scenarios.
- Network throttling (slow 3G, offline) for resilience tests.

**Common bugs caught:**
- N+1 requests on page load.
- Missing error handling (the 200 success path is tested; the 500 path crashes the UI).
- Auth headers missing on retried requests.
- Stale data shown after an offline reconnect.
- Race conditions when multiple requests resolve out of order.

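The out-of-order race in the last bullet is commonly fixed with a latest-request-wins guard. A synchronous sketch of the bookkeeping (the function names are hypothetical; in a real client the completions arrive asynchronously):

```javascript
// Latest-request-wins guard: only the most recently issued request may
// commit its result, so a slow, stale response can't clobber fresh state.
let latestSeq = 0;
let results = null;

function beginRequest() { return ++latestSeq; }

function commitResponse(seq, data) {
  if (seq !== latestSeq) return false; // stale — a newer request superseded it
  results = data;
  return true;
}

const first = beginRequest();               // user types "a"
const second = beginRequest();              // user types "ab"
commitResponse(second, 'results for "ab"'); // fast response lands first
commitResponse(first, 'results for "a"');   // slow response arrives late
console.log(results); // 'results for "ab"' — the stale payload was discarded
```

A network test for this bug delays the first response (e.g., via `page.route()`) and asserts the UI shows the newer result.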
**Common bugs missed:**
- Visual bugs that don't affect network (CSS issues).
- Server-side bugs (test only checks request/response shape, not server logic).
- Performance bugs at scale (single request looks fine; thousands per second don't).

**When to use this workflow:**
- After auth / API client refactor.
- Verifying offline / connectivity-loss behavior.
- Validating against contract tests / API mocks.
- Reproducing user-reported "loading forever" bugs.

## Workflow 3 — Performance investigation

**Goal:** find why a page is slow and verify a fix.

**Signals:**
- Lighthouse scores (LCP, FID/INP, CLS, TTFB).
- DevTools Performance flame graph timing.
- Bundle analyzer output.
- `web-vitals` library captures from real browser sessions.
- Frame rate during interactions (`requestAnimationFrame` timing).

**Tools:**
- Playwright `tracing.start({ snapshots: true, screenshots: true })`.
- Lighthouse CI for automated runs.
- DevTools Performance tab via MCP.
- WebPageTest for repeatable third-party measurement.
- Bundle analyzer (`@next/bundle-analyzer`, `webpack-bundle-analyzer`).

**Common bugs caught:**
- Render-blocking third-party scripts.
- Unintended re-renders amplifying click handlers.
- Large bundle from an accidental library import (e.g., `import _ from 'lodash'` instead of a specific function).
- Images larger than their displayed size.
- N+1 client-side renders (a list of 1000 items, each fetching).

**Common bugs missed:**
- Network correctness (perf can pass even when results are wrong).
- Visual issues unrelated to render time.
- Backend perf (this workflow looks at the client side; backend traces are needed too).

**When to use this workflow:**
- Performance regression alerts firing.
- User-reported slowness.
- Before / after a perf-targeted refactor.
- Pre-launch validation against budget targets.

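Budget validation can be a simple gate in CI. A sketch with hypothetical targets; in practice the measured numbers come from Lighthouse CI or `web-vitals` captures:

```javascript
// Hypothetical perf budget; units are milliseconds except CLS (unitless score).
const BUDGET = { lcpMs: 2500, inpMs: 200, cls: 0.1, ttfbMs: 800 };

// Returns the names of metrics that exceed their budget (empty array = pass).
function checkBudget(measured) {
  return Object.entries(BUDGET)
    .filter(([metric, limit]) => measured[metric] > limit)
    .map(([metric]) => metric);
}

console.log(checkBudget({ lcpMs: 1900, inpMs: 150, cls: 0.05, ttfbMs: 600 })); // []
console.log(checkBudget({ lcpMs: 4200, inpMs: 150, cls: 0.05, ttfbMs: 600 })); // ['lcpMs']
```

Wire the non-empty case to a failing CI exit code so regressions can't land silently.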
## Choosing a workflow

The first question for any test: what am I trying to verify?

| Concern | Workflow |
|---------|----------|
| "Does the page look right?" | UI bugs |
| "Does clicking this button do the right thing visually?" | UI bugs |
| "Does the API get called correctly?" | Network issues |
| "Does the UI handle errors gracefully?" | Network issues |
| "Why is this page slow?" | Performance |
| "Does this hit our perf budget?" | Performance |
| "Does an attacker get blocked here?" | Network issues + security boundary (`security-boundary.md`) |

Mixing workflows in a single test produces flaky, slow, or incomplete coverage. A test that asserts both "the button is styled correctly" and "the Lighthouse score is >90" runs slowly and fails for unrelated reasons.

## Anti-patterns

- One mega-test that "checks everything" — fails fragilely, hard to debug.
- UI tests that assert pixel-perfect layout in CI (CI rendering differs from local).
- Network tests that mock the entire API (you stop testing the contract; you test the mock).
- Performance tests run once during dev, never in CI (regressions land silently).
- Skipping the security workflow because "we have unit tests for auth" — unit tests don't catch UI/server-disagreement bugs.

## How these compose

A real webapp test suite runs all three workflows:

- **Per commit (CI):** UI smoke tests + critical-path network tests.
- **Per PR:** the above + visual regression on changed components.
- **Per release:** the above + Lighthouse CI + extended network resilience tests.
- **Periodically (nightly / weekly):** full perf baseline + security boundary checks.

Different cadences match different cost profiles. UI smoke tests are cheap; a full perf + security pass is expensive. Balance accordingly.