@clipboard-health/ai-rules 2.14.23 → 2.14.24

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@clipboard-health/ai-rules",
- "version": "2.14.23",
+ "version": "2.14.24",
  "description": "Pre-built AI agent rules for consistent coding standards.",
  "keywords": [
  "ai",
@@ -3,6 +3,7 @@ Object.defineProperty(exports, "__esModule", { value: true });
  exports.execAndLog = execAndLog;
  const node_child_process_1 = require("node:child_process");
  const node_util_1 = require("node:util");
+ // oxlint-disable-next-line typescript/strict-void-return -- execFile returns ChildProcess by design
  const execAsync = (0, node_util_1.promisify)(node_child_process_1.execFile);
  async function execAndLog(params) {
  const { command, verbose, ...rest } = params;
@@ -141,24 +141,16 @@ This downloads and extracts to `/tmp/playwright-llm-report-{runId}/`. The report
 
  ## Phase 3E: Quick Classification
 
- LLM report structure:
-
- - **`summary`** -- quick pass/fail counts
- - **`tests[].errors[].message`** -- ANSI-stripped, clean error text
- - **`tests[].errors[].diff`** -- extracted expected/actual from assertion errors
- - **`tests[].errors[].location`** -- exact file and line of failure
- - **`tests[].flaky`** -- true if test passed after retry
- - **`tests[].attempts[]`** -- full retry history with per-attempt status, timing, stdio, attachments, steps, and network
- - **`tests[].attempts[].consoleMessages[]`** -- warning/error/pageerror/page-closed/page-crashed trace entries only (2KB text cap with `[truncated]` marker, max 50 per attempt, high-signal entries prioritized over low-signal)
- - **`tests[].steps` / `tests[].network` / `tests[].timeline`** -- convenience aliases from the final attempt
- - **`tests[].attempts[].timeline[]`** -- unified, sorted-by-`offsetMs` array of all retained events (`kind: "step" | "network" | "console"`). Slimmed-down entries for quick temporal scanning; full details remain in the source arrays
- - **`offsetMs`** -- milliseconds since the attempt's `startTime`. Always present on steps (from `TestStep.startTime`). Optional on network entries (from trace `_monotonicTime` or `startedDateTime`, converted via the trace's `context-options` anchor) and console entries (from trace monotonic `time` field + anchor). Absent when the trace lacks a `context-options` event. Entries without `offsetMs` are excluded from the timeline
- - **`tests[].attempts[].network[].traceId`** -- promoted from `x-datadog-trace-id` header for direct access
- - **`tests[].attempts[].network[]`** -- max 200 per attempt, priority-based: fetch/xhr requests, error responses (status >= 400), failed, and aborted requests are retained over static assets (script, stylesheet, image, font). Includes failure details (`failureText`, `wasAborted`), redirect chain (`redirectToUrl`, `redirectFromUrl`, `redirectChain`), timing breakdown (`timings`), `durationMs` derived from available timing components, and allowlisted headers (`requestHeaders`, `responseHeaders`)
- - **`tests[].attempts[].network[].responseHeaders`** -- includes `x-datadog-trace-id` and `x-datadog-span-id` when present (values capped to 256 chars)
- - **`tests[].attempts[].failureArtifacts`** -- for failing/timed-out/interrupted attempts: `screenshotBase64` (base64-encoded screenshot, max 512KB), `videoPath` (first video attachment path). Omitted entirely when neither screenshot nor video is available
- - **`tests[].attachments[].path`** -- relative to Playwright outputDir
- - **`tests[].stdout` / `tests[].stderr`** -- capped at 4KB with `[truncated]` marker
+ For the full report schema, field reference, caps, and example reports:
+
+ 1. If the repo has `node_modules/@clipboard-health/playwright-reporter-llm/`, read `README.md` and `docs/example-report.json` from there — exact version match to the report.
+ 2. Otherwise, fetch the latest docs from GitHub:
+ - `https://raw.githubusercontent.com/ClipboardHealth/core-utils/refs/heads/main/packages/playwright-reporter-llm/README.md`
+ - `https://raw.githubusercontent.com/ClipboardHealth/core-utils/refs/heads/main/packages/playwright-reporter-llm/docs/example-report.json`
+
+ Cross-check the report's `schemaVersion` against the docs. If they disagree, the `main` docs describe a different version and some field semantics may not apply.
+
+ Read the docs if you need field semantics or limits; otherwise the field names used below are enough to drive the investigation.
 
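Reviewer sketch (not part of the package): the two-step docs lookup added in this hunk could be expressed roughly as follows. `resolveDocSources` and `schemaVersionsDiffer` are hypothetical helper names, and the assumption that `docs/example-report.json` carries a top-level `schemaVersion` is not confirmed by the diff.

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

// Base URL copied from the fallback links in the added text.
const GITHUB_DOCS_BASE =
  "https://raw.githubusercontent.com/ClipboardHealth/core-utils/refs/heads/main/packages/playwright-reporter-llm";

const DOC_FILES = ["README.md", "docs/example-report.json"];

interface DocSources {
  origin: "node_modules" | "github";
  locations: string[];
}

// Hypothetical helper: prefer the installed package (exact version match),
// otherwise fall back to the latest docs on GitHub.
// `exists` is injectable so the lookup can be exercised without a real repo.
function resolveDocSources(
  repoRoot: string,
  exists: (path: string) => boolean = existsSync,
): DocSources {
  const packageDir = join(
    repoRoot,
    "node_modules",
    "@clipboard-health",
    "playwright-reporter-llm",
  );
  if (exists(packageDir)) {
    return {
      origin: "node_modules",
      locations: DOC_FILES.map((file) => join(packageDir, file)),
    };
  }
  return {
    origin: "github",
    locations: DOC_FILES.map((file) => `${GITHUB_DOCS_BASE}/${file}`),
  };
}

// Hypothetical check, assuming the example report exposes `schemaVersion`:
// true means the docs describe a different version than the report.
function schemaVersionsDiffer(
  report: { schemaVersion?: unknown },
  example: { schemaVersion?: unknown },
): boolean {
  return report.schemaVersion !== example.schemaVersion;
}
```

When `origin` is `"github"` and the versions differ, treat field semantics taken from the docs as lower confidence.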
  Classify the flake to narrow the search space:
 
@@ -218,33 +210,21 @@ Filter `tests[]` for entries where `status` is `"failed"` or `flaky` is `true`.
 
  ### 4Ed: Examine attempts for retry patterns
 
- Each attempt includes:
+ For each attempt, compare `status`, `durationMs`, and `error` across retries — timing or error-shape differences between attempts often point at the trigger.
 
- - `status` and `durationMs`: spot timing differences between passing and failing attempts
- - `error` — failure reason per attempt (may differ across retries)
- - `consoleMessages[]` — browser warnings/errors (only warning, error, pageerror, page-closed, page-crashed entries; capped at 2KB / 50 per attempt)
- - `failureArtifacts` — for failed/timed-out/interrupted attempts:
- - `screenshotBase64` — base64-encoded failure screenshot (max 512KB). **Decode and inspect this** to see exactly what the page showed at failure time — often reveals modals, loading spinners, error banners, or unexpected navigation that the assertion text alone doesn't explain.
- - `videoPath` — path to video recording
- - `network[]` — HTTP requests/responses for that attempt
- - `timeline[]` — unified sorted event stream
+ **Always decode `failureArtifacts.screenshotBase64` when present.** The page state at failure often reveals modals, loading spinners, error banners, or unexpected navigation that the assertion text alone doesn't explain.
 
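Reviewer sketch (not part of the package): decoding the screenshot mentioned in this step might look like the following. `saveScreenshots` and the `Attempt` shape are hypothetical, and the `.png` extension is an assumption, since the diff does not state the image format.

```typescript
import { Buffer } from "node:buffer";
import { writeFileSync } from "node:fs";

// Minimal, assumed shape covering only the fields used here.
interface Attempt {
  status: string;
  failureArtifacts?: { screenshotBase64?: string; videoPath?: string };
}

// Decode the base64 payload into raw image bytes.
function decodeScreenshot(screenshotBase64: string): Buffer {
  return Buffer.from(screenshotBase64, "base64");
}

// Hypothetical helper: write each attempt's screenshot to disk and
// return the paths written. Attempts without a screenshot are skipped.
function saveScreenshots(attempts: Attempt[], outputPrefix: string): string[] {
  const written: string[] = [];
  attempts.forEach((attempt, index) => {
    const encoded = attempt.failureArtifacts?.screenshotBase64;
    if (!encoded) return; // passing attempts, or no screenshot captured
    const path = `${outputPrefix}-attempt-${index}.png`; // extension is assumed
    writeFileSync(path, decodeScreenshot(encoded));
    written.push(path);
  });
  return written;
}
```

Writing one file per attempt keeps retries side by side, which makes it easier to see whether the page state differed between a failing and a passing run.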
  ### 4Ee: Inspect network activity and extract trace IDs
 
- The `network[]` array (on tests or individual attempts) includes:
+ Scan `network[]` for 4xx/5xx responses, `failureText`, and `wasAborted` near the failure's `offsetMs`. Use `timings` to isolate slow phases (DNS, connect, wait, receive).
 
- - `method`, `url`, `status`: identify 4xx/5xx responses
- - `timings` — detailed breakdown: `dnsMs`, `connectMs`, `sslMs`, `sendMs`, `waitMs`, `receiveMs`
- - `durationMs` — total request duration derived from timing components
- - `requestHeaders`, `responseHeaders` — allowlisted headers
- - `redirectChain` — full redirect sequence
- - **`traceId`** — Datadog trace ID extracted from `x-datadog-trace-id` response header. **When present near a failure, you must use references/datadog-apm-traces.md for backend correlation to bridge the gap between frontend test failure and potential backend root cause.**
+ **`traceId`**: when present on a failing request, you must follow [`references/datadog-apm-traces.md`](./references/datadog-apm-traces.md) to correlate with backend behavior. This is the bridge between frontend test failure and potential backend root cause.
 
- Network is capped at 200 entries per attempt, prioritized: fetch/xhr and error responses are retained over static assets. Headers/values capped at 256 chars. If all 200 entries are static assets (script/stylesheet/font) with no API calls, the capture is saturated.
+ If the `network[]` array is full (200 entries) but contains only static assets, the capture is saturated and the relevant API calls may have been dropped. Note this as a confidence-reducing factor.
 
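Reviewer sketch (not part of the package): the network scan and the saturation check described in this step could be approximated as below. The function names are hypothetical, and `resourceType` is an assumed field name; the diff mentions resource kinds (script, stylesheet, image, font) but does not name the field that holds them.

```typescript
// Minimal, assumed shape covering only the fields used here.
interface NetworkEntry {
  url: string;
  status?: number;
  failureText?: string;
  wasAborted?: boolean;
  offsetMs?: number;
  resourceType?: string; // assumed field name
}

const NETWORK_CAP = 200; // per-attempt cap stated in the removed text
const STATIC_TYPES = new Set(["script", "stylesheet", "image", "font"]);

// A request is suspicious if it returned 4xx/5xx, failed, or was aborted.
function isSuspicious(entry: NetworkEntry): boolean {
  return (
    (entry.status ?? 0) >= 400 ||
    Boolean(entry.failureText) ||
    entry.wasAborted === true
  );
}

// Keep suspicious requests within `windowMs` of the failure's offset.
// Entries without `offsetMs` cannot be placed in time and are skipped.
function suspiciousNear(
  network: NetworkEntry[],
  failureOffsetMs: number,
  windowMs: number,
): NetworkEntry[] {
  return network.filter(
    (entry) =>
      isSuspicious(entry) &&
      entry.offsetMs !== undefined &&
      Math.abs(entry.offsetMs - failureOffsetMs) <= windowMs,
  );
}

// Saturated: the array hit the cap and holds only static assets,
// so relevant API calls may have been dropped.
function isSaturated(network: NetworkEntry[]): boolean {
  return (
    network.length >= NETWORK_CAP &&
    network.every(
      (entry) =>
        entry.resourceType !== undefined && STATIC_TYPES.has(entry.resourceType),
    )
  );
}
```

A request flagged by `suspiciousNear` that also carries a `traceId` is the natural starting point for the Datadog correlation. If `isSaturated` returns true, lower the confidence of any "no failing API call" conclusion.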
  ### 4Ef: Review test steps
 
- `tests[].steps[]` provides a step-by-step breakdown of test actions with timing (`offsetMs`, `durationMs`, `depth`). Prefer the timeline view (4Ea) which interleaves steps with network and console. Use steps directly when you need the full hierarchy (nested steps via `depth`).
+ Prefer the timeline view (4Ea) which interleaves steps with network and console. Fall back to `tests[].attempts[].steps[]` directly when you need the full nesting hierarchy via `depth`.
 
  ## Phase 4E Evidence Standard