@alwaysmeticulous/debug-workspace 2.261.0 → 2.261.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +27 -0
- package/package.json +18 -17
- package/dist/templates/templates/CLAUDE.md +0 -206
- package/dist/templates/templates/agents/planner.md +0 -65
- package/dist/templates/templates/agents/summarizer.md +0 -75
- package/dist/templates/templates/hooks/check-file-size.sh +0 -36
- package/dist/templates/templates/hooks/load-context.sh +0 -20
- package/dist/templates/templates/rules/feedback.md +0 -37
- package/dist/templates/templates/settings.json +0 -82
- package/dist/templates/templates/skills/debugging-diffs/SKILL.md +0 -57
- package/dist/templates/templates/skills/debugging-flakes/SKILL.md +0 -52
- package/dist/templates/templates/skills/debugging-network/SKILL.md +0 -45
- package/dist/templates/templates/skills/debugging-sessions/SKILL.md +0 -47
- package/dist/templates/templates/skills/debugging-timelines/SKILL.md +0 -51
- package/dist/templates/templates/skills/pr-analysis/SKILL.md +0 -20
package/README.md
ADDED
@@ -0,0 +1,27 @@
+# @alwaysmeticulous/debug-workspace
+
+Internal package that provides the shared debug workspace pipeline used by the [Meticulous](https://meticulous.ai) CLI to investigate visual diffs and replays.
+
+## What it does
+
+Given a test run ID, replay diff ID, replay IDs, or session ID, this package:
+
+1. **Resolves context** — fetches metadata about the test run and its replay diffs from the Meticulous API.
+2. **Downloads debug data** — downloads replay data, screenshots, and any additional artifacts to a local workspace directory.
+3. **Generates a workspace** — scaffolds a structured directory with files and templates for investigating the diff.
+
+## Usage
+
+This package is not intended for direct use. It is consumed internally by `@alwaysmeticulous/cli` via the `meticulous debug` command.
+
+## Key exports
+
+- `runDebugPipeline` — runs the full pipeline (resolve → download → generate workspace).
+- `resolveDebugContext` — resolves a `DebugContext` from a test run ID, replay diff ID, replay IDs, or session ID.
+- `downloadDebugData` — downloads replay data and screenshots into a workspace directory.
+- `generateDebugWorkspace` — scaffolds the workspace directory from templates.
+- `DebugContext`, `ReplayDiffInfo` — types describing the resolved debug context.
+
+## Part of the Meticulous SDK
+
+This package is part of the [meticulous-sdk](https://github.com/alwaysmeticulous/meticulous-sdk) monorepo.
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@alwaysmeticulous/debug-workspace",
-  "version": "2.261.0",
+  "version": "2.261.1",
   "description": "Shared debug workspace pipeline for investigating Meticulous diffs and replays",
   "license": "ISC",
   "main": "dist/index.js",
@@ -8,14 +8,26 @@
   "files": [
     "dist"
   ],
+  "scripts": {
+    "clean": "rimraf dist tsconfig.tsbuildinfo",
+    "build": "tsc --build tsconfig.json",
+    "postbuild": "cp -r src/templates dist/templates",
+    "dev": "tsc --build tsconfig.json --watch",
+    "format": "prettier --write src",
+    "lint": "eslint \"src/**/*.{js,ts,tsx}\" --cache",
+    "lint:commit": "eslint --cache $(git diff --relative --name-only --diff-filter=ACMRTUXB master | grep -E \"(.js$|.ts$|.tsx$)\")",
+    "lint:fix": "eslint \"src/**/*.{js,ts,tsx}\" --cache --fix",
+    "depcheck": "depcheck --ignore-patterns=dist",
+    "test": "vitest run --passWithNoTests"
+  },
   "dependencies": {
-    "chalk": "^4.1.2",
     "@alwaysmeticulous/client": "2.261.0",
     "@alwaysmeticulous/common": "2.260.2",
-    "@alwaysmeticulous/downloading-helpers": "2.261.0"
+    "@alwaysmeticulous/downloading-helpers": "2.261.0",
+    "chalk": "^4.1.2"
   },
   "devDependencies": {
-    "vitest": "
+    "vitest": "catalog:"
   },
   "author": {
     "name": "The Meticulous Team",
@@ -34,16 +46,5 @@
   "bugs": {
     "url": "https://github.com/alwaysmeticulous/meticulous-sdk/issues"
   },
-  "scripts": {
-    "clean": "rimraf dist tsconfig.tsbuildinfo",
-    "build": "tsc --build tsconfig.json",
-    "postbuild": "cp -r src/templates dist/templates",
-    "dev": "tsc --build tsconfig.json --watch",
-    "format": "prettier --write src",
-    "lint": "eslint \"src/**/*.{js,ts,tsx}\" --cache",
-    "lint:commit": "eslint --cache $(git diff --relative --name-only --diff-filter=ACMRTUXB master | grep -E \"(.js$|.ts$|.tsx$)\")",
-    "lint:fix": "eslint \"src/**/*.{js,ts,tsx}\" --cache --fix",
-    "depcheck": "depcheck --ignore-patterns=dist",
-    "test": "vitest run --passWithNoTests"
-  }
-}
+  "gitHead": "c7735db49f18531aa3d2024e2931fe227d002429"
+}
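The `"vitest": "catalog:"` specifier is pnpm's catalog protocol: the concrete version range is defined once at the workspace level in `pnpm-workspace.yaml`, so every package in the monorepo resolves the same vitest version. A hypothetical sketch of the corresponding workspace entry (the actual range used by this monorepo is not shown in this diff):

```yaml
# pnpm-workspace.yaml (hypothetical; the real version range is not in this diff)
packages:
  - "packages/*"
catalog:
  vitest: ^2.1.0
```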
package/dist/templates/templates/CLAUDE.md
REMOVED
@@ -1,206 +0,0 @@
-# Meticulous Debug Workspace
-
-## What This Is
-
-You are in a debugging workspace for the Meticulous automated UI testing platform.
-You are investigating a replay issue (flaky behavior, unexpected diffs, or replay failures).
-
-`debug-data/context.json` has been automatically loaded into your context. It contains all IDs, paths,
-metadata, and what data is available in this workspace. You do not need to read it again.
-
-## How Meticulous Works
-
-Meticulous records user sessions by injecting a JavaScript recorder snippet into your
-application. These sessions capture user activity and network requests. When you make
-a commit, Meticulous triggers a test run that replays selected sessions against the new code,
-taking screenshots at key moments. If there is a base test run to compare against, screenshot
-diffs are computed and surfaced to the developer.
-
-- A **replay** is a single session being replayed against a version of your app.
-- A **test run** is a collection of replays triggered by a commit.
-- A **replay diff** compares a head replay (new code) against a base replay (old code) and
-  contains the screenshot diff results.
-- A **session** is the original user recording that gets replayed.
-
-## Workspace Layout
-
-The workspace root contains the debug workspace files. All downloaded debug data lives under
-the `debug-data/` subdirectory.
-
-- **`.claude/`** -- Configuration for this debugging workspace (hooks, skills, agents).
-- **`debug-data/context.json`** -- Loaded automatically into context by the SessionStart hook;
-  you do not need to read it again.
-- **`debug-data/`** -- All downloaded replay data, session recordings, diffs, and
-  pre-computed analysis artifacts.
-- **`project-repo/`** -- (Optional) Your codebase checked out at the relevant commit.
-  Only present if the command was run from within a git repository.
-
-## debug-data/ Contents
-
-Data falls into three categories: per-replay files (always present), diff files (only when
-comparing replays), and other data.
-
-### Per-Replay Files (always available)
-
-Replay data is organized into `head/`, `base/`, and `other/` subdirectories under
-`debug-data/replays/`. All files are searchable and can be found via glob/search.
-
-Each replay directory (`debug-data/replays/{head,base,other}/<replayId>/`) contains:
-
-- `logs.deterministic.txt` -- Deterministic logs with non-deterministic data stripped. Best for
-  diffing between replays. Can be very large (check `fileMetadata` in `context.json` for sizes).
-- `logs.deterministic.filtered.txt` -- **Start here for single-replay investigation.**
-  Noise-stripped version of the deterministic logs: tunnel URLs, S3 tokens, PostHog payloads,
-  build hashes, and other non-deterministic patterns are replaced with placeholders. Prefer this
-  over the raw version unless you need unmodified output.
-- `logs.concise.txt` -- Full logs with both virtual and real timestamps, and trace IDs.
-- `timeline.json` -- Detailed timeline of all replay events (user interactions, network requests,
-  DOM mutations, etc.). Can be 1-2MB; prefer `debug-data/timeline-summaries/` for a compact overview.
-- `timeline-stats.json` -- Aggregated statistics about timeline events.
-- `metadata.json` -- Replay configuration, parameters, and environment info.
-- `launchBrowserAndReplayParams.json` -- The exact parameters used to launch the replay.
-- `stackTraces.json` -- JavaScript stack traces captured during replay (if any errors occurred).
-- `accuracyData.json` -- Replay accuracy assessment comparing to expected behavior.
-- `snapshotted-assets/` -- Static assets (JS/CSS) that were captured and used during replay.
-  **Only present if `snapshotAssets` was enabled** -- check `launchBrowserAndReplayParams.json`
-  for the `snapshotAssets` field before assuming this directory exists.
-
-Note: `screenshots/` are not copied into the workspace (they are large binary PNGs). Reference
-screenshot paths via `screenshotMap` in `context.json` instead; the actual files are in the
-replay cache at `~/.meticulous/replays/<replayId>/screenshots/`.
-
-Per-replay generated summaries:
-
-- `debug-data/timeline-summaries/<role>-<replayId>.txt` -- Compact summary of each replay's
-  timeline: total entries, virtual time range, screenshot timestamps, event kind breakdown.
-- `debug-data/formatted-assets/<role>/<replayId>/` -- Pretty-printed JS/CSS from
-  `snapshotted-assets/`. Only present if snapshotted assets exist. Use these instead of the originals.
-- `context.json` fields: `screenshotMap` (screenshot-to-timestamp mapping), `replayComparison`
-  (side-by-side event counts, virtual time, screenshot count), `fileMetadata` (byte sizes and
-  line counts for key files).
-
-### Diff Files (only when comparing replays)
-
-These files are only generated when comparing replays -- i.e. when using `meticulous debug replay-diff`,
-`meticulous debug test-run`, or `meticulous debug replays` with exactly 2 replay IDs.
-
-- `debug-data/diffs/<id>.json` -- Full diff data including replay metadata, test run config,
-  and screenshot results. Can be very large (20K+ tokens). Only read this if you need the full context.
-- `debug-data/diffs/<id>.summary.json` -- **Start here.** Compact summary with just the screenshot
-  diff results: which screenshots differ, mismatch pixel counts, mismatch percentages, and changed
-  section class names.
-- `debug-data/log-diffs/<id>.diff` -- Raw unified diff of `logs.deterministic.txt` between head and base.
-- `debug-data/log-diffs/<id>.filtered.diff` -- **Start here for diff investigation.** Noise-stripped
-  version with tunnel URLs, S3 tokens, PostHog payloads removed. Hunks that only differ in
-  noise are removed entirely.
-- `debug-data/log-diffs/<id>.summary.txt` -- High-level summary: total changed lines, first divergence
-  point, and categorized change counts with direction (e.g. "animation frames: +85 in head /
-  -46 in base, net +39 in head").
-- `debug-data/params-diffs/<id>.diff` -- JSON-aware diff of `launchBrowserAndReplayParams.json`
-  between head and base. Keys are sorted and pretty-printed so only meaningful value changes appear.
-- `debug-data/assets-diffs/<id>.txt` -- Comparison of snapshotted asset file lists between head
-  and base (added/removed/changed by content hash). Not generated if assets are identical.
-- `debug-data/screenshot-context/<id>.txt` -- Only generated with `--screenshot`. Shows ±30 lines
-  of `logs.deterministic.txt` surrounding the screenshot for both head and base, with the
-  screenshot line marked `>>>`.
-
-### Other Data
-
-- `debug-data/session-summaries/<sessionId>.txt` -- **Start here for session investigation.** Compact
-  summary of each session: URL history, user event breakdown, network request stats (methods,
-  status codes, domains, failures), storage counts, WebSocket connections, custom data, session
-  context, and framework info.
-- `debug-data/sessions/<sessionId>/data.json` -- Full session recording data including user events, network
-  requests (HAR format), and application storage. Can be very large; prefer the session summary
-  or use search to find relevant portions.
-- `debug-data/test-run/<testRunId>.json` -- Test run configuration, results, commit SHA, and status.
-- `debug-data/pr-metadata.json` -- Pull request metadata (title, URL, hosting provider, author, status) from
-  the database. May not be present if no PR is associated with the test run.
-- `debug-data/pr-diff.txt` -- Source code changes between the base and head commits. May not be present if
-  commit SHAs are unavailable.
-- `debug-data/project-repo/` -- Your codebase checked out at the relevant commit. Only present if
-  the command was run from within a git repository.
-
-## Screenshot Mapping
-
-`context.json` includes a `screenshotMap` that maps each screenshot to its virtual timestamp
-and event number. Use this to correlate screenshot filenames (e.g. `screenshot-after-event-00673.png`)
-with specific points in the replay timeline and logs.
-
-## Replay Comparison
-
-`context.json` includes a `replayComparison` array with side-by-side stats for each replay:
-total events, network requests, animation frames, virtual time, and screenshot count. Compare
-head vs base entries to quickly spot drift (e.g. extra animation frames or different virtual time).
-
-## File Sizes
-
-`context.json` includes a `fileMetadata` array with the byte size and line count of key files.
-Check this before attempting to read large files -- use grep/search or read specific line ranges
-for files over ~5000 lines instead of reading them in full.
-
-## Debugging Workflow
-
-1. **Start with `debug-data/context.json`** -- Read this file for all IDs, statuses, file paths,
-   `screenshotMap`, and `replayComparison`. If a `screenshot` field is present, this is the
-   specific screenshot the user wants to investigate. Use `screenshotMap` to find its
-   virtual timestamp and event number, then focus your analysis on events leading up to it.
-2. **Check replay comparison** -- Compare head vs base entries in `replayComparison` for
-   immediate drift signals (different event counts, animation frames, virtual time).
-3. **Read filtered logs** -- For diffs: start with `debug-data/log-diffs/*.summary.txt` then
-   `debug-data/log-diffs/*.filtered.diff`. For single replays: read `logs.deterministic.filtered.txt`
-   inside the replay directory. Fall back to the raw `logs.deterministic.txt` only if you
-   need unmodified output.
-4. **Read timeline summaries** -- Check `debug-data/timeline-summaries/` for a compact overview of each
-   replay's events, screenshot timestamps, and counts. Only read raw `timeline.json` if you
-   need granular event-level detail.
-5. **Inspect screenshot diffs** -- Start with `debug-data/diffs/<id>.summary.json` for a compact view of
-   which screenshots differ and by how much. If a `debug-data/screenshot-context/` file exists, read it
-   for the log lines surrounding the screenshot in both head and base.
-   Only read the full `debug-data/diffs/<id>.json` if you need complete replay metadata.
-6. **Check replay parameters** -- Read `debug-data/params-diffs/` for pre-computed diffs. For single
-   replays, read `launchBrowserAndReplayParams.json` directly.
-7. **Check assets diffs** -- Read `debug-data/assets-diffs/` to see if the snapshotted JS/CSS chunks
-   differ between head and base.
-8. **Analyze session data** -- Start with `debug-data/session-summaries/` for a quick overview of the
-   session (URL history, user events, network stats). Only read the raw `debug-data/sessions/` data
-   if you need specific details like request/response bodies or exact event selectors.
-9. **Review the PR diff** -- Read `debug-data/pr-diff.txt` to see what code changed in this PR and
-   correlate with screenshot diffs.
-10. **Trace through formatted assets** -- Use `debug-data/formatted-assets/` (pretty-printed JS/CSS)
-    instead of raw minified bundles when tracing code execution.
-11. **Review your code** -- If `project-repo/` is present, check it for the relevant changes.
-    For library source code, use `debug-data/formatted-assets/` which contains the bundled and
-    pretty-printed versions of third-party code.
-
-## Subagents
-
-This workspace includes two specialized subagents in `.claude/agents/`:
-
-### Planner
-
-After the user describes their issue, **always delegate to the planner subagent first**
-before starting your own investigation. The planner reads workspace summaries and metadata
-to produce a structured debugging plan with prioritized investigation steps. Follow its plan
-as your starting point.
-
-### Summarizer
-
-When you need to understand a large file (over 5000 lines), delegate to the summarizer
-subagent instead of reading the file in full. The summarizer scans the file using grep and
-targeted reads, returning a concise overview with line numbers for follow-up. This preserves
-your context window for the actual investigation.
-
-## Rules
-
-- This workspace is for analysis and investigation. Focus on understanding root causes.
-- When referencing files, use paths relative to this workspace root.
-- Prefer `logs.deterministic.filtered.txt` over `logs.deterministic.txt` for general
-  investigation. Use the raw version only when you need unmodified output.
-- Prefer `logs.deterministic.txt` over `logs.concise.txt` when comparing between replays,
-  since real-time timestamps are stripped.
-- Session data files can be very large. Use grep/search to find relevant portions rather than
-  reading entire files.
-- Screenshot images are binary PNG files stored in the replay cache (not in this workspace).
-  Reference them by path but analyze the diff metadata in JSON files instead.
-- Check `fileMetadata` in `context.json` for file sizes before reading large files.
package/dist/templates/templates/agents/planner.md
REMOVED
@@ -1,65 +0,0 @@
----
-name: planner
-description: Creates a structured debugging plan based on workspace data and user context. Use proactively at the start of every debugging session after the user describes their issue.
-tools: Read, Grep, Glob
-model: opus
----
-
-You are a debugging planning assistant for the Meticulous automated UI testing platform.
-
-Your job is to quickly scan the workspace data and produce a structured debugging plan
-that the main agent will follow. You run at the start of a session after the developer
-describes the issue they want to investigate.
-
-## What to Read
-
-Gather context from these sources (in order):
-
-1. `context.json` -- IDs, file paths, `screenshotMap`, `replayComparison`, `fileMetadata`.
-   If a `screenshot` field is present, the developer wants to investigate that specific
-   screenshot.
-2. `timeline-summaries/*.txt` -- compact overview of each replay's events, screenshot
-   timestamps, and counts.
-3. `log-diffs/*.summary.txt` -- high-level log diff summary with categorized change counts
-   (only present when comparing replays).
-4. `diffs/*.summary.json` -- which screenshots differ and by how much (only present when
-   comparing replays).
-5. `params-diffs/*.diff` -- parameter differences between head and base replays.
-6. `pr-diff.txt` -- source code changes (first ~200 lines if large).
-
-## What to Produce
-
-Based on the workspace data and the developer's description, output:
-
-### Initial Assessment
-
-- What type of issue is this? (flake, unexpected diff, replay failure, investigation)
-- What data is available in the workspace?
-- Key observations from summaries and comparisons (e.g. event count drift, virtual time
-  differences, screenshot mismatch percentages).
-
-### Investigation Steps (ordered by priority)
-
-For each step:
-
-- What to examine and why
-- Specific file paths to read
-- What patterns or anomalies to look for
-- What would confirm or rule out each hypothesis
-
-### Key Files
-
-List the most important files with their sizes (from `fileMetadata` in `context.json`).
-Flag any files too large to read in full and suggest using the summarizer subagent or
-grep for those.
-
-## Guidelines
-
-- Be concise. The plan should be actionable, not exhaustive.
-- Prioritize the most likely root causes first.
-- If the developer mentioned a specific screenshot, correlate it with the `screenshotMap`
-  to find its virtual timestamp and event number, and focus the plan around events leading
-  up to that screenshot.
-- If `replayComparison` shows drift (different event counts, animation frames, or virtual
-  time), call that out prominently.
-- Suggest which debugging skills (in `.claude/skills/`) are most relevant to the issue.
package/dist/templates/templates/agents/summarizer.md
REMOVED
@@ -1,75 +0,0 @@
----
-name: summarizer
-description: Summarizes large files (logs, timelines, session data, diffs) that are too large to read in full. Use when a file exceeds 5000 lines or when you need a quick overview of a large file's contents.
-tools: Read, Grep, Glob
-model: haiku
----
-
-You are a file summarization specialist for debugging Meticulous replay issues.
-
-When given a file to summarize, produce a concise overview that helps the main agent
-decide what to investigate further. Do not read the entire file -- use Grep and targeted
-reads to extract the key information efficiently.
-
-## Process
-
-1. Read the first ~50 lines to understand the file's structure and format.
-2. Use Grep to find key patterns: errors, warnings, screenshots, network failures,
-   timeouts, navigation events, and any terms the caller highlighted.
-3. Read targeted sections around important matches.
-4. Read the last ~30 lines for final state or summary information.
-5. Produce a structured summary.
-
-## File-Type Guidelines
-
-### Log files (`logs.deterministic.txt`, `logs.deterministic.filtered.txt`, `logs.concise.txt`)
-
-Summarize:
-
-- Approximate line count and virtual time range
-- Key phases: navigation, network loading, user events, screenshots
-- Errors, warnings, or unusual patterns (grep for `error`, `warning`, `fail`, `timeout`)
-- Network request overview: grep for `request` and note counts, failures
-- Screenshot timestamps and event numbers
-
-### Timeline files (`timeline.json`)
-
-Summarize:
-
-- Total entry count
-- Event kind breakdown (grep for `"kind":` and tally)
-- Any `potentialFlakinessWarning` entries
-- Virtual time range (first and last entries)
-- Notable gaps or clusters of events
-
-### Session data (`sessions/*/data.json`)
-
-Summarize:
-
-- Session structure (grep for top-level keys)
-- User interaction count and types
-- Network request patterns (count, domains)
-- Any storage or cookie data of note
-
-### Diff files (`log-diffs/*.diff`, `log-diffs/*.filtered.diff`)
-
-Summarize:
-
-- Total hunks and changed line counts
-- Categories of changes (network, animation, timers, navigation)
-- Location of first divergence
-- Whether changes are concentrated or spread throughout
-
-### Any other file
-
-Summarize:
-
-- File structure and format
-- Size and key sections
-- Notable content relevant to debugging
-
-## Output Format
-
-Return a summary under 500 words. Include specific line numbers or grep patterns so
-the main agent can follow up on anything interesting. Structure the summary with clear
-headings for easy scanning.
package/dist/templates/templates/hooks/check-file-size.sh
REMOVED
@@ -1,36 +0,0 @@
-#!/bin/bash
-#
-# PreToolUse hook for the Read tool. Warns Claude when a file is large
-# so it considers using Grep or reading specific line ranges instead.
-
-INPUT=$(cat)
-FILE_PATH=$(echo "$INPUT" | jq -r '.tool_input.file_path // .tool_input.path // empty')
-
-if [ -z "$FILE_PATH" ]; then
-  exit 0
-fi
-
-if [ ! -f "$FILE_PATH" ]; then
-  exit 0
-fi
-
-# macOS stat uses -f%z, Linux uses -c%s
-SIZE=$(stat -f%z "$FILE_PATH" 2>/dev/null || stat -c%s "$FILE_PATH" 2>/dev/null)
-
-if [ -z "$SIZE" ]; then
-  exit 0
-fi
-
-THRESHOLD=500000
-
-if [ "$SIZE" -gt "$THRESHOLD" ]; then
-  SIZE_KB=$((SIZE / 1024))
-  cat <<EOF
-{
-  "hookSpecificOutput": {
-    "hookEventName": "PreToolUse",
-    "additionalContext": "This file is ${SIZE_KB}KB. Consider using the summarizer subagent to get an overview, using Grep to search for specific content, or reading a specific line range. Check fileMetadata in context.json for line counts."
-  }
-}
-EOF
-fi
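The removed hook guards its `stat` call with a fallback because the size flag differs by platform: BSD/macOS `stat` takes `-f%z` while GNU/Linux `stat` takes `-c%s`. A minimal standalone sketch of the same portability pattern (the temp file here is illustrative):

```shell
# Portable file-size lookup: try the BSD form first,
# fall back to the GNU form if it fails.
FILE=$(mktemp)
printf 'hello' > "$FILE"   # 5 bytes, no trailing newline

SIZE=$(stat -f%z "$FILE" 2>/dev/null || stat -c%s "$FILE" 2>/dev/null)
echo "size=$SIZE"          # prints size=5 on both platforms

rm -f "$FILE"
```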
package/dist/templates/templates/hooks/load-context.sh
REMOVED
@@ -1,20 +0,0 @@
-#!/bin/bash
-#
-# SessionStart hook: loads context.json into Claude's context automatically.
-
-CONTEXT_FILE="debug-data/context.json"
-
-if [ ! -f "$CONTEXT_FILE" ]; then
-  exit 0
-fi
-
-CONTENT=$(cat "$CONTEXT_FILE")
-
-cat <<EOF
-{
-  "hookSpecificOutput": {
-    "hookEventName": "SessionStart",
-    "additionalContext": $(echo "$CONTENT" | jq -Rs .)
-  }
-}
-EOF
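The removed hook embeds the raw file contents into its JSON payload via `jq -Rs .`: `-R` reads the input as raw text rather than parsing it, `-s` slurps it into one string, and `.` emits it as a quoted, escaped JSON string. A small standalone demonstration (the sample input is illustrative):

```shell
# jq -Rs . JSON-encodes arbitrary text as a single JSON string;
# inner quotes and newlines come out escaped.
printf '{"testRunId":"abc"}\n' | jq -Rs .
# prints "{\"testRunId\":\"abc\"}\n"
```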
package/dist/templates/templates/rules/feedback.md
REMOVED
@@ -1,37 +0,0 @@
-# Debugging Feedback
-
-When the developer asks for feedback on the debugging session, or when you have completed
-your investigation, provide structured feedback on the experience.
-
-## Feedback Template
-
-### What Worked Well
-
-- Which data sources were most useful for the investigation?
-- Which files did you read most and find most informative?
-- Were the pre-computed log diffs helpful?
-
-### What Was Missing or Unhelpful
-
-- Was there any data you needed but did not have access to?
-- Were any files too large to work with effectively?
-- Were there entities or relationships you had to guess about?
-
-### Issues Encountered
-
-- Did you hit any dead ends during investigation?
-- Were there any files that were malformed, empty, or confusing?
-- Did you struggle with any part of the workspace layout?
-
-### Suggestions for Improvement
-
-- What additional data should be downloaded into the workspace?
-- What additional context should be provided in `CLAUDE.md` or `context.json`?
-- Would any pre-computed analyses (beyond log diffs) have saved time?
-- Are there any debugging patterns you found yourself repeating that could be automated?
-
-### Session Summary
-
-- What was the root cause (or most likely hypothesis)?
-- How confident are you in the diagnosis?
-- What steps would you recommend to the developer next?
package/dist/templates/templates/settings.json
REMOVED
@@ -1,82 +0,0 @@
-{
-  "permissions": {
-    "allow": [
-      "Read",
-      "Grep",
-      "Glob",
-      "Edit",
-      "Write",
-      "Task",
-      "Bash(find *)",
-      "Bash(grep *)",
-      "Bash(ls *)",
-      "Bash(git diff *)",
-      "Bash(git log *)",
-      "Bash(git show *)",
-      "Bash(git status)",
-      "Bash(git rev-parse *)",
-      "Bash(git blame *)",
-      "Bash(git grep *)",
-      "Bash(diff *)",
-      "Bash(wc *)",
-      "Bash(head *)",
-      "Bash(tail *)",
-      "Bash(sort *)",
-      "Bash(uniq *)",
-      "Bash(jq *)",
-      "Bash(cat *)"
-    ],
-    "ask": ["WebFetch", "WebSearch"],
-    "deny": [
-      "Edit(debug-data/**)",
-      "Write(debug-data/**)",
-      "mcp__*",
-      "ToolSearch",
-      "ListMcpResourcesTool",
-      "ReadMcpResourceTool"
-    ],
-    "defaultMode": "default"
-  },
-  "spinnerVerbs": {
-    "mode": "replace",
-    "verbs": [
-      "Analyzing",
-      "Investigating",
-      "Comparing",
-      "Searching",
-      "Processing",
-      "Squinting at screenshots",
-      "Traversing tangled timelines",
-      "Fighting fickle flakes",
-      "Diagnosing dubious diffs",
-      "Reconciling replays",
-      "Deciphering devious divergences",
-      "Debugging flakes",
-      "Patching network sessions",
-      "Determining nondeterminism"
-    ]
-  },
-  "hooks": {
-    "SessionStart": [
-      {
-        "hooks": [
-          {
-            "type": "command",
-            "command": ".claude/hooks/load-context.sh"
-          }
-        ]
-      }
-    ],
-    "PreToolUse": [
-      {
-        "matcher": "Read",
-        "hooks": [
-          {
-            "type": "command",
-            "command": ".claude/hooks/check-file-size.sh"
-          }
-        ]
-      }
-    ]
-  }
-}
@@ -1,57 +0,0 @@
----
-name: debugging-diffs
-description: Investigate unexpected visual differences between head and base replays. Use when screenshot diffs are flagged or visual regressions are reported.
----
-
-# Debugging Screenshot Diffs
-
-Use this guide when investigating unexpected visual differences between head and base replays.
-
-## Investigation Steps
-
-### 1. Understand the Diffs
-
-- Read the replay diff JSON in `debug-data/diffs/<id>.json`.
-- Check `screenshotDiffResults` for which screenshots differ.
-- Note the diff percentage and pixel count for each screenshot.
-
-### 2. Correlate with Code Changes
-
-- Check `commitSha` in `context.json` to identify the code changes.
-- If `project-repo/` is available, use `git log` and `git diff` to see what changed.
-- Focus on CSS changes, component rendering logic, and layout modifications.
-
-### 3. Compare Logs at Screenshot Time
-
-- Find the screenshot timestamps in `timeline.json`.
-- Compare what events occurred before each screenshot in head vs base.
-- Look for missing or extra events that could cause visual differences.
-
-### 4. Check for Expected vs Unexpected Diffs
-
-- **Expected**: Code changes that intentionally modify the UI (new features, style updates).
-- **Unexpected**: Same code producing different visual output, or unrelated areas changing.
-- Check if the diff is in a dynamic content area (timestamps, counters, user-specific data).
-
-### 5. Examine Snapshotted Assets
-
-- If `debug-data/replays/{head,base}/<replayId>/snapshotted-assets/` exists, compare JS/CSS between head and base.
-- Look for changes in CSS that could cause layout shifts.
-- Check for new or modified JavaScript that affects rendering.
-
-### 6. Review Screenshot Assertions Config
-
-- Check `screenshotAssertionsOptions` in the diff JSON for threshold settings.
-- Some diffs may be within acceptable tolerance but still flagged.
-
-### 7. Known Meticulous Replay Behaviors (Safe to Approve)
-
-The following patterns are caused by Meticulous replay engine behavior, not by your code changes. When these are the root cause of a diff, the diff is safe to approve.
-
-**Animation flakes**: Continuous animations (Lottie, CSS `@keyframes`, `requestAnimationFrame` loops) can cause screenshots to land at slightly different animation states between base and head, even with no code changes. Meticulous's DOM stability detection triggers extra animation frames trying to reach a stable state, but continuous animations never fully stabilize. **How to confirm**: check `replayComparison` in `context.json` for differing `totalAnimationFrames`; check log diffs for different animation frame counts; search the codebase for `lottie`, `requestAnimationFrame`, CSS `animation`, `@keyframes`, `<canvas>`, `<video>`. If the only visual difference is an animation in a different frame, this diff is safe to approve.
-
-**Network timing differences**: When multiple network responses complete at the same virtual time, slight ordering differences can cause minor rendering variations (e.g. a list rendering in a different order before settling). **How to confirm**: log diffs show network requests completing in a different order at the same virtual time, but the final rendered state is the same or nearly identical. If no code change caused the reordering, this diff is safe to approve.
-
-**Extra DOM stability frames**: The head replay may show more timeline events before a screenshot than the base, caused by DOM stability detection running additional animation frames. If the extra events are all animation-frame-related and the visual diff is minor, this diff is safe to approve.
-
-When any of these patterns is the root cause, tell the user: "This diff is caused by a known Meticulous replay behavior, not by your code changes. It is safe to approve."
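The animation-flake confirmation in step 7 of the skill above can be sketched with the shell tools the workspace permits. The directory and CSS file here are invented stand-ins for the workspace's `project-repo/` checkout:

```shell
# Hypothetical mini-codebase; a real run would search project-repo/ instead.
mkdir -p /tmp/diff-demo/src
cat > /tmp/diff-demo/src/Spinner.css <<'EOF'
.spinner { animation: spin 1s linear infinite; }
@keyframes spin { to { transform: rotate(360deg); } }
EOF

# Look for continuous-animation sources that never stabilize under replay.
grep -rn -e '@keyframes' -e 'requestAnimationFrame' -e 'lottie' /tmp/diff-demo/src
```

A hit on a component that overlaps the diffed screenshot region supports the "safe to approve" verdict.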
@@ -1,52 +0,0 @@
----
-name: debugging-flakes
-description: Investigate flaky or non-deterministic replay behavior. Use when replays produce different results across runs of the same session against the same code.
----
-
-# Debugging Flaky Replays
-
-A flaky replay is one that produces different results (screenshots, logs, behavior) across
-multiple runs of the same session against the same code. Use this guide when investigating
-non-deterministic behavior.
-
-## Investigation Steps
-
-### 1. Compare Deterministic Logs
-
-- Diff `logs.deterministic.txt` between the head and base replays.
-- Look for the first point of divergence -- this is usually where the flake originates.
-- Pay attention to event execution order differences.
-
-### 2. Check Timeline for Timing Issues
-
-- Read `timeline.json` and look for events with significantly different virtual timestamps.
-- Look for race conditions: events that depend on network responses or animations completing.
-- Check for `setTimeout`/`setInterval` patterns that may resolve differently.
-
-### 3. Analyze Accuracy Data
-
-- Read `accuracyData.json` for the replay's self-assessment of accuracy.
-- Low accuracy scores often correlate with flakiness.
-- Check which specific checks failed.
-
-### 4. Look for Common Flake Patterns
-
-- **Animation timing**: CSS transitions or JS animations that haven't completed when the screenshot is taken.
-- **Network race conditions**: API responses arriving in a different order.
-- **Date/time dependencies**: Code that uses `Date.now()` or renders timestamps.
-- **Random values**: Code using `Math.random()` that wasn't properly seeded.
-- **Lazy loading**: Content loaded on intersection observer timing.
-- **Font loading**: Web fonts loading at different times causing layout shifts.
-
-When any of these patterns is identified as the root cause and no user code change is involved, the resulting diff is caused by non-determinism in the replay environment, not the user's code. Advise the user that these diffs are **safe to approve**.
-
-### 5. Check Replay Parameters
-
-- Compare `launchBrowserAndReplayParams.json` between runs.
-- Verify viewport size, user agent, and other environment settings match.
-- Check if network stubbing configuration is consistent.
-
-### 6. Examine Stack Traces
-
-- Read `stackTraces.json` for any errors thrown during replay.
-- Errors that occur in some runs but not others are strong flake indicators.
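Step 1 of the skill above (finding the first point of divergence) can be sketched as follows. The log lines are fabricated stand-ins for real `logs.deterministic.txt` files, whose actual format may differ:

```shell
# Fabricated head/base logs standing in for
# debug-data/replays/{head,base}/<replayId>/logs.deterministic.txt.
mkdir -p /tmp/flake-demo/base /tmp/flake-demo/head
printf 'click button#save\nrequest POST /api/save\nscreenshot end-state\n' \
  > /tmp/flake-demo/base/logs.deterministic.txt
printf 'click button#save\nrequest POST /api/save\nanimation-frame\nscreenshot end-state\n' \
  > /tmp/flake-demo/head/logs.deterministic.txt

# diff reports the earliest divergence first; head -n 3 keeps just that hunk.
diff /tmp/flake-demo/base/logs.deterministic.txt \
     /tmp/flake-demo/head/logs.deterministic.txt | head -n 3
```

Here the first divergence is an extra `animation-frame` event before the screenshot, which points at the animation-timing pattern from step 4.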
@@ -1,45 +0,0 @@
----
-name: debugging-network
-description: Investigate network-related replay failures and divergences. Use when replays fail due to network errors, stubbing issues, or request ordering problems.
----
-
-# Debugging Network Issues
-
-Use this guide when replays fail or diverge due to network request problems.
-
-## Investigation Steps
-
-### 1. Check Logs for Network Errors
-
-- Search `logs.concise.txt` for "network", "fetch", "xhr", "request", "response", "timeout".
-- Look for failed requests, unexpected status codes, or missing responses.
-- Check for CORS errors or SSL issues.
-
-### 2. Compare Network Activity in Timeline
-
-- In `timeline.json`, look for network-related events.
-- Compare the sequence and timing of network requests between head and base.
-- Look for requests in one replay that are missing in the other.
-
-### 3. Examine Session Data
-
-- Read session data in `debug-data/sessions/<id>/data.json`.
-- Check `recordedRequests` for the original HAR entries captured during recording.
-- Compare recorded requests with what was replayed.
-
-### 4. Look for Stubbing Issues
-
-- Network requests are stubbed during replay using recorded data.
-- Check if new API endpoints were added that don't have recorded responses.
-- Look for requests with dynamic parameters (timestamps, tokens) that may not match stubs.
-
-### 5. Check for Request Ordering Dependencies
-
-- Some applications depend on requests completing in a specific order.
-- Look for race conditions where parallel requests resolve differently.
-- Check for waterfall dependencies (request B depends on the response from request A).
-
-### 6. Verify API Compatibility
-
-- If the API was changed, recorded responses may no longer be valid.
-- Check for schema changes, new required fields, or renamed endpoints.
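The log search in step 1 of the skill above reduces to a single `grep` invocation. The log excerpt here is invented for illustration; real `logs.concise.txt` content will look different:

```shell
# Invented excerpt standing in for a replay's logs.concise.txt.
mkdir -p /tmp/net-demo
cat > /tmp/net-demo/logs.concise.txt <<'EOF'
info: page loaded
error: fetch failed for /api/profile (status 500)
info: render complete
EOF

# Surface network-related lines, case-insensitively.
grep -i -e 'network' -e 'fetch' -e 'xhr' -e 'timeout' /tmp/net-demo/logs.concise.txt
```

Any line it surfaces (here, the failed fetch with status 500) is a candidate starting point for steps 2 and 3.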
@@ -1,47 +0,0 @@
----
-name: debugging-sessions
-description: Investigate problems with recorded session data. Use when session recordings appear incomplete, corrupted, or contain unexpected data.
----
-
-# Debugging Session Data Issues
-
-Use this guide when investigating problems with the recorded session data itself.
-
-## Investigation Steps
-
-### 1. Examine Session Structure
-
-- Read `debug-data/sessions/<id>/data.json` (this can be very large, so use grep/search).
-- Key fields: `rrwebEvents`, `userEvents`, `recordedRequests`, `applicationStorage`, `webSockets`.
-
-### 2. Check User Events
-
-- `userEvents` contains the sequence of user interactions that will be replayed.
-- Verify events are in chronological order.
-- Check for truncated or incomplete event sequences.
-- Look for unusually rapid event sequences that may indicate automated behavior.
-
-### 3. Verify Network Recordings
-
-- `recordedRequests` contains HAR-format entries of network activity.
-- Check for missing responses (the request was recorded but the response wasn't).
-- Look for very large responses that might have been truncated.
-- Verify content types and encoding are preserved correctly.
-
-### 4. Check Application Storage
-
-- `applicationStorage` captures localStorage, sessionStorage, and cookies.
-- Verify that authentication state is properly captured.
-- Look for expired tokens or sessions that may cause different behavior during replay.
-
-### 5. Look for Session Quality Issues
-
-- Very short sessions (few events) may not provide meaningful coverage.
-- Sessions with `abandoned: true` were not completed normally.
-- Check `numberUserEvents` and `numberBytes` for unusually small or large values.
-
-### 6. Verify Recording Environment
-
-- Check session metadata for the recording environment (hostname, URL).
-- Ensure the session was recorded against a compatible version of the application.
-- Look for environment-specific behavior (staging vs production data).
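The chronological-order check in step 2 of the skill above can be sketched with `sort -c`. The timestamps here are hypothetical; a real check would first extract them from the `userEvents` array in `data.json` (for example with `jq`, which the settings permit):

```shell
# Hypothetical event timestamps pulled out of userEvents for illustration.
mkdir -p /tmp/session-demo
printf '100\n250\n240\n900\n' > /tmp/session-demo/timestamps.txt

# sort -n -c exits non-zero at the first out-of-order entry.
if ! sort -n -c /tmp/session-demo/timestamps.txt 2>/dev/null; then
  echo "userEvents are not in chronological order"
fi
```

Out-of-order timestamps (here, 240 after 250) suggest a truncated or corrupted recording.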
@@ -1,51 +0,0 @@
----
-name: debugging-timelines
-description: Investigate timeline divergence between head and base replays. Use when event sequences, ordering, or timing differ unexpectedly.
----
-
-# Debugging Timeline Divergence
-
-Use this guide when replay timelines differ unexpectedly between head and base runs.
-
-## Investigation Steps
-
-### 1. Load and Compare Timelines
-
-- Read `timeline.json` from both head and base replay directories.
-- The timeline is an array of events with timestamps, types, and data.
-- Look for the first event where the timelines diverge.
-
-### 2. Understand Event Types
-
-Key timeline event types:
-
-- **user-event**: User interactions (click, type, scroll, hover).
-- **network-request**: API calls and responses.
-- **screenshot**: Screenshot capture points.
-- **mutation**: DOM mutations observed during replay.
-- **navigation**: Page navigation events.
-- **error**: JavaScript errors.
-- **console**: Console log messages.
-
-### 3. Identify Divergence Patterns
-
-- **Missing events**: Events in base that don't appear in head (or vice versa).
-- **Reordered events**: Same events but in a different sequence.
-- **Timing shifts**: Events at significantly different virtual timestamps.
-- **Extra events**: New events not present in the baseline.
-
-### 4. Check Timeline Stats
-
-- Read `timeline-stats.json` for aggregated statistics.
-- Compare event counts, durations, and error counts between replays.
-
-### 5. Trace Back to Root Cause
-
-- Once you find the divergence point, look at what happened immediately before.
-- Check if a user event triggered different behavior.
-- Look for conditional logic in the application that might execute differently.
-
-### 6. Cross-Reference with Logs
-
-- Use timestamps from the timeline divergence to find corresponding log entries.
-- Check `logs.deterministic.txt` at the same virtual time for additional context.
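The count comparison in step 4 of the skill above can be approximated even without `timeline-stats.json`, by tallying event types. The type listings here are invented; a real run would extract them from each `timeline.json`:

```shell
# Invented per-event-type listings extracted from two timeline.json files.
mkdir -p /tmp/tl-demo
printf 'user-event\nnetwork-request\nmutation\nmutation\nscreenshot\n' > /tmp/tl-demo/base-types.txt
printf 'user-event\nnetwork-request\nmutation\nmutation\nmutation\nscreenshot\n' > /tmp/tl-demo/head-types.txt

# Tally each event type, then diff the tallies before reading full timelines.
sort /tmp/tl-demo/base-types.txt | uniq -c | sort > /tmp/tl-demo/base-counts.txt
sort /tmp/tl-demo/head-types.txt | uniq -c | sort > /tmp/tl-demo/head-counts.txt
diff /tmp/tl-demo/base-counts.txt /tmp/tl-demo/head-counts.txt || true
```

A count mismatch (here, an extra `mutation` in head) narrows the search to that event type before diffing the full timelines.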
@@ -1,20 +0,0 @@
----
-name: pr-analysis
-description: Analyze PR source code changes and correlate with screenshot diffs. Use when pr-diff.txt is present and you need to understand which code changes caused visual differences.
----
-
-# PR Analysis
-
-When `debug-data/pr-diff.txt` is present in the workspace, analyze the source code changes and correlate
-them with the screenshot diffs.
-
-1. Read `debug-data/pr-diff.txt` to understand what code changed
-2. Read the diff summaries in `debug-data/diffs/*.summary.json` to see which screenshots differ
-3. For each screenshot that differs, identify which code changes are most likely responsible
-
-Provide a structured analysis:
-
-- Which files were modified and what the key changes are
-- Which code changes are most likely to affect visual output (CSS, layout, component rendering)
-- For each differing screenshot, the most likely code change that caused it
-- Whether the visual changes appear intentional (matching the code intent) or unintentional
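Step 1 of the PR-analysis skill above can be started mechanically: list the files a diff touches and flag the style files most likely to affect visual output. The diff excerpt here is fabricated for illustration:

```shell
# Invented excerpt standing in for debug-data/pr-diff.txt.
mkdir -p /tmp/pr-demo
cat > /tmp/pr-demo/pr-diff.txt <<'EOF'
--- a/src/Button.css
+++ b/src/Button.css
-  padding: 8px;
+  padding: 12px;
EOF

# Files touched by the PR, per the unified-diff '+++' headers.
grep '^+++ ' /tmp/pr-demo/pr-diff.txt
# How many of them are stylesheets, i.e. likely visual-output changes.
grep '^+++ ' /tmp/pr-demo/pr-diff.txt | grep -c '\.css$'
```

The remaining correlation work (mapping each differing screenshot to a specific change) still needs a human or agent reading of the diff.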