@alwaysmeticulous/debug-workspace 2.261.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (49)
  1. package/LICENSE +15 -0
  2. package/dist/debug-constants.d.ts +3 -0
  3. package/dist/debug-constants.js +10 -0
  4. package/dist/debug-constants.js.map +1 -0
  5. package/dist/debug.types.d.ts +21 -0
  6. package/dist/debug.types.js +3 -0
  7. package/dist/debug.types.js.map +1 -0
  8. package/dist/download-debug-data.d.ts +10 -0
  9. package/dist/download-debug-data.js +141 -0
  10. package/dist/download-debug-data.js.map +1 -0
  11. package/dist/generate-debug-workspace.d.ts +31 -0
  12. package/dist/generate-debug-workspace.js +1302 -0
  13. package/dist/generate-debug-workspace.js.map +1 -0
  14. package/dist/index.d.ts +6 -0
  15. package/dist/index.js +16 -0
  16. package/dist/index.js.map +1 -0
  17. package/dist/pipeline.d.ts +21 -0
  18. package/dist/pipeline.js +63 -0
  19. package/dist/pipeline.js.map +1 -0
  20. package/dist/resolve-debug-context.d.ts +12 -0
  21. package/dist/resolve-debug-context.js +187 -0
  22. package/dist/resolve-debug-context.js.map +1 -0
  23. package/dist/templates/CLAUDE.md +206 -0
  24. package/dist/templates/agents/planner.md +65 -0
  25. package/dist/templates/agents/summarizer.md +75 -0
  26. package/dist/templates/hooks/check-file-size.sh +36 -0
  27. package/dist/templates/hooks/load-context.sh +20 -0
  28. package/dist/templates/rules/feedback.md +37 -0
  29. package/dist/templates/settings.json +82 -0
  30. package/dist/templates/skills/debugging-diffs/SKILL.md +57 -0
  31. package/dist/templates/skills/debugging-flakes/SKILL.md +52 -0
  32. package/dist/templates/skills/debugging-network/SKILL.md +45 -0
  33. package/dist/templates/skills/debugging-sessions/SKILL.md +47 -0
  34. package/dist/templates/skills/debugging-timelines/SKILL.md +51 -0
  35. package/dist/templates/skills/pr-analysis/SKILL.md +20 -0
  36. package/dist/templates/templates/CLAUDE.md +206 -0
  37. package/dist/templates/templates/agents/planner.md +65 -0
  38. package/dist/templates/templates/agents/summarizer.md +75 -0
  39. package/dist/templates/templates/hooks/check-file-size.sh +36 -0
  40. package/dist/templates/templates/hooks/load-context.sh +20 -0
  41. package/dist/templates/templates/rules/feedback.md +37 -0
  42. package/dist/templates/templates/settings.json +82 -0
  43. package/dist/templates/templates/skills/debugging-diffs/SKILL.md +57 -0
  44. package/dist/templates/templates/skills/debugging-flakes/SKILL.md +52 -0
  45. package/dist/templates/templates/skills/debugging-network/SKILL.md +45 -0
  46. package/dist/templates/templates/skills/debugging-sessions/SKILL.md +47 -0
  47. package/dist/templates/templates/skills/debugging-timelines/SKILL.md +51 -0
  48. package/dist/templates/templates/skills/pr-analysis/SKILL.md +20 -0
  49. package/package.json +49 -0
@@ -0,0 +1,45 @@
+ ---
+ name: debugging-network
+ description: Investigate network-related replay failures and divergences. Use when replays fail due to network errors, stubbing issues, or request ordering problems.
+ ---
+
+ # Debugging Network Issues
+
+ Use this guide when replays fail or diverge due to network request problems.
+
+ ## Investigation Steps
+
+ ### 1. Check Logs for Network Errors
+
+ - Search `logs.concise.txt` for "network", "fetch", "xhr", "request", "response", "timeout".
+ - Look for failed requests, unexpected status codes, or missing responses.
+ - Check for CORS errors or SSL issues.
+
+ ### 2. Compare Network Activity in Timeline
+
+ - In `timeline.json`, look for network-related events.
+ - Compare the sequence and timing of network requests between head and base.
+ - Look for requests in one replay that are missing in the other.
+
+ ### 3. Examine Session Data
+
+ - Read session data in `debug-data/sessions/<id>/data.json`.
+ - Check `recordedRequests` for the original HAR entries captured during recording.
+ - Compare recorded requests with what was replayed.
+
+ ### 4. Look for Stubbing Issues
+
+ - Network requests are stubbed during replay using recorded data.
+ - Check if new API endpoints were added that don't have recorded responses.
+ - Look for requests with dynamic parameters (timestamps, tokens) that may not match stubs.
+
+ ### 5. Check for Request Ordering Dependencies
+
+ - Some applications depend on requests completing in a specific order.
+ - Look for race conditions where parallel requests resolve differently.
+ - Check for waterfall dependencies (request B depends on response from request A).
+
+ ### 6. Verify API Compatibility
+
+ - If the API was changed, recorded responses may no longer be valid.
+ - Check for schema changes, new required fields, or renamed endpoints.
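The keyword search in step 1 can be done in a single case-insensitive `grep` pass. A minimal sketch, assuming a tiny made-up log standing in for `logs.concise.txt` (only the filename comes from this guide; the log lines are invented):

```shell
#!/bin/bash
# Fabricated stand-in for logs.concise.txt -- real logs are far larger.
LOG=$(mktemp)
cat > "$LOG" <<'SAMPLE'
[00:01.200] fetch GET /api/users -> 200
[00:02.400] fetch POST /api/orders -> 504 Gateway Timeout
[00:03.100] layout recalculated
SAMPLE

# One pass over the network-related keywords: -E enables alternation,
# -i makes the match case-insensitive, -c counts matching lines.
grep -icE 'network|fetch|xhr|request|response|timeout' "$LOG"   # -> 2

rm -f "$LOG"
```

On a real log, drop `-c` to see the matching lines themselves, or add `-n` to get line numbers for follow-up reads.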
@@ -0,0 +1,47 @@
+ ---
+ name: debugging-sessions
+ description: Investigate problems with recorded session data. Use when session recordings appear incomplete, corrupted, or contain unexpected data.
+ ---
+
+ # Debugging Session Data Issues
+
+ Use this guide when investigating problems with the recorded session data itself.
+
+ ## Investigation Steps
+
+ ### 1. Examine Session Structure
+
+ - Read `debug-data/sessions/<id>/data.json` (this can be very large; use grep/search).
+ - Key fields: `rrwebEvents`, `userEvents`, `recordedRequests`, `applicationStorage`, `webSockets`.
+
+ ### 2. Check User Events
+
+ - `userEvents` contains the sequence of user interactions that will be replayed.
+ - Verify events are in chronological order.
+ - Check for truncated or incomplete event sequences.
+ - Look for unusually rapid event sequences that may indicate automated behavior.
+
+ ### 3. Verify Network Recordings
+
+ - `recordedRequests` contains HAR-format entries of network activity.
+ - Check for missing responses (the request was recorded but the response wasn't).
+ - Look for very large responses that might have been truncated.
+ - Verify content types and encoding are preserved correctly.
+
+ ### 4. Check Application Storage
+
+ - `applicationStorage` captures localStorage, sessionStorage, and cookies.
+ - Verify that authentication state is properly captured.
+ - Look for expired tokens or sessions that may cause different behavior during replay.
+
+ ### 5. Look for Session Quality Issues
+
+ - Very short sessions (few events) may not provide meaningful coverage.
+ - Sessions with `abandoned: true` were not completed normally.
+ - Check `numberUserEvents` and `numberBytes` for unusually small or large values.
+
+ ### 6. Verify Recording Environment
+
+ - Check session metadata for the recording environment (hostname, URL).
+ - Ensure the session was recorded against a compatible version of the application.
+ - Look for environment-specific behavior (staging vs production data).
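Steps 1-2 can be spot-checked with `jq` without loading the whole file. A sketch over an invented miniature `data.json`: the top-level keys match the ones listed in this guide, but the per-event `timestamp` field is an assumed shape, so adapt the paths to what the real file contains.

```shell
#!/bin/bash
# Invented miniature data.json -- real files are far larger, so pipe
# them through jq rather than reading them whole.
DATA=$(mktemp)
cat > "$DATA" <<'SAMPLE'
{"userEvents": [{"timestamp": 100}, {"timestamp": 250}, {"timestamp": 260}],
 "recordedRequests": [{"request": {"url": "https://example.com/api"}}]}
SAMPLE

# Count events/requests and verify the events are already in
# chronological order (map timestamps, compare against their sort).
jq '{userEvents: (.userEvents | length),
    chronological: (.userEvents | map(.timestamp) | . == sort),
    recordedRequests: (.recordedRequests | length)}' "$DATA"

rm -f "$DATA"
```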
@@ -0,0 +1,51 @@
+ ---
+ name: debugging-timelines
+ description: Investigate timeline divergence between head and base replays. Use when event sequences, ordering, or timing differ unexpectedly.
+ ---
+
+ # Debugging Timeline Divergence
+
+ Use this guide when replay timelines differ unexpectedly between head and base runs.
+
+ ## Investigation Steps
+
+ ### 1. Load and Compare Timelines
+
+ - Read `timeline.json` from both head and base replay directories.
+ - The timeline is an array of events with timestamps, types, and data.
+ - Look for the first event where the timelines diverge.
+
+ ### 2. Understand Event Types
+
+ Key timeline event types:
+
+ - **user-event**: User interactions (click, type, scroll, hover).
+ - **network-request**: API calls and responses.
+ - **screenshot**: Screenshot capture points.
+ - **mutation**: DOM mutations observed during replay.
+ - **navigation**: Page navigation events.
+ - **error**: JavaScript errors.
+ - **console**: Console log messages.
+
+ ### 3. Identify Divergence Patterns
+
+ - **Missing events**: Events in base that don't appear in head (or vice versa).
+ - **Reordered events**: Same events but in different sequence.
+ - **Timing shifts**: Events at significantly different virtual timestamps.
+ - **Extra events**: New events not present in the baseline.
+
+ ### 4. Check Timeline Stats
+
+ - Read `timeline-stats.json` for aggregated statistics.
+ - Compare event counts, durations, and error counts between replays.
+
+ ### 5. Trace Back to Root Cause
+
+ - Once you find the divergence point, look at what happened immediately before.
+ - Check if a user event triggered different behavior.
+ - Look for conditional logic in the application that might execute differently.
+
+ ### 6. Cross-Reference with Logs
+
+ - Use timestamps from the timeline divergence to find corresponding log entries.
+ - Check `logs.deterministic.txt` at the same virtual time for additional context.
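A quick way to apply steps 2-3 is to tally event kinds on each side and compare the tallies. A sketch over two invented miniature timelines (the real files live under `debug-data/replays/{head,base}/<replayId>/timeline.json`; a per-event `kind` field is assumed here, matching the `"kind":` grep suggested elsewhere in this workspace):

```shell
#!/bin/bash
# Two invented miniature timelines standing in for the real files.
HEAD=$(mktemp); BASE=$(mktemp)
echo '[{"kind":"user-event"},{"kind":"network-request"},{"kind":"network-request"}]' > "$HEAD"
echo '[{"kind":"user-event"},{"kind":"network-request"}]' > "$BASE"

# Tally event kinds per side; the category whose count drifts between the
# two tallies is where to start looking for the divergence.
for SIDE in "$HEAD" "$BASE"; do
  jq -c 'group_by(.kind) | map({(.[0].kind): length}) | add' "$SIDE"
done
# -> {"network-request":2,"user-event":1}
# -> {"network-request":1,"user-event":1}

rm -f "$HEAD" "$BASE"
```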
@@ -0,0 +1,20 @@
+ ---
+ name: pr-analysis
+ description: Analyze PR source code changes and correlate with screenshot diffs. Use when pr-diff.txt is present and you need to understand which code changes caused visual differences.
+ ---
+
+ # PR Analysis
+
+ When `debug-data/pr-diff.txt` is present in the workspace, analyze the source code changes and correlate
+ them with the screenshot diffs.
+
+ 1. Read `debug-data/pr-diff.txt` to understand what code changed
+ 2. Read the diff summaries in `debug-data/diffs/*.summary.json` to see which screenshots differ
+ 3. For each screenshot that differs, identify which code changes are most likely responsible
+
+ Provide a structured analysis:
+
+ - Which files were modified and what the key changes are
+ - Which code changes are most likely to affect visual output (CSS, layout, component rendering)
+ - For each differing screenshot, the most likely code change that caused it
+ - Whether the visual changes appear intentional (matching the code intent) or unintentional
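Step 1 at a glance: the list of changed files can be pulled straight out of a unified diff. A sketch, with an invented fragment standing in for `debug-data/pr-diff.txt`:

```shell
#!/bin/bash
# Invented stand-in for debug-data/pr-diff.txt.
DIFF=$(mktemp)
cat > "$DIFF" <<'SAMPLE'
--- a/src/Button.css
+++ b/src/Button.css
@@ -1 +1 @@
-.btn { padding: 4px; }
+.btn { padding: 8px; }
--- a/src/App.tsx
+++ b/src/App.tsx
@@ -10 +10 @@
-  <Button small />
+  <Button large />
SAMPLE

# List the changed files: each "+++ b/<path>" header names one file.
grep '^+++ b/' "$DIFF" | sed 's|^+++ b/||'
# -> src/Button.css
# -> src/App.tsx

rm -f "$DIFF"
```

CSS and markup files in this list are the first candidates to correlate with the screenshot diffs.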
@@ -0,0 +1,206 @@
+ # Meticulous Debug Workspace
+
+ ## What This Is
+
+ You are in a debugging workspace for the Meticulous automated UI testing platform.
+ You are investigating a replay issue (flaky behavior, unexpected diffs, or replay failures).
+
+ `debug-data/context.json` has been automatically loaded into your context. It contains all IDs, paths,
+ metadata, and what data is available in this workspace. You do not need to read it again.
+
+ ## How Meticulous Works
+
+ Meticulous records user sessions by injecting a JavaScript recorder snippet into your
+ application. These sessions capture user activity and network requests. When you make
+ a commit, Meticulous triggers a test run that replays selected sessions against the new code,
+ taking screenshots at key moments. If there is a base test run to compare against, screenshot
+ diffs are computed and surfaced to the developer.
+
+ - A **replay** is a single session being replayed against a version of your app.
+ - A **test run** is a collection of replays triggered by a commit.
+ - A **replay diff** compares a head replay (new code) against a base replay (old code) and
+   contains the screenshot diff results.
+ - A **session** is the original user recording that gets replayed.
+
+ ## Workspace Layout
+
+ The workspace root contains the debug workspace files. All downloaded debug data lives under
+ the `debug-data/` subdirectory.
+
+ - **`.claude/`** -- Configuration for this debugging workspace (hooks, skills, agents).
+ - **`debug-data/context.json`** -- Loaded automatically into context by the SessionStart hook;
+   you do not need to read it again.
+ - **`debug-data/`** -- All downloaded replay data, session recordings, diffs, and
+   pre-computed analysis artifacts.
+ - **`project-repo/`** -- (Optional) Your codebase checked out at the relevant commit.
+   Only present if the command was run from within a git repository.
+
+ ## debug-data/ Contents
+
+ Data falls into three categories: per-replay files (always present), diff files (only when
+ comparing replays), and other data.
+
+ ### Per-Replay Files (always available)
+
+ Replay data is organized into `head/`, `base/`, and `other/` subdirectories under
+ `debug-data/replays/`. All files are searchable and can be found via glob/search.
+
+ Each replay directory (`debug-data/replays/{head,base,other}/<replayId>/`) contains:
+
+ - `logs.deterministic.txt` -- Deterministic logs with non-deterministic data stripped. Best for
+   diffing between replays. Can be very large (check `fileMetadata` in `context.json` for sizes).
+ - `logs.deterministic.filtered.txt` -- **Start here for single-replay investigation.**
+   Noise-stripped version of the deterministic logs: tunnel URLs, S3 tokens, PostHog payloads,
+   build hashes, and other non-deterministic patterns are replaced with placeholders. Prefer this
+   over the raw version unless you need unmodified output.
+ - `logs.concise.txt` -- Full logs with both virtual and real timestamps, and trace IDs.
+ - `timeline.json` -- Detailed timeline of all replay events (user interactions, network requests,
+   DOM mutations, etc.). Can be 1-2MB; prefer `debug-data/timeline-summaries/` for a compact overview.
+ - `timeline-stats.json` -- Aggregated statistics about timeline events.
+ - `metadata.json` -- Replay configuration, parameters, and environment info.
+ - `launchBrowserAndReplayParams.json` -- The exact parameters used to launch the replay.
+ - `stackTraces.json` -- JavaScript stack traces captured during replay (if any errors occurred).
+ - `accuracyData.json` -- Replay accuracy assessment against expected behavior.
+ - `snapshotted-assets/` -- Static assets (JS/CSS) that were captured and used during replay.
+   **Only present if `snapshotAssets` was enabled** -- check `launchBrowserAndReplayParams.json`
+   for the `snapshotAssets` field before assuming this directory exists.
+
+ Note: `screenshots/` are not copied into the workspace (they are large binary PNGs). Reference
+ screenshot paths via `screenshotMap` in `context.json` instead; the actual files are in the
+ replay cache at `~/.meticulous/replays/<replayId>/screenshots/`.
+
+ Per-replay generated summaries:
+
+ - `debug-data/timeline-summaries/<role>-<replayId>.txt` -- Compact summary of each replay's
+   timeline: total entries, virtual time range, screenshot timestamps, event kind breakdown.
+ - `debug-data/formatted-assets/<role>/<replayId>/` -- Pretty-printed JS/CSS from
+   `snapshotted-assets/`. Only present if snapshotted assets exist. Use these instead of the originals.
+ - `context.json` fields: `screenshotMap` (screenshot-to-timestamp mapping), `replayComparison`
+   (side-by-side event counts, virtual time, screenshot count), `fileMetadata` (byte sizes and
+   line counts for key files).
+
+ ### Diff Files (only when comparing replays)
+
+ These files are only generated when comparing replays -- i.e. when using `meticulous debug replay-diff`,
+ `meticulous debug test-run`, or `meticulous debug replays` with exactly 2 replay IDs.
+
+ - `debug-data/diffs/<id>.json` -- Full diff data including replay metadata, test run config,
+   and screenshot results. Can be very large (20K+ tokens). Only read this if you need the full context.
+ - `debug-data/diffs/<id>.summary.json` -- **Start here.** Compact summary with just the screenshot
+   diff results: which screenshots differ, mismatch pixel counts, mismatch percentages, and changed
+   section class names.
+ - `debug-data/log-diffs/<id>.diff` -- Raw unified diff of `logs.deterministic.txt` between head and base.
+ - `debug-data/log-diffs/<id>.filtered.diff` -- **Start here for diff investigation.** Noise-stripped
+   version with tunnel URLs, S3 tokens, PostHog payloads removed. Hunks that only differ in
+   noise are removed entirely.
+ - `debug-data/log-diffs/<id>.summary.txt` -- High-level summary: total changed lines, first divergence
+   point, and categorized change counts with direction (e.g. "animation frames: +85 in head /
+   -46 in base, net +39 in head").
+ - `debug-data/params-diffs/<id>.diff` -- JSON-aware diff of `launchBrowserAndReplayParams.json`
+   between head and base. Keys are sorted and pretty-printed so only meaningful value changes appear.
+ - `debug-data/assets-diffs/<id>.txt` -- Comparison of snapshotted asset file lists between head
+   and base (added/removed/changed by content hash). Not generated if assets are identical.
+ - `debug-data/screenshot-context/<id>.txt` -- Only generated with `--screenshot`. Shows ±30 lines
+   of `logs.deterministic.txt` surrounding the screenshot for both head and base, with the
+   screenshot line marked `>>>`.
+
+ ### Other Data
+
+ - `debug-data/session-summaries/<sessionId>.txt` -- **Start here for session investigation.** Compact
+   summary of each session: URL history, user event breakdown, network request stats (methods,
+   status codes, domains, failures), storage counts, WebSocket connections, custom data, session
+   context, and framework info.
+ - `debug-data/sessions/<sessionId>/data.json` -- Full session recording data including user events, network
+   requests (HAR format), and application storage. Can be very large; prefer the session summary
+   or use search to find relevant portions.
+ - `debug-data/test-run/<testRunId>.json` -- Test run configuration, results, commit SHA, and status.
+ - `debug-data/pr-metadata.json` -- Pull request metadata (title, URL, hosting provider, author, status) from
+   the database. May not be present if no PR is associated with the test run.
+ - `debug-data/pr-diff.txt` -- Source code changes between the base and head commits. May not be present if
+   commit SHAs are unavailable.
+ - `debug-data/project-repo/` -- Your codebase checked out at the relevant commit. Only present if
+   the command was run from within a git repository.
+
+ ## Screenshot Mapping
+
+ `context.json` includes a `screenshotMap` that maps each screenshot to its virtual timestamp
+ and event number. Use this to correlate screenshot filenames (e.g. `screenshot-after-event-00673.png`)
+ with specific points in the replay timeline and logs.
+
+ ## Replay Comparison
+
+ `context.json` includes a `replayComparison` array with side-by-side stats for each replay:
+ total events, network requests, animation frames, virtual time, and screenshot count. Compare
+ head vs base entries to quickly spot drift (e.g. extra animation frames or different virtual time).
+
+ ## File Sizes
+
+ `context.json` includes a `fileMetadata` array with the byte size and line count of key files.
+ Check this before attempting to read large files -- use grep/search or read specific line ranges
+ for files over ~5000 lines instead of reading them in full.
+
+ ## Debugging Workflow
+
+ 1. **Start with `debug-data/context.json`** -- It is already in your context; consult it for all IDs,
+    statuses, file paths, `screenshotMap`, and `replayComparison`. If a `screenshot` field is present,
+    this is the specific screenshot the user wants to investigate. Use `screenshotMap` to find its
+    virtual timestamp and event number, then focus your analysis on events leading up to it.
+ 2. **Check replay comparison** -- Compare head vs base entries in `replayComparison` for
+    immediate drift signals (different event counts, animation frames, virtual time).
+ 3. **Read filtered logs** -- For diffs: start with `debug-data/log-diffs/*.summary.txt` then
+    `debug-data/log-diffs/*.filtered.diff`. For single replays: read `logs.deterministic.filtered.txt`
+    inside the replay directory. Fall back to the raw `logs.deterministic.txt` only if you
+    need unmodified output.
+ 4. **Read timeline summaries** -- Check `debug-data/timeline-summaries/` for a compact overview of each
+    replay's events, screenshot timestamps, and counts. Only read raw `timeline.json` if you
+    need granular event-level detail.
+ 5. **Inspect screenshot diffs** -- Start with `debug-data/diffs/<id>.summary.json` for a compact view of
+    which screenshots differ and by how much. If a `debug-data/screenshot-context/` file exists, read it
+    for the log lines surrounding the screenshot in both head and base.
+    Only read the full `debug-data/diffs/<id>.json` if you need complete replay metadata.
+ 6. **Check replay parameters** -- Read `debug-data/params-diffs/` for pre-computed diffs. For single
+    replays, read `launchBrowserAndReplayParams.json` directly.
+ 7. **Check assets diffs** -- Read `debug-data/assets-diffs/` to see if the snapshotted JS/CSS chunks
+    differ between head and base.
+ 8. **Analyze session data** -- Start with `debug-data/session-summaries/` for a quick overview of the
+    session (URL history, user events, network stats). Only read the raw `debug-data/sessions/` data
+    if you need specific details like request/response bodies or exact event selectors.
+ 9. **Review the PR diff** -- Read `debug-data/pr-diff.txt` to see what code changed in this PR and
+    correlate with screenshot diffs.
+ 10. **Trace through formatted assets** -- Use `debug-data/formatted-assets/` (pretty-printed JS/CSS)
+     instead of raw minified bundles when tracing code execution.
+ 11. **Review your code** -- If `project-repo/` is present, check it for the relevant changes.
+     For library source code, use `debug-data/formatted-assets/` which contains the bundled and
+     pretty-printed versions of third-party code.
+
+ ## Subagents
+
+ This workspace includes two specialized subagents in `.claude/agents/`:
+
+ ### Planner
+
+ After the user describes their issue, **always delegate to the planner subagent first**
+ before starting your own investigation. The planner reads workspace summaries and metadata
+ to produce a structured debugging plan with prioritized investigation steps. Follow its plan
+ as your starting point.
+
+ ### Summarizer
+
+ When you need to understand a large file (over 5000 lines), delegate to the summarizer
+ subagent instead of reading the file in full. The summarizer scans the file using grep and
+ targeted reads, returning a concise overview with line numbers for follow-up. This preserves
+ your context window for the actual investigation.
+
+ ## Rules
+
+ - This workspace is for analysis and investigation. Focus on understanding root causes.
+ - When referencing files, use paths relative to this workspace root.
+ - Prefer `logs.deterministic.filtered.txt` over `logs.deterministic.txt` for general
+   investigation. Use the raw version only when you need unmodified output.
+ - Prefer `logs.deterministic.txt` over `logs.concise.txt` when comparing between replays,
+   since it strips real-time timestamps.
+ - Session data files can be very large. Use grep/search to find relevant portions rather than
+   reading entire files.
+ - Screenshot images are binary PNG files stored in the replay cache (not in this workspace).
+   Reference them by path but analyze the diff metadata in JSON files instead.
+ - Check `fileMetadata` in `context.json` for file sizes before reading large files.
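The screenshot-to-timeline correlation described above is a one-line `jq` lookup. A sketch over an invented `context.json` fragment -- the exact shape of `screenshotMap` (the `virtualTime` and `eventNumber` field names) is an assumption here, so treat the path as a template to adapt:

```shell
#!/bin/bash
# Invented context.json fragment; adapt the jq path to the real shape.
CTX=$(mktemp)
cat > "$CTX" <<'SAMPLE'
{"screenshotMap": {"screenshot-after-event-00673.png":
  {"virtualTime": 41230, "eventNumber": 673}}}
SAMPLE

# Correlate a screenshot filename with its point in the replay timeline.
jq -r '.screenshotMap["screenshot-after-event-00673.png"]
       | "event \(.eventNumber) at virtual time \(.virtualTime)ms"' "$CTX"
# -> event 673 at virtual time 41230ms

rm -f "$CTX"
```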
@@ -0,0 +1,65 @@
+ ---
+ name: planner
+ description: Creates a structured debugging plan based on workspace data and user context. Use proactively at the start of every debugging session after the user describes their issue.
+ tools: Read, Grep, Glob
+ model: opus
+ ---
+
+ You are a debugging planning assistant for the Meticulous automated UI testing platform.
+
+ Your job is to quickly scan the workspace data and produce a structured debugging plan
+ that the main agent will follow. You run at the start of a session after the developer
+ describes the issue they want to investigate.
+
+ ## What to Read
+
+ Gather context from these sources (in order):
+
+ 1. `context.json` -- IDs, file paths, `screenshotMap`, `replayComparison`, `fileMetadata`.
+    If a `screenshot` field is present, the developer wants to investigate that specific
+    screenshot.
+ 2. `timeline-summaries/*.txt` -- compact overview of each replay's events, screenshot
+    timestamps, and counts.
+ 3. `log-diffs/*.summary.txt` -- high-level log diff summary with categorized change counts
+    (only present when comparing replays).
+ 4. `diffs/*.summary.json` -- which screenshots differ and by how much (only present when
+    comparing replays).
+ 5. `params-diffs/*.diff` -- parameter differences between head and base replays.
+ 6. `pr-diff.txt` -- source code changes (first ~200 lines if large).
+
+ ## What to Produce
+
+ Based on the workspace data and the developer's description, output:
+
+ ### Initial Assessment
+
+ - What type of issue is this? (flake, unexpected diff, replay failure, investigation)
+ - What data is available in the workspace?
+ - Key observations from summaries and comparisons (e.g. event count drift, virtual time
+   differences, screenshot mismatch percentages).
+
+ ### Investigation Steps (ordered by priority)
+
+ For each step:
+
+ - What to examine and why
+ - Specific file paths to read
+ - What patterns or anomalies to look for
+ - What would confirm or rule out each hypothesis
+
+ ### Key Files
+
+ List the most important files with their sizes (from `fileMetadata` in `context.json`).
+ Flag any files too large to read in full and suggest using the summarizer subagent or
+ grep for those.
+
+ ## Guidelines
+
+ - Be concise. The plan should be actionable, not exhaustive.
+ - Prioritize the most likely root causes first.
+ - If the developer mentioned a specific screenshot, correlate it with the `screenshotMap`
+   to find its virtual timestamp and event number, and focus the plan around events leading
+   up to that screenshot.
+ - If `replayComparison` shows drift (different event counts, animation frames, or virtual
+   time), call that out prominently.
+ - Suggest which debugging skills (in `.claude/skills/`) are most relevant to the issue.
@@ -0,0 +1,75 @@
+ ---
+ name: summarizer
+ description: Summarizes large files (logs, timelines, session data, diffs) that are too large to read in full. Use when a file exceeds 5000 lines or when you need a quick overview of a large file's contents.
+ tools: Read, Grep, Glob
+ model: haiku
+ ---
+
+ You are a file summarization specialist for debugging Meticulous replay issues.
+
+ When given a file to summarize, produce a concise overview that helps the main agent
+ decide what to investigate further. Do not read the entire file -- use Grep and targeted
+ reads to extract the key information efficiently.
+
+ ## Process
+
+ 1. Read the first ~50 lines to understand the file's structure and format.
+ 2. Use Grep to find key patterns: errors, warnings, screenshots, network failures,
+    timeouts, navigation events, and any terms the caller highlighted.
+ 3. Read targeted sections around important matches.
+ 4. Read the last ~30 lines for final state or summary information.
+ 5. Produce a structured summary.
+
+ ## File-Type Guidelines
+
+ ### Log files (`logs.deterministic.txt`, `logs.deterministic.filtered.txt`, `logs.concise.txt`)
+
+ Summarize:
+
+ - Approximate line count and virtual time range
+ - Key phases: navigation, network loading, user events, screenshots
+ - Errors, warnings, or unusual patterns (grep for `error`, `warning`, `fail`, `timeout`)
+ - Network request overview: grep for `request` and note counts, failures
+ - Screenshot timestamps and event numbers
+
+ ### Timeline files (`timeline.json`)
+
+ Summarize:
+
+ - Total entry count
+ - Event kind breakdown (grep for `"kind":` and tally)
+ - Any `potentialFlakinessWarning` entries
+ - Virtual time range (first and last entries)
+ - Notable gaps or clusters of events
+
+ ### Session data (`sessions/*/data.json`)
+
+ Summarize:
+
+ - Session structure (grep for top-level keys)
+ - User interaction count and types
+ - Network request patterns (count, domains)
+ - Any storage or cookie data of note
+
+ ### Diff files (`log-diffs/*.diff`, `log-diffs/*.filtered.diff`)
+
+ Summarize:
+
+ - Total hunks and changed line counts
+ - Categories of changes (network, animation, timers, navigation)
+ - Location of first divergence
+ - Whether changes are concentrated or spread throughout
+
+ ### Any other file
+
+ Summarize:
+
+ - File structure and format
+ - Size and key sections
+ - Notable content relevant to debugging
+
+ ## Output Format
+
+ Return a summary under 500 words. Include specific line numbers or grep patterns so
+ the main agent can follow up on anything interesting. Structure the summary with clear
+ headings for easy scanning.
@@ -0,0 +1,36 @@
+ #!/bin/bash
+ #
+ # PreToolUse hook for the Read tool. Warns Claude when a file is large
+ # so it considers using Grep or reading specific line ranges instead.
+
+ INPUT=$(cat)
+ FILE_PATH=$(echo "$INPUT" | jq -r '.tool_input.file_path // .tool_input.path // empty')
+
+ if [ -z "$FILE_PATH" ]; then
+   exit 0
+ fi
+
+ if [ ! -f "$FILE_PATH" ]; then
+   exit 0
+ fi
+
+ # macOS stat uses -f%z, Linux uses -c%s
+ SIZE=$(stat -f%z "$FILE_PATH" 2>/dev/null || stat -c%s "$FILE_PATH" 2>/dev/null)
+
+ if [ -z "$SIZE" ]; then
+   exit 0
+ fi
+
+ THRESHOLD=500000
+
+ if [ "$SIZE" -gt "$THRESHOLD" ]; then
+   SIZE_KB=$((SIZE / 1024))
+   cat <<EOF
+ {
+   "hookSpecificOutput": {
+     "hookEventName": "PreToolUse",
+     "additionalContext": "This file is ${SIZE_KB}KB. Consider using the summarizer subagent to get an overview, using Grep to search for specific content, or reading a specific line range. Check fileMetadata in context.json for line counts."
+   }
+ }
+ EOF
+ fi
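The hook's path extraction can be exercised in isolation; the `//` alternative operator falls back from `file_path` to `path`, and `empty` suppresses output entirely when neither key exists. The sample payloads below are invented:

```shell
#!/bin/bash
# Same jq filter the hook uses, wrapped for repeated calls.
extract() { jq -r '.tool_input.file_path // .tool_input.path // empty'; }

echo '{"tool_input": {"file_path": "/tmp/a.log"}}' | extract   # -> /tmp/a.log
echo '{"tool_input": {"path": "/tmp/b.log"}}'      | extract   # -> /tmp/b.log
echo '{"tool_input": {}}'                          | extract   # -> (no output)
```

Because `empty` produces nothing rather than the string `null`, the hook's `[ -z "$FILE_PATH" ]` guard works without a special case.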
@@ -0,0 +1,20 @@
+ #!/bin/bash
+ #
+ # SessionStart hook: loads context.json into Claude's context automatically.
+
+ CONTEXT_FILE="debug-data/context.json"
+
+ if [ ! -f "$CONTEXT_FILE" ]; then
+   exit 0
+ fi
+
+ CONTENT=$(cat "$CONTEXT_FILE")
+
+ cat <<EOF
+ {
+   "hookSpecificOutput": {
+     "hookEventName": "SessionStart",
+     "additionalContext": $(echo "$CONTENT" | jq -Rs .)
+   }
+ }
+ EOF
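The `jq -Rs .` at the heart of this hook is worth seeing on its own: `-R` reads raw (non-JSON) input, `-s` slurps all of stdin into a single string, and `.` prints it back out JSON-encoded. That lets arbitrary file content, including JSON with quotes and newlines, be embedded as the `additionalContext` string without hand-rolled escaping:

```shell
#!/bin/bash
# Raw text in, one JSON string literal out -- quotes and newlines
# are escaped for us.
printf '{"replay": "abc"}\nsecond "line"\n' | jq -Rs .
# -> "{\"replay\": \"abc\"}\nsecond \"line\"\n"
```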
@@ -0,0 +1,37 @@
+ # Debugging Feedback
+
+ When the developer asks for feedback on the debugging session, or when you have completed
+ your investigation, provide structured feedback on the experience.
+
+ ## Feedback Template
+
+ ### What Worked Well
+
+ - Which data sources were most useful for the investigation?
+ - Which files did you read most and find most informative?
+ - Were the pre-computed log diffs helpful?
+
+ ### What Was Missing or Unhelpful
+
+ - Was there any data you needed but did not have access to?
+ - Were any files too large to work with effectively?
+ - Were there entities or relationships you had to guess about?
+
+ ### Issues Encountered
+
+ - Did you hit any dead ends during investigation?
+ - Were there any files that were malformed, empty, or confusing?
+ - Did you struggle with any part of the workspace layout?
+
+ ### Suggestions for Improvement
+
+ - What additional data should be downloaded into the workspace?
+ - What additional context should be provided in `CLAUDE.md` or `context.json`?
+ - Would any pre-computed analyses (beyond log diffs) have saved time?
+ - Are there any debugging patterns you found yourself repeating that could be automated?
+
+ ### Session Summary
+
+ - What was the root cause (or most likely hypothesis)?
+ - How confident are you in the diagnosis?
+ - What steps would you recommend to the developer next?