@vizzly-testing/cli 0.11.1 → 0.11.2

@@ -30,28 +30,99 @@ cd vizzly-cli
30
30
  /plugin install vizzly@vizzly-marketplace
31
31
  ```
32
32
 
33
+ ## Migration Guide (v0.0.x → v0.1.0)
34
+
35
+ **⚠️ Breaking Changes:** Slash commands for status checking and debugging have been replaced with Agent Skills.
36
+
37
+ ### What Changed
38
+
39
+ **Before (v0.0.x):**
40
+ ```bash
41
+ # Manually invoke slash commands
42
+ /vizzly:tdd-status
43
+ /vizzly:debug-diff homepage
44
+ ```
45
+
46
+ **After (v0.1.0):**
47
+ ```text
48
+ # Just ask naturally - Skills activate automatically
49
+ "How are my visual tests?"
50
+ "The homepage screenshot is failing"
51
+ ```
52
+
53
+ ### Command Migration
54
+
55
+ | Old Slash Command | New Approach | How It Works |
56
+ |-------------------|--------------|--------------|
57
+ | `/vizzly:tdd-status` | Ask: "How are my tests?" | `check-visual-tests` Skill activates automatically |
58
+ | `/vizzly:debug-diff <name>` | Say: "Debug the homepage screenshot" | `debug-visual-regression` Skill activates automatically |
59
+ | `/vizzly:setup` | Still `/vizzly:setup` | ✅ No change - explicit setup workflow |
60
+ | `/vizzly:suggest-screenshots` | Still `/vizzly:suggest-screenshots` | ✅ No change - explicit suggestions workflow |
61
+
62
+ ### Why This Change?
63
+
64
+ **Better UX:**
65
+ - No need to memorize command syntax
66
+ - Just describe what you need in natural language
67
+ - Claude understands your intent and activates the right Skill
68
+ - More intuitive and conversational workflow
69
+
70
+ **What Are Agent Skills?**
71
+
72
+ Agent Skills are model-invoked capabilities that Claude uses autonomously based on your request. Instead of explicitly typing `/command`, you simply ask questions or describe problems, and Claude will automatically use the appropriate Skill.
73
+
74
+ **Still Prefer Explicit Commands?**
75
+
76
+ You can still be explicit in your requests:
77
+ - "Check my Vizzly test status" → Activates `check-visual-tests` Skill
78
+ - "Debug the login screenshot failure" → Activates `debug-visual-regression` Skill
79
+
80
+ The Skills will activate based on your intent, not rigid command syntax.
81
+
33
82
  ## Features
34
83
 
35
- ### 🔍 **TDD Status Checking**
36
- - `/vizzly:tdd-status` - Check current TDD status and comparison results
37
- - See failed/new/passed screenshot counts
38
- - Direct links to diff images and dashboard
84
+ ### **Agent Skills** (Model-Invoked)
39
85
 
40
- ### 🐛 **Smart Diff Analysis**
41
- - `/vizzly:debug-diff <screenshot-name>` - Analyze visual regression failures
42
- - AI-powered analysis with contextual suggestions
43
- - Guidance on whether to accept or fix changes
86
+ Claude automatically uses these Skills when you mention visual testing:
44
87
 
45
- ### 💡 **Screenshot Suggestions**
46
- - `/vizzly:suggest-screenshots` - Analyze test files for screenshot opportunities
47
- - Framework-specific code examples
48
- - Respect your test structure and patterns
88
+ **🐛 Debug Visual Regression**
89
+ - Activated when you mention failing visual tests or screenshot differences
90
+ - Automatically analyzes visual changes and identifies root causes
91
+ - Compares baseline vs current screenshots
92
+ - Suggests whether to accept or fix changes
93
+ - Works with both local TDD and cloud modes
94
+
95
+ **🔍 Check Visual Test Status**
96
+ - Activated when you ask about test status or results
97
+ - Provides quick summary of passed/failed/new screenshots
98
+ - Shows diff percentages and threshold information
99
+ - Links to dashboard for detailed review
100
+
101
+ **Example usage:**
102
+ - Just say: *"The homepage screenshot is failing"* → Claude debugs it
103
+ - Just ask: *"How are my visual tests?"* → Claude checks status
104
+ - No slash commands needed—Claude activates Skills automatically!
105
+
106
+ ### 📋 **Slash Commands** (User-Invoked)
49
107
 
50
- ### **Quick Setup**
108
+ Explicit workflows you trigger manually:
109
+
110
+ **⚡ Quick Setup**
51
111
  - `/vizzly:setup` - Initialize Vizzly configuration
52
112
  - Environment variable guidance
53
113
  - CI/CD integration help
54
114
 
115
+ **💡 Screenshot Suggestions**
116
+ - `/vizzly:suggest-screenshots` - Analyze test files for screenshot opportunities
117
+ - Framework-specific code examples
118
+ - Respects your test structure and patterns
119
+
120
+ ### Skills vs Slash Commands
121
+
122
+ **Skills** are capabilities Claude uses autonomously based on your request. Just describe what you need naturally, and Claude will use the appropriate Skill.
123
+
124
+ **Slash Commands** are explicit workflows you invoke manually when you want step-by-step guidance through a process.
125
+
55
126
  ## MCP Server Tools
56
127
 
57
128
  The plugin provides an MCP server with direct access to Vizzly data:
@@ -103,11 +174,92 @@ The plugin will automatically use the appropriate token based on your context.
103
174
  - TDD mode running for local features
104
175
  - Authentication configured (see above) for cloud features
105
176
 
177
+ ## How It Works
178
+
179
+ ### Agent Skills
180
+
181
+ The plugin's Skills use Claude Code's `allowed-tools` feature to restrict what actions they can perform:
182
+
183
+ **Check Visual Test Status Skill:**
184
+ - Can use: `get_tdd_status`, `detect_context`
185
+ - Purpose: Quickly check test status without modifying anything
186
+
187
+ **Debug Visual Regression Skill:**
188
+ - Can use: `Read`, `WebFetch`, `read_comparison_details`, `search_comparisons`, `accept_baseline`, `approve_comparison`, `reject_comparison`
189
+ - Purpose: Analyze failures and suggest/apply fixes
190
+
191
+ ### MCP Server Integration
192
+
193
+ The plugin bundles an MCP server that provides 15 tools for interacting with Vizzly:
194
+
195
+ - **Automatic startup** - MCP server starts when plugin is enabled
196
+ - **Token resolution** - Automatically finds your authentication token
197
+ - **Dual mode** - Works with both local TDD and cloud builds
198
+ - **No configuration needed** - Just install and use
199
+
200
+ ## Troubleshooting
201
+
202
+ ### Skills not activating
203
+
204
+ If Claude isn't using the Skills automatically:
205
+
206
+ 1. Verify plugin is enabled: `/plugin`
207
+ 2. Check MCP server status: `/mcp` (should show `plugin:vizzly:vizzly`)
208
+ 3. Try being more explicit: "Check my Vizzly test status"
209
+
210
+ ### MCP server not connecting
211
+
212
+ If the MCP server shows as "failed" in `/mcp`:
213
+
214
+ 1. Check Node.js version: `node --version` (requires 20+)
215
+ 2. View logs: `claude --debug`
216
+ 3. Reinstall plugin: `/plugin uninstall vizzly@vizzly-marketplace` then `/plugin install vizzly@vizzly-marketplace`
217
+
218
+ ### TDD server not found
219
+
220
+ If Skills report "TDD server not running":
221
+
222
+ 1. Start TDD mode: `vizzly tdd start`
223
+ 2. Verify server is running: Check for `.vizzly/server.json`
224
+ 3. Run tests to generate screenshots
225
+
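The two concrete checks in the troubleshooting steps above (Node.js 20+ and the presence of `.vizzly/server.json`) could be scripted. This is a minimal sketch, not part of the plugin: `checkEnvironment` is a hypothetical helper, and it takes both inputs as parameters so it stays easy to test.

```javascript
// Hypothetical helper (not part of the Vizzly plugin): returns a list of
// problems based on the two checks described in the troubleshooting steps.
function checkEnvironment(nodeVersion, serverJsonExists) {
  const problems = [];

  // Accept both "v20.11.0" and "20.11.0" style version strings.
  const major = Number.parseInt(nodeVersion.replace(/^v/, '').split('.')[0], 10);
  if (Number.isNaN(major) || major < 20) {
    problems.push(`Node.js 20+ is required, found ${nodeVersion}`);
  }

  if (!serverJsonExists) {
    problems.push('.vizzly/server.json not found; try `vizzly tdd start`');
  }

  return problems;
}
```

In a real script the inputs would likely come from `process.version` and `fs.existsSync('.vizzly/server.json')`, run from the project root (an assumption about where the file lives).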
226
+ ## Example Workflows
227
+
228
+ ### Local TDD Development
229
+
230
+ ```bash
231
+ # Start TDD server
232
+ vizzly tdd start
233
+
234
+ # Run your tests
235
+ npm test
236
+
237
+ # Ask Claude to check status
238
+ # "How are my visual tests?"
239
+
240
+ # If failures, ask Claude to debug
241
+ # "The login page screenshot is failing"
242
+ ```
243
+
244
+ ### Cloud Build Review
245
+
246
+ ```bash
247
+ # After CI/CD runs and creates a build
248
+ # "Show me recent Vizzly builds"
249
+
250
+ # Review specific comparison
251
+ # "Analyze comparison cmp_abc123"
252
+
253
+ # Approve or reject
254
+ # Claude will suggest using approve/reject tools
255
+ ```
256
+
106
257
  ## Documentation
107
258
 
108
259
  - [Vizzly CLI](https://github.com/vizzly-testing/vizzly-cli) - Official CLI documentation
109
260
  - [Vizzly Platform](https://vizzly.dev) - Web dashboard and cloud features
110
261
  - [Claude Code](https://claude.com/claude-code) - Claude Code documentation
262
+ - [Agent Skills](https://docs.claude.com/en/docs/claude-code/skills) - Learn about Claude Code Skills
111
263
 
112
264
  ## License
113
265
 
@@ -0,0 +1,58 @@
1
+ # Changelog
2
+
3
+ All notable changes to the Vizzly Claude Code plugin will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [0.1.0] - 2025-10-18
9
+
10
+ ### Added
11
+ - ✨ **Agent Skills** - Model-invoked capabilities that activate autonomously
12
+ - `check-visual-tests` Skill - Automatically checks test status when you ask about tests
13
+ - `debug-visual-regression` Skill - Automatically analyzes failures when you mention visual issues
14
+ - 📦 MCP server configuration moved to plugin root (`.mcp.json`)
15
+ - 📝 Comprehensive README with Skills documentation, troubleshooting, and workflows
16
+ - 🔒 Tool restrictions via `allowed-tools` for better security and focused capabilities
17
+
18
+ ### Changed
19
+ - 🔧 MCP server name: `vizzly-server` → `vizzly` (cleaner naming)
20
+ - 🔧 Skills use correct tool prefix: `mcp__plugin_vizzly_vizzly__*`
21
+
22
+ ### Removed
23
+ - ❌ **BREAKING:** `/vizzly:tdd-status` slash command → Replaced by `check-visual-tests` Skill
24
+ - ❌ **BREAKING:** `/vizzly:debug-diff` slash command → Replaced by `debug-visual-regression` Skill
25
+
26
+ ### Migration Guide
27
+
28
+ **Before (v0.0.x):**
29
+ ```bash
30
+ # Manually invoke slash commands
31
+ /vizzly:tdd-status
32
+ /vizzly:debug-diff homepage
33
+ ```
34
+
35
+ **After (v0.1.0):**
36
+ ```text
37
+ # Just ask naturally - Skills activate automatically
38
+ "How are my visual tests?"
39
+ "The homepage screenshot is failing"
40
+ ```
41
+
42
+ **What Changed:**
43
+ - **Status checks** are now autonomous - just ask about your tests
44
+ - **Debugging** happens automatically when you mention failures
45
+ - **No need to remember slash commands** - Claude understands your intent
46
+ - **Setup and suggestions** still use slash commands (`/vizzly:setup`, `/vizzly:suggest-screenshots`)
47
+
48
+ **Why This Change:**
49
+ Agent Skills provide a more natural, intuitive experience. Instead of memorizing command syntax, you can ask questions in plain language and Claude will automatically use the right tools.
50
+
51
+ **If You Prefer Explicit Commands:**
52
+ While the slash commands are removed, you can still be explicit in your requests:
53
+ - "Check my Vizzly test status" → Activates `check-visual-tests` Skill
54
+ - "Debug the homepage screenshot" → Activates `debug-visual-regression` Skill
55
+
56
+ ### Fixed
57
+ - MCP server location now follows Claude Code plugin specifications
58
+ - Tool naming consistency across Skills and MCP server
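The changelog mentions the tool prefix `mcp__plugin_vizzly_vizzly__*`, and the Skills list fully qualified names in `allowed-tools`. The sketch below shows how such names appear to be composed. The pattern `mcp__plugin_<plugin>_<server>__<tool>` is inferred only from the names in this diff, and `qualifiedToolName` is a hypothetical helper.

```javascript
// Hypothetical helper: builds a fully qualified MCP tool name using the
// pattern inferred from this diff (plugin "vizzly", server "vizzly").
function qualifiedToolName(plugin, server, tool) {
  return `mcp__plugin_${plugin}_${server}__${tool}`;
}

// Builds the comma-separated value used in a Skill's `allowed-tools` line.
function allowedToolsLine(plugin, server, tools) {
  return tools.map((tool) => qualifiedToolName(plugin, server, tool)).join(', ');
}
```

For example, `allowedToolsLine('vizzly', 'vizzly', ['get_tdd_status', 'detect_context'])` yields the same value as the `allowed-tools` entry of the Check Visual Test Status Skill.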
@@ -145,6 +145,35 @@ export class CloudAPIProvider {
145
145
  return data.comparison;
146
146
  }
147
147
 
148
+ /**
149
+ * Search for comparisons by name across builds
150
+ */
151
+ async searchComparisons(name, apiToken, options = {}) {
152
+ if (!name || typeof name !== 'string') {
153
+ throw new Error('name is required and must be a non-empty string');
154
+ }
155
+
156
+ let { branch, limit = 50, offset = 0, apiUrl } = options;
157
+
158
+ let queryParams = new URLSearchParams({
159
+ name,
160
+ limit: limit.toString(),
161
+ offset: offset.toString()
162
+ });
163
+
164
+ if (branch) {
165
+ queryParams.append('branch', branch);
166
+ }
167
+
168
+ let data = await this.makeRequest(
169
+ `/api/sdk/comparisons/search?${queryParams}`,
170
+ apiToken,
171
+ apiUrl
172
+ );
173
+
174
+ return data;
175
+ }
176
+
148
177
  // ==================================================================
149
178
  // BUILD COMMENTS
150
179
  // ==================================================================
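The query-string handling in `searchComparisons` above can be looked at in isolation. The sketch below pulls it into a standalone function; `buildSearchPath` is a hypothetical name, and the token and request handling are deliberately left out.

```javascript
// Hypothetical standalone version of the path-building logic shown above.
function buildSearchPath(name, options = {}) {
  if (!name || typeof name !== 'string') {
    throw new Error('name is required and must be a non-empty string');
  }

  // Destructuring defaults apply when a value is undefined, so a caller can
  // pass `limit: undefined` and still get the default of 50.
  const { branch, limit = 50, offset = 0 } = options;

  const queryParams = new URLSearchParams({
    name,
    limit: String(limit),
    offset: String(offset)
  });

  // Only add branch when the caller provided one.
  if (branch) {
    queryParams.append('branch', branch);
  }

  // Interpolating URLSearchParams calls its toString(), which URL-encodes values.
  return `/api/sdk/comparisons/search?${queryParams}`;
}
```

Because the defaults kick in for `undefined`, the MCP handler can forward `args.limit` and `args.offset` directly without checking whether the caller supplied them.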
@@ -21,7 +21,7 @@ class VizzlyMCPServer {
21
21
  constructor() {
22
22
  this.server = new Server(
23
23
  {
24
- name: 'vizzly-server',
24
+ name: 'vizzly',
25
25
  version: '0.1.0'
26
26
  },
27
27
  {
@@ -247,6 +247,41 @@ class VizzlyMCPServer {
247
247
  required: ['comparisonId']
248
248
  }
249
249
  },
250
+ {
251
+ name: 'search_comparisons',
252
+ description:
253
+ 'Search for comparisons by screenshot name across recent builds in the cloud. Returns matching comparisons with their build context and screenshot URLs. Use this to find all instances of a specific screenshot across different builds for debugging.',
254
+ inputSchema: {
255
+ type: 'object',
256
+ properties: {
257
+ name: {
258
+ type: 'string',
259
+ description: 'Screenshot/comparison name to search for (supports partial matching)'
260
+ },
261
+ branch: {
262
+ type: 'string',
263
+ description: 'Optional branch name to filter results'
264
+ },
265
+ limit: {
266
+ type: 'number',
267
+ description: 'Maximum number of results to return (default: 50)'
268
+ },
269
+ offset: {
270
+ type: 'number',
271
+ description: 'Offset for pagination (default: 0)'
272
+ },
273
+ apiToken: {
274
+ type: 'string',
275
+ description: 'Vizzly API token (optional, auto-resolves from user login or env)'
276
+ },
277
+ apiUrl: {
278
+ type: 'string',
279
+ description: 'API base URL (optional)'
280
+ }
281
+ },
282
+ required: ['name']
283
+ }
284
+ },
250
285
  {
251
286
  name: 'create_build_comment',
252
287
  description: 'Create a comment on a build for collaboration',
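The `search_comparisons` schema above declares `name` as the only required property. As an illustration of what that schema expresses, here is a minimal argument check. `validateArgs` is a hypothetical helper and not how the plugin or the MCP SDK validates input; the schema is a trimmed copy with descriptions omitted.

```javascript
// Trimmed copy of the search_comparisons input schema (descriptions omitted).
const searchComparisonsSchema = {
  type: 'object',
  properties: {
    name: { type: 'string' },
    branch: { type: 'string' },
    limit: { type: 'number' },
    offset: { type: 'number' },
    apiToken: { type: 'string' },
    apiUrl: { type: 'string' }
  },
  required: ['name']
};

// Hypothetical validator: returns a list of problems, empty when args look valid.
function validateArgs(schema, args) {
  const problems = [];

  for (const key of schema.required || []) {
    if (args[key] === undefined) {
      problems.push(`missing required argument: ${key}`);
    }
  }

  for (const [key, value] of Object.entries(args)) {
    const spec = schema.properties[key];
    if (!spec) {
      problems.push(`unknown argument: ${key}`);
    } else if (value !== undefined && typeof value !== spec.type) {
      problems.push(`${key} should be a ${spec.type}`);
    }
  }

  return problems;
}
```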
@@ -668,6 +703,24 @@ class VizzlyMCPServer {
668
703
  };
669
704
  }
670
705
 
706
+ case 'search_comparisons': {
707
+ let apiToken = await this.resolveApiToken(args);
708
+ let results = await this.cloudProvider.searchComparisons(args.name, apiToken, {
709
+ branch: args.branch,
710
+ limit: args.limit,
711
+ offset: args.offset,
712
+ apiUrl: args.apiUrl
713
+ });
714
+ return {
715
+ content: [
716
+ {
717
+ type: 'text',
718
+ text: JSON.stringify(results, null, 2)
719
+ }
720
+ ]
721
+ };
722
+ }
723
+
671
724
  case 'create_build_comment': {
672
725
  let apiToken = await this.resolveApiToken(args);
673
726
  let result = await this.cloudProvider.createBuildComment(
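The `search_comparisons` case above returns its results as a single pretty-printed JSON text item. A small sketch of that wrapping step follows; `toTextContent` is a hypothetical helper name, and the sample payload in the usage note is made up for illustration.

```javascript
// Hypothetical helper: wraps any JSON-serializable value in the
// { content: [{ type: 'text', text }] } shape used by the handler above.
function toTextContent(results) {
  return {
    content: [
      {
        type: 'text',
        // Two-space indentation keeps the payload readable in tool output.
        text: JSON.stringify(results, null, 2)
      }
    ]
  };
}
```

A consumer can recover the original object with `JSON.parse(response.content[0].text)`, for example after calling `toTextContent({ comparisons: [] })`.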
@@ -0,0 +1,158 @@
1
+ ---
2
+ name: Check Visual Test Status
3
+ description: Check the status of Vizzly visual regression tests. Use when the user asks about test status, test results, what's failing, how tests are doing, or wants a summary of visual tests. Works with both local TDD and cloud modes.
4
+ allowed-tools: mcp__plugin_vizzly_vizzly__get_tdd_status, mcp__plugin_vizzly_vizzly__detect_context
5
+ ---
6
+
7
+ # Check Visual Test Status
8
+
9
+ Automatically check Vizzly visual test status when the user asks about their tests. Provides a quick summary of passed, failed, and new screenshots.
10
+
11
+ ## When to Use This Skill
12
+
13
+ Activate this Skill when the user:
14
+ - Asks "How are my tests doing?"
15
+ - Asks "Are there any failing tests?"
16
+ - Asks "What's the status of visual tests?"
17
+ - Asks "Show me test results"
18
+ - Asks "What's failing?"
19
+ - Wants a summary of visual regression tests
20
+
21
+ ## How This Skill Works
22
+
23
+ 1. **Detect context** (local TDD or cloud mode)
24
+ 2. **Fetch TDD status** from the local server
25
+ 3. **Analyze results** to identify failures, new screenshots, and passes
26
+ 4. **Provide summary** with actionable information
27
+ 5. **Link to dashboard** for detailed review
28
+
29
+ ## Instructions
30
+
31
+ ### Step 1: Get TDD Status
32
+
33
+ Use the `get_tdd_status` tool from the Vizzly MCP server to fetch current comparison results.
34
+
35
+ This returns:
36
+ - Total screenshot count
37
+ - Passed, failed, and new screenshot counts
38
+ - List of all comparisons with details
39
+ - Dashboard URL (if TDD server is running)
40
+
41
+ ### Step 2: Analyze the Results
42
+
43
+ Examine the comparison data:
44
+ - Count total, passed, failed, and new screenshots
45
+ - Identify which specific screenshots failed
46
+ - Note diff percentages for failures
47
+ - Check if new screenshots need baselines
48
+
49
+ ### Step 3: Provide Clear Summary
50
+
51
+ Format the output to be scannable and actionable:
52
+
53
+ ```
54
+ Vizzly TDD Status:
55
+ ✅ Total: [count] screenshots
56
+ ✅ Passed: [count]
57
+ ❌ Failed: [count] (exceeded threshold)
58
+ 🆕 New: [count] (no baseline)
59
+
60
+ Failed Comparisons:
61
+ - [name] ([diff]% diff) - Exceeds [threshold]% threshold
62
+ - [name] ([diff]% diff) - Exceeds [threshold]% threshold
63
+
64
+ New Screenshots:
65
+ - [name] (no baseline for comparison)
66
+
67
+ Dashboard: http://localhost:47392
68
+ ```
69
+
70
+ ### Step 4: Suggest Next Steps
71
+
72
+ Based on the results, provide guidance:
73
+
74
+ **If there are failures:**
75
+ - Suggest using the debug-visual-regression Skill for detailed analysis
76
+ - Provide dashboard link for visual review
77
+ - Mention accept/reject options
78
+
79
+ **If there are new screenshots:**
80
+ - Explain that new screenshots need baseline approval
81
+ - Show how to accept them from dashboard or CLI
82
+
83
+ **If all passed:**
84
+ - Confirm tests are passing
85
+ - No action needed
86
+
87
+ ## Example Output
88
+
89
+ ```
90
+ User: "How are my tests?"
91
+
92
+ Vizzly TDD Status:
93
+ ✅ Total: 15 screenshots
94
+ ✅ Passed: 12
95
+ ❌ Failed: 2 (exceeded threshold)
96
+ 🆕 New: 1 (no baseline)
97
+
98
+ Failed Comparisons:
99
+ - homepage (2.3% diff) - Exceeds 0.1% threshold
100
+ Check .vizzly/diffs/homepage.png
101
+ - login-form (1.8% diff) - Exceeds 0.1% threshold
102
+ Check .vizzly/diffs/login-form.png
103
+
104
+ New Screenshots:
105
+ - dashboard (no baseline for comparison)
106
+
107
+ Dashboard: http://localhost:47392
108
+
109
+ Next Steps:
110
+ - Review diff images to understand what changed
111
+ - Accept baselines from dashboard if changes are intentional
112
+ - For detailed analysis of failures, ask me to debug specific screenshots
113
+ - Fix visual issues if changes are unintentional
114
+ ```
115
+
116
+ ## Example When All Pass
117
+
118
+ ```
119
+ User: "Are my visual tests passing?"
120
+
121
+ Vizzly TDD Status:
122
+ ✅ Total: 15 screenshots
123
+ ✅ All passed!
124
+
125
+ No visual regressions detected. All screenshots match their baselines.
126
+
127
+ Dashboard: http://localhost:47392
128
+ ```
129
+
130
+ ## Example When TDD Not Running
131
+
132
+ ```
133
+ User: "How are my tests?"
134
+
135
+ Vizzly TDD Status:
136
+ ❌ TDD server is not running
137
+
138
+ To start TDD mode:
139
+ vizzly tdd start
140
+
141
+ Then run your tests to capture screenshots.
142
+ ```
143
+
144
+ ## Important Notes
145
+
146
+ - **Quick status check** - Designed for fast overview, not detailed analysis
147
+ - **Use dashboard for visuals** - Link to dashboard for image review
148
+ - **Suggest next steps** - Always provide actionable guidance
149
+ - **Detect TDD mode** - Only works with local TDD server running
150
+ - **For detailed debugging** - Suggest using debug-visual-regression Skill
151
+
152
+ ## Focus on Actionability
153
+
154
+ Always end with clear next steps:
155
+ - What to investigate
156
+ - Which tools to use (dashboard, debug Skill)
157
+ - How to accept/reject baselines
158
+ - When to fix code vs accept changes
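The summary format described in this Skill could be produced along the following lines. This is only a sketch: `summarizeStatus` is a hypothetical function, and the comparison shape (`name`, `status`, `diffPercentage`, `threshold`) is an assumption based on the examples above rather than the documented output of `get_tdd_status`.

```javascript
// Hypothetical formatter. Assumes each comparison looks like
// { name, status: 'passed' | 'failed' | 'new', diffPercentage, threshold }.
function summarizeStatus(comparisons) {
  const byStatus = { passed: [], failed: [], new: [] };
  for (const comparison of comparisons) {
    if (Array.isArray(byStatus[comparison.status])) {
      byStatus[comparison.status].push(comparison);
    }
  }

  const lines = ['Vizzly TDD Status:', `✅ Total: ${comparisons.length} screenshots`];

  // Short form when nothing failed and nothing is new.
  if (byStatus.failed.length === 0 && byStatus.new.length === 0) {
    lines.push('✅ All passed!');
    return lines.join('\n');
  }

  lines.push(`✅ Passed: ${byStatus.passed.length}`);
  lines.push(`❌ Failed: ${byStatus.failed.length} (exceeded threshold)`);
  lines.push(`🆕 New: ${byStatus.new.length} (no baseline)`);

  if (byStatus.failed.length > 0) {
    lines.push('', 'Failed Comparisons:');
    for (const c of byStatus.failed) {
      lines.push(`- ${c.name} (${c.diffPercentage}% diff) - Exceeds ${c.threshold}% threshold`);
    }
  }

  if (byStatus.new.length > 0) {
    lines.push('', 'New Screenshots:');
    for (const c of byStatus.new) {
      lines.push(`- ${c.name} (no baseline for comparison)`);
    }
  }

  return lines.join('\n');
}
```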
@@ -0,0 +1,269 @@
1
+ ---
2
+ name: Debug Visual Regression
3
+ description: Analyze visual regression test failures in Vizzly. Use when the user mentions failing visual tests, screenshot differences, visual bugs, diffs, or asks to debug/investigate/analyze visual changes. Works with both local TDD and cloud modes.
4
+ allowed-tools: Read, WebFetch, mcp__plugin_vizzly_vizzly__read_comparison_details, mcp__plugin_vizzly_vizzly__search_comparisons, mcp__plugin_vizzly_vizzly__accept_baseline, mcp__plugin_vizzly_vizzly__approve_comparison, mcp__plugin_vizzly_vizzly__reject_comparison
5
+ ---
6
+
7
+ # Debug Visual Regression
8
+
9
+ Automatically analyze visual regression failures when the user mentions them. This Skill helps identify the root cause of visual differences and suggests whether to accept or fix changes.
10
+
11
+ ## When to Use This Skill
12
+
13
+ Activate this Skill when the user:
14
+ - Mentions "failing visual test" or "screenshot failure"
15
+ - Asks "what's wrong with my visual tests?"
16
+ - Says "the homepage screenshot is different" or similar
17
+ - Wants to understand why a visual comparison failed
18
+ - Asks to "debug", "analyze", or "investigate" visual changes
19
+ - Mentions specific screenshot names that are failing
20
+
21
+ ## How This Skill Works
22
+
23
+ 1. **Automatically detect the mode** (local TDD or cloud)
24
+ 2. **Fetch comparison details** using the screenshot name or comparison ID
25
+ 3. **View the actual images** to perform visual analysis
26
+ 4. **Provide detailed insights** on what changed and why
27
+ 5. **Suggest next steps** (accept, reject, or fix)
28
+
29
+ ## Instructions
30
+
31
+ ### Step 1: Call the Unified MCP Tool
32
+
33
+ Use `read_comparison_details` with the identifier:
34
+ - Pass screenshot name (e.g., "homepage_desktop") for local mode
35
+ - Pass comparison ID (e.g., "cmp_abc123") for cloud mode
36
+ - The tool automatically detects which mode to use
37
+ - Returns a response with `mode` field indicating local or cloud
38
+
39
+ ### Step 2: Check the Mode in Response
40
+
41
+ The response will contain a `mode` field:
42
+ - **Local mode** (`mode: "local"`): Returns filesystem paths (`baselinePath`, `currentPath`, `diffPath`)
43
+ - **Cloud mode** (`mode: "cloud"`): Returns URLs (`baselineUrl`, `currentUrl`, `diffUrl`)
44
+
45
+ ### Step 3: Analyze Comparison Data
46
+
47
+ Examine the comparison details:
48
+ - Diff percentage and threshold
49
+ - Status (failed/new/passed)
50
+ - Image references (paths or URLs depending on mode)
51
+ - Viewport and browser information
52
+
53
+ ### Step 4: View the Actual Images
54
+
55
+ **CRITICAL:** You MUST view the baseline and current images to provide accurate analysis.
56
+
57
+ **If mode is "local":**
58
+ - Response contains filesystem paths (`baselinePath`, `currentPath`, `diffPath`)
59
+ - **Use the Read tool to view ONLY baselinePath and currentPath**
60
+ - **DO NOT read diffPath** - it causes API errors
61
+
62
+ **If mode is "cloud":**
63
+ - Response contains public URLs (`baselineUrl`, `currentUrl`, `diffUrl`)
64
+ - **Use the WebFetch tool to view ONLY baselineUrl and currentUrl**
65
+ - **DO NOT fetch diffUrl** - it causes API errors
66
+
67
+ ### Step 5: Provide Detailed Visual Insights
68
+
69
+ Based on what you observe in the images:
70
+
71
+ **Describe the specific visual differences:**
72
+ - Which UI components, elements, or layouts changed
73
+ - Colors, spacing, typography, positioning changes
74
+ - Missing or added elements
75
+
76
+ **Categorize the change by diff percentage:**
77
+ - **< 1%:** Anti-aliasing, font rendering, subpixel differences
78
+ - **1-5%:** Layout shifts, padding/margin changes, color variations
79
+ - **> 5%:** Significant layout changes, missing content, major visual updates
80
+
81
+ **Identify possible causes:**
82
+ - CSS changes (margin, padding, positioning)
83
+ - Content changes (text, images)
84
+ - State issues (hover, focus, loading states)
85
+ - Browser/viewport rendering differences
86
+
87
+ ### Step 6: Suggest Next Steps
88
+
89
+ **If local mode:**
90
+ - Whether to accept using `accept_baseline` tool
91
+ - Specific code areas to investigate if unintentional
92
+ - How to fix common issues
93
+
94
+ **If cloud mode:**
95
+ - Whether to approve using `approve_comparison` tool
96
+ - Whether to reject using `reject_comparison` tool with reason
97
+ - Team coordination steps
98
+
99
+ **If changes are intentional:**
100
+ - Explain why it's safe to accept/approve
101
+ - Confirm this matches expected behavior
102
+
103
+ **If changes are unintentional:**
104
+ - Specific files to check (CSS, templates, components)
105
+ - Git commands to investigate recent changes
106
+ - How to reproduce and fix
107
+
108
+ ## Example Analysis (Local TDD Mode)
109
+
110
+ ```
111
+ User: "The homepage screenshot is failing"
112
+
113
+ Step 1: Call tool
114
+ Tool: read_comparison_details({ identifier: "homepage" })
115
+
116
+ Response:
117
+ {
118
+ "name": "homepage",
119
+ "status": "failed",
120
+ "diffPercentage": 2.3,
121
+ "threshold": 0.1,
122
+ "mode": "local",
123
+ "baselinePath": "/Users/you/project/.vizzly/baselines/homepage.png",
124
+ "currentPath": "/Users/you/project/.vizzly/screenshots/homepage.png"
125
+ }
126
+
127
+ Step 2: View images
128
+ Read(baselinePath)
129
+ Read(currentPath)
130
+
131
+ Step 3: Visual Analysis
132
+ Comparing the two images, I can see the following changes:
133
+
134
+ Visual Differences:
135
+ - The navigation header has shifted down by approximately 10-15 pixels
136
+ - Logo position moved from y:20px to y:35px
137
+ - Navigation menu items are now overlapping with the hero section
138
+ - The "Sign Up" button background changed from blue (#2563eb) to darker blue (#1e40af)
139
+
140
+ Root Cause Analysis:
141
+ Based on the 2.3% diff and visual comparison, this appears to be:
142
+ 1. A margin or padding change on the header element
143
+ 2. The button color change suggests a hover state being captured
144
+
145
+ Recommendations:
146
+ If intentional:
147
+ - This appears to be a deliberate redesign of the header spacing
148
+ - Accept as new baseline if these changes match your design updates
149
+
150
+ If unintentional:
151
+ - Check recent CSS changes to `.header` or `nav` margin-top/padding-top
152
+ - Verify button states during screenshot capture to avoid hover state
153
+ - Investigate with: `git diff src/styles/header.css`
154
+ - Fix the header positioning and re-run tests
155
+ ```
156
+
157
+ ## Example Analysis (Cloud Mode)
158
+
159
+ ```
160
+ User: "Can you check comparison cmp_xyz789?"
161
+
162
+ Step 1: Call tool
163
+ Tool: read_comparison_details({
164
+ identifier: "cmp_xyz789",
165
+ apiToken: "vzt_..."
166
+ })
167
+
168
+ Response:
169
+ {
170
+ "name": "login-page",
171
+ "status": "failed",
172
+ "diffPercentage": 0.8,
173
+ "threshold": 0.1,
174
+ "mode": "cloud",
175
+ "baselineUrl": "https://app.vizzly.dev/screenshots/abc123/baseline.png",
176
+ "currentUrl": "https://app.vizzly.dev/screenshots/abc123/current.png",
177
+ "comparisonId": "cmp_xyz789"
178
+ }
179
+
180
+ Step 2: View images
181
+ WebFetch(baselineUrl, "show me the baseline screenshot")
182
+ WebFetch(currentUrl, "show me the current screenshot")
183
+
184
+ Step 3: Visual Analysis
185
+ The differences are very subtle (0.8% diff):
186
+
187
+ Visual Differences:
188
+ - Input field border color slightly different (#e5e7eb vs #d1d5db)
189
+ - Submit button has very slight shadow change
190
+ - Font anti-aliasing appears slightly different
191
+
192
+ Root Cause Analysis:
193
+ This appears to be browser rendering differences or minor CSS updates:
194
+ - The 0.8% diff suggests this is not a major visual change
195
+ - Could be Chrome version differences
196
+ - Could be sub-pixel rendering variations
197
+
198
+ Recommendations:
199
+ This is likely acceptable to approve:
200
+ - The changes are extremely minor (< 1%)
201
+ - No functional UI changes
202
+ - Probably browser/rendering variations
203
+
204
+ Approve with: approve_comparison tool
205
+ Or adjust threshold to 1% if these variations are expected
206
+ ```
207
+
208
+ ## Cross-Build Debugging (Cloud Only)
209
+
210
+ When debugging visual regressions in cloud mode, you can track a screenshot across multiple builds to identify when changes were introduced.
211
+
212
+ ### When to Use Search
213
+
214
+ Use `search_comparisons` when:
215
+ - The user asks "when did this screenshot start failing?"
216
+ - They want to track a visual change across builds
217
+ - They're investigating a regression that appeared recently
218
+ - They want to see the history of a specific screenshot
219
+
220
+ ### How to Search
221
+
222
+ ```javascript
223
+ // Search for all comparisons of a screenshot
224
+ search_comparisons({
225
+ name: "homepage_desktop",
226
+ branch: "main", // optional: filter by branch
227
+ limit: 10 // optional: limit results
228
+ })
229
+ ```
230
+
231
+ ### Example Workflow
232
+
233
+ ```
234
+ User: "When did the homepage screenshot start failing?"
235
+
236
+ Step 1: Search for the screenshot across builds
237
+ Tool: search_comparisons({ name: "homepage", branch: "main", limit: 10 })
238
+
239
+ Response shows 10 comparisons from most recent to oldest, each with:
240
+ - Build name, branch, and creation date
241
+ - Diff percentage and status
242
+ - Screenshot URLs
243
+
244
+ Step 2: Analyze the timeline
245
+ - Build #45 (today): 5.2% diff - FAILED
246
+ - Build #44 (yesterday): 5.1% diff - FAILED
247
+ - Build #43 (2 days ago): 0.3% diff - PASSED
248
+ - Build #42 (3 days ago): 0.2% diff - PASSED
249
+
250
+ Step 3: Report findings
251
+ "The homepage screenshot started failing between Build #43 and #44
252
+ (approximately 1-2 days ago). The diff jumped from 0.3% to 5.1%,
253
+ suggesting a significant visual change was introduced."
254
+
255
+ Step 4: Deep dive into the failing build
256
+ Tool: read_comparison_details({ identifier: "cmp_xyz_from_build44" })
257
+ [View images and provide detailed analysis as usual]
258
+ ```
259
+
260
+ ## Important Notes
261
+
262
+ - **Always use `read_comparison_details`** - it automatically detects the mode
263
+ - **Use `search_comparisons` for cloud debugging** - to track changes across builds
264
+ - **Check the `mode` field** to know which viewing tool to use (Read vs WebFetch)
265
+ - **Never view diff images** - only baseline and current
266
+ - **Visual inspection is critical** - don't rely solely on diff percentages
267
+ - **Be specific in analysis** - identify exact elements that changed
268
+ - **Provide actionable advice** - specific files, commands, or tools to use
269
+ - **Consider context** - small diffs might be acceptable, large ones need investigation
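Two pieces of reasoning in this Skill lend themselves to a small sketch: bucketing a diff percentage (Step 5) and locating where a screenshot started failing in a cross-build history. Both functions below are hypothetical. The history shape (`build`, `status`, `diffPercentage`, ordered most recent first, lowercase status values) is an assumption modeled on the example workflow, not a documented `search_comparisons` response.

```javascript
// Buckets from Step 5: < 1%, 1-5%, > 5%. The labels are illustrative.
function categorizeDiff(diffPercentage) {
  if (diffPercentage < 1) return 'minor';
  if (diffPercentage <= 5) return 'moderate';
  return 'major';
}

// Walks a history ordered most recent first and finds the oldest build in the
// current run of failures, plus the passing build right before it.
// Returns null when the most recent build is not failing.
function findRegressionStart(history) {
  let firstFailing = null;
  for (const entry of history) {
    if (entry.status !== 'failed') {
      return firstFailing ? { firstFailing, lastPassing: entry } : null;
    }
    firstFailing = entry;
  }
  // Every build in the window failed, so the start lies further back.
  return firstFailing ? { firstFailing, lastPassing: null } : null;
}
```

Applied to the timeline in the example workflow, this would report Build #44 as the first failing build and Build #43 as the last passing one.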
@@ -128,7 +128,37 @@ export class ApiService {
128
128
  * @returns {Promise<Object>} Comparison data
129
129
  */
130
130
  async getComparison(comparisonId) {
131
- return this.request(`/api/sdk/comparisons/${comparisonId}`);
131
+ let response = await this.request(`/api/sdk/comparisons/${comparisonId}`);
132
+ return response.comparison;
133
+ }
134
+
135
+ /**
136
+ * Search for comparisons by name across builds
137
+ * @param {string} name - Screenshot name to search for
138
+ * @param {Object} [filters={}] - Optional filters (branch, limit, offset)
139
+ * @param {string} [filters.branch] - Filter by branch name
140
+ * @param {number} [filters.limit=50] - Maximum number of results (default: 50)
141
+ * @param {number} [filters.offset=0] - Pagination offset (default: 0)
142
+ * @returns {Promise<Object>} Search results with comparisons and pagination
143
+ */
144
+ async searchComparisons(name, filters = {}) {
145
+ if (!name || typeof name !== 'string') {
146
+ throw new VizzlyError('name is required and must be a non-empty string');
147
+ }
148
+ let {
149
+ branch,
150
+ limit = 50,
151
+ offset = 0
152
+ } = filters;
153
+ const queryParams = new URLSearchParams({
154
+ name,
155
+ limit: String(limit),
156
+ offset: String(offset)
157
+ });
158
+
159
+ // Only add branch if provided
160
+ if (branch) queryParams.append('branch', branch);
161
+ return this.request(`/api/sdk/comparisons/search?${queryParams}`);
132
162
  }
133
163
 
134
164
  /**
@@ -132,10 +132,35 @@ export class TddService {
        logger.warn(`⚠️ Build ${buildId} has status: ${baselineBuild.status} (expected: completed)`);
      }
    } else if (comparisonId) {
-     // Use specific comparison ID
+     // Use specific comparison ID - download only this comparison's baseline screenshot
      logger.info(`📌 Using comparison: ${comparisonId}`);
      const comparison = await this.api.getComparison(comparisonId);
-     baselineBuild = comparison.baselineBuild;
+
+     // A comparison doesn't have baselineBuild directly - we need to get it
+     // The comparison has baseline_screenshot which contains the build_id
+     if (!comparison.baseline_screenshot) {
+       throw new Error(`Comparison ${comparisonId} has no baseline screenshot. This comparison may be a "new" screenshot with no baseline to compare against.`);
+     }
+
+     // The original_url might be in baseline_screenshot.original_url or comparison.baseline_screenshot_url
+     let baselineUrl = comparison.baseline_screenshot.original_url || comparison.baseline_screenshot_url;
+     if (!baselineUrl) {
+       throw new Error(`Baseline screenshot for comparison ${comparisonId} has no download URL`);
+     }
+
+     // For a specific comparison, we only download that one baseline screenshot
+     // Create a mock build structure with just this one screenshot
+     baselineBuild = {
+       id: comparison.baseline_screenshot.build_id || 'comparison-baseline',
+       name: `Comparison ${comparisonId.substring(0, 8)}`,
+       screenshots: [{
+         id: comparison.baseline_screenshot.id,
+         name: comparison.baseline_name || comparison.current_name,
+         original_url: baselineUrl,
+         metadata: {},
+         properties: {}
+       }]
+     };
    } else {
      // Get the latest passed build for this environment and branch
      const builds = await this.api.getBuilds({
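The mock-build construction in the hunk above can be read as a pure function of the comparison object. The sketch below assumes that reading: `baselineBuildFromComparison` is a hypothetical name, the field names are taken from the diff, and the sample values (IDs, URL) are invented for illustration.

```javascript
// Hypothetical extraction of the comparisonId branch into a pure function.
// Field names follow the diff; nothing here calls the Vizzly API.
function baselineBuildFromComparison(comparison, comparisonId) {
  const screenshot = comparison.baseline_screenshot;
  if (!screenshot) {
    throw new Error(`Comparison ${comparisonId} has no baseline screenshot.`);
  }
  // Prefer the URL on the screenshot, fall back to the one on the comparison
  const baselineUrl = screenshot.original_url || comparison.baseline_screenshot_url;
  if (!baselineUrl) {
    throw new Error(`Baseline screenshot for comparison ${comparisonId} has no download URL`);
  }
  return {
    id: screenshot.build_id || 'comparison-baseline',
    name: `Comparison ${comparisonId.substring(0, 8)}`,
    screenshots: [{
      id: screenshot.id,
      name: comparison.baseline_name || comparison.current_name,
      original_url: baselineUrl,
      metadata: {},
      properties: {}
    }]
  };
}

// Sample input with made-up values
const example = baselineBuildFromComparison({
  baseline_screenshot: { id: 'shot-1', build_id: 'build-1', original_url: 'https://example.test/homepage.png' },
  baseline_name: 'homepage'
}, 'abcdef1234567890');
console.log(example.name); // prints "Comparison abcdef12"
```

Because the result has the same `screenshots` shape as a fetched build, the later download step can treat both cases alike, which is presumably why the second hunk skips `getBuild` when `comparisonId` is set.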
@@ -152,10 +177,12 @@ export class TddService {
      baselineBuild = builds.data[0];
    }

-   // For specific buildId, we already have screenshots, otherwise get build details
+   // For specific buildId, we already have screenshots
+   // For comparisonId, we created a mock build with just the one screenshot
+   // Otherwise, get build details with screenshots
    let buildDetails = baselineBuild;
-   if (!buildId) {
-     // Get build details with screenshots for non-buildId cases
+   if (!buildId && !comparisonId) {
+     // Get build details with screenshots for non-buildId/non-comparisonId cases
      const actualBuildId = baselineBuild.id;
      buildDetails = await this.api.getBuild(actualBuildId, 'screenshots');
    }
@@ -28,6 +28,20 @@ export class ApiService {
     * @returns {Promise<Object>} Comparison data
     */
    getComparison(comparisonId: string): Promise<any>;
+   /**
+    * Search for comparisons by name across builds
+    * @param {string} name - Screenshot name to search for
+    * @param {Object} filters - Optional filters (branch, limit, offset)
+    * @param {string} [filters.branch] - Filter by branch name
+    * @param {number} [filters.limit=50] - Maximum number of results (default: 50)
+    * @param {number} [filters.offset=0] - Pagination offset (default: 0)
+    * @returns {Promise<Object>} Search results with comparisons and pagination
+    */
+   searchComparisons(name: string, filters?: {
+     branch?: string;
+     limit?: number;
+     offset?: number;
+   }): Promise<any>;
    /**
     * Get builds for a project
     * @param {Object} filters - Filter options
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@vizzly-testing/cli",
-   "version": "0.11.1",
+   "version": "0.11.2",
    "description": "Visual review platform for UI developers and designers",
    "keywords": [
      "visual-testing",
@@ -1,153 +0,0 @@
- ---
- description: Analyze a specific visual regression failure and suggest fixes
- ---
-
- # Debug Vizzly Visual Regression
-
- Analyze a specific failing visual comparison in detail. Supports both local TDD mode and cloud API mode.
-
- ## Process
-
- 1. **Call the unified MCP tool**: Use `read_comparison_details` with the identifier (screenshot name or comparison ID)
-    - The tool automatically detects whether to use local TDD mode or cloud mode
-    - Pass screenshot name (e.g., "homepage_desktop") for local mode
-    - Pass comparison ID (e.g., "cmp_abc123") for cloud mode
-    - Returns a response with `mode` field indicating which mode was used
-
- 2. **Check the mode in the response**:
-    - **Local mode** (`mode: "local"`): Returns filesystem paths (`baselinePath`, `currentPath`, `diffPath`)
-    - **Cloud mode** (`mode: "cloud"`): Returns URLs (`baselineUrl`, `currentUrl`, `diffUrl`)
-
- 3. **Analyze the comparison data**:
-    - Diff percentage and threshold
-    - Status (failed/new/passed)
-    - Image references (paths or URLs depending on mode)
-
- 4. **View the actual images** (critical for visual analysis):
-
-    Check the `mode` field in the response to determine which tool to use:
-
-    **If mode is "local":**
-    - Response contains filesystem paths (`baselinePath`, `currentPath`, `diffPath`)
-    - **Use the Read tool to view ONLY baselinePath and currentPath**
-    - **DO NOT read diffPath** - it causes API errors
-
-    **If mode is "cloud":**
-    - Response contains public URLs (`baselineUrl`, `currentUrl`, `diffUrl`)
-    - **Use the WebFetch tool to view ONLY baselineUrl and currentUrl**
-    - **DO NOT fetch diffUrl** - it causes API errors
-
-    **IMPORTANT:** You MUST view the baseline and current images to provide accurate analysis
-
- 5. **Provide detailed visual insights** based on what you see:
-    - What type of change was detected (small/moderate/large diff)
-    - Describe the specific visual differences you observe in the images
-    - Identify which UI components, elements, or layouts changed
-    - Possible causes based on diff percentage and visual inspection:
-      - <1%: Anti-aliasing, font rendering, subpixel differences
-      - 1-5%: Layout shifts, padding/margin changes, color variations
-      - >5%: Significant layout changes, missing content, major visual updates
-
- 6. **Suggest next steps** based on the mode:
-    - **If local mode**: Whether to accept using `accept_baseline` tool
-    - **If cloud mode**: Whether to approve/reject using `approve_comparison` or `reject_comparison` tools
-    - Areas to investigate if unintentional
-    - How to fix common issues
-    - Specific code changes if you can identify them from the visual diff
-
- ## Example Analysis (Local TDD Mode)
-
- ```
- Step 1: Call read_comparison_details with screenshot name
- Tool: read_comparison_details({ identifier: "homepage" })
-
- Response:
- {
-   "name": "homepage",
-   "status": "failed",
-   "diffPercentage": 2.3,
-   "threshold": 0.1,
-   "mode": "local",
-   "baselinePath": "/Users/you/project/.vizzly/baselines/homepage.png",
-   "currentPath": "/Users/you/project/.vizzly/screenshots/homepage.png",
-   "diffPath": "/Users/you/project/.vizzly/diffs/homepage.png"
- }
-
- Step 2: Detected mode is "local", so use Read tool for images
- Read(baselinePath) and Read(currentPath)
-
- Visual Analysis:
- [After reading the baseline and current image files...]
-
- Comparing the two images, the navigation header has shifted down by approximately 10-15 pixels.
- Specific changes observed:
- - The logo position moved from y:20px to y:35px
- - Navigation menu items are now overlapping with the hero section
- - The "Sign Up" button background changed from blue (#2563eb) to a darker blue (#1e40af)
-
- Root Cause:
- Based on the visual comparison, this appears to be a margin or padding change on the
- header element. The button color change is likely a hover state being captured.
-
- Recommendations:
- 1. Check for recent CSS changes to:
-    - `.header` or `nav` margin-top/padding-top
-    - Any global layout shifts affecting the header
- 2. The button color change suggests a hover state - ensure consistent state during screenshot capture
- 3. If the header position change is intentional:
-    - Accept as new baseline using `accept_baseline` tool
- 4. If unintentional:
-    - Revert CSS changes to header positioning
-    - Verify with: `git diff src/styles/header.css`
- ```
-
- ## Example Analysis (Cloud Mode)
-
- ```
- Step 1: Call read_comparison_details with comparison ID
- Tool: read_comparison_details({
-   identifier: "cmp_xyz789",
-   apiToken: "vzt_..."
- })
-
- Response:
- {
-   "name": "homepage",
-   "status": "failed",
-   "diffPercentage": 2.3,
-   "threshold": 0.1,
-   "mode": "cloud",
-   "baselineUrl": "https://app.vizzly.dev/screenshots/abc123/baseline.png",
-   "currentUrl": "https://app.vizzly.dev/screenshots/abc123/current.png",
-   "diffUrl": "https://app.vizzly.dev/screenshots/abc123/diff.png",
-   "comparisonId": "cmp_xyz789",
-   "buildId": "bld_abc123"
- }
-
- Step 2: Detected mode is "cloud", so use WebFetch tool for images
- WebFetch(baselineUrl) and WebFetch(currentUrl)
-
- Visual Analysis:
- [After fetching the baseline and current image URLs...]
-
- [Same analysis as local mode example...]
-
- Recommendations:
- 1. [Same technical recommendations as local mode...]
- 2. If the header position change is intentional:
-    - Approve this comparison using `approve_comparison` tool
- 3. If unintentional:
-    - Reject using `reject_comparison` tool with detailed reason
-    - Have the team fix the CSS changes
- ```
-
- ## Important Notes
-
- - **Unified Tool**: Always use `read_comparison_details` with the identifier - it automatically detects the mode
- - **Mode Detection**: Check the `mode` field in the response to know which viewing tool to use
- - **Image Viewing**:
-   - Local mode → Use Read tool with filesystem paths
-   - Cloud mode → Use WebFetch tool with URLs
- - **Diff Images**: NEVER attempt to view/read/fetch the diff image - it causes API errors
- - **Visual Analysis**: Always view the baseline and current images before providing analysis
- - Visual inspection reveals details that diff percentages alone cannot convey
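The removed command's decision rules (which images to open per mode, and which causes to suspect per diff percentage) can be summarized in a short sketch. Both function names are hypothetical; the thresholds and field names come from the removed text, and the tool names are returned only as labels.

```javascript
// Hypothetical summary of the removed debug-diff instructions.
// Maps a diff percentage to the likely causes listed in the command text.
function likelyCauses(diffPercentage) {
  if (diffPercentage < 1) return 'anti-aliasing, font rendering, subpixel differences';
  if (diffPercentage <= 5) return 'layout shifts, padding/margin changes, color variations';
  return 'significant layout changes, missing content, major visual updates';
}

// Picks the viewing tool and the two images to open; the diff image is
// deliberately left out because the instructions say it causes API errors.
function imagesToView(details) {
  if (details.mode === 'local') {
    return { tool: 'Read', targets: [details.baselinePath, details.currentPath] };
  }
  if (details.mode === 'cloud') {
    return { tool: 'WebFetch', targets: [details.baselineUrl, details.currentUrl] };
  }
  throw new Error(`Unknown mode: ${details.mode}`);
}

console.log(likelyCauses(2.3));
// prints "layout shifts, padding/margin changes, color variations"
```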
@@ -1,43 +0,0 @@
- ---
- description: Check TDD dashboard status and view visual regression test results
- ---
-
- # Check Vizzly TDD Status
-
- Use the Vizzly MCP server to check the current TDD status:
-
- 1. Call the `get_tdd_status` tool from the vizzly MCP server
- 2. Analyze the comparison results
- 3. Show a summary of:
-    - Total screenshots tested
-    - Passed, failed, and new screenshot counts
-    - List of failed comparisons with diff percentages
-    - Available diff images to inspect
- 4. If TDD server is running, provide the dashboard URL
- 5. For failed comparisons, provide guidance on next steps
-
- ## Example Output Format
-
- ```
- Vizzly TDD Status:
- ✅ Total: 15 screenshots
- ✅ Passed: 12
- ❌ Failed: 2 (exceeded threshold)
- 🆕 New: 1 (no baseline)
-
- Failed Comparisons:
- - homepage (2.3% diff) - Check .vizzly/diffs/homepage.png
- - login-form (1.8% diff) - Check .vizzly/diffs/login-form.png
-
- New Screenshots:
- - dashboard (no baseline for comparison)
-
- Dashboard: http://localhost:47392
-
- Next Steps:
- - Review diff images to understand what changed
- - Accept baselines from dashboard if changes are intentional
- - Fix visual issues if changes are unintentional
- ```
-
- Focus on providing actionable information to help the developer understand what's failing and why.
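The summary format described in the removed command can be approximated from a list of comparison results. This is only a sketch: the input shape (`name`, `status`, `diffPercentage`) is an assumption, since the diff does not show what `get_tdd_status` returns, and `summarizeTddStatus` is a hypothetical name.

```javascript
// Hypothetical formatter; the comparison object shape is assumed, not taken
// from the get_tdd_status response.
function summarizeTddStatus(comparisons) {
  const withStatus = (status) => comparisons.filter((c) => c.status === status);
  const failed = withStatus('failed');
  const lines = [
    'Vizzly TDD Status:',
    `Total: ${comparisons.length} screenshots`,
    `Passed: ${withStatus('passed').length}`,
    `Failed: ${failed.length}`,
    `New: ${withStatus('new').length}`
  ];
  if (failed.length > 0) {
    lines.push('', 'Failed Comparisons:');
    for (const c of failed) {
      lines.push(`- ${c.name} (${c.diffPercentage}% diff)`);
    }
  }
  return lines.join('\n');
}

console.log(summarizeTddStatus([
  { name: 'homepage', status: 'failed', diffPercentage: 2.3 },
  { name: 'dashboard', status: 'new' },
  { name: 'footer', status: 'passed' }
]));
```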