npm - @vizzly-testing/cli - Versions diffs - 0.11.1 → 0.11.2 - Mend

@vizzly-testing/cli 0.11.1 → 0.11.2

Files changed (13)

package/claude-plugin/.claude-plugin/README.md +165 -13
package/claude-plugin/CHANGELOG.md +58 -0
package/claude-plugin/mcp/vizzly-server/cloud-api-provider.js +29 -0
package/claude-plugin/mcp/vizzly-server/index.js +54 -1
package/claude-plugin/skills/check-visual-tests/SKILL.md +158 -0
package/claude-plugin/skills/debug-visual-regression/SKILL.md +269 -0
package/dist/services/api-service.js +31 -1
package/dist/services/tdd-service.js +32 -5
package/dist/types/services/api-service.d.ts +14 -0
package/package.json +1 -1
package/claude-plugin/commands/debug-diff.md +0 -153
package/claude-plugin/commands/tdd-status.md +0 -43
/package/claude-plugin/{.claude-plugin/.mcp.json → .mcp.json} +0 -0

package/claude-plugin/.claude-plugin/README.md CHANGED

@@ -30,28 +30,99 @@ cd vizzly-cli
 /plugin install vizzly@vizzly-marketplace
 ```
+## Migration Guide (v0.0.x → v0.1.0)
+**⚠️ Breaking Changes:** Slash commands for status checking and debugging have been replaced with Agent Skills.
+### What Changed
+**Before (v0.0.x):**
+```bash
+# Manually invoke slash commands
+/vizzly:tdd-status
+/vizzly:debug-diff homepage
+```
+**After (v0.1.0):**
+```bash
+# Just ask naturally - Skills activate automatically
+"How are my visual tests?"
+"The homepage screenshot is failing"
+```
+### Command Migration
+| Old Slash Command | New Approach | How It Works |
+|-------------------|--------------|--------------|
+| `/vizzly:tdd-status` | Ask: "How are my tests?" | `check-visual-tests` Skill activates automatically |
+| `/vizzly:debug-diff <name>` | Say: "Debug the homepage screenshot" | `debug-visual-regression` Skill activates automatically |
+| `/vizzly:setup` | Still `/vizzly:setup` | ✅ No change - explicit setup workflow |
+| `/vizzly:suggest-screenshots` | Still `/vizzly:suggest-screenshots` | ✅ No change - explicit suggestions workflow |
+### Why This Change?
+**Better UX:**
+- No need to memorize command syntax
+- Just describe what you need in natural language
+- Claude understands your intent and activates the right Skill
+- More intuitive and conversational workflow
+**What Are Agent Skills?**
+Agent Skills are model-invoked capabilities that Claude uses autonomously based on your request. Instead of explicitly typing `/command`, you simply ask questions or describe problems, and Claude will automatically use the appropriate Skill.
+**Still Prefer Explicit Commands?**
+You can still be explicit in your requests:
+- "Check my Vizzly test status" → Activates `check-visual-tests` Skill
+- "Debug the login screenshot failure" → Activates `debug-visual-regression` Skill
+The Skills will activate based on your intent, not rigid command syntax.
 ## Features
-### 🔍 **TDD Status Checking**
-- `/vizzly:tdd-status` - Check current TDD status and comparison results
-- See failed/new/passed screenshot counts
-- Direct links to diff images and dashboard
+### ✨ **Agent Skills** (Model-Invoked)
-### 🐛 **Smart Diff Analysis**
-- `/vizzly:debug-diff <screenshot-name>` - Analyze visual regression failures
-- AI-powered analysis with contextual suggestions
-- Guidance on whether to accept or fix changes
+Claude automatically uses these Skills when you mention visual testing:
-### 💡 **Screenshot Suggestions**
-- `/vizzly:suggest-screenshots` - Analyze test files for screenshot opportunities
-- Framework-specific code examples
-- Respect your test structure and patterns
+**🐛 Debug Visual Regression**
+- Activated when you mention failing visual tests or screenshot differences
+- Automatically analyzes visual changes, identifies root causes
+- Compares baseline vs current screenshots
+- Suggests whether to accept or fix changes
+- Works with both local TDD and cloud modes
+**🔍 Check Visual Test Status**
+- Activated when you ask about test status or results
+- Provides quick summary of passed/failed/new screenshots
+- Shows diff percentages and threshold information
+- Links to dashboard for detailed review
+**Example usage:**
+- Just say: *"The homepage screenshot is failing"* → Claude debugs it
+- Just ask: *"How are my visual tests?"* → Claude checks status
+- No slash commands needed—Claude activates Skills automatically!
+### 📋 **Slash Commands** (User-Invoked)
-### ⚡ **Quick Setup**
+Explicit workflows you trigger manually:
+**⚡ Quick Setup**
 - `/vizzly:setup` - Initialize Vizzly configuration
 - Environment variable guidance
 - CI/CD integration help
+**💡 Screenshot Suggestions**
+- `/vizzly:suggest-screenshots` - Analyze test files for screenshot opportunities
+- Framework-specific code examples
+- Respect your test structure and patterns
+### Skills vs Slash Commands
+**Skills** are capabilities Claude uses autonomously based on your request. Just describe what you need naturally, and Claude will use the appropriate Skill.
+**Slash Commands** are explicit workflows you invoke manually when you want step-by-step guidance through a process.
 ## MCP Server Tools
 The plugin provides an MCP server with direct access to Vizzly data:
@@ -103,11 +174,92 @@ The plugin will automatically use the appropriate token based on your context.
 - TDD mode running for local features
 - Authentication configured (see above) for cloud features
+## How It Works
+### Agent Skills
+The plugin's Skills use Claude Code's `allowed-tools` feature to restrict what actions they can perform:
+**Check Visual Test Status Skill:**
+- Can use: `get_tdd_status`, `detect_context`
+- Purpose: Quickly check test status without modifying anything
+**Debug Visual Regression Skill:**
+- Can use: `Read`, `WebFetch`, `read_comparison_details`, `accept_baseline`, `approve_comparison`, `reject_comparison`
+- Purpose: Analyze failures and suggest/apply fixes
+### MCP Server Integration
+The plugin bundles an MCP server that provides 15 tools for interacting with Vizzly:
+- **Automatic startup** - MCP server starts when plugin is enabled
+- **Token resolution** - Automatically finds your authentication token
+- **Dual mode** - Works with both local TDD and cloud builds
+- **No configuration needed** - Just install and use
+## Troubleshooting
+### Skills not activating
+If Claude isn't using the Skills automatically:
+1. Verify plugin is enabled: `/plugin`
+2. Check MCP server status: `/mcp` (should show `plugin:vizzly:vizzly`)
+3. Try being more explicit: "Check my Vizzly test status"
+### MCP server not connecting
+If the MCP server shows as "failed" in `/mcp`:
+1. Check Node.js version: `node --version` (requires 20+)
+2. View logs: `claude --debug`
+3. Reinstall plugin: `/plugin uninstall vizzly@vizzly-marketplace` then `/plugin install vizzly@vizzly-marketplace`
+### TDD server not found
+If Skills report "TDD server not running":
+1. Start TDD mode: `vizzly tdd start`
+2. Verify server is running: Check for `.vizzly/server.json`
+3. Run tests to generate screenshots
+## Example Workflows
+### Local TDD Development
+```bash
+# Start TDD server
+vizzly tdd start
+# Run your tests
+npm test
+# Ask Claude to check status
+# "How are my visual tests?"
+# If failures, ask Claude to debug
+# "The login page screenshot is failing"
+```
+### Cloud Build Review
+```bash
+# After CI/CD runs and creates a build
+# "Show me recent Vizzly builds"
+# Review specific comparison
+# "Analyze comparison cmp_abc123"
+# Approve or reject
+# Claude will suggest using approve/reject tools
+```
 ## Documentation
 - [Vizzly CLI](https://github.com/vizzly-testing/vizzly-cli) - Official CLI documentation
 - [Vizzly Platform](https://vizzly.dev) - Web dashboard and cloud features
 - [Claude Code](https://claude.com/claude-code) - Claude Code documentation
+- [Agent Skills](https://docs.claude.com/en/docs/claude-code/skills) - Learn about Claude Code Skills
 ## License
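The troubleshooting steps added to the README above include confirming that Node.js is version 20 or newer. A minimal sketch of that check follows; the helper name `meetsNodeRequirement` is hypothetical and not part of the plugin.

```javascript
// Hypothetical helper (not part of the Vizzly plugin): returns true when a
// Node.js version string such as "v20.11.0" meets the minimum major version.
function meetsNodeRequirement(version, minMajor = 20) {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) {
    throw new Error(`Unrecognized Node.js version string: ${version}`);
  }
  return Number(match[1]) >= minMajor;
}

// process.version is the running interpreter's version string.
console.log(meetsNodeRequirement(process.version));
```

This only mirrors what `node --version` reports; it does not inspect the MCP server itself.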

package/claude-plugin/CHANGELOG.md ADDED

@@ -0,0 +1,58 @@
+# Changelog
+All notable changes to the Vizzly Claude Code plugin will be documented in this file.
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [0.1.0] - 2025-10-18
+### Added
+- ✨ **Agent Skills** - Model-invoked capabilities that activate autonomously
+  - `check-visual-tests` Skill - Automatically checks test status when you ask about tests
+  - `debug-visual-regression` Skill - Automatically analyzes failures when you mention visual issues
+- 📦 MCP server configuration moved to plugin root (`.mcp.json`)
+- 📝 Comprehensive README with Skills documentation, troubleshooting, and workflows
+- 🔒 Tool restrictions via `allowed-tools` for better security and focused capabilities
+### Changed
+- 🔧 MCP server name: `vizzly-server` → `vizzly` (cleaner naming)
+- 🔧 Skills use correct tool prefix: `mcp__plugin_vizzly_vizzly__*`
+### Removed
+- ❌ **BREAKING:** `/vizzly:tdd-status` slash command → Replaced by `check-visual-tests` Skill
+- ❌ **BREAKING:** `/vizzly:debug-diff` slash command → Replaced by `debug-visual-regression` Skill
+### Migration Guide
+**Before (v0.0.x):**
+```bash
+# Manually invoke slash commands
+/vizzly:tdd-status
+/vizzly:debug-diff homepage
+```
+**After (v0.1.0):**
+```bash
+# Just ask naturally - Skills activate automatically
+"How are my visual tests?"
+"The homepage screenshot is failing"
+```
+**What Changed:**
+- **Status checks** are now autonomous - just ask about your tests
+- **Debugging** happens automatically when you mention failures
+- **No need to remember slash commands** - Claude understands your intent
+- **Setup and suggestions** still use slash commands (`/vizzly:setup`, `/vizzly:suggest-screenshots`)
+**Why This Change:**
+Agent Skills provide a more natural, intuitive experience. Instead of memorizing command syntax, you can ask questions in plain language and Claude will automatically use the right tools.
+**If You Prefer Explicit Commands:**
+While the slash commands are removed, you can still be explicit in your requests:
+- "Check my Vizzly test status" → Activates `check-visual-tests` Skill
+- "Debug the homepage screenshot" → Activates `debug-visual-regression` Skill
+### Fixed
+- MCP server location now follows Claude Code plugin specifications
+- Tool naming consistency across Skills and MCP server

package/claude-plugin/mcp/vizzly-server/cloud-api-provider.js CHANGED

@@ -145,6 +145,35 @@ export class CloudAPIProvider {
     return data.comparison;
   }
+  /**
+   * Search for comparisons by name across builds
+   */
+  async searchComparisons(name, apiToken, options = {}) {
+    if (!name || typeof name !== 'string') {
+      throw new Error('name is required and must be a non-empty string');
+    }
+    let { branch, limit = 50, offset = 0, apiUrl } = options;
+    let queryParams = new URLSearchParams({
+      name,
+      limit: limit.toString(),
+      offset: offset.toString()
+    });
+    if (branch) {
+      queryParams.append('branch', branch);
+    }
+    let data = await this.makeRequest(
+      `/api/sdk/comparisons/search?${queryParams}`,
+      apiToken,
+      apiUrl
+    );
+    return data;
+  }
   // ==================================================================
   // BUILD COMMENTS
   // ==================================================================
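The `searchComparisons` method added above builds its query string with `URLSearchParams` and appends `branch` only when one is supplied. A minimal standalone sketch of that step follows. The helper name `buildSearchPath` is hypothetical, and the sketch only builds the request path without calling the Vizzly API.

```javascript
// Hypothetical helper mirroring the query construction in the hunk above.
// It returns the request path only; no network request is made.
function buildSearchPath(name, options = {}) {
  if (!name || typeof name !== 'string') {
    throw new Error('name is required and must be a non-empty string');
  }
  // Destructuring defaults apply when a property is missing or undefined.
  const { branch, limit = 50, offset = 0 } = options;
  const queryParams = new URLSearchParams({
    name,
    limit: String(limit),
    offset: String(offset)
  });
  // branch is optional, so it is appended only when provided.
  if (branch) {
    queryParams.append('branch', branch);
  }
  return `/api/sdk/comparisons/search?${queryParams}`;
}

console.log(buildSearchPath('homepage', { branch: 'main', limit: 10 }));
// prints "/api/sdk/comparisons/search?name=homepage&limit=10&offset=0&branch=main"
```

Because the defaults are applied through destructuring, passing `limit: undefined` (as the MCP handler does when the caller omits it) still yields `limit=50`.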

package/claude-plugin/mcp/vizzly-server/index.js CHANGED

@@ -21,7 +21,7 @@ class VizzlyMCPServer {
   constructor() {
     this.server = new Server(
       {
-        name: 'vizzly-server',
+        name: 'vizzly',
         version: '0.1.0'
       },
       {
@@ -247,6 +247,41 @@ class VizzlyMCPServer {
             required: ['comparisonId']
           }
         },
+        {
+          name: 'search_comparisons',
+          description:
+            'Search for comparisons by screenshot name across recent builds in the cloud. Returns matching comparisons with their build context and screenshot URLs. Use this to find all instances of a specific screenshot across different builds for debugging.',
+          inputSchema: {
+            type: 'object',
+            properties: {
+              name: {
+                type: 'string',
+                description: 'Screenshot/comparison name to search for (supports partial matching)'
+              },
+              branch: {
+                type: 'string',
+                description: 'Optional branch name to filter results'
+              },
+              limit: {
+                type: 'number',
+                description: 'Maximum number of results to return (default: 50)'
+              },
+              offset: {
+                type: 'number',
+                description: 'Offset for pagination (default: 0)'
+              },
+              apiToken: {
+                type: 'string',
+                description: 'Vizzly API token (optional, auto-resolves from user login or env)'
+              },
+              apiUrl: {
+                type: 'string',
+                description: 'API base URL (optional)'
+              }
+            },
+            required: ['name']
+          }
+        },
         {
           name: 'create_build_comment',
           description: 'Create a comment on a build for collaboration',
@@ -668,6 +703,24 @@ class VizzlyMCPServer {
             };
           }
+          case 'search_comparisons': {
+            let apiToken = await this.resolveApiToken(args);
+            let results = await this.cloudProvider.searchComparisons(args.name, apiToken, {
+              branch: args.branch,
+              limit: args.limit,
+              offset: args.offset,
+              apiUrl: args.apiUrl
+            });
+            return {
+              content: [
+                {
+                  type: 'text',
+                  text: JSON.stringify(results, null, 2)
+                }
+              ]
+            };
+          }
           case 'create_build_comment': {
             let apiToken = await this.resolveApiToken(args);
             let result = await this.cloudProvider.createBuildComment(
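The new `search_comparisons` case returns the provider's result as pretty-printed JSON inside a single text content item. A small sketch of that wrapping step in isolation follows; the helper name `toTextResult` and the sample payload are hypothetical.

```javascript
// Hypothetical helper: wraps any JSON-serializable value in the same
// { content: [{ type: 'text', text }] } shape the handler above returns.
function toTextResult(results) {
  return {
    content: [
      {
        type: 'text',
        text: JSON.stringify(results, null, 2)
      }
    ]
  };
}

// The payload here is illustrative only; the real response shape is not
// shown in the diff.
const example = toTextResult({ comparisons: [] });
console.log(example.content[0].text);
```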

package/claude-plugin/skills/check-visual-tests/SKILL.md ADDED

@@ -0,0 +1,158 @@
+---
+name: Check Visual Test Status
+description: Check the status of Vizzly visual regression tests. Use when the user asks about test status, test results, what's failing, how tests are doing, or wants a summary of visual tests. Works with both local TDD and cloud modes.
+allowed-tools: mcp__plugin_vizzly_vizzly__get_tdd_status, mcp__plugin_vizzly_vizzly__detect_context
+---
+# Check Visual Test Status
+Automatically check Vizzly visual test status when the user asks about their tests. Provides a quick summary of passed, failed, and new screenshots.
+## When to Use This Skill
+Activate this Skill when the user:
+- Asks "How are my tests doing?"
+- Asks "Are there any failing tests?"
+- Asks "What's the status of visual tests?"
+- Asks "Show me test results"
+- Asks "What's failing?"
+- Wants a summary of visual regression tests
+## How This Skill Works
+1. **Detect context** (local TDD or cloud mode)
+2. **Fetch TDD status** from the local server
+3. **Analyze results** to identify failures, new screenshots, and passes
+4. **Provide summary** with actionable information
+5. **Link to dashboard** for detailed review
+## Instructions
+### Step 1: Get TDD Status
+Use the `get_tdd_status` tool from the Vizzly MCP server to fetch current comparison results.
+This returns:
+- Total screenshot count
+- Passed, failed, and new screenshot counts
+- List of all comparisons with details
+- Dashboard URL (if TDD server is running)
+### Step 2: Analyze the Results
+Examine the comparison data:
+- Count total, passed, failed, and new screenshots
+- Identify which specific screenshots failed
+- Note diff percentages for failures
+- Check if new screenshots need baselines
+### Step 3: Provide Clear Summary
+Format the output to be scannable and actionable:
+```
+Vizzly TDD Status:
+✅ Total: [count] screenshots
+✅ Passed: [count]
+❌ Failed: [count] (exceeded threshold)
+🆕 New: [count] (no baseline)
+Failed Comparisons:
+- [name] ([diff]% diff) - Exceeds [threshold]% threshold
+- [name] ([diff]% diff) - Exceeds [threshold]% threshold
+New Screenshots:
+- [name] (no baseline for comparison)
+Dashboard: http://localhost:47392
+```
+### Step 4: Suggest Next Steps
+Based on the results, provide guidance:
+**If there are failures:**
+- Suggest using the debug-visual-regression Skill for detailed analysis
+- Provide dashboard link for visual review
+- Mention accept/reject options
+**If there are new screenshots:**
+- Explain that new screenshots need baseline approval
+- Show how to accept them from dashboard or CLI
+**If all passed:**
+- Confirm tests are passing
+- No action needed
+## Example Output
+```
+User: "How are my tests?"
+Vizzly TDD Status:
+✅ Total: 15 screenshots
+✅ Passed: 12
+❌ Failed: 2 (exceeded threshold)
+🆕 New: 1 (no baseline)
+Failed Comparisons:
+- homepage (2.3% diff) - Exceeds 0.1% threshold
+  Check .vizzly/diffs/homepage.png
+- login-form (1.8% diff) - Exceeds 0.1% threshold
+  Check .vizzly/diffs/login-form.png
+New Screenshots:
+- dashboard (no baseline for comparison)
+Dashboard: http://localhost:47392
+Next Steps:
+- Review diff images to understand what changed
+- Accept baselines from dashboard if changes are intentional
+- For detailed analysis of failures, ask me to debug specific screenshots
+- Fix visual issues if changes are unintentional
+```
+## Example When All Pass
+```
+User: "Are my visual tests passing?"
+Vizzly TDD Status:
+✅ Total: 15 screenshots
+✅ All passed!
+No visual regressions detected. All screenshots match their baselines.
+Dashboard: http://localhost:47392
+```
+## Example When TDD Not Running
+```
+User: "How are my tests?"
+Vizzly TDD Status:
+❌ TDD server is not running
+To start TDD mode:
+  vizzly tdd start
+Then run your tests to capture screenshots.
+```
+## Important Notes
+- **Quick status check** - Designed for fast overview, not detailed analysis
+- **Use dashboard for visuals** - Link to dashboard for image review
+- **Suggest next steps** - Always provide actionable guidance
+- **Detect TDD mode** - Only works with local TDD server running
+- **For detailed debugging** - Suggest using debug-visual-regression Skill
+## Focus on Actionability
+Always end with clear next steps:
+- What to investigate
+- Which tools to use (dashboard, debug Skill)
+- How to accept/reject baselines
+- When to fix code vs accept changes
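The summary format described in Step 3 of the Skill above can be sketched as a small formatter. The comparison shape used here (`name`, `status`, `diffPercentage`, `threshold`) is an assumption borrowed from the sample responses elsewhere in this diff, and `summarizeComparisons` is a hypothetical name; the actual `get_tdd_status` payload is not shown.

```javascript
// Assumed comparison shape (not confirmed for get_tdd_status):
// { name, status: 'passed' | 'failed' | 'new', diffPercentage, threshold }
function summarizeComparisons(comparisons) {
  const groups = { passed: [], failed: [], new: [] };
  for (const comparison of comparisons) {
    // Ignore statuses outside the three the Skill reports on.
    if (Object.hasOwn(groups, comparison.status)) {
      groups[comparison.status].push(comparison);
    }
  }
  const lines = [
    'Vizzly TDD Status:',
    `✅ Total: ${comparisons.length} screenshots`,
    `✅ Passed: ${groups.passed.length}`,
    `❌ Failed: ${groups.failed.length} (exceeded threshold)`,
    `🆕 New: ${groups.new.length} (no baseline)`
  ];
  if (groups.failed.length > 0) {
    lines.push('Failed Comparisons:');
    for (const c of groups.failed) {
      lines.push(`- ${c.name} (${c.diffPercentage}% diff) - Exceeds ${c.threshold}% threshold`);
    }
  }
  if (groups.new.length > 0) {
    lines.push('New Screenshots:');
    for (const c of groups.new) {
      lines.push(`- ${c.name} (no baseline for comparison)`);
    }
  }
  return lines.join('\n');
}

console.log(summarizeComparisons([
  { name: 'homepage', status: 'failed', diffPercentage: 2.3, threshold: 0.1 },
  { name: 'dashboard', status: 'new' },
  { name: 'footer', status: 'passed', diffPercentage: 0, threshold: 0.1 }
]));
```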

package/claude-plugin/skills/debug-visual-regression/SKILL.md ADDED

@@ -0,0 +1,269 @@
+---
+name: Debug Visual Regression
+description: Analyze visual regression test failures in Vizzly. Use when the user mentions failing visual tests, screenshot differences, visual bugs, diffs, or asks to debug/investigate/analyze visual changes. Works with both local TDD and cloud modes.
+allowed-tools: Read, WebFetch, mcp__plugin_vizzly_vizzly__read_comparison_details, mcp__plugin_vizzly_vizzly__search_comparisons, mcp__plugin_vizzly_vizzly__accept_baseline, mcp__plugin_vizzly_vizzly__approve_comparison, mcp__plugin_vizzly_vizzly__reject_comparison
+---
+# Debug Visual Regression
+Automatically analyze visual regression failures when the user mentions them. This Skill helps identify the root cause of visual differences and suggests whether to accept or fix changes.
+## When to Use This Skill
+Activate this Skill when the user:
+- Mentions "failing visual test" or "screenshot failure"
+- Asks "what's wrong with my visual tests?"
+- Says "the homepage screenshot is different" or similar
+- Wants to understand why a visual comparison failed
+- Asks to "debug", "analyze", or "investigate" visual changes
+- Mentions specific screenshot names that are failing
+## How This Skill Works
+1. **Automatically detect the mode** (local TDD or cloud)
+2. **Fetch comparison details** using the screenshot name or comparison ID
+3. **View the actual images** to perform visual analysis
+4. **Provide detailed insights** on what changed and why
+5. **Suggest next steps** (accept, reject, or fix)
+## Instructions
+### Step 1: Call the Unified MCP Tool
+Use `read_comparison_details` with the identifier:
+- Pass screenshot name (e.g., "homepage_desktop") for local mode
+- Pass comparison ID (e.g., "cmp_abc123") for cloud mode
+- The tool automatically detects which mode to use
+- Returns a response with `mode` field indicating local or cloud
+### Step 2: Check the Mode in Response
+The response will contain a `mode` field:
+- **Local mode** (`mode: "local"`): Returns filesystem paths (`baselinePath`, `currentPath`, `diffPath`)
+- **Cloud mode** (`mode: "cloud"`): Returns URLs (`baselineUrl`, `currentUrl`, `diffUrl`)
+### Step 3: Analyze Comparison Data
+Examine the comparison details:
+- Diff percentage and threshold
+- Status (failed/new/passed)
+- Image references (paths or URLs depending on mode)
+- Viewport and browser information
+### Step 4: View the Actual Images
+**CRITICAL:** You MUST view the baseline and current images to provide accurate analysis.
+**If mode is "local":**
+- Response contains filesystem paths (`baselinePath`, `currentPath`, `diffPath`)
+- **Use the Read tool to view ONLY baselinePath and currentPath**
+- **DO NOT read diffPath** - it causes API errors
+**If mode is "cloud":**
+- Response contains public URLs (`baselineUrl`, `currentUrl`, `diffUrl`)
+- **Use the WebFetch tool to view ONLY baselineUrl and currentUrl**
+- **DO NOT fetch diffUrl** - it causes API errors
+### Step 5: Provide Detailed Visual Insights
+Based on what you observe in the images:
+**Describe the specific visual differences:**
+- Which UI components, elements, or layouts changed
+- Colors, spacing, typography, positioning changes
+- Missing or added elements
+**Categorize the change by diff percentage:**
+- **< 1%:** Anti-aliasing, font rendering, subpixel differences
+- **1-5%:** Layout shifts, padding/margin changes, color variations
+- **> 5%:** Significant layout changes, missing content, major visual updates
+**Identify possible causes:**
+- CSS changes (margin, padding, positioning)
+- Content changes (text, images)
+- State issues (hover, focus, loading states)
+- Browser/viewport rendering differences
+### Step 6: Suggest Next Steps
+**If local mode:**
+- Whether to accept using `accept_baseline` tool
+- Specific code areas to investigate if unintentional
+- How to fix common issues
+**If cloud mode:**
+- Whether to approve using `approve_comparison` tool
+- Whether to reject using `reject_comparison` tool with reason
+- Team coordination steps
+**If changes are intentional:**
+- Explain why it's safe to accept/approve
+- Confirm this matches expected behavior
+**If changes are unintentional:**
+- Specific files to check (CSS, templates, components)
+- Git commands to investigate recent changes
+- How to reproduce and fix
+## Example Analysis (Local TDD Mode)
+```
+User: "The homepage screenshot is failing"
+Step 1: Call tool
+Tool: read_comparison_details({ identifier: "homepage" })
+Response:
+{
+  "name": "homepage",
+  "status": "failed",
+  "diffPercentage": 2.3,
+  "threshold": 0.1,
+  "mode": "local",
+  "baselinePath": "/Users/you/project/.vizzly/baselines/homepage.png",
+  "currentPath": "/Users/you/project/.vizzly/screenshots/homepage.png"
+}
+Step 2: View images
+Read(baselinePath)
+Read(currentPath)
+Step 3: Visual Analysis
+Comparing the two images, I can see the following changes:
+Visual Differences:
+- The navigation header has shifted down by approximately 10-15 pixels
+- Logo position moved from y:20px to y:35px
+- Navigation menu items are now overlapping with the hero section
+- The "Sign Up" button background changed from blue (#2563eb) to darker blue (#1e40af)
+Root Cause Analysis:
+Based on the 2.3% diff and visual comparison, this appears to be:
+1. A margin or padding change on the header element
+2. The button color change suggests a hover state being captured
+Recommendations:
+If intentional:
+- This appears to be a deliberate redesign of the header spacing
+- Accept as new baseline if these changes match your design updates
+If unintentional:
+- Check recent CSS changes to `.header` or `nav` margin-top/padding-top
+- Verify button states during screenshot capture to avoid hover state
+- Investigate with: `git diff src/styles/header.css`
+- Fix the header positioning and re-run tests
+```
+## Example Analysis (Cloud Mode)
+```
+User: "Can you check comparison cmp_xyz789?"
+Step 1: Call tool
+Tool: read_comparison_details({
+  identifier: "cmp_xyz789",
+  apiToken: "vzt_..."
+})
+Response:
+{
+  "name": "login-page",
+  "status": "failed",
+  "diffPercentage": 0.8,
+  "threshold": 0.1,
+  "mode": "cloud",
+  "baselineUrl": "https://app.vizzly.dev/screenshots/abc123/baseline.png",
+  "currentUrl": "https://app.vizzly.dev/screenshots/abc123/current.png",
+  "comparisonId": "cmp_xyz789"
+}
+Step 2: View images
+WebFetch(baselineUrl, "show me the baseline screenshot")
+WebFetch(currentUrl, "show me the current screenshot")
+Step 3: Visual Analysis
+The differences are very subtle (0.8% diff):
+Visual Differences:
+- Input field border color slightly different (#e5e7eb vs #d1d5db)
+- Submit button has very slight shadow change
+- Font anti-aliasing appears slightly different
+Root Cause Analysis:
+This appears to be browser rendering differences or minor CSS updates:
+- The 0.8% diff suggests this is not a major visual change
+- Could be Chrome version differences
+- Could be sub-pixel rendering variations
+Recommendations:
+This is likely acceptable to approve:
+- The changes are extremely minor (< 1%)
+- No functional UI changes
+- Probably browser/rendering variations
+Approve with: approve_comparison tool
+Or adjust threshold to 1% if these variations are expected
+```
+## Cross-Build Debugging (Cloud Only)
+When debugging visual regressions in cloud mode, you can track a screenshot across multiple builds to identify when changes were introduced.
+### When to Use Search
+Use `search_comparisons` when:
+- The user asks "when did this screenshot start failing?"
+- They want to track a visual change across builds
+- They're investigating a regression that appeared recently
+- They want to see the history of a specific screenshot
+### How to Search
+```javascript
+// Search for all comparisons of a screenshot
+search_comparisons({
+  name: "homepage_desktop",
+  branch: "main",  // optional: filter by branch
+  limit: 10        // optional: limit results
+})
+```
+### Example Workflow
+```
+User: "When did the homepage screenshot start failing?"
+Step 1: Search for the screenshot across builds
+Tool: search_comparisons({ name: "homepage", branch: "main", limit: 10 })
+Response shows 10 comparisons from most recent to oldest, each with:
+- Build name, branch, and creation date
+- Diff percentage and status
+- Screenshot URLs
+Step 2: Analyze the timeline
+- Build #45 (today): 5.2% diff - FAILED
+- Build #44 (yesterday): 5.1% diff - FAILED
+- Build #43 (2 days ago): 0.3% diff - PASSED
+- Build #42 (3 days ago): 0.2% diff - PASSED
+Step 3: Report findings
+"The homepage screenshot started failing between Build #43 and #44
+(approximately 1-2 days ago). The diff jumped from 0.3% to 5.1%,
+suggesting a significant visual change was introduced."
+Step 4: Deep dive into the failing build
+Tool: read_comparison_details({ identifier: "cmp_xyz_from_build44" })
+[View images and provide detailed analysis as usual]
+```
+## Important Notes
+- **Always use `read_comparison_details`** - it automatically detects the mode
+- **Use `search_comparisons` for cloud debugging** - to track changes across builds
+- **Check the `mode` field** to know which viewing tool to use (Read vs WebFetch)
+- **Never view diff images** - only baseline and current
+- **Visual inspection is critical** - don't rely solely on diff percentages
+- **Be specific in analysis** - identify exact elements that changed
+- **Provide actionable advice** - specific files, commands, or tools to use
+- **Consider context** - small diffs might be acceptable, large ones need investigation
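Three rules from the Skill above lend themselves to a short sketch: the diff-percentage categories in Step 5, the choice of viewing tool by `mode` in Step 4, and locating where a screenshot started failing in the cross-build workflow. All function names are hypothetical, and the timeline entry shape (`build`, `status`, newest first) is an assumption based on the example workflow.

```javascript
// Step 5 categories: < 1%, 1-5%, > 5%.
function categorizeDiff(diffPercentage) {
  if (diffPercentage < 1) return 'Anti-aliasing, font rendering, subpixel differences';
  if (diffPercentage <= 5) return 'Layout shifts, padding/margin changes, color variations';
  return 'Significant layout changes, missing content, major visual updates';
}

// Step 4: pick the viewing tool from the mode field and never include the diff image.
function imagesToView(details) {
  if (details.mode === 'local') {
    return { tool: 'Read', targets: [details.baselinePath, details.currentPath] };
  }
  if (details.mode === 'cloud') {
    return { tool: 'WebFetch', targets: [details.baselineUrl, details.currentUrl] };
  }
  throw new Error(`Unknown mode: ${details.mode}`);
}

// Cross-build debugging: given results ordered newest first, walk back through
// the current run of failures to find the oldest failing build and the last
// passing build before it. Returns null when the newest build is not failing.
function findRegressionPoint(timeline) {
  let firstFailing = null;
  let lastPassing = null;
  for (const entry of timeline) {
    if (entry.status === 'failed') {
      firstFailing = entry;
    } else {
      lastPassing = entry;
      break;
    }
  }
  return firstFailing ? { firstFailing, lastPassing } : null;
}

const point = findRegressionPoint([
  { build: 45, status: 'failed' },
  { build: 44, status: 'failed' },
  { build: 43, status: 'passed' },
  { build: 42, status: 'passed' }
]);
console.log(point.firstFailing.build, point.lastPassing.build); // prints "44 43"
```

The percentage bands are heuristics from the Skill text; visual inspection of the baseline and current images still decides the outcome.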

package/dist/services/api-service.js CHANGED

@@ -128,7 +128,37 @@ export class ApiService {
    * @returns {Promise<Object>} Comparison data
    */
   async getComparison(comparisonId) {
-    return this.request(`/api/sdk/comparisons/${comparisonId}`);
+    let response = await this.request(`/api/sdk/comparisons/${comparisonId}`);
+    return response.comparison;
+  }
+  /**
+   * Search for comparisons by name across builds
+   * @param {string} name - Screenshot name to search for
+   * @param {Object} filters - Optional filters (branch, limit, offset)
+   * @param {string} [filters.branch] - Filter by branch name
+   * @param {number} [filters.limit=50] - Maximum number of results (default: 50)
+   * @param {number} [filters.offset=0] - Pagination offset (default: 0)
+   * @returns {Promise<Object>} Search results with comparisons and pagination
+   */
+  async searchComparisons(name, filters = {}) {
+    if (!name || typeof name !== 'string') {
+      throw new VizzlyError('name is required and must be a non-empty string');
+    }
+    let {
+      branch,
+      limit = 50,
+      offset = 0
+    } = filters;
+    const queryParams = new URLSearchParams({
+      name,
+      limit: String(limit),
+      offset: String(offset)
+    });
+    // Only add branch if provided
+    if (branch) queryParams.append('branch', branch);
+    return this.request(`/api/sdk/comparisons/search?${queryParams}`);
   }
   /**
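The `limit` and `offset` filters on `searchComparisons` allow paging through results. A hedged sketch of such a loop follows. The `search` argument stands in for a call like `apiService.searchComparisons(name, filters)`, and the `comparisons` field on each page is an assumption, since the diff only states that the response contains "comparisons and pagination". The name `collectAllComparisons` and the `maxPages` guard are hypothetical.

```javascript
// Collects results page by page until a short page is returned or the
// maxPages guard is reached. `search` is injected so no real API is called.
async function collectAllComparisons(search, name, { branch, limit = 50, maxPages = 10 } = {}) {
  const all = [];
  for (let page = 0; page < maxPages; page++) {
    const result = await search(name, { branch, limit, offset: page * limit });
    // Assumed response field; adjust to the real payload shape.
    const items = result.comparisons ?? [];
    all.push(...items);
    // A page shorter than the limit is taken to mean there are no more results.
    if (items.length < limit) break;
  }
  return all;
}

// Stand-in search function over five in-memory items, for illustration only.
const fakeItems = [1, 2, 3, 4, 5];
const fakeSearch = async (name, { limit, offset }) => ({
  comparisons: fakeItems.slice(offset, offset + limit)
});

collectAllComparisons(fakeSearch, 'homepage', { limit: 2 }).then((items) => {
  console.log(items.length); // prints 5
});
```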

package/dist/services/tdd-service.js CHANGED

@@ -132,10 +132,35 @@ export class TddService {
           logger.warn(`⚠️  Build ${buildId} has status: ${baselineBuild.status} (expected: completed)`);
         }
       } else if (comparisonId) {
-        // Use specific comparison ID
+        // Use specific comparison ID - download only this comparison's baseline screenshot
         logger.info(`📌 Using comparison: ${comparisonId}`);
         const comparison = await this.api.getComparison(comparisonId);
-        baselineBuild = comparison.baselineBuild;
+        // A comparison doesn't have baselineBuild directly - we need to get it
+        // The comparison has baseline_screenshot which contains the build_id
+        if (!comparison.baseline_screenshot) {
+          throw new Error(`Comparison ${comparisonId} has no baseline screenshot. This comparison may be a "new" screenshot with no baseline to compare against.`);
+        }
+        // The original_url might be in baseline_screenshot.original_url or comparison.baseline_screenshot_url
+        let baselineUrl = comparison.baseline_screenshot.original_url || comparison.baseline_screenshot_url;
+        if (!baselineUrl) {
+          throw new Error(`Baseline screenshot for comparison ${comparisonId} has no download URL`);
+        }
+        // For a specific comparison, we only download that one baseline screenshot
+        // Create a mock build structure with just this one screenshot
+        baselineBuild = {
+          id: comparison.baseline_screenshot.build_id || 'comparison-baseline',
+          name: `Comparison ${comparisonId.substring(0, 8)}`,
+          screenshots: [{
+            id: comparison.baseline_screenshot.id,
+            name: comparison.baseline_name || comparison.current_name,
+            original_url: baselineUrl,
+            metadata: {},
+            properties: {}
+          }]
+        };
       } else {
         // Get the latest passed build for this environment and branch
         const builds = await this.api.getBuilds({
@@ -152,10 +177,12 @@ export class TddService {
         baselineBuild = builds.data[0];
       }
-      // For specific buildId, we already have screenshots, otherwise get build details
+      // For specific buildId, we already have screenshots
+      // For comparisonId, we created a mock build with just the one screenshot
+      // Otherwise, get build details with screenshots
       let buildDetails = baselineBuild;
-      if (!buildId) {
-        // Get build details with screenshots for non-buildId cases
+      if (!buildId && !comparisonId) {
+        // Get build details with screenshots for non-buildId/non-comparisonId cases
         const actualBuildId = baselineBuild.id;
         buildDetails = await this.api.getBuild(actualBuildId, 'screenshots');
       }
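
Reviewer note: the `comparisonId` branch in the hunk above can be read as a small pure function. The sketch below is a minimal restatement, not the package's code. The field names (`baseline_screenshot`, `baseline_screenshot_url`, `baseline_name`, `current_name`) come from the diff; the helper name `baselineBuildFromComparison` and the sample data are hypothetical.

```javascript
// Hypothetical helper that mirrors the comparisonId branch shown in the diff.
// It wraps one baseline screenshot in a build-shaped object so the rest of
// the download logic can treat it like a regular build.
function baselineBuildFromComparison(comparison, comparisonId) {
  const shot = comparison.baseline_screenshot;
  if (!shot) {
    // "New" screenshots have nothing to compare against
    throw new Error(`Comparison ${comparisonId} has no baseline screenshot.`);
  }
  // Prefer the URL on the screenshot, fall back to the one on the comparison
  const baselineUrl = shot.original_url || comparison.baseline_screenshot_url;
  if (!baselineUrl) {
    throw new Error(`Baseline screenshot for comparison ${comparisonId} has no download URL`);
  }
  return {
    id: shot.build_id || 'comparison-baseline',
    name: `Comparison ${comparisonId.substring(0, 8)}`,
    screenshots: [{
      id: shot.id,
      name: comparison.baseline_name || comparison.current_name,
      original_url: baselineUrl,
      metadata: {},
      properties: {}
    }]
  };
}

// Made-up sample data, for illustration only
const build = baselineBuildFromComparison({
  baseline_screenshot: { id: 'shot_1', build_id: 'bld_1' },
  baseline_screenshot_url: 'https://example.test/baseline.png',
  current_name: 'homepage'
}, 'cmp_abc123456');
console.log(build.name); // prints "Comparison cmp_abc1"
```

Because the result already carries its `screenshots` array, the later `if (!buildId && !comparisonId)` guard skips the extra `getBuild` call for this case.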

package/dist/types/services/api-service.d.ts CHANGED

@@ -28,6 +28,20 @@ export class ApiService {
      * @returns {Promise<Object>} Comparison data
      */
     getComparison(comparisonId: string): Promise<any>;
+    /**
+     * Search for comparisons by name across builds
+     * @param {string} name - Screenshot name to search for
+     * @param {Object} filters - Optional filters (branch, limit, offset)
+     * @param {string} [filters.branch] - Filter by branch name
+     * @param {number} [filters.limit=50] - Maximum number of results (default: 50)
+     * @param {number} [filters.offset=0] - Pagination offset (default: 0)
+     * @returns {Promise<Object>} Search results with comparisons and pagination
+     */
+    searchComparisons(name: string, filters?: {
+        branch?: string;
+        limit?: number;
+        offset?: number;
+    }): Promise<any>;
     /**
      * Get builds for a project
      * @param {Object} filters - Filter options
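
Reviewer note: a possible call pattern for the new `searchComparisons(name, filters)` declaration, sketched against a stand-in object so it runs without network access. Only the method name, the parameter names, and the documented defaults (`limit=50`, `offset=0`) come from the diff. The stub, the `findOnBranch` helper, and the `{ comparisons, pagination }` response shape are assumptions, since the typings only promise `Promise<any>`.

```javascript
// Stand-in for ApiService. The real method talks to the Vizzly API;
// this stub only echoes the filters back so the call shape can be checked.
const api = {
  async searchComparisons(name, filters = {}) {
    // Defaults follow the JSDoc in the diff: limit=50, offset=0
    const { branch, limit = 50, offset = 0 } = filters;
    return {
      comparisons: [],                 // assumed field name
      pagination: { limit, offset },   // assumed field name
      query: { name, branch }
    };
  }
};

// Hypothetical helper: search one screenshot name on one branch,
// asking for a small page of results.
async function findOnBranch(name, branch) {
  return api.searchComparisons(name, { branch, limit: 10 });
}

findOnBranch('homepage', 'main').then((result) => {
  console.log(result.pagination);
});
```

All three filters are optional in the declaration, so a call with only `name` is also valid and would fall back to the documented defaults.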

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@vizzly-testing/cli",
-  "version": "0.11.1",
+  "version": "0.11.2",
   "description": "Visual review platform for UI developers and designers",
   "keywords": [
     "visual-testing",

package/claude-plugin/commands/debug-diff.md DELETED

@@ -1,153 +0,0 @@
----
-description: Analyze a specific visual regression failure and suggest fixes
----
-# Debug Vizzly Visual Regression
-Analyze a specific failing visual comparison in detail. Supports both local TDD mode and cloud API mode.
-## Process
-1. **Call the unified MCP tool**: Use `read_comparison_details` with the identifier (screenshot name or comparison ID)
-   - The tool automatically detects whether to use local TDD mode or cloud mode
-   - Pass screenshot name (e.g., "homepage_desktop") for local mode
-   - Pass comparison ID (e.g., "cmp_abc123") for cloud mode
-   - Returns a response with `mode` field indicating which mode was used
-2. **Check the mode in the response**:
-   - **Local mode** (`mode: "local"`): Returns filesystem paths (`baselinePath`, `currentPath`, `diffPath`)
-   - **Cloud mode** (`mode: "cloud"`): Returns URLs (`baselineUrl`, `currentUrl`, `diffUrl`)
-3. **Analyze the comparison data**:
-   - Diff percentage and threshold
-   - Status (failed/new/passed)
-   - Image references (paths or URLs depending on mode)
-4. **View the actual images** (critical for visual analysis):
-   Check the `mode` field in the response to determine which tool to use:
-   **If mode is "local":**
-   - Response contains filesystem paths (`baselinePath`, `currentPath`, `diffPath`)
-   - **Use the Read tool to view ONLY baselinePath and currentPath**
-   - **DO NOT read diffPath** - it causes API errors
-   **If mode is "cloud":**
-   - Response contains public URLs (`baselineUrl`, `currentUrl`, `diffUrl`)
-   - **Use the WebFetch tool to view ONLY baselineUrl and currentUrl**
-   - **DO NOT fetch diffUrl** - it causes API errors
-   **IMPORTANT:** You MUST view the baseline and current images to provide accurate analysis
-5. **Provide detailed visual insights** based on what you see:
-   - What type of change was detected (small/moderate/large diff)
-   - Describe the specific visual differences you observe in the images
-   - Identify which UI components, elements, or layouts changed
-   - Possible causes based on diff percentage and visual inspection:
-     - <1%: Anti-aliasing, font rendering, subpixel differences
-     - 1-5%: Layout shifts, padding/margin changes, color variations
-     - > 5%: Significant layout changes, missing content, major visual updates
-6. **Suggest next steps** based on the mode:
-   - **If local mode**: Whether to accept using `accept_baseline` tool
-   - **If cloud mode**: Whether to approve/reject using `approve_comparison` or `reject_comparison` tools
-   - Areas to investigate if unintentional
-   - How to fix common issues
-   - Specific code changes if you can identify them from the visual diff
-## Example Analysis (Local TDD Mode)
-```
-Step 1: Call read_comparison_details with screenshot name
-Tool: read_comparison_details({ identifier: "homepage" })
-Response:
-{
-  "name": "homepage",
-  "status": "failed",
-  "diffPercentage": 2.3,
-  "threshold": 0.1,
-  "mode": "local",
-  "baselinePath": "/Users/you/project/.vizzly/baselines/homepage.png",
-  "currentPath": "/Users/you/project/.vizzly/screenshots/homepage.png",
-  "diffPath": "/Users/you/project/.vizzly/diffs/homepage.png"
-}
-Step 2: Detected mode is "local", so use Read tool for images
-Read(baselinePath) and Read(currentPath)
-Visual Analysis:
-[After reading the baseline and current image files...]
-Comparing the two images, the navigation header has shifted down by approximately 10-15 pixels.
-Specific changes observed:
-- The logo position moved from y:20px to y:35px
-- Navigation menu items are now overlapping with the hero section
-- The "Sign Up" button background changed from blue (#2563eb) to a darker blue (#1e40af)
-Root Cause:
-Based on the visual comparison, this appears to be a margin or padding change on the
-header element. The button color change is likely a hover state being captured.
-Recommendations:
-1. Check for recent CSS changes to:
-   - `.header` or `nav` margin-top/padding-top
-   - Any global layout shifts affecting the header
-2. The button color change suggests a hover state - ensure consistent state during screenshot capture
-3. If the header position change is intentional:
-   - Accept as new baseline using `accept_baseline` tool
-4. If unintentional:
-   - Revert CSS changes to header positioning
-   - Verify with: `git diff src/styles/header.css`
-```
-## Example Analysis (Cloud Mode)
-```
-Step 1: Call read_comparison_details with comparison ID
-Tool: read_comparison_details({
-  identifier: "cmp_xyz789",
-  apiToken: "vzt_..."
-})
-Response:
-{
-  "name": "homepage",
-  "status": "failed",
-  "diffPercentage": 2.3,
-  "threshold": 0.1,
-  "mode": "cloud",
-  "baselineUrl": "https://app.vizzly.dev/screenshots/abc123/baseline.png",
-  "currentUrl": "https://app.vizzly.dev/screenshots/abc123/current.png",
-  "diffUrl": "https://app.vizzly.dev/screenshots/abc123/diff.png",
-  "comparisonId": "cmp_xyz789",
-  "buildId": "bld_abc123"
-}
-Step 2: Detected mode is "cloud", so use WebFetch tool for images
-WebFetch(baselineUrl) and WebFetch(currentUrl)
-Visual Analysis:
-[After fetching the baseline and current image URLs...]
-[Same analysis as local mode example...]
-Recommendations:
-1. [Same technical recommendations as local mode...]
-2. If the header position change is intentional:
-   - Approve this comparison using `approve_comparison` tool
-3. If unintentional:
-   - Reject using `reject_comparison` tool with detailed reason
-   - Have the team fix the CSS changes
-```
-## Important Notes
-- **Unified Tool**: Always use `read_comparison_details` with the identifier - it automatically detects the mode
-- **Mode Detection**: Check the `mode` field in the response to know which viewing tool to use
-- **Image Viewing**:
-  - Local mode → Use Read tool with filesystem paths
-  - Cloud mode → Use WebFetch tool with URLs
-- **Diff Images**: NEVER attempt to view/read/fetch the diff image - it causes API errors
-- **Visual Analysis**: Always view the baseline and current images before providing analysis
-- Visual inspection reveals details that diff percentages alone cannot convey

package/claude-plugin/commands/tdd-status.md DELETED

@@ -1,43 +0,0 @@
----
-description: Check TDD dashboard status and view visual regression test results
----
-# Check Vizzly TDD Status
-Use the Vizzly MCP server to check the current TDD status:
-1. Call the `get_tdd_status` tool from the vizzly MCP server
-2. Analyze the comparison results
-3. Show a summary of:
-   - Total screenshots tested
-   - Passed, failed, and new screenshot counts
-   - List of failed comparisons with diff percentages
-   - Available diff images to inspect
-4. If TDD server is running, provide the dashboard URL
-5. For failed comparisons, provide guidance on next steps
-## Example Output Format
-```
-Vizzly TDD Status:
-✅ Total: 15 screenshots
-✅ Passed: 12
-❌ Failed: 2 (exceeded threshold)
-🆕 New: 1 (no baseline)
-Failed Comparisons:
-- homepage (2.3% diff) - Check .vizzly/diffs/homepage.png
-- login-form (1.8% diff) - Check .vizzly/diffs/login-form.png
-New Screenshots:
-- dashboard (no baseline for comparison)
-Dashboard: http://localhost:47392
-Next Steps:
-- Review diff images to understand what changed
-- Accept baselines from dashboard if changes are intentional
-- Fix visual issues if changes are unintentional
-```
-Focus on providing actionable information to help the developer understand what's failing and why.

/package/claude-plugin/{.claude-plugin/.mcp.json → .mcp.json} RENAMED

File without changes