@zibby/skills 0.1.8 → 0.1.10

Files changed (35)

package/dist/browser.js +2 -2
package/dist/chat-memory.js +15 -15
package/dist/core-tools.js +2 -2
package/dist/function-skill.js +1 -1
package/dist/git.js +2 -2
package/dist/github.js +3 -3
package/dist/index.js +646 -1
package/dist/jira.js +6 -6
package/dist/memory.js +4 -4
package/dist/package.json +16 -11
package/dist/sentry.js +2 -2
package/dist/skill-installer.js +3 -3
package/dist/slack.js +2 -2
package/dist/test-runner.js +13 -13
package/dist/workflow-builder.js +146 -82
package/docs/analysis.md +109 -0
package/docs/cli-reference.md +338 -0
package/docs/cloning-repositories.md +285 -0
package/docs/custom-workflows.md +358 -0
package/docs/getting-started.md +108 -0
package/docs/installation.md +127 -0
package/docs/integrations/github.md +73 -0
package/docs/integrations/jira.md +71 -0
package/docs/intro.md +87 -0
package/docs/packages/cli.md +238 -0
package/docs/packages/core.md +256 -0
package/docs/packages/mcp-browser.md +110 -0
package/docs/packages/memory.md +223 -0
package/docs/packages/skills.md +216 -0
package/docs/reviewing-results.md +114 -0
package/docs/running-tests.md +134 -0
package/docs/triggering-workflows.md +552 -0
package/docs/workflow-artifact-layout-evaluation.md +119 -0
package/docs/workflow.md +558 -0
package/package.json +7 -2

package/docs/packages/mcp-browser.md ADDED

@@ -0,0 +1,110 @@
+---
+sidebar_position: 4
+title: "@zibby/mcp-browser"
+---
+# @zibby/mcp-browser
+Wrapper around [`@playwright/mcp`](https://github.com/microsoft/playwright-mcp) with stable ID injection, event recording, and session-aware video capture.
+```bash
+npm install @zibby/mcp-browser
+```
+> Installed automatically as a dependency of `@zibby/core`.
+## Why a Wrapper?
+The official `@playwright/mcp` provides standard browser automation tools. `@zibby/mcp-browser` adds:
+| Feature | Description |
+|---|---|
+| **Stable IDs** | Injects deterministic `data-zibby-id` attributes into the DOM, so selectors survive page re-renders |
+| **Event Recording** | Captures every MCP tool call (navigate, click, type) with timestamps for the action timeline |
+| **Session Awareness** | Reads `ZIBBY_SESSION_INFO` to determine which session and node are active, then saves videos and events to the matching directory |
+| **Video Capture** | Records browser sessions as video with configurable resolution |
+## Usage
+### As an MCP Server (standalone)
+```bash
+npx mcp-browser-zibby --save-video=1280x720 --viewport-size=1280x720 --output-dir=./output
+```
+### Via the Browser Skill (recommended)
+The browser skill in `@zibby/skills` resolves to `@zibby/mcp-browser` automatically:
+```javascript
+import { SKILLS } from '@zibby/core';
+export const myNode = {
+  name: 'my_node',
+  skills: [SKILLS.BROWSER],
+  // The framework starts @zibby/mcp-browser as an MCP server
+  // and wires it to whichever agent is active
+};
+```
+### Programmatic
+```javascript
+import { startServer } from '@zibby/mcp-browser';
+const server = await startServer({
+  saveVideo: '1280x720',
+  viewportSize: '1280x720',
+  outputDir: './test-results',
+});
+```
+## Stable IDs
+Standard CSS selectors break when the DOM changes between renders. Stable IDs solve this by injecting deterministic attributes based on element role, position, and content:
+```html
+<!-- Before stable ID injection -->
+<button class="btn-primary sc-fEOsli">Submit</button>
+<!-- After stable ID injection -->
+<button class="btn-primary sc-fEOsli" data-zibby-id="form-submit-button">Submit</button>
+```
+The stable ID algorithm considers:
+- Element tag and ARIA role
+- Accessible name and label
+- Position relative to landmarks
+- Content hash for disambiguation
+This gives the AI and generated Playwright scripts more reliable selectors.
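The inputs listed above can be combined into an ID along these lines. This is a minimal sketch under stated assumptions, not the actual `@zibby/mcp-browser` algorithm: the `stableId` function, the element descriptor fields (`tag`, `role`, `name`, `landmark`, `content`), and the choice of an FNV-1a hash are all hypothetical.

```javascript
// Hypothetical sketch: not the actual @zibby/mcp-browser implementation.

// FNV-1a 32-bit hash, used here only to disambiguate elements by content.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Lowercase the text and collapse anything non-alphanumeric into hyphens.
function slug(text) {
  return String(text || '')
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

// Build an ID from landmark, accessible name, and role (or tag).
// A content hash is appended only when the base ID was already used.
function stableId(el, seen = new Set()) {
  const base = [el.landmark, el.name, el.role || el.tag]
    .map(slug)
    .filter(Boolean)
    .join('-');
  let id = base;
  if (seen.has(id)) id = `${base}-${fnv1a(el.content || '')}`;
  seen.add(id);
  return id;
}

console.log(stableId({ tag: 'button', role: 'button', name: 'Submit', landmark: 'form' }));
// → "form-submit-button"
```

Because the ID depends only on semantic properties and not on generated class names such as `sc-fEOsli`, the same element receives the same ID after a re-render.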
+## Event Recording
+Every MCP tool call is recorded as a structured event:
+```json
+{
+  "timestamp": 1710784523000,
+  "node": "execute_live",
+  "tool": "browser_click",
+  "arguments": { "element": "Submit button", "ref": "s1e45" },
+  "result": "Clicked",
+  "duration": 234
+}
+```
+Events are saved to `{sessionPath}/{nodeName}/events.json` and used by:
+- The **action timeline** in the Zibby dashboard
+- The **generate_script** node to produce accurate Playwright scripts
+- The **memory** system to record selector usage
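One way to produce events of this shape is to wrap each tool handler. The sketch below is illustrative only: `recordToolCalls`, its options object, and the injectable `now` clock are assumptions, and the real recorder may differ.

```javascript
// Hypothetical sketch: wraps a tool handler so every call appends one event.
function recordToolCalls(toolName, handler, { node, events, now = Date.now }) {
  return async (args) => {
    const timestamp = now();
    const result = await handler(args);
    events.push({
      timestamp,                      // when the call started (ms since epoch)
      node,                           // workflow node that issued the call
      tool: toolName,
      arguments: args,
      result,
      duration: now() - timestamp,    // elapsed milliseconds
    });
    return result;
  };
}
```

A caller would wrap a handler once, for example `recordToolCalls('browser_click', clickHandler, { node: 'execute_live', events })`, and later serialize `events` to `events.json`. This sketch records successful calls only; a fuller version would presumably also capture errors.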
+## CLI Options
+| Flag | Default | Description |
+|---|---|---|
+| `--save-video=WxH` | — | Record video at specified resolution |
+| `--viewport-size=WxH` | `1280x720` | Browser viewport size |
+| `--output-dir=PATH` | `test-results/` | Where to save video and event files |
+| `--headed` | `false` | Show the browser window |
+| `--headless` | `true` | Run without a visible window |

package/docs/packages/memory.md ADDED

@@ -0,0 +1,223 @@
+---
+sidebar_position: 3
+title: "@zibby/memory"
+---
+# @zibby/memory
+Version-controlled test memory database powered by [Dolt](https://www.dolthub.com/). Learns from every test run — selectors that worked, pages that were visited, patterns that failed, workarounds that helped.
+```bash
+npm install @zibby/memory
+```
+## Why Memory?
+Without memory, every test run starts from scratch. The AI has no idea which selectors are stable, which pages have changed, or what workarounds were discovered in previous runs.
+With `@zibby/memory`:
+- **Selectors** — the AI knows which CSS/XPath selectors are reliable and which are flaky
+- **Page models** — the AI has a map of page elements, roles, and structure before it even navigates
+- **Navigation patterns** — the AI knows which URL transitions are valid
+- **Test history** — the AI sees pass/fail trends and avoids repeating past failures
+- **Insights** — the AI reads and writes tips (timing quirks, workarounds, selector alternatives)
+## Setup
+### 1. Install Dolt
+Dolt is a version-controlled SQL database (Git for data):
+```bash
+# macOS
+brew install dolt
+# Linux
+sudo bash -c 'curl -L https://github.com/dolthub/dolt/releases/latest/download/install.sh | bash'
+```
+### 2. Initialize Memory Database
+```bash
+zibby init --mem
+```
+This creates a Dolt database at `.zibby/memory/` with the schema for test runs, selectors, page models, navigation, and insights.
+### 3. Enable Memory in Your Workflow
+Add `SKILLS.MEMORY` to any node that should have memory access:
+```javascript
+import { SKILLS } from '@zibby/core';
+export const executeLiveNode = {
+  name: 'execute_live',
+  skills: [SKILLS.BROWSER, SKILLS.MEMORY],
+  // ...
+};
+```
+The built-in `execute_live` node has memory enabled by default.
+## How It Works
+### During a Test Run
+1. **Before execution** — the memory middleware loads relevant history:
+   - Previous runs for this spec (pass/fail, timing)
+   - Known selectors for the target pages
+   - Saved insights and tips
+2. **During execution** — the AI can call memory tools:
+   - `memory_get_selectors` to find stable selectors
+   - `memory_get_page_model` to understand page structure
+   - `memory_save_insight` to record a finding
+3. **After execution** — the result handler persists new data:
+   - Test result (pass/fail, duration)
+   - Selectors used and their success/failure
+   - Page model updates
+   - Navigation transitions discovered
+### Version Control
+Every persist operation creates a Dolt commit. You can:
+```bash
+# View memory history
+cd .zibby/memory
+dolt log
+# Diff between runs
+dolt diff HEAD~1 HEAD
+# Branch for experiments
+dolt branch experiment
+dolt checkout experiment
+```
+## Database Schema
+### `test_runs`
+| Column | Type | Description |
+|---|---|---|
+| `session_id` | VARCHAR | Unique session identifier |
+| `spec_path` | VARCHAR | Path to the test spec file |
+| `passed` | BOOLEAN | Whether the test passed |
+| `duration_ms` | INT | Total execution time |
+| `agent_type` | VARCHAR | Which agent ran the test |
+| `created_at` | DATETIME | Timestamp |
+### `selectors`
+| Column | Type | Description |
+|---|---|---|
+| `page_url` | VARCHAR | URL where this selector was used |
+| `selector` | VARCHAR | The CSS/XPath selector string |
+| `stable_id` | VARCHAR | Zibby stable ID (if available) |
+| `success_count` | INT | Times this selector worked |
+| `fail_count` | INT | Times this selector failed |
+| `last_used` | DATETIME | Last usage timestamp |
+### `page_model`
+| Column | Type | Description |
+|---|---|---|
+| `url` | VARCHAR | Page URL |
+| `element_role` | VARCHAR | ARIA role |
+| `element_name` | VARCHAR | Accessible name |
+| `selector` | VARCHAR | Best known selector |
+| `updated_at` | DATETIME | Last update |
+### `navigation`
+| Column | Type | Description |
+|---|---|---|
+| `from_url` | VARCHAR | Source page URL |
+| `to_url` | VARCHAR | Destination page URL |
+| `trigger` | VARCHAR | What caused the navigation (click, submit, etc.) |
+| `count` | INT | Times this transition was observed |
+### `insights`
+| Column | Type | Description |
+|---|---|---|
+| `category` | ENUM | `selector_tip`, `timing`, `navigation`, `workaround`, `flaky`, `general` |
+| `content` | TEXT | The insight text |
+| `spec_path` | VARCHAR | Related spec |
+| `session_id` | VARCHAR | Session that created it |
+| `created_at` | DATETIME | Timestamp |
+## MCP Tools
+The memory MCP server exposes five tools:
+```
+memory_get_test_history   — Query recent test runs (filter by spec path)
+memory_get_selectors      — Query selectors with stability metrics (filter by page URL)
+memory_get_page_model     — Query page elements and roles (filter by URL)
+memory_get_navigation     — Query page-to-page transitions (filter by source URL)
+memory_save_insight       — Save a useful observation for future runs
+```
+### Example: AI Querying Memory
+During execution, the AI might call:
+```json
+{
+  "tool": "memory_get_selectors",
+  "input": { "pageUrl": "myapp.com/login", "limit": 10 }
+}
+```
+And receive:
+```json
+[
+  { "selector": "[data-testid='email']", "success_count": 12, "fail_count": 0 },
+  { "selector": "#login-email", "success_count": 8, "fail_count": 3 },
+  { "selector": "input[name='email']", "success_count": 5, "fail_count": 1 }
+]
+```
+The AI then prefers `[data-testid='email']` because it has the highest success rate.
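The preference can be made explicit by ranking rows on their observed success rate. This is a sketch of the idea, not code from `@zibby/memory`; the `rankSelectors` helper and its tie-breaking rule are assumptions.

```javascript
// Hypothetical sketch: rank selector rows by success rate.
function rankSelectors(rows) {
  const rate = (row) => {
    const total = row.success_count + row.fail_count;
    return total === 0 ? 0 : row.success_count / total;
  };
  // Highest rate first; ties go to the selector with more recorded successes.
  return [...rows].sort(
    (a, b) => rate(b) - rate(a) || b.success_count - a.success_count
  );
}

const ranked = rankSelectors([
  { selector: "[data-testid='email']", success_count: 12, fail_count: 0 },
  { selector: '#login-email', success_count: 8, fail_count: 3 },
  { selector: "input[name='email']", success_count: 5, fail_count: 1 },
]);
console.log(ranked.map((row) => row.selector));
// → [ "[data-testid='email']", "input[name='email']", '#login-email' ]
```

With the sample response, the rates are 12/12, 5/6, and 8/11, so `#login-email` ranks last despite having more successes than `input[name='email']`.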
+## Middleware Integration
+Memory provides automatic middleware that injects history into the node context:
+```javascript
+import { createMemoryMiddleware } from '@zibby/memory';
+const middleware = createMemoryMiddleware();
+// Assumes WorkflowGraph is already in scope; its import is not shown here.
+const graph = new WorkflowGraph({ middleware: [middleware] });
+```
+The memory skill registers this middleware automatically when `SKILLS.MEMORY` is declared on a node.
+## CLI Commands
+```bash
+# Initialize memory database
+zibby init --mem
+# View memory stats
+zibby memory status
+# Sync memory (push to Dolt remote)
+zibby memory sync
+```
+## Exports
+```javascript
+import {
+  createMemoryMiddleware,
+  memoryEndRun,
+  memorySyncPush,
+} from '@zibby/memory';
+```

package/docs/packages/skills.md ADDED

@@ -0,0 +1,216 @@
+---
+sidebar_position: 2
+title: "@zibby/skills"
+---
+# @zibby/skills
+Built-in skill definitions for Zibby's test automation framework.
+```bash
+npm install @zibby/skills
+```
+> Installed automatically as a dependency of `@zibby/cli`.
+## What Are Skills?
+A **skill** is a declarative description of an MCP (Model Context Protocol) server and the tools it exposes. Skills are the bridge between your workflow nodes and external capabilities like browser automation, Jira, GitHub, Slack, and test memory.
+Skills are **agent-agnostic** — the same skill definition works across Cursor, Claude, and Codex. The framework resolves the skill into the right MCP configuration for whichever agent is active.
+## Built-in Skills
+| Skill ID | MCP Server | Tools Provided |
+|---|---|---|
+| `browser` | `@zibby/mcp-browser` / `@playwright/mcp` | Browser navigation, clicking, typing, snapshots, video |
+| `memory` | `@zibby/mcp-memory` | Test history, selector stability, page model, save insights |
+| `jira` | `@zibby/mcp-jira` | Read/write Jira tickets |
+| `github` | GitHub MCP server | Repository access, PR creation |
+| `slack` | Slack MCP server | Send notifications, post results |
+## Using Skills in Nodes
+Declare skills in a node definition:
+```javascript
+import { SKILLS } from '@zibby/core';
+export const executeLiveNode = {
+  name: 'execute_live',
+  skills: [SKILLS.BROWSER, SKILLS.MEMORY],
+  prompt: (state) => `Execute the test: ${state.testSpec}`,
+  outputSchema: ExecutionSchema,
+};
+```
+When the workflow runs:
+1. The framework reads the node's `skills` array
+2. For each skill, calls `skill.resolve()` to get the MCP server config
+3. Injects the resolved MCP server into the agent's environment
+4. Appends the skill's `promptFragment` to the prompt (if defined)
+5. Runs skill middleware (if defined)
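Steps 1 through 4 can be sketched roughly as follows. The `prepareNode` function, the `Map`-based registry, string skill IDs, and the returned `{ prompt, mcpServers }` shape are assumptions made for illustration; the framework's internals may look different.

```javascript
// Hypothetical sketch of steps 1-4; names and return shape are assumptions.
function prepareNode(node, registry, context) {
  const mcpServers = {};
  let prompt = node.prompt(context.state);

  for (const id of node.skills || []) {            // step 1: read the skills array
    const skill = registry.get(id);
    if (!skill) throw new Error(`Unknown skill: ${id}`);

    const config = skill.resolve(context);         // step 2: resolve the MCP config
    if (config) {
      // step 3: expose the server to the agent (null means no server is needed)
      mcpServers[skill.serverName || skill.id] = config;
    }
    if (skill.promptFragment) {
      prompt += `\n\n${skill.promptFragment}`;     // step 4: append the fragment
    }
  }
  return { prompt, mcpServers };
}
```

A skill whose `resolve()` returns `null` still contributes its prompt fragment here, which matches how a middleware-only skill would behave under these assumptions.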
+## Skill Anatomy
+Every skill has this shape:
+```javascript
+{
+  id: 'browser',                           // Unique identifier
+  type: 'mcp',                             // 'mcp' or 'function'
+  serverName: 'playwright',                // MCP server name
+  allowedTools: ['mcp__playwright__*'],     // Tool patterns for Claude SDK
+  cursorKey: 'playwright-official',         // Key in ~/.cursor/mcp.json
+  sessionEnvKey: 'ZIBBY_SESSION_INFO',     // Env var with session path
+  envKeys: [],                             // Required env vars
+  description: 'Playwright Browser MCP',
+  // Prompt text appended to every node that uses this skill
+  promptFragment: 'Execute using browser tools...',
+  // Returns MCP server config { command, args, env }
+  resolve({ sessionPath, workspace }) {
+    return {
+      command: 'node',
+      args: ['/path/to/mcp-server.js', '--output-dir', sessionPath],
+    };
+  },
+  // Optional: middleware factory (called once per graph run)
+  async middleware() {
+    return async (nodeName, next, stateValues, state) => {
+      // Pre-node logic (e.g., load test history)
+      const result = await next();
+      // Post-node logic (e.g., persist insights)
+      return result;
+    };
+  },
+  // Tool schemas for compile-time validation
+  tools: [
+    { name: 'tool_name', description: '...', input_schema: { ... } }
+  ],
+}
+```
+## Creating Custom Skills
+### MCP Skill (wraps an external MCP server)
+```javascript
+import { skill } from '@zibby/skills';
+export const linear = skill('linear', {
+  description: 'Linear issue tracker',
+  serverName: 'linear',
+  allowedTools: ['mcp__linear__*'],
+  envKeys: ['LINEAR_API_KEY'],
+  resolve() {
+    if (!process.env.LINEAR_API_KEY) return null;
+    return {
+      command: 'npx',
+      args: ['-y', '@anthropic/linear-mcp-server'],
+      env: { LINEAR_API_KEY: process.env.LINEAR_API_KEY },
+    };
+  },
+});
+```
+Use it in a node:
+```javascript
+graph.addNode('create_issue', {
+  name: 'create_issue',
+  skills: ['linear'],
+  prompt: (state) => `Create a Linear issue for: ${state.bugReport}`,
+  outputSchema: IssueSchema,
+});
+```
+### Function Skill (single tool, auto-bridged to MCP)
+For simple tools that don't need a full MCP server:
+```javascript
+import { skill } from '@zibby/skills';
+export const calculator = skill('calculator', {
+  description: 'Perform arithmetic calculations',
+  input: {
+    expression: 'string',
+  },
+  handler: async ({ expression }) => {
+    // Simplified example: eval() runs arbitrary code, so never pass it untrusted input.
+    const result = eval(expression);
+    return { result: String(result) };
+  },
+});
+```
+The framework automatically spawns a lightweight MCP bridge server for function skills at runtime.
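Because the calculator handler receives input chosen by an AI agent, a handler that avoids `eval` is safer. The following is one possible replacement, sketched as a small recursive-descent parser for `+`, `-`, `*`, `/`, and parentheses; `evaluateArithmetic` is a hypothetical helper and not part of `@zibby/skills`.

```javascript
// Illustrative alternative to eval(): a tiny arithmetic parser.
function evaluateArithmetic(expression) {
  const tokens = expression.match(/\d+(?:\.\d+)?|[-+*/()]/g) || [];
  // Reject anything that is not a number, operator, parenthesis, or whitespace.
  if (tokens.join('') !== expression.replace(/\s+/g, '')) {
    throw new SyntaxError('Unsupported characters in expression');
  }
  let pos = 0;

  // expression := term (('+' | '-') term)*
  function parseExpression() {
    let value = parseTerm();
    while (tokens[pos] === '+' || tokens[pos] === '-') {
      const op = tokens[pos++];
      const right = parseTerm();
      value = op === '+' ? value + right : value - right;
    }
    return value;
  }

  // term := factor (('*' | '/') factor)*
  function parseTerm() {
    let value = parseFactor();
    while (tokens[pos] === '*' || tokens[pos] === '/') {
      const op = tokens[pos++];
      const right = parseFactor();
      value = op === '*' ? value * right : value / right;
    }
    return value;
  }

  // factor := number | '(' expression ')' | '-' factor
  function parseFactor() {
    const token = tokens[pos++];
    if (token === '-') return -parseFactor();
    if (token === '(') {
      const value = parseExpression();
      if (tokens[pos++] !== ')') throw new SyntaxError('Expected ")"');
      return value;
    }
    if (token === undefined || !/^\d/.test(token)) {
      throw new SyntaxError('Expected a number');
    }
    return Number(token);
  }

  const result = parseExpression();
  if (pos !== tokens.length) throw new SyntaxError('Unexpected trailing input');
  return result;
}

console.log(evaluateArithmetic('1 + 2 * 3')); // → 7
```

The handler could then return `{ result: String(evaluateArithmetic(expression)) }`, and input such as `process.exit()` is rejected with a `SyntaxError` instead of being executed.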
+### Skill with Middleware
+Middleware runs before and after every node that uses the skill:
+```javascript
+import { skill } from '@zibby/skills';
+export const audit = skill('audit', {
+  description: 'Audit logging',
+  resolve() { return null; }, // No MCP server needed
+  async middleware() {
+    return async (nodeName, next, stateValues, state) => {
+      console.log(`[audit] Node ${nodeName} starting`);
+      const startTime = Date.now();
+      const result = await next();
+      console.log(`[audit] Node ${nodeName} completed in ${Date.now() - startTime}ms`);
+      return result;
+    };
+  },
+});
+```
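The `next` callback in these examples implies that middleware functions are nested around the node. A minimal sketch of such a chain is shown here; `runWithMiddleware` and the ordering (first middleware outermost) are assumptions, not the documented behavior of the framework.

```javascript
// Hypothetical sketch: each middleware wraps the next; the node runs innermost.
async function runWithMiddleware(middlewares, nodeName, runNode, stateValues, state) {
  const dispatch = (index) =>
    index === middlewares.length
      ? runNode()
      : middlewares[index](nodeName, () => dispatch(index + 1), stateValues, state);
  return dispatch(0);
}
```

Under this ordering, pre-node logic runs in declaration order and post-node logic runs in reverse, so an audit middleware listed first would measure the time spent in every middleware after it as well as in the node.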
+## Browser Skill Details
+The browser skill resolves to `@zibby/mcp-browser` if installed, otherwise falls back to `@playwright/mcp`:
+```javascript
+// Resolution priority:
+// 1. MCP_BROWSER_PATH env var
+// 2. @zibby/mcp-browser (enhanced: stable IDs, event recording)
+// 3. @playwright/mcp (official Playwright server, used as the fallback)
+```
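The priority order could be expressed as a small function. This is only a sketch: `pickBrowserServer`, its return shape, and the injected `isInstalled` predicate are hypothetical and do not reflect the skill's actual resolver.

```javascript
// Hypothetical sketch of the resolution priority described above.
function pickBrowserServer(env, isInstalled) {
  // 1. An explicit path always wins.
  if (env.MCP_BROWSER_PATH) {
    return { source: 'env', target: env.MCP_BROWSER_PATH };
  }
  // 2. Prefer the enhanced wrapper when it is available.
  if (isInstalled('@zibby/mcp-browser')) {
    return { source: 'package', target: '@zibby/mcp-browser' };
  }
  // 3. Otherwise fall back to the plain Playwright server.
  return { source: 'fallback', target: '@playwright/mcp' };
}
```

In Node.js, `isInstalled` could be implemented by calling `require.resolve(name)` inside a `try`/`catch`; passing it in as a parameter keeps the sketch testable without touching the file system.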
+Default configuration:
+- Video resolution: 1280x720
+- Viewport: 1280x720
+- Output directory: session path or `test-results/`
+## Memory Skill Details
+The memory skill provides five tools:
+| Tool | Description |
+|---|---|
+| `memory_get_test_history` | Query recent test runs with pass/fail results |
+| `memory_get_selectors` | Query known selectors with stability metrics |
+| `memory_get_page_model` | Query page structure — elements, roles, selectors |
+| `memory_get_navigation` | Query known page-to-page transitions |
+| `memory_save_insight` | Save observations for future runs (selector tips, timing, workarounds) |
+The memory skill also includes middleware that automatically loads relevant test history before node execution.
+## Exports
+```javascript
+import { SKILLS } from '@zibby/skills';
+// SKILLS.BROWSER, SKILLS.JIRA, SKILLS.GITHUB, SKILLS.SLACK, SKILLS.MEMORY
+import { browserSkill, jiraSkill, githubSkill, slackSkill, memorySkill } from '@zibby/skills';
+import { skill, functionSkill } from '@zibby/skills';
+// Re-exported from @zibby/core
+import { registerSkill, getSkill, hasSkill, getAllSkills, listSkillIds } from '@zibby/skills';
+```

package/docs/reviewing-results.md ADDED

@@ -0,0 +1,114 @@
+---
+sidebar_position: 7
+title: Reviewing Results
+---
+# Reviewing Results
+After running tests with `--sync`, all results are available in the Zibby dashboard for review.
+## Test Runs
+Go to your project's **Test Runs** page to see all executions.
+Each run shows:
+- Run name and timestamp
+- Pass/fail status
+- Video thumbnail (hover to preview)
+- Source (local spec or cloud test case)
+## Video Replay
+Click on a test run to open the full replay view:
+### Video Player
+- Full video recording of the browser session
+- Playback controls (play, pause, seek)
+- Speed control (0.5x, 1x, 1.5x, 2x)
+- Cinematic mode for full-screen viewing
+### Action Timeline
+A timeline below the video shows every action the AI performed:
+- **Navigate** — page navigations
+- **Click** — button and link clicks
+- **Type** — keyboard input
+- **Assert** — verification checks
+- **Screenshot** — captured screenshots
+Click any action to jump to that point in the video.
+### Events Sidebar
+A detailed list of all events with:
+- Timestamps
+- Action descriptions
+- Element selectors
+- AI-generated captions explaining each action
+## Generated Scripts
+The **Scripts** tab shows the auto-generated Playwright test script:
+```javascript
+const { test, expect } = require('@playwright/test');
+test('User Login Flow', async ({ page }) => {
+  await page.goto('https://example.com/login');
+  await page.fill('[data-testid="email"]', 'test@example.com');
+  await page.fill('[data-testid="password"]', 'TestPass123');
+  await page.click('[data-testid="submit"]');
+  await expect(page).toHaveURL('https://example.com/dashboard');
+});
+```
+Click **Copy** to copy the script to your clipboard and add it to your test suite.
+## Test Case Details
+The **Test Case** tab displays the original test specification that was executed, with syntax highlighting.
+## Analysis Results
+For ticket analyses, the analysis page provides:
+### Code Diff Viewer
+- Side-by-side diff of all code changes
+- File tree navigation
+- **Create Pull Request** button to push changes to GitHub
+### Test Cases
+- Structured test cases with steps, data, and expected outcomes
+- Editable — refine test cases before running them
+- Auto-saves changes
+### Pipeline View
+Live status of each analysis step:
+| Step | Description |
+|---|---|
+| Setup | Repository cloned and initialized |
+| Analyze Ticket | AI reads and understands the ticket |
+| Generate Code | Code changes produced |
+| Generate Test Cases | Test cases written |
+| Finalize | Report compiled and uploaded |
+Each step shows logs that can be expanded for debugging.
+## Collections
+Group related runs into collections:
+- View collections in the **Collections** sidebar
+- Each collection can have subfolders
+- Drag runs between collections
+- Filter and search within collections
+## Sharing Results
+Share a test run with your team by copying the URL from the browser. Anyone with access to the project can view the results.