npm - @zibby/skills - Versions diffs - 0.1.11 → 0.1.12 - Mend

@zibby/skills 0.1.11 → 0.1.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

package/dist/package.json +2 -2
package/docs/cli-reference.md +120 -256
package/docs/cloning-repositories.md +2 -2
package/docs/cloud/bundles.md +92 -0
package/docs/cloud/dedicated-egress.md +140 -0
package/docs/cloud/env-vars.md +144 -0
package/docs/cloud/limits.md +81 -0
package/docs/cloud/logs.md +104 -0
package/docs/cloud/triggering.md +114 -0
package/docs/concepts/agents.md +112 -0
package/docs/concepts/graph.md +83 -0
package/docs/concepts/sessions.md +70 -0
package/docs/concepts/skills.md +84 -0
package/docs/concepts/state.md +106 -0
package/docs/get-started/deploy.md +75 -0
package/docs/get-started/install.md +58 -0
package/docs/get-started/run-locally.md +94 -0
package/docs/get-started/trigger-and-logs.md +90 -0
package/docs/get-started/your-first-workflow.md +66 -0
package/docs/intro.md +37 -65
package/docs/legacy/test-automation.md +110 -0
package/docs/packages/agent-workflow.md +88 -0
package/docs/packages/cli.md +42 -207
package/docs/packages/core.md +40 -224
package/docs/recipes/index.md +62 -0
package/docs/recipes/test.md +154 -0
package/package.json +2 -2

package/docs/packages/core.md CHANGED

@@ -1,256 +1,72 @@
 ---
-sidebar_position: 1
-title: "@zibby/core"
+sidebar_position: 2
+title: '@zibby/core'
 ---
 # @zibby/core
-Core test automation engine with multi-agent and multi-MCP support.
+[![npm](https://img.shields.io/npm/v/@zibby/core.svg)](https://www.npmjs.com/package/@zibby/core)
+The batteries — five built-in agent strategies, the runtime, and a re-export of the `@zibby/agent-workflow` engine. This is what `zibby workflow new` scaffolds workflows against.
 ```bash
 npm install @zibby/core
 ```
-> Most users don't install this directly — `@zibby/cli` depends on it automatically.
-## What's Inside
+## What's in it
-`@zibby/core` is the runtime heart of Zibby. It provides:
+### Built-in agent strategies
-| Module | Purpose |
+| Strategy | Backed by |
 |---|---|
-| **Workflow Framework** | `WorkflowGraph`, `Node`, `WorkflowState`, graph compiler |
-| **Agent Strategies** | `CursorAgentStrategy`, `ClaudeAgentStrategy`, `CodexAgentStrategy` |
-| **Skill Registry** | `registerSkill`, `getSkill`, `listSkillIds` |
-| **Graph Compiler** | `compileGraph`, `validateGraphConfig` — JSON → executable graph |
-| **Tool Resolver** | Maps node skill declarations to concrete MCP tool permissions |
-| **Code Generator** | `generateWorkflowCode` — serializes graphs to code |
-| **Templates** | Built-in workflow templates (browser-test-automation, code-analysis) |
-| **Enrichment Pipeline** | DOM enrichers, accessibility enrichers, page-state enrichers |
-| **Runtime** | Stable-ID runtime, verification strategies, Playwright integration |
-| **Sync** | Cloud upload/download for test results |
-## Exports
-```javascript
-// Framework
-import {
-  WorkflowGraph,
-  Node,
-  ConditionalNode,
-  WorkflowState,
-  OutputParser,
-  compileGraph,
-  validateGraphConfig,
-  extractSteps,
-} from '@zibby/core';
+| `CursorAgentStrategy` | `cursor-agent` CLI |
+| `ClaudeAgentStrategy` | `@anthropic-ai/claude-agent-sdk` |
+| `CodexAgentStrategy` | `@openai/codex` CLI |
+| `GeminiAgentStrategy` | `@google/gemini-cli` |
+| `AssistantStrategy` | OpenAI Assistants API |
-// Agent strategies
-import {
-  invokeAgent,
-  getAgentStrategy,
-  CursorAgentStrategy,
-  ClaudeAgentStrategy,
-  AgentStrategy,
-} from '@zibby/core';
-// Skill registry
-import {
-  registerSkill,
-  getSkill,
-  hasSkill,
-  getAllSkills,
-  listSkillIds,
-} from '@zibby/core';
-// Zod (re-exported for convenience)
-import { z } from '@zibby/core';
+Importing `@zibby/core` registers all five into the `@zibby/agent-workflow` strategy registry automatically. So:
-// Skills enum
-import { SKILLS } from '@zibby/core';
-```
-## WorkflowGraph
-The central class. Create a graph, add nodes and edges, then run it.
-```javascript
+```js
 import { WorkflowGraph } from '@zibby/core';
-import { z } from '@zibby/core';
-const graph = new WorkflowGraph({
-  stateSchema: z.object({        // Optional: validates initial state
-    testSpec: z.string(),
-    cwd: z.string().optional(),
-  }),
-  middleware: [],                 // Optional: wrap every node execution
-});
-graph.addNode('step_a', myNode);
-graph.addEdge('step_a', 'step_b');
-graph.setEntryPoint('step_a');
-const result = await graph.run(agent, { testSpec: '...' });
-```
-**Key methods:**
-| Method | Description |
-|---|---|
-| `addNode(name, config, options?)` | Add an executable node |
-| `addConditionalNode(name, config)` | Add a node that routes based on state |
-| `addEdge(from, to)` | Linear connection between nodes |
-| `addConditionalEdges(from, routeFn)` | Branch based on state — `routeFn(state) => 'next_node'` |
-| `setEntryPoint(name)` | Declare which node runs first |
-| `setStateSchema(zodSchema)` | Validate initial state before execution |
-| `use(middlewareFn)` | Add graph-level middleware |
-| `serialize()` | Export graph as JSON (for dashboard visual editor) |
-| `run(agent, initialState)` | Execute the graph |
-## Node Definition
-A node is a plain object with `name`, `prompt`, and `outputSchema`:
-```javascript
-import { z } from '@zibby/core';
-export const myNode = {
-  name: 'my_node',
-  // Prompt: function receiving state, returns string
-  prompt: (state) => `Analyze: ${state.testSpec}`,
-  // Output schema: Zod object validated at runtime
-  outputSchema: z.object({
-    title: z.string().describe('Short title'),
-    confidence: z.number().min(0).max(1),
-  }),
-  // Optional: skills needed (MCP tools)
-  skills: ['browser', 'memory'],
-  // Optional: timeout in ms (default 300000)
-  timeout: 600000,
-  // Optional: retry count
-  retries: 1,
-  // Optional: post-processing hook
-  onComplete: async (state, result) => {
-    // Transform result before it goes into state
-    return { ...result, processedAt: Date.now() };
-  },
-};
-```
-For nodes that don't call an LLM (pure data transforms), use a custom `execute` function:
-```javascript
-export const transformNode = {
-  name: 'transform',
-  _isCustomCode: true,
-  outputSchema: z.object({ cleaned: z.string() }),
-  execute: async (context) => {
-    const raw = context.state.get('raw_data');
-    return { cleaned: raw.trim().toLowerCase() };
-  },
-};
+const graph = new WorkflowGraph()
+  .addNode('plan', { prompt, outputSchema, agent: 'cursor' })   // already registered
+  .setEntryPoint('plan');
 ```
-## Agent Strategy Framework
+### Re-exports from agent-workflow
-All agents implement the same `AgentStrategy` base class:
+So you can do `import { WorkflowGraph, z, AgentStrategy } from '@zibby/core'` without separately importing the engine:
-```javascript
-import { AgentStrategy } from '@zibby/core';
+- `WorkflowGraph`, `WorkflowAgent`, `Node`, `WorkflowState`
+- `AgentStrategy`, `registerStrategy`, `getAgentStrategy`, `invokeAgent`
+- `registerSkill`, `getSkill`, `getAllSkills`
+- `compileGraph`, `validateGraphConfig`
+- `timeline`, session helpers, constants
-class MyCustomAgent extends AgentStrategy {
-  constructor() {
-    super('my-agent', 'My Custom Agent', 50);
-  }
+### z (Zod)
-  canHandle(context) {
-    return !!process.env.MY_AGENT_KEY;
-  }
-  async invoke(prompt, options) {
-    // options.schema — Zod schema for structured output
-    // options.workspace — working directory
-    // options.skills — skill IDs to resolve
-    // options.model — model name or 'auto'
-    // options.timeout — execution timeout
-    const response = await myApiCall(prompt, options);
-    if (options.schema) {
-      return { raw: response, structured: options.schema.parse(JSON.parse(response)) };
-    }
-    return response;
-  }
-}
-```
-Select and invoke an agent:
-```javascript
-import { invokeAgent, getAgentStrategy } from '@zibby/core';
-// Auto-select based on config
-const result = await invokeAgent(prompt, { state: { agentType: 'cursor' } }, {
-  model: 'auto',
-  workspace: process.cwd(),
-  schema: MyZodSchema,
-  skills: ['browser'],
-});
-```
+Re-exports `zod/v3` so you don't need a separate Zod dep:
-## Graph Compiler
-Compile a JSON graph definition (from the Zibby dashboard visual editor) into an executable `WorkflowGraph`:
-```javascript
-import { compileGraph, validateGraphConfig } from '@zibby/core';
-const graphJson = {
-  nodes: [
-    { id: 'preflight', type: 'preflight', data: { nodeType: 'preflight' } },
-    { id: 'execute_live', type: 'execute_live', data: { nodeType: 'execute_live' } },
-  ],
-  edges: [
-    { source: 'preflight', target: 'execute_live' },
-    { source: 'execute_live', target: 'END' },
-  ],
-  nodeConfigs: {
-    preflight: { prompt: 'Analyze: {{testSpec}}' },
-  },
-};
-// Validate
-const { valid, errors } = validateGraphConfig(graphJson);
-// Compile
-const graph = compileGraph(graphJson, { stateSchema: MySchema });
+```js
+import { z } from '@zibby/core';
-// Run
-await graph.run(agent, initialState);
+const Plan = z.object({ tasks: z.array(z.string()) });
 ```
-## Built-in Templates
+### Cloud-runtime helpers
-### Browser Test Automation
+Functions used by the cloud executor and Studio integration — `ZibbyRuntime`, `StableIdRuntime`, `resolveIntegrationToken`, `cloneRepo`, `patchCursorAgentForCI`, `runPlaywrightTestTool`. Most users don't import these directly.
-```
-preflight → execute_live → generate_script → END
-```
-| Node | Skills | Output |
-|---|---|---|
-| `preflight` | — | `{ title, assertions[] }` |
-| `execute_live` | browser, memory | `{ success, steps[], actions[], assertions[] }` |
-| `generate_script` | — | `{ testCode, filename }` |
+## When to depend on this vs. agent-workflow
-### Code Analysis
+| Goal | Pick |
+|---|---|
+| Build a workflow with built-in agents (cursor/claude/codex/gemini/assistant) | `@zibby/core` |
+| Embed the engine in your own app with custom agents only | `@zibby/agent-workflow` |
+| Use the CLI | `@zibby/cli` (transitively pulls both) |
-```
-setup → analyze_ticket → generate_code → generate_test_cases → finalize → END
-```
+## Source
-Used by the cloud pipeline for Jira ticket analysis and code generation.
+- npm: [`@zibby/core`](https://www.npmjs.com/package/@zibby/core)
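
The new core.md text says that importing `@zibby/core` registers the five built-in strategies as a side effect, which is why `agent: 'cursor'` resolves without extra setup. A minimal sketch of that register-on-import pattern in plain JavaScript follows. The `registry` map and both functions are local stand-ins that borrow the documented export names; they are assumptions for illustration, not the package's actual implementation, and the registry keys are taken from the "cursor/claude/codex/gemini/assistant" list in the table above.

```js
// Hypothetical stand-in for a strategy registry; not the real @zibby/agent-workflow code.
const registry = new Map();

function registerStrategy(name, strategy) {
  if (registry.has(name)) throw new Error(`strategy already registered: ${name}`);
  registry.set(name, strategy);
}

function getAgentStrategy(name) {
  const strategy = registry.get(name);
  if (!strategy) throw new Error(`unknown agent strategy: ${name}`);
  return strategy;
}

// What a "batteries" module could do at import time: register its built-ins once.
for (const name of ['cursor', 'claude', 'codex', 'gemini', 'assistant']) {
  registerStrategy(name, { name, invoke: async (prompt) => `[${name}] ${prompt}` });
}
```

With this shape, a node that names `agent: 'cursor'` only needs a lookup by string, and an unknown name fails loudly instead of silently falling back.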

package/docs/recipes/index.md ADDED

@@ -0,0 +1,62 @@
+---
+sidebar_position: 1
+title: Recipes overview
+---
+# Built-in workflow recipes
+Zibby ships with a few **vertical slice workflows** — production-ready pipelines you can run today, demonstrating what the platform does. Each recipe is a real Zibby workflow under the hood, eating its own dog food.
+```
+                       ┌─────────────────────┐
+   ┌──────────────────►│  zibby workflow new │  ◄── Build your own
+   │                   │  zibby workflow ... │
+   │                   └─────────────────────┘
+   │                              ▲
+   │                              │ uses the same primitives
+   │                              │
+   │                   ┌─────────────────────┐
+   │  Recipes built ──►│  zibby test         │  ◄── Browser testing
+   │  on top of the    │  zibby analyze      │  ◄── Code analysis
+   │  same platform    │  zibby video        │  ◄── Test playback
+   │                   │  zibby generate     │  ◄── Spec generation
+   │                   └─────────────────────┘
+```
+You don't have to use the recipes. You can build whatever pipeline you want with `zibby workflow new`. The recipes just save you from writing the obvious starter graphs for common cases.
+## Available recipes
+| Recipe | What it does | Best for |
+|---|---|---|
+| [`zibby test`](./test) | Drives a browser via Cursor or Claude, runs assertions, generates a Playwright script + verification video | E2E test generation from plain-English specs |
+| `zibby analyze` | Reads a Jira/Linear ticket, walks the codebase, produces an implementation plan | Pre-implementation planning, ticket triage |
+| `zibby generate` | Generates test specs from a ticket + codebase | Backfilling test coverage on legacy projects |
+| `zibby video` | Re-records or organizes verification videos for an existing test | Producing demos, regenerating after code changes |
+## Why recipes matter
+Three reasons we ship vertical workflows alongside the platform:
+1. **Proof of concept** — every recipe IS a Zibby workflow. If `zibby test` works, the platform works. You can see the actual graph definition and adapt it.
+2. **Faster onboarding** — you don't need to design a full graph on day one. Run a recipe, see the output, then build your own.
+3. **Demonstrates multi-vendor** — the test recipe runs across Cursor / Claude / Codex / Gemini. Pick the agent that gives you the best results for your use case; the recipe doesn't care.
+## Building your own recipe
+If you have a workflow you'd want shipped as a built-in:
+```bash
+zibby workflow new my-recipe          # scaffold
+# ... build it out ...
+zibby workflow run my-recipe          # test locally
+zibby workflow deploy my-recipe       # ship to your cloud account
+```
+If it's broadly useful, we may pull it into the official recipes set. Open an issue or PR.
+## Next
+- **[`zibby test` recipe](./test)** — the most-used recipe, walked through end-to-end
+- **[Build your own workflow](../get-started/your-first-workflow)** — scaffold and customize
+- **[Concepts: graph](../concepts/graph)** — the primitives every recipe is built on

package/docs/recipes/test.md ADDED

@@ -0,0 +1,154 @@
+---
+sidebar_position: 2
+title: Browser test recipe (zibby test)
+---
+# `zibby test` — browser test recipe
+The browser-test recipe takes a plain-English spec, drives a real browser via a coding agent (Cursor / Claude / Codex / Gemini), runs the assertions, and produces a Playwright script + verification video.
+It's a worked example of what the Zibby platform does — every step is a regular workflow node with Zod-validated handoff. You can read the source, fork it, or build your own variation.
+## Quick start
+```bash
+# Inline spec
+zibby test "Go to https://example.com and verify the title is 'Example Domain'"
+# Spec file
+zibby test test-specs/login.txt
+# With a specific agent
+zibby test test-specs/checkout.txt --agent claude
+```
+## What it produces
+```
+.zibby/output/sessions/<session-id>/
+├── execute_live/
+│   ├── result.json          ← Zod-validated assertions + agent reasoning
+│   └── browser-trace/       ← Playwright trace files
+├── generate_script/
+│   ├── result.json          ← parsed script + metadata
+│   └── generated.spec.js    ← reusable Playwright test
+└── video/
+    └── recording.webm       ← visual verification
+```
+Open the session in [Zibby Studio](https://zibby.app/studio) to scrub through the run, swap the prompt, re-execute any node.
+## The graph (this is just a Zibby workflow)
+Under the hood, `zibby test` is a 3-node graph:
+```
+   ┌──────────────┐    ┌──────────────────┐    ┌─────────────────┐
+   │  preflight   │ →  │   execute_live   │ →  │ generate_script │
+   │              │    │                  │    │                 │
+   │ extract      │    │ agent drives     │    │ produce         │
+   │ assertions   │    │ browser via MCP, │    │ Playwright      │
+   │ from spec    │    │ records video    │    │ test file       │
+   └──────────────┘    └──────────────────┘    └─────────────────┘
+        │                     │                       │
+     Zod out               Zod out                 Zod out
+   (Assertions)         (BrowserResult)         (PlaywrightScript)
+```
+Each node is a real `WorkflowGraph` node. The agent in `execute_live` does its own tool loop (browser navigation, click, assertion checking) — Zibby just defines the contract.
+## Customizing
+**Use a different agent per run:**
+```bash
+zibby test test-specs/checkout.txt --agent claude    # Claude Code
+zibby test test-specs/checkout.txt --agent cursor    # Cursor (default)
+zibby test test-specs/checkout.txt --agent codex     # OpenAI Codex
+```
+**Run only one node** (e.g. just regenerate the script from an existing run):
+```bash
+zibby test --session 1768974629717 --node generate_script
+```
+**Headless vs headed:**
+```bash
+zibby test test-specs/login.txt              # headed (default — see the browser)
+zibby test test-specs/login.txt --headless   # headless mode (for CI)
+```
+## Forking the recipe
+If the built-in recipe doesn't fit your case, scaffold a custom workflow and copy the structure:
+```bash
+zibby workflow new my-test-pipeline
+```
+Then in `graph.mjs`, define your own nodes:
+```js
+import { WorkflowGraph, z } from '@zibby/agent-workflow';
+const AssertionsSchema = z.object({
+  assertions: z.array(z.string()),
+  baseUrl: z.string().url(),
+});
+const BrowserResultSchema = z.object({
+  passed: z.boolean(),
+  details: z.array(z.object({ assertion: z.string(), passed: z.boolean() })),
+  videoPath: z.string().optional(),
+});
+const graph = new WorkflowGraph();
+graph.addNode('preflight', {
+  agent: 'claude',
+  prompt: ({ spec }) => `Extract assertions and base URL from: ${spec}`,
+  outputSchema: AssertionsSchema,
+});
+graph.addNode('execute_live', {
+  agent: 'cursor',
+  skills: ['browser'],
+  prompt: ({ preflight }) => `Navigate to ${preflight.baseUrl} and verify: ${preflight.assertions.join('; ')}`,
+  outputSchema: BrowserResultSchema,
+});
+graph.addEdge('preflight', 'execute_live');
+graph.setEntryPoint('preflight');
+export default graph;
+```
+That's the platform. The recipe is just a starter.
+## CI/CD
+```yaml
+- name: Run Zibby test
+  env:
+    ZIBBY_USER_TOKEN: ${{ secrets.ZIBBY_USER_TOKEN }}
+  run: |
+    npx @zibby/cli test test-specs/checkout.txt --headless
+```
+For workflows triggered remotely (rather than per-CI-run), use [`workflow trigger`](../cloud/triggering) on a deployed graph.
+## Why this is different from Playwright codegen / a basic LLM script
+| | Playwright codegen | LLM-only script | Zibby test recipe |
+|---|---|---|---|
+| Plain-English input | ❌ | ✅ | ✅ |
+| Real browser execution | ✅ | ❌ (just generates code) | ✅ |
+| Coding-agent driven | ❌ | partial | ✅ Cursor / Claude / Codex |
+| Multi-step verification | ❌ | ❌ | ✅ Zod-validated nodes |
+| Replayable + debuggable | ❌ | ❌ | ✅ Studio |
+| Vendor-neutral | N/A | locked to one LLM | swap agent per run |
+## See also
+- [Recipes overview](./index)
+- [Concepts: graph](../concepts/graph) — the primitives this recipe uses
+- [Cloud triggering](../cloud/triggering) — fire workflows from CI/CD
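
The recipe's central idea is that each node's output is validated against a schema before the next node consumes it. A dependency-free sketch of that handoff is shown here, with hand-written validator functions standing in for Zod and a hypothetical `runPipeline` helper; neither is part of Zibby's API, and the node bodies are placeholders for real agent calls.

```js
// Hypothetical mini-runner: each node's output must pass its validator
// before it is stored in state under the node's name.
async function runPipeline(nodes, initialState) {
  const state = { ...initialState };
  for (const node of nodes) {
    const output = await node.run(state);
    const problem = node.validate(output);
    if (problem) throw new Error(`${node.name}: invalid output (${problem})`);
    state[node.name] = output;
  }
  return state;
}

const nodes = [
  {
    name: 'preflight',
    // Stand-in for an agent call: treat each sentence of the spec as one assertion.
    run: async ({ spec }) => ({
      assertions: spec.split('.').map((s) => s.trim()).filter(Boolean),
    }),
    validate: (out) => (Array.isArray(out.assertions) ? null : 'assertions must be an array'),
  },
  {
    name: 'execute_live',
    // Stand-in for browser execution: mark every assertion as passed.
    run: async ({ preflight }) => ({
      passed: true,
      details: preflight.assertions.map((assertion) => ({ assertion, passed: true })),
    }),
    validate: (out) => (typeof out.passed === 'boolean' ? null : 'passed must be a boolean'),
  },
];
```

Calling `runPipeline(nodes, { spec: '...' })` resolves to a state object keyed by node name, which mirrors how the forked graph's prompts read `preflight.baseUrl` and `preflight.assertions` from earlier output.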

package/package.json CHANGED
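
In this file's hunks, the `@zibby/agent-workflow` range moves from `^0.1.2` to `^0.3.0`. Under npm's caret rule for 0.x versions, the upper bound is the next bump of the left-most non-zero component, so the two ranges do not overlap. The sketch below is a simplified re-implementation for illustration only (no prerelease or build-metadata handling); the helper names are made up here, and real projects would normally rely on the `semver` package instead.

```js
// Simplified caret-range check; ignores prerelease tags and build metadata.
function parse(version) {
  return version.split('.').map(Number);
}

function compare(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

function satisfiesCaret(version, range) {
  const v = parse(version);
  const min = parse(range.replace(/^\^/, ''));
  // The upper bound bumps the left-most non-zero component (patch if all are zero).
  const firstNonZero = min.findIndex((n) => n !== 0);
  const bump = firstNonZero === -1 ? 2 : firstNonZero;
  const max = [0, 0, 0];
  for (let k = 0; k < bump; k++) max[k] = min[k];
  max[bump] = min[bump] + 1;
  return compare(v, min) >= 0 && compare(v, max) < 0;
}
```

Here `^0.1.2` means `>=0.1.2 <0.2.0` and `^0.3.0` means `>=0.3.0 <0.4.0`, so a version that satisfied the old range cannot satisfy the new one.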

@@ -1,6 +1,6 @@
 {
   "name": "@zibby/skills",
-  "version": "0.1.11",
+  "version": "0.1.12",
   "description": "Built-in skill definitions for Zibby test automation framework",
   "type": "module",
   "main": "dist/index.js",
@@ -46,7 +46,7 @@
     "node": ">=18.0.0"
   },
   "dependencies": {
-    "@zibby/agent-workflow": "^0.1.2"
+    "@zibby/agent-workflow": "^0.3.0"
   },
   "peerDependencies": {
     "@zibby/core": ">=0.1.44"