npm - @every-env/compound-plugin - Versions diffs - 2.36.4 → 2.37.0 - Mend

@every-env/compound-plugin 2.36.4 → 2.37.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21) hide show

package/plugins/compound-engineering/skills/agent-browser/references/video-recording.md ADDED Viewed

@@ -0,0 +1,173 @@
+# Video Recording
+Capture browser automation as video for debugging, documentation, or verification.
+**Related**: [commands.md](commands.md) for full command reference, [SKILL.md](../SKILL.md) for quick start.
+## Contents
+- [Basic Recording](#basic-recording)
+- [Recording Commands](#recording-commands)
+- [Use Cases](#use-cases)
+- [Best Practices](#best-practices)
+- [Output Format](#output-format)
+- [Limitations](#limitations)
+## Basic Recording
+```bash
+# Start recording
+agent-browser record start ./demo.webm
+# Perform actions
+agent-browser open https://example.com
+agent-browser snapshot -i
+agent-browser click @e1
+agent-browser fill @e2 "test input"
+# Stop and save
+agent-browser record stop
+```
+## Recording Commands
+```bash
+# Start recording to file
+agent-browser record start ./output.webm
+# Stop current recording
+agent-browser record stop
+# Restart with new file (stops current + starts new)
+agent-browser record restart ./take2.webm
+```
+## Use Cases
+### Debugging Failed Automation
+```bash
+#!/bin/bash
+# Record automation for debugging
+agent-browser record start ./debug-$(date +%Y%m%d-%H%M%S).webm
+# Run your automation
+agent-browser open https://app.example.com
+agent-browser snapshot -i
+agent-browser click @e1 || {
+    echo "Click failed - check recording"
+    agent-browser record stop
+    exit 1
+}
+agent-browser record stop
+```
+### Documentation Generation
+```bash
+#!/bin/bash
+# Record workflow for documentation
+agent-browser record start ./docs/how-to-login.webm
+agent-browser open https://app.example.com/login
+agent-browser wait 1000  # Pause for visibility
+agent-browser snapshot -i
+agent-browser fill @e1 "demo@example.com"
+agent-browser wait 500
+agent-browser fill @e2 "password"
+agent-browser wait 500
+agent-browser click @e3
+agent-browser wait --load networkidle
+agent-browser wait 1000  # Show result
+agent-browser record stop
+```
+### CI/CD Test Evidence
+```bash
+#!/bin/bash
+# Record E2E test runs for CI artifacts
+TEST_NAME="${1:-e2e-test}"
+RECORDING_DIR="./test-recordings"
+mkdir -p "$RECORDING_DIR"
+agent-browser record start "$RECORDING_DIR/$TEST_NAME-$(date +%s).webm"
+# Run test
+if run_e2e_test; then
+    echo "Test passed"
+else
+    echo "Test failed - recording saved"
+fi
+agent-browser record stop
+```
+## Best Practices
+### 1. Add Pauses for Clarity
+```bash
+# Slow down for human viewing
+agent-browser click @e1
+agent-browser wait 500  # Let viewer see result
+```
+### 2. Use Descriptive Filenames
+```bash
+# Include context in filename
+agent-browser record start ./recordings/login-flow-2024-01-15.webm
+agent-browser record start ./recordings/checkout-test-run-42.webm
+```
+### 3. Handle Recording in Error Cases
+```bash
+#!/bin/bash
+set -e
+cleanup() {
+    agent-browser record stop 2>/dev/null || true
+    agent-browser close 2>/dev/null || true
+}
+trap cleanup EXIT
+agent-browser record start ./automation.webm
+# ... automation steps ...
+```
+### 4. Combine with Screenshots
+```bash
+# Record video AND capture key frames
+agent-browser record start ./flow.webm
+agent-browser open https://example.com
+agent-browser screenshot ./screenshots/step1-homepage.png
+agent-browser click @e1
+agent-browser screenshot ./screenshots/step2-after-click.png
+agent-browser record stop
+```
+## Output Format
+- Default format: WebM (VP8/VP9 codec)
+- Compatible with all modern browsers and video players
+- Compressed but high quality
+## Limitations
+- Recording adds slight overhead to automation
+- Large recordings can consume significant disk space
+- Some headless environments may have codec limitations

package/plugins/compound-engineering/skills/agent-browser/templates/authenticated-session.sh ADDED Viewed

@@ -0,0 +1,105 @@
+#!/bin/bash
+# Template: Authenticated Session Workflow
+# Purpose: Login once, save state, reuse for subsequent runs
+# Usage: ./authenticated-session.sh <login-url> [state-file]
+#
+# RECOMMENDED: Use the auth vault instead of this template:
+#   echo "<pass>" | agent-browser auth save myapp --url <login-url> --username <user> --password-stdin
+#   agent-browser auth login myapp
+# The auth vault stores credentials securely and the LLM never sees passwords.
+#
+# Environment variables:
+#   APP_USERNAME - Login username/email
+#   APP_PASSWORD - Login password
+#
+# Two modes:
+#   1. Discovery mode (default): Shows form structure so you can identify refs
+#   2. Login mode: Performs actual login after you update the refs
+#
+# Setup steps:
+#   1. Run once to see form structure (discovery mode)
+#   2. Update refs in LOGIN FLOW section below
+#   3. Set APP_USERNAME and APP_PASSWORD
+#   4. Delete the DISCOVERY section
+set -euo pipefail
+LOGIN_URL="${1:?Usage: $0 <login-url> [state-file]}"
+STATE_FILE="${2:-./auth-state.json}"
+echo "Authentication workflow: $LOGIN_URL"
+# ================================================================
+# SAVED STATE: Skip login if valid saved state exists
+# ================================================================
+if [[ -f "$STATE_FILE" ]]; then
+    echo "Loading saved state from $STATE_FILE..."
+    if agent-browser --state "$STATE_FILE" open "$LOGIN_URL" 2>/dev/null; then
+        agent-browser wait --load networkidle
+        CURRENT_URL=$(agent-browser get url)
+        if [[ "$CURRENT_URL" != *"login"* ]] && [[ "$CURRENT_URL" != *"signin"* ]]; then
+            echo "Session restored successfully"
+            agent-browser snapshot -i
+            exit 0
+        fi
+        echo "Session expired, performing fresh login..."
+        agent-browser close 2>/dev/null || true
+    else
+        echo "Failed to load state, re-authenticating..."
+    fi
+    rm -f "$STATE_FILE"
+fi
+# ================================================================
+# DISCOVERY MODE: Shows form structure (delete after setup)
+# ================================================================
+echo "Opening login page..."
+agent-browser open "$LOGIN_URL"
+agent-browser wait --load networkidle
+echo ""
+echo "Login form structure:"
+echo "---"
+agent-browser snapshot -i
+echo "---"
+echo ""
+echo "Next steps:"
+echo "  1. Note the refs: username=@e?, password=@e?, submit=@e?"
+echo "  2. Update the LOGIN FLOW section below with your refs"
+echo "  3. Set: export APP_USERNAME='...' APP_PASSWORD='...'"
+echo "  4. Delete this DISCOVERY MODE section"
+echo ""
+agent-browser close
+exit 0
+# ================================================================
+# LOGIN FLOW: Uncomment and customize after discovery
+# ================================================================
+# : "${APP_USERNAME:?Set APP_USERNAME environment variable}"
+# : "${APP_PASSWORD:?Set APP_PASSWORD environment variable}"
+#
+# agent-browser open "$LOGIN_URL"
+# agent-browser wait --load networkidle
+# agent-browser snapshot -i
+#
+# # Fill credentials (update refs to match your form)
+# agent-browser fill @e1 "$APP_USERNAME"
+# agent-browser fill @e2 "$APP_PASSWORD"
+# agent-browser click @e3
+# agent-browser wait --load networkidle
+#
+# # Verify login succeeded
+# FINAL_URL=$(agent-browser get url)
+# if [[ "$FINAL_URL" == *"login"* ]] || [[ "$FINAL_URL" == *"signin"* ]]; then
+#     echo "Login failed - still on login page"
+#     agent-browser screenshot /tmp/login-failed.png
+#     agent-browser close
+#     exit 1
+# fi
+#
+# # Save state for future runs
+# echo "Saving state to $STATE_FILE"
+# agent-browser state save "$STATE_FILE"
+# echo "Login successful"
+# agent-browser snapshot -i

package/plugins/compound-engineering/skills/agent-browser/templates/capture-workflow.sh ADDED Viewed

@@ -0,0 +1,69 @@
+#!/bin/bash
+# Template: Content Capture Workflow
+# Purpose: Extract content from web pages (text, screenshots, PDF)
+# Usage: ./capture-workflow.sh <url> [output-dir]
+#
+# Outputs:
+#   - page-full.png: Full page screenshot
+#   - page-structure.txt: Page element structure with refs
+#   - page-text.txt: All text content
+#   - page.pdf: PDF version
+#
+# Optional: Load auth state for protected pages
+set -euo pipefail
+TARGET_URL="${1:?Usage: $0 <url> [output-dir]}"
+OUTPUT_DIR="${2:-.}"
+echo "Capturing: $TARGET_URL"
+mkdir -p "$OUTPUT_DIR"
+# Optional: Load authentication state
+# if [[ -f "./auth-state.json" ]]; then
+#     echo "Loading authentication state..."
+#     agent-browser state load "./auth-state.json"
+# fi
+# Navigate to target
+agent-browser open "$TARGET_URL"
+agent-browser wait --load networkidle
+# Get metadata
+TITLE=$(agent-browser get title)
+URL=$(agent-browser get url)
+echo "Title: $TITLE"
+echo "URL: $URL"
+# Capture full page screenshot
+agent-browser screenshot --full "$OUTPUT_DIR/page-full.png"
+echo "Saved: $OUTPUT_DIR/page-full.png"
+# Get page structure with refs
+agent-browser snapshot -i > "$OUTPUT_DIR/page-structure.txt"
+echo "Saved: $OUTPUT_DIR/page-structure.txt"
+# Extract all text content
+agent-browser get text body > "$OUTPUT_DIR/page-text.txt"
+echo "Saved: $OUTPUT_DIR/page-text.txt"
+# Save as PDF
+agent-browser pdf "$OUTPUT_DIR/page.pdf"
+echo "Saved: $OUTPUT_DIR/page.pdf"
+# Optional: Extract specific elements using refs from structure
+# agent-browser get text @e5 > "$OUTPUT_DIR/main-content.txt"
+# Optional: Handle infinite scroll pages
+# for i in {1..5}; do
+#     agent-browser scroll down 1000
+#     agent-browser wait 1000
+# done
+# agent-browser screenshot --full "$OUTPUT_DIR/page-scrolled.png"
+# Cleanup
+agent-browser close
+echo ""
+echo "Capture complete:"
+ls -la "$OUTPUT_DIR"

package/plugins/compound-engineering/skills/agent-browser/templates/form-automation.sh ADDED Viewed

@@ -0,0 +1,62 @@
+#!/bin/bash
+# Template: Form Automation Workflow
+# Purpose: Fill and submit web forms with validation
+# Usage: ./form-automation.sh <form-url>
+#
+# This template demonstrates the snapshot-interact-verify pattern:
+# 1. Navigate to form
+# 2. Snapshot to get element refs
+# 3. Fill fields using refs
+# 4. Submit and verify result
+#
+# Customize: Update the refs (@e1, @e2, etc.) based on your form's snapshot output
+set -euo pipefail
+FORM_URL="${1:?Usage: $0 <form-url>}"
+echo "Form automation: $FORM_URL"
+# Step 1: Navigate to form
+agent-browser open "$FORM_URL"
+agent-browser wait --load networkidle
+# Step 2: Snapshot to discover form elements
+echo ""
+echo "Form structure:"
+agent-browser snapshot -i
+# Step 3: Fill form fields (customize these refs based on snapshot output)
+#
+# Common field types:
+#   agent-browser fill @e1 "John Doe"           # Text input
+#   agent-browser fill @e2 "user@example.com"   # Email input
+#   agent-browser fill @e3 "SecureP@ss123"      # Password input
+#   agent-browser select @e4 "Option Value"     # Dropdown
+#   agent-browser check @e5                     # Checkbox
+#   agent-browser click @e6                     # Radio button
+#   agent-browser fill @e7 "Multi-line text"   # Textarea
+#   agent-browser upload @e8 /path/to/file.pdf # File upload
+#
+# Uncomment and modify:
+# agent-browser fill @e1 "Test User"
+# agent-browser fill @e2 "test@example.com"
+# agent-browser click @e3  # Submit button
+# Step 4: Wait for submission
+# agent-browser wait --load networkidle
+# agent-browser wait --url "**/success"  # Or wait for redirect
+# Step 5: Verify result
+echo ""
+echo "Result:"
+agent-browser get url
+agent-browser snapshot -i
+# Optional: Capture evidence
+agent-browser screenshot /tmp/form-result.png
+echo "Screenshot saved: /tmp/form-result.png"
+# Cleanup
+agent-browser close
+echo "Done"

package/plugins/compound-engineering/skills/create-agent-skills/SKILL.md CHANGED Viewed

@@ -97,22 +97,11 @@ Access individual args: `$ARGUMENTS[0]` or shorthand `$0`, `$1`, `$2`.
 ### Dynamic Context Injection
-The `` !`command` `` syntax runs shell commands before content is sent to Claude:
+Skills support dynamic context injection: prefix a backtick-wrapped shell command with an exclamation mark, and the preprocessor executes it at load time, replacing the directive with stdout. Write an exclamation mark immediately before the opening backtick of the command you want executed (for example, to inject the current git branch, write the exclamation mark followed by `git branch --show-current` wrapped in backticks).
-```yaml
----
-name: pr-summary
-description: Summarize changes in a pull request
-context: fork
-agent: Explore
----
-## Context
-- PR diff: !`gh pr diff`
-- Changed files: !`gh pr diff --name-only`
+**Important:** The preprocessor scans the entire SKILL.md as plain text — it does not parse markdown. Directives inside fenced code blocks or inline code spans are still executed. If a skill documents this syntax with literal examples, the preprocessor will attempt to run them, causing load failures. To safely document this feature, describe it in prose (as done here) or place examples in a reference file, which is loaded on-demand by Claude and not preprocessed.
-Summarize this pull request...
-```
+For a concrete example of dynamic context injection in a skill, see [official-spec.md](references/official-spec.md) § "Dynamic Context Injection".
 ### Running in a Subagent

package/plugins/compound-engineering/skills/skill-creator/SKILL.md DELETED Viewed

@@ -1,210 +0,0 @@
----
-name: skill-creator
-description: Guide for creating effective skills. This skill should be used when users want to create a new skill (or update an existing skill) that extends Claude's capabilities with specialized knowledge, workflows, or tool integrations.
-license: Complete terms in LICENSE.txt
-disable-model-invocation: true
----
-# Skill Creator
-This skill provides guidance for creating effective skills.
-## About Skills
-Skills are modular, self-contained packages that extend Claude's capabilities by providing
-specialized knowledge, workflows, and tools. Think of them as "onboarding guides" for specific
-domains or tasks—they transform Claude from a general-purpose agent into a specialized agent
-equipped with procedural knowledge that no model can fully possess.
-### What Skills Provide
-1. Specialized workflows - Multi-step procedures for specific domains
-2. Tool integrations - Instructions for working with specific file formats or APIs
-3. Domain expertise - Company-specific knowledge, schemas, business logic
-4. Bundled resources - Scripts, references, and assets for complex and repetitive tasks
-### Anatomy of a Skill
-Every skill consists of a required SKILL.md file and optional bundled resources:
-```
-skill-name/
-├── SKILL.md (required)
-│   ├── YAML frontmatter metadata (required)
-│   │   ├── name: (required)
-│   │   └── description: (required)
-│   └── Markdown instructions (required)
-└── Bundled Resources (optional)
-    ├── scripts/          - Executable code (Python/Bash/etc.)
-    ├── references/       - Documentation intended to be loaded into context as needed
-    └── assets/           - Files used in output (templates, icons, fonts, etc.)
-```
-#### SKILL.md (required)
-**Metadata Quality:** The `name` and `description` in YAML frontmatter determine when Claude will use the skill. Be specific about what the skill does and when to use it. Use the third-person (e.g. "This skill should be used when..." instead of "Use this skill when...").
-#### Bundled Resources (optional)
-##### Scripts (`scripts/`)
-Executable code (Python/Bash/etc.) for tasks that require deterministic reliability or are repeatedly rewritten.
-- **When to include**: When the same code is being rewritten repeatedly or deterministic reliability is needed
-- **Example**: `scripts/rotate_pdf.py` for PDF rotation tasks
-- **Benefits**: Token efficient, deterministic, may be executed without loading into context
-- **Note**: Scripts may still need to be read by Claude for patching or environment-specific adjustments
-##### References (`references/`)
-Documentation and reference material intended to be loaded as needed into context to inform Claude's process and thinking.
-- **When to include**: For documentation that Claude should reference while working
-- **Examples**: `references/finance.md` for financial schemas, `references/mnda.md` for company NDA template, `references/policies.md` for company policies, `references/api_docs.md` for API specifications
-- **Use cases**: Database schemas, API documentation, domain knowledge, company policies, detailed workflow guides
-- **Benefits**: Keeps SKILL.md lean, loaded only when Claude determines it's needed
-- **Best practice**: If files are large (>10k words), include grep search patterns in SKILL.md
-- **Avoid duplication**: Information should live in either SKILL.md or references files, not both. Prefer references files for detailed information unless it's truly core to the skill—this keeps SKILL.md lean while making information discoverable without hogging the context window. Keep only essential procedural instructions and workflow guidance in SKILL.md; move detailed reference material, schemas, and examples to references files.
-##### Assets (`assets/`)
-Files not intended to be loaded into context, but rather used within the output Claude produces.
-- **When to include**: When the skill needs files that will be used in the final output
-- **Examples**: `assets/logo.png` for brand assets, `assets/slides.pptx` for PowerPoint templates, `assets/frontend-template/` for HTML/React boilerplate, `assets/font.ttf` for typography
-- **Use cases**: Templates, images, icons, boilerplate code, fonts, sample documents that get copied or modified
-- **Benefits**: Separates output resources from documentation, enables Claude to use files without loading them into context
-### Progressive Disclosure Design Principle
-Skills use a three-level loading system to manage context efficiently:
-1. **Metadata (name + description)** - Always in context (~100 words)
-2. **SKILL.md body** - When skill triggers (<5k words)
-3. **Bundled resources** - As needed by Claude (Unlimited*)
-*Unlimited because scripts can be executed without reading into context window.
-## Skill Creation Process
-To create a skill, follow the "Skill Creation Process" in order, skipping steps only if there is a clear reason why they are not applicable.
-### Step 1: Understanding the Skill with Concrete Examples
-Skip this step only when the skill's usage patterns are already clearly understood. It remains valuable even when working with an existing skill.
-To create an effective skill, clearly understand concrete examples of how the skill will be used. This understanding can come from either direct user examples or generated examples that are validated with user feedback.
-For example, when building an image-editor skill, relevant questions include:
-- "What functionality should the image-editor skill support? Editing, rotating, anything else?"
-- "Can you give some examples of how this skill would be used?"
-- "I can imagine users asking for things like 'Remove the red-eye from this image' or 'Rotate this image'. Are there other ways you imagine this skill being used?"
-- "What would a user say that should trigger this skill?"
-To avoid overwhelming users, avoid asking too many questions in a single message. Start with the most important questions and follow up as needed for better effectiveness.
-Conclude this step when there is a clear sense of the functionality the skill should support.
-### Step 2: Planning the Reusable Skill Contents
-To turn concrete examples into an effective skill, analyze each example by:
-1. Considering how to execute on the example from scratch
-2. Identifying what scripts, references, and assets would be helpful when executing these workflows repeatedly
-Example: When building a `pdf-editor` skill to handle queries like "Help me rotate this PDF," the analysis shows:
-1. Rotating a PDF requires re-writing the same code each time
-2. A `scripts/rotate_pdf.py` script would be helpful to store in the skill
-Example: When designing a `frontend-webapp-builder` skill for queries like "Build me a todo app" or "Build me a dashboard to track my steps," the analysis shows:
-1. Writing a frontend webapp requires the same boilerplate HTML/React each time
-2. An `assets/hello-world/` template containing the boilerplate HTML/React project files would be helpful to store in the skill
-Example: When building a `big-query` skill to handle queries like "How many users have logged in today?" the analysis shows:
-1. Querying BigQuery requires re-discovering the table schemas and relationships each time
-2. A `references/schema.md` file documenting the table schemas would be helpful to store in the skill
-To establish the skill's contents, analyze each concrete example to create a list of the reusable resources to include: scripts, references, and assets.
-### Step 3: Initializing the Skill
-At this point, it is time to actually create the skill.
-Skip this step only if the skill being developed already exists, and iteration or packaging is needed. In this case, continue to the next step.
-When creating a new skill from scratch, always run the `init_skill.py` script. The script conveniently generates a new template skill directory that automatically includes everything a skill requires, making the skill creation process much more efficient and reliable.
-Usage:
-```bash
-scripts/init_skill.py <skill-name> --path <output-directory>
-```
-The script:
-- Creates the skill directory at the specified path
-- Generates a SKILL.md template with proper frontmatter and TODO placeholders
-- Creates example resource directories: `scripts/`, `references/`, and `assets/`
-- Adds example files in each directory that can be customized or deleted
-After initialization, customize or remove the generated SKILL.md and example files as needed.
-### Step 4: Edit the Skill
-When editing the (newly-generated or existing) skill, remember that the skill is being created for another instance of Claude to use. Focus on including information that would be beneficial and non-obvious to Claude. Consider what procedural knowledge, domain-specific details, or reusable assets would help another Claude instance execute these tasks more effectively.
-#### Start with Reusable Skill Contents
-To begin implementation, start with the reusable resources identified above: `scripts/`, `references/`, and `assets/` files. Note that this step may require user input. For example, when implementing a `brand-guidelines` skill, the user may need to provide brand assets or templates to store in `assets/`, or documentation to store in `references/`.
-Also, delete any example files and directories not needed for the skill. The initialization script creates example files in `scripts/`, `references/`, and `assets/` to demonstrate structure, but most skills won't need all of them.
-#### Update SKILL.md
-**Writing Style:** Write the entire skill using **imperative/infinitive form** (verb-first instructions), not second person. Use objective, instructional language (e.g., "To accomplish X, do Y" rather than "You should do X" or "If you need to do X"). This maintains consistency and clarity for AI consumption.
-To complete SKILL.md, answer the following questions:
-1. What is the purpose of the skill, in a few sentences?
-2. When should the skill be used?
-3. In practice, how should Claude use the skill? All reusable skill contents developed above should be referenced so that Claude knows how to use them.
-### Step 5: Packaging a Skill
-Once the skill is ready, it should be packaged into a distributable zip file that gets shared with the user. The packaging process automatically validates the skill first to ensure it meets all requirements:
-```bash
-scripts/package_skill.py <path/to/skill-folder>
-```
-Optional output directory specification:
-```bash
-scripts/package_skill.py <path/to/skill-folder> ./dist
-```
-The packaging script will:
-1. **Validate** the skill automatically, checking:
-   - YAML frontmatter format and required fields
-   - Skill naming conventions and directory structure
-   - Description completeness and quality
-   - File organization and resource references
-2. **Package** the skill if validation passes, creating a zip file named after the skill (e.g., `my-skill.zip`) that includes all files and maintains the proper directory structure for distribution.
-If validation fails, the script will report the errors and exit without creating a package. Fix any validation errors and run the packaging command again.
-### Step 6: Iterate
-After testing the skill, users may request improvements. Often this happens right after using the skill, with fresh context of how the skill performed.
-**Iteration workflow:**
-1. Use the skill on real tasks
-2. Notice struggles or inefficiencies
-3. Identify how SKILL.md or bundled resources should be updated
-4. Implement changes and test again