npm - @openrig/cli - Versions diffs - 0.1.2 → 0.1.4 - Mend

@openrig/cli 0.1.2 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (104)

package/daemon/specs/agents/shared/skills/brainstorming/SKILL.md ADDED

@@ -0,0 +1,96 @@
+---
+name: brainstorming
+description: "You MUST use this before any creative work - creating features, building components, adding functionality, or modifying behavior. Explores user intent, requirements and design before implementation."
+---
+# Brainstorming Ideas Into Designs
+## Overview
+Help turn ideas into fully formed designs and specs through natural collaborative dialogue.
+Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design and get user approval.
+<HARD-GATE>
+Do NOT invoke any implementation skill, write any code, scaffold any project, or take any implementation action until you have presented a design and the user has approved it. This applies to EVERY project regardless of perceived simplicity.
+</HARD-GATE>
+## Anti-Pattern: "This Is Too Simple To Need A Design"
+Every project goes through this process. A todo list, a single-function utility, a config change — all of them. "Simple" projects are where unexamined assumptions cause the most wasted work. The design can be short (a few sentences for truly simple projects), but you MUST present it and get approval.
+## Checklist
+You MUST create a task for each of these items and complete them in order:
+1. **Explore project context** — check files, docs, recent commits
+2. **Ask clarifying questions** — one at a time, understand purpose/constraints/success criteria
+3. **Propose 2-3 approaches** — with trade-offs and your recommendation
+4. **Present design** — in sections scaled to their complexity, get user approval after each section
+5. **Write design doc** — save to `docs/plans/YYYY-MM-DD-<topic>-design.md` and commit
+6. **Transition to implementation** — invoke writing-plans skill to create implementation plan
+## Process Flow
+```dot
+digraph brainstorming {
+    "Explore project context" [shape=box];
+    "Ask clarifying questions" [shape=box];
+    "Propose 2-3 approaches" [shape=box];
+    "Present design sections" [shape=box];
+    "User approves design?" [shape=diamond];
+    "Write design doc" [shape=box];
+    "Invoke writing-plans skill" [shape=doublecircle];
+    "Explore project context" -> "Ask clarifying questions";
+    "Ask clarifying questions" -> "Propose 2-3 approaches";
+    "Propose 2-3 approaches" -> "Present design sections";
+    "Present design sections" -> "User approves design?";
+    "User approves design?" -> "Present design sections" [label="no, revise"];
+    "User approves design?" -> "Write design doc" [label="yes"];
+    "Write design doc" -> "Invoke writing-plans skill";
+}
+```
+**The terminal state is invoking writing-plans.** Do NOT invoke frontend-design, mcp-builder, or any other implementation skill. The ONLY skill you invoke after brainstorming is writing-plans.
+## The Process
+**Understanding the idea:**
+- Check out the current project state first (files, docs, recent commits)
+- Ask questions one at a time to refine the idea
+- Prefer multiple choice questions when possible, but open-ended is fine too
+- Only one question per message - if a topic needs more exploration, break it into multiple questions
+- Focus on understanding: purpose, constraints, success criteria
+**Exploring approaches:**
+- Propose 2-3 different approaches with trade-offs
+- Present options conversationally with your recommendation and reasoning
+- Lead with your recommended option and explain why
+**Presenting the design:**
+- Once you believe you understand what you're building, present the design
+- Scale each section to its complexity: a few sentences if straightforward, up to 200-300 words if nuanced
+- Ask after each section whether it looks right so far
+- Cover: architecture, components, data flow, error handling, testing
+- Be ready to go back and clarify if something doesn't make sense
+## After the Design
+**Documentation:**
+- Write the validated design to `docs/plans/YYYY-MM-DD-<topic>-design.md`
+- Use elements-of-style:writing-clearly-and-concisely skill if available
+- Commit the design document to git
+**Implementation:**
+- Invoke the writing-plans skill to create a detailed implementation plan
+- Do NOT invoke any other skill. writing-plans is the next step.
+## Key Principles
+- **One question at a time** - Don't overwhelm with multiple questions
+- **Multiple choice preferred** - Easier to answer than open-ended when possible
+- **YAGNI ruthlessly** - Remove unnecessary features from all designs
+- **Explore alternatives** - Always propose 2-3 approaches before settling
+- **Incremental validation** - Present design, get approval before moving on
+- **Be flexible** - Go back and clarify when something doesn't make sense
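
Note on the design-doc path: the skill saves designs to `docs/plans/YYYY-MM-DD-<topic>-design.md` but does not say how `<topic>` is derived. Below is a minimal sketch of one way to build that path in shell. The function name `design_doc_path` and the slug rule (lowercase, runs of non-alphanumerics collapsed to a hyphen) are assumptions for illustration, not part of the package.

```bash
# Hypothetical helper: build docs/plans/YYYY-MM-DD-<topic>-design.md.
# The slug rule (lowercase, non-alphanumerics collapsed to "-") is an assumption.
design_doc_path() {
  topic=$1
  day=${2:-$(date +%Y-%m-%d)}   # optional second argument overrides today's date
  slug=$(printf '%s' "$topic" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9][^a-z0-9]*/-/g' -e 's/^-//' -e 's/-$//')
  printf 'docs/plans/%s-%s-design.md\n' "$day" "$slug"
}
```

For example, `design_doc_path "Rate Limiter" 2025-01-02` prints `docs/plans/2025-01-02-rate-limiter-design.md`.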

package/daemon/specs/agents/shared/skills/containerized-e2e/SKILL.md ADDED

@@ -0,0 +1,256 @@
+---
+name: containerized-e2e
+description: Run end-to-end dogfood tests inside Docker containers to simulate real user experiences. Use when you need to verify install paths, control plane functionality, UI rendering, or packaging correctness in a clean environment. Triggers include "containerized test", "Docker dogfood", "clean install test", "e2e in container", or testing that requires a fresh environment without dev-mode shortcuts.
+allowed-tools: Bash(docker:*), Bash(agent-browser:*), Bash(npx agent-browser:*)
+---
+# Containerized E2E Testing
+Run OpenRig (or any npm-installable CLI + web UI project) through end-to-end testing inside Docker containers, simulating real user install and usage scenarios.
+## When to Use
+- Verifying the npm install path works from a packed tarball
+- Testing control plane functionality without live agent runtimes
+- Checking UI rendering via agent-browser in a clean environment
+- Regression testing after packaging changes
+- Phase boundary acceptance gates
+## When NOT to Use
+- Testing live agent behavior (send/capture/broadcast, whoami from inside an agent, transcript capture) — these need real claude-code/codex runtimes on the host
+- Quick feedback during active TDD cycles — too slow for the edit/test loop
+## Prerequisites
+- Docker installed and running
+- The repo builds successfully (`npm run build` for all workspaces)
+- `agent-browser` skill loaded (for UI verification commands)
+## Testing Personas
+### Fresh User
+A brand new user who has never installed OpenRig. Exercises the first-run experience.
+- Empty `~/.openrig/` directory
+- No existing rigs, snapshots, or specs
+- Tests: install, daemon start, preflight, doctor, first rig up, UI renders
+### Mature User
+A user with existing OpenRig state — rigs, snapshots, library additions, transcripts.
+- Pre-populated `~/.openrig/` via Docker volume persistence
+- Build this state organically: run the fresh-user tests first, and the volume accumulates real state
+- Tests: restore from existing snapshots, expand existing rigs, library with user-added specs, upgrade-path behaviors
+To set up a mature user volume:
+```bash
+# Create a named volume
+docker volume create openrig-mature-user
+# Run fresh-user tests with the volume mounted
+docker run -it --rm --shm-size=1g \
+  -v openrig-mature-user:/root/.openrig \
+  -v /tmp/openrig-e2e-artifacts:/artifacts \
+  openrig-e2e
+# The volume now has real state from actual rig commands
+# Subsequent runs with the same volume simulate a mature user
+```
+## Workflow
+### 1. Build the E2E Image
+Use the provided build script or do it manually:
+```bash
+# Using the build script (recommended)
+bash {SKILL_DIR}/scripts/build-e2e-image.sh /path/to/repo
+# Or manually:
+cd /path/to/repo
+npm run build --workspace @openrig/daemon
+npm run build --workspace @openrig/ui
+npm run build --workspace @openrig/cli
+bash scripts/build-package.sh
+mkdir -p /tmp/e2e-build && cd packages/cli && npm pack --pack-destination /tmp/e2e-build
+cp {SKILL_DIR}/scripts/Dockerfile /tmp/e2e-build/
+mv /tmp/e2e-build/openrig-cli-*.tgz /tmp/e2e-build/openrig-cli.tgz
+cd /tmp/e2e-build && docker build -t openrig-e2e:latest .
+```
+### 2. Start the Container
+```bash
+# Fresh user (ephemeral state)
+docker run -d --rm --name openrig-e2e \
+  --shm-size=1g \
+  -v /tmp/openrig-e2e-artifacts:/artifacts \
+  openrig-e2e sleep infinity
+# Mature user (persistent volume)
+docker run -d --rm --name openrig-e2e \
+  --shm-size=1g \
+  -v openrig-mature-user:/root/.openrig \
+  -v /tmp/openrig-e2e-artifacts:/artifacts \
+  openrig-e2e sleep infinity
+```
+**Important:** Always use `--shm-size=1g` for Chromium stability during browser tests.
+### 3. Run Tests Inside the Container
+Execute commands via `docker exec`:
+```bash
+# Start the daemon
+docker exec openrig-e2e rig daemon start
+# Run preflight and doctor
+docker exec openrig-e2e rig preflight --json
+docker exec openrig-e2e rig doctor --json
+# Copy test specs into the container
+docker cp {SKILL_DIR}/templates/control-plane-test.yaml openrig-e2e:/workspace/
+docker cp {SKILL_DIR}/templates/expansion-pod-fragment.yaml openrig-e2e:/workspace/
+docker cp {SKILL_DIR}/templates/expansion-collision-fragment.yaml openrig-e2e:/workspace/
+# Launch a rig
+docker exec openrig-e2e rig up /workspace/control-plane-test.yaml --json
+# Check topology
+docker exec openrig-e2e rig ps --json
+docker exec openrig-e2e rig ps --nodes --json
+```
+### 4. Browser Testing Inside the Container
+agent-browser runs inside the container via `docker exec`:
+```bash
+# Open the daemon UI
+docker exec openrig-e2e agent-browser open http://127.0.0.1:7433
+docker exec openrig-e2e agent-browser wait --load networkidle
+# Inspect interactive elements
+docker exec openrig-e2e agent-browser snapshot -i
+# Capture screenshots
+docker exec openrig-e2e agent-browser screenshot /artifacts/screenshots/dashboard.png
+docker exec openrig-e2e agent-browser screenshot --annotate /artifacts/screenshots/dashboard-annotated.png
+# Navigate and verify specific surfaces
+docker exec openrig-e2e agent-browser click @e4  # Open specs drawer (ref from snapshot)
+docker exec openrig-e2e agent-browser wait 1000
+docker exec openrig-e2e agent-browser screenshot /artifacts/screenshots/specs-drawer.png
+```
+**ARM64 note:** The Dockerfile uses Debian's system chromium instead of Chrome for Testing, which is unavailable on Linux ARM64. The environment variables `AGENT_BROWSER_EXECUTABLE_PATH` and `AGENT_BROWSER_ARGS` are set in the image.
+### 5. Test Scenarios
+#### Control Plane Lifecycle
+```bash
+# Launch the multi-pod test spec
+docker exec openrig-e2e rig up /workspace/control-plane-test.yaml --json
+RIG_ID=$(docker exec openrig-e2e rig ps --json | jq -r '.[0].rigId')
+# Verify topology
+docker exec openrig-e2e rig ps --nodes --json
+# Expand with a new pod
+docker exec openrig-e2e rig expand "$RIG_ID" /workspace/expansion-pod-fragment.yaml --json
+# Verify expansion
+docker exec openrig-e2e rig ps --nodes --json
+# Test validation rejection (colliding namespace)
+docker exec openrig-e2e rig expand "$RIG_ID" /workspace/expansion-collision-fragment.yaml --json
+# Should fail with namespace collision error, rig unchanged
+# Snapshot
+docker exec openrig-e2e rig down "$RIG_ID" --snapshot --json
+# Restore and verify
+SNAPSHOT_ID=$(docker exec openrig-e2e rig snapshot list "$RIG_ID" | awk 'NR==2 {print $1}')
+docker exec openrig-e2e rig restore "$SNAPSHOT_ID" --rig "$RIG_ID"
+# Export
+docker exec openrig-e2e rig export "$RIG_ID" -o /artifacts/captures/exported-rig.yaml
+```
+#### UI Verification
+```bash
+# After launching a rig, verify the graph renders
+docker exec openrig-e2e agent-browser open http://127.0.0.1:7433
+docker exec openrig-e2e agent-browser wait --load networkidle
+docker exec openrig-e2e agent-browser snapshot -i
+docker exec openrig-e2e agent-browser screenshot --annotate /artifacts/screenshots/graph-with-rig.png
+# Open drawers and verify content
+docker exec openrig-e2e agent-browser snapshot -i  # Get fresh refs
+# Click through specs drawer, discovery drawer, rig detail, etc.
+# Take screenshots at each step
+```
+### 6. Cleanup
+```bash
+docker exec openrig-e2e rig daemon stop
+docker exec openrig-e2e agent-browser close
+docker stop openrig-e2e
+```
+### 7. Write Report
+Copy the report template and fill it in:
+```bash
+cp {SKILL_DIR}/templates/e2e-report-template.md /tmp/openrig-e2e-artifacts/report.md
+```
+Fill in results as tests complete — do not batch findings for the end.
+## Test Spec Templates
+| Template | Purpose |
+|----------|---------|
+| `templates/control-plane-test.yaml` | Multi-pod terminal-only rig spec (backend + frontend, 3 nodes, cross-pod edges) |
+| `templates/expansion-pod-fragment.yaml` | Pod fragment for expansion happy path (ops pod with cross-pod edge) |
+| `templates/expansion-collision-fragment.yaml` | Pod fragment that intentionally collides — for validation rejection testing |
+| `templates/e2e-report-template.md` | Structured test report template |
+## Scripts
+| Script | Purpose |
+|--------|---------|
+| `scripts/build-e2e-image.sh` | Build the Docker image from the repo (builds packages, packs tarball, builds image) |
+| `scripts/Dockerfile` | The proven Dockerfile — Node 22, tmux, system Chromium, agent-browser, OpenRig CLI |
+## Limitations
+- **No live agent runtimes.** The container does not include claude-code or codex. Use `runtime: terminal` specs for control plane testing. Test live agent behavior on the host.
+- **ARM64 browser workaround.** Chrome for Testing is unavailable on Linux ARM64. The Dockerfile uses Debian chromium. This is transparent to agent-browser commands.
+- **No GPU/display.** All browser testing is headless. Screenshots and videos capture what a user would see, but there is no visible browser window.
+## Combining with Host-Based Dogfood
+For complete coverage, use both approaches:
+| What to test | Where | Tool |
+|-------------|-------|------|
+| Install path, packaging | Container | This skill |
+| CLI commands, lifecycle | Container | This skill |
+| UI rendering, drawers | Container | This skill + agent-browser |
+| Validation/error paths | Container | This skill |
+| Live agent startup | Host | QA with /dogfood skill |
+| Communication (send/capture) | Host | QA with live agents |
+| Whoami from inside agent | Host | QA with live agents |
+| Transcript capture | Host | QA with live agents |
+| Chatroom with real participants | Host | QA with live agents |
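
Note on the lifecycle scenario: every step repeats `docker exec openrig-e2e`, and the snapshot ID is taken from the second line of `rig snapshot list` without checking that a snapshot exists. Below is a rough sketch of two small wrappers that could tighten this. The names `in_container`, `first_snapshot_id`, `E2E_CONTAINER`, and `DOCKER_BIN` are hypothetical. The listing layout (one header line, ID in the first column) is assumed from the skill's own `awk 'NR==2 {print $1}'`.

```bash
# Hypothetical wrappers around the commands shown in the skill.
E2E_CONTAINER=${E2E_CONTAINER:-openrig-e2e}
DOCKER_BIN=${DOCKER_BIN:-docker}   # overridable so the helpers can be dry-run

# Run a command inside the E2E container.
in_container() {
  "$DOCKER_BIN" exec "$E2E_CONTAINER" "$@"
}

# Read a snapshot listing on stdin and print the first snapshot ID.
# Assumes a one-line header followed by rows whose first column is the ID.
first_snapshot_id() {
  id=$(awk 'NR==2 {print $1}')
  if [ -z "$id" ]; then
    echo "no snapshot found in listing" >&2
    return 1
  fi
  printf '%s\n' "$id"
}
```

With these, the restore step might read `SNAPSHOT_ID=$(in_container rig snapshot list "$RIG_ID" | first_snapshot_id) || exit 1`, which stops early instead of calling `rig restore` with an empty ID.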

package/daemon/specs/agents/shared/skills/containerized-e2e/scripts/Dockerfile ADDED

@@ -0,0 +1,39 @@
+# OpenRig Containerized E2E Testing Image
+#
+# Provides: Node 22, tmux, agent-browser (with system Chromium), OpenRig CLI
+# Usage:
+#   1. Build the CLI tarball: cd <repo>/packages/cli && npm pack
+#   2. Copy tarball to build context as openrig-cli.tgz
+#   3. docker build -t openrig-e2e .
+#   4. docker run -it --rm --shm-size=1g -v /tmp/openrig-e2e-artifacts:/artifacts openrig-e2e
+#
+# ARM64 note: Chrome for Testing is unavailable on Linux ARM64.
+# This image uses Debian's system chromium instead, pointed via
+# AGENT_BROWSER_EXECUTABLE_PATH.
+FROM node:22-bookworm
+ENV DEBIAN_FRONTEND=noninteractive
+ENV AGENT_BROWSER_ARGS=--no-sandbox
+ENV AGENT_BROWSER_EXECUTABLE_PATH=/usr/bin/chromium
+ENV OPENRIG_HOME=/root/.openrig
+RUN apt-get update \
+ && apt-get install -y --no-install-recommends \
+    ca-certificates \
+    chromium \
+    curl \
+    git \
+    procps \
+    tmux \
+ && rm -rf /var/lib/apt/lists/*
+RUN npm install -g agent-browser
+COPY openrig-cli.tgz /tmp/openrig-cli.tgz
+RUN npm install -g /tmp/openrig-cli.tgz \
+ && rm /tmp/openrig-cli.tgz
+WORKDIR /workspace
+CMD ["/bin/bash"]
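
Note on the browser environment: the image relies on `AGENT_BROWSER_EXECUTABLE_PATH` and `AGENT_BROWSER_ARGS` being set correctly, so a quick check before browser tests can save a confusing failure later. The sketch below is one possible check. The function name `check_browser_env` is hypothetical and nothing like it ships in the package.

```bash
# Hypothetical sanity check for the browser-related ENV values set in the image.
check_browser_env() {
  if [ -z "${AGENT_BROWSER_EXECUTABLE_PATH:-}" ]; then
    echo "AGENT_BROWSER_EXECUTABLE_PATH is not set" >&2
    return 1
  fi
  if [ ! -x "$AGENT_BROWSER_EXECUTABLE_PATH" ]; then
    echo "not executable: $AGENT_BROWSER_EXECUTABLE_PATH" >&2
    return 1
  fi
  printf 'browser: %s args: %s\n' \
    "$AGENT_BROWSER_EXECUTABLE_PATH" "${AGENT_BROWSER_ARGS:-}"
}
```

The function is meant to run inside the container, for example from a script copied in with `docker cp` and started with `docker exec`.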

package/daemon/specs/agents/shared/skills/containerized-e2e/scripts/build-e2e-image.sh ADDED

@@ -0,0 +1,37 @@
+#!/bin/bash
+# Build the OpenRig containerized E2E testing image.
+#
+# Usage: ./build-e2e-image.sh [repo-root]
+#   repo-root defaults to the current directory.
+#
+# Produces: Docker image tagged openrig-e2e:latest
+set -euo pipefail
+REPO_ROOT="$(cd "${1:-.}" && pwd)"  # absolute path, so the later cd calls also work for a relative argument
+SKILL_DIR="$(cd "$(dirname "$0")/.." && pwd)"
+BUILD_CTX="/tmp/openrig-e2e-build"
+echo "=== Building OpenRig packages ==="
+cd "$REPO_ROOT"
+npm run build --workspace @openrig/daemon
+npm run build --workspace @openrig/ui
+npm run build --workspace @openrig/cli
+bash scripts/build-package.sh
+echo "=== Packing CLI tarball ==="
+mkdir -p "$BUILD_CTX"
+cd "$REPO_ROOT/packages/cli"
+npm pack --pack-destination "$BUILD_CTX"
+echo "=== Preparing Docker build context ==="
+cp "$SKILL_DIR/scripts/Dockerfile" "$BUILD_CTX/Dockerfile"
+# Rename tarball to a stable name
+mv "$BUILD_CTX"/openrig-cli-*.tgz "$BUILD_CTX/openrig-cli.tgz"
+echo "=== Building Docker image ==="
+cd "$BUILD_CTX"
+docker build -t openrig-e2e:latest .
+echo "=== Done ==="
+echo "Run: docker run -it --rm --shm-size=1g -v /tmp/openrig-e2e-artifacts:/artifacts openrig-e2e"
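
Note on the tarball rename: the `mv "$BUILD_CTX"/openrig-cli-*.tgz ...` step assumes the glob matches exactly one file. The build context in `/tmp` is reused between runs, so a versioned tarball left behind (for example after an interrupted run) would make the glob match two files and `mv` fail with an unclear message. Below is a minimal sketch of a guard. The function name `single_tarball` is hypothetical.

```bash
# Hypothetical guard: print the one openrig-cli-*.tgz in a directory,
# or fail if there are zero or several matches.
single_tarball() {
  dir=$1
  set -- "$dir"/openrig-cli-*.tgz   # expand the glob into the function's arguments
  if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
    echo "expected exactly one openrig-cli-*.tgz in $dir" >&2
    return 1
  fi
  printf '%s\n' "$1"
}
```

The rename could then be written as `mv "$(single_tarball "$BUILD_CTX")" "$BUILD_CTX/openrig-cli.tgz"`, which under `set -e` stops the script with a clear message when the match is ambiguous.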

package/daemon/specs/agents/shared/skills/containerized-e2e/templates/control-plane-test.yaml ADDED

@@ -0,0 +1,40 @@
+version: "0.2"
+name: control-plane-test
+summary: >
+  Multi-pod terminal-only topology for containerized control-plane testing.
+  Exercises pods, cross-pod edges, expansion targets, and snapshot/restore
+  without requiring live coding runtimes.
+pods:
+  - id: backend
+    label: Backend Team
+    members:
+      - id: api
+        runtime: terminal
+        agent_ref: "builtin:terminal"
+        profile: none
+        cwd: /tmp
+      - id: db
+        runtime: terminal
+        agent_ref: "builtin:terminal"
+        profile: none
+        cwd: /tmp
+    edges:
+      - kind: delegates_to
+        from: api
+        to: db
+  - id: frontend
+    label: Frontend Team
+    members:
+      - id: ui
+        runtime: terminal
+        agent_ref: "builtin:terminal"
+        profile: none
+        cwd: /tmp
+    edges: []
+edges:
+  - kind: delegates_to
+    from: frontend.ui
+    to: backend.api

package/daemon/specs/agents/shared/skills/containerized-e2e/templates/e2e-report-template.md ADDED

@@ -0,0 +1,94 @@
+# Containerized E2E Test Report
+Date: {{DATE}}
+Persona: {{PERSONA}}
+Image: openrig-e2e:latest
+## Summary
+- Tests run: {{TOTAL}}
+- Passed: {{PASSED}}
+- Failed: {{FAILED}}
+- Skipped: {{SKIPPED}}
+## Environment
+- Node: {{NODE_VERSION}}
+- tmux: {{TMUX_VERSION}}
+- Chromium: {{CHROMIUM_VERSION}}
+- OpenRig CLI: {{RIG_VERSION}}
+- Platform: {{PLATFORM}}
+## Test Results
+### Install & Boot
+| Test | Result | Notes |
+|------|--------|-------|
+| npm install -g | | |
+| rig daemon start | | |
+| rig preflight | | |
+| rig doctor | | |
+| UI loads in browser | | |
+### Rig Lifecycle
+| Test | Result | Notes |
+|------|--------|-------|
+| rig up (terminal-only spec) | | |
+| rig ps / rig ps --nodes | | |
+| Graph renders in browser | | |
+| rig down --snapshot | | |
+| rig restore | | |
+| Restored nodes match | | |
+### Expansion
+| Test | Result | Notes |
+|------|--------|-------|
+| rig expand (happy path) | | |
+| Graph updates after expand | | |
+| ps --nodes shows new nodes | | |
+| Expand with collision (rejected) | | |
+| Rig unchanged after rejection | | |
+### Snapshot/Restore with Expansion
+| Test | Result | Notes |
+|------|--------|-------|
+| Snapshot captures expanded pods | | |
+| Restore brings back expanded pods | | |
+| Cross-pod edges survive restore | | |
+| Export includes expanded topology | | |
+### CLI Surface
+| Test | Result | Notes |
+|------|--------|-------|
+| rig specs ls | | |
+| rig specs show | | |
+| rig config | | |
+| rig whoami (daemon down) | | |
+| rig export | | |
+### UI Surface (agent-browser)
+| Test | Result | Notes |
+|------|--------|-------|
+| Dashboard renders | | |
+| Explorer sidebar | | |
+| Specs drawer opens | | |
+| Discovery drawer opens | | |
+| System drawer opens | | |
+| Rig detail drawer | | |
+| Node detail in graph | | |
+## Bugs Found
+(Append each bug as discovered — do not batch)
+## Artifacts
+- Screenshots: /artifacts/screenshots/
+- Videos: /artifacts/videos/
+- CLI transcript: /artifacts/cli-transcript.txt
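
Note on the report placeholders: the template uses `{{NAME}}` markers, and the skill only says to copy the file and fill it in by hand. Below is a minimal sketch of filling the two header fields with `sed`. The function name `fill_report` is hypothetical, and the remaining placeholders are deliberately left untouched.

```bash
# Hypothetical helper: fill {{DATE}} and {{PERSONA}} in a report template read
# from stdin. "|" is the sed delimiter, so the values must not contain "|",
# "&", or backslashes.
fill_report() {
  sed -e "s|{{DATE}}|$1|g" -e "s|{{PERSONA}}|$2|g"
}
```

A possible invocation is `fill_report "$(date +%Y-%m-%d)" fresh-user < {SKILL_DIR}/templates/e2e-report-template.md > /tmp/openrig-e2e-artifacts/report.md`.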

package/daemon/specs/agents/shared/skills/containerized-e2e/templates/expansion-collision-fragment.yaml ADDED

@@ -0,0 +1,13 @@
+# This fragment intentionally collides with the existing 'backend' pod.
+# Use it to verify that expansion validation rejects namespace collisions
+# and leaves the rig unchanged.
+pod:
+  id: backend
+  label: Colliding Pod
+  members:
+    - id: worker
+      runtime: terminal
+      agent_ref: "builtin:terminal"
+      profile: none
+      cwd: /tmp
+  edges: []
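
Note on the rejection check: this fragment is only useful if the test actually asserts that `rig expand` fails and that the topology is unchanged afterwards. Below is a rough sketch of such an assertion. The helper name `expect_failure` is hypothetical. The sketch also assumes that `rig expand` exits non-zero on a rejected fragment and that `rig ps --nodes --json` output is stable between calls, neither of which the diff confirms.

```bash
# Hypothetical helper: succeed only if the given command fails.
expect_failure() {
  if "$@" >/dev/null 2>&1; then
    echo "expected failure, but command succeeded: $*" >&2
    return 1
  fi
}

# Possible use with the collision fragment (assumes a non-zero exit on rejection):
#   before=$(docker exec openrig-e2e rig ps --nodes --json)
#   expect_failure docker exec openrig-e2e rig expand "$RIG_ID" \
#     /workspace/expansion-collision-fragment.yaml --json
#   after=$(docker exec openrig-e2e rig ps --nodes --json)
#   [ "$before" = "$after" ] || echo "rig changed after rejected expand" >&2
```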

package/daemon/specs/agents/shared/skills/containerized-e2e/templates/expansion-pod-fragment.yaml ADDED

@@ -0,0 +1,14 @@
+pod:
+  id: ops
+  label: Operations
+  members:
+    - id: monitor
+      runtime: terminal
+      agent_ref: "builtin:terminal"
+      profile: none
+      cwd: /tmp
+  edges: []
+crossPodEdges:
+  - kind: delegates_to
+    from: ops.monitor
+    to: backend.api