npm - buildwright - Versions diffs - 0.0.8 → 0.0.10 - Mend

buildwright 0.0.8 → 0.0.10

This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in the public registry.

Files changed (12)

package/README.md +401 -45
package/bin/buildwright.js +8 -0
package/package.json +1 -1
package/src/commands/commands.js +81 -0
package/templates/.buildwright/commands/bw-help.md +4 -1
package/templates/.buildwright/commands/bw-new-feature.md +23 -0
package/templates/.buildwright/commands/bw-plan.md +210 -0
package/templates/.buildwright/commands/bw-quick.md +11 -0
package/templates/CLAUDE.md +3 -1
package/templates/Makefile +8 -1
package/templates/scripts/sync-agents.sh +32 -1
package/templates/scripts/validate-docs.sh +67 -0

package/README.md CHANGED

@@ -1,81 +1,437 @@
-# buildwright
+# Buildwright
-Agent-first autonomous development workflow. Ship code you don't read.
+**Ship code you don't read. Let automated systems be your reviewer.**
-## Install
+An agent-first autonomous development workflow where humans approve specifications and agents handle everything else — implementation, testing, security review, code review, and shipping.
+---
+## The Flow
+```mermaid
+flowchart TD
+      A["/bw-new-feature"] --> B{Greenfield?}
+      B -- Yes --> C[Ask product vision<br>Derive tech stack from vision]
+      C --> D
+      B -- No --> D["1. RESEARCH<br>Deep-read codebase"]
+      D --> E["1.5. RESOLVE AMBIGUITIES<br>Auto-decide or ask user<br>(BUILDWRIGHT_AUTO_APPROVE)"]
+      E --> F["2. PLAN<br>Generate spec"]
+      F --> G["3. VALIDATE<br>Staff Engineer review (auto)"]
+      G --> H["4. APPROVE<br>Human or auto<br>(BUILDWRIGHT_AUTO_APPROVE)"]
+      H --> I["5. BUILD<br>TDD per milestone<br>→ verify after each"]
+      I --> J["6.5 UPDATE DOCS<br>README · CHANGELOG · docs/"]
+      J --> K["7. SHIP"]
+      K --> L[Verify]
+      L --> M[Security]
+      M --> N[Review]
+      N --> O["PR Ready ✓"]
+      P["/bw-quick"] --> Q[Quick research]
+      Q --> R[Implement TDD]
+      R --> S[Verify]
+      S --> U[Security]
+      U --> V[Code Review]
+      V --> W[Update Docs]
+      W --> T["Commit Ready ✓"]
+```
+> If anything fails → commit completed work, push, PR with failure report, exit(1). No orphaned branches.
+---
+## Greenfield Projects
+Starting a new project? Buildwright handles it:
+```
+/bw-new-feature "Add product catalog with search"
+> "This looks like a new project. What is the product vision, and do you have
+> any tech constraints (team expertise, deployment environment, integrations,
+> compliance)?"
+E-commerce platform for handmade crafts. Team knows Python. Deploying to AWS Lambda.
+> [AI generates steering docs + spec]
+> [Derives and presents tech stack for approval]
+PROPOSED TECH STACK
+───────────────────
+[Stack derived from your product vision and constraints]
+Chosen because: [2-3 sentences linking requirements to stack]
+Alternatives considered: [brief list]
+Reply "approved" to proceed with this stack.
+Or adjust: "approved, but use PostgreSQL instead of DynamoDB"
+```
+One question. One approval. Tech stack + spec reviewed together.
+---
+## Autonomous Mode
+Want fully autonomous operation? Skip human approval entirely:
 ```bash
-npm install -g buildwright
+# Set environment variable
+export BUILDWRIGHT_AUTO_APPROVE=true
+# Then run as usual
+/bw-new-feature "Add user authentication"
 ```
-Requires Node.js 18+.
+**What happens:**
+- Research, plan, validate still run (quality preserved)
+- Spec documents committed to git BEFORE implementation
+- No approval wait — proceeds directly to build
+- Full audit trail in version control
+```
+docs(spec): add specification for user-auth
+- research.md: codebase analysis
+- spec.md: implementation plan
+- Validated by Staff Engineer agent
+Auto-approved: BUILDWRIGHT_AUTO_APPROVE=true
+```
+**Use autonomous mode when:**
+- You trust the workflow for routine features
+- Running in CI/CD pipelines
+- Batch processing multiple features
+- You want to review specs via git history instead of real-time
+---
 ## Quick Start
+### npm (Recommended)
 ```bash
-# Navigate to your project
-cd my-project
-git init  # if not already a git repo
+npm install -g buildwright
+cd my-project && buildwright init
+```
+Requires Node.js 18+. All templates are bundled — works offline after install. Run `buildwright update` to pull the latest commands/agents/claws from GitHub.
+### For Claude Code
-# Set up Buildwright
-buildwright init
+```bash
+# Add to any project
+curl -sL https://raw.githubusercontent.com/raunakkathuria/buildwright/main/setup.sh | bash
-# Customize for your project
-nano .buildwright/steering/product.md   # add product context
-nano .buildwright/steering/tech.md      # add tech stack + commands
+# Customize steering docs
+nano .buildwright/steering/product.md   # Your product context
+nano .buildwright/steering/tech.md      # Your tech stack
 # Start building
 claude
-> /bw-new-feature "Add user authentication"
+> /bw-new-feature "Add user authentication with OAuth2"
 ```
-## Commands
+### For an existing clone
-| Command | Description |
-|---------|-------------|
-| `buildwright init` | Set up Buildwright in the current project |
-| `buildwright update` | Update commands/agents/claws from GitHub (preserves your steering docs) |
-| `buildwright sync` | Re-sync `.buildwright/` to `.claude/`, `.opencode/`, `.cursor/rules/` |
+```bash
+# After cloning the repo, generate tool-specific configs
+make sync
-## What `init` Does
+# This creates:
+#   .claude/        (commands, agents, claws, steering — from .buildwright/)
+#   .opencode/      (commands, agents, claws, steering — from .buildwright/)
+#   .cursor/rules/  (commands, agents, claws, steering — from .buildwright/)
+#   AGENTS.md       (from CLAUDE.md — for OpenCode compatibility)
-1. Copies all Buildwright templates to your project:
-   - `.buildwright/` — canonical config (agents, claws, commands, steering)
-   - `scripts/` — sync and hook scripts
-   - `Makefile`, `CLAUDE.md`, `BUILDWRIGHT.md`
-2. Makes scripts executable
-3. Runs `make sync` to generate `.claude/`, `.opencode/`, `.cursor/rules/`
-4. Installs git hooks for auto-sync on `.buildwright/` changes
+# Install git hooks to auto-sync on .buildwright/ changes
+make install-hooks
+```
-## What `update` Does
+### For OpenClaw
-Downloads the latest release from GitHub and updates:
-- `.buildwright/commands/` — slash command definitions
-- `.buildwright/agents/` — agent personas
-- `.buildwright/claws/` — domain specialist prompts
-- `CLAUDE.md` — agent instructions
+The recommended approach is to run the setup script, which installs the full workflow (commands, agents, claws, steering docs) into your project:
+```bash
+curl -sL https://raw.githubusercontent.com/raunakkathuria/buildwright/main/setup.sh | bash
+make sync
+```
+Alternatively, install just the skill globally for reference across all projects:
+```bash
+mkdir -p ~/.openclaw/skills/buildwright
+curl -s https://raw.githubusercontent.com/raunakkathuria/buildwright/main/SKILL.md > ~/.openclaw/skills/buildwright/SKILL.md
+```
-**Preserves** your customizations in `.buildwright/steering/` (product.md, tech.md, etc.).
+> **Note:** The global skill install provides buildwright's workflow guidance via SKILL.md. The slash commands (`/bw-new-feature`, `/bw-claw`, etc.) require the full project setup above.
-## Slash Commands (inside AI editors)
+### For OpenCode
+Same as above — run the setup script for the full workflow:
+```bash
+curl -sL https://raw.githubusercontent.com/raunakkathuria/buildwright/main/setup.sh | bash
+make sync
+```
+Or install the skill globally:
+```bash
+mkdir -p ~/.config/opencode/skills/buildwright
+curl -s https://raw.githubusercontent.com/raunakkathuria/buildwright/main/SKILL.md > ~/.config/opencode/skills/buildwright/SKILL.md
+```
+> **Note:** The global skill install provides buildwright's workflow guidance via SKILL.md. The slash commands require the full project setup.
+### For Cursor
+Same setup — run the setup script for the full workflow:
+```bash
+curl -sL https://raw.githubusercontent.com/raunakkathuria/buildwright/main/setup.sh | bash
+```
+Cursor rules are generated automatically in `.cursor/rules/` by the sync step. Open the project in Cursor — steering rules apply always, commands/agents/claws apply intelligently.
+### For Codex CLI
+Buildwright skills are discovered by Codex via `~/.agents/skills/`. After cloning:
+```bash
+cd ~/.codex/buildwright && make codex
+# Creates: ~/.agents/skills/buildwright → skills/
+```
+Or follow the detailed guide in [`.codex/INSTALL.md`](.codex/INSTALL.md).
+Each `bw-*` command is available as a Codex skill. `AGENTS.md` (generated by `make sync`) is also read automatically by Codex at the project level.
+---
+## When to Use What
+| Scenario | Approach | Why |
+|----------|----------|-----|
+| New feature, unclear scope | `/bw-new-feature` | Research prevents building the wrong thing |
+| New feature, clear scope | `/bw-new-feature` | Spec creates audit trail + validation |
+| Bug fix | `/bw-quick` | Fast path with full quality gates |
+| Small task (< 2 hrs) | `/bw-quick` | Lightweight planning, full quality gates |
+| Config change | `/bw-quick` | Quick path with security scan + code review |
+| Refactor, clear scope | `/bw-quick` | You already know what to change |
+| Refactor, unclear scope | `/bw-new-feature` | Research phase prevents breaking things |
+| Unfamiliar / brownfield codebase | `/bw-analyse` | Generates stack, architecture, conventions, and concerns docs so every session starts with real context |
+| Greenfield project | `/bw-new-feature` | Auto-generates steering docs + tech stack |
+| Prototype / spike | Just code it | Ceremony kills exploration speed |
+| One-off script | Just code it | No need for spec, review, or CI |
+| Learning / experimenting | Just code it | Pipeline adds friction to discovery |
+---
+## Commands
 | Command | Purpose |
 |---------|---------|
+| `/bw-analyse` | Analyse codebase: writes stack, architecture, conventions, concerns to `.buildwright/codebase/`, updates `tech.md` |
 | `/bw-new-feature` | Full pipeline: research → spec → approve → build → ship |
+| `/bw-claw` | Multi-agent: architect decomposes → claws execute per domain |
 | `/bw-quick` | Fast path for bug fixes, small tasks |
-| `/bw-claw` | Cross-domain features (DB + API + UI) |
-| `/bw-ship` | Quality gates + push + PR |
-| `/bw-verify` | Quick typecheck/lint/test/build |
-| `/bw-analyse` | Analyse brownfield codebase |
-| `/bw-help` | Show all commands |
+| `/bw-plan` | Research a question, produce a written deliverable — no implementation, no commits |
+| `/bw-ship` | Quality gates + release: verify → security → review → push → PR |
+| `/bw-verify` | Quick checks: typecheck, lint, test, build |
+| `/bw-help` | Show available commands |
+---
+## Agent Personas
+Modular, extensible agent personas in `.buildwright/agents/`:
+| Agent | File | Used By | Purpose |
+|-------|------|---------|---------|
+| Staff Engineer | `staff-engineer.md` | `/bw-new-feature`, `/bw-ship` | Spec & code review |
+| Security Engineer | `bw-security-engineer.md` | `/bw-ship` | OWASP, SAST, security review |
+### Adding New Agents
+```bash
+# Create new agent persona
+cat > .buildwright/agents/qa-engineer.md << 'EOF'
+# QA Engineer Agent
+You are a QA Engineer specialized in test coverage...
+## What You Look For
+...
+## Output Format
+...
+EOF
+```
+Then reference in commands:
+```markdown
+## Adopt Persona
+Read and adopt persona from `.buildwright/agents/qa-engineer.md`
+```
+---
+## Security Review
+The security phase in `/bw-ship` covers:
+| Category | Checks |
+|----------|--------|
+| **OWASP Top 10** | All 10 categories (A01-A10:2021) |
+| **SAST** | Static analysis via Semgrep |
+| **Secrets** | API keys, passwords, tokens, private keys |
+| **Dependencies** | npm audit, cargo audit, pip-audit |
+| **Financial** | No float for currency, transaction integrity, audit logging |
+### Severity Triage
+Findings are classified by severity to avoid blocking on low-risk items:
+| Severity | Action | Example |
+|----------|--------|---------|
+| **Critical / High** | Block — must fix before merge | SQL injection, exposed secrets, auth bypass |
+| **Medium** | Fix in this PR if feasible, otherwise track | Missing rate limiting, verbose error messages |
+| **Low / Info** | Advisory — log and move on | Minor header hardening, informational findings |
+---
+## Project Structure
+```
+your-project/
+├── CLAUDE.md                      # Agent instructions (committed)
+├── BUILDWRIGHT.md                 # Human documentation (committed)
+├── SKILL.md                       # Agent Skills standard (committed)
+├── .buildwright/                  # Canonical config (committed)
+│   ├── agents/                    # Reusable personas
+│   │   ├── architect.md
+│   │   ├── staff-engineer.md
+│   │   └── security-engineer.md
+│   ├── claws/                     # Domain specialists
+│   │   ├── frontend.md
+│   │   ├── backend.md
+│   │   ├── database.md
+│   │   └── TEMPLATE.md
+│   ├── codebase/                  # Analysis docs (generated by /bw-analyse)
+│   │   ├── STACK.md
+│   │   ├── ARCHITECTURE.md
+│   │   ├── CONVENTIONS.md
+│   │   └── CONCERNS.md
+│   ├── commands/                  # Slash commands
+│   │   ├── bw-analyse.md
+│   │   ├── bw-claw.md
+│   │   ├── bw-new-feature.md
+│   │   ├── bw-plan.md
+│   │   ├── bw-quick.md
+│   │   ├── bw-ship.md
+│   │   ├── bw-verify.md
+│   │   └── bw-help.md
+│   ├── steering/                  # Project context
+│   │   ├── product.md
+│   │   ├── tech.md
+│   │   ├── quality-gates.md
+│   │   └── naming-conventions.md
+│   └── tasks/
+├── .claude/                       # Generated by `make sync` (gitignored)
+│   ├── settings.json              # Claude Code permissions (committed)
+│   └── agents/, claws/, commands/, steering/  (generated)
+├── .opencode/                     # Generated by `make sync` (gitignored)
+├── .cursor/rules/                 # Generated by `make sync` (gitignored)
+├── AGENTS.md                      # Generated by `make sync` (gitignored)
+├── scripts/
+│   ├── sync-agents.sh             # Sync .buildwright/ → .claude/ + .opencode/ + .cursor/rules/
+│   ├── install-hooks.sh           # Install git hooks (run via: make install-hooks)
+│   ├── validate-skill.sh          # Validate SKILL.md against agentskills.io
+│   └── hooks/                     # Git hooks source (committed, installed to .git/hooks/)
+│       ├── pre-commit             # Auto-sync when .buildwright/ files are staged
+│       ├── post-merge             # Auto-sync after git pull/merge
+│       └── post-checkout          # Auto-sync on branch switches
+├── docs/
+│   ├── requirements/
+│   └── specs/
+└── .github/
+    └── workflows/
+        └── quality-gates.yml
+```
+> **Note:** After cloning, run `make sync && make install-hooks` to generate tool-specific configs and install git hooks that auto-sync on `.buildwright/` changes.
+---
+## Design Principles (Built-In)
+Every spec and implementation follows:
+| Principle | Meaning |
+|-----------|---------|
+| **KISS** | Keep It Simple — prefer simple over clever |
+| **YAGNI** | You Aren't Gonna Need It — build only what's needed now |
+| **No Premature Optimization** | Make it work first, optimize with data |
+| **Boring Technology** | Proven tools over shiny new ones |
+| **Fail Fast** | Validate early, error loudly |
+---
+## Extending the Workflow
+### Add New Agent
+1. Create `.buildwright/agents/[role].md`
+2. Define mindset, checklist, output format
+3. Reference in commands with `Adopt persona from...`
+### Add New Command
+1. Create `.buildwright/commands/[name].md`
+2. Define arguments, phases, output format
+3. Reference agents as needed
+### Planned Agents (Future)
+| Agent | Purpose |
+|-------|---------|
+| QA Engineer | Test coverage, edge cases |
+| Performance Engineer | Bottleneck identification |
+| DevOps Engineer | Infrastructure review |
+| Database Engineer | Schema review, query optimization |
+| UX Engineer | API design review |
+---
+## Customization
+| File | Purpose | Edit Frequency |
+|------|---------|----------------|
+| `.buildwright/steering/product.md` | Product context | Per project |
+| `.buildwright/steering/tech.md` | Tech stack & commands | Per project |
+| `.buildwright/agents/*.md` | Agent personas | Rarely |
+| `.buildwright/commands/*.md` | Slash commands | Rarely |
+| `CLAUDE.md` | Learned patterns | As discovered |
+---
+## FAQ
+### Do I need to review code?
+No. `/bw-ship` handles security review and code review automatically using Staff Engineer and Security Engineer personas.
-## Offline Support
+### What if a step fails?
+- **Verify fails**: Retries up to 2x automatically.
+- **Security/Review fails**: No retry — requires judgment.
+- **Autonomous mode** (`BUILDWRIGHT_AUTO_APPROVE=true`): Commits completed work, pushes, creates PR with failure details, exits with error code.
+- **Interactive mode** (`BUILDWRIGHT_AUTO_APPROVE=false`): STOP immediately — human fixes in-session.
-All templates are bundled in the npm package — `buildwright init` works without internet access after installation. Use `buildwright update` to pull the latest templates from GitHub.
+### Can I skip security review?
+No. Both `/bw-ship` and `/bw-quick` include mandatory security and code review steps. Use `/bw-verify` for quick checks during active development, before committing.
-## More Information
+### How do I add project-specific rules?
+Add to `CLAUDE.md` under "Learned Patterns" or create a new agent.
-See the [full documentation](https://github.com/raunakkathuria/buildwright) on GitHub.
+---
 ## License
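
Note on the README's FAQ entry "What if a step fails?": the retry rule it describes can be sketched as below. This is only an illustration, not code from the package. The names `MAX_RETRIES` and `runGate` are hypothetical, and the sketch assumes "retries up to 2x" means two retries after the first attempt.

```javascript
// Hypothetical retry budget per gate: verify may be retried, while
// security and review stop on the first failure because they need judgment.
const MAX_RETRIES = { verify: 2, security: 0, review: 0 };

// Runs attemptFn until it returns true or the retry budget is spent.
// attemptFn receives the 1-based attempt number.
function runGate(name, attemptFn) {
  const retries = MAX_RETRIES[name] ?? 0;
  for (let attempt = 1; attempt <= retries + 1; attempt++) {
    if (attemptFn(attempt)) {
      return { gate: name, passed: true, attempts: attempt };
    }
  }
  return { gate: name, passed: false, attempts: retries + 1 };
}

// A verify step that only succeeds on its third attempt still passes.
const flakyVerify = runGate('verify', (attempt) => attempt === 3);
// A failing security step is reported after a single attempt.
const failedSecurity = runGate('security', () => false);

console.log(flakyVerify, failedSecurity);
```

What happens after a failed gate (commit, push, and PR in autonomous mode, or an immediate stop in interactive mode) is left out of the sketch.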

package/bin/buildwright.js CHANGED

@@ -5,6 +5,7 @@ const { Command } = require('commander');
 const { init } = require('../src/commands/init');
 const { update } = require('../src/commands/update');
 const { sync } = require('../src/commands/sync');
+const { commands } = require('../src/commands/commands');
 const pkg = require('../package.json');
@@ -36,4 +37,11 @@ program
     sync();
   });
+program
+  .command('commands')
+  .description('List available agent slash commands (/bw-*)')
+  .action(() => {
+    commands();
+  });
 program.parse(process.argv);

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "buildwright",
-  "version": "0.0.8",
+  "version": "0.0.10",
   "description": "Agent-first autonomous development workflow. Ship code you don't read.",
   "license": "MIT",
   "engines": {

package/src/commands/commands.js ADDED

@@ -0,0 +1,81 @@
+'use strict';
+const fs = require('fs');
+const path = require('path');
+const CYAN = '\x1b[36m';
+const BOLD = '\x1b[1m';
+const RESET = '\x1b[0m';
+const DIM = '\x1b[2m';
+const HARDCODED = [
+  { name: 'bw-analyse',     description: 'Analyse codebase: writes stack, architecture, conventions, concerns to .buildwright/codebase/' },
+  { name: 'bw-claw',        description: 'Multi-agent: architect decomposes → claws execute per domain' },
+  { name: 'bw-new-feature', description: 'Full pipeline: research → spec → approve → build → ship' },
+  { name: 'bw-plan',        description: 'Research a question, produce a written deliverable — no implementation, no commits' },
+  { name: 'bw-quick',       description: 'Fast path for bug fixes, small tasks, config changes' },
+  { name: 'bw-ship',        description: 'Quality gates + release: verify → security → review → push → PR' },
+  { name: 'bw-verify',      description: 'Quick checks: typecheck, lint, test, build' },
+];
+function parseFrontmatter(content) {
+  const parts = content.split('---');
+  if (parts.length < 3) return null;
+  const front = parts[1];
+  const name = front.match(/^name:\s*(.+)$/m)?.[1]?.trim();
+  const description = front.match(/^description:\s*(.+)$/m)?.[1]?.trim();
+  return name && description ? { name, description } : null;
+}
+function loadFromDir(dir) {
+  const files = fs.readdirSync(dir).filter(f => f.startsWith('bw-') && f.endsWith('.md') && f !== 'bw-help.md');
+  const entries = [];
+  for (const file of files) {
+    const content = fs.readFileSync(path.join(dir, file), 'utf8');
+    const parsed = parseFrontmatter(content);
+    if (parsed) entries.push(parsed);
+  }
+  return entries.sort((a, b) => a.name.localeCompare(b.name));
+}
+function formatList(entries) {
+  const maxLen = Math.max(...entries.map(e => e.name.length));
+  return entries.map(({ name, description }) => {
+    const padded = `/${name}`.padEnd(maxLen + 2);
+    return `  ${BOLD}${padded}${RESET}  ${DIM}${description}${RESET}`;
+  }).join('\n');
+}
+function commands() {
+  const cwd = process.cwd();
+  const commandsDir = path.join(cwd, '.buildwright', 'commands');
+  let entries;
+  let source;
+  if (fs.existsSync(commandsDir)) {
+    entries = loadFromDir(commandsDir);
+    source = 'project';
+  } else {
+    entries = HARDCODED;
+    source = 'default';
+  }
+  console.log('');
+  console.log(`${CYAN}${BOLD}╔═══════════════════════════════════════════════════════════════╗${RESET}`);
+  console.log(`${CYAN}${BOLD}║              AGENT SLASH COMMANDS                             ║${RESET}`);
+  console.log(`${CYAN}${BOLD}╚═══════════════════════════════════════════════════════════════╝${RESET}`);
+  console.log('');
+  console.log('Use these inside Claude Code, Cursor, or OpenCode:');
+  console.log('');
+  console.log(formatList(entries));
+  console.log('');
+  if (source === 'default') {
+    console.log(`${DIM}(Showing default commands. Run inside a Buildwright project to see project-specific commands.)${RESET}`);
+    console.log('');
+  }
+  console.log(`Run ${BOLD}buildwright help${RESET} for CLI setup commands.`);
+  console.log('');
+}
+module.exports = { commands };
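
To show how the listing is built from command files, the sketch below repeats the `parseFrontmatter` logic from this file and runs it on a shortened front matter modeled on `bw-plan.md`. The `sample` string and the variable names `parsed` and `missing` are illustrative assumptions, not part of the package.

```javascript
// Same logic as parseFrontmatter in the diff: take the text between the
// first two '---' markers and extract the top-level name and description.
function parseFrontmatter(content) {
  const parts = content.split('---');
  if (parts.length < 3) return null;
  const front = parts[1];
  const name = front.match(/^name:\s*(.+)$/m)?.[1]?.trim();
  const description = front.match(/^description:\s*(.+)$/m)?.[1]?.trim();
  return name && description ? { name, description } : null;
}

// Hypothetical sample: abbreviated front matter in the style of bw-plan.md.
const sample = [
  '---',
  'name: bw-plan',
  'description: Research a question and produce a written deliverable',
  '---',
  '# /bw-plan',
].join('\n');

const parsed = parseFrontmatter(sample);
// A file without front matter yields null and is skipped by loadFromDir.
const missing = parseFrontmatter('# No front matter here');

console.log(parsed, missing);
```

Because the regular expressions anchor `name:` and `description:` at the start of a line, the indented `name` and `description` keys under `arguments` in a command file do not match.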

package/templates/.buildwright/commands/bw-help.md CHANGED

@@ -18,6 +18,7 @@ WORKFLOW
 /bw-ship                 Quality gates + release: verify → security → review → push → PR
 /bw-verify               Quick checks: typecheck, lint, test, build
 /bw-analyse              Analyse codebase: writes stack, architecture, conventions, concerns to .buildwright/codebase/
+/bw-plan <question>      Research a question, produce a written deliverable — no implementation, no commits
 /bw-help                 Show this command list
 ╔═══════════════════════════════════════════════════════════════╗
@@ -49,6 +50,7 @@ SHIP EXISTING WORK:
 ╚═══════════════════════════════════════════════════════════════╝
   Brownfield, need codebase context  → /bw-analyse
+  Question/analysis, no coding      → /bw-plan
   Single domain, needs planning     → /bw-new-feature
   Crosses multiple domains/layers   → /bw-claw
   Small task, clear scope           → /bw-quick
@@ -74,7 +76,8 @@ SHIP EXISTING WORK:
 ║                         TIPS                                  ║
 ╚═══════════════════════════════════════════════════════════════╝
-• Use /bw-new-feature for anything that needs planning
+• Use /bw-plan for research, analysis, or planning without implementation
+• Use /bw-new-feature for anything that needs planning + implementation
 • Use /bw-claw when feature crosses domain boundaries
 • Use /bw-quick for clear, bounded tasks
 • Verify retries 2x automatically; security/review stops immediately

package/templates/.buildwright/commands/bw-new-feature.md CHANGED

@@ -110,6 +110,29 @@ Extract:
 - Existing patterns to follow
 - Commands to use
+### 2.1b Read Codebase Analysis Docs (if present)
+If `.buildwright/codebase/` exists (generated by `/bw-analyse`), read all docs there:
+```bash
+ls .buildwright/codebase/ 2>/dev/null && cat .buildwright/codebase/*.md
+```
+These docs contain pre-mapped context:
+- `STACK.md` — confirmed tech stack and dependencies
+- `ARCHITECTURE.md` — layer structure, entry points, data flow
+- `CONVENTIONS.md` — naming patterns, import style, test framework
+- `CONCERNS.md` — known issues and gaps to avoid repeating
+**If codebase docs exist:** they replace most of the general scanning in 2.2. Focus your
+deep-read (2.2) on the specific files relevant to this feature only. Follow the
+conventions and architecture exactly — do not invent new patterns.
+**If not present:** proceed with 2.2 as written. Consider running `/bw-analyse` first on
+brownfield projects for faster, more accurate results.
+---
 ### 2.2 Deep-Read Relevant Codebase
 Based on the requirements, identify and deeply read relevant areas:

package/templates/.buildwright/commands/bw-plan.md ADDED

@@ -0,0 +1,210 @@
+---
+name: bw-plan
+description: Research a question or topic and produce a written deliverable — no implementation, no commits
+arguments:
+  - name: question
+    description: A question, topic, or path to a structured task file (.md)
+    required: true
+---
+# /bw-plan — Research and Planning Without Implementation
+## When to use
+Research and planning tasks where the output is a written deliverable — not
+implementation. Use when someone asks a question, wants an analysis, or needs
+a structured plan/report before (or instead of) writing code.
+Examples:
+- "What are the performance risks in this Flutter app?"
+- "Plan a migration from monolith to microservices"
+- "Evaluate which payment provider we should use"
+- "Produce a static analysis report for this codebase"
+**Contrast with other commands:**
+- `/bw-new-feature` — research + spec + implement + ship
+- `/bw-quick` — implement immediately (no planning doc)
+- `/bw-analyse` — analyse this project's own codebase for Buildwright context
+- `/bw-plan` — research a question, write a deliverable, stop there
+## Invocation
+```
+/bw-plan [question or topic | path/to/task.md]
+```
+**Inline question:**
+```
+/bw-plan "what are the performance risks in this Flutter app?"
+/bw-plan "plan a migration from REST to GraphQL"
+/bw-plan "compare Redis vs Memcached for our session cache"
+```
+**Structured task file:**
+```
+/bw-plan tasks/flutter-perf-review.md
+/bw-plan .buildwright/tasks/architecture-review.md
+```
+---
+## Phase 1 — Understand
+If a **task file** is provided, read it. Extract:
+- `Inputs` block — variable names, descriptions, defaults
+- `Rules` block — constraints on evidence, scope, output format
+- `Research Areas` block — what categories to investigate
+- `Outputs` block — file names and content requirements
+Substitute any `<placeholder>` tokens from invocation args or sensible defaults
+(e.g. `<date>` → today's date, `<repo_path>` → current working directory).
+If **inline text** is provided, infer:
+- The question or goal
+- Likely research areas from the question
+- Default output location: `docs/plans/<kebab-slug>/<YYYY-MM-DD>/plan.md`
+---
+## Phase 2 — Clarify
+In **interactive mode** (`BUILDWRIGHT_AUTO_APPROVE=false`): if a critical input
+is ambiguous (e.g. target repository path is unknown), ask one focused question
+before proceeding.
+In **autonomous mode** (`BUILDWRIGHT_AUTO_APPROVE=true`, default): apply
+sensible defaults and proceed. Note any assumptions in the deliverable.
+---
+## Phase 3 — Research
+**First, check for pre-analysed codebase docs:**
+If `.buildwright/codebase/` exists (generated by `/bw-analyse`), read it before scanning files:
+- `STACK.md` — confirmed tech stack for context
+- `ARCHITECTURE.md` — layer structure to reference in findings
+- `CONVENTIONS.md` — patterns to cite as "current standard" in recommendations
+- `CONCERNS.md` — known issues; findings that overlap get higher severity/priority
+These docs accelerate research and improve accuracy of recommendations.
+Read relevant code, config, and documentation. Run read-only tools listed in
+the task (e.g. `flutter analyze`, `dart analyze`, `npm audit`, `cargo audit`,
+`semgrep --config auto`). Capture all stdout/stderr as labeled evidence.
+**Evidence labels (use exactly these):**
+- `inferred from code` — finding comes from reading source files
+- `backed by tool output` — finding is confirmed by a tool's output
+- `backed by docs/external source` — finding references documentation
+**Do NOT:**
+- Modify any source file
+- Run commands that write to the target repository
+- Run build steps unless the task explicitly requires them
+If a tool is unavailable, note it as a blocker in the summary; skip and continue.
+---
+## Phase 4 — Synthesize
+Organize findings into structured sections per the task's Research Areas (or
+inferred categories for inline questions).
+**For analysis/audit tasks**, each finding includes:
+- `title` — short descriptive name
+- `category` — from the task's research areas
+- `severity` — critical / high / medium / low
+- `confidence` — high / medium / low
+- `evidence_type` — one of the three labels above
+- `evidence` — file path + line number, or tool command + output excerpt
+- `why it matters` — impact on users or system
+- `recommended action` — concrete fix or next step
+- `estimated effort` — hours/days rough estimate
+- `needs_runtime_verification` — true/false
+**For planning/decision tasks**, structure around:
+- Options considered
+- Recommendation
+- Rationale
+- Risks and mitigations
+- Next steps
+---
+## Phase 5 — Write Deliverable
+Write artifact files to `output_dir`. Must write at minimum:
+- `plan.md` — or the task-specified primary document
+May also write supporting files if the task specifies them:
+- `.csv` — comma-separated with a header row
+- `.json` — valid JSON, schema as specified in task
+- Additional `.md` files (e.g. `top10.md`, `backlog.md`)
+Create `output_dir` if it does not exist.
+---
+## Phase 6 — Summarize
+Print to stdout:
+- Output directory path and files written
+- Top 3–5 findings or key recommendations
+- Any blockers that prevented complete research (tools unavailable, files inaccessible)
+---
+## Hard Constraints (always enforced)
+- **NEVER** modify source files in any target repository
+- **NEVER** commit, push, or create PRs
+- **NEVER** claim something is "measured" or "confirmed" without direct evidence
+- Every finding must cite evidence (file + line, or tool output excerpt)
+- The task's `Rules` block can add constraints; it cannot remove these hard constraints
+---
+## Task File Format
+Structured task files follow this template:
+```markdown
+## Inputs
+- `<variable_name>`: description [default: value]
+- `<repo_path>`: path to the repository to analyse [default: current directory]
+- `<output_dir>`: where to write artifacts [default: docs/plans/<slug>/<date>/]
+## Rules
+1. This is a read-only pass. Do not modify any source files.
+2. Separate findings by evidence type: inferred / tool-backed / doc-backed.
+3. Every finding must include a file path and line reference or tool output.
+## Research Areas
+1. Category A — what to look for
+2. Category B — what to look for
+3. ...
+## Outputs
+- `plan.md` — primary report with executive summary and all findings
+- `top10.md` — top 10 prioritized findings
+- `backlog.csv` — all findings as CSV with columns: title,category,severity,confidence,effort
+- `summary.json` — machine-readable summary: { findings_count, top_findings[], blockers[] }
+```
+---
+## Example
+The Flutter performance review task maps directly to this format:
+```
+/bw-plan tasks/flutter-perf-review.md
+```
+Where `tasks/flutter-perf-review.md` defines:
+- Inputs: repo_path, main package, entrypoints, target platforms, output_dir
+- Rules: static analysis only, evidence required, no source modification
+- Research Areas: rebuild risks, list/grid risks, rendering, CPU, data/arch, images, deps, code health
+- Outputs: performance-static-review.md, performance-static-top10.md, performance-static-backlog.csv, performance-static-summary.json
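
Phase 1 of this command gives the default output location for inline questions as `docs/plans/<kebab-slug>/<YYYY-MM-DD>/plan.md`. A minimal sketch of how such a path could be derived follows. The helpers `kebabSlug` and `defaultPlanPath` are hypothetical, and the slug rule (lowercase, runs of non-alphanumeric characters replaced by a hyphen) is an assumption, since the template does not define it.

```javascript
// Hypothetical slug rule: the template only names the <kebab-slug>
// placeholder and does not say how it is computed.
function kebabSlug(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse other characters into one hyphen
    .replace(/^-+|-+$/g, '');    // trim leading and trailing hyphens
}

// Builds docs/plans/<kebab-slug>/<YYYY-MM-DD>/plan.md for a question.
function defaultPlanPath(question, date) {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  return `docs/plans/${kebabSlug(question)}/${day}/plan.md`;
}

// A fixed date keeps the example deterministic.
const examplePath = defaultPlanPath(
  'compare Redis vs Memcached for our session cache',
  new Date(Date.UTC(2025, 0, 15))
);

console.log(examplePath);
```

An agent following the template would use today's date where this example passes a fixed one.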

package/templates/.buildwright/commands/bw-quick.md CHANGED

@@ -71,6 +71,17 @@ Continue with /bw-quick anyway? (say "continue" or use /bw-new-feature)
 **Lightweight research - only what's needed for this task.**
+**First, check for pre-analysed codebase docs:**
+```bash
+ls .buildwright/codebase/ 2>/dev/null
+```
+If present, read `CONVENTIONS.md` and `ARCHITECTURE.md` — they give you naming patterns
+and layer boundaries without scanning the whole codebase. Check `CONCERNS.md` to avoid
+introducing more of the same issues. Then narrow your search to the specific files for
+this task.
 ```bash
 # Find directly relevant files
 grep -r "[relevant terms]" --include="*.ts" --include="*.tsx" -l .

package/templates/CLAUDE.md CHANGED

@@ -25,7 +25,7 @@ are codebase analysis docs (stack, architecture, conventions, concerns) generate
   agents/               ← Architect, Staff Engineer, Security Engineer
   claws/                ← Frontend, Backend, Database, TEMPLATE
   codebase/             ← Generated by /bw-analyse (stack, architecture, conventions, concerns)
-  commands/             ← bw-new-feature, bw-claw, bw-quick, bw-ship, bw-verify, bw-help
+  commands/             ← bw-new-feature, bw-claw, bw-quick, bw-ship, bw-verify, bw-plan, bw-help
   steering/             ← product.md, tech.md, quality-gates.md, naming-conventions.md
   tasks/
@@ -55,6 +55,7 @@ After cloning or editing `.buildwright/`, run `make sync` to regenerate tool-spe
 6. **Quick quality check**: /bw-verify → typecheck, lint, test, build
 7. **Show commands**: /bw-help
 8. **Analyse existing codebase**: /bw-analyse → reads codebase → writes structured docs to .buildwright/codebase/ → updates tech.md. Run first on any brownfield project.
+9. **Research/planning without implementation**: /bw-plan → Understand question → Research (read-only) → Synthesize → Write deliverable. No code changes, no commits.
 ## Command Discovery
@@ -97,6 +98,7 @@ If ANY required step fails: fix and retry (max 2 attempts). If same error repeat
 - Only `.buildwright/` is committed — never commit `.claude/` or `.opencode/` content files
 - After editing any file in `.buildwright/`, run `make sync` before committing
 - Before committing, update README.md, SKILL.md, docs/, or CHANGELOG.md if the change affects user-facing behavior
+- After editing any `.buildwright/commands/` file, run `make sync` — it automatically checks that every command is documented in README.md and SKILL.md and warns about gaps
 ## Cross-Domain Features (Claw Architecture)
 When a feature touches multiple domains (e.g., DB + API + UI):

package/templates/Makefile CHANGED

@@ -1,4 +1,4 @@
-.PHONY: dist clean sync sync-check cursor opencode openclaw validate install-hooks uninstall-hooks bump release test-cli
+.PHONY: dist clean sync sync-check cursor opencode openclaw codex validate install-hooks uninstall-hooks bump release test-cli
 # ============================================================================
 # Sync — Generate .claude/, .opencode/, .cursor/rules/ from .buildwright/ (canonical)
@@ -39,6 +39,13 @@ openclaw: sync
 	@cp SKILL.md ~/.openclaw/skills/buildwright/SKILL.md
 	@echo "Installed to ~/.openclaw/skills/buildwright/"
+# Codex CLI — install skills for native skill discovery
+codex: sync
+	@mkdir -p ~/.agents/skills
+	@ln -sfn $(PWD)/skills ~/.agents/skills/buildwright
+	@echo "Installed to ~/.agents/skills/buildwright"
+	@echo "Restart Codex to discover Buildwright skills."
 # ============================================================================
 # Validate SKILL.md against Agent Skills spec (agentskills.io)
 # ============================================================================

package/templates/scripts/sync-agents.sh CHANGED

@@ -261,7 +261,21 @@ sync_cursor_dir ".buildwright/agents"    "agents"    "agent"
 sync_cursor_dir ".buildwright/claws"     "claws"     "claw"
 # ============================================================================
-# 5. Package for ClawHub (dist/)
+# 5. .buildwright/commands/ → skills/ (Codex CLI skill discovery)
+# ============================================================================
+if [ "$CHECK_ONLY" = false ]; then
+  for file in .buildwright/commands/bw-*.md; do
+    [ -f "$file" ] || continue
+    name=$(basename "$file" .md)
+    mkdir -p "skills/$name"
+    cp "$file" "skills/$name/SKILL.md"
+    echo "  synced $file → skills/$name/SKILL.md"
+  done
+fi
+# ============================================================================
+# 6. Package for ClawHub (dist/)
 # ============================================================================
 if [ "$CHECK_ONLY" = false ] && [ -f "SKILL.md" ]; then
@@ -270,6 +284,15 @@ if [ "$CHECK_ONLY" = false ] && [ -f "SKILL.md" ]; then
   echo "  packaged dist/buildwright/SKILL.md for ClawHub"
 fi
+# ============================================================================
+# 7. README.md → cli/README.md (single source of truth for npm package page)
+# ============================================================================
+if [ "$CHECK_ONLY" = false ] && [ -f "README.md" ]; then
+  cp README.md cli/README.md
+  echo "  README.md → cli/README.md"
+fi
 # ============================================================================
 # Result
 # ============================================================================
@@ -291,4 +314,12 @@ else
   echo "  .buildwright/ → .cursor/rules/   (.mdc with frontmatter)"
   echo "  CLAUDE.md     → AGENTS.md"
   echo "  SKILL.md      → dist/buildwright/SKILL.md"
+  echo "  .buildwright/commands/ → skills/          (Codex CLI skill discovery)"
+  echo "  README.md     → cli/README.md             (npm package page)"
+  # Validate all commands are documented in SKILL.md and README.md
+  if [ -f "scripts/validate-docs.sh" ]; then
+    echo ""
+    bash scripts/validate-docs.sh || true
+  fi
 fi
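The call site `bash scripts/validate-docs.sh || true` means a documentation gap is printed but never fails the sync, which matches the "warns about gaps" wording in `CLAUDE.md`. The sketch below illustrates the difference between that pattern and a stricter one; all three function names are hypothetical, and `fake_validate` merely imitates a validator that exits with status 1.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for validate-docs.sh: prints one gap and fails.
fake_validate() {
  echo "validate-docs: /bw-plan missing from README.md"
  return 1
}

# Mirrors the `|| true` call site: the message is shown, the status is dropped.
run_sync_tail() {
  fake_validate || true
  echo "sync finished"
}

# Stricter variant (not in the diff): the failure propagates to the caller.
run_sync_tail_strict() {
  fake_validate || return 1
  echo "sync finished"
}
```

A CI job that should block on undocumented commands would need the strict form, or could run `scripts/validate-docs.sh` directly and check its exit code.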

package/templates/scripts/validate-docs.sh ADDED

@@ -0,0 +1,67 @@
+#!/usr/bin/env bash
+# validate-docs.sh — checks that every bw-* command is documented in SKILL.md and README.md
+# Run automatically by sync-agents.sh after each sync.
+# Exit code 1 if any commands are missing from documentation.
+set -euo pipefail
+COMMANDS_DIR=".buildwright/commands"
+SKILL_MD="SKILL.md"
+README_MD="README.md"
+RED='\033[0;31m'
+YELLOW='\033[1;33m'
+GREEN='\033[0;32m'
+BOLD='\033[1m'
+RESET='\033[0m'
+if [ ! -d "$COMMANDS_DIR" ]; then
+  echo "  validate-docs: $COMMANDS_DIR not found, skipping"
+  exit 0
+fi
+missing=0
+for file in "$COMMANDS_DIR"/bw-*.md; do
+  [ -f "$file" ] || continue
+  # Skip bw-help.md — it IS the help output, doesn't need its own docs section
+  basename=$(basename "$file")
+  [ "$basename" = "bw-help.md" ] && continue
+  # Extract name from YAML frontmatter
+  name=$(awk '/^---/{f=!f;next} f && /^name:/{print $2;exit}' "$file" 2>/dev/null | tr -d '\r')
+  if [ -z "$name" ]; then
+    echo -e "  ${YELLOW}validate-docs: $basename has no 'name' in frontmatter — skipping${RESET}"
+    continue
+  fi
+  cmd="/$name"
+  file_missing=0
+  # Check SKILL.md
+  if [ -f "$SKILL_MD" ] && ! grep -qi "### $cmd" "$SKILL_MD" 2>/dev/null; then
+    echo -e "  ${RED}${BOLD}validate-docs: $cmd missing from SKILL.md${RESET}"
+    file_missing=1
+    missing=$((missing + 1))
+  fi
+  # Check README.md
+  if [ -f "$README_MD" ] && ! grep -q "$cmd" "$README_MD" 2>/dev/null; then
+    echo -e "  ${RED}${BOLD}validate-docs: $cmd missing from README.md${RESET}"
+    file_missing=1
+    missing=$((missing + 1))
+  fi
+  if [ "$file_missing" -eq 0 ]; then
+    echo -e "  ${GREEN}validate-docs: $cmd ✓${RESET}"
+  fi
+done
+if [ "$missing" -gt 0 ]; then
+  echo ""
+  echo -e "  ${RED}${BOLD}validate-docs: $missing documentation gap(s) found.${RESET}"
+  echo -e "  Update README.md and SKILL.md, then re-run 'make sync'."
+  exit 1
+fi
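The frontmatter lookup in `validate-docs.sh` is a single awk program: each line starting with `---` toggles a flag, and the first `name:` line seen while the flag is set supplies the command name. A small sketch that isolates that step follows; `extract_command_name` is a hypothetical wrapper around the same awk expression.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the awk expression used in validate-docs.sh.
# Prints the value of `name:` from the YAML frontmatter, or nothing at all.
extract_command_name() {
  # f flips on every line that starts with ---; only lines seen while f is
  # set are considered. tr strips carriage returns from CRLF files.
  awk '/^---/{f=!f;next} f && /^name:/{print $2;exit}' "$1" | tr -d '\r'
}
```

Because only `$2` is printed, a quoted value such as `name: "bw-plan"` would keep its quotes, and a `name:` line outside the frontmatter is ignored.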