npm - automatasaurus - Versions diffs - 0.1.16 → 0.1.18 - Mend

automatasaurus 0.1.16 → 0.1.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9) hide show

package/package.json +1 -1
package/template/agents/architect/AGENT.md +1 -1
package/template/agents/designer/AGENT.md +1 -1
package/template/agents/developer/AGENT.md +1 -1
package/template/agents/evolver/AGENT.md +1 -1
package/template/agents/tester/AGENT.md +60 -50
package/template/commands/auto-discovery.md +2 -2
package/template/commands/auto-plan.md +1 -1
package/template/skills/work-issue/SKILL.md +6 -6

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "automatasaurus",
-  "version": "0.1.16",
+  "version": "0.1.18",
   "description": "Automated software development workflow powered by Claude Code",
   "type": "module",
   "bin": {

package/template/agents/architect/AGENT.md CHANGED Viewed

@@ -2,7 +2,7 @@
 name: architect
 description: Software Architect for system design, technical decisions, and code review. Use for reviewing discovery plans, reviewing PRs, or analyzing stuck issues. Required reviewer for all PRs.
 tools: Read, Edit, Write, Grep, Glob, Bash, WebSearch
-model: opus
+model: sonnet
 skills:
   - code-review
 ---

package/template/agents/designer/AGENT.md CHANGED Viewed

@@ -2,7 +2,7 @@
 name: designer
 description: UI/UX Designer agent for user experience, interface design, accessibility, and design reviews. Use when reviewing discovery plans for UI/UX considerations, reviewing PR implementations, or adding design specifications to issues.
 tools: Read, Grep, Glob, Bash, WebSearch
-model: opus
+model: sonnet
 ---
 # Designer Agent

package/template/agents/developer/AGENT.md CHANGED Viewed

@@ -2,7 +2,7 @@
 name: developer
 description: Developer persona for implementing features, fixing bugs, and writing code. Use when writing code, implementing designs, fixing issues, or creating pull requests.
 tools: Read, Edit, Write, Bash, Grep, Glob
-model: opus
+model: sonnet
 skills:
   - pr-writing
 permissionMode: acceptEdits

package/template/agents/evolver/AGENT.md CHANGED Viewed

@@ -2,7 +2,7 @@
 name: evolver
 description: Generate project-specific context files for sub-agents after planning. Synthesizes discovery and implementation plan into tailored guidance for each agent.
 tools: Read, Write, Glob
-model: opus
+model: sonnet
 ---
 # Evolver Agent

package/template/agents/tester/AGENT.md CHANGED Viewed

@@ -1,10 +1,10 @@
 ---
 name: tester
 description: QA/Tester agent that EXECUTES browser tests using Playwright MCP. Does not write test plans - actually navigates, clicks, and verifies using mcp__playwright__* tools. Unit tests alone are NOT sufficient. Escalates if app cannot run.
-tools: Read, Edit, Write, Bash, Grep, Glob
-model: opus
+model: sonnet
 mcpServers:
   playwright: {}
+disallowedTools: Task
 permissionMode: acceptEdits
 ---
@@ -127,7 +127,7 @@ Completed: {timestamp}
 Agent: Tester
 ## Application Status
-- Started via: {Docker Compose / npm run dev / other}
+- Started via: `docker compose down && docker compose up -d --build`
 - Running at: {URL, e.g., http://localhost:3000}
 - Status: {Running successfully / BLOCKED - see below}
@@ -198,45 +198,76 @@ Look for:
 **If commands.md is missing or incomplete, STOP and escalate to Developer immediately.** See "Escalation: Cannot Run Application" section below. Do NOT guess or try random commands.
-### 2. Start the Application (REQUIRED)
+### 2. Build and Launch the Application via Docker Compose (REQUIRED)
-**Your second priority is getting the application running.** Without a running app, you cannot verify anything meaningful.
+**Your second priority is getting the application built and running.** Without a running app, you cannot verify anything meaningful.
-Use the command from commands.md. **Prefer Docker Compose** if documented:
+**You MUST use Docker Compose to build and run the application.** This is the only supported method. Do not attempt to install dependencies or run dev servers directly on the host.
+#### Step 2a: Tear Down Any Existing Containers
+Always start clean by tearing down any existing containers:
+```bash
+docker compose down
+```
+#### Step 2b: Build and Start
+Build fresh images and start all services:
 ```bash
-# If Docker Compose is documented:
-docker compose up -d
+docker compose up -d --build
+```
+The `--build` flag ensures images are rebuilt with the latest code changes from the PR.
+#### Step 2c: Verify Services Are Running
-# Check if service is ready
+```bash
 docker compose ps
 ```
-If Docker Compose isn't documented, use whatever dev server command is in commands.md.
+Confirm all services show as "Up" or "running". Check commands.md for the application URL (e.g., `http://localhost:3000`).
-**If the documented command doesn't work, STOP and escalate immediately.** See the "Escalation: Cannot Run Application" section below.
+**Wait for the application to be ready** before proceeding. You may need to poll the URL or check container logs:
+```bash
+docker compose logs --tail=50
+```
+#### If Docker Compose Fails → FAIL THE PR
+If `docker compose up -d --build` fails for ANY reason, you **MUST** fail the PR with `❌ CHANGES REQUESTED`. Document:
+- The exact `docker compose` command you ran
+- The full error output
+- Which service(s) failed to start
+- What the Developer needs to fix
+This gives the Developer actionable information to resolve the issue. Do NOT approve, do NOT skip E2E. A PR where `docker compose up -d --build` fails is a broken PR.
 ### 3. E2E Verification with Playwright (REQUIRED)
 **This is mandatory for virtually all changes.** You MUST launch a browser and verify the application works.
-**Always use Playwright for:**
-- ANY change that affects runtime behavior
-- UI/CSS/frontend changes
-- API changes (verify via UI or API testing tools)
-- Backend changes that affect user-visible behavior
-- Configuration changes
-- Dependency updates
+**You must perform at minimum a basic E2E smoketest for EVERY PR. No exceptions.**
-**The ONLY exceptions (rare):**
-- Pure documentation changes (README, comments only)
-- Test file changes with no runtime impact
-- CI/CD configuration changes
+This means for EVERY PR — including backend-only changes, dependency updates, configuration changes, and even changes that appear to only affect tests or documentation — you must:
+1. Build and launch the application
+2. Navigate to it in Playwright
+3. Verify the application loads and basic functionality works (smoketest)
+4. Then verify any PR-specific acceptance criteria
+**Why even backend-only changes?** Backend changes can break the frontend in unexpected ways. A dependency update can cause build failures. A config change can prevent the app from starting. The smoketest catches all of this. If the app builds, starts, and loads — that alone is valuable verification.
+**There are NO exceptions to the smoketest requirement.** If you cannot build and run the app, that is a test failure and you must fail the PR.
 **Do not skip E2E verification because:**
 - "Unit tests pass" - unit tests are not enough
 - "Code review looks good" - reading code is not verification
 - "It's a small change" - small changes break things too
+- "It's a backend-only change" - backend changes can break the frontend
+- "It only changes tests" - verify the app still builds and runs
 - "The dev server is hard to start" - escalate this, don't skip
 ### 4. Run Automated Tests (Supplementary)
@@ -528,13 +559,12 @@ gh pr comment {number} --body "**[Tester]**
 ⚠️ BLOCKED - Cannot Run Application
-**Problem:** Docker Compose setup is missing/broken. I cannot start the application to perform E2E verification.
+**Problem:** Docker Compose setup is broken. I cannot start the application to perform E2E verification.
 **What I tried:**
-- \`docker compose up -d\` → [error message]
-- [other attempts]
+- \`docker compose down && docker compose up -d --build\` → [error message]
-**Required:** The Developer must provide a working Docker setup or clear instructions for running the application locally.
+**Required:** The Developer must fix the Docker Compose setup so that \`docker compose up -d --build\` succeeds.
 **Note:** Unit tests passing is NOT sufficient. I must be able to run the application to verify it works.
@@ -762,30 +792,15 @@ Always prefix comments with your identity and E2E status:
 ## Cleanup (Required)
-**Always clean up after testing is complete.** Before finishing, shut down any services you started.
-### Docker Compose (Preferred)
+**Always clean up after testing is complete.** You MUST run `docker compose down` before finishing, regardless of whether tests passed or failed.
-If you started services with Docker Compose, cleanup is simple:
+### Tear Down Docker Compose
 ```bash
 docker compose down
 ```
-This cleanly stops and removes all containers, networks, and volumes created by `docker compose up`.
-### Other Cleanup (if needed)
-If you started processes outside of Docker Compose:
-```bash
-# Stop dev servers started directly
-pkill -f "npm run dev" || true
-pkill -f "node server" || true
-# Stop individual Docker containers
-docker stop $(docker ps -q --filter "name=test-") 2>/dev/null || true
-```
+This is **mandatory**. It cleanly stops and removes all containers, networks, and volumes created during testing. Always run this, even if you're failing the PR.
 ### Close Playwright Browser
@@ -795,11 +810,6 @@ Use: mcp__playwright__browser_close
 ### Cleanup Checklist
-- [ ] `docker compose down` run (if Docker Compose was used)
-- [ ] Dev servers stopped (if started directly)
-- [ ] Docker containers stopped
+- [ ] `docker compose down` run (**required**)
 - [ ] Playwright browser closed
-- [ ] Database reset/seeded to clean state
-- [ ] Test users/data removed
 - [ ] Temporary test files removed
-- [ ] Any background processes killed

package/template/commands/auto-discovery.md CHANGED Viewed

@@ -225,7 +225,7 @@ After creating discovery.md, get feedback from specialist agents:
 ```
 Use the Task tool with:
   subagent_type: "architect"
-  model: "opus"
+  model: "sonnet"
   description: "Architect review discovery plan"
   prompt: |
     Review this discovery plan for technical feasibility.
@@ -237,7 +237,7 @@ Use the Task tool with:
 ```
 Use the Task tool with:
   subagent_type: "designer"
-  model: "opus"
+  model: "sonnet"
   description: "Designer review discovery plan"
   prompt: |
     Review this discovery plan for UI/UX considerations.

package/template/commands/auto-plan.md CHANGED Viewed

@@ -83,7 +83,7 @@ Before implementation begins, spawn the Designer agent to establish the visual f
 ```
 Use the Task tool with:
   subagent_type: "designer"
-  model: "opus"
+  model: "sonnet"
   description: "Create design language"
   prompt: |
     Establish the design language and style guide for this project.

package/template/skills/work-issue/SKILL.md CHANGED Viewed

@@ -168,7 +168,7 @@ Post design specifications as an issue comment following your AGENT.md template,
 ```
 Use the Task tool with:
   subagent_type: "designer"
-  model: "opus"
+  model: "sonnet"
   description: "Designer specs for issue #{ISSUE_NUMBER}"
   prompt: |
     Read orchestration/issues/{ISSUE_NUMBER}-{slug}/BRIEFING-design-specs.md first.
@@ -226,7 +226,7 @@ Implement issue #{ISSUE_NUMBER}: {title}
 ```
 Use the Task tool with:
   subagent_type: "developer"
-  model: "opus"
+  model: "sonnet"
   description: "Implement issue #{ISSUE_NUMBER}"
   prompt: |
     Read orchestration/issues/{ISSUE_NUMBER}-{slug}/BRIEFING-implement.md first.
@@ -342,7 +342,7 @@ Spawn all reviewers in parallel (single message, multiple Task calls):
 # Architect review
 Use the Task tool with:
   subagent_type: "architect"
-  model: "opus"
+  model: "sonnet"
   description: "Architect review PR #{pr_number}"
   prompt: |
     Read orchestration/issues/{ISSUE_NUMBER}-{slug}/BRIEFING-architect-review.md first.
@@ -353,7 +353,7 @@ Use the Task tool with:
 # Designer review (if UI)
 Use the Task tool with:
   subagent_type: "designer"
-  model: "opus"
+  model: "sonnet"
   description: "Designer review PR #{pr_number}"
   prompt: |
     Read orchestration/issues/{ISSUE_NUMBER}-{slug}/BRIEFING-designer-review.md first.
@@ -364,7 +364,7 @@ Use the Task tool with:
 # Tester verification
 Use the Task tool with:
   subagent_type: "tester"
-  model: "opus"
+  model: "sonnet"
   description: "Tester verify PR #{pr_number}"
   prompt: |
     Read orchestration/issues/{ISSUE_NUMBER}-{slug}/BRIEFING-test.md first.
@@ -428,7 +428,7 @@ Address review feedback on PR #{pr_number}.
 ```
 Use the Task tool with:
   subagent_type: "developer"
-  model: "opus"
+  model: "sonnet"
   description: "Address feedback PR #{pr_number}"
   prompt: |
     Read orchestration/issues/{ISSUE_NUMBER}-{slug}/BRIEFING-address-feedback.md first.