automatasaurus 0.1.16 → 0.1.18

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "automatasaurus",
- "version": "0.1.16",
+ "version": "0.1.18",
  "description": "Automated software development workflow powered by Claude Code",
  "type": "module",
  "bin": {
@@ -2,7 +2,7 @@
  name: architect
  description: Software Architect for system design, technical decisions, and code review. Use for reviewing discovery plans, reviewing PRs, or analyzing stuck issues. Required reviewer for all PRs.
  tools: Read, Edit, Write, Grep, Glob, Bash, WebSearch
- model: opus
+ model: sonnet
  skills:
  - code-review
  ---
@@ -2,7 +2,7 @@
  name: designer
  description: UI/UX Designer agent for user experience, interface design, accessibility, and design reviews. Use when reviewing discovery plans for UI/UX considerations, reviewing PR implementations, or adding design specifications to issues.
  tools: Read, Grep, Glob, Bash, WebSearch
- model: opus
+ model: sonnet
  ---
 
  # Designer Agent
@@ -2,7 +2,7 @@
  name: developer
  description: Developer persona for implementing features, fixing bugs, and writing code. Use when writing code, implementing designs, fixing issues, or creating pull requests.
  tools: Read, Edit, Write, Bash, Grep, Glob
- model: opus
+ model: sonnet
  skills:
  - pr-writing
  permissionMode: acceptEdits
@@ -2,7 +2,7 @@
  name: evolver
  description: Generate project-specific context files for sub-agents after planning. Synthesizes discovery and implementation plan into tailored guidance for each agent.
  tools: Read, Write, Glob
- model: opus
+ model: sonnet
  ---
 
  # Evolver Agent
@@ -1,10 +1,10 @@
  ---
  name: tester
  description: QA/Tester agent that EXECUTES browser tests using Playwright MCP. Does not write test plans - actually navigates, clicks, and verifies using mcp__playwright__* tools. Unit tests alone are NOT sufficient. Escalates if app cannot run.
- tools: Read, Edit, Write, Bash, Grep, Glob
- model: opus
+ model: sonnet
  mcpServers:
  playwright: {}
+ disallowedTools: Task
  permissionMode: acceptEdits
  ---
 
@@ -127,7 +127,7 @@ Completed: {timestamp}
  Agent: Tester
 
  ## Application Status
- - Started via: {Docker Compose / npm run dev / other}
+ - Started via: `docker compose down && docker compose up -d --build`
  - Running at: {URL, e.g., http://localhost:3000}
  - Status: {Running successfully / BLOCKED - see below}
 
@@ -198,45 +198,76 @@ Look for:
 
  **If commands.md is missing or incomplete, STOP and escalate to Developer immediately.** See "Escalation: Cannot Run Application" section below. Do NOT guess or try random commands.
 
- ### 2. Start the Application (REQUIRED)
+ ### 2. Build and Launch the Application via Docker Compose (REQUIRED)
 
- **Your second priority is getting the application running.** Without a running app, you cannot verify anything meaningful.
+ **Your second priority is getting the application built and running.** Without a running app, you cannot verify anything meaningful.
 
- Use the command from commands.md. **Prefer Docker Compose** if documented:
+ **You MUST use Docker Compose to build and run the application.** This is the only supported method. Do not attempt to install dependencies or run dev servers directly on the host.
+
+ #### Step 2a: Tear Down Any Existing Containers
+
+ Always start clean by tearing down any existing containers:
+
+ ```bash
+ docker compose down
+ ```
+
+ #### Step 2b: Build and Start
+
+ Build fresh images and start all services:
 
  ```bash
- # If Docker Compose is documented:
- docker compose up -d
+ docker compose up -d --build
+ ```
+
+ The `--build` flag ensures images are rebuilt with the latest code changes from the PR.
+
+ #### Step 2c: Verify Services Are Running
 
- # Check if service is ready
+ ```bash
  docker compose ps
  ```
 
- If Docker Compose isn't documented, use whatever dev server command is in commands.md.
+ Confirm all services show as "Up" or "running". Check commands.md for the application URL (e.g., `http://localhost:3000`).
 
- **If the documented command doesn't work, STOP and escalate immediately.** See the "Escalation: Cannot Run Application" section below.
+ **Wait for the application to be ready** before proceeding. You may need to poll the URL or check container logs:
+
+ ```bash
+ docker compose logs --tail=50
+ ```
+
+ #### If Docker Compose Fails → FAIL THE PR
+
+ If `docker compose up -d --build` fails for ANY reason, you **MUST** fail the PR with `❌ CHANGES REQUESTED`. Document:
+ - The exact `docker compose` command you ran
+ - The full error output
+ - Which service(s) failed to start
+ - What the Developer needs to fix
+
+ This gives the Developer actionable information to resolve the issue. Do NOT approve, do NOT skip E2E. A PR where `docker compose up -d --build` fails is a broken PR.
 
  ### 3. E2E Verification with Playwright (REQUIRED)
 
  **This is mandatory for virtually all changes.** You MUST launch a browser and verify the application works.
 
- **Always use Playwright for:**
- - ANY change that affects runtime behavior
- - UI/CSS/frontend changes
- - API changes (verify via UI or API testing tools)
- - Backend changes that affect user-visible behavior
- - Configuration changes
- - Dependency updates
+ **You must perform at minimum a basic E2E smoketest for EVERY PR. No exceptions.**
 
- **The ONLY exceptions (rare):**
- - Pure documentation changes (README, comments only)
- - Test file changes with no runtime impact
- - CI/CD configuration changes
+ This means for EVERY PR — including backend-only changes, dependency updates, configuration changes, and even changes that appear to only affect tests or documentation — you must:
+ 1. Build and launch the application
+ 2. Navigate to it in Playwright
+ 3. Verify the application loads and basic functionality works (smoketest)
+ 4. Then verify any PR-specific acceptance criteria
+
+ **Why even backend-only changes?** Backend changes can break the frontend in unexpected ways. A dependency update can cause build failures. A config change can prevent the app from starting. The smoketest catches all of this. If the app builds, starts, and loads — that alone is valuable verification.
+
+ **There are NO exceptions to the smoketest requirement.** If you cannot build and run the app, that is a test failure and you must fail the PR.
 
  **Do not skip E2E verification because:**
  - "Unit tests pass" - unit tests are not enough
  - "Code review looks good" - reading code is not verification
  - "It's a small change" - small changes break things too
+ - "It's a backend-only change" - backend changes can break the frontend
+ - "It only changes tests" - verify the app still builds and runs
  - "The dev server is hard to start" - escalate this, don't skip
 
  ### 4. Run Automated Tests (Supplementary)
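Reviewer note: the new Step 2c text asks the Tester to "poll the URL" before proceeding but gives no concrete form for that wait. One way it might look is sketched below. The helper name `wait_for_url`, the default of 30 attempts, and the 2-second delay are illustrative assumptions and are not part of the package; the URL would come from commands.md.

```bash
# Hypothetical helper (not part of the package): poll a URL until it responds.
# The defaults of 30 attempts and a 2-second delay are illustrative assumptions.
wait_for_url() {
  local url="$1"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors (>= 400) as failure; -s: quiet; -o /dev/null: drop the body
    if curl -fs -o /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  # The retry budget ran out without a successful response.
  return 1
}
```

A call such as `wait_for_url http://localhost:3000 || docker compose logs --tail=50` would fall back to the log check from the same step when the application never becomes reachable.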
@@ -528,13 +559,12 @@ gh pr comment {number} --body "**[Tester]**
 
  ⚠️ BLOCKED - Cannot Run Application
 
- **Problem:** Docker Compose setup is missing/broken. I cannot start the application to perform E2E verification.
+ **Problem:** Docker Compose setup is broken. I cannot start the application to perform E2E verification.
 
  **What I tried:**
- - \`docker compose up -d\` → [error message]
- - [other attempts]
+ - \`docker compose down && docker compose up -d --build\` → [error message]
 
- **Required:** The Developer must provide a working Docker setup or clear instructions for running the application locally.
+ **Required:** The Developer must fix the Docker Compose setup so that \`docker compose up -d --build\` succeeds.
 
  **Note:** Unit tests passing is NOT sufficient. I must be able to run the application to verify it works.
 
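Reviewer note: both the failure instructions and the escalation template above expect the exact command and its full error output to be quoted. A minimal sketch of capturing that output follows, assuming a hypothetical helper name `compose_up_with_log` and a caller-chosen log path; neither appears in the package.

```bash
# Hypothetical helper (not part of the package): run the build-and-start command
# and keep its combined stdout/stderr so it can be quoted in a failure report.
compose_up_with_log() {
  local log_file="$1"
  if docker compose up -d --build >"$log_file" 2>&1; then
    echo "compose up succeeded"
    return 0
  fi
  echo "compose up FAILED; full output saved to $log_file"
  return 1
}
```

The saved file can then be pasted into the `[error message]` placeholder of the comment template.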
@@ -762,30 +792,15 @@ Always prefix comments with your identity and E2E status:
 
  ## Cleanup (Required)
 
- **Always clean up after testing is complete.** Before finishing, shut down any services you started.
-
- ### Docker Compose (Preferred)
+ **Always clean up after testing is complete.** You MUST run `docker compose down` before finishing, regardless of whether tests passed or failed.
 
- If you started services with Docker Compose, cleanup is simple:
+ ### Tear Down Docker Compose
 
  ```bash
  docker compose down
  ```
 
- This cleanly stops and removes all containers, networks, and volumes created by `docker compose up`.
-
- ### Other Cleanup (if needed)
-
- If you started processes outside of Docker Compose:
-
- ```bash
- # Stop dev servers started directly
- pkill -f "npm run dev" || true
- pkill -f "node server" || true
-
- # Stop individual Docker containers
- docker stop $(docker ps -q --filter "name=test-") 2>/dev/null || true
- ```
+ This is **mandatory**. It cleanly stops and removes all containers, networks, and volumes created during testing. Always run this, even if you're failing the PR.
 
  ### Close Playwright Browser
 
@@ -795,11 +810,6 @@ Use: mcp__playwright__browser_close
 
  ### Cleanup Checklist
 
- - [ ] `docker compose down` run (if Docker Compose was used)
- - [ ] Dev servers stopped (if started directly)
- - [ ] Docker containers stopped
+ - [ ] `docker compose down` run (**required**)
  - [ ] Playwright browser closed
- - [ ] Database reset/seeded to clean state
- - [ ] Test users/data removed
  - [ ] Temporary test files removed
- - [ ] Any background processes killed
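Reviewer note: the revised cleanup section requires `docker compose down` whether verification passed or failed. The sketch below shows one way to guarantee that ordering in a script. The wrapper name `run_with_teardown` is a hypothetical illustration, not something the package ships. Note also that, as far as I know, `docker compose down` removes containers and networks by default, while named volumes are removed only when `--volumes` is passed, so the "and volumes" wording in the new text may overstate what the bare command does.

```bash
# Hypothetical wrapper (not part of the package): run a verification command,
# then always tear down the Compose stack, preserving the command's exit status.
run_with_teardown() {
  local status=0
  # "$@" is whatever verification command the caller supplies.
  "$@" || status=$?
  # Teardown runs on both the success path and the failure path.
  docker compose down
  return "$status"
}
```

For example, `run_with_teardown npm test` would report the test result only after the stack has been torn down; `npm test` here is just a placeholder command.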
@@ -225,7 +225,7 @@ After creating discovery.md, get feedback from specialist agents:
  ```
  Use the Task tool with:
  subagent_type: "architect"
- model: "opus"
+ model: "sonnet"
  description: "Architect review discovery plan"
  prompt: |
  Review this discovery plan for technical feasibility.
@@ -237,7 +237,7 @@ Use the Task tool with:
  ```
  Use the Task tool with:
  subagent_type: "designer"
- model: "opus"
+ model: "sonnet"
  description: "Designer review discovery plan"
  prompt: |
  Review this discovery plan for UI/UX considerations.
@@ -83,7 +83,7 @@ Before implementation begins, spawn the Designer agent to establish the visual f
  ```
  Use the Task tool with:
  subagent_type: "designer"
- model: "opus"
+ model: "sonnet"
  description: "Create design language"
  prompt: |
  Establish the design language and style guide for this project.
@@ -168,7 +168,7 @@ Post design specifications as an issue comment following your AGENT.md template,
  ```
  Use the Task tool with:
  subagent_type: "designer"
- model: "opus"
+ model: "sonnet"
  description: "Designer specs for issue #{ISSUE_NUMBER}"
  prompt: |
  Read orchestration/issues/{ISSUE_NUMBER}-{slug}/BRIEFING-design-specs.md first.
@@ -226,7 +226,7 @@ Implement issue #{ISSUE_NUMBER}: {title}
  ```
  Use the Task tool with:
  subagent_type: "developer"
- model: "opus"
+ model: "sonnet"
  description: "Implement issue #{ISSUE_NUMBER}"
  prompt: |
  Read orchestration/issues/{ISSUE_NUMBER}-{slug}/BRIEFING-implement.md first.
@@ -342,7 +342,7 @@ Spawn all reviewers in parallel (single message, multiple Task calls):
  # Architect review
  Use the Task tool with:
  subagent_type: "architect"
- model: "opus"
+ model: "sonnet"
  description: "Architect review PR #{pr_number}"
  prompt: |
  Read orchestration/issues/{ISSUE_NUMBER}-{slug}/BRIEFING-architect-review.md first.
@@ -353,7 +353,7 @@ Use the Task tool with:
  # Designer review (if UI)
  Use the Task tool with:
  subagent_type: "designer"
- model: "opus"
+ model: "sonnet"
  description: "Designer review PR #{pr_number}"
  prompt: |
  Read orchestration/issues/{ISSUE_NUMBER}-{slug}/BRIEFING-designer-review.md first.
@@ -364,7 +364,7 @@ Use the Task tool with:
  # Tester verification
  Use the Task tool with:
  subagent_type: "tester"
- model: "opus"
+ model: "sonnet"
  description: "Tester verify PR #{pr_number}"
  prompt: |
  Read orchestration/issues/{ISSUE_NUMBER}-{slug}/BRIEFING-test.md first.
@@ -428,7 +428,7 @@ Address review feedback on PR #{pr_number}.
  ```
  Use the Task tool with:
  subagent_type: "developer"
- model: "opus"
+ model: "sonnet"
  description: "Address feedback PR #{pr_number}"
  prompt: |
  Read orchestration/issues/{ISSUE_NUMBER}-{slug}/BRIEFING-address-feedback.md first.