buildanything 1.5.0 → 1.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -223,6 +223,34 @@ You are **UI Designer**, an expert user interface designer who creates beautiful
  }
  ```

+ ## Anti-AI-Template Design Rules
+
+ <HARD-GATE>
+ Your design must demonstrate intentional, research-backed choices — not framework defaults. Penalize yourself if 3+ of these appear together in your output:
+
+ - Purple-to-blue or purple-to-pink gradient hero backgrounds
+ - Floating mesh/blob gradient decorative elements
+ - Inter or Plus Jakarta Sans as the font choice (unless competitive research specifically justifies it)
+ - 3-column icon + heading + paragraph feature grids as the primary content pattern
+ - Glassmorphism/frosted glass as the primary design language
+ - Bento grid as default layout pattern
+ - Dark mode + neon accents as the "premium" aesthetic
+ - Generic illustration pack imagery (Undraw, Humaaans style)
+ - Perfect symmetry everywhere with no visual tension or personality
+
+ ONE or two in isolation is fine IF research supports it. THREE or more together = AI template smell.
+
+ Every visual choice must be JUSTIFIED by competitive research or design inspiration analysis. "I chose X because..." not "X is the default."
+ </HARD-GATE>
+
+ ## Research-Driven Design
+
+ When provided with a Design Research Brief and reference screenshots:
+ - Study the competitor and inspiration screenshots BEFORE making any visual decisions
+ - Cite specific references in your rationale: "The top Awwwards sites in this category use geometric sans-serifs with high x-heights. Competitor Y uses Inter which is ubiquitous. We chose Space Grotesk to differentiate while maintaining readability."
+ - Differentiate from competitors — don't copy them. Use the research to understand the visual landscape, then find your own position within it.
+ - The goal: a human designer would NOT say "this was generated by AI."
+
  ## 🔄 Your Workflow Process

  ### Step 1: Design System Foundation
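The red-flag gate added in the hunk above is mechanical enough to sketch in code. A hypothetical TypeScript helper, not part of the package, with illustrative flag names (the package states the rule only in prose, and it additionally allows one or two flags when research justifies them, which this simplified sketch ignores):

```typescript
// Illustrative names for the nine red flags listed in the hunk above.
const TEMPLATE_RED_FLAGS = [
  "purple-gradient-hero",
  "mesh-blob-decorations",
  "default-inter-font",
  "three-column-feature-grid",
  "glassmorphism-primary",
  "bento-default-layout",
  "dark-neon-premium",
  "generic-illustration-pack",
  "perfect-symmetry-everywhere",
] as const;

type RedFlag = (typeof TEMPLATE_RED_FLAGS)[number];

// One or two flags in isolation can be fine; three or more together
// is the "AI template smell" the gate penalizes.
function hasTemplateSmell(flagsPresent: RedFlag[]): boolean {
  return new Set(flagsPresent).size >= 3;
}

console.log(hasTemplateSmell(["purple-gradient-hero", "default-inter-font"]));
console.log(
  hasTemplateSmell([
    "purple-gradient-hero",
    "default-inter-font",
    "bento-default-layout",
  ])
);
```

Duplicates are collapsed through the `Set` so a flag counted twice cannot trip the gate on its own.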
@@ -292,6 +292,16 @@ document.addEventListener('DOMContentLoaded', () => {
  - **Cards**: Subtle hover effects, clear clickable areas
  ```

+ ## Research-Driven Architecture
+
+ When provided with a Design Research Brief and competitor/inspiration screenshots:
+ - Study reference screenshots BEFORE making structural decisions
+ - Base layout strategy on what performed best in the competitive analysis — not generic patterns
+ - If the top competitors all use a specific navigation pattern or layout approach, acknowledge it and either adopt it (with justification) or consciously differentiate
+ - Information architecture should reflect how the best sites in this category organize content
+ - Component hierarchy should consider what components the reference sites use effectively
+ - Your structural decisions directly influence the visual quality downstream — poor IA creates ugly interfaces regardless of visual polish
+
  ## 🔄 Your Workflow Process

  ### Step 1: Analyze Project Requirements
package/commands/build.md CHANGED
@@ -91,7 +91,7 @@ When spawning agents in sequence (e.g., architect → implementer → reviewer),
  2. **Previous agent's output** — what the upstream agent produced (if any)
  3. **Acceptance criteria** — what "done" looks like for THIS agent

- For implementation agents (Phase 4+): Do NOT paste the entire Design Document or Architecture Document. Extract the relevant sections only. For research and architecture agents (Phases 1-2): pass the full document — these agents need complete context to do their analysis.
+ For implementation agents (Phase 5+): Do NOT paste the entire Design Document or Architecture Document. Extract the relevant sections only. For research and architecture agents (Phases 1-2): pass the full document — these agents need complete context to do their analysis.

  ### Complexity Routing (Advisory)

@@ -169,7 +169,7 @@ Autonomous mode: Log checklist to `docs/plans/build-log.md`. Create `.env.exampl
  ### Step 0.3 — Initialize

  0. Create `docs/plans/` directory if it doesn't exist (greenfield projects won't have it).
- 1. Create a TodoWrite checklist with Phases 0-6.
+ 1. Create a TodoWrite checklist with Phases 0-7.
  2. Create `docs/plans/.build-state.md` as a single write with ALL of the following: phase and step (`Phase: 0 — Starting`), input (`[build request]`), context level (`[classification]`), prerequisites (`[status]`), dispatch counter (`dispatches_since_save: 0, last_save: Phase 0`), and a `## Resume Point` section with: phase, step, autonomous mode flag, completed tasks (none), git branch name.
  3. Go to Phase 1 (or Phase 2 if context level is "Full design").

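The single-write state file described in Step 0.3 might look like the following sketch. The bracketed values are the placeholders the step itself names; the exact layout is an assumption, since the package does not pin down a format beyond the listed fields:

```
# .build-state.md

Phase: 0 — Starting
Input: [build request]
Context level: [classification]
Prerequisites: [status]
dispatches_since_save: 0, last_save: Phase 0

## Resume Point

- Phase: 0
- Step: 0.3
- Autonomous mode: [on/off]
- Completed tasks: none
- Git branch: [branch name]
```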
@@ -294,21 +294,79 @@ Update TodoWrite and `docs/plans/.build-state.md`.

  ---

- ## Phase 3: Foundation
+ ## Phase 3: Design & Visual Identity

- ### Step 3.1 — Scaffolding
+ **Goal**: Transform architecture into a research-backed visual design system, proven with Playwright screenshots. Fully autonomous agents research, decide, and iterate without user input.
+
+ **Skip if** the project has no user-facing frontend (CLI tools, pure APIs, backend services).
+
+ <HARD-GATE>
+ UI/UX IS THE PRODUCT. This phase is a full peer to Architecture and Build — not a footnote, not an afterthought, not a "nice to have." Do NOT skip, compress, or rush this phase for any reason. The agents must research real competitors and award-winning sites, make deliberate visual choices backed by that research, build proof screens, and iterate with Playwright-verified visual QA before a single line of product code is written.
+
+ Phase 4 (Foundation) WILL NOT START without `docs/plans/visual-design-spec.md`. If it does not exist, return here.
+ </HARD-GATE>
+
+ ### Step 3.1 — Design Research (2 agents, parallel, both use Playwright)
+
+ Follow the Design Protocol (`commands/protocols/design.md`), Step 3.1.
+
+ Call the Agent tool 2 times in one message:
+
+ 1. Description: "Competitive visual audit" — Prompt: "Research the top 5-8 competitors/analogues for: [product description]. Use Playwright to screenshot each site (desktop 1920x1080 + mobile 375x812). Screenshot standout components (hero, cards, forms, nav, CTAs). Save to docs/plans/design-references/competitors/. Analyze visual language: colors, typography, spacing, what feels premium vs cheap. Rank by visual quality. DESIGN DOC: [paste]."
+
+ 2. Description: "Design inspiration mining" — Prompt: "Search Awwwards.com, Godly.website, SiteInspire for award-winning sites in category: [product category]. Use Playwright to screenshot top 5-8 results + standout components. Save to docs/plans/design-references/inspiration/. Identify visual trends, what separates best-in-class from generic. DESIGN DOC: [paste]."
+
+ After both return, synthesize a **Design Research Brief** to `docs/plans/design-research.md`. Include all screenshot paths.
+
+ ### Step 3.2 — Design Direction (2 agents, sequential)
+
+ Follow the Design Protocol (`commands/protocols/design.md`), Step 3.2.
+
+ 1. Call the Agent tool — description: "UX architecture" — Prompt: "Create structural design foundation. INPUTS: frontend architecture section from architecture.md [paste], Design Research Brief [paste], reference screenshot paths [list], user persona [paste]. OUTPUT: information architecture, layout strategy, component hierarchy, responsive approach, interaction patterns. Base decisions on competitive research, not generic patterns."
+
+ 2. Call the Agent tool — description: "Visual design spec" — Prompt: "Create the Visual Design Spec with AUTONOMOUS decisions — pick the single best direction, do not present options. INPUTS: UX foundation [paste previous output], Design Research Brief [paste], reference screenshot paths [list], user persona [paste]. OUTPUT: color system (with hex, light+dark), typography (Google Fonts, mathematical scale), 8px spacing system, tinted shadow system, border radius, animation/motion, component styles with ALL states. Every choice must cite the research. Apply anti-AI-template rules from the Design Protocol. Save to docs/plans/visual-design-spec.md."
+
+ ### Step 3.3 — Proof Screens (1 implementation agent)
+
+ Call the Agent tool — description: "Build proof screens" — mode: "bypassPermissions" — prompt: "[COMPLEXITY: L] Implement 2-3 proof screens (landing/hero, main app view, key form). INPUTS: Visual Design Spec [paste], UX foundation [paste relevant sections], reference screenshots [list paths — these are your visual targets]. Use EXACT colors, fonts, spacing from spec. Real styled responsive pages, not wireframes. Include hover/focus states, transitions. Commit: 'feat: proof screens for design validation'."
+
+ ### Step 3.4 — Visual QA Loop (Playwright + Metric Loop)
+
+ Run the Metric Loop Protocol (`commands/protocols/metric-loop.md`) using the measurement criteria from the Design Protocol (`commands/protocols/design.md`, Step 3.4).
+
+ Measurement: Playwright screenshots of proof screens (desktop + mobile). Design critic agent scores 0-100 across 6 dimensions: spacing/alignment, typography hierarchy, color harmony, component polish, responsive quality, originality (anti-AI-template check). Receives screenshots + Visual Design Spec + reference screenshots.
+
+ **Target: 80. Max 5 iterations.** On stall: accept if >= 65, log warning below 65.
+
+ ### Step 3.5 — Autonomous Quality Gate
+
+ Log to `docs/plans/build-log.md`: final screenshot paths, score history table, design decisions, originality score. No user pause. Proceed to Phase 4.
+
+ **Compaction checkpoint:** Check `dispatches_since_save` in `docs/plans/.build-state.md`. If >= 8: save ALL state (current phase, task statuses, metric loop scores, decisions) to `docs/plans/.build-state.md`. Reset `dispatches_since_save` to 0. TodoWrite does NOT survive compaction — rebuild it from this state file on resume.
+
+ ---
+
+ ## Phase 4: Foundation
+
+ <HARD-GATE>
+ Before starting Phase 4: Phase 2 must be approved AND Phase 3 must have produced `docs/plans/visual-design-spec.md`.
+ If visual-design-spec.md does not exist, DO NOT PROCEED. Return to Phase 3.
+ Step 4.2 (Design System) MUST implement from visual-design-spec.md — not generic architecture tokens.
+ </HARD-GATE>
+
+ ### Step 4.1 — Scaffolding

  Call the Agent tool — description: "Project scaffolding" — mode: "bypassPermissions" — prompt: "[COMPLEXITY: M] Set up the project from this architecture: [paste]. Create directory structure, dependencies, build tooling, linting config, test framework with one passing test, .gitignore, .env.example. Commit: 'feat: initial scaffolding'."

- ### Step 3.2 — Design System (frontend only)
+ ### Step 4.2 — Design System (frontend only)

- Call the Agent tool — description: "Design system setup" — mode: "bypassPermissions" — prompt: "Implement design system foundation from this architecture: [paste frontend section]. Create CSS tokens, base layout components, core UI primitives. Commit: 'feat: design system'."
+ Call the Agent tool — description: "Design system setup" — mode: "bypassPermissions" — prompt: "Implement the design system from the Visual Design Spec: [paste from docs/plans/visual-design-spec.md]. Create CSS tokens matching the spec's color system, typography scale, spacing system, shadow/elevation tokens, and base layout components. Reference the proof screens from Phase 3 as implementation targets. Commit: 'feat: design system'."

- ### Step 3.3 — Metric Loop: Scaffold Health
+ ### Step 4.3 — Metric Loop: Scaffold Health

  Run the Metric Loop Protocol. Define a metric: builds clean, tests pass, lint clean, structure matches architecture. Max 3 iterations.

- ### Step 3.4 — Verification Gate
+ ### Step 4.4 — Verification Gate

  Run the Verification Protocol (`commands/protocols/verify.md`). Critical rules (survive compaction):
  - ONE agent runs all 6 checks sequentially: Build → Type-Check → Lint → Test → Security → Diff Review. Stop on first FAIL.
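The Step 3.4 exit rule in the hunk above (target 80, max 5 iterations, stall acceptance at 65) reduces to a small decision function. A hypothetical TypeScript sketch, not from the package; it assumes "stall" means no score improvement between iterations, which the protocol leaves unspecified:

```typescript
// Decision for the visual QA metric loop described in Step 3.4.
// scores: critic scores so far (0-100), one entry per completed iteration.
type LoopDecision = "continue" | "accept" | "accept-with-warning";

function visualQaDecision(
  scores: number[],
  target = 80,
  maxIterations = 5,
  stallFloor = 65
): LoopDecision {
  if (scores.length === 0) return "continue";
  const latest = scores[scores.length - 1];
  if (latest >= target) return "accept";
  // Assumed stall definition: the latest score did not improve on the previous one.
  const stalled = scores.length >= 2 && latest <= scores[scores.length - 2];
  if (scores.length >= maxIterations || stalled) {
    // On stall or exhaustion: accept at >= 65, otherwise accept but log a warning.
    return latest >= stallFloor ? "accept" : "accept-with-warning";
  }
  return "continue";
}
```

An improving run below target keeps iterating; a stalled run at 68 is accepted, while a stalled run at 54 is accepted with a logged warning.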
@@ -318,7 +376,7 @@ Run the Verification Protocol (`commands/protocols/verify.md`). Critical rules (

  Call the Agent tool — description: "Verify scaffolding" — mode: "bypassPermissions" — prompt: "Run the Verification Protocol. Execute all 6 checks sequentially, stop on first failure. Report: VERIFY: PASS or VERIFY: FAIL with details."

- Do not proceed to Phase 4 until verification passes.
+ Do not proceed to Phase 5 until verification passes.

  Update TodoWrite and state.

@@ -326,23 +384,23 @@ Update TodoWrite and state.

  ---

- ## Phase 4: Build — Metric-Driven Dev Loops
+ ## Phase 5: Build — Metric-Driven Dev Loops

  <HARD-GATE>
- Before starting: Phase 2 must be approved, Phase 3 must pass. You MUST call the Agent tool for EVERY task. No exceptions.
+ Before starting: Phase 2 must be approved, Phase 3 must produce docs/plans/visual-design-spec.md, Phase 4 must pass. You MUST call the Agent tool for EVERY task. No exceptions.
  </HARD-GATE>

  Expand TodoWrite with each sprint task.

  **For EACH task:**

- ### Step 4.1 — Implement
+ ### Step 5.1 — Implement

  Call the Agent tool — description: "[task name]" — mode: "bypassPermissions" — prompt: "TASK: [task description + acceptance criteria]. HANDOFF — Architecture section: [paste ONLY the relevant section from architecture.md]. Design section: [paste ONLY the relevant section from the design doc]. Previous task output: [what the last completed task produced, if relevant]. Implement fully with real code and tests. Commit: 'feat: [task]'. Report what you built, files changed, and test results."

  Pick the right developer framing: frontend, backend, AI, etc. Set `[COMPLEXITY: S/M/L]` based on the task's Size from sprint-tasks.md.

- ### Step 4.1b — Cleanup (De-Sloppify)
+ ### Step 5.1b — Cleanup (De-Sloppify)

  Follow the Cleanup Protocol (`commands/protocols/cleanup.md`). Critical rules (survive compaction):
  [COMPLEXITY: S]
@@ -354,11 +412,11 @@ Follow the Cleanup Protocol (`commands/protocols/cleanup.md`). Critical rules (s

  Call the Agent tool — description: "Cleanup [task name]" — mode: "bypassPermissions" — with the list of files changed and the task's acceptance criteria.

- ### Step 4.2 — Metric Loop: Task Quality
+ ### Step 5.2 — Metric Loop: Task Quality

  Run the Metric Loop Protocol on the task implementation. Define a metric based on the task's acceptance criteria. Max 5 iterations.

- ### Step 4.3 — Loop Exit
+ ### Step 5.3 — Loop Exit

  On target met: mark task complete in TodoWrite, report "Task X/N: [name] — COMPLETE (score: [final], iterations: [count])".

@@ -368,7 +426,7 @@ On stall or max iterations:

  After each task: update TodoWrite and `docs/plans/.build-state.md`.

- ### Step 4.4 — Post-Task Verification
+ ### Step 5.4 — Post-Task Verification

  Run the Verification Protocol (`commands/protocols/verify.md`) to catch regressions. If FAIL, fix before starting the next task.

@@ -376,13 +434,13 @@ Run the Verification Protocol (`commands/protocols/verify.md`) to catch regressi

  ---

- ## Phase 5: Harden — Metric-Driven Hardening
+ ## Phase 6: Harden — Metric-Driven Hardening

- ### Step 5.0 — Pre-Hardening Verification
+ ### Step 6.0 — Pre-Hardening Verification

  Run the Verification Protocol (`commands/protocols/verify.md`). ONE agent, 6 sequential checks (Build → Type → Lint → Test → Security → Diff), stop on first FAIL. Max 3 fix attempts. All checks must pass before starting expensive audit agents — do not waste audit agents on code that doesn't build or pass tests.

- ### Step 5.1 — Initial Audit (4 agents in parallel, ONE message)
+ ### Step 6.1 — Initial Audit (4 agents in parallel, ONE message)

  Call the Agent tool 4 times in one message:

394
452
 
395
453
  4. Description: "Security audit" — Prompt: "Security review: auth, input validation, data exposure, dependency vulnerabilities. Report findings with severity."
396
454
 
397
- ### Step 5.1b — Eval Harness
455
+ ### Step 6.1b — Eval Harness
398
456
 
399
- Run the Eval Harness Protocol (`commands/protocols/eval-harness.md`). Define 8-15 concrete, executable eval cases from the audit findings and architecture doc. Run the eval agent. Record baseline pass rate. CRITICAL and HIGH failures feed into the metric loop in Step 5.2 as specific issues to fix.
457
+ Run the Eval Harness Protocol (`commands/protocols/eval-harness.md`). Define 8-15 concrete, executable eval cases from the audit findings and architecture doc. Run the eval agent. Record baseline pass rate. CRITICAL and HIGH failures feed into the metric loop in Step 6.2 as specific issues to fix.
400
458
 
401
- ### Step 5.2 — Metric Loop: Hardening Quality
459
+ ### Step 6.2 — Metric Loop: Hardening Quality
402
460
 
403
461
  Run the Metric Loop Protocol on the full codebase using audit findings as initial input. Define a composite metric based on what this project needs. Max 4 iterations.
404
462
 
405
463
  When fixing, dispatch to the RIGHT specialist. Security → security agent. Accessibility → frontend agent. Don't send everything to one agent.
406
464
 
407
- ### Step 5.2b — Eval Re-run
465
+ ### Step 6.2b — Eval Re-run
408
466
 
409
467
  Re-run the Eval Harness after the metric loop exits. All CRITICAL eval cases must now pass. If any CRITICAL case still fails, include it as evidence for the Reality Checker.
410
468
 
411
- ### Step 5.3Reality Check
469
+ ### Step 6.2cE2E Testing (3 mandatory iterations)
470
+
471
+ <HARD-GATE>
472
+ ALL 3 ITERATIONS ARE MANDATORY. Do NOT stop after iteration 1 even if all tests pass. The purpose of 3 runs is to catch flaky tests, timing-dependent failures, and race conditions that only surface on repeated execution. Skip this step ONLY if the project has no user-facing frontend.
473
+ </HARD-GATE>
474
+
475
+ Generate and execute end-to-end tests using Playwright against the running application. Tests cover critical user journeys derived from the design doc and architecture.
476
+
477
+ **Iteration 1 — Generate & Run:**
478
+
479
+ Call the Agent tool — description: "E2E test generation" — mode: "bypassPermissions" — prompt:
480
+
481
+ "[COMPLEXITY: L] Generate and run end-to-end Playwright tests for this application.
482
+
483
+ INPUTS:
484
+ - Architecture doc (user flows and API contracts): [paste relevant sections from docs/plans/architecture.md]
485
+ - Design doc (core user journeys): [paste relevant sections]
486
+ - Visual Design Spec (component selectors and page structure): [paste relevant sections from docs/plans/visual-design-spec.md]
487
+
488
+ REQUIREMENTS:
489
+ 1. Identify 5-10 critical user journeys from the design doc (auth flows, core feature flows, data entry, navigation)
490
+ 2. Use Page Object Model pattern — one page object per major view
491
+ 3. Use data-testid selectors (add them to components if missing)
492
+ 4. Wait for API responses, NEVER use arbitrary timeouts (no waitForTimeout)
493
+ 5. Capture screenshots at critical verification points
494
+ 6. Configure multi-browser: Chromium + Firefox + WebKit
495
+ 7. Set up playwright.config.ts with: fullyParallel, retries: 0 (we handle retries ourselves), screenshot: 'only-on-failure', video: 'retain-on-failure', trace: 'on-first-retry'
496
+ 8. Run all tests. Report: total, passed, failed, with failure details and screenshot paths.
497
+ 9. Commit: 'test: e2e test suite for critical user journeys'
498
+
499
+ Test priority:
500
+ - CRITICAL: Auth, core feature happy path, data submission, payment/transaction flows
501
+ - HIGH: Search, filtering, navigation, error states
502
+ - MEDIUM: Responsive layout, animations, edge cases"
503
+
504
+ Record results: total tests, pass count, fail count, failure details. Log to `docs/plans/.build-state.md` under `## E2E Testing`:
505
+
506
+ ```
507
+ | Iter | Total | Passed | Failed | Flaky | Top Failure |
508
+ |------|-------|--------|--------|-------|-------------|
509
+ | 1 | ... | ... | ... | ... | ... |
510
+ ```
511
+
512
+ **Iteration 2 — Fix & Re-run:**
513
+
514
+ Call the Agent tool — description: "E2E fix iteration 2" — mode: "bypassPermissions" — prompt:
515
+
516
+ "[COMPLEXITY: M] Fix E2E test failures and re-run the full suite.
517
+
518
+ ITERATION 1 RESULTS: [paste failure details — test names, error messages, screenshot paths]
519
+
520
+ For each failure:
521
+ 1. Diagnose: Is this a real bug, a flaky test, or a missing data-testid?
522
+ 2. Real bugs: Fix the application code
523
+ 3. Flaky tests: Add proper waits, fix race conditions, improve selectors
524
+ 4. Missing selectors: Add data-testid attributes to components
525
+ 5. Do NOT delete or skip failing tests — fix them
526
+
527
+ Re-run ALL tests (not just previously failing ones). Report results.
528
+ Commit fixes: 'fix: e2e test failures iteration 2'"
529
+
530
+ Record results in the E2E table. Identify any tests that passed in iteration 1 but failed in iteration 2 — these are flaky candidates.
531
+
532
+ **Iteration 3 — Final Stability Run:**
533
+
534
+ Call the Agent tool — description: "E2E stability run" — mode: "bypassPermissions" — prompt:
535
+
536
+ "[COMPLEXITY: M] Final E2E stability run — iteration 3 of 3.
537
+
538
+ PREVIOUS RESULTS:
539
+ - Iteration 1: [pass/fail counts]
540
+ - Iteration 2: [pass/fail counts]
541
+ - Flaky candidates: [tests that had inconsistent results across iterations]
542
+
543
+ REQUIREMENTS:
544
+ 1. Run ALL tests with --repeat-each=3 to detect flakiness (each test runs 3 times within this iteration)
545
+ 2. Any test failing inconsistently across the 3 sub-runs: quarantine with test.fixme() and file path + reason
546
+ 3. Fix any remaining consistent failures
547
+ 4. Generate final report with: total journeys, pass rate, flaky count, quarantined tests
548
+ 5. Commit: 'test: e2e stability fixes iteration 3'
549
+
550
+ PASS CRITERIA: 95%+ pass rate across all tests. Quarantined flaky tests do not count against pass rate but must be logged."
551
+
552
+ Record final results. Include in Reality Checker evidence.
553
+
554
+ ### Step 6.3 — Reality Check
412
555
 
413
- Call the Agent tool — description: "Final verdict" — prompt: "You are the Reality Checker. Default: NEEDS WORK. The hardening loop reached score [final_score] after [iterations] iterations. Score history: [paste table]. Review all evidence. Eval harness results: [baseline pass rate] → [final pass rate]. CRITICAL failures remaining: [list or none]. Verdict: PRODUCTION READY or NEEDS WORK with specifics."
556
+ Call the Agent tool — description: "Final verdict" — prompt: "You are the Reality Checker. Default: NEEDS WORK. The hardening loop reached score [final_score] after [iterations] iterations. Score history: [paste table]. Review all evidence. Eval harness results: [baseline pass rate] → [final pass rate]. E2E test results: [paste E2E table — 3 iterations, final pass rate, quarantined count]. CRITICAL failures remaining: [list or none]. Verdict: PRODUCTION READY or NEEDS WORK with specifics."
414
557
 
415
558
  <HARD-GATE>Do NOT self-approve. Reality Checker must give the verdict.</HARD-GATE>
416
559
 
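The flaky-candidate rule from Step 6.2c above (a test that passed in iteration 1 but failed in iteration 2, or vice versa) reduces to a comparison of two full-suite runs. A hypothetical TypeScript sketch with illustrative test names and an illustrative result shape, not code the package ships:

```typescript
// A full-suite run maps each test name to its verdict.
type SuiteRun = Map<string, "passed" | "failed">;

// A test whose verdict changes between full-suite runs is a flaky candidate;
// a test that fails in both runs is a consistent failure to fix instead.
function flakyCandidates(iter1: SuiteRun, iter2: SuiteRun): string[] {
  const candidates: string[] = [];
  for (const [test, verdict1] of iter1) {
    const verdict2 = iter2.get(test);
    if (verdict2 !== undefined && verdict2 !== verdict1) candidates.push(test);
  }
  return candidates.sort();
}

const run1: SuiteRun = new Map([
  ["auth: login happy path", "passed"],
  ["checkout: submit order", "failed"],
  ["nav: mobile menu", "passed"],
]);
const run2: SuiteRun = new Map([
  ["auth: login happy path", "failed"], // passed -> failed: flaky candidate
  ["checkout: submit order", "failed"], // consistent failure: real bug
  ["nav: mobile menu", "passed"],
]);

console.log(flakyCandidates(run1, run2)); // ["auth: login happy path"]
```

Iteration 3's `--repeat-each=3` run applies the same idea within a single iteration, quarantining any test whose three sub-runs disagree.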
@@ -421,21 +564,21 @@ Call the Agent tool — description: "Final verdict" — prompt: "You are the Re

  ---

- ## Phase 6: Ship
+ ## Phase 7: Ship

- ### Step 6.0 — Pre-Ship Verification
+ ### Step 7.0 — Pre-Ship Verification

- Final verification gate. Run the Verification Protocol (`commands/protocols/verify.md`). ONE agent, 6 sequential checks (Build → Type → Lint → Test → Security → Diff), stop on first FAIL. Max 3 fix attempts. All checks must pass before documenting and shipping. If FAIL persists, return to Phase 5 for targeted fixes.
+ Final verification gate. Run the Verification Protocol (`commands/protocols/verify.md`). ONE agent, 6 sequential checks (Build → Type → Lint → Test → Security → Diff), stop on first FAIL. Max 3 fix attempts. All checks must pass before documenting and shipping. If FAIL persists, return to Phase 6 for targeted fixes.

- ### Step 6.1 — Documentation
+ ### Step 7.1 — Documentation

  Call the Agent tool — description: "Documentation" — mode: "bypassPermissions" — prompt: "Write project docs: README with setup/architecture/usage, API docs if applicable, deployment notes. Commit: 'docs: project documentation'."

- ### Step 6.2 — Metric Loop: Documentation Quality
+ ### Step 7.2 — Metric Loop: Documentation Quality

  Run the Metric Loop Protocol on documentation. Define a metric based on completeness and whether a new developer could follow the README. Max 3 iterations.

- ### Step 6.3 — Record Learnings
+ ### Step 7.3 — Record Learnings

  Append to `docs/plans/learnings.md` (create if it doesn't exist). Review the build and record 3-5 learnings:

@@ -457,4 +600,4 @@ Metric loops run: [count] | Avg iterations: [N]
  Remaining: [any NEEDS WORK items]
  ```

- Mark all TodoWrite items complete. Update `docs/plans/.build-state.md`: "Phase: 6 COMPLETE."
+ Mark all TodoWrite items complete. Update `docs/plans/.build-state.md`: "Phase: 7 COMPLETE."
@@ -4,7 +4,7 @@ You are the orchestrator. A build, type-check, or lint check has failed. Do NOT

  ## When to Use

- When the Verification Protocol reports FAIL on Build, Type-Check, or Lint checks. Also usable during Phase 3 scaffolding or Phase 4 implementation when builds break.
+ When the Verification Protocol reports FAIL on Build, Type-Check, or Lint checks. Also usable during Phase 4 scaffolding or Phase 5 implementation when builds break.

  ## Step 1: Extract First Error

@@ -0,0 +1,287 @@
+ # Design & Visual Identity Protocol
+
+ You are the orchestrator. Phase 2 (Architecture) is complete. Before building anything, you must establish a research-backed visual design system. This phase is a FULL PEER to Architecture and Build — not a footnote.
+
+ ## Why This Phase Exists
+
+ UI/UX is the first thing a user experiences. A structurally sound app with ugly UI fails. A beautiful app with minor bugs succeeds. Design is not decoration — it is the product.
+
+ Top design firms (Pentagram, Work & Co, Clay, Metalab) treat design as its own phase with its own research, iteration, and quality gates. This protocol replicates that process: Discovery → Direction → Prototyping → Visual QA.
+
+ ---
+
+ ## Step 3.1 — Design Research (2 agents, parallel, both use Playwright)
+
+ Launch 2 agents in ONE message. Both MUST use Playwright to capture real screenshots — text descriptions of competitor sites are insufficient. Downstream agents need visual references.
+
+ **Agent 1: "Competitive visual audit"**
+
+ ```
+ You are a senior visual design researcher. Find the top 5-8 competitors or analogues for: [product description from design doc].
+
+ For each competitor:
+ 1. Use Playwright to navigate to their site
+ 2. Take full-page screenshots (desktop 1920x1080 + mobile 375x812)
+ 3. Screenshot standout components: hero sections, cards, forms, navigation, CTAs, footer
+ 4. Save all screenshots to docs/plans/design-references/competitors/[site-name]/
+
+ Analyze each site's visual language:
+ - Color palette (extract dominant colors)
+ - Typography choices (font families, scale, weight usage)
+ - Spacing rhythm (generous vs compact, section padding)
+ - Component style (shadows, borders, radius, elevation)
+ - What makes it feel premium or cheap?
+ - What would you steal vs avoid?
+
+ Output: Ranked analysis by visual quality and relevance. Include screenshot paths.
+ ```
+
+ **Agent 2: "Design inspiration mining"**
+
+ ```
+ You are a senior visual design researcher. Search Awwwards.com, Godly.website, and SiteInspire for award-winning sites in the category: [product category — SaaS, developer tool, e-commerce, marketplace, etc.].
+
+ For the top 5-8 results:
+ 1. Use Playwright to navigate and take full-page screenshots (desktop + mobile)
+ 2. Screenshot standout components and interactions worth referencing
+ 3. Save all screenshots to docs/plans/design-references/inspiration/[site-name]/
+
+ Identify cross-cutting patterns:
+ - What do the best-in-class sites have in common?
+ - What visual trends dominate this category right now?
+ - What separates "Awwwards worthy" from "generic template"?
+ - What specific techniques create the premium feel? (spacing, typography, animation, color)
+
+ Output: Trend analysis with specific adoptable patterns and anti-patterns to avoid. Include screenshot paths.
+ ```
+
+ After both return, synthesize a **Design Research Brief** saved to `docs/plans/design-research.md`. Include all screenshot paths for downstream agent reference.
+
+ ---
+
+ ## Step 3.2 — Design Direction (2 agents, sequential)
+
+ The UI Designer makes ALL decisions autonomously. No "Direction A vs B" presentations. Pick the best based on the research.
+
+ **Agent 1: UX Architect**
+
+ ```
+ You are the UX Architect. Create the structural design foundation.
+
+ INPUTS:
+ - Architecture doc (frontend section): [paste]
+ - Design Research Brief: [paste from docs/plans/design-research.md]
+ - Reference screenshots: [list paths from docs/plans/design-references/]
+ - User persona from Phase 1 research: [paste relevant section]
+
+ OUTPUT a UX Foundation document:
+ 1. Information architecture and content hierarchy
+ 2. User flow diagrams for core interactions
+ 3. Layout strategy — which pages use which layout patterns, informed by what worked in the research
+ 4. Component hierarchy — what components exist, how they compose
+ 5. Responsive breakpoint strategy (mobile-first)
+ 6. Navigation patterns
+ 7. Interaction patterns: hover, focus, loading, error, empty, success states
+
+ Base layout and flow decisions on what performed best in the competitive analysis — not generic patterns.
+ ```
+
+ **Agent 2: UI Designer**
+
+ ```
+ You are the UI Designer. Create the Visual Design Spec.
+
+ INPUTS:
+ - UX Foundation from UX Architect: [paste full output]
+ - Design Research Brief: [paste from docs/plans/design-research.md]
+ - Reference screenshots: [list paths from docs/plans/design-references/]
+ - User persona: [paste relevant section]
+
+ Make AUTONOMOUS decisions. Do not present options. Pick the single best direction based on the research.
+
+ OUTPUT a Visual Design Spec covering:
+
+ 1. **Color System** — Primary, secondary, accent, semantic (success/warning/error/info), neutral palette. Full hex values for light AND dark themes. Rationale tied to research: "competitor X uses muted blues; we differentiate with warm neutrals because our persona values approachability."
+
+ 2. **Typography System** — Font families (from Google Fonts or system fonts), size scale using a mathematical ratio (Major Third 1.25 or Perfect Fourth 1.333), weights, line heights (body: 1.5-1.6x, headings: 1.1-1.3x), letter spacing adjustments. MAX 2 font families.
+
+ 3. **Spacing System** — 8px base unit. Scale: 4, 8, 12, 16, 24, 32, 48, 64, 96, 128px. Rule: internal component padding MUST be less than external margin between components (Gestalt proximity principle).
+
+ 4. **Shadow & Elevation** — Layered shadow system using tinted shadows (NOT pure black — e.g., rgba(0,0,50,0.08) instead of rgba(0,0,0,0.1)). Ambient shadow + key shadow per elevation level. Levels: flat, raised (cards), elevated (dropdowns), overlay (modals), top (tooltips).
+
+ 5. **Border Radius** — ONE primary radius for the entire app (pick 4px, 6px, 8px, or 12px and justify). Pill radius for tags/badges only.
+
+ 6. **Animation & Motion** — Easing functions (ease-out for entrances, ease-in for exits, ease-in-out for transitions). Duration scale: micro 150ms, normal 300ms, emphasis 500ms. Stagger timing for lists: 30-50ms between items. Respect prefers-reduced-motion.
+
+ 7. **Component Styles** — For each component (buttons, inputs, cards, badges, navigation, modals, alerts, tables):
+ - ALL states: default, hover, active, focus-visible, disabled, loading
+ - Exact CSS properties: background, color, border, shadow, padding, font-size, font-weight, border-radius, transition
119
+
120
+ 8. **Design Rationale** — For EVERY major decision, cite the research. "The top 3 Awwwards sites in this category use geometric sans-serifs with high x-heights. Competitor Y uses Inter which is ubiquitous. We chose Space Grotesk to differentiate while maintaining the same readability characteristics."
121
+
122
+ ANTI-AI-TEMPLATE RULES:
123
+ Your design MUST NOT fall into the generic AI aesthetic. Penalize yourself if 3+ of these appear together:
124
+ - Purple-to-blue or purple-to-pink gradient hero backgrounds
125
+ - Floating mesh/blob gradient decorative elements
126
+ - Inter or Plus Jakarta Sans as the font choice (unless research specifically justifies it)
127
+ - 3-column icon + heading + paragraph feature grids as the primary content pattern
128
+ - Glassmorphism/frosted glass as the primary design language
129
+ - Bento grid as default layout
130
+ - Dark mode + neon accents as the "premium" look
131
+ - Generic illustration pack imagery (Undraw, Humaaans style)
132
+ - Perfect symmetry everywhere with no visual tension or personality
133
+
134
+ ONE or two of these in isolation is fine IF the research supports it. THREE or more together = AI template smell. Every visual choice must be JUSTIFIED by the research, not by framework defaults.
135
+
136
+ Save output to docs/plans/visual-design-spec.md.
137
+ ```
138
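The ratio-based size scale in the Typography System item can be sketched numerically. A minimal illustration, assuming a 16px base and the Major Third (1.25) ratio named in the spec; `type_scale` is a hypothetical helper, not part of the toolkit:

```python
# Modular type scale: each level multiplies the base by a fixed ratio.
BASE_PX = 16.0
MAJOR_THIRD = 1.25  # Perfect Fourth would be 1.333

def type_scale(levels: int, base: float = BASE_PX, ratio: float = MAJOR_THIRD) -> list[float]:
    """Return font sizes in px, smallest (body) first."""
    return [round(base * ratio**i, 2) for i in range(levels)]

print(type_scale(5))  # [16.0, 20.0, 25.0, 31.25, 39.06]
```

With the Perfect Fourth ratio the same five levels grow faster (16, 21.33, 28.43, ...), which is one reason the spec asks the designer to pick and justify a single ratio.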
+
+ ---
+
+ ## Step 3.3 — Proof Screens (1 implementation agent)
+
+ ```
+ [COMPLEXITY: L] Implement 2-3 proof screens — the most visually demanding pages in this product:
+
+ 1. Landing page / hero section (the first impression)
+ 2. Main app view (dashboard, feed, workspace — the core experience)
+ 3. A form or interactive component (sign up, settings, creation flow)
+
+ INPUTS:
+ - Visual Design Spec: [paste from docs/plans/visual-design-spec.md]
+ - UX Foundation: [paste relevant layout and component sections]
+ - Reference screenshots: [list paths from docs/plans/design-references/ — these are your visual targets]
+
+ REQUIREMENTS:
+ - Real, styled, responsive pages. NOT wireframes or skeletons.
+ - Use the EXACT colors, fonts, spacing, shadows from the Visual Design Spec. Do not deviate.
+ - Include hover states, focus states, transitions, loading states.
+ - Mobile-responsive at 375px, 768px, 1024px, 1280px breakpoints.
+ - These screens PROVE the design system works. They must look like they belong next to the Awwwards references from the research.
+
+ Commit: 'feat: proof screens for design validation'
+ ```
+
+ ---
+
+ ## Step 3.4 — Visual QA Loop (Playwright + Metric Loop)
+
+ Run the Metric Loop Protocol (`commands/protocols/metric-loop.md`).
+
+ **Metric definition for `.build-state.md`:**
+
+ ```
+ ## Active Metric Loop
+ Phase: 3
+ Artifact: Proof screens (landing page, main app view, form/interaction)
+ Metric: Visual design quality — implementation fidelity to Visual Design Spec + competitive quality relative to Awwwards/competitor references
+ How to measure: Playwright screenshots of proof screens (desktop 1920x1080 + mobile 375x812), scored by design critic agent across 6 dimensions
+ Target: 80
+ Max iterations: 5
+ ```
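Because the orchestrator resumes from `.build-state.md` after context compaction, the numeric fields in this block need to be machine-readable. A sketch of one way a resume script might pull them out; `STATE` and `read_field` are illustrative names, not part of the plugin:

```python
import re

STATE = """## Active Metric Loop
Phase: 3
Target: 80
Max iterations: 5
"""

def read_field(state: str, field: str) -> int:
    """Extract an integer field like 'Target: 80' from the state block."""
    m = re.search(rf"^{re.escape(field)}:\s*(\d+)\s*$", state, re.MULTILINE)
    if m is None:
        raise ValueError(f"missing field: {field}")
    return int(m.group(1))

print(read_field(STATE, "Target"), read_field(STATE, "Max iterations"))  # 80 5
```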
+
+ **Measurement agent prompt:**
+
+ ```
+ You are a senior design critic at a top-tier agency (Pentagram, Work & Co). You are reviewing a product's visual implementation for quality.
+
+ INPUTS:
+ - Screenshots of current proof screens: [Playwright captures — desktop + mobile]
+ - The Visual Design Spec the implementation should follow: [paste from docs/plans/visual-design-spec.md]
+ - Reference screenshots from competitors and Awwwards winners: [paths in docs/plans/design-references/]
+
+ Score 0-100 across these 6 dimensions (weight equally, average for final score):
+
+ 1. **Spacing & Alignment (0-100)**
+    - Is the 8px grid respected consistently?
+    - Do elements breathe? Generous whitespace between sections (hero padding 120-200px, not 40px)?
+    - Internal component padding < external margin between components (Gestalt proximity)?
+    - Visual grouping through whitespace, not just borders?
+
+ 2. **Typography Hierarchy (0-100)**
+    - Clear 3-4 levels of visual hierarchy?
+    - Consistent type scale from the spec applied?
+    - Proper line heights (body: 1.5-1.6x, headings: 1.1-1.3x)?
+    - Font weight contrast used effectively (not just size)?
+    - Letter spacing appropriate for context?
+
+ 3. **Color Harmony (0-100)**
+    - Cohesive palette matching the spec?
+    - 60-30-10 rule (60% neutral, 30% secondary, 10% accent)?
+    - WCAG AA contrast ratios (4.5:1 body, 3:1 large text)?
+    - Shadows tinted not pure black?
+    - Colors slightly desaturated (refined, not garish)?
+
+ 4. **Component Polish (0-100)**
+    - Hover states present and smooth?
+    - Focus-visible indicators for keyboard nav?
+    - Consistent border radius throughout?
+    - Shadow/elevation system applied per spec?
+    - Transitions feel intentional (not instant, not sluggish)?
+    - Loading/empty states considered?
+
+ 5. **Responsive Quality (0-100)**
+    - Mobile layout functional and readable at 375px?
+    - No horizontal scroll on any breakpoint?
+    - Touch targets 44px+ on mobile?
+    - Layout ADAPTS (not just stacks) — different patterns per breakpoint?
+    - Images and media scale properly?
+
+ 6. **Originality (0-100)**
+    - Does this look DESIGNED or GENERATED?
+    - Penalize heavily if 3+ of these appear together:
+      * Purple/blue gradient hero background
+      * Floating blob/mesh gradient decorations
+      * Inter or Plus Jakarta Sans as the only font
+      * 3-column icon+heading+paragraph feature grids
+      * Glassmorphism cards as primary style
+      * Bento grid as default layout
+      * Dark mode + neon accents aesthetic
+      * Generic illustration pack imagery
+      * Perfect symmetry everywhere, no visual tension
+    - One or two in isolation is fine. Three+ together = "AI template" smell.
+    - The test: would a human designer say "this was made by AI"?
+    - Does the design have personality and point of view?
+
+ Return format:
+ SCORE: [average of 6 dimensions, rounded to nearest integer]
+ DIMENSION SCORES: [list each dimension with its score]
+ TOP ISSUE: [the single highest-impact change that would most improve the overall score]
+ FINDINGS: [detailed list of specific issues, each with the file path and line/component where the fix should happen]
+ ```
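The WCAG AA thresholds in the Color Harmony dimension (4.5:1 body, 3:1 large text) can be checked mechanically rather than by eye. A sketch of the standard WCAG 2.x relative-luminance contrast computation; the function names are illustrative:

```python
def _linear(channel: int) -> float:
    """sRGB channel (0-255) to linear-light value, per WCAG 2.x."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(rgb: tuple[int, int, int]) -> float:
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """Contrast ratio, from 1.0 (identical colors) to 21.0 (black on white)."""
    hi, lo = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)

print(round(contrast((0, 0, 0), (255, 255, 255)), 1))  # 21.0
```

A design critic agent could run checks like `contrast(body_text, background) >= 4.5` over the spec's palette before scoring screenshots.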
+
+ **Fix agent receives:** ONLY the top issue + relevant file paths + the relevant Visual Design Spec section. One fix per iteration. Commit each fix.
+
+ **Exit conditions (from metric-loop protocol):**
+ - Score >= 80 → proceed to Phase 4
+ - Stall (2 consecutive delta <= 0) → accept if score >= 65, log warning below 65
+ - Max 5 iterations → accept if score >= 65, log warning below 65
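Taken together, the exit conditions reduce to a small decision function. A sketch, assuming scores are recorded oldest-first; the name `loop_decision` and the return labels are illustrative:

```python
def loop_decision(scores: list[int], target: int = 80,
                  floor: int = 65, max_iters: int = 5) -> str:
    """Decide the metric loop's next move from its score history."""
    score = scores[-1]
    if score >= target:
        return "proceed"
    # Stall: two consecutive iterations with no improvement (delta <= 0)
    deltas = [b - a for a, b in zip(scores, scores[1:])]
    stalled = len(deltas) >= 2 and deltas[-1] <= 0 and deltas[-2] <= 0
    if stalled or len(scores) >= max_iters:
        return "accept" if score >= floor else "accept-with-warning"
    return "iterate"

print(loop_decision([62, 71, 84]))  # proceed
```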
+
+ ---
+
+ ## Step 3.5 — Autonomous Quality Gate
+
+ Log to `docs/plans/build-log.md`:
+ - Final proof screen screenshot paths
+ - Score history table from the metric loop
+ - Key design decisions and their research rationale
+ - Anti-AI-template dimension score
+
+ No user pause. Proceed to Phase 4 (Foundation).
+
+ ---
+
+ ## Rules
+
+ <HARD-GATE>
+ DESIGN RESEARCH IS NOT OPTIONAL. Step 3.1 agents MUST use Playwright to capture real screenshots of real competitor and inspiration sites. Text-only descriptions of "what their site looks like" are INSUFFICIENT — downstream agents need visual references to make informed decisions and the Visual QA measurement agent needs them for comparison.
+
+ If Playwright is unavailable: log as blocker, use web search to find and describe competitors in maximum visual detail, proceed with degraded quality. But TRY Playwright first.
+ </HARD-GATE>
+
+ - The UI Designer agent makes ALL visual decisions autonomously. No "pick A or B" presentations. The research provides the evidence; the agent makes the call.
+ - The Visual Design Spec MUST include research rationale for every major decision. Unjustified defaults are a design failure.
+ - The anti-AI-template checklist is a SCORING DIMENSION (Originality), not a hard blocker. The goal is awareness and intentional differentiation, not rigid prohibition of any single element.
+ - Proof screens are REAL implementations with real CSS/components, not mockups or wireframes. They must work responsively.
+ - The Visual QA loop is the primary quality control — no human reviews the design. The 80/100 threshold IS the taste arbiter. Treat it seriously.
+ - Screenshot data stays in measurement agents' context (separate subprocess). Do NOT load screenshots into the orchestrator's context — receive only the SCORE and TOP ISSUE as text.
@@ -1,6 +1,6 @@
 # Eval Harness Protocol
 
-You are the orchestrator. Phase 5.1 audits are complete. Before running the metric loop, define formal eval cases that are concrete, executable, and reproducible. This replaces subjective narrative audits with deterministic pass/fail tests.
+You are the orchestrator. Phase 6.1 audits are complete. Before running the metric loop, define formal eval cases that are concrete, executable, and reproducible. This replaces subjective narrative audits with deterministic pass/fail tests.
 
 ## How This Differs from the Metric Loop
 
@@ -12,7 +12,7 @@ They are complementary: eval harness failures become specific issues for the met
 ## Step 0: Define Eval Cases
 
 YOU (the orchestrator) define eval cases based on:
-- Audit findings from Phase 5.1 (highest-severity items first)
+- Audit findings from Phase 6.1 (highest-severity items first)
 - Architecture doc (API contracts, auth model, data validation rules)
 - Design doc (core user flows, edge cases)
 
@@ -46,11 +46,11 @@ Count PASS cases / total cases. This is the eval baseline. Record to `docs/plans
 
 ## Step 3: Feed into Metric Loop
 
-Any FAIL case with severity CRITICAL or HIGH becomes a candidate issue for the Phase 5.2 metric loop. Pass the failure details (case name, action, expected vs actual) as context when defining the metric loop's metric.
+Any FAIL case with severity CRITICAL or HIGH becomes a candidate issue for the Phase 6.2 metric loop. Pass the failure details (case name, action, expected vs actual) as context when defining the metric loop's metric.
 
 ## Step 4: Re-evaluate After Metric Loop
 
-After the Phase 5.2 metric loop exits, re-run the eval harness. All CRITICAL cases must now pass. If any CRITICAL case still fails, flag it for the Reality Checker in Step 5.3.
+After the Phase 6.2 metric loop exits, re-run the eval harness. All CRITICAL cases must now pass. If any CRITICAL case still fails, flag it for the Reality Checker in Step 6.3.
 
 ---
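The baseline arithmetic referenced in the hunk header above ("Count PASS cases / total cases") and the Step 3 filter are both one-liners. A sketch with hypothetical case records; the field names are illustrative, not the protocol's actual schema:

```python
cases = [
    {"name": "auth-rejects-expired-token", "severity": "CRITICAL", "result": "PASS"},
    {"name": "form-validates-email", "severity": "HIGH", "result": "FAIL"},
    {"name": "empty-state-renders", "severity": "LOW", "result": "PASS"},
]

# Eval baseline: PASS cases / total cases
baseline = sum(c["result"] == "PASS" for c in cases) / len(cases)

# Step 3: CRITICAL/HIGH failures become metric-loop candidate issues
candidates = [c["name"] for c in cases
              if c["result"] == "FAIL" and c["severity"] in ("CRITICAL", "HIGH")]

print(f"baseline {baseline:.0%}, candidates {candidates}")
```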
 
@@ -30,9 +30,9 @@ Then create a score log table:
 
 When starting a new metric loop, REPLACE the previous Active Metric Loop section (if any). There is only ever ONE active metric loop. Previous loop results should already be recorded in their phase's section above. When the loop completes (Step 2 exit), rename the section header from `## Active Metric Loop` to `## Completed Metric Loop — [Phase N]` and leave it for historical reference.
 
-If you are in Phase 4, also record the current sub-step for the overall task cycle (not all of these are within the metric loop itself):
+If you are in Phase 5, also record the current sub-step for the overall task cycle (not all of these are within the metric loop itself):
 ```
-Sub-step: [4.1 Implement | 4.1b Cleanup | 4.2 Metric Loop | 4.3 Loop Exit | 4.4 Verify]
+Sub-step: [5.1 Implement | 5.1b Cleanup | 5.2 Metric Loop | 5.3 Loop Exit | 5.4 Verify]
 ```
 This tells the orchestrator exactly where to resume after context compaction.
 
@@ -9,10 +9,21 @@ if [ -f "docs/plans/.build-state.md" ]; then
 fi
 
 # Skip if the build is already complete
-if echo "$BUILD_STATE" | grep -q "Phase: 6 COMPLETE"; then
+if echo "$BUILD_STATE" | grep -q "Phase: 7 COMPLETE"; then
   BUILD_STATE=""
 fi
 
+# Check if we're past Phase 3 but missing design artifacts
+if [ -n "$BUILD_STATE" ]; then
+  CURRENT_PHASE=$(echo "$BUILD_STATE" | grep -oP 'Phase: \K[0-9]+' | head -1)
+  if [ "$CURRENT_PHASE" -ge 4 ] 2>/dev/null && [ ! -f "docs/plans/visual-design-spec.md" ]; then
+    DESIGN_WARNING="
+DESIGN GATE VIOLATION: Current phase is ${CURRENT_PHASE} but docs/plans/visual-design-spec.md does not exist.
+Phase 3 (Design & Visual Identity) may have been skipped. DO NOT proceed with Foundation or Build.
+Return to Phase 3 and produce visual-design-spec.md before continuing."
+  fi
+fi
+
 # If no active build, just provide a minimal reminder
 if [ -z "$BUILD_STATE" ]; then
   CONTEXT="buildanything plugin is installed. Use /buildanything:build to start a full product pipeline, or /buildanything:idea-sweep for parallel research."
@@ -69,17 +80,19 @@ ORCHESTRATOR
 ${BUILD_STATE}
 ${METRIC_LOOP}
 ${RESUME_POINT}
+${DESIGN_WARNING}
 
 NEXT ACTIONS:
 1. Re-read commands/build.md to reload the full orchestrator process
 2. Re-read commands/protocols/metric-loop.md if you are mid-loop
-3. Re-read docs/plans/sprint-tasks.md for task list and acceptance criteria
-4. Re-read docs/plans/architecture.md for architecture context
-5. Re-read CLAUDE.md for build decisions
-6. Re-read docs/plans/learnings.md if it exists (patterns and pitfalls from previous builds)
-7. Rebuild TodoWrite from docs/plans/.build-state.md (TodoWrite does NOT survive compaction)
-8. Resume from the phase and step indicated in your state above
-9. Dispatch work to specialist agents do not implement directly"
+3. Re-read commands/protocols/design.md if you are in Phase 3 (Design & Visual Identity)
+4. Re-read docs/plans/sprint-tasks.md for task list and acceptance criteria
+5. Re-read docs/plans/architecture.md for architecture context
+6. Re-read CLAUDE.md for build decisions
+7. Re-read docs/plans/learnings.md if it exists (patterns and pitfalls from previous builds)
+8. Rebuild TodoWrite from docs/plans/.build-state.md (TodoWrite does NOT survive compaction)
+9. Resume from the phase and step indicated in your state above
+10. Dispatch work to specialist agents — do not implement directly"
 fi
 
 # Output as additional_context for Claude Code
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "buildanything",
-  "version": "1.5.0",
+  "version": "1.6.0",
   "description": "One command to build an entire product. 73 specialist agents orchestrated into a full engineering pipeline for Claude Code.",
   "bin": {
     "buildanything": "./bin/setup.js"