@aslomon/effectum 0.1.0

Files changed (39)
  1. package/README.md +633 -0
  2. package/bin/install.js +652 -0
  3. package/package.json +29 -0
  4. package/system/README.md +118 -0
  5. package/system/commands/build-fix.md +89 -0
  6. package/system/commands/cancel-ralph.md +90 -0
  7. package/system/commands/checkpoint.md +63 -0
  8. package/system/commands/code-review.md +120 -0
  9. package/system/commands/e2e.md +92 -0
  10. package/system/commands/plan.md +111 -0
  11. package/system/commands/ralph-loop.md +163 -0
  12. package/system/commands/refactor-clean.md +104 -0
  13. package/system/commands/tdd.md +84 -0
  14. package/system/commands/verify.md +71 -0
  15. package/system/stacks/generic.md +96 -0
  16. package/system/stacks/nextjs-supabase.md +114 -0
  17. package/system/stacks/python-fastapi.md +140 -0
  18. package/system/stacks/swift-ios.md +136 -0
  19. package/system/templates/AUTONOMOUS-WORKFLOW.md +1368 -0
  20. package/system/templates/CLAUDE.md.tmpl +141 -0
  21. package/system/templates/guardrails.md.tmpl +39 -0
  22. package/system/templates/settings.json.tmpl +201 -0
  23. package/workshop/knowledge/01-prd-template.md +275 -0
  24. package/workshop/knowledge/02-questioning-framework.md +209 -0
  25. package/workshop/knowledge/03-decomposition-guide.md +234 -0
  26. package/workshop/knowledge/04-examples.md +435 -0
  27. package/workshop/knowledge/05-quality-checklist.md +166 -0
  28. package/workshop/knowledge/06-network-map-guide.md +413 -0
  29. package/workshop/knowledge/07-prompt-templates.md +315 -0
  30. package/workshop/knowledge/08-workflow-modes.md +198 -0
  31. package/workshop/projects/_example-project/PROJECT.md +33 -0
  32. package/workshop/projects/_example-project/notes/decisions.md +15 -0
  33. package/workshop/projects/_example-project/notes/discovery-log.md +9 -0
  34. package/workshop/templates/PROJECT.md +25 -0
  35. package/workshop/templates/network-map.mmd +13 -0
  36. package/workshop/templates/prd.md +133 -0
  37. package/workshop/templates/requirements-map.md +48 -0
  38. package/workshop/templates/shared-contracts.md +89 -0
  39. package/workshop/templates/vision.md +66 -0
@@ -0,0 +1,1368 @@
1
+ # Autonomous Feature Implementation Guide
2
+
3
+ > Standard workflow for fully autonomous feature implementation with Claude Code -- from PRD to production.
4
+
5
+ ## Overview
6
+
7
+ This document defines how to hand off a feature to Claude Code for fully autonomous implementation. It covers the PRD format, prompt templates, command chains, quality gates, and lessons learned from real sessions. The goal: you provide a PRD and a prompt, and Claude Code delivers production-ready code, including tests, migrations, and security checks.
8
+
9
+ <important>
10
+ All PRDs, prompts, code, comments, commits, and documentation must be written in English. This ensures consistency across the codebase, better LLM comprehension, and avoids encoding or locale-related issues in tooling and CI/CD pipelines.
11
+ </important>
12
+
13
+ ---
14
+
15
+ ## 1. PRD Template
16
+
17
+ Every PRD must contain these sections. The more complete the PRD, the fewer interaction points and the more autonomous the implementation.
18
+
19
+ ```markdown
20
+ # PRD: [Feature Name]
21
+
22
+ ## Problem
23
+
24
+ What is the problem? Why does it need to be solved? Include business context.
25
+
26
+ ## Goal
27
+
28
+ What should work when this is done? (1-3 sentences, measurable)
29
+
30
+ ## User Stories
31
+
32
+ - As a [role], I want to [action], so that [benefit]
33
+ - As a [role], I want to [action], so that [benefit]
34
+
35
+ ## Acceptance Criteria
36
+
37
+ - [ ] AC1: [Concrete, testable criterion]
38
+ - [ ] AC2: [Concrete, testable criterion]
39
+ - [ ] AC3: [Concrete, testable criterion]
40
+
41
+ ## Scope
42
+
43
+ ### In Scope
44
+
45
+ - Feature X, Screen Y, API Endpoint Z
46
+
47
+ ### Out of Scope
48
+
49
+ - What should NOT be built (explicit boundaries)
50
+
51
+ ## Data Model (optional but recommended)
52
+
53
+ - Tables, fields, relations, RLS requirements
54
+ - Include SQL migration sketches if possible
55
+
56
+ ## UI/UX (optional)
57
+
58
+ - Wireframes, screenshots, Figma links, or text description
59
+ - Responsive requirements, breakpoints
60
+
61
+ ## API Design (optional)
62
+
63
+ - Endpoints, HTTP methods, request/response shapes
64
+ - Authentication and authorization requirements
65
+
66
+ ## Constraints
67
+
68
+ - Performance requirements (latency, throughput)
69
+ - Dependencies on existing features or services
70
+ - Technical or timeline constraints
71
+ ```
72
+
73
+ ### Why Each Section Matters
74
+
75
+ | Section | If missing, Claude Code will... |
76
+ | -------------------- | ------------------------------------------------------ |
77
+ | Problem | ...assume incorrect context and build the wrong thing |
78
+ | Acceptance Criteria | ...not know when "done" means done |
79
+ | Scope / Out of Scope | ...build too much or too little |
80
+ | Data Model | ...ask for clarification or design its own schema |
81
+ | Constraints | ...use default assumptions that may not fit your needs |
82
+
83
+ ### PRD Best Practices
84
+
85
+ - Write acceptance criteria as **testable assertions** -- each one should map to at least one automated test
86
+ - Define the data model upfront -- this removes schema design as a decision point, and with it the interaction stops that schema questions would otherwise cause
87
+ - List out-of-scope items explicitly -- this prevents scope creep more effectively than vague boundaries
88
+ - Include example request/response shapes for APIs -- this removes ambiguity about data contracts
89
+
90
+ ### Agent-Ready PRD Extension (for Ralph Loop / Full Autonomy)
91
+
92
+ When using Ralph Loop or full-auto mode, add these three sections to your PRD. They bridge the gap between "what to build" and "how to verify it autonomously."
93
+
94
+ ```markdown
95
+ ### Quality Gates
96
+
97
+ Automated checks that MUST pass before the feature is considered done:
98
+
99
+ - Build: `pnpm build` -- 0 errors
100
+ - Types: `tsc --noEmit` -- 0 errors
101
+ - Tests: `pnpm vitest run` -- all pass, 80%+ coverage
102
+ - Lint: `pnpm lint` -- 0 errors
103
+ - E2E: `npx playwright test` -- all pass (if applicable)
104
+ - Custom: [project-specific checks]
105
+
106
+ ### Autonomy Rules
107
+
108
+ Where Claude can make its own decisions vs. where it should follow strict guidelines:
109
+
110
+ - Design decisions: [yes/no + guidelines, e.g. "Follow DESIGN.md"]
111
+ - Library choices: [predefined or free]
112
+ - Architecture: [predefined or "follow existing patterns"]
113
+ - On ambiguity: [decide autonomously / stop and document]
114
+
115
+ ### Completion Promise
116
+
117
+ Exact phrase for Ralph Loop (must be 100% true before outputting):
118
+
119
+ "All acceptance criteria met, build passes, tests pass, 0 lint errors"
120
+ ```
121
+
122
+ **Why these fields matter for autonomy:**
123
+
124
+ | Field | Without it, Claude will... |
125
+ | ------------------ | ------------------------------------------------------------------------- |
126
+ | Quality Gates | ...use default checks that may miss project-specific requirements |
127
+ | Autonomy Rules | ...stop and ask about decisions it could safely make on its own |
128
+ | Completion Promise | ...not know when to stop iterating (Ralph Loop runs until max-iterations) |
129
+
130
+ ---
131
+
132
+ ## 1.5 PRD -> Ralph Loop Conversion
133
+
134
+ ### Quick Conversion
135
+
136
+ 1. Write your PRD using the Agent-Ready template (Section 1 + extension)
137
+ 2. Copy it into this Ralph Loop prompt template:
138
+
139
+ ```
140
+ /ralph-loop Implement the following feature fully autonomously from database to frontend.
141
+
142
+ <workflow>
143
+ Each iteration:
144
+ 1. Read the PRD and current project state
145
+ 2. Check what was already done (git diff, existing files)
146
+ 3. Implement the next logical step
147
+ 4. Run quality gates after every significant change
148
+ 5. When ALL acceptance criteria AND quality gates pass: output the completion promise
149
+ </workflow>
150
+
151
+ <quality_gates>
152
+ [INSERT Quality Gates from your PRD]
153
+ </quality_gates>
154
+
155
+ <autonomy_rules>
156
+ [INSERT Autonomy Rules from your PRD]
157
+ </autonomy_rules>
158
+
159
+ <prd>
160
+ [INSERT your complete PRD]
161
+ </prd>
162
+
163
+ --max-iterations [N] --completion-promise '[Completion Promise from your PRD]'
164
+ ```
165
+
166
+ ### Max-Iterations Selection
167
+
168
+ | Feature Type | Max | Rationale |
169
+ | ---------------- | --- | ------------------------------- |
170
+ | Bugfix | 10 | Find + Fix + Test |
171
+ | Small feature | 20 | CRUD, single endpoint |
172
+ | Standard feature | 30 | DB + API + Frontend + Tests |
173
+ | Large feature | 50 | Multi-domain, E2E, complex UI |
174
+ | Refactoring | 15 | Scoped changes, little new code |
175
+
176
+ ### Iteration Planning (recommended for features > 20 iterations)
177
+
178
+ Add an `<iteration_plan>` to give Claude a roadmap for sequencing work:
179
+
180
+ ```xml
181
+ <iteration_plan>
182
+ Iterations 1-3: Setup -- DB migration, types, Zod schemas
183
+ Iterations 4-10: Backend -- API routes with TDD
184
+ Iterations 11-18: Frontend -- Components with TDD
185
+ Iterations 19-25: E2E Tests + Edge Cases
186
+ Iterations 26-30: Polish -- Code review, cleanup, final verification
187
+ </iteration_plan>
188
+ ```
189
+
190
+ Adjust the ranges based on your max-iterations and feature complexity.
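One way to adjust the ranges is to treat the 30-iteration example as a set of proportions and rescale it. The sketch below is a hypothetical helper, not part of the package: `scalePlan`, `Phase`, and `EXAMPLE_PLAN` are illustrative names, and the weights are simply the phase lengths from the example plan above.

```typescript
// Hypothetical helper: rescales the 30-iteration example plan to another budget.
interface Phase {
  name: string;
  weight: number; // iterations this phase receives in the 30-iteration example
}

interface PlannedPhase {
  name: string;
  from: number; // first iteration of the phase (1-based, inclusive)
  to: number; // last iteration of the phase (inclusive)
}

const EXAMPLE_PLAN: Phase[] = [
  { name: "Setup", weight: 3 },
  { name: "Backend", weight: 7 },
  { name: "Frontend", weight: 8 },
  { name: "E2E Tests + Edge Cases", weight: 7 },
  { name: "Polish", weight: 5 },
];

function scalePlan(phases: Phase[], maxIterations: number): PlannedPhase[] {
  if (maxIterations < phases.length) {
    throw new Error("maxIterations must be at least the number of phases");
  }
  const totalWeight = phases.reduce((sum, phase) => sum + phase.weight, 0);
  const planned: PlannedPhase[] = [];
  let cumulativeWeight = 0;
  let previousEnd = 0;
  phases.forEach((phase, index) => {
    cumulativeWeight += phase.weight;
    // Scale the cumulative boundary, then clamp it so every phase keeps at
    // least one iteration and the last phase ends exactly at maxIterations.
    const scaledEnd = Math.round((cumulativeWeight * maxIterations) / totalWeight);
    const remainingPhases = phases.length - index - 1;
    const end = Math.min(
      maxIterations - remainingPhases,
      Math.max(previousEnd + 1, scaledEnd),
    );
    planned.push({ name: phase.name, from: previousEnd + 1, to: end });
    previousEnd = end;
  });
  return planned;
}
```

Calling `scalePlan(EXAMPLE_PLAN, 30)` reproduces the ranges of the example plan; a different budget keeps the same proportions. Treat the result as a starting point and shift iterations toward whichever phase carries the most risk for your feature.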
191
+
192
+ ---
193
+
194
+ ## 2. Prompt Templates
195
+
196
+ All prompts follow Anthropic's prompt engineering best practices: clear structure, explicit instructions, context for the "why", and positive directives (say what to do, not what to avoid).
197
+
198
+ ### Standard Prompt (Recommended)
199
+
200
+ ```
201
+ Implement the following feature autonomously from database to frontend.
202
+
203
+ <workflow>
204
+ 1. /plan -- Create an implementation plan and wait for my approval
205
+ 2. After approval: Implement using /tdd (tests first, then code)
206
+ 3. /verify after each phase
207
+ 4. /e2e for critical user journeys
208
+ 5. /code-review at the end
209
+ 6. Do NOT commit -- show me the final git diff
210
+ </workflow>
211
+
212
+ <prd>
213
+ [Insert PRD here]
214
+ </prd>
215
+
216
+ <context>
217
+ - Project: [project name/path]
218
+ - Supabase Project ID: [ID] (if applicable)
219
+ - Follow existing patterns in: [relevant files/directories]
220
+ </context>
221
+ ```
222
+
223
+ ### Express Prompt (Small Features)
224
+
225
+ ```
226
+ Implement: [1-2 sentence feature description]
227
+
228
+ <acceptance_criteria>
229
+ - [ ] [AC1]
230
+ - [ ] [AC2]
231
+ </acceptance_criteria>
232
+
233
+ Workflow: /tdd -> /verify -> /code-review. Do not commit.
234
+ ```
235
+
236
+ ### Full-Auto Prompt (Maximum Autonomy)
237
+
238
+ ```
239
+ Implement the following feature fully autonomously.
240
+ Make your own decisions on architecture and UI details.
241
+ Only ask when encountering breaking changes or genuinely ambiguous scope.
242
+
243
+ Workflow: /plan -> /tdd -> /e2e -> /verify -> /code-review
244
+ Commit when all quality gates pass.
245
+
246
+ <prd>
247
+ [Insert PRD here]
248
+ </prd>
249
+ ```
250
+
251
+ ### Prompt Writing Guidelines
252
+
253
+ These guidelines are derived from Anthropic's official prompt engineering documentation:
254
+
255
+ 1. **Be specific and direct** -- Treat Claude like a brilliant new team member who lacks context about your specific project norms and conventions
256
+ 2. **Provide context for the "why"** -- Explain the reasoning behind constraints so Claude can generalize correctly to edge cases
257
+ 3. **Use XML tags for structure** -- Wrap distinct sections (`<prd>`, `<workflow>`, `<context>`, `<acceptance_criteria>`) so Claude can parse complex prompts unambiguously
258
+ 4. **Say what to do, not what to avoid** -- "Write concise functions under 40 lines" is more effective than "Don't write long functions"
259
+ 5. **Reference existing patterns** -- "Follow the pattern in `src/lib/billing/` for the service layer" gives Claude a concrete example to generalize from
260
+ 6. **Specify the autonomy level explicitly** -- Claude calibrates its interaction behavior based on your stated preference
261
+
262
+ ### Workflow Mode Decision Matrix
263
+
264
+ Use this table to decide which workflow mode fits your situation:
265
+
266
+ | Criterion | Normal Session | Full-Auto Prompt | Ralph Loop |
267
+ | ------------------------- | -------------- | ---------------- | ----------------- |
268
+ | Subjective UI decisions | Yes | No | No |
269
+ | Clear acceptance criteria | Optional | Recommended | **Required** |
270
+ | PRD available | Optional | Recommended | **Required** |
271
+ | Quality gates defined | Optional | Recommended | **Required** |
272
+ | Autonomy rules defined | N/A | Optional | **Recommended** |
273
+ | You are present | Yes | Yes | No (unattended) |
274
+ | Iteration needed | Manual | Manual | **Automatic** |
275
+ | Completion promise | N/A | N/A | **Required** |
276
+ | Typical duration | 30min-2h | 1-3h | 1-8h (unattended) |
277
+
278
+ **Quick recommendation:**
279
+
280
+ - **No PRD?** -> Normal Session
281
+ - **PRD without quality gates?** -> Full-Auto Prompt
282
+ - **PRD with quality gates + completion promise?** -> Ralph Loop
283
+ - **PRD + Agent Teams flag?** -> /orchestrate (with Teams)
284
+
285
+ ---
286
+
287
+ ## 3. Autonomous Workflow Phases
288
+
289
+ ### Phase 1: Analysis & Planning (`/plan`)
290
+
291
+ ```
292
+ Read PRD -> Explore codebase -> Identify risks -> Write plan -> STOP and wait for approval
293
+ ```
294
+
295
+ - Reads the PRD and builds understanding of the problem context
296
+ - Explores the existing codebase (Glob, Grep, Read) to identify reusable patterns, utilities, and components
297
+ - Creates a step-by-step implementation plan with phases and dependencies
298
+ - Surfaces risks, open questions, and architectural decisions
299
+ - **STOPS and waits for explicit approval before writing any code**
300
+
301
+ ### Phase 2: Database & Types
302
+
303
+ ```
304
+ Write migration -> RLS policies -> Generate types -> Derive Zod schemas
305
+ ```
306
+
307
+ - Supabase migrations via `apply_migration` (never raw DDL)
308
+ - RLS policies for every new table (multi-tenant with org_id)
309
+ - Security advisor check after DDL changes
310
+ - TypeScript types generated via `generate_typescript_types`
311
+ - Zod schemas derived for API validation
312
+
313
+ ### Phase 3: Backend / API (`/tdd`)
314
+
315
+ ```
316
+ Write tests (RED) -> Implement API (GREEN) -> Refactor
317
+ ```
318
+
319
+ - Route handlers in `src/app/api/`
320
+ - Server-side services in `src/lib/[domain]/`
321
+ - Zod validation for all inputs
322
+ - Result pattern `{ data, error }` for all operations that can fail
323
+ - Tests with Vitest + Testing Library
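The validation and Result-pattern bullets above can be sketched as follows. This is a minimal illustration under stated assumptions: the project itself uses Zod, but the sketch substitutes a hand-written parser so it has no dependencies, and `Result`, `parseCreateProjectInput`, and `createProject` are hypothetical names.

```typescript
// Result pattern: a fallible operation returns { data, error } instead of throwing.
type Result<T> =
  | { data: T; error: null }
  | { data: null; error: string };

interface CreateProjectInput {
  name: string;
}

// Type guard used instead of an `as` cast.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Hand-written stand-in for a Zod schema (assumption: the real code would
// derive this check from a schema rather than write it by hand).
function parseCreateProjectInput(raw: unknown): Result<CreateProjectInput> {
  if (!isRecord(raw)) {
    return { data: null, error: "body must be an object" };
  }
  const name = raw.name;
  if (typeof name !== "string" || name.trim().length === 0) {
    return { data: null, error: "name must be a non-empty string" };
  }
  return { data: { name: name.trim() }, error: null };
}

// Hypothetical service function: validates first, then returns a Result.
function createProject(raw: unknown): Result<{ id: string; name: string }> {
  const parsed = parseCreateProjectInput(raw);
  if (parsed.error !== null) {
    return { data: null, error: parsed.error };
  }
  // A real service would persist the row here; the id is a placeholder.
  return { data: { id: "project-1", name: parsed.data.name }, error: null };
}
```

Because `error` is `null` in exactly one branch of the union, a caller that checks `result.error !== null` gets `result.data` narrowed to the success type without any cast.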
324
+
325
+ ### Phase 4: Frontend (`/tdd`)
326
+
327
+ ```
328
+ Write component tests (RED) -> Implement components (GREEN) -> Refactor
329
+ ```
330
+
331
+ - Server Components by default
332
+ - Client Components only when interactivity, hooks, or browser APIs are needed
333
+ - Shadcn UI + Tailwind CSS v4
334
+ - Feature components in `src/components/[feature]/`
335
+ - Hooks in `src/hooks/`
336
+
337
+ ### Phase 5: E2E Tests (`/e2e`)
338
+
339
+ ```
340
+ Write Playwright tests -> Browser automation -> Verify user journeys
341
+ ```
342
+
343
+ - Critical user journeys as E2E tests
344
+ - Accessibility-tree-based selectors (fast, reliable)
345
+ - Tests in `tests/e2e/`
346
+
347
+ <important>
348
+ Write a trivial smoke test immediately after scaffolding (Phase 2) to validate that the test environment works correctly -- routing is accessible, middleware fires, API responds. Do not wait until Phase 5 to discover infrastructure misconfigurations.
349
+ </important>
350
+
351
+ ### Phase 6: Verification (`/verify` + `/code-review`)
352
+
353
+ ```
354
+ Build -> Types -> Lint -> Tests -> Security -> Review
355
+ ```
356
+
357
+ - TypeScript compilation (strict mode)
358
+ - ESLint/Biome check
359
+ - All tests (unit + integration + e2e)
360
+ - No console.log in production code
361
+ - Security review (OWASP Top 10, RLS check)
362
+ - Code quality review
363
+
364
+ ---
365
+
366
+ ## 4. Command Chains by Feature Type
367
+
368
+ ### New Feature (Standard)
369
+
370
+ ```
371
+ /plan -> [approval] -> /tdd -> /verify -> /e2e -> /verify -> /code-review
372
+ ```
373
+
374
+ ### Small Feature / Bugfix
375
+
376
+ ```
377
+ /tdd -> /verify -> /code-review
378
+ ```
379
+
380
+ ### Large Feature (Multi-Domain)
381
+
382
+ ```
383
+ /plan -> [approval] -> /tdd (DB+API) -> /verify -> /tdd (Frontend) -> /verify -> /e2e -> /code-review
384
+ ```
385
+
386
+ ### Refactoring
387
+
388
+ ```
389
+ /plan -> [approval] -> /refactor-clean -> /verify -> /code-review
390
+ ```
391
+
392
+ ### Performance Optimization
393
+
394
+ ```
395
+ /plan -> [approval] -> [profiling] -> /tdd -> /verify -> /e2e
396
+ ```
397
+
398
+ ---
399
+
400
+ ## 5. Interaction Points
401
+
402
+ These are the moments where Claude Code pauses and waits for your input.
403
+
404
+ | Point | Why | What you do |
405
+ | --------------------- | ---------------------------- | ----------------------------------- |
406
+ | After `/plan` | Plan approval required | "OK", "Change X", or "Start over" |
407
+ | Ambiguous scope | PRD has gaps | Answer the question |
408
+ | Breaking change | Existing API/DB modification | Confirm or choose alternative |
409
+ | Architecture decision | Multiple valid approaches | Pick one |
410
+ | Test failure | Unexpected error | Provide context or allow workaround |
411
+
412
+ ### How to Minimize Interaction Stops
413
+
414
+ - Write complete acceptance criteria with testable assertions
415
+ - Define explicit scope and out-of-scope boundaries
416
+ - Pre-define the data model (tables, fields, relations)
417
+ - State autonomy preference: "Make your own decisions, only ask about breaking changes"
418
+ - Include the Supabase Project ID so Claude can execute migrations and generate types directly
419
+
420
+ ---
421
+
422
+ ## 6. Quality Gates
423
+
424
+ Before Claude Code reports "done", it verifies all of these automatically:
425
+
426
+ | Gate | Tool | Criterion |
427
+ | ------------- | ---------------- | ------------------------------------- |
428
+ | Compilation | `tsc --noEmit` | 0 errors |
429
+ | Linting | ESLint/Biome | 0 errors, 0 warnings |
430
+ | Unit Tests | Vitest | 80%+ coverage, all passing |
431
+ | E2E Tests | Playwright | All journeys passing |
432
+ | Security | `/code-review` | No OWASP vulnerabilities |
433
+ | RLS Check | Supabase Advisor | All tables have RLS policies |
434
+ | No Debug Logs | `/verify` | 0 console.log hits in production code |
435
+ | Type Safety | Zod + TS Strict | No `any`, no `as` casts |
436
+ | File Size | Check | No file exceeds 300 lines |
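A gate sequence like the one above can be expressed as a small runner. The sketch below is illustrative only: `Gate`, `runGates`, and `allGatesPassed` are hypothetical names, the commands are the ones listed in the Agent-Ready PRD extension, and the command executor is injected so the logic can be exercised without a real project.

```typescript
interface Gate {
  name: string;
  command: string; // shell command, e.g. "pnpm build"
}

interface GateResult {
  name: string;
  passed: boolean;
}

// Commands taken from the Quality Gates section of the PRD extension.
const GATES: Gate[] = [
  { name: "Build", command: "pnpm build" },
  { name: "Types", command: "tsc --noEmit" },
  { name: "Tests", command: "pnpm vitest run" },
  { name: "Lint", command: "pnpm lint" },
];

// `run` executes one command and returns its exit code (0 means success).
// Gates run in order and the runner stops at the first failure.
function runGates(gates: Gate[], run: (command: string) => number): GateResult[] {
  const results: GateResult[] = [];
  for (const gate of gates) {
    const passed = run(gate.command) === 0;
    results.push({ name: gate.name, passed });
    if (!passed) {
      break;
    }
  }
  return results;
}

// True only when every gate was reached and passed.
function allGatesPassed(gates: Gate[], results: GateResult[]): boolean {
  return results.length === gates.length && results.every((result) => result.passed);
}
```

In a real project, `run` could wrap `spawnSync(command, { shell: true, stdio: "inherit" })` from `node:child_process` and return `status ?? 1`. Coverage thresholds and the file-size limit are not covered by exit codes alone and would need their own checks.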
437
+
438
+ ---
439
+
440
+ ## 7. Example PRDs
441
+
442
+ ### Example A: Simple Feature
443
+
444
+ ```markdown
445
+ # PRD: User Profile Avatar Upload
446
+
447
+ ## Problem
448
+
449
+ Users cannot upload a profile picture. Their profiles show a generic placeholder,
450
+ making it harder for team members to identify each other.
451
+
452
+ ## Goal
453
+
454
+ Users can upload a profile picture that is displayed across the application.
455
+
456
+ ## User Stories
457
+
458
+ - As a user, I want to upload a profile picture so that others can recognize me
459
+
460
+ ## Acceptance Criteria
461
+
462
+ - [ ] Upload accepts JPG/PNG/WebP files up to 5MB
463
+ - [ ] Image is stored in Supabase Storage
464
+ - [ ] Avatar URL is persisted in the user profile
465
+ - [ ] Preview is shown before upload confirmation
466
+ - [ ] Fallback avatar is displayed when no image is uploaded
467
+
468
+ ## Scope
469
+
470
+ ### In Scope
471
+
472
+ - Upload component, storage integration, profile display
473
+
474
+ ### Out of Scope
475
+
476
+ - Image cropping, filters, social media import
477
+ ```
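The first acceptance criterion maps directly onto a unit-testable check. A minimal sketch, assuming "5MB" means 5 × 1024 × 1024 bytes and that file types are identified by MIME type; `validateAvatar` and both constants are hypothetical names.

```typescript
// Assumption: "5MB" is interpreted as 5 MiB.
const MAX_AVATAR_BYTES = 5 * 1024 * 1024;
const ALLOWED_AVATAR_TYPES = ["image/jpeg", "image/png", "image/webp"];

interface AvatarCandidate {
  type: string; // MIME type as reported by the browser
  size: number; // size in bytes
}

// Returns null when the file is acceptable, otherwise a user-facing message.
function validateAvatar(file: AvatarCandidate): string | null {
  if (!ALLOWED_AVATAR_TYPES.includes(file.type)) {
    return "Only JPG, PNG, and WebP images are supported";
  }
  if (file.size > MAX_AVATAR_BYTES) {
    return "Image must be 5MB or smaller";
  }
  return null;
}
```

A browser `File` object already has `type` and `size` properties, so it can be passed to this function as is. The same check should be repeated on the server, since client-side validation can be bypassed.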
478
+
479
+ **Prompt:**
480
+
481
+ ```
482
+ Implement the avatar upload feature.
483
+ Workflow: /plan -> /tdd -> /e2e -> /verify -> /code-review. Do not commit.
484
+
485
+ <prd>
486
+ [Insert PRD above]
487
+ </prd>
488
+ ```
489
+
490
+ ### Example B: Complex Feature
491
+
492
+ ```markdown
493
+ # PRD: Team Invitation System
494
+
495
+ ## Problem
496
+
497
+ New team members can only be added manually via direct database insertion.
498
+ This is error-prone, unauditable, and requires developer involvement.
499
+
500
+ ## Goal
501
+
502
+ Admins can send email invitations. Invited users receive a magic link
503
+ and are automatically assigned to the team upon acceptance.
504
+
505
+ ## User Stories
506
+
507
+ - As an admin, I want to invite team members via email
508
+ - As an invitee, I want to join via a link without manual registration
509
+ - As an admin, I want to see and revoke pending invitations
510
+
511
+ ## Acceptance Criteria
512
+
513
+ - [ ] Admin can enter email addresses and assign a role
514
+ - [ ] Invitation generates a single-use token (valid for 24 hours)
515
+ - [ ] Email is sent via Supabase Edge Function
516
+ - [ ] Magic link leads to onboarding for new users
517
+ - [ ] Magic link assigns existing users directly to the team
518
+ - [ ] Admin sees a list of pending invitations with status
519
+ - [ ] Admin can revoke invitations
520
+ - [ ] Maximum 50 pending invitations per team
521
+ - [ ] Rate limit: 10 invitations per hour per admin
522
+
523
+ ## Data Model
524
+
525
+ - `invitations` table: id, org_id, email, role, token, status,
526
+ invited_by, expires_at, accepted_at, created_at
527
+ - RLS: Only admins of the same org can CRUD
528
+ - Index on token (unique) and org_id + status
529
+
530
+ ## API Design
531
+
532
+ - POST /api/invitations -- Create invitation
533
+ - GET /api/invitations -- List for current org
534
+ - DELETE /api/invitations/:id -- Revoke invitation
535
+ - POST /api/invitations/accept -- Redeem token
536
+
537
+ ## Constraints
538
+
539
+ - Use Supabase Auth for magic links
540
+ - Edge Function for email sending
541
+ - No external email providers (use Supabase built-in)
542
+ ```
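The single-use, 24-hour token criterion can be sketched as three small functions. This is an illustration, not the package's implementation: the `status` values, the 32-byte token length, and the function names are assumptions, since the PRD names the fields but not their contents.

```typescript
import { randomBytes } from "node:crypto";

const TOKEN_TTL_MS = 24 * 60 * 60 * 1000; // "valid for 24 hours"

interface Invitation {
  token: string;
  status: "pending" | "accepted" | "revoked"; // assumed status values
  expiresAt: Date;
}

function createInvitation(now: Date): Invitation {
  return {
    token: randomBytes(32).toString("hex"), // 64 hex characters
    status: "pending",
    expiresAt: new Date(now.getTime() + TOKEN_TTL_MS),
  };
}

// A token is redeemable only while it is pending and not yet expired.
function canRedeem(invitation: Invitation, now: Date): boolean {
  return invitation.status === "pending" && now.getTime() < invitation.expiresAt.getTime();
}

// Returns the accepted invitation, or null when the token cannot be used.
// Marking it "accepted" is what makes the token single-use.
function redeem(invitation: Invitation, now: Date): Invitation | null {
  if (!canRedeem(invitation, now)) {
    return null;
  }
  return { ...invitation, status: "accepted" };
}
```

Passing `now` in explicitly keeps the expiry logic deterministic in tests. In a database-backed version, the status change would need to happen in a single conditional update so that two concurrent requests cannot both redeem the same token.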
543
+
544
+ **Prompt:**
545
+
546
+ ```
547
+ Implement the team invitation system fully autonomously.
548
+ Make your own decisions on UI details.
549
+ Only ask about architectural ambiguities.
550
+
551
+ Workflow: /plan -> /tdd -> /e2e -> /verify -> /code-review
552
+ Commit when all quality gates pass.
553
+
554
+ <prd>
555
+ [Insert PRD above]
556
+ </prd>
557
+ ```
558
+
559
+ ### Example C: Bugfix
560
+
561
+ ```markdown
562
+ # PRD: Fix -- Dashboard Does Not Load for New Users
563
+
564
+ ## Problem
565
+
566
+ New users without projects see a spinner that never resolves.
567
+
568
+ ## Reproduction Steps
569
+
570
+ 1. Create a new account
571
+ 2. Navigate to the dashboard route
572
+ 3. Spinner spins indefinitely
573
+
574
+ ## Expected Behavior
575
+
576
+ Empty state with a "Create your first project" CTA button.
577
+
578
+ ## Acceptance Criteria
579
+
580
+ - [ ] New users without projects see an empty state
581
+ - [ ] Empty state has a "Create project" button
582
+ - [ ] Existing users with projects are unaffected
583
+ - [ ] Edge case: Users whose projects have all been deleted
584
+ ```
585
+
586
+ **Prompt:**
587
+
588
+ ```
589
+ Find and fix the dashboard bug for new users.
590
+ Workflow: /tdd -> /verify. Do not commit.
591
+
592
+ <prd>
593
+ [Insert PRD above]
594
+ </prd>
595
+ ```
596
+
597
+ ### Example D: Agent-Ready Frontend Feature (with Ralph Loop)
598
+
599
+ ```markdown
600
+ # PRD: Dark Mode Toggle
601
+
602
+ ## Problem
603
+
604
+ Users cannot switch between light and dark themes. The app only supports
605
+ light mode, which causes eye strain for users in low-light environments.
606
+
607
+ ## Goal
608
+
609
+ Users can toggle between light and dark mode. The preference persists across sessions.
610
+
611
+ ## Acceptance Criteria
612
+
613
+ - [ ] Toggle button in the header switches between light and dark mode
614
+ - [ ] Theme preference is saved to localStorage
615
+ - [ ] Theme is applied on page load (no flash of wrong theme)
616
+ - [ ] All components respect the active theme
617
+ - [ ] Respects system preference (prefers-color-scheme) as default
618
+
619
+ ## Scope
620
+
621
+ ### In Scope
622
+
623
+ - Theme toggle component, CSS variables, localStorage persistence
624
+
625
+ ### Out of Scope
626
+
627
+ - Custom theme editor, per-component theme overrides, server-side preference storage
628
+
629
+ ## Quality Gates
630
+
631
+ - Build: `pnpm build` -- 0 errors
632
+ - Types: `tsc --noEmit` -- 0 errors
633
+ - Tests: `pnpm vitest run` -- all pass, 80%+ coverage
634
+ - Lint: `pnpm lint` -- 0 errors
635
+
636
+ ## Autonomy Rules
637
+
638
+ - Design decisions: Follow DESIGN.md color tokens
639
+ - Library choices: Use existing Tailwind dark mode (class strategy)
640
+ - Architecture: Follow existing patterns
641
+ - On ambiguity: Decide autonomously
642
+
643
+ ## Completion Promise
644
+
645
+ "All acceptance criteria met, build passes, tests pass, 0 lint errors"
646
+ ```
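The persistence and system-preference criteria reduce to one pure function, which is easy to cover with unit tests. A minimal sketch with hypothetical names (`resolveTheme`, `toggleTheme`); it assumes the stored value is the plain string `"light"` or `"dark"`.

```typescript
type Theme = "light" | "dark";

// storedValue is whatever was read from storage (null when nothing is saved).
// A valid stored preference wins; otherwise the system preference is used.
function resolveTheme(storedValue: string | null, systemPrefersDark: boolean): Theme {
  if (storedValue === "light" || storedValue === "dark") {
    return storedValue;
  }
  return systemPrefersDark ? "dark" : "light";
}

function toggleTheme(current: Theme): Theme {
  return current === "dark" ? "light" : "dark";
}
```

In the browser, the two inputs would typically come from `localStorage.getItem("theme")` (the key name is an assumption) and `window.matchMedia("(prefers-color-scheme: dark)").matches`. To avoid a flash of the wrong theme, this resolution has to run before the first paint, for example in a small inline script in the document head.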
647
+
648
+ **Ralph Loop Prompt:**
649
+
650
+ ```
651
+ /ralph-loop Implement dark mode toggle per the PRD below.
652
+
653
+ <workflow>
654
+ Each iteration:
655
+ 1. Check current state (git diff, test results)
656
+ 2. Implement the next logical step
657
+ 3. Run quality gates after every significant change
658
+ 4. When ALL criteria pass: output <promise>DARK MODE COMPLETE</promise>
659
+ </workflow>
660
+
661
+ <quality_gates>
662
+ - pnpm build: 0 errors
663
+ - tsc --noEmit: 0 errors
664
+ - pnpm vitest run: all pass, 80%+ coverage
665
+ - pnpm lint: 0 errors
666
+ </quality_gates>
667
+
668
+ <autonomy_rules>
669
+ - Follow DESIGN.md color tokens
670
+ - Use Tailwind dark mode (class strategy)
671
+ - Follow existing patterns
672
+ - Decide autonomously on ambiguity
673
+ </autonomy_rules>
674
+
675
+ <prd>
676
+ [Insert PRD above]
677
+ </prd>
678
+
679
+ --max-iterations 20 --completion-promise 'DARK MODE COMPLETE'
680
+ ```
681
+
682
+ ### Example E: Agent-Ready Full-Stack Feature (with Ralph Loop)
683
+
684
+ ```markdown
685
+ # PRD: API Key Management
686
+
687
+ ## Problem
688
+
689
+ Users cannot generate or manage API keys for programmatic access. All integrations
690
+ require manual token exchange, which is insecure and unscalable.
691
+
692
+ ## Goal
693
+
694
+ Users can create, view, and revoke API keys from their settings page.
695
+ Each key has a name, scoped permissions, and expiration date.
696
+
697
+ ## Acceptance Criteria
698
+
699
+ - [ ] User can generate a new API key with a name and expiration
700
+ - [ ] Full key is shown once after creation (never again)
701
+ - [ ] Key list shows name, prefix (first 8 chars), created date, and expiration
702
+ - [ ] User can revoke any key (immediate invalidation)
703
+ - [ ] API validates keys via middleware (Authorization: Bearer sk\_...)
704
+ - [ ] Rate limit: 100 requests/minute per key
705
+ - [ ] Maximum 10 active keys per user
706
+
707
+ ## Scope
708
+
709
+ ### In Scope
710
+
711
+ - API key CRUD, key validation middleware, settings page UI
712
+
713
+ ### Out of Scope
714
+
715
+ - OAuth scopes, team-shared keys, usage analytics dashboard
716
+
717
+ ## Data Model
718
+
719
+ - `api_keys` table: id, user_id, org_id, name, key_hash, key_prefix,
720
+ permissions, expires_at, revoked_at, last_used_at, created_at
721
+ - RLS: Users can only manage their own keys within their org
722
+ - Index on key_hash (unique), user_id + org_id
723
+
724
+ ## API Design
725
+
726
+ - POST /api/keys -- Create key (returns full key once)
727
+ - GET /api/keys -- List keys for current user
728
+ - DELETE /api/keys/:id -- Revoke key
729
+
730
+ ## Quality Gates
731
+
732
+ - Build: `pnpm build` -- 0 errors
733
+ - Types: `tsc --noEmit` -- 0 errors
734
+ - Tests: `pnpm vitest run` -- all pass, 80%+ coverage
735
+ - Lint: `pnpm lint` -- 0 errors
736
+ - E2E: `npx playwright test` -- all pass
737
+ - RLS: Supabase security advisor -- no warnings
738
+
739
+ ## Autonomy Rules
740
+
741
+ - Design decisions: Follow DESIGN.md
742
+ - Library choices: Use existing project dependencies
743
+ - Architecture: Follow existing patterns in src/lib/ and src/app/api/
744
+ - On ambiguity: Decide autonomously, document decisions in code comments
745
+
746
+ ## Constraints
747
+
748
+ - Use crypto.randomBytes for key generation (not UUIDs)
749
+ - Store only bcrypt hash of key, never plaintext
750
+ - Key format: sk_live_[32 random chars]
751
+
752
+ ## Completion Promise
753
+
754
+ "All acceptance criteria met, migration applied, RLS active, all tests pass, build succeeds, 0 lint errors"
755
+ ```
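Key generation under these constraints might look like the sketch below. Two points are assumptions rather than something the PRD states. First, the PRD calls for a bcrypt hash, but bcrypt is not in the Node.js standard library, so this sketch substitutes SHA-256 to stay dependency-free; a deterministic digest can also be looked up through the unique `key_hash` index, which a salted bcrypt hash cannot. Second, the display prefix is taken as `sk_live_` plus the first 8 random characters, which is one possible reading of "first 8 chars". All function and type names are hypothetical.

```typescript
import { createHash, randomBytes } from "node:crypto";

const KEY_PREFIX = "sk_live_";

interface GeneratedKey {
  fullKey: string; // shown to the user exactly once, never stored
  keyPrefix: string; // safe to display in the key list
  keyHash: string; // the only form that is persisted
}

// Assumption: SHA-256 stands in for the bcrypt hash named in the PRD.
function hashKey(fullKey: string): string {
  return createHash("sha256").update(fullKey).digest("hex");
}

function generateApiKey(): GeneratedKey {
  const secret = randomBytes(16).toString("hex"); // 32 random hex characters
  const fullKey = KEY_PREFIX + secret;
  return {
    fullKey,
    keyPrefix: fullKey.slice(0, KEY_PREFIX.length + 8),
    keyHash: hashKey(fullKey),
  };
}

// Middleware-side check: hash the presented key and compare with the stored hash.
function matchesStoredHash(presentedKey: string, storedHash: string): boolean {
  return hashKey(presentedKey) === storedHash;
}
```

Only `keyPrefix` and `keyHash` would be written to the `api_keys` table; `fullKey` is returned in the response to `POST /api/keys` and then discarded.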
756
+
757
+ **Ralph Loop Prompt:**
758
+
759
+ ```
760
+ /ralph-loop Implement API key management per the PRD below.
761
+
762
+ <iteration_plan>
763
+ Iterations 1-5: DB migration, RLS, types, Zod schemas
764
+ Iterations 6-15: API routes with TDD (create, list, revoke, middleware)
765
+ Iterations 16-22: Settings page UI with TDD
766
+ Iterations 23-28: E2E tests + edge cases
767
+ Iterations 29-30: Polish, code review, final verification
768
+ </iteration_plan>
769
+
770
+ <quality_gates>
771
+ - pnpm build: 0 errors
772
+ - tsc --noEmit: 0 errors
773
+ - pnpm vitest run: all pass, 80%+ coverage
774
+ - pnpm lint: 0 errors
775
+ - npx playwright test: all pass
776
+ - Supabase security advisor: no warnings
777
+ </quality_gates>
778
+
779
+ <autonomy_rules>
780
+ - Follow DESIGN.md for UI
781
+ - Use existing project dependencies
782
+ - Follow existing patterns in src/lib/ and src/app/api/
783
+ - Decide autonomously, document decisions in code comments
784
+ </autonomy_rules>
785
+
786
+ <prd>
787
+ [Insert PRD above]
788
+ </prd>
789
+
790
+ --max-iterations 30 --completion-promise 'All acceptance criteria met, migration applied, RLS active, all tests pass, build succeeds, 0 lint errors'
791
+ ```
792
+
793
+ ---
794
+
795
+ ## 8. Common Mistakes and How to Fix Them
796
+
797
+ ### Vague prompts
798
+
799
+ ```
800
+ # Ineffective -- Claude lacks context to make good decisions
801
+ "Build an invite system"
802
+
803
+ # Effective -- Clear scope, explicit workflow, referenced spec
804
+ "Implement the invite system per the attached PRD.
805
+ Workflow: /plan -> /tdd -> /verify"
806
+ ```
807
+
808
+ ### Missing acceptance criteria
809
+
810
+ ```
811
+ # Ineffective -- No definition of "done"
812
+ "It should work"
813
+
814
+ # Effective -- Testable, measurable criteria
815
+ "AC: Upload JPG/PNG up to 5MB, show preview before upload, display fallback avatar"
816
+ ```
817
+
818
+ ### No workflow specified
819
+
820
+ ```
821
+ # Ineffective -- Claude doesn't know whether to plan or code directly
822
+ "Just do it"
823
+
824
+ # Effective -- Explicit sequence of operations
825
+ "Workflow: /plan -> /tdd -> /e2e -> /verify"
826
+ ```
827
+
828
+ ### Unbounded scope
829
+
830
+ ```
831
+ # Ineffective -- Could mean anything from CRUD to full RBAC
832
+ "User management feature"
833
+
834
+ # Effective -- Clear boundaries with explicit exclusions
835
+ "User invitation system. Out of scope: role management, permissions, user deactivation"
836
+ ```
837
+
838
+ ### Over-specifying implementation details
839
+
840
+ ```
841
+ # Ineffective -- Micromanages file structure, function names, implementation
842
+ "Create a file src/lib/invite.ts with a function createInvitation
843
+ that generates a token using crypto.randomUUID..."
844
+
845
+ # Effective -- States the goal, trusts Claude to follow project conventions
846
+ "Implement invitation creation with token generation.
847
+ Follow the existing project structure and patterns."
848
+ ```
849
+
850
+ ### Unclear commit expectations
851
+
852
+ ```
853
+ # Ineffective -- Claude defaults to never committing, or commits unexpectedly
854
+ (saying nothing about commits)
855
+
856
+ # Effective -- Explicit instruction
857
+ "Commit when all quality gates pass" OR "Do not commit -- show me the diff"
858
+ ```
859
+
860
+ ---
861
+
862
+ ## 9. Tips for Maximum Autonomy
863
+
864
+ ### The PRD is the single highest-leverage input
865
+
866
+ A well-written PRD eliminates most interaction stops. Spending 10 extra minutes on the PRD saves 5+ rounds of back-and-forth.
867
+
868
+ ### Pre-define the data model
869
+
870
+ When you specify the database schema upfront, Claude has no architectural decisions left to make and therefore no reason to stop and ask.
871
+
872
+ ### State the autonomy level explicitly
873
+
874
+ - **Standard:** "Ask when unclear" -> 2-3 stops
875
+ - **High:** "Make your own decisions, only ask about breaking changes" -> 0-1 stops
876
+ - **Full Auto:** "Implement fully autonomously, commit when green" -> 0 stops
877
+
878
+ ### Reference existing patterns in the codebase
879
+
880
+ ```
881
+ "Follow the pattern in src/lib/billing/ for the service layer"
882
+ "Use the same form validation approach as src/components/auth/login-form.tsx"
883
+ ```
884
+
885
+ ### Provide infrastructure identifiers
886
+
887
+ When you include the Supabase Project ID, database connection details, or deployment targets in the prompt, Claude can execute migrations, generate types, and query the database directly -- without asking.
888
+
889
+ ### Batch similar features
890
+
891
+ ```
892
+ "Implement these 3 CRUD endpoints in one session:
893
+ 1. /api/projects (PRD A)
894
+ 2. /api/teams (PRD B)
895
+ 3. /api/invitations (PRD C)
896
+ Use /plan for the overall architecture, then /tdd per endpoint."
897
+ ```
898
+
899
+ ---
900
+
901
+ ## 9.5 Agent Teams -- Parallel Multi-Instance Orchestration
902
+
903
+ > **Feature-flagged.** Only active when `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` is set in the `env` block of `~/.claude/settings.json`. When disabled, everything uses Subagents (Task Tool) as before.
904
+
905
+ ### Three Execution Tiers
906
+
907
+ | Tier | Mechanism | Best For |
908
+ | ------------------------ | --------------------------------------------- | ---------------------------------------------- |
909
+ | **Single-Agent** | Main session works alone | Simple tasks, bugfixes, single-file changes |
910
+ | **Subagents** (default) | Task Tool spawns agents within session | Parallel research, code reviews, focused tasks |
911
+ | **Agent Teams** (opt-in) | Independent Claude instances with own context | Complex features with 3+ parallel workstreams |
912
+
913
+ ### When to Use Agent Teams vs. Subagents
914
+
915
+ **Use Subagents (default) when:**
916
+
917
+ - Task is completable by 1-2 agents
918
+ - Results need to flow back to a coordinator
919
+ - Token budget is a concern
920
+ - Feature is small-to-medium scope
921
+
922
+ **Use Agent Teams when:**
923
+
924
+ - Feature spans 3+ independent workstreams (e.g., frontend + backend + tests)
925
+ - Teammates need to communicate directly (not just report back)
926
+ - Parallel implementation would save significant time
927
+ - Each Teammate has clear file ownership boundaries
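
The criteria above can be collapsed into a small decision helper. This is only a heuristic sketch: the type and function names (`FeatureProfile`, `chooseExecutionTier`) are hypothetical, and the rule that a single workstream stays with the main session is an assumption drawn from the tier table rather than something the package implements.

```typescript
type ExecutionTier = "single-agent" | "subagents" | "agent-teams";

// Hypothetical summary of a planned feature.
interface FeatureProfile {
  independentWorkstreams: number; // e.g. frontend, backend, tests
  clearFileOwnership: boolean;    // each workstream edits its own files
  agentTeamsEnabled: boolean;     // mirrors the feature flag being set
}

function chooseExecutionTier(feature: FeatureProfile): ExecutionTier {
  // A single workstream does not need any parallelism.
  if (feature.independentWorkstreams <= 1) return "single-agent";

  // Agent Teams need 3+ independent workstreams and clean file boundaries.
  const teamsFit =
    feature.independentWorkstreams >= 3 && feature.clearFileOwnership;
  if (feature.agentTeamsEnabled && teamsFit) return "agent-teams";

  // Everything else falls back to the default tier.
  return "subagents";
}
```

Token budget and the need for direct Teammate communication are left out here; they are judgment calls that do not reduce well to a boolean.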
928
+
929
+ ### Command Chain with Agent Teams
930
+
931
+ ```
932
+ /plan -> [approval] -> /tasks (with Teammate assignments) -> /orchestrate [profile] -> [parallel work] -> /verify -> /code-review
933
+ ```
934
+
935
+ Compare to Classic:
936
+
937
+ ```
938
+ /plan -> [approval] -> /tasks -> /tdd (sequential per task) -> /verify -> /code-review
939
+ ```
940
+
941
+ ### Team Profiles
942
+
943
+ Predefined in `~/.claude/teams/profiles.md`:
944
+
945
+ - **web-feature** (3): frontend + backend + testing
946
+ - **fullstack** (5): frontend + backend + database + testing + reviewer
947
+ - **frontend-only** (3): layout + components + content
948
+ - **review** (2): quality + security
949
+ - **custom**: user-defined
950
+
951
+ ### Best Practices
952
+
953
+ 1. **3-5 Teammates** -- more creates coordination overhead that outweighs parallelism
954
+ 2. **5-6 Tasks per Teammate** -- enough work to justify the context window cost
955
+ 3. **Clear file ownership** -- never have two Teammates editing the same file
956
+ 4. **Database first** -- schema changes must complete before other Teammates start
957
+ 5. **Reviewer last** -- code review Teammate begins after implementation is done
958
+ 6. **Use task dependencies** -- `blocked by:` prevents premature work
959
+ 7. **Monitor costs** -- Agent Teams use N separate Claude instances (N x token cost)
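
Practice 3 (clear file ownership) is easy to check mechanically before starting a team. A minimal sketch, assuming the assignments are available as a plain object mapping Teammate names to file paths; `OwnershipPlan` and `findOwnershipConflicts` are hypothetical names, not part of the package:

```typescript
// Hypothetical structure: Teammate name -> files it is allowed to edit.
type OwnershipPlan = { [teammate: string]: string[] };

// Returns every file claimed by more than one Teammate, with its claimants.
function findOwnershipConflicts(plan: OwnershipPlan): Map<string, string[]> {
  const owners = new Map<string, string[]>();
  for (const teammate of Object.keys(plan)) {
    for (const file of plan[teammate]) {
      const claimants = owners.get(file) ?? [];
      claimants.push(teammate);
      owners.set(file, claimants);
    }
  }

  const conflicts = new Map<string, string[]>();
  owners.forEach((claimants, file) => {
    if (claimants.length > 1) conflicts.set(file, claimants);
  });
  return conflicts;
}
```

An empty result means no two Teammates share a file; any entry names a file that should be reassigned or split before parallel work begins.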
960
+
961
+ ### Display Modes
962
+
963
+ - **In-process**: All Teammates in same terminal. `Shift+Down` to cycle.
964
+ - **Split panes**: Each Teammate in own pane (requires tmux or iTerm2).
965
+
966
+ Configure via `"teammateMode": "in-process"` or `"tmux"` in settings.json.
967
+
968
+ ---
969
+
970
+ ## 10. Ralph Loop -- Iterative Autonomy
971
+
972
+ ### What is Ralph Loop?
973
+
974
+ Ralph Loop is a **self-referential agentic loop**: you start Claude Code once with a prompt, and it iterates autonomously over its own work -- until the quality criteria are met or the max-iterations limit is reached.
975
+
976
+ **Comparison with a normal session:**
977
+
978
+ | | Normal Session | Ralph Loop |
979
+ | ----------- | ------------------------------------ | ----------------------------------------- |
980
+ | Interaction | You provide prompts, Claude responds | One prompt, Claude iterates alone |
981
+ | Stops | Possible at every phase | Only at max-iterations or completion |
982
+ | Feedback | You review each step | Claude reviews itself via git/files |
983
+ | Best for | Exploratory work, design decisions | Well-defined tasks with testable criteria |
984
+
985
+ ### When to Use Ralph Loop
986
+
987
+ | Autonomy Level | Mode | Rationale |
988
+ | -------------- | --------------------------------- | --------------------------------------------------------------- |
989
+ | **Standard** | Normal session | You want control over each step |
990
+ | **High** | Normal session + full-auto prompt | Few stops, but you're present |
991
+ | **Full Auto** | **Ralph Loop** | Well-defined goal, measurable criteria, no subjective decisions |
992
+
993
+ **Use Ralph Loop when:**
994
+
995
+ - Acceptance criteria are 100% testable (tests, build, lint)
996
+ - No subjective UI/UX decisions are needed
997
+ - Task is well-defined (PRD with data model + API design)
998
+ - You want to run overnight or unattended
999
+
1000
+ **Use a normal session instead when:**
1001
+
1002
+ - Design decisions are needed (subjective UI work)
1003
+ - Scope is unclear or exploratory
1004
+ - Production debugging with unknown root cause
1005
+ - One-shot tasks (single change, no iteration needed)
1006
+
1007
+ ### How to Start
1008
+
1009
+ ```bash
1010
+ /ralph-loop "PROMPT" --max-iterations N --completion-promise "PHRASE"
1011
+ ```
1012
+
1013
+ **Parameters:**
1014
+
1015
+ - **Prompt** -- The complete work order (PRD + workflow)
1016
+ - **--max-iterations N** -- Safety limit (always set this)
1017
+ - **--completion-promise "PHRASE"** -- Claude must output `<promise>PHRASE</promise>` when done
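
To make the promise mechanism concrete, the check below shows one way a wrapper could detect the tag in Claude's output. It is an illustrative sketch only: `containsCompletionPromise` is a hypothetical helper, and the actual matching rules of `/ralph-loop` (whitespace handling, case sensitivity) may differ.

```typescript
// Returns true only if the output contains <promise>PHRASE</promise>
// and the text inside the first such tag equals the expected phrase.
function containsCompletionPromise(output: string, phrase: string): boolean {
  const match = output.match(/<promise>([\s\S]*?)<\/promise>/);
  return match !== null && match[1].trim() === phrase;
}
```

Under this reading, a loop started with `--completion-promise "BUG FIXED"` would not end on output that merely says the bug is fixed; the exact tagged phrase has to appear.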
1018
+
1019
+ **To stop:**
1020
+
1021
+ ```bash
1022
+ /cancel-ralph
1023
+ ```
1024
+
1025
+ ### Max-Iterations Recommendations
1026
+
1027
+ | Feature Type | Max-Iterations | Rationale |
1028
+ | ------------------------ | -------------- | ----------------------------------------- |
1029
+ | Bugfix | **10** | Find cause + fix + test -- few iterations |
1030
+ | Small feature | **20** | CRUD, single endpoint, simple UI |
1031
+ | Standard feature | **30** | DB + API + frontend + tests |
1032
+ | Large feature | **50** | Multi-domain, many endpoints, complex UI |
1033
+ | Refactoring | **15** | Scoped changes, little new code |
1034
+ | Performance optimization | **20** | Profiling + optimization + verification |
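
If you script your loop invocations, the table can be encoded as a lookup. A small sketch with the values copied from the table above; the `FeatureType` keys and the `buildRalphFlags` helper are hypothetical names chosen for illustration:

```typescript
type FeatureType =
  | "bugfix"
  | "small-feature"
  | "standard-feature"
  | "large-feature"
  | "refactoring"
  | "performance-optimization";

// Values taken from the recommendations table.
const MAX_ITERATIONS: Record<FeatureType, number> = {
  "bugfix": 10,
  "small-feature": 20,
  "standard-feature": 30,
  "large-feature": 50,
  "refactoring": 15,
  "performance-optimization": 20,
};

// Builds the flag portion of a /ralph-loop invocation.
function buildRalphFlags(type: FeatureType, promise: string): string {
  return `--max-iterations ${MAX_ITERATIONS[type]} --completion-promise "${promise}"`;
}
```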
1035
+
1036
+ ### Completion Promises
1037
+
1038
+ The completion promise must align with your quality gates. Claude may only output `<promise>...</promise>` **when the statement is 100% true**.
1039
+
1040
+ **Recommended promises by feature type:**
1041
+
1042
+ #### Standard Feature
1043
+
1044
+ ```
1045
+ --completion-promise "All tests pass, build succeeds, 0 lint errors, no console.log in production code"
1046
+ ```
1047
+
1048
+ #### Feature with E2E
1049
+
1050
+ ```
1051
+ --completion-promise "All unit tests pass, all e2e tests pass, build succeeds, 0 lint errors, code review complete"
1052
+ ```
1053
+
1054
+ #### Bugfix
1055
+
1056
+ ```
1057
+ --completion-promise "Bug is fixed, regression test added and passing, build succeeds"
1058
+ ```
1059
+
1060
+ #### Refactoring
1061
+
1062
+ ```
1063
+ --completion-promise "All tests pass, no dead code, build succeeds, 0 lint errors"
1064
+ ```
1065
+
1066
+ #### Full-Stack with Supabase
1067
+
1068
+ ```
1069
+ --completion-promise "Migration applied, RLS policies active, all tests pass, build succeeds, 0 lint errors, types generated"
1070
+ ```
1071
+
1072
+ ### Complete Ralph Loop Prompt Examples
1073
+
1074
+ #### Standard Feature
1075
+
1076
+ ```
1077
+ /ralph-loop "
1078
+ Implement the following feature fully autonomously from database to frontend.
1079
+
1080
+ <workflow>
1081
+ Each iteration:
1082
+ 1. Check current state (git diff, test results)
1083
+ 2. Implement the next logical step
1084
+ 3. Run /verify after every significant change
1085
+ 4. When all acceptance criteria are met: output <promise>COMPLETE</promise>
1086
+ </workflow>
1087
+
1088
+ <quality_gates>
1089
+ All of these must pass before outputting the completion promise:
1090
+ - tsc --noEmit: 0 errors
1091
+ - vitest: all tests green, 80%+ coverage
1092
+ - eslint: 0 errors, 0 warnings
1093
+ - No console.log in src/
1094
+ - Playwright e2e: all journeys passing
1095
+ </quality_gates>
1096
+
1097
+ <prd>
1098
+ [Insert PRD here]
1099
+ </prd>
1100
+ " --max-iterations 30 --completion-promise "COMPLETE"
1101
+ ```
1102
+
1103
+ #### Bugfix
1104
+
1105
+ ```
1106
+ /ralph-loop "
1107
+ Find and fix the following bug. Write a regression test.
1108
+
1109
+ <bug>
1110
+ [Description + reproduction steps]
1111
+ </bug>
1112
+
1113
+ <workflow>
1114
+ Each iteration:
1115
+ 1. Analyze the bug (logs, code, stack trace)
1116
+ 2. Write a failing test that reproduces the bug
1117
+ 3. Implement the fix
1118
+ 4. Verify: test green, build OK, no regression
1119
+ 5. When fix is confirmed: output <promise>BUG FIXED</promise>
1120
+ </workflow>
1121
+ " --max-iterations 10 --completion-promise "BUG FIXED"
1122
+ ```
1123
+
1124
+ #### Large Feature (Overnight)
1125
+
1126
+ ```
1127
+ /ralph-loop "
1128
+ Implement the team invitation system fully autonomously.
1129
+ Make your own decisions on all details.
1130
+
1131
+ <iteration_plan>
1132
+ Iterations 1-5: DB migration, RLS, types, Zod schemas
1133
+ Iterations 6-15: API endpoints with TDD
1134
+ Iterations 16-25: Frontend components with TDD
1135
+ Iterations 26-35: E2E tests, edge cases
1136
+ Iterations 36-50: Polish, code review, cleanup
1137
+ </iteration_plan>
1138
+
1139
+ <quality_gates>
1140
+ - All acceptance criteria met
1141
+ - tsc: 0 errors
1142
+ - vitest: 80%+ coverage
1143
+ - playwright e2e: all journeys passing
1144
+ - eslint: 0 errors
1145
+ - /code-review: no security issues
1146
+ - No console.log, no hardcoded strings
1147
+ </quality_gates>
1148
+
1149
+ <prd>
1150
+ [Insert complete PRD here]
1151
+ </prd>
1152
+
1153
+ When ALL quality gates pass: output <promise>FEATURE COMPLETE</promise>
1154
+ " --max-iterations 50 --completion-promise "FEATURE COMPLETE"
1155
+ ```
1156
+
1157
+ ### Ralph Loop Best Practices
1158
+
1159
+ 1. **Always set --max-iterations** -- Safety net against infinite loops
1160
+ 2. **Make the completion promise specific** -- "COMPLETE" works, but "All tests pass and build succeeds" forces Claude to verify honestly
1161
+ 3. **Include an iteration plan** -- Gives Claude a roadmap for sequencing work
1162
+ 4. **List quality gates explicitly** -- Claude checks exactly these criteria before the promise
1163
+ 5. **Initialize a git repo first** -- Claude tracks its own progress via git diff/log
1164
+ 6. **Build in an escape hatch** -- "After 80% of max-iterations: document blockers and open items"
1165
+
1166
+ ### Tasks as Ralph Loop Roadmap
1167
+
1168
+ When you run `/tasks` before the Ralph Loop, Claude can use the generated tasks.md as a progress tracker across iterations:
1169
+
1170
+ ```
1171
+ /ralph-loop "Implement all tasks from tasks.md in this project.
1171
+
1172
+ <workflow>
1173
+ Each iteration:
1174
+ 1. Read tasks.md -- find the next uncompleted task
1175
+ 2. Implement it following TDD (RED -> GREEN -> REFACTOR)
1176
+ 3. Mark the task as complete in tasks.md
1177
+ 4. Run quality gates
1178
+ 5. When ALL tasks are complete and ALL quality gates pass:
1179
+ output <promise>ALL TASKS COMPLETE</promise>
1180
+ </workflow>
1181
+
1182
+ <quality_gates>
1183
+ [Insert Quality Gates]
1184
+ </quality_gates>
1185
+
1186
+ " --max-iterations [N] --completion-promise 'ALL TASKS COMPLETE'
1188
+ ```
1189
+
1190
+ **Advantage:** Claude sees which tasks are done and which are open across iterations via the tasks.md file -- no context loss between iterations.
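
Step 1 of that workflow ("find the next uncompleted task") is mechanical if tasks.md uses Markdown checkboxes. The sketch below assumes that format (`- [ ] task` for open, `- [x] task` for done); the real layout produced by `/tasks` may differ, and `nextOpenTask` is a hypothetical helper.

```typescript
// Assumption: tasks are Markdown checkbox items, one per line.
// Returns the text of the first unchecked task, or null if none remain.
function nextOpenTask(tasksMarkdown: string): string | null {
  for (const line of tasksMarkdown.split(/\r?\n/)) {
    const match = line.match(/^\s*[-*]\s+\[( |x|X)\]\s+(.*)$/);
    if (match && match[1] === " ") return match[2].trim();
  }
  return null;
}
```

A `null` result corresponds to the "ALL tasks are complete" condition in step 5; the quality gates still have to pass before the promise is output.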
1191
+
1192
+ ### Error Recovery
1193
+
1194
+ When an iteration fails, Claude follows this recovery pattern:
1195
+
1196
+ 1. **Build error** -> Next iteration fixes it automatically. Claude sees the error in terminal output and git diff.
1197
+
1198
+ 2. **Same error 3 times** -> Try a different approach. If that also fails: document the blocker in `.claude/ralph-blockers.md` and move to the next task.
1199
+
1200
+ 3. **Missing dependency** -> Install it (`pnpm add <package>`). Check package.json and project conventions before adding anything new.
1201
+
1202
+ 4. **80% of iterations consumed** -> Write a status report to `.claude/ralph-status.md`:
1203
+ - What is done (with test results)
1204
+ - What remains
1205
+ - Blockers and open questions
1206
+ - Suggested next steps
1207
+
1208
+ 5. **Flaky test** -> Investigate root cause (see Rule 5 in Lessons Learned). Do not retry blindly -- identify the stateful dependency causing flakiness.
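
Two of these rules (repeated errors and the 80% threshold) are simple enough to express as code. A sketch under stated assumptions: the names `LoopState` and `nextRecoveryAction` are hypothetical, and giving the status report priority over the retry rule is a choice made here, not something the text specifies.

```typescript
type RecoveryAction =
  | "continue"
  | "try-different-approach"
  | "write-status-report";

// Hypothetical snapshot of where a loop stands.
interface LoopState {
  iteration: number;      // 1-based index of the current iteration
  maxIterations: number;  // value passed via --max-iterations
  sameErrorCount: number; // consecutive iterations ending in the same error
}

function nextRecoveryAction(state: LoopState): RecoveryAction {
  // 80% of the budget consumed: iteration / max >= 4 / 5, in integer math.
  if (state.iteration * 5 >= state.maxIterations * 4) {
    return "write-status-report";
  }
  // Same error three times in a row: stop repeating the same fix.
  if (state.sameErrorCount >= 3) {
    return "try-different-approach";
  }
  return "continue";
}
```

With `--max-iterations 30`, the status report is due from iteration 24 onward.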
1209
+
1210
+ ---
1211
+
1212
+ ## 11. Lessons Learned -- Common Pitfalls in Autonomous Development
1213
+
1214
+ These rules come from real autonomous sessions and prevent the most common time sinks. They apply to any project type -- web, mobile, backend, CLI, infrastructure.
1215
+
1216
+ ### Rule 1: Test environment differs from development environment
1217
+
1218
+ **Problem:** Code works locally and passes manual testing, but automated tests fail.
1219
+
1220
+ Test runners (Playwright, Jest, pytest, XCTest) create their own execution contexts with their own defaults: locale, timezone, environment variables, permissions, network configuration. These defaults almost always differ from the development environment.
1221
+
1222
+ **Rule:** After scaffolding test infrastructure, **write a trivial smoke test that validates the environment** (e.g., "page loads", "API responds", "DB connection works") -- before writing any feature tests. If the smoke test fails, the problem is test configuration, not application code.
1223
+
1224
+ **Examples across stacks:**
1225
+
1226
+ - Browser test runner uses a different locale than the system -> i18n content renders unexpectedly
1227
+ - CI pipeline has different environment variables -> configuration errors
1228
+ - Test containers start with a different DB version -> SQL incompatibilities
1229
+ - Mobile simulator has different permissions than a real device -> feature gates trigger
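
A smoke test for the first example (locale and timezone drift) can be as small as comparing what the runner reports against what the application expects. The sketch below uses only the standard `Intl` API; `EnvironmentSnapshot`, `captureEnvironment`, and `findEnvironmentMismatches` are hypothetical names, and the expected values are whatever your project assumes.

```typescript
interface EnvironmentSnapshot {
  locale: string;
  timeZone: string;
}

// Reads the locale and timezone the current runtime actually resolves to.
function captureEnvironment(): EnvironmentSnapshot {
  const resolved = Intl.DateTimeFormat().resolvedOptions();
  return { locale: resolved.locale, timeZone: resolved.timeZone };
}

// Lists every expected setting that the actual environment does not match.
function findEnvironmentMismatches(
  expected: Partial<EnvironmentSnapshot>,
  actual: EnvironmentSnapshot,
): string[] {
  const mismatches: string[] = [];
  const keys: (keyof EnvironmentSnapshot)[] = ["locale", "timeZone"];
  for (const key of keys) {
    const want = expected[key];
    if (want !== undefined && want !== actual[key]) {
      mismatches.push(`${key}: expected ${want}, got ${actual[key]}`);
    }
  }
  return mismatches;
}
```

Running `findEnvironmentMismatches({ locale: "en-US" }, captureEnvironment())` inside the test runner, before any feature test, surfaces a configuration difference as a readable message instead of a confusing assertion failure later.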
1230
+
1231
+ ### Rule 2: Validate integration points immediately after scaffolding
1232
+
1233
+ **Problem:** A feature is fully implemented across all layers, but integration testing reveals a fundamental configuration error that should have been caught in Phase 2.
1234
+
1235
+ When multiple systems interact (framework + middleware + routing + DB + auth + testing), the most expensive bugs occur at the **boundaries**, not within individual systems.
1236
+
1237
+ **Rule:** After scaffolding (Phase 2), **immediately test one end-to-end roundtrip** before building features on top. Do not wait until Phase 5 (E2E).
1238
+
1239
+ **Post-scaffolding checklist:**
1240
+
1241
+ - [ ] Routing: Are all planned routes accessible? (not 404)
1242
+ - [ ] Middleware: Does it fire correctly on all paths?
1243
+ - [ ] Auth: Does the login/logout flow work end-to-end?
1244
+ - [ ] DB: Does the migration run, and is the seed data readable?
1245
+ - [ ] API: Does one request/response roundtrip succeed?
1246
+ - [ ] Build: Does the production build complete without errors?
1247
+
1248
+ ### Rule 3: Read diagnostics before attempting fixes
1249
+
1250
+ **Problem:** Test fails -> immediately adjust code/test -> still broken after 5 iterations.
1251
+
1252
+ Every test framework produces detailed diagnostic output: error context files, stack traces, screenshots, accessibility trees, log files. **These outputs almost always contain the answer** -- but they are often skipped in favor of trial-and-error.
1253
+
1254
+ **Rule:** On test failure, **always read the complete diagnostics first** (logs, traces, screenshots, error context) before changing any code. Diagnostic analysis saves an average of 3-5 debug iterations.
1255
+
1256
+ **Diagnostic priority:**
1257
+
1258
+ 1. Error message + stack trace (what exactly failed?)
1259
+ 2. Framework-specific outputs (Playwright: error context, pytest: captured output, etc.)
1260
+ 3. Application logs (server logs, browser console)
1261
+ 4. Screenshots / DOM snapshots (visual state)
1262
+
1263
+ ### Rule 4: Manual verification is not the same as automated testing
1264
+
1265
+ **Problem:** Feature manually tested -> works. Automated test -> fails. "But it works!"
1266
+
1267
+ Manual tests and automated tests run in **different execution contexts**. A human waits intuitively; a test does not. The browser has stored cookies; the test runner starts clean. The IDE forwards ports; the CI runner does not.
1268
+
1269
+ **Rule:** "Manually tested" does not count as verification. Only a **passing automated test** confirms the feature works. When manual and automated results disagree, the cause is almost always in the **execution context**, not the code.
1270
+
1271
+ **Common context differences:**
1272
+
1273
+ - Timing: Humans wait intuitively; tests have fixed timeouts
1274
+ - State: Browser has caches/cookies; test runner starts clean
1275
+ - Locale/timezone: System settings vs. test runner defaults
1276
+ - Network: Local ports vs. container network
1277
+ - Permissions: Dev user vs. service account
1278
+
1279
+ ### Rule 5: Stateful dependencies cause flaky tests
1280
+
1281
+ **Problem:** Test passes the first time, fails the second -- with no code changes.
1282
+
1283
+ Any stateful component (rate limiter, in-memory cache, queue, session store, connection pool) retains state between test runs. When tests modify this state without resetting it, subsequent tests become unreliable.
1284
+
1285
+ **Rule:** Identify stateful dependencies and reset them per test run. Alternatively, **guarantee isolation** in tests against stateful services (own instance, reset endpoint, mocking).
1286
+
1287
+ **Common candidates:**
1288
+
1289
+ - Rate limiters (in-memory maps, Redis keys)
1290
+ - Caches (CDN, application cache, browser cache)
1291
+ - Queues (unprocessed messages from previous runs)
1292
+ - Database state (seed data, auto-increment counters)
1293
+ - File system (temp files, upload directories)
1294
+ - External APIs (sandbox limits, quotas)
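
The rate-limiter case shows the pattern in miniature. The class below is a deliberately simplified, hypothetical limiter (no time window, just a counter per key) with an explicit `reset()` that a test suite can call before each test:

```typescript
// Simplified in-memory limiter for illustration; real limiters also expire
// counts over time. The class name and API are hypothetical.
class InMemoryRateLimiter {
  private readonly hits = new Map<string, number>();

  constructor(private readonly limit: number) {}

  // Records one request for the key; returns false once the limit is reached.
  allow(key: string): boolean {
    const count = this.hits.get(key) ?? 0;
    if (count >= this.limit) return false;
    this.hits.set(key, count + 1);
    return true;
  }

  // Clears all counters so state from one test cannot leak into the next.
  reset(): void {
    this.hits.clear();
  }
}
```

Without the `reset()` call, a test that exhausts the limit for a key causes the next test using the same key to fail on its first request, which matches the "passes once, fails the second time" symptom described above.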
1295
+
1296
+ ### Rule 6: Use precise identifiers from the start
1297
+
1298
+ **Problem:** Tests use vague selectors ("find text X") -> break on every UI change because X appears in multiple places.
1299
+
1300
+ Vague selectors (`getByText("Save")`, `querySelector(".btn")`, `find_element_by_class_name("btn")`) are fragile. As soon as the page or app changes, they match too many elements or the wrong ones.
1301
+
1302
+ **Rule:** Write tests with **scoped, specific identifiers** from the beginning:
1303
+
1304
+ - Element IDs or data-testid attributes
1305
+ - Scoped queries (search within a container, not globally)
1306
+ - Exact matches instead of partial/regex when possible
1307
+ - `.first()` / `[0]` only as a last resort -- prefer refining the selector
1308
+
1309
+ ### Autonomous Debug Loop
1310
+
1311
+ When a test fails during autonomous development, follow this sequence:
1312
+
1313
+ ```
1314
+ Test fails
1315
+ |
1316
+ v
1317
+ 1. STOP -- Do not attempt a fix immediately
1318
+ |
1319
+ v
1320
+ 2. Read diagnostics (error context, logs, traces, screenshots)
1321
+ |
1322
+ v
1323
+ 3. Check execution context (locale, env vars, state, timing)
1324
+ |
1325
+ v
1326
+ 4. Identify root cause (config? code? test? infrastructure?)
1327
+ |
1328
+ v
1329
+ 5. Apply a targeted fix (one change per iteration)
1330
+ |
1331
+ v
1332
+ 6. Re-run the test
1333
+ ```
1334
+
1335
+ ---
1336
+
1337
+ ## 12. Available Tools -- Reference
1338
+
1339
+ ### Commands (usable in prompts)
1340
+
1341
+ | Command | Phase | Function |
1342
+ | ----------------- | -------------- | ---------------------------------------- |
1343
+ | `/plan` | Start | Analysis + plan + **waits for approval** |
1344
+ | `/tdd` | Implementation | Tests first -> code -> refactor |
1345
+ | `/verify` | QA | Build + types + lint + tests |
1346
+ | `/e2e` | QA | Playwright E2E tests |
1347
+ | `/code-review` | Review | Security + quality audit |
1348
+ | `/build-fix` | Debugging | Incremental error resolution |
1349
+ | `/refactor-clean` | Cleanup | Remove dead code |
1350
+ | `/checkpoint` | Safety | Create a restore point |
1351
+
1352
+ ### MCP Servers (automatically available)
1353
+
1354
+ | Server | Function |
1355
+ | ----------------------- | ------------------------------------------------ |
1356
+ | **Supabase** | Migrations, SQL, types, Edge Functions, branches |
1357
+ | **Playwright** | Browser automation, E2E testing |
1358
+ | **Sequential Thinking** | Structured reasoning |
1359
+
1360
+ ### Specialized Agents (automatically selected)
1361
+
1362
+ | Agent | Use Case |
1363
+ | ------------------- | ------------------------------- |
1364
+ | fullstack-developer | Features spanning all layers |
1365
+ | postgres-pro | DB schema, queries, performance |
1366
+ | security-engineer | Auth, RLS, vulnerabilities |
1367
+ | test-automator | Test strategy, coverage |
1368
+ | ui-designer | Component design, UX |