@aslomon/effectum 0.1.0
- package/README.md +633 -0
- package/bin/install.js +652 -0
- package/package.json +29 -0
- package/system/README.md +118 -0
- package/system/commands/build-fix.md +89 -0
- package/system/commands/cancel-ralph.md +90 -0
- package/system/commands/checkpoint.md +63 -0
- package/system/commands/code-review.md +120 -0
- package/system/commands/e2e.md +92 -0
- package/system/commands/plan.md +111 -0
- package/system/commands/ralph-loop.md +163 -0
- package/system/commands/refactor-clean.md +104 -0
- package/system/commands/tdd.md +84 -0
- package/system/commands/verify.md +71 -0
- package/system/stacks/generic.md +96 -0
- package/system/stacks/nextjs-supabase.md +114 -0
- package/system/stacks/python-fastapi.md +140 -0
- package/system/stacks/swift-ios.md +136 -0
- package/system/templates/AUTONOMOUS-WORKFLOW.md +1368 -0
- package/system/templates/CLAUDE.md.tmpl +141 -0
- package/system/templates/guardrails.md.tmpl +39 -0
- package/system/templates/settings.json.tmpl +201 -0
- package/workshop/knowledge/01-prd-template.md +275 -0
- package/workshop/knowledge/02-questioning-framework.md +209 -0
- package/workshop/knowledge/03-decomposition-guide.md +234 -0
- package/workshop/knowledge/04-examples.md +435 -0
- package/workshop/knowledge/05-quality-checklist.md +166 -0
- package/workshop/knowledge/06-network-map-guide.md +413 -0
- package/workshop/knowledge/07-prompt-templates.md +315 -0
- package/workshop/knowledge/08-workflow-modes.md +198 -0
- package/workshop/projects/_example-project/PROJECT.md +33 -0
- package/workshop/projects/_example-project/notes/decisions.md +15 -0
- package/workshop/projects/_example-project/notes/discovery-log.md +9 -0
- package/workshop/templates/PROJECT.md +25 -0
- package/workshop/templates/network-map.mmd +13 -0
- package/workshop/templates/prd.md +133 -0
- package/workshop/templates/requirements-map.md +48 -0
- package/workshop/templates/shared-contracts.md +89 -0
- package/workshop/templates/vision.md +66 -0
@@ -0,0 +1,1368 @@
# Autonomous Feature Implementation Guide

> Standard workflow for fully autonomous feature implementation with Claude Code -- from PRD to production.

## Overview

This document defines how to hand off a feature to Claude Code for fully autonomous implementation. It covers the PRD format, prompt templates, command chains, quality gates, and lessons learned from real sessions. The goal: you provide a PRD and a prompt, and Claude Code delivers production-ready code, including tests, migrations, and security checks.

<important>
All PRDs, prompts, code, comments, commits, and documentation must be written in English. This ensures consistency across the codebase, improves LLM comprehension, and avoids encoding or locale-related issues in tooling and CI/CD pipelines.
</important>

---

## 1. PRD Template

Every PRD must contain these sections. The more complete the PRD, the fewer interaction points and the more autonomous the implementation.

```markdown
# PRD: [Feature Name]

## Problem

What is the problem? Why does it need to be solved? Include business context.

## Goal

What should work when this is done? (1-3 sentences, measurable)

## User Stories

- As a [role], I want to [action], so that [benefit]
- As a [role], I want to [action], so that [benefit]

## Acceptance Criteria

- [ ] AC1: [Concrete, testable criterion]
- [ ] AC2: [Concrete, testable criterion]
- [ ] AC3: [Concrete, testable criterion]

## Scope

### In Scope

- Feature X, Screen Y, API Endpoint Z

### Out of Scope

- What should NOT be built (explicit boundaries)

## Data Model (optional but recommended)

- Tables, fields, relations, RLS requirements
- Include SQL migration sketches if possible

## UI/UX (optional)

- Wireframes, screenshots, Figma links, or text description
- Responsive requirements, breakpoints

## API Design (optional)

- Endpoints, HTTP methods, request/response shapes
- Authentication and authorization requirements

## Constraints

- Performance requirements (latency, throughput)
- Dependencies on existing features or services
- Technical or timeline constraints
```

### Why Each Section Matters

| Section              | If missing, Claude Code will...                        |
| -------------------- | ------------------------------------------------------ |
| Problem              | ...assume incorrect context and build the wrong thing  |
| Acceptance Criteria  | ...not know when "done" means done                     |
| Scope / Out of Scope | ...build too much or too little                        |
| Data Model           | ...ask for clarification or design its own schema      |
| Constraints          | ...use default assumptions that may not fit your needs |

### PRD Best Practices

- Write acceptance criteria as **testable assertions** -- each one should map to at least one automated test
- Define the data model upfront to eliminate architectural decision points and the interaction stops they cause
- List out-of-scope items explicitly -- this prevents scope creep more effectively than vague boundaries
- Include example request/response shapes for APIs -- this removes ambiguity about data contracts

### Agent-Ready PRD Extension (for Ralph Loop / Full Autonomy)

When using Ralph Loop or full-auto mode, add these three sections to your PRD. They bridge the gap between "what to build" and "how to verify it autonomously."

```markdown
### Quality Gates

Automated checks that MUST pass before the feature is considered done:

- Build: `pnpm build` -- 0 errors
- Types: `tsc --noEmit` -- 0 errors
- Tests: `pnpm vitest run` -- all pass, 80%+ coverage
- Lint: `pnpm lint` -- 0 errors
- E2E: `npx playwright test` -- all pass (if applicable)
- Custom: [project-specific checks]

### Autonomy Rules

Where Claude can make its own decisions vs. where it should follow strict guidelines:

- Design decisions: [yes/no + guidelines, e.g. "Follow DESIGN.md"]
- Library choices: [predefined or free]
- Architecture: [predefined or "follow existing patterns"]
- On ambiguity: [decide autonomously / stop and document]

### Completion Promise

Exact phrase for Ralph Loop (must be 100% true before outputting):

"All acceptance criteria met, build passes, tests pass, 0 lint errors"
```

**Why these fields matter for autonomy:**

| Field              | Without it, Claude will...                                                |
| ------------------ | ------------------------------------------------------------------------- |
| Quality Gates      | ...use default checks that may miss project-specific requirements         |
| Autonomy Rules     | ...stop and ask about decisions it could safely make on its own           |
| Completion Promise | ...not know when to stop iterating (Ralph Loop runs until max-iterations) |

---

## 1.5 PRD -> Ralph Loop Conversion

### Quick Conversion

1. Write your PRD using the Agent-Ready template (Section 1 + extension)
2. Copy it into this Ralph Loop prompt template:

```
/ralph-loop Implement the following feature fully autonomously from database to frontend.

<workflow>
Each iteration:
1. Read the PRD and current project state
2. Check what was already done (git diff, existing files)
3. Implement the next logical step
4. Run quality gates after every significant change
5. When ALL acceptance criteria AND quality gates pass: output the completion promise
</workflow>

<quality_gates>
[INSERT Quality Gates from your PRD]
</quality_gates>

<autonomy_rules>
[INSERT Autonomy Rules from your PRD]
</autonomy_rules>

<prd>
[INSERT your complete PRD]
</prd>

--max-iterations [N] --completion-promise '[Completion Promise from your PRD]'
```
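The copy step can also be scripted. The sketch below is illustrative only: the `AgentReadyPrd` shape and the `buildRalphLoopPrompt` helper are assumed names that are not part of the package. It fills the placeholders of the template above and appends the two flags in the form the template shows.

```typescript
// Hypothetical shape for the pieces of an Agent-Ready PRD (an assumption for illustration).
interface AgentReadyPrd {
  prd: string;               // the complete PRD text
  qualityGates: string[];    // one entry per gate, e.g. "pnpm build: 0 errors"
  autonomyRules: string[];   // one entry per rule
  completionPromise: string; // exact phrase to output when everything passes
}

// Fills the placeholders of the Ralph Loop template; the workflow text is copied from it.
function buildRalphLoopPrompt(input: AgentReadyPrd, maxIterations: number): string {
  if (!Number.isInteger(maxIterations) || maxIterations < 1) {
    throw new Error("maxIterations must be a positive integer");
  }
  if (input.completionPromise.includes("'")) {
    // The template wraps the promise in single quotes, so embedded quotes are rejected.
    throw new Error("completionPromise must not contain single quotes");
  }
  const bullets = (items: string[]): string => items.map((item) => `- ${item}`).join("\n");
  return [
    "/ralph-loop Implement the following feature fully autonomously from database to frontend.",
    "",
    "<workflow>",
    "Each iteration:",
    "1. Read the PRD and current project state",
    "2. Check what was already done (git diff, existing files)",
    "3. Implement the next logical step",
    "4. Run quality gates after every significant change",
    "5. When ALL acceptance criteria AND quality gates pass: output the completion promise",
    "</workflow>",
    "",
    "<quality_gates>",
    bullets(input.qualityGates),
    "</quality_gates>",
    "",
    "<autonomy_rules>",
    bullets(input.autonomyRules),
    "</autonomy_rules>",
    "",
    "<prd>",
    input.prd.trim(),
    "</prd>",
    "",
    `--max-iterations ${maxIterations} --completion-promise '${input.completionPromise}'`,
  ].join("\n");
}
```

Generating the prompt from the PRD fields keeps the quality gates and the completion promise in one place, so the prompt cannot drift from the PRD it was derived from.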

### Max-Iterations Selection

| Feature Type     | Max | Rationale                       |
| ---------------- | --- | ------------------------------- |
| Bugfix           | 10  | Find + Fix + Test               |
| Small feature    | 20  | CRUD, single endpoint           |
| Standard feature | 30  | DB + API + Frontend + Tests     |
| Large feature    | 50  | Multi-domain, E2E, complex UI   |
| Refactoring      | 15  | Scoped changes, little new code |

### Iteration Planning (recommended for features > 20 iterations)

Add an `<iteration_plan>` to give Claude a roadmap for sequencing work:

```xml
<iteration_plan>
Iterations 1-3: Setup -- DB migration, types, Zod schemas
Iterations 4-10: Backend -- API routes with TDD
Iterations 11-18: Frontend -- Components with TDD
Iterations 19-25: E2E Tests + Edge Cases
Iterations 26-30: Polish -- Code review, cleanup, final verification
</iteration_plan>
```
```
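For a budget other than 30, the same plan can be rescaled proportionally. The sketch below shows one way to do that, assuming the phase lengths of the example plan (3, 7, 8, 7, and 5 iterations) and the defaults from the Max-Iterations Selection table. `MAX_ITERATIONS`, `PHASES`, and `scaleIterationPlan` are illustrative names, not part of the package.

```typescript
// Defaults taken from the Max-Iterations Selection table.
const MAX_ITERATIONS = {
  bugfix: 10,
  smallFeature: 20,
  standardFeature: 30,
  largeFeature: 50,
  refactoring: 15,
} as const;

// Phase lengths read off the 30-iteration example plan (3 + 7 + 8 + 7 + 5 = 30).
const PHASES = [
  { label: "Setup -- DB migration, types, Zod schemas", weight: 3 },
  { label: "Backend -- API routes with TDD", weight: 7 },
  { label: "Frontend -- Components with TDD", weight: 8 },
  { label: "E2E Tests + Edge Cases", weight: 7 },
  { label: "Polish -- Code review, cleanup, final verification", weight: 5 },
];

// Returns one "Iterations a-b: label" line per phase, scaled to the given budget.
function scaleIterationPlan(maxIterations: number): string[] {
  if (!Number.isInteger(maxIterations) || maxIterations < PHASES.length) {
    throw new Error(`maxIterations must be an integer >= ${PHASES.length}`);
  }
  const total = PHASES.reduce((sum, phase) => sum + phase.weight, 0);
  const lines: string[] = [];
  let start = 1;
  let cumulative = 0;
  PHASES.forEach((phase, index) => {
    cumulative += phase.weight;
    const remainingPhases = PHASES.length - 1 - index;
    // Proportional end point, clamped so this phase and every later one gets at least one iteration.
    let end = Math.round((cumulative / total) * maxIterations);
    end = Math.max(end, start);
    end = Math.min(end, maxIterations - remainingPhases);
    if (remainingPhases === 0) end = maxIterations;
    const range = start === end ? `Iteration ${start}` : `Iterations ${start}-${end}`;
    lines.push(`${range}: ${phase.label}`);
    start = end + 1;
  });
  return lines;
}
```

Calling `scaleIterationPlan(MAX_ITERATIONS.standardFeature)` reproduces the ranges of the example plan; the lines can then be wrapped in `<iteration_plan>` tags.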

Adjust the ranges based on your max-iterations and feature complexity.

---

## 2. Prompt Templates

All prompts follow Anthropic's prompt engineering best practices: clear structure, explicit instructions, context for the "why", and positive directives (say what to do, not what to avoid).

### Standard Prompt (Recommended)

```
Implement the following feature autonomously from database to frontend.

<workflow>
1. /plan -- Create an implementation plan and wait for my approval
2. After approval: Implement using /tdd (tests first, then code)
3. /verify after each phase
4. /e2e for critical user journeys
5. /code-review at the end
6. Do NOT commit -- show me the final git diff
</workflow>

<prd>
[Insert PRD here]
</prd>

<context>
- Project: [project name/path]
- Supabase Project ID: [ID] (if applicable)
- Follow existing patterns in: [relevant files/directories]
</context>
```

### Express Prompt (Small Features)

```
Implement: [1-2 sentence feature description]

<acceptance_criteria>
- [ ] [AC1]
- [ ] [AC2]
</acceptance_criteria>

Workflow: /tdd -> /verify -> /code-review. Do not commit.
```

### Full-Auto Prompt (Maximum Autonomy)

```
Implement the following feature fully autonomously.
Make your own decisions on architecture and UI details.
Only ask when encountering breaking changes or genuinely ambiguous scope.

Workflow: /plan -> /tdd -> /e2e -> /verify -> /code-review
Commit when all quality gates pass.

<prd>
[Insert PRD here]
</prd>
```

### Prompt Writing Guidelines

These guidelines are derived from Anthropic's official prompt engineering documentation:

1. **Be specific and direct** -- Treat Claude like a brilliant new team member who lacks context about your specific project norms and conventions
2. **Provide context for the "why"** -- Explain the reasoning behind constraints so Claude can generalize correctly to edge cases
3. **Use XML tags for structure** -- Wrap distinct sections (`<prd>`, `<workflow>`, `<context>`, `<acceptance_criteria>`) so Claude can parse complex prompts unambiguously
4. **Say what to do, not what to avoid** -- "Write concise functions under 40 lines" is more effective than "Don't write long functions"
5. **Reference existing patterns** -- "Follow the pattern in `src/lib/billing/` for the service layer" gives Claude a concrete example to generalize from
6. **Specify the autonomy level explicitly** -- Claude calibrates its interaction behavior based on your stated preference

### Workflow Mode Decision Matrix

Use this table to decide which workflow mode fits your situation:

| Criterion                 | Normal Session | Full-Auto Prompt | Ralph Loop        |
| ------------------------- | -------------- | ---------------- | ----------------- |
| Subjective UI decisions   | Yes            | No               | No                |
| Clear acceptance criteria | Optional       | Recommended      | **Required**      |
| PRD available             | Optional       | Recommended      | **Required**      |
| Quality gates defined     | Optional       | Recommended      | **Required**      |
| Autonomy rules defined    | N/A            | Optional         | **Recommended**   |
| You are present           | Yes            | Yes              | No (unattended)   |
| Iteration needed          | Manual         | Manual           | **Automatic**     |
| Completion promise        | N/A            | N/A              | **Required**      |
| Typical duration          | 30min-2h       | 1-3h             | 1-8h (unattended) |

**Quick recommendation:**

- **No PRD?** -> Normal Session
- **PRD without quality gates?** -> Full-Auto Prompt
- **PRD with quality gates + completion promise?** -> Ralph Loop
- **PRD + Agent Teams flag?** -> /orchestrate (with Teams)
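
The quick recommendation can be read as a small decision function. The sketch below is one possible encoding with assumed names (`Situation`, `recommendMode`). Two choices are assumptions the list does not settle: the Agent Teams flag takes precedence over Ralph Loop, and a PRD with quality gates but no completion promise falls back to the Full-Auto Prompt.

```typescript
type WorkflowMode =
  | "Normal Session"
  | "Full-Auto Prompt"
  | "Ralph Loop"
  | "/orchestrate (with Teams)";

// Hypothetical description of what you have prepared before starting.
interface Situation {
  hasPrd: boolean;
  hasQualityGates: boolean;
  hasCompletionPromise: boolean;
  agentTeamsEnabled: boolean;
}

function recommendMode(s: Situation): WorkflowMode {
  if (!s.hasPrd) return "Normal Session";
  // Assumption: the Agent Teams flag wins over Ralph Loop when both would apply.
  if (s.agentTeamsEnabled) return "/orchestrate (with Teams)";
  if (s.hasQualityGates && s.hasCompletionPromise) return "Ralph Loop";
  // Assumption: anything short of gates plus promise is treated as Full-Auto.
  return "Full-Auto Prompt";
}
```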

---

## 3. Autonomous Workflow Phases

### Phase 1: Analysis & Planning (`/plan`)

```
Read PRD -> Explore codebase -> Identify risks -> Write plan -> STOP and wait for approval
```

- Reads the PRD and builds understanding of the problem context
- Explores the existing codebase (Glob, Grep, Read) to identify reusable patterns, utilities, and components
- Creates a step-by-step implementation plan with phases and dependencies
- Surfaces risks, open questions, and architectural decisions
- **STOPS and waits for explicit approval before writing any code**

### Phase 2: Database & Types

```
Write migration -> RLS policies -> Generate types -> Derive Zod schemas
```

- Supabase migrations via `apply_migration` (never raw DDL)
- RLS policies for every new table (multi-tenant with org_id)
- Security advisor check after DDL changes
- TypeScript types generated via `generate_typescript_types`
- Zod schemas derived for API validation

### Phase 3: Backend / API (`/tdd`)

```
Write tests (RED) -> Implement API (GREEN) -> Refactor
```

- Route handlers in `src/app/api/`
- Server-side services in `src/lib/[domain]/`
- Zod validation for all inputs
- Result pattern `{ data, error }` for all operations that can fail
- Tests with Vitest + Testing Library
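
The Result pattern named in the list above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the project's actual helper: `Result`, `ok`, `fail`, and `parseInviteEmail` are assumed names, and the hand-rolled email check stands in for the Zod schema a real service would use.

```typescript
// Result pattern: a fallible operation returns { data, error } instead of throwing.
// Exactly one of the two fields is non-null.
type Result<T, E = Error> = { data: T; error: null } | { data: null; error: E };

function ok<T>(data: T): Result<T, never> {
  return { data, error: null };
}

function fail<E>(error: E): Result<never, E> {
  return { data: null, error };
}

// Hypothetical service-layer function used only to show the pattern.
function parseInviteEmail(input: unknown): Result<string, string> {
  if (typeof input !== "string") return fail("email must be a string");
  const email = input.trim().toLowerCase();
  // Deliberately loose check; a real project would validate with its Zod schema.
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) return fail("email is not valid");
  return ok(email);
}
```

Callers check `error` first and only then read `data`, which keeps failure handling explicit at every call site and makes both branches easy to cover in tests.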

### Phase 4: Frontend (`/tdd`)

```
Write component tests (RED) -> Implement components (GREEN) -> Refactor
```

- Server Components by default
- Client Components only when interactivity, hooks, or browser APIs are needed
- Shadcn UI + Tailwind CSS v4
- Feature components in `src/components/[feature]/`
- Hooks in `src/hooks/`

### Phase 5: E2E Tests (`/e2e`)

```
Write Playwright tests -> Browser automation -> Verify user journeys
```

- Critical user journeys as E2E tests
- Accessibility-tree-based selectors (fast, reliable)
- Tests in `tests/e2e/`

<important>
Write a trivial smoke test immediately after scaffolding (Phase 2) to validate that the test environment works correctly -- routing is accessible, middleware fires, API responds. Do not wait until Phase 5 to discover infrastructure misconfigurations.
</important>

### Phase 6: Verification (`/verify` + `/code-review`)

```
Build -> Types -> Lint -> Tests -> Security -> Review
```

- TypeScript compilation (strict mode)
- ESLint/Biome check
- All tests (unit + integration + e2e)
- No console.log in production code
- Security review (OWASP Top 10, RLS check)
- Code quality review

---

## 4. Command Chains by Feature Type

### New Feature (Standard)

```
/plan -> [approval] -> /tdd -> /verify -> /e2e -> /verify -> /code-review
```

### Small Feature / Bugfix

```
/tdd -> /verify -> /code-review
```

### Large Feature (Multi-Domain)

```
/plan -> [approval] -> /tdd (DB+API) -> /verify -> /tdd (Frontend) -> /verify -> /e2e -> /code-review
```

### Refactoring

```
/plan -> [approval] -> /refactor-clean -> /verify -> /code-review
```

### Performance Optimization

```
/plan -> [approval] -> [profiling] -> /tdd -> /verify -> /e2e
```

---

## 5. Interaction Points

These are the moments where Claude Code pauses and waits for your input.

| Point                 | Why                          | What you do                         |
| --------------------- | ---------------------------- | ----------------------------------- |
| After `/plan`         | Plan approval required       | "OK", "Change X", or "Start over"   |
| Ambiguous scope       | PRD has gaps                 | Answer the question                 |
| Breaking change       | Existing API/DB modification | Confirm or choose alternative       |
| Architecture decision | Multiple valid approaches    | Pick one                            |
| Test failure          | Unexpected error             | Provide context or allow workaround |

### How to Minimize Interaction Stops

- Write complete acceptance criteria with testable assertions
- Define explicit scope and out-of-scope boundaries
- Pre-define the data model (tables, fields, relations)
- State autonomy preference: "Make your own decisions, only ask about breaking changes"
- Include the Supabase Project ID so Claude can execute migrations and generate types directly

---

## 6. Quality Gates

Before Claude Code reports "done", it verifies all of these automatically:

| Gate          | Tool             | Criterion                             |
| ------------- | ---------------- | ------------------------------------- |
| Compilation   | `tsc --noEmit`   | 0 errors                              |
| Linting       | ESLint/Biome     | 0 errors, 0 warnings                  |
| Unit Tests    | Vitest           | 80%+ coverage, all passing            |
| E2E Tests     | Playwright       | All journeys passing                  |
| Security      | `/code-review`   | No OWASP vulnerabilities              |
| RLS Check     | Supabase Advisor | All tables have RLS policies          |
| No Debug Logs | `/verify`        | 0 console.log hits in production code |
| Type Safety   | Zod + TS Strict  | No `any`, no `as` casts               |
| File Size | Check | No file exceeds 300 lines |
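
Two of these gates, "No Debug Logs" and "File Size", are simple enough to sketch as a pure function over file contents. How `/verify` actually performs them is not shown in this guide, so the following is only an assumed illustration. `GateFinding` and `checkSources` are hypothetical names, and the `console.log` match is a naive regular expression that would also flag occurrences inside comments or strings.

```typescript
interface GateFinding {
  gate: "File Size" | "No Debug Logs";
  file: string;
  detail: string;
}

const MAX_LINES = 300; // "No file exceeds 300 lines"

// Takes a map of file path -> file content and returns every gate violation found.
function checkSources(files: Record<string, string>): GateFinding[] {
  const findings: GateFinding[] = [];
  for (const [file, source] of Object.entries(files)) {
    // Drop a single trailing newline so a 300-line file is not counted as 301.
    const lines = source.replace(/\n$/, "").split("\n");
    if (lines.length > MAX_LINES) {
      findings.push({ gate: "File Size", file, detail: `${lines.length} lines` });
    }
    lines.forEach((line, index) => {
      if (/\bconsole\.log\s*\(/.test(line)) {
        findings.push({ gate: "No Debug Logs", file, detail: `line ${index + 1}` });
      }
    });
  }
  return findings;
}
```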

---

## 7. Example PRDs

### Example A: Simple Feature

```markdown
# PRD: User Profile Avatar Upload

## Problem

Users cannot upload a profile picture. Their profiles show a generic placeholder,
making it harder for team members to identify each other.

## Goal

Users can upload a profile picture that is displayed across the application.

## User Stories

- As a user, I want to upload a profile picture so that others can recognize me

## Acceptance Criteria

- [ ] Upload accepts JPG/PNG/WebP files up to 5MB
- [ ] Image is stored in Supabase Storage
- [ ] Avatar URL is persisted in the user profile
- [ ] Preview is shown before upload confirmation
- [ ] Fallback avatar is displayed when no image is uploaded

## Scope

### In Scope

- Upload component, storage integration, profile display

### Out of Scope

- Image cropping, filters, social media import
```

**Prompt:**

```
Implement the avatar upload feature.
Workflow: /plan -> /tdd -> /e2e -> /verify -> /code-review. Do not commit.

<prd>
[Insert PRD above]
</prd>
```
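The first acceptance criterion of this PRD is a good example of a testable assertion. A minimal sketch of the check it implies is shown below; `validateAvatar` and `UploadCandidate` are assumed names, and reading "5MB" as 5 × 1024 × 1024 bytes is an interpretation the PRD does not spell out.

```typescript
const ALLOWED_TYPES = ["image/jpeg", "image/png", "image/webp"];
const MAX_BYTES = 5 * 1024 * 1024; // "up to 5MB", read here as 5 MiB (an assumption)

// Structural subset of the browser File object, so the check can run without a DOM.
interface UploadCandidate {
  type: string; // MIME type
  size: number; // size in bytes
}

// Returns null when the file is acceptable, otherwise a user-facing message.
function validateAvatar(file: UploadCandidate): string | null {
  if (!ALLOWED_TYPES.includes(file.type)) {
    return "Only JPG, PNG, or WebP files are accepted";
  }
  if (file.size > MAX_BYTES) {
    return "File must be 5MB or smaller";
  }
  return null;
}
```

Each branch maps to one unit test: an allowed type at exactly the size limit, a disallowed type, and an allowed type one byte over the limit.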

### Example B: Complex Feature

```markdown
# PRD: Team Invitation System

## Problem

New team members can only be added manually via direct database insertion.
This is error-prone, unauditable, and requires developer involvement.

## Goal

Admins can send email invitations. Invited users receive a magic link
and are automatically assigned to the team upon acceptance.

## User Stories

- As an admin, I want to invite team members via email
- As an invitee, I want to join via a link without manual registration
- As an admin, I want to see and revoke pending invitations

## Acceptance Criteria

- [ ] Admin can enter email addresses and assign a role
- [ ] Invitation generates a single-use token (valid for 24 hours)
- [ ] Email is sent via Supabase Edge Function
- [ ] Magic link leads to onboarding for new users
- [ ] Magic link assigns existing users directly to the team
- [ ] Admin sees a list of pending invitations with status
- [ ] Admin can revoke invitations
- [ ] Maximum 50 pending invitations per team
- [ ] Rate limit: 10 invitations per hour per admin

## Data Model

- `invitations` table: id, org_id, email, role, token, status,
  invited_by, expires_at, accepted_at, created_at
- RLS: Only admins of the same org can CRUD
- Index on token (unique) and org_id + status

## API Design

- POST /api/invitations -- Create invitation
- GET /api/invitations -- List for current org
- DELETE /api/invitations/:id -- Revoke invitation
- POST /api/invitations/accept -- Redeem token

## Constraints

- Use Supabase Auth for magic links
- Edge Function for email sending
- No external email providers (use Supabase built-in)
```

**Prompt:**

```
Implement the team invitation system fully autonomously.
Make your own decisions on UI details.
Only ask about architectural ambiguities.

Workflow: /plan -> /tdd -> /e2e -> /verify -> /code-review
Commit when all quality gates pass.

<prd>
[Insert PRD above]
</prd>
```
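The "single-use token (valid for 24 hours)" criterion translates into a small, easily tested rule. The sketch below assumes three status values (`pending`, `accepted`, `revoked`); the PRD only names a `status` column, so these values and the helper names `expiryFor` and `canRedeem` are illustrative.

```typescript
const TOKEN_TTL_MS = 24 * 60 * 60 * 1000; // tokens are valid for 24 hours

// Assumed subset of the `invitations` row; the status values are hypothetical.
interface Invitation {
  status: "pending" | "accepted" | "revoked";
  expiresAt: Date;
}

// Computes the expires_at value for a newly created invitation.
function expiryFor(createdAt: Date): Date {
  return new Date(createdAt.getTime() + TOKEN_TTL_MS);
}

// A token can be redeemed only while it is still pending (single use) and not expired.
function canRedeem(invitation: Invitation, now: Date): boolean {
  return invitation.status === "pending" && now.getTime() < invitation.expiresAt.getTime();
}
```

Passing `now` as a parameter instead of calling `new Date()` inside the function keeps the expiry boundary deterministic in unit tests.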

### Example C: Bugfix

```markdown
# PRD: Fix -- Dashboard Does Not Load for New Users

## Problem

New users without projects see a spinner that never resolves.

## Reproduction Steps

1. Create a new account
2. Navigate to the dashboard route
3. Spinner spins indefinitely

## Expected Behavior

Empty state with a "Create your first project" CTA button.

## Acceptance Criteria

- [ ] New users without projects see an empty state
- [ ] Empty state has a "Create project" button
- [ ] Existing users with projects are unaffected
- [ ] Edge case: Users whose projects have all been deleted
```

**Prompt:**

```
Find and fix the dashboard bug for new users.
Workflow: /tdd -> /verify. Do not commit.

<prd>
[Insert PRD above]
</prd>
```

### Example D: Agent-Ready Frontend Feature (with Ralph Loop)

```markdown
# PRD: Dark Mode Toggle

## Problem

Users cannot switch between light and dark themes. The app only supports
light mode, which causes eye strain for users in low-light environments.

## Goal

Users can toggle between light and dark mode. The preference persists across sessions.

## Acceptance Criteria

- [ ] Toggle button in the header switches between light and dark mode
- [ ] Theme preference is saved to localStorage
- [ ] Theme is applied on page load (no flash of wrong theme)
- [ ] All components respect the active theme
- [ ] Respects system preference (prefers-color-scheme) as default

## Scope

### In Scope

- Theme toggle component, CSS variables, localStorage persistence

### Out of Scope

- Custom theme editor, per-component theme overrides, server-side preference storage

## Quality Gates

- Build: `pnpm build` -- 0 errors
- Types: `tsc --noEmit` -- 0 errors
- Tests: `pnpm vitest run` -- all pass, 80%+ coverage
- Lint: `pnpm lint` -- 0 errors

## Autonomy Rules

- Design decisions: Follow DESIGN.md color tokens
- Library choices: Use existing Tailwind dark mode (class strategy)
- Architecture: Follow existing patterns
- On ambiguity: Decide autonomously

## Completion Promise

"All acceptance criteria met, build passes, tests pass, 0 lint errors"
```

**Ralph Loop Prompt:**

```
/ralph-loop Implement dark mode toggle per the PRD below.

<workflow>
Each iteration:
1. Check current state (git diff, test results)
2. Implement the next logical step
3. Run quality gates after every significant change
4. When ALL criteria pass: output <promise>DARK MODE COMPLETE</promise>
</workflow>

<quality_gates>
- pnpm build: 0 errors
- tsc --noEmit: 0 errors
- pnpm vitest run: all pass, 80%+ coverage
- pnpm lint: 0 errors
</quality_gates>

<autonomy_rules>
- Follow DESIGN.md color tokens
- Use Tailwind dark mode (class strategy)
- Follow existing patterns
- Decide autonomously on ambiguity
</autonomy_rules>

<prd>
[Insert PRD above]
</prd>

--max-iterations 20 --completion-promise 'DARK MODE COMPLETE'
```
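Two of the acceptance criteria in this PRD (saved preference, system preference as default) reduce to one pure function, which is the kind of unit Claude can cover with a test in the first iterations. The sketch below is an assumption about how that logic might look; `resolveTheme` is an illustrative name.

```typescript
type Theme = "light" | "dark";

// A stored preference wins; otherwise fall back to the system preference.
// Unknown stored values are ignored rather than trusted.
function resolveTheme(stored: string | null, systemPrefersDark: boolean): Theme {
  if (stored === "light" || stored === "dark") return stored;
  return systemPrefersDark ? "dark" : "light";
}
```

In the browser the two inputs would typically come from `localStorage.getItem(...)` and `window.matchMedia("(prefers-color-scheme: dark)").matches`; the storage key is left open here because the PRD does not name one.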
|
|
681
|
+
|
|
682
|
+
### Example E: Agent-Ready Full-Stack Feature (with Ralph Loop)
|
|
683
|
+
|
|
684
|
+
```markdown
|
|
685
|
+
# PRD: API Key Management
|
|
686
|
+
|
|
687
|
+
## Problem
|
|
688
|
+
|
|
689
|
+
Users cannot generate or manage API keys for programmatic access. All integrations
|
|
690
|
+
require manual token exchange, which is insecure and unscalable.
|
|
691
|
+
|
|
692
|
+
## Goal
|
|
693
|
+
|
|
694
|
+
Users can create, view, and revoke API keys from their settings page.
|
|
695
|
+
Each key has a name, scoped permissions, and expiration date.
|
|
696
|
+
|
|
697
|
+
## Acceptance Criteria
|
|
698
|
+
|
|
699
|
+
- [ ] User can generate a new API key with a name and expiration
|
|
700
|
+
- [ ] Full key is shown once after creation (never again)
|
|
701
|
+
- [ ] Key list shows name, prefix (first 8 chars), created date, and expiration
|
|
702
|
+
- [ ] User can revoke any key (immediate invalidation)
|
|
703
|
+
- [ ] API validates keys via middleware (Authorization: Bearer sk\_...)
|
|
704
|
+
- [ ] Rate limit: 100 requests/minute per key
|
|
705
|
+
- [ ] Maximum 10 active keys per user
|
|
706
|
+
|
|
707
|
+
## Scope
|
|
708
|
+
|
|
709
|
+
### In Scope
|
|
710
|
+
|
|
711
|
+
- API key CRUD, key validation middleware, settings page UI
|
|
712
|
+
|
|
713
|
+
### Out of Scope
|
|
714
|
+
|
|
715
|
+
- OAuth scopes, team-shared keys, usage analytics dashboard
|
|
716
|
+
|
|
717
|
+
## Data Model
|
|
718
|
+
|
|
719
|
+
- `api_keys` table: id, user_id, org_id, name, key_hash, key_prefix,
|
|
720
|
+
permissions, expires_at, revoked_at, last_used_at, created_at
|
|
721
|
+
- RLS: Users can only manage their own keys within their org
|
|
722
|
+
- Index on key_hash (unique), user_id + org_id
|
|
723
|
+
|
|
724
|
+
## API Design
|
|
725
|
+
|
|
726
|
+
- POST /api/keys -- Create key (returns full key once)
|
|
727
|
+
- GET /api/keys -- List keys for current user
|
|
728
|
+
- DELETE /api/keys/:id -- Revoke key
|
|
729
|
+
|
|
730
|
+
## Quality Gates
|
|
731
|
+
|
|
732
|
+
- Build: `pnpm build` -- 0 errors
|
|
733
|
+
- Types: `tsc --noEmit` -- 0 errors
|
|
734
|
+
- Tests: `pnpm vitest run` -- all pass, 80%+ coverage
|
|
735
|
+
- Lint: `pnpm lint` -- 0 errors
|
|
736
|
+
- E2E: `npx playwright test` -- all pass
|
|
737
|
+
- RLS: Supabase security advisor -- no warnings
|
|
738
|
+
|
|
739
|
+
## Autonomy Rules
|
|
740
|
+
|
|
741
|
+
- Design decisions: Follow DESIGN.md
|
|
742
|
+
- Library choices: Use existing project dependencies
|
|
743
|
+
- Architecture: Follow existing patterns in src/lib/ and src/app/api/
|
|
744
|
+
- On ambiguity: Decide autonomously, document decisions in code comments
|
|
745
|
+
|
|
746
|
+
## Constraints
|
|
747
|
+
|
|
748
|
+
- Use crypto.randomBytes for key generation (not UUIDs)
|
|
749
|
+
- Store only bcrypt hash of key, never plaintext
|
|
750
|
+
- Key format: sk_live_[32 random chars]
|
|
751
|
+
|
|
752
|
+
## Completion Promise
|
|
753
|
+
|
|
754
|
+
"All acceptance criteria met, migration applied, RLS active, all tests pass, build succeeds, 0 lint errors"
|
|
755
|
+
```
|
|
756
|
+
|
|
757
|
+
**Ralph Loop Prompt:**
|
|
758
|
+
|
|
759
|
+
```
|
|
760
|
+
/ralph-loop Implement API key management per the PRD below.
|
|
761
|
+
|
|
762
|
+
<iteration_plan>
|
|
763
|
+
Iterations 1-5: DB migration, RLS, types, Zod schemas
|
|
764
|
+
Iterations 6-15: API routes with TDD (create, list, revoke, middleware)
|
|
765
|
+
Iterations 16-22: Settings page UI with TDD
|
|
766
|
+
Iterations 23-28: E2E tests + edge cases
|
|
767
|
+
Iterations 29-30: Polish, code review, final verification
|
|
768
|
+
</iteration_plan>
|
|
769
|
+
|
|
770
|
+
<quality_gates>
|
|
771
|
+
- pnpm build: 0 errors
|
|
772
|
+
- tsc --noEmit: 0 errors
|
|
773
|
+
- pnpm vitest run: all pass, 80%+ coverage
|
|
774
|
+
- pnpm lint: 0 errors
|
|
775
|
+
- npx playwright test: all pass
|
|
776
|
+
- Supabase security advisor: no warnings
|
|
777
|
+
</quality_gates>
|
|
778
|
+
|
|
779
|
+
<autonomy_rules>
|
|
780
|
+
- Follow DESIGN.md for UI
|
|
781
|
+
- Use existing project dependencies
|
|
782
|
+
- Follow existing patterns in src/lib/ and src/app/api/
|
|
783
|
+
- Decide autonomously, document decisions in code comments
|
|
784
|
+
</autonomy_rules>
|
|
785
|
+
|
|
786
|
+
<prd>
|
|
787
|
+
[Insert PRD above]
|
|
788
|
+
</prd>
|
|
789
|
+
|
|
790
|
+
--max-iterations 30 --completion-promise 'All acceptance criteria met, migration applied, RLS active, all tests pass, build succeeds, 0 lint errors'
|
|
791
|
+
```
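
Two of the acceptance criteria in Example E (the 8-character display prefix and the limit of 10 active keys per user) can be expressed as small pure functions. The sketch below is illustrative only: the `ApiKeyRow` shape and the function names are assumptions loosely based on the `api_keys` columns in the PRD, not code from an actual project.

```typescript
// Hypothetical row shape, loosely mirroring the `api_keys` columns in the PRD.
interface ApiKeyRow {
  keyPrefix: string;
  expiresAt: Date;
  revokedAt: Date | null;
}

const MAX_ACTIVE_KEYS = 10; // "Maximum 10 active keys per user"
const PREFIX_LENGTH = 8; // "prefix (first 8 chars)"

// Only the prefix is kept for display; the full key is shown once at creation.
function displayPrefix(fullKey: string): string {
  return fullKey.slice(0, PREFIX_LENGTH);
}

// Assumption: a key counts as active when it is neither revoked nor expired.
function isActive(key: ApiKeyRow, now: Date): boolean {
  if (key.revokedAt !== null) return false;
  return key.expiresAt.getTime() > now.getTime();
}

// Creating a new key is allowed only while the user is below the limit.
function canCreateKey(existing: ApiKeyRow[], now: Date): boolean {
  const activeCount = existing.filter((k) => isActive(k, now)).length;
  return activeCount < MAX_ACTIVE_KEYS;
}
```

Keeping these rules in pure functions makes them easy to cover with unit tests before the API routes exist, which fits the TDD iterations in the plan above.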
|
|
792
|
+
|
|
793
|
+
---
|
|
794
|
+
|
|
795
|
+
## 8. Common Mistakes and How to Fix Them
|
|
796
|
+
|
|
797
|
+
### Vague prompts
|
|
798
|
+
|
|
799
|
+
```
|
|
800
|
+
# Ineffective -- Claude lacks context to make good decisions
|
|
801
|
+
"Build an invite system"
|
|
802
|
+
|
|
803
|
+
# Effective -- Clear scope, explicit workflow, referenced spec
|
|
804
|
+
"Implement the invite system per the attached PRD.
|
|
805
|
+
Workflow: /plan -> /tdd -> /verify"
|
|
806
|
+
```
|
|
807
|
+
|
|
808
|
+
### Missing acceptance criteria
|
|
809
|
+
|
|
810
|
+
```
|
|
811
|
+
# Ineffective -- No definition of "done"
|
|
812
|
+
"It should work"
|
|
813
|
+
|
|
814
|
+
# Effective -- Testable, measurable criteria
|
|
815
|
+
"AC: Upload JPG/PNG up to 5MB, show preview before upload, display fallback avatar"
|
|
816
|
+
```
|
|
817
|
+
|
|
818
|
+
### No workflow specified
|
|
819
|
+
|
|
820
|
+
```
|
|
821
|
+
# Ineffective -- Claude doesn't know whether to plan or code directly
|
|
822
|
+
"Just do it"
|
|
823
|
+
|
|
824
|
+
# Effective -- Explicit sequence of operations
|
|
825
|
+
"Workflow: /plan -> /tdd -> /e2e -> /verify"
|
|
826
|
+
```
|
|
827
|
+
|
|
828
|
+
### Unbounded scope
|
|
829
|
+
|
|
830
|
+
```
|
|
831
|
+
# Ineffective -- Could mean anything from CRUD to full RBAC
|
|
832
|
+
"User management feature"
|
|
833
|
+
|
|
834
|
+
# Effective -- Clear boundaries with explicit exclusions
|
|
835
|
+
"User invitation system. Out of scope: role management, permissions, user deactivation"
|
|
836
|
+
```
|
|
837
|
+
|
|
838
|
+
### Over-specifying implementation details
|
|
839
|
+
|
|
840
|
+
```
|
|
841
|
+
# Ineffective -- Micromanages file structure, function names, implementation
|
|
842
|
+
"Create a file src/lib/invite.ts with a function createInvitation
|
|
843
|
+
that generates a token using crypto.randomUUID..."
|
|
844
|
+
|
|
845
|
+
# Effective -- States the goal, trusts Claude to follow project conventions
|
|
846
|
+
"Implement invitation creation with token generation.
|
|
847
|
+
Follow the existing project structure and patterns."
|
|
848
|
+
```
|
|
849
|
+
|
|
850
|
+
### Unclear commit expectations
|
|
851
|
+
|
|
852
|
+
```
|
|
853
|
+
# Ineffective -- Claude defaults to never committing, or commits unexpectedly
|
|
854
|
+
(saying nothing about commits)
|
|
855
|
+
|
|
856
|
+
# Effective -- Explicit instruction
|
|
857
|
+
"Commit when all quality gates pass" OR "Do not commit -- show me the diff"
|
|
858
|
+
```
|
|
859
|
+
|
|
860
|
+
---
|
|
861
|
+
|
|
862
|
+
## 9. Tips for Maximum Autonomy
|
|
863
|
+
|
|
864
|
+
### The PRD is the single highest-leverage input
|
|
865
|
+
|
|
866
|
+
A well-written PRD eliminates most interaction stops. Ten extra minutes spent on the PRD can save five or more rounds of back-and-forth.
|
|
867
|
+
|
|
868
|
+
### Pre-define the data model
|
|
869
|
+
|
|
870
|
+
When you specify the database schema upfront, Claude has no data-model decisions left to make and no reason to stop and ask about them.
|
|
871
|
+
|
|
872
|
+
### State the autonomy level explicitly
|
|
873
|
+
|
|
874
|
+
- **Standard:** "Ask when unclear" -> 2-3 stops
|
|
875
|
+
- **High:** "Make your own decisions, only ask about breaking changes" -> 0-1 stops
|
|
876
|
+
- **Full Auto:** "Implement fully autonomously, commit when green" -> 0 stops
|
|
877
|
+
|
|
878
|
+
### Reference existing patterns in the codebase
|
|
879
|
+
|
|
880
|
+
```
|
|
881
|
+
"Follow the pattern in src/lib/billing/ for the service layer"
|
|
882
|
+
"Use the same form validation approach as src/components/auth/login-form.tsx"
|
|
883
|
+
```
|
|
884
|
+
|
|
885
|
+
### Provide infrastructure identifiers
|
|
886
|
+
|
|
887
|
+
When you include the Supabase Project ID, database connection details, or deployment targets in the prompt, Claude can execute migrations, generate types, and query the database directly -- without asking.
|
|
888
|
+
|
|
889
|
+
### Batch similar features
|
|
890
|
+
|
|
891
|
+
```
|
|
892
|
+
"Implement these 3 CRUD endpoints in one session:
|
|
893
|
+
1. /api/projects (PRD A)
|
|
894
|
+
2. /api/teams (PRD B)
|
|
895
|
+
3. /api/invitations (PRD C)
|
|
896
|
+
Use /plan for the overall architecture, then /tdd per endpoint."
|
|
897
|
+
```
|
|
898
|
+
|
|
899
|
+
---
|
|
900
|
+
|
|
901
|
+
## 9.5 Agent Teams -- Parallel Multi-Instance Orchestration
|
|
902
|
+
|
|
903
|
+
> **Feature-flagged.** Only active when `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` is set in the `env` block of `~/.claude/settings.json`. When disabled, everything uses Subagents (Task Tool) as before.
|
|
904
|
+
|
|
905
|
+
### Three Execution Tiers
|
|
906
|
+
|
|
907
|
+
| Tier | Mechanism | Best For |
|
|
908
|
+
| ------------------------ | --------------------------------------------- | ---------------------------------------------- |
|
|
909
|
+
| **Single-Agent** | Main session works alone | Simple tasks, bugfixes, single-file changes |
|
|
910
|
+
| **Subagents** (default) | Task Tool spawns agents within session | Parallel research, code reviews, focused tasks |
|
|
911
|
+
| **Agent Teams** (opt-in) | Independent Claude instances with own context | Complex features with 3+ parallel workstreams |
|
|
912
|
+
|
|
913
|
+
### When to Use Agent Teams vs. Subagents
|
|
914
|
+
|
|
915
|
+
**Use Subagents (default) when:**
|
|
916
|
+
|
|
917
|
+
- Task is completable by 1-2 agents
|
|
918
|
+
- Results need to flow back to a coordinator
|
|
919
|
+
- Token budget is a concern
|
|
920
|
+
- Feature is small-to-medium scope
|
|
921
|
+
|
|
922
|
+
**Use Agent Teams when:**
|
|
923
|
+
|
|
924
|
+
- Feature spans 3+ independent workstreams (e.g., frontend + backend + tests)
|
|
925
|
+
- Teammates need to communicate directly (not just report back)
|
|
926
|
+
- Parallel implementation would save significant time
|
|
927
|
+
- Each Teammate has clear file ownership boundaries
|
|
928
|
+
|
|
929
|
+
### Command Chain with Agent Teams
|
|
930
|
+
|
|
931
|
+
```
|
|
932
|
+
/plan -> [approval] -> /tasks (with Teammate assignments) -> /orchestrate [profile] -> [parallel work] -> /verify -> /code-review
|
|
933
|
+
```
|
|
934
|
+
|
|
935
|
+
Compare to Classic:
|
|
936
|
+
|
|
937
|
+
```
|
|
938
|
+
/plan -> [approval] -> /tasks -> /tdd (sequential per task) -> /verify -> /code-review
|
|
939
|
+
```
|
|
940
|
+
|
|
941
|
+
### Team Profiles
|
|
942
|
+
|
|
943
|
+
Predefined in `~/.claude/teams/profiles.md`:
|
|
944
|
+
|
|
945
|
+
- **web-feature** (3): frontend + backend + testing
|
|
946
|
+
- **fullstack** (5): frontend + backend + database + testing + reviewer
|
|
947
|
+
- **frontend-only** (3): layout + components + content
|
|
948
|
+
- **review** (2): quality + security
|
|
949
|
+
- **custom**: user-defined
|
|
950
|
+
|
|
951
|
+
### Best Practices
|
|
952
|
+
|
|
953
|
+
1. **3-5 Teammates** -- more than that creates coordination overhead that outweighs the gains from parallelism
|
|
954
|
+
2. **5-6 Tasks per Teammate** -- enough work to justify the context window cost
|
|
955
|
+
3. **Clear file ownership** -- never have two Teammates editing the same file
|
|
956
|
+
4. **Database first** -- schema changes must complete before other Teammates start
|
|
957
|
+
5. **Reviewer last** -- code review Teammate begins after implementation is done
|
|
958
|
+
6. **Use task dependencies** -- `blocked by:` prevents premature work
|
|
959
|
+
7. **Monitor costs** -- Agent Teams use N separate Claude instances (N x token cost)
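
The "clear file ownership" practice can be checked mechanically before a team starts working. The following is a minimal sketch under assumptions: the `OwnershipPlan` shape (teammate name mapped to the files it may edit) and the function name are hypothetical and not part of any Agent Teams API.

```typescript
// Hypothetical plan shape: teammate name -> files that teammate may edit.
type OwnershipPlan = Record<string, string[]>;

// Returns every file claimed by more than one teammate, with its claimants.
function findOwnershipConflicts(plan: OwnershipPlan): Record<string, string[]> {
  // Null-prototype objects avoid clashes with inherited keys such as "constructor".
  const owners: Record<string, string[]> = Object.create(null);
  for (const teammate of Object.keys(plan)) {
    for (const file of plan[teammate] || []) {
      const claimants = owners[file] || [];
      // Listing the same file twice for one teammate is not a conflict.
      if (claimants.indexOf(teammate) === -1) claimants.push(teammate);
      owners[file] = claimants;
    }
  }
  const conflicts: Record<string, string[]> = Object.create(null);
  for (const file of Object.keys(owners)) {
    const claimants = owners[file] || [];
    if (claimants.length > 1) conflicts[file] = claimants;
  }
  return conflicts;
}
```

An empty result means no two Teammates share a file. Any entry in the result is a file that should be reassigned to a single owner before parallel work begins.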
|
|
960
|
+
|
|
961
|
+
### Display Modes
|
|
962
|
+
|
|
963
|
+
- **In-process**: All Teammates in same terminal. `Shift+Down` to cycle.
|
|
964
|
+
- **Split panes**: Each Teammate in own pane (requires tmux or iTerm2).
|
|
965
|
+
|
|
966
|
+
Configure via `"teammateMode": "in-process"` or `"tmux"` in settings.json.
|
|
967
|
+
|
|
968
|
+
---
|
|
969
|
+
|
|
970
|
+
## 10. Ralph Loop -- Iterative Autonomy
|
|
971
|
+
|
|
972
|
+
### What is Ralph Loop?
|
|
973
|
+
|
|
974
|
+
Ralph Loop is a **self-referential agentic loop**: you start Claude Code once with a prompt, and it iterates autonomously over its own work -- until the quality criteria are met or the max-iterations limit is reached.
|
|
975
|
+
|
|
976
|
+
**Comparison with a normal session:**
|
|
977
|
+
|
|
978
|
+
| | Normal Session | Ralph Loop |
|
|
979
|
+
| ----------- | ------------------------------------ | ----------------------------------------- |
|
|
980
|
+
| Interaction | You provide prompts, Claude responds | One prompt, Claude iterates alone |
|
|
981
|
+
| Stops | Possible at every phase | Only at max-iterations or completion |
|
|
982
|
+
| Feedback | You review each step | Claude reviews itself via git/files |
|
|
983
|
+
| Best for | Exploratory work, design decisions | Well-defined tasks with testable criteria |
|
|
984
|
+
|
|
985
|
+
### When to Use Ralph Loop
|
|
986
|
+
|
|
987
|
+
| Autonomy Level | Mode | Rationale |
|
|
988
|
+
| -------------- | --------------------------------- | --------------------------------------------------------------- |
|
|
989
|
+
| **Standard** | Normal session | You want control over each step |
|
|
990
|
+
| **High** | Normal session + full-auto prompt | Few stops, but you're present |
|
|
991
|
+
| **Full Auto** | **Ralph Loop** | Well-defined goal, measurable criteria, no subjective decisions |
|
|
992
|
+
|
|
993
|
+
**Use Ralph Loop when:**
|
|
994
|
+
|
|
995
|
+
- Acceptance criteria are 100% testable (tests, build, lint)
|
|
996
|
+
- No subjective UI/UX decisions are needed
|
|
997
|
+
- Task is well-defined (PRD with data model + API design)
|
|
998
|
+
- You want to run overnight or unattended
|
|
999
|
+
|
|
1000
|
+
**Use a normal session instead when:**
|
|
1001
|
+
|
|
1002
|
+
- Design decisions are needed (subjective UI work)
|
|
1003
|
+
- Scope is unclear or exploratory
|
|
1004
|
+
- Production debugging with unknown root cause
|
|
1005
|
+
- One-shot tasks (single change, no iteration needed)
|
|
1006
|
+
|
|
1007
|
+
### How to Start
|
|
1008
|
+
|
|
1009
|
+
```bash
|
|
1010
|
+
/ralph-loop "PROMPT" --max-iterations N --completion-promise "PHRASE"
|
|
1011
|
+
```
|
|
1012
|
+
|
|
1013
|
+
**Parameters:**
|
|
1014
|
+
|
|
1015
|
+
- **Prompt** -- The complete work order (PRD + workflow)
|
|
1016
|
+
- **--max-iterations N** -- Safety limit (always set this)
|
|
1017
|
+
- **--completion-promise "PHRASE"** -- Claude must output `<promise>PHRASE</promise>` when done
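
To make the interaction between the two flags concrete, here is a simplified model of the stopping rule. It is a sketch, not the plugin's implementation: `runLoop`, `hasCompletionPromise`, and the exact-match comparison are assumptions made for illustration.

```typescript
// Assumption: the loop ends only when the exact phrase appears inside <promise> tags.
function hasCompletionPromise(output: string, phrase: string): boolean {
  return output.indexOf("<promise>" + phrase + "</promise>") !== -1;
}

// Hypothetical loop driver: stops on the promise or at the iteration limit.
function runLoop(
  runIteration: (iteration: number) => string,
  maxIterations: number,
  phrase: string
): { iterations: number; completed: boolean } {
  for (let i = 1; i <= maxIterations; i++) {
    const output = runIteration(i);
    if (hasCompletionPromise(output, phrase)) {
      return { iterations: i, completed: true };
    }
  }
  // The safety limit was reached without the promise being output.
  return { iterations: maxIterations, completed: false };
}
```

In this model the loop has exactly two exits, which is why `--max-iterations` should always be set: without it, a promise that never becomes true would leave no way for the loop to end.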
|
|
1018
|
+
|
|
1019
|
+
**To stop:**
|
|
1020
|
+
|
|
1021
|
+
```bash
|
|
1022
|
+
/cancel-ralph
|
|
1023
|
+
```
|
|
1024
|
+
|
|
1025
|
+
### Max-Iterations Recommendations
|
|
1026
|
+
|
|
1027
|
+
| Feature Type | Max-Iterations | Rationale |
|
|
1028
|
+
| ------------------------ | -------------- | ----------------------------------------- |
|
|
1029
|
+
| Bugfix | **10** | Find cause + fix + test -- few iterations |
|
|
1030
|
+
| Small feature | **20** | CRUD, single endpoint, simple UI |
|
|
1031
|
+
| Standard feature | **30** | DB + API + frontend + tests |
|
|
1032
|
+
| Large feature | **50** | Multi-domain, many endpoints, complex UI |
|
|
1033
|
+
| Refactoring | **15** | Scoped changes, little new code |
|
|
1034
|
+
| Performance optimization | **20** | Profiling + optimization + verification |
|
|
1035
|
+
|
|
1036
|
+
### Completion Promises
|
|
1037
|
+
|
|
1038
|
+
The completion promise must align with your quality gates. Claude may only output `<promise>...</promise>` **when the statement is 100% true**.
|
|
1039
|
+
|
|
1040
|
+
**Recommended promises by feature type:**
|
|
1041
|
+
|
|
1042
|
+
#### Standard Feature
|
|
1043
|
+
|
|
1044
|
+
```
|
|
1045
|
+
--completion-promise "All tests pass, build succeeds, 0 lint errors, no console.log in production code"
|
|
1046
|
+
```
|
|
1047
|
+
|
|
1048
|
+
#### Feature with E2E
|
|
1049
|
+
|
|
1050
|
+
```
|
|
1051
|
+
--completion-promise "All unit tests pass, all e2e tests pass, build succeeds, 0 lint errors, code review complete"
|
|
1052
|
+
```
|
|
1053
|
+
|
|
1054
|
+
#### Bugfix
|
|
1055
|
+
|
|
1056
|
+
```
|
|
1057
|
+
--completion-promise "Bug is fixed, regression test added and passing, build succeeds"
|
|
1058
|
+
```
|
|
1059
|
+
|
|
1060
|
+
#### Refactoring
|
|
1061
|
+
|
|
1062
|
+
```
|
|
1063
|
+
--completion-promise "All tests pass, no dead code, build succeeds, 0 lint errors"
|
|
1064
|
+
```
|
|
1065
|
+
|
|
1066
|
+
#### Full-Stack with Supabase
|
|
1067
|
+
|
|
1068
|
+
```
|
|
1069
|
+
--completion-promise "Migration applied, RLS policies active, all tests pass, build succeeds, 0 lint errors, types generated"
|
|
1070
|
+
```
|
|
1071
|
+
|
|
1072
|
+
### Complete Ralph Loop Prompt Examples
|
|
1073
|
+
|
|
1074
|
+
#### Standard Feature
|
|
1075
|
+
|
|
1076
|
+
```
|
|
1077
|
+
/ralph-loop "
|
|
1078
|
+
Implement the following feature fully autonomously from database to frontend.
|
|
1079
|
+
|
|
1080
|
+
<workflow>
|
|
1081
|
+
Each iteration:
|
|
1082
|
+
1. Check current state (git diff, test results)
|
|
1083
|
+
2. Implement the next logical step
|
|
1084
|
+
3. Run /verify after every significant change
|
|
1085
|
+
4. When all acceptance criteria are met: output <promise>COMPLETE</promise>
|
|
1086
|
+
</workflow>
|
|
1087
|
+
|
|
1088
|
+
<quality_gates>
|
|
1089
|
+
All of these must pass before outputting the completion promise:
|
|
1090
|
+
- tsc --noEmit: 0 errors
|
|
1091
|
+
- vitest: all tests green, 80%+ coverage
|
|
1092
|
+
- eslint: 0 errors, 0 warnings
|
|
1093
|
+
- No console.log in src/
|
|
1094
|
+
- Playwright e2e: all journeys passing
|
|
1095
|
+
</quality_gates>
|
|
1096
|
+
|
|
1097
|
+
<prd>
|
|
1098
|
+
[Insert PRD here]
|
|
1099
|
+
</prd>
|
|
1100
|
+
" --max-iterations 30 --completion-promise "COMPLETE"
|
|
1101
|
+
```
|
|
1102
|
+
|
|
1103
|
+
#### Bugfix
|
|
1104
|
+
|
|
1105
|
+
```
|
|
1106
|
+
/ralph-loop "
|
|
1107
|
+
Find and fix the following bug. Write a regression test.
|
|
1108
|
+
|
|
1109
|
+
<bug>
|
|
1110
|
+
[Description + reproduction steps]
|
|
1111
|
+
</bug>
|
|
1112
|
+
|
|
1113
|
+
<workflow>
|
|
1114
|
+
Each iteration:
|
|
1115
|
+
1. Analyze the bug (logs, code, stack trace)
|
|
1116
|
+
2. Write a failing test that reproduces the bug
|
|
1117
|
+
3. Implement the fix
|
|
1118
|
+
4. Verify: test green, build OK, no regression
|
|
1119
|
+
5. When fix is confirmed: output <promise>BUG FIXED</promise>
|
|
1120
|
+
</workflow>
|
|
1121
|
+
" --max-iterations 10 --completion-promise "BUG FIXED"
|
|
1122
|
+
```
|
|
1123
|
+
|
|
1124
|
+
#### Large Feature (Overnight)
|
|
1125
|
+
|
|
1126
|
+
```
|
|
1127
|
+
/ralph-loop "
|
|
1128
|
+
Implement the team invitation system fully autonomously.
|
|
1129
|
+
Make your own decisions on all details.
|
|
1130
|
+
|
|
1131
|
+
<iteration_plan>
|
|
1132
|
+
Iterations 1-5: DB migration, RLS, types, Zod schemas
|
|
1133
|
+
Iterations 6-15: API endpoints with TDD
|
|
1134
|
+
Iterations 16-25: Frontend components with TDD
|
|
1135
|
+
Iterations 26-35: E2E tests, edge cases
|
|
1136
|
+
Iterations 36-50: Polish, code review, cleanup
|
|
1137
|
+
</iteration_plan>
|
|
1138
|
+
|
|
1139
|
+
<quality_gates>
|
|
1140
|
+
- All acceptance criteria met
|
|
1141
|
+
- tsc: 0 errors
|
|
1142
|
+
- vitest: 80%+ coverage
|
|
1143
|
+
- playwright e2e: all journeys passing
|
|
1144
|
+
- eslint: 0 errors
|
|
1145
|
+
- /code-review: no security issues
|
|
1146
|
+
- No console.log, no hardcoded strings
|
|
1147
|
+
</quality_gates>
|
|
1148
|
+
|
|
1149
|
+
<prd>
|
|
1150
|
+
[Insert complete PRD here]
|
|
1151
|
+
</prd>
|
|
1152
|
+
|
|
1153
|
+
When ALL quality gates pass: output <promise>FEATURE COMPLETE</promise>
|
|
1154
|
+
" --max-iterations 50 --completion-promise "FEATURE COMPLETE"
|
|
1155
|
+
```
|
|
1156
|
+
|
|
1157
|
+
### Ralph Loop Best Practices
|
|
1158
|
+
|
|
1159
|
+
1. **Always set --max-iterations** -- Safety net against infinite loops
|
|
1160
|
+
2. **Make the completion promise specific** -- "COMPLETE" works, but "All tests pass and build succeeds" forces Claude to verify honestly
|
|
1161
|
+
3. **Include an iteration plan** -- Gives Claude a roadmap for sequencing work
|
|
1162
|
+
4. **List quality gates explicitly** -- Claude checks exactly these criteria before the promise
|
|
1163
|
+
5. **Initialize a git repo first** -- Claude tracks its own progress via git diff/log
|
|
1164
|
+
6. **Build in an escape hatch** -- "After 80% of max-iterations: document blockers and open items"
|
|
1165
|
+
|
|
1166
|
+
### Tasks as Ralph Loop Roadmap
|
|
1167
|
+
|
|
1168
|
+
When you run `/tasks` before the Ralph Loop, Claude can use the generated tasks.md as a progress tracker across iterations:
|
|
1169
|
+
|
|
1170
|
+
```
|
|
1171
|
+
/ralph-loop Implement all tasks from tasks.md in this project.
|
|
1172
|
+
|
|
1173
|
+
<workflow>
|
|
1174
|
+
Each iteration:
|
|
1175
|
+
1. Read tasks.md -- find the next uncompleted task
|
|
1176
|
+
2. Implement it following TDD (RED -> GREEN -> REFACTOR)
|
|
1177
|
+
3. Mark the task as complete in tasks.md
|
|
1178
|
+
4. Run quality gates
|
|
1179
|
+
5. When ALL tasks are complete and ALL quality gates pass:
|
|
1180
|
+
output <promise>ALL TASKS COMPLETE</promise>
|
|
1181
|
+
</workflow>
|
|
1182
|
+
|
|
1183
|
+
<quality_gates>
|
|
1184
|
+
[Insert Quality Gates]
|
|
1185
|
+
</quality_gates>
|
|
1186
|
+
|
|
1187
|
+
--max-iterations [N] --completion-promise 'ALL TASKS COMPLETE'
|
|
1188
|
+
```
|
|
1189
|
+
|
|
1190
|
+
**Advantage:** Claude sees which tasks are done and which are open across iterations via the tasks.md file -- no context loss between iterations.
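
As an illustration of how tasks.md can serve as a progress tracker, the sketch below counts completed and open tasks and finds the next one. It assumes tasks are written as Markdown checkboxes (`- [ ]` and `- [x]`). The format that `/tasks` actually generates is not specified here, so treat both the format and the `readTaskProgress` name as assumptions.

```typescript
interface TaskProgress {
  done: number;
  open: number;
  next: string | null; // text of the first unchecked task, if any
}

// Assumes one task per line, written as "- [ ] text" or "- [x] text".
function readTaskProgress(markdown: string): TaskProgress {
  const progress: TaskProgress = { done: 0, open: 0, next: null };
  for (const line of markdown.split(/\r?\n/)) {
    const match = /^\s*[-*] \[( |x|X)\] (.+)$/.exec(line);
    if (match === null) continue; // headings, prose, and blank lines are ignored
    if (match[1] === " ") {
      progress.open += 1;
      if (progress.next === null) progress.next = (match[2] || "").trim();
    } else {
      progress.done += 1;
    }
  }
  return progress;
}
```

Because the state lives in a file rather than in the conversation, each iteration can recompute it from scratch.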
|
|
1191
|
+
|
|
1192
|
+
### Error Recovery
|
|
1193
|
+
|
|
1194
|
+
When an iteration fails, Claude follows this recovery pattern:
|
|
1195
|
+
|
|
1196
|
+
1. **Build error** -> Next iteration fixes it automatically. Claude sees the error in terminal output and git diff.
|
|
1197
|
+
|
|
1198
|
+
2. **Same error 3 times** -> Try a different approach. If that also fails: document the blocker in `.claude/ralph-blockers.md` and move to the next task.
|
|
1199
|
+
|
|
1200
|
+
3. **Missing dependency** -> Install it (`pnpm add <package>`). Check package.json and project conventions before adding anything new.
|
|
1201
|
+
|
|
1202
|
+
4. **80% of iterations consumed** -> Write a status report to `.claude/ralph-status.md`:
|
|
1203
|
+
- What is done (with test results)
|
|
1204
|
+
- What remains
|
|
1205
|
+
- Blockers and open questions
|
|
1206
|
+
- Suggested next steps
|
|
1207
|
+
|
|
1208
|
+
5. **Flaky test** -> Investigate root cause (see Rule 5 in Lessons Learned). Do not retry blindly -- identify the stateful dependency causing flakiness.
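
The thresholds in this recovery pattern can be summarized as a small decision function. This is a sketch under assumptions: the `LoopState` fields, the action names, and the precedence (the 80% status report is checked first) are illustrative choices, not behavior confirmed by the Ralph Loop itself.

```typescript
type RecoveryAction =
  | "fix-next-iteration"
  | "try-different-approach"
  | "document-blocker-and-move-on"
  | "write-status-report";

// Hypothetical bookkeeping an agent could keep across iterations.
interface LoopState {
  iteration: number; // current iteration, starting at 1
  maxIterations: number; // value passed to --max-iterations
  sameErrorCount: number; // consecutive occurrences of the same error
  alternativeTried: boolean; // whether a different approach already failed
}

function nextRecoveryAction(state: LoopState): RecoveryAction {
  // 80% of the budget consumed; integer math avoids floating-point rounding.
  if (state.iteration * 5 >= state.maxIterations * 4) {
    return "write-status-report";
  }
  // Same error three times: switch approach, then give up on this task.
  if (state.sameErrorCount >= 3) {
    return state.alternativeTried
      ? "document-blocker-and-move-on"
      : "try-different-approach";
  }
  // Default: an ordinary failure is fixed in the next iteration.
  return "fix-next-iteration";
}
```

With `--max-iterations 30`, for example, this model switches to writing the status report at iteration 24.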
|
|
1209
|
+
|
|
1210
|
+
---
|
|
1211
|
+
|
|
1212
|
+
## 11. Lessons Learned -- Common Pitfalls in Autonomous Development
|
|
1213
|
+
|
|
1214
|
+
These rules come from real autonomous sessions and prevent the most common time sinks. They apply to any project type -- web, mobile, backend, CLI, infrastructure.
|
|
1215
|
+
|
|
1216
|
+
### Rule 1: Test environment differs from development environment
|
|
1217
|
+
|
|
1218
|
+
**Problem:** Code works locally and passes manual testing, but automated tests fail.
|
|
1219
|
+
|
|
1220
|
+
Test runners (Playwright, Jest, pytest, XCTest) create their own execution contexts with their own defaults: locale, timezone, environment variables, permissions, network configuration. These defaults almost always differ from the development environment.
|
|
1221
|
+
|
|
1222
|
+
**Rule:** After scaffolding test infrastructure, **write a trivial smoke test that validates the environment** (e.g., "page loads", "API responds", "DB connection works") -- before writing any feature tests. If the smoke test fails, the problem is test configuration, not application code.
|
|
1223
|
+
|
|
1224
|
+
**Examples across stacks:**
|
|
1225
|
+
|
|
1226
|
+
- Browser test runner uses a different locale than the system -> i18n content renders unexpectedly
|
|
1227
|
+
- CI pipeline has different environment variables -> configuration errors
|
|
1228
|
+
- Test containers start with a different DB version -> SQL incompatibilities
|
|
1229
|
+
- Mobile simulator has different permissions than a real device -> feature gates trigger
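
A smoke check for the locale and time zone case might look like the sketch below. It relies only on the standard `Intl.DateTimeFormat().resolvedOptions()` call. The `RuntimeEnvironment` type and the function names are hypothetical, and the expected values would come from your own project configuration.

```typescript
interface RuntimeEnvironment {
  locale: string;
  timeZone: string;
}

// Reads the locale and time zone that the current runtime actually resolved.
function actualEnvironment(): RuntimeEnvironment {
  const resolved = Intl.DateTimeFormat().resolvedOptions();
  return { locale: resolved.locale, timeZone: resolved.timeZone };
}

// Returns one human-readable message per setting that differs.
function environmentMismatches(
  expected: RuntimeEnvironment,
  actual: RuntimeEnvironment
): string[] {
  const problems: string[] = [];
  if (expected.locale !== actual.locale) {
    problems.push("locale: expected " + expected.locale + ", got " + actual.locale);
  }
  if (expected.timeZone !== actual.timeZone) {
    problems.push("timeZone: expected " + expected.timeZone + ", got " + actual.timeZone);
  }
  return problems;
}
```

Running a check like this as the first test in a suite separates "the runner is configured differently" from "the feature is broken" before any feature test is written.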
|
|
1230
|
+
|
|
1231
|
+
### Rule 2: Validate integration points immediately after scaffolding
|
|
1232
|
+
|
|
1233
|
+
**Problem:** A feature is fully implemented across all layers, but integration testing reveals a fundamental configuration error that should have been caught in Phase 2.
|
|
1234
|
+
|
|
1235
|
+
When multiple systems interact (framework + middleware + routing + DB + auth + testing), the most expensive bugs occur at the **boundaries**, not within individual systems.
|
|
1236
|
+
|
|
1237
|
+
**Rule:** After scaffolding (Phase 2), **immediately test one end-to-end roundtrip** before building features on top. Do not wait until Phase 5 (E2E).
|
|
1238
|
+
|
|
1239
|
+
**Post-scaffolding checklist:**
|
|
1240
|
+
|
|
1241
|
+
- [ ] Routing: Are all planned routes accessible? (not 404)
|
|
1242
|
+
- [ ] Middleware: Does it fire correctly on all paths?
|
|
1243
|
+
- [ ] Auth: Does the login/logout flow work end-to-end?
|
|
1244
|
+
- [ ] DB: Does the migration run, and is the seed data readable?
|
|
1245
|
+
- [ ] API: Does one request/response roundtrip succeed?
|
|
1246
|
+
- [ ] Build: Does the production build complete without errors?
|
|
1247
|
+
|
|
1248
|
+
### Rule 3: Read diagnostics before attempting fixes
|
|
1249
|
+
|
|
1250
|
+
**Problem:** Test fails -> immediately adjust code/test -> still broken after 5 iterations.
|
|
1251
|
+
|
|
1252
|
+
Every test framework produces detailed diagnostic output: error context files, stack traces, screenshots, accessibility trees, log files. **These outputs almost always contain the answer** -- but they are often skipped in favor of trial-and-error.
|
|
1253
|
+
|
|
1254
|
+
**Rule:** On test failure, **always read the complete diagnostics first** (logs, traces, screenshots, error context) before changing any code. Diagnostic analysis saves an average of 3-5 debug iterations.
|
|
1255
|
+
|
|
1256
|
+
**Diagnostic priority:**
|
|
1257
|
+
|
|
1258
|
+
1. Error message + stack trace (what exactly failed?)
|
|
1259
|
+
2. Framework-specific outputs (Playwright: error context, pytest: captured output, etc.)
|
|
1260
|
+
3. Application logs (server logs, browser console)
|
|
1261
|
+
4. Screenshots / DOM snapshots (visual state)
|
|
1262
|
+
|
|
1263
|
+
### Rule 4: Manual verification is not the same as automated testing
|
|
1264
|
+
|
|
1265
|
+
**Problem:** Feature manually tested -> works. Automated test -> fails. "But it works!"
|
|
1266
|
+
|
|
1267
|
+
Manual tests and automated tests run in **different execution contexts**. A human waits intuitively; a test does not. The browser has stored cookies; the test runner starts clean. The IDE forwards ports; the CI runner does not.
|
|
1268
|
+
|
|
1269
|
+
**Rule:** "Manually tested" does not count as verification. Only a **passing automated test** confirms the feature works. When manual and automated results disagree, the cause is almost always in the **execution context**, not the code.
|
|
1270
|
+
|
|
1271
|
+
**Common context differences:**
|
|
1272
|
+
|
|
1273
|
+
- Timing: Humans wait intuitively; tests have fixed timeouts
|
|
1274
|
+
- State: Browser has caches/cookies; test runner starts clean
|
|
1275
|
+
- Locale/timezone: System settings vs. test runner defaults
|
|
1276
|
+
- Network: Local ports vs. container network
|
|
1277
|
+
- Permissions: Dev user vs. service account
|
|
1278
|
+
|
|
1279
|
+
### Rule 5: Stateful dependencies cause flaky tests
|
|
1280
|
+
|
|
1281
|
+
**Problem:** Test passes the first time, fails the second -- with no code changes.
|
|
1282
|
+
|
|
1283
|
+
Any stateful component (rate limiter, in-memory cache, queue, session store, connection pool) retains state between test runs. When tests modify this state without resetting it, subsequent tests become unreliable.
|
|
1284
|
+
|
|
1285
|
+
**Rule:** Identify stateful dependencies and reset them per test run. Alternatively, **guarantee isolation** from stateful services in tests (a dedicated instance, a reset endpoint, or mocking).
|
|
1286
|
+
|
|
1287
|
+
**Common candidates:**
|
|
1288
|
+
|
|
1289
|
+
- Rate limiters (in-memory maps, Redis keys)
|
|
1290
|
+
- Caches (CDN, application cache, browser cache)
|
|
1291
|
+
- Queues (unprocessed messages from previous runs)
|
|
1292
|
+
- Database state (seed data, auto-increment counters)
|
|
1293
|
+
- File system (temp files, upload directories)
|
|
1294
|
+
- External APIs (sandbox limits, quotas)
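
The rate limiter case is a useful illustration because it fails in exactly the way described above: the second run inherits counters from the first. The sketch below is a deliberately simplified limiter (no time window) with an explicit reset hook. The class and method names are assumptions for illustration, not code from a real project.

```typescript
// Simplified in-memory limiter: counts requests per key, with no time window.
class InMemoryRateLimiter {
  private counts: Record<string, number> = Object.create(null);
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Returns true if the request is allowed, false once the limit is reached.
  allow(key: string): boolean {
    const used = this.counts[key] || 0;
    if (used >= this.limit) return false;
    this.counts[key] = used + 1;
    return true;
  }

  // Test hook: drop all counters so each test starts from a clean state.
  reset(): void {
    this.counts = Object.create(null);
  }
}
```

Calling `reset()` in a per-test setup step (or creating a fresh instance per test) removes the dependency on test order. Without it, a test that exhausts the limit makes every later test against the same key fail.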
|
|
1295
|
+
|
|
1296
|
+
### Rule 6: Use precise identifiers from the start
|
|
1297
|
+
|
|
1298
|
+
**Problem:** Tests use vague selectors ("find text X") -> break on every UI change because X appears in multiple places.
|
|
1299
|
+
|
|
1300
|
+
Vague selectors (`getByText("Save")`, `querySelector(".btn")`, `find_element_by_class_name`) are fragile. As soon as the page or app changes, they match too many elements or the wrong ones.
|
|
1301
|
+
|
|
1302
|
+
**Rule:** Write tests with **scoped, specific identifiers** from the beginning:
|
|
1303
|
+
|
|
1304
|
+
- Element IDs or data-testid attributes
|
|
1305
|
+
- Scoped queries (search within a container, not globally)
|
|
1306
|
+
- Exact matches instead of partial/regex when possible
|
|
1307
|
+
- `.first()` / `[0]` only as a last resort -- prefer refining the selector
|
|
1308
|
+
|
|
1309
|
+
### Autonomous Debug Loop
|
|
1310
|
+
|
|
1311
|
+
When a test fails during autonomous development, follow this sequence:
|
|
1312
|
+
|
|
1313
|
+
```
|
|
1314
|
+
Test fails
|
|
1315
|
+
|
|
|
1316
|
+
v
|
|
1317
|
+
1. STOP -- Do not attempt a fix immediately
|
|
1318
|
+
|
|
|
1319
|
+
v
|
|
1320
|
+
2. Read diagnostics (error context, logs, traces, screenshots)
|
|
1321
|
+
|
|
|
1322
|
+
v
|
|
1323
|
+
3. Check execution context (locale, env vars, state, timing)
|
|
1324
|
+
|
|
|
1325
|
+
v
|
|
1326
|
+
4. Identify root cause (config? code? test? infrastructure?)
|
|
1327
|
+
|
|
|
1328
|
+
v
|
|
1329
|
+
5. Apply a targeted fix (one change per iteration)
|
|
1330
|
+
|
|
|
1331
|
+
v
|
|
1332
|
+
6. Re-run the test
|
|
1333
|
+
```
|
|
1334
|
+
|
|
1335
|
+
---
|
|
1336
|
+
|
|
1337
|
+
## 12. Available Tools -- Reference
|
|
1338
|
+
|
|
1339
|
+
### Commands (usable in prompts)
|
|
1340
|
+
|
|
1341
|
+
| Command | Phase | Function |
|
|
1342
|
+
| ----------------- | -------------- | ---------------------------------------- |
|
|
1343
|
+
| `/plan` | Start | Analysis + plan + **waits for approval** |
|
|
1344
|
+
| `/tdd` | Implementation | Tests first -> code -> refactor |
|
|
1345
|
+
| `/verify` | QA | Build + types + lint + tests |
|
|
1346
|
+
| `/e2e` | QA | Playwright E2E tests |
|
|
1347
|
+
| `/code-review` | Review | Security + quality audit |
|
|
1348
|
+
| `/build-fix` | Debugging | Incremental error resolution |
|
|
1349
|
+
| `/refactor-clean` | Cleanup | Remove dead code |
|
|
1350
|
+
| `/checkpoint` | Safety | Create a restore point |
|
|
1351
|
+
|
|
1352
|
+
### MCP Servers (automatically available)
|
|
1353
|
+
|
|
1354
|
+
| Server | Function |
|
|
1355
|
+
| ----------------------- | ------------------------------------------------ |
|
|
1356
|
+
| **Supabase** | Migrations, SQL, types, Edge Functions, branches |
|
|
1357
|
+
| **Playwright** | Browser automation, E2E testing |
|
|
1358
|
+
| **Sequential Thinking** | Structured reasoning |
|
|
1359
|
+
|
|
1360
|
+
### Specialized Agents (automatically selected)
|
|
1361
|
+
|
|
1362
|
+
| Agent | Use Case |
|
|
1363
|
+
| ------------------- | ------------------------------- |
|
|
1364
|
+
| fullstack-developer | Features spanning all layers |
|
|
1365
|
+
| postgres-pro | DB schema, queries, performance |
|
|
1366
|
+
| security-engineer | Auth, RLS, vulnerabilities |
|
|
1367
|
+
| test-automator | Test strategy, coverage |
|
|
1368
|
+
| ui-designer | Component design, UX |
|