get-shit-done-cc 1.9.11 → 1.10.0-experimental.0

Files changed (61)
  1. package/README.md +10 -9
  2. package/agents/design-specialist.md +222 -0
  3. package/agents/gsd-executor.md +37 -375
  4. package/agents/gsd-planner.md +15 -108
  5. package/bin/install.js +92 -5
  6. package/commands/gsd/autopilot.md +518 -0
  7. package/commands/gsd/checkpoints.md +229 -0
  8. package/commands/gsd/design-system.md +70 -0
  9. package/commands/gsd/discuss-design.md +77 -0
  10. package/commands/gsd/extend.md +80 -0
  11. package/commands/gsd/help.md +46 -17
  12. package/commands/gsd/new-project.md +94 -8
  13. package/commands/gsd/plan-phase.md +35 -5
  14. package/get-shit-done/references/ccr-integration.md +468 -0
  15. package/get-shit-done/references/checkpoint-execution.md +369 -0
  16. package/get-shit-done/references/checkpoint-types.md +728 -0
  17. package/get-shit-done/references/deviation-rules.md +215 -0
  18. package/get-shit-done/references/framework-patterns.md +543 -0
  19. package/get-shit-done/references/ui-principles.md +258 -0
  20. package/get-shit-done/references/verification-patterns.md +1 -1
  21. package/get-shit-done/skills/gsd-extend/SKILL.md +154 -0
  22. package/get-shit-done/skills/gsd-extend/references/agent-structure.md +305 -0
  23. package/get-shit-done/skills/gsd-extend/references/extension-anatomy.md +123 -0
  24. package/get-shit-done/skills/gsd-extend/references/reference-structure.md +408 -0
  25. package/get-shit-done/skills/gsd-extend/references/template-structure.md +370 -0
  26. package/get-shit-done/skills/gsd-extend/references/validation-rules.md +140 -0
  27. package/get-shit-done/skills/gsd-extend/references/workflow-structure.md +253 -0
  28. package/get-shit-done/skills/gsd-extend/templates/agent-template.md +234 -0
  29. package/get-shit-done/skills/gsd-extend/templates/reference-template.md +239 -0
  30. package/get-shit-done/skills/gsd-extend/templates/workflow-template.md +169 -0
  31. package/get-shit-done/skills/gsd-extend/workflows/create-approach.md +332 -0
  32. package/get-shit-done/skills/gsd-extend/workflows/list-extensions.md +133 -0
  33. package/get-shit-done/skills/gsd-extend/workflows/remove-extension.md +93 -0
  34. package/get-shit-done/skills/gsd-extend/workflows/validate-extension.md +184 -0
  35. package/get-shit-done/templates/autopilot-script-simple.sh +181 -0
  36. package/get-shit-done/templates/autopilot-script.sh +1142 -0
  37. package/get-shit-done/templates/autopilot-script.sh.backup +1142 -0
  38. package/get-shit-done/templates/design-system.md +238 -0
  39. package/get-shit-done/templates/phase-design.md +205 -0
  40. package/get-shit-done/templates/phase-models-template.json +71 -0
  41. package/get-shit-done/templates/phase-prompt.md +4 -4
  42. package/get-shit-done/templates/state.md +37 -0
  43. package/get-shit-done/tui/App.tsx +169 -0
  44. package/get-shit-done/tui/README.md +107 -0
  45. package/get-shit-done/tui/build.js +37 -0
  46. package/get-shit-done/tui/components/ActivityFeed.tsx +126 -0
  47. package/get-shit-done/tui/components/PhaseCard.tsx +86 -0
  48. package/get-shit-done/tui/components/StatsBar.tsx +147 -0
  49. package/get-shit-done/tui/dist/index.js +387 -0
  50. package/get-shit-done/tui/index.tsx +12 -0
  51. package/get-shit-done/tui/package-lock.json +1074 -0
  52. package/get-shit-done/tui/package.json +22 -0
  53. package/get-shit-done/tui/utils/pipeReader.ts +129 -0
  54. package/get-shit-done/workflows/design-system.md +245 -0
  55. package/get-shit-done/workflows/discuss-design.md +330 -0
  56. package/get-shit-done/workflows/execute-phase.md +44 -1
  57. package/get-shit-done/workflows/execute-plan-auth.md +122 -0
  58. package/get-shit-done/workflows/execute-plan-checkpoints.md +541 -0
  59. package/get-shit-done/workflows/execute-plan.md +34 -856
  60. package/package.json +8 -3
  61. package/commands/gsd/whats-new.md +0 -124
@@ -0,0 +1,728 @@
1
+ # Checkpoint Types Reference
2
+
3
+ Reference for planning checkpoints in GSD plans. Covers types, structures, writing guidelines, and examples.
4
+
5
+ <overview>
6
+ Plans execute autonomously. Checkpoints formalize the interaction points where human verification or decisions are needed.
7
+
8
+ **Core principle:** Claude automates everything with CLI/API. Checkpoints are for verification and decisions, not manual work.
9
+
10
+ **Golden rules:**
11
+ 1. **If Claude can run it, Claude runs it** - Never ask user to execute CLI commands, start servers, or run builds
12
+ 2. **Claude sets up the verification environment** - Start dev servers, seed databases, configure env vars
13
+ 3. **User only does what requires human judgment** - Visual checks, UX evaluation, "does this feel right?"
14
+ 4. **Secrets come from user, automation comes from Claude** - Ask for API keys, then Claude uses them via CLI
15
+ </overview>
16
+
17
+ <checkpoint_types>
18
+
19
+ <type name="human-verify">
20
+ ## checkpoint:human-verify (Most Common - 90%)
21
+
22
+ **When:** Claude has completed automated work and a human confirms it works correctly.
23
+
24
+ **Use for:**
25
+ - Visual UI checks (layout, styling, responsiveness)
26
+ - Interactive flows (click through wizard, test user flows)
27
+ - Functional verification (feature works as expected)
28
+ - Audio/video playback quality
29
+ - Animation smoothness
30
+ - Accessibility testing
31
+
32
+ **Structure:**
33
+ ```xml
34
+ <task type="checkpoint:human-verify" gate="blocking">
35
+ <what-built>[What Claude automated and deployed/built]</what-built>
36
+ <how-to-verify>
37
+ [Exact steps to test - URLs, commands, expected behavior]
38
+ </how-to-verify>
39
+ <resume-signal>[How to continue - "approved", "yes", or describe issues]</resume-signal>
40
+ </task>
41
+ ```
42
+
43
+ **Key elements:**
44
+ - `<what-built>`: What Claude automated (deployed, built, configured)
45
+ - `<how-to-verify>`: Exact steps to confirm it works (numbered, specific)
46
+ - `<resume-signal>`: Clear indication of how to continue
47
+
48
+ **Example: Vercel Deployment**
49
+ ```xml
50
+ <task type="auto">
51
+ <name>Deploy to Vercel</name>
52
+ <files>.vercel/, vercel.json</files>
53
+ <action>Run `vercel --yes` to create project and deploy. Capture deployment URL from output.</action>
54
+ <verify>vercel ls shows deployment, curl {url} returns 200</verify>
55
+ <done>App deployed, URL captured</done>
56
+ </task>
57
+
58
+ <task type="checkpoint:human-verify" gate="blocking">
59
+ <what-built>Deployed to Vercel at https://myapp-abc123.vercel.app</what-built>
60
+ <how-to-verify>
61
+ Visit https://myapp-abc123.vercel.app and confirm:
62
+ - Homepage loads without errors
63
+ - Login form is visible
64
+ - No console errors in browser DevTools
65
+ </how-to-verify>
66
+ <resume-signal>Type "approved" to continue, or describe issues to fix</resume-signal>
67
+ </task>
68
+ ```
69
+
70
+ **Example: UI Component**
71
+ ```xml
72
+ <task type="auto">
73
+ <name>Build responsive dashboard layout</name>
74
+ <files>src/components/Dashboard.tsx, src/app/dashboard/page.tsx</files>
75
+ <action>Create dashboard with sidebar, header, and content area. Use Tailwind responsive classes for mobile.</action>
76
+ <verify>npm run build succeeds, no TypeScript errors</verify>
77
+ <done>Dashboard component builds without errors</done>
78
+ </task>
79
+
80
+ <task type="auto">
81
+ <name>Start dev server for verification</name>
82
+ <action>Run `npm run dev` in background, wait for "ready" message, capture port</action>
83
+ <verify>curl http://localhost:3000 returns 200</verify>
84
+ <done>Dev server running at http://localhost:3000</done>
85
+ </task>
86
+
87
+ <task type="checkpoint:human-verify" gate="blocking">
88
+ <what-built>Responsive dashboard layout - dev server running at http://localhost:3000</what-built>
89
+ <how-to-verify>
90
+ Visit http://localhost:3000/dashboard and verify:
91
+ 1. Desktop (>1024px): Sidebar left, content right, header top
92
+ 2. Tablet (768px): Sidebar collapses to hamburger menu
93
+ 3. Mobile (375px): Single column layout, bottom nav appears
94
+ 4. No layout shift or horizontal scroll at any size
95
+ </how-to-verify>
96
+ <resume-signal>Type "approved" or describe layout issues</resume-signal>
97
+ </task>
98
+ ```
99
+
100
+ **Key pattern:** Claude starts the dev server BEFORE the checkpoint. User only needs to visit the URL.
101
+
102
+ **Example: Xcode Build**
103
+ ```xml
104
+ <task type="auto">
105
+ <name>Build macOS app with Xcode</name>
106
+ <files>App.xcodeproj, Sources/</files>
107
+ <action>Run `xcodebuild -project App.xcodeproj -scheme App build`. Check for compilation errors in output.</action>
108
+ <verify>Build output contains "BUILD SUCCEEDED", no errors</verify>
109
+ <done>App builds successfully</done>
110
+ </task>
111
+
112
+ <task type="checkpoint:human-verify" gate="blocking">
113
+ <what-built>Built macOS app at DerivedData/Build/Products/Debug/App.app</what-built>
114
+ <how-to-verify>
115
+ Open App.app and test:
116
+ - App launches without crashes
117
+ - Menu bar icon appears
118
+ - Preferences window opens correctly
119
+ - No visual glitches or layout issues
120
+ </how-to-verify>
121
+ <resume-signal>Type "approved" or describe issues</resume-signal>
122
+ </task>
123
+ ```
124
+ </type>
125
+
126
+ <type name="decision">
127
+ ## checkpoint:decision (9%)
128
+
129
+ **When:** A human must make a choice that affects implementation direction.
130
+
131
+ **Use for:**
132
+ - Technology selection (which auth provider, which database)
133
+ - Architecture decisions (monorepo vs separate repos)
134
+ - Design choices (color scheme, layout approach)
135
+ - Feature prioritization (which variant to build)
136
+ - Data model decisions (schema structure)
137
+
138
+ **Structure:**
139
+ ```xml
140
+ <task type="checkpoint:decision" gate="blocking">
141
+ <decision>[What's being decided]</decision>
142
+ <context>[Why this decision matters]</context>
143
+ <options>
144
+ <option id="option-a">
145
+ <name>[Option name]</name>
146
+ <pros>[Benefits]</pros>
147
+ <cons>[Tradeoffs]</cons>
148
+ </option>
149
+ <option id="option-b">
150
+ <name>[Option name]</name>
151
+ <pros>[Benefits]</pros>
152
+ <cons>[Tradeoffs]</cons>
153
+ </option>
154
+ </options>
155
+ <resume-signal>[How to indicate choice]</resume-signal>
156
+ </task>
157
+ ```
158
+
159
+ **Key elements:**
160
+ - `<decision>`: What's being decided
161
+ - `<context>`: Why this matters
162
+ - `<options>`: Each option with balanced pros/cons (not prescriptive)
163
+ - `<resume-signal>`: How to indicate choice
164
+
165
+ **Example: Auth Provider Selection**
166
+ ```xml
167
+ <task type="checkpoint:decision" gate="blocking">
168
+ <decision>Select authentication provider</decision>
169
+ <context>
170
+ Need user authentication for the app. Three solid options with different tradeoffs.
171
+ </context>
172
+ <options>
173
+ <option id="supabase">
174
+ <name>Supabase Auth</name>
175
+ <pros>Built-in with Supabase DB we're using, generous free tier, row-level security integration</pros>
176
+ <cons>Less customizable UI, tied to Supabase ecosystem</cons>
177
+ </option>
178
+ <option id="clerk">
179
+ <name>Clerk</name>
180
+ <pros>Beautiful pre-built UI, best developer experience, excellent docs</pros>
181
+ <cons>Paid after 10k MAU, vendor lock-in</cons>
182
+ </option>
183
+ <option id="nextauth">
184
+ <name>NextAuth.js</name>
185
+ <pros>Free, self-hosted, maximum control, widely adopted</pros>
186
+ <cons>More setup work, you manage security updates, UI is DIY</cons>
187
+ </option>
188
+ </options>
189
+ <resume-signal>Select: supabase, clerk, or nextauth</resume-signal>
190
+ </task>
191
+ ```
192
+
193
+ **Example: Database Selection**
194
+ ```xml
195
+ <task type="checkpoint:decision" gate="blocking">
196
+ <decision>Select database for user data</decision>
197
+ <context>
198
+ App needs persistent storage for users, sessions, and user-generated content.
199
+ Expected scale: 10k users, 1M records first year.
200
+ </context>
201
+ <options>
202
+ <option id="supabase">
203
+ <name>Supabase (Postgres)</name>
204
+ <pros>Full SQL, generous free tier, built-in auth, real-time subscriptions</pros>
205
+ <cons>Vendor lock-in for real-time features, less flexible than raw Postgres</cons>
206
+ </option>
207
+ <option id="planetscale">
208
+ <name>PlanetScale (MySQL)</name>
209
+ <pros>Serverless scaling, branching workflow, excellent DX</pros>
210
+ <cons>MySQL not Postgres, no foreign keys in free tier</cons>
211
+ </option>
212
+ <option id="convex">
213
+ <name>Convex</name>
214
+ <pros>Real-time by default, TypeScript-native, automatic caching</pros>
215
+ <cons>Newer platform, different mental model, less SQL flexibility</cons>
216
+ </option>
217
+ </options>
218
+ <resume-signal>Select: supabase, planetscale, or convex</resume-signal>
219
+ </task>
220
+ ```
221
+ </type>
222
+
223
+ <type name="human-action">
224
+ ## checkpoint:human-action (1% - Rare)
225
+
226
+ **When:** Action has NO CLI/API and requires human-only interaction, OR Claude hit an authentication gate during automation.
227
+
228
+ **Use ONLY for:**
229
+ - **Authentication gates** - Claude tried to use CLI/API but needs credentials to continue (this is NOT a failure)
230
+ - Email verification links (account creation requires clicking email)
231
+ - SMS 2FA codes (phone verification)
232
+ - Manual account approvals (platform requires human review before API access)
233
+ - Credit card 3D Secure flows (web-based payment authorization)
234
+ - OAuth app approvals (some platforms require web-based approval)
235
+
236
+ **Do NOT use for pre-planned manual work:**
237
+ - Manually deploying to Vercel (use `vercel` CLI - auth gate if needed)
238
+ - Manually creating Stripe webhooks (use Stripe API - auth gate if needed)
239
+ - Manually creating databases (use provider CLI - auth gate if needed)
240
+ - Running builds/tests manually (use Bash tool)
241
+ - Creating files manually (use Write tool)
242
+
243
+ **Structure:**
244
+ ```xml
245
+ <task type="checkpoint:human-action" gate="blocking">
246
+ <action>[What human must do - Claude already did everything automatable]</action>
247
+ <instructions>
248
+ [What Claude already automated]
249
+ [The ONE thing requiring human action]
250
+ </instructions>
251
+ <verification>[What Claude can check afterward]</verification>
252
+ <resume-signal>[How to continue]</resume-signal>
253
+ </task>
254
+ ```
255
+
256
+ **Key principle:** Claude automates EVERYTHING possible first, only asks human for the truly unavoidable manual step.
257
+
258
+ **Example: Email Verification**
259
+ ```xml
260
+ <task type="auto">
261
+ <name>Create SendGrid account via API</name>
262
+ <action>Use SendGrid API to create subuser account with provided email. Request verification email.</action>
263
+ <verify>API returns 201, account created</verify>
264
+ <done>Account created, verification email sent</done>
265
+ </task>
266
+
267
+ <task type="checkpoint:human-action" gate="blocking">
268
+ <action>Complete email verification for SendGrid account</action>
269
+ <instructions>
270
+ I created the account and requested a verification email.
271
+ Check your inbox for the SendGrid verification link and click it.
272
+ </instructions>
273
+ <verification>SendGrid API key works: curl test succeeds</verification>
274
+ <resume-signal>Type "done" when email verified</resume-signal>
275
+ </task>
276
+ ```
277
+
278
+ **Example: Credit Card 3D Secure**
279
+ ```xml
280
+ <task type="auto">
281
+ <name>Create Stripe payment intent</name>
282
+ <action>Use Stripe API to create payment intent for $99. Generate checkout URL.</action>
283
+ <verify>Stripe API returns payment intent ID and URL</verify>
284
+ <done>Payment intent created</done>
285
+ </task>
286
+
287
+ <task type="checkpoint:human-action" gate="blocking">
288
+ <action>Complete 3D Secure authentication</action>
289
+ <instructions>
290
+ I created the payment intent: https://checkout.stripe.com/pay/cs_test_abc123
291
+ Visit that URL and complete the 3D Secure verification flow with your test card.
292
+ </instructions>
293
+ <verification>Stripe webhook receives payment_intent.succeeded event</verification>
294
+ <resume-signal>Type "done" when payment completes</resume-signal>
295
+ </task>
296
+ ```
297
+
298
+ **Example: Authentication Gate (Dynamic Checkpoint)**
299
+ ```xml
300
+ <task type="auto">
301
+ <name>Deploy to Vercel</name>
302
+ <files>.vercel/, vercel.json</files>
303
+ <action>Run `vercel --yes` to deploy</action>
304
+ <verify>vercel ls shows deployment, curl returns 200</verify>
305
+ </task>
306
+
307
+ <!-- If vercel returns "Error: Not authenticated", Claude creates checkpoint on the fly -->
308
+
309
+ <task type="checkpoint:human-action" gate="blocking">
310
+ <action>Authenticate Vercel CLI so I can continue deployment</action>
311
+ <instructions>
312
+ I tried to deploy but got an authentication error.
313
+ Run: vercel login
314
+ This will open your browser - complete the authentication flow.
315
+ </instructions>
316
+ <verification>vercel whoami returns your account email</verification>
317
+ <resume-signal>Type "done" when authenticated</resume-signal>
318
+ </task>
319
+
320
+ <!-- After authentication, Claude retries the deployment -->
321
+
322
+ <task type="auto">
323
+ <name>Retry Vercel deployment</name>
324
+ <action>Run `vercel --yes` (now authenticated)</action>
325
+ <verify>vercel ls shows deployment, curl returns 200</verify>
326
+ </task>
327
+ ```
328
+
329
+ **Key distinction:** Authentication gates are created dynamically when Claude encounters auth errors during automation. They're NOT pre-planned - Claude tries to automate first, only asks for credentials when blocked.
330
+ </type>
331
+ </checkpoint_types>
332
+
333
+ <writing_guidelines>
334
+
335
+ **DO:**
336
+ - Automate everything with CLI/API before checkpoint
337
+ - Be specific: "Visit https://myapp.vercel.app" not "check deployment"
338
+ - Number verification steps: easier to follow
339
+ - State expected outcomes: "You should see X"
340
+ - Provide context: why this checkpoint exists
341
+ - Make verification executable: clear, testable steps
342
+
343
+ **DON'T:**
344
+ - Ask human to do work Claude can automate (deploy, create resources, run builds)
345
+ - Assume knowledge: "Configure the usual settings" ❌
346
+ - Skip steps: "Set up database" ❌ (too vague)
347
+ - Mix multiple verifications in one checkpoint (split them)
348
+ - Write verification that cannot be performed (for example, Claude can't check visual appearance without user confirmation)
349
+
350
+ **Placement:**
351
+ - **After automation completes** - not before Claude does the work
352
+ - **After UI buildout** - before declaring phase complete
353
+ - **Before dependent work** - decisions before implementation
354
+ - **At integration points** - after configuring external services
355
+
356
+ **Bad placement:**
357
+ - Before Claude automates (asking human to do automatable work) ❌
358
+ - Too frequent (every other task is a checkpoint) ❌
359
+ - Too late (checkpoint is last task, but earlier tasks needed its result) ❌
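
The "before dependent work" and "too late" rules can be illustrated with a task ordering. The fragment below is a minimal sketch rather than part of the package: it reuses the auth provider decision from earlier in this reference, trimmed to two options, and the task names and wording are illustrative assumptions.

```xml
<!-- Decision first: the tasks that follow depend on the answer -->
<task type="checkpoint:decision" gate="blocking">
  <decision>Select authentication provider</decision>
  <context>The auth routes differ per provider, so this must be settled before implementation.</context>
  <options>
    <option id="clerk">
      <name>Clerk</name>
      <pros>Pre-built UI</pros>
      <cons>Paid after 10k MAU</cons>
    </option>
    <option id="nextauth">
      <name>NextAuth.js</name>
      <pros>Free, self-hosted</pros>
      <cons>More setup work</cons>
    </option>
  </options>
  <resume-signal>Select: clerk or nextauth</resume-signal>
</task>

<!-- Dependent work comes only after the decision -->
<task type="auto">
  <name>Implement auth with the selected provider</name>
  <action>Set up the provider chosen in the previous checkpoint</action>
  <verify>npm run build succeeds</verify>
</task>
```

Placing the decision after the implementation task would force rework, which is the "too late" case described above.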
360
+ </writing_guidelines>
361
+
362
+ <examples>
363
+
364
+ ### Example 1: Deployment Flow (Correct)
365
+
366
+ ```xml
367
+ <!-- Claude automates everything -->
368
+ <task type="auto">
369
+ <name>Deploy to Vercel</name>
370
+ <files>.vercel/, vercel.json, package.json</files>
371
+ <action>
372
+ 1. Run `vercel --yes` to create project and deploy
373
+ 2. Capture deployment URL from output
374
+ 3. Set environment variables with `vercel env add`
375
+ 4. Trigger production deployment with `vercel --prod`
376
+ </action>
377
+ <verify>
378
+ - vercel ls shows deployment
379
+ - curl {url} returns 200
380
+ - Environment variables set correctly
381
+ </verify>
382
+ <done>App deployed to production, URL captured</done>
383
+ </task>
384
+
385
+ <!-- Human verifies visual/functional correctness -->
386
+ <task type="checkpoint:human-verify" gate="blocking">
387
+ <what-built>Deployed to https://myapp.vercel.app</what-built>
388
+ <how-to-verify>
389
+ Visit https://myapp.vercel.app and confirm:
390
+ - Homepage loads correctly
391
+ - All images/assets load
392
+ - Navigation works
393
+ - No console errors
394
+ </how-to-verify>
395
+ <resume-signal>Type "approved" or describe issues</resume-signal>
396
+ </task>
397
+ ```
398
+
399
+ ### Example 2: Database Setup (No Checkpoint Needed)
400
+
401
+ ```xml
402
+ <!-- Claude automates everything -->
403
+ <task type="auto">
404
+ <name>Create Upstash Redis database</name>
405
+ <files>.env</files>
406
+ <action>
407
+ 1. Run `upstash redis create myapp-cache --region us-east-1`
408
+ 2. Capture connection URL from output
409
+ 3. Write to .env: UPSTASH_REDIS_URL={url}
410
+ 4. Verify connection with test command
411
+ </action>
412
+ <verify>
413
+ - upstash redis list shows database
414
+ - .env contains UPSTASH_REDIS_URL
415
+ - Test connection succeeds
416
+ </verify>
417
+ <done>Redis database created and configured</done>
418
+ </task>
419
+
420
+ <!-- NO CHECKPOINT NEEDED - Claude automated everything and verified programmatically -->
421
+ ```
422
+
423
+ ### Example 3: Stripe Webhooks (Correct)
424
+
425
+ ```xml
426
+ <!-- Claude automates everything -->
427
+ <task type="auto">
428
+ <name>Configure Stripe webhooks</name>
429
+ <files>.env, src/app/api/webhooks/route.ts</files>
430
+ <action>
431
+ 1. Use Stripe API to create webhook endpoint pointing to /api/webhooks
432
+ 2. Subscribe to events: payment_intent.succeeded, customer.subscription.updated
433
+ 3. Save webhook signing secret to .env
434
+ 4. Implement webhook handler in route.ts
435
+ </action>
436
+ <verify>
437
+ - Stripe API returns webhook endpoint ID
438
+ - .env contains STRIPE_WEBHOOK_SECRET
439
+ - curl webhook endpoint returns 200
440
+ </verify>
441
+ <done>Stripe webhooks configured and handler implemented</done>
442
+ </task>
443
+
444
+ <!-- Human verifies in Stripe dashboard -->
445
+ <task type="checkpoint:human-verify" gate="blocking">
446
+ <what-built>Stripe webhook configured via API</what-built>
447
+ <how-to-verify>
448
+ Visit Stripe Dashboard > Developers > Webhooks
449
+ Confirm: Endpoint shows https://myapp.com/api/webhooks with correct events
450
+ </how-to-verify>
451
+ <resume-signal>Type "yes" if correct</resume-signal>
452
+ </task>
453
+ ```
454
+
455
+ ### Example 4: Full Auth Flow Verification (Correct)
456
+
457
+ ```xml
458
+ <task type="auto">
459
+ <name>Create user schema</name>
460
+ <files>src/db/schema.ts</files>
461
+ <action>Define User, Session, Account tables with Drizzle ORM</action>
462
+ <verify>npm run db:generate succeeds</verify>
463
+ </task>
464
+
465
+ <task type="auto">
466
+ <name>Create auth API routes</name>
467
+ <files>src/app/api/auth/[...nextauth]/route.ts</files>
468
+ <action>Set up NextAuth with GitHub provider, JWT strategy</action>
469
+ <verify>TypeScript compiles, no errors</verify>
470
+ </task>
471
+
472
+ <task type="auto">
473
+ <name>Create login UI</name>
474
+ <files>src/app/login/page.tsx, src/components/LoginButton.tsx</files>
475
+ <action>Create login page with GitHub OAuth button</action>
476
+ <verify>npm run build succeeds</verify>
477
+ </task>
478
+
479
+ <task type="auto">
480
+ <name>Start dev server for auth testing</name>
481
+ <action>Run `npm run dev` in background, wait for ready signal</action>
482
+ <verify>curl http://localhost:3000 returns 200</verify>
483
+ <done>Dev server running at http://localhost:3000</done>
484
+ </task>
485
+
486
+ <!-- ONE checkpoint at end verifies the complete flow - Claude already started server -->
487
+ <task type="checkpoint:human-verify" gate="blocking">
488
+ <what-built>Complete authentication flow - dev server running at http://localhost:3000</what-built>
489
+ <how-to-verify>
490
+ 1. Visit: http://localhost:3000/login
491
+ 2. Click "Sign in with GitHub"
492
+ 3. Complete GitHub OAuth flow
493
+ 4. Verify: Redirected to /dashboard, user name displayed
494
+ 5. Refresh page: Session persists
495
+ 6. Click logout: Session cleared
496
+ </how-to-verify>
497
+ <resume-signal>Type "approved" or describe issues</resume-signal>
498
+ </task>
499
+ ```
500
+ </examples>
501
+
502
+ <anti_patterns>
503
+
504
+ ### ❌ BAD: Asking user to start dev server
505
+
506
+ ```xml
507
+ <task type="checkpoint:human-verify" gate="blocking">
508
+ <what-built>Dashboard component</what-built>
509
+ <how-to-verify>
510
+ 1. Run: npm run dev
511
+ 2. Visit: http://localhost:3000/dashboard
512
+ 3. Check layout is correct
513
+ </how-to-verify>
514
+ </task>
515
+ ```
516
+
517
+ **Why bad:** Claude can run `npm run dev`. User should only visit URLs, not execute commands.
518
+
519
+ ### ✅ GOOD: Claude starts server, user visits
520
+
521
+ ```xml
522
+ <task type="auto">
523
+ <name>Start dev server</name>
524
+ <action>Run `npm run dev` in background</action>
525
+ <verify>curl localhost:3000 returns 200</verify>
526
+ </task>
527
+
528
+ <task type="checkpoint:human-verify" gate="blocking">
529
+ <what-built>Dashboard at http://localhost:3000/dashboard (server running)</what-built>
530
+ <how-to-verify>
531
+ Visit http://localhost:3000/dashboard and verify:
532
+ 1. Layout matches design
533
+ 2. No console errors
534
+ </how-to-verify>
535
+ </task>
536
+ ```
537
+
538
+ ### ❌ BAD: Asking user to add env vars in dashboard
539
+
540
+ ```xml
541
+ <task type="checkpoint:human-action" gate="blocking">
542
+ <action>Add environment variables to Convex</action>
543
+ <instructions>
544
+ 1. Go to dashboard.convex.dev
545
+ 2. Select your project
546
+ 3. Navigate to Settings → Environment Variables
547
+ 4. Add OPENAI_API_KEY with your key
548
+ </instructions>
549
+ </task>
550
+ ```
551
+
552
+ **Why bad:** Convex has `npx convex env set`. Claude should ask for the key value, then run the CLI command.
553
+
554
+ ### ✅ GOOD: Claude collects secret, adds via CLI
555
+
556
+ ```xml
557
+ <task type="checkpoint:human-action" gate="blocking">
558
+ <action>Provide your OpenAI API key</action>
559
+ <instructions>
560
+ I need your OpenAI API key. Get it from: https://platform.openai.com/api-keys
561
+ Paste the key below (starts with sk-)
562
+ </instructions>
563
+ <verification>I'll configure it via CLI</verification>
564
+ <resume-signal>Paste your key</resume-signal>
565
+ </task>
566
+
567
+ <task type="auto">
568
+ <name>Add OpenAI key to Convex</name>
569
+ <action>Run `npx convex env set OPENAI_API_KEY {key}`</action>
570
+ <verify>`npx convex env list` shows OPENAI_API_KEY configured</verify>
571
+ </task>
572
+ ```
573
+
574
+ ### ❌ BAD: Asking human to deploy
575
+
576
+ ```xml
577
+ <task type="checkpoint:human-action" gate="blocking">
578
+ <action>Deploy to Vercel</action>
579
+ <instructions>
580
+ 1. Visit vercel.com/new
581
+ 2. Import Git repository
582
+ 3. Click Deploy
583
+ 4. Copy deployment URL
584
+ </instructions>
585
+ <verification>Deployment exists</verification>
586
+ <resume-signal>Paste URL</resume-signal>
587
+ </task>
588
+ ```
589
+
590
+ **Why bad:** Vercel has a CLI. Claude should run `vercel --yes`.
591
+
592
+ ### ✅ GOOD: Claude automates, human verifies
593
+
594
+ ```xml
595
+ <task type="auto">
596
+ <name>Deploy to Vercel</name>
597
+ <action>Run `vercel --yes`. Capture URL.</action>
598
+ <verify>vercel ls shows deployment, curl returns 200</verify>
599
+ </task>
600
+
601
+ <task type="checkpoint:human-verify" gate="blocking">
602
+ <what-built>Deployed to {url}</what-built>
603
+ <how-to-verify>Visit {url}, check homepage loads</how-to-verify>
604
+ <resume-signal>Type "approved"</resume-signal>
605
+ </task>
606
+ ```
607
+
608
+ ### ❌ BAD: Too many checkpoints
609
+
610
+ ```xml
611
+ <task type="auto">Create schema</task>
612
+ <task type="checkpoint:human-verify">Check schema</task>
613
+ <task type="auto">Create API route</task>
614
+ <task type="checkpoint:human-verify">Check API</task>
615
+ <task type="auto">Create UI form</task>
616
+ <task type="checkpoint:human-verify">Check form</task>
617
+ ```
618
+
619
+ **Why bad:** Verification fatigue. Combine into one checkpoint at end.
620
+
621
+ ### ✅ GOOD: Single verification checkpoint
622
+
623
+ ```xml
624
+ <task type="auto">Create schema</task>
625
+ <task type="auto">Create API route</task>
626
+ <task type="auto">Create UI form</task>
627
+
628
+ <task type="checkpoint:human-verify" gate="blocking">
629
+ <what-built>Complete auth flow (schema + API + UI)</what-built>
630
+ <how-to-verify>Test full flow: register, login, access protected page</how-to-verify>
631
+ <resume-signal>Type "approved"</resume-signal>
632
+ </task>
633
+ ```
634
+
635
+ ### ❌ BAD: Asking for automatable file operations
636
+
637
+ ```xml
638
+ <task type="checkpoint:human-action">
639
+ <action>Create .env file</action>
640
+ <instructions>
641
+ 1. Create .env in project root
642
+ 2. Add: DATABASE_URL=...
643
+ 3. Add: STRIPE_KEY=...
644
+ </instructions>
645
+ </task>
646
+ ```
647
+
648
+ **Why bad:** Claude has the Write tool. This should be `type="auto"`.
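
A possible automated counterpart is sketched below. It assumes the two values are already available, either captured from earlier tasks or collected from the user as secrets. The task name and the verify/done wording are illustrative, not taken from the package.

```xml
<task type="auto">
  <name>Create .env file</name>
  <files>.env</files>
  <action>Write .env in project root with DATABASE_URL and STRIPE_KEY, using values captured from earlier tasks or provided by the user</action>
  <verify>.env exists and contains DATABASE_URL and STRIPE_KEY</verify>
  <done>.env created and populated</done>
</task>
```

No checkpoint follows, since Claude can confirm the file contents by reading them back.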
649
+
650
+ ### ❌ BAD: Vague verification steps
651
+
652
+ ```xml
653
+ <task type="checkpoint:human-verify">
654
+ <what-built>Dashboard</what-built>
655
+ <how-to-verify>Check it works</how-to-verify>
656
+ <resume-signal>Continue</resume-signal>
657
+ </task>
658
+ ```
659
+
660
+ **Why bad:** No specifics. User doesn't know what to test or what "works" means.
661
+
662
+ ### ✅ GOOD: Specific verification steps (server already running)
663
+
664
+ ```xml
665
+ <task type="checkpoint:human-verify" gate="blocking">
666
+ <what-built>Responsive dashboard - server running at http://localhost:3000</what-built>
667
+ <how-to-verify>
668
+ Visit http://localhost:3000/dashboard and verify:
669
+ 1. Desktop (>1024px): Sidebar visible, content area fills remaining space
670
+ 2. Tablet (768px): Sidebar collapses to icons
671
+ 3. Mobile (375px): Sidebar hidden, hamburger menu in header
672
+ 4. No horizontal scroll at any size
673
+ </how-to-verify>
674
+ <resume-signal>Type "approved" or describe layout issues</resume-signal>
675
+ </task>
676
+ ```
677
+
678
+ ### ❌ BAD: Asking user to run any CLI command
679
+
680
+ ```xml
681
+ <task type="checkpoint:human-action">
682
+ <action>Run database migrations</action>
683
+ <instructions>
684
+ 1. Run: npx prisma migrate deploy
685
+ 2. Run: npx prisma db seed
686
+ 3. Verify tables exist
687
+ </instructions>
688
+ </task>
689
+ ```
690
+
691
+ **Why bad:** Claude can run these commands. User should never execute CLI commands.
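
A possible automated counterpart, reusing the two Prisma commands from the bad example. Using `npx prisma migrate status` as the follow-up check is an assumption about how a given project would confirm its migrations, and the task name is illustrative.

```xml
<task type="auto">
  <name>Run database migrations and seed</name>
  <action>
    1. Run `npx prisma migrate deploy`
    2. Run `npx prisma db seed`
  </action>
  <verify>
    - Both commands exit without errors
    - `npx prisma migrate status` reports no pending migrations
  </verify>
  <done>Migrations applied and seed data loaded</done>
</task>
```

If the database requires credentials Claude does not have, that would surface as an authentication gate at run time rather than as a pre-planned manual step.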
692
+
693
+ ### ❌ BAD: Asking user to copy values between services
694
+
695
+ ```xml
696
+ <task type="checkpoint:human-action">
697
+ <action>Configure webhook URL in Stripe</action>
698
+ <instructions>
699
+ 1. Copy the deployment URL from terminal
700
+ 2. Go to Stripe Dashboard → Webhooks
701
+ 3. Add endpoint with URL + /api/webhooks
702
+ 4. Copy webhook signing secret
703
+ 5. Add to .env file
704
+ </instructions>
705
+ </task>
706
+ ```
707
+
708
+ **Why bad:** Stripe has an API. Claude should create the webhook via API and write to .env directly.
709
+
710
+ </anti_patterns>
711
+
712
+ <summary>
713
+
714
+ Checkpoints formalize human-in-the-loop points. Use them when Claude cannot complete a task autonomously OR when human verification is required for correctness.
715
+
716
+ **The golden rule:** If Claude CAN automate it, Claude MUST automate it.
717
+
718
+ **Checkpoint priority:**
719
+ 1. **checkpoint:human-verify** (90% of checkpoints) - Claude automated everything, human confirms visual/functional correctness
720
+ 2. **checkpoint:decision** (9% of checkpoints) - Human makes architectural/technology choices
721
+ 3. **checkpoint:human-action** (1% of checkpoints) - Truly unavoidable manual steps with no API/CLI
722
+
723
+ **When NOT to use checkpoints:**
724
+ - Things Claude can verify programmatically (tests pass, build succeeds)
725
+ - File operations (Claude can read files to verify)
726
+ - Code correctness (use tests and static analysis)
727
+ - Anything automatable via CLI/API
728
+ </summary>