the-grid-cc 1.7.14 → 1.7.16

@@ -0,0 +1,483 @@
1
+ # Grid Persistence - Checkpoint & Resume Flow
2
+
3
+ **Version:** 1.0
4
+ **Visual Guide**
5
+
6
+ This document visualizes the checkpoint and resume flow in The Grid.
7
+
8
+ ## State Lifecycle
9
+
10
+ ```
11
+ ┌─────────────────────────────────────────────────────────────────────┐
12
+ │ GRID STATE LIFECYCLE │
13
+ └─────────────────────────────────────────────────────────────────────┘
14
+
15
+ INITIALIZED ──▶ ACTIVE ──▶ CHECKPOINT ──▶ INTERRUPTED ──▶ RESUMED
16
+ │ │ │ │ │
17
+ │ │ │ │ ▼
18
+ │ │ │ │ ACTIVE
19
+ │ │ │ │ │
20
+ │ │ │ │ ▼
21
+ │ │ │ └──────▶ COMPLETED
22
+ │ │ │
23
+ │ │ └────────────────────────▶ FAILED
24
+ │ │
25
+ │ └──────────────────────────────────▶ COMPLETED
26
+
27
+ └──────────────────────────────────────────────▶ (never started)
28
+ ```
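The lifecycle above can be read as a small state machine. The sketch below is one reading of the diagram, not The Grid's implementation: `Status`, `ALLOWED`, and `can_transition` are hypothetical names, the lowercase status strings for `resumed` and `completed` are assumed, and transitions that only appear in later sections are left out.

```python
from enum import Enum


class Status(Enum):
    INITIALIZED = "initialized"
    ACTIVE = "active"
    CHECKPOINT = "checkpoint"
    INTERRUPTED = "interrupted"
    RESUMED = "resumed"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition table, transcribed from the arrows in the diagram.
ALLOWED = {
    Status.INITIALIZED: {Status.ACTIVE},
    Status.ACTIVE: {Status.CHECKPOINT, Status.COMPLETED},
    Status.CHECKPOINT: {Status.INTERRUPTED, Status.FAILED},
    Status.INTERRUPTED: {Status.RESUMED, Status.COMPLETED},
    Status.RESUMED: {Status.ACTIVE},
    Status.COMPLETED: set(),
    Status.FAILED: set(),
}


def can_transition(current: Status, target: Status) -> bool:
    """Return True if the table has an arrow from current to target."""
    return target in ALLOWED[current]
```

A table like this makes it cheap to reject an unexpected status write before it reaches STATE.md.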
29
+
30
+ ## File Write Timeline
31
+
32
+ ```
33
+ Time ──────────────────────────────────────────────────────────▶
34
+
35
+ /grid:init
36
+
37
+ ├─ STATE.md (status: initialized)
38
+ ├─ WARMTH.md (empty)
39
+ ├─ SCRATCHPAD.md (empty)
40
+ ├─ DECISIONS.md (empty)
41
+ ├─ BLOCKERS.md (empty)
42
+ └─ config.json
43
+
44
+ /grid "build blog"
45
+
46
+ ├─ STATE.md (status: active, cluster: blog)
47
+ └─ plans/ (block plans written)
48
+
49
+ Block 01 execution starts
50
+
51
+ ├─ SCRATCHPAD.md (executor writes discoveries)
52
+ │ ├─ executor-01 | 14:00:00 | PATTERN
53
+ │ │ Finding: "Astro uses .astro extension"
54
+ │ │
55
+ │ ├─ executor-01 | 14:05:00 | PROGRESS
56
+ │ │ Finding: "Thread 1 complete (commit abc123)"
57
+ │ │
58
+ │ └─ executor-01 | 14:10:00 | GOTCHA
59
+ │ Finding: "Config must be .mjs for ESM"
60
+
61
+ Block 01 completes
62
+
63
+ ├─ phases/01-foundation/01-SUMMARY.md (complete block record)
64
+ ├─ WARMTH.md (aggregated from lessons_learned)
65
+ └─ STATE.md (block: 2, progress: 16%)
66
+
67
+ Block 02 execution starts
68
+
69
+ └─ SCRATCHPAD.md (new entries)
70
+
71
+ Thread 02.3 hits checkpoint
72
+
73
+ ├─ CHECKPOINT.md (type: human_verify)
74
+ └─ STATE.md (status: checkpoint)
75
+
76
+ Session dies (context exhaustion)
77
+
78
+ └─ (last STATE.md write: status: checkpoint)
79
+
80
+ /grid:resume (new session)
81
+
82
+ ├─ Read STATE.md (status: checkpoint)
83
+ ├─ Read CHECKPOINT.md (awaiting user)
84
+ ├─ Read WARMTH.md (accumulated knowledge)
85
+ └─ Present checkpoint to user
86
+
87
+ User responds "approved"
88
+
89
+ ├─ CHECKPOINT.md (user_response: approved)
90
+ ├─ STATE.md (status: active)
91
+ └─ Spawn continuation executor
92
+
93
+ Execution continues
94
+
95
+ └─ (normal flow resumes)
96
+ ```
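The SCRATCHPAD.md entries in the timeline follow an `executor | time | KIND` header with a `Finding:` line. A minimal sketch of producing such entries is shown below; the exact on-disk layout is an assumption, and `format_scratchpad_entry` and `append_entry` are hypothetical helpers.

```python
from datetime import time


def format_scratchpad_entry(executor: str, at: time, kind: str, finding: str) -> str:
    """Render one entry in the 'executor | time | KIND' shape from the timeline."""
    header = f"{executor} | {at.strftime('%H:%M:%S')} | {kind.upper()}"
    return f'{header}\nFinding: "{finding}"\n'


def append_entry(scratchpad_text: str, entry: str) -> str:
    """Append an entry, separated from earlier entries by a blank line."""
    if scratchpad_text and not scratchpad_text.endswith("\n"):
        scratchpad_text += "\n"
    separator = "\n" if scratchpad_text else ""
    return scratchpad_text + separator + entry
```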
97
+
98
+ ## Checkpoint Types Flow
99
+
100
+ ### Human Verify Checkpoint
101
+
102
+ ```
103
+ ┌─────────────────────────────────────────────────────────────────────┐
104
+ │ HUMAN VERIFY CHECKPOINT │
105
+ └─────────────────────────────────────────────────────────────────────┘
106
+
107
+ Executor completes thread requiring verification
108
+
109
+ ├─ Writes CHECKPOINT.md
110
+ │ ├─ type: human_verify
111
+ │ ├─ verification_instructions
112
+ │ ├─ expected_behavior
113
+ │ └─ completed_threads (with commits)
114
+
115
+ ├─ Updates STATE.md (status: checkpoint)
116
+
117
+ └─ Returns to MC
118
+
119
+ MC presents checkpoint to user
120
+
121
+ └─ "Please test dark mode and respond 'approved' or describe issues"
122
+
123
+ User tests and responds
124
+
125
+ ├─ "approved" ──▶ Continue from next thread
126
+
127
+ └─ "toggle doesn't persist" ──▶ Fix issue, retry thread
128
+
129
+ Session may die here ──▶ Resume reconstructs this state
130
+ ```
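The two branches of the human verify flow (continue on "approved", otherwise fix and retry) can be sketched as follows. `NextStep` and `handle_verify_response` are hypothetical names, and treating any response other than "approved" as an issue report is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NextStep:
    action: str                     # "continue" or "retry"
    thread: str                     # thread the next executor starts from
    feedback: Optional[str] = None  # user's issue description, if any


def handle_verify_response(response: str, checkpoint_thread: str, next_thread: str) -> NextStep:
    """Map the user's reply at a human_verify checkpoint to the next step."""
    cleaned = response.strip()
    if cleaned.lower() == "approved":
        return NextStep("continue", next_thread)
    # Anything else is treated as a description of what went wrong.
    return NextStep("retry", checkpoint_thread, feedback=cleaned)
```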
131
+
132
+ ### Decision Checkpoint
133
+
134
+ ```
135
+ ┌─────────────────────────────────────────────────────────────────────┐
136
+ │ DECISION CHECKPOINT │
137
+ └─────────────────────────────────────────────────────────────────────┘
138
+
139
+ Executor reaches architectural fork
140
+
141
+ ├─ Writes CHECKPOINT.md
142
+ │ ├─ type: decision
143
+ │ ├─ question: "Deploy to Vercel or Netlify?"
144
+ │ ├─ options: [vercel, netlify]
145
+ │ └─ affects: "Block 01, Block 06"
146
+
147
+ ├─ Updates STATE.md (status: checkpoint)
148
+
149
+ └─ Returns to MC
150
+
151
+ MC presents decision to user
152
+
153
+ └─ Options: 1) Vercel, 2) Netlify
154
+
155
+ User chooses
156
+
157
+ ├─ CHECKPOINT.md (user_response: "vercel")
158
+ ├─ DECISIONS.md (new entry)
159
+ └─ Continue with chosen path
160
+ ```
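Resolving a decision checkpoint amounts to validating the answer against `options` and recording it. The sketch below assumes the checkpoint has already been parsed into a dict with the fields shown above; `resolve_decision` and `decision_entry` are hypothetical, and the DECISIONS.md entry layout is invented for illustration.

```python
def resolve_decision(checkpoint: dict, answer: str) -> dict:
    """Map a user answer (option name or 1-based number) to an updated checkpoint."""
    options = checkpoint["options"]
    cleaned = answer.strip().lower()
    if cleaned.isdigit():
        index = int(cleaned) - 1
        if not 0 <= index < len(options):
            raise ValueError(f"choice {cleaned} is out of range")
        chosen = options[index]
    elif cleaned in options:
        chosen = cleaned
    else:
        raise ValueError(f"unknown option: {answer!r}")
    # Return a copy so the caller decides when to write CHECKPOINT.md.
    return {**checkpoint, "user_response": chosen}


def decision_entry(checkpoint: dict) -> str:
    """Format a DECISIONS.md entry; this layout is an assumption."""
    return (
        f"- question: {checkpoint['question']}\n"
        f"  chosen: {checkpoint['user_response']}\n"
        f"  affects: {checkpoint['affects']}\n"
    )
```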
161
+
162
+ ### Session Death Checkpoint
163
+
164
+ ```
165
+ ┌─────────────────────────────────────────────────────────────────────┐
166
+ │ SESSION DEATH CHECKPOINT │
167
+ └─────────────────────────────────────────────────────────────────────┘
168
+
169
+ Executor detects context approaching exhaustion
170
+
171
+ ├─ Writes CHECKPOINT.md
172
+ │ ├─ type: session_death
173
+ │ ├─ last_action: "Writing localStorage logic"
174
+ │ ├─ partial_work: {files, changes}
175
+ │ └─ completed_threads
176
+
177
+ ├─ Updates STATE.md (status: interrupted)
178
+
179
+ └─ Session ends
180
+
181
+ /grid:resume (new session)
182
+
183
+ ├─ Detects: status: interrupted
184
+ ├─ Reads CHECKPOINT.md (type: session_death)
185
+ ├─ Validates partial_work
186
+
187
+ ├─ If good ──▶ Commit and continue
188
+ └─ If broken ──▶ Rollback and retry thread
189
+ ```
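The "Validates partial_work" step decides between committing and rolling back. The document does not say what validation involves, so the sketch below assumes the simplest possible check: every file listed in `partial_work` still exists. Both function names are hypothetical.

```python
from pathlib import Path


def partial_work_is_intact(partial_work: dict, root: Path) -> bool:
    """Assumed check: every file listed in partial_work exists under root."""
    return all((root / name).is_file() for name in partial_work.get("files", []))


def resume_after_session_death(checkpoint: dict, root: Path) -> str:
    """Pick the branch from the flow above for a session_death checkpoint."""
    if checkpoint.get("type") != "session_death":
        raise ValueError("not a session_death checkpoint")
    if partial_work_is_intact(checkpoint.get("partial_work", {}), root):
        return "commit_and_continue"
    return "rollback_and_retry"
```

A real check would likely also run a build or tests before committing; file existence alone only catches the most obvious breakage.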
190
+
191
+ ### Failure Checkpoint
192
+
193
+ ```
194
+ ┌─────────────────────────────────────────────────────────────────────┐
195
+ │ FAILURE CHECKPOINT │
196
+ └─────────────────────────────────────────────────────────────────────┘
197
+
198
+ Executor encounters unrecoverable error
199
+
200
+ ├─ Writes CHECKPOINT.md
201
+ │ ├─ type: failure
202
+ │ ├─ error: "Vercel API key not found"
203
+ │ ├─ partial_work: {what was done}
204
+ │ └─ completed_threads
205
+
206
+ ├─ Updates STATE.md (status: failed)
207
+
208
+ └─ Returns to MC
209
+
210
+ MC presents failure to user
211
+
212
+ └─ Options:
213
+ 1) Retry (spawn fresh executor)
214
+ 2) Rollback (revert partial work)
215
+ 3) Manual (user fixes, then continue)
216
+ 4) Skip (skip this block)
217
+
218
+ User chooses
219
+
220
+ ├─ Retry ──▶ Spawn executor with failure context
221
+ ├─ Rollback ──▶ git reset --hard, restart block
222
+ ├─ Manual ──▶ Wait for user to fix, then continue
223
+ └─ Skip ──▶ Mark block as skipped, continue to next
224
+ ```
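The four failure options map one-to-one onto follow-up actions, which suggests a simple lookup. This is a sketch only: `FAILURE_OPTIONS`, `NEXT_STEP`, and `choose_failure_action` are hypothetical names, and accepting either the option number or its name is an assumption.

```python
FAILURE_OPTIONS = ["retry", "rollback", "manual", "skip"]

# Follow-up actions, copied from the flow above.
NEXT_STEP = {
    "retry": "spawn executor with failure context",
    "rollback": "git reset --hard, restart block",
    "manual": "wait for user to fix, then continue",
    "skip": "mark block as skipped, continue to next",
}


def choose_failure_action(answer: str):
    """Return (option, follow-up) for an answer given as '1'-'4' or an option name."""
    cleaned = answer.strip().lower()
    if cleaned.isdigit() and 1 <= int(cleaned) <= len(FAILURE_OPTIONS):
        option = FAILURE_OPTIONS[int(cleaned) - 1]
    elif cleaned in NEXT_STEP:
        option = cleaned
    else:
        raise ValueError(f"unknown failure option: {answer!r}")
    return option, NEXT_STEP[option]
```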
225
+
226
+ ## Resume Context Reconstruction
227
+
228
+ ```
229
+ ┌─────────────────────────────────────────────────────────────────────┐
230
+ │ CONTEXT RECONSTRUCTION │
231
+ └─────────────────────────────────────────────────────────────────────┘
232
+
233
+ /grid:resume
234
+
235
+ ├─ Step 1: State Detection
236
+ │ ├─ Read STATE.md
237
+ │ ├─ Check status: checkpoint | interrupted | failed
238
+ │ └─ Determine resume strategy
239
+
240
+ ├─ Step 2: State Validation
241
+ │ ├─ Verify commits exist in git ✓
242
+ │ ├─ Verify claimed files exist ✓
243
+ │ ├─ Check for conflicts ✓
244
+ │ └─ If invalid ──▶ Enter recovery mode
245
+
246
+ ├─ Step 3: Context Reconstruction
247
+ │ ├─ Load WARMTH.md
248
+ │ │ └─ Accumulated knowledge from all prior Programs
249
+ │ │
250
+ │ ├─ Load DECISIONS.md
251
+ │ │ └─ User decisions that affect future work
252
+ │ │
253
+ │ ├─ Load CHECKPOINT.md (if exists)
254
+ │ │ └─ Exact thread state when interrupted
255
+ │ │
256
+ │ ├─ Collect SUMMARY.md files
257
+ │ │ └─ All completed blocks with commits
258
+ │ │
259
+ │ └─ Load pending PLAN.md
260
+ │ └─ Remaining work to be done
261
+
262
+ ├─ Step 4: Build Executor Prompt
263
+ │ │
264
+ │ ├─ <warmth>
265
+ │ │ {accumulated knowledge}
266
+ │ │ </warmth>
267
+ │ │
268
+ │ ├─ <completed_threads>
269
+ │ │ - Thread 02.1: BaseLayout (commit abc123)
270
+ │ │ - Thread 02.2: Header (commit def456)
271
+ │ │ </completed_threads>
272
+ │ │
273
+ │ ├─ <resume_point>
274
+ │ │ Continue from: Thread 02.4
275
+ │ │ User feedback: {if any}
276
+ │ │ </resume_point>
277
+ │ │
278
+ │ └─ <plan>
279
+ │ {remaining threads from PLAN.md}
280
+ │ </plan>
281
+
282
+ └─ Step 5: Spawn Continuation
283
+ ├─ Inject warmth ──▶ Executor has institutional knowledge
284
+ ├─ Provide completed_threads ──▶ Executor knows what's done
285
+ ├─ Set resume_point ──▶ Executor starts from correct thread
286
+ └─ Execute ──▶ Picks up exactly where prior session left off
287
+ ```
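Step 4 assembles the executor prompt from four tagged sections. A minimal sketch of that assembly is below; `build_executor_prompt` and its parameters are hypothetical, and the blank line between sections is an assumption about formatting.

```python
from typing import List, Optional


def build_executor_prompt(
    warmth: str,
    completed_threads: List[str],
    resume_from: str,
    plan: str,
    user_feedback: Optional[str] = None,
) -> str:
    """Assemble the tagged sections from Step 4 into a single prompt string."""
    completed = "\n".join(f"- {line}" for line in completed_threads)
    resume_lines = [f"Continue from: {resume_from}"]
    if user_feedback:
        resume_lines.append(f"User feedback: {user_feedback}")
    sections = [
        ("warmth", warmth.strip()),
        ("completed_threads", completed),
        ("resume_point", "\n".join(resume_lines)),
        ("plan", plan.strip()),
    ]
    return "\n\n".join(f"<{tag}>\n{body}\n</{tag}>" for tag, body in sections)
```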
288
+
289
+ ## Warmth Aggregation Flow
290
+
291
+ ```
292
+ ┌─────────────────────────────────────────────────────────────────────┐
293
+ │ WARMTH AGGREGATION │
294
+ └─────────────────────────────────────────────────────────────────────┘
295
+
296
+ Block 01 completes
297
+
298
+ └─ 01-SUMMARY.md written with lessons_learned:
299
+ ├─ codebase_patterns: ["Astro uses content collections"]
300
+ ├─ gotchas: ["Config must be .mjs"]
301
+ └─ user_preferences: ["Minimal dependencies"]
302
+
303
+ MC aggregates warmth
304
+
305
+ ├─ Read 01-SUMMARY.md lessons_learned
306
+ ├─ Read existing WARMTH.md (if exists)
307
+ ├─ Merge (deduplicate)
308
+ │ ├─ codebase_patterns: +1 new pattern
309
+ │ ├─ gotchas: +1 new gotcha
310
+ │ └─ user_preferences: +1 new preference
311
+ └─ Write updated WARMTH.md
312
+
313
+ Block 02 starts
314
+
315
+ └─ Executor receives WARMTH.md in prompt
316
+ ├─ Knows about content collections
317
+ ├─ Avoids .js config mistake
318
+ └─ Respects minimal dependencies preference
319
+
320
+ Block 02 completes
321
+
322
+ └─ 02-SUMMARY.md written with MORE lessons_learned:
323
+ ├─ codebase_patterns: ["Dark mode uses class strategy"]
324
+ ├─ gotchas: ["Must inline theme script to prevent FOUC"]
325
+ └─ decisions_made: ["Chose dark as default"]
326
+
327
+ MC aggregates warmth again
328
+
329
+ ├─ Read 02-SUMMARY.md lessons_learned
330
+ ├─ Read existing WARMTH.md (has Block 01 warmth)
331
+ ├─ Merge (deduplicate)
332
+ │ ├─ codebase_patterns: +1 (now has 2)
333
+ │ ├─ gotchas: +1 (now has 2)
334
+ │ ├─ user_preferences: (still has 1)
335
+ │ └─ decisions_made: +1 (new category)
336
+ └─ Write updated WARMTH.md
337
+
338
+ Block 03 starts
339
+
340
+ └─ Executor receives WARMTH.md with ALL prior knowledge
341
+ └─ Benefits from all discoveries in Blocks 01 and 02
342
+
343
+ Session dies
344
+
345
+ └─ WARMTH.md persists on disk
346
+
347
+ /grid:resume
348
+
349
+ └─ Continuation executor receives WARMTH.md
350
+ └─ Has full institutional knowledge
351
+ └─ Doesn't repeat mistakes
352
+ └─ Applies learned patterns
353
+ ```
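The "Merge (deduplicate)" step can be sketched as an order-preserving union per category. This assumes WARMTH.md and `lessons_learned` have both been parsed into dicts mapping a category name to a list of strings; `merge_warmth` is a hypothetical name.

```python
from typing import Dict, List


def merge_warmth(
    existing: Dict[str, List[str]], lessons: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Merge new lessons into existing warmth, skipping exact duplicates."""
    merged = {category: list(items) for category, items in existing.items()}
    for category, items in lessons.items():
        bucket = merged.setdefault(category, [])  # new categories are added
        for item in items:
            if item not in bucket:
                bucket.append(item)
    return merged
```

Exact string comparison only removes literal repeats; two differently worded versions of the same gotcha would both be kept.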
354
+
355
+ ## State Validation Flow
356
+
357
+ ```
358
+ ┌─────────────────────────────────────────────────────────────────────┐
359
+ │ STATE VALIDATION │
360
+ └─────────────────────────────────────────────────────────────────────┘
361
+
362
+ /grid:resume
363
+
364
+ └─ Validation Phase
365
+
366
+ Check 1: Commits exist
367
+
368
+ ├─ For each SUMMARY.md:
369
+ │ ├─ Extract commit hashes
370
+ │ └─ Run: git cat-file -t {hash}
371
+
372
+ ├─ All exist ──▶ [PASS]
373
+ └─ Missing ──▶ [FAIL] "Commit abc123 not found"
374
+
375
+ Check 2: Files exist
376
+
377
+ ├─ For each SUMMARY.md:
378
+ │ ├─ Extract artifacts_created paths
379
+ │ └─ Check file_exists(path)
380
+
381
+ ├─ All exist ──▶ [PASS]
382
+ └─ Missing ──▶ [FAIL] "File src/layouts/BaseLayout.astro missing"
383
+
384
+ Check 3: Plans available
385
+
386
+ ├─ For each pending block:
387
+ │ └─ Check: .grid/plans/*-block-{N}.md exists
388
+
389
+ ├─ All exist ──▶ [PASS]
390
+ └─ Missing ──▶ [FAIL] "Plan for block 03 missing"
391
+
392
+ Check 4: Single checkpoint
393
+
394
+ ├─ Count CHECKPOINT*.md files
395
+
396
+ ├─ 0 or 1 ──▶ [PASS]
397
+ └─ Multiple ──▶ [FAIL] "Multiple checkpoint files found"
398
+
399
+ Check 5: State parseable
400
+
401
+ ├─ Try: parse_yaml(STATE.md)
402
+
403
+ ├─ Success ──▶ [PASS]
404
+ └─ Error ──▶ [FAIL] "STATE.md corrupted"
405
+
406
+ Results:
407
+
408
+ ├─ All PASS ──▶ State is valid, continue resume
409
+
410
+ └─ Any FAIL ──▶ Enter recovery mode
411
+ ├─ Attempt reconstruction from git + artifacts
412
+ ├─ Present reconstructed state to user
413
+ └─ User approves or aborts
414
+ ```
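Checks 1, 2, and 4 above are straightforward to sketch with the standard library. The code below is illustrative, not The Grid's validator: all function names and the `grid_dir` parameter are assumptions, and Checks 3 and 5 are omitted because they depend on plan naming and a YAML parser.

```python
import subprocess
from pathlib import Path
from typing import List


def commit_exists(repo: Path, commit_hash: str) -> bool:
    """Check 1: `git cat-file -t` prints 'commit' for an existing commit object."""
    result = subprocess.run(
        ["git", "cat-file", "-t", commit_hash],
        cwd=repo, capture_output=True, text=True,
    )
    return result.returncode == 0 and result.stdout.strip() == "commit"


def missing_files(root: Path, artifacts: List[str]) -> List[str]:
    """Check 2: return the artifact paths that do not exist under root."""
    return [path for path in artifacts if not (root / path).exists()]


def single_checkpoint(grid_dir: Path) -> bool:
    """Check 4: pass when zero or one CHECKPOINT*.md file is present."""
    return len(list(grid_dir.glob("CHECKPOINT*.md"))) <= 1


def validate(root: Path, grid_dir: Path, commits: List[str], artifacts: List[str]) -> List[str]:
    """Run the checks and return failure messages; an empty list means all passed."""
    failures = [f"Commit {h} not found" for h in commits if not commit_exists(root, h)]
    failures += [f"File {p} missing" for p in missing_files(root, artifacts)]
    if not single_checkpoint(grid_dir):
        failures.append("Multiple checkpoint files found")
    return failures
```

Collecting every failure instead of stopping at the first gives recovery mode a complete picture of what is inconsistent.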
415
+
416
+ ## Recovery Mode Flow
417
+
418
+ ```
419
+ ┌─────────────────────────────────────────────────────────────────────┐
420
+ │ RECOVERY MODE │
421
+ └─────────────────────────────────────────────────────────────────────┘
422
+
423
+ Validation failed or STATE.md corrupted
424
+
425
+ └─ Enter recovery mode
426
+
427
+ Step 1: Scan for artifacts
428
+
429
+ ├─ Find all SUMMARY.md files
430
+ │ └─ Extract: blocks complete, commits
431
+
432
+ ├─ Scan git log for Grid commits
433
+ │ └─ Pattern: "feat(NN):" or "fix(NN):"
434
+
435
+ └─ Find all plan files
436
+ └─ Determine: total blocks planned
437
+
438
+ Step 2: Reconstruct state
439
+
440
+ ├─ Completed blocks: [01, 02] (from SUMMARY.md)
441
+ ├─ Total commits: 15 (from git log)
442
+ ├─ Current block: 03 (inferred)
443
+ ├─ Progress: ~33% (estimated)
444
+ └─ Cluster: "blog" (inferred from plans)
445
+
446
+ Step 3: Present to user
447
+
448
+ └─ Display:
449
+ """
450
+ STATE RECOVERY
451
+ ==============
452
+
453
+ Reconstructed state:
454
+ - Cluster: blog (inferred from plans)
455
+ - Completed: Blocks 01, 02
456
+ - Current: Block 03 (inferred)
457
+ - Progress: ~33%
458
+ - Commits: 15 found in git
459
+
460
+ Accept reconstructed state? (yes/no)
461
+ """
462
+
463
+ Step 4: User decision
464
+
465
+ ├─ Yes ──▶ Write new STATE.md, continue resume
466
+
467
+ └─ No ──▶ Options:
468
+ 1) Abort (clear state, start fresh)
469
+ 2) Manual (user fixes STATE.md manually)
470
+ 3) Inspect (show all found artifacts for manual review)
471
+ ```
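Step 2 of recovery infers state from what was found on disk and in git. The sketch below assumes the scan has already produced a list of completed block numbers, a list of commit subject lines, and a count of planned blocks; `GRID_COMMIT` and `reconstruct_state` are hypothetical names, and the regex encodes the `feat(NN):` / `fix(NN):` pattern with NN read as two digits.

```python
import re
from typing import Dict, List

# Matches subjects such as "feat(01): ..." or "fix(02): ...".
GRID_COMMIT = re.compile(r"^(feat|fix)\(\d{2}\):")


def reconstruct_state(
    completed_blocks: List[int], commit_subjects: List[str], total_blocks: int
) -> Dict[str, object]:
    """Infer current block, progress, and Grid commit count from found artifacts."""
    grid_commits = [s for s in commit_subjects if GRID_COMMIT.match(s)]
    done = sorted(set(completed_blocks))
    remaining = [n for n in range(1, total_blocks + 1) if n not in done]
    return {
        "completed": done,
        # First planned block without a SUMMARY.md is taken as current.
        "current": remaining[0] if remaining else None,
        "progress": round(100 * len(done) / total_blocks) if total_blocks else 0,
        "commits": len(grid_commits),
    }
```

Because every field is inferred, the result is only a proposal; the flow above still asks the user to accept it before STATE.md is rewritten.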
472
+
473
+ ## Summary
474
+
475
+ Grid's persistence system provides:
476
+
477
+ 1. **Checkpoint-based recovery** - Resume from the exact stopping point
478
+ 2. **Warmth accumulation** - Knowledge transfers across sessions
479
+ 3. **State validation** - Ensures consistency before resume
480
+ 4. **Recovery mode** - Handles corrupted state gracefully
481
+ 5. **Multiple checkpoint types** - Covers human verification, decisions, session death, and failures
482
+
483
+ End of Line.