the-grid-cc 1.7.14 → 1.7.16

@@ -0,0 +1,361 @@
1
+ # Grid Persistence - Implementation Guide
2
+
3
+ **Version:** 1.0
4
+ **Date:** 2026-01-23
5
+
6
+ This guide explains how to implement and use Grid's persistence system.
7
+
8
+ ## Quick Start
9
+
10
+ ### Initialize a Project
11
+
12
+ ```bash
13
+ /grid:init
14
+ ```
15
+
16
+ This creates the complete `.grid/` directory structure with all state files.
17
+
18
+ ### Resume an Interrupted Mission
19
+
20
+ ```bash
21
+ /grid:resume
22
+ ```
23
+
24
+ This reconstructs context from `.grid/` files and continues from the last checkpoint.
25
+
26
+ ## Implementation Status
27
+
28
+ ### Phase 1: Core Persistence (REQUIRED)
29
+
30
+ - [x] Template files created
31
+ - [x] `/grid:resume` added to help
32
+ - [ ] STATE.md write on every wave complete
33
+ - [ ] CHECKPOINT.md write on checkpoint/interrupt
34
+ - [ ] WARMTH.md aggregation after block complete
35
+ - [ ] `/grid:resume` command full implementation
36
+ - [ ] Basic context reconstruction
37
+
38
+ ### Phase 2: Enhanced Recovery
39
+
40
+ - [ ] Session death detection via scratchpad staleness
41
+ - [ ] Git-based state reconstruction
42
+ - [ ] Corrupted state recovery
43
+ - [ ] Rollback support
44
+
45
+ ### Phase 3: Advanced Features
46
+
47
+ - [ ] Multi-cluster support
48
+ - [ ] State diff visualization
49
+ - [ ] Time-travel debugging
50
+ - [ ] Cross-session analytics
51
+
52
+ ## File Templates Location
53
+
54
+ All templates are in: `templates/grid-state/`
55
+
56
+ - `STATE.md` - Central state tracking
57
+ - `WARMTH.md` - Institutional knowledge
58
+ - `SCRATCHPAD.md` - Live discoveries
59
+ - `DECISIONS.md` - User decisions
60
+ - `BLOCKERS.md` - Blocker tracking
61
+ - `CHECKPOINT.md` - Interrupted thread state
62
+ - `config.json` - Grid configuration
63
+ - `BLOCK-SUMMARY.md` - Completed block record
64
+
65
+ ## Integration Points
66
+
67
+ ### Master Control (mc.md)
68
+
69
+ Master Control must:
70
+
71
+ 1. **On wave complete:**
72
+ - Update STATE.md with new position
73
+ - Update progress_percent
74
+ - Update updated_at timestamp
75
+
76
+ 2. **On block complete:**
77
+ - Aggregate WARMTH.md from executor's lessons_learned
78
+ - Verify SUMMARY.md was written
79
+ - Update STATE.md
80
+
81
+ 3. **On checkpoint:**
82
+ - Write CHECKPOINT.md with current thread state
83
+ - Set STATE.md status to "checkpoint"
84
+ - Wait for user response
85
+
86
+ 4. **On session approaching exhaustion:**
87
+ - Write CHECKPOINT.md with type: session_death
88
+ - Set STATE.md status to "interrupted"
89
+ - Include partial_work details
90
+
91
+ ### Grid Executor (grid-executor.md)
92
+
93
+ Executors must:
94
+
95
+ 1. **On thread complete:**
96
+ - Commit work with clear message
97
+ - Record commit hash
98
+
99
+ 2. **On block complete:**
100
+ - Write SUMMARY.md to `.grid/phases/{phase}/`
101
+ - Include all commits with hashes
102
+ - Include lessons_learned for warmth aggregation
103
+ - List all artifacts_created
104
+
105
+ 3. **On scratchpad entry:**
106
+ - Write discovery to SCRATCHPAD.md
107
+ - Include timestamp and program-id
108
+ - Follow standard format
109
+
110
+ 4. **On blocker encountered:**
111
+ - Write to BLOCKERS.md
112
+ - Include type, description, position
113
+ - Mark as ACTIVE
114
+
115
+ ### Resume Command (resume.md)
116
+
117
+ The `/grid:resume` command must:
118
+
119
+ 1. **Detect state:**
120
+ - Check if STATE.md exists
121
+ - Parse status and position
122
+ - Determine resume strategy
123
+
124
+ 2. **Validate state:**
125
+ - Verify commits exist in git
126
+ - Verify claimed files exist
127
+ - Check for conflicts
128
+
129
+ 3. **Reconstruct context:**
130
+ - Load STATE.md
131
+ - Load WARMTH.md
132
+ - Load DECISIONS.md
133
+ - Load CHECKPOINT.md if exists
134
+ - Collect all SUMMARY.md files
135
+ - Build execution context
136
+
137
+ 4. **Spawn continuation:**
138
+ - Inject warmth into executor prompt
139
+ - Provide completed_threads table
140
+ - Provide pending plan
141
+ - Set resume_point
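The "Verify commits exist in git" item in the validation step above could be sketched as follows. This is a minimal sketch, not the resume command's actual logic: `commit_exists` and `missing_commits` are hypothetical names, and the check relies on `git cat-file -e`, which exits with status zero when the named object exists. It assumes `git` is on the PATH.

```python
import subprocess
from typing import Callable, Iterable, List


def commit_exists(commit_hash: str, repo_dir: str = ".") -> bool:
    """Return True if `commit_hash` names a commit object in the repository."""
    result = subprocess.run(
        # "<hash>^{commit}" peels the object to a commit, so blobs and trees are rejected.
        ["git", "cat-file", "-e", f"{commit_hash}^{{commit}}"],
        cwd=repo_dir,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def missing_commits(
    hashes: Iterable[str],
    exists: Callable[[str], bool] = commit_exists,
) -> List[str]:
    """Return the claimed hashes that the `exists` check rejects, in input order."""
    return [h for h in hashes if not exists(h)]
```

Passing the `exists` check as a parameter keeps the list logic testable without a real repository. A non-empty result would be one reason to treat the recorded state as inconsistent.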
142
+
143
+ ## State Update Protocol
144
+
145
+ ### Atomic Updates
146
+
147
+ Always use atomic writes to prevent corruption:
148
+
149
+ ```python
+ import os
+ from datetime import datetime, timezone
+
+ # parse_yaml, to_yaml, deep_merge, read, and write are helpers assumed to be defined elsewhere.
+
+ # 1. Read current state
+ current = parse_yaml(read(".grid/STATE.md"))
+
+ # 2. Apply updates
+ merged = deep_merge(current, updates)
+ merged["updated_at"] = datetime.now(timezone.utc).isoformat()
+
+ # 3. Write to a temp file in the same directory as the target
+ write(".grid/STATE.md.tmp", to_yaml(merged))
+
+ # 4. Atomic rename (os.replace overwrites an existing target)
+ os.replace(".grid/STATE.md.tmp", ".grid/STATE.md")
+ ```
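The protocol above leans on two helpers it does not define. One possible shape for them is sketched below. `deep_merge` and `atomic_write` are illustrative implementations, not the project's own, and the `fsync` call is an added precaution beyond the four steps listed.

```python
import os
import tempfile


def deep_merge(base: dict, updates: dict) -> dict:
    """Return a copy of `base` with `updates` applied; nested dicts merge key by key."""
    merged = dict(base)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def atomic_write(path: str, text: str) -> None:
    """Write `text` to `path` so a reader sees either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file sits next to the target so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With these in place, a wave-complete update would merge something like `{"position": {"wave": 3}}` into the parsed state and pass the serialized result to `atomic_write(".grid/STATE.md", ...)`. A crash before the rename leaves the previous STATE.md untouched.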
163
+
164
+ ### Update Triggers
165
+
166
+ | Event | File | Field Updated |
167
+ |-------|------|---------------|
168
+ | Wave starts | STATE.md | status: active, wave: N |
169
+ | Wave completes | STATE.md | wave: N+1, progress_percent |
170
+ | Block completes | STATE.md | block: N+1 |
171
+ | Checkpoint hit | STATE.md, CHECKPOINT.md | status: checkpoint |
172
+ | Session ending | STATE.md, CHECKPOINT.md | status: interrupted |
173
+ | Mission complete | STATE.md | status: completed |
174
+ | Blocker found | BLOCKERS.md | New entry |
175
+ | User decision | DECISIONS.md | New entry |
176
+ | Discovery made | SCRATCHPAD.md | New entry |
177
+
178
+ ## Warmth Aggregation
179
+
180
+ After each block completes, aggregate warmth:
181
+
182
+ ```python
+ # parse_yaml, to_yaml, read, write, and file_exists are helpers assumed to be defined elsewhere.
+ WARMTH_CATEGORIES = [
+     "codebase_patterns",
+     "gotchas",
+     "user_preferences",
+     "decisions_made",
+     "almost_did",
+ ]
+
+ def aggregate_warmth(block_summary_path):
+     # 1. Parse block summary
+     summary = parse_yaml(read(block_summary_path))
+     block_warmth = summary.get("lessons_learned") or {}
+
+     # 2. Load existing warmth
+     if file_exists(".grid/WARMTH.md"):
+         existing = parse_yaml(read(".grid/WARMTH.md")) or {}
+     else:
+         existing = {}
+
+     # 3. Merge (deduplicate); setdefault covers categories missing from an older file
+     for category in WARMTH_CATEGORIES:
+         known = existing.setdefault(category, [])
+         for item in block_warmth.get(category, []):
+             if item not in known:
+                 known.append(item)
+
+     # 4. Write aggregated warmth
+     write(".grid/WARMTH.md", to_yaml(existing))
+ ```
210
+
211
+ ## Context Reconstruction
212
+
213
+ When resuming, rebuild complete context:
214
+
215
+ ```python
+ from glob import glob
+
+ # parse_yaml, read, file_exists, parse_decisions, and extract_block_number
+ # are helpers assumed to be defined elsewhere.
+
+ def reconstruct_context():
+     context = {
+         "cluster": None,
+         "position": None,
+         "completed_blocks": [],
+         "completed_threads": [],
+         "pending_plans": [],
+         "warmth": None,
+         "decisions": [],
+         "blockers": [],
+         "checkpoint": None,
+     }
+
+     # 1. Load central state
+     state = parse_yaml(read(".grid/STATE.md"))
+     context["cluster"] = state["cluster"]
+     context["position"] = state["position"]
+
+     # 2. Collect completed work (sorted so blocks load in a stable order)
+     for summary_path in sorted(glob(".grid/phases/*/SUMMARY.md")):
+         summary = parse_yaml(read(summary_path))
+         context["completed_blocks"].append(summary)
+
+     # 3. Load warmth
+     if file_exists(".grid/WARMTH.md"):
+         context["warmth"] = read(".grid/WARMTH.md")
+
+     # 4. Load decisions
+     if file_exists(".grid/DECISIONS.md"):
+         context["decisions"] = parse_decisions(".grid/DECISIONS.md")
+
+     # 5. Load checkpoint if exists
+     if file_exists(".grid/CHECKPOINT.md"):
+         context["checkpoint"] = parse_yaml(read(".grid/CHECKPOINT.md"))
+
+     # 6. Identify pending plans: any plan whose block has no SUMMARY.md yet
+     completed = {b.get("block") for b in context["completed_blocks"]}
+     for plan_path in sorted(glob(".grid/plans/*-block-*.md")):
+         block_num = extract_block_number(plan_path)
+         if block_num not in completed:
+             context["pending_plans"].append({
+                 "path": plan_path,
+                 "block": block_num,
+                 "content": read(plan_path),
+             })
+
+     return context
+ ```
263
+
264
+ ## Checkpoint Types
265
+
266
+ ### Human Verify Checkpoint
267
+
268
+ ```yaml
+ type: human_verify
+ checkpoint_details:
+   verification_instructions: |
+     1. Run: npm run dev
+     2. Visit: http://localhost:4321
+     3. Click dark mode toggle
+     4. Verify theme persists on refresh
+ awaiting: "User to respond 'approved' or describe issues"
+ ```
278
+
279
+ ### Decision Checkpoint
280
+
281
+ ```yaml
+ type: decision
+ checkpoint_details:
+   question: "Deploy to Vercel or Netlify?"
+   options:
+     - id: vercel
+       description: "Native Astro support, edge functions"
+     - id: netlify
+       description: "Simpler config, build plugins"
+ awaiting: "User to choose option"
+ ```
292
+
293
+ ### Human Action Checkpoint
294
+
295
+ ```yaml
+ type: human_action
+ checkpoint_details:
+   required_action: "Run: vercel login"
+   reason: "Vercel CLI authentication needed"
+ awaiting: "User to complete action and confirm"
+ ```
302
+
303
+ ### Session Death Checkpoint
304
+
305
+ ```yaml
+ type: session_death
+ checkpoint_details:
+   last_action: "Writing localStorage persistence logic"
+   partial_work:
+     files_created: ["src/components/DarkModeToggle.astro"]
+     files_modified: ["src/layouts/BaseLayout.astro"]
+     staged_changes: true
+ awaiting: "Automatic resume on next session"
+ ```
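A parsed checkpoint could be sanity-checked before resume acts on it. The sketch below is one way to do that. The required fields are inferred only from the four examples above, not from a published schema, and `REQUIRED_DETAILS` and `checkpoint_problems` are hypothetical names. Any further checkpoint type would need its own entry.

```python
# Fields each checkpoint type carries in the examples above (an inferred, unofficial schema).
REQUIRED_DETAILS = {
    "human_verify": {"verification_instructions"},
    "decision": {"question", "options"},
    "human_action": {"required_action", "reason"},
    "session_death": {"last_action", "partial_work"},
}


def checkpoint_problems(checkpoint: dict) -> list:
    """Return readable problems; an empty list means the checkpoint looks well formed."""
    kind = checkpoint.get("type")
    if kind not in REQUIRED_DETAILS:
        return [f"unknown checkpoint type: {kind!r}"]
    details = checkpoint.get("checkpoint_details") or {}
    # Report missing fields in sorted order so the output is deterministic.
    return [
        f"{kind}: missing checkpoint_details.{field}"
        for field in sorted(REQUIRED_DETAILS[kind] - set(details))
    ]
```

Returning a list of problems rather than raising lets the caller report every issue at once.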
315
+
316
+ ## Testing Persistence
317
+
318
+ ### Test 1: Clean Checkpoint Resume
319
+
320
+ 1. Start mission: `/grid`
321
+ 2. Wait for checkpoint
322
+ 3. Approve checkpoint
323
+ 4. Close terminal (simulate session death)
324
+ 5. New session: `/grid:resume`
325
+ 6. **Expected:** Continues from approved checkpoint
326
+
327
+ ### Test 2: Session Death Recovery
328
+
329
+ 1. Start mission: `/grid`
330
+ 2. During execution, kill terminal (SIGKILL)
331
+ 3. New session: `/grid:resume`
332
+ 4. **Expected:** Detects stale state, reconstructs from scratchpad + git
333
+
334
+ ### Test 3: Failure Recovery
335
+
336
+ 1. Start mission that will fail (e.g., missing API key)
337
+ 2. Executor returns failure
338
+ 3. New session: `/grid:resume`
339
+ 4. **Expected:** Presents failure report with recovery options
340
+
341
+ ## Next Steps
342
+
343
+ 1. **Implement STATE.md updates in mc.md**
344
+ - Add wave complete handler
345
+ - Add block complete handler
346
+ - Add checkpoint handler
347
+
348
+ 2. **Implement SUMMARY.md writes in grid-executor.md**
349
+ - Add block complete handler
350
+ - Include lessons_learned section
351
+
352
+ 3. **Complete /grid:resume implementation**
353
+ - Add state validation
354
+ - Add context reconstruction
355
+ - Add continuation spawning
356
+
357
+ 4. **Test end-to-end persistence**
358
+ - Run full mission with interruption
359
+ - Verify resume works correctly
360
+
361
+ End of Line.
@@ -0,0 +1,283 @@
1
+ # Grid Persistence - Quick Start Guide
2
+
3
+ **Version:** 1.0
4
+ **Status:** Ready for Implementation
5
+
6
+ This guide provides a quick overview of Grid's persistence system.
7
+
8
+ ## What is Grid Persistence?
9
+
10
+ Grid Persistence lets missions survive session death. When a session ends unexpectedly (context exhaustion, timeout, disconnect), the state already written to `.grid/` files remains on disk, and a fresh Master Control can resume from the last recorded position.
11
+
12
+ ## Core Concepts
13
+
14
+ ### State Files
15
+
16
+ | File | Purpose | When Written |
17
+ |------|---------|--------------|
18
+ | `STATE.md` | Central position tracking | Every wave |
19
+ | `WARMTH.md` | Accumulated knowledge | After each block |
20
+ | `CHECKPOINT.md` | Interrupted thread state | On checkpoint/interrupt |
21
+ | `SCRATCHPAD.md` | Live discoveries | During execution |
22
+ | `DECISIONS.md` | User decisions | When user decides |
23
+ | `BLOCKERS.md` | Issues blocking progress | When blocker encountered |
24
+
25
+ ### Directory Structure
26
+
27
+ ```
28
+ .grid/
29
+ ├── STATE.md # Read this first on resume
30
+ ├── WARMTH.md # Knowledge accumulation
31
+ ├── SCRATCHPAD.md # Live discoveries
32
+ ├── DECISIONS.md # User decisions
33
+ ├── BLOCKERS.md # Active blockers
34
+ ├── config.json # Grid configuration
35
+
36
+ ├── plans/ # Execution plans
37
+ ├── phases/ # Completed work
38
+ ├── discs/ # Program Identity Discs
39
+ ├── debug/ # Debug sessions
40
+ └── refinement/ # Refinement outputs
41
+ ```
42
+
43
+ ## Commands
44
+
45
+ ### Initialize State
46
+
47
+ ```bash
48
+ /grid:init
49
+ ```
50
+
51
+ Creates the `.grid/` directory with all state files.
52
+
53
+ ### Resume Mission
54
+
55
+ ```bash
56
+ /grid:resume # Auto-detect and resume
57
+ /grid:resume --validate # Validate state only
58
+ /grid:resume --from block-02 # Resume from specific block
59
+ ```
60
+
61
+ Reconstructs context from `.grid/` files and continues execution.
62
+
63
+ ### Check Status
64
+
65
+ ```bash
66
+ /grid:status
67
+ ```
68
+
69
+ Shows current mission state and progress.
70
+
71
+ ## How It Works
72
+
73
+ ### 1. During Execution
74
+
75
+ - **Master Control** updates STATE.md after each wave
76
+ - **Executors** write to SCRATCHPAD.md during work
77
+ - **Executors** write SUMMARY.md after each block
78
+ - **Master Control** aggregates WARMTH.md after each block
79
+
80
+ ### 2. On Checkpoint
81
+
82
+ When execution hits a checkpoint (human verification, decision point, etc.):
83
+
84
+ 1. Executor writes CHECKPOINT.md with current thread state
85
+ 2. MC updates STATE.md status to "checkpoint"
86
+ 3. System waits for user response
87
+
88
+ ### 3. On Session Death
89
+
90
+ If session dies unexpectedly:
91
+
92
+ 1. Last STATE.md update shows position
93
+ 2. SCRATCHPAD.md shows last activity
94
+ 3. Git history shows last commits
95
+ 4. Resume reconstructs state from these
96
+
97
+ ### 4. On Resume
98
+
99
+ ```
100
+ /grid:resume
101
+
102
+ 1. Read STATE.md
103
+ 2. Validate state consistency
104
+ 3. Load WARMTH.md, DECISIONS.md, CHECKPOINT.md
105
+ 4. Collect all SUMMARY.md files
106
+ 5. Build execution context
107
+ 6. Spawn continuation with full context
108
+ ```
109
+
110
+ ## Resume Scenarios
111
+
112
+ ### Scenario 1: Clean Checkpoint
113
+
114
+ **State:** User approved checkpoint, session ended normally
115
+
116
+ ```
117
+ STATUS: checkpoint
118
+ CHECKPOINT.md: exists with user_response
119
+ ```
120
+
121
+ **Action:** Continue from next thread after checkpoint
122
+
123
+ ### Scenario 2: Session Death
124
+
125
+ **State:** Session died mid-execution
126
+
127
+ ```
128
+ STATUS: active (stale timestamp)
129
+ CHECKPOINT.md: may not exist
130
+ SCRATCHPAD.md: recent entries
131
+ ```
132
+
133
+ **Action:** Reconstruct from scratchpad + git, resume from last known point
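The "stale timestamp" condition in this scenario needs a concrete rule. The sketch below is one possibility. The 30-minute threshold and the name `is_stale` are assumptions, as is treating a timestamp without a timezone as UTC.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: 30 minutes without a STATE.md update means the session is gone.
STALE_AFTER = timedelta(minutes=30)


def is_stale(updated_at: str, now: Optional[datetime] = None,
             threshold: timedelta = STALE_AFTER) -> bool:
    """Return True if the ISO 8601 `updated_at` value is older than `threshold`."""
    last_write = datetime.fromisoformat(updated_at)
    if last_write.tzinfo is None:
        # Assumption: naive timestamps are UTC, which keeps the subtraction well defined.
        last_write = last_write.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now - last_write > threshold
```

A real threshold would probably depend on how long a single wave can run, so it is left as a parameter.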
134
+
135
+ ### Scenario 3: Failure
136
+
137
+ **State:** Executor failed with error
138
+
139
+ ```
140
+ STATUS: failed
141
+ CHECKPOINT.md: type: failure with details
142
+ ```
143
+
144
+ **Action:** Present failure report, offer rollback/retry/manual options
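Taken together, the three scenarios amount to a small dispatch on the persisted status. The sketch below illustrates that dispatch. The strategy names and the `choose_resume_strategy` signature are hypothetical, and only the status values (`active`, `checkpoint`, `interrupted`, `failed`, `completed`) come from the documents.

```python
def choose_resume_strategy(status: str, has_checkpoint: bool,
                           has_user_response: bool = False,
                           state_is_stale: bool = False) -> str:
    """Map persisted state onto one of the resume scenarios described above."""
    if status == "failed":
        return "present_failure_report"
    if status == "checkpoint" and has_checkpoint:
        # An answered checkpoint can continue; an unanswered one is shown to the user again.
        return "continue_after_checkpoint" if has_user_response else "present_checkpoint"
    if status == "interrupted" and has_checkpoint:
        return "resume_from_checkpoint"
    if status == "active":
        # A fresh timestamp may mean another session is still running.
        if state_is_stale:
            return "reconstruct_from_scratchpad_and_git"
        return "confirm_no_live_session"
    if status == "completed":
        return "nothing_to_resume"
    # Anything else, including a checkpoint status with no CHECKPOINT.md, is inconsistent.
    return "enter_recovery_mode"
```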
145
+
146
+ ## Warmth Accumulation
147
+
148
+ Warmth is institutional knowledge that survives across sessions:
149
+
150
+ ```yaml
151
+ # WARMTH.md
152
+ codebase_patterns:
153
+ - "This project uses Astro content collections"
154
+ - "Dark mode uses class strategy with localStorage"
155
+
156
+ gotchas:
157
+ - "Astro config must use .mjs extension for ESM"
158
+ - "Shiki syntax highlighting is build-time only"
159
+
160
+ user_preferences:
161
+ - "User prefers minimal dependencies"
162
+ - "User wants dark mode as default"
163
+
164
+ decisions_made:
165
+ - "Chose serverless adapter for future flexibility"
166
+
167
+ almost_did:
168
+ - "Considered MDX but stuck with plain MD"
169
+ ```
170
+
171
+ When resuming, this warmth is injected into the continuation executor so it:
172
+ - Doesn't repeat mistakes
173
+ - Applies learned patterns
174
+ - Respects user preferences
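How warmth is injected is not specified here. One simple approach is to render the parsed WARMTH.md categories as a plain-text section and prepend it to the executor prompt. The sketch below assumes that approach. `render_warmth` and the section heading are hypothetical.

```python
def render_warmth(warmth: dict) -> str:
    """Format warmth categories as a plain-text section for an executor prompt."""
    lines = ["## Warmth from previous sessions"]
    for category, items in warmth.items():
        if not items:
            continue  # skip empty categories instead of printing bare headings
        lines.append("")
        lines.append(f"{category.replace('_', ' ').capitalize()}:")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)
```

For the example above, the `gotchas` category would render as a `Gotchas:` heading followed by one bullet per entry.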
175
+
176
+ ## State Validation
177
+
178
+ Before resuming, Grid validates:
179
+
180
+ 1. **Commits exist** - All claimed commit hashes are in git
181
+ 2. **Files exist** - All claimed artifacts exist on disk
182
+ 3. **Plans available** - Plans exist for pending blocks
183
+ 4. **Single checkpoint** - No conflicting checkpoint files
184
+ 5. **State parseable** - All YAML frontmatter is valid
185
+
186
+ If validation fails, Grid enters recovery mode and attempts to reconstruct state from git + artifacts.
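The "Files exist" check is simple enough to sketch directly. This assumes artifact paths are recorded relative to the project root, and `missing_artifacts` is a hypothetical name.

```python
from pathlib import Path
from typing import Iterable, List


def missing_artifacts(claimed: Iterable[str], project_root: str = ".") -> List[str]:
    """Return the claimed artifact paths that are not files under `project_root`."""
    root = Path(project_root)
    return [path for path in claimed if not (root / path).is_file()]
```

An empty result means every claimed artifact is present. A non-empty one would be a trigger for the recovery mode described above.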
187
+
188
+ ## Templates
189
+
190
+ All templates are in: `templates/grid-state/`
191
+
192
+ - `STATE.md` - Central state template
193
+ - `WARMTH.md` - Warmth template
194
+ - `SCRATCHPAD.md` - Scratchpad template
195
+ - `DECISIONS.md` - Decisions template
196
+ - `BLOCKERS.md` - Blockers template
197
+ - `CHECKPOINT.md` - Checkpoint template
198
+ - `config.json` - Configuration template
199
+ - `BLOCK-SUMMARY.md` - Block summary template
200
+
201
+ ## Example Flow
202
+
203
+ ```
204
+ 1. User: /grid
205
+ MC: Initializes .grid/, starts mission
206
+
207
+ 2. MC spawns Executor for block 01
208
+ Executor: Works on threads, commits, writes SCRATCHPAD.md
209
+
210
+ 3. Executor completes block 01
211
+ Executor: Writes SUMMARY.md with commits and lessons_learned
212
+ MC: Aggregates WARMTH.md from lessons_learned
213
+ MC: Updates STATE.md (block: 2, wave: 1, progress: 16%)
214
+
215
+ 4. Executor hits checkpoint for human verification
216
+ Executor: Writes CHECKPOINT.md (type: human_verify)
217
+ MC: Updates STATE.md (status: checkpoint)
218
+ MC: Waits for user
219
+
220
+ 5. Session dies (context exhaustion)
221
+ STATE.md: status: checkpoint (last write)
222
+ CHECKPOINT.md: awaiting user response
223
+
224
+ 6. New session
225
+ User: /grid:resume
226
+ MC: Reads STATE.md (status: checkpoint)
227
+ MC: Reads CHECKPOINT.md (type: human_verify)
228
+ MC: Loads WARMTH.md (accumulated knowledge)
229
+ MC: Presents checkpoint to user
230
+
231
+ 7. User approves checkpoint
232
+ MC: Updates CHECKPOINT.md (user_response: approved)
233
+ MC: Spawns continuation executor with:
234
+ - Warmth injected
235
+ - Completed threads table
236
+ - Resume point (next thread after checkpoint)
237
+
238
+ 8. Execution continues
239
+ Executor: Picks up from thread after checkpoint
240
+ Executor: Has full context from warmth
241
+ Executor: Doesn't repeat mistakes from prior attempts
242
+ ```
243
+
244
+ ## Implementation Status
245
+
246
+ ### Completed
247
+
248
+ - [x] Template files created
249
+ - [x] `/grid:resume` added to help
250
+ - [x] resume.md spec complete
251
+ - [x] init.md enhanced with templates
252
+ - [x] PERSISTENCE.md design document
253
+ - [x] Documentation complete
254
+
255
+ ### Next Steps
256
+
257
+ 1. **Implement state updates in mc.md**
258
+ - Wave complete handler
259
+ - Block complete handler
260
+ - Checkpoint handler
261
+
262
+ 2. **Implement summary writes in grid-executor.md**
263
+ - Block complete handler
264
+ - lessons_learned section
265
+
266
+ 3. **Complete /grid:resume implementation**
267
+ - State validation logic
268
+ - Context reconstruction logic
269
+ - Continuation spawning logic
270
+
271
+ 4. **Test end-to-end**
272
+ - Full mission with interruption
273
+ - Verify resume works correctly
274
+
275
+ ## Related Documentation
276
+
277
+ - `/docs/PERSISTENCE.md` - Full technical design
278
+ - `/docs/PERSISTENCE_IMPLEMENTATION.md` - Implementation guide
279
+ - `/templates/grid-state/README.md` - Template documentation
280
+ - `/commands/grid/resume.md` - Resume command spec
281
+ - `/commands/grid/init.md` - Init command spec
282
+
283
+ End of Line.