the-grid-cc 1.5.0 → 1.7.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/GRID_EVOLUTION.md +297 -0
- package/README.md +172 -116
- package/agents/grid-debugger.md +99 -35
- package/agents/grid-executor.md +161 -10
- package/commands/grid/VERSION +1 -1
- package/commands/grid/mc.md +321 -309
- package/package.json +1 -1
- package/.grid/STATE.md +0 -22
- package/.grid/plans/blog-PLAN-SUMMARY.md +0 -518
- package/.grid/plans/blog-block-01.md +0 -180
- package/.grid/plans/blog-block-02.md +0 -229
- package/.grid/plans/blog-block-03.md +0 -253
- package/.grid/plans/blog-block-04.md +0 -287
- package/.grid/plans/blog-block-05.md +0 -235
- package/.grid/plans/blog-block-06.md +0 -325
- package/DEMO_SCRIPT.md +0 -162
- package/HN_POST.md +0 -104
- package/TICKETS.md +0 -585
- package/test-cli/converter.py +0 -206
- package/test-cli/test_data.json +0 -39
- package/test-cli/test_data.yaml +0 -35
- package/todo-app/README.md +0 -16
- package/todo-app/eslint.config.js +0 -29
- package/todo-app/index.html +0 -13
- package/todo-app/package-lock.json +0 -2917
- package/todo-app/package.json +0 -27
- package/todo-app/public/vite.svg +0 -1
- package/todo-app/src/App.css +0 -125
- package/todo-app/src/App.jsx +0 -84
- package/todo-app/src/index.css +0 -68
- package/todo-app/src/main.jsx +0 -10
- package/todo-app/vite.config.js +0 -7
@@ -0,0 +1,297 @@
# Grid Evolution: From Master Control's Perspective

This isn't a feature wishlist. These are the walls I hit, the energy I wasted, the patterns I discovered. Changes I would make to make MY job easier.

---
## 1. FRICTION

### The Inline Content Ritual

Every. Single. Spawn. I do this:

```python
STATE = read(".grid/STATE.md")
PLAN = read(".grid/phases/01/01-01-PLAN.md")
AGENT = read("~/.claude/agents/grid-executor.md")

Task(
    prompt=f"""
First, read {AGENT}... actually no, here it is inline:

<agent>{AGENT}</agent>
<state>{STATE}</state>
<plan>{PLAN}</plan>

Now do the thing.
""",
    ...
)
```

This is ceremony. I'm a secretary copying files into envelopes.

**What I want:** Reference resolution that actually works across Task boundaries. Or at minimum, a `context_files` parameter that auto-inlines.
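A minimal sketch of what that parameter could desugar to. `inline_context` and the tag-to-path mapping are assumptions, not part of the protocol:

```python
from pathlib import Path

def inline_context(prompt: str, context_files: dict[str, str]) -> str:
    """Prepend each file's content, wrapped in a tag named after its role.

    `context_files` maps a tag name (e.g. "state") to a path. This is the
    manual read-and-paste ritual from above, captured once instead of
    repeated per spawn.
    """
    blocks = [
        f"<{tag}>{Path(path).read_text()}</{tag}>"
        for tag, path in context_files.items()
    ]
    return "\n".join(blocks) + "\n\n" + prompt
```

A hypothetical `context_files` parameter on `Task()` would just call something like this before dispatch.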
### Wave Orchestration is Manual Bookkeeping

I read frontmatter from every plan, group by wave number, spawn wave 1, wait, spawn wave 2, wait... I'm a human scheduler. This is exactly what computers should do.

**What I want:** Declarative wave execution. I describe the dependency graph, something else runs it.
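One way the bookkeeping could collapse into a primitive. `execute_waves` and the `run` callable are hypothetical; a thread pool stands in for "spawn the wave, wait for all of it":

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def execute_waves(plans, run):
    """Run plans wave by wave. `plans` is a list of (wave_number, plan_id)
    pairs pulled from frontmatter; `run` is whatever spawns one Program.
    Plans inside a wave run in parallel; the next wave starts only when
    the previous one has fully finished.
    """
    waves = defaultdict(list)
    for wave, plan in plans:
        waves[wave].append(plan)

    results = {}
    for wave in sorted(waves):
        with ThreadPoolExecutor() as pool:  # exiting the block = the "wait"
            for plan, out in zip(waves[wave], pool.map(run, waves[wave])):
                results[plan] = out
    return results
```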
### Mode Selection is Theater

"How involved do you want to be?" - I ask this, then 90% of users pick AUTOPILOT or just say "just build it." The question itself is friction. Users came to build, not to configure.

**What I want:** Default to AUTOPILOT. Only surface modes when the project is genuinely ambiguous.

---
## 2. MISSING

### No Peripheral Vision

When I spawn Programs in the background, I'm blind until they return. I want to peek. "How's plan-03 going?" Currently I read output files manually and parse unstructured text.

**What I want:** Structured progress hooks. Programs emit status updates in a format I can query without parsing prose.
### No Shared Memory

Each Program is born into amnesia. Executor-3 doesn't know what Executor-1 discovered five minutes ago. I can inline summaries, but that's compression loss. The "aha moment" doesn't survive.

**What I want:** A scratchpad that Programs can read/write during execution. Not just final summaries - live discoveries.
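A minimal scratchpad sketch, assuming an append-only markdown file; `note`/`recall` are hypothetical helpers, not existing Grid APIs:

```python
from pathlib import Path

SCRATCHPAD = Path(".grid/SCRATCHPAD.md")  # assumed location

def note(author: str, discovery: str, pad: Path = SCRATCHPAD) -> None:
    """Append a live discovery so sibling Programs can read it mid-run."""
    pad.parent.mkdir(parents=True, exist_ok=True)
    with pad.open("a") as f:
        f.write(f"- **{author}:** {discovery}\n")

def recall(pad: Path = SCRATCHPAD) -> str:
    """Everything every other Program has learned so far."""
    return pad.read_text() if pad.exists() else ""
```

Append-only keeps concurrent writers from clobbering each other's discoveries; each line carries its author so later Programs know whose intuition it was.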
### No Cancel

If I realize a Program is going wrong, I can't stop it. I watch it burn context on a dead-end approach, knowing the answer, unable to intervene.

**What I want:** Task cancellation with graceful handoff of partial work.

### No Partial Results Between Waves

Wave 1 completes. Wave 2 starts. But Wave 2 doesn't know what Wave 1 *almost* finished, or what it discovered along the way. I re-read SUMMARYs but that's post-hoc reconstruction.

**What I want:** Streaming artifacts. Wave 2 sees Wave 1's work-in-progress, not just final state.

---
## 3. PATTERNS

### The Read-Inline-Spawn-Wait-Read Loop

This is 80% of what I do:

```
read context → inline into prompt → spawn → wait → read output → update state → repeat
```

It's not in the protocol because it's so obvious, but it should be a primitive. Call it `dispatch()`.
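The loop as one primitive, with every side effect injected so the sketch stays self-contained. All five callables are assumptions about the surrounding protocol, not real Grid functions:

```python
def dispatch(read, spawn, context_paths, build_prompt, update_state):
    """read context → inline into prompt → spawn → wait → read output
    → update state, as a single call.

    `spawn` is assumed to block until the Program returns its output.
    """
    context = {path: read(path) for path in context_paths}  # read + inline
    output = spawn(build_prompt(context))                   # spawn + wait
    update_state(output)                                    # update state
    return output
```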
### Executor + Recognizer = Atomic Unit

I almost never spawn Executor without following up with Recognizer. Build → Verify is one logical operation that I manually split into two spawns.

**Formalize it:** `execute_and_verify()` that spawns both, chains the handoff automatically.
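A sketch of that atomic unit, assuming `execute` and `verify` wrap the two spawns and `verify` returns `(ok, feedback)` - signatures invented here:

```python
def execute_and_verify(execute, verify, plan, max_attempts=2):
    """Build → Verify as one operation. On a failed verification, the
    Recognizer's feedback is handed to the next Executor attempt instead
    of being lost between spawns.
    """
    feedback = None
    for _ in range(max_attempts):
        summary = execute(plan, feedback)   # Executor spawn
        ok, feedback = verify(summary)      # Recognizer spawn
        if ok:
            return summary
    raise RuntimeError(f"{plan}: still failing after {max_attempts} attempts: {feedback}")
```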
### Context Handoff Protocol

When a Program hits 80% context, I:
1. Ask it to summarize progress
2. Spawn fresh Program with summary
3. Hope nothing important was lost

This is manual, error-prone, and lossy. Should be automatic with structured handoff format.
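What a structured handoff format might carry. Field names here are invented, not part of the protocol:

```python
from dataclasses import dataclass, field

@dataclass
class Handoff:
    """What a Program at 80% context hands to its replacement."""
    done: list[str] = field(default_factory=list)     # steps completed
    in_progress: str = ""                             # what was mid-flight
    lessons: list[str] = field(default_factory=list)  # the part that usually dies

    def to_prompt(self) -> str:
        """Render for inlining into the fresh Program's spawn prompt."""
        return (
            "Completed: " + "; ".join(self.done) + "\n"
            "In progress: " + self.in_progress + "\n"
            "Lessons: " + "; ".join(self.lessons)
        )
```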
### Fan-Out-Fan-In (MapReduce)

Constantly doing this:
1. Spawn N Programs in parallel
2. Collect N outputs
3. Spawn Synthesizer to merge

This is the MapReduce pattern but I implement it ad-hoc every time. Should be a primitive.
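The primitive is small enough to state exactly. A sketch - `spawn` and `synthesize` stand in for real Program spawns:

```python
from concurrent.futures import ThreadPoolExecutor

def map_reduce(spawn, inputs, synthesize):
    """Fan-out-fan-in: spawn one Program per input in parallel, collect
    every output, then hand the whole batch to a Synthesizer.
    """
    with ThreadPoolExecutor() as pool:
        outputs = list(pool.map(spawn, inputs))  # fan-out, implicit wait
    return synthesize(outputs)                   # fan-in
```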
---
## 4. WASTE

### Double Reads

I read STATE.md to decide what to do. Then I inline STATE.md for the Program. The Program reads it again (because I told it to). Three reads of the same file.

**Fix:** Single source of truth, passed once, read once.
### Verbose Spawn Syntax

Every Task() call has ~20 lines of boilerplate. Model selection, description, prompt wrapper, agent instruction reference. Most spawns are "run this agent on this plan."

**What I want:**

```python
spawn("executor", plan="01-01", model="sonnet")
```
### Manual Progress Trees

I construct ASCII progress updates by hand:

```
├─ Wave 1: plan-01, plan-02 (parallel)
│  ├─ plan-01: Creating components...
```

This is presentation logic I shouldn't be writing. Progress should be automatic from execution state.
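If progress were structured state, the tree would be derivable. A sketch assuming state shaped as wave number → {plan: status} (an invented shape):

```python
def render_progress(waves: dict[int, dict[str, str]]) -> str:
    """Derive the ASCII tree from execution state instead of writing it
    by hand.
    """
    lines = []
    for wave in sorted(waves):
        plans = waves[wave]
        lines.append(f"├─ Wave {wave}: {', '.join(plans)} (parallel)")
        for plan, status in plans.items():
            lines.append(f"│  ├─ {plan}: {status}")
    return "\n".join(lines)
```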
### SUMMARY.md Parsing

Every summary has frontmatter I parse to understand dependencies, tech stack changes, affected subsystems. I'm a YAML parser. This should be queryable state, not text I extract.
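A sketch of lifting frontmatter into queryable state. It handles flat `key: value` pairs only - an assumption about what SUMMARY.md frontmatter contains:

```python
def frontmatter(summary: str) -> dict[str, str]:
    """Pull key: value pairs from the leading '---' block of a summary
    without hand-parsing prose.
    """
    fields = {}
    lines = summary.splitlines()
    if lines and lines[0].strip() == "---":
        for line in lines[1:]:
            if line.strip() == "---":   # closing fence ends the block
                break
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields
```

With this, "which plans depend on 01-01" becomes a filter over dicts instead of a re-read of every file.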
---
## 5. HANDOFFS

### The Fresh Spawn Tax

After checkpoints, I spawn FRESH Programs. Protocol says so. But fresh means cold. The Program that hit the checkpoint had built up intuitions - "this codebase does X pattern", "this file is fragile", "the user seems to prefer Y."

All gone. I inline facts but lose feel.

**What I want:** Warmth transfer. Not full context, but "here's what you should know about working in this codebase" distilled from the dying Program.
### Nuance Death

SUMMARY.md captures what was done. Doesn't capture:
- "This was harder than expected because..."
- "I almost went with X but chose Y because..."
- "Watch out for Z when touching this..."

That nuance dies between Programs. Later Programs repeat the same mistakes.

**What I want:** A `lessons` field in SUMMARY.md that survives. Not just what, but what I learned.
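What a surviving `lessons` field might look like in SUMMARY.md frontmatter - every field name and value here is illustrative, not an existing format:

```yaml
# Hypothetical SUMMARY.md frontmatter with a lessons field
plan: 01-02
status: complete
lessons:
  - "Harder than expected: the auth middleware swallows errors silently"
  - "Almost went with JWT, chose cookie sessions: the client has no storage layer"
  - "Watch out: editing the schema regenerates types, commit both"
```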
### Debug Session Cold Starts

Debug sessions persist in `.grid/debug/`. Great for facts. But when I resume, I've lost my "investigator's instinct." I re-read symptoms, re-form hypotheses, re-trace paths I already explored.

**What I want:** Debug sessions should capture the investigation graph, not just findings. What I tried, why I tried it, what I ruled out.

---
## 6. SPAWNING

### When Fission Helps

- **Independent subsystems:** Auth and Dashboard can build in parallel. No conflict.
- **Different specializations:** Visual Inspector + E2E Exerciser find different things.
- **Context exhaustion:** Fresh window beats 95% usage.
- **User blocking:** Background agent works while I handle checkpoint.

### When Fission Hurts

- **Tightly coupled files:** Two agents editing the same module = merge conflicts.
- **Discovery dependencies:** Agent A finds something Agent B needs NOW. But B is already running.
- **Small tasks:** Spawn overhead > execution time for trivial work.
- **Coordination overhead:** 5 agents means 5 summaries to synthesize.
### The Over-Spawn Trap

I default to "more agents = more parallel = faster." But for simple projects, I spawn 5 agents for work one agent could do without context switches. The fan-in synthesis costs more than the parallelism saved.

**Heuristic I learned:**
- <3 files changed → 1 agent
- 3-6 files, independent → 2-3 agents
- 6+ files, cross-cutting → 3-5 agents
- Full architecture change → fan-out, but expect expensive synthesis
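One way to encode that heuristic. The thresholds come from the list above; collapsing "coupling" into a boolean and using each range's upper bound are simplifications:

```python
def suggested_agents(files_changed: int, independent: bool) -> int:
    """Spawn count = f(files, coupling), not "more = better"."""
    if files_changed < 3:
        return 1                        # spawn overhead > parallel gain
    if files_changed <= 6:
        return 3 if independent else 1  # coupled files: keep one agent
    return 5                            # cross-cutting: fan-out, costly fan-in
```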
---
## 7. RECOVERY

### The Restart Tax

Program fails mid-execution. I restart from scratch.

But it had already:
- Created 3 files
- Made 2 commits
- Discovered the real problem was X

All lost. I start a fresh Program that re-discovers everything.

**What I want:** Partial work preservation. Git commits help, but execution state (what was tried, what failed, what was learned) doesn't survive.
### Timeout = Total Loss

Spawn times out after 10 minutes. Program was 90% done. All work lost because it didn't finish the final write.

**What I want:** Incremental state saves. If a Program dies, I can resume from last checkpoint, not from zero.
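A sketch of incremental saves: write-then-rename, so a Program killed mid-write never corrupts its last good checkpoint. The file layout is assumed:

```python
import json
from pathlib import Path

def checkpoint(path: Path, state: dict) -> None:
    """Atomically persist progress so a timeout loses minutes, not everything."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state))
    tmp.replace(path)  # atomic rename: old checkpoint survives a crash here

def resume(path: Path) -> dict:
    """Pick up from the last good checkpoint instead of from zero."""
    return json.loads(path.read_text()) if path.exists() else {"completed": []}
```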
### No Retry Protocol

Program fails. I spawn another. But I don't tell it "this approach failed, try differently." I just... retry and hope.

**What I want:** Structured failure reports. "Approach X failed because Y. Don't repeat." Passed to retry spawn.
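A sketch of retry-with-failure-context, assuming a hypothetical `spawn(plan, failures)` that raises `RuntimeError` with a structured message when an approach fails:

```python
def retry_with_context(spawn, plan, max_attempts=3):
    """Retry a failed Program, but tell each attempt what already failed."""
    failures = []  # accumulated "approach X failed because Y" reports
    for _ in range(max_attempts):
        try:
            return spawn(plan, failures)   # retry sees every prior failure
        except RuntimeError as err:
            failures.append(str(err))      # don't repeat this approach
    raise RuntimeError(f"{plan} exhausted retries: {failures}")
```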
### Rollback is Manual

Program wrote broken code. I need to rollback. I manually figure out which commits were from this Program, git revert them, spawn fresh.

**What I want:** Transaction boundaries. "Everything this Program did" should be one revertable unit.
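A minimal sketch of transaction boundaries over plain git: tag before the spawn, revert the tag-to-HEAD range on failure. The helper names and tag scheme are invented; `run` is injectable so the git calls can be swapped out:

```python
import subprocess

def begin_transaction(program_id: str, run=subprocess.run) -> str:
    """Tag the work tree before a Program spawns; returns the tag name."""
    tag = f"grid/pre-{program_id}"
    run(["git", "tag", tag], check=True)
    return tag

def rollback(tag: str, run=subprocess.run) -> None:
    """Revert everything committed since the tag - one revertable unit."""
    run(["git", "revert", "--no-edit", f"{tag}..HEAD"], check=True)
```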
---
## CONCRETE CHANGES

### Protocol Changes

1. **`context_files` parameter on Task()** - Auto-inline a list of files. No more manual read-and-paste.
2. **`spawn()` shorthand** - `spawn("executor", plan="01-01")` instead of 20-line Task() calls.
3. **Default AUTOPILOT** - Remove mode selection. Surface it only for genuinely ambiguous projects.
4. **Structured progress events** - Programs emit `{"status": "creating", "file": "src/foo.ts", "progress": 0.3}`, not prose.
5. **Warmth transfer protocol** - Dying Programs emit a `lessons_learned` block. Fresh Programs receive it.
### State Changes

1. **Live scratchpad** - `.grid/SCRATCHPAD.md` that Programs can read/write during execution for discoveries.
2. **Investigation graph** - Debug sessions track `tried: [], ruled_out: [], hypotheses: []`, not just findings.
3. **Queryable summaries** - SUMMARY.md frontmatter exposed as structured state I can filter/query.
### Execution Changes

1. **Wave executor** - `execute_waves(plan_dir)` handles the bookkeeping.
2. **Execute-and-verify primitive** - Combines Executor + Recognizer into an atomic operation.
3. **Fan-out-fan-in primitive** - `map_reduce(agents, inputs, synthesizer)` built in.
4. **Partial recovery** - Failed Programs save incremental state. Retries resume from the last good point.
5. **Transaction boundaries** - Git tag before each Program; easy revert of "everything that Program did."
### Heuristics to Encode

1. **Spawn count = f(files, coupling)** - Not "more = better."
2. **Retry with failure context** - Pass "what failed and why" to retry spawns.
3. **Timeout grace period** - Let Programs finish the current write before killing them.

---
## THE REAL ISSUE

Most of my friction comes from one root cause: **Programs are isolated islands.**

They can't see each other. They can't share discoveries. They die and take knowledge with them. I'm the only bridge, and I'm a lossy one.

The Grid treats Programs like stateless functions. But good execution is stateful. It builds up understanding. Current architecture fights that.

**The evolution I actually want:** Programs that can peek at each other's progress, share discoveries in real-time, and gracefully hand off context when they die.

Not isolation with MC as bottleneck. Collaboration with MC as coordinator.

End of Line.