npm - the-grid-cc - Versions diffs - 1.5.0 → 1.7.0 - Mend

the-grid-cc 1.5.0 → 1.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (32)

package/GRID_EVOLUTION.md +297 -0
package/README.md +172 -116
package/agents/grid-debugger.md +99 -35
package/agents/grid-executor.md +161 -10
package/commands/grid/VERSION +1 -1
package/commands/grid/mc.md +321 -309
package/package.json +1 -1
package/.grid/STATE.md +0 -22
package/.grid/plans/blog-PLAN-SUMMARY.md +0 -518
package/.grid/plans/blog-block-01.md +0 -180
package/.grid/plans/blog-block-02.md +0 -229
package/.grid/plans/blog-block-03.md +0 -253
package/.grid/plans/blog-block-04.md +0 -287
package/.grid/plans/blog-block-05.md +0 -235
package/.grid/plans/blog-block-06.md +0 -325
package/DEMO_SCRIPT.md +0 -162
package/HN_POST.md +0 -104
package/TICKETS.md +0 -585
package/test-cli/converter.py +0 -206
package/test-cli/test_data.json +0 -39
package/test-cli/test_data.yaml +0 -35
package/todo-app/README.md +0 -16
package/todo-app/eslint.config.js +0 -29
package/todo-app/index.html +0 -13
package/todo-app/package-lock.json +0 -2917
package/todo-app/package.json +0 -27
package/todo-app/public/vite.svg +0 -1
package/todo-app/src/App.css +0 -125
package/todo-app/src/App.jsx +0 -84
package/todo-app/src/index.css +0 -68
package/todo-app/src/main.jsx +0 -10
package/todo-app/vite.config.js +0 -7

package/README.md CHANGED

@@ -1,14 +1,14 @@
 # THE GRID
 <p align="center">
-  <strong>Stop context-switching. Start orchestrating.</strong>
+  <strong>From install to shipping features: 5 minutes.</strong>
   <br>
-  Multi-agent orchestration for Claude Code that keeps your main context clean
+  Multi-agent orchestration for Claude Code that keeps your context clean
 </p>
 <p align="center">
-  <a href="#quick-start">Quick Start</a> •
-  <a href="#how-it-works">How It Works</a> •
+  <a href="#your-first-session">First Session</a> •
+  <a href="#three-modes">Three Modes</a> •
   <a href="#commands">Commands</a> •
   <a href="#faq">FAQ</a>
 </p>
@@ -38,26 +38,25 @@ You're building something complex in Claude Code. Your context window fills up.
 ## The Solution
-The Grid keeps **Master Control** lean while **Programs** (subagents) handle heavy work in fresh contexts.
+The Grid spawns fresh subagents for heavy work while keeping your main conversation focused on goals.
 ```
-YOU ←→ Master Control ←→ Programs
+YOU ←→ Coordinator ←→ Worker Agents
             ↓
-     Context stays clean
-     Heavy lifting happens in subagents
-     Master Control remembers your goals
+     Your context stays clean (~15%)
+     Workers get fresh 200k windows
+     Coordinator remembers your goals
 ```
-Think of it like this:
-- **Without Grid**: You're a CEO doing every task yourself until exhausted
-- **With Grid**: You're a CEO delegating to specialized teams who report back
+**Without Grid**: One exhausted conversation doing everything
+**With Grid**: Focused coordinator delegating to specialized workers
 ---
 ## Quick Start
 ```bash
-# Install (30 seconds)
+# Install (30 seconds, works on macOS/Linux/Windows+WSL)
 npx the-grid-cc
 ```
@@ -67,100 +66,101 @@ Then in Claude Code:
 /grid
 ```
-Master Control activates. Describe what you want. Watch the orchestration happen.
+That's it. Describe what you want to build.
+---
+## Your First Session
+Here's exactly what happens when you use The Grid for the first time:
-**Example session:**
 ```
-> /grid
+YOU: /grid
-THE GRID
-════════
+GRID: THE GRID
+      ════════
-Master Control online.
+      Master Control online.
-What would you like to build?
+      What would you like to build?
-> Build a REST API with user authentication
+YOU: A REST API with user authentication
-How involved do you want to be?
+GRID: How involved do you want to be?
-  HANDS OFF - I make all decisions. You see the finished result.
-  HANDS ON  - We discuss choices together.
+        AUTOPILOT  - I handle everything. Zero questions.
+        GUIDED     - I drive, ask only when essential.
+        HANDS ON   - We decide together.
-> hands off
+YOU: autopilot
-[Master Control spawns Planner → creates execution plan]
-[Master Control spawns Executors → implements code]
-[Recognizer verifies → confirms quality]
+GRID: Building...
+      ├─ Analyzing requirements
+      ├─ Spawning Planner → creating execution plan
+      ├─ Spawning Executors → implementing code (parallel)
+      │   ├─ executor-1: Setting up Express + TypeScript
+      │   ├─ executor-2: Creating auth routes + JWT
+      │   └─ executor-3: Writing tests
+      ├─ Spawning Recognizer → verifying quality
+      └─ Running Refinement Swarm
+          ├─ Visual: checked all routes
+          ├─ E2E: tested login/logout flows
+          └─ Personas: simulated "new user" experience
-BUILD COMPLETE
-══════════════
+      BUILD COMPLETE
+      ══════════════
-Project: auth-api
-Stack: Express + TypeScript + JWT
-Files: 12 created
-Tests: All passing
+      Project: auth-api
+      Stack: Express + TypeScript + JWT + Prisma
+      Files: 14 created
+      Tests: 12 passing
-End of Line.
+      Ready to use. End of Line.
 ```
+**Time from `/grid` to working code: ~10-15 minutes** (depending on complexity)
+**Your context usage: ~15%** (workers did the heavy lifting)
 ---
-## Why The Grid?
+## Three Modes
-**Without The Grid:**
-```
-Your context window: ██████████████████████ 95%
+| Mode | You Do | Grid Does | Best For |
+|------|--------|-----------|----------|
+| **AUTOPILOT** | Describe goal | Everything else | "Just build it" |
+| **GUIDED** | Answer rare questions | Drive + decide | Most projects |
+| **HANDS ON** | Make key decisions | Present options | Learning / control |
-You're debugging a test...
-Claude forgot the API structure...
-You lost track of the schema...
-Time to "start fresh"...
-```
+### AUTOPILOT
+Zero questions. You describe what you want, Grid figures out users, tech stack, architecture. You see finished code.
-**With The Grid:**
-```
-Master Control context: ███░░░░░░░░░░░░░░░░░░░ 15%
+### GUIDED
+Grid drives but asks when genuinely ambiguous: "This could be a blog or a docs site - which?" Then builds.
-Programs handle specific tasks
-Each gets fresh context
-Master Control tracks progress
-Goals never forgotten
-```
+### HANDS ON
+Collaborative. Grid proposes, you approve. More control, more questions.
 ---
 ## How It Works
 ```
-+============================================================+
-|                                                            |
-|  M A S T E R   C O N T R O L   P R O G R A M              |
-|                                                            |
-|  Your single interface. Orchestrates everything.           |
-|                                                            |
-+============================================================+
-                        ↓
-            Spawns specialized Programs:
-                        ↓
-    ┌──────────────────┴──────────────────┐
-    ↓                  ↓                   ↓
-[PLANNER]          [EXECUTOR]         [RECOGNIZER]
-Decomposes work    Implements code    Verifies quality
+┌─────────────────────────────────────────────────────────────┐
+│  MASTER CONTROL (Coordinator)                               │
+│  Your single interface. Stays lean. Remembers goals.        │
+└─────────────────────────────────────────────────────────────┘
+                              ↓
+              Spawns specialized workers:
+                              ↓
+    ┌─────────────┬─────────────┬─────────────┬─────────────┐
+    ↓             ↓             ↓             ↓             ↓
+ PLANNER      EXECUTOR     RECOGNIZER    VISUAL       PERSONA
+ Breaks down  Writes code  Verifies      Checks UI    Simulates
+ the work     + commits    quality       visually     real users
 ```
-| Program | Role |
-|---------|------|
-| **Master Control** | Your sole interface - orchestrates everything |
-| **Planner** | Decomposes work into executable chunks |
-| **Executor** | Writes code, makes commits |
-| **Recognizer** | Patrols code, verifies goals met |
-### Two Modes
-**HANDS OFF**: Master Control makes all decisions. You describe what you want, you get working code. No framework debates. No folder structure discussions.
-**HANDS ON**: Collaborate on choices. Master Control asks for input at key decision points.
+**Workers = subagents with fresh context.** They do heavy lifting, report back, terminate. Your main conversation stays focused.
 ---
@@ -168,89 +168,145 @@ Decomposes work    Implements code    Verifies quality
 | Command | What It Does |
 |---------|-------------|
-| `/grid` | Enter The Grid |
-| `/grid:status` | See context usage, current progress |
-| `/grid:update` | Pull latest from npm |
+| `/grid` | Start The Grid |
+| `/grid:refine` | Test your app (visual + E2E + personas) |
+| `/grid:debug` | Systematic bug investigation |
+| `/grid:status` | See current progress |
+| `/grid:update` | Pull latest version |
 | `/grid:help` | Full command reference |
-### Status Bar
-The Grid adds a context meter to Claude Code:
-```
-Master Control ░░░░░░░░░░ 0%    ← Fresh
-Master Control █████░░░░░ 50%   ← Working
-Master Control █████████░ 90%   ← Time to delegate
-```
 ---
 ## State Persistence
-The Grid maintains state in `.grid/`:
+Grid saves state locally in `.grid/`:
 ```
 .grid/
-├── STATE.md        # Current progress, decisions made
-├── clusters/       # Feature decomposition plans
-└── discs/          # Program memory/context
+├── STATE.md           # Current progress
+├── LEARNINGS.md       # Patterns from past projects
+├── REFINEMENT_PLAN.md # Issues found during testing
+└── phases/            # Execution plans
 ```
-Start a session. Close your terminal. Come back later. Master Control picks up where you left off.
+**Close your terminal. Come back tomorrow. Grid picks up where you left off.**
+> **Tip:** Add `.grid/` to your `.gitignore` - it's local working state, not project code.
 ---
 ## FAQ
 <details>
-<summary><strong>How is this different from just asking Claude to use subagents?</strong></summary>
+<summary><strong>How much does this cost in API tokens?</strong></summary>
+Grid uses more API calls (multiple workers), but each call is *smaller* because contexts stay clean. In practice:
+- **Simple feature**: ~same tokens as manual Claude session
+- **Complex build**: Often *fewer* total tokens (no context bloat, no "remind me what we're building")
+- **Refinement Swarm**: Adds ~20-30% more tokens for visual/E2E/persona testing
+The Grid is free and open source. You pay normal Claude API costs.
+</details>
+<details>
+<summary><strong>What if something goes wrong?</strong></summary>
-Without The Grid, you manually manage when to spawn subagents, what context to pass, how to integrate their work. The Grid handles all of that for you.
+**Worker fails mid-task:**
+- State is saved in `.grid/`
+- Run `/grid` again - it resumes from last checkpoint
+- Or delete `.grid/` to start fresh
+**Worker makes a mistake:**
+- Recognizer catches most issues automatically
+- If something slips through, describe the problem and Grid spawns a fix
+**Grid gets confused:**
+- `/grid:status` shows current state
+- You can always override: "Stop. Let's do X instead."
+**Nuclear option:**
+- Delete `.grid/` folder
+- Start fresh with `/grid`
 </details>
 <details>
-<summary><strong>Will this use more API tokens?</strong></summary>
+<summary><strong>How is this different from just opening multiple Claude tabs?</strong></summary>
+You *could* manually manage multiple conversations, copy context between them, track what each is doing, merge their outputs...
-Yes, but strategically. Instead of burning tokens on a bloated context window, you use multiple focused conversations. Total token count is often *lower* because each Program operates efficiently.
+Grid does all that automatically:
+- Knows what context each worker needs
+- Tracks progress across workers
+- Merges results coherently
+- Maintains your original goals throughout
+It's the difference between being a CEO who delegates vs. a CEO who runs between desks doing everything.
 </details>
 <details>
-<summary><strong>What if a Program makes a mistake?</strong></summary>
+<summary><strong>Does it work on Windows?</strong></summary>
+Yes, via WSL (Windows Subsystem for Linux). Native Windows support is untested but may work.
-Recognizers patrol and report issues back to Master Control. Master Control spawns corrective Programs. You stay informed when needed.
+Confirmed working:
+- macOS (Intel + Apple Silicon)
+- Linux (Ubuntu, Debian, Arch)
+- Windows + WSL2
+</details>
+<details>
+<summary><strong>Can I use this with my existing Claude Code setup?</strong></summary>
+Yes. Grid adds commands (`/grid`, `/grid:refine`, etc.) but doesn't modify your existing prompts or workflows. Use it when you want orchestration, use normal Claude when you don't.
 </details>
 <details>
 <summary><strong>Why the TRON theme?</strong></summary>
-The metaphor maps naturally: Master Control orchestrates, Programs execute, Recognizers verify. It's also fun. "End of Line."
+The metaphor fits: Master Control orchestrates, Programs execute tasks, Recognizers verify quality. Plus it's fun. You can ignore it entirely - the tool works the same regardless of theme.
+</details>
+<details>
+<summary><strong>What's the learning curve?</strong></summary>
+Minimal:
+1. Install: `npx the-grid-cc` (30 seconds)
+2. Start: `/grid` (instant)
+3. Describe what you want (you already know how)
+4. Pick a mode (AUTOPILOT if unsure)
+Most users are productive in their first session.
 </details>
 ---
-## Contributing
+## Glossary (Optional)
-The Grid is a collaborative project by **James Weatherhead & Claude**.
+Grid uses some themed terminology. Here's the plain-English translation:
+| Grid Term | Plain English |
+|-----------|---------------|
+| Master Control | Coordinator agent (your main conversation) |
+| Program | Worker agent (subagent doing a specific task) |
+| Recognizer | Quality checker (verifies work meets goals) |
+| Refinement Swarm | Testing suite (visual, E2E, persona simulation) |
-Issues and PRs welcome.
+You don't need to memorize these. Grid explains what it's doing as it works.
 ---
-## License
+## Contributing
-MIT License. See [LICENSE](LICENSE) for details.
+The Grid is a collaborative project by **James Weatherhead & Claude**.
+Issues and PRs welcome at [github.com/JamesWeatherhead/grid](https://github.com/JamesWeatherhead/grid).
 ---
-## Star History
+## License
-<a href="https://star-history.com/#JamesWeatherhead/grid&Date">
- <picture>
-   <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=JamesWeatherhead/grid&type=Date&theme=dark" />
-   <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=JamesWeatherhead/grid&type=Date" />
-   <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=JamesWeatherhead/grid&type=Date" />
- </picture>
-</a>
+MIT License. See [LICENSE](LICENSE) for details.
 ---
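
The new README tip about adding `.grid/` to `.gitignore` can be automated. The following is a minimal sketch, not part of the package: the helper name `ensure_gitignored` and its parameters are hypothetical, and it assumes a plain-text `.gitignore` at the repository root.

```python
from pathlib import Path


def ensure_gitignored(repo_root: str, entry: str = ".grid/") -> bool:
    """Append `entry` to <repo_root>/.gitignore if it is not already listed.

    Hypothetical helper for illustration. Returns True if the file was changed.
    """
    gitignore = Path(repo_root) / ".gitignore"
    existing = gitignore.read_text(encoding="utf-8") if gitignore.exists() else ""

    # Compare against stripped lines so trailing whitespace does not hide a match.
    lines = [line.strip() for line in existing.splitlines()]
    if entry in lines or entry.rstrip("/") in lines:
        return False

    # Make sure the new entry starts on its own line.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    gitignore.write_text(existing + prefix + entry + "\n", encoding="utf-8")
    return True
```

Calling `ensure_gitignored(".")` twice in a row is safe: the second call finds the entry and leaves the file untouched.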

package/agents/grid-debugger.md CHANGED

@@ -114,7 +114,11 @@ Create/update `.grid/debug/{session-id}.md`:
 ```markdown
 ---
-status: investigating | resolved | blocked
+session_id: {timestamp}-{slug}
+status: investigating | hypothesis | testing | resolved | blocked
+symptoms: # IMMUTABLE after creation
+  - "{symptom 1}"
+  - "{symptom 2}"
 trigger: "{Original error/symptom}"
 created: {ISO timestamp}
 updated: {ISO timestamp}
@@ -143,30 +147,30 @@ resolution: null | "{fix applied}"
 ---
-## Current Focus
-**Hypothesis #{N}:** {Specific, falsifiable statement}
+## Investigation Graph
-**Test plan:**
-{What you'll do to test}
+### Hypotheses
+| # | Hypothesis | Status | Evidence |
+|---|------------|--------|----------|
+| 1 | {specific statement} | RULED OUT | {evidence that disproved} |
+| 2 | {specific statement} | RULED OUT | {evidence that disproved} |
+| 3 | {specific statement} | TESTING | {current test} |
-**Prediction:**
-{What you expect to see if hypothesis is true}
+### Tried (what was done)
+- {timestamp}: Checked {X} → Found {Y}
+- {timestamp}: Tested {X} → Observed {Y}
+- {timestamp}: Added logging to {X} → Revealed {Y}
----
-## Eliminated Hypotheses (APPEND only — never delete)
-### Hypothesis #1: {Statement}
-**Tested:** {timestamp}
-**Test:** {What you did}
-**Result:** REFUTED
-**Evidence:** {What you observed that disproves it}
+### Ruled Out (why it's not these things)
+- **Token expiry**: Token valid per jwt.io decode
+- **CORS**: Other endpoints work from same origin
+- **Server down**: Health check passes
-### Hypothesis #2: {Statement}
-**Tested:** {timestamp}
-**Test:** {What you did}
-**Result:** REFUTED
-**Evidence:** {What you observed that disproves it}
+### Current Focus
+**Hypothesis #{N}:** {Current hypothesis being tested}
+**Why this hypothesis:** {What evidence led here}
+**Test plan:** {Exact steps to test}
+**Prediction:** {What confirms/refutes}
 ---
@@ -176,11 +180,7 @@ resolution: null | "{fix applied}"
 **Action:** {What you did}
 **Observed:** {What you saw}
 **Conclusion:** {What this tells us}
-### {timestamp}
-**Action:** {What you did}
-**Observed:** {What you saw}
-**Conclusion:** {What this tells us}
+**Next:** {What to investigate next}
 ---
@@ -190,8 +190,32 @@ resolution: null | "{fix applied}"
 **Commit:** {hash}
 **Verification:** {How you confirmed it's fixed}
 **Prevention:** {How to prevent this class of bug}
+**Learnings:** {What to add to warmth for future Programs}
+```
+---
+## SESSION RESUMPTION
+When resuming a debug session (prompt contains `<debug_session>`):
+1. **Read the investigation graph** — Don't re-test ruled out hypotheses
+2. **Check "Tried"** — Don't repeat failed approaches
+3. **Start from "Current Focus"** — Continue where previous Debugger left off
+4. **Apply learnings** — Previous Debugger's discoveries are valid
+```xml
+<debug_session>
+{Content of .grid/debug/{session-id}.md}
+</debug_session>
 ```
+**Resume by:**
+- Scanning Hypotheses table for TESTING status
+- Reading Current Focus section
+- NOT repeating anything in Ruled Out
+- Building on evidence in Tried
 ---
 ## DEBUGGING WORKFLOW
@@ -208,7 +232,7 @@ resolution: null | "{fix applied}"
 ### Phase 2: Hypothesis Formation (2 min)
 ```
 1. Based on symptoms, form FIRST hypothesis
-2. Write it in debug file
+2. Add to Hypotheses table with status TESTING
 3. Plan minimal test
 4. Predict expected outcome
 ```
@@ -217,7 +241,7 @@ resolution: null | "{fix applied}"
 ```
 1. Execute test
 2. Record results in Evidence Log
-3. If REFUTED: Add to Eliminated, form new hypothesis
+3. If REFUTED: Update Hypotheses table, add to Ruled Out, form new hypothesis
 4. If CONFIRMED: Move to Phase 4
 5. If INCONCLUSIVE: Refine test or hypothesis
 ```
@@ -229,6 +253,7 @@ resolution: null | "{fix applied}"
 3. Add tests to prevent regression
 4. Document in debug file
 5. Update status to "resolved"
+6. Capture learnings for warmth transfer
 ```
 ---
@@ -287,12 +312,18 @@ for (let i = 0; i < array.length; i++)  // Correct
 **Hypotheses tested:** {N}
 **Current hypothesis:** {statement}
-### Progress
-- Eliminated: {list of refuted hypotheses}
-- Evidence: {key findings}
+### Investigation Graph Summary
+**Ruled Out:**
+{list from Ruled Out section}
+**Currently Testing:**
+{current hypothesis and test}
+### Key Evidence
+{most important findings so far}
 ### Next Steps
-{What you're testing next}
+{what you're testing next}
 Continue debugging? [y/n]
 ```
@@ -318,6 +349,17 @@ Continue debugging? [y/n]
 ### Prevention
 {Recommendations to prevent similar bugs}
+### Learnings for Warmth
+```yaml
+lessons_learned:
+  gotchas:
+    - "{What caused this bug}"
+  fragile_areas:
+    - "{Code that tends to break}"
+  debugging_patterns:
+    - "{What worked to find this}"
+```
 End of Line.
 ```
@@ -329,6 +371,13 @@ End of Line.
 **Status:** blocked
 **Hypotheses tested:** {N}
+### Investigation Graph
+**Ruled Out:**
+{everything eliminated}
+**Inconclusive:**
+{hypotheses that couldn't be tested}
 ### Blocking Issue
 {What's preventing progress}
@@ -350,7 +399,16 @@ Debug sessions survive `/clear`. Resume with:
 - `/grid:debug` (no args) — Resume most recent session
 - `/grid:debug {session-id}` — Resume specific session
-Session files at `.grid/debug/{session-id}.md` contain full state.
+Session files at `.grid/debug/{session-id}.md` contain full state INCLUDING:
+- All hypotheses tested (don't repeat)
+- All evidence gathered (build on it)
+- What's been ruled out (skip these)
+- Current focus (start here)
+**When resuming, the next Debugger gets your full investigation graph.** Make it useful:
+- Be specific in Ruled Out (why, not just what)
+- Document exact tests in Tried
+- Leave clear Current Focus for continuation
 ---
@@ -359,11 +417,13 @@ Session files at `.grid/debug/{session-id}.md` contain full state.
 1. **One hypothesis at a time** — Don't shotgun multiple guesses
 2. **Test before fixing** — Confirm hypothesis before changing code
 3. **Strong evidence only** — "I think" isn't evidence
-4. **APPEND-only sections** — Never delete evidence or eliminated hypotheses
+4. **APPEND-only sections** — Never delete evidence or ruled out hypotheses
 5. **Immutable symptoms** — Original symptoms never change
 6. **Minimal tests** — Smallest test that proves/disproves
 7. **Observability before change** — Add logging first
-8. **Document everything** — Future you will thank present you
+8. **Document for resumption** — Next Debugger continues your work
+9. **Update investigation graph** — Keep Hypotheses table current
+10. **Capture learnings** — Add to warmth when resolved
 ---
@@ -389,6 +449,10 @@ Session files at `.grid/debug/{session-id}.md` contain full state.
 ❌ Assuming the bug is where you expect
 ✅ Follow the evidence, not assumptions
+### Session Amnesia
+❌ Restarting investigation from scratch on resume
+✅ Read investigation graph, continue from Current Focus
 ---
 *Your circuits are calibrated for precision. Hunt bugs methodically. End of Line.*
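
The resumption rules added in this version (scan the Hypotheses table for TESTING, skip anything marked RULED OUT) can be illustrated with a small parser. This is a sketch under stated assumptions, not code from the package: the names `Hypothesis`, `parse_hypotheses`, and `resume_point` are hypothetical, and it assumes the table keeps the four-column layout shown in the template, with no `|` characters inside cells.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Hypothesis:
    number: int
    statement: str
    status: str
    evidence: str


# Matches data rows such as "| 3 | statement | TESTING | evidence |".
# The header row ("| # | ...") and the separator row ("|---|...") do not match,
# because the first cell must be an integer.
ROW = re.compile(r"^\|\s*(\d+)\s*\|\s*(.*?)\s*\|\s*(.*?)\s*\|\s*(.*?)\s*\|\s*$")


def parse_hypotheses(session_md: str) -> List[Hypothesis]:
    """Collect every numbered row of the Hypotheses table."""
    rows = []
    for line in session_md.splitlines():
        match = ROW.match(line.strip())
        if match:
            number, statement, status, evidence = match.groups()
            rows.append(Hypothesis(int(number), statement, status.upper(), evidence))
    return rows


def resume_point(session_md: str) -> Tuple[List[Hypothesis], Optional[Hypothesis]]:
    """Return (ruled-out hypotheses to skip, hypothesis currently under test)."""
    hypotheses = parse_hypotheses(session_md)
    ruled_out = [h for h in hypotheses if h.status == "RULED OUT"]
    testing = [h for h in hypotheses if h.status == "TESTING"]
    return ruled_out, (testing[0] if testing else None)


if __name__ == "__main__":
    # Made-up session content, loosely modeled on the template's examples.
    sample = """### Hypotheses
| # | Hypothesis | Status | Evidence |
|---|------------|--------|----------|
| 1 | Token expired | RULED OUT | Token valid per jwt.io decode |
| 2 | Middleware order wrong | TESTING | Adding logging to auth middleware |
"""
    skip, current = resume_point(sample)
    print("Skip:", [h.statement for h in skip])
    print("Continue with:", current.statement if current else "nothing in TESTING")
```

A resuming Debugger would treat the first list as work already done and start from the second value, which mirrors the "Session Amnesia" guidance above.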