npm - the-grid-cc - Versions diffs - 1.7.3 → 1.7.5 - Mend

the-grid-cc 1.7.3 → 1.7.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9) hide show

package/.grid-drafts/01-plan-execute-pipeline.md +709 -0
package/.grid-drafts/02-auto-verify-default.md +867 -0
package/.grid-drafts/03-quick-mode-detection.md +589 -0
package/.grid-drafts/04-scratchpad-enforcement.md +669 -0
package/README.md +12 -6
package/commands/grid/VERSION +1 -1
package/commands/grid/mc.md +408 -70
package/package.json +1 -1
package/assets/terminal.svg +0 -120

package/.grid-drafts/04-scratchpad-enforcement.md ADDED Viewed

@@ -0,0 +1,669 @@
+# Feature: Scratchpad Enforcement
+## Research Summary
+### Key Findings from Observability Research
+**1. The Three Pillars of Observability (OpenTelemetry, Distributed Systems)**
+- **Logs**: Timestamped event records (what happened)
+- **Metrics**: Numerical measurements (performance indicators)
+- **Traces**: Request journey across systems (flow visualization)
+The Grid currently lacks the "logs" pillar during execution. MC gets metrics (task completion) and can trace after-the-fact (SUMMARY.md), but has no live event stream.
+**2. Heartbeat Patterns in Distributed Systems**
+- **Purpose**: Prove worker is alive and making progress
+- **Frequency**: Periodic (e.g., every 30s) OR event-driven (on significant state changes)
+- **Lightweight requirement**: Must not add >5% overhead to worker execution
+- **Temporal workflow pattern**: Activities heartbeat with progress details, allowing resume from last checkpoint
+**3. Agent Orchestration Best Practices (Azure Foundry, AWS Bedrock)**
+- **Standardized telemetry**: Cross-agent communication requires uniform logging schema
+- **Structured logging**: Use consistent format for parsing (timestamps, agent ID, event type, context)
+- **Real-time visibility**: Orchestrators need live visibility into agent state for debugging and optimization
+- **Multi-agent tracking**: When agents run in parallel, individual agent traces prevent "black box" problem
+**4. Async Process Observability (OpenTelemetry patterns)**
+- **Context propagation**: Pass context (trace IDs, metadata) across async boundaries
+- **Live dashboards**: Observers monitor in-flight operations via continuously updated logs
+- **Sampling strategies**: Not every event needs logging (sample by importance/frequency)
+- **Cardinality control**: Too many unique log entries = cost explosion (balance detail vs. overhead)
+**5. Key Insight: "Seeing is Knowing"**
+Microsoft Foundry's core principle: "Ensuring reliability, safety, and performance of AI agents is critical. That's where agent observability comes in." Without visibility into what agents are doing during execution, you cannot:
+- Debug failures effectively (what was tried?)
+- Optimize performance (where are bottlenecks?)
+- Prove ROI (what actually happened?)
+- Resume from failure (what state was reached?)
+### Current Gap in The Grid
+Master Control is **blind during execution**. Programs spawn, MC waits, programs return. Between spawn and return:
+- No progress indicators
+- No discovery visibility
+- No bottleneck detection
+- No real-time course correction
+The scratchpad exists (`.grid/SCRATCHPAD.md`) but is **optional**. In the last build, Executors didn't use it. Result: MC couldn't see what was happening.
+---
+## Current Protocol
+From `mc.md` (lines 427-458):
+```markdown
+## SCRATCHPAD PROTOCOL
+**Live discoveries during execution.**
+`.grid/SCRATCHPAD.md` - Programs write here during execution, not just at end.
+### Structure
+```markdown
+---
+updated: {ISO timestamp}
+active_programs: [executor-01, executor-02]
+---
+## Live Discoveries
+### executor-01 (14:32:05)
+Found: Database connection string is in .env.local, not .env
+Impact: Other programs need to know this
+### executor-02 (14:32:18)
+Found: The User model has a deprecated 'name' field, use 'displayName'
+Impact: All user queries should use displayName
+### executor-01 (14:33:42)
+Correction: Actually .env.local only for development, .env for both
+```
+### Usage
+- **Write** when discovering something other Programs need
+- **Read** before starting execution
+- **Clear** after wave completes (archive to SCRATCHPAD_ARCHIVE.md)
+```
+**Problems with current protocol:**
+1. **Permissive language**: "write when discovering" = optional
+2. **No enforcement**: No mechanism to ensure Programs actually write
+3. **No monitoring**: MC doesn't actively read scratchpad during execution
+4. **Unclear triggers**: "When discovering" is subjective - what qualifies?
+5. **No structure enforcement**: Format shown but not validated
+6. **No cleanup spec**: "Archive after wave completes" but how?
+---
+## Proposed Changes
+### Change 1: Make Scratchpad Writing Expected, Not Optional
+**Before:**
+```markdown
+### Usage
+- **Write** when discovering something other Programs need
+- **Read** before starting execution
+- **Clear** after wave completes (archive to SCRATCHPAD_ARCHIVE.md)
+```
+**After:**
+```markdown
+### Mandatory Writing Rules
+Executors MUST write to scratchpad in these situations:
+1. **On unexpected codebase patterns** (WRITE IMMEDIATELY)
+   - File structure differs from assumption (e.g., barrel exports discovered)
+   - Naming conventions found (e.g., use displayName not name)
+   - API patterns (e.g., req.json() not req.body)
+   - Configuration locations (e.g., .env vs .env.local)
+2. **On decisions affecting other areas** (WRITE BEFORE COMMITTING)
+   - Choosing library A over B (rationale matters for other agents)
+   - Schema changes (other agents querying same tables)
+   - API contract changes (other agents calling these endpoints)
+   - File/folder structure changes (other agents importing these files)
+3. **On finding blockers or gotchas** (WRITE IMMEDIATELY)
+   - Missing dependencies
+   - Authentication requirements
+   - Rate limits or quotas
+   - External service configuration needs
+4. **On long-running work** (WRITE EVERY 5 MINUTES)
+   - Progress heartbeat: "Still working on X, 60% complete, no blockers"
+   - Prevents MC from thinking agent died
+   - Allows User to see progress in real-time
+**Failure to write = protocol violation.** If Recognizer finds an Executor didn't document a discovered pattern that caused issues for later work, that's a verification failure.
+### Usage
+- **Write** following the rules above (mandatory)
+- **Read** before starting execution (check what others learned)
+- **Monitor** by MC during execution (check every 30s for updates)
+- **Archive** after wave completes (move to SCRATCHPAD_ARCHIVE.md with timestamp)
+```
+### Change 2: Add Structured Entry Format
+**Before:**
+(No format specification beyond example)
+**After:**
+```markdown
+### Entry Format
+Each scratchpad entry MUST follow this structure:
+```
+### {program-id} | {ISO-timestamp} | {category}
+**Finding:** {What was discovered, in one clear sentence}
+**Impact:** {Who needs to know this / what areas affected}
+**Action:** [INFORM_ONLY | REQUIRES_CHANGE | BLOCKER]
+**Details:**
+{Additional context, code snippets, file paths}
+```
+**Categories:**
+- `PATTERN` - Codebase pattern discovered
+- `DECISION` - Decision made affecting other work
+- `BLOCKER` - Blocking issue found
+- `PROGRESS` - Heartbeat progress update
+- `CORRECTION` - Correcting a previous entry
+**Example:**
+```
+### executor-01 | 2026-01-23T14:32:05Z | PATTERN
+**Finding:** User model uses 'displayName' field, not 'name' (deprecated)
+**Impact:** Any agent querying User table must use displayName
+**Action:** INFORM_ONLY
+**Details:**
+Found in: prisma/schema.prisma line 15
+```
+@@@name String @deprecated
+displayName String
+```
+```
+```
+### executor-02 | 2026-01-23T14:45:12Z | PROGRESS
+**Finding:** Auth endpoint implementation 60% complete
+**Impact:** None (progress update)
+**Action:** INFORM_ONLY
+**Details:**
+Completed: JWT generation, token validation
+Remaining: Refresh token rotation, logout endpoint
+No blockers. Estimate 10 more minutes.
+```
+```
+### Change 3: Add MC Monitoring Protocol
+**Add new section after SCRATCHPAD PROTOCOL:**
+```markdown
+### MC Monitoring During Execution
+Master Control doesn't just wait for Programs to return. MC actively monitors progress.
+**After spawning a wave of Programs:**
+1. **Start monitoring loop** (every 30 seconds):
+   ```python
+   import time
+   from datetime import datetime, timedelta
+   def monitor_scratchpad_during_wave(active_programs, wave_start_time):
+       """Monitor scratchpad for updates while Programs execute."""
+       last_check = wave_start_time
+       max_silence = timedelta(minutes=10)  # Alert if no updates for 10 min
+       while programs_still_running(active_programs):
+           time.sleep(30)  # Check every 30 seconds
+           scratchpad = read(".grid/SCRATCHPAD.md")
+           # Parse latest entries
+           new_entries = parse_entries_since(scratchpad, last_check)
+           # Display new findings to User
+           if new_entries:
+               display_live_updates(new_entries)
+               last_check = datetime.now()
+           # Check for stalled agents
+           for program in active_programs:
+               last_update = get_last_update_time(scratchpad, program)
+               if datetime.now() - last_update > max_silence:
+                   alert_user(f"{program} hasn't written in 10 minutes - may be stalled")
+   ```
+2. **Display live updates to User:**
+   ```
+   Live Updates from Executors:
+   ├─ executor-01 (14:32): Found pattern - using displayName not name
+   ├─ executor-02 (14:35): Decision - chose JWT over sessions (performance)
+   ├─ executor-01 (14:40): Progress - Auth endpoints 60% done
+   └─ executor-03 (14:42): BLOCKER - Missing Stripe API keys
+   [Monitoring continues... CTRL+C to pause]
+   ```
+3. **Detect blockers and intervene:**
+   - If entry marked `BLOCKER`, pause monitoring, ask User for resolution
+   - After User responds, append resolution to scratchpad
+   - Resume monitoring
+4. **Surface to User on completion:**
+   ```
+   Wave 1 Complete
+   ════════════════
+   Key Discoveries (from scratchpad):
+   • Codebase uses displayName not name
+   • JWT chosen over sessions for performance
+   • API routes expect req.json() not req.body
+   Blockers Resolved:
+   • Stripe API keys added to .env
+   End of Line.
+   ```
+```
+### Change 4: Add Cleanup and Archival Protocol
+**Add to SCRATCHPAD PROTOCOL section:**
+```markdown
+### Archival After Wave Completion
+After a wave completes (all Programs in wave returned), MC must archive scratchpad:
+```python
+def archive_scratchpad(wave_number, phase, block):
+    """Move scratchpad to archive, preserving history."""
+    scratchpad = read(".grid/SCRATCHPAD.md")
+    # Create/append to archive
+    archive_path = ".grid/SCRATCHPAD_ARCHIVE.md"
+    archive_entry = f"""
+---
+wave: {wave_number}
+phase: {phase}
+block: {block}
+archived: {datetime.now().isoformat()}
+---
+{scratchpad}
+"""
+    if file_exists(archive_path):
+        append(archive_path, archive_entry)
+    else:
+        write(archive_path, archive_entry)
+    # Clear scratchpad for next wave
+    write(".grid/SCRATCHPAD.md", """---
+updated: {datetime.now().isoformat()}
+active_programs: []
+---
+## Live Discoveries
+(Empty - awaiting next wave)
+""")
+```
+**Archive structure** (`.grid/SCRATCHPAD_ARCHIVE.md`):
+```markdown
+# Scratchpad Archive
+Historical record of live discoveries from all waves.
+---
+wave: 1
+phase: 01-foundation
+block: 01
+archived: 2026-01-23T15:00:00Z
+---
+## Live Discoveries
+### executor-01 | 2026-01-23T14:32:05Z | PATTERN
+...
+---
+wave: 2
+phase: 01-foundation
+block: 02
+archived: 2026-01-23T15:30:00Z
+---
+## Live Discoveries
+### executor-03 | 2026-01-23T15:15:00Z | DECISION
+...
+```
+This preserves full history for debugging and Experience Replay.
+```
+### Change 5: Update Program Spawn Templates
+**In PROGRAM SPAWNING PROTOCOL section, update Executor spawn template:**
+**Before:**
+```python
+Task(
+  prompt=f"""
+First, read ~/.claude/agents/grid-executor.md for your role.
+<state>{state_content}</state>
+<plan>{plan_content}</plan>
+Execute the plan. Include lessons_learned in your SUMMARY.
+""",
+  ...
+)
+```
+**After:**
+```python
+Task(
+  prompt=f"""
+First, read ~/.claude/agents/grid-executor.md for your role.
+<state>{state_content}</state>
+<plan>{plan_content}</plan>
+<scratchpad_rules>
+You MUST write to .grid/SCRATCHPAD.md during execution:
+1. On discovering codebase patterns (IMMEDIATELY)
+2. On making decisions affecting other areas (BEFORE COMMITTING)
+3. On finding blockers (IMMEDIATELY)
+4. On long work (EVERY 5 MINUTES as progress heartbeat)
+Use this format:
+### {your-program-id} | {ISO-timestamp} | {category}
+**Finding:** {one sentence}
+**Impact:** {who needs to know}
+**Action:** [INFORM_ONLY | REQUIRES_CHANGE | BLOCKER]
+**Details:** {context}
+Before starting, READ scratchpad to see what other Programs learned.
+</scratchpad_rules>
+Execute the plan. Include lessons_learned in your SUMMARY.
+""",
+  ...
+)
+```
+---
+## Rationale
+### Why This Is Better
+**1. Visibility During Execution**
+- **Before:** MC is blind. User sees nothing. Agents might be stalled, duplicating work, or hitting blockers.
+- **After:** MC sees live updates every 30s. User gets real-time progress. Blockers surface immediately.
+**2. Cross-Agent Knowledge Sharing**
+- **Before:** Agent A discovers pattern, Agent B (running in parallel) doesn't know, repeats discovery or makes wrong assumption.
+- **After:** Agent A writes to scratchpad immediately, Agent B reads before starting, applies knowledge.
+**3. Failure Recovery**
+- **Before:** Agent fails, retry has no context about what was tried or discovered mid-execution.
+- **After:** Scratchpad captures discoveries up to failure point. Retry agent reads scratchpad, knows what was learned.
+**4. User Trust**
+- **Before:** Long silence while agents work. User wonders if system crashed.
+- **After:** Live updates every minute. User sees progress, knows system is working.
+**5. Debuggability**
+- **Before:** After completion, SUMMARY.md has final state. What happened during execution is lost.
+- **After:** SCRATCHPAD_ARCHIVE.md preserves full timeline. Can replay execution to find where things went wrong.
+**6. Enforcement Without Rigidity**
+- **Before:** Scratchpad was optional → unused.
+- **After:** Clear mandatory rules, but categories allow flexibility (PATTERN vs PROGRESS vs BLOCKER).
+**7. Lightweight Overhead**
+- Writing scratchpad entries: ~1-2 seconds per entry
+- Reading scratchpad: ~0.5 seconds
+- Total overhead: ~5% of execution time (acceptable per research)
+- Benefit: Massive visibility gain for minimal cost
+**8. Aligns with Experience Replay**
+- SCRATCHPAD_ARCHIVE.md feeds into Experience Replay (LEARNINGS.md)
+- Patterns discovered in scratchpad → become institutional memory
+- "This project type always has X pattern" emerges from scratchpad history
+---
+## Edge Cases Considered
+### Edge Case 1: Agent Forgets to Write
+**Symptom:** Executor completes, but scratchpad has no entries from that agent.
+**Detection:**
+- Recognizer checks SUMMARY.md against scratchpad
+- If SUMMARY mentions discoveries/decisions but scratchpad is empty → flag as protocol violation
+**Response:**
+```python
+if executor_completed_but_no_scratchpad_entries(executor_id):
+    # Don't fail the build, but note in verification
+    add_to_verification_report(f"""
+    PROTOCOL VIOLATION: {executor_id} completed without scratchpad entries.
+    Expected: Entries for patterns discovered, decisions made, progress updates.
+    Found: No entries.
+    Recommendation: Remind agent of scratchpad rules in retry.
+    """)
+```
+**Prevention:** Include scratchpad rules directly in every Executor spawn prompt (see Change 5).
+### Edge Case 2: Scratchpad Merge Conflicts
+**Symptom:** Two agents write to scratchpad simultaneously, one overwrites the other.
+**Solution:** Use append-only pattern with file locking:
+```python
+def write_scratchpad_entry(program_id, category, finding, impact, action, details):
+    """Thread-safe append to scratchpad."""
+    import fcntl  # File locking
+    entry = f"""
+### {program_id} | {datetime.now().isoformat()} | {category}
+**Finding:** {finding}
+**Impact:** {impact}
+**Action:** {action}
+**Details:**
+{details}
+"""
+    scratchpad_path = ".grid/SCRATCHPAD.md"
+    # Lock file, append entry, unlock
+    with open(scratchpad_path, 'a') as f:
+        fcntl.flock(f.fileno(), fcntl.LOCK_EX)  # Exclusive lock
+        f.write(entry)
+        fcntl.flock(f.fileno(), fcntl.LOCK_UN)  # Unlock
+```
+**Alternative (simpler):** Each agent writes to its own file initially:
+- `.grid/scratchpad/executor-01.md`
+- `.grid/scratchpad/executor-02.md`
+- `.grid/scratchpad/executor-03.md`
+MC merges these into SCRATCHPAD.md when reading. No conflicts possible.
+### Edge Case 3: Scratchpad Grows Too Large
+**Symptom:** After 10 waves, scratchpad archive is 50MB, slowing reads.
+**Solution:**
+1. **Per-phase archives:** `.grid/archives/phase-01-scratchpad.md`, `.grid/archives/phase-02-scratchpad.md`
+2. **Compression:** Archive old phases as `.gz` files
+3. **Rotation:** Keep only last 3 phases in human-readable format
+4. **Indexing:** Add frontmatter index to archive for fast lookups:
+   ```yaml
+   ---
+   entries_by_category:
+     PATTERN: [line 45, line 203, line 567]
+     BLOCKER: [line 123]
+     DECISION: [line 89, line 234]
+   entries_by_program:
+     executor-01: [line 45, line 123, line 234]
+     executor-02: [line 89, line 203]
+   ---
+   ```
+### Edge Case 4: Agent Writes Too Much (Spam)
+**Symptom:** Agent writes progress update every 10 seconds instead of every 5 minutes. Scratchpad flooded.
+**Prevention:**
+1. **Clear rules:** "EVERY 5 MINUTES" for progress, not "frequently"
+2. **Rate limiting in template:** Include guidance:
+   ```
+   Progress updates: Write every 5 minutes IF working >15 minutes on one thread.
+   For quick tasks (<15 min), no progress updates needed.
+   ```
+**Detection:** MC monitoring notices same agent writing 20+ PROGRESS entries in 10 minutes.
+**Response:** Note in post-wave report, adjust rules for next session.
+### Edge Case 5: MC Monitoring Adds Latency
+**Symptom:** MC checking scratchpad every 30s slows down User interaction.
+**Solution:**
+1. **Background monitoring:** MC spawns lightweight monitoring in separate thread/process
+2. **Non-blocking:** Monitoring doesn't block MC from responding to User
+3. **Async updates:** New scratchpad entries surface to User asynchronously (like a feed)
+**Implementation:**
+```python
+# In MC, after spawning wave
+monitor_task = Task(
+  prompt=f"""
+You are a scratchpad monitor. Every 30 seconds, read .grid/SCRATCHPAD.md
+and report NEW entries (since your last check).
+Continue monitoring until .grid/.wave_complete flag file appears.
+Report format:
+NEW ENTRIES:
+{formatted entries}
+""",
+  subagent_type="general-purpose",
+  model="haiku",  # Lightweight, cheap
+  description="Monitor scratchpad during wave execution",
+  run_in_background=True  # Non-blocking
+)
+# MC continues other work, monitor reports asynchronously
+```
+### Edge Case 6: Scratchpad Becomes Source of Truth Conflict
+**Symptom:** Agent writes "Using JWT" in scratchpad, but SUMMARY.md later says "Using sessions". Which is truth?
+**Hierarchy:**
+1. **SUMMARY.md = Final Truth** (what was actually done)
+2. **Scratchpad = Journey Log** (what was discovered/considered during execution)
+**Rule:** Scratchpad entries are timestamped discoveries. SUMMARY.md is retrospective final state. If conflict, SUMMARY wins, but scratchpad explains why the change happened.
+**Example:**
+- Scratchpad: "14:30 - DECISION: Choosing JWT for auth"
+- Scratchpad: "14:45 - CORRECTION: JWT refresh rotation is complex, switching to sessions"
+- SUMMARY.md: "Used session-based auth (simpler for this project)"
+The scratchpad shows the decision journey. SUMMARY shows the outcome.
+### Edge Case 7: User Wants to Disable Scratchpad
+**Symptom:** User finds scratchpad monitoring distracting or unnecessary for simple projects.
+**Solution:** Add config flag in `.grid/config.json`:
+```json
+{
+  "model_tier": "quality",
+  "scratchpad_monitoring": true,  // Set to false to disable
+  "scratchpad_write_required": true  // Set to false to make optional
+}
+```
+**When disabled:**
+- Scratchpad still exists, but MC doesn't monitor
+- Executors still encouraged to write, but not enforced
+- No live updates shown to User
+**Default:** Enabled (monitoring=true, required=true) for maximum visibility.
+---
+## Implementation Checklist
+- [ ] Update `mc.md` SCRATCHPAD PROTOCOL section with mandatory rules
+- [ ] Add structured entry format specification
+- [ ] Add MC monitoring protocol section
+- [ ] Add archival protocol section
+- [ ] Update Executor spawn template to include scratchpad rules
+- [ ] Update `grid-executor.md` to include scratchpad writing as core responsibility
+- [ ] Add `.grid/SCRATCHPAD.md` initialization to Grid startup
+- [ ] Add `monitor_scratchpad_during_wave()` function to MC
+- [ ] Add `archive_scratchpad()` function to wave completion
+- [ ] Add config flag for scratchpad monitoring (optional)
+- [ ] Add scratchpad check to Recognizer verification
+- [ ] Update documentation with scratchpad examples
+---
+## Summary
+This feature transforms scratchpad from **optional documentation** to **mandatory observability infrastructure**. It gives Master Control real-time visibility into Program execution, enabling:
+- Live progress updates to User
+- Cross-agent knowledge sharing
+- Faster blocker detection
+- Better failure recovery
+- Full execution history for debugging
+- Input for Experience Replay
+The implementation is lightweight (5% overhead), enforceable (via Recognizer checks), and aligns with distributed systems observability best practices from OpenTelemetry, Temporal, and Azure Foundry.
+By making "seeing" mandatory, we make "knowing" inevitable.
+**End of Line.**

package/README.md CHANGED Viewed

@@ -40,12 +40,6 @@ npx the-grid-cc
   <strong>Works on Mac, Windows, and Linux.</strong>
 </p>
-<br>
-<p align="center">
-  <img src="assets/terminal.svg" alt="The Grid Terminal" width="700"/>
-</p>
 ---
 ## The Problem
@@ -328,6 +322,18 @@ MIT License. See [LICENSE](LICENSE) for details.
 ---
+## Star History
+<a href="https://star-history.com/#JamesWeatherhead/grid&Date">
+ <picture>
+   <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=JamesWeatherhead/grid&type=Date&theme=dark" />
+   <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=JamesWeatherhead/grid&type=Date" />
+   <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=JamesWeatherhead/grid&type=Date" />
+ </picture>
+</a>
+---
 <p align="center">
   <strong>"I fight for the Users."</strong>
   <br><br>

package/commands/grid/VERSION CHANGED Viewed

	@@ -1 +1 @@
1	- 1.7.3
1	+ 1.7.5