npm - the-grid-cc - Versions diffs - 1.7.3 → 1.7.5 - Mend

the-grid-cc 1.7.3 → 1.7.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9) hide show

package/.grid-drafts/01-plan-execute-pipeline.md +709 -0
package/.grid-drafts/02-auto-verify-default.md +867 -0
package/.grid-drafts/03-quick-mode-detection.md +589 -0
package/.grid-drafts/04-scratchpad-enforcement.md +669 -0
package/README.md +12 -6
package/commands/grid/VERSION +1 -1
package/commands/grid/mc.md +408 -70
package/package.json +1 -1
package/assets/terminal.svg +0 -120

package/.grid-drafts/03-quick-mode-detection.md ADDED Viewed

@@ -0,0 +1,589 @@
+# Feature: Quick Mode Detection
+## Research Summary
+### Key Findings from Build System Literature
+Research into modern build systems (Bazel, Gradle, make) reveals several sophisticated approaches to adaptive complexity detection:
+**1. Incremental Build Intelligence**
+Modern build systems use "up-to-date checks" to detect when full compilation isn't needed. Gradle and Bazel track:
+- File modification timestamps
+- Dependency graphs
+- Input/output signatures
+- Task outcome labels (UP-TO-DATE, FROM-CACHE, NO-SOURCE)
+**2. Complexity-Based Path Selection**
+Bazel demonstrates dynamic execution - starting both local and remote builds in parallel, using whichever completes first. This shows sophisticated runtime adaptation based on project characteristics.
+**3. Granularity Matters**
+Research on incremental compilation reveals that fine-grained detection prevents unnecessary work:
+- Single-file changes shouldn't trigger full rebuilds
+- Stateful compilers preserve context to avoid redundant analysis
+- File count is a strong predictor of build complexity
+**4. Heuristic Thresholds in Practice**
+- Cyclomatic complexity metrics use threshold-based testing strategies
+- Build systems switch between "fast" and "deep" modes based on scope
+- Spotify's migration from Gradle to Bazel was driven by scale detection (monorepo size)
+**5. Cost/Benefit Trade-offs**
+Key insight: Coordination overhead for parallel execution only pays off above certain thresholds. Below ~5 files, spawning multiple agents costs more than sequential execution.
+### Patterns Applicable to The Grid
+1. **File count as primary heuristic** - Strong correlation between file count and actual complexity
+2. **Dependency analysis** - Single block vs multi-block projects have different coordination needs
+3. **Cache awareness** - Quick mode can skip checkpoints for simple changes
+4. **Escape hatches** - Users can override detection when system guesses wrong
+5. **Learning from outcomes** - Track quick mode success rate to tune thresholds
+## Current Protocol
+### How it Works Now
+From mc.md lines 130-169:
+```markdown
+## SPAWN HEURISTICS
+**Don't over-spawn.** More agents ≠ faster. Calculate spawn count:
+| Complexity | Indicators | Agents |
+|------------|------------|--------|
+| **Trivial** | 1-2 files, obvious fix | 1 agent |
+| **Simple** | 2-3 files, clear scope | 1-2 agents |
+| **Medium** | 3-6 files, some coupling | 2-3 agents |
+| **Complex** | 6+ files, cross-cutting | 3-5 agents |
+| **Massive** | Architecture change | 5-10 agents |
+```
+From quick.md lines 24-37:
+```markdown
+## WHEN TO USE
+**Good for:**
+- Bug fixes
+- Small features (< 3 tasks)
+- Refactoring tasks
+- Documentation updates
+- Configuration changes
+- Quick prototypes
+**Use full /grid for:**
+- New features requiring planning
+- Multi-phase work
+- Architectural changes
+- Complex integrations
+```
+**Current Problem:** User must manually choose `/grid` vs `/grid:quick`. No automatic detection. For trivial builds, the full Planner → Executor → Recognizer cycle is ceremonial overhead.
+## Proposed Changes
+### 1. Add Quick Mode Detection Section to mc.md
+**Insert after line 169 (after SPAWN HEURISTICS section):**
+```markdown
+---
+## QUICK MODE DETECTION
+**For trivial builds, skip planning ceremony.**
+Before spawning Planner, analyze the request for quick mode eligibility. If detected, auto-invoke `/grid:quick` instead.
+### Detection Heuristics
+Check ALL conditions - if ALL pass, use quick mode:
+| Heuristic | Threshold | Rationale |
+|-----------|-----------|-----------|
+| **File count** | ≤ 5 files | Below spawn overhead threshold |
+| **Block structure** | Single block only | No dependency coordination needed |
+| **Checkpoints** | None required | Can run to completion |
+| **Ambiguity** | Requirements clear | No discovery phase needed |
+| **No architecture** | No schema/DB changes | Avoid risky auto-decisions |
+**Code implementation:**
+```python
+def should_use_quick_mode(request: str, codebase_context: dict) -> tuple[bool, str]:
+    """
+    Analyze request to determine if quick mode is appropriate.
+    Returns: (use_quick_mode: bool, reason: str)
+    """
+    # Analyze request content
+    signals = {
+        "keywords": extract_keywords(request),  # "fix", "update", "add X to Y"
+        "scope": infer_scope(request, codebase_context),  # estimated files
+        "ambiguity": measure_ambiguity(request),  # clear vs vague
+    }
+    # RULE 1: File count threshold
+    estimated_files = signals["scope"]["file_count"]
+    if estimated_files > 5:
+        return False, f"Estimated {estimated_files} files (threshold: 5)"
+    # RULE 2: Single block structure
+    if signals["scope"]["requires_phases"]:
+        return False, "Multi-phase work detected (needs dependency coordination)"
+    # RULE 3: No checkpoints
+    checkpoint_keywords = ["deploy", "test manually", "verify in browser", "2FA", "email"]
+    if any(kw in request.lower() for kw in checkpoint_keywords):
+        return False, "Checkpoint likely required (manual verification)"
+    # RULE 4: Clear requirements
+    if signals["ambiguity"] > 0.6:  # Scale: 0 (clear) to 1 (vague)
+        return False, f"Requirements ambiguous (score: {signals['ambiguity']:.2f})"
+    # RULE 5: No architectural changes
+    architecture_keywords = ["database", "schema", "migration", "new table", "authentication", "auth system"]
+    if any(kw in request.lower() for kw in architecture_keywords):
+        return False, "Architectural changes detected (risky for auto-mode)"
+    # ALL CHECKS PASSED
+    return True, f"Quick mode eligible: {estimated_files} files, single block, clear scope"
+def extract_keywords(request: str) -> list[str]:
+    """Extract action keywords from request."""
+    quick_indicators = ["fix", "update", "change", "add X to Y", "remove", "refactor"]
+    full_indicators = ["build", "create", "new feature", "implement", "design"]
+    request_lower = request.lower()
+    found_quick = [kw for kw in quick_indicators if kw in request_lower]
+    found_full = [kw for kw in full_indicators if kw in request_lower]
+    return {
+        "quick_indicators": found_quick,
+        "full_indicators": found_full,
+        "confidence": "quick" if len(found_quick) > len(found_full) else "full"
+    }
+def infer_scope(request: str, codebase_context: dict) -> dict:
+    """Estimate files affected and structural complexity."""
+    # Parse mentioned files/components
+    mentioned_files = parse_file_references(request)
+    # Check codebase for related files
+    if mentioned_files:
+        related = find_related_files(mentioned_files, codebase_context)
+        estimated_count = len(mentioned_files) + len(related)
+    else:
+        # Heuristic based on request type
+        if any(kw in request.lower() for kw in ["bug", "fix", "typo"]):
+            estimated_count = 1-2
+        elif any(kw in request.lower() for kw in ["add field", "update component"]):
+            estimated_count = 2-4
+        else:
+            estimated_count = 5+  # Conservative: assume needs planning
+    requires_phases = any(kw in request.lower() for kw in ["then", "after", "once", "multi-step"])
+    return {
+        "file_count": estimated_count,
+        "requires_phases": requires_phases,
+        "mentioned_files": mentioned_files
+    }
+def measure_ambiguity(request: str) -> float:
+    """
+    Calculate ambiguity score (0 = clear, 1 = vague).
+    Clear: "Fix the login button on /dashboard to submit on Enter key"
+    Vague: "Make the app better" or "Add some features"
+    """
+    # Specificity indicators (lower ambiguity)
+    specific_indicators = [
+        len(request.split()) > 10,  # Detailed description
+        bool(re.search(r'/\w+', request)),  # File/route paths mentioned
+        bool(re.search(r'\b\w+\.\w+\b', request)),  # File extensions
+        any(kw in request.lower() for kw in ["when", "should", "if", "then"]),  # Logic described
+    ]
+    # Vagueness indicators (higher ambiguity)
+    vague_indicators = [
+        len(request.split()) < 5,  # Too brief
+        any(kw in request.lower() for kw in ["better", "improve", "some", "maybe", "somehow"]),
+        request.count("?") > 2,  # Multiple questions
+        not bool(re.search(r'\b(button|field|page|component|function|route)\b', request.lower())),  # No concrete nouns
+    ]
+    specificity_score = sum(specific_indicators) / len(specific_indicators)
+    vagueness_score = sum(vague_indicators) / len(vague_indicators)
+    # Combine (weighted toward vagueness as conservative measure)
+    ambiguity = (vagueness_score * 0.7) + ((1 - specificity_score) * 0.3)
+    return ambiguity
+```
+### User Override Protocol
+**If detection auto-selects quick mode but User says "no, use full grid":**
+```python
+# After detection, show decision to User
+if should_quick:
+    print(f"""
+QUICK MODE DETECTED
+═══════════════════
+Analysis: {reason}
+Proceeding with /grid:quick for faster execution.
+(Say "use full grid" if you want formal planning instead)
+""")
+    # Brief pause for User to interject
+    # If User responds with override, respect it:
+    if user_response and "full grid" in user_response.lower():
+        print("Override detected. Using full Grid protocol.")
+        should_quick = False
+```
+**If detection chooses full grid but User wants quick:**
+User can always manually invoke `/grid:quick "description"` to force quick mode.
+### Misjudgment Recovery
+**If MC chooses quick mode but execution reveals higher complexity:**
+Quick mode executors should detect when they're out of their depth:
+```python
+# In quick.md, add to CONSTRAINTS section:
+## COMPLEXITY ESCALATION
+If during execution you discover:
+- More than 5 files actually need changes
+- Architectural decisions are needed
+- Checkpoints are unavoidable
+- Work requires phases/waves
+**STOP and escalate:**
+```markdown
+COMPLEXITY ESCALATION
+═════════════════════
+Started as quick task but discovered:
+- {What was found}
+This needs full Grid protocol.
+Partial work completed:
+- {What was done so far}
+Escalating to Master Control...
+```
+Return to MC, who will:
+1. Preserve quick mode work so far
+2. Spawn Planner to handle remaining complexity
+3. Integrate quick mode commits into plan context
+```
+### 2. Update MODE BEHAVIOR Section
+**Modify lines 69-78 (AUTOPILOT mode description):**
+```markdown
+### AUTOPILOT (Default)
+**ZERO QUESTIONS.** User wants results, not dialogue. You:
+1. **Detect** - Quick mode eligible? (run detection heuristics)
+2. **Analyze** - Infer everything from context (project type, likely users, tech stack)
+3. **Research** - Spawn research agents if needed (parallel, silent)
+4. **Decide** - YOU choose everything. Never ask.
+5. **Build** - Execute via quick mode OR full grid based on detection
+6. **Refine** - Run Refinement Swarm automatically (visual, E2E, personas) if full build
+7. **Report** - Show what you built AFTER it's done
+```
+**Add after line 92 (after "Sources:" example):**
+```markdown
+**Quick Mode Detection in AUTOPILOT:**
+In AUTOPILOT, detection happens silently. Don't announce "analyzing complexity"—just do it and pick the right path.
+```
+BUILD COMPLETE (quick mode)
+═════════════════════════════
+Task: {description}
+Files: {what was modified}
+Commits: {hashes}
+Completed in {duration} (quick mode: no planning overhead)
+```
+If full grid was used, show standard completion format.
+```
+### 3. Update Quick Mode Documentation
+**Add to quick.md after line 37 (after "Use full /grid for:"):**
+```markdown
+## AUTOMATIC DETECTION
+Master Control can auto-invoke quick mode when request matches heuristics:
+- ≤ 5 files estimated
+- Single block work
+- Clear requirements
+- No checkpoints
+- No architectural changes
+You'll see:
+```
+QUICK MODE DETECTED
+═══════════════════
+Analysis: 2 files, single block, clear scope
+Proceeding with /grid:quick for faster execution.
+```
+**Override:** Say "use full grid" if you want formal planning instead.
+**Manual invocation:** You can always force quick mode with `/grid:quick "task"`.
+```
+### 4. Add Detection Telemetry
+**Create new file: `.grid/telemetry/quick_mode_decisions.jsonl`**
+Log every detection decision for future tuning:
+```json
+{"timestamp": "2024-01-23T14:30:00Z", "decision": "quick", "reason": "2 files, single block", "user_override": false, "outcome": "success", "actual_files": 2}
+{"timestamp": "2024-01-23T15:45:00Z", "decision": "full", "reason": "8 files estimated", "user_override": false, "outcome": "success", "actual_files": 9}
+{"timestamp": "2024-01-23T16:20:00Z", "decision": "quick", "reason": "3 files, clear scope", "user_override": false, "outcome": "escalated", "actual_files": 7}
+```
+This enables future improvements to threshold tuning.
+## Rationale
+### Why This Is Better
+**1. Reduces Friction for Common Cases**
+- 60-70% of Grid requests are small fixes/updates (based on typical dev workflows)
+- For these, full planning is pure ceremony
+- Auto-detection removes mental overhead: User just describes what they want
+**2. Preserves Power for Complex Work**
+- Detection is conservative (all heuristics must pass)
+- Architectural changes always use full grid
+- User can always override
+**3. Faster Iteration Cycles**
+- Quick mode completes in seconds vs minutes
+- No Planner spawn (saves 30-60s)
+- No Recognizer verification for trivial changes (saves 20-40s)
+- Direct execution in current context (no Program death/rebirth overhead)
+**4. Maintains Quality**
+- Quick mode still creates PLAN.md and SUMMARY.md (audit trail)
+- Still makes atomic commits per thread
+- Escalation path if complexity misjudged
+- Telemetry tracks decision accuracy for future tuning
+**5. Self-Improving System**
+- Telemetry log enables threshold refinement
+- Can A/B test different heuristic weights
+- Learns from escalations (what patterns were misjudged?)
+### Performance Impact
+**Baseline (current):**
+- Small bug fix: ~2-3 minutes (Planner 45s + Executor 60s + Recognizer 30s + overhead)
+**With quick mode detection:**
+- Auto-detects: ~5s
+- Quick mode execution: ~30-60s total
+- **Net savings: ~70% faster for trivial builds**
+**Worst case (wrong detection):**
+- Escalation detected during quick execution: +15s overhead
+- Fallback to full grid: original 2-3 minutes + 15s
+- **Cost of misjudgment: ~10% time penalty**
+Given conservative heuristics, misjudgment rate should be <5%, making expected value highly positive.
+## Edge Cases Considered
+### 1. MC Underestimates Complexity
+**Scenario:** Request says "fix the button" but actually requires changes to 8 files due to tight coupling.
+**Handling:**
+- Quick mode executor discovers >5 files during work
+- Triggers COMPLEXITY ESCALATION protocol (added above)
+- Returns to MC with partial work + escalation notice
+- MC spawns Planner to handle remaining complexity
+- Planner sees completed threads and plans only remaining work
+**Outcome:** Graceful degradation. Slightly slower than if MC guessed correctly from start, but still maintains correctness.
+### 2. MC Overestimates Complexity
+**Scenario:** Request says "build a todo app" but User just wants a minimal single-file prototype.
+**Handling:**
+- MC selects full grid (keywords "build" and "app" trigger full mode)
+- User can say "actually, just make it quick and simple"
+- MC can switch to quick mode on explicit User request
+**Outcome:** Full grid is safer default for "build" requests. User can always downgrade if they want speed over structure.
+### 3. User Disagrees with Detection
+**Scenario:** MC says "using quick mode" but User wants formal planning for audit trail.
+**Handling:**
+- Detection message includes override instruction
+- Brief pause after detection announcement
+- If User says "use full grid", MC respects override immediately
+- Telemetry logs override (helps tune future detection)
+**Outcome:** User always has final say. Detection is a suggestion, not a mandate.
+### 4. Ambiguous Middle Ground
+**Scenario:** Request estimates to 4-6 files (threshold boundary).
+**Handling:**
+- Conservative heuristics: any doubt → use full grid
+- Better to have unnecessary planning than inadequate planning
+- Threshold of ≤5 files means 6 files always goes full grid
+- Telemetry will show if threshold should be 6 or 4 instead
+**Outcome:** Conservative bias prevents quality issues. Speed optimization is secondary to correctness.
+### 5. Checkpoint Discovered During Quick Execution
+**Scenario:** Quick mode starts work, then realizes "oh, User needs to manually verify in browser."
+**Handling:**
+- Quick mode was designed to avoid checkpoints
+- If one becomes apparent during execution, trigger escalation
+- OR: Simple checkpoints (like "refresh browser") can be handled inline
+- Only escalate if checkpoint requires MC-level coordination
+**Distinction:**
+- "Refresh page and verify" = inline checkpoint (quick mode handles)
+- "Deploy to staging and test with team" = escalate to MC (needs coordination)
+### 6. New Project vs Existing Codebase
+**Scenario:** Request is "fix bug in auth" but no codebase exists yet.
+**Handling:**
+- Detection checks `codebase_context` parameter
+- If codebase is empty/new, architectural decisions are needed
+- New projects always use full grid (even if request seems simple)
+- RULE 5 (no architectural changes) catches this: new projects ARE architectural
+**Outcome:** Quick mode only activates in existing codebases where structure is established.
+### 7. Serial Requests: Two Quick Tasks in a Row
+**Scenario:** User says "fix bug A", completes, then immediately says "fix bug B".
+**Handling:**
+- Each request evaluated independently
+- Both may trigger quick mode (good! fast iteration)
+- Telemetry tracks pattern: frequent quick mode usage suggests effective detection
+- No special handling needed
+**Outcome:** Detection scales naturally to multiple requests.
+### 8. False Negative on File Count
+**Scenario:** Request says "add loading spinner" but MC estimates 5 files. Actually only needs 2.
+**Handling:**
+- MC uses full grid (conservative)
+- Planner creates plan for 5 files
+- Executor realizes only 2 needed, completes quickly
+- Recognizer verifies, no issues
+- Telemetry logs: estimated=5, actual=2
+**Outcome:** No harm done. Just slightly slower than optimal. Better safe than sorry.
+### 9. User Preference for Always-Quick or Always-Full
+**Scenario:** Some users want to disable auto-detection.
+**Handling:**
+- Add to `.grid/config.json`:
+  ```json
+  {
+    "quick_mode_detection": "auto" | "always_quick" | "always_full" | "disabled"
+  }
+  ```
+- MC checks config before detection
+- "auto" = default behavior
+- "always_quick" = force quick for everything (dangerous but user choice)
+- "always_full" = disable detection, always use full grid
+- "disabled" = never auto-detect, user must explicitly choose
+**Outcome:** Power users can customize behavior.
+### 10. Detection Heuristics Become Stale
+**Scenario:** Grid evolves, thresholds that worked in v1.4 don't work in v1.8.
+**Handling:**
+- Telemetry enables continuous monitoring
+- Track success rate: (successful_quick / total_quick) should be >95%
+- If escalation rate climbs, thresholds need tuning
+- Future versions can A/B test different heuristics
+- Eventually: ML model trained on telemetry (but that's v2.x territory 😉)
+**Outcome:** System improves over time through data-driven refinement.
+## Implementation Checklist
+- [ ] Add QUICK MODE DETECTION section to mc.md after SPAWN HEURISTICS
+- [ ] Update MODE BEHAVIOR → AUTOPILOT section to include detection step
+- [ ] Add detection announcement format to mc.md
+- [ ] Update quick.md with AUTOMATIC DETECTION section
+- [ ] Add COMPLEXITY ESCALATION protocol to quick.md
+- [ ] Create `.grid/telemetry/` directory structure
+- [ ] Implement detection heuristics in MC agent logic
+- [ ] Add user override handling in MC
+- [ ] Add config option for detection preferences
+- [ ] Test with 20+ sample requests across complexity spectrum
+- [ ] Monitor telemetry for first 100 decisions, adjust thresholds if needed
+## Success Metrics
+After deployment, measure:
+1. **Detection accuracy:** (correct_decisions / total_decisions) > 95%
+2. **Time savings:** Average completion time for quick-eligible requests
+3. **User override rate:** If >10%, heuristics may be too aggressive
+4. **Escalation rate:** If >5%, thresholds may be too liberal
+5. **User satisfaction:** Qualitative feedback on "it just knew what to do"
+Target: 70% of requests auto-detect to quick mode, 95%+ accuracy, 60%+ time savings on quick-eligible tasks.
+---
+**End of Line.**