the-grid-cc 1.7.13 → 1.7.14

Files changed (20)

package/02-SUMMARY.md +156 -0
package/agents/grid-accountant.md +519 -0
package/agents/grid-git-operator.md +661 -0
package/agents/grid-researcher.md +421 -0
package/agents/grid-scout.md +376 -0
package/commands/grid/VERSION +1 -1
package/commands/grid/branch.md +567 -0
package/commands/grid/budget.md +438 -0
package/commands/grid/daemon.md +637 -0
package/commands/grid/init.md +375 -18
package/commands/grid/mc.md +103 -1098
package/commands/grid/resume.md +656 -0
package/docs/BUDGET_SYSTEM.md +745 -0
package/docs/DAEMON_ARCHITECTURE.md +780 -0
package/docs/GIT_AUTONOMY.md +981 -0
package/docs/MC_OPTIMIZATION.md +181 -0
package/docs/MC_PROTOCOLS.md +950 -0
package/docs/PERSISTENCE.md +962 -0
package/docs/RESEARCH_FIRST.md +591 -0
package/package.json +1 -1

package/docs/BUDGET_SYSTEM.md ADDED

@@ -0,0 +1,745 @@
+# The Grid - Budget and Metering System
+## Technical Design Document
+**Version:** 1.0
+**Author:** Grid Program 2 (Budget System)
+**Date:** 2026-01-23
+---
+## Executive Summary
+The Grid Budget System provides cost tracking, budget enforcement, and usage reporting for Grid operations. Since The Grid operates within Claude Code (without direct API access), the system uses **estimation-based metering** derived from prompt sizes, agent type profiles, and historical patterns.
+### Key Features
+1. **Pre-Spawn Cost Estimation** - Estimate costs before agent spawns
+2. **Budget Limits** - Set spending caps with configurable enforcement
+3. **Real-time Tracking** - Monitor costs during cluster execution
+4. **Usage Reporting** - Historical analysis and optimization recommendations
+5. **Model Tier Integration** - Cost varies by model selection
+---
+## Architecture Overview
+```
+                    ┌─────────────────────────────────────────┐
+                    │           Master Control (MC)           │
+                    │                                         │
+                    │  ┌─────────────────────────────────────┐│
+                    │  │         Budget Check Gate           ││
+                    │  │  (runs before EVERY Task() spawn)   ││
+                    │  └─────────────────┬───────────────────┘│
+                    └────────────────────┼────────────────────┘
+                                         │
+                    ┌────────────────────┼────────────────────┐
+                    │                    ▼                    │
+                    │           .grid/budget.json             │
+                    │                                         │
+                    │  ┌─────────────────────────────────────┐│
+                    │  │  budget_limit: 50.00                ││
+                    │  │  current_session:                   ││
+                    │  │    estimated_cost: 12.47            ││
+                    │  │    spawns: [...]                    ││
+                    │  │  history:                           ││
+                    │  │    total_cost: 147.83               ││
+                    │  └─────────────────────────────────────┘│
+                    └─────────────────────────────────────────┘
+                                         │
+         ┌───────────────┬───────────────┼───────────────┬───────────────┐
+         ▼               ▼               ▼               ▼               ▼
+    ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐
+    │ Planner │    │Executor │    │Recogn-  │    │ Visual  │    │Persona  │
+    │         │    │         │    │  izer   │    │Inspector│    │Simulator│
+    │ ~$1.60  │    │ ~$2.10  │    │ ~$0.87  │    │ ~$0.87  │    │ ~$0.96  │
+    └─────────┘    └─────────┘    └─────────┘    └─────────┘    └─────────┘
+         │               │               │               │               │
+         └───────────────┴───────────────┴───────────────┴───────────────┘
+                                         │
+                                         ▼
+                              ┌─────────────────┐
+                              │  Grid Accountant│
+                              │    (on-demand)  │
+                              │                 │
+                              │ - Estimation    │
+                              │ - Reporting     │
+                              │ - Optimization  │
+                              └─────────────────┘
+```
+---
+## Cost Estimation Model
+### The Challenge
+The Grid operates within Claude Code, which means:
+- No direct access to Anthropic API usage metrics
+- No token-level billing information
+- No real-time cost feedback
+### The Solution: Estimation-Based Metering
+We estimate costs using:
+1. **Prompt Size Analysis** - Characters in spawn prompts map to input tokens
+2. **Agent Type Profiles** - Each agent has characteristic output ratios
+3. **Model Pricing** - Published API rates from Anthropic
+4. **Historical Calibration** - Refine estimates based on observed patterns
+### Token Estimation
+```python
+def estimate_tokens(text: str) -> int:
+    """
+    Estimate token count from text.
+    Claude's tokenizer averages ~4 characters per token for mixed
+    English/code content. This is a simplification but provides
+    reasonable estimates for budgeting purposes.
+    Args:
+        text: Input text to estimate
+    Returns:
+        Estimated token count
+    """
+    return len(text) // 4
+```
+**Accuracy Considerations:**
+- English prose: ~4.5 chars/token
+- Code: ~3.5 chars/token (more symbols)
+- Mixed: ~4 chars/token (reasonable average)
+- Error margin: +/- 15% typical
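The per-content ratios listed above could be applied directly instead of the flat 4 chars/token. The following is a minimal sketch, not part of the package: the `kind` parameter, the `CHARS_PER_TOKEN` table and both function names are hypothetical, and only the ratios and the 15% margin come from the notes above.

```python
# Characters-per-token ratios taken from the accuracy notes above.
CHARS_PER_TOKEN = {
    "prose": 4.5,
    "code": 3.5,
    "mixed": 4.0,
}


def estimate_tokens_by_kind(text: str, kind: str = "mixed") -> int:
    """Estimate tokens with a content-specific ratio (hypothetical helper)."""
    # Unknown kinds fall back to the mixed-content ratio.
    ratio = CHARS_PER_TOKEN.get(kind, CHARS_PER_TOKEN["mixed"])
    return int(len(text) / ratio)


def estimate_range(text: str, kind: str = "mixed", margin: float = 0.15) -> tuple[int, int]:
    """Return (low, high) bounds reflecting the typical error margin."""
    tokens = estimate_tokens_by_kind(text, kind)
    return round(tokens * (1 - margin)), round(tokens * (1 + margin))
```

Budgeting against the upper bound is the more conservative choice when the gate is close to a threshold.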
+### Agent Type Profiles
+Each agent type has a characteristic input/output ratio based on its function:
+| Agent | Input Profile | Output Ratio | Notes |
+|-------|--------------|--------------|-------|
+| Planner | Instructions + context | 1.3x | Produces detailed plans |
+| Executor | Plan + state + files | 1.5x | Writes code, high output |
+| Recognizer | Summaries + must-haves | 0.6x | Verification is concise |
+| Visual Inspector | Instructions + URLs | 1.25x | Screenshots + analysis |
+| E2E Exerciser | Instructions + flows | 1.25x | Click-by-click reports |
+| Persona Simulator | Context + persona | 1.4x | Detailed feedback |
+| Refinement Synth | All findings | 0.67x | Synthesizes, doesn't expand |
+### Cost Calculation
+```python
+def calculate_spawn_cost(
+    input_tokens: int,
+    output_tokens: int,
+    model: str
+) -> float:
+    """
+    Calculate estimated cost for a spawn.
+    Args:
+        input_tokens: Estimated input token count
+        output_tokens: Estimated output token count
+        model: Model name (opus, sonnet, haiku)
+    Returns:
+        Estimated cost in USD
+    """
+    # Current Claude API pricing (per million tokens)
+    PRICING = {
+        'opus': {'input': 5.00, 'output': 25.00},
+        'sonnet': {'input': 3.00, 'output': 15.00},
+        'haiku': {'input': 1.00, 'output': 5.00},
+    }
+    rates = PRICING.get(model, PRICING['opus'])
+    cost = (
+        (input_tokens * rates['input']) +
+        (output_tokens * rates['output'])
+    ) / 1_000_000
+    return round(cost, 4)
+```
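The budget gate later in this document calls `estimate_spawn_cost(agent_type, model, prompt_chars)`, which the document does not define. One way the three pieces (token estimation, output ratio, pricing) might compose is sketched below. The function body and the 1.0 fallback ratio for unknown agents are assumptions; the ratio and price tables are copied from this document.

```python
# Output ratios copied from the Agent Type Profiles table.
OUTPUT_RATIOS = {
    "planner": 1.3,
    "executor": 1.5,
    "recognizer": 0.6,
    "visual_inspector": 1.25,
    "e2e_exerciser": 1.25,
    "persona_simulator": 1.4,
    "refinement_synth": 0.67,
}

# Per-million-token rates copied from the PRICING table above.
PRICING = {
    "opus": {"input": 5.00, "output": 25.00},
    "sonnet": {"input": 3.00, "output": 15.00},
    "haiku": {"input": 1.00, "output": 5.00},
}


def estimate_spawn_cost(agent_type: str, model: str, prompt_chars: int) -> float:
    """Estimate the USD cost of one spawn from its prompt size (sketch)."""
    input_tokens = prompt_chars // 4  # ~4 characters per token
    # Assumption: unknown agent types produce as much output as input.
    ratio = OUTPUT_RATIOS.get(agent_type, 1.0)
    output_tokens = round(input_tokens * ratio)
    rates = PRICING.get(model, PRICING["opus"])
    cost = (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000
    return round(cost, 4)
```

For a 24,000-character Planner prompt on Opus this gives 6,000 input tokens, 7,800 output tokens and a cost of $0.225. That is well below the `est_cost` of 1.60 in the sample spawn record, so the sample dollar figures in this document are best read as illustrative.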
+---
+## Budget Enforcement
+### Enforcement Levels
+The system supports multiple enforcement thresholds:
+```
+Budget Usage:  0%         75%         90%        100%
+               │          │           │           │
+               │  NORMAL  │  WARNING  │  CONFIRM  │  EXCEEDED
+               │          │           │           │
+               │ Continue │ Show warn │ Ask user  │ Block (hard)
+               │ silently │ continue  │ to confirm│ Warn (soft)
+               ▼          ▼           ▼           ▼
+```
+### Pre-Spawn Gate
+Every spawn passes through the budget gate:
+```python
+def budget_gate(agent_type: str, model: str, prompt_chars: int) -> GateResult:
+    """
+    Budget check gate - runs before EVERY spawn.
+    Returns:
+        GateResult with:
+        - allowed: bool | 'confirm'
+        - message: Optional warning/error message
+        - estimated_cost: Cost for this spawn
+    """
+    budget = load_budget_config()
+    # Estimate first so usage is still tracked when no limit is set
+    estimated_cost = estimate_spawn_cost(agent_type, model, prompt_chars)
+    limit = budget.get('budget_limit')
+    # Unlimited budget
+    if limit is None:
+        return GateResult(allowed=True, message=None, estimated_cost=estimated_cost)
+    # Calculate projected usage
+    current_usage = budget['current_session']['estimated_cost']
+    projected_usage = current_usage + estimated_cost
+    # A limit of 0 ("effectively paused") always counts as exceeded;
+    # this also avoids dividing by zero
+    usage_ratio = projected_usage / limit if limit > 0 else float('inf')
+    # Check thresholds (highest to lowest)
+    if usage_ratio > 1.0:
+        if budget.get('enforcement', 'hard') == 'hard':
+            return GateResult(
+                allowed=False,
+                message=f"BUDGET EXCEEDED: Would be ${projected_usage:.2f} / ${limit:.2f}",
+                estimated_cost=estimated_cost
+            )
+        else:
+            return GateResult(
+                allowed=True,
+                message="WARNING: Over budget (soft enforcement)",
+                estimated_cost=estimated_cost
+            )
+    if usage_ratio > budget.get('confirmation_threshold', 0.90):
+        return GateResult(
+            allowed='confirm',
+            message=f"Budget at {usage_ratio*100:.1f}% - confirm to proceed",
+            estimated_cost=estimated_cost
+        )
+    if usage_ratio > budget.get('warning_threshold', 0.75):
+        return GateResult(
+            allowed=True,
+            message=f"Budget warning: {usage_ratio*100:.1f}% used",
+            estimated_cost=estimated_cost
+        )
+    return GateResult(allowed=True, message=None, estimated_cost=estimated_cost)
+```
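The gate returns a `GateResult`, but the document does not show its definition. A plausible shape, inferred only from the three fields named in the docstring, is sketched here. The dataclass form and the two convenience properties are assumptions.

```python
from dataclasses import dataclass
from typing import Literal, Optional, Union


@dataclass(frozen=True)
class GateResult:
    """Outcome of one budget gate check (hypothetical definition)."""

    allowed: Union[bool, Literal["confirm"]]  # True, False, or 'confirm'
    message: Optional[str]                    # Warning or error text, if any
    estimated_cost: float                     # Estimated USD cost of this spawn

    @property
    def blocked(self) -> bool:
        # Only an explicit False blocks; 'confirm' is truthy but not a go-ahead.
        return self.allowed is False

    @property
    def needs_confirmation(self) -> bool:
        return self.allowed == "confirm"
```

Because `'confirm'` is a truthy string, callers should compare against it explicitly rather than testing `if result.allowed:`.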
+### MC Integration
+Master Control integrates budget checking into its spawn protocol:
+```python
+# In MC spawn protocol (pseudocode)
+def spawn_agent(agent_type, model, prompt):
+    # Step 1: Budget gate check
+    gate_result = budget_gate(agent_type, model, len(prompt))
+    if gate_result.allowed is False:
+        display_budget_exceeded(gate_result.message)
+        return None  # Block spawn
+    if gate_result.allowed == 'confirm':
+        # I/O Tower confirmation
+        confirmed = ask_user_confirmation(gate_result.message)
+        if not confirmed:
+            return None  # User declined
+    elif gate_result.message:
+        # Warning only; the confirmation prompt already showed its own message
+        display_warning(gate_result.message)
+    # Step 2: Record pre-spawn estimate
+    spawn_id = record_spawn_start(agent_type, model, len(prompt), gate_result.estimated_cost)
+    # Step 3: Execute spawn
+    result = Task(prompt=prompt, ...)
+    # Step 4: Record post-spawn data
+    record_spawn_complete(spawn_id, len(result))
+    return result
+```
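Steps 2 and 4 rely on `record_spawn_start` and `record_spawn_complete`, which are not shown. The sketch below illustrates the reconciliation idea: once the real output size is known, the cost is recomputed from it. It works on an in-memory session dict passed in explicitly, which is an assumption (the pseudocode above takes no such argument); the field names follow the `SpawnRecord` schema.

```python
from datetime import datetime, timezone

# Per-million-token rates copied from the pricing table in this document.
RATES = {
    "opus": {"input": 5.00, "output": 25.00},
    "sonnet": {"input": 3.00, "output": 15.00},
    "haiku": {"input": 1.00, "output": 5.00},
}


def record_spawn_start(session: dict, agent: str, model: str,
                       prompt_chars: int, est_cost: float) -> str:
    """Append a pre-spawn record and add its estimate to the session total."""
    spawns = session.setdefault("spawns", [])
    spawn_id = f"spawn-{len(spawns) + 1:03d}"
    spawns.append({
        "id": spawn_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "model": model,
        "prompt_chars": prompt_chars,
        "est_input_tokens": prompt_chars // 4,
        "est_cost": est_cost,
    })
    session["estimated_cost"] = round(session.get("estimated_cost", 0) + est_cost, 4)
    return spawn_id


def record_spawn_complete(session: dict, spawn_id: str, output_chars: int) -> float:
    """Recompute the cost from the actual output size and store it."""
    record = next((s for s in session["spawns"] if s["id"] == spawn_id), None)
    if record is None:
        raise KeyError(f"unknown spawn id: {spawn_id}")
    rates = RATES.get(record["model"], RATES["opus"])
    output_tokens = output_chars // 4
    reconciled = (record["est_input_tokens"] * rates["input"]
                  + output_tokens * rates["output"]) / 1_000_000
    record["actual_output_chars"] = output_chars
    record["reconciled_cost"] = round(reconciled, 4)
    return record["reconciled_cost"]
```

The reconciled value is still an estimate: it replaces the assumed output ratio with the measured output size, but the chars-to-tokens conversion remains approximate.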
+---
+## Data Model
+### Budget Configuration File
+Location: `.grid/budget.json`
+```json
+{
+  "$schema": "https://thegrid.dev/schemas/budget.json",
+  "version": "1.0",
+  "budget_limit": 50.00,
+  "currency": "USD",
+  "enforcement": "hard",
+  "warning_threshold": 0.75,
+  "confirmation_threshold": 0.90,
+  "pricing": {
+    "opus": {"input": 5.00, "output": 25.00},
+    "sonnet": {"input": 3.00, "output": 15.00},
+    "haiku": {"input": 1.00, "output": 5.00}
+  },
+  "current_session": {
+    "id": "session-20260123-100000",
+    "started": "2026-01-23T10:00:00Z",
+    "cluster": "Auth System",
+    "model_tier": "quality",
+    "estimated_cost": 12.47,
+    "spawns": [
+      {
+        "id": "spawn-001",
+        "timestamp": "2026-01-23T10:05:00Z",
+        "agent": "planner",
+        "model": "opus",
+        "prompt_chars": 24000,
+        "est_input_tokens": 6000,
+        "est_output_tokens": 7800,
+        "est_cost": 1.60,
+        "actual_output_chars": 32400,
+        "reconciled_cost": 1.71
+      }
+    ]
+  },
+  "history": {
+    "total_cost": 147.83,
+    "total_spawns": 84,
+    "total_input_tokens": 7388000,
+    "total_output_tokens": 1847000,
+    "sessions": [
+      {
+        "id": "session-20260122-090000",
+        "cluster": "Dashboard",
+        "spawns": 10,
+        "cost": 21.32
+      }
+    ]
+  }
+}
+```
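Since the gate reads these fields on every spawn, a small sanity check at load time may help catch hand-edited mistakes. The following sketch is not part of the package. The function name and the specific rules (a non-negative limit, `hard`/`soft` as the only enforcement modes, thresholds ordered between 0 and 1) are assumptions inferred from the sample file and the gate logic.

```python
def validate_budget_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the config looks sane."""
    problems = []

    # null means unlimited; otherwise the limit must be a non-negative number.
    limit = config.get("budget_limit")
    if limit is not None and (not isinstance(limit, (int, float)) or limit < 0):
        problems.append("budget_limit must be null or a non-negative number")

    if config.get("enforcement", "hard") not in ("hard", "soft"):
        problems.append("enforcement must be 'hard' or 'soft'")

    # The warning level should trigger before (or with) the confirmation level.
    warning = config.get("warning_threshold", 0.75)
    confirm = config.get("confirmation_threshold", 0.90)
    if not (0 < warning <= confirm <= 1):
        problems.append("thresholds must satisfy 0 < warning <= confirmation <= 1")

    # Every pricing entry needs both rates.
    for model, rates in config.get("pricing", {}).items():
        if not {"input", "output"} <= set(rates):
            problems.append(f"pricing for {model!r} needs 'input' and 'output'")

    return problems
```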
+### Schema Definitions
+#### SpawnRecord
+```typescript
+interface SpawnRecord {
+  id: string;                    // Unique spawn identifier
+  timestamp: string;             // ISO 8601 timestamp
+  agent: AgentType;              // planner, executor, etc.
+  model: ModelType;              // opus, sonnet, haiku
+  prompt_chars: number;          // Characters in spawn prompt
+  est_input_tokens: number;      // Estimated input tokens
+  est_output_tokens: number;     // Estimated output tokens
+  est_cost: number;              // Pre-spawn cost estimate
+  actual_output_chars?: number;  // Actual output size (post-spawn)
+  reconciled_cost?: number;      // Recalculated cost (post-spawn)
+  block_id?: string;             // Associated block
+  description?: string;          // What this spawn does
+}
+type AgentType =
+  | 'planner'
+  | 'executor'
+  | 'recognizer'
+  | 'visual_inspector'
+  | 'e2e_exerciser'
+  | 'persona_simulator'
+  | 'refinement_synth'
+  | 'accountant';
+type ModelType = 'opus' | 'sonnet' | 'haiku';
+```
+#### SessionRecord
+```typescript
+interface SessionRecord {
+  id: string;                    // Unique session identifier
+  started: string;               // ISO 8601 start time
+  ended?: string;                // ISO 8601 end time (when closed)
+  cluster: string;               // Cluster name
+  model_tier: 'quality' | 'balanced' | 'budget' | 'custom';
+  spawns: number;                // Total spawn count
+  est_input_tokens: number;      // Total estimated input
+  est_output_tokens: number;     // Total estimated output
+  est_cost: number;              // Total estimated cost
+  reconciled_cost?: number;      // Total reconciled cost
+}
+```
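A `SessionRecord` is essentially a roll-up of its `SpawnRecord` entries. The aggregation might look like the sketch below. The function name is hypothetical, and omitting `reconciled_cost` unless every spawn has been reconciled is an assumption about how partial data should be treated.

```python
def summarise_spawns(spawns: list[dict]) -> dict:
    """Roll a list of SpawnRecord-like dicts up into SessionRecord totals."""
    summary = {
        "spawns": len(spawns),
        "est_input_tokens": sum(s.get("est_input_tokens", 0) for s in spawns),
        "est_output_tokens": sum(s.get("est_output_tokens", 0) for s in spawns),
        "est_cost": round(sum(s.get("est_cost", 0) for s in spawns), 4),
    }
    # Report a reconciled total only when no spawn is still in flight.
    if spawns and all("reconciled_cost" in s for s in spawns):
        summary["reconciled_cost"] = round(
            sum(s["reconciled_cost"] for s in spawns), 4
        )
    return summary
```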
+---
+## Usage Reporting
+### Report Types
+1. **Session Report** - Current or past session details
+2. **Historical Report** - Trends over time
+3. **Optimization Report** - Cost reduction recommendations
+### Session Report
+```markdown
+## SESSION COST REPORT
+**Session:** session-20260123-100000
+**Cluster:** Auth System
+**Duration:** 4h 30m
+**Model Tier:** quality
+### Summary
+| Metric | Estimated | Reconciled |
+|--------|-----------|------------|
+| Total Spawns | 8 | 8 |
+| Input Tokens | 498,000 | 498,000 |
+| Output Tokens | 124,500 | 131,200 |
+| Total Cost | $14.86 | $15.23 |
+### Spawn Timeline
+| Time | Agent | Model | Est. Cost | Reconciled |
+|------|-------|-------|-----------|------------|
+| 10:05 | planner | opus | $1.60 | $1.71 |
+| 10:20 | executor | opus | $2.10 | $2.15 |
+| 10:45 | executor | opus | $2.10 | $2.08 |
+| ... | ... | ... | ... | ... |
+### Accuracy Analysis
+- Average estimation error: +2.5%
+- Largest over-estimate: -8% (recognizer)
+- Largest under-estimate: +12% (executor-03)
+```
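The accuracy figures in the sample report are consistent with a signed relative error in which a positive value means the estimate was too low. A minimal sketch of that calculation, with a hypothetical function name:

```python
def estimation_error_pct(estimated: float, reconciled: float) -> float:
    """Signed error in percent; positive means the estimate was too low."""
    if estimated == 0:
        raise ValueError("estimated cost must be non-zero")
    return round((reconciled - estimated) / estimated * 100, 1)
```

Applied to the summary totals above, `estimation_error_pct(14.86, 15.23)` gives 2.5, matching the reported average of +2.5%.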
+### Historical Report
+```markdown
+## HISTORICAL COST REPORT
+**Period:** 2026-01-01 to 2026-01-23
+**Sessions:** 15
+**Total Cost:** $183.47
+### Weekly Trend
+| Week | Sessions | Spawns | Cost | Avg/Session |
+|------|----------|--------|------|-------------|
+| Jan 20-26 | 4 | 32 | $58.40 | $14.60 |
+| Jan 13-19 | 6 | 48 | $72.15 | $12.03 |
+| Jan 6-12 | 3 | 24 | $38.72 | $12.91 |
+| Jan 1-5 | 2 | 16 | $14.20 | $7.10 |
+### Model Distribution
+| Model | Spawns | % | Cost | % |
+|-------|--------|---|------|---|
+| Opus | 72 | 60% | $129.60 | 70.6% |
+| Sonnet | 42 | 35% | $48.30 | 26.3% |
+| Haiku | 6 | 5% | $5.57 | 3.1% |
+### Cost by Agent Type
+| Agent | Spawns | Avg Cost | Total |
+|-------|--------|----------|-------|
+| Executor | 45 | $1.95 | $87.75 |
+| Planner | 15 | $1.52 | $22.80 |
+| Recognizer | 30 | $0.92 | $27.60 |
+| Others | 30 | $1.51 | $45.32 |
+```
+### Optimization Report
+```markdown
+## OPTIMIZATION RECOMMENDATIONS
+Based on analysis of your last 15 sessions:
+### High-Impact Recommendations
+1. **Switch Recognizer to Haiku** (High confidence)
+   - Current: Sonnet ($0.87/spawn avg)
+   - Recommended: Haiku ($0.29/spawn)
+   - Savings: ~$17/month based on your usage
+   - Risk: Low - verification doesn't need deep reasoning
+2. **Use Balanced tier for refinement** (Medium confidence)
+   - Current: Quality tier for all
+   - Recommended: Balanced for Visual/E2E/Synth
+   - Savings: ~$8/month
+   - Risk: Low - refinement quality maintained
+3. **Increase block size** (Medium confidence)
+   - Current: Average 2.1 threads/block
+   - Recommended: 2.5-3 threads/block
+   - Savings: ~12% fewer Executor spawns
+   - Risk: Medium - watch for context degradation
+### Patterns Detected
+- **Peak usage:** Monday mornings (avg $18/session vs $12 otherwise)
+- **Most expensive cluster type:** E-commerce projects ($28 avg)
+- **Cheapest tier switch:** Quality → Balanced saves 40%
+### ROI Summary
+| Recommendation | Effort | Monthly Savings |
+|---------------|--------|-----------------|
+| Haiku for verification | Low | $17 |
+| Balanced refinement | Low | $8 |
+| Larger blocks | Medium | $6 |
+| **Total** | - | **$31** |
+```
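The per-spawn figures in the first recommendation follow from the price ratio between models. With the rates in this document, input and output prices scale by the same factor between any two tiers, so a per-spawn cost can be rescaled directly. The sketch below assumes token counts stay the same after the switch; the function name is hypothetical.

```python
# Per-million-token rates copied from the pricing table in this document.
PRICING = {
    "opus": {"input": 5.00, "output": 25.00},
    "sonnet": {"input": 3.00, "output": 15.00},
    "haiku": {"input": 1.00, "output": 5.00},
}


def tier_switch_cost(current_cost: float, from_model: str, to_model: str) -> float:
    """Rescale a per-spawn cost from one model's pricing to another's."""
    old, new = PRICING[from_model], PRICING[to_model]
    in_ratio = new["input"] / old["input"]
    out_ratio = new["output"] / old["output"]
    # Simple scaling is only valid when both rates change by the same factor;
    # otherwise the input/output token split would be needed.
    if abs(in_ratio - out_ratio) > 1e-9:
        raise ValueError("input and output price ratios differ")
    return round(current_cost * in_ratio, 2)
```

`tier_switch_cost(0.87, "sonnet", "haiku")` returns 0.29, the Haiku figure quoted in the report, and the Opus to Sonnet factor of 0.6 corresponds to the 40% saving noted under Patterns Detected.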
+---
+## Integration Points
+### With `/grid:model`
+Model selection directly affects costs:
+```
+/grid:model quality   → Opus everywhere ($$$)
+/grid:model balanced  → Sonnet most places ($$)
+/grid:model budget    → Haiku where safe ($)
+```
+The budget system uses the current model tier when producing estimates.
+### With Cluster Planning
+Pre-execution cost estimate:
+```
+CLUSTER COST ESTIMATE
+=====================
+Cluster: Auth System
+Blocks: 3
+Model Tier: quality
+Breakdown:
+  Planning:    1 spawn  × $1.60 = $1.60
+  Execution:   3 spawns × $2.10 = $6.30
+  Verification: 2 spawns × $1.45 = $2.90
+  Refinement:  5 spawns × $0.90 = $4.50
+                         ────────────────
+  Estimated Total:        $15.30
+Budget Status:
+  Limit:     $50.00
+  Current:   $12.47
+  After:     $27.77 (55.5%)
+Proceed? [Y/n]
+```
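The arithmetic behind this estimate is a sum of spawn count times unit cost per phase, added to current usage and compared with the limit. A sketch with hypothetical names, assuming a positive limit:

```python
def cluster_estimate(phases: dict[str, tuple[int, float]],
                     limit: float, current: float) -> dict:
    """Total a per-phase breakdown and project budget usage after the cluster."""
    if limit <= 0:
        raise ValueError("limit must be positive to compute a usage percentage")
    # Each phase maps to (spawn count, estimated cost per spawn).
    total = round(sum(count * unit for count, unit in phases.values()), 2)
    after = round(current + total, 2)
    return {
        "total": total,
        "after": after,
        "usage_pct": round(after / limit * 100, 1),
    }
```

With the breakdown shown above (1 × $1.60, 3 × $2.10, 2 × $1.45, 5 × $0.90), a $50.00 limit and $12.47 current usage, this yields a total of 15.3, 27.77 after the cluster, and 55.5% usage.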
+### With Quick Mode
+Quick mode skips the Planner and Recognizer spawns, reducing cost:
+```
+QUICK MODE
+==========
+Estimated cost: ~$2.10 (1 Executor spawn)
+vs Full Grid: ~$5.60 (Planner + Executor + Recognizer)
+Savings: ~$3.50
+```
+### With Refinement Swarm
+Refinement can be expensive; budget check shows impact:
+```
+REFINEMENT SWARM
+================
+This will spawn 5 agents:
+  - Visual Inspector   ~$0.87
+  - E2E Exerciser      ~$0.87
+  - 2x Persona Sim     ~$1.92
+  - Refinement Synth   ~$0.87
+                       ────────
+  Total:               ~$4.53
+Budget: $7.21 remaining → $2.68 after refinement
+[!] Budget will be at 94.6% after refinement
+Options:
+  1. Proceed with full swarm
+  2. Skip personas (saves $1.92)
+  3. Skip refinement entirely
+  4. Increase budget: /grid:budget set $60
+Choice [1-4]:
+```
+---
+## Command Reference
+### `/grid:budget` - Status
+Shows current budget status and session costs.
+### `/grid:budget set <amount>` - Set Limit
+```bash
+/grid:budget set $50      # Set $50 limit
+/grid:budget set 100      # Set $100 limit ($ optional)
+/grid:budget set 0        # Set $0 (effectively paused)
+```
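The three forms above (with `$`, without `$`, and zero) suggest a small parsing step before the limit is stored. The sketch below is an assumption about how that might be done. The function name is hypothetical, and rejecting negative or non-finite values is a design choice not stated in the document.

```python
import math


def parse_budget_amount(raw: str) -> float:
    """Parse '$50', '100' or '0' into a USD amount (hypothetical helper)."""
    value = raw.strip().lstrip("$").replace(",", "")
    amount = float(value)  # Raises ValueError for non-numeric input
    if not math.isfinite(amount) or amount < 0:
        raise ValueError(f"invalid budget amount: {raw!r}")
    return round(amount, 2)
```

A parsed value of 0 is valid and leaves the gate blocking every spawn under hard enforcement, which matches the "effectively paused" note.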
+### `/grid:budget estimate` - Pre-Estimate
+Estimates cost for pending work (requires active plan).
+### `/grid:budget report` - Detailed Report
+```bash
+/grid:budget report              # Current session
+/grid:budget report history      # Last 30 days
+/grid:budget report optimize     # Optimization recommendations
+```
+### `/grid:budget reset` - Reset Counters
+Resets current session counters (keeps limit and history).
+### `/grid:budget unlimited` - Remove Limit
+```bash
+/grid:budget unlimited
+WARNING: This removes all spending limits.
+Spawns will continue without cost checks.
+Type "confirm" to proceed:
+```
+---
+## Future Enhancements
+### Phase 2: Improved Estimation
+- **Calibration Learning** - Adjust estimates based on actual patterns
+- **Codebase Complexity Factor** - Larger files = more tokens
+- **Context Carryover** - Track cumulative context growth
+### Phase 3: Cost Alerts
+- **Email/Slack notifications** at thresholds
+- **Daily digest** of spending
+- **Anomaly detection** for unusual spending
+### Phase 4: Team Features
+- **Per-user budgets** for team accounts
+- **Project-level budgets** across sessions
+- **Cost allocation** by tag/label
+### Phase 5: Direct API Integration
+If Anthropic provides usage APIs:
+- **Real metering** instead of estimation
+- **Exact costs** per spawn
+- **Usage dashboards** with API data
+---
+## Appendix: Pricing History
+Track pricing changes to maintain accurate estimates:
+| Date | Model | Input (per 1M) | Output (per 1M) |
+|------|-------|----------------|-----------------|
+| 2026-01 | Opus 4.5 | $5.00 | $25.00 |
+| 2026-01 | Sonnet 4.5 | $3.00 | $15.00 |
+| 2026-01 | Haiku 4.5 | $1.00 | $5.00 |
+| 2025-11 | Opus 4 | $15.00 | $75.00 |
+| 2025-11 | Sonnet 4 | $3.00 | $15.00 |
+*Note: Opus 4.5 brought significant cost reduction (~67% vs Opus 4)*
+---
+## Appendix: Error Handling
+### Budget File Corruption
+If `.grid/budget.json` is missing or corrupted:
+```python
+import json
+def load_budget_config():
+    try:
+        with open('.grid/budget.json', encoding='utf-8') as f:
+            return json.load(f)
+    except (FileNotFoundError, json.JSONDecodeError):
+        # Return safe defaults: tracking continues, but no limit is enforced
+        return {
+            'budget_limit': None,  # Unlimited
+            'enforcement': 'soft',
+            'current_session': {
+                'estimated_cost': 0,
+                'spawns': []
+            },
+            'history': {
+                'total_cost': 0,
+                'total_spawns': 0,
+                'sessions': []
+            }
+        }
+```
+### Session Recovery
+If a session ends unexpectedly:
+```python
+from datetime import datetime, timezone
+def recover_session():
+    """Attempt to recover incomplete session data."""
+    budget = load_budget_config()
+    session = budget.get('current_session', {})
+    if session.get('started') and not session.get('ended'):
+        # Mark session as incomplete (UTC, matching the other timestamps)
+        session['ended'] = datetime.now(timezone.utc).isoformat()
+        session['status'] = 'incomplete'
+        # Archive to history, creating the containers if they are missing
+        history = budget.setdefault('history', {})
+        history.setdefault('sessions', []).append({
+            'id': session.get('id', 'unknown'),
+            'cluster': session.get('cluster', 'unknown'),
+            'spawns': len(session.get('spawns', [])),
+            'cost': session.get('estimated_cost', 0),
+            'status': 'incomplete'
+        })
+        save_budget_config(budget)
+```
+---
+*End of Line.*