npm - anvil-dev-framework - Versions diffs - 0.1.7 → 0.1.9 - Mend

anvil-dev-framework 0.1.7 → 0.1.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (143)

package/README.md +71 -22
package/VERSION +1 -1
package/docs/ANV-263-hook-logging-investigation.md +116 -0
package/docs/command-reference.md +398 -17
package/docs/session-workflow.md +62 -9
package/docs/system-architecture.md +584 -0
package/global/api/__pycache__/ralph_api.cpython-314.pyc +0 -0
package/global/api/openapi.yaml +357 -0
package/global/api/ralph_api.py +528 -0
package/global/commands/anvil-settings.md +47 -19
package/global/commands/audit.md +163 -0
package/global/commands/checklist.md +180 -0
package/global/commands/coderabbit-fix.md +282 -0
package/global/commands/efficiency.md +356 -0
package/global/commands/evidence.md +117 -33
package/global/commands/hud.md +24 -0
package/global/commands/insights.md +101 -3
package/global/commands/orient.md +22 -21
package/global/commands/patterns.md +115 -0
package/global/commands/ralph.md +47 -1
package/global/commands/token-budget.md +214 -0
package/global/commands/weekly-review.md +21 -1
package/global/config/notifications.yaml.template +50 -0
package/global/hooks/ralph_stop.sh +33 -1
package/global/hooks/statusline.sh +67 -2
package/global/lib/__pycache__/coderabbit_metrics.cpython-314.pyc +0 -0
package/global/lib/__pycache__/command_tracker.cpython-314.pyc +0 -0
package/global/lib/__pycache__/context_optimizer.cpython-314.pyc +0 -0
package/global/lib/__pycache__/git_utils.cpython-314.pyc +0 -0
package/global/lib/__pycache__/issue_models.cpython-314.pyc +0 -0
package/global/lib/__pycache__/linear_provider.cpython-314.pyc +0 -0
package/global/lib/__pycache__/optimization_applier.cpython-314.pyc +0 -0
package/global/lib/__pycache__/ralph_state.cpython-314.pyc +0 -0
package/global/lib/__pycache__/ralph_webhooks.cpython-314.pyc +0 -0
package/global/lib/__pycache__/state_manager.cpython-314.pyc +0 -0
package/global/lib/__pycache__/token_analyzer.cpython-314.pyc +0 -0
package/global/lib/__pycache__/token_metrics.cpython-314.pyc +0 -0
package/global/lib/coderabbit_metrics.py +647 -0
package/global/lib/command_tracker.py +147 -0
package/global/lib/context_optimizer.py +323 -0
package/global/lib/linear_provider.py +210 -16
package/global/lib/log_rotation.py +287 -0
package/global/lib/optimization_applier.py +582 -0
package/global/lib/ralph_events.py +398 -0
package/global/lib/ralph_notifier.py +366 -0
package/global/lib/ralph_state.py +264 -24
package/global/lib/ralph_webhooks.py +470 -0
package/global/lib/state_manager.py +121 -0
package/global/lib/token_analyzer.py +1383 -0
package/global/lib/token_metrics.py +919 -0
package/global/tests/__pycache__/test_command_tracker.cpython-314-pytest-9.0.2.pyc +0 -0
package/global/tests/__pycache__/test_context_optimizer.cpython-314-pytest-9.0.2.pyc +0 -0
package/global/tests/__pycache__/test_doc_coverage.cpython-314-pytest-9.0.2.pyc +0 -0
package/global/tests/__pycache__/test_git_utils.cpython-314-pytest-9.0.2.pyc +0 -0
package/global/tests/__pycache__/test_issue_models.cpython-314-pytest-9.0.2.pyc +0 -0
package/global/tests/__pycache__/test_linear_filtering.cpython-314-pytest-9.0.2.pyc +0 -0
package/global/tests/__pycache__/test_linear_provider.cpython-314-pytest-9.0.2.pyc +0 -0
package/global/tests/__pycache__/test_local_provider.cpython-314-pytest-9.0.2.pyc +0 -0
package/global/tests/__pycache__/test_optimization_applier.cpython-314-pytest-9.0.2.pyc +0 -0
package/global/tests/__pycache__/test_token_analyzer.cpython-314-pytest-9.0.2.pyc +0 -0
package/global/tests/__pycache__/test_token_analyzer_phase6.cpython-314-pytest-9.0.2.pyc +0 -0
package/global/tests/__pycache__/test_token_metrics.cpython-314-pytest-9.0.2.pyc +0 -0
package/global/tests/test_command_tracker.py +172 -0
package/global/tests/test_context_optimizer.py +321 -0
package/global/tests/test_linear_filtering.py +319 -0
package/global/tests/test_linear_provider.py +40 -1
package/global/tests/test_optimization_applier.py +508 -0
package/global/tests/test_token_analyzer.py +735 -0
package/global/tests/test_token_analyzer_phase6.py +537 -0
package/global/tests/test_token_metrics.py +829 -0
package/global/tools/README.md +153 -0
package/global/tools/__pycache__/anvil-hud.cpython-314.pyc +0 -0
package/global/tools/__pycache__/orient_linear.cpython-314.pyc +0 -0
package/global/tools/__pycache__/ralph-watchcpython-314.pyc +0 -0
package/global/tools/anvil-hud.py +86 -1
package/global/tools/anvil-memory/src/__tests__/ccs/context-monitor.test.ts +472 -0
package/global/tools/anvil-memory/src/__tests__/ccs/fixtures.ts +405 -0
package/global/tools/anvil-memory/src/__tests__/ccs/index.ts +36 -0
package/global/tools/anvil-memory/src/__tests__/ccs/prompt-generator.test.ts +653 -0
package/global/tools/anvil-memory/src/__tests__/ccs/ralph-stop.test.ts +727 -0
package/global/tools/anvil-memory/src/__tests__/ccs/test-utils.ts +340 -0
package/global/tools/anvil-memory/src/__tests__/commands.test.ts +218 -0
package/global/tools/anvil-memory/src/commands/context.ts +322 -0
package/global/tools/anvil-memory/src/db.ts +108 -0
package/global/tools/anvil-memory/src/index.ts +2 -8
package/global/tools/orient_linear.py +159 -0
package/global/tools/ralph-watch +423 -0
package/package.json +2 -1
package/project/.anvil-project.yaml.template +93 -0
package/project/CLAUDE.md.template +343 -0
package/project/agents/README.md +119 -0
package/project/agents/cross-layer-debugger.md +217 -0
package/project/agents/security-code-reviewer.md +162 -0
package/project/constitution.md.template +235 -0
package/project/coordination.md +103 -0
package/project/docs/background-tasks.md +258 -0
package/project/docs/skills-frontmatter.md +243 -0
package/project/examples/README.md +106 -0
package/project/examples/api-route-template.ts +171 -0
package/project/examples/component-template.tsx +110 -0
package/project/examples/hook-template.ts +152 -0
package/project/examples/service-template.ts +207 -0
package/project/examples/test-template.test.tsx +249 -0
package/project/hooks/README.md +491 -0
package/project/hooks/__pycache__/notification.cpython-314.pyc +0 -0
package/project/hooks/__pycache__/post_tool_use.cpython-314.pyc +0 -0
package/project/hooks/__pycache__/pre_tool_use.cpython-314.pyc +0 -0
package/project/hooks/__pycache__/session_start.cpython-314.pyc +0 -0
package/project/hooks/__pycache__/stop.cpython-314.pyc +0 -0
package/project/hooks/notification.py +183 -0
package/project/hooks/permission_request.py +438 -0
package/project/hooks/post_tool_use.py +397 -0
package/project/hooks/pre_compact.py +126 -0
package/project/hooks/pre_tool_use.py +454 -0
package/project/hooks/session_start.py +656 -0
package/project/hooks/stop.py +356 -0
package/project/hooks/subagent_start.py +223 -0
package/project/hooks/subagent_stop.py +215 -0
package/project/hooks/user_prompt_submit.py +110 -0
package/project/hooks/utils/llm/anth.py +114 -0
package/project/hooks/utils/llm/oai.py +114 -0
package/project/hooks/utils/tts/elevenlabs_tts.py +63 -0
package/project/hooks/utils/tts/mlx_audio_tts.py +86 -0
package/project/hooks/utils/tts/openai_tts.py +92 -0
package/project/hooks/utils/tts/pyttsx3_tts.py +75 -0
package/project/linear.yaml.template +23 -0
package/project/product.md.template +238 -0
package/project/retros/README.md +126 -0
package/project/rules/README.md +90 -0
package/project/rules/debugging.md +139 -0
package/project/rules/security-review.md +115 -0
package/project/settings.yaml.template +185 -0
package/project/specs/SPEC-ANV-72-hud-kanban.md +525 -0
package/project/templates/api-python/CLAUDE.md +547 -0
package/project/templates/generic/CLAUDE.md +260 -0
package/project/templates/saas/CLAUDE.md +478 -0
package/project/tests/README.md +140 -0
package/project/tests/__pycache__/test_transcript_parser.cpython-314-pytest-9.0.2.pyc +0 -0
package/project/tests/fixtures/sample-transcript.jsonl +21 -0
package/project/tests/test-hooks.sh +259 -0
package/project/tests/test-lib.sh +248 -0
package/project/tests/test-statusline.sh +165 -0
package/project/tests/test_transcript_parser.py +323 -0

package/global/commands/efficiency.md ADDED

@@ -0,0 +1,356 @@
+# /efficiency - Historical Token Efficiency Analysis
+> Analyze token consumption patterns over time and identify optimization opportunities.
+## When to Use
+- Weekly/monthly efficiency reviews
+- Identify consistently low-efficiency components
+- Track optimization impact over time
+- Plan CLAUDE.md and hook optimization
+## Variants
+| Command | Description |
+|---------|-------------|
+| `/efficiency` | Weekly report (default, last 7 days) |
+| `/efficiency --weekly` | Explicit weekly report |
+| `/efficiency --monthly` | Monthly report (last 30 days) |
+| `/efficiency --recommendations` | Show only recommendations |
+| `/efficiency --apply [ID]` | Apply a specific recommendation |
+| `/efficiency --apply-all` | Apply all low-risk recommendations |
+| `/efficiency --rollback [ID]` | Rollback a previous optimization |
+| `/efficiency --impact` | Show impact of applied optimizations |
+## Execution Steps
+### Step 1: Load Token Analyzer
+```python
+import sys
+sys.path.insert(0, 'global/lib')
+from token_analyzer import get_analyzer
+analyzer = get_analyzer()
+```
+### Step 2: Generate Report
+```python
+# Weekly report (default)
+report = analyzer.generate_efficiency_report(period_days=7)
+# Monthly report
+report = analyzer.generate_efficiency_report(period_days=30)
+```
+### Step 3: Format and Output
+```python
+formatted = analyzer.format_efficiency_report(report)
+print(formatted)
+```
+### Step 4: Output Report
+Weekly report format:
+```markdown
+## Weekly Efficiency Report
+**Period**: Last 7 days
+**Generated**: 2026-01-15 14:30
+**Overall Efficiency**: 72/100
+### Summary
+- **Sessions Analyzed**: 42
+- **Total Tokens**: 1,250,000
+- **Avg per Session**: 29,762
+### Component Efficiency Scores
+| Component | Type | Score | Utilization | Trend |
+|-----------|------|-------|-------------|-------|
+| patterns | command | 35 | 15% | ↓ |
+| checklist | command | 42 | 22% | → |
+| orient | command | 85 | 92% | ↑ |
+| CLAUDE.md | system | 78 | 100% | → |
+| ready | command | 91 | 88% | ↑ |
+### Top Recommendations
+- 🔴 **Defer loading patterns**: Used only 15% of the time, avg 1,200 tokens
+  - Potential savings: ~1,020 tokens
+- 🔴 **Defer loading checklist**: Used only 22% of the time, avg 800 tokens
+  - Potential savings: ~624 tokens
+- 🟡 **Optimize large-context**: Averaging 3,500 tokens per load
+  - Potential savings: ~1,050 tokens
+```
+## Applying Optimizations
+### Step 1: Load Services
+```python
+import sys
+sys.path.insert(0, 'global/lib')
+from token_analyzer import get_analyzer
+from optimization_applier import OptimizationApplier
+analyzer = get_analyzer()
+applier = OptimizationApplier(auto_commit=False)  # Set True for auto-commit
+```
+### Step 2: Generate and Review Suggestions
+```python
+# Analyze usage patterns
+usage = analyzer.analyze_usage_patterns(days=30)
+# Generate suggestions
+suggestions = analyzer.generate_optimization_suggestions(usage)
+# Format for review
+report = analyzer.format_suggestions_report(suggestions)
+print(report)
+```
+### Step 3: Apply a Specific Recommendation
+```python
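+# Find the suggestion by ID (target_id is the ID passed to --apply)
+suggestion = next((s for s in suggestions if s['id'] == target_id), None)
+if suggestion is None:
+    raise SystemExit(f"No recommendation with ID {target_id}")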
+# Apply with backup
+result = applier.apply_recommendation(
+    recommendation_id=suggestion['id'],
+    recommendation_type=suggestion['type'],
+    description=suggestion['title'],
+    target_files=suggestion['target_files'],
+    changes=suggestion['changes'],
+    estimated_savings=suggestion['estimated_savings']
+)
+if result.success:
+    print(f"✅ Applied optimization #{result.optimization_id}")
+    print(f"   Files modified: {', '.join(result.files_modified)}")
+    print(f"   Tokens saved: {result.savings:,}")
+    print(f"   Backup at: {result.backup_paths[0]}")
+else:
+    print(f"❌ Failed: {result.error_message}")
+```
+### Step 4: Apply All Low-Risk Recommendations
+```python
+# Filter to low-risk only
+low_risk = [s for s in suggestions if s['risk_level'] == 'low']
+applied = []
+for suggestion in low_risk:
+    result = applier.apply_recommendation(
+        recommendation_id=suggestion['id'],
+        recommendation_type=suggestion['type'],
+        description=suggestion['title'],
+        target_files=suggestion['target_files'],
+        changes=suggestion['changes'],
+        estimated_savings=suggestion['estimated_savings']
+    )
+    if result.success:
+        applied.append(result)
+print(f"Applied {len(applied)} optimizations")
+print(f"Total savings: {sum(r.savings for r in applied):,} tokens")
+```
+### Step 5: View Impact Report
+```python
+# Get impact summary
+impact = applier.get_total_savings()
+print(f"Total tokens saved: {impact['total_tokens_saved']:,}")
+print(f"Optimizations applied: {impact['optimizations_count']}")
+print(f"Rollbacks: {impact['reverted_count']}")
+# Detailed report
+print(applier.generate_impact_report())
+```
+### Step 6: Rollback if Needed
+```python
+# Rollback a specific optimization
+success = applier.rollback_optimization(optimization_id=123)
+if success:
+    print("✅ Rollback successful")
+else:
+    print("❌ Rollback failed")
+```
+### Output Format: Apply Result
+```markdown
+## Optimization Applied
+**ID**: OPT-001
+**Type**: defer_loading
+**Description**: Defer loading of patterns command
+### Before/After
+| Metric | Before | After | Change |
+|--------|--------|-------|--------|
+| CLAUDE.md tokens | 3,500 | 2,200 | -1,300 |
+| Initial context | 8,200 | 6,900 | -1,300 |
+### Files Modified
+- `.claude/CLAUDE.md` — Removed patterns section
+- `global/commands/patterns.md` — Content moved here
+### Backup Location
+`.claude/backups/optimizations/CLAUDE.md.20260116_103000.bak`
+### Verify
+Run `/audit` to confirm token reduction in next session.
+### Rollback
+If issues occur: `/efficiency --rollback OPT-001`
+```
+## Efficiency Score Calculation
+Component efficiency score (0-100) is based on:
+| Factor | Points | Criteria |
+|--------|--------|----------|
+| Utilization | 0-50 | % of loads where component was used |
+| Token Cost | 0-30 | Lower avg tokens = higher score |
+| Consistency | 0-20 | Frequent use with high utilization |
+| Score Range | Interpretation |
+|-------------|----------------|
+| 90-100 | Excellent—keep as is |
+| 70-89 | Good—minor optimization possible |
+| 50-69 | Fair—consider optimization |
+| <50 | Poor—candidate for removal/deferral |
+## Trend Indicators
+| Icon | Meaning |
+|------|---------|
+| ↑ | Improving (utilization increasing) |
+| → | Stable (no significant change) |
+| ↓ | Degrading (utilization decreasing) |
+| ★ | New (no previous data) |
+## Key Behaviors
+- Reports compare to previous period when possible
+- Components sorted by efficiency score (lowest first)
+- Recommendations focus on highest-impact improvements
+- Historical data preserved for 90 days
+## Recommendation Categories
+| Category | Priority | Action |
+|----------|----------|--------|
+| defer | High (1-2) | Move to on-demand loading |
+| optimize | Medium (2) | Reduce size or split |
+| review | Low (3) | Evaluate if still needed |
+## Anti-Patterns to Avoid
+- ❌ Running without sufficient historical data (<7 days)
+- ❌ Ignoring degrading trends
+- ❌ Optimizing high-utilization components
+- ❌ Applying high-risk optimizations without review
+- ❌ Applying multiple optimizations without testing in between
+- ❌ Skipping backup verification before rollback
+## Integration Points
+- **Requires**: Phase 1 instrumentation active
+- **Uses**: `global/lib/token_analyzer.py`, `global/lib/optimization_applier.py`
+- **Data source**: `~/.anvil/token_metrics.db`
+- **Backups**: `.claude/backups/optimizations/`
+- **Related commands**: `/audit` (real-time), `/token-budget` (proactive)
+## Recommendations Workflow
+After running `/efficiency`:
+1. Review low-score components (score < 50)
+2. Check trends for degrading patterns
+3. Apply recommendations:
+   - **defer**: Move to on-demand command
+   - **optimize**: Reduce component size
+   - **review**: Consider removal
+4. Track impact in next week's report
+## Example: Acting on Recommendations
+If `/efficiency` recommends deferring `patterns`:
+### Manual Approach
+1. Move detailed patterns from CLAUDE.md to `/patterns` command
+2. Keep only trigger keywords in CLAUDE.md
+3. Run `/audit` to verify reduction
+4. Check next `/efficiency` for improved score
+### Automated Approach (--apply)
+1. Run `/efficiency --recommendations` to see suggestions with IDs
+2. Review the suggestion: "Defer loading of patterns (REC-001)"
+3. Apply: `/efficiency --apply REC-001`
+4. Review before/after comparison
+5. Run `/audit` to verify
+6. If issues occur, roll back with the optimization ID reported by the apply step: `/efficiency --rollback OPT-001`
+## Self-Improvement Loop
+The `/efficiency --apply` system enables a continuous self-improvement loop:
+```
+┌─────────────────────────────────────────────────────┐
+│                 Weekly Cycle                         │
+├─────────────────────────────────────────────────────┤
+│  1. /efficiency              → Generate report       │
+│  2. Review recommendations   → Prioritize by risk    │
+│  3. /efficiency --apply ID   → Apply low-risk first  │
+│  4. /audit                   → Verify improvements   │
+│  5. Monitor next week        → Track trend changes   │
+│  6. /efficiency --rollback   → Revert if needed      │
+└─────────────────────────────────────────────────────┘
+```
+### Risk Assessment
+| Risk Level | Auto-Apply | Review Required | Example |
+|------------|------------|-----------------|---------|
+| Low | ✅ Safe | Optional | Remove unused command |
+| Medium | ⚠️ Caution | Recommended | Defer loading |
+| High | ❌ Never | Required | Modify CLAUDE.md core |
+### Tracking Impact Over Time
+After applying optimizations, track their cumulative impact:
+```bash
+/efficiency --impact
+```
+Output:
+```markdown
+## Optimization Impact Report
+**Active Optimizations**: 4
+**Reverted Optimizations**: 1
+**Total Tokens Saved**: 4,250
+### Active Optimizations
+| ID | Type | Description | Tokens Saved | Applied |
+|-----|------|-------------|--------------|---------|
+| 1 | defer_loading | Defer patterns... | 1,200 | 2026-01-10 |
+| 2 | defer_loading | Defer checklist... | 850 | 2026-01-10 |
+| 3 | reduce_context | Optimize CLAUDE.md... | 1,800 | 2026-01-12 |
+| 4 | prune_rarely_used | Remove unused... | 400 | 2026-01-15 |
+### Reverted Optimizations
+- [5] Remove debug command (reverted 2026-01-14) — caused issues
+```
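
The sample weekly report in `efficiency.md` above is consistent with a simple savings estimate for deferral recommendations: average tokens times the share of loads where the component went unused (1,200 tokens at 15% utilization gives 1,020; 800 tokens at 22% gives 624). The sketch below reproduces that arithmetic and the score interpretation bands. Both function names are hypothetical, and the savings rule is inferred from the sample numbers rather than taken from `token_analyzer.py`.

```python
def estimated_deferral_savings(avg_tokens: int, utilization_pct: int) -> int:
    """Tokens saved per load if the component is loaded only when used.

    Assumption: savings = avg_tokens * (share of loads where it went unused).
    """
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization_pct must be between 0 and 100")
    return avg_tokens * (100 - utilization_pct) // 100


def interpret_score(score: int) -> str:
    """Map a 0-100 efficiency score to the bands in the Score Range table."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Fair"
    return "Poor"


if __name__ == "__main__":
    print(estimated_deferral_savings(1200, 15))  # "patterns" row of the sample
    print(estimated_deferral_savings(800, 22))   # "checklist" row of the sample
    print(interpret_score(72))                   # overall score in the sample
```

The "optimize" recommendation in the same sample (3,500 tokens, about 1,050 saved) follows a different rule and is not covered by this sketch.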

package/global/commands/evidence.md CHANGED

@@ -107,46 +107,128 @@ Skip for: internal refactors, test-only changes, documentation updates.
 ---
-### Step 5: Code Review (Optional)
+### Step 5: Code Review
-If code review is enabled in `.claude/anvil.config.json`:
+Code review is integrated into the evidence workflow when enabled in `.claude/anvil.config.json`.
-1. Check configuration:
-   ```bash
-   # Read config if exists
-   if [ -f ".claude/anvil.config.json" ]; then
-       cat .claude/anvil.config.json | grep -A5 '"codeReview"'
-   fi
-   ```
+#### 5.1: Check Configuration
-2. Based on `codeReview.enforcement` setting:
+```bash
+# Read config if exists
+if [ -f ".claude/anvil.config.json" ]; then
+    REVIEW_ENABLED=$(cat .claude/anvil.config.json | python3 -c "import json,sys; c=json.load(sys.stdin); print(c.get('codeReview',{}).get('enabled',False))")
+    REVIEW_ENFORCEMENT=$(cat .claude/anvil.config.json | python3 -c "import json,sys; c=json.load(sys.stdin); print(c.get('codeReview',{}).get('enforcement','soft'))")
+    PRE_PR=$(cat .claude/anvil.config.json | python3 -c "import json,sys; c=json.load(sys.stdin); print(c.get('codeReview',{}).get('prePR',True))")
+    REVIEW_CMD=$(cat .claude/anvil.config.json | python3 -c "import json,sys; c=json.load(sys.stdin); print(c.get('codeReview',{}).get('command','coderabbit review --plain'))")
+    RETRY_ON_FIX=$(cat .claude/anvil.config.json | python3 -c "import json,sys; c=json.load(sys.stdin); print(c.get('codeReview',{}).get('retryOnFix',True))")
+fi
+```
-   | Enforcement | Behavior |
-   |-------------|----------|
-   | `hard` | Run review automatically. Block PR if critical issues found. |
-   | `soft` | Prompt: "Run code review? (recommended)" Proceed either way. |
-   | `manual` | Skip automatic prompt. User triggers when wanted. |
+#### 5.2: Enforcement Behavior
-3. If enabled, run configured tool:
-   ```bash
-   # Default command (configurable)
-   coderabbit --prompt-only
-   ```
+| Enforcement | Behavior |
+|-------------|----------|
+| `hard` | Run review automatically. **Block PR creation** if critical issues found. Must address before proceeding. |
+| `soft` | Run review automatically. Show warning if issues found, but allow user to proceed with acknowledgment. |
-4. Include results in evidence:
-   ```markdown
-   ### Code Review
-   **Tool**: CodeRabbit
-   **Status**: ✅ No critical issues / ⚠️ X issues found
+#### 5.3: Check CodeRabbit Availability
-   [Summary of findings if any]
-   ```
+```bash
+# Verify CodeRabbit CLI is available
+if ! command -v coderabbit &> /dev/null; then
+    echo "CodeRabbit CLI not found. Install: npm install -g coderabbit"
+    # Graceful fallback - warn but don't block
+fi
+```
-5. If code review not configured:
-   ```markdown
-   ### Code Review
-   Not configured. Enable with `/anvil-settings codeReview on`
-   ```
+#### 5.4: Run Code Review (if enabled and prePR is true)
+```bash
+# Run the configured code review command (default: coderabbit review --plain)
+eval "$REVIEW_CMD" 2>&1 | tee coderabbit-output.txt
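+# Parse results
+# grep -c prints 0 itself when nothing matches but exits non-zero,
+# so neutralize the exit status instead of echoing a second "0"
+ISSUES_COUNT=$(grep -c "issue\|warning\|error" coderabbit-output.txt || true)
+CRITICAL_COUNT=$(grep -c "critical\|security" coderabbit-output.txt || true)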
+```
+#### 5.4.1: Retry After Fix (if retryOnFix is enabled)
+When `codeReview.retryOnFix` is enabled (default: true), the review process supports automatic re-validation:
+1. **Initial Review**: Run code review and capture issues
+2. **User Applies Fixes**: User or agent addresses issues (e.g., via `/coderabbit-fix`)
+3. **Re-run Review**: Automatically re-run code review to verify fixes
+4. **Repeat**: If new issues are found, repeat until clean or the user skips
+This ensures all issues are addressed before PR creation, especially for incremental fix workflows.
+#### 5.5: Handle Results Based on Enforcement
+**Hard Enforcement (blocks on critical issues):**
+```markdown
+### Code Review
+**Tool**: CodeRabbit
+**Enforcement**: Hard
+**Status**: 2 critical issues found
+**Critical Issues:**
+1. [Issue description from CodeRabbit]
+2. [Issue description from CodeRabbit]
+**Action Required**: Must address critical issues before PR creation.
+Run `/coderabbit-fix` to apply suggested fixes automatically.
+```
+**Soft Enforcement (warns but allows proceeding):**
+```markdown
+### Code Review
+**Tool**: CodeRabbit
+**Enforcement**: Soft
+**Status**: 3 issues found (0 critical)
+**Issues:**
+1. [Issue description]
+2. [Issue description]
+3. [Issue description]
+**Note**: Issues found but enforcement is soft. You may proceed with PR creation.
+Consider running `/coderabbit-fix` to address these issues.
+```
+**Clean Review:**
+```markdown
+### Code Review
+**Tool**: CodeRabbit
+**Status**: No issues found
+Code review passed with no issues.
+```
+#### 5.6: Graceful Fallback
+If CodeRabbit is unavailable:
+```markdown
+### Code Review
+**Tool**: CodeRabbit
+**Status**: CodeRabbit unavailable
+CodeRabbit CLI not found or failed to run. Code review skipped.
+- Install: `npm install -g coderabbit`
+- Or disable: `/anvil-settings codeReview off`
+Proceeding without code review.
+```
+#### 5.7: Code Review Disabled
+If code review is explicitly disabled:
+```markdown
+### Code Review
+Disabled in configuration. Enable with `/anvil-settings codeReview on`
+```
+**Note**: Code review is enabled by default in Anvil v1.4+. If you see this message, code review was explicitly disabled via `/anvil-settings codeReview off`.
 ---
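
Step 5.1 above starts one `python3 -c` process per setting. As a rough alternative, the same keys can be read in a single pass. This is a sketch only: `load_code_review_config` and `blocks_pr` are hypothetical names, the defaults are copied from the fallbacks in the shell snippet, and returning the defaults when the file is missing is an assumption (the shell version leaves the variables unset in that case).

```python
import json
from pathlib import Path

# Defaults mirror the fallbacks used in the Step 5.1 shell snippet.
DEFAULTS = {
    "enabled": False,
    "enforcement": "soft",
    "prePR": True,
    "command": "coderabbit review --plain",
    "retryOnFix": True,
}


def load_code_review_config(path: str = ".claude/anvil.config.json") -> dict:
    """Return the codeReview settings, falling back to DEFAULTS per key."""
    config_file = Path(path)
    if not config_file.is_file():
        # Assumption: a missing config behaves like an empty codeReview section.
        return dict(DEFAULTS)
    with config_file.open() as f:
        section = json.load(f).get("codeReview", {})
    return {key: section.get(key, default) for key, default in DEFAULTS.items()}


def blocks_pr(enforcement: str, critical_count: int) -> bool:
    """Per the enforcement table: only 'hard' blocks, and only on critical issues."""
    return enforcement == "hard" and critical_count > 0


if __name__ == "__main__":
    config = load_code_review_config()
    print(config["enforcement"], blocks_pr(config["enforcement"], critical_count=2))
```
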
@@ -245,10 +327,12 @@ Closes [Issue key]
 | Manual Test | ✅ Works | Description or screenshot |
 | Documentation | ⚪ Soft prompt | Status noted |
 | Changelog | ⚪ Soft prompt | Entry added or justified skip |
-| Code Review | ⚪ If configured | Review results (when enabled) |
+| Code Review | ✅ Enabled by default | Review results (address issues before PR) |
 **Legend**: ✅ = Required | ⚪ = Soft prompt (use judgment)
+**Note**: Code review is enabled by default in Anvil v1.4+. Enforcement level (`soft` or `hard`) determines whether issues block PR creation.
 ## Failure Handling
 If any gate fails:

package/global/commands/hud.md CHANGED

@@ -51,11 +51,35 @@ uv run global/tools/anvil-hud.py --demo
 | ⚠ | Context warning (>70%) |
 | 🔴 | Context critical (>85%) |
+### Quality Panel
+The Quality panel (Tab 3) displays quality gate status for each agent's project:
+- **Tests**: Pass/fail status with count
+- **Lint**: Error and warning counts
+- **Types**: TypeScript error count
+- **CI**: GitHub Actions status
+- **CR**: CodeRabbit review status (issues/suggestions)
+#### CodeRabbit Weekly Metrics
+The Quality panel also shows aggregated CodeRabbit metrics from the past week:
+| Metric | Description | Healthy | Warning | Critical |
+|--------|-------------|---------|---------|----------|
+| Reviews | Total reviews this week | — | — | — |
+| Issues | Found vs fixed | — | — | — |
+| Avg/Review | Average issues per review | <2 | 2-5 | >5 |
+| Pass Rate | % of reviews with 0 issues | >50% | 25-50% | <25% |
+| Trend | Week-over-week direction | ↑ improving | → stable | ↓ degrading |
+These metrics are sourced from `~/.anvil/coderabbit_metrics.db` (see `/weekly-review` for details).
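
The thresholds in the CodeRabbit Weekly Metrics table above can be expressed as two small classifiers. This is a minimal sketch with hypothetical function names. It takes a plain list of per-review issue counts as input and does not read `~/.anvil/coderabbit_metrics.db`, whose schema is not shown in this diff.

```python
def classify_avg_issues(avg_per_review: float) -> str:
    """Avg/Review row: <2 healthy, 2-5 warning, >5 critical."""
    if avg_per_review < 2:
        return "healthy"
    if avg_per_review <= 5:
        return "warning"
    return "critical"


def classify_pass_rate(pass_rate_pct: float) -> str:
    """Pass Rate row: >50% healthy, 25-50% warning, <25% critical."""
    if pass_rate_pct > 50:
        return "healthy"
    if pass_rate_pct >= 25:
        return "warning"
    return "critical"


def summarize_week(issues_per_review: list[int]) -> dict:
    """Aggregate one week of reviews; a review 'passes' when it has 0 issues."""
    reviews = len(issues_per_review)
    if reviews == 0:
        return {"reviews": 0, "avg_per_review": None, "pass_rate_pct": None}
    avg = sum(issues_per_review) / reviews
    pass_rate = 100 * sum(1 for n in issues_per_review if n == 0) / reviews
    return {
        "reviews": reviews,
        "avg_per_review": avg,
        "pass_rate_pct": pass_rate,
        "avg_status": classify_avg_issues(avg),
        "pass_rate_status": classify_pass_rate(pass_rate),
    }


if __name__ == "__main__":
    print(summarize_week([0, 0, 3, 1]))
```
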
 ## Data Sources
 The HUD reads from:
 - `~/.anvil/agents.json` - Agent registry (auto-updated by statusline hook)
 - `.claude/anvil-state.json` - Current session state
+- `~/.anvil/coderabbit_metrics.db` - CodeRabbit review metrics (weekly stats)
 ## Troubleshooting

package/global/commands/insights.md CHANGED

@@ -138,6 +138,99 @@ From healthcheck files, extract:
 Note any framework issues that correlate with retro patterns.
+### Step 8: Update Watermark Tracking
+After generating the report (when user selects "save report"), update tracking:
+#### 8.1: Update Manifest
+```python
+import json
+import os
+from datetime import datetime, timezone
+manifest_path = '.claude/insights/.manifest.json'
+report_path = '.claude/insights/YYYY-MM-DD.md'  # Today's report
+retros_analyzed = [...]  # List of retro paths from Step 0.5
+# Load or create manifest
+if os.path.exists(manifest_path):
+    with open(manifest_path) as f:
+        manifest = json.load(f)
+else:
+    manifest = {"version": 1, "processed_retros": {}}
+# Update manifest (timezone-aware; datetime.utcnow() is deprecated since Python 3.12)
+now = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
+manifest["last_run"] = now
+for retro_path in retros_analyzed:
+    manifest["processed_retros"][retro_path] = {
+        "processed_at": now,
+        "insights_report": report_path
+    }
+# Ensure directory exists
+os.makedirs(os.path.dirname(manifest_path), exist_ok=True)
+# Write manifest
+with open(manifest_path, 'w') as f:
+    json.dump(manifest, f, indent=2)
+print(f"✓ Manifest updated: {len(retros_analyzed)} retros marked as processed")
+```
+#### 8.2: Update Retro Frontmatter
+For each analyzed retro, add tracking metadata:
+```python
+import re
+from datetime import date
+def update_retro_frontmatter(retro_path: str, report_path: str):
+    with open(retro_path) as f:
+        content = f.read()
+    today = date.today().isoformat()
+    new_fields = f"insights_processed: {today}\ninsights_report: {report_path}"
+    # Fallback: no usable frontmatter, so prepend a new block
+    new_content = f"---\n{new_fields}\n---\n\n{content}"
+    # Check for existing frontmatter
+    if content.startswith('---'):
+        # Insert before closing ---
+        parts = content.split('---', 2)
+        if len(parts) >= 3:
+            # Remove old insights fields if present
+            frontmatter = re.sub(r'^insights_processed:.*\n?', '', parts[1], flags=re.MULTILINE)
+            frontmatter = re.sub(r'^insights_report:.*\n?', '', frontmatter, flags=re.MULTILINE)
+            # Add new fields
+            new_content = f"---\n{frontmatter.strip()}\n{new_fields}\n---{parts[2]}"
+    with open(retro_path, 'w') as f:
+        f.write(new_content)
+# Apply to all analyzed retros
+for retro_path in retros_analyzed:
+    update_retro_frontmatter(retro_path, report_path)
+print(f"✓ Frontmatter updated in {len(retros_analyzed)} retros")
+```
+#### 8.3: Confirm Tracking Update
+```
+✓ Watermark tracking updated:
+  - Manifest: .claude/insights/.manifest.json
+  - Retros marked: 5
+  - Report: .claude/insights/2026-01-15.md
+Next /insights run will skip these retros unless --all is used.
+```
 ---
 ## Output Format
@@ -291,9 +384,9 @@ After generating the report, offer:
 ## Integration Points
-- **Reads**: `.claude/retros/`, `.claude/healthchecks/`, `.claude/handoffs/`
-- **Modifies**: `CLAUDE.md` (when "apply patch" requested)
-- **Creates**: Linear issues (when "create issues" requested)
+- **Reads**: `.claude/retros/`, `.claude/healthchecks/`, `.claude/handoffs/`, `.claude/insights/.manifest.json`
+- **Modifies**: `CLAUDE.md` (when "apply patch" requested), retro frontmatter (when "save report" requested)
+- **Creates**: Linear issues (when "create issues" requested), `.claude/insights/.manifest.json` (on first run)
 - **Saves**: `.claude/insights/` (when "save report" requested)
 ## Handling Edge Cases
@@ -305,6 +398,11 @@ After generating the report, offer:
 | No patterns found | Report "no recurring patterns" with individual learnings |
 | All patterns are positive | Focus on reinforcement, no fixes needed |
 | Conflicting learnings | Note the conflict, ask for clarification |
+| No manifest exists | First run - treat all retros as unprocessed |
+| Corrupted manifest | Back up the file, warn, treat all retros as unprocessed |
+| No new retros | Show message, suggest `--all` to re-analyze |
+| All retros already processed | Same as "no new retros" |
+| Retro deleted after processing | Remove from manifest on next run |
 ---
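
The manifest edge cases added to the table above (no manifest, corrupted manifest, deleted retros, `--all`) could be handled along these lines. This is a sketch under stated assumptions: the function names, the `.bak` backup suffix, and the `reanalyze_all` parameter are hypothetical. Only the manifest shape (`version`, `processed_retros`) comes from Step 8.1.

```python
import json
import os
import shutil
import warnings


def load_manifest(manifest_path: str) -> dict:
    """Load the manifest; a missing or corrupted file means 'nothing processed'."""
    if not os.path.exists(manifest_path):
        return {"version": 1, "processed_retros": {}}
    try:
        with open(manifest_path) as f:
            manifest = json.load(f)
        if not isinstance(manifest.get("processed_retros"), dict):
            raise ValueError("processed_retros is missing or not a mapping")
        return manifest
    except (ValueError, AttributeError) as exc:  # JSONDecodeError is a ValueError
        backup_path = manifest_path + ".bak"  # hypothetical backup name
        shutil.copy2(manifest_path, backup_path)
        warnings.warn(f"Corrupted manifest ({exc}); backed up to {backup_path}")
        return {"version": 1, "processed_retros": {}}


def select_unprocessed(manifest: dict, retro_paths: list[str],
                       reanalyze_all: bool = False) -> list[str]:
    """Return retros that still need analysis and prune deleted ones."""
    processed = manifest["processed_retros"]
    # Retro deleted after processing: remove its entry from the manifest.
    for path in list(processed):
        if not os.path.exists(path):
            del processed[path]
    if reanalyze_all:  # corresponds to running /insights with --all
        return list(retro_paths)
    return [p for p in retro_paths if p not in processed]
```

An empty result from `select_unprocessed` corresponds to the "No new retros" row, where the command shows a message and suggests `--all`.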