anvil-dev-framework 0.1.6
- package/README.md +719 -0
- package/VERSION +1 -0
- package/docs/ANVIL-REPO-IMPLEMENTATION-PLAN.md +441 -0
- package/docs/FIRST-SKILL-TUTORIAL.md +408 -0
- package/docs/INSTALLATION-RETRO-NOTES.md +458 -0
- package/docs/INSTALLATION.md +984 -0
- package/docs/anvil-hud.md +469 -0
- package/docs/anvil-init.md +255 -0
- package/docs/anvil-state.md +210 -0
- package/docs/boris-cherny-ralph-wiggum-insights.md +608 -0
- package/docs/command-reference.md +2022 -0
- package/docs/hooks-tts.md +368 -0
- package/docs/implementation-guide.md +810 -0
- package/docs/linear-github-integration.md +247 -0
- package/docs/local-issues.md +677 -0
- package/docs/patterns/README.md +419 -0
- package/docs/planning-responsibilities.md +139 -0
- package/docs/session-workflow.md +573 -0
- package/docs/simplification-plan-template.md +297 -0
- package/docs/simplification-principles.md +129 -0
- package/docs/specifications/CCS-RALPH-INTEGRATION-DESIGN.md +633 -0
- package/docs/specifications/CCS-RESEARCH-REPORT.md +169 -0
- package/docs/specifications/PLAN-ANV-verification-ralph-wiggum.md +403 -0
- package/docs/specifications/PLAN-parallel-tracks-anvil-memory-ccs.md +494 -0
- package/docs/specifications/SPEC-ANV-VRW/component-01-verify.md +208 -0
- package/docs/specifications/SPEC-ANV-VRW/component-02-stop-gate.md +226 -0
- package/docs/specifications/SPEC-ANV-VRW/component-03-posttooluse.md +209 -0
- package/docs/specifications/SPEC-ANV-VRW/component-04-ralph-wiggum.md +604 -0
- package/docs/specifications/SPEC-ANV-VRW/component-05-atomic-actions.md +311 -0
- package/docs/specifications/SPEC-ANV-VRW/component-06-verify-subagent.md +264 -0
- package/docs/specifications/SPEC-ANV-VRW/component-07-claude-md.md +363 -0
- package/docs/specifications/SPEC-ANV-VRW/index.md +182 -0
- package/docs/specifications/SPEC-ANV-anvil-memory.md +573 -0
- package/docs/specifications/SPEC-ANV-context-checkpoints.md +781 -0
- package/docs/specifications/SPEC-ANV-verification-ralph-wiggum.md +789 -0
- package/docs/sync.md +122 -0
- package/global/CLAUDE.md +140 -0
- package/global/agents/verify-app.md +164 -0
- package/global/commands/anvil-settings.md +527 -0
- package/global/commands/anvil-sync.md +121 -0
- package/global/commands/change.md +197 -0
- package/global/commands/clarify.md +252 -0
- package/global/commands/cleanup.md +292 -0
- package/global/commands/commit-push-pr.md +207 -0
- package/global/commands/decay-review.md +127 -0
- package/global/commands/discover.md +158 -0
- package/global/commands/doc-coverage.md +122 -0
- package/global/commands/evidence.md +307 -0
- package/global/commands/explore.md +121 -0
- package/global/commands/force-exit.md +135 -0
- package/global/commands/handoff.md +191 -0
- package/global/commands/healthcheck.md +302 -0
- package/global/commands/hud.md +84 -0
- package/global/commands/insights.md +319 -0
- package/global/commands/linear-setup.md +184 -0
- package/global/commands/lint-fix.md +198 -0
- package/global/commands/orient.md +510 -0
- package/global/commands/plan.md +228 -0
- package/global/commands/ralph.md +346 -0
- package/global/commands/ready.md +182 -0
- package/global/commands/release.md +305 -0
- package/global/commands/retro.md +96 -0
- package/global/commands/shard.md +166 -0
- package/global/commands/spec.md +227 -0
- package/global/commands/sprint.md +184 -0
- package/global/commands/tasks.md +228 -0
- package/global/commands/test-and-commit.md +151 -0
- package/global/commands/validate.md +132 -0
- package/global/commands/verify.md +251 -0
- package/global/commands/weekly-review.md +156 -0
- package/global/hooks/__pycache__/ralph_context_monitor.cpython-314.pyc +0 -0
- package/global/hooks/__pycache__/statusline_agent_sync.cpython-314.pyc +0 -0
- package/global/hooks/anvil_memory_observe.ts +322 -0
- package/global/hooks/anvil_memory_session.ts +166 -0
- package/global/hooks/anvil_memory_stop.ts +187 -0
- package/global/hooks/parse_transcript.py +116 -0
- package/global/hooks/post_merge_cleanup.sh +132 -0
- package/global/hooks/post_tool_format.sh +215 -0
- package/global/hooks/ralph_context_monitor.py +240 -0
- package/global/hooks/ralph_stop.sh +502 -0
- package/global/hooks/statusline.sh +1110 -0
- package/global/hooks/statusline_agent_sync.py +224 -0
- package/global/hooks/stop_gate.sh +250 -0
- package/global/lib/.claude/anvil-state.json +21 -0
- package/global/lib/__pycache__/agent_registry.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/claim_service.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/coderabbit_service.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/config_service.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/coordination_service.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/doc_coverage_service.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/gate_logger.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/github_service.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/hygiene_service.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/issue_models.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/issue_provider.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/linear_data_service.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/linear_provider.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/local_provider.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/quality_service.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/ralph_state.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/state_manager.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/transcript_parser.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/verification_runner.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/verify_iteration.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/verify_subagent.cpython-314.pyc +0 -0
- package/global/lib/agent_registry.py +995 -0
- package/global/lib/anvil-state.sh +435 -0
- package/global/lib/claim_service.py +515 -0
- package/global/lib/coderabbit_service.py +314 -0
- package/global/lib/config_service.py +423 -0
- package/global/lib/coordination_service.py +331 -0
- package/global/lib/doc_coverage_service.py +1305 -0
- package/global/lib/gate_logger.py +316 -0
- package/global/lib/github_service.py +310 -0
- package/global/lib/handoff_generator.py +775 -0
- package/global/lib/hygiene_service.py +712 -0
- package/global/lib/issue_models.py +257 -0
- package/global/lib/issue_provider.py +339 -0
- package/global/lib/linear_data_service.py +210 -0
- package/global/lib/linear_provider.py +987 -0
- package/global/lib/linear_provider.py.backup +671 -0
- package/global/lib/local_provider.py +486 -0
- package/global/lib/orient_fast.py +457 -0
- package/global/lib/quality_service.py +470 -0
- package/global/lib/ralph_prompt_generator.py +563 -0
- package/global/lib/ralph_state.py +1202 -0
- package/global/lib/state_manager.py +417 -0
- package/global/lib/transcript_parser.py +597 -0
- package/global/lib/verification_runner.py +557 -0
- package/global/lib/verify_iteration.py +490 -0
- package/global/lib/verify_subagent.py +250 -0
- package/global/skills/README.md +155 -0
- package/global/skills/quality-gates/SKILL.md +252 -0
- package/global/skills/skill-template/SKILL.md +109 -0
- package/global/skills/testing-strategies/SKILL.md +337 -0
- package/global/templates/CHANGE-template.md +105 -0
- package/global/templates/HANDOFF-template.md +63 -0
- package/global/templates/PLAN-template.md +111 -0
- package/global/templates/SPEC-template.md +93 -0
- package/global/templates/ralph/PROMPT.md.template +89 -0
- package/global/templates/ralph/fix_plan.md.template +31 -0
- package/global/templates/ralph/progress.txt.template +23 -0
- package/global/tests/__pycache__/test_doc_coverage.cpython-314.pyc +0 -0
- package/global/tests/test_doc_coverage.py +520 -0
- package/global/tests/test_issue_models.py +299 -0
- package/global/tests/test_local_provider.py +323 -0
- package/global/tools/README.md +178 -0
- package/global/tools/__pycache__/anvil-hud.cpython-314.pyc +0 -0
- package/global/tools/anvil-hud.py +3622 -0
- package/global/tools/anvil-hud.py.bak +3318 -0
- package/global/tools/anvil-issue.py +432 -0
- package/global/tools/anvil-memory/CLAUDE.md +49 -0
- package/global/tools/anvil-memory/README.md +42 -0
- package/global/tools/anvil-memory/bun.lock +25 -0
- package/global/tools/anvil-memory/bunfig.toml +9 -0
- package/global/tools/anvil-memory/package.json +23 -0
- package/global/tools/anvil-memory/src/__tests__/ccs/context-monitor.test.ts +535 -0
- package/global/tools/anvil-memory/src/__tests__/ccs/edge-cases.test.ts +645 -0
- package/global/tools/anvil-memory/src/__tests__/ccs/fixtures.ts +363 -0
- package/global/tools/anvil-memory/src/__tests__/ccs/index.ts +8 -0
- package/global/tools/anvil-memory/src/__tests__/ccs/integration.test.ts +417 -0
- package/global/tools/anvil-memory/src/__tests__/ccs/prompt-generator.test.ts +571 -0
- package/global/tools/anvil-memory/src/__tests__/ccs/ralph-stop.test.ts +440 -0
- package/global/tools/anvil-memory/src/__tests__/ccs/test-utils.ts +252 -0
- package/global/tools/anvil-memory/src/__tests__/commands.test.ts +657 -0
- package/global/tools/anvil-memory/src/__tests__/db.test.ts +641 -0
- package/global/tools/anvil-memory/src/__tests__/hooks.test.ts +272 -0
- package/global/tools/anvil-memory/src/__tests__/performance.test.ts +427 -0
- package/global/tools/anvil-memory/src/__tests__/test-utils.ts +113 -0
- package/global/tools/anvil-memory/src/commands/checkpoint.ts +197 -0
- package/global/tools/anvil-memory/src/commands/get.ts +115 -0
- package/global/tools/anvil-memory/src/commands/init.ts +94 -0
- package/global/tools/anvil-memory/src/commands/observe.ts +163 -0
- package/global/tools/anvil-memory/src/commands/search.ts +112 -0
- package/global/tools/anvil-memory/src/db.ts +638 -0
- package/global/tools/anvil-memory/src/index.ts +205 -0
- package/global/tools/anvil-memory/src/types.ts +122 -0
- package/global/tools/anvil-memory/tsconfig.json +29 -0
- package/global/tools/ralph-loop.sh +359 -0
- package/package.json +45 -0
- package/scripts/anvil +822 -0
- package/scripts/extract_patterns.py +222 -0
- package/scripts/init-project.sh +541 -0
- package/scripts/install.sh +229 -0
- package/scripts/postinstall.js +41 -0
- package/scripts/rollback.sh +188 -0
- package/scripts/sync.sh +623 -0
- package/scripts/test-statusline.sh +248 -0
- package/scripts/update_claude_md.py +224 -0
- package/scripts/verify.sh +255 -0
@@ -0,0 +1,169 @@
# Exploration Report: Context Checkpoint System Research

**Date**: 2026-01-07
**Topic**: Auto-compact behavior and context preservation strategies
**Trigger**: Tweet from @dani_avila7 about disabling auto-compact

---

## Research Question

Can disabling Claude Code's auto-compact improve context preservation, and should this be incorporated into the Anvil framework?

---

## Sources Consulted

### Primary Sources

| Source | Type | Key Finding |
|--------|------|-------------|
| GitHub Issue #6689 | Feature request | Users requesting `--no-auto-compact` flag |
| GitHub Issue #12053 | Feature request | Request to disable auto-compact buffer |
| Claude Code `/config` | Official feature | Auto-compact CAN be disabled via toggle |
| DEV.to article | User report | Disabling recovers ~22.5% context (45k tokens) |

### Secondary Sources

| Source | Type | Key Finding |
|--------|------|-------------|
| matsen.fhcrc.org | Best practices | Planning docs preserve understanding across sessions |
| arsturn.com | Developer guide | Manual `/compact` at milestones recommended |
| ClaudeLog documentation | Community resource | Strategic manual compacting over auto-compact |

### Existing Anvil Components Analyzed

| Component | Location | Relevance |
|-----------|----------|-----------|
| `statusline.sh` | `global/hooks/` | Has 70/85/95% thresholds, visual context bar |
| `shard.md` | `global/commands/` | Task breakdown to prevent context overrun |
| `handoff.md` | `global/commands/` | Session continuity documents |
| `pre_compact.py` | `project/hooks/` | Backup transcript before compaction |

---

## Key Findings

### Finding 1: Auto-Compact Can Be Disabled

**Method**: `/config` command → toggle "Auto-compact enabled" OFF
**Storage**: `~/.claude.json` as `{"autoCompactEnabled": false}`
**Effect**: Recovers ~22.5% of context window (~45k tokens on 200k model)
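
As a rough illustration of the storage and effect claims above, the sketch below reads a JSON config and reports whether auto-compact is on, then checks the 22.5% arithmetic. The key name `autoCompactEnabled` comes from the finding. The helper names, the default of `True` when the file or key is absent, and the temporary file standing in for `~/.claude.json` are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def auto_compact_enabled(config_path: Path) -> bool:
    """Return the autoCompactEnabled flag from a JSON config file.

    Assumption: a missing file or missing key means auto-compact is on.
    """
    if not config_path.exists():
        return True
    data = json.loads(config_path.read_text(encoding="utf-8"))
    return bool(data.get("autoCompactEnabled", True))


def recovered_tokens(window: int, fraction: float) -> int:
    """Tokens freed when the auto-compact buffer is not reserved."""
    return round(window * fraction)


# Demo against a temporary file rather than the real ~/.claude.json.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "claude.json"
    demo.write_text(json.dumps({"autoCompactEnabled": False}), encoding="utf-8")
    demo_flag = auto_compact_enabled(demo)
    missing_flag = auto_compact_enabled(Path(tmp) / "absent.json")

print(demo_flag, missing_flag, recovered_tokens(200_000, 0.225))
```

The last value confirms that 22.5% of a 200k window is the ~45k tokens quoted in the finding.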

### Finding 2: Context Loss After Compaction is Well-Documented

Multiple users report Claude "forgets" after auto-compaction:
- CLAUDE.md instructions
- Tool configurations
- Learned patterns and decisions
- Project-specific rules

**Evidence quality**: Anecdotal but consistent across multiple sources

### Finding 3: Trade-off Between Approaches

| Approach | Pros | Cons |
|----------|------|------|
| Auto-compact ON | Sessions continue indefinitely | Lossy summarization, context degradation |
| Auto-compact OFF | More usable context, no degradation | Sessions end hard at limit |
| Manual compaction | Control over timing | Requires user awareness |
| Structured handoff | Zero-loss preservation | Requires session restart |

### Finding 4: Best Practices from External Research

1. **Planning documents** preserve understanding across sessions
2. **Small components** reduce context overhead per task
3. **Front-load instructions** in CLAUDE.md for persistence
4. **Subagents** for specialized tasks reduce main context usage
5. **One task per chat** prevents context bloat
6. **Checkpoint at milestones** not arbitrary points

### Finding 5: Existing Anvil Coverage

Anvil already has partial solutions:
- `pre_compact.py` - Backs up before compaction (reactive)
- `/handoff` - Creates continuity documents (manual)
- Anti-pattern in CLAUDE.md - "Uncommitted work before compaction"
- `statusline.sh` - Visual context monitoring

**Gap**: No proactive, automatic checkpoint triggering

---

## Analysis

### Why Simply Disabling Auto-Compact is Not Enough

1. **Hard session endings** - Work stops abruptly at limit
2. **No continuation path** - Next session starts cold
3. **Lost work risk** - Uncommitted changes may be lost
4. **No Linear integration** - Issue state not preserved

### What's Needed: Intelligent Context Checkpoint System

A system that:
1. Monitors context proactively (not just visually)
2. Triggers structured handoff at optimal breakpoints
3. Commits WIP before checkpoint
4. Updates Linear with progress
5. Provides clear continuation path for next session
6. Integrates with sharding to prevent overruns

---

## Threshold Model

Based on existing `statusline.sh` thresholds:

| Level | Threshold | Current Behavior | Proposed CCS Behavior |
|-------|-----------|------------------|----------------------|
| L0 | 0-69% | Green indicator | Normal operation |
| L1 | 70-84% | Yellow warning | Alert + prepare handoff |
| L2 | 85-94% | Red critical | Initiate handoff sequence |
| L3 | 95%+ | COMPACT warning | Force immediate handoff |
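
The threshold table can be read as a simple classifier. The following is a minimal sketch, not the logic in `statusline.sh`: the function name `context_level` is hypothetical, and treating fractional values such as 69.5% as belonging to the lower level is an assumption, since the table only lists whole percentages.

```python
def context_level(percent_used: float) -> str:
    """Map context usage (0-100) to the L0-L3 levels from the table."""
    if percent_used < 0:
        raise ValueError("percent_used must be non-negative")
    if percent_used >= 95:
        return "L3"
    if percent_used >= 85:
        return "L2"
    if percent_used >= 70:
        return "L1"
    return "L0"


# Proposed CCS behavior per level, copied from the table.
PROPOSED_ACTION = {
    "L0": "Normal operation",
    "L1": "Alert + prepare handoff",
    "L2": "Initiate handoff sequence",
    "L3": "Force immediate handoff",
}

for pct in (42, 70, 84.9, 85, 95):
    level = context_level(pct)
    print(f"{pct}% -> {level}: {PROPOSED_ACTION[level]}")
```

Checking the highest threshold first keeps each branch a single comparison and makes the boundaries (70, 85, 95) easy to audit against the table.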

---

## Recommendations

### Recommendation 1: Do NOT Disable Auto-Compact by Default

**Rationale**: Without a robust alternative, disabling creates worse UX (hard session endings).

### Recommendation 2: Build Context Checkpoint System (CCS)

**Rationale**: Provides intelligent alternative that works with or without auto-compact.

### Recommendation 3: Integrate CCS with Ralph Wiggum

**Rationale**: Ralph's continuous loops will frequently hit context limits; needs built-in checkpointing.

### Recommendation 4: Enhance Existing Components

Build on:
- `statusline.sh` for monitoring
- `/handoff` for document generation
- `/shard` for task breakdown
- `pre_compact.py` for backup patterns

---

## Open Questions for Design

1. How does CCS integrate with Ralph Wiggum's loop architecture?
2. Should CCS work alongside auto-compact or replace it entirely?
3. How do we estimate task size before starting?
4. What's the checkpoint document format for Ralph continuity?

---

## Next Steps

1. Deep analysis of Ralph Wiggum system architecture
2. Design CCS-Ralph integration points
3. Update specification with Ralph requirements
4. Create implementation plan

---

*Research conducted 2026-01-07. Specification draft created at `.claude/specs/current/SPEC-ANV-context-checkpoints.md`*
@@ -0,0 +1,403 @@
---
plan_id: PLAN-ANV-VRW
spec_id: SPEC-ANV-VRW
title: Verification Feedback Loops & Ralph Wiggum Implementation Plan
status: draft
created: 2026-01-04
linear_parent: ANV-140
---

# Implementation Plan: Verification Feedback Loops & Ralph Wiggum Mode

## Overview

This plan breaks down the SPEC-ANV-VRW specification into actionable implementation phases. Each phase is designed to be completed in a single focused session; per-phase estimates range from 2 to 6 hours.

---

## Phase 1: Verification Command Foundation

**Linear Issue**: ANV-141
**Estimated Time**: 4-6 hours
**Dependencies**: None

### Objectives
- Implement `/verify` skill with full feedback loop
- Create verification configuration schema
- Build iteration tracking system

### Tasks

1. **Create Verification Skill** (`global/commands/verify.md`)
   - [ ] Define skill frontmatter and metadata
   - [ ] Document execution steps
   - [ ] Add argument parsing (custom commands)
   - [ ] Include anti-patterns and best practices

2. **Create Verification Runner** (`global/lib/verification_runner.py`)
   - [ ] Load verification config from `.claude/settings.yaml`
   - [ ] Execute test suite with output capture
   - [ ] Execute lint with output parsing
   - [ ] Execute typecheck with error extraction
   - [ ] Return structured results (pass/fail/errors)

3. **Implement Iteration Logic** (`global/lib/verify_iteration.py`)
   - [ ] Parse verification failures
   - [ ] Identify failing files and line numbers
   - [ ] Track iteration count (max 3)
   - [ ] Generate fix suggestions
   - [ ] Re-run verification after fix attempts

4. **Add Configuration Support**
   - [ ] Create schema for `.claude/settings.yaml` verification section
   - [ ] Support project-specific commands
   - [ ] Add `max_iterations` setting
   - [ ] Add `required_for_completion` flag
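
The runner described in task 2 might look roughly like the sketch below: each configured command is executed, its output captured, and a structured pass/fail result returned. This is not the contents of `verification_runner.py`; the `CheckResult` type, the `run_checks` name, and the stand-in commands are assumptions, and loading from `.claude/settings.yaml` is left out so the example has no third-party dependencies.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_checks(commands: dict[str, list[str]], timeout: int = 300) -> list[CheckResult]:
    """Run each configured command and capture its combined output."""
    results = []
    for name, argv in commands.items():
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
        results.append(CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr))
    return results


# Stand-in commands so the sketch runs anywhere; a real config would
# name the project's test, lint, and typecheck commands instead.
demo_commands = {
    "test": [sys.executable, "-c", "print('2 passed')"],
    "lint": [sys.executable, "-c", "import sys; print('E501 line too long'); sys.exit(1)"],
}
demo_results = run_checks(demo_commands)
for r in demo_results:
    print(r.name, "PASS" if r.passed else "FAIL")
all_passed = all(r.passed for r in demo_results)
```

Keeping the captured output on each result is what would let the iteration logic in task 3 parse failing files and line numbers without re-running the command.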

### Acceptance Criteria
- [ ] `/verify` runs configured test/lint/type commands
- [ ] On failure, automatically reads failing files
- [ ] Attempts up to 3 fix iterations
- [ ] Reports clear pass/fail status with details
- [ ] Respects project configuration

### Test Cases
- `/verify` with all passing checks
- `/verify` with single failing test
- `/verify` with multiple failures
- `/verify` reaching max iterations
- `/verify` with custom commands

---

## Phase 2: Stop Hook Verification Gate

**Linear Issue**: ANV-142
**Estimated Time**: 3-4 hours
**Dependencies**: Phase 1

### Objectives
- Create stop hook that gates on verification
- Implement force-exit override
- Add gate event logging

### Tasks

1. **Create Stop Gate Hook** (`global/hooks/stop_gate.sh`)
   - [ ] Check `ANVIL_REQUIRE_VERIFICATION` env var
   - [ ] Check `ANVIL_FORCE_EXIT` override
   - [ ] Run test suite verification
   - [ ] Run lint verification
   - [ ] Run typecheck verification
   - [ ] Return non-zero on any failure

2. **Create Force Exit Command** (`global/commands/force-exit.md`)
   - [ ] Set `ANVIL_FORCE_EXIT=true`
   - [ ] Log override reason
   - [ ] Trigger actual exit

3. **Update Settings Hook Registration**
   - [ ] Add stop_gate.sh to Stop hooks
   - [ ] Document hook ordering

4. **Add Gate Logging**
   - [ ] Create log file at `.claude/logs/stop_gate.log`
   - [ ] Log each gate event with timestamp
   - [ ] Log pass/fail status and reason
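
The gate's decision order can be sketched as a pure function, shown here in Python for readability even though the planned hook is a shell script. The environment variable names come from the tasks above. The function names, the log line layout, and the assumption that `ANVIL_REQUIRE_VERIFICATION` is enabled by the literal string `"true"` are illustrative only.

```python
from datetime import datetime, timezone


def gate_decision(env: dict[str, str], checks: dict[str, bool]) -> tuple[int, str]:
    """Return (exit_code, reason) for a stop attempt.

    Assumption: both env vars are switched on by the literal string "true".
    """
    if env.get("ANVIL_FORCE_EXIT") == "true":
        return 0, "force-exit override"
    if env.get("ANVIL_REQUIRE_VERIFICATION") != "true":
        return 0, "verification not required"
    failed = [name for name, ok in checks.items() if not ok]
    if failed:
        return 1, "blocked by: " + ", ".join(failed)
    return 0, "all checks passed"


def log_line(exit_code: int, reason: str) -> str:
    """Format one gate event as a timestamped log line."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    status = "PASS" if exit_code == 0 else "FAIL"
    return f"{stamp} {status} {reason}"


code, reason = gate_decision(
    {"ANVIL_REQUIRE_VERIFICATION": "true"},
    {"test": False, "lint": True, "typecheck": True},
)
print(log_line(code, reason))
```

Separating the decision from the logging keeps every branch (override, disabled, blocked, passed) testable without touching the filesystem.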

### Acceptance Criteria
- [ ] Cannot exit with failing tests (exit code 1)
- [ ] Clear message shows what's blocking exit
- [ ] `/force-exit` bypasses gate with warning
- [ ] All gate events logged

### Test Cases
- Exit attempt with passing verification
- Exit attempt with failing tests
- Exit attempt with lint errors
- Force exit override
- Verification disabled scenario

---

## Phase 3: PostToolUse Formatting Hook

**Linear Issue**: ANV-143
**Estimated Time**: 2-3 hours
**Dependencies**: None (can parallel Phase 1-2)

### Objectives
- Auto-format code after every Edit/Write
- Support multiple formatters per language
- Ensure <500ms execution

### Tasks

1. **Create PostToolUse Hook** (`global/hooks/post_tool_format.sh`)
   - [ ] Extract `CLAUDE_FILE_PATH` from environment
   - [ ] Skip non-existent and binary files
   - [ ] Skip node_modules, .git, etc.
   - [ ] Detect file type from extension
   - [ ] Run appropriate formatter

2. **Formatter Configuration**
   - [ ] TypeScript/JavaScript: prettier
   - [ ] Python: black or ruff
   - [ ] Go: gofmt
   - [ ] Bash: shfmt
   - [ ] Add fallback when formatter unavailable

3. **Performance Optimization**
   - [ ] Measure execution time
   - [ ] Skip if formatter not installed
   - [ ] Use `--silent` flags to reduce output

4. **Register Hook**
   - [ ] Add to PostToolUse with matcher "Edit|Write"
   - [ ] Test with both Edit and Write operations
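
The selection logic in tasks 1 and 2 (skip certain directories, pick a formatter by extension, fall back quietly when it is not installed) could be sketched as follows. This is an illustration in Python rather than the planned shell hook. The `FORMATTERS` mapping, the choice of `black` over `ruff`, and the injectable `which` parameter are assumptions, and the existence and binary-file checks from task 1 are omitted.

```python
import shutil
from pathlib import Path

# Extension -> formatter argv prefix; the file path is appended at call time.
FORMATTERS = {
    ".ts": ["prettier", "--write"],
    ".tsx": ["prettier", "--write"],
    ".js": ["prettier", "--write"],
    ".jsx": ["prettier", "--write"],
    ".py": ["black", "--quiet"],
    ".go": ["gofmt", "-w"],
    ".sh": ["shfmt", "-w"],
}
SKIP_DIRS = {"node_modules", ".git"}


def formatter_command(file_path: str, which=shutil.which):
    """Return the argv to format file_path, or None if it should be skipped."""
    path = Path(file_path)
    if SKIP_DIRS.intersection(path.parts):
        return None  # vendored or VCS-internal file
    argv = FORMATTERS.get(path.suffix)
    if argv is None:
        return None  # no formatter configured for this extension
    if which(argv[0]) is None:
        return None  # formatter not installed: skip quietly
    return [*argv, str(path)]


# Pretend only prettier is installed, to show the fallback path.
fake_which = lambda name: "/usr/bin/prettier" if name == "prettier" else None
print(formatter_command("src/app.ts", fake_which))
print(formatter_command("node_modules/x/index.js", fake_which))
print(formatter_command("tool.py", fake_which))
```

Returning `None` instead of raising matches the "graceful skip" behavior the plan asks for when a formatter is missing.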

### Acceptance Criteria
- [ ] All JS/TS files formatted after Edit/Write
- [ ] Python files formatted after Edit/Write
- [ ] Execution completes in <500ms
- [ ] No errors on missing formatters

### Test Cases
- Edit TypeScript file → formatted
- Write new Python file → formatted
- Edit binary file → skipped
- Write to node_modules → skipped
- No prettier installed → graceful skip

---

## Phase 4: Ralph Wiggum Core Infrastructure

**Linear Issue**: ANV-144
**Estimated Time**: 4-5 hours
**Dependencies**: Phase 1, Phase 2

### Objectives
- Implement Ralph Wiggum stop hook
- Create PROMPT.md templating
- Build iteration tracking system

### Tasks

1. **Create Ralph Stop Hook** (`global/hooks/ralph_stop.sh`)
   - [ ] Read iteration count from temp file
   - [ ] Increment iteration counter
   - [ ] Check max iterations limit
   - [ ] Check for `<promise>COMPLETE</promise>`
   - [ ] Check for `<fatal>` error signal
   - [ ] Return appropriate exit code

2. **Create Prompt Templates** (`global/templates/ralph/`)
   - [ ] `PROMPT.md.template` - Main task prompt
   - [ ] `fix_plan.md.template` - TODO list template
   - [ ] Variable substitution support

3. **Create `/ralph` Skill** (`global/commands/ralph.md`)
   - [ ] `start [description]` - Initialize Ralph mode
   - [ ] `status` - Show current progress
   - [ ] `stop` - Force exit Ralph mode
   - [ ] Document constraints and rules

4. **Iteration State Management**
   - [ ] Store state in `.claude/ralph-state.json`
   - [ ] Track: iteration count, last action, remaining items
   - [ ] Clean up on completion
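
The stop hook's checks from task 1 reduce to a small state transition, sketched here in Python rather than shell. The two marker strings come from the plan. The `next_state` name, the state dictionary layout, the default limit of 10 iterations, and the precedence of `<fatal>` over the completion promise are assumptions for illustration.

```python
import json

COMPLETE_MARKER = "<promise>COMPLETE</promise>"
FATAL_MARKER = "<fatal>"


def next_state(state: dict, last_output: str, max_iterations: int = 10) -> dict:
    """Advance the iteration counter and decide whether the loop continues."""
    iteration = state.get("iteration", 0) + 1
    if FATAL_MARKER in last_output:
        outcome = "fatal"
    elif COMPLETE_MARKER in last_output:
        outcome = "complete"
    elif iteration >= max_iterations:
        outcome = "max_iterations"
    else:
        outcome = "continue"
    return {"iteration": iteration, "outcome": outcome}


state = {}
for output in ("working on item 1", "item 2 done", f"all done {COMPLETE_MARKER}"):
    state = next_state(state, output)
    print(json.dumps(state))
```

A hook built this way would keep blocking exit only while the outcome is `"continue"`; the other three outcomes all end the loop, which is what makes the max-iterations limit a safety stop.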

### Acceptance Criteria
- [ ] `/ralph start` creates PROMPT.md and fix_plan.md
- [ ] Stop hook blocks exit until `<promise>COMPLETE</promise>`
- [ ] Max iterations safety stops runaway execution
- [ ] `/ralph status` shows current progress
- [ ] `/ralph stop` cleanly exits

### Test Cases
- Start Ralph mode with simple task
- Verify stop hook blocks incomplete exit
- Verify completion promise allows exit
- Verify max iterations trigger
- Status shows accurate progress

---

## Phase 5: Ralph Wiggum Execution Loop

**Linear Issue**: ANV-145
**Estimated Time**: 3-4 hours
**Dependencies**: Phase 4

### Objectives
- Implement the bash execution loop
- Add subagent integration for research
- Enable self-learning AGENT.md updates

### Tasks

1. **Create Execution Script** (`global/tools/ralph-loop.sh`)
   - [ ] Read PROMPT.md
   - [ ] Pipe to Claude CLI
   - [ ] Check exit code
   - [ ] Loop until completion or max iterations

2. **Subagent Integration**
   - [ ] Document research subagent usage (max 500)
   - [ ] Document build subagent limit (1)
   - [ ] Add context pollution warnings

3. **Self-Learning Updates**
   - [ ] Allow Claude to append to AGENT.md
   - [ ] Extract learnings from completed iterations
   - [ ] Persist patterns across sessions

4. **Progress Reporting**
   - [ ] Real-time iteration updates
   - [ ] TODO completion tracking
   - [ ] Time per iteration metrics

### Acceptance Criteria
- [ ] Loop executes until completion
- [ ] Each iteration processes one TODO item
- [ ] Subagent usage follows limits
- [ ] AGENT.md updated with learnings
- [ ] Progress visible during execution

### Test Cases
- Complete 3-item TODO list
- Verify one-item-per-iteration constraint
- Test subagent spawning during research
- Test AGENT.md learning updates

---

## Phase 6: Atomic Action Commands

**Linear Issue**: ANV-146
**Estimated Time**: 3-4 hours
**Dependencies**: Phase 1

### Objectives
- Create rapid inner-loop commands
- Ensure atomic execution (all-or-nothing)

### Tasks

1. **Create `/test-and-commit`** (`global/commands/test-and-commit.md`)
   - [ ] Run test suite
   - [ ] If pass: stage and commit
   - [ ] If fail: report errors, no commit
   - [ ] Accept commit message argument

2. **Create `/commit-push-pr`** (`global/commands/commit-push-pr.md`)
   - [ ] Stage all changes
   - [ ] Create commit with message
   - [ ] Push to remote
   - [ ] Create PR with `gh pr create`
   - [ ] Return PR URL

3. **Create `/lint-fix`** (`global/commands/lint-fix.md`)
   - [ ] Run linter with `--fix`
   - [ ] Report remaining unfixable issues
   - [ ] Stage auto-fixed changes

4. **Error Handling**
   - [ ] Clear failure messages
   - [ ] No partial state on failure
   - [ ] Rollback on error
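
The all-or-nothing behavior for `/test-and-commit` can be sketched with an injectable command runner, so the control flow is visible without a real repository. The function name `run_test_and_commit`, the `Runner` type, the use of `pytest -q` as the test command, and the use of `git reset` to undo staging are assumptions; the plan does not say how rollback is implemented.

```python
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]  # runs argv, returns the exit code


def run_test_and_commit(run: Runner, test_cmd: Sequence[str], message: str) -> str:
    """Commit only when the test command succeeds; otherwise touch nothing."""
    if run(test_cmd) != 0:
        return "tests failed: nothing staged or committed"
    if run(["git", "add", "--all"]) != 0:
        return "staging failed"
    if run(["git", "commit", "-m", message]) != 0:
        # Undo the staging step so no partial state is left behind.
        run(["git", "reset"])
        return "commit failed: staging rolled back"
    return "committed"


# Fake runner that records calls, so the demo needs no repository.
calls: list[list[str]] = []


def failing_tests(argv: Sequence[str]) -> int:
    calls.append(list(argv))
    return 1 if argv[0] == "pytest" else 0


result = run_test_and_commit(failing_tests, ["pytest", "-q"], "fix: parser")
print(result)
print(calls)
```

In real use the runner could be something like `lambda argv: subprocess.run(argv).returncode`; the recorded call list shows that no git command runs when the tests fail.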

### Acceptance Criteria
- [ ] `/test-and-commit` only commits if tests pass
- [ ] `/commit-push-pr` creates complete PR
- [ ] `/lint-fix` auto-fixes lintable issues
- [ ] All commands have clear success/failure output

### Test Cases
- `/test-and-commit` with passing tests
- `/test-and-commit` with failing tests
- `/commit-push-pr` full workflow
- `/lint-fix` with fixable issues

---

## Phase 7: Verify Subagent & CLAUDE.md Automation

**Linear Issue**: ANV-147
**Estimated Time**: 4-5 hours
**Dependencies**: Phase 1, Phase 6

### Objectives
- Create dedicated verification subagent
- Implement GitHub Action for CLAUDE.md updates

### Tasks

1. **Create Verify Subagent** (`global/agents/verify-app.json`)
   - [ ] Define agent configuration
   - [ ] Limit to verification tools
   - [ ] Create invocation pattern

2. **Create Pattern Extraction Script** (`scripts/extract_patterns.py`)
   - [ ] Parse commit messages
   - [ ] Identify recurring patterns
   - [ ] Output structured JSON

3. **Create CLAUDE.md Updater** (`scripts/update_claude_md.py`)
   - [ ] Read patterns.json
   - [ ] Update Project-Learned Patterns table
   - [ ] Preserve existing content

4. **Create GitHub Action** (`.github/workflows/update-claude-md.yml`)
   - [ ] Weekly schedule (Sundays)
   - [ ] Manual trigger support
   - [ ] Pattern extraction step
   - [ ] CLAUDE.md update step
   - [ ] PR creation (no auto-merge)
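
One way to read "parse commit messages, identify recurring patterns, output structured JSON" is to count conventional-commit prefixes. The sketch below does that. It is not the contents of `scripts/extract_patterns.py`: the regular expression, the `min_count` threshold, the output shape, and the sample messages are all assumptions.

```python
import json
import re
from collections import Counter

# Conventional-commit style prefix, e.g. "fix(parser): handle empty input".
PREFIX_RE = re.compile(r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]+)\))?:\s")


def extract_patterns(messages: list[str], min_count: int = 2) -> dict:
    """Count commit types and scopes; keep those seen at least min_count times."""
    types, scopes = Counter(), Counter()
    for msg in messages:
        match = PREFIX_RE.match(msg)
        if not match:
            continue  # message does not follow the prefix convention
        types[match["type"]] += 1
        if match["scope"]:
            scopes[match["scope"]] += 1
    keep = lambda counter: {k: n for k, n in counter.most_common() if n >= min_count}
    return {"types": keep(types), "scopes": keep(scopes)}


sample = [
    "fix(hooks): quote file paths",
    "fix(hooks): handle missing formatter",
    "feat(hud): add context bar",
    "fix(hud): off-by-one in threshold",
    "update readme",
]
patterns = extract_patterns(sample)
print(json.dumps(patterns, indent=2))
```

In practice the message list could come from `git log --format=%s`; the JSON printed here would then play the role of the `patterns.json` file that the updater in task 3 reads.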

### Acceptance Criteria
- [ ] verify-app subagent runs all checks
- [ ] Pattern extraction identifies common patterns
- [ ] CLAUDE.md updated with new patterns
- [ ] PR created for human review

### Test Cases
- Invoke verify-app subagent
- Extract patterns from test commits
- Update CLAUDE.md with new patterns
- GitHub Action runs on schedule

---

## Implementation Schedule

| Phase | Name | Blocked By | Priority |
|-------|------|------------|----------|
| 1 | Verification Command | None | P0 |
| 2 | Stop Hook Gate | Phase 1 | P0 |
| 3 | PostToolUse Hook | None | P0 |
| 4 | Ralph Core | Phases 1, 2 | P1 |
| 5 | Ralph Execution | Phase 4 | P1 |
| 6 | Atomic Actions | Phase 1 | P1 |
| 7 | Subagent & Automation | Phases 1, 6 | P2 |

**Note**: Phases 1 and 3 can start in parallel; Phase 2 is blocked by Phase 1. Phase 6 can run in parallel with Phases 4-5.
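
Working only from the "Blocked By" column of the schedule table, the earliest point at which each phase can start follows from a topological sort. The sketch below uses the standard library's `graphlib` (Python 3.9+). Grouping phases into "waves" is an illustrative device, not something the plan defines.

```python
from graphlib import TopologicalSorter

# phase -> phases it is blocked by, copied from the schedule table
BLOCKED_BY = {
    1: set(),
    2: {1},
    3: set(),
    4: {1, 2},
    5: {4},
    6: {1},
    7: {1, 6},
}

sorter = TopologicalSorter(BLOCKED_BY)
sorter.prepare()
waves = []
while sorter.is_active():
    # Every phase whose blockers are all done can start now.
    ready = sorted(sorter.get_ready())
    waves.append(ready)
    sorter.done(*ready)

for i, wave in enumerate(waves, start=1):
    print(f"wave {i}: phases {wave}")
```

The first wave contains only the phases with no blockers, and each later wave lists the phases unlocked by completing the previous one.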

---

## Definition of Done

For each phase:
- [ ] All tasks completed
- [ ] All acceptance criteria met
- [ ] All test cases verified
- [ ] Documentation updated
- [ ] Code reviewed
- [ ] Changes committed and pushed

For overall feature:
- [ ] All 7 phases complete
- [ ] Integration testing passed
- [ ] Demo to stakeholders
- [ ] Retrospective written