anvil-dev-framework 0.1.7 → 0.1.8
- package/README.md +32 -13
- package/VERSION +1 -1
- package/docs/ANV-263-hook-logging-investigation.md +116 -0
- package/docs/command-reference.md +301 -1
- package/docs/session-workflow.md +62 -9
- package/docs/system-architecture.md +569 -0
- package/global/commands/anvil-settings.md +3 -1
- package/global/commands/audit.md +163 -0
- package/global/commands/checklist.md +180 -0
- package/global/commands/efficiency.md +356 -0
- package/global/commands/evidence.md +99 -32
- package/global/commands/insights.md +101 -3
- package/global/commands/patterns.md +115 -0
- package/global/commands/ralph.md +47 -1
- package/global/commands/token-budget.md +214 -0
- package/global/lib/__pycache__/context_optimizer.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/git_utils.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/issue_models.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/linear_provider.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/optimization_applier.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/ralph_state.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/token_analyzer.cpython-314.pyc +0 -0
- package/global/lib/__pycache__/token_metrics.cpython-314.pyc +0 -0
- package/global/lib/context_optimizer.py +323 -0
- package/global/lib/linear_provider.py +210 -16
- package/global/lib/optimization_applier.py +582 -0
- package/global/lib/ralph_state.py +264 -24
- package/global/lib/token_analyzer.py +1357 -0
- package/global/lib/token_metrics.py +873 -0
- package/global/tests/__pycache__/test_context_optimizer.cpython-314-pytest-9.0.2.pyc +0 -0
- package/global/tests/__pycache__/test_doc_coverage.cpython-314-pytest-9.0.2.pyc +0 -0
- package/global/tests/__pycache__/test_git_utils.cpython-314-pytest-9.0.2.pyc +0 -0
- package/global/tests/__pycache__/test_issue_models.cpython-314-pytest-9.0.2.pyc +0 -0
- package/global/tests/__pycache__/test_linear_filtering.cpython-314-pytest-9.0.2.pyc +0 -0
- package/global/tests/__pycache__/test_linear_provider.cpython-314-pytest-9.0.2.pyc +0 -0
- package/global/tests/__pycache__/test_local_provider.cpython-314-pytest-9.0.2.pyc +0 -0
- package/global/tests/__pycache__/test_optimization_applier.cpython-314-pytest-9.0.2.pyc +0 -0
- package/global/tests/__pycache__/test_token_analyzer.cpython-314-pytest-9.0.2.pyc +0 -0
- package/global/tests/__pycache__/test_token_analyzer_phase6.cpython-314-pytest-9.0.2.pyc +0 -0
- package/global/tests/__pycache__/test_token_metrics.cpython-314-pytest-9.0.2.pyc +0 -0
- package/global/tests/test_context_optimizer.py +321 -0
- package/global/tests/test_linear_filtering.py +319 -0
- package/global/tests/test_linear_provider.py +40 -1
- package/global/tests/test_optimization_applier.py +508 -0
- package/global/tests/test_token_analyzer.py +735 -0
- package/global/tests/test_token_analyzer_phase6.py +537 -0
- package/global/tests/test_token_metrics.py +791 -0
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
```
|
|
2
2
|
___ _ ___ _____ _
|
|
3
3
|
/ \ | \ | \ \ / /_ _| |
|
|
4
|
-
/ /_\ \ | \| |\ \ / / | || | v0.1.
|
|
4
|
+
/ /_\ \ | \| |\ \ / / | || | v0.1.8.0 (alpha)
|
|
5
5
|
/ _____ \| |\ | \ V / | || |___
|
|
6
6
|
/_/ \_\_| \_| \_/ |___|_____|
|
|
7
7
|
|
|
@@ -10,7 +10,7 @@
|
|
|
10
10
|
══════════════════════════════════════════════════════════
|
|
11
11
|
```
|
|
12
12
|
|
|
13
|
-
# Anvil Development Framework <sup>v0.1.
|
|
13
|
+
# Anvil Development Framework <sup>v0.1.8.0</sup>
|
|
14
14
|
|
|
15
15
|
> **A structured AI development system for solo builders who demand production-quality output.**
|
|
16
16
|
|
|
@@ -18,18 +18,20 @@ Anvil is a comprehensive framework for AI-assisted software development that com
|
|
|
18
18
|
|
|
19
19
|
---
|
|
20
20
|
|
|
21
|
-
## 📦 Latest Changes in v0.1.
|
|
21
|
+
## 📦 Latest Changes in v0.1.8.0
|
|
22
22
|
|
|
23
|
-
*Released: 2026-01-
|
|
23
|
+
*Released: 2026-01-16*
|
|
24
24
|
|
|
25
|
-
- **
|
|
26
|
-
-
|
|
27
|
-
-
|
|
28
|
-
-
|
|
29
|
-
|
|
30
|
-
-
|
|
31
|
-
-
|
|
32
|
-
-
|
|
25
|
+
- **Token Efficiency Audit Framework** — Complete token consumption tracking and optimization
|
|
26
|
+
- `/efficiency` command for historical analysis with weekly/monthly reports
|
|
27
|
+
- `/token-budget` command for session budget management with alerts
|
|
28
|
+
- Efficiency scoring (0-100), trend detection, and automated recommendations
|
|
29
|
+
- **CodeRabbit Deep Integration** — Automated code review workflow
|
|
30
|
+
- Enhanced `.coderabbit.yaml` with pre-merge checks and custom Anvil validations
|
|
31
|
+
- `/evidence` command integration with enforcement levels (soft/hard)
|
|
32
|
+
- **Insights Watermark System** — Prevents re-analyzing processed retrospectives
|
|
33
|
+
- Manifest tracking in `.claude/insights/.manifest.json`
|
|
34
|
+
- `--all` flag to force re-analysis when needed
|
|
33
35
|
|
|
34
36
|
See [CHANGELOG.md](CHANGELOG.md) for complete history.
|
|
35
37
|
|
|
@@ -499,10 +501,27 @@ This remains your **default approach** for all normal development work.
|
|
|
499
501
|
|
|
500
502
|
### Ralph Mode (Special Scenarios Only)
|
|
501
503
|
|
|
502
|
-
```
|
|
504
|
+
```bash
|
|
505
|
+
# Manual task description
|
|
503
506
|
/ralph start "Migrate all tests from Jest to Vitest" --max-iterations 50
|
|
507
|
+
|
|
508
|
+
# From Linear issue (recommended) - fetches subtasks automatically
|
|
509
|
+
/ralph start --issue ANV-209
|
|
510
|
+
|
|
511
|
+
# From Linear project - process all issues in a project
|
|
512
|
+
/ralph start --project "HUD Development"
|
|
504
513
|
```
|
|
505
514
|
|
|
515
|
+
**Linear Integration Flags:**
|
|
516
|
+
|
|
517
|
+
| Flag | Description |
|
|
518
|
+
|------|-------------|
|
|
519
|
+
| `--issue` | Linear issue ID to fetch subtasks from (e.g., `ANV-209`) |
|
|
520
|
+
| `--project` | Linear project name to process all issues |
|
|
521
|
+
| `--subtasks` | Filter subtasks (e.g., `ANV-1..ANV-5` or `ANV-1,ANV-3`) |
|
|
522
|
+
| `--include-done` | Include already-completed issues in project mode |
|
|
523
|
+
| `--no-sync` | Disable syncing status back to Linear |
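The two `--subtasks` forms in the table (a `..` range and a comma list) could be expanded roughly as follows. This is a sketch only: `parse_subtasks` is a hypothetical helper, not the framework's parser, and it assumes ranges are inclusive and stay within one issue prefix.

```python
import re

# Matches identifiers such as "ANV-209": a letter prefix, a hyphen, a number.
_ISSUE_ID = re.compile(r"^([A-Za-z]+)-(\d+)$")


def parse_subtasks(spec):
    """Expand a --subtasks value into a list of issue IDs (hypothetical helper)."""
    ids = []
    for part in spec.split(","):
        part = part.strip()
        if ".." in part:
            start, end = (p.strip() for p in part.split("..", 1))
            m_start, m_end = _ISSUE_ID.match(start), _ISSUE_ID.match(end)
            if not m_start or not m_end or m_start.group(1) != m_end.group(1):
                raise ValueError(f"invalid range: {part!r}")
            low, high = int(m_start.group(2)), int(m_end.group(2))
            if low > high:
                raise ValueError(f"descending range: {part!r}")
            # Assumption: both endpoints are included.
            ids.extend(f"{m_start.group(1)}-{n}" for n in range(low, high + 1))
        elif _ISSUE_ID.match(part):
            ids.append(part)
        else:
            raise ValueError(f"invalid issue ID: {part!r}")
    return ids
```

Under these assumptions, `parse_subtasks("ANV-1,ANV-3")` yields the two listed IDs and `parse_subtasks("ANV-1..ANV-5")` yields five.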
|
|
524
|
+
|
|
506
525
|
| Good For | Not Good For |
|
|
507
526
|
|----------|--------------|
|
|
508
527
|
| ✅ Large-scale refactoring with clear completion criteria | ❌ Exploratory work (figuring things out) |
|
package/VERSION
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
0.1.
|
|
1
|
+
0.1.8.0
|
|
@@ -0,0 +1,116 @@
|
|
|
1
|
+
# Investigation: Pre/Post Tool Use Hook Logging Discrepancy
|
|
2
|
+
|
|
3
|
+
**Issue**: ANV-263
|
|
4
|
+
**Date**: 2026-01-15
|
|
5
|
+
**Status**: Root cause identified
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
## Problem Statement
|
|
10
|
+
|
|
11
|
+
The healthcheck report from 2026-01-15-1721 flagged:
|
|
12
|
+
> "pre_tool_use entries (2) much lower than post_tool_use (39) suggests potential hook logging inconsistency"
|
|
13
|
+
|
|
14
|
+
## Investigation Summary
|
|
15
|
+
|
|
16
|
+
### Initial Observations
|
|
17
|
+
|
|
18
|
+
| Metric | pre_tool_use.json | post_tool_use.json |
|
|
19
|
+
|--------|-------------------|-------------------|
|
|
20
|
+
| File size | 23KB → grew to larger | 125KB → smaller |
|
|
21
|
+
| Total entries | 42 | 2 |
|
|
22
|
+
| Unique tool_use_ids | 25 | 1 |
|
|
23
|
+
| Duplicate entries | 17 (40%) | 1 (50%) |
|
|
24
|
+
|
|
25
|
+
The discrepancy reversed direction during the investigation: at the time of analysis, pre_tool_use had more entries than post_tool_use (42 vs. 2).
|
|
26
|
+
|
|
27
|
+
### Root Cause: Duplicate Hook Registration
|
|
28
|
+
|
|
29
|
+
**Both hooks are registered in TWO locations:**
|
|
30
|
+
|
|
31
|
+
1. **Global settings** (`~/.claude/settings.json`):
|
|
32
|
+
```json
|
|
33
|
+
"PreToolUse": [{
|
|
34
|
+
"hooks": [{
|
|
35
|
+
"command": "uv run .claude/hooks/pre_tool_use.py"
|
|
36
|
+
}]
|
|
37
|
+
}]
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
2. **Local settings** (`.claude/settings.local.json`):
|
|
41
|
+
```json
|
|
42
|
+
"PreToolUse": [{
|
|
43
|
+
"hooks": [{
|
|
44
|
+
"command": "uv run .claude/hooks/pre_tool_use.py --announce --track-tokens"
|
|
45
|
+
}]
|
|
46
|
+
}]
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
**Result**: Every tool invocation triggers the hook TWICE, producing duplicate log entries.
|
|
50
|
+
|
|
51
|
+
### Evidence
|
|
52
|
+
|
|
53
|
+
Duplicate tool_use_ids found (each appears exactly 2 times):
|
|
54
|
+
- `toolu_01SLbsZ2Mqn3eP...`
|
|
55
|
+
- `toolu_018xwr4o547Byp...`
|
|
56
|
+
- `toolu_01SrAGnm2mQ7vp...`
|
|
57
|
+
- `toolu_018SeysS4mLJih...`
|
|
58
|
+
- `toolu_01KAMaFWvecFKQ...`
|
|
59
|
+
|
|
60
|
+
The consistent 2x duplication pattern confirms the dual registration cause.
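The duplicate check behind this evidence can be reproduced with a short script. This is a minimal sketch, assuming each log file holds a JSON array of entries that carry a `tool_use_id` key; `find_duplicate_tool_use_ids` and the sample IDs are hypothetical.

```python
from collections import Counter


def find_duplicate_tool_use_ids(entries):
    """Return {tool_use_id: count} for every ID that appears more than once."""
    counts = Counter(
        entry["tool_use_id"] for entry in entries if entry.get("tool_use_id")
    )
    return {tool_use_id: n for tool_use_id, n in counts.items() if n > 1}


# Hypothetical sample: one tool call logged twice, one logged once.
sample_entries = [
    {"tool_use_id": "toolu_example_a", "tool_name": "Read"},
    {"tool_use_id": "toolu_example_a", "tool_name": "Read"},
    {"tool_use_id": "toolu_example_b", "tool_name": "Bash"},
]
duplicates = find_duplicate_tool_use_ids(sample_entries)  # {"toolu_example_a": 2}
```

To run it against a real log, load the file with `json.load` and pass the resulting list. If every duplicated ID has a count of exactly 2, that matches the dual-registration explanation.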
|
|
61
|
+
|
|
62
|
+
### Why Counts Varied
|
|
63
|
+
|
|
64
|
+
The original healthcheck showed pre_tool_use with fewer entries because:
|
|
65
|
+
1. Logs may have been rotated or cleared at different times
|
|
66
|
+
2. Different sessions accumulated different counts
|
|
67
|
+
3. The healthcheck snapshot was taken at a specific moment
|
|
68
|
+
|
|
69
|
+
## Recommendations
|
|
70
|
+
|
|
71
|
+
### Option A: Remove Global Registration (Recommended)
|
|
72
|
+
Remove the hook from `~/.claude/settings.json` since local settings are project-specific and include the correct flags.
|
|
73
|
+
|
|
74
|
+
```bash
|
|
75
|
+
# Edit ~/.claude/settings.json and remove the PreToolUse and PostToolUse entries
|
|
76
|
+
```
|
|
77
|
+
|
|
78
|
+
### Option B: Add Deduplication Logic
|
|
79
|
+
Add tool_use_id deduplication in the hook itself:
|
|
80
|
+
|
|
81
|
+
```python
|
|
82
|
+
# Before appending, check if tool_use_id already exists
|
|
83
|
+
if not any(e.get('tool_use_id') == input_data.get('tool_use_id') for e in log_data):
|
|
84
|
+
    log_data.append(input_data)
|
|
85
|
+
```
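Option B can be wrapped in a small helper. This is a sketch that assumes `log_data` is the list parsed from the log file and `input_data` is the hook payload, as the fragment implies; `append_unique` is a hypothetical name. Unlike the fragment, it always appends entries that have no `tool_use_id`, so one ID-less entry does not suppress the next.

```python
def append_unique(log_data, input_data):
    """Append input_data unless an entry with the same tool_use_id exists.

    Returns True if the entry was appended, False if it was a duplicate.
    """
    tool_use_id = input_data.get("tool_use_id")
    # Assumption: entries without an ID cannot be deduplicated, so keep them.
    if tool_use_id is not None and any(
        entry.get("tool_use_id") == tool_use_id for entry in log_data
    ):
        return False
    log_data.append(input_data)
    return True
```

If both registered hooks run at the same time, each may read the log before the other writes. In that case this check alone is not enough, which is the situation Option C addresses.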
|
|
86
|
+
|
|
87
|
+
### Option C: Use File Locking
|
|
88
|
+
If concurrent execution is needed, use proper file locking:
|
|
89
|
+
|
|
90
|
+
```python
|
|
91
|
+
import fcntl
|
|
92
|
+
with open(log_path, 'r+') as f:
|
|
93
|
+
    fcntl.flock(f.fileno(), fcntl.LOCK_EX)
|
|
94
|
+
    # ... read, modify, write ...
|
|
95
|
+
    fcntl.flock(f.fileno(), fcntl.LOCK_UN)
|
|
96
|
+
```
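A fuller version of the locked read-modify-write might look like this. It is a sketch, not the hook's actual code: it assumes the log is a single JSON array, `append_entry_locked` is a hypothetical name, and `fcntl` is available only on Unix-like systems.

```python
import fcntl
import json
import os


def append_entry_locked(log_path, entry):
    """Append entry to a JSON-array log file while holding an exclusive lock."""
    # O_CREAT lets the first writer create the file; "r+" allows read and rewrite.
    fd = os.open(log_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
        try:
            raw = f.read()
            log_data = json.loads(raw) if raw.strip() else []
            log_data.append(entry)
            # Rewrite the whole file from the start.
            f.seek(0)
            f.truncate()
            json.dump(log_data, f)
            f.flush()  # Make sure the data is written before the lock is released.
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)
```

The `try`/`finally` releases the lock even if the existing file contains invalid JSON and parsing raises.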
|
|
97
|
+
|
|
98
|
+
## Related Issues
|
|
99
|
+
|
|
100
|
+
- ANV-264: Log Rotation (Phase 3b) - will need the log files to be consistent first
|
|
101
|
+
- ANV-260: Parent issue for Insights Recommended Actions
|
|
102
|
+
|
|
103
|
+
## Files Examined
|
|
104
|
+
|
|
105
|
+
| File | Purpose |
|
|
106
|
+
|------|---------|
|
|
107
|
+
| `.claude/hooks/pre_tool_use.py` | Pre-tool hook implementation (lines 378-398 for logging) |
|
|
108
|
+
| `.claude/hooks/post_tool_use.py` | Post-tool hook implementation (lines 171-189 for logging) |
|
|
109
|
+
| `.claude/settings.local.json` | Project-specific hook registration |
|
|
110
|
+
| `~/.claude/settings.json` | Global hook registration (duplicate) |
|
|
111
|
+
| `logs/pre_tool_use.json` | Pre-tool log file |
|
|
112
|
+
| `logs/post_tool_use.json` | Post-tool log file |
|
|
113
|
+
|
|
114
|
+
---
|
|
115
|
+
|
|
116
|
+
**Next Steps**: Fix the duplicate registration, then proceed to ANV-264 (log rotation).
|
|
@@ -28,6 +28,9 @@
|
|
|
28
28
|
| `/release` | Consolidate [Unreleased] into versioned release |
|
|
29
29
|
| `/healthcheck` | Framework diagnostics and session health |
|
|
30
30
|
| `/retro` | Write retrospective to capture learnings |
|
|
31
|
+
| `/audit` | Real-time token consumption analysis |
|
|
32
|
+
| `/efficiency` | Historical token efficiency reports |
|
|
33
|
+
| `/token-budget` | Session token budget management |
|
|
31
34
|
| `/shard` | Break large specs into atomic pieces |
|
|
32
35
|
| `/decay-review` | Archive old issues and clean handoffs |
|
|
33
36
|
| `/weekly-review` | Weekly analytics and improvement recommendations |
|
|
@@ -69,6 +72,10 @@
|
|
|
69
72
|
- [/evidence](#evidence)
|
|
70
73
|
- [/healthcheck](#healthcheck)
|
|
71
74
|
- [/retro](#retro)
|
|
75
|
+
- [Token Efficiency Commands](#token-efficiency-commands)
|
|
76
|
+
- [/audit](#audit)
|
|
77
|
+
- [/efficiency](#efficiency)
|
|
78
|
+
- [/token-budget](#token-budget)
|
|
72
79
|
- [Multi-Agent Commands](#multi-agent-commands)
|
|
73
80
|
- [/hud](#hud)
|
|
74
81
|
- [Maintenance Commands](#maintenance-commands)
|
|
@@ -1358,6 +1365,250 @@ Make a requirements checklist BEFORE copying patterns.
|
|
|
1358
1365
|
|
|
1359
1366
|
---
|
|
1360
1367
|
|
|
1368
|
+
## Token Efficiency Commands
|
|
1369
|
+
|
|
1370
|
+
### /audit
|
|
1371
|
+
|
|
1372
|
+
**Purpose**: Real-time token consumption analysis for the current session.
|
|
1373
|
+
|
|
1374
|
+
**When to Use**: During a session to understand token usage, detect waste patterns, and get optimization recommendations.
|
|
1375
|
+
|
|
1376
|
+
**What It Does**:
|
|
1377
|
+
1. Analyzes current session's token consumption
|
|
1378
|
+
2. Breaks down usage by component type (commands, hooks, tools)
|
|
1379
|
+
3. Identifies waste patterns (redundant loads, unused components)
|
|
1380
|
+
4. Calculates efficiency score (0-100)
|
|
1381
|
+
5. Generates actionable recommendations
|
|
1382
|
+
|
|
1383
|
+
**Output Format**:
|
|
1384
|
+
|
|
1385
|
+
```markdown
|
|
1386
|
+
## Token Audit Report
|
|
1387
|
+
|
|
1388
|
+
**Session**: `abc12345...`
|
|
1389
|
+
**Analyzed**: 2026-01-16 14:30
|
|
1390
|
+
**Efficiency Score**: 78/100
|
|
1391
|
+
|
|
1392
|
+
### Context Usage
|
|
1393
|
+
|
|
1394
|
+
- **Total Tokens**: 45,000
|
|
1395
|
+
- **Context Used**: 30% of effective limit
|
|
1396
|
+
- **Peak Tokens**: 52,000
|
|
1397
|
+
|
|
1398
|
+
### Breakdown by Type
|
|
1399
|
+
|
|
1400
|
+
| Type | Tokens | % of Total | Count |
|
|
1401
|
+
|------|--------|------------|-------|
|
|
1402
|
+
| command | 15,000 | 33.3% | 8 |
|
|
1403
|
+
| system | 12,000 | 26.7% | 3 |
|
|
1404
|
+
| tools | 18,000 | 40.0% | 45 |
|
|
1405
|
+
|
|
1406
|
+
### Detected Waste Patterns
|
|
1407
|
+
|
|
1408
|
+
- 🔄 **orient**: Loaded 3 times, costing 1,200 extra tokens
|
|
1409
|
+
- ⚠️ **patterns**: Loaded but never used, wasting 800 tokens
|
|
1410
|
+
```
|
|
1411
|
+
|
|
1412
|
+
**Related Commands**:
|
|
1413
|
+
- `/efficiency` — Historical analysis over days/weeks
|
|
1414
|
+
- `/token-budget` — Set and track session token budgets
|
|
1415
|
+
|
|
1416
|
+
---
|
|
1417
|
+
|
|
1418
|
+
### /efficiency
|
|
1419
|
+
|
|
1420
|
+
**Purpose**: Historical efficiency analysis over time periods (weekly/monthly).
|
|
1421
|
+
|
|
1422
|
+
**When to Use**: Weekly reviews, tracking optimization impact, identifying consistently low-efficiency components.
|
|
1423
|
+
|
|
1424
|
+
**Variants**:
|
|
1425
|
+
|
|
1426
|
+
| Command | Description |
|
|
1427
|
+
|---------|-------------|
|
|
1428
|
+
| `/efficiency` | Weekly report (default, last 7 days) |
|
|
1429
|
+
| `/efficiency --weekly` | Explicit weekly report |
|
|
1430
|
+
| `/efficiency --monthly` | Monthly report (last 30 days) |
|
|
1431
|
+
| `/efficiency --recommendations` | Show only recommendations |
|
|
1432
|
+
|
|
1433
|
+
**What It Does**:
|
|
1434
|
+
1. Analyzes component usage across multiple sessions
|
|
1435
|
+
2. Calculates efficiency scores per component
|
|
1436
|
+
3. Detects trends (improving/stable/degrading)
|
|
1437
|
+
4. Compares to previous period
|
|
1438
|
+
5. Generates optimization recommendations
|
|
1439
|
+
|
|
1440
|
+
**Efficiency Score Calculation**:
|
|
1441
|
+
|
|
1442
|
+
| Factor | Points | Criteria |
|
|
1443
|
+
|--------|--------|----------|
|
|
1444
|
+
| Utilization | 0-50 | % of loads where component was used |
|
|
1445
|
+
| Token Cost | 0-30 | Lower avg tokens = higher score |
|
|
1446
|
+
| Consistency | 0-20 | Frequent use with high utilization |
|
|
1447
|
+
|
|
1448
|
+
**Score Interpretation**:
|
|
1449
|
+
|
|
1450
|
+
| Score Range | Interpretation |
|
|
1451
|
+
|-------------|----------------|
|
|
1452
|
+
| 90-100 | Excellent—keep as is |
|
|
1453
|
+
| 70-89 | Good—minor optimization possible |
|
|
1454
|
+
| 50-69 | Fair—consider optimization |
|
|
1455
|
+
| <50 | Poor—candidate for removal/deferral |
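The two tables above can be read as a sum of capped factor points followed by a band lookup. The sketch below covers only that part. The function names are hypothetical, and it does not show how each factor's points are derived from usage data, because the tables do not say.

```python
# Maximum points per factor, taken from the Efficiency Score Calculation table.
FACTOR_CAPS = {"utilization": 50, "token_cost": 30, "consistency": 20}


def total_score(utilization, token_cost, consistency):
    """Sum the three factor scores, clamping each to its documented range."""
    points = {
        "utilization": utilization,
        "token_cost": token_cost,
        "consistency": consistency,
    }
    return sum(min(max(points[name], 0), cap) for name, cap in FACTOR_CAPS.items())


def interpret_score(score):
    """Map a 0-100 score to the band in the Score Interpretation table."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Fair"
    return "Poor"
```

With these bands, the `patterns` row in the sample report (score 35) falls under "Poor", and `orient` (score 85) under "Good".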
|
|
1456
|
+
|
|
1457
|
+
**Trend Indicators**:
|
|
1458
|
+
|
|
1459
|
+
| Icon | Meaning |
|
|
1460
|
+
|------|---------|
|
|
1461
|
+
| ↑ | Improving (utilization increasing) |
|
|
1462
|
+
| → | Stable (no significant change) |
|
|
1463
|
+
| ↓ | Degrading (utilization decreasing) |
|
|
1464
|
+
| ★ | New (no previous data) |
|
|
1465
|
+
|
|
1466
|
+
**Output Format**:
|
|
1467
|
+
|
|
1468
|
+
```markdown
|
|
1469
|
+
## Weekly Efficiency Report
|
|
1470
|
+
|
|
1471
|
+
**Period**: Last 7 days
|
|
1472
|
+
**Generated**: 2026-01-16 14:30
|
|
1473
|
+
**Overall Efficiency**: 72/100
|
|
1474
|
+
|
|
1475
|
+
### Summary
|
|
1476
|
+
|
|
1477
|
+
- **Sessions Analyzed**: 42
|
|
1478
|
+
- **Total Tokens**: 1,250,000
|
|
1479
|
+
- **Avg per Session**: 29,762
|
|
1480
|
+
|
|
1481
|
+
### Component Efficiency Scores
|
|
1482
|
+
|
|
1483
|
+
| Component | Type | Score | Utilization | Trend |
|
|
1484
|
+
|-----------|------|-------|-------------|-------|
|
|
1485
|
+
| patterns | command | 35 | 15% | ↓ |
|
|
1486
|
+
| checklist | command | 42 | 22% | → |
|
|
1487
|
+
| orient | command | 85 | 92% | ↑ |
|
|
1488
|
+
| CLAUDE.md | system | 78 | 100% | → |
|
|
1489
|
+
|
|
1490
|
+
### Top Recommendations
|
|
1491
|
+
|
|
1492
|
+
- 🔴 **Defer loading patterns**: Used only 15% of the time
|
|
1493
|
+
- Potential savings: ~1,020 tokens
|
|
1494
|
+
- 🟡 **Optimize large-context**: Averaging 3,500 tokens
|
|
1495
|
+
- Potential savings: ~1,050 tokens
|
|
1496
|
+
```
|
|
1497
|
+
|
|
1498
|
+
**Recommendations Workflow**:
|
|
1499
|
+
|
|
1500
|
+
1. Review low-score components (score < 50)
|
|
1501
|
+
2. Check trends for degrading patterns
|
|
1502
|
+
3. Apply recommendations:
|
|
1503
|
+
- **defer**: Move to on-demand command
|
|
1504
|
+
- **optimize**: Reduce component size
|
|
1505
|
+
- **review**: Consider removal
|
|
1506
|
+
4. Track impact in next week's report
|
|
1507
|
+
|
|
1508
|
+
**Related Commands**:
|
|
1509
|
+
- `/audit` — Real-time session analysis
|
|
1510
|
+
- `/token-budget` — Proactive budget management
|
|
1511
|
+
|
|
1512
|
+
---
|
|
1513
|
+
|
|
1514
|
+
### /token-budget
|
|
1515
|
+
|
|
1516
|
+
**Purpose**: Proactive session token budget management with intelligent alerts.
|
|
1517
|
+
|
|
1518
|
+
**When to Use**: Before starting work to set a budget, during long sessions to monitor usage, when approaching context limits.
|
|
1519
|
+
|
|
1520
|
+
**Variants**:
|
|
1521
|
+
|
|
1522
|
+
| Command | Description |
|
|
1523
|
+
|---------|-------------|
|
|
1524
|
+
| `/token-budget` | Show current budget status (default) |
|
|
1525
|
+
| `/token-budget status` | Same as `/token-budget` |
|
|
1526
|
+
| `/token-budget set <tokens>` | Set budget (e.g., `set 100000`) |
|
|
1527
|
+
| `/token-budget alert <level> <percent>` | Set custom threshold |
|
|
1528
|
+
| `/token-budget clear` | Remove budget constraint |
|
|
1529
|
+
|
|
1530
|
+
**What It Does**:
|
|
1531
|
+
1. Sets a token budget for the current session
|
|
1532
|
+
2. Tracks usage against budget
|
|
1533
|
+
3. Estimates remaining turns based on consumption patterns
|
|
1534
|
+
4. Alerts when configurable thresholds are crossed
|
|
1535
|
+
5. Integrates with hooks for automatic alerts
|
|
1536
|
+
|
|
1537
|
+
**Default Alert Thresholds**:
|
|
1538
|
+
|
|
1539
|
+
| Level | Threshold | Trigger |
|
|
1540
|
+
|-------|-----------|---------|
|
|
1541
|
+
| Info | 60% | Informational notice |
|
|
1542
|
+
| Warning | 80% | Recommend `/handoff` soon |
|
|
1543
|
+
| Critical | 90% | Urgent action required |
|
|
1544
|
+
|
|
1545
|
+
**Setting Custom Thresholds**:
|
|
1546
|
+
|
|
1547
|
+
```bash
|
|
1548
|
+
/token-budget alert warning 75
|
|
1549
|
+
```
|
|
1550
|
+
|
|
1551
|
+
This sets the warning threshold to 75% instead of the default 80%.
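The threshold check itself is a simple comparison. This is a sketch using the default levels from the table: `alert_level` is a hypothetical function, and it assumes an alert fires once usage reaches the threshold, which matches the 60,000/100,000 notice in the sample messages.

```python
DEFAULT_THRESHOLDS = {"info": 60, "warning": 80, "critical": 90}


def alert_level(used, budget, thresholds=None):
    """Return the highest alert level reached, or None if usage is under all of them."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    thresholds = DEFAULT_THRESHOLDS if thresholds is None else thresholds
    percent_used = used / budget * 100
    level = None
    # Walk thresholds from lowest to highest, keeping the last one reached.
    for name, threshold in sorted(thresholds.items(), key=lambda item: item[1]):
        if percent_used >= threshold:
            level = name
    return level


# Equivalent of `/token-budget alert warning 75` under this sketch.
custom_thresholds = {**DEFAULT_THRESHOLDS, "warning": 75}
```

With the defaults, 76,000 of 100,000 tokens is still at the info level. With `custom_thresholds`, the same usage reaches warning.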
|
|
1552
|
+
|
|
1553
|
+
**Output Format** (when budget is set):
|
|
1554
|
+
|
|
1555
|
+
```markdown
|
|
1556
|
+
## Token Budget Status
|
|
1557
|
+
|
|
1558
|
+
**Session**: `abc12345...`
|
|
1559
|
+
**Checked**: 2026-01-16 10:30
|
|
1560
|
+
|
|
1561
|
+
### Budget Overview
|
|
1562
|
+
|
|
1563
|
+
| Metric | Value |
|
|
1564
|
+
|--------|-------|
|
|
1565
|
+
| Budget | 100,000 tokens |
|
|
1566
|
+
| Used | 45,230 tokens |
|
|
1567
|
+
| Remaining | 54,770 tokens |
|
|
1568
|
+
| Used | 45.2% |
|
|
1569
|
+
|
|
1570
|
+
### Remaining Capacity
|
|
1571
|
+
|
|
1572
|
+
- **Estimated Turns**: ~109 turns remaining
|
|
1573
|
+
- **Alert Status**: ✅ Normal (under 60%)
|
|
1574
|
+
|
|
1575
|
+
### Alert Thresholds
|
|
1576
|
+
|
|
1577
|
+
| Level | Threshold | Status |
|
|
1578
|
+
|-------|-----------|--------|
|
|
1579
|
+
| Info | 60% | Not triggered |
|
|
1580
|
+
| Warning | 80% | Not triggered |
|
|
1581
|
+
| Critical | 90% | Not triggered |
|
|
1582
|
+
```
|
|
1583
|
+
|
|
1584
|
+
**Alert Messages** (when thresholds crossed):
|
|
1585
|
+
|
|
1586
|
+
```
|
|
1587
|
+
ℹ️ **Budget Notice**: 60% of token budget used (60,000/100,000).
|
|
1588
|
+
~80 turns remaining. Consider planning session wrap-up.
|
|
1589
|
+
|
|
1590
|
+
⚠️ **Budget Warning**: 80% of token budget used (80,000/100,000).
|
|
1591
|
+
~40 turns remaining. Recommend running `/handoff` soon.
|
|
1592
|
+
|
|
1593
|
+
🔴 **Budget Critical**: 90% of token budget used (90,000/100,000).
|
|
1594
|
+
~20 turns remaining. Run `/handoff` or `/clear` immediately.
|
|
1595
|
+
```
|
|
1596
|
+
|
|
1597
|
+
**Recommended Budget Sizes**:
|
|
1598
|
+
|
|
1599
|
+
| Session Type | Budget | Approx Turns |
|
|
1600
|
+
|--------------|--------|--------------|
|
|
1601
|
+
| Light session | 50,000 | ~100 |
|
|
1602
|
+
| Standard session | 100,000 | ~200 |
|
|
1603
|
+
| Heavy session | 150,000 | ~300 |
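The turn estimates in this section all work out to about 500 tokens per turn (for example, 50,000 tokens for ~100 turns). That figure is inferred from the examples, not a documented constant, and the real command estimates from observed consumption. A sketch with that assumption made explicit:

```python
def estimate_turns(remaining_tokens, avg_tokens_per_turn=500):
    """Estimate how many turns fit in the remaining budget.

    avg_tokens_per_turn defaults to 500, a value inferred from the sample
    figures in this section rather than taken from the implementation.
    """
    if avg_tokens_per_turn <= 0:
        raise ValueError("avg_tokens_per_turn must be positive")
    return max(remaining_tokens, 0) // avg_tokens_per_turn
```

For the sample status report, `estimate_turns(54_770)` gives 109, matching the "~109 turns remaining" line.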
|
|
1604
|
+
|
|
1605
|
+
**Related Commands**:
|
|
1606
|
+
- `/audit` — Real-time session analysis
|
|
1607
|
+
- `/efficiency` — Historical patterns
|
|
1608
|
+
- `/handoff` — Session continuity (recommended at 80%)
|
|
1609
|
+
|
|
1610
|
+
---
|
|
1611
|
+
|
|
1361
1612
|
## Multi-Agent Commands
|
|
1362
1613
|
|
|
1363
1614
|
### /hud
|
|
@@ -1823,12 +2074,21 @@ This reminder appears in the hook output, prompting you to run the configured co
|
|
|
1823
2074
|
|
|
1824
2075
|
**Usage**:
|
|
1825
2076
|
```bash
|
|
1826
|
-
# Start autonomous execution
|
|
2077
|
+
# Start autonomous execution (manual task description)
|
|
1827
2078
|
/ralph start "Migrate all tests from Jest to Vitest"
|
|
1828
2079
|
|
|
1829
2080
|
# With options
|
|
1830
2081
|
/ralph start "Add OAuth authentication" --max-iterations 30
|
|
1831
2082
|
|
|
2083
|
+
# From Linear issue (recommended) - fetches subtasks automatically
|
|
2084
|
+
/ralph start --issue ANV-209
|
|
2085
|
+
|
|
2086
|
+
# From Linear with subtask filter
|
|
2087
|
+
/ralph start --issue ANV-209 --subtasks ANV-210..ANV-213
|
|
2088
|
+
|
|
2089
|
+
# From Linear project - process all issues in a project
|
|
2090
|
+
/ralph start --project "HUD Development"
|
|
2091
|
+
|
|
1832
2092
|
# Check progress
|
|
1833
2093
|
/ralph status
|
|
1834
2094
|
|
|
@@ -1850,6 +2110,11 @@ This reminder appears in the hook output, prompting you to run the configured co
|
|
|
1850
2110
|
|------|---------|-------------|
|
|
1851
2111
|
| `--max-iterations` | 50 | Maximum iterations before stopping |
|
|
1852
2112
|
| `--completion-promise` | COMPLETE | Text that signals completion |
|
|
2113
|
+
| `--issue` | — | Linear issue ID to fetch subtasks from (e.g., `ANV-209`) |
|
|
2114
|
+
| `--project` | — | Linear project name to process all issues |
|
|
2115
|
+
| `--subtasks` | — | Filter subtasks (e.g., `ANV-1..ANV-5` or `ANV-1,ANV-3`) |
|
|
2116
|
+
| `--include-done` | false | Include already-completed issues in project mode |
|
|
2117
|
+
| `--no-sync` | false | Disable syncing status back to Linear |
|
|
1853
2118
|
|
|
1854
2119
|
**What It Does**:
|
|
1855
2120
|
|
|
@@ -1906,6 +2171,41 @@ Ralph includes automatic safety stops:
|
|
|
1906
2171
|
- Iteration 3: Implemented OAuth redirect
|
|
1907
2172
|
```
|
|
1908
2173
|
|
|
2174
|
+
**With Linear integration**, additional fields are shown:
|
|
2175
|
+
|
|
2176
|
+
```markdown
|
|
2177
|
+
## Ralph Wiggum Status
|
|
2178
|
+
|
|
2179
|
+
| Metric | Value |
|
|
2180
|
+
|--------|-------|
|
|
2181
|
+
| Status | Running |
|
|
2182
|
+
| Iteration | 5 of 50 |
|
|
2183
|
+
| Items Complete | 3 of 8 |
|
|
2184
|
+
| Progress | 38% |
|
|
2185
|
+
|
|
2186
|
+
### Linear Integration
|
|
2187
|
+
| Field | Value |
|
|
2188
|
+
|-------|-------|
|
|
2189
|
+
| Parent Issue | [ANV-209](https://linear.app/your-org/issue/ANV-209) |
|
|
2190
|
+
| Subtasks | 3 done, 0 skipped, 5 remaining |
|
|
2191
|
+
| Last Sync | 2026-01-07T10:45:00Z |
|
|
2192
|
+
| Sync Status | Enabled |
|
|
2193
|
+
|
|
2194
|
+
### Current Subtask
|
|
2195
|
+
[ANV-212] Phase 3: Command Interface Update
|
|
2196
|
+
```
|
|
2197
|
+
|
|
2198
|
+
**Error Handling** (Linear integration):
|
|
2199
|
+
|
|
2200
|
+
Ralph includes robust error handling for Linear operations:
|
|
2201
|
+
|
|
2202
|
+
| Error Type | Behavior |
|
|
2203
|
+
|------------|----------|
|
|
2204
|
+
| Rate limit (429) | Retry with exponential backoff (1s, 2s, 4s) |
|
|
2205
|
+
| Timeout | Retry up to 3 times |
|
|
2206
|
+
| Missing issue | Skip gracefully, continue session |
|
|
2207
|
+
| Sync failure | Log warning, don't block progress |
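The rate-limit row describes a standard retry loop. The sketch below is illustrative only: `RateLimitError` and `call_with_backoff` are hypothetical names, not part of `linear_provider.py`. It also assumes one final attempt after the last delay, which the table does not state.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error raised when the API answers with HTTP 429."""


def call_with_backoff(operation, delays=(1, 2, 4), sleep=time.sleep):
    """Call operation(), waiting 1s, 2s, then 4s between rate-limited attempts."""
    for delay in delays:
        try:
            return operation()
        except RateLimitError:
            sleep(delay)
    # Final attempt after the last delay; a failure here propagates to the caller.
    return operation()
```

Passing `sleep` as a parameter lets a test run without real waits. A caller that follows the "Sync failure" row would catch the final exception, log a warning, and continue.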
|
|
2208
|
+
|
|
1909
2209
|
**Environment Variables**:
|
|
1910
2210
|
|
|
1911
2211
|
| Variable | Default | Description |
|
package/docs/session-workflow.md
CHANGED
|
@@ -22,6 +22,7 @@
|
|
|
22
22
|
- [Session Commands](#session-commands)
|
|
23
23
|
- [Workflow Commands](#workflow-commands)
|
|
24
24
|
- [Quality Commands](#quality-commands)
|
|
25
|
+
- [Token Management Commands](#token-management-commands)
|
|
25
26
|
- [Multi-Agent Commands](#multi-agent-commands)
|
|
26
27
|
- [Maintenance Commands](#maintenance-commands)
|
|
27
28
|
- [Power Mode Commands (Ralph Wiggum)](#power-mode-commands-ralph-wiggum)
|
|
@@ -347,21 +348,29 @@ Monitor all agents' context, cost, and work in real-time.
|
|
|
347
348
|
|
|
348
349
|
> **Important**: Ralph is a specialized power tool for specific scenarios, NOT a replacement for the standard workflow. See [When to Use Ralph](#ralph-wiggum-mode-special-scenarios) below.
|
|
349
350
|
|
|
350
|
-
```
|
|
351
|
+
```bash
|
|
352
|
+
# Manual task description
|
|
351
353
|
/ralph start "Migrate all tests from Jest to Vitest" --max-iterations 50
|
|
354
|
+
|
|
355
|
+
# From Linear issue (recommended) - fetches subtasks automatically
|
|
356
|
+
/ralph start --issue ANV-209
|
|
357
|
+
|
|
358
|
+
# From Linear project - process all issues
|
|
359
|
+
/ralph start --project "HUD Development"
|
|
352
360
|
```
|
|
353
361
|
|
|
354
362
|
Claude will:
|
|
355
|
-
1.
|
|
356
|
-
2.
|
|
357
|
-
3.
|
|
358
|
-
4.
|
|
359
|
-
5.
|
|
360
|
-
6.
|
|
363
|
+
1. Fetch subtasks from Linear (if using `--issue` or `--project`)
|
|
364
|
+
2. Break task into atomic TODO items
|
|
365
|
+
3. Create PROMPT.md, fix_plan.md, progress.txt
|
|
366
|
+
4. Start autonomous loop
|
|
367
|
+
5. Execute ONE item per iteration
|
|
368
|
+
6. Sync status back to Linear (unless `--no-sync`)
|
|
369
|
+
7. Stop when complete or circuit breaker triggers
|
|
361
370
|
|
|
362
371
|
Monitor progress:
|
|
363
372
|
```
|
|
364
|
-
/ralph status → See current iteration and progress
|
|
373
|
+
/ralph status → See current iteration, Linear sync status, and progress
|
|
365
374
|
```
|
|
366
375
|
|
|
367
376
|
Stop early:
|
|
@@ -436,6 +445,14 @@ Stop early:
|
|
|
436
445
|
| `/healthcheck` | Session end | Analyzes session for issues |
|
|
437
446
|
| `/retro` | After completion | Captures learnings |
|
|
438
447
|
|
|
448
|
+
### Token Management Commands
|
|
449
|
+
|
|
450
|
+
| Command | When | What It Does |
|
|
451
|
+
|---------|------|--------------|
|
|
452
|
+
| `/audit` | During session | Real-time token consumption analysis |
|
|
453
|
+
| `/efficiency` | Weekly reviews | Historical efficiency patterns and trends |
|
|
454
|
+
| `/token-budget` | Session start | Set and track session token budgets |
|
|
455
|
+
|
|
439
456
|
### Multi-Agent Commands
|
|
440
457
|
|
|
441
458
|
| Command | When | What It Does |
|
|
@@ -456,9 +473,20 @@ Stop early:
|
|
|
456
473
|
| Command | When | What It Does |
|
|
457
474
|
|---------|------|--------------|
|
|
458
475
|
| `/ralph start` | Large refactoring, overnight runs | Initializes autonomous execution loop |
|
|
459
|
-
| `/ralph
|
|
476
|
+
| `/ralph start --issue` | Linear-driven tasks | Fetches subtasks from Linear issue |
|
|
477
|
+
| `/ralph start --project` | Project-wide work | Processes all issues in Linear project |
|
|
478
|
+
| `/ralph status` | During autonomous execution | Shows iteration count, Linear sync, progress |
|
|
460
479
|
| `/ralph stop` | Early termination | Gracefully stops with summary |
|
|
461
480
|
|
|
481
|
+
**Linear Integration Flags:**
|
|
482
|
+
|
|
483
|
+
| Flag | Description |
|
|
484
|
+
|------|-------------|
|
|
485
|
+
| `--issue` | Linear issue ID to fetch subtasks from |
|
|
486
|
+
| `--project` | Linear project name to process all issues |
|
|
487
|
+
| `--subtasks` | Filter subtasks (e.g., `ANV-1..ANV-5`) |
|
|
488
|
+
| `--no-sync` | Disable syncing status back to Linear |
|
|
489
|
+
|
|
462
490
|
> **Note**: Ralph is a specialized tool for specific scenarios. See [Ralph Wiggum Mode](#ralph-wiggum-mode-special-scenarios).
|
|
463
491
|
|
|
464
492
|
### Setup Commands
|
|
@@ -537,6 +565,30 @@ Ralph Wiggum is a **specialized power tool** for autonomous, long-running AI exe
|
|
|
537
565
|
| ✅ Test coverage expansion | ❌ Quick fixes (overkill) |
|
|
538
566
|
| ✅ Overnight/unattended execution | ❌ Interactive debugging |
|
|
539
567
|
|
|
568
|
+
### Linear Integration
|
|
569
|
+
|
|
570
|
+
Ralph integrates with Linear for task-driven autonomous execution:
|
|
571
|
+
|
|
572
|
+
```bash
|
|
573
|
+
# From Linear issue - fetches subtasks automatically
|
|
574
|
+
/ralph start --issue ANV-209
|
|
575
|
+
|
|
576
|
+
# Filter to specific subtasks
|
|
577
|
+
/ralph start --issue ANV-209 --subtasks ANV-210..ANV-213
|
|
578
|
+
|
|
579
|
+
# From Linear project - process all issues
|
|
580
|
+
/ralph start --project "HUD Development"
|
|
581
|
+
|
|
582
|
+
# Without syncing back to Linear
|
|
583
|
+
/ralph start --issue ANV-209 --no-sync
|
|
584
|
+
```
|
|
585
|
+
|
|
586
|
+
When using Linear integration:
|
|
587
|
+
- Subtasks are fetched automatically as TODO items
|
|
588
|
+
- Status is synced back to Linear as work progresses
|
|
589
|
+
- Progress includes Linear-specific metrics (subtask counts, sync status)
|
|
590
|
+
- Error handling includes retry logic for rate limits
|
|
591
|
+
|
|
540
592
|
### Cost Awareness
|
|
541
593
|
|
|
542
594
|
| Scenario | Estimated Cost |
|
|
@@ -567,6 +619,7 @@ Ralph includes automatic circuit breakers:
|
|
|
567
619
|
- **Max iterations**: Configurable limit (default: 50)
|
|
568
620
|
- **Fatal signal**: Immediate stop on `<fatal>...</fatal>` output
|
|
569
621
|
- **Git checkpoints**: Auto-commits before each restart
|
|
622
|
+
- **Linear sync errors**: Retry with backoff, skip gracefully if persistent
|
|
570
623
|
|
|
571
624
|
---
|
|
572
625
|
|