npm - rafcode - Versions diffs - 2.2.0 → 2.3.0 - Mend

rafcode 2.2.0 → 2.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (49)

package/RAF/ahtahs-token-reaper/decisions.md +37 -0
package/RAF/ahtahs-token-reaper/input.md +20 -0
package/RAF/ahtahs-token-reaper/outcomes/01-extend-token-tracker-data-model.md +42 -0
package/RAF/ahtahs-token-reaper/outcomes/02-accumulate-usage-in-retry-loop.md +31 -0
package/RAF/ahtahs-token-reaper/outcomes/03-per-attempt-display-formatting.md +60 -0
package/RAF/ahtahs-token-reaper/outcomes/04-add-model-name-to-claude-call-logs.md +57 -0
package/RAF/ahtahs-token-reaper/outcomes/05-handle-invalid-config-in-raf-config.md +46 -0
package/RAF/ahtahs-token-reaper/outcomes/06-fix-verbose-toggle-timer-display.md +38 -0
package/RAF/ahtahs-token-reaper/plans/01-extend-token-tracker-data-model.md +36 -0
package/RAF/ahtahs-token-reaper/plans/02-accumulate-usage-in-retry-loop.md +36 -0
package/RAF/ahtahs-token-reaper/plans/03-per-attempt-display-formatting.md +43 -0
package/RAF/ahtahs-token-reaper/plans/04-add-model-name-to-claude-call-logs.md +38 -0
package/RAF/ahtahs-token-reaper/plans/05-handle-invalid-config-in-raf-config.md +36 -0
package/RAF/ahtahs-token-reaper/plans/06-fix-verbose-toggle-timer-display.md +40 -0
package/dist/commands/config.d.ts.map +1 -1
package/dist/commands/config.js +27 -5
package/dist/commands/config.js.map +1 -1
package/dist/commands/do.js +17 -10
package/dist/commands/do.js.map +1 -1
package/dist/commands/plan.js +3 -2
package/dist/commands/plan.js.map +1 -1
package/dist/core/pull-request.d.ts.map +1 -1
package/dist/core/pull-request.js +3 -1
package/dist/core/pull-request.js.map +1 -1
package/dist/utils/config.d.ts +6 -0
package/dist/utils/config.d.ts.map +1 -1
package/dist/utils/config.js +21 -0
package/dist/utils/config.js.map +1 -1
package/dist/utils/terminal-symbols.d.ts +8 -4
package/dist/utils/terminal-symbols.d.ts.map +1 -1
package/dist/utils/terminal-symbols.js +31 -6
package/dist/utils/terminal-symbols.js.map +1 -1
package/dist/utils/token-tracker.d.ts +11 -1
package/dist/utils/token-tracker.d.ts.map +1 -1
package/dist/utils/token-tracker.js +37 -2
package/dist/utils/token-tracker.js.map +1 -1
package/package.json +1 -1
package/src/commands/config.ts +30 -4
package/src/commands/do.ts +17 -10
package/src/commands/plan.ts +3 -2
package/src/core/pull-request.ts +3 -1
package/src/utils/config.ts +22 -0
package/src/utils/terminal-symbols.ts +42 -7
package/src/utils/token-tracker.ts +44 -2
package/tests/unit/config-command.test.ts +80 -1
package/tests/unit/config.test.ts +24 -0
package/tests/unit/terminal-symbols.test.ts +121 -33
package/tests/unit/timer-verbose-integration.test.ts +170 -0
package/tests/unit/token-tracker.test.ts +350 -17

package/RAF/ahtahs-token-reaper/decisions.md ADDED

@@ -0,0 +1,37 @@
+# Project Decisions
+## For the per-task token summary, should it show accumulated total or per-attempt breakdown?
+Per-attempt breakdown — show token usage for each attempt individually, plus a combined total.
+## Should per-attempt breakdown appear in normal output or only with --verbose?
+Always show breakdown — per-attempt details shown regardless of verbose flag for full cost transparency.
+## Should TokenTracker store per-attempt data, or should accumulation happen in do.ts?
+Tracker stores attempts — TokenTracker gains a richer data model with per-attempt entries. addTask accepts an array of UsageData. Centralized logic.
+## Should the grand total summary also show per-attempt breakdown?
+Grand total only — the final summary shows combined totals. Per-attempt detail is available in individual task summaries above.
+## What format for the model name in log messages?
+"...with sonnet" style — append 'with <model>' before the ellipsis, e.g., "Generating project name suggestions with sonnet..."
+## Should the model name be the short alias or full model ID?
+Short alias — display friendly names like 'sonnet', 'haiku', 'opus'. Cleaner output.
+## Should model-in-log apply only to name generation or all Claude calls?
+All Claude calls — add model names to all log messages where RAF invokes Claude (name generation, failure analysis, PR generation, config session).
+## When config is invalid, should `raf config` silently fall back or warn?
+Warn then continue — show a warning about the invalid config, then launch the interactive session normally with defaults.
+## Should config resilience apply to all commands or only `raf config`?
+Only `raf config` — it's the recovery tool. Other commands can still fail fast on invalid config.
+## When verbose is ON, should the task name and elapsed time be shown as a header?
+No header at all — when verbose is ON, only show Claude's raw output and tool descriptions. No task name or timer.
+## When toggling back to verbose OFF, should the timer resume or reset?
+Resume counting — timer continues from actual elapsed time since task start.
+## When verbose is ON, should tool use descriptions still be shown?
+Show both — show Claude's text AND tool use descriptions (→ Reading file.ts, → Running: npm test, etc.).

package/RAF/ahtahs-token-reaper/input.md ADDED

@@ -0,0 +1,20 @@
+- [ ] **Accumulate token usage across retry attempts** When a task retries, this assignment overwrites prior `usageData`, and the tracker is only updated once after the retry loop, so tokens/cost from earlier failed attempts are dropped. In any task that takes multiple attempts, the per-task and total summaries underreport actual consumption, which skews cost reporting for long or flaky runs.
+---
+When I switch to verbose mode, I see output together with the timer and task name repeated on each line. Could you remove the interactive timer when verbose mode is ON and restore it when verbose is OFF? Also, don't put the task name on each line in verbose ON mode. See log: ```● 01-extend-token-tracker-data-model 34s  [verbose: on]
+		● 01-extend-token-tracker-data-model 37s  → Updating task list
+		● 01-extend-token-tracker-data-model 39sNow let me add the `accumulateUsage()` function. I'll add it before the TokenTracker class.
+		● 01-extend-token-tracker-data-model 46s  → Editing /Users/eremeev/.raf/worktrees/RAF/ahtahs-token-reaper/src/utils/token-tracker.ts
+		● 01-extend-token-tracker-data-model 50s  → Updating task list
+		● 01-extend-token-tracker-data-model 52sNow let me update the `addTask()` method to accept an array.
+		● 01-extend-token-tracker-data-model 53s  → Reading /Users/eremeev/.raf/worktrees/RAF/ahtahs-token-reaper/src/utils/token-tracker.ts
+		● 01-extend-token-tracker-data-model 55sNow let me update the `addTask()` method to accept an array of UsageData.
+		● 01-extend-token-tracker-data-model 56s  [verbose: off]```

package/RAF/ahtahs-token-reaper/outcomes/01-extend-token-tracker-data-model.md ADDED

@@ -0,0 +1,42 @@
+# Task 01: Extend TokenTracker Data Model
+## Summary
+Refactored TokenTracker to accept and store per-attempt UsageData entries per task, enabling accurate token tracking across retries.
+## Changes Made
+### src/utils/token-tracker.ts
+- Added `attempts: UsageData[]` field to `TaskUsageEntry` interface
+- Created `accumulateUsage()` utility function that merges multiple UsageData objects into one, summing all token fields and merging modelUsage maps (handles different models across attempts)
+- Updated `addTask()` signature to accept `UsageData[]` instead of single `UsageData`
+- `addTask()` now calls `accumulateUsage()` to compute combined usage and stores raw attempts for future display breakdowns
+### src/commands/do.ts
+- Updated two call sites to wrap single `lastUsageData` in array `[lastUsageData]`
+- Added TODO comments indicating these should pass all attempt data once retry loop accumulates them
+### tests/unit/token-tracker.test.ts
+- Updated all existing test calls to use array syntax `[usage]`
+- Added new tests for:
+  - `accumulateUsage()` function (empty array, single element, multi-element, multi-model merging, non-mutation)
+  - Multi-attempt accumulation in `addTask()`
+  - Cost calculation for multi-model retry scenarios
+  - `attempts` array storage in entries
+## Acceptance Criteria Verification
+- [x] `TaskUsageEntry` has an `attempts: UsageData[]` field
+- [x] `addTask()` accepts an array and correctly accumulates tokens across attempts
+- [x] `accumulateUsage()` correctly sums all token fields including per-model breakdowns
+- [x] `getTotals()` returns correct grand totals when tasks have multiple attempts
+- [x] Single-attempt tasks behave identically to before
+- [x] All existing and new token-tracker tests pass (27 tests)
+## Notes
+- The `accumulateUsage()` function handles the case where different attempts use different models (e.g., Opus on first attempt, Sonnet on retry due to fallback)
+- `calculateCost()` was left unchanged as designed - it operates on the accumulated UsageData
+- Pre-existing test failures in validation.test.ts and claude-runner-interactive.test.ts are unrelated to this task
+<promise>COMPLETE</promise>
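
Reviewer sketch (not part of the package): the outcome above describes `accumulateUsage()` as summing all token fields and merging `modelUsage` maps without mutating its inputs. The following is a minimal illustration of that idea. The field names on `UsageData` and the helpers `ModelTokens`, `emptyTokens`, and `addTokens` are assumptions made for this example, since the real type definitions are not shown in this part of the diff.

```typescript
// Hypothetical shapes: the actual UsageData fields in rafcode may differ.
interface ModelTokens {
  inputTokens: number;
  outputTokens: number;
  cacheReadInputTokens: number;
  cacheCreationInputTokens: number;
}

interface UsageData extends ModelTokens {
  // Per-model breakdown, keyed by model name.
  modelUsage: Record<string, ModelTokens>;
}

const emptyTokens = (): ModelTokens => ({
  inputTokens: 0,
  outputTokens: 0,
  cacheReadInputTokens: 0,
  cacheCreationInputTokens: 0,
});

// Returns a new object; neither argument is modified.
function addTokens(a: ModelTokens, b: ModelTokens): ModelTokens {
  return {
    inputTokens: a.inputTokens + b.inputTokens,
    outputTokens: a.outputTokens + b.outputTokens,
    cacheReadInputTokens: a.cacheReadInputTokens + b.cacheReadInputTokens,
    cacheCreationInputTokens:
      a.cacheCreationInputTokens + b.cacheCreationInputTokens,
  };
}

// Merge several attempts into one combined UsageData.
function accumulateUsage(attempts: UsageData[]): UsageData {
  let totals = emptyTokens();
  const modelUsage: Record<string, ModelTokens> = {};
  for (const attempt of attempts) {
    totals = addTokens(totals, attempt);
    // Attempts may use different models (e.g. a fallback on retry).
    for (const [model, tokens] of Object.entries(attempt.modelUsage)) {
      modelUsage[model] = addTokens(modelUsage[model] ?? emptyTokens(), tokens);
    }
  }
  return { ...totals, modelUsage };
}
```

With this shape, an empty array yields all-zero totals and an empty `modelUsage` map, and a single-element array yields a copy of that element, which matches the single-attempt compatibility requirement stated in the acceptance criteria.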

package/RAF/ahtahs-token-reaper/outcomes/02-accumulate-usage-in-retry-loop.md ADDED

@@ -0,0 +1,31 @@
+# Task 02: Accumulate Usage in Retry Loop
+## Summary
+Modified the retry loop in `do.ts` to collect usage data from every attempt instead of overwriting it, and pass the full array to TokenTracker for accurate token tracking across retries.
+## Changes Made
+### src/commands/do.ts
+- Replaced `let lastUsageData: UsageData | undefined` with `const attemptUsageData: UsageData[] = []`
+- Changed from overwriting `lastUsageData = result.usageData` to `attemptUsageData.push(result.usageData)` when usage data is present
+- Updated success path (lines ~1091-1095): now checks `attemptUsageData.length > 0` and passes the full array to `tokenTracker.addTask()`
+- Updated failure path (lines ~1118-1122): same change, passes full array for partial data tracking
+- Removed TODO comments that were added in Task 01 as placeholders
+## Acceptance Criteria Verification
+- [x] Usage data from all retry attempts is collected in an array
+- [x] The full array is passed to `tokenTracker.addTask()`
+- [x] Attempts with no usage data (timeout/crash) are excluded from the array (only push when `result.usageData` is defined)
+- [x] Single-attempt tasks still work correctly (array of length 1)
+- [x] All tests pass (token-tracker: 27 tests, do-*: 44 tests)
+## Notes
+- The `lastOutput` variable remains unchanged as designed - only final output matters for result parsing
+- The existing tests from Task 01 already cover the accumulation logic in `TokenTracker` and `accumulateUsage()`
+- The change is minimal and surgical - only the usage data collection mechanism was updated
+- Edge cases (timeouts, crashes, context overflow) correctly result in no usage data being pushed for that attempt
+<promise>COMPLETE</promise>
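
Reviewer sketch (not part of the package): the change described above replaces an overwritten `lastUsageData` variable with an `attemptUsageData` array that only receives entries when an attempt actually reported usage. A simplified, synchronous illustration follows. `runWithRetries`, `AttemptResult`, and the two-field `UsageData` are hypothetical names for this example; the real loop in `do.ts` is not shown here and is presumably asynchronous.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface UsageData {
  inputTokens: number;
  outputTokens: number;
}

interface AttemptResult {
  success: boolean;
  usageData?: UsageData; // absent on timeout or crash
}

function runWithRetries(
  runAttempt: (attempt: number) => AttemptResult,
  maxAttempts: number,
): { success: boolean; attemptUsageData: UsageData[] } {
  // Collect every attempt's usage instead of keeping only the last one.
  const attemptUsageData: UsageData[] = [];
  let success = false;
  for (let attempt = 1; attempt <= maxAttempts && !success; attempt++) {
    const result = runAttempt(attempt);
    // Skip attempts that produced no usage data.
    if (result.usageData) {
      attemptUsageData.push(result.usageData);
    }
    success = result.success;
  }
  return { success, attemptUsageData };
}
```

The caller would then pass `attemptUsageData` to the tracker only when the array is non-empty, on both the success and the failure path, as the outcome notes describe.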

package/RAF/ahtahs-token-reaper/outcomes/03-per-attempt-display-formatting.md ADDED

@@ -0,0 +1,60 @@
+# Task 03: Per-Attempt Display Formatting
+## Summary
+Updated `formatTaskTokenSummary()` to display a per-attempt breakdown when a task took multiple attempts, while keeping single-attempt output unchanged.
+## Changes Made
+### src/utils/terminal-symbols.ts
+- Added import for `TaskUsageEntry` type from token-tracker
+- Created internal `formatTokenLine()` helper function that formats a single line of token usage (used for both attempts and totals)
+- Updated `formatTaskTokenSummary()` signature to accept:
+  - `entry: TaskUsageEntry` (replaces separate `usage` and `cost` parameters)
+  - `calculateAttemptCost?: (usage: UsageData) => CostBreakdown` (optional callback for per-attempt cost calculation)
+- Single-attempt behavior: When `entry.attempts.length <= 1`, output is identical to previous format: `"  Tokens: X in / Y out | Cache: ... | Est. cost: $X.XX"`
+- Multi-attempt behavior: Shows per-attempt breakdown with:
+  - Each attempt on its own line: `"    Attempt N: X in / Y out | Cache: ... | Est. cost: $X.XX"`
+  - Total line at the end: `"    Total: X in / Y out | Cache: ... | Est. cost: $X.XX"`
+### src/commands/do.ts
+- Updated both call sites (success and failure paths) to pass the full `TaskUsageEntry` and the `calculateCost` callback:
+  - `logger.dim(formatTaskTokenSummary(entry, (u) => tokenTracker.calculateCost(u)))`
+### tests/unit/terminal-symbols.test.ts
+- Added import for `TaskUsageEntry` type
+- Created `makeEntry()` helper to construct `TaskUsageEntry` objects for testing
+- Reorganized `formatTaskTokenSummary` tests into two describe blocks:
+  - `single-attempt tasks`: 6 tests verifying unchanged behavior for single-attempt scenarios
+  - `multi-attempt tasks`: 4 tests covering multi-attempt formatting, cost calculation, cache tokens, and 3+ attempts
+## Example Output
+**Single-attempt (unchanged):**
+```
+  Tokens: 5,234 in / 1,023 out | Cache: 18,500 read | Est. cost: $0.42
+```
+**Multi-attempt (new):**
+```
+    Attempt 1: 1,234 in / 567 out | Est. cost: $0.02
+    Attempt 2: 2,345 in / 890 out | Est. cost: $0.04
+    Total: 3,579 in / 1,457 out | Est. cost: $0.06
+```
+## Acceptance Criteria Verification
+- [x] Single-attempt tasks display identically to current format
+- [x] Multi-attempt tasks show per-attempt lines plus a total
+- [x] Formatting is clean and readable in terminal output
+- [x] `formatTokenTotalSummary()` is unchanged
+- [x] All call sites updated
+- [x] All tests pass (135 tests including 10 new tests for this feature)
+## Notes
+- The `calculateAttemptCost` callback is optional; when not provided, per-attempt costs show `$0.00` (the total still shows accurate accumulated cost)
+- Per-attempt lines use 4-space indent to visually nest under the task, while single-attempt uses 2-space indent
+- Cache tokens are included in per-attempt breakdowns when present
+<promise>COMPLETE</promise>
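
Reviewer sketch (not part of the package): the formatting rules above (2-space `Tokens:` line for a single attempt, 4-space `Attempt N:` lines plus a `Total:` line otherwise, `$0.00` per attempt when no callback is given) can be illustrated as follows. The type shapes, including `CostBreakdown.totalCost`, are assumptions for this example, and the cache segment of the real output is omitted to keep the sketch short.

```typescript
// Hypothetical shapes; the real types live in token-tracker and may differ.
interface UsageData {
  inputTokens: number;
  outputTokens: number;
}
interface CostBreakdown {
  totalCost: number;
}
interface TaskUsageEntry {
  usage: UsageData; // accumulated across attempts
  cost: CostBreakdown; // accumulated across attempts
  attempts: UsageData[];
}

const fmt = (n: number): string => n.toLocaleString("en-US");

// Formats one line; shared by attempt lines and the total line.
function formatTokenLine(
  indent: string,
  label: string,
  usage: UsageData,
  cost: number,
): string {
  return (
    `${indent}${label}: ${fmt(usage.inputTokens)} in / ` +
    `${fmt(usage.outputTokens)} out | Est. cost: $${cost.toFixed(2)}`
  );
}

function formatTaskTokenSummary(
  entry: TaskUsageEntry,
  calculateAttemptCost?: (usage: UsageData) => CostBreakdown,
): string {
  // Single attempt: keep the original one-line format.
  if (entry.attempts.length <= 1) {
    return formatTokenLine("  ", "Tokens", entry.usage, entry.cost.totalCost);
  }
  // Multiple attempts: one line per attempt, then the combined total.
  const lines = entry.attempts.map((attempt, i) =>
    formatTokenLine(
      "    ",
      `Attempt ${i + 1}`,
      attempt,
      calculateAttemptCost ? calculateAttemptCost(attempt).totalCost : 0,
    ),
  );
  lines.push(formatTokenLine("    ", "Total", entry.usage, entry.cost.totalCost));
  return lines.join("\n");
}
```

Passing the cost calculation as an optional callback keeps the formatter free of pricing logic, which appears to be the motivation for the `(u) => tokenTracker.calculateCost(u)` argument at the call sites.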

package/RAF/ahtahs-token-reaper/outcomes/04-add-model-name-to-claude-call-logs.md ADDED

@@ -0,0 +1,57 @@
+# Task 04: Add Model Name to Claude Invocation Logs
+## Summary
+Added a `getModelShortName()` utility function and updated all four Claude invocation log messages to display the short model alias (e.g., "sonnet", "haiku", "opus").
+## Changes Made
+### src/utils/config.ts
+- Added `getModelShortName(modelId: string)` utility function that:
+  - Returns short aliases (`opus`, `sonnet`, `haiku`) as-is
+  - Extracts family from full model IDs (e.g., `claude-sonnet-4-5-20250929` → `sonnet`)
+  - Returns unknown model IDs as-is for graceful fallback
+### src/commands/plan.ts
+- Added import for `getModel` and `getModelShortName`
+- Updated name generation log: `"Generating project name suggestions with ${nameModel}..."`
+### src/commands/do.ts
+- Added import for `getModel` and `getModelShortName`
+- Updated failure analysis log: `"Analyzing failure with ${analysisModel}..."`
+### src/core/pull-request.ts
+- Added import for `getModelShortName`
+- Added new log message in `generatePrBody()`: `"Generating PR with ${prModel}..."`
+### src/commands/config.ts
+- Added import for `getModelShortName`
+- Consolidated two log lines into one: `"Starting config session with ${configModel}..."`
+  - Previously: "Starting config session with Claude..." + "Using model: ${model}"
+  - Now: single line with short model name
+### tests/unit/config.test.ts
+- Added import for `getModelShortName`
+- Added test suite with 3 test cases:
+  - `should return short aliases as-is`
+  - `should extract family from full model IDs`
+  - `should return unknown model IDs as-is`
+## Acceptance Criteria Verification
+- [x] All four Claude invocation points show the model short name in their log messages
+  - Name generation: `"Generating project name suggestions with sonnet..."`
+  - Failure analysis: `"Analyzing failure with haiku..."`
+  - PR generation: `"Generating PR with sonnet..."`
+  - Config session: `"Starting config session with sonnet..."`
+- [x] Short name extraction works for full model IDs and already-short names
+- [x] Log format follows the "...with <model>..." pattern
+- [x] Unit tests cover the short name utility (3 tests)
+- [x] All tests pass (95 config tests, 1156 total passing)
+## Notes
+- Pre-existing test failures in `validation.test.ts` and `claude-runner-interactive.test.ts` are unrelated to this task
+- The `getModelShortName()` function reuses logic similar to `resolveModelPricingCategory()` but returns the original string for unknown models instead of `null`
+<promise>COMPLETE</promise>
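
Reviewer sketch (not part of the package): the three behaviours listed for `getModelShortName()` (short aliases returned as-is, family extracted from a full model ID, unknown IDs returned as-is) could be implemented along these lines. The matching rule used here, splitting on `-` and looking for a known family name, is an assumption; the actual implementation in `src/utils/config.ts` may match differently.

```typescript
// Known model families, per the aliases named in the outcome notes.
const MODEL_FAMILIES = ["opus", "sonnet", "haiku"];

function getModelShortName(modelId: string): string {
  // Assumed rule: a family name appears as a dash-separated segment,
  // e.g. "claude-sonnet-4-5-20250929" -> ["claude", "sonnet", "4", ...].
  const segments = modelId.split("-");
  const family = MODEL_FAMILIES.find((name) => segments.includes(name));
  // Unknown IDs fall through unchanged.
  return family ?? modelId;
}

// Example of the "...with <model>..." log format from the decisions file.
function nameGenerationMessage(modelId: string): string {
  return `Generating project name suggestions with ${getModelShortName(modelId)}...`;
}
```

`nameGenerationMessage` is a hypothetical helper added only to show how the short name slots into the log text.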

package/RAF/ahtahs-token-reaper/outcomes/05-handle-invalid-config-in-raf-config.md ADDED

@@ -0,0 +1,46 @@
+# Task 05: Handle Invalid Config Gracefully in raf config Command
+## Summary
+Made `raf config` resilient to invalid or corrupt config files so it can serve as the recovery path for broken configurations. Previously, if `~/.raf/raf.config.json` contained invalid JSON or failed schema validation, `raf config` would crash before the interactive session could launch, blocking users from fixing the issue.
+## Changes Made
+### src/commands/config.ts
+- Added import for `resetConfigCache` from config utilities
+- Added import for `DEFAULT_CONFIG` from types/config
+- Wrapped `getModel('config')` and `getEffort('config')` calls in try-catch block
+- On error, falls back to `DEFAULT_CONFIG.models.config` ('sonnet') and `DEFAULT_CONFIG.effort.config` ('medium')
+- Displays warning message with the specific error: "Config file has errors, using defaults: {message}"
+- Provides guidance: "Fix the config in this session or run `raf config --reset` to start fresh."
+- Calls `resetConfigCache()` to clear any broken cached config
+- The interactive Claude session still receives the broken config file contents via `getCurrentConfigState()`, so the user can see and fix the issue
+### tests/unit/config-command.test.ts
+- Added imports for `resolveConfig`, `getModel`, `getEffort`, `resetConfigCache`, and `DEFAULT_CONFIG`
+- Added `resetConfigCache()` calls to beforeEach/afterEach for test isolation
+- Added new test suite "Error recovery - invalid config fallback" with 6 tests:
+  - Throws on invalid JSON when resolving config
+  - Throws on schema validation failure when resolving config
+  - Default fallback values are correct for config scenario
+  - Raw file contents readable even with invalid JSON
+  - Raw file contents readable even with schema validation failure
+  - resetConfigCache clears the cached config
+## Acceptance Criteria Verification
+- [x] `raf config` launches successfully even when `~/.raf/raf.config.json` is invalid JSON
+- [x] `raf config` launches successfully even when config fails schema validation
+- [x] A clear warning is displayed to the user about the config error
+- [x] The interactive session uses default model/effort values as fallback
+- [x] The broken config content is still visible in the session for the user to fix
+- [x] Other commands (`raf plan`, `raf do`, `raf status`) still fail fast on invalid config
+- [x] All tests pass (121 config-related tests, 1162 total passing; pre-existing failures in validation.test.ts and claude-runner-interactive.test.ts are unrelated)
+## Notes
+- The error handling is specific to `raf config` - other commands continue to fail fast on invalid config as expected
+- The `getCurrentConfigState()` function reads raw file content without JSON parsing, so broken content is always available to Claude in the session
+- The `--reset` option doesn't need this fix since it deletes the file without loading it
+<promise>COMPLETE</promise>
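
Reviewer sketch (not part of the package): the fallback described above wraps the `getModel('config')` and `getEffort('config')` calls in a try-catch, falls back to the defaults (`'sonnet'` and `'medium'`, as stated in the outcome), builds the warning text, and resets the config cache. The sketch below injects those functions as parameters so it stays self-contained; `resolveConfigSessionSettings` and `SessionSettings` are hypothetical names, and the `DEFAULT_CONFIG` literal only mirrors the two values quoted in the outcome.

```typescript
interface SessionSettings {
  model: string;
  effort: string;
  warning?: string; // set only when the config could not be loaded
}

// Only the two defaults quoted in the outcome notes are reproduced here.
const DEFAULT_CONFIG = {
  models: { config: "sonnet" },
  effort: { config: "medium" },
};

function resolveConfigSessionSettings(
  getModel: (scenario: "config") => string,
  getEffort: (scenario: "config") => string,
  resetConfigCache: () => void,
): SessionSettings {
  try {
    // Either call may throw on invalid JSON or a schema violation.
    return { model: getModel("config"), effort: getEffort("config") };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    // Drop any broken cached config before continuing with defaults.
    resetConfigCache();
    return {
      model: DEFAULT_CONFIG.models.config,
      effort: DEFAULT_CONFIG.effort.config,
      warning: `Config file has errors, using defaults: ${message}`,
    };
  }
}
```

Because the error handling lives only in this one code path, other commands that call `getModel()` directly would still fail fast, which is the behaviour the decisions file asks for.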

package/RAF/ahtahs-token-reaper/outcomes/06-fix-verbose-toggle-timer-display.md ADDED

@@ -0,0 +1,38 @@
+# Task 06: Fix Verbose Toggle Timer Display
+## Summary
+Modified the timer callback in `do.ts` to check the verbose toggle state on each tick. When verbose is toggled ON at runtime, the status line is immediately cleared and updates are skipped. When toggled back OFF, the timer resumes displaying with the accurate elapsed time.
+## Changes Made
+### src/commands/do.ts
+- Updated the `onTick` callback (lines 915-923) to check `verboseToggle.isVerbose` on every tick
+- When verbose is ON: calls `statusLine.clear()` and returns early (skipping the update)
+- When verbose is OFF: updates the status line as normal with task progress
+- The timer continues tracking elapsed time internally regardless of display state
+### tests/unit/timer-verbose-integration.test.ts (new file)
+- Created new test file with 5 tests covering the timer-verbose integration:
+  - `should update status line when verbose is off`
+  - `should clear status line and skip update when verbose is toggled on`
+  - `should resume updating status line when verbose is toggled back off`
+  - `should track elapsed time correctly regardless of verbose state`
+  - `should not create timer callback when started with verbose flag`
+## Acceptance Criteria Verification
+- [x] Toggling verbose ON clears the status line and stops timer/task-name display
+- [x] Toggling verbose OFF resumes the timer/status line with correct elapsed time
+- [x] No task name prefix appears on verbose output lines (status line cleared immediately)
+- [x] Starting with `--verbose` flag still works as before (no timer callback created)
+- [x] Timer internally tracks elapsed time correctly regardless of display state
+- [x] All existing tests pass (1167 passing; 3 pre-existing failures in validation.test.ts and claude-runner-interactive.test.ts are unrelated)
+## Notes
+- The fix is minimal: just 4 lines added to the existing `onTick` callback
+- The `statusLine.clear()` call happens on every tick while verbose is on, which is safe because the clear operation is idempotent
+- The next tick after toggling verbose OFF will immediately show the correct elapsed time since the timer tracks time independently
+<promise>COMPLETE</promise>
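
Reviewer sketch (not part of the package): the `onTick` change described above checks the verbose toggle on every tick, clearing the status line and returning early while verbose is ON, and updating it as usual otherwise. A minimal illustration follows. The `StatusLine` and `VerboseToggle` interfaces, the `createOnTick` factory, and the status text format are assumptions for this example; only `statusLine.clear()` and `verboseToggle.isVerbose` are named in the outcome.

```typescript
// Hypothetical interfaces standing in for the real status line and toggle.
interface StatusLine {
  update(text: string): void;
  clear(): void;
}
interface VerboseToggle {
  isVerbose: boolean;
}

function createOnTick(
  statusLine: StatusLine,
  verboseToggle: VerboseToggle,
  taskName: string,
): (elapsedSeconds: number) => void {
  return (elapsedSeconds: number): void => {
    if (verboseToggle.isVerbose) {
      // Verbose ON: hide the timer and skip the update.
      statusLine.clear();
      return;
    }
    // Verbose OFF: show task name and the timer's own elapsed time.
    statusLine.update(`● ${taskName} ${elapsedSeconds}s`);
  };
}
```

Since the elapsed time is supplied by the timer rather than tracked by the callback, the first tick after toggling verbose OFF shows the true time since task start, consistent with the "resume counting" decision.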

package/RAF/ahtahs-token-reaper/plans/01-extend-token-tracker-data-model.md ADDED

@@ -0,0 +1,36 @@
+# Task: Extend TokenTracker to store per-attempt usage data
+## Objective
+Refactor TokenTracker to accept and store an array of per-attempt UsageData entries per task, instead of a single UsageData.
+## Context
+Currently TokenTracker stores one `UsageData` per task via `addTask(taskId, usage)`. When a task retries, only the last attempt's data reaches the tracker. To fix underreporting, the tracker needs to accept multiple attempt entries per task and compute totals from all of them.
+## Requirements
+- Change `TaskUsageEntry` to hold an array of attempt `UsageData` entries alongside the aggregated totals
+- Update `addTask()` to accept an array of `UsageData` (one per attempt) instead of a single `UsageData`
+- The per-entry `usage` field should be the sum of all attempts (for backward compatibility with `getTotals()`)
+- The per-entry `cost` field should be the sum of all attempts' costs
+- Store the raw per-attempt data so formatting functions can display breakdowns
+- `getTotals()` should continue to work correctly — it already sums across entries, so as long as each entry's `usage` is the accumulated total, no changes needed there
+- Add a helper method or utility to merge/accumulate multiple `UsageData` objects into one
+- Maintain backward compatibility: if only one attempt occurred, behavior is identical to today
+- Cover changes with unit tests
+## Implementation Steps
+1. Add an `attempts` field to `TaskUsageEntry` that stores the array of individual `UsageData` objects
+2. Create an `accumulateUsage()` utility that merges multiple `UsageData` into a single combined `UsageData` (summing all token fields and merging `modelUsage` maps)
+3. Update `addTask()` signature to accept `UsageData[]` — it calls `accumulateUsage()` to compute the combined `usage` and `calculateCost()` on the combined result
+4. Update existing tests and add new tests for multi-attempt accumulation
+## Acceptance Criteria
+- [ ] `TaskUsageEntry` has an `attempts: UsageData[]` field
+- [ ] `addTask()` accepts an array and correctly accumulates tokens across attempts
+- [ ] `accumulateUsage()` correctly sums all token fields including per-model breakdowns
+- [ ] `getTotals()` returns correct grand totals when tasks have multiple attempts
+- [ ] Single-attempt tasks behave identically to before
+- [ ] All existing and new tests pass
+## Notes
+- The `accumulateUsage()` helper should handle merging `modelUsage` maps where different attempts may use different models (e.g., attempt 1 uses Opus, retry uses Sonnet via fallback)
+- Keep `calculateCost()` unchanged — it operates on a single `UsageData` which is the accumulated total

package/RAF/ahtahs-token-reaper/plans/02-accumulate-usage-in-retry-loop.md ADDED

@@ -0,0 +1,36 @@
+# Task: Accumulate usage data across retry attempts in the retry loop
+## Objective
+Change the retry loop in `do.ts` to collect usage data from every attempt instead of overwriting it, and pass the full array to TokenTracker.
+## Context
+The retry loop in `src/commands/do.ts` (around line 908-1021) currently declares a single `lastUsageData` variable that gets overwritten on each retry attempt. After the loop, only the final attempt's data is passed to `tokenTracker.addTask()`. This must change to collect all attempts' data.
+## Dependencies
+01
+## Requirements
+- Replace the single `lastUsageData` variable with an array that collects `UsageData` from each attempt
+- Push each attempt's `usageData` into the array (when present) instead of overwriting
+- After the retry loop, pass the full array to `tokenTracker.addTask()` (using the new signature from task 01)
+- Both success and failure paths (lines ~1090 and ~1117) should pass the array
+- Handle edge case: some attempts may not produce `usageData` (timeout, crash) — skip those entries
+- Cover changes with tests
+## Implementation Steps
+1. Replace `let lastUsageData: UsageData | undefined` with `const attemptUsageData: UsageData[] = []`
+2. Inside the retry loop, change the overwrite (`lastUsageData = result.usageData`) to a push (`attemptUsageData.push(result.usageData)`) when `result.usageData` is defined
+3. Update the success path: call `tokenTracker.addTask(task.id, attemptUsageData)` when the array is non-empty
+4. Update the failure path: same change
+5. Add/update tests to verify accumulation across retries
+## Acceptance Criteria
+- [ ] Usage data from all retry attempts is collected in an array
+- [ ] The full array is passed to `tokenTracker.addTask()`
+- [ ] Attempts with no usage data (timeout/crash) are excluded from the array
+- [ ] Single-attempt tasks still work correctly (array of length 1)
+- [ ] All tests pass
+## Notes
+- The variable `lastOutput` should remain as-is (overwritten each attempt) since only the final output matters for result parsing
+- Look at the `result.output` fallback path (line 971-974) — the old code had a fallback where `lastUsageData = result.output` which seems like a type issue; clean this up if it's not needed

package/RAF/ahtahs-token-reaper/plans/03-per-attempt-display-formatting.md ADDED

@@ -0,0 +1,43 @@
+# Task: Update token summary formatting to show per-attempt breakdowns
+## Objective
+Update `formatTaskTokenSummary()` to display a per-attempt breakdown when a task took multiple attempts, while keeping single-attempt output unchanged.
+## Context
+With tasks 01 and 02 complete, the `TaskUsageEntry` now contains an `attempts` array with per-attempt `UsageData`. The formatting function needs to render this breakdown so users can see token consumption per retry attempt.
+## Dependencies
+01, 02
+## Requirements
+- When a task has only 1 attempt, output is identical to the current format (no visual change)
+- When a task has multiple attempts, show each attempt's tokens and cost on its own line, followed by a total line
+- Always show the breakdown (not gated by --verbose flag)
+- The grand total summary (`formatTokenTotalSummary`) remains unchanged — it shows combined totals only
+- Update `formatTaskTokenSummary()` signature to accept the `attempts` array from `TaskUsageEntry`
+- Cover changes with unit tests
+## Implementation Steps
+1. Update `formatTaskTokenSummary()` to accept the full `TaskUsageEntry` (or at minimum `usage`, `cost`, and `attempts`)
+2. For single-attempt tasks (array length 1), render the existing format unchanged
+3. For multi-attempt tasks, render each attempt on its own indented line with attempt number, tokens, and cost, then a total line
+4. Update all call sites that invoke `formatTaskTokenSummary()` to pass the attempts data
+5. Add tests for both single-attempt and multi-attempt formatting
+## Acceptance Criteria
+- [ ] Single-attempt tasks display identically to current format
+- [ ] Multi-attempt tasks show per-attempt lines plus a total
+- [ ] Formatting is clean and readable in terminal output
+- [ ] `formatTokenTotalSummary()` is unchanged
+- [ ] All call sites updated
+- [ ] All tests pass
+## Notes
+- Example multi-attempt output (approximate):
+  ```
+    Attempt 1: 1,234 in / 567 out | Est. cost: $0.02
+    Attempt 2: 2,345 in / 890 out | Est. cost: $0.04
+    Total: 3,579 in / 1,457 out | Est. cost: $0.06
+  ```
+- Keep the dim styling consistent with existing token output
+- The TokenTracker's `calculateCost()` can be used to get per-attempt costs if needed

package/RAF/ahtahs-token-reaper/plans/04-add-model-name-to-claude-call-logs.md ADDED

@@ -0,0 +1,38 @@
+# Task: Add model name to all Claude invocation log messages
+## Objective
+Display the short model alias (e.g., "sonnet", "haiku") in all log messages where RAF invokes Claude for non-task purposes.
+## Context
+When RAF calls Claude for auxiliary tasks (name generation, failure analysis, PR generation, config), it logs a message but doesn't indicate which model is being used. Users want visibility into which model is handling each call, especially since models are configurable per scenario.
+## Requirements
+- Use the format "...with <model>..." — append the model name before the trailing ellipsis
+- Display the short alias (sonnet, haiku, opus) not the full model ID
+- Apply to all four Claude invocation log messages:
+  1. Name generation: `src/commands/plan.ts` line 158 — currently "Generating project name suggestions..."
+  2. Failure analysis: `src/commands/do.ts` line 1111 — currently "Analyzing failure..."
+  3. PR generation: `src/core/pull-request.ts` — currently no explicit log message (add one)
+  4. Config session: `src/commands/config.ts` line 184 — currently "Starting config session with Claude..."
+- Create a utility to extract the short alias from a full model ID string (e.g., "claude-sonnet-4-5-20250929" → "sonnet")
+- The model value comes from `getModel()` calls in each module — the short name should be derived from whatever that returns
+## Implementation Steps
+1. Add a `getModelShortName(modelId: string)` utility that extracts the short alias from a model ID string — handle both full IDs ("claude-sonnet-4-5-20250929") and already-short names ("sonnet")
+2. Update the name generation log in `src/commands/plan.ts` to include the model: "Generating project name suggestions with sonnet..."
+3. Update the failure analysis log in `src/commands/do.ts` to include the model: "Analyzing failure with haiku..."
+4. Add a log message for PR generation in `src/core/pull-request.ts`: "Generating PR with haiku..."
+5. Update the config session log in `src/commands/config.ts` to include the model: "Starting config session with sonnet..."
+6. Cover the `getModelShortName()` utility with unit tests
+## Acceptance Criteria
+- [ ] All four Claude invocation points show the model short name in their log messages
+- [ ] Short name extraction works for full model IDs and already-short names
+- [ ] Log format follows the "...with <model>..." pattern
+- [ ] Unit tests cover the short name utility
+- [ ] All tests pass
+## Notes
+- The model for each scenario is retrieved via `getModel('nameGeneration')`, `getModel('failureAnalysis')`, `getModel('prGeneration')`, `getModel('config')` from `src/utils/config.ts`
+- Some call sites may need to retrieve the model earlier or pass it around to have it available at the log point — for instance, name generation logs in `plan.ts` but the model is determined inside `name-generator.ts`
+- For the config session, there's already a line showing the model — consolidate if appropriate
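
The plan above asks for a `getModelShortName()` utility, but its implementation is not part of this hunk. A minimal sketch of what such a helper could look like follows. The alias list, the segment-matching approach, the pass-through for unknown model strings, and the `withModel()` helper are all assumptions made for illustration, not the package's actual code.

```typescript
// Hypothetical sketch: the package's real implementation is not shown in this diff.
const KNOWN_ALIASES = ['sonnet', 'haiku', 'opus'];

// Extracts the short alias from a full model ID such as
// "claude-sonnet-4-5-20250929", or returns an already-short name unchanged.
function getModelShortName(modelId: string): string {
  const segments = modelId.toLowerCase().split('-');
  for (const alias of KNOWN_ALIASES) {
    // Match the alias as a whole hyphen-delimited segment.
    if (segments.some((segment) => segment === alias)) {
      return alias;
    }
  }
  // Assumption: unrecognized model strings are passed through as-is.
  return modelId;
}

// Hypothetical helper that builds a log message in the "...with <model>..." format.
function withModel(message: string, modelId: string): string {
  return `${message} with ${getModelShortName(modelId)}...`;
}
```

Under these assumptions, a call site would log something like `withModel('Analyzing failure', model)`, where `model` is whatever `getModel('failureAnalysis')` returned.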

package/RAF/ahtahs-token-reaper/plans/05-handle-invalid-config-in-raf-config.md ADDED

@@ -0,0 +1,36 @@
+# Task: Handle invalid config gracefully in raf config command
+## Objective
+Make `raf config` resilient to invalid or corrupt config files so it can serve as the recovery path for broken configurations.
+## Context
+In `src/commands/config.ts`, the command calls `getModel('config')` and `getEffort('config')` early in execution. These read from the resolved config, which requires loading and validating `~/.raf/raf.config.json`. If that file contains invalid JSON or fails schema validation, these calls throw and `raf config` exits immediately — blocking the user from using the interactive editor to fix their config. Since `raf config` is the intended way to edit config, it must survive a broken config file.
+## Requirements
+- Wrap the config-loading path in `raf config` with error handling that catches JSON parse errors and schema validation failures
+- On error, warn the user with a visible message (e.g., "Config file has errors, using defaults") that includes the specific error
+- Fall back to default config values for model and effort so the interactive session can launch
+- The interactive Claude session should still receive the current (broken) config file contents as context, so the user can see and fix the issue
+- Only apply this resilience to `raf config` — other commands should continue to fail fast on invalid config
+- Cover the error-handling path with tests
+## Implementation Steps
+1. In `src/commands/config.ts`, wrap the `getModel('config')` and `getEffort('config')` calls in a try-catch
+2. On catch, log a warning with the error details and fall back to the default model/effort values from `DEFAULT_CONFIG`
+3. Ensure the rest of the command continues normally — the interactive session launches with defaults
+4. Make sure the broken config file contents are still shown to Claude in the session prompt so the user can diagnose and fix
+5. Add tests for the error-recovery path (invalid JSON, schema validation failure)
+## Acceptance Criteria
+- [ ] `raf config` launches successfully even when `~/.raf/raf.config.json` is invalid JSON
+- [ ] `raf config` launches successfully even when config fails schema validation
+- [ ] A clear warning is displayed to the user about the config error
+- [ ] The interactive session uses default model/effort values as fallback
+- [ ] The broken config content is still visible in the session for the user to fix
+- [ ] Other commands (`raf plan`, `raf do`, `raf status`) still fail fast on invalid config
+- [ ] All tests pass
+## Notes
+- Check whether `loadConfig()` or the individual `getModel()`/`getEffort()` accessors are the right place to catch — it may be cleaner to catch at the `loadConfig()` level and return defaults
+- The post-session validation already checks for config errors after the session ends — this change handles the pre-session path
+- Consider whether `raf config --reset` also needs this fix (it probably doesn't since reset deletes the file without loading it)
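
The recovery path described in this plan can be sketched in isolation as a function that tries the config accessors and falls back to defaults on any error. The `SessionSettings` shape, the `resolveSessionSettings()` name, and the default values used in the example are hypothetical. The shipped change appears in the `package/dist/commands/config.js` hunk and uses `DEFAULT_CONFIG` directly.

```typescript
// Hypothetical shapes for illustration; the real types live in the package source.
interface SessionSettings {
  model: string;
  effort: string;
  configError: Error | null;
}

// Tries to read model and effort from the config. If either accessor throws
// (invalid JSON, schema validation failure), falls back to the given defaults
// and reports the error so the caller can warn the user.
function resolveSessionSettings(
  readModel: () => string,
  readEffort: () => string,
  defaults: { model: string; effort: string },
): SessionSettings {
  try {
    return { model: readModel(), effort: readEffort(), configError: null };
  } catch (error) {
    const configError = error instanceof Error ? error : new Error(String(error));
    return { model: defaults.model, effort: defaults.effort, configError };
  }
}

// Example: simulate a config file containing invalid JSON.
// The default values here are placeholders, not the package's real defaults.
const recovered = resolveSessionSettings(
  () => JSON.parse('{ not valid json').models.config,
  () => 'medium',
  { model: 'sonnet', effort: 'high' },
);
if (recovered.configError) {
  console.warn(`Config file has errors, using defaults: ${recovered.configError.message}`);
}
```

Keeping the fallback in the `raf config` command rather than in the shared accessors is what lets other commands continue to fail fast, as the requirements state.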

package/RAF/ahtahs-token-reaper/plans/06-fix-verbose-toggle-timer-display.md ADDED

@@ -0,0 +1,40 @@
+# Task: Fix Verbose Toggle Timer Display
+## Objective
+Stop the interactive timer and task-name prefix from displaying when verbose mode is toggled ON, and resume them when toggled OFF.
+## Context
+When a user presses Tab during task execution to toggle verbose ON, the timer/status line continues updating and gets interleaved with Claude's streamed output. This produces garbled lines like:
+```
+● 01-extend-token-tracker-data-model 39sNow let me add the accumulateUsage() function.
+```
+The timer callback and status line need to be aware of the verbose toggle state so they pause/clear when verbose is ON and resume when verbose is OFF.
+## Requirements
+- When verbose toggles ON mid-execution: immediately clear the status line and stop timer display updates
+- When verbose toggles OFF mid-execution: resume the timer/status line from the actual elapsed time (no reset)
+- When started with `--verbose` flag: current behavior is already correct (no timer callback) — preserve this
+- No task name or timer shown at all while verbose output is streaming — no header line either
+- Tool use descriptions (→ Reading file.ts) and Claude text output continue to display normally when verbose is ON
+- The timer itself keeps counting internally regardless of display state (elapsed time stays accurate)
+## Implementation Steps
+1. Read the current timer callback setup in `do.ts` where `createTaskTimer` is called — understand how the `onTick` callback currently works
+2. Read `verbose-toggle.ts` to understand the toggle mechanism and its `isVerbose` property
+3. Modify the timer's `onTick` callback to check `verboseToggle.isVerbose` on each tick — if verbose is ON, clear the status line and skip the update; if OFF, render the status line as normal
+4. Ensure the status line is cleared immediately when verbose toggles ON (so the last timer line doesn't linger above the verbose output). This may require hooking into the toggle event or simply having the next tick handle it
+5. Verify that toggling OFF restores the status line with the correct elapsed time on the next tick
+6. Add tests for the new behavior: timer callback respects verbose state, status line cleared on verbose ON, resumed on verbose OFF
+## Acceptance Criteria
+- [ ] Toggling verbose ON clears the status line and stops timer/task-name display
+- [ ] Toggling verbose OFF resumes the timer/status line with correct elapsed time
+- [ ] No task name prefix appears on verbose output lines
+- [ ] Starting with `--verbose` flag still works as before (no timer at all)
+- [ ] Timer internally tracks elapsed time correctly regardless of display state
+- [ ] All existing tests pass
+## Notes
+- The key files are `src/commands/do.ts` (timer callback setup around line 914), `src/utils/status-line.ts`, `src/utils/timer.ts`, and `src/utils/verbose-toggle.ts`
+- The fix is likely a small change to the `onTick` callback — check `verboseToggle.isVerbose` and conditionally clear/update the status line
+- Be careful with the edge case where `verbose` is the initial flag (no toggle exists) vs. runtime toggle via Tab
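
As a rough illustration of implementation step 3, the `onTick` callback could consult the verbose state on every tick. The `VerboseState` and `StatusLine` interfaces, `formatElapsed()`, and `createOnTick()` below are hypothetical names; the plan only mentions the `isVerbose` property and the `onTick` callback. Clearing the line once per verbose period, rather than on every tick, is a choice made in this sketch.

```typescript
// Hypothetical interfaces; only `isVerbose` and `onTick` are named in the plan.
interface VerboseState {
  readonly isVerbose: boolean;
}

interface StatusLine {
  update(text: string): void;
  clear(): void;
}

// Formats elapsed milliseconds as whole seconds, e.g. 39000 -> "39s".
function formatElapsed(elapsedMs: number): string {
  return `${Math.floor(elapsedMs / 1000)}s`;
}

// Returns a tick callback that renders the status line only while verbose is OFF.
// The timer keeps counting on its own; this callback only controls the display.
function createOnTick(
  taskName: string,
  verbose: VerboseState,
  statusLine: StatusLine,
): (elapsedMs: number) => void {
  let hidden = false;
  return (elapsedMs) => {
    if (verbose.isVerbose) {
      // Clear once when verbose turns ON, then stay silent until it turns OFF.
      if (!hidden) {
        statusLine.clear();
        hidden = true;
      }
      return;
    }
    hidden = false;
    statusLine.update(`● ${taskName} ${formatElapsed(elapsedMs)}`);
  };
}
```

Because the callback receives the elapsed time from the timer, toggling verbose OFF resumes the display at the true elapsed time without any reset. This sketch clears the line on the next tick after the toggle; clearing immediately, as step 4 suggests, would additionally require hooking into the toggle event.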

package/dist/commands/config.d.ts.map CHANGED

@@ -1 +1 @@
-{"version":3,"file":"config.d.ts","sourceRoot":"","sources":["../../src/commands/config.ts"],"names":[],"mappings":"AAIA,OAAO,EAAE,OAAO,EAAE,MAAM,WAAW,CAAC;AA8GpC,wBAAgB,mBAAmB,IAAI,OAAO,CAgB7C"}
+{"version":3,"file":"config.d.ts","sourceRoot":"","sources":["../../src/commands/config.ts"],"names":[],"mappings":"AAIA,OAAO,EAAE,OAAO,EAAE,MAAM,WAAW,CAAC;AAiHpC,wBAAgB,mBAAmB,IAAI,OAAO,CAgB7C"}

package/dist/commands/config.js CHANGED

@@ -6,7 +6,8 @@ import { Command } from 'commander';
 import { ClaudeRunner } from '../core/claude-runner.js';
 import { shutdownHandler } from '../core/shutdown-handler.js';
 import { logger } from '../utils/logger.js';
-import { getConfigPath, getModel, getEffort, validateConfig, ConfigValidationError, } from '../utils/config.js';
+import { getConfigPath, getModel, getEffort, getModelShortName, validateConfig, ConfigValidationError, resetConfigCache, } from '../utils/config.js';
+import { DEFAULT_CONFIG } from '../types/config.js';
 /**
  * Load the config documentation markdown from src/prompts/config-docs.md.
  * Resolved relative to this file's location in the dist/ tree.
@@ -128,8 +129,29 @@ async function handleReset() {
 }
 async function runConfigSession(initialPrompt) {
     const configPath = getConfigPath();
-    const model = getModel('config');
-    const effort = getEffort('config');
+    // Try to load config, but fall back to defaults if it's broken
+    // This allows raf config to be used to fix a broken config file
+    let model;
+    let effort;
+    let configError = null;
+    try {
+        model = getModel('config');
+        effort = getEffort('config');
+    }
+    catch (error) {
+        // Config file has errors - fall back to defaults so the session can launch
+        configError = error instanceof Error ? error : new Error(String(error));
+        model = DEFAULT_CONFIG.models.config;
+        effort = DEFAULT_CONFIG.effort.config;
+        // Clear the cached config so subsequent calls don't use the broken cache
+        resetConfigCache();
+    }
+    // Warn user if config has errors, before starting the session
+    if (configError) {
+        logger.warn(`Config file has errors, using defaults: ${configError.message}`);
+        logger.warn('Fix the config in this session or run `raf config --reset` to start fresh.');
+        logger.newline();
+    }
     // Set effort level env var for the Claude session
     process.env['CLAUDE_CODE_EFFORT_LEVEL'] = effort;
     // Load config docs
@@ -151,8 +173,8 @@ async function runConfigSession(initialPrompt) {
     const claudeRunner = new ClaudeRunner({ model });
     shutdownHandler.init();
     shutdownHandler.registerClaudeRunner(claudeRunner);
-    logger.info('Starting config session with Claude...');
-    logger.info(`Using model: ${model}`);
+    const configModel = getModelShortName(model);
+    logger.info(`Starting config session with ${configModel}...`);
     logger.newline();
     try {
         const exitCode = await claudeRunner.runInteractive(systemPrompt, userMessage, {