rafcode 2.1.1 → 2.2.0
- package/.claude/settings.local.json +4 -1
- package/CLAUDE.md +59 -11
- package/RAF/ahslfe-config-wizard/decisions.md +34 -0
- package/RAF/ahslfe-config-wizard/input.md +1 -0
- package/RAF/ahslfe-config-wizard/outcomes/01-define-config-schema.md +38 -0
- package/RAF/ahslfe-config-wizard/outcomes/02-refactor-codebase-to-use-config.md +67 -0
- package/RAF/ahslfe-config-wizard/outcomes/03-create-config-documentation.md +37 -0
- package/RAF/ahslfe-config-wizard/outcomes/04-implement-raf-config-command.md +47 -0
- package/RAF/ahslfe-config-wizard/outcomes/05-update-claude-md.md +26 -0
- package/RAF/ahslfe-config-wizard/plans/01-define-config-schema.md +73 -0
- package/RAF/ahslfe-config-wizard/plans/02-refactor-codebase-to-use-config.md +74 -0
- package/RAF/ahslfe-config-wizard/plans/03-create-config-documentation.md +57 -0
- package/RAF/ahslfe-config-wizard/plans/04-implement-raf-config-command.md +66 -0
- package/RAF/ahslfe-config-wizard/plans/05-update-claude-md.md +60 -0
- package/RAF/ahstvo-token-tracker/decisions.md +44 -0
- package/RAF/ahstvo-token-tracker/input.md +3 -0
- package/RAF/ahstvo-token-tracker/outcomes/01-full-model-id-support.md +43 -0
- package/RAF/ahstvo-token-tracker/outcomes/02-name-generation-no-session.md +33 -0
- package/RAF/ahstvo-token-tracker/outcomes/03-unify-stream-json-execution.md +48 -0
- package/RAF/ahstvo-token-tracker/outcomes/04-token-tracking-cost-calculation.md +53 -0
- package/RAF/ahstvo-token-tracker/outcomes/05-token-cost-console-reporting.md +57 -0
- package/RAF/ahstvo-token-tracker/outcomes/06-runtime-verbose-toggle.md +53 -0
- package/RAF/ahstvo-token-tracker/outcomes/07-readme-config-docs.md +36 -0
- package/RAF/ahstvo-token-tracker/plans/01-full-model-id-support.md +35 -0
- package/RAF/ahstvo-token-tracker/plans/02-name-generation-no-session.md +36 -0
- package/RAF/ahstvo-token-tracker/plans/03-unify-stream-json-execution.md +44 -0
- package/RAF/ahstvo-token-tracker/plans/04-token-tracking-cost-calculation.md +56 -0
- package/RAF/ahstvo-token-tracker/plans/05-token-cost-console-reporting.md +55 -0
- package/RAF/ahstvo-token-tracker/plans/06-runtime-verbose-toggle.md +48 -0
- package/RAF/ahstvo-token-tracker/plans/07-readme-config-docs.md +44 -0
- package/README.md +34 -0
- package/dist/commands/config.d.ts +3 -0
- package/dist/commands/config.d.ts.map +1 -0
- package/dist/commands/config.js +173 -0
- package/dist/commands/config.js.map +1 -0
- package/dist/commands/do.d.ts.map +1 -1
- package/dist/commands/do.js +47 -6
- package/dist/commands/do.js.map +1 -1
- package/dist/commands/plan.d.ts.map +1 -1
- package/dist/commands/plan.js +3 -2
- package/dist/commands/plan.js.map +1 -1
- package/dist/core/claude-runner.d.ts +19 -2
- package/dist/core/claude-runner.d.ts.map +1 -1
- package/dist/core/claude-runner.js +43 -96
- package/dist/core/claude-runner.js.map +1 -1
- package/dist/core/failure-analyzer.d.ts.map +1 -1
- package/dist/core/failure-analyzer.js +6 -3
- package/dist/core/failure-analyzer.js.map +1 -1
- package/dist/core/git.d.ts.map +1 -1
- package/dist/core/git.js +10 -3
- package/dist/core/git.js.map +1 -1
- package/dist/core/pull-request.d.ts +1 -1
- package/dist/core/pull-request.d.ts.map +1 -1
- package/dist/core/pull-request.js +7 -4
- package/dist/core/pull-request.js.map +1 -1
- package/dist/index.js +2 -0
- package/dist/index.js.map +1 -1
- package/dist/parsers/stream-renderer.d.ts +16 -1
- package/dist/parsers/stream-renderer.d.ts.map +1 -1
- package/dist/parsers/stream-renderer.js +34 -4
- package/dist/parsers/stream-renderer.js.map +1 -1
- package/dist/prompts/execution.d.ts.map +1 -1
- package/dist/prompts/execution.js +11 -1
- package/dist/prompts/execution.js.map +1 -1
- package/dist/types/config.d.ts +95 -4
- package/dist/types/config.d.ts.map +1 -1
- package/dist/types/config.js +63 -3
- package/dist/types/config.js.map +1 -1
- package/dist/utils/config.d.ts +59 -7
- package/dist/utils/config.d.ts.map +1 -1
- package/dist/utils/config.js +276 -21
- package/dist/utils/config.js.map +1 -1
- package/dist/utils/name-generator.d.ts +3 -7
- package/dist/utils/name-generator.d.ts.map +1 -1
- package/dist/utils/name-generator.js +75 -61
- package/dist/utils/name-generator.js.map +1 -1
- package/dist/utils/terminal-symbols.d.ts +21 -0
- package/dist/utils/terminal-symbols.d.ts.map +1 -1
- package/dist/utils/terminal-symbols.js +62 -0
- package/dist/utils/terminal-symbols.js.map +1 -1
- package/dist/utils/token-tracker.d.ts +45 -0
- package/dist/utils/token-tracker.d.ts.map +1 -0
- package/dist/utils/token-tracker.js +107 -0
- package/dist/utils/token-tracker.js.map +1 -0
- package/dist/utils/validation.d.ts +5 -5
- package/dist/utils/validation.d.ts.map +1 -1
- package/dist/utils/validation.js +10 -6
- package/dist/utils/validation.js.map +1 -1
- package/dist/utils/verbose-toggle.d.ts +33 -0
- package/dist/utils/verbose-toggle.d.ts.map +1 -0
- package/dist/utils/verbose-toggle.js +94 -0
- package/dist/utils/verbose-toggle.js.map +1 -0
- package/package.json +1 -1
- package/src/commands/config.ts +204 -0
- package/src/commands/do.ts +56 -5
- package/src/commands/plan.ts +3 -2
- package/src/core/claude-runner.ts +59 -115
- package/src/core/failure-analyzer.ts +6 -3
- package/src/core/git.ts +10 -3
- package/src/core/pull-request.ts +7 -4
- package/src/index.ts +2 -0
- package/src/parsers/stream-renderer.ts +54 -4
- package/src/prompts/config-docs.md +331 -0
- package/src/prompts/execution.ts +13 -1
- package/src/types/config.ts +156 -7
- package/src/utils/config.ts +335 -21
- package/src/utils/name-generator.ts +84 -71
- package/src/utils/terminal-symbols.ts +68 -0
- package/src/utils/token-tracker.ts +135 -0
- package/src/utils/validation.ts +15 -10
- package/src/utils/verbose-toggle.ts +103 -0
- package/tests/unit/claude-runner.test.ts +171 -7
- package/tests/unit/config-command.test.ts +163 -0
- package/tests/unit/config.test.ts +608 -30
- package/tests/unit/name-generator.test.ts +99 -75
- package/tests/unit/pull-request.test.ts +2 -0
- package/tests/unit/stream-renderer.test.ts +83 -0
- package/tests/unit/terminal-symbols.test.ts +157 -0
- package/tests/unit/token-tracker.test.ts +352 -0
- package/tests/unit/verbose-toggle.test.ts +204 -0
@@ -0,0 +1,66 @@
+# Task: Implement `raf config` Command
+
+## Objective
+Create the `raf config [prompt]` CLI command that launches an interactive Claude Sonnet TTY session for editing RAF configuration.
+
+## Context
+Users need a natural-language way to modify their config. This command spawns Claude Sonnet with the config documentation as a system prompt, giving it full knowledge of the schema. It also supports `--reset` to restore defaults.
+
+## Dependencies
+01, 02, 03
+
+## Requirements
+
+### `raf config` (no arguments)
+- Spawn an interactive Claude Sonnet TTY session (same node-pty approach used in `raf plan`)
+- Claude receives the config docs from `src/prompts/config-docs.md` as an appended system prompt
+- Claude also receives the current config file contents (or "no config file exists, using defaults")
+- User has a back-and-forth conversation with Claude to modify settings
+- Claude reads, modifies, and writes `~/.raf/raf.config.json`
+- After Claude writes, RAF validates the result and reports errors if invalid
+
+### `raf config <prompt>`
+- Same as above but with an initial prompt pre-filled
+- Still interactive TTY — user can continue the conversation after the initial prompt is handled
+
+### `raf config --reset`
+- Prompt user for confirmation ("This will delete ~/.raf/raf.config.json and restore all defaults. Continue? [y/N]")
+- On confirm: delete the config file, print success message
+- On deny: abort, print "Cancelled"
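The reset flow above reduces to one decision: treat anything other than an explicit yes as the `N` default. The sketch below is illustrative only. `isConfirmed`, `handleReset`, and the injected `deleteConfig` callback are hypothetical names, not taken from the package source.

```typescript
// Hypothetical sketch of the `--reset` decision logic; not RAF's actual code.
type ResetOutcome = "deleted" | "cancelled";

// Only an explicit "y" or "yes" confirms. Empty input falls back to the "N" default.
function isConfirmed(answer: string): boolean {
  const normalized = answer.trim().toLowerCase();
  return normalized === "y" || normalized === "yes";
}

// `deleteConfig` is injected so the logic can be exercised without touching disk.
function handleReset(answer: string, deleteConfig: () => void): ResetOutcome {
  if (!isConfirmed(answer)) {
    return "cancelled"; // the caller would print "Cancelled"
  }
  deleteConfig();
  return "deleted"; // the caller would print a success message
}
```

Keeping the file deletion behind a callback is a design choice of this sketch; it lets the reset flow be unit-tested with a stub.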
+
+### Validation after session:
+- When the Claude session ends, read the config file and validate it
+- If invalid: print warnings showing what's wrong, but don't delete the file (user can fix it)
+- If valid: print "Config updated successfully" with a summary of changes
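A minimal sketch of the post-session check described above. It assumes the check returns warnings rather than throwing, and that a missing file counts as valid because defaults apply. `validateConfigText` and `ValidationReport` are hypothetical names. The key list is the schema overview given in the CLAUDE.md update plan later in this diff.

```typescript
// Hypothetical sketch; the real validator lives in src/utils/config.ts and may differ.
const KNOWN_TOP_LEVEL_KEYS: readonly string[] = [
  "models", "effort", "timeout", "maxRetries",
  "autoCommit", "worktree", "commitFormat", "claudeCommand",
];

interface ValidationReport {
  valid: boolean;
  warnings: string[];
}

// `text` is the raw file content, or null when no config file exists.
function validateConfigText(text: string | null): ValidationReport {
  if (text === null) {
    return { valid: true, warnings: [] }; // assumption: no file means defaults apply
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return { valid: false, warnings: ["Config file is not valid JSON"] };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { valid: false, warnings: ["Config must be a JSON object"] };
  }
  // Strict validation: every unknown top-level key produces a warning.
  const warnings = Object.keys(parsed)
    .filter((key) => KNOWN_TOP_LEVEL_KEYS.indexOf(key) === -1)
    .map((key) => `Unknown config key: ${key}`);
  return { valid: warnings.length === 0, warnings };
}
```

The file is never deleted on failure; the report only feeds the warning output.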
+
+### Model and effort:
+- Use `config.models.config` for the model (default: `'sonnet'`)
+- Use `config.effort.config` for reasoning effort (default: `'medium'`)
+
+## Implementation Steps
+1. Create `src/commands/config.ts` with the Commander.js command definition
+2. Register the command in `src/index.ts` CLI setup
+3. Implement `--reset` flag: confirmation prompt, file deletion, feedback
+4. Build the system prompt: load `src/prompts/config-docs.md` content, append current config state
+5. Spawn interactive Claude session using the existing `ClaudeRunner` TTY infrastructure (same pattern as planning mode)
+6. Pass the model from `config.models.config` and effort from `config.effort.config`
+7. After session ends: validate the config file, report results
+8. Handle edge cases: `~/.raf/` directory doesn't exist (create it), config file doesn't exist yet (that's fine, Claude will create it)
+9. Write tests for the command setup, reset flow, and post-session validation
+
+## Acceptance Criteria
+- [ ] `raf config` starts an interactive Claude Sonnet session with config knowledge
+- [ ] `raf config "use haiku for name generation"` starts session with that prompt
+- [ ] `raf config --reset` prompts for confirmation and deletes config file
+- [ ] Claude session has full config documentation as system prompt
+- [ ] Claude session shows current config state
+- [ ] Post-session validation checks the config file
+- [ ] `~/.raf/` directory is created if it doesn't exist
+- [ ] Command is registered and appears in `raf --help`
+- [ ] Tests cover command setup and reset flow
+
+## Notes
+- Reuse the existing TTY session infrastructure from `src/core/claude-runner.ts` — don't reinvent it
+- The planning command in `src/commands/plan.ts` is the best reference for how to spawn interactive Claude sessions
+- Claude needs `--dangerously-skip-permissions` so it can write to `~/.raf/raf.config.json` without asking
+- The config docs file needs to be readable at runtime — consider how the built JS accesses the .md file (may need to copy it to dist or embed it as a string)
@@ -0,0 +1,60 @@
+# Task: Update CLAUDE.md with Config Architecture
+
+## Objective
+Update CLAUDE.md with the new config system architecture, the "configurable by default" principle, and all related documentation.
+
+## Context
+CLAUDE.md serves as the project's internal documentation and instruction set. It needs to reflect the new config system so that future Claude sessions (planning, execution) understand how config works. The user also wants an explicit architectural principle: if a feature can be configurable, it should be configurable via config.
+
+## Dependencies
+01, 02, 03, 04
+
+## Requirements
+
+### Add "Configurable by Default" principle:
+- New architectural decision section stating: if a feature/setting can be configurable, it should be configurable through the config system
+- Default config provides sensible defaults; global config in `~/.raf/raf.config.json` overrides them
+- CLI flags override config values (three-tier precedence: CLI flag > global config > built-in defaults)
+
+### Document the config system:
+- Config file location: `~/.raf/raf.config.json`
+- Schema overview: `models`, `effort`, `timeout`, `maxRetries`, `autoCommit`, `worktree`, `commitFormat`, `claudeCommand`
+- Precedence chain: CLI flag > `~/.raf/raf.config.json` > DEFAULT_CONFIG
+- Validation: strict, unknown keys rejected
+- Reference to `src/types/config.ts` as single source of truth
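The three-tier precedence chain above (CLI flag > `~/.raf/raf.config.json` > `DEFAULT_CONFIG`) can be sketched as two layered overrides. This is a simplified illustration: `Settings`, `applyLayer`, and `resolveSettings` are hypothetical names, only three of the schema keys are shown, and the default values are placeholders rather than RAF's real defaults.

```typescript
// Hypothetical sketch of layered config resolution; not RAF's actual implementation.
interface Settings {
  timeout: number;
  maxRetries: number;
  autoCommit: boolean;
}

type Overrides = Partial<Settings>;

// Placeholder values, chosen only for illustration.
const DEFAULT_CONFIG: Settings = { timeout: 60, maxRetries: 3, autoCommit: true };

// A layer only wins for keys it actually sets; undefined never overrides.
function applyLayer(base: Settings, layer: Overrides): Settings {
  return {
    timeout: layer.timeout !== undefined ? layer.timeout : base.timeout,
    maxRetries: layer.maxRetries !== undefined ? layer.maxRetries : base.maxRetries,
    autoCommit: layer.autoCommit !== undefined ? layer.autoCommit : base.autoCommit,
  };
}

// Lowest precedence first: defaults, then global config, then CLI flags.
function resolveSettings(globalConfig: Overrides, cliFlags: Overrides): Settings {
  return applyLayer(applyLayer(DEFAULT_CONFIG, globalConfig), cliFlags);
}
```

For example, `resolveSettings({ timeout: 10 }, { timeout: 5 })` yields a timeout of 5 because the CLI layer is applied last, while `maxRetries` keeps its default.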
+
+### Document the `raf config` command:
+- `raf config` — interactive Claude session for config editing
+- `raf config <prompt>` — interactive session with initial prompt
+- `raf config --reset` — restore defaults with confirmation
+- Config docs bundled at `src/prompts/config-docs.md`
+
+### Update existing sections:
+- Update "Development Commands" if any new npm scripts were added
+- Update file structure to include new files (`src/commands/config.ts`, `src/prompts/config-docs.md`)
+- Update any references to the old config system (`raf.config.json` in project dirs → `~/.raf/raf.config.json`)
+- Update the commit format section to reference config templates instead of hardcoded formats
+
+## Implementation Steps
+1. Read current CLAUDE.md to understand existing structure
+2. Add "Configurable by Default" as a new architectural decision section
+3. Add "Configuration System" section documenting schema, loading, validation, precedence
+4. Add `raf config` to the commands documentation
+5. Update directory structure to include new files
+6. Update commit format documentation to reference templates
+7. Update any stale references to the old project-local config
+8. Keep the document concise — link to `src/prompts/config-docs.md` for full config reference rather than duplicating it
+
+## Acceptance Criteria
+- [ ] "Configurable by Default" principle documented as architectural decision
+- [ ] Config system fully documented (location, schema overview, precedence, validation)
+- [ ] `raf config` command documented
+- [ ] Directory structure updated
+- [ ] Old config references updated
+- [ ] Commit format section references templates
+- [ ] CLAUDE.md remains well-organized and concise
+
+## Notes
+- CLAUDE.md is the source of truth for Claude sessions working on this project — accuracy is critical
+- Don't duplicate the full config reference (that's in `src/prompts/config-docs.md`) — just provide an overview and pointer
+- The principle "configurable by default" should guide future development: when adding new features, add corresponding config keys
@@ -0,0 +1,44 @@
+# Project Decisions
+
+## Should short model names (sonnet, haiku, opus) still be supported alongside full model IDs?
+Both supported. Keep short names as aliases and also accept full model IDs like claude-opus-4-5-20251101.
+
+## How should full model IDs be validated?
+Regex pattern. Validate that full IDs match a pattern like claude-{family}-{version} (e.g., claude-opus-4-5-20251101).
+
+## Should default config values stay as short names or change to full model IDs?
+Keep short names. Defaults stay as 'opus', 'sonnet', 'haiku' - simple and auto-resolve to latest.
+
+## How should name generation call Claude CLI to avoid registering a session?
+Use the same spawn-based approach as `raf do` (claude-runner.ts) but without the `--dangerously-skip-permissions` flag. Use `spawn` with `-p` flag for non-interactive print mode. Additionally, use the `--no-session-persistence` flag which prevents sessions from being saved to disk.
+
+## What token usage data is available from Claude CLI?
+Claude CLI provides comprehensive token data in both `--output-format json` and stream-json `result` events:
+- `total_cost_usd`: Total cost in USD (already calculated by Claude CLI)
+- `usage.input_tokens`, `usage.output_tokens`, `usage.cache_creation_input_tokens`, `usage.cache_read_input_tokens`
+- `modelUsage.<model-id>`: Per-model breakdown with `inputTokens`, `outputTokens`, `cacheReadInputTokens`, `cacheCreationInputTokens`, `costUSD`
+- `duration_ms`, `duration_api_ms`: Timing info
+
+## Where should token reports be displayed?
+Console output only. Print token counts and cost estimates to terminal after each task and at the end (total).
+
+## Should pricing be hardcoded or configurable?
+Configurable in raf config, with current prices as defaults.
+
+## Should we use Claude CLI's built-in total_cost_usd or compute our own?
+Compute own price from token counts × configurable prices. CLI's total_cost_usd doesn't work for subscription users.
+
+## How should task execution capture token counts?
+Switch all task execution to stream-json mode (already used in verbose mode). Parse the `result` event for usage data. Non-verbose mode suppresses tool display but still uses stream-json format to capture tokens.
+
+## How should verbose mode toggling during execution work?
+Keypress listener on process.stdin. Since all execution uses stream-json after task 03, toggling is purely a display concern — whether tool-use lines are printed or suppressed. Node's event loop can handle stdin events and child process output concurrently.
+
+## Which key toggles verbose mode?
+Tab key. Press Tab during task execution to toggle verbose display on/off.
+
+## How detailed should the raf config README section be?
+Brief + examples. Command usage, 1-2 basic config examples, mention that `raf config` launches an interactive Claude session for help. Keep it concise like other command sections in the README.
+
+## Should CLAUDE.md mention README update requirements more explicitly?
+Yes. Add a note about always updating README when adding/changing CLI commands, API changes, or important features (like worktrees, config).
@@ -0,0 +1,3 @@
+- [ ] Add token counts and report them at the end of each task, plus a total after all tasks finish. Report both input and output, and add a price estimate in $ based on the price for opus 4.6 https://claude.com/pricing
+- [ ] Run the Claude instance for project name generation not via `execSync("claude --print ...")` but in a way that does not register a session
+- [ ] Add support for full model IDs like claude-opus-4-5-20251101 (in config)
@@ -0,0 +1,43 @@
+# Outcome: Support Full Model IDs in Config
+
+## Summary
+
+Added support for full Claude model IDs (e.g., `claude-opus-4-5-20251101`) in the RAF config system, alongside the existing short aliases (`sonnet`, `haiku`, `opus`).
+
+## Changes Made
+
+### `src/types/config.ts`
+- Added `ClaudeModelAlias` type for short names (`sonnet | haiku | opus`)
+- Widened `ClaudeModelName` to accept both short aliases and full model IDs via branded string intersection
+- Added `FULL_MODEL_ID_PATTERN` regex: `/^claude-[a-z]+-\d+(-\d+)*$/`
+- Added `VALID_MODEL_ALIASES` constant (replacing `VALID_MODELS` which is kept as deprecated alias)
+
+### `src/utils/config.ts`
+- Added `isValidModelName()` function that checks against both short aliases and the full model ID regex
+- Updated model validation in `validateConfig()` to use `isValidModelName()`
+- Updated error message to mention both short aliases and full model ID format
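A minimal sketch of the check described above, using the regex and alias names quoted in this outcome. The function body is an assumption about how the two checks combine, not a copy of the package source.

```typescript
// Alias list and regex are as quoted in the outcome notes above.
const VALID_MODEL_ALIASES: readonly string[] = ["sonnet", "haiku", "opus"];
const FULL_MODEL_ID_PATTERN = /^claude-[a-z]+-\d+(-\d+)*$/;

// Assumed body: accept a short alias or anything matching the full-ID pattern.
function isValidModelName(value: unknown): boolean {
  if (typeof value !== "string") {
    return false;
  }
  return VALID_MODEL_ALIASES.indexOf(value) !== -1 || FULL_MODEL_ID_PATTERN.test(value);
}
```

Under this pattern, `claude-opus-4-5-20251101` and `claude-opus-4-6` both pass, while `claude-opus` (no version segment) and `gpt-4` do not.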
+
+### `src/utils/validation.ts`
+- Updated `validateModelName()` to accept full model IDs
+- Updated `resolveModelOption()` return type to `ClaudeModelName`
+- Updated error message to include full model ID example
+- Deprecated `ValidModelName` type alias in favor of `ClaudeModelName`
+
+### `src/prompts/config-docs.md`
+- Updated model value description to mention full model IDs
+- Updated validation rules to describe the new pattern
+- Added "Pinned Model Versions" example config section
+
+### `tests/unit/config.test.ts`
+- Added `isValidModelName` tests: short aliases, full model IDs, and invalid strings
+- Added `validateConfig` tests for full model IDs (with and without date suffix)
+- Added `validateConfig` test for rejecting random strings
+- Added `resolveConfig` test for deep-merging full model ID overrides
+
+## Verification
+
+- TypeScript build passes cleanly
+- All 1066 tests pass (7 new tests added)
+- 2 pre-existing test failures confirmed unrelated (same on base branch)
+
+<promise>COMPLETE</promise>
@@ -0,0 +1,33 @@
+# Outcome: Fix Name Generation to Not Register Sessions
+
+## Summary
+
+Replaced `execSync` with `spawn` in name generation and added `--no-session-persistence` flag to prevent throwaway name generation calls from cluttering the user's Claude session history.
+
+## Changes Made
+
+### `src/utils/name-generator.ts`
+- Replaced `import { execSync }` with `import { spawn }` from `node:child_process`
+- Added `runClaudePrint()` helper function that uses `spawn` with `--no-session-persistence` and `-p` flags
+- Simplified `callSonnetForName()` and `callSonnetForMultipleNames()` to use `runClaudePrint()`
+- Removed `escapeShellArg()` function (no longer needed - spawn passes args as array, not shell string)
+- Removed `escapeShellArg` from exports
+- Kept same 30-second timeout behavior
+- Kept same fallback behavior on failure (returns null for single name, empty array for multiple)
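The sketch below shows roughly what a spawn-based helper of this kind could look like. It is an assumption-laden illustration, not the package's `runClaudePrint()`: the signature, the `buildPrintArgs` helper, the argument order, and the choice to close stdin are all hypothetical. The `-p` and `--no-session-persistence` flags and the 30-second timeout come from the notes above.

```typescript
import { spawn } from "node:child_process";

// Hypothetical helper: args go to spawn as an array, so no shell escaping is needed.
function buildPrintArgs(model: string, prompt: string): string[] {
  return ["--model", model, "--no-session-persistence", "-p", prompt];
}

// Resolves with trimmed stdout, or null on any failure (timeout, spawn error, non-zero exit).
function runClaudePrint(
  prompt: string,
  model: string = "haiku",
  timeoutMs: number = 30000,
  command: string = "claude",
): Promise<string | null> {
  return new Promise((resolve) => {
    const child = spawn(command, buildPrintArgs(model, prompt));
    let output = "";
    let settled = false;

    const timer = setTimeout(() => {
      child.kill();
      settle(null);
    }, timeoutMs);

    // Guards against resolving twice (for example, "error" followed by "close").
    function settle(value: string | null): void {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      resolve(value);
    }

    child.stdin.end(); // assumption: nothing is piped in, so close stdin right away
    child.stdout.on("data", (chunk: Buffer) => {
      output += chunk.toString();
    });
    child.on("error", () => settle(null)); // e.g. ENOENT when the CLI is not installed
    child.on("close", (code) => settle(code === 0 ? output.trim() : null));
  });
}
```

Because the prompt is a single array element, characters such as quotes or `$` reach the CLI unchanged, which is why an `escapeShellArg()` step becomes unnecessary.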
+
+### `tests/unit/name-generator.test.ts`
+- Replaced `execSync` mock with `spawn` mock using EventEmitter-based fake ChildProcess
+- Added `createMockSpawn()` helper to create mock spawn return values
+- Added test for `--no-session-persistence` flag presence
+- Added test for spawn `error` event handling (e.g., ENOENT)
+- Removed `escapeShellArg` tests (function removed)
+- Updated model expectation from `sonnet` to `haiku` (actual default for nameGeneration)
+- All 29 tests pass
+
+## Verification
+
+- TypeScript build passes cleanly
+- All 29 name-generator tests pass
+- Full test suite: 1065 passed, 1 failed (pre-existing, unrelated to this change)
+
+<promise>COMPLETE</promise>
@@ -0,0 +1,48 @@
+# Outcome: Unify Task Execution to Stream-JSON Format
+
+## Summary
+
+Unified both `run()` and `runVerbose()` methods in `ClaudeRunner` to use `--output-format stream-json --verbose` internally, enabling token usage data extraction from every task execution. The difference between verbose and non-verbose is now purely a display concern.
+
+## Changes Made
+
+### `src/types/config.ts`
+- Added `ModelTokenUsage` interface (per-model token breakdown)
+- Added `UsageData` interface (aggregate + per-model token usage)
+
+### `src/parsers/stream-renderer.ts`
+- Added `UsageData` and `ModelTokenUsage` imports from types
+- Extended `StreamEvent` interface with `usage` and `modelUsage` fields
+- Added `usageData?: UsageData` field to `RenderResult` interface
+- Updated `renderResult()` to extract usage data from result events
+- Added `extractUsageData()` helper function
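A hedged sketch of what extracting usage from a `result` event might involve. The raw field names (`usage.input_tokens`, `modelUsage.<model-id>.inputTokens`, and so on) are the ones listed in the project decisions. The shapes of `UsageData` and `StreamEvent` here are assumptions, as is the choice to default missing counts to zero.

```typescript
// Assumed shapes; the real interfaces live in src/types/config.ts and may differ.
interface ModelTokenUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadInputTokens: number;
  cacheCreationInputTokens: number;
}

interface UsageData extends ModelTokenUsage {
  modelUsage: Record<string, ModelTokenUsage>;
}

interface RawUsage {
  input_tokens?: number;
  output_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

interface StreamEvent {
  type: string;
  usage?: RawUsage;
  modelUsage?: Record<string, Partial<ModelTokenUsage>>;
}

// Returns undefined for non-result events and for result events without usage fields.
function extractUsageData(event: StreamEvent): UsageData | undefined {
  if (event.type !== "result") return undefined;
  if (!event.usage && !event.modelUsage) return undefined;

  const usage: RawUsage = event.usage ?? {};
  const rawModels: Record<string, Partial<ModelTokenUsage>> = event.modelUsage ?? {};
  const modelUsage: Record<string, ModelTokenUsage> = {};

  for (const modelId of Object.keys(rawModels)) {
    const raw: Partial<ModelTokenUsage> = rawModels[modelId] ?? {};
    modelUsage[modelId] = {
      inputTokens: raw.inputTokens ?? 0,
      outputTokens: raw.outputTokens ?? 0,
      cacheReadInputTokens: raw.cacheReadInputTokens ?? 0,
      cacheCreationInputTokens: raw.cacheCreationInputTokens ?? 0,
    };
  }

  return {
    inputTokens: usage.input_tokens ?? 0,
    outputTokens: usage.output_tokens ?? 0,
    cacheReadInputTokens: usage.cache_read_input_tokens ?? 0,
    cacheCreationInputTokens: usage.cache_creation_input_tokens ?? 0,
    modelUsage,
  };
}
```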
+
+### `src/core/claude-runner.ts`
+- Added `UsageData` import from types
+- Added `usageData?: UsageData` field to `RunResult` interface
+- Replaced separate `run()` and `runVerbose()` implementations with a unified `_runStreamJson()` private method
+- `run()` now delegates to `_runStreamJson(prompt, options, false)` — suppresses display
+- `runVerbose()` now delegates to `_runStreamJson(prompt, options, true)` — shows display
+- Both methods now use `--output-format stream-json --verbose` flags
+- Both methods capture `usageData` from the stream-json result event
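The idea that verbose and non-verbose runs differ only in display can be illustrated with a small NDJSON line buffer. This is a sketch under assumptions, not the actual `_runStreamJson()` code: `StreamJsonCollector`, its `feed` method, and the `render` callback are hypothetical names.

```typescript
// Hypothetical sketch: one parsing path for both modes, with display as a flag.
type JsonEvent = Record<string, unknown>;

class StreamJsonCollector {
  private buffer = "";
  usage: unknown; // raw `usage` payload of the last result event, if any

  constructor(
    private readonly display: boolean,
    private readonly render: (event: JsonEvent) => void,
  ) {}

  // Accepts arbitrary stdout chunks; only complete, newline-terminated lines are parsed.
  feed(chunk: string): void {
    this.buffer += chunk;
    let newline = this.buffer.indexOf("\n");
    while (newline !== -1) {
      const line = this.buffer.slice(0, newline).trim();
      this.buffer = this.buffer.slice(newline + 1);
      if (line) this.handleLine(line);
      newline = this.buffer.indexOf("\n");
    }
  }

  private handleLine(line: string): void {
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      return; // ignore lines that are not valid JSON
    }
    if (typeof parsed !== "object" || parsed === null) return;
    const event = parsed as JsonEvent;
    if (event.type === "result") {
      this.usage = event.usage; // captured regardless of the display flag
    }
    if (this.display) {
      this.render(event);
    }
  }
}
```

A buffer like this only parses a line once its terminating newline arrives, which is consistent with the note below about adding trailing newlines to test data.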
+
+### `tests/unit/stream-renderer.test.ts`
+- Added 4 new tests for usage data extraction from result events
+- Tests cover: full usage data, missing usage, partial usage, multi-model usage
+
+### `tests/unit/claude-runner.test.ts`
+- Updated test asserting `run()` does NOT have stream-json flags → now asserts it DOES
+- Added trailing newlines to context overflow test data (needed for NDJSON line buffering)
+- Added 4 new tests in "usage data extraction" describe block:
+  - `run()` returns usageData from result events
+  - `runVerbose()` returns usageData from result events
+  - undefined usageData when no result event
+  - `run()` suppresses display but still captures usage data
+
+## Verification
+
+- TypeScript build passes cleanly
+- All 1073 tests pass (8 new tests added)
+- 1 pre-existing test failure confirmed unrelated (same on base branch)
+
+<promise>COMPLETE</promise>
@@ -0,0 +1,53 @@
+# Outcome: Add Token Tracking and Cost Calculation
+
+## Summary
+
+Implemented token usage accumulation across tasks and cost calculation using configurable per-model pricing. Added pricing config to the RAF config schema, a `TokenTracker` utility class, and model ID to pricing category mapping.
+
+## Changes Made
+
+### `src/types/config.ts`
+- Added `PricingCategory` type (`'opus' | 'sonnet' | 'haiku'`)
+- Added `ModelPricing` interface (inputPerMTok, outputPerMTok, cacheReadPerMTok, cacheCreatePerMTok)
+- Added `PricingConfig` interface (per-category pricing)
+- Added `pricing` field to `RafConfig` interface
+- Added default pricing to `DEFAULT_CONFIG`:
+  - Opus: $15/$75 input/output, $1.50 cache read, $18.75 cache create
+  - Sonnet: $3/$15 input/output, $0.30 cache read, $3.75 cache create
+  - Haiku: $1/$5 input/output, $0.10 cache read, $1.25 cache create
+
+### `src/utils/config.ts`
+- Added `pricing` to `VALID_TOP_LEVEL_KEYS` and validation sets
+- Added pricing validation in `validateConfig()`: validates categories, fields, and values
+- Added pricing deep-merge in `deepMerge()` (per-category field-level merging)
+- Added `resolveModelPricingCategory()`: maps full model IDs (e.g., `claude-opus-4-6`) and short aliases to pricing categories
+- Added `getPricing(category)` and `getPricingConfig()` accessor helpers
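One plausible way to map a model name to a pricing category is to read the family segment out of the ID. The sketch below is an assumption: the outcome notes do not show the real `resolveModelPricingCategory()` body, and returning `null` for unknown families is a choice made here for illustration.

```typescript
type PricingCategory = "opus" | "sonnet" | "haiku";

// Assumed behavior: aliases map to themselves, full IDs map by their family segment.
function resolveModelPricingCategory(model: string): PricingCategory | null {
  if (model === "opus" || model === "sonnet" || model === "haiku") {
    return model;
  }
  const match = /^claude-([a-z]+)-/.exec(model);
  const family = match ? match[1] : undefined;
  if (family === "opus" || family === "sonnet" || family === "haiku") {
    return family;
  }
  return null; // unknown family: the caller decides on a fallback
}
```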
+
+### `src/utils/token-tracker.ts` (new file)
+- `TokenTracker` class that accumulates `UsageData` across task executions
+- `addTask(taskId, usage)`: records a task's usage and calculates per-task cost
+- `getTotals()`: returns accumulated usage and cost across all tasks
+- `calculateCost(usage)`: calculates cost using per-model pricing from `modelUsage` breakdown
+- Falls back to sonnet pricing when model breakdown is unavailable or model family is unknown
+- Exports `CostBreakdown` and `TaskUsageEntry` interfaces
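The cost arithmetic itself is tokens times the per-million-token price, summed over the four token kinds and over models. The sketch below uses the default prices and `ModelPricing` field names listed above and falls back to sonnet pricing for unknown families. The function names, the `TokenCounts` shape, and the single-number return value are assumptions; the real `calculateCost()` returns a `CostBreakdown` whose shape is not shown in this diff.

```typescript
interface ModelPricing {
  inputPerMTok: number;
  outputPerMTok: number;
  cacheReadPerMTok: number;
  cacheCreatePerMTok: number;
}

// Assumed per-model shape, mirroring the modelUsage fields reported by the CLI.
interface TokenCounts {
  inputTokens: number;
  outputTokens: number;
  cacheReadInputTokens: number;
  cacheCreationInputTokens: number;
}

// Default prices in USD per million tokens, as listed in this outcome.
const DEFAULT_PRICING: Record<"opus" | "sonnet" | "haiku", ModelPricing> = {
  opus: { inputPerMTok: 15, outputPerMTok: 75, cacheReadPerMTok: 1.5, cacheCreatePerMTok: 18.75 },
  sonnet: { inputPerMTok: 3, outputPerMTok: 15, cacheReadPerMTok: 0.3, cacheCreatePerMTok: 3.75 },
  haiku: { inputPerMTok: 1, outputPerMTok: 5, cacheReadPerMTok: 0.1, cacheCreatePerMTok: 1.25 },
};

function costForModel(tokens: TokenCounts, price: ModelPricing): number {
  const weighted =
    tokens.inputTokens * price.inputPerMTok +
    tokens.outputTokens * price.outputPerMTok +
    tokens.cacheReadInputTokens * price.cacheReadPerMTok +
    tokens.cacheCreationInputTokens * price.cacheCreatePerMTok;
  return weighted / 1000000;
}

// Sums the cost over every model in the breakdown; unknown families use sonnet pricing.
function estimateCostUsd(modelUsage: Record<string, TokenCounts>): number {
  let total = 0;
  for (const modelId of Object.keys(modelUsage)) {
    const tokens = modelUsage[modelId];
    if (!tokens) continue;
    const match = /^claude-([a-z]+)-/.exec(modelId);
    const family = match ? match[1] : modelId;
    const price =
      family === "opus" ? DEFAULT_PRICING.opus
      : family === "haiku" ? DEFAULT_PRICING.haiku
      : DEFAULT_PRICING.sonnet;
    total += costForModel(tokens, price);
  }
  return total;
}
```

As a worked case, one million opus input tokens plus one million haiku output tokens come to 15 + 5 = 20 USD at these defaults.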
+
+### `src/prompts/config-docs.md`
+- Added `pricing` section documenting all fields, defaults, and example override
+- Updated validation rules to mention pricing constraints
+- Updated "Full — All Settings Explicit" example to include pricing
+
+### `tests/unit/token-tracker.test.ts` (new file)
+- 14 tests covering: per-model cost calculation (opus/sonnet/haiku), multi-model usage, cache token pricing, fallback behavior, accumulation across tasks, custom pricing, zero tokens, per-task entries
+
+### `tests/unit/config.test.ts`
+- Added 10 pricing validation tests (valid, partial, invalid types, unknown keys, negative values, Infinity)
+- Added 3 `resolveModelPricingCategory` tests (short aliases, full IDs, unknown families)
+- Added 2 `resolveConfig` pricing tests (defaults, deep-merge partial override)
+
+## Verification
+
+- TypeScript build passes cleanly
+- All 1103 tests pass (15 new tests added)
+- 1 pre-existing test failure confirmed unrelated (same on base branch)
+
+<promise>COMPLETE</promise>
@@ -0,0 +1,57 @@
+# Outcome: Add Token/Cost Reporting to Console Output
+
+## Summary
+
+Wired token usage tracking into the `raf do` execution flow, displaying per-task token summaries after each task and a grand total summary after all tasks complete.
+
+## Changes Made
+
+### `src/utils/terminal-symbols.ts`
+- Added `formatNumber()`: formats numbers with thousands separators (e.g., `12,345`)
+- Added `formatCost()`: formats USD cost with 2-4 decimal places (2 for >= $0.01, 4 for smaller)
+- Added `formatTaskTokenSummary()`: per-task summary line showing input/output tokens, cache tokens, estimated cost
+- Added `formatTokenTotalSummary()`: grand total block with dividers, total tokens, cache breakdown, estimated cost
+- Added imports for `UsageData` and `CostBreakdown` types
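The two basic formatters are small enough to sketch from the rules stated above: thousands separators for counts, and 2 decimals at or above $0.01 with 4 below. The bodies are assumptions; in particular, how the real `formatCost()` renders exactly zero is not stated in these notes, so the sketch leaves that case to the general rule.

```typescript
// Assumed implementation: inserts a comma before each group of three digits.
function formatNumber(value: number): string {
  return Math.round(value).toString().replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// Assumed implementation of the stated rule: 2 decimals for >= $0.01, otherwise 4.
function formatCost(usd: number): string {
  const decimals = usd >= 0.01 ? 2 : 4;
  return `$${usd.toFixed(decimals)}`;
}
```

With these definitions, `formatNumber(12345)` gives `12,345`, `formatCost(0.42)` gives `$0.42`, and `formatCost(0.0012)` gives `$0.0012`.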
|
|
15
|
+
|
|
16
|
+
### `src/commands/do.ts`
|
|
17
|
+
- Imported `TokenTracker`, `formatTaskTokenSummary`, `formatTokenTotalSummary`
|
|
18
|
+
- Instantiated `TokenTracker` at the start of `executeSingleProject()`
|
|
19
|
+
- Added `lastUsageData` variable to capture usage from the last retry attempt
|
|
20
|
+
- After successful tasks: tracks usage and displays per-task summary via `logger.dim()`
|
|
21
|
+
- After failed tasks: tracks partial usage data and displays per-task summary
|
|
22
|
+
- After all tasks: displays grand total summary block if any tasks reported usage
|
|
23
|
+
- Tasks with no usage data (timeout, crash, no result event) are silently skipped

### `CLAUDE.md`
- Added "Token Usage Tracking" section documenting the feature, key files, and formatting utilities

### `tests/unit/terminal-symbols.test.ts`
- Added 20 new tests covering:
  - `formatNumber`: small numbers, thousands separators, large numbers, zero
  - `formatCost`: zero, normal costs, small costs, threshold boundary
  - `formatTaskTokenSummary`: no cache, cache read only, cache create only, both caches, small costs
  - `formatTokenTotalSummary`: no cache, cache read, cache create, both caches, divider lines

## Output Examples

Per-task (displayed in dim text after each task):
```
Tokens: 5,234 in / 1,023 out | Cache: 18,500 read | Est. cost: $0.42
```

Grand total (displayed after all tasks):
```
── Token Usage Summary ──────────────────
Total tokens: 45,678 in / 12,345 out
Cache: 125,000 read / 8,000 created
Estimated cost: $3.75
─────────────────────────────────────────
```
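
As an illustration of how the per-task line above might be assembled, here is a minimal sketch. The function signature and the `UsageData` field names are assumptions, and the "read / created" wording for the case where both cache counters are present is borrowed from the grand total example rather than confirmed for the per-task line.

```typescript
// Hypothetical field names; the real UsageData type is not shown in this diff.
interface UsageData {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheCreationTokens: number;
}

/** Builds a one-line summary; the cache segment is omitted when both counters are zero. */
function formatTaskTokenSummary(usage: UsageData, costUsd: number): string {
  const num = (n: number): string => n.toLocaleString("en-US");
  const parts = [`Tokens: ${num(usage.inputTokens)} in / ${num(usage.outputTokens)} out`];

  const cache: string[] = [];
  if (usage.cacheReadTokens > 0) cache.push(`${num(usage.cacheReadTokens)} read`);
  if (usage.cacheCreationTokens > 0) cache.push(`${num(usage.cacheCreationTokens)} created`);
  if (cache.length > 0) parts.push(`Cache: ${cache.join(" / ")}`);

  // Same rule as formatCost(): 2 decimals at or above $0.01, 4 below.
  const decimals = costUsd >= 0.01 ? 2 : 4;
  parts.push(`Est. cost: $${costUsd.toFixed(decimals)}`);
  return parts.join(" | ");
}
```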

## Verification

- TypeScript build passes cleanly
- All 1122 tests pass (20 new tests added)
- 1 pre-existing test failure confirmed unrelated (same on base branch)

<promise>COMPLETE</promise>
@@ -0,0 +1,53 @@
# Outcome: Add Runtime Verbose Toggle During Task Execution

## Summary

Added a runtime verbose toggle that lets users press Tab during task execution to show or hide tool-use activity lines in real-time. The toggle works across sequential tasks, properly handles Ctrl+C for graceful shutdown, and automatically skips setup when stdin is not a TTY.

## Changes Made

### `src/utils/verbose-toggle.ts` (new file)
- `VerboseToggle` class that manages stdin raw mode and Tab keypress listening
- `start()`: sets stdin to raw mode, listens for Tab (0x09) to flip verbose state
- `stop()`: restores stdin to normal mode and removes the listener
- Ctrl+C (0x03) is re-emitted as `SIGINT` so the shutdown handler still works
- Shows `[verbose: on]` / `[verbose: off]` indicator on toggle
- Shows "Press Tab to toggle verbose mode" hint on start
- No-op when stdin is not a TTY (piped input)
- Safe to call `start()`/`stop()` multiple times
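
The behaviour listed above can be sketched as follows. This is not the file's actual contents: to keep the example testable without a terminal, the input stream, the interrupt handler, and the logger are injected through the constructor, and the `RawInput` interface, the `onInterrupt` callback, and the `log` parameter are all hypothetical names.

```typescript
// Minimal structural type for the parts of a TTY stdin this sketch touches (assumed shape).
interface RawInput {
  isTTY?: boolean;
  setRawMode?(mode: boolean): unknown;
  on(event: "data", listener: (chunk: Uint8Array) => void): unknown;
  off(event: "data", listener: (chunk: Uint8Array) => void): unknown;
  resume(): unknown;
  pause(): unknown;
}

class VerboseToggle {
  private active = false;

  // Scans every byte so that several keypresses in one data event are all handled.
  private readonly onData = (chunk: Uint8Array): void => {
    for (let i = 0; i < chunk.length; i++) {
      if (chunk[i] === 0x09) {
        this.verbose = !this.verbose;
        this.log(`[verbose: ${this.verbose ? "on" : "off"}]`);
      } else if (chunk[i] === 0x03) {
        this.onInterrupt(); // raw mode swallows Ctrl+C, so hand it back to the caller
      }
    }
  };

  constructor(
    private readonly input: RawInput,
    private verbose: boolean,
    private readonly onInterrupt: () => void,
    private readonly log: (msg: string) => void = (msg) => console.log(msg),
  ) {}

  get isVerbose(): boolean {
    return this.verbose;
  }

  start(): void {
    // No-op when already started or when the input is not a TTY (piped input).
    if (this.active || !this.input.isTTY || !this.input.setRawMode) return;
    this.input.setRawMode(true);
    this.input.resume();
    this.input.on("data", this.onData);
    this.active = true;
    this.log("Press Tab to toggle verbose mode");
  }

  stop(): void {
    if (!this.active) return;
    this.input.off("data", this.onData);
    this.input.setRawMode?.(false);
    this.input.pause();
    this.active = false;
  }
}
```

In a real CLI the input would presumably be `process.stdin` and the interrupt callback would re-raise `SIGINT`; injecting both is a choice made here for testability, not something the outcome notes describe.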

### `src/core/claude-runner.ts`
- Added `verboseCheck?: () => boolean` option to `ClaudeRunnerOptions`
- Updated `_runStreamJson()` to use `verboseCheck` callback when provided, falling back to static `verbose` parameter
- Both the main stdout handler and the remaining-buffer handler on `close` use `shouldDisplay()` callback
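
The fallback described above amounts to a very small closure. The sketch below assumes a helper named `makeShouldDisplay()` (a hypothetical name; the notes only mention a `shouldDisplay()` callback) and shows why a callback, unlike a static boolean, picks up a toggle that flips mid-run.

```typescript
/**
 * Returns a predicate that is re-evaluated for every stream event.
 * Uses verboseCheck when provided, otherwise the static verbose value.
 */
function makeShouldDisplay(verbose: boolean, verboseCheck?: () => boolean): () => boolean {
  return () => (verboseCheck ? verboseCheck() : verbose);
}

// Usage: the predicate reflects state changes made after it was created.
const state = { verbose: false };
const shouldDisplay = makeShouldDisplay(false, () => state.verbose);
const before = shouldDisplay();
state.verbose = true;
const after = shouldDisplay();
```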

### `src/commands/do.ts`
- Imported `VerboseToggle`
- Creates `VerboseToggle` with initial state matching `--verbose` flag
- Registers `verboseToggle.stop()` as shutdown cleanup callback
- Starts toggle listener before the task execution loop
- Passes `verboseCheck: () => verboseToggle.isVerbose` to runner options
- Stops toggle listener after the task loop completes (before summary output)

### `tests/unit/verbose-toggle.test.ts` (new file)
- 15 tests covering:
  - Initial state, active state, TTY/non-TTY behavior
  - Tab keypress toggling (on/off)
  - Ctrl+C re-emitting SIGINT
  - Ignoring non-Tab keypresses
  - Multiple bytes in single data event
  - Stop/start lifecycle, double-call safety
  - No response after stop
  - Cross-task persistence (stop and restart)
  - setRawMode error handling

### `tests/unit/claude-runner.test.ts`
- Added 1 test: `verboseCheck` callback dynamically controls display output

## Verification

- TypeScript build passes cleanly
- All 1138 tests pass (16 new tests added)
- 1 pre-existing test failure confirmed unrelated (same on base branch)

<promise>COMPLETE</promise>
@@ -0,0 +1,36 @@
# Outcome: Document raf config in README and Strengthen README Update Policy

## Summary

Added `raf config` documentation to README.md and expanded CLAUDE.md's "Important Reminders" with explicit guidance on when to update the README.

## Changes Made

### `README.md`
- Added "Configurable" bullet to the Features list
- Added `### raf config` section after `raf status` in the Commands section, including:
  - Command usage examples (interactive, with prompt, --reset)
  - Precedence rule (CLI flags > config file > defaults)
  - Minimal config file example with 3 common settings (models, worktree, timeout)
  - Note directing users to the interactive session for full config reference
- Added `### raf config [prompt]` to the Command Reference section with `--reset` option

### `CLAUDE.md`
- Expanded "Important Reminders" section with explicit README update policy:
  - When new CLI commands are added
  - When existing command flags or behavior change
  - When important features are added
  - When the Features list needs updating
- Separated README and CLAUDE.md update guidance into distinct items

## Verification

- All acceptance criteria met:
  - README has a `raf config` section with usage and basic example
  - `raf config` appears in the Command Reference table
  - Config file location (`~/.raf/raf.config.json`) and precedence rules are mentioned
  - CLAUDE.md has explicit guidance about when to update README
  - No existing README content was broken or removed
  - Documentation tone and style match the rest of the README

<promise>COMPLETE</promise>
@@ -0,0 +1,35 @@
# Task: Support Full Model IDs in Config

## Objective
Allow users to specify full model IDs (e.g., `claude-opus-4-5-20251101`) in addition to short aliases (`opus`, `sonnet`, `haiku`) in the RAF config system.

## Context
Currently, `ClaudeModelName` type only accepts `'sonnet' | 'haiku' | 'opus'`. The Claude CLI `--model` flag already accepts both short aliases and full model IDs, but RAF's config validation rejects full IDs. Users need this to pin specific model versions.

## Requirements
- Keep short names (`sonnet`, `haiku`, `opus`) working as before; they remain the default
- Accept full model IDs matching the pattern `claude-{family}-{version}` (e.g., `claude-opus-4-5-20251101`, `claude-sonnet-4-5-20250929`)
- Validate full model IDs with a regex pattern (not just accept any string)
- Default config values stay as short names
- Update all types, validation, config docs, and tests

## Implementation Steps
1. Widen `ClaudeModelName` type to be a union of the short aliases and a branded/validated string type for full IDs
2. Add a regex pattern constant for validating full model IDs (e.g., `/^claude-[a-z]+-[\d]+-[\d]+(-\d+)?$/` or similar; examine real model ID formats to get the pattern right)
3. Update `VALID_MODELS` and the validation logic in `src/utils/config.ts` to accept both short names and full IDs matching the regex
4. Update `PlanCommandOptions` and `DoCommandOptions` model types to accept full IDs
5. Add tests for validation: valid short names, valid full IDs, invalid strings
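
Steps 1 to 3 could be prototyped along the following lines. The regex is the candidate pattern quoted in step 2, not a verified final one, and the names `ClaudeModelAlias`, `MODEL_ALIASES`, `FULL_MODEL_ID_PATTERN`, and `isValidModelName()` are hypothetical.

```typescript
type ClaudeModelAlias = "sonnet" | "haiku" | "opus";

// `string & {}` keeps the alias literals visible to IDE autocomplete while still
// admitting arbitrary strings at the type level; runtime validation does the real check.
type ClaudeModelName = ClaudeModelAlias | (string & {});

const MODEL_ALIASES: readonly string[] = ["sonnet", "haiku", "opus"];

// Candidate pattern from the task description: claude-{family}-{n}-{n}[-{date}].
const FULL_MODEL_ID_PATTERN = /^claude-[a-z]+-\d+-\d+(-\d+)?$/;

/** Accepts the short aliases and full IDs matching the pattern; rejects everything else. */
function isValidModelName(value: string): boolean {
  return MODEL_ALIASES.indexOf(value) !== -1 || FULL_MODEL_ID_PATTERN.test(value);
}

const pinned: ClaudeModelName = "claude-opus-4-5-20251101";
```

The pattern deliberately leaves the family segment open (`[a-z]+`), which trades some typo detection for tolerance of future family names.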

## Acceptance Criteria
- [ ] Short model names (`sonnet`, `haiku`, `opus`) continue to work in config
- [ ] Full model IDs like `claude-opus-4-5-20251101` are accepted in config
- [ ] Invalid model strings (e.g., `gpt-4`, `random-string`) are rejected with a clear error
- [ ] Config docs updated to reflect new capabilities
- [ ] All existing tests pass
- [ ] New tests cover full model ID validation

## Notes
- The Claude CLI `--model` flag example from docs: `claude --model claude-sonnet-4-5-20250929`
- Keep the type system helpful: IDE autocomplete should still suggest short names
- The regex should be permissive enough to handle future model naming patterns but strict enough to catch obvious typos
@@ -0,0 +1,36 @@
# Task: Fix Name Generation to Not Register Sessions

## Objective
Replace `execSync("claude --print ...")` in name generation with a spawn-based approach using `--no-session-persistence` to prevent sessions from being saved to disk.

## Context
Project name generation (`src/utils/name-generator.ts`) currently uses `execSync()` to call `claude --model X --print "prompt"`. This registers a session in Claude's session history, cluttering the user's session list with throwaway name generation calls. The `--no-session-persistence` CLI flag prevents this.

## Requirements
- Replace `execSync` with `spawn` from `child_process` (same approach as `claude-runner.ts`)
- Use `-p` flag for non-interactive print mode
- Add `--no-session-persistence` flag to prevent session registration
- Do NOT use `--dangerously-skip-permissions` (not needed for simple text generation)
- Keep the same timeout behavior (30 seconds)
- Keep the same fallback behavior on failure
- The functions are already `async` so switching to spawn is natural

## Implementation Steps
1. In `src/utils/name-generator.ts`, replace `execSync` calls in `callSonnetForName()` and `callSonnetForMultipleNames()` with async `spawn`-based execution
2. Add `--no-session-persistence` to the CLI arguments
3. Collect stdout from the spawned process and return it as the result
4. Handle errors and timeouts the same way as before (return null/empty on failure)
5. Update tests to reflect the new spawn-based approach
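
A minimal sketch of the spawn-based helper the steps above describe. The function names `buildNameArgs()` and `runCli()` are hypothetical, and the exact argument order is an assumption; only the flags themselves (`--model`, `-p`, `--no-session-persistence`) and the 30-second timeout come from the task.

```typescript
import { spawn } from "node:child_process";

/** Builds the argument array; no shell is involved, so no escaping is needed. */
function buildNameArgs(model: string, prompt: string): string[] {
  return ["--model", model, "-p", prompt, "--no-session-persistence"];
}

/** Runs a command, resolving with trimmed stdout, or null on error, non-zero exit, or timeout. */
function runCli(command: string, args: string[], timeoutMs = 30000): Promise<string | null> {
  return new Promise((resolve) => {
    const child = spawn(command, args);
    let stdout = "";
    let settled = false;

    const finish = (result: string | null): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      resolve(result);
    };

    // Kill the child and fall back to null if it runs past the timeout.
    const timer = setTimeout(() => {
      child.kill();
      finish(null);
    }, timeoutMs);

    child.stdin.end(); // nothing is piped in; close stdin so the child does not wait on it
    child.stdout.on("data", (chunk: Buffer) => {
      stdout += chunk.toString();
    });
    child.on("error", () => finish(null));
    child.on("close", (code) => finish(code === 0 ? stdout.trim() : null));
  });
}
```

A caller might then write `await runCli("claude", buildNameArgs(model, prompt))` and apply the existing fallback when the result is `null`.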

## Acceptance Criteria
- [ ] Name generation no longer uses `execSync`
- [ ] `--no-session-persistence` flag is passed to Claude CLI
- [ ] Sessions from name generation do not appear in `claude --resume` picker
- [ ] Name generation still works correctly (produces valid kebab-case names)
- [ ] Fallback to generated names works when Claude CLI fails
- [ ] All existing tests pass

## Notes
- The `--no-session-persistence` flag is documented in Claude CLI reference: "Disable session persistence so sessions are not saved to disk and cannot be resumed (print mode only)"
- Consider using a small helper function for spawn-based CLI calls since this pattern could be reused
- Current `escapeShellArg()` won't be needed with spawn (arguments passed as array, not shell string)
@@ -0,0 +1,44 @@
# Task: Unify Task Execution to Stream-JSON Format

## Objective
Switch all task execution modes to use `--output-format stream-json --verbose` so that token usage data is always available from the `result` event.

## Context
Currently `claude-runner.ts` has two execution methods: `run()` (plain text stdout) and `runVerbose()` (stream-json with tool display). Token usage data is only available in stream-json's `result` event. To enable token tracking, all execution must use stream-json format. The difference between verbose and non-verbose becomes purely a display concern: whether tool descriptions are printed to the console.

## Dependencies
01

## Requirements
- All task execution uses `--output-format stream-json --verbose` flags
- Non-verbose mode still suppresses tool-use display lines but still captures the stream-json data
- The `result` event from stream-json is always parsed and returned
- The return type of execution methods is updated to include usage data extracted from the result event
- Completion detection logic continues to work (parsing text content for markers)
- No change to interactive mode (`runInteractive()` with node-pty)

## Implementation Steps
1. Define a `UsageData` interface/type to represent token usage extracted from the stream-json result event (input tokens, output tokens, cache read tokens, cache creation tokens, per-model breakdown)
2. Update the `run()` method to use stream-json format internally, parse NDJSON lines, extract text content for completion detection, and capture the `result` event
3. Add a `verbose` display option that controls whether tool-use lines are printed to stdout (true = show tools, false = suppress)
4. Extract usage data from the `result` event's `usage` and `modelUsage` fields
5. Update the return type of both `run()` and `runVerbose()` (or unified method) to include `UsageData`
6. Update callers in `src/commands/do.ts` to receive and pass through usage data
7. Update tests for the new execution flow
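
Steps 1, 2, and 4 can be sketched as a small NDJSON scan. The snake_case field names under `usage` follow the result event quoted in this task's Notes; the camelCase `UsageData` fields and the `extractUsage()` name are assumptions, and the per-model `modelUsage` breakdown is left out because its full shape is elided in the source.

```typescript
// Assumed target shape; the task only lists which counters it should carry.
interface UsageData {
  inputTokens: number;
  outputTokens: number;
  cacheReadInputTokens: number;
  cacheCreationInputTokens: number;
}

interface RawUsage {
  input_tokens?: number;
  output_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

/** Scans NDJSON output and returns usage from the last `result` event, if any. */
function extractUsage(ndjson: string): UsageData | undefined {
  let found: UsageData | undefined;
  for (const line of ndjson.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;

    let event: { type?: string; usage?: RawUsage };
    try {
      event = JSON.parse(trimmed);
    } catch {
      continue; // skip partial or non-JSON lines rather than failing the whole run
    }

    if (!event || event.type !== "result" || !event.usage) continue;
    const u = event.usage;
    found = {
      inputTokens: u.input_tokens || 0,
      outputTokens: u.output_tokens || 0,
      cacheReadInputTokens: u.cache_read_input_tokens || 0,
      cacheCreationInputTokens: u.cache_creation_input_tokens || 0,
    };
  }
  return found;
}
```

Returning `undefined` when no `result` event appears matches the case where a task times out or crashes before reporting usage.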

## Acceptance Criteria
- [ ] `run()` and `runVerbose()` both return token usage data
- [ ] Non-verbose execution suppresses tool display but still gets usage data
- [ ] Verbose execution shows tool display as before AND returns usage data
- [ ] Completion detection (COMPLETE/FAILED markers) still works
- [ ] Context overflow detection still works
- [ ] Timeout behavior unchanged
- [ ] All existing tests pass

## Notes
- The stream-json result event structure (from actual CLI output):
  ```
  {"type":"result","usage":{"input_tokens":N,"output_tokens":N,"cache_creation_input_tokens":N,"cache_read_input_tokens":N},"modelUsage":{"claude-opus-4-6":{"inputTokens":N,"outputTokens":N,...}}}
  ```
- Consider merging `run()` and `runVerbose()` into a single method with a `verbose` option to reduce code duplication
- The `renderStreamEvent()` function in `src/parsers/stream-renderer.ts` already handles parsing; extend it to also return usage data when a `result` event is encountered