rafcode 3.0.0 → 3.8.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude/settings.local.json +3 -1
- package/CLAUDE.md +0 -1
- package/RAF/38-dual-wielder/decisions.md +9 -0
- package/RAF/38-dual-wielder/input.md +6 -1
- package/RAF/38-dual-wielder/outcomes/8-e2e-test-codex-provider.md +139 -0
- package/RAF/38-dual-wielder/plans/8-e2e-test-codex-provider.md +95 -0
- package/RAF/39-pathless-rover/decisions.md +16 -0
- package/RAF/39-pathless-rover/input.md +2 -0
- package/RAF/39-pathless-rover/outcomes/1-fix-codex-stream-renderer.md +21 -0
- package/RAF/39-pathless-rover/outcomes/2-wire-provider-flag.md +28 -0
- package/RAF/39-pathless-rover/outcomes/3-remove-worktree-flag-do.md +41 -0
- package/RAF/39-pathless-rover/outcomes/4-remove-worktree-flag-plan-amend.md +30 -0
- package/RAF/39-pathless-rover/outcomes/5-update-prompts-and-docs.md +26 -0
- package/RAF/39-pathless-rover/plans/1-fix-codex-stream-renderer.md +43 -0
- package/RAF/39-pathless-rover/plans/2-wire-provider-flag.md +48 -0
- package/RAF/39-pathless-rover/plans/3-remove-worktree-flag-do.md +41 -0
- package/RAF/39-pathless-rover/plans/4-remove-worktree-flag-plan-amend.md +43 -0
- package/RAF/39-pathless-rover/plans/5-update-prompts-and-docs.md +31 -0
- package/RAF/40-numeric-order-fix/decisions.md +7 -0
- package/RAF/40-numeric-order-fix/input.md +19 -0
- package/RAF/40-numeric-order-fix/outcomes/1-fix-numeric-sort-order.md +18 -0
- package/RAF/40-numeric-order-fix/outcomes/2-add-npm-keywords.md +10 -0
- package/RAF/40-numeric-order-fix/plans/1-fix-numeric-sort-order.md +48 -0
- package/RAF/40-numeric-order-fix/plans/2-add-npm-keywords.md +23 -0
- package/RAF/41-echo-chamber/decisions.md +13 -0
- package/RAF/41-echo-chamber/input.md +4 -0
- package/RAF/41-echo-chamber/outcomes/1-update-codex-model-defaults.md +24 -0
- package/RAF/41-echo-chamber/outcomes/2-e2e-test-codex-provider.md +74 -0
- package/RAF/41-echo-chamber/plans/1-update-codex-model-defaults.md +28 -0
- package/RAF/41-echo-chamber/plans/2-e2e-test-codex-provider.md +103 -0
- package/RAF/42-patch-parade/decisions.md +29 -0
- package/RAF/42-patch-parade/input.md +9 -0
- package/RAF/42-patch-parade/outcomes/1-fix-codex-model-resolution.md +36 -0
- package/RAF/42-patch-parade/outcomes/2-fix-provider-aware-name-generation.md +31 -0
- package/RAF/42-patch-parade/outcomes/3-fix-codex-error-event-rendering.md +32 -0
- package/RAF/42-patch-parade/outcomes/4-update-cli-help-docs.md +28 -0
- package/RAF/42-patch-parade/outcomes/5-update-default-codex-models-to-gpt-5-4.md +33 -0
- package/RAF/42-patch-parade/outcomes/6-unify-model-config-schema.md +89 -0
- package/RAF/42-patch-parade/plans/1-fix-codex-model-resolution.md +35 -0
- package/RAF/42-patch-parade/plans/2-fix-provider-aware-name-generation.md +38 -0
- package/RAF/42-patch-parade/plans/3-fix-codex-error-event-rendering.md +32 -0
- package/RAF/42-patch-parade/plans/4-update-cli-help-docs.md +31 -0
- package/RAF/42-patch-parade/plans/5-update-default-codex-models-to-gpt-5-4.md +35 -0
- package/RAF/42-patch-parade/plans/6-unify-model-config-schema.md +46 -0
- package/RAF/43-swiss-army/decisions.md +34 -0
- package/RAF/43-swiss-army/input.md +7 -0
- package/RAF/43-swiss-army/outcomes/1-fix-model-validation.md +21 -0
- package/RAF/43-swiss-army/outcomes/2-update-commit-format.md +31 -0
- package/RAF/43-swiss-army/outcomes/3-wire-reasoning-effort.md +28 -0
- package/RAF/43-swiss-army/outcomes/4-remove-provider-flag.md +27 -0
- package/RAF/43-swiss-army/outcomes/5-config-wizard-validation.md +23 -0
- package/RAF/43-swiss-army/outcomes/6-add-fast-mode.md +32 -0
- package/RAF/43-swiss-army/outcomes/7-config-preset.md +31 -0
- package/RAF/43-swiss-army/plans/1-fix-model-validation.md +38 -0
- package/RAF/43-swiss-army/plans/2-update-commit-format.md +46 -0
- package/RAF/43-swiss-army/plans/3-wire-reasoning-effort.md +39 -0
- package/RAF/43-swiss-army/plans/4-remove-provider-flag.md +43 -0
- package/RAF/43-swiss-army/plans/5-config-wizard-validation.md +42 -0
- package/RAF/43-swiss-army/plans/6-add-fast-mode.md +46 -0
- package/RAF/43-swiss-army/plans/7-config-preset.md +51 -0
- package/RAF/44-config-api-change/decisions.md +22 -0
- package/RAF/44-config-api-change/input.md +5 -0
- package/RAF/44-config-api-change/outcomes/1-restructure-config-subcommands.md +19 -0
- package/RAF/44-config-api-change/outcomes/2-move-preset-under-config.md +17 -0
- package/RAF/44-config-api-change/outcomes/3-update-existing-tests-for-config-api.md +14 -0
- package/RAF/44-config-api-change/outcomes/4-update-config-command-docs.md +11 -0
- package/RAF/44-config-api-change/outcomes/5-fix-codex-name-generation.md +18 -0
- package/RAF/44-config-api-change/plans/1-restructure-config-subcommands.md +37 -0
- package/RAF/44-config-api-change/plans/2-move-preset-under-config.md +38 -0
- package/RAF/44-config-api-change/plans/3-update-existing-tests-for-config-api.md +38 -0
- package/RAF/44-config-api-change/plans/4-update-config-command-docs.md +36 -0
- package/RAF/44-config-api-change/plans/5-fix-codex-name-generation.md +49 -0
- package/RAF/45-signal-cairn/decisions.md +7 -0
- package/RAF/45-signal-cairn/input.md +2 -0
- package/RAF/45-signal-cairn/outcomes/1-rename-provider-to-harness.md +19 -0
- package/RAF/45-signal-cairn/outcomes/2-normalize-model-display-names.md +18 -0
- package/RAF/45-signal-cairn/plans/1-rename-provider-to-harness.md +40 -0
- package/RAF/45-signal-cairn/plans/2-normalize-model-display-names.md +41 -0
- package/RAF/45-signal-lantern/decisions.md +10 -0
- package/RAF/45-signal-lantern/input.md +2 -0
- package/RAF/45-signal-lantern/outcomes/1-add-effort-and-fast-to-do-model-display.md +15 -0
- package/RAF/45-signal-lantern/outcomes/2-capture-codex-post-run-token-usage.md +15 -0
- package/RAF/45-signal-lantern/outcomes/3-show-codex-token-summaries-without-fake-cost.md +14 -0
- package/RAF/45-signal-lantern/plans/1-add-effort-and-fast-to-do-model-display.md +38 -0
- package/RAF/45-signal-lantern/plans/2-capture-codex-post-run-token-usage.md +37 -0
- package/RAF/45-signal-lantern/plans/3-show-codex-token-summaries-without-fake-cost.md +40 -0
- package/RAF/46-lantern-arc/decisions.md +19 -0
- package/RAF/46-lantern-arc/input.md +6 -0
- package/RAF/46-lantern-arc/outcomes/1-remove-spark-alias.md +16 -0
- package/RAF/46-lantern-arc/outcomes/2-clean-up-worktree-plan-command.md +30 -0
- package/RAF/46-lantern-arc/outcomes/3-fix-token-usage-accumulation.md +32 -0
- package/RAF/46-lantern-arc/outcomes/4-display-effort-in-compact-mode.md +22 -0
- package/RAF/46-lantern-arc/outcomes/5-codex-fast-mode-research.md +38 -0
- package/RAF/46-lantern-arc/outcomes/6-optimize-llm-prompts.md +39 -0
- package/RAF/46-lantern-arc/plans/1-remove-spark-alias.md +38 -0
- package/RAF/46-lantern-arc/plans/2-clean-up-worktree-plan-command.md +33 -0
- package/RAF/46-lantern-arc/plans/3-fix-token-usage-accumulation.md +33 -0
- package/RAF/46-lantern-arc/plans/4-display-effort-in-compact-mode.md +28 -0
- package/RAF/46-lantern-arc/plans/5-codex-fast-mode-research.md +34 -0
- package/RAF/46-lantern-arc/plans/6-optimize-llm-prompts.md +48 -0
- package/RAF/47-signal-trim/decisions.md +13 -0
- package/RAF/47-signal-trim/input.md +2 -0
- package/RAF/47-signal-trim/plans/1-remove-cache-from-status.md +73 -0
- package/README.md +50 -63
- package/dist/commands/config.d.ts.map +1 -1
- package/dist/commands/config.js +47 -49
- package/dist/commands/config.js.map +1 -1
- package/dist/commands/do.d.ts +2 -0
- package/dist/commands/do.d.ts.map +1 -1
- package/dist/commands/do.js +91 -230
- package/dist/commands/do.js.map +1 -1
- package/dist/commands/plan.d.ts.map +1 -1
- package/dist/commands/plan.js +54 -259
- package/dist/commands/plan.js.map +1 -1
- package/dist/commands/preset.d.ts +3 -0
- package/dist/commands/preset.d.ts.map +1 -0
- package/dist/commands/preset.js +158 -0
- package/dist/commands/preset.js.map +1 -0
- package/dist/core/claude-runner.d.ts +2 -0
- package/dist/core/claude-runner.d.ts.map +1 -1
- package/dist/core/claude-runner.js +36 -12
- package/dist/core/claude-runner.js.map +1 -1
- package/dist/core/codex-runner.d.ts +1 -0
- package/dist/core/codex-runner.d.ts.map +1 -1
- package/dist/core/codex-runner.js +26 -7
- package/dist/core/codex-runner.js.map +1 -1
- package/dist/core/failure-analyzer.js +2 -1
- package/dist/core/failure-analyzer.js.map +1 -1
- package/dist/core/git.d.ts +2 -2
- package/dist/core/git.d.ts.map +1 -1
- package/dist/core/git.js +53 -3
- package/dist/core/git.js.map +1 -1
- package/dist/core/project-manager.d.ts.map +1 -1
- package/dist/core/project-manager.js +2 -2
- package/dist/core/project-manager.js.map +1 -1
- package/dist/core/pull-request.js +5 -5
- package/dist/core/pull-request.js.map +1 -1
- package/dist/core/runner-factory.d.ts +4 -4
- package/dist/core/runner-factory.d.ts.map +1 -1
- package/dist/core/runner-factory.js +8 -8
- package/dist/core/runner-factory.js.map +1 -1
- package/dist/core/runner-interface.d.ts +1 -1
- package/dist/core/runner-types.d.ts +17 -4
- package/dist/core/runner-types.d.ts.map +1 -1
- package/dist/core/state-derivation.js +3 -3
- package/dist/core/state-derivation.js.map +1 -1
- package/dist/parsers/codex-stream-renderer.d.ts +28 -4
- package/dist/parsers/codex-stream-renderer.d.ts.map +1 -1
- package/dist/parsers/codex-stream-renderer.js +110 -0
- package/dist/parsers/codex-stream-renderer.js.map +1 -1
- package/dist/prompts/amend.d.ts +0 -1
- package/dist/prompts/amend.d.ts.map +1 -1
- package/dist/prompts/amend.js +31 -104
- package/dist/prompts/amend.js.map +1 -1
- package/dist/prompts/execution.d.ts.map +1 -1
- package/dist/prompts/execution.js +17 -34
- package/dist/prompts/execution.js.map +1 -1
- package/dist/prompts/planning.d.ts.map +1 -1
- package/dist/prompts/planning.js +23 -123
- package/dist/prompts/planning.js.map +1 -1
- package/dist/types/config.d.ts +33 -32
- package/dist/types/config.d.ts.map +1 -1
- package/dist/types/config.js +14 -28
- package/dist/types/config.js.map +1 -1
- package/dist/utils/config.d.ts +36 -16
- package/dist/utils/config.d.ts.map +1 -1
- package/dist/utils/config.js +209 -104
- package/dist/utils/config.js.map +1 -1
- package/dist/utils/name-generator.d.ts.map +1 -1
- package/dist/utils/name-generator.js +25 -12
- package/dist/utils/name-generator.js.map +1 -1
- package/dist/utils/paths.d.ts +5 -0
- package/dist/utils/paths.d.ts.map +1 -1
- package/dist/utils/paths.js +9 -0
- package/dist/utils/paths.js.map +1 -1
- package/dist/utils/terminal-symbols.d.ts +15 -2
- package/dist/utils/terminal-symbols.d.ts.map +1 -1
- package/dist/utils/terminal-symbols.js +36 -4
- package/dist/utils/terminal-symbols.js.map +1 -1
- package/dist/utils/token-tracker.d.ts +6 -1
- package/dist/utils/token-tracker.d.ts.map +1 -1
- package/dist/utils/token-tracker.js +84 -51
- package/dist/utils/token-tracker.js.map +1 -1
- package/dist/utils/validation.d.ts +1 -2
- package/dist/utils/validation.d.ts.map +1 -1
- package/dist/utils/validation.js +4 -25
- package/dist/utils/validation.js.map +1 -1
- package/package.json +7 -2
- package/src/commands/config.ts +60 -63
- package/src/commands/do.ts +96 -262
- package/src/commands/plan.ts +55 -279
- package/src/commands/preset.ts +186 -0
- package/src/core/claude-runner.ts +45 -5
- package/src/core/codex-runner.ts +32 -7
- package/src/core/failure-analyzer.ts +2 -1
- package/src/core/git.ts +57 -3
- package/src/core/project-manager.ts +2 -1
- package/src/core/pull-request.ts +5 -5
- package/src/core/runner-factory.ts +9 -9
- package/src/core/runner-interface.ts +1 -1
- package/src/core/runner-types.ts +17 -4
- package/src/core/state-derivation.ts +3 -3
- package/src/parsers/codex-stream-renderer.ts +149 -4
- package/src/prompts/amend.ts +30 -105
- package/src/prompts/config-docs.md +206 -62
- package/src/prompts/execution.ts +17 -34
- package/src/prompts/planning.ts +23 -124
- package/src/types/config.ts +47 -59
- package/src/utils/config.ts +248 -115
- package/src/utils/name-generator.ts +29 -13
- package/src/utils/paths.ts +10 -0
- package/src/utils/terminal-symbols.ts +46 -6
- package/src/utils/token-tracker.ts +96 -57
- package/src/utils/validation.ts +5 -30
- package/tests/unit/amend-prompt.test.ts +3 -2
- package/tests/unit/claude-runner-interactive.test.ts +21 -3
- package/tests/unit/claude-runner.test.ts +39 -0
- package/tests/unit/codex-runner.test.ts +163 -0
- package/tests/unit/codex-stream-renderer.test.ts +127 -0
- package/tests/unit/command-output.test.ts +57 -0
- package/tests/unit/commit-planning-artifacts-worktree.test.ts +24 -7
- package/tests/unit/commit-planning-artifacts.test.ts +26 -4
- package/tests/unit/config-command.test.ts +215 -303
- package/tests/unit/config.test.ts +319 -235
- package/tests/unit/dependency-integration.test.ts +27 -1
- package/tests/unit/do-model-display.test.ts +35 -0
- package/tests/unit/execution-prompt.test.ts +49 -19
- package/tests/unit/name-generator.test.ts +82 -12
- package/tests/unit/plan-command-auto-flag.test.ts +7 -10
- package/tests/unit/plan-command.test.ts +14 -17
- package/tests/unit/planning-prompt.test.ts +9 -8
- package/tests/unit/terminal-symbols.test.ts +94 -3
- package/tests/unit/token-tracker.test.ts +180 -1
- package/tests/unit/validation.test.ts +9 -41
- package/tests/unit/worktree-flag-override.test.ts +0 -186
|
@@ -0,0 +1,18 @@
|
|
|
1
|
+
# Outcome
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
|
|
5
|
+
Centralized user-facing model label formatting so compact aliases now render consistently across RAF command output. In particular, `gpt54` now displays as `gpt-5.4` in planning, execution status, config, and PR-generation logs, while concise Claude labels like `opus`, `sonnet`, and `haiku` remain short and readable.
|
|
6
|
+
|
|
7
|
+
## Key Changes
|
|
8
|
+
|
|
9
|
+
- Added shared display helpers in `src/utils/config.ts` that derive user-facing model labels from the existing alias-to-full-ID mapping and support explicit full-ID rendering where needed.
|
|
10
|
+
- Updated user-facing model logging in `src/commands/plan.ts`, `src/commands/do.ts`, `src/commands/config.ts`, and `src/core/pull-request.ts` to use the centralized formatter instead of ad hoc alias shortening.
|
|
11
|
+
- Updated tests in `tests/unit/config.test.ts`, `tests/unit/terminal-symbols.test.ts`, and `tests/unit/command-output.test.ts` to cover normalized display names and canonical `gpt-5.4` task/status output.
|
|
12
|
+
|
|
13
|
+
## Notes
|
|
14
|
+
|
|
15
|
+
- `npm run lint` passed.
|
|
16
|
+
- Full verification passed with `NODE_OPTIONS='--experimental-vm-modules' npx jest --watchman=false --runInBand`.
|
|
17
|
+
|
|
18
|
+
<promise>COMPLETE</promise>
|
|
@@ -0,0 +1,40 @@
|
|
|
1
|
+
---
|
|
2
|
+
effort: high
|
|
3
|
+
---
|
|
4
|
+
# Task: Rename Provider To Harness
|
|
5
|
+
|
|
6
|
+
## Objective
|
|
7
|
+
Replace the `provider` terminology with `harness` everywhere in RAF's config schema, code, tests, prompts, and documentation without adding migration compatibility.
|
|
8
|
+
|
|
9
|
+
## Context
|
|
10
|
+
RAF currently models Claude vs Codex selection through `provider` fields and related helper/type names across config parsing, runner creation, command logging, and docs. The user wants that terminology renamed everywhere, not just in JSON config examples. This is a deliberate breaking change for a greenfield project, so the implementation should fully adopt `harness` instead of carrying both terms or adding migration support.
|
|
11
|
+
|
|
12
|
+
## Requirements
|
|
13
|
+
- Rename the per-model config field from `provider` to `harness` across the entire codebase.
|
|
14
|
+
- Rename internal TypeScript property names, helper parameters, validation paths, prompt/docs examples, README examples, and tests so `provider` is no longer the active term.
|
|
15
|
+
- Preserve existing behavior: RAF must still route model entries to the Claude or Codex CLI correctly after the rename.
|
|
16
|
+
- Do not add migration support, fallback parsing, or compatibility shims for the old `provider` key.
|
|
17
|
+
- Remove or rewrite validation/help text so user-facing config guidance refers to `harness`, not `provider`.
|
|
18
|
+
- Update `DEFAULT_CONFIG` in [src/types/config.ts](/Users/eremeev/.raf/worktrees/RAF/45-signal-cairn/src/types/config.ts) to use the new key, per project instructions.
|
|
19
|
+
- Update `README.md` and generated config docs because this changes the config API.
|
|
20
|
+
- Refresh automated coverage for config validation, config resolution, command logging, and any other code paths that assert on model-entry property names.
|
|
21
|
+
|
|
22
|
+
## Implementation Steps
|
|
23
|
+
1. Refactor config types in [src/types/config.ts](/Users/eremeev/.raf/worktrees/RAF/45-signal-cairn/src/types/config.ts) so model entries store `harness` instead of `provider`, and rename supporting types/interfaces/helpers where the old terminology is part of the public or internal API.
|
|
24
|
+
2. Update config parsing, validation, and merge logic in [src/utils/config.ts](/Users/eremeev/.raf/worktrees/RAF/45-signal-cairn/src/utils/config.ts) to accept only `harness`, reject stale `provider` keys naturally under the new schema, and keep scenario/effort resolution behavior intact.
|
|
25
|
+
3. Rename downstream call sites in commands, runners, name generation, validation, and core orchestration so they consume `entry.harness` consistently instead of `entry.provider`.
|
|
26
|
+
4. Update prompt/docs sources such as [src/prompts/config-docs.md](/Users/eremeev/.raf/worktrees/RAF/45-signal-cairn/src/prompts/config-docs.md) and [README.md](/Users/eremeev/.raf/worktrees/RAF/45-signal-cairn/README.md) to describe the new schema and examples using `harness`.
|
|
27
|
+
5. Rewrite affected tests to assert on `harness` terminology and current config behavior instead of the removed `provider` field.
|
|
28
|
+
6. Run the relevant test suites covering config, plan/do command model resolution, and docs/config command output; fix stale assertions that still encode the old terminology.
|
|
29
|
+
|
|
30
|
+
## Acceptance Criteria
|
|
31
|
+
- [ ] Config files and defaults use `harness` instead of `provider`.
|
|
32
|
+
- [ ] Active code paths no longer rely on `entry.provider` or equivalent provider-named config properties.
|
|
33
|
+
- [ ] User-facing docs, prompts, examples, and error/help text describe `harness`, not `provider`.
|
|
34
|
+
- [ ] RAF still resolves Claude vs Codex execution correctly after the rename.
|
|
35
|
+
- [ ] Tests covering config validation/resolution and command behavior pass with the renamed schema.
|
|
36
|
+
- [ ] All tests pass
|
|
37
|
+
|
|
38
|
+
## Notes
|
|
39
|
+
- Treat this as a full terminology migration inside the current codebase, not a narrow config-file alias.
|
|
40
|
+
- Because the user explicitly does not want migration support, avoid dual-read logic like `harness ?? provider`.
|
|
@@ -0,0 +1,41 @@
|
|
|
1
|
+
---
|
|
2
|
+
effort: medium
|
|
3
|
+
---
|
|
4
|
+
# Task: Normalize Model Display Names
|
|
5
|
+
|
|
6
|
+
## Objective
|
|
7
|
+
Make all user-facing model labels flow through one centralized formatter so aliases like `gpt54` display as canonical names such as `gpt-5.4` during planning and execution.
|
|
8
|
+
|
|
9
|
+
## Context
|
|
10
|
+
RAF currently mixes display helpers: some places show a full resolved model ID, while others call `getModelShortName()` and end up rendering compact aliases like `gpt54`. The user specifically called out `raf do` task status and the planning log line `Generating project name suggestions with gpt54`, and asked for this to become a global display rule rather than a one-off patch. The implementation should therefore centralize model display formatting and reuse existing model-resolution data instead of hardcoding the string in individual commands.
|
|
11
|
+
|
|
12
|
+
## Dependencies
|
|
13
|
+
1
|
|
14
|
+
|
|
15
|
+
## Requirements
|
|
16
|
+
- Introduce or reuse a single helper for user-facing model labels instead of formatting aliases ad hoc at call sites.
|
|
17
|
+
- Ensure `gpt54` displays as `gpt-5.4` anywhere model names are shown to the user, including task status during `raf do` and planning logs during name generation.
|
|
18
|
+
- Apply the same centralized display rule to other command/log surfaces that currently expose model aliases, such as model-selection logs and verbose execution lines.
|
|
19
|
+
- Derive display names from the existing model alias/full-ID mapping source in config utilities rather than scattering special cases.
|
|
20
|
+
- Preserve the current UX where concise Claude labels like `opus`, `sonnet`, and `haiku` remain readable unless a surface already intentionally shows a full resolved model ID.
|
|
21
|
+
- Update tests that assert on task progress formatting, plan logging, config/model helper behavior, and any other user-facing model strings touched by the refactor.
|
|
22
|
+
|
|
23
|
+
## Implementation Steps
|
|
24
|
+
1. Audit current model-display call sites in [src/commands/plan.ts](/Users/eremeev/.raf/worktrees/RAF/45-signal-cairn/src/commands/plan.ts), [src/commands/do.ts](/Users/eremeev/.raf/worktrees/RAF/45-signal-cairn/src/commands/do.ts), [src/commands/config.ts](/Users/eremeev/.raf/worktrees/RAF/45-signal-cairn/src/commands/config.ts), and any shared terminal helpers to identify where aliases are exposed directly.
|
|
25
|
+
2. Add a centralized display formatter in [src/utils/config.ts](/Users/eremeev/.raf/worktrees/RAF/45-signal-cairn/src/utils/config.ts) or the most appropriate shared utility, backed by the existing alias-to-full-ID mapping so canonical display names come from one source.
|
|
26
|
+
3. Replace direct `getModelShortName()` or equivalent alias formatting at user-facing log/status call sites with the new display helper, including the `raf do` status line and `raf plan` name-generation message.
|
|
27
|
+
4. Keep intentionally full-ID surfaces intact where RAF already means to show the full resolved model, but route those decisions through explicit helper usage so the display policy is clear.
|
|
28
|
+
5. Update and extend unit tests for config display helpers, terminal task-status formatting, and command output expectations so `gpt54` no longer appears in user-facing output.
|
|
29
|
+
6. Run the relevant tests and tighten any stale assertions that still depend on alias-style display strings.
|
|
30
|
+
|
|
31
|
+
## Acceptance Criteria
|
|
32
|
+
- [ ] `raf do` task status shows `gpt-5.4` instead of `gpt54` when the resolved model is the GPT-5.4 alias.
|
|
33
|
+
- [ ] `raf plan` logs `Generating project name suggestions with gpt-5.4...` when name generation uses that alias.
|
|
34
|
+
- [ ] User-facing model logging/status surfaces use one centralized display formatter instead of per-call-site alias handling.
|
|
35
|
+
- [ ] The formatter derives `gpt-5.4` from shared model metadata rather than a one-off string replacement in command code.
|
|
36
|
+
- [ ] Updated tests cover the new display behavior and pass.
|
|
37
|
+
- [ ] All tests pass
|
|
38
|
+
|
|
39
|
+
## Notes
|
|
40
|
+
- The goal is display normalization, not a behavioral change to model resolution.
|
|
41
|
+
- Be careful not to regress surfaces that intentionally show full resolved IDs via `resolveFullModelId()`; unify the policy, not necessarily the exact string everywhere.
|
|
@@ -0,0 +1,10 @@
|
|
|
1
|
+
# Project Decisions
|
|
2
|
+
|
|
3
|
+
## For task 1, what exact format do you want on the task line? Example: `● 01-auth-login (sonnet, low, fast) 12s`. Also, should this apply only to the compact running/completed/failed lines, or also to verbose `Model:` / retry logs?
|
|
4
|
+
In same places where currently model is specified. `● 01-auth-login (sonnet, low, fast) 12s`. Don't display `low` if null and `fast` if falsy (`null` or `false`).
|
|
5
|
+
|
|
6
|
+
## For task 2, what should the research task answer definitively: only exact dollar cost, or any usable post-run signal from Codex such as tokens, usage limits, or credits?
|
|
7
|
+
Task is not research, do research now and say if it possible to get price in dollars and input output tokens count.
|
|
8
|
+
|
|
9
|
+
## For task 3, if Codex can provide token counts but not exact price, should RAF still implement token-only summaries for Codex, or should price be shown only when it can be sourced exactly rather than estimated?
|
|
10
|
+
Only show price if we can source price exactly. Provide input and output tokens though.
|
|
@@ -0,0 +1,15 @@
|
|
|
1
|
+
Implemented the `raf do` model display update so existing compact task lines and verbose model logs now append resolved reasoning effort and `fast` metadata when present.
|
|
2
|
+
|
|
3
|
+
Key changes:
|
|
4
|
+
- Updated [`src/utils/terminal-symbols.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/src/utils/terminal-symbols.ts) to centralize model metadata formatting and allow compact task progress lines to render `(model, effort, fast)` while omitting unavailable metadata.
|
|
5
|
+
- Updated [`src/commands/do.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/src/commands/do.ts) to thread resolved `reasoningEffort` and `fast` through running/completed/failed compact lines and through verbose `Model:` and retry logs.
|
|
6
|
+
- Added coverage in [`tests/unit/terminal-symbols.test.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/tests/unit/terminal-symbols.test.ts), [`tests/unit/command-output.test.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/tests/unit/command-output.test.ts), and new [`tests/unit/do-model-display.test.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/tests/unit/do-model-display.test.ts).
|
|
7
|
+
- Updated [`README.md`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/README.md) to note the enriched non-verbose `raf do` task-line display.
|
|
8
|
+
|
|
9
|
+
Important notes:
|
|
10
|
+
- Verification commands could not run in this worktree because `node_modules` is absent, so local `jest` and `tsc` binaries were unavailable.
|
|
11
|
+
- Attempted verification:
|
|
12
|
+
- `npm test -- --runInBand tests/unit/terminal-symbols.test.ts tests/unit/command-output.test.ts tests/unit/do-model-display.test.ts` -> failed with `jest: command not found`
|
|
13
|
+
- `npm run lint` -> failed with `tsc: command not found`
|
|
14
|
+
|
|
15
|
+
<promise>COMPLETE</promise>
|
|
@@ -0,0 +1,15 @@
|
|
|
1
|
+
Implemented Codex post-run token usage capture so `codex exec --json` `turn.completed` events now feed RAF's shared usage pipeline instead of being dropped at the runner boundary.
|
|
2
|
+
|
|
3
|
+
Key changes:
|
|
4
|
+
- Updated `src/parsers/codex-stream-renderer.ts` to extract structured `UsageData` from `turn.completed`, including exact input/output token counts and model-scoped usage while leaving exact cost unavailable.
|
|
5
|
+
- Updated `src/core/codex-runner.ts` to retain streamed Codex usage data and return it through the normal `RunResult.usageData` path used by `raf do`.
|
|
6
|
+
- Refined shared usage/cost types and accumulation in `src/types/config.ts`, `src/utils/token-tracker.ts`, and `src/utils/terminal-symbols.ts` so unknown exact cost remains unavailable across retries and summaries instead of rendering as a fake exact zero.
|
|
7
|
+
- Added coverage in `tests/unit/codex-stream-renderer.test.ts`, new `tests/unit/codex-runner.test.ts`, `tests/unit/token-tracker.test.ts`, and `tests/unit/terminal-symbols.test.ts`.
|
|
8
|
+
|
|
9
|
+
Important notes:
|
|
10
|
+
- Acceptance criteria around Codex input/output token capture, runner propagation, retry accumulation, and unavailable-cost handling were implemented in code and covered by new/updated unit tests.
|
|
11
|
+
- Verification was limited by the worktree environment because `node_modules` is absent.
|
|
12
|
+
- Attempted verification:
|
|
13
|
+
- `npm test -- --runInBand tests/unit/codex-stream-renderer.test.ts tests/unit/codex-runner.test.ts tests/unit/token-tracker.test.ts tests/unit/terminal-symbols.test.ts` -> failed with `jest: command not found`
|
|
14
|
+
- `npm run lint` -> failed with `tsc: command not found`
|
|
15
|
+
<promise>COMPLETE</promise>
|
|
@@ -0,0 +1,14 @@
|
|
|
1
|
+
Updated RAF's shared token-summary formatting so token counts still render for all providers, but USD cost only appears when the CLI supplies an exact value. This removes misleading `$0.00` output from Codex summaries while preserving exact Claude cost reporting.
|
|
2
|
+
|
|
3
|
+
Key changes:
|
|
4
|
+
- Updated `src/utils/terminal-symbols.ts` to make cost rendering availability-aware for both per-task summaries and grand-total summaries, while still treating exact zero as a real cost.
|
|
5
|
+
- Updated `tests/unit/terminal-symbols.test.ts` to cover Codex-style token-only summaries, mixed multi-attempt cost availability, and Claude exact-cost regressions.
|
|
6
|
+
- Updated `tests/unit/command-output.test.ts` to verify the logged output path shows token-only Codex summaries and exact Claude totals.
|
|
7
|
+
- Updated `README.md` to document that Codex currently reports exact token counts post-run, but RAF omits USD cost unless the provider returns an exact price.
|
|
8
|
+
|
|
9
|
+
Important notes:
|
|
10
|
+
- `src/commands/do.ts` already routed both per-task and grand-total usage output through the shared formatter, so no provider-specific command branching was needed.
|
|
11
|
+
- Verification was attempted but blocked by the worktree environment because `node_modules` is absent:
|
|
12
|
+
- `npm test -- --runInBand tests/unit/terminal-symbols.test.ts tests/unit/command-output.test.ts` -> failed with `jest: command not found`
|
|
13
|
+
- `npm run lint` -> failed with `tsc: command not found`
|
|
14
|
+
<promise>COMPLETE</promise>
|
|
@@ -0,0 +1,38 @@
|
|
|
1
|
+
---
|
|
2
|
+
effort: low
|
|
3
|
+
---
|
|
4
|
+
# Task: Add Effort And Fast To Do Model Display
|
|
5
|
+
|
|
6
|
+
## Objective
|
|
7
|
+
Augment every existing `raf do` model display to include task effort and a fast-mode marker when present.
|
|
8
|
+
|
|
9
|
+
## Context
|
|
10
|
+
`raf do` already shows the resolved model in compact task lines and in verbose execution logs. The user wants the same display points to carry more execution metadata so task routing is visible at a glance, without adding extra log lines.
|
|
11
|
+
|
|
12
|
+
## Requirements
|
|
13
|
+
- Update the same places where the model is currently shown; do not add new display locations.
|
|
14
|
+
- Compact task lines should follow the existing pattern with appended metadata, for example: `● 01-auth-login (sonnet, low, fast) 12s`.
|
|
15
|
+
- Omit the effort label when it is unavailable.
|
|
16
|
+
- Omit `fast` when the resolved model entry has `fast: false`, `null`, or `undefined`.
|
|
17
|
+
- Preserve current model identifier style per output surface.
|
|
18
|
+
- Running, completed, and failed compact lines must stay aligned with the current `formatTaskProgress()` behavior.
|
|
19
|
+
- Verbose `Model:` and retry logs should include the same effort/fast metadata rather than silently diverging from compact mode.
|
|
20
|
+
|
|
21
|
+
## Implementation Steps
|
|
22
|
+
1. Identify every `raf do` output path that currently renders model information, including the timer status line and verbose execution/retry logs in [`src/commands/do.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/src/commands/do.ts).
|
|
23
|
+
2. Extend the task progress formatter in [`src/utils/terminal-symbols.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/src/utils/terminal-symbols.ts) so it can render model plus optional effort/fast metadata without changing unrelated progress output.
|
|
24
|
+
3. Thread resolved frontmatter effort and `fast` from the task model resolution path into all current model-display call sites.
|
|
25
|
+
4. Keep truncation and spacing behavior stable so long task names still render cleanly.
|
|
26
|
+
5. Add or update unit tests for compact progress formatting and any `do` command expectations affected by the new suffix.
|
|
27
|
+
|
|
28
|
+
## Acceptance Criteria
|
|
29
|
+
- [ ] Running compact lines show model metadata in the same place the model appears today.
|
|
30
|
+
- [ ] Completed compact lines show model metadata in the same place the model appears today.
|
|
31
|
+
- [ ] Failed compact lines show model metadata in the same place the model appears today.
|
|
32
|
+
- [ ] Verbose `Model:` and retry logs include effort and `fast` metadata when available.
|
|
33
|
+
- [ ] Effort is omitted when unavailable.
|
|
34
|
+
- [ ] `fast` is omitted when falsy.
|
|
35
|
+
- [ ] Existing output-format tests pass after expectation updates.
|
|
36
|
+
|
|
37
|
+
## Notes
|
|
38
|
+
- Reasonable assumption: verbose logs should keep the current resolved model identifier style and append metadata instead of replacing the model string.
|
|
@@ -0,0 +1,37 @@
|
|
|
1
|
+
---
|
|
2
|
+
effort: medium
|
|
3
|
+
---
|
|
4
|
+
# Task: Capture Codex Post-Run Token Usage
|
|
5
|
+
|
|
6
|
+
## Objective
|
|
7
|
+
Capture Codex input/output token counts from non-interactive execution and carry them through RAF’s post-run usage pipeline.
|
|
8
|
+
|
|
9
|
+
## Context
|
|
10
|
+
Research during planning found that RAF’s Codex path already sees `turn.completed` usage data in `codex exec --json`, and prior local outcome notes show real Codex runs reporting `in`/`out` token counts. However, [`src/core/codex-runner.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/src/core/codex-runner.ts) currently returns `usageData: undefined`, so `raf do` cannot include Codex token totals after execution.
|
|
11
|
+
|
|
12
|
+
## Requirements
|
|
13
|
+
- Parse Codex `turn.completed` usage from the JSON stream used by `codex exec --json`.
|
|
14
|
+
- Capture at least exact input and output token counts from Codex.
|
|
15
|
+
- Preserve compatibility with existing Claude usage tracking.
|
|
16
|
+
- Do not invent or estimate dollar cost for Codex.
|
|
17
|
+
- Represent missing exact Codex cost as unavailable rather than pretending the value is exact.
|
|
18
|
+
- Keep retry accumulation behavior intact so Codex retries still roll up correctly.
|
|
19
|
+
|
|
20
|
+
## Implementation Steps
|
|
21
|
+
1. Extend the Codex stream renderer in [`src/parsers/codex-stream-renderer.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/src/parsers/codex-stream-renderer.ts) to return structured usage data for `turn.completed` events, not just display text.
|
|
22
|
+
2. Update the shared run-result plumbing so [`src/core/codex-runner.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/src/core/codex-runner.ts) records Codex usage data from streamed events the same way Claude does.
|
|
23
|
+
3. Introduce or refine the usage-data representation so Codex can express exact token counts with no exact dollar-cost value.
|
|
24
|
+
4. Ensure token tracking in [`src/utils/token-tracker.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/src/utils/token-tracker.ts) can accumulate Codex attempts without collapsing “unknown cost” into a misleading exact zero.
|
|
25
|
+
5. Add unit tests for Codex usage extraction and runner propagation.
|
|
26
|
+
|
|
27
|
+
## Acceptance Criteria
|
|
28
|
+
- [ ] Codex `turn.completed` input token counts are captured into RAF usage data.
|
|
29
|
+
- [ ] Codex `turn.completed` output token counts are captured into RAF usage data.
|
|
30
|
+
- [ ] `raf do` receives Codex usage data through the runner result path.
|
|
31
|
+
- [ ] Existing Claude usage tests continue to pass unchanged in behavior.
|
|
32
|
+
- [ ] Codex cost remains explicitly unavailable unless an exact field is present.
|
|
33
|
+
- [ ] Codex retry attempts still accumulate token usage correctly.
|
|
34
|
+
- [ ] New Codex renderer/runner tests pass.
|
|
35
|
+
|
|
36
|
+
## Notes
|
|
37
|
+
- Planning research conclusion as of March 21, 2026: current Codex surfaces clearly expose token usage, but exact per-run dollar cost was not confirmed in the `exec --json` path RAF uses.
|
|
@@ -0,0 +1,40 @@
|
|
|
1
|
+
---
|
|
2
|
+
effort: medium
|
|
3
|
+
---
|
|
4
|
+
# Task: Show Codex Token Summaries Without Fake Cost
|
|
5
|
+
|
|
6
|
+
## Objective
|
|
7
|
+
Update RAF’s post-execution token summaries so Codex runs show exact token counts while omitting dollar cost when no exact price is available.
|
|
8
|
+
|
|
9
|
+
## Context
|
|
10
|
+
Once Codex token usage is captured, RAF’s current summary format will otherwise show `Cost: $0.00`, which reads like an exact price instead of “unknown”. The user explicitly wants input/output tokens surfaced, but only wants price shown when RAF can source it exactly.
|
|
11
|
+
|
|
12
|
+
## Dependencies
|
|
13
|
+
2
|
|
14
|
+
|
|
15
|
+
## Requirements
|
|
16
|
+
- Show Codex input/output token summaries after each task and in the grand total summary.
|
|
17
|
+
- Do not show `$0.00` when Codex cost is unavailable.
|
|
18
|
+
- Continue showing exact dollar cost for Claude runs exactly as RAF does today.
|
|
19
|
+
- Avoid estimated pricing for Codex entirely.
|
|
20
|
+
- Make the “cost unavailable” behavior data-driven so future providers with token-only usage can reuse it.
|
|
21
|
+
- Update README and any config/docs references affected by the new provider-specific summary behavior.
|
|
22
|
+
|
|
23
|
+
## Implementation Steps
|
|
24
|
+
1. Update token summary formatting in [`src/utils/terminal-symbols.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/src/utils/terminal-symbols.ts) so cost output is conditional on exact-cost availability instead of always rendering a dollar value.
|
|
25
|
+
2. Adjust any supporting token-tracker or usage-data helpers to preserve the distinction between exact zero and unavailable cost.
|
|
26
|
+
3. Wire the updated summary behavior through [`src/commands/do.ts`](/Users/eremeev/.raf/worktrees/RAF/45-signal-lantern/src/commands/do.ts) for both per-task and grand-total reporting.
|
|
27
|
+
4. Update README and related docs to explain that Codex currently reports token counts post-run, while dollar cost is shown only when the CLI provides an exact value.
|
|
28
|
+
5. Add tests covering Codex token-only summaries and regression tests proving Claude still shows exact cost.
|
|
29
|
+
|
|
30
|
+
## Acceptance Criteria
|
|
31
|
+
- [ ] Codex task summaries show input/output tokens after execution.
|
|
32
|
+
- [ ] Codex grand totals show input/output tokens after execution.
|
|
33
|
+
- [ ] Codex summaries omit dollar cost when exact price is unavailable.
|
|
34
|
+
- [ ] Claude summaries still show exact dollar cost.
|
|
35
|
+
- [ ] No summary path prints a misleading `$0.00` for unknown Codex cost.
|
|
36
|
+
- [ ] README reflects the current Codex token-only limitation.
|
|
37
|
+
- [ ] Updated formatter and command-output tests pass.
|
|
38
|
+
|
|
39
|
+
## Notes
|
|
40
|
+
- If Codex later adds an exact per-run dollar-cost field, the display should start showing it through the same availability-aware path instead of introducing provider-specific formatting branches.
|
|
@@ -0,0 +1,19 @@
|
|
|
1
|
+
# Project Decisions
|
|
2
|
+
|
|
3
|
+
## Should the spark alias be removed entirely or remapped to a different model?
|
|
4
|
+
Remove entirely. Delete 'spark' from CodexModelAlias type, MODEL_ALIAS_TO_FULL_ID, tier ordering, VALID_CODEX_MODEL_ALIASES, and any other references including config-docs.md.
|
|
5
|
+
|
|
6
|
+
## Should worktree cleanup be limited to the plan command or broader?
|
|
7
|
+
Plan command only. Remove worktree references only from PlanCommandOptions interface and the plan command action handler.
|
|
8
|
+
|
|
9
|
+
## Should token usage from multiple turn.completed events be summed or kept as last-only?
|
|
10
|
+
Sum all token fields. Accumulate inputTokens, outputTokens, cacheReadInputTokens, cacheCreationInputTokens, and totalCostUsd across all turn.completed events in both claude-runner.ts and codex-runner.ts.
|
|
11
|
+
|
|
12
|
+
## Where should effort appear in the compact display status line?
|
|
13
|
+
Inside parentheses: (sonnet, medium, fast) - effort between model and fast flag.
|
|
14
|
+
|
|
15
|
+
## What should happen with fast mode for Codex harness?
|
|
16
|
+
Research whether Codex CLI supports fast mode. If it does, wire it up. If not, remove the fast setting entirely from Codex-related config paths (don't just warn - clean it up).
|
|
17
|
+
|
|
18
|
+
## Which prompt files should be optimized?
|
|
19
|
+
All three: planning.ts, execution.ts, and amend.ts.
|
|
@@ -0,0 +1,6 @@
|
|
|
1
|
+
- [ ] config wizard say: Important caveat: RAF only applies fast mode on Claude runners. Since your planning model is stillprovider: "codex", this setting will not change raf plan behavior. If you want actual fast-mode planning, I can switch models.plan to a Claude model and keep fast: true. Invetigate if it's possible to wire fast mode to codex harness. do if possible
|
|
2
|
+
- [ ] optimise prompts for llms. focus if there is confusing statements and repeating statements. clarify or remove redundancy if so. try to be less verbose but still clear for llm
|
|
3
|
+
- [ ] The PlanCommandOptions interface declares worktree?: boolean and the action handler reads options.worktree - clean this up, no --worktree flag needed
|
|
4
|
+
- [ ] When multiple `turn.completed` events are emitted in a single `codex exec --json` run (e.g., tool-driven multi-turn executions), this assignment overwrites prior usage and keeps only the last turn's tokens. That makes per-task and run-level token summaries undercount actual usage for those sessions. The runner should merge successive `rendered.usageData` payloads instead of replacing them.
|
|
5
|
+
- [ ] effort is not displayed in task execution status in 'raf do' compact
|
|
6
|
+
- [ ] spark alias in the RAF codebase already maps to gpt-5.3-codex <- find that alias are remove, this is wrong
|
|
@@ -0,0 +1,16 @@
|
|
|
1
|
+
# Outcome: Remove spark alias
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
Removed the `spark` model alias from the codebase. It incorrectly mapped to `gpt-5.3-codex` and has been eliminated entirely.
|
|
5
|
+
|
|
6
|
+
## Changes Made
|
|
7
|
+
- `src/types/config.ts`: Removed `'spark'` from `CodexModelAlias` type and `VALID_CODEX_MODEL_ALIASES` array
|
|
8
|
+
- `src/utils/config.ts`: Removed `spark` from `MODEL_ALIAS_TO_FULL_ID`, `CODEX_MODEL_TIER_ORDER`, alias recognition condition, and updated JSDoc comment
|
|
9
|
+
- `src/prompts/config-docs.md`: Removed `spark` row from the Codex models table; updated `codex` row description
|
|
10
|
+
|
|
11
|
+
## Verification
|
|
12
|
+
- No remaining `spark` references in `src/`
|
|
13
|
+
- TypeScript compiles without errors
|
|
14
|
+
- `codex` and `gpt54` aliases remain intact
|
|
15
|
+
|
|
16
|
+
<promise>COMPLETE</promise>
|
|
@@ -0,0 +1,30 @@
|
|
|
1
|
+
# Outcome: Clean up worktree flag from plan command
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
Removed all dead worktree code from `runPlanCommand()` in `src/commands/plan.ts`. The `--worktree` flag was never exposed on the plan command, making `worktreeMode` and related logic unreachable dead code.
|
|
5
|
+
|
|
6
|
+
## Changes Made
|
|
7
|
+
- `src/commands/plan.ts`:
|
|
8
|
+
- Removed `worktree?: boolean` from `PlanCommandOptions` interface
|
|
9
|
+
- Removed `getWorktreeDefault` and `getSyncMainBranch` from config.js imports
|
|
10
|
+
- Removed `createWorktree`, `removeWorktree`, `pullMainBranch` from worktree.js imports
|
|
11
|
+
- Removed `getNextProjectNumber`, `formatProjectNumber`, `getDecisionsPath`, `getOutcomesDir` from paths.js imports
|
|
12
|
+
- Removed `sanitizeProjectName` import (was only used in worktree path construction)
|
|
13
|
+
- Removed `const worktreeMode = options.worktree ?? getWorktreeDefault()` from action handler
|
|
14
|
+
- Removed `worktreeMode` parameter from `runPlanCommand()` signature and call site
|
|
15
|
+
- Removed git validation block for worktree mode
|
|
16
|
+
- Removed worktree path variables (`worktreePath`, `worktreeBranch`) and all worktree creation logic
|
|
17
|
+
- Collapsed if/else into standard-mode-only project creation
|
|
18
|
+
- Simplified shutdown handler (removed worktree cleanup branch)
|
|
19
|
+
- Removed `worktreeMode` from `getPlanningPrompt()` call
|
|
20
|
+
- Removed `cwd: worktreePath ?? undefined` from `runInteractive()` call
|
|
21
|
+
- Simplified success message (removed worktree-specific branch)
|
|
22
|
+
- Simplified `commitPlanningArtifacts()` call (removed `cwd` option)
|
|
23
|
+
- Simplified finally block (removed worktree cleanup branch)
|
|
24
|
+
|
|
25
|
+
## Verification
|
|
26
|
+
- TypeScript compiles without errors
|
|
27
|
+
- No `worktreeMode` or dead worktree references remain in `runPlanCommand()`
|
|
28
|
+
- `runAmendCommand` and `runResumeCommand` retain their worktree support (still functional)
|
|
29
|
+
|
|
30
|
+
<promise>COMPLETE</promise>
|
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# Outcome: Fix token usage accumulation for multi-turn executions
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
Implemented cumulative usage merging for streamed execution events so token and cost totals now reflect all `turn.completed`/`result` events in a run rather than only the last event.
|
|
5
|
+
|
|
6
|
+
## Key Changes Made
|
|
7
|
+
- Added shared merge utility in `src/utils/token-tracker.ts`:
|
|
8
|
+
- New `mergeUsageData(existing, incoming)` handles first-event initialization and incremental accumulation.
|
|
9
|
+
- Sums `inputTokens`, `outputTokens`, `cacheReadInputTokens`, `cacheCreationInputTokens`, and `totalCostUsd`.
|
|
10
|
+
- Merges `modelUsage` per model with the same accumulation behavior.
|
|
11
|
+
- Handles undefined/missing fields defensively to avoid `NaN`/crashes.
|
|
12
|
+
- Updated runners to accumulate usage instead of overwrite:
|
|
13
|
+
- `src/core/codex-runner.ts`: replaced both `usageData = rendered.usageData` assignments with `mergeUsageData(...)`.
|
|
14
|
+
- `src/core/claude-runner.ts`: replaced both `usageData = rendered.usageData` assignments with `mergeUsageData(...)`.
|
|
15
|
+
- Refactored existing usage aggregation internals to reuse shared merge logic:
|
|
16
|
+
- `accumulateUsage()` now folds through `mergeUsageData`.
|
|
17
|
+
- `TokenTracker.getTotals()` now merges per-task usage via `mergeUsageData`.
|
|
18
|
+
- Added/updated tests:
|
|
19
|
+
- `tests/unit/codex-runner.test.ts`: added multi-`turn.completed` accumulation test.
|
|
20
|
+
- `tests/unit/claude-runner.test.ts`: added multi-`result` accumulation test.
|
|
21
|
+
- `tests/unit/token-tracker.test.ts`: added `mergeUsageData` behavior tests, including undefined/missing-field handling.
|
|
22
|
+
|
|
23
|
+
## Verification
|
|
24
|
+
- TypeScript build: `npm run -s build` passed.
|
|
25
|
+
- Focused tests passed:
|
|
26
|
+
- `NODE_OPTIONS='--experimental-vm-modules' npx jest --watchman=false tests/unit/token-tracker.test.ts tests/unit/codex-runner.test.ts tests/unit/claude-runner.test.ts`
|
|
27
|
+
- Note: default `npm test` invocation in this sandbox attempted Watchman and failed due socket permission restrictions; reran with `--watchman=false`.
|
|
28
|
+
|
|
29
|
+
## Notes
|
|
30
|
+
- No CLI surface/flags changed; README updates were not required.
|
|
31
|
+
|
|
32
|
+
<promise>COMPLETE</promise>
|
|
@@ -0,0 +1,22 @@
|
|
|
1
|
+
# Outcome: Display frontmatter effort in compact task status
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
Updated compact `raf do` task status lines to show the plan frontmatter effort (`low`/`medium`/`high`) instead of the model runtime `reasoningEffort` parameter.
|
|
5
|
+
|
|
6
|
+
## Key Changes Made
|
|
7
|
+
- `src/commands/do.ts`
|
|
8
|
+
- Added `currentEffort` tracking for compact status display metadata.
|
|
9
|
+
- Set `currentEffort` from `task.frontmatter?.effort` after model resolution.
|
|
10
|
+
- Updated compact status rendering for running/completed/failed lines to pass `effort: currentEffort`.
|
|
11
|
+
- Removed use of `currentModelReasoningEffort` in compact display call sites so runtime reasoning settings are no longer conflated with task effort frontmatter.
|
|
12
|
+
|
|
13
|
+
## Verification
|
|
14
|
+
- TypeScript build passed:
|
|
15
|
+
- `npm run -s build`
|
|
16
|
+
- Focused tests passed:
|
|
17
|
+
- `NODE_OPTIONS='--experimental-vm-modules' npx jest --watchman=false tests/unit/terminal-symbols.test.ts tests/unit/do-model-display.test.ts`
|
|
18
|
+
|
|
19
|
+
## Notes
|
|
20
|
+
- Tasks without `effort` frontmatter continue to render model metadata without a blank effort slot because `formatModelMetadata` only appends effort when defined.
|
|
21
|
+
|
|
22
|
+
<promise>COMPLETE</promise>
|
|
@@ -0,0 +1,38 @@
|
|
|
1
|
+
# Outcome: Codex fast mode research and config handling
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
Verified Codex CLI capabilities and updated RAF to treat `fast` as Claude-only with explicit user-facing warnings for Codex entries.
|
|
5
|
+
|
|
6
|
+
## Research Findings
|
|
7
|
+
- Checked `codex --help` and `codex exec --help` directly.
|
|
8
|
+
- No Codex fast mode flag is available in current CLI help output.
|
|
9
|
+
- Conclusion: Codex fast mode is unsupported in RAF and should not be applied in Codex runner args.
|
|
10
|
+
|
|
11
|
+
## Key Changes Made
|
|
12
|
+
- `src/utils/config.ts`
|
|
13
|
+
- Added `collectConfigValidationWarnings()` to detect `fast: true` on `harness: "codex"` model entries.
|
|
14
|
+
- Added warning emission during `resolveConfig()` for those entries.
|
|
15
|
+
- Updated model-entry merge normalization to strip `fast` from resolved Codex entries so the unsupported setting is ignored consistently.
|
|
16
|
+
- `src/commands/config.ts`
|
|
17
|
+
- Updated `raf config set` flow to emit the same config validation warnings after validation.
|
|
18
|
+
- `src/prompts/config-docs.md`
|
|
19
|
+
- Clarified that Codex does not support fast mode and RAF warns/ignores `fast: true` on Codex entries.
|
|
20
|
+
- `README.md`
|
|
21
|
+
- Added CLI note that fast mode is Claude-only and Codex `fast` settings are warned/ignored.
|
|
22
|
+
- `tests/unit/config.test.ts`
|
|
23
|
+
- Added warning helper coverage for Codex `fast: true` entries.
|
|
24
|
+
- Added resolve-config coverage to verify Codex `fast` is warned and stripped.
|
|
25
|
+
- Removed stale `spark` alias expectations.
|
|
26
|
+
- `tests/unit/config-command.test.ts`
|
|
27
|
+
- Added coverage that `raf config set` warns when setting Codex `fast: true`.
|
|
28
|
+
- `AGENTS.md`
|
|
29
|
+
- Added agent note documenting Codex fast-mode warning/ignore behavior.
|
|
30
|
+
|
|
31
|
+
## Verification
|
|
32
|
+
- Build passed: `npm run -s build`
|
|
33
|
+
- Focused tests passed:
|
|
34
|
+
- `NODE_OPTIONS='--experimental-vm-modules' npx jest --watchman=false tests/unit/config.test.ts tests/unit/config-command.test.ts`
|
|
35
|
+
|
|
36
|
+
## Notes
|
|
37
|
+
- `CodexRunner` was not modified because Codex CLI currently exposes no fast-mode capability to wire.
|
|
38
|
+
<promise>COMPLETE</promise>
|
|
@@ -0,0 +1,39 @@
|
|
|
1
|
+
# Outcome: Optimize LLM prompts for clarity and conciseness
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
Optimized all three prompt files (planning.ts, execution.ts, amend.ts) to remove redundancy, clarify structure, and reduce verbosity while preserving all functional requirements.
|
|
5
|
+
|
|
6
|
+
## Key Changes Made
|
|
7
|
+
|
|
8
|
+
### `src/prompts/planning.ts`
|
|
9
|
+
- Merged "Step 4: Infer Task Dependencies" into Step 3's plan template section (dependency info was duplicated between the template, Step 4, and Important Rules)
|
|
10
|
+
- Consolidated "Important Rules" (9 items) into a compact "Rules" section (3 items) — rules 1,2,5-7,9 were redundant with earlier sections
|
|
11
|
+
- Merged "Step 2.5: Record Decisions" into Step 2 (Interview)
|
|
12
|
+
- Removed "Your Goals" section (duplicated the role description)
|
|
13
|
+
- Tightened Step 1 task identification guidance
|
|
14
|
+
|
|
15
|
+
### `src/prompts/execution.ts`
|
|
16
|
+
- Removed "Important Rules" section (8 items) — all were redundant with Steps 1-4 and Git Instructions
|
|
17
|
+
- Removed "Error Handling" section — content merged into Step 2 and Step 4
|
|
18
|
+
- Consolidated two "CRITICAL" callouts in Step 4 into streamlined outcome instructions
|
|
19
|
+
- Simplified Step 2 guidelines (removed redundant "Add appropriate error handling")
|
|
20
|
+
- Merged success/failure commit workflow into Step 4's marker section
|
|
21
|
+
|
|
22
|
+
### `src/prompts/amend.ts`
|
|
23
|
+
- Removed "Important Rules" (10 items) — rules 1-2 duplicated Amendment Mode, 7-8 duplicated template, 10 duplicated Frontmatter Requirements
|
|
24
|
+
- Consolidated into compact "Rules" section (4 items)
|
|
25
|
+
- Shortened section headings ("Protected Tasks (COMPLETED - cannot be modified)" → "Protected (COMPLETED)")
|
|
26
|
+
- Merged "Step 3.5: Record Decisions" into Step 3
|
|
27
|
+
- Tightened Step 2 follow-up task instructions
|
|
28
|
+
|
|
29
|
+
### Test Updates
|
|
30
|
+
- `tests/unit/planning-prompt.test.ts`: Updated string assertions to match new prompt wording
|
|
31
|
+
- `tests/unit/execution-prompt.test.ts`: Updated assertions for removed/consolidated sections
|
|
32
|
+
- `tests/unit/plan-command.test.ts`: Updated amend prompt assertions for shortened headings and wording
|
|
33
|
+
|
|
34
|
+
## Verification
|
|
35
|
+
- TypeScript build passes
|
|
36
|
+
- All prompt-related tests pass (85 + 40 + 7 = 132 tests)
|
|
37
|
+
- All functional requirements preserved — no behavioral changes
|
|
38
|
+
|
|
39
|
+
<promise>COMPLETE</promise>
|
|
@@ -0,0 +1,38 @@
|
|
|
1
|
+
---
|
|
2
|
+
effort: low
|
|
3
|
+
---
|
|
4
|
+
# Task: Remove spark alias
|
|
5
|
+
|
|
6
|
+
## Objective
|
|
7
|
+
Remove the incorrect `spark` model alias that maps to `gpt-5.3-codex` from the entire codebase.
|
|
8
|
+
|
|
9
|
+
## Context
|
|
10
|
+
The `spark` alias in the RAF codebase incorrectly maps to `gpt-5.3-codex`. This alias should be removed entirely rather than remapped.
|
|
11
|
+
|
|
12
|
+
## Requirements
|
|
13
|
+
- Remove `spark` from the `CodexModelAlias` type union
|
|
14
|
+
- Remove `spark` from `VALID_CODEX_MODEL_ALIASES` array
|
|
15
|
+
- Remove `spark` from `MODEL_ALIAS_TO_FULL_ID` mapping
|
|
16
|
+
- Remove `spark` from codex model tier ordering
|
|
17
|
+
- Remove `spark` from any recognition/resolution logic
|
|
18
|
+
- Remove `spark` from `config-docs.md` documentation
|
|
19
|
+
- Verify no other files reference the spark alias
|
|
20
|
+
|
|
21
|
+
## Implementation Steps
|
|
22
|
+
1. In `src/types/config.ts`:
|
|
23
|
+
- Remove `'spark'` from `CodexModelAlias` type (line ~8)
|
|
24
|
+
- Remove `'spark'` from `VALID_CODEX_MODEL_ALIASES` array (line ~129)
|
|
25
|
+
2. In `src/utils/config.ts`:
|
|
26
|
+
- Remove `spark` entry from `MODEL_ALIAS_TO_FULL_ID` (line ~577)
|
|
27
|
+
- Remove `spark` from codex tier ordering (line ~453)
|
|
28
|
+
- Remove spark from alias recognition logic (line ~547)
|
|
29
|
+
- Clean up any comments mentioning spark
|
|
30
|
+
3. In `src/prompts/config-docs.md`:
|
|
31
|
+
- Remove the `"spark"` documentation entry (line ~212)
|
|
32
|
+
4. Search for any remaining `spark` references and remove them
|
|
33
|
+
|
|
34
|
+
## Acceptance Criteria
|
|
35
|
+
- [ ] `spark` does not appear in any TypeScript source files as a model alias
|
|
36
|
+
- [ ] `spark` does not appear in config-docs.md
|
|
37
|
+
- [ ] TypeScript compiles without errors
|
|
38
|
+
- [ ] Remaining codex aliases (`codex`, `gpt54`) still work correctly
|
|
@@ -0,0 +1,33 @@
|
|
|
1
|
+
---
|
|
2
|
+
effort: low
|
|
3
|
+
---
|
|
4
|
+
# Task: Clean up worktree flag from plan command
|
|
5
|
+
|
|
6
|
+
## Objective
|
|
7
|
+
Remove the dead `worktree` option from the plan command's interface and action handler.
|
|
8
|
+
|
|
9
|
+
## Context
|
|
10
|
+
The `PlanCommandOptions` interface still declares `worktree?: boolean` and the action handler reads `options.worktree`, but there is no `--worktree` CLI flag exposed on the plan command. This is dead code that should be cleaned up.
|
|
11
|
+
|
|
12
|
+
## Requirements
|
|
13
|
+
- Remove `worktree?: boolean` from `PlanCommandOptions` interface
|
|
14
|
+
- Remove `worktreeMode` variable and its usage from the action handler
|
|
15
|
+
- Remove the `worktreeMode` parameter from `runPlanCommand()`
|
|
16
|
+
- Remove worktree-related logic inside `runPlanCommand()` (lines ~148-151 git validation, lines ~205-243 worktree creation)
|
|
17
|
+
- Remove any now-unused worktree imports
|
|
18
|
+
|
|
19
|
+
## Implementation Steps
|
|
20
|
+
1. In `src/commands/plan.ts`:
|
|
21
|
+
- Remove `worktree?: boolean` from `PlanCommandOptions` (line 56)
|
|
22
|
+
- Remove `const worktreeMode = options.worktree ?? getWorktreeDefault();` (line 74)
|
|
23
|
+
- Remove `worktreeMode` argument from `runPlanCommand()` call (line 87)
|
|
24
|
+
- Remove `worktreeMode` parameter from `runPlanCommand()` function signature (line 94)
|
|
25
|
+
- Remove worktree validation block (lines ~148-151)
|
|
26
|
+
- Remove worktree creation and path handling logic (lines ~205-243)
|
|
27
|
+
- Remove unused imports (`getWorktreeDefault`, worktree-related imports from `../core/worktree.js`)
|
|
28
|
+
2. Verify the plan command still works for normal (non-worktree) flow
|
|
29
|
+
|
|
30
|
+
## Acceptance Criteria
|
|
31
|
+
- [ ] No `worktree` references remain in plan.ts (except possibly in unrelated comments)
|
|
32
|
+
- [ ] TypeScript compiles without errors
|
|
33
|
+
- [ ] No unused imports remain
|
|
@@ -0,0 +1,33 @@
|
|
|
1
|
+
---
|
|
2
|
+
effort: medium
|
|
3
|
+
---
|
|
4
|
+
# Task: Fix token usage accumulation for multi-turn executions
|
|
5
|
+
|
|
6
|
+
## Objective
|
|
7
|
+
Merge successive `usageData` payloads from `turn.completed` events instead of overwriting them, so per-task and run-level token summaries reflect actual usage.
|
|
8
|
+
|
|
9
|
+
## Context
|
|
10
|
+
When multiple `turn.completed` events are emitted in a single `codex exec --json` or `claude --output-format stream-json` run (e.g., tool-driven multi-turn executions), the current code does `usageData = rendered.usageData` which overwrites prior usage and keeps only the last turn's tokens. This makes token summaries undercount actual usage.
|
|
11
|
+
|
|
12
|
+
## Requirements
|
|
13
|
+
- In both `codex-runner.ts` and `claude-runner.ts`, accumulate token counts across all `turn.completed` events
|
|
14
|
+
- Sum numeric fields: `inputTokens`, `outputTokens`, `cacheReadInputTokens`, `cacheCreationInputTokens`, `totalCostUsd`
|
|
15
|
+
- For `modelUsage` (if present), merge per-model entries similarly
|
|
16
|
+
- Handle the case where `usageData` is initially undefined (first event) vs subsequent events
|
|
17
|
+
|
|
18
|
+
## Implementation Steps
|
|
19
|
+
1. Create a `mergeUsageData` utility function (either inline in each runner or in a shared utility like `src/utils/token-tracker.ts`):
|
|
20
|
+
- If existing is undefined, return the new data
|
|
21
|
+
- Otherwise, sum all numeric fields from both
|
|
22
|
+
- Merge `modelUsage` maps if present
|
|
23
|
+
2. In `src/core/codex-runner.ts` (lines ~265-267 and ~287-289):
|
|
24
|
+
- Replace `usageData = rendered.usageData` with `usageData = mergeUsageData(usageData, rendered.usageData)`
|
|
25
|
+
3. In `src/core/claude-runner.ts` (lines ~387-389 and ~409-411):
|
|
26
|
+
- Replace `usageData = rendered.usageData` with `usageData = mergeUsageData(usageData, rendered.usageData)`
|
|
27
|
+
4. Check the `UsageData` interface in the types to understand all fields that need merging
|
|
28
|
+
|
|
29
|
+
## Acceptance Criteria
|
|
30
|
+
- [ ] Multi-turn executions report cumulative token counts, not just the last turn
|
|
31
|
+
- [ ] Single-turn executions still work correctly (no regression)
|
|
32
|
+
- [ ] TypeScript compiles without errors
|
|
33
|
+
- [ ] The merge function handles undefined/missing fields gracefully
|