rafcode 3.0.0 → 3.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (235) hide show
  1. package/.claude/settings.local.json +3 -1
  2. package/CLAUDE.md +0 -1
  3. package/RAF/38-dual-wielder/decisions.md +9 -0
  4. package/RAF/38-dual-wielder/input.md +6 -1
  5. package/RAF/38-dual-wielder/outcomes/8-e2e-test-codex-provider.md +139 -0
  6. package/RAF/38-dual-wielder/plans/8-e2e-test-codex-provider.md +95 -0
  7. package/RAF/39-pathless-rover/decisions.md +16 -0
  8. package/RAF/39-pathless-rover/input.md +2 -0
  9. package/RAF/39-pathless-rover/outcomes/1-fix-codex-stream-renderer.md +21 -0
  10. package/RAF/39-pathless-rover/outcomes/2-wire-provider-flag.md +28 -0
  11. package/RAF/39-pathless-rover/outcomes/3-remove-worktree-flag-do.md +41 -0
  12. package/RAF/39-pathless-rover/outcomes/4-remove-worktree-flag-plan-amend.md +30 -0
  13. package/RAF/39-pathless-rover/outcomes/5-update-prompts-and-docs.md +26 -0
  14. package/RAF/39-pathless-rover/plans/1-fix-codex-stream-renderer.md +43 -0
  15. package/RAF/39-pathless-rover/plans/2-wire-provider-flag.md +48 -0
  16. package/RAF/39-pathless-rover/plans/3-remove-worktree-flag-do.md +41 -0
  17. package/RAF/39-pathless-rover/plans/4-remove-worktree-flag-plan-amend.md +43 -0
  18. package/RAF/39-pathless-rover/plans/5-update-prompts-and-docs.md +31 -0
  19. package/RAF/40-numeric-order-fix/decisions.md +7 -0
  20. package/RAF/40-numeric-order-fix/input.md +19 -0
  21. package/RAF/40-numeric-order-fix/outcomes/1-fix-numeric-sort-order.md +18 -0
  22. package/RAF/40-numeric-order-fix/outcomes/2-add-npm-keywords.md +10 -0
  23. package/RAF/40-numeric-order-fix/plans/1-fix-numeric-sort-order.md +48 -0
  24. package/RAF/40-numeric-order-fix/plans/2-add-npm-keywords.md +23 -0
  25. package/RAF/41-echo-chamber/decisions.md +13 -0
  26. package/RAF/41-echo-chamber/input.md +4 -0
  27. package/RAF/41-echo-chamber/outcomes/1-update-codex-model-defaults.md +24 -0
  28. package/RAF/41-echo-chamber/outcomes/2-e2e-test-codex-provider.md +74 -0
  29. package/RAF/41-echo-chamber/plans/1-update-codex-model-defaults.md +28 -0
  30. package/RAF/41-echo-chamber/plans/2-e2e-test-codex-provider.md +103 -0
  31. package/RAF/42-patch-parade/decisions.md +29 -0
  32. package/RAF/42-patch-parade/input.md +9 -0
  33. package/RAF/42-patch-parade/outcomes/1-fix-codex-model-resolution.md +36 -0
  34. package/RAF/42-patch-parade/outcomes/2-fix-provider-aware-name-generation.md +31 -0
  35. package/RAF/42-patch-parade/outcomes/3-fix-codex-error-event-rendering.md +32 -0
  36. package/RAF/42-patch-parade/outcomes/4-update-cli-help-docs.md +28 -0
  37. package/RAF/42-patch-parade/outcomes/5-update-default-codex-models-to-gpt-5-4.md +33 -0
  38. package/RAF/42-patch-parade/outcomes/6-unify-model-config-schema.md +89 -0
  39. package/RAF/42-patch-parade/plans/1-fix-codex-model-resolution.md +35 -0
  40. package/RAF/42-patch-parade/plans/2-fix-provider-aware-name-generation.md +38 -0
  41. package/RAF/42-patch-parade/plans/3-fix-codex-error-event-rendering.md +32 -0
  42. package/RAF/42-patch-parade/plans/4-update-cli-help-docs.md +31 -0
  43. package/RAF/42-patch-parade/plans/5-update-default-codex-models-to-gpt-5-4.md +35 -0
  44. package/RAF/42-patch-parade/plans/6-unify-model-config-schema.md +46 -0
  45. package/RAF/43-swiss-army/decisions.md +34 -0
  46. package/RAF/43-swiss-army/input.md +7 -0
  47. package/RAF/43-swiss-army/outcomes/1-fix-model-validation.md +21 -0
  48. package/RAF/43-swiss-army/outcomes/2-update-commit-format.md +31 -0
  49. package/RAF/43-swiss-army/outcomes/3-wire-reasoning-effort.md +28 -0
  50. package/RAF/43-swiss-army/outcomes/4-remove-provider-flag.md +27 -0
  51. package/RAF/43-swiss-army/outcomes/5-config-wizard-validation.md +23 -0
  52. package/RAF/43-swiss-army/outcomes/6-add-fast-mode.md +32 -0
  53. package/RAF/43-swiss-army/outcomes/7-config-preset.md +31 -0
  54. package/RAF/43-swiss-army/plans/1-fix-model-validation.md +38 -0
  55. package/RAF/43-swiss-army/plans/2-update-commit-format.md +46 -0
  56. package/RAF/43-swiss-army/plans/3-wire-reasoning-effort.md +39 -0
  57. package/RAF/43-swiss-army/plans/4-remove-provider-flag.md +43 -0
  58. package/RAF/43-swiss-army/plans/5-config-wizard-validation.md +42 -0
  59. package/RAF/43-swiss-army/plans/6-add-fast-mode.md +46 -0
  60. package/RAF/43-swiss-army/plans/7-config-preset.md +51 -0
  61. package/RAF/44-config-api-change/decisions.md +22 -0
  62. package/RAF/44-config-api-change/input.md +5 -0
  63. package/RAF/44-config-api-change/outcomes/1-restructure-config-subcommands.md +19 -0
  64. package/RAF/44-config-api-change/outcomes/2-move-preset-under-config.md +17 -0
  65. package/RAF/44-config-api-change/outcomes/3-update-existing-tests-for-config-api.md +14 -0
  66. package/RAF/44-config-api-change/outcomes/4-update-config-command-docs.md +11 -0
  67. package/RAF/44-config-api-change/outcomes/5-fix-codex-name-generation.md +18 -0
  68. package/RAF/44-config-api-change/plans/1-restructure-config-subcommands.md +37 -0
  69. package/RAF/44-config-api-change/plans/2-move-preset-under-config.md +38 -0
  70. package/RAF/44-config-api-change/plans/3-update-existing-tests-for-config-api.md +38 -0
  71. package/RAF/44-config-api-change/plans/4-update-config-command-docs.md +36 -0
  72. package/RAF/44-config-api-change/plans/5-fix-codex-name-generation.md +49 -0
  73. package/RAF/45-signal-cairn/decisions.md +7 -0
  74. package/RAF/45-signal-cairn/input.md +2 -0
  75. package/RAF/45-signal-cairn/outcomes/1-rename-provider-to-harness.md +19 -0
  76. package/RAF/45-signal-cairn/outcomes/2-normalize-model-display-names.md +18 -0
  77. package/RAF/45-signal-cairn/plans/1-rename-provider-to-harness.md +40 -0
  78. package/RAF/45-signal-cairn/plans/2-normalize-model-display-names.md +41 -0
  79. package/RAF/45-signal-lantern/decisions.md +10 -0
  80. package/RAF/45-signal-lantern/input.md +2 -0
  81. package/RAF/45-signal-lantern/outcomes/1-add-effort-and-fast-to-do-model-display.md +15 -0
  82. package/RAF/45-signal-lantern/outcomes/2-capture-codex-post-run-token-usage.md +15 -0
  83. package/RAF/45-signal-lantern/outcomes/3-show-codex-token-summaries-without-fake-cost.md +14 -0
  84. package/RAF/45-signal-lantern/plans/1-add-effort-and-fast-to-do-model-display.md +38 -0
  85. package/RAF/45-signal-lantern/plans/2-capture-codex-post-run-token-usage.md +37 -0
  86. package/RAF/45-signal-lantern/plans/3-show-codex-token-summaries-without-fake-cost.md +40 -0
  87. package/RAF/46-lantern-arc/decisions.md +19 -0
  88. package/RAF/46-lantern-arc/input.md +6 -0
  89. package/RAF/46-lantern-arc/outcomes/1-remove-spark-alias.md +16 -0
  90. package/RAF/46-lantern-arc/outcomes/2-clean-up-worktree-plan-command.md +30 -0
  91. package/RAF/46-lantern-arc/outcomes/3-fix-token-usage-accumulation.md +32 -0
  92. package/RAF/46-lantern-arc/outcomes/4-display-effort-in-compact-mode.md +22 -0
  93. package/RAF/46-lantern-arc/outcomes/5-codex-fast-mode-research.md +38 -0
  94. package/RAF/46-lantern-arc/outcomes/6-optimize-llm-prompts.md +39 -0
  95. package/RAF/46-lantern-arc/plans/1-remove-spark-alias.md +38 -0
  96. package/RAF/46-lantern-arc/plans/2-clean-up-worktree-plan-command.md +33 -0
  97. package/RAF/46-lantern-arc/plans/3-fix-token-usage-accumulation.md +33 -0
  98. package/RAF/46-lantern-arc/plans/4-display-effort-in-compact-mode.md +28 -0
  99. package/RAF/46-lantern-arc/plans/5-codex-fast-mode-research.md +34 -0
  100. package/RAF/46-lantern-arc/plans/6-optimize-llm-prompts.md +48 -0
  101. package/RAF/47-signal-trim/decisions.md +13 -0
  102. package/RAF/47-signal-trim/input.md +2 -0
  103. package/RAF/47-signal-trim/plans/1-remove-cache-from-status.md +73 -0
  104. package/README.md +50 -63
  105. package/dist/commands/config.d.ts.map +1 -1
  106. package/dist/commands/config.js +47 -49
  107. package/dist/commands/config.js.map +1 -1
  108. package/dist/commands/do.d.ts +2 -0
  109. package/dist/commands/do.d.ts.map +1 -1
  110. package/dist/commands/do.js +91 -230
  111. package/dist/commands/do.js.map +1 -1
  112. package/dist/commands/plan.d.ts.map +1 -1
  113. package/dist/commands/plan.js +54 -259
  114. package/dist/commands/plan.js.map +1 -1
  115. package/dist/commands/preset.d.ts +3 -0
  116. package/dist/commands/preset.d.ts.map +1 -0
  117. package/dist/commands/preset.js +158 -0
  118. package/dist/commands/preset.js.map +1 -0
  119. package/dist/core/claude-runner.d.ts +2 -0
  120. package/dist/core/claude-runner.d.ts.map +1 -1
  121. package/dist/core/claude-runner.js +36 -12
  122. package/dist/core/claude-runner.js.map +1 -1
  123. package/dist/core/codex-runner.d.ts +1 -0
  124. package/dist/core/codex-runner.d.ts.map +1 -1
  125. package/dist/core/codex-runner.js +26 -7
  126. package/dist/core/codex-runner.js.map +1 -1
  127. package/dist/core/failure-analyzer.js +2 -1
  128. package/dist/core/failure-analyzer.js.map +1 -1
  129. package/dist/core/git.d.ts +2 -2
  130. package/dist/core/git.d.ts.map +1 -1
  131. package/dist/core/git.js +53 -3
  132. package/dist/core/git.js.map +1 -1
  133. package/dist/core/project-manager.d.ts.map +1 -1
  134. package/dist/core/project-manager.js +2 -2
  135. package/dist/core/project-manager.js.map +1 -1
  136. package/dist/core/pull-request.js +5 -5
  137. package/dist/core/pull-request.js.map +1 -1
  138. package/dist/core/runner-factory.d.ts +4 -4
  139. package/dist/core/runner-factory.d.ts.map +1 -1
  140. package/dist/core/runner-factory.js +8 -8
  141. package/dist/core/runner-factory.js.map +1 -1
  142. package/dist/core/runner-interface.d.ts +1 -1
  143. package/dist/core/runner-types.d.ts +17 -4
  144. package/dist/core/runner-types.d.ts.map +1 -1
  145. package/dist/core/state-derivation.js +3 -3
  146. package/dist/core/state-derivation.js.map +1 -1
  147. package/dist/parsers/codex-stream-renderer.d.ts +28 -4
  148. package/dist/parsers/codex-stream-renderer.d.ts.map +1 -1
  149. package/dist/parsers/codex-stream-renderer.js +110 -0
  150. package/dist/parsers/codex-stream-renderer.js.map +1 -1
  151. package/dist/prompts/amend.d.ts +0 -1
  152. package/dist/prompts/amend.d.ts.map +1 -1
  153. package/dist/prompts/amend.js +31 -104
  154. package/dist/prompts/amend.js.map +1 -1
  155. package/dist/prompts/execution.d.ts.map +1 -1
  156. package/dist/prompts/execution.js +17 -34
  157. package/dist/prompts/execution.js.map +1 -1
  158. package/dist/prompts/planning.d.ts.map +1 -1
  159. package/dist/prompts/planning.js +23 -123
  160. package/dist/prompts/planning.js.map +1 -1
  161. package/dist/types/config.d.ts +33 -32
  162. package/dist/types/config.d.ts.map +1 -1
  163. package/dist/types/config.js +14 -28
  164. package/dist/types/config.js.map +1 -1
  165. package/dist/utils/config.d.ts +36 -16
  166. package/dist/utils/config.d.ts.map +1 -1
  167. package/dist/utils/config.js +209 -104
  168. package/dist/utils/config.js.map +1 -1
  169. package/dist/utils/name-generator.d.ts.map +1 -1
  170. package/dist/utils/name-generator.js +25 -12
  171. package/dist/utils/name-generator.js.map +1 -1
  172. package/dist/utils/paths.d.ts +5 -0
  173. package/dist/utils/paths.d.ts.map +1 -1
  174. package/dist/utils/paths.js +9 -0
  175. package/dist/utils/paths.js.map +1 -1
  176. package/dist/utils/terminal-symbols.d.ts +15 -2
  177. package/dist/utils/terminal-symbols.d.ts.map +1 -1
  178. package/dist/utils/terminal-symbols.js +36 -4
  179. package/dist/utils/terminal-symbols.js.map +1 -1
  180. package/dist/utils/token-tracker.d.ts +6 -1
  181. package/dist/utils/token-tracker.d.ts.map +1 -1
  182. package/dist/utils/token-tracker.js +84 -51
  183. package/dist/utils/token-tracker.js.map +1 -1
  184. package/dist/utils/validation.d.ts +1 -2
  185. package/dist/utils/validation.d.ts.map +1 -1
  186. package/dist/utils/validation.js +4 -25
  187. package/dist/utils/validation.js.map +1 -1
  188. package/package.json +7 -2
  189. package/src/commands/config.ts +60 -63
  190. package/src/commands/do.ts +96 -262
  191. package/src/commands/plan.ts +55 -279
  192. package/src/commands/preset.ts +186 -0
  193. package/src/core/claude-runner.ts +45 -5
  194. package/src/core/codex-runner.ts +32 -7
  195. package/src/core/failure-analyzer.ts +2 -1
  196. package/src/core/git.ts +57 -3
  197. package/src/core/project-manager.ts +2 -1
  198. package/src/core/pull-request.ts +5 -5
  199. package/src/core/runner-factory.ts +9 -9
  200. package/src/core/runner-interface.ts +1 -1
  201. package/src/core/runner-types.ts +17 -4
  202. package/src/core/state-derivation.ts +3 -3
  203. package/src/parsers/codex-stream-renderer.ts +149 -4
  204. package/src/prompts/amend.ts +30 -105
  205. package/src/prompts/config-docs.md +206 -62
  206. package/src/prompts/execution.ts +17 -34
  207. package/src/prompts/planning.ts +23 -124
  208. package/src/types/config.ts +47 -59
  209. package/src/utils/config.ts +248 -115
  210. package/src/utils/name-generator.ts +29 -13
  211. package/src/utils/paths.ts +10 -0
  212. package/src/utils/terminal-symbols.ts +46 -6
  213. package/src/utils/token-tracker.ts +96 -57
  214. package/src/utils/validation.ts +5 -30
  215. package/tests/unit/amend-prompt.test.ts +3 -2
  216. package/tests/unit/claude-runner-interactive.test.ts +21 -3
  217. package/tests/unit/claude-runner.test.ts +39 -0
  218. package/tests/unit/codex-runner.test.ts +163 -0
  219. package/tests/unit/codex-stream-renderer.test.ts +127 -0
  220. package/tests/unit/command-output.test.ts +57 -0
  221. package/tests/unit/commit-planning-artifacts-worktree.test.ts +24 -7
  222. package/tests/unit/commit-planning-artifacts.test.ts +26 -4
  223. package/tests/unit/config-command.test.ts +215 -303
  224. package/tests/unit/config.test.ts +319 -235
  225. package/tests/unit/dependency-integration.test.ts +27 -1
  226. package/tests/unit/do-model-display.test.ts +35 -0
  227. package/tests/unit/execution-prompt.test.ts +49 -19
  228. package/tests/unit/name-generator.test.ts +82 -12
  229. package/tests/unit/plan-command-auto-flag.test.ts +7 -10
  230. package/tests/unit/plan-command.test.ts +14 -17
  231. package/tests/unit/planning-prompt.test.ts +9 -8
  232. package/tests/unit/terminal-symbols.test.ts +94 -3
  233. package/tests/unit/token-tracker.test.ts +180 -1
  234. package/tests/unit/validation.test.ts +9 -41
  235. package/tests/unit/worktree-flag-override.test.ts +0 -186
@@ -34,7 +34,9 @@
34
34
  "Bash(git add:*)",
35
35
  "WebFetch(domain:support.claude.com)",
36
36
  "WebFetch(domain:www.anthropic.com)",
37
- "Bash(ls:*)"
37
+ "Bash(ls:*)",
38
+ "Bash(git rebase:*)",
39
+ "Bash(npx tsc:*)"
38
40
  ]
39
41
  }
40
42
  }
package/CLAUDE.md CHANGED
@@ -8,4 +8,3 @@ Node.js CLI tool that orchestrates task planning and execution via Claude Code C
8
8
 
9
9
  - Keep README.md updated when adding/changing CLI commands, flags, or features
10
10
  - This app has no users. Make whatever changes you want. This project is super greenfield. It's ok if you change the schema entirely.
11
- - The role of this file is to describe common mistakes and confusion points that agents might encounter as they work in this project. If you ever encounter something in the project that surprises you, please alert the developer working with you and indicate that this is the case in the AgentMD file to help prevent future agents from having the same issue.
@@ -33,3 +33,12 @@ The existing project check is skipped in auto mode (`-y` flag). The condition `i
33
33
 
34
34
  ## Should `raf plan --amend` and auto-detect accept numeric project IDs?
35
35
  Yes. Both the explicit `--amend <id>` flow and the auto-detect prompt (`raf plan <identifier>`) should resolve numeric IDs (e.g., `raf plan --amend 38` or `raf plan 38`). Resolution should check both main-repo and worktree projects, consistent with name resolution.
36
+
37
+ ## What kind of Codex testing should be done?
38
+ E2E only — actually run `raf plan` and `raf do` with `--provider codex` against a real dummy Node.js project. No unit tests for now.
39
+
40
+ ## What scenarios should Codex E2E testing cover?
41
+ All scenarios: `raf plan --provider codex`, `raf do --provider codex`, config/model resolution, error handling, and edge cases. Sequential execution is fine (no need for parallel agents).
42
+
43
+ ## How should issues found during Codex testing be handled?
44
+ Document only. List all issues in the outcome file. User decides what to fix after reviewing.
@@ -5,4 +5,9 @@ agnostic
5
5
  ---
6
6
 
7
7
  - [ ] raf do should scan worktreee and non worktree projects
8
- - [ ] if command is like "raf plan project-name" - make sure to check if project with exact name exist (in main or worktree) and prompt to user whether he wants to amend (probably forgot to put --amend flag)
8
+ - [ ] if command is like "raf plan project-name" - make sure to check if project with exact name exist (in main or worktree) and prompt to user whether he wants to amend (probably forgot to put --amend flag)
9
+
10
+ ---
11
+
12
+ test codex by actually running raf on some dummy project folder. test as much scenarious as you
13
+ can. use team and coordinate it. if something wrong - create task to fix
@@ -0,0 +1,139 @@
1
+ # Task 8: E2E Test Codex Provider
2
+
3
+ ## Summary
4
+ Tested the Codex provider integration end-to-end by running `raf-dev` commands with `--provider codex`, exercising the runner factory, config/model resolution, JSONL stream rendering, and error handling. Found **2 critical** and **1 major** issues.
5
+
6
+ ## Test Environment
7
+ - codex-cli 0.116.0
8
+ - Node.js v22.11.0
9
+ - macOS Darwin 25.3.0
10
+ - Codex account: ChatGPT-based (not API key)
11
+
12
+ ## Test Results
13
+
14
+ ### Phase 1: Dummy Project Setup — PASS
15
+ - Created `/tmp/raf-codex-test-project/` with package.json, tsconfig.json, src/index.ts
16
+ - Initialized git repo with initial commit
17
+ - Project has intentional TODOs (input validation, negative numbers, email parsing)
18
+
19
+ ### Phase 2: Config/Model Resolution — PASS (all functions work correctly)
20
+
21
+ | Test | Expected | Actual | Status |
22
+ |------|----------|--------|--------|
23
+ | `getModel('execute', 'codex')` | `gpt-5.4` | `gpt-5.4` | PASS |
24
+ | `getModel('plan', 'codex')` | `gpt-5.3-codex` | `gpt-5.3-codex` | PASS |
25
+ | `getModel('nameGeneration', 'codex')` | `gpt-5.3-codex-spark` | `gpt-5.3-codex-spark` | PASS |
26
+ | `resolveEffortToModel('low', 'codex')` | `gpt-5.3-codex-spark` | `gpt-5.3-codex-spark` | PASS |
27
+ | `resolveEffortToModel('medium', 'codex')` | `gpt-5.3-codex` | `gpt-5.3-codex` | PASS |
28
+ | `resolveEffortToModel('high', 'codex')` | `gpt-5.4` | `gpt-5.4` | PASS |
29
+ | `parseModelSpec('codex/gpt-5.4')` | `{provider:'codex',model:'gpt-5.4'}` | Correct | PASS |
30
+ | `parseModelSpec('spark')` | `{provider:'codex',model:'spark'}` | Correct | PASS |
31
+ | `isValidModelName('gpt-5.4')` | `true` | `true` | PASS |
32
+ | `isValidModelName('codex/gpt-5.4')` | `true` | `true` | PASS |
33
+ | `resolveFullModelId('spark')` | `gpt-5.3-codex-spark` | `gpt-5.3-codex-spark` | PASS |
34
+ | `getModelShortName('gpt-5.3-codex')` | `codex` | `codex` | PASS |
35
+ | `getModelTier('spark')` | `1` | `1` | PASS |
36
+ | `getModelTier('gpt-5.4')` | `3` | `3` | PASS |
37
+
38
+ ### Phase 3: Runner Factory — PASS
39
+
40
+ | Test | Expected | Actual | Status |
41
+ |------|----------|--------|--------|
42
+ | `createRunner({provider:'codex'})` | `CodexRunner` | `CodexRunner` | PASS |
43
+ | `createRunner({provider:'claude'})` | `ClaudeRunner` | `ClaudeRunner` | PASS |
44
+ | `createRunner()` (default) | `ClaudeRunner` | `ClaudeRunner` | PASS |
45
+ | All ICliRunner methods exist on CodexRunner | 6 methods | 6 methods | PASS |
46
+ | `runResume()` throws | Error: "not supported" | Correct | PASS |
47
+ | `isRunning()` when idle | `false` | `false` | PASS |
48
+
49
+ ### Phase 4: JSONL Stream Renderer — FAIL (Critical)
50
+
51
+ | Test | Expected | Actual | Status |
52
+ |------|----------|--------|--------|
53
+ | Parse real `item.completed` (agent_message) event | display + textContent | Both empty | **FAIL** |
54
+ | Parse real `item.completed` (command_execution) event | display output | Empty | **FAIL** |
55
+ | Parse real `turn.completed` (usage) event | Capture usage | Ignored | **FAIL** |
56
+ | Parse real `error` event | Display error | Ignored | **FAIL** |
57
+ | Parse real `turn.failed` event | Display failure | Ignored | **FAIL** |
58
+
59
+ **Details**: The `codex-stream-renderer.ts` expects event types like `AgentMessage`, `CommandExecution`, `FileChange`, but the actual Codex CLI emits a completely different format:
60
+ - Real: `{"type":"item.completed","item":{"type":"agent_message","text":"..."}}`
61
+ - Expected: `{"type":"AgentMessage","content":"..."}`
62
+ - Real: `{"type":"item.completed","item":{"type":"command_execution","command":"...","exit_code":0}}`
63
+ - Expected: `{"type":"CommandExecution","command":"...","exit_code":0}`
64
+
65
+ All real events hit the `default` case and produce empty output.
66
+
67
+ ### Phase 5: CodexRunner Non-Interactive Execution — FAIL (consequence of renderer bug)
68
+
69
+ | Test | Expected | Actual | Status |
70
+ |------|----------|--------|--------|
71
+ | `run()` with gpt-5.4 (working model) | Output captured | Empty string | **FAIL** |
72
+ | `runVerbose()` with gpt-5.4 | Verbose display + output | No display, empty output | **FAIL** |
73
+ | Exit code with working model | 0 | 0 | PASS |
74
+ | Exit code with unavailable model | Non-zero | 1 | PASS |
75
+ | Timeout/contextOverflow flags | false when not triggered | false | PASS |
76
+ | usageData | undefined | undefined | PASS (by design) |
77
+
78
+ ### Phase 6: `--provider` CLI Flag — FAIL (Major)
79
+
80
+ | Test | Expected | Actual | Status |
81
+ |------|----------|--------|--------|
82
+ | `--provider codex` forwarded to `createRunner()` | provider passed | Never read from options | **FAIL** |
83
+ | `--provider codex` forwarded to `resolveEffortToModel()` | provider passed | Not passed | **FAIL** |
84
+ | `--provider codex` forwarded to `resolveModelOption()` | provider passed | Not passed | **FAIL** |
85
+
86
+ **Details**: In both `do.ts` and `plan.ts`, the `--provider` option is declared via Commander but `options.provider` is never read. All calls to `createRunner()` omit the `provider` field. All calls to `resolveEffortToModel()` and `resolveModelOption()` omit the provider. The flag is completely inert.
87
+
88
+ ### Phase 7: Error Handling — PASS (partial)
89
+
90
+ | Test | Expected | Actual | Status |
91
+ |------|----------|--------|--------|
92
+ | Missing codex binary (getCodexPath) | Error thrown | Error thrown (tested via code review) | PASS |
93
+ | `runResume()` | Throws "not supported" | Correct | PASS |
94
+ | Invalid model (codex returns error JSON) | Error surfaced | Error swallowed silently (renderer bug) | **FAIL** |
95
+ | Kill / isRunning | Work correctly | Code review confirms correct pattern | PASS |
96
+
97
+ ### Phase 8: Model Availability — INFO (environment-specific)
98
+
99
+ The configured default Codex models (`gpt-5.3-codex-spark`, `gpt-5.3-codex`) are not available on ChatGPT-based Codex accounts. Only `gpt-5.4` and the default model work. This is an environment issue, not a code bug, but the error is invisible due to the renderer bug.
100
+
101
+ ## Issues Found
102
+
103
+ ### Issue 1: CRITICAL — JSONL Stream Renderer Parses Wrong Event Format
104
+ - **File**: `src/parsers/codex-stream-renderer.ts`
105
+ - **Severity**: Critical
106
+ - **Impact**: All non-interactive Codex runs produce empty output; verbose mode shows nothing; completion detection cannot work (relies on output text)
107
+ - **Root cause**: Renderer expects event types `AgentMessage`, `CommandExecution`, etc. but real Codex CLI emits `item.completed`, `item.started`, `turn.completed`, `error`, `turn.failed` with nested `item` objects
108
+ - **Suggested fix**: Rewrite switch to handle real event types: `item.completed` → check `item.type` for `agent_message` (text in `item.text`), `command_execution` (command in `item.command`), `file_change`, etc. Also handle `error` and `turn.failed` events. Consider extracting usage data from `turn.completed` events.
109
+
110
+ ### Issue 2: CRITICAL — `--provider` Flag is a No-Op
111
+ - **File**: `src/commands/do.ts` (line 1036), `src/commands/plan.ts` (lines 289, 619, 802)
112
+ - **Severity**: Critical
113
+ - **Impact**: Users cannot actually use the Codex provider via CLI — the flag is accepted but ignored
114
+ - **Root cause**: `options.provider` is never read from the Commander options object and never passed to `createRunner()`, `resolveEffortToModel()`, or `resolveModelOption()`
115
+ - **Suggested fix**: Read `options.provider`, pass it through to all runner creation and model resolution calls. The `RunnerConfig` type already supports `provider`.
116
+
117
+ ### Issue 3: MAJOR — Codex Error Events Silently Swallowed
118
+ - **File**: `src/parsers/codex-stream-renderer.ts`
119
+ - **Severity**: Major (partially overlaps with Issue 1)
120
+ - **Impact**: When Codex reports errors (invalid model, API failures), the runner returns exit code 0-1 with empty output and no error information
121
+ - **Root cause**: `error` and `turn.failed` event types are not handled by the renderer
122
+ - **Suggested fix**: Add handlers for `error` and `turn.failed` events that capture error messages in both `display` and `textContent`
123
+
124
+ ## What Works Correctly
125
+ - Config schema and types for Codex models/effort mapping
126
+ - Model resolution functions (`getModel`, `resolveEffortToModel`, `parseModelSpec`, etc.) when called with explicit provider parameter
127
+ - Runner factory creates correct runner type for each provider
128
+ - `RunnerConfig.provider` type exists and is used by factory
129
+ - CodexRunner constructor, `kill()`, `isRunning()`, `runResume()` error
130
+ - Command construction (`codex exec --full-auto --json --ephemeral -m <model>`) uses correct flags
131
+ - Process spawning, timeout handling, PTY setup code structure
132
+ - `usageData: undefined` doesn't break downstream consumers (guarded by `if (result.usageData)`)
133
+
134
+ ## Notes
135
+ - Interactive mode (`runInteractive`) was not tested E2E because it requires PTY interaction which cannot be automated in this context. Code review shows the PTY setup follows the same pattern as ClaudeRunner.
136
+ - The `raf-dev plan --provider codex` and `raf-dev do --provider codex` commands were not tested interactively because Issue 2 makes the flag inert and Issue 1 would prevent any output capture.
137
+ - Timeout behavior was not stress-tested due to API costs, but the code structure is identical to ClaudeRunner's proven implementation.
138
+
139
+ <promise>COMPLETE</promise>
@@ -0,0 +1,95 @@
1
+ ---
2
+ effort: high
3
+ ---
4
+ # Task: E2E Test Codex Provider
5
+
6
+ ## Objective
7
+ Verify the Codex provider integration works end-to-end by running `raf-dev plan` and `raf-dev do` with `--provider codex` against a dummy Node.js project, documenting all issues found.
8
+
9
+ ## Context
10
+ This is a follow-up to task 3 (implement-codex-runner). See outcome: /Users/eremeev/projects/RAF/RAF/38-dual-wielder/outcomes/3-implement-codex-runner.md
11
+
12
+ Tasks 1-4 added Codex support (config schema, abstract runner, CodexRunner implementation, LLM-agnostic prompts) but none of this has been tested E2E against a real project. This task validates the full integration.
13
+
14
+ ## Dependencies
15
+ 3
16
+
17
+ ## Requirements
18
+ - Create a simple dummy Node.js project to use as a test target
19
+ - Run `raf-dev plan` and `raf-dev do` with `--provider codex` and verify real behavior
20
+ - Use `raf-dev` (not `raf`) for all testing — this is the development binary
21
+ - Test all major scenarios: planning, execution, config/model resolution, error handling
22
+ - Document all issues found in the outcome — do NOT auto-create fix tasks
23
+ - Sequential testing is fine (no need for parallel agents)
24
+
25
+ ## Implementation Steps
26
+
27
+ ### Phase 1: Set up dummy project
28
+
29
+ 1. Create a temporary dummy Node.js project folder (e.g., `/tmp/raf-codex-test-project/`) with:
30
+ - `package.json` with a name and basic scripts
31
+ - `src/index.ts` — a small file with a few intentional TODOs or bugs (e.g., a function that doesn't handle edge cases)
32
+ - `tsconfig.json` — basic TypeScript config
33
+ - Initialize a git repo in it (`git init && git add . && git commit`)
34
+ 2. The project should be simple enough that an LLM can meaningfully plan and execute tasks against it
35
+
36
+ ### Phase 2: Test `raf-dev plan --provider codex`
37
+
38
+ 3. Run `raf-dev plan --provider codex` targeting the dummy project
39
+ - Provide a simple input like "add input validation to the exported functions"
40
+ - Verify: Does the PTY spawn correctly? Does Codex receive the prompt?
41
+ - Verify: Are plan files generated with correct frontmatter?
42
+ - Verify: Does the interactive planning session complete without crashes?
43
+ 4. Check the generated plan files — are they well-formed? Do they have the expected structure?
44
+
45
+ ### Phase 3: Test `raf-dev do --provider codex`
46
+
47
+ 5. Run `raf-dev do --provider codex` on the planned project
48
+ - Verify: Does task execution start correctly?
49
+ - Verify: Is the `codex exec --full-auto --json --ephemeral` command constructed properly?
50
+ - Verify: Does JSONL stream output display correctly in verbose mode?
51
+ - Verify: Does completion detection work (outcome file, commit verification)?
52
+ - Verify: Does the task complete and produce an outcome file?
53
+ 6. Check the outcome files and any commits made — are they correct?
54
+
55
+ ### Phase 4: Test config/model resolution
56
+
57
+ 7. Test that `--provider codex` correctly overrides the default provider in config
58
+ 8. Test effort-based model resolution for codex:
59
+ - A plan with `effort: low` should use `gpt-5.3-codex-spark`
60
+ - A plan with `effort: medium` should use `gpt-5.3-codex`
61
+ - A plan with `effort: high` should use `gpt-5.4`
62
+ 9. Test explicit model override in plan frontmatter (e.g., `model: codex/gpt-5.4`)
63
+
64
+ ### Phase 5: Test error handling and edge cases
65
+
66
+ 10. Test what happens when `codex` binary is not in PATH (temporarily rename or use a bad path)
67
+ 11. Test timeout behavior — does a long-running task get terminated correctly?
68
+ 12. Test that `runResume` correctly throws "not supported" for Codex
69
+ 13. Test behavior with malformed/unexpected Codex output
70
+ 14. Verify that usage data being `undefined` for Codex doesn't break any display or logging code
71
+
72
+ ### Phase 6: Document results
73
+
74
+ 15. Create a comprehensive outcome document listing:
75
+ - Each scenario tested and its result (PASS/FAIL)
76
+ - Detailed description of any failures or unexpected behavior
77
+ - Severity assessment for each issue (critical/major/minor)
78
+ - Suggested fixes for each issue found
79
+
80
+ ## Acceptance Criteria
81
+ - [ ] Dummy Node.js project created and initialized with git
82
+ - [ ] `raf-dev plan --provider codex` tested and results documented
83
+ - [ ] `raf-dev do --provider codex` tested and results documented
84
+ - [ ] Config/model resolution tested for codex provider
85
+ - [ ] Error handling and edge cases tested
86
+ - [ ] Comprehensive outcome document listing all issues found with severity
87
+
88
+ ## Notes
89
+ - This task requires the `codex` CLI to be installed and available in PATH
90
+ - If Codex CLI is not available, document that as the first finding and test what you can without it (e.g., error handling for missing binary, config resolution logic)
91
+ - Focus on documenting issues clearly — the user will decide what to fix based on the outcome
92
+ - When testing interactively (raf-dev plan), you may need to provide input via the PTY — document any difficulties with this
93
+ - Check `src/core/codex-runner.ts` for the actual command construction to verify correctness
94
+ - Check `src/core/runner-factory.ts` to verify provider routing
95
+ - Check `src/utils/config.ts` for model resolution logic
@@ -0,0 +1,16 @@
1
+ # Project Decisions
2
+
3
+ ## Should the JSONL stream renderer support both old and new event formats, or replace entirely?
4
+ The "old" format is not old — it's the Claude event format. The renderer should add Codex-specific event handling (item.completed, turn.completed, etc.) alongside the existing Claude event handling. Claude should continue to work as before.
5
+
6
+ ## For removing --worktree: when creating a NEW project with raf plan, should it always create a worktree or use config default?
7
+ Config default. --worktree and --no-worktree flags should STILL be supported for `raf plan` (new project creation). They determine where the new project will be created. But for --amend and auto-amend flows, the flag should be removed — auto-detect where the project lives.
8
+
9
+ ## For raf plan --amend: if project exists in main but not worktree, auto-create worktree or amend in-place?
10
+ Amend in-place. Follow where the project lives — if in main repo, amend there; if in worktree, amend there.
11
+
12
+ ## For raf do: just remove --worktree/--no-worktree flags, or deeper refactor?
13
+ Just remove flags. The existing auto-detection logic already scans both worktree and main. Just remove the CLI flags and let auto-detection be the only path.
14
+
15
+ ## Should auto-amend detection (name collision in raf plan) scan worktrees too?
16
+ Yes, scan both main repo and worktrees for name collisions during auto-amend detection.
@@ -0,0 +1,2 @@
1
+ fix all issues in RAF/38-dual-wielder/outcomes/8-e2e-test-codex-provider.md
2
+ - [ ] raf do should be agnostic of --migration flag, scan worktree and main. remove --worktree entirely. same for --amend flow in raf plan. basically
@@ -0,0 +1,21 @@
1
+ # Task 1: Fix Codex JSONL Stream Renderer
2
+
3
+ ## Summary
4
+ Updated `codex-stream-renderer.ts` to handle the real Codex CLI event format (nested `item.completed` events) while preserving existing Claude flat-format handlers.
5
+
6
+ ## Changes Made
7
+
8
+ ### File: `src/parsers/codex-stream-renderer.ts`
9
+ - Extended `CodexEvent` interface with `item` (nested object), `message`, and `usage` fields
10
+ - Added `item.completed` handler that dispatches on `item.type`: `agent_message`, `command_execution`, `file_change`
11
+ - Added `item.started` handler (no-op, renders on completion)
12
+ - Added `turn.completed` handler that extracts and displays usage data
13
+ - Added `error` handler — outputs error message in both `display` and `textContent`
14
+ - Added `turn.failed` handler — outputs failure message in both `display` and `textContent`
15
+ - All existing Claude-format handlers (`AgentMessage`, `CommandExecution`, `FileChange`, `McpToolCall`, `TodoList`) remain untouched
16
+
17
+ ## Verification
18
+ - TypeScript compiles without errors (`npm run build`)
19
+ - All acceptance criteria met
20
+
21
+ <promise>COMPLETE</promise>
@@ -0,0 +1,28 @@
1
+ # Task 2: Wire --provider CLI Flag Through to Runner and Model Resolution
2
+
3
+ ## Summary
4
+ Wired the inert `--provider` CLI flag through to `createRunner()` and `resolveEffortToModel()` in both `do.ts` and `plan.ts`, so that `--provider codex` actually creates a `CodexRunner` and uses the codex effort mapping.
5
+
6
+ ## Changes Made
7
+
8
+ ### File: `src/commands/do.ts`
9
+ - Added `provider` field to `SingleProjectOptions` interface
10
+ - Pass `options.provider` through `executeSingleProject` to the task execution loop
11
+ - Updated `resolveTaskModel()` to accept an optional `provider` parameter and pass it to `resolveEffortToModel()`
12
+ - Pass `provider` to `createRunner({ model, provider })` in the task retry loop
13
+
14
+ ### File: `src/commands/plan.ts`
15
+ - Read `options.provider` in the action handler
16
+ - Pass `provider` through to `runPlanCommand()`, `runAmendCommand()`, and `runResumeCommand()`
17
+ - Pass `provider` to all three `createRunner({ model, provider })` call sites
18
+ - Pass `provider` through the duplicate-project amend redirect flow
19
+
20
+ ## Verification
21
+ - TypeScript compiles without errors (`npm run build`)
22
+ - All acceptance criteria met:
23
+ - `raf do --provider codex` will create a `CodexRunner` (provider flows to `createRunner`)
24
+ - `raf plan --provider codex` will create a `CodexRunner` for planning sessions
25
+ - `resolveEffortToModel` receives the provider param for codex effort mapping
26
+ - TypeScript compiles cleanly
27
+
28
+ <promise>COMPLETE</promise>
@@ -0,0 +1,41 @@
1
+ # Task 3: Remove --worktree Flag from raf do — Auto-detect Project Location
2
+
3
+ ## Summary
4
+ Removed `--worktree` and `--no-worktree` CLI flags from `raf do`. Project location (worktree vs main repo) is now always auto-detected. The combined picker and worktree-first resolution are the only code paths.
5
+
6
+ ## Changes Made
7
+
8
+ ### File: `src/commands/do.ts`
9
+ - Removed `-w, --worktree` and `--no-worktree` option lines from `createDoCommand()`
10
+ - Removed `worktreeMode` variable and all assignments to it
11
+ - Removed the early `if (worktreeMode)` block that did worktree-specific setup — the combined picker flow now handles both worktree and main projects
12
+ - Replaced `if (worktreeMode)` in resolution with `if (worktreeRoot)` — when the picker sets worktreeRoot, resolve within that worktree; otherwise auto-detect (worktree first, then main)
13
+ - Moved main branch sync (`pullMainBranch`) to after project resolution, triggered by `worktreeRoot` being set
14
+ - Replaced `worktreeMode && worktreeRoot` guards with just `worktreeRoot`
15
+ - Removed unused `discoverAndPickWorktreeProject` function
16
+ - Cleaned up unused imports: `getWorktreeDefault`, `discoverProjects`, `formatProjectChoice`, `computeWorktreePath`, `computeWorktreeBaseDir`, `validateWorktree`, `listWorktreeProjects`
17
+
18
+ ### File: `src/types/config.ts`
19
+ - Removed `worktree?: boolean` from `DoCommandOptions` interface
20
+
21
+ ### File: `src/commands/plan.ts`
22
+ - Removed `--worktree` from `raf do` suggestions in user-facing log messages (2 locations)
23
+
24
+ ### File: `src/prompts/planning.ts`
25
+ - Removed `worktreeFlag` variable and `worktreeMode` destructuring — `raf do` instruction no longer includes `--worktree`
26
+
27
+ ### File: `src/prompts/amend.ts`
28
+ - Removed `worktreeFlag` variable and `worktreeMode` destructuring — `raf do` instruction no longer includes `--worktree`
29
+
30
+ ### File: `src/prompts/config-docs.md`
31
+ - Updated `worktree` config description to note that `raf do` auto-detects regardless of this setting
32
+
33
+ ## Verification
34
+ - TypeScript compiles without errors (`npm run build`)
35
+ - All acceptance criteria met:
36
+ - `raf do` CLI no longer accepts `--worktree` or `--no-worktree`
37
+ - `raf do` (no args) shows combined picker of worktree + main projects
38
+ - `raf do <project>` auto-detects if project is in worktree or main
39
+ - Post-execution worktree actions (merge/PR/leave) still work correctly (triggered by `worktreeRoot`)
40
+
41
+ <promise>COMPLETE</promise>
@@ -0,0 +1,30 @@
1
+ # Task 4: Remove --worktree Flag from raf plan --amend — Auto-detect Project Location
2
+
3
+ ## Summary
4
+ Modified `runAmendCommand` to auto-detect whether a project lives in a worktree or main repo, removing the `worktreeMode` parameter. The `--worktree`/`--no-worktree` flags remain on the `plan` command for new project creation only.
5
+
6
+ ## Changes Made
7
+
8
+ ### File: `src/commands/plan.ts`
9
+ - Removed `worktreeMode` parameter from `runAmendCommand` signature
10
+ - Replaced the complex worktree-mode branch (manual worktree scanning, branch recreation, fresh worktree creation with file copying) with simple auto-detection: try `resolveWorktreeProjectByIdentifier()` first, fall back to `resolveProjectIdentifierWithDetails()` in main repo
11
+ - Updated both call sites (explicit `--amend` at line ~104 and auto-amend at line ~158) to no longer pass `worktreeMode`
12
+ - Removed `existingWorktreeMode` variable from auto-amend detection flow
13
+ - Removed redundant if/else for `raf do` suggestion (both branches were identical)
14
+ - Simplified `worktreeMode && worktreePath` guard to just `worktreePath`
15
+ - Removed `worktreeMode` from `getAmendPrompt()` call
16
+ - Cleaned up unused imports: `createWorktreeFromBranch`, `branchExists`, `computeWorktreeBaseDir`
17
+
18
+ ### File: `src/prompts/amend.ts`
19
+ - Removed `worktreeMode?: boolean` from `AmendPromptParams` interface
20
+
21
+ ## Verification
22
+ - TypeScript compiles without errors (`npm run build`)
23
+ - All acceptance criteria met:
24
+ - `raf plan myproject --amend` finds project in worktree without needing `--worktree` flag
25
+ - `raf plan myproject --amend` finds project in main repo without needing `--no-worktree` flag
26
+ - `raf plan myproject` (name collision) detects projects in both worktree and main
27
+ - `--worktree`/`--no-worktree` flags still work for new project creation
28
+ - TypeScript compiles without errors
29
+
30
+ <promise>COMPLETE</promise>
@@ -0,0 +1,26 @@
1
+ # Task 5: Update Prompts, Docs, and Config Docs for Removed --worktree Flag
2
+
3
+ ## Summary
4
+ Updated README.md to remove stale `--worktree` references for `raf do`. The `raf plan` command still supports `--worktree` for new project creation, so those references were preserved.
5
+
6
+ ## Changes Made
7
+
8
+ ### File: `README.md`
9
+ - Removed `raf do --worktree` and `raf do my-feature -w` examples from the `raf do` usage block
10
+ - Updated "Basic workflow" in Worktree Mode section: `raf do my-feature --worktree` → `raf do my-feature` with note "(auto-detected, no flag needed)"
11
+ - Added bullet in "How it works" section: `raf do` auto-detects whether a project lives in a worktree — no `--worktree` flag needed
12
+ - Clarified `--no-worktree` bullet to specify it applies to `raf plan` (not `raf do`)
13
+ - Removed `-w, --worktree` and `--no-worktree` rows from `raf do` command reference table
14
+
15
+ ## No Changes Needed
16
+ - `src/prompts/planning.ts` — already cleaned up in Task 3
17
+ - `src/prompts/amend.ts` — already cleaned up in Task 3
18
+ - `src/prompts/config-docs.md` — already accurate (references are specific to `raf plan --worktree`, which still has the flag)
19
+ - `src/commands/do.ts` — already cleaned up in Task 3
20
+
21
+ ## Verification
22
+ - TypeScript compiles without errors (`npm run build`)
23
+ - All remaining `--worktree` references are scoped to `raf plan` (still valid)
24
+ - No stale `--worktree` references for `raf do` or `raf plan --amend`
25
+
26
+ <promise>COMPLETE</promise>
@@ -0,0 +1,43 @@
1
+ ---
2
+ effort: medium
3
+ ---
4
+ # Task: Fix Codex JSONL Stream Renderer for Real Event Format
5
+
6
+ ## Objective
7
+ Update the Codex stream renderer to handle the actual event format emitted by Codex CLI, while preserving existing Claude event handling.
8
+
9
+ ## Context
10
+ The Codex CLI emits events in a nested format (`item.completed` with `item.type` sub-fields) but the renderer expects flat event types (`AgentMessage`, `CommandExecution`). This causes all Codex output to be silently dropped. The existing flat event types are used by Claude's renderer and must be preserved — this is NOT an "old" format, it's the Claude format.
11
+
12
+ ## Requirements
13
+ - Handle real Codex CLI event types: `item.completed`, `item.started`, `turn.completed`, `error`, `turn.failed`
14
+ - For `item.completed`, dispatch on `item.type`: `agent_message` (text in `item.text`), `command_execution` (command in `item.command`), `file_change`
15
+ - Handle `error` and `turn.failed` events — capture error messages in both `display` and `textContent`
16
+ - Extract usage data from `turn.completed` events if available
17
+ - Keep existing `AgentMessage`, `CommandExecution`, `FileChange`, `McpToolCall`, `TodoList` handlers intact — these are for Claude
18
+ - Update the `CodexEvent` interface to model the real nested event structure
19
+
20
+ ## Implementation Steps
21
+ 1. Read `src/parsers/codex-stream-renderer.ts` to understand current structure
22
+ 2. Update the `CodexEvent` interface to include nested item structure:
23
+ - Add `item?: { type: string; text?: string; command?: string; exit_code?: number; path?: string; ... }`
24
+ - Add fields for `error`, `turn.failed`, `turn.completed` events
25
+ 3. Add new cases to the switch statement for real Codex event types:
26
+ - `item.completed` → check `event.item.type` and dispatch to appropriate renderer
27
+ - `item.started` → optionally render (or skip)
28
+ - `turn.completed` → extract usage data if present
29
+ - `error` → render error message in both display and textContent
30
+ - `turn.failed` → render failure message in both display and textContent
31
+ 4. Keep all existing cases (`AgentMessage`, `CommandExecution`, etc.) untouched
32
+ 5. Build and verify no type errors
33
+
34
+ ## Acceptance Criteria
35
+ - [ ] Real Codex `item.completed` events with `agent_message` type produce text output
36
+ - [ ] Real Codex `item.completed` events with `command_execution` type show command status
37
+ - [ ] `error` and `turn.failed` events produce visible error output in both display and textContent
38
+ - [ ] Existing Claude event handlers (`AgentMessage`, `CommandExecution`, etc.) continue to work unchanged
39
+ - [ ] TypeScript compiles without errors
40
+
41
+ ## Notes
42
+ - Reference the outcomes file at `RAF/38-dual-wielder/outcomes/8-e2e-test-codex-provider.md` for exact real event JSON samples
43
+ - The `RenderResult` interface from `stream-renderer.ts` is the return type — keep using it
@@ -0,0 +1,48 @@
1
+ ---
2
+ effort: medium
3
+ ---
4
+ # Task: Wire --provider CLI Flag Through to Runner and Model Resolution
5
+
6
+ ## Objective
7
+ Make the `--provider` CLI flag actually work by reading `options.provider` and passing it to `createRunner()`, `resolveEffortToModel()`, and `resolveModelOption()`.
8
+
9
+ ## Context
10
+ Both `do.ts` and `plan.ts` declare a `--provider` option via Commander, but `options.provider` is never read. All calls to `createRunner()` omit the `provider` field, and all calls to `resolveEffortToModel()` omit the provider parameter. The flag is completely inert.
11
+
12
+ ## Dependencies
13
+ 1
14
+
15
+ ## Requirements
16
+ - Read `options.provider` from Commander options in both `do.ts` and `plan.ts`
17
+ - Pass provider to `createRunner({ model, provider })` in all call sites
18
+ - Pass provider to `resolveEffortToModel(effort, provider)` in `do.ts` (line ~125 where effort mapping happens)
19
+ - Pass provider to `resolveModelOption()` — this function may need a new parameter added to accept provider
20
+ - Ensure `resolveModelOption` in `src/utils/validation.ts` can handle provider-aware model resolution (it currently returns `ClaudeModelName` — may need to return a more general type or use the existing `parseModelSpec` logic)
21
+
22
+ ## Implementation Steps
23
+ 1. Read `src/commands/do.ts` — find all `createRunner()` and `resolveEffortToModel()` calls
24
+ 2. Read `src/commands/plan.ts` — find all `createRunner()` calls
25
+ 3. Read `src/utils/validation.ts` — understand `resolveModelOption()` signature
26
+ 4. Read `src/core/runner-factory.ts` — confirm `RunnerConfig` already has `provider` field
27
+ 5. In `do.ts`:
28
+ - Read `options.provider` at the top of `runDoCommand`
29
+ - Pass `provider` to `resolveModelOption()` (update its signature if needed)
30
+ - Pass `provider` to all `createRunner()` calls: `createRunner({ model, provider })`
31
+ - Pass `provider` to all `resolveEffortToModel()` calls
32
+ 6. In `plan.ts`:
33
+ - Read `options.provider` in the action handler
34
+ - Pass through to `runPlanCommand`, `runAmendCommand`, `runResumeCommand`
35
+ - Pass `provider` to all `createRunner()` calls
36
+ - Pass `provider` to `resolveModelOption()` if applicable
37
+ 7. Update `resolveModelOption()` signature if needed to accept optional provider
38
+ 8. Build and verify no type errors
39
+
40
+ ## Acceptance Criteria
41
+ - [ ] `raf do --provider codex` creates a `CodexRunner` instead of `ClaudeRunner`
42
+ - [ ] `raf plan --provider codex` creates a `CodexRunner` for planning sessions
43
+ - [ ] `resolveEffortToModel` uses codex effort mapping when `--provider codex` is passed
44
+ - [ ] TypeScript compiles without errors
45
+
46
+ ## Notes
47
+ - The `RunnerConfig` type in runner-factory.ts already supports `provider` — just need to pass it through
48
+ - `resolveEffortToModel` in config.ts already accepts an optional `provider` parameter — just not being called with one from the commands
@@ -0,0 +1,41 @@
1
+ ---
2
+ effort: medium
3
+ ---
4
+ # Task: Remove --worktree Flag from raf do — Auto-detect Project Location
5
+
6
+ ## Objective
7
+ Remove the `--worktree` and `--no-worktree` CLI flags from `raf do` and make project discovery always scan both worktree and main repo locations automatically.
8
+
9
+ ## Context
10
+ `raf do` already has auto-detection logic that checks worktrees first and auto-switches to worktree mode when a project is found there. The `--worktree` flag is largely redundant. Removing it simplifies the CLI and prevents user confusion. The existing scanning/auto-detection behavior becomes the only path.
11
+
12
+ ## Requirements
13
+ - Remove `-w, --worktree` and `--no-worktree` options from the Commander definition in `do.ts`
14
+ - Remove `worktreeMode` variable that reads from `options.worktree ?? getWorktreeDefault()`
15
+ - Ensure the unified project discovery flow (scanning both worktree and main, with worktree taking precedence) is always active
16
+ - The post-execution actions (merge, PR, leave) should still work when a project is in a worktree
17
+ - Error messages should not reference `--worktree` flag
18
+ - Remove `DoCommandOptions.worktree` from the options interface if it exists
19
+
20
+ ## Implementation Steps
21
+ 1. Read `src/commands/do.ts` fully to understand all worktree flag references
22
+ 2. Remove the `-w, --worktree` and `--no-worktree` option lines from `createDoCommand()`
23
+ 3. Remove `worktreeMode = options.worktree ?? getWorktreeDefault()` — instead, derive worktree mode from where the project is actually found
24
+ 4. Simplify the branching logic:
25
+ - The "if worktreeMode" block (lines ~233-268) that does early worktree-specific setup should be folded into the general flow
26
+ - The unified project picker (lines ~270-314) should always run when no identifier is provided
27
+ - When an identifier IS provided, the existing resolution logic (try worktree first, then main) should be the only path
28
+ 5. Keep all worktree execution logic intact (worktreeRoot, originalBranch, post-actions) — these are determined by where the project is found, not by a flag
29
+ 6. Update any error messages that reference `--worktree`
30
+ 7. Build and verify no type errors
31
+
32
+ ## Acceptance Criteria
33
+ - [ ] `raf do` CLI no longer accepts `--worktree` or `--no-worktree`
34
+ - [ ] `raf do` (no args) shows combined picker of worktree + main projects
35
+ - [ ] `raf do <project>` auto-detects if project is in worktree or main
36
+ - [ ] Post-execution worktree actions (merge/PR/leave) still work correctly
37
+ - [ ] TypeScript compiles without errors
38
+
39
+ ## Notes
40
+ - Be careful with the `originalBranch` recording — it's currently done early in the worktree block but still needs to happen when a worktree project is selected
41
+ - The `pullMainBranch` sync logic should still run when executing in a worktree — just triggered by project location rather than flag
@@ -0,0 +1,43 @@
1
+ ---
2
+ effort: medium
3
+ ---
4
+ # Task: Remove --worktree Flag from raf plan --amend — Auto-detect Project Location
5
+
6
+ ## Objective
7
+ Make `raf plan --amend` auto-detect whether a project lives in a worktree or main repo, removing the need for `--worktree`/`--no-worktree` flags in the amend flow. Keep the flags for NEW project creation only.
8
+
9
+ ## Context
10
+ When amending an existing project, the project already exists somewhere — either in main repo or a worktree. The tool should find it and amend in-place rather than requiring the user to specify `--worktree`. The `--worktree`/`--no-worktree` flags remain valid for `raf plan` (new project creation) since they control WHERE the project gets created.
11
+
12
+ ## Dependencies
13
+ 3
14
+
15
+ ## Requirements
16
+ - `raf plan <project> --amend` should auto-detect project location (main repo or worktree) and amend there
17
+ - `raf plan <project>` (auto-amend via name collision) should scan BOTH main repo and worktrees for matches
18
+ - The `--worktree`/`--no-worktree` flags should still exist on the `plan` command for new project creation
19
+ - When amend flow is triggered, ignore the `--worktree` flag value — always use auto-detected location
20
+ - `runAmendCommand` should no longer accept a `worktreeMode` boolean — it should determine this internally
21
+
22
+ ## Implementation Steps
23
+ 1. Read `src/commands/plan.ts` fully — understand `runAmendCommand` and the auto-amend detection in `runPlanCommand`
24
+ 2. Modify `runAmendCommand` signature to remove `worktreeMode` parameter
25
+ 3. Inside `runAmendCommand`, auto-detect project location:
26
+ - First check main repo with `resolveProjectIdentifierWithDetails()`
27
+ - Then check worktrees with `resolveWorktreeProjectByIdentifier()`
28
+ - Use whichever location has the project (worktree takes precedence if both exist, matching existing picker behavior)
29
+ 4. Update the auto-amend detection in `runPlanCommand` (lines ~122-163):
30
+ - It already scans both main and worktrees — ensure the detected `existingWorktreeMode` is passed correctly to `runAmendCommand` (or let `runAmendCommand` detect it internally)
31
+ 5. Update the call site at line ~102 where `runAmendCommand` is called with `worktreeMode`
32
+ 6. Update the call site at line ~156 where auto-amend calls `runAmendCommand`
33
+ 7. Build and verify no type errors
34
+
35
+ ## Acceptance Criteria
36
+ - [ ] `raf plan myproject --amend` finds project in worktree without needing `--worktree` flag
37
+ - [ ] `raf plan myproject --amend` finds project in main repo without needing `--no-worktree` flag
38
+ - [ ] `raf plan myproject` (name collision) detects projects in both worktree and main
39
+ - [ ] `--worktree`/`--no-worktree` flags still work for new project creation
40
+ - [ ] TypeScript compiles without errors
41
+
42
+ ## Notes
43
+ - The existing amend logic in lines ~400-700 has complex worktree resolution (recreating worktrees from branches, copying files). When auto-detecting, this should simplify: just find where the project is and amend there. No need to create worktrees for amend.