rafcode 1.1.2 → 1.3.2

Files changed (57)
  1. package/RAF/018-workflow-forge/decisions.md +13 -0
  2. package/RAF/018-workflow-forge/input.md +2 -0
  3. package/RAF/018-workflow-forge/outcomes/001-add-task-number-progress.md +61 -0
  4. package/RAF/018-workflow-forge/outcomes/002-update-plan-do-prompts.md +62 -0
  5. package/RAF/018-workflow-forge/plans/001-add-task-number-progress.md +30 -0
  6. package/RAF/018-workflow-forge/plans/002-update-plan-do-prompts.md +34 -0
  7. package/RAF/019-verbose-chronicle/decisions.md +25 -0
  8. package/RAF/019-verbose-chronicle/input.md +3 -0
  9. package/RAF/019-verbose-chronicle/outcomes/001-amend-iteration-references.md +25 -0
  10. package/RAF/019-verbose-chronicle/outcomes/002-verbose-task-name-display.md +31 -0
  11. package/RAF/019-verbose-chronicle/outcomes/003-verbose-streaming-fix.md +48 -0
  12. package/RAF/019-verbose-chronicle/outcomes/004-commit-verification-before-halt.md +56 -0
  13. package/RAF/019-verbose-chronicle/plans/001-amend-iteration-references.md +35 -0
  14. package/RAF/019-verbose-chronicle/plans/002-verbose-task-name-display.md +38 -0
  15. package/RAF/019-verbose-chronicle/plans/003-verbose-streaming-fix.md +45 -0
  16. package/RAF/019-verbose-chronicle/plans/004-commit-verification-before-halt.md +62 -0
  17. package/dist/commands/do.js +24 -15
  18. package/dist/commands/do.js.map +1 -1
  19. package/dist/core/claude-runner.d.ts +52 -1
  20. package/dist/core/claude-runner.d.ts.map +1 -1
  21. package/dist/core/claude-runner.js +195 -17
  22. package/dist/core/claude-runner.js.map +1 -1
  23. package/dist/core/git.d.ts +15 -0
  24. package/dist/core/git.d.ts.map +1 -1
  25. package/dist/core/git.js +44 -0
  26. package/dist/core/git.js.map +1 -1
  27. package/dist/parsers/stream-renderer.d.ts +42 -0
  28. package/dist/parsers/stream-renderer.d.ts.map +1 -0
  29. package/dist/parsers/stream-renderer.js +100 -0
  30. package/dist/parsers/stream-renderer.js.map +1 -0
  31. package/dist/prompts/amend.d.ts.map +1 -1
  32. package/dist/prompts/amend.js +27 -3
  33. package/dist/prompts/amend.js.map +1 -1
  34. package/dist/prompts/execution.d.ts.map +1 -1
  35. package/dist/prompts/execution.js +1 -2
  36. package/dist/prompts/execution.js.map +1 -1
  37. package/dist/prompts/planning.d.ts.map +1 -1
  38. package/dist/prompts/planning.js +16 -3
  39. package/dist/prompts/planning.js.map +1 -1
  40. package/dist/utils/terminal-symbols.d.ts +3 -2
  41. package/dist/utils/terminal-symbols.d.ts.map +1 -1
  42. package/dist/utils/terminal-symbols.js +6 -4
  43. package/dist/utils/terminal-symbols.js.map +1 -1
  44. package/package.json +1 -1
  45. package/src/commands/do.ts +25 -15
  46. package/src/core/claude-runner.ts +270 -17
  47. package/src/core/git.ts +44 -0
  48. package/src/parsers/stream-renderer.ts +139 -0
  49. package/src/prompts/amend.ts +28 -3
  50. package/src/prompts/execution.ts +1 -2
  51. package/src/prompts/planning.ts +16 -3
  52. package/src/utils/terminal-symbols.ts +7 -4
  53. package/tests/unit/claude-runner.test.ts +567 -1
  54. package/tests/unit/git-commit-helpers.test.ts +103 -0
  55. package/tests/unit/plan-command.test.ts +51 -0
  56. package/tests/unit/stream-renderer.test.ts +286 -0
  57. package/tests/unit/terminal-symbols.test.ts +20 -0
@@ -0,0 +1,13 @@
1
+ # Project Decisions
2
+
3
+ ## For the task number in progress reporting, what format would you like?
4
+ [NNN] prefix - Shows as '[001] task-name' - task ID prefix style
5
+
6
+ ## Where should the [NNN] prefix appear?
7
+ Both places - Show [NNN] prefix in both the spinner during execution and in the completion summary
8
+
9
+ ## For the 'plan' prompt update, what level of detail is acceptable?
10
+ Paths are OK if referencing previous plan/output and the project. Avoid code snippets and implementation specifics otherwise.
11
+
12
+ ## For the 'do' prompt, should Task tool and subagents be mandatory or recommended?
13
+ Mandatory - Every task execution must use Task tool and subagents, no exceptions.
@@ -0,0 +1,2 @@
1
+ - [ ] add task number to "raf do" progress reporting
2
+ - [ ] update "plan" and "do" prompts. The "plan" prompt should mention that the plan output should be high level but detailed, excluding technical details and code snippets. In "do", specify the following: "use task tool to split work and delegate to subagents task execution"
@@ -0,0 +1,61 @@
1
+ # Outcome: Add Task Number to Progress Reporting
2
+
3
+ ## Summary
4
+
5
+ Added [NNN] task number prefix to the "raf do" progress reporting in both spinner and completion summary.
6
+
7
+ ## Changes Made
8
+
9
+ ### Files Modified
10
+
11
+ 1. **src/utils/terminal-symbols.ts**
12
+ - Added optional `taskId` parameter to `formatTaskProgress` function
13
+ - Modified output format to include `[NNN]` prefix when task ID is provided
14
+ - Updated JSDoc to reflect new parameter and output format
15
+
16
+ 2. **src/commands/do.ts**
17
+ - Added `taskId` variable capture for closure in timer callback
18
+ - Updated all 4 calls to `formatTaskProgress` to pass task ID:
19
+ - Blocked task display (line 462)
20
+ - Running spinner update (line 519)
21
+ - Completed task display (line 670)
22
+ - Failed task display (line 690)
23
+
24
+ 3. **tests/unit/terminal-symbols.test.ts**
25
+ - Added 4 new tests for task ID prefix functionality:
26
+ - Running task with task ID prefix
27
+ - Completed task with task ID prefix and elapsed time
28
+ - Blocked task with task ID prefix
29
+ - Failed task with task ID prefix and elapsed time
30
+
31
+ ## Output Examples
32
+
33
+ Before:
34
+ ```
35
+ ● auth-login 1m 23s
36
+ ✓ setup-db 2m 34s
37
+ ✗ deploy 45s
38
+ ⊘ depends-on-failed 2/5
39
+ ```
40
+
41
+ After:
42
+ ```
43
+ ● [001] auth-login 1m 23s
44
+ ✓ [002] setup-db 2m 34s
45
+ ✗ [003] deploy 45s
46
+ ⊘ [004] depends-on-failed 2/5
47
+ ```
48
+
49
+ ## Acceptance Criteria
50
+
51
+ - [x] Spinner shows [NNN] prefix during task execution
52
+ - [x] Completion summary shows [NNN] prefix
53
+ - [x] Task number correctly extracted from plan filename
54
+ - [x] Formatting consistent across all progress messages
55
+
56
+ ## Test Results
57
+
58
+ - All 747 tests pass
59
+ - Build succeeds with no TypeScript errors
60
+
61
+ <promise>COMPLETE</promise>
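The before/after output in this outcome file could be produced by a formatter along the following lines. This is a minimal sketch, not the package's code: `formatTaskLine`, `TaskStatus`, and the `SYMBOLS` table are hypothetical names, and the real `formatTaskProgress` signature is not shown in this file. The only behavior taken from the source is that an optional task ID adds a `[NNN]` prefix before the task name.

```typescript
// Hypothetical status type and symbol table; the real module may differ.
type TaskStatus = "running" | "completed" | "failed" | "blocked";

const SYMBOLS: Record<TaskStatus, string> = {
  running: "●",
  completed: "✓",
  failed: "✗",
  blocked: "⊘",
};

// Assumed signature: the outcome only states that an optional `taskId` was added.
function formatTaskLine(
  status: TaskStatus,
  name: string,
  suffix: string,
  taskId?: string,
): string {
  // Omit the prefix entirely when no task ID is given (pre-change behavior).
  const prefix = taskId ? `[${taskId}] ` : "";
  return `${SYMBOLS[status]} ${prefix}${name} ${suffix}`;
}

console.log(formatTaskLine("running", "auth-login", "1m 23s", "001"));
console.log(formatTaskLine("blocked", "depends-on-failed", "2/5", "004"));
```

Keeping `taskId` optional means existing callers that pass no ID keep the old output format.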
@@ -0,0 +1,62 @@
1
+ # Outcome: Update Plan and Do Prompts
2
+
3
+ ## Summary
4
+
5
+ Updated the planning prompt to produce high-level, conceptual output and the execution prompt to mandate Task tool usage for subagent delegation.
6
+
7
+ ## Changes Made
8
+
9
+ ### Files Modified
10
+
11
+ 1. **src/prompts/planning.ts**
12
+ - Added "Plan Output Style" section with instructions for high-level output
13
+ - Explicitly prohibits code snippets and implementation details
14
+ - Allows file paths for project references (existing files, previous plans/outcomes, directories)
15
+ - Emphasizes describing WHAT needs to be done, not HOW to code it
16
+
17
+ 2. **src/prompts/execution.ts**
18
+ - Added mandatory Task tool instruction as first bullet point in Step 2
19
+ - Specifies agent types: Explore for codebase investigation, Plan for design decisions, general-purpose for implementation
20
+ - Instruction: "Use the Task tool to delegate work to subagents"
21
+
22
+ 3. **src/prompts/amend.ts**
23
+ - Added same "Plan Output Style" section as planning.ts for consistency
24
+ - Amendment mode now produces the same high-level output style as regular planning
25
+
26
+ ## New Prompt Content
27
+
28
+ ### Planning/Amend Prompt Addition
29
+ ```
30
+ ## Plan Output Style
31
+
32
+ **CRITICAL**: Plans should be HIGH-LEVEL and CONCEPTUAL:
33
+ - Describe WHAT needs to be done, not HOW to code it
34
+ - Focus on architecture, data flow, and component interactions
35
+ - NO code snippets or implementation details in plans
36
+ - File paths ARE acceptable when referencing:
37
+ - Existing project files to modify
38
+ - Previous plan/outcome files for context
39
+ - Project structure and directories
40
+ - Let the executing agent decide implementation specifics
41
+ - Plans guide the work; they don't prescribe exact code
42
+ ```
43
+
44
+ ### Execution Prompt Addition
45
+ ```
46
+ - **Use the Task tool to delegate work to subagents** - Split complex work into subtasks and use specialized agents (Explore for codebase investigation, Plan for design decisions, general-purpose for implementation)
47
+ ```
48
+
49
+ ## Acceptance Criteria
50
+
51
+ - [x] Planning prompt instructs high-level output without code snippets
52
+ - [x] Planning prompt allows file paths for project references
53
+ - [x] Execution prompt mandates Task tool usage for subagent delegation
54
+ - [x] Prompts remain clear and well-structured
55
+ - [x] Amend prompt updated for consistency
56
+
57
+ ## Test Results
58
+
59
+ - Build succeeds with no TypeScript errors
60
+ - All 747 tests pass
61
+
62
+ <promise>COMPLETE</promise>
@@ -0,0 +1,30 @@
1
+ # Task: Add Task Number to Progress Reporting
2
+
3
+ ## Objective
4
+ Add [NNN] task number prefix to the "raf do" progress reporting in both spinner and completion summary.
5
+
6
+ ## Context
7
+ Currently "raf do" shows task names during execution but doesn't show the task number (e.g., 001, 002). Adding the task number prefix helps users understand which specific task is running and correlate output with plan files.
8
+
9
+ ## Requirements
10
+ - Display task number as [NNN] prefix (e.g., [001], [002])
11
+ - Show prefix in spinner/progress message during task execution
12
+ - Show prefix in task completion/failure summary
13
+ - Format: `[001] task-name` style
14
+
15
+ ## Implementation Steps
16
+ 1. Locate the progress reporting code in the "do" command
17
+ 2. Extract task number from the task filename (e.g., "001" from "001-task-name.md")
18
+ 3. Update spinner message format to include [NNN] prefix
19
+ 4. Update completion summary format to include [NNN] prefix
20
+ 5. Test with multi-task projects to verify correct numbering
21
+
22
+ ## Acceptance Criteria
23
+ - [ ] Spinner shows [NNN] prefix during task execution
24
+ - [ ] Completion summary shows [NNN] prefix
25
+ - [ ] Task number correctly extracted from plan filename
26
+ - [ ] Formatting consistent across all progress messages
27
+
28
+ ## Notes
29
+ - Task number comes from plan filename format: `NNN-task-name.md`
30
+ - Existing progress reporting logic is in `src/commands/do.ts` or related execution files
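Step 2 of this plan (extracting the task number from a `NNN-task-name.md` filename) could be sketched as below. The function name `extractTaskNumber` and the exact regular expression are assumptions for illustration; the plan does not say how the package parses filenames.

```typescript
// Hypothetical helper: returns the three-digit task number from a plan
// filename such as "001-task-name.md", or null when the name does not match.
function extractTaskNumber(planFile: string): string | null {
  // Ignore any leading directories.
  const base = planFile.split("/").pop() ?? planFile;
  const match = /^(\d{3})-.+\.md$/.exec(base);
  return match?.[1] ?? null;
}

console.log(extractTaskNumber("plans/001-add-task-number-progress.md"));
console.log(extractTaskNumber("README.md"));
```

Returning `null` rather than throwing lets a caller fall back to a prefix-free progress line when a filename is unexpected.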
@@ -0,0 +1,34 @@
1
+ # Task: Update Plan and Do Prompts
2
+
3
+ ## Objective
4
+ Update the planning prompt to produce high-level output and the execution prompt to mandate Task tool usage for subagent delegation.
5
+
6
+ ## Context
7
+ The planning prompt needs to guide Claude to produce high-level, conceptual plans without code snippets or excessive implementation details. The execution prompt needs to instruct Claude to use the Task tool to split work and delegate to subagents for better task management.
8
+
9
+ ## Requirements
10
+ - **Plan prompt updates:**
11
+ - Instruct to produce high-level but detailed plans
12
+ - Avoid code snippets and implementation specifics
13
+ - File paths are acceptable when referencing previous plans/outputs or project structure
14
+
15
+ - **Do prompt updates:**
16
+ - Add mandatory instruction: "Use Task tool to split work and delegate to subagents for task execution"
17
+ - This applies to all task executions, not optional
18
+
19
+ ## Implementation Steps
20
+ 1. Read the current planning prompt in `src/prompts/planning.ts`
21
+ 2. Add instructions for high-level output, avoiding code snippets
22
+ 3. Read the current execution prompt in `src/prompts/execution.ts`
23
+ 4. Add mandatory Task tool / subagent delegation instruction
24
+ 5. Ensure prompt changes don't conflict with existing instructions
25
+
26
+ ## Acceptance Criteria
27
+ - [ ] Planning prompt instructs high-level output without code snippets
28
+ - [ ] Planning prompt allows file paths for project references
29
+ - [ ] Execution prompt mandates Task tool usage for subagent delegation
30
+ - [ ] Prompts remain clear and well-structured
31
+
32
+ ## Notes
33
+ - Prompt files are located in `src/prompts/` directory
34
+ - The amend prompt (`amend.ts`) may also need similar updates for consistency
@@ -0,0 +1,25 @@
1
+ # Project Decisions
2
+
3
+ ## For the amendment iteration reference feature — when a new task looks like a follow-up/fix to a previous completed task, how should the reference appear in the new plan?
4
+ Context section; make sure to include the file path to the previous task's outcome.
5
+
6
+ ## Should the iteration reference be determined automatically by the planning Claude (via prompt instructions), or should RAF code analyze task content programmatically?
7
+ Prompt-based (Recommended) — update the amend system prompt to instruct Claude to identify follow-ups and include references to previous tasks in new plans.
8
+
9
+ ## For verbose task name display — should the task name appear everywhere task ID is shown, or just in the main header?
10
+ Everywhere — show task name alongside ID in all verbose log messages, status updates, and summaries.
11
+
12
+ ## For streaming Claude output in verbose mode — is it currently broken or is there a different streaming behavior wanted?
13
+ In verbose mode, only a summary of completed work is shown, not a continuous stream of Claude's execution. Need to investigate and fix: streaming should show real-time Claude output.
14
+
15
+ ## For verifying git commits before halting Claude — how should RAF check that the commit landed?
16
+ All three checks combined: (1) HEAD changed from before task, (2) new commit message starts with RAF[project:task], and (3) outcome file is committed in git.
17
+
18
+ ## If the commit hasn't landed when the grace period expires, what should happen?
19
+ Extend grace period — keep waiting (with a maximum cap) until the commit appears or a hard timeout is reached.
20
+
21
+ ## What should the maximum cap be for the extended grace period when waiting for commit?
22
+ 180 seconds (3 minutes total max wait for commit). Current grace period is 60 seconds.
23
+
24
+ ## For the verbose streaming fix — any suspicion about what's wrong?
25
+ Investigate fully — debug the entire verbose code path end-to-end to find why streaming isn't working.
@@ -0,0 +1,3 @@
1
+ - [ ] in amendment, if new task looks like follow up or fix to previous completed task in the project - you should reference it in new plan (iteration plan)
2
+ - [ ] in 'raf do --verbose' add task name (not just "Executing task 011..."). Also stream output from Claude into the terminal in verbose mode
3
+ - [ ] raf halted before commit was done. make sure commit is there before halting claude by checking commits. see previous halt work in commit 4d3868c3ef4c607c59829e94462ffd0490d82a98
@@ -0,0 +1,25 @@
1
+ # Outcome: Amend Iteration References
2
+
3
+ ## Summary
4
+ Enhanced the amend planning prompt to include outcome file paths for completed tasks and instruct Claude to reference previous task outcomes when creating follow-up/fix plans.
5
+
6
+ ## Changes Made
7
+
8
+ ### `src/prompts/amend.ts`
9
+ - **Enhanced `existingTasksSummary`**: Completed tasks now include an `Outcome:` line with the full path to their outcome file (e.g., `Outcome: /project/outcomes/001-setup.md`). Non-completed tasks are unaffected.
10
+ - **Added follow-up task instructions**: New "Identifying Follow-up Tasks" paragraph in Step 2 instructs Claude to reference previous task outcomes in the Context section when creating follow-up, fix, or iteration tasks. Includes the exact format to use.
11
+ - **Updated plan template**: Added a placeholder line in the Context section showing the follow-up reference format.
12
+
13
+ ### `tests/unit/plan-command.test.ts`
14
+ - Added 3 new tests:
15
+ - Verifies outcome file paths appear for completed tasks in the task summary
16
+ - Verifies outcome file paths do NOT appear for pending/failed tasks
17
+ - Verifies follow-up task instructions are present in the system prompt
18
+
19
+ ## Acceptance Criteria
20
+ - [x] Amend prompt includes outcome file paths for completed tasks in the task summary
21
+ - [x] Prompt instructs Claude to identify follow-up/fix tasks and reference outcomes in Context section
22
+ - [x] Existing amend functionality is not broken (all 40 plan-command tests pass)
23
+ - [x] All tests pass (1 pre-existing failure in planning-prompt.test.ts is unrelated)
24
+
25
+ <promise>COMPLETE</promise>
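The enhanced task summary described in this outcome file might look roughly like the sketch below. Only the `Outcome:` line for completed tasks and the `{projectPath}/outcomes/{taskId}-{taskName}.md` path pattern come from the source; the `ExistingTask` shape, the `summarizeTasks` name, and the layout of the first line are assumptions.

```typescript
// Hypothetical task shape; the real `existingTasks` entries may carry more fields.
interface ExistingTask {
  id: string;
  planFile: string;
  status: "completed" | "pending" | "failed";
}

// Builds a summary in which only completed tasks get an "Outcome:" line.
function summarizeTasks(projectPath: string, tasks: ExistingTask[]): string {
  return tasks
    .map((task) => {
      const base = task.planFile.split("/").pop() ?? task.planFile;
      const name = base.replace(/\.md$/, "");
      const lines = [`- Task ${task.id}: ${name} [${task.status.toUpperCase()}]`];
      if (task.status === "completed") {
        // The outcome file is assumed to share the plan file's basename.
        lines.push(`  Outcome: ${projectPath}/outcomes/${base}`);
      }
      return lines.join("\n");
    })
    .join("\n");
}

console.log(
  summarizeTasks("/project", [
    { id: "001", planFile: "plans/001-setup.md", status: "completed" },
    { id: "002", planFile: "plans/002-api.md", status: "pending" },
  ]),
);
```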
@@ -0,0 +1,31 @@
1
+ # Outcome: Verbose Task Name Display
2
+
3
+ ## Summary
4
+ Updated all verbose mode log messages in `do.ts` to include the task name alongside the task ID, using the format `task 011 (fix-login-bug)` instead of just `task 011`.
5
+
6
+ ## Changes Made
7
+
8
+ ### `src/commands/do.ts`
9
+ - **Added `taskLabel` variable** (line 449): Computes `${task.id} (${displayName})` when the name differs from the ID, or just `${task.id}` when they're the same (to avoid redundant `001 (001)` display).
10
+ - **Updated 8 verbose log messages** to use `taskLabel`:
11
+ - Blocked task warning: `Task ${taskLabel} blocked by failed dependency: ...`
12
+ - Retry message: `Retrying task ${taskLabel} (previously failed)...`
13
+ - Force re-run message: `Re-running task ${taskLabel} (force mode)...`
14
+ - Execute message: `Executing task ${taskLabel}...`
15
+ - Retry loop message: `Retry N/M for task ${taskLabel}...`
16
+ - Completion message: `Task ${taskLabel} completed (elapsed)`
17
+ - Stash message: `Changes for task ${taskLabel} stashed as: ...`
18
+ - Failure message: `Task ${taskLabel} failed: reason (elapsed)`
19
+
20
+ ### No test changes needed
21
+ - The `formatRetryHistoryForConsole` function already supported task name display
22
+ - No existing tests assert on verbose log message format directly
23
+ - All 757 tests pass (1 pre-existing failure in planning-prompt.test.ts is unrelated)
24
+
25
+ ## Acceptance Criteria
26
+ - [x] All verbose log messages that mention a task ID also show the task name
27
+ - [x] Format is consistent across all messages (`taskLabel` pattern)
28
+ - [x] Non-verbose mode is unaffected (uses `formatTaskProgress` separately)
29
+ - [x] All tests pass
30
+
31
+ <promise>COMPLETE</promise>
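The `taskLabel` rule described in this outcome file (show the name in parentheses only when it differs from the ID) can be illustrated with a small sketch. The function wrapper `buildTaskLabel` is hypothetical; the outcome describes a variable computed inline in `do.ts`.

```typescript
// Hypothetical wrapper around the labeling rule described in the outcome.
function buildTaskLabel(taskId: string, displayName: string): string {
  // When name extraction fell back to the ID, avoid a redundant "001 (001)".
  return displayName !== taskId ? `${taskId} (${displayName})` : taskId;
}

console.log(`Executing task ${buildTaskLabel("011", "fix-login-bug")}...`);
console.log(`Executing task ${buildTaskLabel("001", "001")}...`);
```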
@@ -0,0 +1,48 @@
1
+ # Outcome: Verbose Streaming Fix
2
+
3
+ ## Summary
4
+ Fixed `runVerbose()` to stream Claude's real-time execution output (tool calls, file operations, text) by using `--output-format stream-json --verbose` instead of plain `-p` mode which only showed the final summary.
5
+
6
+ ## Root Cause
7
+ The `runVerbose()` method used the same `-p` (print) flag as `run()`, which runs Claude in non-interactive mode and only outputs the final assistant response text. This meant users saw a summary of completed work rather than real-time streaming of Claude's activity.
8
+
9
+ ## Changes Made
10
+
11
+ ### `src/parsers/stream-renderer.ts` (NEW)
12
+ - **Stream event parser**: Parses NDJSON lines from Claude CLI `stream-json` output
13
+ - **Human-readable rendering**: Converts events to user-friendly display:
14
+ - Text blocks: displayed directly
15
+ - Tool calls: descriptive one-line summaries (e.g., `→ Reading /src/main.ts`, `→ Running: npm test`)
16
+ - System/result events: suppressed (not useful for display)
17
+ - **Tool descriptions**: Custom formatting for Read, Write, Edit, Bash, Glob, Grep, WebFetch, WebSearch, TodoWrite, Task, and NotebookEdit tools
18
+ - **Text content extraction**: Returns text content separately for completion marker detection and output parsing
19
+
20
+ ### `src/core/claude-runner.ts`
21
+ - **Modified `runVerbose()` spawn args**: Added `--output-format stream-json --verbose` flags to get real-time NDJSON streaming events
22
+ - **NDJSON line buffering**: Added line buffer to handle data chunks that split across NDJSON line boundaries
23
+ - **Event rendering pipeline**: Each complete NDJSON line is parsed by `renderStreamEvent()`, display text goes to stdout, text content accumulates in `output` for completion detection and parsing
24
+ - **Preserved all existing behavior**: Timeout, context overflow detection, completion marker detection, outcome file polling, and kill mechanisms all work unchanged
25
+ - **`run()` method unchanged**: Non-verbose mode remains exactly as before
26
+
27
+ ### `tests/unit/stream-renderer.test.ts` (NEW)
28
+ - 25 tests covering all event types: system, assistant (text), assistant (tool_use), user (tool results), result
29
+ - Edge cases: empty lines, invalid JSON, empty content, unknown events
30
+ - Tool-specific rendering: all 11 supported tools tested
31
+
32
+ ### `tests/unit/claude-runner.test.ts`
33
+ - Added 4 new tests in `verbose stream-json output` describe block:
34
+ - Verifies `runVerbose()` includes `--output-format stream-json --verbose` flags
35
+ - Verifies `run()` does NOT include these flags
36
+ - Verifies NDJSON assistant events are parsed and text extracted correctly
37
+ - Verifies tool_use events don't add text to output
38
+
39
+ ## Acceptance Criteria
40
+ - [x] `raf do --verbose` shows Claude's real-time execution (tool calls, file operations, thinking)
41
+ - [x] Completion marker detection still works correctly
42
+ - [x] Timeout mechanism still functions
43
+ - [x] Context overflow detection still works
44
+ - [x] Non-verbose mode (`raf do`) is completely unaffected
45
+ - [x] Success/failure parsing still works from the captured output
46
+ - [x] All existing tests pass (786 pass, 1 pre-existing failure in planning-prompt.test.ts is unrelated)
47
+
48
+ <promise>COMPLETE</promise>
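The line buffering and rendering pipeline described in this outcome file could be sketched as follows. This is an illustration under stated assumptions, not the contents of `stream-renderer.ts`: the class and function names are hypothetical, and the event shape (`type: "assistant"` with a `message.content` array of `text` and `tool_use` blocks, and the `file_path` and `command` input keys) is an assumption about the `stream-json` format that should be checked against the Claude CLI documentation.

```typescript
// Hypothetical buffer: accumulates chunks and yields only complete NDJSON lines.
class NdjsonLineBuffer {
  private pending = "";

  push(chunk: string): string[] {
    this.pending += chunk;
    const parts = this.pending.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on "\n").
    this.pending = parts.pop() ?? "";
    return parts.filter((line) => line.trim().length > 0);
  }
}

// Assumed event shape, for illustration only.
interface ContentBlock {
  type: string;
  text?: string;
  name?: string;
  input?: Record<string, unknown>;
}
interface StreamEvent {
  type: string;
  message?: { content?: ContentBlock[] };
}
interface RenderResult {
  display: string; // what is written to the terminal
  textContent: string; // assistant text only, kept for completion detection
}

function describeTool(name: string, input: Record<string, unknown>): string {
  if (name === "Read") return `→ Reading ${String(input["file_path"])}`;
  if (name === "Bash") return `→ Running: ${String(input["command"])}`;
  return `→ Using tool: ${name}`;
}

function renderLine(line: string): RenderResult {
  const empty: RenderResult = { display: "", textContent: "" };
  let event: StreamEvent;
  try {
    event = JSON.parse(line) as StreamEvent;
  } catch {
    return empty; // invalid JSON is skipped rather than treated as fatal
  }
  // System, result, and unknown events are suppressed.
  if (!event || event.type !== "assistant") return empty;
  let display = "";
  let textContent = "";
  for (const block of event.message?.content ?? []) {
    if (block.type === "text" && block.text) {
      display += block.text + "\n";
      textContent += block.text;
    } else if (block.type === "tool_use" && block.name) {
      display += describeTool(block.name, block.input ?? {}) + "\n";
    }
  }
  return { display, textContent };
}
```

Separating `display` from `textContent` mirrors the split the outcome describes: tool summaries reach the terminal, while only assistant text is accumulated for completion marker detection.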
@@ -0,0 +1,56 @@
1
+ # Outcome: Commit Verification Before Halt
2
+
3
+ ## Summary
4
+ Added commit verification to the grace period logic so that when a COMPLETE marker is detected, the system verifies the expected git commit has actually landed before terminating. If the commit hasn't landed within the initial 60-second grace period, it extends polling up to a hard maximum of 180 seconds.
5
+
6
+ ## Changes Made
7
+
8
+ ### `src/core/git.ts`
9
+ - **`getHeadCommitHash()`**: Returns the current HEAD commit hash (or null if not in a git repo)
10
+ - **`getHeadCommitMessage()`**: Returns the HEAD commit message first line (or null)
11
+ - **`isFileCommittedInHead(filePath)`**: Checks if a file exists in HEAD's tree using `git ls-tree`
12
+
13
+ ### `src/core/claude-runner.ts`
14
+ - **New exports**: `COMPLETION_HARD_MAX_MS` (180s), `COMMIT_POLL_INTERVAL_MS` (10s), `CommitContext` interface
15
+ - **New `commitContext` field in `ClaudeRunnerOptions`**: Allows passing pre-execution HEAD hash, expected commit prefix, and outcome file path
16
+ - **`verifyCommit()` helper**: Checks all three conditions (HEAD changed, message prefix matches, outcome file committed)
17
+ - **Updated `createCompletionDetector()`**: Accepts optional `commitContext` parameter. On grace period expiry:
18
+ - If COMPLETE marker and `commitContext` provided: verifies commit before killing
19
+ - If commit not verified: starts polling every 10s up to 180s total
20
+ - If FAILED marker or no `commitContext`: kills immediately (existing behavior)
21
+ - **Both `run()` and `runVerbose()`**: Pass `commitContext` through to completion detector
22
+
23
+ ### `src/commands/do.ts`
24
+ - **Captures HEAD hash** before each task execution attempt using `getHeadCommitHash()`
25
+ - **Builds `commitContext`** with `preExecutionHead`, `expectedPrefix` (e.g., `RAF[005:001]`), and `outcomeFilePath`
26
+ - **Passes `commitContext`** to both `run()` and `runVerbose()` calls
27
+ - Gracefully handles non-git-repo case (skips commit verification)
28
+
29
+ ### `tests/unit/git-commit-helpers.test.ts` (NEW)
30
+ - 11 tests covering all three new git functions:
31
+ - `getHeadCommitHash`: normal, not-in-repo, empty output, whitespace trimming
32
+ - `getHeadCommitMessage`: normal, not-in-repo, empty output
33
+ - `isFileCommittedInHead`: file exists, file missing, not-in-repo, command failure
34
+
35
+ ### `tests/unit/claude-runner.test.ts`
36
+ - Added mock for `git.js` module (`getHeadCommitHash`, `getHeadCommitMessage`, `isFileCommittedInHead`)
37
+ - Added 7 new tests in `commit verification during grace period` describe block:
38
+ - Commit verified within initial grace period (kills normally)
39
+ - Commit found during extended polling (extends, then kills)
40
+ - Commit never lands (hard max at 180s)
41
+ - FAILED markers don't trigger commit verification
42
+ - Backward compatible without commitContext
43
+ - Verifies commit message prefix must match
44
+ - Verifies outcome file must be committed
45
+
46
+ ## Acceptance Criteria
47
+ - [x] HEAD hash is recorded before each task execution
48
+ - [x] Grace period checks for commit matching `RAF[project:task]` pattern
49
+ - [x] Grace period checks that outcome file is committed
50
+ - [x] Grace period extends up to 180 seconds if commit not found
51
+ - [x] Process is killed with a warning after 180 seconds if commit never lands
52
+ - [x] Normal flow (commit lands within 60s) is not affected
53
+ - [x] All existing tests pass (805 pass, 1 pre-existing failure in planning-prompt.test.ts is unrelated)
54
+ - [x] New tests cover: commit found within grace, commit found in extended grace, commit never found (hard timeout)
55
+
56
+ <promise>COMPLETE</promise>
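The three-condition check and the extended grace period described in this outcome file might be sketched as below. The field names of `CommitContext`, the constant values (60s, 10s, 180s), and the names `COMPLETION_HARD_MAX_MS` and `COMMIT_POLL_INTERVAL_MS` come from the source; the `GitProbe` interface, `GRACE_PERIOD_MS`, `verifyCommitSketch`, and `killTimeMs` are hypothetical names introduced so the example runs without a real repository or timers.

```typescript
interface CommitContext {
  preExecutionHead: string;
  expectedPrefix: string; // e.g. "RAF[005:001]"
  outcomeFilePath: string;
}

// Hypothetical abstraction over the three git helpers named in the outcome.
interface GitProbe {
  getHeadCommitHash(): string | null;
  getHeadCommitMessage(): string | null;
  isFileCommittedInHead(filePath: string): boolean;
}

// All three conditions must hold: HEAD moved, prefix matches, outcome file committed.
function verifyCommitSketch(ctx: CommitContext, git: GitProbe): boolean {
  const head = git.getHeadCommitHash();
  if (head === null || head === ctx.preExecutionHead) return false;
  const message = git.getHeadCommitMessage();
  if (message === null || !message.startsWith(ctx.expectedPrefix)) return false;
  return git.isFileCommittedInHead(ctx.outcomeFilePath);
}

const GRACE_PERIOD_MS = 60000; // hypothetical name for the 60s grace period
const COMMIT_POLL_INTERVAL_MS = 10000;
const COMPLETION_HARD_MAX_MS = 180000;

// Pure model of the timing: given a predicate telling whether the commit is
// verified at a given elapsed time, returns when the process would be killed.
function killTimeMs(isVerifiedAt: (elapsedMs: number) => boolean): number {
  let elapsed = GRACE_PERIOD_MS;
  while (elapsed < COMPLETION_HARD_MAX_MS && !isVerifiedAt(elapsed)) {
    elapsed = Math.min(elapsed + COMMIT_POLL_INTERVAL_MS, COMPLETION_HARD_MAX_MS);
  }
  return elapsed;
}
```

Modeling the schedule as a pure function keeps the timing rules easy to test; a real implementation would drive the same checks from timers and kill the child process at the returned point.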
@@ -0,0 +1,35 @@
1
+ # Task: Amend Iteration References
2
+
3
+ ## Objective
4
+ When amending a project, instruct the planning Claude to reference previous completed tasks (with outcome file paths) in the Context section of new plans that are follow-ups or fixes.
5
+
6
+ ## Context
7
+ Currently, amendment mode shows existing tasks with their status (COMPLETED/PENDING/FAILED) but doesn't guide Claude to treat new tasks as iterations on previous work. When a user amends a project to fix or follow up on a completed task, the new plan should reference the previous task's outcome file so the executing agent has full context about what was done before.
8
+
9
+ ## Requirements
10
+ - Update the amend system prompt in `src/prompts/amend.ts` to instruct Claude to:
11
+ - Identify when new tasks are follow-ups, fixes, or iterations of previously completed tasks
12
+ - Include a reference to the previous task's outcome file path in the Context section of the new plan
13
+ - Use the format: "This is a follow-up to task NNN. See outcome: {projectPath}/outcomes/NNN-task-name.md"
14
+ - The existing tasks summary already includes task IDs and names — enhance it to also show the outcome file path for completed tasks so Claude has the information readily available
15
+ - This is purely a prompt engineering change — no programmatic detection logic needed
16
+ - Do NOT modify the plan template structure (no new dedicated fields) — use the existing Context section
17
+
18
+ ## Implementation Steps
19
+ 1. Read the current amend prompt in `src/prompts/amend.ts`
20
+ 2. Enhance the `existingTasksSummary` generation to include outcome file paths for completed tasks
21
+ 3. Add instructions in the system prompt (Step 2: Analyze New Requirements section) telling Claude to identify follow-up/fix tasks and reference previous outcomes in the Context section
22
+ 4. Add an example in the prompt showing how to reference a previous task's outcome
23
+ 5. Update tests for the amend prompt if any exist
24
+
25
+ ## Acceptance Criteria
26
+ - [ ] Amend prompt includes outcome file paths for completed tasks in the task summary
27
+ - [ ] Prompt instructs Claude to identify follow-up/fix tasks and reference outcomes in Context section
28
+ - [ ] Existing amend functionality is not broken
29
+ - [ ] All tests pass
30
+
31
+ ## Notes
32
+ - The outcome file path follows the pattern: `{projectPath}/outcomes/{taskId}-{taskName}.md`
33
+ - Only completed tasks have meaningful outcome files to reference
34
+ - The `AmendPromptParams` interface already includes `projectPath` which can be used to construct outcome paths
35
+ - The `existingTasks` array includes `planFile` from which the task name can be extracted for outcome path construction
@@ -0,0 +1,38 @@
1
+ # Task: Verbose Task Name Display
2
+
3
+ ## Objective
4
+ Show the task name alongside the task ID in all verbose mode log messages (e.g., "Executing task 011 (fix-login-bug)..." instead of "Executing task 011...").
5
+
6
+ ## Context
7
+ Currently in verbose mode, log messages reference tasks by ID only (e.g., "Executing task 011..."). The task name is available via `extractTaskNameFromPlanFile()` and is already stored in `displayName`, but not consistently used in all log messages. Users need the task name to quickly understand what's running without cross-referencing plan files.
8
+
9
+ ## Requirements
10
+ - Add task name to ALL verbose log messages that reference a task ID, including:
11
+ - "Executing task 011..." → "Executing task 011 (fix-login-bug)..."
12
+ - "Retrying task 011 (previously failed)..." → include name
13
+ - "Re-running task 011 (force mode)..." → include name
14
+ - "Task 011 completed (2m 30s)" → include name
15
+ - "Task 011 failed: reason (2m 30s)" → include name
16
+ - "Task 011 blocked by failed dependency: 003" → include name
17
+ - Retry messages
18
+ - Changes stashed messages
19
+ - The task name is already computed as `displayName` in `do.ts` — use it consistently
20
+ - Also update the `logger.setContext()` call if it's being used (currently no-op but may be restored)
21
+ - Update the verbose summary section to include task names where applicable
22
+
23
+ ## Implementation Steps
24
+ 1. Read `src/commands/do.ts` and identify all verbose log messages that reference task IDs
25
+ 2. Update each message to include `displayName` in parentheses after the task ID
26
+ 3. Ensure the format is consistent: `task ${task.id} (${displayName})` everywhere
27
+ 4. Update tests if any test verbose output format
28
+
29
+ ## Acceptance Criteria
30
+ - [ ] All verbose log messages that mention a task ID also show the task name
31
+ - [ ] Format is consistent across all messages
32
+ - [ ] Non-verbose mode is unaffected
33
+ - [ ] All tests pass
34
+
35
+ ## Notes
36
+ - `displayName` is already computed at line 447 of `do.ts` as `taskName ?? task.id`
37
+ - When `displayName` equals `task.id` (name extraction failed), showing it in parens would be redundant — consider only showing parens when name differs from ID
38
+ - The `setContext` and `clearContext` methods on logger are currently no-ops (deprecated) but the calls remain in do.ts
@@ -0,0 +1,45 @@
1
+ # Task: Verbose Streaming Fix
2
+
3
+ ## Objective
4
+ Investigate and fix why verbose mode only shows a summary of completed work instead of streaming Claude's real-time execution output.
5
+
6
+ ## Context
7
+ The `runVerbose()` method in `claude-runner.ts` does call `process.stdout.write(text)` on stdout data, but users report only seeing a summary rather than a real-time stream. The likely root cause is that Claude CLI is invoked with the `-p` (print/pipe) flag, which runs in non-interactive mode and only outputs the final assistant response text — not the intermediate tool calls, file reads, code edits, and thinking steps that constitute the real-time execution flow.
8
+
9
+ The non-verbose `run()` method correctly uses `-p` since it only needs the final output for parsing. But `runVerbose()` needs a different approach to show Claude's work as it happens.
10
+
11
+ ## Requirements
12
+ - Investigate the exact cause of why streaming doesn't show real-time Claude activity
13
+ - Fix `runVerbose()` to stream Claude's real-time execution output (tool calls, file operations, code writing, etc.)
14
+ - The fix likely involves either:
15
+   - Removing `-p` flag and using a different mode that streams intermediate output
16
+   - Using `--output-format stream-json` to get streaming JSON events and rendering them
17
+   - Using PTY-based execution (like `runInteractive`) but without stdin interaction
18
+ - The completion marker detection must still work with the new approach
19
+ - Output parsing for success/failure must still function correctly
20
+ - The timeout mechanism must still work
21
+ - Non-verbose mode must remain unchanged
22
+
23
+ ## Implementation Steps
24
+ 1. Investigate what Claude CLI output modes are available (check `claude --help` or documentation)
25
+ 2. Determine the best approach for streaming real-time output while maintaining completion detection
26
+ 3. Modify `runVerbose()` to use the chosen streaming approach
27
+ 4. Ensure completion markers can still be detected in the new output format
28
+ 5. Test that timeout, context overflow detection, and kill mechanisms still work
29
+ 6. Verify output can still be parsed for success/failure determination
30
+
31
+ ## Acceptance Criteria
32
+ - [ ] `raf do --verbose` shows Claude's real-time execution (tool calls, file operations, thinking)
33
+ - [ ] Completion marker detection still works correctly
34
+ - [ ] Timeout mechanism still functions
35
+ - [ ] Context overflow detection still works
36
+ - [ ] Non-verbose mode (`raf do`) is completely unaffected
37
+ - [ ] Success/failure parsing still works from the captured output
38
+ - [ ] All existing tests pass (update as needed for new implementation)
39
+
40
+ ## Notes
41
+ - The `run()` method must NOT be changed — only `runVerbose()` needs modification
42
+ - Claude CLI streaming options may include `--output-format stream-json` which outputs JSON events
43
+ - If using stream-json, a renderer/formatter will be needed to display events in a human-readable way
44
+ - PTY-based approach (similar to `runInteractive`) could work but would need `--dangerously-skip-permissions` to avoid interactive prompts
45
+ - The `activeProcess` tracking needs to work with whichever approach is chosen for the shutdown handler
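If the `stream-json` option is chosen, the renderer mentioned in the notes could look roughly like the sketch below. The event shape is an assumption for illustration (newline-delimited JSON where assistant events carry `text` and `tool_use` content blocks) and should be checked against the CLI's actual output. `createLineBuffer` and `renderLine` are hypothetical names.

```typescript
// Assumed event shape; verify against real CLI output before relying on it.
interface ContentBlock {
  type: string;
  text?: string;
  name?: string;
}

interface StreamEvent {
  type?: string;
  message?: { content?: ContentBlock[] };
}

// Buffers stdout chunks and returns only complete, non-empty lines.
function createLineBuffer(): (chunk: string) => string[] {
  let pending = "";
  return (chunk: string): string[] => {
    pending += chunk;
    const lines = pending.split("\n");
    pending = lines.pop() ?? ""; // keep the trailing partial line for the next chunk
    return lines.filter((line) => line.trim() !== "");
  };
}

// Turns one JSON line into human-readable text; returns "" for events not shown.
function renderLine(line: string): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return line; // not JSON: pass the raw text through
  }
  if (typeof parsed !== "object" || parsed === null) return line;
  const event = parsed as StreamEvent;
  const blocks = event.message?.content;
  if (event.type !== "assistant" || !Array.isArray(blocks)) return "";
  return blocks
    .map((block) => {
      if (block.type === "text") return block.text ?? "";
      if (block.type === "tool_use") return `[tool: ${block.name ?? "unknown"}]`;
      return "";
    })
    .filter((part) => part !== "")
    .join("\n");
}
```

Under this approach, completion-marker detection would run on the rendered text rather than on the raw JSON, so the existing marker check would need to be fed the output of `renderLine`.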
@@ -0,0 +1,62 @@
1
+ # Task: Commit Verification Before Halt
2
+
3
+ ## Objective
4
+ Before halting Claude after detecting a completion marker, verify that the expected git commit has actually been made, extending the grace period if the commit hasn't landed yet.
5
+
6
+ ## Context
7
+ The current halt mechanism in `claude-runner.ts` detects completion markers (`<promise>COMPLETE/FAILED</promise>`) and starts a fixed 60-second grace period before killing Claude. However, if Claude hasn't finished its git commit within that window, the process gets killed mid-commit, potentially leaving the repository in a broken state. The previous halt work was introduced in commit `4d3868c`. This task adds commit verification so that the grace period ends only once the commit is confirmed or a hard maximum is reached.
8
+
9
+ See previous halt work: commit `4d3868c3ef4c607c59829e94462ffd0490d82a98`
10
+
11
+ ## Dependencies
12
+ 003
13
+
14
+ ## Requirements
15
+ - Record the HEAD commit hash before task execution begins
16
+ - After the grace period triggers, verify the commit landed by checking ALL three conditions:
17
+   1. HEAD has changed from the pre-execution hash
18
+   2. The new HEAD commit message starts with the expected `RAF[project:task]` pattern
19
+   3. The outcome file is tracked in git (committed, not just on disk)
20
+ - If the commit hasn't landed when the initial grace period (60s) expires:
21
+   - Extend the grace period by polling for the commit
22
+   - Continue extending until either the commit is confirmed or a hard maximum of 180 seconds is reached
23
+   - Poll at a reasonable interval (e.g., every 10 seconds)
24
+   - If the hard maximum is reached without commit confirmation, kill the process and log a warning
25
+ - Add git helper functions to `src/core/git.ts` for:
26
+   - Getting the current HEAD hash
27
+   - Checking if a commit message matches a pattern
28
+   - Checking if a file is committed in HEAD
29
+ - The completion detector factory (`createCompletionDetector`) needs additional parameters:
30
+   - Pre-execution HEAD hash
31
+   - Expected commit message prefix (e.g., `RAF[005:001]`)
32
+   - Outcome file path (already available)
33
+ - Pass the necessary context from `do.ts` when creating the completion detector
34
+
35
+ ## Implementation Steps
36
+ 1. Add new git utility functions to `src/core/git.ts`:
37
+    - `getHeadCommitHash()`: returns current HEAD hash
38
+    - `getHeadCommitMessage()`: returns HEAD commit message
39
+    - `isFileCommittedInHead(filePath)`: checks if file is in HEAD commit
40
+ 2. Update `createCompletionDetector` in `claude-runner.ts` to accept commit verification parameters
41
+ 3. Modify the grace period logic: instead of a single `setTimeout`, use an interval that checks for the commit
42
+ 4. If initial 60s grace period expires without commit, continue polling up to 180s total
43
+ 5. Update `do.ts` to capture HEAD hash before task execution and pass commit context to the runner
44
+ 6. Add unit tests for the new git functions
45
+ 7. Add unit tests for the extended grace period behavior
46
+
47
+ ## Acceptance Criteria
48
+ - [ ] HEAD hash is recorded before each task execution
49
+ - [ ] Grace period checks for commit matching `RAF[project:task]` pattern
50
+ - [ ] Grace period checks that outcome file is committed
51
+ - [ ] Grace period extends up to 180 seconds if commit not found
52
+ - [ ] Process is killed with a warning after 180 seconds if commit never lands
53
+ - [ ] Normal flow (commit lands within 60s) is not affected
54
+ - [ ] All existing tests pass
55
+ - [ ] New tests cover: commit found within grace, commit found in extended grace, commit never found (hard timeout)
56
+
57
+ ## Notes
58
+ - The `ClaudeRunnerOptions` interface may need new fields for the commit context
59
+ - On task failure, Claude does NOT commit (changes are stashed) — the commit check should only apply when a COMPLETE marker is detected, not FAILED
60
+ - The project number and task ID are already available in `do.ts` where the runner is called
61
+ - Use `execSync` for git commands (consistent with existing `git.ts` functions)
62
+ - Handle "not in git repo" gracefully — skip commit verification if not in a git repo
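The grace-period rules above can be expressed as a pure decision function that a polling interval calls on every tick. This is a sketch under assumptions, not the shipped implementation: `CommitSnapshot`, `CommitExpectation`, `isCommitConfirmed`, and `nextGraceAction` are hypothetical names, and the snapshot is assumed to be filled by the `git.ts` helpers the plan proposes. Only the 60-second and 180-second limits and the three commit conditions come from the plan.

```typescript
// Hypothetical types: the state of HEAD at one poll, and what the task expects.
interface CommitSnapshot {
  head: string | null;      // current HEAD hash, null if unavailable
  message: string | null;   // HEAD commit message
  outcomeTracked: boolean;  // outcome file is committed in HEAD
}

interface CommitExpectation {
  preExecutionHead: string | null; // HEAD hash recorded before the task ran
  messagePrefix: string;           // e.g. "RAF[005:001]"
}

// All three conditions from the requirements must hold.
function isCommitConfirmed(snap: CommitSnapshot, exp: CommitExpectation): boolean {
  return (
    snap.head !== null &&
    snap.head !== exp.preExecutionHead &&
    snap.message !== null &&
    snap.message.startsWith(exp.messagePrefix) &&
    snap.outcomeTracked
  );
}

type GraceAction = "wait" | "halt" | "halt-with-warning";

const INITIAL_GRACE_MS = 60_000; // fixed grace period from the existing halt logic
const HARD_MAX_MS = 180_000;     // hard maximum from the requirements

// Decides what to do at a given time since the completion marker was seen.
function nextGraceAction(
  elapsedMs: number,
  marker: "COMPLETE" | "FAILED",
  commitConfirmed: boolean,
): GraceAction {
  if (elapsedMs < INITIAL_GRACE_MS) return "wait";
  // FAILED tasks do not commit, so no verification applies to them.
  if (marker === "FAILED" || commitConfirmed) return "halt";
  return elapsedMs >= HARD_MAX_MS ? "halt-with-warning" : "wait";
}
```

A caller outside a git repository could pass `commitConfirmed = true` to skip verification, which matches the last note. Keeping the decision separate from the timer also makes the three test scenarios in the acceptance criteria straightforward to cover without real delays.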