rafcode 1.1.2 → 1.3.2
- package/RAF/018-workflow-forge/decisions.md +13 -0
- package/RAF/018-workflow-forge/input.md +2 -0
- package/RAF/018-workflow-forge/outcomes/001-add-task-number-progress.md +61 -0
- package/RAF/018-workflow-forge/outcomes/002-update-plan-do-prompts.md +62 -0
- package/RAF/018-workflow-forge/plans/001-add-task-number-progress.md +30 -0
- package/RAF/018-workflow-forge/plans/002-update-plan-do-prompts.md +34 -0
- package/RAF/019-verbose-chronicle/decisions.md +25 -0
- package/RAF/019-verbose-chronicle/input.md +3 -0
- package/RAF/019-verbose-chronicle/outcomes/001-amend-iteration-references.md +25 -0
- package/RAF/019-verbose-chronicle/outcomes/002-verbose-task-name-display.md +31 -0
- package/RAF/019-verbose-chronicle/outcomes/003-verbose-streaming-fix.md +48 -0
- package/RAF/019-verbose-chronicle/outcomes/004-commit-verification-before-halt.md +56 -0
- package/RAF/019-verbose-chronicle/plans/001-amend-iteration-references.md +35 -0
- package/RAF/019-verbose-chronicle/plans/002-verbose-task-name-display.md +38 -0
- package/RAF/019-verbose-chronicle/plans/003-verbose-streaming-fix.md +45 -0
- package/RAF/019-verbose-chronicle/plans/004-commit-verification-before-halt.md +62 -0
- package/dist/commands/do.js +24 -15
- package/dist/commands/do.js.map +1 -1
- package/dist/core/claude-runner.d.ts +52 -1
- package/dist/core/claude-runner.d.ts.map +1 -1
- package/dist/core/claude-runner.js +195 -17
- package/dist/core/claude-runner.js.map +1 -1
- package/dist/core/git.d.ts +15 -0
- package/dist/core/git.d.ts.map +1 -1
- package/dist/core/git.js +44 -0
- package/dist/core/git.js.map +1 -1
- package/dist/parsers/stream-renderer.d.ts +42 -0
- package/dist/parsers/stream-renderer.d.ts.map +1 -0
- package/dist/parsers/stream-renderer.js +100 -0
- package/dist/parsers/stream-renderer.js.map +1 -0
- package/dist/prompts/amend.d.ts.map +1 -1
- package/dist/prompts/amend.js +27 -3
- package/dist/prompts/amend.js.map +1 -1
- package/dist/prompts/execution.d.ts.map +1 -1
- package/dist/prompts/execution.js +1 -2
- package/dist/prompts/execution.js.map +1 -1
- package/dist/prompts/planning.d.ts.map +1 -1
- package/dist/prompts/planning.js +16 -3
- package/dist/prompts/planning.js.map +1 -1
- package/dist/utils/terminal-symbols.d.ts +3 -2
- package/dist/utils/terminal-symbols.d.ts.map +1 -1
- package/dist/utils/terminal-symbols.js +6 -4
- package/dist/utils/terminal-symbols.js.map +1 -1
- package/package.json +1 -1
- package/src/commands/do.ts +25 -15
- package/src/core/claude-runner.ts +270 -17
- package/src/core/git.ts +44 -0
- package/src/parsers/stream-renderer.ts +139 -0
- package/src/prompts/amend.ts +28 -3
- package/src/prompts/execution.ts +1 -2
- package/src/prompts/planning.ts +16 -3
- package/src/utils/terminal-symbols.ts +7 -4
- package/tests/unit/claude-runner.test.ts +567 -1
- package/tests/unit/git-commit-helpers.test.ts +103 -0
- package/tests/unit/plan-command.test.ts +51 -0
- package/tests/unit/stream-renderer.test.ts +286 -0
- package/tests/unit/terminal-symbols.test.ts +20 -0
@@ -0,0 +1,13 @@
+# Project Decisions
+
+## For the task number in progress reporting, what format would you like?
+[NNN] prefix - Shows as '[001] task-name' - task ID prefix style
+
+## Where should the [NNN] prefix appear?
+Both places - Show [NNN] prefix in both the spinner during execution and in the completion summary
+
+## For the 'plan' prompt update, what level of detail is acceptable?
+Paths are OK if referencing previous plan/output and the project. Avoid code snippets and implementation specifics otherwise.
+
+## For the 'do' prompt, should Task tool and subagents be mandatory or recommended?
+Mandatory - Every task execution must use Task tool and subagents, no exceptions.
@@ -0,0 +1,2 @@
+- [ ] add task number to "raf do" progress reporting
+- [ ] update "plan" and "do" prompts. "plan" prompt should mention that the plan output should be high level but detailed, excluding technical details and code snippets. in "do" specify the following: "use task tool to split work and delegate task execution to subagents"
@@ -0,0 +1,61 @@
+# Outcome: Add Task Number to Progress Reporting
+
+## Summary
+
+Added [NNN] task number prefix to the "raf do" progress reporting in both spinner and completion summary.
+
+## Changes Made
+
+### Files Modified
+
+1. **src/utils/terminal-symbols.ts**
+   - Added optional `taskId` parameter to `formatTaskProgress` function
+   - Modified output format to include `[NNN]` prefix when task ID is provided
+   - Updated JSDoc to reflect new parameter and output format
+
+2. **src/commands/do.ts**
+   - Added `taskId` variable capture for closure in timer callback
+   - Updated all 4 calls to `formatTaskProgress` to pass task ID:
+     - Blocked task display (line 462)
+     - Running spinner update (line 519)
+     - Completed task display (line 670)
+     - Failed task display (line 690)
+
+3. **tests/unit/terminal-symbols.test.ts**
+   - Added 4 new tests for task ID prefix functionality:
+     - Running task with task ID prefix
+     - Completed task with task ID prefix and elapsed time
+     - Blocked task with task ID prefix
+     - Failed task with task ID prefix and elapsed time
+
+## Output Examples
+
+Before:
+```
+● auth-login 1m 23s
+✓ setup-db 2m 34s
+✗ deploy 45s
+⊘ depends-on-failed 2/5
+```
+
+After:
+```
+● [001] auth-login 1m 23s
+✓ [002] setup-db 2m 34s
+✗ [003] deploy 45s
+⊘ [004] depends-on-failed 2/5
+```
+
+## Acceptance Criteria
+
+- [x] Spinner shows [NNN] prefix during task execution
+- [x] Completion summary shows [NNN] prefix
+- [x] Task number correctly extracted from plan filename
+- [x] Formatting consistent across all progress messages
+
+## Test Results
+
+- All 747 tests pass
+- Build succeeds with no TypeScript errors
+
+<promise>COMPLETE</promise>
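The prefix behavior described in this outcome file can be illustrated with a minimal sketch. The real `formatTaskProgress` signature is not visible in this part of the diff, so the function names, parameter order, and status type below are assumptions made for illustration only; the symbols and the `[NNN]` format are taken from the "Output Examples" section.

```typescript
// Hypothetical status type; the real module may model this differently.
type TaskStatusSketch = "running" | "completed" | "failed" | "blocked";

// Symbols copied from the outcome file's output examples.
const STATUS_SYMBOLS: Record<TaskStatusSketch, string> = {
  running: "●",
  completed: "✓",
  failed: "✗",
  blocked: "⊘",
};

// Pull the three-digit task number out of a plan file name such as
// "001-task-name.md". Any directory part is ignored.
function extractTaskIdSketch(planFile: string): string | undefined {
  const baseName = planFile.split("/").pop() ?? planFile;
  const match = /^(\d{3})-/.exec(baseName);
  return match ? match[1] : undefined;
}

// Format one progress line. The "[NNN] " prefix is added only when a
// task ID is supplied, so callers that omit it keep the old output.
function formatTaskProgressSketch(
  status: TaskStatusSketch,
  name: string,
  detail: string,
  taskId?: string,
): string {
  const prefix = taskId ? `[${taskId}] ` : "";
  return `${STATUS_SYMBOLS[status]} ${prefix}${name} ${detail}`;
}
```

Keeping the task ID optional is one way to stay backward compatible: the "Before" format falls out of the same function when no ID is passed.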
@@ -0,0 +1,62 @@
+# Outcome: Update Plan and Do Prompts
+
+## Summary
+
+Updated the planning prompt to produce high-level, conceptual output and the execution prompt to mandate Task tool usage for subagent delegation.
+
+## Changes Made
+
+### Files Modified
+
+1. **src/prompts/planning.ts**
+   - Added "Plan Output Style" section with instructions for high-level output
+   - Explicitly prohibits code snippets and implementation details
+   - Allows file paths for project references (existing files, previous plans/outcomes, directories)
+   - Emphasizes describing WHAT needs to be done, not HOW to code it
+
+2. **src/prompts/execution.ts**
+   - Added mandatory Task tool instruction as first bullet point in Step 2
+   - Specifies agent types: Explore for codebase investigation, Plan for design decisions, general-purpose for implementation
+   - Instruction: "Use the Task tool to delegate work to subagents"
+
+3. **src/prompts/amend.ts**
+   - Added same "Plan Output Style" section as planning.ts for consistency
+   - Amendment mode now produces the same high-level output style as regular planning
+
+## New Prompt Content
+
+### Planning/Amend Prompt Addition
+```
+## Plan Output Style
+
+**CRITICAL**: Plans should be HIGH-LEVEL and CONCEPTUAL:
+- Describe WHAT needs to be done, not HOW to code it
+- Focus on architecture, data flow, and component interactions
+- NO code snippets or implementation details in plans
+- File paths ARE acceptable when referencing:
+  - Existing project files to modify
+  - Previous plan/outcome files for context
+  - Project structure and directories
+- Let the executing agent decide implementation specifics
+- Plans guide the work; they don't prescribe exact code
+```
+
+### Execution Prompt Addition
+```
+- **Use the Task tool to delegate work to subagents** - Split complex work into subtasks and use specialized agents (Explore for codebase investigation, Plan for design decisions, general-purpose for implementation)
+```
+
+## Acceptance Criteria
+
+- [x] Planning prompt instructs high-level output without code snippets
+- [x] Planning prompt allows file paths for project references
+- [x] Execution prompt mandates Task tool usage for subagent delegation
+- [x] Prompts remain clear and well-structured
+- [x] Amend prompt updated for consistency
+
+## Test Results
+
+- Build succeeds with no TypeScript errors
+- All 747 tests pass
+
+<promise>COMPLETE</promise>
@@ -0,0 +1,30 @@
+# Task: Add Task Number to Progress Reporting
+
+## Objective
+Add [NNN] task number prefix to the "raf do" progress reporting in both spinner and completion summary.
+
+## Context
+Currently "raf do" shows task names during execution but doesn't show the task number (e.g., 001, 002). Adding the task number prefix helps users understand which specific task is running and correlate output with plan files.
+
+## Requirements
+- Display task number as [NNN] prefix (e.g., [001], [002])
+- Show prefix in spinner/progress message during task execution
+- Show prefix in task completion/failure summary
+- Format: `[001] task-name` style
+
+## Implementation Steps
+1. Locate the progress reporting code in the "do" command
+2. Extract task number from the task filename (e.g., "001" from "001-task-name.md")
+3. Update spinner message format to include [NNN] prefix
+4. Update completion summary format to include [NNN] prefix
+5. Test with multi-task projects to verify correct numbering
+
+## Acceptance Criteria
+- [ ] Spinner shows [NNN] prefix during task execution
+- [ ] Completion summary shows [NNN] prefix
+- [ ] Task number correctly extracted from plan filename
+- [ ] Formatting consistent across all progress messages
+
+## Notes
+- Task number comes from plan filename format: `NNN-task-name.md`
+- Existing progress reporting logic is in `src/commands/do.ts` or related execution files
@@ -0,0 +1,34 @@
+# Task: Update Plan and Do Prompts
+
+## Objective
+Update the planning prompt to produce high-level output and the execution prompt to mandate Task tool usage for subagent delegation.
+
+## Context
+The planning prompt needs to guide Claude to produce high-level, conceptual plans without code snippets or excessive implementation details. The execution prompt needs to instruct Claude to use the Task tool to split work and delegate to subagents for better task management.
+
+## Requirements
+- **Plan prompt updates:**
+  - Instruct to produce high-level but detailed plans
+  - Avoid code snippets and implementation specifics
+  - File paths are acceptable when referencing previous plans/outputs or project structure
+
+- **Do prompt updates:**
+  - Add mandatory instruction: "Use Task tool to split work and delegate to subagents for task execution"
+  - This applies to all task executions, not optional
+
+## Implementation Steps
+1. Read the current planning prompt in `src/prompts/planning.ts`
+2. Add instructions for high-level output, avoiding code snippets
+3. Read the current execution prompt in `src/prompts/execution.ts`
+4. Add mandatory Task tool / subagent delegation instruction
+5. Ensure prompt changes don't conflict with existing instructions
+
+## Acceptance Criteria
+- [ ] Planning prompt instructs high-level output without code snippets
+- [ ] Planning prompt allows file paths for project references
+- [ ] Execution prompt mandates Task tool usage for subagent delegation
+- [ ] Prompts remain clear and well-structured
+
+## Notes
+- Prompt files are located in `src/prompts/` directory
+- The amend prompt (`amend.ts`) may also need similar updates for consistency
@@ -0,0 +1,25 @@
+# Project Decisions
+
+## For the amendment iteration reference feature — when a new task looks like a follow-up/fix to a previous completed task, how should the reference appear in the new plan?
+Context section, make sure to put file path to prev task outcome.
+
+## Should the iteration reference be determined automatically by the planning Claude (via prompt instructions), or should RAF code analyze task content programmatically?
+Prompt-based (Recommended) — update the amend system prompt to instruct Claude to identify follow-ups and include references to previous tasks in new plans.
+
+## For verbose task name display — should the task name appear everywhere task ID is shown, or just in the main header?
+Everywhere — show task name alongside ID in all verbose log messages, status updates, and summaries.
+
+## For streaming Claude output in verbose mode — is it currently broken or is there a different streaming behavior wanted?
+In verbose mode only a summary of completed work is shown, not constant stream of how Claude executes. Need to investigate and fix — streaming should show real-time Claude output.
+
+## For verifying git commits before halting Claude — how should RAF check that the commit landed?
+All three checks combined: (1) HEAD changed from before task, (2) new commit message starts with RAF[project:task], and (3) outcome file is committed in git.
+
+## If the commit hasn't landed when the grace period expires, what should happen?
+Extend grace period — keep waiting (with a maximum cap) until the commit appears or a hard timeout is reached.
+
+## What should the maximum cap be for the extended grace period when waiting for commit?
+180 seconds (3 minutes total max wait for commit). Current grace period is 60 seconds.
+
+## For the verbose streaming fix — any suspicion about what's wrong?
+Investigate fully — debug the entire verbose code path end-to-end to find why streaming isn't working.
@@ -0,0 +1,3 @@
+- [ ] in amendment, if new task looks like follow up or fix to previous completed task in the project - you should reference it in new plan (iteration plan)
+- [ ] in 'raf do --verbose' add task name (not just "Executing task 011..."). also stream output from claude into terminal in verbose mode
+- [ ] raf halted before commit was done. make sure commit is there before halting claude by checking commits. see previous halt work in commit 4d3868c3ef4c607c59829e94462ffd0490d82a98
@@ -0,0 +1,25 @@
+# Outcome: Amend Iteration References
+
+## Summary
+Enhanced the amend planning prompt to include outcome file paths for completed tasks and instruct Claude to reference previous task outcomes when creating follow-up/fix plans.
+
+## Changes Made
+
+### `src/prompts/amend.ts`
+- **Enhanced `existingTasksSummary`**: Completed tasks now include an `Outcome:` line with the full path to their outcome file (e.g., `Outcome: /project/outcomes/001-setup.md`). Non-completed tasks are unaffected.
+- **Added follow-up task instructions**: New "Identifying Follow-up Tasks" paragraph in Step 2 instructs Claude to reference previous task outcomes in the Context section when creating follow-up, fix, or iteration tasks. Includes the exact format to use.
+- **Updated plan template**: Added a placeholder line in the Context section showing the follow-up reference format.
+
+### `tests/unit/plan-command.test.ts`
+- Added 3 new tests:
+  - Verifies outcome file paths appear for completed tasks in the task summary
+  - Verifies outcome file paths do NOT appear for pending/failed tasks
+  - Verifies follow-up task instructions are present in the system prompt
+
+## Acceptance Criteria
+- [x] Amend prompt includes outcome file paths for completed tasks in the task summary
+- [x] Prompt instructs Claude to identify follow-up/fix tasks and reference outcomes in Context section
+- [x] Existing amend functionality is not broken (all 40 plan-command tests pass)
+- [x] All tests pass (1 pre-existing failure in planning-prompt.test.ts is unrelated)
+
+<promise>COMPLETE</promise>
@@ -0,0 +1,31 @@
+# Outcome: Verbose Task Name Display
+
+## Summary
+Updated all verbose mode log messages in `do.ts` to include the task name alongside the task ID, using the format `task 011 (fix-login-bug)` instead of just `task 011`.
+
+## Changes Made
+
+### `src/commands/do.ts`
+- **Added `taskLabel` variable** (line 449): Computes `${task.id} (${displayName})` when the name differs from the ID, or just `${task.id}` when they're the same (to avoid redundant `001 (001)` display).
+- **Updated 7 verbose log messages** to use `taskLabel`:
+  - Blocked task warning: `Task ${taskLabel} blocked by failed dependency: ...`
+  - Retry message: `Retrying task ${taskLabel} (previously failed)...`
+  - Force re-run message: `Re-running task ${taskLabel} (force mode)...`
+  - Execute message: `Executing task ${taskLabel}...`
+  - Retry loop message: `Retry N/M for task ${taskLabel}...`
+  - Completion message: `Task ${taskLabel} completed (elapsed)`
+  - Stash message: `Changes for task ${taskLabel} stashed as: ...`
+  - Failure message: `Task ${taskLabel} failed: reason (elapsed)`
+
+### No test changes needed
+- The `formatRetryHistoryForConsole` function already supported task name display
+- No existing tests assert on verbose log message format directly
+- All 757 tests pass (1 pre-existing failure in planning-prompt.test.ts is unrelated)
+
+## Acceptance Criteria
+- [x] All verbose log messages that mention a task ID also show the task name
+- [x] Format is consistent across all messages (`taskLabel` pattern)
+- [x] Non-verbose mode is unaffected (uses `formatTaskProgress` separately)
+- [x] All tests pass
+
+<promise>COMPLETE</promise>
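The `taskLabel` rule in this outcome file (show the name in parentheses only when it differs from the ID) is small enough to sketch. The helper name `buildTaskLabelSketch` is hypothetical; the outcome file describes an inline variable in `do.ts`, not a standalone function.

```typescript
// Hypothetical helper mirroring the described `taskLabel` computation.
// `taskName` is undefined when name extraction from the plan file failed,
// in which case the display name falls back to the ID (taskName ?? task.id).
function buildTaskLabelSketch(taskId: string, taskName?: string): string {
  const displayName = taskName ?? taskId;
  // Avoid a redundant label such as "001 (001)".
  return displayName === taskId ? taskId : `${taskId} (${displayName})`;
}

// Example of one verbose message built from the label.
function executingMessageSketch(taskId: string, taskName?: string): string {
  return `Executing task ${buildTaskLabelSketch(taskId, taskName)}...`;
}
```

Computing the label once and reusing it in every message is what keeps the format consistent across the blocked, retry, completion, and failure paths.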
@@ -0,0 +1,48 @@
+# Outcome: Verbose Streaming Fix
+
+## Summary
+Fixed `runVerbose()` to stream Claude's real-time execution output (tool calls, file operations, text) by using `--output-format stream-json --verbose` instead of plain `-p` mode which only showed the final summary.
+
+## Root Cause
+The `runVerbose()` method used the same `-p` (print) flag as `run()`, which runs Claude in non-interactive mode and only outputs the final assistant response text. This meant users saw a summary of completed work rather than real-time streaming of Claude's activity.
+
+## Changes Made
+
+### `src/parsers/stream-renderer.ts` (NEW)
+- **Stream event parser**: Parses NDJSON lines from Claude CLI `stream-json` output
+- **Human-readable rendering**: Converts events to user-friendly display:
+  - Text blocks: displayed directly
+  - Tool calls: descriptive one-line summaries (e.g., `→ Reading /src/main.ts`, `→ Running: npm test`)
+  - System/result events: suppressed (not useful for display)
+- **Tool descriptions**: Custom formatting for Read, Write, Edit, Bash, Glob, Grep, WebFetch, WebSearch, TodoWrite, Task, and NotebookEdit tools
+- **Text content extraction**: Returns text content separately for completion marker detection and output parsing
+
+### `src/core/claude-runner.ts`
+- **Modified `runVerbose()` spawn args**: Added `--output-format stream-json --verbose` flags to get real-time NDJSON streaming events
+- **NDJSON line buffering**: Added line buffer to handle data chunks that split across NDJSON line boundaries
+- **Event rendering pipeline**: Each complete NDJSON line is parsed by `renderStreamEvent()`, display text goes to stdout, text content accumulates in `output` for completion detection and parsing
+- **Preserved all existing behavior**: Timeout, context overflow detection, completion marker detection, outcome file polling, and kill mechanisms all work unchanged
+- **`run()` method unchanged**: Non-verbose mode remains exactly as before
+
+### `tests/unit/stream-renderer.test.ts` (NEW)
+- 25 tests covering all event types: system, assistant (text), assistant (tool_use), user (tool results), result
+- Edge cases: empty lines, invalid JSON, empty content, unknown events
+- Tool-specific rendering: all 11 supported tools tested
+
+### `tests/unit/claude-runner.test.ts`
+- Added 4 new tests in `verbose stream-json output` describe block:
+  - Verifies `runVerbose()` includes `--output-format stream-json --verbose` flags
+  - Verifies `run()` does NOT include these flags
+  - Verifies NDJSON assistant events are parsed and text extracted correctly
+  - Verifies tool_use events don't add text to output
+
+## Acceptance Criteria
+- [x] `raf do --verbose` shows Claude's real-time execution (tool calls, file operations, thinking)
+- [x] Completion marker detection still works correctly
+- [x] Timeout mechanism still functions
+- [x] Context overflow detection still works
+- [x] Non-verbose mode (`raf do`) is completely unaffected
+- [x] Success/failure parsing still works from the captured output
+- [x] All existing tests pass (786 pass, 1 pre-existing failure in planning-prompt.test.ts is unrelated)
+
+<promise>COMPLETE</promise>
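The NDJSON line buffering and text extraction described in this outcome file can be sketched as follows. This is not the package's `renderStreamEvent()` implementation, which is not shown in this part of the diff. The class and function names are hypothetical, and the event shape (`type: "assistant"` with a `message.content` array of typed blocks) is an assumption about the `stream-json` output.

```typescript
// Accumulates raw stdout chunks and returns only complete NDJSON lines.
// A chunk may end in the middle of a JSON object, so the trailing partial
// line is held back until the next chunk completes it.
class NdjsonLineBufferSketch {
  private pending = "";

  push(chunk: string): string[] {
    this.pending += chunk;
    const parts = this.pending.split("\n");
    // The last element is "" (chunk ended on a newline) or a partial line.
    this.pending = parts.pop() ?? "";
    return parts.filter((line) => line.trim().length > 0);
  }
}

// Extracts assistant text from one NDJSON line.
// ASSUMED event shape:
//   { type: "assistant", message: { content: [{ type: "text", text: "..." }] } }
// Tool calls, system events, and invalid JSON contribute no text.
function extractAssistantTextSketch(line: string): string {
  let event: any;
  try {
    event = JSON.parse(line);
  } catch {
    return ""; // ignore malformed lines instead of aborting the stream
  }
  if (event?.type !== "assistant" || !Array.isArray(event?.message?.content)) {
    return "";
  }
  return event.message.content
    .filter((block: any) => block?.type === "text" && typeof block.text === "string")
    .map((block: any) => block.text as string)
    .join("");
}
```

In a runner, each `data` chunk from the child process would go through `push()`, and the extracted text would be appended to the accumulated output used for completion marker detection.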
@@ -0,0 +1,56 @@
+# Outcome: Commit Verification Before Halt
+
+## Summary
+Added commit verification to the grace period logic so that when a COMPLETE marker is detected, the system verifies the expected git commit has actually landed before terminating. If the commit hasn't landed within the initial 60-second grace period, it extends polling up to a hard maximum of 180 seconds.
+
+## Changes Made
+
+### `src/core/git.ts`
+- **`getHeadCommitHash()`**: Returns the current HEAD commit hash (or null if not in a git repo)
+- **`getHeadCommitMessage()`**: Returns the HEAD commit message first line (or null)
+- **`isFileCommittedInHead(filePath)`**: Checks if a file exists in HEAD's tree using `git ls-tree`
+
+### `src/core/claude-runner.ts`
+- **New exports**: `COMPLETION_HARD_MAX_MS` (180s), `COMMIT_POLL_INTERVAL_MS` (10s), `CommitContext` interface
+- **New `commitContext` field in `ClaudeRunnerOptions`**: Allows passing pre-execution HEAD hash, expected commit prefix, and outcome file path
+- **`verifyCommit()` helper**: Checks all three conditions (HEAD changed, message prefix matches, outcome file committed)
+- **Updated `createCompletionDetector()`**: Accepts optional `commitContext` parameter. On grace period expiry:
+  - If COMPLETE marker and `commitContext` provided: verifies commit before killing
+  - If commit not verified: starts polling every 10s up to 180s total
+  - If FAILED marker or no `commitContext`: kills immediately (existing behavior)
+- **Both `run()` and `runVerbose()`**: Pass `commitContext` through to completion detector
+
+### `src/commands/do.ts`
+- **Captures HEAD hash** before each task execution attempt using `getHeadCommitHash()`
+- **Builds `commitContext`** with `preExecutionHead`, `expectedPrefix` (e.g., `RAF[005:001]`), and `outcomeFilePath`
+- **Passes `commitContext`** to both `run()` and `runVerbose()` calls
+- Gracefully handles non-git-repo case (skips commit verification)
+
+### `tests/unit/git-commit-helpers.test.ts` (NEW)
+- 11 tests covering all three new git functions:
+  - `getHeadCommitHash`: normal, not-in-repo, empty output, whitespace trimming
+  - `getHeadCommitMessage`: normal, not-in-repo, empty output
+  - `isFileCommittedInHead`: file exists, file missing, not-in-repo, command failure
+
+### `tests/unit/claude-runner.test.ts`
+- Added mock for `git.js` module (`getHeadCommitHash`, `getHeadCommitMessage`, `isFileCommittedInHead`)
+- Added 7 new tests in `commit verification during grace period` describe block:
+  - Commit verified within initial grace period (kills normally)
+  - Commit found during extended polling (extends, then kills)
+  - Commit never lands (hard max at 180s)
+  - FAILED markers don't trigger commit verification
+  - Backward compatible without commitContext
+  - Verifies commit message prefix must match
+  - Verifies outcome file must be committed
+
+## Acceptance Criteria
+- [x] HEAD hash is recorded before each task execution
+- [x] Grace period checks for commit matching `RAF[project:task]` pattern
+- [x] Grace period checks that outcome file is committed
+- [x] Grace period extends up to 180 seconds if commit not found
+- [x] Process is killed with a warning after 180 seconds if commit never lands
+- [x] Normal flow (commit lands within 60s) is not affected
+- [x] All existing tests pass (805 pass, 1 pre-existing failure in planning-prompt.test.ts is unrelated)
+- [x] New tests cover: commit found within grace, commit found in extended grace, commit never found (hard timeout)
+
+<promise>COMPLETE</promise>
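The three-condition check and the extended polling window described in this outcome file can be sketched in isolation. The field names `preExecutionHead`, `expectedPrefix`, and `outcomeFilePath` and the two exported constant names come from the outcome file; everything else (the `GitProbesSketch` interface, `GRACE_PERIOD_MS`, the function names, and the exact check schedule) is an assumption for illustration, not the package's actual code.

```typescript
// Mirrors the fields the outcome file lists for `commitContext`.
interface CommitContextSketch {
  preExecutionHead: string;
  expectedPrefix: string; // e.g. "RAF[005:001]"
  outcomeFilePath: string;
}

// Hypothetical injection point standing in for the three git helpers,
// so the logic can run without a real repository.
interface GitProbesSketch {
  getHeadCommitHash(): string | null;
  getHeadCommitMessage(): string | null;
  isFileCommittedInHead(filePath: string): boolean;
}

const GRACE_PERIOD_MS = 60 * 1000; // assumed name; 60s per the summary
const COMMIT_POLL_INTERVAL_MS = 10 * 1000;
const COMPLETION_HARD_MAX_MS = 180 * 1000;

// All three conditions must hold: HEAD moved, the commit message starts
// with the expected prefix, and the outcome file is in HEAD's tree.
function verifyCommitSketch(ctx: CommitContextSketch, git: GitProbesSketch): boolean {
  const head = git.getHeadCommitHash();
  if (head === null || head === ctx.preExecutionHead) return false;
  const message = git.getHeadCommitMessage();
  if (message === null || !message.startsWith(ctx.expectedPrefix)) return false;
  return git.isFileCommittedInHead(ctx.outcomeFilePath);
}

// One plausible schedule: first check when the grace period expires,
// then every poll interval until the hard maximum is reached.
function commitCheckTimesMsSketch(): number[] {
  const times: number[] = [];
  for (let t = GRACE_PERIOD_MS; t <= COMPLETION_HARD_MAX_MS; t += COMMIT_POLL_INTERVAL_MS) {
    times.push(t);
  }
  return times;
}
```

Passing the git helpers in as an interface is a design choice for this sketch only; it makes the "commit never lands" and "prefix must match" cases easy to exercise with fakes.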
@@ -0,0 +1,35 @@
+# Task: Amend Iteration References
+
+## Objective
+When amending a project, instruct the planning Claude to reference previous completed tasks (with outcome file paths) in the Context section of new plans that are follow-ups or fixes.
+
+## Context
+Currently, amendment mode shows existing tasks with their status (COMPLETED/PENDING/FAILED) but doesn't guide Claude to treat new tasks as iterations on previous work. When a user amends a project to fix or follow up on a completed task, the new plan should reference the previous task's outcome file so the executing agent has full context about what was done before.
+
+## Requirements
+- Update the amend system prompt in `src/prompts/amend.ts` to instruct Claude to:
+  - Identify when new tasks are follow-ups, fixes, or iterations of previously completed tasks
+  - Include a reference to the previous task's outcome file path in the Context section of the new plan
+  - Use the format: "This is a follow-up to task NNN. See outcome: {projectPath}/outcomes/NNN-task-name.md"
+- The existing tasks summary already includes task IDs and names — enhance it to also show the outcome file path for completed tasks so Claude has the information readily available
+- This is purely a prompt engineering change — no programmatic detection logic needed
+- Do NOT modify the plan template structure (no new dedicated fields) — use the existing Context section
+
+## Implementation Steps
+1. Read the current amend prompt in `src/prompts/amend.ts`
+2. Enhance the `existingTasksSummary` generation to include outcome file paths for completed tasks
+3. Add instructions in the system prompt (Step 2: Analyze New Requirements section) telling Claude to identify follow-up/fix tasks and reference previous outcomes in the Context section
+4. Add an example in the prompt showing how to reference a previous task's outcome
+5. Update tests for the amend prompt if any exist
+
+## Acceptance Criteria
+- [ ] Amend prompt includes outcome file paths for completed tasks in the task summary
+- [ ] Prompt instructs Claude to identify follow-up/fix tasks and reference outcomes in Context section
+- [ ] Existing amend functionality is not broken
+- [ ] All tests pass
+
+## Notes
+- The outcome file path follows the pattern: `{projectPath}/outcomes/{taskId}-{taskName}.md`
+- Only completed tasks have meaningful outcome files to reference
+- The `AmendPromptParams` interface already includes `projectPath` which can be used to construct outcome paths
+- The `existingTasks` array includes `planFile` from which the task name can be extracted for outcome path construction
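The outcome path pattern and follow-up reference format in this plan can be sketched as below. The `ExistingTaskSketch` shape and both function names are hypothetical; only the path pattern `{projectPath}/outcomes/{taskId}-{taskName}.md`, the `planFile` field, and the reference sentence come from the plan itself. The sketch assumes the plan file's base name (`NNN-task-name.md`) is reused for the outcome file.

```typescript
// Hypothetical shape of one entry in `existingTasks`.
interface ExistingTaskSketch {
  id: string;
  planFile: string; // e.g. "plans/001-setup.md"
  status: "completed" | "pending" | "failed";
}

// Build the outcome path for a completed task; other statuses have no
// meaningful outcome file, so nothing is returned for them.
function outcomePathForSketch(projectPath: string, task: ExistingTaskSketch): string | undefined {
  if (task.status !== "completed") return undefined;
  const baseName = task.planFile.split("/").pop() ?? task.planFile;
  return `${projectPath}/outcomes/${baseName}`;
}

// Render the reference line the plan asks Claude to place in the
// Context section of a follow-up plan.
function followUpReferenceSketch(projectPath: string, task: ExistingTaskSketch): string | undefined {
  const outcomePath = outcomePathForSketch(projectPath, task);
  return outcomePath
    ? `This is a follow-up to task ${task.id}. See outcome: ${outcomePath}`
    : undefined;
}
```

In the change as planned, the reference line is written by Claude from prompt instructions rather than generated in code; the second function only shows what the resulting sentence looks like.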
@@ -0,0 +1,38 @@
|
|
|
1
|
+
# Task: Verbose Task Name Display
|
|
2
|
+
|
|
3
|
+
## Objective
|
|
4
|
+
Show the task name alongside the task ID in all verbose mode log messages (e.g., "Executing task 011 (fix-login-bug)..." instead of "Executing task 011...").
|
|
5
|
+
|
|
6
|
+
## Context
|
|
7
|
+
Currently in verbose mode, log messages reference tasks by ID only (e.g., "Executing task 011..."). The task name is available via `extractTaskNameFromPlanFile()` and is already stored in `displayName`, but not consistently used in all log messages. Users need the task name to quickly understand what's running without cross-referencing plan files.
|
|
8
|
+
|
|
9
|
+
## Requirements
|
|
10
|
+
- Add task name to ALL verbose log messages that reference a task ID, including:
|
|
11
|
+
- "Executing task 011..." → "Executing task 011 (fix-login-bug)..."
|
|
12
|
+
- "Retrying task 011 (previously failed)..." → include name
|
|
13
|
+
- "Re-running task 011 (force mode)..." → include name
|
|
14
|
+
- "Task 011 completed (2m 30s)" → include name
|
|
15
|
+
- "Task 011 failed: reason (2m 30s)" → include name
|
|
16
|
+
- "Task 011 blocked by failed dependency: 003" → include name
|
|
17
|
+
- Retry messages
|
|
18
|
+
- Changes stashed messages
|
|
19
|
+
- The task name is already computed as `displayName` in `do.ts` — use it consistently
|
|
20
|
+
- Also update the `logger.setContext()` call if it's being used (currently a no-op, but it may be restored)
|
|
21
|
+
- Update the verbose summary section to include task names where applicable
|
|
22
|
+
|
|
23
|
+
## Implementation Steps
|
|
24
|
+
1. Read `src/commands/do.ts` and identify all verbose log messages that reference task IDs
|
|
25
|
+
2. Update each message to include `displayName` in parentheses after the task ID
|
|
26
|
+
3. Ensure the format is consistent: `task ${task.id} (${displayName})` everywhere
|
|
27
|
+
4. Update any tests that assert on the verbose output format
|
|
28
|
+
|
|
29
|
+
## Acceptance Criteria
|
|
30
|
+
- [ ] All verbose log messages that mention a task ID also show the task name
|
|
31
|
+
- [ ] Format is consistent across all messages
|
|
32
|
+
- [ ] Non-verbose mode is unaffected
|
|
33
|
+
- [ ] All tests pass
|
|
34
|
+
|
|
35
|
+
## Notes
|
|
36
|
+
- `displayName` is already computed at line 447 of `do.ts` as `taskName ?? task.id`
|
|
37
|
+
- When `displayName` equals `task.id` (name extraction failed), showing it in parens would be redundant — consider only showing parens when name differs from ID
|
|
38
|
+
- The `setContext` and `clearContext` methods on logger are currently no-ops (deprecated) but the calls remain in do.ts
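
One way to keep the format consistent, including the redundancy case raised in the notes, is to route every message through a single formatter. The helper name `formatTaskLabel` is hypothetical; only the `task ${task.id} (${displayName})` format comes from the plan.

```typescript
// Hypothetical helper: returns "task 011 (fix-login-bug)", or just "task 011"
// when the display name fell back to the task ID (name extraction failed).
function formatTaskLabel(taskId: string, displayName: string): string {
  return displayName === taskId
    ? `task ${taskId}`
    : `task ${taskId} (${displayName})`;
}

// Example message built with the helper.
const executing = `Executing ${formatTaskLabel("011", "fix-login-bug")}...`;
```

Messages that start with a capitalised "Task" would need the first letter upper-cased at the call site.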
|
|
@@ -0,0 +1,45 @@
|
|
|
1
|
+
# Task: Verbose Streaming Fix
|
|
2
|
+
|
|
3
|
+
## Objective
|
|
4
|
+
Investigate and fix why verbose mode only shows a summary of completed work instead of streaming Claude's real-time execution output.
|
|
5
|
+
|
|
6
|
+
## Context
|
|
7
|
+
The `runVerbose()` method in `claude-runner.ts` does call `process.stdout.write(text)` on stdout data, but users report only seeing a summary rather than a real-time stream. The likely root cause is that Claude CLI is invoked with the `-p` (print/pipe) flag, which runs in non-interactive mode and only outputs the final assistant response text — not the intermediate tool calls, file reads, code edits, and thinking steps that constitute the real-time execution flow.
|
|
8
|
+
|
|
9
|
+
The non-verbose `run()` method correctly uses `-p` since it only needs the final output for parsing. But `runVerbose()` needs a different approach to show Claude's work as it happens.
|
|
10
|
+
|
|
11
|
+
## Requirements
|
|
12
|
+
- Investigate the exact cause of why streaming doesn't show real-time Claude activity
|
|
13
|
+
- Fix `runVerbose()` to stream Claude's real-time execution output (tool calls, file operations, code writing, etc.)
|
|
14
|
+
- The fix likely involves either:
|
|
15
|
+
- Removing `-p` flag and using a different mode that streams intermediate output
|
|
16
|
+
- Using `--output-format stream-json` to get streaming JSON events and rendering them
|
|
17
|
+
- Using PTY-based execution (like `runInteractive`) but without stdin interaction
|
|
18
|
+
- The completion marker detection must still work with the new approach
|
|
19
|
+
- Output parsing for success/failure must still function correctly
|
|
20
|
+
- The timeout mechanism must still work
|
|
21
|
+
- Non-verbose mode must remain unchanged
|
|
22
|
+
|
|
23
|
+
## Implementation Steps
|
|
24
|
+
1. Investigate what Claude CLI output modes are available (check `claude --help` or documentation)
|
|
25
|
+
2. Determine the best approach for streaming real-time output while maintaining completion detection
|
|
26
|
+
3. Modify `runVerbose()` to use the chosen streaming approach
|
|
27
|
+
4. Ensure completion markers can still be detected in the new output format
|
|
28
|
+
5. Test that timeout, context overflow detection, and kill mechanisms still work
|
|
29
|
+
6. Verify output can still be parsed for success/failure determination
|
|
30
|
+
|
|
31
|
+
## Acceptance Criteria
|
|
32
|
+
- [ ] `raf do --verbose` shows Claude's real-time execution (tool calls, file operations, thinking)
|
|
33
|
+
- [ ] Completion marker detection still works correctly
|
|
34
|
+
- [ ] Timeout mechanism still functions
|
|
35
|
+
- [ ] Context overflow detection still works
|
|
36
|
+
- [ ] Non-verbose mode (`raf do`) is completely unaffected
|
|
37
|
+
- [ ] Success/failure parsing still works from the captured output
|
|
38
|
+
- [ ] All existing tests pass (update as needed for new implementation)
|
|
39
|
+
|
|
40
|
+
## Notes
|
|
41
|
+
- The `run()` method must NOT be changed — only `runVerbose()` needs modification
|
|
42
|
+
- Claude CLI streaming options may include `--output-format stream-json` which outputs JSON events
|
|
43
|
+
- If using stream-json, a renderer/formatter will be needed to display events in a human-readable way
|
|
44
|
+
- A PTY-based approach (similar to `runInteractive`) could work but would need `--dangerously-skip-permissions` to avoid interactive prompts
|
|
45
|
+
- The `activeProcess` tracking needs to work with whichever approach is chosen for the shutdown handler
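
If the `stream-json` option is chosen, the renderer mentioned in the notes could look roughly like the sketch below. The event shape (`type: "assistant"` with `text` and `tool_use` content blocks) is an assumption for illustration and should be checked against the CLI's real output. `LineBuffer` and `renderEvent` are hypothetical names.

```typescript
// Assumed event shape; verify against the actual stream-json output.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string; input?: unknown };

interface StreamEvent {
  type: string;
  message?: { content?: ContentBlock[] };
}

// stdout chunks can split a JSON line anywhere, so buffer until a newline.
class LineBuffer {
  private pending = "";

  // Returns the complete, non-empty lines contained in the data seen so far.
  push(chunk: string): string[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    this.pending = lines.pop() ?? ""; // keep the trailing partial line
    return lines.filter((line) => line.trim().length > 0);
  }
}

// Turns one JSON line into human-readable text; unknown or malformed
// lines render as an empty string instead of throwing.
function renderEvent(line: string): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return "";
  }
  if (typeof parsed !== "object" || parsed === null) return "";
  const event = parsed as StreamEvent;
  if (event.type !== "assistant") return "";
  const parts: string[] = [];
  for (const block of event.message?.content ?? []) {
    if (block.type === "text") parts.push(block.text);
    else if (block.type === "tool_use") parts.push(`[tool] ${block.name}`);
  }
  return parts.join("\n");
}
```

Under these assumptions, completion markers can still be detected by running the existing check on the rendered text rather than on the raw JSON.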
|
|
@@ -0,0 +1,62 @@
|
|
|
1
|
+
# Task: Commit Verification Before Halt
|
|
2
|
+
|
|
3
|
+
## Objective
|
|
4
|
+
Before halting Claude after detecting a completion marker, verify that the expected git commit has actually been made, extending the grace period if the commit hasn't landed yet.
|
|
5
|
+
|
|
6
|
+
## Context
|
|
7
|
+
The current halt mechanism in `claude-runner.ts` detects completion markers (`<promise>COMPLETE</promise>` or `<promise>FAILED</promise>`) and starts a fixed 60-second grace period before killing Claude. However, if Claude hasn't finished its git commit within that window, the process gets killed mid-commit, potentially leaving the repository in a broken state. The previous halt work was introduced in commit `4d3868c`. This task adds commit verification so that the process is killed only once the commit is confirmed or a hard maximum wait is reached.
|
|
8
|
+
|
|
9
|
+
See previous halt work: commit `4d3868c3ef4c607c59829e94462ffd0490d82a98`
|
|
10
|
+
|
|
11
|
+
## Dependencies
|
|
12
|
+
003
|
|
13
|
+
|
|
14
|
+
## Requirements
|
|
15
|
+
- Record the HEAD commit hash before task execution begins
|
|
16
|
+
- After the grace period triggers, verify the commit landed by checking ALL three conditions:
|
|
17
|
+
1. HEAD has changed from the pre-execution hash
|
|
18
|
+
2. The new HEAD commit message starts with the expected `RAF[project:task]` pattern
|
|
19
|
+
3. The outcome file is tracked in git (committed, not just on disk)
|
|
20
|
+
- If the commit hasn't landed when the initial grace period (60s) expires:
|
|
21
|
+
- Extend the grace period by polling for the commit
|
|
22
|
+
- Continue extending until either the commit is confirmed or a hard maximum of 180 seconds is reached
|
|
23
|
+
- Poll at a reasonable interval (e.g., every 10 seconds)
|
|
24
|
+
- If the hard maximum is reached without commit confirmation, kill the process and log a warning
|
|
25
|
+
- Add git helper functions to `src/core/git.ts` for:
|
|
26
|
+
- Getting the current HEAD hash
|
|
27
|
+
- Checking if a commit message matches a pattern
|
|
28
|
+
- Checking if a file is committed in HEAD
|
|
29
|
+
- The completion detector factory (`createCompletionDetector`) needs additional parameters:
|
|
30
|
+
- Pre-execution HEAD hash
|
|
31
|
+
- Expected commit message prefix (e.g., `RAF[005:001]`)
|
|
32
|
+
- Outcome file path (already available)
|
|
33
|
+
- Pass the necessary context from `do.ts` when creating the completion detector
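
The three conditions and the 60 s / 180 s limits above can be expressed as two pure functions, which keeps the timing logic testable without a real repository. All names here (`CommitContext`, `GitProbe`, `isCommitLanded`, `nextGraceAction`) are hypothetical. The sketch also assumes the initial 60 seconds always run in full, which the requirements do not state explicitly.

```typescript
interface CommitContext {
  preExecutionHead: string; // HEAD hash recorded before the task started
  expectedPrefix: string;   // e.g. "RAF[005:001]"
  outcomeFilePath: string;
}

// Injected git access, so the check can be exercised with a fake in tests.
interface GitProbe {
  headHash(): string | null;
  headMessage(): string | null;
  isCommitted(filePath: string): boolean;
}

// True only when ALL three conditions hold.
function isCommitLanded(ctx: CommitContext, git: GitProbe): boolean {
  const head = git.headHash();
  const message = git.headMessage();
  return (
    head !== null &&
    head !== ctx.preExecutionHead &&
    message !== null &&
    message.startsWith(ctx.expectedPrefix) &&
    git.isCommitted(ctx.outcomeFilePath)
  );
}

const INITIAL_GRACE_MS = 60_000;
const HARD_MAX_MS = 180_000;

type GraceAction = "wait" | "kill" | "kill-with-warning";

// Decides what to do at each poll, given the time since the marker was seen.
function nextGraceAction(elapsedMs: number, committed: boolean): GraceAction {
  if (elapsedMs < INITIAL_GRACE_MS) return "wait";
  if (committed) return "kill";
  if (elapsedMs >= HARD_MAX_MS) return "kill-with-warning";
  return "wait";
}
```

A caller would evaluate `nextGraceAction` on an interval (the plan suggests every 10 seconds) and apply it only after a COMPLETE marker, since failed tasks do not commit.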
|
|
34
|
+
|
|
35
|
+
## Implementation Steps
|
|
36
|
+
1. Add new git utility functions to `src/core/git.ts`:
|
|
37
|
+
- `getHeadCommitHash()`: returns current HEAD hash
|
|
38
|
+
- `getHeadCommitMessage()`: returns HEAD commit message
|
|
39
|
+
- `isFileCommittedInHead(filePath)`: checks if file is in HEAD commit
|
|
40
|
+
2. Update `createCompletionDetector` in `claude-runner.ts` to accept commit verification parameters
|
|
41
|
+
3. Modify the grace period logic: instead of a single `setTimeout`, use an interval that checks for the commit
|
|
42
|
+
4. If initial 60s grace period expires without commit, continue polling up to 180s total
|
|
43
|
+
5. Update `do.ts` to capture HEAD hash before task execution and pass commit context to the runner
|
|
44
|
+
6. Add unit tests for the new git functions
|
|
45
|
+
7. Add unit tests for the extended grace period behavior
|
|
46
|
+
|
|
47
|
+
## Acceptance Criteria
|
|
48
|
+
- [ ] HEAD hash is recorded before each task execution
|
|
49
|
+
- [ ] Grace period checks for commit matching `RAF[project:task]` pattern
|
|
50
|
+
- [ ] Grace period checks that outcome file is committed
|
|
51
|
+
- [ ] Grace period extends up to 180 seconds if commit not found
|
|
52
|
+
- [ ] Process is killed with a warning after 180 seconds if commit never lands
|
|
53
|
+
- [ ] Normal flow (commit lands within 60s) is not affected
|
|
54
|
+
- [ ] All existing tests pass
|
|
55
|
+
- [ ] New tests cover: commit found within grace, commit found in extended grace, commit never found (hard timeout)
|
|
56
|
+
|
|
57
|
+
## Notes
|
|
58
|
+
- The `ClaudeRunnerOptions` interface may need new fields for the commit context
|
|
59
|
+
- On task failure, Claude does NOT commit (changes are stashed) — the commit check should only apply when a COMPLETE marker is detected, not FAILED
|
|
60
|
+
- The project number and task ID are already available in `do.ts` where the runner is called
|
|
61
|
+
- Use `execSync` for git commands (consistent with existing `git.ts` functions)
|
|
62
|
+
- Handle "not in git repo" gracefully — skip commit verification if not in a git repo
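
A minimal sketch of the three git helpers named in the implementation steps follows. It uses `execFileSync` rather than the `execSync` the notes call for, so the file path is never passed through a shell. The optional `cwd` parameter and the `null`/`false` return on failure are assumptions, chosen so that running outside a git repository is handled gracefully.

```typescript
import { execFileSync } from "node:child_process";

// Runs a git command and returns trimmed stdout, or null on any failure
// (git missing, not a repository, no commits yet, bad working directory).
function runGit(args: string[], cwd?: string): string | null {
  try {
    return execFileSync("git", args, {
      cwd,
      encoding: "utf8",
      stdio: ["ignore", "pipe", "ignore"],
    }).trim();
  } catch {
    return null;
  }
}

function getHeadCommitHash(cwd?: string): string | null {
  return runGit(["rev-parse", "HEAD"], cwd);
}

function getHeadCommitMessage(cwd?: string): string | null {
  return runGit(["log", "-1", "--format=%B"], cwd);
}

// True when the path exists in the tree of the HEAD commit. The path is
// assumed to be relative to `cwd` (or the process working directory).
function isFileCommittedInHead(filePath: string, cwd?: string): boolean {
  const listed = runGit(["ls-tree", "--name-only", "HEAD", "--", filePath], cwd);
  return listed !== null && listed.length > 0;
}
```

With this shape, a `null` from `getHeadCommitHash()` before the task starts can serve as the signal to skip commit verification entirely.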
|