@curdx/flow 2.3.10 → 3.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (51) hide show
  1. package/.claude-plugin/marketplace.json +2 -2
  2. package/.claude-plugin/plugin.json +2 -20
  3. package/CHANGELOG.md +55 -2
  4. package/README.md +69 -19
  5. package/agents/flow-adversary.md +1 -0
  6. package/agents/flow-architect.md +1 -0
  7. package/agents/flow-brownfield-analyst.md +1 -0
  8. package/agents/flow-edge-hunter.md +1 -0
  9. package/agents/flow-planner.md +1 -0
  10. package/agents/flow-researcher.md +1 -0
  11. package/agents/flow-reviewer.md +1 -0
  12. package/agents/flow-ui-researcher.md +1 -0
  13. package/agents/flow-verifier.md +1 -0
  14. package/bin/curdx-flow-state +104 -0
  15. package/cli/README.md +6 -1
  16. package/cli/install-companions.js +1 -1
  17. package/cli/install-context7-config.js +5 -3
  18. package/cli/install-required-plugins.js +2 -2
  19. package/cli/uninstall-actions.js +37 -0
  20. package/cli/upgrade-workflow.js +1 -1
  21. package/cli/upgrade.js +42 -14
  22. package/hooks/hooks.json +72 -0
  23. package/hooks/scripts/common.sh +191 -0
  24. package/hooks/scripts/config-change-guard.sh +94 -0
  25. package/hooks/scripts/flow-context-watch.sh +94 -0
  26. package/hooks/scripts/quick-mode-guard.sh +4 -3
  27. package/hooks/scripts/session-start.sh +14 -10
  28. package/hooks/scripts/session-title.sh +87 -0
  29. package/hooks/scripts/stop-watcher.sh +4 -3
  30. package/hooks/scripts/subagent-artifact-guard.sh +7 -74
  31. package/hooks/scripts/subagent-statusline.sh +8 -2
  32. package/hooks/scripts/task-lifecycle-guard.sh +106 -0
  33. package/hooks/scripts/teammate-idle-guard.sh +83 -0
  34. package/knowledge/claude-code-runtime-contracts.md +21 -0
  35. package/monitors/scripts/flow-state-monitor.sh +8 -5
  36. package/output-styles/curdx-fast-mode.md +42 -0
  37. package/output-styles/curdx-spec-mode.md +46 -0
  38. package/package.json +5 -3
  39. package/schemas/agent-frontmatter.schema.json +4 -1
  40. package/schemas/spec-state.schema.json +18 -0
  41. package/settings.json +2 -1
  42. package/skills/implement/SKILL.md +8 -0
  43. package/skills/implement/references/linear-execution.md +11 -0
  44. package/skills/implement/references/native-task-sync.md +107 -0
  45. package/skills/implement/references/progress-contract.md +4 -0
  46. package/skills/implement/references/state-init.md +3 -0
  47. package/skills/implement/references/stop-hook-execution.md +19 -5
  48. package/skills/implement/references/subagent-execution.md +16 -2
  49. package/skills/implement/references/wave-execution.md +18 -0
  50. package/skills/status/references/gather-contract.md +3 -0
  51. package/skills/status/references/output-contract.md +1 -0
package/settings.json CHANGED
@@ -2,6 +2,7 @@
2
2
  "agent": "flow-orchestrator",
3
3
  "subagentStatusLine": {
4
4
  "type": "command",
5
- "command": "${CLAUDE_PLUGIN_ROOT}/hooks/scripts/subagent-statusline.sh"
5
+ "command": "${CLAUDE_PLUGIN_ROOT}/hooks/scripts/subagent-statusline.sh",
6
+ "refreshInterval": 5
6
7
  }
7
8
  }
@@ -19,6 +19,7 @@ for strategy-specific protocols:
19
19
  - `references/preflight.md`
20
20
  - `references/strategy-router.md`
21
21
  - `references/state-init.md`
22
+ - `references/native-task-sync.md`
22
23
  - `references/linear-execution.md`
23
24
  - `references/subagent-execution.md`
24
25
  - `references/wave-execution.md`
@@ -53,6 +54,13 @@ signals. The runtime selector is:
53
54
 
54
55
  Use `references/state-init.md` before dispatching any execution protocol.
55
56
 
57
+ ## Native Task Sync
58
+
59
+ Use `references/native-task-sync.md` to mirror CurDX execution into Claude's
60
+ interactive task list when `TaskCreate` / `TaskUpdate` / `TaskList` are
61
+ available. This is best-effort UX only; never let native task sync override the
62
+ real `.flow` ledger.
63
+
56
64
  ## Run the Selected Execution Protocol
57
65
 
58
66
  - `linear`: stay inline in the main agent and execute the first unchecked task
@@ -7,11 +7,16 @@ needs maximum visibility in the main session.
7
7
 
8
8
  ```text
9
9
  for task in remaining tasks:
10
+ if native task sync is active:
11
+ reconcile obvious tasks.md/native drift
12
+ mark the current Claude task in_progress
10
13
  read task fields (Do / Files / Done when / Verify / Commit)
11
14
  execute the task inline in the current session
12
15
  run the task's Verify command
13
16
  make the atomic commit declared by the task
14
17
  mark the task [x] in tasks.md
18
+ if native task sync is active:
19
+ mark the current Claude task completed
15
20
  append the outcome to .progress.md
16
21
  print "✓ Task X.Y complete"
17
22
  ```
@@ -19,14 +24,20 @@ for task in remaining tasks:
19
24
  ## Execution Rules
20
25
 
21
26
  - Follow the `flow-executor` contract even though you stay inline
27
+ - Treat `references/native-task-sync.md` as a mirror contract layered around
28
+ the inline loop, not a separate phase
22
29
  - Respect the declared `Files` scope; do not quietly expand task boundaries
23
30
  - Verification must pass before the task can be marked complete
24
31
  - One task, one atomic commit
25
32
  - If the task is too broad or unsafe, stop and surface `TASK_FAILED` semantics
26
33
  instead of improvising extra subtasks
34
+ - If `TaskUpdate` fails for the current task, increment
35
+ `native_sync_failure_count`, keep `.flow` state authoritative, and continue
36
+ the inline run unless the real task itself failed
27
37
 
28
38
  ## Stop Conditions
29
39
 
30
40
  - Any git operation failure: stop immediately
31
41
  - Missing or invalid Verify command: stop and ask for a task regeneration
32
42
  - 3 consecutive `TASK_FAILED`: stop and require intervention
43
+ - Native task sync failure alone is never a stop condition
@@ -0,0 +1,107 @@
1
+ # Native Task Sync — Mirror `.flow` Execution Into Claude's Task List
2
+
3
+ Use this layer only as an interactive UX mirror. `.flow/specs/<name>/tasks.md`
4
+ and `.state.json` remain the source of truth.
5
+
6
+ ## Availability Rule
7
+
8
+ If the current session exposes `TaskCreate`, `TaskList`, and `TaskUpdate`, keep
9
+ Claude's native task list aligned with the active spec. If those tools are not
10
+ available (for example in headless or bare flows), skip all native sync work
11
+ silently and continue with the disk-backed protocol.
12
+
13
+ ## State Fields
14
+
15
+ Track best-effort sync state under `.state.json` `execute_state`:
16
+
17
+ - `native_task_map`: `{ "<task-id>": "<claude-task-id>" }`
18
+ - `native_sync_enabled`: `true | false`
19
+ - `native_sync_failure_count`: integer
20
+
21
+ Default behavior:
22
+
23
+ - missing fields -> treat as first-run and initialize them
24
+ - `native_sync_enabled = false` -> never attempt native task sync again in this
25
+ execution run unless the user explicitly resets state
26
+
27
+ ## First Sync
28
+
29
+ After preflight and execute-state initialization:
30
+
31
+ 1. If native sync is available and `native_task_map` is empty, parse every task
32
+ row from `tasks.md`.
33
+ 2. Create one Claude native task per CurDX task.
34
+ 3. Use the CurDX task id in the subject:
35
+ - regular: `1.2 Add retry guard`
36
+ - parallel: `[P] 2.1 Build auth DTOs`
37
+ - verification: `[VERIFY] 3.VF Re-run original repro`
38
+ 4. Use a compact description built from:
39
+ - `Done when`
40
+ - `Verify`
41
+ 5. Immediately mark already-checked `[x]` tasks as `completed`.
42
+ 6. Persist the returned Claude task ids in `native_task_map`.
43
+
44
+ When interactive hook support is available, CurDX also uses `TaskCreated` as a
45
+ guardrail:
46
+
47
+ - reject CurDX-shaped native tasks whose subject does not map to a real
48
+ `tasks.md` id
49
+ - reject CurDX-shaped native tasks that omit the compact description derived
50
+ from `Done when` + `Verify`
51
+
52
+ Do not block execution if task creation fails. Increment
53
+ `native_sync_failure_count`, log one short note in `.progress.md`, and continue.
54
+
55
+ ## Rebuild Rule
56
+
57
+ If a stored Claude task id no longer updates cleanly, assume the current
58
+ session is attached to a different native task list:
59
+
60
+ 1. clear `native_task_map`
61
+ 2. recreate the native tasks from the current `tasks.md`
62
+ 3. keep `.flow` execution state unchanged
63
+
64
+ This recovers from session changes without corrupting the real spec ledger.
65
+
66
+ ## Per-Task Sync
67
+
68
+ Before dispatching the current task:
69
+
70
+ 1. reconcile `tasks.md` `[x]` rows against native task state when obvious drift
71
+ exists
72
+ 2. mark the current native task `in_progress`
73
+ 3. set `activeForm` to `Executing <task-id>`
74
+
75
+ After `TASK_COMPLETE` and after disk-backed verification/state updates pass:
76
+
77
+ 1. mark that native task `completed`
78
+ 2. reset `native_sync_failure_count` toward zero on success
79
+
80
+ When interactive hook support is available, `TaskCompleted` should also reject
81
+ any CurDX-shaped native task completion that happens before `tasks.md` marks the
82
+ same task id as checked.
83
+
84
+ After `TASK_FAILED`:
85
+
86
+ 1. leave the current task visible
87
+ 2. update `activeForm` to `Retrying <task-id>` when the tool call is available
88
+ 3. never advance the native task to `completed`
89
+
90
+ ## Completion Rule
91
+
92
+ When execute finishes:
93
+
94
+ 1. ensure every checked task in `tasks.md` is `completed` in the native list
95
+ 2. do not invent native tasks for verify/review phases here
96
+ 3. keep the final chat summary short; the native task list is only a progress
97
+ surface, not the evidence surface
98
+
99
+ ## Guardrails
100
+
101
+ - Native sync is best-effort UX, not correctness.
102
+ - Never trust the native task list over `tasks.md`.
103
+ - Never stop a CurDX run solely because `TaskCreate` or `TaskUpdate` failed.
104
+ - Native task lifecycle guards are allowed to reject malformed CurDX-native
105
+ tasks, but they must not interfere with unrelated non-CurDX task-list usage.
106
+ - Do not use `TodoWrite` here; interactive Claude sessions should prefer the
107
+ current Task tool family.
@@ -13,6 +13,10 @@ Recent commits:
13
13
  ════════════════════
14
14
  ```
15
15
 
16
+ If native task sync is active, the Claude task list should mirror this same
17
+ progress state. Do not duplicate long task inventories in chat just because the
18
+ task list UI exists.
19
+
16
20
  When all tasks are complete:
17
21
 
18
22
  ```text
@@ -24,6 +24,9 @@ s['execute_state'].setdefault('global_iteration', 1)
24
24
  s['execute_state'].setdefault('recovery_mode', recovery_mode)
25
25
  s['execute_state'].setdefault('max_fix_tasks_per_original', max_fix_tasks)
26
26
  s['execute_state'].setdefault('fix_task_map', {})
27
+ s['execute_state'].setdefault('native_task_map', {})
28
+ s['execute_state'].setdefault('native_sync_enabled', True)
29
+ s['execute_state'].setdefault('native_sync_failure_count', 0)
27
30
  if QUICK:
28
31
  s['quickMode'] = True
29
32
  json.dump(s, open(p, 'w'), indent=2, ensure_ascii=False)
@@ -6,11 +6,13 @@ naturally and the hook layer decides whether Claude continues.
6
6
  ## Protocol
7
7
 
8
8
  ```text
9
- 1. Execute 1 task (inline or via flow-executor, then wait for completion)
10
- 2. End the response naturally
11
- 3. stop-watcher.sh evaluates transcript markers, .state.json, and tasks.md
12
- 4. If unchecked tasks remain, the hook blocks stop and injects continuation
13
- 5. Repeat until ALL_TASKS_COMPLETE or recovery forces a halt
9
+ 1. Initialize execute_state, including native sync fields
10
+ 2. Rebuild native_task_map if the current session cannot update stored task ids
11
+ 3. Execute 1 task (inline or via flow-executor, then wait for completion)
12
+ 4. End the response naturally
13
+ 5. stop-watcher.sh evaluates transcript markers, .state.json, and tasks.md
14
+ 6. If unchecked tasks remain, the hook blocks stop and injects continuation
15
+ 7. Repeat until ALL_TASKS_COMPLETE or recovery forces a halt
14
16
  ```
15
17
 
16
18
  ## Completion Invariant
@@ -23,6 +25,14 @@ Stop-hook completion is double-checked:
23
25
  If either side disagrees, the hook resumes the first unchecked task instead of
24
26
  allowing the session to stop cleanly.
25
27
 
28
+ If native task sync is active during a continuation round:
29
+
30
+ - `.flow/specs/<name>/tasks.md` stays the source of truth
31
+ - checked rows should be mirrored back to the native task list before the next
32
+ task starts
33
+ - a stale `native_task_map` should trigger the rebuild rule from
34
+ `references/native-task-sync.md`, not a run failure
35
+
26
36
  ## Prerequisites
27
37
 
28
38
  - `--quick` should be set; otherwise `AskUserQuestion` can stall the loop
@@ -34,3 +44,7 @@ allowing the session to stop cleanly.
34
44
  During a stop-hook continuation, `stop_hook_active=true` is only a hint. The
35
45
  hook still checks transcript signals, `.state.json`, and `tasks.md` parity
36
46
  before deciding whether to continue or stop.
47
+
48
+ Native task sync is session-local UX. Continuations may resume into a different
49
+ Claude-native task list, so recovery must prefer rebuilding the UI mirror over
50
+ mutating `.flow` state.
@@ -5,6 +5,12 @@ task, but the backlog is not parallel-safe enough for waves.
5
5
 
6
6
  ## Dispatch Contract
7
7
 
8
+ Before each dispatch, if native task sync is active:
9
+
10
+ - reconcile obvious `tasks.md` vs native-task drift
11
+ - mark the outgoing task `in_progress`
12
+ - set the native active form to `Executing <task-id>` when available
13
+
8
14
  Dispatch one `flow-executor` at a time with a uniform prompt:
9
15
 
10
16
  ```text
@@ -32,12 +38,20 @@ When all tasks are done output: ALL_TASKS_COMPLETE
32
38
 
33
39
  After the agent completes, read the output marker:
34
40
 
35
- - `TASK_COMPLETE` -> mark progress and dispatch the next task
36
- - `TASK_FAILED` -> increment failure tracking and enter recovery
41
+ - `TASK_COMPLETE` -> mark progress, mark the native task `completed`, and
42
+ dispatch the next task
43
+ - `TASK_FAILED` -> increment failure tracking, keep the native task open, and
44
+ enter recovery
37
45
  - `ALL_TASKS_COMPLETE` -> mark execute complete and move to verify
38
46
 
47
+ If a stored Claude task id no longer updates, rebuild `native_task_map` from
48
+ the current `tasks.md`, then continue serial dispatch. Do not rewrite
49
+ `.state.json` task cursor or `tasks.md` just because the UI task list changed.
50
+
39
51
  ## Guardrails
40
52
 
41
53
  - This strategy is serial; a single Agent dispatch is not parallel execution
42
54
  - Do not batch multiple independent tasks into one `flow-executor` prompt
43
55
  - Do not trust narrative summaries over the end marker
56
+ - Native task sync belongs to the coordinator around the dispatch loop, not to
57
+ the `flow-executor` subagent itself
@@ -50,6 +50,12 @@ If the planner mis-tagged `[P]` (modifies the same file), split the wave automat
50
50
 
51
51
  ## Step 3: Dispatch a Single Wave (key: within a single response)
52
52
 
53
+ Before dispatching the wave, if native task sync is active:
54
+
55
+ - reconcile obvious `tasks.md` vs native-task drift
56
+ - mark every task in the upcoming wave `in_progress`
57
+ - set the active form to `Executing Wave <n>: <task ids>` when available
58
+
53
59
  ```
54
60
  # List multiple Agent tool calls in one response of the main agent:
55
61
  Agent(description="Execute 1.1", prompt=<...flow-executor + task_id=1.1...>)
@@ -106,6 +112,14 @@ for task in completed:
106
112
  warn(f"Task {task.id} modified undeclared files: {unexpected}")
107
113
  ```
108
114
 
115
+ After aggregation, if native task sync is active:
116
+
117
+ - mark each `completed` task `completed`
118
+ - leave each `failed` task open for retry/recovery
119
+ - if any `TaskUpdate` call reports a stale or missing Claude task id, rebuild
120
+ `native_task_map` from `tasks.md` and continue with `.flow` as source of
121
+ truth
122
+
109
123
  ## Step 5: Wave Failure Handling
110
124
 
111
125
  ```
@@ -137,6 +151,10 @@ failed_attempts >= 3 (cumulative):
137
151
  ▶ Wave 3/5 starting (VERIFY)...
138
152
  ```
139
153
 
154
+ When native task sync is active, the Claude task list is the short-form UI
155
+ surface for the same wave progress. Keep chat feedback compact instead of
156
+ relisting the whole backlog after every wave.
157
+
140
158
  ## Configuration
141
159
 
142
160
  `.flow/config.json`:
@@ -19,6 +19,9 @@ Gather read-only state in this order:
19
19
  - `execute_state.total_tasks`
20
20
  - `execute_state.failed_attempts`
21
21
  - `execute_state.global_iteration`
22
+ - `execute_state.native_sync_enabled`
23
+ - `execute_state.native_sync_failure_count`
24
+ - `execute_state.native_task_map` size
22
25
  5. If `tasks.md` exists, count:
23
26
  - completed tasks: lines matching `- [x] **`
24
27
  - open tasks: lines matching `- [ ] **`
@@ -15,6 +15,7 @@ Phase: <phase | unknown>
15
15
  Strategy: <strategy | auto>
16
16
  Tasks: <done>/<total from tasks.md> checked, state cursor <task_index>/<total_tasks>
17
17
  Failures: <failed_attempts>, rounds: <global_iteration>
18
+ Native sync: <on|off>, mapped tasks: <count>, sync failures: <native_sync_failure_count>
18
19
  Artifacts: [x] research [x] requirements [x] design [x] tasks [ ] verify [ ] review
19
20
  Health: OK | NEEDS_ATTENTION
20
21
  Recovery: <one concrete next command>