@curdx/flow 2.3.11 → 3.1.0
- package/CHANGELOG.md +21 -34
- package/LICENSE +1 -1
- package/README.md +28 -79
- package/dist/index.mjs +995 -0
- package/package.json +33 -42
- package/.claude-plugin/marketplace.json +0 -48
- package/.claude-plugin/plugin.json +0 -70
- package/agent-preamble/preamble.md +0 -314
- package/agents/flow-adversary.md +0 -202
- package/agents/flow-architect.md +0 -197
- package/agents/flow-brownfield-analyst.md +0 -142
- package/agents/flow-debugger.md +0 -321
- package/agents/flow-edge-hunter.md +0 -288
- package/agents/flow-executor.md +0 -269
- package/agents/flow-orchestrator.md +0 -145
- package/agents/flow-planner.md +0 -246
- package/agents/flow-product-designer.md +0 -159
- package/agents/flow-qa-engineer.md +0 -282
- package/agents/flow-researcher.md +0 -165
- package/agents/flow-reviewer.md +0 -303
- package/agents/flow-security-auditor.md +0 -401
- package/agents/flow-triage-analyst.md +0 -272
- package/agents/flow-ui-researcher.md +0 -229
- package/agents/flow-ux-designer.md +0 -221
- package/agents/flow-verifier.md +0 -349
- package/bin/curdx-flow +0 -5
- package/bin/curdx-flow.js +0 -54
- package/cli/README.md +0 -104
- package/cli/doctor-workflow.js +0 -483
- package/cli/doctor.js +0 -73
- package/cli/help.js +0 -59
- package/cli/install-bundled-mcps.js +0 -37
- package/cli/install-companions.js +0 -19
- package/cli/install-context7-config.js +0 -80
- package/cli/install-curdx-plugin.js +0 -96
- package/cli/install-language.js +0 -35
- package/cli/install-next-steps.js +0 -29
- package/cli/install-options.js +0 -9
- package/cli/install-paths.js +0 -52
- package/cli/install-recommended-plugins.js +0 -104
- package/cli/install-required-plugins.js +0 -57
- package/cli/install-self-update.js +0 -62
- package/cli/install-workflow.js +0 -209
- package/cli/install.js +0 -101
- package/cli/lib/claude-commands.js +0 -41
- package/cli/lib/claude-ops.js +0 -47
- package/cli/lib/claude.js +0 -183
- package/cli/lib/config.js +0 -24
- package/cli/lib/doctor-claude-settings.js +0 -1186
- package/cli/lib/doctor-report.js +0 -978
- package/cli/lib/doctor-runtime-environment.js +0 -196
- package/cli/lib/frontmatter.js +0 -44
- package/cli/lib/json-schema.js +0 -57
- package/cli/lib/logging.js +0 -25
- package/cli/lib/process.js +0 -60
- package/cli/lib/prompts.js +0 -135
- package/cli/lib/runtime.js +0 -107
- package/cli/lib/semver.js +0 -109
- package/cli/lib/version.js +0 -12
- package/cli/protocols-body.md +0 -22
- package/cli/protocols.js +0 -162
- package/cli/registry.js +0 -123
- package/cli/router.js +0 -49
- package/cli/uninstall-actions.js +0 -360
- package/cli/uninstall-workflow.js +0 -146
- package/cli/uninstall.js +0 -42
- package/cli/upgrade-workflow.js +0 -80
- package/cli/upgrade.js +0 -91
- package/cli/utils.js +0 -40
- package/gates/adversarial-review-gate.md +0 -219
- package/gates/coverage-audit-gate.md +0 -182
- package/gates/devex-gate.md +0 -254
- package/gates/edge-case-gate.md +0 -194
- package/gates/karpathy-gate.md +0 -130
- package/gates/security-gate.md +0 -218
- package/gates/tdd-gate.md +0 -182
- package/gates/test-quality-gate.md +0 -59
- package/gates/verification-gate.md +0 -179
- package/hooks/hooks.json +0 -58
- package/hooks/scripts/common.sh +0 -46
- package/hooks/scripts/inject-karpathy.sh +0 -53
- package/hooks/scripts/quick-mode-guard.sh +0 -68
- package/hooks/scripts/session-start.sh +0 -90
- package/hooks/scripts/stop-watcher.sh +0 -230
- package/hooks/scripts/subagent-artifact-guard.sh +0 -159
- package/hooks/scripts/subagent-statusline.sh +0 -105
- package/knowledge/artifact-output-discipline.md +0 -24
- package/knowledge/artifact-summary-contracts.md +0 -50
- package/knowledge/atomic-commits.md +0 -262
- package/knowledge/claude-code-runtime-contracts.md +0 -219
- package/knowledge/epic-decomposition.md +0 -307
- package/knowledge/execution-strategies.md +0 -303
- package/knowledge/karpathy-guidelines.md +0 -219
- package/knowledge/planning-reviews.md +0 -211
- package/knowledge/poc-first-workflow.md +0 -223
- package/knowledge/review-feedback-intake.md +0 -57
- package/knowledge/spec-driven-development.md +0 -180
- package/knowledge/systematic-debugging.md +0 -378
- package/knowledge/two-stage-review.md +0 -249
- package/knowledge/wave-execution.md +0 -403
- package/monitors/monitors.json +0 -8
- package/monitors/scripts/flow-state-monitor.sh +0 -99
- package/output-styles/curdx-evidence-first.md +0 -34
- package/schemas/agent-frontmatter.schema.json +0 -63
- package/schemas/config.schema.json +0 -134
- package/schemas/gate-frontmatter.schema.json +0 -30
- package/schemas/hooks.schema.json +0 -115
- package/schemas/output-style-frontmatter.schema.json +0 -22
- package/schemas/plugin-manifest.schema.json +0 -436
- package/schemas/plugin-settings.schema.json +0 -29
- package/schemas/skill-frontmatter.schema.json +0 -177
- package/schemas/spec-frontmatter.schema.json +0 -42
- package/schemas/spec-state.schema.json +0 -147
- package/settings.json +0 -7
- package/skills/brownfield-index/SKILL.md +0 -53
- package/skills/brownfield-index/references/applicability.md +0 -12
- package/skills/brownfield-index/references/handoff.md +0 -8
- package/skills/brownfield-index/references/index-contract.md +0 -10
- package/skills/browser-qa/SKILL.md +0 -39
- package/skills/browser-qa/references/handoff.md +0 -6
- package/skills/browser-qa/references/prerequisites.md +0 -10
- package/skills/browser-qa/references/qa-contract.md +0 -20
- package/skills/cancel/SKILL.md +0 -41
- package/skills/cancel/references/destructive-mode.md +0 -17
- package/skills/cancel/references/reporting.md +0 -18
- package/skills/cancel/references/state-recovery.md +0 -30
- package/skills/cancel/references/target-resolution.md +0 -7
- package/skills/debug/SKILL.md +0 -45
- package/skills/debug/references/context-gathering.md +0 -11
- package/skills/debug/references/failure-guard.md +0 -25
- package/skills/debug/references/intake.md +0 -12
- package/skills/debug/references/phase-workflow.md +0 -34
- package/skills/debug/references/reporting.md +0 -20
- package/skills/epic/SKILL.md +0 -39
- package/skills/epic/references/epic-artifacts.md +0 -20
- package/skills/epic/references/epic-intake.md +0 -9
- package/skills/epic/references/slice-handoff.md +0 -16
- package/skills/fast/SKILL.md +0 -62
- package/skills/fast/references/applicability.md +0 -25
- package/skills/fast/references/clarification.md +0 -20
- package/skills/fast/references/execution-contract.md +0 -56
- package/skills/help/SKILL.md +0 -55
- package/skills/help/references/dispatch.md +0 -20
- package/skills/help/references/overview.md +0 -39
- package/skills/help/references/troubleshoot.md +0 -47
- package/skills/help/references/workflow.md +0 -37
- package/skills/implement/SKILL.md +0 -96
- package/skills/implement/references/error-recovery.md +0 -36
- package/skills/implement/references/linear-execution.md +0 -32
- package/skills/implement/references/preflight.md +0 -43
- package/skills/implement/references/progress-contract.md +0 -32
- package/skills/implement/references/state-init.md +0 -33
- package/skills/implement/references/stop-hook-execution.md +0 -36
- package/skills/implement/references/strategy-router.md +0 -38
- package/skills/implement/references/subagent-execution.md +0 -43
- package/skills/implement/references/wave-execution.md +0 -162
- package/skills/init/SKILL.md +0 -49
- package/skills/init/references/gitignore-and-health.md +0 -26
- package/skills/init/references/next-steps.md +0 -22
- package/skills/init/references/preflight.md +0 -15
- package/skills/init/references/scaffold-contract.md +0 -27
- package/skills/review/SKILL.md +0 -82
- package/skills/review/references/optional-passes.md +0 -48
- package/skills/review/references/preflight.md +0 -38
- package/skills/review/references/report-contract.md +0 -49
- package/skills/review/references/reporting.md +0 -20
- package/skills/review/references/stage-execution.md +0 -32
- package/skills/security-audit/SKILL.md +0 -47
- package/skills/security-audit/references/audit-contract.md +0 -21
- package/skills/security-audit/references/gate-handoff.md +0 -8
- package/skills/security-audit/references/scope-and-depth.md +0 -9
- package/skills/spec/SKILL.md +0 -100
- package/skills/spec/references/artifact-landing.md +0 -31
- package/skills/spec/references/phase-execution.md +0 -50
- package/skills/spec/references/planning-review.md +0 -31
- package/skills/spec/references/preflight-and-routing.md +0 -46
- package/skills/spec/references/reporting.md +0 -21
- package/skills/start/SKILL.md +0 -84
- package/skills/start/references/branch-routing.md +0 -51
- package/skills/start/references/mode-semantics.md +0 -12
- package/skills/start/references/preflight.md +0 -13
- package/skills/start/references/reporting.md +0 -20
- package/skills/start/references/state-seeding.md +0 -44
- package/skills/start/references/workflow-handoff.md +0 -26
- package/skills/status/SKILL.md +0 -41
- package/skills/status/references/gather-contract.md +0 -27
- package/skills/status/references/health-rules.md +0 -27
- package/skills/status/references/output-contract.md +0 -24
- package/skills/status/references/preflight.md +0 -10
- package/skills/status/references/recovery-hints.md +0 -18
- package/skills/ui-sketch/SKILL.md +0 -39
- package/skills/ui-sketch/references/brief-intake.md +0 -10
- package/skills/ui-sketch/references/iteration-handoff.md +0 -5
- package/skills/ui-sketch/references/variant-contract.md +0 -15
- package/skills/verify/SKILL.md +0 -56
- package/skills/verify/references/evidence-workflow.md +0 -39
- package/skills/verify/references/output-contract.md +0 -23
- package/skills/verify/references/preflight.md +0 -11
- package/skills/verify/references/report-handoff.md +0 -35
- package/skills/verify/references/strict-mode.md +0 -12
- package/templates/CONTEXT.md.tmpl +0 -53
- package/templates/PROJECT.md.tmpl +0 -59
- package/templates/ROADMAP.md.tmpl +0 -50
- package/templates/STATE.md.tmpl +0 -49
- package/templates/config.json.tmpl +0 -51
- package/templates/design.md.tmpl +0 -83
- package/templates/progress.md.tmpl +0 -77
- package/templates/requirements.md.tmpl +0 -76
- package/templates/research.md.tmpl +0 -83
- package/templates/tasks.md.tmpl +0 -107
@@ -1,249 +0,0 @@
-# Two-Stage Review: Two-Stage Code Review
-
-> CurDX-Flow runtime contract for two-stage review.
->
-> Agents reference this via `@${CLAUDE_PLUGIN_ROOT}/knowledge/two-stage-review.md`.
-
----
-
-## Why Two Stages
-
-One stage can't do it all. Separate "is it the right thing" from "is it done well":
-
-```
-Stage 1: Spec Compliance
-Question: "Does the code actually implement what the spec asked for?"
-Focus: landing of FR / AC / AD
-
-Stage 2: Code Quality
-Question: "Is the implementation done well?"
-Focus: style / tests / maintainability / performance
-```
-
-Downsides of reviewing them together:
-- Find quality issues → but the code doesn't implement the requirements → quality polish is wasted
-- Find missing requirements → quality advice drowns
-- Findings are numerous and unsorted → user doesn't know what to fix first
-
-Benefits of separation:
-- Stage 1 passing gates Stage 2 → saves time
-- Report is layered → user first fixes "not done right" then "not done well enough"
-
----
-
-## Stage 1: Spec Compliance
-
-### Core question
-
-**Is this doing what the spec asked for?**
-
-### Checklist
-
-#### 1.1 FR coverage
-
-For each FR-NN (from requirements.md):
-- Can you find a corresponding implementation in the code?
-- Is it real or a stub (`throw new Error('not implemented')`)?
-- Does a commit reference this FR (footer `Requirements: FR-NN`)?
-
-#### 1.2 AC verifiable
-
-For each AC-X.Y:
-- Is there a corresponding test case?
-- Does the test actually run (not skipped)?
-- Does the test actually exercise the AC behavior, not just a placeholder?
-
-#### 1.3 AD landing
-
-For each AD-NN (from design.md):
-- Does the code reflect this decision?
-- Is the decision not violated?
-- If violated, was design.md bumped in version + new AD recorded?
-
-#### 1.4 Out-of-scope respected
-
-Against the "out of scope" list in requirements.md:
-- Does the code actually **not do** these?
-- If it does, was it an intentional extension (with an explanatory commit), or scope creep?
-
-#### 1.5 Error paths
-
-Against the "error paths" table in design.md:
-- Is each scenario handled?
-- Is there a corresponding test?
-
-### Stage 1 verdict
-
-- **PASS**: all FR / AD fully implemented, all AC have corresponding tests
-- **PARTIAL**: implemented, but some FR / AC lack tests
-- **FAIL**: some FR / AD not implemented, or out-of-scope leakage
-
----
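The Stage 1 verdict rules can be sketched as a small function. This is a minimal illustration under stated assumptions, not the package's implementation: `Stage1Evidence` and its field names are hypothetical stand-ins for the outcome of the checklist in sections 1.1 to 1.5.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Stage1Evidence:
    """Hypothetical summary of the Stage 1 checklist (field names are assumptions)."""
    unimplemented: list[str] = field(default_factory=list)  # FR/AD ids with no real implementation
    scope_leaks: list[str] = field(default_factory=list)    # out-of-scope items found in the code
    untested: list[str] = field(default_factory=list)       # FR/AC ids without a running test


def stage1_verdict(ev: Stage1Evidence) -> str:
    # FAIL: some FR / AD not implemented, or out-of-scope leakage.
    if ev.unimplemented or ev.scope_leaks:
        return "FAIL"
    # PARTIAL: everything is implemented, but some FR / AC lack tests.
    if ev.untested:
        return "PARTIAL"
    # PASS: fully implemented and every AC has a corresponding test.
    return "PASS"
```

Checking FAIL first mirrors the rule that missing requirements outrank missing tests.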
-
-## Stage 2: Code Quality
-
-### Core question
-
-**Is the implementation done well? Can a future maintainer pick it up?**
-
-### Dimensions
-
-Stage 2 applies all enabled Gates (from `.flow/config.json`):
-
-#### 2.1 Karpathy 4 principles (karpathy-gate)
-
-- Assumptions explicit?
-- Over-engineered?
-- Surgical?
-- Claims without evidence?
-
-#### 2.2 Verification baseline (verification-gate)
-
-- Do commit messages / `.progress.md` contain forbidden words?
-- Do claims have fresh evidence?
-
-#### 2.3 TDD discipline (tdd-gate)
-
-- Is there a `test(xxx): red -` before a `feat(xxx):` commit?
-- Are exemptions recorded in STATE.md?
-
-#### 2.4 Coverage completeness (coverage-audit-gate)
-
-- All 4 sources (FR / AD / Research / Decisions) covered?
-
-#### 2.5 Test quality (test-quality-gate)
-
-- Do tests used as FR/AC evidence exercise real behavior, not only mocks/spies?
-- Are skipped/assertion-free tests excluded from evidence?
-- Are mock-heavy tests backed by integration/e2e coverage or a documented boundary rationale?
-- Are stateful mocks cleaned up between tests?
-
-#### 2.6 (enterprise) Adversarial review (adversarial-review-gate)
-
-- Every applicable category examined (N/A documented for the rest)?
-- Findings proportional to real issues (zero is OK with a proof-of-checking report)?
-- Each finding has evidence + recommendation?
-
-#### 2.7 (enterprise) Edge cases (edge-case-gate)
-
-- Each applicable edge-case category addressed (N/A noted for the rest)?
-- Gap list has priorities?
-
-#### 2.8 (enterprise) DevEx review (devex-gate)
-
-- Are naming, comments, and structure maintainable for the next engineer?
-- Are setup, typing, and test ergonomics acceptable without tribal knowledge?
-- Does the developer loop stay fast enough to keep future changes safe?
-
-### Stage 2 verdict
-
-- **EXCELLENT**: all enabled Gates pass, adversarial review clean or only low-severity findings
-- **GOOD**: all enabled Gates pass, but some warnings
-- **NEEDS_IMPROVEMENT**: Gate violations (blocking)
-
----
-
-## Combined Verdict
-
-```python
-def verdict(stage1, stage2):
-    # Stage 1 must pass to proceed to Stage 2
-    if stage1 == "FAIL":
-        return "BLOCKED_BY_SPEC"  # must go back to /curdx-flow:implement
-
-    if stage1 == "PARTIAL" and stage2 == "EXCELLENT":
-        return "APPROVED_WITH_WARNINGS"  # few tests but code is good
-
-    if stage2 == "NEEDS_IMPROVEMENT":
-        return "BLOCKED_BY_QUALITY"  # Gate violations must be fixed
-
-    if stage1 == "PASS" and stage2 == "EXCELLENT":
-        return "APPROVED"
-
-    return "APPROVED_WITH_WARNINGS"
-```
-
----
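To see how the two stage verdicts combine, it can help to enumerate all nine pairs. The sketch below repeats the `verdict` function with the same branch order so that it runs on its own; the `STAGE1`, `STAGE2`, and `table` names are introduced here for illustration only.

```python
from itertools import product


def verdict(stage1: str, stage2: str) -> str:
    # Same branch order as the Combined Verdict function in this section.
    if stage1 == "FAIL":
        return "BLOCKED_BY_SPEC"
    if stage1 == "PARTIAL" and stage2 == "EXCELLENT":
        return "APPROVED_WITH_WARNINGS"
    if stage2 == "NEEDS_IMPROVEMENT":
        return "BLOCKED_BY_QUALITY"
    if stage1 == "PASS" and stage2 == "EXCELLENT":
        return "APPROVED"
    return "APPROVED_WITH_WARNINGS"


STAGE1 = ("PASS", "PARTIAL", "FAIL")
STAGE2 = ("EXCELLENT", "GOOD", "NEEDS_IMPROVEMENT")

# Map every (stage1, stage2) pair to its combined verdict.
table = {(s1, s2): verdict(s1, s2) for s1, s2 in product(STAGE1, STAGE2)}

for (s1, s2), result in table.items():
    print(f"{s1:8} {s2:18} -> {result}")
```

Only PASS with EXCELLENT yields `APPROVED`; every FAIL row is blocked regardless of the Stage 2 result.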
-
-## Fix Loop
-
-When the review turns up issues, the typical flow:
-
-```
-1. Review returns NEEDS_FIXES
-   ↓
-2. User decides what must be fixed vs tolerated
-   ↓
-3. Fix:
-   - Spec class: /curdx-flow:implement --task=add a new task
-   - Quality class: /curdx-flow:implement rerun a task
-   - Miscellaneous: /curdx-flow:fast "fix X raised by review"
-   ↓
-4. /curdx-flow:review re-review
-   ↓
-5. Until APPROVED → hand off with review-report.md + atomic commits
-```
-
-Before implementing review feedback, apply `@${CLAUDE_PLUGIN_ROOT}/knowledge/review-feedback-intake.md`:
-- Verify each finding against code/spec reality.
-- Classify as `BLOCKER`, `IMPORTANT`, `SUGGESTION`, or `PUSHBACK`.
-- Fix accepted items one at a time with targeted verification.
-- Record technical pushback in `.progress.md` instead of silently ignoring feedback.
-
----
-
-## Failure Modes of Two Stages
-
-### Anti-pattern 1: "I'll just read the code"
-
-Some reviewers skip the spec comparison and read the code directly. Result:
-- "Code looks quality" → but doesn't implement the requirements
-- Missed key FRs
-- Over-focus on details, miss architectural decisions
-
-**Correction**: Stage 1 must walk through the FR/AC/AD checklist item by item.
-
-### Anti-pattern 2: "Close enough"
-
-Some reviewers find a missing FR but give APPROVED because the code quality is high.
-
-**Correction**: Stage 1 FAIL is BLOCKED. "Trading quality for compliance" is not allowed.
-
-### Anti-pattern 3: "Too many findings"
-
-Some reviewers list 50 minor improvements, more than the user can process.
-
-**Correction**: tier them as blocker / warning / suggestion. User only needs to look at blockers.
-
-### Anti-pattern 4: "No evidence"
-
-"This could be improved" / "feels the code quality isn't high enough".
-
-**Correction**: every finding has **file:line + evidence + recommendation**.
-
----
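The corrections for anti-patterns 3 and 4 suggest a simple shape for a finding: a tier plus the required file:line, evidence, and recommendation. The sketch below is one way to model that; the `Finding` class, its field names, and `sort_findings` are hypothetical and not taken from the package.

```python
from dataclasses import dataclass

# Lower number sorts first: blockers, then warnings, then suggestions.
TIER_ORDER = {"blocker": 0, "warning": 1, "suggestion": 2}


@dataclass(frozen=True)
class Finding:
    tier: str            # "blocker" | "warning" | "suggestion"
    location: str        # "path/to/file.ts:42"
    evidence: str
    recommendation: str

    def __post_init__(self):
        if self.tier not in TIER_ORDER:
            raise ValueError(f"unknown tier: {self.tier}")
        # Anti-pattern 4: reject findings that lack file:line, evidence, or a recommendation.
        if ":" not in self.location or not self.evidence or not self.recommendation:
            raise ValueError("finding needs file:line + evidence + recommendation")


def sort_findings(findings):
    """Return findings with blockers first, so the user can stop reading after them."""
    return sorted(findings, key=lambda f: TIER_ORDER[f.tier])
```

Validating at construction time means a vague finding such as "this could be improved" cannot enter the report at all.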
-
-## Relationship to Other Phases
-
-```
-/curdx-flow:spec --phase=tasks → tasks.md contains task list
-    ↓
-/curdx-flow:implement → code + tests + commits
-    ↓
-/curdx-flow:verify → Goal-backward verification (flow-verifier)
-    ↓        ↓
-    ↓        verification-report.md
-    ↓
-/curdx-flow:review → Two-Stage Review (flow-reviewer)
-    ↓        ↓
-    ↓        review-report.md
-    ↓
-(optional) /curdx-flow:review --adversarial --edge-case --devex
-    ↓
-adversarial-review.md
-edge-cases.md
-    ↓
-Ready for human PR/release handoff with verification + review evidence
-```
-
-Verify is "did we implement the right thing", Review is "is the implementation good", Audit is "what else could be better". CurdX-Flow currently stops at evidence-backed handoff; do not reference non-existent ship/land commands.
@@ -1,403 +0,0 @@
-# Wave Execution: DAG Parallel Execution Strategy
-
-> One of Phase 2's 4 execution strategies. Identify parallel-safe task groups via `[P]` markers, dispatch multiple Agent tool calls **in a single message**, run in parallel within a wave and serially across waves.
->
-> Agents reference this via `@${CLAUDE_PLUGIN_ROOT}/knowledge/wave-execution.md`.
-
----
-
-## Core Concepts
-
-### Wave = a group of parallel-safe tasks
-
-A wave is **a consecutive run of `[P]`-marked tasks**. Within a wave, run in parallel; across waves, run serially.
-
-Hard limits:
-- Max 5 tasks per wave (`max_parallel` ceiling). More than 5 tasks must be split by a `[VERIFY]` checkpoint or a serial boundary.
-- Every task in a wave owns a disjoint `Files` set.
-- Shared config/barrel/registry files are serial by default: `package.json`, lockfiles, `tsconfig.*`, `index.ts`, router registries, migration manifests, generated schema registries.
-- Read-after-write is a conflict even when file paths differ: if task B imports, tests, or configures output from task A, B must run in a later wave.
-
-```
-tasks.md:
-  1.1 [P] create auth directory
-  1.2 [P] create user directory
-  1.3 [P] create session directory
-  1.4 [VERIFY] verify directory structure
-  1.5 init index.ts (depends on 1.1-1.3)
-  1.6 [P] add README to auth
-  1.7 [P] add README to user
-
-Analysis:
-  Wave 1: { 1.1, 1.2, 1.3 } - parallel (3 Agent calls)
-  Wave 2: { 1.4 } - serial (VERIFY breaks)
-  Wave 3: { 1.5 } - serial (no [P])
-  Wave 4: { 1.6, 1.7 } - parallel (2 Agent calls)
-```
-
----
-
-## DAG Analysis Algorithm
-
-```python
-def analyze_waves(tasks):
-    waves = []
-    current_wave = []
-
-    for task in tasks:
-        if task.status == 'completed':
-            continue
-
-        # markers that break parallelism
-        if task.has_flag('SEQUENTIAL') or task.has_flag('VERIFY'):
-            # end current wave
-            if current_wave:
-                waves.append(current_wave)
-                current_wave = []
-            # this task becomes its own wave
-            waves.append([task])
-            continue
-
-        # has [P] → candidate for current wave
-        if task.has_flag('P'):
-            # conflict detection: do Files intersect the current wave's files?
-            if has_file_conflict(task, current_wave):
-                # conflict → start a new wave
-                waves.append(current_wave)
-                current_wave = [task]
-            else:
-                current_wave.append(task)
-
-        # no [P] → own wave
-        else:
-            if current_wave:
-                waves.append(current_wave)
-                current_wave = []
-            waves.append([task])
-
-    # cleanup
-    if current_wave:
-        waves.append(current_wave)
-
-    return waves
-```
-
-### Conflict detection
-
-```python
-def has_file_conflict(task, wave):
-    """Do task's Files intersect any wave task's Files?"""
-    task_files = set(task.files)
-    if touches_shared_serial_surface(task_files):
-        return True
-    for other in wave:
-        if task_files & set(other.files):
-            return True
-        if has_read_after_write_dependency(task, other):
-            return True
-    return False
-```
-
-Rules:
-- Two `[P]` tasks editing the same file → conflict, must split into different waves
-- Two `[P]` tasks creating different files → OK
-- One reads what another writes → **conflict** (reads aren't guaranteed to see latest)
-- More than 5 `[P]` tasks in one consecutive run → split the wave before dispatch
-
----
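The wave analysis and conflict rules can be combined into one runnable sketch. It is an illustration under assumptions rather than the package's code: the `Task` dataclass, its `reads` field, and the two-entry `SERIAL_NAMES` set are hypothetical. It also enforces the five-task ceiling and never emits an empty wave, both of which the rules call for.

```python
from dataclasses import dataclass, field

MAX_PARALLEL = 5  # ceiling from the hard limits
# Illustrative subset of shared serial surfaces; a real list would be longer.
SERIAL_NAMES = {"package.json", "index.ts"}


@dataclass
class Task:
    id: str
    flags: set = field(default_factory=set)  # e.g. {"P"} or {"VERIFY"}
    files: set = field(default_factory=set)  # declared Files (writes)
    reads: set = field(default_factory=set)  # hypothetical: files the task imports or tests


def touches_serial_surface(task):
    return any(path.rsplit("/", 1)[-1] in SERIAL_NAMES for path in task.files)


def conflicts(task, wave):
    if touches_serial_surface(task):
        return True  # shared config/barrel files are serial by default
    for other in wave:
        if task.files & other.files:
            return True  # two tasks write the same file
        if task.reads & other.files or other.reads & task.files:
            return True  # read-after-write dependency
    return False


def analyze_waves(tasks):
    waves, current = [], []
    for task in tasks:
        is_parallel = "P" in task.flags and not task.flags & {"VERIFY", "SEQUENTIAL"}
        if is_parallel and len(current) < MAX_PARALLEL and not conflicts(task, current):
            current.append(task)
            continue
        if current:  # close the open wave; never emit an empty one
            waves.append(current)
        if is_parallel and not touches_serial_surface(task):
            current = [task]  # conflict or full wave: this task opens the next wave
        else:
            waves.append([task])  # serial task runs alone
            current = []
    if current:
        waves.append(current)
    return waves
```

Run against the seven-task `tasks.md` example from Core Concepts, this grouping yields four waves: three parallel tasks, the `[VERIFY]` checkpoint, the serial `index.ts` task, then two parallel README tasks.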
-
-## How Parallel Agent Dispatch Actually Works
-
-### Key: multiple Agent tool calls in a single message
-
-Claude Code's Agent tool runs in parallel **when multiple calls appear in the same message**.
-Across **separate messages** → runs sequentially.
-
-### Correct form (Wave strategy)
-
-```
-# In a single main-agent response:
-
-Agent(description="Task 1.1", prompt="...execute 1.1...")
-Agent(description="Task 1.2", prompt="...execute 1.2...")
-Agent(description="Task 1.3", prompt="...execute 1.3...")
-
-# Wait for all to return before continuing to the next wave
-```
-
-### Incorrect form (degenerates to serial subagent)
-
-```
-# One response:
-Agent(description="Task 1.1", ...)
-# wait for return
-# Next response:
-Agent(description="Task 1.2", ...)
-# wait for return
-# Next response:
-Agent(description="Task 1.3", ...)
-```
-
-This is not parallel; it's serial subagent. The Wave strategy loses its meaning.
-
----
-
-## Full Wave Dispatch Flow
-
-```
-for wave_index, wave in enumerate(waves):
-
-    # === Step 1: show wave info ===
-    echo "▶ Wave $wave_index: parallel ${#wave} tasks"
-    for task in wave:
-        echo "  • $task.id $task.title"
-
-    # === Step 2: dispatch (key: within a single message) ===
-    # This is the main agent's response body. Call N Agent tools at once.
-    results = await asyncio.gather([
-        Agent(
-            description=f"execute {task.id}",
-            prompt=f"""
-            You are the flow-executor agent.
-            Full definition: ${CLAUDE_PLUGIN_ROOT}/agents/flow-executor.md
-
-            Execute single task:
-            spec_name: {spec_name}
-            task_id: {task.id}
-            quick_mode: {quick_mode}
-
-            You may only edit the following files:
-            {task.files}
-
-            Output TASK_COMPLETE / TASK_FAILED
-            """,
-        )
-        for task in wave
-    ])
-
-    # === Step 3: aggregate results ===
-    completed = [t for t, r in zip(wave, results) if "TASK_COMPLETE" in r]
-    failed = [t for t, r in zip(wave, results) if "TASK_FAILED" in r]
-
-    # === Step 4: post-hoc conflict check ===
-    # After wave tasks complete, check for unexpected edits
-    git_status = run("git status --short")
-    # For each task, confirm it only edited its declared Files
-    for task in completed:
-        declared_files = set(task.files)
-        actual_changed = get_changed_files_since_wave_start(task)
-        if actual_changed - declared_files:
-            # Unexpected edits
-            warn(f"Wave {wave_index} task {task.id} edited undeclared files: {actual_changed - declared_files}")
-            # Don't fail immediately, but record
-
-    # === Step 5: failure handling ===
-    if failed:
-        # Policy:
-        # - 1 failure: continue other waves, report after completion
-        # - ≥ 2 failures: stop execution, user intervenes
-        if len(failed) == 1:
-            record_failure(failed[0])
-            continue_to_next_wave()
-        else:
-            stop_and_report(failed)
-            return
-
-    # === Step 6: inter-wave synchronization point ===
-    # All Agent calls complete = wave ends
-    # Before next wave starts, confirm git state is consistent
-
-# All waves done
-echo "ALL_TASKS_COMPLETE"
-```
-
----
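The fan-out, fan-in shape of steps 2 to 4 can be modeled in plain Python. This is only an analogy under stated assumptions: in Claude Code the parallelism comes from issuing several Agent tool calls in one message, not from `asyncio`, and `run_agent` below is a hypothetical stand-in that fails task `1.2` on purpose. Note that `asyncio.gather` takes its awaitables as positional arguments, so a collection has to be unpacked with `*`.

```python
import asyncio


async def run_agent(task_id: str) -> str:
    """Hypothetical stand-in for one Agent tool call."""
    await asyncio.sleep(0.01)  # simulate work
    return "TASK_FAILED" if task_id == "1.2" else "TASK_COMPLETE"


async def dispatch_wave(task_ids):
    # Start every task in the wave together, then wait for all of them (step 2).
    results = await asyncio.gather(*(run_agent(t) for t in task_ids))
    # Aggregate by result marker (step 3); gather preserves input order.
    completed = [t for t, r in zip(task_ids, results) if r == "TASK_COMPLETE"]
    failed = [t for t, r in zip(task_ids, results) if r == "TASK_FAILED"]
    return completed, failed


def undeclared_edits(declared: set, changed: set) -> set:
    """Step 4: files a task touched outside its declared Files set."""
    return changed - declared


completed, failed = asyncio.run(dispatch_wave(["1.1", "1.2", "1.3"]))
```

With exactly one failure, the policy in step 5 would record it and continue to the next wave.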
-
-## Three Kinds of File Conflicts
-
-### Case 1: no conflict (ideal)
-
-```
-Task 1.1 [P]: create src/auth/login.ts (Files: auth/login.ts)
-Task 1.2 [P]: create src/user/profile.ts (Files: user/profile.ts)
-Task 1.3 [P]: create src/session/token.ts (Files: session/token.ts)
-```
-
-→ Same wave, parallel, no problem.
-
-### Case 2: same-file conflict (must split waves)
-
-```
-Task 2.1 [P]: modify src/index.ts to add auth export (Files: index.ts)
-Task 2.2 [P]: modify src/index.ts to add user export (Files: index.ts)
-```
-
-→ Both edit `index.ts`, editing concurrently **conflicts**.
-
-flow-planner should catch this and remove `[P]` from one of them.
-If the planner misses it, flow-implement's conflict detection should split the wave.
-
-### Case 3: read-write conflict (implicit)
-
-```
-Task 3.1 [P]: modify Order type definition (Files: types/order.ts)
-Task 3.2 [P]: use Order to implement payment (Files: payment/process.ts, imports types/order.ts)
-```
-
-→ 3.2 imports 3.1's change. Parallel → 3.2 may see the old Order.
-
-Such **implicit dependencies** are harder to detect. Best is for flow-planner to avoid `[P]` across such dependencies when generating tasks.
-
----
|
|
252
|
-
|
|
253
|
-
## Failure Handling Policies
|
|
254
|
-
|
|
255
|
-
### Single task failure
|
|
256
|
-
|
|
257
|
-
```
|
|
258
|
-
Wave 1 contains { 1.1, 1.2, 1.3 }
|
|
259
|
-
1.1 → TASK_COMPLETE
|
|
260
|
-
1.2 → TASK_FAILED
|
|
261
|
-
1.3 → TASK_COMPLETE
|
|
262
|
-
|
|
263
|
-
Decision:
|
|
264
|
-
- 1.2 marked failed, recorded to .state.json
|
|
265
|
-
- 1.1 and 1.3 commits retained
|
|
266
|
-
- Main agent decides:
|
|
267
|
-
A: continue to Wave 2 (skip 1.2, possible cascading failure)
|
|
268
|
-
B: dispatch flow-debugger to fix 1.2, then continue
|
|
269
|
-
C: stop and report, let the user intervene
|
|
270
|
-
|
|
271
|
-
Default: A, but failed_attempts += 1; after threshold switch to C
|
|
272
|
-
```

### Entire wave failed

```
Wave 1 all TASK_FAILED

Decision:
  - Usually indicates an upstream environment problem (missing deps, tsc config wrong)
  - Stop immediately
  - Suggest user run `npx @curdx/flow doctor` to diagnose
```

### Inter-wave dependency broken

```
Wave 1 output is depended on by Wave 2
Wave 1 has failures → Wave 2 may also fail

Decision:
  - After Wave 1 fails, evaluate whether Wave 2 can still run
  - If each Wave 2 task's Files are unrelated to failed Wave 1 task Files → continue
  - Otherwise → stop
```
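
The continue-or-stop evaluation above might look like the following sketch. The names `WaveTask` and `evaluateNextWave` are hypothetical, and "unrelated" is approximated here as "shares no file path", which is an assumption; it would not catch import relationships such as the one in Case 3.

```typescript
// Hypothetical shape: a task id plus the paths from its "Files:" entry.
interface WaveTask {
  id: string;
  files: string[];
}

// Returns "continue" only when no task in the next wave touches a file
// that belongs to a failed task of the previous wave.
function evaluateNextWave(failed: WaveTask[], next: WaveTask[]): "continue" | "stop" {
  const failedFiles = new Set<string>();
  for (const task of failed) {
    for (const file of task.files) failedFiles.add(file);
  }
  const touchesFailed = next.some((task) =>
    task.files.some((file) => failedFiles.has(file)),
  );
  return touchesFailed ? "stop" : "continue";
}

const failedInWave1: WaveTask[] = [{ id: "1.2", files: ["src/auth/session.ts"] }];
const wave2Tasks: WaveTask[] = [
  { id: "2.1", files: ["src/ui/button.tsx"] },
  { id: "2.2", files: ["src/auth/session.ts", "src/auth/login.ts"] },
];

// Task 2.2 shares src/auth/session.ts with the failed task, so this stops.
console.log(evaluateNextWave(failedInWave1, wave2Tasks));
```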

---

## When to Choose the Wave Strategy

### Suitable

- ✓ `[P]` markers make up ≥ 40% of tasks.md
- ✓ Task groups are independent (different modules, different components)
- ✓ Shortest wall-clock time desired
- ✓ You trust that planner-marked `[P]` is accurate

### Not suitable

- ✗ Many `[SEQUENTIAL]` in tasks (wave degenerates to linear)
- ✗ Tasks depend on each other (use serial subagent)
- ✗ Debugging (need to see the full process; use linear)
- ✗ Parallel overhead > task duration (e.g., 1-second tasks with 10-second dispatch)
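
Of the criteria above, only the `[P]` share is easy to check mechanically. The sketch below assumes one task per line in the `Task <id> [P]: ...` form used in the examples in this document; the real tasks.md layout may differ, and `parallelRatio` and `meetsParallelThreshold` are hypothetical helpers.

```typescript
// Fraction of task lines that carry a [P] marker (0 when there are no tasks).
function parallelRatio(tasksMd: string): number {
  const taskLines = tasksMd
    .split("\n")
    .filter((line) => /^\s*Task\s+\d+(\.\d+)*\b/.test(line));
  if (taskLines.length === 0) return 0;
  const parallelLines = taskLines.filter((line) => /\[P\]/.test(line));
  return parallelLines.length / taskLines.length;
}

// Applies the "≥ 40%" rule of thumb; the threshold is a parameter
// so that the rule can be tuned.
function meetsParallelThreshold(tasksMd: string, minRatio: number): boolean {
  return parallelRatio(tasksMd) >= minRatio;
}

const sampleTasksMd = [
  "Task 1.1 [P]: add login form",
  "Task 1.2 [P]: add signup form",
  "Task 1.3 [P]: add password reset form",
  "Task 2.1 [SEQUENTIAL]: wire forms to the auth API",
  "Task 2.2 [VERIFY]: run the test suite",
].join("\n");

// Three of five tasks are [P], so the 40% rule is satisfied.
console.log(meetsParallelThreshold(sampleTasksMd, 0.4));
```

The remaining criteria (independence, trust in the planner, dispatch overhead) still call for judgment.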

---

## Monitoring and Interruption

### In-progress view

Inspecting `.flow/specs/<name>/.progress.md` (or running `/curdx-flow:start --list`) shows:
```
Spec: auth-system
Strategy: wave
Progress: Wave 2/5 (60%)
Wave 1 [P×3]: ✓✓✓
Wave 2 [P×2]: ●● (in progress)
Wave 3 [VERIFY]: ○
...
```

### Ctrl+C interruption

- Agent calls already running in the current wave keep going (Claude Code's Agent tool starts independent work)
- The next `/curdx-flow:start --resume` shows that some tasks are already committed
- Resume from the failing task

---

## Configuration

`.flow/config.json`:

```json
{
  "execution": {
    "strategy": "wave",
    "max_parallel": 5,
    "wave_fail_policy": "continue-on-single | stop-on-any",
    "recovery_mode": "manual | fix-task",
    "max_fix_tasks_per_original": 2
  }
}
```

- `max_parallel`: maximum parallel tasks per wave (default 5, to avoid API rate limits)
- `wave_fail_policy`: default behavior on single task failure
- `recovery_mode`: whether a failed wave task blocks for manual retry or creates a targeted `[FIX <task_id>]` task before retry
- `max_fix_tasks_per_original`: maximum fix tasks generated for one original task
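
A hedged sketch of how these fields could be validated follows. It assumes that the `a | b` strings in the JSON above list the allowed values rather than literal settings; `validateExecution` and the constant names are hypothetical and not part of the package.

```typescript
// Allowed values, read off the "a | b" notation in the example config (an assumption).
const WAVE_FAIL_POLICIES = ["continue-on-single", "stop-on-any"];
const RECOVERY_MODES = ["manual", "fix-task"];

function isPositiveInteger(value: unknown): boolean {
  return typeof value === "number" && Number.isInteger(value) && value >= 1;
}

// Returns a list of human-readable problems; an empty list means the
// "execution" object passed every check. Absent keys are not reported.
function validateExecution(execution: Record<string, unknown>): string[] {
  const problems: string[] = [];

  const maxParallel = execution["max_parallel"];
  if (maxParallel !== undefined && !isPositiveInteger(maxParallel)) {
    problems.push("max_parallel must be a positive integer");
  }

  const policy = execution["wave_fail_policy"];
  if (policy !== undefined && (typeof policy !== "string" || WAVE_FAIL_POLICIES.indexOf(policy) === -1)) {
    problems.push("wave_fail_policy must be one of: " + WAVE_FAIL_POLICIES.join(", "));
  }

  const recovery = execution["recovery_mode"];
  if (recovery !== undefined && (typeof recovery !== "string" || RECOVERY_MODES.indexOf(recovery) === -1)) {
    problems.push("recovery_mode must be one of: " + RECOVERY_MODES.join(", "));
  }

  const maxFix = execution["max_fix_tasks_per_original"];
  if (maxFix !== undefined && !isPositiveInteger(maxFix)) {
    problems.push("max_fix_tasks_per_original must be a positive integer");
  }

  return problems;
}

const exampleExecution = {
  strategy: "wave",
  max_parallel: 5,
  wave_fail_policy: "continue-on-single",
  recovery_mode: "fix-task",
  max_fix_tasks_per_original: 2,
};

console.log(validateExecution(exampleExecution)); // no problems reported
```

Under this reading, copying the placeholder string `"continue-on-single | stop-on-any"` verbatim into a real config would be reported as a problem.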

---

## Mixing with Other Strategies

Sometimes a single tasks.md doesn't fit one strategy. **Mixed strategies** are supported:

```bash
# Run fast waves first, fall back to linear single-task debugging on failure
/curdx-flow:implement --strategy=wave
# Wave strategy runs until a task fails repeatedly → stop
/curdx-flow:implement --task=2.3 --strategy=linear
# After fixing, continue with waves
/curdx-flow:implement --strategy=wave
```

Phase 6+ will consider automatic fallback.

---

## Common Pitfalls

### 1. `[P]` markers incorrect

If the planner missed a dependency, a `[P]` marker may be wrong. Solutions:
- Before execution, confirm task coverage via `/curdx-flow:verify --strict`
- Use conflict detection as a safety net (validate `Files` before dispatch)

### 2. A wave too large

10+ parallel Agent calls may trigger API rate limits or context pressure. Solutions:
- `max_parallel: 5` splits a big wave into several
- flow-planner should avoid generating overly large waves
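
Splitting an oversized wave under `max_parallel` amounts to chunking the task list. The sketch below shows the idea; `splitWave` is a hypothetical helper, and whether the runtime runs the resulting sub-waves strictly one after another is an assumption.

```typescript
// Splits one wave into consecutive sub-waves of at most maxParallel tasks,
// preserving the original task order.
function splitWave<T>(wave: T[], maxParallel: number): T[][] {
  if (!Number.isInteger(maxParallel) || maxParallel < 1) {
    throw new RangeError("maxParallel must be a positive integer");
  }
  const subWaves: T[][] = [];
  for (let start = 0; start < wave.length; start += maxParallel) {
    subWaves.push(wave.slice(start, start + maxParallel));
  }
  return subWaves;
}

// Twelve illustrative task ids with max_parallel = 5.
const bigWave = ["1.1", "1.2", "1.3", "1.4", "1.5", "1.6",
                 "1.7", "1.8", "1.9", "1.10", "1.11", "1.12"];

// Produces three sub-waves of sizes 5, 5, and 2.
console.log(splitWave(bigWave, 5).map((subWave) => subWave.length));
```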

### 3. Implicit cross-wave dependencies

As in "Case 3" above. Solutions:
- flow-planner should not mark cross-file import relationships as `[P]`
- flow-implement does static analysis before execution (Phase 6+)

---

_Source: CurDX-Flow wave execution contract. This file defines the executable
parallel-delivery rules used by the plugin runtime._

package/monitors/monitors.json
DELETED