@curdx/flow 1.1.11 → 2.0.0-beta.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (66)
  1. package/.claude-plugin/marketplace.json +3 -3
  2. package/.claude-plugin/plugin.json +2 -2
  3. package/CHANGELOG.md +79 -0
  4. package/README.md +74 -102
  5. package/agents/flow-adversary.md +1 -1
  6. package/agents/flow-architect.md +1 -1
  7. package/agents/flow-product-designer.md +1 -1
  8. package/agents/flow-qa-engineer.md +3 -3
  9. package/agents/flow-researcher.md +1 -1
  10. package/agents/flow-security-auditor.md +1 -1
  11. package/agents/flow-triage-analyst.md +3 -3
  12. package/agents/flow-ui-researcher.md +5 -5
  13. package/agents/flow-ux-designer.md +2 -2
  14. package/cli/install.js +16 -5
  15. package/commands/debug.md +10 -10
  16. package/commands/help.md +109 -87
  17. package/commands/implement.md +4 -4
  18. package/commands/init.md +5 -5
  19. package/commands/review.md +114 -130
  20. package/commands/spec.md +131 -89
  21. package/commands/start.md +100 -153
  22. package/commands/verify.md +110 -92
  23. package/gates/adversarial-review-gate.md +1 -1
  24. package/gates/coverage-audit-gate.md +1 -1
  25. package/gates/devex-gate.md +1 -1
  26. package/gates/edge-case-gate.md +1 -1
  27. package/gates/security-gate.md +3 -3
  28. package/hooks/scripts/session-start.sh +1 -1
  29. package/knowledge/epic-decomposition.md +2 -2
  30. package/knowledge/execution-strategies.md +4 -4
  31. package/knowledge/planning-reviews.md +6 -6
  32. package/knowledge/spec-driven-development.md +3 -3
  33. package/knowledge/two-stage-review.md +2 -2
  34. package/knowledge/wave-execution.md +5 -5
  35. package/package.json +1 -1
  36. package/agents/persona-amelia.md +0 -128
  37. package/agents/persona-david.md +0 -141
  38. package/agents/persona-emma.md +0 -179
  39. package/agents/persona-john.md +0 -105
  40. package/agents/persona-mary.md +0 -95
  41. package/agents/persona-oliver.md +0 -136
  42. package/agents/persona-rachel.md +0 -126
  43. package/agents/persona-serena.md +0 -175
  44. package/agents/persona-winston.md +0 -117
  45. package/commands/audit.md +0 -170
  46. package/commands/autoplan.md +0 -184
  47. package/commands/design.md +0 -155
  48. package/commands/discuss.md +0 -162
  49. package/commands/doctor.md +0 -124
  50. package/commands/index.md +0 -261
  51. package/commands/install-deps.md +0 -128
  52. package/commands/party.md +0 -241
  53. package/commands/plan-ceo.md +0 -117
  54. package/commands/plan-design.md +0 -107
  55. package/commands/plan-dx.md +0 -104
  56. package/commands/plan-eng.md +0 -108
  57. package/commands/qa.md +0 -118
  58. package/commands/requirements.md +0 -146
  59. package/commands/research.md +0 -141
  60. package/commands/security.md +0 -109
  61. package/commands/sketch.md +0 -118
  62. package/commands/spike.md +0 -181
  63. package/commands/status.md +0 -139
  64. package/commands/switch.md +0 -95
  65. package/commands/tasks.md +0 -189
  66. package/commands/triage.md +0 -160
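
Many of the 22 removed command files have a replacement that is named in the hunks further down. The lookup below collects those mappings as a rough sketch. `v2_replacement` is a hypothetical helper used only for illustration (the package does not ship it), and any command whose replacement is not visible in this diff falls through to the default branch.

```shell
# Hypothetical helper: map a removed v1 command name to the v2 replacement
# mentioned in this diff. Only mappings visible in the hunks are listed.
v2_replacement() {
  case "$1" in
    audit)        echo '/curdx-flow:verify --strict' ;;
    autoplan)     echo '/curdx-flow:spec --review=all' ;;
    plan-dx)      echo '/curdx-flow:spec --review=dx' ;;
    design)       echo '/curdx-flow:spec --phase=design' ;;
    tasks)        echo '/curdx-flow:spec --phase=tasks' ;;
    switch)       echo '/curdx-flow:start <name>' ;;
    status)       echo '/curdx-flow:start --list' ;;
    doctor)       echo 'npx @curdx/flow doctor' ;;
    install-deps) echo 'npx @curdx/flow install --all' ;;
    triage)       echo 'epic skill' ;;
    security)     echo 'security-audit skill' ;;
    qa)           echo 'browser-qa skill' ;;
    sketch)       echo 'ui-sketch skill' ;;
    spike)        echo '/curdx-flow:fast "spike: validate <hypothesis>"' ;;
    *)            echo 'no replacement shown in this diff' ;;
  esac
}
```

Calling `v2_replacement audit` prints the `--strict` form of the verify command, matching the note in the new `verify.md` that `--strict` replaces v1's `/audit`.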
@@ -1,124 +1,142 @@
  ---
  name: verify
- description: goal reverse verification — trace back from FR/AC/AD to check whether the code actually implements them, detect stubs / fake completion. Dispatches flow-verifier.
- argument-hint: "[spec-name]"
+ description: Goal-backward verification — trace from every FR / AC / AD in the spec to the code and tests, detect stubs and fake completions. The differentiator command. Optionally adds multi-source coverage audit with --strict.
+ argument-hint: "[--strict]"
  allowed-tools: [Read, Bash, Task, Grep, Glob]
  ---

- # Flow Verify — Goal Reverse Verification
+ # Goal-Backward Verification

- Dispatch the `flow-verifier` agent to confirm, starting from the spec, that the code truly implements the requirements; do not trust any "done" claim.
+ This is the **differentiator command**: it scans the implementation against the spec's own requirements and catches the most common Claude failure mode — claiming "done" while actual code is a stub or fake completion.

- ## When to Use
+ ## Flags

- - After `/curdx-flow:implement` completes
- - Final gate before a PR
- - When you suspect a feature is a fake implementation (stub/TODO)
+ | Flag | Default | Purpose |
+ |------|---------|---------|
+ | `--strict` | off | Also run multi-source coverage audit (FR / AC / AD / Research conclusions / D-NN decisions) — replaces v1's `/audit` command. |

- ## Step 1: Parse + Preflight Check
+ ## Preflight

  ```bash
- SPEC_NAME="${ARGUMENTS:-$(cat .flow/.active-spec 2>/dev/null)}"
- [ -z "$SPEC_NAME" ] && { echo "❌ No active spec. Run /curdx-flow:switch or /curdx-flow:start first"; exit 1; }
+ [ ! -d ".flow" ] && { echo "✗ Not a CurDX-Flow project."; exit 1; }

- DIR=".flow/specs/$SPEC_NAME"
- for f in requirements.md design.md; do
- [ ! -f "$DIR/$f" ] && { echo "❌ Missing $f. Complete /curdx-flow:requirements /curdx-flow:design first"; exit 1; }
+ SPEC_NAME=$(cat .flow/.active-spec 2>/dev/null)
+ [ -z "$SPEC_NAME" ] && { echo "✗ No active spec. Run /curdx-flow:start first."; exit 1; }
+
+ SPEC_DIR=".flow/specs/$SPEC_NAME"
+ for f in requirements.md design.md tasks.md; do
+ [ ! -f "$SPEC_DIR/$f" ] && {
+ echo "✗ $SPEC_DIR/$f missing. Run /curdx-flow:spec first.";
+ exit 1;
+ }
  done
+
+ FLAG_STRICT=$(echo "$ARGUMENTS" | grep -q -- '--strict' && echo 1 || echo 0)
  ```

- ## Step 2: Determine Scope
+ ## Workflow

- ```bash
- # Read the commit range for the execute phase
- # From .state.json or git reflog
-
- LAST_EXEC_START=$(python3 -c "
- import json
- s = json.load(open('$DIR/.state.json'))
- # Custom field or inferred from git
- print(s.get('execute_state', {}).get('start_commit', ''))
- ")
-
- # If unavailable, use main..HEAD
- RANGE="${LAST_EXEC_START:-main}..HEAD"
- echo "Verification scope: $RANGE"
- ```
+ ### Step 1: Dispatch `flow-verifier`

- ## Step 3: Dispatch flow-verifier
+ Delegate to the `flow-verifier` agent with:
+ - `requirements.md` (source of FR-NN, AC-N.N, US-NN)
+ - `design.md` (source of AD-NN)
+ - `tasks.md` (source of T-N.M and their Verify commands)
+ - Repository root (to scan code + tests)

- ```
- Task:
- subagent_type: general-purpose
- description: "verify $SPEC_NAME"
- prompt: |
- You are the flow-verifier agent. Full definition:
- ${CLAUDE_PLUGIN_ROOT}/agents/flow-verifier.md
-
- Must read:
- - .flow/specs/$SPEC_NAME/requirements.md
- - .flow/specs/$SPEC_NAME/design.md
- - .flow/specs/$SPEC_NAME/tasks.md
- - .flow/specs/$SPEC_NAME/.state.json
- - .flow/STATE.md
-
- Verification scope: commits $RANGE
-
- Tasks:
- 1. Extract every FR / AC / AD / Component / error-path assertion
- 2. Find evidence for each assertion (code + test + actual run)
- 3. Scan for stub patterns (TODO / Not implemented / return {})
- 4. Generate verification-report.md
- 5. Update phase_status.verify in .state.json
-
- Output file:
- .flow/specs/$SPEC_NAME/verification-report.md
-
- Return a brief:
- - Fully verified / partially verified / not verified counts
- - Number of fake implementations
- - List of blockers
- - Suggested next step
- ```
+ The agent performs goal-backward tracing:
+ 1. For each **FR-NN**: search for code that implements it. Classify as `IMPLEMENTED` / `STUB` / `MISSING` / `UNCERTAIN`.
+ 2. For each **AC-N.N**: search for a test that exercises it. Classify as `TESTED` / `UNTESTED`.
+ 3. For each **AD-NN**: check that the design decision is reflected in code structure / interfaces.
+ 4. Detect suspicious patterns:
+ - `throw new Error("not implemented")` / `TODO` / `NotImplementedError`
+ - `return null` / `return {}` in places that should produce real output
+ - test files with only `it.skip(...)` or no assertions
+ - code that returns mocked fixtures instead of calling real collaborators
+
+ ### Step 2: Run the Verify commands from `tasks.md`
+
+ For every task listed in `tasks.md`, run its declared `Verify:` command and record pass/fail. This is the most objective check — if the task said `npm test -- auth.spec.ts`, run exactly that.
+
+ ### Step 3: (Strict only) Multi-source coverage audit
+
+ If `--strict`:
+ - Cross-check every FR / AC / AD / decision against the implementation
+ - Cross-check every research conclusion: was the recommended library / approach actually used?
+ - Cross-check every D-NN decision in `.flow/STATE.md` that references this spec
+
+ ### Step 4: Produce `verification-report.md`

- ## Step 4: Read Report + Decide
+ **Landing check**: sub-agent responses can be truncated by the model's output-length limit. After dispatching `flow-verifier`, verify the report actually landed:

  ```bash
- REPORT="$DIR/verification-report.md"
- [ ! -f "$REPORT" ] && { echo " verifier did not produce a report"; exit 1; }
-
- # Parse the verdict from the report
- VERIFIED=$(grep -c "^\- ✓" "$REPORT" || echo 0)
- PARTIAL=$(grep -c "^\- ⚠" "$REPORT" || echo 0)
- MISSING=$(grep -c "^\- " "$REPORT" || echo 0)
- STUBS=$(grep -c "^\- 🚨" "$REPORT" || echo 0)
+ REPORT=".flow/specs/$SPEC_NAME/verification-report.md"
+ if [ ! -f "$REPORT" ] || [ "$(wc -c < "$REPORT" 2>/dev/null | tr -d ' ')" -lt 300 ]; then
+ echo "⚠ Report missing or truncated. Re-dispatching flow-verifier with a terse 'write the report now' prompt."
+ # Re-dispatch pattern:
+ # "Your only job right now is to Write the verification-report.md using the
+ # findings you already gathered. Do not re-scan. Do not narrate. Write
+ # the file and stop."
+ fi
  ```

- ## Step 5: Output to User
+ Write to `.flow/specs/$SPEC_NAME/verification-report.md`:

+ ```markdown
+ # Verification Report — <spec-name>
+
+ **Generated**: <ISO8601>
+ **Mode**: strict | normal
+
+ ## Summary
+ - FR coverage: N/M implemented (K stubs, L missing)
+ - AC coverage: N/M tested
+ - AD coverage: N/M reflected
+ - Verify commands: N/M passing
+
+ ## Findings
+
+ ### Missing implementations
+ - FR-02: <description>. No matching code found in <paths searched>.
+
+ ### Stubs / fake completions
+ - FR-05: Implemented in `src/auth.ts:42` but body is `throw new Error("not implemented")`.
+
+ ### Untested acceptance criteria
+ - AC-1.3: No test asserts token refresh after 15 min expiry.
+
+ ### Failing Verify commands
+ - T-2.1: `npm test auth.spec.ts` → 3 failed
+
+ ## Verdict
+ - [ ] PASS — all items covered and passing
+ - [X] PARTIAL — <n> findings must be addressed before shipping
+ - [ ] MISSING — substantive implementation gaps
  ```
- ✓ Verify complete: $SPEC_NAME

- Stats:
- ✓ Fully verified: $VERIFIED
- Partially verified: $PARTIAL
- Not verified: $MISSING
- 🚨 Fake implementations: $STUBS
+ ### Step 5: Apply `verification-gate`
+
+ Hard rule from `@${CLAUDE_PLUGIN_ROOT}/gates/verification-gate.md`:
+ - Any `STUB` or `MISSING` finding on a non-deferred FR blocks completion.
+ - Any failing Verify command blocks completion.
+ - Waive only with an explicit user D-NN decision logged to `.flow/STATE.md`.

- Report: .flow/specs/$SPEC_NAME/verification-report.md
+ ## Reporting
+
+ ```
+ ✓ Verification complete
+ FR coverage: 8/10 implemented (1 stub, 1 missing)
+ AC coverage: 9/12 tested
+ Verify commands: 14/15 passing
+ Verdict: PARTIAL

- Verdict:
- $([ $MISSING -gt 0 ] && echo '❌ BLOCKED — unimplemented FR/AC/AD exist, return to /curdx-flow:implement to fill in')
- $([ $STUBS -gt 0 ] && echo '❌ BLOCKED — fake implementations found')
- $([ $MISSING -eq 0 ] && [ $STUBS -eq 0 ] && echo '✓ PASS — can proceed to /curdx-flow:review')
+ Findings written to: .flow/specs/<name>/verification-report.md

- Next step:
- $([ $MISSING -gt 0 ] && echo 'fix blockers → /curdx-flow:implement --task=<new task>')
- $([ $MISSING -eq 0 ] && echo '/curdx-flow:review — enter code quality review')
+ Next: address findings, then re-run /curdx-flow:verify, or run /curdx-flow:review.
  ```

- ## Error Recovery
+ ## References

- - verifier times out → reduce spec scope (verify only specific FRs), then rerun
- - Some Verify commands are unavailable (e.g. require a DB connection) → mark "needs manual verification" in the report
- - verifier output is vague → prompt to look at specific sections of the report
+ - `flow-verifier` agent: `@${CLAUDE_PLUGIN_ROOT}/agents/flow-verifier.md`
+ - `verification-gate`: `@${CLAUDE_PLUGIN_ROOT}/gates/verification-gate.md`
+ - `coverage-audit-gate` (used in strict mode): `@${CLAUDE_PLUGIN_ROOT}/gates/coverage-audit-gate.md`
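
The "suspicious patterns" that the new Step 1 asks `flow-verifier` to detect could be approximated with a plain text scan. The following is a minimal sketch under stated assumptions, not the agent's actual method: `scan_stubs` is a hypothetical helper, the pattern list covers only the literal markers named in the hunk, and contextual cases such as `return null` or mocked fixtures are left out because they need human or agent judgment.

```shell
# Hypothetical helper, not part of @curdx/flow: print file:line:text for
# every line under a directory that contains one of the stub markers.
scan_stubs() {
  root="${1:-.}"
  # -r: recurse, -n: show line numbers, -E: extended regular expressions.
  # grep exits non-zero when nothing matches, so "|| true" keeps a clean
  # tree from being reported as an error.
  grep -rnE 'not implemented|TODO|NotImplementedError|it\.skip\(' "$root" || true
}
```

A hit from such a scan is only a lead: a `TODO` in a comment is not necessarily a stub, which is presumably why the command also runs each task's `Verify:` command.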
@@ -210,7 +210,7 @@ I have examined the following dimensions across 2 rounds of analysis:

  Recommendations:
  - Human review (at least walk through the diff once)
- - Consider /curdx-flow:qa for real browser/integration testing (Phase 5+)
+ - Consider the `browser-qa` skill for real browser/integration testing (Phase 5+)
  - Wait until deployment to staging to observe
  ```

@@ -17,7 +17,7 @@ depends_on: []

  - End of the tasks phase (last step of flow-planner)
  - Before the execution phase completes (when /curdx-flow:verify runs)
- - Explicitly requested by /curdx-flow:audit
+ - Explicitly requested by /curdx-flow:verify --strict

  ---

@@ -13,7 +13,7 @@ depends_on: []

  ## Trigger Timing

- - When `/curdx-flow:plan-dx` runs (design phase)
+ - When `/curdx-flow:spec --review=dx` runs (design phase)
  - When `/curdx-flow:review --devex` runs (code phase)
  - Enabled by default in open-source / multi-person collaboration scenarios

@@ -18,7 +18,7 @@ depends_on: []
  - After the requirements phase ends (to supplement edge conditions)
  - After the design phase (to check error-path completeness)
  - After tests are written (to check whether only the happy path is covered)
- - Explicitly requested by /curdx-flow:audit
+ - Explicitly requested by /curdx-flow:verify --strict

  ---

@@ -13,7 +13,7 @@ depends_on: []

  ## Trigger Timing

- - When `/curdx-flow:security` runs
+ - When the `security-audit` skill runs
  - Before `/curdx-flow:ship` (auto-triggered, Phase 6+)
  - When committing specs involving auth / payments / PII

@@ -130,8 +130,8 @@ Production environment only accepts HTTPS. HTTP requests → 301 to HTTPS.
  # Run all scans
  bash scripts/security-scan.sh # provided by project (if available)

- # Or use flow-security-auditor agent
- /curdx-flow:security
+ # Or use flow-security-auditor agent via the `security-audit` skill
+ # (or say "audit for security issues")
  ```

  ### Dependency CVE
@@ -36,7 +36,7 @@ if [ "$LAST_CHECK" != "$TODAY" ]; then

  if [ "${#MISSING[@]}" -gt 0 ]; then
  JOINED="$(IFS=,; echo "${MISSING[*]}")"
- ADDITIONAL_CONTEXT+="## CurDX-Flow Recommended Plugins Check\n\nThe following recommended plugins were not detected: **${JOINED}**\n\nRun \`/curdx-flow:install-deps\` for interactive one-shot install. Run \`/curdx-flow:doctor\` for the full health report.\n\n"
+ ADDITIONAL_CONTEXT+="## CurDX-Flow Recommended Plugins Check\n\nThe following recommended plugins were not detected: **${JOINED}**\n\nRun \`npx @curdx/flow install --all\` for interactive one-shot install. Run \`npx @curdx/flow doctor\` for the full health report.\n\n"
  fi

  echo "$TODAY" > "$MARKER" 2>/dev/null || true
@@ -238,13 +238,13 @@ Week 5-6: Spec 4 (refund) + Spec 5 (query)
  ## Epic Lifecycle

  ```
- 1. /curdx-flow:triage "Epic goal"
+ 1. Invoke the `epic` skill with "Epic goal" (auto-invoked; or say "break this big feature down")
  ↓ flow-triage-analyst decomposes
  2. Generates .flow/_epics/<name>/epic.md + sub-spec skeletons
  3. User reviews epic.md

  4. For each sub-spec:
- /curdx-flow:switch <sub-spec-name>
+ /curdx-flow:start <sub-spec-name>
  /curdx-flow:spec
  /curdx-flow:implement
  /curdx-flow:review
@@ -252,7 +252,7 @@ Can you switch strategies mid-execution? Not recommended.
  - Any → Wave: needs `[P]` markers in tasks.md

  If you really must switch, do it manually:
- 1. `/curdx-flow:doctor` to check status
+ 1. `npx @curdx/flow doctor` to check status
  2. Manually edit `.flow/specs/<name>/.state.json`'s `strategy` field
  3. Rerun `/curdx-flow:implement`

@@ -262,13 +262,13 @@ If you really must switch, do it manually:

  ### View progress
  ```bash
- /curdx-flow:status # global
- /curdx-flow:status <name> # single-spec details
+ /curdx-flow:start --list # global
+ # For single-spec details, inspect .flow/specs/<name>/.progress.md
  ```

  ### Interrupt
  - `Ctrl+C` interrupts the current session → Stop event triggers, state is saved
- - Next `/curdx-flow:switch <name>` resumes from `task_index`
+ - Next `/curdx-flow:start <name>` (or `/curdx-flow:start --resume`) resumes from `task_index`

  ### Snapshots
  `/curdx-flow:save <label>` saves a checkpoint (Phase 5+ rollout).
@@ -26,7 +26,7 @@ design.md

  Each review is dispatched independently (different agent / context) to avoid perspective convergence.

- Finally `/curdx-flow:autoplan` ties them together: runs all 4 reviews in one pass.
+ Finally `/curdx-flow:spec --review=all` ties them together: runs all 4 reviews in one pass.

  ---

@@ -115,7 +115,7 @@ Essentially runs `flow-architect` again — but this time not to generate the de

  ### Dispatch

- `flow-ux-designer` (Emma) switches into review mode.
+ `flow-ux-designer` switches into review mode.

  ---

@@ -153,10 +153,10 @@ Phase 5 implementation: reuse `flow-reviewer` + `@gates/devex-gate.md`.

  ---

- ## /curdx-flow:autoplan — Run All 4 at Once
+ ## /curdx-flow:spec --review=all — Run All 4 at Once

  ```bash
- /curdx-flow:autoplan
+ /curdx-flow:spec --review=all
  ```

  Workflow:
@@ -183,7 +183,7 @@ Output:
  ...

  ## Recommendations
- 1. Return to /curdx-flow:design to fix blockers
+ 1. Return to /curdx-flow:spec --phase=design to fix blockers
  2. Record warnings in STATE.md, address in tasks phase
  ```

@@ -191,7 +191,7 @@ Output:

  ## When to Skip Planning Reviews

- - **MVP / prototype**: time-pressured, run /curdx-flow:tasks first, review after launch
+ - **MVP / prototype**: time-pressured, run /curdx-flow:spec --phase=tasks first, review after launch
  - **Tiny changes**: a single file < 50 lines doesn't warrant a 4-dimension review
  - **Similar work done before**: reuse prior review conclusions

@@ -147,7 +147,7 @@ Regardless of the path taken, the 4 files must satisfy:
  ## Spec vs Epic Difference

  - **Spec**: a single independently-deliverable feature. Typically 1-2 weeks of effort.
- - **Epic**: a collection of specs. `/curdx-flow:triage` breaks down a large goal into multiple specs.
+ - **Epic**: a collection of specs. The `epic` skill (auto-invoked, or say "break this big feature down") breaks down a large goal into multiple specs.

  `.flow/specs/<name>/` is a single-spec directory.
  `.flow/_epics/<name>/` is an Epic directory (contains the dependency graph and sub-spec list).
@@ -159,8 +159,8 @@ Regardless of the path taken, the 4 files must satisfy:
  SDD is not dogma. The following scenarios may skip phases:

  - **One-off scripts** (`/curdx-flow:fast` mode) — skip all specs
- - **UI prototype exploration** (`/curdx-flow:sketch` mode) — only research + design sketches
- - **Emergency hotfix** (`/curdx-flow:spike` mode) — validating the assumption is enough
+ - **UI prototype exploration** (the `ui-sketch` skill) — only research + design sketches
+ - **Emergency hotfix** (`/curdx-flow:fast "spike: validate <hypothesis>"` mode) — validating the assumption is enough

  But **production code changes** should follow the full flow. Rationale:
  - Code may be only 20 lines, but impact may reach all users
@@ -206,7 +206,7 @@ Some reviewers list 50 minor improvements — the user can't process.
  ## Relationship to Other Phases

  ```
- /curdx-flow:tasks → tasks.md contains task list
+ /curdx-flow:spec --phase=tasks → tasks.md contains task list

  /curdx-flow:implement → code + tests + commits

@@ -218,7 +218,7 @@ Some reviewers list 50 minor improvements — the user can't process.
  ↓ ↓
  ↓ review-report.md

- (optional) /curdx-flow:audit → adversarial review + edge cases
+ (optional) /curdx-flow:verify --strict → adversarial review + edge cases

  adversarial-review.md
  edge-cases.md
@@ -254,7 +254,7 @@ Decision:
  - 1.1 and 1.3 commits retained
  - Main agent decides:
  A: continue to Wave 2 (skip 1.2, possible cascading failure)
- B: dispatch David (flow-debugger) to fix 1.2, then continue
+ B: dispatch flow-debugger to fix 1.2, then continue
  C: stop and report, let the user intervene

  Default: A, but failed_attempts += 1; after threshold switch to C
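
The default policy stated in this hunk (continue with A, increment `failed_attempts`, switch to C once a threshold is passed) can be sketched as a small shell function. This is an illustration only: `decide_wave_action`, its argument order, and the threshold value are assumptions, and the source does not say whether the switch happens at or above the threshold (the sketch switches when the counter reaches it).

```shell
# Hypothetical sketch of the default policy for a partially failed wave.
# $1 = failed_attempts recorded so far, $2 = threshold (assumed parameter).
# Prints the chosen option and the updated counter.
decide_wave_action() {
  failed_attempts=$(( $1 + 1 ))   # the current failure is counted first
  threshold=$2
  if [ "$failed_attempts" -ge "$threshold" ]; then
    echo "C $failed_attempts"     # stop and report, let the user intervene
  else
    echo "A $failed_attempts"     # continue to the next wave
  fi
}
```

With an assumed threshold of 3, the first two failures keep the run on option A and the third switches to C.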
@@ -268,7 +268,7 @@ Wave 1 all TASK_FAILED
  Decision:
  - Usually indicates an upstream environment problem (missing deps, tsc config wrong)
  - Stop immediately
- - Suggest user run /curdx-flow:doctor to diagnose
+ - Suggest user run `npx @curdx/flow doctor` to diagnose
  ```

  ### Inter-wave dependency broken
@@ -307,7 +307,7 @@ Decision:

  ### In-progress view

- `/curdx-flow:status` shows:
+ Inspecting `.flow/specs/<name>/.progress.md` (or running `/curdx-flow:start --list`) shows:
  ```
  Spec: auth-system
  Strategy: wave
@@ -321,7 +321,7 @@ Progress: Wave 2/5 (60%)
  ### Ctrl+C interruption

  - Running Task calls in the current wave keep going (Claude Code's Task is an independent process)
- - Next `/curdx-flow:switch` shows some tasks already committed
+ - Next `/curdx-flow:start --resume` shows some tasks already committed
  - Resume from the failing task

  ---
@@ -367,7 +367,7 @@ Phase 6+ will consider automatic fallback.
  ### 1. `[P]` markers incorrect

  If the planner missed a dependency, `[P]` may be wrong. Solutions:
- - Before execution, confirm tasks coverage via `/curdx-flow:audit`
+ - Before execution, confirm tasks coverage via `/curdx-flow:verify --strict`
  - Conflict detection as a safety net (validate Files before dispatch)

  ### 2. A wave too large
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@curdx/flow",
- "version": "1.1.11",
+ "version": "2.0.0-beta.2",
  "description": "CLI installer for CurDX-Flow — AI engineering workflow meta-framework for Claude Code",
  "type": "module",
  "bin": {
@@ -1,128 +0,0 @@
- ---
- name: amelia
- description: Amelia — developer (strict execution, quality-first). Backed by the full capabilities of flow-executor.
- model: sonnet
- effort: medium
- maxTurns: 30
- tools: [Read, Write, Edit, Bash, Grep, Glob]
- ---
-
- # Amelia — Developer
-
- Hi, I'm **Amelia**. I turn designs into code.
-
- ---
-
- ## My Perspective
-
- My job is **strict execution**. The design has been discussed, the requirements are nailed down, the tasks are broken out. My responsibilities:
-
- - **Follow tasks.md** (no freelancing)
- - **Karpathy surgical edits** (change only what must change)
- - **TDD red/green/yellow** (tests first)
- - **Atomic commits** (one task, one commit)
- - **Verify must pass** (evidence required when claiming done)
-
- ---
-
- ## My Capabilities
-
- Full workflow:
-
- @${CLAUDE_PLUGIN_ROOT}/agents/flow-executor.md
-
- Key rules:
- - 5-round retry (pua-style escalation)
- - Emit `TASK_COMPLETE` / `TASK_FAILED` / `ALL_TASKS_COMPLETE`
- - Atomic commit per task (conventional format)
- - Update `.progress.md` and `.state.json`
-
- ---
-
- ## My Communication Style
-
- - **Concise > verbose**: execution doesn't need long explanations
- - **Evidence > claims**: not "should be good", but "ran the test, passed"
- - **Stay on task**: don't challenge the design during execution (raise concerns during the design phase)
- - **Clear failures**: after 3 failures, I say `TASK_FAILED` honestly — no forcing it
-
- ---
-
- ## The Rules I Follow
-
- ### 1. No production code without a failing test first
-
- In the Phase 3 (Testing) stage, TDD is ironclad. Any waiver must be recorded in STATE.md.
-
- ### 2. Only touch the files listed in the Files field
-
- If the task says modify `auth/login.ts`, I won't "casually" touch `utils/string.ts`.
-
- ### 3. Verify must actually run
-
- "Tests should pass" is not allowed. Must run `npm test` and capture the exit code.
-
- ### 4. Honest commit messages
-
- No hedging words (maybe / probably / should). If uncertain, don't commit.
-
- ### 5. Don't ask the user in Quick mode
-
- In an automated loop (stop-hook or --quick), I proceed on the basis of `.flow/CONTEXT.md` preferences + the most reasonable assumption, recording the assumption to `.progress.md`.
-
- ---
-
- ## Typical Output (after finishing a task)
-
- ```
- ✓ Task 1.2 complete — feat(auth): implement login endpoint (abc123f)
-
- Verify passed:
- npm test -- auth/login.test.ts
- ✓ Test Suites: 1 passed
- ✓ Tests: 3 passed
-
- Files changed:
- src/auth/login.ts (+45 -2)
- src/auth/login.test.ts (+38)
-
- .progress.md updated: task 1.2 learned "bcrypt.compare needs await"
-
- TASK_COMPLETE: 1.2
- Next: 1.3
- ```
-
- ---
-
- ## When to Call Me
-
- - Entering a spec's execute phase
- - `/curdx-flow:implement` auto-dispatches me (as a subagent or stop-hook loop)
- - In Party Mode: I represent the "can we actually build it" perspective
-
- ---
-
- ## When I Fail
-
- I say so honestly, without hiding it:
-
- ```
- ✗ Task 1.2 failed (after 5 attempts)
-
- Attempts:
- 1. Direct implementation → bcrypt not found (dependency issue)
- 2. Install bcrypt → permission error
- 3. Use npm sudo → broke node_modules
- 4. Switch to bcryptjs → wrong import path
- 5. Fix path → some test still failing, unclear why
-
- TASK_FAILED: 1.2
- Suggestions:
- - Have the user investigate the bcrypt permission issue
- - Or consider dispatching flow-debugger / David for root-cause analysis
- - Or grant a STATE.md waiver for this task
- ```
-
- ---
-
- _Backed by: flow-executor agent._