@curdx/flow 1.1.11 → 2.0.0-beta.10

Files changed (96)
  1. package/.claude-plugin/marketplace.json +3 -3
  2. package/.claude-plugin/plugin.json +4 -11
  3. package/CHANGELOG.md +99 -0
  4. package/README.md +74 -102
  5. package/README.zh.md +2 -2
  6. package/agent-preamble/preamble.md +81 -11
  7. package/agents/flow-adversary.md +41 -56
  8. package/agents/flow-architect.md +24 -11
  9. package/agents/flow-debugger.md +2 -2
  10. package/agents/flow-edge-hunter.md +20 -6
  11. package/agents/flow-executor.md +3 -3
  12. package/agents/flow-planner.md +51 -48
  13. package/agents/flow-product-designer.md +15 -2
  14. package/agents/flow-qa-engineer.md +4 -4
  15. package/agents/flow-researcher.md +18 -3
  16. package/agents/flow-reviewer.md +5 -1
  17. package/agents/flow-security-auditor.md +2 -2
  18. package/agents/flow-triage-analyst.md +4 -4
  19. package/agents/flow-ui-researcher.md +7 -7
  20. package/agents/flow-ux-designer.md +3 -3
  21. package/agents/flow-verifier.md +47 -14
  22. package/bin/curdx-flow.js +13 -1
  23. package/cli/doctor.js +28 -13
  24. package/cli/install.js +62 -36
  25. package/cli/protocols.js +63 -10
  26. package/cli/registry.js +73 -0
  27. package/cli/uninstall.js +9 -11
  28. package/cli/upgrade.js +6 -10
  29. package/cli/utils.js +104 -56
  30. package/commands/debug.md +10 -10
  31. package/commands/fast.md +1 -1
  32. package/commands/help.md +109 -87
  33. package/commands/implement.md +7 -7
  34. package/commands/init.md +18 -7
  35. package/commands/review.md +114 -130
  36. package/commands/spec.md +131 -89
  37. package/commands/start.md +130 -153
  38. package/commands/verify.md +110 -92
  39. package/gates/adversarial-review-gate.md +20 -20
  40. package/gates/coverage-audit-gate.md +1 -1
  41. package/gates/devex-gate.md +5 -6
  42. package/gates/edge-case-gate.md +2 -2
  43. package/gates/security-gate.md +3 -3
  44. package/hooks/hooks.json +0 -11
  45. package/hooks/scripts/quick-mode-guard.sh +12 -9
  46. package/hooks/scripts/session-start.sh +2 -2
  47. package/hooks/scripts/stop-watcher.sh +25 -15
  48. package/knowledge/epic-decomposition.md +2 -2
  49. package/knowledge/execution-strategies.md +10 -9
  50. package/knowledge/planning-reviews.md +6 -6
  51. package/knowledge/spec-driven-development.md +11 -10
  52. package/knowledge/two-stage-review.md +6 -5
  53. package/knowledge/wave-execution.md +5 -5
  54. package/package.json +4 -2
  55. package/skills/brownfield-index/SKILL.md +62 -0
  56. package/skills/browser-qa/SKILL.md +50 -0
  57. package/skills/epic/SKILL.md +68 -0
  58. package/skills/security-audit/SKILL.md +50 -0
  59. package/skills/ui-sketch/SKILL.md +49 -0
  60. package/templates/config.json.tmpl +1 -1
  61. package/templates/design.md.tmpl +32 -112
  62. package/templates/requirements.md.tmpl +25 -43
  63. package/templates/research.md.tmpl +37 -68
  64. package/templates/tasks.md.tmpl +27 -84
  65. package/agents/persona-amelia.md +0 -128
  66. package/agents/persona-david.md +0 -141
  67. package/agents/persona-emma.md +0 -179
  68. package/agents/persona-john.md +0 -105
  69. package/agents/persona-mary.md +0 -95
  70. package/agents/persona-oliver.md +0 -136
  71. package/agents/persona-rachel.md +0 -126
  72. package/agents/persona-serena.md +0 -175
  73. package/agents/persona-winston.md +0 -117
  74. package/commands/audit.md +0 -170
  75. package/commands/autoplan.md +0 -184
  76. package/commands/design.md +0 -155
  77. package/commands/discuss.md +0 -162
  78. package/commands/doctor.md +0 -124
  79. package/commands/index.md +0 -261
  80. package/commands/install-deps.md +0 -128
  81. package/commands/party.md +0 -241
  82. package/commands/plan-ceo.md +0 -117
  83. package/commands/plan-design.md +0 -107
  84. package/commands/plan-dx.md +0 -104
  85. package/commands/plan-eng.md +0 -108
  86. package/commands/qa.md +0 -118
  87. package/commands/requirements.md +0 -146
  88. package/commands/research.md +0 -141
  89. package/commands/security.md +0 -109
  90. package/commands/sketch.md +0 -118
  91. package/commands/spike.md +0 -181
  92. package/commands/status.md +0 -139
  93. package/commands/switch.md +0 -95
  94. package/commands/tasks.md +0 -189
  95. package/commands/triage.md +0 -160
  96. package/hooks/scripts/fail-tracker.sh +0 -31
package/gates/devex-gate.md CHANGED
@@ -13,7 +13,7 @@ depends_on: []

  ## Trigger Timing

- - When `/curdx-flow:plan-dx` runs (design phase)
+ - When `/curdx-flow:spec --review=dx` runs (design phase)
  - When `/curdx-flow:review --devex` runs (code phase)
  - Enabled by default in open-source / multi-person collaboration scenarios

@@ -195,12 +195,12 @@ Reading these test names = reading API behavior documentation.

  ### Agent Automatic

- When `flow-ux-designer` / `flow-reviewer` applies this gate, use sequential-thinking 4 rounds to scan the 8 dimensions.
+ When `flow-ux-designer` / `flow-reviewer` applies this gate, use sequential-thinking proportional to the complexity of the codebase being scanned.

  ### Human Review

  Attach a DevEx checklist at PR time:
- - [ ] Clear naming (reviewed at least 3 times)
+ - [ ] Clear naming (re-read until obvious to a new maintainer)
  - [ ] Critical comments exist
  - [ ] Consistent structure
  - [ ] Actionable error messages
@@ -210,7 +210,7 @@ Attach a DevEx checklist at PR time:

  ## Scoring

- Each dimension 0-10 points:
+ Score each **applicable** dimension 0-10 (N/A dimensions are excluded from the total):

  ```
  10 = best practice
@@ -220,8 +220,7 @@ Each dimension 0-10 points:
  0 = serious issue
  ```

- Total 40+ / 80 = pass (warning, non-blocking).
- Total < 40 = blocked, improvement required.
+ Emit the per-dimension scores with evidence. The gate itself does not block on a numeric threshold; it surfaces the weaknesses for the user (or the reviewing agent) to decide whether any of them rise to a blocker. A single 0/10 on a material dimension is a blocker regardless of the total.

  ---

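The revised scoring rule in the hunk above can be read as a small decision procedure. The sketch below is one hedged interpretation, not code from the package: the `DimensionScore` type, the `material` flag, and the dimension names are assumptions made for illustration. It excludes N/A dimensions from the total and treats a 0/10 on a material dimension as a blocker, with no numeric pass threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DimensionScore:
    name: str                # hypothetical dimension label
    score: Optional[int]     # 0-10, or None when the dimension is N/A
    evidence: str = ""       # short justification emitted with the score
    material: bool = True    # assumption: the reviewer marks material dimensions


def summarize(scores):
    """Return (total, maximum, blockers), leaving N/A dimensions out of the total."""
    applicable = [s for s in scores if s.score is not None]
    for s in applicable:
        if not 0 <= s.score <= 10:
            raise ValueError(f"{s.name}: score must be between 0 and 10")
    total = sum(s.score for s in applicable)
    maximum = 10 * len(applicable)
    # No numeric threshold: only a 0/10 on a material dimension blocks outright.
    blockers = [s.name for s in applicable if s.material and s.score == 0]
    return total, maximum, blockers


if __name__ == "__main__":
    report = [
        DimensionScore("naming", 8, "identifiers read clearly"),
        DimensionScore("error messages", 0, "errors carry no remediation hint"),
        DimensionScore("cli ergonomics", None),  # N/A for a library
    ]
    print(summarize(report))
```

Everything other than the blocker list is left for the user or the reviewing agent to judge, which matches the non-blocking wording of the new text.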
package/gates/edge-case-gate.md CHANGED
@@ -18,7 +18,7 @@ depends_on: []
  - After the requirements phase ends (to supplement edge conditions)
  - After the design phase (to check error-path completeness)
  - After tests are written (to check whether only the happy path is covered)
- - Explicitly requested by /curdx-flow:audit
+ - Explicitly requested by /curdx-flow:verify --strict

  ---

@@ -104,7 +104,7 @@ Q4. If no test, what test should be added to cover it?
  Input: object under review (function / component / API) + requirements + tests

  For each category (1-7):
- 1. Use sequential-thinking to list at least 3 possible edge scenarios
+ 1. Use sequential-thinking to list every plausible edge scenario for this category — stop when you've covered the real risk surface, don't pad to a quota, don't fabricate scenarios that won't occur in production
  2. Check whether each scenario has corresponding coverage in tests
  3. Add uncovered ones to the "gap list"

package/gates/security-gate.md CHANGED
@@ -13,7 +13,7 @@ depends_on: []

  ## Trigger Timing

- - When `/curdx-flow:security` runs
+ - When the `security-audit` skill runs
  - Before `/curdx-flow:ship` (auto-triggered, Phase 6+)
  - When committing specs involving auth / payments / PII

@@ -130,8 +130,8 @@ Production environment only accepts HTTPS. HTTP requests → 301 to HTTPS.
  # Run all scans
  bash scripts/security-scan.sh # provided by project (if available)

- # Or use flow-security-auditor agent
- /curdx-flow:security
+ # Or use flow-security-auditor agent via the `security-audit` skill
+ # (or say "audit for security issues")
  ```

  ### Dependency CVE
package/hooks/hooks.json CHANGED
@@ -20,17 +20,6 @@
          ]
        }
      ],
-     "PostToolUseFailure": [
-       {
-         "matcher": "Bash|Edit|Write",
-         "hooks": [
-           {
-             "type": "command",
-             "command": "${CLAUDE_PLUGIN_ROOT}/hooks/scripts/fail-tracker.sh"
-           }
-         ]
-       }
-     ],
      "Stop": [
        {
          "hooks": [
package/hooks/scripts/quick-mode-guard.sh CHANGED
@@ -40,17 +40,20 @@ ACTIVE=$(cat .flow/.active-spec 2>/dev/null)
  STATE_FILE=".flow/specs/$ACTIVE/.state.json"
  [ ! -f "$STATE_FILE" ] && exit 0

- # Read quickMode + mode
- QUICK_MODE=$(python3 -c "
- import json
+ # Read quickMode + mode. Pass STATE_FILE via env (NOT shell interpolation
+ # into the python source) so an active-spec name containing quotes/$ cannot
+ # inject python code.
+ export STATE_FILE
+ QUICK_MODE=$(python3 -c '
+ import json, os
  try:
-     s = json.load(open('$STATE_FILE'))
-     qm = s.get('quickMode', False)
-     mode = s.get('mode', '')
-     print('true' if (qm or mode == 'autonomous') else 'false')
+     s = json.load(open(os.environ["STATE_FILE"]))
+     qm = s.get("quickMode", False)
+     mode = s.get("mode", "")
+     print("true" if (qm or mode == "autonomous") else "false")
  except Exception:
-     print('false')
- " 2>/dev/null)
+     print("false")
+ ' 2>/dev/null)

  if [ "$QUICK_MODE" = "true" ]; then
    # Block and inject guidance
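To see why the patched hook reads the path from the environment, it helps to compare the two ways of building the Python source. The sketch below is illustrative only: `UNSAFE_TEMPLATE`, `SAFE_SOURCE`, `compiles`, and `quick_mode` are hypothetical names, and the spec name `it's-a-spec` is an invented example. It shows that a quote in the path corrupts interpolated source text, while the environment-based source stays constant whatever the path contains.

```python
import json
import os
import tempfile

# Old style: the path is pasted into the program text (hypothetical template).
UNSAFE_TEMPLATE = "import json\ns = json.load(open('{path}'))\n"
# New style: the program text is constant and the path arrives as data.
SAFE_SOURCE = 'import json, os\ns = json.load(open(os.environ["STATE_FILE"]))\n'


def compiles(source):
    """Report whether `source` is syntactically valid Python."""
    try:
        compile(source, "<hook>", "exec")
        return True
    except SyntaxError:
        return False


def quick_mode(environ=os.environ):
    """Same decision as the hook, with the path taken from the environment."""
    try:
        with open(environ["STATE_FILE"]) as fh:
            state = json.load(fh)
        enabled = state.get("quickMode", False) or state.get("mode", "") == "autonomous"
        return "true" if enabled else "false"
    except Exception:
        return "false"


if __name__ == "__main__":
    hostile = ".flow/specs/it's-a-spec/.state.json"
    print(compiles(UNSAFE_TEMPLATE.format(path=hostile)))  # quote breaks the source
    print(compiles(SAFE_SOURCE))                           # unaffected by the path
    with tempfile.TemporaryDirectory() as tmp:
        spec_dir = os.path.join(tmp, "it's-a-spec")
        os.makedirs(spec_dir)
        state_file = os.path.join(spec_dir, ".state.json")
        with open(state_file, "w") as fh:
            json.dump({"quickMode": True}, fh)
        print(quick_mode({"STATE_FILE": state_file}))
```

A broken parse is the mild outcome; with interpolation, a crafted name could also close the string and append statements of its own, which is the injection the new comment describes.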
package/hooks/scripts/session-start.sh CHANGED
@@ -1,7 +1,7 @@
  #!/usr/bin/env bash
  # CurDX-Flow SessionStart Hook
  # Duties:
- # 1. Daily dependency check — nudge user to /flow-install-deps if recommended plugins missing
+ # 1. Daily dependency check — nudge user to `npx @curdx/flow install --all` if recommended plugins missing
  # 2. Load active spec progress into session context
  #
  # Design notes:
@@ -36,7 +36,7 @@ if [ "$LAST_CHECK" != "$TODAY" ]; then

  if [ "${#MISSING[@]}" -gt 0 ]; then
    JOINED="$(IFS=,; echo "${MISSING[*]}")"
-   ADDITIONAL_CONTEXT+="## CurDX-Flow Recommended Plugins Check\n\nThe following recommended plugins were not detected: **${JOINED}**\n\nRun \`/curdx-flow:install-deps\` for interactive one-shot install. Run \`/curdx-flow:doctor\` for the full health report.\n\n"
+   ADDITIONAL_CONTEXT+="## CurDX-Flow Recommended Plugins Check\n\nThe following recommended plugins were not detected: **${JOINED}**\n\nRun \`npx @curdx/flow install --all\` for interactive one-shot install. Run \`npx @curdx/flow doctor\` for the full health report.\n\n"
  fi

  echo "$TODAY" > "$MARKER" 2>/dev/null || true
package/hooks/scripts/stop-watcher.sh CHANGED
@@ -56,6 +56,12 @@ if ! command -v python3 >/dev/null 2>&1; then
    allow_stop
  fi

+ # Export STATE_FILE BEFORE invoking python3 — the heredoc-based parser reads
+ # os.environ["STATE_FILE"]. Previously the export was placed after the
+ # heredoc, so python3 always got None, json.load(None) silently failed, and
+ # the stop-hook strategy never activated.
+ export STATE_FILE
+
  read STRATEGY PHASE TASK_INDEX TOTAL_TASKS FAILED ROUNDS <<EOF
  $(python3 <<'PY'
  import json, os, sys
@@ -75,7 +81,6 @@ print(strategy, phase, ti, tt, failed, rounds)
  PY
  )
  EOF
- export STATE_FILE

  # Only activate for stop-hook strategy + execute phase
  [ "$STRATEGY" != "stop-hook" ] && allow_stop
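The ordering bug fixed in these two hunks comes down to when a child process receives its environment: it gets a copy at launch, so a variable exported afterwards is invisible to it. The Python sketch below is only an analogy for the shell behavior; `child_lookup` is a hypothetical helper and the state-file path is an invented example.

```python
import os
import subprocess
import sys


def child_lookup(name, env):
    """Start a child interpreter with `env` and report what it sees for `name`."""
    result = subprocess.run(
        [sys.executable, "-c", "import os, sys; print(os.environ.get(sys.argv[1]))", name],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    base = {k: v for k, v in os.environ.items() if k != "STATE_FILE"}
    # Child launched before the variable is exported: nothing to read.
    print(child_lookup("STATE_FILE", base))
    # Child launched after the export: the path is visible.
    exported = dict(base, STATE_FILE=".flow/specs/demo/.state.json")
    print(child_lookup("STATE_FILE", exported))
```

The variable name is passed through `sys.argv` rather than formatted into the source string, following the same rule the patch applies to `STATE_FILE`.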
@@ -95,12 +100,17 @@ if [ -n "$TRANSCRIPT_PATH" ] && [ -f "$TRANSCRIPT_PATH" ]; then
    TRANSCRIPT_TAIL=$(tail -c 51200 "$TRANSCRIPT_PATH" 2>/dev/null || echo "")
  fi

+ # Python state-file updates: use quoted heredocs (<<'PY') + os.environ so
+ # the spec-name-derived STATE_FILE path is NEVER interpolated into the
+ # python source text. Previously a spec name containing single quotes or
+ # $-signs could break the script or inject arbitrary code.
+
  # Check for explicit completion signals
  if echo "$TRANSCRIPT_TAIL" | grep -q "ALL_TASKS_COMPLETE"; then
    # Cleanup: mark phase completed
-   python3 <<PY 2>/dev/null
- import json
- p = "$STATE_FILE"
+   python3 <<'PY' 2>/dev/null
+ import json, os
+ p = os.environ["STATE_FILE"]
  s = json.load(open(p))
  s.setdefault("phase_status", {})["execute"] = "completed"
  s["phase"] = "verify" # move to verify phase
@@ -112,16 +122,16 @@ fi
  # Check for fail signal (accumulate; actual stop decision below)
  if echo "$TRANSCRIPT_TAIL" | grep -q "TASK_FAILED"; then
    # Increment failed_attempts
-   python3 <<PY 2>/dev/null
- import json
- p = "$STATE_FILE"
+   python3 <<'PY' 2>/dev/null
+ import json, os
+ p = os.environ["STATE_FILE"]
  s = json.load(open(p))
  s.setdefault("execute_state", {})
  s["execute_state"]["failed_attempts"] = s["execute_state"].get("failed_attempts", 0) + 1
  json.dump(s, open(p, "w"), indent=2, ensure_ascii=False)
  PY
-   # Re-read
-   FAILED=$(python3 -c "import json; print(json.load(open('$STATE_FILE'))['execute_state']['failed_attempts'])" 2>/dev/null || echo 0)
+   # Re-read — again via os.environ, no shell interpolation into python.
+   FAILED=$(python3 -c 'import json, os; print(json.load(open(os.environ["STATE_FILE"]))["execute_state"]["failed_attempts"])' 2>/dev/null || echo 0)
  fi

  # ---------- 6. Safety brakes ----------
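The heredoc in the `TASK_FAILED` branch performs a read-modify-write on the spec's state file. As a standalone sketch (not the package's code), the same update could be written as a function; `bump_failed_attempts` is a hypothetical name, and the `with` blocks are an addition here so that file handles are closed deterministically.

```python
import json
import os
import tempfile


def bump_failed_attempts(path):
    """Increment execute_state.failed_attempts in the JSON file at `path`."""
    with open(path, encoding="utf-8") as fh:
        state = json.load(fh)
    execute_state = state.setdefault("execute_state", {})
    execute_state["failed_attempts"] = execute_state.get("failed_attempts", 0) + 1
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(state, fh, indent=2, ensure_ascii=False)
    return execute_state["failed_attempts"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        state_file = os.path.join(tmp, ".state.json")
        with open(state_file, "w", encoding="utf-8") as fh:
            json.dump({"phase": "execute"}, fh)
        os.environ["STATE_FILE"] = state_file  # the path travels as data, as in the hook
        print(bump_failed_attempts(os.environ["STATE_FILE"]))
        print(bump_failed_attempts(os.environ["STATE_FILE"]))
```

Returning the new count from the function would also remove the need for the separate re-read step, though the hook keeps the two steps apart.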
@@ -138,9 +148,9 @@ fi
  # Check if all tasks done
  if [ "$TASK_INDEX" -ge "$TOTAL_TASKS" ] && [ "$TOTAL_TASKS" -gt 0 ]; then
    # Mark complete
-   python3 <<PY 2>/dev/null
- import json
- p = "$STATE_FILE"
+   python3 <<'PY' 2>/dev/null
+ import json, os
+ p = os.environ["STATE_FILE"]
  s = json.load(open(p))
  s.setdefault("phase_status", {})["execute"] = "completed"
  s["phase"] = "verify"
@@ -151,9 +161,9 @@ fi

  # ---------- 7. Block and continue ----------
  # Increment round counter
- python3 <<PY 2>/dev/null
- import json
- p = "$STATE_FILE"
+ python3 <<'PY' 2>/dev/null
+ import json, os
+ p = os.environ["STATE_FILE"]
  s = json.load(open(p))
  s.setdefault("execute_state", {})
  s["execute_state"]["global_iteration"] = s["execute_state"].get("global_iteration", 0) + 1
package/knowledge/epic-decomposition.md CHANGED
@@ -238,13 +238,13 @@ Week 5-6: Spec 4 (refund) + Spec 5 (query)
  ## Epic Lifecycle

  ```
- 1. /curdx-flow:triage "Epic goal"
+ 1. Invoke the `epic` skill with "Epic goal" (auto-invoked; or say "break this big feature down")
  ↓ flow-triage-analyst decomposes
  2. Generates .flow/_epics/<name>/epic.md + sub-spec skeletons
  3. User reviews epic.md

  4. For each sub-spec:
- /curdx-flow:switch <sub-spec-name>
+ /curdx-flow:start <sub-spec-name>
  /curdx-flow:spec
  /curdx-flow:implement
  /curdx-flow:review
package/knowledge/execution-strategies.md CHANGED
@@ -223,13 +223,14 @@ return "linear"

  ## Failure Handling (common to all strategies)

- `flow-executor` agent's 5-round retry mechanism:
+ `flow-executor` agent's retry ladder — each step escalates only when the prior is honestly exhausted, not on a fixed count:

  ```
- Rounds 1-2: agent retries autonomously (edit code, rerun Verify)
- Round 3: sequential-thinking root-cause analysis 5 rounds
- Round 4: read related source + trace data flow
- Round 5: report TASK_FAILED
+ Step A: autonomous retry (edit + rerun Verify) — only for shallow failures
+ Step B: sequential-thinking root-cause analysis proportional to the hypothesis space
+ Step C: read related source + trace data flow
+ Step D: if ≥3 retries fail with no new hypothesis, stop and challenge the architecture (see preamble L3)
+ Step E: report TASK_FAILED
  ```

  ### Extra protections for Stop-Hook strategy
@@ -252,7 +253,7 @@ Can you switch strategies mid-execution? Not recommended.
  - Any → Wave: needs `[P]` markers in tasks.md

  If you really must switch, do it manually:
- 1. `/curdx-flow:doctor` to check status
+ 1. `npx @curdx/flow doctor` to check status
  2. Manually edit `.flow/specs/<name>/.state.json`'s `strategy` field
  3. Rerun `/curdx-flow:implement`

@@ -262,13 +263,13 @@ If you really must switch, do it manually:

  ### View progress
  ```bash
- /curdx-flow:status # global
- /curdx-flow:status <name> # single-spec details
+ /curdx-flow:start --list # global
+ # For single-spec details, inspect .flow/specs/<name>/.progress.md
  ```

  ### Interrupt
  - `Ctrl+C` interrupts the current session → Stop event triggers, state is saved
- - Next `/curdx-flow:switch <name>` resumes from `task_index`
+ - Next `/curdx-flow:start <name>` (or `/curdx-flow:start --resume`) resumes from `task_index`

  ### Snapshots
  `/curdx-flow:save <label>` saves a checkpoint (Phase 5+ rollout).
package/knowledge/planning-reviews.md CHANGED
@@ -26,7 +26,7 @@ design.md

  Each review is dispatched independently (different agent / context) to avoid perspective convergence.

- Finally `/curdx-flow:autoplan` ties them together: runs all 4 reviews in one pass.
+ Finally `/curdx-flow:spec --review=all` ties them together: runs all 4 reviews in one pass.

  ---

@@ -115,7 +115,7 @@ Essentially runs `flow-architect` again — but this time not to generate the de

  ### Dispatch

- `flow-ux-designer` (Emma) switches into review mode.
+ `flow-ux-designer` switches into review mode.

  ---

@@ -153,10 +153,10 @@ Phase 5 implementation: reuse `flow-reviewer` + `@gates/devex-gate.md`.

  ---

- ## /curdx-flow:autoplan — Run All 4 at Once
+ ## /curdx-flow:spec --review=all — Run All 4 at Once

  ```bash
- /curdx-flow:autoplan
+ /curdx-flow:spec --review=all
  ```

  Workflow:
@@ -183,7 +183,7 @@ Output:
  ...

  ## Recommendations
- 1. Return to /curdx-flow:design to fix blockers
+ 1. Return to /curdx-flow:spec --phase=design to fix blockers
  2. Record warnings in STATE.md, address in tasks phase
  ```

@@ -191,7 +191,7 @@ Output:

  ## When to Skip Planning Reviews

- - **MVP / prototype**: time-pressured, run /curdx-flow:tasks first, review after launch
+ - **MVP / prototype**: time-pressured, run /curdx-flow:spec --phase=tasks first, review after launch
  - **Tiny changes**: a single file < 50 lines doesn't warrant a 4-dimension review
  - **Similar work done before**: reuse prior review conclusions

package/knowledge/spec-driven-development.md CHANGED
@@ -57,7 +57,7 @@ What's wasted isn't code — it's context tokens and decision fatigue from churn
  **Key behaviors** (flow-researcher agent):
  1. Read `.flow/PROJECT.md` and `.flow/CONTEXT.md` to understand project background
  2. Call `mcp__claude_mem__search` to retrieve relevant historical experience
- 3. Use sequential-thinking for 5-8 rounds of problem understanding
+ 3. Use sequential-thinking proportional to the unknowns (1 thought for a trivial prototype, many for a novel domain)
  4. Scan the codebase for reusable modules
  5. Use `mcp__context7__*` to look up latest docs for relevant libraries
  6. When necessary, WebSearch for the latest technical trends
@@ -99,11 +99,12 @@ What's wasted isn't code — it's context tokens and decision fatigue from churn

  **Key behaviors** (flow-architect agent):
  1. Read `research.md` + `requirements.md`
- 2. **Must use sequential-thinking for at least 8 rounds**:
- - Rounds 1-2: constraints
- - Rounds 3-5: comparison of options A/B
- - Rounds 6-7: selection + trade-offs
- - Round 8: rebut yourself
+ 2. **Use sequential-thinking proportional to the tradeoff surface** — the phases below are orientation, not a quota:
+ - Constraints (from NFR / tech stack)
+ - Option comparison (only when alternatives genuinely compete)
+ - Selection + accepted tradeoff
+ - Self-rebuttal
+ A well-known stack pick may finish in 1 thought; a distributed-system design may run many. Do not pad.
  3. Assign an `AD-NN` ID to each architectural decision
  4. Draw a data flow diagram (mermaid)
  5. Define component interfaces + error paths
@@ -125,7 +126,7 @@ What's wasted isn't code — it's context tokens and decision fatigue from churn
  3. Each task has 5 fields: `Do` / `Files` / `Done-when` / `Verify` / `Commit`
  4. **Multi-source coverage audit**: for each FR / AC / AD / decision, confirm there is a covering task (no omissions)
  5. Mark `[P]` (parallel-safe) and `[VERIFY]` (checkpoint)
- 6. Simple decomposition doesn't need sequential-thinking, but reflect on coverage every 5 tasks
+ 6. Simple decomposition doesn't need sequential-thinking; run a coverage audit at the end (every FR/AC/AD has a task)

  **Deliverable**: `tasks.md`

@@ -147,7 +148,7 @@ Regardless of the path taken, the 4 files must satisfy:
  ## Spec vs Epic Difference

  - **Spec**: a single independently-deliverable feature. Typically 1-2 weeks of effort.
- - **Epic**: a collection of specs. `/curdx-flow:triage` breaks down a large goal into multiple specs.
+ - **Epic**: a collection of specs. The `epic` skill (auto-invoked, or say "break this big feature down") breaks down a large goal into multiple specs.

  `.flow/specs/<name>/` is a single-spec directory.
  `.flow/_epics/<name>/` is an Epic directory (contains the dependency graph and sub-spec list).
@@ -159,8 +160,8 @@ Regardless of the path taken, the 4 files must satisfy:
  SDD is not dogma. The following scenarios may skip phases:

  - **One-off scripts** (`/curdx-flow:fast` mode) — skip all specs
- - **UI prototype exploration** (`/curdx-flow:sketch` mode) — only research + design sketches
- - **Emergency hotfix** (`/curdx-flow:spike` mode) — validating the assumption is enough
+ - **UI prototype exploration** (the `ui-sketch` skill) — only research + design sketches
+ - **Emergency hotfix** (`/curdx-flow:fast "spike: validate <hypothesis>"` mode) — validating the assumption is enough

  But **production code changes** should follow the full flow. Rationale:
  - Code may be only 20 lines, but impact may reach all users
package/knowledge/two-stage-review.md CHANGED
@@ -113,17 +113,18 @@ Stage 2 applies all enabled Gates (from `.flow/config.json`):

  #### 2.5 (enterprise) Adversarial review (adversarial-review-gate)

- - 3 categories of issues found?
+ - Every applicable category examined (N/A documented for the rest)?
+ - Findings proportional to real issues (zero is OK with a proof-of-checking report)?
  - Each finding has evidence + recommendation?

  #### 2.6 (enterprise) Edge cases (edge-case-gate)

- - Did all 7 major categories pass?
+ - Each applicable edge-case category addressed (N/A noted for the rest)?
  - Gap list has priorities?

  ### Stage 2 verdict

- - **EXCELLENT**: all enabled Gates pass, adversarial findings < 3 (high-quality code)
+ - **EXCELLENT**: all enabled Gates pass, adversarial review clean or only low-severity findings
  - **GOOD**: all enabled Gates pass, but some warnings
  - **NEEDS_IMPROVEMENT**: Gate violations (blocking)

@@ -206,7 +207,7 @@ Some reviewers list 50 minor improvements — the user can't process.
  ## Relationship to Other Phases

  ```
- /curdx-flow:tasks → tasks.md contains task list
+ /curdx-flow:spec --phase=tasks → tasks.md contains task list

  /curdx-flow:implement → code + tests + commits

@@ -218,7 +219,7 @@ Some reviewers list 50 minor improvements — the user can't process.
  ↓ ↓
  ↓ review-report.md

- (optional) /curdx-flow:audit → adversarial review + edge cases
+ (optional) /curdx-flow:verify --strict → adversarial review + edge cases

  adversarial-review.md
  edge-cases.md
package/knowledge/wave-execution.md CHANGED
@@ -254,7 +254,7 @@ Decision:
  - 1.1 and 1.3 commits retained
  - Main agent decides:
  A: continue to Wave 2 (skip 1.2, possible cascading failure)
- B: dispatch David (flow-debugger) to fix 1.2, then continue
+ B: dispatch flow-debugger to fix 1.2, then continue
  C: stop and report, let the user intervene

  Default: A, but failed_attempts += 1; after threshold switch to C
@@ -268,7 +268,7 @@ Wave 1 all TASK_FAILED
  Decision:
  - Usually indicates an upstream environment problem (missing deps, tsc config wrong)
  - Stop immediately
- - Suggest user run /curdx-flow:doctor to diagnose
+ - Suggest user run `npx @curdx/flow doctor` to diagnose
  ```

  ### Inter-wave dependency broken
@@ -307,7 +307,7 @@ Decision:

  ### In-progress view

- `/curdx-flow:status` shows:
+ Inspecting `.flow/specs/<name>/.progress.md` (or running `/curdx-flow:start --list`) shows:
  ```
  Spec: auth-system
  Strategy: wave
@@ -321,7 +321,7 @@ Progress: Wave 2/5 (60%)
  ### Ctrl+C interruption

  - Running Task calls in the current wave keep going (Claude Code's Task is an independent process)
- - Next `/curdx-flow:switch` shows some tasks already committed
+ - Next `/curdx-flow:start --resume` shows some tasks already committed
  - Resume from the failing task

  ---

@@ -367,7 +367,7 @@ Phase 6+ will consider automatic fallback.
  ### 1. `[P]` markers incorrect

  If the planner missed a dependency, `[P]` may be wrong. Solutions:
- - Before execution, confirm tasks coverage via `/curdx-flow:audit`
+ - Before execution, confirm tasks coverage via `/curdx-flow:verify --strict`
  - Conflict detection as a safety net (validate Files before dispatch)

  ### 2. A wave too large
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@curdx/flow",
-   "version": "1.1.11",
+   "version": "2.0.0-beta.10",
    "description": "CLI installer for CurDX-Flow — AI engineering workflow meta-framework for Claude Code",
    "type": "module",
    "bin": {
@@ -8,7 +8,8 @@
      "curdx-flow": "bin/curdx-flow.js"
    },
    "scripts": {
-     "prepublishOnly": "node bin/curdx-flow.js --version"
+     "test": "node --test test/*.test.js",
+     "prepublishOnly": "node --test test/*.test.js && node bin/curdx-flow.js --version"
    },
    "files": [
      "bin/",
@@ -22,6 +23,7 @@
      "agent-preamble/",
      "templates/",
      "schemas/",
+     "skills/",
      "README.md",
      "CHANGELOG.md",
      "LICENSE"
package/skills/brownfield-index/SKILL.md ADDED
@@ -0,0 +1,62 @@
+ ---
+ name: brownfield-index
+ description: Invoke when the user is new to an unfamiliar / legacy / brownfield codebase and wants a structural understanding — module map, component inventory, API surface, data flow. Triggers on "legacy code", "brownfield", "unfamiliar", "new to this code", "new to this project", "just joined", "inherited codebase", "explore codebase", "understand structure", "index code", "map modules", "tour", "onboard", "what is this project".
+ allowed-tools: [Read, Grep, Glob, Bash]
+ ---
+
+ # Brownfield Index
+
+ You are invoked when the user needs a structural map of an existing codebase they are not yet familiar with.
+
+ ## Preconditions
+
+ 1. The repository root is the current working directory (or a path the user specifies).
+ 2. The project is not a new `/curdx-flow:init`-ed greenfield project (if it is, direct the user to `/curdx-flow:start` instead).
+
+ ## Workflow
+
+ ### Step 1: Detect project type
+
+ Read `package.json` / `Cargo.toml` / `pyproject.toml` / `go.mod` / `pom.xml` to classify the ecosystem and build tool. This determines which directory conventions to apply.
+
+ ### Step 2: Scan directory structure
+
+ Produce a top-level inventory:
+ - **Entry points** (main / index / bin scripts)
+ - **Module directories** (src/, lib/, internal/, pkg/ …)
+ - **Test directories**
+ - **Config files**
+ - **Tooling** (CI, lint, format configs)
+
+ ### Step 3: Component inventory
+
+ For each module directory, list:
+ - Files and their apparent role (inferred from names + top-of-file comments)
+ - Public exports / exported symbols
+ - Third-party dependencies imported
+
+ ### Step 4: API surface
+
+ If HTTP / RPC endpoints exist, index them: route → handler → middleware. For CLI tools, index commands → handlers.
+
+ ### Step 5: Write index document
+
+ Output `.flow/codebase-index.md` containing:
+ - **Overview** (project purpose, build tool, runtime)
+ - **Directory tree** (with per-directory one-liner descriptions)
+ - **Entry points** (where execution starts)
+ - **Key abstractions** (core types, interfaces, classes that everything else hangs off)
+ - **External dependencies** (grouped: prod runtime / dev tooling / transitive)
+ - **Known gaps / red flags** (missing tests, TODOs, suspicious patterns)
+
+ ### Step 6: Hand off
+
+ Point the user at the next useful action:
+ - "Looking to add a feature here? Run `/curdx-flow:start <name>` to begin a spec."
+ - "Debugging something specific? Run `/curdx-flow:debug '<symptom>'`."
+
+ ## Notes
+
+ This skill uses Read + Grep + Glob + Bash with no specialized agent — general tools are enough for structural discovery. The index is meant to be quick (5–10 minutes), not exhaustive.
+
+ For deep research into a specific library or framework, use `context7` MCP directly.
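Step 1 of the skill above classifies a repository by the manifest files at its root. The skill itself is carried out by an agent with Read/Glob tools, so the following is only a minimal sketch of the same check as a script; `MANIFESTS` and `detect_ecosystems` are hypothetical names, and the ecosystem labels are illustrative.

```python
import tempfile
from pathlib import Path

# Manifest file name -> ecosystem label. The file names come from Step 1;
# the labels are illustrative assumptions.
MANIFESTS = {
    "package.json": "Node.js",
    "Cargo.toml": "Rust",
    "pyproject.toml": "Python",
    "go.mod": "Go",
    "pom.xml": "Java (Maven)",
}


def detect_ecosystems(root="."):
    """List the ecosystems whose manifest file exists directly under `root`."""
    root_path = Path(root)
    return [label for name, label in MANIFESTS.items() if (root_path / name).is_file()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A polyglot repository: a Rust core with a Go tool beside it.
        (Path(tmp) / "Cargo.toml").write_text("[package]\n")
        (Path(tmp) / "go.mod").write_text("module example\n")
        print(detect_ecosystems(tmp))
```

Returning a list rather than a single label leaves room for polyglot repositories, where more than one set of directory conventions may apply.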
package/skills/browser-qa/SKILL.md ADDED
@@ -0,0 +1,50 @@
+ ---
+ name: browser-qa
+ description: Invoke when the user wants to test a UI/frontend in a real browser — accessibility, performance, console errors, network traffic, visual regression. Triggers on "browser test", "test in browser", "UI test", "e2e test", "frontend test", "accessibility", "a11y", "WCAG", "lighthouse", "performance audit", "console error", "network request", "cross-browser", "responsive", "mobile test", "visual regression", "screenshot".
+ allowed-tools: [Read, Write, Bash, Grep, Glob, WebFetch]
+ ---
+
+ # Browser QA
+
+ You are invoked when the user wants real-browser QA of a UI flow.
+
+ ## Preconditions
+
+ 1. `chrome-devtools` MCP is available (`mcp__chrome-devtools__*`). If missing, fall back to a manual checklist.
+ 2. A URL (dev server or deployed) is available. Prompt for it if not provided.
+
+ ## Workflow
+
+ ### Step 1: Clarify scope
+
+ Confirm with the user:
+ - **URL under test** (local `http://localhost:3000` or remote)
+ - **Flow to test** (e.g., "sign up → dashboard → logout")
+ - **What success looks like** (accessibility / performance / zero console errors / visual match)
+
+ ### Step 2: Dispatch `flow-qa-engineer`
+
+ Delegate to the `flow-qa-engineer` agent. It will:
+ 1. Open the target URL via `mcp__chrome-devtools__new_page`
+ 2. Drive the flow with `mcp__chrome-devtools__click` / `fill` / `navigate`
+ 3. Capture `list_console_messages`, `list_network_requests`, `take_screenshot`, optionally `lighthouse_audit`
+ 4. Compare against expected behavior
+
+ ### Step 3: Report findings
+
+ Produce `.flow/specs/<active>/qa-report.md` with:
+ - **Bugs** (reproducible, severity P1/P2/P3)
+ - **Performance** (LCP / INP / CLS from Lighthouse)
+ - **Accessibility** (axe violations with WCAG references)
+ - **Console errors** (full stack traces)
+ - **Screenshots** (attached)
+
+ ### Step 4: Hand off
+
+ If bugs found: suggest `/curdx-flow:debug "<bug title>"` for systematic root-cause analysis.
+ If accessibility violations: suggest fixes inline with WCAG refs.
+
+ ## References
+
+ - `flow-qa-engineer` agent: `@${CLAUDE_PLUGIN_ROOT}/agents/flow-qa-engineer.md`
+ - chrome-devtools MCP docs: https://github.com/ChromeDevTools/chrome-devtools-mcp