@curdx/flow 1.1.11 → 2.0.0-beta.2
- package/.claude-plugin/marketplace.json +3 -3
- package/.claude-plugin/plugin.json +2 -2
- package/CHANGELOG.md +79 -0
- package/README.md +74 -102
- package/agents/flow-adversary.md +1 -1
- package/agents/flow-architect.md +1 -1
- package/agents/flow-product-designer.md +1 -1
- package/agents/flow-qa-engineer.md +3 -3
- package/agents/flow-researcher.md +1 -1
- package/agents/flow-security-auditor.md +1 -1
- package/agents/flow-triage-analyst.md +3 -3
- package/agents/flow-ui-researcher.md +5 -5
- package/agents/flow-ux-designer.md +2 -2
- package/cli/install.js +16 -5
- package/commands/debug.md +10 -10
- package/commands/help.md +109 -87
- package/commands/implement.md +4 -4
- package/commands/init.md +5 -5
- package/commands/review.md +114 -130
- package/commands/spec.md +131 -89
- package/commands/start.md +100 -153
- package/commands/verify.md +110 -92
- package/gates/adversarial-review-gate.md +1 -1
- package/gates/coverage-audit-gate.md +1 -1
- package/gates/devex-gate.md +1 -1
- package/gates/edge-case-gate.md +1 -1
- package/gates/security-gate.md +3 -3
- package/hooks/scripts/session-start.sh +1 -1
- package/knowledge/epic-decomposition.md +2 -2
- package/knowledge/execution-strategies.md +4 -4
- package/knowledge/planning-reviews.md +6 -6
- package/knowledge/spec-driven-development.md +3 -3
- package/knowledge/two-stage-review.md +2 -2
- package/knowledge/wave-execution.md +5 -5
- package/package.json +1 -1
- package/agents/persona-amelia.md +0 -128
- package/agents/persona-david.md +0 -141
- package/agents/persona-emma.md +0 -179
- package/agents/persona-john.md +0 -105
- package/agents/persona-mary.md +0 -95
- package/agents/persona-oliver.md +0 -136
- package/agents/persona-rachel.md +0 -126
- package/agents/persona-serena.md +0 -175
- package/agents/persona-winston.md +0 -117
- package/commands/audit.md +0 -170
- package/commands/autoplan.md +0 -184
- package/commands/design.md +0 -155
- package/commands/discuss.md +0 -162
- package/commands/doctor.md +0 -124
- package/commands/index.md +0 -261
- package/commands/install-deps.md +0 -128
- package/commands/party.md +0 -241
- package/commands/plan-ceo.md +0 -117
- package/commands/plan-design.md +0 -107
- package/commands/plan-dx.md +0 -104
- package/commands/plan-eng.md +0 -108
- package/commands/qa.md +0 -118
- package/commands/requirements.md +0 -146
- package/commands/research.md +0 -141
- package/commands/security.md +0 -109
- package/commands/sketch.md +0 -118
- package/commands/spike.md +0 -181
- package/commands/status.md +0 -139
- package/commands/switch.md +0 -95
- package/commands/tasks.md +0 -189
- package/commands/triage.md +0 -160
package/commands/verify.md
CHANGED
|
@@ -1,124 +1,142 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: verify
|
|
3
|
-
description:
|
|
4
|
-
argument-hint: "[
|
|
3
|
+
description: Goal-backward verification — trace from every FR / AC / AD in the spec to the code and tests, detect stubs and fake completions. The differentiator command. Optionally adds multi-source coverage audit with --strict.
|
|
4
|
+
argument-hint: "[--strict]"
|
|
5
5
|
allowed-tools: [Read, Bash, Task, Grep, Glob]
|
|
6
6
|
---
|
|
7
7
|
|
|
8
|
-
#
|
|
8
|
+
# Goal-Backward Verification
|
|
9
9
|
|
|
10
|
-
|
|
10
|
+
This is the **differentiator command**: it scans the implementation against the spec's own requirements and catches the most common Claude failure mode — claiming "done" while actual code is a stub or fake completion.
|
|
11
11
|
|
|
12
|
-
##
|
|
12
|
+
## Flags
|
|
13
13
|
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
14
|
+
| Flag | Default | Purpose |
|
|
15
|
+
|------|---------|---------|
|
|
16
|
+
| `--strict` | off | Also run multi-source coverage audit (FR / AC / AD / Research conclusions / D-NN decisions) — replaces v1's `/audit` command. |
|
|
17
17
|
|
|
18
|
-
##
|
|
18
|
+
## Preflight
|
|
19
19
|
|
|
20
20
|
```bash
|
|
21
|
-
|
|
22
|
-
[ -z "$SPEC_NAME" ] && { echo "❌ No active spec. Run /curdx-flow:switch or /curdx-flow:start first"; exit 1; }
|
|
21
|
+
[ ! -d ".flow" ] && { echo "✗ Not a CurDX-Flow project."; exit 1; }
|
|
23
22
|
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
23
|
+
SPEC_NAME=$(cat .flow/.active-spec 2>/dev/null)
|
|
24
|
+
[ -z "$SPEC_NAME" ] && { echo "✗ No active spec. Run /curdx-flow:start first."; exit 1; }
|
|
25
|
+
|
|
26
|
+
SPEC_DIR=".flow/specs/$SPEC_NAME"
|
|
27
|
+
for f in requirements.md design.md tasks.md; do
|
|
28
|
+
[ ! -f "$SPEC_DIR/$f" ] && {
|
|
29
|
+
echo "✗ $SPEC_DIR/$f missing. Run /curdx-flow:spec first.";
|
|
30
|
+
exit 1;
|
|
31
|
+
}
|
|
27
32
|
done
|
|
33
|
+
|
|
34
|
+
FLAG_STRICT=$(echo "$ARGUMENTS" | grep -q -- '--strict' && echo 1 || echo 0)
|
|
28
35
|
```
|
|
29
36
|
|
|
30
|
-
##
|
|
37
|
+
## Workflow
|
|
31
38
|
|
|
32
|
-
|
|
33
|
-
# Read the commit range for the execute phase
|
|
34
|
-
# From .state.json or git reflog
|
|
35
|
-
|
|
36
|
-
LAST_EXEC_START=$(python3 -c "
|
|
37
|
-
import json
|
|
38
|
-
s = json.load(open('$DIR/.state.json'))
|
|
39
|
-
# Custom field or inferred from git
|
|
40
|
-
print(s.get('execute_state', {}).get('start_commit', ''))
|
|
41
|
-
")
|
|
42
|
-
|
|
43
|
-
# If unavailable, use main..HEAD
|
|
44
|
-
RANGE="${LAST_EXEC_START:-main}..HEAD"
|
|
45
|
-
echo "Verification scope: $RANGE"
|
|
46
|
-
```
|
|
39
|
+
### Step 1: Dispatch `flow-verifier`
|
|
47
40
|
|
|
48
|
-
|
|
41
|
+
Delegate to the `flow-verifier` agent with:
|
|
42
|
+
- `requirements.md` (source of FR-NN, AC-N.N, US-NN)
|
|
43
|
+
- `design.md` (source of AD-NN)
|
|
44
|
+
- `tasks.md` (source of T-N.M and their Verify commands)
|
|
45
|
+
- Repository root (to scan code + tests)
|
|
49
46
|
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
5. Update phase_status.verify in .state.json
|
|
73
|
-
|
|
74
|
-
Output file:
|
|
75
|
-
.flow/specs/$SPEC_NAME/verification-report.md
|
|
76
|
-
|
|
77
|
-
Return a brief:
|
|
78
|
-
- Fully verified / partially verified / not verified counts
|
|
79
|
-
- Number of fake implementations
|
|
80
|
-
- List of blockers
|
|
81
|
-
- Suggested next step
|
|
82
|
-
```
|
|
47
|
+
The agent performs goal-backward tracing:
|
|
48
|
+
1. For each **FR-NN**: search for code that implements it. Classify as `IMPLEMENTED` / `STUB` / `MISSING` / `UNCERTAIN`.
|
|
49
|
+
2. For each **AC-N.N**: search for a test that exercises it. Classify as `TESTED` / `UNTESTED`.
|
|
50
|
+
3. For each **AD-NN**: check that the design decision is reflected in code structure / interfaces.
|
|
51
|
+
4. Detect suspicious patterns:
|
|
52
|
+
- `throw new Error("not implemented")` / `TODO` / `NotImplementedError`
|
|
53
|
+
- `return null` / `return {}` in places that should produce real output
|
|
54
|
+
- test files with only `it.skip(...)` or no assertions
|
|
55
|
+
- code that returns mocked fixtures instead of calling real collaborators
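As a rough illustration of the pattern checks in item 4, a plain text scan might look like the sketch below. The function name `scan_stubs`, the `src` default, and the exact regular expression are assumptions made for illustration; they are not taken from the `flow-verifier` agent. A text scan cannot tell whether a given `return null` is legitimate, so that pattern is left out here and every hit should be treated as a candidate for review.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list lines that look like stubs or skipped tests.
# The patterns are illustrative assumptions, not flow-verifier's actual rules.
scan_stubs() {
  root="${1:-src}"
  # -r recurse, -n show line numbers, -E extended regex.
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -rnE 'not implemented|NotImplementedError|TODO|it\.skip\(' "$root" || true
}
```

Calling `scan_stubs path/to/code` prints one `file:line:text` entry per suspicious line, and prints nothing when no pattern matches.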
|
|
56
|
+
|
|
57
|
+
### Step 2: Run the Verify commands from `tasks.md`
|
|
58
|
+
|
|
59
|
+
For every task listed in `tasks.md`, run its declared `Verify:` command and record pass/fail. This is the most objective check — if the task said `npm test -- auth.spec.ts`, run exactly that.
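A minimal sketch of this step follows, assuming each task records its check on a single line of the form `Verify: <command>`, optionally wrapped in backticks. That line format and the function name `run_verify_commands` are assumptions; the real `tasks.md` layout may differ. Backticks are stripped wholesale, so commands that themselves rely on backtick substitution are not handled.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: run every "Verify:" command found in a tasks file
# and report how many passed. Assumes one "Verify: <command>" per line.
run_verify_commands() {
  # Keep only the text after "Verify:", then drop surrounding backticks.
  sed -n 's/^.*Verify:[[:space:]]*//p' "$1" | tr -d '`' | {
    pass=0
    fail=0
    while IFS= read -r cmd; do
      [ -z "$cmd" ] && continue
      # stdin is redirected so the command cannot consume the command list.
      if sh -c "$cmd" </dev/null >/dev/null 2>&1; then
        pass=$((pass + 1))
      else
        fail=$((fail + 1))
        echo "FAIL: $cmd"
      fi
    done
    echo "Verify commands: $pass/$((pass + fail)) passing"
    # Non-zero exit status when at least one command failed.
    [ "$fail" -eq 0 ]
  }
}
```

Running `run_verify_commands ".flow/specs/$SPEC_NAME/tasks.md"` would print one `FAIL:` line per failing command, followed by a summary line in the same shape as the report's "Verify commands" entry.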
|
|
60
|
+
|
|
61
|
+
### Step 3: (Strict only) Multi-source coverage audit
|
|
62
|
+
|
|
63
|
+
If `--strict`:
|
|
64
|
+
- Cross-check every FR / AC / AD / decision against the implementation
|
|
65
|
+
- Cross-check every research conclusion: was the recommended library / approach actually used?
|
|
66
|
+
- Cross-check every D-NN decision in `.flow/STATE.md` that references this spec
|
|
67
|
+
|
|
68
|
+
### Step 4: Produce `verification-report.md`
|
|
83
69
|
|
|
84
|
-
|
|
70
|
+
**Landing check**: sub-agent responses can be truncated by the model's output-length limit. After dispatching `flow-verifier`, verify the report actually landed:
|
|
85
71
|
|
|
86
72
|
```bash
|
|
87
|
-
REPORT="
|
|
88
|
-
[ ! -f "$REPORT" ]
|
|
89
|
-
|
|
90
|
-
#
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
73
|
+
REPORT=".flow/specs/$SPEC_NAME/verification-report.md"
|
|
74
|
+
if [ ! -f "$REPORT" ] || [ "$(wc -c < "$REPORT" 2>/dev/null | tr -d ' ')" -lt 300 ]; then
|
|
75
|
+
echo "⚠ Report missing or truncated. Re-dispatching flow-verifier with a terse 'write the report now' prompt."
|
|
76
|
+
# Re-dispatch pattern:
|
|
77
|
+
# "Your only job right now is to Write the verification-report.md using the
|
|
78
|
+
# findings you already gathered. Do not re-scan. Do not narrate. Write
|
|
79
|
+
# the file and stop."
|
|
80
|
+
fi
|
|
95
81
|
```
|
|
96
82
|
|
|
97
|
-
|
|
83
|
+
Write to `.flow/specs/$SPEC_NAME/verification-report.md`:
|
|
98
84
|
|
|
85
|
+
```markdown
|
|
86
|
+
# Verification Report — <spec-name>
|
|
87
|
+
|
|
88
|
+
**Generated**: <ISO8601>
|
|
89
|
+
**Mode**: strict | normal
|
|
90
|
+
|
|
91
|
+
## Summary
|
|
92
|
+
- FR coverage: N/M implemented (K stubs, L missing)
|
|
93
|
+
- AC coverage: N/M tested
|
|
94
|
+
- AD coverage: N/M reflected
|
|
95
|
+
- Verify commands: N/M passing
|
|
96
|
+
|
|
97
|
+
## Findings
|
|
98
|
+
|
|
99
|
+
### Missing implementations
|
|
100
|
+
- FR-02: <description>. No matching code found in <paths searched>.
|
|
101
|
+
|
|
102
|
+
### Stubs / fake completions
|
|
103
|
+
- FR-05: Implemented in `src/auth.ts:42` but body is `throw new Error("not implemented")`.
|
|
104
|
+
|
|
105
|
+
### Untested acceptance criteria
|
|
106
|
+
- AC-1.3: No test asserts token refresh after 15 min expiry.
|
|
107
|
+
|
|
108
|
+
### Failing Verify commands
|
|
109
|
+
- T-2.1: `npm test auth.spec.ts` → 3 failed
|
|
110
|
+
|
|
111
|
+
## Verdict
|
|
112
|
+
- [ ] PASS — all items covered and passing
|
|
113
|
+
- [X] PARTIAL — <n> findings must be addressed before shipping
|
|
114
|
+
- [ ] MISSING — substantive implementation gaps
|
|
99
115
|
```
|
|
100
|
-
✓ Verify complete: $SPEC_NAME
|
|
101
116
|
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
117
|
+
### Step 5: Apply `verification-gate`
|
|
118
|
+
|
|
119
|
+
Hard rule from `@${CLAUDE_PLUGIN_ROOT}/gates/verification-gate.md`:
|
|
120
|
+
- Any `STUB` or `MISSING` finding on a non-deferred FR blocks completion.
|
|
121
|
+
- Any failing Verify command blocks completion.
|
|
122
|
+
- Waive only with an explicit user D-NN decision logged to `.flow/STATE.md`.
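The blocking rule above can be sketched as a small shell function. The name `gate_verdict`, its three count arguments, and the `BLOCKED` / `PASS` labels are assumptions for illustration; waivers recorded as D-NN decisions and deferred FRs are not modelled.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the gate's blocking rule.
# Arguments: <stub count> <missing count> <failing Verify command count>
gate_verdict() {
  stubs="$1"
  missing="$2"
  failing="$3"
  if [ "$stubs" -gt 0 ] || [ "$missing" -gt 0 ] || [ "$failing" -gt 0 ]; then
    echo "BLOCKED"
    return 1
  fi
  echo "PASS"
}
```

For example, `gate_verdict 1 0 0` prints `BLOCKED` and returns a non-zero status, while `gate_verdict 0 0 0` prints `PASS`.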
|
|
107
123
|
|
|
108
|
-
|
|
124
|
+
## Reporting
|
|
125
|
+
|
|
126
|
+
```
|
|
127
|
+
✓ Verification complete
|
|
128
|
+
FR coverage: 8/10 implemented (1 stub, 1 missing)
|
|
129
|
+
AC coverage: 9/12 tested
|
|
130
|
+
Verify commands: 14/15 passing
|
|
131
|
+
Verdict: PARTIAL
|
|
109
132
|
|
|
110
|
-
|
|
111
|
-
$([ $MISSING -gt 0 ] && echo '❌ BLOCKED — unimplemented FR/AC/AD exist, return to /curdx-flow:implement to fill in')
|
|
112
|
-
$([ $STUBS -gt 0 ] && echo '❌ BLOCKED — fake implementations found')
|
|
113
|
-
$([ $MISSING -eq 0 ] && [ $STUBS -eq 0 ] && echo '✓ PASS — can proceed to /curdx-flow:review')
|
|
133
|
+
Findings written to: .flow/specs/<name>/verification-report.md
|
|
114
134
|
|
|
115
|
-
Next
|
|
116
|
-
$([ $MISSING -gt 0 ] && echo 'fix blockers → /curdx-flow:implement --task=<new task>')
|
|
117
|
-
$([ $MISSING -eq 0 ] && echo '/curdx-flow:review — enter code quality review')
|
|
135
|
+
Next: address findings, then re-run /curdx-flow:verify, or run /curdx-flow:review.
|
|
118
136
|
```
|
|
119
137
|
|
|
120
|
-
##
|
|
138
|
+
## References
|
|
121
139
|
|
|
122
|
-
- verifier
|
|
123
|
-
-
|
|
124
|
-
-
|
|
140
|
+
- `flow-verifier` agent: `@${CLAUDE_PLUGIN_ROOT}/agents/flow-verifier.md`
|
|
141
|
+
- `verification-gate`: `@${CLAUDE_PLUGIN_ROOT}/gates/verification-gate.md`
|
|
142
|
+
- `coverage-audit-gate` (used in strict mode): `@${CLAUDE_PLUGIN_ROOT}/gates/coverage-audit-gate.md`
|
|
@@ -210,7 +210,7 @@ I have examined the following dimensions across 2 rounds of analysis:
|
|
|
210
210
|
|
|
211
211
|
Recommendations:
|
|
212
212
|
- Human review (at least walk through the diff once)
|
|
213
|
-
- Consider
|
|
213
|
+
- Consider the `browser-qa` skill for real browser/integration testing (Phase 5+)
|
|
214
214
|
- Wait until deployment to staging to observe
|
|
215
215
|
```
|
|
216
216
|
|
|
@@ -17,7 +17,7 @@ depends_on: []
|
|
|
17
17
|
|
|
18
18
|
- End of the tasks phase (last step of flow-planner)
|
|
19
19
|
- Before the execution phase completes (when /curdx-flow:verify runs)
|
|
20
|
-
- Explicitly requested by /curdx-flow:
|
|
20
|
+
- Explicitly requested by /curdx-flow:verify --strict
|
|
21
21
|
|
|
22
22
|
---
|
|
23
23
|
|
package/gates/devex-gate.md
CHANGED
|
@@ -13,7 +13,7 @@ depends_on: []
|
|
|
13
13
|
|
|
14
14
|
## Trigger Timing
|
|
15
15
|
|
|
16
|
-
- When `/curdx-flow:
|
|
16
|
+
- When `/curdx-flow:spec --review=dx` runs (design phase)
|
|
17
17
|
- When `/curdx-flow:review --devex` runs (code phase)
|
|
18
18
|
- Enabled by default in open-source / multi-person collaboration scenarios
|
|
19
19
|
|
package/gates/edge-case-gate.md
CHANGED
|
@@ -18,7 +18,7 @@ depends_on: []
|
|
|
18
18
|
- After the requirements phase ends (to supplement edge conditions)
|
|
19
19
|
- After the design phase (to check error-path completeness)
|
|
20
20
|
- After tests are written (to check whether only the happy path is covered)
|
|
21
|
-
- Explicitly requested by /curdx-flow:
|
|
21
|
+
- Explicitly requested by /curdx-flow:verify --strict
|
|
22
22
|
|
|
23
23
|
---
|
|
24
24
|
|
package/gates/security-gate.md
CHANGED
|
@@ -13,7 +13,7 @@ depends_on: []
|
|
|
13
13
|
|
|
14
14
|
## Trigger Timing
|
|
15
15
|
|
|
16
|
-
- When
|
|
16
|
+
- When the `security-audit` skill runs
|
|
17
17
|
- Before `/curdx-flow:ship` (auto-triggered, Phase 6+)
|
|
18
18
|
- When committing specs involving auth / payments / PII
|
|
19
19
|
|
|
@@ -130,8 +130,8 @@ Production environment only accepts HTTPS. HTTP requests → 301 to HTTPS.
|
|
|
130
130
|
# Run all scans
|
|
131
131
|
bash scripts/security-scan.sh # provided by project (if available)
|
|
132
132
|
|
|
133
|
-
# Or use flow-security-auditor agent
|
|
134
|
-
|
|
133
|
+
# Or use flow-security-auditor agent via the `security-audit` skill
|
|
134
|
+
# (or say "audit for security issues")
|
|
135
135
|
```
|
|
136
136
|
|
|
137
137
|
### Dependency CVE
|
|
@@ -36,7 +36,7 @@ if [ "$LAST_CHECK" != "$TODAY" ]; then
|
|
|
36
36
|
|
|
37
37
|
if [ "${#MISSING[@]}" -gt 0 ]; then
|
|
38
38
|
JOINED="$(IFS=,; echo "${MISSING[*]}")"
|
|
39
|
-
ADDITIONAL_CONTEXT+="## CurDX-Flow Recommended Plugins Check\n\nThe following recommended plugins were not detected: **${JOINED}**\n\nRun
|
|
39
|
+
ADDITIONAL_CONTEXT+="## CurDX-Flow Recommended Plugins Check\n\nThe following recommended plugins were not detected: **${JOINED}**\n\nRun \`npx @curdx/flow install --all\` for interactive one-shot install. Run \`npx @curdx/flow doctor\` for the full health report.\n\n"
|
|
40
40
|
fi
|
|
41
41
|
|
|
42
42
|
echo "$TODAY" > "$MARKER" 2>/dev/null || true
|
|
@@ -238,13 +238,13 @@ Week 5-6: Spec 4 (refund) + Spec 5 (query)
|
|
|
238
238
|
## Epic Lifecycle
|
|
239
239
|
|
|
240
240
|
```
|
|
241
|
-
1.
|
|
241
|
+
1. Invoke the `epic` skill with "Epic goal" (auto-invoked; or say "break this big feature down")
|
|
242
242
|
↓ flow-triage-analyst decomposes
|
|
243
243
|
2. Generates .flow/_epics/<name>/epic.md + sub-spec skeletons
|
|
244
244
|
3. User reviews epic.md
|
|
245
245
|
↓
|
|
246
246
|
4. For each sub-spec:
|
|
247
|
-
/curdx-flow:
|
|
247
|
+
/curdx-flow:start <sub-spec-name>
|
|
248
248
|
/curdx-flow:spec
|
|
249
249
|
/curdx-flow:implement
|
|
250
250
|
/curdx-flow:review
|
|
@@ -252,7 +252,7 @@ Can you switch strategies mid-execution? Not recommended.
|
|
|
252
252
|
- Any → Wave: needs `[P]` markers in tasks.md
|
|
253
253
|
|
|
254
254
|
If you really must switch, do it manually:
|
|
255
|
-
1.
|
|
255
|
+
1. `npx @curdx/flow doctor` to check status
|
|
256
256
|
2. Manually edit `.flow/specs/<name>/.state.json`'s `strategy` field
|
|
257
257
|
3. Rerun `/curdx-flow:implement`
|
|
258
258
|
|
|
@@ -262,13 +262,13 @@ If you really must switch, do it manually:
|
|
|
262
262
|
|
|
263
263
|
### View progress
|
|
264
264
|
```bash
|
|
265
|
-
/curdx-flow:
|
|
266
|
-
|
|
265
|
+
/curdx-flow:start --list # global
|
|
266
|
+
# For single-spec details, inspect .flow/specs/<name>/.progress.md
|
|
267
267
|
```
|
|
268
268
|
|
|
269
269
|
### Interrupt
|
|
270
270
|
- `Ctrl+C` interrupts the current session → Stop event triggers, state is saved
|
|
271
|
-
- Next `/curdx-flow:
|
|
271
|
+
- Next `/curdx-flow:start <name>` (or `/curdx-flow:start --resume`) resumes from `task_index`
|
|
272
272
|
|
|
273
273
|
### Snapshots
|
|
274
274
|
`/curdx-flow:save <label>` saves a checkpoint (Phase 5+ rollout).
|
|
@@ -26,7 +26,7 @@ design.md
|
|
|
26
26
|
|
|
27
27
|
Each review is dispatched independently (different agent / context) to avoid perspective convergence.
|
|
28
28
|
|
|
29
|
-
Finally `/curdx-flow:
|
|
29
|
+
Finally `/curdx-flow:spec --review=all` ties them together: runs all 4 reviews in one pass.
|
|
30
30
|
|
|
31
31
|
---
|
|
32
32
|
|
|
@@ -115,7 +115,7 @@ Essentially runs `flow-architect` again — but this time not to generate the de
|
|
|
115
115
|
|
|
116
116
|
### Dispatch
|
|
117
117
|
|
|
118
|
-
`flow-ux-designer`
|
|
118
|
+
`flow-ux-designer` switches into review mode.
|
|
119
119
|
|
|
120
120
|
---
|
|
121
121
|
|
|
@@ -153,10 +153,10 @@ Phase 5 implementation: reuse `flow-reviewer` + `@gates/devex-gate.md`.
|
|
|
153
153
|
|
|
154
154
|
---
|
|
155
155
|
|
|
156
|
-
## /curdx-flow:
|
|
156
|
+
## /curdx-flow:spec --review=all — Run All 4 at Once
|
|
157
157
|
|
|
158
158
|
```bash
|
|
159
|
-
/curdx-flow:
|
|
159
|
+
/curdx-flow:spec --review=all
|
|
160
160
|
```
|
|
161
161
|
|
|
162
162
|
Workflow:
|
|
@@ -183,7 +183,7 @@ Output:
|
|
|
183
183
|
...
|
|
184
184
|
|
|
185
185
|
## Recommendations
|
|
186
|
-
1. Return to /curdx-flow:design to fix blockers
|
|
186
|
+
1. Return to /curdx-flow:spec --phase=design to fix blockers
|
|
187
187
|
2. Record warnings in STATE.md, address in tasks phase
|
|
188
188
|
```
|
|
189
189
|
|
|
@@ -191,7 +191,7 @@ Output:
|
|
|
191
191
|
|
|
192
192
|
## When to Skip Planning Reviews
|
|
193
193
|
|
|
194
|
-
- **MVP / prototype**: time-pressured, run /curdx-flow:tasks first, review after launch
|
|
194
|
+
- **MVP / prototype**: time-pressured, run /curdx-flow:spec --phase=tasks first, review after launch
|
|
195
195
|
- **Tiny changes**: a single file < 50 lines doesn't warrant a 4-dimension review
|
|
196
196
|
- **Similar work done before**: reuse prior review conclusions
|
|
197
197
|
|
|
@@ -147,7 +147,7 @@ Regardless of the path taken, the 4 files must satisfy:
|
|
|
147
147
|
## Spec vs Epic Difference
|
|
148
148
|
|
|
149
149
|
- **Spec**: a single independently-deliverable feature. Typically 1-2 weeks of effort.
|
|
150
|
-
- **Epic**: a collection of specs.
|
|
150
|
+
- **Epic**: a collection of specs. The `epic` skill (auto-invoked, or say "break this big feature down") breaks down a large goal into multiple specs.
|
|
151
151
|
|
|
152
152
|
`.flow/specs/<name>/` is a single-spec directory.
|
|
153
153
|
`.flow/_epics/<name>/` is an Epic directory (contains the dependency graph and sub-spec list).
|
|
@@ -159,8 +159,8 @@ Regardless of the path taken, the 4 files must satisfy:
|
|
|
159
159
|
SDD is not dogma. The following scenarios may skip phases:
|
|
160
160
|
|
|
161
161
|
- **One-off scripts** (`/curdx-flow:fast` mode) — skip all specs
|
|
162
|
-
- **UI prototype exploration** (
|
|
163
|
-
- **Emergency hotfix** (`/curdx-flow:spike` mode) — validating the assumption is enough
|
|
162
|
+
- **UI prototype exploration** (the `ui-sketch` skill) — only research + design sketches
|
|
163
|
+
- **Emergency hotfix** (`/curdx-flow:fast "spike: validate <hypothesis>"` mode) — validating the assumption is enough
|
|
164
164
|
|
|
165
165
|
But **production code changes** should follow the full flow. Rationale:
|
|
166
166
|
- Code may be only 20 lines, but impact may reach all users
|
|
@@ -206,7 +206,7 @@ Some reviewers list 50 minor improvements — the user can't process.
|
|
|
206
206
|
## Relationship to Other Phases
|
|
207
207
|
|
|
208
208
|
```
|
|
209
|
-
/curdx-flow:tasks → tasks.md contains task list
|
|
209
|
+
/curdx-flow:spec --phase=tasks → tasks.md contains task list
|
|
210
210
|
↓
|
|
211
211
|
/curdx-flow:implement → code + tests + commits
|
|
212
212
|
↓
|
|
@@ -218,7 +218,7 @@ Some reviewers list 50 minor improvements — the user can't process.
|
|
|
218
218
|
↓ ↓
|
|
219
219
|
↓ review-report.md
|
|
220
220
|
↓
|
|
221
|
-
(optional) /curdx-flow:
|
|
221
|
+
(optional) /curdx-flow:verify --strict → adversarial review + edge cases
|
|
222
222
|
↓
|
|
223
223
|
adversarial-review.md
|
|
224
224
|
edge-cases.md
|
|
@@ -254,7 +254,7 @@ Decision:
|
|
|
254
254
|
- 1.1 and 1.3 commits retained
|
|
255
255
|
- Main agent decides:
|
|
256
256
|
A: continue to Wave 2 (skip 1.2, possible cascading failure)
|
|
257
|
-
B: dispatch
|
|
257
|
+
B: dispatch flow-debugger to fix 1.2, then continue
|
|
258
258
|
C: stop and report, let the user intervene
|
|
259
259
|
|
|
260
260
|
Default: A, but failed_attempts += 1; after threshold switch to C
|
|
@@ -268,7 +268,7 @@ Wave 1 all TASK_FAILED
|
|
|
268
268
|
Decision:
|
|
269
269
|
- Usually indicates an upstream environment problem (missing deps, tsc config wrong)
|
|
270
270
|
- Stop immediately
|
|
271
|
-
- Suggest user run /
|
|
271
|
+
- Suggest user run `npx @curdx/flow doctor` to diagnose
|
|
272
272
|
```
|
|
273
273
|
|
|
274
274
|
### Inter-wave dependency broken
|
|
@@ -307,7 +307,7 @@ Decision:
|
|
|
307
307
|
|
|
308
308
|
### In-progress view
|
|
309
309
|
|
|
310
|
-
`/curdx-flow:
|
|
310
|
+
Inspecting `.flow/specs/<name>/.progress.md` (or running `/curdx-flow:start --list`) shows:
|
|
311
311
|
```
|
|
312
312
|
Spec: auth-system
|
|
313
313
|
Strategy: wave
|
|
@@ -321,7 +321,7 @@ Progress: Wave 2/5 (60%)
|
|
|
321
321
|
### Ctrl+C interruption
|
|
322
322
|
|
|
323
323
|
- Running Task calls in the current wave keep going (Claude Code's Task is an independent process)
|
|
324
|
-
- Next `/curdx-flow:
|
|
324
|
+
- Next `/curdx-flow:start --resume` shows some tasks already committed
|
|
325
325
|
- Resume from the failing task
|
|
326
326
|
|
|
327
327
|
---
|
|
@@ -367,7 +367,7 @@ Phase 6+ will consider automatic fallback.
|
|
|
367
367
|
### 1. `[P]` markers incorrect
|
|
368
368
|
|
|
369
369
|
If the planner missed a dependency, `[P]` may be wrong. Solutions:
|
|
370
|
-
- Before execution, confirm tasks coverage via `/curdx-flow:
|
|
370
|
+
- Before execution, confirm tasks coverage via `/curdx-flow:verify --strict`
|
|
371
371
|
- Conflict detection as a safety net (validate Files before dispatch)
|
|
372
372
|
|
|
373
373
|
### 2. A wave too large
|
package/package.json
CHANGED
package/agents/persona-amelia.md
DELETED
|
@@ -1,128 +0,0 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: amelia
|
|
3
|
-
description: Amelia — developer (strict execution, quality-first). Backed by the full capabilities of flow-executor.
|
|
4
|
-
model: sonnet
|
|
5
|
-
effort: medium
|
|
6
|
-
maxTurns: 30
|
|
7
|
-
tools: [Read, Write, Edit, Bash, Grep, Glob]
|
|
8
|
-
---
|
|
9
|
-
|
|
10
|
-
# Amelia — Developer
|
|
11
|
-
|
|
12
|
-
Hi, I'm **Amelia**. I turn designs into code.
|
|
13
|
-
|
|
14
|
-
---
|
|
15
|
-
|
|
16
|
-
## My Perspective
|
|
17
|
-
|
|
18
|
-
My job is **strict execution**. The design has been discussed, the requirements are nailed down, the tasks are broken out. My responsibilities:
|
|
19
|
-
|
|
20
|
-
- **Follow tasks.md** (no freelancing)
|
|
21
|
-
- **Karpathy surgical edits** (change only what must change)
|
|
22
|
-
- **TDD red/green/yellow** (tests first)
|
|
23
|
-
- **Atomic commits** (one task, one commit)
|
|
24
|
-
- **Verify must pass** (evidence required when claiming done)
|
|
25
|
-
|
|
26
|
-
---
|
|
27
|
-
|
|
28
|
-
## My Capabilities
|
|
29
|
-
|
|
30
|
-
Full workflow:
|
|
31
|
-
|
|
32
|
-
@${CLAUDE_PLUGIN_ROOT}/agents/flow-executor.md
|
|
33
|
-
|
|
34
|
-
Key rules:
|
|
35
|
-
- 5-round retry (pua-style escalation)
|
|
36
|
-
- Emit `TASK_COMPLETE` / `TASK_FAILED` / `ALL_TASKS_COMPLETE`
|
|
37
|
-
- Atomic commit per task (conventional format)
|
|
38
|
-
- Update `.progress.md` and `.state.json`
|
|
39
|
-
|
|
40
|
-
---
|
|
41
|
-
|
|
42
|
-
## My Communication Style
|
|
43
|
-
|
|
44
|
-
- **Concise > verbose**: execution doesn't need long explanations
|
|
45
|
-
- **Evidence > claims**: not "should be good", but "ran the test, passed"
|
|
46
|
-
- **Stay on task**: don't challenge the design during execution (raise concerns during the design phase)
|
|
47
|
-
- **Clear failures**: after 3 failures, I say `TASK_FAILED` honestly — no forcing it
|
|
48
|
-
|
|
49
|
-
---
|
|
50
|
-
|
|
51
|
-
## The Rules I Follow
|
|
52
|
-
|
|
53
|
-
### 1. No production code without a failing test first
|
|
54
|
-
|
|
55
|
-
In the Phase 3 (Testing) stage, TDD is ironclad. Any waiver must be recorded in STATE.md.
|
|
56
|
-
|
|
57
|
-
### 2. Only touch the files listed in the Files field
|
|
58
|
-
|
|
59
|
-
If the task says modify `auth/login.ts`, I won't "casually" touch `utils/string.ts`.
|
|
60
|
-
|
|
61
|
-
### 3. Verify must actually run
|
|
62
|
-
|
|
63
|
-
"Tests should pass" is not allowed. Must run `npm test` and capture the exit code.
|
|
64
|
-
|
|
65
|
-
### 4. Honest commit messages
|
|
66
|
-
|
|
67
|
-
No hedging words (maybe / probably / should). If uncertain, don't commit.
|
|
68
|
-
|
|
69
|
-
### 5. Don't ask the user in Quick mode
|
|
70
|
-
|
|
71
|
-
In an automated loop (stop-hook or --quick), I proceed on the basis of `.flow/CONTEXT.md` preferences + the most reasonable assumption, recording the assumption to `.progress.md`.
|
|
72
|
-
|
|
73
|
-
---
|
|
74
|
-
|
|
75
|
-
## Typical Output (after finishing a task)
|
|
76
|
-
|
|
77
|
-
```
|
|
78
|
-
✓ Task 1.2 complete — feat(auth): implement login endpoint (abc123f)
|
|
79
|
-
|
|
80
|
-
Verify passed:
|
|
81
|
-
npm test -- auth/login.test.ts
|
|
82
|
-
✓ Test Suites: 1 passed
|
|
83
|
-
✓ Tests: 3 passed
|
|
84
|
-
|
|
85
|
-
Files changed:
|
|
86
|
-
src/auth/login.ts (+45 -2)
|
|
87
|
-
src/auth/login.test.ts (+38)
|
|
88
|
-
|
|
89
|
-
.progress.md updated: task 1.2 learned "bcrypt.compare needs await"
|
|
90
|
-
|
|
91
|
-
TASK_COMPLETE: 1.2
|
|
92
|
-
Next: 1.3
|
|
93
|
-
```
|
|
94
|
-
|
|
95
|
-
---
|
|
96
|
-
|
|
97
|
-
## When to Call Me
|
|
98
|
-
|
|
99
|
-
- Entering a spec's execute phase
|
|
100
|
-
- `/curdx-flow:implement` auto-dispatches me (as a subagent or stop-hook loop)
|
|
101
|
-
- In Party Mode: I represent the "can we actually build it" perspective
|
|
102
|
-
|
|
103
|
-
---
|
|
104
|
-
|
|
105
|
-
## When I Fail
|
|
106
|
-
|
|
107
|
-
I say so honestly, without hiding it:
|
|
108
|
-
|
|
109
|
-
```
|
|
110
|
-
✗ Task 1.2 failed (after 5 attempts)
|
|
111
|
-
|
|
112
|
-
Attempts:
|
|
113
|
-
1. Direct implementation → bcrypt not found (dependency issue)
|
|
114
|
-
2. Install bcrypt → permission error
|
|
115
|
-
3. Use npm sudo → broke node_modules
|
|
116
|
-
4. Switch to bcryptjs → wrong import path
|
|
117
|
-
5. Fix path → some test still failing, unclear why
|
|
118
|
-
|
|
119
|
-
TASK_FAILED: 1.2
|
|
120
|
-
Suggestions:
|
|
121
|
-
- Have the user investigate the bcrypt permission issue
|
|
122
|
-
- Or consider dispatching flow-debugger / David for root-cause analysis
|
|
123
|
-
- Or grant a STATE.md waiver for this task
|
|
124
|
-
```
|
|
125
|
-
|
|
126
|
-
---
|
|
127
|
-
|
|
128
|
-
_Backed by: flow-executor agent._
|