@curdx/flow 1.1.11 → 2.0.0-beta.10
- package/.claude-plugin/marketplace.json +3 -3
- package/.claude-plugin/plugin.json +4 -11
- package/CHANGELOG.md +99 -0
- package/README.md +74 -102
- package/README.zh.md +2 -2
- package/agent-preamble/preamble.md +81 -11
- package/agents/flow-adversary.md +41 -56
- package/agents/flow-architect.md +24 -11
- package/agents/flow-debugger.md +2 -2
- package/agents/flow-edge-hunter.md +20 -6
- package/agents/flow-executor.md +3 -3
- package/agents/flow-planner.md +51 -48
- package/agents/flow-product-designer.md +15 -2
- package/agents/flow-qa-engineer.md +4 -4
- package/agents/flow-researcher.md +18 -3
- package/agents/flow-reviewer.md +5 -1
- package/agents/flow-security-auditor.md +2 -2
- package/agents/flow-triage-analyst.md +4 -4
- package/agents/flow-ui-researcher.md +7 -7
- package/agents/flow-ux-designer.md +3 -3
- package/agents/flow-verifier.md +47 -14
- package/bin/curdx-flow.js +13 -1
- package/cli/doctor.js +28 -13
- package/cli/install.js +62 -36
- package/cli/protocols.js +63 -10
- package/cli/registry.js +73 -0
- package/cli/uninstall.js +9 -11
- package/cli/upgrade.js +6 -10
- package/cli/utils.js +104 -56
- package/commands/debug.md +10 -10
- package/commands/fast.md +1 -1
- package/commands/help.md +109 -87
- package/commands/implement.md +7 -7
- package/commands/init.md +18 -7
- package/commands/review.md +114 -130
- package/commands/spec.md +131 -89
- package/commands/start.md +130 -153
- package/commands/verify.md +110 -92
- package/gates/adversarial-review-gate.md +20 -20
- package/gates/coverage-audit-gate.md +1 -1
- package/gates/devex-gate.md +5 -6
- package/gates/edge-case-gate.md +2 -2
- package/gates/security-gate.md +3 -3
- package/hooks/hooks.json +0 -11
- package/hooks/scripts/quick-mode-guard.sh +12 -9
- package/hooks/scripts/session-start.sh +2 -2
- package/hooks/scripts/stop-watcher.sh +25 -15
- package/knowledge/epic-decomposition.md +2 -2
- package/knowledge/execution-strategies.md +10 -9
- package/knowledge/planning-reviews.md +6 -6
- package/knowledge/spec-driven-development.md +11 -10
- package/knowledge/two-stage-review.md +6 -5
- package/knowledge/wave-execution.md +5 -5
- package/package.json +4 -2
- package/skills/brownfield-index/SKILL.md +62 -0
- package/skills/browser-qa/SKILL.md +50 -0
- package/skills/epic/SKILL.md +68 -0
- package/skills/security-audit/SKILL.md +50 -0
- package/skills/ui-sketch/SKILL.md +49 -0
- package/templates/config.json.tmpl +1 -1
- package/templates/design.md.tmpl +32 -112
- package/templates/requirements.md.tmpl +25 -43
- package/templates/research.md.tmpl +37 -68
- package/templates/tasks.md.tmpl +27 -84
- package/agents/persona-amelia.md +0 -128
- package/agents/persona-david.md +0 -141
- package/agents/persona-emma.md +0 -179
- package/agents/persona-john.md +0 -105
- package/agents/persona-mary.md +0 -95
- package/agents/persona-oliver.md +0 -136
- package/agents/persona-rachel.md +0 -126
- package/agents/persona-serena.md +0 -175
- package/agents/persona-winston.md +0 -117
- package/commands/audit.md +0 -170
- package/commands/autoplan.md +0 -184
- package/commands/design.md +0 -155
- package/commands/discuss.md +0 -162
- package/commands/doctor.md +0 -124
- package/commands/index.md +0 -261
- package/commands/install-deps.md +0 -128
- package/commands/party.md +0 -241
- package/commands/plan-ceo.md +0 -117
- package/commands/plan-design.md +0 -107
- package/commands/plan-dx.md +0 -104
- package/commands/plan-eng.md +0 -108
- package/commands/qa.md +0 -118
- package/commands/requirements.md +0 -146
- package/commands/research.md +0 -141
- package/commands/security.md +0 -109
- package/commands/sketch.md +0 -118
- package/commands/spike.md +0 -181
- package/commands/status.md +0 -139
- package/commands/switch.md +0 -95
- package/commands/tasks.md +0 -189
- package/commands/triage.md +0 -160
- package/hooks/scripts/fail-tracker.sh +0 -31
package/commands/start.md
CHANGED
|
@@ -1,189 +1,166 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: start
|
|
3
|
-
description:
|
|
4
|
-
argument-hint: "<spec-name> \"<goal>\""
|
|
5
|
-
allowed-tools: [Read, Write, Bash, AskUserQuestion]
|
|
3
|
+
description: Smart entry point — create a new spec, resume an existing one, or switch between specs. Replaces v1's /start + /switch.
|
|
4
|
+
argument-hint: "[<spec-name>] [\"<one-line goal>\"] [--resume] [--list] [--mode=<fast|standard|enterprise>]"
|
|
5
|
+
allowed-tools: [Read, Write, Bash, AskUserQuestion, Task]
|
|
6
6
|
---
|
|
7
7
|
|
|
8
|
-
# Start a Spec
|
|
8
|
+
# Start or Resume a Feature Spec
|
|
9
9
|
|
|
10
|
-
|
|
11
|
-
- New spec → create directory + templates + enter research phase
|
|
12
|
-
- Existing spec → resume from the last phase
|
|
10
|
+
Entry point for every feature. Works in four modes (create, switch, resume, list) depending on flags and existing state; with no arguments it asks the user which one to run.
|
|
13
11
|
|
|
14
|
-
##
|
|
12
|
+
## Invocation patterns
|
|
15
13
|
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
14
|
+
| Pattern | Behavior |
|
|
15
|
+
|---------|----------|
|
|
16
|
+
| `/curdx-flow:start my-feature "Add JWT auth to REST API"` | Create a fresh spec named `my-feature`, set it active, seed draft requirements. |
|
|
17
|
+
| `/curdx-flow:start my-feature` (name exists) | Switch active spec to `my-feature` (same as v1 `/switch`). |
|
|
18
|
+
| `/curdx-flow:start --resume` | Resume the last active spec from `.flow/.active-spec`. |
|
|
19
|
+
| `/curdx-flow:start --list` | List all specs with their phase and last-updated time, prompt to pick. |
|
|
20
|
+
| `/curdx-flow:start` (no args) | Interactive: ask the user whether to create new, resume recent, or list all. |
|
|
21
|
+
| Add `--mode=<fast\|standard\|enterprise>` | Set the execution mode for this spec (stored in `.state.json`). |
|
|
23
22
|
|
|
24
|
-
##
|
|
23
|
+
## Preflight
|
|
25
24
|
|
|
26
25
|
```bash
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
SPEC_NAME=$(echo "$ARGS" | awk '{print $1}')
|
|
33
|
-
GOAL=$(echo "$ARGS" | sed -E "s/^[a-z0-9-]+\s*//" | sed 's/^["\x27]//;s/["\x27]$//')
|
|
34
|
-
|
|
35
|
-
if [ -z "$SPEC_NAME" ]; then
|
|
36
|
-
echo "Usage: /curdx-flow:start <spec-name> \"<goal>\""
|
|
37
|
-
echo "Example: /curdx-flow:start auth-system \"Add JWT authentication to REST API\""
|
|
38
|
-
exit 1
|
|
39
|
-
fi
|
|
40
|
-
|
|
41
|
-
# Validate that spec-name is kebab-case
|
|
42
|
-
if ! echo "$SPEC_NAME" | grep -Eq '^[a-z][a-z0-9-]*$'; then
|
|
43
|
-
echo "❌ spec-name must be kebab-case (start with lowercase letter, only letters, digits, hyphens)"
|
|
44
|
-
exit 1
|
|
45
|
-
fi
|
|
26
|
+
# Require a .flow project
|
|
27
|
+
[ ! -d ".flow" ] && {
|
|
28
|
+
echo "✗ Not a CurDX-Flow project. Run /curdx-flow:init first.";
|
|
29
|
+
exit 1;
|
|
30
|
+
}
|
|
46
31
|
```
|
|
47
32
|
|
|
48
|
-
##
|
|
33
|
+
## Flag parsing
|
|
49
34
|
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
MODE="resume"
|
|
56
|
-
echo "🔄 Resuming spec: $SPEC_NAME"
|
|
57
|
-
else
|
|
58
|
-
# New mode
|
|
59
|
-
MODE="new"
|
|
60
|
-
echo "🆕 New spec: $SPEC_NAME"
|
|
61
|
-
|
|
62
|
-
if [ -z "$GOAL" ]; then
|
|
63
|
-
# If no goal given, ask the user
|
|
64
|
-
echo "Please provide a one-sentence goal, then continue."
|
|
65
|
-
exit 0
|
|
66
|
-
fi
|
|
67
|
-
fi
|
|
68
|
-
```
|
|
35
|
+
**Do not shell-split `$ARGUMENTS`.** It is a user-supplied string that may
|
|
36
|
+
contain quoted substrings with spaces, `$`-signs, or embedded quotes.
|
|
37
|
+
`xargs`, naive `awk`, and `sed`-based quote stripping all mis-parse at
|
|
38
|
+
least one of those cases (e.g. `my-feature "Fix user's login bug"` fails with
|
|
39
|
+
`xargs: unmatched quote`). Parse the string as a model task instead:
|
|
69
40
|
|
|
70
|
-
|
|
41
|
+
1. **Flags** (order-independent, each is self-delimited):
|
|
42
|
+
- `--resume` / `--list` — boolean presence
|
|
43
|
+
- `--mode=<fast|standard|enterprise>` — value after `=`
|
|
44
|
+
Detect each with a single regex over the full `$ARGUMENTS` string and
|
|
45
|
+
remove the matched span from your working copy. Flags not in the list
|
|
46
|
+
above are errors — surface them to the user.
|
|
71
47
|
|
|
72
|
-
|
|
48
|
+
2. **Positional args** (after flags removed):
|
|
49
|
+
- First whitespace-separated token → `SPEC_NAME` (kebab-case, `^[a-z0-9][a-z0-9-]*$`).
|
|
50
|
+
- Remainder of the string, trimmed and with one layer of outer `"..."`
|
|
51
|
+
or `'...'` quotes stripped → `GOAL`. Preserve inner quotes as-is.
|
|
73
52
|
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
53
|
+
3. If `SPEC_NAME` does not match `^[a-z0-9][a-z0-9-]*$` (per
|
|
54
|
+
`schemas/spec-state.schema.json`), stop and ask the user to pick a
|
|
55
|
+
valid kebab-case name.
|
|
56
|
+
|
|
57
|
+
Mode must be `fast`, `standard`, or `enterprise`. Invalid → default to
|
|
58
|
+
`standard` with a warning.
|
|
77
59
|
|
|
78
|
-
|
|
79
|
-
TODAY=$(date +%Y-%m-%d)
|
|
80
|
-
CONFIG_MODE=$(python3 -c "import json; print(json.load(open('.flow/config.json')).get('mode','standard'))" 2>/dev/null || echo "standard")
|
|
60
|
+
Example inputs and their parse:
|
|
81
61
|
|
|
82
|
-
|
|
62
|
+
| `$ARGUMENTS` | SPEC_NAME | GOAL | flags |
|
|
63
|
+
|-------------------------------------------------|--------------|-------------------------------|---------------|
|
|
64
|
+
| `my-feature "Add JWT auth"` | `my-feature` | `Add JWT auth` | — |
|
|
65
|
+
| `my-feature --mode=fast "Add JWT auth"` | `my-feature` | `Add JWT auth` | mode=fast |
|
|
66
|
+
| `my-feature "Fix user's login bug"` | `my-feature` | `Fix user's login bug` | — |
|
|
67
|
+
| `--list` | — | — | list=true |
|
|
68
|
+
| `--resume` | — | — | resume=true |
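The parsing rules above can be sketched in Python as a reference for the expected results. This is a minimal illustration, not the plugin's implementation: the function name `parse_arguments`, the returned dictionary shape, and the warning text are assumptions made for the example.

```python
import re

VALID_MODES = ("fast", "standard", "enterprise")
NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]*")
# A flag is a whitespace-delimited token that starts with "--".
FLAG_RE = re.compile(r"(?<!\S)--\S+")


def parse_arguments(arguments):
    """Split a raw argument string into spec name, goal, and flags (hypothetical helper)."""
    result = {"spec_name": None, "goal": None, "resume": False,
              "list": False, "mode": None, "warnings": []}

    def consume(match):
        flag = match.group(0)
        if flag == "--resume":
            result["resume"] = True
        elif flag == "--list":
            result["list"] = True
        elif flag.startswith("--mode="):
            mode = flag.split("=", 1)[1]
            if mode not in VALID_MODES:
                # Invalid mode: fall back to "standard" and record a warning.
                result["warnings"].append(f"invalid mode {mode!r}; using 'standard'")
                mode = "standard"
            result["mode"] = mode
        else:
            raise ValueError(f"unknown flag: {flag}")
        return " "  # remove the matched span from the working copy

    # Limitation of this sketch: a flag-like token inside the quoted goal
    # is also consumed, because the regex runs over the full string.
    rest = FLAG_RE.sub(consume, arguments).strip()
    if not rest:
        return result

    parts = rest.split(None, 1)
    name = parts[0]
    goal = parts[1].strip() if len(parts) > 1 else ""
    if not NAME_RE.fullmatch(name):
        raise ValueError(f"spec name must be kebab-case: {name!r}")
    # Strip exactly one layer of matching outer quotes; inner quotes are kept.
    if len(goal) >= 2 and goal[0] == goal[-1] and goal[0] in "\"'":
        goal = goal[1:-1]
    result["spec_name"] = name
    result["goal"] = goal or None
    return result
```

Calling `parse_arguments('my-feature --mode=fast "Add JWT auth"')` yields the spec name `my-feature`, the goal `Add JWT auth`, and the mode `fast`, matching the second row of the table.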
|
|
69
|
+
|
|
70
|
+
## Branch logic
|
|
71
|
+
|
|
72
|
+
### Branch A: `--list`
|
|
73
|
+
Enumerate every directory under `.flow/specs/`, read each `.state.json` for `phase` and `updated` (per `schemas/spec-state.schema.json`), print a numbered list, then `AskUserQuestion` to pick one. Picking sets `.flow/.active-spec` and exits.
|
|
74
|
+
|
|
75
|
+
### Branch B: `--resume` (no name)
|
|
76
|
+
Read `.flow/.active-spec`. If it points to a valid spec dir, report its current phase and next suggested command (`/curdx-flow:spec` if incomplete, `/curdx-flow:implement` if tasks ready). If `.active-spec` is empty or stale, fall back to Branch A.
|
|
77
|
+
|
|
78
|
+
### Branch C: `SPEC_NAME` provided, spec exists
|
|
79
|
+
Switch `.flow/.active-spec` to `SPEC_NAME`. Confirm with the user that they meant to switch to the existing spec rather than overwrite it, then report its current phase.
|
|
80
|
+
|
|
81
|
+
### Branch D: `SPEC_NAME` provided, spec does NOT exist
|
|
82
|
+
Create a new spec:
|
|
83
|
+
|
|
84
|
+
```bash
|
|
85
|
+
mkdir -p ".flow/specs/$SPEC_NAME"
|
|
86
|
+
# NOTE: field names MUST match schemas/spec-state.schema.json:
|
|
87
|
+
# - spec_name (not "spec")
|
|
88
|
+
# - created (date, not "created_at")
|
|
89
|
+
# - updated (date-time, not "updated_at")
|
|
90
|
+
# - phase must be one of the enum values; the initial phase is "research"
|
|
91
|
+
# (there is no "created" phase — that was schema drift pre-beta.9)
|
|
92
|
+
# - version is required
|
|
93
|
+
cat > ".flow/specs/$SPEC_NAME/.state.json" <<JSON
|
|
83
94
|
{
|
|
84
95
|
"version": "1.0",
|
|
85
96
|
"spec_name": "$SPEC_NAME",
|
|
86
97
|
"goal": "$GOAL",
|
|
87
|
-
"mode": "$
|
|
88
|
-
"strategy": "auto",
|
|
98
|
+
"mode": "$FLAG_MODE",
|
|
89
99
|
"phase": "research",
|
|
90
|
-
"phase_status": {
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
95
|
-
},
|
|
96
|
-
"decisions": [],
|
|
97
|
-
"created": "$TODAY",
|
|
98
|
-
"updated": "$TODAY"
|
|
100
|
+
"phase_status": {},
|
|
101
|
+
"strategy": "auto",
|
|
102
|
+
"execute_state": {},
|
|
103
|
+
"created": "$(date -u +%Y-%m-%d)",
|
|
104
|
+
"updated": "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
|
|
99
105
|
}
|
|
100
|
-
|
|
101
|
-
|
|
102
|
-
# Generate progress.md (from template)
|
|
103
|
-
python3 <<PYEOF
|
|
104
|
-
from pathlib import Path
|
|
105
|
-
tmpl = Path("${CLAUDE_PLUGIN_ROOT}/templates/progress.md.tmpl").read_text()
|
|
106
|
-
content = tmpl.replace("{{SPEC_NAME}}", "$SPEC_NAME").replace("{{CREATED_DATE}}", "$TODAY")
|
|
107
|
-
Path("$SPEC_DIR/.progress.md").write_text(content)
|
|
108
|
-
PYEOF
|
|
109
|
-
|
|
110
|
-
# Set as active spec
|
|
111
|
-
echo "$SPEC_NAME" > ".flow/.active-spec"
|
|
112
|
-
|
|
113
|
-
echo "✓ Spec directory created: $SPEC_DIR"
|
|
114
|
-
echo " Goal: $GOAL"
|
|
115
|
-
echo ""
|
|
116
|
-
echo "Next step (automatically entering research):"
|
|
117
|
-
echo " /curdx-flow:research"
|
|
106
|
+
JSON
|
|
107
|
+
echo "$SPEC_NAME" > .flow/.active-spec
|
|
118
108
|
```
|
|
119
109
|
|
|
120
|
-
|
|
110
|
+
If `GOAL` is empty, `AskUserQuestion` to gather it before writing `.state.json`.
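One caveat with the heredoc above: it interpolates `$GOAL` into the JSON text verbatim, so a goal containing a double quote or backslash would produce an invalid `.state.json`. The sketch below shows one way to write the same fields with proper escaping by serializing through `json.dumps`. The helper names `build_state` and `write_state` are hypothetical, and the field set is copied from the heredoc rather than from the schema file itself.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def build_state(spec_name, goal, mode):
    """Return the initial state dict with the same fields as the heredoc."""
    now = datetime.now(timezone.utc)
    return {
        "version": "1.0",
        "spec_name": spec_name,
        "goal": goal,
        "mode": mode,
        "phase": "research",  # initial phase
        "phase_status": {},
        "strategy": "auto",
        "execute_state": {},
        "created": now.strftime("%Y-%m-%d"),
        "updated": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def write_state(root, spec_name, goal, mode):
    """Create the spec directory, write .state.json, and mark the spec active."""
    spec_dir = Path(root) / ".flow" / "specs" / spec_name
    spec_dir.mkdir(parents=True, exist_ok=True)
    state_path = spec_dir / ".state.json"
    # json.dumps escapes quotes and backslashes in the goal.
    state_path.write_text(
        json.dumps(build_state(spec_name, goal, mode), indent=2) + "\n",
        encoding="utf-8",
    )
    active = Path(root) / ".flow" / ".active-spec"
    active.write_text(spec_name + "\n", encoding="utf-8")
    return state_path
```

Serializing a dictionary keeps the file valid JSON regardless of what the user typed as the goal, which a text template cannot guarantee.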
|
|
121
111
|
|
|
122
|
-
|
|
112
|
+
Then seed a minimal `.progress.md`:
|
|
123
113
|
|
|
124
114
|
```bash
|
|
125
|
-
|
|
126
|
-
|
|
127
|
-
|
|
128
|
-
|
|
129
|
-
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
|
|
133
|
-
|
|
134
|
-
|
|
135
|
-
|
|
136
|
-
|
|
137
|
-
|
|
138
|
-
python3 <<PYEOF
|
|
139
|
-
import json
|
|
140
|
-
s = json.load(open("$STATE_FILE"))
|
|
141
|
-
print(f"✓ Spec activated: $SPEC_NAME")
|
|
142
|
-
print(f" Goal: {s.get('goal','(undefined)')}")
|
|
143
|
-
print(f" Current phase: {s['phase']}")
|
|
144
|
-
print(f" Progress:")
|
|
145
|
-
|
|
146
|
-
ph_emoji = {"completed": "✓", "in_progress": "●", "not_started": "○", "failed": "✗", "skipped": "—"}
|
|
147
|
-
for phase, status in s.get('phase_status',{}).items():
|
|
148
|
-
emoji = ph_emoji.get(status, "?")
|
|
149
|
-
print(f" {emoji} {phase:<15} {status}")
|
|
150
|
-
|
|
151
|
-
print()
|
|
152
|
-
|
|
153
|
-
# Recommend next step
|
|
154
|
-
phase_order = ["research", "requirements", "design", "tasks", "execute", "verify", "ship"]
|
|
155
|
-
for p in phase_order:
|
|
156
|
-
st = s.get('phase_status',{}).get(p, 'not_started')
|
|
157
|
-
if st in ("not_started", "in_progress"):
|
|
158
|
-
print(f"Next step: /curdx-flow:{p}")
|
|
159
|
-
break
|
|
160
|
-
else:
|
|
161
|
-
print("All phases complete! Use /curdx-flow:status to see details.")
|
|
162
|
-
PYEOF
|
|
115
|
+
cat > ".flow/specs/$SPEC_NAME/.progress.md" <<MD
|
|
116
|
+
# Progress Log — $SPEC_NAME
|
|
117
|
+
|
|
118
|
+
**Goal**: $GOAL
|
|
119
|
+
**Mode**: $FLAG_MODE
|
|
120
|
+
**Created**: $(date -u +%Y-%m-%d)
|
|
121
|
+
|
|
122
|
+
## Decisions
|
|
123
|
+
(populated during /curdx-flow:spec)
|
|
124
|
+
|
|
125
|
+
## Learnings
|
|
126
|
+
(populated during /curdx-flow:implement)
|
|
127
|
+
MD
|
|
163
128
|
```
|
|
164
129
|
|
|
165
|
-
|
|
130
|
+
### Branch E: no args, no flags
|
|
131
|
+
```
|
|
132
|
+
AskUserQuestion:
|
|
133
|
+
"No spec name given. What would you like to do?"
|
|
134
|
+
Options:
|
|
135
|
+
- Create a new spec (prompts for name + goal)
|
|
136
|
+
- Resume the last active spec
|
|
137
|
+
- List all specs
|
|
138
|
+
```
|
|
139
|
+
|
|
140
|
+
Route to the matching branch.
|
|
141
|
+
|
|
142
|
+
## Mode semantics
|
|
143
|
+
|
|
144
|
+
The `mode` field in `.state.json` drives behavior in later commands:
|
|
145
|
+
|
|
146
|
+
| Mode | `/curdx-flow:spec` default | `/curdx-flow:implement` default | Gates applied |
|
|
147
|
+
|------|---------------------------|--------------------------------|---------------|
|
|
148
|
+
| `fast` | skipped (use `/curdx-flow:fast` instead) | linear strategy | karpathy + verification |
|
|
149
|
+
| `standard` | full 4 phases | auto strategy | + tdd + coverage-audit |
|
|
150
|
+
| `enterprise` | full 4 phases + `--review=all` | auto strategy + stricter gates | + adversarial + edge-case + security |
|
|
151
|
+
|
|
152
|
+
## Post-create reporting
|
|
166
153
|
|
|
167
154
|
```
|
|
168
|
-
|
|
169
|
-
|
|
170
|
-
|
|
171
|
-
|
|
172
|
-
|
|
173
|
-
|
|
174
|
-
3. /curdx-flow:design ← design (architectural decisions, freeze choices)
|
|
175
|
-
4. /curdx-flow:tasks ← task decomposition (auto-verifiable tasks)
|
|
176
|
-
|
|
177
|
-
One-shot version: /curdx-flow:spec ← run research/req/design/tasks in sequence
|
|
178
|
-
|
|
179
|
-
Fast modes:
|
|
180
|
-
/curdx-flow:fast ← skip spec and implement directly
|
|
181
|
-
/curdx-flow:sketch ← UI prototype exploration
|
|
182
|
-
═══════════════════════════════════════════════
|
|
155
|
+
✓ Spec ready: <name>
|
|
156
|
+
Goal: <goal>
|
|
157
|
+
Mode: <mode>
|
|
158
|
+
Path: .flow/specs/<name>/
|
|
159
|
+
|
|
160
|
+
Next: /curdx-flow:spec
|
|
183
161
|
```
|
|
184
162
|
|
|
185
|
-
##
|
|
163
|
+
## References
|
|
186
164
|
|
|
187
|
-
-
|
|
188
|
-
-
|
|
189
|
-
- Spec directory corrupted → ask user whether to delete-and-rebuild or repair
|
|
165
|
+
- State schema: `@${CLAUDE_PLUGIN_ROOT}/schemas/spec-state.schema.json`
|
|
166
|
+
- Mode semantics: `@${CLAUDE_PLUGIN_ROOT}/knowledge/execution-strategies.md`
|
package/commands/verify.md
CHANGED
|
@@ -1,124 +1,142 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: verify
|
|
3
|
-
description:
|
|
4
|
-
argument-hint: "[
|
|
3
|
+
description: Goal-backward verification — trace from every FR / AC / AD in the spec to the code and tests, detect stubs and fake completions. The differentiator command. Optionally adds multi-source coverage audit with --strict.
|
|
4
|
+
argument-hint: "[--strict]"
|
|
5
5
|
allowed-tools: [Read, Bash, Task, Grep, Glob]
|
|
6
6
|
---
|
|
7
7
|
|
|
8
|
-
#
|
|
8
|
+
# Goal-Backward Verification
|
|
9
9
|
|
|
10
|
-
|
|
10
|
+
This is the **differentiator command**: it checks the implementation against the spec's own requirements and catches the most common Claude failure mode, claiming "done" while the actual code is a stub or fake completion.
|
|
11
11
|
|
|
12
|
-
##
|
|
12
|
+
## Flags
|
|
13
13
|
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
14
|
+
| Flag | Default | Purpose |
|
|
15
|
+
|------|---------|---------|
|
|
16
|
+
| `--strict` | off | Also run multi-source coverage audit (FR / AC / AD / Research conclusions / D-NN decisions) — replaces v1's `/audit` command. |
|
|
17
17
|
|
|
18
|
-
##
|
|
18
|
+
## Preflight
|
|
19
19
|
|
|
20
20
|
```bash
|
|
21
|
-
|
|
22
|
-
[ -z "$SPEC_NAME" ] && { echo "❌ No active spec. Run /curdx-flow:switch or /curdx-flow:start first"; exit 1; }
|
|
21
|
+
[ ! -d ".flow" ] && { echo "✗ Not a CurDX-Flow project."; exit 1; }
|
|
23
22
|
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
23
|
+
SPEC_NAME=$(cat .flow/.active-spec 2>/dev/null)
|
|
24
|
+
[ -z "$SPEC_NAME" ] && { echo "✗ No active spec. Run /curdx-flow:start first."; exit 1; }
|
|
25
|
+
|
|
26
|
+
SPEC_DIR=".flow/specs/$SPEC_NAME"
|
|
27
|
+
for f in requirements.md design.md tasks.md; do
|
|
28
|
+
[ ! -f "$SPEC_DIR/$f" ] && {
|
|
29
|
+
echo "✗ $SPEC_DIR/$f missing. Run /curdx-flow:spec first.";
|
|
30
|
+
exit 1;
|
|
31
|
+
}
|
|
27
32
|
done
|
|
33
|
+
|
|
34
|
+
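# Match --strict only as a whole token, so longer flags that merely start with it are not accepted
FLAG_STRICT=$(printf '%s\n' "$ARGUMENTS" | grep -Eq -- '(^|[[:space:]])--strict([[:space:]]|$)' && echo 1 || echo 0)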
|
|
28
35
|
```
|
|
29
36
|
|
|
30
|
-
##
|
|
37
|
+
## Workflow
|
|
31
38
|
|
|
32
|
-
|
|
33
|
-
# Read the commit range for the execute phase
|
|
34
|
-
# From .state.json or git reflog
|
|
35
|
-
|
|
36
|
-
LAST_EXEC_START=$(python3 -c "
|
|
37
|
-
import json
|
|
38
|
-
s = json.load(open('$DIR/.state.json'))
|
|
39
|
-
# Custom field or inferred from git
|
|
40
|
-
print(s.get('execute_state', {}).get('start_commit', ''))
|
|
41
|
-
")
|
|
42
|
-
|
|
43
|
-
# If unavailable, use main..HEAD
|
|
44
|
-
RANGE="${LAST_EXEC_START:-main}..HEAD"
|
|
45
|
-
echo "Verification scope: $RANGE"
|
|
46
|
-
```
|
|
39
|
+
### Step 1: Dispatch `flow-verifier`
|
|
47
40
|
|
|
48
|
-
|
|
41
|
+
Delegate to the `flow-verifier` agent with:
|
|
42
|
+
- `requirements.md` (source of FR-NN, AC-N.N, US-NN)
|
|
43
|
+
- `design.md` (source of AD-NN)
|
|
44
|
+
- `tasks.md` (source of T-N.M and their Verify commands)
|
|
45
|
+
- Repository root (to scan code + tests)
|
|
49
46
|
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
5. Update phase_status.verify in .state.json
|
|
73
|
-
|
|
74
|
-
Output file:
|
|
75
|
-
.flow/specs/$SPEC_NAME/verification-report.md
|
|
76
|
-
|
|
77
|
-
Return a brief:
|
|
78
|
-
- Fully verified / partially verified / not verified counts
|
|
79
|
-
- Number of fake implementations
|
|
80
|
-
- List of blockers
|
|
81
|
-
- Suggested next step
|
|
82
|
-
```
|
|
47
|
+
The agent performs goal-backward tracing:
|
|
48
|
+
1. For each **FR-NN**: search for code that implements it. Classify as `IMPLEMENTED` / `STUB` / `MISSING` / `UNCERTAIN`.
|
|
49
|
+
2. For each **AC-N.N**: search for a test that exercises it. Classify as `TESTED` / `UNTESTED`.
|
|
50
|
+
3. For each **AD-NN**: check that the design decision is reflected in code structure / interfaces.
|
|
51
|
+
4. Detect suspicious patterns:
|
|
52
|
+
- `throw new Error("not implemented")` / `TODO` / `NotImplementedError`
|
|
53
|
+
- `return null` / `return {}` in places that should produce real output
|
|
54
|
+
- test files with only `it.skip(...)` or no assertions
|
|
55
|
+
- code that returns mocked fixtures instead of calling real collaborators
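Some of the suspicious patterns listed above are plain text signatures, so they can be pre-filtered with regular expressions before any deeper review. The following is a rough heuristic sketch, not how `flow-verifier` is implemented; the pattern labels and the function name `find_stub_markers` are invented for illustration, and patterns such as `return null` or mocked fixtures are left out because they need context to judge.

```python
import re

# Text signatures taken from the list above; labels are illustrative only.
STUB_PATTERNS = {
    "not-implemented throw": re.compile(
        r"throw\s+new\s+Error\(\s*[\"']not implemented[\"']\s*\)", re.IGNORECASE
    ),
    "NotImplementedError": re.compile(r"\bNotImplementedError\b"),
    "TODO marker": re.compile(r"\bTODO\b"),
    "skipped test": re.compile(r"\bit\.skip\s*\("),
}


def find_stub_markers(source):
    """Return (line_number, label) pairs for every stub signature found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in STUB_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits
```

A hit only flags a location for inspection; deciding whether it is a real `STUB` still requires reading the surrounding code.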
|
|
56
|
+
|
|
57
|
+
### Step 2: Run the Verify commands from `tasks.md`
|
|
58
|
+
|
|
59
|
+
For every task listed in `tasks.md`, run its declared `Verify:` command and record pass/fail. This is the most objective check — if the task said `npm test -- auth.spec.ts`, run exactly that.
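As a rough sketch of this step, the snippet below extracts commands and records pass/fail by exit code. It assumes, purely for illustration, that each task writes its command as `Verify:` followed by the command in backticks; the real `tasks.md` layout is defined by the template and may differ. The function names are hypothetical.

```python
import re
import subprocess

# Assumed layout: "Verify: `<command>`" somewhere on a line of tasks.md.
VERIFY_RE = re.compile(r"Verify:\s*`([^`]+)`")


def extract_verify_commands(tasks_md):
    """Return the declared Verify commands in document order."""
    return VERIFY_RE.findall(tasks_md)


def run_verify_commands(commands, cwd=".", timeout=600):
    """Run each command exactly as declared; a zero exit code counts as a pass."""
    results = []
    for cmd in commands:
        try:
            proc = subprocess.run(
                cmd, shell=True, cwd=cwd,
                capture_output=True, text=True, timeout=timeout,
            )
            passed = proc.returncode == 0
        except subprocess.TimeoutExpired:
            # A command that hangs is recorded as a failure.
            passed = False
        results.append((cmd, passed))
    return results
```

Running through the shell keeps the command string identical to what the task declared, at the cost of trusting the contents of `tasks.md`.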
|
|
60
|
+
|
|
61
|
+
### Step 3: (Strict only) Multi-source coverage audit
|
|
62
|
+
|
|
63
|
+
If `--strict`:
|
|
64
|
+
- Cross-check every FR / AC / AD / decision against the implementation
|
|
65
|
+
- Cross-check every research conclusion: was the recommended library / approach actually used?
|
|
66
|
+
- Cross-check every D-NN decision in `.flow/STATE.md` that references this spec
|
|
67
|
+
|
|
68
|
+
### Step 4: Produce `verification-report.md`
|
|
83
69
|
|
|
84
|
-
|
|
70
|
+
**Landing check**: sub-agent responses can be truncated by the model's output-length limit. After dispatching `flow-verifier`, verify the report actually landed:
|
|
85
71
|
|
|
86
72
|
```bash
|
|
87
|
-
REPORT="
|
|
88
|
-
[ ! -f "$REPORT" ]
|
|
89
|
-
|
|
90
|
-
#
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
73
|
+
REPORT=".flow/specs/$SPEC_NAME/verification-report.md"
|
|
74
|
+
if [ ! -f "$REPORT" ] || [ "$(wc -c < "$REPORT" 2>/dev/null | tr -d ' ')" -lt 300 ]; then
|
|
75
|
+
echo "⚠ Report missing or truncated. Re-dispatching flow-verifier with a terse 'write the report now' prompt."
|
|
76
|
+
# Re-dispatch pattern:
|
|
77
|
+
# "Your only job right now is to Write the verification-report.md using the
|
|
78
|
+
# findings you already gathered. Do not re-scan. Do not narrate. Write
|
|
79
|
+
# the file and stop."
|
|
80
|
+
fi
|
|
95
81
|
```
|
|
96
82
|
|
|
97
|
-
|
|
83
|
+
Write to `.flow/specs/$SPEC_NAME/verification-report.md`:
|
|
98
84
|
|
|
85
|
+
```markdown
|
|
86
|
+
# Verification Report — <spec-name>
|
|
87
|
+
|
|
88
|
+
**Generated**: <ISO8601>
|
|
89
|
+
**Mode**: strict | normal
|
|
90
|
+
|
|
91
|
+
## Summary
|
|
92
|
+
- FR coverage: N/M implemented (K stubs, L missing)
|
|
93
|
+
- AC coverage: N/M tested
|
|
94
|
+
- AD coverage: N/M reflected
|
|
95
|
+
- Verify commands: N/M passing
|
|
96
|
+
|
|
97
|
+
## Findings
|
|
98
|
+
|
|
99
|
+
### Missing implementations
|
|
100
|
+
- FR-02: <description>. No matching code found in <paths searched>.
|
|
101
|
+
|
|
102
|
+
### Stubs / fake completions
|
|
103
|
+
- FR-05: Implemented in `src/auth.ts:42` but body is `throw new Error("not implemented")`.
|
|
104
|
+
|
|
105
|
+
### Untested acceptance criteria
|
|
106
|
+
- AC-1.3: No test asserts token refresh after 15 min expiry.
|
|
107
|
+
|
|
108
|
+
### Failing Verify commands
|
|
109
|
+
- T-2.1: `npm test auth.spec.ts` → 3 failed
|
|
110
|
+
|
|
111
|
+
## Verdict
|
|
112
|
+
- [ ] PASS — all items covered and passing
|
|
113
|
+
- [X] PARTIAL — <n> findings must be addressed before shipping
|
|
114
|
+
- [ ] MISSING — substantive implementation gaps
|
|
99
115
|
```
|
|
100
|
-
✓ Verify complete: $SPEC_NAME
|
|
101
116
|
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
117
|
+
### Step 5: Apply `verification-gate`
|
|
118
|
+
|
|
119
|
+
Hard rule from `@${CLAUDE_PLUGIN_ROOT}/gates/verification-gate.md`:
|
|
120
|
+
- Any `STUB` or `MISSING` finding on a non-deferred FR blocks completion.
|
|
121
|
+
- Any failing Verify command blocks completion.
|
|
122
|
+
- Waive only with an explicit user D-NN decision logged to `.flow/STATE.md`.
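The three rules above amount to a small decision function. The sketch below is one possible reading, under stated assumptions: the `Finding` shape, the `waived` set (item IDs covered by a logged D-NN decision), and the function name `gate_blockers` are all hypothetical and not taken from `verification-gate.md`.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    item_id: str            # e.g. "FR-05"
    status: str             # IMPLEMENTED / STUB / MISSING / UNCERTAIN
    deferred: bool = False  # deferred FRs do not block


def gate_blockers(findings, verify_results, waived=frozenset()):
    """Return human-readable blockers; an empty list means the gate passes."""
    blockers = []
    for f in findings:
        blocking = f.status in ("STUB", "MISSING") and not f.deferred
        if blocking and f.item_id not in waived:
            blockers.append(f"{f.item_id}: {f.status}")
    for task_id, passed in verify_results.items():
        if not passed and task_id not in waived:
            blockers.append(f"{task_id}: Verify command failed")
    return blockers
```

Returning the list of blockers rather than a boolean lets the caller print each reason in the report.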
|
|
107
123
|
|
|
108
|
-
|
|
124
|
+
## Reporting
|
|
125
|
+
|
|
126
|
+
```
|
|
127
|
+
✓ Verification complete
|
|
128
|
+
FR coverage: 8/10 implemented (1 stub, 1 missing)
|
|
129
|
+
AC coverage: 9/12 tested
|
|
130
|
+
Verify commands: 14/15 passing
|
|
131
|
+
Verdict: PARTIAL
|
|
109
132
|
|
|
110
|
-
|
|
111
|
-
$([ $MISSING -gt 0 ] && echo '❌ BLOCKED — unimplemented FR/AC/AD exist, return to /curdx-flow:implement to fill in')
|
|
112
|
-
$([ $STUBS -gt 0 ] && echo '❌ BLOCKED — fake implementations found')
|
|
113
|
-
$([ $MISSING -eq 0 ] && [ $STUBS -eq 0 ] && echo '✓ PASS — can proceed to /curdx-flow:review')
|
|
133
|
+
Findings written to: .flow/specs/<name>/verification-report.md
|
|
114
134
|
|
|
115
|
-
Next
|
|
116
|
-
$([ $MISSING -gt 0 ] && echo 'fix blockers → /curdx-flow:implement --task=<new task>')
|
|
117
|
-
$([ $MISSING -eq 0 ] && echo '/curdx-flow:review — enter code quality review')
|
|
135
|
+
Next: address findings, then re-run /curdx-flow:verify, or run /curdx-flow:review.
|
|
118
136
|
```
|
|
119
137
|
|
|
120
|
-
##
|
|
138
|
+
## References
|
|
121
139
|
|
|
122
|
-
- verifier
|
|
123
|
-
-
|
|
124
|
-
-
|
|
140
|
+
- `flow-verifier` agent: `@${CLAUDE_PLUGIN_ROOT}/agents/flow-verifier.md`
|
|
141
|
+
- `verification-gate`: `@${CLAUDE_PLUGIN_ROOT}/gates/verification-gate.md`
|
|
142
|
+
- `coverage-audit-gate` (used in strict mode): `@${CLAUDE_PLUGIN_ROOT}/gates/coverage-audit-gate.md`
|
|
@@ -33,19 +33,19 @@ A reviewer agent's output of "everything looks fine, no issues found" is an **in
|
|
|
33
33
|
- "Looks good" is usually confirmation bias (the agent only checked the obvious)
|
|
34
34
|
- AI tends to please the user ("great job!") — fight this tendency
|
|
35
35
|
|
|
36
|
-
**Forced actions**:
|
|
37
|
-
1.
|
|
38
|
-
2.
|
|
39
|
-
|
|
40
|
-
-
|
|
41
|
-
-
|
|
42
|
-
|
|
36
|
+
**Forced actions when the agent reports "no issues"**:
|
|
37
|
+
1. Automatically trigger a second round framed as "what would a senior skeptic reject in this PR?"
|
|
38
|
+
2. If both rounds still honestly yield no findings, the agent must emit a **proof-of-checking report**:
|
|
39
|
+
- Every category it examined (with "N/A" for categories that don't apply)
|
|
40
|
+
- For each examined category, the specific code/file locations inspected
|
|
41
|
+
- Counterfactual hypotheses of "what this would look like if there were a problem" and why that signature is absent
|
|
42
|
+
3. Fabricating findings to avoid the proof-of-checking step is a violation of L3 red line #2 (fact-driven). It is better to emit a "clean verdict with proof" than to invent issues.
|
|
43
43
|
|
|
44
44
|
---
|
|
45
45
|
|
|
46
|
-
### Rule 2:
|
|
46
|
+
### Rule 2: Coverage proportional to feature scope
|
|
47
47
|
|
|
48
|
-
A complete adversarial review
|
|
48
|
+
A complete adversarial review covers every category that applies to the feature and marks the rest as N/A with a reason. The number of findings per category is proportional to real issues, not a quota:
|
|
49
49
|
|
|
50
50
|
1. **Architecture layer**: Are decisions sound? Future-extensible? Lock-in risks?
|
|
51
51
|
2. **Implementation layer**: Code quality? Error handling? Performance?
|
|
@@ -86,22 +86,22 @@ Not allowed:
|
|
|
86
86
|
Input: object under review (code range / spec / PR diff)
|
|
87
87
|
↓
|
|
88
88
|
Round 1 (agent self-analysis):
|
|
89
|
-
- Use sequential-thinking
|
|
90
|
-
- Scan
|
|
89
|
+
- Use sequential-thinking proportional to the surface being probed
|
|
90
|
+
- Scan each applicable category; mark N/A ones with reason
|
|
91
91
|
- Output findings list
|
|
92
92
|
↓
|
|
93
93
|
Decision:
|
|
94
|
-
-
|
|
95
|
-
-
|
|
94
|
+
- Any real findings? → output report with findings
|
|
95
|
+
- Zero findings after honest Round 1? → force Round 2 framed as skeptic
|
|
96
96
|
↓
|
|
97
97
|
Round 2 (deep analysis):
|
|
98
|
-
- sequential-thinking
|
|
98
|
+
- sequential-thinking proportional to residual uncertainty
|
|
99
99
|
- Focus on "seemingly no issues" parts (trust but verify)
|
|
100
|
-
-
|
|
100
|
+
- Optionally introduce external perspectives (read issues from similar projects)
|
|
101
101
|
↓
|
|
102
102
|
Decision:
|
|
103
|
-
- Still
|
|
104
|
-
-
|
|
103
|
+
- Still zero findings? → agent must emit proof-of-checking report (NOT invent findings)
|
|
104
|
+
- Findings exist? → output report
|
|
105
105
|
↓
|
|
106
106
|
Output: review-report.md
|
|
107
107
|
```
|
|
@@ -190,10 +190,10 @@ Fix loop:
|
|
|
190
190
|
|
|
191
191
|
## Failure Recovery
|
|
192
192
|
|
|
193
|
-
If after 2
|
|
193
|
+
If after Round 2 the honest verdict is still zero findings, emit a proof-of-checking report (do NOT fabricate to hit a quota — there is no quota):
|
|
194
194
|
|
|
195
195
|
```markdown
|
|
196
|
-
## Adversarial Review —
|
|
196
|
+
## Adversarial Review — Proof of Checking (zero findings)
|
|
197
197
|
|
|
198
198
|
I have examined the following dimensions across 2 rounds of analysis:
|
|
199
199
|
|
|
@@ -210,7 +210,7 @@ I have examined the following dimensions across 2 rounds of analysis:
|
|
|
210
210
|
|
|
211
211
|
Recommendations:
|
|
212
212
|
- Human review (at least walk through the diff once)
|
|
213
|
-
- Consider
|
|
213
|
+
- Consider the `browser-qa` skill for real browser/integration testing (Phase 5+)
|
|
214
214
|
- Wait until deployment to staging to observe
|
|
215
215
|
```
|
|
216
216
|
|
|
@@ -17,7 +17,7 @@ depends_on: []
|
|
|
17
17
|
|
|
18
18
|
- End of the tasks phase (last step of flow-planner)
|
|
19
19
|
- Before the execution phase completes (when /curdx-flow:verify runs)
|
|
20
|
-
- Explicitly requested by /curdx-flow:
|
|
20
|
+
- Explicitly requested by /curdx-flow:verify --strict
|
|
21
21
|
|
|
22
22
|
---
|
|
23
23
|
|