@curdx/flow 2.0.0-beta.6 → 2.0.0-beta.7
- package/.claude-plugin/marketplace.json +1 -1
- package/.claude-plugin/plugin.json +1 -1
- package/agents/flow-adversary.md +13 -39
- package/agents/flow-edge-hunter.md +2 -2
- package/agents/flow-planner.md +1 -1
- package/agents/flow-reviewer.md +1 -1
- package/agents/flow-verifier.md +1 -1
- package/commands/fast.md +1 -1
- package/commands/implement.md +1 -1
- package/commands/review.md +5 -5
- package/commands/spec.md +1 -1
- package/gates/adversarial-review-gate.md +3 -3
- package/gates/devex-gate.md +2 -3
- package/knowledge/execution-strategies.md +6 -5
- package/knowledge/spec-driven-development.md +8 -7
- package/knowledge/two-stage-review.md +4 -3
- package/package.json +1 -1
- package/templates/design.md.tmpl +32 -112
- package/templates/requirements.md.tmpl +25 -43
- package/templates/research.md.tmpl +37 -68
- package/templates/tasks.md.tmpl +27 -84
package/.claude-plugin/marketplace.json
CHANGED
@@ -6,7 +6,7 @@
  },
  "metadata": {
  "description": "Claude Code Discipline Layer — spec-driven workflow + goal-backward verification + Karpathy 4 principles enforced via gates. Stops Claude from faking \"done\" on non-trivial features.",
- "version": "2.0.0-beta.
+ "version": "2.0.0-beta.7"
  },
  "plugins": [
  {
package/.claude-plugin/plugin.json
CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "curdx-flow",
- "version": "2.0.0-beta.
+ "version": "2.0.0-beta.7",
  "description": "Claude Code Discipline Layer — spec-driven workflow + goal-backward verification + Karpathy 4 principles enforced via gates. Stops Claude from faking \"done\" on non-trivial features.",
  "author": {
  "name": "wdx",
package/agents/flow-adversary.md
CHANGED
@@ -64,29 +64,16 @@ Based on input type:

  ### Step 2: Round 1 — Breadth Scan

-
+ Walk through the applicable categories below. **Skip categories that don't apply** (e.g. no UI → UX is N/A; no auth → Security only if that absence is itself material) and note them as `N/A: <reason>` in your report. Use sequential-thinking proportional to the surface each category presents — 1 thought for a trivial check, more for genuinely complex surfaces.

-
-
-
-
-
-
-
- Round 3: Testing layer
- Think: Coverage? Over-mocked? Falsely green?
+ - **Architecture**: Are decisions right? Will we regret them in 6 months? Any implicit coupling?
+ - **Implementation**: Code quality? Error handling? Boundaries?
+ - **Testing**: Coverage? Over-mocked? Falsely green?
+ - **Security**: Injection? Privilege escalation? Leakage? Auth bypass?
+ - **Maintainability**: Naming? Structure? Can the next maintainer understand?
+ - **UX** (if UI / API contract is involved): Error messages clear? Loading? Accessibility?

-
- Think: Injection? Privilege escalation? Leakage? Auth bypass?
-
- Round 5: Maintainability layer
- Think: Naming? Structure? Can the next maintainer understand?
-
- Round 6: UX layer (if UI / API contract is involved)
- Think: Are error messages clear? Loading? Accessibility?
- ```
-
- **Key point**: every round must **specifically point out what was examined** (file:line), not vague thinking.
+ **Key point**: whenever you examine a category, cite what you looked at (file:line or design-doc section), not vague thinking.

  ### Step 3: Judgment

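The breadth-scan rule added in this hunk (every category is either examined with cited evidence or marked N/A with a reason) can be checked mechanically. The following is a minimal sketch under stated assumptions: the `CategoryEntry` type, the `check_breadth_scan` helper, and the report shape are hypothetical illustrations and are not part of the package; only the six category names come from the diff.

```python
from dataclasses import dataclass, field

# Category names taken from the diff above; everything else is hypothetical.
CATEGORIES = [
    "Architecture", "Implementation", "Testing",
    "Security", "Maintainability", "UX",
]


@dataclass
class CategoryEntry:
    # Citations such as "src/auth.ts:42" or "design.md#AD-01" (examined categories).
    evidence: list = field(default_factory=list)
    # One-line reason when the category does not apply (N/A categories).
    na_reason: str = ""


def check_breadth_scan(report):
    """Return a list of rule violations for a breadth-scan report."""
    problems = []
    for name in CATEGORIES:
        entry = report.get(name)
        if entry is None:
            # Skipping without an N/A note is what the diff forbids.
            problems.append(f"{name}: silently skipped")
        elif not entry.na_reason and not entry.evidence:
            # Examined categories must cite what was looked at.
            problems.append(f"{name}: no evidence cited")
    return problems


if __name__ == "__main__":
    report = {
        "Architecture": CategoryEntry(evidence=["design.md#AD-01"]),
        "UX": CategoryEntry(na_reason="no UI in this change"),
        "Testing": CategoryEntry(),
    }
    for problem in check_breadth_scan(report):
        print(problem)
```

A report passes only when the returned list is empty; a category marked N/A with a reason counts as handled, which mirrors the "N/A is fine, silence is not" wording of the diff.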
@@ -108,24 +95,11 @@ else:

  ### Step 4: Round 2 — Deep Drill

- For areas
+ For the "looks fine" areas from Round 1, use sequential-thinking proportional to the residual uncertainty. Three lenses to rotate through (stop when the drill honestly surfaces nothing new, don't force all three):

-
-
-
- - Did I only look at the surface?
- - What pitfalls have similar projects (e.g., open-source comparisons) hit?
-
- Rounds 3-4: Counterfactual thinking
- - What happens if this system is stress-tested by an adversarial user?
- - As code evolves in 6 months, will this decision become a bottleneck?
- - What about 10x/100x load?
-
- Rounds 5-6: Boundaries and implicits
- - What "default behaviors" are in the code but unstated?
- - Has the dependency library had any famous CVEs?
- - What does this design assume users won't do? What if they do?
- ```
+ - **Trust but verify**: did I only look at the surface? What pitfalls have similar open-source projects hit?
+ - **Counterfactual**: under adversarial stress? In 6 months as the codebase evolves? At 10x / 100x load?
+ - **Boundaries and implicits**: what "default behaviors" are unstated? Any CVE history in the dependency? What does the design assume users won't do?

  ### Step 5: Fallback If Still Zero Findings

@@ -134,7 +108,7 @@ If Round 2 still yields no findings, you must output a **proof report**:
  ```markdown
  ## Adversarial Review — No Sufficient Findings (Proof Report)

-
+ Across Round 1 (breadth) and Round 2 (depth), I checked the following applicable dimensions (N/A ones listed separately):

  ### Architecture (specifically examined)
  - AD-01~05 in design.md
package/agents/flow-edge-hunter.md
CHANGED
@@ -252,7 +252,7 @@ If the user agrees, suggest a set of tasks to append to tasks.md:

  ## Forbidden

- - ✗
+ - ✗ Silently skipping a category — N/A is fine, but every category that doesn't apply must be named with a one-line reason (e.g. "I18n: N/A — single-locale MVP")
  - ✗ Listing scenarios only from imagination (must grep the code + compare tests)
  - ✗ Not using sequential-thinking
  - ✗ Gap list without priority ordering
@@ -260,7 +260,7 @@ If the user agrees, suggest a set of tasks to append to tasks.md:

  ## Quality Self-Check

- - [ ]
+ - [ ] Every applicable category examined, with N/A reasons recorded for the rest?
  - [ ] Each gap has category + location + scenario + risk + recommended test code?
  - [ ] Priority ordering is clear?
  - [ ] Findings proportional to real edge-case surface (zero is OK if all categories honestly N/A)
package/agents/flow-planner.md
CHANGED
@@ -138,7 +138,7 @@ For each of the following sources, every item must be covered by tasks:
  **CRITICAL (see L8 of the preamble — long-artifact handling):**
  - Your FIRST action in this step must be a `Write` tool call with the full `tasks.md` content. Do NOT paste the file content as assistant text before writing.
  - Do NOT preview the tasks list in the response. The file itself is the deliverable.
- - If `
+ - If a single `Write` call would approach the sub-agent output-token budget (judge by section density, not line count — see preamble L8), split into `tasks-phase-<n>.md` files and make `tasks.md` a short index linking to them.

  Based on `${CLAUDE_PLUGIN_ROOT}/templates/tasks.md.tmpl`. Must include a **coverage audit table** at the end (from Step 5).

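The split rule in this hunk (one `Write` when the content fits, otherwise per-phase files plus a short index) can be illustrated with a small planning helper. This is a sketch only, assuming a rough characters-per-token estimate; `plan_writes`, `budget_tokens`, and `chars_per_token` are hypothetical names and values, and the real budget is whatever the preamble defines.

```python
def plan_writes(phases, budget_tokens, chars_per_token=4):
    """Map file name -> content for the Write calls to make.

    `phases` maps a phase title to its markdown body. Size is estimated
    from character count (a proxy for density), not from line count.
    """
    full = "\n\n".join(phases.values())
    estimated_tokens = len(full) / chars_per_token
    if estimated_tokens <= budget_tokens:
        # Fits in one call: tasks.md carries everything.
        return {"tasks.md": full}

    files = {}
    index_lines = ["# Tasks index", ""]
    for n, (title, body) in enumerate(phases.items(), start=1):
        name = f"tasks-phase-{n}.md"
        files[name] = body
        index_lines.append(f"- [{title}]({name})")
    # tasks.md becomes a short index linking to the phase files.
    files["tasks.md"] = "\n".join(index_lines) + "\n"
    return files


if __name__ == "__main__":
    phases = {"Phase 1": "a" * 100, "Phase 2": "b" * 100}
    print(sorted(plan_writes(phases, budget_tokens=10)))
```

The same shape would apply to the reviewer and verifier reports, which the diff splits into a short index file and a details file.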
package/agents/flow-reviewer.md
CHANGED
@@ -189,7 +189,7 @@ else:

  **CRITICAL (see L8 of the preamble):** your FIRST action in this step must be a `Write` tool call with the **complete report content**. Do NOT paste the report as assistant text before writing. After the write succeeds, respond with a ≤ 5-line summary only (path, verdict, blocker count, next step). Do not re-paste the report.

- If
+ If a single `Write` call would approach the sub-agent output-token budget (judge by section density, not line count), split into `review-report.md` (short index + verdict) and `review-details.md` (full findings) — two `Write` calls. See preamble L8.

  Full structure (use this as the content passed to `Write`, not as preview text):

package/agents/flow-verifier.md
CHANGED
@@ -174,7 +174,7 @@ For each match, check:

  **CRITICAL (see L8 of the preamble):** your FIRST action in this step must be a `Write` tool call with the **complete report content**. Do NOT paste the report as assistant text before writing — doing so doubles output tokens and causes truncation inside the `Write` call. After the write succeeds, respond with a ≤ 5-line summary only (path, verdict counts, next step). Do not re-paste the report.

- If
+ If a single `Write` call would approach the sub-agent output-token budget (judge by section density, not line count), split into `verification-report.md` (short index + verdict) and `verification-details.md` (full findings table) — two `Write` calls. See preamble L8.

  Required structure (use this as the content passed to `Write`, not as preview text):

package/commands/fast.md
CHANGED
@@ -123,6 +123,6 @@ Choosing the right scenario matters more than forcing the flow.
  ## Forbidden

  - ✗ Committing without running verification
- - ✗ Changes touching
+ - ✗ Changes touching many unrelated files or modules (means it is no longer fast — run the full flow)
  - ✗ Writing library APIs from memory
  - ✗ Skipping the Step 2 5-question clarification (even when "obvious," explicit statement still has value)
package/commands/implement.md
CHANGED
package/commands/review.md
CHANGED
@@ -16,8 +16,8 @@ Distinct from `/curdx-flow:verify`:
  | Flag | Default | Purpose |
  |------|---------|---------|
  | `--stage=<1\|2\|both>` | `both` | Stage 1 = spec compliance only. Stage 2 = code quality only. `both` = sequential. |
- | `--adversarial` | off | Add an adversarial review pass
- | `--edge-case` | off | Add edge-case hunting across
+ | `--adversarial` | off | Add an adversarial review pass across applicable categories (zero findings requires proof-of-checking, not fabrication). |
+ | `--edge-case` | off | Add edge-case hunting across applicable categories. Produces a test-gap checklist. |

  ## Preflight

@@ -65,7 +65,7 @@ Output: Stage-2 section of the report.
  ## Optional: adversarial review

  If `--adversarial`:
- Dispatch `flow-adversary`. It
+ Dispatch `flow-adversary`. It scans the applicable categories (Architecture / Implementation / Testing / Security / Maintainability / UX — skip N/A with reason) using `sequential-thinking` proportional to the residual uncertainty, probing:
  1. What's missing?
  2. What's overengineered?
  3. What would break first in production?
@@ -73,12 +73,12 @@ Dispatch `flow-adversary`. It runs 6 dimensions × 2 rounds of `sequential-think
  5. What decision locks us out of a future option?
  6. What would a skeptical reviewer reject?

- **Zero findings
+ **Zero findings requires proof-of-checking, not fabrication** — honest "clean" verdicts are fine if the agent lists what it examined. Per `@${CLAUDE_PLUGIN_ROOT}/gates/adversarial-review-gate.md`.

  ## Optional: edge-case hunting

  If `--edge-case`:
- Dispatch `flow-edge-hunter` across the
+ Dispatch `flow-edge-hunter` across the applicable categories (skip N/A with one-line reason):
  1. Boundary values (0, MAX, empty, one-over-limit)
  2. Concurrency / race conditions
  3. Network failure / partial failure
package/commands/spec.md
CHANGED
@@ -82,7 +82,7 @@ Output: `requirements.md` with user stories (US-NN), acceptance criteria (AC-N.N

  ### design → `flow-architect`
  Inputs: `research.md` + `requirements.md`.
- Output: `design.md` with architecture decisions (AD-NN), component boundaries, data models, error-path design, mermaid diagrams.
+ Output: `design.md` with architecture decisions (AD-NN), component boundaries, data models, error-path design, mermaid diagrams (when they clarify). Uses `sequential-thinking` MCP proportional to the genuine tradeoff surface.

  ### tasks → `flow-planner`
  Inputs: all three prior files + `.flow/PROJECT.md` tech stack.
package/gates/adversarial-review-gate.md
CHANGED
@@ -87,7 +87,7 @@ Input: object under review (code range / spec / PR diff)
  ↓
  Round 1 (agent self-analysis):
  - Use sequential-thinking proportional to the surface being probed
- - Scan
+ - Scan each applicable category; mark N/A ones with reason
  - Output findings list
  ↓
  Decision:
@@ -190,10 +190,10 @@ Fix loop:

  ## Failure Recovery

- If after 2
+ If after Round 2 the honest verdict is still zero findings, emit a proof-of-checking report (do NOT fabricate to hit a quota — there is no quota):

  ```markdown
- ## Adversarial Review —
+ ## Adversarial Review — Proof of Checking (zero findings)

  I have examined the following dimensions across 2 rounds of analysis:

package/gates/devex-gate.md
CHANGED
@@ -210,7 +210,7 @@ Attach a DevEx checklist at PR time:

  ## Scoring

-
+ Score each **applicable** dimension 0-10 (N/A dimensions are excluded from the total):

  ```
  10 = best practice
@@ -220,8 +220,7 @@ Each dimension 0-10 points:
  0 = serious issue
  ```

-
- Total < 40 = blocked, improvement required.
+ Emit the per-dimension scores with evidence. The gate itself does not block on a numeric threshold; it surfaces the weaknesses for the user (or the reviewing agent) to decide whether any of them rise to a blocker. A single 0/10 on a material dimension is a blocker regardless of the total.

  ---

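The revised scoring rule (score only applicable dimensions, no numeric threshold, but a 0/10 on a material dimension always blocks) can be expressed as a short summary function. This is a minimal sketch under stated assumptions: `summarize_devex`, the dimension names, and the idea of passing a `material` set are hypothetical; the diff does not say how materiality is decided.

```python
def summarize_devex(scores, material):
    """Summarize DevEx scores.

    `scores` maps dimension name -> 0..10, or None when the dimension is N/A.
    `material` is the set of dimensions considered material for this change.
    """
    applicable = {dim: s for dim, s in scores.items() if s is not None}
    for dim, s in applicable.items():
        if not 0 <= s <= 10:
            raise ValueError(f"{dim}: score {s} is outside 0-10")

    return {
        # N/A dimensions are excluded from the total, as the diff states.
        "total": sum(applicable.values()),
        "max_total": 10 * len(applicable),
        "not_applicable": sorted(d for d, s in scores.items() if s is None),
        # No threshold on the total; only a 0 on a material dimension blocks.
        "blockers": sorted(
            d for d, s in applicable.items() if s == 0 and d in material
        ),
    }


if __name__ == "__main__":
    summary = summarize_devex(
        {"docs": 8, "error_messages": 0, "cli": None, "onboarding": 6},
        material={"error_messages"},
    )
    print(summary)
```

Reporting `max_total` next to `total` keeps scores comparable across changes that have different numbers of applicable dimensions.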
package/knowledge/execution-strategies.md
CHANGED
@@ -223,13 +223,14 @@ return "linear"

  ## Failure Handling (common to all strategies)

- `flow-executor` agent's
+ `flow-executor` agent's retry ladder — each step escalates only when the prior is honestly exhausted, not on a fixed count:

  ```
-
-
-
-
+ Step A: autonomous retry (edit + rerun Verify) — only for shallow failures
+ Step B: sequential-thinking root-cause analysis proportional to the hypothesis space
+ Step C: read related source + trace data flow
+ Step D: if ≥3 retries fail with no new hypothesis, stop and challenge the architecture (see preamble L3)
+ Step E: report TASK_FAILED
  ```

  ### Extra protections for Stop-Hook strategy
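One possible reading of the retry ladder above, written as a pure decision function. This is a sketch, not the agent's actual logic: the `FailureState` fields and the fallback of retrying again when a fresh hypothesis exists are assumptions made for illustration; only the step letters and the "3 retries with no new hypothesis" condition come from the diff.

```python
from dataclasses import dataclass


@dataclass
class FailureState:
    # All fields are hypothetical bookkeeping, not names from the package.
    is_shallow: bool = False              # e.g. typo or missing import
    failed_retries: int = 0
    root_cause_analyzed: bool = False     # Step B done
    source_traced: bool = False           # Step C done
    has_new_hypothesis: bool = False
    architecture_challenged: bool = False # Step D done


def next_step(s):
    """Return the ladder step ("A".."E") to take for the given state."""
    if s.architecture_challenged:
        return "E"  # nothing left to escalate to: report TASK_FAILED
    if s.is_shallow and s.failed_retries == 0:
        return "A"  # autonomous retry, reserved for shallow failures
    if not s.root_cause_analyzed:
        return "B"  # root-cause analysis
    if not s.source_traced:
        return "C"  # read related source, trace data flow
    if s.failed_retries >= 3 and not s.has_new_hypothesis:
        return "D"  # stop and challenge the architecture
    return "A"      # assumed: a fresh hypothesis justifies another retry


if __name__ == "__main__":
    print(next_step(FailureState(is_shallow=True)))
```

Escalation here depends on which earlier steps are exhausted and not on a fixed counter alone, which is the point the revised wording makes.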
package/knowledge/spec-driven-development.md
CHANGED
@@ -57,7 +57,7 @@ What's wasted isn't code — it's context tokens and decision fatigue from churn
  **Key behaviors** (flow-researcher agent):
  1. Read `.flow/PROJECT.md` and `.flow/CONTEXT.md` to understand project background
  2. Call `mcp__claude_mem__search` to retrieve relevant historical experience
- 3. Use sequential-thinking for
+ 3. Use sequential-thinking proportional to the unknowns (1 thought for a trivial prototype, many for a novel domain)
  4. Scan the codebase for reusable modules
  5. Use `mcp__context7__*` to look up latest docs for relevant libraries
  6. When necessary, WebSearch for the latest technical trends
@@ -99,11 +99,12 @@ What's wasted isn't code — it's context tokens and decision fatigue from churn

  **Key behaviors** (flow-architect agent):
  1. Read `research.md` + `requirements.md`
- 2. **
- -
- -
- -
- -
+ 2. **Use sequential-thinking proportional to the tradeoff surface** — the phases below are orientation, not a quota:
+ - Constraints (from NFR / tech stack)
+ - Option comparison (only when alternatives genuinely compete)
+ - Selection + accepted tradeoff
+ - Self-rebuttal
+ A well-known stack pick may finish in 1 thought; a distributed-system design may run many. Do not pad.
  3. Assign an `AD-NN` ID to each architectural decision
  4. Draw a data flow diagram (mermaid)
  5. Define component interfaces + error paths
@@ -125,7 +126,7 @@ What's wasted isn't code — it's context tokens and decision fatigue from churn
  3. Each task has 5 fields: `Do` / `Files` / `Done-when` / `Verify` / `Commit`
  4. **Multi-source coverage audit**: for each FR / AC / AD / decision, confirm there is a covering task (no omissions)
  5. Mark `[P]` (parallel-safe) and `[VERIFY]` (checkpoint)
- 6. Simple decomposition doesn't need sequential-thinking
+ 6. Simple decomposition doesn't need sequential-thinking; run a coverage audit at the end (every FR/AC/AD has a task)

  **Deliverable**: `tasks.md`

package/knowledge/two-stage-review.md
CHANGED
@@ -113,17 +113,18 @@ Stage 2 applies all enabled Gates (from `.flow/config.json`):

  #### 2.5 (enterprise) Adversarial review (adversarial-review-gate)

- -
+ - Every applicable category examined (N/A documented for the rest)?
+ - Findings proportional to real issues (zero is OK with a proof-of-checking report)?
  - Each finding has evidence + recommendation?

  #### 2.6 (enterprise) Edge cases (edge-case-gate)

- -
+ - Each applicable edge-case category addressed (N/A noted for the rest)?
  - Gap list has priorities?

  ### Stage 2 verdict

- - **EXCELLENT**: all enabled Gates pass, adversarial
+ - **EXCELLENT**: all enabled Gates pass, adversarial review clean or only low-severity findings
  - **GOOD**: all enabled Gates pass, but some warnings
  - **NEEDS_IMPROVEMENT**: Gate violations (blocking)

package/package.json
CHANGED
package/templates/design.md.tmpl
CHANGED
@@ -9,155 +9,75 @@ depends_on: requirements.md

  # Technical Design: {{SPEC_NAME}}

- > Conclusions from
- >
+ > Conclusions from flow-architect. Sequential-thinking is invoked proportional to the genuine tradeoff surface — the chain lives in the thinking tool, not this document.
+ >
+ > **Fill only the sections that carry real design information for this feature.** Well-known stack assemblies legitimately compress to a stack list + data model + a few real ADs. Delete sections whose honest answer would be "N/A" or "standard for this stack". A forced 13-section template is the bloat pattern this is designed to prevent.

  ---

  ## Design Overview (one paragraph)

- <!-- One
+ <!-- One sentence summary of the approach. -->

  ## Architecture Decisions

- <!-- Each
+ <!-- Each real decision gets an AD-NN. If a decision is "obvious, no alternative worth listing," use one line and move on. -->

  ### AD-01: ...
- - **Decision**: Use X
+ - **Decision**: Use X
  - **Rationale**: ...
- - **Trade-off**:
- - **sequentialthinking rounds**: rounds 3-5
-
- ### AD-02: ...
-
- ## System Architecture Diagram
-
- ```mermaid
- flowchart TB
- <!-- actual data flow generated by flow-architect -->
- User[User] --> API[API Gateway]
- API --> Auth[Auth Service]
- Auth --> DB[(Database)]
- ```
+ - **Trade-off**: ... (omit if there is no genuine tradeoff)

  ## Component Design

- <!-- Each component
+ <!-- Each component: responsibility, input type, output type, dependencies, error path. Skip if the feature is a single module with no internal boundaries worth naming. -->

- ### Component: {{
+ ### Component: {{COMP_NAME}}
  - **Responsibility**: ...
- - **Input**:
-
-
-
- }
- ```
- - **Output**:
- ```ts
- interface Output {
- field: Type;
- }
- ```
- - **Dependencies**: Component X, Library Y
- - **Errors**:
- - `ErrorCode.X` — when ... happens
- - `ErrorCode.Y` — when ... happens
-
- ### Component: {{COMP_NAME_2}}
- <!-- ... -->
-
- ## Data Model
-
- <!-- Database schema / data structures -->
-
- ### Entity: ...
- ```sql
- CREATE TABLE ... (
- id UUID PRIMARY KEY,
- ...
- );
- ```
+ - **Input**: `interface Input { ... }`
+ - **Output**: `interface Output { ... }`
+ - **Dependencies**: ...
+ - **Errors**: ...

-
- ```ts
- interface Entity {
- id: string;
- ...
- }
- ```
+ ## Data Model (if the feature touches persistence or structured data)

-
+ <!-- SQL schema, TypeScript types, or API payload shape. Delete if the feature has no meaningful data shape. -->
+
+ ## Architecture Diagram (include only when it clarifies; prose often suffices)

  ```mermaid
-
-
- Pending --> Active: approve
- Pending --> Rejected: reject
- Active --> Completed: finish
+ flowchart TB
+ ...
  ```

- ##
+ ## State Machine (include only if the feature has non-trivial state transitions)

-
+ ## Error Path Design (include when error behavior is not obvious)

- | Scenario |
-
- |
- | Rate limit hit | none | return 429 | "Too many requests, retry in 60 seconds" |
+ | Scenario | System Response | User-visible |
+ |-----|---------|---------|
+ | ... | ... | ... |

- ## API Contract
-
- <!-- If this is an API project -->
+ ## API Contract (include only if this feature exposes or changes an API)

  ```yaml
-
- Request:
- body:
- field: string
- Response:
- 200:
- body:
- field: string
- 400:
- body:
- error: string
+ ...
  ```

- ## Test Matrix
+ ## Test Matrix (brief — one line per layer)

  | Layer | Coverage | Tool |
  |---|-----|------|
- |
- | Integration | Between components | vitest + supertest |
- | E2E | Complete user flows | playwright / chrome-devtools MCP |
-
- ### Key Test Scenarios
- 1. Happy path: ...
- 2. Edge case 1: ...
- 3. Error recovery: ...
-
- ## Suggested Implementation Order
-
- <!-- Reference for decomposition in the tasks phase -->
-
- 1. Build skeleton first (Component A → empty implementation)
- 2. Then wire up the real logic (core logic of Component A)
- 3. Connect DB (persistence for Component A)
- 4. Then do Component B ...
-
- ## Risks and Mitigations
+ | ... | ... | ... |

-
- |-----|-----|------|
- | ... | medium | ... |
+ ## Risks and Mitigations (include only if risks exist that aren't obvious from the ADs)

  ## Defer to Implementation

- <!-- Decisions
+ <!-- Decisions explicitly deferred to when the executor writes the code. -->

- -
- - Caching strategy → no caching initially, adjust based on data after launch
+ - ...

  ---

- _Generated by flow-architect
+ _Generated by flow-architect on {{CREATED_DATE}}._
package/templates/requirements.md.tmpl
CHANGED
@@ -9,86 +9,68 @@ depends_on: research.md

  # Requirements Spec: {{SPEC_NAME}}

- > **Recommended direction from
+ > **Recommended direction from research**: {{RESEARCH_CONCLUSION}}
  >
- >
+ > **Fill only the sections that carry real information for this feature.** Delete or collapse any section whose honest content would be "N/A" or "same as usual". Padding sections with "TBD" is worse than omitting them.

  ---

  ## User Stories

- <!-- Each story follows the format: As X, I want Y, so that Z -->
-
  ### US-01: ...
- **As** [user role],
- **I want** [capability],
- **so that** [business value].
+ **As** [user role], **I want** [capability], **so that** [business value].

  **Acceptance criteria**:
  - AC-1.1: [verifiable behavior]
- - AC-1.2:
- - AC-1.3: [edge case handling]
+ - AC-1.2: ...

-
- <!-- ... -->
+ <!-- Add more US-NN blocks only if the feature genuinely has multiple independent user flows. -->

  ## Functional Requirements

- <!-- FR-NN format. Each FR must be a verifiable statement of "the system must X". -->
-
  - **FR-01**: The system must ...
- - **FR-02**:
- - **FR-03**: ...
+ - **FR-02**: ...

  ## Non-Functional Requirements

-
-
-
+ <!--
+ Include ONLY the NFR categories that this feature is actually constrained by.
+ For a small internal CRUD feature, "Performance / Security / Maintainability / Compatibility" as a four-bucket grid is usually padding.
+ Delete categories that have no real requirement, or collapse into one line: "NFR: standard for this stack, no special constraints."
+ -->

- ###
- - **NFR-
- - **NFR-S-02**: ...
+ ### Performance (if applicable)
+ - **NFR-P-01**: ...

- ###
- - **NFR-
+ ### Security (if applicable)
+ - **NFR-S-01**: ...

-
- - **NFR-C-01**: ...
+ <!-- Delete Maintainability / Compatibility sections unless they carry a real constraint. -->

  ## Edge Cases and Error Handling

- <!--
+ <!-- Include rows only for scenarios that actually apply. -->

  | Scenario | Expected behavior |
  |-----|--------|
- |
- | Database exception | ... |
- | Invalid input | ... |
- | Concurrent conflict | ... |
+ | ... | ... |

  ## Out of Scope

-
-
- - ✗ Feature A — deferred to the next version
- - ✗ Feature B — out of budget
- - ✗ Feature C — needs its own spec
+ - ✗ ...

- ## Success Metrics
+ ## Success Metrics (if the feature has measurable outcomes)

- <!--
+ <!-- Delete this section for internal tools or refactors with no user-visible metric. -->

- - Metric 1:
- - Metric 2: [e.g. complaint rate < 1%]
+ - Metric 1: ...

  ## Open Questions

- <!--
+ <!-- Include only if there are genuinely unresolved questions. Delete when empty. -->

- 1.
- 2. **Question 2**: ...
+ 1. ...

  ---

- _Generated by flow-product-designer
+ _Generated by flow-product-designer on {{CREATED_DATE}}._
@@ -10,105 +10,74 @@ status: in_progress
|
|
|
10
10
|
|
|
11
11
|
> **Goal**: {{SPEC_GOAL}}
|
|
12
12
|
>
|
|
13
|
-
>
|
|
13
|
+
> **Fill only the sections that carry real information.** For a well-understood feature on a known stack, research legitimately compresses to: goal, one recommended direction, known constraints. Delete sections whose honest content would be "N/A" or "first time, nothing to fetch". Padding this document with "TBD" is worse than omitting sections.
|
|
14
14
|
|
|
15
15
|
---
|
|
16
16
|
|
|
17
|
-
## Prior Experience (from claude-mem)
|
|
18
|
-
|
|
19
|
-
<!--
|
|
20
|
-
flow-researcher first calls mcp__claude_mem__search to retrieve relevant history.
|
|
21
|
-
If there are relevant observations, summarize them here; if not, write "(first research on this topic)".
|
|
22
|
-
-->
|
|
17
|
+
## Prior Experience (from claude-mem, if relevant)
|
|
23
18
|
|
|
24
19
|
{{CLAUDE_MEM_FINDINGS}}
|
|
25
20
|
|
|
26
|
-
|
|
21
|
+
<!-- Delete this section if there are no relevant prior observations. -->
|
|
27
22
|
|
|
28
|
-
|
|
23
|
+
## Problem Understanding
|
|
29
24
|
|
|
30
25
|
### Core Problem
|
|
31
|
-
<!-- One
|
|
26
|
+
<!-- One sentence. What are we solving? -->
|
|
32
27
|
|
|
33
28
|
### Explicit Assumptions
|
|
34
|
-
<!--
|
|
29
|
+
<!-- Only real assumptions that matter. Don't list "assumption: we will write code." -->
|
|
30
|
+
|
|
35
31
|
- Assumption 1: ...
|
|
36
|
-
- Assumption 2: ...
|
|
37
32
|
|
|
38
33
|
### Known Constraints
|
|
39
|
-
|
|
40
|
-
- Budget / time:
|
|
41
|
-
- Team capability:
|
|
42
|
-
- Compliance requirements:
|
|
43
|
-
|
|
44
|
-
## Technical Solution Space
|
|
34
|
+
<!-- Include only the constraints that actually shape the solution. -->
|
|
45
35
|
|
|
46
|
-
|
|
36
|
+
- Tech stack: ...
|
|
37
|
+
- Time budget: ...
|
|
38
|
+
- (Compliance, team capability, etc — only if they constrain this feature)
|
|
47
39
|
|
-
-- **Pros**:
-- **Cons**:
-- **Complexity**: low / medium / high
-- **Docs (context7 queries)**:
-- `library-name@version`: ...
+## Technical Solution Space
 
-
-
-
-
+<!--
+If one approach is clearly the right call for this stack, write only that approach with its rationale.
+Include alternative options ONLY when there is a genuine tradeoff a thoughtful engineer might disagree on.
+Do not invent Option B and Option C just to fill the template.
+-->
 
-###
+### Recommended Approach: ...
+- **Why**: ...
+- **Complexity**: ...
+- **Key APIs verified via context7**: ...
 
-
+### Alternative: ... (include only if a real alternative exists)
 
-
+## Existing Code Analysis (include only if the codebase has relevant prior work)
 
 ### Reusable Modules
-- `path/to/
-
-### Modules to Create
-- `path/to/new-module.ts` — ...
-
-### Modules to Modify
-- `path/to/modify.ts` — ...
-
-## Latest Documentation Summary (context7)
-
-<!-- Latest APIs / best practices found by flow-researcher via mcp__context7__* -->
-
-### {{LIBRARY_1}}
-- Version:
-- Relevant APIs:
-- Gotchas / changes:
-
-### {{LIBRARY_2}}
-- ...
-
-## Feasibility Assessment
+- `path/to/module` — ...
 
-
+### New Modules Required
+- `path/to/new` — ...
 
-
-- **Estimated complexity**: 1-10
-- **Main risks**:
-- Risk 1: ...
-- Risk 2: ...
+## Latest Documentation Summary
 
-
+<!-- Only include libraries whose API is version-sensitive AND used by this feature. Do not cite every library in the stack. -->
 
-
+### {{LIBRARY}}
+- Version: ...
+- Relevant APIs: ...
+- Gotchas: ...
 
-
-**Rationale**:
-**To confirm in the design phase**:
+## Feasibility
 
-
+- **Verdict**: feasible / risky / not recommended
+- **Main risks**: (only if real risks exist)
 
-
+## Open Questions (delete if none)
 
 1. ...
-2. ...
 
 ---
 
-_Generated by flow-researcher
+_Generated by flow-researcher on {{CREATED_DATE}}._
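The revised research template asks authors to delete sections that carry no real information instead of padding them with "N/A" or "TBD". A minimal sketch of how a post-processing step might enforce that rule is shown here. It is an illustration only: `prune_empty_sections`, `is_empty_body`, and the placeholder list are hypothetical and are not part of @curdx/flow. The sketch assumes sections start at `## ` headings and that a bare `---` line ends the last section.

```python
import re

# Hypothetical helper, not part of @curdx/flow.
# A section is treated as empty when its body holds only blank lines,
# HTML comments, sub-headings, or bare placeholders such as "N/A" or "TBD".
PLACEHOLDERS = {"", "n/a", "tbd", "..."}
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
LIST_MARKER_RE = re.compile(r"^(?:[-*+]|\d+\.)\s*")
# Split before every "## " heading and before every bare "---" rule,
# so a trailing footer is not mistaken for part of the last section.
SECTION_START_RE = re.compile(r"(?m)^(?=## |---$)")


def is_empty_body(body: str) -> bool:
    """Return True when the section body contains no real information."""
    text = COMMENT_RE.sub("", body)
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # sub-headings are scaffolding, not content
        stripped = LIST_MARKER_RE.sub("", stripped).strip().lower()
        if stripped not in PLACEHOLDERS:
            return False
    return True


def prune_empty_sections(markdown: str) -> str:
    """Drop level-2 sections whose body is empty by the rule above."""
    kept = []
    for part in SECTION_START_RE.split(markdown):
        if part.startswith("## "):
            _heading, _, body = part.partition("\n")
            if is_empty_body(body):
                continue
        kept.append(part)
    return "".join(kept)


SAMPLE_RESEARCH = (
    "# Research\n\n"
    "## Prior Experience\n\n<!-- none -->\nN/A\n\n"
    "## Feasibility\n\n- **Verdict**: feasible\n\n"
    "## Open Questions\n\n1. ...\n\n"
    "---\n\n_Generated_\n"
)

if __name__ == "__main__":
    print(prune_empty_sections(SAMPLE_RESEARCH))
```

The sketch is deliberately conservative: a line such as `- Version: ...` still counts as content, so only sections that are unambiguously empty are removed.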
package/templates/tasks.md.tmpl
CHANGED
@@ -5,137 +5,80 @@ created: {{CREATED_DATE}}
 version: 1.0
 status: in_progress
 depends_on: design.md
-task_size: fine
 ---
 
 # Task Breakdown: {{SPEC_NAME}}
 
-> POC-First
+> POC-First is an **orientation, not a mandate**. Use the phases below as an organizing idea and **delete phases that don't apply to this feature**. A bug-fix may be one task. A prototype may skip Phase 2 (refactor) and Phase 5 (PR lifecycle). A library may skip the PR lifecycle entirely. Forcing all five phases for a small feature is the padding pattern this template is designed to prevent.
 >
-> Each task includes
+> Each task includes whatever of `Do`, `Files`, `Done-when`, `Verify`, `Commit` is needed for the executor to finish it in a single sub-agent dispatch. Verify must be an automated command (no "manual test").
 
 ---
 
 ## Marker Rules
 
 - `[ ]` TODO / `[x]` done
-- `[P]` parallel-safe (
-- `[VERIFY]` quality checkpoint (
+- `[P]` parallel-safe (dispatch in parallel within the same wave)
+- `[VERIFY]` quality checkpoint (flow-verifier agent)
 - `[SEQUENTIAL]` must be serial (breaks the parallel group)
 
 ---
 
 ## Phase 1: Make It Work (POC)
 
-> Goal:
+> Goal: end-to-end runnable. Hardcoding is acceptable; skip tests here.
 
-
-- **Do**: create `src/{{MODULE}}/` directory, add `index.ts`, `types.ts`
-- **Files**: `src/{{MODULE}}/index.ts`, `src/{{MODULE}}/types.ts`
-- **Done when**: directory exists, `import {} from './{{MODULE}}'` does not error
-- **Verify**: `npx tsc --noEmit`
-- **Commit**: `feat({{MODULE}}): initialize module skeleton`
-- _Requirements_: FR-01
+<!-- Add only the tasks this feature genuinely needs. Do not invent skeleton tasks to "round out" the phase. -->
 
-- [ ] **1.
+- [ ] **1.1** ...
 - **Do**: ...
 - **Files**: ...
 - **Done when**: ...
 - **Verify**: ...
 - **Commit**: ...
-- _Requirements_:
-- _Design_: AD-01
+- _Requirements_: FR-NN
 
-- [ ] **1.
-- **
-- **
-- **Done when**: returns expected data (edge cases may still be wrong)
+- [ ] **1.X** [VERIFY] End-to-end POC verification
+- **Verify**: `<command>`
+- **Done when**: happy path returns the expected result
 
-## Phase 2: Refactoring
+## Phase 2: Refactoring (delete if the POC is already clean)
 
->
-
-- [ ] **2.1** Extract duplicated logic
-- **Do**: ...
-- **Verify**: `npx tsc --noEmit && git diff --stat`
-- **Commit**: `refactor({{MODULE}}): extract common logic`
-
-- [ ] **2.2** [VERIFY] Refactor does not break behavior
-- **Verify**: rerun the manual test from Phase 1
-- **Done when**: all outputs match
+> Include only if the POC has genuine duplication or structural mud that warrants cleanup. Skip for tiny features.
 
 ## Phase 3: Testing (TDD red / green / yellow)
 
-> Rule: tests first.
-
-- [ ] **3.1** [RED] Write failing tests — unit
-- **Do**: write unit tests for core functions
-- **Files**: `src/{{MODULE}}/*.test.ts`
-- **Verify**: `npm test -- src/{{MODULE}}` — expected to fail
-- **Commit**: `test({{MODULE}}): red - add unit tests for core logic`
-
-- [ ] **3.2** [GREEN] Make tests pass
-- **Do**: fix the implementation so the tests from 3.1 pass
-- **Verify**: `npm test -- src/{{MODULE}}` — all green
-- **Commit**: `feat({{MODULE}}): green - satisfy unit tests`
-
-- [ ] **3.3** [YELLOW] Refactor and clean up
-- **Do**: clean up the implementation, tests still pass
-- **Commit**: `refactor({{MODULE}}): yellow - clean implementation`
+> Rule: tests first. Red → Green → Yellow. **Collapse red+green into one task when the test and implementation are trivially paired**; split only when the test genuinely precedes a nontrivial implementation.
 
-- [ ] **3.
-- <!-- Repeat the TDD cycle -->
+- [ ] **3.X** [RED→GREEN→YELLOW] ...
 
-- [ ] **3.
-- **Verify**:
+- [ ] **3.X+1** [VERIFY] Coverage check
+- **Verify**: coverage on the changed surface ≥ project standard
 
 ## Phase 4: Quality Gates
 
->
-
-- [ ] **4.1** TypeScript strict check
-- **Verify**: `npx tsc --strict --noEmit` — 0 errors
-- **Commit**: `chore({{MODULE}}): tsc strict passes`
-
-- [ ] **4.2** Lint
-- **Verify**: `npx eslint src/{{MODULE}}` — 0 errors, 0 warnings
-
-- [ ] **4.3** All tests pass
-- **Verify**: `npm test` — all green
-
-- [ ] **4.4** [VERIFY] Final health check
-- **Do**: flow-verifier agent performs goal-driven reverse verification
-- **Done when**: every FR-XX and AC-X.Y has a corresponding automated verification
-
-## Phase 5: PR Lifecycle
+> Include only the checks this project actually runs. `npx eslint` is dead weight if the project uses biome. `tsc --strict` is dead weight for a JS project.
 
-- [ ] **
-- **Do**:
-- **Done when**:
+- [ ] **4.X** [VERIFY] Final health check
+- **Do**: flow-verifier performs goal-driven reverse verification
+- **Done when**: every FR/AC has an automated check
 
--
-- **Do**: iterate until approved
-- **Verify**: CI all green
+## Phase 5: PR Lifecycle (delete for local-only work, scripts, internal tools without a PR flow)
 
-- [ ] **5.
-- **Do**: `/flow-land`
-- **Verify**: the main branch contains all commits for this spec
+- [ ] **5.X** Ship / Land
 
 ---
 
 ## Coverage Audit
 
-<!--
+<!-- flow-planner fills this in. Every FR / AC / AD / D must map to a task, or explicitly defer with reason. -->
 
 | Requirement ID | Task(s) | Status |
 |--------|---------|------|
-| FR-01 |
-| FR-02 | ... | ⚠ uncovered — needs adding |
-| AD-01 | 1.1 | ✓ |
-| D-05 (STATE.md) | ... | ✓ |
+| FR-01 | ... | ✓ |
 
-**Uncovered items must be handled**: add a task or document the deferral reason in STATE.md.
+**Uncovered items must be handled**: add a task, or document the deferral reason in STATE.md.
 
 ---
 
-_Generated by flow-planner
+_Generated by flow-planner on {{CREATED_DATE}}._
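The Marker Rules and the Coverage Audit in the tasks template can be checked mechanically. The sketch below parses task lines, collects their `[P]`, `[VERIFY]`, and `[SEQUENTIAL]` markers, and reports requirement IDs that no task references. It is a rough illustration only: the task-line shape (`- [ ] **1.1** [P] Title`) and the `_Requirements_:` line format are inferred from the template, and `parse_tasks`, `uncovered`, and the `Task` class are hypothetical names, not part of @curdx/flow.

```python
import re
from dataclasses import dataclass, field

# Assumed line shapes, inferred from the template text (not a published grammar).
TASK_RE = re.compile(r"^- \[(?P<done>[ x])\] \*\*(?P<id>[\w.+]+)\*\*(?P<rest>.*)$")
MARKER_RE = re.compile(r"\[(P|VERIFY|SEQUENTIAL)\]")
REQ_RE = re.compile(r"_Requirements_:\s*(?P<ids>.+)$")


@dataclass
class Task:
    task_id: str
    done: bool
    markers: set = field(default_factory=set)
    requirements: set = field(default_factory=set)


def parse_tasks(markdown: str) -> list:
    """Collect tasks, their markers, and the requirement IDs listed under them."""
    tasks = []
    for raw in markdown.splitlines():
        line = raw.strip()
        task_match = TASK_RE.match(line)
        if task_match:
            markers = set(MARKER_RE.findall(task_match["rest"]))
            tasks.append(Task(task_match["id"], task_match["done"] == "x", markers))
            continue
        req_match = REQ_RE.search(line)
        if req_match and tasks:
            # A requirements line belongs to the most recent task above it.
            ids = {p.strip() for p in req_match["ids"].split(",") if p.strip()}
            tasks[-1].requirements |= ids
    return tasks


def uncovered(required_ids, tasks) -> list:
    """Return the requirement IDs that no task references, sorted."""
    covered = set()
    for task in tasks:
        covered |= task.requirements
    return sorted(set(required_ids) - covered)


SAMPLE_TASKS = """\
- [ ] **1.1** [P] Add parser
  - _Requirements_: FR-01
- [x] **1.2** [VERIFY] End-to-end POC verification
  - _Requirements_: FR-01, FR-03
"""

if __name__ == "__main__":
    parsed = parse_tasks(SAMPLE_TASKS)
    print(uncovered(["FR-01", "FR-02", "FR-03"], parsed))
```

Any ID the check reports would then need either a new task or a deferral note in STATE.md, as the template requires.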