@curdx/flow 2.0.0-beta.5 → 2.0.0-beta.7

@@ -330,7 +330,7 @@ Prerequisites:
 
  ## Step 6: Progress Feedback
 
- Every 5 tasks or every wave, print status:
+ At each wave boundary (or periodically during long linear runs), print status:
 
  ```
  ═════ Progress ═════
@@ -16,8 +16,8 @@ Distinct from `/curdx-flow:verify`:
  | Flag | Default | Purpose |
  |------|---------|---------|
  | `--stage=<1\|2\|both>` | `both` | Stage 1 = spec compliance only. Stage 2 = code quality only. `both` = sequential. |
- | `--adversarial` | off | Add an adversarial review pass (6 dimensions × 2 sequential-thinking rounds). Zero-findings forbidden. |
- | `--edge-case` | off | Add edge-case hunting across the 7 categories. Produces a test-gap checklist. |
+ | `--adversarial` | off | Add an adversarial review pass across applicable categories (zero findings requires proof-of-checking, not fabrication). |
+ | `--edge-case` | off | Add edge-case hunting across applicable categories. Produces a test-gap checklist. |
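For illustration, a minimal TypeScript sketch of the `--stage` semantics in the table above; `runReview`, `runStage1`, and `runStage2` are hypothetical names, not the package's actual implementation:

```ts
// Hypothetical sketch of the --stage dispatch; not the package's real code.
type Stage = "1" | "2" | "both";

interface ReviewReport {
  specCompliance?: string; // Stage 1 output
  codeQuality?: string;    // Stage 2 output
}

async function runReview(
  stage: Stage,
  runStage1: () => Promise<string>,
  runStage2: () => Promise<string>,
): Promise<ReviewReport> {
  const report: ReviewReport = {};
  // Stage 1 = spec compliance only; Stage 2 = code quality only; `both` runs them sequentially.
  if (stage !== "2") report.specCompliance = await runStage1();
  if (stage !== "1") report.codeQuality = await runStage2();
  return report;
}
```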
 
  ## Preflight
 
@@ -65,7 +65,7 @@ Output: Stage-2 section of the report.
  ## Optional: adversarial review
 
  If `--adversarial`:
- Dispatch `flow-adversary`. It runs 6 dimensions × 2 rounds of `sequential-thinking`:
+ Dispatch `flow-adversary`. It scans the applicable categories (Architecture / Implementation / Testing / Security / Maintainability / UX — skip N/A with reason) using `sequential-thinking` proportional to the residual uncertainty, probing:
  1. What's missing?
  2. What's overengineered?
  3. What would break first in production?
@@ -73,12 +73,12 @@ Dispatch `flow-adversary`. It runs 6 dimensions × 2 rounds of `sequential-think
  5. What decision locks us out of a future option?
  6. What would a skeptical reviewer reject?
 
- **Zero findings are forbidden** — if the agent reports "all good", re-dispatch with stronger skepticism. Per `@${CLAUDE_PLUGIN_ROOT}/gates/adversarial-review-gate.md`.
+ **Zero findings requires proof-of-checking, not fabrication** — honest "clean" verdicts are fine if the agent lists what it examined. Per `@${CLAUDE_PLUGIN_ROOT}/gates/adversarial-review-gate.md`.
 
  ## Optional: edge-case hunting
 
  If `--edge-case`:
- Dispatch `flow-edge-hunter` across the 7 categories:
+ Dispatch `flow-edge-hunter` across the applicable categories (skip N/A with one-line reason):
  1. Boundary values (0, MAX, empty, one-over-limit)
  2. Concurrency / race conditions
  3. Network failure / partial failure
package/commands/spec.md CHANGED
@@ -82,7 +82,7 @@ Output: `requirements.md` with user stories (US-NN), acceptance criteria (AC-N.N
 
  ### design → `flow-architect`
  Inputs: `research.md` + `requirements.md`.
- Output: `design.md` with architecture decisions (AD-NN), component boundaries, data models, error-path design, mermaid diagrams. Must use `sequential-thinking` MCP (≥8 thoughts).
+ Output: `design.md` with architecture decisions (AD-NN), component boundaries, data models, error-path design, mermaid diagrams (when they clarify). Uses `sequential-thinking` MCP proportional to the genuine tradeoff surface.
 
  ### tasks → `flow-planner`
  Inputs: all three prior files + `.flow/PROJECT.md` tech stack.
@@ -33,19 +33,19 @@ A reviewer agent's output of "everything looks fine, no issues found" is an **in
  - "Looks good" is usually confirmation bias (the agent only checked the obvious)
  - AI tends to please the user ("great job!") — fight this tendency
 
- **Forced actions**:
- 1. If the agent outputs "no issues", automatically trigger a second round
- 2. The second round requires the agent to perform deeper analysis via sequential-thinking
- 3. If both rounds yield no findings, the agent must **prove** it checked:
-    - List the dimensions examined (at least 5)
-    - For each dimension, give the specific code/file locations inspected
-    - Provide counterfactual hypotheses of "what it would look like if there were a problem"
+ **Forced actions when the agent reports "no issues"**:
+ 1. Automatically trigger a second round framed as "what would a senior skeptic reject in this PR?"
+ 2. If both rounds still honestly yield no findings, the agent must emit a **proof-of-checking report**:
+    - Every category it examined (with "N/A" for categories that don't apply)
+    - For each examined category, the specific code/file locations inspected
+    - Counterfactual hypotheses of "what this would look like if there were a problem" and why that signature is absent
+ 3. Fabricating findings to avoid the proof-of-checking step is a violation of L3 red line #2 (fact-driven). Better to emit a "clean verdict with proof" than invent issues.
 
  ---
 
- ### Rule 2: Findings in at Least 3 Categories
+ ### Rule 2: Coverage proportional to feature scope
 
- A complete adversarial review must cover (find issues in at least 3 of these categories):
+ A complete adversarial review covers every category that applies to the feature and marks the rest as N/A with a reason. The number of findings per category is proportional to real issues, not a quota:
 
  1. **Architecture layer**: Are decisions sound? Future-extensible? Lock-in risks?
  2. **Implementation layer**: Code quality? Error handling? Performance?
@@ -86,22 +86,22 @@ Not allowed:
  Input: object under review (code range / spec / PR diff)
 
  Round 1 (agent self-analysis):
- - Use sequential-thinking 6 rounds
- - Scan all 6 categories
+ - Use sequential-thinking proportional to the surface being probed
+ - Scan each applicable category; mark N/A ones with reason
  - Output findings list
 
  Decision:
- - Findings ≥ 3? → output report
- - Findings < 3? → force Round 2
+ - Any real findings? → output report with findings
+ - Zero findings after honest Round 1? → force Round 2 framed as skeptic
 
  Round 2 (deep analysis):
- - sequential-thinking for another 6 rounds
+ - sequential-thinking proportional to residual uncertainty
  - Focus on "seemingly no issues" parts (trust but verify)
- - May introduce external perspectives (read issues from similar projects)
+ - Optionally introduce external perspectives (read issues from similar projects)
 
  Decision:
- - Still < 3? → agent must explicitly prove it checked
- - Otherwise → output report
+ - Still zero findings? → agent must emit proof-of-checking report (NOT invent findings)
+ - Findings exist? → output report
 
  Output: review-report.md
  ```
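A compact TypeScript sketch of the two-round decision flow above; `Finding`, `runRound`, and `buildProof` are illustrative names, assuming each round returns a findings list:

```ts
// Illustrative model of the Round 1 / Round 2 flow; all names are hypothetical.
interface Finding {
  category: string;       // e.g. "Architecture", "Security"
  evidence: string;       // file/line pointer
  recommendation: string;
}

interface ProofOfChecking {
  examined: { category: string; locations: string[]; counterfactual: string }[];
}

type ReviewOutcome =
  | { kind: "findings"; findings: Finding[] }
  | { kind: "clean"; proof: ProofOfChecking };

async function adversarialReview(
  runRound: (framing: "self-analysis" | "skeptic") => Promise<Finding[]>,
  buildProof: () => Promise<ProofOfChecking>,
): Promise<ReviewOutcome> {
  const round1 = await runRound("self-analysis");
  if (round1.length > 0) return { kind: "findings", findings: round1 };

  // Zero findings after an honest Round 1 → force Round 2, framed as a skeptic.
  const round2 = await runRound("skeptic");
  if (round2.length > 0) return { kind: "findings", findings: round2 };

  // Still clean: emit proof-of-checking instead of inventing findings.
  return { kind: "clean", proof: await buildProof() };
}
```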
@@ -190,10 +190,10 @@ Fix loop:
 
  ## Failure Recovery
 
- If after 2 rounds there are still < 3 findings:
+ If after Round 2 the honest verdict is still zero findings, emit a proof-of-checking report (do NOT fabricate to hit a quota — there is no quota):
 
  ```markdown
- ## Adversarial Review — Insufficient Findings
+ ## Adversarial Review — Proof of Checking (zero findings)
 
  I have examined the following dimensions across 2 rounds of analysis:
 
@@ -195,12 +195,12 @@ Reading these test names = reading API behavior documentation.
 
  ### Agent Automatic
 
- When `flow-ux-designer` / `flow-reviewer` applies this gate, use sequential-thinking 4 rounds to scan the 8 dimensions.
+ When `flow-ux-designer` / `flow-reviewer` applies this gate, use sequential-thinking proportional to the complexity of the codebase being scanned.
 
  ### Human Review
 
  Attach a DevEx checklist at PR time:
- - [ ] Clear naming (reviewed at least 3 times)
+ - [ ] Clear naming (re-read until obvious to a new maintainer)
  - [ ] Critical comments exist
  - [ ] Consistent structure
  - [ ] Actionable error messages
@@ -210,7 +210,7 @@ Attach a DevEx checklist at PR time:
 
  ## Scoring
 
- Each dimension 0-10 points:
+ Score each **applicable** dimension 0-10 (N/A dimensions are excluded from the total):
 
  ```
  10 = best practice
@@ -220,8 +220,7 @@ Each dimension 0-10 points:
  0 = serious issue
  ```
 
- Total 40+ / 80 = pass (warning, non-blocking).
- Total < 40 = blocked, improvement required.
+ Emit the per-dimension scores with evidence. The gate itself does not block on a numeric threshold; it surfaces the weaknesses for the user (or the reviewing agent) to decide whether any of them rise to a blocker. A single 0/10 on a material dimension is a blocker regardless of the total.
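As a sketch of the scoring rule just described (assumed semantics, not the package's code): N/A dimensions drop out of the denominator, and a single 0 on a material dimension is flagged regardless of the total.

```ts
// Hypothetical scoring helper matching the rule above; names are illustrative.
interface DimensionScore {
  name: string;
  score: number | "N/A"; // 0-10, or N/A when the dimension doesn't apply
  material: boolean;     // does this dimension matter for the feature?
  evidence: string;
}

function summarize(scores: DimensionScore[]) {
  const applicable = scores.filter(
    (s): s is DimensionScore & { score: number } => s.score !== "N/A",
  );
  const total = applicable.reduce((sum, s) => sum + s.score, 0);
  const max = applicable.length * 10; // N/A dimensions excluded from the total
  // No numeric pass/fail threshold, but a 0/10 on a material dimension blocks.
  const blockers = applicable.filter((s) => s.score === 0 && s.material);
  return { total, max, blockers, perDimension: applicable };
}
```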
 
  ---
 
@@ -104,7 +104,7 @@ Q4. If no test, what test should be added to cover it?
  Input: object under review (function / component / API) + requirements + tests
 
  For each category (1-7):
- 1. Use sequential-thinking to list at least 3 possible edge scenarios
+ 1. Use sequential-thinking to list every plausible edge scenario for this category — stop when you've covered the real risk surface, don't pad to a quota, don't fabricate scenarios that won't occur in production
  2. Check whether each scenario has corresponding coverage in tests
  3. Add uncovered ones to the "gap list"
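The per-category pass can be pictured with a small TypeScript shape; `EdgeScenario` and `buildGapList` are hypothetical names, shown only to make steps 2-3 concrete:

```ts
// Illustrative gap-list pass; all names are hypothetical.
interface EdgeScenario {
  category: string;    // e.g. "Boundary values", "Concurrency"
  description: string; // e.g. "empty input array"
  coveredBy?: string;  // name of the test that exercises it, if any
}

// Steps 2-3 above: scenarios with no corresponding test go on the gap list.
function buildGapList(scenarios: EdgeScenario[]): EdgeScenario[] {
  return scenarios.filter((s) => s.coveredBy === undefined);
}
```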
 
@@ -223,13 +223,14 @@ return "linear"
 
  ## Failure Handling (common to all strategies)
 
- `flow-executor` agent's 5-round retry mechanism:
+ `flow-executor` agent's retry ladder — each step escalates only when the prior is honestly exhausted, not on a fixed count:
 
  ```
- Rounds 1-2: agent retries autonomously (edit code, rerun Verify)
- Round 3: sequential-thinking root-cause analysis 5 rounds
- Round 4: read related source + trace data flow
- Round 5: report TASK_FAILED
+ Step A: autonomous retry (edit + rerun Verify) — only for shallow failures
+ Step B: sequential-thinking root-cause analysis proportional to the hypothesis space
+ Step C: read related source + trace data flow
+ Step D: if ≥3 retries fail with no new hypothesis, stop and challenge the architecture (see preamble L3)
+ Step E: report TASK_FAILED
  ```
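A minimal TypeScript sketch of the escalation ladder above; `retryLadder`, `attemptFix`, and `rootCause` are hypothetical names, and the attempt cap exists only so the sketch terminates (the real ladder escalates on honest exhaustion, not a fixed count):

```ts
// Hypothetical escalation ladder; not flow-executor's actual code.
type LadderResult = "fixed" | "CHALLENGE_ARCHITECTURE" | "TASK_FAILED";

async function retryLadder(
  attemptFix: () => Promise<boolean>,      // Step A: edit + rerun Verify
  rootCause: () => Promise<string | null>, // Steps B-C: a hypothesis, or null
  maxAttempts = 5,                         // cap only so the sketch terminates
): Promise<LadderResult> {
  const hypotheses = new Set<string>();
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await attemptFix()) return "fixed";
    const h = await rootCause();
    const isNew = h !== null && !hypotheses.has(h);
    if (h !== null) hypotheses.add(h);
    // Step D: ≥3 failed retries with no fresh hypothesis → challenge the architecture.
    if (attempt >= 3 && !isNew) return "CHALLENGE_ARCHITECTURE";
  }
  return "TASK_FAILED"; // Step E
}
```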
 
  ### Extra protections for Stop-Hook strategy
@@ -57,7 +57,7 @@ What's wasted isn't code — it's context tokens and decision fatigue from churn
  **Key behaviors** (flow-researcher agent):
  1. Read `.flow/PROJECT.md` and `.flow/CONTEXT.md` to understand project background
  2. Call `mcp__claude_mem__search` to retrieve relevant historical experience
- 3. Use sequential-thinking for 5-8 rounds of problem understanding
+ 3. Use sequential-thinking proportional to the unknowns (1 thought for a trivial prototype, many for a novel domain)
  4. Scan the codebase for reusable modules
  5. Use `mcp__context7__*` to look up latest docs for relevant libraries
  6. When necessary, WebSearch for the latest technical trends
@@ -99,11 +99,12 @@ What's wasted isn't code — it's context tokens and decision fatigue from churn
 
  **Key behaviors** (flow-architect agent):
  1. Read `research.md` + `requirements.md`
- 2. **Must use sequential-thinking for at least 8 rounds**:
-    - Rounds 1-2: constraints
-    - Rounds 3-5: comparison of options A/B
-    - Rounds 6-7: selection + trade-offs
-    - Round 8: rebut yourself
+ 2. **Use sequential-thinking proportional to the tradeoff surface** — the phases below are orientation, not a quota:
+    - Constraints (from NFR / tech stack)
+    - Option comparison (only when alternatives genuinely compete)
+    - Selection + accepted tradeoff
+    - Self-rebuttal
+    A well-known stack pick may finish in 1 thought; a distributed-system design may run many. Do not pad.
  3. Assign an `AD-NN` ID to each architectural decision
  4. Draw a data flow diagram (mermaid)
  5. Define component interfaces + error paths
@@ -125,7 +126,7 @@ What's wasted isn't code — it's context tokens and decision fatigue from churn
  3. Each task has 5 fields: `Do` / `Files` / `Done-when` / `Verify` / `Commit`
  4. **Multi-source coverage audit**: for each FR / AC / AD / decision, confirm there is a covering task (no omissions)
  5. Mark `[P]` (parallel-safe) and `[VERIFY]` (checkpoint)
- 6. Simple decomposition doesn't need sequential-thinking, but reflect on coverage every 5 tasks
+ 6. Simple decomposition doesn't need sequential-thinking; run a coverage audit at the end (every FR/AC/AD has a task)
 
  **Deliverable**: `tasks.md`
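The coverage audit in items 4 and 6 reduces to a cross-reference; `Task` and `auditCoverage` below are illustrative names, not the planner's actual data model:

```ts
// Illustrative coverage audit: every FR / AC / AD ID needs a covering task.
interface Task {
  id: string;       // e.g. "T-07"
  covers: string[]; // requirement/design IDs this task claims to cover
}

function auditCoverage(requirementIds: string[], tasks: Task[]): string[] {
  const covered = new Set(tasks.flatMap((t) => t.covers));
  // IDs with no covering task; the audit passes when this list is empty.
  return requirementIds.filter((id) => !covered.has(id));
}

// Usage sketch: auditCoverage(["FR-01", "AC-1.1", "AD-01"], tasks) → [] means no omissions.
```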
 
@@ -113,17 +113,18 @@ Stage 2 applies all enabled Gates (from `.flow/config.json`):
 
  #### 2.5 (enterprise) Adversarial review (adversarial-review-gate)
 
- - ≥ 3 categories of issues found?
+ - Every applicable category examined (N/A documented for the rest)?
+ - Findings proportional to real issues (zero is OK with a proof-of-checking report)?
  - Each finding has evidence + recommendation?
 
  #### 2.6 (enterprise) Edge cases (edge-case-gate)
 
- - Did all 7 major categories pass?
+ - Each applicable edge-case category addressed (N/A noted for the rest)?
  - Gap list has priorities?
 
  ### Stage 2 verdict
 
- - **EXCELLENT**: all enabled Gates pass, adversarial findings < 3 (high-quality code)
+ - **EXCELLENT**: all enabled Gates pass, adversarial review clean or only low-severity findings
  - **GOOD**: all enabled Gates pass, but some warnings
  - **NEEDS_IMPROVEMENT**: Gate violations (blocking)
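A hypothetical TypeScript rendering of the verdict rules above (the severity model is an assumption; the package does not define one here):

```ts
// Illustrative Stage 2 verdict computation; names and severity levels are assumed.
type Verdict = "EXCELLENT" | "GOOD" | "NEEDS_IMPROVEMENT";

interface Stage2Result {
  gateViolations: number; // blocking gate failures
  warnings: number;
  adversarialFindings: { severity: "low" | "medium" | "high" }[];
}

function stage2Verdict(r: Stage2Result): Verdict {
  if (r.gateViolations > 0) return "NEEDS_IMPROVEMENT"; // blocking
  const cleanOrLow = r.adversarialFindings.every((f) => f.severity === "low");
  return cleanOrLow && r.warnings === 0 ? "EXCELLENT" : "GOOD";
}
```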
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@curdx/flow",
-   "version": "2.0.0-beta.5",
+   "version": "2.0.0-beta.7",
    "description": "CLI installer for CurDX-Flow — AI engineering workflow meta-framework for Claude Code",
    "type": "module",
    "bin": {
@@ -32,7 +32,7 @@
    "specs": {
      "directories": ["./.flow/specs"],
      "default_task_size": "fine",
-     "_task_size_options": "fine (40-60 tasks) | coarse (10-20 tasks)"
+     "_task_size_hint": "as-needed decomposition (no fixed count); see agents/flow-planner.md"
    },
 
    "addons": {
@@ -9,155 +9,75 @@ depends_on: requirements.md
 
  # Technical Design: {{SPEC_NAME}}
 
- > Conclusions from the flow-architect agent using at least 8 rounds of `sequential-thinking` reasoning.
- > This document freezes the technical choices. Subsequent tasks / implementation strictly follow this design.
+ > Conclusions from flow-architect. Sequential-thinking is invoked proportional to the genuine tradeoff surface — the chain lives in the thinking tool, not this document.
+ >
+ > **Fill only the sections that carry real design information for this feature.** Well-known stack assemblies legitimately compress to a stack list + data model + a few real ADs. Delete sections whose honest answer would be "N/A" or "standard for this stack". A forced 13-section template is the bloat pattern this is designed to prevent.
 
  ---
 
  ## Design Overview (one paragraph)
 
- <!-- One-sentence summary of the architecture -->
+ <!-- One-sentence summary of the approach. -->
 
  ## Architecture Decisions
 
- <!-- Each major decision gets an ID and is written to the decisions array in .flow/STATE.md -->
+ <!-- Each real decision gets an AD-NN. If a decision is "obvious, no alternative worth listing," use one line and move on. -->
 
  ### AD-01: ...
- - **Decision**: Use X instead of Y
+ - **Decision**: Use X
  - **Rationale**: ...
- - **Trade-off**: Accepted [downside] in exchange for [upside]
- - **sequentialthinking rounds**: rounds 3-5
-
- ### AD-02: ...
-
- ## System Architecture Diagram
-
- ```mermaid
- flowchart TB
-     <!-- actual data flow generated by flow-architect -->
-     User[User] --> API[API Gateway]
-     API --> Auth[Auth Service]
-     Auth --> DB[(Database)]
- ```
+ - **Trade-off**: ... (omit if there is no genuine tradeoff)
 
  ## Component Design
 
- <!-- Each component is independently testable. Interfaces are explicit. -->
+ <!-- Each component: responsibility, input type, output type, dependencies, error path. Skip if the feature is a single module with no internal boundaries worth naming. -->
 
- ### Component: {{COMP_NAME_1}}
+ ### Component: {{COMP_NAME}}
  - **Responsibility**: ...
- - **Input**:
-   ```ts
-   interface Input {
-     field: Type;
-   }
-   ```
- - **Output**:
-   ```ts
-   interface Output {
-     field: Type;
-   }
-   ```
- - **Dependencies**: Component X, Library Y
- - **Errors**:
-   - `ErrorCode.X` — when ... happens
-   - `ErrorCode.Y` — when ... happens
-
- ### Component: {{COMP_NAME_2}}
- <!-- ... -->
-
- ## Data Model
-
- <!-- Database schema / data structures -->
-
- ### Entity: ...
- ```sql
- CREATE TABLE ... (
-   id UUID PRIMARY KEY,
-   ...
- );
- ```
+ - **Input**: `interface Input { ... }`
+ - **Output**: `interface Output { ... }`
+ - **Dependencies**: ...
+ - **Errors**: ...
 
- ### Or TypeScript types:
- ```ts
- interface Entity {
-   id: string;
-   ...
- }
- ```
+ ## Data Model (if the feature touches persistence or structured data)
 
- ## State Machine (if applicable)
+ <!-- SQL schema, TypeScript types, or API payload shape. Delete if the feature has no meaningful data shape. -->
+
+ ## Architecture Diagram (include only when it clarifies; prose often suffices)
 
  ```mermaid
- stateDiagram-v2
-     [*] --> Pending
-     Pending --> Active: approve
-     Pending --> Rejected: reject
-     Active --> Completed: finish
+ flowchart TB
+     ...
  ```
 
- ## Error Path Design
+ ## State Machine (include only if the feature has non-trivial state transitions)
 
- <!-- Full flow on failure -->
+ ## Error Path Design (include when error behavior is not obvious)
 
- | Scenario | Upstream Behavior | System Response | User-visible |
- |-----|--------|---------|---------|
- | DB connection lost | retry 3 times | return 503 | "Temporarily unavailable, retry in 1 minute" |
- | Rate limit hit | none | return 429 | "Too many requests, retry in 60 seconds" |
+ | Scenario | System Response | User-visible |
+ |-----|---------|---------|
+ | ... | ... | ... |
 
- ## API Contract
-
- <!-- If this is an API project -->
+ ## API Contract (include only if this feature exposes or changes an API)
 
  ```yaml
- POST /api/v1/...
- Request:
-   body:
-     field: string
- Response:
-   200:
-     body:
-       field: string
-   400:
-     body:
-       error: string
+ ...
  ```
 
- ## Test Matrix
+ ## Test Matrix (brief — one line per layer)
 
  | Layer | Coverage | Tool |
  |---|-----|------|
- | Unit | All pure functions | vitest |
- | Integration | Between components | vitest + supertest |
- | E2E | Complete user flows | playwright / chrome-devtools MCP |
-
- ### Key Test Scenarios
- 1. Happy path: ...
- 2. Edge case 1: ...
- 3. Error recovery: ...
-
- ## Suggested Implementation Order
-
- <!-- Reference for decomposition in the tasks phase -->
-
- 1. Build skeleton first (Component A → empty implementation)
- 2. Then wire up the real logic (core logic of Component A)
- 3. Connect DB (persistence for Component A)
- 4. Then do Component B ...
-
- ## Risks and Mitigations
+ | ... | ... | ... |
 
- | Risk | Level | Mitigation |
- |-----|-----|------|
- | ... | medium | ... |
+ ## Risks and Mitigations (include only if risks exist that aren't obvious from the ADs)
 
  ## Defer to Implementation
 
- <!-- Decisions not worth spending time on in the design phase -->
+ <!-- Decisions explicitly deferred to when the executor writes the code. -->
 
- - Logging library choice → reuse project's existing one during implementation
- - Caching strategy → no caching initially, adjust based on data after launch
+ - ...
 
  ---
 
- _Generated by flow-architect agent on {{CREATED_DATE}}. After user reviews and approves AD-01~N, proceed to the tasks phase._
+ _Generated by flow-architect on {{CREATED_DATE}}._
@@ -9,86 +9,68 @@ depends_on: research.md
 
  # Requirements Spec: {{SPEC_NAME}}
 
- > **Recommended direction from the research phase**: {{RESEARCH_CONCLUSION}}
+ > **Recommended direction from research**: {{RESEARCH_CONCLUSION}}
  >
- > This phase: translate "technically feasible" into "concrete behaviors users benefit from".
+ > **Fill only the sections that carry real information for this feature.** Delete or collapse any section whose honest content would be "N/A" or "same as usual". Padding sections with "TBD" is worse than omitting them.
 
  ---
 
  ## User Stories
 
- <!-- Each story follows the format: As X, I want Y, so that Z -->
-
  ### US-01: ...
- **As** [user role],
- **I want** [capability],
- **so that** [business value].
+ **As** [user role], **I want** [capability], **so that** [business value].
 
  **Acceptance criteria**:
  - AC-1.1: [verifiable behavior]
- - AC-1.2: [verifiable behavior]
- - AC-1.3: [edge case handling]
+ - AC-1.2: ...
 
- ### US-02: ...
- <!-- ... -->
+ <!-- Add more US-NN blocks only if the feature genuinely has multiple independent user flows. -->
 
  ## Functional Requirements
 
- <!-- FR-NN format. Each FR must be a verifiable statement of "the system must X". -->
-
  - **FR-01**: The system must ...
- - **FR-02**: The system must ...
- - **FR-03**: ...
+ - **FR-02**: ...
 
  ## Non-Functional Requirements
 
- ### Performance
- - **NFR-P-01**: [e.g. P95 response time < 200ms]
- - **NFR-P-02**: ...
+ <!--
+ Include ONLY the NFR categories that this feature is actually constrained by.
+ For a small internal CRUD feature, "Performance / Security / Maintainability / Compatibility" as a four-bucket grid is usually padding.
+ Delete categories that have no real requirement, or collapse into one line: "NFR: standard for this stack, no special constraints."
+ -->
 
- ### Security
- - **NFR-S-01**: ...
- - **NFR-S-02**: ...
+ ### Performance (if applicable)
+ - **NFR-P-01**: ...
 
- ### Maintainability
- - **NFR-M-01**: ...
+ ### Security (if applicable)
+ - **NFR-S-01**: ...
 
- ### Compatibility
- - **NFR-C-01**: ...
+ <!-- Delete Maintainability / Compatibility sections unless they carry a real constraint. -->
 
  ## Edge Cases and Error Handling
 
- <!-- Must be explicit: what happens on failure? how are abnormal inputs handled? -->
+ <!-- Include rows only for scenarios that actually apply. -->
 
  | Scenario | Expected behavior |
  |-----|--------|
- | Network disconnected | ... |
- | Database exception | ... |
- | Invalid input | ... |
- | Concurrent conflict | ... |
+ | ... | ... |
 
  ## Out of Scope
 
- <!-- Karpathy principle 2: simplicity first. Explicitly list "not this time" to prevent scope creep. -->
-
- - ✗ Feature A — deferred to the next version
- - ✗ Feature B — out of budget
- - ✗ Feature C — needs its own spec
+ - ...
 
- ## Success Metrics
+ ## Success Metrics (if the feature has measurable outcomes)
 
- <!-- Must be quantifiable -->
+ <!-- Delete this section for internal tools or refactors with no user-visible metric. -->
 
- - Metric 1: [e.g. user signup completion rate > 80%]
- - Metric 2: [e.g. complaint rate < 1%]
+ - Metric 1: ...
 
  ## Open Questions
 
- <!-- Questions that need user answers -->
+ <!-- Include only if there are genuinely unresolved questions. Delete when empty. -->
 
- 1. **Question 1**: ...
- 2. **Question 2**: ...
+ 1. ...
 
  ---
 
- _Generated by flow-product-designer agent on {{CREATED_DATE}}. After user review, proceed to the design phase._
+ _Generated by flow-product-designer on {{CREATED_DATE}}._