@curdx/flow 2.0.0-beta.1 → 2.0.0-beta.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (57)
  1. package/.claude-plugin/marketplace.json +1 -1
  2. package/.claude-plugin/plugin.json +3 -10
  3. package/CHANGELOG.md +61 -0
  4. package/README.zh.md +2 -2
  5. package/agent-preamble/preamble.md +81 -11
  6. package/agents/flow-adversary.md +40 -55
  7. package/agents/flow-architect.md +23 -10
  8. package/agents/flow-debugger.md +2 -2
  9. package/agents/flow-edge-hunter.md +20 -6
  10. package/agents/flow-executor.md +3 -3
  11. package/agents/flow-planner.md +51 -48
  12. package/agents/flow-product-designer.md +14 -1
  13. package/agents/flow-qa-engineer.md +1 -1
  14. package/agents/flow-researcher.md +17 -2
  15. package/agents/flow-reviewer.md +5 -1
  16. package/agents/flow-security-auditor.md +1 -1
  17. package/agents/flow-triage-analyst.md +1 -1
  18. package/agents/flow-ui-researcher.md +2 -2
  19. package/agents/flow-ux-designer.md +1 -1
  20. package/agents/flow-verifier.md +47 -14
  21. package/bin/curdx-flow.js +13 -1
  22. package/cli/doctor.js +73 -13
  23. package/cli/install.js +62 -36
  24. package/cli/protocols.js +63 -10
  25. package/cli/registry.js +73 -0
  26. package/cli/uninstall.js +9 -11
  27. package/cli/upgrade.js +6 -10
  28. package/cli/utils.js +150 -56
  29. package/commands/fast.md +1 -1
  30. package/commands/implement.md +4 -4
  31. package/commands/init.md +14 -3
  32. package/commands/review.md +14 -5
  33. package/commands/spec.md +26 -2
  34. package/commands/start.md +47 -17
  35. package/commands/verify.md +13 -0
  36. package/gates/adversarial-review-gate.md +19 -19
  37. package/gates/devex-gate.md +4 -5
  38. package/gates/edge-case-gate.md +1 -1
  39. package/hooks/hooks.json +0 -11
  40. package/hooks/scripts/quick-mode-guard.sh +12 -9
  41. package/hooks/scripts/session-start.sh +1 -1
  42. package/hooks/scripts/stop-watcher.sh +25 -15
  43. package/knowledge/execution-strategies.md +6 -5
  44. package/knowledge/spec-driven-development.md +8 -7
  45. package/knowledge/two-stage-review.md +4 -3
  46. package/package.json +4 -2
  47. package/skills/brownfield-index/SKILL.md +62 -0
  48. package/skills/browser-qa/SKILL.md +50 -0
  49. package/skills/epic/SKILL.md +68 -0
  50. package/skills/security-audit/SKILL.md +50 -0
  51. package/skills/ui-sketch/SKILL.md +49 -0
  52. package/templates/config.json.tmpl +1 -1
  53. package/templates/design.md.tmpl +32 -112
  54. package/templates/requirements.md.tmpl +25 -43
  55. package/templates/research.md.tmpl +37 -68
  56. package/templates/tasks.md.tmpl +27 -84
  57. package/hooks/scripts/fail-tracker.sh +0 -31
@@ -9,155 +9,75 @@ depends_on: requirements.md
9
9
 
10
10
  # Technical Design: {{SPEC_NAME}}
11
11
 
12
- > Conclusions from the flow-architect agent using at least 8 rounds of `sequential-thinking` reasoning.
13
- > This document freezes the technical choices. Subsequent tasks / implementation strictly follow this design.
12
+ > Conclusions from flow-architect. Sequential-thinking is invoked proportional to the genuine tradeoff surface — the chain lives in the thinking tool, not this document.
13
+ >
14
+ > **Fill only the sections that carry real design information for this feature.** Well-known stack assemblies legitimately compress to a stack list + data model + a few real ADs. Delete sections whose honest answer would be "N/A" or "standard for this stack". A forced 13-section template is the bloat pattern this is designed to prevent.
14
15
 
15
16
  ---
16
17
 
17
18
  ## Design Overview (one paragraph)
18
19
 
19
- <!-- One-sentence summary of the architecture -->
20
+ <!-- One sentence summary of the approach. -->
20
21
 
21
22
  ## Architecture Decisions
22
23
 
23
- <!-- Each major decision gets an ID and is written to the decisions array in .flow/STATE.md -->
24
+ <!-- Each real decision gets an AD-NN. If a decision is "obvious, no alternative worth listing," use one line and move on. -->
24
25
 
25
26
  ### AD-01: ...
26
- - **Decision**: Use X instead of Y
27
+ - **Decision**: Use X
27
28
  - **Rationale**: ...
28
- - **Trade-off**: Accepted [downside] in exchange for [upside]
29
- - **sequentialthinking rounds**: rounds 3-5
30
-
31
- ### AD-02: ...
32
-
33
- ## System Architecture Diagram
34
-
35
- ```mermaid
36
- flowchart TB
37
- <!-- actual data flow generated by flow-architect -->
38
- User[User] --> API[API Gateway]
39
- API --> Auth[Auth Service]
40
- Auth --> DB[(Database)]
41
- ```
29
+ - **Trade-off**: ... (omit if there is no genuine tradeoff)
42
30
 
43
31
  ## Component Design
44
32
 
45
- <!-- Each component is independently testable. Interfaces are explicit. -->
33
+ <!-- Each component: responsibility, input type, output type, dependencies, error path. Skip if the feature is a single module with no internal boundaries worth naming. -->
46
34
 
47
- ### Component: {{COMP_NAME_1}}
35
+ ### Component: {{COMP_NAME}}
48
36
  - **Responsibility**: ...
49
- - **Input**:
50
- ```ts
51
- interface Input {
52
- field: Type;
53
- }
54
- ```
55
- - **Output**:
56
- ```ts
57
- interface Output {
58
- field: Type;
59
- }
60
- ```
61
- - **Dependencies**: Component X, Library Y
62
- - **Errors**:
63
- - `ErrorCode.X` — when ... happens
64
- - `ErrorCode.Y` — when ... happens
65
-
66
- ### Component: {{COMP_NAME_2}}
67
- <!-- ... -->
68
-
69
- ## Data Model
70
-
71
- <!-- Database schema / data structures -->
72
-
73
- ### Entity: ...
74
- ```sql
75
- CREATE TABLE ... (
76
- id UUID PRIMARY KEY,
77
- ...
78
- );
79
- ```
37
+ - **Input**: `interface Input { ... }`
38
+ - **Output**: `interface Output { ... }`
39
+ - **Dependencies**: ...
40
+ - **Errors**: ...
80
41
 
81
- ### Or TypeScript types:
82
- ```ts
83
- interface Entity {
84
- id: string;
85
- ...
86
- }
87
- ```
42
+ ## Data Model (if the feature touches persistence or structured data)
88
43
 
89
- ## State Machine (if applicable)
44
+ <!-- SQL schema, TypeScript types, or API payload shape. Delete if the feature has no meaningful data shape. -->
45
+
46
+ ## Architecture Diagram (include only when it clarifies; prose often suffices)
90
47
 
91
48
  ```mermaid
92
- stateDiagram-v2
93
- [*] --> Pending
94
- Pending --> Active: approve
95
- Pending --> Rejected: reject
96
- Active --> Completed: finish
49
+ flowchart TB
50
+ ...
97
51
  ```
98
52
 
99
- ## Error Path Design
53
+ ## State Machine (include only if the feature has non-trivial state transitions)
100
54
 
101
- <!-- Full flow on failure -->
55
+ ## Error Path Design (include when error behavior is not obvious)
102
56
 
103
- | Scenario | Upstream Behavior | System Response | User-visible |
104
- |-----|--------|---------|---------|
105
- | DB connection lost | retry 3 times | return 503 | "Temporarily unavailable, retry in 1 minute" |
106
- | Rate limit hit | none | return 429 | "Too many requests, retry in 60 seconds" |
57
+ | Scenario | System Response | User-visible |
58
+ |-----|---------|---------|
59
+ | ... | ... | ... |
107
60
 
108
- ## API Contract
109
-
110
- <!-- If this is an API project -->
61
+ ## API Contract (include only if this feature exposes or changes an API)
111
62
 
112
63
  ```yaml
113
- POST /api/v1/...
114
- Request:
115
- body:
116
- field: string
117
- Response:
118
- 200:
119
- body:
120
- field: string
121
- 400:
122
- body:
123
- error: string
64
+ ...
124
65
  ```
125
66
 
126
- ## Test Matrix
67
+ ## Test Matrix (brief — one line per layer)
127
68
 
128
69
  | Layer | Coverage | Tool |
129
70
  |---|-----|------|
130
- | Unit | All pure functions | vitest |
131
- | Integration | Between components | vitest + supertest |
132
- | E2E | Complete user flows | playwright / chrome-devtools MCP |
133
-
134
- ### Key Test Scenarios
135
- 1. Happy path: ...
136
- 2. Edge case 1: ...
137
- 3. Error recovery: ...
138
-
139
- ## Suggested Implementation Order
140
-
141
- <!-- Reference for decomposition in the tasks phase -->
142
-
143
- 1. Build skeleton first (Component A → empty implementation)
144
- 2. Then wire up the real logic (core logic of Component A)
145
- 3. Connect DB (persistence for Component A)
146
- 4. Then do Component B ...
147
-
148
- ## Risks and Mitigations
71
+ | ... | ... | ... |
149
72
 
150
- | Risk | Level | Mitigation |
151
- |-----|-----|------|
152
- | ... | medium | ... |
73
+ ## Risks and Mitigations (include only if risks exist that aren't obvious from the ADs)
153
74
 
154
75
  ## Defer to Implementation
155
76
 
156
- <!-- Decisions not worth spending time on in the design phase -->
77
+ <!-- Decisions explicitly deferred to when the executor writes the code. -->
157
78
 
158
- - Logging library choice → reuse project's existing one during implementation
159
- - Caching strategy → no caching initially, adjust based on data after launch
79
+ - ...
160
80
 
161
81
  ---
162
82
 
163
- _Generated by flow-architect agent on {{CREATED_DATE}}. After user reviews and approves AD-01~N, proceed to the tasks phase._
83
+ _Generated by flow-architect on {{CREATED_DATE}}._
@@ -9,86 +9,68 @@ depends_on: research.md
9
9
 
10
10
  # Requirements Spec: {{SPEC_NAME}}
11
11
 
12
- > **Recommended direction from the research phase**: {{RESEARCH_CONCLUSION}}
12
+ > **Recommended direction from research**: {{RESEARCH_CONCLUSION}}
13
13
  >
14
- > This phase: translate "technically feasible" into "concrete behaviors users benefit from".
14
+ > **Fill only the sections that carry real information for this feature.** Delete or collapse any section whose honest content would be "N/A" or "same as usual". Padding sections with "TBD" is worse than omitting them.
15
15
 
16
16
  ---
17
17
 
18
18
  ## User Stories
19
19
 
20
- <!-- Each story follows the format: As X, I want Y, so that Z -->
21
-
22
20
  ### US-01: ...
23
- **As** [user role],
24
- **I want** [capability],
25
- **so that** [business value].
21
+ **As** [user role], **I want** [capability], **so that** [business value].
26
22
 
27
23
  **Acceptance criteria**:
28
24
  - AC-1.1: [verifiable behavior]
29
- - AC-1.2: [verifiable behavior]
30
- - AC-1.3: [edge case handling]
25
+ - AC-1.2: ...
31
26
 
32
- ### US-02: ...
33
- <!-- ... -->
27
+ <!-- Add more US-NN blocks only if the feature genuinely has multiple independent user flows. -->
34
28
 
35
29
  ## Functional Requirements
36
30
 
37
- <!-- FR-NN format. Each FR must be a verifiable statement of "the system must X". -->
38
-
39
31
  - **FR-01**: The system must ...
40
- - **FR-02**: The system must ...
41
- - **FR-03**: ...
32
+ - **FR-02**: ...
42
33
 
43
34
  ## Non-Functional Requirements
44
35
 
45
- ### Performance
46
- - **NFR-P-01**: [e.g. P95 response time < 200ms]
47
- - **NFR-P-02**: ...
36
+ <!--
37
+ Include ONLY the NFR categories that this feature is actually constrained by.
38
+ For a small internal CRUD feature, "Performance / Security / Maintainability / Compatibility" as a four-bucket grid is usually padding.
39
+ Delete categories that have no real requirement, or collapse into one line: "NFR: standard for this stack, no special constraints."
40
+ -->
48
41
 
49
- ### Security
50
- - **NFR-S-01**: ...
51
- - **NFR-S-02**: ...
42
+ ### Performance (if applicable)
43
+ - **NFR-P-01**: ...
52
44
 
53
- ### Maintainability
54
- - **NFR-M-01**: ...
45
+ ### Security (if applicable)
46
+ - **NFR-S-01**: ...
55
47
 
56
- ### Compatibility
57
- - **NFR-C-01**: ...
48
+ <!-- Delete Maintainability / Compatibility sections unless they carry a real constraint. -->
58
49
 
59
50
  ## Edge Cases and Error Handling
60
51
 
61
- <!-- Must be explicit: what happens on failure? how are abnormal inputs handled? -->
52
+ <!-- Include rows only for scenarios that actually apply. -->
62
53
 
63
54
  | Scenario | Expected behavior |
64
55
  |-----|--------|
65
- | Network disconnected | ... |
66
- | Database exception | ... |
67
- | Invalid input | ... |
68
- | Concurrent conflict | ... |
56
+ | ... | ... |
69
57
 
70
58
  ## Out of Scope
71
59
 
72
- <!-- Karpathy principle 2: simplicity first. Explicitly list "not this time" to prevent scope creep. -->
73
-
74
- - ✗ Feature A — deferred to the next version
75
- - ✗ Feature B — out of budget
76
- - ✗ Feature C — needs its own spec
60
+ - ...
77
61
 
78
- ## Success Metrics
62
+ ## Success Metrics (if the feature has measurable outcomes)
79
63
 
80
- <!-- Must be quantifiable -->
64
+ <!-- Delete this section for internal tools or refactors with no user-visible metric. -->
81
65
 
82
- - Metric 1: [e.g. user signup completion rate > 80%]
83
- - Metric 2: [e.g. complaint rate < 1%]
66
+ - Metric 1: ...
84
67
 
85
68
  ## Open Questions
86
69
 
87
- <!-- Questions that need user answers -->
70
+ <!-- Include only if there are genuinely unresolved questions. Delete when empty. -->
88
71
 
89
- 1. **Question 1**: ...
90
- 2. **Question 2**: ...
72
+ 1. ...
91
73
 
92
74
  ---
93
75
 
94
- _Generated by flow-product-designer agent on {{CREATED_DATE}}. After user review, proceed to the design phase._
76
+ _Generated by flow-product-designer on {{CREATED_DATE}}._
@@ -10,105 +10,74 @@ status: in_progress
10
10
 
11
11
  > **Goal**: {{SPEC_GOAL}}
12
12
  >
13
- > Output of this phase. Subsequent requirements / design / tasks are all based on the conclusions of this document.
13
+ > **Fill only the sections that carry real information.** For a well-understood feature on a known stack, research legitimately compresses to: goal, one recommended direction, known constraints. Delete sections whose honest content would be "N/A" or "first time, nothing to fetch". Padding this document with "TBD" is worse than omitting sections.
14
14
 
15
15
  ---
16
16
 
17
- ## Prior Experience (from claude-mem)
18
-
19
- <!--
20
- flow-researcher first calls mcp__claude_mem__search to retrieve relevant history.
21
- If there are relevant observations, summarize them here; if not, write "(first research on this topic)".
22
- -->
17
+ ## Prior Experience (from claude-mem, if relevant)
23
18
 
24
19
  {{CLAUDE_MEM_FINDINGS}}
25
20
 
26
- ## Problem Understanding
21
+ <!-- Delete this section if there are no relevant prior observations. -->
27
22
 
28
- <!-- Translate the user's goal into technical language. Explicitly list assumptions. -->
23
+ ## Problem Understanding
29
24
 
30
25
  ### Core Problem
31
- <!-- One-line description of what we are solving -->
26
+ <!-- One sentence. What are we solving? -->
32
27
 
33
28
  ### Explicit Assumptions
34
- <!-- Karpathy principle 1: think before coding. List all assumptions for the user to confirm -->
29
+ <!-- Only real assumptions that matter. Don't list "assumption: we will write code." -->
30
+
35
31
  - Assumption 1: ...
36
- - Assumption 2: ...
37
32
 
38
33
  ### Known Constraints
39
- - Tech stack:
40
- - Budget / time:
41
- - Team capability:
42
- - Compliance requirements:
43
-
44
- ## Technical Solution Space
34
+ <!-- Include only the constraints that actually shape the solution. -->
45
35
 
46
- <!-- List 2-3 possible approaches with their pros and cons. Pick one in the design phase. -->
36
+ - Tech stack: ...
37
+ - Time budget: ...
38
+ - (Compliance, team capability, etc — only if they constrain this feature)
47
39
 
48
- ### Option A: ...
49
- - **Pros**:
50
- - **Cons**:
51
- - **Complexity**: low / medium / high
52
- - **Docs (context7 queries)**:
53
- - `library-name@version`: ...
40
+ ## Technical Solution Space
54
41
 
55
- ### Option B: ...
56
- - **Pros**:
57
- - **Cons**:
58
- - **Complexity**: low / medium / high
42
+ <!--
43
+ If one approach is clearly the right call for this stack, write only that approach with its rationale.
44
+ Include alternative options ONLY when there is a genuine tradeoff a thoughtful engineer might disagree on.
45
+ Do not invent Option B and Option C just to fill the template.
46
+ -->
59
47
 
60
- ### Option C (optional): ...
48
+ ### Recommended Approach: ...
49
+ - **Why**: ...
50
+ - **Complexity**: ...
51
+ - **Key APIs verified via context7**: ...
61
52
 
62
- ## Existing Code Analysis
53
+ ### Alternative: ... (include only if a real alternative exists)
63
54
 
64
- <!-- Codebase scan results. Which existing modules can be reused? Which need to be new? -->
55
+ ## Existing Code Analysis (include only if the codebase has relevant prior work)
65
56
 
66
57
  ### Reusable Modules
67
- - `path/to/existing-module.ts` — ...
68
-
69
- ### Modules to Create
70
- - `path/to/new-module.ts` — ...
71
-
72
- ### Modules to Modify
73
- - `path/to/modify.ts` — ...
74
-
75
- ## Latest Documentation Summary (context7)
76
-
77
- <!-- Latest APIs / best practices found by flow-researcher via mcp__context7__* -->
78
-
79
- ### {{LIBRARY_1}}
80
- - Version:
81
- - Relevant APIs:
82
- - Gotchas / changes:
83
-
84
- ### {{LIBRARY_2}}
85
- - ...
86
-
87
- ## Feasibility Assessment
58
+ - `path/to/module` — ...
88
59
 
89
- <!-- Explicitly answer: can this be done? how hard is it? -->
60
+ ### New Modules Required
61
+ - `path/to/new` — ...
90
62
 
91
- - **Feasibility**: feasible / ⚠ risky / ✗ not recommended
92
- - **Estimated complexity**: 1-10
93
- - **Main risks**:
94
- - Risk 1: ...
95
- - Risk 2: ...
63
+ ## Latest Documentation Summary
96
64
 
97
- ## Recommended Direction
65
+ <!-- Only include libraries whose API is version-sensitive AND used by this feature. Do not cite every library in the stack. -->
98
66
 
99
- <!-- Research conclusion: which option is recommended and why. If multiple options need discussion, explain here. -->
67
+ ### {{LIBRARY}}
68
+ - Version: ...
69
+ - Relevant APIs: ...
70
+ - Gotchas: ...
100
71
 
101
- **Recommendation**: Option ?
102
- **Rationale**:
103
- **To confirm in the design phase**:
72
+ ## Feasibility
104
73
 
105
- ## Open Questions
74
+ - **Verdict**: feasible / risky / not recommended
75
+ - **Main risks**: (only if real risks exist)
106
76
 
107
- <!-- Questions the research phase couldn't answer, to be deferred to later phases or asked of the user -->
77
+ ## Open Questions (delete if none)
108
78
 
109
79
  1. ...
110
- 2. ...
111
80
 
112
81
  ---
113
82
 
114
- _Generated by flow-researcher agent on {{CREATED_DATE}}. Subsequent phases continue from this document._
83
+ _Generated by flow-researcher on {{CREATED_DATE}}._
@@ -5,137 +5,80 @@ created: {{CREATED_DATE}}
5
5
  version: 1.0
6
6
  status: in_progress
7
7
  depends_on: design.md
8
- task_size: fine
9
8
  ---
10
9
 
11
10
  # Task Breakdown: {{SPEC_NAME}}
12
11
 
13
- > POC-First 5 Phases: **work → refactor → test → quality gates → PR lifecycle**.
12
+ > POC-First is an **orientation, not a mandate**. Use the phases below as an organizing idea and **delete phases that don't apply to this feature**. A bug-fix may be one task. A prototype may skip Phase 2 (refactor) and Phase 5 (PR lifecycle). A library may skip the PR lifecycle entirely. Forcing all five phases for a small feature is the padding pattern this template is designed to prevent.
14
13
  >
15
- > Each task includes: `Do`, `Files`, `Done-when`, `Verify`, `Commit`. Verifiable via automation.
14
+ > Each task includes whatever of `Do`, `Files`, `Done-when`, `Verify`, `Commit` is needed for the executor to finish it in a single sub-agent dispatch. Verify must be an automated command (no "manual test").
16
15
 
17
16
  ---
18
17
 
19
18
  ## Marker Rules
20
19
 
21
20
  - `[ ]` TODO / `[x]` done
22
- - `[P]` parallel-safe (can be dispatched in parallel within the same wave)
23
- - `[VERIFY]` quality checkpoint (run by the flow-verifier agent)
21
+ - `[P]` parallel-safe (dispatch in parallel within the same wave)
22
+ - `[VERIFY]` quality checkpoint (flow-verifier agent)
24
23
  - `[SEQUENTIAL]` must be serial (breaks the parallel group)
25
24
 
26
25
  ---
27
26
 
28
27
  ## Phase 1: Make It Work (POC)
29
28
 
30
- > Goal: get it running end-to-end. Hardcoding is acceptable; skip tests.
29
+ > Goal: end-to-end runnable. Hardcoding is acceptable; skip tests here.
31
30
 
32
- - [ ] **1.1** [P] Initialize module skeleton
33
- - **Do**: create `src/{{MODULE}}/` directory, add `index.ts`, `types.ts`
34
- - **Files**: `src/{{MODULE}}/index.ts`, `src/{{MODULE}}/types.ts`
35
- - **Done when**: directory exists, `import {} from './{{MODULE}}'` does not error
36
- - **Verify**: `npx tsc --noEmit`
37
- - **Commit**: `feat({{MODULE}}): initialize module skeleton`
38
- - _Requirements_: FR-01
31
+ <!-- Add only the tasks this feature genuinely needs. Do not invent skeleton tasks to "round out" the phase. -->
39
32
 
40
- - [ ] **1.2** [P] ...
33
+ - [ ] **1.1** ...
41
34
  - **Do**: ...
42
35
  - **Files**: ...
43
36
  - **Done when**: ...
44
37
  - **Verify**: ...
45
38
  - **Commit**: ...
46
- - _Requirements_: ...
47
- - _Design_: AD-01
39
+ - _Requirements_: FR-NN
48
40
 
49
- - [ ] **1.3** [VERIFY] End-to-end POC verification
50
- - **Do**: run the happy path manually, confirm the core scenario works
51
- - **Verify**: `curl http://localhost:3000/... | jq`
52
- - **Done when**: returns expected data (edge cases may still be wrong)
41
+ - [ ] **1.X** [VERIFY] End-to-end POC verification
42
+ - **Verify**: `<command>`
43
+ - **Done when**: happy path returns the expected result
53
44
 
54
- ## Phase 2: Refactoring
45
+ ## Phase 2: Refactoring (delete if the POC is already clean)
55
46
 
56
- > Goal: clean up the code structure. Behavior unchanged.
57
-
58
- - [ ] **2.1** Extract duplicated logic
59
- - **Do**: ...
60
- - **Verify**: `npx tsc --noEmit && git diff --stat`
61
- - **Commit**: `refactor({{MODULE}}): extract common logic`
62
-
63
- - [ ] **2.2** [VERIFY] Refactor does not break behavior
64
- - **Verify**: rerun the manual test from Phase 1
65
- - **Done when**: all outputs match
47
+ > Include only if the POC has genuine duplication or structural mud that warrants cleanup. Skip for tiny features.
66
48
 
67
49
  ## Phase 3: Testing (TDD red / green / yellow)
68
50
 
69
- > Rule: tests first. Let the test fail first (RED), then implement (GREEN), then clean up (YELLOW).
70
-
71
- - [ ] **3.1** [RED] Write failing tests — unit
72
- - **Do**: write unit tests for core functions
73
- - **Files**: `src/{{MODULE}}/*.test.ts`
74
- - **Verify**: `npm test -- src/{{MODULE}}` — expected to fail
75
- - **Commit**: `test({{MODULE}}): red - add unit tests for core logic`
76
-
77
- - [ ] **3.2** [GREEN] Make tests pass
78
- - **Do**: fix the implementation so the tests from 3.1 pass
79
- - **Verify**: `npm test -- src/{{MODULE}}` — all green
80
- - **Commit**: `feat({{MODULE}}): green - satisfy unit tests`
81
-
82
- - [ ] **3.3** [YELLOW] Refactor and clean up
83
- - **Do**: clean up the implementation, tests still pass
84
- - **Commit**: `refactor({{MODULE}}): yellow - clean implementation`
51
+ > Rule: tests first. Red → Green → Yellow. **Collapse red+green into one task when the test and implementation are trivially paired**; split only when the test genuinely precedes a nontrivial implementation.
85
52
 
86
- - [ ] **3.4** [RED GREEN YELLOW] Integration tests
87
- - <!-- Repeat the TDD cycle -->
53
+ - [ ] **3.X** [RED→GREEN→YELLOW] ...
88
54
 
89
- - [ ] **3.5** [VERIFY] Coverage check
90
- - **Verify**: `npm test -- --coverage` core logic > 80%
55
+ - [ ] **3.X+1** [VERIFY] Coverage check
56
+ - **Verify**: coverage on the changed surface ≥ project standard
91
57
 
92
58
  ## Phase 4: Quality Gates
93
59
 
94
- > Full local checks. Last gate before CI.
95
-
96
- - [ ] **4.1** TypeScript strict check
97
- - **Verify**: `npx tsc --strict --noEmit` — 0 errors
98
- - **Commit**: `chore({{MODULE}}): tsc strict passes`
99
-
100
- - [ ] **4.2** Lint
101
- - **Verify**: `npx eslint src/{{MODULE}}` — 0 errors, 0 warnings
102
-
103
- - [ ] **4.3** All tests pass
104
- - **Verify**: `npm test` — all green
105
-
106
- - [ ] **4.4** [VERIFY] Final health check
107
- - **Do**: flow-verifier agent performs goal-driven reverse verification
108
- - **Done when**: every FR-XX and AC-X.Y has a corresponding automated verification
109
-
110
- ## Phase 5: PR Lifecycle
60
+ > Include only the checks this project actually runs. `npx eslint` is dead weight if the project uses biome. `tsc --strict` is dead weight for a JS project.
111
61
 
112
- - [ ] **5.1** Generate PR
113
- - **Do**: `/flow-ship` creates the PR
114
- - **Done when**: PR URL returned, description is clear
62
+ - [ ] **4.X** [VERIFY] Final health check
63
+ - **Do**: flow-verifier performs goal-driven reverse verification
64
+ - **Done when**: every FR/AC has an automated check
115
65
 
116
- - [ ] **5.2** Respond to review feedback
117
- - **Do**: iterate until approved
118
- - **Verify**: CI all green
66
+ ## Phase 5: PR Lifecycle (delete for local-only work, scripts, internal tools without a PR flow)
119
67
 
120
- - [ ] **5.3** Merge
121
- - **Do**: `/flow-land`
122
- - **Verify**: the main branch contains all commits for this spec
68
+ - [ ] **5.X** Ship / Land
123
69
 
124
70
  ---
125
71
 
126
72
  ## Coverage Audit
127
73
 
128
- <!-- Final step for flow-planner: confirm every FR / AC / AD / D has a corresponding task -->
74
+ <!-- flow-planner fills this in. Every FR / AC / AD / D must map to a task, or explicitly defer with reason. -->
129
75
 
130
76
  | Requirement ID | Task(s) | Status |
131
77
  |--------|---------|------|
132
- | FR-01 | 1.2, 3.1 | ✓ |
133
- | FR-02 | ... | ⚠ uncovered — needs adding |
134
- | AD-01 | 1.1 | ✓ |
135
- | D-05 (STATE.md) | ... | ✓ |
78
+ | FR-01 | ... | ✓ |
136
79
 
137
- **Uncovered items must be handled**: add a task or document the deferral reason in STATE.md.
80
+ **Uncovered items must be handled**: add a task, or document the deferral reason in STATE.md.
138
81
 
139
82
  ---
140
83
 
141
- _Generated by flow-planner agent on {{CREATED_DATE}}. N tasks total, estimated X hours._
84
+ _Generated by flow-planner on {{CREATED_DATE}}._
@@ -1,31 +0,0 @@
1
- #!/usr/bin/env bash
2
- # CurDX-Flow PostToolUseFailure Hook
3
- # Tracks consecutive tool failures to enable pua integration (Phase 4+).
4
- # For now, just maintains a counter in plugin data directory.
5
- #
6
- # Future: when pua is installed and fail_count >= threshold, auto-invoke /pua:pua.
7
-
8
- set -u
9
-
10
- DATA_DIR="${CLAUDE_PLUGIN_DATA:-$HOME/.claude/plugins/data/curdx-flow}"
11
- COUNTER="$DATA_DIR/fail-count"
12
-
13
- mkdir -p "$DATA_DIR" 2>/dev/null || true
14
-
15
- # Read current count
16
- CURRENT=0
17
- [ -f "$COUNTER" ] && CURRENT="$(cat "$COUNTER" 2>/dev/null || echo 0)"
18
-
19
- # Increment
20
- NEXT=$((CURRENT + 1))
21
- echo "$NEXT" > "$COUNTER" 2>/dev/null || true
22
-
23
- # Placeholder for future pua escalation (Phase 4+):
24
- # if [ "$NEXT" -ge 2 ] && command -v claude >/dev/null 2>&1; then
25
- # if claude plugin list 2>/dev/null | grep -q 'pua'; then
26
- # # Inject escalation suggestion via hook output
27
- # ...
28
- # fi
29
- # fi
30
-
31
- exit 0