@lbruton/specflow 3.5.11 → 3.5.13

package/dist/index.js CHANGED
File without changes
@@ -28,7 +28,6 @@ git diff {BASE_SHA}..{HEAD_SHA}
  - Sound design decisions?
  - Scalability considerations?
  - Performance implications?
- - Security concerns?
 
  **Testing:**
  - Tests actually test logic (not just mocks)?
@@ -49,6 +48,55 @@ git diff {BASE_SHA}..{HEAD_SHA}
  - Documentation complete?
  - No obvious bugs?
 
+ ## Security Review
+
+ > **Pairing:** This section complements (does NOT replace) the Codacy SRM scan in tasks.md C5.5. Codacy catches known-pattern findings; this human/AI review catches contextual issues Codacy can't see (business logic flaws, trust boundary violations, intent vs implementation gaps). Run BOTH.
+
+ Read the design.md `Security Considerations` section first to know what the spec author declared. If they declared `Security Impact: No`, verify that's actually true by walking the checklist below — silent assumptions are how regressions ship.
+
+ ### Input Validation & Trust Boundaries
+ - **Untrusted input identified:** Where does data cross from user-controlled to system-trusted? Form fields, query params, file uploads, API payloads, environment variables, file paths.
+ - **Validation present:** Type, length, format, allow-list (preferred over deny-list), encoding.
+ - **Validation location:** Server-side validation MUST exist even if client-side validation is present. Client-only validation is a finding.
+ - **Sanitization for sinks:** HTML escape for DOM, parameterized queries for SQL, path normalization for file system, command argument arrays (never string concat) for shell.
+
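Reviewer note: the allow-list and server-side validation bullets above can be illustrated with a small sketch. The `sort` parameter, the `ALLOWED_SORT_FIELDS` values, `parseSortField`, and the 32-character bound are all hypothetical names chosen for this example; none of them come from the specflow package.

```typescript
// Hypothetical allow-list for a `sort` query parameter (illustrative values only).
const ALLOWED_SORT_FIELDS = ["name", "createdAt", "price"] as const;
type SortField = (typeof ALLOWED_SORT_FIELDS)[number];

type ValidationResult =
  | { ok: true; value: SortField }
  | { ok: false; reason: string };

// Server-side check in three steps: type, then length, then allow-list membership.
function parseSortField(input: unknown): ValidationResult {
  if (typeof input !== "string") {
    return { ok: false, reason: "expected a string" };
  }
  if (input.length === 0 || input.length > 32) {
    return { ok: false, reason: "length out of range" };
  }
  const match = ALLOWED_SORT_FIELDS.find((field) => field === input);
  if (match === undefined) {
    return { ok: false, reason: "value not in allow-list" };
  }
  return { ok: true, value: match };
}
```

Because the function returns the matched allow-list entry rather than the raw input, downstream code only ever sees one of the known-good values.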
+ ### Authentication & Authorization
+ - **Auth checks present:** Every protected endpoint/action verifies the caller's identity.
+ - **Authz checks present:** Every protected resource verifies the caller has permission for THIS specific resource (not just "logged in").
+ - **No auth bypass:** Search for hardcoded credentials, debug flags, test-only branches that skip auth in production code paths.
+ - **Session/token handling:** Tokens stored securely (httpOnly cookies, secure storage), proper expiration, no tokens in URLs or logs.
+
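Reviewer note: the difference between "logged in" and "allowed to touch THIS resource" is easy to miss in review, so here is a minimal sketch of a per-resource check. The `Caller` and `Invoice` shapes, the `admin` role, and `canReadInvoice` are assumptions made for illustration only.

```typescript
// Hypothetical shapes; a real project would use its own user and resource models.
interface Caller {
  id: string;
  roles: string[];
}

interface Invoice {
  id: string;
  ownerId: string;
}

// Authentication answers "who is calling"; authorization answers
// "may this caller read this particular invoice".
function canReadInvoice(caller: Caller | null, invoice: Invoice): boolean {
  if (caller === null) return false; // not authenticated
  if (caller.roles.includes("admin")) return true; // role-based override
  return invoice.ownerId === caller.id; // per-resource ownership check
}
```

A handler that only tests `caller !== null` would pass the first bullet and fail the second, which is the gap this section asks the reviewer to look for.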
+ ### Secrets & Credentials
+ - **No hardcoded secrets:** Grep the diff for `api_key`, `password`, `token`, `secret`, `Bearer`, AWS-style keys, JWT-shaped strings.
+ - **Secrets sourced correctly:** From Infisical, env vars, or platform secrets manager — never from code, config files, or test fixtures.
+ - **No secrets in logs:** Verify that secret values, tokens, and credentials are not logged at any level (info, debug, error).
+ - **No secrets in error messages:** User-facing error messages must not echo back any secret material.
+
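Reviewer note: "grep the diff" can be approximated with a few regular expressions run over the added lines only. The sketch below is an assumption about how one might do that; the pattern list is deliberately small and illustrative, and `findSecretCandidates` is a hypothetical helper, not part of specflow or Codacy.

```typescript
// Illustrative patterns only; dedicated scanners ship far more complete rule sets.
const SECRET_PATTERNS: { label: string; pattern: RegExp }[] = [
  {
    label: "keyword assignment",
    pattern: /\b(api_key|password|token|secret)\b\s*[:=]\s*["'][^"']+["']/i,
  },
  { label: "bearer token", pattern: /Bearer\s+[A-Za-z0-9._~+\/-]{20,}/ },
  { label: "AWS access key id", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  {
    label: "JWT-shaped string",
    pattern: /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/,
  },
];

interface SecretHit {
  lineNumber: number; // 1-based line number within the diff text
  label: string;
}

// Scan only added lines of a unified diff: those starting with "+",
// excluding the "+++" file header.
function findSecretCandidates(diffText: string): SecretHit[] {
  const hits: SecretHit[] = [];
  diffText.split("\n").forEach((line, index) => {
    if (!line.startsWith("+") || line.startsWith("+++")) return;
    for (const { label, pattern } of SECRET_PATTERNS) {
      if (pattern.test(line)) hits.push({ lineNumber: index + 1, label });
    }
  });
  return hits;
}
```

Hits are candidates for a human to read, not verdicts; a match in a test fixture or documentation string still needs a judgment call.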
+ ### Injection Risks
+ - **SQL injection:** Parameterized queries only. String concatenation into SQL is a Critical finding.
+ - **Command injection:** No `exec`, `Function()`, or template strings used to build shell commands from untrusted input. Use array-form arguments (e.g., `spawn` or `execFile` without `shell: true`).
+ - **Path traversal:** User-supplied paths are normalized and verified to stay within an allowed root directory.
+ - **Cross-site scripting (XSS):** All user content rendered via safe APIs (`textContent`, framework escaping, sanitization libraries), never `innerHTML` with raw input.
+ - **Cross-site request forgery (CSRF):** State-changing endpoints require CSRF tokens or SameSite cookies.
+ - **Server-side request forgery (SSRF):** Outbound HTTP calls to user-supplied URLs validate the host against an allow-list.
+
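Reviewer note: of the injection checks listed above, the SSRF allow-list is the one most often implemented with string matching, which is fragile. The following sketch parses the URL first and compares the parsed hostname. `ALLOWED_HOSTS`, its entries, and `checkOutboundUrl` are hypothetical, and the sketch assumes HTTPS-only outbound calls.

```typescript
// Hypothetical allow-list; a real service would load this from configuration.
const ALLOWED_HOSTS = new Set(["api.example.com", "cdn.example.com"]);

// Returns the parsed URL when it is acceptable to fetch, or null otherwise.
function checkOutboundUrl(raw: string): URL | null {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return null; // not a valid absolute URL
  }
  if (url.protocol !== "https:") return null;
  // Reject embedded credentials such as https://allowed-host@evil.test/
  if (url.username !== "" || url.password !== "") return null;
  if (!ALLOWED_HOSTS.has(url.hostname)) return null;
  return url;
}
```

This covers only the host check named in the bullet. Redirects and DNS rebinding are separate concerns that this sketch does not address.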
+ ### Data Exposure
+ - **Sensitive data classification:** PII, financial, health, location, credentials — all flagged for secure handling.
+ - **Logging hygiene:** No PII in logs without redaction. No request/response body dumps containing user data.
+ - **Error responses:** Don't leak stack traces, file paths, or internal IDs to unauthenticated users.
+ - **API response shape:** No fields returned that the caller shouldn't see (e.g., other users' data, internal flags, password hashes).
+
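Reviewer note: for the "API response shape" bullet, one pattern worth looking for is an explicit mapping from the stored record to the public shape, so new columns do not leak by default. The `UserRecord` and `PublicUser` types and `toPublicUser` below are illustrative assumptions.

```typescript
// Hypothetical storage model; field names are illustrative.
interface UserRecord {
  id: string;
  displayName: string;
  email: string;
  passwordHash: string;
  isInternalTester: boolean;
}

// Public shape, listed field by field.
interface PublicUser {
  id: string;
  displayName: string;
}

// Build the response explicitly instead of spreading the record and deleting fields.
function toPublicUser(record: UserRecord): PublicUser {
  return { id: record.id, displayName: record.displayName };
}
```

The opposite pattern, spreading the record and deleting known-sensitive keys, is a deny-list and fails open when a new sensitive field is added later.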
+ ### Dependencies & Supply Chain
+ - **New dependencies justified:** Each new package in `package.json`/`requirements.txt`/etc. is necessary and from a reputable source.
+ - **Version pinning:** New deps are pinned to specific versions (not `*`, `latest`, or an open range such as `^1.2.3`).
+ - **Known vulnerabilities:** Run the project's audit command (`npm audit`, `pip-audit`, etc.) — any High/Critical findings must be addressed or documented.
+ - **Supply chain markers:** Watch for typo-squatting (suspicious near-name packages), recently published packages, and packages with unusual install scripts.
+
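Reviewer note: the version pinning bullet can be checked mechanically. This sketch assumes that "pinned" means an exact `major.minor.patch` version, optionally with a prerelease tag; `findUnpinned` is a hypothetical helper, and a project that relies on a lockfile may reasonably accept ranges instead.

```typescript
// Exact versions such as "1.2.3" or "1.2.3-beta.1" count as pinned in this sketch.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/;

// Returns the names of dependencies whose spec is a range, a dist-tag, or a wildcard.
function findUnpinned(dependencies: Record<string, string>): string[] {
  return Object.entries(dependencies)
    .filter(([, spec]) => !EXACT_VERSION.test(spec))
    .map(([name]) => name)
    .sort();
}
```

Run against the `dependencies` and `devDependencies` objects of a parsed `package.json`, it returns the entries a reviewer should ask about.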
+ ### Codacy SRM Cross-Reference
+ - **Read tasks.md C5.5 results:** What did Codacy find on the changed files? Are all Critical/High findings addressed (fixed or explicitly waived with justification)?
+ - **Look for Codacy blind spots:** Codacy excels at known patterns. It does NOT catch business logic flaws, race conditions, multi-step authorization gaps, or "the right code in the wrong place." Those are this reviewer's job.
+ - **If C5.5 was skipped:** That's a finding. Security gate cannot be bypassed.
+
  ## Output Format
 
  ### Strengths
@@ -56,23 +104,31 @@ git diff {BASE_SHA}..{HEAD_SHA}
 
  ### Issues
 
- #### Critical (Must Fix)
- [Bugs, security issues, data loss risks, broken functionality]
+ #### Critical (Must Fix — blocks PR)
+ [Bugs, security issues, data loss risks, broken functionality, auth bypasses, injection vectors, secret leaks]
 
- #### Important (Should Fix)
- [Architecture problems, missing features, poor error handling, test gaps]
+ #### Important (Should Fix — blocks PR unless explicitly waived)
+ [Architecture problems, missing security controls, poor error handling, test gaps, missing validation]
 
- #### Minor (Nice to Have)
+ #### Minor (Nice to Have — advisory)
  [Code style, optimization opportunities, documentation improvements]
 
  **For each issue:**
  - File:line reference
  - What's wrong
- - Why it matters
+ - Why it matters (especially for security findings — explain the threat model)
  - How to fix (if not obvious)
 
+ ### Security Assessment
+
+ **Security review status:** [✅ PASS / ⚠️ FINDINGS / ❌ FAIL]
+
+ **Codacy SRM cross-reference:** [Aligned / Discrepancy — explain]
+
+ **Threat model coverage:** [Did design.md Security Considerations match what was actually implemented? Any unstated risks?]
+
  ### Assessment
 
  **Ready to proceed?** [Yes / No / With fixes]
 
- **Reasoning:** [1-2 sentence technical assessment]
+ **Reasoning:** [1-2 sentence technical assessment, including security posture]
@@ -129,6 +129,44 @@ _If **No**, skip this section entirely. If **Yes**, complete all fields below
  - **Handling:** [How to handle]
  - **User Impact:** [What user sees]
 
+ ## Security Considerations
+
+ > **GATE:** This section must be filled in for every spec, even if the answer is "no security impact." A spec that explicitly declares no security impact is fine; a spec that omits this section is not — it cannot be assessed by reviewers and will fail the readiness gate.
+
+ ### Security Impact: [Yes / No / Minimal]
+
+ _If **No** or **Minimal**, briefly justify (e.g., "string rename with no input/auth/network changes"). If **Yes**, complete the fields below — these gate the Codacy SRM scan in tasks.md C5.5._
+
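Reviewer note: the GATE above says a spec fails readiness when the section is omitted. How the actual readiness gate detects that is not shown in this diff, so the following is only a sketch of one possible check. `checkSecuritySection` is a hypothetical helper that treats the unfilled template placeholder `[Yes / No / Minimal]` as "not declared".

```typescript
type SecurityImpact = "Yes" | "No" | "Minimal";

type GateResult =
  | { pass: true; impact: SecurityImpact }
  | { pass: false; reason: string };

// Checks that design.md contains the section heading and a filled-in impact declaration.
function checkSecuritySection(designMd: string): GateResult {
  if (!/^## Security Considerations\s*$/m.test(designMd)) {
    return { pass: false, reason: "Security Considerations section is missing" };
  }
  const match = /^### Security Impact:\s*(Yes|No|Minimal)\s*$/m.exec(designMd);
  if (match === null) {
    return {
      pass: false,
      reason: "Security Impact is not declared as Yes, No, or Minimal",
    };
  }
  return { pass: true, impact: match[1] as SecurityImpact };
}
```

A declared `No` passes this check, which matches the template's statement that an explicit "no security impact" is acceptable while an omitted section is not.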
+ ### Sensitive Data Touched
+ - **User input parsed/validated:** [list inputs that cross trust boundaries — form fields, query params, file uploads, API payloads]
+ - **Authentication/authorization changes:** [any change to who can access what — new roles, new permission checks, session/token handling]
+ - **Secrets or credentials handled:** [API keys, tokens, passwords, encryption keys — and where they're sourced from, e.g., Infisical]
+ - **PII or sensitive user data:** [any field that contains personal info, financial data, health data, location, etc.]
+
+ ### Threat Surface Changes
+ - **New network endpoints:** [URLs/methods being exposed, with auth requirements]
+ - **New external dependencies:** [npm/pip packages being added — supply-chain risk]
+ - **New file system access:** [paths being read/written, especially user-controlled paths]
+ - **New shell commands or eval:** [any `exec`, `eval`, `Function()`, template strings used as commands — injection risk]
+ - **New deserialization of untrusted input:** [JSON/YAML/binary formats parsing untrusted bytes — deserialization vulnerabilities]
+
+ ### Input Validation Strategy
+ - **Where validation happens:** [client / server / both]
+ - **What is validated:** [type, length, format, allow-list, deny-list]
+ - **Sanitization for storage/display:** [HTML escape, SQL parameterization, path normalization]
+
+ ### Codacy SRM Pre-Check (advisory)
+ - **Run before tasks.md C5.5:** Check existing Codacy findings against the files this spec will modify (use `mcp__codacy__codacy_list_repository_issues` filtered to changed files)
+ - **Findings to address up front:** [list any pre-existing Critical/High findings on files this spec touches, so they can be fixed in scope rather than orphaned]
+
+ ### Codacy SRM Gate (enforced in tasks.md C5.5)
+ - The spec workflow's tasks.md template includes C5.5 Security Review which runs Codacy SRM against changed files post-implementation
+ - This design section captures the **intent and threat model**; C5.5 verifies the implementation
+ - For specs declaring `Security Impact: No`, C5.5 still runs but is expected to find zero new issues — any Critical/High finding becomes a hard gate
+
+ ### Out-of-Scope Security Notes
+ [Anything related to security that this spec does NOT address but should be tracked separately — file as a follow-up issue rather than expanding this spec's scope]
+
  ## Testing Strategy
 
  ### Automated Tests
@@ -6,12 +6,32 @@
  - **GitHub PR:** [#NNN](https://github.com/owner/repo/pull/NNN)
  - **Spec Path:** `DocVault/specflow/{{projectName}}/specs/{{spec-name}}/`
 
+ ## File Touch Map
+
+ > **Why this section exists:** Surface the blast radius before any task begins. The Phase 4 orchestrator uses this map to decide which tasks can run in parallel (no overlapping files) versus serial (shared files). The readiness gate (Phase 3.9) verifies that every file referenced by any task below appears in this map. Reviewers use it as a 30-second preview of "how big is this change really" before reading individual tasks.
+
+ | Action | Files | Scope |
+ |--------|-------|-------|
+ | CREATE | `path/to/new/file.ext` | [one-line scope note] |
+ | MODIFY | `path/to/existing/file.ext` | [what changes — function names or rough line ranges] |
+ | DELETE | `path/to/dead/file.ext` | [why it's going away] |
+ | TEST | `tests/path/test_new.ext` | [what the new test file covers] |
+
+ **Total files touched:** N created, N modified, N deleted, N test files added/modified.
+
+ **Cross-cutting concerns:** [list any files where the spec also requires updates that aren't immediately obvious — e.g., `__init__.py` exports, `package.json`, env var docs, project conventions JSON]
+
+ **Parallel/serial dispatch hint:** Tasks that share NO files (no overlapping CREATE/MODIFY paths) are independent and SHOULD be dispatched in parallel. Tasks that share files MUST run sequentially. Group independent tasks into parallel batches.
+
+ ---
+
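Reviewer note: the parallel/serial dispatch hint in the File Touch Map section describes a grouping rule that can be sketched as a greedy batching pass. The Phase 4 orchestrator's real logic is not part of this diff; `Task`, `planBatches`, and the file names in the usage are assumptions made for illustration, and the sketch considers file overlap only, not other task dependencies.

```typescript
interface Task {
  id: string;
  files: string[]; // CREATE/MODIFY/DELETE paths from the File Touch Map
}

// Greedy batching: a task joins the earliest batch that comes after every
// earlier task it shares a file with, so conflicting tasks keep their order
// and tasks inside one batch never touch the same file.
function planBatches(tasks: Task[]): string[][] {
  const batches: string[][] = [];
  const lastBatchForFile = new Map<string, number>();
  for (const task of tasks) {
    let batchIndex = 0;
    for (const file of task.files) {
      const previous = lastBatchForFile.get(file);
      if (previous !== undefined) {
        batchIndex = Math.max(batchIndex, previous + 1);
      }
    }
    const batch = batches[batchIndex] ?? (batches[batchIndex] = []);
    batch.push(task.id);
    for (const file of task.files) {
      lastBatchForFile.set(file, batchIndex);
    }
  }
  return batches;
}
```

Each inner array can be dispatched in parallel; the outer array runs in sequence. With five tasks where task 3 shares `a.ts` with task 1 and task 4 shares `c.ts` with task 3, the plan is three batches: tasks 1, 2, and 5, then task 3, then task 4.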
  ## UI Prototype Gate (conditional — include ONLY if design.md declares `Has UI Changes: Yes` AND `Prototype Required: Yes`)
 
  > **BLOCKING:** Tasks 0.1–0.3 MUST be completed and approved before ANY task tagged `ui:true` begins.
  > If the spec has no UI changes, delete this entire section.
 
  - [ ] 0.1 Create visual mockup (Stitch or equivalent)
+ - **Recommended Agent:** Claude
  - Invoke the `ui-mockup` skill (Step 1–4) OR the `frontend-design` skill
  - If design.md references a prototype HTML file, use it as the starting point
  - Generate mockups for all states: populated, loading, empty, error
@@ -21,14 +41,16 @@
  - _Prompt: Role: UI/UX Designer | Task: Create visual mockups using the ui-mockup skill (Stitch) or frontend-design skill for all new/modified UI components described in design.md. Cover all visual states (populated, loading, empty, error) and theme variants (light, dark). If a reference HTML prototype exists at the path noted in design.md, use it as the baseline. | Restrictions: Do NOT write any production code. Output is mockup artifacts only. | Success: Stitch screen IDs or equivalent visual artifacts are generated and presented to the user for review._
 
  - [ ] 0.2 Build interactive prototype (Playground)
+ - **Recommended Agent:** Claude
  - Invoke the `playground` skill using the approved mockup as spec
- - Must use the project's actual tech stack (Bootstrap 5, CSS variables, data-theme attribute)
+ - Must use the project's actual tech stack (CSS framework, theme system, data attributes)
  - Include realistic sample data, interactive controls, and all data states
  - Purpose: Validate UX feel and interactions before production code
  - _Requirements: All UI-related requirements_
  - _Prompt: Role: Frontend Prototyper | Task: Build an interactive single-file HTML playground using the playground skill. Source visual design from the approved Stitch mockup (Task 0.1). Use the project's actual CSS framework and theme system. Include realistic data, clickable controls, hover states, and all data states (populated, loading, empty, error). | Restrictions: This is a throwaway prototype — do NOT integrate into the codebase. Must match the project's tech stack. | Success: User can interact with the prototype in a browser, validate layout/UX, and give explicit approval before implementation begins._
 
  - [ ] 0.3 Visual approval checkpoint
+ - **Recommended Agent:** Claude
  - Present prototype to user for explicit approval
  - Update design.md `Prototype Artifacts` section with Stitch IDs and playground file path
  - Purpose: Hard gate — no UI implementation proceeds without visual sign-off
@@ -71,9 +93,10 @@ After ALL tasks are [x] and implementation logs are recorded:
 
  ---
 
- ## Phase 1 — [Phase Name]
+ ## Phase 1 — [Phase Name — short descriptor matching design.md mermaid graph, e.g., "(Phase A — the domino)"]
 
  - [ ] 1. [Task title]
+ - **Recommended Agent:** [Claude / Codex / Gemini — pick based on task type: Claude for general implementation, Codex for verification scans and adversarial review, Gemini for long-context analysis]
  - File: `[file path]`
  - [What to implement — be specific about function names, line numbers, and code patterns]
  - [Second bullet if multi-part]
@@ -83,6 +106,7 @@ After ALL tasks are [x] and implementation logs are recorded:
  - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: [Role] | Task: [Detailed implementation instructions referencing specific file paths, line numbers, existing functions, and exact variable names. Include the complete behavior specification.] | Restrictions: [What NOT to do — other files to leave untouched, patterns to avoid, anti-patterns for this codebase] | Success: [Concrete, verifiable acceptance criteria — what works, what doesn't break] PREREQUISITE: Before writing any code, verify you are in the correct working context. If the project uses version.lock, confirm `git branch --show-current` returns patch/VERSION. If not, STOP and run /release patch first. Mark task as [-] in tasks.md before starting. BLOCKING: After implementation, you MUST call the log-implementation tool with full artifacts before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 
  - [ ] 2. [Task title]
+ - **Recommended Agent:** Claude
  - File: `[file path]`
  - [Implementation details]
  - Purpose: [Why]
@@ -95,6 +119,7 @@ After ALL tasks are [x] and implementation logs are recorded:
  ## Phase 2 — [Phase Name] (optional — remove if single-phase)
 
  - [ ] 3. [Task title]
+ - **Recommended Agent:** Claude
  - File: `[file path]`
  - [Implementation details]
  - Purpose: [Why]
@@ -107,6 +132,7 @@ After ALL tasks are [x] and implementation logs are recorded:
  ## Standard Closing Tasks
 
  - [ ] C1. Establish test baseline
+ - **Recommended Agent:** Claude
  - File: (no file changes — testing only)
  - Run the project's test command to establish a passing baseline before any implementation changes
  - If no test suite exists, flag this to the user and discuss whether to set one up
@@ -116,6 +142,7 @@ After ALL tasks are [x] and implementation logs are recorded:
  - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: QA Engineer | Task: Identify and run the project's established test suite to verify a passing baseline before implementation. Check project documentation and config files for the test command. Record the number of passing/failing/skipped tests. If no test suite exists, flag this to the user and ask whether to set one up before proceeding. | Restrictions: Use the project's existing test framework — do not introduce a new one. Do not modify any source files. | Success: Test suite runs and baseline results (pass/fail/skip counts) are recorded. PREREQUISITE: This is a verification-only task — no worktree changes needed. BLOCKING: After recording baseline, you MUST call the log-implementation tool with the test results before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 
  - [ ] C2. Write failing tests for new behavior (TDD — BEFORE implementation)
+ - **Recommended Agent:** Claude
  - File: [test file paths — determined by project conventions]
  - Write failing tests using the project's test framework for all new behavior described in requirements.md
  - Tests should map to acceptance criteria — one or more tests per AC
@@ -125,6 +152,7 @@ After ALL tasks are [x] and implementation logs are recorded:
  - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: QA Engineer | Task: Write failing tests for all new behavior described in requirements.md acceptance criteria. Use the project's existing test framework and conventions. Each acceptance criterion should have at least one corresponding test. Tests MUST fail before implementation (red phase of TDD) and pass after (green phase). | Restrictions: Use the project's existing test framework — do not introduce a new one. Tests must be runnable with the project's test command. Do not write implementation code in this task. | Success: Failing tests exist for every acceptance criterion in requirements.md. Running the test suite shows the new tests fail (expected) while existing tests still pass. PREREQUISITE: Before writing any code, verify you are in the correct working context. If the project uses version.lock, confirm `git branch --show-current` returns patch/VERSION. If not, STOP and run /release patch first. Mark task as [-] in tasks.md before starting. BLOCKING: After writing tests, you MUST call the log-implementation tool with test file paths and AC mapping before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 
  - [ ] C3. Implement — make tests pass
+ - **Recommended Agent:** Claude
  - File: [implementation file paths — determined by design.md]
  - Write the minimum code needed to make all failing tests from C2 pass
  - Follow existing project patterns and conventions
@@ -134,6 +162,7 @@ After ALL tasks are [x] and implementation logs are recorded:
  - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: Senior Developer | Task: Write the implementation code to make all failing tests from task C2 pass. Follow the architecture and patterns described in design.md. Use existing project utilities and patterns — do not reinvent. Keep changes minimal and focused on making tests green. | Restrictions: Do not modify test files from C2 to make them pass — fix the implementation instead. Do not introduce new dependencies without justification. Follow existing code style and patterns. | Success: All tests from C2 now pass. No existing tests regress. Code follows project conventions. PREREQUISITE: Before writing any code, verify you are in the correct working context. If the project uses version.lock, confirm `git branch --show-current` returns patch/VERSION. If not, STOP and run /release patch first. Mark task as [-] in tasks.md before starting. BLOCKING: After implementation, you MUST call the log-implementation tool with full artifacts before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 
  - [ ] C4. Run full test suite — zero regressions
+ - **Recommended Agent:** Claude
  - File: (no file changes — testing only)
  - Run the project's complete test suite after all implementation is done
  - All new tests must pass; no existing tests may regress from the C1 baseline
@@ -143,6 +172,7 @@ After ALL tasks are [x] and implementation logs are recorded:
  - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: QA Engineer | Task: Run the project's full test suite. Compare results against the baseline from task C1. All new tests from C2 must pass. No existing tests may have regressed. If any test fails, diagnose and fix before proceeding — do not skip or disable failing tests. | Restrictions: Do not skip or disable any tests. Do not modify tests to make them pass unless they have a genuine bug. | Success: Full test suite passes. New test count matches C2. Zero regressions from C1 baseline. PREREQUISITE: This is a verification-only task — no file changes expected unless fixing regressions. BLOCKING: After test run, you MUST call the log-implementation tool with pass/fail/skip counts and comparison to C1 baseline before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 
  - [ ] C5. Log implementation — HARD GATE
+ - **Recommended Agent:** Claude
  - File: (no file changes — logging only)
  - Call the `log-implementation` MCP tool with a comprehensive summary of ALL implementation work done across all tasks in this spec
  - Include: all functions added/modified, all files changed, all tests written, all endpoints created
@@ -151,7 +181,19 @@ After ALL tasks are [x] and implementation logs are recorded:
  - _Requirements: All_
  - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: Project Coordinator | Task: Call the log-implementation MCP tool with a comprehensive summary covering all implementation tasks in this spec. Aggregate: (1) all functions added or modified with file paths, (2) all files created or changed, (3) all test files and test counts, (4) any new endpoints, routes, or APIs, (5) any configuration changes. This is the consolidated implementation record. | Restrictions: Do not skip any task's artifacts. Do not mark this task [x] until the log-implementation tool call succeeds. | Success: log-implementation MCP tool call succeeds with full artifact listing. BLOCKING: This IS the logging gate. Do NOT mark [x] until the log-implementation tool call succeeds._
 
+ - [ ] C5.5 Security Review — Codacy SRM scan
+ - **Recommended Agent:** Claude
+ - File: (no file changes — review only, fixes happen via loop-back if needed)
+ - Run Codacy SRM (Security & Risk Management) scan against the changed files in this spec
+ - Triage findings: Critical/High → MUST fix, Medium → fix or document waiver, Low/Info → advisory
+ - For each Critical/High finding: either fix in code OR add a documented exclusion to `.codacy.yml` with justification
+ - Purpose: Catch security regressions BEFORE PR — input validation, authn/authz, secrets handling, injection vectors, data exposure, dependency vulnerabilities
+ - _Leverage: `/codacy-resolve` skill, `mcp__codacy__codacy_search_repository_srm_items`, `mcp__codacy__codacy_list_repository_issues` filtered to changed files_
+ - _Requirements: Security NFR from requirements.md_
+ - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: Security Engineer | Task: Invoke the /codacy-resolve skill to triage Codacy SRM findings against the files changed in this spec. For each Critical or High severity finding: (a) read the finding, (b) read the code at the cited file:line, (c) determine if it's a real issue or false positive, (d) for real issues fix the code, (e) for false positives add a documented exclusion to .codacy.yml with justification referencing this spec ID. Medium findings should be fixed or get a documented waiver in the implementation log. Low/Info findings are advisory. | Restrictions: Do not blanket-disable Codacy rules. Do not silence findings without investigation. Do not commit secrets or weaken auth checks to "fix" findings. | Success: Zero unaddressed Critical/High findings against changed files. All decisions logged. BLOCKING: After completing the security review, you MUST call the log-implementation tool with findings count and resolution summary before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
+
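Reviewer note: the C5.5 triage rule (Critical/High must be fixed, Medium fixed or waived, Low/Info advisory) maps cleanly onto a small function. The `SrmFinding` shape below is hypothetical, since the real Codacy payload is not shown in this diff, and `summarize` is only a sketch of the "findings count and resolution summary" the task asks to log.

```typescript
type SrmSeverity = "Critical" | "High" | "Medium" | "Low" | "Info";
type TriageAction = "must-fix" | "fix-or-waive" | "advisory";

// Hypothetical finding shape for illustration.
interface SrmFinding {
  id: string;
  severity: SrmSeverity;
  addressed: boolean; // fixed in code, or excluded with a documented justification
}

function triage(severity: SrmSeverity): TriageAction {
  switch (severity) {
    case "Critical":
    case "High":
      return "must-fix";
    case "Medium":
      return "fix-or-waive";
    case "Low":
    case "Info":
      return "advisory";
  }
}

// Counts per triage bucket plus the ids of blockers that are still open.
function summarize(findings: SrmFinding[]) {
  const counts: Record<TriageAction, number> = {
    "must-fix": 0,
    "fix-or-waive": 0,
    advisory: 0,
  };
  const openBlockers: string[] = [];
  for (const finding of findings) {
    const action = triage(finding.severity);
    counts[action] += 1;
    if (action === "must-fix" && !finding.addressed) {
      openBlockers.push(finding.id);
    }
  }
  return { counts, openBlockers, gatePasses: openBlockers.length === 0 };
}
```

The gate in this sketch passes only when `openBlockers` is empty, which corresponds to the task's success criterion of zero unaddressed Critical/High findings.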
  - [ ] C6. Generate verification.md
+ - **Recommended Agent:** Claude
  - File: `DocVault/specflow/{{projectName}}/specs/{{spec-name}}/verification.md`
  - Generate a verification checklist in the spec directory
  - List every requirement and acceptance criterion from requirements.md as a checklist item
@@ -161,11 +203,27 @@ After ALL tasks are [x] and implementation logs are recorded:
  - _Requirements: All_
  - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: QA Engineer | Task: Generate verification.md in the spec directory. Read requirements.md and list every requirement and acceptance criterion as a markdown checklist. For each item, search the codebase for the implementing code and mark [x] with file:line evidence (e.g., "src/core/parser.ts:142 — validates input format"). If any criterion cannot be verified, mark [ ] with a description of the gap. Run /vault-update to update any DocVault pages affected by this spec's changes. Close all linked DocVault issues. Run /verification-before-completion for a final check. | Restrictions: Do not mark [x] without concrete file:line evidence. Do not fabricate evidence. If a gap exists, document it honestly. | Success: verification.md exists with every requirement/AC listed. All items marked [x] with evidence, OR [ ] items have gap descriptions. /vault-update completed. Linked issues closed. /verification-before-completion passed. BLOCKING: After generating verification.md, you MUST call the log-implementation tool before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 
+ - [ ] C6.5 Codex Peer Review (cross-model verification)
+ - **Recommended Agent:** Claude (orchestrator) — dispatches to Codex
+ - File: (no file changes — review only)
+ - **Self-detection guard:** IF the current session is running from Codex itself (check `$CODEX_SESSION` env var or session metadata) → SKIP this task with note "skipped: running from codex, no recursive review"
+ - **Install detection guard:** IF Codex CLI is not installed (`command -v codex` returns nothing) AND no `codex:codex-rescue` agent type is registered → SKIP this task with note "skipped: codex not available"
+ - OTHERWISE: Dispatch `/codex:review` for an independent code review by GPT-5.4 against the branch diff
+ - Address all Critical and Important findings before proceeding to C7
+ - Minor findings are advisory — fix at discretion, document in log
+ - Purpose: Cross-model peer review catches blind spots a single AI might miss; using a different model architecture surfaces different classes of bugs
+ - _Leverage: `/codex:review` skill, `codex:codex-rescue` subagent, branch git diff_
+ - _Requirements: All_
+ - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: Code Review Coordinator | Task: First, check if this task should be skipped: (a) if running from Codex (check $CODEX_SESSION or session origin), skip and note "skipped: running from codex". (b) if Codex CLI not installed and codex:codex-rescue agent type unavailable, skip and note "skipped: codex not available". Otherwise, dispatch /codex:review (or codex:codex-rescue subagent) for an independent code review of the branch diff. Review the findings. Fix any Critical or Important issues. Minor issues are advisory. Record the review results (or skip reason) in the implementation log. | Restrictions: Do not skip this step without invoking the guards above. Do not dismiss findings without investigating. If skipped due to guards, the skip reason MUST be logged. | Success: Codex review completed with Critical/Important issues addressed, OR task is skipped with documented reason. Results logged in implementation log. BLOCKING: After review (or skip), you MUST call the log-implementation tool with review findings (or skip reason) before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
+
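Reviewer note: the two C6.5 guards are ordered, and the install guard only fires when both routes to Codex are missing. The sketch below models that decision as a pure function. `ReviewContext` and `decideCodexReview` are hypothetical; the template only says to check `$CODEX_SESSION` "or session metadata", so treating a non-empty `CODEX_SESSION` as the sole signal is an assumption.

```typescript
interface ReviewContext {
  env: Record<string, string | undefined>; // for example, process.env
  codexCliInstalled: boolean; // outcome of `command -v codex`
  rescueAgentRegistered: boolean; // whether `codex:codex-rescue` is available
}

type ReviewDecision =
  | { action: "dispatch" }
  | { action: "skip"; note: string };

function decideCodexReview(ctx: ReviewContext): ReviewDecision {
  // Self-detection guard: avoid a recursive review when already running in Codex.
  const session = ctx.env["CODEX_SESSION"];
  if (session !== undefined && session !== "") {
    return {
      action: "skip",
      note: "skipped: running from codex, no recursive review",
    };
  }
  // Install detection guard: skip only when neither route to Codex exists.
  if (!ctx.codexCliInstalled && !ctx.rescueAgentRegistered) {
    return { action: "skip", note: "skipped: codex not available" };
  }
  return { action: "dispatch" };
}
```

Either skip branch still carries a note, matching the restriction that a skip reason must be logged.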
  - [ ] C7. Loop or complete
+ - **Recommended Agent:** Claude
  - File: (no file changes — decision gate only)
- - IF verification.md has ANY unchecked `[ ]` items → return to task C2 and write tests for the gaps, then implement (C3), test (C4), log (C5), and re-verify (C6)
- - ONLY when ALL items in verification.md are `[x]` proceed to PR/commit
- - Purpose: Enforce the verification loop specs are not complete until every requirement is proven
- - _Leverage: verification.md from C6_
+ - IF verification.md has ANY unchecked `[ ]` items → return to task C2 and write tests for the gaps, then implement (C3), test (C4), log (C5), security review (C5.5), verify (C6), peer review (C6.5)
+ - IF C5.5 found unaddressed Critical/High security findings → return to C3 to fix them
+ - IF C6.5 Codex review found unaddressed Critical/Important issues → return to C3 to fix them
+ - ONLY when ALL items in verification.md are `[x]` AND C5.5 has zero unaddressed findings AND C6.5 has zero unaddressed findings → proceed to PR/commit
+ - Purpose: Enforce the verification loop — specs are not complete until every requirement is proven AND all reviews pass
+ - _Leverage: verification.md from C6, Codacy results from C5.5, Codex results from C6.5_
  - _Requirements: All_
- - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: Project Coordinator | Task: Read verification.md from task C6. Count unchecked [ ] items. IF any exist: list the gaps, return to task C2 to write tests targeting those gaps, then re-execute C3 through C6. Repeat until verification.md has zero unchecked items. ONLY when all items are [x]: proceed to create the PR or commit. | Restrictions: Do NOT proceed to PR/commit if ANY [ ] items remain in verification.md. Do NOT remove unchecked items to force completion. Each loop iteration must go through C2→C6 in order. | Success: verification.md has zero unchecked items. All requirements are proven with code evidence. PR/commit may proceed. BLOCKING: After confirming all items are verified, you MUST call the log-implementation tool with the final verification status before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
+ - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: Project Coordinator | Task: Read verification.md from task C6. Count unchecked [ ] items. Read C5.5 security findings — count unaddressed Critical/High. Read C6.5 Codex findings — count unaddressed Critical/Important. IF any of those counts are non-zero: list the gaps, return to task C2/C3 to fix them, then re-execute C3 through C6.5 in order. Repeat until all three counts are zero. ONLY when all are clean: proceed to create the PR or commit. | Restrictions: Do NOT proceed to PR/commit if ANY [ ] items remain in verification.md OR ANY unaddressed security/peer-review findings remain. Do NOT remove unchecked items to force completion. Each loop iteration must go through C2→C6.5 in order. | Success: verification.md has zero unchecked items. C5.5 has zero unaddressed Critical/High. C6.5 has zero unaddressed Critical/Important (or is skipped with documented reason). All requirements are proven with code evidence. PR/commit may proceed. BLOCKING: After confirming all items are verified and all reviews are clean, you MUST call the log-implementation tool with the final verification status before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
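Reviewer note: the revised C7 gate combines three counts. A minimal sketch of that decision follows. `countUnchecked`, `GateInputs`, and `nextStep` are hypothetical names; the sketch assumes verification.md uses standard markdown checklist items, and it treats a C6.5 skip with a logged reason as zero peer-review findings, in line with the task's success criteria.

```typescript
// Counts markdown checklist items that are still unchecked, e.g. "- [ ] AC-3 ...".
function countUnchecked(verificationMd: string): number {
  const matches = verificationMd.match(/^\s*[-*] \[ \]/gm);
  return matches === null ? 0 : matches.length;
}

interface GateInputs {
  verificationMd: string;
  unaddressedSecurity: number; // Critical/High left open after C5.5
  unaddressedPeerReview: number; // Critical/Important left open after C6.5
}

type NextStep = "proceed-to-pr" | "return-to-C2" | "return-to-C3";

function nextStep(inputs: GateInputs): NextStep {
  // Unverified requirements need new tests first, so the loop restarts at C2.
  if (countUnchecked(inputs.verificationMd) > 0) return "return-to-C2";
  // Review findings are code fixes, so the loop restarts at C3.
  if (inputs.unaddressedSecurity > 0 || inputs.unaddressedPeerReview > 0) {
    return "return-to-C3";
  }
  return "proceed-to-pr";
}
```

Checking verification gaps before review findings means a spec with both problems restarts at C2, the earlier of the two entry points, so a single pass through the loop covers both.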
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@lbruton/specflow",
- "version": "3.5.11",
+ "version": "3.5.13",
  "description": "MCP server for spec-driven development workflow with real-time web dashboard",
  "main": "dist/index.js",
  "type": "module",
@@ -14,7 +14,7 @@
  "LICENSE"
  ],
  "scripts": {
- "build": "npm run validate:i18n && npm run clean && tsc && npm run build:dashboard",
+ "build": "npm run validate:i18n && npm run clean && tsc && chmod +x dist/index.js && npm run build:dashboard",
  "copy-static": "node scripts/copy-static.cjs",
  "dev": "tsx src/index.ts",
  "start": "node dist/index.js",