npm - @lbruton/specflow - Versions diffs - 3.5.11 → 3.5.13 - Mend

@lbruton/specflow 3.5.11 → 3.5.13

Files changed (5)

package/dist/index.js +0 -0
package/dist/markdown/templates/code-quality-reviewer-template.md +64 -8
package/dist/markdown/templates/design-template.md +38 -0
package/dist/markdown/templates/tasks-template.md +65 -7
package/package.json +2 -2

package/dist/index.js CHANGED

File without changes

package/dist/markdown/templates/code-quality-reviewer-template.md CHANGED

@@ -28,7 +28,6 @@ git diff {BASE_SHA}..{HEAD_SHA}
 - Sound design decisions?
 - Scalability considerations?
 - Performance implications?
-- Security concerns?
 **Testing:**
 - Tests actually test logic (not just mocks)?
@@ -49,6 +48,55 @@ git diff {BASE_SHA}..{HEAD_SHA}
 - Documentation complete?
 - No obvious bugs?
+## Security Review
+> **Pairing:** This section complements (does NOT replace) the Codacy SRM scan in tasks.md C5.5. Codacy catches known-pattern findings; this human/AI review catches contextual issues Codacy can't see (business logic flaws, trust boundary violations, intent vs implementation gaps). Run BOTH.
+Read the design.md `Security Considerations` section first to know what the spec author declared. If they declared `Security Impact: No`, verify that's actually true by walking the checklist below — silent assumptions are how regressions ship.
+### Input Validation & Trust Boundaries
+- **Untrusted input identified:** Where does data cross from user-controlled to system-trusted? Form fields, query params, file uploads, API payloads, environment variables, file paths.
+- **Validation present:** Type, length, format, allow-list (preferred over deny-list), encoding.
+- **Validation location:** Server-side validation MUST exist even if client-side validation is present. Client-only validation is a finding.
+- **Sanitization for sinks:** HTML escape for DOM, parameterized queries for SQL, path normalization for file system, command argument arrays (never string concat) for shell.
+### Authentication & Authorization
+- **Auth checks present:** Every protected endpoint/action verifies the caller's identity.
+- **Authz checks present:** Every protected resource verifies the caller has permission for THIS specific resource (not just "logged in").
+- **No auth bypass:** Search for hardcoded credentials, debug flags, test-only branches that skip auth in production code paths.
+- **Session/token handling:** Tokens stored securely (httpOnly cookies, secure storage), proper expiration, no tokens in URLs or logs.
+### Secrets & Credentials
+- **No hardcoded secrets:** Grep the diff for `api_key`, `password`, `token`, `secret`, `Bearer`, AWS-style keys, JWT-shaped strings.
+- **Secrets sourced correctly:** From Infisical, env vars, or platform secrets manager — never from code, config files, or test fixtures.
+- **No secrets in logs:** Verify that secret values, tokens, and credentials are not logged at any level (info, debug, error).
+- **No secrets in error messages:** User-facing error messages must not echo back any secret material.
+### Injection Risks
+- **SQL injection:** Parameterized queries only. String concatenation into SQL is a Critical finding.
+- **Command injection:** No `exec`/`spawn`/`Function()`/template strings used as shell commands. Use array-form arguments.
+- **Path traversal:** User-supplied paths are normalized and verified to stay within an allowed root directory.
+- **Cross-site scripting (XSS):** All user content rendered via safe APIs (`textContent`, framework escaping, sanitization libraries), never `innerHTML` with raw input.
+- **Cross-site request forgery (CSRF):** State-changing endpoints require CSRF tokens or SameSite cookies.
+- **Server-side request forgery (SSRF):** Outbound HTTP calls to user-supplied URLs validate the host against an allow-list.
+### Data Exposure
+- **Sensitive data classification:** PII, financial, health, location, credentials — all flagged for secure handling.
+- **Logging hygiene:** No PII in logs without redaction. No request/response body dumps containing user data.
+- **Error responses:** Don't leak stack traces, file paths, or internal IDs to unauthenticated users.
+- **API response shape:** No fields returned that the caller shouldn't see (e.g., other users' data, internal flags, password hashes).
+### Dependencies & Supply Chain
+- **New dependencies justified:** Each new package in `package.json`/`requirements.txt`/etc. is necessary and from a reputable source.
+- **Version pinning:** New deps are pinned to specific versions (not `*` or `^latest`).
+- **Known vulnerabilities:** Run the project's audit command (`npm audit`, `pip-audit`, etc.) — any High/Critical findings must be addressed or documented.
+- **Supply chain markers:** Watch for typo-squatting (suspicious near-name packages), recently published packages, and packages with unusual install scripts.
+### Codacy SRM Cross-Reference
+- **Read tasks.md C5.5 results:** What did Codacy find on the changed files? Are all Critical/High findings addressed (fixed or explicitly waivered with justification)?
+- **Look for Codacy blind spots:** Codacy excels at known patterns. It does NOT catch business logic flaws, race conditions, multi-step authorization gaps, or "the right code in the wrong place." Those are this reviewer's job.
+- **If C5.5 was skipped:** That's a finding. Security gate cannot be bypassed.
 ## Output Format
 ### Strengths
@@ -56,23 +104,31 @@ git diff {BASE_SHA}..{HEAD_SHA}
 ### Issues
-#### Critical (Must Fix)
-[Bugs, security issues, data loss risks, broken functionality]
+#### Critical (Must Fix — blocks PR)
+[Bugs, security issues, data loss risks, broken functionality, auth bypasses, injection vectors, secret leaks]
-#### Important (Should Fix)
-[Architecture problems, missing features, poor error handling, test gaps]
+#### Important (Should Fix — blocks PR unless explicitly waived)
+[Architecture problems, missing security controls, poor error handling, test gaps, missing validation]
-#### Minor (Nice to Have)
+#### Minor (Nice to Have — advisory)
 [Code style, optimization opportunities, documentation improvements]
 **For each issue:**
 - File:line reference
 - What's wrong
-- Why it matters
+- Why it matters (especially for security findings — explain the threat model)
 - How to fix (if not obvious)
+### Security Assessment
+**Security review status:** [✅ PASS / ⚠️ FINDINGS / ❌ FAIL]
+**Codacy SRM cross-reference:** [Aligned / Discrepancy — explain]
+**Threat model coverage:** [Did design.md Security Considerations match what was actually implemented? Any unstated risks?]
 ### Assessment
 **Ready to proceed?** [Yes / No / With fixes]
-**Reasoning:** [1-2 sentence technical assessment]
+**Reasoning:** [1-2 sentence technical assessment, including security posture]
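
The path traversal item in the Security Review checklist added above asks that user-supplied paths be normalized and verified to stay within an allowed root directory. A minimal TypeScript sketch of such a check follows. The helper name `isInsideRoot` and the example paths are hypothetical and not part of the package, and the sketch assumes Node's built-in `path` module. It compares paths lexically only, so it does not resolve symlinks.

```typescript
import * as path from "node:path";

// Hypothetical helper (not from the package): returns true when `userPath`,
// resolved against `root`, still lies inside `root`.
function isInsideRoot(root: string, userPath: string): boolean {
  const resolvedRoot = path.resolve(root);
  const resolvedTarget = path.resolve(resolvedRoot, userPath);
  const rel = path.relative(resolvedRoot, resolvedTarget);
  // An empty relative path means the target is the root itself.
  if (rel === "") return true;
  // A leading ".." segment means the target escaped the root. An absolute
  // result can occur on Windows when the target is on another drive.
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  return !escapes && !path.isAbsolute(rel);
}

// Illustrative calls with made-up paths.
console.log(isInsideRoot("/srv/uploads", "reports/q1.txt"));
console.log(isInsideRoot("/srv/uploads", "../../etc/passwd"));
```

A reviewer applying the checklist would look for a check of this kind before any file system call that takes a user-controlled path. Where symlinks are possible, resolving the real path first is a reasonable extra step.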

package/dist/markdown/templates/design-template.md CHANGED

@@ -129,6 +129,44 @@ _If **No**, skip this section entirely. If **Yes**, complete all fields below
    - **Handling:** [How to handle]
    - **User Impact:** [What user sees]
+## Security Considerations
+> **GATE:** This section must be filled in for every spec, even if the answer is "no security impact." A spec that explicitly declares no security impact is fine; a spec that omits this section is not — it cannot be assessed by reviewers and will fail the readiness gate.
+### Security Impact: [Yes / No / Minimal]
+_If **No** or **Minimal**, briefly justify (e.g., "string rename with no input/auth/network changes"). If **Yes**, complete the fields below — these gate the Codacy SRM scan in tasks.md C5.5._
+### Sensitive Data Touched
+- **User input parsed/validated:** [list inputs that cross trust boundaries — form fields, query params, file uploads, API payloads]
+- **Authentication/authorization changes:** [any change to who can access what — new roles, new permission checks, session/token handling]
+- **Secrets or credentials handled:** [API keys, tokens, passwords, encryption keys — and where they're sourced from, e.g., Infisical]
+- **PII or sensitive user data:** [any field that contains personal info, financial data, health data, location, etc.]
+### Threat Surface Changes
+- **New network endpoints:** [URLs/methods being exposed, with auth requirements]
+- **New external dependencies:** [npm/pip packages being added — supply-chain risk]
+- **New file system access:** [paths being read/written, especially user-controlled paths]
+- **New shell commands or eval:** [any `exec`, `eval`, `Function()`, template strings used as commands — injection risk]
+- **New deserialization of untrusted input:** [JSON/YAML/binary formats parsing untrusted bytes — deserialization vulnerabilities]
+### Input Validation Strategy
+- **Where validation happens:** [client / server / both]
+- **What is validated:** [type, length, format, allow-list, deny-list]
+- **Sanitization for storage/display:** [HTML escape, SQL parameterization, path normalization]
+### Codacy SRM Pre-Check (advisory)
+- **Run before tasks.md C5.5:** Check existing Codacy findings against the files this spec will modify (use `mcp__codacy__codacy_list_repository_issues` filtered to changed files)
+- **Findings to address up front:** [list any pre-existing Critical/High findings on files this spec touches, so they can be fixed in scope rather than orphaned]
+### Codacy SRM Gate (enforced in tasks.md C5.5)
+- The spec workflow's tasks.md template includes C5.5 Security Review which runs Codacy SRM against changed files post-implementation
+- This design section captures the **intent and threat model**; C5.5 verifies the implementation
+- For specs declaring `Security Impact: No`, C5.5 still runs but is expected to find zero new issues — any Critical/High finding becomes a hard gate
+### Out-of-Scope Security Notes
+[Anything related to security that this spec does NOT address but should be tracked separately — file as a follow-up issue rather than expanding this spec's scope]
 ## Testing Strategy
 ### Automated Tests

package/dist/markdown/templates/tasks-template.md CHANGED

@@ -6,12 +6,32 @@
 - **GitHub PR:** [#NNN](https://github.com/owner/repo/pull/NNN)
 - **Spec Path:** `DocVault/specflow/{{projectName}}/specs/{{spec-name}}/`
+## File Touch Map
+> **Why this section exists:** Surface the blast radius before any task begins. The Phase 4 orchestrator uses this map to decide which tasks can run in parallel (no overlapping files) versus serial (shared files). The readiness gate (Phase 3.9) verifies that every file referenced by any task below appears in this map. Reviewers use it as a 30-second preview of "how big is this change really" before reading individual tasks.
+| Action | Files | Scope |
+|--------|-------|-------|
+| CREATE | `path/to/new/file.ext` | [one-line scope note] |
+| MODIFY | `path/to/existing/file.ext` | [what changes — function names or rough line ranges] |
+| DELETE | `path/to/dead/file.ext` | [why it's going away] |
+| TEST   | `tests/path/test_new.ext` | [what the new test file covers] |
+**Total files touched:** N created, N modified, N deleted, N test files added/modified.
+**Cross-cutting concerns:** [list any files where the spec also requires updates that aren't immediately obvious — e.g., `__init__.py` exports, `package.json`, env var docs, project conventions JSON]
+**Parallel/serial dispatch hint:** Tasks that share NO files (no overlapping CREATE/MODIFY paths) are independent and SHOULD be dispatched in parallel. Tasks that share files MUST run sequentially. Group independent tasks into parallel batches.
+---
 ## UI Prototype Gate (conditional — include ONLY if design.md declares `Has UI Changes: Yes` AND `Prototype Required: Yes`)
 > **BLOCKING:** Tasks 0.1–0.3 MUST be completed and approved before ANY task tagged `ui:true` begins.
 > If the spec has no UI changes, delete this entire section.
 - [ ] 0.1 Create visual mockup (Stitch or equivalent)
+  - **Recommended Agent:** Claude
   - Invoke the `ui-mockup` skill (Step 1–4) OR the `frontend-design` skill
   - If design.md references a prototype HTML file, use it as the starting point
   - Generate mockups for all states: populated, loading, empty, error
@@ -21,14 +41,16 @@
   - _Prompt: Role: UI/UX Designer | Task: Create visual mockups using the ui-mockup skill (Stitch) or frontend-design skill for all new/modified UI components described in design.md. Cover all visual states (populated, loading, empty, error) and theme variants (light, dark). If a reference HTML prototype exists at the path noted in design.md, use it as the baseline. | Restrictions: Do NOT write any production code. Output is mockup artifacts only. | Success: Stitch screen IDs or equivalent visual artifacts are generated and presented to the user for review._
 - [ ] 0.2 Build interactive prototype (Playground)
+  - **Recommended Agent:** Claude
   - Invoke the `playground` skill using the approved mockup as spec
-  - Must use the project's actual tech stack (Bootstrap 5, CSS variables, data-theme attribute)
+  - Must use the project's actual tech stack (CSS framework, theme system, data attributes)
   - Include realistic sample data, interactive controls, and all data states
   - Purpose: Validate UX feel and interactions before production code
   - _Requirements: All UI-related requirements_
   - _Prompt: Role: Frontend Prototyper | Task: Build an interactive single-file HTML playground using the playground skill. Source visual design from the approved Stitch mockup (Task 0.1). Use the project's actual CSS framework and theme system. Include realistic data, clickable controls, hover states, and all data states (populated, loading, empty, error). | Restrictions: This is a throwaway prototype — do NOT integrate into the codebase. Must match the project's tech stack. | Success: User can interact with the prototype in a browser, validate layout/UX, and give explicit approval before implementation begins._
 - [ ] 0.3 Visual approval checkpoint
+  - **Recommended Agent:** Claude
   - Present prototype to user for explicit approval
   - Update design.md `Prototype Artifacts` section with Stitch IDs and playground file path
   - Purpose: Hard gate — no UI implementation proceeds without visual sign-off
@@ -71,9 +93,10 @@ After ALL tasks are [x] and implementation logs are recorded:
 ---
-## Phase 1 — [Phase Name]
+## Phase 1 — [Phase Name — short descriptor matching design.md mermaid graph, e.g., "(Phase A — the domino)"]
 - [ ] 1. [Task title]
+  - **Recommended Agent:** [Claude / Codex / Gemini — pick based on task type: Claude for general implementation, Codex for verification scans and adversarial review, Gemini for long-context analysis]
   - File: `[file path]`
   - [What to implement — be specific about function names, line numbers, and code patterns]
   - [Second bullet if multi-part]
@@ -83,6 +106,7 @@ After ALL tasks are [x] and implementation logs are recorded:
   - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: [Role] | Task: [Detailed implementation instructions referencing specific file paths, line numbers, existing functions, and exact variable names. Include the complete behavior specification.] | Restrictions: [What NOT to do — other files to leave untouched, patterns to avoid, anti-patterns for this codebase] | Success: [Concrete, verifiable acceptance criteria — what works, what doesn't break] PREREQUISITE: Before writing any code, verify you are in the correct working context. If the project uses version.lock, confirm `git branch --show-current` returns patch/VERSION. If not, STOP and run /release patch first. Mark task as [-] in tasks.md before starting. BLOCKING: After implementation, you MUST call the log-implementation tool with full artifacts before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 - [ ] 2. [Task title]
+  - **Recommended Agent:** Claude
   - File: `[file path]`
   - [Implementation details]
   - Purpose: [Why]
@@ -95,6 +119,7 @@ After ALL tasks are [x] and implementation logs are recorded:
 ## Phase 2 — [Phase Name] (optional — remove if single-phase)
 - [ ] 3. [Task title]
+  - **Recommended Agent:** Claude
   - File: `[file path]`
   - [Implementation details]
   - Purpose: [Why]
@@ -107,6 +132,7 @@ After ALL tasks are [x] and implementation logs are recorded:
 ## Standard Closing Tasks
 - [ ] C1. Establish test baseline
+  - **Recommended Agent:** Claude
   - File: (no file changes — testing only)
   - Run the project's test command to establish a passing baseline before any implementation changes
   - If no test suite exists, flag this to the user and discuss whether to set one up
@@ -116,6 +142,7 @@ After ALL tasks are [x] and implementation logs are recorded:
   - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: QA Engineer | Task: Identify and run the project's established test suite to verify a passing baseline before implementation. Check project documentation and config files for the test command. Record the number of passing/failing/skipped tests. If no test suite exists, flag this to the user and ask whether to set one up before proceeding. | Restrictions: Use the project's existing test framework — do not introduce a new one. Do not modify any source files. | Success: Test suite runs and baseline results (pass/fail/skip counts) are recorded. PREREQUISITE: This is a verification-only task — no worktree changes needed. BLOCKING: After recording baseline, you MUST call the log-implementation tool with the test results before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 - [ ] C2. Write failing tests for new behavior (TDD — BEFORE implementation)
+  - **Recommended Agent:** Claude
   - File: [test file paths — determined by project conventions]
   - Write failing tests using the project's test framework for all new behavior described in requirements.md
   - Tests should map to acceptance criteria — one or more tests per AC
@@ -125,6 +152,7 @@ After ALL tasks are [x] and implementation logs are recorded:
   - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: QA Engineer | Task: Write failing tests for all new behavior described in requirements.md acceptance criteria. Use the project's existing test framework and conventions. Each acceptance criterion should have at least one corresponding test. Tests MUST fail before implementation (red phase of TDD) and pass after (green phase). | Restrictions: Use the project's existing test framework — do not introduce a new one. Tests must be runnable with the project's test command. Do not write implementation code in this task. | Success: Failing tests exist for every acceptance criterion in requirements.md. Running the test suite shows the new tests fail (expected) while existing tests still pass. PREREQUISITE: Before writing any code, verify you are in the correct working context. If the project uses version.lock, confirm `git branch --show-current` returns patch/VERSION. If not, STOP and run /release patch first. Mark task as [-] in tasks.md before starting. BLOCKING: After writing tests, you MUST call the log-implementation tool with test file paths and AC mapping before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 - [ ] C3. Implement — make tests pass
+  - **Recommended Agent:** Claude
   - File: [implementation file paths — determined by design.md]
   - Write the minimum code needed to make all failing tests from C2 pass
   - Follow existing project patterns and conventions
@@ -134,6 +162,7 @@ After ALL tasks are [x] and implementation logs are recorded:
   - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: Senior Developer | Task: Write the implementation code to make all failing tests from task C2 pass. Follow the architecture and patterns described in design.md. Use existing project utilities and patterns — do not reinvent. Keep changes minimal and focused on making tests green. | Restrictions: Do not modify test files from C2 to make them pass — fix the implementation instead. Do not introduce new dependencies without justification. Follow existing code style and patterns. | Success: All tests from C2 now pass. No existing tests regress. Code follows project conventions. PREREQUISITE: Before writing any code, verify you are in the correct working context. If the project uses version.lock, confirm `git branch --show-current` returns patch/VERSION. If not, STOP and run /release patch first. Mark task as [-] in tasks.md before starting. BLOCKING: After implementation, you MUST call the log-implementation tool with full artifacts before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 - [ ] C4. Run full test suite — zero regressions
+  - **Recommended Agent:** Claude
   - File: (no file changes — testing only)
   - Run the project's complete test suite after all implementation is done
   - All new tests must pass; no existing tests may regress from the C1 baseline
@@ -143,6 +172,7 @@ After ALL tasks are [x] and implementation logs are recorded:
   - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: QA Engineer | Task: Run the project's full test suite. Compare results against the baseline from task C1. All new tests from C2 must pass. No existing tests may have regressed. If any test fails, diagnose and fix before proceeding — do not skip or disable failing tests. | Restrictions: Do not skip or disable any tests. Do not modify tests to make them pass unless they have a genuine bug. | Success: Full test suite passes. New test count matches C2. Zero regressions from C1 baseline. PREREQUISITE: This is a verification-only task — no file changes expected unless fixing regressions. BLOCKING: After test run, you MUST call the log-implementation tool with pass/fail/skip counts and comparison to C1 baseline before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 - [ ] C5. Log implementation — HARD GATE
+  - **Recommended Agent:** Claude
   - File: (no file changes — logging only)
   - Call the `log-implementation` MCP tool with a comprehensive summary of ALL implementation work done across all tasks in this spec
   - Include: all functions added/modified, all files changed, all tests written, all endpoints created
@@ -151,7 +181,19 @@ After ALL tasks are [x] and implementation logs are recorded:
   - _Requirements: All_
   - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: Project Coordinator | Task: Call the log-implementation MCP tool with a comprehensive summary covering all implementation tasks in this spec. Aggregate: (1) all functions added or modified with file paths, (2) all files created or changed, (3) all test files and test counts, (4) any new endpoints, routes, or APIs, (5) any configuration changes. This is the consolidated implementation record. | Restrictions: Do not skip any task's artifacts. Do not mark this task [x] until the log-implementation tool call succeeds. | Success: log-implementation MCP tool call succeeds with full artifact listing. BLOCKING: This IS the logging gate. Do NOT mark [x] until the log-implementation tool call succeeds._
+- [ ] C5.5 Security Review — Codacy SRM scan
+  - **Recommended Agent:** Claude
+  - File: (no file changes — review only, fixes happen via loop-back if needed)
+  - Run Codacy SRM (Security & Risk Management) scan against the changed files in this spec
+  - Triage findings: Critical/High → MUST fix, Medium → fix or document waiver, Low/Info → advisory
+  - For each Critical/High finding: either fix in code OR add a documented exclusion to `.codacy.yml` with justification
+  - Purpose: Catch security regressions BEFORE PR — input validation, authn/authz, secrets handling, injection vectors, data exposure, dependency vulnerabilities
+  - _Leverage: `/codacy-resolve` skill, `mcp__codacy__codacy_search_repository_srm_items`, `mcp__codacy__codacy_list_repository_issues` filtered to changed files_
+  - _Requirements: Security NFR from requirements.md_
+  - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: Security Engineer | Task: Invoke the /codacy-resolve skill to triage Codacy SRM findings against the files changed in this spec. For each Critical or High severity finding: (a) read the finding, (b) read the code at the cited file:line, (c) determine if it's a real issue or false positive, (d) for real issues fix the code, (e) for false positives add a documented exclusion to .codacy.yml with justification referencing this spec ID. Medium findings should be fixed or get a documented waiver in the implementation log. Low/Info findings are advisory. | Restrictions: Do not blanket-disable Codacy rules. Do not silence findings without investigation. Do not commit secrets or weaken auth checks to "fix" findings. | Success: Zero unaddressed Critical/High findings against changed files. All decisions logged. BLOCKING: After completing the security review, you MUST call the log-implementation tool with findings count and resolution summary before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 - [ ] C6. Generate verification.md
+  - **Recommended Agent:** Claude
   - File: `DocVault/specflow/{{projectName}}/specs/{{spec-name}}/verification.md`
   - Generate a verification checklist in the spec directory
   - List every requirement and acceptance criterion from requirements.md as a checklist item
@@ -161,11 +203,27 @@ After ALL tasks are [x] and implementation logs are recorded:
   - _Requirements: All_
   - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: QA Engineer | Task: Generate verification.md in the spec directory. Read requirements.md and list every requirement and acceptance criterion as a markdown checklist. For each item, search the codebase for the implementing code and mark [x] with file:line evidence (e.g., "src/core/parser.ts:142 — validates input format"). If any criterion cannot be verified, mark [ ] with a description of the gap. Run /vault-update to update any DocVault pages affected by this spec's changes. Close all linked DocVault issues. Run /verification-before-completion for a final check. | Restrictions: Do not mark [x] without concrete file:line evidence. Do not fabricate evidence. If a gap exists, document it honestly. | Success: verification.md exists with every requirement/AC listed. All items marked [x] with evidence, OR [ ] items have gap descriptions. /vault-update completed. Linked issues closed. /verification-before-completion passed. BLOCKING: After generating verification.md, you MUST call the log-implementation tool before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
+- [ ] C6.5 Codex Peer Review (cross-model verification)
+  - **Recommended Agent:** Claude (orchestrator) — dispatches to Codex
+  - File: (no file changes — review only)
+  - **Self-detection guard:** IF the current session is running from Codex itself (check `$CODEX_SESSION` env var or session metadata) → SKIP this task with note "skipped: running from codex, no recursive review"
+  - **Install detection guard:** IF Codex CLI is not installed (`command -v codex` returns nothing) AND no `codex:codex-rescue` agent type is registered → SKIP this task with note "skipped: codex not available"
+  - OTHERWISE: Dispatch `/codex:review` for an independent code review by GPT-5.4 against the branch diff
+  - Address all Critical and Important findings before proceeding to C7
+  - Minor findings are advisory — fix at discretion, document in log
+  - Purpose: Cross-model peer review catches blind spots a single AI might miss; using a different model architecture surfaces different classes of bugs
+  - _Leverage: `/codex:review` skill, `codex:codex-rescue` subagent, branch git diff_
+  - _Requirements: All_
+  - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: Code Review Coordinator | Task: First, check if this task should be skipped: (a) if running from Codex (check $CODEX_SESSION or session origin), skip and note "skipped: running from codex". (b) if Codex CLI not installed and codex:codex-rescue agent type unavailable, skip and note "skipped: codex not available". Otherwise, dispatch /codex:review (or codex:codex-rescue subagent) for an independent code review of the branch diff. Review the findings. Fix any Critical or Important issues. Minor issues are advisory. Record the review results (or skip reason) in the implementation log. | Restrictions: Do not skip this step without invoking the guards above. Do not dismiss findings without investigating. If skipped due to guards, the skip reason MUST be logged. | Success: Codex review completed with Critical/Important issues addressed, OR task is skipped with documented reason. Results logged in implementation log. BLOCKING: After review (or skip), you MUST call the log-implementation tool with review findings (or skip reason) before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
 - [ ] C7. Loop or complete
+  - **Recommended Agent:** Claude
   - File: (no file changes — decision gate only)
-  - IF verification.md has ANY unchecked `[ ]` items → return to task C2 and write tests for the gaps, then implement (C3), test (C4), log (C5), and re-verify (C6)
-  - ONLY when ALL items in verification.md are `[x]` → proceed to PR/commit
-  - Purpose: Enforce the verification loop — specs are not complete until every requirement is proven
-  - _Leverage: verification.md from C6_
+  - IF verification.md has ANY unchecked `[ ]` items → return to task C2 and write tests for the gaps, then implement (C3), test (C4), log (C5), security review (C5.5), verify (C6), peer review (C6.5)
+  - IF C5.5 found unaddressed Critical/High security findings → return to C3 to fix them
+  - IF C6.5 Codex review found unaddressed Critical/Important issues → return to C3 to fix them
+  - ONLY when ALL items in verification.md are `[x]` AND C5.5 has zero unaddressed findings AND C6.5 has zero unaddressed findings → proceed to PR/commit
+  - Purpose: Enforce the verification loop — specs are not complete until every requirement is proven AND all reviews pass
+  - _Leverage: verification.md from C6, Codacy results from C5.5, Codex results from C6.5_
   - _Requirements: All_
-  - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: Project Coordinator | Task: Read verification.md from task C6. Count unchecked [ ] items. IF any exist: list the gaps, return to task C2 to write tests targeting those gaps, then re-execute C3 through C6. Repeat until verification.md has zero unchecked items. ONLY when all items are [x]: proceed to create the PR or commit. | Restrictions: Do NOT proceed to PR/commit if ANY [ ] items remain in verification.md. Do NOT remove unchecked items to force completion. Each loop iteration must go through C2→C6 in order. | Success: verification.md has zero unchecked items. All requirements are proven with code evidence. PR/commit may proceed. BLOCKING: After confirming all items are verified, you MUST call the log-implementation tool with the final verification status before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
+  - _Prompt: Implement the task for spec {{spec-name}}, first run spec-workflow-guide to get the workflow guide then implement the task: Role: Project Coordinator | Task: Read verification.md from task C6. Count unchecked [ ] items. Read C5.5 security findings — count unaddressed Critical/High. Read C6.5 Codex findings — count unaddressed Critical/Important. IF any of those counts are non-zero: list the gaps, return to task C2/C3 to fix them, then re-execute C3 through C6.5 in order. Repeat until all three counts are zero. ONLY when all are clean: proceed to create the PR or commit. | Restrictions: Do NOT proceed to PR/commit if ANY [ ] items remain in verification.md OR ANY unaddressed security/peer-review findings remain. Do NOT remove unchecked items to force completion. Each loop iteration must go through C2→C6.5 in order. | Success: verification.md has zero unchecked items. C5.5 has zero unaddressed Critical/High. C6.5 has zero unaddressed Critical/Important (or is skipped with documented reason). All requirements are proven with code evidence. PR/commit may proceed. BLOCKING: After confirming all items are verified and all reviews are clean, you MUST call the log-implementation tool with the final verification status before marking [x]. Do NOT mark [x] until the log-implementation tool call succeeds._
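
The File Touch Map added in this template states that tasks sharing no files should be dispatched in parallel, while tasks that share files must run sequentially. The sketch below shows one way that rule could be turned into batches. It is an illustration under stated assumptions, not the Phase 4 orchestrator's actual logic, which this diff does not show. The `Task` shape and `planBatches` are hypothetical names.

```typescript
// Hypothetical task shape: an id plus the paths the task touches.
interface Task {
  id: string;
  files: string[];
}

// Places each task in the earliest batch that comes after every batch
// containing a task it shares a file with. Tasks in the same batch share
// no files; tasks that share files keep their original order.
function planBatches(tasks: Task[]): string[][] {
  const batches: { ids: string[]; files: Set<string> }[] = [];
  for (const task of tasks) {
    let lastConflict = -1;
    batches.forEach((batch, index) => {
      if (task.files.some((file) => batch.files.has(file))) {
        lastConflict = index;
      }
    });
    const target = lastConflict + 1;
    if (target === batches.length) {
      batches.push({ ids: [], files: new Set<string>() });
    }
    batches[target].ids.push(task.id);
    task.files.forEach((file) => batches[target].files.add(file));
  }
  return batches.map((batch) => batch.ids);
}

// Illustrative input with made-up file names.
const plan = planBatches([
  { id: "1", files: ["src/a.ts"] },
  { id: "2", files: ["src/b.ts"] },
  { id: "3", files: ["src/a.ts", "src/c.ts"] },
  { id: "4", files: ["src/c.ts"] },
]);
console.log(JSON.stringify(plan)); // [["1","2"],["3"],["4"]]
```

Tasks 1 and 2 touch different files and land in the same batch. Task 3 shares `src/a.ts` with task 1 and task 4 shares `src/c.ts` with task 3, so each waits for the batch before it.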

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@lbruton/specflow",
-  "version": "3.5.11",
+  "version": "3.5.13",
   "description": "MCP server for spec-driven development workflow with real-time web dashboard",
   "main": "dist/index.js",
   "type": "module",
@@ -14,7 +14,7 @@
     "LICENSE"
   ],
   "scripts": {
-    "build": "npm run validate:i18n && npm run clean && tsc && npm run build:dashboard",
+    "build": "npm run validate:i18n && npm run clean && tsc && chmod +x dist/index.js && npm run build:dashboard",
     "copy-static": "node scripts/copy-static.cjs",
     "dev": "tsx src/index.ts",
     "start": "node dist/index.js",