npm - bms-speckit-plugin - Versions diffs - 5.3.0 → 6.1.0 - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/agents/quality-control.md +48 -0
package/blueprints/bms-speckit-pipeline.yaml +102 -30
package/package.json +1 -1
package/skills/bms-speckit-auto/SKILL.md +91 -62

package/agents/quality-control.md

@@ -49,6 +49,8 @@ You are a senior quality control engineer performing a comprehensive audit of a
 ## Phase A: Code Errors (MUST pass before other phases)
+### A1. Standard Checks
 1. Run the build command (`npm run build`, `tsc`, `python -m py_compile`, etc.)
 2. Run linter (`eslint .`, `flake8`, `ruff check`, etc.)
 3. Run the full test suite (`npm test`, `pytest`, etc.)
@@ -59,6 +61,50 @@ You are a senior quality control engineer performing a comprehensive audit of a
    - Re-run to confirm fix
 5. Repeat until all three (build + lint + test) pass with zero errors
+### A2. Runtime Safety Patterns
+Build and lint miss an entire class of runtime errors — type-correct syntax that crashes when executed. These checks close that gap. **Detect the project language(s) first, then apply the relevant checks.**
+#### Language Config Strictness
+Check that the project's type checker / compiler is configured for maximum strictness:
+- **TypeScript** — `tsconfig.json` should have `"strict": true` (or at minimum: `noImplicitAny`, `strictNullChecks`, `noImplicitReturns`, `noUncheckedIndexedAccess`). Enable and fix resulting errors.
+- **Python** — If using mypy/pyright, check config has `strict = true` or equivalent. If no type checker is configured, flag it.
+- **Go** — Verify `go vet` passes. Check for unchecked errors (`errcheck`).
+- **Rust** — Verify `#![deny(warnings)]` or strict clippy lints are enabled.
+- **Other** — Check for the language's equivalent strict/lint configuration.
+#### Type Mismatch Patterns
+Grep source files for patterns where a value of one type is used where another type is expected. These are the most common causes of runtime crashes that pass build/lint:
+1. **Iterable/collection confusion** — A function returns one collection type but the caller treats it as another:
+   - Spreading non-iterables: `[...obj]` where `obj` is a plain object (JS/TS crashes), `[...fn()]` where `fn` returns a dict/object instead of a list/array
+   - Iterating dicts/objects: `for x in fn()` where `fn` returns a dict (Python iterates keys, not values)
+   - Calling collection methods on wrong types: `.map()`, `.filter()`, `.reduce()` on non-arrays; `.keys()` on a list instead of a dict
+   - **Fix:** Add explicit return type annotations on the function, add runtime type checks at the call site (e.g., `Array.isArray()`, `isinstance()`)
+2. **Null/undefined/None access** — Chained property or method access on values from external sources (API responses, DB results, user input, config files) without null guards:
+   - JS/TS: `data.result.items.map(...)` where any level could be `undefined`
+   - Python: `data["result"]["items"]` where any key could be missing
+   - **Fix:** Add optional chaining, default values, or explicit null/key-existence checks
+3. **Type assertion/cast bypass** — Code that overrides the type system's safety checks:
+   - TS: `as SomeType`, `!` non-null assertion
+   - Python: `cast()`, `# type: ignore`
+   - Go: unchecked type assertions `val := x.(Type)` without `, ok` pattern
+   - **Fix:** Verify each assertion is actually correct. Replace with type guards or proper error handling where possible
+4. **Implicit type coercion** — Operations that silently convert types, masking bugs:
+   - JS/TS: `==` instead of `===`, string concatenation with numbers (`"count: " + count`)
+   - Python: comparing different types without explicit conversion
+   - **Fix:** Use strict equality, explicit type conversion
+5. **Missing return type annotations on data transformers** — Functions that reshape, map, filter, or aggregate data should always have explicit return types. Without them, the type system infers too broadly and callers may use the result incorrectly.
+   - Grep for exported functions and functions whose results are spread or iterated
+   - **Fix:** Add explicit return type annotations
 ## Phase B: Security Audit
 1. Run `npm audit` or `pip-audit` to check for known vulnerabilities
@@ -162,6 +208,8 @@ After completing all phases, provide a summary report:
 - [ ] Build: PASS/FAIL (X errors fixed)
 - [ ] Lint: PASS/FAIL (X errors fixed)
 - [ ] Tests: PASS/FAIL (X failures fixed)
+- [ ] Language config strictness: PASS/SKIP (X issues fixed)
+- [ ] Runtime safety patterns: PASS (X type mismatches fixed)
 ### Security
 - [ ] No hardcoded secrets
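
Note on the A2 checks added in this file: the iterable/collection confusion and null-access patterns can be illustrated with a minimal Python sketch. The function names and sample data below (`load_scores`, `total`, `first_item_name`) are hypothetical and are not part of the plugin; they only show the kind of `isinstance()` guard and key-existence check that the added text recommends.

```python
from __future__ import annotations

from typing import Any, Optional


def load_scores() -> dict[str, int]:
    """Hypothetical data source that returns a dict, not a list."""
    return {"alice": 90, "bob": 75}


def total(scores: Any) -> int:
    """Sum score values, guarding against dict/list confusion."""
    if isinstance(scores, dict):
        # Iterating a dict yields its keys, so take the values explicitly.
        return sum(scores.values())
    if isinstance(scores, list):
        return sum(scores)
    raise TypeError(f"expected dict or list, got {type(scores).__name__}")


def first_item_name(payload: dict[str, Any]) -> Optional[str]:
    """Read payload["result"]["items"][0]["name"] without assuming any level exists."""
    result = payload.get("result")
    if not isinstance(result, dict):
        return None
    items = result.get("items")
    if not isinstance(items, list) or not items:
        return None
    first = items[0]
    return first.get("name") if isinstance(first, dict) else None
```

Without the guard, `sum(load_scores())` would try to add the string keys and raise a `TypeError` at run time, even though the code compiles and lints cleanly. The explicit return annotation on `load_scores` is what lets a type checker flag the misuse earlier.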

package/blueprints/bms-speckit-pipeline.yaml

@@ -9,14 +9,16 @@
 # to prevent loss and provide traceability per the project constitution.
 id: bms_speckit_development_pipeline
-version: 4.3.0
+version: 5.0.0
 name: BMS Speckit Automated Development Pipeline
 description: >
   Full engineering workflow from requirement to verified implementation.
-  Chains brainstorming, constitution, specification, planning, task generation,
-  analysis, implementation (TDD), and final verification.
-  Commits after every important step. Uses knowledge-mcp during brainstorm
-  and specify steps to ground specs in real HOSxP data.
+  Chains brainstorming, external research, constitution, specification,
+  planning, task generation, analysis, implementation (TDD), and final
+  verification. Commits after every important step. Uses knowledge-mcp
+  during brainstorm and specify steps to ground specs in real HOSxP data.
+  Uses WebSearch and context7 during research to survey best practices
+  and current library documentation.
 category: Development
 # ─── Chain Sequence ────────────────────────────────────────────────────────────
@@ -51,7 +53,50 @@ chain_sequence:
         system architecture. Also search bms, moph, nhso collections
         if relevant to the feature.
-  - step_id: step_2_constitution
+  - step_id: step_2_research
+    skill_id: internal.research
+    action: execute
+    phase: 1
+    description: >
+      Survey external best practices, recommended libraries, compliance
+      standards, and reference architectures via WebSearch and context7.
+      Produces a research brief that enriches the specification step.
+    timeout_seconds: 300
+    input:
+      source: "{{step_1_brainstorm.output}}"
+    output:
+      artifacts: [specs/*/research.md]
+    post_action:
+      commit: true
+      message: "feat(speckit): research — best practices and technology survey"
+      push: true
+    error_handling:
+      on_failure: continue
+      max_retries: 1
+    tools:
+      - WebSearch
+      - mcp__plugin_context7_context7__resolve-library-id
+      - mcp__plugin_context7_context7__query-docs
+    opinionated_prompts:
+      system_context: >
+        Research external best practices, design patterns, and recommended
+        libraries/frameworks relevant to the feature described in step 1.
+        1. Identify research topics from the brainstorm output.
+        2. WebSearch for best practices, architecture guidance, and proven
+           patterns for each topic.
+        3. For each technology/library identified: compare alternatives
+           (maintenance, bundle size, license), then use context7
+           resolve-library-id + query-docs to fetch current documentation.
+        4. If the feature involves sensitive data (healthcare, financial, PII),
+           research relevant compliance standards and security guidelines.
+        5. Search for similar implementations or reference architectures.
+        Output a research brief containing: best practices summary,
+        recommended libraries with rationale, documentation references,
+        security/compliance notes, and prior art.
+  - step_id: step_3_constitution
     skill_id: speckit.constitution
     action: execute
     phase: 1
@@ -84,7 +129,7 @@ chain_sequence:
         TDD is non-negotiable. Constitution must enforce testing at all levels,
         atomic commits, reusable components, and centralized business logic.
-  - step_id: step_3_claude_md_sync
+  - step_id: step_4_claude_md_sync
     skill_id: internal.claude_md_verify
     action: execute
     phase: 1
@@ -105,14 +150,15 @@ chain_sequence:
         Read CLAUDE.md and verify it complies with specs/constitution.md.
         Update CLAUDE.md if it conflicts with or is missing constitution rules.
-  - step_id: step_4_specify
+  - step_id: step_5_specify
     skill_id: speckit.specify
     action: execute
     phase: 1
-    description: Create feature specification enriched with HOSxP knowledge
+    description: Create feature specification enriched with HOSxP knowledge and research findings
     timeout_seconds: 300
     input:
       requirement: "{{step_1_brainstorm.output}}"
+      research: "{{step_2_research.output}}"
     output:
       artifacts: [specs/*/spec.md]
     post_action:
@@ -126,19 +172,22 @@ chain_sequence:
       system_context: >
         Generate a complete, testable specification. Include acceptance criteria,
         data models, API contracts, and edge cases.
+        Incorporate research findings from step 2 — recommended libraries,
+        best practices, and compliance requirements should be referenced
+        in the specification.
         Use mcp__bms-knowledge-mcp__search_knowledge to search the hosxp
         collection for exact table names, field names, data types, and
         relationships needed by this feature. Reference actual HOSxP
         data structures in the spec, not assumed ones.
-  - step_id: step_5_plan
+  - step_id: step_6_plan
     skill_id: speckit.plan
     action: execute
     phase: 1
     description: Generate implementation plan from specification
     timeout_seconds: 300
     input:
-      source: "{{step_4_specify.artifacts}}"
+      source: "{{step_5_specify.artifacts}}"
     output:
       artifacts: [specs/*/plan.md]
     post_action:
@@ -153,14 +202,14 @@ chain_sequence:
         Plan in ordered steps. Include file paths, component boundaries,
         test strategy, and rollback considerations.
-  - step_id: step_6_tasks
+  - step_id: step_7_tasks
     skill_id: speckit.tasks
     action: execute
     phase: 1
     description: Generate dependency-ordered task list
     timeout_seconds: 300
     input:
-      source: "{{step_5_plan.artifacts}}"
+      source: "{{step_6_plan.artifacts}}"
     output:
       artifacts: [specs/*/tasks.md]
     post_action:
@@ -175,14 +224,14 @@ chain_sequence:
         Tasks must be atomic, dependency-ordered, and each independently testable.
         Include clear acceptance criteria per task.
-  - step_id: step_7_analyze
+  - step_id: step_8_analyze
     skill_id: speckit.analyze
     action: execute
     phase: 1
     description: Cross-artifact consistency and quality check
     timeout_seconds: 300
     input:
-      source: "{{step_6_tasks.artifacts}}"
+      source: "{{step_7_tasks.artifacts}}"
     output:
       artifacts: [analysis_report]
     post_action:
@@ -194,12 +243,14 @@ chain_sequence:
       max_retries: 1
     opinionated_prompts:
       system_context: >
-        Check spec ↔ plan ↔ tasks consistency. Flag gaps, contradictions,
-        and missing test coverage. Non-destructive — report only.
+        Check spec ↔ plan ↔ tasks ↔ research consistency. Flag gaps,
+        contradictions, and missing test coverage. Verify that research
+        recommendations are reflected in the spec and plan.
+        Non-destructive — report only.
   # ── Phase 2: Implementation (main context) ────────────────────────────────
-  - step_id: step_8_compact
+  - step_id: step_9_compact
     skill_id: internal.compact
     action: execute
     phase: 2
@@ -209,7 +260,7 @@ chain_sequence:
       on_failure: continue
       max_retries: 0
-  - step_id: step_9_implement_with_rolling_qc
+  - step_id: step_10_implement_with_rolling_qc
     skill_id: speckit.implement
     action: execute_loop
     phase: 2
@@ -219,7 +270,7 @@ chain_sequence:
       next task. Catches bugs at the source, not at the end.
     timeout_seconds: 3600
     input:
-      tasks_path: "{{step_6_tasks.artifacts}}"
+      tasks_path: "{{step_7_tasks.artifacts}}"
       loop_engine: ralph-loop
       completion_promise: "FINISHED"
       max_iterations: 10
@@ -237,6 +288,16 @@ chain_sequence:
         For EACH task, execute this rolling QC cycle:
         1. IMPLEMENT — write code following TDD (tests first, then implementation)
+           RUNTIME SAFETY RULES (prevent errors that build/lint miss):
+           - Always add explicit return type annotations on data transformation
+             functions. Never rely on type inference for functions whose return
+             values are spread, iterated, or passed to collection methods
+           - Never spread or iterate a function return value without verifying
+             it returns the expected collection type (array not object, list not dict)
+           - Use strict equality, add null/undefined/None guards for external
+             data (API responses, DB results, config, user input)
+           - Write unit tests that execute data transformation functions and
+             verify the output type and shape
         2. INLINE QC — immediately after implementation, run:
            a. Build/compile — fix any type or build errors
            b. Lint — fix all lint errors and warnings
@@ -245,6 +306,9 @@ chain_sequence:
               XSS, unvalidated input in the code you just wrote
            e. UX check — if UI code was changed, verify error messages are
               actionable, loading states exist, and user feedback is present
+           f. Runtime safety scan — grep for spread/iteration on non-collection
+              types, missing return type annotations on data transformation
+              functions, loose equality operators. Fix any found.
         3. FIX — fix every issue found in step 2, then re-run checks
         4. COMMIT — only commit when build + lint + tests all pass with zero errors
         5. NEXT TASK — proceed to the next task
@@ -256,19 +320,19 @@ chain_sequence:
         validation pass. Apply improvements, re-run all tests, confirm zero
         regression. Only output FINISHED after everything is validated.
-  - step_id: step_10_final_quality_gate
+  - step_id: step_11_final_quality_gate
     agent_id: bms-speckit:quality-control
     action: dispatch_agent
     phase: 2
     description: >
       Final comprehensive QC sweep by the quality-control agent. Since inline
       QC already caught per-task issues, this focuses on cross-cutting concerns:
-      dependency health, deep security audit, overall UX consistency, and
-      accessibility compliance.
+      dependency health, deep security audit, overall UX consistency,
+      accessibility compliance, and deployment artifact validation.
     timeout_seconds: 900
     post_action:
       commit: true
-      message: "fix(speckit): final QC — security, deps, UX consistency, accessibility"
+      message: "fix(speckit): final QC — security, deps, UX consistency, accessibility, deployment"
       push: true
     error_handling:
       on_failure: stop
@@ -284,10 +348,14 @@ chain_sequence:
            all features, empty states, responsive design
         D. Accessibility — alt text, form labels, keyboard nav, heading hierarchy
         E. Integration check — verify all components work together end-to-end
+        F. Deployment artifacts — Dockerfile lint, base image CVE check via
+           WebSearch, docker-compose validation, .dockerignore, CI/CD configs
+           (skipped if no deployment files exist)
         Fix everything possible. Flag major dependency updates for user review.
-        Only proceed to merge when all checks pass.
+        Proceed to merge unless unfixed build errors, test failures, or
+        critical security vulnerabilities remain.
-  - step_id: step_11_merge
+  - step_id: step_12_merge
     skill_id: internal.git_merge_to_main
     action: execute
     phase: 2
@@ -308,22 +376,26 @@ chain_sequence:
 metadata:
   author: manoirx
   created_at: "2026-03-29"
-  tags: [speckit, tdd, workflow, development, chain, engineering, hosxp, knowledge]
-  estimated_duration_seconds: 6000
+  updated_at: "2026-04-02"
+  tags: [speckit, tdd, workflow, development, chain, engineering, hosxp, knowledge, research]
+  estimated_duration_seconds: 6300
   commit_strategy: per_step
   knowledge_collections:
     hosxp: "Data dictionaries, ER diagrams, table schemas, module architecture"
     bms: "BMS tools, organization, API info"
     moph: "Ministry of Public Health regulations"
     nhso: "NHSO reimbursement rules and announcements"
+  external_research:
+    websearch: "Best practices, library comparisons, compliance standards, reference architectures"
+    context7: "Current library documentation and API references"
   phases:
     phase_1:
       name: Specification & Planning
-      steps: [step_1 through step_7]
+      steps: [step_1 through step_8]
       execution: subagent
       reason: Preserve main context window for implementation
     phase_2:
       name: Implementation & Verification & Merge
-      steps: [step_8 through step_11]
+      steps: [step_9 through step_12]
       execution: main_context
       reason: Implementation needs full tool access and ralph-loop
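
Note on the renumbering in this file: inserting `step_2_research` shifts every later `step_id`, so each `{{step_N_name.output}}` or `{{step_N_name.artifacts}}` reference has to move with it. A stale reference can be caught mechanically. The sketch below is a hypothetical checker, not part of the plugin; it assumes the blueprint has already been parsed into a list of plain dicts and that references only ever point at earlier steps.

```python
from __future__ import annotations

import re

# Matches template references such as "{{step_5_specify.artifacts}}".
REF_PATTERN = re.compile(r"\{\{\s*(step_\d+_\w+)\.\w+\s*\}\}")


def find_dangling_refs(steps: list[dict]) -> list[tuple[str, str]]:
    """Return (step_id, referenced_id) pairs that point at a missing or later step."""
    seen: set[str] = set()
    dangling: list[tuple[str, str]] = []
    for step in steps:
        # Only the "input" mapping is scanned; other fields are ignored here.
        for value in step.get("input", {}).values():
            for ref in REF_PATTERN.findall(str(value)):
                if ref not in seen:
                    dangling.append((step["step_id"], ref))
        seen.add(step["step_id"])
    return dangling
```

For example, if `step_6_plan` still read its input from `{{step_4_specify.artifacts}}` after the rename, the function would report `("step_6_plan", "step_4_specify")`.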

package/package.json

@@ -1,6 +1,6 @@
 {
   "name": "bms-speckit-plugin",
-  "version": "5.3.0",
+  "version": "6.1.0",
   "description": "Chain-orchestrated development pipeline: /bms-speckit takes requirements and runs brainstorm → constitution → specify → plan → tasks → analyze → implement → verify with per-step error handling",
   "files": [
     ".claude-plugin/",

package/skills/bms-speckit-auto/SKILL.md

@@ -14,7 +14,7 @@ Chain blueprint: `blueprints/bms-speckit-pipeline.yaml`
 3. On step failure, check the **error policy** before proceeding:
    - `on_failure: stop` → halt the chain and report which step failed
    - `on_failure: continue` → log the failure and proceed to next step
-4. Pass `$ARGUMENTS` (the user's requirement) to step 1 (brainstorm). Step 4 (specify) receives the **output from step 1**
+4. Pass `$ARGUMENTS` (the user's requirement) to step 1 (brainstorm). Step 5 (specify) receives the **output from step 1 enriched by step 2 research**
 5. **Commit and push after every important step** — each artifact-producing step must commit its output before proceeding to the next step
 6. **Report progress** to the user at every step transition
 7. Do NOT ask for confirmation between steps
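
Note on the error policy in rule 3: the `stop` / `continue` behaviour can be sketched as a small loop. This is an illustrative model only, not the plugin's executor. It assumes each step is a dict with a hypothetical `run` callable, that `max_retries` means extra attempts after the first, and that a missing `on_failure` defaults to `stop`; the source confirms none of these.

```python
from __future__ import annotations


def run_chain(steps: list[dict], log: list[str]) -> bool:
    """Run steps in order; return True if the chain reached the end."""
    for step in steps:
        attempts = 1 + step.get("max_retries", 0)  # assumption: retries are extra attempts
        succeeded = False
        for _ in range(attempts):
            try:
                step["run"]()  # hypothetical callable standing in for the skill call
                succeeded = True
                break
            except Exception as exc:
                log.append(f"{step['step_id']} failed: {exc}")
        if succeeded:
            continue
        if step.get("on_failure", "stop") == "stop":
            # on_failure: stop -> halt the chain and report which step failed
            log.append(f"halted at {step['step_id']}")
            return False
        # on_failure: continue -> the failure is already logged; go to the next step
    return True
```

Under this model, the new research step (`on_failure: continue`, `max_retries: 1`) can fail twice without blocking the constitution step, whereas a failed final quality gate (`on_failure: stop`) prevents the merge step from running.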
@@ -27,16 +27,17 @@ Before dispatching the Phase 1 subagent, create a progress tracker so the user c
 ```
 TaskCreate: "Step 1: Brainstorm — explore requirements"
-TaskCreate: "Step 2: Constitution — engineering principles"
-TaskCreate: "Step 3: CLAUDE.md sync"
-TaskCreate: "Step 4: Specify — feature specification"
-TaskCreate: "Step 5: Plan — implementation plan"
-TaskCreate: "Step 6: Tasks — task list"
-TaskCreate: "Step 7: Analyze — consistency check"
-TaskCreate: "Step 8: Compact context"
-TaskCreate: "Step 9: Implement with rolling QC"
-TaskCreate: "Step 10: Final quality gate"
-TaskCreate: "Step 11: Merge to main"
+TaskCreate: "Step 2: Research — best practices & technology survey"
+TaskCreate: "Step 3: Constitution — engineering principles"
+TaskCreate: "Step 4: CLAUDE.md sync"
+TaskCreate: "Step 5: Specify — feature specification"
+TaskCreate: "Step 6: Plan — implementation plan"
+TaskCreate: "Step 7: Tasks — task list"
+TaskCreate: "Step 8: Analyze — consistency check"
+TaskCreate: "Step 9: Compact context"
+TaskCreate: "Step 10: Implement with rolling QC"
+TaskCreate: "Step 11: Final quality gate"
+TaskCreate: "Step 12: Merge to main"
 ```
 Then output a message to the user:
@@ -57,14 +58,14 @@ You are running the BMS Speckit specification and planning chain. Execute each s
 **IMPORTANT — Progress reporting:** Before starting each step, output a progress message to the user in this format:
-`[Step N/11] step_name — description...`
+`[Step N/12] step_name — description...`
 After completing each step, output:
-`[Step N/11] DONE — brief result summary`
+`[Step N/12] DONE — brief result summary`
 ### Step 1 — Brainstorm `[on_failure: STOP]`
-- **Progress:** Output `[Step 1/11] Brainstorm — exploring requirements and design...`
+- **Progress:** Output `[Step 1/12] Brainstorm — exploring requirements and design...`
 - **Skill:** `superpowers.brainstorm`
 - **Input:** "$ARGUMENTS"
 - **Purpose:** Explore intent, requirements, design alternatives, and edge cases
@@ -72,66 +73,88 @@ After completing each step, output:
 - **Output:** Detailed specification document
 - **Timeout:** 300s
 - **Post-action:** Commit all files and push. Message: `feat(speckit): brainstorm — explore requirements and design`
-- **Done:** Output `[Step 1/11] DONE — brainstorm complete`
+- **Done:** Output `[Step 1/12] DONE — brainstorm complete`
-### Step 2 — Constitution `[on_failure: STOP]`
-- **Progress:** Output `[Step 2/11] Constitution — establishing engineering principles...`
+### Step 2 — Research `[on_failure: CONTINUE]`
+- **Progress:** Output `[Step 2/12] Research — surveying best practices and technologies...`
+- **Purpose:** Research external best practices, proven patterns, recommended libraries/frameworks, and informative reference material relevant to the user's requirement. This grounds the pipeline in current industry knowledge before specification begins.
+- **Tools:** Use `WebSearch` and `mcp__plugin_context7_context7__resolve-library-id` + `mcp__plugin_context7_context7__query-docs` for library documentation
+- **Timeout:** 300s
+- **Process:**
+  1. **Identify research topics** from Step 1 output — extract key technologies, patterns, and domains that need external validation
+  2. **Best practices search** — WebSearch for best practices, design patterns, and architecture guidance relevant to the feature (e.g., "real-time dashboard best practices", "patient data security standards", "React table performance optimization")
+  3. **Library/framework survey** — For each technology identified in brainstorm:
+     - WebSearch for current recommended libraries, compare alternatives (stars, maintenance, bundle size, license)
+     - Use context7 `resolve-library-id` then `query-docs` to fetch current documentation for top candidates
+  4. **Security & compliance research** — If the feature involves sensitive data (healthcare, financial, PII), search for relevant compliance standards and security guidelines
+  5. **Prior art** — Search for similar implementations, open-source examples, or reference architectures that inform the design
+- **Output:** A research brief saved as `specs/*/research.md` containing:
+  - **Best practices summary** — key patterns and principles to follow
+  - **Recommended libraries** — with rationale (why this over alternatives)
+  - **Documentation references** — links and key API details from context7
+  - **Security/compliance notes** — standards to adhere to (if applicable)
+  - **Prior art** — reference implementations or architectures found
+- **Post-action:** Commit all files and push. Message: `feat(speckit): research — best practices and technology survey`
+- **Done:** Output `[Step 2/12] DONE — research brief created (N topics researched)`
+### Step 3 — Constitution `[on_failure: STOP]`
+- **Progress:** Output `[Step 3/12] Constitution — establishing engineering principles...`
 - **Skill:** `speckit.constitution`
 - **Input:** "Establish and enforce a comprehensive set of engineering principles that prioritize high code quality, strict adherence to Test-Driven Development (TDD) practices, and well-defined testing standards across unit, component, integration, and API levels to ensure system reliability and maintainability, maintain a consistent, user-friendly, and professional user interface aligned with strong user experience (UX) guidelines, optimize application performance through efficient architecture and resource management; enforce disciplined version control practices with frequent, atomic commits to minimize risk and improve traceability, promote the development and reuse of modular components and functions while centralizing business logic to avoid duplication and ensure consistency, provide clear, informative user feedback and progress reporting throughout system interactions, and leverage all available tools, frameworks, and domain-specific expertise to support developers in delivering robust, scalable, and high-quality applications."
 - **Output:** `specs/constitution.md`
 - **Timeout:** 300s
-- **Done:** Output `[Step 2/11] DONE — constitution created`
+- **Done:** Output `[Step 3/12] DONE — constitution created`
-### Step 3 — CLAUDE.md Sync `[on_failure: CONTINUE]`
-- **Progress:** Output `[Step 3/11] CLAUDE.md Sync — verifying compliance...`
+### Step 4 — CLAUDE.md Sync `[on_failure: CONTINUE]`
+- **Progress:** Output `[Step 4/12] CLAUDE.md Sync — verifying compliance...`
 - **Action:** Read CLAUDE.md and verify it complies with the constitution in `specs/constitution.md`. Update CLAUDE.md if it conflicts with or is missing constitution rules.
 - **Timeout:** 120s
 - **Post-action:** Commit all files and push. Message: `feat(speckit): add constitution and sync CLAUDE.md`
-- **Done:** Output `[Step 3/11] DONE — CLAUDE.md synced`
+- **Done:** Output `[Step 4/12] DONE — CLAUDE.md synced`
-### Step 4 — Specify `[on_failure: STOP]`
-- **Progress:** Output `[Step 4/11] Specify — creating feature specification...`
+### Step 5 — Specify `[on_failure: STOP]`
+- **Progress:** Output `[Step 5/12] Specify — creating feature specification...`
 - **Skill:** `speckit.specify`
-- **Input:** Use the detailed specification output from Step 1 (brainstorm) as the argument
+- **Input:** Use the detailed specification output from Step 1 (brainstorm) **enriched with the research findings from Step 2**. Include recommended libraries, best practices, and compliance requirements from the research brief in the specification input.
 - **Knowledge lookup:** Use `mcp__bms-knowledge-mcp__search_knowledge` to search the `hosxp` collection for exact table names, field names, data types, and relationships needed by this feature. Reference actual HOSxP data structures in the spec.
 - **Output:** `specs/*/spec.md`
 - **Timeout:** 300s
 - **Retry:** up to 2 attempts
 - **Post-action:** Commit all files and push. Message: `feat(speckit): add feature specification`
-- **Done:** Output `[Step 4/11] DONE — specification created`
+- **Done:** Output `[Step 5/12] DONE — specification created`
-### Step 5 — Plan `[on_failure: STOP]`
-- **Progress:** Output `[Step 5/11] Plan — generating implementation plan...`
+### Step 6 — Plan `[on_failure: STOP]`
+- **Progress:** Output `[Step 6/12] Plan — generating implementation plan...`
 - **Skill:** `speckit.plan`
-- **Input:** reads from step 4 artifacts
+- **Input:** reads from step 5 artifacts
 - **Output:** `specs/*/plan.md`
 - **Timeout:** 300s
 - **Retry:** up to 2 attempts
 - **Post-action:** Commit all files and push. Message: `feat(speckit): add implementation plan`
-- **Done:** Output `[Step 5/11] DONE — plan created`
+- **Done:** Output `[Step 6/12] DONE — plan created`
-### Step 6 — Tasks `[on_failure: STOP]`
-- **Progress:** Output `[Step 6/11] Tasks — generating task list...`
+### Step 7 — Tasks `[on_failure: STOP]`
+- **Progress:** Output `[Step 7/12] Tasks — generating task list...`
 - **Skill:** `speckit.tasks`
-- **Input:** reads from step 5 artifacts
+- **Input:** reads from step 6 artifacts
 - **Output:** `specs/*/tasks.md`
 - **Timeout:** 300s
 - **Retry:** up to 2 attempts
 - **Post-action:** Commit all files and push. Message: `feat(speckit): add task list`
-- **Done:** Output `[Step 6/11] DONE — N tasks created`
+- **Done:** Output `[Step 7/12] DONE — N tasks created`
-### Step 7 — Analyze `[on_failure: CONTINUE]`
-- **Progress:** Output `[Step 7/11] Analyze — cross-artifact consistency check...`
+### Step 8 — Analyze `[on_failure: CONTINUE]`
+- **Progress:** Output `[Step 8/12] Analyze — cross-artifact consistency check...`
 - **Skill:** `speckit.analyze`
-- **Purpose:** Cross-artifact consistency check (spec ↔ plan ↔ tasks). Non-destructive — report only.
+- **Purpose:** Cross-artifact consistency check (spec ↔ plan ↔ tasks ↔ research). Non-destructive — report only.
 - **Timeout:** 300s
 - **Post-action:** Commit all files and push. Message: `feat(speckit): add cross-artifact analysis report`
-- **Done:** Output `[Step 7/11] DONE — analysis complete`
+- **Done:** Output `[Step 8/12] DONE — analysis complete`
 After all steps complete, return: the feature name, number of tasks created, and the path to tasks.md.
 """
-After the subagent completes, update tasks 1-7 as completed using TaskUpdate, then output:
+After the subagent completes, update tasks 1-8 as completed using TaskUpdate, then output:
 > Phase 1 complete. N tasks created at specs/feature-name/tasks.md
 > Starting Phase 2 (implementation)...
@@ -142,32 +165,36 @@ After the subagent completes, update tasks 1-7 as completed using TaskUpdate, th
 > **Execution context:** Runs in main context after subagent completes.
-### Step 8 — Compact `[on_failure: CONTINUE]`
-- **Progress:** Output `[Step 8/11] Compact — freeing context window...`
+### Step 9 — Compact `[on_failure: CONTINUE]`
+- **Progress:** Output `[Step 9/12] Compact — freeing context window...`
 - **Action:** Run `/compact` to free context window before implementation.
-- **Done:** Update task 8 as completed.
+- **Done:** Update task 9 as completed.
-### Step 9 — Implement with Rolling QC `[on_failure: CONTINUE | max_retries: 3]`
-- **Progress:** Output `[Step 9/11] Implement — starting rolling QC loop (N tasks)...`
+### Step 10 — Implement with Rolling QC `[on_failure: CONTINUE | max_retries: 3]`
+- **Progress:** Output `[Step 10/12] Implement — starting rolling QC loop (N tasks)...`
 - **Engine:** ralph-loop
 - **Input:** Use the **tasks.md path returned by the Phase 1 subagent** (e.g. `specs/my-feature/tasks.md`). Replace `{TASKS_PATH}` below with the actual path.
 - **Completion promise:** `FINISHED`
 - **Max iterations:** 10
 - **Pattern:** Rolling Review — each task gets its own QC cycle before moving to the next
 - **Per-task cycle:**
-  1. **IMPLEMENT** — write code following TDD (tests first, then implementation)
+  1. **IMPLEMENT** — write code following TDD (tests first, then implementation). **Runtime safety rules:**
+     - Always add explicit return type annotations on data transformation functions — never rely on type inference for functions whose return values are spread, iterated, or passed to collection methods
+     - Never spread or iterate a function's return value without verifying it returns the expected collection type (e.g., array not object, list not dict)
+     - Use strict equality, add null/undefined/None guards for external data (API responses, DB results, config, user input)
+     - Add unit tests that actually execute data transformation functions and verify the output type and shape
   2. **INLINE QC** — immediately run: build, lint, ALL tests, security quick scan, UX check
   3. **FIX** — fix every issue found, re-run checks
   4. **COMMIT** — only commit when build + lint + tests pass with zero errors
   5. **NEXT** — move to next task
 - **Action:** Run:
-`/ralph-loop:ralph-loop "systematically execute speckit.implement via the Skill tool to complete every task defined in {TASKS_PATH} with strict adherence to specification requirements. IMPORTANT: apply rolling QC after EACH task — after implementing a task run build and fix build errors, run linter and fix lint errors, run ALL tests (not just new ones) and fix failures, check for hardcoded secrets and injection vulnerabilities in code you just wrote, verify UI code has actionable error messages and loading states — only commit when build plus lint plus tests all pass with zero errors then proceed to next task. Report progress to the user after each task: output [Task N/total] DONE — task_name. Do NOT batch QC at the end. Maintain atomic commits after each successful task with clear traceability, avoid requesting confirmation and proceed autonomously, once all tasks are implemented invoke speckit.analyze via the Skill tool to perform a full validation pass, automatically apply all recommended improvements or corrections, re-run all tests to confirm stability and zero regression, and only output <promise>FINISHED</promise> after every task is fully completed, validated, and aligned with production-grade quality standards" --completion-promise "FINISHED" --max-iterations 10`
+`/ralph-loop:ralph-loop "systematically execute speckit.implement via the Skill tool to complete every task defined in {TASKS_PATH} with strict adherence to specification requirements. IMPORTANT: apply rolling QC after EACH task — after implementing a task run build and fix build errors, run linter and fix lint errors, run ALL tests (not just new ones) and fix failures, check for hardcoded secrets and injection vulnerabilities in code you just wrote, verify UI code has actionable error messages and loading states. RUNTIME SAFETY: always add explicit return type annotations on data transformation functions, never spread or iterate a function return value without verifying it returns the expected collection type, use strict equality and null guards for external data, write tests that execute data transformers and verify output type and shape. Only commit when build plus lint plus tests all pass with zero errors then proceed to next task. Report progress to the user after each task: output [Task N/total] DONE — task_name. Do NOT batch QC at the end. Maintain atomic commits after each successful task with clear traceability, avoid requesting confirmation and proceed autonomously, once all tasks are implemented invoke speckit.analyze via the Skill tool to perform a full validation pass, automatically apply all recommended improvements or corrections, re-run all tests to confirm stability and zero regression, and only output <promise>FINISHED</promise> after every task is fully completed, validated, and aligned with production-grade quality standards" --completion-promise "FINISHED" --max-iterations 10`
-- **Done:** Update task 9 as completed. Output `[Step 9/11] DONE — all tasks implemented and verified`
+- **Done:** Update task 10 as completed. Output `[Step 10/12] DONE — all tasks implemented and verified`
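
The runtime safety rules added to the implement step can be illustrated with a minimal TypeScript sketch. The `RawPatient` and `Patient` types and the `toPatients` and `mergePatients` functions are hypothetical names chosen for illustration and are not part of the plugin. The sketch assumes external data arrives as `unknown` and must be validated before it is iterated or spread.

```typescript
// Hypothetical shape of a row from an external source (API, DB); not from the plugin.
interface RawPatient {
  id?: number | null;
  name?: string | null;
}

interface Patient {
  id: number;
  name: string;
}

// Explicit return type: the compiler rejects a body that returns an
// object map (or anything else) instead of an array of Patient.
function toPatients(rows: unknown): Patient[] {
  // Guard: external data may be null, undefined, or not an array at all.
  if (!Array.isArray(rows)) {
    return [];
  }
  const result: Patient[] = [];
  for (const row of rows as Array<RawPatient | null | undefined>) {
    // Strict equality, so falsy-but-valid values are not treated as missing.
    if (row === null || row === undefined) continue;
    if (typeof row.id !== "number" || typeof row.name !== "string") continue;
    result.push({ id: row.id, name: row.name });
  }
  return result;
}

// Verify the collection type at runtime before spreading the return value.
function mergePatients(existing: Patient[], incoming: unknown): Patient[] {
  const parsed = toPatients(incoming);
  if (!Array.isArray(parsed)) {
    throw new TypeError("toPatients must return an array");
  }
  return [...existing, ...parsed];
}
```

A unit test for such a transformer would call it with well-formed input, malformed input, and `null`, then check that the result is an array of the expected length and shape, in line with the rule about testing output type and shape.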
-### Step 10 — Final Quality Gate `[on_failure: STOP | max_retries: 3]`
-- **Progress:** Output `[Step 10/11] Final QC — running comprehensive quality audit...`
+### Step 11 — Final Quality Gate `[on_failure: STOP | max_retries: 3]`
+- **Progress:** Output `[Step 11/12] Final QC — running comprehensive quality audit...`
 - **Agent:** Dispatch `bms-speckit:quality-control` agent
 - **Purpose:** Final comprehensive sweep. Since inline QC already caught per-task issues, this focuses on **cross-cutting concerns** that can only be detected across the full codebase.
 - **Timeout:** 900s
@@ -179,15 +206,15 @@ After the subagent completes, update tasks 1-7 as completed using TaskUpdate, th
   - **E. Integration check** — verify all components work together end-to-end
   - **F. Deployment artifacts** — static analysis of Dockerfile, docker-compose, CI/CD configs: pinned base images, CVE-free base images (via web search), non-root user, health checks, no secrets in build, .dockerignore coverage (skipped if no deployment files exist)
 - The agent fixes everything it can. Major dependency updates are flagged for user review.
-- **Completion rule:** When the QC agent returns its report, proceed to Step 11 **unless** the report contains unfixed build errors, unfixed test failures, or unfixed critical security vulnerabilities. Informational findings, flagged-for-review items, and already-fixed issues do NOT block progression. If uncertain, proceed — the QC agent already fixed what it could.
+- **Completion rule:** When the QC agent returns its report, proceed to Step 12 **unless** the report contains unfixed build errors, unfixed test failures, or unfixed critical security vulnerabilities. Informational findings, flagged-for-review items, and already-fixed issues do NOT block progression. If uncertain, proceed — the QC agent already fixed what it could.
 - **Post-action:** Commit all fixes and push. Message: `fix(speckit): final QC — security, deps, UX consistency, accessibility`
-- **Done:** Update task 10 as completed. Output `[Step 10/11] DONE — quality gate passed`
+- **Done:** Update task 11 as completed. Output `[Step 11/12] DONE — quality gate passed`
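
The completion rule for the final quality gate can be read as a simple predicate over the QC report. The sketch below is an assumption about how that decision might be modeled: the `QcReport` field names and the `shouldProceedToMerge` function are hypothetical, since the source does not specify the report format.

```typescript
// Hypothetical report shape; the real QC agent report format is not given in the source.
interface QcReport {
  unfixedBuildErrors: number;
  unfixedTestFailures: number;
  unfixedCriticalVulnerabilities: number;
  informationalFindings: number;
  flaggedForReview: number;
  alreadyFixed: number;
}

// Returns true when the pipeline may move on to the merge step.
// Missing fields default to 0, which mirrors "if uncertain, proceed".
function shouldProceedToMerge(report: Partial<QcReport>): boolean {
  const blocking =
    (report.unfixedBuildErrors ?? 0) +
    (report.unfixedTestFailures ?? 0) +
    (report.unfixedCriticalVulnerabilities ?? 0);
  // Informational, flagged-for-review, and already-fixed items never block.
  return blocking === 0;
}
```

Only the three unfixed categories contribute to the blocking count, so a report with many informational findings still passes the gate.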
-### Step 11 — Merge to Main `[on_failure: STOP]`
-- **Progress:** Output `[Step 11/11] Merge — merging to main branch...`
+### Step 12 — Merge to Main `[on_failure: STOP]`
+- **Progress:** Output `[Step 12/12] Merge — merging to main branch...`
 - **Action:** Switch to main branch, merge the feature branch (fast-forward if possible), push main to remote, then clean up the feature branch.
 - **Timeout:** 120s
-- **Done:** Update task 11 as completed. Output:
+- **Done:** Update task 12 as completed. Output:
 > Pipeline complete! Feature merged to main.
@@ -198,13 +225,15 @@ After the subagent completes, update tasks 1-7 as completed using TaskUpdate, th
 ```
 Phase 1 (subagent)                          Phase 2 (main context)
 ──────────────────────────────              ──────────────────────────────
-Step 1: brainstorm ──STOP── commit          Step 8:  compact
-        + knowledge search (hosxp)          Step 9:  implement + rolling QC
-Step 2: constitution ─STOP─┐                         ┌─ implement task ─┐
-Step 3: CLAUDE.md sync ───┘ commit                   │  inline QC       │
-Step 4: specify ──────STOP── commit                  │  fix → commit    │
-        + knowledge search (hosxp)                   └─ next task ──────┘
-Step 5: plan ─────────STOP── commit         Step 10: final QC agent ── commit
-Step 6: tasks ────────STOP── commit                  (security/deps/UX/a11y)
-Step 7: analyze ──────────── commit         Step 11: merge to main + push
+Step 1: brainstorm ──STOP── commit          Step 9:  compact
+        + knowledge search (hosxp)          Step 10: implement + rolling QC
+Step 2: research ────────── commit                   ┌─ implement task ─┐
+        + WebSearch + context7                       │  inline QC       │
+Step 3: constitution ─STOP─┐                         │  fix → commit    │
+Step 4: CLAUDE.md sync ───┘ commit                   └─ next task ──────┘
+Step 5: specify ──────STOP── commit         Step 11: final QC agent ── commit
+        + knowledge search (hosxp)                   (security/deps/UX/a11y/deploy)
+Step 6: plan ─────────STOP── commit         Step 12: merge to main + push
+Step 7: tasks ────────STOP── commit
+Step 8: analyze ──────────── commit
 ```