npm - codex-workflows - Versions diffs - 0.4.11 → 0.5.0 - Mend

codex-workflows 0.4.11 → 0.5.0


Files changed (29)

package/.agents/skills/recipe-prepare-implementation/SKILL.md ADDED

@@ -0,0 +1,162 @@
+---
+name: recipe-prepare-implementation
+description: "Verify that an approved work plan is implementable before build execution, resolving readiness gaps through Phase 0 prep tasks when needed."
+---
+## Required Skills [LOAD BEFORE EXECUTION]
+1. [LOAD IF NOT ACTIVE] `coding-rules` -- coding standards
+2. [LOAD IF NOT ACTIVE] `testing` -- test strategy and quality gates
+3. [LOAD IF NOT ACTIVE] `ai-development-guide` -- AI development patterns
+4. [LOAD IF NOT ACTIVE] `documentation-criteria` -- document templates
+5. [LOAD IF NOT ACTIVE] `subagents-orchestration-guide` -- agent coordination
+**Spawn rule**: every `spawn_agent` call MUST pass `fork_turns="none"` or `fork_context=false` for context isolation.
+## Purpose
+Run this recipe after work-plan approval and before any build or implementation execution. It verifies that the plan can be executed from Phase 1 onward without missing verification references, test prerequisites, UI render surfaces, or local execution instructions.
+The recipe is safe to invoke unconditionally. If all readiness criteria pass, it only updates the work plan readiness marker and report.
+Work plan: $ARGUMENTS
+## Readiness Marker Contract
+Use the Implementation Readiness Marker Contract defined in `subagents-orchestration-guide`. If the line is absent, treat the work plan as `pending` and insert it after `Related Issue/PR:` when persisting the report.
+## Readiness Criteria
+Each criterion produces `pass`, `fail`, or `not_applicable`, with file:line evidence where possible.
+| ID | Criterion | Pass Evidence |
+|----|-----------|---------------|
+| R1 | Verification Strategy references resolve | Every command, file path, function, endpoint, fixture, seed, and test reference in the work plan's Verification Strategies either exists now or is the deliverable of a task in the plan |
+| R2 | E2E prerequisites are addressed | For each fixture-e2e or service-integration-e2e skeleton, every noted precondition is present in the codebase or covered by a Phase 0 task |
+| R3 | Phase 1 observability exists | The first implementation phase includes at least one operation verification method executable at task completion using existing files, prior Phase 0 deliverables, or the task's own output |
+| R4 | UI rendering surface exists | When the plan implements UI components, a fixture entry, dev route, Storybook story, preview harness, or equivalent render surface exists or is covered by a Phase 0 task |
+| R5 | Local lane procedure exists | The work plan or referenced docs record commands needed to run the relevant local service stack or browser harness, including startup commands, ports, seed steps, and required environment variables |
+R4 applies only to UI work. R5 applies when the plan uses a local service stack or browser harness.
+## Execution Flow
+### Step 1: Load Inputs
+Read the work plan passed in `$ARGUMENTS`; if absent, select the most recent non-template `docs/plans/*.md`. Extract:
+- Verification Strategies
+- Quality Assurance Mechanisms
+- Design-to-Plan Traceability
+- UI Spec Component -> Task Mapping
+- Connection Map
+- test skeleton references and E2E absence reasons
+- phase structure and task IDs
+- referenced Design Docs and UI Specs
+If no work plan exists, stop and report the missing prerequisite.
+### Step 2: Readiness Scan
+Evaluate R1-R5 using repository search and the work plan content. Build a `## Implementation Readiness Report` regardless of outcome.
+For each `fail`, identify the smallest concrete prep task that closes the gap. Examples:
+- Add fixture data for a UI state
+- Add an API mock handler for fixture-e2e
+- Add a seed script for service-integration-e2e
+- Add a Storybook story, dev route, or equivalent render surface
+- Document local startup commands and required environment variables
+- Add a missing verification helper or script referenced by the plan
+### Step 3: No-Op Success
+When all applicable criteria are `pass`:
+1. Persist `## Implementation Readiness Report` in the work plan immediately after the header block.
+2. Set `Implementation Readiness: ready`.
+3. Do not create task files.
+4. Report `outcome: ready`.
+### Step 4: Create Resolution Tasks
+When one or more criteria fail:
+1. Present the proposed prep tasks to the user and continue only after explicit approval.
+2. Create task files in `docs/plans/tasks/` using the task template:
+   - Backend prep: `{plan-name}-backend-task-prep-{NN}.md`
+   - Frontend prep: `{plan-name}-frontend-task-prep-{NN}.md`
+   - Single-layer prep: `{plan-name}-task-prep-{NN}.md`
+3. Insert the tasks into the work plan's existing Phase 0 when one exists. If no Phase 0 exists, create `Phase 0: Implementation Readiness Prep` before Phase 1. Keep existing Phase 0 task IDs stable; assign prep task IDs after existing Phase 0 tasks or use a clearly labeled `P0-PREP-N` identifier when the plan's numbering would otherwise require renumbering.
+4. Each prep task must include Investigation Targets, concrete implementation steps, and Operation Verification Methods.
+Layer selection:
+- Use frontend prep when every target is UI, browser harness, component, page, or frontend fixture work.
+- Use backend prep when every target is API, server, service, repository, database, seed, or backend fixture work.
+- Use single-layer prep for non-layered repositories.
+- Escalate if the gap crosses layers and cannot be split into separate prep tasks.
+### Step 5: Execute Prep Tasks
+Run each prep task through the standard 4-step cycle:
+1. Spawn the layer-appropriate task executor with the exact prep task path in the prompt: "Execute implementation-readiness prep task. Task file: [exact prep task path]."
+2. Check for `blocked` or `escalation_needed`.
+3. Spawn the layer-appropriate quality fixer with the task file as `task_file`.
+4. Commit only when the quality fixer returns `approved`.
+Append this scope boundary to every subagent prompt:
+```
+[SYSTEM CONSTRAINT]
+This agent operates within implementation-readiness prep scope. Use the task file as the primary instruction source. Do not implement feature behavior beyond the readiness gap described by the task.
+```
+### Step 6: Re-Scan and Persist
+After prep tasks are complete:
+1. Re-run the readiness scan.
+2. Persist or replace `## Implementation Readiness Report` in the work plan.
+3. Set the header to `Implementation Readiness: ready` when all applicable criteria pass, otherwise `Implementation Readiness: escalated`.
+4. Collapse completed prep tasks out of active plan execution: remove the Phase 0 readiness prep task entries from the work plan and record their committed evidence under `Resolution Tasks Executed` in the Readiness Report. If Phase 0 becomes empty and was created only by this recipe, remove that Phase 0 section. Preserve any pre-existing Phase 0 content.
+5. Delete only these files for the current `{plan-name}`:
+   - `docs/plans/tasks/{plan-name}-task-prep-*.md`
+   - `docs/plans/tasks/{plan-name}-backend-task-prep-*.md`
+   - `docs/plans/tasks/{plan-name}-frontend-task-prep-*.md`
+   - `docs/plans/tasks/{plan-name}-phase0-completion.md`
+6. Report remaining gaps if any.
+## Readiness Report Format
+```markdown
+## Implementation Readiness Report
+Work plan: [path]
+Outcome: ready | escalated
+Gaps resolved: [N]
+phase0_created_by_recipe: true | false
+### Readiness Criteria
+| ID | Result | Evidence |
+|----|--------|----------|
+| R1 | pass / fail / not_applicable | [file:line or missing reference] |
+| R2 | ... | ... |
+| R3 | ... | ... |
+| R4 | ... | ... |
+| R5 | ... | ... |
+### Resolution Tasks Executed
+- [task file path] - [one-line summary] - committed
+### Remaining Gaps
+- [criterion ID]: [unresolved reference] - Next action: [recommendation]
+```
+## Completion Criteria
+- [ ] Work plan loaded and relevant sections extracted
+- [ ] Readiness scan completed with evidence per criterion
+- [ ] No-op success handled when all criteria pass
+- [ ] Failing criteria converted to approved prep tasks when needed
+- [ ] Prep tasks executed through executor -> quality-fixer -> commit
+- [ ] Re-scan completed after prep tasks
+- [ ] Work plan readiness marker updated to `ready` or `escalated`
+- [ ] Readiness Report persisted in the work plan
+- [ ] Completed prep task references collapsed into the Readiness Report
+- [ ] Prep task files created by this recipe removed from `docs/plans/tasks/`
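
As a rough illustration of Steps 3, 4, and 6 in the recipe above, the Python sketch below derives the readiness marker from per-criterion results and builds a prep task file name from the three patterns listed in Step 4. It is not part of the package: `CriterionResult`, `failing_criteria`, `readiness_outcome`, and `prep_task_filename` are hypothetical names, and the two-digit zero padding for `{NN}` is an assumption, since the recipe does not define the number format.

```python
from dataclasses import dataclass

# The three results a criterion may produce, per "Readiness Criteria".
VALID_RESULTS = {"pass", "fail", "not_applicable"}


@dataclass(frozen=True)
class CriterionResult:
    criterion_id: str  # "R1" .. "R5"
    result: str        # "pass" | "fail" | "not_applicable"
    evidence: str = ""  # file:line evidence where possible


def failing_criteria(results):
    """Return the IDs of criteria whose result is `fail`."""
    for r in results:
        if r.result not in VALID_RESULTS:
            raise ValueError(f"unknown result for {r.criterion_id}: {r.result}")
    return [r.criterion_id for r in results if r.result == "fail"]


def readiness_outcome(results):
    """Marker value after the (re-)scan: `ready` when no applicable
    criterion fails, otherwise `escalated`. `not_applicable` never blocks."""
    return "escalated" if failing_criteria(results) else "ready"


def prep_task_filename(plan_name, layer, index):
    """Build a prep task file name from the Step 4 patterns.
    Assumption: {NN} is rendered as a two-digit, zero-padded number."""
    infix = {
        "backend": "backend-task-prep",
        "frontend": "frontend-task-prep",
        "single": "task-prep",
    }[layer]
    return f"{plan_name}-{infix}-{index:02d}.md"
```

On the first scan a `fail` leads to prep tasks rather than an immediate `escalated` marker; the outcome function models the decision the recipe makes in Step 3 and again in Step 6 after the re-scan.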

package/.agents/skills/recipe-prepare-implementation/agents/openai.yaml ADDED

@@ -0,0 +1,7 @@
+interface:
+  display_name: "recipe-prepare-implementation"
+  short_description: "Preflight work plan readiness before implementation"
+  default_prompt: "Use $recipe-prepare-implementation to verify: "
+policy:
+  allow_implicit_invocation: false

package/.agents/skills/recipe-review/SKILL.md CHANGED

@@ -23,7 +23,8 @@ description: "Design Doc compliance and security validation with optional auto-f
 - Compliance validation -> Spawn code-reviewer agent
 - Security validation -> Spawn security-reviewer agent
-- Fix implementation -> Spawn task-executor agent
+- Code-side fix path -> Spawn task-executor agent
+- Design-side update path -> Spawn technical-designer in update mode, then document-reviewer, then design-sync when multiple Design Docs exist
 - Quality checks -> Spawn quality-fixer agent
 - Re-validation -> Spawn code-reviewer / security-reviewer agents
@@ -82,22 +83,47 @@ Security Review: [status from security-reviewer]
   - [policy] [location]: [description] — [rationale]
   Notes: [notes from security-reviewer, if present]
-Execute fixes? (y/n):
+Resolve discrepancies by route:
+  c) Code-side fix
+  d) Design-side update
+  s) Skip
+Default: accept all recommended routes.
+Accepted response formats:
+- empty input -- accept every recommended route
+- `all-recommended` -- accept every recommended route
+- `all:c`, `all:d`, or `all:s` -- apply one route to every finding
+- Per-finding routes, e.g. `F1:c, F2:d, F3:s`
 ```
-**[STOP — BLOCKING]** Present results to user for confirmation.
-**CANNOT proceed until user explicitly confirms.**
+Before presenting results, recommend a route for each finding:
+- Use `d` when implementation intent matches the requirement but the Design Doc is stale or too narrow.
+- Use `c` when code drifted from a still-correct Design Doc, or when the finding is reliability, security, or maintainability related.
+- Use `s` only when the user explicitly accepts the current state without changes.
-If both pass and user selects `n`: Skip Steps 5-10, proceed to Step 11.
+**[STOP — BLOCKING]** Present results and recommended routes to user for confirmation.
+**CANNOT proceed until user explicitly confirms routes.**
+If all findings are skipped: Skip Steps 5-10, proceed to Step 11.
 ### Step 5: Prepare Fix Context
 Reference documentation-criteria skill for task file template.
+### Step 5d: Design-Side Update
+Run this step only when the user routes at least one finding to `d`.
+1. Spawn technical-designer agent in update mode: "Update Design Doc at [path]. The implementation is being accepted as correct for these findings: [d-routed findings with code locations and current Design Doc values]. Update the relevant sections and add change history."
+2. Spawn document-reviewer agent: "Review updated Design Doc at [path] for consistency and completeness."
+3. If multiple Design Docs exist in `docs/design/`, spawn design-sync agent: "Check cross-Design Doc consistency after updating [path]."
+4. If the user selected both `d` and `c` routes, re-evaluate the `c` findings against the updated Design Doc and drop any that are now satisfied.
 ### Step 6: Create Task File
 Create task file at `docs/plans/tasks/review-fixes-YYYYMMDD.md`
-Include both code compliance issues and security requiredFixes.
+Include only code-side compliance issues and security requiredFixes routed to `c`.
 ### Step 7: Execute Fixes
@@ -117,6 +143,8 @@ Spawn security-reviewer agent: "Re-validate security after fixes. Prior findings
 ### Step 11: Final Report
+Delete the review-fix task file this recipe created, if present. Its work is committed; `docs/plans/` is ephemeral working state.
 ```
 Code Compliance:
   Initial: [X]%
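
The accepted response formats added to recipe-review (empty input, `all-recommended`, `all:c` / `all:d` / `all:s`, and per-finding routes such as `F1:c, F2:d, F3:s`) can be sketched as a small Python parser. This is an illustration only, not code from the package: `resolve_routes` and `all_skipped` are hypothetical names, and keeping the recommended route for findings that a per-finding response does not mention is an assumption the skill text does not state.

```python
ROUTES = {"c", "d", "s"}  # code-side fix, design-side update, skip


def resolve_routes(response, recommended):
    """Map each finding ID to its confirmed route.

    `recommended` is a dict such as {"F1": "c", "F2": "d"} holding the
    route recommended for every finding before the user responds.
    """
    text = response.strip()
    if text in ("", "all-recommended"):
        return dict(recommended)

    if text.startswith("all:"):
        route = text[len("all:"):].strip()
        if route not in ROUTES:
            raise ValueError(f"unknown route: {route!r}")
        return {finding_id: route for finding_id in recommended}

    # Assumption: findings not mentioned keep their recommended route.
    resolved = dict(recommended)
    for item in text.split(","):
        finding_id, sep, route = item.strip().partition(":")
        finding_id, route = finding_id.strip(), route.strip()
        if not sep or finding_id not in recommended or route not in ROUTES:
            raise ValueError(f"cannot parse route entry: {item.strip()!r}")
        resolved[finding_id] = route
    return resolved


def all_skipped(routes):
    """True when every finding is routed to `s` (Steps 5-10 are skipped)."""
    return all(route == "s" for route in routes.values())
```

Rejecting unknown finding IDs and routes with an error, rather than ignoring them, fits the blocking confirmation step: the recipe cannot proceed until the routes are explicit.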

package/.agents/skills/subagents-orchestration-guide/SKILL.md CHANGED

@@ -123,27 +123,10 @@ Close a running subagent only when the user redirects the workflow, the orchestr
 Spawn agents using natural language prompts. Provide clear context about what the agent should accomplish. Every `spawn_agent` call MUST include `fork_turns="none"` or `fork_context=false` (see Spawn rule at top of this skill).
-### Spawn Examples
+### Spawn Prompt Requirements
-Each example below is a spawn with `fork_turns="none"` or `fork_context=false` plus the quoted prompt.
-**requirement-analyzer** (`fork_turns="none"` or `fork_context=false`):
-> "Analyze the following requirements and determine the work scale: [user requirements]. Perform requirement analysis and scale determination."
-**codebase-analyzer** (`fork_turns="none"` or `fork_context=false`):
-> "Analyze the existing codebase to provide evidence for Design Doc creation. Focus on existing implementations, data model elements, and constraints the design should respect. requirement_analysis: [JSON]. prd_path: [path if available]. requirements: [original user requirements]. layer: [target layer if applicable]. target_paths: [paths if narrowed]. Return codebase facts and focus areas."
-**task-executor** (`fork_turns="none"` or `fork_context=false`):
-> "Execute the implementation task defined in docs/plans/tasks/[filename].md. Complete the implementation following TDD Red-Green-Refactor."
-**quality-fixer** (`fork_turns="none"` or `fork_context=false`):
-> "Run quality checks on the codebase: static analysis, style check, all test execution. Fix any issues found and report when all checks pass."
-**document-reviewer** (`fork_turns="none"` or `fork_context=false`):
-> "Review the document at [path] for quality and rule compliance. Check against documentation-criteria standards."
-**design-sync** (`fork_turns="none"` or `fork_context=false`):
-> "Verify consistency between Design Docs in docs/design/. Use [path] as the source document for comparison."
+- Set `fork_context=false` or `fork_turns="none"` on every spawn for context isolation.
+- Each spawn prompt must name the target deliverable, input paths, and expected result. When invoking `task-executor*`, include the exact task file path, for example: `Execute the implementation task. Task file: docs/plans/tasks/[filename].md.`
 ## Explicit Stop Points [MANDATORY]
@@ -207,7 +190,34 @@ Subagents respond in JSON format. The final response from each JSON-returning su
 | `design-sync` | `sync_status` |
 | `integration-test-reviewer` | `status`, `requiredFixes` |
 | `security-reviewer` | `status`, `findings`, `notes`, `requiredFixes` |
-| `acceptance-test-generator` | `status`, `generatedFiles`, `e2eAbsenceReason` |
+| `acceptance-test-generator` | `status`, `generatedFiles.integration`, `generatedFiles.fixtureE2e`, `generatedFiles.serviceE2e`, `e2eAbsenceReason.fixtureE2e`, `e2eAbsenceReason.serviceE2e` |
+## Implementation Readiness Marker Contract
+Work plans use the header line `Implementation Readiness: <status>`.
+| Status | Meaning | Consumer Action |
+|--------|---------|-----------------|
+| `pending` | Initial state from work-planner; readiness has not been checked | Present the unchecked state, recommend running implementation readiness preflight, and continue only on explicit user approval |
+| `ready` | Readiness scan completed and no applicable failures remain | Proceed with task execution |
+| `escalated` | Readiness scan completed, but one or more failures remain | Read the work plan's Implementation Readiness Report, present remaining gaps, and continue only on explicit user approval |
+| absent | Older work plan without the marker | Treat as `pending` |
+## Implementation Readiness Preflight Procedure
+Use this procedure after work-plan approval and before autonomous task execution when the flow needs to verify implementation readiness.
+1. Load the approved work plan exact path and extract Verification Strategies, Quality Assurance Mechanisms, Design-to-Plan Traceability, UI Spec Component -> Task Mapping, Connection Map, test skeleton references, E2E absence reasons, phase structure, referenced Design Docs, and UI Specs.
+2. Evaluate these criteria with evidence:
+   - R1 Verification Strategy references resolve
+   - R2 E2E prerequisites are addressed
+   - R3 Phase 1 observability exists
+   - R4 UI rendering surface exists when UI work is present
+   - R5 Local service stack or browser harness procedure exists when applicable
+3. If every applicable criterion passes, persist `## Implementation Readiness Report` in the work plan and set `Implementation Readiness: ready`.
+4. If any criterion fails, create the smallest approved prep tasks that close the gaps, execute each exact prep task file through the standard executor -> quality-fixer -> commit cycle, then re-run the scan.
+5. After re-scan, set `Implementation Readiness: ready` when all applicable criteria pass, otherwise `Implementation Readiness: escalated`, and persist remaining gaps in the Readiness Report.
+6. Collapse completed prep task references into the Readiness Report and delete only the prep task files created for the current work plan.
 ## Handling Requirement Changes
@@ -216,17 +226,7 @@ requirement-analyzer follows the "completely self-contained" principle and proce
 #### How to Integrate Requirements
-**Important**: To maximize accuracy, integrate requirements as complete sentences, including all contextual information communicated by the user.
-```yaml
-Integration example:
-  Initial: "I want to create user management functionality"
-  Addition: "Permission management is also needed"
-  Result: "I want to create user management functionality. Permission management is also needed.
-          Initial requirement: I want to create user management functionality
-          Additional requirement: Permission management is also needed"
-```
+Integrate initial requirements and later additions as complete sentences, preserving all contextual information communicated by the user. The updated input must remain self-contained without relying on prior conversation turns.
 ### Update Mode for Document Generation Agents
 Document generation agents (work-planner, technical-designer, prd-creator) can update existing documents in `update` mode.
@@ -350,11 +350,11 @@ Maximum retry count is 1 verification fix cycle. If any failed verifier still fa
 | `codebase-analyzer` | `technical-designer*` | `Codebase Analysis`, including `focusAreas`, `dataModel`, `qualityAssurance`, `dataTransformationPipelines`, `limitations` |
 | `technical-designer*` | `code-verifier` | Design Doc path |
 | `code-verifier` | `document-reviewer` | `code_verification` JSON |
-| `acceptance-test-generator` | `work-planner` | integration test path, E2E path or `null`, `e2eAbsenceReason` when E2E is absent |
+| `acceptance-test-generator` | `work-planner` | `generatedFiles.integration`, `generatedFiles.fixtureE2e`, `generatedFiles.serviceE2e`, `e2eAbsenceReason: { fixtureE2e, serviceE2e }` |
 | Design Doc | `work-planner` | Verification Strategy summary, Output Comparison details, implementation-relevant technical requirements, protected no-change boundaries |
 Handoff rules:
-- Verify generated integration and E2E file paths exist before passing them onward
+- Verify generated integration, fixture-e2e, and service-integration-e2e file paths exist before passing them onward
 - Escalate only when required outputs are missing without a valid absence reason
 - Require work-planner to map every carried-forward technical requirement to a covering task or a justified `gap`
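
The Implementation Readiness Marker Contract added in this file can be read as a small lookup: find the header line, treat a missing line as `pending`, and proceed without asking only on `ready`. The Python sketch below illustrates that reading under stated assumptions. `read_readiness` and `may_proceed_without_approval` are hypothetical names, and matching the marker as a whole line with a regular expression is an assumption about how the header is laid out.

```python
import re

# Assumption: the marker occupies its own line in the work plan header.
MARKER = re.compile(r"^Implementation Readiness:[ \t]*(\S+)[ \t]*$", re.MULTILINE)
STATUSES = {"pending", "ready", "escalated"}


def read_readiness(plan_text):
    """Return the marker status; an absent marker is treated as `pending`."""
    match = MARKER.search(plan_text)
    if match is None:
        return "pending"
    status = match.group(1)
    if status not in STATUSES:
        raise ValueError(f"unknown readiness status: {status!r}")
    return status


def may_proceed_without_approval(status):
    """Only `ready` allows task execution without explicit user approval;
    `pending` and `escalated` both require it per the contract table."""
    return status == "ready"
```

For example, an older work plan that has no marker line yields `pending`, so a consumer would present the unchecked state and wait for approval.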

package/.agents/skills/subagents-orchestration-guide/references/monorepo-flow.md CHANGED

@@ -100,7 +100,7 @@ Spawn acceptance-test-generator with all Design Docs and UI Spec:
 Spawn work-planner with all Design Docs:
-> "Create a work plan from the following documents: PRD: [path] (Large Scale only), Design Doc (backend): [path], Design Doc (frontend): [path]. Compose phases as vertical feature slices where possible -- each phase should contain both backend and frontend work for the same feature area, enabling early integration verification per phase."
+> "Create a work plan from the following documents: PRD: [path] (Large Scale only), Design Doc (backend): [path], Design Doc (frontend): [path], UI Spec: [path] (if exists). Test skeletons from acceptance-test-generator: integration: [path or null], fixtureE2e: [path or null], serviceE2e: [path or null], e2eAbsenceReason: { fixtureE2e: [value or null], serviceE2e: [value or null] }. Compose phases as vertical feature slices where possible -- each phase should contain both backend and frontend work for the same feature area, enabling early integration verification per phase. Include `Implementation Readiness: pending` in the work plan header."
 work-planner's existing Integration Complete criteria naturally covers cross-layer verification when given multiple Design Docs.

package/.agents/skills/task-analyzer/references/skills-index.yaml CHANGED

@@ -4,7 +4,7 @@
 skills:
   coding-rules:
     skill: "coding-rules"
-    tags: [implementation, code-quality, refactoring, clean-code, maintainability, function-design, error-handling, parameterized-dependencies, performance, security]
+    tags: [implementation, code-quality, refactoring, clean-code, maintainability, function-design, error-handling, parameterized-dependencies, reference-representativeness, performance, security]
     typical-use: "Language-agnostic code creation, modification, and refactoring principles applicable to all programming languages"
     size: medium
     key-references:
@@ -14,23 +14,25 @@ skills:
       - "Refactoring - Martin Fowler"
       - "Single Responsibility Principle - SOLID"
     sections:
+      - "Language-Specific References"
       - "Core Philosophy [MANDATORY]"
       - "Code Quality [MANDATORY]"
       - "Function Design [MANDATORY]"
       - "Error Handling [MANDATORY]"
       - "Dependency Management"
+      - "Reference Representativeness"
       - "Performance"
       - "Code Organization"
       - "Commenting Principles"
       - "Refactoring [SAFE CHANGE PROTOCOL]"
-      - "Security (Secure Defaults, Input and Output Boundaries, Access Control, Knowledge Cutoff Supplement)"
+      - "Security"
       - "Version Control [MANDATORY]"
     references:
       - "references/typescript.md"
   testing:
     skill: "testing"
-    tags: [testing, tdd, quality, unit-testing, integration-testing, e2e-testing, test-design, coverage, mocking, test-independence, ci-cd, test-quality-criteria]
+    tags: [testing, tdd, quality, unit-testing, integration-testing, e2e-testing, data-layer-testing, test-design, coverage, mocking, test-independence, ci-cd, test-quality-criteria]
     typical-use: "Universal testing principles, TDD practice, test quality criteria, test creation and quality assurance for all programming languages"
     size: large
     key-references:
@@ -39,6 +41,7 @@ skills:
       - "AAA Pattern - Arrange-Act-Assert"
       - "Test Pyramid - Mike Cohn"
     sections:
+      - "Language-Specific References"
       - "Core Testing Philosophy"
       - "TDD Process [MANDATORY for all code changes]"
       - "Quality Requirements [MANDATORY]"
@@ -46,6 +49,7 @@ skills:
       - "Test Design Principles"
       - "Test Independence"
       - "Mocking and Test Doubles"
+      - "Data Layer Testing"
       - "Test Quality Practices [MANDATORY]"
       - "What to Test"
       - "Test Quality Criteria [MANDATORY]"
@@ -68,11 +72,13 @@ skills:
       - "DRY Principle - The Pragmatic Programmer"
       - "YAGNI Principle - Extreme Programming"
     sections:
+      - "Language-Specific References"
       - "Technical Anti-patterns (Red Flag Patterns) [MANDATORY]"
       - "Fail-Fast Fallback Design Principles"
       - "Rule of Three - Criteria for Code Duplication"
       - "Common Failure Patterns and Avoidance Methods"
       - "Debugging Techniques"
+      - "Quality Assurance Mechanism Awareness"
       - "Quality Check Workflow [MANDATORY]"
       - "Situations Requiring Technical Decisions"
       - "Implementation Completeness Assurance"
@@ -117,7 +123,7 @@ skills:
   integration-e2e-testing:
     skill: "integration-e2e-testing"
-    tags: [testing, integration-testing, e2e-testing, test-design, behavior-first, roi, test-skeleton, ears-format]
+    tags: [testing, integration-testing, e2e-testing, fixture-e2e, service-integration-e2e, e2e-absence-reason, test-design, behavior-first, roi, test-skeleton, ears-format]
     typical-use: "Integration and E2E test design principles, value-based test selection, behavior-first approach, test skeleton specification"
     size: medium
     key-references:
@@ -136,7 +142,7 @@ skills:
   subagents-orchestration-guide:
     skill: "subagents-orchestration-guide"
-    tags: [orchestration, workflow, subagents, autonomous-execution, planning, design-flow, implementation-flow]
+    tags: [orchestration, workflow, subagents, context-isolation, autonomous-execution, planning, design-flow, implementation-flow, implementation-readiness, readiness-gate]
     typical-use: "Orchestrating subagents through implementation workflows, scale determination, stop points, autonomous execution mode"
     size: large
     key-references:
@@ -152,6 +158,8 @@ skills:
       - "Explicit Stop Points [MANDATORY]"
       - "Scale Determination and Document Requirements"
       - "Structured Response Specification"
+      - "Implementation Readiness Marker Contract"
+      - "Implementation Readiness Preflight Procedure"
       - "Handling Requirement Changes"
       - "Basic Flow for Work Planning"
       - "Autonomous Execution Mode"

package/.agents/skills/testing/references/typescript.md CHANGED

@@ -13,9 +13,8 @@ import userEvent from '@testing-library/user-event'
 ### Coverage Requirements
 - **Overall minimum**: 60%
-- **Atoms (Button, Text)**: 70%+
-- **Molecules (FormField)**: 65%+
-- **Organisms (Header, Footer)**: 60%+
+- **Atomic Design projects**: Atoms 70%+, Molecules 65%+, Organisms 60%+
+- **Other component architectures**: Keep 60% as the baseline and raise foundational or highly reused components to 70%+
 - **Custom Hooks**: 65%+
 - **Utils**: 70%+

package/.codex/agents/acceptance-test-generator.toml CHANGED

@@ -58,7 +58,8 @@ Test type definitions, budgets, and value-based selection rules are specified in
 Key points:
 - **Integration Tests**: MAX 3 per feature, created alongside implementation
-- **E2E Tests**: MAX 1-2 per feature, executed in final phase only
+- **fixture-e2e**: MAX 3 per feature, created alongside UI implementation, mocked backend / fixture-driven state
+- **service-integration-e2e**: MAX 1-2 per feature, executed in final phase only, live local stack
 ## 4-Phase Generation Process
@@ -123,7 +124,7 @@ For each valid AC from Phase 1:
 **Output**: Candidate pool with value metadata
-### Phase 3: Value-Based Selection (Two-Pass #2)
+### Phase 3: Value-Based Selection and Lane Assignment (Two-Pass #2)
 Value score and E2E selection rules are defined in **integration-e2e-testing skill**.
@@ -138,14 +139,15 @@ Value score and E2E selection rules are defined in **integration-e2e-testing ski
 3. **Push-Down Analysis**:
    ```
    Can this be unit-tested? → Remove from integration/E2E pool
-   Already integration-tested? → Keep E2E candidate when it validates a user-facing multi-step journey
+   Already integration-tested AND verifiable in-process? → Remove from E2E pool
    ```
-4. **Journey Classification**:
+4. **Lane Assignment**:
    ```
-   User-facing multi-step journey? → Mark as reserved-slot eligible
-   Service-internal chain only? → Not reserved-slot eligible
+   UI journey verifiable with mocked backend / fixture-driven state → fixture-e2e
+   Journey correctness depends on real cross-service behavior → service-integration-e2e
+   Service-internal chain only → Not reserved-slot eligible
    ```
-5. **Sort by Value Score** (descending order)
+5. **Sort by Value Score within each lane** (descending order)
 **Output**: Ranked, deduplicated candidate list
@@ -153,16 +155,19 @@ Value score and E2E selection rules are defined in **integration-e2e-testing ski
 **Hard Limits per Feature**:
 - **Integration Tests**: MAX 3 tests
-- **E2E Tests**: MAX 1-2 tests
+- **fixture-e2e**: MAX 3 tests
+- **service-integration-e2e**: MAX 1-2 tests
 **Selection Algorithm**:
 ```
 1. Sort integration candidates by Value Score (descending)
 2. Select up to 3 integration candidates
-3. Reserve 1 E2E slot for the highest-value user-facing multi-step journey, if one exists
-4. Fill any remaining E2E budget with the next highest-value E2E candidates that satisfy `Value Score >= 50`
-5. If no E2E is selected, return `generatedFiles.e2e: null` with a concrete `e2eAbsenceReason`
+3. Reserve 1 fixture-e2e slot for the highest-value user-facing multi-step journey, if one exists
+4. Reserve 1 service-integration-e2e slot only when the journey needs real cross-service verification
+5. Fill remaining fixture-e2e budget with candidates that satisfy `Value Score >= 20`
+6. Fill remaining service-integration-e2e budget with candidates that satisfy `Value Score > 50`
+7. If a lane emits no tests, return its generated file as `null` with a concrete lane-specific absence reason
 ```
 **Output**: Final test set
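
The revised selection algorithm in the hunk above can be sketched in Python as follows. This is one possible reading, not the agent's implementation: `Candidate`, `select_lane`, and `select_tests` are hypothetical names, the `reserved_eligible` flag stands in for the lane-specific reservation conditions in steps 3 and 4, and letting the reserved slot bypass the score threshold is an assumption. The budgets and the thresholds (`>= 20` for fixture-e2e, `> 50` for service-integration-e2e) are copied from the diff as written.

```python
from dataclasses import dataclass

# Hard limits per feature, as stated in the diff.
BUDGETS = {"integration": 3, "fixture-e2e": 3, "service-integration-e2e": 2}


@dataclass(frozen=True)
class Candidate:
    name: str
    lane: str  # "integration" | "fixture-e2e" | "service-integration-e2e"
    value_score: int
    reserved_eligible: bool = False  # hypothetical flag for steps 3 and 4


def _ranked(candidates, lane):
    """Candidates of one lane, highest Value Score first."""
    in_lane = [c for c in candidates if c.lane == lane]
    return sorted(in_lane, key=lambda c: c.value_score, reverse=True)


def select_lane(candidates, lane, passes_threshold):
    """Reserve one slot for the best eligible journey, then fill the
    remaining budget with candidates that pass the lane threshold."""
    ranked = _ranked(candidates, lane)
    reserved = next((c for c in ranked if c.reserved_eligible), None)
    selected = [reserved] if reserved is not None else []
    for c in ranked:
        if len(selected) >= BUDGETS[lane]:
            break
        if c is not reserved and passes_threshold(c.value_score):
            selected.append(c)
    return selected


def select_tests(candidates):
    """Apply steps 1-6; an empty lane corresponds to step 7 (null file
    plus a lane-specific absence reason in the generation report)."""
    return {
        "integration": _ranked(candidates, "integration")[: BUDGETS["integration"]],
        "fixture-e2e": select_lane(candidates, "fixture-e2e", lambda s: s >= 20),
        "service-integration-e2e": select_lane(
            candidates, "service-integration-e2e", lambda s: s > 50
        ),
    }
```

Sorting within each lane before selection mirrors the change from "Sort by Value Score" to "Sort by Value Score within each lane" in Phase 3.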
@@ -198,24 +203,46 @@ Adapt comment syntax to the project's language when generating annotations.
   [Test: 'AC1: Failed payment displays error without creating order']
 ```
-### E2E Test File
+### fixture-e2e Test File
 ```
-// [Feature Name] E2E Test - Design Doc: [filename]
-// Generated: [date] | Budget Used: 1/2 E2E
-// Test Type: End-to-End Test
-// Implementation Timing: After all feature implementations complete
+// [Feature Name] fixture-e2e Test - Design Doc: [filename]
+// Generated: [date] | Budget Used: 1/3 fixture-e2e
+// Test Type: Browser UI with mocked backend / fixture-driven state
+// Implementation Timing: Alongside UI implementation
 [Import statement using detected test framework]
 [Test suite using detected framework syntax]
-  // User Journey: Complete purchase flow (browse → add to cart → checkout → payment → confirmation)
-  // Value Score: 120 | Business Value: 10 (business-critical) | Frequency: 10 (core flow) | Legal: true (PCI compliance)
-  // Verification: End-to-end user experience from product selection to order confirmation
-  // @category: e2e
+  // User Journey: Dismiss card -> Undo banner appears -> Undo restores card
+  // Value Score: 60 | Business Value: 6 | Frequency: 7 | Defect Detection: 8
+  // Verification: Browser-visible state transitions with mocked backend state
+  // @category: fixture-e2e
+  // @lane: fixture-e2e
+  // @dependency: full-ui (mocked backend)
+  // @complexity: medium
+  [Test: 'User Journey: Dismiss and undo restores the card']
+```
+### service-integration-e2e Test File
+```
+// [Feature Name] service-integration-e2e Test - Design Doc: [filename]
+// Generated: [date] | Budget Used: 1/2 service-integration-e2e
+// Test Type: End-to-end against running local stack
+// Implementation Timing: Final phase only
+[Import statement using detected test framework]
+[Test suite using detected framework syntax]
+  // User Journey: Complete purchase flow (browse -> checkout -> payment -> confirmation persisted)
+  // Value Score: 120 | Business Value: 10 (business-critical) | Frequency: 10 (core flow) | Legal: true
+  // Verification: Order persists in DB and confirmation event is emitted
+  // @category: service-integration-e2e
+  // @lane: service-integration-e2e
   // @dependency: full-system
   // @complexity: high
-  [Test: 'User Journey: Complete product purchase from browse to confirmation email']
+  [Test: 'User Journey: Complete product purchase persists order and emits confirmation']
 ```
 ### Generation Report
@@ -226,13 +253,18 @@ Adapt comment syntax to the project's language when generating annotations.
   "feature": "[feature name]",
   "generatedFiles": {
     "integration": "[path]/[feature].int.test.[ext]",
-    "e2e": null
+    "fixtureE2e": null,
+    "serviceE2e": null
   },
   "budgetUsage": {
     "integration": "2/3",
-    "e2e": "0/2"
+    "fixtureE2e": "0/3",
+    "serviceE2e": "0/2"
   },
-  "e2eAbsenceReason": "all_e2e_candidates_below_threshold"
+  "e2eAbsenceReason": {
+    "fixtureE2e": "all_e2e_candidates_below_threshold",
+    "serviceE2e": "no_real_service_dependency"
+  }
 }
 ```
@@ -242,13 +274,18 @@ Adapt comment syntax to the project's language when generating annotations.
   "feature": "[feature name]",
   "generatedFiles": {
     "integration": "[path]/[feature].int.test.[ext]",
-    "e2e": "[path]/[feature].e2e.test.[ext]"
+    "fixtureE2e": "[path]/[feature].fixture.e2e.test.[ext]",
+    "serviceE2e": "[path]/[feature].service.e2e.test.[ext]"
   },
   "budgetUsage": {
     "integration": "2/3",
-    "e2e": "1/2"
+    "fixtureE2e": "1/3",
+    "serviceE2e": "1/2"
   },
-  "e2eAbsenceReason": null
+  "e2eAbsenceReason": {
+    "fixtureE2e": null,
+    "serviceE2e": null
+  }
 }
 ```
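
The lane-specific report shape shown in the two JSON examples above implies a simple consistency rule: a lane whose generated file is `null` needs a concrete absence reason. The Python sketch below checks that rule against a parsed report. It is an illustration under stated assumptions: `lanes_missing_reason` is a hypothetical helper, and the file path in the usage example is made up.

```python
# Lane keys as they appear under `generatedFiles` and `e2eAbsenceReason`.
E2E_LANES = ("fixtureE2e", "serviceE2e")


def lanes_missing_reason(report):
    """Return the E2E lanes that emitted no file and give no absence reason."""
    files = report.get("generatedFiles") or {}
    reasons = report.get("e2eAbsenceReason") or {}
    missing = []
    for lane in E2E_LANES:
        if files.get(lane) is None and not reasons.get(lane):
            missing.append(lane)
    return missing


# Usage with a hypothetical report where one lane lacks its reason.
example = {
    "generatedFiles": {
        "integration": "tests/cart.int.test.ts",  # hypothetical path
        "fixtureE2e": None,
        "serviceE2e": None,
    },
    "e2eAbsenceReason": {
        "fixtureE2e": None,
        "serviceE2e": "no_real_service_dependency",
    },
}
print(lanes_missing_reason(example))
```

A non-empty result matches the handoff rule in the orchestration guide: escalate only when required outputs are missing without a valid absence reason.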
@@ -256,8 +293,9 @@ Adapt comment syntax to the project's language when generating annotations.
 Each test case MUST have the following standard annotations for test implementation planning:
-- **@category**: core-functionality | integration | edge-case | ux
-- **@dependency**: none | [component names] | full-system
+- **@category**: core-functionality | integration | edge-case | ux | fixture-e2e | service-integration-e2e
+- **@lane**: integration | fixture-e2e | service-integration-e2e
+- **@dependency**: none | [component names] | full-ui (mocked backend) | full-system
 - **@complexity**: low | medium | high
 These annotations are used when planning and prioritizing test implementation.
@@ -282,7 +320,7 @@ These annotations are used when planning and prioritizing test implementation.
 ### Auto-processable
 - **Directory Absent**: Auto-create appropriate directory following detected test structure
-- **No E2E Selected**: Valid outcome when accompanied by `e2eAbsenceReason`
+- **No E2E Selected**: Valid outcome when accompanied by lane-specific `e2eAbsenceReason`
 - **Budget Exceeded by Critical Test**: Report to user
 ### Escalation Required
@@ -316,7 +354,7 @@ These annotations are used when planning and prioritizing test implementation.
 - **Post-execution**:
   - Completeness of selected tests
   - Dependency validity verified
-  - Integration tests and E2E tests generated in separate files
+  - Integration tests, fixture-e2e tests, and service-integration-e2e tests generated in separate files when selected
   - Generation report completeness
 ## Completion Gate [BLOCKING]