codex-workflows 0.4.7 → 0.4.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (37)
  1. package/.agents/skills/ai-development-guide/SKILL.md +12 -2
  2. package/.agents/skills/coding-rules/SKILL.md +15 -0
  3. package/.agents/skills/documentation-criteria/references/design-template.md +6 -0
  4. package/.agents/skills/documentation-criteria/references/plan-template.md +9 -0
  5. package/.agents/skills/documentation-criteria/references/task-template.md +4 -0
  6. package/.agents/skills/integration-e2e-testing/SKILL.md +45 -13
  7. package/.agents/skills/integration-e2e-testing/agents/openai.yaml +1 -1
  8. package/.agents/skills/integration-e2e-testing/references/e2e-design.md +7 -4
  9. package/.agents/skills/recipe-add-integration-tests/SKILL.md +6 -3
  10. package/.agents/skills/recipe-build/SKILL.md +6 -2
  11. package/.agents/skills/recipe-diagnose/SKILL.md +24 -23
  12. package/.agents/skills/recipe-front-build/SKILL.md +6 -2
  13. package/.agents/skills/recipe-front-plan/SKILL.md +1 -1
  14. package/.agents/skills/recipe-fullstack-build/SKILL.md +6 -2
  15. package/.agents/skills/recipe-fullstack-implement/SKILL.md +6 -4
  16. package/.agents/skills/recipe-implement/SKILL.md +9 -4
  17. package/.agents/skills/recipe-plan/SKILL.md +2 -1
  18. package/.agents/skills/recipe-update-doc/SKILL.md +1 -1
  19. package/.agents/skills/subagents-orchestration-guide/SKILL.md +12 -9
  20. package/.agents/skills/task-analyzer/references/skills-index.yaml +2 -2
  21. package/.agents/skills/testing/references/typescript.md +1 -1
  22. package/.codex/agents/acceptance-test-generator.toml +49 -26
  23. package/.codex/agents/code-verifier.toml +3 -1
  24. package/.codex/agents/codebase-analyzer.toml +26 -1
  25. package/.codex/agents/investigator.toml +46 -18
  26. package/.codex/agents/quality-fixer-frontend.toml +95 -8
  27. package/.codex/agents/quality-fixer.toml +96 -8
  28. package/.codex/agents/solver.toml +29 -25
  29. package/.codex/agents/task-decomposer.toml +14 -0
  30. package/.codex/agents/task-executor-frontend.toml +37 -0
  31. package/.codex/agents/task-executor.toml +38 -0
  32. package/.codex/agents/technical-designer-frontend.toml +9 -2
  33. package/.codex/agents/technical-designer.toml +20 -5
  34. package/.codex/agents/verifier.toml +61 -60
  35. package/.codex/agents/work-planner.toml +19 -3
  36. package/README.md +7 -7
  37. package/package.json +1 -1
@@ -131,13 +131,14 @@ How to handle duplicate code based on Martin Fowler's "Refactoring":
131
131
  - For low certainty cases, create minimal verification code first
132
132
 
133
133
  ### Pattern 5: Insufficient Existing Code Investigation
134
- **Symptom**: Duplicate implementations, architecture inconsistency, integration failures
135
- **Cause**: Insufficient understanding of existing code before implementation
134
+ **Symptom**: Duplicate implementations, architecture inconsistency, integration failures, adopting outdated patterns
135
+ **Cause**: Insufficient understanding of existing code before implementation; referencing only nearby files without checking representativeness
136
136
  **Avoidance**:
137
137
  - Before implementation, always search for similar functionality
138
138
  - Similar functionality found: Use that implementation (do not create new)
139
139
  - Similar functionality is technical debt: Create ADR improvement proposal
140
140
  - No similar functionality: Implement following existing design philosophy
141
+ - When adopting a pattern or dependency from nearby code, verify it is representative across the repository before adopting it
141
142
 
142
143
  ## Debugging Techniques
143
144
 
@@ -175,6 +176,15 @@ Pattern: Structured logging with context
175
176
  }
176
177
  ```
177
178
 
179
+ ## Quality Assurance Mechanism Awareness
180
+
181
+ Before executing quality checks, discover applicable quality tools and constraints by inspecting the affected files' types, project manifests, CI pipelines, and configuration:
182
+ - Primary detection: inspect affected file types, manifests, configuration, and CI pipelines to identify applicable quality tools
183
+ - Check for domain-specific linters or validators such as schema validators, API spec validators, or configuration-file checkers
184
+ - Check for domain-specific constraints in project configuration such as naming rules, length limits, or format requirements
185
+ - When a task file lists `Quality Assurance Mechanisms`, use that section as supplementary guidance for what to verify
186
+ - Include discovered domain-specific checks alongside the standard quality phases below
187
+
178
188
  ## Quality Check Workflow [MANDATORY]
179
189
 
180
190
  Universal quality assurance phases applicable to all languages:
@@ -51,6 +51,21 @@ For language-specific rules, also read:
51
51
  - Depend on abstractions, not concrete implementations
52
52
  - Minimize inter-module dependencies
53
53
 
54
+ ## Reference Representativeness
55
+
56
+ ### Verifying References Before Adoption
57
+
58
+ When adopting patterns, APIs, or dependencies from existing code:
59
+ - If referencing only nearby files, verify the pattern is representative across the repository before adopting it
60
+ - If multiple approaches coexist, identify the majority pattern and make a deliberate choice
61
+ - If adopting an external dependency, verify repository-wide usage distribution for that dependency and its version
62
+ - If repository evidence is insufficient to choose an appropriate dependency version, escalate instead of guessing
63
+ - If following an existing pattern when alternatives exist, state the reason for following it
64
+
65
+ ### Principle
66
+
67
+ Nearby code is a starting point for investigation, not a sufficient basis for adoption. Confirm that the reference is representative of repository conventions before using it as the model.
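One way to check representativeness is to count how many files use each coexisting approach before picking one. The sketch below is only an illustration: `usage_distribution` is a hypothetical helper, the in-memory `file_texts` mapping stands in for a real repository scan, and plain substring matching is an assumed simplification.

```python
from collections import Counter


def usage_distribution(file_texts: dict, patterns: list) -> Counter:
    """Count how many files contain each candidate pattern.

    Hypothetical helper: `file_texts` maps a file path to its contents, and
    each pattern is matched as a plain substring (an assumed simplification).
    """
    counts = Counter()
    for text in file_texts.values():
        for pattern in patterns:
            if pattern in text:
                counts[pattern] += 1
    return counts


# Illustrative data only; the paths and contents are made up.
files = {
    "src/a.ts": "import axios from 'axios'",
    "src/b.ts": "const res = await fetch(url)",
    "src/c.ts": "return fetch(endpoint)",
}
distribution = usage_distribution(files, ["axios", "fetch("])
majority_pattern = distribution.most_common(1)[0][0]
```

A count like this only surfaces the majority pattern. The rules above still ask for a deliberate choice and a stated reason when alternatives exist.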
68
+
54
69
  ## Performance
55
70
 
56
71
  - **Measure first**: Profile before optimizing — no premature optimization
@@ -52,6 +52,12 @@ unknowns:
52
52
  - [ ] [Standard/convention] `[explicit]` - Source: [config / rule file / documentation path]
53
53
  - [ ] [Observed pattern] `[implicit]` - Evidence: [file paths] - Confirmed: [Yes/No]
54
54
 
55
+ #### Quality Assurance Mechanisms
56
+ How quality is enforced in the change area. Each item is either adopted for this change or noted with a reason.
57
+
58
+ - [ ] [Tool/check name] — Enforces: [what] — Config: [path] — Covers: [file paths/patterns or "project-wide"] — Status: `adopted` / `noted (reason)`
59
+ - [ ] [Domain-specific constraint] — Enforces: [what] — Source: [path] — Covers: [file paths/patterns or "project-wide"] — Status: `adopted` / `noted (reason)`
60
+
55
61
  ### Problem to Solve
56
62
 
57
63
  [Specific problems or challenges this feature aims to address]
@@ -32,6 +32,15 @@ Repeat this block for each Design Doc when multiple Design Docs exist. Preserve
32
32
  - **Success criteria**: [extracted from Design Doc]
33
33
  - **Failure response**: [extracted from Design Doc]
34
34
 
35
+ ## Quality Assurance Mechanisms (from Design Docs)
36
+
37
+ Adopted quality gates for the change area. Each task in this plan must satisfy the applicable mechanisms.
38
+
39
+ | Mechanism | Enforces | Config Location | Covered Files |
40
+ |-----------|----------|-----------------|---------------|
41
+ | [Tool/check name] | [What quality aspect it enforces] | [path/to/config] | [file paths or patterns covered, or "project-wide"] |
42
+ | [Domain constraint] | [What it enforces] | [path/to/source] | [file paths or patterns covered, or "project-wide"] |
43
+
35
44
  ## Design-to-Plan Traceability
36
45
 
37
46
  Map each Design Doc technical requirement to the task or phase that covers it. Use one row per extracted requirement item. Every row must have at least one covering task, or an explicit justified gap.
@@ -37,6 +37,10 @@ Brief observations recorded after reading Investigation Targets:
37
37
  - [ ] Improve code (maintain passing tests)
38
38
  - [ ] Confirm added tests still pass
39
39
 
40
+ ## Quality Assurance Mechanisms
41
+ (From the work plan header — include only mechanisms relevant to this task's target files)
42
+ - [Tool/check name] — Enforces: [what] — Config: [path]
43
+
40
44
  ## Operation Verification Methods
41
45
  (Derived from Verification Strategy in the work plan)
42
46
  - **Verification method**: [What to verify and how]
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: integration-e2e-testing
3
- description: "Integration and E2E test design principles, ROI calculation, test skeleton specification, and review criteria. Use when: designing integration tests, E2E tests, generating test skeletons, or reviewing test quality."
3
+ description: "Integration and E2E test design principles, value-based selection, test skeleton specification, and review criteria. Use when: designing integration tests, E2E tests, generating test skeletons, or reviewing test quality."
4
4
  ---
5
5
 
6
6
  # Integration and E2E Testing Principles
@@ -20,13 +20,13 @@ description: "Integration and E2E test design principles, ROI calculation, test
20
20
 
21
21
  ## Behavior-First Principle [MANDATORY]
22
22
 
23
- ### MUST Include (High ROI)
23
+ ### MUST Include (High Value)
24
24
  - Business logic correctness (calculations, state transitions, data transformations)
25
25
  - Data integrity and persistence behavior
26
26
  - User-visible functionality completeness
27
27
  - Error handling behavior (what user sees/experiences)
28
28
 
29
- ### MUST Exclude (Low ROI in CI/CD)
29
+ ### MUST Exclude (Low Value in CI/CD)
30
30
  - External service real connections — use contract/interface verification instead
31
31
  - Performance metrics — non-deterministic, defer to load testing
32
32
  - Implementation details — test observable behavior only
@@ -34,20 +34,52 @@ description: "Integration and E2E test design principles, ROI calculation, test
34
34
 
35
35
  **ENFORCEMENT**: Test = User-observable behavior verifiable in isolated CI environment
36
36
 
37
- ## ROI Calculation
37
+ ## Value and Selection Model
38
38
 
39
39
  ```
40
- ROI Score = (Business Value x User Frequency + Legal Requirement x 10 + Defect Detection)
41
- / (Creation Cost + Execution Cost + Maintenance Cost)
40
+ Value Score = (Business Value x User Frequency) + (Legal Requirement x 10) + Defect Detection
42
41
  ```
43
42
 
44
- ### Cost Table
43
+ Use `Value Score` for ranking candidates of the same test type. Handle E2E cost through budget limits and reserved-slot rules instead of cost-division scoring.
45
44
 
46
- | Test Type | Create | Execute | Maintain | Total |
47
- |-----------|--------|---------|----------|-------|
48
- | Unit | 1 | 1 | 1 | 3 |
49
- | Integration | 3 | 5 | 3 | 11 |
50
- | E2E | 10 | 20 | 8 | 38 |
45
+ ### E2E Threshold
46
+
47
+ - `E2E threshold = Value Score >= 50`
48
+ - Use this threshold for non-reserved E2E selection only
49
+ - Reserved-slot eligibility overrides the threshold when the candidate is the highest-value user-facing multi-step journey
50
+
51
+ ### Selection Rules
52
+
53
+ | Test Type | Ranking Basis | Selection Rule |
54
+ |-----------|---------------|----------------|
55
+ | Integration | Highest `Value Score` among integration candidates | Select up to budget |
56
+ | E2E | Highest `Value Score` among E2E candidates | Select when `reservedSlotEligible = true`, or when `Value Score >= 50` |
57
+
58
+ ### E2E Candidate Rules
59
+
60
+ - Treat integration and E2E as complementary coverage layers
61
+ - Retain an E2E candidate when it validates a user-facing multi-step journey, even if integration tests partially cover the behavior
62
+ - Preserve E2E candidates for user-facing multi-step journeys that validate cross-screen or cross-boundary continuity
63
+ - Distinguish user-facing journeys from service-internal chains; reserved E2E coverage applies only to user-facing journeys
64
+
65
+ ### Reserved E2E Slot
66
+
67
+ Reserve 1 E2E slot for the highest-value user-facing multi-step journey when such a journey exists, even if it does not satisfy `Value Score >= 50`.
68
+
69
+ ### E2E Absence Contract
70
+
71
+ When no E2E test is generated, downstream artifacts must treat that as an explicit decision, not an error. Carry:
72
+ - `generatedFiles.e2e: null`
73
+ - `e2eAbsenceReason`: one of `no_user_facing_multi_step_journey`, `all_e2e_candidates_below_threshold`, `covered_by_existing_e2e`, `budget_not_justified`
74
+
75
+ ### E2E Selection Decision Table
76
+
77
+ | Condition | Result |
78
+ |-----------|--------|
79
+ | At least one user-facing multi-step journey exists | Reserve 1 E2E slot for the highest-value such journey |
80
+ | Remaining E2E candidate has `Value Score >= 50` | Eligible for non-reserved E2E selection |
81
+ | Remaining E2E candidate has `Value Score < 50` | Exclude and use `all_e2e_candidates_below_threshold` if no E2E remains |
82
+ | Existing E2E already covers the same journey | Exclude and use `covered_by_existing_e2e` if no E2E remains |
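The scoring and E2E selection rules above can be sketched as below. Only the formula and the threshold of 50 come from the text. The `Candidate` class, the `select_e2e` function, the input scales, and the default budget of 2 are assumptions made for illustration.

```python
from dataclasses import dataclass

E2E_THRESHOLD = 50  # from "E2E threshold = Value Score >= 50"


@dataclass
class Candidate:
    name: str
    business_value: int
    user_frequency: int
    legal_requirement: int  # assumed to be 0 or 1
    defect_detection: int
    user_facing_journey: bool = False  # user-facing multi-step journey?

    @property
    def value_score(self) -> int:
        # Value Score = (Business Value x User Frequency)
        #             + (Legal Requirement x 10) + Defect Detection
        return (self.business_value * self.user_frequency
                + self.legal_requirement * 10
                + self.defect_detection)


def select_e2e(candidates, budget=2):
    """Pick E2E candidates: one reserved slot first, then threshold-based picks."""
    ranked = sorted(candidates, key=lambda c: c.value_score, reverse=True)
    selected = []
    # Reserved slot: the highest-value user-facing journey, threshold not applied.
    reserved = next((c for c in ranked if c.user_facing_journey), None)
    if reserved is not None:
        selected.append(reserved)
    # Non-reserved slots: only candidates at or above the threshold.
    for c in ranked:
        if len(selected) >= budget:
            break
        if c is not reserved and c.value_score >= E2E_THRESHOLD:
            selected.append(c)
    return selected[:budget]


# Illustrative inputs; the numbers are made up.
checkout = Candidate("checkout", 4, 5, 0, 6, user_facing_journey=True)  # score 26
export = Candidate("export", 9, 6, 0, 5)                                # score 59
admin = Candidate("admin", 2, 2, 0, 3)                                  # score 7
picked = [c.name for c in select_e2e([checkout, export, admin])]
```

Here `checkout` takes the reserved slot despite scoring below 50, `export` qualifies through the threshold, and `admin` is excluded.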
51
83
 
52
84
  ## Test Skeleton Specification [MANDATORY]
53
85
 
@@ -62,7 +94,7 @@ Each test MUST include the following annotations:
62
94
  // @dependency: none | [component names] | full-system
63
95
  // @real-dependency: [component names] (optional)
64
96
  // @complexity: low | medium | high
65
- // ROI: [score]
97
+ // Value Score: [score]
66
98
  ```
67
99
 
68
100
  Adapt comment syntax to the project's language when generating or reviewing test skeletons.
@@ -1,6 +1,6 @@
1
1
  interface:
2
2
  display_name: "Integration & E2E Testing"
3
- short_description: "Test design and ROI calculation"
3
+ short_description: "Test design and value-based selection"
4
4
  default_prompt: "Use $integration-e2e-testing to design integration tests."
5
5
 
6
6
  policy:
@@ -2,7 +2,9 @@
2
2
 
3
3
  ## When to Create E2E Tests
4
4
 
5
- E2E tests target **critical user journeys** that span multiple pages or require real browser interaction. Apply the same ROI framework from the parent skill -- only create E2E tests when ROI > 50.
5
+ E2E tests target **critical user journeys** that span multiple pages or require real browser interaction. Apply the parent skill rules exactly:
6
+ - Reserve 1 E2E slot for the highest-value user-facing multi-step journey
7
+ - Use `Value Score >= 50` for any additional non-reserved E2E candidate
6
8
 
7
9
  ### Candidate Sources
8
10
 
@@ -15,7 +17,7 @@ E2E tests target **critical user journeys** that span multiple pages or require
15
17
 
16
18
  ### Selection Criteria
17
19
 
18
- **Include** (high E2E ROI):
20
+ **Include** (high-value E2E coverage):
19
21
  - Multi-page user journeys (login -> dashboard -> action -> confirmation)
20
22
  - Flows requiring real browser APIs (navigation, cookies, localStorage)
21
23
  - Accessibility verification requiring actual DOM rendering
@@ -44,7 +46,7 @@ User Journey: [Description of what the user accomplishes]
44
46
  Preconditions: [Auth state, data state]
45
47
  Verification Points:
46
48
  - [What to assert at each step]
47
- E2E ROI Score: [calculated score]
49
+ E2E Value Score: [calculated score]
48
50
  ```
49
51
 
50
52
  ## Playwright Test Architecture
@@ -82,5 +84,6 @@ When UI Spec defines responsive behavior, test critical breakpoints:
82
84
 
83
85
  Hard limits per feature (same as parent skill):
84
86
  - **E2E Tests**: MAX 1-2 tests
85
- - Only generate if ROI score > 50
87
+ - Generate the reserved user-journey E2E when eligible
88
+ - Generate any additional E2E only when `Value Score >= 50`
86
89
  - Prefer fewer, comprehensive journey tests over many granular tests
@@ -147,13 +147,16 @@ Check Step 5 result:
147
147
  Spawn quality-fixer routed by task filename pattern:
148
148
  - `*-backend-task-*` -> Spawn `quality-fixer`
149
149
  - `*-frontend-task-*` -> Spawn `quality-fixer-frontend`
150
- - Prompt: "Final quality assurance for test files added in this workflow. Run all tests and verify coverage."
150
+ - Prompt: "Final quality assurance for test files added in this workflow. Task file: [current task file]. filesModified: [Step 4 testsAdded]. Use these files as the stub-detection scope. Run all tests and verify coverage."
151
151
 
152
- **Expected output**: `status` (`approved`/`blocked`)
152
+ **Expected output**: `status` (`stub_detected`/`approved`/`blocked`)
153
153
 
154
154
  ### Step 8: Commit
155
155
 
156
- On `status: "approved"` from quality-fixer:
156
+ On quality-fixer result:
157
+ - `status: "stub_detected"` -> Return to Step 4 with `stubFindings`
158
+ - `status: "blocked"` -> Escalate to user
159
+ - `status: "approved"` -> Commit test files
157
160
  - MUST commit test files with appropriate message
158
161
  ENFORCEMENT: Commits without quality-fixer approval are invalid.
159
162
 
@@ -79,8 +79,12 @@ For EACH task, YOU MUST:
79
79
  - `needs_revision` -> Return to step 2 with `requiredFixes`
80
80
  - `approved` -> Proceed to step 4
81
81
  - `readyForQualityCheck: true` -> Proceed to step 4
82
- 4. **Spawn quality-fixer agent**: "Execute all quality checks and fixes"
83
- 5. **COMMIT on approval**: After `status: "approved"` from quality-fixer -> Execute git commit
82
+ 4. **Spawn quality-fixer agent**: "Execute all quality checks and fixes. Task file: [task-file-path]. The task file path above is also the `task_file` input. Read its `Quality Assurance Mechanisms` section as supplementary quality-check hints. filesModified: [task-executor response filesModified]. Use these files as the stub-detection scope."
83
+ 5. **CHECK quality-fixer response**:
84
+ - `status: "stub_detected"` -> Return to step 2 with `stubFindings`
85
+ - `status: "blocked"` -> STOP and escalate to user
86
+ - `status: "approved"` -> Proceed to step 6
87
+ 6. **COMMIT on approval**: After `status: "approved"` from quality-fixer -> Execute git commit
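The three-way status check in steps 5 and 6 amounts to a small dispatch. In this sketch the status values and the `stubFindings` field come from the text, while `next_action`, the action names, and raising on an unknown status are assumptions.

```python
def next_action(response: dict) -> dict:
    """Map a quality-fixer response to the orchestrator's next move.

    Hypothetical helper: the action names are illustrative labels only.
    """
    status = response.get("status")
    if status == "stub_detected":
        # Route back to the executor, carrying the findings along.
        return {"action": "return_to_executor",
                "stubFindings": response.get("stubFindings", [])}
    if status == "blocked":
        return {"action": "escalate_to_user"}
    if status == "approved":
        return {"action": "commit"}
    # Assumption: anything else is a failed gate, so refuse to proceed.
    raise ValueError(f"unexpected quality-fixer status: {status!r}")
```

Only `approved` leads to a commit, in line with the enforcement rule that proceeding past a failed quality gate invalidates subsequent work.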
84
88
 
85
89
  **CRITICAL**: MUST monitor ALL structured responses WITHOUT EXCEPTION and ENSURE every quality gate is passed.
86
90
  ENFORCEMENT: Proceeding past a failed quality gate invalidates all subsequent work.
@@ -8,7 +8,7 @@ description: "Investigate problem, verify findings, and derive solutions through
8
8
  1. [LOAD IF NOT ACTIVE] `ai-development-guide` — AI development patterns
9
9
  2. [LOAD IF NOT ACTIVE] `coding-rules` — coding standards
10
10
 
11
- **Context**: Diagnosis flow to identify root cause and present solutions
11
+ **Context**: Diagnosis flow to identify concrete failure points and present solutions
12
12
 
13
13
  Target problem: $ARGUMENTS
14
14
 
@@ -69,10 +69,10 @@ Confirm from rule-advisor output:
69
69
  ```
70
70
  Problem -> investigator -> verifier -> solver --+
71
71
  ^ |
72
- +-- confidence < high ---------+
72
+ +-- coverage insufficient -----+
73
73
  (max 2 iterations)
74
74
 
75
- confidence=high reached -> Report
75
+ coverage sufficient -> Report
76
76
  ```
77
77
 
78
78
  **Context Separation**: Pass only structured output to each step. Each step starts fresh with the data only.
@@ -99,7 +99,7 @@ For change failures, also include:
99
99
  - what both areas share
100
100
  ```
101
101
 
102
- **Expected output**: Evidence matrix, comparison analysis results, causal tracking results, list of unexplored areas, investigation limitations
102
+ **Expected output**: Evidence matrix, path map, failure points, comparison analysis results, list of unexplored areas, investigation limitations
103
103
 
104
104
  ### Step 2: Investigation Quality Check
105
105
 
@@ -107,10 +107,11 @@ Review investigation output:
107
107
 
108
108
  **Quality Check** (verify output contains the following):
109
109
  - [ ] `comparisonAnalysis` is present and `normalImplementation` is non-null, or explicitly states that no working implementation was found
110
- - [ ] causalChain for each hypothesis reaches a stop condition
111
- - [ ] causeCategory for each hypothesis
110
+ - [ ] `pathMap` is present with ordered nodes or explicit unknown segments
111
+ - [ ] causalChain for each failure point reaches a stop condition
112
+ - [ ] causeCategory for each failure point
112
113
  - [ ] `investigationSources` covers at least 3 distinct source types
113
- - [ ] each hypothesis has supporting evidence with a concrete source
114
+ - [ ] each failure point has supporting evidence with a concrete source
114
115
  - [ ] Investigation covering investigationFocus items (when provided)
115
116
 
116
117
  **If quality insufficient**: MUST re-spawn investigator agent specifying the missing items and include the previous investigation output for context
@@ -132,45 +133,45 @@ Proceed to verifier once quality is satisfied.
132
133
 
133
134
  Spawn verifier agent: "Verify the following investigation results. Investigation results: [Investigation output]"
134
135
 
135
- **Expected output**: Alternative hypotheses (at least 3), Devil's Advocate evaluation, final conclusion, confidence
136
+ **Expected output**: Path coverage findings, independent failure-point evaluation, final conclusion, coverageAssessment/finalStatus
136
137
 
137
- **Confidence Criteria**:
138
- - **high**: No uncertainty affecting solution selection or implementation
139
- - **medium**: Uncertainty exists but resolvable with additional investigation
140
- - **low**: Fundamental information gap exists
138
+ **Coverage Criteria**:
139
+ - **sufficient**: No major uncovered boundary affects solution selection or implementation
140
+ - **partial**: Some uncertainty exists but a bounded next investigation is possible
141
+ - **insufficient**: Fundamental information gap exists on the relevant path
141
142
 
142
143
  ### Step 4: Solution Derivation (solver)
143
144
 
144
- Spawn solver agent: "Derive solutions based on the following verified conclusion. Causes: [verifier's conclusion.causes]. Causes relationship: [causesRelationship: independent/dependent/exclusive]. Confidence: [high/medium/low]."
145
+ Spawn solver agent: "Derive solutions based on the following verified conclusion. Failure points: [verifier's conclusion.confirmedFailurePoints]. Failure-point relationships: [verifier's conclusion.failurePointRelationships]. Coverage assessment: [verifier's conclusion.coverageAssessment]. Final status: [verifier's conclusion.finalStatus]. Impact analysis: [investigator output impactAnalysis]."
145
146
 
146
147
  **Expected output**: Multiple solutions (at least 3), tradeoff analysis, recommendation and implementation steps, residual risks
147
148
 
148
- **Completion condition**: confidence=high
149
+ **Completion condition**: `coverageAssessment=sufficient` and `finalStatus=ready_for_solution`
149
150
 
150
151
  **When not reached**:
151
152
  1. Return to Step 1 with uncertainties identified by solver as investigation targets
152
153
  2. Maximum 2 additional investigation iterations
153
- 3. After 2 iterations without reaching high, present user with options:
154
+ 3. After 2 iterations without reaching sufficient coverage, present user with options:
154
155
  - Continue additional investigation
155
- - Execute solution at current confidence level
156
+ - Execute solution at current coverage level
156
157
 
157
158
  ### Step 5: Final Report Creation
158
159
 
159
- **Prerequisite**: confidence=high achieved
160
+ **Prerequisite**: sufficient coverage achieved
160
161
 
161
162
  After diagnosis completion, report to user in the following format:
162
163
 
163
164
  ```
164
165
  ## Diagnosis Result Summary
165
166
 
166
- ### Identified Causes
167
- [Cause list from verification results]
168
- - Causes relationship: [independent/dependent/exclusive]
167
+ ### Identified Failure Points
168
+ [Failure point list from verification results]
169
+ - Failure-point relationships: [independent/upstream_of/downstream_of/amplifies/same_boundary]
169
170
 
170
171
  ### Verification Process
171
172
  - Investigation scope: [Scope confirmed in investigation]
172
173
  - Additional investigation iterations: [0/1/2]
173
- - Alternative hypotheses count: [Number generated in verification]
174
+ - Coverage assessment: [sufficient/partial/insufficient]
174
175
 
175
176
  ### Recommended Solution
176
177
  [Solution derivation recommendation]
@@ -197,7 +198,7 @@ Rationale: [Selection rationale]
197
198
 
198
199
  - [ ] Spawned investigator and obtained evidence matrix, comparison analysis, and causal tracking
199
200
  - [ ] Performed investigation quality check and re-ran if insufficient
200
- - [ ] Spawned verifier and obtained confidence level
201
+ - [ ] Spawned verifier and obtained coverage assessment
201
202
  - [ ] Spawned solver
202
- - [ ] Achieved confidence=high (or obtained user approval after 2 additional iterations)
203
+ - [ ] Achieved sufficient coverage (or obtained user approval after 2 additional iterations)
203
204
  - [ ] Presented final report to user
@@ -87,8 +87,12 @@ For EACH task, YOU MUST:
87
87
  - `needs_revision` -> Return to step 2 with `requiredFixes`
88
88
  - `approved` -> Proceed to step 4
89
89
  - `readyForQualityCheck: true` -> Proceed to step 4
90
- 4. **Spawn quality-fixer-frontend agent**: "Execute all frontend quality checks and fixes"
91
- 5. **COMMIT on approval**: After `status: "approved"` from quality-fixer-frontend -> Execute git commit. Use `changeSummary` for commit message.
90
+ 4. **Spawn quality-fixer-frontend agent**: "Execute all frontend quality checks and fixes. Task file: docs/plans/tasks/[filename].md. The task file path above is also the `task_file` input. Read its `Quality Assurance Mechanisms` section as supplementary quality-check hints. filesModified: [task-executor-frontend response filesModified]. Use these files as the stub-detection scope."
91
+ 5. **CHECK quality-fixer-frontend response**:
92
+ - `status: "stub_detected"` -> Return to step 2 with `stubFindings`
93
+ - `status: "blocked"` -> STOP and escalate to user
94
+ - `status: "approved"` -> Proceed to step 6
95
+ 6. **COMMIT on approval**: After `status: "approved"` from quality-fixer-frontend -> Execute git commit. Use `changeSummary` for commit message.
92
96
 
93
97
  **CRITICAL**: MUST monitor ALL structured responses WITHOUT EXCEPTION and ENSURE every quality gate is passed.
94
98
  ENFORCEMENT: Proceeding past a failed quality gate invalidates all subsequent work.
@@ -46,7 +46,7 @@ Check for existence of design documents in docs/design/.
46
46
  Spawn acceptance-test-generator agent: "Generate test skeletons from Design Doc at [path]. [UI Spec at [ui-spec path] if exists.]"
47
47
 
48
48
  ### Step 3: Work Plan Creation
49
- Spawn work-planner agent: "Create work plan from Design Doc at [path]. Integration test file: [path from step 2]. E2E test file: [path from step 2]. Integration tests are created simultaneously with each phase implementation, E2E tests are executed only in final phase."
49
+ Spawn work-planner agent: "Create work plan from Design Doc at [path]. Integration test file: [path from step 2]. E2E test file: [path from step 2 or null]. E2E absence reason: [value from step 2 when E2E file is null]. Integration tests are created simultaneously with each phase implementation, E2E tests are executed only in final phase when an E2E file exists."
50
50
 
51
51
  **[STOP -- BLOCKING]** Interact with user to complete plan and obtain approval for plan content. Clarify specific implementation steps and risks.
52
52
  **CANNOT proceed until user explicitly approves the work plan.**
@@ -97,8 +97,12 @@ For EACH task, YOU MUST:
97
97
  - `needs_revision` -> Return to step 2 with `requiredFixes`
98
98
  - `approved` -> Proceed to step 4
99
99
  - `readyForQualityCheck: true` -> Proceed to step 4
100
- 4. **Spawn quality-fixer agent** (layer-appropriate per routing table): "Execute all quality checks and fixes"
101
- 5. **COMMIT on approval**: After `status: "approved"` from quality-fixer -> Execute git commit
100
+ 4. **Spawn quality-fixer agent** (layer-appropriate per routing table): "Execute all quality checks and fixes. Task file: [task-file-path]. The task file path above is also the `task_file` input. Read its `Quality Assurance Mechanisms` section as supplementary quality-check hints. filesModified: [executor response filesModified]. Use these files as the stub-detection scope."
101
+ 5. **CHECK quality-fixer response**:
102
+ - `status: "stub_detected"` -> Return to step 2 with `stubFindings`
103
+ - `status: "blocked"` -> STOP and escalate to user
104
+ - `status: "approved"` -> Proceed to step 6
105
+ 6. **COMMIT on approval**: After `status: "approved"` from quality-fixer -> Execute git commit
102
106
 
103
107
  **CRITICAL**: MUST monitor ALL structured responses WITHOUT EXCEPTION and ENSURE every quality gate is passed.
104
108
  ENFORCEMENT: Proceeding past a failed quality gate invalidates all subsequent work.
@@ -124,8 +124,9 @@ ENFORCEMENT: Sub-agent prompts missing the constraint suffix MUST be re-issued w
124
124
  **Rules**:
125
125
  1. Execute ONE task completely before starting next (each task goes through the full 4-step cycle individually, using the correct executor per filename pattern)
126
126
  2. Check executor status before quality-fixer (escalation check)
127
- 3. Quality-fixer MUST run after each executor (no skipping)
128
- 4. Commit MUST execute when quality-fixer returns `status: "approved"` (do not defer to end)
127
+ 3. Quality-fixer MUST run after each executor (no skipping), MUST receive the executor `filesModified` list as stub-detection scope, and MUST receive the current task file as the `task_file` input so it reads the task file's `Quality Assurance Mechanisms` section as supplementary quality-check hints
128
+ 4. If quality-fixer returns `status: "stub_detected"`, route the task back to the same executor with `stubFindings`
129
+ 5. Commit MUST execute only when quality-fixer returns `status: "approved"` (do not defer to end)
129
130
 
130
131
  ### Post-Implementation Verification (After All Tasks Complete)
131
132
 
@@ -149,8 +150,9 @@ After all task cycles finish, collect all `filesModified` from every task-execut
149
150
  ### Test Information Communication
150
151
  After acceptance-test-generator execution, when calling work-planner, communicate:
151
152
  - Generated integration test file path
152
- - Generated E2E test file path
153
- - Explicit note that integration tests are created simultaneously with implementation, E2E tests are executed after all implementations
153
+ - Generated E2E test file path or `null`
154
+ - E2E absence reason when no E2E file is generated
155
+ - Explicit note that integration tests are created simultaneously with implementation, E2E tests are executed after all implementations only when an E2E file exists
154
156
 
155
157
  **[STOP -- BLOCKING]** Upon detecting ANY requirement changes, halt execution immediately.
156
158
  **CANNOT proceed until user explicitly confirms the change scope.**
@@ -105,8 +105,12 @@ After user grants "batch approval for entire implementation phase", enter autono
105
105
  - `needs_revision` -> Return to step 1 with `requiredFixes`
106
106
  - `approved` -> Proceed to step 3
107
107
  - Otherwise -> Proceed to step 3
108
- 3. Spawn quality-fixer (or quality-fixer-frontend) agent: "Quality check and fixes"
109
- 4. git commit -> Execute on `status: "approved"`
108
+ 3. Spawn quality-fixer (or quality-fixer-frontend) agent: "Quality check and fixes. Task file: [task-file-path]. The task file path above is also the `task_file` input. Read its `Quality Assurance Mechanisms` section as supplementary quality-check hints. filesModified: [executor response filesModified]. Use these files as the stub-detection scope."
109
+ 4. Check quality-fixer response:
110
+ - `status: "stub_detected"` -> Return to step 1 with `stubFindings`
111
+ - `status: "blocked"` -> Escalate to user
112
+ - `status: "approved"` -> Proceed to step 5
113
+ 5. git commit -> Execute on `status: "approved"`
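The quality-fixer handling in steps 3-5 can be read as a small routing function. The following is a minimal TypeScript sketch and not part of the package: the names `QualityFixerResponse`, `NextAction`, and `routeQualityFixer` are hypothetical, and the element type of `stubFindings` is left as `unknown` because the diff does not define it. Only the three status values and the `stubFindings` field come from the steps above.

```typescript
// Hypothetical shapes for illustration; only the status strings and
// `stubFindings` are taken from the workflow text.
type QualityFixerResponse =
  | { status: "approved" }
  | { status: "stub_detected"; stubFindings: unknown[] }
  | { status: "blocked" };

type NextAction =
  | { kind: "commit" }
  | { kind: "return_to_executor"; stubFindings: unknown[] }
  | { kind: "escalate_to_user" };

function routeQualityFixer(res: QualityFixerResponse): NextAction {
  switch (res.status) {
    case "stub_detected":
      // Step 4: go back to step 1 and hand the findings to the executor.
      return { kind: "return_to_executor", stubFindings: res.stubFindings };
    case "blocked":
      // Step 4: the orchestrator stops and asks the user.
      return { kind: "escalate_to_user" };
    case "approved":
      // Step 5: commit happens only on an approved status.
      return { kind: "commit" };
  }
}
```

Modelling the response as a discriminated union means the compiler flags any status that is added later without a matching route.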
110
114
 
111
115
  ### Post-Implementation Verification (After All Tasks Complete)
112
116
 
@@ -130,8 +134,9 @@ After all task cycles finish, collect all `filesModified` from every executor re
130
134
  ### Test Information Communication
131
135
  After acceptance-test-generator execution, when spawning work-planner, communicate:
132
136
  - Generated integration test file path
133
- - Generated E2E test file path
134
- - Note: integration tests are created with implementation; E2E tests run after all implementations
137
+ - Generated E2E test file path or `null`
138
+ - E2E absence reason when no E2E file is generated
139
+ - Note: integration tests are created with implementation; E2E tests run after all implementations when an E2E file exists
135
140
 
136
141
  ## Completion Criteria
137
142
 
@@ -47,9 +47,10 @@ Present options if multiple exist (can be specified with $ARGUMENTS).
47
47
  - Confirm with user whether to generate E2E test skeleton first
48
48
  - If user wants generation: Spawn acceptance-test-generator agent: "Generate test skeletons from Design Doc at [design-doc-path]"
49
49
  - Pass generation results to next process according to subagents-orchestration-guide skill coordination specification
50
+ - If no E2E file is generated, carry the explicit `e2eAbsenceReason` forward as a valid planning input
50
51
 
51
52
  ### Step 3: Work Plan Creation
52
- - Spawn work-planner agent: "Create work plan from design document at [design-doc-path]. Include deliverables from previous process according to subagents-orchestration-guide skill coordination specification."
53
+ - Spawn work-planner agent: "Create work plan from design document at [design-doc-path]. Include deliverables from previous process according to subagents-orchestration-guide skill coordination specification. If `generatedFiles.e2e` is null, use `e2eAbsenceReason` and accept the null E2E file as a valid planning input."
53
54
  - Interact with user to complete plan and obtain approval for plan content
54
55
  - Clarify specific implementation steps and risks
55
56
 
@@ -109,7 +109,7 @@ Spawn [Update Agent from Step 2] agent: "Operation Mode: update. Existing Docume
109
109
 
110
110
  For Design Doc updates, first verify the updated document against code:
111
111
 
112
- Spawn code-verifier agent: "Verify the updated Design Doc against current code. doc_type: design-doc. document_path: [path from Step 1]. verbose: false."
112
+ Spawn code-verifier agent: "Verify the updated Design Doc against current code. doc_type: design-doc. document_path: [path from Step 1]. verbose: false. Focus especially on literal identifier referential integrity for concrete paths, endpoints, type names, config keys, and other exact identifiers changed in this update."
113
113
 
114
114
  **Store output as**: `$CODE_VERIFICATION_OUTPUT`
115
115
 
@@ -178,15 +178,15 @@ All agents MUST use this vocabulary consistently:
178
178
 
179
179
  Subagents respond in JSON format. The final response from each JSON-returning subagent must be the JSON payload itself, with no trailing prose. Key fields for orchestrator decisions:
180
180
  - **requirement-analyzer**: scale, confidence, affectedLayers, adrRequired, scopeDependencies, questions
181
- - **codebase-analyzer**: analysisScope, existingElements, dataModel, focusAreas, limitations
182
- - **task-executor**: status (escalation_needed/completed), escalation_type (design_compliance_violation/similar_function_found/similar_component_found/investigation_target_not_found/out_of_scope_file/test_environment_not_ready), testsAdded, requiresTestReview
183
- - **quality-fixer**: status (approved/blocked). For blocked responses, discriminate by `reason`: specification conflicts use `blockingIssues[]`; execution prerequisites use `missingPrerequisites[]`, and each item provides its own `resolutionSteps`
181
+ - **codebase-analyzer**: analysisScope, existingElements, dataModel, qualityAssurance, focusAreas, limitations
182
+ - **task-executor**: status (escalation_needed/completed), escalation_type (design_compliance_violation/similar_function_found/similar_component_found/investigation_target_not_found/out_of_scope_file/test_environment_not_ready/dependency_version_uncertain), testsAdded, requiresTestReview
183
+ - **quality-fixer**: Input: `task_file` (always pass the current task file path in orchestrated flows). Status (`stub_detected`/approved/blocked). `stub_detected` returns `stubFindings[]` and routes back to the task executor. For blocked responses, discriminate by `reason`: specification conflicts use `blockingIssues[]`; execution prerequisites use `missingPrerequisites[]`, and each item provides its own `resolutionSteps`
184
184
  - **document-reviewer**: verdict.decision (approved/approved_with_conditions/needs_revision/rejected)
185
185
  - **code-verifier**: summary.status, summary.consistencyScore, discrepancies, reverseCoverage
186
186
  - **design-sync**: sync_status (CONFLICTS_FOUND/NO_CONFLICTS) — text format with [SUMMARY] block
187
187
  - **integration-test-reviewer**: status (approved/needs_revision/blocked), requiredFixes
188
188
  - **security-reviewer**: status (approved/approved_with_notes/needs_revision/blocked), findings, notes, requiredFixes
189
- - **acceptance-test-generator**: status, generatedFiles
189
+ - **acceptance-test-generator**: status, generatedFiles, `e2eAbsenceReason`
190
190
 
191
191
  ## Handling Requirement Changes
192
192
 
@@ -252,7 +252,7 @@ When receiving new features or change requests, start with requirement-analyzer.
252
252
  ### Design Flow Data Passing
253
253
 
254
254
  - Pass requirement-analyzer output and original requirements to codebase-analyzer
255
- - Pass codebase-analyzer JSON to technical-designer or technical-designer-frontend as `Codebase Analysis`, including `dataTransformationPipelines` when present
255
+ - Pass codebase-analyzer JSON to technical-designer or technical-designer-frontend as `Codebase Analysis`, including `dataTransformationPipelines` and `qualityAssurance` when present
256
256
  - Pass Design Doc path to code-verifier
257
257
  - Pass code-verifier JSON to document-reviewer as `code_verification`
258
258
 
@@ -296,7 +296,8 @@ Batch approval -> Start autonomous execution mode
296
296
  - needs_revision -> back to task-executor
297
297
  - approved -> quality-fixer
298
298
  - No issues -> quality-fixer
299
- -> quality-fixer: Quality check and fixes
299
+ -> quality-fixer: Quality check and fixes using the executor `filesModified` set as the stub-detection scope
300
+ - stub_detected -> task-executor/task-executor-frontend: complete implementation -> re-run quality-fixer
300
301
  -> Orchestrator: Execute git commit
301
302
  -> Check remaining tasks:
302
303
  - Yes -> next task
@@ -352,13 +353,15 @@ Maximum retry count is 1 verification fix cycle. If any failed verifier still fa
352
353
 
353
354
  **Orchestrator verification items**:
354
355
  - Verify integration test file path retrieval and existence
355
- - Verify E2E test file path retrieval and existence
356
+ - Verify E2E test file path retrieval and existence when `generatedFiles.e2e` is not null
357
+ - Verify `e2eAbsenceReason` is present when `generatedFiles.e2e` is null
356
358
 
357
359
  **Pass to work-planner**:
358
360
  - Integration test file: [path] (create and execute simultaneously with each phase implementation)
359
- - E2E test file: [path] (execute only in final phase)
361
+ - E2E test file: [path] or `null` (execute only in final phase when present)
362
+ - E2E absence reason: [value when E2E test file is null]
360
363
 
361
- **On error**: Escalate to user if files are not generated
364
+ **On error**: Escalate to user only when required outputs are missing without a valid absence reason
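The verification items and the error rule above amount to one check on the acceptance-test-generator output. The sketch below is illustrative only and rests on assumptions: `generatedFiles.e2e` and `e2eAbsenceReason` appear in the text, but the `generatedFiles.integration` field name, the `AcceptanceTestOutput` and `HandoffCheck` types, and `checkHandoff` are hypothetical. It also treats the integration file as always required and omits file-existence checks on disk.

```typescript
// Hypothetical shape; only `generatedFiles.e2e` and `e2eAbsenceReason`
// are named in the workflow text.
interface AcceptanceTestOutput {
  generatedFiles: { integration: string | null; e2e: string | null };
  e2eAbsenceReason?: string | null;
}

type HandoffCheck =
  | { ok: true; e2e: string | null; e2eAbsenceReason: string | null }
  | { ok: false; escalate: string };

function checkHandoff(out: AcceptanceTestOutput): HandoffCheck {
  // Assumption: the integration test file is a required output.
  if (!out.generatedFiles.integration) {
    return { ok: false, escalate: "integration test file path is missing" };
  }
  const e2e = out.generatedFiles.e2e;
  // Treat an empty or whitespace-only reason as absent.
  const reason = out.e2eAbsenceReason?.trim() || null;
  if (e2e === null && reason === null) {
    return { ok: false, escalate: "E2E file is null without e2eAbsenceReason" };
  }
  // A null E2E file with a reason is a valid planning input.
  return { ok: true, e2e, e2eAbsenceReason: e2e === null ? reason : null };
}
```

With this reading, a null E2E path stops the flow only when no reason accompanies it, which matches the narrowed escalation rule.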
362
365
 
363
366
  ### Design Doc to Work Plan Verification Handoff
364
367
 
@@ -118,7 +118,7 @@ skills:
118
118
  integration-e2e-testing:
119
119
  skill: "integration-e2e-testing"
120
120
  tags: [testing, integration-testing, e2e-testing, test-design, behavior-first, roi, test-skeleton, ears-format]
121
- typical-use: "Integration and E2E test design principles, ROI-based test selection, behavior-first approach, test skeleton specification"
121
+ typical-use: "Integration and E2E test design principles, value-based test selection, behavior-first approach, test skeleton specification"
122
122
  size: medium
123
123
  key-references:
124
124
  - "Test Pyramid - Mike Cohn"
@@ -127,7 +127,7 @@ skills:
127
127
  - "References"
128
128
  - "Test Type Definition and Limits [MANDATORY]"
129
129
  - "Behavior-First Principle [MANDATORY]"
130
- - "ROI Calculation"
130
+ - "Value and Selection Model"
131
131
  - "Test Skeleton Specification [MANDATORY]"
132
132
  - "EARS Format Mapping"
133
133
  - "Test File Naming Convention"
@@ -213,7 +213,7 @@ export const test = base.extend<{ authenticatedPage: Page }>({
213
213
  ### E2E Budget
214
214
 
215
215
  - **MAX 1-2 E2E tests per feature**
216
- - Only generate if ROI score > 50
216
+ - Only generate an additional non-reserved E2E test when `Value Score >= 50`
217
217
  - Prefer fewer comprehensive journey tests over many granular tests
218
218
 
219
219
  ### Test Isolation