codex-workflows 0.4.7 → 0.4.9
- package/.agents/skills/ai-development-guide/SKILL.md +12 -2
- package/.agents/skills/coding-rules/SKILL.md +15 -0
- package/.agents/skills/documentation-criteria/references/design-template.md +6 -0
- package/.agents/skills/documentation-criteria/references/plan-template.md +9 -0
- package/.agents/skills/documentation-criteria/references/task-template.md +4 -0
- package/.agents/skills/integration-e2e-testing/SKILL.md +45 -13
- package/.agents/skills/integration-e2e-testing/agents/openai.yaml +1 -1
- package/.agents/skills/integration-e2e-testing/references/e2e-design.md +7 -4
- package/.agents/skills/recipe-add-integration-tests/SKILL.md +6 -3
- package/.agents/skills/recipe-build/SKILL.md +6 -2
- package/.agents/skills/recipe-diagnose/SKILL.md +24 -23
- package/.agents/skills/recipe-front-build/SKILL.md +6 -2
- package/.agents/skills/recipe-front-plan/SKILL.md +1 -1
- package/.agents/skills/recipe-fullstack-build/SKILL.md +6 -2
- package/.agents/skills/recipe-fullstack-implement/SKILL.md +6 -4
- package/.agents/skills/recipe-implement/SKILL.md +9 -4
- package/.agents/skills/recipe-plan/SKILL.md +2 -1
- package/.agents/skills/recipe-update-doc/SKILL.md +1 -1
- package/.agents/skills/subagents-orchestration-guide/SKILL.md +12 -9
- package/.agents/skills/task-analyzer/references/skills-index.yaml +2 -2
- package/.agents/skills/testing/references/typescript.md +1 -1
- package/.codex/agents/acceptance-test-generator.toml +49 -26
- package/.codex/agents/code-verifier.toml +3 -1
- package/.codex/agents/codebase-analyzer.toml +26 -1
- package/.codex/agents/investigator.toml +46 -18
- package/.codex/agents/quality-fixer-frontend.toml +95 -8
- package/.codex/agents/quality-fixer.toml +96 -8
- package/.codex/agents/solver.toml +29 -25
- package/.codex/agents/task-decomposer.toml +14 -0
- package/.codex/agents/task-executor-frontend.toml +37 -0
- package/.codex/agents/task-executor.toml +38 -0
- package/.codex/agents/technical-designer-frontend.toml +9 -2
- package/.codex/agents/technical-designer.toml +20 -5
- package/.codex/agents/verifier.toml +61 -60
- package/.codex/agents/work-planner.toml +19 -3
- package/README.md +7 -7
- package/package.json +1 -1
|
@@ -131,13 +131,14 @@ How to handle duplicate code based on Martin Fowler's "Refactoring":
|
|
|
131
131
|
- For low certainty cases, create minimal verification code first
|
|
132
132
|
|
|
133
133
|
### Pattern 5: Insufficient Existing Code Investigation
|
|
134
|
-
**Symptom**: Duplicate implementations, architecture inconsistency, integration failures
|
|
135
|
-
**Cause**: Insufficient understanding of existing code before implementation
|
|
134
|
+
**Symptom**: Duplicate implementations, architecture inconsistency, integration failures, adopting outdated patterns
|
|
135
|
+
**Cause**: Insufficient understanding of existing code before implementation; referencing only nearby files without checking representativeness
|
|
136
136
|
**Avoidance**:
|
|
137
137
|
- Before implementation, always search for similar functionality
|
|
138
138
|
- Similar functionality found: Use that implementation (do not create new)
|
|
139
139
|
- Similar functionality is technical debt: Create ADR improvement proposal
|
|
140
140
|
- No similar functionality: Implement following existing design philosophy
|
|
141
|
+
- When adopting a pattern or dependency from nearby code, verify it is representative across the repository before adopting it
|
|
141
142
|
|
|
142
143
|
## Debugging Techniques
|
|
143
144
|
|
|
@@ -175,6 +176,15 @@ Pattern: Structured logging with context
|
|
|
175
176
|
}
|
|
176
177
|
```
|
|
177
178
|
|
|
179
|
+
## Quality Assurance Mechanism Awareness
|
|
180
|
+
|
|
181
|
+
Before executing quality checks, discover applicable quality tools and constraints by inspecting the affected files' types, project manifests, CI pipelines, and configuration:
|
|
182
|
+
- Primary detection: inspect affected file types, manifests, configuration, and CI pipelines to identify applicable quality tools
|
|
183
|
+
- Check for domain-specific linters or validators such as schema validators, API spec validators, or configuration-file checkers
|
|
184
|
+
- Check for domain-specific constraints in project configuration such as naming rules, length limits, or format requirements
|
|
185
|
+
- When a task file lists `Quality Assurance Mechanisms`, use that section as supplementary guidance for what to verify
|
|
186
|
+
- Include discovered domain-specific checks alongside the standard quality phases below
|
|
187
|
+
|
|
178
188
|
## Quality Check Workflow [MANDATORY]
|
|
179
189
|
|
|
180
190
|
Universal quality assurance phases applicable to all languages:
|
|
@@ -51,6 +51,21 @@ For language-specific rules, also read:
|
|
|
51
51
|
- Depend on abstractions, not concrete implementations
|
|
52
52
|
- Minimize inter-module dependencies
|
|
53
53
|
|
|
54
|
+
## Reference Representativeness
|
|
55
|
+
|
|
56
|
+
### Verifying References Before Adoption
|
|
57
|
+
|
|
58
|
+
When adopting patterns, APIs, or dependencies from existing code:
|
|
59
|
+
- If referencing only nearby files, verify the pattern is representative across the repository before adopting it
|
|
60
|
+
- If multiple approaches coexist, identify the majority pattern and make a deliberate choice
|
|
61
|
+
- If adopting an external dependency, verify repository-wide usage distribution for that dependency and its version
|
|
62
|
+
- If repository evidence is insufficient to choose an appropriate dependency version, escalate instead of guessing
|
|
63
|
+
- If following an existing pattern when alternatives exist, state the reason for following it
|
|
64
|
+
|
|
65
|
+
### Principle
|
|
66
|
+
|
|
67
|
+
Nearby code is a starting point for investigation, not a sufficient basis for adoption. Confirm that the reference is representative of repository conventions before using it as the model.
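The representativeness check described in this section could be sketched as a simple tally over repository-wide observations. This is only an illustration, not part of the package: the function name `majority_pattern`, the shape of the observation list, and the `min_share` cutoff are assumptions introduced here.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def majority_pattern(
    observations: Iterable[str], min_share: float = 0.5
) -> Optional[Tuple[str, float]]:
    """Return (pattern, share) for the dominant pattern, or None.

    `observations` is a hypothetical list with one entry per usage site found
    by a repository-wide search (for example "zod@3" per importing file).
    `min_share` is an assumed cutoff; the source does not define one.
    None means the evidence is insufficient, so the caller should escalate
    instead of guessing.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    if total == 0:
        return None
    pattern, hits = counts.most_common(1)[0]
    share = hits / total
    # Require a strict majority before treating the pattern as representative.
    return (pattern, share) if share > min_share else None
```

A caller would pass every usage found across the repository, not only those in nearby files, and treat a `None` result as the signal to escalate.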
|
|
68
|
+
|
|
54
69
|
## Performance
|
|
55
70
|
|
|
56
71
|
- **Measure first**: Profile before optimizing — no premature optimization
|
|
@@ -52,6 +52,12 @@ unknowns:
|
|
|
52
52
|
- [ ] [Standard/convention] `[explicit]` - Source: [config / rule file / documentation path]
|
|
53
53
|
- [ ] [Observed pattern] `[implicit]` - Evidence: [file paths] - Confirmed: [Yes/No]
|
|
54
54
|
|
|
55
|
+
#### Quality Assurance Mechanisms
|
|
56
|
+
How quality is enforced in the change area. Each item is either adopted for this change or noted with a reason.
|
|
57
|
+
|
|
58
|
+
- [ ] [Tool/check name] — Enforces: [what] — Config: [path] — Covers: [file paths/patterns or "project-wide"] — Status: `adopted` / `noted (reason)`
|
|
59
|
+
- [ ] [Domain-specific constraint] — Enforces: [what] — Source: [path] — Covers: [file paths/patterns or "project-wide"] — Status: `adopted` / `noted (reason)`
|
|
60
|
+
|
|
55
61
|
### Problem to Solve
|
|
56
62
|
|
|
57
63
|
[Specific problems or challenges this feature aims to address]
|
|
@@ -32,6 +32,15 @@ Repeat this block for each Design Doc when multiple Design Docs exist. Preserve
|
|
|
32
32
|
- **Success criteria**: [extracted from Design Doc]
|
|
33
33
|
- **Failure response**: [extracted from Design Doc]
|
|
34
34
|
|
|
35
|
+
## Quality Assurance Mechanisms (from Design Docs)
|
|
36
|
+
|
|
37
|
+
Adopted quality gates for the change area. Each task in this plan must satisfy the applicable mechanisms.
|
|
38
|
+
|
|
39
|
+
| Mechanism | Enforces | Config Location | Covered Files |
|
|
40
|
+
|-----------|----------|-----------------|---------------|
|
|
41
|
+
| [Tool/check name] | [What quality aspect it enforces] | [path/to/config] | [file paths or patterns covered, or "project-wide"] |
|
|
42
|
+
| [Domain constraint] | [What it enforces] | [path/to/source] | [file paths or patterns covered, or "project-wide"] |
|
|
43
|
+
|
|
35
44
|
## Design-to-Plan Traceability
|
|
36
45
|
|
|
37
46
|
Map each Design Doc technical requirement to the task or phase that covers it. Use one row per extracted requirement item. Every row must have at least one covering task, or an explicit justified gap.
|
|
@@ -37,6 +37,10 @@ Brief observations recorded after reading Investigation Targets:
|
|
|
37
37
|
- [ ] Improve code (maintain passing tests)
|
|
38
38
|
- [ ] Confirm added tests still pass
|
|
39
39
|
|
|
40
|
+
## Quality Assurance Mechanisms
|
|
41
|
+
(From the work plan header — include only mechanisms relevant to this task's target files)
|
|
42
|
+
- [Tool/check name] — Enforces: [what] — Config: [path]
|
|
43
|
+
|
|
40
44
|
## Operation Verification Methods
|
|
41
45
|
(Derived from Verification Strategy in the work plan)
|
|
42
46
|
- **Verification method**: [What to verify and how]
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: integration-e2e-testing
|
|
3
|
-
description: "Integration and E2E test design principles,
|
|
3
|
+
description: "Integration and E2E test design principles, value-based selection, test skeleton specification, and review criteria. Use when: designing integration tests, E2E tests, generating test skeletons, or reviewing test quality."
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
# Integration and E2E Testing Principles
|
|
@@ -20,13 +20,13 @@ description: "Integration and E2E test design principles, ROI calculation, test
|
|
|
20
20
|
|
|
21
21
|
## Behavior-First Principle [MANDATORY]
|
|
22
22
|
|
|
23
|
-
### MUST Include (High
|
|
23
|
+
### MUST Include (High Value)
|
|
24
24
|
- Business logic correctness (calculations, state transitions, data transformations)
|
|
25
25
|
- Data integrity and persistence behavior
|
|
26
26
|
- User-visible functionality completeness
|
|
27
27
|
- Error handling behavior (what user sees/experiences)
|
|
28
28
|
|
|
29
|
-
### MUST Exclude (Low
|
|
29
|
+
### MUST Exclude (Low Value in CI/CD)
|
|
30
30
|
- External service real connections — use contract/interface verification instead
|
|
31
31
|
- Performance metrics — non-deterministic, defer to load testing
|
|
32
32
|
- Implementation details — test observable behavior only
|
|
@@ -34,20 +34,52 @@ description: "Integration and E2E test design principles, ROI calculation, test
|
|
|
34
34
|
|
|
35
35
|
**ENFORCEMENT**: Test = User-observable behavior verifiable in isolated CI environment
|
|
36
36
|
|
|
37
|
-
##
|
|
37
|
+
## Value and Selection Model
|
|
38
38
|
|
|
39
39
|
```
|
|
40
|
-
|
|
41
|
-
/ (Creation Cost + Execution Cost + Maintenance Cost)
|
|
40
|
+
Value Score = (Business Value x User Frequency) + (Legal Requirement x 10) + Defect Detection
|
|
42
41
|
```
|
|
43
42
|
|
|
44
|
-
|
|
43
|
+
Use `Value Score` for ranking candidates of the same test type. Handle E2E cost through budget limits and reserved-slot rules instead of cost-division scoring.
|
|
45
44
|
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
49
|
-
|
|
50
|
-
|
|
45
|
+
### E2E Threshold
|
|
46
|
+
|
|
47
|
+
- `E2E threshold = Value Score >= 50`
|
|
48
|
+
- Use this threshold for non-reserved E2E selection only
|
|
49
|
+
- Reserved-slot eligibility overrides the threshold when the candidate is the highest-value user-facing multi-step journey
|
|
50
|
+
|
|
51
|
+
### Selection Rules
|
|
52
|
+
|
|
53
|
+
| Test Type | Ranking Basis | Selection Rule |
|
|
54
|
+
|-----------|---------------|----------------|
|
|
55
|
+
| Integration | Highest `Value Score` among integration candidates | Select up to budget |
|
|
56
|
+
| E2E | Highest `Value Score` among E2E candidates | Select when `reservedSlotEligible = true`, or when `Value Score >= 50` |
|
|
57
|
+
|
|
58
|
+
### E2E Candidate Rules
|
|
59
|
+
|
|
60
|
+
- Treat integration and E2E as complementary coverage layers
|
|
61
|
+
- Retain an E2E candidate when it validates a user-facing multi-step journey, even if integration tests partially cover the behavior
|
|
62
|
+
- Preserve E2E candidates for user-facing multi-step journeys that validate cross-screen or cross-boundary continuity
|
|
63
|
+
- Distinguish user-facing journeys from service-internal chains; reserved E2E coverage applies only to user-facing journeys
|
|
64
|
+
|
|
65
|
+
### Reserved E2E Slot
|
|
66
|
+
|
|
67
|
+
Reserve 1 E2E slot for the highest-value user-facing multi-step journey when such a journey exists, even if it does not satisfy `Value Score >= 50`.
|
|
68
|
+
|
|
69
|
+
### E2E Absence Contract
|
|
70
|
+
|
|
71
|
+
When no E2E test is generated, downstream artifacts must treat that as an explicit decision, not an error. Carry:
|
|
72
|
+
- `generatedFiles.e2e: null`
|
|
73
|
+
- `e2eAbsenceReason`: one of `no_user_facing_multi_step_journey`, `all_e2e_candidates_below_threshold`, `covered_by_existing_e2e`, `budget_not_justified`
|
|
74
|
+
|
|
75
|
+
### E2E Selection Decision Table
|
|
76
|
+
|
|
77
|
+
| Condition | Result |
|
|
78
|
+
|-----------|--------|
|
|
79
|
+
| At least one user-facing multi-step journey exists | Reserve 1 E2E slot for the highest-value such journey |
|
|
80
|
+
| Remaining E2E candidate has `Value Score >= 50` | Eligible for non-reserved E2E selection |
|
|
81
|
+
| Remaining E2E candidate has `Value Score < 50` | Exclude and use `all_e2e_candidates_below_threshold` if no E2E remains |
|
|
82
|
+
| Existing E2E already covers the same journey | Exclude and use `covered_by_existing_e2e` if no E2E remains |
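The scoring and selection rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the `Candidate` fields, treating `Legal Requirement` as a 0/1 flag, the default budget of 2, and the reason returned for an empty candidate list are all hypothetical. The `budget_not_justified` reason is not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

E2E_THRESHOLD = 50  # "E2E threshold = Value Score >= 50"


@dataclass
class Candidate:
    # All field names are hypothetical; the source only names the formula terms.
    name: str
    business_value: int
    user_frequency: int
    legal_requirement: bool  # assumed to act as a 0/1 flag in the formula
    defect_detection: int
    user_facing_multi_step: bool = False
    covered_by_existing_e2e: bool = False

    @property
    def value_score(self) -> int:
        # Value Score = (Business Value x User Frequency)
        #             + (Legal Requirement x 10) + Defect Detection
        legal = 10 if self.legal_requirement else 0
        return self.business_value * self.user_frequency + legal + self.defect_detection


def select_e2e(
    candidates: List[Candidate], budget: int = 2
) -> Tuple[List[Candidate], Optional[str]]:
    """Return (selected candidates, e2eAbsenceReason or None)."""
    if not candidates:
        # Assumed mapping: no candidates means no journey was found.
        return [], "no_user_facing_multi_step_journey"
    remaining = [c for c in candidates if not c.covered_by_existing_e2e]
    if not remaining:
        return [], "covered_by_existing_e2e"

    selected: List[Candidate] = []
    reserved: Optional[Candidate] = None
    journeys = [c for c in remaining if c.user_facing_multi_step]
    if journeys:
        # The reserved slot ignores the threshold.
        reserved = max(journeys, key=lambda c: c.value_score)
        selected.append(reserved)

    # Non-reserved candidates must meet the threshold and fit the budget.
    eligible = sorted(
        (c for c in remaining if c is not reserved and c.value_score >= E2E_THRESHOLD),
        key=lambda c: c.value_score,
        reverse=True,
    )
    selected.extend(eligible[: max(0, budget - len(selected))])

    if selected:
        return selected, None
    return [], "all_e2e_candidates_below_threshold"
```

The reserved journey is chosen before the threshold is applied, which is why a journey scoring below 50 can still be selected.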
|
|
51
83
|
|
|
52
84
|
## Test Skeleton Specification [MANDATORY]
|
|
53
85
|
|
|
@@ -62,7 +94,7 @@ Each test MUST include the following annotations:
|
|
|
62
94
|
// @dependency: none | [component names] | full-system
|
|
63
95
|
// @real-dependency: [component names] (optional)
|
|
64
96
|
// @complexity: low | medium | high
|
|
65
|
-
//
|
|
97
|
+
// Value Score: [score]
|
|
66
98
|
```
|
|
67
99
|
|
|
68
100
|
Adapt comment syntax to the project's language when generating or reviewing test skeletons.
|
|
@@ -2,7 +2,9 @@
|
|
|
2
2
|
|
|
3
3
|
## When to Create E2E Tests
|
|
4
4
|
|
|
5
|
-
E2E tests target **critical user journeys** that span multiple pages or require real browser interaction. Apply the
|
|
5
|
+
E2E tests target **critical user journeys** that span multiple pages or require real browser interaction. Apply the parent skill rules exactly:
|
|
6
|
+
- Reserve 1 E2E slot for the highest-value user-facing multi-step journey
|
|
7
|
+
- Use `Value Score >= 50` for any additional non-reserved E2E candidate
|
|
6
8
|
|
|
7
9
|
### Candidate Sources
|
|
8
10
|
|
|
@@ -15,7 +17,7 @@ E2E tests target **critical user journeys** that span multiple pages or require
|
|
|
15
17
|
|
|
16
18
|
### Selection Criteria
|
|
17
19
|
|
|
18
|
-
**Include** (high E2E
|
|
20
|
+
**Include** (high-value E2E coverage):
|
|
19
21
|
- Multi-page user journeys (login -> dashboard -> action -> confirmation)
|
|
20
22
|
- Flows requiring real browser APIs (navigation, cookies, localStorage)
|
|
21
23
|
- Accessibility verification requiring actual DOM rendering
|
|
@@ -44,7 +46,7 @@ User Journey: [Description of what the user accomplishes]
|
|
|
44
46
|
Preconditions: [Auth state, data state]
|
|
45
47
|
Verification Points:
|
|
46
48
|
- [What to assert at each step]
|
|
47
|
-
E2E
|
|
49
|
+
E2E Value Score: [calculated score]
|
|
48
50
|
```
|
|
49
51
|
|
|
50
52
|
## Playwright Test Architecture
|
|
@@ -82,5 +84,6 @@ When UI Spec defines responsive behavior, test critical breakpoints:
|
|
|
82
84
|
|
|
83
85
|
Hard limits per feature (same as parent skill):
|
|
84
86
|
- **E2E Tests**: MAX 1-2 tests
|
|
85
|
-
-
|
|
87
|
+
- Generate the reserved user-journey E2E when eligible
|
|
88
|
+
- Generate any additional E2E only when `Value Score >= 50`
|
|
86
89
|
- Prefer fewer, comprehensive journey tests over many granular tests
|
|
@@ -147,13 +147,16 @@ Check Step 5 result:
|
|
|
147
147
|
Spawn quality-fixer routed by task filename pattern:
|
|
148
148
|
- `*-backend-task-*` -> Spawn `quality-fixer`
|
|
149
149
|
- `*-frontend-task-*` -> Spawn `quality-fixer-frontend`
|
|
150
|
-
- Prompt: "Final quality assurance for test files added in this workflow. Run all tests and verify coverage."
|
|
150
|
+
- Prompt: "Final quality assurance for test files added in this workflow. Task file: [current task file]. filesModified: [Step 4 testsAdded]. Use these files as the stub-detection scope. Run all tests and verify coverage."
|
|
151
151
|
|
|
152
|
-
**Expected output**: `status` (`approved`/`blocked`)
|
|
152
|
+
**Expected output**: `status` (`stub_detected`/`approved`/`blocked`)
|
|
153
153
|
|
|
154
154
|
### Step 8: Commit
|
|
155
155
|
|
|
156
|
-
On
|
|
156
|
+
On quality-fixer result:
|
|
157
|
+
- `status: "stub_detected"` -> Return to Step 4 with `stubFindings`
|
|
158
|
+
- `status: "blocked"` -> Escalate to user
|
|
159
|
+
- `status: "approved"` -> Commit test files
|
|
157
160
|
- MUST commit test files with appropriate message
|
|
158
161
|
ENFORCEMENT: Commits without quality-fixer approval are invalid.
|
|
159
162
|
|
|
@@ -79,8 +79,12 @@ For EACH task, YOU MUST:
|
|
|
79
79
|
- `needs_revision` -> Return to step 2 with `requiredFixes`
|
|
80
80
|
- `approved` -> Proceed to step 4
|
|
81
81
|
- `readyForQualityCheck: true` -> Proceed to step 4
|
|
82
|
-
4. **Spawn quality-fixer agent**: "Execute all quality checks and fixes"
|
|
83
|
-
5. **
|
|
82
|
+
4. **Spawn quality-fixer agent**: "Execute all quality checks and fixes. Task file: [task-file-path]. The task file path above is also the `task_file` input. Read its `Quality Assurance Mechanisms` section as supplementary quality-check hints. filesModified: [task-executor response filesModified]. Use these files as the stub-detection scope."
|
|
83
|
+
5. **CHECK quality-fixer response**:
|
|
84
|
+
- `status: "stub_detected"` -> Return to step 2 with `stubFindings`
|
|
85
|
+
- `status: "blocked"` -> STOP and escalate to user
|
|
86
|
+
- `status: "approved"` -> Proceed to step 6
|
|
87
|
+
6. **COMMIT on approval**: After `status: "approved"` from quality-fixer -> Execute git commit
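The three-way routing on the quality-fixer response in steps 5 and 6 could be sketched like this. It is an illustration only: the function name `next_action` and the returned action labels are hypothetical, and the response is assumed to be a dictionary carrying the `status` and `stubFindings` fields named above.

```python
from typing import Any, Dict


def next_action(response: Dict[str, Any]) -> Dict[str, Any]:
    """Map a quality-fixer response to the orchestrator's next step.

    The action labels are hypothetical; only the status values and the
    `stubFindings` field come from the workflow text.
    """
    status = response.get("status")
    if status == "stub_detected":
        # Send the task back to the executor with the findings attached.
        return {
            "action": "return_to_step_2",
            "stubFindings": response.get("stubFindings", []),
        }
    if status == "blocked":
        return {"action": "stop_and_escalate"}
    if status == "approved":
        return {"action": "commit"}
    # Any other status is treated as a failed quality gate, never as approval.
    raise ValueError(f"unexpected quality-fixer status: {status!r}")
```

Raising on an unknown status keeps a malformed response from being read as approval, in line with the enforcement note that follows.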
|
|
84
88
|
|
|
85
89
|
**CRITICAL**: MUST monitor ALL structured responses WITHOUT EXCEPTION and ENSURE every quality gate is passed.
|
|
86
90
|
ENFORCEMENT: Proceeding past a failed quality gate invalidates all subsequent work.
|
|
@@ -8,7 +8,7 @@ description: "Investigate problem, verify findings, and derive solutions through
|
|
|
8
8
|
1. [LOAD IF NOT ACTIVE] `ai-development-guide` — AI development patterns
|
|
9
9
|
2. [LOAD IF NOT ACTIVE] `coding-rules` — coding standards
|
|
10
10
|
|
|
11
|
-
**Context**: Diagnosis flow to identify
|
|
11
|
+
**Context**: Diagnosis flow to identify concrete failure points and present solutions
|
|
12
12
|
|
|
13
13
|
Target problem: $ARGUMENTS
|
|
14
14
|
|
|
@@ -69,10 +69,10 @@ Confirm from rule-advisor output:
|
|
|
69
69
|
```
|
|
70
70
|
Problem -> investigator -> verifier -> solver --+
|
|
71
71
|
^ |
|
|
72
|
-
+--
|
|
72
|
+
+-- coverage insufficient -----+
|
|
73
73
|
(max 2 iterations)
|
|
74
74
|
|
|
75
|
-
|
|
75
|
+
coverage sufficient -> Report
|
|
76
76
|
```
|
|
77
77
|
|
|
78
78
|
**Context Separation**: Pass only structured output to each step. Each step starts fresh with the data only.
|
|
@@ -99,7 +99,7 @@ For change failures, also include:
|
|
|
99
99
|
- what both areas share
|
|
100
100
|
```
|
|
101
101
|
|
|
102
|
-
**Expected output**: Evidence matrix,
|
|
102
|
+
**Expected output**: Evidence matrix, path map, failure points, comparison analysis results, list of unexplored areas, investigation limitations
|
|
103
103
|
|
|
104
104
|
### Step 2: Investigation Quality Check
|
|
105
105
|
|
|
@@ -107,10 +107,11 @@ Review investigation output:
|
|
|
107
107
|
|
|
108
108
|
**Quality Check** (verify output contains the following):
|
|
109
109
|
- [ ] `comparisonAnalysis` is present and `normalImplementation` is non-null, or explicitly states that no working implementation was found
|
|
110
|
-
- [ ]
|
|
111
|
-
- [ ]
|
|
110
|
+
- [ ] `pathMap` is present with ordered nodes or explicit unknown segments
|
|
111
|
+
- [ ] causalChain for each failure point reaches a stop condition
|
|
112
|
+
- [ ] causeCategory for each failure point
|
|
112
113
|
- [ ] `investigationSources` covers at least 3 distinct source types
|
|
113
|
-
- [ ] each
|
|
114
|
+
- [ ] each failure point has supporting evidence with a concrete source
|
|
114
115
|
- [ ] Investigation covering investigationFocus items (when provided)
|
|
115
116
|
|
|
116
117
|
**If quality insufficient**: MUST re-spawn investigator agent specifying the missing items and include the previous investigation output for context
|
|
@@ -132,45 +133,45 @@ Proceed to verifier once quality is satisfied.
|
|
|
132
133
|
|
|
133
134
|
Spawn verifier agent: "Verify the following investigation results. Investigation results: [Investigation output]"
|
|
134
135
|
|
|
135
|
-
**Expected output**:
|
|
136
|
+
**Expected output**: Path coverage findings, independent failure-point evaluation, final conclusion, coverageAssessment/finalStatus
|
|
136
137
|
|
|
137
|
-
**
|
|
138
|
-
- **
|
|
139
|
-
- **
|
|
140
|
-
- **
|
|
138
|
+
**Coverage Criteria**:
|
|
139
|
+
- **sufficient**: No major uncovered boundary affects solution selection or implementation
|
|
140
|
+
- **partial**: Some uncertainty exists but a bounded next investigation is possible
|
|
141
|
+
- **insufficient**: Fundamental information gap exists on the relevant path
|
|
141
142
|
|
|
142
143
|
### Step 4: Solution Derivation (solver)
|
|
143
144
|
|
|
144
|
-
Spawn solver agent: "Derive solutions based on the following verified conclusion.
|
|
145
|
+
Spawn solver agent: "Derive solutions based on the following verified conclusion. Failure points: [verifier's conclusion.confirmedFailurePoints]. Failure-point relationships: [verifier's conclusion.failurePointRelationships]. Coverage assessment: [verifier's conclusion.coverageAssessment]. Final status: [verifier's conclusion.finalStatus]. Impact analysis: [investigator output impactAnalysis]."
|
|
145
146
|
|
|
146
147
|
**Expected output**: Multiple solutions (at least 3), tradeoff analysis, recommendation and implementation steps, residual risks
|
|
147
148
|
|
|
148
|
-
**Completion condition**:
|
|
149
|
+
**Completion condition**: `coverageAssessment=sufficient` and `finalStatus=ready_for_solution`
|
|
149
150
|
|
|
150
151
|
**When not reached**:
|
|
151
152
|
1. Return to Step 1 with uncertainties identified by solver as investigation targets
|
|
152
153
|
2. Maximum 2 additional investigation iterations
|
|
153
|
-
3. After 2 iterations without reaching
|
|
154
|
+
3. After 2 iterations without reaching sufficient coverage, present user with options:
|
|
154
155
|
- Continue additional investigation
|
|
155
|
-
- Execute solution at current
|
|
156
|
+
- Execute solution at current coverage level
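The completion condition and the bounded retry loop above might be sketched as below. The three callables stand in for the investigator, verifier, and solver agents. Their signatures, the `uncertainties` key, and the shape of the returned dictionaries are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Optional

MAX_ADDITIONAL_ITERATIONS = 2  # "Maximum 2 additional investigation iterations"


def is_complete(conclusion: Dict[str, Any]) -> bool:
    # Completion: coverageAssessment=sufficient and finalStatus=ready_for_solution
    return (
        conclusion.get("coverageAssessment") == "sufficient"
        and conclusion.get("finalStatus") == "ready_for_solution"
    )


def run_diagnosis(
    investigate: Callable[[str, Optional[List[str]]], Dict[str, Any]],
    verify: Callable[[Dict[str, Any]], Dict[str, Any]],
    solve: Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]],
    problem: str,
) -> Dict[str, Any]:
    """Run investigator -> verifier -> solver, retrying at most twice."""
    targets: Optional[List[str]] = None
    # One initial pass plus up to MAX_ADDITIONAL_ITERATIONS retries.
    for iteration in range(MAX_ADDITIONAL_ITERATIONS + 1):
        investigation = investigate(problem, targets)
        conclusion = verify(investigation)
        solution = solve(conclusion, investigation)
        if is_complete(conclusion):
            return {"outcome": "report", "iterations": iteration, "solution": solution}
        # Uncertainties named by the solver become the next investigation targets.
        targets = solution.get("uncertainties", [])
    return {
        "outcome": "ask_user",
        "iterations": MAX_ADDITIONAL_ITERATIONS,
        "options": [
            "Continue additional investigation",
            "Execute solution at current coverage level",
        ],
    }
```

The `iterations` value counts additional passes only, matching the `[0/1/2]` field in the final report format.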
|
|
156
157
|
|
|
157
158
|
### Step 5: Final Report Creation
|
|
158
159
|
|
|
159
|
-
**Prerequisite**:
|
|
160
|
+
**Prerequisite**: sufficient coverage achieved
|
|
160
161
|
|
|
161
162
|
After diagnosis completion, report to user in the following format:
|
|
162
163
|
|
|
163
164
|
```
|
|
164
165
|
## Diagnosis Result Summary
|
|
165
166
|
|
|
166
|
-
### Identified
|
|
167
|
-
[
|
|
168
|
-
-
|
|
167
|
+
### Identified Failure Points
|
|
168
|
+
[Failure point list from verification results]
|
|
169
|
+
- Failure-point relationships: [independent/upstream_of/downstream_of/amplifies/same_boundary]
|
|
169
170
|
|
|
170
171
|
### Verification Process
|
|
171
172
|
- Investigation scope: [Scope confirmed in investigation]
|
|
172
173
|
- Additional investigation iterations: [0/1/2]
|
|
173
|
-
-
|
|
174
|
+
- Coverage assessment: [sufficient/partial/insufficient]
|
|
174
175
|
|
|
175
176
|
### Recommended Solution
|
|
176
177
|
[Solution derivation recommendation]
|
|
@@ -197,7 +198,7 @@ Rationale: [Selection rationale]
|
|
|
197
198
|
|
|
198
199
|
- [ ] Spawned investigator and obtained evidence matrix, comparison analysis, and causal tracking
|
|
199
200
|
- [ ] Performed investigation quality check and re-ran if insufficient
|
|
200
|
-
- [ ] Spawned verifier and obtained
|
|
201
|
+
- [ ] Spawned verifier and obtained coverage assessment
|
|
201
202
|
- [ ] Spawned solver
|
|
202
|
-
- [ ] Achieved
|
|
203
|
+
- [ ] Achieved sufficient coverage (or obtained user approval after 2 additional iterations)
|
|
203
204
|
- [ ] Presented final report to user
|
|
@@ -87,8 +87,12 @@ For EACH task, YOU MUST:
|
|
|
87
87
|
- `needs_revision` -> Return to step 2 with `requiredFixes`
|
|
88
88
|
- `approved` -> Proceed to step 4
|
|
89
89
|
- `readyForQualityCheck: true` -> Proceed to step 4
|
|
90
|
-
4. **Spawn quality-fixer-frontend agent**: "Execute all frontend quality checks and fixes"
|
|
91
|
-
5. **
|
|
90
|
+
4. **Spawn quality-fixer-frontend agent**: "Execute all frontend quality checks and fixes. Task file: docs/plans/tasks/[filename].md. The task file path above is also the `task_file` input. Read its `Quality Assurance Mechanisms` section as supplementary quality-check hints. filesModified: [task-executor-frontend response filesModified]. Use these files as the stub-detection scope."
|
|
91
|
+
5. **CHECK quality-fixer-frontend response**:
|
|
92
|
+
- `status: "stub_detected"` -> Return to step 2 with `stubFindings`
|
|
93
|
+
- `status: "blocked"` -> STOP and escalate to user
|
|
94
|
+
- `status: "approved"` -> Proceed to step 6
|
|
95
|
+
6. **COMMIT on approval**: After `status: "approved"` from quality-fixer-frontend -> Execute git commit. Use `changeSummary` for commit message.
|
|
92
96
|
|
|
93
97
|
**CRITICAL**: MUST monitor ALL structured responses WITHOUT EXCEPTION and ENSURE every quality gate is passed.
|
|
94
98
|
ENFORCEMENT: Proceeding past a failed quality gate invalidates all subsequent work.
|
|
@@ -46,7 +46,7 @@ Check for existence of design documents in docs/design/.
|
|
|
46
46
|
Spawn acceptance-test-generator agent: "Generate test skeletons from Design Doc at [path]. [UI Spec at [ui-spec path] if exists.]"
|
|
47
47
|
|
|
48
48
|
### Step 3: Work Plan Creation
|
|
49
|
-
Spawn work-planner agent: "Create work plan from Design Doc at [path]. Integration test file: [path from step 2]. E2E test file: [path from step 2]. Integration tests are created simultaneously with each phase implementation, E2E tests are executed only in final phase."
|
|
49
|
+
Spawn work-planner agent: "Create work plan from Design Doc at [path]. Integration test file: [path from step 2]. E2E test file: [path from step 2 or null]. E2E absence reason: [value from step 2 when E2E file is null]. Integration tests are created simultaneously with each phase implementation, E2E tests are executed only in final phase when an E2E file exists."
|
|
50
50
|
|
|
51
51
|
**[STOP -- BLOCKING]** Interact with user to complete plan and obtain approval for plan content. Clarify specific implementation steps and risks.
|
|
52
52
|
**CANNOT proceed until user explicitly approves the work plan.**
|
|
@@ -97,8 +97,12 @@ For EACH task, YOU MUST:
|
|
|
97
97
|
- `needs_revision` -> Return to step 2 with `requiredFixes`
|
|
98
98
|
- `approved` -> Proceed to step 4
|
|
99
99
|
- `readyForQualityCheck: true` -> Proceed to step 4
|
|
100
|
-
4. **Spawn quality-fixer agent** (layer-appropriate per routing table): "Execute all quality checks and fixes"
|
|
101
|
-
5. **
|
|
100
|
+
4. **Spawn quality-fixer agent** (layer-appropriate per routing table): "Execute all quality checks and fixes. Task file: [task-file-path]. The task file path above is also the `task_file` input. Read its `Quality Assurance Mechanisms` section as supplementary quality-check hints. filesModified: [executor response filesModified]. Use these files as the stub-detection scope."
|
|
101
|
+
5. **CHECK quality-fixer response**:
|
|
102
|
+
- `status: "stub_detected"` -> Return to step 2 with `stubFindings`
|
|
103
|
+
- `status: "blocked"` -> STOP and escalate to user
|
|
104
|
+
- `status: "approved"` -> Proceed to step 6
|
|
105
|
+
6. **COMMIT on approval**: After `status: "approved"` from quality-fixer -> Execute git commit
|
|
102
106
|
|
|
103
107
|
**CRITICAL**: MUST monitor ALL structured responses WITHOUT EXCEPTION and ENSURE every quality gate is passed.
|
|
104
108
|
ENFORCEMENT: Proceeding past a failed quality gate invalidates all subsequent work.
|
|
@@ -124,8 +124,9 @@ ENFORCEMENT: Sub-agent prompts missing the constraint suffix MUST be re-issued w
|
|
|
124
124
|
**Rules**:
|
|
125
125
|
1. Execute ONE task completely before starting next (each task goes through the full 4-step cycle individually, using the correct executor per filename pattern)
|
|
126
126
|
2. Check executor status before quality-fixer (escalation check)
|
|
127
|
-
3. Quality-fixer MUST run after each executor (no skipping)
|
|
128
|
-
4.
|
|
127
|
+
3. Quality-fixer MUST run after each executor (no skipping), MUST receive the executor `filesModified` list as stub-detection scope, and MUST receive the current task file as the `task_file` input so it reads the task file's `Quality Assurance Mechanisms` section as supplementary quality-check hints
|
|
128
|
+
4. If quality-fixer returns `status: "stub_detected"`, route the task back to the same executor with `stubFindings`
|
|
129
|
+
5. Commit MUST execute only when quality-fixer returns `status: "approved"` (do not defer to end)
|
|
129
130
|
|
|
130
131
|
### Post-Implementation Verification (After All Tasks Complete)
|
|
131
132
|
|
|
@@ -149,8 +150,9 @@ After all task cycles finish, collect all `filesModified` from every task-execut
|
|
|
149
150
|
### Test Information Communication
|
|
150
151
|
After acceptance-test-generator execution, when calling work-planner, communicate:
|
|
151
152
|
- Generated integration test file path
|
|
152
|
-
- Generated E2E test file path
|
|
153
|
-
-
|
|
153
|
+
- Generated E2E test file path or `null`
|
|
154
|
+
- E2E absence reason when no E2E file is generated
|
|
155
|
+
- Explicit note that integration tests are created simultaneously with implementation, E2E tests are executed after all implementations only when an E2E file exists
|
|
154
156
|
|
|
155
157
|
**[STOP -- BLOCKING]** Upon detecting ANY requirement changes, halt execution immediately.
|
|
156
158
|
**CANNOT proceed until user explicitly confirms the change scope.**
|
|
@@ -105,8 +105,12 @@ After user grants "batch approval for entire implementation phase", enter autono
|
|
|
105
105
|
- `needs_revision` -> Return to step 1 with `requiredFixes`
|
|
106
106
|
- `approved` -> Proceed to step 3
|
|
107
107
|
- Otherwise -> Proceed to step 3
|
|
108
|
-
3. Spawn quality-fixer (or quality-fixer-frontend) agent: "Quality check and fixes"
|
|
109
|
-
4.
|
|
108
|
+
3. Spawn quality-fixer (or quality-fixer-frontend) agent: "Quality check and fixes. Task file: [task-file-path]. The task file path above is also the `task_file` input. Read its `Quality Assurance Mechanisms` section as supplementary quality-check hints. filesModified: [executor response filesModified]. Use these files as the stub-detection scope."
|
|
109
|
+
4. Check quality-fixer response:
|
|
110
|
+
- `status: "stub_detected"` -> Return to step 1 with `stubFindings`
|
|
111
|
+
- `status: "blocked"` -> Escalate to user
|
|
112
|
+
- `status: "approved"` -> Proceed to step 5
|
|
113
|
+
5. git commit -> Execute on `status: "approved"`
|
|
110
114
|
|
|
111
115
|
### Post-Implementation Verification (After All Tasks Complete)
|
|
112
116
|
|
|
@@ -130,8 +134,9 @@ After all task cycles finish, collect all `filesModified` from every executor re
|
|
|
130
134
|
### Test Information Communication
|
|
131
135
|
After acceptance-test-generator execution, when spawning work-planner, communicate:
|
|
132
136
|
- Generated integration test file path
|
|
133
|
-
- Generated E2E test file path
|
|
134
|
-
-
|
|
137
|
+
- Generated E2E test file path or `null`
|
|
138
|
+
- E2E absence reason when no E2E file is generated
|
|
139
|
+
- Note: integration tests are created with implementation; E2E tests run after all implementations when an E2E file exists
|
|
135
140
|
|
|
136
141
|
## Completion Criteria
|
|
137
142
|
|
|
@@ -47,9 +47,10 @@ Present options if multiple exist (can be specified with $ARGUMENTS).
|
|
|
47
47
|
- Confirm with user whether to generate E2E test skeleton first
|
|
48
48
|
- If user wants generation: Spawn acceptance-test-generator agent: "Generate test skeletons from Design Doc at [design-doc-path]"
|
|
49
49
|
- Pass generation results to next process according to subagents-orchestration-guide skill coordination specification
|
|
50
|
+
- If no E2E file is generated, carry the explicit `e2eAbsenceReason` forward as a valid planning input
|
|
50
51
|
|
|
51
52
|
### Step 3: Work Plan Creation
|
|
52
|
-
- Spawn work-planner agent: "Create work plan from design document at [design-doc-path]. Include deliverables from previous process according to subagents-orchestration-guide skill coordination specification."
|
|
53
|
+
- Spawn work-planner agent: "Create work plan from design document at [design-doc-path]. Include deliverables from previous process according to subagents-orchestration-guide skill coordination specification. If `generatedFiles.e2e` is null, use `e2eAbsenceReason` and accept the null E2E file as a valid planning input."
|
|
53
54
|
- Interact with user to complete plan and obtain approval for plan content
|
|
54
55
|
- Clarify specific implementation steps and risks
|
|
55
56
|
|
|
@@ -109,7 +109,7 @@ Spawn [Update Agent from Step 2] agent: "Operation Mode: update. Existing Docume
|
|
|
109
109
|
|
|
110
110
|
For Design Doc updates, first verify the updated document against code:
|
|
111
111
|
|
|
112
|
-
Spawn code-verifier agent: "Verify the updated Design Doc against current code. doc_type: design-doc. document_path: [path from Step 1]. verbose: false."
|
|
112
|
+
Spawn code-verifier agent: "Verify the updated Design Doc against current code. doc_type: design-doc. document_path: [path from Step 1]. verbose: false. Focus especially on literal identifier referential integrity for concrete paths, endpoints, type names, config keys, and other exact identifiers changed in this update."
|
|
113
113
|
|
|
114
114
|
**Store output as**: `$CODE_VERIFICATION_OUTPUT`
|
|
115
115
|
|
|
@@ -178,15 +178,15 @@ All agents MUST use this vocabulary consistently:
|
|
|
178
178
|
|
|
179
179
|
Subagents respond in JSON format. The final response from each JSON-returning subagent must be the JSON payload itself, with no trailing prose. Key fields for orchestrator decisions:
|
|
180
180
|
- **requirement-analyzer**: scale, confidence, affectedLayers, adrRequired, scopeDependencies, questions
|
|
181
|
-
- **codebase-analyzer**: analysisScope, existingElements, dataModel, focusAreas, limitations
|
|
182
|
-
- **task-executor**: status (escalation_needed/completed), escalation_type (design_compliance_violation/similar_function_found/similar_component_found/investigation_target_not_found/out_of_scope_file/test_environment_not_ready), testsAdded, requiresTestReview
|
|
183
|
-
- **quality-fixer**:
|
|
181
|
+
- **codebase-analyzer**: analysisScope, existingElements, dataModel, qualityAssurance, focusAreas, limitations
|
|
182
|
+
- **task-executor**: status (escalation_needed/completed), escalation_type (design_compliance_violation/similar_function_found/similar_component_found/investigation_target_not_found/out_of_scope_file/test_environment_not_ready/dependency_version_uncertain), testsAdded, requiresTestReview
|
|
183
|
+
- **quality-fixer**: Input: `task_file` (always pass the current task file path in orchestrated flows). status (`stub_detected`/`approved`/`blocked`). `stub_detected` returns `stubFindings[]` and routes back to the task executor. For `blocked` responses, discriminate by `reason`: specification conflicts use `blockingIssues[]`; execution prerequisites use `missingPrerequisites[]`, where each item provides its own `resolutionSteps`
|
|
184
184
|
- **document-reviewer**: verdict.decision (approved/approved_with_conditions/needs_revision/rejected)
|
|
185
185
|
- **code-verifier**: summary.status, summary.consistencyScore, discrepancies, reverseCoverage
|
|
186
186
|
- **design-sync**: sync_status (CONFLICTS_FOUND/NO_CONFLICTS) — text format with [SUMMARY] block
|
|
187
187
|
- **integration-test-reviewer**: status (approved/needs_revision/blocked), requiredFixes
|
|
188
188
|
- **security-reviewer**: status (approved/approved_with_notes/needs_revision/blocked), findings, notes, requiredFixes
|
|
189
|
-
- **acceptance-test-generator**: status, generatedFiles
|
|
189
|
+
- **acceptance-test-generator**: status, generatedFiles, `e2eAbsenceReason`
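
As an illustration of the `reason` discrimination described for blocked quality-fixer responses, the sketch below turns a blocked response into user-facing lines. The two `reason` literals, the `name` field on prerequisite items, and the function name `summarizeBlocked` are assumptions; the skill text names the fields `blockingIssues[]`, `missingPrerequisites[]`, and `resolutionSteps` but not the exact reason values or item shapes.

```typescript
// Assumption: reason values and item shapes are placeholders for illustration.
type BlockedResponse =
  | { status: "blocked"; reason: "specification_conflict"; blockingIssues: string[] }
  | {
      status: "blocked";
      reason: "missing_prerequisites";
      missingPrerequisites: { name: string; resolutionSteps: string[] }[];
    };

function summarizeBlocked(response: BlockedResponse): string[] {
  const lines: string[] = [];
  if (response.reason === "specification_conflict") {
    // Specification conflicts carry their details in blockingIssues[].
    for (const issue of response.blockingIssues) {
      lines.push("Conflict: " + issue);
    }
    return lines;
  }
  // Execution prerequisites: each item lists its own resolution steps.
  for (const item of response.missingPrerequisites) {
    for (const step of item.resolutionSteps) {
      lines.push(item.name + ": " + step);
    }
  }
  return lines;
}
```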
|
|
190
190
|
|
|
191
191
|
## Handling Requirement Changes
|
|
192
192
|
|
|
@@ -252,7 +252,7 @@ When receiving new features or change requests, start with requirement-analyzer.
|
|
|
252
252
|
### Design Flow Data Passing
|
|
253
253
|
|
|
254
254
|
- Pass requirement-analyzer output and original requirements to codebase-analyzer
|
|
255
|
-
- Pass codebase-analyzer JSON to technical-designer or technical-designer-frontend as `Codebase Analysis`, including `dataTransformationPipelines` when present
|
|
255
|
+
- Pass codebase-analyzer JSON to technical-designer or technical-designer-frontend as `Codebase Analysis`, including `dataTransformationPipelines` and `qualityAssurance` when present
|
|
256
256
|
- Pass Design Doc path to code-verifier
|
|
257
257
|
- Pass code-verifier JSON to document-reviewer as `code_verification`
|
|
258
258
|
|
|
@@ -296,7 +296,8 @@ Batch approval -> Start autonomous execution mode
|
|
|
296
296
|
- needs_revision -> back to task-executor
|
|
297
297
|
- approved -> quality-fixer
|
|
298
298
|
- No issues -> quality-fixer
|
|
299
|
-
-> quality-fixer: Quality check and fixes
|
|
299
|
+
-> quality-fixer: Quality check and fixes using the executor `filesModified` set as the stub-detection scope
|
|
300
|
+
- stub_detected -> task-executor/task-executor-frontend: complete implementation -> re-run quality-fixer
|
|
300
301
|
-> Orchestrator: Execute git commit
|
|
301
302
|
-> Check remaining tasks:
|
|
302
303
|
- Yes -> next task
|
|
@@ -352,13 +353,15 @@ Maximum retry count is 1 verification fix cycle. If any failed verifier still fa
|
|
|
352
353
|
|
|
353
354
|
**Orchestrator verification items**:
|
|
354
355
|
- Verify integration test file path retrieval and existence
|
|
355
|
-
- Verify E2E test file path retrieval and existence
|
|
356
|
+
- Verify E2E test file path retrieval and existence when `generatedFiles.e2e` is not null
|
|
357
|
+
- Verify `e2eAbsenceReason` is present when `generatedFiles.e2e` is null
|
|
356
358
|
|
|
357
359
|
**Pass to work-planner**:
|
|
358
360
|
- Integration test file: [path] (create and execute simultaneously with each phase implementation)
|
|
359
|
-
- E2E test file: [path] (execute only in final phase)
|
|
361
|
+
- E2E test file: [path] or `null` (execute only in final phase when present)
|
|
362
|
+
- E2E absence reason: [value when E2E test file is null]
|
|
360
363
|
|
|
361
|
-
**On error**: Escalate to user
|
|
364
|
+
**On error**: Escalate to user only when required outputs are missing without a valid absence reason
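
The verification items and the work-planner handoff above could be combined roughly as follows. This is a sketch, not the package's implementation: `GeneratorOutput`, `buildWorkPlannerHandoff`, the `generatedFiles.integration` key, and the injected `fileExists` callback are all hypothetical; only `generatedFiles.e2e` and `e2eAbsenceReason` appear in the skill text.

```typescript
// Hypothetical shape of the acceptance-test-generator output.
interface GeneratorOutput {
  generatedFiles: { integration: string | null; e2e: string | null };
  e2eAbsenceReason?: string | null;
}

interface Handoff {
  ok: boolean;
  lines: string[]; // lines to pass to work-planner when ok is true
  error: string | null; // escalation message when ok is false
}

function buildWorkPlannerHandoff(
  output: GeneratorOutput,
  fileExists: (path: string) => boolean
): Handoff {
  const fail = (error: string): Handoff => ({ ok: false, lines: [], error });
  const integration = output.generatedFiles.integration;
  if (!integration || !fileExists(integration)) {
    return fail("Integration test file is missing");
  }
  const lines = [
    "Integration test file: " + integration +
      " (create and execute simultaneously with each phase implementation)",
  ];
  const e2e = output.generatedFiles.e2e;
  if (e2e !== null) {
    // A non-null E2E path must point at an existing file.
    if (!fileExists(e2e)) return fail("E2E test file path does not exist");
    lines.push("E2E test file: " + e2e + " (execute only in final phase)");
  } else {
    // A null E2E file is valid only with an explicit absence reason.
    const reason = output.e2eAbsenceReason;
    if (!reason || reason.trim() === "") {
      return fail("E2E file is null and e2eAbsenceReason is missing");
    }
    lines.push("E2E test file: null");
    lines.push("E2E absence reason: " + reason);
  }
  return { ok: true, lines, error: null };
}
```

Passing `fileExists` in as a parameter keeps the check independent of any particular file system API, so the same function can be exercised with a stub.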
|
|
362
365
|
|
|
363
366
|
### Design Doc to Work Plan Verification Handoff
|
|
364
367
|
|
|
@@ -118,7 +118,7 @@ skills:
|
|
|
118
118
|
integration-e2e-testing:
|
|
119
119
|
skill: "integration-e2e-testing"
|
|
120
120
|
tags: [testing, integration-testing, e2e-testing, test-design, behavior-first, roi, test-skeleton, ears-format]
|
|
121
|
-
typical-use: "Integration and E2E test design principles,
|
|
121
|
+
typical-use: "Integration and E2E test design principles, value-based test selection, behavior-first approach, test skeleton specification"
|
|
122
122
|
size: medium
|
|
123
123
|
key-references:
|
|
124
124
|
- "Test Pyramid - Mike Cohn"
|
|
@@ -127,7 +127,7 @@ skills:
|
|
|
127
127
|
- "References"
|
|
128
128
|
- "Test Type Definition and Limits [MANDATORY]"
|
|
129
129
|
- "Behavior-First Principle [MANDATORY]"
|
|
130
|
-
- "
|
|
130
|
+
- "Value and Selection Model"
|
|
131
131
|
- "Test Skeleton Specification [MANDATORY]"
|
|
132
132
|
- "EARS Format Mapping"
|
|
133
133
|
- "Test File Naming Convention"
|
|
@@ -213,7 +213,7 @@ export const test = base.extend<{ authenticatedPage: Page }>({
|
|
|
213
213
|
### E2E Budget
|
|
214
214
|
|
|
215
215
|
- **MAX 1-2 E2E tests per feature**
|
|
216
|
-
- Only generate
|
|
216
|
+
- Only generate an additional non-reserved E2E test when `Value Score >= 50`
|
|
217
217
|
- Prefer fewer comprehensive journey tests over many granular tests
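
One possible reading of the budget rules above, sketched in code. The `reserved` flag, the `E2ECandidate` shape, and the function name `selectE2ETests` are assumptions made for illustration: the diff mentions "non-reserved" tests and a `Value Score >= 50` threshold but does not define how a reserved slot is assigned. The sketch treats reserved candidates as exempt from the threshold and caps the total at two.

```typescript
// Hypothetical candidate shape; `reserved` marks a test that holds a slot
// regardless of its score (an assumption, not defined in the source).
interface E2ECandidate {
  name: string;
  valueScore: number;
  reserved: boolean;
}

const MAX_E2E_PER_FEATURE = 2; // upper end of "MAX 1-2 E2E tests per feature"
const MIN_VALUE_SCORE = 50; // threshold for additional non-reserved tests

function selectE2ETests(candidates: E2ECandidate[]): E2ECandidate[] {
  const selected: E2ECandidate[] = [];
  // Reserved candidates take slots first, in their given order.
  for (const c of candidates) {
    if (c.reserved && selected.length < MAX_E2E_PER_FEATURE) selected.push(c);
  }
  // Non-reserved candidates qualify only at or above the threshold,
  // highest score first, until the budget is used up.
  const extras = candidates
    .filter((c) => !c.reserved && c.valueScore >= MIN_VALUE_SCORE)
    .sort((a, b) => b.valueScore - a.valueScore);
  for (const c of extras) {
    if (selected.length < MAX_E2E_PER_FEATURE) selected.push(c);
  }
  return selected;
}
```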
|
|
218
218
|
|
|
219
219
|
### Test Isolation
|