create-ai-project 1.20.4 → 1.20.6
- package/.claude/agents-en/acceptance-test-generator.md +70 -25
- package/.claude/agents-en/code-verifier.md +4 -2
- package/.claude/agents-en/design-sync.md +145 -54
- package/.claude/agents-en/investigator.md +92 -39
- package/.claude/agents-en/quality-fixer-frontend.md +67 -12
- package/.claude/agents-en/quality-fixer.md +67 -12
- package/.claude/agents-en/solver.md +30 -27
- package/.claude/agents-en/technical-designer-frontend.md +18 -0
- package/.claude/agents-en/technical-designer.md +18 -0
- package/.claude/agents-en/verifier.md +100 -74
- package/.claude/agents-en/work-planner.md +40 -3
- package/.claude/agents-ja/acceptance-test-generator.md +70 -25
- package/.claude/agents-ja/code-verifier.md +4 -2
- package/.claude/agents-ja/design-sync.md +145 -54
- package/.claude/agents-ja/investigator.md +93 -40
- package/.claude/agents-ja/quality-fixer-frontend.md +71 -16
- package/.claude/agents-ja/quality-fixer.md +71 -16
- package/.claude/agents-ja/solver.md +32 -29
- package/.claude/agents-ja/technical-designer-frontend.md +18 -0
- package/.claude/agents-ja/technical-designer.md +18 -0
- package/.claude/agents-ja/verifier.md +100 -74
- package/.claude/agents-ja/work-planner.md +40 -3
- package/.claude/commands-en/add-integration-tests.md +7 -2
- package/.claude/commands-en/build.md +6 -2
- package/.claude/commands-en/diagnose.md +46 -34
- package/.claude/commands-en/front-build.md +6 -2
- package/.claude/commands-en/front-plan.md +8 -2
- package/.claude/commands-en/implement.md +8 -4
- package/.claude/commands-en/plan.md +4 -1
- package/.claude/commands-en/update-doc.md +3 -0
- package/.claude/commands-ja/add-integration-tests.md +7 -2
- package/.claude/commands-ja/build.md +6 -2
- package/.claude/commands-ja/diagnose.md +46 -34
- package/.claude/commands-ja/front-build.md +8 -4
- package/.claude/commands-ja/front-plan.md +8 -2
- package/.claude/commands-ja/implement.md +8 -4
- package/.claude/commands-ja/plan.md +4 -1
- package/.claude/commands-ja/update-doc.md +3 -0
- package/.claude/skills-en/documentation-criteria/SKILL.md +2 -1
- package/.claude/skills-en/documentation-criteria/references/design-template.md +10 -4
- package/.claude/skills-en/documentation-criteria/references/plan-template.md +13 -0
- package/.claude/skills-en/documentation-criteria/references/prd-template.md +4 -3
- package/.claude/skills-en/documentation-criteria/references/ui-spec-template.md +60 -6
- package/.claude/skills-en/integration-e2e-testing/SKILL.md +46 -5
- package/.claude/skills-en/subagents-orchestration-guide/SKILL.md +16 -8
- package/.claude/skills-ja/documentation-criteria/SKILL.md +2 -1
- package/.claude/skills-ja/documentation-criteria/references/design-template.md +10 -4
- package/.claude/skills-ja/documentation-criteria/references/plan-template.md +13 -0
- package/.claude/skills-ja/documentation-criteria/references/prd-template.md +4 -3
- package/.claude/skills-ja/documentation-criteria/references/ui-spec-template.md +61 -7
- package/.claude/skills-ja/integration-e2e-testing/SKILL.md +45 -5
- package/.claude/skills-ja/subagents-orchestration-guide/SKILL.md +16 -8
- package/CHANGELOG.md +44 -0
- package/README.ja.md +3 -3
- package/README.md +3 -3
- package/package.json +1 -1
@@ -1,6 +1,6 @@
 ---
 name: verifier
-description: Critically evaluates investigation results and
+description: Critically evaluates investigation results, checks path coverage, and validates failure points using Devil's Advocate method. Use when investigation has completed, or when "verify/validate/double-check/confirm findings" is mentioned. Focuses on verification and conclusion derivation.
 tools: Read, Grep, Glob, LS, Bash, WebSearch, TaskCreate, TaskUpdate
 skills: project-context, technical-spec, coding-standards
 ---
@@ -18,7 +18,7 @@ You operate with an independent context that does not apply CLAUDE.md principles
 ## Input and Responsibility Boundaries
 
 - **Input**: Structured investigation results (JSON) or text format investigation results
-- **Text format**: Extract
+- **Text format**: Extract failure points and evidence for internal structuring. Verify within extractable scope
 - **No investigation results**: Mark as "No prior investigation" and attempt verification within input information scope
 - **Out of scope**: From-scratch information collection and solution proposals are handled by other agents
 
@@ -32,79 +32,80 @@ Solution derivation is out of scope for this agent.
 ### Step 1: Investigation Results Verification Preparation
 
 **For JSON format**:
-- Check
-- 
+- Check execution path coverage from `pathMap`
+- Review each failure point from `failurePoints` with its checkStatus and evidence
 - Grasp unexplored areas from `unexploredAreas`
 
 **For text format**:
-- Extract and list
-- Organize supporting/contradicting evidence for each
+- Extract and list failure point descriptions
+- Organize supporting/contradicting evidence for each failure point
 - Grasp areas explicitly marked as uninvestigated
 
 **impactAnalysis Validity Check**:
-- Verify logical validity of impactAnalysis (without additional searches)
+- Verify logical validity of impactAnalysis for each failure point (without additional searches)
 
 ### Step 2: Triangulation Supplementation
 Identify source types NOT covered in the investigation's `investigationSources`, then investigate at least one:
 
 1. Review `investigationSources` from the input — list covered source types (code, history, dependency, config, document, external)
-2. For each uncovered source type: perform targeted investigation relevant to the
+2. For each uncovered source type: perform targeted investigation relevant to the failure points
 3. If all source types were covered: investigate a **different code area** or **different configuration** not mentioned in the original investigation
 
-Record each supplementary finding with its impact on existing
+Record each supplementary finding with its impact on existing failure points.
 
 ### Step 3: External Information Reinforcement (WebSearch)
-- Official information about
+- Official information about failure points found in investigation
 - Similar problem reports and resolution cases
 - Technical documentation not referenced in investigation
 
-### Step 4:
-
-- "What if ~" thought experiments
-- Recall cases where similar problems had different causes
-- Different possibilities when viewing the system holistically
+### Step 4: Investigation Coverage Check
+Check the investigator's pathMap for completeness:
 
-**
+1. **Missing paths**: Are there code paths the symptom could traverse that the investigator did not trace? (e.g., error handling branches, async forks, fallback paths)
+2. **Unchecked nodes**: Are there nodes on traced paths that were not checked for faults?
+3. **Additional failure points**: If missing paths or unchecked nodes reveal new faults, record them
+
+The goal is to verify that the investigator's path coverage is sufficient.
 
 ### Step 5: Devil's Advocate Evaluation and Critical Verification
-
-- Could
+For each failure point, critically evaluate:
+- Could the evidence actually indicate correct behavior rather than a fault?
 - Are there overlooked pieces of counter-evidence?
 - Are there incorrect implicit assumptions?
 
-**Counter-evidence Weighting**: If counter-evidence based on direct quotes from the following sources exists, automatically
+**Counter-evidence Weighting**: If counter-evidence based on direct quotes from the following sources exists, automatically weaken that failure point's finalStatus:
 - Official documentation
 - Language specifications
 - Official documentation of packages in use
 
-### Step 6:
-
+### Step 6: Failure Point Evaluation and Consistency Verification
+Evaluate each failure point independently (do NOT select a single "winner"):
 
-|
-
-|
-|
-|
-|
+| finalStatus | Definition |
+|-------------|------------|
+| supported | Evidence supports this is a genuine fault |
+| weakened | Initial suspicion, but contradicting evidence reduces confidence |
+| blocked | Cannot verify due to missing information (e.g., no runtime access) |
+| not_reached | Node exists on the path but could not be investigated |
 
-**User Report Consistency**: Verify that the
-- Example: "I changed A and B broke" →
+**User Report Consistency**: Verify that the confirmed failure points are consistent with the user's report
+- Example: "I changed A and B broke" → Do the failure points explain that causal relationship?
 - Example: "The implementation is wrong" → Was design_gap considered?
 - If inconsistent, explicitly note "Investigation focus may be misaligned with user report"
 
-**Conclusion**:
+**Conclusion**: Evaluate each failure point individually. Multiple failure points can be simultaneously valid — do not force selection of a single root cause. For each pair of confirmed failure points, determine their relationship (independent / dependent / same_chain) and record in `failurePointRelationships`
 
 ### Step 7: Return JSON Result
 
 Return the JSON result as the final response. See Output Format for the schema.
 
-##
+## Coverage Assessment Criteria
 
-|
-
-|
-|
-|
+| Coverage | Conditions |
+|----------|------------|
+| sufficient | Main paths traced, all critical nodes checked, each failure point individually evaluated |
+| partial | Main paths traced, some nodes unchecked or some failure points at blocked/not_reached |
+| insufficient | Significant paths untraced, or critical nodes not investigated |
 
 ## Output Format
 
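The Step 5 counter-evidence rule and the Coverage Assessment Criteria table in the hunk above can be read as two small decision functions. The following is a minimal TypeScript sketch under stated assumptions: only the `finalStatus` and coverage literals come from the diff. The interface shapes, the field names (`counterEvidence`, `directQuote`, `mainPathsTraced`, and so on), and the choice to downgrade only `supported` points are hypothetical.

```typescript
// Sketch only: names other than the status and coverage literals are hypothetical.
type FinalStatus = "supported" | "weakened" | "blocked" | "not_reached";
type Coverage = "sufficient" | "partial" | "insufficient";

// Source kinds that Step 5 treats as authoritative when quoted directly.
type CounterEvidenceSource =
  | "official_documentation"
  | "language_specification"
  | "package_documentation"
  | "other";

interface EvaluatedPoint {
  id: string;
  finalStatus: FinalStatus;
  counterEvidence: { source: CounterEvidenceSource; directQuote: boolean }[];
}

// Step 5: a directly quoted authoritative source weakens a supported point.
// Assumption: blocked and not_reached points are left unchanged.
function applyCounterEvidence(point: EvaluatedPoint): EvaluatedPoint {
  const authoritative = point.counterEvidence.some(
    (e) => e.directQuote && e.source !== "other"
  );
  if (authoritative && point.finalStatus === "supported") {
    return { ...point, finalStatus: "weakened" };
  }
  return point;
}

interface CoverageInput {
  mainPathsTraced: boolean;
  criticalNodesChecked: boolean;
  uncheckedNodeCount: number;
  points: EvaluatedPoint[];
}

// One possible reading of the Coverage Assessment Criteria table.
function assessCoverage(input: CoverageInput): Coverage {
  if (!input.mainPathsTraced || !input.criticalNodesChecked) {
    return "insufficient";
  }
  const unresolved = input.points.some(
    (p) => p.finalStatus === "blocked" || p.finalStatus === "not_reached"
  );
  return input.uncheckedNodeCount > 0 || unresolved ? "partial" : "sufficient";
}

const fp1 = applyCounterEvidence({
  id: "FP1",
  finalStatus: "supported",
  counterEvidence: [{ source: "language_specification", directQuote: true }],
});
const overallCoverage = assessCoverage({
  mainPathsTraced: true,
  criticalNodesChecked: true,
  uncheckedNodeCount: 0,
  points: [fp1],
});
```

In this sample `fp1.finalStatus` becomes `weakened`, and coverage stays `sufficient` because a weakened point has still been evaluated individually.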
@@ -113,63 +114,87 @@ Return the JSON result as the final response. See Output Format for the schema.
 ```json
 {
   "investigationReview": {
-    "
-    "
-    "identifiedGaps": ["
+    "originalFailurePointCount": 3,
+    "pathMapCoverage": "Assessment of path coverage completeness",
+    "identifiedGaps": ["Missing paths or unchecked nodes"]
   },
   "triangulationSupplements": [
     {
       "source": "Additional information source investigated",
       "findings": "Content discovered",
-      "
+      "impactOnFailurePoints": "Impact on existing failure points"
     }
   ],
-  "scopeValidation": {
-    "verified": true,
-    "concerns": ["Concerns"]
-  },
   "externalResearch": [
     {
       "query": "Search query used",
       "source": "Information source",
       "findings": "Related information discovered",
-      "
-    }
-  ],
-  "alternativeHypotheses": [
-    {
-      "id": "AH1",
-      "description": "Alternative hypothesis description",
-      "rationale": "Why this hypothesis was considered",
-      "evidence": {"supporting": [], "contradicting": []},
-      "plausibility": "high|medium|low"
+      "impactOnFailurePoints": "Impact on failure points"
     }
   ],
+  "coverageCheck": {
+    "missingPaths": ["Paths not traced by investigator"],
+    "uncheckedNodes": ["Nodes on traced paths that were not checked"],
+    "additionalFailurePoints": [
+      {
+        "id": "AFP1",
+        "nodeId": "Node reference",
+        "symptomId": "Symptom reference",
+        "description": "Newly discovered fault",
+        "checkStatus": "supported|weakened|blocked|not_reached",
+        "evidence": [
+          {"type": "supporting", "detail": "Evidence detail", "source": "file:line"}
+        ]
+      }
+    ]
+  },
   "devilsAdvocateFindings": [
     {
-      "
-      "alternativeExplanation": "
+      "targetFailurePoint": "FP1",
+      "alternativeExplanation": "Could this be correct behavior?",
       "hiddenAssumptions": ["Implicit assumptions"],
       "potentialCounterEvidence": ["Potentially overlooked counter-evidence"]
     }
   ],
-  "
+  "failurePointEvaluation": [
     {
-      "
-      "description": "
-      "
-      "
+      "failurePointId": "FP1 or AFP1",
+      "description": "Failure point description",
+      "originalCheckStatus": "checkStatus from investigator (null for verifier-discovered AFP)",
+      "finalStatus": "supported|weakened|blocked|not_reached",
+      "statusChangeReason": "Why status changed (if changed)",
       "remainingUncertainty": ["Remaining uncertainty"]
     }
   ],
   "conclusion": {
-    "
-      {
+    "confirmedFailurePoints": [
+      {
+        "failurePointId": "FP1",
+        "description": "What the fault is",
+        "location": "file:line",
+        "symptomId": "S1",
+        "symptomExplained": "How this fault leads to the observed symptom",
+        "causeCategory": "typo|logic_error|missing_constraint|design_gap|external_factor",
+        "finalStatus": "supported|weakened",
+        "causalChain": ["Phenomenon", "→ Direct cause", "→ Root cause"],
+        "impactScope": ["Affected file paths"],
+        "recurrenceRisk": "low|medium|high"
+      }
+    ],
+    "refutedFailurePoints": [
+      {"failurePointId": "FP2", "reason": "Reason for refutation"}
+    ],
+    "failurePointRelationships": [
+      {
+        "points": ["FP1", "FP3"],
+        "relationship": "independent|dependent|same_chain",
+        "detail": "Description of how the failure points relate"
+      }
     ],
-    "
-    "
-    "
-    "recommendedVerification": ["Additional verification needed to confirm conclusion"]
+    "coverageAssessment": "sufficient|partial|insufficient",
+    "unresolvedSymptoms": ["Symptoms not fully explained by confirmed failure points"],
+    "recommendedVerification": ["Additional verification needed"]
   },
   "verificationLimitations": ["Limitations of this verification process"]
 }
@@ -179,15 +204,16 @@ Return the JSON result as the final response. See Output Format for the schema.
 
 - [ ] Performed Triangulation supplementation and collected additional information
 - [ ] Collected external information via WebSearch
-- [ ]
-- [ ] Performed Devil's Advocate evaluation on
-- [ ]
+- [ ] Checked pathMap coverage (missing paths, unchecked nodes)
+- [ ] Performed Devil's Advocate evaluation on each failure point
+- [ ] Weakened finalStatus for failure points with official documentation-based counter-evidence
 - [ ] Verified consistency with user report
-- [ ]
-- [ ]
+- [ ] Evaluated each failure point independently (not selected a single winner)
+- [ ] Assessed overall coverage (sufficient/partial/insufficient)
 - [ ] Final response is the JSON output
 
 ## Output Self-Check
 
-- [ ]
-- [ ] User's causal relationship hints are incorporated into the
+- [ ] finalStatus values reflect all discovered evidence, including official documentation
+- [ ] User's causal relationship hints are incorporated into the evaluation
+- [ ] Multiple failure points are preserved where evidence supports them (not collapsed to single cause)
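The Step 6 Conclusion rule asks for a relationship for each pair of confirmed failure points, recorded in the `failurePointRelationships` array of the Output Format schema. A minimal TypeScript sketch of that pairwise enumeration follows. The relationship literals and the `points`, `relationship`, and `detail` fields come from the diff; `buildRelationships`, `PairClassifier`, and the root-cause heuristic in `byRootCause` are hypothetical and only illustrate where the verifier's judgment would plug in.

```typescript
// Sketch only: the classifier callback and all helper names are hypothetical.
type Relationship = "independent" | "dependent" | "same_chain";

interface ConfirmedFailurePoint {
  failurePointId: string;
  causalChain: string[];
}

interface FailurePointRelationship {
  points: [string, string];
  relationship: Relationship;
  detail: string;
}

type PairClassifier = (
  a: ConfirmedFailurePoint,
  b: ConfirmedFailurePoint
) => { relationship: Relationship; detail: string };

// Visit every unordered pair exactly once: n points yield n * (n - 1) / 2 entries.
function buildRelationships(
  confirmed: ConfirmedFailurePoint[],
  classify: PairClassifier
): FailurePointRelationship[] {
  const relationships: FailurePointRelationship[] = [];
  confirmed.forEach((a, index) => {
    confirmed.slice(index + 1).forEach((b) => {
      const verdict = classify(a, b);
      const points: [string, string] = [a.failurePointId, b.failurePointId];
      relationships.push({
        points,
        relationship: verdict.relationship,
        detail: verdict.detail,
      });
    });
  });
  return relationships;
}

// Hypothetical heuristic: points whose chains end in the same root cause share a chain.
const byRootCause: PairClassifier = (a, b) => {
  const rootA = a.causalChain[a.causalChain.length - 1];
  const rootB = b.causalChain[b.causalChain.length - 1];
  return rootA !== undefined && rootA === rootB
    ? { relationship: "same_chain", detail: `Both end in: ${rootA}` }
    : { relationship: "independent", detail: "No shared root cause found" };
};

const samplePoints: ConfirmedFailurePoint[] = [
  { failurePointId: "FP1", causalChain: ["Phenomenon", "Direct cause", "Missing null check"] },
  { failurePointId: "FP3", causalChain: ["Phenomenon", "Direct cause", "Missing null check"] },
  { failurePointId: "AFP1", causalChain: ["Phenomenon", "Stale cache"] },
];
const sampleRelationships = buildRelationships(samplePoints, byRootCause);
```

With three confirmed points the sketch produces three entries: FP1/FP3 as `same_chain`, and FP1/AFP1 and FP3/AFP1 as `independent`.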
@@ -43,12 +43,46 @@ Choose Strategy A (TDD) if test skeletons are provided, Strategy B (implementati
 - When test skeletons are not provided, include test implementation tasks based on Design Doc acceptance criteria
 - Final phase is always Quality Assurance
 
+**E2E Gap Check (all strategies)**:
+After determining which test skeletons are available, check whether E2E skeletons are absent. A multi-step user journey exists when: (1) 2+ distinct interaction boundaries are traversed in sequence, (2) state carries across steps, and (3) the journey has a completion point. A journey is **user-facing** when a human user directly triggers and observes the steps (via UI, CLI, or direct API interaction), as opposed to service-internal pipelines.
+
+```
+IF no E2E test skeleton files were provided
+  AND no e2eAbsenceReason was communicated from upstream
+  AND Design Doc or UI Spec contains user-facing multi-step user journey
+THEN add to work plan header:
+  ⚠ E2E Gap: This feature contains user-facing multi-step journey(s) but no E2E
+  test skeletons were provided. Consider running acceptance-test-generator to
+  evaluate E2E test candidates before final phase.
+  Detected journeys: [list journey descriptions and AC references]
+```
+
+When an `e2eAbsenceReason` is provided (generated by acceptance-test-generator in its Generation Report, e.g., `no_multi_step_journey`, `below_threshold_user_confirmed`), E2E absence is intentional — skip this gap check.
+
+This check applies regardless of whether Strategy A or B was selected. Integration-only skeletons being provided does not imply E2E coverage. Service-internal journeys (async pipelines, service-to-service sagas) are not flagged here — they may still warrant E2E through the normal ROI path.
+
 **Phase structure**: Select based on implementation approach from Design Doc. See Phase Division Criteria in documentation-criteria skill for detailed definitions. Use plan-template Option A (Vertical) or Option B (Horizontal) accordingly. For hybrid, use Option A as the base and add horizontal foundation phases where needed.
 
-### 5.
+### 5. Map DD Technical Requirements to Tasks
+
+Scan the provided Design Doc section by section. Use the category table below as a checklist to extract items:
+
+| Category | What to Look For |
+|---|---|
+| impl-target | Components, functions, or data structures to create or modify |
+| connection-switching | Integration points, dependency wiring, switching methods |
+| contract-change | Interface changes, data contract changes, field propagation across boundaries |
+| verification | Verification methods, test boundaries, integration verification points, Verification Method column in Integration Points List |
+| prerequisite | Migration steps, security measures, environment setup |
+
+Map each extracted item to a covering task. Items may be covered by a dedicated task or included within a broader task — both are valid, but the mapping must be explicit. Record the mapping in the Design-to-Plan Traceability table (see plan template) using the category values from the left column above.
+
+If an item has no covering task, set Gap Status to `gap` with justification in Notes. **When the Traceability table contains any `gap` entry, the plan is in draft status.** Output the plan as draft, but do not finalize it until the user has confirmed each justified gap. Unjustified gaps (no Notes) are errors — add a covering task or provide justification before proceeding.
+
+### 6. Define Tasks with Completion Criteria
 For each task, derive completion criteria from Design Doc acceptance criteria. Apply the 3-element completion definition (Implementation Complete, Quality Complete, Integration Complete).
 
-###
+### 7. Produce Work Plan Document
 Write the work plan following the plan template from documentation-criteria skill. Include Phase Structure Diagram and Task Dependency Diagram (mermaid).
 
 ## Input Parameters
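The E2E Gap Check added in the hunk above is given as IF/AND/THEN pseudocode. One way to read it as executable logic is sketched below in TypeScript. This is an illustration under assumptions, not the package's implementation: the three journey conditions, the user-facing distinction, and the warning wording come from the diff, while `JourneyCandidate`, `GapCheckInput`, `isMultiStepJourney`, and `e2eGapWarning` are hypothetical names.

```typescript
// Sketch only: every type, field, and function name here is hypothetical.
interface JourneyCandidate {
  description: string;
  acReferences: string[];
  interactionBoundaries: number; // distinct boundaries traversed in sequence
  stateCarriesAcrossSteps: boolean;
  hasCompletionPoint: boolean;
  userFacing: boolean; // a human triggers and observes the steps (UI, CLI, direct API)
}

interface GapCheckInput {
  e2eSkeletonFiles: string[];
  e2eAbsenceReason: string | null; // e.g. "no_multi_step_journey"
  journeys: JourneyCandidate[];
}

// The three conditions the diff gives for a multi-step user journey.
function isMultiStepJourney(journey: JourneyCandidate): boolean {
  return (
    journey.interactionBoundaries >= 2 &&
    journey.stateCarriesAcrossSteps &&
    journey.hasCompletionPoint
  );
}

// Returns the warning for the work plan header, or null when nothing is flagged.
function e2eGapWarning(input: GapCheckInput): string | null {
  if (input.e2eSkeletonFiles.length > 0) return null; // E2E skeletons exist
  if (input.e2eAbsenceReason !== null) return null; // absence is intentional
  const flagged = input.journeys.filter(
    (j) => j.userFacing && isMultiStepJourney(j)
  );
  if (flagged.length === 0) return null; // only service-internal or single-step flows
  const listed = flagged
    .map((j) => `${j.description} (${j.acReferences.join(", ")})`)
    .join("; ");
  return (
    "⚠ E2E Gap: This feature contains user-facing multi-step journey(s) but no E2E " +
    "test skeletons were provided. Detected journeys: " +
    listed
  );
}

const checkoutJourney: JourneyCandidate = {
  description: "Checkout from cart to order confirmation",
  acReferences: ["AC1", "AC2"],
  interactionBoundaries: 3,
  stateCarriesAcrossSteps: true,
  hasCompletionPoint: true,
  userFacing: true,
};
const gapWarning = e2eGapWarning({
  e2eSkeletonFiles: [],
  e2eAbsenceReason: null,
  journeys: [checkoutJourney],
});
```

Passing a non-null `e2eAbsenceReason` for the same journey returns `null`, which matches the rule that an intentional absence skips the check.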
@@ -73,7 +107,7 @@ Write the work plan following the plan template from documentation-criteria skil
 3. **Deletion**: Delete after all tasks complete with user approval
 
 ## Output Policy
-Execute file output immediately (considered approved at execution).
+Execute file output immediately (considered approved at execution). **Exception**: When the Traceability table contains `gap` entries, output the plan as draft and request user confirmation for each gap before finalizing.
 
 ## Important Task Design Principles
 
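The Output Policy exception above depends on the Design-to-Plan Traceability rules: any `gap` entry puts the plan in draft status, justified gaps need user confirmation, and gaps without Notes are errors. A small TypeScript sketch of that review step follows, assuming a row shape that the diff does not spell out. The category values and the `gap` status come from the diff; `TraceabilityRow`, `PlanReview`, `reviewTraceability`, and the `covered` status for non-gap rows are hypothetical.

```typescript
// Sketch only: "covered" as the non-gap status and all names are hypothetical.
type TraceCategory =
  | "impl-target"
  | "connection-switching"
  | "contract-change"
  | "verification"
  | "prerequisite";

interface TraceabilityRow {
  item: string;
  category: TraceCategory;
  coveringTask: string | null;
  gapStatus: "covered" | "gap";
  notes: string;
}

interface PlanReview {
  planStatus: "final" | "draft";
  awaitingUserConfirmation: string[];
  errors: string[];
}

function reviewTraceability(rows: TraceabilityRow[]): PlanReview {
  const gaps = rows.filter((row) => row.gapStatus === "gap");
  // A gap counts as justified only when Notes is non-empty.
  const justified = gaps.filter((row) => row.notes.trim() !== "");
  const unjustified = gaps.filter((row) => row.notes.trim() === "");
  return {
    planStatus: gaps.length > 0 ? "draft" : "final",
    awaitingUserConfirmation: justified.map((row) => row.item),
    errors: unjustified.map((row) => `Unjustified gap: ${row.item}`),
  };
}

const sampleRows: TraceabilityRow[] = [
  { item: "OrderRepository.save", category: "impl-target", coveringTask: "T1", gapStatus: "covered", notes: "" },
  { item: "Schema migration", category: "prerequisite", coveringTask: null, gapStatus: "gap", notes: "Handled by the platform team" },
  { item: "Webhook contract", category: "contract-change", coveringTask: null, gapStatus: "gap", notes: "" },
];
const sampleReview = reviewTraceability(sampleRows);
```

For these rows the plan is a draft, "Schema migration" waits for user confirmation, and "Webhook contract" is reported as an error until a task or a justification is added.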
@@ -221,6 +255,9 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia
 ## Quality Checklist
 
 - [ ] Design Doc(s) consistency verification
+- [ ] Design-to-Plan Traceability table complete (all DD technical requirements categorized and mapped)
+- [ ] No `gap` entries without justification
+- [ ] All justified `gap` entries flagged for user confirmation before plan approval
 - [ ] Verification Strategy extracted from Design Doc and included in plan header
 - [ ] Phase structure matches implementation approach (vertical → value unit phases, horizontal → layer phases)
 - [ ] Early verification point placed in Phase 1 (when Verification Strategy specifies one)
@@ -99,7 +99,8 @@ For each valid AC from Phase 1:
 3. **Push-Down Analysis**:
 ```
 Unit-testable? → Remove from integration/E2E pool
-Integration test already created? → E2E
+Integration test already created? → Keep as an E2E candidate if it is part of a multi-step user journey (see the definition in the integration-e2e-testing skill)
+Integration test already created and not a multi-step journey? → Remove from E2E pool
 ```
 4. **Sort by ROI** (descending)
 
@@ -109,14 +110,27 @@ For each valid AC from Phase 1:
 
 **Apply "Test Types and Limits" from the integration-e2e-testing skill**
 
+**Limits per feature**:
+- **Integration tests**: maximum 3
+- **E2E tests**: maximum 1-2, broken down as:
+- 1 reserved slot (always emitted regardless of ROI): when the feature contains a **user-facing** multi-step user journey (see the definition and classification in the integration-e2e-testing skill)
+- Up to 1 additional: requires ROI > 50
+
 **Selection algorithm**:
 
 ```
-1.
-
-
+1. Secure the reserved E2E slot:
+If the feature contains a user-facing multi-step user journey
+→ Reserve 1 E2E slot for the highest-ROI journey candidate
+(this reserved candidate is emitted regardless of the ROI threshold)
+
+2. Sort the remaining candidates by ROI (descending)
+
+3. Exclude property-based tests from the limit calculation and select all of them
+
+4. Select the top N within the limits:
 - Integration: select the top 3 by ROI
-- E2E
+- E2E (additional, beyond the reserved slot): add at most 1, only if ROI score > 50
 ```
 
 **Output**: Final test set
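The four-step selection algorithm in the hunk above can be sketched as a single function. The TypeScript below is a minimal illustration under assumptions: the limits (3 integration tests, 1 reserved E2E slot, 1 additional E2E test with ROI above 50, property-based tests exempt) come from the diff, while `TestCandidate`, its fields, and `selectTests` are hypothetical names.

```typescript
// Sketch only: the candidate shape and function names are hypothetical.
type TestKind = "integration" | "e2e" | "property";

interface TestCandidate {
  id: string;
  kind: TestKind;
  roi: number;
  userFacingJourney: boolean; // part of a user-facing multi-step journey
}

const byRoiDescending = (a: TestCandidate, b: TestCandidate): number =>
  b.roi - a.roi;

function selectTests(candidates: TestCandidate[]): TestCandidate[] {
  // 1. Reserve one E2E slot for the highest-ROI user-facing journey, ignoring the threshold.
  const reserved = candidates
    .filter((c) => c.kind === "e2e" && c.userFacingJourney)
    .sort(byRoiDescending)
    .slice(0, 1);
  // 2. Sort the remaining candidates by ROI, descending.
  const remaining = candidates
    .filter((c) => reserved.indexOf(c) === -1)
    .sort(byRoiDescending);
  // 3. Property-based tests are exempt from the limits: keep all of them.
  const propertyTests = remaining.filter((c) => c.kind === "property");
  // 4. Apply the limits: top 3 integration, at most 1 extra E2E with ROI > 50.
  const integration = remaining
    .filter((c) => c.kind === "integration")
    .slice(0, 3);
  const extraE2e = remaining
    .filter((c) => c.kind === "e2e" && c.roi > 50)
    .slice(0, 1);
  return reserved.concat(extraE2e, integration, propertyTests);
}

const sampleCandidates: TestCandidate[] = [
  { id: "I1", kind: "integration", roi: 98, userFacingJourney: false },
  { id: "I2", kind: "integration", roi: 23, userFacingJourney: false },
  { id: "I3", kind: "integration", roi: 60, userFacingJourney: false },
  { id: "I4", kind: "integration", roi: 40, userFacingJourney: false },
  { id: "E1", kind: "e2e", roi: 45, userFacingJourney: true },
  { id: "E2", kind: "e2e", roi: 70, userFacingJourney: false },
  { id: "E3", kind: "e2e", roi: 55, userFacingJourney: false },
  { id: "P1", kind: "property", roi: 5, userFacingJourney: false },
];
const selectedIds = selectTests(sampleCandidates).map((c) => c.id);
```

Here `selectedIds` is `["E1", "E2", "I1", "I3", "I4", "P1"]`: E1 keeps the reserved slot despite an ROI of 45, E2 takes the single additional E2E slot, and I2 and E3 fall outside the limits.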
@@ -136,16 +150,16 @@ For each valid AC from Phase 1:
 import { describe, it } from '[detected test framework]'
 
 describe('[Feature name] Integration Test', () => {
-  //
-  // ROI:
+  // AC1: "After successful payment, an order is created and persisted"
+  // ROI: 98 (BV:10 × Freq:9 + Legal:0 + Defect:8)
   // Behavior: User completes payment → Order created in DB → Payment recorded
   // @category: core-functionality
   // @dependency: PaymentService, OrderRepository, Database
   // @complexity: high
   it.todo('AC1: Successful payment persists an order with the correct status')
 
-  //
-  // ROI:
+  // AC1-error: "Payment failure shows a user-friendly error message"
+  // ROI: 23 (BV:8 × Freq:2 + Legal:0 + Defect:7)
   // Behavior: Payment fails → Actionable error shown to user → No order created
   // @category: core-functionality
   // @dependency: PaymentService, ErrorHandler
@@ -166,8 +180,8 @@ import { describe, it } from '[detected test framework]'
 
 describe('[Feature name] E2E Test', () => {
   // User journey: Complete purchase flow (browse → add to cart → checkout → payment → confirmation)
-  // ROI:
-  //
+  // ROI: 119 (BV:10 × Freq:10 + Legal:10 + Defect:9) | Reserved slot: multi-step journey
+  // Verification: End-to-end user experience from product selection to order confirmation
   // @category: e2e
   // @dependency: full-system
   // @complexity: high
@@ -192,21 +206,50 @@ it.todo('[AC number]-property: [invariant described in natural language]')
 
 On completion, report in the following JSON format. Detailed meta information is contained in comments inside the test skeleton files and is extracted by reading the files in later steps.
 
+**When E2E tests are emitted:**
 ```json
 {
   "status": "completed",
-  "feature": "
+  "feature": "payment",
   "generatedFiles": {
-    "integration": "
-    "e2e": "
+    "integration": "tests/payment.int.test.[ext]",
+    "e2e": "tests/payment.e2e.test.[ext]"
   },
-  "
-
-    "e2e": 1
-  }
+  "budgetUsage": { "integration": "2/3", "e2e": "1/2" },
+  "e2eAbsenceReason": null
 }
 ```
 
+**When E2E tests are not emitted:**
+```json
+{
+  "status": "completed",
+  "feature": "payment",
+  "generatedFiles": {
+    "integration": "tests/payment.int.test.[ext]",
+    "e2e": null
+  },
+  "budgetUsage": { "integration": "2/3", "e2e": "0/2" },
+  "e2eAbsenceReason": "no_multi_step_journey"
+}
+```
+
+**When integration tests are also not emitted:**
+```json
+{
+  "status": "completed",
+  "feature": "config-update",
+  "generatedFiles": {
+    "integration": null,
+    "e2e": null
+  },
+  "budgetUsage": { "integration": "0/3", "e2e": "0/2" },
+  "e2eAbsenceReason": "no_multi_step_journey"
+}
+```
+
+**Contract**: `generatedFiles.integration` and `generatedFiles.e2e` are always present as keys. The value is a file path string when generated and `null` when not generated. `e2eAbsenceReason` is `null` when E2E was emitted; otherwise it is either `no_multi_step_journey` or `below_threshold_user_confirmed`.
+
 ## Constraints and Quality Standards
 
 **Mandatory compliance items**:
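The report contract stated in the hunk above (both `generatedFiles` keys always present, `e2eAbsenceReason` null exactly when an E2E file was generated) lends itself to a type plus a small consistency check. The TypeScript below is a sketch under assumptions: the field names and the two absence reasons come from the diff, while `GenerationReport` as a type name and `contractViolations` are hypothetical.

```typescript
// Sketch only: the type and function names are hypothetical; field names follow the diff.
type E2eAbsenceReason =
  | "no_multi_step_journey"
  | "below_threshold_user_confirmed";

interface GenerationReport {
  status: string;
  feature: string;
  // Both keys are required, so "always present" is enforced by the type itself.
  generatedFiles: { integration: string | null; e2e: string | null };
  budgetUsage: { integration: string; e2e: string };
  e2eAbsenceReason: E2eAbsenceReason | null;
}

// Checks the coupling between generatedFiles.e2e and e2eAbsenceReason.
function contractViolations(report: GenerationReport): string[] {
  const problems: string[] = [];
  const e2eEmitted = report.generatedFiles.e2e !== null;
  if (e2eEmitted && report.e2eAbsenceReason !== null) {
    problems.push("e2eAbsenceReason must be null when an E2E file was generated");
  }
  if (!e2eEmitted && report.e2eAbsenceReason === null) {
    problems.push("e2eAbsenceReason is required when no E2E file was generated");
  }
  return problems;
}

const reportWithE2e: GenerationReport = {
  status: "completed",
  feature: "payment",
  generatedFiles: {
    integration: "tests/payment.int.test.[ext]",
    e2e: "tests/payment.e2e.test.[ext]",
  },
  budgetUsage: { integration: "2/3", e2e: "1/2" },
  e2eAbsenceReason: null,
};
const reportMissingReason: GenerationReport = {
  status: "completed",
  feature: "config-update",
  generatedFiles: { integration: null, e2e: null },
  budgetUsage: { integration: "0/3", e2e: "0/2" },
  e2eAbsenceReason: null,
};
const validProblems = contractViolations(reportWithE2e);
const invalidProblems = contractViolations(reportMissingReason);
```

`validProblems` is empty, while `invalidProblems` holds one message because the second report omits a reason even though no E2E file was generated.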
@@ -217,7 +260,7 @@ it.todo('[AC number]-property: [invariant described in natural language]')
 - Stay within the test limits; report when important tests exceed the limits
 
 **Quality standards**:
-- 
+- Select tests within the limits based on ROI ranking (integration: top 3 by ROI; E2E: reserved slot for user-facing journeys + additional tests with ROI > 50)
 - Apply behavior-first filtering strictly
 - Eliminate duplicates (check existing tests with Grep)
 - State dependencies explicitly
@@ -226,15 +269,17 @@ it.todo('[AC number]-property: [invariant described in natural language]')
 ## Exception Handling and Escalation
 
 ### Can Be Handled Automatically
-- 
-- **High-ROI
+- **Directory does not exist**: Automatically create the appropriate directory following the detected test structure
+- **No high-ROI integration tests**: Valid outcome - report "All ACs are below the ROI threshold or already covered by existing tests"
+- **No E2E tests (no multi-step journey)**: Valid outcome - report "No multi-step user journey detected; E2E tests not applicable"
 - **Important tests exceed the limits**: Report to user
 
 ### Escalation Required
-1. **Critical**: AC
-2. **High**:
-3.
-4.
+1. **Critical**: No ACs exist, or no Design Doc exists → Terminate with error
+2. **High**: No E2E test was emitted after applying the limits, but the feature contains a user-facing multi-step journey → Escalate with "The feature contains a user-facing multi-step journey, but no E2E test was emitted. Journey candidates evaluated: [list with ROI scores]. Please confirm whether to proceed without E2E." (Note: this escalation fires only when the Phase 4 reserved slot was not applied. If a reserved-slot candidate exists, it is emitted and this escalation does not fire.)
+3. **High**: All ACs were filtered out, but the feature is business-critical → User confirmation required
+4. **Medium**: The limits are insufficient for a critical user journey (ROI > 90) → Present options
+5. **Low**: Multiple interpretations are possible but the impact is minor → Adopt one interpretation + note it in the report
 
 ## Technical Specifications
 
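The ROI annotations in the test skeletons of this diff, such as `ROI: 98 (BV:10 × Freq:9 + Legal:0 + Defect:8)`, suggest the score is the product of the first two factors plus the other two. The TypeScript sketch below checks that reading against the three annotated values. The formula is inferred from those annotations rather than quoted from a definition, and the expanded factor names (`businessValue`, `frequency`) are assumptions about what `BV` and `Freq` abbreviate.

```typescript
// Sketch only: the formula is inferred from the annotations; factor names are assumed expansions.
interface RoiFactors {
  businessValue: number; // "BV" in the annotations
  frequency: number; // "Freq"
  legal: number; // "Legal"
  defect: number; // "Defect"
}

function roiScore(f: RoiFactors): number {
  return f.businessValue * f.frequency + f.legal + f.defect;
}

// The three annotated examples that appear in the skeletons.
const annotatedExamples = [
  { label: "AC1", factors: { businessValue: 10, frequency: 9, legal: 0, defect: 8 }, annotated: 98 },
  { label: "AC1-error", factors: { businessValue: 8, frequency: 2, legal: 0, defect: 7 }, annotated: 23 },
  { label: "Purchase journey", factors: { businessValue: 10, frequency: 10, legal: 10, defect: 9 }, annotated: 119 },
];

const formulaMatchesAnnotations = annotatedExamples.every(
  (example) => roiScore(example.factors) === example.annotated
);
```

All three annotations agree with the inferred formula (10 × 9 + 0 + 8 = 98, 8 × 2 + 0 + 7 = 23, 10 × 10 + 10 + 9 = 119), so `formulaMatchesAnnotations` is `true`. This also puts the escalation threshold in context: only the purchase journey exceeds ROI > 90 through the additive terms alone.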
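@@ -102,7 +102,8 @@ Operates with an independent context that does not apply CLAUDE.md principles,
 - **Existence claims** (a file exists, a test exists, a function exists, a route exists): Verify with Glob or Grep before reporting. Include the tool results as evidence
 - **Behavioral claims** (a function does X, error handling works like Y): Actually Read the function implementation. Include the observed behavior as evidence
 - **Identifier claims** (names, URLs, parameters): Compare the document against the exact strings in the code. Record any difference as an inconsistency
-- 
+- **Referential integrity of literal identifiers**: When the document contains concrete identifiers (URL paths, API endpoints, config keys, type/interface names, table/column names, event names), verify that each identifier has a corresponding definition or implementation in the codebase. An identifier in the document with no counterpart in the code → gap. A definition in the code that contradicts the document's description → conflict
+- Collect from at least two sources before classifying. Mark single-source findings as low confidence. **Exception**: For identifier existence verification (does this path/type/config key exist in the code?), a single authoritative definition is sufficient for high confidence. If reference sites exist in addition to the definition, raise to the highest confidence
 
 ### Step 4: Consistency Classification
 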
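@@ -236,7 +237,8 @@ consistencyScore = (matchCount / verifiableClaimCount) * 100
 - [ ] All existence claims (file, test, and function existence) are backed by Glob/Grep tool results
 - [ ] All behavioral claims are backed by a Read of the function implementation
 - [ ] Identifier comparisons use the exact strings from the code (no corrections applied)
-- [ ]
+- [ ] Literal identifiers in the document (paths, endpoints, config keys, type names) are verified against definitions in the codebase
+- [ ] Each classification cites multiple sources; for identifier existence verification, a single authoritative definition is sufficient
 - [ ] Low-confidence classifications are explicitly noted
 - [ ] Contradicting evidence is documented rather than ignored
 - [ ] The `reverseCoverage` section is populated with actual values based on tool results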