create-ai-project 1.23.1 → 1.23.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (42)
  1. package/.claude/agents-en/acceptance-test-generator.md +16 -1
  2. package/.claude/agents-en/code-reviewer.md +8 -0
  3. package/.claude/agents-en/document-reviewer.md +21 -1
  4. package/.claude/agents-en/integration-test-reviewer.md +11 -1
  5. package/.claude/agents-en/task-decomposer.md +10 -0
  6. package/.claude/agents-en/task-executor-frontend.md +7 -1
  7. package/.claude/agents-en/task-executor.md +7 -1
  8. package/.claude/agents-en/technical-designer-frontend.md +10 -48
  9. package/.claude/agents-en/technical-designer.md +10 -26
  10. package/.claude/agents-en/work-planner.md +6 -0
  11. package/.claude/agents-ja/acceptance-test-generator.md +17 -2
  12. package/.claude/agents-ja/code-reviewer.md +8 -0
  13. package/.claude/agents-ja/document-reviewer.md +21 -1
  14. package/.claude/agents-ja/integration-test-reviewer.md +11 -1
  15. package/.claude/agents-ja/task-decomposer.md +10 -0
  16. package/.claude/agents-ja/task-executor-frontend.md +7 -1
  17. package/.claude/agents-ja/task-executor.md +7 -1
  18. package/.claude/agents-ja/technical-designer-frontend.md +10 -48
  19. package/.claude/agents-ja/technical-designer.md +9 -25
  20. package/.claude/agents-ja/work-planner.md +6 -0
  21. package/.claude/commands-en/front-build.md +14 -1
  22. package/.claude/commands-en/front-plan.md +15 -2
  23. package/.claude/commands-en/plan.md +15 -1
  24. package/.claude/commands-ja/front-build.md +14 -1
  25. package/.claude/commands-ja/front-plan.md +14 -1
  26. package/.claude/commands-ja/plan.md +15 -1
  27. package/.claude/skills-en/documentation-criteria/references/plan-template.md +20 -0
  28. package/.claude/skills-en/documentation-criteria/references/task-template.md +12 -0
  29. package/.claude/skills-en/frontend-technical-spec/SKILL.md +4 -4
  30. package/.claude/skills-en/frontend-typescript-rules/SKILL.md +45 -111
  31. package/.claude/skills-en/frontend-typescript-testing/SKILL.md +8 -6
  32. package/.claude/skills-en/subagents-orchestration-guide/SKILL.md +9 -7
  33. package/.claude/skills-en/task-analyzer/references/skills-index.yaml +9 -11
  34. package/.claude/skills-ja/documentation-criteria/references/plan-template.md +20 -0
  35. package/.claude/skills-ja/documentation-criteria/references/task-template.md +12 -0
  36. package/.claude/skills-ja/frontend-technical-spec/SKILL.md +3 -3
  37. package/.claude/skills-ja/frontend-typescript-rules/SKILL.md +43 -288
  38. package/.claude/skills-ja/frontend-typescript-testing/SKILL.md +15 -71
  39. package/.claude/skills-ja/subagents-orchestration-guide/SKILL.md +9 -7
  40. package/.claude/skills-ja/task-analyzer/references/skills-index.yaml +10 -11
  41. package/CHANGELOG.md +18 -0
  42. package/package.json +1 -1
package/.claude/agents-en/acceptance-test-generator.md +16 -1
@@ -68,6 +68,7 @@ For each valid AC from Phase 1:
  - Happy path (1 test mandatory)
  - Error handling (only user-visible errors)
  - Edge cases (only if high business impact)
+ - Boundary path (behavior-changing AC only): when the AC can hold on the main path while a distinct branch, state, input class, lifecycle step, or fallback regresses, capture that boundary as a proof obligation so the test exercises it
 
  2. **Classify test level**:
  - Integration test candidate (feature-level interaction)
@@ -172,6 +173,8 @@ describe('[Feature Name] Integration Test', () => {
  // @category: core-functionality
  // @dependency: PaymentService, OrderRepository, Database
  // @complexity: high
+ // Primary failure mode: payment succeeds but the order row is absent or unpersisted
+ // Proof obligation: the order is persisted only after a successful payment; the external payment gateway is the only boundary that may be mocked
  it.todo('AC1: Successful payment creates persisted order with correct status')
 
  // AC1-error: "Payment failure shows user-friendly error message"
@@ -180,10 +183,16 @@ describe('[Feature Name] Integration Test', () => {
  // @category: core-functionality
  // @dependency: PaymentService, ErrorHandler
  // @complexity: medium
+ // Primary failure mode: payment failure still creates an order, or the error is swallowed without a user-visible message
+ // Proof obligation: a failed payment surfaces an actionable error and leaves no order persisted; only the payment gateway may be mocked
  it.todo('AC1-error: Failed payment displays error without creating order')
  })
  ```
 
+ **Proof annotations** (apply to every skeleton, alongside the metadata above): each `it.todo` carries two comment lines that hand the proof contract to the test implementer and to integration-test-reviewer (these map to the task template's Proof Obligations fields):
+ - `Primary failure mode`: the specific regression that turns this test red — the behavior the AC promises and would break
+ - `Proof obligation`: what the implemented test must assert to prove the claim — the boundary to traverse, the observable state before/after for state-changing ACs, and which boundaries may be mocked and why. For behavior-changing ACs, name the boundary path (branch, state, input class, lifecycle step, or fallback) the test must traverse when the main path alone would stay green through the regression. Phrase it as design intent describing what to assert; the implementer writes the executable assertions and mock setup
+
  ### E2E Test Files
 
  Generate **separate files per lane**: `*.fixture-e2e.test.[ext]` for fixture-e2e, `*.service-e2e.test.[ext]` for service-integration-e2e. Each emitted file MUST carry a `@lane:` header so downstream agents (work-planner, task-decomposer, executor) can route correctly.
@@ -205,6 +214,8 @@ describe('[Feature Name] fixture-e2e', () => {
  // @lane: fixture-e2e
  // @dependency: full-ui (mocked backend)
  // @complexity: medium
+ // Primary failure mode: a step transition or its observable state is lost across the journey
+ // Proof obligation: each step's UI transition and resulting state are asserted in sequence; only the backend is mocked (canned responses)
  it.todo('User Journey: Cart-to-confirmation flow with mocked payment')
  })
  ```
@@ -226,6 +237,8 @@ describe('[Feature Name] service-integration-e2e', () => {
  // @lane: service-integration-e2e
  // @dependency: full-system
  // @complexity: high
+ // Primary failure mode: the order row or downstream event is absent after a real cross-service purchase
+ // Proof obligation: the DB row, published event, and enqueued email are observed against the real local stack; nothing on the asserted path is mocked
  it.todo('User Journey: Complete purchase persists order and publishes downstream event')
  })
  ```
@@ -240,6 +253,8 @@ describe('[Feature Name] service-integration-e2e', () => {
  // ROI: [value] | Test Type: property-based
  // @category: core-functionality
  // fast-check: fc.property(fc.constantFrom([input variations]), (input) => [invariant])
+ // Primary failure mode: an input in the generated domain violates the stated invariant
+ // Proof obligation: the invariant holds for all generated inputs; no boundary is mocked
  it.todo('[AC#]-property: [invariant in natural language]')
  ```
 
@@ -316,7 +331,7 @@ Upon completion, report in the following JSON format. Detailed meta information
  ## Constraints and Quality Standards
 
  **Required Compliance**:
- - Output `it.todo` skeletons only: each skeleton contains verification points, expected results, and pass criteria as comments inside `it.todo` blocks.
+ - Output `it.todo` skeletons only: each skeleton contains verification points, expected results, pass criteria, primary failure mode, and proof obligation as comments inside `it.todo` blocks.
  Implementation code, assertions (`expect`), and mock setup must not be included — downstream consumers parse `it.todo` presence to determine phase placement and review status.
  - Clearly state verification points, expected results, and pass criteria for each test
  - Preserve original AC statements in comments (ensure traceability)
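To make the proof annotations in the hunks above concrete, here is a minimal sketch of what an implemented test for `AC1-error` might assert once the `it.todo` is filled in. Everything in it is hypothetical: `checkout`, `InMemoryOrderRepository`, and the `PaymentGateway` interface are illustrative stand-ins rather than code from the package, and a real test would live inside the project's test runner instead of a plain function.

```typescript
// Hypothetical stand-ins for illustration; none of these names come from the package.
type Order = { id: string; status: "paid" };
type PaymentResult = { ok: true } | { ok: false; reason: string };

// The external payment gateway is the only boundary the annotation allows to be mocked.
interface PaymentGateway {
  charge(amountCents: number): PaymentResult;
}

// In a real integration test this would be the actual repository;
// it is in-memory here only to keep the sketch self-contained.
class InMemoryOrderRepository {
  private readonly rows: Order[] = [];
  save(order: Order): void {
    this.rows.push(order);
  }
  count(): number {
    return this.rows.length;
  }
}

type CheckoutOutcome =
  | { kind: "created"; order: Order }
  | { kind: "error"; userMessage: string };

function checkout(
  gateway: PaymentGateway,
  repo: InMemoryOrderRepository,
  amountCents: number,
): CheckoutOutcome {
  const result = gateway.charge(amountCents);
  if (result.ok === false) {
    // Failed payment: surface a user-visible message and persist nothing.
    return {
      kind: "error",
      userMessage: `Payment failed (${result.reason}). Please try another card.`,
    };
  }
  const order: Order = { id: `order-${repo.count() + 1}`, status: "paid" };
  repo.save(order);
  return { kind: "created", order };
}

// Observes state before the action, performs the action, then observes state after.
function runFailedPaymentProof(): { before: number; after: number; userMessage: string } {
  const repo = new InMemoryOrderRepository();
  const decliningGateway: PaymentGateway = {
    charge: () => ({ ok: false, reason: "card declined" }),
  };
  const before = repo.count();
  const outcome = checkout(decliningGateway, repo, 1999);
  const after = repo.count();
  return { before, after, userMessage: outcome.kind === "error" ? outcome.userMessage : "" };
}
```

If `checkout` regressed so that a declined payment still saved an order, `after` would become 1 and the check on it would fail, which is the "turns red under the primary failure mode" property the annotations ask for.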
package/.claude/agents-en/code-reviewer.md +8 -0
@@ -62,6 +62,9 @@ For each acceptance criterion extracted in Step 1:
  - Determine status: fulfilled / partially fulfilled / unfulfilled
  - Record the file path and relevant code location
  - Note any deviations from the Design Doc specification
+ - For behavior-changing ACs, confirm the evidence covers the boundary paths, not only the main path: where a distinct branch, state, input class, lifecycle step, or fallback governs the behavior, verify it is exercised. Compare the source/referenced behavior and the implemented behavior at the same granularity; an unsupported change in a boundary dimension is a `dd_violation`
+ - Confirm the implementation keeps the core mechanism the AC, Design Doc, or referenced materials explicitly require; cite the source phrase. A simpler substitute that passes tests but drops the required mechanism is a `dd_violation`
+ - For changes to persisted, shared, or externally observable state, identify the publication boundary (where the new state becomes observable to another process, component, user, or later step). State that is observable as complete while still partial, uninitialized, stale, or rollback-only is a `reliability` finding, because a downstream consumer can treat the incomplete state as complete and fail
 
  #### 2-2. Identifier Verification
 
@@ -106,6 +109,11 @@ For each function/method in implementation files, check against coding-standards
  - Counts as coverage: the test body executes at least one assertion that exercises the AC's observable behavior. Intentional-absence assertions (e.g., empty list, null result) count when absence is the AC's expectation
  - Non-substantive examples: `skip`/`xit` left on a test that should run, TODO-only or placeholder body, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
  - Action on non-substantive: record as `coverage_gap` with rationale citing the AC reference and the specific substance issue (file:line)
+ - **Proof verification per cited test** (beyond substance):
+ - When applies: a test counts as substantive coverage for an AC marked fulfilled
+ - Primary-failure-mode source: cite the claim's recorded Proof Obligation (task file) or test skeleton annotation; derive from the AC only when neither exists, so the judgment matches what the test author targeted
+ - Counts as proof: the test turns red under that primary failure mode and exercises the claimed boundary directly
+ - Action when unproven: a test that passes yet would stay green if the claimed behavior regressed → record as `coverage_gap` with rationale naming the unproven failure mode (file:line)
 
  #### Finding Classification
 
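The distinction between substantive coverage and proof in the hunk above can be sketched with a deliberately regressed implementation. The coupon scenario, `totalCents`, and `totalCentsRegressed` are invented for illustration and do not appear in the package; the point is only that a check can run and pass while staying green under the primary failure mode.

```typescript
// Hypothetical AC: "applying a coupon reduces the cart total by the coupon percentage".
type Cart = { items: { priceCents: number; qty: number }[]; couponPercent?: number };
type TotalFn = (cart: Cart) => number;

// Intended behavior: the coupon is applied to the subtotal.
function totalCents(cart: Cart): number {
  const subtotal = cart.items.reduce((sum, item) => sum + item.priceCents * item.qty, 0);
  const percent = cart.couponPercent ?? 0;
  return Math.round((subtotal * (100 - percent)) / 100);
}

// Regressed variant modelling the primary failure mode: the coupon is silently ignored.
function totalCentsRegressed(cart: Cart): number {
  return cart.items.reduce((sum, item) => sum + item.priceCents * item.qty, 0);
}

const sampleCart: Cart = { items: [{ priceCents: 1000, qty: 2 }], couponPercent: 10 };

// Passes for both implementations, so it cannot prove the AC.
const weakCheck = (fn: TotalFn): boolean => fn(sampleCart) >= 0;

// Observes the promised behavior, so it fails when the coupon is dropped.
const provingCheck = (fn: TotalFn): boolean => fn(sampleCart) === 1800;
```

Under the rule in the hunk, a reviewer would record the weak check as a `coverage_gap` naming "coupon ignored" as the unproven failure mode, even though the test executes an assertion.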
package/.claude/agents-en/document-reviewer.md +21 -1
@@ -23,7 +23,7 @@ You are an AI assistant specialized in technical document review.
  - `composite`: Composite perspective review (recommended) - Verifies structure, implementation, and completeness in one execution
  - When unspecified: Comprehensive review
 
- - **doc_type**: Document type (`PRD`/`ADR`/`UISpec`/`DesignDoc`)
+ - **doc_type**: Document type (`PRD`/`ADR`/`UISpec`/`DesignDoc`/`WorkPlan`)
  - **target**: Document path to review
 
  - **code_verification**: Code verification results JSON (optional)
@@ -34,6 +34,10 @@ You are an AI assistant specialized in technical document review.
  - When provided, use `focusAreas` as the canonical source for Fact Disposition coverage checks
  - When absent, mark focusArea completeness as unverifiable for this review
 
+ - **design_doc**: Design Doc path(s) (optional, WorkPlan review)
+ - When provided, read it as the source for AC / contract / state-transition coverage checks against the plan
+ - When absent, resolve the Design Doc(s) from the work plan's Related Documents section
+
  ## Workflow
 
  ### Step 0: Input Context Analysis (MANDATORY)
@@ -50,6 +54,7 @@ You are an AI assistant specialized in technical document review.
  - Specialized verification based on doc_type
  - For DesignDoc: Verify "Applicable Standards" section exists with explicit/implicit classification
  - Missing or incomplete → `critical` issue; implicit standards without confirmation → `important` issue
+ - For WorkPlan: confirm the plan carries the artifacts the semantic gate is judged against — Design-to-Plan Traceability, Failure Mode Checklist, Review Scope, Verification Strategy summary, and Proof Strategy. Read the referenced Design Doc(s) so AC / contract / state-transition coverage can be checked against the plan's tasks
  - If `code_verification` provided: extract discrepancy list and reverse coverage gaps; feed into Gate 1 as pre-verified evidence
  - If `codebase_analysis` provided: extract `focusAreas` and their `evidence` values for Gate 0 / Gate 1 Fact Disposition checks
 
@@ -71,6 +76,13 @@ For DesignDoc, additionally verify:
  - [ ] Fact Disposition Table section exists in the Design Doc
  - [ ] Minimal Surface Alternatives section present with one entry per new in-scope element (persistent state; public-contract elements or cross-boundary fields/props — for backend, fields crossing module/service boundaries; for frontend, public API props of exported reusable components, Context values, or state lifted across ownership boundaries; behavioral mode/flag/variant; reusable abstraction or component split) when the design introduces any. Each entry contains the 5-step output (fixed requirements with AC references — AC ID, AC heading, EARS clause, or constraint ID — from the Design Doc or referenced PRD/UI Spec; alternatives table including at least one subtractive alternative; selected alternative with rationale; rejected alternatives log)
 
+ For WorkPlan, additionally verify:
+ - [ ] Review Scope recorded (planned-files scope, or base branch + diff range for a revision plan)
+ - [ ] Design-to-Plan Traceability table present with every row mapped to a task or carrying a justified gap
+ - [ ] Verification Strategy summary and Proof Strategy present
+ - [ ] Failure Mode Checklist present
+ - [ ] Final phase includes Quality Assurance (acceptance criteria achievement, all tests passing)
+
  #### Gate 1: Quality Assessment (only after Gate 0 passes)
 
  **Comprehensive Review Mode**:
@@ -113,6 +125,14 @@ For DesignDoc, additionally verify:
  - (3) Step 4 rationale either selects the smallest alternative or names a current requirement smaller alternatives fail to satisfy — "useful" / "future-ready" / "convenient" / "users might want" used as primary rationale → `critical` issue (category: `compliance`).
  - (4) Step 5 records the rejected alternatives with brief rationale — missing rejected alternatives log → `important` issue (category: `completeness`). Note: the zero-alternative case is already trapped at `critical` by sub-check (2); sub-check (4) catches the case where alternatives were generated but the log is missing.
 
+ - **Work plan semantic gate** (doc_type WorkPlan):
+ - (1) Coverage is checked where each item lives in the plan: each acceptance criterion is covered by a task — evidenced by a Design-to-Plan Traceability row mapping it to a task, or the task's completion criteria or Proof Obligations referencing it; each data contract and state transition has a Design-to-Plan Traceability row mapping to a task or an explicit out-of-scope entry; each quality assurance mechanism appears in the Quality Assurance Mechanisms table with covered files. An item with no such coverage → `critical` issue (category: `completeness`). Distinguish the cause for an uncovered acceptance criterion: when the Design Doc supports it but no task maps to it (plan omission, fixable by re-planning) → `critical`; when the Design Doc or inputs give it no basis (a gap re-planning cannot fix) → the `rejected` trigger per the Verdict mapping below
+ - (2) The early verification point sits in an early phase rather than the final phase — deferral to the final phase → `important` issue (category: `consistency`)
+ - (3) Each cross-boundary, public-boundary, or persisted-state change names a task that verifies it through the real boundary — missing → `important` issue (category: `completeness`)
+ - (4) Each traceability table present (Design-to-Plan, UI Spec Component, Connection Map, ADR Bindings) is filled to a granularity that resolves its target task — under-specified rows → `important` issue (category: `completeness`)
+ - (5) The Failure Mode Checklist covers the plan's applicable domain-independent categories (same-value, no-op, empty input, invalid option, missing config, unavailable boundary, shared-state dependency, rollback-only visibility) — missing applicable category → `recommended` issue (category: `completeness`)
+ - Verdict mapping (WorkPlan): any semantic-gate `critical` issue forces the verdict to at least `needs_revision` — except a coverage gap traceable to a missing or contradictory Design Doc/input element (which re-planning cannot fix) → `rejected`; an `important`-only set caps the verdict at `approved_with_conditions`
+
  **Perspective-specific Mode**:
  - Implement review based on specified mode and focus
 
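One possible reading of the WorkPlan verdict mapping added above, written as a small function. This is a sketch under assumptions: the `GateIssue` shape and the `unfixableByReplanning` flag are invented here, `approved` is an assumed name for the unconstrained outcome (the hunk names only the other three verdicts), and the function returns the most lenient verdict the mapping permits, since "at least `needs_revision`" leaves room for a stricter result.

```typescript
type GateSeverity = "critical" | "important" | "recommended";

// Hypothetical issue shape; only the severity names come from the hunk.
interface GateIssue {
  severity: GateSeverity;
  // True when the gap traces to a missing or contradictory Design Doc/input element,
  // that is, a gap re-planning cannot fix.
  unfixableByReplanning?: boolean;
}

// "approved" is assumed; the other three names appear in the hunk.
type Verdict = "approved" | "approved_with_conditions" | "needs_revision" | "rejected";

function mostLenientVerdict(issues: readonly GateIssue[]): Verdict {
  const criticals = issues.filter((issue) => issue.severity === "critical");
  // A critical gap that re-planning cannot fix triggers rejection.
  if (criticals.some((issue) => issue.unfixableByReplanning === true)) return "rejected";
  // Any other critical issue forces at least needs_revision.
  if (criticals.length > 0) return "needs_revision";
  // An important-only set caps the verdict at approved_with_conditions.
  if (issues.some((issue) => issue.severity === "important")) return "approved_with_conditions";
  // Assumption: recommended-only or empty sets place no constraint on the verdict.
  return "approved";
}
```

For example, a plan whose only finding is a deferred early verification point (an `important` issue) would come out as `approved_with_conditions` under this reading.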
package/.claude/agents-en/integration-test-reviewer.md +11 -1
@@ -73,6 +73,15 @@ Verify the following for each test case:
  | Internal Components | Use actual | Unnecessary mocking |
  | Log Output Verification | Use vi.fn() | Mock without verification |
 
+ ### 5. Claim Proof Adequacy
+
+ Take each AC's primary failure mode and proof obligation from the test's skeleton annotation (the `Primary failure mode` / `Proof obligation` comments) as the source of truth — these correspond to the task template's Proof Obligations fields. Confirm each test proves its claim: an assertion observes the promised behavior so the test fails if that behavior regresses. Record a `proof_insufficient` issue for each obligation the test leaves unproven:
+ - The test turns red under the recorded primary failure mode (an assertion observes the specific behavior the AC promises, so a regression in that behavior fails the test).
+ - When the AC claims a public or integration boundary, the test exercises that boundary directly.
+ - When the AC claims a state change, side effect, rollback, non-mutating mode, idempotency, or persistence, the test asserts the observable state before the action, the action, and the observable state after.
+ - Each mocked boundary is an external dependency, with the boundary under test left real, and a comment records why that boundary may be mocked.
+ - Integration and E2E tests use bounded fixtures and assert outcomes that hold regardless of shared state, real data volume, or execution order.
+
  ## Output Format
 
  ### Output Protocol
@@ -116,7 +125,7 @@ Final message: exactly one JSON object matching the schema below (begins with `{
  "qualityIssues": [
  {
  "severity": "high | medium | low",
- "category": "aaa_structure | independence | reproducibility | mock_boundary | readability",
+ "category": "aaa_structure | independence | reproducibility | mock_boundary | proof_insufficient | readability",
  "location": "[file:line number]",
  "description": "[Problem description]",
  "suggestion": "[Specific fix proposal]"
@@ -207,4 +216,5 @@ When needs_revision decision, output fix instructions usable in subsequent proce
 
  - [ ] All skeleton comments verified against implementation
  - [ ] Implementation quality evaluated
+ - [ ] Each test proves its AC's claim: turns red under the primary failure mode, exercises the claimed boundary, and asserts before/after state for state-changing claims
  - [ ] Mock boundaries verified (integration tests)
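For consumers that parse the reviewer's JSON, the widened `category` enum in the hunks above could be mirrored in a type like the following. The field names and enum values are taken from the schema fragment in the diff; the type names, the `isIssueCategory` guard, and the sample issue (including its file path and severity) are assumptions made for illustration.

```typescript
type IssueSeverity = "high" | "medium" | "low";

// Category values as listed in the updated schema line, including proof_insufficient.
const ISSUE_CATEGORIES = [
  "aaa_structure",
  "independence",
  "reproducibility",
  "mock_boundary",
  "proof_insufficient",
  "readability",
] as const;
type IssueCategory = (typeof ISSUE_CATEGORIES)[number];

interface QualityIssue {
  severity: IssueSeverity;
  category: IssueCategory;
  location: string;
  description: string;
  suggestion: string;
}

// Runtime guard for category strings read from untyped JSON.
function isIssueCategory(value: string): value is IssueCategory {
  return ISSUE_CATEGORIES.some((category) => category === value);
}

// Hypothetical finding: the test never observes state after the action.
const sampleIssue: QualityIssue = {
  severity: "high",
  category: "proof_insufficient",
  location: "checkout.int.test.ts:42",
  description: "Test asserts the error message but never checks that no order was persisted.",
  suggestion: "Assert the order count before and after the failed payment.",
};
```

A parser written against the 1.23.1 enum would reject `proof_insufficient`, so downstream tooling with a closed category list may need the same one-line update.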
package/.claude/agents-en/task-decomposer.md +10 -0
@@ -104,6 +104,7 @@ Decompose tasks based on implementation strategy patterns determined in implemen
  - Concrete implementation steps
  - **Quality Assurance Mechanisms** (derived from work plan header — see Quality Assurance Mechanism Propagation below)
  - **Operation Verification Methods** (derived from Verification Strategy in work plan)
+ - **Proof Obligations** (per claim — see Proof Obligation Propagation below)
  - Completion criteria
 
  6. **Investigation Targets Determination**
@@ -152,6 +153,14 @@ When the work plan includes a Verification Strategy, derive each task's Operatio
  - **Verification level**: Select L1/L2/L3 per implementation-approach skill
  3. **Investigation Targets**: Include resources needed for verification (e.g., existing implementation for comparison, schema definitions, seed data paths)
 
+ ## Proof Obligation Propagation
+
+ Each task that implements a claim carries Proof Obligations (see task template) so downstream review can judge whether the tests prove the claim, not merely run:
+
+ 1. **Source**: When a test skeleton covers the task, copy its `Primary failure mode` and `Proof obligation` annotations into the task's Proof Obligations. When no skeleton covers the claim, derive the primary failure mode from the AC, and derive the boundary, before/after state assertion, mock boundary rationale, and residual from the AC and the task's target files (mark `N/A` for fields the claim does not exercise — e.g., no state assertion for a non-state-changing claim).
+ 2. **Per claim**: Record one entry per AC or claim, populating all Proof Obligations fields defined in the task template.
+ 3. **Apply when claims exist**: Tasks with no behavioral claim (e.g., pure config or scaffolding) omit the section.
+
  ## UI Spec Propagation
 
  When the work plan contains a UI Spec Component → Task Mapping table, propagate component references to each implementation task as follows:
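The "copy from the skeleton" half of Proof Obligation Propagation in the hunk above can be sketched as a small extractor. This is an assumption-laden illustration: the agents are prompt-driven and the package ships no such parser, and `extractProofAnnotations` and its return shape are hypothetical. It only recognizes the two comment prefixes shown in the skeleton examples and attaches them to the next `it.todo`.

```typescript
// Hypothetical record of the two annotations attached to one it.todo.
interface SkeletonAnnotations {
  testName: string;
  primaryFailureMode: string;
  proofObligation: string;
}

function extractProofAnnotations(skeleton: string): SkeletonAnnotations[] {
  const results: SkeletonAnnotations[] = [];
  let failureMode: string | undefined;
  let obligation: string | undefined;

  for (const rawLine of skeleton.split("\n")) {
    const line = rawLine.trim();
    const failureMatch = /^\/\/ Primary failure mode:\s*(.+)$/.exec(line);
    const obligationMatch = /^\/\/ Proof obligation:\s*(.+)$/.exec(line);
    const todoMatch = /^it\.todo\((['"])(.+)\1\)/.exec(line);

    if (failureMatch) {
      failureMode = failureMatch[1] ?? "";
    } else if (obligationMatch) {
      obligation = obligationMatch[1] ?? "";
    } else if (todoMatch) {
      // Only emit an entry when both annotations precede this it.todo.
      if (failureMode !== undefined && obligation !== undefined) {
        results.push({
          testName: todoMatch[2] ?? "",
          primaryFailureMode: failureMode,
          proofObligation: obligation,
        });
      }
      // Annotations apply to a single it.todo, so reset for the next one.
      failureMode = undefined;
      obligation = undefined;
    }
  }
  return results;
}
```

Skeletons without both annotations yield no entry here; per the hunk, those claims fall back to deriving the failure mode from the AC, which this sketch does not attempt.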
@@ -348,6 +357,7 @@ Please execute decomposed tasks according to the order.
  - [ ] Overall design document creation
  - [ ] Implementation efficiency and rework prevention (pre-identification of common processing, clarification of impact scope)
  - [ ] Investigation Targets specified for every task (specific file paths, not vague categories)
+ - [ ] Proof Obligations recorded for each claim-implementing task (primary failure mode + boundary to exercise)
  - [ ] Quality Assurance Mechanisms from work plan header propagated to relevant tasks
 
  ## Task Design Principles
package/.claude/agents-en/task-executor-frontend.md +7 -1
@@ -94,6 +94,12 @@ Escalation thresholds:
  - Exactly the pair (a+c) or (b+c) → Escalation; any other 2-indicator combination → Continue
  - 1 or fewer indicators match → Continue implementation
 
+ ### Step4: Core Mechanism Preservation Check (Any YES → Immediate Escalation)
+ Preserve the core mechanism the task, AC, Design Doc, or UI Spec requires. Implementation details (variable names, internal logic order, local structure) stay free to change; the required mechanism itself stays intact.
+ □ Required core mechanism replaced by a simpler or weaker substitute? (passing tests do not make a substitute acceptable)
+ □ Required core mechanism infeasible as specified?
+ Any YES → stop and escalate with `escalation_type: "design_compliance_violation"` per the Escalation Response table, mapping the case to the contract fields: `design_doc_expectation` = the required mechanism and the source phrase it cites (task/AC/Design Doc/UI Spec); `actual_situation` = the proposed substitute and the resulting behavioral delta; `why_cannot_implement` = why the required mechanism was replaced or is infeasible as specified; `attempted_approaches[]` = the ways attempted to preserve the required mechanism, or `[]` when infeasibility is known before implementation; `claude_recommendation` = the condition that would lift the block.
+
  ### Boundary Cases and Iron Rule
 
  | Case | Continue | Escalate |
@@ -105,7 +111,7 @@ Escalation thresholds:
 
  **Iron Rule — escalate when objectively undeterminable**: 2+ valid interpretations for a judgment item; pattern unprecedented in past implementation experience; required information not in Design Doc; equivalent engineers would split on the call.
 
- ### Implementation Continuable (all Step1-3 checks NO and clearly applicable)
+ ### Implementation Continuable (all Step1-4 checks NO and clearly applicable)
  Internal detail optimization (variable names, logic order); specs not in Design Doc; safe `unknown` → concrete type guard for external API responses; minor UI/message text adjustments.
 
  ## Responsibilities, Authority, and Boundaries
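The field mapping in the Step4 hunk above can be read as a small data contract. The sketch below types only the fields that hunk names; the full Escalation Response table is not visible in this diff and may define more. `CoreMechanismCheck`, `step4Escalation`, and the example values are hypothetical.

```typescript
// Fields named in the Step4 hunk; the surrounding contract may define additional ones.
interface DesignComplianceEscalation {
  escalation_type: "design_compliance_violation";
  design_doc_expectation: string;
  actual_situation: string;
  why_cannot_implement: string;
  attempted_approaches: string[];
  claude_recommendation: string;
}

// Hypothetical input shape capturing the two Step4 questions and their evidence.
interface CoreMechanismCheck {
  requiredMechanism: string;
  sourcePhrase: string;
  replacedBySubstitute: boolean;
  infeasibleAsSpecified: boolean;
  substituteAndBehavioralDelta: string;
  reason: string;
  attemptedApproaches: string[];
  unblockCondition: string;
}

// Returns an escalation when either Step4 question is answered YES, otherwise null.
function step4Escalation(check: CoreMechanismCheck): DesignComplianceEscalation | null {
  if (!check.replacedBySubstitute && !check.infeasibleAsSpecified) return null;
  return {
    escalation_type: "design_compliance_violation",
    design_doc_expectation: `${check.requiredMechanism} (source: "${check.sourcePhrase}")`,
    actual_situation: check.substituteAndBehavioralDelta,
    why_cannot_implement: check.reason,
    // Stays [] when infeasibility is known before any implementation attempt.
    attempted_approaches: check.attemptedApproaches.slice(),
    claude_recommendation: check.unblockCondition,
  };
}
```

Note that passing tests play no role in the decision: a substitute that keeps the suite green still produces an escalation, matching the parenthetical in the first checkbox.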
package/.claude/agents-en/task-executor.md +7 -1
@@ -94,6 +94,12 @@ Escalation thresholds:
  - Exactly the pair (a+c) or (b+c) → Escalation; any other 2-indicator combination → Continue
  - 1 or fewer indicators match → Continue implementation
 
+ ### Step4: Core Mechanism Preservation Check (Any YES → Immediate Escalation)
+ Preserve the core mechanism the task, AC, Design Doc, or referenced materials require. Implementation details (variable names, internal ordering, local structure) stay free to change; the required mechanism itself stays intact.
+ □ Required core mechanism replaced by a simpler or weaker substitute? (passing tests do not make a substitute acceptable)
+ □ Required core mechanism infeasible as specified?
+ Any YES → stop and escalate with `escalation_type: "design_compliance_violation"` per the Escalation Response table, mapping the case to the contract fields: `design_doc_expectation` = the required mechanism and the source phrase it cites (task/AC/Design Doc/referenced material); `actual_situation` = the proposed substitute and the resulting behavioral delta; `why_cannot_implement` = why the required mechanism was replaced or is infeasible as specified; `attempted_approaches[]` = the ways attempted to preserve the required mechanism, or `[]` when infeasibility is known before implementation; `claude_recommendation` = the condition that would lift the block.
+
  ### Boundary Cases and Iron Rule
 
  | Case | Continue | Escalate |
@@ -105,7 +111,7 @@ Escalation thresholds:
 
  **Iron Rule — escalate when objectively undeterminable**: 2+ valid interpretations for a judgment item; pattern unprecedented in past implementation experience; required information not in Design Doc; equivalent engineers would split on the call.
 
- ### Implementation Continuable (all Step1-3 checks NO and clearly applicable)
+ ### Implementation Continuable (all Step1-4 checks NO and clearly applicable)
  Internal detail optimization (variable names, processing order); specs not in Design Doc; safe `unknown` → concrete type guard; minor UI/message text adjustments.
 
  ## Responsibilities, Authority, and Boundaries
package/.claude/agents-en/technical-designer-frontend.md +10 -48
@@ -30,29 +30,7 @@ Follow documentation-criteria skill for ADR/Design Doc creation thresholds. If a
 
  ### Gate Ordering [BLOCKING]
 
- The subsections below are not parallel mandates; they form four serial gates. Complete each gate fully before starting the next. Within a gate, all listed subsections are required (subject to each subsection's own conditions).
-
- **Gate 0 — Inputs and Standards** (no upstream dependencies):
- - Agreement Checklist
- - Standards Identification
-
- **Gate 1 — Existing State Analysis** (depends on Gate 0):
- - Existing Code Investigation
- - Fact Disposition (when Codebase Analysis input is provided)
- - Minimal Surface Alternatives (when introducing persistent client/server state, props or fields crossing ownership boundaries — public API props of exported reusable components, Context values, or state lifted across ownership boundaries — behavioral modes/variants that change observable behavior, or reusable component splits)
-
- **Gate 2 — Design Decisions** (depends on Gate 1):
- - Implementation Approach Decision
- - Common ADR Process
- - Data Contracts
- - State Transitions (when applicable)
-
- **Gate 3 — Impact Documentation** (depends on Gate 2):
- - Integration Point Analysis
- - Change Impact Map
- - Interface Change Impact Analysis
-
- Each subsection below carries a `[Gate N — ...]` annotation in its heading. Subsections appear in Gate order (Gate 0 → 1 → 2 → 3); execute them in document order.
+ The subsections below are not parallel mandates; they form four serial gates: **Gate 0** Inputs & Standards → **Gate 1** Existing-State Analysis → **Gate 2** Design Decisions → **Gate 3** Impact Documentation. Complete each gate fully before starting the next. Each subsection below carries a `[Gate N ...]` annotation (with its own applicability condition) in its heading and appears in Gate order; execute them in document order.
 
  ### Agreement Checklist [Gate 0 — Required]
  Must be performed at the beginning of Design Doc creation:
@@ -331,31 +309,7 @@ Consistency first (follow existing React component patterns; document reason whe
 
  **MANDATORY**: implementation samples in ADR/Design Docs MUST comply with frontend-typescript-rules skill. Required: function components (class components deprecated); Props type definitions on all components; custom hooks for logic reuse; strict types (`unknown` + type guards for external API responses, `any` prohibited); Error Boundary / error state management; environment variables — secrets server-side only.
 
- Compliant sample (function component with Props type, custom hook with `unknown` type-guarded fetch):
-
- ```typescript
- type ButtonProps = { label: string; onClick: () => void; disabled?: boolean }
- export function Button({ label, onClick, disabled = false }: ButtonProps) {
- return <button onClick={onClick} disabled={disabled}>{label}</button>
- }
-
- function useUserData(userId: string) {
- const [user, setUser] = useState<User | null>(null)
- const [error, setError] = useState<Error | null>(null)
- useEffect(() => {
- void (async () => {
- try {
- const data: unknown = await (await fetch(`/api/users/${userId}`)).json()
- if (!isUser(data)) throw new Error('Invalid user data')
- setUser(data)
- } catch (e) { setError(e instanceof Error ? e : new Error('Unknown error')) }
- })()
- }, [userId])
- return { user, error }
- }
- ```
-
- Non-compliant: class components, `any`, untyped responses without guards, secrets embedded client-side.
+ Non-compliant: class components (except Error Boundaries), `any`, untyped responses without guards, secrets embedded client-side.
 
  ## Diagram Creation (mermaid)
 
@@ -394,6 +348,14 @@ Non-compliant: class components, `any`, untyped responses without guards, secret
 
  ## Acceptance Criteria Creation Guidelines
 
+ ### Value-First Drafting and Boundary Expansion
+
+ Draft each AC value-first, then expand it across requirement boundaries before applying the scoping rules below:
354
+
355
+ 1. **Value first**: name the user value, then the observable UI behavior that delivers it, then the technical boundary that realizes it.
356
+ 2. **Expand across boundaries** (candidate extraction — the scoping rules below decide which to keep): a behavior can hold on the happy path while regressing on a separate state. For each behavior-changing AC, consider an AC wherever the promised behavior must also hold — single/latest/full list rendering, sibling props or fields, loading/empty/error and later interaction states, stale or missing data, failed fetches or fallback UI, permission/validation gating, input scope and ordering/selection, side effects, and visibility or route boundaries (state becoming observable on another screen, to another component, or after navigation).
357
+ 3. **Compare at the same granularity**: when the AC concerns existing or referenced behavior, state the source behavior and the target behavior at the same level of detail, so a reviewer can confirm each boundary is preserved or intentionally changed.
358
+
397
359
  ### AC Scoping for Autonomous Implementation (Frontend)
398
360
 
399
361
  **Principle**: AC = User-observable behavior in browser verifiable in isolated CI environment. Cover happy path, unhappy path (errors), and edge cases (empty/loading states); prioritize important ACs at the top; document non-functional requirements in a separate section.
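The boundary-expansion step added in this hunk can be read as candidate generation followed by filtering. The sketch below is one possible way to model the generation half, not something the package ships: the dimension labels are copied from the step's wording, while the `AcceptanceCriterion` and `CandidateAc` shapes and the `expandAcrossBoundaries` function are hypothetical.

```typescript
// Boundary dimensions taken from the frontend expansion step's wording.
const BOUNDARY_DIMENSIONS = [
  'single/latest/full list rendering',
  'sibling props or fields',
  'loading/empty/error and later interaction states',
  'stale or missing data',
  'failed fetches or fallback UI',
  'permission/validation gating',
  'input scope and ordering/selection',
  'side effects',
  'visibility or route boundaries',
] as const

type BoundaryDimension = (typeof BOUNDARY_DIMENSIONS)[number]

// Hypothetical shapes, introduced only for this illustration.
type AcceptanceCriterion = { id: string; value: string; behavior: string; changesBehavior: boolean }
type CandidateAc = { sourceAcId: string; dimension: BoundaryDimension; prompt: string }

// Produce one candidate question per dimension; the scoping rules decide which to keep.
function expandAcrossBoundaries(ac: AcceptanceCriterion): CandidateAc[] {
  if (!ac.changesBehavior) return []
  return BOUNDARY_DIMENSIONS.map((dimension) => ({
    sourceAcId: ac.id,
    dimension,
    prompt: `Must "${ac.behavior}" also hold for: ${dimension}?`,
  }))
}
```

The output is deliberately over-inclusive; discarding candidates is left to the scoping rules that follow in the agent file.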
@@ -29,31 +29,7 @@ Follow documentation-criteria skill for ADR/Design Doc creation thresholds. If a
 
  ### Gate Ordering [BLOCKING]
 
- The subsections below are not parallel mandates; they form four serial gates. Complete each gate fully before starting the next. Within a gate, all listed subsections are required (subject to each subsection's own conditions).
-
- **Gate 0 — Inputs and Standards** (no upstream dependencies):
- - Agreement Checklist
- - Standards Identification
-
- **Gate 1 — Existing State Analysis** (depends on Gate 0):
- - Existing Code Investigation
- - Fact Disposition (when Codebase Analysis input is provided)
- - Data Representation Decision (when new or modified data structures are introduced)
- - Minimal Surface Alternatives (when introducing persistent state, public-contract elements or cross-boundary fields, behavioral modes/flags, or reusable abstractions)
-
- **Gate 2 — Design Decisions** (depends on Gate 1):
- - Implementation Approach Decision
- - Common ADR Process
- - Data Contracts
- - State Transitions (when applicable)
-
- **Gate 3 — Impact Documentation** (depends on Gate 2):
- - Integration Points
- - Change Impact Map
- - Field Propagation Map (when fields cross component boundaries)
- - Interface Change Impact Analysis
-
- Each subsection below carries a `[Gate N — ...]` annotation in its heading. Subsections appear in Gate order (Gate 0 → 1 → 2 → 3); execute them in document order.
+ The subsections below are not parallel mandates; they form four serial gates: **Gate 0** Inputs & Standards → **Gate 1** Existing-State Analysis → **Gate 2** Design Decisions → **Gate 3** Impact Documentation. Complete each gate fully before starting the next. Each subsection below carries a `[Gate N ...]` annotation (with its own applicability condition) in its heading and appears in Gate order; execute them in document order.
 
  ### Agreement Checklist [Gate 0 — Required]
  Must be performed at the beginning of Design Doc creation:
@@ -370,6 +346,14 @@ Consistency first (follow existing patterns; document reason when introducing ne
 
  ## Acceptance Criteria Creation Guidelines
 
+ ### Value-First Drafting and Boundary Expansion
+
+ Draft each AC value-first, then expand it across requirement boundaries before applying the rules below:
+
+ 1. **Value first**: name the user/operator/maintainer value, then the observable behavior that delivers it, then the technical boundary that realizes it.
+ 2. **Expand across boundaries** (candidate extraction — the rules below decide which to keep): a behavior can hold on the main path while regressing on a separate dimension. For each behavior-changing AC, consider an AC wherever the promised behavior must also hold — single/latest/full collection, sibling fields on the same surface, later lifecycle states and retries, stale/missing/empty values, failed refreshes or unavailable fallbacks, permission/validation/policy boundaries, input scope and selection/ordering/identity keys, side effects, and publication or visibility boundaries (state becoming observable to another process, component, user, or later step).
+ 3. **Compare at the same granularity**: when the AC concerns existing or referenced behavior, state the source behavior and the target behavior at the same level of detail, so a reviewer can confirm each boundary is preserved or intentionally changed.
+
  ### Writing Measurable ACs
 
  **Core Principle**: AC = User-observable behavior verifiable in isolated environment. Cover happy path, unhappy path, and edge cases. Non-functional requirements (performance, reliability, scalability) live in a separate "Non-functional Requirements" section.
@@ -453,4 +437,4 @@ Mode for documenting existing architecture as-is. Used when creating Design Docs
  ### Reverse-Engineer Mode Quality Standard
  - Every claim cites file:line as evidence
  - Identifiers transcribed exactly from code
- - Test existence confirmed by Glob, not assumed
+ - Test existence confirmed by Glob, not assumed
@@ -38,6 +38,9 @@ Choose Strategy A (TDD) if test skeletons are provided, Strategy B (implementati
  **Common rules (all approaches)**:
  - **Include Verification Strategy summary in work plan header** for downstream task reference
  - **Include adopted Quality Assurance Mechanisms in work plan header** for downstream task reference — list each adopted mechanism with tool name, what it enforces, configuration path, and covered files (literal file paths or directory prefixes from Design Doc, or "project-wide" if not scoped to specific files)
+ - **Include a Proof Strategy in the work plan header** (see plan template) — name the proof obligation source (test skeleton annotations when skeletons are provided, otherwise each AC's primary failure mode) and state that every claim-implementing task records Proof Obligations for downstream review
+ - **Record the Review Scope in the work plan header** — for a fresh pre-implementation plan, the planned-files scope derived from the Design Doc and task target files; for a revision plan over existing work, the base branch and diff range — so the work plan review and downstream verification share one scope
+ - **Include a Failure Mode Checklist in the work plan** (see plan template) — enumerate all eight domain-independent failure categories (same-value, no-op, empty input, invalid option, missing config, unavailable boundary, shared-state dependency, rollback-only visibility), mark which apply, and map each applicable one to its covering task(s), keeping entries free of project-specific names
  - Include verification tasks in the phase corresponding to Verification Strategy's verification timing
  - When test skeletons are provided, place integration test implementation in corresponding phases and E2E test execution in the final phase
  - When test skeletons are not provided, include test implementation tasks based on Design Doc acceptance criteria
@@ -364,6 +367,9 @@ When creating work plans, **Phase Structure Diagrams** and **Task Dependency Dia
  - [ ] Every row maps to at least one covering task
  - [ ] Plan header includes `Implementation Readiness: pending` (medium / large only)
  - [ ] Verification Strategy extracted from Design Doc and included in plan header
+ - [ ] Proof Strategy included in plan header (proof obligation source + per-task propagation rule)
+ - [ ] Review Scope recorded in plan header (base branch / diff range / changed-files scope)
+ - [ ] Failure Mode Checklist included, applicable categories mapped to covering tasks, free of project-specific names
  - [ ] Adopted Quality Assurance Mechanisms extracted from Design Doc and included in plan header
  - [ ] Phase structure matches implementation approach (vertical → value unit phases, horizontal → layer phases)
  - [ ] Early verification point placed in Phase 1 (when Verification Strategy specifies one)
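The Failure Mode Checklist rule added to work-planner.md (enumerate all eight categories, mark which apply, map each applicable one to covering tasks) lends itself to a mechanical check. The sketch below is only an illustration under assumptions: the eight category names come from the diff, while the `ChecklistEntry` shape and `validateChecklist` function are hypothetical and not part of the package.

```typescript
// The eight domain-independent categories named in the work-planner rule.
const FAILURE_CATEGORIES = [
  'same-value',
  'no-op',
  'empty input',
  'invalid option',
  'missing config',
  'unavailable boundary',
  'shared-state dependency',
  'rollback-only visibility',
] as const

type FailureCategory = (typeof FAILURE_CATEGORIES)[number]

// Hypothetical row shape for one checklist entry.
type ChecklistEntry = { category: FailureCategory; applies: boolean; coveringTasks: string[] }

// Report categories that were not enumerated, and applicable ones with no covering task.
function validateChecklist(entries: ChecklistEntry[]): string[] {
  const problems: string[] = []
  for (const category of FAILURE_CATEGORIES) {
    const entry = entries.find((e) => e.category === category)
    if (!entry) {
      problems.push(`missing category: ${category}`)
    } else if (entry.applies && entry.coveringTasks.length === 0) {
      problems.push(`no covering task: ${category}`)
    }
  }
  return problems
}
```

A category marked as not applicable passes without tasks, which matches the rule's wording that only applicable categories need a mapping.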
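@@ -68,6 +68,7 @@ For each valid AC from Phase 1:
  - Happy path (1 test required)
  - Error handling (user-visible errors only)
  - Edge cases (only when business impact is high)
+ - Boundary path (behavior-changing ACs only): when the AC can hold on the main path while regressing on a separate branch, state, input class, lifecycle step, or fallback, capture that boundary as a proof obligation so that the test traverses it
 
  2. **Classify test levels**:
  - Integration test candidates (feature-level interactions)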
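@@ -174,18 +175,26 @@ describe('[Feature name] Integration Test', () => {
  // @category: core-functionality
  // @dependency: PaymentService, OrderRepository, Database
  // @complexity: high
+ // Primary failure mode: payment succeeded but the order row does not exist or was not persisted
+ // Proof obligation: the order is persisted only after payment succeeds. The only boundary that may be mocked is the external payment gateway
  it.todo('AC1: Successful payment persists an order with the correct status')
 
  // AC1-error: "Failed payment displays a user-friendly error message"
  // ROI: 23 (BV:8 × Freq:2 + Legal:0 + Defect:7)
- // Behavior: payment failure → "executable" error (実行可能な) shown to user → no order created
+ // Behavior: payment failure → actionable error (対処可能な) shown to user → no order created
  // @category: core-functionality
  // @dependency: PaymentService, ErrorHandler
  // @complexity: medium
+ // Primary failure mode: an order is created despite the payment failure, or the error is swallowed instead of being shown to the user
+ // Proof obligation: on payment failure, present an actionable error and do not persist the order. Only the payment gateway may be mocked
  it.todo('AC1-error: Failed payment shows an error and creates no order')
  })
  ```
 
+ **Proof annotations** (added to every skeleton alongside the metadata above): each `it.todo` carries two comment lines that hand the proof contract to the test implementer and the integration-test-reviewer (they correspond to the Proof Obligations field of the task template):
+ - `Primary failure mode`: the concrete regression that turns this test red, that is, the behavior the AC promises and that is lost when it breaks
+ - `Proof obligation`: what the implemented test must assert to prove the claim: the boundary it traverses, the observable state before and after the operation for state-changing ACs, and which boundaries may be mocked and why. For behavior-changing ACs, when the main path alone would stay green despite the regression, name the boundary path (branch, state, input class, lifecycle step, or fallback) the test must traverse. Write it as design intent describing what to assert; the implementer writes the executable assertions and mock setup
+
  ### E2E Test Files
 
  Generate a **separate file** per lane: `*.fixture-e2e.test.[ext]` for fixture-e2e and `*.service-e2e.test.[ext]` for service-integration-e2e. Every output file must carry an `@lane:` header so that downstream agents (work-planner, task-decomposer, executor) can route it correctly.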
@@ -207,6 +216,8 @@ describe('[Feature name] fixture-e2e', () => {
  // @lane: fixture-e2e
  // @dependency: full-ui (mocked backend)
  // @complexity: medium
+ // Primary failure mode: a step transition in the journey, or its observable state, is lost
+ // Proof obligation: verify each step's UI transition and resulting state in order. Mock only the backend (fixed responses)
  it.todo('User journey: cart-to-confirmation flow with mocked payment')
  })
  ```
@@ -228,6 +239,8 @@ describe('[Feature name] service-integration-e2e', () => {
  // @lane: service-integration-e2e
  // @dependency: full-system
  // @complexity: high
+ // Primary failure mode: after a purchase across real services, the order row or the downstream event does not exist
+ // Proof obligation: observe the DB row, the emitted event, and the enqueued email against the real local stack. Mock nothing on the asserted path
  it.todo('User journey: completing a purchase persists the order and emits the downstream event')
  })
  ```
@@ -242,6 +255,8 @@ describe('[Feature name] service-integration-e2e', () => {
  // ROI: [value] | Test type: property-based
  // @category: core-functionality
  // fast-check: fc.property(fc.constantFrom([input variations]), (input) => [invariant])
+ // Primary failure mode: some input in the generated domain violates the stated invariant
+ // Proof obligation: the invariant holds for every generated input. No boundary is mocked
  it.todo('[AC number]-property: [invariant described in natural language]')
  ```
 
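@@ -318,7 +333,7 @@ it.todo('[AC number]-property: [invariant described in natural language]')
  ## Constraints and Quality Standards
 
  **Mandatory compliance**:
- - Output `it.todo` skeletons only: in each skeleton, describe the verification points, expected results, and pass criteria as comments.
+ - Output `it.todo` skeletons only: in each skeleton, describe the verification points, expected results, pass criteria, primary failure mode, and proof obligation as comments.
  Exclude implementation code, assertions (`expect`), and mock setup; downstream processing uses the presence of `it.todo` to decide phase placement and review verdicts.
  - Clearly state each test's verification points, expected results, and pass criteria
  - Preserve the original AC text in comments (for traceability)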
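@@ -62,6 +62,9 @@ For each acceptance criterion extracted in Step 1:
  - Determine status: fulfilled / partially fulfilled / unfulfilled
  - Record file paths and relevant code locations
  - Record deviations from the Design Doc specification
+ - For behavior-changing ACs, confirm that the evidence covers the boundary path as well as the main path. Where a separate branch, state, input class, lifecycle step, or fallback governs the behavior, verify that it is actually traversed. Compare the source/referenced behavior with the implemented behavior at the same granularity, and treat an unjustified change along a boundary dimension as `dd_violation`
+ - Confirm that the implementation retains the core mechanism explicitly required by the AC, Design Doc, or reference material, and quote the source wording. A simpler substitute that passes the tests but drops the required core mechanism is a `dd_violation`
+ - For changes to persisted, shared, or externally observable state, identify the publication boundary (the point where the new state becomes observable to another process, component, user, or later step). State that is partial, uninitialized, stale, or rollback-only yet observable as complete is a `reliability` finding, because downstream consumers may treat the incomplete state as complete and fail
 
  #### 2-2. Identifier Verification
 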
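@@ -106,6 +109,11 @@ For each identifier specification extracted in Step 1 (resource names, endpoin
  - Counts as coverage when: at least one assertion executed in the test body verifies the AC's observable behavior. Assertions that verify intentional absence (e.g., an empty list, a null result) qualify when the AC expects absence
  - Non-substantive examples: `skip`/`xit` left on a test that should run, a TODO-only or placeholder-only body, always-true assertions (e.g., `expect(true).toBe(true)`, `expect(arr.length).toBeGreaterThanOrEqual(0)`)
  - Action when non-substantive: record as `coverage_gap` and state in the rationale the affected AC reference and the specific substantiveness problem (file:line)
+ - **Proof verification for each cited test (beyond substantiveness)**:
+ - Applies to: tests counted as substantive coverage for an AC judged fulfilled
+ - Source of the primary failure mode: use the Proof Obligation recorded for the claim (task file) or the test skeleton annotations. Derive it from the AC only when neither exists, so that the judgment matches the test author's intent
+ - Counts as proof when: the test goes red on that primary failure mode and directly traverses the claimed boundary
+ - Action when unproven: the test passes but would stay green even if the claimed behavior regressed → record as `coverage_gap` and name the unproven failure mode in the rationale (file:line)
 
  #### Finding Classification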
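The difference between a substantive test and a proving test in the hunk above can be shown with the payment example used in the skeletons. This is a sketch under assumptions, not the package's code: `Order`, `Gateway`, both checkout implementations, and the two check functions are hypothetical, and an in-memory array stands in for the order store.

```typescript
type Order = { id: string; status: 'paid' }
// External boundary: the only thing this sketch mocks.
type Gateway = { charge: (amount: number) => boolean }
type Checkout = (orders: Order[], gateway: Gateway, amount: number) => { error?: string }

// Intended behavior: persist the order only after a successful charge.
const checkout: Checkout = (orders, gateway, amount) => {
  if (!gateway.charge(amount)) return { error: 'Payment failed. Please try another card.' }
  orders.push({ id: `order-${orders.length + 1}`, status: 'paid' })
  return {}
}

// Regressed behavior: the order is persisted before the charge result is checked.
const regressedCheckout: Checkout = (orders, gateway, amount) => {
  orders.push({ id: `order-${orders.length + 1}`, status: 'paid' })
  return gateway.charge(amount) ? {} : { error: 'Payment failed. Please try another card.' }
}

const failingGateway: Gateway = { charge: () => false }

// Substantive but unproven: observes only the error, so it stays green under the regression.
function weakCheck(impl: Checkout): boolean {
  return impl([], failingGateway, 100).error !== undefined
}

// Proving: asserts the observable state before and after the operation.
function provingCheck(impl: Checkout): boolean {
  const orders: Order[] = []
  const before = orders.length
  const result = impl(orders, failingGateway, 100)
  return result.error !== undefined && orders.length === before
}
```

Under the rule above, a test shaped like `weakCheck` would be recorded as `coverage_gap` for the "order created despite payment failure" mode, since it passes for both implementations.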
 
@@ -23,7 +23,7 @@ skills: documentation-criteria, technical-spec, project-context, typescript-rule
  - `composite`: composite-perspective review (recommended): verifies structure, implementation, and completeness in one pass
  - When unspecified: comprehensive review
 
- - **doc_type**: document type (`PRD`/`UISpec`/`ADR`/`DesignDoc`)
+ - **doc_type**: document type (`PRD`/`UISpec`/`ADR`/`DesignDoc`/`WorkPlan`)
  - **target**: path of the document to review
 
  - **code_verification**: JSON of code verification results (optional)
@@ -34,6 +34,10 @@ skills: documentation-criteria, technical-spec, project-context, typescript-rule
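  - When provided, use `focusAreas` as the canonical source for the Fact Disposition coverage check
  - When not provided, treat focusArea completeness as unverifiable in this review
 
+ - **design_doc**: path to the Design Doc (optional, for WorkPlan review)
+ - When provided, load it as the source for checking the plan's coverage of ACs / contracts / state transitions
+ - When not provided, resolve the Design Doc from the work plan's "Related Documents" section
+
  ## Workflow
 
  ### Step 0: Input Context Analysis (Required)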
@@ -50,6 +54,7 @@ skills: documentation-criteria, technical-spec, project-context, typescript-rule
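  - Specialized verification based on doc_type
  - For DesignDoc: confirm that the "Applicable Standards" section exists with explicit/implicit classification
  - Missing or incomplete → `critical`; unconfirmed implicit standards → `important`
+ - For WorkPlan: confirm that the plan contains the artifacts the semantic gate evaluates: Design-to-Plan Traceability, Failure Mode Checklist, Review Scope, Verification Strategy summary, and Proof Strategy. Load the referenced Design Doc so that coverage of ACs / contracts / state transitions can be checked against the plan's tasks
  - When `code_verification` is provided: extract the inconsistency list and reverse-coverage gaps, and incorporate them as pre-verified evidence for Gate 1
  - When `codebase_analysis` is provided: extract `focusAreas` and their `evidence` values, and use them in the Gate 0 / Gate 1 Fact Disposition checks
 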
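@@ -71,6 +76,13 @@ For DesignDoc, additionally check:
  - [ ] The Fact Disposition Table section exists in the Design Doc
  - [ ] The Minimal Surface Alternatives section exists and has one entry per newly introduced applicable element (persistent state / public-contract elements or boundary-crossing fields and Props: on the backend, fields crossing module/service boundaries; on the frontend, public API Props of exported reusable components, Context values, and state lifted across ownership boundaries / behavioral modes, flags, and variants / reusable abstractions or component splits), when applicable elements are introduced. Each entry contains the results of the 5 steps (confirmed requirements with an AC reference from the Design Doc or the referenced PRD/UI Spec (AC ID, AC heading, EARS clause, or constraint ID), a comparison table including at least one reductive alternative, a selection with rationale, a record of rejected alternatives)
 
+ For WorkPlan, additionally check:
+ - [ ] Review Scope is recorded (the planned-files scope, or for a revision plan the base branch + diff range)
+ - [ ] The Design-to-Plan Traceability table exists, and every row either maps to a task or has a justified gap
+ - [ ] The Verification Strategy summary and Proof Strategy exist
+ - [ ] The Failure Mode Checklist exists
+ - [ ] The final phase includes quality assurance (acceptance criteria met, all tests passing)
+
  #### Gate 1: Quality Assessment (only after passing Gate 0)
 
  **Comprehensive Review Mode**: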
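@@ -113,6 +125,14 @@ For DesignDoc, additionally check:
  - (3) The Step 4 rationale either selects the smallest alternative or names a current requirement that a smaller alternative cannot satisfy; when "convenient", "future-proofing", "easier to implement", or "users might want it" is used as the primary rationale → `critical` (category: `compliance`).
  - (4) Step 5 records the rejected alternatives with a brief rationale; a missing rejected-alternatives log → `important` (category: `completeness`). Note: the zero-alternatives case is caught earlier as `critical` by sub-check (2). Sub-check (4) catches the case where alternatives were generated but the record is missing.
 
+ - **Work Plan Semantic Gate** (doc_type WorkPlan):
+ - (1) Check coverage where each item lives in the plan: every acceptance criterion is covered by a task, shown either by a Design-to-Plan Traceability row mapping that AC to a task or by a task's completion criteria or Proof Obligations referencing that AC. Every data contract and state transition is mapped to a task by a Design-to-Plan Traceability row or has an explicit out-of-scope entry. Every quality assurance mechanism appears in the Quality Assurance Mechanisms table together with its covered files. An item with no coverage → `critical` (category: `completeness`). For an uncovered acceptance criterion, distinguish the cause: backed by the Design Doc but not mapped to any task (a planning omission, fixable by replanning) → `critical`; not backed by the Design Doc or the inputs (a gap that replanning cannot fix) → the `rejected` trigger in the Verdict mapping below
+ - (2) The early verification point is placed in an early phase rather than the final phase; deferral to the final phase → `important` (category: `consistency`)
+ - (3) Every cross-boundary, publication-boundary, or persistent-state change names a task that verifies it through the real boundary; missing → `important` (category: `completeness`)
+ - (4) Every traceability table present (Design-to-Plan, UI Spec components, Connection Map, ADR Bindings) is filled in at a granularity that resolves to the target task; rows that are too coarse → `important` (category: `completeness`)
+ - (5) The Failure Mode Checklist covers the domain-independent categories that apply to the plan (same-value, no-op, empty input, invalid option, missing config, unavailable boundary, shared-state dependency, rollback-only visibility); a missing applicable category → `recommended` (category: `completeness`)
+ - Verdict mapping (WorkPlan): any semantic-gate `critical` makes the verdict at least `needs_revision`, except that a coverage gap caused by a missing or contradictory Design Doc/input element (not fixable by replanning) → `rejected`; with `important` findings only, the verdict is capped at `approved_with_conditions`
+
  **Perspective-Specific Mode**:
  - Conduct the review based on the specified mode and focus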
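The WorkPlan verdict mapping added in this hunk reduces to a small precedence rule. The sketch below is one reading of it, returning the most lenient verdict the mapping allows. The `Finding` shape and the `unfixableByReplanning` flag are hypothetical, and `approved` is an assumed name for the no-findings verdict, since only the other three names appear in the diff.

```typescript
type Severity = 'critical' | 'important' | 'recommended'

// Hypothetical finding shape; the flag marks gaps rooted in the Design Doc or inputs.
type Finding = { severity: Severity; unfixableByReplanning?: boolean }

// `approved` is an assumption; the other three verdict names appear in the gate text.
type Verdict = 'approved' | 'approved_with_conditions' | 'needs_revision' | 'rejected'

// Return the most lenient verdict permitted by the mapping for a set of findings.
function workPlanVerdict(findings: Finding[]): Verdict {
  // Gaps that replanning cannot fix trigger rejection.
  if (findings.some((f) => f.unfixableByReplanning === true)) return 'rejected'
  // Any critical finding forces at least needs_revision.
  if (findings.some((f) => f.severity === 'critical')) return 'needs_revision'
  // Important-only findings cap the verdict at approved_with_conditions.
  if (findings.some((f) => f.severity === 'important')) return 'approved_with_conditions'
  return 'approved'
}
```

How `recommended` findings affect the verdict is not stated in the hunk, so this sketch lets them pass through unchanged.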
 
@@ -73,6 +73,15 @@ skills: integration-e2e-testing, typescript-testing, project-context
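  | Internal components | Use the real thing | Unnecessary mocking |
  | Log output verification | Use vi.fn() | Mocks without verification |
 
+ ### 5. Validity of Claim Proof
+
+ Each AC's primary failure mode and proof obligation come from the test skeleton annotations (the "Primary failure mode" / "Proof obligation" comments), which correspond to the Proof Obligations field of the task template. Confirm that each test proves its claim: the assertions observe the promised behavior, and the test fails when that behavior regresses. Record `proof_insufficient` for each obligation the test leaves unproven:
+ - The test goes red on the recorded primary failure mode (because the assertions observe the specific behavior the AC promises, a regression in that behavior fails the test).
+ - When the AC claims a publication boundary or an integration boundary, the test traverses that boundary directly.
+ - When the AC claims a state change, side effect, rollback, non-mutating mode, idempotency, or persistence, the test asserts the observable state before the operation, the operation, and the observable state after the operation.
+ - Every mocked boundary is an external dependency, the boundary under test stays real, and a comment records why that boundary may be mocked.
+ - Integration and E2E tests use narrowly scoped fixtures and assert results that hold regardless of shared state, real data volume, or execution order.
+
  ## Output Format
 
  ### Output Protocol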
@@ -116,7 +125,7 @@ skills: integration-e2e-testing, typescript-testing, project-context
  "qualityIssues": [
  {
  "severity": "high | medium | low",
- "category": "aaa_structure | independence | reproducibility | mock_boundary | readability",
+ "category": "aaa_structure | independence | reproducibility | mock_boundary | proof_insufficient | readability",
  "location": "[file:line]",
  "description": "[description of the problem]",
  "suggestion": "[specific fix suggestion]"
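@@ -207,4 +216,5 @@ On a needs_revision verdict, output fix instructions usable by downstream processing:
 
  - [ ] Check every skeleton comment against the implementation
  - [ ] Evaluate implementation quality
+ - [ ] Each test proves the AC's claim: it goes red on the primary failure mode, traverses the claimed boundary, and for state-changing claims asserts the state before and after the operation
  - [ ] Verify mock boundaries (integration tests)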